Are those fixed from the local environment, or do I need to supply them again in the remote context?
I don't see any console errors
Sorry, I meant the arguments that are supplied to the decorator methods themselves, @PipelineDecorator.pipeline() and @PipelineDecorator.component(): things like name, project, docker_args, etc.
Ahhh okay, thank you. Perhaps in the future, it would be great to allow this from the UI as well?
Correction: it works when I run the code in my local VSCode session, but I still don't get resource logging when I run in an agent. 🤔 On a similar topic, I have a separate task that logs metrics with tensorboard. When running locally, I see the metrics appear in the "scalars" tab in ClearML, but when running in an agent, nothing appears. Any suggestions on where to look?
Sure. I can send it on Monday. Thank you.
Here's my example script:
from random import randint

from clearml import Task

if __name__ == "__main__":
    task: Task = Task.init(
        project_name="clearml-examples", task_name="try-to-make-logging-work"
    )
    task.execute_remotely(queue_name="5da90f42dd4c40edab972a4bef8eab04")
    logger = task.get_logger()
    for i in range(10):
        logger.report_scalar("example plot", series="random", value=randint(0, 100), iteration=i)
Hi @<1523701205467926528:profile|AgitatedDove14> , on the resource logging: I tried with a sleep test and it works when I run it from my local machine, but when I run remotely in an agent, I do not see resource logging.
And, similarly, with tensorboard logging, it works fine when running from my machine, but not when running remotely in an agent. For this, I've decided to just re-write the logging code to use ClearML's built-in logging methods, which work fine in the agent. Would stil...
The result i get in the agent is:
Traceback (most recent call last):
  File "src/clearml_pipelines_examples/pipelines/examples/train_model_on_random_data/pipeline.py", line 89, in <module>
    pipeline(**pipeline_ui_config)
TypeError: 'NoneType' object is not callable
It seems that the call pipeline = PipelineDecorator.get_current_pipeline() returns None. Also, in the UI, I should be seeing all of the pipeline function parameters, but I only see config_file_path.
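To make this failure easier to spot in the agent log, the call could be guarded. This is only a sketch: it assumes the getter (here PipelineDecorator.get_current_pipeline, as in the traceback above) may return None, and call_pipeline is a hypothetical helper name, not part of clearml. The getter is passed in as an argument so the logic can be tried without a ClearML server.

```python
from typing import Any, Callable, Mapping, Optional


def call_pipeline(
    get_pipeline: Callable[[], Optional[Callable[..., Any]]],
    kwargs: Mapping[str, Any],
) -> Any:
    """Call the current pipeline function, failing loudly if there is none.

    Hypothetical helper: `get_pipeline` stands in for a getter such as
    PipelineDecorator.get_current_pipeline.
    """
    pipeline = get_pipeline()
    if pipeline is None:
        # Raise a descriptive error instead of the opaque
        # "TypeError: 'NoneType' object is not callable".
        raise RuntimeError(
            "No current pipeline function was returned; "
            "check that the decorated pipeline is registered in this process."
        )
    return pipeline(**kwargs)
```

With clearml installed, this might be called as call_pipeline(PipelineDecorator.get_current_pipeline, pipeline_ui_config). It does not fix the underlying problem, it only reports it more clearly.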
Hi @<1523701205467926528:profile|AgitatedDove14> , thanks. So does the code to be executed by the task need to be provided to the Task.create() method as script=some/path.py, or does it work to have something like
def my_node_task_factory(node: PipelineController.Node) -> Task:
    task = Task.create(...)
    my_function()
    return task
Hi @<1523701205467926528:profile|AgitatedDove14> , sorry for the delayed reply. So what you’re saying is to first kick off a new run and then rename the underlying Pipeline Task, which will cause that particular run to become a new pipeline name? But you have to do this only after you’ve started the run.
What would be most ideal would be to be able to right-click on a pipeline run and have a “clone” option, like you can with a task, where you can start a new run with a new name in a single ...
Hi @<1523701205467926528:profile|AgitatedDove14> , CLEARML_TASK_ID is set inside the agent's process, which is how I was able to get the task by running Task.get_task(environ["CLEARML_TASK_ID"]). However, I believe I've sorted out how to make both the resource logging and the tensorboard logging work in the agent. It seems that using Task.current_task() to get the task object does not work when running remotely, but calling Task.init() again does work. And after having called ...
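A minimal sketch of the lookup described in this message, assuming that Task.current_task() can return None inside the agent and that CLEARML_TASK_ID is set there. The helper name resolve_task and its callable parameters are hypothetical and not part of clearml. The two lookups are injected as arguments so the fallback logic can be exercised without a ClearML server.

```python
import os
from typing import Any, Callable, Mapping, Optional


def resolve_task(
    current_task: Callable[[], Optional[Any]],
    get_task_by_id: Callable[[str], Any],
    env: Mapping[str, str] = os.environ,
) -> Optional[Any]:
    """Return the active task, falling back to a lookup by CLEARML_TASK_ID.

    Hypothetical helper: `current_task` stands in for Task.current_task and
    `get_task_by_id` for a wrapper around Task.get_task.
    """
    task = current_task()
    if task is not None:
        return task
    # Inside the agent, the task ID is exposed through the environment.
    task_id = env.get("CLEARML_TASK_ID")
    if not task_id:
        return None
    return get_task_by_id(task_id)
```

With clearml installed, this might be called as resolve_task(Task.current_task, lambda task_id: Task.get_task(task_id=task_id)). Whether a task object fetched this way restores automatic resource or tensorboard logging is not something this sketch addresses.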
Thanks very much! Yeah, it tends to fill up the console
Hi @<1523701205467926528:profile|AgitatedDove14> , I've actually hit on something accidentally that might be a clue. I have noticed that when running inside an agent, there is a bug wherein both Task.current_task() and Logger.current_logger() return None. If these are being used by the clearml package under the hood, this could be the reason we aren't seeing the metrics.
As a workaround, I created this utility function, which works for explicit logging (though it doesn't c...
import json
import os
import sys
from argparse import ArgumentParser
from logging import getLogger
from pathlib import Path
from typing import Callable
from clearml import PipelineDecorator, Task
from clearml_pipelines_examples.base.pipeline_settings import ExecutionMode
from clearml_pipelines_examples.pipelines.examples.train_model_on_random_data import (
    TrainModelPipelineKwargs,
    TrainModelPipelineSettings,
)
from clearml_pipelines_examples.tasks.examples import generate_dat...
Hi @<1523701070390366208:profile|CostlyOstrich36> , thanks for your reply. I’ll try both and see what happens.
@<1576381444509405184:profile|ManiacalLizard2> , that’s interesting. So you actually need the imports to be in a certain order. That’s definitely new and a bit of an anti-pattern, as it goes against the recommended import statement order (built-in packages imported first), but if it works, that’s good news at least. I’ll try that as well. Thanks!
Hi @<1523701070390366208:profile|CostlyOstrich36> , this is what our devops engineer said:
the proxy-body-size limitation crashed for the Clearml api, for WEB and FileServer I set it to unlimited, but for the API I didn't change it.
Okay, well, I have to supply them again for the function to work, but the values are ignored, so I can just have a hard-coded version for remote.
I am still struggling to figure out how to update the parameter defaults, though. I would like to be able to do the equivalent of the PipelineController.add_parameter() so that I can supply a local config with new defaults that are used on the remote execution. Otherwise, I’m stuck with whatever defaults are in the function signature.
I think this is what you're looking for but let me know if you meant something different:
{
  "meta": {
    "id": "76fffdf3b04247fa8f0c3fc0743b3ccb",
    "trx": "76fffdf3b04247fa8f0c3fc0743b3ccb",
    "endpoint": {
      "name": "tasks.get_by_id_ex",
      "requested_version": "2.30",
      "actual_version": "1.0"
    },
    "result_code": 200,
    "result_subcode": 0,
    "result_msg": "OK",
    "error_stack": "",
    "error_data"...
Unfortunately, it's turning out to be quite time consuming to manually remove all of the private info in here. Is there a particular section of the log that would be useful to see? I can try to focus on just sharing that part.
@<1523701225533476864:profile|ObedientDolphin41> , I was searching for anyone having an issue like mine and found this thread. I have created a simple pipeline using decorators, and when I try to clone it in the UI, I get the "base_task_id is empty" error. It works fine when triggered programmatically from my machine. I’m wondering if you could elaborate on how you utilized the get_configuration_object and set_configuration_object methods to solve this? In my case, I’m not setting a...
Hi Max, thanks very much for your message! I understand what you’re saying now, though I suppose this is not my issue since I’m not setting any of the decorator values with variables. I’ll post a query in the main channel with code snippets to see if anyone has ideas. Thank you!
No, I'm not seeing that "Dataset Content" section. We have some older datasets that were copied from a prior server deployment; those do have the section, and it appears in the UI.
It seems so, yes. I'm not the one who did the server migration, but as a user I believe this is when I started noticing the issue for new datasets created after the migration.
Ah interesting, okay. I'll try adding a sleep in here for testing it out. Thanks