Ah, interesting, okay. I'll try adding a sleep in here to test it out. Thanks
The result I get in the agent is:
Traceback (most recent call last):
File "src/clearml_pipelines_examples/pipelines/examples/train_model_on_random_data/pipeline.py", line 89, in <module>
pipeline(**pipeline_ui_config)
TypeError: 'NoneType' object is not callable
It seems that the call to pipeline = PipelineDecorator.get_current_pipeline() returns None. Also, in the UI, I should be seeing all of the pipeline function parameters, but I only see config_file_path.
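For anyone hitting the same TypeError, here is a minimal sketch of a guard that fails with a readable message when the pipeline lookup comes back as None. It does not fix the underlying cause. The names run_pipeline and example_pipeline are hypothetical, and example_pipeline is only a stand-in for a decorated pipeline function, not ClearML code.

```python
from typing import Any, Callable, Mapping, Optional


def run_pipeline(
    pipeline: Optional[Callable[..., Any]],
    config: Mapping[str, Any],
) -> Any:
    """Call the pipeline with the given config, or raise a clear error if it is None."""
    if pipeline is None:
        # Without this check, calling None raises
        # "TypeError: 'NoneType' object is not callable".
        raise RuntimeError(
            "No pipeline callable is available in this process; "
            "the lookup returned None."
        )
    return pipeline(**config)


def example_pipeline(config_file_path: str) -> str:
    # Hypothetical stand-in for the real decorated pipeline function.
    return f"ran with {config_file_path}"
```

In the script from the traceback, the final call would become something like run_pipeline(pipeline, pipeline_ui_config), assuming those two names hold what the traceback suggests.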
Hi @<1523701435869433856:profile|SmugDolphin23> , so I need to call the pipeline function again in the remote context? I thought that when I start it up, my local session parses the pipeline and then transmits it to the server to run, but it sounds like it just copies the code and I then need to effectively call it again in the agent?
Okay, so I discovered that setting -e CLEARML_AGENT_PACKAGE_PYTORCH_RESOLVE=none solves the issue.
That said, if someone could explain to me why this error was occurring and why it only happens in the case of cloning, I'd love to understand. Thanks!
I believe you should be able to set the queue_name parameter to None to accomplish this.
@<1523701225533476864:profile|ObedientDolphin41> , I was searching for anyone having an issue like mine and found this thread. I have created a simple pipeline using decorators, and when I try to clone it in the UI, I get that "base_task_id is empty" error. It works fine when triggered programmatically from my machine. I’m wondering if you could elaborate on how you utilized the get_configuration_object and set_configuration_object methods to solve this? In my case, I’m not setting a...
@<1576381444509405184:profile|ManiacalLizard2> , that’s interesting. So you actually need the imports to be in a certain order. That’s definitely new and a bit of an anti-pattern, as it goes against the recommended import statement order (built-in packages imported first), but if it works, that’s good news at least. I’ll try that as well. Thanks!
Hi @<1523701070390366208:profile|CostlyOstrich36> , this is what our devops engineer said:
the proxy-body-size limitation crashed for the Clearml api, for WEB and FileServer I set it to unlimited, but for the API I didn't change it.
Server (see screenshot). Thanks!
I don't see any console errors
No, I'm not seeing that "Dataset Content" section. We have some older datasets that were copied from a prior server deployment that do have the section, and it appears in the UI for those.
It seems so, yes. I'm not the one who did the server migration, but as a user I believe this is when I started noticing the issue for new datasets created after the migration.
Hi @<1523701070390366208:profile|CostlyOstrich36> , I would expect the loss_func parameter to be FocalLoss instead of ['FocalLoss', 'FocalLoss', 'FocalLoss', 'FocalLoss'] (and the same for the validation_split_name parameter). I will try to put together an example, though it might take a little time before I can do it.
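Until the cause is known, one possible workaround is to normalize the value after reading it back. This is only a sketch: collapse_repeated is a hypothetical helper, not part of ClearML, and it assumes the duplicated list always holds identical entries, as in the FocalLoss case above.

```python
from typing import Any


def collapse_repeated(value: Any) -> Any:
    """Return the single element when value is a non-empty list of identical items.

    Anything else (scalars, empty lists, lists with differing items)
    is returned unchanged.
    """
    if isinstance(value, list) and value and all(v == value[0] for v in value):
        return value[0]
    return value


# The duplicated parameter from the post collapses back to one value.
loss_func = collapse_repeated(["FocalLoss", "FocalLoss", "FocalLoss", "FocalLoss"])
```

Lists with differing items pass through untouched, so a parameter that is legitimately a list of distinct values is not silently changed.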
Yes, that did make it work in this case, thank you.
To be clear, Task.init() was called initially. I had to call it again later in the code in order to get the current task object instead of Task.current_task(), which only seems to work locally. That's the part that is not intuitive.
Hi @<1523701205467926528:profile|AgitatedDove14> , on the resource logging: I tried with a sleep test and it works when I'm running it from my local machine, but when I run remotely in an agent, I do not see resource logging.
And, similarly, with tensorboard logging, it works fine when running from my machine, but not when running remotely in an agent. For this, I've decided to just re-write the logging code to use ClearML's built-in logging methods, which work fine in the agent. Would stil...
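For reference, the sleep test mentioned above can be sketched roughly as follows. It assumes the only goal is to keep the process alive, with a little CPU activity, long enough for periodic resource reporting to have a chance to emit samples. The function name sleep_test and its parameters are hypothetical, and nothing here calls ClearML.

```python
import time


def sleep_test(total_seconds: float, interval: float = 1.0) -> int:
    """Keep the process busy and alive for roughly total_seconds.

    Returns the number of work/sleep iterations that were completed.
    """
    deadline = time.monotonic() + total_seconds
    iterations = 0
    while time.monotonic() < deadline:
        # Light CPU work so there is something for a monitor to observe.
        sum(i * i for i in range(10_000))
        # Sleep for the interval, but never past the deadline.
        remaining = max(0.0, deadline - time.monotonic())
        time.sleep(min(interval, remaining))
        iterations += 1
    return iterations
```

Running the same call locally and in the agent, for example sleep_test(120), gives a like-for-like comparison of what each environment reports.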