
Solved it by doing clearml.Task.current_task().id, but thank you.
Yep, just a string that is a path, but without uploading the folder.
I also looked at it, but the only supported registered artifact object type is pandas.DataFrame, not strings.
I think I'll keep the ':' at the start of the string; that way it won't upload the folder.
@<1523701087100473344:profile|SuccessfulKoala55> and @<1523701070390366208:profile|CostlyOstrich36> OK, so I found the problem, but it's weird:
when the agent is setting up the environment it installs torch=1.11.0 and not the one in the requirements, which is torch=1.11.0+cu113.
I've checked the clearml.conf and I do have this flag set:
force_repo_requirements_txt: true
and I have a local whl of torch=1.11.0+cu113 with a path set to its location in the requirements.txt ...
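One quick way to confirm which build actually ended up in the environment is to compare the installed version string against the expected local version label. This is a minimal sketch using only the standard library. The helper names `has_local_label` and `installed_version` are hypothetical, and being able to run a script inside the agent's environment is an assumption.

```python
from importlib import metadata


def has_local_label(version: str, label: str) -> bool:
    """Return True if a version string carries the given local label.

    Local version labels follow a '+', e.g. '1.11.0+cu113'.
    """
    _, sep, local = version.partition("+")
    return sep == "+" and local == label


def installed_version(package: str):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("torch")
    if found is None:
        print("torch is not installed in this environment")
    else:
        print(f"torch {found}; cu113 build: {has_local_label(found, 'cu113')}")
```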
When I tried doing it with the decorators, it threw an error saying it cannot run Task.init inside a running task (the pipeline's task).
its working now, thanks that was the problem.
Thanks @<1523701070390366208:profile|CostlyOstrich36> , but doesn't the agent create/cache an environment from the requirements.txt when running? I'm reproducing an old project that used to work like that, and my clearml.conf is also set to work that way.
Btw, in pipelines is there a way to get the pipeline's main task ID? For example, <step_name>.id gets me the stage's ID, but I need the main pipeline that's running all the tasks.
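For reference, what ended up working here was `clearml.Task.current_task().id`. Below is a minimal sketch of wrapping that call, assuming it runs in the pipeline's own code so that the current task is the pipeline task. The helper name `current_task_id` and its optional `task` argument are hypothetical, added only so the function can be tried without a server.

```python
def current_task_id(task=None):
    """Return the ID of the task this code is running in.

    If no task object is passed, fall back to clearml.Task.current_task().
    """
    if task is None:
        # Imported lazily so the helper can be exercised without clearml installed.
        from clearml import Task

        task = Task.current_task()
    if task is None:
        raise RuntimeError("No ClearML task is active in this process")
    return task.id
```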
Oh so in that case I'll need to change every agent's pip config file.
The flow is: Training.py (which creates and runs a training task) -> conversion_task.py (converts the outputs of the models into a format of our choosing) -> testing.py (testing the model after conversion).
I tried using the decorators and the functions, but both threw errors saying I cannot call Task.init inside a running task.
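Another option for a flow like this is to register each script's existing task as a step of a `PipelineController`, rather than using decorators or functions. This is a rough sketch and not a confirmed fix for the error above. The project name, task names, step names, and queue name are all hypothetical, and it assumes each script has already been run once so that its task exists on the server.

```python
# Hypothetical layout: (step name, name of the existing task, parent step names).
STEPS = [
    ("training", "training task", []),
    ("conversion", "conversion task", ["training"]),
    ("testing", "testing task", ["conversion"]),
]


def step_kwargs(project, steps=STEPS):
    """Build the keyword arguments for one add_step call per stage."""
    return [
        {
            "name": name,
            "base_task_project": project,
            "base_task_name": task_name,
            "parents": list(parents),
        }
        for name, task_name, parents in steps
    ]


def build_pipeline(project="my_project", name="train-convert-test", queue="default"):
    """Create a PipelineController with the three stages chained in order."""
    # Imported lazily so the step layout can be inspected without clearml installed.
    from clearml import PipelineController

    pipe = PipelineController(name=name, project=project, version="1.0.0")
    pipe.set_default_execution_queue(queue)
    for kwargs in step_kwargs(project):
        pipe.add_step(**kwargs)
    return pipe
```

Calling `build_pipeline().start_locally()` would run the pipeline logic from the current process, while `start()` enqueues the controller itself.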
Sadly, the teammate who had the problem re-ran the experiments, so I don't have the task IDs, but I do have the CPU and GPU usage of the agent that ran the experiment:
@<1523701087100473344:profile|SuccessfulKoala55> But when I use this setting, are the packages downloaded only from the torch repo and not from a local repo? Or does it use the url-extra-link? And is there a way to disable the automatic CUDA detection?
@<1523701087100473344:profile|SuccessfulKoala55> After going into the steps full details I reset the step and enqueued it
I reviewed this example and sadly there isn't anything about how to upload a path as a string only.
Wow, thanks a lot @<1523701070390366208:profile|CostlyOstrich36> for pointing me in the right direction. I also see that I can use sdk.development.worker.log_stdout
If I really need to cut down my API calls, I'll host my own server first.
BTW, what does suppress_update_message do? I mean, which kind of messages does it suppress?
Hi @<1523701070390366208:profile|CostlyOstrich36> , here is a better explanation of my situation. In my IDE the working directory is where my code starts, and I'm importing my custom augmentations from common_utils. Locally the code works with the import I added in my previous message, but when I run it from the ClearML agent the import from point a to point b doesn't work, even though both are in the same git repo. I don't want to copy the files into project_1 as to not have unne...
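If the agent starts the script from a different working directory than the IDE does, one possible workaround is to put the repository root on `sys.path` before importing from common_utils. This is a minimal sketch. The layout (script one directory below the repo root, alongside common_utils) is an assumption, and `add_repo_root` is a hypothetical helper.

```python
import sys
from pathlib import Path


def add_repo_root(script_file, levels_up=1):
    """Prepend the directory `levels_up` levels above the script's folder to sys.path.

    Returns the path that was added, as a string.
    """
    root = Path(script_file).resolve().parents[levels_up]
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return root_str
```

In a script under project_1, calling `add_repo_root(__file__)` before the `common_utils` import would make the import independent of the working directory.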
Thank you @<1523701070390366208:profile|CostlyOstrich36> and @<1523701205467926528:profile|AgitatedDove14> . After that bit of information, can you tell me where I can find the differences between the community server and a self-hosted server?
Are there any additional downsides to migrating to a self hosted server?
@<1523701070390366208:profile|CostlyOstrich36> After discussing with my TL, we think the plan we are subscribed to might not be right for us. Can you point me to a person we can meet with who can advise us on the best plan for my team?
@<1523701087100473344:profile|SuccessfulKoala55> and @<1523701070390366208:profile|CostlyOstrich36> , in the end I found the problem: I was running the pipeline locally, and when the pipeline runs locally it doesn't copy the whole dir, only the script that is running.
No, until now we used the default server that is handled by ClearML, and we want to move to a self-hosted one.
Just upgraded to clearml-agent==1.5.1 and I still get this error.
Yes, sometimes I suffer from small network issues. Is there a way to make ClearML use a bigger timeout when installing packages?
And if not, is there a way to point it to a local package for installation, or to a local virtual environment?
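pip itself takes its timeout from the `--timeout` option or the `PIP_DEFAULT_TIMEOUT` environment variable, so one possibility is to set that variable in the environment the agent runs in. Whether the agent passes its environment through to pip unchanged is an assumption here, and `pip_install_command` is a hypothetical helper that only shows the equivalent manual command.

```python
import os
import sys


def pip_install_command(requirements="requirements.txt", timeout=120):
    """Return the pip command line and environment for a longer timeout."""
    cmd = [
        sys.executable, "-m", "pip", "install",
        "--timeout", str(timeout),
        "-r", requirements,
    ]
    env = dict(os.environ)
    # pip maps PIP_<OPTION> environment variables onto its command-line options.
    env["PIP_DEFAULT_TIMEOUT"] = str(timeout)
    return cmd, env
```

The pair can be passed to `subprocess.run(cmd, env=env, check=True)` to try the install by hand.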
Yes it does, thank you @<1523701070390366208:profile|CostlyOstrich36>
And I'm looking at an example clearml.conf file and I can't seem to find the sdk.development.worker.console_cr_flush_period flag.
Hi @<1523701070390366208:profile|CostlyOstrich36> , I am using the community server. What happens if I change to a self-hosted server?
I'm using Tensorboard to report everything, nothing special besides that.
@<1523701087100473344:profile|SuccessfulKoala55> yes the working dir is set to the correct path and yet it cannot import the train module
Thanks John, I read the one about the pip timeout. The problem is that I assume ClearML runs the following command: "pip install -r requirements.txt", and I want to know if I can make ClearML add the timeout flag.
It’s running an agent without docker; we aren’t using docker.
Yeah, I see it now in the requirements of the task. That's weird. I'll create a new environment and check it again, thanks.