AgitatedDove14 it's the same file system, so it would be better just to use the original code files and the same conda env, if possible…
AgitatedDove14 that worked! But I had to add:
os.environ['CLEARML_PROC_MASTER_ID'] = ''
os.environ['TRAINS_PROC_MASTER_ID'] = ''
or else it thought it was the parent optimizer task I was trying to run.
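For reference, a rough sketch of where those two lines ended up at the top of the trial script (the project/task names below are just placeholders):
import os

# Clear the inherited master-process markers so this process registers
# as its own task instead of the parent optimizer task
os.environ['CLEARML_PROC_MASTER_ID'] = ''
os.environ['TRAINS_PROC_MASTER_ID'] = ''

from clearml import Task

task = Task.init(project_name='hpo-example', task_name='trial')  # placeholder names
# ... training code continues here ...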
but now I'm facing a new issue, the details are empty:
I can create tasks and retrieve them from the queues.
Can I run a random task from a queue? Like this:
clearml-agent execute --id <TASK_ID>
or
ChubbyLouse32 This will just work out of the box 🙂
No need to enqueue the Task, just reset it (in the UI)
the easiest way possible would be if I could just somehow run the task and let the LSF manage the environment
You mean let the LSF set up the conda/venv? Or do you also mean to get the code-base, changes, etc.?
Does the machine have a connection to the backend?
so it would be better just to use the original code files and the same conda env, if possible…
Hmm, you can actually run your code in "agent mode", assuming you have everything else set up.
This basically means you set a few environment variables prior to launching the code:
Basically:
export CLEARML_TASK_ID=<The_task_id_to_run>
export CLEARML_LOG_TASK_TO_BACKEND=1
export CLEARML_SIMULATE_REMOTE_TASK=1
python my_script_here.py
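If it helps, the same thing can be driven from Python by passing those variables to the child process; a minimal sketch (launch_trial is just a placeholder name, the variables and my_script_here.py are the ones above):
import os
import subprocess
import sys

def launch_trial(task_id: str, script: str = 'my_script_here.py') -> int:
    # The child process inherits env, exactly like the shell exports above
    env = dict(os.environ)
    env['CLEARML_TASK_ID'] = task_id
    env['CLEARML_LOG_TASK_TO_BACKEND'] = '1'
    env['CLEARML_SIMULATE_REMOTE_TASK'] = '1'
    return subprocess.run([sys.executable, script], env=env).returncode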
now I noticed that clearml-agent list gets stuck as well
Are you running on your own server or the community server?
Can you get the agent to execute the task in the current conda env without setting up a new environment? Or is there any other way to get a task from the queue running locally in the current conda env?
os.environ['CLEARML_PROC_MASTER_ID'] = ''
Nice catch! (I'm assuming you also called Task.init somewhere before, otherwise I do not think this was necessary)
I think I solved it by deleting the project and running the base_task one time before the hyperparameter optimization.
So is it working now? Everything is there?
Can you get the agent to execute the task in the current conda env without setting up a new environment?
Wouldn't that break easily? Is this a way to avoid dockers, or a specific use case?
Is there any other way to get a task from the queue running locally in the current conda env?
You mean including cloning the code etc., but not installing any Python packages?
Nice catch! (I’m assuming you also called Task.init somewhere before, otherwise I do not think this was necessary)
I was calling Task.init and it still somehow thought it was the parent task, until I fixed it as I said.
and yes, everything is working now! I'm running hyperparameter optimization on an LSF cluster where every task is an LSF job running without clearml-agent.
I think I solved it by deleting the project and running the base_task one time before the hyperparameter optimization.
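For anyone reading later, a rough sketch of the kind of controller this setup could use (project, metric, and parameter names are placeholders, and RandomSearch is just one possible optimizer_class); the trials it enqueues can then be launched as LSF jobs with the environment variables mentioned above:
from clearml import Task
from clearml.automation import (
    DiscreteParameterRange,
    HyperParameterOptimizer,
    RandomSearch,
    UniformParameterRange,
)

# The controller itself is a ClearML task (the "parent optimizer task")
Task.init(project_name='hpo-example', task_name='optimizer', task_type=Task.TaskTypes.optimizer)

optimizer = HyperParameterOptimizer(
    base_task_id='<BASE_TASK_ID>',             # the base_task that was run once beforehand
    hyper_parameters=[
        UniformParameterRange('General/learning_rate', min_value=1e-4, max_value=1e-1),
        DiscreteParameterRange('General/batch_size', values=[16, 32, 64]),
    ],
    objective_metric_title='validation',       # placeholder metric
    objective_metric_series='loss',
    objective_metric_sign='min',
    optimizer_class=RandomSearch,
    execution_queue='lsf',                     # trials wait here until launched as LSF jobs
    max_number_of_concurrent_tasks=4,
    total_max_jobs=20,
)
optimizer.start_locally()   # run the controller in this process, no clearml-agent needed for it
optimizer.wait()
optimizer.stop()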
Can't you paste the output until the stuck point? Sounds very strange. Does it work when it's not enqueued? Also, what version of clearml-agent & server are you on?
"os": "Linux-4.18.0-348.2.1.el8_5.x86_64-x86_64-with-glibc2.28", "python": "3.9.7"
Did I have to configure the environment first maybe? I assumed it just uses the environment where it was called.
I'm still trying to figure out the best way to execute a task on an LSF cluster. The easiest way possible would be if I could just somehow run the task and let the LSF manage the environment; on the same filesystem it is very easy to use a shared conda env, etc.
On what OS are you on?
Regarding your question - I can't recall for sure. I think it still creates a virtualenv
I'm running hyperparameter optimization on an LSF cluster where every task is an LSF job running without clearml-agent
WOW this is so cool! 🎊
clearml 1.1.6
clearml-agent 1.1.2
no output at all, so nothing to paste