Hi ScaryLeopard77
Could that be solved with this PR?
https://github.com/allegroai/clearml/pull/548
So you want to have two Tasks and connect the two ?
Maybe the best approach is to make the current Task the parent of the Dataset Task? dataset._task.set_parent(Task.current_task())
One last question: Is it possible to set the pip_version task-dependent?
no... but why would it matter on a Task basis ? (meaning what would be a use case to change the pip version per Task)
No worries 🙂 glad it worked
Disable automatic model uploads
Disable the auto upload: task = Task.init(..., auto_connect_frameworks={'pytorch': False})
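A minimal sketch of the call above, assuming the `clearml` package is installed and configured. The helper names `init_kwargs` and `start_task`, and the project/task names passed to them, are made up for illustration. The import is done lazily so the kwargs helper can be inspected without a server.

```python
def init_kwargs(project, name, disabled_frameworks=("pytorch",)):
    """Build Task.init keyword arguments with selected framework hooks turned off."""
    return {
        "project_name": project,
        "task_name": name,
        # Each framework mapped to False is not auto-connected,
        # so its models are not logged/uploaded automatically.
        "auto_connect_frameworks": {fw: False for fw in disabled_frameworks},
    }


def start_task(project, name):
    """Create the Task with PyTorch auto-logging disabled (hypothetical helper)."""
    from clearml import Task  # lazy import: only needed when a Task is really created
    return Task.init(**init_kwargs(project, name))
```

Usage would be something like `task = start_task("examples", "no auto upload")`; frameworks not listed in the dictionary should keep their default behaviour.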
you mean to spin a pod with the agent inside it (daemon in services mode).
Or connect the services queue to the k8s cluster (i.e. define the pod template that uses cpu with not a lot of ram)?
Task.debug_simulate_remote_task
simulates the Task being executed by the agent (basically same behaviour, only local). the argument it gets is the Task ID (string).
The way to see how it works is to run the code once (no debug_simulate call), get the ID of the Task that was created, then rerun with debug_simulate_remote_task, passing the previous Task ID.
Make sense ?
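The two-run flow could look roughly like this. It is only a sketch, assuming `clearml` is installed and configured; `task_id_from_argv` and `run` are hypothetical helpers and the project/task names are placeholders. As far as I know, the simulate call has to come before Task.init.

```python
import sys


def task_id_from_argv(argv):
    """Return the Task ID given as the first CLI argument, or None on the first run."""
    return argv[1] if len(argv) > 1 else None


def run(previous_task_id=None):
    """First run: create a Task. Second run: simulate the agent executing that Task."""
    from clearml import Task  # lazy import so the helper above works without clearml

    if previous_task_id:
        # Behave as if the agent were executing the existing Task, only locally.
        Task.debug_simulate_remote_task(task_id=previous_task_id)

    task = Task.init(project_name="examples", task_name="simulate remote run")
    print("Task ID:", task.id)
    return task


# First run:  run(task_id_from_argv(sys.argv))  with no argument, note the printed ID
# Second run: pass that ID as the first command-line argument
```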
This part is odd: SCRIPT PATH: tmp.7dSvBcyI7m
How did you end up with this random filename? How are you running this code?
Hi ConvolutedSealion94
Just making sure, you spun up the docker-compose of clearml-serving as well?
ClearML automatically picks up these reported metrics from TB. Since you mentioned you see the scalars, I assume huggingface reports to TB. Could you verify? Is there a quick code sample to reproduce?
Hi SmallDeer34
Can you try with the latest RC? I think we fixed something with the jupyter/colab/vscode support! pip install clearml==1.0.3rc1
I aborted the task because of a bug on my side
🙂
Following up on this one, is treating abort as failed a must-have feature for the pipeline (in your case), or is it sort of a bug in your opinion?
which to my understanding has to be given before a call to an argparser,
SmarmySeaurchin8 You can call argparse before Task.init, no worries, it will catch the arguments and trains-agent will be able to override them :)
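For example, something along these lines. This is a sketch only: it uses the current `clearml` package name (`trains` was the earlier name of the same package), the `--lr` and `--epochs` arguments are invented for illustration, and `parse_args` / `main` are hypothetical helpers.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line arguments; this can run before Task.init."""
    parser = argparse.ArgumentParser(description="toy training script")
    parser.add_argument("--lr", type=float, default=0.01)
    parser.add_argument("--epochs", type=int, default=10)
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)  # argparse is called first

    from clearml import Task  # lazy import so parse_args works without clearml
    # Task.init is called afterwards; per the reply above it still catches the arguments.
    task = Task.init(project_name="examples", task_name="argparse before init")
    return task, args
```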
I can see the shape is [136, 64, 80, 80]. Is that correct?
Yes, that's correct. As for the name, just try input__0
Note that you also need to convert the model to TorchScript
It does work about 50% of the times
EcstaticGoat95 what do you mean by "work about 50%" ? do you mean the other 50% it hangs ?
Could not locate channel name 'gg_clearml'
CheerfulGorilla72 these are the permissions:
https://github.com/allegroai/clearml/blob/427b98270cc846b5d7e4af49f9732e3eb8d7d3ae/examples/services/monitoring/slack_alerts.py#L13
channels:join channels:read chat:write
My use case is that when I have a merge request for a model modification, I need to provide several pieces of information for our Quality Management System. One is to show that the experiment is a success and the model has some improvement over the previous iteration.
Sounds like a good approach 🙂
Obviously I don't want the reviewer to see all ...
Maybe publish the experiment and move it to a dedicated folder? Then even if they see all other experiments, they are under "development" p...
It looks strange that only a single scalar is reported.
Hi SubstantialElk6
I'm not sure what you are asking 🙂
Basically the clearml-agent will pull a Task from an execution queue and execute it (based on the definition on the Task, i.e. git repo, Python packages, docker image, etc.)
Hi UpsetTurkey67
I circumvented the problem by putting timestamp in task name, but I don't think this is necessary.
Just pass reuse_last_task_id=False to Task.init, it will never try to reuse them 🙂
Hi GreasyPenguin14
Quick question, any reason not to use a 2D scatter ? or a histogram (or any other non time-series plot)?
Could it be that it checks the root target folder and you do not have permissions there, only on subfolders?
do you have docker installed on all slurm agent/worker machines
Docker support?
The idea is that it is not necessary, using the trains-agent you can not only launch the experiment on a remote machine, you can override the parameters, not just cmd line arguments, but any dictionary you connected with the Task or configuration...
build your containers off these two? or are you building directly from code ?
JitteryCoyote63 I found it 🙂
Are you working in docker mode or venv mode ?
I want to be able to access the data just avoid reporting the experiment results
Yes, you are correct 😞
If you just want to skip the logging you can always add an if to the Task.init call ?!
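One way to write that "if", as a rough sketch. The environment variable name `ENABLE_CLEARML` and the helpers `logging_enabled` / `maybe_init` are invented for this example and are not a ClearML feature; `clearml` itself is assumed to be installed and configured.

```python
import os


def logging_enabled(env=None):
    """Return False only when the (hypothetical) ENABLE_CLEARML variable is set to "0"."""
    env = os.environ if env is None else env
    return env.get("ENABLE_CLEARML", "1") != "0"


def maybe_init(project, name):
    """Create a Task only when logging is enabled; otherwise return None."""
    if not logging_enabled():
        return None  # skip experiment logging entirely

    from clearml import Task  # lazy import: not needed when logging is skipped
    return Task.init(project_name=project, task_name=name)
```

The rest of the script then has to cope with the returned value being None when logging is skipped.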
Can you also share the full log? The numbers seem off (and clearml cannot actually "invent" those numbers, they are coming from somewhere...)