yup, it's there in draft mode so I can get the latest git commit when it's used as a base task
Yes that seems to be the problem, if it is in draft mode, you have no outputs...
I'm assuming you mean for the clients, right?
Is it possible to substitute these steps using containers instead?
I'm not sure I follow, could you expand?
for a GPU with more than 16GB of GPU RAM and less than 40GB, so sometimes we need to provision an A100 to get the training speed we want, but we don't use all the GPU RAM
Oh that makes sense...
Just saw this one, this might help?
https://www.globenewswire.com/news-release/2022/10/24/2539924/0/en/ClearML-and-Genesis-Cloud-Announce-New-MLOps-Partnership-Delivering-100-Green-Energy-Compute-Solution-for-Machine-Learning.html
DM me the entire log, I would assume this is something with the configuration
SmarmyDolphin68 okay, what's happening is that the process exits before the actual data is sent (report_matplotlib_figure is an async call, and the data is sent in the background)
Basically you should just wait for all the events to be flushed: task.flush(wait_for_uploads=True)
That said, quickly testing it, it seems it does not wait properly (again, I think this is due to the fact that we do not have a main Task here, I'll continue debugging)
In the meantime you can just do sleep(3.0)
And it wil...
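Putting it together, a minimal sketch of the workaround (assuming clearml/trains with a standard matplotlib figure; the sleep value is arbitrary):
```python
import time
import matplotlib.pyplot as plt
from clearml import Task  # `from trains import Task` on older versions

task = Task.current_task()

fig = plt.figure()
plt.plot([1, 2, 3], [4, 5, 6])

# report_matplotlib_figure is asynchronous, the data is uploaded in the background
task.get_logger().report_matplotlib_figure(
    title="my plot", series="series A", iteration=0, figure=fig
)

# wait for all events/uploads to be sent before the process exits
task.flush(wait_for_uploads=True)
time.sleep(3.0)  # temporary workaround while flush does not wait properly here
```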
What do you mean by a custom queue?
On the Queues page you have a plus button, this will just create a new queue
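If it helps, a queue can also be created programmatically; a sketch using the APIClient, where the queue name is just an example:
```python
from clearml.backend_api.session.client import APIClient

client = APIClient()
# same effect as the "+" button on the Queues page
client.queues.create(name="my_custom_queue")
```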
BTW: I tested the code you previously attached, and it showed the plot in the "Plots" section
(Tested with latest trains from GitHub)
trains-agent should be deployed to GPU instances, not the trains-server.
The trains-agent's purpose is to let you send jobs to a GPU instance (at least in most cases).
The "trains-server" is a control plane, basically telling the agent what to run (by storing the execution queues and tasks). Make sense?
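To make it concrete, a rough sketch of the flow (project, task, and queue names are made up): the server only stores the queue, and the agent running on the GPU machine pulls tasks from it:
```python
from trains import Task  # `from clearml import Task` on newer versions

# clone an existing (template) task and push the clone into an execution queue
template = Task.get_task(project_name="examples", task_name="my training task")
cloned = Task.clone(source_task=template, name="my training task (GPU run)")
Task.enqueue(cloned, queue_name="default")

# on the GPU instance, the agent polls the server and runs whatever is queued:
#   trains-agent daemon --queue default --gpus 0
```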
It should be under script.diff:
'script': {'binary': '', 'repository': '', 'tag': '', 'branch': '', 'version_num': '', 'entry_point': '', 'working_dir': '', 'requirements': {'pip': ''}, 'diff': ''}
For some reason this is empty in your case, are you seeing it in the UI?
If you are querying the current task (i.e. running) it might not be there yet.
You can call this internal function that returns only after the repo detection is done: task._wait_for_repo_detection()
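For example, something along these lines (a sketch; _wait_for_repo_detection is an internal call and may change between versions):
```python
from trains import Task  # `from clearml import Task` on newer versions

task = Task.current_task()

# repository detection runs in the background, block until it is done
task._wait_for_repo_detection()

# the uncommitted changes end up under the task's script section
script = task.data.script
print(script.repository, script.branch)
print(script.diff)
```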
HappyDove3
see here https://github.com/allegroai/clearml-pycharm-plugin
That's with the key at
/root/.ssh/id_rsa
You mean inside the container that the autoscaler spun up?
Notice that by default the agent would mount the host's .ssh over the existing .ssh inside the container; if you do not want this behavior you need to set: agent.disable_ssh_mount: true
in clearml.conf
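i.e. something like this on the machine running the agent (a minimal clearml.conf snippet):
```
agent {
    # do not mount the host's ~/.ssh over the .ssh inside the container
    disable_ssh_mount: true
}
```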
Hi GiganticTurtle0
The problem is that the packages that I define in 'required_packages' are not in the corresponding scripts
What do you mean by that? Is "Xarray" a wheel package? Is it installable from a git repo (example: pip install git+
http://github.com/user/xarray/axrray.git )
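For reference, if the package only lives in a git repository you can also add it as an explicit requirement of the task, e.g. (a sketch, the repo URL is a placeholder):
```python
from clearml import Task

# must be called before Task.init(); adds a pip-style git requirement to the task
Task.add_requirements("git+https://github.com/user/xarray.git")

task = Task.init(project_name="examples", task_name="xarray test")
```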
Hi NastyFox63
What do you mean not all of them are shown?
Do they have different series/titles? Are they plots or scalars? How are you reporting them?
HandsomeCrow5
BTW: out of curiosity, how do you generate the HTML reports? I remember a few users suggesting trains should have report-generating functionality
I think it is only in get_task
(and by default it is true)
I think query task does not filter the
Thanks JitteryCoyote63 !
Any chance you want to open a GitHub issue with the exact details, or a fix with a PR?
(I just want to make sure we fix it as soon as we can)
trains-agent RC (which they tell me will be out tomorrow) will have a switch to do that, just so it is easier
Hi GreasyPenguin14
Yes, I think you are right the series name should be next to the title. Let me check it...
Hi @<1523701066867150848:profile|JitteryCoyote63>
Could you please push the code for that version to GitHub?
oh, seems like it is not synced, thank you for noticing (it will be taken care of immediately)
Regarding the issue:
Look at the attached images
None does not contain a specific wheel for cuda117 on x86, they use the default pip one

I'm wondering why this is the case, as Docker best practices do indicate we should use a non-root user on production images.
The docker image for the service-agent is not root?
Hi ShallowArcticwolf27
from the command line to a remote machine while loading a local .env file as a configuration object?
Where would the ".env" go to? Are we trying to pass it to the remote machine somehow?
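If the goal is to ship the local .env along with the task, one option (a sketch; the path and name are just examples) is to attach it as a configuration object and read it back on the remote run:
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="env config example")

# locally this uploads the .env content to the task; when executed remotely,
# connect_configuration returns the path of a local copy fetched from the server
config_path = task.connect_configuration("/path/to/.env", name=".env")

with open(config_path) as f:
    print(f.read())
```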
Seems like settings on the clearml-server disappeared (specifically default queue tag?!)
could it be the polling on the Task (can't remember what the interval is), but it will update its state once every X minutes/seconds
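If you need the fresh state on demand instead of waiting for the next poll, something like this should do (a sketch, the task id is a placeholder):
```python
from clearml import Task

task = Task.get_task(task_id="<task-id>")
# force a refresh from the server and read the current status
task.reload()
print(task.get_status())  # e.g. "queued", "in_progress", "completed"
```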