Do you have Python 3.7 inside the docker image?
BTW: trains-agent will not delete the venv until the next run, so you can check exactly what's missing there
I think the main issue is that, for some reason, the running container changed one of the files inside the temp folder. The host machine is then "stuck" with a file that the root user owns/changed, and it cannot reuse or delete the temp folder.
I think the fix is to make sure the container deletes the temp folder when it is done
Thanks for checking @<1545216070686609408:profile|EnthusiasticCow4>, the stable release will be out soon
- Correct. Basically the order is: REST API body dictionary -> preprocess -> process -> postprocess -> REST API dictionary returned
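To make the flow concrete, here is a rough sketch of such a pipeline class. The method names follow the order above, but the signatures are simplified, so the actual clearml-serving Preprocess template may differ:
from typing import Any

class Preprocess(object):
    def preprocess(self, body: dict) -> Any:
        # REST API body dictionary -> model input
        return body["data"]

    def process(self, data: Any) -> Any:
        # placeholder "inference" step (real code would run the model here)
        return [x * 2 for x in data]

    def postprocess(self, data: Any) -> dict:
        # model output -> REST API dictionary returned to the caller
        return {"prediction": data}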
Hi @<1603198134261911552:profile|ColossalReindeer77>
I would also check this one: None
And the agent is in docker mode or venv mode?
Hi OddAlligator72
for instance - remove all the metrics from some step onward?
(I think that as long as the Task is not published you could do such a thing directly with the RestAPI (aka APIClient from python))
What's the use case?
GleamingGrasshopper63 what do you have configured in the "package manager" section?
https://github.com/allegroai/clearml-agent/blob/5446aed9cf6217f876d3b62226e38f21d88374f7/docs/clearml.conf#L64
Hi BroadMole64
'from X import Y', which says that there is no such module X. Any help? Thanks.
Can you see package X under the "Execution" tab, in the "Installed Packages" section?
(think of this section as a requirements.txt: in order for the agent to install the package on the remote machine, it should be listed there)
By default the agent will add the root of the git repository into the PYTHONPATH, so that you can import...
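Conceptually the agent does something like the following before launching your script (the path is hypothetical, just to illustrate why repo-root imports work):
import sys

# repo root added to the PYTHONPATH by the agent (path here is hypothetical)
sys.path.insert(0, "/home/user/.clearml/vcs-cache/my_repo")

# so a file such as <repo_root>/utils/helpers.py can be imported as:
# from utils.helpers import load_data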
Yes 🙂 documentation is being worked on ... Anyhow we will be uploading a new documentation site soon (hopefully in a week or so), putting it all on GitHub so it will be easier for the community to edit and add more
BTW: you can also use cron for that:
None
@reboot sleep 60 && clearml-agent daemon ...
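For example, a crontab entry along these lines (queue name and flags are placeholders, adjust to your setup):
# start the agent about a minute after boot
@reboot sleep 60 && clearml-agent daemon --queue default --detached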
This should have worked with the latest clearml RC.
And you verified it is not working?
I'm not sure how to debug it, that would be my first question. So I should first check if docker is executed with --gpus? I'll pay attention to this next time this happens, thanks.
The first line of the Task console log should have the exact docker command that was used; this could be a good start
Also check whether there is any chance another agent is listening to this queue; maybe the Task actually runs somewhere without a GPU at all?
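For a quick sanity check on the machine itself you can run something like this (the image name is a placeholder, any CUDA-enabled image will do):
# verify the docker runtime can expose the GPU at all
docker run --rm --gpus all <your-cuda-image> nvidia-smi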
ContemplativeGoat37 I think there was an issue just like you described, and it was solved in later versions. Upgrade to the latest clearml package version and you should be fine 🙂
Now in case I needed to do it, can I add new parameters to cloned experiment or will these get deleted?
Adding new parameters is supported 🙂
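A minimal sketch of doing it from code (task ID, parameter names and queue name are placeholders):
from clearml import Task

# clone the original experiment
cloned = Task.clone(source_task="ORIGINAL_TASK_ID", name="clone with extra params")

# add / override parameters on the clone (hypothetical parameter names)
cloned.set_parameters({"General/new_param": 42, "General/learning_rate": 0.001})

# send the clone for execution
Task.enqueue(cloned, queue_name="default")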
We are using k8s glue to spawn the job. ...
I think this is actual network latency, nothing to do with the jobs; could it be that the server is very far away?
What happens when you manually start a Task from your machine ?
Is the latency fixed? Is it just when starting a new Task?
https://stackoverflow.com/questions/5419/python-unicode-and-the-windows-console
Hmm, try to set this one before spinning up the agent:
Windows: set PYTHONIOENCODING=:replace
Inside Colab: os.environ["PYTHONIOENCODING"] = ":replace"
It should actually work the same, if you find out it fails to properly register let me know (and then I guess a github issue is the next step)
Set it on the PID of the agent process itself (i.e. the clearml-agent python process)
the task is being Aborted rather than being in Draft. Am I missing something?
Yes, the reason is so you do not miss anything that might have already been reported on it.
And usually execute_remotely will get the execution queue as a parameter (i.e. immediately launching the Task)
You can now (starting v1.0) enqueue an aborted Task so it should not make a difference, you can also reset the Task and edit it in the UI
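For example (project, task and queue names are placeholders):
from clearml import Task

task = Task.init(project_name="examples", task_name="remote run")

# stop local execution here and enqueue the Task for an agent to run
task.execute_remotely(queue_name="default", exit_process=True)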
uploading artifacts
If you call task.upload_artifact(...), there is no need to set output_uri. Only if you want models to be uploaded (e.g. torch.save(...)) do you have to set output_uri.
Otherwise correct 🙂
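Something along these lines (bucket URI, names and the artifact object are just examples):
from clearml import Task

# output_uri is only needed so that saved models (e.g. torch.save(...)) get uploaded
task = Task.init(project_name="examples", task_name="artifacts demo", output_uri="s3://my-bucket/models")

# artifacts are uploaded explicitly, regardless of output_uri
task.upload_artifact(name="results", artifact_object={"accuracy": 0.91})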
Hi @<1523709807092043776:profile|GrittyKangaroo27>
some of my completed datasets,
This only has an effect on the dataset while it is being uploaded; once completed, it is there for logging purposes only. What exactly is the use case? (Just to verify: once a Task/Dataset is completed you cannot edit it.)
you could also use:
https://github.com/allegroai/clearml/blob/ce7e77a00e869a2690f31cbc578636ce88bc4613/docs/clearml.conf#L188
and set up the clearml.conf on the user's machine to automatically log the environment variables at run time (stored under the Configuration tab).
Then the agent will pull these same variables at execution time and set them
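If I remember the key correctly, the relevant clearml.conf section looks roughly like this (the variable names are just examples):
sdk {
    development {
        # log any OS environment variable matching these entries ('*' suffix is supported)
        log_os_environments: ["MY_APP_CONFIG", "AWS_*"]
    }
}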
Sure thing, hopefully I'll remember to ping tomorrow once GitHub is synced, I'd appreciate it if you could verify the fix works 🙂
Hi @<1610808279263350784:profile|FriendlyShrimp96>
Is there a way to get a list of variants given a metric, or even just a full list of metrics and variants for a given task id?
Try this
None
from clearml.backend_api.session.client import APIClient
c = APIClient()
metrics = c.events.get_task_metrics(tasks=["TASK_ID_HERE"], event_type="training_debug_image")
print(metrics)
I think API ...
Hi ScatteredClams84
Is there any parameter that adjusts the "number of files that can be stored in the cache"? I am using clearml python version 1.0.3 to upload artifacts and get the artifacts back from a task. (edited)
Yes you are correct, the default value is 100 entries.
You can configure it in the clearml.conf, just add:
sdk.storage.cache.default_cache_manager_size = 1000
or from code:
from clearml.storage.cache import CacheManager
CacheManager.get_cache_manager(cache_file_...
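If memory serves, the full call takes a cache_file_limit argument, roughly like this (treat the argument name as an assumption):
from clearml.storage.cache import CacheManager

# raise the number of cached entries kept locally (argument name assumed)
CacheManager.get_cache_manager(cache_file_limit=1000)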
LazyFish41 just making sure: you built a container from the Dockerfile and used it as the base docker image for the Task, is that correct?
Also notice the clearml-agent will not change the entry point of the docker image, meaning that if the entry point does not end with plain bash, it will not actually run anything