
Question - why is this the expected behavior?
It is 🙂 I mean the original Python version is stored, but pip does not support replacing the Python version. It is doable with conda, but then you have to use conda for everything...
What if the cleanup service is launched using the ClearML-Agent Services container?
The easiest is to use the container args and pass the AWS credentials as env variables:
    -e AWS_ACCESS_KEY_ID=abcd -e ....
Make sense ?
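For example, one way to set those container args from code could be (a sketch; the project/task names and credential values are placeholders):
    from clearml import Task

    # Attach the AWS credentials as container env variables on the cleanup task,
    # so the Services agent starts its container with them set.
    cleanup_task = Task.get_task(project_name="DevOps", task_name="Cleanup Service")  # placeholders
    cleanup_task.set_base_docker(
        "ubuntu:20.04 -e AWS_ACCESS_KEY_ID=abcd -e AWS_SECRET_ACCESS_KEY=efgh"
    )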
Hmm, I suspect 'set_initial_iteration' does not change/store the state on the Task, so when it is launched, the value is not overwritten. Could you maybe open a GitHub issue on it?
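For reference, this is the call pattern in question (a sketch; the project/task names and the offset are placeholders):
    from clearml import Task

    task = Task.init(project_name="examples", task_name="resume example")  # placeholder names
    task.set_initial_iteration(100)  # reported iterations should be offset by 100 when resuming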
As far as I understand, ClearML tracks each library imported by the scripts and saves the list of these libraries somewhere (I assume this list is saved as a requirements.txt file, which is later loaded into the venv when the pipeline is running).
Correct
Can I edit this file (just to comment out the row with "object-detection==0.1")?
BTW, regarding the object-detection library. My training scripts have calls like:
Yes, in the UI you can right click on the Task, select "reset", then it...
UpsetTurkey67 are you saying there is a symlink in the original repository, and when it copies it, it breaks the symlink?
With pleasure, I'll make sure we officially release RC1 soon :)
I have no idea what string reference could be used when the steps come from a Task.
Oh I see, you are correct. When it comes to Tasks, the assumption is you are passing strings (with selectors on the strings, i.e. the curly brackets), but there is no fancy serialization/deserialization as you have with pipelines from decorators / functions. The reason for that is that the Task itself is standalone, there is no way for the pipeline logic to actually "pull data" from it and "pass" it to the o...
Meaning if I create a sleep endpoint that is async
Hmm, are you calling "sleep" or "asyncio.sleep"?
Also, are you running the serving service with Gunicorn or Uvicorn?
see here:
None
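To illustrate the difference being asked about, a small sketch (assuming a FastAPI-style async endpoint; the routes here are made up for illustration):
    import asyncio
    import time

    from fastapi import FastAPI

    app = FastAPI()

    @app.get("/blocking")
    async def blocking():
        time.sleep(5)  # blocks the event loop: other requests queue up behind this one
        return {"ok": True}

    @app.get("/non_blocking")
    async def non_blocking():
        await asyncio.sleep(5)  # yields control: other requests keep being served meanwhile
        return {"ok": True}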
So you want to have two Tasks and connect the two ?
Maybe the best approach is to have the current_task be the parent of the Dataset Task?
    dataset._task.set_parent(Task.current_task())
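Putting it together (a sketch; the dataset name/project are placeholders, and note _task is an internal member, as above):
    from clearml import Dataset, Task

    dataset = Dataset.create(dataset_name="my_dataset", dataset_project="datasets")  # placeholders
    dataset._task.set_parent(Task.current_task())  # the current task becomes the dataset task's parent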
Hi CostlyElephant1
What do you mean by "delete raw data"? Data is always fetched to cached folders and clearml takes care of cache cleanup
That said, notice that the mutable copy target is a folder you specify; in this case you should definitely delete it after usage. Wdyt?
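A minimal sketch of what I mean (dataset names and target folder are placeholders):
    import shutil
    from clearml import Dataset

    ds = Dataset.get(dataset_project="datasets", dataset_name="my_dataset")  # placeholders
    folder = ds.get_mutable_local_copy("/tmp/my_dataset_copy")  # target folder you specify (not cached)
    # ... use / modify the data ...
    shutil.rmtree(folder)  # you are responsible for deleting the mutable copy after usage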
ok, I will do a simple workaround for this (use an additional parameter that I can update using parameter_override and then check if it exists and update the configuration in python myself)
Yep sounds good, something like this?
    from clearml.utilities.dicts import ReadOnlyDict, merge_dicts

    overrides = {}
    task.connect(overrides)
    configuration = {}  # stuff here
    task.connect_configuration(configuration)
    configuration = merge_dicts(configuration, overrides)
BTW: this will allow you to override any s...
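For example, on the launching side the override could look something like this (a sketch; "General" is the default section name for connected dicts, the task id and parameter are placeholders):
    from clearml import Task

    cloned = Task.clone(source_task="<template_task_id>")  # placeholder id
    cloned.set_parameters({"General/batch_size": 64})      # overrides the connected "overrides" dict
    Task.enqueue(cloned, queue_name="default")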
Hi ConvolutedSealion94
Yes 🙂
    Task.set_random_seed(123)  # disable setting random number generators by passing None
    task = Task.init(...)
(BTW: draft means they are in edit mode, i.e. before execution, then they should be queued (i.e. pending) then running then completed)
Ohh, two options:
From the script itself you can do:
    from clearml import Task

    task = Task.init(...)
    task.execute_remotely(queue_name='default')
Then run the script locally; it will get to the execute_remotely call, quit the process, and re-launch it on the "default" queue.
Option B:
Use the clearml-task CLI
$ clearml-task --folder <where the script is> --project ...
See https://github.com/allegroai/clearml/blob/master/docs/clearml-task.md#launching-a-job-from-a-local-script
GreasyPenguin14 I think this is what you are looking for:
    Task.get_project_id('project_name')
Right, so this "vault" design is built into the paid tiers of ClearML to achieve exactly that. Long story short, users can put their credentials/configs on the clearml-server and the agent (or the clients) will pull and merge them into the execution.
It's very cool and works really nice, but not part of the open source (or the SaaS tier).
What you could do is store these configurations on the Task itself (one way or another). Maybe for example have an empty definitions.py
file part of ...
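Something along these lines (a sketch; the file name follows the definitions.py idea above, everything else is a placeholder):
    from clearml import Task

    task = Task.init(project_name="examples", task_name="config on task")  # placeholders
    # Store the local definitions.py content on the Task; when executed by the agent,
    # this returns a local path to the (possibly edited) copy stored on the server.
    definitions_path = task.connect_configuration("definitions.py", name="definitions")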
Hi TartSeal39
So the thing is, the agent does not support conda yaml envs. Currently, if the requirements section is empty, the agent will use the requirements.txt of the repo. We first need to add support for conda yaml, and then allow you to disable the auto requirements or push the specific yaml. Would that work? Also, is there a reason the auto package detection is not working?
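As a possible interim workaround (a sketch, assuming you just want to force specific packages into the "installed packages" section instead of relying on auto detection; the package name/version are placeholders):
    from clearml import Task

    # Must be called before Task.init()
    Task.add_requirements("tensorflow", "2.4.0")  # placeholder package / version
    task = Task.init(project_name="examples", task_name="conda env run")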
What's the trains-server version ?
Basically internally we use psutil to get those stats ...
https://github.com/giampaolo/psutil/issues/1011
See the psutil version that fixed that; what do you see in the Task's "installed packages"?
https://github.com/giampaolo/psutil/blob/master/HISTORY.rst#591
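To quickly check which version ended up in the environment (a trivial sketch):
    import psutil
    print(psutil.__version__)  # compare against the release referenced in the HISTORY link above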
Hi SillyPuppy19
I think I lost you half way through.
I have a single script that launches training jobs for various models.
Is this like the automation example on the Github, i.e. cloning/enqueue experiments?
flag which is the model name, and dynamically loading the module to train it.
A Model has a UUID in the system as well, so you can use that instead of the name (which is not unique); would that solve the problem?
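For example, loading a model by its UUID instead of by name (a sketch; the id is a placeholder):
    from clearml import InputModel

    model = InputModel(model_id="<model_uuid>")  # placeholder UUID
    weights_path = model.get_weights()           # local path to the model weights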
This didn't mesh well with Trains, because the project a...
Unfortunately this sounds like a classic case of RBAC (role-based access control), and only the enterprise version has that feature (I think there is a contact-us button on the website for those queries).
The easiest way to support the use case you describe is to share on a Task level 😞
Hi ScaryKoala63
Sure, add the following to your clearml.conf:
    sdk.storage.cache.default_cache_manager_size = 400
I think you are correct, it seems like for some reason you hit the cache limit, and a previous entry was deleted
Are you suggesting the conf file did not set the default size? It sounds like a bug, can you verify?
WorriedParrot51 I now see ...
Two solutions that I can quickly think of:
1. In the code add:
    import sys
    sys.path.append('./my_sub_module')
Assuming you always have to add the sub-directories to make the code work, and assuming they are part of the repository, this is probably the most stable solution.
2. In the UI, in the Docker base image field, add -e PYTHONPATH=/folder
or from code (which is exactly what you did)
a clean interface:
    task.set_base_docker("nvidia/cuda -e PYTHONPATH=/folder")
(some packages that are not inside the cache seem to be missing and then everything fails)
How did that happen?
CloudyHamster42
RC probably in a few days, but notice that it will just remove the warnings; I still can't reproduce the double-axis issue.
It will be helpful if you could send a small script to reproduce the problem.
Maybe this example code can help ? https://github.com/allegroai/trains/blob/master/examples/manual_reporting.py
A few more details on the New RC (1.1.2rc0) change set:
Upload dataset now supports chunksize, for multi-part upload/download (useful with large datasets)
backwards compatibility, i.e. parent datasets do not have to support multi-part datasets
Notice multi-part datasets should be accessed with the latest RC:
    clearml-data upload --chunk-size
    Dataset().upload(..., chunk_size=None)
Get Dataset now supports partial download (i.e. for debugging, or for more efficient multi-node support)
Notice total n...
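A minimal sketch of the partial download on a single node (the dataset id and part numbers are placeholders):
    from clearml import Dataset

    ds = Dataset.get(dataset_id="<dataset_id>")       # placeholder id
    folder = ds.get_local_copy(part=0, num_parts=4)   # this node fetches only part 0 of 4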