TenseOstrich47 every agent instance has its own venv copy. Obviously every new experiment will remove the old venv and create a new one. Make sense?
What's the matplotlib version? And the Python version?
it is just a local copy, so you can rerun and reconfigure
JumpyPig73 Do you see all the configurations under the Args section in the "Configuration" Tab ?
(Maybe I'm wrong and the latest RC does not include python-fire support)
SmarmySeaurchin8 regarding (2)
I'm not sure the current visualization supports it. I mean we can put "{}", but that would imply you can edit it, which then we have to support, possible but weird, and this is why:
task.connect({'a': {}, 'b': {'nested': 'value'}})
will become
'a' = '{}'
'b/nested' = 'value'
But then if you edit it to:
'a' = '{'nested': 'value'}'
'b/nested' = 'value'
you have two different ways of presenting the same type of structure...
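To make the flattening concrete, here is a minimal sketch (the project/task names are placeholders, not from this thread) of what task.connect() does with a nested dict:

from clearml import Task

# placeholder project/task names, just to illustrate the flattening behaviour
task = Task.init(project_name="examples", task_name="nested-dict-connect")

params = {'a': {}, 'b': {'nested': 'value'}}
task.connect(params)

# In the Configuration tab this shows up roughly as:
#   a        = '{}'       (empty dict kept as a literal string)
#   b/nested = 'value'    (nested keys flattened with '/')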
If you set the TMP env variable you can control the tmp folder. Would that work?
BTW: we are now adding dataset chunks for more efficient large dataset storage.
Why is it using an OutputModel and an InputModel?
So calling OutputModel will create a new Model entity and upload the data; InputModel will store it as a required input Model.
Basically on the Task you have input & output sections. When you clone the Task you copy the input section into the newly created Task, and the assumption is that when you execute it, your code will create the output section.
Here when you clone the Task you will be cloning the reference to the InputModel (i...
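As a rough illustration of how the two are typically used together (the names and weights URL below are placeholders, not from this thread):

from clearml import Task, InputModel, OutputModel

task = Task.init(project_name="examples", task_name="model-io-sketch")

# register an existing model as this Task's input model
# (placeholder URL; you could also reference an existing model id)
input_model = InputModel.import_model(weights_url="https://example.com/weights/base.pt")
task.connect(input_model)

# ... training code ...

# OutputModel creates a new Model entity; updating the weights uploads the data
output_model = OutputModel(task=task, framework="PyTorch")
output_model.update_weights(weights_filename="final_model.pt")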
try Hydra/trainer.params.batch_size
Hydra separates nesting with "."
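If you want to set it programmatically on a cloned Task before enqueuing, here is a sketch (assuming the value lives under the Hydra section with "."-separated nesting; the task id and queue name are placeholders):

from clearml import Task

template = Task.get_task(task_id="<template_task_id>")
cloned = Task.clone(source_task=template, name="batch-size-64")

# Hydra parameters are stored under the "Hydra" section, nested with "."
cloned.set_parameter("Hydra/trainer.params.batch_size", 64)

Task.enqueue(cloned, queue_name="default")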
AdventurousRabbit79 you are correct, caching was introduced in v1.0. Also notice the default is no caching; you have to specify that you want caching per step.
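For example (a sketch with placeholder project/task names), opting into caching for a single step looks roughly like this:

from clearml import PipelineController

pipe = PipelineController(name="cached-pipeline", project="examples", version="1.0")

# cache_executed_step defaults to False, so enable it per step
pipe.add_step(
    name="preprocess",
    base_task_project="examples",
    base_task_name="preprocess template",
    cache_executed_step=True,
)

pipe.start()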
Hi DisgustedDove53
Is redis used as permanent data storage or just cache?
Mostly cache (I think)
Would there be any problems if it is restarted and comes up clean?
Pretty sure it should be fine, why do you ask ?
Logger.current_logger() will return the logger for the "main" Task.
The "Main" task is the task of this process, a singleton for the process.
All other instances create a Task object. You can have multiple Task objects and log different things to them, but you can only have a single "main" Task (the one created with Task.init).
All the auto-magic stuff is logged automatically to the "main" task.
Make sense ?
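A small sketch of the difference (project/task names are placeholders), assuming a single process with one Task.init call:

from clearml import Task, Logger

# the "main" Task: a singleton for this process, all the auto-magic logging goes here
main_task = Task.init(project_name="examples", task_name="main-task")

# Logger.current_logger() always returns the main Task's logger
Logger.current_logger().report_text("logged to the main task")

# additional Task objects can be created and logged to explicitly
side_task = Task.create(project_name="examples", task_name="side-task")
side_task.get_logger().report_text("logged to the side task")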
Hi ReassuredTiger98
However, the clearml-agent also stops working then.
you mean the clearml-agent daemon (the one that spun the container) is crashing as well ?
Hi @<1715175986749771776:profile|FuzzySeaanemone21>
and then run "clearml-agent daemon --gpus 0 --queue gcp-l4" to start the worker.
I'm assuming the docker service cannot spin a container with GPU access, usually this means you are missing the nvidia docker runtime component
Hi @<1684010629741940736:profile|NonsensicalSparrow35>
So sorry I missed this thread 🙂
Basically your issue is the load balancer that prevents the POST command. You can change that, just add the following line to any clearml.conf:
api.http.default_method: "put"
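In the nested HOCON form of clearml.conf this is equivalent to (a sketch of where the key sits):

# clearml.conf (client side)
api {
  http {
    # use PUT instead of the default POST, so requests pass the load balancer
    default_method: "put"
  }
}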
Since my deps are listed in the dependencies of my setup.py, I don't want clearml to list the dependencies of the current environment
Make sense 🙂
Okay let me check regarding the "." in the venv cache.
Oh...
try to add to your config file:
sdk.http.timeout.total = 300
But do consider a sort of a designer's press kit on your page haha
That is a great idea!
Also you can use:
https://2928env351k1ylhds3wjks41-wpengine.netdna-ssl.com/wp-content/uploads/2019/11/Clear_ml_white_logo.svg
Actually this is the default for any multi-node training framework, torch DDP / openmpi etc.
Hi @<1546665666675740672:profile|AttractiveFrog67>
- Make sure you stored the model's checkpoint (either pass output_uri=True in Task.init or manually upload)
- When you call Task.init pass continue_last_task=True
- Now you can do last_checkpoint = task.models["output"][-1].get_local_copy() and all you need is to load last_checkpoint (see the sketch below)
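Putting it together, a minimal sketch (project/task names and the torch.load call are assumptions, adjust to your framework):

import torch
from clearml import Task

# continue_last_task=True resumes the previous run of this task
task = Task.init(
    project_name="examples",
    task_name="resumable-training",
    continue_last_task=True,
    output_uri=True,  # make sure checkpoints are uploaded
)

output_models = task.models["output"]
if output_models:
    last_checkpoint = output_models[-1].get_local_copy()
    state = torch.load(last_checkpoint)  # resume from here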
this is very odd, can you post the log?
it fails but with COMPLETED status
Which Task is marked "completed" the pipeline Task or the Step ?
LOL AlertBlackbird30 had a PR and pulled it 🙂
Major release due next week; after that we will put a roadmap on the main GitHub page.
Anything specific you have in mind ?
2021-07-11 19:17:32,822 - clearml.Task - INFO - Waiting to finish uploads
I'm assuming a very large set of uncommitted changes 🙂
Ad 1. yes, I think this is kind of a bug. Using _task to get pipeline input values is a little bit ugly
Good point, let's fix it 🙂
new pipeline is built from scratch (all steps etc), but by clicking "NEW RUN" in the GUI it just reuses the existing pipeline. Is that correct?
Oh I think I understand what happens: the way the pipeline logic is built is that the "DAG" is created the first time the code runs, then when you re-run the pipeline it restores the DAG from the Task/backend.
Th...
Hi JitteryCoyote63
The new pipeline is almost ready for release (0.16.2),
It actually contains this exact scenario support.
Check out the example, and let me know if it fits what you are looking for:
https://github.com/allegroai/trains/blob/master/examples/pipeline/pipeline_controller.py
VexedCat68
a Dataset is published, that activates a Dataset trigger. So if every day I publish one dataset, I activate a Dataset Trigger that day once it's published.
From this description it sounds like you created a trigger cycle, am I missing something ?
Basically you can break the cycle by triggering only on a new Dataset with a specific tag (or by creating the auto dataset in a different project/sub-project), as in the sketch below.
This will stop your automatic dataset creation from triggering the "orig...
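A rough sketch of the tag-based approach (project name, tag, and callback are placeholders; assuming the TriggerScheduler from clearml.automation, and that the callback receives the triggering task id):

from clearml.automation import TriggerScheduler

def on_new_dataset(task_id):
    # build the derived dataset here; do NOT publish/tag it the same way,
    # so it will not fire this trigger again
    print(f"triggered by dataset task {task_id}")

trigger = TriggerScheduler(pooling_frequency_minutes=5)
trigger.add_dataset_trigger(
    name="daily-dataset-trigger",
    schedule_function=on_new_dataset,
    trigger_project="datasets/raw",
    trigger_on_publish=True,
    trigger_on_tags=["ready"],  # only datasets tagged "ready" fire the trigger
)
trigger.start()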
Hi SteadyFox10
I'll use your version instead and put any comment if I find something.
Feel free to join the discussion 🙂 https://github.com/pytorch/ignite/issues/892
Thanks for the output_uri, can I put it in the ~/trains.conf file?
Sure you can 🙂
https://github.com/allegroai/trains/blob/master/docs/trains.conf#L152
You can add it to the trains-agent machine's conf file, and/or on your development machine. Notice that once you run ...
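For reference, the relevant key in trains.conf looks roughly like this (the bucket URL is a placeholder):

# ~/trains.conf
sdk {
  development {
    # default upload destination for models and artifacts
    default_output_uri: "s3://my-bucket/trains-models"
  }
}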
Hi MiniatureCrocodile39
I would personally recommend the ClearML show 🙂
https://www.youtube.com/watch?v=XpXLMKhnV5k
https://www.youtube.com/watch?v=qz9x7fTQZZ8