
It has the same effect as start/wait/stop, kinda weird
I still haven't figured out how to make files downloaded this way visible for future `get_local_copy` calls, though
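To illustrate what I mean, roughly (project/dataset names are placeholders):
```python
from clearml import Dataset

# placeholders for the actual project/dataset names
ds = Dataset.get(dataset_project="examples", dataset_name="my_dataset")

# I'd expect this call to pick up the files that were already downloaded,
# instead of fetching everything from remote storage again
local_path = ds.get_local_copy()
```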
And can I store models with no attachment to tasks? For example, original pretrained checkpoints
I think it would be intuitive to have exact name matching, or to introduce another parameter that regulates whether `name` is a regex. And when it is a regex, it could return all matched models (e.g. as a list) rather than only the last one
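Something along these lines (the `name_is_regex` flag is hypothetical, just to illustrate the proposal):
```python
from clearml import Model

models = Model.query_models(
    project_name="examples",              # placeholder
    model_name=r"checkpoint_epoch_\d+",   # would match several models if treated as a regex
    # name_is_regex=True,                 # hypothetical parameter, not in the current API
)
```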
There are some questions in this channel already regarding pipeline V2. Are there any tutorials, changelogs, or examples I can refer to?
Sorry for the delay
Not reproduced, but I caught another error when running `pipeline_from_tasks.py`:
```
Traceback (most recent call last):
  File "pipeline_from_tasks.py", line 31, in <module>
    pipe.add_step(name='stage_data', base_task_project='examples', base_task_name='pipeline step 1 dataset artifact')
  File "/home/kirillfish/.local/lib/python3.6/site-packages/clearml/automation/controller.py", line 276, in add_step
    base_task_project, base_task_name))
ValueError: Could not find ...
```
SmugDolphin23 maybe I could make a pull request? Are there any community guidelines on how to make pull requests to ClearML?
I could insert some updated info to my conference talk if you share the recording by tomorrow morning 😄
CostlyOstrich36 hi! Yes, as I expected, it doesn't see any files unless I call `add_files` first. But `add_files` has no `output_url` parameter and tries to upload to the default place. This returns a `413 Request Entity Too Large` error because there are too many files, so using the default location is not an option. Could you please help with this?
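For reference, here is roughly what I'm doing (bucket and names are placeholders; I'm assuming the non-default destination would have to go on `upload()` if `add_files` can't take it):
```python
from clearml import Dataset

ds = Dataset.create(dataset_name="my_dataset", dataset_project="examples")  # placeholder names

# add_files has no output_url parameter
ds.add_files("/path/to/local/files")

# assumption: the non-default destination would have to be passed here instead
ds.upload(output_url="s3://my-bucket/datasets")  # placeholder bucket
ds.finalize()
```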
so that the way of doing it would be like this:
```python
all_models = Model.query_models(project_name=..., task_name=..., tags=['running-best-checkpoint'])
all_models = sorted(all_models, key=lambda x: extract_epoch(x))
for model in all_models[:-num_to_preserve]:
    Model.remove(model, delete_weights_file=True)
```
AgitatedDove14 are models technically `Task`s, and can they be treated as such? If not, how do I delete a model permanently (both from the server and from AWS storage)?
SuccessfulKoala55 sorry, that was a bug on my side. It was just referring to another class named Model
SuccessfulKoala55 Turns out we copied the Elasticsearch database as well. Also, it seems that the error is thrown only for experiments with artifacts
No, when I run the pipeline from the console on my local machine, it for some reason launches on the `clearml-services` hostname (despite the fact that I specified the queue with the desired agent via `pipe.set_default_execution_queue` in my code)
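For reference, this is roughly how the queue is set in my code (queue and project names are placeholders):
```python
from clearml.automation import PipelineController

pipe = PipelineController(name="my-pipeline", project="examples", version="1.0.0")

# steps should run on the agent listening on this queue (placeholder name)
pipe.set_default_execution_queue("gpu-queue")

pipe.add_step(
    name="stage_data",
    base_task_project="examples",
    base_task_name="pipeline step 1 dataset artifact",
)

# as I understand it, the controller task itself goes to the default 'services' queue here
pipe.start()
```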
I regularly run into the same problem when I launch pipelines locally (for remote execution)
However, when I clone the pipeline from the web UI and launch it again, it works. Is there a way to bypass this?
Yes, it works, thank you! The question remains, though: why won't the Docker containers launch on services?
Solved. The problem was a stray space before the image name in the `Image` section in the web UI. I think you should probably strip the string before proceeding to the environment-building step, to avoid this annoying issue. Of course, users could double-check before launching, but this will come up every once in a while regardless
More specifically, there are 2 tasks with almost identical docker commands. The only difference is the image itself. The task with one image works, and with the other image it fails. Both are valid images that launch nicely on my laptop. Both images exist in the registry. Maybe you have some ideas what could possibly be wrong here?
Maybe displaying 9 or 10 by default would be enough, plus a clearly visible, thick scrollbar on the right
In short, what helped is `gitlab+deploy-token` in the GitLab URL
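That is, the repository URL ends up looking roughly like this (token id, secret, host, and repo path are placeholders):
```
https://gitlab+deploy-token-12345:<deploy_token_secret>@gitlab.example.com/my-group/my-repo.git
```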
Suppose I have the following scenario (a real-world project, a real ML pipeline scenario):
- I have separate projects for different steps (ETL, train, test, TensorRT conversion...). Every step has its own git repository, docker image, branch, etc.
- For quite a long time, the steps were not functioning as parts of an automated pipeline; they were used, for example, for collaborative experimentation (training and validation steps). We were just focusing on reproducibility/versioning, etc.
- After some time, we decided...