Hi UnevenDolphin73
You mean this part?
https://github.com/allegroai/clearml-agent/blob/5afb604e3d53d3f09dd6de81fe0a494dacb2e94d/docs/clearml.conf#L212
(In other words, the Task's Environment section is a bit unclear)
Yes, we should expand the docs there, but generally you are correct, it should work as you described 🙂
Hi ShinyRabbit94
system_site_packages: true
This is set automatically when running in "docker mode", no need to worry 🙂
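For reference, this is roughly what the setting looks like in clearml.conf on the agent machine (a sketch, section names are from the default conf):
```
agent {
    package_manager {
        # reuse the container's system packages inside the task venv;
        # in docker mode the agent turns this on automatically
        system_site_packages: true
    }
}
```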
What exactly is the error you are getting?
Could it be the container itself has the python packages installed in a venv and not as "system packages"?
Hi @<1544853721739956224:profile|QuizzicalFox36>
http:/34.67.35.46:8081/...
Notice there is a / missing in the link (it should be http:// ). How is that possible?
Let me know if I can be of help 🙂
(you can find it in the pipeline component page)
Hi UnevenOstrich23
if --docker is enabled, does that mean every new experiment will be executed in dedicated agent worker containers?
Correct
I think the missing part is how to specify the docker for the experiment?
If this is the case, in the web UI, clone your experiment (which will create a draft copy, that you can edit), then in the Execution tab, scroll down to the "base docker image" and specify the docker image to use.
Notice that you can also add flags after the docker im...
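You can also set it from code before enqueueing, something like this (a sketch, the image name and flag are just examples):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="docker demo")
# base docker image, with extra docker flags appended after the image name
task.set_base_docker("nvidia/cuda:11.8.0-runtime-ubuntu22.04 --ipc=host")
```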
BulkyTiger31 could it be there is some issue with the elastic container?
Can you see any experiment metrics?
And I think the default is 100 entries, so it should not get cleaned.
and then they are all removed and for a particular task it even happens before my task is done
Is this reproducible? Who is cleaning it, and when?
upload_artifact
will actually do two things:
1. upload the file to the trains-server
2. register it as an artifact on the experiment
What did you mean by "register the artifact manually"? You still need to upload the file to the trains-server (so it is later accessible)
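For reference, uploading looks something like this (a sketch, the names and path are just examples):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="artifact demo")
# uploads the file to the server and registers it on the experiment
task.upload_artifact(name="videos", artifact_object="data/videos.csv")
```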
if I want to run the experiment the first time without creating the template?
You mean without manually executing it once ?
Unfortunately that is correct. It continues as if nothing happened!
oh dear, let me make sure this is taken care of
And thank you for the reproduce code!!!
I think your "files_server" is misconfigured somewhere, I cannot explain how you ended up with this broken link...
Check the clearml.conf on the machines, or the env vars?
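For reference, this is the relevant entry in clearml.conf (the URL here is just an example):
```
api {
    # default destination for uploaded artifacts and debug samples
    files_server: http://my-clearml-server:8081
}
```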
However, it's very interesting why the ability to cache the step impacts the artifacts' behavior
From your log:
videos_df = StorageManager.download_file(videos_df)
Seems like "videos_df" is the DataFrame, why are you trying to download the DataFrame ? I would expect to try and download the pandas file, not a DataFrame object
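i.e. something along these lines (a sketch, the URL is hypothetical):
```python
import pandas as pd
from clearml import StorageManager

# download the stored pandas file, then load it into a DataFrame
local_path = StorageManager.get_local_copy("s3://my-bucket/videos_df.csv")
videos_df = pd.read_csv(local_path)
```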
Right, if this is the case, then just use 'title/name 001'
it should be enough (I think this is how TB separates title/series, i.e. metric/variant)
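e.g. with TensorBoard, a minimal sketch:
```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()
# "title" becomes the title/metric and "name 001" the series/variant in the UI
writer.add_scalar("title/name 001", 0.5, global_step=0)
```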
LudicrousParrot69 we are working on adding nested projects, which should help with the humongous mass of experiments HPO can create. This is a more generic solution for the nesting issue (since nesting inside a table is probably not the best UX solution 🙂)
Looks like a great idea, I'll make sure to pass it along and that someone replies 🙂
Hi VexedCat68
What type of data is it? And what type of annotations?
Streaming data into the training process is great, but is it post quality control?
Hi RoughTiger69
I'm actually not sure about DVC support here as well; from these links, syncing and registering create a link, not an immutable copy.
And the sync between the local and remote copies seems to download the remote and compare it to the local copy.
Basically, adding a remote source does not mean DVC will create an immutable copy of the content, it's just a pointer to a bucket (feel free to correct me if I misunderstood their capability)
https://dvc.org/doc/command-reference/...
SmallBluewhale13 the final path is automatically generated, you only need to specify the bucket itself. By default it will be your "files_server"
https://github.com/allegroai/clearml/blob/c58e8a4c6a1294f8acec6ed9cba81c3b91aa2abd/docs/clearml.conf#L10
You can either change the configuration (which will make sure all uploaded artifacts will always be there, including debug images etc.)
You can specify where you want the artifacts and debug images to be uploaded by setting:
https://allegro....
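In code it would look something like this (the bucket name is just an example):
```python
from clearml import Task

# store all artifacts / models / debug samples of this task in the bucket
task = Task.init(
    project_name="examples",
    task_name="upload demo",
    output_uri="s3://my-bucket/clearml",
)
```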
trains was not able to pick the right wheel when I updated the torch req from 1.3.1 to 1.7.0: It downloaded wheel for cuda version 101.
Could you send a log? It should have worked 🙂
Correct, and that also means the code that runs is not auto-magically logged.
Hi JitteryCoyote63 could you provide a few more details, is this a repo for a python module (i.e. in the installed packages) or a submodule?
(no objection to add an argument but, I just wonder what's the value)
Try to manually edit the "Installed Packages" (right click the Task, select "reset", now you can edit the section)
and change it to: -e git+ssh@github.com:user/private_package.git@57f382f51d124299788544b3e7afa11c4cba2d1f#egg=private_package
(assuming " pip install -e
mailto:git+ssh@github.com :user/...
" will work, should solve the issue )
FiercePenguin76 in the Task's Execution tab, under "script path", change it to "-m filprofiler run catboost_train.py".
It should work (assuming the "catboost_train.py" is in the working directory).
Then it initiates a run on AWS, which I want to use the same task-id.
BoredPigeon26 Clone the Task, it basically creates a new copy (of the setup/configuration etc.).
Then you can launch it on an aws instance (I'm assuming with clearml-agent)
wdyt?
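Roughly like this from code (a sketch, the task ID and queue name are placeholders):
```python
from clearml import Task

# cloning keeps the setup/configuration, creating a new draft task
source = Task.get_task(task_id="<original-task-id>")
new_task = Task.clone(source_task=source, name="cloned run")
# enqueue it so a clearml-agent (e.g. on the AWS instance) picks it up
Task.enqueue(new_task, queue_name="aws")
```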
But it writes over the execution tab in the GUI
It does, you are correct; it will however not overwrite the reports (logged scalars etc.)
I think the main risk is that ClearML upgrades to MongoDB vX.Y, and Mongo changed the API (which they did because of Amazon), and now the API call (i.e. the mongo driver) stops working.
Long story short, I would not recommend it 🙂
Okay, could you test with: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/.singularity.d/libs/