This will disable storing the uncommitted changes
@<1754676270102220800:profile|AlertReindeer55> , I think what @<1523701087100473344:profile|SuccessfulKoala55> means is that you can also set the docker image on the experiment level itself. If you go into the "EXECUTION" tab of the experiment, you should see the image in the container section
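For reference, the same thing can be done from code before enqueueing the task - a minimal sketch, assuming a recent clearml SDK where Task.set_base_docker accepts a docker_image argument (the image name and queue are just placeholders):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="docker image demo")
# Request a specific container for when an agent runs this task
# (same as filling the container section in the UI)
task.set_base_docker(docker_image="nvidia/cuda:11.8.0-runtime-ubuntu22.04")
# Send it to an execution queue
task.execute_remotely(queue_name="default")
```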
Hi @<1742355077231808512:profile|DisturbedLizard6> , you can open a GitHub feature request for this to be added 🙂
What are the exact steps you are currently doing now? Is the folder/script in a repo?
Hi @<1749965229388730368:profile|UnevenDeer21> , an NFS is one good option. You can also point all agents on the same machine to the same cache folder. Or, as you suggested, point all workers to the same cache on a mounted NFS
Hi @<1558986867771183104:profile|ShakyKangaroo32> , what version of clearml are you using?
Hi @<1523701949617147904:profile|PricklyRaven28> , you mean that a single machine will have multiple workers on it, each "serving" a slice of the GPU?
Hi @<1742355077231808512:profile|DisturbedLizard6> , you can achieve this using the following env var:
CLEARML_AGENT_FORCE_EXEC_SCRIPT
@<1644147961996775424:profile|HurtStarfish47> , you also have the auto_connect_frameworks parameter of Task.init to disable the automatic logging, and then manually log using the OutputModel class to name, register, and upload the model
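A minimal sketch of that flow, assuming the clearml SDK's OutputModel class (the framework key, file name, and model name are just placeholders):
```
from clearml import Task, OutputModel

# Disable automatic framework logging (here only for PyTorch, as an example)
task = Task.init(
    project_name="examples",
    task_name="manual model logging",
    auto_connect_frameworks={"pytorch": False},
)

# ... training code that saves "model.pt" locally ...

# Manually register the model under a custom name and upload the weights
output_model = OutputModel(task=task, name="my-custom-model")
output_model.update_weights(weights_filename="model.pt")
```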
Hi @<1523701842515595264:profile|PleasantOwl46> , the version is released, thus public. Not sure what you mean, can you please elaborate?
Hi @<1613344994104446976:profile|FancyOtter74> , I think this is caused by creating the dataset in the same task. Therefore there is a connection between the task and the dataset, and they are moved to a special folder for datasets. Is there a specific reason why you're creating both a Task & Dataset in the same code?
Hi @<1750689997440159744:profile|ShinyPanda97> , I think you can simply move the model to a different project as part of the pipeline
Hi @<1671689458606411776:profile|StormySeaturtle98> , I'm afraid that's not possible. You could rerun the code on the other workspace though 🙂
Hi, if you pass an input model, at the end of the training you will have your output model. Why do you want to fetch the input model from the previous step?
You can fetch the task object via the SDK and inspect task.data or do dir(task) to see what else is inside.
You can also fetch it via the API using tasks.get_by_id
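A short sketch of both routes, assuming a configured clearml environment and a placeholder task ID:
```
from clearml import Task
from clearml.backend_api.session.client import APIClient

task_id = "<your_task_id>"  # placeholder

# SDK route: fetch the Task object and inspect it
task = Task.get_task(task_id=task_id)
print(task.data)   # raw task data structure
print(dir(task))   # everything else available on the object

# API route: query the backend directly
client = APIClient()
task_obj = client.tasks.get_by_id(task=task_id)
print(task_obj.name, task_obj.status)
```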
@<1576381444509405184:profile|ManiacalLizard2> , the rules for caching steps are as follows - first you need to enable it. Then, assuming there is no change in the step's input compared to the previous run AND no code change, the output from the previous pipeline run is reused. Code from imports shouldn't change, since requirements are logged from previous runs and used in subsequent runs
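A minimal sketch of enabling caching on a pipeline step, assuming the PipelineController interface (project, names, and the step function are placeholders):
```
from clearml import PipelineController

def preprocess(raw_size=100):
    # heavy work whose output we want reused between runs
    return list(range(raw_size))

pipe = PipelineController(name="cached-pipeline-example", project="examples", version="1.0.0")
pipe.add_function_step(
    name="preprocess",
    function=preprocess,
    function_kwargs=dict(raw_size=100),
    function_return=["processed"],
    cache_executed_step=True,  # reuse previous output if inputs & code are unchanged
)
pipe.start_locally(run_pipeline_steps_locally=True)
```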
Hi @<1576381444509405184:profile|ManiacalLizard2> , I think the correct format is PACKAGE @ git+<repository URL>
Hi @<1576381444509405184:profile|ManiacalLizard2> , I would suggest playing with the Task object in Python. You can do dir(<TASK_OBJECT>) to see all of its parameters/attributes.
From the environment variable
no, it's an environment variable
Hi @<1576381444509405184:profile|ManiacalLizard2> , I think this is the env var you're looking for
CLEARML_AGENT_SKIP_PIP_VENV_INSTALL
you should set it on the machine running the agent
So your HPO job is affected by the azure_storage_blob package? How are you running HPO? Can you provide logs & configurations for two such different runs?
Hi @<1727497172041076736:profile|TightSheep99> , you can change it in the settings -> configuration section
Hi @<1529271085315395584:profile|AmusedCat74> , this is the default image I use: projects/ml-images/global/images/c6-deeplearning-tf2-ent-2-3-cu110-v20201105-ubuntu-1804
I guess the image really depends on your needs