Disable automatic model uploads
Disable the auto upload:
task = Task.init(..., auto_connect_frameworks={'pytorch': False})
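For example, a minimal sketch (the project/task names are just placeholders):

from clearml import Task

# keep the framework integration, but do not auto-upload PyTorch checkpoints
task = Task.init(
    project_name="examples",            # placeholder
    task_name="no auto model upload",   # placeholder
    auto_connect_frameworks={'pytorch': False},
)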
Hi ScantChimpanzee51
How are you launching the code ?
Basically the easiest way is to do so with the example you just mentioned,
Can this issue be reproduced ?
BTW: dockerhub is free and relatively cheap to upgrade 🙂
(GitHub also offers docker registry)
but actually that path doesn't exist and it is giving me an error
So you are saying you only uploaded the "meta-data" i.e. a text file with links to the files, and this is why it is missing?
Is there a way to change the path inside the .txt file to clearml cache, because my images are stored in clearml cache only
I think a good solution would be to store the path in the txt file as relative path, i.e. instead of /Users/adityachaudhry/data/folder... as ./data/folder
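A minimal sketch of that idea (the file names and dataset root are just placeholders):

import os

# rewrite absolute image paths as paths relative to the dataset root
dataset_root = "/Users/adityachaudhry/data"
with open("images.txt") as f:
    absolute_paths = [line.strip() for line in f if line.strip()]
relative_paths = ["./" + os.path.relpath(p, dataset_root) for p in absolute_paths]
with open("images_relative.txt", "w") as f:
    f.write("\n".join(relative_paths))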
It's dead simple to install:
pip install trains-agent
then you can simply do:
trains-agent execute --id myexperimentid
TrickyRaccoon92 actually Click is on the to do list as well ...
This is odd I was running the example code from:
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
It is stored inside a repo, but the steps that are created (i.e. checking the Task that is created) do not have any repo linked to them.
What's the difference ?
HugePelican43 sure you can, usually the limiting factor is memory, as it cannot be shared among processes, so if one process allocates all the memory, the second will crash with an out-of-memory error
it is shown in the recording above
It was so odd, I had to ask 🙂 okay let me see if we can reproduce
I don't have any error message in the browser console - just an empty array returned on events.get_task_logs. This bug didn't exist on version 1.1.0 and is quite annoying…
meaning the RestAPI returns nothing, is that correct ?
Is the clearml-agent queue not available in the open source?
fully available in the open source, what is missing is the SLURM connection. In the open source version the daemon is installed per machine (node) and spins containers/venvs on that machine. The enterprise version adds support for using SLURM to provision the nodes. I hope it helps 🙂
so do you think it would be possible to spin up another daemon, which listens to this daemon, which then runs a slurm job?
This is exactly what the ...
Is this reproducible with the hpo example here:
https://github.com/allegroai/clearml/tree/400c6ec103d9f2193694c54d7491bb1a74bbe8e8/examples/optimization/hyper-parameter-optimization
What's your clearml version? (And is it possible you verify with the latest version?)
WickedGoat98
Put the agent.docker_preprocess_bash_script in the root of the file (i.e. you can just add the entire thing at the top of the trains.conf)
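Something like this at the top of trains.conf (the commands themselves are just placeholders):

agent {
    # executed inside the container before the task starts (placeholder commands)
    docker_preprocess_bash_script: [
        "echo 'running before the task'",
        "apt-get update",
    ]
}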
Would it be possible to place a trains.conf in the mapped local folder containing the filesystem and mongodb data etc., e.g.
I'm assuming you are referring to the trains-agent services, if this is the case, sure you can,
Edit your docker-compose.yml, under line https://github.com/allegroai/trains-server/blob/b93591ec3226...
Hmmm.
could you change the api_server: http://localhost:8008 to your host IP?
for example:
api_server: http://192.168.1.11:8008
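i.e. in the api section of that trains.conf, something along these lines (the IP is an example, and the web/files ports assume the default trains-server setup):

api {
    api_server: http://192.168.1.11:8008
    web_server: http://192.168.1.11:8080
    files_server: http://192.168.1.11:8081
}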
One issue that I see is that the Dockerfile inside the agent container
Not sure I follow, these are settings for the default container to be used when the agent spins a Task for you.
How are you running the agent itself ?
but here I can tell them: return a dictionary of what you want to save
If this is the case you have two options: either store the dict as an artifact (this makes sense if it is not a standalone model you would like to use later), or store it as a model.
Artifact example:
https://github.com/allegroai/clearml/blob/master/examples/reporting/artifacts.py
getting them back
https://github.com/allegroai/clearml/blob/master/examples/reporting/artifacts_retrieval.py
Model example:
https:/...
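A minimal sketch of the artifact option, both storing and getting it back (the task id and names are placeholders):

from clearml import Task

task = Task.init(project_name="examples", task_name="artifact upload")   # placeholder names
# store the dictionary as an artifact on the task
task.upload_artifact(name="my_dict", artifact_object={"lr": 0.01, "epochs": 10})

# later, from another script, fetch it back
source_task = Task.get_task(task_id="<source-task-id>")   # placeholder id
my_dict = source_task.artifacts["my_dict"].get()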
Wait, how do I reproduce it on community server? Maybe it has something to do with number of columns ? Or whether it is already wider than the screen? What's your browser / OS ?
Ohh no I see, yes that makes sense, and I was able to reproduce it, thanks!
Hi EnviousStarfish54
I remember this feature request, let me check where it stands..
Hi MinuteCamel2
Can I disable it from automatically uploading model checkpoints to ClearML servers?
Maybe this one can help :)
https://www.youtube.com/watch?v=etGjxOKG9lo
deleted all of the models from my ClearML project but I still receive this message. Do you know why?
It might take a few hours to update... 🙂
AstonishingWorm64 I found the issue.
The clearml-serving assumes the agent is working in docker mode, as it has to have the triton docker (where the triton engine is installed).
Since you are running in venv mode, tritonserver is not installed, hence the error
GiganticTurtle0
That definitely makes sense. Where can I specify callbacks in the PipelineDecorator API?
Hmm there isn't one actually... (the interface I was thinking about was PipelineController ...)
Would it make sense to throw an exception in the pipeline execution code?
BTW: I just verified, if the pipeline step fails an exception is raised (ValueError)
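So a minimal sketch would be to catch it around the pipeline call (the function name is a placeholder for your @PipelineDecorator.pipeline decorated function):

try:
    executing_pipeline()   # placeholder: your decorated pipeline function
except ValueError as ex:
    print(f"a pipeline step failed: {ex}")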
I know that there is possibility to set up some budget - for example seconds of running after which optimization stops. But is there a possibility to specify a boolean condition when work should stop?
RoundMosquito25 you mean when you reach a limit of loss<Threshold or something similar ?
Hi ShinyWhale52
Every execution of the pipeline (by definition) will create a new job based on the pipeline steps
This is the reason you see all the steps twice (the default assumption is you wish to re-run the step, as this is part of the processing workflow, e.g. training a model).
the model has been overwritten. I guess this is due to this instruction:
This is because you are storing it locally to the same path, it just reflects the fact you just overwrote your model.
To create a...
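For example, a minimal sketch assuming PyTorch, saving each checkpoint to a different file so earlier ones are not overwritten (model/epoch are placeholders):

import torch

# one file per epoch instead of reusing the same path (placeholder variables)
torch.save(model.state_dict(), f"checkpoint_epoch_{epoch:03d}.pt")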
DilapidatedDucks58 so is this more like a pipeline DAG that is built ?
I'm assuming this is more than just grouping ?
(by that I mean, accessing a Task's artifact does necessarily point to a "connection", no? Is it a single Task everyone is accessing, or a "type" of a Task?)
Is this process fixed, i.e. for a certain project we have a flow (1) executed Task of type A, then Task of type (B) using the artifacts from Task (A). This implies we might have multiple Tasks of types A/B but they are alw...
ClearML maintains a github action that sets up a dummy clearml-server,
You have one, it's http://app.clear.ml (not a dummy one, but for this purpose it will work)
thoughts ?
RobustGoldfish9
I think you need to set the trains-agent docker to be aware of the host, so it knows how to mount data/cache/configurations into the sibling docker
It should look something like:
TRAINS_AGENT_DOCKER_HOST_MOUNT="/mnt/host/data:/root/.trains"
So if running a docker:
docker run -e TRAINS_AGENT_DOCKER_HOST_MOUNT="/mnt/host/data:/root/.trains" ...
You can just spin another agent on the same machine 🙂
StorageManager 🙂
UnevenOstrich23
but interesting that the auto-reload config is not working as I expected.
Unfortunately the trains-agent does not support auto-reloading the config file yet. If you think this would be a great feature, please feel free to open a GitHub feature request issue 🙂