It's just another flag when running the trains-agent
You can have multiple service-mode instances, there is no actual limit 🙂
Hi @<1618780810947596288:profile|ExuberantLion50>
I'm trying to containerize a task using clearml-agent build, following instructions from the docs online.
Do you mean to create a container with the Task's environment for debugging ?
If this is for running the Task there is no need to create a specific container for it, both code and python env are cached.
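If you do want a standalone container for debugging, the CLI form is roughly as follows (the task id and target image name are placeholders):
clearml-agent build --id <task_id> --docker --target my-debug-image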
Yes I was thinking a separate branch.
The main issue with telling git to skip submodules is that it will be easily forgotten and will break stuff. BTW the git repo itself is cached so the second time there is no actual pull. Lastly it's not clear on where one could pass a git argument per task. Wdyt?
I'd prefer to use config_dict, I think it's cleaner
I'm definitely with you
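For reference, a minimal config_dict sketch (project/task names and values are placeholders):
from clearml import Task

task = Task.init(project_name="examples", task_name="config demo")
config_dict = {"lr": 0.001, "batch_size": 32}
# connect_configuration returns the (possibly remotely overridden) dict
config_dict = task.connect_configuration(config_dict)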
Good news:
"new best_model is saved, add a tag best"
Already supported, (you just can't see the tag, but it is there :))
My question is, what do you think would be the easiest interface to tell (post/pre) store, tag/mark this model as best so far (btw, obviously if we know it's not good, why do we bother to store it in the first place...)
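As a rough sketch of what that could look like from the SDK (this uses the existing Task tag API; a model-level equivalent may differ by version):
from clearml import Task

task = Task.init(project_name="examples", task_name="train")
# ... after saving a checkpoint that beats the previous best:
task.add_tags(["best"])  # marks the experiment; the stored model can be tagged similarly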
it fails because my_package can't be installed using pip... so I have to manually edit the section and remove the "my_package"
MagnificentSeaurchin79 did you manually add both "." and my_package ?
If so, what was the reasoning to add my_package if pip cannot install it ?
Hi ZippyAlligator65
You mean like env vars?
@<1539780258050347008:profile|CheerfulKoala77> make sure the AMI id matches the zone of the EC2 machine
So for this...
Sorry, what is exactly "this" ?
AbruptWorm50 can you send the full image? (the X axis is missing from the graph)
While if I just download the right packages from the requirements.txt then I don't need to think about that
I see your point, the only question is how come these packages are not automatically detected?
Good, so we narrowed it down. Now the question is how come it is empty ?
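As a workaround, you can add the missing packages explicitly before Task.init, e.g. (package name/version are placeholders):
from clearml import Task

Task.add_requirements("my_package", "1.0.0")  # must be called before Task.init
task = Task.init(project_name="examples", task_name="train")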
Is it mentioned anywhere in the docs that clearml-agent needs to be installed from the system python? If not, I suggest it gets added.
You are right, I will check and fix if not 🙂
Thank you so much for helping.
My pleasure
or do you mean the machine where I ran the experiment locally?
Yes this one
available agent, i.e. not running anything else.
I mean how long would instance 1 wait until instance 2 of the experiment is up and running?
In other words, what happens if all the nodes/agents are working and we still "need" an additional instance.
This is basically like "pre-allocating" the nodes, only they wait in real-time until the additional node joins them.
Agent A pulls the 3-node Task, the Task clones itself (Task B) and enqueues it on the "very high priority queue". Task A waits until Task B is ru...
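A very rough sketch of that clone-and-wait pattern (the queue name and the polling loop are assumptions):
import time
from clearml import Task

task_a = Task.current_task()
task_b = Task.clone(source_task=task_a, name=task_a.name + " (node 2)")
Task.enqueue(task_b, queue_name="very-high-priority")
while task_b.get_status() not in ("in_progress", "completed"):
    time.sleep(5)  # Task A waits until Task B is actually running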
Ohhhh, okay as long as you know, they might fail on memory...
Hi @<1541954607595393024:profile|BattyCrocodile47>
Can you trigger a pre-existing Pipeline via the ClearML REST API?
Yes
I'd want to have a Lambda function trigger the Pipeline for a batch without needing to have all the Pipeline code in the Lambda function.
Easiest is to use the clearml SDK, which basically is clone / enqueue (notice that a pipeline is also a kind of Task). See here: https://github.com/allegroai/clearml/blob/3ca6900c583af7bec18792a4a92592b94ae80cac/example...
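In practice that's just a few lines (project, task and queue names are placeholders):
from clearml import Task

template = Task.get_task(project_name="my_project", task_name="my_pipeline")
cloned = Task.clone(source_task=template, name="pipeline trigger")
Task.enqueue(cloned, queue_name="services")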
So does that mean "origin" solves the issue ?
How can I track in clearML that this and that row was part of experiment x because it belonged to test/training data set y?
Hi @<1543766544847212544:profile|SorePelican79>
the experiments themselves will have a link to the Dataset they were using. From a dataset perspective, the idea is not to limit you, so essentially it will package all your files and retrieve them when you fetch the dataset. In terms of specifying a row/sample, my suggestion is to mark those rows when training a...
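For completeness, fetching a dataset in code looks roughly like this (names are placeholders):
from clearml import Dataset

ds = Dataset.get(dataset_project="examples", dataset_name="my_dataset")
local_path = ds.get_local_copy()  # cached, read-only copy of all the files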
Hi JitteryCoyote63 a few implementation details on the services-mode, because I'm not certain I understand the issue.
The docker-agent (running in services mode) will pick a Task from the services queue, then it will set up the docker for it, spin it up, and make sure the Task starts running inside the docker (once it is running inside the docker you will see the service Task registered as an additional node in the system, until the Task ends). Once that happens the trains-agent will try to fetch the...
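For reference, a services-mode agent is usually launched with something like:
clearml-agent daemon --services-mode --queue services --docker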
help_models is a dir in the git
And the git is registered on the experiment correctly ?
TrickyFox41 are you saying that if you add Task.init in the code it works, but when you are calling "clearml-task" it does not work? (in both cases editing the Args/overrides?)
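For reference, a typical clearml-task invocation looks roughly like this (all names/values are placeholders):
clearml-task --project examples --name remote-run --script train.py --args epochs=10 --queue default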
but I still have the problem if I try to run locally for debugging purposes
clearml-agent execute --id ...
Is this still an issue? This is basically the same as the remote execution; maybe you should add --docker (if the agent is running in docker mode)?
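i.e. something like (the task id is a placeholder):
clearml-agent execute --id <task_id> --docker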
You will be able to set it.
You will just not see the output in the console log , but everything is running and being executed
How can I make it such that any update to the upstream database
What do you mean "upstream database"?
which was trained in a jupyter notebook.
Hmm that might be the issue, it assumes a local script running, let me verify that
@<1523701304709353472:profile|OddShrimp85> are you trying to shut down the one running on your machine ?
Hi @<1547028031053238272:profile|MassiveGoldfish6>
What is the use case? the gist is you want each component to be running on a different machine. and you want to have clearml do the routing of data and logic between.
How would that work in your use case?
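For example, a minimal controller sketch (projects, task names and the queue are placeholders):
from clearml import PipelineController

pipe = PipelineController(name="demo pipeline", project="examples", version="1.0.0")
pipe.add_step(name="stage_data", base_task_project="examples", base_task_name="data task")
pipe.add_step(name="train", parents=["stage_data"], base_task_project="examples", base_task_name="train task")
pipe.start(queue="services")  # each step can run on a different agent/machine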
Oh my bad, post 0.17.5 🙂
RC will be out soon, in the meantime you can install directly from GitHub:
pip install git+
Hi FunnyTurkey96
Any chance you can try to run with the latest from GitHub? (I just tested your code and it seemed to work on my machine.)
pip install git+