Hmmm maybe
I thought that was expected behavior from poetry side actually
I think this is the expected behavior, hence bug?!
1st: is it possible to make a pipeline component call another pipeline component (as a substep)
Should work as long as they are in the same file, you can however launch and wait any Task (see pipelines from tasks)
2nd: I am trying to call a function defined in the same script, but unable to import it. I passing the repo parameter to the component decorator, but no change, it always comes back with "No module named <module>" after my
from module import function
c...
Hi WickedBee96
Queue1 will take 3GPUs, Queue2 will take another 3GPUs, so in Queue3 can I put 2-4 GPUs??
Yes exactly !
if there are idle GPUs so take them to process the task? o
Correct, basically you are saying, this queue needs a minimum of 2 GPUs, but if you have more allocate them to the Task it pulled (with a maximum of 45 GPUs)
Make sense ?
Also I can’t call the “preprocess” function since there is no valid endpoint to be hitting
Wait now I'm confused, when you are calling " None " you are actually calling the preprocess function running on the inference container, and this one in turn (automatically) calls the Triton container.
Are you calling the Triton manually?
Could you share your preprcoess.py , and the command line you have used to register the two model versions ?
(based on ...
Number of entries in the dataset cache can be controlled via cleaml.conf : sdk.storage.cache.default_cache_manager_size
, I need to understand it what happens when I press "Enqueue" In web UI and set it to default queue
The Task ID is pushed into the execution queue (from the UI / backend that is it), Then you have clearml-agent
running on Your machine, the agent listens on queue/s and pulls jobs from queue.
It will pull the Task ID from the queue, setup the environment according to the Task (i.e. either inside a docker container or in a new virtual-env), clone the code/apply uncommitted changes ...
What about output_uri?
If you are using StorageManager directly, output_uri
is not relevant
I want to keep the above setup, the remote branch that will track my local will be on
fork
so it needs to pull from there. Currently it recognizes
origin
so it doesn’t work because the agent then can’t find the commit.
So you do not want to push the change set ?
You can basically add the entire change set (uncomitted changes) from the last pushed commit).
In your clearml.conf, set store_code_diff_from_remote: true
https://github.com/allegroai...
Could you test if this is working:
https://github.com/allegroai/clearml/blob/master/examples/reporting/matplotlib_manual_reporting.py
basically use the template 🙂 we will deprecate the override option soon
do you have git repo link in the execution section of the experiment ?
CooperativeFox72 btw, are you guys running those 20 experiments manually or through trains-agent ?
BTW: is this on the community server or self-hosted (aka docker-compose)?
SoreDragonfly16 notice that if in the web UI you aborting a task it will do exactly what you described, print a message and quit the process. Any chance someone did that?
Ooops 😞
task.get_tags()
task.set_tags()
If Task.init() is called in an already running task, don’t reset auto_connect_frameworks? (if i am understanding the behaviour right)
Hmm we might need to somehow store the state of it ...
Option to disable these in the clearml.conf
I think this will be to general, as this is code specific , no?
But every agent is a different pod so I do not know how properly share the folder with images.
Can I conclude Kubernetes running the agents ?
Hi GracefulDog98
As UnevenDolphin73 pointed you might be looking for https://clear.ml/docs/latest/docs/references/sdk/task#execute_remotely
Which will stop the current local process, and enqueue the task on the "default" queue, for the agent to execute.
Is this what you are looking for ?
The idea is you can run your code once in "development" mode, so you know everything is working, then from the UI (or programmatically) you can clone the experiment, edit the configuration (or anythin...
Since my deps are listed in the dependencies of my setup.py, I don't want clearml to list the dependencies of the current environment
Make sense 🙂
Okay let me check regrading the "." in the venv cache.
Hi @<1556450111259676672:profile|PlainSeaurchin97>
Is there any simple way to use
argparse
to pass a clearml task name?
need to call
args = task.connect(args)
.
noooo 🙂 there is no need to do that, the arguments are automatically detected
see for yourself
args = parse_args()
task = Task.init(task_name=args.task_name)
This is odd , and it is marked as failed ?
Are all the Tasks marked failed, or is it just this one ?
Hi @<1523701868901961728:profile|ReassuredTiger98>
The sdk.development.default_output_uri
is used for Artifacts and Models. debug samples (or anything else the Logger class creates) will use the api.file_server
On the Task itself, you have the "output destination" (in the Execution tab) which would override the "output_uri" on a Task level
Does that make sense ?
Hi SubstantialElk6
Yes this is the queue the glue will pull jobs from and push into the k8s. You can create a new queue from the UI (go to the workers&queues page and to the Queue Tab and press on "create new" Ignore it 🙂 this is if you are using config maps and need TCP routing to your pods As you noted this is basically all the arguments you need to pass for (2). Ignore them for the time being This is the k8s overrides to use if launching the k8s job with kubectl (basically --override...
Hi SmugOx94
Hmm are you creating the environment manually, or is it done by Task.init ?
(Basically Task.init will store the entire environment of conda, and if the agent is working with conda package manager it will use it to restore it)
https://github.com/allegroai/clearml-agent/blob/77d6ff6630e97ec9a322e6d265cd874d0ab00c87/docs/clearml.conf#L50
I think it would make sense to have one task per run to make the comparison on hyper-parameters easier
I agree. Could you maybe open a GitHub issue on it, I want to make sure we solve this issue 🙂
DepressedChimpanzee34
What's the hydra version ?
I tested with 1.1.0dev3 and it worked for me