Hi CluelessElephant89
When you edit the args (General section) in the UI, you are editing the args for "remote execution"
(i.e. when executed by the agent, the args dict will get the values from the UI, as opposed to "manual execution", where the UI gets the values from the code)
In order to simulate the "remote execution" inside your development environment, try:
```
from clearml import Task
# simulate remote execution of a specific Task instance
Task.debug_simulate_remote_task(task_id='R...
```
AstonishingWorm64 I found the issue.
clearml-serving assumes the agent is working in docker mode, as it has to have the triton docker (where the triton engine is installed).
Since you are running in venv mode, tritonserver is not installed, hence the error.
and it's in the "installed packages" from the child task:
This is because the agent always updates back the full venv setup, so you will always be able to reproduce the entire thing (as opposed to dev time, where it lists only the directly imported packages)
Yes, that sounds like a good start. DilapidatedDucks58, can you open a github issue with the feature request?
I want to make sure we do not forget
Hmm, is there a way to do this via code?
Yes, clone the Task with Task.clone
Then do data = task.export_task()
and edit the data object (see the execution section)
Then update it back with task.update_task(data)
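A minimal sketch of that flow (the placeholder ID and the edited field are just examples, assuming the exported dict layout):
```
from clearml import Task

# clone an existing task, then edit its definition via export/update
cloned = Task.clone(source_task='<source_task_id>', name='edited clone')  # placeholder ID
data = cloned.export_task()             # the full task definition as a dict
data['script']['branch'] = 'my-branch'  # example edit in the execution section (assumed field)
cloned.update_task(data)                # write the modified definition back
```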
but I need to dig deeper into the architecture to understand what we need exactly from the k8s glue.
Once you do, feel free to share. Basically there are two options: use the k8s scheduler with dynamic pods, or spin the trains-agent as a service pod and let it spin the jobs
Hi ShallowArcticwolf27
First of all:
If the answer to number 2 is no, I'd loveee to write a plugin.
Always appreciated ❤
Now actually answering the Q:
Any torch.save (or any other framework save) will either register or automatically upload the file (or folder) in the system. If it is a folder, it will be zipped and uploaded; if a file, it is just uploaded to the assigned storage output (the clearml-server, any object storage service, or a shared folder). I'm not actually sure I...
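For example, a minimal sketch of the auto-upload behavior (the project, task, and bucket names are placeholders):
```
import torch
from clearml import Task

# 'output_uri' sets the assigned storage output; the bucket here is a placeholder
task = Task.init(project_name='examples', task_name='model save',
                 output_uri='s3://my-bucket/models')

model = torch.nn.Linear(4, 2)
# this save call is captured automatically and the file is uploaded to output_uri
torch.save(model.state_dict(), 'model.pt')
```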
DepressedChimpanzee34 something along the lines of:
```
from multiprocessing.pool import ThreadPool

p = ThreadPool()

def get_last_metric(t):
    return t.get_last_scalar_metrics()

task_scalars_list = p.map(get_last_metric, top_tasks)
p.close()
```
We parallelized the network calls, as I'm assuming the delay is in the fetching
So the main difference is that kedro pipelines are function-based steps (I might be overly simplifying, so please take it with a grain of salt), while in ClearML a pipeline step is a Job, i.e. it needs its own environment and runs longer than a few seconds (as opposed to a single function)
You are doing great 🙂 don't worry about it
JitteryCoyote63 instead of _update_requirements, call the following before Task.init:
```
Task.add_requirements('torch', '1.3.1')
Task.add_requirements('git+')
```
Hi ConvolutedSealion94
Yes 🙂
```
Task.set_random_seed(123)  # disable setting random number generators by passing None
task = Task.init(...)
```
Hi JumpyDragonfly13
- is "10.19.20.15" accessible from your machine (i.e. can you ping to it)?
- Can you manually SSH to 10.19.20.15 on port 10022 ?
Let me see if I can reproduce something
SubstantialElk6 what's the command line you are using?
SweetGiraffe8 Task.init will autolog everything (git/python packages/console etc), for your existing process.
Task.create purely creates a new Task in the system, and lets you manually fill in all the details on that Task
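A minimal sketch of the difference (project and task names are just examples):
```
from clearml import Task

# autologs git info, python packages, console output, etc. for the current process
task = Task.init(project_name='examples', task_name='autologged run')

# only registers a new Task in the system; you fill in its details manually
draft = Task.create(project_name='examples', task_name='manually defined task')
```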
Make sense?
Hi LovelyHamster1
You mean as a section name or a variable?
Could you change this example to include a variable that breaks the support?
https://github.com/allegroai/clearml/blob/master/examples/frameworks/hydra/hydra_example.py
Hi ReassuredTiger98
Basically assuming Linux, init.d will do the trick
https://unix.stackexchange.com/questions/20357/how-can-i-make-a-script-in-etc-init-d-start-at-boot
The -m src.train
is just the entry script for the execution; all the rest will be taken care of by the Configuration section (whatever you pass after it will be ignored if you are using argparse, as it auto-connects with ClearML)
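A minimal sketch of the argparse auto-connect behavior (the argument names are just examples):
```
import argparse
from clearml import Task

task = Task.init(project_name='examples', task_name='train')

parser = argparse.ArgumentParser()
parser.add_argument('--lr', type=float, default=0.001)
args = parser.parse_args()  # auto-logged; values can be overridden from the UI on remote execution
print(args.lr)
```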
Make sense?
Ohh if this is the case, you might also consider using offline mode, so there is no need for backend
https://clear.ml/docs/latest/docs/guides/set_offline#setting-task-to-offline-mode
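A minimal sketch of offline mode (the session path is a placeholder):
```
from clearml import Task

# enable offline mode before Task.init; everything is recorded locally instead of the backend
Task.set_offline(offline_mode=True)
task = Task.init(project_name='examples', task_name='offline run')

# when a server is available later, the recorded session can be imported:
# Task.import_offline_session('<path/to/offline_session.zip>')
```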
Hi AmiableFish73
Hi all - is there an easier way to track the set of datasets used by a particular task?
I think the easiest is to give the Dataset an alias; it will automatically appear in the Configuration section:
```
Dataset.get(..., alias="train dataset")
```
wdyt?
You might be able to also find out exactly what needs to be pickled using the f_code of the function (but that's limited to the C implementation of Python).
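A small sketch of inspecting a function's code object (a function exposes it as __code__; a frame exposes the same object as f_code):
```
def example(x, y=1):
    return x + y

code = example.__code__   # the code object (what a frame exposes as f_code)
print(code.co_varnames)   # argument/local variable names
print(code.co_names)      # global names the function references
```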
Nice!
I like the idea of using the timeit interface, and I think we could actually hack it to do most of the heavy lifting for us 🙂
when you clone the Task, it might be before it is done syncing git / packages.
Also, since you are using 0.16 you have to have a section name (Args or General etc.)
How will task b use the parameters ? (argparser / connect dict?)
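A minimal sketch of the connected-dict option with an explicit section name (the values are just examples):
```
from clearml import Task

task = Task.init(project_name='examples', task_name='task b')

params = {'lr': 0.001, 'batch_size': 32}  # example defaults
task.connect(params, name='General')      # 'name' selects the section (Args, General, etc.)
# when executed by the agent, the dict values are overridden from the UI
```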
because a pipeline is composed of multiple tasks, different tasks in the pipeline could run on different machines.
Yes!
Or more specifically, they could run on different queues, and as you said in your other response, we could have a queue for smaller CPU-based instances, and another queue for larger GPU-based instances.
Exactly !
I like the idea of having a queue dedicated to CPU-based instances that has multiple agents running on it simultaneously. Like maybe four agents.
Th...