is there a way that i can pull all scalars at once?
I guess you mean from multiple Tasks ? (if so then the answer is no, this is on a per Task basis)
Or, can I get the experiments list and pull the data?
Yes, you can use Task.get_tasks to get a list of task objects, then iterate over them. Would that work for you?
https://clear.ml/docs/latest/docs/references/sdk/task/#taskget_tasks
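For example, a rough sketch of that loop (the project name is a placeholder; as far as I remember get_reported_scalars() returns a nested dict of graph → series → x/y values, so adjust to your needs):

from clearml import Task

# "examples" is a placeholder project name
tasks = Task.get_tasks(project_name="examples")
all_scalars = {}
for t in tasks:
    # get_reported_scalars() returns {graph: {series: {"x": [...], "y": [...]}}}
    all_scalars[t.id] = t.get_reported_scalars()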
Hmm I think this was the fix (only with TF2.4), let me check a sec
Hi StormyOx60
Yes, by default it assumes any "file://" or local files are accessible (which makes sense, because if they are not, it will not be able to download them).
is there some way to force it to download the dataset to a specified location that is actually on my local machine?
You can specify that a specific folder is not "local", and what it will do is copy the zip locally and unzip it there.
Is this what you are after ?
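In case it helps, a hedged sketch of pulling a full copy into a folder of your choosing (the dataset project/name and target path are placeholders):

from clearml import Dataset

# placeholders: adjust the project/name and the target folder
ds = Dataset.get(dataset_project="examples", dataset_name="my_dataset")
local_path = ds.get_mutable_local_copy(target_folder="/data/my_dataset_copy")
print(local_path)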
Hi SubstantialElk6
Yes, this is the queue the glue will pull jobs from and push into the k8s. You can create a new queue from the UI (go to the Workers & Queues page, open the Queues tab and press "Create New").
Ignore it 🙂 this is if you are using config maps and need TCP routing to your pods.
As you noted, this is basically all the arguments you need to pass for (2). Ignore them for the time being.
This is the k8s overrides to use if launching the k8s job with kubectl (basically --override...
Actually this is the default for any multi-node training framework (torch DDP / OpenMPI etc.).
The --template-yaml allows you to use a full k8s YAML template (the overrides are just overrides, and do not include most of the configuration options; we should probably deprecate it).
I want to run only that sub-dag on all historical data in an ad-hoc manner
But wouldn't that be covered by the caching mechanism ?
Hi GiganticTurtle0
Sure, OutputModel can be manually connected:
from clearml import Task, OutputModel

model = OutputModel(task=Task.current_task())
model.update_weights(weights_filename='localfile.pkl')
Hmmm, that is a good use case to have (maybe we should have --stop take an argument?)
Meanwhile you can do:
$ clearml-agent daemon --gpus 0 --queue default
$ clearml-agent daemon --gpus 1 --queue default
then to stop only the second one:
$ clearml-agent daemon --gpus 1 --queue default --stop
wdyt?
That is awesome!
If you feel like writing a bit about the use-case and how you solved it, I think AnxiousSeal95 will be more than happy to publish something like that 🙂
Yes, the mechanisms under the hood are quite complex; the automagic does not come for "free" 🙂
Anyhow, your perspective is understood. And as you mentioned, I think your use case might be a bit less common. Nonetheless we will try to come up with a solution (probably an argument for Task.init so you could specify a few more options for the auto package detection).
HighOtter69
Could you test with the latest RC? I think this fixed it:
https://github.com/allegroai/clearml/issues/306
Let me know if I can be of help 🙂
There is a GitHub issue for selecting between "pip freeze" and auto analyze; we could add a "use requirements.txt" option.
wdyt?
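In the meantime, a possible workaround sketch (call it before Task.init; the method name and arguments are from memory of recent SDK versions, so please verify against yours):

from clearml import Task

# force a full "pip freeze" (or point it at a requirements file) instead of auto package analysis;
# must be called before Task.init -- availability/signature may depend on your clearml version
Task.force_requirements_env_freeze(force=True, requirements_file="requirements.txt")
task = Task.init(project_name="examples", task_name="freeze demo")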
If the manual execution (i.e. PyCharm) was working, it should have stored it on the Pipeline Task.
Dynamic GPU option only available with Enterprise version right?
Correct 🙂
WittyOwl57 this is what I'm getting on my console (Notice New lines! not a single one overwritten as I would expect)
` Training: 10%|█ | 1/10 [00:00<?, ?it/s]
Training: 20%|██ | 2/10 [00:00<00:00, 9.93it/s]
Training: 30%|███ | 3/10 [00:00<00:00, 9.89it/s]
Training: 40%|████ | 4/10 [00:00<00:00, 9.87it/s]
Training: 50%|█████ | 5/10 [00:00<00:00, 9.87it/s]
Training: 60%|██████ | 6/10 [00:00<00:00, 9.88it/s]
Training: 70%|███████ | 7/10 [00:00<00...
GiganticTurtle0
I'm assuming here that self.dask_client.map(read_and_process_file, filepaths)
actually does the multi-process/node processing. The way this needs to work, it has to store the current state of the process and then restore it on any remote node/process. In practice this means pickling the local variables (Task included).
First, I would try to use a standalone (static) function for the map; DASK might be able to deduce it does not need to pickle anything, as the function is standalone.
A...
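Roughly what I mean, as a sketch (the function body, client setup and file list are placeholders; the point is that the mapped function is module-level, so Dask does not have to pickle self or the Task object):

from dask.distributed import Client

# module-level (standalone) function -- nothing from the class instance is captured,
# so Dask only pickles the function reference and its arguments
def read_and_process_file(filepath):
    with open(filepath) as f:
        return len(f.read())

client = Client()  # placeholder cluster/client setup
futures = client.map(read_and_process_file, ["a.txt", "b.txt"])  # placeholder file list
results = client.gather(futures)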
Is this example working for you?
https://github.com/allegroai/clearml/blob/master/examples/reporting/model_config.py
SubstantialElk6
Regarding cloning the executed Task:
In the pip requirements syntax, "@" is a hint that tells pip where to find the package if it is not preinstalled.
Usually when you find "@ /tmp/folder" it means the package was preinstalled (usually preinstalled in the docker).
What is the exact scenario that caused it to appear? (This was always the case, before v1 as well.)
For example, the zipp package is installed from PyPI by default and not from a local temp file.
Your fix b...
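For illustration, two hypothetical requirements lines showing the difference (version and path made up):
zipp==3.4.1  # resolved from PyPI
zipp @ file:///tmp/build/zipp-3.4.1-py3-none-any.whl  # "@" points pip at a local/preinstalled copy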
somehow set docker_args and docker_bash_setup_script equivalent??
task.set_base_docker(...)
# somehow setup repo and branch to download to remote instance before running
This is automatically detected based on your local commit/branch, as well as uncommitted changes.
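If it helps, something along these lines (the keyword names are from memory, so double check them against your SDK version; the image, arguments and setup script here are placeholders):

from clearml import Task

task = Task.init(project_name="examples", task_name="docker args demo")
# assumed keyword names -- verify against your clearml version
task.set_base_docker(
    docker_image="nvidia/cuda:11.1.1-runtime-ubuntu20.04",
    docker_arguments="--ipc=host -e MY_ENV=1",
    docker_setup_bash_script=["apt-get update", "apt-get install -y libsm6"],
)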
Hi FiercePenguin76
So currently the idea is that you have full control over per-user credentials (i.e. stored locally). Agents (depending on how they are deployed) can have shared credentials (with AWS the easiest is to push them into the OS environment).
No worries, I'll see what I can do 🙂
Is task.parent something that could help?
Exactly 🙂 something like:
# my step is running here
task = Task.current_task()
the_pipeline_task = Task.get_task(task_id=task.parent)