Hi BoredPigeon26
what do you mean by "reuse the task"? Is this manual execution (i.e. from code)?
How about archiving the old version?
You can also force Task.init to always create a new Task (which preserves the previous run alongside the execution tab)
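For example (a minimal sketch; the project/task names are placeholders):
from clearml import Task

# reuse_last_task_id=False always creates a brand new Task,
# preserving the previous run alongside it
task = Task.init(
    project_name="examples",
    task_name="my experiment",
    reuse_last_task_id=False,
)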
Basically what's the specific use case?
This will fix it, the issue is the "no default value" that breaks the casting:
@PipelineDecorator.component(cache=False)
def step_one(my_arg=""):
Okay, that kind of makes sense. Now my follow-up question is: how are you using the ASG? I mean, the clearml autoscaler does not use it, so I'm just wondering what the big picture is, before we solve this little annoyance 🙂
p.s. any chance you can get me the nvidia driver version? I can't seem to find the one for v22 on amazon
Noooooooooo, it is still working 🙂
Hi @<1724235687256920064:profile|LonelyFly9>
So, I noticed that with the REST API at least the
/tasks.get_all
endpoint appears to have an undocumented maximum page size of 500.
Yeah otherwise the request size might be too big, but you have pagination:
page (integer, optional): Page number, returns a specific page out of the resulting list of tasks. Minimum value: 0.
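For example, a minimal paging sketch with requests (the server URL and credentials are placeholders):
import requests

api = "https://api.clear.ml"  # your api server
# exchange the API key pair for a bearer token
token = requests.post(api + "/auth.login", auth=("<access_key>", "<secret_key>")).json()["data"]["token"]
headers = {"Authorization": "Bearer " + token}

page, page_size = 0, 500
while True:
    resp = requests.post(api + "/tasks.get_all", json={"page": page, "page_size": page_size}, headers=headers)
    tasks = resp.json()["data"]["tasks"]
    if not tasks:
        break
    # ... process this page of up to 500 tasks ...
    page += 1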
Disable automatic model uploads
Disable the auto upload:
task = Task.init(..., auto_connect_frameworks={'pytorch': False})
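Full context (a sketch; the project/task names are placeholders):
from clearml import Task

# 'pytorch': False disables the automatic logging/upload of PyTorch models only;
# all other framework bindings stay on
task = Task.init(
    project_name="examples",
    task_name="no auto model upload",
    auto_connect_frameworks={'pytorch': False},
)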
Yes, it is reproducible. Do you want a snippet?
Already fixed 🙂 please ping tomorrow, I think an RC should be out soon with the fix
ShallowGoldfish8 this call does that:
https://github.com/allegroai/clearml/blob/0397f2b41e41325db2a191070e01b218251bc8b2/examples/advanced/execute_remotely_example.py#L127
Could it be it was never allocated to begin with?
Hi LazyLeopard18
I think that these toy examples will help:
1. Uploading a local dataset:
https://github.com/allegroai/events/blob/master/odsc20-east/generic/dataset_artifact.py
2. Pre-processing data:
https://github.com/allegroai/events/blob/master/odsc20-east/generic/process_dataset.py
3. Training example:
https://github.com/allegroai/events/blob/master/odsc20-east/scikit-learn/sklearn_jupyter.ipynb
Legit, if you have a cached_file (i.e. exists and accessible), you can return it to the caller
So if you are using the latest clearml (i.e. 1.3+), re-enqueuing the pipeline will automatically continue it from where it stopped.
With previous versions (which is your case, I think), you clone the pipeline Task, change the parameter, and enqueue it.
(The state of the pipeline itself is stored on the Task, and when you clone it, you are cloning the state as well.)
Make sense?
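Something along these lines (a sketch; the task id, parameter name and queue are placeholders):
from clearml import Task

# clone the pipeline controller Task (its state comes with it),
# override a parameter, then enqueue the clone
original = Task.get_task(task_id="<pipeline_task_id>")
cloned = Task.clone(source_task=original, name="pipeline continue")
cloned.set_parameters({"Args/my_param": "new_value"})
Task.enqueue(cloned, queue_name="default")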
import os
os.environ['CLEARML_PROC_MASTER_ID'] = ''
Nice catch! (I'm assuming you also called Task.init somewhere before, otherwise I do not think this was necessary)
I think I solved it by deleting the project and running the base_task once before the hyperparameter optimization
So is it working now? Is everything there?
Hi CheekyElephant36
First you need to run it once on your machine. Once this is done (only a few steps are enough), you can clone it and enqueue it. Then, to actually connect the AWS autoscaler (the part that spins up machines and runs tasks), go to Applications and select the AWS autoscaler.
Btw I think the next video will be about YOLO + autoscaler
Hi, I would like to understand how I can set the pip cache location for my agent,
ClumsyElephant70 by default the pip cache (and all other cache folders) are mounted back into the host itself, under ~/.clearml/
I'm assuming the idea is a shared cache; if this is the case, do:
docker_pip_cache = ~/my_shared_nfs/pip-cache
https://github.com/allegroai/clearml-agent/blob/e3e6a1dda81bee2dd20a64d09746568e415f1823/docs/clearml.conf#L139
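i.e. in the agent's clearml.conf (the NFS path is just an example):
agent {
    # shared pip cache, mounted into every container
    docker_pip_cache = ~/my_shared_nfs/pip-cache
}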
Then the only other option is that /tmp is out of space (pip uses it to uncompress the .whl files, then deletes them)
wdyt?
CrookedWalrus33
Force SSH git authentication; it will auto-mount the .ssh folder from the host into the docker container
https://github.com/allegroai/clearml-agent/blob/6c5087e425bcc9911c78751e2a6ae3e1c0640180/docs/clearml.conf#L25
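i.e. in the agent's clearml.conf:
agent {
    # convert all git links to SSH, and mount the host's ~/.ssh into the container
    force_git_ssh_protocol: true
}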
Hi @<1556450111259676672:profile|PlainSeaurchin97>
Is there any simple way to use
argparse
to pass a clearml task name? Do I need to call
args = task.connect(args)
?
noooo 🙂 there is no need to do that, the arguments are automatically detected
see for yourself
args = parse_args()
task = Task.init(task_name=args.task_name)
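A minimal runnable version (the argument name and default are just examples):
import argparse
from clearml import Task

parser = argparse.ArgumentParser()
parser.add_argument("--task-name", default="my experiment")
args = parser.parse_args()

# Task.init picks the argparse arguments up automatically, no task.connect() needed
task = Task.init(project_name="examples", task_name=args.task_name)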
suspect permissions, but not entirely sure what and where
Seems like it.
Check the config file on the agent machine
https://github.com/allegroai/clearml-agent/blob/822984301889327ae1a703ffdc56470ad006a951/docs/clearml.conf#L18
https://github.com/allegroai/clearml-agent/blob/822984301889327ae1a703ffdc56470ad006a951/docs/clearml.conf#L19
Yes it does, but these files must be committed to begin with. Basically, think of it as the 'git diff' output being stored, and then the agent applies it
That should not be complicated to implement. Basically you could run 'clearml-agent execute --id <task_id>' as the SageMaker cmd. Can you manually launch it on SageMaker?
How does ClearML select reference branch? Could it be that ClearML only checks "origin" branch?
Yes 🙂 I think we can quickly fix that; I'm just trying to figure out if there are downsides to running "git ls-remote --get-url" without origin
Yes. Though again, just highlighting that the naming of
foo-mod
is arbitrary. The actual module simply has a folder structure with an implicit namespace:
Yep I think this is exactly why it fails detecting it, let me check that
And it's failing on typing hints for functions passed in
pipe.add_function_step(…, helper_function=[…])
… I guess those aren't being removed like the wrapped function step?
Can you provide the log? I think I'm missing what e...
so 78000 entries ...
wow, a lot! Would it make sense to do 1GB chunks? Any reason for the initial 1MB chunk size?
I thought this was the issue in the thread you linked, did I miss something?
Hi RoughTiger69
seems to not take the packages that are in the requirements.txt
The reason for not taking the entire list of installed python packages is that it will most likely break when trying to run inside the agent.
The directly imported packages will essentially pull in their required packages, and thus create a stable env on the remote machine. The agent will then store the entire env, as it assumes it will be able to fully replicate it the next time it runs.
If the "Installed Packages" section is empty...