Also, how do pipelines compare here?
Pipelines are a type of Task, so like Tasks you can clone and enqueue them, or set them as the target of the trigger.
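For example, cloning and enqueuing a pipeline works like any other Task. A rough sketch (project/queue names are placeholders):
```python
from clearml import Task

# a pipeline is a Task, so the regular clone/enqueue API applies
pipeline = Task.get_task(project_name="pipelines", task_name="my_pipeline")
cloned = Task.clone(source_task=pipeline)
Task.enqueue(cloned, queue_name="services")
```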
the most flexible solution would be to have some way of triggering the execution of a script in the parent task environment,
This is the exact idea of the TriggerScheduler
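For reference, a rough sketch of TriggerScheduler usage (project name and statuses are placeholders, loosely based on the clearml trigger example):
```python
from clearml.automation import TriggerScheduler

def run_my_script(task_id):
    # executed in the scheduler's environment when the trigger fires;
    # task_id is the Task that triggered it
    print("triggered by", task_id)

trigger = TriggerScheduler(pooling_frequency_minutes=3)
trigger.add_task_trigger(
    name="on_task_completed",
    schedule_function=run_my_script,
    trigger_project="examples",
    trigger_on_status=["completed"],
)
trigger.start()
```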
What am I missing here?
If this is the case, and assuming you were able to use clearml to upload them, this means that adding
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
to your env file should just work
https://github.com/allegroai/clearml-serving/blob/main/docker/example.env
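For reference, the added entries would look something like this (placeholder values):
```
AWS_ACCESS_KEY_ID=<your-access-key-id>
AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
```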
Make sense?
Then, in the bash console, after some time, I see some messages being logged from clearml
JitteryCoyote63 Hmm that is strange, let me check something
It might be that the worker was killed before it unregistered; you will see it there, but the last update will be stuck (after 10 min it will be automatically removed)
Ohh I see, could you copy paste what you put there (instead of the secret and key *** will do 🙂 )
DeliciousBluewhale87 You can have multiple queues for the k8s glue in priority order:
```
python k8s_glue_example.py --queue glue_q_high glue_q_low
```
Then if someone is doing 100 experiments (say HPO), they push into "glue_q_low", which means the glue will first pop Tasks from the high priority queue, and only if it is empty pop from the low priority queue.
Does that make sense ?
The issue is upload progress reporting for http uploads (object storage will report upload progress). Basically, the http upload is a POST with urllib that does not support upload callbacks for progress reporting. If you have an idea here, we will gladly add it (as you mentioned, it can be quite annoying to have to open the network manager to verify the upload is progressing)
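One possible direction (a rough sketch of the general idea, not clearml's actual code): wrap the file object and report progress from read():
```python
import os

class ProgressFile:
    """File-like wrapper that reports read progress while an HTTP POST body is streamed."""
    def __init__(self, path, callback):
        self._f = open(path, "rb")
        self._total = os.path.getsize(path)
        self._read = 0
        self._callback = callback

    def read(self, size=-1):
        data = self._f.read(size)
        self._read += len(data)
        # callback receives (bytes_sent_so_far, total_bytes)
        self._callback(self._read, self._total)
        return data

    def __len__(self):
        # lets the HTTP client set Content-Length
        return self._total
```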
This is very odd, can you also put here the file names? maybe an odd character is causing it?
Can you also test it with the latest clearml version (1.8.0) ?
Yes, look for the clearml serving session ID in the web UI (just go to the home screen and put the UID in the search 🙂 )
With pleasure, I'll make sure we officially release RC1 soon :)
WackyRabbit7 in that case:
```python
from trains.utilities.pyhocon import ConfigFactory, HOCONConverter
from trains.config import config_obj

new_conf_text = HOCONConverter.to_hocon(
    config=ConfigFactory.from_dict(config_obj.as_plain_ordered_dict()),
    compact=False, level=0, indent=2)
print(new_conf_text)
```
If you need to change the values:
```python
config_obj.set(...)
```
You might want to edit the object on a copy, not the original 🙂
SubstantialElk6 could you add a github issue to set the direct url for the vscode as a parameter to the clearml-session?
We already have --vscode-version we could either extend it to include a direct url, or add a new argument.
wdyt ?
Hi CheerfulGorilla72
the "installed packages" section is used as "requirements.txt for the agent.
Are you saying the autodetection fails to detect all packages? You can specify in "manual execution" (i.e. not when the agent is running the code) to just take the local requirements.txt:
```python
Task.force_requirements_env_freeze(requirements_file="./requirements.txt")
# notice the above call should be executed before Task.init
task = Task.init(...)
```
3. If you clear all the "installed packages" se...
RipeGoose2 you can put it before/after the Task.init; the idea is for you to set it before any of the real training starts.
As for it not affecting anything:
Try adding the callback and just have it return None (which means skipping over the model logging process), and let me know if this one works
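For reference, a minimal sketch of such a callback (assuming the WeightsFileHandler pre-callback interface):
```python
from clearml.binding.frameworks import WeightsFileHandler

def skip_model_logging(operation_type, model_info):
    # returning None skips the model logging process entirely
    return None

WeightsFileHandler.add_pre_callback(skip_model_logging)
```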
Although it's still really weird how it was failing silently
totally agree, I think the main issue was that the agent had the correct configuration, but the container / env the agent was spinning up was missing it.
I'll double check how come it did not print anything
Nooooooooooooooooooooooo
Is there a helper function option at all that means you can flush the clearml-agent working space automatically, or by command?
On every Task execution the agent clears the venv (packages are cached locally, but the actual venv is cleared). If you want, you can turn on the venv cache, but there is no need to manually clear the agent's cache.
Hi, is there a way to force the requirements.txt?
You mean to ignore the "Installed Packages" ?
Hi GreasyPenguin14
clearml-data stores only the difference between versions.
Yes, it is file-level granularity. Meaning if you change a file (regardless of the file type), the new modified file will be stored. Make sense ?
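For example, creating a child version that stores only the modified file might look like this (a sketch; project name, dataset id and file path are placeholders):
```
clearml-data create --project datasets --name my_dataset_v2 --parents <parent_dataset_id>
clearml-data add --files ./data/modified_file.csv
clearml-data close
```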
I guess I got confused since the color choices in
One of the most beloved features we added 🙂
Now will these 10 experiments be of different names? How will I know these are part of the 'mnist1' HPO case?
Yes (they will have the specific HP name/value combination).
FYI names are not unique so in theory you could have multiple experiments with the same name.
If you look under the Configuration Tab, you will find all the configuration arguments for the experiment. You can also add specific arguments to the experiment table (click the cogwheel at the right top corner, and select...
Hi OddShrimp85
If you pass output_uri=True to Task.init, it will upload the model automatically, or as you said, manually with the OutputModel class
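For example (project/task names are placeholders):
```python
from clearml import Task

# output_uri=True uploads model checkpoints to the default files server
task = Task.init(project_name="examples", task_name="train", output_uri=True)
```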
Hi SubstantialElk6 I'll start at the end, you can run your code directly on the remote GPU machine 🙂
See clearml-task documentation, on how to create a task from existing code and launch it
https://github.com/allegroai/clearml/blob/master/docs/clearml-task.md
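For example, something along these lines (script and queue names are placeholders):
```
clearml-task --project examples --name remote_run --script train.py --queue default
```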
That said, the idea is that you add the Task.init call when you are writing/coding the code itself, then later when you want to run it remotely you already have everything defined in the UI.
Make sense ?
Just curious about the timeout, was it configured by clearML or the GCS? Can we customize the timeout?
I'm assuming this is GCS; at the end, the actual upload is done by the GCS python package.
Maybe there is an env variable ... Let me google it
SubstantialElk6 it seems the auto-resolve of the pytorch cuda version failed.
What do you have in the "installed packages" section?
It should be autodetected, and listed in the installed packages with something like:
```
keras-contrib @ git+https://www.github.com/keras-team/keras-contrib.git
```
Is this what you are seeing?
If not, you can add it manually with:
```python
Task.add_requirements('git+https://www.github.com/keras-team/keras-contrib.git')
Task.init(...)
```
Notice to call it before Task.init
*Actually looking at the code, when you call Task.create(...) it will always store the diff from the remote server.
Could that be the issue?
To edit the Task's diff:
```python
task.update_task(dict(script=dict(diff='DIFF TEXT HERE')))
```
I think it's inside the container since it's after the worker pulls the image
Oh that makes more sense. I mean it should not build from source, but that makes sense.
To avoid the build from source:
Add the following line to the "Additional ClearML Configuration" section:
```
agent.package_manager.pip_version: "<21"
```
You can also turn on venv caching
Add to the "Additional ClearML Configuration" section the following line:agent.venvs_cache.path: ~/.clearml/venvs-cache
I will make sure w...