Reputation
Badges 1
25 × Eureka!Also I would suggest using Task.execute_remotely
https://clear.ml/docs/latest/docs/references/sdk/task#execute_remotely
Regrading the limit interface, let me check I think this is worked on (i.e. nice interface that should be pushed in the next few days). Let me get back to you on this one.
How will imposing an instance limit , prevent or allow --order-fairness feature for example, which exists when running in clearml-agent version compared to k8s_glue_example version ?
A bit of background on how the glue works:
It pulls jobs from the clearml queue, then it prepares a k8s job, and launches the k8s jobs...
Hi SucculentBeetle7
The parameters passed to add_step need to contain the section name (maybe we should warn if it is not there, I'll see if we can add it).
So maybe something like:{'Args/param1', 1}Or{'General/param1', 1}Can you verify it solves the issue?
Thank you AttractiveWoodpecker16 !
Removing the uncommitted changes so that you can launch it from an agent? Or is it visual only?
EnviousStarfish54
plt.show will capture the figure, that if you call it multiple times, it will add a running number to the figure itself (because the figure might change, and you might want the history)
if you call plt.imshow, it's the equivalent of debug image, hence it will be shown in the debug-samples tab, as an image.
Make sense ?
PanickyMoth78 thank you for the mock code, I can verify it reproduces the issue. It seem that for some reason (bug) when the same function is called multiple times it "collects" parents, hence the odd graph,
BTW: if you want to see exactly what is passed to the step you can press on the step's full_details, and see the hyperparameter section.
I'll make sure we fix this bug in the next RC.
2 and 3 - I want to manage access control over the RestAPI
Long story short, put a load-balancer in front of the entire thing (see the k8s setup), and have the load-balancer verify JWT token as authentication (this is usually the easiest)
1- Exactly, custom code
Yes, we need to add a custom example there (somehow forgotten)
Could you open an Issue for that?
in the meantime:
` #
Preprocess class Must be named "Preprocess"
No need to inherit or to implement all methods
lass P...
ok, but this happens in my local machine, not in the agent
resource monitoring is always running in the background, even on local machines. (of course you can turn it off)
SubstantialElk6 I just realized 3 weeks passed, wow!
So the good news we have some new examples:
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_functions.py
The bad news the documentation was postponed a bit, as we are still messaging the interface (the community is constantly pushing for great ideas and uses cases , and they are just too good to miss out 🙂 )...
Does a pipeline step behave differently?
Are you disabling it in the pipeline step ?
(disabling it for the pipeline Task has no effect on the pipeline steps themselves)
'-v', '/tmp/clearml_agent.ssh.cbvchse1:/.ssh',
It's my bad, after that inside the container it does cp -Rf /.ssh ~/.ssh
The reason is that you cannot know the user home folder before spinning the container
Anyhow the point is, are you sure that you have ~/.ssh on the Host machine configured?
And if you do, are you saying this is part of your AMI? if not how did you put it there?
Hmm GreasyLeopard35 can you specify the range you are passing to the HPO, as well as the type of optimization class ? (grid/random/optuna etc.)
I want that last python program to be executed with the environment that was created by the agent for this specific task
Well basically they all inherit the Python environment that points to the venv they started from, so at least in theory it should be transparent when the agent is spinning the initial process.
I eventually found a different way of achieving what I needed
Now I'm curious, what did you end up doing ?
Yes, the container level (when these docker shell scripts run).
I think this is the tricky part, in code you can access the user ID of the Task, and download the .env and apply it, but before the process starts I can't really think of a way to do that ...
That said, I think that in the paid version they have "vault" support, which allows you to store the .env file on the clearml-server, and then the agent automatically applies it at the beginning of the container execution.
Are you referring to Poetry ?
JitteryCoyote63 you mean in runtime where the agent is installing? I'm not sure I fully understand the use case?!
TrickySheep9 you mean custom containers in clearml-session for remote development ?
Hi FierceHamster54
Thanks for bringing it up 🙂
... in term of secret managements/key-value stores
Currently the open-source version does not include the Vault support (e.g. secret management), this is something they added to the enterprise version a few versions away, and as far as I understand this is a per user/project/company granularity feature (i.e. company wide merging with project merging with user specific).
Is this what you are looking for or am I missing something ?
Can you do it manually, i.e. checkout the same commit id, then take the uncommitted changes (you can copy paste it to diff.txt) then call git apply diff.txt ?
No worries, I'll see if I can replicate it anyhow
The import process actually creates a new Task every import, that said if you take a look here:
https://github.com/allegroai/trains/blob/10ec4d56fb4a1f933128b35d68c727189310aae8/trains/task.py#L1733
you can pass a pre-existing Task ID to "import_task" https://github.com/allegroai/trains/blob/10ec4d56fb4a1f933128b35d68c727189310aae8/trains/task.py#L1653
In our case this is not possible due to client security (e.g. training data from clients can potentially be 'reverse engineered' from trained models in future).
Hmm I see, wouldn't it make more sense to separate clients like a multi-tenant SAAS solution ?
StaleMole4 you are printing the values before Task.init had the chance to populate it.
Basically try moving the print after closing the Task (closing the tasks waits for the async update)
Make sense ?
UnevenDolphin73 since at the end plotly is doing the presentation, I think you can provide the extra layout here:
https://github.com/allegroai/clearml/blob/226a6826216a9cabaf9c7877dcfe645c6ae801d1/clearml/logger.py#L293
set the following:CLEARML_AGENT_DISABLE_SSH_MOUNT=1 clearml-agent daemon ...The issue is, it will automatically mount the .ssh of the host into the container, so that if you are using SSH to clone git you have credentials, in your case, it also mounts the configuration, hence failing to login.
I will make sure we add it to the configuration file, so it is more visible
I have the agent configured to force install requirements.txt
what do you mean by that?
Hi @<1657918706052763648:profile|SillyRobin38>
You should either disable certificate verification or add the self-signed certificate to your urllib
None
or set
export REQUESTS_CA_BUNDLE="/path/to/cert/file"
export SSL_CERT_FILE="/path/to/cert/file"
Hi ScaryKoala63
Which versions are you using (clearml / lightning) ?