@<1595587997728772096:profile|MuddyRobin9> are you sure it was able to spin up the EC2 instance? Which version of the ClearML autoscaler are you running?
Nicely found @<1595587997728772096:profile|MuddyRobin9> !
And the agent continues running.
Oh, just kill all the clearml-agent processes from the command line:
pkill -9 -f clearml-agent
@<1523701304709353472:profile|OddShrimp85> are you trying to shut down the one running on your machine ?
In the UI you can see all the agents and their IDs
Then you can do:
clearml-agent daemon --stop <agent id>
This seems to only work for a single file (weights_path implies a single file, not multiple ones). Is that the case?
See update_weights_package: it actually packages an entire folder as a zip and extracts it when you get it back (check the function docstring, I think you can also specify a wildcard etc. if needed)
Why do you see this as preferred to the dataset method we have now?
So it answers a few requirements that you raised
It is fully visible as part of the project and se...
When you set up the pod, make sure you mount the ClearML local cache folder to the PV,
basically /root/.clearml/cache/
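As a rough illustration of the advice above, here is a minimal sketch that builds a pod-spec fragment mounting a persistent volume claim at the cache path. The claim name, container name, and image are hypothetical placeholders; only the mount path comes from the reply.

```python
import json

# ClearML cache path inside the container, as given in the reply above.
CLEARML_CACHE_DIR = "/root/.clearml/cache/"


def pod_spec_with_cache(claim_name: str, image: str) -> dict:
    """Return a pod-spec fragment that mounts a PVC at the ClearML cache path."""
    return {
        "containers": [
            {
                "name": "clearml-task",  # hypothetical container name
                "image": image,
                "volumeMounts": [
                    {"name": "clearml-cache", "mountPath": CLEARML_CACHE_DIR}
                ],
            }
        ],
        "volumes": [
            {
                "name": "clearml-cache",
                # claim_name must refer to a PVC that already exists in the namespace
                "persistentVolumeClaim": {"claimName": claim_name},
            }
        ],
    }


# Hypothetical claim name and image, for illustration only.
spec = pod_spec_with_cache(claim_name="clearml-cache-pvc", image="python:3.10")
print(json.dumps(spec, indent=2))
```

The printed JSON can be merged into whatever pod template you already use; the only part that matters here is that the volume is mounted at the cache folder.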
Hi @<1523701304709353472:profile|OddShrimp85>
Do you mean Dataset.get_local_copy()?
By the way, will downloading still happen if the dataset is available in the cache folder?
If it is cached, then there is no need to re-download 🙂
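For reference, a minimal sketch of fetching a dataset through the cache. It assumes the clearml package is installed and configured, and the project and dataset names in the usage note are hypothetical.

```python
def get_dataset_folder(dataset_project: str, dataset_name: str) -> str:
    """Return the path of a local, read-only copy of the dataset."""
    # Imported here so the sketch can be loaded without clearml installed.
    from clearml import Dataset

    dataset = Dataset.get(
        dataset_project=dataset_project,
        dataset_name=dataset_name,
    )
    # Per the reply above, a copy that is already in the cache folder
    # is reused instead of being downloaded again.
    return dataset.get_local_copy()
```

Calling get_dataset_folder("my_project", "my_dataset") twice on the same machine should therefore only download the files the first time.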
A definite maybe, they may or may not be used, but we'd like to keep that option
The precursor to the question is the idea of storing local files as "input artifacts" on the Task, which means that if the Task is cloned the links go with it. Let's assume for a second this is the case, how would you upload these artifacts in the first place?
Hi @<1524922424720625664:profile|TartLeopard58>
- Opened container ports for VS Code, JupyterLab, and SSH.
I think that by default it uses the host network so it can take care of that. Are you saying you added k8s integration?
- Added NodePort to the service to directly access via public IP:NodePort (previously only SSH was available, but now NodePort is added for VS Code and JupyterLab as well), allowing direct access without SSH tunneling.
Interesting!
- Considering security vulnerabilitie...
Can you post the actual line here? It seems like we can fix it to also support this scenario (if we could test it)
Hi, I was expecting to see the container rather than the actual physical machine.
It is the container, it should tunnel directly into it (or that's how it should be).
SSH port 10022
Hi @<1607909176359522304:profile|UnevenCow76>
followed the below documentation to implement the clearml monitoring using prometheus and grafana
Did you try following this example? It includes both deploying a model and adding Grafana metrics.
LOL, if you can get it to run any python code, I can help with the rest. We just need to make sure we can capture the output, and then start the VScode remote debugging feature directly from the extension.
I see, good point. It does look like mostly boilerplate code. I'm not sure where it actually runs the python command, but I'm sure it is there (python.ts, but I could not locate what is actually using it)
I wanted to know what the best way to create and register the SSL keys is.
Oh I see, so basically you need to add nginx with SSL certificates on top of the hosted service (or configure the docker-compose nginx container to add that)
Then you need to add the self-signed SSL certificate to any host machine (I'm assuming these are not "valid" SSL certificates generated by a reputable SSL provider)
But generally speaking if you are using self hosted clearml-server on a local machine that n...
Hi @<1657918706052763648:profile|SillyRobin38>
You should either disable certificate verification or add the self-signed certificate to your urllib
or set
export REQUESTS_CA_BUNDLE="/path/to/cert/file"
export SSL_CERT_FILE="/path/to/cert/file"
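The same two variables can also be set from inside a Python process, as long as this happens before any HTTPS connection is opened. This is only a sketch: the helper name is made up for illustration, and the path is the same placeholder as in the export lines above.

```python
import os


def use_custom_ca_bundle(cert_path: str) -> None:
    """Point requests and the OpenSSL defaults at a custom CA file."""
    # Read by the requests library when verifying HTTPS certificates.
    os.environ["REQUESTS_CA_BUNDLE"] = cert_path
    # Read by OpenSSL when it loads its default verification locations.
    os.environ["SSL_CERT_FILE"] = cert_path


# Placeholder path, replace with the real location of the certificate file.
use_custom_ca_bundle("/path/to/cert/file")
```

Exporting the variables in the shell, as shown above, has the advantage that every child process inherits them without any code change.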
SmugLizard25 are you saying that with the latest version it does not work?
Hi SmugLizard25 I was able to test and it seems that style is being ignored by the FE 😞
I passed it to the FE guys to make sure it is fixed in the next version.
Notice this is just for tables, anything else works as expected (i.e. styling any other type of plot)
UnevenDolphin73 since at the end plotly is doing the presentation, I think you can provide the extra layout here:
https://github.com/allegroai/clearml/blob/226a6826216a9cabaf9c7877dcfe645c6ae801d1/clearml/logger.py#L293
Hi @<1523715429694967808:profile|ThickCrow29>
Is there a way to specify a callback upon an abort action from the user
You mean abort of the entire pipeline?
Hi @<1523715429694967808:profile|ThickCrow29>
I am using the PipelineController with abort_on_failure set to False.
Is this a pipeline from code or from Tasks?
What is the clearml version?
Lastly, if a component fails and another component is dependent on its output, how would it run? If it is not dependent, why is it a child component?
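To make the question concrete, here is a toy sketch of the dependency reasoning in plain Python. It is not ClearML's scheduler and does not use PipelineController; the step names and the run_steps helper are illustrative assumptions. A step whose parent did not complete has no input to consume, so it is skipped, while an independent step still runs.

```python
from typing import Callable, Dict, List


def run_steps(
    steps: Dict[str, Callable[[], object]],
    parents: Dict[str, List[str]],
    order: List[str],
) -> Dict[str, str]:
    """Run steps in the given order, skipping any step whose parent did not complete."""
    status: Dict[str, str] = {}
    for name in order:
        # A step depends on its parents' outputs, so it cannot run without them.
        if any(status.get(p) != "completed" for p in parents.get(name, [])):
            status[name] = "skipped"
            continue
        try:
            steps[name]()
            status[name] = "completed"
        except Exception:
            # Record the failure and keep going instead of aborting everything.
            status[name] = "failed"
    return status


def fail() -> None:
    raise RuntimeError("step_a failed")


result = run_steps(
    steps={"step_a": fail, "step_b": lambda: "b", "step_c": lambda: "c"},
    parents={"step_b": ["step_a"]},  # step_b consumes step_a's output
    order=["step_a", "step_b", "step_c"],
)
print(result)
```

In this sketch step_a fails, step_b is skipped because it depends on step_a, and step_c still completes because it has no parent.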
Hi @<1523715429694967808:profile|ThickCrow29> , thank you for pinging!
We fixed the issue (hopefully). Can you verify with the latest RC, 1.14.0rc0?
And same behavior if I make the dependence explicit via the return of the first one
Wait, are you saying that in the code above, when you abort "step_a", then "step_b" is executed?
@<1523715429694967808:profile|ThickCrow29> this is odd... how did you create the pipeline? Can you provide a code sample?
Hi @<1541954607595393024:profile|BattyCrocodile47>
Do you mean to start a remote session directly from the VS Code UI, instead of from the CLI, and connect to it? If so, that would be awesome!! We have a remote session from the web where it spins up a remote session and launches VS Code inside the container so you work on it in your browser. But a VS Code plugin is a great idea, do you have reference code for similar plugins?