Hi SteadyFox10, this one will get all the last metric scalars: train_logger.get_last_scalar_metrics()
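Assuming train_logger above is the Task object (in recent clearml versions get_last_scalar_metrics() is a method on the Task), a minimal sketch; project and metric names are placeholders:
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="metrics demo")
task.get_logger().report_scalar(title="loss", series="train", value=0.05, iteration=100)

# Returns a nested dict: {title: {series: {"last": ..., "min": ..., "max": ...}}}
# Note: reporting is batched in the background, so a just-reported scalar
# may take a few seconds to show up
metrics = task.get_last_scalar_metrics()
print(metrics.get("loss", {}).get("train", {}).get("last"))
```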
Oh, yes, that might be it (the threshold is 3 minutes if there are no reports), but you can change that: task.set_resource_monitor_iteration_timeout(seconds_from_start=10)
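For context, a sketch of where that call goes; my reading is that it controls how long the resource monitor waits for iteration reports before falling back to seconds-based reporting:
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="resource monitor demo")
# Wait only 10 seconds (instead of the ~3 minute default) for the first
# iteration report before machine stats switch to a seconds-based x-axis
task.set_resource_monitor_iteration_timeout(seconds_from_start=10)
```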
I found the issue: on the first run it jumps over the first day (let me check if we can quickly fix that)
I'm not sure about the frequency at which it updates, though
GiganticTurtle0 so this was already supposed to be out (v1.1), but a minor py2 backwards-compatibility issue delayed it. Anyhow, you can now just call pipeline.start(...)
https://github.com/allegroai/clearml/blob/889d2373988a0d6630703cc1c865e09e58f8f981/examples/pipeline/pipeline_from_tasks.py#L47
(to run it locally, call start_locally(...))
pip install git+https://github.com/allegroai/clearml.git
(the new version will be out in a few days; meanwhile, you can test the new pipeline interface directly from git)
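For reference, a minimal sketch of that interface (step/project/queue names are placeholders; the linked example shows the full version):
```python
from clearml.automation import PipelineController

pipe = PipelineController(name="pipeline demo", project="examples", version="1.0.0")
pipe.add_step(
    name="stage_process",
    base_task_project="examples",   # each step is a pre-registered Task
    base_task_name="pipeline step 1",
)

pipe.start(queue="services")        # enqueue the pipeline logic on an agent
# pipe.start_locally(run_pipeline_steps_locally=True)  # or debug it all locally
```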
So assuming they are all on the same LB IP, you should do:
LB 8080 (https) -> instance 8080
LB 8008 (https) -> instance 8008
LB 8081 (https) -> instance 8081
It might also work with:
LB 443 (https) -> instance 8080
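If it helps, the matching ~/clearml.conf api section for the first mapping would look roughly like this (hostname is a placeholder):
```
api {
    web_server: https://clearml.example.com:8080
    api_server: https://clearml.example.com:8008
    files_server: https://clearml.example.com:8081
}
```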
PompousParrot44 What is the "working directory" on the experiment itself? And the "script path"?
Based on what you wrote above, in order for it to work you should have:
working directory: "."
script path: "-m test.scripts.script"
notice no "--args" and working directory is "." (i.e. the root of the repository)
We are planning an RC later this week, I'll make sure this fix is part of it
Any chance @<1578918150261444608:profile|RoundJellyfish71> you can open a GitHub issue so that we can track it? (I think this is indeed a good idea)
Hi SubstantialElk6
ClearML-Serving is already out with a new version; the ETA for the full ClearML-Serving 1.0 (which is the new redesigned version) is the end of May
Hi @<1546303293918023680:profile|MiniatureRobin9> could it be the pipeline logic is created via the clearml-task CLI? If this is the case, I think this is an edge case we should fix. Basically it creates a Task instead of a pipeline, which in essence only affects the UI. To solve it, just run the pipeline locally; notice that by default, when you start it, it will actually stop the local run and relaunch itself on an agent.
Also, could you open a GitHub issue so we add a flag for it?
I didn't realise that pickling is what triggers clearml to pick it up.
No, pickling is the only thing that will Not trigger clearml (it is just too generic to automagically log)
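So if you do want a pickled object tracked, register it explicitly as an artifact; a minimal sketch (names are placeholders):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="artifact demo")

model_state = {"weights": [0.1, 0.2, 0.3]}  # any picklable object
# upload_artifact() pickles and uploads the object and logs it on the Task
task.upload_artifact(name="model_state", artifact_object=model_state)
```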
Hi GiddyTurkey39
Glad to see that you are already diving into the controllers (the stable release will be out early next week)
A bit of background on how the pipeline controllers are designed:
All steps in the pipeline are experiments already registered in the system (i.e. you can see them in the UI). Regardless of how you created those experiments, they have to be there prior to the pipeline launch. The pipeline itself can be executed on any machine (it does very little, and...
Yeah, we should definitely have get_requirements 🙂
Yes! Thanks so much for the quick turnaround
My pleasure 🙂
BTW: did you see this (it seems like the same bug?!)
https://github.com/allegroai/clearml-helm-charts/blob/0871e7383130411694482468c228c987b0f47753/charts/clearml-agent/templates/agentk8sglue-configmap.yaml#L14
Welp, it's been a day with the new settings, and the stats went up by 140K API calls
... going to check again tomorrow to see if any of that was spillover from yesterday
140K calls a day: how often are you sending scalars? How long is it running? How many experiments are running?
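If it turns out to be scalar frequency, one knob worth checking is the report batching period; a sketch of the relevant ~/clearml.conf section (I believe the default is a couple of seconds, so raising it packs more reports into each API call):
```
sdk {
    development {
        worker {
            report_period_sec: 30
        }
    }
}
```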
Hi MassiveBat21
CLEARML_AGENT_GIT_USER is actually the git personal token
The easiest is to have a read-only user/token for all the projects.
Another option is to use the ClearML vault (unfortunately not part of the open source) to automatically apply these configurations on a per-user basis.
wdyt?
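For example, launching the agent with a read-only user/token (values are placeholders; the token itself goes in CLEARML_AGENT_GIT_PASS):
```
export CLEARML_AGENT_GIT_USER="readonly-bot"
export CLEARML_AGENT_GIT_PASS="<personal_access_token>"
clearml-agent daemon --queue default
```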
I think you are correct, the env variable is not resolved in "time". It might be that it's resolved at import, not at Task.init
Just to make sure I understand: running locally creates the Args/command correctly, and execute_remotely also creates the correct Args/command, but when the agent actually executes it on the remote machine, it updates the Args/command back to a list. Is that a correct description?
Hi CheekyAnt38
However now I would like to evaluate directly my machine learning model via api requests, directly over clearml. It's possible?
This basically means serving the model, is this what you mean?
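i.e. once the model is deployed (e.g. with clearml-serving), evaluating it becomes a plain REST call; a sketch, where the host, endpoint name, and payload shape are all placeholders for your setup:
```python
import requests

response = requests.post(
    "http://serving-host:8080/serve/my_model",  # hypothetical serving endpoint
    json={"features": [1.0, 2.0, 3.0]},
)
print(response.json())
```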
BTW: you will be losing the comments 🙂
Why do you ask? Is your server sluggish?
Hi IrritableJellyfish76
If you are running code that uses clearml from Kubeflow, you have out-of-the-box integration between the two; what am I missing?
Hi @<1683648242530652160:profile|ApprehensiveSeaturtle9>
I send a request to the endpoint but never unload (the gpu memory keep increasing when I infer with a new model).
They are not unloaded after the request is done. see discussion here: None
You can however remove the model from the serving session (but I do not think this is what you meant)
I'm assuming you want to run multiple models on a single GPU with not en...
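For completeness, removing a model from the serving session looks roughly like this (clearml-serving CLI; the service id and endpoint name are placeholders):
```
clearml-serving --id <service_id> model remove --endpoint "my_model"
```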
LovelyHamster1 Now I see... Interesting credentials ability. Specifically, all the S3 access on trains is derived from the ~/clearml.conf credentials section:
https://github.com/allegroai/clearml/blob/ebc0733357ac9ead044d0ed32d41447763f5797e/docs/clearml.conf#L73
(or the AWS S3 environment variables)
I'm not sure how this AWS feature works; I suspect it is changing the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY variables on the EC2 instance. If this is the case, it should work out of...
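For reference, the credentials section linked above looks roughly like this (values are placeholders):
```
sdk {
    aws {
        s3 {
            key: "<AWS_ACCESS_KEY_ID>"
            secret: "<AWS_SECRET_ACCESS_KEY>"
            region: "us-east-1"
        }
    }
}
```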
JitteryCoyote63
are the calls from the agents made asynchronously/in a non blocking separate thread?
You mean whether request processing on the apiserver is multi-threaded / multi-processed?
So the thing is, clearml automatically detects the last iteration of the previous run; my assumption is you also add it, hence the double shift.
SourOx12 could that be it?
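If that's it, a minimal sketch of the workaround, assuming you resume with continue_last_task (set_initial_iteration(0) zeroes the automatic offset so you don't shift twice):
```python
from clearml import Task

task = Task.init(
    project_name="examples",
    task_name="resume demo",
    continue_last_task=True,  # resume the previous run
)
# clearml already continues from the last reported iteration;
# zero the automatic offset if you also add your own when reporting
task.set_initial_iteration(0)
```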