task = Task.get_task(project_name='project', task_name='best_model_ever')
the issue moving forward is that if we restart the pod we will have to manually update that again.
Can't you map the nginx configuration file? (making the changes persistent across pods)
FloppyDeer99 what am I seeing in the screenshot?
But why is the URL in ES different from the one in the web UI?
They are not really different, but sometimes the "url quote" is an issue (this is the process where a browser takes a string url like a/b and converts it to a%2fb).
I remember there was an issue involving double quoting (this is when you have: a/b -> a%2fb -> a%252fb). Notice the last step replaces the "%" with "%25", as in your example...
Let me know i...
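To make it concrete, this is all it is (a quick sketch with Python's urllib, nothing clearml specific):
from urllib.parse import quote, unquote

original = "a/b"
once = quote(original, safe="")   # 'a%2Fb'  <- single url-quote
twice = quote(once, safe="")      # 'a%252Fb' <- double url-quote, the "%" itself becomes "%25"
print(once, twice)
print(unquote(unquote(twice)))    # un-quoting twice gets you back to 'a/b'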
No worries, I would love for us to come up with a nice solution 🙂
Hi FloppyDeer99
Since this thread is a bit old, I might have missed something 🙂
Are we saying the links are not working in the UI?
(notice the links themselves are generated by the clearml package, so if there was a bug, still not sure here, then old links will remain invalid until manually fixed) Can you verify that the latest clearml generates working links?
Hm GiganticTurtle0 let me quickly check it
It completed after the max_job limit (10)
Yep this is optuna "testing the water"
JitteryCoyote63 I think there is a ClearML logger, no?
Hi @<1562973083189383168:profile|GrievingDuck15>
Thanks for noticing, yes the API is always versioned, we should make it clearer in the docs. Also, if you need the latest one, use version 999; it will default to the latest one the server can support
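If you want to quickly check it against your own server, something like this should do (a sketch; the host/port are the docker-compose defaults, adjust to your deployment, and debug.ping is just a simple health endpoint):
import requests

# unversioned health check
print(requests.get("http://localhost:8008/debug.ping").json())
# same endpoint with an explicit version in the path; per the above, v999 falls back to the newest supported version
print(requests.get("http://localhost:8008/v999/debug.ping").json())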
That makes sense...
Basically in the open-source version the approach is everyone sees everything for maximum transparency (and also ease of use). I know there are access-roles in the paid tier and vault for exactly these types of things...
Where do you currently save them? And how do you pass them to the remote machine?
So maybe the path is related to the fact I have venv caching on?
hmmm could be...
Can you quickly disable the caching and try?
the only thing that is missing is some plots on the clearml server (app): when I go to the details of the training I cannot see the confusion matrix, for example (but it exists on the bucket)
How do you report the "confusion matrix"? (I might have an idea on what's the difference)
logger.report_scalar("loss", "train", iteration=0, value=100)
logger.report_scalar("loss", "test", iteration=0, value=200)
The quickest workaround would be, in your final code, to just do something like:
my_params_for_hpo = {'key': omegaconf.key}
task.connect(my_params_for_hpo, name='hpo_params')
call_training_with_value(my_params_for_hpo['key'])
This will initialize my_params_for_hpo with the values from OmegaConf, and allow you to override them in the hyperparameter section (task.connect is two-way: in manual mode it stores the data on the Task, in agent mode it takes the values from the Task and puts them ba...
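Putting it together, the flow would look roughly like this (a sketch only; the config file, the key and call_training_with_value are placeholders from your code):
from clearml import Task
from omegaconf import OmegaConf

task = Task.init(project_name="project", task_name="hpo_ready_training")
cfg = OmegaConf.load("config.yaml")  # your OmegaConf config

# mirror the value(s) the HPO should control into a plain dict
my_params_for_hpo = {"key": cfg.key}
# manual run: stores the dict on the Task; agent run: overrides it from the Task
task.connect(my_params_for_hpo, name="hpo_params")

call_training_with_value(my_params_for_hpo["key"])  # your training entry point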
I ran the test, but there was no result.
what do you mean by no result, no data after the new query?
Will such a docker image need a trains configuration file?
If you need to configure things other than credentials (see above) then yes, you might need to map trains.conf
into the pod.
Specifically, if you need to, map your trains.conf to /root/.trains
inside the pod/container
HugeArcticwolf77 changing the color is definitely a feature we will have in the next version, right now I think you cannot 😞 it is randomly chosen based on the title/series and I think your example is a great failure case of that randomness 😅
Hi GreasyRaven35
You should set the output_uri in Task.init, it will auto upload the model and register the remote location URL:
task = Task.init(..., output_uri=True)
You can also specify a target bucket, if you configured credentials (e.g. output_uri="s3://bucket")
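For example (a minimal sketch, the bucket/prefix is a placeholder):
from clearml import Task

task = Task.init(
    project_name="project",
    task_name="best_model_ever",
    output_uri="s3://my-bucket/models",  # or output_uri=True for the default files server
)
# any checkpoint saved by a supported framework (e.g. torch.save / joblib.dump) is then uploaded there and registered on the task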
Correct 🙂
I'm assuming the Task object is not your Current task, but a different one?
but this will be invoked before fil-profiler starts generating them
I thought it would flush in the background 😞
You can however configure the profiler to a specific folder, then mount the folder to the host machine:
In the "base docker args" section add -v /host/folder/for/profiler:/inside/container/profile
Unfortunately that is correct. It continues as if nothing happened!
oh dear, let me make sure this is taken care of
And thank you for the reproduce code!!!
Sorry @<1798525199860109312:profile|IntriguedGoldfish14> just noticed your reply
Yes, two inference containers, running simultaneously on the cluster. As you said, each one with its own environment (assuming here that the requirements of the models collide)
Makes sense
Hi @<1798525199860109312:profile|IntriguedGoldfish14>
Yes, the way to do that is to just use the custom engine example as you did; also correct on the env var to add catboost to the container
You can of course create your own custom container from the base one and pre-install any required packages, to speed up the container spin-up time
One of the design decisions was to support multiple models from a single container, that means that there needs to be one environment for all of them, the main is...
hmm that is odd.
Can you send the full log?
if you have an automation process, then you should have the Task object, no?
then you have task.id
What am I missing here?
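For instance, in a typical automation snippet (a sketch; names and the queue are placeholders):
from clearml import Task

template = Task.get_task(project_name="project", task_name="best_model_ever")
cloned = Task.clone(source_task=template, name="automation_run")
print(cloned.id)  # this is the task id you can store / pass around
Task.enqueue(cloned, queue_name="default")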
to enable access to the s3 bucket. In this case I wonder how clearml sdk gets access to the s3 bucket if it relies on secret access key and access key id.
Right, basically someone needs to configure the "regular" environment variables for boto to use the IAM role; clearml basically uses boto, so it should be transparent. Does that make sense? How do you spin the job on the k8s cluster and how do you configure it?
Since these are temp credentials we need to use the sessi...
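A quick way to sanity-check the credential chain from inside the pod (plain boto3, nothing clearml specific; the bucket name is a placeholder):
import boto3

# boto3 (and therefore clearml's S3 access) resolves credentials from the default chain:
# env vars -> shared config -> container/instance IAM role (e.g. IRSA web-identity on EKS)
session = boto3.Session()
creds = session.get_credentials()
print("credential source:", creds.method if creds else "none found")
# if this works, clearml should be able to read/write the same bucket transparently
boto3.client("s3").list_objects_v2(Bucket="my-bucket", MaxKeys=1)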
Hi ConvolutedSealion94
Yes this seems like the correct curl
How did you spin the clearml-serving containers? is it with the docker-compose or with the helm chart (I remember that there are some pitfalls with the helm chart, and I would actually start with the local docker-compose to debug it)
BoredHedgehog47 were you able to locate the issue?