Hi OddShrimp85
I think numpy 1.24.x is broken in a lot of places; we have noticed scikit-learn, TF and others break on it
I will make sure we fix this one
Hi @<1564785037834981376:profile|FrustratingBee69>
It's the previous container I've used for the task.
Notice that what you are configuring is the Default container, i.e. if the Task does not "request" a specific container, then this is what the agent will use.
On the Task itself (see Execution Tab, down below Container Image) you set the specific container for the Task. After you execute the Task on an agent, the agent will record there the container it ended up using. This means that ...
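If you want to set it from code, a minimal sketch (assuming a recent clearml version; double-check set_base_docker's signature on yours, and the image name here is just an example):
from clearml import Task
task = Task.init(project_name='examples', task_name='demo')
# request a specific container for when an agent runs this Task
task.set_base_docker(docker_image='nvidia/cuda:11.8.0-runtime-ubuntu22.04')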
Our server is deployed on a kube cluster. I'm not too clear on how Helm charts etc. work.
The only thing that I can think of is that something is not right with the load balancer on the server, so maybe some requests coming from an instance on the cluster are blocked ...
Hmm, saying that out loud, that actually could be it?! Try adding the following line to the end of the clearml.conf on the machine running the agent:
api.http.default_method: "put"
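That flat line is HOCON shorthand; nesting it under the existing api section of clearml.conf should be equivalent:
api {
  http {
    default_method: "put"
  }
}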
Nope - confirmed to be running on the OS's Python environment,
okay so bare metal root is definitely not recommended.
I'm not sure how/why it gets stuck though
Any chance you can run the agent as non-root?
Also, docker mode might be preferable, as it makes it easier for you to control the environment of the Task
GiganticTurtle0 adding --stop to the exact daemon execution will stop it (meaning if you have multiple agents on the same machine launched with different parameters, just add the --stop to retire the specific one)
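For example, assuming the daemon was originally launched as:
clearml-agent daemon --queue my_queue --docker
then stopping that specific daemon is:
clearml-agent daemon --queue my_queue --docker --stop
(my_queue is just a placeholder for whatever queue you used)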
Hmm, can you send the full log of the pipeline component that failed, because this should have worked
Also could you test it with the latest clearml python version (i.e. 1.10.2)
Hi @<1544853721739956224:profile|QuizzicalFox36>
http:/34.67.35.46:8081/...
notice there is a / missing in the link, how is that possible? it should be http://
What is the link you are seeing there?
Hi @<1523702307240284160:profile|TeenyBeetle18>
and the url of the model refers to a local file, not to the remote storage.
Do you mean that in the Model tab when you look into the model details the URL points to a local location (e.g. file:///mnt/something/model) ?
And your goal is to get a copy of that model (file) from your code, is that correct ?
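If so, a minimal sketch of fetching it from code (the model id here is hypothetical, copy the real one from the Models tab in the UI):
from clearml import InputModel
model = InputModel(model_id='aabbcc112233')
local_path = model.get_local_copy()  # downloads / resolves the weights file locally
print(local_path)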
Hi @<1523722267119325184:profile|PunySquid88> I guess it's a good thing we talk, because I believe that what you are looking for is already available :)
from clearml import Logger
Logger.current_logger().report_media('title', 'series', iteration=1337, local_path='/tmp/bunny.mp4')
This will actually work on any file, that said, the UI might display the wrong icon (which will be fixed in the next version).
We usually think of artifacts as data you want to reuse, so all the files uploaded there are accessibl...
The configuration tab -> configuration objects -> pipeline is empty
That's the reason it is doing nothing
How come it is empty if you Cloned the local one?
LittleShrimp86 what do you have in the Configuration Tab of the Cloned Pipeline?
(I think that it has empty configuration -> which means empty DAG, so it does nothing and leaves)
This is very odd...
LittleShrimp86 is this example working for you?
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_tasks.py
Hmm I think the easiest is using the helm chart:
https://github.com/allegroai/clearml-server-helm-cloud-ready
I know there is work on a Terraform template, not sure about Istio.
Is helm okay for you?
I'm working on creating a custom config with istio
That is awesome! Let me know if we can help
Also please consider PRing it, I'm sure other users will appreciate the option
GiganticTurtle0 can you please add a github issue with feature request to clearml-agent? I think this is a great use case!
PompousParrot44
you can always manually store/load models, for example: https://github.com/allegroai/trains/blob/65a4aa7aa90fc867993cf0d5e36c214e6c044270/examples/reporting/model_config.py#L35
Sure, you can patch any framework with something similar to what we do in xgboost, any such PR will be greatly appreciated! https://github.com/allegroai/trains/blob/master/trains/binding/frameworks/xgboost_bind.py
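Roughly, the manual store flow looks like this (a sketch against recent clearml versions, see the linked examples for the full version; names are placeholders):
from clearml import Task, OutputModel
task = Task.init(project_name='examples', task_name='manual model')
output_model = OutputModel(task=task)
output_model.update_weights('my_model.pkl')  # registers and uploads the weights file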
I'm assuming you mean for the clients, right?
ShakyOstrich31
I am reusing an old task ...
Which means that the old Task stores the requirements on the Task itself (see the "Installed Packages" section); notice it also stores the exact git commit to use.
When you are cloning the Task (i.e. in the pipeline), you should probably:
- set the commit / branch to the latest in the branch
- clear the "installed packages" section, which would cause the agent to use the "requirements.txt" stored in the git repo itself.
As far as I understand this s...
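From code that would look roughly like this (a sketch; set_script / set_packages exist in recent clearml versions, verify on yours, and the id/branch are placeholders):
from clearml import Task
cloned = Task.clone(source_task='aabbcc')  # the old Task's id
cloned.set_script(branch='main', commit='')  # empty commit -> latest on the branch
# my guess for clearing "installed packages" so the agent falls back to requirements.txt:
cloned.set_packages([])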
or shall I call the Task.init even from the agent
WorriedParrot51 I think something is lost here.
Task.init() is always called, even when the agent is executing the code. The difference is in what happens inside the Task.init() call. When the codebase itself is executed by the trains-agent, it signals through the OS environment to Task.init() that instead of creating a new task, it should use the already created one. From this point all data flows from the trains-server back into the c...
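So the exact same script works in both places:
from clearml import Task
task = Task.init(project_name='examples', task_name='demo')
# run locally: creates a new Task on the server
# run by trains-agent: attaches to the already-created Task (signaled via OS environment)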
If you want to quickly test it: pip install clearml-agent
Then assuming Task id is aabbcc
Run: clearml-agent execute --id aabbcc
You should be able to trace if the package was installed