AgitatedDove14

48 Questions, 8049 Answers

Active since 10 January 2023

Last activity 5 months ago

Reputation

Badges 1

25 × Eureka!

Answers 8049

0 Is There An Elegant Way To Download All Images Posted In “Debug_Samples” From The Trains Server?

TrickyRaccoon92
I guess elegant is the challenge 🙂
What exactly is the use case ?

3 years ago

0 Has Anyone Successfully Deployed Clearml On A Kube Cluster Utilizing Istio? I Don’T See Any Mention Of Istio In The Docs.

Hi BurlySeagull48
you mean for the clearml-server ?

3 years ago

0 Hi There, I Have A Batch Prediction Task That Load A Model Published On Clearml.

Hi IrritableGiraffe81
Can you share a code snippet ?
Generally I would try
task = Task.init(..., auto_connect_frameworks={'pytorch': False, 'tensorflow': False})

one year ago

0 Hi. I Somehow Managed To Exceed The Metrics Quota By ~35Gb. I Logged Some Histograms, But Still That Seems Excessive. Now I Am Trying To Delete Archived Experiments With The Cleanup Service, But Some Tasks Cannot Be Deleted:

using the cleanup service

Wait FlutteringWorm14 , the cleanup service , or task.delete call ? (these are not the same)

one year ago

0 Um, Is There A Way To Delete An Artifact From A Task That Is Running?

Hmm, you can delete the artifact with:
task._delete_artifacts(artifact_names=['my_artifact'])
However this will not delete the file itself.
To delete the file I would do:
remote_file = task.artifacts['delete_me'].url
h = StorageHelper.get(remote_file)
h.delete(remote_file)
task._delete_artifacts(artifact_names=['delete_me'])
Maybe we should have a proper interface for that? wdyt? What's the actual use case?

2 years ago

0 Hi All, I'M Trying To Deploy Trains On Rancher (Nice Kubernetes Cluster Orchestration Project) Where I'M Quite New To Rancher And Kubernetes. I Have Been Able To Install Trains Using Helm

Will such a docker image need a trains configuration file?

If you need to configure things other than credentials (see above), then yes, you might need to map trains.conf into the pod.
Specifically, if you need, map your trains.conf to /root/.trains inside the pod/container

3 years ago

0 So, I Did A Slew Of Pretrainings, Then Finetuned Those Pretrained Models. Is There A Way To Go Backwards From The Finetuning Task Id To The Pretraining Task Id? What I Tried Was:

SmallDeer34 the function Task.get_models() incorrectly returned the input model "name" instead of the object itself. I'll make sure we push a fix.

I found a different solution (hardcoding the parent tasks by hand),

I have to wonder, how does that solve the issue ?

2 years ago

0 If I Set

and run it locally...

You mean you run the clearml-agent locally ?

3 years ago

0 Hello, "In The Last Period I Pushed To Adopt Clearml Company Wide As It Is A Great Tool. We Actually Have A Data Center And All Nodes Are Managed By Rancher Meaning, Everything We Use Is Purely Kubernetes Stuff. I Deployed Clearml Server In Our

For the on-prem you can check the k8s helm charts, it can spin agents for you (static agents).
For the GKE the best solution is the k8s glue:
https://github.com/allegroai/clearml-agent/blob/master/examples/k8s_glue_example.py

3 years ago

0 Is It Possible To Avoid The Clearml-Agent For Local Installations, And Have The File Server Automatically Use An S3 Bucket? I'Ve Found

I will TIAS, but maybe worthwhile to also mention if it has to be the absolute path or if relative path is fine too!

Good point! (absolute but you can use ~, and I "think" also $ENV )
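For reference, this is how `~` and `$ENV` expansion behaves in plain Python. It is only a sketch of the general idea: whether ClearML applies exactly these calls to that setting is an assumption, and `resolve_config_path` is a hypothetical helper name.

```python
import os


def resolve_config_path(raw_path):
    """Expand $VARS, then a leading ~, and return an absolute path."""
    expanded = os.path.expandvars(raw_path)   # "$HOME/x" -> "/home/me/x"
    expanded = os.path.expanduser(expanded)   # "~/x"     -> "/home/me/x"
    return os.path.abspath(expanded)          # relative paths become absolute


if __name__ == "__main__":
    print(resolve_config_path("~/clearml_cache"))
```

A relative path would be resolved against the current working directory here, which is one reason an absolute path (or `~`) is the safer choice.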

2 years ago

0 Just Getting Started With Clearml, Any Recommended Videos On How To Get A Sample Project Up? I Am Using The One On Their Youtube Channel Right Now But I Am A Bit Confused As How To Use The Demoapp

however setting up the interpreter on pycharm is different on mac for some reason, and the video just didn't match what I see

MiniatureCrocodile39 Are you running on a remote machine (i.e. PyCharm + remote ssh) ?

3 years ago

0 Hello All, I Have A Question Regarding Showing Of Debug Samples Within An On-Prem Clearml Instance. I Am Logging Debug Images Via Tensorboard (Via

I am logging debug images via Tensorboard (via

add_image

function), however apparently these debug images are not collected within fileserver,

ZanyPig66 what do you mean not collected to the file server? are you saying the TB add_image is not automatically uploading images? or that you cannot access the files on your files server?

one year ago

0 Fatal: Could Not Read From Remote Repository. Please Make Sure You Have The Correct Access Rights And The Repository Exists.

I don't think so. it is solved by installing openssh-client to the docker image or by adding deploy token to the cloning url in web ui

You can also have the token (token==password) configured as the default user/pass in your agent's clearml.conf
https://github.com/allegroai/clearml-agent/blob/73625bf00fc7b4506554c1df9abd393b49b2a8ed/docs/clearml.conf#L19
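As an illustration of the "deploy token in the cloning url" option mentioned in the quote, here is a small sketch that embeds a user and token into an https clone URL. The helper name, the example host, and the user name are all hypothetical; which user name your git host expects for tokens depends on the host.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def add_token_to_clone_url(clone_url, user, token):
    """Return an https clone URL with user:token credentials embedded."""
    parts = urlsplit(clone_url)
    if parts.scheme != "https":
        # ssh-style URLs authenticate with keys, not with a token in the URL
        raise ValueError("token auth only applies to https clone URLs")
    host = parts.hostname
    if parts.port:
        host = f"{host}:{parts.port}"
    # Percent-encode so characters like '@' or ':' in the token stay valid
    netloc = f"{quote(user, safe='')}:{quote(token, safe='')}@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(add_token_to_clone_url(
        "https://gitlab.example.com/team/repo.git", "deploy-user", "abc123"))
```

Keeping the token in the agent's clearml.conf, as suggested above, avoids having it show up in the task's repository field.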

2 years ago

0 Is There A Way To

DisgustedDove53 , TrickySheep9
I'm all for it!
I can think of two options here, (1) use the k8s glue + apply template with ports mode see discussion https://clearml.slack.com/archives/CTK20V944/p1628091020175100
(2) create an interface (queue) to launch arbitrary job on the k8s cluster, with the full pod definition on the Task. This will allow the clearml-session to setup everything from the get go.
How would you interface with the k8s operator, and what exactly will it do?
(BTW: the reas...

3 years ago

0 I Have The Slack Server Running At Localhost:8080 When Trying To Access It From A Remote Computer, I Am Getting A Screen Like So: How Can I See The Dashboard From Another Computer?

WobblyCrab70 sure, put a load-balancer in between. AWS has a solution for that: basically use the AMI from the GitHub and ask IT to add https on the 8080/8008/8081 ports
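Before involving IT, it can help to confirm from the other computer whether the three ports are reachable at all. A minimal sketch using only the standard library; `check_ports` is a hypothetical helper and the host name in the usage line is a placeholder.

```python
import socket


def check_ports(host, ports=(8080, 8008, 8081), timeout=2.0):
    """Return {port: True/False} depending on whether a TCP connect succeeds."""
    results = {}
    for port in ports:
        try:
            # A successful connect only shows the port is open, not that
            # the service behind it is healthy.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results


if __name__ == "__main__":
    print(check_ports("my-server.example.com"))
```

If a port shows as closed from the remote machine but open on the server itself, a firewall or security group is the likely culprit rather than the server.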

3 years ago

0 Hey, I'M Trying To Run The Aws Autoscaler And Pull A Docker Image From Ecr (Private Repository). I'M Currently Getting The Error:

Is there any way to debug these sessions through clearml? Thanks!

Yes this is a real problem, AWS does not allow to get the data very easily...
Can you check the AWS console, see what you have there ?
In theory this should have worked.
Maybe you are missing some escaping for the "extra_vm_bash_script" ?
I'm hoping the console output will tell us

3 years ago

0 Hi All - I Have A Question To Ask (And Not Sure If There Is A Channel For Faqs So Sorry For Putting It Here) ... I Am Using Trains In Combination With Pycharm'S Remote Debugging. I Have The Pycharm Plugin Installed. When The Experiment Ends, I Get

Hmm, yes this fits the message. Which basically says that it gave up on analyzing the code because it ran out of time. Is the execution very short? Or the repo very large?

4 years ago

0 Hello, Another Question

ShortElephant92 yep, this is definitely enterprise feature 🙂
But you can configure user/pass on the open source, and even store the passwords hashed if you need.

one year ago

0 Bug?

I just think that the create function should expect

dataset_name

to be None in the case of

use_current_task=True

(or allow the dataset name to differ from the task name)

I think you are correct, at least we should output a warning that it is ignored ... I'll make sure we do 🙂

one year ago

0 Hi, I Would Like To Pass In Some Pip Arguments That Clearml-Agent Would Include When Setting Up The Venv On The Containers. How Should I Specify This? The Argument In Question Are --Trusted-Host And --Find-Links . I Need Them As I'Ve Installed A Pypi Repo

FriendlySquid61 could you help?

3 years ago

0 Good Day! Please Tell Me How To Screw It Up. On What Chart Or Value Does Such An Error Appear?

The docker-compose full logs?

one year ago

0 Hi When We Try And Sign Up A User With Github. The Invitation Link Never Works. Given They Have Already Signed Up With Their Github

They don't give an in app notification.

Oh I see, I assume this is because the github account is not connected with any email, so no invite is sent.
Basically they should just be able to re-login and then they could switch to your workspace (with the link you generated)

11 months ago

0 Hi I Saw This On The Clearml-Agent Docs But Other Than The Docker Image, I'M Not Sure How To Integrate This With Clearml Py And Clearml-Server. Please Advise.

python k8s_glue_example.py --help
To get all the commands for configurations
You should probably pass a few :)

3 years ago

0 My Agent (Running On Gcp In Docker Mode) Is Having Trouble With Git Fetch --All. I'M Using Ssh For Authentication, However, Known_Hosts Doesn'T Seem To Be Passed To The Docker So It Prompts For Authentification/Fingerprint. Any Ideas?

Wait, is "SSH_AUTH_SOCK" defined on the host? it should auto mount the SSH folder as well?!
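A quick way to answer that question on the host is to look at the variable and at the socket it points to. This is only a sketch; `ssh_agent_status` is a hypothetical helper and says nothing about what the agent actually mounts into the container.

```python
import os


def ssh_agent_status(env=None):
    """Report whether SSH_AUTH_SOCK is unset, stale, or points to an existing path."""
    env = os.environ if env is None else env
    sock = env.get("SSH_AUTH_SOCK")
    if not sock:
        return "unset"
    # The variable can outlive the ssh-agent process that created the socket
    return "ok" if os.path.exists(sock) else "stale"


if __name__ == "__main__":
    print(ssh_agent_status())
```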

9 months ago

0 Hi Everyone, I Wanted To Inquire If It'S Possible To Have Some Type Of Model Unloading. I Know There Was A Discussion Here About It, But After Reviewing It, I Didn'T Find An Answer. So, I Am Curious: Is It Possible To Explicitly Unload A Model (By Calling

Hi SillyRobin38

Hi everyone, I wanted to inquire if it's possible to have some type of model unloading.

What do you mean by "unloading" ? you mean remove it from the clearml-serving endpoint ?
If this is from clearml-serving, then yes, you can do it online:
None

8 months ago

0 Is It Possible To Embed A Streamlit App In A Clearml Report? Are There Other Ways To Integrate Streamlit Apps?

Hi SkinnyBat30
Streamlit apps are backend run (i.e. the python code drives the actual web app)
This means running your Task's code and exposing the Streamlit web app (i.e. over http).
This is fully supported with ClearML, but unfortunately only in the paid tiers 😞
You can however run your Task with an agent, make sure the agent's machine is accessible and report the full IP+URL as a hyper-parameter or property, and then use that to access your streaml...

one year ago

0 Hi, We'Re Hosting Clearml On Our K8S Cluster, And I'M Running Into Problems With It... I'Ve Set It Up In A Subdomain Way - App/Files/Api.Clearml.Mydomain... But I Have Some Issues With The Ssl Certificate. When I Try Running

Yey!

3 years ago

0 Hi Everybody, I'M Running Experiments Inside A Docker Which Includes Multiple Python Instances, Some Of Them Are Inside Conda Environments. How Can I Specify The Agent To Use A Specific Conda Environment Inside The Docker?

Hi CrookedWalrus33
the python version is auto detected and registered at "manual execution" time (i.e. when you run your code on your machine).
That said, this is a suggestion for the agent: only if it can actually find the matching Python version will it use it, otherwise it will use whatever is available (i.e. it looks through the PATH environment for a matching pythonX.Y executable).
The easiest way to support this is to make sure the python binary's path is added to the PATH env.
Does...
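To see what a PATH lookup for a matching pythonX.Y executable would find inside your docker, something like the following can be run in the container. It is a sketch of the idea described above, not the agent's actual code; `pick_python` and the fallback to `python3` are assumptions made for illustration.

```python
import shutil


def pick_python(requested_version, search_path=None):
    """Return pythonX.Y from PATH if present, else fall back to python3 (or None)."""
    # search_path=None means "use the PATH environment variable"
    exact = shutil.which(f"python{requested_version}", path=search_path)
    if exact:
        return exact
    # Assumed fallback: whatever generic python3 is available
    return shutil.which("python3", path=search_path)


if __name__ == "__main__":
    print(pick_python("3.9"))
```

If the conda environment's `bin` directory is not on PATH, the lookup will not see its interpreter, which matches the advice to add the binary's path to the PATH env.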

2 years ago

0 What Could Be The Reason For Fail Status Of A Task That Seems To Have Completed Correctly? No Information In The Log Whatsoever

Yes
Are you trying to upload_artifact to a Task that is already completed ?

3 years ago

0 Heya, Is There Any Plan For Clearml To Leverage The New

Hi FierceHamster54
This is already supported, unfortunately the open-source version only supports static allocation (i.e. you can spin multiple agents and connect each one to a specific number of GPUs). The dynamic option (where you have a single agent allocating jobs to multiple GPUs / slices) is only part of the enterprise edition
(there is the hidden assumption there that if you spent so much on a DGX you are probably not a small team 🙂 )
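For the static allocation described above, each agent is started with its own fixed set of GPUs. The sketch below only builds the command lines so you can review them before running anything; `static_agent_commands`, the queue name, and the GPU grouping are assumptions for illustration.

```python
def static_agent_commands(gpu_groups, queue="default"):
    """Build one `clearml-agent daemon` command per fixed group of GPU indices."""
    commands = []
    for gpus in gpu_groups:
        gpu_arg = ",".join(str(g) for g in gpus)
        commands.append([
            "clearml-agent", "daemon",
            "--queue", queue,
            "--gpus", gpu_arg,   # this agent only sees these GPUs
            "--detached",        # run in the background
        ])
    return commands


if __name__ == "__main__":
    # Example: one agent on GPU 0, another sharing GPUs 1 and 2
    for cmd in static_agent_commands([[0], [1, 2]], queue="gpu"):
        print(" ".join(cmd))
```

Each printed line can be run as-is in a shell, or passed to `subprocess.run` if you prefer to launch the agents from Python.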

one year ago
