AgitatedDove14

48 Questions, 8049 Answers

Active since 10 January 2023

Last activity 6 months ago

Reputation

Badges 1

25 × Eureka!

Answers 8049

0 Hello! I'm Using The Self-Hosted Version Of Clearml. I'm Doing Some Testing And It Seems That The Clearml Isn't Auto-Logging My Matplotlib Plots. The Versions I'm Using Are matplotlib==3.6.2 And clearml==1.6.4. Am I Missing Something?

Can you post the actual line here? It seems we can fix it to also support this scenario (if we can test it).

one year ago

0 Hi All. I Am Struggling With Integrating Plots Into My Task. Without The Plotting Code, The Task Never Completes The Execution And Seems To Hang. Also, The Plots Are Not Visible In The Plots Tab. I Am Running A For Loop For Different Models And Attemptin

reproduced with matplotlib 3.1

3 years ago

0 Hello Everyone, Is There Any Way To Remove A Serving Instance?

one of them has been named incorrectly and now I'm trying to remove it and it's not running anywhere,

Oh I see, meaning until it "times out".
You could search for it in the UI (based on the session ID) and abort/archive it

4 months ago

0 Hi Folks, We Are Trying To Find A Tool To Help With Workflow Orchestration. This Is Our Stack So Far (Label Studio/Clearml/Seldon). Does Anyone Have Any Experience With Using Any Workflow Which Is Most Compatible Esp Wrt To Clearml.

Hi DeliciousBluewhale87
When you say "workflow orchestration", do you mean like a pipeline automation ?

3 years ago

0 I Would Like To Use Clearml Together With Hydra Multirun Sweeps, But I'm Having Some Difficulties With The Configuration Of Tasks.

I would like to use ClearML together with Hydra multirun sweeps, but I’m having some difficulties with the configuration of tasks.

Hi SoreHorse95
In theory that should work out of the box, why do you need to manually create a Task (as opposed to just have Task.init call inside the code) ?

one year ago

0 While Running Same Experiment Next Time On Clearml-Agent Using Clearml-Server, Is There Any Way We Can Avoid Installations For Creating Virtual Environment For That Particular Experiment And Use Previously Created Environment?

AstonishingWorm64
You can turn on the venv cache; it will just handle its own full env caching 🙂
See here:
https://github.com/allegroai/clearml-agent/blob/4f7407084d1900a79d455570c573e60f40208742/docs/clearml.conf#L100

3 years ago

0 Hi, I'm Trying To Install A New Server, This Is A Fresh Ubuntu 18.04 Install. When I Try To Run The Docker Composer Up Command I Get Error Messages Like This One:

SteadyFox10 I suspect you are correct 🙂
CourageousLizard33 see also section (4) here:
https://github.com/allegroai/trains-server/blob/master/docs/install_linux_mac.md#launching-the-trains-server-docker-in-linux-or-macos

4 years ago

0 Sorry Folks Too Many Questions - If I Have A Project (And I Set The Output Uri In It While Creating, To A S3 Folder) How Can I Ensure That A Experiment (Task) That I Run On My Local Outputs The Model To The Uri?

(or woman or in between, we are supportive as long as code is working 🙂 )

3 years ago

0 Hey I Have A Question, Can You Monitor The Time For One Pipeline, I Want To Observe How Much Time Does My Training Task Take When I Run It Through The Pipeline.

Hi @<1570583227918192640:profile|FloppySwallow46>

Hey I have a question, Can you monitor the time for one pipeline,

you mean to see the start / end time of the pipeline?
Click on the details link on the right hand side and you will have all the details on the pipeline task, including running time

one year ago

0 Hi! I Am Using The Modelcheckpoint Callback From Tensorflow To Save The Best Model. When The Experiment Finishes If I Go On The Server To Experiment > Artifacts > Output Model I Can See The Model And Subsequently By Clicking On It The Weights. How Can I

Yes, that sounds like the issue, is the file actually there ?

3 years ago

0 How Can I Remove A Service With Clearml-Serving?

I put two models in the same endpoint, then only one was running,

without providing version number, you are overriding the models (because this is the same endpoint)

I started another docker container having a different port number and then the curls with the new model endpoint (with the new port) started working

Seems like misconfiguration on the first one?

, which apparently I can't specify when I establish the model endpoint but I need to re compose the docker container by...

one year ago

0 Hey, Don't Really Understand Why The Clearml Worker Needs To Pull The Repository Where My Pipeline (Defined With Decorators) Is Written, Since Apparently A Temporary Python File (Containing At Least The Code And Imports For The Executed Component) Seems

Hi FierceHamster54
Are you saying the pipeline component is a standalone script?
If this is the case then you are correct, it should not need to, I think you can specify it in the decorator.
I think this might work 🤞
@PipelineDecorator.component(..., repo=False)

one year ago

0 Hi Everyone! Is There A Way To Specify The Working Directory In A Pipeline Component? I'm Using Pipelines From Decorators, I Can Set The Repo Url Just Fine, But I'm Running Everything From A Subfolder, And The Working Dir Is Set To

Hmm yes, @<1570220858075516928:profile|SlipperySheep79> I think you are right, in your case it makes sense to add this option.
Could you open a GitHub issue with the feature request? It should be fairly easy to add, and we use GitHub to make sure we track those requests.
wdyt?

8 months ago

0 Question Regarding Tensorboard (If There Is An Answer Here Already Please Send Me A Link). I Have A Few Graphs With The Same X Axis But Different Y Axis That Are Presented On Different Graphs In Tensorboard And For Some Reason Trains Joins Them On The Sam

BTW: CloudyHamster42 I think this issue was discussed on GitHub, and the final "verdict" was we should have an option to split/combine graphs on the UI side (i.e. similar to the "smoothing" or wall-time axis etc.)

4 years ago

0 Second: Is There A Way To Take Internally Tracked Training Runs And Publish Them Publicly, E.G. For A Research Paper? "Appendix A: Training Runs Can Be Found Here, Feel Free To Explore Them And Look At The Loss Curves"? For Example

How would one do this? Do I just share a link to the experiment, like

See "Share" in the right click menu on the experiment

2 years ago

0 Question About The File Server. Currently, We Have A Machine With Minio Installed, And All File Communication Is Made Using The Minio Sdk Client. [Minio Is Just Like An S3 Bucket, Fully Compliant With S3 Protocol]. In The Examples I've Seen The

To store all the debug samples; it can also store all the models (if you configure output_uri='http://file_server_here:8081').
Yes: instead of the file server, use 's3://<ip_of_minio>:9000/bucket', and make sure you add the credentials for MinIO in the trains.conf.
Yes, basically once you have the credentials in the trains.conf, you could do StorageManager.get_local_copy('s3://<minio>:9000/bucket/file') (also upload of course 🙂 )
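A minimal sketch of the MinIO access described above, assuming the current `clearml` package (with the older `trains` package the import would be `from trains import StorageManager`) and assuming the MinIO credentials are already in the conf file. The helper names `minio_uri` and `fetch_from_minio` and the host `minio.example.local` are hypothetical, introduced only for illustration.

```python
from typing import Optional


def minio_uri(host: str, bucket: str, path: str = "", port: int = 9000) -> str:
    """Build an s3://host:port/bucket/path URI in the form used in the answer."""
    suffix = "/" + path.lstrip("/") if path else ""
    return f"s3://{host}:{port}/{bucket}{suffix}"


def fetch_from_minio(uri: str) -> Optional[str]:
    """Download a remote object and return the path of the local cached copy."""
    # Imported lazily so the URI helper works even without the SDK installed.
    from clearml import StorageManager

    return StorageManager.get_local_copy(remote_url=uri)


# Hypothetical host name; replace with the address of your MinIO machine.
example_uri = minio_uri("minio.example.local", "bucket", "file")
# local_path = fetch_from_minio(example_uri)  # needs MinIO credentials configured
```

Including the port in the URI is what distinguishes a MinIO endpoint from a regular AWS S3 bucket in this form.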

3 years ago

0 Hello! I Got The Idea Of Publishing Model/Task. But There Could Be Scenarios When It Still Should Be Archived/Deleted. For Instance Death Of Project. Is It Possible To Archive/Delete/Change Status Of Published Task/Model Via Api? Thanks.

Yey!

3 years ago

0 Hi, I'm Trying To Install A New Server, This Is A Fresh Ubuntu 18.04 Install. When I Try To Run The Docker Composer Up Command I Get Error Messages Like This One:

https://github.com/allegroai/trains/blob/master/docs/trains.conf#L47

4 years ago

0 Good Evening! For Agent Work Please Tell Me If It Is Possible To Specify The Location Of Requirements.Txt In Task.init(), Because I Have Correctly Identified The Versions Of The Libraries Used In The Project

Hi CheerfulGorilla72
The "installed packages" section is used as the "requirements.txt" for the agent.
Are you saying the autodetection fails to detect all packages? In "manual execution" (i.e. not when the agent is running the code), you can tell it to just take the local requirements.txt: `Task.force_requirements_env_freeze(requirements_file="./requirements.txt")`

Notice the above call should be executed before Task.init:

`task = Task.init(...)`
3. If you clear all the "installed packages" se...
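The call order from the answer above can be sketched as follows, assuming the `clearml` package is installed when the function is actually run. The wrapper `init_task_with_requirements` and its file-existence check are hypothetical additions for illustration, not part of the SDK.

```python
from pathlib import Path


def init_task_with_requirements(project: str, name: str,
                                requirements_file: str = "./requirements.txt"):
    """Use a local requirements file for "installed packages", then create the task."""
    req_path = Path(requirements_file)
    if not req_path.is_file():
        # Fail early instead of silently falling back to autodetection.
        raise FileNotFoundError(f"requirements file not found: {req_path}")

    # Imported lazily so the path check can run without the SDK installed.
    from clearml import Task

    # Must be called before Task.init, as noted in the answer.
    Task.force_requirements_env_freeze(requirements_file=str(req_path))
    return Task.init(project_name=project, task_name=name)
```

This only matters for manual (local) execution; when the agent runs the task it installs whatever is listed under "installed packages".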

2 years ago

0 Hi Folks, Is There A Way To Force Clear-Ml Agent With --Docker To

Hi RoughTiger69

One quirk I found was that even with this flag on, the agent decides to install whatever is in the requirements.txt

What's the clearml-agent version you are using?

I just noticed that even when I clear the list of installed packages in the UI, upon startup, clearml agent still picks up the requirements.txt (after checking out the code) and tries to install it.

It can also just skip the entire Python installation with:
CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1
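One way to pass that variable when starting the agent from a script is sketched below. It assumes the variable is read from the environment of the `clearml-agent` process; the helper names and the `default` queue are hypothetical and only for illustration.

```python
import os
from typing import Dict, List


def agent_env() -> Dict[str, str]:
    """Copy the current environment and set the skip flag from the answer."""
    env = dict(os.environ)
    env["CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL"] = "1"
    return env


def agent_command(queue: str = "default") -> List[str]:
    """Command line for an agent daemon in docker mode on the given queue."""
    return ["clearml-agent", "daemon", "--queue", queue, "--docker"]


# To actually start the agent (requires clearml-agent to be installed):
# import subprocess
# subprocess.run(agent_command(), env=agent_env(), check=True)
```

With the Python environment installation skipped, the container image is expected to already hold every package the task needs.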

2 years ago

0 Hello Everyone. I Don't Understand Why Is My Training Slower With Connected Tensorboard Than Without It. I Have Some Thoughts About It But I Am Not Sure. My Internet Traffic Looks Weird. I Think This Is Because Tensorboard Logs Too Much Data On Each Batch And

(this is the part that is not in the background, so if the epoch is short it might have an effect)

2 years ago

0 Another Issue Is The Agent Uses Python 2 For Some Reason Even Though Locally I'm Using Python 3 And The Agent Is Supposed To Use A Python 3 Venv.

Can you send the full log? This is odd, it will by default use the python executable it (the agent) is running with.
Regardless you can specify the python executable to be used here:
https://github.com/allegroai/clearml-agent/blob/bd411a19843fbb1e063b131e830a4515233bdf04/docs/clearml.conf#L44

3 years ago

0 Hello, Does Anyone Know Any Solutions To This?

Hi DeliciousKoala34

Happened when cloning and running a task on an agent on a different machine. I

Sounds like an internal torch issue, can you send the full log of the remote Task?

one year ago

0 Hi Guys, Suppose I Have The Following Script:

My pleasure 💗

3 years ago

0 Hello Everyone, I Have Hosted Clearml Server And Trained A Yolov8 Model To Test My Installations. The Model Was Trained Successfully And I Tried To Optimize The Hyperparameters By Using The Sample Code From Clearml But I'm Getting Some Error In Doing So An

which was trained on jupyter notebook.

Hmm that might be the issue, it assumes a local script running, let me verify that

10 months ago

0 Hi, I'm Following The Instructions For

clearml_agent: ERROR: Can not run task without repository or literalscript in script.diff

This is odd ...

OutrageousSheep60 when you launch clearml-session it tells you the session ID (which is also a Task ID), can you look for it in the UI and check there is something in the repo/uncommitted-changes section ?

2 years ago

0 Hi, I Notice Through The Log That Clearml Cannot Find The Python3.7 That Was Installed In The Docker Container And Is Using The Worker's Default Version.

Hmm, what's the clearml-agent version ?

2 years ago

0 Anyway To Make A Job Fail If The Required Python Version (3.7 Vs 3.8 For Example) Is Not Available In The Agent?

then when we triggered a inference deploy it failed

How would you control it? Is it based on a Task ? like a property "match python version" ?

3 years ago

0 I Have A Problem With Clearml-Agent, The Agent Is Cloning Repository, But When Executing This Command:

🤔

2 years ago

0 Hi Today I'm Suddenly Getting This

I think there was an issue with the entire .ml domain name (at least for some DNS providers)

one year ago
