AgitatedDove14

49 Questions, 8126 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

25 × Eureka!

Answers 8126

0 Hi Again, I Tried To Upgrade Trains Package To 15.1 From 13.1 That I Was Using For A While.. After The Upgrade My Code Stuck When Trying To Use "Pool" (From Multiprocessing Import Pool) The Code Snip:

😞 CooperativeFox72 please see if you can send a code snippet to reproduce the issue. I'd be happy to solve it ...

5 years ago

0 Trying To Setup A Trains-Agent Worker On A Remote Machine; When I Run Trains-Init And Follow The Steps To Give It Credentials For Our Trains Server I Get This

okay, just so I understand, this is what you have on your client that can connect with the server:
api {
    api_server:
    web_server:
    files_server:
    credentials {"access_key": "KEY", "secret_key": "SECRET"}
}

4 years ago

0 Hi Guys, I'M Sure This Has Been Asked Before. But How May I Delete The Project "Trains Examples" From The Trains Server?

Hi TrickyRaccoon92 , yes the examples folder is a special case, I'm not sure you can directly delete it.
Can you archive individual experiments in it ?

5 years ago

0 Hi, I'D Like To Know If It'S Possible To Change The Artifact File Path That Is Shown In The Ui. I'D Need This Because I Have Clearml Agents That Are Running In The Same Vpc Of The Server, So They Use The Internal Dns For The Api Server And Files Server An

Hi LovelyHamster1
That is a good point. I think the safest / most robust way is to configure both to use the same DNS name/s so both (internal/external) are accessible.
Some background: the URL on the artifact is basically standalone. Once registered on the Task, the UI will not replace it but use it as is (the UI has no "understanding" of which server it is, it will just fetch the file).
Are you also using a different port on the load balancer?
(because the easiest fix is on your external ...

4 years ago

0 Hey, Is There A Way To Disable Going To The Demo Server

SharpDove45 FYI:
if you set the environment variable CLEARML_NO_DEFAULT_SERVER=1 , it will make sure never to default to the demo server
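From Python, one way to apply this is to set the variable before clearml is imported. This is only a minimal sketch, and it assumes the SDK reads the variable when it first loads its configuration (that timing is an assumption, not something confirmed above):

```python
import os

# Set the flag before anything imports clearml, so the SDK can see it when it
# loads its configuration (the load timing is an assumption of this sketch).
os.environ["CLEARML_NO_DEFAULT_SERVER"] = "1"

# Only import clearml after this point, for example:
# from clearml import Task
```

Exporting the variable in the shell that launches the script should have the same effect.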

4 years ago

0 I'Ve Tried Setting Up A Clearml Application On Openshift Using The Helm Chart But The Pods Cannot Go Up Because They Are Trying To Write To Files And Directories That Aren'T Open To Non Root Users During Their Setup. This Is A Problem On Openshift Because

(also I'm a bit newer to this world, what's wrong with OpenShift?)

It's the most difficult Kubernetes flavor to work with 🙂

we've already tried that but it didn't really change ...

Can you provide the full log, as well as how you created the pods?

3 years ago

0 Hey All, Hope You’Re All Doing Well. I’M Running A Self-Deployed Server (0.17, I Think, Where Can You Find The Version In Use?). I’M Having Trouble With The Automatic Plot Capture. If I Run

Could you test if this is working:
https://github.com/allegroai/clearml/blob/master/examples/reporting/matplotlib_manual_reporting.py

4 years ago

0 Hi, I Have A Question Regarding The Aws_Autoscaler: It Usually Takes ~Hours To Get A Gpu Instance Nowadays. I Was Thinking, It Would Be Much More Interesting To Stop The Instances (Clearml-Agents) Instead Of Terminating Them Once They Are Inactive, So Tha

instead of terminating them once they are inactive, so that they could be available immediately when they are needed.

JitteryCoyote63 I think you can increase the IDLE timeout on the autoscaler and achieve the same behavior, no?

3 years ago

0 Question About The Trains Agent And The Git Credentials When Setting A Trains Agent, It Is Possible To Configure Git Credentials For It And I'M Trying To Figure Out In Which Cases It Is Necessary. When Executing A Task Remotely (

WackyRabbit7 basically yes 🙂

5 years ago

0 Hey, Somehow

DeliciousSeal67 the agent will use the "install packages" section in order to install packages for the code. If you clear the entire section (you can do that in the UI or programmatically), it will revert to requirements.txt
Make sense?

3 years ago

0 Hi, I Went Through This Slack'S History And The Problem Already Popped Up A Couple Of Times But Doesn'T Look Like Solved. On My Machine I Currently Have 4 Gpus, No Problems If I Want To Allocate All 4 Or Just 1 Using

OutrageousGrasshopper93 is "--gpus all" working ?

5 years ago

0 When An Environment Variable Is Tracked Via

ReassuredTiger98

will it then be used by the clearml-agent

Yes, I think that in order to make it work, you have to make sure that the agent is also running with TRAINS_LOG_ENVIRONMENT=MYVAR*
Notice that you can use a wildcard or a list of the variables you allow either clearml or the agent to monitor / change.
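To make the wildcard idea concrete, here is a rough sketch of how a pattern such as MYVAR* selects variable names. It only illustrates glob-style matching with Python's standard fnmatch module. It is not the clearml implementation, the helper name select_logged_vars is hypothetical, and the comma-separated list format is an assumption:

```python
import fnmatch


def select_logged_vars(patterns, environ):
    """Return the variables from `environ` whose names match any pattern.

    `patterns` is a comma-separated string (an assumed format), where each
    entry may contain glob wildcards such as `*`.
    """
    allowed = [p.strip() for p in patterns.split(",") if p.strip()]
    return {
        name: value
        for name, value in environ.items()
        if any(fnmatch.fnmatchcase(name, p) for p in allowed)
    }


# Hypothetical environment used only for this illustration.
example_env = {"MYVAR_A": "1", "MYVAR_B": "2", "OTHER": "3"}
selected = select_logged_vars("MYVAR*", example_env)
print(sorted(selected))
```

With the pattern MYVAR*, only the two MYVAR_ entries are selected, and OTHER is left out.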

4 years ago

0 Different Question About Warnings: I'M Getting (Infrequently) This Warning, Followed By My Script Hanging

Okay, progress.
What are you getting when running the following from the git repo folder:
git ls-remote --get-url origin

4 years ago

0 Hello! Since Today I Get

The problem is that clearml installs

cudatoolkit=11.0

but

cudatoolkit=11.1

is needed.
You suggested this fix earlier, but I am not sure why it didn't work then.

Hmm, could you test with clearml-agent 0.17.2, to make sure this actually solves the problem?

4 years ago

0 What Sort Of Integration Is Possible With Clearml And Sagemaker? On The Page

Hmm, and you are getting an empty list for this one:

server_info['url'] = f"http://{server_info['hostname']}:{server_info['port']}/"

2 years ago

0 Hello Everyone , I Am New Bee To Clearml And Finding Option To Accommodate Opensearch Since We Have Already Opensearch Running In Our Env, Is Opensearch Supported In Clearml Instead Of Elasticsearch ? Please Shed Some Light On That

Hi CostlyOctopus40

is opensearch supported in ClearML instead of Elasticsearch ? please shed some light on that

Long story short, maybe?! but this is not officially supported.
We only support Elasticsearch. The OpenSearch fork is not officially supported, and since we continue to use more advanced features of Elastic, the API might not be compatible in the future.
Out of curiosity, why are you using opensearch?

one year ago

0 Hello! Since Today I Get

Hi ReassuredTiger98 when you get to it...
please download the wheel, then install it with

pip3 install -U clearml_agent-0.17.3rc0-py3-none-any.whl

Then run the daemon with the additional --debug argument, basically:

clearml-agent --debug daemon --foreground ...

Once the agent is running please send the Task's log from your console 🙂

4 years ago

0 Hello, I Would Like To Use Spot Instances Together With The Aws Autoscaler To Train Models With Pytorch/Ignite And I Am Wondering How To Support Interruptions During The Training (In Case The Instance Is Terminated By Aws). Is There Anything Already Built

The problem comes from ClearML, which thinks it starts from iteration 420 and then adds the iteration number again (421), so it starts logging from 420+421=841

JitteryCoyote63 Is this the issue ?
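The arithmetic in the quote can be spelled out in a few lines. This is only an illustration of the double counting being described, not ClearML code, and the helper name reported_iteration is hypothetical:

```python
def reported_iteration(initial_offset, framework_iteration):
    """Iteration number that gets logged when an offset is added on top."""
    return initial_offset + framework_iteration


# Resuming from a checkpoint that was saved at iteration 420.
offset = 420
# The training loop already counts from the restored state, so its next step is 421.
next_step = 421

double_counted = reported_iteration(offset, next_step)  # offset applied on top
expected = reported_iteration(0, next_step)             # no extra offset
print(double_counted, expected)
```

If the framework already reports absolute iteration numbers after a resume, any extra offset shifts every point by 420.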

4 years ago

0 Hello! Since Today I Get

Okay found it 🙂 it returns 11020 instead of 112
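For context, 11020 matches the integer encoding the CUDA runtime uses for versions (1000 * major + 10 * minor), so it stands for 11.2. A small sketch of turning that into the short form follows. The function name cuda_short_version is hypothetical, and whether the agent does the conversion this way is an assumption:

```python
def cuda_short_version(encoded):
    """Convert a CUDA-style integer version (1000*major + 10*minor) to 'MAJORMINOR'."""
    major = encoded // 1000
    minor = (encoded % 1000) // 10
    return f"{major}{minor}"


print(cuda_short_version(11020))
```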

4 years ago

0 I Think There Is A Little Bug With The

What am I missing ?

3 years ago

JitteryCoyote63 that makes total sense!!
The reporting subprocess is not being updated with the new value! Let me check how we can pass it along...

4 years ago

I might have an idea, could you test with:
```
from clearml import Task
Task._report_subprocess_enabled = False

# ... real code here
```

4 years ago

0 And One More Question. How Can I Get Loaded Model In Preporcess Class In Clearml Serving?

ohh AbruptHedgehog21 if this is the case, why don't you store the model with torch.jit.save and use Triton to run the model ?
See example:
https://github.com/allegroai/clearml-serving/tree/main/examples/pytorch
(BTW: if you want a full custom model serve, in this case you would need to add torch to the list of python packages)

3 years ago

0 And One More Question. How Can I Get Loaded Model In Preporcess Class In Clearml Serving?

we will try to use Triton, but it’s a bit hard with transformer model.

Yes ...

All extra packages we add in serving)

So it should work. You can also run your preprocess class manually from your own machine (for debugging): if you pass it a local file (basically the downloaded model file from the UI), it should work

it. But it’s maybe not the best solution

Yes... it is not, separating the pre/post to CPU instance and letting triton do the GPU serving is a lot more effici...

3 years ago

0 Assuming I Have A

WackyRabbit7 I guess we are discussing this one on a diff thread 🙂 but yes, should totally work, that's the idea

5 years ago

0 Assuming I Have A

(without having to execute it first on Machine C)

Someone somewhere has to create the definition of the environment...
The easiest way to go about it is to execute it once.
You can add the following line to your code:
task.execute_remotely(queue_name='default')
This will cause your code to stop running and enqueue itself on a specific queue.
Quite useful if you want to make sure everything works (like running a single step), then continue on another machine.
Notice that switching between cpu...

5 years ago

0 And One More Question. How Can I Get Loaded Model In Preporcess Class In Clearml Serving?

How can i get loaded model in Preporcess class in ClearML Serving?

ComfortableShark77
You mean your preprocess class needs a python package or is it your own module ?

3 years ago

0 Hi Folks Any Info On When The Helm Chart Will Be Updated For 1.0.1 ?

checking it

4 years ago

0 I Am Trying To Plot Values That Are Either 0 Or 1 (With Tensorboardx.Add_Scalar). However, It Doesn'T Show Correctly. Any Idea Why? (Smoothing Is 0)

So currently there is a limit (from Elasticsearch) of about 10k points (anything above that is subsampled).
In the new version we are adding a "maximize" button, then in the full screen you will have the raw data including all ???k samples. Sounds good?
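To see why subsampling can distort a series that only takes the values 0 and 1, here is a purely illustrative sketch. It uses a naive "keep every k-th point" scheme. The server's actual subsampling method is not described here, so both the scheme and the helper name subsample are assumptions:

```python
def subsample(points, limit):
    """Keep at most `limit` points by taking every k-th sample."""
    if len(points) <= limit:
        return list(points)
    step = -(-len(points) // limit)  # ceiling division
    return points[::step]


# 40k alternating 0/1 values, reduced to the ~10k limit mentioned above.
series = [i % 2 for i in range(40_000)]
reduced = subsample(series, 10_000)
print(len(reduced), set(reduced))
```

In this contrived case every kept sample falls on an even index, so the reduced series no longer contains any 1 values at all.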

4 years ago

0 Hi All, I'M New With Clearml And I Have A Question. I Have A Modular Code, And When I'M Trying To Run It In A Remote Machine With The Agent, I Get An Error On The Line 'From X Import Y', Which Says That There Isn'T Such Module X. Any Help? Thanks.

https://stackoverflow.com/questions/5419/python-unicode-and-the-windows-console
Hmm, try setting this one before spinning up the agent.
Windows:
set PYTHONIOENCODING=:replace
Inside Colab:
os.environ["PYTHONIOENCODING"] = ":replace"

4 years ago
