Hi @<1687643893996195840:profile|RoundCat60>
Are you running on AWS?
LovelyHamster1 from the top, we have two steps:
1. We run the code "manually" (i.e. without the agent). This step creates the experiment (Task) and automatically fills in the "Installed packages" section (which is in the same format as a regular requirements.txt).
2. An agent runs a cloned copy of the experiment (Task). The agent creates a new venv on the agent's machine, then uses the "Installed packages" section as a replacement for the regular "requirements.txt" and installs everything fro...
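A minimal sketch of that flow (project/queue names are just placeholders):

from clearml import Task

# step 1: run "manually" - Task.init creates the experiment and records the installed packages
task = Task.init(project_name="examples", task_name="step 1 - create the Task")

# step 2 equivalent (usually done from the UI): clone the Task and enqueue it for an agent
cloned = Task.clone(source_task=task, name="cloned copy")
Task.enqueue(cloned, queue_name="default")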
VexedCat68 both are valid. In case the step was cached (i.e. already executed) the node.job will be None, so it is probably safer to get the Task based on the "executed" field which stores the Task ID used.
(Not sure it actually has that information)
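Something along these lines should work (a sketch, assuming node is the pipeline node object you already have):

from clearml import Task

# if the step was cached, node.job is None, but node.executed still stores the ID of the Task that was used
step_task = Task.get_task(task_id=node.executed)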
It is way too much to pass as an env variable
Wait, why aren't you just calling Popen (or os.system)? I'm not sure how it relates to the torch multiprocess example. What am I missing?
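For reference, a minimal sketch of launching a subprocess (the script name is just a placeholder):

import subprocess

# launch the script as a separate process and wait for it to finish
proc = subprocess.Popen(["python", "my_script.py"])
proc.wait()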
So I wonder - why should an agent be related to a specific user's credentials? Is the right way to go about this to create a "fake user" for the sake of the agent?
Very true, you have to have credentials for the trains-agent so it can "report" to the trains-server. That said, the creator of the Task (i.e. the person who cloned it) will be registered as the "user" in the UI.
I would recommend creating an "agent" user and putting its credentials on the trains-agent machine (the same way...
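Roughly (the exact section name depends on your version, so treat this as an assumption): create the "agent" user, generate an access/secret key pair for it in the UI, and put those keys under the api.credentials section of the clearml.conf (trains.conf on older versions) on the agent machine.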
Hi @<1709015393701466112:profile|ScatteredPeacock14>
I get 3 tasks created in total. Any ideas?
Could it be an old instance of the same Task?
Notice the for loop starts from 1 so it does include the master node:
the only problem with it is that it will start the task even if the task is completed
What are the criteria?
Hmm, if this is the case, you can add some prints in here:
the service/action will tell you what you are sending
wdyt?
Yes... I think that this might be a bit too much automagic, even for clearml
Is it possible in Clearml to somehow allocate resources so that maybe after running a number of Alice's tasks, Bob's tasks get processed (like maybe in a round-robin fashion)?
Hi DeliciousBluewhale87
A few options here:
1. Set the agent with high / low priority queues. Make sure Alice pushes into low priority (aka HPO), then Bob can push into high priority when he needs to. This makes a lot of sense when you have automation processes spinning many experiments (see the note below on multi-queue agents).
2. Expanding (1), you could set differe...
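For reference, a single agent can listen to multiple queues in priority order, e.g. clearml-agent daemon --queue high_priority low_priority (queue names are just placeholders); it will always pull from the first queue that still has pending Tasks before moving to the next one.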
SmallBluewhale13 in your code, what are you getting when you print the version:
from clearml import __version__
print(__version__)
This will set more time before the timeout right?
Correct.
task.freeze_monitor()
download()
task.defrost_monitor()
Currently there isn't, but that's a good idea.
What would be the argument of using it vs increasing the timeout ?
btw: setting the resource timeout to 99999 will basically mean that it will wait until the first reported iteration, not that it will just sleep for 99999 sec
So I think this is a good example of pipelines and data:
Basically Task A generates data stored using clearml-data
(See Dataset class). The output of that is an ID of the Dataset. Then Task B uses that ID to retrieve the Dataset created by Task A.
documentation
https://github.com/allegroai/clearml/blob/master/docs/datasets.md
Example:
Step A creating Dataset:
https://github.com/alguchg/clearml-demo/blob/main/process_dataset.py
Step B training model using the Dataset created in ...
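A minimal sketch of the same idea (dataset/project names are placeholders):

from clearml import Dataset

# Task A: create and upload a Dataset, then pass its ID along (e.g. as a parameter of Task B)
ds = Dataset.create(dataset_name="raw_data", dataset_project="demo")
ds.add_files("data/")
ds.upload()
ds.finalize()
dataset_id = ds.id

# Task B: retrieve the Dataset created by Task A using that ID
local_folder = Dataset.get(dataset_id=dataset_id).get_local_copy()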
Any chance you can share the Log?
(feel free to DM it so it will not end up public)
Hi JealousParrot68
no need for decorators, you can just pass the function to schedule_function=<function goes here>
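For example, a rough sketch (the function name and schedule are made up):

from clearml.automation import TaskScheduler

def my_cleanup_job():
    print("running the scheduled function")

scheduler = TaskScheduler()
# schedule_function is called on the given schedule instead of launching a Task
scheduler.add_task(schedule_function=my_cleanup_job, minute=30)
scheduler.start()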
See scheduler here
https://github.com/allegroai/clearml/blob/8708967a5ef4d8529a1a5ea417672e3ebbb258d7/clearml/automation/scheduler.py#L485
And triggers here:
https://github.com/allegroai/clearml/blob/8708967a5ef4d8529a1a5ea417672e3ebbb258d7/clearml/automation/trigger.py#L193
https://github.com/allegroai/clearml/blob/8708967a5ef4d8529a1a5ea417672e3ebbb258d7/clea...
Hi NastyFox63 yes I think the problem was found (actually backend side).
It will be solved in the upcoming release (due after this weekend)
ldconfig from /etc/profile which is put there by the interactive_session_task
LackadaisicalOtter14 are you sure? Maybe this is done as part of the installation the interactive session runs?
Could that be the issue?
apt-get update && apt-get install -y openssh-server
First, that is awesome to hear PanickyFish98!
Can you send the full exception? You might be on to something...
2. Actually we thought of it, but could not find a use case, can you expand?
3. I'm not sure I follow, do you mean you expect the first execution to happen immediately?
my question is how to recover, must I recreate the agents or is there another way?
Yes you have to recreate the Task (I assume they failed, no?!)
Was I right to put the credentials in clearml.conf on the machine I am starting the agent on?
AdventurousButterfly15 Yes exactly!
you should be able to see that in the log of the Task (at the top of the log there will be the entire configuration), can you see the git user there?
It actually started executing your code, but it did not capture it correctly:
/root/.clearml/venvs-builds/3.10/bin/python -u /root/.clearml/venvs-builds/3.10/code/colab_kernel_launcher.py
Which I assume means the actual Task had bad code.
What do you have under the Task's execution tab in the UI (the one you were launching, i.e. enqueueing)?
Hi DilapidatedDucks58 ,
I'm not aware of anything of this nature, but I'd like to get a bit more information so we could check it.
Could you send the web-server logs? Either from the docker or the browser itself.
The difference is whether you are only supplying a "minutes" or you are also passing hour/day etc.
See the examples:
Every 15 minutes:
add_task(task_id='1235', queue='default', minute=15)
Every hour on minute 20 of the hour (i.e. 00:20, 01:20 ...):
add_task(task_id='1235', queue='default', hour=1, minute=20)
Hi IntriguedRat44
You can make it log offline (i.e. into a local folder/zip) by calling:
Task.set_offline(True)
You can also set the environment variable:
TRAINS_OFFLINE_MODE=1
You could also just skip the Trains.init call
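For example, a minimal sketch (project/task names and the session path are illustrative, assuming a recent clearml version):

from clearml import Task

Task.set_offline(True)  # everything is written to a local zip instead of being sent to the server
task = Task.init(project_name="demo", task_name="offline run")
# ... your training code ...
task.close()
# later, from a machine with server access, the offline session can be imported back:
# Task.import_offline_session("/path/to/offline_session.zip")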
Does that help?
Hi RoundSeahorse20
Try the following, let me know if it worked:
import logging
clear_logger = logging.getLogger('clearml.metrics')
clear_logger.setLevel(logging.ERROR)
That's the theory, I still see it is not there
... grab the model artifacts for each, put them into the parent HPO model as its artifacts, and then go through and archive everything.
Nice. Wouldn't it make more sense to "store" a link to the "winning" experiment, so you know how to reproduce it and the set of HPs that were chosen?
Not that the model is bad, but how would I know how to reproduce it, or retrain when I have more data, etc.
Hi SubstantialElk6
ClearML-Data doesn't actually "load" the data, it brings it locally and returns a folder with all your data files. From that point onward, it's up to your code to load it into the framework. Make sense?
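A tiny sketch of what that looks like (dataset/project names and the file are placeholders):

import os
import pandas as pd
from clearml import Dataset

# this only fetches/caches the files locally and returns the folder path
folder = Dataset.get(dataset_name="my_dataset", dataset_project="demo").get_local_copy()

# loading the files into your framework is up to your code, e.g.:
df = pd.read_csv(os.path.join(folder, "train.csv"))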