Thanks TroubledJellyfish71, I managed to locate the bug (and indeed it's the new aarch64 package support)
I'll make sure we push an RC in the next few days; until then, as a workaround, you can use the full (http) link to the torch wheel
BTW: 1.11 is the first version to support aarch64, if you request a lower torch version, you will not encounter the bug
It's dead simple to install:
pip install trains-agent
then you can simply do:
trains-agent execute --id myexperimentid
Hi CharmingPuppy6
Basically yes there is.
The way ClearML is designed is to have queues abstract different types of resources, for example a queue for single-GPU jobs (let's name it "single_gpu") and a queue for dual-GPU jobs (let's name it "dual_gpu").
Then you spin agents on machines and have the agents pull jobs from specific queues based on the hardware they have. For example, we can have a 4-GPU machine with 3 agents, one agent connected to 2x GPUs and pulling Tasks from the "dual_gpu...
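The routing idea above can be sketched as a toy model (this is not the ClearML API, just an illustration of queues abstracting resource types, with each agent pulling only from the queue that matches its hardware):

```python
from collections import deque

# Toy sketch: named queues abstract resource types.
queues = {"single_gpu": deque(), "dual_gpu": deque()}

def enqueue(queue_name, task):
    """A user enqueues a Task by resource need, not by machine."""
    queues[queue_name].append(task)

def agent_pull(queue_name):
    """An agent bound to `queue_name` pulls the next Task, if any."""
    q = queues[queue_name]
    return q.popleft() if q else None

enqueue("dual_gpu", "train_resnet")   # needs 2 GPUs
enqueue("single_gpu", "eval_model")   # needs 1 GPU

# The dual-GPU agent only ever sees dual-GPU work:
print(agent_pull("dual_gpu"))    # → train_resnet
print(agent_pull("single_gpu"))  # → eval_model
```

The task names here are made up; the point is only that agents never see work their hardware cannot serve.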
I assume so 🙂 Datasets are kind of agnostic to the data itself; for the Dataset it's basically a file hierarchy
The idea of queues is, on the one hand, not to give users too much freedom, and on the other, to allow for maximum flexibility & control.
The granularity offered by K8s (and as you specified) is sometimes way too detailed for a user. For example, I know I want 4 GPUs, but 100GB disk-space? No idea, just give me 3 levels to choose from (if any; actually I would prefer a default that is large enough, since this is by definition for temp cache only), and the same argument goes for the number of CPUs.
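The "3 levels" idea could look something like the sketch below. All names and numbers here are made up for illustration; nothing in this snippet is a ClearML or K8s API:

```python
# Toy sketch: preset resource tiers instead of full K8s-style granularity.
TIERS = {
    "small":  {"gpus": 1, "cpus": 4,  "disk_gb": 50},
    "medium": {"gpus": 2, "cpus": 8,  "disk_gb": 100},
    "large":  {"gpus": 4, "cpus": 16, "disk_gb": 200},
}

def resolve(tier="medium"):
    """A user picks a coarse tier; the default should be 'large enough',
    since disk here is by definition temp cache only."""
    return TIERS[tier]

print(resolve("large"))  # → {'gpus': 4, 'cpus': 16, 'disk_gb': 200}
```

The design point is that the user expresses intent ("large"), and the ops side owns the mapping to concrete resources.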
Ch...
I can definitely see your point from the "DevOps" perspective, but from the user perspective it puts the "liability" on me to "optimize" the resource, which to me sounds a bit much to put on my tiny shoulders; I just have general knowledge of what I need. For example, lots of CPUs (because I know my process scales well with more CPUs), or large memory (because I have an entire dataset in memory). Personally (and really only my personal perspective), I'd rather have the option to select from a...
Just making sure, pip package installed on your Conda env, correct?
Hi GreasyPenguin14
However the cleanup service is also running in a docker container. How is it possible that the cleanup service has access and can remove these model checkpoints?
The easiest solution is to launch the cleanup script with a mount point from the storage directory to inside the container (-v <host_folder>:<container_folder>)
The other option, which clearml version 1.0 and above supports, is using Task.delete, which now supports deleting the artifacts and mod...
HugePelican43 sure you can; usually the limiting factor is memory, as it cannot be shared among processes, so if one allocates all the memory the second process will crash with an out-of-memory error
Hi MagnificentSeaurchin79
Could you test with the tensorflow toy example?
https://github.com/allegroai/clearml/blob/master/examples/frameworks/tensorflow/tensorboard_toy.py
MagnificentSeaurchin79 making sure the basics work.
Can you see the 3D plots under the Plot section ?
Regarding the Tensors, could you provide a toy example for us to test?
No, by definition the agent will only execute one Task at a time; you can spin a second agent on the same GPU :)
Hi PunyGoose16 ,
I think the website is probably the easiest 🙂
https://clear.ml/contact-us/
I think they get back to you quite quickly
Hi DeliciousBluewhale87
I think we had a docker that does exactly that, and then you would spin the docker as a k8s service , is this what you are referring to?
Hi UnsightlySeagull42
does anyone know how this works with git ssh credentials?
These will be taken from the host ~/.ssh folder
Seems that the API has changed quite a bit since a few versions back.
Correct, notice that your old pipeline Tasks use the older package and will still work.
There seems to be no need in controller_task anymore, right?
Correct, you can just call pipeline.start()
🙂
The pipeline creates the tasks, but never executes/enqueues them (they are all in Draft mode). No DAG graph appears in the RESULTS/PLOTS tab.
Which vers...
Sorry, what I meant is that it is not documented anywhere that the agent should run in docker mode, hence my confusion
This is a good point! I'll make sure we stress it (BTW: it will work with elevated credentials, but probably not recommended)
LOL, thanks!
So would this pseudo code solve the issue?
def pipeline_creator():
    pipeline_a_id = os.system("python3 create_pipeline_a.py")
    print(f"pipeline_a_id={pipeline_a_id}")
something like that?
(obviously the question is how you would get the return value of the new pipeline ID, but I'm getting ahead of myself)
Hi @<1523701079223570432:profile|ReassuredOwl55> let me try to add some color here:
Basically we have two parts: (1) the pipeline logic, i.e. the code that drives the DAG, and (2) the pipeline components, e.g. model verification
The pipeline logic (1), i.e. the code that creates the DAG, the tasks, and enqueues them, will be running in the git actions context, i.e. this is the automation code. The pipeline components themselves (2), e.g. model verification, training, etc., are running using the clearml agents...
It's only on this specific local machine that we're facing this truncated download.
Yes, that's what the log says; makes sense
Seems like this still doesn't solve the problem, how can we verify this setting has been applied correctly?
Hmm, exec into the container? What did you put in clearml.conf?
trains-agent should be deployed to GPU instances, not the trains-server.
The trains-agent purpose is for you to be able to send jobs to a GPU (at least in most cases) instance.
The "trains-server" is a control plane, basically telling the agent what to run (by storing the execution queues and tasks). Make sense?
Or can I enable agent in this kind of local mode?
You just built a local agent
AdventurousRabbit79 you are correct, caching was introduced in v1.0; also notice the default is no caching, you have to specify that you want caching per step.
Hi ContemplativeGoat37
is it a good idea to use ClearML Agent Services for such things?
Yes! It is exactly the kind of thing it was designed to do 🙂
but when I run the same task again it does not map the keys... (edited)
SparklingElephant70 what do you mean by "map the keys" ?
Hi UnsightlySeagull42
But now I need the hyperparameters in every python file.
You can always get the Task from anywhere:
main_task = Task.current_task()
SubstantialElk6 this is odd, how are they passed ? what's the exact setup ?