Having the ability to pack jobs/tasks onto the same "resource" (underlying server/EC2 instance)
This is essentially a "queue". Basically a queue is a way to abstract a specific type of resource, so that you can achieve exactly what you described.
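For instance, a minimal sketch of that pattern (queue, project, and repo names here are placeholders): you enqueue tasks onto a named queue, and a single agent running on the EC2 instance pulls them one after the other:

from clearml import Task

# create a task from an existing repo/script (hypothetical names)
task = Task.create(
    project_name="examples",
    task_name="packed-job",
    repo="https://github.com/your-org/your-repo.git",
    script="train.py",
)
# push it onto the queue that the single EC2 instance is serving
Task.enqueue(task, queue_name="single-ec2-instance")

# on the instance itself, one agent serves that queue, e.g.:
#   clearml-agent daemon --queue single-ec2-instance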
open up a streaming use case, wherein batch (offline) inference could be done directly inside of a ClearML pipeline in reaction to an event/trigger (like new data landing in your data lake).
Yes, that's exactly how clearml is designed, a...
I will take any suggestion 🙂
git remote -v
could be a good start but I'm not familiar with the output structure, is there a template for parsing ?
By default the agent will add the root of the git repository into the PYTHONPATH, so that you can import...
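To illustrate (hypothetical repo layout), having the repo root on the PYTHONPATH means sibling packages resolve without any sys.path tweaking:

# my_repo/
#   utils/helpers.py
#   experiments/train.py   <- the task's entry point
# because the repo root is on PYTHONPATH, inside experiments/train.py:
from utils.helpers import load_config  # hypothetical module/function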
I think CostlyOstrich36 managed to reproduce?!
So basically development on a "shared" GPU?
Hi CostlyElephant1
What do you mean by "delete raw data"? Data is always fetched to cached folders and clearml takes care of cache cleanup
That said, notice that get_mutable_local_copy takes a target folder you specify; in this case you should definitely delete it after usage. Wdyt ?
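A minimal sketch of that cleanup pattern, assuming an existing dataset (project, name, and target folder are placeholders):

import shutil
from clearml import Dataset

ds = Dataset.get(dataset_project="examples", dataset_name="my-dataset")
# the target folder is yours, i.e. NOT managed by the clearml cache:
local_copy = ds.get_mutable_local_copy(target_folder="/tmp/my_dataset_copy")
try:
    ...  # process/modify the files
finally:
    shutil.rmtree(local_copy)  # so you clean up the mutable copy yourself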
I managed to set up my (Windows) laptop as a worker and reproduce the issue.
Any insight on how we can reproduce the issue?
PompousParrot44
Check out the task.execute_remotely()
You can call it right after the task init, and it will enqueue your running Task, and leave the process (if you want).
https://github.com/allegroai/trains/blob/65a4aa7aa90fc867993cf0d5e36c214e6c044270/trains/task.py#L1437
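A minimal sketch of the pattern (the queue name is a placeholder):

from clearml import Task

task = Task.init(project_name="examples", task_name="remote-run")
# everything above runs locally (capturing repo, env, params);
# this call enqueues the task and exits the local process:
task.execute_remotely(queue_name="default", exit_process=True)
# from here on, the code only runs on the agent that pulled the task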
SubstantialElk6 if you call Task.init with continue_last_task=<task_id> it will automatically add the last_iteration of the previous run to any logging/report, so you never overwrite the previous reports 🙂
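A minimal sketch, with the previous run's ID as a placeholder:

from clearml import Task

task = Task.init(
    project_name="examples",
    task_name="resumed-training",
    continue_last_task="<previous_task_id>",  # placeholder task ID
)
# iteration counters continue from the previous run's last_iteration,
# so new scalars/plots are appended instead of overwriting old ones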
Hi @<1562610699555835904:profile|VirtuousHedgehong97>
I think you need to upgrade your self-hosted clearml-server, could that be the case?
Omg that's a lot of submodules!
It has nothing to do with what the task sees. If you are inside a git repo, you will have to clone it on the remote machine. Let me check in the code, maybe you have a workaround.
We just don't want to pollute the server when debugging.
Why not ?
you can always remove it later (with Task.delete) ?
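Something like this sketch (the task ID is a placeholder):

from clearml import Task

debug_task = Task.get_task(task_id="<debug_task_id>")
debug_task.delete()  # removes the task from the server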
I know about clearml.conf but wanted to avoid ssh-ing through 50 instances to edit it.
LOL yeah, btw: this is exactly the reason the enterprise version has a vault feature, so one could edit the base configuration in the UI and it automatically propagates everywhere
but docker_arguments doesn't propagate if I leave docker_image as None
yeah, that's correct, you have to select a container to be used
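For example, a minimal sketch setting both together (the image and arguments here are placeholders):

from clearml import Task

task = Task.init(project_name="examples", task_name="docker-run")
task.set_base_docker(
    docker_image="nvidia/cuda:11.8.0-runtime-ubuntu22.04",
    docker_arguments="--ipc=host -e MY_VAR=1",
)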
No, it should be fine... Let me see if I can get a windows box 🙂
How are you getting:
beautifulsoup4 @ file:///croot/beautifulsoup4-split_1681493039619/work
Is this what you had on the original manual execution? (i.e. not the one executed by the agent) - you can also look under the "org_pip" dropdown in the "installed packages" of the failed Task
Is there any documentation on versioning for Datasets?
You mean how to select the version name ?
Could it be someone deleted the file? This is inside the temp venv folder, but it should not get there.
For example, the Task object is heavily overloaded and its documentation would benefit from being separated into logical units of work. It would also make it easier for the ClearML team to spot any formatting issues.
This is a very good point (the current documentation is basically the docstrings, but we should create a structured one)
... but some visualization/inline code with explanation is also very much welcome.
I'm assuming this is connected with the previous po...
single task in the DAG is an entire ClearML pipeline.
Just making sure details are not lost, "entire ClearML pipeline": the pipeline logic is process A running on machine AA.
Every step of that pipeline can be (1) a subprocess, but that means the exact same environment is used for everything, or (2) the DEFAULT behavior, where each step B is running on a different machine BB.
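A minimal sketch of the two modes (project and step names are placeholders):

from clearml.automation import PipelineController

pipe = PipelineController(name="my-pipeline", project="examples", version="1.0")
pipe.add_step(name="step_b", base_task_project="examples", base_task_name="step B")

# (1) everything as local subprocesses, one shared environment:
pipe.start_locally(run_pipeline_steps_locally=True)

# (2) the default: each step is enqueued and picked up by an agent
#     on a (potentially) different machine:
# pipe.start()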
The non-ClearML steps would orchestrate putting messages into a queue, doing retry logic, and tr...
Hi GreasyPenguin66
So the way clearml can store your notebook is by using the jupyter-notebook REST API. It assumes that it can communicate with it, as the kernel is running on the same machine. What exactly is the setup? Is the jupyter-lab/notebook running inside the docker? Maybe the docker itself is running with some --network argument ?
how did you try to restart them ?
Yes, but how did you restart the agent on the remote machine ?
Since I'm assuming there is no actual task to run, and you do not need to setup the environment (is that correct?)
you can do:
$ CLEARML_OFFLINE_MODE=1 python3 my_main.py
wdyt?
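A minimal sketch of that round-trip (paths are placeholders):

from clearml import Task

Task.set_offline(offline_mode=True)  # same effect as CLEARML_OFFLINE_MODE=1
task = Task.init(project_name="examples", task_name="offline-run")
# ... my_main.py logic; everything is recorded into a local session zip ...
task.close()

# later, on a machine that can reach the server:
# Task.import_offline_session("/path/to/offline_session.zip")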
Hi VexedCat68
can you supply more details on the issue ? (probably the best is to open a github issue, and have all the details there, so we have better visibility)
wdyt?
It's the same but done from outside, you want the same and "offline" as well right?
Failed to initialize NVML: Unknown Error
yeah this is a driver issue. I think you need to check the VM image to see if the drivers match the GPU on that machine
Hi ColossalAnt7, I think we ran into it on a few dockers; I believe the bug was fixed in the latest trains-agent RC. Could you verify please ?
Well it should work out of the box as long as you have the full route, i.e. Section/param
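A minimal sketch of addressing a value by its full route (the names here are placeholders):

from clearml import Task

task = Task.init(project_name="examples", task_name="params-demo")
params = {"param": 42}
task.connect(params, name="Section")     # appears in the UI as Section/param
task.set_parameter("Section/param", 13)  # same value addressed by full route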
PYTHONPATH is still not working as expected
inside your code, if you do:
import os
print("PYTHONPATH", os.environ["PYTHONPATH"])
what are you getting?