AgitatedDove14

49 Questions, 8126 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

25 × Eureka!

Answers 8126

0 Hi, I'm Configuring An Agent. After Pasting The Credentials, I Get:

GiddyTurkey39
I would guess your VM cannot access the trains-server, meaning an actual network configuration issue.
What are the VM IP and the trains-server IP? (The first two numbers are enough, e.g. 10.1.X.Y 174.4.X.Y)

5 years ago

0 Hi Guys, Just Wondering If Anyone Encountered This Error When Using The Pipeline Controller Object. I Simply Added A Step With The Step-Name And Base_Task_Id As Flags.

Oh that is odd... let me check something

4 years ago

0 What Is The Method For Packages Exploration When Using Conda? Agent Is Set To 'Conda' Mode. We Upload A Task From A Local Conda Env That (Obviously) Has Some Pip Packages As Well. When We Enqueue The Task To Run Remotely, Not All Conda Packages Are Instal

I'm not sure if it matters but 'kwcoco' is being imported inside one of the repo's functions and not on the script's header.

Should work.
When you run pip freeze inside the same env, what are you getting?
Also, is there any other import that is missing? (Basically 'clearml' tries to be smart and check whether the script itself, even though it is inside a repo, is not actually importing anything from the repo; if that is the case, it will only analyze the original script. Basically...

3 years ago

0 Hi, I Tried To Setup Clearml Serving And Ran The Example Given

Can you post here the docker-compose.yml you are spinning up? Maybe it is the wrong one?
Step 4 here:
https://github.com/thepycoder/asteroid_example#deployment-phase

3 years ago

0 When I Run Experiments I Set

IntriguedRat44 If the monitoring only shows a single GPU (the selected one) it means it reads the correct CUDA_VISIBLE_DEVICES (this is how it knows that you are only using a selected GPU not all of them).
There is nothing else in the code that will change the OS environment.
Could you print os.environ['CUDA_VISIBLE_DEVICES'] while running the code to verify?
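A minimal sketch of that check, using only the standard library. The helper name visible_gpus is made up for illustration; it uses .get() rather than indexing so that an unset variable does not raise KeyError.

```python
import os


def visible_gpus(environ=os.environ):
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES, or None if it is unset."""
    value = environ.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        # Unset variable: no restriction was applied through the environment
        return None
    # The variable holds a comma-separated list of device ids
    return [part.strip() for part in value.split(",") if part.strip()]


# Print the raw value and the parsed list from inside the running code
print("CUDA_VISIBLE_DEVICES =", os.environ.get("CUDA_VISIBLE_DEVICES"))
print("Visible GPUs:", visible_gpus())
```

If the printed value matches the GPU you selected, the process sees the same setting that the monitoring reads.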

4 years ago

0 How Can I Add My Requirements.Txt File To The Pipeline Instead Of Each Tasks?

When I am running the pipeline remotely, is there a way the remote machine can access it?

Well, for the dataset to be accessible, you need to upload it with the Dataset class; then the remote machine can call Dataset.get(...).get_local_copy() to get the actual data on the remote machine.
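A rough sketch of the remote side, assuming the dataset was already uploaded and finalized, and that the clearml package and credentials are configured on the remote machine. The function name and the project/dataset names in the usage note are placeholders, not values from this thread.

```python
def fetch_dataset_copy(dataset_project, dataset_name):
    """Return the path of a local cached copy of a ClearML dataset."""
    # Imported lazily so this module still loads where clearml is not installed
    from clearml import Dataset

    # Look the dataset up by project and name (placeholders supplied by the caller)
    dataset = Dataset.get(dataset_project=dataset_project, dataset_name=dataset_name)
    # Download (or reuse) the cached copy and return its folder path
    return dataset.get_local_copy()
```

A pipeline step could then call fetch_dataset_copy("my_project", "my_dataset") and read its files from the returned folder.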

2 years ago

0 Hi, I'm Trying To Clone And Queue Experiments For Running Them On My Workers. I Am Able To Successfully Clone And Queue The Task, But Seems Like The Task Does Not Pass The Correct Parameters To My Python Script On The Worker. We Use Hydra For Configuring

Wait, it shows "hydra==2.5" not "hydra-core==x.y" ?

3 years ago

0 Unrelated Problem (Or Is It?) The Clearml's Built In Cleanup Service Fails

Yes, I do have a GOOGLE_APPLICATION_CREDENTIALS environment variable set, but nowhere do we save anything to GCS. The only usage is in the code which reads from BigQuery

Are you certain you have no artifacts on GS?
Are you saying that if GOOGLE_APPLICATION_CREDENTIALS is set and clearml.conf contains no "project" section, it crashes on startup?

3 years ago

0 Hi, I Try To Run Locally

Hi @<1523706266315132928:profile|DefiantHippopotamus88>
The idea is that clearml-server acts as a control plane and can sit on a different machine; obviously you can run both on the same machine for testing. Specifically, it looks like clearml-serving is not configured correctly, as the error points to an issue with the initial handshake/login between the triton containers and the clearml-server. How did you configure the clearml-serving docker compose?

3 years ago

0 Hi, I Am Trying To Setup An Auto Scaler, But I Am Getting The Following Dependency Error:

Hi SkinnyPanda43
This issue was fixed with clearml-agent 1.5.1, can you verify?

2 years ago

0 I Am Trying Pytorch Nightly Again With Python 3.10. Works Fine Locally, But Fails On Clearml-Agent In Docker Mode.

seems like pip 20.1.1 has the issue, but >= 22.2.2 do not.

Notice we changed the value there; it now has two versions, one for Python < 3.10 and one for Python >= 3.10.
The main reason is that pip changed its resolving algorithm, and the new one can break its own dependencies (i.e. pip freeze > requirements.txt -> pip install might not actually work)

2 years ago

0 I Am Using Clearml Pro And Pretty Regularly I Will Restart An Experiment And Nothing Will Get Logged To Clearml. It Shows The Experiment Running (For Days) And It's Running Fine On The Pc But No Scalers Or Debug Samples Are Shown. How Do We Troubleshoot T

task.connect(model_config)
task.connect(DataAugConfig)

If these are separate dictionaries, you should probably use two sections:

    task.connect(model_config, name="model config")
    task.connect(DataAugConfig, name="data aug")

It is still getting stuck.
I notice that one of the scalars that gets logged early is logging the epoch while the remaining scalars seem to be iterations because the iteration value is 1355 instead of 26

Wait, so you are seeing some scalars?...

one year ago

0 Hello, My Dl Workflow Includes Post-Training Quantization. Is There A Way To Implement These Procedures In Clearml?

MistakenBee55 how about a Task doing the model quantization, then triggering it with TriggerScheduler?
https://github.com/allegroai/clearml/blob/master/examples/scheduler/trigger_example.py

3 years ago

0 Hi It Is Me Again, This Time Trying To Upload A Single File As Dataset But Met With The Following Error. The File Is 13.42Gb And Of Apache Arrow Format. Any Idea How To Solve This Error Please? Thank You.

total size 5.34 GB, 1 chunked stored (average size 5.34 GB)

PanickyAnt52 The issue itself is that the Dataset will not break up files (it will package a large folder into multiple zip files, but it will not split a single file).
The upload itself is limited by the HTTP interface (i.e. 2GB file size limit).
I would just encode it into multiple Arrow files.
Does that make sense?
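One way to plan such a split, sketched with the standard library only: compute contiguous row ranges so each output file stays within a size budget. The function name, the per-row size estimate, and the 1.5 GiB default (a margin below the 2GB limit) are assumptions for illustration; writing each range out as its own Arrow file is left to whatever Arrow library you already use.

```python
def plan_parts(total_rows, bytes_per_row, max_part_bytes=1536 * 1024**2):
    """Split total_rows into (start, stop) row ranges of at most max_part_bytes each.

    bytes_per_row is a caller-supplied estimate (e.g. file size / row count).
    """
    if bytes_per_row <= 0:
        raise ValueError("bytes_per_row must be positive")
    if bytes_per_row > max_part_bytes:
        raise ValueError("a single row exceeds the size budget")
    # Largest whole number of rows that fits in one part
    rows_per_part = max_part_bytes // bytes_per_row
    return [
        (start, min(start + rows_per_part, total_rows))
        for start in range(0, total_rows, rows_per_part)
    ]


# Tiny illustration: 10 rows of 1 byte each, 4 bytes allowed per part
print(plan_parts(10, 1, max_part_bytes=4))
```

Each range can then be written to its own file and the whole folder added to the Dataset, so no single upload exceeds the limit.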

3 years ago

0 Any Pointers On Running Gpu Tasks With K8S Glue?

Does that work?

4 years ago

0 Hi, I Am Having Difficulties When Using The Dataset Functionality. I Am Trying To Create A Dataset With The Following Simple Code:

Found it
GiganticTurtle0 you are 🧨 ! thank you for stumbling across this one as well.
Fix will be pushed later today 🙂

4 years ago

0 Let's Say That I Specify The

Hi GiganticTurtle0
You should actually get "file://home/user/local_storage_path", with the "file://" prefix.
We always store the file:// prefix to note that this is a local path.

4 years ago

0 Hey, We Were Trying To Run An Experiment On Clearml Using Its Python-Sdk. When I Run An Experiment Using

p.s. StraightCoral86 I might be missing something here, please feel free to describe the entire execution scenario and what you are trying to achieve 🙂

4 years ago

0 Hi, What Is The Right Way Of Syncing A Dataset? Whenever I Add New Archives And Try To Upload I Get:

Correct 🙂

4 years ago

0 I Have A Bunch Of Python Modules With Clearml Tasks. They Are Using 3Rd-Party Libraries But No Module Uses Code From Another Module. When I Run Such A Task Remotely - Then Clearml Deduces The Dependencies From Imports, Which Works Fine. Now I Decided To T

but we run everything in docker containers. Will it still help?

As long as you are running with clearml-agent (in docker mode), all the cache folders (this one included) are mounted on the host machine for persistence.

3 years ago

0 Anyone Here With Any Idea Why My Service Tasks Get Aborted When Going To Sleep?

Hmm okay let me check that, I think I understand the issue

2 years ago

0 Hello! I Have A Quick Question About The Clearml Hyperparameter Optimizations Module. Is It Possible To Use It Without Using The Clearml Agent System? In Other Words, Launch A Script From A Few Machines Manually But The Hyperparameters Are Given From Cle

if so is there any doc/examples about this?

Good point, passing to docs 🙂
https://github.com/allegroai/clearml/blob/51af6e833ddc5a8ba1efaaf75980f58616b25e85/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py#L123
I mean it is mentioned, but we should highlight it better

3 years ago

0 Hi, I Want To Run A Script Remotely On My Agent, But For It To Work I Need It To Download To The Agent The Whole Directory The Script Is In, Is It Possible?

🤞

4 years ago

0 I Uncommented The Line

I see, so there’s no way to launch a variant of my last run (with say some config/code tweaks) via CLI, and have it re-use the cached venv?

Try:

    clearml-task ... --requirements requirements.txt

You can also clone / override args with:

    clearml-task --base-task-id <ID-of-original-task-post-agent> --args ...

See full doc: https://clear.ml/docs/latest/docs/apps/clearml_task/

3 years ago

0 Hi Again, I Was Wondering What Would Be A Good Practice With Respect To Saving Different Datasets (While Preprocessing It In Several Steps/Stages). Mainly With The Use Of Remove_Files(). Is It Ok To Delete Raw Data After Preprocessing For Example? In That

Hi CostlyElephant1
What do you mean by "delete raw data"? Data is always fetched to cached folders and clearml takes care of cache cleanup
That said, notice that the mutable copy is written to a target folder you specify; in this case you should definitely delete it after usage. WDYT?

2 years ago

0 Is It Possible To Give The Agent Access To Install Private Pip Packages (Needs To Be Installed From The Repo)?

This means that in your "Installed packages" you should see the line:
Notice that this is not a pypi artifactory (i.e. a server to add to the extra index url for pip), this is a direct pip install from a git repository, hence it should be listed in the "installed packages".
If this is the way the package was installed locally, you should have had this line in the installed packages.
The clearml agent should take care of the authentication for you (specifically here, it should do nothing).
If ...

4 years ago

0 Hello, I Would Like To Optimize Hparams Saved In Configuration Objects. I Used Hydra And Omegaconf For Hparams Definition (See Img). How Should I Define The Name Of Hparam In

Hi CurvedHedgehog15

I would like to optimize hparams saved in Configuration objects.

Yes, this is a tough one.
Basically the easiest way to optimize is with hyperparameter sections as they are basically key/value you can control from the outside (see the HPO process)
Configuration objects are, well, blobs of data that "someone" can parse. There is no real restriction on them, since there are many standards for storing them (YAML, JSON, INI, dot notation, etc.)
The quickest way is to add...

3 years ago

0 Hi, I Have A Question About

I think it would be nicer if the CLI had a subcommand to show the content of ~/.clearml_data.json.

Actually, it only stores the last dataset ID at the moment, so not much 🙂
But maybe we should have a command line option that just outputs the current dataset ID, which would make it easier to grab and pipe.
WDYT?

4 years ago

0 Is There A Way To Control How Many Parallel Connections Are Used When Downloading From

Hi ShakyJellyfish91

It seems clearml is using a single connection, which takes a long time to download

Hmm, I found this one:
https://github.com/allegroai/clearml/blob/1cb5dbb276026644ae20fef63d58256cdc887818/clearml/storage/helper.py#L1763

Does max_connections=10 mean 10 concurrent connections?

4 years ago

0 Hello, My Dl Workflow Includes Post-Training Quantization. Is There A Way To Implement These Procedures In Clearml?

However, SNPE performs quantization with precompiled CLI binary instead of python library (which also needs to be installed). What would be the pipeline in this case?

I would imagine a container with a preinstalled SNPE compiler / quantizer, and a Python script triggering the process?

one more question: in case of triggering the quantization process, will it be considered as separate task?

I think this makes sense, since you probably want a container with the SNPE environment, m...

3 years ago
