AgitatedDove14

48 Questions, 8049 Answers

Active since 10 January 2023

Last activity 6 months ago

Reputation

Badges 1

25 × Eureka!

Answers 8049

0 Hi!

should work as well 🙂

3 years ago

0 ..

@<1539780284646428672:profile|PoisedElephant79> you could turn off certificate verification.

one year ago

0 Question About Pipelines - So The Default For Pipeline Tasks That Are Executed Remotely Is To Execute On The

It's relatively new, and it is great: from the usage aspect it is exactly like user/pass, only the pass is the PAT. It really makes life easier.

2 years ago

0 Hey All. I Need Some Help Debugging Some Errors. I Keep Getting An Error About Failing To Clone The Repository On The Remote Instance. What Could Be The Reason Of This? Are There Any Common Errors Related To This? I Suspect Permissions, But Not Entirely

Hi @<1687643893996195840:profile|RoundCat60> , I just saw the message,

Just by chance I set the SSH deploy keys to write access and now we're able to clone the repo. Why would the SSH key need write access to the repo to be able to clone?

Let me explain: the default use case for the agent is to use user/pass (as configured in the clearml.conf file).
It will change any SSH links to HTTPS links and add the credentials to clone the repository.
You can also provide SSH keys (basicall...
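To make the SSH-to-HTTPS rewrite described above concrete, here is a minimal sketch. It is not the agent's actual code: the function name `ssh_to_https` and the regular expression are assumptions for illustration, and it only handles scp-style links such as `git@host:org/repo.git`.

```python
import re
from urllib.parse import quote

# Matches scp-style SSH remotes such as git@github.com:org/repo.git
_SCP_STYLE = re.compile(r"^git@(?P<host>[^:/]+):(?P<path>.+)$")


def ssh_to_https(url: str, user: str, password: str) -> str:
    """Rewrite an scp-style SSH git URL as an HTTPS URL carrying credentials.

    Illustration only (hypothetical helper, not the agent's implementation).
    """
    match = _SCP_STYLE.match(url)
    if match is None:
        # Not an scp-style SSH link: leave it untouched.
        return url
    # Percent-encode the credentials so characters like '@' stay valid in a URL.
    creds = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"https://{creds}@{match['host']}/{match['path']}"
```

With user/pass (or user/PAT) configured, a link like `git@github.com:org/repo.git` would then be cloned over HTTPS, which is why no SSH key is needed in that default mode.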

3 years ago

0 I'M Getting This When Running With Keras Framework. Clearml.Storage - Error - Failed Uploading: [Errno 21] Is A Directory: 'Model.Savedmodel'.

It reflects what is stored by Keras, so if Keras stores the best model, this is what you get. BTW, if you pass output_uri=True, it will automatically upload the models.

3 years ago

0 Hi Friends. I Need To Authenticate To Hugging Face To Download A Private Dataset (As Shown Here:

Should be under Profile -> Workspace (Configuration Vault)

2 years ago

0 Different Question. How Can I Pass Pythonpath Env Variable To A Task, Run By Agent (So Python Can Find Classes Inside M Subdirectories)?

Happy to hear 🙂

2 years ago

0 Apart From Having Packages In Requirements.Txt, Does Clearml Expect Them To Be Actuall Installed To Add Them As Installed Packages For A Task?

Is it not possible to say just look at my requirements.txt file and the imports in the script?

I think there is a GitHub Issue for this feature
(basically the issue is that requirements.txt files are very often not updated and have no real version lock, so replicating a working env is always safer)
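As a rough illustration of the "no real version lock" point, the sketch below lists the entries of a requirements file that are not pinned with `==`. The helper name `unpinned_requirements` is made up for this example and the parsing is deliberately simplistic (it does not handle every requirements.txt syntax).

```python
def unpinned_requirements(lines):
    """Return requirement entries that lack an exact '==' version pin.

    Hypothetical helper for illustration; not part of ClearML.
    """
    unpinned = []
    for raw in lines:
        # Drop inline comments and surrounding whitespace.
        entry = raw.split("#", 1)[0].strip()
        if not entry or entry.startswith("-"):
            # Skip blank lines and pip options such as "-r other.txt".
            continue
        if "==" not in entry:
            unpinned.append(entry)
    return unpinned
```

Any entry this returns can resolve to a different version on another machine, which is the reason the actually installed packages are the safer thing to record.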

3 years ago

0 Hi, Is There A Way To Stop A Clearml-Agent From Within An Experiment? Or Block It To Prevent It Running Any Other Task?

Hi, Is there a way to stop a clearml-agent from within an experiment?

It is possible but only in the paid tier (it needs backend support for that) 😞

My use case is: in a spot instance marked for termination after 2 mins by AWS

Basically what you are saying is you want the instance to spin down after the job is completed, correct?

3 years ago

0 Hey All -- I'M Fairly New To This But, As Of Today, My Required Packages Aren'T Being Recognized In Cloned Runs And They Are Repeatedly Failing. Has Anyone Had Similar Issues/Found A Fix?

Turn it off and on ?! 😊

one year ago

0 Hi, I Was Trying To Test The Autoscaler Feature, But I Am Getting The Following Error:

Can you share the log?

one year ago

0 Qq: I'M Trying To Run The

Hi SpotlessWorm70

OMP: Error #15: Initializing libiomp5.dylib, but found libomp.dylib already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program.

This seems like an OpenMP issue.
I would assume something is off with the local environment (not really connected to clearml but to one of the frameworks, for example TF, Keras, etc.)

3 years ago

0 Hi, What Happens Exactly When I Execute The Following Command:

NVIDIA_VISIBLE_DEVICES=0,1
Basically it is used "as is" and the Nvidia drivers do the rest.
Same goes for all or 0-3 etc.

4 years ago

0 Hi, How Could I Know That "Task.Init" Find My "Clearml.Conf" File? I Executed

Hi PerplexedWalrus3
you should get something like the following on the console :
ClearML Task: created new task id=1ca59ef1f86d44bd81cb517d529d9e5a 2021-07-25 13:59:09 ClearML results page: 2021-07-25 13:59:16

3 years ago

0 Why Am I Getting A 403 From File Server When The K8 Glue Agent Is Initializing ?

BoredHedgehog47 can you provide some logs, this is odd..

2 years ago

0 So, Here'S A Question. Does Clearml Automatically Save Everything Necessary To Continue Training A Pytorch Language Model? Specifically, I'Ve Been Looking At The Checkpoint Folders Created When I'M Training A Huggingface Robertaformaskedlm. I Checked What

Hmm pseudo stack:
https://github.com/huggingface/transformers/blob/040283170cd559b59b8eb37fe9fe8e99ff7edcbc/src/transformers/trainer_tf.py#L779

https://github.com/huggingface/transformers/blob/040283170cd559b59b8eb37fe9fe8e99ff7edcbc/src/transformers/feature_extraction_utils.py#L285

https://github.com/huggingface/transformers/blob/040283170cd559b59b8eb37fe9fe8e99ff7edcbc/src/transformers/feature_extraction_utils.py#L470

3 years ago

0 Hi All, I'M Using Clearml And Pytorch-Lightning. I Was Able To Train My Models Successfully As Long As I Was Using A Single Gpu. When I Used Two Gpus For Training My Models I Got The Following Error:

I think the ClearmlLogger is kind of deprecated ...
Basically all you need is Task.init at the beginning; the default tensorboard logger will be caught by clearml

2 years ago

0 Hi! I Have A Gpu Workstation At The Office (No Public Ip) With Latest Clearml-Agent Installed. When I Was In The Same Network - I Was Able To Use Clearml-Session From My Laptop. Now I Work From Home, And Clearml-Session Fails With

ngrok to connect to the remote server at the office?
That makes sense, I guess this is the equivalent of using a VPN, from that point onward clearml-session can directly access the remote machine, right?

3 years ago

0 Ok, Nvmd. As Soon As I Spend All The Time To Write The Above Message, I Figured It Out. In Case You Are Curious:

Ohh sorry I missed that and answered on the original message, nvm 🙂 all is well now

3 years ago

0 Hello, Everyone! I Have A Question Regarding Clearml Features. We Run Into The Situation When Some Of The Agents That Are Working On A Hpo Die Due To Variable Reasons. Some Workers Go Offline Or Resources Need Temporarily Be Detached For Other Needs. Thu

should reload the reported scalars

Exactly (notice it also understands when the last report of scalars was, so it should automatically increase the iterations, i.e. you will not accidentally overwrite previously reported scalars)

and the task needs to reload last checkpoints only, right?

Correct 🙂

We didn't figure out the best way of continuing for both the grid and optuna. Can you suggest something?

That is a good point, not sure if we have a GH issue for that, but wo...
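The "automatically increase the iterations" behaviour mentioned above can be pictured with a small sketch. This is only an assumption about the idea, not ClearML's implementation: it supposes the resumed run counts its own iterations from 0 and that the last reported iteration is already known.

```python
def continue_iterations(last_reported_iteration, local_iterations):
    """Shift the iteration numbers of a resumed run so that they start
    right after the last iteration that was already reported.

    Hypothetical helper for illustration only.
    """
    offset = last_reported_iteration + 1
    return [offset + i for i in local_iterations]
```

For example, if scalars were last reported at iteration 99, the first three iterations of the resumed run would be reported as 100, 101 and 102 rather than overwriting 0, 1 and 2.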

2 years ago

0 How Do We Configure S3 Bucket Credentials When Working With The Autoscaling Service?

Sure

3 years ago

0 When Using Something Like Pdf2Image Which Requires Poppler (Which Can Be Installed With Conda), How Can I Ensure That The Task Can Run On An Agent Correctly? As Of Now It Doesn’T Know About Poppler

Hi JealousParrot68
spinning the clearml-agent with docker support (i.e. each experiment is running inside its own container):
https://clear.ml/docs/latest/docs/clearml_agent#docker-mode
Basically you can specify a default docker to use (per agent) and a specific docker container to use per Task (configured in the UI under execution at the bottom)

3 years ago

0 Hi All! I Noticed When A Pipeline Fails, All Its Components Continue Running. Wouldn'T It Make More Sense For The Pipeline To Send An Abort Signal To All Tasks That Depend On The Pipeline? I'M Using Clearml V1.1.3Rc0 And Clearml-Agent 1.1.0

Or maybe you could bundle some parameters that belongs to PipelineDecorator.component into high-level configuration variable (something like PipelineDecorator.global_config (?))

So in the PipelineController we have a per step callback and generic callbacks (i.e. for all the steps), is this what you are referring to ?

Well, I can see the difference here. Using the new pipelines generation the user has the flexibility to play with the returned values of each step.

Yep 🙂

We...

2 years ago

0 Hi, I Would Like To Configure Clearml-Server To Connect To An S3 Bucket In Order To Store Artefacts - I'Ve Taken A Look On This Page

So this is the only place we need to change to support it, do you feel like messing around with it and adding IAM roles ?

3 years ago

0 I Am Trying To Use

Same machine running the trains-init ?

3 years ago

0 Hey Guys, I Am Trying To Plan What I Need To Do In Order To Efficiently Use Clearml With Spot Instances 1) Detecting When Spot Instance Is Down And Experiment Is Aborted 2) Extracting S3 Address Of The Latest Checkpoint From Clearml Api 3) Starting New E

Very Cool!
BTW guys, are you using the task.models[] to continue from the last checkpoint? or is it task.artifacts[] ?

3 years ago

0 Hi,

Hi FloppyDeer99

What is the meaning of no real scheduling

I think the meaning is that from the moment a k8s job is created, k8s is in charge of actually spinning the container. Since k8s has no real priority/order, the scheduling order is not guaranteed from this point.

The idea of the clearml-k8s-glue is that the glue will launch a job on the k8s cluster only if it is sure there are enough resources to actually spin the job now (as opposed to, sometime in the future), this mea...

3 years ago

0 Hi, I Have A Script Running Cross Validation, Basically It Calls 5 Times (5 Folds) Another Script That Does A Training And Evaluation. Is It Possible In Clearml To Have A Main Task (The Complete Cross Validation) And Subtasks (One For Each Fold)?

GreasyPenguin14

Is it possible in ClearML to have a main task (the complete cross validation) and subtasks (one for each fold)?

Do you mean to see it as nested in the UI, or auto-logged by the code?

3 years ago

0 I Would Like To Understand The Limitations Of

current task fetches the good Task

Assuming you fork the process, then the "global instance" is passed to the subprocess. Assuming the sub-process was spawned (e.g. Popen), then an environment variable with the Task's unique ID is passed. Then, when you call "Task.current_task", it "knows" the Task was already created, and it will fetch the state from the clearml-server and create a new Task object for you to work with.
BTW: please use the latest RC (we fixed an issue with exactly this...
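The two cases above (fork vs. spawn) can be sketched roughly as follows. Everything here is a stand-in for illustration: `ExampleTask`, `init_task`, `current_task` and the variable name `EXAMPLE_TASK_ID` are hypothetical and not the ClearML API, and the "fetch the state from the server" step is reduced to rebuilding an object from the ID.

```python
import os

# Hypothetical environment variable name, used for illustration only.
TASK_ID_ENV = "EXAMPLE_TASK_ID"

# Global instance; a forked child inherits this object as-is.
_global_task = None


class ExampleTask:
    """Minimal stand-in for a task object (not the ClearML Task class)."""

    def __init__(self, task_id):
        self.task_id = task_id


def init_task(task_id):
    """Create the task in the main process and expose its ID to children."""
    global _global_task
    _global_task = ExampleTask(task_id)
    # Spawned sub-processes inherit the environment, so they can find the ID.
    os.environ[TASK_ID_ENV] = task_id
    return _global_task


def current_task():
    """Return the task for this process, if one is known."""
    if _global_task is not None:
        # Fork case: the global instance was copied into this process.
        return _global_task
    task_id = os.environ.get(TASK_ID_ENV)
    if task_id is None:
        return None
    # Spawn case: only the ID is available, so build a new object from it
    # (a real system would fetch the task state from the server here).
    return ExampleTask(task_id)
```

The design point is that the spawned process never receives the Python object itself, only the ID, which is why a fresh object has to be created on the other side.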

3 years ago

0 Hi Everyone, I Have Questions Related To Clearml-Serving.

wdyt?
https://clear.ml/docs/latest/docs/clearml_serving/clearml_serving
https://clear.ml/docs/latest/docs/clearml_serving/clearml_serving_cli#metrics
https://clear.ml/docs/latest/docs/clearml_serving/clearml_serving_tutorial#model-monitoring-and-performance-metrics

2 years ago
