This is very odd...
LittleShrimp86 is this example working for you?
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_tasks.py
I'm helping train my friend on ClearML to assist with his astrophysics research.
If that's the case, what you can do is use the agent inside your sbatch script (it's fully open source). The sbatch script then essentially becomes:
clearml-agent execute --id <task_id_here>
This will set up the environment and monitor the job, while still allowing you to launch it from SLURM. Wdyt?
Hi ScaryKoala63
Sure, add the following to your clearml.conf:
sdk.storage.cache.default_cache_manager_size = 400
I think you are correct, it seems like for some reason you hit the cache limit, and a previous entry was deleted
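For reference, in the clearml.conf HOCON layout the same setting sits under the nested sdk section (structure written from memory, so double-check against your own file):
```
sdk {
    storage {
        cache {
            # number of cached entries kept before older ones are evicted
            default_cache_manager_size: 400
        }
    }
}
```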
clearml-agent deployment file
What do you mean by that? Is that the Helm chart of the agent?
It would be clearml-server's job to distribute them to each user internally?
So you mean the user will never know their own S3 access credentials?
Are those credentials unique per user, or "hidden" once for all of them?
I can't seem to figure out what the names should be from the pytorch example - where did INPUT__0 come from
This is actually the layer name in the model:
https://github.com/allegroai/clearml-serving/blob/4b52103636bc7430d4a6666ee85fd126fcb49e2e/examples/pytorch/train_pytorch_mnist.py#L24
Which is just the default name PyTorch gives the layer
https://discuss.pytorch.org/t/how-to-get-layer-names-in-a-network/134238
It appears I need to convert it into TorchScript?
Yes, this ...
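If it helps, the conversion is usually just a trace-and-save; a minimal sketch, assuming the Net class and checkpoint name from the MNIST example above (both are assumptions about your setup):
```python
import torch
from train_pytorch_mnist import Net  # assumed: the Net class defined in the example script

model = Net()
model.load_state_dict(torch.load("mnist_cnn.pt", map_location="cpu"))  # assumed checkpoint name
model.eval()

# Trace with a dummy MNIST-shaped input to produce a TorchScript module
example_input = torch.zeros(1, 1, 28, 28)
scripted = torch.jit.trace(model, example_input)
scripted.save("model_torchscript.pt")
```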
ReassuredTiger98 that is a good point. At the moment they are designed as "machine level" configs, but we do have built-in support to allow multiple configurations. The technical issue is that we have to read the configuration file before we initialize the Task object, which means we are still not aware of the git root (which I assume is where we could put a configuration file).
BTW: regarding the detect_with_conda_freeze
We hope that this flag is rarely used, as ClearML should auto-detect t...
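For reference, the flag lives under the sdk.development section of clearml.conf; roughly (written from memory, so verify the exact key against the configuration reference):
```
sdk {
    development {
        # when true, resolve packages with a conda freeze instead of pip-style detection
        detect_with_conda_freeze: false
    }
}
```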
the time taken to upload halved. It is puzzling because as you say it's not that much to upload.
Maybe it was the load on the server? meaning dealing with multiple requests at the same time delayed the requests?!
For now I've whittled down the number of entries to a more select but useful few and that has solved the issue. If it crops up again I will try connect_configuration properly.
Thanks for your help!
My pleasure 🙂
Hi AverageBee39
Did you set up an agent to execute the actual Tasks?
WackyRabbit7 I might be missing something here, but the pipeline itself should be launched on the "pipelines" queue. Is the pipeline itself running, or is it the step itself that is stuck in the "queued" state?
Not yet 😞
It should not be complex to implement.
The actual AWS auto-scaler class implements just two functions:
def spin_up_worker(self, resource, worker_id_prefix, queue_name):
https://github.com/allegroai/clearml/blob/e9f8fc949db7f82b6a6f1c1ca64f94347196f4c0/clearml/automation/auto_scaler.py#L104
def spin_down_worker(self, instance_id):
https://github.com/allegroai/clearml/blob/e9f8fc949db7f82b6a6f1c1ca64f94347196f4c0/clearml/automation/auto_scaler.py#L...
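So a custom backend could look roughly like this, assuming the base class and the two signatures quoted above (the cloud calls are hypothetical placeholders, not the actual AWS implementation):
```python
from clearml.automation.auto_scaler import AutoScaler  # assumed import path, per the links above


class MyCloudAutoScaler(AutoScaler):
    def spin_up_worker(self, resource, worker_id_prefix, queue_name):
        # Launch an instance of the requested resource type and start a
        # clearml-agent on it that listens on `queue_name`.
        my_cloud_launch_instance(  # hypothetical helper for your cloud provider
            instance_type=resource,
            name_prefix=worker_id_prefix,
            startup_script=f"clearml-agent daemon --queue {queue_name}",
        )

    def spin_down_worker(self, instance_id):
        # Terminate the cloud instance backing this worker
        my_cloud_terminate_instance(instance_id)  # hypothetical helper
```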
It will also allow you to pass them to Hydra (either as overrides, or by directly editing the entire Hydra config).
Hmm, you either need to run with sudo or make sure the running user has docker run permissions.
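On most Linux setups the usual way to grant that is adding the user to the docker group, e.g.:
```bash
# add the current user to the docker group, then log out and back in (or run `newgrp docker`)
sudo usermod -aG docker $USER
```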
Okay, let me check it, but I suspect the issue is running over SSH; to overcome these issues with PyCharm we have a specific plugin to pass the git info to the remote machine. Let me check what we can do here.
FiercePenguin76 BTW, you can do the following to add / update packages on the remote session:
clearml-session --packages "newpackage>x.y" "jupyterlab>6"
I use YAML configs for data and model. Each of them would be a nested YAML (could be more than 2 layers), so it won't be a flexible solution and I would need to manually flatten the dictionary.
Yes, you are correct, the recommended option would be to store it with task.connect_configuration
Its goal is to store these types of configuration files/objects.
You can also store the YAML file itself directly, just pass a Path object instead of a dict/string.
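A minimal sketch of both options (the project/task/file names are placeholders I made up):
```python
from pathlib import Path
from clearml import Task

task = Task.init(project_name="examples", task_name="config demo")  # placeholder names

# Option 1: connect a (possibly nested) dict as-is, no manual flattening needed
model_cfg = {"backbone": {"name": "resnet50", "depth": 50}, "head": {"dropout": 0.1}}
model_cfg = task.connect_configuration(model_cfg, name="model")

# Option 2: connect the YAML file itself by passing a Path object
data_cfg_path = task.connect_configuration(Path("data.yaml"), name="data")
```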
Thanks SolidSealion72 !
Also, I found out that adding "pool.join()" after pool.close() seems to solve the issue in the minimal example.
This is interesting, I'm pretty sure it has something to do with the subprocess not "closing" properly (or too fast or something)
Let me see if I can reproduce
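For anyone following along, this is roughly the pattern being discussed (the worker function is a stand-in, not the actual reproduction script):
```python
from multiprocessing import Pool


def work(i):
    # stand-in for the real worker; in the actual repro this reports to ClearML
    return i * i


if __name__ == "__main__":
    pool = Pool(4)
    results = pool.map(work, range(8))
    pool.close()  # stop accepting new work
    pool.join()   # wait for the worker subprocesses to exit cleanly before the parent moves on
    print(results)
```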
Hi @<1664079296102141952:profile|DangerousStarfish38>
You mean spin the agent on multiple Windows machines? Yes, that is supported. I think it is limited to venv (i.e. not docker) mode, but other than that it should work out of the box.
Hi MelancholyElk85
So the way datasets now work is that they are actually an entity (folder) inside a project, all under the hidden .datasets sub-project.
This is so all data and tasks are in the same project, but at the same time will not intersect with sub-projects of the same name. Does that make sense?
Also could you explain the difference between trigger.start() and trigger.start_remotely()
Start will start the trigger process (the one "watching the changes") locally (this makes sense for debugging, etc.)
start_remotely will launch the trigger process on the "services" queue, where it should live forever 🙂
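Something along these lines, assuming the TriggerScheduler API (the task ID, project, tag and queue names are placeholders):
```python
from clearml.automation import TriggerScheduler

trigger = TriggerScheduler(pooling_frequency_minutes=3)

# re-launch a template task whenever a task in the watched project gets the tag
trigger.add_task_trigger(
    schedule_task_id="<template_task_id>",  # placeholder
    schedule_queue="default",               # placeholder queue
    trigger_project="my_project",           # placeholder project
    trigger_on_tags=["ready"],              # placeholder tag
)

# local debugging:
# trigger.start()

# production - keep it alive on the services queue:
trigger.start_remotely(queue="services")
```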
Okay so when I add trigger_on_tags, the repetition issue is resolved.
Nice!
This problem occurs when I'm scheduling a task. Copies of the task keep being put on the queue ...
Run clearml-agent and enqueue the pipeline? What am I missing?
SubstantialElk6 is this the pip to install the agent, or the pip the agent is using to install the packages for the specific experiment ?
I wonder if the try/except approach would work for the XGBoost load; could we just try a few classes one after the other?
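A rough sketch of that idea (the file name is a placeholder, and the set of classes to try is just a guess):
```python
import xgboost as xgb


def load_xgboost_model(path):
    # Try the sklearn-style wrappers one after the other, since we don't know
    # which API was used to save the file.
    for cls in (xgb.XGBClassifier, xgb.XGBRegressor):
        try:
            model = cls()
            model.load_model(path)
            return model
        except Exception:
            pass
    # Fall back to the low-level Booster API
    booster = xgb.Booster()
    booster.load_model(path)
    return booster


model = load_xgboost_model("model.json")  # placeholder file name
```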
BTW, regarding the "# in another process" part:
How do you spin the subprocess, is it with Popen?
Also, what's the OS and Python version you are using?
It does work about 50% of the times
EcstaticGoat95 what do you mean by "works about 50%"? Do you mean the other 50% of the time it hangs?
100% of things with task_overrides would be the most convenient way
I think the issue is that you have to pass the project ID, not the project name (the project's unique ID is the property that is actually stored on the Task).
@<1523707653782507520:profile|MelancholyElk85> can you check the following works:
pipe.add_task(..., task_overrides={'project': Task.get_project_id(project_name='examples')})
That said, you might have accessed the artifacts before any of them were registered