AgitatedDove14

48 Questions, 8049 Answers

Active since 10 January 2023

Last activity 6 months ago

Reputation

Badges 1

25 × Eureka!

Answers 8049

0 Hello, Is There A Way To Update A Task Diff Programatically? Eg, I'M Creating A Task Using

ShakyJellyfish91 what exactly are you passing to Task.create?
Could it be you are only passing script= and leaving repo= None ?

3 years ago

0 Hello, Is There A Way To Update A Task Diff Programatically? Eg, I'M Creating A Task Using

Thanks ShakyJellyfish91 ! please let me know what you come up with, I would love for us to fix this issue.

3 years ago

0 Hey All. Quick Question About The

ClumsyElephant70
Could it be virtualenv package is not installed on the host machine ?
(From the log it seems you are running in venv mode, is that correct?)

3 years ago

0 Hey All. Quick Question About The

Can you send the full log ?

3 years ago

0 Hi! I Am Currently Using Clearml (With Remote Execution), To Train An Object Detection Model With

Just verifying the Pod does get allocated 2 gpus, correct ?
What do you have under the "script path" in the Task?

3 years ago

0 Hey All. Quick Question About The

okay this seems like a broken pip install python3.6
Can you verify it fails on another folder (maybe it's a permissions thing, for example if you run in docker mode, then the permissions will be root, as the docker is creating those folders)

3 years ago

0 Hi All, I'M Trying To Deploy Trains On Rancher (Nice Kubernetes Cluster Orchestration Project) Where I'M Quite New To Rancher And Kubernetes. I Have Been Able To Install Trains Using Helm

WickedGoat98
The webUI will look like the demo server 🙂 https://demoapp.trains.allegro.ai/
2. curl http://server-ip:8008 should return something like:
{"meta":{"id":"78a9dc77081348e2930d1f429fd7e092","trx":"78a9dc77081348e2930d1f429fd7e092","endpoint":{"name":"","requested_version":1.0,"actual_version":null},"result_code":400,"result_subcode":0,"result_msg":"Invalid request path /","error_stack":null},"data":{}}
3. curl http://server-ip:8080 should return something like:
` <!d...

3 years ago

0 Hello, Is There A Way To Update A Task Diff Programatically? Eg, I'M Creating A Task Using

pip install clearml==1.0.6rc2
Did not work?!

3 years ago

0 Hi All, I Am Starting To Use Clearml-Agent. Run It With

👍

3 years ago

0 Hey All. Quick Question About The

ClumsyElephant70
Can you manually run the same command ?
['python3.6', '-m', 'virtualenv', '/home/user/.clearml/venvs-builds/3.6']
Basically:
python3.6 -m virtualenv /home/user/.clearml/venvs-builds/3.6

3 years ago

0 Hello Everyone! The Question About Dataset.Squash(). The Squash Operation Copies All The Data And Is No Longer Linked To Previous Commits? I Thought This Operation Is Like Git Squash But It Seems To Me That Clearml Dataset.Squash() Create Just A Copy Of S

The Squash operation copies all the data and is no longer linked to previous commits?

Yes, basically the idea is that if you have a data version that relies on many parents that need to be merged, the squash will create a merged copy and push it all as a single version, and then yes, the parent versions are no longer needed

I thought this operation is like git squash but it seems to me

yeah... we did not want to actually delete the parents because unlike git, the operation is done ...

7 months ago

0 Dear Clearml Community, I Am Trying To Optimize Storage On My Clearml File Server When Doing A Lot Of Experiments. To Achieve This, I Already Upload Only The Newest And Best Checkpoints To Clearml File Server Instead Of All Checkpoints. Another Component

Notice there is a scroll_id there; you might need to call the API multiple times until you have scrolled over all the events.
Could that be it?

7 months ago

Notice that you need to pass the returned scroll_id to the next call

scroll_id = response["scroll_id"]
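
A minimal sketch of the scroll loop described above. The `fetch_page` callable, the `"events"` key, and the "empty page means done" stop condition are assumptions made for illustration; only the `scroll_id` field comes from the answer itself.

```python
from typing import Callable, Dict, List, Optional


def fetch_all_events(fetch_page: Callable[[Optional[str]], Dict]) -> List[dict]:
    """Call fetch_page repeatedly, passing the returned scroll_id to the next call.

    fetch_page is a hypothetical wrapper around whichever events API call you use.
    It is assumed to return a dict with "events" (a list) and "scroll_id".
    """
    events: List[dict] = []
    scroll_id: Optional[str] = None  # the first call has no scroll_id yet
    while True:
        response = fetch_page(scroll_id)
        page = response.get("events", [])
        if not page:  # assumption: an empty page means everything was scrolled
            break
        events.extend(page)
        scroll_id = response["scroll_id"]  # pass the returned scroll_id onward
    return events


def fake_fetch_page(scroll_id: Optional[str]) -> Dict:
    """Stand-in for a real API call, serving three fixed pages."""
    pages = {
        None: ([{"msg": "a"}, {"msg": "b"}], "s1"),
        "s1": ([{"msg": "c"}], "s2"),
        "s2": ([], "s2"),
    }
    page, next_id = pages[scroll_id]
    return {"events": page, "scroll_id": next_id}


all_events = fetch_all_events(fake_fetch_page)
```

Swapping `fake_fetch_page` for a function that performs the real request keeps the loop unchanged.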

7 months ago

0 Any Idea Why I Get This Error In All My Agents

Seems like settings on the clearml-server disappeared (specifically default queue tag?!)

3 years ago

0 Hi All! I Am A Bit Confused As To How The Python Environment Is Set. I Can Submit Jobs That Build The Environment And Run Perfectly Fine. But, If I Abort The Job -> Requeue It From The Gui, Then A Different Environment Is Installed (Which Has Some Package

the first runs perfectly fine,

Just making sure, running in an agent?

the second crashes

Running inside the same container as the first one ?

7 months ago

0 Hi, I Am Wondering Why Do I Need To Create Files Before Applying Diff ?

DefeatedOstrich93 what do you mean by "I am wondering why do I need to create files before applying diff ?"
git diff will not list files unless they are added (until then they are marked as "untracked"); think temp files, logs, etc. Until you add a file to git, it will basically ignore that file. Make sense?

3 years ago

0 Hi, I Want To Pass Environment Variables From The Host To The Docker Containers Running My Task. I Managed To Use

but is there any other way to get env vars / any value or secret from the host to the docker of a task?

If this is docker, passing -e/--env as an argument would do the same:
-e VAR=somevalue
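
As a rough illustration, the `-e VAR=value` arguments can be generated from a mapping of variables. The helper name `docker_env_args` is hypothetical and not part of ClearML or docker; the resulting string is simply what you would supply as extra docker arguments.

```python
import shlex
from typing import Mapping


def docker_env_args(env: Mapping[str, str]) -> str:
    """Build a docker argument string with one -e NAME=value per variable.

    Hypothetical helper: values are shell-quoted so spaces survive intact.
    """
    parts = []
    for name, value in env.items():
        parts.append("-e " + shlex.quote(f"{name}={value}"))
    return " ".join(parts)


args = docker_env_args({"VAR": "somevalue"})
```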

3 years ago

0 Any Idea Why I Get This Error In All My Agents

in the docker-compose file. Still strange...

hmm yes it is... If you have an idea on what went wrong let me know, we would love to fix it

3 years ago

0 Hello Everyone! I Have A Problem With Clearml. Could You Please Help Me? I Have 2 Little Projects With Total 31 Experiments. And Its 837Mb Metric Stored. Where Can I Find A Detail Information About This Memory Quota Spending? I Really Don'T Understand, Wh

Oh I see, yes the "metrics" include both scalars / plots & console outputs,
I also think they are updated only once a day (or maybe twice a day?), so even if you delete them it will take time to update
(archive is not delete, you then need to go to the archived view and delete it from there)

7 months ago

0 Any Idea Why I Get This Error In All My Agents

How are you spinning the agents ?

3 years ago

0 Any Idea Why I Get This Error In All My Agents

It seems like you are correct, everything should just work. Are you still getting the error? What's the clearml agent version?

3 years ago

0 Hi, I Want To Pass Environment Variables From The Host To The Docker Containers Running My Task. I Managed To Use

but this would be still part of the clearml.conf right?

You can pass it per Task. You can also configure the agent to always pass it by adding this env:
https://github.com/allegroai/clearml-agent/blob/5a080798cb4292e198948fbe16cba70136cb6bdf/docs/clearml.conf#L137

3 years ago

0 Hey Again, So I Asked About Archiving A

Sure SharpDove45 ,
from clearml import Model
model = Model('model_id_aabbcc')
model.system_tags += ['archived']

3 years ago

0 Hi, I Am Using

Hi ObedientTurkey46
Use --services-mode in the agent; it will run many Tasks on the same machine. This is usually associated with the services queue, but it can be run on any queue. This way you could have the same machine easily running those multiple "control" tasks.
wdyt?

4 months ago

0 Hi, I'M Trying To Use

SoggyBeetle95 maybe it makes sense to configure the agent with an access-all credentials? Wdyt

2 years ago

0 Hello Everyone, I’M Newcomer For Clearml. I Have Question Related To

Hi MortifiedCrow63

saw file:///var/folders/cj/SOME_RANDOM_ID/T/tf_ckpts/ckpt-1, ...

By default ClearML will only log the exact local path where you stored the file; I assume this is it.
If you pass output_uri=True to Task.init, it will automatically upload the model to the files_server, and the model repository will then point to the files_server (you can also use any object storage as model storage, e.g. output_uri=s3://bucket )
Notice yo...
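
The output_uri option mentioned above can be sketched as follows. The wrapper function and the project/task names are placeholders for illustration; `Task.init(..., output_uri=True)` is the call the answer refers to.

```python
def init_task_with_model_upload(project_name, task_name, output_uri=True):
    """Start a ClearML task whose models are uploaded instead of only logged locally.

    output_uri=True targets the files_server; a string such as "s3://bucket"
    would target object storage instead (both options come from the answer above).
    """
    # Imported lazily so this sketch can be loaded without clearml installed.
    from clearml import Task

    return Task.init(
        project_name=project_name,
        task_name=task_name,
        output_uri=output_uri,
    )


# Example call (placeholder names, needs a configured ClearML server):
# task = init_task_with_model_upload("examples", "tf checkpoint upload")
```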

3 years ago

0 Hello, Is There A Way To Update A Task Diff Programatically? Eg, I'M Creating A Task Using

it seems it's following the path of the script i'm using to task.create, eg:

The folder it should run in is the script path you are passing (i.e. "script=ep_fn," )
A wrong path would imply that it is not finding the correct repository, is that the case ?

3 years ago

0 Hey All. Quick Question About The

TenseOstrich47 it's based on a free "index", so the first index not in use will be captured, but if you remove agents, then the order will change (e.g. you take down worker #1, the next worker you spin will be #1 because it is not taken)
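
A small sketch of the "first free index" idea described above. This is an illustration only, not the agent's actual code; the function name and the starting index are assumptions.

```python
from typing import Iterable


def first_free_index(used: Iterable[int], start: int = 0) -> int:
    """Return the lowest index >= start that is not currently taken."""
    taken = set(used)
    index = start
    while index in taken:
        index += 1
    return index


# Workers 0, 1 and 2 are running, so a new worker gets index 3.
next_worker = first_free_index({0, 1, 2})

# Worker 1 was taken down, so the next worker reuses index 1.
reused_worker = first_free_index({0, 2})
```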

3 years ago

0 Hi! I Am Currently Using Clearml (With Remote Execution), To Train An Object Detection Model With

NonchalantDeer14
I think the issue is the way it spins the subprocess is not with fork but with Popen, so clearml is not "loaded" into the subprocess hence no logging.
The easiest fix is to call Task.current_task() inside the actual code (somewhere when it starts), it should trigger clearml.

3 years ago

0 Hey All. Quick Question About The

ClumsyElephant70 the odd thing is the error here:
docker: Error response from daemon: manifest for nvidia/cuda:latest not found: manifest unknown: manifest unknown.
I would imagine it will be with "nvidia/cuda:11.3.0-cudnn8-runtime-ubuntu18.04" but the error is saying "nvidia/cuda:latest"
How could that be ?
Also can you manually run the same command (i.e. docker run --gpus device=0 --rm -it nvidia/cuda:11.3.0-cudnn8-runtime-ubuntu18.04 bash )?

3 years ago
