AgitatedDove14

49 Questions, 8126 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

25 × Eureka!

Answers 8126

0 Hi, I Want To Pass Environment Variables From The Host To The Docker Containers Running My Task. I Managed To Use

but is there any other way to get env vars / any value or secret from the host to the docker of a task?

If this is docker, passing -e/--env as an argument would do the same:
-e VAR=somevalue
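
A minimal sketch of the `-e` approach, assuming you want to forward a known list of host variables. The helper name `docker_env_args` is hypothetical, and where the resulting list ends up (for example, in the docker arguments your agent is configured with) depends on your setup.

```python
import os
from typing import List, Mapping, Optional


def docker_env_args(names: List[str],
                    environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Build docker `-e NAME=value` arguments for the given variable names.

    Hypothetical helper for illustration only. Names that are not set
    in the environment are skipped instead of being passed as empty.
    """
    env = os.environ if environ is None else environ
    args: List[str] = []
    for name in names:
        if name in env:
            args.extend(["-e", f"{name}={env[name]}"])
    return args
```

Calling `docker_env_args(["VAR"])` on a host where `VAR=somevalue` is set yields the same `-e VAR=somevalue` pair shown above, split into two list items.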

4 years ago

0 Hey Everyone

this

from fastai.callbacks.tensorboard import LearnerTensorboardWriter

doesn’t exist anymore in fastai2

Hmm we should definitely update the example to fastai2 API

maybe the fastai bindings in clearml package are outdated

Are you getting any scalars reported to clearml?

they also appear to be relying on the tensorboard callback which seems not to work on distributed training

Yes that is correct, usually the way it works all nodes report back to "master...

3 years ago

0 How Can I Run A New Version Of A Pipeline, Wait For It To Finish And Then Check Its Completion/Failure Status? I Want To Kick Off The Pipeline And Then Check Completion

Hi @<1523701079223570432:profile|ReassuredOwl55> let me try to add some color here:
Basically we have two parts: (1) pipeline logic, i.e. the code that drives the DAG, and (2) pipeline components, e.g. model verification.
The pipeline logic (1), i.e. the code that creates the DAG and the tasks and enqueues them, will be running in the git actions context, i.e. this is the automation code. The pipeline components themselves (2), e.g. model verification, training, etc., are running using the clearml agents...
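
For the "wait and then check completion" part of the question, here is a rough sketch of what the automation code could do once it knows the ID of the pipeline controller Task. It assumes a configured `clearml` SDK and that `Task.get_task`, `Task.wait_for_status` and `Task.get_status` behave as in recent SDK versions. The function names and the choice of which statuses count as success or failure are assumptions for illustration.

```python
from typing import Optional

# Assumed grouping of final Task status strings (not an official list)
SUCCESS_STATES = {"completed", "published"}
FAILURE_STATES = {"failed", "stopped"}


def pipeline_succeeded(status: str) -> Optional[bool]:
    """Map a Task status string to True/False, or None if it is not final."""
    if status in SUCCESS_STATES:
        return True
    if status in FAILURE_STATES:
        return False
    return None


def wait_and_check(pipeline_task_id: str) -> Optional[bool]:
    """Block until the pipeline controller Task ends, then report the outcome."""
    from clearml import Task  # lazy import: needs clearml installed and configured

    task = Task.get_task(task_id=pipeline_task_id)
    # Empty raise_on_status so a failed run is returned as False, not raised
    task.wait_for_status(
        status=(
            Task.TaskStatusEnum.completed,
            Task.TaskStatusEnum.failed,
            Task.TaskStatusEnum.stopped,
        ),
        raise_on_status=(),
    )
    return pipeline_succeeded(task.get_status())
```

In a CI job, the return value of `wait_and_check` could decide the exit code of the step.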

2 years ago

0 I Wanted To Ask, How To Run Pipeline Steps Conditionally? E.G If Step Returns A Specific Value, Exit The Pipeline Or Run Another Step Instead Of The Sequential Step

Great to hear!

3 years ago

0 I'M Trying To Understand How Clearml Serving Works And Trying To Set It Up. I Have An Agent Listening To The Serving Queue And I'M Trying To Set Up Clearml Serving To Launch On The Serving Queue. Do I Need To Have Clearml-Serving Installed On The Machine

can you tell me what the serving example is in terms of the explanation above and what the triton serving engine is,

Great idea!

This line actually creates the control Task (2)
clearml-serving triton --project "serving" --name "serving example"
This line configures the control Task (the idea is that you can do that even when the control Task is already running, but in this case it is still in draft mode).
Notice the actual model serving configuration is already stored on the crea...

3 years ago

0 I’M Using Catboost For Training, But Sadly It Does Not Have A Native Integration With Clearml (Xgboost And Lightgbm Do Have Integrations). But Catboost Writes Down Training Logs In Tensorboard Format (Into A

it certainly does not use tensorboard python lib

Hmm, yes I assume this is why the automagic is not working 😞

Does it have a pythonic interface for the metrics?

4 years ago

0 Hi Everyone, I Have Questions Related To Clearml-Serving.

Hmm, what does your preprocessing code look like?

3 years ago

0 Hi, I Have Another Problem

What do you see in the console when you start the trains-agent? It should detect the CUDA version.

5 years ago

0 Also, Is There A Way To Remove The Examples From My Server Deployment? I Can'T Delete The Tasks. I Tried To Archive The Task Prior To Deletion And I Get The Following Error:

Out of interest, is there a reason these are read-only?

Yes, we should probably change that... they are designed to be pre-populated, but there should not be any reason you could not remove them

The code for these tasks is on github right?

Correct

4 years ago

Actually that is less interesting, as it is quite straightforward

4 years ago

0 Question About The Storage Manager. Assuming I Have An Object That Updates Frequently And Always Saved At The Same Path (E.G.

We should probably change it so it is more human readable 🙂

5 years ago

0 I'M Training A Tensorflow Model And Saving It In The End. I Looked At The Outputmodel Class. How Do I Connect The Model I'M Saving To The Outputmodel?

Once a model is saved and published, it should be downloadable right

Well, that depends on whether you configured ClearML to auto-upload it (by default it will just log the "local location").
To auto-upload, add output_uri=True to Task.init (or specify a destination with output_uri="s3://bucket/").
You can also configure it as default here:
https://github.com/allegroai/clearml/blob/65f1c0baa124efb05fb7894a5386f0dd52c0536b/docs/clearml.conf#L163
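
A short sketch of the `output_uri` option described above, assuming a configured `clearml` SDK. The project and task names are placeholders, and `choose_output_uri` is a hypothetical helper that only exists to make the two variants explicit.

```python
from typing import Optional, Union


def choose_output_uri(destination: Optional[str] = None) -> Union[bool, str]:
    """Return the value to pass as output_uri.

    A destination such as "s3://bucket/" is passed through as-is;
    otherwise True is used, which asks for upload to the default location.
    """
    return destination if destination else True


def init_task(destination: Optional[str] = None):
    """Create a Task whose saved models are uploaded, not just logged locally."""
    from clearml import Task  # lazy import: needs clearml installed and configured

    return Task.init(
        project_name="examples",         # placeholder name
        task_name="model upload demo",   # placeholder name
        output_uri=choose_output_uri(destination),
    )
```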

3 years ago

0 Anyone Seeing These Errors?

AttractiveCockroach17 I verified this is an issue with hyperparameters with "." or section names with ".", thank you for noticing!
I will make sure I pass it along, should be part of the next version (ETA a week) 🙂

3 years ago

0 How Can I Integrate Trains-Server To Aws Ec2 Api

Hi AstonishingSwan80, what do you mean by "ec2 API"?

5 years ago

0 I’M Trying To Use Minio With Clearml As A External Storage. I Am Having Problems With The Configuration File For The Clearml Client When I Use The Output_Uri Parameter Of Task.Init What Do I Put There? I Am Currently Doing Task.Init(… Output_Uri=“S3://I

odd message though ... it should have said something about boto3

2 years ago

0 Hi Everyone, Does Anybody Know If The Latest Release 1.15 Is Still Vulnerable To

Hi Martin, of course not,

Smart!

I was just wondering if it has been patched yet and if not what is the expected timeline for patching it

Yes, I believe the target is a patch version 1.15.1 to be released in a couple of weeks. This is not a major issue, but it's always better to have it fixed. (btw: the enterprise version never had this issue to begin with, because it is of course authenticated, and it has an additional RBAC layer on top.)

one year ago

0 Hi, Trying To Spin Up A Clearml Agent And Getting This Error:

So the agent installed okay. It's the specific Task that the agent is failing to create the environment for, correct?
If this is the case, what do you have in the "Installed Packages" section of the Task (see under the Execution tab)?

3 years ago

0 Anyone Seeing These Errors?

Is the error consistent, meaning does it also happen with other integer values?

3 years ago

0 Anyone Seeing These Errors?

This is odd, what is the parameter?
I assume it needs sorting, and one time the value is an Integer and the next time it is a String, so the server cannot sort based on it. Could that be?
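
To illustrate the hypothesis (this is a plain Python analogy, not the server's actual code): a list that mixes integers and strings cannot be ordered, while a list of a single type can.

```python
def can_sort(values) -> bool:
    """Return True if the values can be ordered, False on a type clash."""
    try:
        sorted(values)
        return True
    except TypeError:
        # Raised when comparing, for example, an int with a str
        return False


uniform = [3, 10, 1]     # same parameter always reported as an integer
mixed = [3, "10", 1]     # same parameter reported once as a string

print(can_sort(uniform))
print(can_sort(mixed))
```

If this is what happens, reporting the parameter with one consistent type should avoid the error.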

3 years ago

0 Hi, Trying To Spin Up A Clearml Agent And Getting This Error:

the latter is an ec2 instance

and the agent fails to install on the ec2 machine ?

3 years ago

0 What Is The Method For Packages Exploration When Using Conda? Agent Is Set To 'Conda' Mode. We Upload A Task From A Local Conda Env That (Obviously) Has Some Pip Packages As Well. When We Enqueue The Task To Run Remotely, Not All Conda Packages Are Instal

CrookedWalrus33 from the log it seems the code is trying to use "kwcoco" but it is not listed under any "Installed packages" nor do you see any attempt to install it. Can you confirm ?

3 years ago

0 Hi! I Deployed Clearml Server Along With Jupyterhub On Azure K8S (Aks). The Way It Works Is That Every User Is Assigned A New Pod That Is Spawned With A Docker Image Of A Choice (One Of Them With Clearml Sdk Installed). I Managed To Configure Most Of The

GreasyPenguin66 Nice !!!
Very cool setup, and kudos on making it work with multiple users!
Quick question, shouldn't the JUPYTERHUB_API_TOKEN env variable be enough to gain access to the server? Why did you need to add it to the 'nbserver-x.json' as well?

4 years ago

0 Hi Fam! I’M Trying To Get

Hi QuaintPelican38 can you manually access the machine based on the IP it registered?
(Look under the DevOps project: you'll see a running Task "interactive session", and under the Configuration tab, in User Properties, you should find the IP.)

4 years ago

0 Is There Any Documentation For

docstring ?
Usually the preferred way is StorageManager
https://clear.ml/docs/latest/docs/references/sdk/storage
https://clear.ml/docs/latest/docs/integrations/storage

3 years ago

0 I Want To Run

Hi @<1576381444509405184:profile|ManiacalLizard2>
You can also use env vars, which might be easier; I'm assuming this is some kind of CI/CD process.
'''
export CLEARML_API_ACCESS_KEY="your-public-key"
export CLEARML_API_SECRET_KEY="your-private-secret"
export CLEARML_API_HOST="https://api.clear.ml"
export CLEARML_WEB_HOST="https://app.clear.ml"
export CLEARML_FILES_HOST="https://files.clear.ml"

'''
[https://clear.ml/do...
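
In a CI/CD job it can help to fail early when one of these variables is missing. The check below is a small sketch using the variable names from the export lines above. The function name `missing_clearml_vars` is hypothetical and is not part of the ClearML SDK.

```python
import os
from typing import List, Mapping, Optional

# Variable names taken from the export lines above
REQUIRED_VARS = (
    "CLEARML_API_ACCESS_KEY",
    "CLEARML_API_SECRET_KEY",
    "CLEARML_API_HOST",
    "CLEARML_WEB_HOST",
    "CLEARML_FILES_HOST",
)


def missing_clearml_vars(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variables that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

A pipeline step could call `missing_clearml_vars()` and abort with a clear message if the returned list is not empty.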

3 months ago

0 Hi, I Run 'Manually' On My Local Machine With No Errors. Then, I Clone The Completed Task And Enqueue It. I Get To Stage When 'Environment Setup Completed Successfully'. But Right After I Get An Error Related To 'Connect' Method - Task.Connect(Config.Mode

@<1571308003204796416:profile|HollowPeacock58> seems like an internal issue copying this object config.model
This is a complex object, and it seems that for some reason
None

As a workaround, just do not connect this object. It seems you cannot pickle it / copy it (see GH issue)

2 years ago

0 This Message Is For The Clearml Team. I'Ve Found A Bug. I Think It'S Reproducible. Basically, When Dealing With Bools Inside Args, I Think What You Guys Do Is Just Cast It To Bool Since All The Args Are Stored As Strings If I'M Correct. Only Issue Is, Boo

Hi VexedCat68
can you supply more details on the issue ? (probably the best is to open a github issue, and have all the details there, so we have better visibility)
wdyt?
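
The question text is cut off, so the following only illustrates the general pitfall it seems to describe, under the assumption that the values are stored as strings and then cast with `bool()`. In Python any non-empty string is truthy, so `bool("False")` is `True`. The parser `parse_bool` is a hypothetical helper, not ClearML's implementation.

```python
def parse_bool(text: str) -> bool:
    """Convert a stored string back to a bool by its content, not its length."""
    normalized = text.strip().lower()
    if normalized in ("true", "1", "yes"):
        return True
    if normalized in ("false", "0", "no", ""):
        return False
    raise ValueError(f"Not a boolean string: {text!r}")


# The naive cast treats every non-empty string as True
naive = bool("False")
# Parsing the content gives the intended value
parsed = parse_bool("False")
print(naive, parsed)
```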

3 years ago

0 Hello, Everyone! I Have A Question Regarding Clearml Features. We Run Into The Situation When Some Of The Agents That Are Working On A Hpo Die Due To Variable Reasons. Some Workers Go Offline Or Resources Need Temporarily Be Detached For Other Needs. Thu

We have tried to manually restart tasks reloading all the scalars from a dead task and loading latest saved torch model.

Hi ThickKitten19
How did you try to restart them? How are you monitoring dying instances? Where and how are they running?

3 years ago

0 Hi There

5 years ago

0 Clearml-Agent Vs Clearml-Agent-Services ? Same Thing?

(as i see the services worker is only in the services-queue, and not my default queue (where my other servers/workers are)

So basically the service-mode is just a flag passed to the agent, and the services queue is the name of the queue it will pull from.

If i want a normal worker also

You can just add another section to the docker-compose, or run it manually after you spin the docker-compose.

LazyFox65 wdyt ?

4 years ago
