Is Task.current_task() creating a task?
Hmm it should not, it should return a Task instance if one was already created.
That said, I remember there was a bug (not sure if it was in a released version or an RC) that caused it to create a new Task if there wasn't an existing one. Could that be the case?
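A quick way to sanity-check it (project/task names below are just placeholders):
```python
from clearml import Task

# Task.current_task() should only return an already-created Task (or None),
# never create one; Task.init() is what actually creates/attaches a Task
print(Task.current_task())  # expected: None before any Task.init() call

task = Task.init(project_name="demo", task_name="current-task-check")  # placeholder names
assert Task.current_task() is task
```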
Update us if it solved the issue (for increased visibility)
Hi SpotlessLeopard9
I got many tasks that were just hung at the end of the script without ...
I remember this exact issue was fixed with 1.1.5rc0, see here:
https://clearml.slack.com/archives/CTK20V944/p1634910855059900
Can you verify with the latest RC? pip install clearml==1.1.5rc3
CheerfulGorilla72 sounds like a great idea, I'll pass it along to the documentation people 🙂
data is going to S3 as well as EBS. Why so? It should only go to S3
This sounds odd; if this is mounted then it goes to S3 (the link will point to the files server, but it will be stored on the mounted drive, i.e. S3)
wdyt?
You mean parameters of the pipeline? Is this a pipeline from Tasks or from function decorator?
HungryArcticwolf62 the new clearml-serving is almost out (ETA late next week), you can already start playing with it here:
https://github.com/allegroai/clearml-serving/tree/dev
Example:
train+serve
https://github.com/allegroai/clearml-serving/tree/dev/examples/sklearn
Okay, that makes sense. If this is the case I would just use clearml-agent execute --id <task_id here> to continue the training Task.
Do notice you have to reload your last checkpoint from the Task's models/artifacts to continue 🙂
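A rough sketch of reloading that last checkpoint before continuing (the task id is a placeholder):
```python
from clearml import Task

# fetch the stopped Task and grab its latest registered output model
prev_task = Task.get_task(task_id="the_stopped_task_id")  # placeholder id
last_checkpoint = prev_task.models["output"][-1].get_local_copy()  # downloads the checkpoint file
# load `last_checkpoint` with your framework and resume training from it
```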
Last question: what is the HPO optimization algorithm? Is it just grid/random search, or Optuna/BOHB? If it is the latter, how do you make it "continue"?
Why would that require refactoring? The Dataset class should take care of it internally, no?
The reason my_name is a subproject is so that every version can be a "Task" inside that project, which is just easier to manage (or at least that was the idea)
could you send the entire log here?
i.e. from the "docker-compose" command line and onward
Hi RoundMosquito25
This is a bit old but probably a good start:
https://clear.ml/blog/stacking-up-against-the-competition/
tl;dr
ClearML advantages (at least a few I can think of)
- Scales way better
- Enables out-of-the-box experiment orchestration (i.e. remote execution etc.)
- Data management
- Nicer UI
- Full REST API
- Full MLOps platform
- Model serving
- Query-able model repository
- Probably more 🙂
Docker cmd is basically the docker image name, but you can add parameters as well.
For example "nvidia/cuda" or "nvidia/cuda -v /mnt/data:/mnt/data"
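If you prefer setting it from code, a sketch along these lines should do the same (assuming a relatively recent clearml SDK; project/task names are placeholders):
```python
from clearml import Task

task = Task.init(project_name="demo", task_name="docker-cmd-example")  # placeholder names
# equivalent to filling the "Docker cmd" field in the UI: image name + extra docker arguments
task.set_base_docker(
    docker_image="nvidia/cuda",
    docker_arguments="-v /mnt/data:/mnt/data",
)
```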
What do you mean? every Model has a unique ID, what do you consider a version?
Ohh! I see now
@<1526371965655322624:profile|NuttyCamel41> the backend: "pytorch" is not really supported because it does not use the optimized Triton engine (which is the reason to run the Triton server)
In order to use PyTorch you need to convert the model to TorchScript and then deploy it, see the example here:
https://github.com/allegroai/clearml-serving/blob/7ba356efc97a6ae2159283d198d981b3c1ab85e6/examples/pytor...
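A minimal TorchScript conversion sketch (the model and input shape are placeholders):
```python
import torch
import torch.nn as nn

# placeholder model; replace with your own network
model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 2)).eval()

# trace the model with an example input and save the TorchScript file,
# which is what the Triton engine behind clearml-serving can then serve
example_input = torch.randn(1, 16)
scripted = torch.jit.trace(model, example_input)
scripted.save("model.pt")
```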
RoundMosquito25 do notice the agent is pulling the code from the remote repo, so you do need to push the local commits, but ClearML will take care of the uncommitted changes for you. Make sense?
Hi HappyDove3
task.set_script is a great way to add the info (assuming the .git is missing)
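Something along these lines (repo/branch/commit values below are just placeholders):
```python
from clearml import Task

task = Task.init(project_name="demo", task_name="manual-git-info")  # placeholder names
# manually attach the repository info when there is no local .git to auto-detect
task.set_script(
    repository="https://github.com/user/repo.git",  # placeholder repo
    branch="main",
    commit="0123abc",
    working_dir=".",
    entry_point="train.py",
)
```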
Are you running it using PyCharm? (If so, use the ClearML PyCharm plugin; it basically passes the info from your local git to the remote machine via OS environment variables)
It runs directly but leads to the above error with clearml
Both manually (i.e. calling Task.init and running it without an agent) and with an agent? Same exact behavior?
You should manually remove the cudatoolkit from the installed packages section in the UI, then try to send it to the agent and see if it works. The question is how it ended up there in the first place
SweetGiraffe8 Task.init will autolog everything (git/python packages/console etc.) for your existing process.
Task.create purely creates a new Task in the system and lets you manually fill in all the details on that Task.
Make sense?
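Roughly, the difference looks like this (project/task names are placeholders):
```python
from clearml import Task

# Task.init attaches to the running process and auto-logs git, packages, console output, etc.
task = Task.init(project_name="demo", task_name="auto-logged run")  # placeholder names

# Task.create only registers a new (empty) Task entry; you fill in the details yourself
new_task = Task.create(project_name="demo", task_name="manually defined task")
```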
The problem is not really for the agents to wait (this is easily solved by an additional high-priority queue); the problem is whether you will have a "free" agent... you see my point?
Building the pipeline in runtime from external configuration is very cool!!
I think nested components is exactly the correct solution, and it is a great use case.
I see now.
Let's assume you know which snapshot that was:
```python
prev_task = Task.get_task(task_id='the_first_training_task_id')
# get the second-from-last checkpoint
checkpoint_url = prev_task.models['output'][-2].url
prev_scalars = prev_task.get_reported_scalars()
new_task = Task.init('example', 'new task')
logger = new_task.get_logger()
# do a for loop and re-report the prev_scalars with logger.report_scalar
new_task.flush(wait_for_uploads=True)
new_task.set_initial_iteration(22000)
# start the training
```
```python
task = Task.init(...)
if task.running_locally():
    # wait for the repo detection and requirements update
    task._wait_for_repo_detection()
    # reset requirements
    task._update_requirements(None)
```
🙂
Hi NastyFox63
This seems like most of the reports are converted to PNGs (which is what the automagic does if it fails to convert the matplotlib figure into an interactive plot).
no more than 114 plots are shown in the plots tab.
Are you saying we have a 114-plot limit?
Is this true for "full screen" mode (i.e. not in the experiments table, but switching to the full detailed view)?
Hi PunyGoose16,
I think the website is probably the easiest 🙂
https://clear.ml/contact-us/
I think they get back to you quite quickly
@<1545216070686609408:profile|EnthusiasticCow4>
Is there currently a way to bind the same GPU to multiple queues? I believe the agent complains last time I tried (which was a bit ago)
You can run multiple agents on the same GPU:
CLEARML_WORKER_NAME=host-gpu0a clearml-agent daemon --gpus 0
CLEARML_WORKER_NAME=host-gpu0b clearml-agent daemon --gpus 0
one can containerise the whole pipeline and run it pretty much anywhere.
Does that mean the entire pipeline will be running on the instance spinning up the container?
From here, this is what I understand:
https://kedro.readthedocs.io/en/stable/10_deployment/06_kubeflow.html
My thinking was I can use one command and run all steps locally while still registering all "nodes/functions/inputs/outputs etc" with clearml such that I could also then later go into the interface and clone an...
GrumpyPenguin23 could you help and point us to an overview/getting-started video?
Sorry if it's something trivial. I recently started working with ClearML.
No worries, this has actually more to do with how you work with Dask
The Task ID is the unique id of any Task in the system (task.id will return the UID str)
Can you post a toy Dask snippet here? I'll explain how to make it compatible with clearml 🙂