Hi RipeAnt6
What would be the best way to add another model from another project say C to the same triton server serving the previous model?
You can add multiple calls to clearml-serving, each one with a new endpoint and a new project/model to watch; when you launch it, it will set up all the endpoints on a single Triton server (the model optimization loading is taken care of by Triton anyhow)
SoggyBeetle95 the question is, where does clearml store these arguments, and the answer is on the Task object (from there the agent will take them and apply them to the docker execution). Now since all users see all the tasks, they also see these arguments. Wdyt?
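For illustration, a minimal sketch of reading those stored settings back (the task ID is a placeholder):
```python
from clearml import Task

# any user who can see this task can also read its docker settings
task = Task.get_task(task_id='some_task_id')  # placeholder ID
# returns the base docker image / arguments the agent will apply
print(task.get_base_docker())
```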
You can query the system and get all the experiments based on date, then grab the machine GPU metrics.
DefeatedCrab47 check the cleanup service, it queries the system with the APIClient.
https://github.com/allegroai/trains/blob/10ec4d56fb4a1f933128b35d68c727189310aae8/examples/services/cleanup/cleanup_service.py#L72
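Along the same lines, a rough sketch (the date-filter syntax follows the cleanup service linked above; the ':monitor:gpu' scalar title is an assumption, verify it against your own reported scalars):
```python
from datetime import datetime, timedelta

from clearml import Task
from clearml.backend_api.session.client import APIClient

client = APIClient()
# tasks whose status changed within the last 7 days
week_ago = datetime.utcnow() - timedelta(days=7)
tasks = client.tasks.get_all(
    status_changed=['>{}'.format(week_ago)],
    only_fields=['id'],
    page_size=100,
)
for t in tasks:
    scalars = Task.get_task(task_id=t.id).get_reported_scalars()
    # machine monitoring metrics are reported as regular scalars
    gpu_metrics = scalars.get(':monitor:gpu', {})
    print(t.id, list(gpu_metrics.keys()))
```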
Happy new year @<1618780810947596288:profile|ExuberantLion50>
- Is this the right place to mention such bugs?
Definitely the right place to discuss them; usually, if verified, we ask to also add them in GitHub for easier traceability / visibility
m (i.e. there's two plots shown side-by-side but they're actually both just the first experiment that was selected). This is happening across all experiments, all my workspaces, and all the browsers I've tried.
Can you share a screenshot? is this r...
Using agent v1.01r1 in k8s glue.
I think a fix was recently committed, let me check it
I see now.
Let's assume you know which snapshot that was:
```python
from clearml import Task

prev_task = Task.get_task(task_id='the_first_training_task_id')
# get the second-from-last checkpoint
checkpoint_url = prev_task.models['output'][-2].url
prev_scalars = prev_task.get_reported_scalars()

new_task = Task.init('example', 'new task')
logger = new_task.get_logger()
# do some for loop and report the prev_scalars with logger.report_scalar
new_task.flush(wait_for_uploads=True)
new_task.set_initial_iteration(22000)
# start the training
```
I'm sorry, I mean if the queue name is not provided to the agent, the agent will look for the queue with the "default" tag. If you are specifying the queue name, there is no need to add the tag.
Is it working now?
Hi TastyOwl44
So this depends on your code itself, but usually you need a CPU machine to run the ClearML server (or use the free community server), then a machine to run the pipeline controller (usually the same machine running the clearml-server, as the pipeline control code is basically controller-only and does not execute the Task itself), and lastly you need machines with GPUs running the clearml-agent (these GPU machines are the ones actually doing the training / inference etc.)
Make ...
Will this still be considered as global site-packages?
This is a pip setting, I "think" it inherits from the local user's installation, but I would actually install with "sudo pip", which will definitely be "inherited"
What is the specific use case, updating a file on existing dataset and creating a new version?
if they're mission critical, but rather the clearml cache folder?
hmmm... they are important, but only when starting the process. Any specific suggestion?
(and they are deleted after the Task is done, so they are temp)
any idea why I cannot select text inside the table?
Ichh, seems again like plotly 😞 I have to admit it's quite annoying to me as well ... I would vote here: None
The issue itself is changing the default user:
```dockerfile
USER appuser
WORKDIR /home/appuser
```
Any reason for it?
feature request: tell me what gets passed along each edge of the pipeline graph
Nice! Please feel free to add it to the GH issue 🙂
Hmm yes that is odd, let me see if I can reproduce
Yes please 🙂
BTW: I originally thought the double quotes (in your PR) were also a bug, this is why I was asking, wdyt?
I mean, manually you can get the results and rescale, but not through the UI
Check the links that are generated in the UI when you upload an artifact or model
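For reference, a small sketch (project / artifact names are made up; the printed URL points at whatever files server or bucket you configured):
```python
from clearml import Task

task = Task.init(project_name='examples', task_name='artifact demo')
task.upload_artifact('my_dict', artifact_object={'a': 1})
task.flush(wait_for_uploads=True)
# the link reflects the configured files server / storage bucket
print(task.artifacts['my_dict'].url)
```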
I see TightElk12
You can always set up the OS environment variables CLEARML_API_HOST, CLEARML_WEB_HOST and CLEARML_FILES_HOST with the correct configuration, or you can simply set CLEARML_NO_DEFAULT_SERVER=1 which will prevent any usage of the default demo server. Wdyt?
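As a minimal sketch (the server addresses are placeholders; set the variables before clearml is imported):
```python
import os

# placeholders: point these at your own clearml-server
os.environ['CLEARML_API_HOST'] = 'http://my-server:8008'
os.environ['CLEARML_WEB_HOST'] = 'http://my-server:8080'
os.environ['CLEARML_FILES_HOST'] = 'http://my-server:8081'
# or refuse any fallback to the default demo server
os.environ['CLEARML_NO_DEFAULT_SERVER'] = '1'

from clearml import Task

task = Task.init(project_name='examples', task_name='env-config test')
```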
doing some extra "services"
what do you mean by "services"? (from the system perspective, any Task that is executed by an agent running in "services-mode" is a service; there are no actual limitations on what it can do 🙂)
Send the full Task log, you can DM it if that is easier
Hmm that makes sense. BTW, the PYTHONPATH set by the agent would be the working dir listed under the Task, but if you set agent.force_git_root_python_path the agent would also add the root of the git repo to the python path
Hmm, so the way the configuration works is: it loads the default configuration (equivalent to the example in the docs), then it adds ~/clearml.conf on top. That means you can tell your users to just copy-paste the credentials from the UI into a template you make. How is that?
yes, or (because I deployed clearml using helm in kubernetes) from the same machine, but multiple pods (tasks).
Oh now I see. Long story short, no 😞 The correct way of doing that is: every node/pod creates its own dataset, then when you are done, you create a new version with the X datasets that you created as parents. The newly created version is just "meta", it basically tells the system how to combine the previously generated datasets (i.e. no data is actually re-uploa...
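A rough sketch of that flow (project / dataset names, paths and IDs are placeholders):
```python
from clearml import Dataset

# on each node/pod: create and upload its own dataset version
ds = Dataset.create(dataset_project='examples', dataset_name='shard_0')
ds.add_files('/data/shard_0')  # placeholder path
ds.upload()
ds.finalize()

# once all pods are done: a "meta" version that only references its parents,
# so no data is actually re-uploaded
merged = Dataset.create(
    dataset_project='examples',
    dataset_name='merged',
    parent_datasets=['<shard_0_id>', '<shard_1_id>'],  # IDs from each pod
)
merged.finalize()
```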
HandsomeCrow5 I see, my bad.
BTW: Did you see this one?
https://github.com/allegroai/trains/blob/master/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py
And the helper classes here: https://github.com/allegroai/trains/tree/master/trains/automation
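For context, a condensed sketch based on that example (the base task ID, parameter name and metric names are placeholders; RandomSearch stands in for whichever search strategy you pick):
```python
from clearml.automation import (
    HyperParameterOptimizer, RandomSearch, UniformIntegerParameterRange,
)

optimizer = HyperParameterOptimizer(
    base_task_id='<template_task_id>',  # the task to clone and mutate
    hyper_parameters=[
        UniformIntegerParameterRange(
            'General/batch_size', min_value=32, max_value=128, step_size=32),
    ],
    objective_metric_title='validation',
    objective_metric_series='accuracy',
    objective_metric_sign='max',
    optimizer_class=RandomSearch,
    execution_queue='default',
    max_number_of_concurrent_tasks=2,
)
optimizer.start()
optimizer.wait()
optimizer.stop()
```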
Still I wonder if it is normal behavior that clearml exits the experiments with status "completed" and not with failure
Well that depends on the process exit code; if for some reason (not sure why) the process exits with return code 0, it means everything was okay.
I assume the "Detected an exited process, so exiting main" message is an internal print of your code; I guess it just leaves the process with exit code 0
Understood, then I would use Task.execute_remotely()
Basically:
```python
from clearml import Task

task = Task.init(...)
# config some stuff
task.execute_remotely(queue_name='queue_name_here')
# this line onward will be executed on the remote machine only
```
This will automatically log your code / repo with Task.init, and the call to task.execute_remotely will stop the local process (on your machine that runs the hydra sweep) and continue on the remote machine.
This will both allow you to use the Hydra sweep & schedule / run on remote ...
So the thing is, regardless of the link, you should end up with:
helper <clearml.storage.helper.StorageHelper object at 0x....>
But the code that failed seemed to return None, which makes me suspect the url itself is somehow broken.
Any chance you have a space before the "s3://" ?
BTW : what's the clearml version you are using ?
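A quick sanity check, as a sketch (the bucket URL is a placeholder):
```python
from clearml.storage.helper import StorageHelper

# placeholder URL; watch out for a leading space before 's3://'
url = 's3://my-bucket/my-folder'
helper = StorageHelper.get(url)
# expected: <clearml.storage.helper.StorageHelper object at 0x...>, not None
print(helper)
```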
Although I didn't understand why you mentioned torch in my case?
Just a guess 🙂 other frameworks do multi-processing as well.
I would guess it relates to the parallelization of Task execution by the HyperParameterOptimizer class?
Yes that might be it, it's basically a by-product of using python's "Process" class for multiprocessing. We are working on a fix, it's not trivial unfortunately