Or use python:3.9 when starting the agent
This is probably the best solution 🙂
And it works correctly when running on my computer, but when I use Colab it has no effect for some reason.
I think I'm lost on this one. When running in Colab, is this continuing a previous experiment?
BTW: GreasyPenguin14 you can also upload them as debug samples (when setting the output_uri, the debug samples will be uploaded to the same destination)
https://github.com/allegroai/clearml/blob/6b9297660e0ed83a77bce3da2fab384c552206fd/examples/reporting/image_reporting.py#L21
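For reference, a minimal sketch of what that could look like (the project/task names and the bucket URL are placeholders):
```
from clearml import Task, Logger

# per the note above, output_uri also sets where the debug samples are uploaded
task = Task.init(project_name="examples", task_name="image reporting",
                 output_uri="s3://my-bucket/clearml")

# report an image as a debug sample
Logger.current_logger().report_image(
    title="debug", series="sample", iteration=0, local_path="my_image.png")
```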
Because we are working with very big files, having them stored at multiple locations is something we try to avoid
Just so I better understand, is this for storing files as part of a dataset, or as debug samples ?
In other words can two diff processes create the exact same file (image) ?
trains-agent build --docker nvidia/cuda --id myTaskId --target base_env_services
It's building a GPU-enabled docker...
you might want a diff container or to specify --cpu-only
A few implementation / design details:
When you run code with Trains (and call init) it will record your environment (python packages, git code, uncommitted changes, etc.). Everything is stored on the Task object in the trains-server. When you clone a task you literally create a copy of the Task object (i.e. a second experiment). On the cloned experiment you can edit everything (parameters, git, base docker image, etc.). When you enqueue a Task you add its ID to the execution queue list, a trains-a...
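Roughly the same flow through the SDK would look something like this (the task ID, parameter name and docker image are placeholders):
```
from clearml import Task

# clone a recorded experiment - i.e. create a second copy of the Task object
cloned = Task.clone(source_task="<task_id>", name="cloned experiment")

# edit anything on the clone before it runs (parameters, base docker image, etc.)
cloned.set_parameters_as_dict({"General/learning_rate": 0.01})
cloned.set_base_docker("nvidia/cuda")

# add its ID to the execution queue, an agent listening on that queue will pick it up
Task.enqueue(cloned, queue_name="default")
```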
(without having to execute it first on Machine C)
Someone somewhere has to create the definition of the environment...
The easiest way to go about it is to execute it once.
You can add the following line to your code: task.execute_remotely(queue_name='default')
This will cause your code to stop running and enqueue itself on a specific queue.
Quite useful if you want to make sure everything works (like run a single step), then continue on another machine.
Notice that switching between cpu...
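Something along these lines (project/task/queue names are placeholders):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="remote run")

# run everything up to here locally (e.g. verify a single step works),
# then stop, enqueue this task on the 'default' queue and exit the local process
task.execute_remotely(queue_name="default", exit_process=True)

# from this point on the code only runs on the agent that pulled the task
print("running on the remote machine")
```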
And having a pdf is easier/better than sharing a link to the results page ?
and of course: task.set_parameters_as_dict(params)
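e.g. (the dict contents are just placeholders):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="params example")
params = {"batch_size": 32, "learning_rate": 0.001}
task.set_parameters_as_dict(params)
```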
Hi DrabCockroach54
This seems like a pip issue trying to install from source. Try upgrading the pip version before installing numpy, it should solve it 🤞
CheerfulGorilla72
upd: I see NaN in TensorBoard, and 0 in ClearML.
I have to admit, since NaN's are actually skipped in the graph, should we actually log them ?
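If you prefer to filter them on your side, a rough sketch (the metric names are made up):
```
import math
from clearml import Logger

def report_value(value, iteration):
    # skip NaNs client-side instead of sending them to the server
    if value is None or math.isnan(value):
        return
    Logger.current_logger().report_scalar(
        title="loss", series="train", value=value, iteration=iteration)
```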
Right, if this is the case, then just use 'title/name 001'
it should be enough (I think this is how TB separates title/series or metric/variant)
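i.e. on the TensorBoard side something like this (assuming torch's SummaryWriter, the tag text is a placeholder):
```
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()
# 'title/name 001' should map to metric 'title' and variant 'name 001'
writer.add_scalar("title/name 001", 0.5, global_step=0)
writer.close()
```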
MysteriousBee56 yes, please change the trains code!!! Wee pee, if you think someone else can benefit, feel free to PR :)
Regrading the double entry, that seems like an odd bug, how can I reproduce it?
If this is the case:
dataset = Dataset.get(...)
dataset.get_dependency_graph()
https://clear.ml/docs/latest/docs/references/sdk/dataset#get_dependency_graph
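i.e. something like (the project/name values are placeholders):
```
from clearml import Dataset

# fetch an existing dataset version
dataset = Dataset.get(dataset_project="examples", dataset_name="my_dataset")

# mapping of dataset id -> list of parent dataset ids it was built from
graph = dataset.get_dependency_graph()
print(graph)
```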
BroadMole98
I'm still exploring what trains is for.
I guess you can think of Trains as Experiment manager + MLOps tied together.
The idea is to give a quick and easy way to move from coding/running on one machine to scaling it to multiple remote machines, with everything that comes with it.
In some ways it is like Snakemake: it sets up your environment and executes the code. Snakemake also allows you to set up data, which in Trains is done via code (StorageManager), pipelines are also...
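For example, pulling data in code with the StorageManager looks roughly like this (the bucket URL is a placeholder):
```
from clearml import StorageManager

# download (and locally cache) a remote object, returns the local file path
local_path = StorageManager.get_local_copy(remote_url="s3://my-bucket/data/train.zip")
print(local_path)
```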
I was wondering about what I can do with the agent's argparse magic
You mean how to pass arguments to the components of a pipeline? btw did you check the pipeline example here?
Hi CostlyElephant1
what seems to be the issue? I could not locate anything in the log
"Environment setup completed successfully
Starting Task Execution:"
Do you mean it takes a long time to setup the environment inside the container?
CLEARML_AGENT_SKIP_PIP_VENV_INSTALL and CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL,
It seems to be working, as you can see no virtual environment is created, the only thing that is installed is the clearml-agent that i...
The pipeline stores the state of its previous run, specifically the executed steps.
In our case the executed step was reset (I assume) so it cannot find the output model you are referring to, hence crashing
CleanPigeon16 make sense ?
Basically run the agent in virtual environment mode. JumpyDragonfly13 try this one (notice no --docker flag):
clearml-agent daemon --queue interactive --create-queue
Then from the "laptop" try to get a remote session with:
clearml-session
GrievingTurkey78 I'm not sure I follow, are you asking how to add additional scalars ?
GrievingTurkey78 short answer no 😞
Long answer, the files are stored as differential sets (think change sets from the previous version(s)). The collection of files is then compressed and stored as a single zip. The zip itself can be stored on Google, but on their object storage (not GDrive). Notice that the default storage for clearml-data is the clearml-server, that said you can always mix and match (even between versions).
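Through the SDK, storing a dataset version on your own object storage would look roughly like this (bucket URL, project and dataset names are placeholders):
```
from clearml import Dataset

ds = Dataset.create(dataset_project="examples", dataset_name="my_dataset")
ds.add_files(path="./data")                      # collect the files / change set
ds.upload(output_url="gs://my-bucket/datasets")  # store the compressed zip on your bucket instead of the clearml-server
ds.finalize()
```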
Can you share the StorageManager usage, and the error you are getting?
RobustGoldfish9
I think you need to set the trains-agent docker to be aware of the host, so it knows how to mount data/cache/configurations into the sibling docker
It should look something like: TRAINS_AGENT_DOCKER_HOST_MOUNT="/mnt/host/data:/root/.trains"
So if running a docker: docker run -e TRAINS_AGENT_DOCKER_HOST_MOUNT="/mnt/host/data:/root/.trains" ...
Any chance there is an env variable you set to get 1.5.0rc0? Because this is the version that is being used