AgitatedDove14

49 Questions, 8124 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

25 × Eureka!

Answers 8124

0 [Clearml With Pytorch-Based Distributed Training} Hi Everyone! Is The Combination Of Clearml With

Hi ScantChimpanzee51
How are you launching the code ?
Basically the easiest way is to do so with the example you just mentioned,
Can this issue be reproduced ?

2 years ago

0 [Clearml With Pytorch-Based Distributed Training} Hi Everyone! Is The Combination Of Clearml With

It should actually work the same, if you find out it fails to properly register let me know (and then I guess a github issue is the next step)

2 years ago

0 Hi, I Would Like To Pass In Some Pip Arguments That Clearml-Agent Would Include When Setting Up The Venv On The Containers. How Should I Specify This? The Argument In Question Are --Trusted-Host And --Find-Links . I Need Them As I'Ve Installed A Pypi Repo

Hmm, I think you should use --template-yaml

4 years ago

0 Does Clearml Has A Webhook Mechanism ? Example, When A Training Job Is Completed.. There Is Notification Raised So Can Proceed To Do Deployment Etc ...

Is this some sort of polling ?

yes

End of the day, we are just worried whether this will hog resources compared to a web-hook ? Any ideas (edited)

No need to worry, it pulls every 30 sec, and this is negligible (as a comparison any task will at least send a write request every 30 sec, if not more)
Actually webhooks might be more taxing on the server, as you need to always have a webhook up (i.e. wasting a socket ...)

4 years ago

0 I'M A Little Confused As To How Force_Requirements_Env_Freeze Works When No Requirements File Is Supplied. Is It Supposed To Store The Full Reqs Of The Environment That Calls It?

I think CostlyOstrich36 managed to reproduce?!

3 years ago

0 What Happens If The Task.Init Doesn'T Happen In The Same Py File As The "Data Science" Stuff I Have A List Of Classes That Do The Coding And I Initialise The Task Outside Of Them. Something Like

This will allow them to experiment outside of clearml and only switch to it when they are in an OK state. This will also helpnot to pollute clearml spaces with half backed ideas

What's the value of runnign outside of an experiment management context ? don't you want to log it?
There is no real penalty here, no?!

2 years ago

0 Hi! Can Someone Show Me An Example Of How

. I was wondering what is the use of

PipelineController.create_draft

if you can't use it to clone and run tasks, as we have seen

I think the initial thought was to allow to create a pipeline from a pipeline programatically. Then once you have the "pipeline" you can manually enqueue it and modify it. Think a pipeline constructing other pipelines in flight based on some logic, then launching them in parallel.
make sense ?

3 years ago

0 Hi, I Want Clearml To Not Install Packages From My Pycharm Environment, But Activate An Environment I Have Locally On The Computer The Agent Is Installed On, Is The Variable Clearml_Agent_Skip_Pip_Venv_Install Relevant To This, And If So Can I Configure I

Hi ConvolutedChicken69
assuming you are runnign the agent in venv mode you can do something like:
$ CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=1 clearml-agent daemon --queue defaultThis will basically only clone the code and use the default python the clearml-agent itself is using.
Does that help?
BTW:

it gets an error as it can't find it with pip.

What's the error? how come the package cannot be installed ?

3 years ago

0 Hi All, Is There Anyway To Get The Id Of The Pipeline Using Pipeline Name? I Need The Id Of The Pipeline So That I Can Schedule The Pipeline To Run Via

So inside the pipeline logic you can do Task.current_task().id
Or inside a component Task.current_task().parent

2 years ago

0 Hi, I Was Getting A Really Weird Error Due To Mismatch On The Versions Between The Installed Libraries In My Environment And The Ones Ran In The Node (I Manually Changed The Installed Packages And Everything Worked). How Can I Force Trains To Use Exactly

GrievingTurkey78

maybe since the package is not directly imported in my code it is possible to get a different version to what I have locally (?).

If these are derivative packages (i.e. imported by other packages) they are not automatically logged when executing the Task manually (in order to keep the "installed packages as lean as possible on the one hand but specify also specify the important packages for you)
That said, when the "trains-agent" executed the task it will store nack...

4 years ago

0 I Just Deployed Clearml Into K8 Cluster Using Clearml Helm Package. When I Ran A Job, It Gave This Error In The Clearml Web Server (Attached Below). I Sshed Into The Pod Running The Clearml-Agent. Upon Typing Clearml-Agent Init, I Realised The Clearml.Con

DeliciousBluewhale87

Upon ssh-ing into the folders in the both the physical node (/opt/clearml/agent) and the pod (/root/.clearml), it seems there are some files there..

Hmm that means it is working...
Do you see there a *.conf files? What do they contain? (it point to the correct clearml-server config)

4 years ago

0 Hi Folks. I'Ve Installed Clearml On K8S Cluster Using Helm Chart 7.11.0, If It Matters. When I Trying To Create "App Credentials" From Workspace Settings And Then Past Them To Clearml-Init - I Got The Error:

And are you sure your are pointing to the correct API server and not mixing API with WEB address ?
Also what's the clearml-server version?

one year ago

Hi @<1729309120315527168:profile|ShallowLion60>
How did you create those credentials ?

one year ago

0 Fyi: Conda Installation Of Pytorch Is Broken Again. My Old Tasks Which Worked Before Now Fail Since They Do Not Find Torch. However, I Can See In The Execution That Conda Had Errors. Most Probably It Happens Because Pytorch 1.8.1 Has Been Released, But I

nice 🙂

4 years ago

0 Hi Everyone! Is There A Way To Specify The Working Directory In A Pipeline Component? I’M Using Pipelines From Decorators, I Can Set The Repo Url Just Fine, But I’M Running Everything From A Subfolder, And The Working Dir Is Set To

Hi @<1570220858075516928:profile|SlipperySheep79>

Is there a way to specify the working dir from the decoratoe

not directly, but why would that change anything? I mean the coponent code will be created in the git root, and you can still access files inside the subfolders

from .subfolder import something

what am I missing?

one year ago

This would work to load the local modules, but I’m also using poetry and the

pyproject.toml

is in the subdirectory, so the agent won’t install any dependency if I don’t set the

work_dir

hmmm true, in terms of requirements, you can list them in the decorator (see packages argument)

one year ago

0 Task Struck At

Hi PanickyMoth78

it was uploading fine for most of the day but now it is not uploading metrics and at the end

Where are you uploading metrics to (i.e. where is the clearml-server) ?
Are you seeing any retry logging on your console ?
packages/clearml/backend_interface/metrics/reporter.py", line 124, in wait_for_eventsThis seems to be consistent with waiting for metrics to be flushed to the backend, but usually you will see retry messages on your console when that happens

2 years ago

0 Hi All! I'M Using Clearml With Hydra As Configuration Manager. I'M Trying To Rerun A Task By Overriding Some Of The Configurations From The Ui. I Tried To Change The Config_Name Args In The Args Section And Also The Omegaconf Configuration In Configuratio

For example, could you test if this one works:
https://github.com/allegroai/clearml/blob/master/examples/frameworks/hydra/hydra_example.py

4 years ago

0 Hi Folks, We Are Trying To Find A Tool To Help With Workflow Orchestration. This Is Our Stack So Far (Label Studio/Clearml/Seldon). Does Anyone Have Any Experience With Using Any Workflow Which Is Most Compatible Esp Wrt To Clearml.

So like a UI for creating pipelines doing different things on the different solutions ?

4 years ago

0 Task Struck At

ShallowGoldfish8 I believe it was solved in 1.9.0, can you verify?
pip install clearml==1.9.0

2 years ago

0 [Task Gets Interrupted / Aborted / Reset When In Offline Mode] For Local Testing, We Have Added A

Thanks ScantChimpanzee51 !
Let me see what I can find, should be easy enough to fix now 🙂

2 years ago

0 Anyone Doing Sagemaker With Clearml - Something Like The K8S Glue But The Tasks Are Pulled Into Sagemaker Training Jobs

Do you have any experience and things to watch out for?

Yes, for testing start with cheap node instances 🙂
If I remember correctly everything is preconfigured to support GPU instances (aka nvidia runtime).
You can take one of the templates from here as a starting point:
https://aws.amazon.com/blogs/compute/running-gpu-accelerated-kubernetes-workloads-on-p3-and-p2-ec2-instances-with-amazon-eks/

4 years ago

0 I'M Using Tensorboard Summarywriter To Add Scalar Metrics For The Experiment. If Experiment Crashed, And I Want To Continue It From Checkpoint, For Some Reason It Plots Metrics In A Really Weird Way. Even Though I Pass Global_Step=Epoch To The Summarywrit

I think we should open a GitHub Issue and get some more feedback, maybe we should just add support in the backend side ?

3 years ago

0 Hey Guys, I Have Set Up A Clearml Pipeline For My Simple Isolation Forest Model. But I Have Been Receiving This Error.

If you are using the "default" queue for the agent, notice you might need to run the agent with --services-mode to allow for multiple pipeline components on the same machine

2 years ago

0 Hi, I'M Trying To Clone And Queue Experiments For Running Them On My Workers. I Am Able To Successfully Clone And Queue The Task, But Seems Like The Task Does Not Pass The Correct Parameters To My Python Script On The Worker. We Use Hydra For Configuring

The package detection is done when running the code on your laptop, and this is when it first logs the packages and versions. Following it, what do you have on your laptop? OS/Conda/Python

3 years ago

0 Hello! I Get The Following Error In Results->Console After A Task Is Sent For Remote Execution (Using Sdk):

I have an idea, can you try with:
task = Task.init(..., reuse_last_task_id=False)I have a suspicion it starts the Tasks in parallel, and the "reuse_last_task_id" causes them to "reuse the same task locally" which makes them overwrite the configuration of one another.

3 years ago

0 Kindly Let Me Know, How To Change The Hyper-Parameter Value ?

So now for it to take place you need to enqueue the Task and set an agent to pick it up and run it.
When the agent is running the Task the new parameter will be passed.
does that make sense ?

2 years ago

0 Is Anyone Also Experiencing Network Error During Every Clearml Dataset Download? It'S Been A While And Almost Every Download Fails...

Hi BitterStarfish58
Where are you uploading it to?

3 years ago

0 Guess We'Re Back To Basics How Do I Report A Single Scalar With No Iteration Dimension - Something I Can Put As One Of The Columns In The Experiments Table?

WackyRabbit7 How do I reproduce it ?

4 years ago

0 Hello, Community. I Hope You Are All Doing Well. I'M Seeking Information Regarding A Specific Problem, Specially In The Field Of Computer Vision. Typically, An App In The Field Of Computer Vision Will Have Multiple Models, Each With Its Own Preprocessing,

, but what I really want to achieve is to share this code:

You mean to share the code between them, unless this is a "preinstalled" package in the container, each endpoint has it's own separate set of modules / files
(this is on purpose, so you could actually change them, just image diff versions of the same common.py file)

one year ago

Show more results