AgitatedDove14

48 Questions, 8049 Answers

Active since 10 January 2023

Last activity 6 months ago

Reputation

Badges 1

25 × Eureka!

Answers 8049

0 Question About The Storage Manager. Assuming I Have An Object That Updates Frequently And Always Saved At The Same Path (E.G.

Well, I guess you can say this is definitely not a self-explanatory line 😉
But it is actually asking whether we should extract the code; think of it as:
if extract_archive and cached_file:
    return cls._extract_to_cache(cached_file, name)

3 years ago

0 I Am Also Experiencing A Weird Behaviour When Running A Script Using The Module Flag. For Example I Run:

Command-line arguments for the arg parser should be passed via the "Args" section in the Configuration tab.
What is the working directory of the experiment?

3 years ago

0 Question About The Storage Manager. Assuming I Have An Object That Updates Frequently And Always Saved At The Same Path (E.G.

We should probably change it so it is more human readable 🙂

3 years ago

0 Hello, Everyone. I Have A Model, And In

Hi SillyRobin38

I have included some print statements

you should see those under the Task of the inference instance.
You can also do:

import clearml
...
def preprocess(...):
  clearml.Logger.current_logger().report_text(...)
  clearml.Logger.current_logger().report_scalar(...)

, specifically within the containers where the inferencing occurs.

it might be that fastapi is capturing the prints...
[None](https://github.com/tiangolo/uvicor...

7 months ago

0 I'Ve Been Working A Bit With Trains-Agent, Having Them Deployed On Different Machines Listening To Queues (Docker Mode) And It'S Been Working Good So Far. My Question Is What Is The Difference Between That Setup (Creating Agents On Different Machines And

It's just another flag when running the trains-agent
You can have multiple service-mode instances, there is no actual limit 🙂

3 years ago

0 How Do I Restart Trains-Agents? How Do I Stop Them?

WackyRabbit7 I do 'pkill -f trains' but it's the same... If you need to debug and test, run with --foreground and just hit Ctrl-C to end the process (it will never switch to background...). Helps?

4 years ago

0 Hey, Using K8S With Trains 0.16.1-320, All Of A Sudden The Entire Data (I.E Experiments, Tasks, Api Creds) Is Not Showing In The Ui Anymore. All Logs Seems To Be Fine Afai Can Tell... Any Idea What Went Wrong?

so if the node went down and then some other node came up, the data is lost

That might be the case. Where is the k8s running? A cloud service?

3 years ago

0 How Can I Tell Clearml To Ignore Certain Submodules Existing In The Project? My Projects Consists Of Multiple Git Submodules And It Is Rather Annoying That The Task Always Tries To Fetch All Submodules, When They Are Not Even Necessary. I Don'T Know How I

That is quite neat! You can also put a soft link from the main repo to the submodule for better visibility

5 months ago

0 When I Try To Create Experiment In The Ui All I See Is This Dialogue

👍

2 years ago

0 Hey All, I'M Testing The Usage Of

First I would check the CLI command, it will basically prefill it for you:
https://clear.ml/docs/latest/docs/apps/clearml_task
Specifically to your question, working directory "." is the root of the git repo.
But I would avoid adding it manually; use the CLI, it will either ask you to provide the info or take the git repo details from the local copy.

2 years ago

0 Hello Folks. We'Re A Small Team Currently Considering Adopting Clearml For Experiment Tracking. I Was Wondering If I Start With The Hosted Service And Decide To Switch To A Self-Hosted Server Later, Is There A Way To Export All The Experiments/Data/Etc Fr

Exporter would be nice I agree, not sure it is on the roadmap at the moment 😞
Should not be very complicated to implement if you want to take a stab at it.

2 years ago

0 Um, Is There A Way To Delete An Artifact From A Task That Is Running?

Hmm, you can delete the artifact with:
task._delete_artifacts(artifact_names=['my_artifact'])
However, this will not delete the file itself.
To delete the file I would do:
remote_file = task.artifacts['delete_me'].url
h = StorageHelper.get(remote_file)
h.delete(remote_file)
task._delete_artifacts(artifact_names=['delete_me'])
Maybe we should have a proper interface for that? wdyt? What's the actual use case?
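The two steps described in this answer (delete the stored file, then unregister the artifact) could be wrapped roughly as follows. This is only a sketch: `delete_artifact_and_file` and its `get_helper` parameter are hypothetical names introduced here, the `clearml.storage.helper` import path is an assumption, and `_delete_artifacts` is a private method that may change between versions.

```python
def delete_artifact_and_file(task, name, get_helper=None):
    """Delete the stored file behind an artifact, then unregister the artifact.

    `get_helper` is a hypothetical injection point (not part of ClearML) so the
    function can be exercised without a server. When omitted, it falls back to
    StorageHelper.get, imported lazily.
    """
    if get_helper is None:
        # Assumed import path for the StorageHelper mentioned in the answer.
        from clearml.storage.helper import StorageHelper
        get_helper = StorageHelper.get

    artifact = task.artifacts.get(name)
    if artifact is None:
        return False  # nothing registered under that name

    remote_file = artifact.url
    helper = get_helper(remote_file)
    helper.delete(remote_file)  # remove the file from storage first
    # Private method, as used in the answer; not a stable public interface.
    task._delete_artifacts(artifact_names=[name])
    return True
```

Deleting the file before unregistering keeps the URL available if the storage call fails, so the operation can be retried.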

2 years ago

0 Https://Clearml.Slack.Com/Archives/Ctk20V944/P1713357955958089

Hi PricklyRaven28
Sorry, we missed that one

we need to invoke it with

accelerate launch

so we use

subprocess.run

So you have two options: either you change the script entry of the Task from your "script.py" to "-m accelerate launch script.py",
or you manually do that inside your entry point (i.e. call accelerate launch)
BTW, I "think" we added an "auto detect" for it, so that if you launched it manually this wa...
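For the second option (calling accelerate launch from inside your own entry point), a minimal sketch could look like the one below. The helper names `build_accelerate_command` and `run_with_accelerate` are hypothetical, and the sketch uses the `python -m accelerate.commands.launch` module form, which assumes Accelerate is installed in the same interpreter.

```python
import subprocess
import sys


def build_accelerate_command(script, script_args=(), launch_args=()):
    """Build the argv that runs `script` through accelerate's launcher."""
    # `python -m accelerate.commands.launch` is the module form of `accelerate launch`.
    return [
        sys.executable, "-m", "accelerate.commands.launch",
        *launch_args, script, *script_args,
    ]


def run_with_accelerate(script, script_args=(), launch_args=()):
    """Run the script in a child process; raises CalledProcessError on failure."""
    cmd = build_accelerate_command(script, script_args, launch_args)
    return subprocess.run(cmd, check=True)
```

Using `sys.executable` keeps the child process in the same Python environment as the Task that spawned it.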

5 months ago

0 Https://Clearml.Slack.Com/Archives/Ctk20V944/P1713357955958089

How does this work in the context of a pipeline?

Is your pipeline from functions / decorators ? or is it from Tasks ?
(If it is from Tasks, then just change the entry point in the overrides.)
In case of functions or decorators, you have to do that manually (i.e. your function needs to do "accelerate launch"):

from accelerate.commands.launch import launch_command, launch_command_parser
parser = launch_command_parser()
args = parser.parse_args("-command -here".split())
launch_command(arg...

5 months ago

0 Https://Clearml.Slack.Com/Archives/Ctk20V944/P1713357955958089

We used subprocess for it, ...

Popen? os.system? fork?

5 months ago

0 Hi Everyone! Is There A Way Or A Trigger To Detect When The Number Of Workers In A Queue Reaches Zero? Sometimes, My Workers Terminate Unexpectedly, Which Causes The Worker Count In The Queue To Drop To Zero And Prevents My Scheduler From Executing. I’D L

Hi QuaintJellyfish58

Is there a way or a trigger to detect when the number of workers in a queue reaches zero?

You mean to spin them down? What's the rationale?

I’d like to implement a notification system that alerts me when there are no workers left in the queue.

How are they "dropping" ?

Specifically to your question, let me check, I'm sure there is an API that gets that data because you can see it in the UI 🙂
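Until the exact API is confirmed, here is a rough sketch of how such a check might be structured. Everything ClearML-specific in it is an assumption: it presumes that `APIClient().workers.get_all()` returns worker entries exposing a `queues` list whose items have a `name`, and all function names are hypothetical.

```python
def count_queue_workers(workers, queue_name):
    """Count workers that list `queue_name` among the queues they serve."""
    count = 0
    for worker in workers:
        queues = getattr(worker, "queues", None) or []
        if any(getattr(q, "name", None) == queue_name for q in queues):
            count += 1
    return count


def check_queue(workers, queue_name, notify):
    """Call `notify(queue_name)` when no worker serves the queue; return the count."""
    n = count_queue_workers(workers, queue_name)
    if n == 0:
        notify(queue_name)
    return n


def check_clearml_queue(queue_name, notify):
    """Assumed wiring to the ClearML API; adapt it if the response shape differs."""
    from clearml.backend_api.session.client import APIClient  # assumed import path
    client = APIClient()
    return check_queue(client.workers.get_all(), queue_name, notify)
```

Calling `check_clearml_queue("default", print)` from a periodic job (cron or a scheduled Task) would then send the alert through whatever `notify` callback you provide.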

5 months ago

0 Https://Clearml.Slack.Com/Archives/Ctk20V944/P1713357955958089

If nothing specific comes to mind i can try to create some reproducible demo code (after holiday vacation)

Yes please! 🙏
In the meantime, see if the workaround is a valid one

5 months ago

0 Hi, I Would Like To Follow-Up In This

Thanks JitteryCoyote63, this is very helpful!

2 years ago

0 I Have Set

would those containers best be started from something in services mode?

Yes, as long as the machine has enough CPU/RAM.
Notice that the services mode will start a second parallel Task after the first one is done setting up the env. If running with CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL, with containers that have git/python/clearml-agent preinstalled, the setup time should be minimal.

or is it possible to get no-overhead with my approach of worker-inside-docker?

No do not do that, see above e...

5 months ago

0 I Have Set

I can see all the steps like git clone,

git clone has nothing to do with "env setup"; this is bringing the code, and you cannot skip that one. That said, this is why the git repo itself is cached on the host machine, so it is fast

... There may be some odd package that need to be installed because one of our DS is experimenting ... But all that we can see what is happening.

Even if everything is preinstalled, it verifies that the packages match, and this might take a long time. It's just pip being ...

5 months ago

0 Hi - Quick Question. I Am Using The Pipelinecontroller With Abort_On_Failure Set To False. I Have A Pipe With A First Task That Branch Out In 3 Branches.

Hi ThickCrow29

I am using the PipelineController with abort_on_failure set to False.

Is this a pipeline from code or from Tasks?
What is the clearml version?
Lastly, if a component fails, and another component is dependent on its output, how would it run? If it is not dependent, why is it a child component?

9 months ago

0 Hello, I Am Trying To Use The

I am trying to use the

configuration vault

option but it doesn't seem to apply the variables I am using.

Hi EmbarrassedSpider34 I think this is an enterprise feature...

Managed to make the credentials attached to the configuration when the task is spun up,

I'm assuming env variables ?

2 years ago

0 Hi, I Have A Problem With "Dataset" Module. I Create Dataset And Uploaded Few Files:

Hmm HandsomeGiraffe70
This seems like a bug, let me see what we can do about that 🙂
Could it be the parent version was created with an older version of the clearml SDK?

2 years ago

0 Hi, I Have A Problem With "Dataset" Module. I Create Dataset And Uploaded Few Files:

Hmm, this is odd. When you press on the parent dataset in the UI, go to full-details, then the INFO tab. Can you copy everything here?

2 years ago

0 Hi - Quick Question. I Am Using The Pipelinecontroller With Abort_On_Failure Set To False. I Have A Pipe With A First Task That Branch Out In 3 Branches.

if the first task failed - then the remaining tasks are not scheduled for execution, which is what I expect.

agreed

I'm just surprised that if the first task is

aborted

instead by the user,

How is that different from failed? The assumption is that if a component depends on another one, it needs its output; if it does not, then they can run in parallel. What am I missing?

9 months ago

0 Hi - Quick Question. I Am Using The Pipelinecontroller With Abort_On_Failure Set To False. I Have A Pipe With A First Task That Branch Out In 3 Branches.

ThickCrow29 this is odd... How did you create the pipeline? Can you provide a code sample?

9 months ago

0 Hi! Can Someone Show Me An Example Of How

Well, PipelineDecorator actually allows you to do the same thing, with the same abilities, i.e. clone / modify / enqueue.
(I mean, Pipeline with tasks is also great, I just want to clarify that they have the same capabilities in this respect).

2 years ago

0 Clearml (Remote Execution) Sometimes Doesn'T "Pick-Up" Gpu. After I Rerun The Task It Picks It Up. Seems Random, Doesn'T Happen Too Often (Maybe Once In 30-40 Times) And I Cannot Seem To Detect Any Pattern. Did Anyone Else Notice This? Agents Are Vms On G

I know about clearml.conf but wanted to avoid ssh-ing through 50 instances to edit it.

LOL yeah, btw: this is exactly the reason the enterprise version has a vault feature, so one could edit the base configuration in the UI and it automatically propagates everywhere

but docker_arguments doesn't propagate if I leave docker_image as None

yeah, that's correct, you have to select a container to be used

2 months ago

0 I Hit A Issue That I Cannot See My Matplotlib Plot, But It Was Shown In The Panel. Any Idea?

The upload itself is in the background.
It should not take long to prepare the plot for sending. Are you experiencing a major delay?

4 years ago

0 Is There A Way To Set Precedence On Package Managers? If We Set An Agent To Use

Local changes are applied before installing requirements, right?

correct

2 years ago
