I just need to understand what I should be expecting. I thought going from putting a task into the queue in the UI to "running my code remotely" (especially with packages preloaded) should be a fairly fast turnaround - certainly not three minutes. I'll have to change my whole pipeline design if this is the case.
I can see agent.vcs_cache.enabled = true
as a printout in the Console, but cannot find docs on how to set this via an environment variable, since I'm trying to keep these containers from needing a clearml.conf file (though I can generate one in the entrypoint script with a <<EOF heredoc if need be).
Update: ever since turning off git caching, I've had much more stability. I can't tell whether it's causing a slowdown in task execution though - is the clone a shallow one by default?
Yeah, I ended up figuring it out. I think we're in similar situations (private git repo with a token). I'll take a look at my config tomorrow, but from memory, you have to set your env variables and set an option in your config to force the HTTPS protocol if you're using a token.
I would assume a lot of them are log streaming? So you can try reducing printouts / progress bars. That seems to help for me.
For context: I have noticed the large number of API calls can be a problem when networking is unreliable. It causes a cascade of slow retries and can really hold up task execution. So do be cautious of where work is occurring relative to where the server is, and what connects the two.
For me, the fix was to raise the log level and reduce the number of prints my code was doing. Since I was using a logger instead of prints, it was pretty easy.
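For example, something like this (just a sketch - the logger name is a placeholder):
import logging

# Raise the threshold so only warnings and errors are emitted (and streamed to the server)
logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("my_app")      # placeholder name

logger.debug("per-batch details")         # suppressed at WARNING level
logger.warning("something worth keeping") # still emitted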
If you're using some framework that spits out its own progress bars, then I'd look into disabling those via whatever options it exposes.
As for turning off logs entirely, I don't know - I'll let the ClearML people respond to that.
For sure, though, the comms for CPU monitoring and epoch monitoring will lead to a lot of calls... but I'll agree 80k seems exce...
It sounds like you understand the limitations correctly.
As far as I know, it'd be up to you to write your own code that computes the delta between old and new and only re-processes the new entries.
The API would let you search through prior experimental results.
So you could load up the prior task, check the IDs that showed up in its output (maybe you save these as a separate artifact for faster load times), and only evaluate the new inputs. Perhaps you copy the old outputs over to the new task...
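To make that concrete, a rough sketch - the artifact name, project/task names, placeholder task id, and the two helper functions are all made up, not something ClearML provides:
from clearml import Task

def load_all_input_ids():
    return ["a", "b", "c"]       # stand-in for however you enumerate your inputs

def evaluate_one(item_id):
    return {"score": 0.0}        # stand-in for your real evaluation

task = Task.init(project_name="my-project", task_name="incremental-eval")

# Previous run that already processed part of the data (placeholder task id)
prev = Task.get_task(task_id="<previous-task-id>")
old_ids = set(prev.artifacts["processed_ids"].get()) if "processed_ids" in prev.artifacts else set()

new_ids = [i for i in load_all_input_ids() if i not in old_ids]
results = {i: evaluate_one(i) for i in new_ids}

# Store the union so the next run can diff against it, plus the new outputs
task.upload_artifact("processed_ids", artifact_object=sorted(old_ids | set(new_ids)))
task.upload_artifact("new_results", artifact_object=results)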
thanks so much!
I've been running a bunch of tests with timers and seeing an absurd amount of variance. I've seen parameter connect and task creation happen in seconds, and other times take 4 minutes.
Since I see timeout connection errors somewhat regularly, I'm wondering if perhaps I'm having networking errors. Is there a way (at the class level) to control the retry logic on connecting to the API server?
My operating theory is that some sort of backoff / timeout (e.g. 10s) is causing the hig...
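For reference, the timing itself is nothing fancy - roughly this (project and parameter names are placeholders):
import time
from clearml import Task

t0 = time.monotonic()
task = Task.init(project_name="my-project", task_name="timing-test")
print(f"Task.init: {time.monotonic() - t0:.1f}s")

params = {"lr": 0.001, "batch_size": 32}   # placeholder hyperparameters
t0 = time.monotonic()
task.connect(params)
print(f"task.connect: {time.monotonic() - t0:.1f}s")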
Mind-blowing... but somehow, just later in the same day, I got the same pipeline to create its DAG and start running in under a minute.
I don't know what exactly I changed. The pipeline task was run locally (which I've never done before), then cloned to run remotely in my services queue. And then it just flew through the experiment at the pace I expected.
So there's hope. I'll keep stress-testing it and see what causes the differences. I was right to suspect that such a simple DAG should not take...
Thanks for the clarification. Is there any way to bypass it? (a git diff + git rev-parse should take mere milliseconds)
I'm working out of a monorepo and am beginning to suspect it's a cause of the slowness. Next week I'll try moving a pipeline over to a new repo to test whether this theory holds any water.
I understood that part, but noticed that when I put in the code to start remotely, the consequence seems to be that the DAG computation happens twice - once on my machine as it runs, and then again remotely (this is at least part of why it's slower). If I put pipe.start earlier in the code, the pipeline fails to execute the actual steps.
This is unlike tasks, which somehow are smart enough to get registered in draft form when task.execute_remotely is up top.
Do I just leave off pipe.start?
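For context, my controller script is roughly shaped like this (step names and bodies are made up; the question is just where start / start_locally should go):
from clearml import PipelineController

def step_one():
    # placeholder step body; each step runs as its own task remotely
    return 42

def step_two(value):
    print("step_two got", value)

pipe = PipelineController(name="example-pipeline", project="my-project", version="0.0.1")
pipe.add_function_step(name="step_one", function=step_one, function_return=["value"])
pipe.add_function_step(
    name="step_two",
    function=step_two,
    function_kwargs=dict(value="${step_one.value}"),
    parents=["step_one"],
)

# start() enqueues the controller itself (by default to the services queue),
# while start_locally() runs the controller logic on this machine.
pipe.start(queue="services")
# pipe.start_locally(run_pipeline_steps_locally=False)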
Hm, yeah, I do see something like what you have in the screenshot.
{"meta":{"id":"d7d059b69fc14cba9ba6ff52307c9f67","trx":"d7d059b69fc14cba9ba6ff52307c9f67","endpoint":{"name":"queues.get_queue_metrics","requested_version":"2.30","actual_version":"2.4"},"result_code":200,"result_subcode":0,"result_msg":"OK","error_stack":"","error_data":{}},"data":{"queues":[{"avg_waiting_times":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0...
Still no graphs showing up, and still seeing this error in the console logs.
(deployment is localhost)
Everything else returns 200 except these two.
thank you!
By any chance, do you have insights into github.com/allegroai/clearml-server/issues/248? I don't know if it's related to this at all, but it is an issue I experienced after upgrading.
Nope, still dealing with it.
Oddly enough, when I spin up a new instance on the new version, it doesn't seem to happen.
I did manage to figure this out with
docker compose stop agent-services
docker compose up --force-recreate --no-deps -d agent-services
and running an export for the newly generated key.
Still, though, I'm noticing that restarts cause App Credentials to be lost.
This is not about storage access tokens; it's about the App Credentials -
the things you set as CLEARML_API_ACCESS_KEY and CLEARML_API_SECRET_KEY so that clients can talk to the API.
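i.e. the thing client code ultimately consumes - roughly this, where the hosts and keys are just placeholders for a localhost deployment:
from clearml import Task

# Programmatic equivalent of a clearml.conf / the env vars above (placeholder values)
Task.set_credentials(
    api_host="http://localhost:8008",
    web_host="http://localhost:8080",
    files_host="http://localhost:8081",
    key="<app-credentials-access-key>",
    secret="<app-credentials-secret-key>",
)
task = Task.init(project_name="my-project", task_name="credentials-check")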
Yup. Once again, rebooted and lost my credentials.
When I do a docker compose down; docker compose up -d
... these disappear.
To be clear... this was not happening before I upgraded to the latest version. That is why I am asking about this.
Starting to. Thanks for your explanation.
Would those containers best be started from something in services mode? Or is it possible to get no overhead with my approach of running the worker inside Docker?
I designed my tasks as different functions, based mostly on what metrics to report and which artifacts are best cached (and how to best leverage comparisons of tasks). They do require CPU, but not a ton.
I'm now experimenting with lumping a lot of stuff into one big task and seeing how this go...
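To give a rough picture of the "tasks as functions" design (just a sketch in the spirit of the pipeline decorators, with made-up names - not my actual code):
from clearml import PipelineDecorator

@PipelineDecorator.component(return_values=["metrics"], cache=True)
def compute_metrics(dataset_path):
    # placeholder body; each component runs as its own task, cached on identical inputs
    return {"accuracy": 0.0}

@PipelineDecorator.component(return_values=["report"])
def compare_runs(metrics):
    return str(metrics)

@PipelineDecorator.pipeline(name="metrics-pipeline", project="my-project", version="0.0.1")
def run_pipeline(dataset_path="data/input"):
    metrics = compute_metrics(dataset_path)
    return compare_runs(metrics)

if __name__ == "__main__":
    PipelineDecorator.run_locally()   # debug the DAG on this machine
    run_pipeline()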
Yes, I actually have been able to turn on caching since rc2 of the agent! It's been working much better.
Oh, it's there, before running the task.
From task pick-up to "git clone" is now ~30s, much better.
Though as far as I understand, the recommendation is still not to run workers-in-docker like this:
export CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1
export CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=$(which python)
(And FWIW, I have this in my entrypoint.sh:)
cat <<EOF > ~/clearml.conf
agent {
    vcs_cache {
        enabled: true
    }
    package_manager: {
        type: pip,
        ...
FWIW - I'm starting to wonder if there's a difference between me "resetting the task" vs cloning it.
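In SDK terms, the two flows I'm comparing are roughly these (task id and queue name are placeholders):
from clearml import Task

original = Task.get_task(task_id="<task-id>")   # placeholder id

# Flow A: clone the task and enqueue the copy
cloned = Task.clone(source_task=original, name=original.name + " clone")
Task.enqueue(cloned, queue_name="default")

# Flow B: reset the original task (clearing previous outputs/status) and enqueue it again
original.reset()
Task.enqueue(original, queue_name="default")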
But pretty reliably, some proportion of tasks still just take much longer. 1m - 10m is a variance I'd really like to understand.
Damn, I can't believe it. It disappeared again, despite the task's clearml version being 1.15.1.
I'm going to try running the pipeline locally.
(the "magic" of the env detection is nice but man... it has its surprises)