SmallTurkey79
Moderator
10 Questions, 124 Answers
  Active since 12 April 2024
  Last activity one year ago

Reputation

0

Badges 1

103 × Eureka!
0 Votes 2 Answers 1K Views
one year ago
0 Votes 1 Answers 937 Views
in case anyone else ever comes across mongo issues using the docker compose clearml stack (in case of a messy shutdown), I have found this script to be a lif...
one year ago
0 Votes 1 Answers 1K Views
Hi everyone! I just wanted to bring to your attention that ClearML 1.16.0 introduced authentication for the self-hosted fileserver by default. If any of...
one year ago
0 Votes 7 Answers 2K Views
Thread re: Pipelines and how they're meant to be used / how long they take to orchestrate. @<1523701205467926528:profile|AgitatedDove14> I appreciated your a...
one year ago
0 Votes 13 Answers 1K Views
any tips on debugging worker graphs not showing up? seems to be some js errors in the console that may be related. running localhost against 1.16.1 images
one year ago
0 Votes 9 Answers 2K Views
why does clearml still waste time on requirement analysis when I provide them? any tips for how I can reduce clearml overhead ... (the time before work actua...
one year ago
0 Votes 14 Answers 2K Views
one year ago
0 Votes 54 Answers 103K Views
I have set export CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=true export CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=true in my entrypoint.sh (which runs clearml-agent da...
one year ago
0 Votes 31 Answers 118K Views
I noticed after upgrading to the latest clearml that App Credentials now disappear on restart. Is this an intentional design choice? I'm in a bit of a chicke...
one year ago
0 Votes 43 Answers 109K Views
one year ago
0 I Have Set

of what task? i'm running lots of them and benchmarking execution times. would you like to see a best case or worst case scenario? (ive kept some experiments for each).

and yeah, in those docs you just linked, "boolean" vars like CLEARML_AGENT_GIT_CLONE_VERBOSE explicitly say true so I ended up trying that pattern. but originally i did try 1. let me go back to that now. thank you.

overall I've seen some improvements in execution time using the suggestions in this thread (tysm!) - th...

one year ago
0 I Dont Exactly Know How To Ask For Help On This... Nor Have A Reproducible Minimal Example... I Downgraded Back To 1.15.1 From 1.16.2 And Have The Same Issue There. I Have A Pipeline That's Repeatedly Failing To Complete. It Correctly Marks Things As Cach

yeah locally it did run. I then ran another via UI spawned from the successful one, it showed cached steps and then refused to run the bottom one, disappearing again. No status message, no status reason. (not running... actually dead)

one year ago
0 I Have Set

ah I see. thank you very much!

trying export CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=$(which python)
but I still see Environment setup completed successfully
(it is printed after Running task id )

it still takes a full 3 minutes between task pulled by worker until Running task id
is this normal? What is happening in these few minutes (besides a git pull / switch)?

one year ago
0 Question About Pipeline : My Setup Is As Follow:

Pipeline step caching matches on inputs and task status. If your task points to the latest commit, clearml can't know what that is until runtime and can't cache. On a fixed tag or commit, it sees no code has changed, and so if inputs match (hashable, all parameters are serializable), then it caches.
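
To make the idea concrete, here's a rough sketch of how a cache key like that could be built. This is not ClearML's actual implementation, just an illustration; `step_cache_key` and the example commit strings are made-up names.

```python
import hashlib
import json


def step_cache_key(commit: str, inputs: dict) -> str:
    """Hypothetical cache key: a pinned commit plus a hash of the step inputs."""
    # Sorting keys makes the serialized form independent of dict ordering,
    # so the same inputs always produce the same key.
    payload = json.dumps({"commit": commit, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Same commit and same inputs (in any order) give the same key, so a cache can hit.
key_a = step_cache_key("abc123", {"date": "2024-05-01", "seed": 7})
key_b = step_cache_key("abc123", {"seed": 7, "date": "2024-05-01"})

# A different commit gives a different key, so the step would rerun.
key_c = step_cache_key("def456", {"date": "2024-05-01", "seed": 7})
```

This also shows why inputs need to be serializable: anything `json.dumps` can't handle has no stable representation to hash.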

one year ago
0 I Have Set

I'm just working on speeding up the time from "queue experiment" to "my code actually runs remotely" - as of yesterday things would sit for many minutes at a time. trying to see if venv is the culprit.

one year ago
0 I Have Set

thank you!
i'll take that design into consideration.

re: CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL in "docker venv mode" im still not quite sure I understand correctly - since the agent is running in a container, as far as it is concerned it may as well be on bare-metal.

is it just that there's no way for that worker to avoid venv? (i.e. the only way to bypass venv is to use docker-mode?)

one year ago
0 I Have Set

yeah, still noticing that it can be multiple minutes before something starts...
like... what is happening in this time (besides a git clone), now that I set both

export CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=true
export CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=$(which python)

update: it's now been six mins and the task still isn't done. this should have run through in like a minute total end-to-end

one year ago
0 I Have Set

i just ran a pipeline that took about 2h (more than half this time was just the DAG), with about a hundred tasks. i'm taking a look at them now to see what the logs show for runtimes.

one year ago
0 I Have Set

i would love some advice on that though - should I be using services mode + docker and some max # of instances to be spinning up multiple tasks instead?

my thinking was to avoid some of the docker overhead. but i did try this approach previously and found that the container limit wasn't exactly respected.

one year ago
0 Hello! Is It Possible To Use S3 Backblaze With Clearml? Can I Use The Aws Section For That Settings? Or Another Section In Clearml.Conf Corresponding To Backblaze Would Work? Or There Is No Way To Use Backblaze In This Case At All?

For digitalocean:
host: "(region).digitaloceanspaces.com:443"
bucket: "(bucket name)"
key: "(key)"
secret: "(secret)"
multipart: false
secure: true
(verify commented out entirely)

So for you - make sure to add your creds that have the right scope (r/w), and try specifying the bucket.

Then in clearml tasks themselves you tell the task using output_uri="s3://(region).digitaloceanspaces.com:443/clearml/"

(I import this as a constant from a _constants.py file...
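
A minimal sketch of what such a constants file could look like. The region and bucket values are placeholders and `spaces_output_uri` is a made-up helper; only the URI shape comes from the post above.

```python
# _constants.py (sketch; values below are placeholders, not real settings)
REGION = "nyc3"      # assumption: replace with your Spaces region
BUCKET = "clearml"   # assumption: replace with your bucket name


def spaces_output_uri(region: str, bucket: str) -> str:
    """Build the s3:// URI in the same shape as the output_uri shown above."""
    return f"s3://{region}.digitaloceanspaces.com:443/{bucket}/"


OUTPUT_URI = spaces_output_uri(REGION, BUCKET)

# In a task script this would then be passed along the lines of:
#   from clearml import Task
#   task = Task.init(project_name="...", task_name="...", output_uri=OUTPUT_URI)
```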

7 months ago
0 Question About Pipeline : My Setup Is As Follow:

that same pipeline with just 1 date input.
i have the flexibility from the UI to either run a single, a dozen, or a hundred experiments... in parallel.

pipelines are amazing 😃

one year ago
0 I Have Set

oh yes. Using env until the next message is 2 minutes.

one year ago
0 Question About Pipeline : My Setup Is As Follow:

basically the git hash of the executed experiment + a hash on the inputs to the task.

one year ago
0 I Have Set

starting to . thanks for your explanation .

would those containers best be started from something in services mode? or is it possible to get no-overhead with my approach of worker-inside-docker?

i designed my tasks as different functions, based mostly on what metrics to report and artifacts that are best cached (and how to best leverage comparisons of tasks). they do require cpu, but not a ton.

I'm now experimenting with lumping a lot of stuff into one big task and seeing how this go...

one year ago
0 I Have Set

but pretty reliably some proportion of tasks still just take a much longer time. 1m - 10m is a variance i'd really like to understand.

one year ago
0 Why Does Clearml Still Waste Time On Requirement Analysis When I Provide Them? Any Tips For How I Can Reduce Clearml Overhead ... (The Time Before Work Actually Starts)?

thanks so much!
I've been running a bunch of tests with timers and seeing an absurd amount of variance. I've seen parameters connect and task create in seconds and other times it takes 4 minutes.

Since I see timeout connection errors somewhat regularly, I'm wondering if perhaps I'm having networking errors. Is there a way (at the class level) to control the retry logic on connecting to the API server?

my operating theory is that some sort of backoff / timeout (eg 10s) is causing the hig...
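
For anyone wanting to collect the same kind of timings, a small sketch using only the standard library. `timed` and `timings` are illustrative names, and the `time.sleep` call is a stand-in for whatever step is being measured (e.g. task creation or connecting parameters).

```python
import time
from contextlib import contextmanager

# label -> list of elapsed seconds, so repeated runs can be compared for variance
timings = {}


@contextmanager
def timed(label):
    """Record the wall-clock duration of the wrapped block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings.setdefault(label, []).append(time.perf_counter() - start)


with timed("task_create"):
    time.sleep(0.01)  # stand-in for the real call being measured
```

Wrapping each suspect step in its own label makes it easier to see which one accounts for the slow runs.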

one year ago
0 Thread Re: Pipelines And How They're Meant To Be Used / How Long They Take To Orchestrate.

mind-blowing... but somehow just later in the same day I got the same pipeline to create its DAG and start running in under a minute.

I don't know what exactly I changed. The pipeline task was run locally (which I've never done before), then cloned to run remotely in my services queue. And then it just flew through the experiment at the pace I expected.

so there's hope. i'll keep stress-testing it and see what causes differences. I was right to suspect that such a simple DAG should not take...

one year ago
0 Why Does Clearml Still Waste Time On Requirement Analysis When I Provide Them? Any Tips For How I Can Reduce Clearml Overhead ... (The Time Before Work Actually Starts)?

thanks for the clarification. is there any bypass? (a git diff + git rev-parse should take mere milliseconds)

I'm working out of a mono repo, and am beginning to suspect it's a cause of slowness. next week i'll try moving a pipeline over to a new repo to test if this theory holds any water.

one year ago
0 Question About Pipeline : My Setup Is As Follow:

and yes, you're correct. I'd say this is exactly what clearml pipelines offer.
the smartness is simple enough: same inputs are assumed to create the same outputs (it's up to YOU to ensure your tasks satisfy this determinism... e.g. seeds are either hard-coded or inputs to a task)
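
A tiny sketch of what "seed as an input" means in practice. `sample_step` is a made-up step function, not anything from ClearML.

```python
import random


def sample_step(n: int, seed: int) -> list:
    """Hypothetical pipeline step: the seed is an explicit input, not hidden state."""
    rng = random.Random(seed)  # local RNG, so nothing depends on global random state
    return [rng.randint(0, 100) for _ in range(n)]


# Same inputs give the same output, which is what makes reusing a cached result safe.
first = sample_step(5, seed=42)
second = sample_step(5, seed=42)
```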

one year ago
0 I Have Set

what if the preexisting venv is just the system python? my base image is python:3.10.10 and i just pip install all requirements in that image. Does that not avoid venv still?

it's good to know that in theory there's a path forward with almost zero overhead. that's what I want.

is it reasonable to expect that with sufficient workers, I can get 50 tasks to run in the same time it takes to run a single one? i can't imagine the apiserver being a noticeable bottleneck.

one year ago
0 I Have Set

minute of silence between first two msgs and then two more mins until a flood of logs. Basically 3 mins total before this task (which does almost nothing - just using it for testing) starts.

one year ago
0 I Dont Exactly Know How To Ask For Help On This... Nor Have A Reproducible Minimal Example... I Downgraded Back To 1.15.1 From 1.16.2 And Have The Same Issue There. I Have A Pipeline That's Repeatedly Failing To Complete. It Correctly Marks Things As Cach

would it be on the pipeline task itself then, since that's what's disappearing?
I will do some experiment comparisons and see if there are package diffs. thanks for the tip.

one year ago
0 Hello! Thank You For The Great Product. I Have A Bit Of A Request: This Hover Feature In Pipeline Overview Would Be Much More Useful If I Could Read Out The Whole Metric Name. (Not So Much An Issue With Things Like F1, "Acc", But Anything Longer Is Not

took me a while to deliver enough functionality to my team to justify working on open source... but I finally got back around to investigating this to write a proper issue, and ended up figuring it out myself and opening a PR.

one year ago
0 Hello! Is It Possible To Use S3 Backblaze With Clearml? Can I Use The Aws Section For That Settings? Or Another Section In Clearml.Conf Corresponding To Backblaze Would Work? Or There Is No Way To Use Backblaze In This Case At All?

i ran into this recently.
it's a small thing but double check the port. should be 443, not 433 as in the docs (typo?) - seems you got this in the screenshot.
no region should be set.

i don't use backblaze but if it helps i can show my digitalocean spaces config. should be comparable.

8 months ago
0 Is There A Way I Can Make Clearml Server Work On A 4Gb Ram Ec2-Instance? I See That Min 8Gb Is Recommended, Although It's Working With 4Gb Too, But Not Sure At What Load I'll Start Facing Problems. Also, I Guess Elasticsearch Uses Up Most Of The Ram, Is T

you can control how much memory elastic has via the compose stack, but in my experience - I've been able to run on a 4 core w/ 16gb of ram only up to a certain point. for things to feel snappy you really need a lot of memory available once you approach navigating over 100k tasks.

so far under 500k tasks on 16gb of ram dedicated solely to elastic has been stable for us. concurrent execution of more than a couple hundred workers can bring the UI to its knees until complete, so arguably we...

8 months ago
0 I Have Set

oooh thank you, i was hoping for some sort of debugging tips like that. will do.

from a speed-of-clearing-a-queue perspective, is a services-mode queue better or worse than having many workers "always up"?

one year ago