Ah! Makes sense. Thanks!
Also, I appreciate the time you're taking to answer, AgitatedDove14 and CostlyOstrich36. I know Fridays are not working days in Israel, so thank you 🙂
I am indeed
nevermind! Found and answered (solution in the issue linked above)
Yes; I tried running it both outside a venv and inside a venv. No idea why it uses 2.7?
So caching results for steps with the same arguments is trivial. Ultimately, I would say you can combine the task-based pipeline with a function-based pipeline to achieve the dynamic control you specified in the first two scenarios.
About the third scenario I'm not sure. If the configuration has changed, shouldn't the relevant steps (the ones where the configuration changed, plus their dependent steps) be rerun?
In any case, I think if you stay away from the decorators, at the cost of a bi...
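Roughly what I mean by combining the two styles, as a sketch (assuming ClearML's `PipelineController`; the project, step, and template names are made up):

```python
from clearml import PipelineController

pipe = PipelineController(name="demo-pipeline", project="demo", version="0.1")

def preprocess(dataset_path: str):
    # ... load and transform the data here ...
    return dataset_path

# Function-based step; cache_executed_step=True reuses the previous result
# when the step runs again with the same arguments and unchanged code.
pipe.add_function_step(
    name="preprocess",
    function=preprocess,
    function_kwargs=dict(dataset_path="s3://bucket/data.csv"),
    cache_executed_step=True,
)

# Task-based step, cloned from an existing template Task.
pipe.add_step(
    name="train",
    parents=["preprocess"],
    base_task_project="demo",
    base_task_name="train-template",
    cache_executed_step=True,
)

pipe.start_locally(run_pipeline_steps_locally=True)
```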
I've tracked it down further; it seems the pigar utility does not apply any smart logic there.
The case we have is the following:
- We have a monorepo, but all modules/libs share a common namespace `foo`; so e.g. when working on module `mod`, we use `from foo.mod import …`
- This then looks for a module called `foo`, even though it's just a namespace
- In the dist-info requirement, it seems any hyphen, dot, etc. are swapped for an underscore, so our site-packages represents this as `foo_m...
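In the meantime, a possible workaround is to register the real distribution name by hand before `Task.init`, so the analyzer's wrong guess for the `foo` namespace gets overridden. A sketch (`foo-mod` is a made-up distribution name):

```python
from clearml import Task

# Must be called before Task.init; pins the actual package name
# instead of the namespace that the requirements analyzer resolves.
Task.add_requirements("foo-mod")  # hypothetical distribution name
task = Task.init(project_name="demo", task_name="namespace-workaround")
```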
minio was a tiny bit of a headache to configure, but I'd be happy to help if you want, CrookedWalrus33. I just went through this process yesterday and today (see a few threads up...)
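For reference, the relevant `clearml.conf` section ended up looking roughly like this (host, key, and secret are placeholders):

```
sdk {
    aws {
        s3 {
            credentials: [
                {
                    host: "my-minio-host:9000"   # placeholder
                    key: "minio-access-key"      # placeholder
                    secret: "minio-secret-key"   # placeholder
                    multipart: false
                    secure: false
                }
            ]
        }
    }
}
```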
If everything is managed with a git repo, does this also mean PRs will have a messy metadata file attached to them?
Thanks Alon. In the full/official documentation the `clearml-data` CLI is not mentioned anywhere, so perhaps it should be refreshed 🙂
I think we're referring to different things here.
I won't be using the UI (and neither will my team).
But as mentioned, we've used DVC before and it adds a lot of junk metadata files to each GitHub PR (many `dvc.yaml`, `dvc.lock`, and `.gitignore` files). We're trying to avoid that as much as possible, hence my question about GitHub pull...
Thanks AgitatedDove14 , I'll first have to prove viability with the free version :)
Oh! Nice! I'll have a go at it and report back at the PR if it's in a functional state 🙂 Thanks AgitatedDove14 !
I believe that a Pipeline should have the system tags (`pipeline`, maybe `hidden`), even if it was created in a running `Task`.
Is there some default Docker image you ship with ClearML that you'd recommend, or can/should we use our own? 🙂
I guess the big question is how I can transfer local environment variables to a new `Task`
For example, we have a complicated YAML file with built-in `!include` instructions, so we upload all the included files too. This then clogs up the artifacts sidebar, and it would be nice to be able to say "these are all artifacts from this one file, you can collapse it by clicking here"
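Concretely, the current situation looks something like this (a sketch; the file names and the `included_files` list are made up):

```python
from clearml import Task

task = Task.init(project_name="demo", task_name="config-upload")

# The main YAML plus every file pulled in via !include.
included_files = ["config.yaml", "model.yaml", "data.yaml"]
for path in included_files:
    # Each included file currently becomes its own top-level artifact,
    # which is what clutters the artifacts sidebar.
    task.upload_artifact(name=path, artifact_object=path)
```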
The SDK is fine as it is - I'm more looking at the WebUI at this point
The key/secret is also shared internally so that sounds like a nice mitigation actually!
Which environment variable am I looking for? I couldn't spot anything specifically in that environment variables page
The overall flow I currently have is e.g.:
1. Start an internal task (not a ClearML `Task`; MLOps not initialized yet)
2. Call some `pre_init` function with `args` so I can upload the environment file via StorageManager to S3
3. Call some `start_run` function with the configuration dictionary loaded, so I can upload the relevant CSV files and configuration file
4. Finally initialize the MLOps (ClearML), start a task, execute remotely
I can play around with 3/4 (so e.g. upload CSVs and configuratio...
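In code, that flow looks roughly like this (a sketch; `pre_init`/`start_run` are our internal helpers and the bucket paths are placeholders):

```python
from clearml import Task, StorageManager

def pre_init(args):
    # Step 2: snapshot the environment file to S3 before ClearML is up.
    StorageManager.upload_file(".env", "s3://my-bucket/runs/env-snapshot")

def start_run(config):
    # Step 3: push the CSVs and the configuration file the same way.
    StorageManager.upload_file(config["csv_path"], "s3://my-bucket/runs/data.csv")

# Step 4: only now initialize ClearML and hand off to an agent.
task = Task.init(project_name="demo", task_name="training-run")
task.execute_remotely(queue_name="default")
```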
Sure, for example when reporting HTML files:
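Something along these lines (a sketch; the task and file names are made up, using `Logger.report_media`):

```python
from clearml import Task

task = Task.init(project_name="demo", task_name="html-report")

# Report a local HTML file so it shows up under the task's debug samples.
task.get_logger().report_media(
    title="report",
    series="summary",
    local_path="report.html",
)
```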
What's new in 1.1.6rc0?
That doesn't make sense? 🤔
Maybe I was not clear, but it's a simple part of the config file.
I guess it's mixed. If #340 is resolved, then this initializer task will be a no-op: detach, and init-close new tasks as needed.
Yes, thanks AgitatedDove14 ! It's just that the `configuration` object passed onwards was a bit confusing.
Is there a planned documentation overhaul? 🤔
I mean, if I search for "model", will it automatically search for tasks containing "model" in their name?
Great, thanks! Any idea about environment variables and/or other files (CSV)? I suppose I could use `task.upload_artifact` for the CSVs, but I'm still unsure about the environment variables
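One direction I'm considering (a sketch; the `MYAPP_` prefix is a made-up convention): upload the CSVs as artifacts, and snapshot the relevant environment variables as a connected configuration dictionary:

```python
import os
from clearml import Task

task = Task.init(project_name="demo", task_name="csv-and-env")

# CSVs as artifacts, as suggested above.
task.upload_artifact(name="training-data", artifact_object="data.csv")

# Snapshot selected environment variables into the task's configuration.
env_snapshot = {k: v for k, v in os.environ.items() if k.startswith("MYAPP_")}
task.connect_configuration(env_snapshot, name="environment")
```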
Maybe. When the container spins up, are there any identifiers regarding the task etc. available? I create a folder on the bucket per `python train.py` invocation so that the environment variables file doesn't get overwritten if two users execute almost simultaneously
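If the container is spun up by a clearml-agent, I believe the task ID is exposed, e.g. via the `CLEARML_TASK_ID` environment variable (or `Task.current_task()`), which could replace the per-invocation folder. A sketch (the bucket layout is made up):

```python
import os
from clearml import Task

# Inside the agent-spawned container; fall back to the running task object.
task_id = os.environ.get("CLEARML_TASK_ID") or Task.current_task().id

# One prefix per task, so concurrent runs never overwrite
# each other's environment files.
remote_prefix = f"s3://my-bucket/runs/{task_id}/"
```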
If I set the following:
`"extra_clearml_conf": "sdk.aws.s3.credentials = [\n{\nhost: 'ip:9000'\nkey: 'xxx'\nsecret: 'xxx'\nmultipart: false\nsecure: false\n},\n{\nhost: 'ip2:9000'\nkey: 'xxx'\nsecret: 'xxx'\nmultipart: false\nsecure: false\n}\n]"`
I run into a weird `furl` error: `ValueError: Invalid port '9000''.`
Coming back to this; ClearML prints a lot of error messages in local tests, supposedly because the output streams are not directly available:
` --- Logging error ---
Traceback (most recent call last):
  File "/usr/lib/python3.10/logging/__init__.py", line 1103, in emit
    stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3504, in _at_exit
    self.__shutdown...