Reputation
Badges 1
662 × Eureka!I’ll also post this on the main channel -->
Eek. Is there a way to merge a backup from elastic to current running server?
But... Which queue does it listen to, and which type of instances will it use etc
ClearML 1.1.4, Matplotlib 3.3.0 (it's not the latest as we have some backward compatibility issues)
Sorry AgitatedDove14 , forgot to get back to this.
I've been trying to convince my team to drop poetry 😄
This was a long time running since I could not access the macbook in question to debug this.
It is now resolved and indeed a user error - they had implicitly defined CLEARML_CONFIG_FILE
to e.g. /home/username/clearml.conf
instead of /Users/username/clearml.conf
as is expected on Mac.
I guess the error message could be made clearer in this case (i.e. CLEARML_CONFIG_FILE='/home/username/clearml.conf' file does not exist
). Thanks for the support! ❤
Internally yes, but in Task.init
the default argument is a boolean, not an int.
We don't want to close the task, but we have a remote task that spawns more tasks. With this change, subsequent calls to Task.init
fail because it goes in the deferred init clause and fails on validate_defaults
.
Maybe. When the container spins, are there any identifiers regarding the task etc available? I create a folder on the bucket per python train.py
so that the environment variables files doesn't get overwritten if two users execute almost-simultaneously
Feels like we've been over this 😄 Has there been new developments perhaps?
It's essentially that this - https://clear.ml/docs/latest/docs/guides/advanced/multiple_tasks_single_process cannot work in a remote execution.
Thanks SuccessfulKoala55 ! Is this listed anywhere in the documentation?
Could I set an environment variable there and then refer to it internally in the config with the ${...}
notation?
I see https://github.com/allegroai/clearml-agent/blob/d2f3614ab06be763ca145bd6e4ba50d4799a1bb2/clearml_agent/backend_config/utils.py#L23 but not where it's called 🤔
Hm, just a small update - I just verified and it does indeed work on linux:
` import clearml
import dotenv
if name == "main":
dotenv.load_dotenv()
config = clearml.backend_api.Config.load() # Success, parsed with environment variables `
It does, but I don't want to guess the json structure (what if ClearML changes it or the folder structure it uses for offline execution?). If I do this, I'm planning a test that's reliant on ClearML implementation of offline mode, which is tangent to the unit test
We’d be happy if ClearML captures that (since it uses e.g. pip, then we have the git + commit hash for reproducibility), as it claims it would 😅
Any thoughts CostlyOstrich36 ?
Seems like you're missing an image definition (AMI or otherwise)
You mean at the container level or at clearml?
Yes, the container level (when these docker shell scripts run).
The per user ID would be nice, except I upload the .env
file before the Task
is created (it's only available really early in the code).
Using the PipelineController with add_function_step
Yeah 🤔 🤔 they did. I'll give your suggested fix a go on Monday!
Sure, for example when reporting HTML files:
task.upload_artifact(..., is_requirement=True)
, task.connect_configuration(..., is_requirement=True)
Just implies these artifacts/configurations must be downloaded prior to running the code itself; then you also don't have to worry about zipping? 🤔
So where should I install the latest clearml version? On the client that's running a task, or on the worker machine?
The instance that took a while to terminate (or has taken a while to disappear from the idle workers)
After the task was initialized? 🤔
No that does not seem to work, I get
task.execute_remotely(queue_name="default")
2024-01-24 11:28:23,894 - clearml - WARNING - Calling task.execute_remotely is only supported on main Task (created with Task.init)
Defaulting to self.enqueue(queue_name=default)
Any follow-up thoughts, @<1523701070390366208:profile|CostlyOstrich36> , or maybe @<1523701087100473344:profile|SuccessfulKoala55> ? 🤔
SuccessfulKoala55 The changelog wrongly cites https://github.com/allegroai/clearml/issues/400 btw. It is not implemented and is not related to being able to save CSVs 😅
I wouldn't mind going the requests
route if I could find the API end point from the SDK?
My current workaround is to use poetry
and tell users to delete poetry.lock
if they want their environment copied verbatim
Example configuration -
` version: 1
disable_existing_loggers: true
formatters:
simple:
format: '%(asctime)s %(levelname)-9s %(name)-24s: %(message)s'
filters:
brackets:
(): ccutils.logger.BracketFilter
handlers:
console:
class: ccmlp.utils.TqdmStreamHandler
level: INFO
formatter: simple
filters: [brackets]
loggers: # Set logging levels for specific packages
urllib3:
level: WARNING
matplotlib:
level: WARNING
...