Eek. Is there a way to merge a backup from Elasticsearch into the currently running server?
ClearML 1.1.4, Matplotlib 3.3.0 (it's not the latest as we have some backward compatibility issues)
Sorry AgitatedDove14 , forgot to get back to this.
I've been trying to convince my team to drop poetry 😄
This was a long-running one, since I could not access the MacBook in question to debug it.
It is now resolved and indeed a user error: they had implicitly set CLEARML_CONFIG_FILE to e.g. /home/username/clearml.conf instead of /Users/username/clearml.conf, as is expected on Mac.
I guess the error message could be made clearer in this case (i.e. `CLEARML_CONFIG_FILE='/home/username/clearml.conf' file does not exist`). Thanks for the support! ❤
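Something along these lines would have surfaced the problem immediately (a hypothetical sanity check for illustration, not ClearML's actual code):

```python
import os
from pathlib import Path

# Hypothetical check: fail loudly when CLEARML_CONFIG_FILE points at a missing file.
config_file = os.environ.get("CLEARML_CONFIG_FILE", "~/clearml.conf")
if not Path(config_file).expanduser().is_file():
    raise FileNotFoundError(
        f"CLEARML_CONFIG_FILE='{config_file}' file does not exist"
    )
```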
Maybe. When the container spins up, are there any identifiers regarding the task etc. available? I create a folder on the bucket per `python train.py` invocation, so that the environment variables file doesn't get overwritten if two users execute almost simultaneously.
Feels like we've been over this 😄 Has there been new developments perhaps?
It's essentially that this - https://clear.ml/docs/latest/docs/guides/advanced/multiple_tasks_single_process cannot work in a remote execution.
Thanks SuccessfulKoala55 ! Is this listed anywhere in the documentation?
Could I set an environment variable there and then refer to it internally in the config with the ${...} notation?
I see https://github.com/allegroai/clearml-agent/blob/d2f3614ab06be763ca145bd6e4ba50d4799a1bb2/clearml_agent/backend_config/utils.py#L23 but not where it's called 🤔
Hm, just a small update - I just verified and it does indeed work on Linux:

```python
import clearml
import dotenv

if __name__ == "__main__":
    dotenv.load_dotenv()
    config = clearml.backend_api.Config.load()  # Success, parsed with environment variables
```
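For context, the kind of substitution being tested looks roughly like this (a sketch of a clearml.conf fragment; it assumes HOCON-style ${...} resolution against the environment, and the keys shown are just illustrative):

```
sdk {
    aws {
        s3 {
            key: ${AWS_ACCESS_KEY_ID}
            secret: ${AWS_SECRET_ACCESS_KEY}
        }
    }
}
```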
You mean at the container level or at clearml?
Yes, the container level (when these docker shell scripts run).
The per-user ID would be nice, except I upload the .env file before the Task is created (it's only available really early in the code).
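One way around the timing issue (a sketch; the bucket layout and names are made up for illustration):

```python
import uuid

# Generate a unique run ID before any Task exists, and use it to
# namespace the uploaded .env file so concurrent runs don't collide.
run_id = uuid.uuid4().hex
bucket_key = f"env-files/{run_id}/.env"  # hypothetical bucket layout
```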
Yeah 🤔 🤔 they did. I'll give your suggested fix a go on Monday!
task.upload_artifact(..., is_requirement=True), task.connect_configuration(..., is_requirement=True) would just imply these artifacts/configurations must be downloaded prior to running the code itself; then you also don't have to worry about zipping? 🤔
The instance that took a while to terminate (or has taken a while to disappear from the idle workers)
After the task was initialized? 🤔
I wouldn't mind going the requests route if I could find the API endpoint from the SDK?
My current workaround is to use poetry and tell users to delete poetry.lock if they want their environment copied verbatim.
Example configuration:

```yaml
version: 1
disable_existing_loggers: true
formatters:
  simple:
    format: '%(asctime)s %(levelname)-9s %(name)-24s: %(message)s'
filters:
  brackets:
    (): ccutils.logger.BracketFilter
handlers:
  console:
    class: ccmlp.utils.TqdmStreamHandler
    level: INFO
    formatter: simple
    filters: [brackets]
loggers:  # Set logging levels for specific packages
  urllib3:
    level: WARNING
  matplotlib:
    level: WARNING
...
```
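For reference, a config like this would typically be applied with the standard library's logging.config.dictConfig (a sketch; the logging.yaml filename is assumed, and ccutils/ccmlp are this project's own modules):

```python
import logging.config

import yaml

# Load the YAML logging config above and apply it process-wide.
with open("logging.yaml") as f:
    logging.config.dictConfig(yaml.safe_load(f))
```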
`~` is a bit weird since it's not part of the package (might as well let the user go through clearml-init), but using ${PWD} works! 👍 👍
(Though I still had to add the CLEARML_API_HOST and CLEARML_WEB_HOST ofc, or define them in the clearml.conf)
I'm also getting the following warning, I guess it's some ClearML dependency? `IPython could not be loaded!`
AgitatedDove14 the issue was that we'd like the remote task to be able to spawn new tasks, which it cannot do if I use Task.init before override_current_task_id(None).
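The pattern we're after is roughly this (a sketch only; whether Task.create avoids attaching to the current process the way we need is exactly what's in question):

```python
from clearml import Task

# Inside the remotely executed task, spawn an independent child task
# instead of re-attaching to the current one.
parent = Task.current_task()
child = Task.create(project_name="examples", task_name="spawned-subtask")
child.set_parent(parent.id)
```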
When would this callback be called? I'm not sure I understand the use case.
Am I making sense?
No, not really. I don't see how task.connect_configuration interacts with our existing CLI? Additionally, the documentation for task.connect_configuration says the second argument is the name of a file, not the path to it? So something is off.
As the meme goes, well yes but actually no, since the input path is provided via argparse? I'm also not sure how this would help debug from the WebUI - you can't really see the contents of a zipped file, and the configuration tab is too messy for such a nested configuration as the one we have. It's best suited as an artifact.
EDIT: Or am I missing something? Point being, when the remote execution begins, the entry point tries to run e.g. `python train.py --config_file path/to/local/file.yaml` ...
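For what it's worth, the combination under discussion would look something like this (a sketch; it relies on connect_configuration accepting a file path and returning a local path, which the agent can override on remote execution):

```python
from argparse import ArgumentParser
from pathlib import Path

from clearml import Task

parser = ArgumentParser()
parser.add_argument("--config_file", type=Path)
args = parser.parse_args()

task = Task.init(project_name="examples", task_name="train")
# Returns the path to use: locally it's args.config_file, remotely it may
# be a file reconstructed from the Task's stored configuration.
config_path = task.connect_configuration(args.config_file, name="train_config")
```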
Debugging. It's very useful for us to be able to see the contents of the configuration and understand what is going on and what is meant to be going on. Without a preview (which in our case is the entire content of the configuration file), one has to take the annoying route of downloading the files, etc. The configurations are uploaded to a single task and then linked across all tasks to conserve storage space (so the S3 storage point is identical across tasks). Sure, sounds good. I think it's a ...
That could work, given that:
- Could we add a preview section? One reason I don't like using the configuration section is that it makes debugging much, much harder.
- Will the clearml-agent download and unzip the files, placing them into the same local folder as needed for execution?
- What if we want to include non-configuration objects? (i.e. the model case I listed)
Sounds like a nice idea 😁
Follow-up; any ideas how to avoid PEP 517 with the auto scaler? 🤔 Takes a long time to build the wheels
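One direction to try (an assumption on my part, not a verified autoscaler feature - just plain pip mechanics for amortizing PEP 517 builds):

```bash
# Build the wheels once into a local wheelhouse...
pip wheel -w /wheels -r requirements.txt
# ...then install from it without hitting the build backend again
pip install --no-index --find-links /wheels -r requirements.txt
```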
Hurrah! Added `git config --system credential.helper 'store --file /root/.git-credentials'` to the extra_vm_bash_script and now it works (it logs the given git credentials in the store file, which can then be used immediately for the recursive calls).
Yes, you're correct, I misread the exception.
Maybe it hasn't completed uploading? At least for Datasets one needs to explicitly wait IIRC
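If it's an artifact-upload race, something like this should rule it out (a sketch; wait_on_upload and flush(wait_for_uploads=True) exist in recent clearml versions, IIRC):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="upload-check")
# Block until the artifact is actually stored before continuing.
task.upload_artifact("results", artifact_object={"acc": 0.9}, wait_on_upload=True)
# Or flush all pending uploads explicitly.
task.flush(wait_for_uploads=True)
```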
SuccessfulKoala55 That at least did not work, unless one has to specify wildcard patterns, perhaps?