I'm using 1.1.6 (upgraded from 1.1.6rc0) - should I try 1.1.7rc0 or something?
It does (root in a docker container); it shouldn't touch /run/systemd/generator/systemd-networkd.service anyway though
We can change the project names, of course, if there's a suggestion/guide that will make them see past the namespace…
Yeah I was basically trying to avoid clutter in the Pipelines page. But see my other thread for the background, maybe you have some good input there? 🙏
I can elaborate in more detail if you have the time, but generally the code is just defined in some source files.
I've been trying to play around with pipelines for this purpose, but as suspected, it fails to find the definition of the pickled object…
Because setting env vars and ensuring they exist on the remote machine during execution etc is more complicated 😁
There are always ways around, I was just wondering what is the expected flow 🙂
And actually it fails on quite a few tasks for us with this Python 3.6.
I tried to set a different image (agentk8sglue.defaultContainerImage: "ubuntu:20.04"), but that did not change much.
I suspect the culprit is agentk8sglue.image, which is set to tag 1.24-21 of clearml-agent-k8s-base. That image is quite old… Any updates on that? 🤔
Thanks SuccessfulKoala55 ! Could I change this during runtime, so for example, only the very first task goes through this process?
Note that it would succeed if e.g. run with pytest -s
I'll try that in a bit (that requires some access control changes). Any idea how I can modify the dynamically created virtualenv?
```
Poetry Enabled: Ignoring requested python packages, using repository poetry lock file!
The currently activated Python version 3.10.6 is not supported by the project (~3.8.0).
Trying to find and use a compatible version.
Using python3.8 (3.8.16)
Creating virtualenv ... in /root/.clearml/venvs-builds/3.10/task_repository/...git/.venv
Installing dependencies from ...
```
But it is strictly that if condition in Task.init, see the issue I opened about it
Setting the endpoint will not be the only thing missing though, so unfortunately that's insufficient 😞
I think you're looking for the execute_remotely function?
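If it helps, here's a minimal sketch of how that's typically used (the project/task/queue names are placeholders, not from your setup):
```python
from clearml import Task

# A minimal sketch: initialize the task locally, then hand it off to an agent.
# The queue name "default" is a placeholder -- use whichever queue your agents listen on.
task = Task.init(project_name="examples", task_name="remote run")

# Everything up to this call runs locally; execute_remotely() enqueues the task
# and (with exit_process=True) terminates the local process so the agent takes over.
task.execute_remotely(queue_name="default", exit_process=True)

# From here on, the code only runs on the agent.
```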
Opened this - https://github.com/allegroai/clearml/issues/530 - let me know if it's not clear enough, FrothyDog40!
Then the username and password would be visible in the autoscaler task 😕
But it should work out of the box; it already works that way regardless of ClearML. The username and personal access token are used as-is and propagate down to submodules, since those are simply another git repository.
I've run further checks on a different machine and it works there as well 🤔
We're using 1.1.5 at the moment -- I'll make sure everyone updates to 1.1.6 on Monday.
That solution does not work for us unfortunately -- the .env is an argument from argparse, and because we cannot attach non-git files to a remote task (again issue #395), we have to first download CLI arguments for remote execution and ensure they exist on the remote agent.
Hm, I did not specify any specific versions previously. What was the previous default?
That could be a solution for the regex search; my comment on the pop-up (in the previous reply) was a bit more generic - just that it should potentially include some information on what failed while fetching experiments 😄
I think I may have brought this up multiple times in different ways :D
When dealing with long and complicated configurations (whether config objects, YAML, or otherwise), it's often useful to break them down into relevant chunks (think Hydra, maybe).
In our case, we have a custom YAML instruction, !include, e.g.:
```yaml
# foo.yaml
bar: baz

# bar.yaml
obj: !include foo.yaml
maybe_another_obj: !include foo.yaml
```
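For context, a constructor like that can be wired up with PyYAML roughly like this (a sketch, not our actual implementation; the loader class name and the relative-path handling are my own assumptions):
```python
import os
import yaml

class IncludeLoader(yaml.SafeLoader):
    """SafeLoader that remembers the directory of the file being parsed,
    so relative !include paths can be resolved against it."""
    def __init__(self, stream):
        self._root = os.path.dirname(getattr(stream, "name", "."))
        super().__init__(stream)

def _include(loader, node):
    # Resolve the included file relative to the including file and parse it
    # with the same loader, so nested !include directives keep working.
    path = os.path.join(loader._root, loader.construct_scalar(node))
    with open(path) as f:
        return yaml.load(f, IncludeLoader)

IncludeLoader.add_constructor("!include", _include)

with open("bar.yaml") as f:
    config = yaml.load(f, IncludeLoader)
# config == {"obj": {"bar": "baz"}, "maybe_another_obj": {"bar": "baz"}}
```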
TimelyPenguin76 here's the full log (took a moment to anonymize completely):
```
Using environment access key CLEARML_API_ACCESS_KEY=xxx
Using environment secret key CLEARML_API_SECRET_KEY=********
Current configuration (clearml_agent v1.3.0, location: /tmp/.clearml_agent.zs4e7egs.cfg):
sdk.storage.cache.default_base_dir = ~/.clearml/cache
sdk.storage.cache.size.min_free_bytes = 10GB
sdk.storage.direct_access.0.url = file://*
sdk.metrics.file_history_size = 100
sdk.m...
```
That doesn't make sense? 🤔
Maybe I was not clear, but it's a simple part of the config file.
Something like this, SuccessfulKoala55 ?
1. Open a bash session in the docker container: `docker exec -it <docker id> /bin/bash`
2. Open a mongo shell: `mongo`
3. Switch to the backend db: `use backend`
4. Get the relevant project IDs: `db.project.find({"name": "ClearML Examples"})` and `db.project.find({"name": "ClearML - Nvidia Framework Examples/Clara"})`
5. Remove the relevant tasks: `db.task.remove({"project": "<project_id>"})`
6. Remove the project IDs: `db.project.remove({"name": ...})`
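If scripting it is easier, the same sequence could be done with pymongo instead of the mongo shell. A sketch only, assuming MongoDB is reachable on the default port from wherever you run it; the db and collection names are taken from the steps above:
```python
from pymongo import MongoClient

# Same cleanup as the mongo-shell steps above, scripted with pymongo.
# The connection string assumes MongoDB is exposed on the default port.
client = MongoClient("mongodb://localhost:27017")
db = client["backend"]  # the "backend" db from `use backend`

for name in ["ClearML Examples", "ClearML - Nvidia Framework Examples/Clara"]:
    project = db.project.find_one({"name": name})
    if project is None:
        continue
    # Remove the project's tasks first, then the project entry itself.
    db.task.delete_many({"project": project["_id"]})
    db.project.delete_one({"_id": project["_id"]})
```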
I've updated my feature request to describe that as well. A textual description is not necessarily a preview 😅 For now I'll use the debug samples.
These kinds of things definitely show that ClearML was originally designed only for neural networks tbh, where images are almost always only part of the dataset. The same goes for the consistent use of iteration everywhere 😞
Most of these are configurations (specific for an execution, but one such configuration defines multiple tasks). Some models might be uploaded if the user does not use our built-in link to ClearML model fetching 😄
I also tried adding agent.package_manager.system_site_packages = true to ensure these virtual environments have access btw, still to no avail
I dunno :man-shrugging: but Task.init is clearly incompatible with pytest and friends
FWIW it's also listed in other places VivaciousBadger56, e.g. one place says:
In order to make sure we also automatically upload the model snapshot (instead of saving its local path), we need to pass a storage location for the model files to be uploaded to.
For example, upload all snapshots to an S3 bucket…
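In code, that just means passing output_uri to Task.init, something like this (the bucket path is a placeholder):
```python
from clearml import Task

# Passing output_uri tells ClearML to upload model snapshots to this storage
# location instead of only recording their local path. The bucket is a placeholder.
task = Task.init(
    project_name="examples",
    task_name="training with uploaded snapshots",
    output_uri="s3://my-bucket/models",
)
```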