Basically you have the details from the Dataset page, why should it be mixed with the others?
Because maybe it contains code and logs on how to prepare the dataset. Or maybe the user just wants increased visibility for the dataset itself in the tasks view.
Why would you need the Dataset Task itself? That's the main question.
For the same reason as above. Visibility and ease of access. Coupling relevant tasks and dataset in the same project makes it easier to understand that they're...
Sorry, I misspoke; yes, of course, the agent's config file, not the queues
It's self-hosted, TimelyPenguin76
SuccessfulKoala55 WebApp: 1.4.0-175 • Server: 1.4.0-175 • API: 2.18
I mean, it makes sense to have it in a time-series plot when one is logging iterations and such. But that's not always the case... Anyway I opened an issue about that too! 🙂
A follow-up question (instead of opening a new thread): is there a way I could signal some files/directories to be copied to the execute_remotely task?
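Something like this is what I have in mind (just a sketch of how I imagine it could work; "training_data" and the data/ folder are made-up names, and I'm assuming artifacts uploaded before the task goes remote are kept):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="remote run with local files")

# Ship a local directory along with the task as an artifact
# ("training_data" and "data/" are placeholder names).
task.upload_artifact(name="training_data", artifact_object="data/")

# Everything after this call runs on the agent machine;
# the local process exits here (exit_process=True).
task.execute_remotely(queue_name="default", exit_process=True)

# On the remote side, pull a local copy of the uploaded directory.
local_path = task.artifacts["training_data"].get_local_copy()
print(f"Data available at: {local_path}")
```

Or is there a dedicated mechanism for this that I'm missing?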
I've updated my feature request to describe that as well. A textual description is not necessarily a preview 😅 For now I'll use the debug samples.
These kinds of things definitely show how ClearML was originally designed only for neural networks tbh, where images are almost always only part of the dataset. Same goes for the consistent use of iteration everywhere 😞
The idea is that the features would be copied/accessed by the server, so we can transition slowly and not use the available storage manager for data monitoring
It's a bit hard to read when they're all clustered together:
I guess following the example https://github.com/allegroai/clearml/blob/master/examples/advanced/execute_remotely_example.py, it's not clear to me how the server has access to the data loaders' location when it hits execute_remotely
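In the sketch I posted above I'd just upload the directory as an artifact before calling execute_remotely and pull it back on the remote side with get_local_copy(), but I'm not sure that's the intended pattern for data loaders.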
Exactly; the cloud instances (that are run with clearml-agent) should have that clearml.conf + any changes specified in extra_clearml_configuration for the scaler
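(For example, I'd expect to be able to drop things like the sdk.aws.s3 credentials or a default_output_uri into extra_clearml_configuration and have them end up in the instance's clearml.conf; that's my understanding of how it gets merged, anyway.)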
Okay trying again without detached
Is there a preferred way to stop the agent?
AFAIK that's the only way right now (see my comment here - https://clearml.slack.com/archives/CTK20V944/p1657720159903739?thread_ts=1657699287.630779&cid=CTK20V944 )
Or, if you have the ClearML paid service, I believe there is a "vaults" service, right AgitatedDove14?
Yes 😅 I want ClearML to load and parse the config before that. But now I'm not sure those settings in the config are even exposed as environment variables?
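(For reference, I know a few of them can go the other direction, i.e. be set as environment variables like CLEARML_API_HOST / CLEARML_API_ACCESS_KEY / CLEARML_API_SECRET_KEY, but I don't know whether values parsed from the config file get exported back out as env vars.)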
The only thing I could think of is that the output of pip freeze would be a URL?
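(I'm thinking of the case where a package was installed from a VCS link, so pip freeze emits something like `clearml @ git+https://github.com/allegroai/clearml.git@<commit>` instead of a plain version pin; just a guess that this is what's happening here.)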
Full log:
```
command: /usr/sbin/helm --version=4.1.2 upgrade -i --reset-values --wait -f=/tmp/tmp77d9ecye.yml clearml clearml/clearml
msg: |-
  Failure when executing Helm command. Exited 1.
  stdout:
  stderr: W0728 09:23:47.076465 2345 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
  W0728 09:23:47.126364 2345 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unava...
```
Could you provide a more complete set of instructions, for the less technically inclined?
How would I back up the data in the future, etc.?
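(I assume it mostly comes down to stopping the server and archiving the mounted data folders, /opt/clearml/data by default, but a confirmation and the exact steps would be nice.)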
I think now there's the following:
- Resource type
- Queue (name) defines resource + max instances

And I'm looking for:
- Resource type
- "pool" of resources (type + max instances)
- A pool can be shared among queues
Of course. We'd like to use S3 backends anyway; I couldn't spot exactly where to configure this in the chart (so for now it's defined in the individual agent's configuration)
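(By the individual agent's configuration I mean the sdk.aws.s3 section and sdk.development.default_output_uri in each agent's clearml.conf; if there's an equivalent place in the chart values I'd happily move it there.)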
Okay, I'll test it out by trying to downgrade to 4.0.0 and then upgrade to 4.1.2
Just to make sure, the chart_ref is allegroai/clearml, right? (for some reason we had clearml/clearml and it seems like it previously worked?)
But to be fair, I've also tried with python3.X -m pip install poetry etc. I get the same error.
Something like this, SuccessfulKoala55 ?
1. Open a bash session in the docker container: `docker exec -it <docker id> /bin/bash`
2. Open a mongo shell: `mongo`
3. Switch to the backend db: `use backend`
4. Get the relevant project IDs: `db.project.find({"name": "ClearML Examples"})` and `db.project.find({"name": "ClearML - Nvidia Framework Examples/Clara"})`
5. Remove the relevant tasks: `db.task.remove({"project": "<project_id>"})`
6. Remove the project IDs: `db.project.remove({"name": ...})`
Also I can't select any tasks from the dashboard search results 😞
Using an on-prem ClearML server, latest published version
SuccessfulKoala55 help me out here 🙂
It seems all the changes I make in the AWS autoscaler apply directly to the virtual environment set for the autoscaler, but nothing from that propagates down to the launched instances.
So e.g. the autoscaler environment has poetry installed, but then the instance fails because it does not have it available?
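(My guess is that poetry would have to be installed on the instances themselves, e.g. via the autoscaler's extra init/bash script (extra_vm_bash_script?), rather than in the autoscaler's own virtual environment, but I'd like to confirm that's the intended setup.)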