One more UI question TimelyPenguin76, if I may -- it seems one cannot simply report single integers. The report_scalar feature creates a plot of a single data point (or single iteration).
For example, if I want to report a scalar "final MAE" for easier comparison, it's kinda impossible 😞
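For anyone hitting the same thing, a hedged sketch of the workarounds I'm aware of (assumes a reasonably recent clearml version; it needs a reachable ClearML server to actually run, and the project/task names are placeholders):

```python
from clearml import Task

task = Task.init(project_name="demo", task_name="final metrics")
logger = task.get_logger()

# report_scalar always plots against an iteration axis, so a single
# value shows up as a one-point plot; pinning iteration=0 at least
# keeps it comparable across experiments.
logger.report_scalar(title="summary", series="final MAE", value=0.042, iteration=0)

# Newer clearml versions expose report_single_value for exactly this
# case: one number, shown as a value rather than a time series.
logger.report_single_value(name="final MAE", value=0.042)
```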
Hah. Now it worked.
It's pulled from the remote repository, my best guess is that the uncommitted changes apply only after the environment is set up?
Sorry, I misspoke, yes of course, the agents config file, not the queues
I mean, it makes sense to have it in a time-series plot when one is logging iterations and such. But that's not always the case... Anyway I opened an issue about that too! 🙂
A follow-up question (instead of opening a new thread): is there a way I could signal some files/directories to be copied to the execute_remotely task?
The idea is that the features would be copied/accessed by the server, so we can transition slowly and not use the available storage manager for data monitoring
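One way I could imagine doing this (a hedged sketch, not necessarily the intended pattern -- the queue name and the local path are placeholders, and it needs a ClearML server to run): upload the files as a task artifact before the execute_remotely call, then fetch them back on the remote side.

```python
from clearml import Task

task = Task.init(project_name="demo", task_name="remote run")

# Upload a local directory (hypothetical path) as an artifact; it goes
# to the files server / S3, so the remote machine can retrieve it.
task.upload_artifact(name="loaders", artifact_object="data/loaders/")

# Everything after this line runs only on the agent machine; the local
# process exits here once the task is enqueued.
task.execute_remotely(queue_name="aws_small")

# On the remote side, pull the artifact down to a local path.
local_copy = task.artifacts["loaders"].get_local_copy()
```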
It's a bit hard to read when they're all clustered together:
I guess, following the example https://github.com/allegroai/clearml/blob/master/examples/advanced/execute_remotely_example.py , it's not clear to me how the server gets access to the data loaders' location when it hits execute_remotely
Exactly; the cloud instances (that are run with clearml-agent) should have that clearml.conf + any changes specified in extra_clearml_configuration for the autoscaler
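If I read the autoscaler behavior right, whatever goes into extra_clearml_configuration is appended to the clearml.conf written on each launched instance. A hedged sketch of what that field might contain (standard clearml.conf HOCON keys; the values are placeholders):

```
# appended to the instance's clearml.conf (assumption: standard keys)
agent.package_manager.type: poetry
sdk.aws.s3.region: "us-east-1"
```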
The only thing I could think of is that the output of pip freeze would be a URL?
Full log:
```
command: /usr/sbin/helm --version=4.1.2 upgrade -i --reset-values --wait -f=/tmp/tmp77d9ecye.yml clearml clearml/clearml
msg: |-
  Failure when executing Helm command. Exited 1.
  stdout:
  stderr: W0728 09:23:47.076465 2345 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0728 09:23:47.126364 2345 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unava...
```
Could you provide a more complete set of instructions, for the less inclined?
How would I backup the data in future times etc?
Of course. We'd like to use S3 backends anyway, I couldn't spot exactly where to configure this in the chart (so it's defined in the individual agent's configuration)
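In case it helps anyone later, a sketch of the agent-side clearml.conf S3 settings we ended up looking at (bucket name and credentials are placeholders; assumes the standard sdk.aws.s3 section):

```
sdk {
    # placeholder bucket/prefix for artifact and model storage
    development.default_output_uri: "s3://my-bucket/clearml"
    aws {
        s3 {
            key: "ACCESS_KEY"      # placeholder
            secret: "SECRET_KEY"   # placeholder
            region: "us-east-1"
        }
    }
}
```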
Okay, I'll test it out by trying to downgrade to 4.0.0 and then upgrade to 4.1.2
Just to make sure, the chart_ref is allegroai/clearml, right? (For some reason we had clearml/clearml and it seems like it previously worked?)
But to be fair, I've also tried with python3.X -m pip install poetry etc. I get the same error.
Also I can't select any tasks from the dashboard search results 😞
Using an on-prem clearml server, latest published version
SuccessfulKoala55 help me out here 🙂
It seems all the changes I make in the AWS autoscaler apply directly to the virtual environment set for the autoscaler, but nothing from that propagates down to the launched instances.
So e.g. the autoscaler environment has poetry installed, but then the instance fails because it does not have it available?
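If the autoscaler exposes an init/startup script for the launched instances (an assumption on my side), that would be the place to install poetry on the instance itself, rather than in the autoscaler's own environment:

```shell
# runs on the launched EC2 instance at boot,
# not in the autoscaler's virtual environment
python3 -m pip install --upgrade pip
python3 -m pip install poetry
```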
This could be relevant SuccessfulKoala55 ; might entail some serious bug in ClearML multiprocessing too - https://stackoverflow.com/questions/45665991/multiprocessing-returns-too-many-open-files-but-using-with-as-fixes-it-wh
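For context, the fix in that SO answer boils down to using the pool as a context manager so worker processes and their pipes are cleaned up. A minimal standalone sketch (unrelated to ClearML internals, just the pattern):

```python
from multiprocessing import Pool

def square(x):
    return x * x

def run_pools(n_pools=10):
    # Creating pools in a loop without closing them leaks one set of
    # pipe file descriptors per pool, which eventually triggers
    # "Too many open files". The with/as form terminates the workers
    # and closes the pipes on exit, which is the fix from the answer.
    results = None
    for _ in range(n_pools):
        with Pool(processes=2) as pool:
            results = pool.map(square, range(5))
    return results

if __name__ == "__main__":
    print(run_pools())
```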
Is there a way to specify that flag within the config file, SuccessfulKoala55 ?
We're not using the docker setup though. The CLI run by the autoscaler is python -m clearml_agent --config-file /root/clearml.conf daemon --queue aws_small, so no docker
Should this be under the clearml or the clearml-agent repo?
I'm not too worried about the dataset appearing (or not) in the Datasets tab. I would like it (the original task) to not disappear from the original project I assigned it to
Yes and no SmugDolphin23
The project is listed, but there is no content and it hides my main task that it is attached to.
Not necessarily on the same branch, no