Hi ExuberantBat52
task = Task.get_task(...)
print(task.data)
wdyt?
Would an implementation of this kind be interesting for you, or do you suggest forking?
You mean adding a config map storing a default trains.conf for the agent?
post_optional_packages: ["google-cloud-storage", ]
Will install it last (i.e. after all the other packages) but only if you have it in the "Installed packages" list
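If it helps, a minimal sketch of where that setting would live in clearml.conf (assuming the agent's package_manager section):
agent {
    package_manager {
        # installed last, and only if it already appears in "Installed packages"
        post_optional_packages: ["google-cloud-storage", ]
    }
}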
Hi FiercePenguin76
Here’s my workaround - ignore the fail messages, and manually create an SSH connection to the server with Jupyter port forwarded.
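Something along these lines (a sketch, assuming Jupyter is listening on port 8888 on the remote machine and you have plain SSH access to it; user/host are placeholders):
# forward the remote Jupyter port to your local machine
ssh -L 8888:localhost:8888 user@remote-machine
# then browse to http://localhost:8888 locally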
You are correct, clearml-session assumes it can SSH into the remote agent machine, from that point it automatically tunnels all other connections on top of the original SSH (well with some fancy SSH keep-alive proxy).
I'm assuming that from home you cannot connect to the SSH machine at the office, which makes sense, but out of curiosity...
Thanks ReassuredTiger98 , yes that makes sense.
What's the python version you are using?
So I think this is a good example of pipelines and data:
Basically Task A generates data stored using clearml-data (see the Dataset class). The output of that is the ID of the Dataset. Then Task B uses that ID to retrieve the Dataset created by Task A.
documentation
https://github.com/allegroai/clearml/blob/master/docs/datasets.md
Example:
Step A creating Dataset:
https://github.com/alguchg/clearml-demo/blob/main/process_dataset.py
Step B training model using the Dataset created in ...
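Roughly, the two steps could look like this (a sketch; the project name, folder and ID are placeholders):
Step A - create and publish the Dataset:
from clearml import Dataset

dataset = Dataset.create(dataset_project="demo", dataset_name="processed_data")
dataset.add_files(path="./data")   # hypothetical local folder with the generated files
dataset.upload()
dataset.finalize()
print(dataset.id)                  # this is the ID Task B needs

Step B - fetch it by ID:
from clearml import Dataset

dataset = Dataset.get(dataset_id="<id from step A>")
local_path = dataset.get_local_copy()   # cached, read-only copy of the files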
I think we added it somewhere in 0.14, anyhow I just checked the Logger doc, it is there now 🙂
Hi LazyTurkey38
Configuring these folders will be pushed later today 🙂
Basically you'll have in your clearml.conf
agent {
docker_internal_mounts {
sdk_cache: "/clearml_agent_cache"
apt_cache: "/var/cache/apt/archives"
ssh_folder: "/root/.ssh"
pip_cache: "/root/.cache/pip"
poetry_cache: "/root/.cache/pypoetry"
vcs_cache: "/root/.clearml/vcs-cache"
venv_build: "/root/.clearml/venvs-builds"
pip_download: "/root/.clearml/p...
EnviousPanda91 so which frameworks are being missed? Is it a request to support a new framework, or are you saying there is a bug somewhere?
BTW: you can always set a different config file with an environment variable: CLEARML_CONFIG_FILE="path/to/config/file"
HandsomeCrow5
BTW: out of curiosity, how do you generate the HTML reports? I remember a few users suggesting trains should have report-generating functionality
how would I get an agent to launch in the same instance of my clearml server
Actually that is my point, you do not have to spin the agent on the clearml-server instance. We added the services agent as part of the docker-compose for easier deployment; that said, you can always manually SSH to the server, or run it on any other machine, like you would spin up any other clearml-agent.
Does that make sense?
DeliciousBluewhale87
node.base_task_id is the base task, which will always be in draft mode. Instead we should use node.executed, which references the currently executed node.
YES, maybe we should add that to the example, so it is clearer? WDYT?
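Something like this, for example (a sketch, assuming the standard PipelineController post_execute_callback signature):
from clearml import Task

def step_completed(pipeline, node):
    # node.executed is the ID of the Task that actually ran (node.base_task_id stays in draft)
    executed_task = Task.get_task(task_id=node.executed)
    print(executed_task.name, executed_task.get_status())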
Just to make sure I understand: running locally creates the Args/command correctly, and execute_remotely also creates the correct Args/command, but when the agent actually executes it on the remote machine it updates the Args/command back as a list. Is that a correct description?
This is what I think you should end up with:
DiscreteParameterRange('General/dataset_url', values=["option 1 for url", "option 2 for url"])
If args['dataset_url'] is a list, you should just do values=args['dataset_url']
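In context it would look roughly like this (a sketch; base_task_id, args and the objective settings are placeholders for your own setup):
from clearml.automation import DiscreteParameterRange, HyperParameterOptimizer

optimizer = HyperParameterOptimizer(
    base_task_id=base_task_id,   # the template Task to clone for each trial
    hyper_parameters=[
        # if args['dataset_url'] is already a list, pass it directly as values=
        DiscreteParameterRange('General/dataset_url', values=args['dataset_url']),
    ],
    objective_metric_title='validation',
    objective_metric_series='loss',
    objective_metric_sign='min',
)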
In Windows, setting system_site_packages to true allowed all stages in the pipeline to start - but it doesn't work in Linux.
Notice that it will inherit from the system packages, not the venv the agent is installed in
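For reference, a sketch of where that setting lives in clearml.conf (assuming the agent's package_manager section):
agent {
    package_manager {
        # venvs created by the agent will also see the machine's system packages
        system_site_packages: true
    }
}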
I've deleted the tfrecords from the master branch and committed the removal, and set the folder for tfrecords to be ignored in .gitignore. Trying to find which changes are considered to be uncommitted.
you can run git diff, it is essentially...
Hi DepressedChimpanzee34
This is not a query call, this is a reporting call. See the docs below:
https://clear.ml/docs/latest/docs/references/api/workers#post-workersstatus_report
It is used by the worker to report its own status.
I think this is what you are looking for:
https://clear.ml/docs/latest/docs/references/api/workers#post-workersget_stats
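If it helps, a rough sketch of calling it through the Python APIClient (treat the exact request fields as an assumption and double-check them against the reference above):
from time import time
from clearml.backend_api.session.client import APIClient

client = APIClient()
# assumed fields: last 24h of average CPU usage, 1h buckets
stats = client.workers.get_stats(
    from_date=time() - 24 * 60 * 60,
    to_date=time(),
    interval=3600,
    items=[{"key": "cpu_usage", "category": "avg"}],
)
print(stats)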
Hi ExuberantBat52
I do not think you can... I would use AWS Secrets Manager to push the entire user list config file, wdyt?
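If you go that route, the fetch side could look roughly like this (a sketch with boto3; the secret name and target path are made up for illustration):
import boto3

client = boto3.client("secretsmanager")
secret = client.get_secret_value(SecretId="clearml/user-list-conf")  # hypothetical secret name
# write the stored config file where your deployment expects it
with open("/opt/clearml/config/apiserver.conf", "w") as f:
    f.write(secret["SecretString"])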
UnsightlyShark53 Awesome, the RC is still not available on pip, but we should have it in a few days.
I'll keep you posted here :)
cannot schedule new futures after interpreter shutdown
This implies the process is shutting down.
Where are you uploading the model? What is the clearml version you are using? Can you check with the latest version (1.10)?
When I passed specific arguments (for example --steps) it ignored them...
script.py test blah1 blah2 blah3 42
Is this how it is intended to be used?
Are there any options to remove the example projects?
So sorry just realized I missed your message
Yes, but I'm not sure it will have an effect, see here
why does the memory usage of Elasticsearch still persist at 32 GB after removing experiments?
Did you restart the server after removing the experiments?
or creating a dedicated function I would suggest also including the actual sampled point in the HP space.
Could you expand?
This would be the most common use case, and essentially the reason for running the HPO: understanding the sensitivity of metrics with respect to hyper-parameters
Does this relate to:
https://github.com/allegroai/clearml/issues/430
manually" filtering the keys I've put in for the HP space. I find it a bit strange that they are not saved as part of t...
the unclear part is how do I sample another point in the optimization space from the optimizer
Just so I'm clear on the issue, do you want multiple machines to access the internals of the optimizer class? Or do you just want a way to understand what the optimizer's sampling space is (i.e. the parameters and options per parameter)?
JitteryCoyote63 what's the clearml version?
Are you always seeing the "model uploaded completed" message?
What's the python version you are using?
Could it be the polling on the Task (can't remember what the interval is)? It will update its state once every X minutes/seconds
Nice, that seems to be the issue. Any chance you can open a GitHub issue, so we do not lose track of it?
Hi MelancholyElk85
Can I manually delete .zip files with datasets in the .clearml/cache/storage_manager/datasets directory?
Yes, you can. I "think" the .zip is stored for easier access, but you can delete it; as long as the "extracted" folder exists, it should be fine.
HandsomeCrow5 OMG the guys already added it to the debug samples as well, check out the demo app (dropdown "test html sample"):
https://demoapp.trains.allegro.ai/projects/4e7fef090aa849b1acc37d92b59b3360/experiments/83c9ed509f0e421eaadc1ef56b3af5b4/info-output/debugImages