I would also be interested in a GCP autoscaler, I did not know it was possible/available yet.
Great, and this would show up in the description column in the dashboard?
Thanks, I guess I need to have a bucket under Cloud Storage?
So if I want to train with a remote agent on a remote machine, I have to:
spin up clearml-agent on the remote
create a dataset using clearml-data, populate with data… from my local machine
use clearml-data to upload data to a google gs:// bucket
modify my code so it accesses data from the dataset as here https://clear.ml/docs/latest/docs/clearml_data/clearml_data_sdk#accessing-datasets
Am I understanding right?
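The dataset side of the steps above can be sketched with the clearml-data CLI. A minimal sketch, assuming a ClearML server is configured; the project/dataset names and the gs:// bucket are placeholders, not values from this thread:

```shell
# Create a new dataset entry, stage local files, upload to GCS, finalize.
# "MyProject", "my-dataset" and the bucket path are illustrative only.
clearml-data create --project MyProject --name my-dataset
clearml-data add --files ./data
clearml-data upload --storage gs://my-bucket/datasets
clearml-data close
```

After `close`, the training code can fetch the same dataset by project/name via `Dataset.get`, as the linked docs describe.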
I think I am missing one part — which command do I use on my local machine to indicate the job needs to be run remotely? I’m imagining something like: clearml-remote run python3 my_train.py
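The command that ended up doing this (mentioned later in the thread) is clearml-task. A minimal sketch of an invocation, assuming a queue named default and placeholder project/task names:

```shell
# Enqueue a local script for remote execution by a clearml-agent.
# --project / --name / queue name are illustrative values.
clearml-task --project MyProject --name my-train-run \
    --script my_train.py --queue default
```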
AgitatedDove14 thanks, yes I assume I would follow these instructions:
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_gcp
I guess I follow these steps on a GCP instance?
https://clear.ml/docs/latest/docs/clearml_agent
got it, nice, thanks
thanks, so I got clearml-task working, sent it to a queue and the agent on GCP picked it up. I had a question — for a job that runs on the order of minutes, it’s not worth re-creating the whole python virtual env from scratch on the remote (that itself takes 5 mins). So is the --folder option meant for running it in an existing folder with an existing virtual env?
(and a way to specify which remote server)
should I nuke the .clearml/cache?
The CLI doesn’t care about the state of my git repo, right?
it finally finished, no worries
Dataset.get works fine from a python script, it pulls the data into the cache. Just the CLI seems broken
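For reference, the Python call being described is roughly this. A minimal sketch, assuming a configured ClearML server; the project and dataset names are placeholders, not values from this thread:

```python
from clearml import Dataset

# Look up the dataset by project/name and materialize a local copy.
# Files land in the local ClearML cache (~/.clearml/cache by default).
ds = Dataset.get(dataset_project="MyProject", dataset_name="my-dataset")
local_path = ds.get_local_copy()
print(local_path)
```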
created a new 5 GB dataset, no update for 20 mins, is that normal?
this is great… so it looks like it’s best to do it in a new dir
Thanks for the quick response. Will look into this later, I think I understand.
So net-net does this mean it’s behaving as expected, or is there something I need to do to enable “full venv cache”? It spends nearly 2 mins starting from:
created virtual environment CPython3.8.10.final.0-64 in 97ms
  creator CPython3Posix(dest=/home/pchalasani/.clearml/venvs-builds/3.8, clear=False, global=False)
and then printing several lines like this:
Successfully installed pip-20.1.1
Collecting Cython
  Using cached Cython-0.29.30-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86...
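For context, the agent-side venv cache is enabled in the agent’s clearml.conf by giving it a cache path. A sketch of the relevant section — the keys mirror the sample clearml.conf shipped with clearml-agent, so verify the exact names and defaults against your own config file:

```
agent {
    venvs_cache: {
        # uncommenting/setting "path" is what turns the cache on
        path: ~/.clearml/venvs-cache
        max_entries: 10
        free_space_threshold_gb: 2.0
    }
}
```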
A quick note for others who may visit this… it looks like you have to do:
Task.force_requirements_env_freeze(force=True, requirements_file="requirements.txt")
to ensure any changes in requirements.txt are reflected in the remote venv
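Putting that note together in context — a minimal sketch of where the call goes, with placeholder project/task names; the key point is that the freeze call comes before Task.init so the frozen requirements are recorded with the task:

```python
from clearml import Task

# Record requirements.txt verbatim instead of auto-detected packages.
# Must run before Task.init for the freeze to take effect.
Task.force_requirements_env_freeze(force=True, requirements_file="requirements.txt")

# project/task names here are illustrative only
task = Task.init(project_name="MyProject", task_name="my-train-run")
```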
I mean it is in pip mode and the agent installs deps from the git repo that it pulls
I have a strong attachment to a workflow based on CLI, nice zsh auto-suggestions, Hydra and the like. Hence why I moved away from dvc 🙂
Oh I think I know what I missed. When I set --project … --name … they did not match the names I used when I did task.init( ) in my code
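In other words, the names passed on the command line need to line up with what the script registers. A sketch with placeholder values (not the names from this thread):

```shell
# --project / --name here must match the project_name / task_name
# that my_train.py passes to Task.init(); all values are illustrative.
clearml-task --project MyProject --name my-train-run \
    --script my_train.py --queue default
```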