Not sure how that works for debug samples and scalars ....
But theoretically, with the above, one should be able to fully reproduce a run
Hi.
How do you tell the server to use my Azure storage instead of the local drive on the host machine? Isn't it by setting azure.storage
in /opt/clearml/config/clearml.conf
?
nevermind, all the database files are in data folder
I understand that from the agent's point of view, I just need to update the conf file to use the new credentials and the new server address.
but afaik this only works locally and not if you run your task on a clearml-agent!
Isn't the agent using the same clearml.conf ?
We have our agents running tasks and uploading everything to the cloud. As I said, we don't even have the file server running
I am more curious about how to migrate all the information stored in the local ClearML server to the ClearML server in the cloud
Nice ! That is handy !!
thanks !
Clear. Thanks @<1523701070390366208:profile|CostlyOstrich36> !
Sure:
def main():
    repo = "redacted"
    commit = "redacted"
    commit = "redacted"
    bands = ["redacted"]
    test_size = 0.2
    batch_size = 64
    num_workers = 12
    img_size = (128, 128)
    random_seed = 42
    epoch = 20
    learning_rate = 0.1
    livbatch_list = get_livbatch_list(repo, commit)
    lbs = download_batches(repo, commit, livbatch_list)
    df, label_map = get_annotation_df(lbs, bands)
    df_train, df_val = deterministic_train_val(df, test_size=test_siz...
following this thread, as it happens every now and then that ClearML misses some packages for some reason ...
@<1523701087100473344:profile|SuccessfulKoala55> Should I raise a github issue ?
so in your case, the clearml-agent conf contains multiple credentials, each for a different cloud storage that you potentially use?
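For reference, a minimal sketch of what such a multi-credential clearml.conf could look like (the section names follow the ClearML SDK config layout; all the account/bucket values here are placeholders, not from this thread):

```
sdk {
    azure.storage {
        containers: [
            {
                account_name: "redacted"
                account_key: "redacted"
                container_name: "redacted"
            }
        ]
    }
    aws {
        s3 {
            credentials: [
                {
                    bucket: "redacted"
                    key: "redacted"
                    secret: "redacted"
                }
            ]
        }
    }
}
```

The SDK picks the matching credential block based on the scheme and bucket/container of the destination URI.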
When I set the output URI in the client, artifacts are sent to blob storage
When file_server is set to azure:// then models/checkpoints are sent to blob storage
But there are still the plot and metric folders that are stored on the server's local disk. Is that correct?
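If it helps, the output destination can also be set globally in clearml.conf instead of per Task; a sketch, assuming an Azure container (the account and container names are placeholders):

```
sdk {
    development {
        # default destination for task models and artifacts
        default_output_uri: "azure://<account>.blob.core.windows.net/<container>"
    }
}
```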
Maybe I will play around a bit and ask more specific questions .... It's just that I cannot find much documentation on how pipeline caching works (which is the main point of pipelines?)
the configs that I mention above are the clearml.conf files for each agent
so I guess it needs to be set inside the container
please provide the full logs and error message.
I use CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/path/to/my/venv/bin/python3.12
and it works for me
Are you using the agent in docker mode?
we are not using docker compose. We are deploying in Azure with each database as a standalone service
Try setting CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=true
in the terminal before starting clearml-agent
See None
I don't think ClearML is designed to handle secrets other than git and storage credentials ...
Inside the script that launches the agent, I set all the env vars needed (i.e. disabling installation with the var above)
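For what it's worth, a sketch of such a launch script (the venv path and queue name are assumptions, not from this thread):

```shell
#!/bin/sh
# Point the agent at an existing interpreter instead of creating a fresh venv
export CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/opt/venvs/train/bin/python3.12
# Skip python package installation entirely
export CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=true
# Launch the agent if it is installed on this machine (hypothetical queue name)
if command -v clearml-agent >/dev/null 2>&1; then
    clearml-agent daemon --queue default --foreground
fi
```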
in that case, yes. What happens in docker mode:
you run a clearml-agent, which then receives a task
it creates a container
installs another agent inside that container
then runs that second agent inside the container
that second agent then pulls the task and does the usual build/install
CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=true
needs to be set on that second agent somehow ...
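One way to get it onto that second agent, if I'm not mistaken, is to have the outer agent forward it via extra docker arguments in its clearml.conf (a sketch):

```
agent {
    # forwarded to `docker run`, so the env var is set inside the container
    # where the second agent runs
    extra_docker_arguments: ["-e", "CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=true"]
}
```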
CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/path/to/my/venv/bin/python3.12 clearml-agent bla
you may want to share your config (with credentials redacted) and the full docker compose startup log?