We would like to have easy/cheaper access to the artifacts etc. that will be output from the experiments
I see, but you have two different points in the system that use the storage:
Where the experiment is being executed - that can be your machine, or a remote machine (using ClearML Agent) - you need to make sure clearml.conf contains the correct credentials there in order to be able to upload the data to Google Storage. However, if you're not running the experiment in k8s, there's no need to configure these credentials there.
Where you view your experiment (and artifacts, and possibly download them) - that's the browser, which is running on your own machine in any case. There are different ways to make sure the ClearML WebApp can access your files (in the Google Storage case, I believe that's being handled automatically when your browser is signed into your Google account, and your user has the required permissions in Google Storage). This will not involve clearml.conf, as the browser is running the WebApp in a sandbox and it has no access to local files on your machine.
Hi,
Thx for your response,
Yes - we are using the above repo.
We would like to have easy/cheaper access to the artifacts etc. that will be output from the experiments
But this is not on the pods, is it? We're talking about the python code running from COLAB or locally...?
correct - but where is the clearml.conf file?
But this is not on the pods, is it? We're talking about the python code running from COLAB or locally...?
Hi SuccessfulKoala55
Thx again for your help
in case of the google colab, the values can be provided as environment variables
We still need to run the code in a colab environment (or remote client)
do you have any example for setting the environment variables?
For a general environment variable there is an example: export MPLBACKEND=TkAgg
But what would it be for the clearml.conf?
For retrieving we can use config_obj.get('sdk.google'), but how would the setting work? We did not manage to get it working with config_obj.set_overrides()
Also, why would you need the Google Storage support in the ClearML server?
That's the most recent and up-to-date k8s/helm support
Another option - copy your clearml.conf from the drive:
from google.colab import drive
drive.mount("/content/drive")
!cp /content/drive/My\ Drive/clearml.conf ~
Hi
you will have to configure the credentials there (in a local clearml.conf or using environment variables)
This is the part that confuses me - is there a way to configure clearml.conf from the values.yaml? I would like the GKE to load the cluster with the correct credentials without logging into the pods and manually updating the clearml.conf file
Hi OutrageousSheep60 , did you use https://github.com/allegroai/clearml-helm-charts ?
Thx again for your time
Happy to be of assistance 🙂
running the python code from COLAB or locally
Since the Python code using the ClearML SDK is the one uploading stuff to the storage, you will have to configure the credentials there (in a local clearml.conf or using environment variables). The ClearML Server itself running in GKE (so I understand) doesn't need the credentials since it will never try to access the storage - it only holds links to the storage.
I was able to view the artifact directly (and not through the WebApp) in the bucket - is it possible to do so in ClearML?
Well, if you have some way of browsing the bucket conveniently, you can of course see the artifact there. It's just a file, stored under a directory structure with the project name and the task ID
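To make the directory structure a bit more concrete, here is a small sketch that builds the bucket prefix where a task's artifacts would be expected. The exact layout (`<output_uri>/<project>/<task name>.<task id>/artifacts/`) is an assumption based on how uploads usually appear in a bucket, and `artifact_prefix` is a hypothetical helper, not part of the ClearML SDK, so check it against your own bucket before relying on it.

```python
def artifact_prefix(output_uri, project_name, task_name, task_id):
    """Return the prefix under which a task's artifacts are assumed to be stored.

    Assumed layout: <output_uri>/<project>/<task name>.<task id>/artifacts/
    """
    base = output_uri.rstrip("/")
    return f"{base}/{project_name}/{task_name}.{task_id}/artifacts/"


# Placeholder values, for illustration only.
prefix = artifact_prefix("gs://my-bucket/", "new_prj", "experiment1", "abc123")
print(prefix)
```

With such a prefix you can point any bucket browser (or `gsutil ls`) at the right folder without going through the WebApp.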
Thx again for your time -
Where the experiment is being executed
Not sure I understand what you mean by this -
Assuming that we are running the ClearML on GKE (we have succeeded doing so) - and running the python code from COLAB or locally. Where do we configure the Google Storage? How can the helm / k8s dynamically load the clearml.conf? Is it only from values.yaml?
Where you view your experiment
In mlflow I was able to view the artifact directly (and not through the WebApp) in the bucket - is it possible to do so in ClearML?
The extra configurations in the diagram are server configurations. The storage settings are always client configurations
Unless it's required by an agent you're spinning up alongside the server?
Just for the record - I guess there is an option to use os.environ
https://github.com/allegroai/clearml/blob/ca7909f0349b255f7edca0500878a8e08f3b1c99/clearml/automation/auto_scaler.py#L152-L157
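For the environment variable route, a minimal sketch of setting the values from a notebook cell before the ClearML SDK is imported. The variable names below are the ones I believe the SDK reads (treat that as an assumption and check the ClearML docs); all hosts and keys are placeholders, and `export_clearml_env` is just a hypothetical helper.

```python
import os

# Assumption: these are the environment variable names the ClearML SDK reads.
# Every value is a placeholder, not a real endpoint or key.
CLEARML_ENV = {
    "CLEARML_WEB_HOST": "https://app.example.com",
    "CLEARML_API_HOST": "https://api.example.com",
    "CLEARML_FILES_HOST": "https://files.example.com",
    "CLEARML_API_ACCESS_KEY": "<clearml USER access_key>",
    "CLEARML_API_SECRET_KEY": "<clearml USER secret_key>",
}


def export_clearml_env(values, environ=os.environ):
    """Copy the given settings into the environment and return the names that were set."""
    for name, value in values.items():
        environ[name] = value
    return sorted(values)


exported = export_clearml_env(CLEARML_ENV)
print(exported)
```

Run this before `from clearml import Task`, so the SDK sees the values when it loads its configuration.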
Also (sorry for the mess 🙂 ) - see this - https://clear.ml/docs/latest/docs/guides/ide/google_colab
Are we supposed to use the "Extra Configurations" from the https://clear.ml/docs/latest/assets/images/ClearML_Server_Diagram-7ea19db8e22a7737f062cce207befe38.png ?
https://docs.google.com/drawings/d/11f-AWVmIq7P0e8bP5OnMUz0hguXm2T_Xqq7iNMA-ANA/edit?usp=sharing
Just for the record - for whoever will be searching for a similar setup with colab
Prerequisites:
- Create a dedicated Service Account (I was not able to authenticate with regular user credentials, only with an SA)
- Get the SA key (credentials.json)
- Upload the JSON to an ephemeral location (e.g. root of colab)
- Log into the ClearML Web UI and create an access key for the user - https://clear.ml/docs/latest/docs/webapp/webapp_profile#creating-clearml-credentials

Prepare credentials:
```
%%bash
export api=$(cat <<EOF
api {
  web_server: < >
  api_server: < >
  files_server: < >
  credentials {
    "access_key" = "<clearml USER access_key>"
    "secret_key" = "<clearml USER secret_key>"
  }
}
sdk {
  google.storage {
    credentials = [
      {
        bucket: "<GCS-BUCKET>"
        # subdir: "path/in/bucket"  # Not required
        project: "<GCP PROJECT_ID>"
        credentials_json: "/content/<clearml-SA.json>"
      },
    ]
  }
}
EOF
)
echo "$api" > /root/clearml.conf
```

Client/Task setup:
```
from clearml import Task

project_name = 'new_prj'  # or whatever you want
experiment_name = 'experiment1'  # or whatever
output_uri = ' '

# When creating a new project
task = Task.init(project_name=project_name, task_name=experiment_name, output_uri=output_uri)

# When connecting to an existing project and wanting to create a new experiment
task = Task.create(project_name=project_name, task_name='experiment2')

# When connecting to an existing experiment
task = Task.get_task(project_name=project_name, task_name='experiment2')
logger = task.get_logger()
```
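If you prefer to stay in Python instead of a bash cell, the same clearml.conf can be rendered from a template. This is only a sketch: `write_clearml_conf` and the template are hypothetical helpers (not part of ClearML), all values are placeholders, and the demo writes to a temporary directory rather than to /root/clearml.conf.

```python
import tempfile
from pathlib import Path
from string import Template

# Same structure as the clearml.conf above; $names are filled in at render time.
CONF_TEMPLATE = Template("""\
api {
    web_server: "$web_server"
    api_server: "$api_server"
    files_server: "$files_server"
    credentials {
        "access_key" = "$access_key"
        "secret_key" = "$secret_key"
    }
}
sdk {
    google.storage {
        credentials = [
            {
                bucket: "$bucket"
                project: "$project"
                credentials_json: "$credentials_json"
            },
        ]
    }
}
""")


def write_clearml_conf(path, **values):
    """Render the template with the given values and write it to `path`."""
    target = Path(path).expanduser()
    target.write_text(CONF_TEMPLATE.substitute(**values))
    return target


# Demo with placeholder values, written to a throwaway directory.
demo_dir = tempfile.mkdtemp()
conf_path = write_clearml_conf(
    Path(demo_dir) / "clearml.conf",
    web_server="https://app.example.com",
    api_server="https://api.example.com",
    files_server="https://files.example.com",
    access_key="<clearml USER access_key>",
    secret_key="<clearml USER secret_key>",
    bucket="my-bucket",
    project="my-gcp-project",
    credentials_json="/content/clearml-SA.json",
)
print(conf_path.read_text())
```

In Colab you would pass "~/clearml.conf" as the path, and do it before importing clearml so the SDK picks the file up.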
Hi OutrageousSheep60 , see here: https://clear.ml/blog/q-a/google-colab-used-as-clearml-workers/
Feeling that we are nearly there ....
One more question -
Is there a way to configure ClearML to store all the artifacts and the plots etc. in a bucket instead of manually uploading/downloading the artifacts from within the client's code?
Specifying the output_uri in Task.init saves the checkpoints, what about the rest of the outputs?
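A possible way to cover more than checkpoints, sketched under assumptions: as far as I know, artifacts uploaded with task.upload_artifact follow the task's output_uri, while debug samples (images etc.) reported through the logger have their own destination that can be set with Logger.set_default_upload_destination. Scalars and plot data are, I believe, kept by the server itself rather than written as files. `gcs_uri` and `init_task_with_bucket` are hypothetical helper names and the bucket is a placeholder.

```python
def gcs_uri(bucket, subdir=""):
    """Build a gs:// URI from a bucket name and an optional sub-directory."""
    subdir = subdir.strip("/")
    return f"gs://{bucket}/{subdir}" if subdir else f"gs://{bucket}"


def init_task_with_bucket(project_name, task_name, bucket, subdir=""):
    """Create a task whose models, artifacts and debug samples target the bucket."""
    # Imported lazily so gcs_uri can be used without clearml installed.
    from clearml import Task

    uri = gcs_uri(bucket, subdir)
    task = Task.init(project_name=project_name, task_name=task_name, output_uri=uri)
    # Debug samples reported through the logger use a separate upload destination.
    task.get_logger().set_default_upload_destination(uri)
    return task


print(gcs_uri("my-bucket", "clearml/outputs"))
```

Calling init_task_with_bucket('new_prj', 'experiment1', 'my-bucket') from the client code would then replace the plain Task.init call, assuming the credentials in clearml.conf allow writing to that bucket.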
https://clear.ml/docs/latest/docs/faq#git-and-storage
The clearml.conf is a file that is located in your home folder (locally), or, in case of the google colab, the values can be provided as environment variables