
Hey all, I'm running a self-hosted k8s cluster with ClearML Server installed using the Helm chart clearml-7.2.0, and saving my artifacts in a self-hosted S3 bucket. I'm able to upload my artifacts just fine, but I want those artifacts to be deleted as well when I delete manually from the UI. I read some threads but I still cannot make it work. My ClearML version is 1.14.1-448, and I'm mounting the credentials as a ConfigMap in my apiserver deployment like this:

apiVersion: v1
data:
  services.conf: |
    storage_credentials {
      aws {
        s3 {
            use_credentials_chain: false
            credentials: [
              {
                host: "myhost:443"
                bucket: "mybucket"
                key: "key7087806"
                secret: "secret808086"
                region: on-prem
              },
            ]
        }
      }
    }
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/instance: clearml
  name: additional-configs
  namespace: clearml

is there anything else I need to do??

existingAdditionalConfigsConfigMap: "additional-configs"
  
  
Posted one month ago

Answers 19


@<1523701994743664640:profile|AppetizingMouse58> hi, when I delete a file I don't get any error and the linked task gets deleted from the ClearML DB, but the artifact is still in my bucket. I see the async_delete service in the docker compose file, but I can't find where it's run in the Helm chart.

  
  
Posted one month ago

the api server doesn't say much

[2024-03-18 13:17:45,323] [12] [INFO] [clearml.service_repo] Returned 200 for tasks.get_all_ex in 3ms
[2024-03-18 13:17:46,141] [12] [INFO] [clearml.service_repo] Returned 200 for tasks.delete_many in 791ms
[2024-03-18 13:17:46,360] [12] [INFO] [clearml.service_repo] Returned 200 for tasks.get_all_ex in 10ms
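
For anyone retracing this, here is a minimal sketch (not part of ClearML) for picking the deletion-related calls out of apiserver log lines like the ones above. It assumes the exact line format shown in this thread; the function name `find_delete_calls` and the `SAMPLE` constant are hypothetical and only there for illustration.

```python
import re

# Matches lines in the format seen in this thread, e.g.:
# [2024-03-18 13:17:46,141] [12] [INFO] [clearml.service_repo] Returned 200 for tasks.delete_many in 791ms
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[\d+\] \[(?P<level>\w+)\] \[(?P<logger>[^\]]+)\] "
    r"Returned (?P<status>\d+) for (?P<endpoint>[\w.]+) in (?P<ms>\d+)ms$"
)


def find_delete_calls(log_text):
    """Return (timestamp, endpoint, status, ms) for endpoints whose name contains 'delete'."""
    calls = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and "delete" in match["endpoint"]:
            calls.append(
                (match["ts"], match["endpoint"], int(match["status"]), int(match["ms"]))
            )
    return calls


# The three log lines quoted in this post.
SAMPLE = """\
[2024-03-18 13:17:45,323] [12] [INFO] [clearml.service_repo] Returned 200 for tasks.get_all_ex in 3ms
[2024-03-18 13:17:46,141] [12] [INFO] [clearml.service_repo] Returned 200 for tasks.delete_many in 791ms
[2024-03-18 13:17:46,360] [12] [INFO] [clearml.service_repo] Returned 200 for tasks.get_all_ex in 10ms
"""

if __name__ == "__main__":
    for call in find_delete_calls(SAMPLE):
        print(call)
```

A 200 on `tasks.delete_many` only shows that the task records were removed; it says nothing about whether the objects in the bucket were deleted, which matches what is described in this thread.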

as for the file server, is that really needed if I'm storing things in the s3 bucket??? This is the only log I get

Loading config from /opt/clearml/fileserver/config/default
Loading config from file /opt/clearml/fileserver/config/default/logging.conf
Loading config from file /opt/clearml/fileserver/config/default/fileserver.conf
Loading config from /opt/clearml/config
 * Serving Flask app 'fileserver'
 * Debug mode: off
  
  
Posted one month ago

@<1673863788857659392:profile|HomelyRabbit25> can you confirm the apiserver loads configuration from the mounted services.conf file?

  
  
Posted one month ago

@<1673863788857659392:profile|HomelyRabbit25> What happens when you delete the files from the UI? Can you please share the logs from the async_delete service? This is the service that is actually responsible for file deletion, and the S3 configuration that you prepared should be mapped into that service (not the apiserver).

  
  
Posted one month ago

I just confirmed that it's indeed loading the config from this file

  
  
Posted one month ago

hey, I can confirm that I have a file in the correct location according to this doc

root@clearml-apiserver-5cb4495f9f-2p7wg:/opt/clearml# cat /opt/clearml/config/services.conf
storage_credentials {
  aws {
    s3 {
        use_credentials_chain: false
        credentials: [
          {
            host: "machine-learningbla.com:443"
            bucket: "machine-learning-bucket"
            key: "UdifdasfBS"
            secret: "---6HAE----O"
            region: "on-prem"
            secure: true
            multipart: false
          },
        ]
    }
  }
}

Not sure how to see if it's loading from that file, I don't see the CLEARML_CONFIG_DIR env variable in my pod. I see this when the apiserver initializes

[2024-03-18 13:50:27,317] [18] [INFO] [clearml.service_repo] Loading services from /opt/clearml/apiserver/services
[2024-03-18 13:50:50,395] [18] [INFO] [clearml.service_repo] Returned 200 for debug.ping in 0ms
...

Inside directory /opt/clearml/apiserver/services I have this

root@clearml-apiserver-5cb4495f9f-276wp:/opt/clearml# ls apiserver/services
__init__.py  __pycache__  auth.py  debug.py  events.py  login  models.py  organization.py  pipelines.py  projects.py  queues.py  reports.py  server  tasks.py  users.py  utils.py  workers.py
  
  
Posted one month ago

ok this is weird, in the apiserver we should see a call for the deletion request. I need to consult with some people because I don’t think this is related to the infra config.

  
  
Posted one month ago

it would be great to get logs from the apiserver and fileserver pods when deleting a file from the UI so we can see what is going on. I’m saying this because, at first glance, I don’t see any issue in your config

  
  
Posted one month ago

if it's not needed, then why is apiserver-config in the volumes section of the asyncdelete deployment??
https://github.com/allegroai/clearml-helm-charts/blob/4ca4bc82c48a403060c1d43b93ab[…]/charts/clearml/templates/apiserver-asyncdelete-deployment.yaml

for sure, I'll open a bug

  
  
Posted one month ago

correct me if I'm wrong, the chart is missing the following

volumeMounts:  
  - name: apiserver-config
    mountPath: /opt/clearml/config

in the clearml-apiserver definition
https://github.com/allegroai/clearml-helm-charts/blob/4ca4bc82c48a403060c1d43b93ab[…]/charts/clearml/templates/apiserver-asyncdelete-deployment.yaml
I added it and now it is working
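
One way to double-check that a deployment really mounts the config directory is to inspect its manifest. The sketch below is an illustration only, not something shipped with the chart: it assumes the manifest was dumped as JSON (for example with `kubectl get deployment <name> -n clearml -o json`), and the helper name `containers_missing_mount` plus the container names in `EXAMPLE` are hypothetical. It relies only on the standard Deployment layout (`spec.template.spec.containers[].volumeMounts[].mountPath`).

```python
import json

CONFIG_DIR = "/opt/clearml/config"  # mount path from the snippet above


def containers_missing_mount(deployment, mount_path=CONFIG_DIR):
    """Return the names of containers that have no volumeMount at mount_path."""
    containers = deployment["spec"]["template"]["spec"].get("containers", [])
    missing = []
    for container in containers:
        mounts = container.get("volumeMounts") or []
        paths = {mount.get("mountPath") for mount in mounts}
        if mount_path not in paths:
            missing.append(container["name"])
    return missing


# Hypothetical, trimmed-down manifest used purely for illustration.
EXAMPLE = json.loads("""
{
  "spec": {"template": {"spec": {"containers": [
    {"name": "with-mount",
     "volumeMounts": [{"name": "apiserver-config",
                       "mountPath": "/opt/clearml/config"}]},
    {"name": "without-mount"}
  ]}}}
}
""")

if __name__ == "__main__":
    print(containers_missing_mount(EXAMPLE))
```

To run it against a real deployment, save the `kubectl ... -o json` output to a file, load it with `json.load`, and pass the resulting dict to the function; an empty list means every container mounts the path.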

  
  
Posted one month ago

I was getting an error saying credentials couldn't be found to delete objects in my s3 bucket

  
  
Posted one month ago

this one should not be needed for asyncdelete, what is the error you are getting?

  
  
Posted one month ago

it’s weird, can you pls open a bug in the clearml-helm-charts repo?

  
  
Posted one month ago

@<1673863788857659392:profile|HomelyRabbit25> We are planning to release a new version, v1.15, in a few days that will support this job in the Helm charts. Currently this option does not exist in the K8s deployment, and the apiserver does not delete task artifacts from external storage.

  
  
Posted one month ago

@<1523701994743664640:profile|AppetizingMouse58> @<1523701087100473344:profile|SuccessfulKoala55> I've been looking and I can't find any call to the async_urls_delete job in the helm chart, can you confirm this is the case? or am I confused? Thanks!

  
  
Posted one month ago

Hi @<1673863788857659392:profile|HomelyRabbit25> , yes it should include support for the async_delete service. Please provide the storage_credentials configuration to this service instead of the apiserver. To see whether the deletion works or hits any issues with the provided configuration, please inspect the logs from the async_delete pod.

  
  
Posted one month ago

hi @<1523701994743664640:profile|AppetizingMouse58> ! I saw that the new release is out, just wanted to confirm that this problem should be solved so I can make the change. I don't see anything in the changelist mentioning this issue.

  
  
Posted one month ago

awesome, thanks! I'll wait for this new version :)

  
  
Posted one month ago

I see it now, awesome thanks! I'll give it a try 🙂

  
  
Posted one month ago
Tags
aws