
Unrelated problem (or is it?): ClearML's built-in Cleanup Service fails

clearml.utilities.pyhocon.exceptions.ConfigMissingException: 'No configuration setting found for key project'
Leaving process id 4606
DONE: Running task '7a3080a1bc634e6aa41a0d0874d638fc', exit status 1
Process failed, exit code 1

  
  
Posted 2 years ago

Answers 31


to fix it, I excluded this var entirely from the docker-compose

  
  
Posted 2 years ago

image

  
  
Posted 2 years ago

👍

  
  
Posted 2 years ago

AgitatedDove14 I still can't get it to work... I couldn't figure out how to change the clearml version in the runtime of the Cleanup Service, as I'm not in control of the agent that executes it

  
  
Posted 2 years ago

AgitatedDove14
So I couldn't kill the services agent myself (permission denied, I'm not sudo). What I did was run docker-compose down, comment out only the GOOGLE_APPLICATION_CREDENTIALS environment variable from the clearml services agent service, and bring the docker-compose up again. I enqueued the Cleanup Service and now it works. Really weird, it looks like setting GOOGLE_APPLICATION_CREDENTIALS causes an error even though I'm 100% sure it is not used for storage.

Are you certain you have no artifacts on GS?
Are you saying that if GOOGLE_APPLICATION_CREDENTIALS is set and clearml.conf contains no "project" section, it crashes when starting?

100% sure no artifacts are on GS. Not sure what you are asking in the second line here. The only place I have ever set GOOGLE_APPLICATION_CREDENTIALS is as an environment variable when launching agents (on other queues, not the services queue) and on the clients only for the sake of using BigQuery
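
For anyone debugging the same thing, here is a minimal sketch for checking what the services agent environment actually sees for GOOGLE_APPLICATION_CREDENTIALS. It uses only the Python standard library; the function name describe_gcp_credentials is hypothetical and not part of clearml. It only reports whether the variable is set and points to a readable JSON file, and it does not validate the credentials themselves.

```python
import json
import os


def describe_gcp_credentials(env=None):
    """Return a short status string for GOOGLE_APPLICATION_CREDENTIALS.

    `env` defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "unset"
    if not os.path.isfile(path):
        return "set, but file not found"
    try:
        # Only checks that the file parses as JSON, nothing more.
        with open(path, "r", encoding="utf-8") as fh:
            json.load(fh)
    except (OSError, ValueError):
        return "set, but not readable as JSON"
    return "set, readable JSON file"


if __name__ == "__main__":
    print("GOOGLE_APPLICATION_CREDENTIALS:", describe_gcp_credentials())
```

Running this inside the container that executes the Cleanup Service (rather than on a client machine) should show whether the variable is really present there.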

  
  
Posted 2 years ago

I still can't get it to work... I couldn't figure out how to change the clearml version in the runtime of the Cleanup Service, as I'm not in control of the agent that executes it

Let's take a step back. Let's remove the clearml-services from the docker compose for a second, and run it manually (then you can control everything). Once you have it running manually, let's try to replicate the setup back to the docker compose, make sense ?

  
  
Posted 2 years ago

Yes I do have a GOOGLE_APPLICATION_CREDENTIALS environment variable set, but nowhere do we save anything to GCS. The only usage is in the code which reads from BigQuery

Are you certain you have no artifacts on GS?
Are you saying that if GOOGLE_APPLICATION_CREDENTIALS is set and clearml.conf contains no "project" section, it crashes when starting?

  
  
Posted 2 years ago

How about I manually kill the services agent and launch one myself?

Makes sense 🙂

  
  
Posted 2 years ago

No absolutely not. Yes I do have a GOOGLE_APPLICATION_CREDENTIALS environment variable set, but nowhere do we save anything to GCS. The only usage is in the code which reads from BigQuery

  
  
Posted 2 years ago

I don't think the problem is simply setting that variable. I think it is related, but not in an obvious way, because it did work for me in the past. Since then we ran docker-compose up/down a few times, changed some other things, etc... Can't figure out what made it get to this point

  
  
Posted 2 years ago

Can't figure out what made it get to this point

I "think" this has something to do with loading the configuration and setting up the "StorageManager".
(in other words setting the google.storage)... Or maybe it is the lack of google storage package?!
Let me check

  
  
Posted 2 years ago

I'm glad you were able to solve the issue!
WackyRabbit7 I could not reproduce it, what did you pass in "GOOGLE_APPLICATION_CREDENTIALS" ?

  
  
Posted 2 years ago

the path to the JSON file

  
  
Posted 2 years ago

The google storage package could be the cause, because indeed we have the env var set, but we don't use the google storage package

  
  
Posted 2 years ago

AgitatedDove14 sorry for delayed reply - where do I read the version the Cleanup Service is using?

  
  
Posted 2 years ago

AgitatedDove14 clearml version on the Cleanup Service is 0.17.0

  
  
Posted 2 years ago

🤔 is the "installed packages" part editable? good to know

Isn't it a bit risky manually changing a package version? what if it won't be compatible with the rest?

  
  
Posted 2 years ago

I assume it has nothing to do with my client version

  
  
Posted 2 years ago

Edit the cloned version and enqueue it?

  
  
Posted 2 years ago

Will try this out and report

  
  
Posted 2 years ago

In the Task log itself it will say the version of all the packages. Basically, I wonder whether it is using an older clearml version, and that is why I cannot reproduce it..
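
Besides reading the Task log, the version can also be checked from inside the Python environment that runs the task. A minimal sketch, assuming you can run a snippet in that environment; the helper name installed_version is hypothetical, and the pkg_resources fallback is there because the log in this thread shows a Python 3.6 venv, where importlib.metadata is not available.

```python
def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        # Standard library on Python 3.8 and later.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Fallback for older interpreters, provided by setuptools.
        import pkg_resources

        try:
            return pkg_resources.get_distribution(package).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("clearml:", installed_version("clearml"))
```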

  
  
Posted 2 years ago

is the "installed packages" part editable? good to know

Of course it is, when you clone a Task everything is Editable 🙂

Isn't it a bit risky manually changing a package version?

worst case it will crash quickly, and you reset/edit/enqueue 🙂
(Should work though)
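
If you would rather prepare the edited package list as text before pasting it into the cloned Task, here is a small sketch. It assumes the "installed packages" section is a pip-style list with lines such as `clearml == 0.17.0`; the helper name pin_requirement is hypothetical, and the new version is whatever you choose to pass in. Lines with extras (for example `clearml[...]`) are deliberately left untouched.

```python
import re


def pin_requirement(requirements_text, package, new_version):
    """Rewrite the line for `package` to `package == new_version`.

    All other lines, including packages that merely share the prefix
    (e.g. clearml-agent), are returned unchanged.
    """
    pattern = re.compile(
        r"^\s*" + re.escape(package) + r"\s*(==|>=|<=|~=|!=|>|<|$)",
        re.IGNORECASE,
    )
    out = []
    for line in requirements_text.splitlines():
        if pattern.match(line):
            out.append("{} == {}".format(package, new_version))
        else:
            out.append(line)
    return "\n".join(out)
```

The result can then be pasted over the original package list in the cloned Task before enqueuing it.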

  
  
Posted 2 years ago

BTW from the log you attached:

File "/root/.clearml/venvs-builds/3.6/lib/python3.6/site-packages/clearml/storage/helper.py", line 218, in StorageHelper
_gs_configurations = GSBucketConfigurations.from_config(config.get('google.storage', {}))

This means it tries to remove an artifact from a Task; that artifact is probably in GS (I'm assuming so because it is using the GS API), and the cleanup service is missing the GS configuration.
WackyRabbit7 is that possible ?

  
  
Posted 2 years ago

I'm saying that because in the task under "INSTALLED PACKAGES" this is what appears

This is exactly what I was looking for. Thanks!
Yes that makes sense, I think this bug was fixed a long time ago, and this is why I could not reproduce it.
I also think you can use a later version of clearml 🙂

  
  
Posted 2 years ago

so it is not defined

  
  
Posted 2 years ago

How can I change the version of the Cleanup Service?

  
  
Posted 2 years ago

Let's take a step back. Let's remove the clearml-services from the docker compose for a second, and run it manually (then you can control everything). Once you have it running manually, let's try to replicate the setup back to the docker compose, make sense ?

I'd prefer not to run docker-compose down, as researchers are actively working on it. How about I manually kill the services agent and launch one myself?

  
  
Posted 2 years ago

AgitatedDove14 ?

  
  
Posted 2 years ago

Very odd, I still can't reproduce. This is just the cleanup service running without anything else ?
What's the clearml version it is using ?

  
  
Posted 2 years ago

to fix it, I excluded this var entirely from the docker-compose

Makes sense.

the path to the JSON file

Yep, that's what I did and things seem to work... Let me check again if I missed anything

  
  
Posted 2 years ago