Answered
Hi New With Clearml I Create Clearml Server On Gcp With Docker Now I’M Training Yolov5 And I Want To Save All The Info (Model And Metrics ) With Clearml To My Bucket.. (So I Can Have Small Server And No Memory Issue ) Where Should I Start? Its Should Be C

Hi, I'm new to ClearML.
I created a ClearML server on GCP with Docker.
Now I'm training YOLOv5 and I want to save all the info (model and metrics) with ClearML to my bucket
(so I can keep the server small and avoid memory issues).
Where should I start? Should this be configured on the ClearML server,
or in clearml.conf on the client (where the YOLO training runs)?

Thanks for the help 🙏

  
  
Posted 2 years ago

Answers 27


Hi AstonishingRabbit13

now I’m training yolov5 and i want to save all the info (model and metrics ) with clearml to my bucket..

The easiest thing (assuming you are running YOLOv5 with python train.py) is to add the following env variable:
CLEARML_DEFAULT_OUTPUT_URI=" " python train.py
Notice that you need to pass your GS credentials here:
https://github.com/allegroai/clearml/blob/d45ec5d3e2caf1af477b37fcb36a81595fb9759f/docs/clearml.conf#L113
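The same environment variable can also be set from Python when the training is launched as a subprocess. This is only a minimal sketch: the bucket URI `gs://my-bucket/clearml` is a hypothetical placeholder (the real one is not shown in this thread), the helper names are made up for illustration, and `train.py` is assumed to be in the working directory.

```python
import os
import subprocess
import sys

# Hypothetical placeholder: replace with your own bucket URI.
OUTPUT_URI = "gs://my-bucket/clearml"


def build_training_env(output_uri, base_env=None):
    """Return a copy of the environment with CLEARML_DEFAULT_OUTPUT_URI set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CLEARML_DEFAULT_OUTPUT_URI"] = output_uri
    return env


def run_training(script="train.py", output_uri=OUTPUT_URI):
    """Launch the training script with the output URI in its environment."""
    env = build_training_env(output_uri)
    return subprocess.run([sys.executable, script], env=env, check=True)
```

Calling `run_training()` from the YOLOv5 checkout should behave like the one-line shell command above. The GS credentials still have to be configured in clearml.conf either way.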

  
  
Posted 2 years ago

Hi AgitatedDove14 thanks for the help!
I ran it now, and at the end the task uploaded the model to the bucket for me:
clearml.Task - INFO - Completed model upload to gs://
But when I check the bucket I can see only the final model. Do you know how I can save all the logs and all the metric images?

Thanks

  
  
Posted 2 years ago

do you know how can i save all the logs and all the metric images?

These are stored on the clearml-server, no? What am I missing?

  
  
Posted 2 years ago

Yes, they are on the clearml-server now.

I would like to have them saved on the bucket as well, so that:

  1. I save space on the clearml server
  2. I have the model + all its info in one place on the bucket
  
  
Posted 2 years ago

i would like to have it also save on the bucket

Oh, if this is the case, you can just change the clearml files server to point to the GS bucket, and everything will be stored there.
Just change your clearml.conf:
files_server: " "
https://github.com/allegroai/clearml/blob/d45ec5d3e2caf1af477b37fcb36a81595fb9759f/docs/clearml.conf#L10
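To make the two clearml.conf settings mentioned so far concrete (the files server at #L10 and the Google Storage credentials at #L113), here is a small Python sketch that renders the relevant fragment. The key names follow the linked sample clearml.conf as best I recall, so check them against that file before using them. The bucket URI, project name, and JSON path are hypothetical, and `render_clearml_conf` is just an illustrative helper.

```python
def render_clearml_conf(files_server, project, credentials_json):
    """Render only the clearml.conf keys discussed here, as a HOCON string."""
    return (
        "api {\n"
        f'    files_server: "{files_server}"\n'
        "}\n"
        "sdk {\n"
        "    google.storage {\n"
        f'        project: "{project}"\n'
        f'        credentials_json: "{credentials_json}"\n'
        "    }\n"
        "}\n"
    )


# Hypothetical values, for illustration only.
print(render_clearml_conf(
    files_server="gs://my-bucket/clearml",
    project="my-gcp-project",
    credentials_json="/path/to/credentials.json",
))
```

An existing clearml.conf already has `api` and `sdk` sections with other keys, so these values would be merged into those sections rather than pasted over them.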

  
  
Posted 2 years ago

Thanks, I can see the files on the bucket now.
I also saw an error at the end of the training:
clearml.storage - ERROR - Failed uploading: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url... (Caused by SSLError(SSLError(1, '[SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac
and on the clearml server some images in the PLOTS tab are missing.
Is there something else in the conf that I should change?

again thanks a lot for the help!

  
  
Posted 2 years ago

is there something else in the conf that i should change ?

I'm assuming the google credentials?
https://github.com/allegroai/clearml/blob/d45ec5d3e2caf1af477b37fcb36a81595fb9759f/docs/clearml.conf#L113

  
  
Posted 2 years ago

I had that. In the end the files were uploaded,
just some were missing on the clearml server.

  
  
Posted 2 years ago

AstonishingRabbit13 so is it working now ?

  
  
Posted 2 years ago

In the bucket I can see all the files now!
But on the clearml server, when I go into the training task, some of the plots are missing,
like the Confusion Matrix.

For now I think I'm OK:
the scalars seem to be right, and the metrics there are what's important for me.

The upload error is weird.

again thanks for the help!!

  
  
Posted 2 years ago

the error for uploading is weird

wait, are you still getting this error?

  
  
Posted 2 years ago

yes

  
  
Posted 2 years ago

AstonishingRabbit13
https://github.com/googleapis/google-cloud-python/issues/4941#issuecomment-369472576
Check the OpenSSL version and the system date; this looks like a low-level SSL error (even before authentication).
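A quick way to gather those two facts from inside the same Python environment that runs the training is sketched below. It uses only the standard library; `ssl_environment_report` is a name made up for this example.

```python
import ssl
import sys
from datetime import datetime, timezone


def ssl_environment_report():
    """Collect the OpenSSL build Python uses, the Python version, and the clock."""
    return {
        "openssl_version": ssl.OPENSSL_VERSION,
        "python_version": sys.version.split()[0],
        "utc_now": datetime.now(timezone.utc).isoformat(),
    }


for key, value in ssl_environment_report().items():
    print(f"{key}: {value}")
```

Note that `ssl.OPENSSL_VERSION` reports the OpenSSL library the Python interpreter was built against, which can differ from the system `openssl` binary, so upgrading the system package does not necessarily change what Python uses.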

  
  
Posted 2 years ago

(Caused by SSLError(SSLError(1, '[SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac
Where is the code running: on a GCP instance (agent), or on your machine?

  
  
Posted 2 years ago

On my GCP instance.
I tried upgrading OpenSSL and still got the error.

  
  
Posted 2 years ago

Could it be that you have some custom SSL certificate or policy installed?
Can you reach other HTTPS sites (for example your clearml-server)?
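One way to run that check from the training machine is a plain TLS handshake with certificate verification, sketched here with the standard library only. `check_https` is an illustrative helper, not part of clearml, and a successful handshake does not rule out errors that only show up during large uploads.

```python
import socket
import ssl


def check_https(host, port=443, timeout=5.0):
    """Try a verified TLS handshake; return (ok, detail)."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                # On success, report the negotiated protocol version.
                return True, tls.version()
    except OSError as exc:
        # ssl.SSLError, timeouts, and DNS failures are all OSError subclasses.
        return False, f"{type(exc).__name__}: {exc}"
```

For example, `check_https("storage.googleapis.com")` tests the host from the error message, and the same call with your clearml-server hostname tests the other side. A certificate or policy problem would typically show up as an `SSLCertVerificationError` in the returned detail.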

  
  
Posted 2 years ago

It's very weird.
The training does upload the files to the bucket,
e.g.:
clearml.Task - INFO - Completed model upload to gs://...
clearml.Task - INFO - Finished uploading
so I do get a log message that the model was uploaded.
As I said, I can also see all the files in the bucket (model + metrics).
The only thing missing is some plots on the clearml server (app): when I go to the details of the training task I cannot see the confusion matrix, for example (but it exists on the bucket).

I thought the logged error might be related to that.

  
  
Posted 2 years ago

the only thing that missing is some plots on the clearml server (app ) when i got to the details of the train i cannot see the matrix confusion for example ( but its exists on the bucket )

How do you report the "matrix confusion" ? (I might have an idea on what's the difference)

  
  
Posted 2 years ago

I'm using the yolov5 repo:
https://github.com/ultralytics/yolov5
As I understand it, it should log everything at the end.

  
  
Posted 2 years ago

its should logged all in the end as I understand

Hmm let me check the code for a minute

  
  
Posted 2 years ago

The confusion matrix shows under debug sample, but the image is empty, is that correct?

  
  
Posted 2 years ago

On the clearml server I can see only:
F1-Confidence Curve, Precision-Confidence Curve, Precision-Recall Curve, Recall-Confidence Curve,
but on the bucket I can see all the rest.

  
  
Posted 2 years ago

Are you saying that in the UI you do not see "confusion matrix" at all, only on the GS bucket ?

  
  
Posted 2 years ago

Yes.
When I change the files_server back to the clearml server (saving locally), I can see everything.

  
  
Posted 2 years ago

And you are seeing a bunch of the GS SSL errors?

  
  
Posted 2 years ago

No,
just the error line I mentioned.

  
  
Posted 2 years ago

This is very odd. Can you also post the file names here? Maybe an odd character is causing it.
Can you also test it with the latest clearml version (1.8.0)?
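If it helps with collecting the file names, a small script like the one below can flag characters that might be worth a second look. What counts as "odd" here (anything outside ASCII letters, digits, and `._-/`) is purely my assumption, and the object names are hypothetical; the thread does not establish that a character is actually the cause.

```python
import string

# Assumption: ASCII letters, digits, and a few separators are treated as "safe".
SAFE_CHARS = set(string.ascii_letters + string.digits + "._-/")


def odd_characters(name):
    """Return the distinct characters in `name` outside the safe set, in order."""
    seen = []
    for ch in name:
        if ch not in SAFE_CHARS and ch not in seen:
            seen.append(ch)
    return seen


# Hypothetical object names, for illustration only.
names = [
    "plots/F1_curve.png",
    "plots/Confusion Matrix.png",
    "plots/P_curve#1.png",
]
for name in names:
    print(repr(name), "->", odd_characters(name))
```

Running it on the real list of names from the bucket would show whether the plots missing from the UI share a character (such as a space) that the visible ones do not.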

  
  
Posted 2 years ago