Hi @<1523701987055505408:profile|WittyOwl57> ,
For the files_server, this controls the upload of debug_images, and would be either the fileserver address (like you have now, I assume) or some object storage, like None for example.
For setting an object storage for models and artifacts, you would need to set up the default_output_uri field in the clearml.conf file.
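For reference, a rough sketch of how those two fields could look in clearml.conf when pointing at Azure blob storage (the storage account and container names below are placeholders, not values from this thread):
api {
    # debug images (and default fileserver uploads) go here
    files_server: "azure://mystorageaccount.blob.core.windows.net/clearml"
}
sdk {
    development {
        # object storage destination for models and artifacts
        default_output_uri: "azure://mystorageaccount.blob.core.windows.net/clearml"
    }
}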
Regarding Azure setup in the WebApp, @<1523701070390366208:profile|CostlyOstrich36> do you have some real-world examples?
Sorry to ping you @<1523701087100473344:profile|SuccessfulKoala55>, can you offer any ideas on the two questions from my reply (about the correct web app cloud access and the correct way to specify a blob storage in the clearml.conf file)? Thanks 🙏
Hi @<1523701087100473344:profile|SuccessfulKoala55> ,
thanks for the pointers.
I didn't know that the plot data is stored in Elasticsearch. Good to know. It relates to the rest of my questions in that I want to understand where everything is saved, all the parts of my experiments. The plots are actually the most important part, since I have direct access to the artifacts I save (like, say, models) but not to the plot data, which helps me compare and rank experiments. I mention TensorBoard because that's what's producing the traces. I'm still not sure if ClearML is actually storing plot data inside Elasticsearch or simply linking to TensorBoard's tfevent files.
I still have no idea what the correct way is to set up access to the blob storage. Again, writing from the SDK is fine, retrieving from the WebUI is not. As described in the first screenshot, for the first two fields of web app cloud access, "bucket" and "key", the values written by ClearML are "azure", which can't be right. My question is: what real-world Azure concepts do these two names relate to, so I can make a better guess at what the correct values for them might be (or simply a working example would be amazing)?
Regarding the file_server configuration, would you be so kind as to give an example? I couldn't find anything. The way I managed to get artifact upload working was by using the output_uri in Task.init.
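i.e. roughly something like this (the project/task names and the container path here are just placeholders):
from clearml import Task

task = Task.init(
    project_name="my_project",
    task_name="my_experiment",
    # models and artifacts get uploaded to this object storage location
    output_uri="azure://mystorageaccount.blob.core.windows.net/clearml",
)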
What I have in my config right now for api.files_server is:
api {
    files_server:
}
Imagine my confusion 😭
Thanks again for your advice.
Hi @<1523701987055505408:profile|WittyOwl57> ,
is the plot data stored in mongo, or does mongo just store some links?
Plot data is stored in the Elasticsearch database. I am not sure how this relates to the rest of your questions, as they pertain to images 🙂
If I, say, copy the clearml data directory from the existing machine to a different location, and copy the tensorboard data to the same absolute path, will that work?
That should work. I'm not sure why TensorBoard is mentioned here, but if you're talking about the fileserver storage folder, then it would work.
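As a rough sketch, assuming the default docker-compose deployment that keeps the server data under /opt/clearml (adjust the paths if your deployment differs):
# stop the server so the databases are in a consistent state
docker-compose down
# copy the whole data directory (fileserver uploads, mongo, elasticsearch)
# to the same absolute path on the new machine
rsync -a /opt/clearml/ new-machine:/opt/clearml/
# then bring the server up on the new machine with the same docker-compose file
docker-compose up -d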
In the future I want to avoid this problem (of having to move experiment tracking, merge experiments from different machines, etc.). What is the best practice for that? I would try to store everything (including actual experiment data, e.g. TensorBoard, logs, etc.) in blob storage, but I don't think that is possible.
Scaling should not be an issue. Using blob storage for uploaded artifacts and images is part of the system's design (by using the files_server and output_uri configuration options on the client side). Anything else can be handled by scaling up the databases, which as you mentioned should not be an issue.
This is how the links to the artifacts look (the part I blurred out is the last part of the secret, which is working fine since the task was able to upload those correctly to storage, I can check that):