Hi, I'm trying to set StorageManager to use our internal MinIO installation but I ran into this issue with this testing code:

Answered

Hi,
I'm trying to set StorageManager to use our internal MinIO installation, but I ran into this issue with this testing code:
` from trains import Task, StorageManager

task = Task.init(project_name="tests", task_name="storage test")
local_iris_pkl = StorageManager.get_local_copy(remote_url="s3://localhost:9000/trains/dataset_test") `
The result is:
` trains.storage - ERROR - Could not download s3://localhost:9000/trains/dataset_test , err: SSL validation failed for [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1108) `
I think this is related to the fact that the S3 config set by the injector in the code always sets secure=True.

Just to be sure the issue is not in MinIO itself, I ran:
` import boto3

s3 = boto3.resource('s3',
                    endpoint_url=' ',
                    aws_access_key_id='DEMOaccessKey',
                    aws_secret_access_key='DEMOsecretKey')
s3.Bucket('trains').download_file('dataset_test.csv', '/tmp/dataset_test.csv') `
and this works like a charm.

How can I fix this?

  				
Posted 4 years ago by JuicyFox94


Answers 23

Is an implementation of this kind interesting for you, or do you suggest I fork? I mean, I don't want to take up your time reviewing.

  				
Posted 4 years ago by JuicyFox94

(probably it's not possible)

  				
Posted 4 years ago by JuicyFox94

As usual, it starts small and after a 5-minute discussion it gets challenging 😄 I love this stuff... Let me think about it a bit and I will get back to you ASAP.

  				
Posted 4 years ago by JuicyFox94

Hi Martin 😄 OK, got it, but now the question: how can I pass this to the trains-agent deployed with the Helm chart?

  				
Posted 4 years ago by JuicyFox94

https://github.com/allegroai/trains-server-k8s/pull/13 I think you will like it 😄

  				
Posted 4 years ago by JuicyFox94

About the Helm chart: yes, I mean adding the capability of managing a ConfigMap with the config file. If it's interesting I can raise a PR, otherwise I need to fork 😄

  				
Posted 4 years ago by JuicyFox94

I think this is great! That said, it only applies when you are spinning up agents (the default Helm chart is for the server). So maybe we need another one? Or an option?

  				
Posted 4 years ago by AgitatedDove14

OK, so it's time to create a ConfigMap with the entire file 😄

  				
Posted 4 years ago by JuicyFox94
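For readers following along, here is a minimal sketch of what "a ConfigMap with the entire file" could look like when built programmatically. It is only an illustration, not the chart's actual template: the ConfigMap name `trains-agent-conf`, the `trains.conf` data key, and the `build_configmap` helper are hypothetical choices, and the config text is a shortened placeholder.

```python
import json


def build_configmap(name, conf_text, namespace="default"):
    """Build a Kubernetes ConfigMap manifest (as a dict) holding a whole config file.

    The data key "trains.conf" is an assumed file name for illustration.
    """
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": {"trains.conf": conf_text},
    }


# Shortened placeholder config; a real file would carry the full overrides.
conf_text = 'sdk { aws { s3 { credentials: [ { host: "minio.minio:9000", secure: false } ] } } }'

# "trains-agent-conf" is a hypothetical name, not one defined by the Helm chart.
manifest = build_configmap("trains-agent-conf", conf_text)
print(json.dumps(manifest, indent=2))
```

Kubernetes accepts manifests in JSON as well as YAML, so the printed output could be saved to a file and applied as-is; mounting it into the agent pod is left to the chart.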

JuicyFox94
NICE!!! this is exactly what I had in mind.
BTW: you do not need to put the default values there. Basically, it reads the defaults from the package itself (trains-agent/trains) and uses the conf file as overrides, so this section only needs to contain the parts that are important (like cache location, credentials, etc.).

  				
Posted 4 years ago by AgitatedDove14
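The "defaults plus overrides" behaviour described above can be pictured as a recursive dictionary merge. The sketch below is only an illustration of that idea under stated assumptions, not the trains or trains-agent implementation: `merge_overrides` is a hypothetical helper, and the dictionaries are hand-written stand-ins for the parsed config files.

```python
import copy


def merge_overrides(defaults, overrides):
    """Return a new dict where values from `overrides` replace or extend `defaults`.

    Nested dicts are merged key by key; any other value is replaced wholesale.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Stand-in for the defaults shipped with the package (values assumed for illustration).
package_defaults = {
    "sdk": {
        "aws": {
            "s3": {"key": "", "secret": "", "region": ""},
            "boto3": {"pool_connections": 512, "max_multipart_concurrency": 16},
        }
    }
}

# Stand-in for a short user conf file that only sets what matters.
conf_overrides = {
    "sdk": {"aws": {"s3": {"credentials": [{"host": "minio.minio:9000", "secure": False}]}}}
}

effective = merge_overrides(package_defaults, conf_overrides)
```

With this model, a conf file that only lists the MinIO credentials still ends up with the boto3 pool settings from the defaults.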

Our data engineers write code directly in PyCharm and test it on the fly with breakpoints. When it's good, we simply commit to git and set a "prod ready" tag.

  				
Posted 4 years ago by JuicyFox94

Nice!
Is trainsConfig a pure text blob?

  				
Posted 4 years ago by AgitatedDove14

Sounds good to me 🙂

  				
Posted 4 years ago by AgitatedDove14

AgitatedDove14 trainsConfig is totally optional and you can put the config file itself in it, e.g.:
trainsConfig: |-
  sdk {
    aws {
      s3 {
        key: ""
        secret: ""
        region: ""
        credentials: [
          {
            host: "minio.minio:9000"
            key: "DEMOaccessKey"
            secret: "DEMOsecretKey"
            multipart: false
            secure: false
            region: ""
          }
        ]
      }
      boto3 {
        pool_connections: 512
        max_multipart_concurrency: 16
      }
    }
    development {
      default_output_uri: "s3://minio.minio:9000/trains/"
    }
  }

  				
Posted 4 years ago by JuicyFox94

at that point we define a queue and the agents will take care of training

This is my preferred way as well :)

  				
Posted 4 years ago by AgitatedDove14

At that point we define a queue and the agents will take care of training 😄

  				
Posted 4 years ago by JuicyFox94

The easiest is to pass an entire trains.conf file

  				
Posted 4 years ago by AgitatedDove14

Notice that the StorageManager has default configuration here:
https://github.com/allegroai/trains/blob/f27aed767cb3aa3ea83d8f273e48460dd79a90df/docs/trains.conf#L76
Then a per-bucket credentials list, with details:
https://github.com/allegroai/trains/blob/f27aed767cb3aa3ea83d8f273e48460dd79a90df/docs/trains.conf#L81

  				
Posted 4 years ago by AgitatedDove14
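To make the "default configuration, then per-bucket credentials" idea concrete, here is a rough sketch of how such a lookup might behave. It is an assumption-laden illustration, not the library's actual code: `resolve_s3_settings` and `endpoint_for` are hypothetical helpers, the default of `secure: True` is assumed from the behaviour described in the question, and matching on the `host:port` part of the URL is a simplification.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the default s3 section of trains.conf.
DEFAULT_S3 = {"key": "", "secret": "", "secure": True}

# Hypothetical stand-in for the per-host credentials list.
PER_HOST_CREDENTIALS = [
    {
        "host": "localhost:9000",
        "key": "DEMOaccessKey",
        "secret": "DEMOsecretKey",
        "secure": False,
    },
]


def resolve_s3_settings(url, default=DEFAULT_S3, credentials=PER_HOST_CREDENTIALS):
    """Pick the settings for a URL: a matching per-host entry wins over the defaults."""
    host = urlparse(url).netloc
    for entry in credentials:
        if entry.get("host") == host:
            return {**default, **entry}
    return dict(default)


def endpoint_for(url):
    """Build the HTTP(S) endpoint implied by the resolved `secure` flag."""
    settings = resolve_s3_settings(url)
    scheme = "https" if settings["secure"] else "http"
    return f"{scheme}://{urlparse(url).netloc}"
```

Under these assumptions, a host without a matching entry falls back to an HTTPS endpoint, which is consistent with the SSL error seen against a plain-HTTP MinIO server, while the entry with `secure: False` yields a plain HTTP endpoint.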

basically a new helm chart 😄

  				
Posted 4 years ago by JuicyFox94

an implementation of this kind is interesting for you or do you suggest to fork

You mean adding a config map storing a default trains.conf for the agent?

  				
Posted 4 years ago by AgitatedDove14


Hi JuicyFox94
you pointed to exactly the issue 🙂
In your trains.conf
https://github.com/allegroai/trains/blob/f27aed767cb3aa3ea83d8f273e48460dd79a90df/docs/trains.conf#L94

  				
Posted 4 years ago by AgitatedDove14

It is way too much to pass in an environment variable 😞

  				
Posted 4 years ago by AgitatedDove14

Yes 🙂
BTW: do you guys do remote machine development (i.e. Jupyter / vscode-server) ?

  				
Posted 4 years ago by AgitatedDove14
