This just keeps getting better and better.... 🤩
So just to be clear - the file server has nothing to do with the storage?
Okay Jake, so that basically means I don't have to touch any server configuration regarding the file server on the trains server? It will simply get ignored, and all I/O initiated by clients with the right configuration will cover for it?
And once this is done, what is the file server IP good for? Will it redirect to the bucket?
No, it'll just be there 🙂 You can obviously edit your docker-compose.yml and remove it, if you'd like (although it takes close to no resources)
I just tried setting the conf in the section Martin mentioned, and it works perfectly
To be clearer - how do I refrain from using the built-in file server altogether, and use MINIO for any storage need?
So just to be clear - the file server has nothing to do with the storage?
Think of it as a quick and dirty "minio", storing files and serving them over http. If you have minio (or any object storage) you can replace it altogether 🙂
You might need to turn off the secure option... Let me check
To store all the debug samples. It can also store all the models (if you configure output_uri='http://file_server_here:8081').
Yes: instead of the file server, have 's3://<ip_of_minio>:9000/bucket' - make sure you add the credentials for the minio in the trains.conf.
Yes, basically once you have the credentials in the trains.conf, you could do StorageManager.get_local_copy('s3://<minio>:9000/bucket/file') (also upload of course 🙂 )
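A minimal sketch of that SDK usage, assuming the minio credentials are already set in trains.conf (the host, bucket and file names below are placeholders):

```python
from trains import StorageManager

# Download a local copy of an object stored on the minio bucket
# (the S3 credentials are picked up from trains.conf)
local_path = StorageManager.get_local_copy(
    remote_url="s3://<minio_host>:9000/bucket/file"
)
print(local_path)  # path of the locally cached copy

# Uploading works the same way, in the other direction
StorageManager.upload_file(
    local_file="model.pkl",
    remote_url="s3://<minio_host>:9000/bucket/models/model.pkl"
)
```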
In your trains.conf, change the value: files_server: 's3://ip:port/bucket'
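For reference, a sketch of how that setting looks in trains.conf, assuming a minio setup like the one above (ip, port and bucket are placeholders):

```
api {
    # point the files server at the minio bucket instead of the built-in file server
    files_server: "s3://<minio_ip>:9000/bucket"
}
```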
Isn't this a client configuration?
No, that's just the thing - in order to use minio, each client needs to have the credentials configured
WackyRabbit7 this section is what you need, unmark it, and fill it in
https://github.com/allegroai/trains/blob/c9fac89bcd87550b7eb40e6be64bd19d4384b515/docs/trains.conf#L88
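Filled in for a minio host, that section would look roughly like this (key, secret and host are placeholders; secure: false is the "secure option" mentioned above, needed when minio is served over plain http):

```
sdk {
    aws {
        s3 {
            credentials: [
                {
                    # per-host entry covering every bucket on this minio server
                    host: "<minio_ip>:9000"
                    key: "<minio_access_key>"
                    secret: "<minio_secret_key>"
                    multipart: false
                    secure: false  # minio is served over http, not https
                }
            ]
        }
    }
}
```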
EnviousStarfish54 Notice that you can configure it on the agent machine only, so in development you are not "wasting" storage when uploading debug checkpoints/models 🙂
Wow! Just what I needed. I am surprised that I don't need to configure anything on the server side
I tried what you said in the previous response, setting sdk.aws.s3.key and sdk.aws.s3.secret to the ones in my MINIO. Yet when I try to download an object, I get the following:
>>> result = manager.get_local_copy(remote_url="s3://***********:9000/test-bucket/test.txt")
2020-10-15 13:24:45,023 - trains.storage - ERROR - Could not download s3://***********:9000/test-bucket/test.txt , err: SSL validation failed for https://***********:9000/test-bucket/test.txt [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1123)
Hi WackyRabbit7 ,
Just to expand on #2:
make sure you add the credentials for the minio in the trains.conf
In trains.conf, set your minio credentials (key, secret, region) in sdk.aws.s3.key, sdk.aws.s3.secret etc. You can also use the standard AWS env vars, which are automatically parsed by Trains (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION)
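As a sketch, those keys sit at the top level of the s3 section in trains.conf (values are placeholders; a per-host entry like the one shown earlier works as well):

```
sdk {
    aws {
        s3 {
            # default credentials, used for any bucket without a specific entry
            key: "<minio_access_key>"
            secret: "<minio_secret_key>"
            region: ""
        }
    }
}
```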
Martin: In your trains.conf, change the value: files_server: 's3://ip:port/bucket'
Isn't this a client configuration (trains-init)? Shouldn't there be any change to the server configuration (/opt/trains/config...)?
basically the default_output_uri will cause all models to be uploaded to this server (with specific subfolder per project/task)
You can have the same value there as the files_server.
The files_server is where you have all your artifacts / debug samples
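A sketch of how the two settings can point at the same bucket in trains.conf (placeholders again):

```
api {
    # where artifacts / debug samples go
    files_server: "s3://<minio_ip>:9000/bucket"
}
sdk {
    development {
        # where models / task outputs go by default
        default_output_uri: "s3://<minio_ip>:9000/bucket"
    }
}
```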
I know I can configure the file server on trains-init - but that only touches the client side. What about the container on the trains server?
And once this is done, what is the file server IP good for? will it redirect to the bucket?
Continuing on this discussion... What is the relationship between configuring files_server, everything else we just talked about, and the default_output_uri?