Where do I run trains-init from?
I did git clone, not pip install
🙂 I could not locate this file!
trains-apiserver | [2020-07-10 13:33:29,269] [8] [ERROR] [trains.updates] Failed obtaining updates
trains-apiserver | Traceback (most recent call last):
trains-apiserver | File "/opt/trains/server/updates.py", line 96, in _check_updates
trains-apiserver | response = self._check_new_version_available()
trains-apiserver | File "/opt/trains/server/updates.py", line 48, in _check_new_version_available
trains-apiserver | uid = Settings.get_by_key("server.uuid")
trains-apiserver | Fil...
docker volume create --name=mongodata
Also, each task might need its own configuration. Data is usually stored in multiple containers, so rather than a single global configuration, it should be possible to configure storage per task.
There already seems to be support for multiple containers in the code.
Is there an example to configure multiple storage accounts?
I just installed trains[azure], since all my data is on Azure. I don't know about StorageManager.
I see that _AzureBlobServiceStorageDriver needs to be updated. Anything else?
Will try, thank you.
For now I am trying to achieve (1), but the goal is (2).
Is there documentation for (2) available for evaluation?
Would be nice to have a reference implementation
I use AzureML and would like to try trains.
First, how do I set up trains-server on Azure?
And then...
AzureML allows training on low-priority clusters.
How can I configure and set up low-priority training clusters and connect them to trains?
Federated learning is about sending code to where data exists, training local models and aggregating them in a central place.
Can the existing design support this, or do extensions need to be built?
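To make the "aggregating them in a central place" step concrete, here is a minimal sketch of one common aggregation scheme, weighted federated averaging. It is not part of trains; the function name `federated_average` and the plain-dict weight format are assumptions made only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical format: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Combine local models, weighting each site by its number of samples."""
    if not updates:
        raise ValueError("no updates to aggregate")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    # Start from zeros shaped like the first site's parameters.
    first_weights, _ = updates[0]
    aggregated = {name: [0.0] * len(vals) for name, vals in first_weights.items()}

    for weights, n in updates:
        share = n / total  # this site's contribution to the average
        for name, vals in weights.items():
            for i, v in enumerate(vals):
                aggregated[name][i] += share * v
    return aggregated


# Two sites: one trained on 1 sample, the other on 3.
site_a = ({"w": [1.0, 2.0]}, 1)
site_b = ({"w": [3.0, 4.0]}, 3)
global_model = federated_average([site_a, site_b])
```

Only the model parameters and sample counts leave each site here; the raw data stays where it is.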
Looks like a MongoDB and NTFS issue:
https://github.com/docker-library/mongo/issues/190
The above command and YAML file work on Win10.
Sure, let me test that it is completely working.
I tried a slightly different approach that seems to work.
docker volume create --name=mongodata
And configured mongodata as the data volume in the docker-compose file.
Checked. The only change I had to make was to increase memory to 4GB. There are still errors.
I think the errors are related to network and permissions.
I have ~100GB of data that I do not wish to upload to the trains-server. Instead, I would like it copied only to the host machine (Azure container) at training time.
The data is in Azure blob storage and will be copied using a custom script just before training starts.
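Roughly, the custom copy script I have in mind could look like the sketch below. It is only an outline under assumptions: `copy_missing` is a hypothetical helper name, and the `fetch` callable stands in for whatever Azure client is actually used, so nothing here is trains API.

```python
import os
from typing import BinaryIO, Callable, Iterable, List


def copy_missing(blob_names: Iterable[str],
                 fetch: Callable[[str, BinaryIO], None],
                 dest_dir: str) -> List[str]:
    """Copy each blob into dest_dir unless a local copy already exists."""
    copied = []
    for name in blob_names:
        # Blob names use "/" as separator; map them to local sub-folders.
        local_path = os.path.join(dest_dir, *name.split("/"))
        if os.path.exists(local_path):
            continue  # already on the host, skip the download
        os.makedirs(os.path.dirname(local_path) or ".", exist_ok=True)

        # Write to a temporary file first so an interrupted download
        # is never mistaken for a complete file on the next run.
        tmp_path = local_path + ".part"
        with open(tmp_path, "wb") as fh:
            fetch(name, fh)  # streams the blob content into fh
        os.replace(tmp_path, local_path)
        copied.append(local_path)
    return copied
```

With the azure-storage-blob v12 package (an assumption about the environment), the names could come from `ContainerClient.list_blobs()` and `fetch` could wrap `container.download_blob(name).readinto(fh)`. Running this just before training keeps the data on the host machine only.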
I surely can, will let you know.
I modified the web server port and changed c:\opt to d:\opt.