Does the services mode have a separate configuration for the base image?
FriendlySquid61
Just updating: I still haven't touched this. I did not consider the time it would take me to set up the auto scaling, so I must attend to other issues now. I hope to get back to this soon and make it work.
A machine that had a previous installation, but I deleted the /opt/trains directory beforehand
Oh... from the docs I understood that I don't have to run the script, that I can either configure it in the UI or with the script (wizard), so I ignored it up until now
🤔 is the "installed packages" part editable? good to know
Isn't it a bit risky manually changing a package version? what if it won't be compatible with the rest?
Also, being able to separate their configuration files would be good (maybe there is a way and I don't know about it?)
So the scale will also appear?
Do I need to copy this AWS scaler task to any project I want to have auto scaling on? What does it mean to enqueue the AWS scaler?
(it works now, with 20 GB)
Okay, so the agent automatically inherits the launching environment's variables?
Yes, I'll prepare something and send
so in my code, I'll use this environment variable to read from disk
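A minimal sketch of that pattern, assuming the variable is exported in the environment that launches the agent and is visible to the task process. The name `DATASET_DIR` and the `./data` fallback are hypothetical, picked only for illustration:

```python
import os
from pathlib import Path

# Hypothetical variable name; use whatever is exported where the agent is launched.
ENV_VAR = "DATASET_DIR"


def resolve_data_dir(default="./data"):
    """Return the data directory from the environment, or a local fallback."""
    return Path(os.environ.get(ENV_VAR, default))


def list_data_files(data_dir):
    """List regular files directly under data_dir, sorted by name."""
    return sorted(p.name for p in Path(data_dir).iterdir() if p.is_file())
```

Keeping the fallback explicit means the same script still runs on a development machine where the variable is not set.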
So if I'm collecting from the middle ones, shouldn't the callback be attached to them?
Makes sense
So I take it trains assumes I have nvidia-docker installed on the agent machine?
Moreover, since I'm going to use Task.execute_remotely (and not the UI), is there a way to specify in code which docker image should be used?
SuccessfulKoala55 this actually doesn't work
` from pyhocon import ConfigFactory, HOCONConverter

apiserver_conf = ConfigFactory.parse_file(API_SERVER_CONF_PATH)
# POINT 1
conf_content = HOCONConverter.to_hocon(
    config=ConfigFactory.from_dict(apiserver_conf.as_plain_ordered_dict()),
    compact=False,
    level=0,
    indent=2,
)
apiserver_conf['auth']['fixed_users']['users'].append(
    ConfigFactory.from_dict({'username': username, 'password': password, 'name': name})
)
# ... `
So regarding 1, I'm not really sure what the difference is
When running in docker mode, what is different from the regular mode? Nowhere in the instructions is nvidia-docker listed as a prerequisite, so how exactly will tasks on GPU get executed?
I feel I don't understand enough of the mechanism to (1) understand the difference between docker mode and regular mode and (2) know what the use case for each is
Another Q on that - does pyhocon allow me to edit the file while keeping the comments in place?
I assume that at some points in the execution, the client (where the task is running) is sending JSONs to the mongo service, and that is what we see in the web UI.
Since we are talking about a case where there is no internet available, maybe these could be dumped into files/stdout and let the user manually insert them.
The manual insertion UX could be something like a CLI copy-paste or an endpoint for files - but since your UX is so good ( 🙂 ) I'm sure you'll figure this part out better
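To make the idea concrete, here is a rough sketch of the "dump to a file, insert later" flow. It is not how trains works internally. `record_event`, `replay_events` and the `send` callable are all hypothetical names, and it only covers one-way traffic from the client to the server:

```python
import json
from pathlib import Path


def record_event(log_path, event):
    """Append one event as a JSON line instead of sending it to the server."""
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(event, sort_keys=True) + "\n")


def replay_events(log_path, send):
    """Read recorded events back and hand each one to `send`; return the count."""
    count = 0
    with Path(log_path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                send(json.loads(line))
                count += 1
    return count
```

On the offline machine the task would call `record_event`; later, on a machine that can reach the server, `replay_events` would be given whatever function actually uploads a single event.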
that is because my own machine has 10.2 (not the docker, the machine the agent is on)
essentially editing apiserver.conf section auth.fixed_users.users
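A minimal sketch of that edit using plain dicts and only the standard library, assuming the config has already been loaded into a nested dict and that `users` is a list of entries with `username`, `password` and `name` keys. The helper names `add_fixed_user` and `dump_conf` are hypothetical. Because it goes through a plain dict, any comments in the original file are not carried over:

```python
import json


def add_fixed_user(conf, username, password, name):
    """Append a user under auth.fixed_users.users, creating missing levels."""
    users = (
        conf.setdefault("auth", {})
        .setdefault("fixed_users", {})
        .setdefault("users", [])
    )
    if any(u.get("username") == username for u in users):
        raise ValueError(f"user {username!r} already exists")
    users.append({"username": username, "password": password, "name": name})
    return conf


def dump_conf(conf):
    """Serialize as JSON; HOCON is a superset of JSON, so a HOCON parser can read it."""
    return json.dumps(conf, indent=2)
```

Rejecting a duplicate username up front avoids ending up with two entries for the same login.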
Yep, if communication is both ways, there is no way (that I can think of) it can be solved for offline mode.
But if the calls that are made from the server to the client can be redundant in a specific setup (some functionality will not work, but enough valuable functionality remains) then it is possible in the manual way
Especially coming from the standpoint of a team leader or other kind of supervision (or anyone who wants to view the experiment which is not the code author), when looking at an experiment you want to see the actual code
