
FriendlySquid61
Just updating: I still haven't touched this. I did not account for the time it would take me to set up the auto scaling, so I must attend to other issues now. I hope to get back to this soon and make it work.
So just to be clear - the file server has nothing to do with the storage?
This just keeps getting better and better... 🤩
I know I can configure the file server in trains-init, but that only touches the client side. What about the container on the trains server?
To be clearer - how do I refrain from using the built-in file-server altogether, and use MINIO for any storage need?
Martin: In your trains.conf, change the value files_server: 's3://ip:port/bucket'
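For illustration, the URI in Martin's suggestion can be assembled and sanity-checked in Python before pasting it into trains.conf. This is only a minimal sketch using the standard library; the helper name build_files_server_uri and the host, port, and bucket values are hypothetical placeholders, not part of trains.

```python
from urllib.parse import urlsplit


def build_files_server_uri(host: str, port: int, bucket: str) -> str:
    """Return an s3://host:port/bucket URI (hypothetical helper, not a trains API)."""
    if not host or not bucket:
        raise ValueError("host and bucket must be non-empty")
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    return f"s3://{host}:{port}/{bucket.strip('/')}"


# Placeholder values; substitute your own MinIO address and bucket.
uri = build_files_server_uri("10.0.0.5", 9000, "trains-artifacts")
parts = urlsplit(uri)

print(uri)
print(parts.scheme, parts.hostname, parts.port, parts.path)
```

The printed string is what would go into the files_server value on the client side.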
Isn't this a client configuration (trains-init)? Shouldn't there be any change to the server configuration (/opt/trains/config...)?
Okay Jake, so that basically means I don't have to touch any server configuration regarding the file-server on the trains server. It will simply get ignored, and all I/O initiated by clients with the right configuration will cover for that?
Continuing on this discussion... What is the relationship between configuring files_server (and all the rest we just talked about) and the default_output_uri?
I just tried setting the conf in the section Martin said, it works perfectly
I assume that at some points in the execution, the client (where the task is running) is sending JSONs to the mongo service, and that is what we see in the web UI.
Since we are talking about a case where there is no internet available, maybe these could be dumped into files/stdout and let the user manually insert them.
The manual insertion UX could be something like a CLI copy-paste or an endpoint for files - but since your UX is so good ( 🙂 ) I'm sure you'll figure this part out better
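To make the idea concrete, here is a rough sketch of what "dumping to files" could look like: each report is appended as one JSON line, and the file is read back later for manual insertion. This is purely illustrative and is not how trains works internally; dump_event, load_events, the file name, and the event fields are all made up for the example.

```python
import json
import os
import tempfile
import time


def dump_event(path, event):
    """Append one event as a JSON line (illustrative, not a trains API)."""
    record = dict(event)
    record.setdefault("timestamp", time.time())
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def load_events(path):
    """Read back all dumped events, e.g. on a machine that can reach the server."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Hypothetical events written while offline.
log_path = os.path.join(tempfile.mkdtemp(), "offline_events.jsonl")
dump_event(log_path, {"type": "scalar", "metric": "loss", "value": 0.42, "iter": 1})
dump_event(log_path, {"type": "scalar", "metric": "loss", "value": 0.37, "iter": 2})

events = load_events(log_path)
print(len(events), events[-1]["value"])
```

One JSON object per line keeps the file append-only, so a crash mid-run loses at most the last record.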
Yep, if communication is both ways, there is no way (that I can think of) it can be solved for offline mode.
But if the calls that are made from the server to the client can be redundant in a specific setup (some functionality will not work, but enough valuable functionality remains) then it is possible in the manual way
I really don't know, as you can see in my last screenshot, I've configured my base image to be 10.1
I'm really confused. I'm not sure what is wrong, or what the relationship is between the templates, the agent, and all of those things
In the meantime, I'm giving up on the pipeline thing and I'll write a bash script to orchestrate the execution, because I need to deliver and I don't feel this is going anywhere
On a final note, I'd love for this to work as expected, but I'm not sure what you need from me. A fully reproducible example will be hard because obviously this is proprietary code. What ...
In standard docker TimelyPenguin76 this quoting you mentioned is wrong, since the whole argument is being passed - hence the double tricky quotation I posted above
Saving part from task A:
pipeline = trials.trials[index]['result']['pipeline']
output_prefix = 'best_iter_' if i == 0 else 'iter_'
task.upload_artifact(name=output_prefix + str(index), artifact_object=pipeline)
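The naming scheme in that snippet can be isolated into a small function, which makes it easier to check which artifact names task B should look for. A minimal sketch, assuming i is the position in a best-first ranking and index is the trial index; the function name artifact_name and the sample indices are hypothetical.

```python
def artifact_name(rank: int, index: int) -> str:
    """Mirror the snippet above: the top-ranked trial gets the 'best_iter_' prefix."""
    prefix = "best_iter_" if rank == 0 else "iter_"
    return prefix + str(index)


# Made-up trial indices, assumed to be ordered best first.
ranked_indices = [7, 2, 5]
names = [artifact_name(rank, index) for rank, index in enumerate(ranked_indices)]

print(names)
# In the real task, each name would be passed to
# task.upload_artifact(name=..., artifact_object=pipeline).
```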
One sec I'll paste the relevant pieces of code
can't remember, I just restarted everything so I don't have this info now
Is there a more elegant way to find the process to kill? Right now I'm doing pgrep -af trains, but if I have multiple agents, I will never be able to tell them apart
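One possible workaround, sketched here under the assumption that each agent was started with something distinguishing on its command line (for example a different --queue value): filter the pgrep -af output by that token. The function name find_agent_pids and the sample output lines are invented for the example.

```python
def find_agent_pids(pgrep_output: str, needle: str) -> list:
    """Return PIDs from `pgrep -af` style output whose command line contains needle."""
    pids = []
    for line in pgrep_output.splitlines():
        # Each line is expected to look like "<pid> <full command line>".
        pid, _, cmdline = line.strip().partition(" ")
        if pid.isdigit() and needle in cmdline:
            pids.append(int(pid))
    return pids


# Invented sample of what `pgrep -af trains` might print with two agents running.
sample = """\
4121 python /usr/bin/trains-agent daemon --queue gpu_queue
4188 python /usr/bin/trains-agent daemon --queue cpu_queue
"""

print(find_agent_pids(sample, "--queue gpu_queue"))
```

In practice the sample string would be replaced by the captured output of pgrep -af trains.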
Okay, so the agent automatically inherits the launching environment's variables?
Good, so if I'm templating something using clearml-task (without a queue, so the task is in draft mode), it will use this task? Even though it was never executed?