
Something else: if I want to designate only some of the GPUs of a worker, how can I do that?
Found it in the init docs 🙂
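For reference, a minimal sketch of the flag in question (assuming the trains-agent daemon's --gpus option restricts the worker to the listed GPU indices; the queue name is hypothetical):
trains-agent daemon --gpus 0,2 --queue default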
Oh, that seems right. How can I get the project id of the newly created project?
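For reference, a minimal sketch using the Python SDK (assuming the task.project property holds the ID of the project the task belongs to; the project and task names below are hypothetical):

from trains import Task

# Task.init creates the project if it does not already exist
task = Task.init(project_name='my_project', task_name='my_task')

# The task's project property holds the ID of the (possibly newly created) project
project_id = task.project
print(project_id)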
I've sorted this out. All I needed was to add them to a queue so they would be visible.
Fixed. The issue was the project name containing a /
(The one that was created with initial task)
Furthermore, let's say I have 6 GPUs on a machine, and I'd like trains to treat this machine as 2 workers (GPUs 0-2, 3-5). Is there a way to do that?
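For reference, a minimal sketch of the two-worker setup (assuming one trains-agent daemon is started per GPU subset via the --gpus flag; the queue name is hypothetical):

trains-agent daemon --detached --gpus 0,1,2 --queue default
trains-agent daemon --detached --gpus 3,4,5 --queue default

Each daemon should register as a separate worker, so the machine appears twice in the workers list.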
That should do the trick, thanks 🙂
I'll do as Jake says. Thanks :)
I am aware this is the current behavior, but could it be changed to something more intelligent? 🙂
When you say I can still get race/starvation cases, you mean in the enterprise or regular version?
This is the path:
/Remote/moshe/Experiments/trains_bs_pipe_new/ypi/OKAY/Try_That/baseline/evaluation_validation/results/images/bottom_scores/0.0_slot02_extracted_23_01__1035__1.png
I bet it has something to do with the server or DB, any clue?
Changing the mountpoint for the agent is not possible
I am not sure what you mean by verifying the API.
let me try
Is there a way to set this via a config file, like the docker-compose yml?
Since my servers have a shared file system, the init process tells me that the configuration file already exists. Can I tell it to place it in another location? GrumpyPenguin23
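For reference, a minimal sketch of pointing each server at its own file (assuming the TRAINS_CONFIG_FILE environment variable is honored by trains-init and the agent; the path is hypothetical):

TRAINS_CONFIG_FILE=/home/moshe/node1_trains.conf trains-init

With the variable exported in each server's environment, every machine on the shared filesystem reads and writes its own configuration file.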
Oh I see, I think this will work. Thanks 🙂
Yes. More exactly, I'm opening them with gzip.open, but I don't believe it should matter.
Nevermind, you can find it in the apiserver.conf
I've run this 8 times:
trains-agent --config-file /opt/trains/trains.conf daemon --detached --cpu-only --queue important_cpu_queue cpu_queue
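For reference, listing two queues like this is expected to set polling priority: assuming current ClearML agent behavior also applies to this version, the daemon checks important_cpu_queue first and only pulls from cpu_queue when the first queue is empty.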
The version is 0.16.2rc0 (a version Mushik gave me that supports local conda env)