kernel.sem = 50100 128256000 50100 2560
Don't think the semaphores should be depleted.
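For anyone reading those four numbers: a minimal sketch that labels the `kernel.sem` values using the standard Linux field order (SEMMSL, SEMMNS, SEMOPM, SEMMNI). The function name `parse_kernel_sem` is made up for illustration and is not part of clearml. Reading `/proc/sys/kernel/sem` assumes a Linux runner.

```python
from pathlib import Path

# Field order of the Linux kernel.sem sysctl:
#   SEMMSL: max semaphores per semaphore set
#   SEMMNS: max semaphores system-wide
#   SEMOPM: max operations per semop() call
#   SEMMNI: max number of semaphore sets
SEM_FIELDS = ("SEMMSL", "SEMMNS", "SEMOPM", "SEMMNI")


def parse_kernel_sem(text: str) -> dict:
    """Map the four whitespace-separated kernel.sem values to their names."""
    values = [int(v) for v in text.split()]
    if len(values) != len(SEM_FIELDS):
        raise ValueError(f"expected {len(SEM_FIELDS)} values, got {len(values)}")
    return dict(zip(SEM_FIELDS, values))


if __name__ == "__main__":
    proc_file = Path("/proc/sys/kernel/sem")  # exists on Linux only
    if proc_file.exists():
        print(parse_kernel_sem(proc_file.read_text()))
```

Note that these limits cover System V semaphores. Whether they are the ones involved in this crash is not established in the thread.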
That example is quite large. We are not doing anything close to that, or even downloading any datasets/artifacts on the runner, and have ~40GB available in the /tmp directory.
We can try to further increase the storage if there are no other ideas, but if that fixes anything, it would mean that there is a bug in the clearml sdk. So much storage shouldn't be needed to run the controller from my per...
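To put numbers on the free space from inside the CI job itself, a small standard-library sketch could be run just before the pipeline script. The helper name `free_space_report` is hypothetical. Including `/dev/shm` is an assumption, not something confirmed in this thread: on Linux it backs POSIX shared memory and named semaphores, and containers often mount it much smaller than `/tmp`.

```python
import shutil
import tempfile
from pathlib import Path


def free_space_report(paths):
    """Return {path: free space in GiB} for every path that exists here."""
    report = {}
    for p in paths:
        if Path(p).exists():
            report[str(p)] = shutil.disk_usage(p).free / 2**30
    return report


if __name__ == "__main__":
    # /dev/shm is checked as an assumption; it is skipped if it does not exist.
    candidates = [tempfile.gettempdir(), "/dev/shm"]
    for path, gib in free_space_report(candidates).items():
        print(f"{path}: {gib:.2f} GiB free")
```

If one of the reported mounts is close to zero while `/tmp` still shows ~40GB, that would narrow down which "storage" the OSError refers to.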
AgitatedDove14 We are using the PipelineController class, with Tasks as steps.
Running this on a laptop, we could observe on the Web UI that the pipeline task was running locally, and the tasks on the agents.
The same script however crashes on the runner with the OSError.
` pipe_task = Task.init(project_name=project_name,
task_name=pipeline_name,
task_type=Task.TaskTypes.controller,
reuse_last_task_id=False,
...
Hi AgitatedDove14 !
there is no more storage left to run all those subprocesses
I see. Does this mean the /tmp directory? I might be a little unfamiliar here.
Also, why is so much storage space required to run the subprocesses (nodes)? The run_pipeline_steps_locally flag is set to False (by default) in controller_object.start_locally(). Only the PipelineController should be running locally, right?
This is the maximum simultaneous jobs it will try to launch (it will launch m...
Why are you running from a GitLab runner?
Hi CostlyOstrich36 We are trying to run integration tests on our pipelines, to make sure that changes to tasks in merge requests do not break the pipeline.
AgitatedDove14 Can you please specify which resources we should increase? I haven't been able to observe any depleted resources on the runner while the pipeline is running (semaphores, threads, RAM, cache), but I might be wrong here, since the process crashes as soon as we hit the start_locally call.
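Because the crash happens right at the start_locally call, watching the runner from outside may miss it. One option is to print a snapshot of the per-process limits from inside the job, immediately before that call. Below is a rough standard-library sketch for a Unix runner. `limits_snapshot` is a hypothetical helper, and the choice of which limits to print is a guess at what might be relevant, not a diagnosis.

```python
import os
import resource  # Unix only
import threading


def limits_snapshot():
    """Collect (soft, hard) limits and a few live counters for this process."""
    return {
        # Max number of processes/threads the current user may create.
        "max_processes": resource.getrlimit(resource.RLIMIT_NPROC),
        # Max number of open file descriptors for this process.
        "max_open_files": resource.getrlimit(resource.RLIMIT_NOFILE),
        # Threads currently alive in this Python interpreter.
        "active_threads": threading.active_count(),
        # CPUs visible to the interpreter (may be None if undetermined).
        "cpu_count": os.cpu_count(),
    }


if __name__ == "__main__":
    for name, value in limits_snapshot().items():
        print(f"{name}: {value}")
```

Comparing this output between the laptop, where the script works, and the runner, where it fails, would show whether any of these limits differ.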