Yeah but I don't get what it is for - for now I have 2 agents, each listening to some queues. I've actually ignored the "services" queue until now
I don't get the difference between how I'm using my agents now (just starting them on machines and having them listen to queues) and using the "services" mode
and the machine I have is 10.2.
I also tried nvidia/cuda:10.2-base-ubuntu18.04 which is the latest
Also being able to separate their configuration files would be good (maybe there is a way and I don't know?)
Okay so in the first part of the code, we define some kind of callback that we add to our steps, so later we can collect the executed tasks and attach the results to the pipeline task. It looks something like this
` class MedianPredictionCollector:
    _tasks_to_collect = list()

    @classmethod
    def collect_description_tables(cls, pipeline: clearml.PipelineController, node: clearml.PipelineController.Node):
        # Collect tasks
        cls._tasks_to_collect.append(node.executed)

    @classmethod...
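For anyone following along, here is a rough sketch of that collector pattern, not the actual code from the snippet above (which is cut off). It assumes the callback receives `(pipeline, node)` and that `node.executed` holds the ID of the task that ran for that step, or nothing if it did not run. The names `ExecutedTaskCollector`, `on_step_done` and `collected` are made up for illustration, and the nodes here are plain stand-in objects so it runs without a ClearML server.

```python
from types import SimpleNamespace
from typing import Any, List


class ExecutedTaskCollector:
    """Hypothetical collector: remembers the task ID of every finished step."""

    # Class-level list, shared by every callback invocation
    _task_ids: List[str] = []

    @classmethod
    def on_step_done(cls, pipeline: Any, node: Any) -> None:
        # Assumption: node.executed is the executed task's ID (falsy if the step did not run)
        if node.executed:
            cls._task_ids.append(node.executed)

    @classmethod
    def collected(cls) -> List[str]:
        # Return a copy so callers cannot mutate the internal list
        return list(cls._task_ids)


# Stand-in nodes instead of real pipeline nodes
fake_nodes = [
    SimpleNamespace(name="step_a", executed="id-1"),
    SimpleNamespace(name="step_b", executed=None),
]
for fake_node in fake_nodes:
    ExecutedTaskCollector.on_step_done(pipeline=None, node=fake_node)
```

In a real pipeline I'd expect a callback like this to be passed as the step's post-execute callback, and `collected()` to be read once the steps have finished, but treat that wiring as an assumption too.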
Maybe the case is that after start / start_locally the reference to the pipeline task disappears somehow? O_O
that is because my own machine has 10.2 (not the docker, the machine the agent is on)
Cool, now I understand the auto detection better
How do I get all children tasks given a parent?
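One way that should work, as a minimal sketch: filter tasks by their `parent` field. This assumes `Task.get_tasks` accepts a `task_filter` dict with a `parent` key (worth double-checking against the SDK version you have). The helper name `get_child_tasks` and its `get_tasks` parameter are made up here so the lookup can be swapped out or tried without a server.

```python
from typing import Any, Callable, List, Optional


def get_child_tasks(
    parent_id: str,
    get_tasks: Optional[Callable[..., List[Any]]] = None,
) -> List[Any]:
    """Return the tasks whose parent is `parent_id` (hypothetical helper)."""
    if get_tasks is None:
        # Imported lazily so the helper can be exercised without clearml installed
        from clearml import Task

        get_tasks = Task.get_tasks
    # Assumption: the backend filters on the task's "parent" field
    return get_tasks(task_filter={"parent": parent_id})
```

Calling `get_child_tasks("<parent task id>")` would then hit the server configured in your clearml.conf; only direct children are matched, so grandchildren would need a second pass.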
this is the selection from the column setting menu
The ability to execute without an agent: I was just talking about this functionality the other day in the community channel
I really don't know, as you can see in my last screenshot, I've configured my base image to be 10.1
Does it mean that if it is set to False I need an agent but if I set it to True I don't need one?
but remember, it didn't work with the default one (nvidia/cuda) either
I tried what you said in the previous response, setting sdk.aws.s3.key and sdk.aws.s3.secret to the ones in my MinIO. Yet when I try to download an object, I get the following
` >>> result = manager.get_local_copy(remote_url="s3://*******:9000/test-bucket/test.txt")
2020-10-15 13:24:45,023 - trains.storage - ERROR - Could not download s3://*****:9000/test-bucket/test.txt , err: SSL validation failed for https://*****:9000/test-bucket/test.txt [SSL: WRONG_VERSION_NU...
TimelyPenguin76 if I build a custom image, do I have to host it on dockerhub for it to run on the agent? If not how do I make the agent aware of my custom image?
SuccessfulKoala55 The simplest thing I can think of is for Task.execute_remotely to be able to append to the docker_init_bash_script
Sorry.. I still don't get it - when I'm launching an agent with the --docker flag or with the --services-mode flag, what is the difference? Can I use both flags? what does it mean? 🤔
Committing that notebook with changes solved it, but I wonder why it failed



