ManiacalLizard2 Maybe you are using the enterprise version with the vault? I suppose the enterprise version runs differently, but I don't have experience with it.
For the open-source version, each clearml-agent is using its own clearml.conf
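For reference, a minimal sketch of what I mean by a per-agent clearml.conf (server URLs and credentials are placeholders):
`
api {
    web_server: http://localhost:8080
    api_server: http://localhost:8008
    files_server: http://localhost:8081
    credentials {
        "access_key" = "AGENT_ACCESS_KEY"
        "secret_key" = "AGENT_SECRET_KEY"
    }
}
`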
Hey Martin, thank you for answering!
I see your point, however in my opinion this is really unexpected behavior. Sure, I can do some work to make it "safe", but shouldn't that be the default? So: throw an error when there is no clearml.conf, and expect CLEARML_USE_DEFAULT_SERVER=1 to be set explicitly.
So clearml 1.0.1, clearml-agent 1.0.0, and clearml-server from master.
Okay, thank you anyways. I was just asking because I thought I had seen such a setting before. Must have been something different.
A colleague fixed my server and I can confirm that the fix works!
What's the reason for the shift?
I am currently on the Open Source version, so no Vault. The environment variables are not meant to be used on a per-task basis, right?
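(To be clear about which variables I mean: the standard ClearML connection variables, values are placeholders; as far as I understand they apply to the whole process/agent rather than to a single task.)
`
export CLEARML_API_HOST=http://localhost:8008
export CLEARML_WEB_HOST=http://localhost:8080
export CLEARML_FILES_HOST=http://localhost:8081
export CLEARML_API_ACCESS_KEY=my-access-key
export CLEARML_API_SECRET_KEY=my-secret-key
`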
When the task is aborted, the console logs show up, but the scalars never appear. The scalars only appear when the task finishes.
Okay, no worries. I will check first. Thanks for helping!
Interesting. It will probably only matter for very small experiments, or experiments where validation is run very infrequently.
Ah, sorry, I should have been more specific. I mean on the ClearML server.
I just tested with remote_execution and the problem seems to exist there, too. When the task switches from local to remote execution (i.e. exits the local script), the local scalars appear, but no scalars from the remote execution ever show up, so the iteration count also does not update. However, at least for remote execution I get live console output.
Maybe this opens up another question, which is more about how clearml-agent is supposed to be used. The "pure" way would be to have the docker image provide everything, with clearml-agent doing no setup at all.
What I currently do instead is let the docker image provide all the system dependencies and let clearml-agent set up all the python dependencies. This allows me to reuse a docker image for a wider range of experiments. However, then it would make sense to have as many configs as possib...
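(To illustrate the setup: roughly how I start the agent in docker mode; queue name and image are just examples.)
`
clearml-agent daemon --queue my_gpu_queue --docker my-base-image:latest
`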
I am currently on the move, but it was something like "upstream server not found" in /etc/nginx/nginx.conf, and if I remember correctly on line 88.
Maybe let's put it in a different way:
Pipeline: Preprocess Task -> Main Task -> Postprocess Task
My main task is my experiment, i.e. my training code. When I ran the main task standalone, I just used Task.init and set up the project name, task name, etc.
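(Roughly like this; project and task names are just examples:)
`
from clearml import Task

task = Task.init(project_name="my_project", task_name="main-task")
# ... training code of the main task ...
`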
Now what I could do is push this task to the server, then just reference the task by its task ID and run the pipeline. However, I do not want to push the main task to the server before running it. Instead I want to push the whole pipeline, but st...
Perfect, works! I was looking for "host"; it didn't come to my mind to search for "worker". Any idea how to get the user that created the task?
It seems to work when I enable conda_freeze.
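(For reference, I believe this is the relevant setting in the agent's clearml.conf:)
`
agent {
    package_manager {
        type: conda
        conda_freeze: true
    }
}
`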
Long story short, the Task requirements are applied asynchronously, so if one sets them after creating the Task object (at least in theory), it might be too late.
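(If I understand the implication correctly, extra requirements would have to be declared before the Task object is created, e.g. something like this; package and task names are just examples.)
`
from clearml import Task

# declare additional requirements before Task.init creates the task object
Task.add_requirements("some_package", "1.2.3")
task = Task.init(project_name="my_project", task_name="my_task")
`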
AgitatedDove14 Is there no await/synchronize method to wait for task update?
btw: With the ssh agent forwarding I do not have any issues ( https://github.com/allegroai/clearml-agent/issues/45 )
I think doing all that work is not worth it right now; I am just trying to understand why clearml does not seem to be designed for something like this:
` task_name = args.task_name
task = Task()
task = task.load_statedict(await Task.load_or_create(task_name))
task.requirements.add(...)
await task.synchronize()
task.execute_remotely(queue_name, exit=True) `
Here is some context on what I am currently trying to do (pseudocode):
`
def run_experiment(args):
    ...

def get_task_experiment():
    task = Task.init(...)
    task.bind_run(run_experiment)
    return task

def run_with_pipeline(task):
    pipe = PipelineController(...)
    pipe.add_step(prepare_something...)
    pipe.add_step(task)
    pipe.add_step(postprocess_something...)
    return pipe

if __name__ == "__main__":
    task = get_task_experiment()
    # Run without Pipeline
    if ...
Thank you. Yes, we need to wait for Carla to spin up.
Okay, I see. Unfortunately, I don't get how clearml tasks are intended to be used. Could you help me with that? (see code)
`
def start_carla_factory():
    task = # How do I create this task?
    long_blocking_call_to_start_carla()
    return task

pipe = PipelineController(
    name="carla-autostart",
    project="rlad/carla-servers",
    version="0.0.1",
    add_pipeline_tags=False,
)
pipe.add_step(name="start-carla", base_task_factory=start_carla_factory)
pipe.start()
`
Maybe related question: Will there be some documentation about clearml internals with the new documentation? ClearML seems to store stuff that's relevant to script execution outside of clearml.Task if I am not mistaken. I would like to learn a little bit about what the code structure / internal mechanism is.
Thanks, that makes sense. Can you also explain what task_log_buffer_capacity does?
But wouldn't this have to be a server parameter instead of a clearml.conf parameter then? Maybe someone from ClearML can confirm MortifiedDove27's explanation?