maybe not at the top but in the Task.init description
I second the problem of having large metrics with no tool for properly inspecting where they come from.
From the documentation https://github.com/allegroai/clearml-agent :
` Two K8s integration flavours
Spin ClearML-Agent as a long-lasting service pod
use clearml-agent docker image
map docker socket into the pod (soon replaced by podman)
allow the clearml-agent to manage sibling dockers
benefits: full use of the ClearML scheduling, no need to worry about wrong container images / lost pods etc.
downside: Sibling containers `
Is there a place where I can find details about this approach?
I did something similar to what you suggested and it worked; the key insight was that connect and connect_configuration work differently in terms of overrides, thanks!
it is a configuration object (line of my code: `config_path = task.connect_configuration(config_path)`)
ok, I solved the problem, `agent.force_git_ssh_protocol = true` did the trick
I created my own docker image with a newer Python version and the error disappeared
I circumvented the problem by putting a timestamp in the task name, but I don't think this should be necessary.
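For reference, the timestamp workaround could look roughly like this. It is only a sketch: the helper name `timestamped_name` and the name format are my own choices, not anything ClearML requires, and the result would simply be passed as `task_name` to `Task.init`.

```python
from datetime import datetime, timezone
from typing import Optional


def timestamped_name(base: str, now: Optional[datetime] = None) -> str:
    """Append a UTC timestamp to a base name so every run gets a distinct name.

    `timestamped_name` is a hypothetical helper used for illustration only.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return f"{base}-{now.strftime('%Y%m%d-%H%M%S')}"


# The resulting string would be used as the task name, e.g.
# Task.init(project_name=..., task_name=timestamped_name("calibrate"))
name = timestamped_name("calibrate")
```

Since the name changes on every run, two runs started in different seconds never share a task name.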
ok, but do you know why it tried to reuse it in the first place?
I did not know about it, thanks!
ok, I will do a simple workaround for this: use an additional parameter that I can update using parameter_override, then check whether it exists and update the configuration in Python myself
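A rough sketch of that workaround in plain Python, under my own assumptions: the extra parameter is called `config_overrides` (a made-up name) and holds dotted keys mapped to new values, and the configuration is a nested dict. None of this is a ClearML API; it only shows the "check if it exists and update the configuration myself" step.

```python
import copy


def apply_overrides(config: dict, overrides: dict) -> dict:
    """Return a copy of `config` with dotted-key overrides applied.

    Hypothetical helper: "model.lr" addresses config["model"]["lr"].
    Intermediate keys are created as dicts when they are missing.
    """
    result = copy.deepcopy(config)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return result


# Configuration as loaded from the config file (example values).
config = {"model": {"lr": 0.001, "layers": 4}}

# Parameters as they might look after an override; "config_overrides"
# is the additional, hypothetical parameter.
params = {"config_overrides": {"model.lr": 0.01}}

# Apply the overrides only if the extra parameter is present.
overrides = params.get("config_overrides") or {}
updated = apply_overrides(config, overrides)
```

The original `config` is left untouched, so the values from the file and the overridden values can still be compared or logged side by side.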
I can hardcode it into the program if you want
no, I set the env variable CLEARML_TASK_ID myself
Just to let you know, it now works (obviously) in the k8s setting as well.
which is probably why it does not work for me, right?
traceback:
` Traceback (most recent call last):
File "/home/marek/nomagic/monomagic/ml/tiresias/calibrate_and_test.py", line 57, in <module>
Task.add_requirements('requirements.txt')
File "/home/marek/.virtualenvs/tiresias-3.9/lib/python3.9/site-packages/clearml/backend_interface/task/task.py", line 1976, in add_requirements
for req in pkg_resources.parse_requirements(requirements_txt):
File "/home/marek/.virtualenvs/tiresias-3.9/lib/python3.9/site-packages/pkg_resources/_init...
AgitatedDove14 np
FrothyDog40 thanks!
I could have been more inventive as well 😄
but seriously, I am very thankful you were willing to spend so much time helping me, I am super impressed by your response time and helpfulness!
this is part of the repository
ok, understood, it was probably my fault: I was messing with the services container and probably interrupted the pipeline task, so the subtasks themselves finished, but the pipeline task was no longer alive when that happened
I did not configure user/pass for git
I don't see such a method in the docs, but it seems so natural that I decided to ask.
I think there was some problem with how shutil.copytree handles broken symlinks in Python 3.6
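To illustrate what I mean, here is a minimal sketch of how `shutil.copytree` treats a dangling symlink in current Python versions. It does not reproduce the Python 3.6 behaviour specifically, and the helper name `copy_tree_with_dangling_link` is made up for the example. With the default `symlinks=False`, the copy fails with `shutil.Error` unless `ignore_dangling_symlinks=True` is passed.

```python
import os
import shutil
import tempfile


def copy_tree_with_dangling_link(ignore_dangling: bool) -> list:
    """Copy a directory holding one regular file and one dangling symlink.

    Returns the sorted entry names found in the destination directory.
    Raises shutil.Error when the dangling link cannot be copied.
    """
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "src")
        dst = os.path.join(root, "dst")
        os.mkdir(src)
        with open(os.path.join(src, "data.txt"), "w") as f:
            f.write("hello")
        # Absolute target that does not exist, so the link is dangling.
        os.symlink(os.path.join(root, "missing"), os.path.join(src, "broken"))
        shutil.copytree(src, dst, ignore_dangling_symlinks=ignore_dangling)
        return sorted(os.listdir(dst))


# Skipping dangling links lets the copy finish; only the real file is copied.
copied = copy_tree_with_dangling_link(ignore_dangling=True)

# Without the flag, copytree collects the failure and raises shutil.Error.
try:
    copy_tree_with_dangling_link(ignore_dangling=False)
    failed = False
except shutil.Error:
    failed = True
```

Passing `symlinks=True` instead would copy the link itself rather than trying to follow it.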
SmugDolphin23 will send later today