` # Connecting ClearML with the current process,
# from here on everything is logged automatically
task = Task.init(project_name="examples", task_name="artifacts example")
task.set_base_docker(
    "my_docker",
    docker_arguments="--memory=60g --shm-size=60g -e NVIDIA_DRIVER_CAPABILITIES=all",
)
if not running_remotely():
    task.execute_remotely("docker", clone=False, exit_process=True)
timer = Timer()
with timer:
    # add and upload Numpy Object (stored as .npz file)
    task.upload_a...
Yea, the clearml-data is immutable, but not the underlying data if I just store a pointer to some location.
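To illustrate the concern in plain Python (this is not the ClearML API, just an analogy): a stored reference does not freeze the data it points to; only an actual copy does.

```python
import copy

# Hypothetical illustration: a "dataset version" that stores a pointer
# to external data vs. one that snapshots (copies) it at version time.
external_data = {"samples": [1, 2, 3]}

version_by_pointer = {"data": external_data}                  # just a reference
version_by_snapshot = {"data": copy.deepcopy(external_data)}  # real copy

# The underlying data changes after "versioning"...
external_data["samples"].append(4)

print(version_by_pointer["data"]["samples"])   # reflects the later change
print(version_by_snapshot["data"]["samples"])  # still the original contents
```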
Thanks a lot. I somehow missed this.
Anyways, from my google search it seems that this is not something that is intuitive to fix.
Is there any progress on this: https://github.com/allegroai/clearml-agent/issues/45 ? This works on all my machines 🙂
Could you guide me to the documentation for using the docker file? I am not able to find it. I only found task.set_base_docker, and I am not sure what it does.
If you think the explanation takes too much time, no worries! I do not want to waste your time on my confusion 😄
Thank you very much!
I think doing all that work is not worth it right now; I am just trying to understand why clearml does not seem to be designed something like this:
` task_name = args.task_name
task = Task()
task = task.load_statedict(await Task.load_or_create(task_name))
task.requirements.add(...)
await task.synchronize()
task.execute_remotely(queue_name, exit=True) `
Long story short, the Task requirements are async, so if one puts it after creating the object (at least in theory), it might be too late.
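The timing issue can be sketched in plain Python with a hypothetical stand-in class (not the ClearML API): if requirement collection runs in a background thread, anything added after that collection fires is silently missed unless there is an explicit synchronization point.

```python
import threading
import time

class FakeTask:
    """Hypothetical stand-in for a task whose requirements are
    captured by a background thread shortly after creation."""

    def __init__(self):
        self.requirements = set()
        self.captured = None           # what the background thread saw
        self._done = threading.Event()
        threading.Thread(target=self._collect, daemon=True).start()

    def _collect(self):
        time.sleep(0.2)                # collection happens "soon" after init
        self.captured = set(self.requirements)
        self._done.set()

    def synchronize(self):
        # Explicit wait, analogous to the proposed `await task.synchronize()`
        self._done.wait()

task = FakeTask()
task.requirements.add("torch")   # added before collection runs: captured
task.synchronize()               # wait for the background capture
task.requirements.add("numpy")   # too late: collection already ran
print(task.captured)             # only "torch" was seen
```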
AgitatedDove14 Is there no await/synchronize method to wait for task update?
Maybe related question: Will there be some documentation about clearml internals with the new documentation? ClearML seems to store stuff that's relevant to script execution outside of clearml.Task if I am not mistaken. I would like to learn a little bit about what the code structure / internal mechanism is.
Well, after restarting the agent (to set it into --detached mode) it set the cleanup_task.py into service mode, but my monitoring tasks are just executed on the agent itself (no new service clearml-agent is started) and then each is aborted right after starting.
Btw: Is it intended that the folder structures in the fileserver directories are not deleted?
To update the task.requirements before it actually creates it (the requirements are created in a background thread)
Why can't it be updated after creation?
Both, actually. So what I personally would find intuitive is something like this:
` class Task:
    def load_statedict(self, state_dict):
        pass

    async def synchronize(self):
        ...

    async def task_execute_remotely(self):
        await self.synchronize()
        ...

    def add_requirement(self, requirement):
        ...

    @classmethod
    async def init(cls, task_name):
        task = Task()
        task.load_statedict(await Task.load_or_create(task_name))
        await tas...
Python 3.8.8, clearml 1.0.2
Works with 1.4. Sorry for not checking versions myself!
Give me 5min and I send the full log
The script is intended to be used something like this: `script.py train my_model --steps 10000 --checkpoint-every 10000`
or `script.py test my_model --steps 1000`
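A minimal sketch of how a CLI like that could be wired with argparse subcommands (the subcommand names and options come from the usage above; the defaults and handler wiring are assumptions):

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(prog="script.py")
    sub = parser.add_subparsers(dest="command", required=True)

    # `train` subcommand: script.py train my_model --steps N --checkpoint-every N
    train = sub.add_parser("train", help="train a model")
    train.add_argument("model_name")
    train.add_argument("--steps", type=int, default=10000)
    train.add_argument("--checkpoint-every", type=int, default=10000)

    # `test` subcommand: script.py test my_model --steps N
    test = sub.add_parser("test", help="evaluate a model")
    test.add_argument("model_name")
    test.add_argument("--steps", type=int, default=1000)

    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.command, args.model_name, args.steps)
```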
Perfect, will try it. fyi: The conda_channels that I used are from clearml-agent init
I ran `docker run -it -v /home/hostuser/.ssh/:/root/.ssh ubuntu:18.04` but cloning does not work, and this is what `ls -lah /root/.ssh` gives inside the docker container:
` -rw------- 1 1001 1001 1.5K Apr 8 12:28 authorized_keys
-rw-rw-r-- 1 1001 1001 208 Apr 29 09:15 config
-rw------- 1 1001 1001 432 Apr 8 12:53 id_ed25519
-rw-r--r-- 1 1001 1001 119 Apr 8 12:53 id_ed25519.pub
-rw------- 1 1001 1001 432 Apr 29 09:16 id_gitlab
-rw-r--r-- 1 1001 1001 119 Apr 29 09:25 id_gitlab.pub
-...
Thank you very much. I am going to try that.
For me this does not work (at least with nested tqdm bars, did not try single ones yet).
` docker-compose ps
Name Command State Ports
clearml-agent-services /usr/agent/entrypoint.sh Restarting
clearml-apiserver /opt/clearml/wrapper.sh ap ... Up 0.0.0.0:8008->8008/tcp, 8080/tcp, 8081/tcp ...