I'm not sure, but it seems like you get different kinds of flexibility depending on whether you enqueue the task yourself or rely on execute_remotely. Ideally, I could choose to get the benefit of the auto-scanning provided by execute_remotely as well as more flexibility.
Is it possible to set that at task enqueueing, SuccessfulKoala55?
My docker image will have all required apt packages, so no need.
It recognizes the main repo, but I want it to push and pull from another one (my own forked repo). AgitatedDove14
AgitatedDove14 when I try this I get: clearml.backend_interface.session.SendError: Action failed <400/110: tasks.enqueue/v1.0 (Invalid task status (Invalid status change): current_status=in_progress, new_status=queued)> (queue=e78d2fdf2d5140b6b5c6678338c532bb, task=95082c9174a04044b25253d724362ec1)
AgitatedDove14 wouldn't the above command, task.execute_remotely(queue_name=None, clone=False, exit_process=False), fail because clone==False and exit_process==False is not supported? Task enqueuing itself must exit the process afterwards.
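The constraint quoted in that message can be sketched as a small guard. This is only an illustration of the rule as stated in the error text; check_execute_remotely_args is a hypothetical helper and not part of the clearml package.

```python
def check_execute_remotely_args(clone: bool, exit_process: bool) -> None:
    """Hypothetical guard mirroring the rule quoted in the error message.

    A task that enqueues itself (clone=False) has to exit the local
    process afterwards, so clone=False with exit_process=False is rejected.
    """
    if not clone and not exit_process:
        raise ValueError(
            "clone==False and exit_process==False is not supported. "
            "Task enqueuing itself must exit the process afterwards."
        )


# Either of these combinations passes the guard:
check_execute_remotely_args(clone=True, exit_process=False)
check_execute_remotely_args(clone=False, exit_process=True)
```

So the two ways out are to let the process exit after enqueuing, or to enqueue a clone and keep the local process running.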
I thought it worked earlier 😮
This is exactly what I was looking for. I thought once you call execute_remotely the task is sent and it's too late to change anything.
I already have that set to true and want that behavior. The issue is on the "committed" change set. When I push code to GitHub I push to my fork and pull from the main/master repo (all changes go through PRs from fork to main).
Now when I use execute_remotely, whatever code does the git discovery treats the repo I pull from as the repo to use. But these changes haven't necessarily been merged into main. The correct behavior would be to use the forked repo.
I know this is not the default behavior, so I'd be happy with having the option to override the repo when I call execute_remotely.
Here's another place where /root/ is hardcoded: https://github.com/allegroai/clearml-agent/blob/b196ab57931f3c67efcb561df0c8a2fe7c0e76f9/clearml_agent/commands/worker.py#L3338-L3341
Well, this doesn't work: pip install -e
It is indeed autopopulated by init
Is there a way to make it use ssh+git instead of git+git? Maybe add a force_ssh_pip_install to the agent config?
...
more-itertools==8.6.0
-e git+git@github.com:user/private_package.git@57f382f51d124299788544b3e7afa11c4cba2d1f#egg=private_package
msgpack==1.0.2
msgpack-numpy==0.4.7.1
...
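For reference, the form pip accepts for an SSH checkout is git+ssh://git@host/owner/repo.git@rev#egg=name. Below is a minimal sketch of rewriting the entry above into that form before installing. The helper name to_git_ssh_requirement is made up for illustration, and nothing here is an existing agent option.

```python
import re

# Matches the scp-style form seen above: git+git@<host>:<path>
_SCP_STYLE = re.compile(r"^git\+git@(?P<host>[^:/]+):(?P<path>.+)$")


def to_git_ssh_requirement(line: str) -> str:
    """Rewrite 'git+git@host:owner/repo.git@rev#egg=x' as a git+ssh:// URL.

    Lines that do not match that pattern are returned unchanged (stripped).
    """
    body = line.strip()
    prefix = ""
    if body.startswith("-e "):
        # Keep the editable flag and rewrite only the URL part.
        prefix, body = "-e ", body[3:].strip()
    match = _SCP_STYLE.match(body)
    if match is None:
        return line.strip()
    return f"{prefix}git+ssh://git@{match['host']}/{match['path']}"


print(to_git_ssh_requirement(
    "-e git+git@github.com:user/private_package.git"
    "@57f382f51d124299788544b3e7afa11c4cba2d1f#egg=private_package"
))
```

Ordinary pinned requirements such as msgpack==1.0.2 pass through untouched, so the function can be mapped over the whole list.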
The commit is valid for sure.
I'm wondering, would an older version of the agent work well with a newer server version and vice versa?
That won't work
The docker shell script runs too early in the process.
I want to inject a bash command after the repo has been cloned (and maybe even after the venv has been installed).
TimelyPenguin76 After creating the venv (so I don't have to do it myself). Once an env is there, I need to run a script while the env is activated, from the root of the repo.
So when the repo is cloned and the venv is created and activated, I want to execute this from the repo: tools/setup_dependencies.sh
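One possible workaround, sketched here under assumptions rather than as an agent feature: call the script from the top of the task's entry point, which runs after the clone and after the venv exists. The sketch assumes the entry point is executed by the venv's interpreter and that bash is available; find_repo_root and run_setup_script are hypothetical helpers.

```python
import os
import subprocess
import sys
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Path:
    """Walk upwards from 'start' until a directory containing .git is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / ".git").exists():
            return candidate
    raise FileNotFoundError(f"no .git directory found at or above {start}")


def run_setup_script(script: str = "tools/setup_dependencies.sh",
                     start: Optional[Path] = None) -> None:
    """Run the setup script from the repo root with the current venv first on PATH."""
    root = find_repo_root(start or Path.cwd())
    env = dict(os.environ)
    # Assumption: sys.executable lives in the venv's bin directory, so putting
    # that directory first on PATH approximates an "activated" environment.
    env["PATH"] = str(Path(sys.executable).parent) + os.pathsep + env.get("PATH", "")
    subprocess.run(["bash", script], cwd=str(root), env=env, check=True)
```

Calling run_setup_script() as the first statement of the entry script would run tools/setup_dependencies.sh once per execution; check=True makes a failing script abort the task instead of continuing silently.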
$ python --version
Python 3.6.8
$ python repo/toy_workflow.py --logtostderr --logtoclearml --clearml_queue=ada_manual_jobs
2021-08-07 04:04:16,844 - clearml - WARNING - Switching to remote execution, output log page https://...
On the webpage logs I see this:
2021-08-07 04:04:12 ClearML Task: created new task id=f1092bcbe30249639122a49a9b3f9145
ClearML results page:
2021-08-07 04:04:14 ClearML Monitor: GPU monitoring failed getting GPU reading, switching off GPU monitoring
2021-08...
AgitatedDove14 it was executed with Python 3 and I'm running in venv mode.
OH! I was installing it on an env
$ git remote -v
fork git@github.com:salimmj/somerepo.git (fetch)
fork git@github.com:salimmj/somerepo.git (push)
origin git@github.com:mainuser/somerepo.git (fetch)
origin git@github.com:mainuser/somerepo.git (push)
I want to keep the above setup; the remote branch that tracks my local one will be on fork, so it needs to pull from there. Currently it recognizes origin, so it doesn't work because the agent then can't find the commit.
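The selection being asked for can be sketched as follows: parse the output of git remote -v and prefer a named remote over origin. This is not what ClearML's git discovery does; parse_remotes and pick_remote_url are hypothetical helpers, and the preferred/fallback names are assumptions taken from the setup above.

```python
from typing import Dict, Optional


def parse_remotes(remote_v_output: str) -> Dict[str, str]:
    """Map remote name -> fetch URL from the output of 'git remote -v'."""
    remotes: Dict[str, str] = {}
    for line in remote_v_output.splitlines():
        parts = line.split()
        # Each line looks like: <name> <url> (fetch) or <name> <url> (push)
        if len(parts) == 3 and parts[2] == "(fetch)":
            remotes[parts[0]] = parts[1]
    return remotes


def pick_remote_url(remote_v_output: str,
                    preferred: str = "fork",
                    fallback: str = "origin") -> Optional[str]:
    """Return the preferred remote's URL, else the fallback's, else None."""
    remotes = parse_remotes(remote_v_output)
    return remotes.get(preferred) or remotes.get(fallback)


sample = """\
fork git@github.com:salimmj/somerepo.git (fetch)
fork git@github.com:salimmj/somerepo.git (push)
origin git@github.com:mainuser/somerepo.git (fetch)
origin git@github.com:mainuser/somerepo.git (push)
"""
print(pick_remote_url(sample))
```

With the remotes listed above this selects the fork URL, which is the repository the pushed commit actually lives in.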
I think it's great to let users build their own UI-connected apps. I'd use that for sure!
AgitatedDove14 no I mean I can do:
docker run -t --gpus "device=1" -dit -e APP_ENV=kprod -e CLEARML_WORKER_ID=ada:gpu1 -e CLEARML_DOCKER_IMAGE=922531023312.dkr.ecr.us-west-2.amazonaws.com/jym-coach:202108080511.7e8d6d1 -v /home/smjahad/.gitconfig:/root/.gitconfig -v /tmp/.clearml_agent.kjx6r9oo.cfg:/root/clearml.conf -v /tmp/clearml_agent.ssh.l8cguj81:/root/.ssh -v /home/smjahad/.clearml/apt-cache.1:/var/cache/apt/archives -v /home/smjahad/.clearml/pip-cache:/root/.cache/pip -v /home/smjah...
SuccessfulKoala55 I tried to make a docker image by combining one of our dockerfiles with this: https://github.com/allegroai/clearml-agent/blob/master/docker/agent/Dockerfile . I modified the entrypoint to also be a combination of both.
Right now I'm not seeing that error, but the process seems to exit (as completed) after the docker run. I'm wondering if my Dockerfile is not properly set up and it's exiting before the daemon is started.
ugh, sudo actually makes it fail explicitly because:
error: Could not fetch origin
Repository cloning failed: Command '['git', 'fetch', '--all', '--recurse-submodules']' returned non-zero exit status 1.
- Make sure you pushed the requested commit:
(repository='git@github.com:salimmj/clearml-demo.git', branch='main', commit_id='f76f3affd28d5558928d7ffd9a6797890ffdd708', tag='', docker_cmd='nvidia/cuda:11.4.0-runtime-ubuntu20.04', entry_point='mnist.py', working_dir='.')
- Check if remote-wo...
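A quick local check for the "make sure you pushed the requested commit" hint is to ask git which remote-tracking branches contain the commit, using git branch -r --contains. The wrapper below is a sketch; remotes_from_branch_listing and remotes_containing are hypothetical helper names, and the result only reflects what was fetched last.

```python
import subprocess
from typing import Set


def remotes_from_branch_listing(listing: str) -> Set[str]:
    """Extract remote names from 'git branch -r' style output."""
    remotes: Set[str] = set()
    for line in listing.splitlines():
        # Lines look like '  origin/main' or '  origin/HEAD -> origin/main'.
        ref = line.strip().split(" -> ")[0]
        if "/" in ref:
            remotes.add(ref.split("/", 1)[0])
    return remotes


def remotes_containing(commit: str, repo_dir: str = ".") -> Set[str]:
    """Return the remotes whose fetched branches contain 'commit'.

    Raises subprocess.CalledProcessError if git exits with an error,
    for example when the commit is unknown to the local repository.
    """
    result = subprocess.run(
        ["git", "branch", "-r", "--contains", commit],
        cwd=repo_dir,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return remotes_from_branch_listing(result.stdout)
```

If the remote the agent clones from is missing from the returned set, the commit has not been pushed there (or the local remote-tracking branches are stale and need a fetch).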
I tried with and without. I'm having the issue where, if I run the task from the queue, it completes as soon as it goes into docker, but if I run the same docker run command myself it works.