
AgitatedDove14 it was executed with Python 3 and I'm running in venv mode.
```
$ python --version
Python 3.6.8
$ python repo/toy_workflow.py --logtostderr --logtoclearml --clearml_queue=ada_manual_jobs
2021-08-07 04:04:16,844 - clearml - WARNING - Switching to remote execution, output log page https://...
```
On the webpage logs I see this:
```
2021-08-07 04:04:12 ClearML Task: created new task id=f1092bcbe30249639122a49a9b3f9145
ClearML results page:
2021-08-07 04:04:14 ClearML Monitor: GPU monitoring failed getting GPU reading, switching off GPU monitoring
2021-08...
```
That won't work 🙂
The docker shell script runs too early in the process.
I want to inject a bash command after the repo has been cloned (and maybe even after the venv has been installed).
So when the repo is cloned and the venv is created and activated, I want to execute this from the repo: `tools/setup_dependencies.sh`
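One possible workaround, sketched under the assumption that the script only needs the activated venv: call it from the top of the task's entry script, which the agent runs after the repo is cloned and the venv is installed.

```python
# Hedged workaround sketch: run the repo's dependency script from the task
# itself, once the agent has already cloned the repo and activated the venv.
import subprocess

subprocess.check_call(["bash", "tools/setup_dependencies.sh"])
```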
The `private_package` can be installed by doing `pip install git+ssh://git@github.com/user/private_package.git`, but the agent is trying to do `pip install private_package`, which won't work.
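A hedged sketch of one way to express this for the agent: pip's requirements format accepts `git+ssh` URLs directly, so pinning the private package in the repo's requirements file (filename assumed) sidesteps the PyPI lookup:

```
# requirements.txt (assumed filename) -- pip resolves this line from GitHub
# over SSH instead of trying `pip install private_package` against PyPI.
git+ssh://git@github.com/user/private_package.git
```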
It is indeed autopopulated by init
Our code is tightly integrated with protobufs, which need to be re-compiled every now and then. We have a script to do that. If that's not done, some imports end up failing.
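For illustration, such a recompile step usually boils down to regenerating the `*_pb2.py` modules with `protoc`; the layout below is hypothetical, not the actual script:

```python
# Hypothetical sketch of a protobuf recompile step: regenerate the *_pb2.py
# modules from the .proto sources so imports of the generated code succeed.
import glob
import subprocess

proto_files = glob.glob("proto/**/*.proto", recursive=True)  # assumed layout
subprocess.check_call(["protoc", "--python_out=.", *proto_files])
```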
AgitatedDove14 this works: `pip install git+ssh://git@github.com/user/repo.git`
I think it works, I'm fixing something else that came up.
If you were to add this, where would you put it? I can use a modified version of clearml-agent
TimelyPenguin76 After creating the venv (so I don't have to do it myself). Once an env is there, I need to run a script from the root of the repo while the env is activated.
I think it's great to let users build their own UI-connected apps, I'd use that for sure!
Is it possible to set that at task enqueueing, SuccessfulKoala55?
I tried with and without. I'm having the issue where, if I run the task from the queue, it completes as soon as it goes into docker, but if I run the same `docker run` it works.
It's not that, I think, because it works if I run the same command manually.
ugh, sudo actually makes it fail explicitly because:
```
error: Could not fetch origin
Repository cloning failed: Command '['git', 'fetch', '--all', '--recurse-submodules']' returned non-zero exit status 1.
- Make sure you pushed the requested commit:
(repository='git@github.com:salimmj/clearml-demo.git', branch='main', commit_id='f76f3affd28d5558928d7ffd9a6797890ffdd708', tag='', docker_cmd='nvidia/cuda:11.4.0-runtime-ubuntu20.04', entry_point='mnist.py', working_dir='.')
- Check if remote-wo...
```
The commit is valid for sure.
EagerOtter28 I'm running into a similar situation to yours.
I think you could use `--standalone-mode` and do the cloning yourself in the docker bash script that you can configure in the agent config.
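As a rough sketch of that idea (assuming the `agent.extra_docker_shell_script` field in `clearml.conf`; the clone URL is illustrative only), the agent config could carry the clone step like this:

```
# clearml.conf sketch -- extra_docker_shell_script runs inside the container
# before the task starts; the repo URL below is illustrative.
agent {
    extra_docker_shell_script: [
        "git clone git@github.com:user/repo.git",
    ]
}
```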
I do expect it to `pip install` though, which doesn't need root access I think.
Great find! So a pip upgrade should fix it hopefully.
It doesn't install it automatically, I think I need to specify it somewhere, see the above error. Or am I misunderstanding?
I am already forcing SSH auth
SuccessfulKoala55 I tried to make a docker image by combining one of our Dockerfiles with this https://github.com/allegroai/clearml-agent/blob/master/docker/agent/Dockerfile . I modified the entrypoint to also be a combination of both.
Right now I'm not seeing that error, but the process seems to exit (as completed) after the `docker run`. I'm wondering if my Dockerfile is not properly set up and it's exiting before the daemon is started.
AgitatedDove14 wouldn't the above command `task.execute_remotely(queue_name=None, clone=False, exit_process=False)` fail, because `clone==False` with `exit_process==False` is not supported ("Task enqueuing itself must exit the process afterwards")?
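A minimal sketch of the combinations that should pass that check (queue name borrowed from the log above, project/task names are placeholders): either clone the task and keep the local process alive, or enqueue the task itself and let the call exit the process.

```python
# Hedged sketch: avoid the unsupported clone=False + exit_process=False pair.
from clearml import Task

task = Task.init(project_name="demo", task_name="example")  # placeholder names

# Supported: clone the task and keep running locally...
new_task = task.execute_remotely(queue_name="ada_manual_jobs", clone=True, exit_process=False)
# ...or enqueue this task itself and let the call exit the process:
# task.execute_remotely(queue_name="ada_manual_jobs", clone=False, exit_process=True)
```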
I thought it worked earlier 😮
I already have that set to true and want that behavior. The issue is with the "committed" change set. When I push code to GitHub I push to my fork and pull from the main/master repo (all changes go through PRs from fork to main).
Now when I use `execute_remotely`, whatever code does the git discovery considers whatever repo I pull from as the repo to use. But these changes haven't necessarily been merged into main. The correct behavior would be to use the forked repo.
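One way to pin the fork explicitly, sketched with `Task.create` (project/task names are placeholders; repo URL and entry point taken from the error above), which bypasses the automatic git discovery entirely:

```python
# Hedged sketch: create the task against the fork explicitly instead of
# relying on git discovery picking the upstream remote.
from clearml import Task

task = Task.create(
    project_name="demo",          # placeholder
    task_name="mnist-from-fork",  # placeholder
    repo="git@github.com:salimmj/clearml-demo.git",  # the fork, not upstream
    branch="main",
    script="mnist.py",
)
```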
This is exactly what I was looking for. I thought once you call `execute_remotely` the task is sent and it's too late to change anything.
Fixed it by adding this code block. Makes sense.
```python
if clone:
    task = Task.clone(self)
else:
    task = self
    # check if the server supports enqueueing aborted/stopped Tasks
    if Session.check_min_api_server_version('2.13'):
        self.mark_stopped(force=True)
    else:
        self.reset()
```
If venv works inside containers, that's even better. We actually have custom containers that build on master merges. I wonder if using our own containers, which should have most of the deps, will work better than a simpler container.