Answered

Hi everyone, I am running a pipeline using the autoscaler. I am able to spin up the VM instance with the autoscaler, and Docker also gets installed there correctly. The issue I am facing is that while executing a pipeline task, cloning my git repo gets stuck and runs forever. It shows me the following logs in the console:

Collecting git+git_repo_name
Cloning git_repo
Running command git clone -q git_repo_name
Username for ' None ':

While creating the autoscaler instance I did provide my git credentials, i.e. my username and Personal Access Token. How should I resolve this issue?
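For context, on a self-hosted setup the git credentials the agent uses for cloning would normally live in the agent section of clearml.conf. A minimal sketch, assuming the standard agent.git_user / agent.git_pass keys (the values are placeholders):

```
agent {
    # Credentials the agent uses when cloning repositories (placeholders)
    git_user: "my_username"
    git_pass: "my_personal_access_token"
}
```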

  
  
Posted one year ago

Answers 30


While we spin up the autoscaler instance.

  
  
Posted one year ago

Oh ok, thanks though!!

  
  
Posted one year ago

Ok, I'll try that out

  
  
Posted one year ago

Ok, I'll try that out. enable_git_ask_pass: true is not working.

  
  
Posted one year ago

Then try to add the missing apt packages

extra_docker_shell_script: ["apt-get install -y ???", ]


  
  
Posted one year ago

I am not familiar with the autoscaler... are you using the paid version of ClearML?

  
  
Posted one year ago

yeah

  
  
Posted one year ago

I provided the credentials while setting up the autoscaler instance. Where can I look for clearml.conf? When I SSH into the instance spun up by the autoscaler, I am not able to see clearml.conf.

  
  
Posted one year ago

Try to add '--network host' to the docker args on the task you are launching

  
  
Posted one year ago

I don't have it, so I don't know how things are set up and how to pass on credentials in this case.

  
  
Posted one year ago

And one more thing: is there a way to make changes to the .bashrc that is present inside the docker container?

  
  
Posted one year ago

If you could let me know how to resolve this, @<1576381444509405184:profile|ManiacalLizard2> @<1523701087100473344:profile|SuccessfulKoala55>, that would be very helpful.

  
  
Posted one year ago

How did you provide credentials to ClearML and git?

  
  
Posted one year ago

What is the command you use to run clearml-agent?

  
  
Posted one year ago

On the host machine, or inside the containers that are spinning on the host machine?

  
  
Posted one year ago

Inside the containers that are spinning on the host machine.

  
  
Posted one year ago

Let me know if this is enough information or not

  
  
Posted one year ago

This looks like the agent running inside your docker did not have any username/password to do the git clone, so the default behavior is to wait for keyboard input, which looks like hanging...

  
  
Posted one year ago

While creating the autoscaler instance I did provide my git credentials, i.e. my username and Personal Access Token.

How exactly did you do that ?

  
  
Posted one year ago

Hmm I see, add this for example

extra_docker_shell_script: ["rm ~/.bashrc", "echo removed bashrc"]


  
  
Posted one year ago

Because I think I need to have the following two lines in the .bashrc, as well as the GOOGLE_APPLICATION_CREDENTIALS:
git config --global user.email 'email'
git config --global user.name "user_name"
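If the goal is just to have those git settings applied inside the container, one option is to run them at container startup via the extra_docker_shell_script field mentioned earlier in this thread, instead of editing .bashrc. A sketch (email and user_name are placeholders):

```
extra_docker_shell_script: [
    "git config --global user.email 'email'",
    "git config --global user.name 'user_name'",
]
```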

  
  
Posted one year ago

Note: switching to 'commit_id'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:
git switch -c <new-branch-name>
Or undo this operation with:
git switch -
Turn off this advice by setting config variable advice.detachedHead to false
HEAD is now at commit_id
type: git
url: git_repo
branch: HEAD
commit: commit_id
root: root_dir
Ignoring pip: markers 'python_version >= "3.10"' don't match your environment
Collecting pip<20.2
Using cached pip-20.1.1-py2.py3-none-any.whl (1.5 MB)
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 23.2.1
Uninstalling pip-23.2.1:
Successfully uninstalled pip-23.2.1
2023-10-12 11:49:23
Successfully installed pip-20.1.1
Collecting git+git_repo_name
Cloning git_repo
Running command git clone -q git_repo_name
Username for ' None ':
2023-10-12 12:19:36
User aborted: stopping task (1)

2023-10-12 12:19:36
Process aborted by user

  
  
Posted one year ago

Hi @<1610083503607648256:profile|DiminutiveToad80> , can you perhaps include a more comprehensive log?

  
  
Posted one year ago

What does your clearml.conf look like?

  
  
Posted one year ago

@<1610083503607648256:profile|DiminutiveToad80> try to turn on:

enable_git_ask_pass: true
  
  
Posted one year ago

Just a follow-up on this issue, @<1523701087100473344:profile|SuccessfulKoala55> @<1523701205467926528:profile|AgitatedDove14>: I would very much appreciate it if you could help me with this.

  
  
Posted one year ago

Ok, I was able to resolve the above issue, but now I am getting the following error while executing a task:

import cv2
File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/cv2/__init__.py", line 181, in <module>
bootstrap()
File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/cv2/__init__.py", line 153, in bootstrap
native_module = importlib.import_module("cv2")
File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ImportError: libGL.so.1: cannot open shared object file: No such file or directory
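For what it's worth, libGL.so.1 is normally provided by the OpenGL runtime package, so one way to fix this is to install it through the autoscaler's extra_docker_shell_script field. A sketch, assuming a Debian/Ubuntu base image (package names differ on other distros; libglib2.0-0 is included because cv2 often needs it too):

```
extra_docker_shell_script: [
    "apt-get update",
    "apt-get install -y libgl1 libglib2.0-0",
]
```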

  
  
Posted one year ago

try:

docker_install_opencv_libs: true
  
  
Posted one year ago

Still giving me the same error

  
  
Posted one year ago

@<1523701205467926528:profile|AgitatedDove14> I was able to resolve that, but now I am having issues with fiftyone; it's showing me the following error:

import fiftyone as fo
File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/fiftyone/__init__.py", line 25, in <module>
from fiftyone.public import *
File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/fiftyone/public.py", line 15, in <module>
_foo.establish_db_conn(config)
File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/fiftyone/core/odm/database.py", line 200, in establish_db_conn
port = _db_service.port
File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/fiftyone/core/service.py", line 276, in port
return self._wait_for_child_port()
File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/fiftyone/core/service.py", line 170, in _wait_for_child_port
return find_port()
File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/retrying.py", line 56, in wrapped_f
return Retrying(*dargs, **dkw).call(f, *args, **kw)
File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/retrying.py", line 266, in call
raise attempt.get()
File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/retrying.py", line 301, in get
six.reraise(self.value[0], self.value[1], self.value[2])
File "/usr/local/lib/python3.8/dist-packages/six.py", line 719, in reraise
raise value
File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/retrying.py", line 251, in call
attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/fiftyone/core/service.py", line 168, in find_port
raise ServiceListenTimeout(etau.get_class_name(self), port)
fiftyone.core.service.ServiceListenTimeout: fiftyone.core.service.DatabaseService failed to bind to port
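One thing that may help here (an assumption based on FiftyOne's documented configuration, not something confirmed in this thread): FiftyOne spawns its own bundled MongoDB by default, and that service can fail to bind a port inside a container. Pointing it at an external MongoDB via the FIFTYONE_DATABASE_URI environment variable sidesteps the bundled service; the host below is a placeholder.

```shell
# Set in the container environment before the task starts
# (mongodb host/port are placeholders for a reachable MongoDB instance)
export FIFTYONE_DATABASE_URI="mongodb://my-mongo-host:27017"
```

Note that for the variable to take effect it has to be present in the environment of the process that imports fiftyone, so where exactly you export it depends on how the agent launches the task.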

  
  
Posted one year ago