Answered
Hey there! Question about the AWS autoscaler. The TL;DR is that I can't get aws_autoscaler.py, when run with the --remote flag, to clone my Git repository (hosted on GitLab). Here's what I did so far:

hey there! question about the AWS autoscaler. the TL;DR is that i can't get aws_autoscaler.py, when run with the --remote flag, to clone my git repository (hosted on GitLab).
here's what i did so far:

1. created a daemon using this command:
   ~/.local/bin/clearml-agent daemon -d --create-queue --queue scaler --git-user *** --git-pass '***' --docker registry.gitlab.com/visionary.ai/brightervision/py-vai:latest --cpu-only
   note on the git user and password: i gave the GitLab token and not the password, though it doesn't work either way.
2. ran the script aws_autoscaler.py almost without changes; i only changed the queue name to "scaler" (from services). when i run the code i see on my webserver that the task is indeed running, but it fails when cloning my repository.
3. i've tried adding my git credentials to the clearml.conf file and tried passing them to the daemon. no matter what i do, this is the output i get:

remote: The project you were looking for could not be found or you don't have permission to view it.
fatal: repository '
' not found
fatal: clone of '
%40visionary.ai:glpat-qBsjZLLBZXiXyygZec6w@gitlab.com/Visionary.ai/deployed-models.git' into submodule path '/root/.clearml/vcs-cache/brightervision.git.914550bb3b0c1d9fa4908353eae6005a/brightervision.git/jetson/deployed-models' failed
Failed to clone 'deployed-models'. Retry scheduled
remote: The project you were looking for could not be found or you don't have permission to view it.
fatal: repository '
' not found
fatal: clone of 'https://<git user>:<git password>@gitlab.com/Visionary.ai/deployed-models.git' into submodule path '/root/.clearml/vcs-cache/brightervision.git.914550bb3b0c1d9fa4908353eae6005a/brightervision.git/jetson/deployed-models' failed
Failed to clone 'deployed-models' a second time, aborting
Failed to recurse into submodule path 'jetson'
Repository cloning failed: Command '['clone', 'https:/<user>@gitlab.com/Visionary.ai/brightervision.git', '/root/.clearml/vcs-cache/brightervision.git.914550bb3b0c1d9fa4908353eae6005a/brightervision.git', '--quiet', '--recursive']' returned non-zero exit status 1.
clearml_agent: ERROR: Failed cloning repository. 
1) Make sure you pushed the requested commit:
(repository='git@gitlab.com:Visionary.ai/brightervision.git', branch='clearml-testing', commit_id='b1f912faa67fa2db921c277f3c953022d35a7c05', tag='', docker_cmd='registry.gitlab.com/visionary.ai/brightervision/py-vai:latest', entry_point='research/amir/clearml/aws_autoscaler.py', working_dir='.')
2) Check if remote-worker has valid credentials [see worker configuration file]

as far as i know, the token i've provided grants full API access, and the user has access to all sub-repositories.
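One way to test the credentials outside the agent is to build the same kind of HTTPS URL the log shows (SSH remote rewritten to `https://user:token@host/path`) and try it by hand. Below is a minimal Python sketch of that idea. The helper names `ssh_to_https` and `can_read` are made up for illustration and are not part of clearml-agent, and the username and token in the usage note are placeholders. It assumes the username may be an email, so the `@` has to be percent-encoded (the `%40` visible in the log).

```python
import os
import re
import subprocess
from urllib.parse import quote


def ssh_to_https(repo_url: str, user: str, token: str) -> str:
    """Rewrite git@host:group/repo.git as https://user:token@host/group/repo.git.

    Hypothetical helper for manual testing, not the agent's own code.
    """
    match = re.fullmatch(r"git@([^:]+):(.+)", repo_url)
    if not match:
        raise ValueError(f"not an SSH-style URL: {repo_url!r}")
    host, path = match.groups()
    # Percent-encode the credentials so an '@' in an email username
    # is not mistaken for the separator before the host.
    return f"https://{quote(user, safe='')}:{quote(token, safe='')}@{host}/{path}"


def can_read(url: str) -> bool:
    """Return True if `git ls-remote` succeeds (needs git and network access)."""
    env = {**os.environ, "GIT_TERMINAL_PROMPT": "0"}  # fail instead of prompting
    result = subprocess.run(
        ["git", "ls-remote", "--heads", url],
        capture_output=True, text=True, env=env,
    )
    return result.returncode == 0
```

Calling `can_read(ssh_to_https("git@gitlab.com:Visionary.ai/deployed-models.git", "<user>", "<token>"))` for the main repository and for each submodule would show which repository the token cannot reach.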

  
  
Posted 5 months ago

Answers 5


Hi @ZealousFlamingo93, I'm not sure I understand. You're trying to run the autoscaler; how is the clearml-agent connected to this?

  
  
Posted 5 months ago

Can you please elaborate a bit on your setup and what you're trying to achieve?

  
  
Posted 5 months ago

Hey @ZealousFlamingo93, I had a similar problem with GitLab tokens not working with the Agent. My case was slightly different: the error was clearly a permissions issue and offered no alternatives. Your output, though, suggests checking whether your remote worker has valid credentials as well as making sure you pushed the right commit.

I resolved the issue by creating a GitLab token with the Developer role. I found that with private GitLab repos, the Guest role (which is the default for GitLab project access tokens) does not have permission to clone or even access the repos.
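To make the role comparison concrete, here is a small Python sketch of GitLab's numeric access levels. The level numbers are the ones GitLab's API uses; the helper names and the "Reporter or higher can pull private code" threshold are my assumptions from GitLab's permissions table, so double-check them against your GitLab version.

```python
# GitLab access levels as reported by its API (e.g. in project member listings).
GITLAB_ACCESS_LEVELS = {
    0: "No access",
    10: "Guest",
    20: "Reporter",
    30: "Developer",
    40: "Maintainer",
    50: "Owner",
}


def role_name(level: int) -> str:
    """Map a numeric access level to its role name."""
    return GITLAB_ACCESS_LEVELS.get(level, f"Unknown ({level})")


def can_pull_private_code(level: int, minimum: int = 20) -> bool:
    """Assumption: pulling code from a private project needs Reporter (20) or higher."""
    return level >= minimum
```

Note that a project access token only covers the project it was created in, so a submodule living in another project may still be unreachable even when the role is high enough.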

  
  
Posted 5 months ago

@HighCoyote66 managed to solve the issue. the git credentials i provided were indeed in the developer role; i switched to my personal git credentials (which have the maintainer role) and it works smoothly. thanks for the help!

  
  
Posted 5 months ago

hey, thanks for the reply.
i understood (perhaps wrongly) that i need to create the "scaler" queue and have an agent listening on it, so that when i run the auto_scaler with the --remote flag something will pick up the task.
as for the current setup question, do you mean how my machines are configured?
what i'm trying to achieve is to spin up EC2 instances so that we can train our networks. i want to be able to launch multiple instances but also control when they are turned off, so the auto_scaler seems like the logical solution.

  
  
Posted 5 months ago
Tags
aws