Answered
I Am Working Up With The Autoscaler, After Setting Up The Autoscaler Instance I Am Getting The Following Error When I Launch The Autoscaler Googleapiclient.Errors.Httperror: <Httperror 404 When Requesting

I am working with the autoscaler. After setting up the autoscaler instance, I get the following error when I launch it:

googleapiclient.errors.HttpError: <HttpError 404 when requesting https://compute.googleapis.com/compute/v1/projects/ .. returned "The resource 'projects/../' was not found". Details: "[{'message': "The resource 'projects/../' was not found", 'domain': 'global', 'reason': 'notFound'}]">

  
  
Posted one year ago

Answers 20


@<1610083503607648256:profile|DiminutiveToad80> if you're using GCP, there's some machine image you should be specifying for the machine - the docker image is only used later by the agent, when the agent is running. Can you please elaborate on exactly what is starting inside the instance, and share logs to show it?

  
  
Posted one year ago

So, I was able to resolve the above issues.

  
  
Posted one year ago

I am also facing another issue: the task is not able to clone the GitHub repo. It shows an authentication error even though I have passed my git credentials.

  
  
Posted one year ago

Also @<1523701087100473344:profile|SuccessfulKoala55>, when the autoscaler spins up my GCP instance and I look inside it, I am not able to find the clearml.conf file. Does it not install clearml automatically when it spins up the VM?

  
  
Posted one year ago

Well, the VM is running with the default docker image nvidia/cuda:10.2-cudnn7-runtime-ubuntu18.04, but it's not spinning up the agent when the VM is initialized.

  
  
Posted one year ago

So, funny thing: I was making a typo while writing the GPU type. I was writing NVIDIA T4 instead of nvidia-tesla-t4.
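
For anyone hitting the same 404, a quick local check on the GPU type string can catch this before the autoscaler launches anything. The sketch below is only an illustration: the helper name `accelerator_type_path` is hypothetical, and it assumes accelerator type names follow the usual GCE convention of lowercase letters, digits and hyphens (as in nvidia-tesla-t4). It is not part of the autoscaler itself.

```python
import re

# Assumption: accelerator type names use lowercase letters, digits and
# hyphens, like "nvidia-tesla-t4". A display name such as "NVIDIA T4"
# does not match and ends up in a resource path that returns 404.
_NAME_RE = re.compile(r"^[a-z]([-a-z0-9]*[a-z0-9])?$")


def accelerator_type_path(project, zone, accelerator_type):
    """Hypothetical helper: build the acceleratorTypes resource path,
    rejecting names that cannot be valid resource identifiers."""
    if not _NAME_RE.match(accelerator_type):
        raise ValueError(
            f"{accelerator_type!r} does not look like an accelerator type "
            "name; expected something like 'nvidia-tesla-t4'"
        )
    return f"projects/{project}/zones/{zone}/acceleratorTypes/{accelerator_type}"


if __name__ == "__main__":
    # "my-project" is a placeholder project ID.
    print(accelerator_type_path("my-project", "us-central1-a", "nvidia-tesla-t4"))
```

Passing "NVIDIA T4" to this helper raises a ValueError instead of producing the path that appears in the 404 message.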

  
  
Posted one year ago

Good to hear - what did you change?
Regarding your question, the autoscaler will automatically inject a startup script to do that for you, but you will need to make sure the VM has docker installed.

  
  
Posted one year ago

Hey, so I am able to spin up the GCP instance using the autoscaler. I wanted to confirm one thing: does the autoscaler spin up the agent automatically in the VM, or do I need to add a command for that to the bash script?

  
  
Posted one year ago

I was able to set up a GCP VM manually earlier, i.e. without the autoscaler.

  
  
Posted one year ago

Hi @<1610083503607648256:profile|DiminutiveToad80> , apologies for the delay - is it possible that a T4 is not available in the zone you're configuring?
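
One way to check this is to list the accelerator types the Compute Engine API reports for the zone. The following is a rough sketch, assuming the google-api-python-client package and working default credentials; the function name `list_accelerator_types` and the placeholder project ID are hypothetical.

```python
def list_accelerator_types(compute, project, zone):
    """Return the accelerator type names available in one zone.

    `compute` is expected to be a Compute Engine v1 client, e.g. created
    with googleapiclient.discovery.build("compute", "v1").
    """
    names = []
    request = compute.acceleratorTypes().list(project=project, zone=zone)
    while request is not None:
        response = request.execute()
        names.extend(item["name"] for item in response.get("items", []))
        # list_next returns None once there are no more pages.
        request = compute.acceleratorTypes().list_next(
            previous_request=request, previous_response=response
        )
    return names


# Usage sketch (needs network access and GCP credentials):
#   from googleapiclient import discovery
#   compute = discovery.build("compute", "v1")
#   print(list_accelerator_types(compute, "my-project", "us-central1-a"))
```

If the name configured in the autoscaler does not appear in the returned list, either the type is not offered in that zone or the name is misspelled.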

  
  
Posted one year ago

I don't think it has issues with this

  
  
Posted one year ago

It has my project ID and zone

  
  
Posted one year ago

@<1610083503607648256:profile|DiminutiveToad80> how is this section of the autoscaler wizard configured?
image

  
  
Posted one year ago

image

  
  
Posted one year ago

While creating GCP credentials using None
What values should I insert in the following step so that the autoscaler has access? As of now I have left this field blank.

  
  
Posted one year ago

I did provide the credentials. Also, I am running the autoscaler for the first time, so no, it hasn't worked before.

  
  
Posted one year ago

Thanks!
Hmm from here : None
Could it be that you do not have privileges for the resource, or that you did not provide credentials?
Did that autoscaler work before?

  
  
Posted one year ago

image
image
image

  
  
Posted one year ago

2023-10-03 20:46:07,100 - clearml.Auto-Scaler - INFO - Spinning new instance resource='clearml-autoscaler-vm', prefix='dynamic_gcp', queue='default'
2023-10-03 20:46:07,107 - googleapiclient.discovery_cache - INFO - file_cache is only supported with oauth2client<4.0.0
2023-10-03 20:46:07,122 - clearml.Auto-Scaler - INFO - Creating regular instance for resource clearml-autoscaler-vm
2023-10-03 20:46:07,264 - clearml.Auto-Scaler - INFO - --- Cloud instances (0):
2023-10-03 20:46:07,482 - clearml.Auto-Scaler - INFO - stopping
2023-10-03 20:46:07,482 - clearml.Auto-Scaler - INFO - state change: State.RUNNING -> State.STOPPED
2023-10-03 20:46:07,482 - clearml.Auto-Scaler - INFO - Autoscaler exits
2023-10-03 20:46:07,556 - clearml.Auto-Scaler - ERROR - Failed to start new instance (resource 'clearml-autoscaler-vm'), Error: <HttpError 404 when requesting https://compute.googleapis.com/compute/v1/projects/../zones/ returned "The resource 'projects/../zones/us-central1-a/ was not found". Details: "[{'message': "The resource 'projects/../zones/ was not found", 'domain': 'global', 'reason': 'notFound'}]">
Traceback (most recent call last):
File "/root/.clearml/venvs-builds/3/task_repository/clearml-apps.git/apps/auto_scaler/auto_scaler.py", line 742, in launch_one
instance_id = self.driver.spin_up_worker(resource_conf, worker_prefix, queue, task_id=task_id)
File "/root/.clearml/venvs-builds/3/task_repository/clearml-apps.git/apps/auto_scaler/cloud_driver.py", line 281, in spin_up_worker
instance_id, region = self._spin_up_worker(resource_conf, worker_prefix, queue_name, task_id)
File "/root/.clearml/venvs-builds/3/task_repository/clearml-apps.git/apps/auto_scaler/gcp_driver.py", line 194, in _spin_up_worker
exc, response = f(*args)
File "/root/.clearml/venvs-builds/3/task_repository/clearml-apps.git/apps/auto_scaler/networking.py", line 18, in wrapper
return func(obj_instance, *args, **kwargs)
File "/root/.clearml/venvs-builds/3/task_repository/clearml-apps.git/apps/auto_scaler/gcp_driver.py", line 137, in attempt_launch
spin_up_client.instances().insert(project=self.gcp_project_id, zone=zone, body=launch_spec).execute()
File "/root/venv/lib/python3.8/site-packages/googleapiclient/_helpers.py", line 131, in positional_wrapper
return wrapped(*args, **kwargs)
File "/root/venv/lib/python3.8/site-packages/googleapiclient/http.py", line 937, in execute
raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 404 when requesting https://compute.googleapis.com/compute/v1/projects/zones/../ returned "The resource 'projects/../zones/' was not found". Details: "[{'message': "The resource 'projects/../zones/us-central1-a/acceleratorTypes/NVIDIA T4' was not found", 'domain': 'global', 'reason': 'notFound'}]">

1696365971074 apps-agent-i-08bf8b26b6175ea1f-1:service:8d816e475307473885aaa87b52a5c526 DEBUG Process aborted by user

  
  
Posted one year ago

Hi @<1610083503607648256:profile|DiminutiveToad80>
I think we will need more context for the log...
but I think there is something wrong with the GCP resource configuration of your autoscaler
Can you send the full autoscaler log and the configuration?

  
  
Posted one year ago
987 Views
20 Answers