I am working with the autoscaler. After setting up the autoscaler instance I am getting the following error when I launch the autoscaler: googleapiclient.errors.HttpError: <HttpError 404 when requesting

Answered

I am working with the autoscaler. After setting up the autoscaler instance, I get the following error when I launch the autoscaler:

googleapiclient.errors.HttpError: <HttpError 404 when requesting https://compute.googleapis.com/compute/v1/projects/ .. returned "The resource 'projects/../' was not found". Details: "[{'message': "The resource 'projects/../' was not found", 'domain': 'global', 'reason': 'notFound'}]">

  				
Posted one year ago by DiminutiveToad80

Answers 20

I don't think it has issues with this

  				
Posted one year ago by DiminutiveToad80

Hey, so I am able to spin up the GCP instance using the autoscaler. I wanted to confirm one thing: does the autoscaler spin up the agent automatically in the VM, or do I need to add a script for that to the bash script?

  				
Posted one year ago by DiminutiveToad80

Thanks!
Hmm from here : None
Could it be that you do not have privileges to the resource, or that you did not provide credentials?
Did that autoscaler work before?

  				
Posted one year ago by AgitatedDove14

DiminutiveToad80 if you're using GCP, there's some machine image you should be specifying for the machine - the docker image is only used later by the agent, when the agent is running. Can you please elaborate on exactly what is starting inside the instance, and share logs to show it?

  				
Posted one year ago by SuccessfulKoala55

Also SuccessfulKoala55, when the autoscaler spins up my GCP instance and I look inside it, I am not able to find the clearml.conf file. Does it not install clearml automatically when it spins up the VM?

  				
Posted one year ago by DiminutiveToad80

I did provide the credentials, and I am running the autoscaler for the first time, so no, it hasn't worked before.

  				
Posted one year ago by DiminutiveToad80

So, funny thing: I had a typo in the GPU type. I was writing NVIDIA T4 instead of nvidia-tesla-t4.
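The mismatch is easy to see once the GPU type is placed into the resource path that Compute Engine reports in the 404. Below is a minimal sketch, assuming accelerator type names follow the usual lowercase-and-hyphen pattern of GCP resource names. The helper `accelerator_type_path`, the regex, and the project ID `my-project` are illustrative only and are not part of the autoscaler.

```python
import re

# Assumption: accelerator type names use lowercase letters, digits and
# hyphens, start with a letter and do not end with a hyphen.
_NAME_RE = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")


def accelerator_type_path(project: str, zone: str, gpu_type: str) -> str:
    """Build the partial resource path for an accelerator type (hypothetical helper)."""
    if not _NAME_RE.fullmatch(gpu_type):
        raise ValueError(
            f"{gpu_type!r} does not look like an accelerator type name "
            "(expected something like 'nvidia-tesla-t4')"
        )
    return f"projects/{project}/zones/{zone}/acceleratorTypes/{gpu_type}"


if __name__ == "__main__":
    # 'my-project' is a placeholder project ID.
    print(accelerator_type_path("my-project", "us-central1-a", "nvidia-tesla-t4"))
    try:
        accelerator_type_path("my-project", "us-central1-a", "NVIDIA T4")
    except ValueError as err:
        print(f"rejected: {err}")
```

A check like this only catches malformed names. Whether a well-formed type is actually offered in a given zone still has to be confirmed against GCP.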

  				
Posted one year ago by DiminutiveToad80

Well, the VM is running in the default docker nvidia/cuda:10.2-cudnn7-runtime-ubuntu18.04, but it's not spinning up the agent when the VM is initialized.

  				
Posted one year ago by DiminutiveToad80

Hi DiminutiveToad80, apologies for the delay. Is it possible that a T4 is not available in the zone you're configuring?

  				
Posted one year ago by SuccessfulKoala55

It has my project ID and zone

  				
Posted one year ago by DiminutiveToad80

  				

Hi DiminutiveToad80
I think we will need more context for the log...
but I think there is something wrong with the GCP resource configuration of your autoscaler
Can you send the full autoscaler log and the configuration?

  				
Posted one year ago by AgitatedDove14

Also, I was facing another issue: the task is not able to clone the GitHub repo. It shows an authentication error even though I have passed my git credentials.

  				
Posted one year ago by DiminutiveToad80

So, I was able to resolve the above issues.

  				
Posted one year ago by DiminutiveToad80

I was able to set up a GCP VM manually earlier, without the autoscaler.

  				
Posted one year ago by DiminutiveToad80

DiminutiveToad80 how is this section of the autoscaler wizard configured?

  				
Posted one year ago by SuccessfulKoala55

Good to hear - what did you change?
Regarding your question, the autoscaler will automatically inject a startup script to do that for you, but you will need to make sure the VM contains docker
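One quick way to check the Docker requirement on a machine image before pointing the autoscaler at it is sketched below. This is a minimal sketch, assuming Docker counts as present when a `docker` executable is on `PATH`. The helper name `has_executable` is hypothetical and not part of ClearML.

```python
import shutil


def has_executable(name: str) -> bool:
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    # Run this on the VM (or on the image it is built from).
    if has_executable("docker"):
        print("docker found on PATH")
    else:
        print("docker missing: install it on the machine image first")
```

This only confirms that the binary exists. It does not verify that the Docker daemon is running or that GPU support is configured.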

  				
Posted one year ago by SuccessfulKoala55

2023-10-03 20:46:07,100 - clearml.Auto-Scaler - INFO - Spinning new instance resource='clearml-autoscaler-vm', prefix='dynamic_gcp', queue='default'
2023-10-03 20:46:07,107 - googleapiclient.discovery_cache - INFO - file_cache is only supported with oauth2client<4.0.0
2023-10-03 20:46:07,122 - clearml.Auto-Scaler - INFO - Creating regular instance for resource clearml-autoscaler-vm
2023-10-03 20:46:07,264 - clearml.Auto-Scaler - INFO - --- Cloud instances (0):
2023-10-03 20:46:07,482 - clearml.Auto-Scaler - INFO - stopping
2023-10-03 20:46:07,482 - clearml.Auto-Scaler - INFO - state change: State.RUNNING -> State.STOPPED
2023-10-03 20:46:07,482 - clearml.Auto-Scaler - INFO - Autoscaler exits
2023-10-03 20:46:07,556 - clearml.Auto-Scaler - ERROR - Failed to start new instance (resource 'clearml-autoscaler-vm'), Error: <HttpError 404 when requesting https://compute.googleapis.com/compute/v1/projects/../zones/ returned "The resource 'projects/../zones/us-central1-a/ was not found". Details: "[{'message': "The resource 'projects/../zones/ was not found", 'domain': 'global', 'reason': 'notFound'}]">
Traceback (most recent call last):
  File "/root/.clearml/venvs-builds/3/task_repository/clearml-apps.git/apps/auto_scaler/auto_scaler.py", line 742, in launch_one
    instance_id = self.driver.spin_up_worker(resource_conf, worker_prefix, queue, task_id=task_id)
  File "/root/.clearml/venvs-builds/3/task_repository/clearml-apps.git/apps/auto_scaler/cloud_driver.py", line 281, in spin_up_worker
    instance_id, region = self._spin_up_worker(resource_conf, worker_prefix, queue_name, task_id)
  File "/root/.clearml/venvs-builds/3/task_repository/clearml-apps.git/apps/auto_scaler/gcp_driver.py", line 194, in _spin_up_worker
    exc, response = f(*args)
  File "/root/.clearml/venvs-builds/3/task_repository/clearml-apps.git/apps/auto_scaler/networking.py", line 18, in wrapper
    return func(obj_instance, *args, **kwargs)
  File "/root/.clearml/venvs-builds/3/task_repository/clearml-apps.git/apps/auto_scaler/gcp_driver.py", line 137, in attempt_launch
    spin_up_client.instances().insert(project=self.gcp_project_id, zone=zone, body=launch_spec).execute()
  File "/root/venv/lib/python3.8/site-packages/googleapiclient/_helpers.py", line 131, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/root/venv/lib/python3.8/site-packages/googleapiclient/http.py", line 937, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 404 when requesting https://compute.googleapis.com/compute/v1/projects/zones/../ returned "The resource 'projects/../zones/' was not found". Details: "[{'message': "The resource 'projects/../zones/us-central1-a/acceleratorTypes/NVIDIA T4' was not found", 'domain': 'global', 'reason': 'notFound'}]">

1696365971074 apps-agent-i-08bf8b26b6175ea1f-1:service:8d816e475307473885aaa87b52a5c526 DEBUG Process aborted by user
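The final `Details` message in the traceback names the resource that could not be found, here an `acceleratorTypes` entry. Below is a minimal sketch for pulling that out of the error text, assuming the message keeps the `The resource '...' was not found` wording shown in the log. `missing_resource` is a hypothetical helper, not part of googleapiclient or the autoscaler.

```python
import re
from typing import Optional, Tuple

# Matches the wording used in the 404 message from the log above.
_NOT_FOUND_RE = re.compile(r"The resource '([^']+)' was not found")


def missing_resource(message: str) -> Optional[Tuple[str, str]]:
    """Return (collection, name) of the missing resource, or None if not matched."""
    match = _NOT_FOUND_RE.search(message)
    if match is None:
        return None
    parts = match.group(1).rstrip("/").split("/")
    if len(parts) < 2:
        return None
    # The last two path segments are the collection and the resource name.
    return parts[-2], parts[-1]


if __name__ == "__main__":
    text = (
        "The resource 'projects/../zones/us-central1-a/"
        "acceleratorTypes/NVIDIA T4' was not found"
    )
    print(missing_resource(text))
```

Reading the last two path segments points straight at the setting to check, which in this log is the GPU type rather than the project or zone.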

  				
Posted one year ago by DiminutiveToad80

  				

While creating GCP credentials using None
What values should I insert in the following step so that the autoscaler has access? As of now, I left this field blank.

  				
Posted one year ago by DiminutiveToad80
