Hi all,
I'm trying to set up the AWS Autoscaler to spin up EC2 instances from a predefined AMI.
I was able to set up the autoscaler, but I'm having issues with spinning up the EC2 instance.
It seems to keep failing (spinning up an instance, the ...
This time it got stuck:
2025-01-22 12:54:32
2025-01-22 10:54:27,220 - clearml.Auto-Scaler - INFO - --- Cloud instances (0):
2025-01-22 10:54:27,697 - clearml.Auto-Scaler - INFO - Idle for 60.00 seconds
2025-01-22 12:55:32
2025-01-22 10:55:28,230 - clearml.Auto-Scaler - INFO - Found 1 tasks in queue 'default'
2025-01-22 10:55:28,235 - clearml.Auto-Scaler - INFO - Spinning new instance resource='pe-jobs', prefix='dynamic_aws', queue='default'
2025-01-22 10:55:28,236 - clearml.Auto-Scaler - INFO - spinning up worker without specific subnet or availability zone
2025-01-22 10:55:28,237 - clearml.Auto-Scaler - INFO - monitor spots started
2025-01-22 10:55:28,248 - clearml.Auto-Scaler - INFO - Creating spot instance for resource pe-jobs
2025-01-22 10:55:28,605 - clearml.Auto-Scaler - INFO - --- Cloud instances (0):
2025-01-22 10:55:28,834 - clearml.Auto-Scaler - INFO - Idle for 60.00 seconds
2025-01-22 12:55:48
2025-01-22 10:55:46,908 - clearml.Auto-Scaler - INFO - New instance i-00bf957a948e2a52f listening to default queue
2025-01-22 12:56:33
2025-01-22 10:56:29,571 - clearml.Auto-Scaler - INFO - Found 1 tasks in queue 'default'
2025-01-22 10:56:29,997 - clearml.Auto-Scaler - INFO - --- Cloud instances (1): i-00bf957a948e2a52f (spot)
2025-01-22 10:56:30,423 - clearml.Auto-Scaler - INFO - Idle for 60.00 seconds
2025-01-22 12:57:34
2025-01-22 10:57:31,205 - clearml.Auto-Scaler - INFO - Found 1 tasks in queue 'default'
2025-01-22 10:57:31,566 - clearml.Auto-Scaler - INFO - --- Cloud instances (1): i-00bf957a948e2a52f (spot)
2025-01-22 10:57:32,028 - clearml.Auto-Scaler - INFO - Idle for 60.00 seconds
2025-01-22 12:58:35
2025-01-22 10:58:32,726 - clearml.Auto-Scaler - INFO - Found 1 tasks in queue 'default'
2025-01-22 10:58:33,118 - clearml.Auto-Scaler - INFO - --- Cloud instances (1): i-00bf957a948e2a52f (spot)
2025-01-22 10:58:33,331 - clearml.Auto-Scaler - INFO - Idle for 60.00 seconds
2025-01-22 12:59:35
2025-01-22 10:59:34,016 - clearml.Auto-Scaler - INFO - Found 1 tasks in queue 'default'
2025-01-22 10:59:34,407 - clearml.Auto-Scaler - INFO - --- Cloud instances (1): i-00bf957a948e2a52f (spot)
2025-01-22 10:59:34,789 - clearml.Auto-Scaler - INFO - Idle for 60.00 seconds
2025-01-22 13:00:36
2025-01-22 11:00:35,800 - clearml.Auto-Scaler - INFO - Found 1 tasks in queue 'default'
2025-01-22 11:00:36,189 - clearml.Auto-Scaler - INFO - --- Cloud instances (1): i-00bf957a948e2a52f (spot)
2025-01-22 13:00:41
2025-01-22 11:00:36,733 - clearml.Auto-Scaler - INFO - Idle for 60.00 seconds
2025-01-22 13:01:42
2025-01-22 11:01:37,533 - clearml.Auto-Scaler - INFO - Found 1 tasks in queue 'default'
2025-01-22 11:01:37,918 - clearml.Auto-Scaler - INFO - --- Cloud instances (1): i-00bf957a948e2a52f (spot)
2025-01-22 11:01:38,141 - clearml.Auto-Scaler - INFO - Idle for 60.00 seconds
2025-01-22 13:02:32
2025-01-22 11:02:31,172 - clearml.Auto-Scaler - INFO - Worker 'dynamic_aws:pe-jobs:g4dn.xlarge:i-00bf957a948e2a52f' does not have an active task
2025-01-22 11:02:31,172 - clearml.Auto-Scaler - WARNING - The following instances have crashed:
* i-00bf957a948e2a52f
2025-01-22 13:02:42
2025-01-22 11:02:38,653 - clearml.Auto-Scaler - INFO - Spinning down stuck worker dynamic_aws:pe-jobs:g4dn.xlarge:i-00bf957a948e2a52f from stale_spun
2025-01-22 11:02:39,225 - clearml.Auto-Scaler - INFO - Stuck worker spun down: 'dynamic_aws:pe-jobs:g4dn.xlarge:i-00bf957a948e2a52f'
2025-01-22 11:02:39,421 - clearml.Auto-Scaler - INFO - Found 1 tasks in queue 'default'
2025-01-22 11:02:39,427 - clearml.Auto-Scaler - INFO - Spinning new instance resource='pe-jobs', prefix='dynamic_aws', queue='default'
2025-01-22 11:02:39,427 - clearml.Auto-Scaler - INFO - spinning up worker without specific subnet or availability zone
2025-01-22 11:02:39,537 - clearml.Auto-Scaler - INFO - Creating spot instance for resource pe-jobs
2025-01-22 11:02:39,902 - clearml.Auto-Scaler - INFO - --- Cloud instances (0):
2025-01-22 11:02:40,129 - clearml.Auto-Scaler - INFO - Idle for 60.00 seconds
2025-01-22 13:03:02
2025-01-22 11:02:58,177 - clearml.Auto-Scaler - INFO - New instance i-0c2be2fda2dd959f9 listening to default queue
2025-01-22 13:03:43
2025-01-22 11:03:40,807 - clearml.Auto-Scaler - INFO - Found 1 tasks in queue 'default'
2025-01-22 11:03:41,222 - clearml.Auto-Scaler - INFO - --- Cloud instances (1): i-0c2be2fda2dd959f9 (spot)
2025-01-22 11:03:41,450 - clearml.Auto-Scaler - INFO - Idle for 60.00 seconds
2025-01-22 13:04:44
2025-01-22 11:04:42,247 - clearml.Auto-Scaler - INFO - Found 1 tasks in queue 'default'
2025-01-22 11:04:42,631 - clearml.Auto-Scaler - INFO - --- Cloud instances (1): i-0c2be2fda2dd959f9 (spot)
2025-01-22 11:04:42,842 - clearml.Auto-Scaler - INFO - Idle for 60.00 seconds
It looks like it's using Python 2...
I manually launched an instance from the same AMI, SSH'ed into it, and ran:
$ python --version
Python 3.7.6
$ which python
/home/ubuntu/anaconda3/bin/python
Could it be something related to the Docker image?
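One way to narrow this down would be to run the same small script in both places: in the SSH session, and in whatever environment the agent actually starts (for example inside the Docker container, if one is configured). A non-interactive shell may not pick up the Anaconda PATH that the login shell has, so `python` could resolve differently there. The following is only a minimal sketch; `describe_interpreter` is a hypothetical helper name, not part of ClearML, and it is written to run under both Python 2 and Python 3.

```python
import os
import sys


def describe_interpreter():
    """Return basic facts about the interpreter executing this script.

    Hypothetical helper for illustration; not a ClearML API.
    """
    return {
        # Full path of the interpreter binary that is running right now
        "executable": sys.executable,
        # Version as "major.minor.micro", e.g. the 3.7.6 seen over SSH
        "version": ".".join(str(part) for part in sys.version_info[:3]),
        # 2 or 3, the quickest way to spot a Python 2 interpreter
        "major": sys.version_info[0],
        # First few PATH entries, to see whether Anaconda comes first
        "path_head": os.environ.get("PATH", "").split(os.pathsep)[:3],
    }


if __name__ == "__main__":
    info = describe_interpreter()
    for key in ("executable", "version", "major", "path_head"):
        print("%s: %s" % (key, info[key]))
```

If the two runs report different `executable` or `major` values, the mismatch is in the environment the agent starts in rather than in the AMI itself.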