Hi JitteryCoyote63, which EC2 instance type and AMI are you using?
the deep learning AMI from nvidia (Ubuntu 18.04)
Can you attach the full log of the instance? Did the AWS autoscaler output any logs?
The running task in the UI for it
There is no error on this side. I think the AWS autoscaler just waits for the agent to connect, which will never happen: the agent won’t start because the userdata script fails
agree
E: Could not get lock /var/lib/apt/lists/lock - open (11: Resource temporarily unavailable)
Another process is holding the lock. Can you specify the AMI (and region) so I can try to reproduce it?
AMI: ami-08e9a0e4210f38cb6, region: eu-west-1a
I think waiting for the apt locks to be released with something like this would work:
startup_bash_script = [
    "#!/bin/bash",
    "while sudo fuser /var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock >/dev/null 2>&1; do echo 'Waiting for other instances of apt to complete...'; sleep 5; done",
    "sudo apt-get update",
    ...
Weirdly, this throws an error in the autoscaler:
Spinning new instance type=v100_spot
Error: Failed to start new instance, unexpected '{' in field name
How did you add it? Just edited the configuration part of the task or with the wizard?
I edited aws_auto_scaler.py directly. Actually, I think it’s just a typo: I need to double the curly braces
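For anyone hitting the same error: "unexpected '{' in field name" is the message Python's `str.format()` raises when it meets a nested, unescaped `{`. The sketch below assumes (this is not confirmed in the thread) that the autoscaler runs the startup script through `str.format()`, and shows why doubling the braces helps. `escape_braces` and `try_format` are hypothetical helper names used only for illustration.

```python
# Assumption: the startup script is passed through str.format() somewhere
# before the instance is launched. The shell brace expansion below then
# looks like a (malformed) replacement field.
LOCK_GLOB = "/var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock"


def escape_braces(text: str) -> str:
    """Double every brace so str.format() treats it as a literal character."""
    return text.replace("{", "{{").replace("}", "}}")


def try_format(template: str) -> str:
    """Return the formatted string, or the ValueError message if it fails."""
    try:
        return template.format()
    except ValueError as exc:
        return f"ValueError: {exc}"


# Unescaped: format() fails on the nested '{'.
raw_result = try_format(LOCK_GLOB)
# Escaped: format() collapses '{{' and '}}' back to single braces.
escaped_result = try_format(escape_braces(LOCK_GLOB))

print(raw_result)
print(escaped_result)
```

After escaping, the formatted string is identical to the original shell path, so bash still sees the brace expansion it expects.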
Now it starts, I’ll see if this solves the issue
The instances take so long to start, like 5 mins
So what worked for me was the following startup userscript:
#!/bin/bash
sleep 120
while sudo fuser /var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock >/dev/null 2>&1; do echo 'Waiting for other instances of apt to complete...'; sleep 5; done
sudo apt-get update
while sudo fuser /var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock >/dev/null 2>&1; do echo 'Waiting for other instances of apt to complete...'; sleep 5; done
sudo apt-get install -y python3-dev python3-pip gcc git build-essential
python3 -m pip install -U pip
...
As you can see, more hard waiting (initial sleep), and then before each apt action, make sure there is no lock
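If it helps, here is a rough sketch of how that pattern (initial sleep, then a lock wait before every apt command) could be generated instead of copy-pasted. Only the shell commands come from this thread. `build_startup_script`, its parameters, and the `escape_for_format` switch are hypothetical, and the brace doubling assumes the script is later passed through `str.format()` as discussed above.

```python
# Shell loop from the thread: block until no process holds an apt/dpkg lock.
WAIT_FOR_APT = (
    "while sudo fuser /var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock "
    ">/dev/null 2>&1; do "
    "echo 'Waiting for other instances of apt to complete...'; sleep 5; done"
)


def build_startup_script(commands, initial_sleep=120, escape_for_format=True):
    """Return the script as a list of lines (hypothetical helper).

    A lock wait is inserted before every apt-get command. When
    escape_for_format is True, braces are doubled on the assumption that
    the lines will later go through str.format().
    """
    lines = ["#!/bin/bash", f"sleep {initial_sleep}"]
    for cmd in commands:
        if cmd.lstrip().startswith(("sudo apt-get", "apt-get")):
            lines.append(WAIT_FOR_APT)
        lines.append(cmd)
    if escape_for_format:
        lines = [ln.replace("{", "{{").replace("}", "}}") for ln in lines]
    return lines


startup_bash_script = build_startup_script(
    [
        "sudo apt-get update",
        "sudo apt-get install -y python3-dev python3-pip gcc git build-essential",
        "python3 -m pip install -U pip",
    ]
)

print("\n".join(startup_bash_script))
```

The hard `sleep 120` is kept because the lock check alone was not enough in practice here. Presumably the image's own boot-time apt jobs had not started yet when the script first ran, but that is a guess.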