Problem
Monday morning started with an incident at work: our k8s cluster didn’t have enough available nodes.
After some investigation with my amazing teammates, we traced the root cause to Debian 9:
- The nodes’ launch template was based on an old AMI that uses Debian 9
- The cluster autoscaler cycled out old nodes while newly launched nodes failed to finish the launch sequence: apt-get update failed with a "Packages 404 Not Found" error message
- New nodes were not able to join the k8s cluster, triggering the issue we noticed.
We’re far from running the latest version of many dependencies, and I’ve seen the same thing elsewhere: it’s usually hard to make a business case for upgrading dependencies proactively, so more often it’s done reactively.
A bit of Googling helped me locate the original message regarding this change:
the stretch, stretch-debug and stretch-proposed-updates suites have now
also been imported on archive.debian.org. People still interested in
these should update their sources.list.
I plan to remove the suites from the main archive in about a month
(2023-04-23 or later).
The stretch-backports, stretch-backports-sloppy and related debug suites
will likely move soon as well and might be removed from the main archive
around the same time.
Solution
The short-term solution is relatively simple: change deb.debian.org to archive.debian.org in /etc/apt/sources.list and in the files under /etc/apt/sources.list.d/.
If you’re using CloudFront as the package repository, you can change the URL from cloudfront.debian.net/debian to cloudfront.debian.net/debian-archive/debian (in both /etc/apt/sources.list and the files under /etc/apt/sources.list.d/).
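As a rough sketch, the substitutions described above could be scripted with sed. This assumes GNU sed (for the -i flag) and that the repository files under sources.list.d end in .list. The function name rewrite_apt_sources and its optional directory argument are made up for illustration, so that the rewrite can be tried on a copy of the config before touching the real one.

```shell
#!/bin/sh
# Sketch: point apt sources at the archived Debian mirrors.
# rewrite_apt_sources is a hypothetical helper, not part of apt.
# Usage: rewrite_apt_sources [apt_dir]   (apt_dir defaults to /etc/apt)
rewrite_apt_sources() {
    apt_dir="${1:-/etc/apt}"
    for f in "$apt_dir/sources.list" "$apt_dir"/sources.list.d/*.list; do
        # Skip missing files (and the unexpanded glob when nothing matches).
        [ -f "$f" ] || continue
        # 1) cloudfront.debian.net/debian -> .../debian-archive/debian
        #    (only when followed by a space or slash, so an already
        #    rewritten URL is left alone)
        # 2) deb.debian.org -> archive.debian.org
        sed -i \
            -e 's|cloudfront\.debian\.net/debian\([ /]\)|cloudfront.debian.net/debian-archive/debian\1|g' \
            -e 's|deb\.debian\.org|archive.debian.org|g' \
            "$f"
    done
}
```

Calling rewrite_apt_sources with no argument targets /etc/apt and needs root. Running it a second time should leave the files unchanged, which matters if the script ends up being executed on every boot.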
In AWS, this can be done by overriding the userdata script in your LaunchTemplate and publishing a new version.
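One way to publish that new version is with the AWS CLI; the sketch below is an illustration under assumptions, not the exact steps we ran. The helper names, the template ID, the source version, and the userdata file name are all placeholders. It relies on launch templates expecting the user data to be base64-encoded.

```shell
#!/bin/sh
# Sketch: publish a new launch template version with replacement user data.
# Both function names are hypothetical helpers for this example.

# Print the JSON for --launch-template-data, with the script base64-encoded
# on a single line.
launch_template_data() {
    userdata_file="$1"
    printf '{"UserData":"%s"}' "$(base64 < "$userdata_file" | tr -d '\n')"
}

# Create a new version that copies all other settings from an existing one.
publish_userdata_version() {
    template_id="$1"      # placeholder, e.g. lt-0123456789abcdef0
    source_version="$2"   # version number to inherit other settings from
    userdata_file="$3"    # path to the updated userdata script
    aws ec2 create-launch-template-version \
        --launch-template-id "$template_id" \
        --source-version "$source_version" \
        --version-description "point apt at archive.debian.org" \
        --launch-template-data "$(launch_template_data "$userdata_file")"
}
```

For example, publish_userdata_version lt-0123456789abcdef0 3 userdata.sh would create a new version based on version 3. Whether new nodes pick it up depends on which version your Auto Scaling group references; if it is pinned to a specific or default version, that reference needs updating as well.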
This should buy you enough time to work on a long-term solution. It’s not great to be tasked with fixing something really outdated, and I hope writing this down will help whoever runs into this issue.