We are getting rid of project-wide SSH keys and enabling OS Login.
For some reason, doing so causes the instance to have high disk IO, to the point where the system is inoperable. This has happened on 2 of the 14 systems converted: one is on Ubuntu 20.04, the other on Debian 11. The systems where it was fine are on Debian 11 and Ubuntu 22.04.
Even after shutting the instance down and restarting it, the system is inoperable again within a few seconds, and Observability shows high disk IO.
Is there any way to fix this, or any idea why it's happening?
Update - issue solved.
We had another outage today on our VPN server - which was a blessing in disguise.
My colleague had noticed that Promtail wasn't running on it, so he started it up.
Within 2 minutes, the server was hosed. The issue is that OS Login makes use of /var/log/lastlog, and we had Promtail configured to scrape all logs in the /var/log directory.
We changed Promtail's config to only look at:
/var/log/*.log and /var/log/syslog
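For anyone wanting to sanity-check which files those two patterns pick up, here is a rough Python sketch. It uses pathlib's segment-wise matching as a stand-in; Promtail's own glob handling may differ in detail, and the helper name `is_scraped` and the candidate paths are made up for illustration.

```python
from pathlib import PurePosixPath

# The two patterns from the post. Everything else here is illustrative.
PATTERNS = ["/var/log/*.log", "/var/log/syslog"]


def is_scraped(path: str, patterns=PATTERNS) -> bool:
    """Return True if `path` matches any pattern.

    pathlib matches one path segment at a time, so '*' does not cross '/'.
    This approximates, but is not, Promtail's matcher.
    """
    p = PurePosixPath(path)
    return any(p.match(pattern) for pattern in patterns)


if __name__ == "__main__":
    # Hypothetical candidates; lastlog has no .log suffix, so it should be skipped.
    for candidate in ("/var/log/auth.log", "/var/log/syslog", "/var/log/lastlog"):
        print(candidate, "->", "scraped" if is_scraped(candidate) else "ignored")
```

The point is simply that lastlog, wtmp, btmp and similar binary files have no .log suffix, so the narrower patterns leave them alone.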
As you know, lastlog is a "sparse" file: its apparent size can grow very large without really taking up disk space, and that was causing Promtail to lose its mind.
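To illustrate the sparse-file point, here is a small Python sketch that compares a file's apparent size with the space actually allocated to it. It assumes a Unix-like system (where `st_blocks` is available and counted in 512-byte units) and a filesystem that supports sparse files; the function names are hypothetical, and this is not a description of how Promtail reads files.

```python
import os
import tempfile


def sizes(path: str) -> tuple[int, int]:
    """Return (apparent size in bytes, allocated bytes) for `path`."""
    st = os.stat(path)
    # st_blocks is reported in 512-byte units on Linux and most Unix systems.
    return st.st_size, st.st_blocks * 512


def sparse_demo(apparent_size: int = 64 * 1024 * 1024) -> tuple[int, int]:
    """Create a temporary sparse file and return its two sizes."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
        # Seek far past the start and write one byte; the gap is a "hole".
        f.seek(apparent_size - 1)
        f.write(b"\0")
    try:
        return sizes(path)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    apparent, allocated = sparse_demo()
    print(f"apparent: {apparent} bytes, allocated: {allocated} bytes")
```

Calling `sizes("/var/log/lastlog")` on an affected instance should show the same kind of gap. A tool that reads the file from start to end still has to walk the whole apparent size, which would be consistent with the disk IO we saw.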
Rgds...Geoff