Hi and good day! I need some help troubleshooting this networking issue.
Please note that this only happens with the Ubuntu Bionic and Focal images. There is no network issue if I use a Debian image.
The instance has no network and no IP address after I changed it from the legacy network to a shared VPC.
root@instance-1-eng-2819:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens4: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default qlen 1000
link/ether 42:01:0a:80:00:2c brd ff:ff:ff:ff:ff:ff
altname enp0s4
root@instance-1-eng-2819:~# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens4: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
link/ether 42:01:0a:80:00:2c brd ff:ff:ff:ff:ff:ff
altname enp0s4
I got these messages from Serial port 1 (console):
root@instance-1-eng-2819:~# Dec 25 01:02:53 instance-1-eng-2819 OSConfigAgent[4112]: 2022-12-25T01:02:53.4655Z OSConfigAgent Critical main.go:100: Error parsing metadata, agent cannot start: network error when requesting metadata, make sure your instance has an active network and can reach the metadata server: Get http://169.254.169.254/computeMetadata/v1/?recursive=true&alt=json&wait_for_change=true&last_etag=0&timeout_sec=60: dial tcp 169.254.169.254:80: connect: network is unreachable
Dec 25 01:02:53 instance-1-eng-2819 systemd[1]: google-osconfig-agent.service: Main process exited, code=exited, status=1/FAILURE
Dec 25 01:02:53 instance-1-eng-2819 systemd[1]: google-osconfig-agent.service: Failed with result 'exit-code'.
Dec 25 01:02:54 instance-1-eng-2819 systemd[1]: google-osconfig-agent.service: Scheduled restart job, restart counter is at 76.
Dec 25 01:02:54 instance-1-eng-2819 systemd[1]: Stopped Google OSConfig Agent.
Dec 25 01:02:54 instance-1-eng-2819 systemd[1]: Started Google OSConfig Agent.
Dec 25 01:03:39 instance-1-eng-2819 systemd[1]: google-guest-agent.service: State 'stop-sigterm' timed out. Killing.
I also saw a "Permission denied" error in the logs:
Dec 24 23:45:19 instance-1-eng-2819 dhclient[419]: execve (/bin/true, ...): Permission denied
Dec 24 23:45:19 instance-1-eng-2819 dhclient[415]: Listening on LPF/ens4/42:01:0a:80:00:2c
Dec 24 23:45:19 instance-1-eng-2819 dhclient[415]: Sending on LPF/ens4/42:01:0a:80:00:2c
Dec 24 23:45:19 instance-1-eng-2819 dhclient[415]: Sending on Socket/fallback
Dec 24 23:45:19 instance-1-eng-2819 dhclient[415]: DHCPDISCOVER on ens4 to 255.255.255.255 port 67 interval 3 (xid=0xbe742848)
Dec 24 23:45:19 instance-1-eng-2819 dhclient[415]: DHCPOFFER of 10.128.0.44 from 169.254.169.254
Dec 24 23:45:19 instance-1-eng-2819 dhclient[415]: DHCPREQUEST for 10.128.0.44 on ens4 to 255.255.255.255 port 67 (xid=0x482874be)
Dec 24 23:45:19 instance-1-eng-2819 dhclient[415]: DHCPACK of 10.128.0.44 from 169.254.169.254 (xid=0xbe742848)
Dec 24 23:45:19 instance-1-eng-2819 dhclient[420]: execve (/bin/true, ...): Permission denied
So I think the network failure occurred because the instance can't reach the GCP metadata server at 169.254.169.254.
Is this an Ubuntu-specific issue, given that it works fine with the Debian image?
UPDATE:
It turns out that after I changed the network in GCP, the netplan configuration on the instance did not pick up the interface's new MAC address.
Current MAC addr of ens4:
ip a | grep link/ether
link/ether 42:01:0a:80:00:2c brd ff:ff:ff:ff:ff:ff
Netplan mac addr:
cat /etc/netplan/50-cloud-init.yaml | grep mac
macaddress: 42:01:0a:f0:02:42
After changing the MAC address in the netplan file to match the one on ens4, the interface came up and got its IP.
ip a | grep ens4
2: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc mq state UP group default qlen 1000
inet 10.128.0.44/32 scope global dynamic ens4
root@instance-1-eng-2819:~# ping google.com
PING google.com (142.251.161.100) 56(84) bytes of data.
64 bytes from ig-in-f100.1e100.net (142.251.161.100): icmp_seq=1 ttl=109 time=1.83 ms
64 bytes from ig-in-f100.1e100.net (142.251.161.100): icmp_seq=2 ttl=109 time=1.24 ms
So it looks like the actual issue is:
cloud-init keeps the wrong (old) MAC address in the netplan config after the VPC network is changed.
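For anyone hitting the same symptom, here is a minimal sketch of the comparison I did by hand above: it reads the MAC address the kernel reports for the interface and the macaddress value from the netplan file, then compares them. The function names (live_mac, netplan_mac, macs_match) are made up for this example, and it assumes the netplan file contains a single macaddress entry, like the cloud-init generated file in this post.

```shell
# live_mac IFACE [SYSFS_NET_DIR]
# Print the MAC address the kernel currently reports for IFACE.
# The second argument exists only so the function can be pointed at a test directory.
live_mac() {
    cat "${2:-/sys/class/net}/$1/address"
}

# netplan_mac FILE
# Print the first "macaddress:" value found in a netplan YAML file,
# with any surrounding quotes removed.
netplan_mac() {
    sed -n 's/^[[:space:]]*macaddress:[[:space:]]*//p' "$1" | tr -d "\"'" | head -n 1
}

# macs_match MAC_A MAC_B
# Succeed (exit status 0) if the two MAC addresses are equal, ignoring case.
macs_match() {
    [ "$(printf '%s' "$1" | tr 'A-F' 'a-f')" = "$(printf '%s' "$2" | tr 'A-F' 'a-f')" ]
}

# Example call on the affected VM (not executed here):
#   macs_match "$(live_mac ens4)" "$(netplan_mac /etc/netplan/50-cloud-init.yaml)" \
#       || echo "MAC mismatch: netplan config is stale"
```

With the values from this post (42:01:0a:80:00:2c on ens4, 42:01:0a:f0:02:42 in netplan) the check fails, which is the mismatch that kept the interface down.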
I found a fix: running cloud-init clean before changing the VPC network fixed it.
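A rough sketch of that step, for reference. cloud-init clean is run inside the guest as root (over SSH while the network still works, or from the serial console otherwise). It removes cloud-init's cached state under /var/lib/cloud, so the next boot should be treated like a first boot and the network configuration should be generated again with the current MAC address. The run and prepare_for_network_change helpers and the DRY_RUN variable are made up for this example; only the cloud-init clean --logs command itself is real.

```shell
# run CMD [ARGS...]
# Execute the command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# prepare_for_network_change
# Reset cloud-init state before the instance is stopped and moved to the new VPC.
# --logs also removes the old cloud-init log files. Must be run as root.
prepare_for_network_change() {
    run cloud-init clean --logs
}

# Example (not executed here):
#   DRY_RUN=1; prepare_for_network_change   # preview only
#   DRY_RUN=0; prepare_for_network_change   # actually clean, then stop the VM and change the network
```

I only tested running it before the network change. On an instance that has already lost its network, running the same command from the serial console and rebooting may have the same effect, but I have not confirmed that.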
I am unable to SSH into the VM instance. Can you please explain where the cloud-init clean command should be run, and what happens while it runs?