Hello everyone.
I am new to the community and I am wondering if I can get some help. My website on GCP is down and offline. It was fine until yesterday, when I started getting alerts from Jetpack saying that my website was offline, each followed by another message saying it was back online. I didn't worry too much about it until today, when the site went completely offline.
I don't know where to start looking for the problem, as I am just learning as I go. I tried to Google for help, but nothing helpful came up.
This is the message I get when I try to visit the site:
Unable to connect
An error occurred during a connection
The site could be temporarily unavailable or too busy. Try again in a few moments.
If you are unable to load any pages, check your computer’s network connection.
If your computer or network is protected by a firewall or proxy, make sure that Firefox is permitted to access the Web.
I shut down the VM for a while and restarted it, and I also rebooted to see if that would help, but it did not.
From the Console, I checked the error log, but I am not too sure what it says or what to do.
I would appreciate some help resolving this.
Thank you.
Rafiat.
From the logs you shared, I see you got two alerts of type type.googleapis.com/cloud_integrity.IntegrityEvent.
On this Stack Overflow question, there is a discussion of a problem similar to yours:
This answer briefly explains the issue:
This seems to be a shielded VM with "integrity monitoring" enabled and most likely this is caused by an integrity validation failure.
This answer shows a command to solve the issue:
gcloud compute instances update INSTANCE_NAME --shielded-learn-integrity-policy
Instance must be running and have Integrity Monitoring enabled.
On this comment, there is a detailed explanation of why this issue happens:
This issue occurs commonly after updating the machine's kernel. The Titan security chip that Google uses on most of its hardware finds that the Kernel is now different to the last baseline taken, and locks up the machine to protect it from any outside threats. The above command will resolve this by resetting that baseline to match your machine's current/new setup, and should keep the security chip happy, as it's now operating as expected.
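For anyone following along, here is a minimal sketch of a wrapper around the command quoted above. The function name relearn_integrity_policy, the DRY_RUN switch, and the example values my-vm and us-central1-a are assumptions for illustration and do not come from this thread; only the --shielded-learn-integrity-policy flag is taken from the answer. By default the function only prints the command, so it can be checked before it is run.

```shell
#!/bin/sh
# Sketch: reset the Shielded VM integrity baseline for one instance.
# The instance name and zone are placeholders (assumptions), not values from this thread.

relearn_integrity_policy() {
  instance="$1"
  zone="$2"
  # Build the command as positional parameters so every argument stays one word.
  set -- gcloud compute instances update "$instance" \
    --zone="$zone" --shielded-learn-integrity-policy
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run (the default): print the command instead of running it.
    echo "$*"
  else
    # The instance must be running and have Integrity Monitoring enabled.
    "$@"
  fi
}
```

Calling relearn_integrity_policy my-vm us-central1-a prints the command; putting DRY_RUN=0 in front of the call runs it for real.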
Hello cristianrm,
Thank you for your response and the solutions provided.
I have tried this:
gcloud compute instances update INSTANCE_NAME --shielded-learn-integrity-policy
It cleared the integrity error, but the website is still not displaying, and I could not access the VM through SSH from either the desktop or the console.
Permission denied (publickey).
(gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255]
I ran the SSH troubleshooter, and one of the suggestions I got was that the disk may have run out of space. I checked the disk and it was 77% used, so I increased its size, but the usage still shows 77% despite the increase.
I still can't view the website, and SSH access is still denied.
From the console, I ran the SSH troubleshooting tool and it returned:
VM: OK
Network: OK
User permission: OK
But I still can't access it by SSH.
This is the disk after I increased it from 10G to 25G. I am not too sure about the details in the screenshot, because it reads 60G.
sudo df -T
sudo lsblk
From the information provided, it is strange that your VM shows a larger disk than what you have configured. You could report this issue on issue tracker. Also you should reach the Support team for further assistance.
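On the disk question above, one possible reason the usage percentage did not change, which is not confirmed anywhere in this thread, is that growing a persistent disk does not by itself grow the partition and filesystem on it. The sketch below assumes an ext4 root filesystem on /dev/sda1 and the growpart tool being installed; those device names, the function names, and the DRY_RUN switch are all assumptions for illustration. It needs a working shell on the VM, so it only helps once access is restored.

```shell
#!/bin/sh
# Sketch: check root filesystem usage and grow the filesystem after a disk resize.
# /dev/sda, partition 1, and ext4 are assumptions; confirm with `lsblk` and `df -T` first.

# Print a command (DRY_RUN=1, the default) or run it (DRY_RUN=0).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Read `df -P` output on stdin and print the usage figure (without %) for a mount point.
usage_percent() {
  awk -v m="$1" 'NR > 1 && $6 == m { sub(/%/, "", $5); print $5 }'
}

# Grow the partition to fill the disk, then grow the ext4 filesystem to fill the partition.
grow_root_fs() {
  disk="$1"      # e.g. /dev/sda
  part_num="$2"  # e.g. 1
  part_dev="$3"  # e.g. /dev/sda1
  run sudo growpart "$disk" "$part_num"
  run sudo resize2fs "$part_dev"
}
```

On the VM, df -P / | usage_percent / shows the current figure, and grow_root_fs /dev/sda 1 /dev/sda1 prints the two commands it would run. An XFS filesystem would need xfs_growfs instead of resize2fs.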
Thank you, much appreciated.
I was able to reach support and worked with them to resolve the problem. The solution was to delete the VM and start all over, as I could not access it to retrieve the website files. Since the website was a migration from a physical host to the cloud, I still have the files for reuse.
It was a good learning experience. I learned to create snapshots of my VM for backup, and I also learned how to retrieve files from a VM when it becomes inaccessible, as happened to me.
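In case it helps someone else, here is a minimal sketch of the two things mentioned above: taking a snapshot of a disk, and getting at the files of an inaccessible VM by restoring that snapshot to a new disk attached to a second, working VM. This is one common approach and not necessarily the exact process support walked through. All disk, snapshot, VM, and zone names are placeholders, and the function names and DRY_RUN switch are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: snapshot a disk, then restore the snapshot onto a rescue VM.
# Every name passed to these functions is a placeholder, not a value from this thread.

# Print a command (DRY_RUN=1, the default) or run it (DRY_RUN=0).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Take a snapshot of a persistent disk.
snapshot_disk() {
  disk="$1"; zone="$2"; snapshot="$3"
  run gcloud compute disks snapshot "$disk" --zone="$zone" --snapshot-names="$snapshot"
}

# Create a new disk from a snapshot and attach it to a working VM as a secondary disk.
restore_to_rescue_vm() {
  snapshot="$1"; new_disk="$2"; rescue_vm="$3"; zone="$4"
  run gcloud compute disks create "$new_disk" --source-snapshot="$snapshot" --zone="$zone"
  run gcloud compute instances attach-disk "$rescue_vm" --disk="$new_disk" --zone="$zone"
}
```

After attaching, the new disk still has to be mounted inside the rescue VM before the files can be copied off. With the default DRY_RUN=1 both functions only print the gcloud commands.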