How do I explore "status!=ok" requests in "Cloud Spanner Instance - API request rate"?

Hi all,

How can I explore the "status!=ok" requests shown under "Cloud Spanner Instance - API request rate" in Metrics Explorer? How can I look into the details behind these status values, or do they indicate an error? Please share anything related to this if you are familiar with it. I have attached a screenshot for reference. Thank you in advance, and apologies if this is a silly question; I am new to GCP.

[Attached screenshot: Screen Shot 2024-03-26 at 5.30.39 PM.png]

Non-OK requests in Cloud Spanner are API calls that returned a status other than OK, which usually points to an error in how the client calls Spanner or, less often, to a problem on the service side. Here's how to isolate and investigate these requests using Google Cloud's Metrics Explorer and Logs Explorer.

Steps to Investigate

Metrics Explorer:

  1. Navigate to Google Cloud Monitoring -> Metrics Explorer.

  2. Select Resource Type: "Cloud Spanner Instance".

  3. Select Metric: "API request rate".

  4. Filter: Add a filter for status != "OK" to isolate non-successful requests.
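If you prefer to type the filter instead of clicking through the UI, the steps above can be sketched as a Cloud Monitoring filter string. This is a minimal sketch, not an official recipe: it assumes the "API request rate" metric corresponds to the metric type spanner.googleapis.com/api/request_count on the spanner_instance resource, and that the success value of the status label is spelled "OK" (check the chart legend for the exact spelling and case). The helper name build_non_ok_filter and the instance ID "my-instance" are hypothetical.

```python
def build_non_ok_filter(instance_id=None, ok_value="OK"):
    """Build a Cloud Monitoring filter string for non-OK Spanner API requests.

    Assumptions: the metric type and the "status" label name match what
    Metrics Explorer shows for "API request rate"; ok_value must match the
    label value exactly as it appears in the chart legend.
    """
    clauses = [
        'metric.type = "spanner.googleapis.com/api/request_count"',
        'resource.type = "spanner_instance"',
        f'metric.labels.status != "{ok_value}"',
    ]
    if instance_id:
        # Optional: narrow the filter to a single instance.
        clauses.append(f'resource.labels.instance_id = "{instance_id}"')
    return " AND ".join(clauses)


# "my-instance" is a placeholder instance ID.
non_ok_filter = build_non_ok_filter(instance_id="my-instance")
print(non_ok_filter)
```

The resulting string can be adapted wherever a Monitoring filter is accepted; adjust the label value if your chart shows a different spelling.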

Analyze Requests:

  • The graph will visualize non-successful requests. For easier trend analysis, switch the visualization to "Line".

  • Pay special attention to periods where the rate of non-OK requests spikes or is unusually high.

Error Details:

  • Hover: Move your cursor over data points on the chart to see details about error types and timestamps.

  • Logs Explorer: Use the timestamps identified in Metrics Explorer to dive deeper into Logs Explorer. Here, you can find specific error messages and stack traces to understand the root causes.
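To turn a spike timestamp into a Logs Explorer query, something like the following sketch may help. It builds a query for a window around the spike, assuming the entries you want are logged against the spanner_instance resource type with severity ERROR or higher. Whether matching entries exist depends on which logs (for example, audit logs) are enabled in your project. The helper name build_logs_filter and the example timestamp are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def build_logs_filter(spike_time, window_minutes=5):
    """Build a Logs Explorer query for a time window around a spike.

    spike_time must be a timezone-aware datetime; it is converted to UTC.
    """
    spike_utc = spike_time.astimezone(timezone.utc)
    start = spike_utc - timedelta(minutes=window_minutes)
    end = spike_utc + timedelta(minutes=window_minutes)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    # In the Logging query language, lines are combined with an implicit AND.
    return "\n".join([
        'resource.type="spanner_instance"',
        "severity>=ERROR",
        f'timestamp>="{start.strftime(fmt)}"',
        f'timestamp<="{end.strftime(fmt)}"',
    ])


# Placeholder spike time; use the timestamp you read off the chart.
example_spike = datetime(2024, 3, 26, 17, 30, tzinfo=timezone.utc)
logs_filter = build_logs_filter(example_spike)
print(logs_filter)
```

Paste the printed text into the Logs Explorer query box and widen window_minutes if nothing shows up.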

Common Scenarios for Non-OK Requests

  • Incorrect Permissions: Verify that your client/application has the necessary permissions for the operations it's attempting.

  • Invalid Arguments: Ensure the data and parameters provided to Spanner API calls are correct.

  • Rate Limiting/Throttling: Exceeding Spanner quotas or burst limits can result in errors.

  • Timeouts: Client timeout settings or network latency issues can lead to requests timing out.

  • Internal Spanner Errors: While rare, transient errors on Spanner's side can occur.
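As a rough aid, the scenarios above can be matched to the status value shown in the chart. The sketch below assumes the status label uses the standard gRPC status code names (such as INVALID_ARGUMENT); the pairing of codes to scenarios is my own heuristic rather than an official mapping, and the names SCENARIO_BY_STATUS and likely_scenario are hypothetical.

```python
# Heuristic mapping from gRPC status code names to the scenarios listed above.
SCENARIO_BY_STATUS = {
    "PERMISSION_DENIED": "Incorrect permissions",
    "INVALID_ARGUMENT": "Invalid arguments",
    "RESOURCE_EXHAUSTED": "Rate limiting / throttling",
    "DEADLINE_EXCEEDED": "Timeouts",
    "INTERNAL": "Internal Spanner errors",
    "UNAVAILABLE": "Internal Spanner errors",
}


def likely_scenario(status):
    """Return the most likely scenario for a status label value.

    The lookup is case-insensitive, so "invalid_argument" also matches.
    """
    key = status.strip().upper()
    return SCENARIO_BY_STATUS.get(
        key, "Unlisted status; check the Spanner documentation"
    )


print(likely_scenario("invalid_argument"))
```

A status that is not in the table falls through to the documentation hint instead of guessing a cause.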

Important Notes

  • Time Ranges: Adjust the time range in Metrics Explorer to focus on specific periods of interest or known problem occurrences.

  • Granularity: If needed, change the aggregation in Metrics Explorer (for example, group by the status or method label) for a more detailed analysis.

Additional Considerations

  • Consult Documentation: For specific error codes or behaviors, the Google Cloud Spanner documentation can offer valuable insights and troubleshooting steps.

  • Monitoring and Alerting: Setting up proactive monitoring and alerting for high rates of non-OK requests can help identify and mitigate issues early.

  • Performance Optimization: Non-OK statuses may sometimes be linked to performance issues. Utilizing Cloud Spanner's query execution statistics and optimization recommendations can improve overall performance.
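To make "high rate of non-OK requests" concrete before setting up an alert, one option is to look at the share of non-OK requests rather than the raw rate. The sketch below works on request rates per status that you have read off the chart or exported yourself. The function names, the sample numbers, and the 5% threshold are illustrative assumptions, not recommended values.

```python
def non_ok_share(rate_by_status, ok_value="OK"):
    """Return the fraction of the total request rate that is non-OK."""
    total = sum(rate_by_status.values())
    if total == 0:
        return 0.0  # No traffic at all, so nothing to flag.
    non_ok = sum(
        rate
        for status, rate in rate_by_status.items()
        if status.upper() != ok_value.upper()
    )
    return non_ok / total


def should_alert(rate_by_status, threshold=0.05):
    """Flag the sample when the non-OK share exceeds the threshold."""
    return non_ok_share(rate_by_status) > threshold


# Made-up rates (requests per second) for illustration only.
sample = {"OK": 90.0, "INVALID_ARGUMENT": 8.0, "DEADLINE_EXCEEDED": 2.0}
print(non_ok_share(sample), should_alert(sample))
```

A ratio-based check like this stays meaningful when overall traffic grows or shrinks, which a fixed threshold on the raw error rate does not.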

Thank you, @ms4446, for the explanation. I will try to find the invalid_argument requests using Logs Explorer.