Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Platform Block?: Cloud Run Service Returns Google 404 Despite Being Healthy and Publicly Configured

moggy8
New Member

I've been troubleshooting this extensively with the help of AI (including Gemini) and after multiple rounds of debugging I cannot find the solution. I'm working on a small, single-person, project so I don't have any organizational policies created that would impede the health, and believe there is a platform level impediment for the service. Thank you in advance for any help you can provide -  I'm newer to the Cloud Run services. Let me know what supplemental information I can provide.

Project ID: appthree-86b53

Service Name: sports-ingest-fanout

Region: us-central1

Problem Description: Our Cloud Run service, sports-ingest-fanout, is unreachable from the public internet and returns a generic, Google-branded 404 Not Found error page. This occurs even for explicitly defined health check endpoints. However, the service is verifiably healthy, running, and functions correctly for internal traffic.

Comprehensive Diagnostics Performed:

We have undertaken an exhaustive debugging process and can confirm the issue is not with the application code, its dependencies, or its direct configuration.

1. Application-Level Fixes (Completed): We initially identified and fixed several application-level bugs. The application is now stable:

  • Corrected Startup: Fixed the package.json to use node index.js instead of the incorrect functions-framework.

  • Added Server Listener: Added the required app.listen(process.env.PORT || 8080) to index.js to ensure the server starts correctly.

  • Created Dependencies: Added the necessary gcloud tasks queues create command to our deployment script to ensure the required Cloud Tasks queue exists before the service starts.

  • Current Status: Logs definitively show the container starts, the server listens on port 8080, and the service passes all startup probes. The application is not crashing.

2. Environmental and Networking Checks (Completed): We have ruled out all common user-configurable causes for this issue:

  • Public Access Confirmed: The service is deployed with the --allow-unauthenticated flag, and gcloud run services get-iam-policy confirms the allUsers principal has the roles/run.invoker role.

  • No VPC Connector: The service is not attached to a Serverless VPC Access connector.

  • No Load Balancer: There are no Global External HTTPS Load Balancers configured in the project that could be interfering.

  • Internal Traffic Succeeds: A backfill job initiated from within our GCP environment runs successfully. It correctly hits the /ingest endpoint, creates tasks, and the worker processes them. This proves the entire application logic is sound.

The "Smoking Gun" Evidence: The core issue is the discrepancy between internal and external traffic.

  • Internal Request (e.g., backfill job): Succeeds.

  • External Request (e.g., curl <service-url>/healthz): Fails with a Google-branded 404.

Conclusion & Request: The fact that a healthy, running container is reachable from internal GCP services but not from the public internet is definitive proof of an environmental policy blocking public ingress. The request is being intercepted by Google's infrastructure before it reaches our container.

We request that your team investigate the project-level and platform-level configurations that are not visible to us. Specifically, please investigate:

  1. VPC Service Controls: Is project appthree-86b53 part of a service perimeter that restricts run.googleapis.com without a corresponding public ingress rule?

  2. Organization Policies: Is there an effective Organization Policy on this project, such as constraints/run.allowedIngress, that is overriding our service's public configuration?

We have exhausted all possible application-level and user-configurable fixes. Please assist us in identifying the platform-level block.

0 0 13
0 REPLIES 0
Top Solution Authors