RayCluster, RayJob, and RayService are not created
Hi @kiran8 ,
Welcome to Google Cloud Community!
Here are some basic troubleshooting steps you can work through:
1. Verify Operator Health: check that the operator pod is running, then read its logs:
kubectl get pods -n <operator-namespace>
kubectl logs -n <operator-namespace> <operator-pod-name>
2. Check RayCluster CR Status: run
kubectl get raycluster <your-raycluster-name> -n <your-namespace> -o yaml
to see the current state and any errors reported in the status of your cluster. (kubectl describe raycluster <your-raycluster-name> -n <your-namespace> also lists the events associated with it.)
3. Inspect Events: run
kubectl get events -n <your-namespace>
to review events related to pod scheduling, creation, or any failures.
4. Describe Failing Pods: run
kubectl describe pod -n <your-namespace> <failing-pod-name>
to get more detailed information on errors (like ImagePullBackOff, scheduling failures, etc.).
5. Simplify:
6. Check Resources: run
kubectl describe node
on your cluster nodes to see the available resources and any node taints.
7. Check the Ray Operator's RBAC:
8. Review Ray Operator and Cluster Configuration:
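If it helps, the read-only checks from steps 1, 2, 3, 4 and 6 can be collected into one small script. This is only a rough sketch, not an official KubeRay tool: the names ray_checks, run and DRY_RUN are made up for illustration, the two namespaces are passed in as arguments, and the field selector in step 4 is just one way to find pods worth describing.

```shell
#!/usr/bin/env bash
# Sketch only: prints each kubectl command, then runs it unless DRY_RUN=1.
# ray_checks, run and DRY_RUN are illustrative names, not part of KubeRay.

run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    # Report a failure but keep going so the later checks still run.
    "$@" || echo "(exit code $?)"
  fi
}

ray_checks() {
  local operator_ns="$1" ray_ns="$2"
  # Step 1: is the operator pod up?
  run kubectl get pods -n "$operator_ns"
  # Step 2: which RayCluster resources exist in the namespace?
  run kubectl get raycluster -n "$ray_ns"
  # Step 3: recent events (scheduling, creation, failures).
  run kubectl get events -n "$ray_ns"
  # Step 4: list pods that are not Running, as candidates for "kubectl describe pod".
  run kubectl get pods -n "$ray_ns" --field-selector=status.phase!=Running
  # Step 6: node capacity, allocated resources and taints.
  run kubectl describe nodes
}

# Only run the checks when both namespaces are given on the command line.
if [ "$#" -eq 2 ]; then
  ray_checks "$1" "$2"
fi
```

Saved under a name of your choice (for example ray-checks.sh, a hypothetical file name), it could be called as DRY_RUN=1 ./ray-checks.sh <operator-namespace> <your-namespace> to only print the commands, or without DRY_RUN to run them against the cluster.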
To learn more about KubeRay, see “Getting Started with KubeRay”.
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.