I run a custom model on Vertex AI. The model is a simple FastAPI app that loads a Whisper model. The beginning of the app looks like this:
if torch.cuda.is_available():
    print("GPU is available =)")
    model = whisper.load_model(model_name).cuda()
else:
    print("GPU is not available =(")
    model = whisper.load_model(model_name)
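As an aside, the branching above can be collapsed into a single device choice, since openai-whisper's load_model accepts a device argument. A sketch (pick_device is a made-up helper name, not part of the original app):

```python
# Sketch: pick the device once, then pass it to whisper.load_model via its
# device= parameter instead of calling .cuda() afterwards.
# pick_device is a made-up helper, not part of the original app.
def pick_device(cuda_available: bool) -> str:
    return "cuda" if cuda_available else "cpu"

# In the app this would be driven by torch:
#   device = pick_device(torch.cuda.is_available())
#   model = whisper.load_model(model_name, device=device)
print(pick_device(False))  # cpu
```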
When deployed on Vertex AI with
gcloud ai endpoints deploy-model [ENDPOINT_NAME] \
--region=europe-west4 \
--model=[MODEL_NAME] \
--machine-type=n1-standard-2 \
--accelerator=type=nvidia-tesla-t4,count=1 \
torch.cuda.is_available() always returns False.
There is also a log message prior to that:
/app/.venv/lib/python3.10/site-packages/torch/cuda/__init__.py:88: UserWarning: HIP initialization: Unexpected error from hipGetDeviceCount(). Did you run some cuda functions before calling NumHipDevices() that might have already set an error? Error 101: hipErrorInvalidDevice (Triggered internally at ../c10/hip/HIPFunctions.cpp:110.)
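That HIP warning is itself a strong hint: HIP is AMD's ROCm runtime, so the installed torch wheel may be a ROCm build rather than a CUDA one. A quick way to check which flavor is installed is to look at torch.version.cuda and torch.version.hip (a sketch; build_flavor is my own helper, not a torch API):

```python
# Sketch of a diagnostic: torch publishes separate CUDA, ROCm and CPU-only
# wheels; torch.version.cuda / torch.version.hip reveal which one you have.
# build_flavor is a made-up helper, not a torch API.
def build_flavor(cuda_version, hip_version):
    if cuda_version is not None:
        return "cuda"
    if hip_version is not None:
        return "rocm"
    return "cpu"

try:
    import torch
    print(torch.__version__, "->", build_flavor(torch.version.cuda, torch.version.hip))
except ImportError:
    print("torch not installed here")
```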
Can you advise me on a direction to look into? I'm running out of ideas on how to set up the app for GPU support.
This very same Docker image works on a Compute Engine VM and can find the NVIDIA drivers. Why can it not do so on Vertex AI?
The Docker base image is this, btw:
FROM nvidia/cuda:11.7.0-base-ubuntu22.04
ENV PYTHON_VERSION=3.10
ENV POETRY_VENV=/app/.venv
RUN export DEBIAN_FRONTEND=noninteractive \
&& apt-get -qq update \
&& apt-get -qq install --no-install-recommends \
python${PYTHON_VERSION} \
python${PYTHON_VERSION}-venv \
python3-pip \
ffmpeg \
&& rm -rf /var/lib/apt/lists/*
RUN ln -s -f /usr/bin/python${PYTHON_VERSION} /usr/bin/python3 && \
ln -s -f /usr/bin/python${PYTHON_VERSION} /usr/bin/python && \
ln -s -f /usr/bin/pip3 /usr/bin/pip
RUN python3 -m venv $POETRY_VENV \
&& $POETRY_VENV/bin/pip install -U pip setuptools \
&& $POETRY_VENV/bin/pip install poetry
ENV PATH="${PATH}:${POETRY_VENV}/bin"
WORKDIR /app
COPY . /app
RUN poetry config virtualenvs.in-project true
RUN poetry install
RUN $POETRY_VENV/bin/pip install torch==1.13.0 -f https://download.pytorch.org/whl/torch
EXPOSE 8080
ENV PORT 8080
CMD exec gunicorn --bind :${PORT} --workers 1 --threads 8 --timeout 0 app.webservice:app -k uvicorn.workers.UvicornWorker
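One thing worth noting about the pip install torch==1.13.0 line: that index hosts CPU, CUDA and ROCm wheels side by side, and under PEP 440 a pin without a local version label (==1.13.0) also matches local variants such as 1.13.0+cpu or 1.13.0+rocm5.2, so pip is free to resolve to a non-CUDA build. A sketch of that matching rule (matches_public_pin is my own illustrative helper, not pip internals):

```python
# Sketch of PEP 440 version matching: a pin with no local label ignores the
# candidate's local label, so "==1.13.0" accepts 1.13.0+cpu, +cu117, +rocm5.2.
# matches_public_pin is an illustrative helper, not pip's actual resolver.
def matches_public_pin(pin: str, candidate: str) -> bool:
    public = candidate.split("+", 1)[0]  # drop the local label, if any
    return public == pin

for cand in ["1.13.0", "1.13.0+cpu", "1.13.0+cu117", "1.13.0+rocm5.2"]:
    print(cand, matches_public_pin("1.13.0", cand))
```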
I think it's best to use the official PyTorch GPU image, e.g. this one:
https://hub.docker.com/layers/pytorch/pytorch/1.13.1-cuda11.6-cudnn8-runtime/images/sha256-1e26efd42...
Just make sure that you're not doing pip install torch again, as that image already comes with PyTorch pre-installed with GPU support set up; otherwise you'll be overriding it and potentially disabling the GPUs.
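For reference, a minimal sketch of what the top of the Dockerfile could look like on that base image (tag taken from the link above; Python and pip already come pre-installed in it, and the rest of the original build steps would stay the same, minus the separate torch install):

```dockerfile
FROM pytorch/pytorch:1.13.1-cuda11.6-cudnn8-runtime

# torch is pre-installed with CUDA support in this image --
# do NOT pip install torch again on top of it.
RUN apt-get -qq update \
    && apt-get -qq install --no-install-recommends ffmpeg \
    && rm -rf /var/lib/apt/lists/*
```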
Thanks! The issue is indeed that I used a non-GPU PyTorch version.
Here is the fix:
RUN $POETRY_VENV/bin/pip install torch==1.13.0+cu117 -f https://download.pytorch.org/whl/torch