Hello!
I am using a Python library called Pydub to work with audio. It works very well in Colab Enterprise, but when I try to run it in a Dataflow job, I get the following error:
No such file or directory: 'ffprobe': 'ffprobe'
After searching on the internet and in the issues on the official repository of this library (here is the link: https://github.com/jiaaro/pydub/issues?page=3&q=not+found), I saw that the recommended solution is to add /usr/bin/ffprobe to a PATH variable.
Given that the Dataflow flex template works with a Dockerfile, I am adding the ffprobe path to the PATH environment variable in the Dockerfile, at build time. However, I still get the same error message.
What else can I do to fix this error?
This is my Dockerfile:
FROM gcr.io/dataflow-templates-base/python3-template-launcher-base
ARG WORKDIR=/template
RUN mkdir -p ${WORKDIR}
WORKDIR ${WORKDIR}
ARG PYTHON_PY_FILE=insights_interpreter.py
COPY . .
ENV PYTHONPATH ${WORKDIR}
ENV FLEX_TEMPLATE_PYTHON_PY_FILE="${WORKDIR}/${PYTHON_PY_FILE}"
ENV FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE="${WORKDIR}/requirements.txt"
ENV FLEX_TEMPLATE_PYTHON_SETUP_FILE="${WORKDIR}/setup.py"
RUN apt-get update \
&& apt-get install ffmpeg libavcodec-extra libav-tools -y \
&& pip install --upgrade pip \
&& pip install google-cloud-texttospeech pydub \
# Download the requirements to speed up launching the Dataflow job.
&& pip download --no-cache-dir --dest /tmp/dataflow-requirements-cache -r $FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE
ENV PATH="/usr/bin/ffprobe:$PATH"
RUN echo $PATH # Verification
# Since we already downloaded all the dependencies, there's no need to rebuild everything.
ENV PIP_NO_DEPS=True
ENTRYPOINT ["/opt/google/dataflow/python_template_launcher"]
I look forward to reading your comments.
--
Best regards
David Regalado
I don't have a solution, but I do have some thoughts. In the Dockerfile you posted, I see this line:
ENV PATH="/usr/bin/ffprobe:$PATH"
A couple of thoughts on this. Looking here, the syntax appears to be:
ENV PATH "/usr/bin/ffprobe:$PATH"
I don't know if the "=" throws something off.
The other thing: is the ffprobe executable actually in /usr/bin? On Linux, each PATH entry is a directory that contains executables, not the path to an executable itself. This might work better:
ENV PATH "/usr/bin:$PATH"
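To illustrate the directory-versus-file point, here is a minimal Python sketch. It assumes a Linux-like system and uses a made-up executable name (fake_ffprobe) and a hypothetical helper (find_on_path); it is not taken from Pydub. It looks up the same command twice, once with the PATH set to the file itself and once with the PATH set to the containing directory.

```python
import os
import shutil
import stat
import tempfile


def find_on_path(command, path_value):
    """Resolve `command` using `path_value` as the PATH; return None if not found."""
    return shutil.which(command, path=path_value)


def demo():
    """Compare a PATH entry that is a file with one that is a directory."""
    with tempfile.TemporaryDirectory() as bin_dir:
        # Hypothetical stand-in for ffprobe: a tiny executable shell script.
        fake = os.path.join(bin_dir, "fake_ffprobe")
        with open(fake, "w") as handle:
            handle.write("#!/bin/sh\nexit 0\n")
        os.chmod(fake, os.stat(fake).st_mode | stat.S_IXUSR)

        # PATH entry points at the executable itself: the lookup fails.
        wrong = find_on_path("fake_ffprobe", fake)
        # PATH entry points at the containing directory: the lookup succeeds.
        right = find_on_path("fake_ffprobe", bin_dir)
        return wrong, right
```

Calling demo() should return None for the first lookup and the full path of the script for the second, which mirrors the difference between /usr/bin/ffprobe and /usr/bin as PATH entries.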
I'd also suggest starting a local container from the image with /bin/bash as the command. Open a shell inside the container and look for the ffprobe executable, so you can convince yourself that it is indeed present.
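If a scripted check is easier than poking around by hand, something like the following Python sketch could be run inside the container. It relies only on the standard library, and describe_tool is a hypothetical helper name, not part of Pydub. One assumption to keep in mind: the check only tells you about the environment it runs in, so it is most useful when run where the pipeline code actually executes, which may not be the same container as the one built from this Dockerfile.

```python
import shutil
import subprocess


def describe_tool(name="ffprobe"):
    """Report where `name` resolves on the current PATH, plus its version line."""
    location = shutil.which(name)
    if location is None:
        return f"{name}: not found on PATH"

    # Ask the tool for its version; check=False avoids raising on a non-zero exit.
    result = subprocess.run(
        [location, "-version"],
        capture_output=True,
        text=True,
        check=False,
    )
    lines = result.stdout.splitlines()
    first_line = lines[0] if lines else "(no version output)"
    return f"{name}: {location} ({first_line})"
```

For example, printing describe_tool("ffmpeg") and describe_tool("ffprobe") at the start of the job would show whether each tool is visible to the Python process and where it was found.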