Does Vertex AI Model Garden have a model for baby cry detection, or any related audio classification?

I'm currently working on a project using Google Cloud Vertex AI, and I’d like to build a model to detect baby crying sounds from audio input. Before I start building and training from scratch, I’m wondering:

  • Are there any pretrained models in Vertex AI Model Garden (or Hugging Face integrations) related to audio classification, especially for baby cry detection?

  • Has anyone seen documentation, code samples, or forum posts/blogs where someone has built a similar model?

  • Is it better to use custom training or AutoML for this type of use case in Vertex AI?

If you’ve done anything similar or have suggestions on datasets, models, or pipelines, I’d really appreciate your input!

Hi @ageng,

Welcome to Google Cloud Community!

  • Are there any pre-trained models in Vertex AI Model Garden (or Hugging Face integrations) related to audio classification, especially for baby cry detection? Has anyone seen documentation, code samples, or forum posts/blogs where someone has built a similar model?

    Currently, neither the Vertex AI Model Garden nor the Hugging Face models supported through it include a pre-trained model designed specifically to detect baby crying sounds. The Model Garden mainly includes Google's own models and a few popular, general-purpose open-source and partner models, which typically do not cover specialized tasks like this one. The Hugging Face models supported on Vertex AI Model Garden primarily cover tasks such as text generation, text-to-text generation, text-to-image, feature extraction, sentence similarity, and image-text-to-text. If a pre-trained Hugging Face model that can perform this task is not listed in the Model Garden, you may find this documentation helpful. Another option is to build and train your own custom model for audio classification tasks such as detecting baby crying sounds.

  • Is it better to use custom training or AutoML for this type of use case in Vertex AI?

    For this type of use case, custom training is a better choice as it offers more flexibility and allows you to fine-tune the model.
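
To give a rough idea of what the custom training route could start from, here is a minimal sketch in Python. It is not a Vertex AI API example and not a recommended architecture: it assumes NumPy only, uses synthetic clips (noise, with a harmonic tone standing in for a cry) instead of real recordings, and trains a plain logistic regression on time-averaged log band energies. The names `band_features`, `synth_clip`, `make_dataset`, and `train_logreg`, the 16 kHz sample rate, and the 350-550 Hz pitch range are all illustrative assumptions.

```python
import numpy as np

SAMPLE_RATE = 16_000  # Hz; assumed, adjust to match your recordings


def band_features(wave, frame_len=400, hop=160, n_bands=20):
    """Return one fixed-size vector per clip: time-averaged log band energies."""
    n_frames = 1 + (len(wave) - frame_len) // hop
    # Index matrix that slices the waveform into overlapping frames.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = wave[idx] * np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Pool FFT bins into equal-width bands (drops the last leftover bin).
    usable = (power.shape[1] // n_bands) * n_bands
    bands = power[:, :usable].reshape(n_frames, n_bands, -1).mean(axis=2)
    return np.log(bands + 1e-10).mean(axis=0)


def synth_clip(is_cry, rng, seconds=0.5):
    """Synthetic stand-in for real audio: noise, plus a harmonic tone if is_cry."""
    t = np.arange(int(SAMPLE_RATE * seconds)) / SAMPLE_RATE
    wave = 0.1 * rng.standard_normal(t.size)
    if is_cry:
        f0 = rng.uniform(350.0, 550.0)  # pitch range chosen only for illustration
        wave = wave + sum(np.sin(2 * np.pi * f0 * k * t) / k for k in (1, 2, 3))
    return wave


def make_dataset(n, rng):
    """Build n labeled clips, alternating background (0) and cry-like (1)."""
    labels = np.arange(n) % 2
    X = np.stack([band_features(synth_clip(bool(y), rng)) for y in labels])
    return X, labels.astype(float)


def train_logreg(X, y, lr=0.5, steps=300):
    """Fit logistic regression with plain batch gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b


rng = np.random.default_rng(0)
X_train, y_train = make_dataset(100, rng)
X_test, y_test = make_dataset(40, rng)

# Standardize with training-set statistics only.
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0) + 1e-8
w, b = train_logreg((X_train - mu) / sigma, y_train)

pred = ((X_test - mu) / sigma) @ w + b > 0
accuracy = float(np.mean(pred == y_test.astype(bool)))
print(f"held-out accuracy on synthetic clips: {accuracy:.2f}")
```

With real data you would replace `synth_clip` with code that loads your labeled recordings, and most likely swap the logistic regression for a stronger model. A script along these lines is the kind of thing a Vertex AI custom training job could run.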

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.