
Data Catalog integration with our data sources

Hi Guys,

Good day!

I would like to ask about the integration capabilities of Dataplex. Specifically, can it integrate with BigQuery, PostgreSQL, MongoDB, Metabase, and Google Sheets while offering automated, scalable scanning? I would also like to know which data sources Dataplex supports, if that information is available. Thanks!


Google Dataplex is a data management platform that offers a suite of features designed to manage, secure, and analyze data across your organization. This platform is designed to offer seamless integration with various data sources and analytic tools, which should include BigQuery, PostgresDB, MongoDB, Metabase, and Google Sheets.

Here are some key features of Google Dataplex:

  1. Single pane of glass for data management: Dataplex provides a unified platform for managing data across various silos, with centralized security and governance. This includes unified search and data discovery, built-in data intelligence, and support for open source tools.

  2. Freedom of choice: With Dataplex, you have the freedom to store data wherever you prefer and choose the best analytics tools to accelerate the analytics lifecycle.

  3. Intelligent automation: Dataplex uses Google’s AI/ML capabilities to automate data discovery, metadata harvesting, data lifecycle management, data quality, and lineage, which could help reduce management costs.

  4. Unified governance: Dataplex allows standardization and unification of metadata, security policies, governance, and data classification for consistency across distributed data.

  5. Simplified data discovery: Dataplex automates data discovery, classification, and metadata enrichment of structured, semi-structured, and unstructured data, and offers a powerful Data Catalog to manage all types of metadata. It features a built-in faceted search interface, based on the same search technology as Gmail, for finding and understanding data.

  6. Data organization and life cycle management: With Dataplex, you can logically organize data that spans multiple storage services into business-specific domains. It allows easy management, curation, tiering, and archiving of your data.

For more information, see https://cloud.google.com/dataplex

Thanks for your feedback, @ms4446. This was very helpful. If it is not too much to ask, could you also provide details on the requirements listed below? I appreciate your help very much.

Requirements:

  • Should integrate with BigQuery, PostgreSQL, MongoDB, Metabase, and Google Sheets, with automated and scalable scanning
  • Should offer unified search regardless of the type or nature of the data source, and show all types of metadata
  • Should cover all types of metadata (technical, business, and operational), i.e. metadata management plus a business and data glossary
  • Should integrate with Stitch, Airflow, and dbt, with automated and scalable scanning
  • Should provide a full graph/map view of data lineage with complete details, i.e. calculations and transformations
  • Should integrate with BigQuery, PostgreSQL, MongoDB, Metabase, and Google Sheets for access grants/revokes
  • Should have a customizable access request workflow with automated grants/revokes of access on the data sources
  • Should integrate with BigQuery, Metabase, PostgreSQL, and MongoDB, ideally with automated data quality checks and data cleansing capabilities
  • Should integrate with BigQuery, Metabase, PostgreSQL, and MongoDB, ideally with automated data policy and data compliance, i.e. data masking/transformation functions
  • Should be easy to set up, administer, and maintain, and ideally be a cloud-based/serverless application

Google Dataplex offers several features that align with your requirements, including automation, scalability, support for metadata management, data quality checks, data lineage, and more. Please refer to the following links for more details:

https://cloud.google.com/dataplex 

https://cloud.google.com/bigquery/docs/bigquery-data-quality-tasks-with-dataplex