I am wanting to connect my data to a BI Dashboard, like Tableau, Power BI, and Looker Studio. I have two data sources that I am wanting to connect or create a relationship between. I have mail data and sales data. My mailing data is in an excel sheet and I update it once a week because we send mail out once a week. There are around 1.3 million rows of mail data separated into two sheets within the same Excel Workbook. The Sales data is being POST from our CRM's API onto a Google Sheet, using a third party Google Sheet API connector.
As of now, I am just copying and pasting the Sales data from Google Sheets to the Excel workbook that contains all of the mail data. Each piece of mail has a unique "response code" that is also in the sales data, so there will be a matching response code in the mail data for every sale we have. The mail data contains more metrics than our sales data and that's why we are wanting to create a relationship between them. As of now, I am using the VSTACK function within Excel to return the full rows of mail data that have a matachin "response code" with sales onto a new sheet.
I feel as if using BIgQuery, Cloud SQL, Cloud Spanner, Bigtable, and/ or API Manager could make the process much more efficient. But not sure which one fits my situation best.
The main thing I would like to achieve is to create a relationship between the two sources (mail and sales) through a relational database, and then create a live connection to a BI Dashboard to analyze this data. I don't know much about coding or computer languages though. So if that isn't an option because of my lack of JSON knowledge, then I would like to at least connect the CRM's API directly to the BI instead of using Google Sheets as a middle man.
I am very willing to learn about these databases, but want to make sure I am utilizing the right product/ products before attempting. I just want to create a live connection with the data sources and a BI dashboard, instead of having to manually copying and pasting daily. It seems like creating a relational database is a great option, but if not I at least want to create a live connection and then create a relationship within the BI Dashboard.
Can anybody give me some guidance? Thanks so much!
Hi @cwalker7,
Welcome to Google Cloud Community.
It's great that you have interest in optimizing your data management process and utilizing advanced tools like databases and BI dashboards, Google Cloud Platform could help you with that.
Here are some of the options you mentioned and recommend a solution that fits your needs.
BigQuery, Cloud SQL, Cloud Spanner, and Bigtable are all cloud-based databases provided by Google Cloud Platform (GCP). These databases are designed to handle massive amounts of data and provide high scalability and availability. However, setting up and maintaining these databases can require significant technical knowledge, and may not be the best option for your current needs.
API Manager, on the other hand, is a also tool provided by Google Cloud Platform (GCP) for managing APIs and can help you streamline your data transfer processes. It may be helpful if you need to manage multiple APIs, but it doesn't provide a database for storing and querying data.
For your data sources in mail data is currently stored in an Excel sheet, while your sales data is being POST from your CRM's API onto a Google Sheet. Both of these sources are not ideal for creating a relational database. Excel sheets are not designed for storing and querying large amounts of data, and Google Sheets may not provide the flexibility and performance you need to handle large datasets.
A better solution might be to create a data pipeline that automatically extracts, transforms, and loads your data into a cloud-based database. You could use a tool like Google Cloud Dataflow or Apache Beam to build this pipeline. These tools allow you to create data processing pipelines using a visual interface or code, and can integrate with various data sources and databases.
Once your data is stored in a database, you can then connect it to a BI dashboard using a live connection. This will allow you to analyze your data in real-time and visualize insights on a dashboard. Tableau, Power BI, and Looker Studio are all popular BI dashboard tools that can connect to various data sources, including cloud-based databases like BigQuery and Cloud SQL.
In conclusion, I suggest you create a data pipeline using a tool like Google Cloud Dataflow or Apache Beam to extract, transform, and load your data into a cloud-based database. From there, you can connect your database to a BI dashboard tool like Tableau, Power BI, or Looker Studio to analyze and visualize your data in real-time.
Here are some documentations that might help you:
What Data Pipeline Architecture should I use?
https://cloud.google.com/blog/topics/developers-practitioners/what-data-pipeline-architecture-should...
hl=en&_ga=2.150428022.-1392753435.1676655686
Create a streaming pipeline using a Dataflow template
Why Apache Beam? A Google Perspective
https://cloud.google.com/blog/products/gcp/why-apache-beam-a-google-perspective?_ga=2.183907942.-139...
How to build a BI dashboard using Google Data Studio and BigQueryhttps://cloud.google.com/blog/products/gcp/how-to-build-a-bi-dashboard-using-google-data-studio-and-...