
Defining Lakes in Dataplex

Hi,

I want to define "Lakes" in Dataplex but I am unsure as to what level these should be set at. The documentation states it should be a "data domain or a business unit" and suggests creating a lake for each department.

However, my organization consists of several separate companies. Should I create a separate lake for each company ("company X"), for each company and department ("company X - dept Y"), or define lakes based on roles or functions that are shared across companies (e.g., "Sales" or "Operations")?

Any advice would be most appreciated.

ACCEPTED SOLUTION

The lake concept in Dataplex exists primarily to provide the "fabric" for building a Data Mesh. A full answer to your question would likely run to a 50-page thesis 🙂 so the simple answer is "it depends". What I would suggest is reading through this article:

Data Mesh Architecture

and, ideally, the corresponding books cited at the end. I know this isn't the "up/down" answer you were looking for, but the question is similar to "I have data in my business, how should I organize it?", and there is no one-size-fits-all answer. Once you have a grasp of the Data Mesh architecture, I believe you will find that the goal is to group your data into distinct "data owner" domains and to have one lake per data domain. Unfortunately, I can't say whether that will be by company, by department, or by function, as all are valid options.
