Hi,
I want to define "Lakes" in Dataplex but I am unsure as to what level these should be set at. The documentation states it should be a "data domain or a business unit" and suggests creating a lake for each department.
However, my organization consists of several separate companies. Should I create a separate lake for each company ("company X"), for each company and department ("company X - dept Y"), or define lakes based on roles or functions that are shared across companies (e.g., "Sales" or "Operations")?
Any advice would be most appreciated.
The Lakes concept in Dataplex exists primarily to provide the "fabric" for building a Data Mesh. Attempting to answer your question fully would likely result in a 50-page thesis 🙂 ... the simple answer is "it depends". What I would suggest you do is read through this article:
and ideally the corresponding books cited at the end. I know this isn't necessarily the "up/down" answer you were looking for, but the question is similar to "I have data in my business, how should I organize it" ... there isn't a "one-size-fits-all". Once you get a grasp of the Data Mesh Architecture, I believe you will find that the desire is to "group" your data into distinct "data owner" domains and have a lake per data domain. Unfortunately, I can't say whether that will be by company, by department or by function ... as all are valid options.
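To make the "lake per data owner domain" idea a little more concrete, here is a rough Python sketch that takes a made-up dataset inventory and shows how the same data would fall into lakes under two of the groupings you mention. Everything in it (the dataset names, the companies, the owning teams, and the `lakes_by` helper) is hypothetical and only for illustration; it does not call any Dataplex API.

```python
from collections import defaultdict

# Hypothetical inventory: (dataset, company, department, owning_team).
# None of these names come from Dataplex; they are placeholders.
DATASETS = [
    ("orders", "company-x", "sales", "sales-analytics"),
    ("leads", "company-y", "sales", "sales-analytics"),
    ("payroll", "company-x", "hr", "company-x-hr"),
    ("payroll", "company-y", "hr", "company-y-hr"),
]

def lakes_by(key_index):
    """Group datasets into candidate lakes using one column as the lake key."""
    lakes = defaultdict(list)
    for row in DATASETS:
        # Qualify each dataset with its company so the two payrolls stay distinct.
        lakes[row[key_index]].append(f"{row[1]}/{row[0]}")
    return dict(lakes)

by_company = lakes_by(1)  # one lake per company
by_owner = lakes_by(3)    # one lake per data-owning team

for name, grouping in (("by company", by_company), ("by owner", by_owner)):
    print(name)
    for lake, members in grouping.items():
        print(f"  {lake}: {members}")
```

In this invented example, the sales data is owned by one team that spans both companies, so grouping by owner puts it in a single shared lake, while the HR data stays split per company. Which grouping is right for you depends on who actually owns and is accountable for each dataset.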
Hi Kolban,
Thanks for your quick response and for pointing me in the right direction. I will take a look at the article and books you suggested.
Thanks again