Welcome to Model Your Reality, a newsletter with musings about data modeling, data warehousing and the like.
Until further notice, each issue will consist of two parts:
a list of data events that might be of interest for you (they definitely are of interest for me) and
some thoughts about a certain data topic (like data vault modeling patterns).
Let’s get started!
Data Events
Recent
I’m still in the process of uploading recordings from this year’s Knowledge Gap data modeling & data architecture conference. They’ll keep appearing on my YouTube channel in the Knowledge Gap playlist over the next few weeks, together with the recordings from the latest Data Modeling Meetups.
Upcoming
If you know about other relevant events in the near future, please mention them in the comments or send an email to admin@obaysch.net. Online events preferred.
2022-10-04: Data Modeling Meetup on data products and data marts (online)
2022-10-05: UK Data Vault User Group on lessons from the field (online)
2022-10-06/13: Data Modeling Masterclass, TEDAMOH (in German, online)
2022-10-10/12: Data Vault Training (CDVDM), Genesee Academy (online)
2022-10-25: Data Modeling Meetup on Activity Schema (online)
2022-11-07: Monday Morning Data Chat with myself (online)
2022-11-16: UK Data Vault User Group with Scott Ambler (online)
2022-12-12: Ghosts of Data Warehousing Past, Present and Future (online)
2022-12-14: UK Data Vault User Group on data mesh and data vault (online)
2023-05-24/25/26: Knowledge Gap data modeling & data architecture conference (online)
Data Mesh, Data Vault, Inmon or Kimball?
I keep seeing questions along the lines of “How can I choose between Data Mesh, Data Vault, Inmon or Kimball?” on LinkedIn and in other places, so it seems we still have to get better (or at least less terrible) at data industry terminology.
The question makes little sense because it mixes up different things:
Data Mesh is a sociotechnical approach that doesn’t prescribe a particular modeling approach, but ensemble modeling approaches like data vault or anchor modeling seem to work quite well with it (see also Roche Diagnostics on the subject).
Inmon stands for the three-layer data warehouse approach where you model the core or integration layer in a more normalized/historization-friendly way (in the 1990s, 3NF+T, today more commonly data vault) and the data mart or presentation layer in a more user-friendly way (dimensional back in the day, now also flat table, activity schema and the like).
Kimball is almost the same except that there is a greater focus on the presentation layer and dimensional modeling while the staging and integration part is left as an exercise to the implementer (Kimball just calls it “backroom” as opposed to the dimensional “front room”). So, you can build a two-layer or three-layer data warehouse and even pick data vault as a modeling approach for core/integration if you do three layers.
Data vault is a data modeling approach best suited for the core layer of a three-layer data warehouse but also useful in other cases where you have to do integration and/or (uni-)temporal historization, like data mesh data products.
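To make “integration and (uni-)temporal historization” a bit more tangible, here is a minimal in-memory sketch of the usual hub/satellite split. Everything in it is an assumption for illustration only: the `CustomerVault` class, the `hub_key` helper, the customer number as business key, the MD5 hashing and the dictionary-based “tables” are all hypothetical, and a real data vault would live in database tables with more metadata.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime


def _digest(text: str) -> str:
    """Hex digest used for both hub keys and change detection."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def hub_key(business_key: str) -> str:
    """Normalize the business key so every source maps to the same hub row."""
    return _digest(business_key.strip().upper())


@dataclass
class CustomerVault:
    # Hypothetical stand-ins for a hub table and a satellite table.
    hub: dict = field(default_factory=dict)        # hub key -> hub row
    satellite: list = field(default_factory=list)  # append-only history rows

    def current(self, business_key: str):
        """Return the latest satellite row for a business key, or None."""
        hk = hub_key(business_key)
        rows = [r for r in self.satellite if r["hub_key"] == hk]
        return max(rows, key=lambda r: r["load_ts"]) if rows else None

    def load(self, business_key: str, attributes: dict,
             source: str, load_ts: datetime) -> None:
        hk = hub_key(business_key)
        # Integration: the hub keeps one row per business key,
        # recorded from whichever source delivers it first.
        self.hub.setdefault(hk, {
            "customer_no": business_key.strip().upper(),
            "load_ts": load_ts,
            "source": source,
        })
        # Historization: append a satellite row only when the
        # descriptive attributes differ from the latest known state.
        hash_diff = _digest(json.dumps(attributes, sort_keys=True))
        latest = self.current(business_key)
        if latest is None or latest["hash_diff"] != hash_diff:
            self.satellite.append({
                "hub_key": hk,
                "load_ts": load_ts,
                "source": source,
                "hash_diff": hash_diff,
                "attributes": dict(attributes),
            })
```

Loading the same customer number from two hypothetical sources with identical attributes yields one hub row and one satellite row; a later load with changed attributes appends a second satellite row, so the history is kept along a single timeline (the load timestamp).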
For a better understanding of data requirements and approaches, maybe it would help to get to the right level of abstraction first.
I can recommend Martijn Evers’s poster work and especially Ronald Damhof’s data quadrants (see this recent presentation from Rogier Werschkull). And of course the collected works of Lars Rönnbäck (see list of publications).
Develop a mental framework first, then it becomes easy to put the different data buzzwords in the right buckets and pick the one that suits you best.
This issue is based on a LinkedIn post the contents of which will hopefully be easier to find now that it has become a Substack issue.
The problem with creating a mental framework is that you need to know the different concepts to do so. What the newbie needs is a story, built from a certain point of view, that can incorporate all these concepts. As the historian tends to say, I am too close to the now to see the bigger picture of what happens now. We are still somewhat in the journalism phase, with models competing rather than standing in some overall story.