Welcome to Model Your Reality, a newsletter with musings about data modeling, data warehousing and the like.
Until further notice, each issue will consist of two parts:
a list of data events that might be of interest for you (they definitely are of interest for me) and
some thoughts about a certain data topic (like data vault modeling patterns).
Let’s get started!
Data Events
Recent
I’m still in the process of uploading recordings from this year’s Knowledge Gap data modeling & data architecture conference. They’ll keep appearing in my YouTube channel on the Knowledge Gap playlist over the next few weeks together with the recordings from the latest Data Modeling Meetups.
Upcoming
If you know about other relevant events in the near future, please mention them in the comments or send an email to admin@obaysch.net. Online events preferred.
2022-10-25: Data Modeling Meetup on Activity Schema (online)
2022-11-07: Monday Morning Data Chat with myself (online)
2022-11-16: UK Data Vault User Group with Scott Ambler (online)
2022-11-23/24/25: Data Vault Training (CDVDM), Genesee Academy (online)
2022-12-12: Ghosts of Data Warehousing Past, Present and Future (online)
2022-12-14: UK Data Vault User Group on data mesh and data vault (online)
2023-05-24/25/26: Knowledge Gap data modeling & data architecture conference (online)
Towards a Model-Driven Organization (Part 2)
By Christian Kaul and Lars Rönnbäck
Part 1 is available here.
Simplicity — the art of maximizing the amount of work not done — is essential.
Mike Beedle, Arie van Bennekum, Alistair Cockburn, Ward Cunningham, Martin Fowler, Jim Highsmith, Andrew Hunt, Ron Jeffries, Jon Kern, Brian Marick, Robert C. Martin, Ken Schwaber, Jeff Sutherland, Dave Thomas, Principles behind the Agile Manifesto (2001)
The current way of working with data is fraught with issues, most of them caused by our current way of working itself. We have described some of the issues in part 1 of our series entitled “Towards a Model-Driven Organization”. As practitioners, we have spent years fighting them, and through our experiences, we’ve come to the realization that most of this could have been avoided if we had put data modeling front and center in our way of working.
You may have heard similar pronouncements before but our take on it is different from how it has been done in the past in several respects. The new way we are about to describe won’t have you bottlenecked by building complex enterprise data models in an ivory-tower fashion before implementing anything; far from it.
From Three to One
If we look at the way an organization works, we can distinguish between three important dimensions:
What an organization is actually doing — reality.
What we say about what the organization is doing — language.
What we store about what the organization is doing — data.
Each of these has been thoroughly studied in a separate discipline, and some studies have examined the intersection of any two of them. The Model-Driven Organization (henceforth MDO), however, aims to merge all three into a single coherent concept. In an MDO, what an organization is actually doing, what we say about it, and what is stored are aligned as closely as possible. We believe that the better aligned the three dimensions are, the fewer of the traditional issues you will encounter.
For a long time, the focus has been on reality, that is, on running your organization, with language and data being secondary considerations. Language and data, however, have always played a crucial role in the survival of an organization. Organizations rely on feedback loops to operate and improve over time. Data is a very efficient way to provide such feedback because it is structured and can be managed programmatically.
Data is important because with proper data management, you know what happened, can infer what is going on and have a chance of planning for the future. How well such insights can be operationalized then depends on language. With poor alignment, time and resources are bound to be wasted. If you just accumulate data without proper management or strategy, this is unavoidable.
In that respect, data models only really become useful when they also work as communication tools, documenting with sufficient detail how an organization works now and how it will work in the future. In the process of creating such a model, the people in the organization develop what Eric Evans calls a “ubiquitous language,” a common vocabulary that makes sure that everyone understands what everyone else in the organization is talking about.
A common language is the first prerequisite for escaping the vicious cycle of siloization. With its help, an organization can overcome the Tower of Babel–like confusion caused by silo-specific dialects that use different words for the same thing or, even worse, the same word for different things. The second prerequisite is to stop seeing a data model as purely technical and specific to an application, such as something that describes the database of a particular system. Instead, think of a data model as a description of what actually happens in the organization, a model that is shared between applications.
This is the type of unified model that lies at the heart of the MDO.
The Model-Driven Approach
In the model-driven organization, the unified model is put at the very center of the organization. The data structure of the model is derived from the goals of the organization, thereby reflecting exactly and specifically what a particular organization aims to do. Only after this model is known is an organizational structure formed, based on the concepts in the model. It may have teams, departments, or projects, with the sole purpose of achieving results that manifest themselves as data in a unified database that implements the unified model.
Data within a model-driven organization
Applications in a model-driven organization do not have their own disparate models, and should ideally not persist any organization-created data outside of the unified database. Instead, they work directly on the unified database, from which they retrieve existing data and to which they write new data. The unified database can thereby at the same time act as a message bus between the applications.
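To make the message-bus idea more concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the unified database. The table layout, the function names (create_unified_db, record_sale, sales_since), and the two "applications" are assumptions made up for illustration; they are not part of the article. One application writes sales, another picks up everything it has not yet seen, and neither keeps a copy of the data.

```python
import sqlite3
from typing import List, Tuple


def create_unified_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the unified database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customer ("
        " customer_id INTEGER PRIMARY KEY,"
        " email TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE sale ("
        " sale_id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " customer_id INTEGER NOT NULL REFERENCES customer(customer_id),"
        " amount_cents INTEGER NOT NULL)"
    )
    return conn


def record_sale(conn: sqlite3.Connection, customer_id: int, amount_cents: int) -> int:
    """Point-of-sale application: writes a new sale straight into the unified database."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO sale (customer_id, amount_cents) VALUES (?, ?)",
            (customer_id, amount_cents),
        )
    return cur.lastrowid


def sales_since(conn: sqlite3.Connection, last_seen_sale_id: int) -> List[Tuple[int, str, int]]:
    """Receipt application: reads every sale it has not processed yet.

    The only state it keeps is the last sale_id it has seen, so the shared
    table doubles as the message channel between the two applications.
    """
    return conn.execute(
        "SELECT s.sale_id, c.email, s.amount_cents"
        " FROM sale AS s JOIN customer AS c ON c.customer_id = s.customer_id"
        " WHERE s.sale_id > ? ORDER BY s.sale_id",
        (last_seen_sale_id,),
    ).fetchall()


if __name__ == "__main__":
    db = create_unified_db()
    with db:
        db.execute("INSERT INTO customer (customer_id, email) VALUES (1, 'ada@example.com')")
    record_sale(db, 1, 1999)
    for sale in sales_since(db, 0):
        print(sale)
```

A real unified database would need concurrency control, access rights, and change notification instead of polling; the sketch only shows that a writer and a reader can cooperate without either owning the data.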
In the MDO, the organizational chart is just another physical implementation of the common logical design. People will work together in small, cross-functional teams that are one-to-one with the concepts that are important to the organization right now. If, for example, your important concepts are Customer, Employee, Product, and Sale, then you’ll have teams called Customer, Employee, Product, and Sale that are responsible for the respective concept, its details, and the physical data store(s) associated with it. For example, in an MDO, statements like the following are natural: “The purpose of our team is to make sure that as many existing customers as possible are related to a repeat purchase” and “The purpose of our team is to make sure that the email addresses of all our customers are as up to date as possible”.
The common business-IT divide will slowly become obsolete because, to fulfill all its responsibilities, each team will have to include both more business-minded and more technical-minded people. At the same time, the one-to-one relationship between concepts and teams will prevent the reemergence of different understandings of the same concept in different parts of the organization.
Of course, none of these teams would or should be an island. There will be defined interfaces between the teams that are one-to-one with the connections from the logical design. Teams are jointly responsible for their common connections and the physical data store(s) associated with them, usually with one team in the lead. In our example, there will be a connection between Customer, Employee, and Sale, and another connection between the Sale and the Products that have been sold. In both cases, it makes sense that the Sale team takes the lead because Sale is the concept that ties all these other concepts together. These institutionalized connections will make sure that no team can isolate itself from the others and degrade into one of the people silos of old.
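The mapping from logical design to organizational chart can be sketched in a few lines of Python. Everything here (the Connection class, the derive_teams function, the connection names) is a hypothetical illustration rather than a prescribed implementation: each concept yields one team, and each connection yields a shared interface with one team in the lead.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Connection:
    """A connection from the logical design (hypothetical representation)."""
    name: str
    concepts: Tuple[str, ...]
    lead: str  # the concept whose team takes the lead


def derive_teams(
    concepts: Iterable[str], connections: Iterable[Connection]
) -> Dict[str, Dict[str, List[str]]]:
    """Derive one team per concept and record which connections it leads or shares."""
    teams: Dict[str, Dict[str, List[str]]] = {
        concept: {"leads": [], "participates_in": []} for concept in concepts
    }
    for connection in connections:
        if connection.lead not in connection.concepts:
            raise ValueError(f"{connection.name}: lead must be one of its concepts")
        for concept in connection.concepts:
            if concept not in teams:
                raise ValueError(f"{connection.name}: unknown concept {concept!r}")
            role = "leads" if concept == connection.lead else "participates_in"
            teams[concept][role].append(connection.name)
    return teams


# The example from the text: Sale ties the other concepts together.
EXAMPLE_CONCEPTS = ("Customer", "Employee", "Product", "Sale")
EXAMPLE_CONNECTIONS = (
    Connection("Customer-Employee-Sale", ("Customer", "Employee", "Sale"), lead="Sale"),
    Connection("Sale-Product", ("Sale", "Product"), lead="Sale"),
)

if __name__ == "__main__":
    for team, duties in derive_teams(EXAMPLE_CONCEPTS, EXAMPLE_CONNECTIONS).items():
        print(team, duties)
```

Because the teams are computed from the model, adding a concept or a connection to the model changes the derived organizational structure in the same step.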
The idea of the MDO is not altogether new. In 2002, Dewhurst et al. introduced a general enterprise model (GEM) much like our unified model and in 2007, Wilson et al. built on this in their paper “A model-driven approach to enterprise integration”. In 2013, Clark et al. coined the term Model Driven Organization, as “an organization that maintains and uses an integrated set of models to manage alignment concerns”. We take these ideas to their farthest extent, where our MDO has one single unified model, whose implementation is a unified database that serves all applications, and from which the organizational structure and terminology can be derived.
Towards a New Application Landscape
A trip down the information technological memory lane will help us understand the difference between traditional applications and applications in the MDO. Back when computers were largely unconnected, it was necessary for data, interfaces, and logic to reside alongside each other. The purpose of the interfaces and the logic was to fetch, display, modify, and create data, usually with a human involved in the process. This meant that even within a single computer running one program, it made sense to separate concerns for the sake of building software that was easy to maintain. During the 1970s, such ideas were even formalized in the programming language Smalltalk-79 under the acronym MVC (Model-View-Controller), a design pattern that lives on in most modern programming languages.
With the widespread use of this pattern, it is somewhat perplexing that in a time when it is hard to say where one computer ends and another begins, or if they are local or in the cloud, the way we think about applications has changed very little. Applications are mostly the same monoliths as they were back in the 1970s, keeping the same single-computer architecture, but now with a virtual machine on top of any number of elastically assigned physical units. We have already gone from the computer as a physical asset to compute as a virtual resource, but this journey is now also beginning for data. Thanks to data being able to flow freely, applications can be built so that it’s also difficult to say where one application ends and another begins, and applications may seamlessly run from the edge to the cloud.
We are now about to face a paradigm shift in application development, where we transition from being application-centric to being model-driven. There are no longer technical limitations preventing applications and the people working with them from speaking a common, ubiquitous language throughout an organization. When things change, the terminology used and the model with which people work can change with them.
Database Support for Model-Driven Applications
The largest obstacle is that existing applications are not geared for immediate use in an MDO. However, many applications are already configurable to work with external “master data”, and the MDO is an extension of that concept. An application will not own any organizational data, in the sense of being allowed to create or identify such data on its own. That responsibility lies with the unified database, much like such responsibility is already outsourced in architectures containing master data management (MDM) systems or entity resolution systems (ERS).
We can distinguish between five types of data in an organization:
Configuration data
Data that is local to an application, does not define the organization, and determines how the application executes.
Operational data
Data that resides in the unified database, defines the organization, and is created by the applications.
Third-party data
Data that resides in or is accessible from the unified database, enriches existing data, and is created by external parties.
Supervisory data
Data that resides in the unified database, assists the maintainers, internally benchmarks parts of the unified model, and is created from logs and usage of data.
Analytical data
Data that resides in the unified database, enlightens the organization, and is derived from operational, third-party, and supervisory data.
Traditional applications work with the first two types, configuration and operational data. Third-party, supervisory, and analytical data have traditionally been the concern of analysts in conjunction with data engineers or data warehouse architects. At a bare minimum, applications tailored to support an MDO can keep configuration data local but must externalize all operational data. We believe that future applications built specifically for an MDO are likely to incorporate all five types to various degrees, depending on which use cases the applications serve.
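The bare-minimum rule can be written down as a small check. The following Python sketch is only an illustration under our own assumptions: the DataKind enumeration mirrors the five types listed in the text, while the function names and the idea of describing an application by the set of data kinds it persists locally are hypothetical.

```python
from enum import Enum
from typing import Iterable, List


class DataKind(Enum):
    """The five types of data distinguished in the text."""
    CONFIGURATION = "configuration"
    OPERATIONAL = "operational"
    THIRD_PARTY = "third-party"
    SUPERVISORY = "supervisory"
    ANALYTICAL = "analytical"


# Only configuration data may stay local to an application.
LOCAL_ALLOWED = frozenset({DataKind.CONFIGURATION})


def misplaced_kinds(locally_persisted: Iterable[DataKind]) -> List[DataKind]:
    """Return the data kinds an application keeps locally but should externalize."""
    misplaced = set(locally_persisted) - LOCAL_ALLOWED
    return sorted(misplaced, key=lambda kind: kind.value)


def meets_mdo_minimum(locally_persisted: Iterable[DataKind]) -> bool:
    """True if the application keeps nothing but configuration data locally."""
    return not misplaced_kinds(locally_persisted)


if __name__ == "__main__":
    legacy_crm = {DataKind.CONFIGURATION, DataKind.OPERATIONAL}
    print(meets_mdo_minimum(legacy_crm))
    print([kind.value for kind in misplaced_kinds(legacy_crm)])
```

A check like this says nothing about how hard the externalization is in practice; it only makes the rule explicit enough to audit an application portfolio against it.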
The access patterns for the types of data listed also vary. Operational data is typically characterized by work done in small chunks and with high concurrency, whereas analytical data is work done in large chunks and with low concurrency. We have therefore seen database systems often specializing in managing one or the other type of load, but not both simultaneously. This is now changing, with offerings like SingleStore and Snowflake’s Unistore. Snowflake even aims to provide “native applications”, their own version of an App Store, where it will be possible to buy applications that run locally on your data.
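The difference between the two access patterns can be illustrated with a toy example. The sketch below uses Python's built-in sqlite3 module purely as a stand-in; it is not how SingleStore or Snowflake work internally, and the table and function names are assumptions made up for this example. The operational query touches a single row by key, while the analytical query scans and aggregates the whole table.

```python
import sqlite3
from typing import Dict, Iterable, Optional, Tuple


def build_sales_db(rows: Iterable[Tuple[int, str, int]]) -> sqlite3.Connection:
    """Load (sale_id, product, amount_cents) rows into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sale ("
        " sale_id INTEGER PRIMARY KEY,"
        " product TEXT NOT NULL,"
        " amount_cents INTEGER NOT NULL)"
    )
    with conn:
        conn.executemany(
            "INSERT INTO sale (sale_id, product, amount_cents) VALUES (?, ?, ?)", rows
        )
    return conn


def operational_lookup(conn: sqlite3.Connection, sale_id: int) -> Optional[Tuple[str, int]]:
    """Operational pattern: a small unit of work that reads one row by its key."""
    return conn.execute(
        "SELECT product, amount_cents FROM sale WHERE sale_id = ?", (sale_id,)
    ).fetchone()


def analytical_revenue_by_product(conn: sqlite3.Connection) -> Dict[str, int]:
    """Analytical pattern: one large unit of work that aggregates over all rows."""
    return dict(
        conn.execute(
            "SELECT product, SUM(amount_cents) FROM sale GROUP BY product"
        ).fetchall()
    )


if __name__ == "__main__":
    db = build_sales_db([(1, "book", 1500), (2, "pen", 200), (3, "book", 2500)])
    print(operational_lookup(db, 2))
    print(analytical_revenue_by_product(db))
```

In a real system, many lookups of the first kind run concurrently while few queries of the second kind each read large parts of the data, which is why storage layouts and engines have typically been optimized for one or the other.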
Benefits of Going MDO
In the problem statement found in part 1, six factors were listed that distance the de facto way of working from the ideal way of working in an organization. We will now show how the MDO will help bring the way of working closer to the ideal.
The goals of the organization exist vaguely and localized to some select individuals. While a traditional organization’s overarching purpose may be well known to its workforce, that purpose is usually hard to translate into the rationales behind the work put into daily operations on an individual level. In the MDO, everyone has a crystal-clear purpose; people’s actions serve the purpose of fetching, modifying, and creating data about a specific concept. It’s clear where to find customer data because all customer data can be found in one place, shepherded by the Customer team. While there isn’t one authority for everything, the team that is responsible for a concept is the one authority for everything related to that concept.
The de-facto way of working is a heritage from a different time. The way of working in the MDO will always reflect the model, and as long as the model is up to date with reality, the risk of not working on what you should be working on is greatly reduced. Given that even the organizational structure is tied to the model, this provides a lock-in to the model. You cannot change the business without first changing the model. If activities are discovered that lie outside of what the model dictates, either the model needs to quickly adapt or the activities cease.
The de-facto way of working strays from the ideal because of management fads. Many organizations change their way of working based on current trends in management theory. For the MDO, the way of working is more well defined as it unites the three earlier mentioned dimensions: reality, language, and data. That leaves less room to wiggle in exotic forms of management. Individuals have clear objectives tied to results that show up as data in the unified database. Given how tangible work then becomes, the desire to look for other ways of working should also diminish.
The de-facto way of working is externally incentivized by vendors who benefit from it. Many organizations suffer from various degrees of vendor lock-in, limited to what those vendors provide and lagging behind their roadmaps. With reusable model-driven applications working on a single unified database, the important things for your organization are all under the control of the MDO. What is expressed through the three dimensions (reality, language, and data) can no longer become opaque in the hands of a vendor. This greatly reduces the power any vendor can hold over an organization.
The de-facto way of working is a compromise due to technological limitations. When this happens, the organization is either stuck with legacy systems that are near impossible to replace or the organization has requirements that no existing applications can fill. In both of these cases, developing a custom solution should be simpler in an MDO, given that data is already externalized from applications. In-house development will be important for an MDO, especially before the paradigm shift is complete and reusable model-driven applications are commonplace.
The de-facto way of working is sufficient to be profitable. For an already profitable organization, there is little incentive to change, even if the way of working has great inefficiencies. In the MDO, inefficiencies are easier to discover, thanks to the observability of work done through the data that results from that work. If activities lead to no or undesired outcomes, those activities can be spotted more easily. Inefficiencies can thereby be dealt with before they become the norm in the organization.
Conclusion
Whatever you may think of its feasibility with respect to your own organization right now, the model-driven organization is a force to be reckoned with. Given the VUCA (short for volatility, uncertainty, complexity, and ambiguity) world we are living in, having an accurate and up-to-date model of your organization that readily translates into organizational activities and IT systems has become crucial for the success and even the survival of an organization.
For many years, supporting factors like relative global peace, a lack of pandemics more serious than the occasional flu variant, low interest rates, and high venture capital investment meant that organizations could stay afloat without necessarily being very efficient or even profitable. In a comparatively short period of time, these supporting factors have disappeared one by one and probably won’t return for quite some time.
So, for the first time in decades, many organizations really have to know what they are doing, in more than one regard:
To be able to react quickly to unforeseen disruptions, you need an accurate and continuously updated, unified model of your organization.
Only if you know what is happening and how things are related can you change your way of working, the conceptual buckets into which you put things, and the teams into which your employees and colleagues are subdivided fast enough.
Only if you can automatically generate new data structure and org structure items as soon as you have recognized the need for a change can you hit the ground running instead of wasting time and money on extensive transformation projects.
And only if you can avoid the usual disconnect between operating model, org structure, and data model can you operate efficiently enough to survive as an organization in a time of seriously constrained access to capital.
So, we don’t think it’s an exaggeration to say that adopting the MDO way of working might make the difference between the continued existence of your organization and its untimely demise. We’ll get into more details in parts 3 and 4 of this article series, compare the MDO to other approaches, and hint at a roadmap for adopting it in your organization.
Ah, great post. Took me too long to find time for it.
just as a highlight:
it is somewhat perplexing that in a time when it is hard to say where one computer ends and another begins, or if they are local or in the cloud, the way we think about applications has changed very little.
While I completely agree with the idea of the common language, it is seriously difficult to attain, because language is a living thing. I would be happy if I could offer sufficiently unambiguous definitions, so that the data can be understood and found and used in a consistent manner. In a large company with many departments that have little in common that is probably the good result to strive for, to prevent the best result ("ubiquitous language") from being the enemy of (a) good (common understanding).
Even in a setting with some dialects, all applications can work directly on the unified database, from which they retrieve existing data and to which they write new data. The unified database can thereby at the same time act as a message bus between the applications.