Decoupling Data: Strategies and Considerations
    
    
    
    
    
      It may be that an organization must evolve its data's fundamental
      structure in order to scale operations. In the e-commerce space, this
      can take several forms. For instance, the catalog may be unable to grow
      without an unaffordable increase in computational cost. That is a serious
      problem because a company that cannot grow its catalog cannot compete
      effectively, and so loses market share and business. Another example is
      slow developer velocity, which grinds to a halt because shipping even the
      smallest feature requires testing a web of complex dependencies. Slow
      velocity means inefficient operations, which also means less business.
      These are two examples of how the data's inability to scale manifests as
      business problems.
    
    
    
      Access to the data's implementation details is the root cause. This
      manifests as the use of direct SQL rather than APIs, which means that
      consumers have a very close relationship to the data's technological
      configuration. That is, they know more about the data than what it
      literally is: they know the tables, the columns, the foreign key
      relationships, the databases, the hardware details, and so on. It would
      be like knowing Google Search's internal querying logic, which is very
      complex and unnecessary for performing a simple query; for most people,
      the Google Search website is sufficient. In a small company it is fine
      to be close to the implementation details, but at scale it creates
      problems, as we shall see.
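
      To make the coupling concrete, here is a small, hypothetical sketch (the
      orders and order_items tables, their columns, and the orders_api
      interface are all made-up names for illustration). The direct-SQL
      consumer has to know the schema; the API consumer only knows the
      question it wants answered.

```python
import sqlite3


def order_total_direct_sql(conn: sqlite3.Connection, customer_id: int) -> float:
    # Coupled: this code knows the table names, the join keys, and the
    # column that holds the price. Any schema change breaks it.
    row = conn.execute(
        """
        SELECT SUM(oi.unit_price * oi.quantity)
        FROM orders o
        JOIN order_items oi ON oi.order_id = o.id
        WHERE o.customer_id = ?
        """,
        (customer_id,),
    ).fetchone()
    return row[0] or 0.0


def order_total_via_api(orders_api, customer_id: int) -> float:
    # Decoupled: the consumer only knows the interface. How the data is
    # stored (tables, databases, hardware) stays hidden behind orders_api.
    return orders_api.get_order_total(customer_id=customer_id)
```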
    
    
    
      Because consumers have access to the implementation details, their
      quantity and diversity make it challenging to evolve the data structure.
      The reason is this: say a table has to be moved to another database.
      Then every consumer has to know about the move, which is a major problem
      because each one has to reconfigure its queries to reach the new table.
      The result is that a normally simple task becomes practically
      impossible, because every consumer has to migrate. They depend on the
      current implementation to access the data, so having all of them rewrite
      their code to accommodate the change is messy and expensive. Something
      has to be done about this, because if the refactoring doesn't happen,
      the company loses the opportunity to scale the operation, and in the
      worst case it cannot even support the current state.
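
      Continuing the hypothetical sketch above, this is roughly what a single
      direct-SQL consumer would have to rewrite if the order_items table moved
      to a separate database: the one-statement JOIN is no longer possible,
      and every consumer has to reimplement the stitching itself. The table
      and connection names remain assumptions for illustration.

```python
import sqlite3


def order_total_after_table_move(
    orders_db: sqlite3.Connection,
    items_db: sqlite3.Connection,  # order_items now lives in a different database
    customer_id: int,
) -> float:
    # The old single JOIN must become two round-trips stitched in code.
    order_ids = [
        row[0]
        for row in orders_db.execute(
            "SELECT id FROM orders WHERE customer_id = ?", (customer_id,)
        )
    ]
    total = 0.0
    for order_id in order_ids:
        row = items_db.execute(
            "SELECT SUM(unit_price * quantity) FROM order_items WHERE order_id = ?",
            (order_id,),
        ).fetchone()
        total += row[0] or 0.0
    return total

# A consumer calling order_total_via_api would not change at all; only the
# implementation behind the interface would.
```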
    
    
    
      Some other examples illuminate how this situation becomes untenable at
      scale. To start, consider the existence of multiple sources of truth. At
      scale, this is a significant problem: consistency becomes complex to
      maintain, and there is a multiplying effect because the implementation
      details are exposed in many places, many times over. Another example is
      the challenge of testing amidst the tech debt. If, for instance, there
      is a preponderance of 1000-line classes without tests, then verifying a
      new feature is an unaffordable amount of work. Needless to say, classes
      of that size mean only one thing: coupling to the implementation
      details. In fact, the need for data refactoring is very similar to the
      need for general refactoring in this respect: the size of a single
      entity causes too much coupling. It is not the goal here to elaborate on
      every example; anyone in the thick of data restructuring knows the
      problem is real and feels no dearth of evidence.
    
    
    
    See https://github.com/Adrianjewell91/decoupler-website/blob/main/README.md
    
    
    
      The first requirement is to make it easy for consumers to migrate;
      otherwise there is no point in solving the problem. In other words,
      consumers should be able to continue as they were before the changes
      were made, with no interruption to operations.
    
    
    
      Another requirement is to refactor the data structure incrementally.
      Strictly speaking, incremental wins are not a hard requirement, but they
      should be treated as one, especially if some of the data needs to be
      refactored before other parts. Additionally, it is good practice to
      think incrementally: stakeholders usually expect it, and it avoids the
      temptation of waterfall. This last point is important. Waterfall creates
      the risk of building a huge product that nobody ends up using. It comes
      from assuming that clients will adopt the new platform no matter what,
      but things do not work like that in practice.
    
    
    
      The complexity of the effort will also pose a challenge, but thankfully
      it divides roughly into two types. The first is structural: modifying
      columns, adding new data, adding validations, and so on. These examples
      are real; migrating tables to new databases is exactly what needs to
      happen, and even more than that. Tables need to be rewritten,
      refactored, and moved not only to new databases but to entirely
      different kinds of databases, which matters greatly for saving money on
      software costs. The second kind is just as important: the reworking of
      the business domains themselves. This is a conceptual refactoring, where
      new business entities emerge from the growing requirements of the
      business. Of course, the two types of refactoring support and inform
      each other, but they are also distinct. Migrating a database table is
      different from conceiving of a new business domain, such as a new
      Catalog or Marketing concept.
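
      A rough sketch of the distinction, using hypothetical names: the first
      kind keeps the interface and swaps the storage behind it, while the
      second introduces a new business concept that is defined in business
      terms rather than in terms of any particular table.

```python
from typing import Protocol


# Kind 1: implementation refactoring. The interface stays the same while the
# backing store changes (e.g. a relational table replaced by a document store).
class ProductStore(Protocol):
    def get_product(self, product_id: str) -> dict: ...


class RelationalProductStore:
    def get_product(self, product_id: str) -> dict:
        ...  # e.g. SELECT ... FROM products WHERE id = ?


class DocumentProductStore:
    def get_product(self, product_id: str) -> dict:
        ...  # e.g. fetch a document by key from a document database


# Kind 2: domain refactoring. A new business concept (here, a Catalog) is
# introduced; it is defined by business rules, not by any table layout.
class Catalog:
    def __init__(self, products: ProductStore):
        self._products = products

    def sellable_items(self, region: str) -> list[dict]:
        ...  # business rules about what is sellable in which region
```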
    
    
    
    
    
      Hide the implementation details in a systematic and simple way, and,
      importantly, do it without interfering with the software systems that
      currently consume the data. Practically, this means creating new
      interfaces, migrating clients, and then refactoring behind the
      interfaces. This greatly simplifies the software system by decoupling
      the data from how it is stored. Once the data is behind interfaces, each
      piece of data can evolve independently of the others while retaining the
      existing relationships. This is what every successfully scaled company
      has done in order to grow, and the benefits are discussed at great
      length by the people who implemented it, so we know it works. It also
      works because, in principle, the current data interfaces aren't bad;
      they just need to be able to grow and develop independently of one
      another while keeping the existing functionality. So it is fine to keep
      them, and to change and grow them under the hood.
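
      A minimal sketch of the "new interface first, refactor later" sequence,
      assuming a hypothetical OrdersInterface and the same made-up tables as
      earlier. The interface is what consumers adopt today; its body initially
      just wraps the existing tables, so nothing breaks, and the storage can
      be rewritten later without consumers noticing.

```python
import sqlite3


class OrdersInterface:
    """What consumers are asked to call instead of writing SQL themselves."""

    def __init__(self, legacy_db: sqlite3.Connection):
        self._legacy_db = legacy_db

    def get_order_total(self, customer_id: int) -> float:
        # Step 1: today, simply wrap the existing tables.
        # Step 2: migrate consumers onto this method.
        # Step 3: rewrite this body (new database, new service, new domain
        #         model) without touching any consumer.
        row = self._legacy_db.execute(
            """
            SELECT SUM(oi.unit_price * oi.quantity)
            FROM orders o JOIN order_items oi ON oi.order_id = o.id
            WHERE o.customer_id = ?
            """,
            (customer_id,),
        ).fetchone()
        return row[0] or 0.0
```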
    
    
    
    
      
        - Assume that the starting point is a monolithic software system. In
          this system, clients are coupled to the implementation details of
          the data, in the way expressed by the diagram below.

        - First, wrap each table behind an interface. This is an agile way to
          make the new interfaces available immediately for adoption, without
          too much work, so restructuring can start happening incrementally.

        - Second, create new domains and migrate data as needed. This gives
          teams their correct responsibilities: data owners address the
          implementation details, while domain owners take care of the domain
          restructuring.

        - Third, migrate existing clients behind the scenes as needed. Inject
          the interfaces into the old system in a way that is hidden from the
          consumers. Then lock the old interface against new features and
          require any new work to use the new interfaces. There are many ways
          to do this, even at the database level. In a SQL world, the strategy
          would be to map every existing SQL query to an interface under the
          hood, and then lock the SQL queries so that no new ones can be
          written. Any new feature would then require adoption of the new data
          structures, because the old queries are locked against change. This
          strategy works because it is programmatic rather than negotiated: no
          effort is spent convincing teams to adopt; the rules are set and
          enforced programmatically, and adoption follows. A minimal sketch of
          this locking approach follows this list.
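
      Here is that minimal sketch of the programmatic lock, using hypothetical
      names. Every known legacy query is registered and mapped to a method on
      a new interface; anything not in the registry is rejected, so new
      features are pushed onto the new interfaces rather than onto raw SQL.

```python
# Frozen legacy SQL, mapped to its replacement on the new interfaces.
LEGACY_QUERY_REGISTRY = {
    "SELECT id FROM orders WHERE customer_id = ?": "OrdersInterface.get_order_total",
    "SELECT name, price FROM products WHERE id = ?": "ProductStore.get_product",
}


class UnregisteredQueryError(Exception):
    """Raised when code tries to run raw SQL that is not in the frozen set."""


def execute_legacy_sql(conn, sql: str, params: tuple):
    # Gatekeeper for raw SQL: existing queries keep working, new ones do not.
    if sql.strip() not in LEGACY_QUERY_REGISTRY:
        raise UnregisteredQueryError(
            "Raw SQL is locked; use the new interfaces (see the registry)."
        )
    return conn.execute(sql, params).fetchall()
```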
            
    
    
    
      One risk is making life difficult for consumers, chiefly by mandating
      that they refactor their own code with their own engineering effort.
      This mistake is easy to make when the company doesn't "eat its own
      dogfood" and therefore assumes migration is easy for clients. It rarely
      is, and the decision costs the company a great deal of time, because
      consumers have their own priorities and cannot simply stop all their
      work whenever engineering teams tell them to migrate to a new interface.
      It basically defeats the whole point of solving the original problem,
      which was to stop teams from having to change their integration every
      time the data structure changes. So in the end there is a massive risk:
      consumers won't migrate.
    
    
    
      A second risk is striving for waterfall-based, rather than incremental,
      wins. This happens when a company decides to build a new platform as
      complex as the previous one before anyone has adopted it. That is a lot
      of work: building it, and figuring out what to build, because the
      intended consumption patterns and the new business domains have to be
      discovered first. This is very hard to do because the future is
      unpredictable and could change, and if it changes, the work will have
      been wasted. Hopefully consumers adopt, but it is not known when. With
      that uncertainty, the whole point of the project is in question: the
      important work of refactoring the data structure. That work won't start
      until the new platform is built and then adopted, which will take a very
      long time. In the meantime, the legacy data structures remain in place,
      so no progress is made on the original problem.
    
    
    
      A third risk is asking the data owners specifically to perform all of
      the data restructuring. This will be a time-sink, because teams that
      take on too much get very little done. It might not seem like a mistake,
      since the data owners know a lot and have ideas about how their data
      should look in the future, but their picture is incomplete. The data
      consumers also have ideas: about new domains, about what shape they want
      the data to take, and about how they want to integrate it. Therefore the
      domain-specific refactoring should rest with the consumers as much as
      possible, because they know what they want better than anyone else. If
      the data owners try to do the consumers' work, or vice versa, teams will
      spend too much time asking other teams what they actually want, and a
      lot of time is wasted on discussion and planning instead of building.
    
    
    
      If any of these risks is likely to materialize, it is a symptom of a
      greater problem: the focus has shifted away from the original problem,
      which still stands, namely refactoring the data structure. A few
      consumers will probably still adopt, but it is clear what is happening:
      the company is focused on building a new platform, in a waterfall way,
      for an imagined, pre-planned, overarching future use-case that may or
      may not exist, and then hoping everyone will adopt it. Only if all of
      that works can the company finally focus on rewriting the data
      structures under the hood. That moment is too far away, and by then it
      may be too late.
    
    
    Additional Risks and Management
    
      
        - Performance:
            - The decoupled architecture removes the SQL Join as an
              architectural lynchpin, which could be undesirable because the
              relational model dissolves and, since joins are very efficient,
              there are possible performance implications. However, this is
              permissible because the decoupled system offers many
              alternatives to the Join, such as caching and aggregation, to
              name a few. A rough sketch of one such alternative follows this
              list.

        - Complexity:
            - A decoupled architecture introduces complexity elsewhere, as it
              will consist of many micro-services, but that is permissible
              because they can be observed and monitored with modern
              observability software.

        - Cost of Resources:
            - The cost of running the decoupled architecture needs to be lower
              than the cost of running the database monolith. That should
              hold, because the system will be on average the same size as the
              database monolith while only some of its components need to
              scale, which nets out cheaper than scaling the whole monolith.
              Additionally, the ROI on decoupling should justify the costs.
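
      Here is that rough sketch of one alternative to the SQL Join: an
      application-level aggregation over two hypothetical service interfaces
      (ProductService and OrderService are made-up names), with a small cache
      so repeated product lookups stay cheap.

```python
from functools import lru_cache


class ProductService:
    def get_product(self, product_id: str) -> dict:
        # Placeholder for an HTTP/RPC call into the product domain.
        return {"id": product_id, "name": f"product-{product_id}"}


class OrderService:
    def list_items(self, order_id: str) -> list:
        # Placeholder for an HTTP/RPC call into the order domain.
        return [{"order_id": order_id, "product_id": "p1", "quantity": 2}]


class OrderReportBuilder:
    def __init__(self, orders: OrderService, products: ProductService):
        self._orders = orders
        # Cache product lookups so repeated references stay cheap.
        self._get_product = lru_cache(maxsize=1024)(products.get_product)

    def order_with_product_names(self, order_id: str) -> list:
        # Previously a single JOIN between order_items and products; now an
        # aggregation of two interface calls, cached per product.
        return [
            {**item, "product_name": self._get_product(item["product_id"])["name"]}
            for item in self._orders.list_items(order_id)
        ]
```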