Why Using a Single Source of Truth is Important

Yvonne Chen
4 min read · Feb 11, 2021

A look at the single source of truth approach in Ruby’s object relationships and how it applies to real-life situations…

Coming from a financial data analyst background, I have learned how important it is to keep data consistent and accurate. During the first phase of the software engineering immersive at Flatiron School, I learned about the single source of truth in Ruby’s object relationships, and it caught my attention immediately. In my opinion, it’s easier to eliminate data discrepancies through a single source of truth when the data set is not too large, such as a local business’s data. Today, I want to show you what happens when the same information is stored in different places in our code, as well as why using a single source of truth is important.

Let’s consider a service similar to Kickstarter, which has two main entities: projects, which need to be funded by backers, and backers, which are individuals who can fund a project.

class Project
  attr_reader :title

  def initialize(title)
    @title = title
    @backers = []
  end

  def add_backer(backer)
    @backers << backer
  end

  def backers
    @backers
  end
end

class Backer
  attr_reader :name

  def initialize(name)
    @name = name
    @backed_projects = []
  end

  def back_project(project)
    @backed_projects << project
  end

  def backed_projects
    @backed_projects
  end
end

In this first solution, we have two classes: Project and Backer. Backers can be added to projects, and projects can be added to backers. However, we run into some interesting behavior when backing projects and adding backers.

steven = Backer.new("Steven")
michael = Backer.new("Michael")

project_1 = Project.new("project_1")
project_2 = Project.new("project_2")

steven.back_project(project_1)
project_1.add_backer(michael)

michael.back_project(project_2)
project_2.add_backer(steven)

steven.backed_projects # => [project_1]
michael.backed_projects # => [project_2]

project_1.backers # => [michael]
project_2.backers # => [steven]
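To make the mismatch easier to see, here is a minimal sketch that checks whether the two sides of a relationship agree. The in_sync? helper is my own addition for illustration and is not part of the solution above; the classes are condensed copies of the first solution so the snippet runs on its own.

```ruby
# Condensed copies of the first solution's classes.
class Project
  attr_reader :title, :backers

  def initialize(title)
    @title = title
    @backers = []
  end

  def add_backer(backer)
    @backers << backer
  end
end

class Backer
  attr_reader :name, :backed_projects

  def initialize(name)
    @name = name
    @backed_projects = []
  end

  def back_project(project)
    @backed_projects << project
  end
end

# Hypothetical helper (not in the original code): true when the project
# and the backer agree on whether the relationship exists.
def in_sync?(project, backer)
  project.backers.include?(backer) == backer.backed_projects.include?(project)
end

steven = Backer.new("Steven")
project_1 = Project.new("project_1")

steven.back_project(project_1)
in_sync?(project_1, steven) # => false, only the backer knows about it

project_1.add_backer(steven)
in_sync?(project_1, steven) # => true, but only because we updated both sides by hand
```

The data only lines up when the caller remembers to update both objects, which is exactly the kind of manual bookkeeping that tends to go wrong.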

Intuitively, when a backer backs a project, the project should include that backer in its list of backers. Likewise, when a project adds a backer, the project should be included in the backer’s list of projects. However, that is not happening here. The two classes are not talking to each other, so the data we end up with depends on which side the project-backer relationship was established from. Because all the data is tracked independently, this would be considered a fully denormalized solution. Let’s take a look at a normalized solution which more accurately models the relationships between our entities.

class Project
  attr_reader :title

  def initialize(title)
    @title = title
    @project_backers = []
  end

  def add_backer(backer)
    # Guard: without it, add_backer and back_project would call each other forever.
    return if backers.include?(backer)

    @project_backers << ProjectBacker.new(self, backer)
    backer.back_project(self)
  end

  def backers
    @project_backers.map { |project_backer| project_backer.backer }
  end
end

class Backer
  attr_reader :name

  def initialize(name)
    @name = name
    @project_backers = []
  end

  def back_project(project)
    # Same guard as in Project#add_backer.
    return if backed_projects.include?(project)

    @project_backers << ProjectBacker.new(project, self)
    project.add_backer(self)
  end

  def backed_projects
    @project_backers.map { |project_backer| project_backer.project }
  end
end

class ProjectBacker
  attr_reader :project, :backer

  def initialize(project, backer)
    @project = project
    @backer = backer
  end
end

Here, we introduce a join class, ProjectBacker, which models the relation between a project and a backer. Now, adding a project to a backer also adds the backer to the project, and vice-versa.

steven.backed_projects # => [project_1, project_2]
michael.backed_projects # => [project_1, project_2]
project_1.backers # => [steven, michael]
project_2.backers # => [michael, steven]

It seems we have solved all the shortcomings of the initial solution. However, we are creating duplicate ProjectBacker entries every time a project-backer relationship is established: the project stores one instance and the backer stores another. Moreover, creating new ProjectBacker instances directly does not result in any changes on existing Project or Backer instances. We can do better…

class Project
  attr_reader :title

  def initialize(title)
    @title = title
  end

  def add_backer(backer)
    ProjectBacker.new(self, backer)
  end

  def backers
    project_backers = ProjectBacker.all.select { |project_backer| project_backer.project == self }
    project_backers.map { |project_backer| project_backer.backer }
  end
end

class Backer
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def back_project(project)
    ProjectBacker.new(project, self)
  end

  def backed_projects
    project_backers = ProjectBacker.all.select { |project_backer| project_backer.backer == self }
    project_backers.map { |project_backer| project_backer.project }
  end
end

class ProjectBacker
  attr_reader :project, :backer

  @@all = []

  def initialize(project, backer)
    @project = project
    @backer = backer
    @@all << self
  end

  def self.all
    @@all
  end
end

steven = Backer.new("Steven")
michael = Backer.new("Michael")

project_1 = Project.new("project_1")
project_2 = Project.new("project_2")

steven.back_project(project_1)
project_2.add_backer(michael)

ProjectBacker.new(project_1, michael)
ProjectBacker.new(project_2, steven)

steven.backed_projects # => [project_1, project_2]
michael.backed_projects # => [project_2, project_1]

project_1.backers # => [steven, michael]
project_2.backers # => [michael, steven]

In this third and final solution, I’m demonstrating what I personally think is the best way to keep our data up to date. In the example above, Project and Backer no longer track any project-backer relationships on their own; every relationship lives in ProjectBacker.all. Changes made through a project are reflected in its backers, and changes made through a backer are reflected in their backed projects. Using the ProjectBacker join class directly also results in the changes we expect. Although the third solution loops through the full list of relationships on every lookup, I still consider it the most reliable way to keep our data in sync across the board. It also represents a fully normalized data model.
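As a rough way to confirm this, the sketch below checks that every project and backer pair agrees in both directions, no matter which of the three entry points created the relationship. The all_in_sync? helper is a hypothetical name I am introducing for illustration, and the classes are condensed copies of the third solution so the snippet runs on its own.

```ruby
# Condensed copies of the third solution's classes.
class ProjectBacker
  attr_reader :project, :backer

  @@all = []

  def initialize(project, backer)
    @project = project
    @backer = backer
    @@all << self
  end

  def self.all
    @@all
  end
end

class Project
  attr_reader :title

  def initialize(title)
    @title = title
  end

  def add_backer(backer)
    ProjectBacker.new(self, backer)
  end

  def backers
    ProjectBacker.all.select { |pb| pb.project == self }.map(&:backer)
  end
end

class Backer
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def back_project(project)
    ProjectBacker.new(project, self)
  end

  def backed_projects
    ProjectBacker.all.select { |pb| pb.backer == self }.map(&:project)
  end
end

# Hypothetical helper (not in the original code): true when every
# project/backer pair agrees on the relationship in both directions.
def all_in_sync?(projects, backers)
  projects.all? do |project|
    backers.all? do |backer|
      project.backers.include?(backer) == backer.backed_projects.include?(project)
    end
  end
end

steven = Backer.new("Steven")
michael = Backer.new("Michael")
project_1 = Project.new("project_1")
project_2 = Project.new("project_2")

steven.back_project(project_1)        # created through the backer
project_2.add_backer(michael)         # created through the project
ProjectBacker.new(project_1, michael) # created through the join class

all_in_sync?([project_1, project_2], [steven, michael]) # => true
```

Because both lookups read from the same ProjectBacker.all list, there is no second copy of the data that could drift out of date.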

In today’s world, many companies tend to use the most recent data to evaluate their business performance. In this case, we want to make sure the information we have collected is fresh, accurate and reliable. This is the reason using a single source of truth is important. In addition, it helps prevent us from having any data discrepancies, incorrect data entries, and missing data updates when it comes to data mutations.

Thanks for reading!

Yvonne Chen

Full Stack Software Engineer with Finance Background