Data Flow / Architecture

Introduction
Unidirectional Data Flow
Component Responsibility

Dumb Components
Smart Components

Flux

Stores
Actions
Dispatcher
Immutability
Benefits
Async
Flavors

Summary

Introduction

Recommended Viewing

Full-Stack Flux - Pete Hunt from Instagram/Facebook discusses what problems Flux is designed to solve, and how centralizing the source of truth in your application can naturally minimize complexity.
Redux - Redux, a not-so-Flux-like Flux implementation. Video describes what inspired it and how it relates to Flux, and the benefits that result from its divergence. Helps solidify general Flux knowledge through comparisons and examples.
React.js Conf 2015 - Round-up of all the talks from this awesome conference.

Unidirectional Data Flow

While it may be self-evident, unidirectional data flow, formally stated, describes a one-way architecture, specifically one that is top-down. In the real world, this means that data enters your application at the highest possible level and trickles down. As an example, if a component's responsibility is to display data in a certain format, it doesn't need to be concerned about where that data comes from - it simply comes "down" into the component via a property. Conversely, a view might need to make a request for a specific resource, in which case it will be the one responsible for connecting to a store and providing the necessary properties to its children.

The influences of this architecture can be seen all the way down at the component level. Because data flows from top to bottom, components become reactive, and therefore declarative, as they only need to describe how to render given a certain input. Approached functionally, a component can therefore be thought of as:

f(x) = ...

where the component is a function of its input, x (in the case of React, x is the set of properties that a component receives, and the result of the function is how that component looks when it renders). This means that you can begin thinking about your components as representations of slices in time.

Think back to middle school for a minute. Remember learning that in order for f(x) to be a function, there can be only one y value for every x? If you pass two values to a function, e.g.:

function add (a, b) {
  return a + b;
}

you can reasonably expect to get the same output every time you pass the same input; anything else and what's the point? If you extrapolate this to the application level, you should be able to pass in some state (properties) and obtain the same result every time. This is far easier said than done, and as a result it's common to introduce side effects and impure functions, ultimately complicating applications to the point that they become non-deterministic and difficult to reason about. However, if you succeed, your application becomes a simple function composed of declarative components, allowing you to easily reproduce any state just by providing the correct input.

On the component level, by adhering to pure functions (in React all render methods should be pure) components become referentially transparent. This makes testing incredibly easy because, just like your application, you can just pass a component a set of properties and analyze the result; no need to track what mutations occur over time (not to mention the myriad of possible combinations and timings). This is just part of why functional reactive programming is awesome.
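As a rough sketch of that idea, here is a plain function standing in for a component's render method. It returns a string rather than real React output, and the name renderTodo and the shape of its properties are assumptions made purely for illustration:

```javascript
// Hypothetical stand-in for a render method: the output depends only on
// the properties passed in, never on hidden or changing state.
function renderTodo (props) {
  const status = props.complete ? 'done' : 'pending';
  return `<li class="${status}">${props.title}</li>`;
}

// Testing is just "pass in properties, inspect the result".
const first = renderTodo({ title : 'Write docs', complete : false });
const second = renderTodo({ title : 'Write docs', complete : false });

console.log(first === second); // true
```

Because nothing outside the arguments influences the result, there is no setup or teardown to speak of when testing it.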

Compare this to what a directive and its controller in Angular might be responsible for: influencing and enacting changes over time. This means that how a component looks when it first renders might not be how it looks in 2 minutes, even though the properties it was given haven't changed! How do you reliably test this? It's incredibly difficult because these types of components are impure and can have different outputs for the same input. You have to remember everything that goes on inside the component, rather than having it simply react to changes that come from the outside. The imperative approach is far too common, primarily because it's intuitive, and it quickly leads to unneeded complexity. Thankfully, React makes the imperative approach very difficult.

Following the reactive paradigm encouraged by React, you end up almost completely eliminating an entire dimension (time) from the equation. This is called reactive programming, and it fits incredibly well with unidirectional data flow, where state changes occur at the top level and propagate down, allowing components to automatically re-render to reflect those changes. This can't be overstated: components are merely blueprints for describing how a given state should render.

So how do we implement this? And what about the data? Where is it stored and how does it get changed? These are great questions, but they are implementation details secondary to the overarching design. The section on Flux will delve further into answering these questions.

Component Responsibility

Recommended Reading

Smart and Dumb Components - Dan Abramov, the creator of Redux and React-hot-loader, discusses component design patterns that have evolved over time.

Components should touch as little surface area as possible, especially where data is concerned. Doing so eliminates complexity and makes it far easier to test and reuse components since there are few moving parts. For example, a TodoList component does not need to know where the list of todos comes from, or what to do when the user wants to delete one. Such a component would be classified as a dumb component, which is covered in more detail below.

This results in two primary types of components: Smart and Dumb components. Dumb components will make up the majority of your codebase, and can be reused pretty much anywhere, since they don't control how data is obtained or manipulated. Smart components, on the other hand, are the "top level" of an application. They are responsible for distributing data down into their constituent dumb components, which simply do the bidding of whoever implements them.

Dumb Components

This type of component is the building block of an application. It is what takes state (in the form of properties) and transforms it into something meaningful for the user. A TodoList takes a collection of Todos and displays it in a friendly list format, possibly as smaller Todo components.

In order for a Dumb component to be meaningful, it needs to expose an API (again, in the form of properties) that implementors must adhere to. This can range from simple properties that provide data, to functions that the component can invoke in response to different events. With the TodoList component, the TodoList doesn't need to know how to remove todos or flag them as complete. Maybe you have 5 different types of TodoLists in your application that all use different sets of data from totally separate sources - maybe some of them are even tied directly to the server. Because of this, it doesn't make sense for the TodoList to make assumptions about how to operate on the data it's given. Instead, it might invoke handlers such as "deleteTodo" or "toggleTodoCompletion", which its parent provides and can hook into. New data for a TodoList flows into the TodoList - top-down, unidirectional data flow!
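To make the handler idea concrete, here is a minimal sketch in plain JavaScript. The TodoList function below returns simple objects instead of real React elements, and the property names (todos, deleteTodo) are assumptions for illustration only:

```javascript
// Hypothetical dumb component: it receives data and handlers as properties
// and returns a plain description of what should be displayed.
function TodoList (props) {
  return props.todos.map(todo => ({
    label    : todo.title,
    // The list does not know how deletion works; it only reports intent.
    onDelete : () => props.deleteTodo(todo.id)
  }));
}

// The parent decides what "deleteTodo" actually means.
const deleted = [];
const items = TodoList({
  todos      : [{ id : 1, title : 'Buy milk' }, { id : 2, title : 'Walk dog' }],
  deleteTodo : id => deleted.push(id)
});

items[1].onDelete();
console.log(deleted); // [ 2 ]
```

A different parent could pass a deleteTodo that talks to a server or dispatches an action, and the list itself would not need to change.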

This may seem like a lot of work, and it does take more effort than simply allowing components to perform direct mutations, but you are rewarded with components that are entirely isolated from the ecosystems they live in, and your single source of truth remains safely intact.

Smart Components

Smart components are where data enters your application. These components might be things like views that include a bunch of smaller dumb components. They often don't include much or any DOM themselves, as their primary goal is to consume data and forward it (as well as actions) down into components.
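A rough sketch of that division of labor follows. Everything here is hypothetical: todoSource stands in for wherever the data lives (in Flux, a store), and both components are plain functions rather than real React components:

```javascript
// Hypothetical data source; in Flux this role is played by a store.
const todoSource = {
  _todos   : [{ id : 1, title : 'Buy milk' }],
  getTodos () { return this._todos; },
  remove   (id) { this._todos = this._todos.filter(todo => todo.id !== id); }
};

// Dumb component: knows nothing about where the data comes from.
function TodoList (props) {
  return props.todos.map(todo => todo.title);
}

// Smart component: reads the data and forwards it, with handlers, downward.
function TodoView (source) {
  return TodoList({
    todos      : source.getTodos(),
    deleteTodo : id => source.remove(id)
  });
}

console.log(TodoView(todoSource)); // [ 'Buy milk' ]
```

Note that TodoView produces no output of its own; it only wires data into the component below it.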

This jumps a bit ahead into more of a Flux implementation, but if you're like me, you may wonder how smart components fit into larger application architectures. Say you have two entirely disparate views that need entirely different sets of data. That's a question I encountered, and one which the author of Redux was kind enough to clarify. Basically, if your "smart" component is responsible for loading data into a store, it should also be responsible for removing it. I know, I know, we haven't even talked about Flux yet, but this will all make sense!

Flux

So unidirectional data flow and smart/dumb components sound like a great idea, but how do they fit together? What kind of architecture surrounds them? Enter Flux.

Flux boils down to a pretty basic concept: data flows in one direction through your application, with data and the actions that change it being centralized in stores. Components react to changes in stores, but it's important to note that they react based on the new state that's produced, not the events themselves.

If you take this description further, it can be extrapolated to state that your core application state should be able to be represented by a single "input" at any point in time. And, similar to components, an application is just one big function that encompasses a bunch of smaller functions. You take some state representation, run it through your application, and out pops a result!

Stores

Stores are, as the name implies, where data is stored. It's also where any changes to that data take place, which keeps everything centralized. Stores were designed to solve the problem where multiple components rely on the same data, but potentially have inaccurate or out-of-date representations of that data. When data can be transformed in more than one spot, you open yourself up to race conditions or other nasty bugs, and you lose that "single source of truth". Stores keep all the business logic right next to the actual data, which means that as long as all components draw from that store (which they should), they will all remain in sync with each other.

Importantly, stores are entirely synchronous. They receive actions and perform some internal operation in response to those actions; that's it. This means that asynchronous events simply dispatch actions upon success/error, so the store doesn't have to know about, or keep track of, outstanding requests. Once you start tracking that kind of pending state within a store, it becomes impure - it could look the same on the outside, but whatever's going on internally could influence its output.
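As a hedged illustration, a synchronous store could look something like the sketch below. createTodoStore and its methods are invented names, not part of any particular Flux library:

```javascript
// Hypothetical minimal store: holds data, changes it synchronously in
// response to actions, and notifies listeners afterwards.
function createTodoStore (initialTodos) {
  let todos = initialTodos;
  const listeners = [];

  return {
    getTodos  : () => todos,
    subscribe : listener => listeners.push(listener),
    // No requests or timers in here; the action already carries the result.
    handle (action) {
      if (action.type === 'TODO_DELETE') {
        todos = todos.filter(todo => todo.id !== action.payload.id);
        listeners.forEach(listener => listener());
      }
    }
  };
}

const store = createTodoStore([{ id : 5 }, { id : 6 }]);
store.subscribe(() => console.log('todos left:', store.getTodos().length));
store.handle({ type : 'TODO_DELETE', payload : { id : 5 } }); // logs "todos left: 1"
```

By the time handle returns, the store is already in its new state, so anything reading from it sees a consistent picture.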

Actions

So you have a centralized store, awesome, but how do you change what's inside of it? After all, that's the whole point of an application, doing something. This is what actions (and the dispatcher, as you'll see) are for. An action, in its simplest form, is an object describing the type of action (normally as a constant) and any data relevant to the action. It might look something like this:

import { TODO_DELETE } from 'constants/todo';

const deleteTodoAction = {
  type : TODO_DELETE,
  payload : {
    id : 5
  }
};

All actions will flow through a store, whether it cares to do anything with them or not. It decides what actions to use based on their type, so a Todo store might want to know about all todo-related actions, e.g. TODO_DELETE, TODO_CREATE, TODO_TOGGLE_COMPLETE. These actions are a way to communicate with top-level stores from anywhere within the component hierarchy. Actions, which again are just simple objects, can be cumbersome to create, so there are normally action creators to help with this. Here's one that helps with that TODO_DELETE action from above.

import { TODO_DELETE } from 'constants/todo';

export function deleteTodo (id) {
  return {
    type : TODO_DELETE,
    payload : {
      id // es6, remember!
    }
  };
}

Notice how this abstracts away all of the actual creating and formatting of the action object, inserting the data that's passed in as arguments into the payload. Now that we know how to create actions, let's see how a store might handle one:

Dispatcher.register(function (action) {
  const { type, payload } = action;

  switch (type) {
    case TODO_DELETE:
      this._todos = this._todos.filter(todo => todo.id !== payload.id);
      this.emitChange();
      break;
  }
}.bind(this));

Once the todos have been updated with the deletion, the change event will notify whichever component is listening to the store. That component will then, likely, re-render based on the new todos list, which allows all child components to automatically react to this change, without having to know what changed or how. Now, you may be wondering, how does the store know about this action? And herein lies a critical point.

In the original Flux implementation, action creators (that deleteTodo function from earlier) would automatically call the dispatcher. So instead of just returning a plain object, it would look something like this:

import { TODO_DELETE } from 'constants/todo';
import AppDispatcher from 'dispatchers/app';

export function deleteTodo (id) {
  AppDispatcher.dispatch({
    type : TODO_DELETE,
    payload : {
      id // es6, remember!
    }
  });
}

Many flavors of Flux have moved away from this, since it couples your action creators directly to a specific dispatcher. By just returning plain action objects, they can be dispatched however you want.
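A small sketch of what that decoupling buys you. The two dispatch functions below are hypothetical stand-ins (for example, one for the real application and one for a test), and the constant's string value is assumed:

```javascript
const TODO_DELETE = 'TODO_DELETE'; // assumed constant value

// Action creator that only builds the action object.
function deleteTodo (id) {
  return { type : TODO_DELETE, payload : { id } };
}

// Two hypothetical dispatch functions; the creator works with either.
const log = [];
const appDispatch = action => log.push(['app', action.type, action.payload.id]);
const testDispatch = action => log.push(['test', action.type, action.payload.id]);

appDispatch(deleteTodo(5));
testDispatch(deleteTodo(7));

console.log(log.length); // 2
```

The action creator never needs to change when the dispatching mechanism does.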

Dispatcher

The dispatcher is essentially a central routing system that forwards actions on to stores. But wait, why don't the actions just communicate directly with the store? Wouldn't that be easier? Great question, lad (or ladette)!

That would couple the action way too tightly to the store. And what happens when you want to communicate with multiple stores? By keeping them separated, you can dispatch actions freely without having to be aware of the other half of the implementation.
Using a dispatcher allows stores to declare dependencies on other stores, so actions can be dispatched in a specific order.
Funneling all actions through a single point allows for central logging, debugging, and centralizes events that affect application state. Without a central system, there's no easy way to reliably track these actions.
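The points above can be sketched with a toy dispatcher. This is not the API of Facebook's Dispatcher; createDispatcher and its history array are assumptions made to show the single-entry-point idea:

```javascript
// Hypothetical dispatcher: every action passes through dispatch(), so
// logging lives in exactly one place.
function createDispatcher () {
  const callbacks = [];
  const history = [];

  return {
    history,
    register (callback) { callbacks.push(callback); },
    dispatch (action) {
      history.push(action.type); // central log of everything dispatched
      callbacks.forEach(callback => callback(action));
    }
  };
}

const dispatcher = createDispatcher();
const seenByTodos = [];
const seenByStats = [];

// Two independent stores receive the same action.
dispatcher.register(action => seenByTodos.push(action.type));
dispatcher.register(action => seenByStats.push(action.type));

dispatcher.dispatch({ type : 'TODO_DELETE', payload : { id : 5 } });
console.log(dispatcher.history); // [ 'TODO_DELETE' ]
```

The code that dispatches the action has no idea how many stores exist or which of them care about it.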

Immutability

Let's approach this from the perspective of the canonical Todo application. Think of how it's generally been written in the past, where your collection of todos might start off as an empty array that gets pushed/popped over time.

const todos = [];

// let's add a new Todo item
todos.push(new Todo());

What could be wrong with an example so simple? Well, for one, think about what would happen if you were to provide this list of todos to other components. How would they know something changed? There'd have to be a deep equality check on the object, since the object reference is the same. Additionally, how would the application know when to do this check? Flux solves this by having stores emit changes. Angular, on the other hand, eschews a global store and attempts to solve this exact problem with its dirty checking and digest cycle, allowing data to change anywhere at any point in time. Ever wonder why Angular 2.0 differs so wildly from its predecessor?

But there's an even better way.

The important thing to notice with the above snippet is that we're mutating todos. We maintain the same object, but it just changes a little bit. As a result, we've lost the ability to represent the old state and new state as separate entities at the same time, because the old one no longer exists. In order to traverse back through time, you'd have to remember which mutations occurred and when. You'd have to reverse actions, popping instead of pushing. Wouldn't it be easier to just represent distinct states as, well, distinct states (separate objects)? Enter immutability.
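Here is a minimal sketch of the non-mutating alternative, using only built-in array methods (plain objects stand in for Todo instances):

```javascript
const before = [{ id : 1 }];

// Instead of before.push(...), build a new array that contains the new item.
const after = before.concat([{ id : 2 }]);

console.log(before.length);          // 1, the old state still exists
console.log(after.length);           // 2
console.log(before === after);       // false, so the change is visible by reference
console.log(before[0] === after[0]); // true, unchanged items are shared
```

Both states now exist side by side, and detecting the change takes a single reference comparison rather than a deep equality check.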

Immutability also offers performance benefits. Libraries such as ImmutableJS offer sophisticated algorithms that try to optimize the creation of new objects, by maintaining unchanged references and replacing those that need to be swapped. React's PureRenderMixin can take advantage of immutability and pure render functions: when a component's state and props haven't changed, a cheap reference comparison is enough to skip re-rendering, eliminating the need to perform complex diffs of the virtual DOM.
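To show why immutable data makes this check cheap, here is a hand-written shallow comparison. It is only a sketch of the general technique; shallowEqual is written from scratch here and is not React's own code:

```javascript
// Hypothetical shallow comparison: checks only top-level references.
function shallowEqual (a, b) {
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every(key =>
    Object.prototype.hasOwnProperty.call(b, key) && a[key] === b[key]
  );
}

const todos = [{ id : 1 }];
const prevProps = { todos };
const sameProps = { todos };                              // same reference inside
const nextProps = { todos : todos.concat([{ id : 2 }]) }; // new reference

console.log(shallowEqual(prevProps, sameProps)); // true, re-render can be skipped
console.log(shallowEqual(prevProps, nextProps)); // false, re-render needed
```

This only works reliably when data is never mutated in place; a pushed-to array would keep the same reference and the change would go unnoticed.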

Benefits

Hot Reloading

Functional purity is what allows react-hot-loader to work. If a component's render function were non-deterministic (impure), the entire component would need to be reloaded after a change. However, since that's not the case, the previous state and properties can continue to exist as they were while the component's methods/render function are simply swapped out for their new versions.

Centralization

The funneling of actions through a central dispatcher means that it's easy to add middleware between your actions and the stores. This opens up the possibility to implement features such as centralized logging and debugging.

Time Travel

If you use immutable data structures, you gain, essentially for free, the ability to step through time by replaying actions or stepping through previous states.
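A bare-bones sketch of the idea, assuming state is a plain array that is never mutated (the history array and addTodo helper are invented for this example):

```javascript
// Hypothetical history of immutable states; each entry is a separate array.
const history = [[]];

function addTodo (title) {
  const current = history[history.length - 1];
  history.push(current.concat([title])); // never mutate "current"
}

addTodo('Buy milk');
addTodo('Walk dog');

// "Travelling" is just picking an index.
console.log(history[1]); // [ 'Buy milk' ]
console.log(history[2]); // [ 'Buy milk', 'Walk dog' ]
console.log(history[0]); // []
```

Since earlier states are never touched, going back requires no undo logic at all.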

Async

So we've now talked about how the dispatcher, actions, and stores all fit together. The discussion on unidirectional data flow discussed how awesome it is to be able to eliminate time from the equation, or at least limit its effect on the application. But we're well past the AJAX revolution, and web apps need to be able to work with asynchronous events. How does that fit into the Flux architecture?

Well, it simply means that you dispatch the actions when whatever asynchronous event completes! Make a request for the data you need, and once the response comes back the rest of the process remains just like it would be in a synchronous application. This keeps your stores simple, and you no longer have to worry about store data being potentially out of date; removing time from the equation really helps!

In traditional Flux, this might look something like:

function deleteTodoAsync (id) {
  someAwesomeAjaxCall(id)
    .then(resp => AppDispatcher.dispatch({
      type : TODO_DELETE,
      payload : {
        newTodos : resp
      }
    }));
}

And, while we haven't covered it yet, let me give you an idea of how this is done in Redux. In Redux, actions don't dispatch themselves, but for asynchronous actions they can return thunks to make the experience more pleasant.

function deleteTodoAsync (id) {

  // "callback" is generally written as "dispatch" in Redux,
  // but I kept it this way for clarity.
  return function deleteTodoAsyncThunk (callback) {
    someAwesomeAjaxCall(id)
      .then(resp => callback({
        type : TODO_DELETE,
        payload : {
          newTodos : resp
        }
      }));
  };
}

This means that you can pass the dispatcher function in as the callback and your action will be synchronously dispatched once the request completes. There are other solutions to this, especially with Redux which offers the ability to use middleware (such as to support actions that are promises), but this is just one example.
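To see how a thunk and a dispatch function might meet, here is a hedged sketch. createDispatch is a made-up helper rather than Redux's actual API, and fakeRequest completes immediately (a real request would complete later) so the example can run without a network:

```javascript
// Hypothetical dispatch wrapper: plain actions go straight to the handler,
// functions (thunks) are called with dispatch so they can dispatch later.
function createDispatch (handler) {
  return function dispatch (action) {
    if (typeof action === 'function') {
      return action(dispatch);
    }
    return handler(action);
  };
}

const received = [];
const dispatch = createDispatch(action => received.push(action.type));

// Stand-in for an asynchronous request; "done" is invoked when it completes.
function fakeRequest (done) {
  done([{ id : 1 }]);
}

function loadTodos () {
  return callback => fakeRequest(todos => callback({
    type    : 'TODOS_LOADED',
    payload : { todos }
  }));
}

dispatch(loadTodos());
console.log(received); // [ 'TODOS_LOADED' ]
```

The handler (the store side) only ever sees the finished, plain action; it never learns that a request was involved.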

The point is, always keep your stores synchronous. It solves a lot of problems with testability and mocking, and allows you to access stores without worrying about whether they're pending some additional action or not. The state you get from a store is the canonical state.

Flavors

Redux

Recommended Reading

Redux Documentation

Recommended Viewing

Dan Abramov - Live React - Links directly to the 11-minute timestamp where Dan Abramov discusses how traditional Flux stores can be simplified as reducers. Basically, he more elegantly describes everything I'm about to try to explain.

Redux, or "reducers" + "flux", takes the original principles of flux but applies them in an even more purely functional manner. Traditional flux stores are responsible for emitting changes that components can listen to, and this allows state to be mutable. It also means that there is no singular, cohesive application state, but instead multiple top-level stores. The distinction might sound trivial, but it's not, and here's why.

First, let's recall the function signature for a reducer. If you're not already familiar with it, here's a refresher:

// (acc, val) -> acc
[1,2,3].reduce(function (acc, n) {
  return acc + n;
}); // 6

The reduce function takes a collection and runs through its items sequentially, "accumulating" a result as it does. The first argument to the callback function is the accumulator, the second is the next item. So what if we approached applications in the same way? The application could start with some initial state, and you get a new state as actions are accumulated.

[createTodo, deleteTodo, toggleCompleteTodo].reduce(function (state, action) {
  // your application!
}, initialState);

So now your stores have essentially become reducers; they receive actions and return a new state. And since they're pure, you can compose them together into one core application reducer. Actions come in, state comes out. The important thing to note here is that reducers must remain pure - that is, without side effects - and they must return a new state, they cannot mutate the old one.
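As a sketch of what such a reducer might look like for the todo example, the following uses only built-in array methods. The action types and payload shapes are assumptions carried over from the earlier examples:

```javascript
const TODO_CREATE = 'TODO_CREATE';
const TODO_DELETE = 'TODO_DELETE';

// Pure reducer: (state, action) -> new state, with no mutation.
function todos (state, action) {
  switch (action.type) {
    case TODO_CREATE:
      return state.concat([action.payload]);
    case TODO_DELETE:
      return state.filter(todo => todo.id !== action.payload.id);
    default:
      return state;
  }
}

const actions = [
  { type : TODO_CREATE, payload : { id : 1, title : 'Buy milk' } },
  { type : TODO_CREATE, payload : { id : 2, title : 'Walk dog' } },
  { type : TODO_DELETE, payload : { id : 1 } }
];

const initialState = [];
const finalState = actions.reduce(todos, initialState);

console.log(finalState);          // [ { id: 2, title: 'Walk dog' } ]
console.log(initialState.length); // 0, the starting state was not touched
```

Replaying the same list of actions against the same initial state always produces the same final state.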

By following these two simple rules, you not only reduce immediate complexity but you allow other things to occur naturally within the codebase. Redux can naturally notice state changes, not by deep equality checks but just by new references, and automatically update subscribed components. No more emitChange()!

Redux provides additional benefits; for one, since state is immutable, it's very easy to record new states as they appear over time. This is the basis for redux-devtools, allowing you to quite literally step forward and backward through time. It also means it's possible to test an application simply by feeding it a collection of actions and analyzing the result; no extra effort required.

Summary

By representing state in a centralized location and simply reacting to it, you gain the ability to write declarative and easily testable components. Maintaining synchronous stores removes time from the equation and in doing so eliminates the possibility for components relying on the same data to become out of sync with each other.

It should be your goal to make components as declarative as possible, and represented chiefly by the properties that are provided to them.
