Why developers should be force-fed state machines

Willem van Bergen

This post is meant to raise awareness of state machines among web application developers. If you don’t know what state machines are, please read up on them first. Wikipedia is a good place to start, as always.

State machines are awesome

The main reason for using state machines is to help the design process. It is much easier to figure out all the possible edge conditions by drawing out the state machine on paper. This helps your application end up with fewer bugs and less undefined behavior. It also clearly defines which parts of the internal state of your object are exposed as external API.

Moreover, state machines have decades of math and CS research behind them on how to analyze them, simplify them, and much more. Once you realize that in management state machines are called business processes, you'll find a wealth of information and tools at your disposal.

Recognizing the state machine pattern

Most web applications contain several examples of state machines, including accounts and subscriptions, invoices, orders, blog posts, and many more. The problem is that you might not necessarily think of them as state machines while designing your application. Therefore, it is good to have some indicators to recognize them early on. The easiest way is to look at your data model:

  • Adding a state or status field to your model is the most obvious sign of a state machine.
  • Boolean fields like published or paid are usually also a good indication. Timestamps that can have a NULL value, like published_at and paid_at, are a usable sign as well.
  • Finally, records that are only valid for a given period in time, like subscriptions with a start and end date, are another indicator.

When you decide that a state machine is the way to go for your problem at hand, there are many tools available to help you implement it. For Ruby on Rails, we have the excellent gem state_machine which should cover virtually all of your state machine needs.
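To make the pattern concrete, here is a minimal sketch in plain Ruby of what a published flag turns into once it is modelled as an explicit state machine. It deliberately does not use the state_machine gem; the BlogPost class, its states, its events, and the can_fire? and fire methods are all assumptions invented for this illustration.

```ruby
# Hypothetical blog post model; plain Ruby, no gem or database involved.
class BlogPost
  # Transition table: event => { state it may fire from => state it leads to }
  TRANSITIONS = {
    submit:    { draft: :in_review },
    publish:   { in_review: :published },
    unpublish: { published: :draft }
  }.freeze

  attr_reader :state

  def initialize
    @state = :draft
  end

  # True when the event is allowed in the current state.
  def can_fire?(event)
    TRANSITIONS.fetch(event, {}).key?(@state)
  end

  # Performs the transition, or raises if it is not allowed from here.
  def fire(event)
    raise ArgumentError, "cannot #{event} from #{@state}" unless can_fire?(event)
    @state = TRANSITIONS[event][@state]
  end
end

post = BlogPost.new
post.fire(:submit)
post.fire(:publish)
puts post.state
```

Keeping the transitions in one table means every edge you drew on paper shows up in exactly one place, and anything not in the table is rejected instead of silently allowed.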

Keeping the transition history

Now that you are using state machines for modelling, the next thing you will want to do is keep track of all the state transitions over time. When you are starting out, you may only be interested in the current state of an object, but at some point the transition history will be an invaluable source of information. It allows you to answer all kinds of questions, like: “How long on average does it take for an account to upgrade?”, “How long does it take to get a draft blog post published?”, or “Which invoices have been waiting the longest for an initial payment?”. In short, it gives you great insight into your users' behavior.

When your state machine is acyclic (i.e. it is not possible to return to a previous state) the simplest way to keep track of the transitions is to add a timestamp field for every possible state (e.g. confirmed_at, published_at, paid_at). Simply set these fields to the current time whenever a transition to the given state occurs.
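The timestamp-per-state approach could look roughly like the following sketch. The Invoice class, its three states, and the advance method are hypothetical, and the clock is passed in as a parameter only to keep the example easy to test.

```ruby
# Hypothetical invoice with an acyclic lifecycle: draft -> confirmed -> paid.
class Invoice
  NEXT_STATE = { draft: :confirmed, confirmed: :paid }.freeze

  attr_reader :state, :confirmed_at, :paid_at

  def initialize
    @state = :draft
  end

  # Moves to the next state and stamps the matching *_at field.
  # The injectable clock is an assumption of this sketch.
  def advance(now = Time.now)
    target = NEXT_STATE.fetch(@state) { raise ArgumentError, "#{@state} is final" }
    instance_variable_set("@#{target}_at", now)
    @state = target
  end
end

invoice = Invoice.new
invoice.advance
invoice.advance
# Seconds between confirmation and payment.
puts invoice.paid_at - invoice.confirmed_at
```

Because each state is entered at most once, one field per state is enough and nothing is ever overwritten.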

However, it is often possible to revisit the same state multiple times. In that case, simply adding fields to your model won’t do the trick because you will be overwriting them. Instead, add a log table in which all the state transitions will be logged. Fields that you probably want to include are the timestamp, the old state, the new state, and the event that caused the transition.
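As a rough sketch of such a log, the example below keeps the rows in an in-memory array instead of a database table. The Transition struct, the Account class, and the seconds_in helper are hypothetical names chosen for this illustration; this is not how the state_machine-audit_trail gem is implemented.

```ruby
# One row of the hypothetical log table.
Transition = Struct.new(:at, :from, :to, :event)

class Account
  TRANSITIONS = {
    upgrade:   { free: :paid },
    downgrade: { paid: :free }
  }.freeze

  attr_reader :state, :history

  def initialize
    @state = :free
    @history = []
  end

  # Logs the transition before applying it; raises if it is not allowed.
  def fire(event, now = Time.now)
    target = TRANSITIONS.fetch(event, {})[@state]
    raise ArgumentError, "cannot #{event} from #{@state}" if target.nil?
    @history << Transition.new(now, @state, target, event)
    @state = target
  end

  # Total seconds spent in `state` over all completed visits, measured from
  # the row that entered the state to the next row in the log. The initial
  # state and a visit that is still open are not counted.
  def seconds_in(state)
    @history.each_cons(2).sum do |entered, left|
      entered.to == state ? left.at - entered.at : 0
    end
  end
end
```

Since a state can be visited more than once, every visit gets its own row, and questions about durations become queries over consecutive rows.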

For Ruby and Rails, Jesse Storimer and I have developed the Ruby gem state_machine-audit_trail to track this history for you. It can be used in unison with the state_machine gem.

Deleting records?

In some cases, you may be tempted to delete state machine records from your database. However, you should never do this. For accountability and completeness of your history alone, it is a good practice to never delete records. Instead of removing it, add an error state for any reason you would have wanted to delete a record. A spam account? Don’t delete, set to the spam state. A fraudulent order? Don’t delete, set to the fraud state.

This allows you to keep track of these problems over time, like: how many accounts are spam, or how long it takes on average to see that an order is fraudulent.
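A minimal sketch of the idea, assuming a hypothetical UserAccount class with a spam error state; the class, the mark_as_spam method, and the example addresses are made up for illustration.

```ruby
# Hypothetical account model: records are flagged, never deleted.
class UserAccount
  attr_reader :email, :state, :flagged_at

  def initialize(email, created_at = Time.now)
    @email = email
    @state = :active
    @created_at = created_at
  end

  # Instead of deleting the record, move it to the :spam error state.
  def mark_as_spam(now = Time.now)
    raise ArgumentError, "already #{@state}" unless @state == :active
    @state = :spam
    @flagged_at = now
  end

  # Seconds between sign-up and being flagged, or nil if never flagged.
  def seconds_until_flagged
    @flagged_at && (@flagged_at - @created_at)
  end
end

accounts = [UserAccount.new("a@example.com"), UserAccount.new("b@example.com")]
accounts.first.mark_as_spam
# Number of spam accounts; the flagged record is still there to be counted.
puts accounts.count { |a| a.state == :spam }
```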

In conclusion

Hopefully, reading this text has made you more aware of state machines and you will be applying them more often when developing a web application. Disclaimer: like any technique, state machines can be overused. Developer discretion is advised.

Comments

Jake Gordon June 13, 2011 11:39AM EDT

Nice post.

State machines are great, although I have to admit, sometimes it takes me a while before I recognize that I should be using one… but once I re-factor it usually cleans up a lot of code and makes for a much simpler design.

If your web app is JavaScript-intensive, you can also use state machines on the client. I happened to build a small (and simple) JavaScript state machine library just last week for a game:

http://codeincomplete.com/posts/2011/6/1/javascript_state_machine/

Alex June 13, 2011 11:55AM EDT

Anybody else read the title of this post as meaning developers should be replaced with state machines which are force-fed, rather than meaning that human developers should study state machines…? :-P

Tom June 13, 2011 12:28PM EDT

Alex: yes!

Rachel 'Groby' Blum June 13, 2011 01:22PM EDT

It might be the education snob in me, but it reads like another reason why developers should have a CS degree. State machines are one of the basic tools in every dev’s toolkit, and I’m always amazed that there are (supposedly) people out there who don’t know about them.

That’s not to detract from the awesome work folks like you do to make state machines easier to use – they always feel like a whole truckload of boilerplate, so help using them is always appreciated.

will June 13, 2011 02:04PM EDT

Anybody else feel this didn’t do a good job of… anything?

Willem van Bergen June 13, 2011 02:12PM EDT

Rachel: Yes, I agree that state machine theory is one of the most important and useful parts of a CS education.

However, the problem is recognizing when you should be using a state machine, because it is very easy to miss opportunities to apply the technique. It’s easy to just add a “published_at” field to your blog post model. What I hope to achieve with this post is a trigger in a developer’s mind to consider using a state machine instead.

Willem van Bergen June 13, 2011 02:22PM EDT

Alex, Tom: I would like to see a state machine that accurately describes a developer. Maybe “Programming → Eating pizza → Drinking coke → Programming” with an error state of “Sleeping” thrown in? ;)

Willem van Bergen June 13, 2011 02:23PM EDT

Will: anything specific that you’re missing from the article?

Willem van Bergen June 13, 2011 02:27PM EDT

Jake Gordon: awesome work. We’re doing more and more with JavaScript at Shopify, so I’m sure that a good JavaScript state machine library will come in handy.

Luke vdH June 13, 2011 02:49PM EDT

Have you guys seen https://rubygems.org/gems/paper_trail? It’s not quite a state machine, but it captures a lot of that state-change tracking type info.

Kevin Sookocheff June 13, 2011 03:21PM EDT

Great stuff Willem, I’ve used state machines for parsing MP3 data to go from a huge case statement to a handful of distinct states. So satisfying!

Rein Henrichs June 13, 2011 03:36PM EDT

While I agree that an understanding of state machines is important, the state machine described here is really a bastardization of the computer scientific and mathematical definition of a state machine (FSM). State machines aren’t just used to describe the current state of your blog post.

Automata theory has uses ranging from formal languages to communication protocol design to hardware design. Perhaps a better place to start when explaining state machines would be their use in deciding regular languages in the Chomsky hierarchy (and therefore how they are related to regular expressions). By reducing the concept to a way to describe the state of a blog post, you are (imo) missing out on most of its beauty and elegance. An insufficiently deep explanation also tends to turn profound concepts into ready-made crystal hammers and invites cargo-culting.

Ryan Oberholzer June 13, 2011 05:16PM EDT

State machines are awesome. I wrote one a while ago – http://rubygems.org/gems/stateflow – We use it extensively in all our projects.

Check it out, and let me know what you think! :)

Willem van Bergen June 13, 2011 07:14PM EDT

Rein Henrichs: You’re absolutely right that this post doesn’t cover the topic of state machines completely; I point to Wikipedia for a more complete description.

What I am trying to accomplish is that developers learn how to recognize the state machine pattern in their apps so they can apply it. Talking about regular expressions or network protocols will make it seem that state machines are not applicable when designing something like blog posts, which is the exact opposite of what I want to achieve. This is why I left out most of the complex stuff, even though it is what makes the theory so elegant.

Pete June 13, 2011 09:40PM EDT

I agree with Rachel – state machines are one of the things you learn about in school; too many self-taught folks never learn about them until too late.
Personally, I see FSMs more places than not – it’s harder to NOT use one than the reverse.

Jim Roepcke June 13, 2011 11:26PM EDT

Check out webmachine by Basho (makers of Riak). It’s a web framework that’s literally modelled on HTTP’s state machine.

http://webmachine.basho.com/

It was ported to node.js (nodemachine), and that works great too.

The GoF’s State Pattern is a great way to introduce state machines into classic OO software.

oldprogrammer June 13, 2011 11:34PM EDT

Jesus, I am old. Have any of you diaper wearing “youngsters” ever heard of Knuth? Anyway noobs, PLEASE read http://en.wikipedia.org/wiki/The_Art_of_Computer_Programming before you further degrade our already beat down industry rep.

Pete Forde June 13, 2011 11:48PM EDT

Thanks for the great article, Willem! I’m a very minor contributor to the state_machine gem, and I am pleased as punch to see you building upon it.

There’s a lot of trolls commenting on this post, and I’m honestly curious what you’d have a single post do? It would seem like you’d prefer the content body to read “I was going to write a nice summary post about how to get started using state machines in your code, but I decided to insult you into reading an entire fucking book in the most patronizing way possible with no further direction given. KTHXBYE!”

I’m a self-taught developer that learned to love state machines not a moment too soon, but realistically after about a decade of professional development. It’s not that I wasn’t using them before — it’s hard not to. I just didn’t recognize them as a general abstraction. Now I start designing my state machines in the early concept stage of my projects.

At the same time, I’m not planning to apologize for not using them sooner, nor do I feel like going to get a CS degree would have been the best course of action for me given the arc of my career. Different paths to happiness for different folks! All of you haters should chill out.

kodeninja June 14, 2011 04:12AM EDT

Great blog.

On a related note, I think it would be invaluable for noobs if someone could create a screencast of how he took an existing piece of code (something non-trivial, but not too big either) that didn’t use FSM but was a possible candidate, refactored it using FSM concepts and how the end result improved the overall design.

Just an idea.

Oliver June 14, 2011 08:06AM EDT

+1 Pete. If you’re the kind of person who thinks the best introduction to state machines is “regular languages in the Chomsky hierarchy”, then you are clearly NOT the target audience of this post. While the CS applications might be elegant and fascinating, they’re not for everyone. You don’t get a child interested in engineering by explaining the equations that keep forces in balance on a skyscraper; you give them a tub of Duplo, let them build their own skyscraper and watch it fall over. Then they wonder “why did that happen?” “What can I do to stop it?” “That works, but it uses too many blocks. How can I do this better?” The next time they need to stop something falling over, they’ll be ready.

If you’re the type that gets turned on by CS theory, you’ll have your own way of doing things. I’m studying software engineering, have been coding a few years, and am absolutely in love with it, but CS feels too mystical sometimes. The most approachable way to teach me a new concept is to make it instantly applicable to the kind of things I work on, and allow me to scale out my knowledge on more advanced topics that interest me. If you can teach me how to see a state machine pattern, that will be more valuable than any Chomsky hierarchy in making me use it. Eventually I’ll rub up against Chomsky in a more advanced blog post.

Interestingly, one of the subjects I’m studying introduced state modelling a couple of months ago, and I have an exam on it in 36 hours. Wasting time surfing the Web and reading friendly articles is helping me study! I’ve read GoF, but the state pattern struck me as something that kinda always fitted into a design, rather than the core of the design itself. This subject opened my eyes a little more.

We all need different perspectives, so thank you Willem.

Oliver Ponder June 14, 2011 04:35PM EDT

Those of you intrigued by the concept and looking to learn more: http://video.google.com/videoplay?docid=-5837841629284334824&hl=en#

This vid from a CS lecture from Shai Simonson really helped things click in my mind.

brandonc June 15, 2011 01:13PM EDT

Feeling inspired by your post, I wrote a generic state machine class in C#:

https://github.com/brandonc/statemachine

Rein Henrichs June 15, 2011 03:39PM EDT

My point, which seems to have gotten lost, is that a blog post could be in is really not a state machine. A state machine is a very specific computer scientific concept and this post does very little to actually introduce or explain it.

This is a lot like saying that atoms are the fundamental building blocks of matter, and then going on to describe the atomic theory of the ancient Greeks (Democritus, et al.). Sure, you get across the most basic concept of the “atom”, but you really don’t explain that much about atomic theory.

If this blog post does make people interested in the concept and that then causes them to learn it more deeply, that would make me happy. If it leads people to think that “state machine == list of states”, that makes me sad.

In any event, I don’t think that saying this makes me a troll. I think calling someone a troll for rationally expressing a differing point of view is by far a worse offense.

Rein Henrichs June 15, 2011 03:40PM EDT

Sorry, I somehow deleted some important words while editing. I meant to say, “My point, which seems to have gotten lost, is that a list of states that a blog post could be in is really not a state machine”.

umlcat June 16, 2011 01:21PM EDT

I read about state machines in my compiler classes, and found that they could be used to solve other problems. But it’s true, they are a design pattern that many developers aren’t aware of.

VolkerG September 01, 2011 11:26AM EDT

I love state machines, they always make code clearer, more stable and even simpler to debug.
For most cases (simple state machines) you need an enumeration (in C: enum {sIDLE, sHUNT, sPAYED, …} my_states; I start all my states with a lowercase ‘s’) and a switch() case: construct and you are almost there.
If you put it in a function, you can log entry and exit conditions, which makes debugging easy.

Mike Holly September 01, 2011 02:21PM EDT

That state_machine gem looks rad. Thanks for the tip!

samwyse September 02, 2011 07:32AM EDT

Found this article via http://www.skorks.com/2011/09/why-developers-never-use-state-machines/ and wanted to add a quick testimonial: State machines are great! Just this week, I needed to write a command line tool to split out subsections from large documents. Once the Python grew to a particular size, it got buggy and I couldn’t easily see why. I grabbed a sheet of paper, drew my state diagram, rearranged my code, and it worked on the first run.

John Haugeland September 02, 2011 12:29PM EDT

This is basically a bunch of cheerleading for Ruby’s state machine gem. There’s no technical content here; if someone already knows what a state machine is, you’re preaching to the choir, and if they don’t, they sure aren’t going to learn from this.

Will’s right: this article didn’t do a good job of anything.

John Haugeland September 02, 2011 12:31PM EDT

Rein: the reason they’re calling you a troll is that this blog post has promised them a magic bean, and in trying to bring realism to the excitement, instead of seeing you as the guy who’s going to save them a bunch of wasted work and frustration, they see you as the troll who’s trying to destroy their magic beans.

Steve Caine September 02, 2011 02:05PM EDT

And for those toiling in the iOS vineyard:

http://matt-greer.com/blog/2009/05/state-machines-and-objective-c/

Bill Kress September 06, 2011 11:48AM EDT

A while ago I wrote what is essentially a state machine DSL in Java. It’s more OO than most and allows you to translate from your paper state transition table more-or-less straight into code (The data is defined in the format of a transition table). You may optionally add states through code, but the table ends up being pretty easy to maintain and I don’t think it costs anything in terms of performance.

The problem with state machine “Libraries” is that it’s pretty tough to really centralize all the redundant parts of a state machine, you are generally better off hand coding them.

This was just for fun but it works and might inspire anyone trying to code a FSM without all the boilerplate/redundant code. It may seem a little strange at first, but I think the model is good.

http://code.google.com/p/state-machine/

Anonymous coward September 22, 2011 03:06PM EDT

Suppose you have a light sensor, watching the ambient light, and a relay which turns light on and off, with a delay of let’s say one hour. These two work together.

The states for this state machine would be darkOutsideLightOff, darkOutsideLightOn, lightOutsideLightOn, lightOutsideLightOff. The state would also be accompanied by a timestamp recording the last state change, so the light switch relay can know when to act.

How is this simpler to implement than a system using two booleans – darkOutside and lightOn?

I have used state machines in many LOB apps, I can’t think of how a compiler could be implemented without one, and in general I agree that they are highly useful – for the right type of problems.

But using state machines in cases where the states don’t have an immediate, primary natural meaning from a business point of view (like above – the states are actually predicates on two domain-specific variables), or when there is a simpler alternate representation of the state, is IMO a mistake.

IMO, the state machine approach lends itself well to solving problems which you are most comfortable with modelling from the start as state diagrams. If such a model seems artificial and unnatural for your problem, probably a state machine isn’t the right solution.
