State machines part ten
I finally had an occasion to roll my own state machine earlier this month.
My problem was to integrate a little piece of our application with a foreign web application. Jon Udell taught me to look at all web applications as web services in Practical Internet Groupware. The short story is I needed to scrape some lightly-tag-soup-style HTML. It's a parsing problem. Bob Martin's example of using a state machine to parse code comments came to mind. HTTP::Parser gets me part of the way. It'll scrape the angle brackets and give me a stream of four specific events: start-tag, end-tag, text, and comment. If the other app were using validated XHTML I might be able to do this more simply with XPath. But these apps don't live in the ideal web-services bubble. They live on today's web, warts and all.
<FONT FACE="Arial"> <H3>FLINTSTONE , FRED</H3> </FONT> <DIR><DIR> <TABLE BORDER=0 CELLPADDING=0> <TR> <TD ALIGN="LEFT" VALIGN="TOP"><STRONG>Business Address</STRONG></TD> <TD> </TD> <TD>12 W SLATE QUARRY RD<BR>BRONTOSAURUS 3<BR>BEDROCK, PG 00004</TD> </TR> <TR> <TD><STRONG>Phone Number</STRONG></TD> <TD> </TD> <TD>(888) 555-1212</TD> </TR> ...
While I was thinking through the states needed to digest this html, I remembered a couple other useful ideas. Bob's state machine compiler lets you work on your state machine through the more concise structure of a state transition table. You let his compiler transform the table into the moving parts of a state machine. Then I remembered the beautiful example where Paul Graham transforms a list of data and a set of functions into a network of closures which can be directly executed for the same effect. [On Lisp pp. 76-81]
I ended up with something not quite as elegant as Paul's example, but more satisfying than what I could have done in Java. My state transition table is a perl hash of hashes where the values of the inner hashes are closures. Most of the magic is actually triggered by the _notify() function which uses the current state and the current event to look up the appropriate closure to execute. Here's most of the code:
my($_STATE_TABLE) = {
begin => {
start => sub {_change_state(@_, 'h3', 'save_name')},
},
save_name => {
text => sub {_save($_[0], {_parse_name(@_)})},
end => sub {_change_state(@_, undef, 'street1')},
},
street1 => {
text => sub {_change_state(@_, 'Business Address',
'empty_address_cell')},
},
empty_address_cell => {
start => sub {_change_state(@_, 'td', 'skip_empty_address_cell')},
},
skip_empty_address_cell => {
start => sub {_change_state(@_, 'td', 'save_address')},
},
save_address => {
# /(skip_)?empty_address_cell/ get us inside the 'td' that holds the
# address. The only start tag we should see is 'br' which will cause
# address chunks to be sent in separate text events depending on the
# number of 'br' tags. Save whatever part of the address we can match
# until we hit the closing 'td' tag.
text => sub {
# if we can find "city, state zip", then save each of them
if ($_[1] =~ /^([^,]+)?,\s+([A-Z]{2})\s+(\d{5})?$/) {
_save($_[0], {map(defined($_->[1]) && $_->[1]
? ('Address.'.$_->[0] => $_->[1])
: (),
[city => ucfirst(lc($1))],
[state => $2],
[zip => $3])});
}
# otherwise, we're either in street1 or street2
else {
_save($_[0], {(!exists($_[0]->[$_IDI]->{'Address.street1'})
? 'Address.street1'
: 'Address.street2'), $_[1]});
}
},
end => sub {_change_state(@_, undef, 'phone')},
},
phone => {
text => sub {_change_state(@_, 'Phone Number', 'save_phone')},
},
save_phone => {
text => sub {_save($_[0], {'Phone_1.phone', Bivio::Type::Phone
->from_literal($_[1])})
unless _is_empty(@_)},
end => sub {_change_state(@_, 'tr', 'end')},
},
end => {},
};
# _change_state(self, string actual, string expect, string state)
# change to 'state' unless 'actual' ne 'expect', or change to 'state'
# unconditionally if 'expect' is undef
sub _change_state {
my($self, $actual, $expect, $state) = @_;
return if defined($expect) && $expect ne $actual;
my($fields) = $self->[$_IDI];
$fields->{state} = $state;
return;
}
# _is_empty(self, string) : boolean
sub _is_empty {
my($self, $value) = @_;
return $value =~ /^(\s*| )$/;
}
# _notify(self, string event, string data)
# 'event' must be one of 'start', 'text', 'end', or 'comment'
# 'data' is the tag for 'start' and 'end', and the text for 'text'
# and the comment for 'comment'
# Looks up the next action from $_STATE_TABLE using event and data,
# and if the action exists, it is called with 'self' and 'data'.
# The action will either _change_state(), _save(), or ignore the
# event
sub _notify {
my($self, $event, $data) = @_;
my($fields) = $self->[$_IDI];
my($fn) = $_STATE_TABLE->{$fields->{'state'}}->{$event};
return $fn->($self, $data) if defined($fn);
}
The actual states and transitions are modeled on Bob's comment parser. The comment parser changes state when special characters are found. It starts in the OutsideCommentState. In that state a '/' will change to MaybeCommentState. There another '/' or '*' will change to InsideCommentState. Anything else would change back to OutsideComment. Or something to that effect -- don't have his book at hand as I write this. My html parser changes states when special tags or text are encountered.
Here's what this state machine does in English. The parser starts out in the 'begin' state. HTML::Parser sends start, end, text, and comment events which are forwarded to _notify(). In the 'begin' state all events are ignored until we run into the first <H3> start tag. That will trigger a change to the 'save_name' state. In the save_name state, the first block of text will be _save()'d as the name. Then the first end tag will change the state to 'street1'. In the 'street1' state we ignore everything until we see the text 'Business Address' which triggers a change to 'empty_address_cell'. 'street1', 'empty_address_cell', and 'skip_empty_address_cell' work together to skip past the table cell containing the and into the cell that actually contains the address. The 'save_address' state is the most complicated of the bunch and skipping that makes it somewhat less complicated. Addresses are complicated because in some examples we get a street1, city, state, and zip whereas in other cases we get street1, street2, city, state and zip. In a few cases we only get city, state, and zip and one case gives only the state. I won't go into the details about the regular expression for parsing the city, state, and zip. Nor will I explain the map(). Those are both worthy of discussion, but not central to the mechanics of the state machine. The 'phone' and 'save_phone' states work similarly to the 'begin' and 'save_name' states. 'phone' ignores all tags until it finds the text 'Phone Number' when it changes to 'save_phone'. The closing row tag changes to 'end'. 'end' ignores all remaining events. What the parser ultimately returns is a hash containing the name, address and phone number.
What I find most satisfying about this solution is the concise and declarative structure of the $_STATE_TABLE. There's some effort required to understand the mechanics of the notification and the start, text, and end events. But with that understood, the $_STATE_TABLE is considerably less cluttered than the pile of if .. elsif .. else blocks it could have been. The bulk of the state machine is also fits in a single window in emacs whereas in Java I would have a pile of objects in different files to express the same machine. I think it also provides a nice example of the kind of power and expressiveness you get from having anonymous functions and arrays and hashes built into the language. Python, Ruby, and Groovy offer similar expressive power, but the proof is left as an exercise for the reader. :-) For what it's worth, In Java, I think the Command pattern with a little Dependency Injection could be used to provide something similar to the anonymous functions.
This technique for parsing angle-bracketed data would be equally applicable with SAX and an XML data source.
State machines part nine
Around this time last year I started publishing a series of entries about state machines and workflows. The topic has recently resurfaced in a couple areas. There's clearly more that I want to say.
What interested me last year was all the various places I was seeing state machines and workflows. Looking over it a year later I notice a glaring omission: I wrote everything with the assumption that the reader knows what a state machine is and how to code one. Let me now drop that assumption and briefly describe state machines.
The Gang of Four explain that the state pattern allows an object "to alter its behavior when its internal state changes." [Design Patterns p.305] That's about as succinct as it gets. I personally needed a bit more explanation.
They offered two example applications of the state pattern: a TCP connection object and the different drawing tools in graphics applications like Photoshop or Illustrator. In Agile Software Development Bob Martin offered a couple other examples: a turnstile and a comment parser.
All of these examples have an object which moves through a finite collection of states, responding to a specific set of events. The behavior in response to the events is different depending on the current state of the object. The turnstile is the simplest of these examples to explain. I'm just paraphrasing Bob Martin's story here. That book is exceptional and his explanation of state machines is really worth your time.
The turnstile takes two events: you either drop a coin in it, or you try to pass through it. Call the events 'coin' and 'pass'. It also has only two states: locked and unlocked. When it's locked the 'coin' and 'pass' events invoke different behavior than when it's unlocked. When locked 'pass' will sound an alarm and 'coin' will change the state to unlocked. When unlocked 'coin' should trigger a refund of the coin and 'pass' should change the state to locked. Different behaviors depending on the state.
How this might look in Java:
public class Turnstile {
State state;
public Turnstile {
state = new Locked(this);
}
public void coin() {
state.coin();
}
public void pass {
state.pass();
}
}
public interface State {
public void coin();
public void pass();
}
public class Locked implements State {
private Turnstile t;
public Locked(Turnstile t) {
this.t = t;
}
public void coin() {
t.state = new Unlocked(t);
}
public void pass() {
throw new FreeloaderException("Come back here!");
}
}
public class Unlocked implements State {
private Turnstile t;
public Unlocked(Turnstile t) {
this.t = t;
}
public void coin() {
//refund coin
}
public void pass() {
t.state = new Locked(t);
}
}Bob also suggests using a state transition table to sketch out the different states and their events:
Turnstile state event result Locked coin set Unlocked Locked pass sound alarm Unlocked coin refund coin Unlocked pass set Locked
In his state machine compiler he offers a more formal syntax for a state transition table to define states and events and results. It can then be compiled into a pile of objects similar to what I sketched above. I've already exclaimed about how cool I think that trick is in State Machines and Agile Software Development. In that entry, you'll also find an example of the config file for the Turnstile state machine.
I won't cover the other examples in as much detail. The TCP connection has to respond to open, close, and acknowledge events differently depending on whether the connection is established, listening, or closed. The drawing tools have to respond to mouseup, mousedown, and mouse position events and do different things to the drawing depending on which tool is selected (or what state the palatte is in). The comment parser responds to a stream of characters and changes states depending on whether it is in an inline comment, a comment block or whathaveyou. Hopefully those few examples give a sense of how widely applicable the state pattern can be.
As a web developer, I'm particularly interested in how state machines can model web applications. You have GETs and POSTS and PUTs and DELETEs as the collection of events. The URL and cookies describe the current state. New requests change the state of the application for the user. You can use a state transition table to describe the navigation and flow through the application. If you're familiar with Struts, you can kinda squint your eyes and see a state transition table in the struts-config.xml file. Unfortunately, struts mixes the concerns of the navigation with actions that change the model. It's sorta like mashing the 'event' and 'result' columns together in the state transition table I showed above. It looks like WebWork2 does this better and like JSF will too. Of course these days I'm partial to bOP.
The Wikipedia has some useful, though more formal entries:
I work for bivio and we're looking for projects. This is a shameless plug for our development services. If you're subscribed, maybe you'll just take my word for it when I say "We're really good programmers, you should hire us." If you need more convincing, here the same thing with a few more words. ;-)
Update: contact us via eric at dobbse.net, info at bivio.biz, +1 303.417.0919, or use the comment form on this post.
We build custom applications to solve business problems. What we do best is manage the risks of software development. We've built a lot of software and know the risks well. We can help you assess them for your specific project and we can adapt our disciplined development practices to mitigate those risks.
For some projects, the main risk is a volatile business environment. The business needs will probably change while the project is under development. We ensure the project can roll with the changes with a variety of techniques. We start by getting something working fast and then adapting it with additional features. If you pay us to build your software, we'll make sure you get software you can actually use, even if your business needs change while we're programming. I think every application we've built is still in use.
Maybe you have a very public, high profile, project for your business. It has to be up 24/7. It's going to have customer financial data and absolutely must be secure. The people using the system aren't geeks so it's got to be usable by ordinary people. We have one of those sites too. It's bivio.com and it's the project that drove development of our application framework, bOP. Thousands of investment clubs trust us to keep track of their portfolios.
We're so confident of our ability to manage development risk we have even taken on that risk ourselves for some projects. We founded Assurance Systems, Inc. as a joint venture with a team of experienced email deliverability experts. The big risk was time-to-market. Coding started in September of 2002 and the service was launched in October 2002. If you face a limited window of opportunity, we can probably get your software into production in a month too.
We tend to build web applications with our own open source (as in LGPL) framework, and we tend to work in perl. None of these things are requirements however. We would be happy to help you work on other kinds of problems in other languages on other platforms if you prefer. We want to solve your business problem, not sell you a technology. We'd like to earn your trust with this business problem so you'll consider us for your next business problem.