thinair Boulder, Colorado. elevation 5400 feet.

FSM kitchen sink: workflows, pipelines, asynchrony, SOAs

State machines part eleven

These were my first rambling thoughts about workflow from June of 2002. (Maybe I should call this "part zero". :-) I haven't looked at Commons workflow nor Turbine's pipeline since then. The code has probably diverged from my UML diagrams.

What follows is a summary of what I've learned over the past couple of years of working on PlanetCAD's workflow and paying attention to other options.

I think there are several different things that get described as "workflow". I think of them as different use cases for a general workflow engine.

  1. There's the wizard interface that organizes a complex form into multiple steps (or pages, or screens, or whatever). This is what commons-workflow is primarily about.
  2. There's the business process automation where a purchase order must have quotes from three different suppliers, reviewed by someone in management, further reviewed by someone in purchasing, and finally the PO is cut and delivered to the winning supplier.
  3. Complex processing of data without human intervention. The stuff we're doing at PlanetCAD is automating the preparation, processing, and delivery of engineering data between manufacturers and their suppliers. I like to describe this as a distributed batch processing system. A sysadmin sets up the scripts that will be used for processing. When an end user sends a collection of engineering files to another user, the destination user's profile determines which of the scripts get used for processing. The scripts are run without human intervention by a collection of job servers. A workflow state machine manages the processing of the script.

As you might guess from the rest of the entries in this series, I now see applications for workflow everywhere.

Clearly, some workflow systems would be a natural combination of these things. I think that Jason's use case of the lifecycle of a book will probably be a combination of 2 and 3. In fact, we have a couple cases where our automated workflow pauses to wait for a human to approve what's been done so far. We started with 3 and are adding elements of 2 as they are requested. I'm not as interested in 1, but I've seen some discussion on the commons lists about using commons workflow for some of these other kinds of workflow, so I have been including it in my thinking.

The original audience for whom I wrote this knew what I meant by "Jason's use case." For the rest of you, Jason van Zyl described the process a book goes through in editing and production -- something important for tambora.


Commons Workflow (figure 1):

An Activity (or wizard) is a collection of Steps. There's a composite called Block that allows Steps to be arbitrarily nested. Blocks are also used to model 'if', 'for', and 'while' type operations. There are some plans to add a Process, which is a collection of Activities. The existing code handles wizards. 'Process' would be needed to add support for workflows between people (type 2) or workflows between systems (type 3). A Context is provided at runtime as processing flows through a defined Activity. Decisions through the workflow are made based on objects in the Context.
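To make the composite idea concrete, here is a minimal sketch in Python (chosen for brevity; it is not Commons Workflow's Java API). Step, Block, Activity, and the context come from the description above. The `Record` and `If` classes, the method name `execute`, and the wizard step names are hypothetical, made up for illustration.

```python
class Step:
    """One unit of work in an activity; subclasses override execute()."""

    def execute(self, context):
        raise NotImplementedError


class Record(Step):
    """Hypothetical leaf step: appends its name to a trail kept in the context."""

    def __init__(self, name):
        self.name = name

    def execute(self, context):
        context.setdefault("trail", []).append(self.name)


class Block(Step):
    """Composite: a step that runs its nested steps in order."""

    def __init__(self, *steps):
        self.steps = list(steps)

    def execute(self, context):
        for step in self.steps:
            step.execute(context)


class If(Block):
    """Conditional block: runs its nested steps only if the predicate holds."""

    def __init__(self, predicate, *steps):
        super().__init__(*steps)
        self.predicate = predicate

    def execute(self, context):
        # The decision is made from objects in the context, not from the steps.
        if self.predicate(context):
            super().execute(context)


class Activity(Block):
    """Top-level collection of steps (a 'wizard')."""


activity = Activity(
    Record("ask_name"),
    If(lambda ctx: ctx.get("needs_shipping"), Record("ask_address")),
    Record("confirm"),
)

context = {"needs_shipping": True}
activity.execute(context)
print(context["trail"])
```

Because a Block is itself a Step, blocks nest to any depth, and the control-flow steps are just blocks that decide whether (or how often) to run their children.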

Turbine's Pipeline (figure 2):

Looks pretty similar to Commons Workflow, though a little simpler. A Pipeline is a collection of Valves. The Pipeline is defined with an XML file, not unlike the default means of configuring Steps in an Activity in Workflow. At runtime RunData and ValveContext are provided as processing flows through the Pipeline. There's no composite pattern like the Workflow Block, but the rest is pretty similar.
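A rough sketch of the pipeline shape, again in Python rather than Turbine's Java. Pipeline, valve, and ValveContext are taken from the description above; the `invoke` and `invoke_next` method names, the three example valves, and the plain dict standing in for RunData are assumptions for illustration, not Turbine's actual signatures.

```python
class ValveContext:
    """Tracks the position in the pipeline and hands control to the next valve."""

    def __init__(self, valves):
        self._valves = valves
        self._index = 0

    def invoke_next(self, data):
        # Run the next valve, if any; each valve decides whether to continue.
        if self._index < len(self._valves):
            valve = self._valves[self._index]
            self._index += 1
            valve(data, self)


class Pipeline:
    """A flat, ordered collection of valves (no nesting, unlike Block)."""

    def __init__(self, valves):
        self.valves = list(valves)

    def invoke(self, data):
        ValveContext(self.valves).invoke_next(data)
        return data


def authenticate(data, ctx):
    data["user"] = data.get("user", "anonymous")
    ctx.invoke_next(data)


def authorize(data, ctx):
    if data["user"] == "anonymous":
        data["status"] = "denied"
        return  # stop here: later valves never run
    ctx.invoke_next(data)


def render(data, ctx):
    data["status"] = "rendered"
    ctx.invoke_next(data)


pipeline = Pipeline([authenticate, authorize, render])
result = pipeline.invoke({"user": "alice"})
print(result["status"])
```

The data moves through the valves; the valves themselves hold no state about where the request is in its journey.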

State design pattern (figure 3):

I think this turns the Pipeline model inside out. Instead of having a context and moving it through a pipeline, client code talks to the Context, and the Context's behavior changes as its state changes through subsequent calls. PlanetCAD's workflow uses a State pattern. It's worked pretty well for us, but it seems to invite some coupling between the State objects and the Context. And we're also seeing some proliferation of pretty similar states.
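For comparison, a minimal State pattern sketch in Python. This is not PlanetCAD's code; the `Job` context, the `Queued`, `Running`, and `Done` states, and the log messages are all hypothetical names picked to echo the job-server example earlier.

```python
class State:
    """Base class for states; each state decides what advance() means."""

    def advance(self, job):
        raise NotImplementedError


class Queued(State):
    def advance(self, job):
        job.log.append("picked up by a job server")
        job.state = Running()  # the state reaches into the context to move it on


class Running(State):
    def advance(self, job):
        job.log.append("script finished")
        job.state = Done()


class Done(State):
    def advance(self, job):
        job.log.append("nothing left to do")


class Job:
    """The Context: client code only calls advance(); behavior depends on state."""

    def __init__(self):
        self.state = Queued()
        self.log = []

    def advance(self):
        self.state.advance(self)


job = Job()
job.advance()
job.advance()
job.advance()
print(job.log)
```

Note that every state assigns `job.state` directly and has to know which state comes next. That is the coupling between State objects and Context mentioned above, in miniature.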

Event model:

The asynchronous nature of a business process workflow (type 2 above) invites some thinking about making the workflow event driven, which is what I've been doing for the past several weeks. In this case the individual Steps or Valves or States would be decoupled and made independent. Then the workflow would be a network of listeners, each of which triggers a particular processing step and gets access to contextual info in the invocation. When processing completes, an event would be fired to indicate its completion, and some other step could be listening for that event. You could still use an XML descriptor to define the network of listeners and the rules that govern different paths through the network. But I think making the steps more loosely coupled opens the opportunity to change the workflow at runtime. That would allow ad-hoc workflows to be created, or existing workflows to be changed on the fly to handle exceptions in the standard process.
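Here is one way the listener network might look, as a small Python sketch. Everything in it is hypothetical: the `EventNetwork` class, the event names, and the purchase-order steps borrowed from use case 2. It also dispatches events synchronously to keep the example short; a real asynchronous workflow would presumably put a queue or message broker between firing and handling.

```python
from collections import defaultdict


class EventNetwork:
    """Maps event names to the listeners (steps) waiting for them."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, event, listener):
        self._listeners[event].append(listener)

    def unsubscribe(self, event, listener):
        self._listeners[event].remove(listener)

    def fire(self, event, context):
        # Copy the list so listeners may rewire the network while it runs.
        for listener in list(self._listeners[event]):
            listener(self, context)


def collect_quotes(network, context):
    context["trail"].append("collect_quotes")
    network.fire("quotes_collected", context)


def management_review(network, context):
    context["trail"].append("management_review")
    network.fire("reviewed", context)


def cut_po(network, context):
    context["trail"].append("cut_po")


# The standard process: each step only knows which event it fires when done.
network = EventNetwork()
network.subscribe("po_requested", collect_quotes)
network.subscribe("quotes_collected", management_review)
network.subscribe("reviewed", cut_po)

standard = {"trail": []}
network.fire("po_requested", standard)

# An ad-hoc change at runtime: skip management review for a rush order.
network.unsubscribe("quotes_collected", management_review)
network.subscribe("quotes_collected", cut_po)

rush = {"trail": []}
network.fire("po_requested", rush)

print(standard["trail"])
print(rush["trail"])
```

None of the steps refers to another step by name, so the path through the workflow lives entirely in the subscriptions, and changing the path means changing subscriptions rather than code.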

I also like the potential for letting completely different systems trigger events in the network. I haven't quite gotten my head around this one yet, which is why I haven't included an illustration. It seems like the same sort of mental transition as the one to event-driven programming and GUIs. The elements start to look like they aren't doing enough, but the effect of a whole bunch of them orchestrated by the user makes cool things happen. In the workflow stuff I'm working on, it pushes a lot of the magic into how the network of listeners is organized and how the rules in the network interact with one another to control flow through the network.

These decoupled states waiting for asynchronous events could even be web services in a pipeline, if you accept my theory that a pipeline is just an inside-out workflow. I still think there are interesting parallels between the asynchronous event-driven processing of GUI applications and similar asynchronous event-driven processing in Service Oriented Architectures. I don't think it's any coincidence that SOAs and pipelines are getting some buzz these days. Finally, I'm still interested in a workflow engine supporting ad-hoc workflows.