13 Nov 2015

Deterministic and Non-Deterministic Finite State Machines with Cyclops

I have just recently become aware of the Cyclops library for functional programming with Java 8. This exciting library offers many excellent features, among them
  • Enhancements and utilities for Streams
  • Advanced monadic operations
  • Powerful pattern matching
  • Interoperability with both TotallyLazy and javaslang
and more. I have only looked at a part of it, and I'm sure there are many useful things to discover. Go look for yourself!

Abstracting over Monads

The feature that drew my attention initially, however, were Cyclops' tools for monadic programming. In particular, they can abstract over the three monads in JDK 8: Stream, Optional and CompletableFuture (and any other monads you may find or create). By using what they call Comprehenders, indeed they can coerce even classes that are technically non-monadic (like List) into a monad. You can find pointers to articles about this on the Cyclops page, for example this readable introduction.

Why did this interest me so much? Well, monads are particularly good at combining computations into a coherent whole, abstracting away some boilerplate code. The algorithm for combining the computations can always be the same (using a method that is usually called flatMap or bind), with the details of what happens being delegated to the particular monad. An instructive example is given by Mike MacHenry in this post. MacHenry models a deterministic finite state machine (DFA) with a transition function that has type Optional, and a nondeterminstic fine state machine (NFA) with a transition function that has type List. He then shows how we can abstract over DFAs and NFAs by implementing a traversal function that works for both kinds of machines. The advantage is interface economy, and consequently more elegance and less maintenance.

MacHenry's post is in the context of Haskell, and I have always wished to be able to do the same thing in Java. But in the JDK, although Optional basically implements the same conceptual API as List (you might think of Optional as a sequence with a cardinality of at most 1, cf. here), there is no actual interface to capture the common functionality. You might even be driven to model a DFA as an NFA and place the constraint that the lists must contain at most one state in the Javadoc. The point is that this would be a decidedly inferior DFA model.

Finally, with Cyclops, we can do what we really want. Cyclops introduces a simple wrapper class, called AnyM, to manage the abstraction.

The technicalities of working with AnyM, however, can also be a bit involved, so we want to hide that from our application logic and provide a common implementation for DFAs and NFAs with an API that exposes only standard JDK classes. I'd like to show you how I did it. The solution requires Cyclops 6.1.1 or later.

Modelling Finite State Machines

A finite state machine is defined by it's inventory of states, an alphabet of permissible input symbols and a transition table that takes the current state, the next input symbol to be consumed and yields the next state (or states, in the case of an NFA). In Java, the transition function can be modelled by

BiFunction<S, Character, M>

where S denotes the type of the machine's states, and M the monadic container used as the result value of the transition function (Optional<S> or List<S> for a DFA or NFA, resp.)

The states and alphabet will be left implicit in the transition table. In addition, for convenience we will allow the transition table to be partial, in the sense that it need only contain the valid transitions. The appropriate empty value that signifies an invalid transition is supplied separately when creating a state machine. Here's the factory method signature:

public static <S, M> StateMachine<S,M> newStateMachine(BiFunction<S, Character, M> partialTransitions, M defaultValue)

The machine itself is represented as a traversal function that takes a state and a sequence of input characters, and returns the resulting state(s) after consuming the entire input, in which case the input is said to have been accepted. (Well, really it is only accepted when you start from a properly identified initial state and end up in a final state, but I'm ignoring this detail here. You can easily build it on top of the implementation given below.)

public interface StateMachine<S, M> {
    M accept(S state, CharSequence input);

The states themselves can be anything. We will assume a simple immutable class State hat has an integer identity attribute with equals and hashCode defined accordingly, and an associated factory method state(int).

Working with Finite State Machines

In proper test-first fashion, let us first consider how we want to be able to use our state machines. Here are a couple of JUnit tests. I will use Guava's two-dimensional array table to provide the transition function.

import static java.util.Arrays.asList;
import static java.util.Collections.emptyList;
import static java.util.Optional.empty;
import static java.util.Optional.of;

public void demoDFA() {
    ArrayTable<State, Character, Optional<State>> transitiontable = ArrayTable.create(
        asList(state(1), state(2)), asList('a', 'b'));

    transitiontable.put(state(1), 'a', of(state(2)));
    transitiontable.put(state(1), 'b', of(state(1)));
    transitiontable.put(state(2), 'b', of(state(1)));

    StateMachine<State,Optional<State>> dfa = newStateMachine(transitiontable::get, empty());
    Optional<State> finalState = dfa.accept(state(1), "ab");
    assertEquals(state(1), finalState.get());

public void demoNFA() {
    ArrayTable<State, Character, List<State>> transitiontable = ArrayTable.create(
        asList(state(1), state(2), state(3)), asList('a', 'b'));

    transitiontable.put(state(1), 'a', asList(state(2), state(3)));
    transitiontable.put(state(2), 'a', asList(state(2)));
    transitiontable.put(state(3), 'b', asList(state(3)));

    StateMachine<State, List<State>> nfa = newStateMachine(transitiontable::get, emptyList());
    List<State> finalStates = nfa.accept(state(1), "ab");
    assertEquals(asList(state(3)), finalStates);

I have not shown any test with inacceptable input. In that case, because the transition function will return either an empty Optional or empty List when it encounters a state out of which there is no path, the result will also be empty. And of course, in case the input is empty, we shall expect the result to be just the initial state (wrapped in its appropriate monad).

Implementing the Finite State Machine

Without further ado, here's the implementation. It's quite concise. Explanatory comments will follow.

public static <S, M> StateMachine<S,M> newStateMachine(BiFunction<S, Character, M> partialTransitions, M defaultValue) {
     BiFunction<S, Character, AnyM<S>> totalTransitions = (s, c) -> 
         AnyM.ofMonad(Optional.ofNullable(partialTransitions.apply(s, c)).orElse(defaultValue));

    Function<S, AnyM<S>> unit = s -> AnyM.ofMonad(defaultValue).unit(s);
    return (state,input) -> machine(totalTransitions, unit).accept(state, input).unwrap();

private static <S> StateMachine<S, AnyM<S>> machine(BiFunction<S, Character, AnyM<S>> transitions, Function<S, AnyM<S>> unit) {
    return (state,input) -> {
                if (input.length() == 0) return unit.apply(state);
                return transitions.apply(state, input.charAt(0))
                           .flatMap(next -> machine(transitions,unit).accept(next, input.subSequence(1, input.length())));

First thing in the factory method newStateMachine we extend the given function to a total function, and make it put the function value into Cyclops' AnyM wrapper. (Making the function total will obviate dealing with any null value later.) AnyM.ofMonad takes an Object that is already of a supported type and makes it accessible through the wrapper's monadic interface.

Then we define a function that creates a new instance of the underlying monad when given a state. We need that function to terminate the recursive traversal. Cyclops provides a unit method that we can use. We can supply the given default value to provide the required monad type.

Finally, we return the traversal function. Cyclops' unwrap method will remove the AnyM wrapper around the resulting monad.

The traversal function is recursive, terminating with the current state when the input is exhausted. It uses the transition function to look up any new state (or states) reachable from the current state while consuming the next input character, and for each new state that it finds calls that state next and continues to traverse the graph with next as the current state and consuming the rest of the input.

Cyclops API

Cyclops before release 6.1.1 used to contain a host of conversion functions. It was really difficult to know when to use what method. I am indebted to John McClean, the author of Cyclops, for some friendly help with the correct usage of the Cyclops APIs, having made many wrong choices myself. This has not been an easy library to learn!

Fortunately, with Cyclops 6.1.1 many of these methods have been deprecated (see the release notes), which has improved the situation a lot. Just use the factory methods on class AnyM. Javadoc coverage has also been extended.

Some conversion methods ensure that the wrapped type inside AnyM remains unchanged, some convert to Stream implicitly. The latter approach (although it can be more efficient) may impose some additional burden on the application coding. In our case that burden would be allowing for a different monadic type to be returned from newStateMachine than was put in, and explicitly collecting to a list in the caller. The choice is yours.


With DFAs, flatMap over Optional abstracts away the tedium of looking at the intermediate result and checking whether it's empty and continuing only with a non-empty value. With NFAs, flatMap over List abstracts away the tedium of applying a map and concatenating the results. With Cyclops, we can do both in a uniform manner.

I hope you have found this example as instructive as I did.

1 comment:

  1. There have been some API changes in cyclops since version 6.1.1 that necessitate some changes in the code above. See