When we started working on the second version of Recast.AI’s bot builder, one of our goal was to make something as separate as possible from the NLP part of Recast.AI.
What I’d like to go over with you today is how we can use Bot Builder as a logic engine, without thinking about NLP and conversations, to implement a Finite State Machine. We’ll use NLP, but only as an interface to play with what we’re building. We’ll first start with a simple example, and then with a more complex one, in the form of a little game.
Finite State Machine
Quoting Wikipedia, a finite state machine is an abstract machine that can be in exactly one of a finite number of states at any given time. [..] The changes from one state to another are called transitions. [..] a finite state machine is defined by a list of its states: its initial state, and the conditions for each transition.
Understanding a simple finite state machine: a subway turnstile
If you’ve already taken the subway, there is a fairly good chance that you know how to use a turnstile:
- If you try to push through without inserting a ticket, the turnstiles stays locked
- When you insert a ticket, the turnstile is unlocked
- After you go through, the turnstile locks again
Doesn’t this look like a finite state machine? Let’s describe this system in other words.
According to the definition we saw earlier, it seems that the turnstile, which is either locked or unlocked, is a perfect example of something being in exactly one of a finite number of states at any given time.
But that’s not enough. What else do we need to call our turnstile a finite state machine? Based on our definition, we need to agree on an initial state, which will obviously be locked, and the transitions between one state to another.
Let’s have a look at how the turnstile reacts to external inputs.
This is our transition table, which describe how we go from one state to another, depending on the input we receive. Note that one input does not necessarily have the same effect depending on the state of the system. Here we can see that pushing through the turnstile has different effects if the turnstile is locked or unlocked.
So, we now know what a finite state machine is made of: a finite number of states, an initial state, and transitions that describe the system’s reactions to inputs. We also have a very simple example of a finite state machine in mind, a subway turnstile. Next step: do it with Recast.AI!
Build your own Bot Builder based fine state machine
Disclaimer: I’m not going to do a complete overview or tutorial of Bot Builder in this article. I assume you know how Bot Builder works, as the purpose of this article is to show another way of thinking about it. If you’re not comfortable with the tool, please read this article before moving on.
Our inputs here will be text messages, since Bot Builder is a text based interface. How do we represent the components of our machine?
Since the machine is in exactly one state at a time, we only need to remember one information through the different inputs. Bot Builder offers us a key/value storage through the memory, so we are gonna use it. We will just use a single variable, that we will call state because I like originality.
Since we can’t provide a default value for the memory in Bot Builder, we need a way to recognize the initial state among all states. We will use the absence of the state variable as an indicator that the system is in the initial state.
Here we need to react to inputs with a possible state change. As we use the memory to describe the state, moving from one state to another is just changing a variable in the memory. This is represented by Bot Builder in the actions parts of a skill.
So to react to an input from a state A, we define a skill for the state A (with the appropriate trigger), in which we define the corresponding actions to make the state change (or not, depending on the input).
Technically, we could use only one skill, use only the actions tab, and abuse conditions to describe every state and transitions in our system, but for the sake of readability, we will use one skill for each state.
Turnstile representation in Bot Builder
Let’s have a look at what we saw here:
- We will have one skill for each state
- Each skill’s state is defined in its triggers
- Initial state has a trigger that allows the absence of state
- Each transition is handled in the actions of the skill
Let’s make that in Recast.AI!
Let’s start with the Locked state. As we said, this state is also the initial state, so we need to handle that in the skill triggers. This will take the form of the following condition in the triggers: _memory.state is locked OR _memory.state is-absent. Here we associate the fact that this state is the initial state with the absence of the state variable in memory. Now we have to implement the transitions, we know that we have a text input (which we have access to through _source in bot builder).
We have 3 possible cases here:
- The input is “ticket”
- The input is “push”
- The input is anything else
We will consider that the case 3 is an invalid input and ignore the input.
In the case 1, we just have to refer to our transition table described before to know in which state we have to go. Here to make the transition to the Unlocked state, we just have to set the variable in memory using the set-memory tool of Bot Builder.
In the case 2, the transition table says that we don’t change the state, so we can just ignore the input.
For our example we will also add little messages to have a text feedback when we will test the system.
The same goes for the Unlocked state, except that we will not handle the absence of the state variable, and the actions will be the opposite.
Case 3 is still considered as an invalid input, case 2 will change the state to Locked, and case 3 will do nothing.
You have the link to the complete example here.
That’s it! We have represented a finite state machine with Recast.AI Bot Builder! This system is very simple, but what I want to show here is that you can (and you should) think about the Builder beyond chatbots, and apply it to your project as conversational (or not!) as it is.
A less simple state machine: the Goat, Wolf and Cabbage problem
Here is a famous logic game: A sailor stands in front of a river with a cabbage, a goat and a wolf. He has to cross the river but his boat has only has room for one his three fellows. The problem is that both the goat and the wolf are very undisciplined. If the wolf stays with the goat without the surveillance of the sailor, he will eat the poor animal without remorse. The same goes for the goat and the cabbage.
How can the sailor achieve his goal without losing the goat or the cabbage?
As you might have guessed, we can represent this problem with a finite state machine. Each state of the system can be represented by who (or what) is on the left side of the river.
Here again we’re going to use Bot Builder to simulate the finite state machine. You can fork the project here.
Let’s use letters to represent the protagonists (G for the goat, S for the sailor, W for the wolf and C for the cabbage), so we can represent a state by describing who has crossed the river.
With this notation the final state can be represented by SGWC. Note that the order of the letters don’t matter, so the initial state could have been described by SWCG or WSGC.
This allows us to discover a new property of final state machines: classification. To allow our machine to tell us if a sequence of actions (inputs) is a solution to our sailor’s problem, we introduce a new property to our states. Each state is either accepting or non accepting. What does this mean? If the sequence of inputs leads to a non accepting state, the sequence itself is rejected (in other words, the sequence of actions is not a solution). If the final state is accepting, the sequence is accepted.
In our case we have only one accepting state, the state where everyone has crossed the river. We also need to introduce a new state: the lost state, which you can’t quit, and is reached when the wolf eats the goat or the goat eats the cabbage.
Similarly to the turnstile, we have one skill for each state, and use actions to make the transitions.
There is one difference with our first use case: since we want to have a smart chatbot, we allow our users to talk freely, and the real state machine will only start when the user chooses to start the game.
Here, we’re using a requirement to ask a question until we have a valid input. We define a valid input as an input which either:
- Matches with an intention we define in the Train tab
- Contains an entity we define in the Train tab
The intent contains sentences like “Move the goat” or “the sailor moves alone”, and the entity matches . the words goat, wolf and cabbage. This allows us to define our inputs as the following:
- G: Entity is detected with value goat: The sailor moves the goat
- C: Entity is detected with value cabbage: The sailor moves the cabbage
- W: Entity is detected with value wolf: The sailor moves the wolf
- S: Intention is detected without entity : The sailor moves alone
We can now define our transition table. We refer to the initial state with the word start and the lost state with the word lost.
This table shows for each state, which state is the next depending on the input. Translations noted with “-“ have no effect. As you see, this is exactly the same abstract construction as the turnstile example, with only the state acceptance added.
If you want to try the bot, you can find it here. Of course, I didn’t make it as strict as a finite state machine to allow minimal conversational interactions.
As you see, even if this application is not extremely complex, we have been able to describe it using the tools that Bot Builder provides.
This is to make you think about Bot Builder not only from the NLP/conversational point of view, but also as a programmatic tool, in which we use NLP features and abstraction to create the smartest conversational applications possible. Enjoy!