Posts tagged with "bot"

Dynamic Bot Dialogs

Please note that this article’s age is showing, as it’s talking about Bot Framework v3. The subsequent version of the Bot Framework works entirely differently.

I’m having a lot of fun developing against botbuilder - the Node.js SDK for the bot framework.

When you’re learning to make bots, you study and build a lot of simple bots that do very little. In that case, it makes good sense to simply define the bot’s dialogs in the same file where you do everything else - the file you may call server.js or app.js or index.js. But if you’re working on a bot with enough complexity or bulk in its dialogs, you’ll want to settle on a pattern.

Encapsulated Dialogs

The first pattern I embraced came from @pveller‘s excellent ecommerce-chatbot. In fact, I learned a lot of good patterns from that bot.

In the ecommerce-chatbot bot, Pavel breaks each dialog out into a separate JavaScript file and wraps it in its own module. Then, from the main file, he calls out to those modules, passing in the bot, and “wires up” the dialog to the bot within that separate module.

Notice in the following code that the main app.js file configures the dialog by requiring it and then calling the returned function passing in the bot object. That allows the dialog to use the bot internally (even though it’s a separate module) to call bot.dialog() and define the dialog functions.

//simplified from https://github.com/pveller/ecommerce-chatbot

//app.js
...
let showProductDialog = require('./app/dialogs/showProduct');
...
intents.matches('ShowProduct', '/showProduct');
...
showProductDialog(bot);
...

//showProduct.js
module.exports = function (bot) {
    bot.dialog('/showProduct', [
        function (session, args, next) {
            //waterfall function 1
        },
        function (session, args, next) {
            //waterfall function 2
        }
    ]);
};

The result is a much more concise app.js file and a bit of welcome encapsulation. The dialogs handle themselves and nothing more.

Dynamically Loaded Dialogs

Later, while I was working with Johnson and Johnson on a bot, I developed a pattern for dynamically loading dialogs based simply on a) their presence in the dialogs project folder and b) their conformance to a simple pattern.

To create a new dialog, then, here’s all I need to do…

module.exports = function (name, bot) {
    bot.dialog(`/${name}`, [
        function (session, args, next) {
            session.endDialog(`${name} reached`);
        }
    ]).triggerAction({ matches: name });
};

The convention I need to follow is to define a module with a function that accepts both a name and a bot object.

That function then calls the dialog() method on the bot just like before, but it uses the name that’s passed in as a) the dialog route and b) the trigger action. This means that if the dialog is called greeting, then it will be triggered whenever an action called greeting fires.

So far, this is a small advantage, but look at how I load this and the other dialogs…

getFileNames('./app/dialogs')
    .map(file => Object.assign(file, { fx: require(file.path) }))
    .forEach(dialog => dialog.fx(dialog.name, bot));

The getFileNames function is my own; it simply reads the path you pass in recursively and returns the name and path of every .js file it finds.
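
I won’t reproduce the exact implementation here, but a minimal sketch of what getFileNames might look like follows (the { name, path } shape is assumed from how it’s used above)…

//a minimal sketch of getFileNames: walk a folder recursively and
//return { name, path } objects for every .js file found
const fs = require('fs');
const path = require('path');

function getFileNames(dir) {
    return fs.readdirSync(dir).reduce((files, entry) => {
        const fullPath = path.resolve(dir, entry);
        if (fs.statSync(fullPath).isDirectory()) {
            //recurse into subfolders
            return files.concat(getFileNames(fullPath));
        }
        if (path.extname(entry) === '.js') {
            files.push({ name: path.basename(entry, '.js'), path: fullPath });
        }
        return files;
    }, []);
}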

The .map() calls require on the path of each found file and adds the resulting export (in our case here the modules are exporting a function) to the array as a property called fx.

Finally, we call .forEach() on this and actually execute the function. This configures the dialog for our bot.

The overall result then is the ability to add dialogs to the bot without any wiring. You just create a new dialog, give it a filename that makes good sense in your application, and it should be loaded and ready to be targeted.

You may not get enough context from these snippets to implement this if that’s what you want to do, so check out a fuller sample in the botstarter repo that @danielegan is working on. The botstarter repo is designed to be a good starting point for creating bots.

TIL Something About Bot Middleware

PREAMBLE: I am trying to blog about the little things now. The idea is partly the reason why so many technical blogs exist - it’s a place for me to record things I’ll need to recall later. But modern search engines are good enough that you just might make it to this blog post to answer a question that’s burning a hole in your brain right now, and that’s awesome. I know I love it when I get a simple, concise, and sensible explanation of something I’m trying to figure out.

MORE PRE-RAMBLE: So, I’ve sort of drifted into bot territory. That is, I didn’t initially get extremely excited about the concept of chat bots. It seemed silly. I have since been convinced of their big business value and have really enjoyed learning how to embrace the Node.js SDK for Microsoft’s Bot Framework.

Recently, I realized that the very best way to learn about the SDK is not to search online for docs or posts, but to go straight to the source, and when you get there, look specifically for the /core/lib/botbuilder.d.ts file.

That file is a treasure trove of useful comments directly decorating the methods, interfaces, and properties of your bot. It’s great that the SDK is written in TypeScript, because that means this source code contains a lot of documenting types that not only made it easier for the team to develop it, but now make it easier for us to read it as well.

Tonight I was specifically wondering about something. I had seen bot middleware components that used the botbuilder and send properties, but then I saw receive and wondered what every possible property was and what each one did.

I discovered that in fact botbuilder, send, and receive are the only possible property values there. Let me drop that snippet of the source code here, so you can see how well documented those are…

/**
 * Map of middleware hooks that can be registered in a call to __UniversalCallBot.use()__.
 */
interface IMiddlewareMap {
    /** Called in series when an incoming event is received. */
    receive?: IEventMiddleware|IEventMiddleware[];

    /** Called in series before an outgoing event is sent. */
    send?: IEventMiddleware|IEventMiddleware[];

    /** Called in series once an incoming message has been bound to a session. Executed after [analyze](#analyze) middleware. */
    botbuilder?: ICallSessionMiddleware|ICallSessionMiddleware[];
}

The IMiddlewareMap is an interface, which is a TypeScript concept that doesn’t exist in raw JavaScript. TypeScript does interfaces right, because they’re not actually enforced on the objects that implement them (we are, after all, talking about JavaScript, where pretty much nothing is enforced). Rather, they’re an indication of intent - as in “I intend for my object to conform to the IMiddlewareMap interface.”

That means that at design time (when you’re typing the code in your IDE), you get good information back about whether what you’re typing lines up with what you said this object is expected to be.
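
In a regular chat bot, the same three hooks exist with the same shape. A minimal sketch of registering them on a v3 UniversalBot might look like this (the logging is just for illustration, not anything from a real project)…

const builder = require('botbuilder');

const connector = new builder.ChatConnector();
const bot = new builder.UniversalBot(connector);

bot.use({
    //called in series when an incoming event is received
    receive: (event, next) => {
        console.log(`received: ${event.type}`);
        next();
    },
    //called in series before an outgoing event is sent
    send: (event, next) => {
        console.log(`sending: ${event.type}`);
        next();
    },
    //called once an incoming message has been bound to a session
    botbuilder: (session, next) => {
        console.log(`message from: ${session.message.address.user.id}`);
        next();
    }
});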

So that’s just one little thing I learned tonight wrapped up with all kinds of preamble, pre-ramble, and other words. Hope it helps. Happy hacking.

Cabin Escape

The Idea

I worked together with a few fine folks from my team on a very fun hackathon project, and I want to tell you about it.

The team was myself (@codefoster), Jennifer Marsman (@jennifermarsman), Hao Luo (@howlowck), Kwadwo Nyarko (@kjnyarko), and Doris Chen (@doristchen).

Here’s our team…

[image: team]

The hackathon was themed on some relatively new products - namely Cognitive Services and the Bot Framework. Additionally, some members of the team were looking for an opportunity to fine-tune their Azure Functions skills, so we went looking for an idea that included them all.

I’ve been mulling over the idea of using some of these technologies to implement an escape room, which, as you may know, are very popular nowadays. If you haven’t played an escape room, perhaps you want to find one nearby and give it a try. An escape room is essentially a physical room that you and a few friends enter and are tasked with exiting in a set amount of time.

Exiting, however, is a puzzle.

Our escape room project is called Cabin Escape and the setting is an airplane cabin.

Game Play

Players start the game standing in a dark room with a loud, annoying siren and a flashing light. The setting makes it obvious that the plane has just crash landed and the players’ job is to get out.

Players look around in haste, motivated by the siren, and discover a computer terminal. The terminal has some basic information on the screen that introduces itself as CAI (pronounced like kite without the t) - the Central Airplane Intelligence system.

A couple of queries to CAI about her capabilities reveal that the setting is in the future and that she is capable of understanding natural language. And as it turns out, the first task of silencing the alarm is simply a matter of asking CAI to silence the alarm.

What the players don’t know is that the ultimate goal is to discover that the door will not open until the passenger manifest is “validated,” and CAI will not validate the manifest until all passengers are in their assigned seats. The only problem is that passengers don’t know what their assigned seats are.

The task then becomes a matter of finding all of the hidden boarding passes that associate passengers with their seats. Once the last boarding pass is located and the last passenger takes his seat, cameras installed in the seat backs finish reporting to the system that the passenger manifest is validated and the exit door opens.
[image: cabin]

Architecture

Architectures of old were almost invariably n-tiered. We software developers are all intimately familiar with that pattern. Times they are a-changing! An n-tier monolithic architecture may accomplish your feat, but it won’t scale very well in a modern cloud.

The architecture for Cabin Escape uses a smattering of platform offerings. What does that mean? It means that we’re not going to ask Azure to give us one or more computers and then install our code on them. Instead, we’re going to compose our application out of a number of higher-level services that are each independent of one another.

Let’s take a look at an overall architecture diagram.

[image: architecture]

In Azure, we’re using stateless and serverless Azure Functions for business logic. This is quite a paradigm shift from classic web services, which are often implemented as an API.

APIs map to nodes (servers), and whether you see it or not, when your application is running, you are effectively renting that node.

Functions, on the other hand, do not conceptually map to servers. Obviously, they still run on servers, but you don’t have to be in the business of declaring how Functions’ nodes scale up and down. That’s handled implicitly. You also don’t pay for nodes when your function is not actually executing.

The difficult part, in my opinion, is the conceptual change: with a serverless architecture, your business logic is split out into multiple functions. At first it’s jarring. Eventually, though, you start to understand why it’s advantageous.

If any given chunk of business logic ends up being triggered a lot for some reason and some other chunk doesn’t, then dividing those chunks of logic into different functions allows one to scale while the other doesn’t.

It also starts to feel good from an encapsulation standpoint.

Besides Functions, our diagram contains a DocumentDB database for state, a bot using the Bot Framework, LUIS for natural language recognition, and some IoT devices installed in the plane - some of which use cameras.

Cameras and Cognitive Services

The camera module is developed with Microsoft Cognitive Services, Azure Functions, Node.js, and TypeScript. The module performs face training, face detection, and identification, as well as notification to the Azure Functions service. It determines whether the right person is seated; that result is sent back to the Azure Functions service, and the controller then decides on the next action.

The following diagram describes the interaction between the Azure Functions service, Microsoft Cognitive Services, the Node server processing, and the client.
[diagram: Architecture and Interaction Diagram of the Camera project]
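
The exact code isn’t shown here, but the flow amounts to something like the following sketch - detect a face in a captured frame, identify it against a previously trained person group, and notify an Azure Function. The endpoint, key, person group id, and function URL below are all placeholders, not values from the project…

//a hedged sketch of the camera module's core flow (all names and URLs are placeholders)
const fetch = require('node-fetch');

const FACE_API = 'https://westus.api.cognitive.microsoft.com/face/v1.0';
const KEY = process.env.FACE_API_KEY;

async function checkSeat(imageUrl, seatId) {
    //1. detect faces in the captured image
    const detectResponse = await fetch(`${FACE_API}/detect?returnFaceId=true`, {
        method: 'POST',
        headers: { 'Ocp-Apim-Subscription-Key': KEY, 'Content-Type': 'application/json' },
        body: JSON.stringify({ url: imageUrl })
    });
    const faces = await detectResponse.json();
    if (!faces.length) return;

    //2. identify the faces against a trained person group ('passengers' is made up)
    const identifyResponse = await fetch(`${FACE_API}/identify`, {
        method: 'POST',
        headers: { 'Ocp-Apim-Subscription-Key': KEY, 'Content-Type': 'application/json' },
        body: JSON.stringify({ personGroupId: 'passengers', faceIds: faces.map(f => f.faceId) })
    });
    const results = await identifyResponse.json();

    //3. notify an Azure Function (placeholder URL) about who appears to be in this seat
    await fetch(process.env.SEAT_PLAYER_FUNCTION_URL, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ seatId: seatId, candidates: results })
    });
}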

Cloud Intelligence and Storage

We use Azure Functions to update and retrieve the state of the game. Our Azure Functions are stateless; however, we keep the state of every game stored in DocumentDB. In our design, every Cabin Escape room has its own document in the state collection. For the purposes of this project, we have one escape room, which has the id ‘officialGameState’.
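
The exact schema isn’t important here, but conceptually a room’s document looks something like this (the field names below are illustrative, not the actual schema)…

//an illustrative sketch (not the actual schema) of a room's state document
const officialGameState = {
    id: 'officialGameState',
    alarm: true,
    exitDoor: 'closed',
    smoke: false,
    overheadBins: 'closed',
    oxygen: 100,
    pressure: 14.7,
    seats: [
        { seatId: '1A', assignedPassenger: 'passenger-1', occupiedBy: null }
        //...one entry per seat
    ]
};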

We started by creating a ‘GameState’ DocumentDB database, and then creating a ‘state’ collection. This is pretty straightforward to do in the Azure portal. Keep in mind you’ll need a DocumentDB account created before you create the database and collection.

After setting up the database, we wrote our Azure Functions. We have five functions used to update the game state, communicate with the interactive console (Central Airplane Intelligence - Cai for short), and control the various systems in the plane. Azure Functions can be triggered in various ways, ranging from timed triggers to blob triggers; our trigger-based functions were either HTTP or timer based. Along with triggers, Azure Functions can accept various inputs and outputs configured as bindings. Below are the functions in our cabinescape function application, followed by a sketch of what one of them might look like.

- GamePulse: 
    * Retrieves the state of the plane alarm, exit door, smoke, overhead bins and sends commands to a Raspberry Pi
    * Triggered by a timer
    * Inputs from DocumentDB
- Environment:
    * Updates the state of oxygen and pressure
    * Triggered by a timer
    * Inputs from DocumentDB
    * Outputs to DocumentDB
- SeatPlayer:
    * Checks to see if every player is in their seat
    * Triggered by HTTP request
    * Inputs from DocumentDB
    * Outputs to DocumentDB and HTTP response
- StartGame: 
    * Initializes the state of a new game
    * Triggered by HTTP request
    * Outputs to DocumentDB and HTTP response
- State:
    * Updates the state of the game
    * Triggered by HTTP request
    * Inputs from DocumentDB
    * Outputs to DocumentDB
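
As a minimal sketch (not the actual project code), an HTTP-triggered function like SeatPlayer could look roughly like this, with the game state document flowing in and out through DocumentDB bindings. The binding names gameStateIn and gameStateOut are assumed and would be configured in the function’s bindings…

//a hedged sketch of an HTTP-triggered function in the style of SeatPlayer
module.exports = function (context, req) {
    //the game state document arrives through a DocumentDB input binding
    const state = context.bindings.gameStateIn;

    //check whether every seat is occupied by its assigned passenger
    const allSeated = state.seats.every(s => s.occupiedBy === s.assignedPassenger);
    state.manifestValidated = allSeated;

    //write the updated document back out through a DocumentDB output binding
    context.bindings.gameStateOut = state;

    //and answer the HTTP request
    context.res = { body: { allSeated: allSeated } };
    context.done();
};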

A limitation we encountered with timer-based triggers is the inability to turn them on or off at will. Our timer-based functions are on by default, and are triggered based on an interval (defined with a cron expression).

In reality, a game is not being played 24/7. Ideally, we want the timer-based functions to turn on when a game starts and continue on a time interval until the game’s end condition is met.

The Controller

An escape room is really just a ton of digital flags all over the room that either inquire or assert the binary value of some feature.

  • Is the lavatory door open? (inquire)
  • Turn the smoke machine on (assert)
  • Is the HVAC switch in the cockpit on? (inquire)
  • Turn the HVAC on (assert)

It’s quite simply a set of inputs and outputs, and their coordination is a logic problem.

All of these logic bits, however, have to exist in real life - what I like to call meat space, and that’s the job of the controller. It’s one thing to change a digital flag saying that the door should be open, but it’s quite another to actually open a door.

The controller in our solution is a Raspberry Pi 3 running a Node.js app that does two things: 1) it reads and writes the logical values of the GPIO pins, and 2) it dynamically creates an API for all of those flags so they can be manipulated from our solution in the cloud.

To scope this project to a 3-day hackathon, the various outputs are going to be represented with LEDs instead of real motors and stuff. It’s meat space, but just barely. It does give everyone a visual on what’s going on in the fictional airplane.
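
A minimal sketch of that controller idea follows - read and write GPIO pins and expose each flag over a small HTTP API. The onoff and express modules and the pin numbers here are illustrative choices, not necessarily what the project used…

//a hedged sketch of the controller: GPIO flags exposed over HTTP
const express = require('express');
const Gpio = require('onoff').Gpio;

//each named flag maps to a GPIO pin and a direction
const flags = {
    exitDoor: new Gpio(17, 'out'),     //assert: open the door (an LED for the hackathon)
    smokeMachine: new Gpio(27, 'out'), //assert: turn the smoke machine on
    hvacSwitch: new Gpio(22, 'in')     //inquire: is the cockpit HVAC switch on?
};

const app = express();

//inquire a flag's current value
app.get('/flags/:name', (req, res) => {
    const flag = flags[req.params.name];
    if (!flag) return res.sendStatus(404);
    res.json({ name: req.params.name, value: flag.readSync() });
});

//assert a flag's value
app.post('/flags/:name/:value', (req, res) => {
    const flag = flags[req.params.name];
    if (!flag) return res.sendStatus(404);
    flag.writeSync(parseInt(req.params.value, 10));
    res.sendStatus(200);
});

app.listen(3000);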

The Central Airplane Intelligence (Cai)

This is a unique “Escape the Room” concept in that it requires a mixture of physical clues in the real world and virtual interaction with a bot. For example, when the team first enters the room, the plane has just “crashed” so there is an alarm beeping. This is pretty annoying, so people are highly motivated to figure out how to turn it off quickly. A lone console is at the front of the airplane, and the players can interact with it.

[image: CAI Welcome Screen]

One of the biggest issues with bots is discoverability: how to figure out what the bot can do. Therefore, good bot design is to greet the user with some examples of what the bot can accomplish. In our case, the bot is able to respond to many different types of questions, which are mapped to our LUIS intents:

  • What is the plane’s current status overall?
  • How do I fix the HVAC system?
  • What is the flight attendant code?
  • How do I get out of here?
  • How do I unlock the cockpit door?
  • How do I open the overhead bins?
  • How do we clear all this smoke?
  • What is the oxygen level?
  • How do I disable the alarm?

The bot (named “CAI” for Central Airplane Intelligence) was implemented in C# using LUIS. The code repository can be found at https://github.com/cabinescape/EscapeTheAirplaneBot.