This article was updated on 7th April, 2017. Added new frameworks.
The artificial intelligence, personal assistant and chatbot space has been growing rapidly. The idea of a personal assistant you can beckon with the words “Siri”, “Alexa”, “Cortana” or “Ok Google”, and which connects you to the web and the ever-growing Internet of Things (IoT), is becoming ever more commonplace. Almost every messenger program and smartphone OS has chatbots or personal assistants available in 2017! While their true level of “artificial intelligence” is debatable, we are witnessing the start of a world where we all have virtual assistants at our disposal!
Luckily for developers who want to get in on the action, there is a range of services available that make it simple to get started with the basics of building your own artificial intelligence, chatbot and/or personal assistant for whatever purpose you can dream up. Connect up your smart home, control a self-made media center, deliver all sorts of information via a personal AI assistant… there are so many options available thanks to APIs and services. The lead-up throughout 2015 and 2016 has made 2017 the year in which developers have more options than ever before and really can start building solutions of their own.
In this overview, we’ll look at the services that enable developers to begin connecting their own apps and IoT devices to voice recognition, chatbots and artificial intelligence in 2017.
Wit.ai is a service that combines voice recognition and machine learning for developers. It converts verbal commands into text and can be trained to understand those commands. When it meets a command it didn’t previously understand, you can train it to recognize that command, although this isn’t an automatic process (it’s not a totally intelligent being yet!). Early in 2015, the team joined Facebook and opened up the entire platform to be free for both public and private instances. Its development has been up and down since then, but the team has big plans for 2017.
Wit.ai has two main elements that you set up within your app: intents and entities. An intent is the action an instruction should trigger (e.g. turn on a light). An entity is a specific object or piece of information that your AI needs to know about to enact that intent (e.g. which light? Is it a smart light? Should it understand particular colors the light can switch to?). Rather than needing to create intents from scratch, you can also use existing intents from Wit.ai’s developer community, which is quite neat!
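To make the intent and entity split more concrete, here is a minimal Python sketch of how an app might act on a parsed command. It does not call Wit.ai and does not mirror its response format: `ParsedCommand`, `turn_on_light`, `HANDLERS` and `handle` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ParsedCommand:
    """Hypothetical result of sending an utterance to a language service."""
    intent: str
    entities: Dict[str, str] = field(default_factory=dict)


def turn_on_light(entities: Dict[str, str]) -> str:
    # Entities supply the details the intent needs in order to act.
    room = entities.get("room", "unknown room")
    color = entities.get("color", "white")
    return f"Turning on the {room} light in {color}"


# Map each intent name to the function that carries it out.
HANDLERS: Dict[str, Callable[[Dict[str, str]], str]] = {
    "turn_on_light": turn_on_light,
}


def handle(command: ParsedCommand) -> str:
    """Look up the handler for the intent and pass it the entities."""
    handler = HANDLERS.get(command.intent)
    if handler is None:
        return "Sorry, I don't know how to do that yet."
    return handler(command.entities)
```

Calling `handle(ParsedCommand("turn_on_light", {"room": "kitchen", "color": "blue"}))` routes the intent to its handler, while an unknown intent falls through to the apology message.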
Wit.ai also has the concept of “roles”, where it can learn to differentiate between entities in different contexts (e.g. numbers in different parts of an instruction can refer to different things, like an age, an order or a count). It also has some built-in entity types that it can understand, such as temperatures, URLs, emails and durations.
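As a rough illustration of roles, the sketch below shows two entities of the same type that mean different things in one instruction. The `Entity` structure, the sample sentence and `value_for_role` are assumptions made up for this example, not Wit.ai’s actual data format.

```python
from typing import List, NamedTuple, Optional


class Entity(NamedTuple):
    kind: str   # the entity type, e.g. "number"
    role: str   # what the value means in this instruction
    value: int


# Hypothetical parse of "Order 3 pizzas for my 8 year old".
entities: List[Entity] = [
    Entity(kind="number", role="count", value=3),
    Entity(kind="number", role="age", value=8),
]


def value_for_role(found: List[Entity], role: str) -> Optional[int]:
    """Return the value of the first entity playing the given role."""
    for entity in found:
        if entity.role == role:
            return entity.value
    return None
```

Both entities are numbers, but looking them up by role lets the app tell the pizza count apart from the age.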
A new feature in Wit.ai is the “Story” feature, which allows you to define typical conversations in a new way. You can set up the initial question, like “What’s the weather in Sydney?” and then define the steps and subsequent questions that the system should ask. It has the concept of “branches” which move the conversation in different ways if the system doesn’t get all the required information up front (e.g. if the user instead says “What’s the weather?”).
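The branching idea can be sketched in a few lines of Python. This is only an illustration of the concept under simple assumptions, not how Wit.ai implements stories: `weather_story` and the `location` key are hypothetical, and the weather lookup itself is left as a placeholder.

```python
from typing import Dict, Optional


def weather_story(entities: Dict[str, str]) -> str:
    """One step of a hypothetical weather conversation.

    If the location is missing, branch into a follow-up question;
    otherwise continue to the answer step.
    """
    location: Optional[str] = entities.get("location")
    if not location:
        return "Which city would you like the weather for?"
    # A real bot would call a weather service here; this is a placeholder.
    return f"Looking up the weather in {location}..."
```

With “What’s the weather in Sydney?” the entities would include a location and the conversation moves straight on; with “What’s the weather?” the missing location sends it down the follow-up branch.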
Api.ai is a chatbot API that provides similar capabilities to Wit.ai, with intents and entities. Its machine learning can sometimes work out what someone means when they use a slightly different phrase from the ones you’ve hardcoded into your assistant. Api.ai was purchased by Google in September 2016 and is now one of the main ways to build conversational interfaces for Google’s Home platform.
One key focus of Api.ai that differs from Wit.ai is its “Domains”. Domains are a whole collection of knowledge and data structures from Api.ai that are ready for use in every Api.ai agent (apps are called “agents” in Api.ai). Domains can include knowledge of common verbs and content types. As an example, it understands the different types of data that a request of “Book restaurant” needs, compared to “Book hotel”. It has a range of real information about encyclopedia-like topics such as history, word definitions, people of significance (e.g. celebrities, writers, characters), movies, stock prices and a lot more.
Api.ai is advertised as free to use, but its website is a little misleading at the moment: as of 2016 it isn’t completely free. Most of the “Domains” now require your account to be upgraded, and the price for this isn’t clear (developers will need to contact Api.ai’s sales team). Api.ai also still has a paid enterprise option, which allows the whole service to be run on a private cloud internally and offers more from their services team. This is potentially valuable if your usage needs to be totally private.
If you’d like to give Api.ai a try, I’ve got a series on getting started with Api.ai here at SitePoint. Just keep in mind that the domains now require a paid account, so the example I put together no longer answers every question, as I don’t have a paid account.
If you’d rather do more of the programming side of the AI yourself and you are a fan of Raspberry Pi, you could look into Melissa. Melissa is an open source personal assistant written in Python that runs on Raspberry Pi, Windows, OS X and Linux. It’s updated quite frequently and quite a few people speak very highly of it!
Melissa has always-on voice control and a range of sample dialogues out of the box, including things like taking notes, telling your horoscope, getting definitions from Wikipedia, playing music and more. For the Python developer who wants total control, Melissa might just be for you! To find out more about how it is put together, see the book by Tanay Pant, its main developer, which covers it in more detail and serves as the detailed documentation for Melissa. I actually spoke with him all about Melissa at the start of the year. He’s done a lot of work on it!
One service from a completely different perspective is Clarifai, an artificial intelligence service that can recognize image and video content. It has its own deep learning engine that continuously improves with every use. If you are keen to take your AI prototype to a whole new level, why not give it the ability to see and recognize objects? It can do all sorts of things, from tagging images and searching for visually similar images to flagging inappropriate ones. If you want to go even further, you can teach the platform entirely new concepts by training it with your own examples.
To integrate this into your own applications, Clarifai has a REST API that can be used from your preferred language, along with Python, Java and Node.js APIs. Their service is free for up to 5,000 uses a month. I’ve got a guide on using Clarifai here at SitePoint for those who’d like to give it a go: How to Make Your Web App Smarter with Image Recognition.
If you want to go beyond services that do a lot of the heavy lifting for you and build true artificial intelligence systems relatively from scratch, Google’s TensorFlow might be the option for you! While it’s something that will take longer to put together, you’ll learn a lot more about deep learning and artificial intelligence. TensorFlow is “an open source software library for numerical computation using data flow graphs”. It is best suited to things like training your own image recognition system or learning to do language processing. You could also make conversational AI with TensorFlow that is trained on specific data, such as SpeakEasy AI, a chatbot built on a neural model trained on millions of comments from Reddit.
There’s no limit to the sorts of things you could get a TensorFlow-powered program to do. One developer trained it to write new episodes of the hit 90s show Friends.
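If the phrase “data flow graphs” feels abstract, the pure-Python sketch below may help. It is not TensorFlow and does not use its API: `Node`, `placeholder` and `constant` are hypothetical stand-ins that only illustrate the general idea of describing a computation as a graph first and pushing data through it afterwards.

```python
from typing import Callable, Dict, Tuple


class Node:
    """A node in a tiny data flow graph: an operation plus its input nodes."""

    def __init__(self, op: Callable[..., float], *inputs: "Node") -> None:
        self.op = op
        self.inputs: Tuple["Node", ...] = inputs

    def evaluate(self, feed: Dict["Node", float]) -> float:
        # Nodes listed in the feed take their value directly from it.
        if self in feed:
            return feed[self]
        args = [node.evaluate(feed) for node in self.inputs]
        return self.op(*args)


def placeholder() -> Node:
    # Has no value of its own; one must be fed in at evaluation time.
    def missing() -> float:
        raise ValueError("placeholder was not fed a value")
    return Node(missing)


def constant(value: float) -> Node:
    return Node(lambda: value)


# Build the graph for y = w * x + b without computing anything yet.
x = placeholder()
w = constant(3.0)
b = constant(1.0)
product = Node(lambda left, right: left * right, w, x)
y = Node(lambda left, right: left + right, product, b)

# Only now does data flow through the graph.
result = y.evaluate({x: 2.0})
```

Here `result` is 7.0, and the same graph can be evaluated again with a different value fed in for `x`. Real libraries add a great deal on top of this idea, such as working out gradients for training.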
There is a range of services and APIs out there that can provide artificial intelligence, personal assistants, chatbots and more. You don’t need to be a computer science expert to implement some of the core basics in your own apps! Try out a few of the above and see what you can create. If you feel super confident, go straight for TensorFlow and make something seriously mindblowing.
If you do put together your own AI prototype using any of the above services, or you’ve had some experience with the above or a service I did not mention — please share it in the comments or get in touch with me on Twitter (@thatpatrickguy). I’d love to hear about it!
PatCat is the founder of Dev Diner, a site that explores developing for emerging tech such as virtual and augmented reality, the Internet of Things, artificial intelligence and wearables. He is a SitePoint contributing editor for emerging tech, an instructor at SitePoint Premium and O'Reilly, a Meta Pioneer and freelance developer who loves every opportunity to tinker with something new in a tech demo.