The Dawn of WebRTC


Web Real-Time Communications (WebRTC) was built to give developers the ability to create high-definition video and audio calls using simple JavaScript APIs. These APIs are embedded directly in the browser and require no plugins, downloads, or installation of any kind to get you up and running. Google spent about $200 million to open-source the technology and hand it to the development community. WebRTC uses royalty-free video and audio codecs, giving anyone the ability to create next-generation communication apps without paying for licensing or royalties.

What are the Possibilities?

We have only begun to scratch the surface of how WebRTC will change the communications industry, and all kinds of applications are already being built with it. One of the most iconic examples is Amazon’s Mayday button, which shows how WebRTC is being harnessed by companies large and small. WebRTC gives you many ways to enhance your apps, such as:
  • Video Communications: Create secure, high-definition audio and video streams between browsers
  • File Sharing and Messaging: Securely connect and share data between browsers without uploading files to the cloud or a network server; data is sent directly between the connected peers
  • Phone to Browser: WebRTC allows connections between the Public Switched Telephone Network (PSTN) and browsers. Using the new HTML5 APIs, a SIP gateway, and WebRTC, you can make and receive calls all from one location
  • Mobile to Mobile: WebRTC is not just for the web; native libraries for both iOS and Android expose its capabilities
  • Machine to Machine: WebRTC can be embedded in systems that need to communicate machine to machine, such as the Internet of Things. Google Chromecast is a perfect example of WebRTC being used outside the normal use case

Understanding the WebRTC APIs

WebRTC relies on three JavaScript APIs embedded directly in web browsers, requiring no client or browser plugin to communicate with another WebRTC-enabled browser. These APIs are:
  • MediaStream (aka getUserMedia) gives you access to the camera, microphone, or screen of the user’s device. As an added layer of security, the user must grant access before you are allowed to stream their media. Over a secure connection (HTTPS) the user only needs to grant access once per application, but over a non-secure connection (HTTP) the user is prompted every time the application needs access
  • RTCPeerConnection (aka PeerConnection) allows two users to communicate directly, peer to peer. It handles encoding and decoding the media sent between your local machine and the remote peer.
  • RTCDataChannel (aka DataChannel) represents a bi-directional data channel between two peers. It piggybacks on the RTCPeerConnection, allowing you to send data directly and securely between the two connected peers. A minimal sketch of how these pieces fit together follows this list.
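
The sketch below shows one peer’s side of a data-channel connection, assuming the unprefixed, promise-based RTCPeerConnection available in newer browsers (older builds used webkitRTCPeerConnection or mozRTCPeerConnection and callbacks). sendToRemotePeer() is a hypothetical signaling helper, since WebRTC leaves signaling entirely up to you:
// One peer's side of an RTCPeerConnection with a data channel (sketch only).
// sendToRemotePeer() is a hypothetical signaling function, e.g. over WebSockets.
var pc = new RTCPeerConnection();

// Create the channel before the offer so it is included in the negotiation
var channel = pc.createDataChannel('chat');
channel.onopen = function() { channel.send('Hello, peer!'); };
channel.onmessage = function(event) { console.log('Received:', event.data); };

// Trickle ICE candidates to the remote peer through your signaling server
pc.onicecandidate = function(event) {
   if (event.candidate) {
      sendToRemotePeer({ candidate: event.candidate });
   }
};

// Create and send the offer; the remote peer answers over the same signaling channel
pc.createOffer().then(function(offer) {
   return pc.setLocalDescription(offer);
}).then(function() {
   sendToRemotePeer({ sdp: pc.localDescription });
});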

Getting Started with WebRTC

We are going to start off with a simple photo booth app that allows you to capture an image using your webcam and apply some CSS filters to the captured image. It’ll teach you the basics of getting started with WebRTC using the MediaStream API. It is a slightly modified version of the sample app that the Google team created.

HTML

In the HTML code below you will see the basics needed to create your first WebRTC web application. WebRTC utilizes the HTML5 `video` element to render local and remote video streams. In addition, we use the `canvas` element to take a snapshot of our local video stream and apply a filter to it:
<div class="m-content">
   <h1>getUserMedia + CSS filters demo</h1>

   <div class="photo-booth">
      <!-- local video stream will be rendered to the video tag -->
      <video autoplay></video>
      <!-- a copy of the stream will be made and css filters applied  -->
      <canvas></canvas>
   </div>
   <div class="buttons">
      <!-- call getUserMedia() to access webcam and give permission -->
      <button id="start">Access Webcam</button>
      <!-- take a snapshot from your webcam and render it to the canvas tag -->
      <button id="snapshot">Take a Snapshot</button>
      <!-- sort through the available css filters -->
      <button id="filter">Change Filter</button>
   </div>
</div>

JavaScript

The navigator.getUserMedia() method, provided by the MediaStream (getUserMedia) API, lets you retrieve a media stream from your users. At the time of this writing, it needs to be resolved from the different vendor prefixes to make this application work across all WebRTC-compatible browsers: Chrome, Firefox, and Opera. We can achieve this with the following code:
navigator.getUserMedia = navigator.getUserMedia ||
                         navigator.webkitGetUserMedia ||
                         navigator.mozGetUserMedia;
Next we define the constraints that we’ll pass to navigator.getUserMedia(); these determine which media types we are requesting. In this example we only request access to the user’s webcam by setting video: true.
var constraints = { audio: false, video: true };
Below we select the HTML elements used by the demo application and store them in variables.
var start   = document.querySelector('#start');
var snapshot = document.querySelector('#snapshot');
var filter  = document.querySelector('#filter');
var video   = document.querySelector('video');
var canvas  = document.querySelector('canvas');
Next we need to create an array for the filters that we’ll apply to the snapshot. We’ll define the filters in our CSS code, described in the next section, using the same names:
var filters = ['blur', 'brightness', 'contrast', 'grayscale',
               'hue', 'invert', 'saturate', 'sepia'];
Time for the real fun! We add a click event listener to our start button that calls navigator.getUserMedia(constraints, success, error) to request access to the webcam. Permission must be granted before the webcam can be accessed, and each browser vendor displays the permission prompt for the user’s webcam and microphone differently.
start.addEventListener('click', function() {
    navigator.getUserMedia(constraints, success, error);
});
Once the user has granted permission to access the webcam, we pass the stream object to the HTML5 video tag as its source.
function success(stream) {
   /* hide the start button */
   start.style.display = 'none';

   /* show the snapshot button */
   snapshot.style.display = 'block';

   /* show the filter button */
   filter.style.display = 'block';

   /* attach the webcam stream to the video element */
   if (window.URL) {
      video.src = window.URL.createObjectURL(stream);
   } else {
      video.src = stream;
   }
}
If an error occurs while accessing the user’s webcam, or permission is denied, the error callback fires and the error is printed to the console.
function error(e) {
   console.log('navigator.getUserMedia error: ', e);
}
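Newer browsers also expose a promise-based version of this API on navigator.mediaDevices and let you attach the stream with video.srcObject instead of an object URL. A minimal sketch of the same success/error flow, assuming those newer, unprefixed APIs are available:
// Promise-based equivalent of the flow above (sketch only), assuming the browser
// supports navigator.mediaDevices.getUserMedia() and video.srcObject.
navigator.mediaDevices.getUserMedia(constraints)
   .then(function(stream) {
      video.srcObject = stream; // no object URL needed
   })
   .catch(function(e) {
      console.log('getUserMedia error: ', e);
   });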
Next we create a simple click handler to cycle through our CSS filters. It finds the index of the current filter class and applies the next one to both the video and canvas elements.
filter.addEventListener('click', function() {	
   var index = (filters.indexOf(canvas.className) + 1) % filters.length;
   video.className = filters[index];
   canvas.className = filters[index];		
});
Lastly we take a snapshot of our local video stream and render it to the canvas.
snapshot.addEventListener('click', function() {
   canvas.width = 360;
   canvas.height = 270;
   canvas.getContext('2d').drawImage(video, 0, 0, canvas.width, canvas.height);
});
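If you also want to let users keep their snapshot, the canvas can be exported as an image. A small sketch using the standard canvas.toDataURL() method (the temporary download link is just one way to offer the file):
// Export the current canvas contents as a PNG and offer it as a download (sketch only)
function saveSnapshot() {
   var link = document.createElement('a');
   link.href = canvas.toDataURL('image/png');
   link.download = 'snapshot.png';
   link.click();
}
Note that the CSS filter classes defined below are applied to the canvas element itself, so they will not appear in the exported pixels unless you also apply the filter while drawing, for example via the drawing context’s filter property.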

CSS

Below you will find the basics for styling your first WebRTC application.
body
{
   font-family: 'Open Sans', sans-serif;
   background-color: #e4e4e4;
}

h1
{
   width: 780px;
   margin-left: 20px;
   float: left;
}

.m-content
{
   width: 800px;
   height: 310px;
   margin: auto;
}

.photo-booth
{
   width: 800px;
   height: 310px;
   float: left;
}
WebRTC allows two ways of defining the size of your video stream. You can define it in the constraints variable that you pass to navigator.getUserMedia(constraints, success, error), or you can define it in your CSS. In this example we are using CSS to define the dimensions of our local video stream and canvas elements (a sketch of the constraints approach follows the CSS below).
video
{
   width: 360px;
   height: 270px;
   float: left;
   margin: 20px;
   background-color: #333;
}

canvas
{
   width: 360px;
   height: 270px;
   float: left;
   margin: 20px;
   background-color: #777;
}
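If you would rather request the dimensions through the constraints instead, the sketch below shows roughly what that looks like. At the time of writing Chrome used the mandatory/minWidth syntax, while the emerging specification uses plain width and height properties, so treat the exact syntax as something to verify against your target browsers:
// Requesting a specific video size through constraints instead of CSS (sketch only)
var legacyConstraints = {
   audio: false,
   video: { mandatory: { minWidth: 360, minHeight: 270 } } // older Chrome syntax
};

var standardConstraints = {
   audio: false,
   video: { width: 360, height: 270 } // standard syntax
};

// navigator.getUserMedia(legacyConstraints, success, error);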
Next we give our buttons a little flair. We will hide the filter and snapshot buttons until we have gained access to the webcam using getUserMedia().
.buttons
{
   margin-left: 20px;
   float: left;
}

button
{
   background-color: #d84a38;
   border: none;
   border-radius: 2px;
   color: white;
   font-family: 'Open Sans', sans-serif;
   font-size: 0.8em;
   margin: 0 0 1em 0;
   padding: 0.5em 0.7em 0.6em 0.7em;
}

button:active
{
   background-color: #cf402f;
}

button:hover
{
   background-color: #cf402f;
   cursor: pointer;
}

#filter, #snapshot
{
   display: none;
   margin-right: 20px;
   float: left;
}
Next we define the photo booth’s filters using CSS. You can find a list of supported filters on the related MDN page.
.blur
{
   filter: blur(2px);
   -webkit-filter: blur(2px);
}

.grayscale
{
   filter: grayscale(1);
   -webkit-filter: grayscale(1);
}

.sepia
{
   filter: sepia(1);
   -webkit-filter: sepia(1);
}

.brightness
{
   filter: brightness(2.2);
   -webkit-filter: brightness(2.2);
}

.contrast
{
   filter: contrast(3);
   -webkit-filter: contrast(3);
}

.hue
{
   filter: hue-rotate(120deg);
   -webkit-filter: hue-rotate(120deg);
}

.invert
{
   filter: invert(1);
   -webkit-filter: invert(1);
}

.saturate
{
   filter: saturate(5);
   -webkit-filter: saturate(5);
}
If you are familiar with MailChimp, you may have noticed that you can set your profile picture using your webcam. MailChimp added that simple but effective feature using WebRTC in much the same way as this demo. The code developed in this article is available on GitHub, and you can view a live demo of the photo app at the WebRTC Challenge.

Compatibility

So you may be wondering about the availability of WebRTC across browser vendors and mobile devices. As it stands today, WebRTC is only available in the desktop versions of Chrome, Firefox, and Opera, and in mobile browsers on Android. WebRTC is not yet available in mobile browsers on iOS, but you can use native libraries to build your iOS and Android applications. Microsoft is actively pushing Object Real-Time Communication (ORTC), which is currently planned to be part of the WebRTC 1.1 specification. Until then, there is a workaround using Temasys’s hosted open-source plugin for support in Internet Explorer and Safari. Ericsson supports WebRTC with its “Bowser” app, which you can download from the Apple App Store; it is part of their new framework OpenWebRTC, a cross-platform WebRTC client framework based on GStreamer. A handy tool to check the status of your favorite browser is the website iswebrtcreadyyet.com.
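Until support evens out, it is worth feature-detecting WebRTC before relying on it. A minimal sketch covering the prefixed and unprefixed entry points discussed above:
// Rough feature detection for the two main WebRTC entry points (sketch only)
var hasGetUserMedia = !!(navigator.getUserMedia ||
                         navigator.webkitGetUserMedia ||
                         navigator.mozGetUserMedia ||
                         (navigator.mediaDevices && navigator.mediaDevices.getUserMedia));

var hasPeerConnection = !!(window.RTCPeerConnection ||
                           window.webkitRTCPeerConnection ||
                           window.mozRTCPeerConnection);

if (!hasGetUserMedia || !hasPeerConnection) {
   console.log('WebRTC is not fully supported in this browser.');
}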

WebRTC Resources

Web Real-Time Communications is an exciting technology that has opened the door for innovation. Developers can now enhance user experiences and provide contextual information in their applications. Below are some resources where you can find more information about WebRTC, as well as free services you can use for simple meetings or conversations with a friend.

WebRTC Challenge

If you are up for learning more about the WebRTC ecosystem head over to the WebRTC Challenge. It is a new initiative started by the team at Blacc Spot Media to introduce and educate developers across the web and mobile communities about the benefits and capabilities of WebRTC.

Conclusion

This is only a glimpse of the power and capabilities of Web Real-Time Communications (WebRTC). As we continue this series, we will dive deeper into the ins and outs of WebRTC, where it stands in the market, and how companies large and small have already started to harness its power. It is important to remember that WebRTC is NOT an out-of-the-box solution but a tool that allows you to enhance your applications. Stay tuned for more!

Frequently Asked Questions (FAQs) about WebRTC

What is the significance of WebRTC in real-time communication?

WebRTC, short for Web Real-Time Communication, is a free, open-source project that provides web browsers and mobile applications with real-time communication via simple APIs. It allows audio and video communication to work inside web pages by allowing direct peer-to-peer communication, eliminating the need to install plugins or download native apps. Supported by Google, Microsoft, Mozilla, and Opera, WebRTC is being standardized by the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF).

How does WebRTC handle security and privacy?

WebRTC is designed with robust security measures. All WebRTC components, including signaling mechanisms, need to operate over secure origins (HTTPS). The APIs also mandate encryption of all data sent over the network, such as the media and data channels. This ensures that the communication is secure and private, protecting users from potential eavesdropping and hacking attempts.

How can I implement virtual backgrounds in WebRTC?

Implementing virtual backgrounds in WebRTC involves using technologies like TensorFlow.js for machine learning and body segmentation. You can use the bodyPix model from TensorFlow.js to separate the person from the background in the video. Once the segmentation is done, you can replace the background with any image or video of your choice. This process requires a good understanding of JavaScript and machine learning concepts.
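As a rough illustration, the sketch below uses the bodyPix model mentioned above (assuming the @tensorflow/tfjs and @tensorflow-models/body-pix packages and their 2.x API) to cut the person out of each frame; compositing a replacement background behind the transparent pixels is then ordinary canvas work:
// Person/background segmentation with body-pix (sketch only).
// Assumes bodyPix 2.x is loaded and the canvas matches the video's resolution.
async function segmentLoop(video, canvas) {
   const net = await bodyPix.load();
   const ctx = canvas.getContext('2d');

   async function renderFrame() {
      const segmentation = await net.segmentPerson(video);
      ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
      const frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
      // segmentation.data holds one value per pixel: 1 = person, 0 = background
      for (let i = 0; i < segmentation.data.length; i++) {
         if (segmentation.data[i] === 0) {
            frame.data[i * 4 + 3] = 0; // make background pixels transparent
         }
      }
      ctx.putImageData(frame, 0, 0);
      requestAnimationFrame(renderFrame);
   }
   renderFrame();
}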

Can I add blur effects to WebRTC video streams?

Yes, you can add blur effects to both sent and received WebRTC video streams. This involves manipulating the video stream using JavaScript and HTML5 canvas. You can draw the video frames onto a canvas, apply the blur effect, and then capture the result back into a video stream. This can be done for both local and remote video streams.
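A minimal sketch of that canvas approach, assuming the browser supports the drawing context’s filter property and canvas.captureStream():
// Blur a local video stream by routing it through a canvas (sketch only)
function blurStream(video, canvas) {
   var ctx = canvas.getContext('2d');

   function draw() {
      ctx.filter = 'blur(8px)'; // same syntax as the CSS filter property
      ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
      requestAnimationFrame(draw);
   }
   draw();

   // The blurred frames can be sent over a peer connection instead of the raw webcam stream
   return canvas.captureStream(30); // 30 frames per second
}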

How does WebRTC use insertable streams for background removal?

WebRTC uses insertable streams to manipulate the video content in real-time. This feature allows you to access the raw video frames and modify them before they are encoded or after they are decoded. You can use this feature to implement background removal by applying a machine learning model to the video frames and replacing the background pixels with transparency.
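At the time of writing this kind of frame-level access is Chromium-specific and still evolving, so treat the exact API surface as an assumption to verify. A rough sketch using the MediaStreamTrackProcessor/MediaStreamTrackGenerator pair, where stream comes from getUserMedia() and transformFrame() is a hypothetical function returning a modified VideoFrame:
// Per-frame processing with insertable streams (sketch only, Chromium-only at the time of writing)
const [track] = stream.getVideoTracks();
const processor = new MediaStreamTrackProcessor({ track });
const generator = new MediaStreamTrackGenerator({ kind: 'video' });

processor.readable
   .pipeThrough(new TransformStream({
      async transform(frame, controller) {
         const processed = await transformFrame(frame); // e.g. background removal
         frame.close(); // release the original frame
         controller.enqueue(processed);
      }
   }))
   .pipeTo(generator.writable);

// The generator behaves like a normal video track and can be placed in a MediaStream
const processedStream = new MediaStream([generator]);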

How can I edit live video backgrounds with WebRTC and TensorFlow.js?

Editing live backgrounds combines the same two pieces as virtual backgrounds: TensorFlow.js (for example the bodyPix model) segments the person from the background in each frame, and WebRTC carries the resulting stream in real time. Once the segmentation is done, you composite any image or video you like behind the person and send the edited frames to the remote peer. This process requires a good understanding of JavaScript and machine learning concepts.

What are the limitations of WebRTC?

While WebRTC is a powerful technology, it does have some limitations. It requires a stable and high-speed internet connection for optimal performance. Also, while it is supported by most modern browsers, there may be compatibility issues with older browsers or certain mobile devices. Additionally, implementing advanced features like virtual backgrounds or blur effects requires a good understanding of JavaScript and machine learning.

Can WebRTC be used for multi-party video conferencing?

Yes, WebRTC can be used for multi-party video conferencing. However, this requires a more complex setup involving a signaling server and possibly a media server or a peer-to-peer mesh network. The exact architecture will depend on the specific requirements of the application, such as the number of participants and the available network resources.
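In the simplest peer-to-peer mesh, each participant keeps one RTCPeerConnection per other participant. A rough sketch of that bookkeeping, where config, localStream, and the signaling helpers are assumed to exist elsewhere:
// One RTCPeerConnection per remote participant (sketch only).
// config, localStream, attachRemoteStream(), and sendSignal() are assumed to exist elsewhere.
var peers = {};

function connectTo(participantId) {
   var pc = new RTCPeerConnection(config);
   peers[participantId] = pc;

   // Send our local audio/video tracks to this participant
   localStream.getTracks().forEach(function(track) {
      pc.addTrack(track, localStream);
   });

   // Render whatever this participant sends back
   pc.ontrack = function(event) {
      attachRemoteStream(participantId, event.streams[0]);
   };

   pc.onicecandidate = function(event) {
      if (event.candidate) {
         sendSignal(participantId, { candidate: event.candidate });
      }
   };

   return pc;
}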

How does WebRTC handle network connectivity issues?

WebRTC includes several mechanisms to handle network connectivity issues. It uses the ICE framework to establish the best possible network path between peers, including handling NAT traversal. It also includes mechanisms for congestion control and bandwidth estimation to optimize the quality of the video and audio streams based on the current network conditions.
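ICE is configured when the peer connection is created, by listing the STUN and TURN servers it may use. A minimal sketch (the server URLs and credentials below are placeholders, not real servers):
// ICE configuration passed to the peer connection (sketch only; placeholder servers)
var configuration = {
   iceServers: [
      { urls: 'stun:stun.example.com:3478' }, // STUN: discovers your public address
      {
         urls: 'turn:turn.example.com:3478',  // TURN: relays media when a direct path fails
         username: 'demo-user',
         credential: 'demo-password'
      }
   ]
};

var pc = new RTCPeerConnection(configuration);

// You can watch ICE work through the connection's state events
pc.oniceconnectionstatechange = function() {
   console.log('ICE connection state:', pc.iceConnectionState);
};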

Is WebRTC suitable for mobile applications?

Yes, WebRTC is suitable for mobile applications. It is supported by both Android and iOS, and there are libraries available for both platforms. This allows you to build real-time communication features into your mobile apps, including video and audio calls, chat, and data transfer. However, as with web applications, implementing WebRTC in mobile apps requires a good understanding of the technology and the specific platform APIs.

About the Author

Lantre Barr is the Founder & CEO of Blacc Spot Media (@blaccspotmedia), a mobile and web development company based in Atlanta, GA, specializing in web real-time communications (WebRTC).
