What are Media Capture and Streams and How Do I Use Them?

By Rami Sarieddine

This article is part of a web development series from Microsoft. Thank you for supporting the partners who make SitePoint possible.

When using web technologies to develop apps, be it on the web or mobile, there are times when you would want to use local multimedia devices, such as microphones or video cameras. An example would be to allow users to stream or take photos of themselves from the local video camera. To give you a little background, audio/video capture and streaming on the web used to depend largely on browser plugins (Flash or Silverlight). As browsers pulled the plug on plugins, HTML5, hailed as the savior, brought access to device capabilities to the web, from Geolocation (GPS) and WebGL (GPU) to the Web Audio API (audio hardware), amongst many others.

These powerful features expose high level JavaScript APIs that talk to the system’s underlying hardware capabilities.

Let’s start with HTML Media Capture which, per the specification, is defined as a form extension that facilitates access to a device’s media capture mechanism, such as a camera or microphone, from within a file upload control.

At its heart, HTML Media Capture extends the HTMLInputElement interface with a capture attribute. A basic example would be: <input type="file" capture>.

The capture attribute requests that the media capture tool (camera, microphone, etc.) be used to capture media on the spot.

Here is a simple declarative example to illustrate its use. The following shows a file input using capture alongside the accept attribute, which hints at the preferred MIME type of the media the user should capture.

<input type="file" accept="image/*" capture>

The HTML Media Capture extension was specifically designed to be simple and declarative, and it covers a subset of the media capture functionality of the web platform. However, this specification does not give authors detailed control over capture, nor does it allow access to real-time media streams from the hosting device. HTML Media Capture was the first shot at standardizing media capture on the web. It reuses the file input element and works by overloading it and adding new values for the accept parameter. So basically, it works like a charm, but it only allows you to record a media file or take a snapshot in time. Where HTML Media Capture fell short was real-time effects, such as rendering live webcam data to a <canvas> element and applying some WebGL filters to it.
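To make the "snapshot in time" idea a little more concrete, here is a rough sketch of what a page might do once the user has captured a photo through the input shown earlier. The helper names (firstCapturedFile, attachCapturePreview) and the #preview image element are assumptions made for illustration; they are not part of the specification.

```javascript
// Returns the first file the user captured, or null if they cancelled.
function firstCapturedFile(input) {
  return input.files && input.files.length > 0 ? input.files[0] : null;
}

// Shows the captured image in an <img> element as soon as it is available.
function attachCapturePreview(input, img) {
  input.addEventListener('change', function () {
    var file = firstCapturedFile(input);
    if (file) {
      // The file is an ordinary File object, so an object URL can display it.
      img.src = URL.createObjectURL(file);
    }
  });
}

// Browser-only wiring; "#preview" is a hypothetical <img id="preview"> on the page.
if (typeof document !== 'undefined') {
  var captureInput = document.querySelector('input[type="file"][capture]');
  var previewImage = document.querySelector('#preview');
  if (captureInput && previewImage) {
    attachCapturePreview(captureInput, previewImage);
  }
}
```

Note that the page only ever sees the finished file here; there is no way to touch the camera feed while the user is framing the shot.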

And thus, we have Media Capture and Streams.

Media Capture and Streams is actually a set of JavaScript APIs. These APIs allow local media (audio and video) to be requested from a platform. In other words, they provide access to the user’s local audio and video input/output devices.

More specifically, we have the MediaStream API, which provides the means to control where multimedia stream data is consumed, and provides some control over the devices that produce the media. Additionally, the MediaStream API exposes information about the devices that are able to capture and render media.
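As an illustration of the device information mentioned above, later versions of the specification expose navigator.mediaDevices.enumerateDevices(), which resolves to a list of devices, each with a kind and a label. That method is not used elsewhere in this article, and the groupDevicesByKind helper is a hypothetical name of my own, so treat this as a sketch rather than the definitive way to do it.

```javascript
// Groups device labels by kind ("audioinput", "audiooutput", "videoinput").
function groupDevicesByKind(devices) {
  var groups = {};
  devices.forEach(function (device) {
    if (!groups[device.kind]) {
      groups[device.kind] = [];
    }
    // Labels are usually empty until the user has granted access to a device.
    groups[device.kind].push(device.label || '(unlabeled)');
  });
  return groups;
}

// Browser-only wiring: skipped wherever enumerateDevices is unavailable.
if (typeof navigator !== 'undefined' && navigator.mediaDevices &&
    typeof navigator.mediaDevices.enumerateDevices === 'function') {
  navigator.mediaDevices.enumerateDevices()
    .then(function (devices) {
      console.log(groupDevicesByKind(devices));
    })
    .catch(function (err) {
      console.log('Could not list devices: ' + err);
    });
}
```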

Why is it important? Here’s a history lesson for the future generations that might take this for granted. Media (audio/video) capture has been the “Nirvana” of web development for some time. Historically, we had to rely on browser plugins (Flash or Silverlight) to achieve this. Then came HTML5 to the rescue. HTML5 brought powerful features that allow native access to device hardware, from Geolocation (GPS) to WebGL (GPU) and much more. These features, which are now baked into the browser, expose high level JavaScript APIs that sit on top of the device’s hardware capabilities.

So the reason we would use it becomes obvious.

One of the most important methods in this API is getUserMedia() and it’s the gateway into that set of APIs. getUserMedia() provides the means to access the user’s local camera/microphone stream.

Feature detection is the best way to check whether it is supported, either directly with if (navigator.getUserMedia) or using Modernizr with if (Modernizr.getusermedia).
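The direct check can be sketched as a small function. The name detectGetUserMedia is my own, and besides the plain navigator.getUserMedia test described above, this version also looks for the vendor-prefixed variants that some browsers shipped and for the promise-based navigator.mediaDevices.getUserMedia found in later versions of the specification. Those extra checks go beyond what this article covers, so consider them assumptions.

```javascript
// Returns the name of the first getUserMedia variant found on `nav`,
// or null when none is available.
function detectGetUserMedia(nav) {
  if (!nav) {
    return null;
  }
  // Promise-based variant from later versions of the specification.
  if (nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === 'function') {
    return 'mediaDevices.getUserMedia';
  }
  // Callback-based variant, unprefixed first, then vendor-prefixed.
  var names = ['getUserMedia', 'webkitGetUserMedia', 'mozGetUserMedia'];
  for (var i = 0; i < names.length; i++) {
    if (typeof nav[names[i]] === 'function') {
      return names[i];
    }
  }
  return null;
}

var detected = detectGetUserMedia(typeof navigator !== 'undefined' ? navigator : null);
console.log(detected
  ? 'Capture is available through ' + detected
  : 'getUserMedia is not supported here');
```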

The basic syntax is:

navigator.getUserMedia(constraints, successCallback, errorCallback);

The constraints parameter is actually a MediaStreamConstraints object with two Boolean members: video and audio. These describe the media types supporting the LocalMediaStream object. At least one of them must be specified for the constraints argument to be valid. The LocalMediaStream object is the MediaStream object that the call to getUserMedia() delivers to the success callback. It has all the properties and events of the MediaStream object and the stop method.
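A note on the stop method: later revisions of the specification moved stopping from the stream itself to its individual tracks, and current browsers no longer offer stop() on the stream. A minimal sketch that copes with both shapes, assuming a hypothetical helper named stopCapture, could look like this:

```javascript
// Stops capturing on a stream, using whichever stop mechanism it offers.
function stopCapture(stream) {
  if (typeof stream.getTracks === 'function') {
    // Track-based shape: stop every audio and video track individually.
    stream.getTracks().forEach(function (track) {
      track.stop();
    });
  } else if (typeof stream.stop === 'function') {
    // Older shape described above: the stream itself has a stop method.
    stream.stop();
  }
}
```

You would typically call stopCapture(localMediaStream) once you no longer need the camera or microphone, so that the device is released.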

Setting the constraints for both audio and video would look like the following: { video: true, audio: true }
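Here is a small sketch of a few constraint objects, together with a check that mirrors the rule that at least one media type has to be requested. The variable names and the requestsSomeMedia helper are made up for illustration.

```javascript
// Ask for both the camera and the microphone.
var audioAndVideo = { video: true, audio: true };

// Ask for the camera only.
var videoOnly = { video: true, audio: false };

// Ask for the microphone only.
var audioOnly = { video: false, audio: true };

// Mirrors the rule above: at least one of video or audio must be requested.
function requestsSomeMedia(constraints) {
  return Boolean(constraints && (constraints.video || constraints.audio));
}

console.log(requestsSomeMedia(audioAndVideo));                  // true
console.log(requestsSomeMedia(videoOnly));                      // true
console.log(requestsSomeMedia({ video: false, audio: false })); // false
```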

The successCallback function will be called (on success) with the LocalMediaStream object that contains the media stream. You may assign that object to the appropriate element and work with it, as the basic example below shows.

The errorCallback will be invoked when an error arises. It is called with one of the following code arguments: “permission_denied”, “not_supported_error” or “mandatory_unsatisfied_error”.
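One way to turn those codes into something friendlier is a simple lookup. This is only a sketch: the messages are my own paraphrases, the describeCaptureError name is hypothetical, and it assumes the code arrives either as a plain string or on the error object's name property, which may not hold in every browser.

```javascript
// Messages for the three codes listed above (wording is illustrative).
var CAPTURE_ERROR_MESSAGES = {
  permission_denied: 'Access to the camera or microphone was not allowed.',
  not_supported_error: 'The requested media type is not supported.',
  mandatory_unsatisfied_error: 'No device satisfied the requested constraints.'
};

// Assumption: `err` is either the code string itself or an object with a `name`.
function describeCaptureError(err) {
  var code = typeof err === 'string' ? err : (err && err.name) || '';
  var key = String(code).toLowerCase();
  if (Object.prototype.hasOwnProperty.call(CAPTURE_ERROR_MESSAGES, key)) {
    return CAPTURE_ERROR_MESSAGES[key];
  }
  return 'Unexpected capture error: ' + code;
}
```

A function like this could be called from inside the errorCallback in place of logging the raw error.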

A basic example would be:


<!DOCTYPE html>
<html>
	<head>
	<meta charset="utf-8"/>
	<title></title>
	<script type="text/javascript">
		if (navigator.getUserMedia) {
            navigator.getUserMedia(
			// constraints
            {
                video: true,
                audio: true
            },
			// successCallback
			function (localMediaStream) {
				var video = document.querySelector('video');
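                // Newer browsers attach the stream through srcObject;
                // older ones expect an object URL instead.
                if ('srcObject' in video) {
                    video.srcObject = localMediaStream;
                } else {
                    video.src = window.URL.createObjectURL(localMediaStream);
                }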
				// do whatever you want with the video
                video.play();
            },
			// errorCallback
			function (err) {
                console.log("The following error occurred: " + err);
            });
        } else {
            alert("getUserMedia not supported by your web browser or Operating system version");
        }
	</script>
	</head>
	<body>
		<h2>Media Capture and Streaming</h2>
		<p>Let's stream our video!</p>
		<video autoplay></video>
	</body>
</html>

When you run this script, the browser will prompt you to use the webcam and microphone on your device. Here is an image that shows this example in action on a browser:

Media Capture and Streaming demo

From here on, you can get creative. Using getUserMedia, you can record, play, save and load streaming media. Then, if you want, you can apply some visualizations, effects and filters to the stream data.
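As one example of such a filter, the sketch below copies each frame of the live video onto a canvas and turns it grayscale. It assumes the page also contains a <canvas> element next to the <video>, which the example page above does not have, and the function names are mine. Averaging the three color channels is just one simple way to produce gray.

```javascript
// Converts RGBA pixel data to grayscale in place by averaging the color channels.
function toGrayscale(pixels) {
  for (var i = 0; i < pixels.length; i += 4) {
    var average = Math.round((pixels[i] + pixels[i + 1] + pixels[i + 2]) / 3);
    pixels[i] = average;     // red
    pixels[i + 1] = average; // green
    pixels[i + 2] = average; // blue
    // pixels[i + 3] is the alpha channel and is left untouched.
  }
  return pixels;
}

// Draws the current video frame to the canvas, filters it,
// and schedules the next frame while the video keeps playing.
function renderFilteredFrame(video, canvas) {
  if (video.paused || video.ended) {
    return;
  }
  var context = canvas.getContext('2d');
  context.drawImage(video, 0, 0, canvas.width, canvas.height);
  var frame = context.getImageData(0, 0, canvas.width, canvas.height);
  toGrayscale(frame.data);
  context.putImageData(frame, 0, 0);
  requestAnimationFrame(function () {
    renderFilteredFrame(video, canvas);
  });
}

// Browser-only wiring; the <canvas> element is an assumption about the page.
if (typeof document !== 'undefined') {
  var liveVideo = document.querySelector('video');
  var outputCanvas = document.querySelector('canvas');
  if (liveVideo && outputCanvas) {
    liveVideo.addEventListener('play', function () {
      renderFilteredFrame(liveVideo, outputCanvas);
    });
  }
}
```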

In terms of browser compatibility, the getUserMedia API is supported by the major modern browsers: Microsoft Edge, Chrome 21+, Opera 18+, and Firefox 17+. Surprisingly, Chrome did not recognize the standard, unprefixed getUserMedia function. Below is a screenshot from Chrome when I ran the site there.

Browsers not supporting getUserMedia API

In order to resolve this, I had to add the webkit vendor prefix. I ended up adding all the vendor prefixes to be on the safe side.

Here is the script I added, just before the if (navigator.getUserMedia) check:


navigator.getUserMedia = (navigator.getUserMedia ||
	navigator.webkitGetUserMedia ||
	navigator.mozGetUserMedia
);
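If you would rather not overwrite navigator.getUserMedia, the same fallback idea can be wrapped in a function that always hands back a promise. This is a sketch under a few assumptions: the requestUserMedia name is mine, and it prefers the promise-based navigator.mediaDevices.getUserMedia from later versions of the specification, which this article does not otherwise cover.

```javascript
// Requests a media stream and always returns a promise,
// whichever getUserMedia variant the browser provides.
function requestUserMedia(nav, constraints) {
  // Promise-based variant from later versions of the specification.
  if (nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === 'function') {
    return nav.mediaDevices.getUserMedia(constraints);
  }
  // Callback-based variant, unprefixed or vendor-prefixed.
  var legacy = nav.getUserMedia || nav.webkitGetUserMedia || nav.mozGetUserMedia;
  if (!legacy) {
    return Promise.reject(new Error('getUserMedia is not supported'));
  }
  return new Promise(function (resolve, reject) {
    // The method must be called with the navigator object as `this`.
    legacy.call(nav, constraints, resolve, reject);
  });
}

// Browser-only usage, mirroring the earlier example.
if (typeof navigator !== 'undefined' && typeof document !== 'undefined') {
  requestUserMedia(navigator, { video: true, audio: true })
    .then(function (stream) {
      var video = document.querySelector('video');
      if (video) {
        video.srcObject = stream;
      }
    })
    .catch(function (err) {
      console.log('The following error occurred: ' + err);
    });
}
```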

Microsoft Edge, in keeping with its promise to deliver a standards-based and interoperable web experience, supports the unprefixed version from build 10240 onward. You can check the support for Media Capture and Streams, and the getUserMedia API in particular, on the Platform Status section of their website. There you can also view the web standards roadmap for other implementations.

In closing, there is a great example on the Microsoft Edge website: https://dev.windows.com/en-us/microsoft-edge/testdrive/demos/microphone/. This demo shows Microphone Streaming & Web Audio working together. You can also check the code for this demo and others in the GitHub repo: https://github.com/MicrosoftEdge/EdgePortal-demos.

Happy Coding!

More Hands-on with Web Development

This article is part of the web development series from Microsoft evangelists and engineers on practical JavaScript learning, open source projects, and interoperability best practices including Microsoft Edge browser and the new EdgeHTML rendering engine.

