In my previous post I showed you how to set up a simple dash.js based player to play multi-DRM protected DASH videos. This blog post gives an introduction to actually creating such videos by yourself to enable playback on various platforms.
It is not a strict step-by-step tutorial, but it shows the most important parts, with examples, of creating a multi-DRM DASH video using only free and open source tools. To make the guide more useful, the video will have multiple quality levels (with different bitrates), all created with best practices in mind. Dash.js, like most players, automatically selects the best quality that can be played without buffering pauses or hiccups under the current network conditions, so providing several video streams in the DASH video often improves the customer experience.
This guide gives you enough information to create a video that is playable with dash.js, but since we are not using any fancy features (only what the DASH-IF IOP specification allows), it should be compatible with other DASH players as well. I am not going into the differences between operating systems or other low-level details. I also assume you own the source video to be converted to DASH.
The overall process
The process starts with one unprotected video in an MP4 container. This is the source video to be converted into multi-DRM protected DASH; think of it as any regular video or movie that has both video and audio. Its bitrate and resolution should be equal to or higher than those of the highest quality level you want in the output DASH. To give you an idea of what we are going to deal with, here are the main steps for getting from the source media to a multi-DRM DASH video:
- Split the source video file into two different .mp4 files – one containing only video track and another containing only audio track.
- Encode the file containing only the video track into as many quality levels as needed.
- Encrypt all the different video files and the audio file.
- Package all the required media files – all video files with different bitrate and audio file – into many segments (each in a separate file, representing a slice of the media for a specific duration) and also create a manifest. The DASH video is now ready.
Specification of the output
Using the guide as is will output a video with the following main technical characteristics:
Video stream encoding
- Codec: H.264 High
- Level: 4.2
- Frame rate: Keeping the same as the source video
- Bitrate mode: CBR (Constant bitrate)
- GOP mode: Closed
Video bitrates and frame sizes
- 700 kbps – 288p
- 1000 kbps – 360p
- 1500 kbps – 480p
- 2000 kbps – 576p
The width of the frames shall be proportional according to the aspect ratio of the source video.
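To make the width rule concrete, here is a small Python sketch of the arithmetic. The function name scaled_width and the 1920x1080 source size are my own assumptions for illustration; the rounding mirrors the trunc(oh*a/2)*2 expression used in the scale filter later in this post, which keeps the width an even number.

```python
def scaled_width(src_width: int, src_height: int, out_height: int) -> int:
    """Width for out_height that keeps the source aspect ratio, rounded down to an even number."""
    # Integer form of trunc(out_height * (src_width / src_height) / 2) * 2
    return (out_height * src_width) // (2 * src_height) * 2


# Hypothetical 1920x1080 source; the heights are the ones listed above.
for height in (288, 360, 480, 576):
    print(f"{scaled_width(1920, 1080, height)}x{height}")
```

For a 16:9 source this gives 512x288, 640x360, 852x480 and 1024x576. ffmpeg evaluates the expression with floating-point numbers, so treat this as an approximation of what the filter computes, not a replacement for it.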
Audio stream encoding
- Audio stream will be same as source audio.
The DASH output uses the dashavc264:live profile from DASH-IF IOP, with 4-second segments that are referenced in the manifest through a segment template.
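As a rough feel for what 4-second segments mean in practice, the sketch below estimates how many media segments one stream is split into. The function name segment_count and the 10-minute duration are assumptions for illustration only; the exact number of files also depends on how the packager handles the last segment and initialization data.

```python
import math


def segment_count(duration_seconds: float, segment_seconds: float = 4.0) -> int:
    """Approximate number of media segments for one stream of the given duration."""
    # The last segment may be shorter than segment_seconds, hence the ceiling.
    return math.ceil(duration_seconds / segment_seconds)


# Hypothetical 10-minute source video.
per_stream = segment_count(10 * 60)
print(per_stream)        # segments in one stream
print(per_stream * 5)    # four video streams plus one audio stream
```

With these assumed numbers each stream is cut into 150 segments, so expect several hundred small files in the output folder.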
The media is encrypted in the multi-DRM fashion conforming to Common Encryption (CENC) to enable it to be decrypted and played back on more than one platform. Since the final media embeds some metadata for both Widevine and PlayReady, the video can be consumed on Google Chrome (which supports only Widevine as the native DRM) and Internet Explorer 11 (which supports only PlayReady as the native DRM).
We are using two tools to achieve the end result:
- ffmpeg – to split the source media into separate video and audio files and to encode the video into several bitrates.
- MP4Box – to encrypt the video and audio files and package the encrypted media into a DASH format.
Both of them are open source and cross platform. Make sure you have them installed and preferably also in the system path to make executing them quick and easy.
Extracting audio and video from the source media
Let’s suppose you have a media file named source.mp4 which contains both video and audio, like a music video or movie. To extract audio and video from it, these commands do the magic:
ffmpeg -i source.mp4 -c:v copy -an video.mp4
ffmpeg -i source.mp4 -c:a copy -vn audio.mp4
Both of the files contain only one track – audio or video.
Encoding video to several bitrates
Several bitrates are important because network conditions can fluctuate during playback. Many players can automatically select the best possible quality level and switch between levels during playback to adapt to the varying bandwidth. Here is an example list of commands, each creating a separate .mp4 file:
ffmpeg.exe -i video.mp4 -an -c:v libx264 -preset veryslow -profile:v high -level 4.2 -b:v 2000k -minrate 2000k -maxrate 2000k -bufsize 4000k -g 96 -keyint_min 96 -sc_threshold 0 -filter:v "scale='trunc(oh*a/2)*2:576'" -pix_fmt yuv420p video-2000k.mp4
ffmpeg.exe -i video.mp4 -an -c:v libx264 -preset veryslow -profile:v high -level 4.2 -b:v 1500k -minrate 1500k -maxrate 1500k -bufsize 3000k -g 96 -keyint_min 96 -sc_threshold 0 -filter:v "scale='trunc(oh*a/2)*2:480'" -pix_fmt yuv420p video-1500k.mp4
ffmpeg.exe -i video.mp4 -an -c:v libx264 -preset veryslow -profile:v high -level 4.2 -b:v 1000k -minrate 1000k -maxrate 1000k -bufsize 2000k -g 96 -keyint_min 96 -sc_threshold 0 -filter:v "scale='trunc(oh*a/2)*2:360'" -pix_fmt yuv420p video-1000k.mp4
ffmpeg.exe -i video.mp4 -an -c:v libx264 -preset veryslow -profile:v high -level 4.2 -b:v 700k -minrate 700k -maxrate 700k -bufsize 1400k -g 96 -keyint_min 96 -sc_threshold 0 -filter:v "scale='trunc(oh*a/2)*2:288'" -pix_fmt yuv420p video-700k.mp4
The values that differ between the commands are the bitrate (-b:v, -minrate and -maxrate), the buffer size (-bufsize), the frame height in the scale filter and the output file name. Here’s a list of good practices to bear in mind while encoding to different bitrates:
- Values for b:v, minrate and maxrate are what specify the target bitrate and must be identical for constant bitrate encoding.
- bufsize should be 2 times the target bitrate.
- The filter:v argument calculates the frame width from the given height while keeping the width an even number. The height is the last number in the value, for example 576 in scale='trunc(oh*a/2)*2:576'.
- The GOP length should be target segment length (in seconds) times the frame rate (FPS). It is specified as the value of -g and -keyint_min parameters which in this example is 96, meaning that the commands expect the FPS of the source video to be about 24 and target DASH segment length 4 seconds. This article has a good explanation about GOP.
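The rules in this list are easy to get wrong when typing the commands by hand, so here is a minimal Python sketch that derives the dependent values and assembles the argument list for each quality level. The helper names gop_length and encode_args and the LADDER constant are my own; the flags and values are the ones from the commands above, and the default frame rate of 24 is the same assumption the commands make.

```python
from typing import List


def gop_length(fps: float, segment_seconds: int = 4) -> int:
    """GOP length in frames: segment length in seconds times the frame rate."""
    return round(fps * segment_seconds)


def encode_args(bitrate_kbps: int, height: int, fps: float = 24.0,
                segment_seconds: int = 4) -> List[str]:
    """ffmpeg arguments for one quality level, following the rules listed above."""
    gop = str(gop_length(fps, segment_seconds))
    rate = f"{bitrate_kbps}k"            # same value for -b:v, -minrate and -maxrate
    bufsize = f"{2 * bitrate_kbps}k"     # twice the target bitrate
    return [
        "ffmpeg", "-i", "video.mp4", "-an",
        "-c:v", "libx264", "-preset", "veryslow",
        "-profile:v", "high", "-level", "4.2",
        "-b:v", rate, "-minrate", rate, "-maxrate", rate,
        "-bufsize", bufsize,
        "-g", gop, "-keyint_min", gop, "-sc_threshold", "0",
        "-filter:v", f"scale='trunc(oh*a/2)*2:{height}'",
        "-pix_fmt", "yuv420p",
        f"video-{bitrate_kbps}k.mp4",
    ]


# (bitrate in kbps, frame height) pairs from the specification above.
LADDER = [(700, 288), (1000, 360), (1500, 480), (2000, 576)]

for bitrate, height in LADDER:
    print(" ".join(encode_args(bitrate, height)))
```

The sketch only prints the commands. If your source is not about 24 FPS, pass its real frame rate so that -g and -keyint_min follow; with fractional frame rates such as 23.976 the rounded GOP will not match the segment length exactly.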
Multi-DRM encryption with PlayReady and Widevine
The following process encrypts audio and video files so that the final encrypted DASH will be actually playable using the dash.js based player described here. The video will be encrypted with a specific demo key. We also add a specific known KeyID and a PlayReady license server URL to the video metadata so that the player knows where to request the license (which contains the decryption key). MP4Box is needed for encryption and it needs three main things for its input: a special XML file describing the protection specific values, the input file name and the output file name. The encryption command must be run separately for each of the different bitrate video files and for the audio file. The generic command itself looks like this:
mp4box.exe -crypt crypt.xml video-clear.mp4 -out video-encrypted.mp4
The crypt file specifies parameters regarding how the input file is going to be encrypted or what metadata the encrypted file is going to have. The main elements of it are:
- KeyID – An identifier of the encryption key.
- Key – The actual key the input is going to be encrypted with. This key is also needed by the player to decrypt the media.
- Widevine PSSH – Protection system specific information for Widevine CDMs.
- PlayReady PSSH – Protection system specific information for PlayReady CDMs.
Creating the crypt.xml file can be quite a delicate process which is actually worth an article of its own. That’s why, for now, I will just show you a working crypt.xml file that instructs MP4Box to encrypt the media with a demo key from Axinom and that also provides the KeyID (which is required in the DRM key server to identify the key) along with PlayReady and Widevine signaling (PSSH).
Provided that you have encoded all the video bitrates, here are the commands to run to encrypt the video and audio with information stored in crypt.xml:
mp4box.exe -crypt crypt.xml video-700k.mp4 -out video-700k-encrypted.mp4
mp4box.exe -crypt crypt.xml video-1000k.mp4 -out video-1000k-encrypted.mp4
mp4box.exe -crypt crypt.xml video-1500k.mp4 -out video-1500k-encrypted.mp4
mp4box.exe -crypt crypt.xml video-2000k.mp4 -out video-2000k-encrypted.mp4
mp4box.exe -crypt crypt.xml audio.mp4 -out audio-encrypted.mp4
For each unencrypted input file there is an encrypted counterpart now.
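Since the encryption command is the same for every file apart from the names, it can be generated in a loop. The following Python sketch is one way to do that; the helper names encrypted_name and crypt_args are my own, and it assumes the "-encrypted" naming convention and the mp4box.exe executable name used above (on other systems the binary is usually called MP4Box).

```python
from pathlib import Path
from typing import List


def encrypted_name(clear_name: str) -> str:
    """Output file name for an input file, e.g. audio.mp4 -> audio-encrypted.mp4."""
    path = Path(clear_name)
    return f"{path.stem}-encrypted{path.suffix}"


def crypt_args(clear_name: str, crypt_file: str = "crypt.xml") -> List[str]:
    """MP4Box arguments that encrypt one file with the settings from crypt_file."""
    return ["mp4box.exe", "-crypt", crypt_file, clear_name,
            "-out", encrypted_name(clear_name)]


# The files produced by the previous steps.
INPUTS = ["video-700k.mp4", "video-1000k.mp4", "video-1500k.mp4",
          "video-2000k.mp4", "audio.mp4"]

for name in INPUTS:
    print(" ".join(crypt_args(name)))
```

This only prints the five commands shown above. To actually run them, each list could be passed to subprocess.run from the Python standard library.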
Packaging the encrypted media into DASH
To generate the final DASH video from the encrypted media files into a separate folder dash_protected, execute the following command:
mkdir dash_protected
mp4box.exe -dash 4000 -rap -frag-rap -sample-groups-traf -profile dashavc264:live -bs-switching no -segment-name dash_$RepresentationID$_$Number$ -url-template video-700k-encrypted.mp4 video-1000k-encrypted.mp4 video-1500k-encrypted.mp4 video-2000k-encrypted.mp4 audio-encrypted.mp4 -out "dash_protected/manifest.mp4"
As you can see, it requires the file names of all the media files, and no special parameter name is needed to differentiate between video and audio files. If you also have subtitle files packaged into an MP4 container, these can simply be appended to the list. Unencrypted DASH can be created using the same command with the same arguments; you just provide the unencrypted files as input. Now that the final video is created, you can try playback in the DRM enabled dash.js player.
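One practical detail about the packaging command: the segment name template contains $ characters, which a Unix-like shell would try to expand as variables unless the value is quoted. The Python sketch below builds the same command as an argument list, which sidesteps that problem. The names SEGMENT_TEMPLATE and dash_args are my own; the flags and values are copied from the command above.

```python
import shlex
from typing import List

# Must reach MP4Box literally, including the $ characters.
SEGMENT_TEMPLATE = "dash_$RepresentationID$_$Number$"


def dash_args(inputs: List[str],
              out_path: str = "dash_protected/manifest.mp4",
              segment_ms: int = 4000) -> List[str]:
    """MP4Box arguments that package the given files into DASH."""
    return [
        "mp4box.exe", "-dash", str(segment_ms), "-rap", "-frag-rap",
        "-sample-groups-traf", "-profile", "dashavc264:live",
        "-bs-switching", "no",
        "-segment-name", SEGMENT_TEMPLATE, "-url-template",
        *inputs,
        "-out", out_path,
    ]


ENCRYPTED = ["video-700k-encrypted.mp4", "video-1000k-encrypted.mp4",
             "video-1500k-encrypted.mp4", "video-2000k-encrypted.mp4",
             "audio-encrypted.mp4"]

args = dash_args(ENCRYPTED)
# Print a copy-pasteable form; shlex.quote wraps the template in single quotes.
print(" ".join(shlex.quote(arg) for arg in args))
```

Passing the list to subprocess.run without shell=True hands every argument to MP4Box unchanged. Swapping ENCRYPTED for the unencrypted file names gives the command for unprotected DASH.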
I hope this article gave you a practical starting point with good defaults in creating your own DASH videos, protected or not. Bear in mind that since the videos you protect using the provided crypt.xml will contain information specific to Axinom’s demo key servers, the videos you produce with it should not be used for production purposes. Good luck and have fun!