Media Services Glossary
Warning
Azure Media Services will be retired June 30th, 2024. For more information, see the AMS Retirement Guide.
Use this article to understand the concepts used by Media Services.
Note
This is a work in progress. Readers are welcome to contribute definitions.
Media Services Account - A Media Services account is an Azure resource which allows you to work with the Media Services product.
Adaptive bitrate streaming - Adaptive bitrate streaming is a method that allows media players to switch between encoded files that have different bitrates when available resources such as CPU and bandwidth change.
Advanced Audio Coding (AAC) - An audio coding standard for lossy digital audio compression.
Advanced Video Coding (AVC) - A video compression standard based on block-oriented, motion-compensated coding. Also known as H.264.
AOMedia Video 1 (AV1) - A royalty-free video compression format developed by the Alliance for Open Media (AOM).
Aspect ratio - The ratio between the width and height dimensions of a video, such as 16:9.
Asset - A Media Services asset is an Azure Storage block blob container. It contains all the files related to one piece of media such as MP4 files as well as manifests, captions, and other data. It can be used to store files for video on demand, encoding inputs and outputs, live streaming outputs, etc.
Asset filter - Applies a dynamic manifest filter that removes or restricts video or audio tracks in the HLS or DASH manifest. Filters can select tracks by bitrate, codec, resolution, language, time range, and more. An asset filter applies only to the asset it was created for, and lasts for the lifetime of that asset.
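For illustration, a minimal sketch of creating an asset filter with the azure-mgmt-media Python SDK; the resource names are placeholders, and the model and operation names are assumptions based on that package:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.media import AzureMediaServices
from azure.mgmt.media.models import AssetFilter, FirstQuality, PresentationTimeRange

client = AzureMediaServices(DefaultAzureCredential(), "<subscription-id>")

# Restrict the manifest to the first 30 seconds of the asset and hint that
# HLS players should start with the 1 Mbps rendition.
client.asset_filters.create_or_update(
    "my-rg", "myamsaccount", "my-asset", "introFilter",
    parameters=AssetFilter(
        presentation_time_range=PresentationTimeRange(
            start_timestamp=0,
            end_timestamp=300_000_000,  # 30 s at the default 10,000,000 ticks/s timescale
        ),
        first_quality=FirstQuality(bitrate=1_000_000),
    ),
)
```

A player then applies the filter by adding filter=introFilter to the manifest URL.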
Authentication token - Access tokens enable clients to securely call protected web APIs, and are used by web APIs to perform authentication and authorization. Per the OAuth specification, access tokens are opaque strings without a set format - some identity providers (IDPs) use GUIDs, others use encrypted blobs. For more information about tokens, see Microsoft identity platform access tokens.
Bitrate - The number of bits per second of content in an encoded video.
Buffering - Buffering happens when there is insufficient bandwidth to stream content with the selected bitrate. This can be due to local ISP issues, or wider area network issues.
Cache - A cache is a way to store data so that subsequent requests for the same data can be served more quickly.
Closed Captioning (CC) - The process of displaying either a verbatim or edited transcription of the audio in a video. It is used to enhance the accessibility of video content for the hearing impaired.
Clipping - The practice of taking a small segment of content from a larger video and creating a new video from it.
Common Media Application Format (CMAF) - A video packaging standard established by the Moving Picture Experts Group (MPEG) to reduce the complexity of publishing video media. This format uses the fragmented MP4 container to store and deliver small chunks of audio, video, and text when paired with an HLS or DASH manifest. In addition, the specification provides details on how content should be encrypted, and how to package closed captions, subtitles, and other advanced features to achieve compatibility between the HLS and DASH streaming ecosystem players.
Codec - A codec compresses or decompresses media such as audio or video. A codec can consist of two parts: an encoder that compresses the media file (encoding) and a decoder that decompresses the file (decoding). Some codecs include both parts, and other codecs only include one of them.
Common Encryption (CENC) - Also known as MPEG-CENC, a standard for encrypting and delivering both DASH and HLS video and audio.
Constant bitrate - Encoding a video so that the bitrate varies as little as possible from a target bitrate.
Content aware encoding - Adds logic that allows the encoder to seek an optimal bitrate value for a given resolution, but without requiring extensive computational analysis.
Content Decryption Module (CDM) - Software embedded in a web browser that decrypts encrypted digital rights management (DRM) content. The type of DRM system available can vary by browser and OS.
Content key policy - Used to configure how a content key is delivered to media clients.
Cross-origin Resource Sharing (CORS) - Content requests across domains (origins) are forbidden by default. CORS defines a way in which two domains can share content through HTTP requests that must be approved by the origin domain.
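For illustration, a simplified CORS preflight exchange between a player page and a media origin might look like this (hypothetical hostnames and path):

```
OPTIONS /video/manifest(format=mpd-time-csf) HTTP/1.1
Host: media.contoso.com
Origin: https://player.example.com
Access-Control-Request-Method: GET

HTTP/1.1 200 OK
Access-Control-Allow-Origin: https://player.example.com
Access-Control-Allow-Methods: GET, HEAD, OPTIONS
```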
Credential - A certificate or other authentication process that confirms a user or application's permission to access data.
Cropping - Cropping is the process of selecting a rectangular window within the video frame, and encoding just the pixels within that window.
Customer-managed key - We also refer to this as Bring Your Own Key (BYOK). A customer managed key is an encryption key provided by the customer rather than automatically created by Azure.
Dynamic Adaptive Streaming over HTTP (DASH) - An adaptive bitrate streaming technology created by MPEG that uses an XML-based manifest.
Decoding - Converting compressed video data back into uncompressed data (video frames).
Decode Time Stamp (DTS) - The time at which the decoder should decode the video frame.
Demuxing - Extracting signals (video, audio, text, metadata) from streams such as MP4 containers.
Descriptive audio - When a video has an additional audio track that describes the visual events taking place in the video. It enhances accessibility for the visually impaired.
Digital Rights Management (DRM) - A way of protecting content from being played unless certain criteria are met, such as allowing playback only during a period of time or only on certain devices. DRM technologies include Apple FairPlay and Microsoft PlayReady.
Digital Rights Management (DRM) license - Before a media player client can play content managed by digital rights, it must first retrieve a license. The license communicates the restrictions placed on the content by digital rights management.
Dolby Digital/AC-3 - An audio codec developed by Dolby Laboratories. Also known as AC-3.
Dolby Digital Plus/eAC-3 - An audio codec developed by Dolby Laboratories that is the successor to Dolby Digital/AC-3. It has better quality and lower bitrates. Also known as eAC-3.
Dynamic encryption - Media Services uses a content key to dynamically encrypt your content with the Advanced Encryption Standard (AES-128) or either of two major digital rights management (DRM) systems: Microsoft PlayReady and Apple FairPlay. The encryption methods and schemes are described in the Common Encryption standard from MPEG.
Dynamic packaging - With Media Services, a streaming endpoint (origin) represents a dynamic (just-in-time) packaging and origin service that can deliver your live and on-demand content directly to a client player. It uses one of the common streaming media protocols: HLS, DASH, or Smooth Streaming.
Dynamic manifest - As part of dynamic packaging, the streaming client manifests such as HLS Master Playlist, DASH Media Presentation Description (MPD), and Smooth Streaming, are dynamically generated based on the format selector in the URL. The manifests can also be adjusted with Asset level or Global filters to remove specific tracks and provide more targeted manifests to viewers.
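For example, the same streaming locator path can be requested in different formats, or with a filter applied, just by changing the URL; hostname and path below are placeholders:

```
HLS:    https://<streaming-endpoint>/<locator-path>/manifest(format=m3u8-aapl)
DASH:   https://<streaming-endpoint>/<locator-path>/manifest(format=mpd-time-csf)
Filter: https://<streaming-endpoint>/<locator-path>/manifest(format=m3u8-aapl,filter=myFilter)
```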
Encoder - An application that takes raw video frames and applies a codec to create a new elementary stream and then puts that stream into a file container or streaming format such as RTMP.
Encoding - The process of converting files containing digital video and/or audio from one standard format to another, with the purpose of (a) reducing the size of the files, and/or (b) producing a format that's compatible with a broad range of devices and apps.
Encoding ladder - A table of recommended resolutions and bitrates for adaptive streaming.
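As an illustration only (not an official recommendation), a simple ladder might look like this:

```
Resolution    Video bitrate
1080p         4500 kbps
720p          2500 kbps
480p          1100 kbps
360p           600 kbps
```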
Encrypted Media Extensions (EME) - A W3C HTML5 specification that defines browser APIs that allow playback of DRM-protected content.
Entity - Any object that is part of an API that has its own methods and properties. Also referred to as a resource in the Azure Resource Management API.
FFmpeg - A multimedia framework able to decode, encode, transcode, mux, demux, stream, filter, and play media content.
Filter - Server-side rules that allow you to do things like play back only a section of a video, deliver only specified renditions or language tracks, or adjust the presentation window. Filters can be applied to Media Services accounts or assets.
H.264 - Also called Advanced Video Coding (AVC), a video compression standard based on block-oriented, motion-compensated coding.
H.265/High Efficiency Video Coding - Also known as HEVC, a video compression standard that succeeds H.264, with 25% to 50% better data compression than H.264 (AVC).
H.266 - Also known as Versatile Video Coding (VVC), a video compression standard that succeeds H.265, improving compression performance and supporting additional applications.
High Availability - When a system implements failover and other methods to ensure that an application is always working for its end users.
High-Bandwidth Digital Content Protection (HDCP) - A form of digital content protection that blocks protected content from flowing over connections to devices that aren't permitted to receive it.
High Dynamic Range (HDR) - A family of video formats that capture a wider range of brightness values than standard dynamic range; variants include Dolby Vision, HDR10, HDR10+, and HLG.
HTML5 - A markup language that describes page elements semantically and separates content from presentation, which is handled by CSS. For video, HTML5 introduced a <video> element that was not present in HTML4.
HTTP Live Streaming (HLS) - A way to stream adaptive bitrate media which was developed by Apple.
HTTP Live Streaming (HLS) with AES128 - A way to stream adaptive bitrate media with encryption, which was developed by Apple.
Internet Media Subtitles and Captions (IMSC) - A W3C standard file format that uses XML to describe subtitle and caption content, timing, layout, and styling.
Ingestion - When media content is submitted to an encoding or storage service. The encoding or storage service ingests the incoming media.
Input asset - A Media Services asset (Azure storage container) that is used to house media that is to be encoded. After the media has been encoded it is saved to an output asset.
Job - A request to Media Services to apply a transform to a given input video or audio content. Once a transform has been created, you can submit jobs using Media Services APIs, or any of the published SDKs. A job specifies information like the location of the input video and the location for the output. You can specify the location of your input video using: HTTPS URLs, SAS URLs, or assets.
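A minimal sketch of submitting a job with the azure-mgmt-media Python SDK, assuming a transform named myTransform and input/output assets already exist; all resource names are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.media import AzureMediaServices
from azure.mgmt.media.models import Job, JobInputAsset, JobOutputAsset

client = AzureMediaServices(DefaultAzureCredential(), "<subscription-id>")

# Ask Media Services to run the transform against the input asset and
# write the encoded output to the output asset.
job = client.jobs.create(
    "my-rg", "myamsaccount", "myTransform", "job-001",
    parameters=Job(
        input=JobInputAsset(asset_name="input-asset"),
        outputs=[JobOutputAsset(asset_name="output-asset")],
    ),
)
print(job.state)  # e.g. Queued
```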
Key - Keys are used to decrypt encrypted computing resources. There are several types of keys that are used for securing Azure resources. Media Services uses account encryption keys, storage encryption keys, API keys and content keys.
Key delivery service - Media Services provides a key delivery service that delivers content keys and DRM licenses to client players so they can decrypt HLS or DASH streaming media content.
Latency - The time it takes for media content to be transmitted from the origin server to the player client. It can also refer to "glass-to-glass" latency, which measures the time it takes for a frame of video to travel from the camera (glass #1), through the encoder, over the wire to the cloud service for processing, and out through the delivery network to the client player or device (glass #2 being the monitor or device screen).
Live encoder - Either a hardware or software encoder that is capable of processing a video feed in real-time and sending it to an ingest URL. Two such live encoders are OBS Studio and Telestream Wirecast.
Live output - A recording function that is created when the output of a live event is intended to be saved. The recorded video is written to an output asset.
Live event - Ingests, (optionally) encodes and archives live video feeds. A live event can be set to either a basic or standard pass-through (an on-premises live encoder sends a multiple bitrate stream) or live encoding (an on-premises live encoder sends a single bitrate stream and the cloud service provides the encoding).
Live stream - The delivery of live video and/or audio content to an audience close to the time when the content is being produced.
Live stream to video on-demand (VOD) - When live video and audio are transmitted by an on-premises encoder to Media Services, the encoded content is streamed from a live output. When the live event is stopped, the live output is deleted, but the encoded files are kept in an asset so viewers can watch the content later (on-demand).
Live transcription - Speech recognition that results in the transcription of what was said during a live event. The transcription is made available as VTT and IMSC1 tracks for delivery.
Low Latency - When the time it takes for media content to be transmitted from the origin server to the player client is less than 8 seconds.
M3u8 - A multimedia playlist file format that is used with Apple HTTP Live Streaming (HLS).
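A minimal, illustrative master playlist referencing two renditions:

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=600000,RESOLUTION=640x360
360p/playlist.m3u8
```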
Media Source Extensions (MSE) - A W3C specification that defines browser APIs that let web applications construct media streams for playback.
Managed identity - Managed identities provide an identity for applications to use when connecting to resources that support Azure Active Directory (Azure AD) authentication. There are two types of managed identities (a token-acquisition sketch follows the list):
- System-assigned. Some Azure services allow you to enable a managed identity directly on a service instance. When you enable a system-assigned managed identity, an identity is created in Azure AD. The identity is tied to the lifecycle of that service instance. When the resource is deleted, Azure automatically deletes the identity for you. By design, only that Azure resource can use this identity to request tokens from Azure AD.
- User-assigned. You may also create a managed identity as a standalone Azure resource. You can create a user-assigned managed identity and assign it to one or more instances of an Azure service. For user-assigned managed identities, the identity is managed separately from the resources that use it.
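A minimal sketch of acquiring a token with a managed identity using the azure-identity Python package; the scope shown targets Azure Resource Manager:

```python
from azure.identity import ManagedIdentityCredential

# On an Azure resource with a system-assigned identity enabled, no secret is
# needed; the credential obtains tokens from the local identity endpoint.
credential = ManagedIdentityCredential()

# For a user-assigned identity, pass its client ID instead:
# credential = ManagedIdentityCredential(client_id="<identity-client-id>")

token = credential.get_token("https://management.azure.com/.default")
print(token.expires_on)
```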
Manifest - A text file that lists the files of an adaptive bitrate streaming package. Adaptive streaming protocols like HLS, DASH, and Smooth Streaming rely on a manifest file to describe the available tracks of video, audio and text content available to the player. Manifests can also contain "renditions" that describe the various encoding of the same content to give the client player a choice.
Mezzanine file - A lightly compressed rendition of a source video from a camera, editing system, or other source of raw video and audio data. Typically this is a high-quality copy, or the primary source video, that is ingested by Media Services for encoding.
Midroll, preroll, postroll - Media that is inserted during playback of primary content. Ads are the most common scenario.
Moving Picture Experts Group (MPEG) - An organization that creates, researches, and defines global industry standards for video and audio encoding, packaging, and delivery.
MPEG DASH - MPEG DASH (Dynamic Adaptive Streaming over HTTP) uses an XML-based file that describes the fMP4 file fragments that are to be delivered for a stream.
MP4 - A container format that can store video, audio, subtitle, and image data in separate tracks. Based on the ISO Base Media File Format defined in ISO/IEC 14496 part 12; the MP4 file format itself is specified in ISO/IEC 14496 part 14.
Muxing - A way of combining multiple elementary streams such as video, audio, and caption streams into a container format, for example, MP4 or TS segments.
OData - An open protocol for querying REST APIs. Using the HTTP verbs GET, POST, PUT, PATCH, and DELETE, the Media Services API returns the result of a query as a JSON payload.
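For example, list operations accept OData query options such as $filter and $top. An illustrative request is shown below; the api-version value is an assumption, and in practice the query values are URL-encoded:

```
GET https://management.azure.com/subscriptions/<sub-id>/resourceGroups/my-rg
    /providers/Microsoft.Media/mediaServices/myamsaccount/assets
    ?api-version=2023-01-01&$filter=properties/created gt 2023-01-01&$top=10
```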
Opus - A low-latency, open-source audio codec created by the Xiph.Org Foundation, commonly paired with the VP9 or AV1 video codecs.
Origin - A server that delivers media content to clients. In Media Services, the streaming endpoint acts as the origin and also handles dynamic packaging, encryption, and filtering.
Output asset - A Media Services asset (Azure storage container) that is used to house media after it is encoded.
Overlay - A graphic or additional video that is layered above the main video. An overlay can include a midroll ad, or a logo.
Pass through - When using the pass-through Live Event (basic or standard), the on-premises live encoder generates a multiple bitrate video stream and sends those as the contribution feed to the live event (using RTMP or fragmented-MP4 input protocol). The live event then passes the incoming video streams to the dynamic packager (streaming endpoint) without any further transcoding. A pass-through live event is optimized for long-running live events or 24x365 linear live streaming.
Presentation Time Stamp (PTS) - The time that the media frame should be presented to the screen.
Player client - An application that plays video and audio streams. Some examples are Azure Media Player, the Apple AVPlayer framework, Shaka, and Video.js.
Preset - A set of configuration settings for an encoding job or other transformation done on media. Presets are offered by Media Services as a convenience to customers who don't want to define custom presets.
Private endpoint - A network interface that uses a private IP address from your virtual network.
Private link - Azure Private Link enables you to access Azure PaaS Services (for example, Azure Storage and SQL Database) and Azure hosted customer-owned/partner services over a private endpoint in your virtual network.
Red Green Blue (RGB) color model - A digital way of representing color where red, green, and blue are given values in an additive (light) color-mixing scheme. The amount of each primary determines the final color.
Rebuffering - When a video player has to pause in order to load more video.
Rendition - A version of a video or audio track that is part of an adaptive bitrate streaming set.
Resolution - The width and height of a video in pixels or lines such as 720p.
Real-Time Messaging Protocol (RTMP) - A communication protocol for streaming audio and video originally defined by Adobe for delivery to Flash clients and now widely adopted by various social media and live streaming services.
Redaction - A way to blur or otherwise obfuscate information that might need to be protected such as faces, license plates, etc.
SAS URL - A URL containing a shared access signature (SAS) token. The token is generated (signed) on the client side and then shared with client applications that need access to Azure Storage resources.
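For illustration, a sketch using the azure-storage-blob Python package to sign a read-only, time-limited SAS URL for a blob; account, container, and key values are placeholders:

```python
from datetime import datetime, timedelta, timezone
from azure.storage.blob import BlobSasPermissions, generate_blob_sas

# The token is signed locally with the account key; the storage service
# validates the signature when the URL is used.
sas_token = generate_blob_sas(
    account_name="mystorageacct",
    container_name="asset-container",
    blob_name="video.mp4",
    account_key="<account-key>",
    permission=BlobSasPermissions(read=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=1),
)
url = f"https://mystorageacct.blob.core.windows.net/asset-container/video.mp4?{sas_token}"
```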
Slicing - In AVC or HEVC encoded videos, a slice refers to a region in a frame to be processed.
Stitching - Also known as splicing, taking two or more videos and joining them together to create a new video.
Stream - A stream can refer to the files that make up a package to be streamed to a video player, or it can be the actual transmission of media content.
Streaming - Streaming is best understood in comparison with downloading content. When content is downloaded, the entire file is delivered to the end user and saved locally. With streaming, chunks of media content are continuously delivered to a client player in data packets, and the content does not persist on the user's computer.
Streaming endpoint - A streaming endpoint is a dynamic (just-in-time) packaging and origin service that delivers live and on-demand content directly to a client player app, using one of the common streaming media protocols (HLS or DASH). It also provides dynamic (just-in-time) encryption for DRM.
Streaming locator - Builds streaming URLs for client players to stream media from assets. Streaming locators can be associated with filters, streaming policies, and content key policies.
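A minimal sketch of creating a locator and listing its streaming paths with the azure-mgmt-media Python SDK; names are placeholders, and the predefined clear streaming policy is used for simplicity:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.media import AzureMediaServices
from azure.mgmt.media.models import StreamingLocator

client = AzureMediaServices(DefaultAzureCredential(), "<subscription-id>")

# Associate the locator with an encoded asset and a streaming policy.
client.streaming_locators.create(
    "my-rg", "myamsaccount", "locator-001",
    parameters=StreamingLocator(
        asset_name="output-asset",
        streaming_policy_name="Predefined_ClearStreamingOnly",
    ),
)

# Each path, prefixed with a streaming endpoint hostname, yields a playable URL.
paths = client.streaming_locators.list_paths("my-rg", "myamsaccount", "locator-001")
for sp in paths.streaming_paths:
    print(sp.streaming_protocol, sp.paths)
```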
Streaming policy - Defines streaming protocols and encryption options for streaming locators.
Sub-clipping - Trimming a portion of a video, or saving it as a stand-alone video. Also referred to as splicing or editing.
Subtitles - The process of displaying either a verbatim or edited transcription of the audio in a video. It is most commonly used to deliver text in multiple languages, and to enhance the accessibility of a video.
Thumbnail - A still image that is either a frame taken from a video or a different image that is used when a video is not playing or in the stopped state.
Thumbnail sprite - A JPEG file that contains multiple small-resolution thumbnails stitched together into a single (large) image in columns and rows, together with a VTT file that defines the CSS offsets and resolutions for each frame to be extracted or displayed in an HTML5 web page. Commonly used for zoetrope or film-strip controls in a streaming video player to enhance fast forward and rewind experiences.
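An illustrative sprite VTT, where each cue points at a region of the sprite image using a #xywh spatial media fragment:

```
WEBVTT

00:00:00.000 --> 00:00:05.000
sprite.jpg#xywh=0,0,160,90

00:00:05.000 --> 00:00:10.000
sprite.jpg#xywh=160,0,160,90
```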
Time shift - The length of time a video player is able to rewind streaming content. Also known as DVR window.
Transform - Used to configure common tasks for encoding or analyzing videos. Each transform describes a recipe or a workflow of tasks for processing video or audio files. A single Transform can apply more than one rule.
Transcoding - Decoding a video or audio stream stored in one codec and then re-encoding it using the same codec with different settings, or using another codec. Most file and live encoding in Azure Media Services is technically "transcoding" rather than "encoding", but almost all codec-related operations are referred to as "encoding".
Trusted storage - Using a firewall to secure a storage account.
Variable Bitrate (VBR) - A way of encoding a video that distinguishes between frames that have a lot of motion and frames that have very little motion so that the bitrate can change as needed.
Versatile Video Coding (VVC) - A new codec intended to improve on H.265 or High Efficiency Coding released as a standard in 2020. It is not widely deployed yet, nor supported by Media Services at this time.
Video on Demand (VoD) - Video that can be watched any time rather than only at a scheduled time.
Vorbis - An open-source audio codec created by Xiph.Org Foundation. See also Opus.
VP8 - A video codec developed by Google.
VP9 - A video codec developed by Google; the successor to VP8.
VTT/WebVTT - A text file containing subtitles, captions, or descriptions, as well as chapter and metadata information, for a video in WebVTT format.
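A minimal caption file in WebVTT format:

```
WEBVTT

00:00:01.000 --> 00:00:04.000
Welcome to the show.

00:00:04.500 --> 00:00:07.000
Let's get started.
```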
Webhook - A way of communicating computing events, through a URL, from one application to another.
WebM - A container format for storing content encoded with the VP8, VP9, or AV1 video codecs and the Opus or Vorbis audio codecs.
Web Real-Time Communication (WebRTC) - Published by the W3C and IETF organizations to standardize the APIs for peer-to-peer real-time communication.