Azure Media Services fragmented MP4 live ingest specification

This specification describes the protocol and format for fragmented MP4-based live streaming ingestion for Azure Media Services. Media Services provides a live streaming service that customers can use to stream live events and broadcast content in real time by using Azure as the cloud platform. This document also discusses best practices for building highly redundant and robust live ingest mechanisms.

1. Conformance notation

The key words "MUST," "MUST NOT," "REQUIRED," "SHALL," "SHALL NOT," "SHOULD," "SHOULD NOT," "RECOMMENDED," "MAY," and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

2. Service diagram

The following diagram shows the high-level architecture of the live streaming service in Media Services:

  1. A live encoder pushes live feeds to channels that are created and provisioned via the Azure Media Services SDK.
  2. Channels, programs, and streaming endpoints in Media Services handle all the live streaming functionalities, including ingest, formatting, cloud DVR, security, scalability, and redundancy.
  3. Optionally, customers can choose to deploy an Azure Content Delivery Network layer between the streaming endpoint and the client endpoints.
  4. Client endpoints stream from the streaming endpoint by using HTTP Adaptive Streaming protocols. Examples include Azure Smooth Streaming, Dynamic Adaptive Streaming over HTTP (DASH, or MPEG-DASH), and Apple HTTP Live Streaming (HLS).


3. Bitstream format – ISO 14496-12 fragmented MP4

The wire format for live streaming ingest discussed in this document is based on [ISO-14496-12]. For a detailed explanation of the fragmented MP4 format and its extensions, both for video-on-demand files and for live streaming ingestion, see [MS-SSTR].

Live ingest format definitions

The following list describes special format definitions that apply to live ingest into Azure Media Services:

  1. The ftyp, Live Server Manifest Box, and moov boxes MUST be sent with each request (HTTP POST). These boxes MUST be sent at the beginning of the stream and any time the encoder must reconnect to resume stream ingest. For more information, see Section 6 in [1].
  2. Section 3.3.2 in [1] defines an optional box called StreamManifestBox for live ingest. Due to the routing logic of the Azure load balancer, using this box is deprecated. The box SHOULD NOT be present when ingesting into Media Services. If this box is present, Media Services silently ignores it.
  3. The TrackFragmentExtendedHeaderBox box defined in [1] MUST be present for each fragment.
  4. Version 2 of the TrackFragmentExtendedHeaderBox box SHOULD be used to generate media segments that have identical URLs in multiple datacenters. The fragment index field is REQUIRED for cross-datacenter failover of index-based streaming formats such as Apple HLS and index-based MPEG-DASH. To enable cross-datacenter failover, the fragment index MUST be synced across multiple encoders and be increased by 1 for each successive media fragment, even across encoder restarts or failures.
  5. Section 3.3.6 in [1] defines a box called MovieFragmentRandomAccessBox (mfra) that MAY be sent at the end of live ingestion to indicate end-of-stream (EOS) to the channel. Due to the ingest logic of Media Services, using EOS is deprecated, and the mfra box for live ingestion SHOULD NOT be sent. If sent, Media Services silently ignores it. To reset the state of the ingest point, we recommend that you use Channel Reset. We also recommend that you use Program Stop to end a presentation and stream.
  6. The MP4 fragment duration SHOULD be constant, to reduce the size of the client manifests. A constant MP4 fragment duration also improves client download heuristics through the use of repeat tags. The duration MAY fluctuate to compensate for non-integer frame rates.
  7. The MP4 fragment duration SHOULD be between approximately 2 and 6 seconds.
  8. MP4 fragment timestamps and indexes (the fragment_absolute_time and fragment_index fields of TrackFragmentExtendedHeaderBox) SHOULD arrive in increasing order. Although Media Services is resilient to duplicate fragments, it has limited ability to reorder fragments according to the media timeline.
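
The header-box ordering in item 1 can be checked on the encoder side before a payload is sent. The following is a minimal sketch, not part of the specification: it walks top-level ISO 14496-12 boxes (a 4-byte big-endian size followed by a 4-byte type) and checks that the payload opens with ftyp, a uuid box, and moov. Treating the Live Server Manifest Box as a uuid box is an assumption to verify against [1], and the helper names iter_boxes, make_box, and starts_with_stream_headers are hypothetical.

```python
import struct
from typing import Iterator, Tuple


def iter_boxes(data: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (box_type, payload) for each top-level ISO 14496-12 box in data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # a 64-bit "largesize" follows the type field
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # the box extends to the end of the data
            size = len(data) - offset
        if size < header or offset + size > len(data):
            raise ValueError("truncated or malformed box")
        yield box_type.decode("latin-1"), data[offset + header:offset + size]
        offset += size


def make_box(box_type: str, payload: bytes = b"") -> bytes:
    """Build a box with a 32-bit size field (enough for this illustration)."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("latin-1")) + payload


def starts_with_stream_headers(data: bytes) -> bool:
    """True if the first three boxes are ftyp, uuid, moov, in that order.

    Assumption: the Live Server Manifest Box is carried as a 'uuid' box.
    """
    types = [box_type for box_type, _ in iter_boxes(data)][:3]
    return types == ["ftyp", "uuid", "moov"]
```

A real encoder would also compare the uuid box's extended type against the value given in [1]; that check is omitted here because the value is not stated in this document.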

4. Protocol format – HTTP

ISO fragmented MP4-based live ingest for Media Services uses a standard long-running HTTP POST request to transmit encoded media data that is packaged in fragmented MP4 format to the service. Each HTTP POST sends a complete fragmented MP4 bitstream ("stream"), starting from the beginning with header boxes (ftyp, Live Server Manifest Box, and moov boxes), and continuing with a sequence of fragments (moof and mdat boxes). For URL syntax for the HTTP POST request, see section 9.2 in [1]. The POST URL takes the form {protocol}://{server address}/{publishing point path}/Streams({identifier}).


Here are the detailed requirements:

  1. The encoder SHOULD start the broadcast by sending an HTTP POST request with an empty “body” (zero content length) by using the same ingestion URL. This can help the encoder quickly detect whether the live ingestion endpoint is valid, and whether any authentication or other conditions are required. Per HTTP protocol, the server can't send back an HTTP response until the entire request, including the POST body, is received. Given the long-running nature of a live event, without this step, the encoder might not be able to detect any error until it finishes sending all the data.
  2. The encoder MUST handle any errors or authentication challenges that result from (1). If (1) succeeds with a 200 response, continue.
  3. The encoder MUST start a new HTTP POST request with the fragmented MP4 stream. The payload MUST start with the header boxes, followed by fragments. Note that the ftyp, Live Server Manifest Box, and moov boxes (in this order) MUST be sent with each request, even if the encoder must reconnect because the previous request was terminated prior to the end of the stream.
  4. The encoder MUST use chunked transfer encoding for uploading, because it’s impossible to predict the entire content length of the live event.
  5. When the event is over, after sending the last fragment, the encoder MUST gracefully end the chunked transfer encoding message sequence (most HTTP client stacks handle it automatically). The encoder MUST wait for the service to return the final response code, and then terminate the connection.
  6. The encoder MUST NOT use the Events() noun as described in 9.2 in [1] for live ingestion into Media Services.
  7. If the HTTP POST request terminates or times out with a TCP error prior to the end of the stream, the encoder MUST issue a new POST request by using a new connection, and follow the preceding requirements. Additionally, the encoder MUST resend the previous two MP4 fragments for each track in the stream, and resume without introducing a discontinuity in the media timeline. Resending the last two MP4 fragments for each track ensures that there is no data loss. In other words, if a stream contains both an audio and a video track, and the current POST request fails, the encoder must reconnect and resend the last two fragments for the audio track, which were previously successfully sent, and the last two fragments for the video track, which were previously successfully sent, to ensure that there is no data loss. The encoder MUST maintain a “forward” buffer of media fragments, which it resends when it reconnects.
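
Requirements 3 through 5 describe the shape of the request body: header boxes first, then fragments, framed with chunked transfer encoding and closed with a terminating chunk. The sketch below shows that framing on plain bytes so it can be read without a network connection. The function names stream_body and chunk_encode are hypothetical, and as the text notes, most HTTP client stacks perform this framing automatically.

```python
from typing import Iterable, Iterator


def stream_body(header_boxes: bytes, fragments: Iterable[bytes]) -> Iterator[bytes]:
    """Yield the POST payload in order: header boxes first, then each fragment."""
    yield header_boxes
    for fragment in fragments:
        yield fragment


def chunk_encode(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Frame each part as an HTTP/1.1 chunk, then end with the zero-length chunk."""
    for part in parts:
        if part:  # an empty part would be read as the end of the message
            yield b"%X\r\n" % len(part) + part + b"\r\n"
    yield b"0\r\n\r\n"  # graceful end of the chunked message sequence


# Placeholder byte strings stand in for real header boxes and moof/mdat pairs.
wire = b"".join(chunk_encode(stream_body(b"HEAD", [b"frag-one", b"f2"])))
```

In Python's standard library, passing an iterable body to `http.client.HTTPConnection.request` with `encode_chunked=True` and a `Transfer-Encoding: chunked` header produces the same framing, so hand-written framing like this is mainly useful for understanding or testing.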

5. Timescale

[MS-SSTR] describes the usage of timescale for SmoothStreamingMedia, StreamElement, StreamFragmentElement, and LiveSMIL. If the timescale value is not present, the default value used is 10,000,000 (10 MHz). Although the Smooth Streaming format specification doesn’t block usage of other timescale values, most encoder implementations use this default value (10 MHz) to generate Smooth Streaming ingest data. Due to the Azure Media Dynamic Packaging feature, we recommend that you use a 90-kHz timescale for video streams and 44.1 kHz or 48 kHz for audio streams. If different timescale values are used for different streams, the stream-level timescale MUST be sent. For more information, see [MS-SSTR].
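
As a worked illustration of what a timescale means in practice, the sketch below converts durations to ticks with exact rational arithmetic. The 29.97 fps frame rate and the 1024-sample audio frame are assumptions chosen for the example, not values from this specification, and the arithmetic only shows how the timescale affects whether durations are whole numbers of ticks. It is not presented as the stated reason for the recommendation above.

```python
from fractions import Fraction


def ticks(seconds: Fraction, timescale: int) -> Fraction:
    """Duration expressed in units of 1/timescale seconds."""
    return seconds * timescale


two_seconds = Fraction(2)
video_frame = Fraction(1001, 30000)   # one frame at 29.97 fps (assumed example)
audio_frame = Fraction(1024, 48000)   # 1024 samples at 48 kHz (assumed example)

# A 2-second fragment is a whole number of ticks at either timescale.
assert ticks(two_seconds, 10_000_000) == 20_000_000
assert ticks(two_seconds, 90_000) == 180_000

# A 29.97 fps frame is exactly 3003 ticks at 90 kHz, but fractional at 10 MHz.
assert ticks(video_frame, 90_000) == 3003
assert ticks(video_frame, 10_000_000).denominator != 1

# The audio frame is exactly 1024 ticks when the timescale equals the sample rate.
assert ticks(audio_frame, 48_000) == 1024
assert ticks(audio_frame, 10_000_000).denominator != 1
```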

6. Definition of “stream”

A stream is the basic unit of operation in live ingestion for composing live presentations and for handling streaming failover and redundancy scenarios. A stream is defined as one unique, fragmented MP4 bitstream that might contain a single track or multiple tracks. A full live presentation might contain one or more streams, depending on the configuration of the live encoders. The following examples illustrate various options of using streams to compose a full live presentation.


A customer wants to create a live streaming presentation that includes the following audio/video bitrates:

Video – 3000 kbps, 1500 kbps, 750 kbps

Audio – 128 kbps

Option 1: All tracks in one stream

In this option, a single encoder generates all audio/video tracks, and then bundles them into one fragmented MP4 bitstream. The fragmented MP4 bitstream is then sent via a single HTTP POST connection. In this example, there is only one stream for this live presentation.


Option 2: Each track in a separate stream

In this option, the encoder puts one track into each fragmented MP4 bitstream, and then posts all of the streams over separate HTTP connections. This can be done with one encoder or with multiple encoders. From the perspective of live ingestion, this live presentation is composed of four streams.


Option 3: Bundle audio track with the lowest bitrate video track into one stream

In this option, the customer chooses to bundle the audio track with the lowest-bitrate video track in one fragmented MP4 bitstream, and leave the other two video tracks as separate streams.



This is not an exhaustive list of all possible ingestion options for this example. In fact, live ingestion supports any grouping of tracks into streams. Customers and encoder vendors can choose their own implementations based on engineering complexity, encoder capacity, and redundancy and failover considerations. However, in most cases, there is only one audio track for the entire live presentation, so it’s important to ensure the health of the ingest stream that contains the audio track. This consideration often results in putting the audio track in its own stream (as in Option 2) or bundling it with the lowest-bitrate video track (as in Option 3). Also, for better redundancy and fault tolerance, sending the same audio track in two different streams (Option 2 with redundant audio tracks) or bundling the audio track with at least two of the lowest-bitrate video tracks (Option 3 with audio bundled in at least two video streams) is highly recommended for live ingest into Media Services.
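
The three options, and the redundant variant of Option 3, can be written down as groupings of tracks into streams. The sketch below is only a way to visualize them; the track labels and helper names are hypothetical.

```python
from typing import List

# Track labels are made up for this example (bitrates taken from the text above).
TRACKS = ["video_3000k", "video_1500k", "video_750k", "audio_128k"]

# Each inner list is one stream, that is, one fragmented MP4 bitstream and one POST.
OPTION_1 = [list(TRACKS)]
OPTION_2 = [[track] for track in TRACKS]
OPTION_3 = [["video_3000k"], ["video_1500k"], ["video_750k", "audio_128k"]]
# Option 3 with the audio track bundled into the two lowest-bitrate video streams.
OPTION_3_REDUNDANT = [
    ["video_3000k"],
    ["video_1500k", "audio_128k"],
    ["video_750k", "audio_128k"],
]


def covers_all_tracks(streams: List[List[str]]) -> bool:
    """True if every track of the presentation is carried by at least one stream."""
    carried = {track for stream in streams for track in stream}
    return carried == set(TRACKS)


def streams_carrying(streams: List[List[str]], track: str) -> int:
    """Number of streams that carry the given track."""
    return sum(track in stream for stream in streams)
```

With these definitions, Option 1 has one stream, Option 2 has four, and Option 3 has three; only the redundant variant carries the audio track in more than one stream.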

7. Service failover

Given the nature of live streaming, good failover support is critical for ensuring the availability of the service. Media Services is designed to handle various types of failures, including network errors, server errors, and storage issues. When used in conjunction with proper failover logic from the live encoder side, customers can achieve a highly reliable live streaming service from the cloud.

This section discusses service failover scenarios. In these scenarios, the failure happens somewhere within the service, and it manifests itself as a network error. Here are some recommendations for the encoder implementation for handling service failover:

  1. Use a 10-second timeout for establishing the TCP connection. If an attempt to establish the connection takes longer than 10 seconds, abort the operation and try again.

  2. Use a short timeout for sending the HTTP request message chunks. If the target MP4 fragment duration is N seconds, use a send timeout between N and 2N seconds; for example, if the MP4 fragment duration is 6 seconds, use a timeout of 6 to 12 seconds. If a timeout occurs, reset the connection, open a new connection, and resume stream ingest on the new connection.

  3. Maintain a rolling buffer that has the last two fragments for each track that were successfully and completely sent to the service. If the HTTP POST request for a stream is terminated or times out prior to the end of the stream, open a new connection and begin another HTTP POST request, resend the stream headers, resend the last two fragments for each track, and resume the stream without introducing a discontinuity in the media timeline. This reduces the chance of data loss.

  4. We recommend that the encoder not limit the number of retries to establish a connection or resume streaming after a TCP error occurs.

  5. After a TCP error:

    a. The current connection MUST be closed, and a new connection MUST be created for a new HTTP POST request.

    b. The new HTTP POST URL MUST be the same as the initial POST URL.

    c. The new HTTP POST MUST include stream headers (ftyp, Live Server Manifest Box, and moov boxes) that are identical to the stream headers in the initial POST.

    d. The last two fragments sent for each track MUST be resent, and streaming MUST resume without introducing a discontinuity in the media timeline. The MP4 fragment timestamps MUST increase continuously, even across HTTP POST requests.

  6. The encoder SHOULD terminate the HTTP POST request if data is not being sent at a rate commensurate with the MP4 fragment duration. An HTTP POST request that does not send data can prevent Media Services from quickly disconnecting from the encoder in the event of a service update. For this reason, the HTTP POST for sparse (ad signal) tracks SHOULD be short-lived, terminating as soon as the sparse fragment is sent.
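
The rolling buffer in recommendation 3 and the send timeout in recommendation 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the class name ResendBuffer and the function send_timeout_range are hypothetical, fragments are treated as opaque byte strings, and the replay order shown (track by track) would need to be adapted so that the resumed stream keeps a continuous media timeline.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class ResendBuffer:
    """Keeps the last `depth` completely sent fragments for each track."""

    def __init__(self, depth: int = 2) -> None:
        self._depth = depth
        self._by_track: Dict[str, Deque[bytes]] = {}

    def record_sent(self, track: str, fragment: bytes) -> None:
        """Call only after the fragment was fully written to the connection."""
        buffer = self._by_track.setdefault(track, deque(maxlen=self._depth))
        buffer.append(fragment)  # the oldest entry is dropped automatically

    def replay(self) -> List[Tuple[str, bytes]]:
        """Fragments to resend after reconnecting, oldest first within each track."""
        return [
            (track, fragment)
            for track, fragments in self._by_track.items()
            for fragment in fragments
        ]


def send_timeout_range(fragment_seconds: float) -> Tuple[float, float]:
    """Send-timeout bounds for a target fragment duration of N seconds: N to 2N."""
    return fragment_seconds, 2 * fragment_seconds
```

After a TCP error, an encoder built along these lines would open a new connection to the same URL, send the same header boxes, send everything returned by `replay()`, and then continue with new fragments.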

8. Encoder failover

Encoder failover is the second type of failover scenario that needs to be addressed for end-to-end live streaming delivery. In this scenario, the error condition occurs on the encoder side.


The following expectations apply from the live ingestion endpoint when encoder failover happens:

  1. A new encoder instance SHOULD be created to continue streaming, as illustrated in the diagram (Stream for 3000k video, with dashed line).
  2. The new encoder MUST use the same URL for HTTP POST requests as the failed instance.
  3. The new encoder’s POST request MUST include the same fragmented MP4 header boxes as the failed instance.
  4. The new encoder MUST be properly synced with all other running encoders for the same live presentation to generate synced audio/video samples with aligned fragment boundaries.
  5. The new stream MUST be semantically equivalent to the previous stream, and interchangeable at the header and fragment levels.
  6. The new encoder SHOULD try to minimize data loss. The fragment_absolute_time and fragment_index of media fragments SHOULD increase from the point where the encoder last stopped. The fragment_absolute_time and fragment_index SHOULD increase in a continuous manner, but it is permissible to introduce a discontinuity, if necessary. Media Services ignores fragments that it has already received and processed, so it's better to err on the side of resending fragments than to introduce discontinuities in the media timeline.

9. Encoder redundancy

For certain critical live events that demand even higher availability and quality of experience, we recommend that you use active-active redundant encoders to achieve seamless failover with no data loss.


As illustrated in this diagram, two groups of encoders push two copies of each stream simultaneously into the live service. This setup is supported because Media Services can filter out duplicate fragments based on stream ID and fragment timestamp. The resulting live stream and archive is a single copy of all the streams that is the best possible aggregation from the two sources. For example, in a hypothetical extreme case, as long as there is one encoder (it doesn’t have to be the same one) running at any given point in time for each stream, the resulting live stream from the service is continuous without data loss.

The requirements for this scenario are almost the same as the requirements in the "Encoder failover" case, with the exception that the second set of encoders is running at the same time as the primary encoders.
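
The idea of filtering duplicates by stream ID and fragment timestamp can be illustrated with a few lines of code. This sketch is not the service's implementation; it only shows why two encoders that emit identical stream IDs and timestamps can cover for each other. The Fragment type and the drop_duplicates function are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple, Set, Tuple


class Fragment(NamedTuple):
    stream_id: str
    timestamp: int
    payload: bytes


def drop_duplicates(arrivals: Iterable[Fragment]) -> Iterator[Fragment]:
    """Keep the first fragment seen for each (stream_id, timestamp) pair."""
    seen: Set[Tuple[str, int]] = set()
    for fragment in arrivals:
        key = (fragment.stream_id, fragment.timestamp)
        if key in seen:
            continue  # the other encoder already delivered this fragment
        seen.add(key)
        yield fragment


# Arrival order at the ingest point: the primary encoder drops out after t=0,
# the backup covers t=2 and t=4, then the primary returns at t=4.
arrivals = [
    Fragment("video", 0, b"primary"),
    Fragment("video", 0, b"backup"),
    Fragment("video", 2, b"backup"),
    Fragment("video", 4, b"backup"),
    Fragment("video", 4, b"primary"),
    Fragment("video", 6, b"primary"),
]
merged = list(drop_duplicates(arrivals))
```

The merged sequence has one fragment for each of the timestamps 0, 2, 4, and 6, even though neither encoder delivered all four.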

10. Service redundancy

For highly redundant global distribution, sometimes you must have cross-region backup to handle regional disasters. Expanding on the “Encoder redundancy” topology, customers can choose to have a redundant service deployment in a different region that's connected with the second set of encoders. Customers can also work with a Content Delivery Network provider to deploy a Global Traffic Manager in front of the two service deployments to seamlessly route client traffic. The requirements for the encoders are the same as in the “Encoder redundancy” case. The only exception is that the second set of encoders needs to be pointed to a different live ingest endpoint. The following diagram shows this setup:


11. Special types of ingestion formats

This section discusses special types of live ingestion formats that are designed to handle specific scenarios.

Sparse track

When delivering a live streaming presentation with a rich client experience, it's often necessary to transmit time-synced events or signals in-band with the main media data. An example of this is dynamic live ad insertion. This type of event signaling is different from regular audio/video streaming because of its sparse nature. In other words, the signaling data usually does not occur continuously, and the interval can be hard to predict. The concept of the sparse track was designed to ingest and broadcast in-band signaling data.

The following steps are a recommended implementation for ingesting a sparse track:

  1. Create a separate fragmented MP4 bitstream that contains only sparse tracks, without audio/video tracks.

  2. In the Live Server Manifest Box as defined in Section 6 in [1], use the parentTrackName parameter to specify the name of the parent track. For more information, see [1].

  3. In the Live Server Manifest Box, manifestOutput MUST be set to true.

  4. Given the sparse nature of the signaling event, we recommend the following:

    a. At the beginning of the live event, the encoder sends the initial header boxes to the service, which allows the service to register the sparse track in the client manifest.

    b. The encoder SHOULD terminate the HTTP POST request when data is not being sent. A long-running HTTP POST that does not send data can prevent Media Services from quickly disconnecting from the encoder in the event of a service update or server reboot. In these cases, the media server is temporarily blocked in a receive operation on the socket.

    c. During the time when signaling data is not available, the encoder SHOULD close the HTTP POST request. While the POST request is active, the encoder SHOULD send data.

    d. When sending sparse fragments, the encoder can set an explicit Content-Length header, if it’s available.

    e. When sending sparse fragments with a new connection, the encoder SHOULD start sending from the header boxes, followed by the new fragments. This is for cases in which failover happens in between, and the new sparse connection is being established to a new server that has not seen the sparse track before.

    f. The sparse track fragment becomes available to the client when the corresponding parent track fragment that has an equal or larger timestamp value is made available to the client. For example, if the sparse fragment has a timestamp of t=1000, it is expected that after the client sees "video" (assuming the parent track name is "video") fragment timestamp 1000 or beyond, it can download the sparse fragment t=1000. Note that the actual signal could be used for a different position in the presentation timeline for its designated purpose. In this example, it’s possible that the sparse fragment of t=1000 has an XML payload, which is for inserting an ad in a position that’s a few seconds later.

    g. The payload of sparse track fragments can be in different formats (such as XML, text, or binary), depending on the scenario.
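
The availability rule in item f reduces to a simple comparison between the sparse fragment's timestamp and the parent track timestamps that the client has already seen. The sketch below restates that rule as code; the function name sparse_fragment_available is hypothetical, and the timestamps are plain integers in the parent track's timescale.

```python
from typing import Iterable


def sparse_fragment_available(sparse_timestamp: int,
                              parent_timestamps_seen: Iterable[int]) -> bool:
    """True once a parent fragment with an equal or larger timestamp is available."""
    return any(ts >= sparse_timestamp for ts in parent_timestamps_seen)


# Example from the text: the sparse fragment at t=1000 becomes downloadable
# once the "video" parent track reaches fragment timestamp 1000 or beyond.
before = sparse_fragment_available(1000, [0, 500])
at_boundary = sparse_fragment_available(1000, [0, 500, 1000])
beyond = sparse_fragment_available(1000, [0, 500, 1200])
```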

Redundant audio track

In a typical HTTP adaptive streaming scenario (for example, Smooth Streaming or DASH), there's often only one audio track in the entire presentation. Unlike video tracks, which have multiple quality levels for the client to choose from in error conditions, the audio track can be a single point of failure if the ingestion of the stream that contains the audio track is broken.

To solve this problem, Media Services supports live ingestion of redundant audio tracks. The idea is that the same audio track can be sent multiple times in different streams. Although the service only registers the audio track once in the client manifest, it can use redundant audio tracks as backups for retrieving audio fragments if the primary audio track has issues. To ingest redundant audio tracks, the encoder needs to:

  1. Create the same audio track in multiple fragmented MP4 bitstreams. The redundant audio tracks MUST be semantically equivalent, with the same fragment timestamps, and be interchangeable at the header and fragment levels.
  2. Ensure that the “audio” entry in the Live Server Manifest (Section 6 in [1]) is the same for all redundant audio tracks.

The following implementation is recommended for redundant audio tracks:

  1. Send each unique audio track in a stream by itself. Also, send a redundant stream for each of these audio track streams, where the second stream differs from the first only by the identifier in the HTTP POST URL: {protocol}://{server address}/{publishing point path}/Streams({identifier}).
  2. Use separate streams to send the two lowest video bitrates. Each of these streams SHOULD also contain a copy of each unique audio track. For example, when multiple languages are supported, these streams SHOULD contain audio tracks for each language.
  3. Use separate server (encoder) instances to encode and send the redundant streams mentioned in (1) and (2).
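
The URL pattern in step 1 can be filled in programmatically so that a primary and a redundant audio stream differ only by their identifier. The following is a small sketch; the host name, publishing point path, and identifiers are placeholders invented for the example, and the function name ingest_url is hypothetical.

```python
def ingest_url(protocol: str, server_address: str,
               publishing_point_path: str, identifier: str) -> str:
    """Fill in {protocol}://{server address}/{publishing point path}/Streams({identifier})."""
    return f"{protocol}://{server_address}/{publishing_point_path}/Streams({identifier})"


# Placeholder values; a real channel supplies its own ingest address and path.
primary = ingest_url("http", "ingest.example.invalid", "live.isml", "audio_eng")
backup = ingest_url("http", "ingest.example.invalid", "live.isml", "audio_eng_backup")
```

Both URLs share the same protocol, server address, and publishing point path, so the two streams land on the same presentation and are told apart only by the identifier.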
