Detect Motions with Azure Media Analytics


The Azure Media Motion Detector media processor (MP) enables you to efficiently identify sections of interest within an otherwise long and uneventful video. Motion detection can be used on static camera footage to identify sections of the video where motion occurs. It generates a JSON file containing metadata with timestamps and the bounding region where each event occurred.

Targeted at security video feeds, this technology can categorize motion into relevant events and false positives such as shadows and lighting changes. This allows you to generate security alerts from camera feeds without being flooded with irrelevant events, while still being able to extract moments of interest from long surveillance videos.

The Azure Media Motion Detector MP is currently in Preview.

This article gives details about Azure Media Motion Detector and shows how to use it with the Media Services SDK for .NET.

Motion Detector input files

Video files. Currently, the following formats are supported: MP4, MOV, and WMV.

Task configuration (preset)

When creating a task with Azure Media Motion Detector, you must specify a configuration preset.

You can use the following parameters:

| Name | Options | Description | Default |
|---|---|---|---|
| sensitivityLevel | String: 'low', 'medium', 'high' | Sets the sensitivity level at which motions are reported. Adjust this to control the number of false positives. | 'medium' |
| frameSamplingValue | Positive integer | Sets the frequency at which the algorithm runs. 1 means every frame, 2 means every second frame, and so on. | 1 |
| detectLightChange | Boolean: 'true', 'false' | Sets whether light changes are reported in the results. | 'False' |
| mergeTimeThreshold | Xs-time: hh:mm:ss. Example: 00:00:03 | Specifies the time window between motion events within which two events are combined and reported as one. | 00:00:00 |
| detectionZones | An array of detection zones. A detection zone is an array of 3 or more points. A point is an x and y coordinate from 0 to 1. | Describes the list of polygonal detection zones to be used. Results are reported with the zones as an ID, with the first one being 'id':0. | Single zone, which covers the entire frame. |

JSON example

      {
        "version": "1.0",
        "options": {
          "sensitivityLevel": "medium",
          "frameSamplingValue": 1,
          "detectLightChange": "False",
          "detectionZones": [
            [
              {"x": 0, "y": 0},
              {"x": 0.5, "y": 0},
              {"x": 0, "y": 1}
            ],
            [
              {"x": 0.3, "y": 0.3},
              {"x": 0.55, "y": 0.3},
              {"x": 0.8, "y": 0.3},
              {"x": 0.8, "y": 0.55},
              {"x": 0.8, "y": 0.8},
              {"x": 0.55, "y": 0.8},
              {"x": 0.3, "y": 0.8},
              {"x": 0.3, "y": 0.55}
            ]
          ]
        }
      }
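
Because a malformed zone is easy to write by hand, it can help to check the zones before submitting a preset. The following is a minimal sketch, not part of the Media Services SDK: the `ZonePoint` and `ZoneValidator` names are hypothetical, and the only rules it checks are the two stated in the parameter table (at least 3 points per zone, coordinates between 0 and 1).

```csharp
using System;
using System.Collections.Generic;

namespace MotionPresetSketch
{
    // Hypothetical helper type; it is not part of the Media Services SDK.
    struct ZonePoint
    {
        public double X;
        public double Y;
        public ZonePoint(double x, double y) { X = x; Y = y; }
    }

    static class ZoneValidator
    {
        // Returns null when every zone satisfies the documented constraints,
        // otherwise a message describing the first problem found.
        public static string Validate(IList<ZonePoint[]> zones)
        {
            for (int i = 0; i < zones.Count; i++)
            {
                ZonePoint[] zone = zones[i];
                if (zone == null || zone.Length < 3)
                    return string.Format("Zone {0} needs at least 3 points.", i);

                foreach (ZonePoint p in zone)
                {
                    // Written as a negated range check so that NaN is rejected too.
                    if (!(p.X >= 0 && p.X <= 1 && p.Y >= 0 && p.Y <= 1))
                        return string.Format("Zone {0} has a point outside the 0 to 1 range.", i);
                }
            }
            return null;
        }
    }

    class Program
    {
        static void Main()
        {
            var zones = new List<ZonePoint[]>
            {
                // The triangular zone from the preset example.
                new[] { new ZonePoint(0, 0), new ZonePoint(0.5, 0), new ZonePoint(0, 1) },
                // A deliberately invalid zone with only two points.
                new[] { new ZonePoint(0.3, 0.3), new ZonePoint(0.8, 0.8) }
            };

            string problem = ZoneValidator.Validate(zones);
            Console.WriteLine(problem ?? "All zones are valid.");
        }
    }
}
```

Run as written, the sketch reports the second zone, since it has fewer than 3 points. The service may apply further checks that are not described here.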

Motion Detector output files

A motion detection job returns a JSON file in the output asset, which describes the motion alerts, and their categories, within the video. The file contains information about the time and duration of motion detected in the video.

The Motion Detector API provides indicators once there are objects in motion in a fixed background video (for example, a surveillance video). The Motion Detector is trained to reduce false alarms, such as lighting and shadow changes. Current limitations of the algorithms include night vision videos, semi-transparent objects, and small objects.

Elements of the output JSON file

In the latest release, the output JSON format has changed and may represent a breaking change for some customers.

The following table describes elements of the output JSON file.

| Element | Description |
|---|---|
| version | The version of the Video API. The current version is 2. |
| timescale | "Ticks" per second of the video. |
| offset | The time offset for timestamps, in "ticks". In version 1.0 of the Video APIs, this is always 0. This value may change in scenarios supported in the future. |
| framerate | Frames per second of the video. |
| width, height | The width and height of the video in pixels. |
| start | The start timestamp in "ticks". |
| duration | The length of the event, in "ticks". |
| interval | The interval of each entry in the event, in "ticks". |
| events | Each event fragment contains the motion detected within that time duration. |
| type | In the current version, this is always '2' for generic motion. This label gives the Video APIs the flexibility to categorize motion in future versions. |
| regionId | In this version, this is always 0. This label gives the Video APIs the flexibility to find motion in various regions in future versions. |
| regions | The areas in your video where you care about motion. "id" identifies the region; in this version there is only one, ID 0. "type" is the shape of the region; currently, "rectangle" and "polygon" are supported. If you specified "rectangle", the region has dimensions in X, Y, Width, and Height. The X and Y coordinates are the upper-left coordinates of the region on a normalized scale of 0.0 to 1.0, and the width and height are the size of the region on the same scale. In the current version, X, Y, Width, and Height are always fixed at 0, 0 and 1, 1. If you specified "polygon", the region has dimensions in points. |
| fragments | The metadata is chunked into segments called fragments. Each fragment contains a start, duration, interval number, and event(s). A fragment with no events means that no motion was detected during that start time and duration. |
| brackets [] | Each bracket represents one interval in the event. Empty brackets for an interval mean that no motion was detected. |
| locations | This new entry under events lists the location where the motion occurred. This is more specific than the detection zones. |
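
Since start, duration, and interval are all expressed in "ticks", a consumer usually has to divide by timescale to get wall-clock times. The sketch below illustrates that arithmetic under stated assumptions: `TickMath` is a hypothetical helper (not an SDK type), offset is ignored because it is documented as always 0, and the brackets in events are assumed to be consecutive intervals counted from the fragment's start.

```csharp
using System;
using System.Collections.Generic;

namespace MotionOutputSketch
{
    // Hypothetical helper; not part of the Media Services SDK.
    static class TickMath
    {
        // Converts ticks to seconds using the file's timescale (ticks per second).
        public static double ToSeconds(long ticks, long timescale)
        {
            if (timescale <= 0)
                throw new ArgumentOutOfRangeException("timescale");
            return (double)ticks / timescale;
        }

        // Number of whole intervals that fit in a fragment.
        public static long IntervalCount(long duration, long interval)
        {
            if (interval <= 0)
                throw new ArgumentOutOfRangeException("interval");
            return duration / interval;
        }

        // Start times, in seconds, of the intervals that reported at least one event.
        // eventsPerInterval[i] is the number of events in the i-th bracket.
        public static List<double> MotionStartTimes(
            long fragmentStart, long interval, long timescale, int[] eventsPerInterval)
        {
            var result = new List<double>();
            for (int i = 0; i < eventsPerInterval.Length; i++)
            {
                if (eventsPerInterval[i] > 0)
                    result.Add(ToSeconds(fragmentStart + (long)i * interval, timescale));
            }
            return result;
        }
    }

    class Program
    {
        static void Main()
        {
            // Illustrative values for one fragment.
            const long timescale = 23976, start = 226765, duration = 47952, interval = 999;

            Console.WriteLine("Fragment starts at {0:F3} s", TickMath.ToSeconds(start, timescale));
            Console.WriteLine("Fragment lasts {0:F3} s", TickMath.ToSeconds(duration, timescale));
            Console.WriteLine("Intervals in fragment: {0}", TickMath.IntervalCount(duration, interval));

            // Made-up data: motion in the first and third intervals only.
            int[] eventsPerInterval = { 1, 0, 2 };
            foreach (double t in TickMath.MotionStartTimes(start, interval, timescale, eventsPerInterval))
                Console.WriteLine("Motion interval begins at {0:F3} s", t);
        }
    }
}
```

With the values used in `Main`, the fragment lasts 47952 / 23976 = 2 seconds and holds 47952 / 999 = 48 intervals, so each interval corresponds to one frame at 24 frames per second.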

The following JSON example shows the output:

      {
        "version": 2,
        "timescale": 23976,
        "offset": 0,
        "framerate": 24,
        "width": 1280,
        "height": 720,
        "regions": [
          {
            "id": 0,
            "type": "polygon",
            "points": [{"x": 0, "y": 0},
              {"x": 0.5, "y": 0},
              {"x": 0, "y": 1}]
          }
        ],
        "fragments": [
          {
            "start": 0,
            "duration": 226765
          },
          {
            "start": 226765,
            "duration": 47952,
            "interval": 999,
            "events": [
              [
                {
                  "type": 2,
                  "typeName": "motion",
                  "locations": [
                    {
                      "x": 0.004184,
                      "y": 0.007463,
                      "width": 0.991667,
                      "height": 0.985185
                    }
                  ],
                  "regionId": 0
                }
              ]
            ]
          }
        ]
      }
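
To draw a reported location on a video frame, the normalized values have to be scaled by the frame size. The following is a rough sketch under one assumption that the text does not state explicitly: locations are taken to follow the same convention described for rectangular regions (x and y are the upper-left corner, all values normalized to 0.0 through 1.0). `PixelRect` and `LocationMath` are hypothetical names, not SDK types.

```csharp
using System;

namespace MotionLocationSketch
{
    // Hypothetical container for a rectangle in pixel coordinates; not an SDK type.
    struct PixelRect
    {
        public int Left;
        public int Top;
        public int Width;
        public int Height;
    }

    static class LocationMath
    {
        // Scales a normalized location (0.0 to 1.0) to pixel coordinates,
        // rounding each value to the nearest whole pixel.
        public static PixelRect ToPixels(double x, double y, double width, double height,
                                         int videoWidth, int videoHeight)
        {
            return new PixelRect
            {
                Left = (int)Math.Round(x * videoWidth),
                Top = (int)Math.Round(y * videoHeight),
                Width = (int)Math.Round(width * videoWidth),
                Height = (int)Math.Round(height * videoHeight)
            };
        }
    }

    class Program
    {
        static void Main()
        {
            // Location and frame size taken from the output example (1280 x 720 video).
            PixelRect r = LocationMath.ToPixels(0.004184, 0.007463, 0.991667, 0.985185, 1280, 720);

            // Prints: Left=5, Top=5, Width=1269, Height=709
            Console.WriteLine("Left={0}, Top={1}, Width={2}, Height={3}",
                r.Left, r.Top, r.Width, r.Height);
        }
    }
}
```

In this example the location covers nearly the whole frame, which is why the resulting rectangle is only a few pixels smaller than the video itself.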


  • The supported input video formats include MP4, MOV, and WMV.
  • Motion Detection is optimized for stationary background videos. The algorithm focuses on reducing false alarms, such as lighting changes and shadows.
  • Some motion may not be detected due to technical challenges, for example in night vision videos or with semi-transparent or small objects.

.NET sample code

The following program shows how to:

  1. Create an asset and upload a media file into the asset.

  2. Create a job with a video motion detection task based on a configuration file that contains the following JSON preset:

            {
                "Version": "1.0",
                "Options": {
                    "SensitivityLevel": "medium",
                    "FrameSamplingValue": 1,
                    "DetectLightChange": "False",
                    "DetectionZones": [
                        [
                            {"x": 0, "y": 0},
                            {"x": 0.5, "y": 0},
                            {"x": 0, "y": 1}
                        ],
                        [
                            {"x": 0.3, "y": 0.3},
                            {"x": 0.55, "y": 0.3},
                            {"x": 0.8, "y": 0.3},
                            {"x": 0.8, "y": 0.55},
                            {"x": 0.8, "y": 0.8},
                            {"x": 0.55, "y": 0.8},
                            {"x": 0.3, "y": 0.8},
                            {"x": 0.3, "y": 0.55}
                        ]
                    ]
                }
            }

  3. Download the output JSON files.

Create and configure a Visual Studio project

Set up your development environment and populate the app.config file with connection information, as described in Media Services development with .NET.


using System;
using System.Configuration;
using System.IO;
using System.Linq;
using Microsoft.WindowsAzure.MediaServices.Client;
using System.Threading;
using System.Threading.Tasks;

namespace VideoMotionDetection
{
    class Program
    {
        // Read values from the App.config file.
        // The key names below must match the appSettings keys in your app.config.
        private static readonly string _AADTenantDomain =
            ConfigurationManager.AppSettings["AMSAADTenantDomain"];
        private static readonly string _RESTAPIEndpoint =
            ConfigurationManager.AppSettings["AMSRESTAPIEndpoint"];
        private static readonly string _AMSClientId =
            ConfigurationManager.AppSettings["AMSClientId"];
        private static readonly string _AMSClientSecret =
            ConfigurationManager.AppSettings["AMSClientSecret"];

        // Field for service context.
        private static CloudMediaContext _context = null;

        static void Main(string[] args)
        {
            // Pick the AzureEnvironments value that matches the cloud your account lives in.
            AzureAdTokenCredentials tokenCredentials =
                new AzureAdTokenCredentials(_AADTenantDomain,
                    new AzureAdClientSymmetricKey(_AMSClientId, _AMSClientSecret),
                    AzureEnvironments.AzureCloudEnvironment);

            var tokenProvider = new AzureAdTokenProvider(tokenCredentials);

            _context = new CloudMediaContext(new Uri(_RESTAPIEndpoint), tokenProvider);

            // Run the VideoMotionDetection job.
            // The second argument is the path to the JSON preset file; the file name
            // used here is an assumption, so point it at your own configuration file.
            var asset = RunVideoMotionDetectionJob(@"C:\supportFiles\VideoMotionDetection\BigBuckBunny.mp4",
                @"C:\supportFiles\VideoMotionDetection\config.json");

            // Download the job output asset (null means the job ended in an error state).
            if (asset != null)
            {
                DownloadAsset(asset, @"C:\supportFiles\VideoMotionDetection\Output");
            }
        }

        static IAsset RunVideoMotionDetectionJob(string inputMediaFilePath, string configurationFile)
        {
            // Create an asset and upload the input media file to storage.
            IAsset asset = CreateAssetAndUploadSingleFile(inputMediaFilePath,
                "My Video Motion Detection Input Asset",
                AssetCreationOptions.None);

            // Declare a new job.
            IJob job = _context.Jobs.Create("My Video Motion Detection Job");

            // Get a reference to Azure Media Motion Detector.
            string MediaProcessorName = "Azure Media Motion Detector";

            var processor = GetLatestMediaProcessorByName(MediaProcessorName);

            // Read configuration from the specified file.
            string configuration = File.ReadAllText(configurationFile);

            // Create a task with the encoding details, using a string preset.
            ITask task = job.Tasks.AddNew("My Video Motion Detection Task",
                processor,
                configuration,
                TaskOptions.None);

            // Specify the input asset.
            task.InputAssets.Add(asset);

            // Add an output asset to contain the results of the job.
            task.OutputAssets.AddNew("My Video Motion Detection Output Asset", AssetCreationOptions.None);

            // Use the following event handler to check job progress.
            job.StateChanged += new EventHandler<JobStateChangedEventArgs>(StateChanged);

            // Launch the job.
            job.Submit();

            // Check job execution and wait for job to finish.
            Task progressJobTask = job.GetExecutionProgressTask(CancellationToken.None);

            progressJobTask.Wait();

            // If job state is Error, the event handling
            // method for job progress should log errors.  Here we check
            // for error state and exit if needed.
            if (job.State == JobState.Error)
            {
                ErrorDetail error = job.Tasks.First().ErrorDetails.First();
                Console.WriteLine(string.Format("Error: {0}. {1}",
                    error.Code,
                    error.Message));
                return null;
            }

            return job.OutputMediaAssets[0];
        }

        static IAsset CreateAssetAndUploadSingleFile(string filePath, string assetName, AssetCreationOptions options)
        {
            IAsset asset = _context.Assets.Create(assetName, options);

            var assetFile = asset.AssetFiles.Create(Path.GetFileName(filePath));
            assetFile.Upload(filePath);

            return asset;
        }

        static void DownloadAsset(IAsset asset, string outputDirectory)
        {
            foreach (IAssetFile file in asset.AssetFiles)
            {
                file.Download(Path.Combine(outputDirectory, file.Name));
            }
        }

        static IMediaProcessor GetLatestMediaProcessorByName(string mediaProcessorName)
        {
            // Materialize the query first so the version ordering runs in memory.
            var processor = _context.MediaProcessors
                .Where(p => p.Name == mediaProcessorName)
                .ToList()
                .OrderBy(p => new Version(p.Version))
                .LastOrDefault();

            if (processor == null)
                throw new ArgumentException(string.Format("Unknown media processor: {0}",
                    mediaProcessorName));

            return processor;
        }

        static private void StateChanged(object sender, JobStateChangedEventArgs e)
        {
            Console.WriteLine("Job state changed event:");
            Console.WriteLine("  Previous state: " + e.PreviousState);
            Console.WriteLine("  Current state: " + e.CurrentState);

            switch (e.CurrentState)
            {
                case JobState.Finished:
                    Console.WriteLine("Job is finished.");
                    break;
                case JobState.Canceling:
                case JobState.Queued:
                case JobState.Scheduled:
                case JobState.Processing:
                    Console.WriteLine("Please wait...\n");
                    break;
                case JobState.Canceled:
                case JobState.Error:
                    // Cast sender as a job.
                    IJob job = (IJob)sender;
                    // Display or log error details as needed.
                    // LogJobStop(job.Id);
                    break;
                default:
                    break;
            }
        }
    }
}

Media Services learning paths

Media Services v3 (latest)

Check out the latest version of Azure Media Services!

Media Services v2 (legacy)

Azure Media Services Motion Detector blog

Azure Media Services Analytics Overview