Built-in skills for text and image processing during indexing (Azure Cognitive Search)
This article describes the skills provided with Azure Cognitive Search that you can include in a skillset to extract content and structure from raw unstructured text and image files. A skill is an atomic operation that transforms content in some way. Often, it is an operation that recognizes or extracts text, but it can also be a utility skill that reshapes the enrichments that are already created. Typically, the output is text-based so that it can be used in full text queries.
Built-in skills
Built-in skills are based on pretrained models from Microsoft, which means you cannot train the model using your own training data. Skills that call the Cognitive Services APIs have a dependency on those services and are billed at the Cognitive Services pay-as-you-go price when you attach a resource. Other skills are metered by Azure Cognitive Search, or are utility skills that are available at no charge.
The following table enumerates and describes the built-in skills.
OData type | Description | Metered by |
---|---|---|
Microsoft.Skills.Text.CustomEntityLookupSkill | Looks for text from a custom, user-defined list of words and phrases. | Azure Cognitive Search (pricing) |
Microsoft.Skills.Text.KeyPhraseExtractionSkill | This skill uses a pretrained model to detect important phrases based on term placement, linguistic rules, proximity to other terms, and how unusual the term is within the source data. | Cognitive Services (pricing) |
Microsoft.Skills.Text.LanguageDetectionSkill | This skill uses a pretrained model to detect which language is used (one language ID per document). When multiple languages are used within the same text segments, the output is the LCID of the predominantly used language. | Cognitive Services (pricing) |
Microsoft.Skills.Text.MergeSkill | Consolidates text from a collection of fields into a single field. | Not applicable |
Microsoft.Skills.Text.V3.EntityLinkingSkill | This skill uses a pretrained model to generate links for recognized entities to articles in Wikipedia. | Cognitive Services (pricing) |
Microsoft.Skills.Text.V3.EntityRecognitionSkill | This skill uses a pretrained model to establish entities for a fixed set of categories: "Person", "Location", "Organization", "Quantity", "DateTime", "URL", "Email", "PersonType", "Event", "Product", "Skill", "Address", "Phone Number", and "IP Address". | Cognitive Services (pricing) |
Microsoft.Skills.Text.PIIDetectionSkill | This skill uses a pretrained model to extract personal information from a given text. The skill also gives various options for masking the detected personal information entities in the text. | Cognitive Services (pricing) |
Microsoft.Skills.Text.V3.SentimentSkill | This skill uses a pretrained model to assign sentiment labels (such as "negative", "neutral", and "positive") based on the highest confidence score found by the service at the sentence and document level, on a record-by-record basis. | Cognitive Services (pricing) |
Microsoft.Skills.Text.SplitSkill | Splits text into pages so that you can enrich or augment content incrementally. | Not applicable |
Microsoft.Skills.Text.TranslationSkill | This skill uses a pretrained model to translate the input text into a variety of languages for normalization or localization use cases. | Cognitive Services (pricing) |
Microsoft.Skills.Vision.ImageAnalysisSkill | This skill uses an image detection algorithm to identify the content of an image and generate a text description. | Cognitive Services (pricing) |
Microsoft.Skills.Vision.OcrSkill | Optical character recognition. | Cognitive Services (pricing) |
Microsoft.Skills.Util.ConditionalSkill | Allows filtering, assigning a default value, and merging data based on a condition. | Not applicable |
Microsoft.Skills.Util.DocumentExtractionSkill | Extracts content from a file within the enrichment pipeline. | Azure Cognitive Search (pricing) |
Microsoft.Skills.Util.ShaperSkill | Maps output to a complex type (a multi-part data type, which might be used for a full name, a multi-line address, or a combination of last name and a personal identifier). | Not applicable |
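To show how the OData types in the table are referenced, the following Python sketch assembles a small skillset definition as a plain dictionary and prints it as JSON. It assumes the usual skill shape (`@odata.type` prefixed with `#`, plus `context`, `inputs`, and `outputs`) and that the source text is available at `/document/content`. The skillset name and the `make_skill` helper are hypothetical, and the sketch does not attach a Cognitive Services resource or send anything to a search service.

```python
import json


def make_skill(odata_type, inputs, outputs, context="/document"):
    """Build one skill definition as a plain dict (hypothetical helper)."""
    return {
        # In a skillset definition, the OData type is written with a leading "#".
        "@odata.type": "#" + odata_type,
        "context": context,
        "inputs": [{"name": name, "source": source} for name, source in inputs.items()],
        "outputs": [{"name": name, "targetName": target} for name, target in outputs.items()],
    }


skillset = {
    "name": "demo-skillset",  # hypothetical name
    "description": "Detect the language, then extract key phrases.",
    "skills": [
        make_skill(
            "Microsoft.Skills.Text.LanguageDetectionSkill",
            inputs={"text": "/document/content"},
            outputs={"languageCode": "languageCode"},
        ),
        make_skill(
            "Microsoft.Skills.Text.KeyPhraseExtractionSkill",
            # The second skill consumes the language code produced by the first.
            inputs={"text": "/document/content", "languageCode": "/document/languageCode"},
            outputs={"keyPhrases": "keyPhrases"},
        ),
    ],
}

print(json.dumps(skillset, indent=2))
```

Because the key phrase skill reads `/document/languageCode`, the output of one skill becomes the input of the next, which is how a skillset chains enrichments.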
Custom skills
Custom skills are modules that you design, develop, and deploy to the web. You can then call the module from within a skillset as a custom skill.
OData type | Description | Metered by |
---|---|---|
Microsoft.Skills.Custom.WebApiSkill | Allows extensibility of an AI enrichment pipeline by making an HTTP call into a custom Web API. | None unless your solution uses a metered Azure service |
For guidance on creating a custom skill, see Define a custom interface and Example: Creating a custom skill for AI enrichment.
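As a rough illustration of what a custom skill does, the following Python sketch processes a request in the `values` / `recordId` / `data` envelope that a custom Web API skill exchanges with the pipeline, and returns one result per record. The word-count logic, the `text` and `wordCount` field names, and the example URI are all hypothetical. A real skill would run this logic behind an HTTP endpoint, which is omitted here.

```python
def handle_request(body):
    """Return one output record for each input record, matched by recordId."""
    results = []
    for record in body.get("values", []):
        out = {"recordId": record["recordId"], "data": {}, "errors": [], "warnings": []}
        text = record.get("data", {}).get("text")  # "text" is an assumed input name
        if isinstance(text, str):
            # Hypothetical enrichment: count whitespace-separated words.
            out["data"]["wordCount"] = len(text.split())
        else:
            out["errors"].append({"message": "Missing or non-string 'text' input."})
        results.append(out)
    return {"values": results}


# A matching skill definition, assuming the function is hosted at this placeholder URI.
web_api_skill = {
    "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "uri": "https://example.com/api/word-count",  # hypothetical endpoint
    "httpMethod": "POST",
    "context": "/document",
    "inputs": [{"name": "text", "source": "/document/content"}],
    "outputs": [{"name": "wordCount", "targetName": "wordCount"}],
}

response = handle_request({
    "values": [
        {"recordId": "1", "data": {"text": "hello enrichment pipeline"}},
        {"recordId": "2", "data": {}},
    ]
})
print(response)
```

Reporting a per-record error, rather than failing the whole request, lets the remaining records in the batch be processed.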