Browse and search relevant data in the Azure Purview Data Catalog
Published Dec 01 2021 09:00 AM
Microsoft

After data is scanned and ingested into the Azure Purview Data Map, data consumers need to easily find the data they need for their analytics or governance workloads. Data discovery can be time-consuming because a user might not know where to find that data. Even after finding it, there might be doubts about whether to trust it and take a dependency on it.

 

Through search and browse, the Azure Purview Data Catalog allows for quickly finding data that matters.

 

Searching the Data Catalog
Regardless of the page a user is on, the search bar can be quickly accessed from the top bar of the Purview Studio UX. In the Data Catalog home page, the search bar is in the center of the screen.

DanielPerlovsky_0-1638306973387.png

 

 

Once you click the search bar, your search history and the assets you recently accessed in the Data Catalog are presented. This lets you quickly pick up where you left off in a previous data exploration.

DanielPerlovsky_1-1638306973403.png

 

 

As you enter search keywords, Purview dynamically suggests assets and searches that might fit your needs.

DanielPerlovsky_2-1638306973414.png

 

 

Viewing and filtering search results
Once a data reader enters a search, Purview returns a list of data assets that match the keywords.

DanielPerlovsky_3-1638306973431.png

 

 

 

The Purview relevance engine sorts through all the matches and ranks them by how useful it estimates they are to the user. For example, a table that matches multiple keywords, and to which a data steward has assigned glossary terms and a description, is likely to interest a data consumer more than an unannotated folder. A large set of factors goes into an asset’s relevance score, and the Purview search team is constantly tuning the relevance engine so that the top search results have the most value to the user.
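To make the idea of annotation-aware ranking concrete, here is a toy sketch in Python. It is not Purview's relevance algorithm, whose factors and weights are not described here; the `Asset` class, the `toy_relevance` function, and the 0.5 boosts are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical, simplified stand-in for a catalog asset."""
    name: str
    description: str = ""
    glossary_terms: list = field(default_factory=list)


def toy_relevance(asset, keywords):
    """Toy score: one point per matching keyword, plus small boosts
    for curation (a description and each glossary term).
    The weights are arbitrary assumptions, not Purview's."""
    text = f"{asset.name} {asset.description}".lower()
    hits = sum(1 for kw in keywords if kw.lower() in text)
    score = float(hits)
    if asset.description:
        score += 0.5
    score += 0.5 * len(asset.glossary_terms)
    return score


def rank(assets, keywords):
    """Return assets ordered from most to least relevant."""
    return sorted(assets, key=lambda a: toy_relevance(a, keywords), reverse=True)


assets = [
    Asset("sales_raw"),  # unannotated folder
    Asset("SalesOrders", "Curated sales orders table", ["Sales", "Order"]),
]
ranked = rank(assets, ["sales", "orders"])
print([a.name for a in ranked])
```

In this sketch the annotated table outranks the unannotated folder because it matches both keywords and carries a description and glossary terms.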

 

If the top results don’t include the assets you are looking for, you can use the facets on the left-hand side to filter by business metadata such as glossary terms, classifications, and the containing collection. If you are interested in a particular data source type, such as Azure Data Lake Storage Gen2 or Azure SQL Database, the source type pill filter can narrow down the search.

For certain annotations, select the ellipsis to choose between an AND condition and an OR condition.
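The same facet logic can be expressed programmatically. The sketch below only builds a request body and does not call any service. The field names (`keywords`, `limit`, `filter`, `and`, `or`, `classification`, `assetType`) are modeled on the Purview discovery query REST API as best understood and should be checked against the current API reference; the helper functions and the classification and asset type values are hypothetical placeholders.

```python
import json


def facet_filter(field_name, values, operator="or"):
    """Combine several values of one facet with an AND or an OR condition."""
    if operator not in ("and", "or"):
        raise ValueError("operator must be 'and' or 'or'")
    return {operator: [{field_name: value} for value in values]}


def build_search_body(keywords, facet_filters, limit=10):
    """Build a search body in which the separate facets are ANDed together.
    The body shape is an assumption; verify it before sending a request."""
    return {
        "keywords": keywords,
        "limit": limit,
        "filter": {"and": list(facet_filters)},
    }


body = build_search_body(
    "customer",
    [
        # Assets must carry BOTH classifications (placeholder names).
        facet_filter(
            "classification",
            ["MICROSOFT.PERSONAL.EMAIL", "MICROSOFT.PERSONAL.NAME"],
            operator="and",
        ),
        # Assets may come from EITHER source type (placeholder values).
        facet_filter(
            "assetType",
            ["Azure SQL Database", "Azure Data Lake Storage Gen2"],
        ),
    ],
)
print(json.dumps(body, indent=2))
```

Keeping the AND/OR choice per facet mirrors the ellipsis menu in the UI: each facet decides how its own values combine, while the facets themselves narrow the result set together.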

DanielPerlovsky_4-1638306973436.png

 

 

Once you find the asset you are looking for, you can select it to view additional details such as schema, lineage, and a detailed classification list.

DanielPerlovsky_5-1638306973450.png

 

 

Browsing the catalog
Searching is great when you know what you’re looking for, but there are times when data consumers want to explore the data available to them. The Azure Purview Data Catalog offers a browse experience that lets users explore that data either by collection or by traversing the hierarchy of each data source in the catalog.

 

To access the browse experience, select Browse assets from the Data Catalog home page.

DanielPerlovsky_6-1638306973466.png

 

 

Browse By collection
Browse By collection lets you explore the collections for which you are a data reader or curator.

DanielPerlovsky_7-1638306973472.png

 

 

Once a collection is selected, a list of the assets in that collection is presented with the same facets and filters available in search. Because a collection can have thousands of assets, browse also uses the Purview search relevance engine to boost the most important assets to the top.

DanielPerlovsky_8-1638306973480.png

 

 

If the selected collection doesn’t contain the desired data, users can easily navigate to related collections or go back and view the entire collections tree.

DanielPerlovsky_9-1638306973482.png

 

 

Browse By source type
Browse By source type allows data consumers to explore the hierarchies of data sources using an explorer view. Select a source type to see the list of scanned sources.

For example, you might have a dataset called DateDimension under a folder called Dimensions in Azure Data Lake Storage Gen2. The Browse By source type experience lets users navigate to the ADLS Gen2 storage account, then browse through service > container > folder(s) to reach the Dimensions folder and see the DateDimension table.
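The explorer-style navigation described above can be pictured as walking a nested tree one level at a time. The following is a minimal sketch using plain Python dictionaries. The account, service, and container names are hypothetical, and the structure is a mental model rather than how the Data Map stores assets.

```python
def browse(tree, path):
    """Return the names visible at the given level of the hierarchy.
    Raises KeyError if any part of the path does not exist."""
    node = tree
    for part in path:
        node = node[part]
    return sorted(node)


# Hypothetical ADLS Gen2 hierarchy:
# storage account > service > container > folder > dataset
catalog = {
    "contosolake": {
        "dfs": {
            "warehouse": {
                "Dimensions": {"DateDimension": {}, "ProductDimension": {}},
                "Facts": {"SalesFact": {}},
            }
        }
    }
}

# Drill down one level at a time, as in the explorer view.
print(browse(catalog, []))
print(browse(catalog, ["contosolake", "dfs", "warehouse"]))
print(browse(catalog, ["contosolake", "dfs", "warehouse", "Dimensions"]))
```

Each call corresponds to one click in the explorer view: the path is the breadcrumb so far, and the result is the list of children shown at that level.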

DanielPerlovsky_10-1638306973496.png

 

 

Select a source type to see the different instances that have been ingested into the Data Map. For example, the image below shows the Azure SQL Databases that are part of the Data Catalog.

DanielPerlovsky_11-1638306973500.png

 

 

Select the source of interest and explore the different data assets it contains.

DanielPerlovsky_12-1638306973513.png

 

 

As with search, once you find the asset you need, open its asset details page to learn more and decide whether the dataset can help you accomplish your data task.

 

Get started today!

  • Read documentation on how to search the Azure Purview Data Catalog
  • Read documentation on how to browse the Azure Purview Data Catalog
  • Read documentation on how to create and manage collections in Azure Purview



 

Version history
Last update:
‎Sep 21 2022 03:24 PM