
Getting started with Azure Data Catalog

Author: culuo4781 Date: 2020-07-18 07:26:48

This article talks about Azure Data Catalog and how data professionals can use it to locate, understand and consume data sources.


As the name suggests, it is a service in Azure that helps users organize, discover and register data sources. This fully managed cloud service acts as a central shared place in an organization for developers, analysts, data scientists and users to contribute their knowledge and help to locate, understand and consume data.


Data Catalog in Azure does not move your data; the data stays in its existing location. Instead, a copy of its structural and descriptive metadata is added to the Data Catalog, along with a reference to the data-source location. This metadata is indexed, which makes the data easy to search.
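To make the idea concrete, here is a minimal in-memory sketch of a catalog that keeps only metadata plus a pointer to the source, and indexes that metadata for search. It is a toy model, not the Azure Data Catalog API: the class names `AssetMetadata` and `ToyCatalog`, the location string format, and the sample columns are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class AssetMetadata:
    """Structural and descriptive metadata for one registered asset."""
    name: str
    source_location: str  # reference to where the data lives (hypothetical format)
    columns: list[str] = field(default_factory=list)
    description: str = ""


class ToyCatalog:
    """Hypothetical catalog: it stores metadata only, never the rows themselves."""

    def __init__(self) -> None:
        self.assets: dict[str, AssetMetadata] = {}
        self.index: dict[str, set[str]] = defaultdict(set)

    def register(self, asset: AssetMetadata) -> None:
        # Keep a copy of the metadata and index every word found in it.
        self.assets[asset.name] = asset
        text = " ".join([asset.name, asset.description, *asset.columns])
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            self.index[token].add(asset.name)

    def search(self, term: str) -> list[str]:
        # Look the term up in the index; the source system is never queried.
        return sorted(self.index.get(term.lower(), set()))


catalog = ToyCatalog()
catalog.register(
    AssetMetadata(
        name="Customer",
        source_location="sqlserver://example-server/mysqldb/SalesLT.Customer",
        columns=["CustomerID", "EmailAddress"],
        description="Customer contact details",
    )
)
print(catalog.search("contact"))
```

The point of the sketch is the separation of concerns: a search is answered entirely from the index, and the consumer follows `source_location` to reach the data itself.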

Why do we need an Azure Data Catalog?

  • Companies are generating and storing boatloads of data every day, and with this fast-growing data, discovering data sources is challenging for both data producers and data consumers

  • It becomes highly complex and time-consuming to create and maintain documentation of large data sources

  • There is a certain amount of tribal knowledge (information that is known within a company) in any organization, and it can be challenging for a newcomer to track all of it down. Azure Data Catalog addresses this by providing a platform for capturing information about the data, which makes data sources easier to discover and understand

  • With Data Catalog, developers no longer have to spend time hunting for data with complex queries


Azure Data Catalog process involves:

The steps below are usually followed when working with Data Catalog:


  1. Create a data catalog – this is the first step to provision a Data Catalog

  2. Register and annotate assets – Users can register their data sources, and also add annotations with tags, documents and understandable descriptions

  3. Discover and consume assets – Users can easily search and filter assets with indexed metadata

  4. Connect to Data – This lets you connect and pull data into various tools like Excel, Power BI, SSDT etc.


Important points to remember while working with Azure Data Catalog

To set up a Data Catalog, you must be the owner or co-owner of an Azure subscription.


Only one Data Catalog is supported per organization (i.e. per tenant) and you cannot have additional catalogs even if you have multiple subscriptions.


Data Catalog only supports work or school accounts, so in order to create a data catalog in Azure, you need to have a work or school account.


Without any further delay, let’s see Azure Data Catalog in action –


This article assumes that you have basic knowledge of Azure, are familiar with Azure SQL Database, and have an Azure subscription.

How to create an Azure Data Catalog?

You can create a Data Catalog through the Azure portal like any other Azure resource. Go to the portal, search for Data Catalog, and enter a name for your data catalog. You also have to specify the subscription, the location for the catalog, and the pricing tier (Free or Standard edition). Then select Create. Finally, go to the Azure Data Catalog home page and select Publish Data.

Alternatively, you can go to the Azure Data Catalog provision page and enter the Data Catalog Name, the subscription you want to use, and the location for the catalog.

Scroll down a little to select the pricing. The service is offered in two editions; for this demo, I am selecting the FREE EDITION.

I am keeping the defaults for the remaining settings. Your ID is automatically added as a catalog user and a catalog administrator, and you can add further catalog users and catalog administrators. Finally, click Create Catalog to create a Data Catalog named OurSalesData in Azure.

The Data Catalog has been created, and you can view it in the Azure portal. The resource group DataCatalogs-EastUS is created automatically, and the catalog resides in it. Note that I already have SQL Server and SQL database resources in my account.

Click the Data Catalog to view its properties, which you can also edit.

Launch the desktop application to register your data sources in Azure Data Catalog

Back on the Data Catalog page, clicking the Create Catalog button above takes you to the next screen.

There are two ways to register (publish) your data sources in the Data Catalog: Launch Application and Create Manual Entry. I personally avoid the “Create Manual Entry” option, as it is tedious and time-consuming for larger data sources. The “Launch Application” option is the better choice, as it is just a ClickOnce application.

Install this application.

Once the application is installed, you are taken to the Sign in page. Sign in with the same credentials that you used to access the catalog in the portal.

Selecting a data source

Let’s head over to select a data source in order to register it in your Data Catalog.


You can register many kinds of data sources in the Data Catalog, such as SQL Server, Reporting Services, HDFS, Hive, HANA database and Azure Data Lake Analytics. Since I already have a SQL database in my account, I will go with SQL Server as the data source. Click SQL Server and select NEXT.

Provide the SQL Server name, the authentication type, and the database that you want to register (mysqldb, in this case), then click CONNECT.
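If the connection fails, it can help to check the same three values outside the registration tool. The sketch below only assembles an ODBC-style connection string from them; it assumes SQL authentication and the "ODBC Driver 18 for SQL Server" driver, and the server name, user and password shown are placeholders rather than values from this article. Only the database name mysqldb comes from the walkthrough. The helper name `build_odbc_connection_string` is made up for this example.

```python
def build_odbc_connection_string(server: str, database: str, user: str, password: str) -> str:
    """Assemble an ODBC connection string for an Azure SQL database (SQL authentication)."""
    parts = {
        "Driver": "{ODBC Driver 18 for SQL Server}",  # assumes this driver is installed
        "Server": f"tcp:{server},1433",               # 1433 is the default SQL Server port
        "Database": database,
        "Uid": user,
        "Pwd": password,
        "Encrypt": "yes",
        "TrustServerCertificate": "no",
        "Connection Timeout": "30",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# Placeholder values: replace with your own server, user and password.
conn_str = build_odbc_connection_string(
    server="example-server.database.windows.net",
    database="mysqldb",
    user="demo_user",
    password="<your-password>",
)
print(conn_str)
```

A string like this can be handed to any ODBC client to confirm that the server name, credentials and database are valid before registering the source. Passwords containing `;` or braces would need extra quoting, which this sketch does not handle.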

Register a data source in Azure Data Catalog

Expand your database and select SalesLT. All of its objects are listed under Available objects, from which you choose the ones to register in your data catalog. I selected all of them using the double right arrow (>>). Also select the Include Preview option so that you can preview sample data later.

The objects are now registered, and you can register more with the ‘register more objects’ option. For now, let’s click VIEW PORTAL to discover our data.

How to discover and annotate data sources in an Azure Data Catalog

Suppose we want to find information related to orders in the database. Type ‘order’ in the search bar, and you will find two SQL Server tables related to orders.
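The result is easy to reproduce offline with a simplified name match. The sketch below assumes the registered SalesLT objects are the tables of the standard AdventureWorksLT sample schema, which the article does not state explicitly, and it uses a plain case-insensitive substring match rather than the catalog's real search engine. The function name `find_assets` is hypothetical.

```python
from __future__ import annotations

# Table names of the AdventureWorksLT sample schema (an assumption about
# what was registered from SalesLT in this walkthrough).
SALESLT_TABLES = [
    "Address",
    "Customer",
    "CustomerAddress",
    "Product",
    "ProductCategory",
    "ProductDescription",
    "ProductModel",
    "ProductModelProductDescription",
    "SalesOrderDetail",
    "SalesOrderHeader",
]


def find_assets(term: str, names: list[str]) -> list[str]:
    """Return the names that contain the term, ignoring case."""
    needle = term.casefold()
    return [name for name in names if needle in name.casefold()]


matches = find_assets("order", SALESLT_TABLES)
print(matches)  # ['SalesOrderDetail', 'SalesOrderHeader']
```

Under that assumption, the two matches are the order header and order detail tables, which is consistent with the two results the search returns in the portal.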

You can further annotate this data asset in the Properties tab by providing a friendly name (I entered OrdersIn2020), a description, the names of experts, and so on.

Click on the Preview icon to view a sample of the data it contains.



We can also add meaningful descriptions and tags to each column of the table in the Columns tab. This helps us not only to know where an attribute is located but also to understand what it represents.
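One way to picture these annotations is as a small record layered on top of the registered asset. The following is an illustrative data model only, not the service's own schema: the class and field names are hypothetical, and the choice of SalesOrderHeader, its OrderDate column and the expert address are assumptions made for the example. Only the friendly name OrdersIn2020 comes from the walkthrough.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ColumnAnnotation:
    """Description and tags attached to a single column."""
    description: str = ""
    tags: set[str] = field(default_factory=set)


@dataclass
class AssetAnnotation:
    """Annotations a user adds on top of a registered asset."""
    source_name: str  # name as registered from the data source
    friendly_name: str = ""
    description: str = ""
    experts: list[str] = field(default_factory=list)
    columns: dict[str, ColumnAnnotation] = field(default_factory=dict)

    def display_name(self) -> str:
        # Prefer the friendly name when one has been provided.
        return self.friendly_name or self.source_name

    def tag_column(self, column: str, tag: str, description: str = "") -> None:
        # Create the column entry on first use, then add the tag to it.
        note = self.columns.setdefault(column, ColumnAnnotation())
        note.tags.add(tag)
        if description:
            note.description = description


asset = AssetAnnotation(
    source_name="SalesOrderHeader",
    friendly_name="OrdersIn2020",
    experts=["expert@example.com"],
)
asset.tag_column("OrderDate", "date", "Date the order was placed")
print(asset.display_name())
```

Keeping annotations separate from the source name mirrors what the portal does: the underlying table keeps its original name, while consumers see the friendlier label and the column notes.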

At times, tags and descriptions are not enough to provide a clear understanding of the data asset. To make it more understandable for data consumers, you can add documentation related to this data asset in the Documentation tab as shown below. This will help provide a complete and detailed explanation of data assets.


How to connect to data sources in an Azure Data Catalog

Once we are done registering, locating and annotating data, we can also connect to the data source through the Data Catalog service, which offers several options for doing so. Click the ‘Open In …’ icon in the horizontal tile, and you will find that you can open the data source in Excel, SSDT and Power BI.

To connect this data source in Power BI Desktop (provided Power BI Desktop is installed on the client computer), click the Power BI Desktop option from the contextual menu.


Data users can now view, analyze and visualize their data in the Power BI Desktop app as shown below.


You can also consult the Microsoft documentation to learn more about the Data Catalog service in Azure.

Conclusion

We discussed important facts about Azure Data Catalog in this short article. Along the way, we also saw how this tool makes users' lives easier by helping them discover, understand and consume data sources. If you have any questions, please feel free to ask in the comments section below.

Translated from: https://www.sqlshack.com/getting-started-with-azure-data-catalog/
