Team Data Science Process group manager tasks

This article describes the tasks that a group manager completes for a data science organization. The group manager manages the entire data science unit in an enterprise. A data science unit may have several teams, each working on many data science projects in distinct business verticals. The group manager's objective is to establish a collaborative group environment that standardizes on the Team Data Science Process (TDSP). For an outline of all the personnel roles and associated tasks handled by a data science team standardizing on the TDSP, see Team Data Science Process roles and tasks.

The following diagram shows the six main group manager setup tasks. Group managers may delegate their tasks to surrogates, but the tasks associated with the role don't change.

(Diagram: Group manager tasks)

  1. Set up an Azure DevOps organization for the group.
  2. Create the default GroupCommon project in the Azure DevOps organization.
  3. Create the GroupProjectTemplate repository in Azure Repos.
  4. Create the GroupUtilities repository in Azure Repos.
  5. Import the contents of the Microsoft TDSP team's ProjectTemplate and Utilities repositories into the group common repositories.
  6. Set up membership and permissions for team members to access the group.

The following tutorial walks through the steps in detail.

Note

This article uses Azure DevOps to set up a TDSP group environment, because that is how TDSP is implemented at Microsoft. If your group uses other code hosting or development platforms, the group manager's tasks are the same, but the way to complete them may be different.

Create an organization and project in Azure DevOps

  1. Go to visualstudio.microsoft.com, select Sign in at upper right, and sign in to your Microsoft account.

    (Screenshot: Sign in to your Microsoft account)

    If you don't have a Microsoft account, select Sign up now, create a Microsoft account, and sign in using this account. If your organization has a Visual Studio subscription, sign in with the credentials for that subscription.

  2. After you sign in, at upper right on the Azure DevOps page, select Create new organization.

    (Screenshot: Create new organization)

  3. If you're prompted to agree to the Terms of Service, Privacy Statement, and Code of Conduct, select Continue.

  4. In the signup dialog, name your Azure DevOps organization and accept the host region assignment, or drop down and select a different region. Then select Continue.

  5. Under Create a project to get started, enter GroupCommon, and then select Create project.

    (Screenshot: Create project)

The GroupCommon project Summary page opens. The page URL is https://<servername>/<organization-name>/GroupCommon.

(Screenshot: Project summary page)

Set up the group common repositories

Azure Repos hosts the following types of repositories for your group:

  • Group common repositories: General-purpose repositories that multiple teams within a data science unit can adopt for many data science projects.
  • Team repositories: Repositories for specific teams within a data science unit. These repositories are specific to a team's needs and may be used for multiple projects within that team, but are not general enough to be used across multiple teams within a data science unit.
  • Project repositories: Repositories for specific projects. Such repositories may not be general enough for multiple projects within a team, or for other teams in a data science unit.

To set up the group common repositories in your project, you:

  • Rename the default GroupCommon repository to GroupProjectTemplate
  • Create a new GroupUtilities repository

Rename the default project repository to GroupProjectTemplate

To rename the default GroupCommon project repository to GroupProjectTemplate:

  1. On the GroupCommon project Summary page, select Repos. This action takes you to the default GroupCommon repository of the GroupCommon project, which is currently empty.

  2. At the top of the page, drop down the arrow next to GroupCommon and select Manage repositories.

    (Screenshot: Manage repositories)

  3. On the Project Settings page, select the ... next to GroupCommon, and then select Rename repository.

    (Screenshot: Select ... and then Rename repository)

  4. In the Rename the GroupCommon repository popup, enter GroupProjectTemplate, and then select Rename.

    (Screenshot: Rename repository)

Create the GroupUtilities repository

To create the GroupUtilities repository:

  1. On the GroupCommon project Summary page, select Repos.

  2. At the top of the page, drop down the arrow next to GroupProjectTemplate and select New repository.

    (Screenshot: Select New repository)

  3. In the Create a new repository dialog, select Git as the Type, enter GroupUtilities as the Repository name, and then select Create.

    (Screenshot: Create the GroupUtilities repository)

  4. On the Project Settings page, select Repositories under Repos in the left navigation to see the two group repositories: GroupProjectTemplate and GroupUtilities.

    (Screenshot: Two group repositories)

Import the Microsoft TDSP team repositories

In this part of the tutorial, you import the contents of the ProjectTemplate and Utilities repositories managed by the Microsoft TDSP team into your GroupProjectTemplate and GroupUtilities repositories.

To import the TDSP team repositories:

  1. From the GroupCommon project home page, select Repos in the left navigation. The default GroupProjectTemplate repo opens.

  2. On the GroupProjectTemplate is empty page, select Import.

    (Screenshot: Select Import)

  3. In the Import a Git repository dialog, select Git as the Source type, and enter https://github.com/Azure/Azure-TDSP-ProjectTemplate.git for the Clone URL. Then select Import. The contents of the Microsoft TDSP team ProjectTemplate repository are imported into your GroupProjectTemplate repository.

    (Screenshot: Import the Microsoft TDSP team repository)

  4. At the top of the Repos page, drop down and select the GroupUtilities repository.

  5. Repeat the import process to import the contents of the Microsoft TDSP team Utilities repository, https://github.com/Azure/Azure-TDSP-Utilities.git, into your GroupUtilities repository.

Each of your two group repositories now contains all the files, except those in the .git directory, from the Microsoft TDSP team's corresponding repository.
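The portal steps in the preceding sections (rename, create, import) could also be scripted. The following is a minimal sketch, assuming the Azure CLI with the azure-devops extension is installed and you have already signed in. The organization URL is a hypothetical placeholder, and the flags should be checked against the CLI help for your version. By default the script only prints the commands (a dry run) so that you can review them first.

```shell
#!/usr/bin/env bash
# Sketch only: scripts the repository setup with the Azure DevOps CLI.
# Assumes `az` with the azure-devops extension and a prior sign-in.
# DRY_RUN=1 (the default here) prints each command instead of running it.

ORG_URL="${ORG_URL:-https://dev.azure.com/DataScienceUnit}"  # hypothetical organization
PROJECT="GroupCommon"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

setup_group_repos() {
  # Rename the default repository, then create the second one.
  run az repos update --repository "$PROJECT" --name GroupProjectTemplate \
    --project "$PROJECT" --organization "$ORG_URL"
  run az repos create --name GroupUtilities \
    --project "$PROJECT" --organization "$ORG_URL"

  # Import the Microsoft TDSP team's repositories into the two group repositories.
  run az repos import create \
    --git-source-url https://github.com/Azure/Azure-TDSP-ProjectTemplate.git \
    --repository GroupProjectTemplate --project "$PROJECT" --organization "$ORG_URL"
  run az repos import create \
    --git-source-url https://github.com/Azure/Azure-TDSP-Utilities.git \
    --repository GroupUtilities --project "$PROJECT" --organization "$ORG_URL"
}

setup_group_repos
```

Setting DRY_RUN=0 before calling setup_group_repos runs the commands for real. Creating the organization itself is still done in the portal, as described earlier.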

Customize the contents of the group repositories

If you want to customize the contents of your group repositories to meet the specific needs of your group, you can do that now. You can modify the files, change the directory structure, or add files that your group has developed or that are helpful for your group.

Make changes in Azure Repos

To customize repository contents:

  1. On the GroupCommon project Summary page, select Repos.

  2. At the top of the page, select the repository you want to customize.

  3. In the repo directory structure, navigate to the folder or file you want to change.

    • To create new folders or files, select the arrow next to New.

      (Screenshot: Create a new file)

    • To upload files, select Upload file(s).

      (Screenshot: Upload files)

    • To edit existing files, navigate to the file and then select Edit.

      (Screenshot: Edit a file)

  4. After adding or editing files, select Commit.

    (Screenshot: Commit changes)

Make changes using your local machine or DSVM

If you want to make changes using your local machine or DSVM and push the changes up to the group repositories, make sure you have the prerequisites for working with Git and DSVMs:

  • An Azure subscription, if you want to create a DSVM.
  • Git installed on your machine. If you're using a DSVM, Git is pre-installed. Otherwise, see the Platforms and tools appendix.
  • If you want to use a DSVM, a Windows or Linux DSVM created and configured in Azure. For more information and instructions, see the Data Science Virtual Machine Documentation.
  • For a Windows DSVM, Git Credential Manager (GCM) installed on your machine. In the GCM README.md file, scroll down to the Download and Install section and select the latest installer. Download the .exe installer from the installer page and run it.
  • For a Linux DSVM, an SSH public key set up on your DSVM and added in Azure DevOps. For more information and instructions, see the Create SSH public key section in the Platforms and tools appendix.

First, copy or clone the repository to your local machine.

  1. On the GroupCommon project Summary page, select Repos, and at the top of the page, select the repository you want to clone.

  2. On the repo page, select Clone at upper right.

  3. In the Clone repository dialog, select HTTPS for an HTTPS connection, or SSH for an SSH connection, and copy the clone URL under Command line to your clipboard.

    (Screenshot: Clone repository)

  4. On your local machine, create the following directories:

    • For Windows: C:\GitRepos\GroupCommon
    • For Linux: $HOME/GitRepos/GroupCommon (under your home directory)
  5. Change to the directory you created.

  6. In Git Bash, run the command git clone <clone URL>.

    For example, either of the following commands clones the GroupUtilities repository to the GroupCommon directory on your local machine.

    HTTPS connection:

    git clone https://DataScienceUnit@dev.azure.com/DataScienceUnit/GroupCommon/_git/GroupUtilities
    

    SSH connection:

    git clone git@ssh.dev.azure.com:v3/DataScienceUnit/GroupCommon/GroupUtilities
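
Steps 4 through 6 can be sketched as a short Git Bash script. This is a sketch under assumptions: DataScienceUnit is the example organization name used above, and the base directory defaults to GitRepos under your home directory (override it with the BASE_DIR variable, which is introduced here for illustration). The clone command is printed rather than executed, so nothing is contacted until you run it yourself.

```shell
#!/usr/bin/env bash
# Sketch of steps 4-6: create the working directory, change into it,
# and assemble the HTTPS clone command.

ORG="DataScienceUnit"    # example organization name; replace with your own
PROJECT="GroupCommon"
REPO="GroupUtilities"
BASE_DIR="${BASE_DIR:-$HOME/GitRepos}"   # assumed base directory

WORK_DIR="$BASE_DIR/$PROJECT"
mkdir -p "$WORK_DIR"     # -p also creates GitRepos and is safe to re-run
cd "$WORK_DIR" || exit 1

CLONE_URL="https://${ORG}@dev.azure.com/${ORG}/${PROJECT}/_git/${REPO}"
echo "git clone $CLONE_URL"   # printed only; run it once your credentials are set up
```

Because the clone runs inside the GroupCommon directory, the repository lands in a GroupUtilities subdirectory there.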
    

After making whatever changes you want in the local clone of your repository, you can push the changes to the shared group common repositories.

Run the following Git Bash commands from your local GroupProjectTemplate or GroupUtilities directory.

git add .
git commit -m "push from local"
git push
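
To rehearse this add, commit, and push cycle without touching the shared repositories, the sketch below uses a throwaway local bare repository as a stand-in for the remote. Everything in it (the temporary directory, the example identity, the README.md edit) is illustrative and not part of TDSP. It uses git push origin HEAD rather than plain git push because a freshly cloned empty repository may not have an upstream branch configured yet.

```shell
#!/usr/bin/env bash
# Rehearsal sketch: a throwaway bare repository stands in for the shared
# group repository, so nothing here touches Azure Repos.

SANDBOX="$(mktemp -d)"
git init --quiet --bare "$SANDBOX/remote.git"
git clone --quiet "$SANDBOX/remote.git" "$SANDBOX/GroupUtilities" 2>/dev/null
cd "$SANDBOX/GroupUtilities" || exit 1

# Example identity, stored in this clone only (not in the global config).
git config user.name "Example User"
git config user.email "user@example.com"

echo "local change" > README.md    # illustrative edit
git add .

# Commit and push only when something is actually staged.
if git diff --cached --quiet; then
  echo "nothing to commit"
else
  git commit --quiet -m "push from local"
  git push --quiet origin HEAD
fi
```

The staged-changes check avoids the error that git commit reports when there is nothing to commit.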

Note

If this is the first time you commit to a Git repository, you may need to configure the global parameters user.name and user.email before you run the git commit command. Run the following two commands:

git config --global user.name "<your name>"

git config --global user.email "<your email address>"

If you're committing to several Git repositories, use the same name and email address for all of them. Using the same name and email address is convenient when building Power BI dashboards to track your Git activities in multiple repositories.
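
Before the first commit, it can help to check whether a global identity is already configured. The following is a minimal sketch; check_git_identity is a hypothetical helper name, and it only reads the configuration without changing anything.

```shell
#!/usr/bin/env bash
# Report whether the global Git identity is set; read-only.
check_git_identity() {
  local key value
  for key in user.name user.email; do
    if value="$(git config --global --get "$key" 2>/dev/null)" && [ -n "$value" ]; then
      echo "$key = $value"
    else
      echo "$key is not set"
    fi
  done
}

check_git_identity
```

If either value is reported as not set, run the corresponding git config --global command from the note above.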

Add group members and configure permissions

To add members to the group:

  1. In Azure DevOps, from the GroupCommon project home page, select Project settings from the left navigation.

  2. From the Project Settings left navigation, select Teams, then on the Teams page, select the GroupCommon Team.

    (Screenshot: Configure teams)

  3. On the Team Profile page, select Add.

    (Screenshot: Add to the GroupCommon Team)

  4. In the Add users and groups dialog, search for and select members to add to the group, and then select Save changes.

    (Screenshot: Add users and groups)

To configure permissions for members:

  1. From the Project Settings left navigation, select Permissions.

  2. On the Permissions page, select the group you want to add members to.

  3. On the page for that group, select Members, and then select Add.

  4. In the Invite members popup, search for and select members to add to the group, and then select Save.

    (Screenshot: Grant permissions to members)

Next steps

Here are links to detailed descriptions of the other roles and tasks in the Team Data Science Process: