Maintainer Guide for Authoring#

This document is written for my own reference, to help me remember how this software is designed.

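Preface#

There are generally two ways to describe the internal design of a system:

Start from the low level modules, build up each module of the system step by step, and finally wire them together into a system. This is the mindset used when designing a system.
Start from the final user interface and dig down step by step to see which low level modules are actually called, and in what order. This is the mindset used when trying to understand a system.

The main purpose of this document is to help me understand this system when I come back to maintain the project after some time away, so I follow the second approach here.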

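A First Look at the CLI#

When you type ars on the command line, the function aws_resource_search.cli.main.run() is called. Because you did not pass any subcommand or arguments, it calls the aws_resource_search.cli.ArsCli.__call__() method. Both live in the aws_resource_search.cli module, which is responsible only for the CLI interface and contains no business logic. As you can see from __call__, it calls aws_resource_search.ui_init.run_ui(). That function lives in aws_resource_search.ui_init and is the entry point of the core UI logic; the CLI function is only a thin wrapper around it.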
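The call chain above can be sketched as follows. This is a minimal stand-in rather than the project's code: the bodies of run, ArsCli.__call__, and run_ui are hypothetical, and the calls list exists only to make the order of calls visible.

```python
# Hypothetical stand-ins that mirror the call chain described above.
calls = []  # records the order in which each layer is entered


def run_ui():
    """Stand-in for aws_resource_search.ui_init.run_ui()."""
    calls.append("ui_init.run_ui")


class ArsCli:
    """Stand-in for the CLI class; it holds no business logic."""

    def __call__(self):
        # No subcommand was given, so fall through to the interactive UI.
        calls.append("cli.ArsCli.__call__")
        run_ui()


def run():
    """Stand-in for the entry point bound to the `ars` command."""
    calls.append("cli.main.run")
    ArsCli()()


run()
print(" -> ".join(calls))
```

Keeping the CLI layer this thin means the UI can be started and tested without going through the command line at all.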

UI Event Loop#

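The body of the UI entry function aws_resource_search.ui_init.run_ui() is very simple: it instantiates an aws_resource_search.ui_def.UI object and then enters the event loop. In the UI, every key you press triggers a function call that processes your input, after which the whole screen is re-rendered. The function that processes the input is aws_resource_search.ui_def.handler().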
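The loop described above (one handler call and one full re-render per key press) can be sketched like this. The UI class, its fields, the key names, and the scripted key list are all assumptions made for illustration; the real UI reads keys from the terminal.

```python
from dataclasses import dataclass, field


@dataclass
class UI:
    """Hypothetical stand-in for aws_resource_search.ui_def.UI."""

    query: str = ""
    frames: list = field(default_factory=list)  # every screen rendered so far

    def render(self) -> None:
        # Redraw the whole screen from the current state.
        self.frames.append(f"(Query): {self.query}")


def handler(ui: UI, key: str) -> None:
    """Process a single key press and update the UI state."""
    if key == "BACKSPACE":
        ui.query = ui.query[:-1]
    else:
        ui.query += key


def run_ui(keys) -> UI:
    """Event loop: handle each key, then re-render everything."""
    ui = UI()
    ui.render()
    for key in keys:
        handler(ui, key)
        ui.render()
    return ui


ui = run_ui(["s", "3", "x", "BACKSPACE"])
print(ui.frames[-1])
```

Feeding the loop a scripted list of keys instead of real keyboard input is what makes this sketch runnable without a terminal.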

UI Handler#

The UI handler is the function used to process user input.

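How to Support More AWS Services and Resources#

This section explains how to add an AWS service or resource that you need but that this project does not support yet.

First, open aws_resource_search/code/searcher_enum.json and, following the entries for the AWS resources that are already supported, add the type of the AWS resource you want to support. desc is a one-sentence, human-readable introduction, usually the first sentence on the home page of the official AWS documentation for that resource. ngram holds additional keywords for the n-gram search; put here the full names and abbreviations of any words a person might think of when searching for this resource.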

aws_resource_search/code/searcher_enum.json

{
    "cloudformation-stack": {
        "desc": "A stack is a collection of AWS resources that you can manage as a single unit.",
        "ngram": "cfn cft"
    },
    "codebuild-job-run": {
        "desc": "Codebuild job run, it is not a batch job run.",
        "ngram": "jobrun"
    },
    "codebuild-project": {
        "desc": "A build project includes information about how to run a build.",
        "ngram": ""
    },
    "codecommit-repository": {
        "desc": "A repository is where you store code and files for your project.",
        "ngram": ""
    },
    "codepipeline-pipeline": {
        "desc": "Code pipeline is a workflow construct that describes how software changes go through a release process.",
        "ngram": ""
    },
    "dynamodb-table": {
        "desc": "A table is a collection of data.",
        "ngram": ""
    },
    "ec2-instance": {
        "desc": "An EC2 instance is simply a virtual server in AWS",
        "ngram": "aws elastic compute cloud"
    },
    "ec2-security-group": {
        "desc": "A security group acts as a firewall that controls the traffic allowed to and from the resources in your VPC.",
        "ngram": "sg securitygroup"
    },
    "ec2-subnet": {
        "desc": "A subnet is a range of IP addresses in your VPC.",
        "ngram": ""
    },
    "ec2-vpc": {
        "desc": "A virtual private cloud (VPC) is a virtual network dedicated to your AWS account.",
        "ngram": "virtual private cloud"
    },
    "ecr-repository": {
        "desc": "AWS managed container image registry service that is secure, scalable, and reliable.",
        "ngram": "ecr private repository container"
    },
    "ecr-repository-image": {
        "desc": "An container image in ECR repository.",
        "ngram": "ecr private repository container"
    },
    "ecs-cluster": {
        "desc": "An Amazon ECS cluster is a logical grouping of tasks or services.",
        "ngram": ""
    },
    "ecs-task-run": {
        "desc": "A task run is the instantiation of a task definition within a cluster.",
        "ngram": ""
    },
    "ecs_task_definition_family": {
        "desc": "A name of the task definition, without revision id.",
        "ngram": ""
    },
    "glue-crawler": {
        "desc": "You can use a crawler to populate the AWS Glue Data Catalog with tables.",
        "ngram": ""
    },
    "glue-database": {
        "desc": "Databases are used to organize metadata tables in the AWS Glue. ",
        "ngram": "db"
    },
    "glue-database-table": {
        "desc": "The metadata definition that represents your data.",
        "ngram": "db tb"
    },
    "glue-job": {
        "desc": "The business logic that is required to perform ETL work.",
        "ngram": ""
    },
    "glue-job-run": {
        "desc": "A job run is the execution of an ETL job.",
        "ngram": ""
    },
    "iam-group": {
        "desc": "An IAM user group is a collection of IAM users.",
        "ngram": ""
    },
    "iam-policy": {
        "desc": "IAM policies define permissions for an action regardless of the method that you use to perform the operation.",
        "ngram": ""
    },
    "iam-role": {
        "desc": "An IAM role is an IAM identity that you can create in your account that has specific permissions.",
        "ngram": ""
    },
    "iam-user": {
        "desc": "IAM user is an entity that you create in AWS.",
        "ngram": ""
    },
    "kms-key-alias": {
        "desc": "A human friendly name for KMS keys.",
        "ngram": "key management service"
    },
    "lambda-function": {
        "desc": "A function is a resource that you can invoke to run your code in Lambda.",
        "ngram": "lbd"
    },
    "lambda-function-alias": {
        "desc": "A Lambda alias is a pointer to a function version that you can update.",
        "ngram": "lbd"
    },
    "lambda-layer": {
        "desc": "A Lambda layer is a .zip file archive that can contain additional code or other content.",
        "ngram": ""
    },
    "rds-db-cluster": {
        "desc": "A DB cluster deployment is a semi-synchronous, high availability deployment mode of Amazon RDS with two readable standby DB instances.",
        "ngram": "relational database service"
    },
    "rds-db-instance": {
        "desc": "A DB instance is an isolated database environment running in the cloud.",
        "ngram": "relational database service"
    },
    "s3-bucket": {
        "desc": "A bucket is a container for objects.",
        "ngram": "simple storage service"
    },
    "secretsmanager-secret": {
        "desc": "A secret consists of secret information, the secret value, plus metadata about the secret.",
        "ngram": "sm"
    },
    "sfn-state-machine": {
        "desc": "A series of event-driven steps",
        "ngram": "step functions stepfunctions"
    },
    "sfn-state-machine-execution": {
        "desc": "A execution of a state machine.",
        "ngram": "step functions stepfunctions"
    },
    "sns-topic": {
        "desc": "An Amazon SNS topic is a logical access point that acts as a communication channel.",
        "ngram": "simple notification service"
    },
    "sqs-queue": {
        "desc": "A form of asynchronous service-to-service communication used in serverless and microservices architectures",
        "ngram": "simple queue service"
    },
    "ssm-parameter": {
        "desc": "Provides secure, hierarchical storage for configuration data management and secrets management. ",
        "ngram": "system manager parameter store"
    }
}
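Before moving on, it can help to check that a newly added entry has the same shape as the existing ones. The sketch below is an assumption-laden illustration: "athena-workgroup" is a made-up entry, and validate and search_terms are hypothetical helpers, not functions from this project. How the project actually tokenizes these fields for the n-gram search is not shown here.

```python
import json

# A two-entry excerpt in the same shape as searcher_enum.json.
# "athena-workgroup" is a made-up new entry used only for illustration.
raw = """
{
    "s3-bucket": {
        "desc": "A bucket is a container for objects.",
        "ngram": "simple storage service"
    },
    "athena-workgroup": {
        "desc": "Hypothetical one-sentence description of the new resource type.",
        "ngram": "athena wg"
    }
}
"""

REQUIRED_KEYS = {"desc", "ngram"}


def validate(entries: dict) -> list:
    """Return the resource types whose entry is malformed."""
    bad = []
    for resource_type, entry in entries.items():
        if set(entry) != REQUIRED_KEYS or not entry["desc"].strip():
            bad.append(resource_type)
    return bad


def search_terms(resource_type: str, entry: dict) -> list:
    """Words a user could plausibly type to find this resource type."""
    return resource_type.split("-") + entry["ngram"].split()


entries = json.loads(raw)
print(validate(entries))
print(search_terms("s3-bucket", entries["s3-bucket"]))
```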

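Next, go to aws_resource_search/res/, pick the module of a service that is similar to the one you want to support as a template, and copy it to create a new module. The module name must match the AWS service. Then implement the searcher, using the other modules as a reference.
Run scripts/code_work.py to automatically update the other enum modules, data, and code.
If your resource is a sub resource that can only be searched after a parent resource has been selected, you also need to update the mapping defined in ArsSearchPatternsMixin.get_search_patterns in the aws_resource_search/ars_search_patterns.py module.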

aws_resource_search/ars_search_patterns.py

# -*- coding: utf-8 -*-

"""
This module defines the search patterns for those resource types that require
special handling.
"""

import typing as T

from .compat import TypedDict, cached_property
from .searcher_enum import SearcherEnum

if T.TYPE_CHECKING:
    from .ars_def import ARS


class T_SEARCH_PATTERN(TypedDict):
    partitioner_resource_type: str
    get_boto_kwargs: T.Callable


K_PARTITIONER_RESOURCE_TYPE = "partitioner_resource_type"
K_GET_BOTO_KWARGS = "get_boto_kwargs"


class ArsSearchPatternsMixin:
    """
    todo: docstring
    """
    def get_search_patterns(self: "ARS"):
        """
        Define the resource types that require a parent resource name
        for the boto3 API call. For example:

        - in order to search glue table, you need to specify glue database
        - in order to search glue job run, you need to specify glue job
        """
        return {
            SearcherEnum.ecr_repository_image.value: {
                K_PARTITIONER_RESOURCE_TYPE: SearcherEnum.ecr_repository.value,
                K_GET_BOTO_KWARGS: lambda partitioner_query: {
                    "repositoryName": partitioner_query
                },
            },
            SearcherEnum.ecs_task_run.value: {
                K_PARTITIONER_RESOURCE_TYPE: SearcherEnum.ecs_cluster.value,
                K_GET_BOTO_KWARGS: lambda partitioner_query: {
                    "cluster": partitioner_query
                },
            },
            SearcherEnum.glue_database_table.value: {
                K_PARTITIONER_RESOURCE_TYPE: SearcherEnum.glue_database.value,
                K_GET_BOTO_KWARGS: lambda partitioner_query: {
                    "DatabaseName": partitioner_query
                },
            },
            SearcherEnum.glue_job_run.value: {
                K_PARTITIONER_RESOURCE_TYPE: SearcherEnum.glue_job.value,
                K_GET_BOTO_KWARGS: lambda partitioner_query: {
                    "JobName": partitioner_query
                },
            },
            SearcherEnum.sfn_state_machine_execution.value: {
                K_PARTITIONER_RESOURCE_TYPE: SearcherEnum.sfn_state_machine.value,
                K_GET_BOTO_KWARGS: lambda partitioner_query: {
                    "stateMachineArn": self.aws_console.step_function.get_state_machine_arn(
                        partitioner_query
                    )
                },
            },
            SearcherEnum.codebuild_job_run.value: {
                K_PARTITIONER_RESOURCE_TYPE: SearcherEnum.codebuild_project.value,
                K_GET_BOTO_KWARGS: lambda partitioner_query: {
                    "projectName": partitioner_query
                },
            },
            SearcherEnum.lambda_function_alias.value: {
                K_PARTITIONER_RESOURCE_TYPE: SearcherEnum.lambda_function.value,
                K_GET_BOTO_KWARGS: lambda partitioner_query: {
                    "FunctionName": partitioner_query
                },
            },
        }

    @cached_property
    def search_patterns(self: "ARS"):
        return self.get_search_patterns()

    def _clear_search_patterns_cache(self: "ARS"):
        """
        Clear the :meth:`ArsSearchPatternsMixin.search_patterns` cache.
        """
        del self.search_patterns

    def has_partitioner(
        self: "ARS",
        resource_type: str,
    ) -> bool:
        """
        Check if a resource type needs a partitioner resource.
        """
        return resource_type in self.search_patterns

    def get_partitioner_resource_type(
        self: "ARS",
        resource_type: str,
    ) -> str:
        """
        Get the partitioner "resource type" of a resource type.
        """
        return self.search_patterns[resource_type][K_PARTITIONER_RESOURCE_TYPE]

    def get_partitioner_boto_kwargs(
        self: "ARS",
        resource_type: str,
        partitioner_query: str,
    ) -> dict:
        """
        Get the boto3 kwargs for the partitioner resource.
        """
        return self.search_patterns[resource_type][K_GET_BOTO_KWARGS](partitioner_query)
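To see how a mapping like the one above might be consumed, here is a reduced, self-contained sketch. It copies two of the patterns but keys them by plain strings instead of SearcherEnum values, and plan_search is a hypothetical helper that does not exist in the project.

```python
K_PARTITIONER_RESOURCE_TYPE = "partitioner_resource_type"
K_GET_BOTO_KWARGS = "get_boto_kwargs"

# Reduced copy of the mapping, keyed by plain strings for illustration.
search_patterns = {
    "glue-database-table": {
        K_PARTITIONER_RESOURCE_TYPE: "glue-database",
        K_GET_BOTO_KWARGS: lambda partitioner_query: {
            "DatabaseName": partitioner_query
        },
    },
    "lambda-function-alias": {
        K_PARTITIONER_RESOURCE_TYPE: "lambda-function",
        K_GET_BOTO_KWARGS: lambda partitioner_query: {
            "FunctionName": partitioner_query
        },
    },
}


def plan_search(resource_type, partitioner_query=None):
    """Describe what a search needs: parent resource type and boto3 kwargs."""
    if resource_type not in search_patterns:
        # Top-level resource: no parent and no extra boto3 arguments.
        return {"resource_type": resource_type, "boto_kwargs": {}}
    pattern = search_patterns[resource_type]
    return {
        "resource_type": resource_type,
        "partitioner": pattern[K_PARTITIONER_RESOURCE_TYPE],
        "boto_kwargs": pattern[K_GET_BOTO_KWARGS](partitioner_query),
    }


print(plan_search("glue-database-table", "my_db"))
print(plan_search("s3-bucket"))
```

Adding a new sub resource then amounts to adding one more entry whose lambda maps the selected parent name to the keyword argument that the corresponding boto3 call expects.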

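To verify that your implementation is correct, create a module that mocks the AWS resource under aws_resource_search/tests/fake_aws/. You can use a module there for a similar resource as a reference.
Then, in the aws_resource_search/tests/fake_aws/main.py module, add your mock to mock_list and update the logic in def create_all(self) so that it also creates the resource you just implemented.
After that, you can use the pyops cov command, or run the tests/test_ars_init.py unit test manually, to mock all supported AWS resources automatically and try searching them.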

What is Searcher#

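The core feature of this app is searching AWS resources. There are many different types of AWS resources, such as EC2 Instance, S3 Bucket, and IAM Role, and the API for searching each type is different. A Searcher encapsulates the logic for searching one specific type of AWS resource. We have a Searcher base class; each Searcher responsible for a specific AWS resource inherits from this base class and implements the corresponding methods.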
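The base class and subclass relationship can be sketched as below. All names here (BaseSearcher, list_resources, get_name, S3BucketSearcher) are hypothetical and chosen for illustration, and the substring match stands in for the project's real search logic, which is not shown in this document.

```python
from abc import ABC, abstractmethod


class BaseSearcher(ABC):
    """Hypothetical base class: shared search flow, resource-specific hooks."""

    @abstractmethod
    def list_resources(self) -> list:
        """Call the resource-specific API and return raw records."""

    @abstractmethod
    def get_name(self, record: dict) -> str:
        """Extract the searchable name from one raw record."""

    def search(self, query: str) -> list:
        # Shared logic: case-insensitive substring match on the name.
        q = query.lower()
        names = [self.get_name(record) for record in self.list_resources()]
        return [name for name in names if q in name.lower()]


class S3BucketSearcher(BaseSearcher):
    def list_resources(self) -> list:
        # A real searcher would call boto3 here; canned data keeps this offline.
        return [{"Name": "prod-logs"}, {"Name": "dev-artifacts"}]

    def get_name(self, record: dict) -> str:
        return record["Name"]


print(S3BucketSearcher().search("LOG"))
```

With this split, supporting a new resource type only requires writing the two resource-specific hooks; the search flow itself is inherited.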

Code Architecture#

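Low level modules:

The low level modules mainly implement abstract base classes that make it easier to implement the concrete classes (classes that will not be subclassed further).

aws_resource_search.base_model: base class for all dataclasses classes.

aws_resource_search.base_searcher: base class for the Searcher classes of all specific AWS resources.

aws_resource_search.downloader: utility functions that help us download data with boto3.

aws_resource_search.searcher_enum: enumeration of the searchers (that is, resource types) we have implemented.

aws_resource_search.terminal: the singleton terminal object.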

Middle level modules:

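The middle level modules are mainly concrete classes related to the business logic.

aws_resource_search.documents: concrete classes for all searchable documents.

aws_resource_search.items: concrete classes for all items displayed in the UI.

aws_resource_search.conf: the configuration management system.

aws_resource_search.res_lib: registers the methods of all low level and middle level modules in this one module, so that other modules can import it directly instead of importing many separate modules.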

Per AWS Resource Type Searcher modules:

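This layer implements the Searcher class for each AWS resource type and gathers them into a single ARS singleton object, which makes it convenient to import them and call the search API.

aws_resource_search.res

aws_resource_search.ars_def: base class of the ARS class.

aws_resource_search.ars_mixin: this module is generated automatically by code that writes code.

aws_resource_search.ars_search_patterns: moves some methods out of ars_def into separate mixin classes so that the code in ars_def stays concise.

aws_resource_search.ars_init: creation of the ARS singleton.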
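The idea of generating a module with code, as mentioned for ars_mixin above, can be illustrated with the sketch below. It is only a guess at the general technique: what scripts/code_work.py really generates is not shown in this document, and GeneratedMixin, generate_source, and the template strings are hypothetical.

```python
import json

# Same shape as searcher_enum.json; only the keys matter for this sketch.
enum_json = """
{
    "ec2-instance": {"desc": "", "ngram": ""},
    "s3-bucket": {"desc": "", "ngram": ""}
}
"""

TEMPLATE_HEAD = "class GeneratedMixin:\n"
TEMPLATE_ATTR = "    {attr} = {value!r}\n"


def generate_source(resource_types) -> str:
    """Render Python source with one class attribute per resource type."""
    lines = [TEMPLATE_HEAD]
    for resource_type in sorted(resource_types):
        # Hyphens are not valid in Python identifiers, so use underscores.
        attr = resource_type.replace("-", "_")
        lines.append(TEMPLATE_ATTR.format(attr=attr, value=resource_type))
    return "".join(lines)


source = generate_source(json.loads(enum_json))
print(source)

# Execute the generated source to confirm that it is valid Python.
namespace = {}
exec(source, namespace)
print(namespace["GeneratedMixin"].s3_bucket)
```

Generating this kind of boilerplate from the JSON file keeps the enum and the classes that depend on it in sync whenever a resource type is added.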

UI modules:

This layer implements the UI.

aws_resource_search.handlers: implementation of all handlers used in the UI.

aws_resource_search.ui_def: definition of the UI class.

aws_resource_search.ui_init: creation of the UI singleton.