
Registry API Core Implementation

This repository contains the core implementation of the Registry API. Please see the wiki for more information.

The Registry API

The Registry API allows teams to upload and share machine-readable descriptions of APIs that are in use and in development. These descriptions include API specifications in standard formats like OpenAPI, the Google API Discovery Service Format, and the Protocol Buffers Language. These API specifications can be used by tools like linters, browsers, documentation generators, test runners, proxies, and API client and server generators. The Registry API itself can be seen as a machine-readable enterprise API catalog designed to back online directories, portals, and workflow managers.

The Registry API is formally described by the Protocol Buffer source files in google/cloud/apigeeregistry/v1. It closely follows the Google API Design Guidelines at aip.dev and presents a developer experience consistent with production Google APIs. Please tell us about your experience if you use it.

The Registry Tool

The Registry Tool (registry) is a command-line tool that simplifies setup and operation of a registry. See cmd/registry and the Registry wiki for more information. The registry tool can be built from sources here or installed with this script on Linux or Darwin:

curl -L https://raw.githubusercontent.com/apigee/registry/main/downloadLatest.sh | sh -

This Implementation

This implementation is a gRPC service written in Go. It can be run locally or deployed in a container using services including Google Cloud Run. It stores data using a configurable relational interface layer that currently supports PostgreSQL and SQLite.

The Registry API service is annotated to support gRPC HTTP/JSON transcoding, which allows it to be automatically published as a JSON REST API using a proxy. Proxies also enable gRPC web, which allows gRPC calls to be directly made from browser-based applications. A configuration for the Envoy proxy is included (deployments/envoy/envoy.yaml).

The Registry API protos also include configuration to support generated API clients (GAPICs), which allow idiomatic API usage from a variety of languages. A Go GAPIC library is generated as part of the build process using gapic-generator-go.

A command-line interface is in cmd/registry and provides a mixture of hand-written high-level features and automatically generated subcommands that call individual RPC methods of the Registry API.

The entry point for the Registry API server itself is cmd/registry-server. For more on running the server, see cmd/registry-server/README.md.

Build Instructions

The following tools are needed to build this software:

  • Go 1.20 or later (1.20 recommended).
  • protoc, the Protocol Buffer Compiler (see tools/PROTOC-VERSION.sh for the currently-used version).
  • make, git, and other elements of common unix build environments.

This repository contains a Makefile that downloads all other dependencies and builds this software (make all). With dependencies downloaded, subsequent builds can be made with go install ./... or make lite.

Quickstart

The easiest way to try the Registry API is to run registry-server locally. By default, the server is configured to use a SQLite database.

registry-server

Next, in a separate terminal, configure your environment to point to this server with the following:

. auth/LOCAL.sh

Now you can check your server and configuration with the registry tool:

registry rpc admin get-status

Next, run a suite of tests with make test and see a corresponding walkthrough of API features in tests/demo/walkthrough.sh. For more demonstrations, see the demos directory.

Tests

This repository includes tests that verify registry-server. These server tests focus on correctness at the API level and compliance with the API design guidelines described at aip.dev. Server tests are included in runs of make test and go test ./..., and they can be run by themselves with go test ./server/registry. By default, server tests verify the local code in ./server/registry. To allow API conformance testing, they can also be run against remote servers using the following options (example invocations follow the list):

  • With the -remote flag, tests are run against a remote server according to the configuration used by the registry tool. This runs the entire suite of tests. WARNING: These tests are destructive and will overwrite everything in the remote server.
  • With the -hosted PROJECT_ID flag, tests are run against a remote server in a hosted environment within a single project that is expected to already exist. The server is identified and authenticated with the configuration used by the registry tool. Only the methods of the Registry service are tested (Admin service methods are excluded). WARNING: These tests are destructive and will overwrite everything in the specified project.
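
For example, assuming the registry tool is already configured (a sketch of the invocations described above; my-project-id is a placeholder for an existing project):

go test ./server/registry -remote
go test ./server/registry -hosted my-project-id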

A small set of performance benchmarks is in tests/benchmark. These tests run against remote servers specified by the registry tool configuration and test a single project that is expected to already exist. WARNING: These tests are destructive and will overwrite everything in the specified project. Benchmarks can be run with the following invocation:

go test ./tests/benchmark --bench=. --project_id=$PROJECTID --benchtime=${ITERATIONS}x --timeout=0

All of the test configurations described above are verified in this repository's CI tests.

License

This software is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.

Disclaimer

This is not an official Google product. Issues filed on GitHub are not subject to service level agreements (SLAs) and responses should be assumed to be on an ad-hoc volunteer basis.

Contributing

Contributions are welcome! Please see CONTRIBUTING for notes on how to contribute to this project.

Issues

Resource names are case sensitive

Created resources are case sensitive, allowing creation of independent resources (such as specs) whose resource IDs differ only in case.

AIP-122 recommends against allowing uppercase letters in resource IDs. Allowing case sensitivity in resource names can result in confusing API behavior, e.g. resources not being found when accessed using a case insensitive identifier match.

One possible solution would be to reject uppercase characters during resource creation as recommended by AIP-122, and additionally attempt to canonicalize resource names to lowercase for other operations. This would make consumers aware of the restriction at resource creation time and reduce confusion from case sensitivity mismatch during future accesses.
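
A minimal sketch of that approach, using hypothetical helper names (the real validation would live in the server's name-handling code):

package names

import (
	"fmt"
	"regexp"
	"strings"
)

// validID reflects the AIP-122 recommendation: lowercase letters,
// digits, and hyphens only.
var validID = regexp.MustCompile(`^[a-z0-9-]+$`)

// checkNewID would reject uppercase (and other invalid) characters
// at resource creation time, making the restriction visible early.
func checkNewID(id string) error {
	if !validID.MatchString(id) {
		return fmt.Errorf("invalid id %q: must match %s", id, validID)
	}
	return nil
}

// canonicalize would lowercase names supplied to other operations,
// reducing confusion from case-sensitivity mismatches on later access.
func canonicalize(name string) string {
	return strings.ToLower(name)
}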

Existence of parent is not checked during resource creation

Creating resources with a non-existent parent does not result in an error. The parent is considered valid if it has the correct format. Create methods should check existence of the parent and return NOT_FOUND status if it does not exist.

This issue is present on all resource creation and listing methods, including spec revision listing.
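
A sketch of the proposed check in a create method (the server and type names here are illustrative stand-ins, not the actual implementation):

package registry

import (
	"context"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// Illustrative stand-ins for the real resource type and storage access.
type ApiSpec struct{ Name string }

type versionGetter interface {
	GetVersion(ctx context.Context, name string) error
}

// createSpec verifies that the parent version exists before creating
// the spec, returning NOT_FOUND instead of accepting any well-formed name.
func createSpec(ctx context.Context, db versionGetter, parent string, spec *ApiSpec) (*ApiSpec, error) {
	if err := db.GetVersion(ctx, parent); err != nil {
		return nil, status.Errorf(codes.NotFound, "parent %q not found", parent)
	}
	// ... proceed with creation as before ...
	return spec, nil
}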

List filters do not support name field (and others)

Although list filters do support filtering on specific components of the resource name (e.g. spec_id), the actual name field described in the API definition cannot be filtered. Some other fields that may reasonably be expected to have filtering are not supported. Specifically:

  • Project.name
  • Api.name
  • Version.name
  • Spec.hash
  • Spec.revision_id
  • Spec.revision_create_time

In addition to adding support for filtering these fields, the error messages caused by invalid filters can be improved. Currently the error status is "Internal" for invalid filter arguments, when INVALID_ARGUMENT would be more accurate. Error messages can also be improved by stating specifically that the provided field is not supported.

Fix compute lint command for type openapi

After the introduction of GetApiSpecContents in the API, the following command was broken:
registry compute lint ../openapi.yaml

Fix:

  • GetApiSpecContents unzips and returns the contents of the spec.
  • The command was broken because it was doing another unzip operation.
  • The command can be fixed by removing the redundant unzip operation from the command.

List operations return next_page_token when listing is complete

AIP-158 states the following:

If the end of the collection has been reached, the next_page_token field must be empty. This is the only way to communicate "end-of-collection" to users.

Our API sets next_page_token in the response after the final resource has been listed. When attempting to use that token to retrieve the following page, an empty response is returned without error.

Pagination with filters has undefined behavior

List methods do not validate that the listing filter is consistent between requests during pagination. This can be confusing when callers specify a filter in their first request but forget to include it in following requests, resulting in an unfiltered response.

In less likely cases, there is no obvious correct behavior. Consider a contrived example where a restrictive filter is provided for several requests, then the filter is removed. Should this include all entries that haven't yet been listed, or just ones that haven't been ignored by the previous restrictive filter?

AIP-158 mentions "The user is expected to keep all other arguments to the RPC the same; if any arguments are different, the API should send an INVALID_ARGUMENT error," which would remove the possibility of this undefined behavior.

Remove storage layer GORM abstractions

We have custom-built interfaces to support runtime-configurable usage of different storage frameworks, but we no longer support multiple storage frameworks since #130 removed datastore from the project. These abstractions now only obscure how GORM is being used for storage.

Some of the types that implement these abstractions are still useful, e.g. Iterator is a helpful abstraction that makes it easier to handle listing operations.

Action items:

  1. Remove the interfaces defined in storage.go.
  2. Use GORM query builder directly - Remove the custom Query struct.
  3. Relocate the dao and gorm packages into the storage directory and consider merging them.

Resource creation methods using automatically generated ids treat name collisions as user errors

In the current v1 implementation, unspecified resource ids are replaced with automatically generated ones using substrings of UUIDs: uuid.New().String()[:8].

Name collisions are unlikely but possible - according to birthday attack analysis, they tend to follow the square root of the number of states, i.e. with 2^32-state ids, we could expect a collision after approximately 2^16 (65536) ids were generated. This is consistent with an experiment (below) using our actual id generation expression.

Currently collisions are reported as "already exists" errors, which suggests that the user specified an invalid ID, when simply retrying is likely to fix the problem.

If we explicitly detected and handled collisions, we might also be able to reduce the length of our automatically-generated identifiers to more user-friendly values.

package main

import (
	"fmt"
	"github.com/google/uuid"
)

// findDuplicate generates 8-character ids (using the same expression as
// the server) until one repeats, and returns how many were generated.
func findDuplicate() int {
	ids := make(map[string]bool)
	for i := 1; true; i++ {
		x := uuid.New().String()[:8]
		if ids[x] {
			return i
		}
		ids[x] = true
	}
	return 0 // unreachable
}

// main repeats the experiment ten times.
func main() {
	for i := 0; i < 10; i++ {
		c := findDuplicate()
		fmt.Printf("found duplicate in %d ids\n", c)
	}
}

Running the above code once produced:

found duplicate in 104690 ids
found duplicate in 67448 ids
found duplicate in 41598 ids
found duplicate in 93047 ids
found duplicate in 97930 ids
found duplicate in 72057 ids
found duplicate in 85917 ids
found duplicate in 98528 ids
found duplicate in 110139 ids
found duplicate in 100775 ids
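
If collisions were handled explicitly, creation could simply retry with a fresh id. A sketch of that idea (the create callback and error value stand in for the actual storage call and its "already exists" condition):

package registry

import (
	"errors"
	"fmt"

	"github.com/google/uuid"
)

// errAlreadyExists stands in for the server's existing-resource condition.
var errAlreadyExists = errors.New("already exists")

// createWithGeneratedID retries with a new id on collision instead of
// reporting an "already exists" error to the user.
func createWithGeneratedID(create func(id string) error) (string, error) {
	for attempt := 0; attempt < 5; attempt++ {
		id := uuid.New().String()[:8]
		err := create(id)
		if errors.Is(err, errAlreadyExists) {
			continue // collision: regenerate and try again
		}
		return id, err
	}
	return "", fmt.Errorf("failed to generate a unique id")
}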

Customization of registry.yaml is difficult in some container environments, environment variable support would help

In Kubernetes and local testing, we run the registry-server with -c config/registry.yaml (or another config file) to configure the database used and some other behaviors (logging, pubsub). This is simple in Kubernetes because we can use ConfigMaps to replace a registry.yaml file in a running container, meaning that we can easily use common container images.

Currently there is no way to replace files in images running in Cloud Run, so we're forced to customize container builds with the specific registry.yaml file that we want used. This is a fairly heavy process, as container builds are much slower than configuration changes.

We are able to set environment variables in Cloud Run, so adding support for configuration with environment variables would allow us to deploy common images. This, however, could lead to questions of precedence if we continue to support both YAML-based and environment-based configuration.

A promising alternative would be to add support for environment variable references in our YAML configuration files. This is easily implemented using os.ExpandEnv from the Go standard library. Essentially, between reading the configuration file and parsing it as YAML, we would insert this step:

		b = []byte(os.ExpandEnv(string(b)))

This would substitute all environment variables in our registry.yaml and would allow us to specify our default registry.yaml as:

database: ${REGISTRY_DATABASE}
dbconfig: ${REGISTRY_DBCONFIG}
log: ${REGISTRY_LOG}
notify: ${REGISTRY_NOTIFY}

(We can also add appropriate comments to document each configuration item here)

Users would then be able to replace registry.yaml in environments where that is possible and to use environment variables for configuration where that is possible.
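
Putting the pieces together, configuration loading might look like this (a sketch; the struct and package names are illustrative):

package config

import (
	"os"

	"gopkg.in/yaml.v3"
)

// Settings mirrors the keys in the registry.yaml sketch above.
type Settings struct {
	Database string `yaml:"database"`
	DBConfig string `yaml:"dbconfig"`
	Log      string `yaml:"log"`
	Notify   string `yaml:"notify"`
}

// load reads the configuration file, expands ${VAR} environment
// references, and only then parses the result as YAML.
func load(path string) (*Settings, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	b = []byte(os.ExpandEnv(string(b))) // the proposed step
	var s Settings
	if err := yaml.Unmarshal(b, &s); err != nil {
		return nil, err
	}
	return &s, nil
}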

Registry API unit tests only cover SQLite storage backends

Tests should be configurable to run against all types of storage backends to ensure every storage implementation is correct. Alternatively each storage backend could have a separate test suite, but this would require more resources to develop and would be less maintainable.

Documentation for custom ID validation does not match behavior

Currently the implementation validates custom IDs according to the regex ([a-zA-Z0-9-_\\.]+). The API documentation says the following:

This value should be 4-63 characters, and valid characters are /[a-z][0-9]-/.

AIP-122 suggests custom IDs follow RFC-1034. The RFC mentions the following:

They must start with a letter, end with a letter or digit, and have as interior characters only letters, digits, and hyphen. There are also some restrictions on the length. Labels must be 63 characters or less.

The demo scripts currently create resources with names like "1.0.0" so we can't follow the AIP suggestion without modifying the demos. Also, the tests in filter_test.go create APIs with single digit IDs. There may be other dependencies on the current validation being more relaxed than the documentation.

Invalid database configurations are difficult to identify

Now that our default database is SQLite, we need a way to quickly alert users when the database configuration is invalid. In container builds, CGO is disabled, making SQLite unavailable. (I believe we should keep SQLite as our default because it provides an easy experience for first-time users and explorers, but container builds will need to be configured to use a different database.)

Unit tests do not cover some revisioning behaviors

On PR #111 the demo tests caught a bug that should reasonably have been caught by unit tests. Specifically, ListSpecs was returning all revisions of all specs, but the unit test suite passed. Given the difficulty of seeding multiple revisions, there are likely many other cases that are not tested.

The unit test suite could benefit from easier methods of seeding multiple spec revisions for a test case. Currently tests have to manually create and update a spec with new contents to create multiple revisions, which is cumbersome and discourages developers from thoroughly testing behaviors with multiple revisions. In the case of ListSpecs, the table-driven tests are not currently capable of seeding multiple revisions, so without better seeding helpers a separate test would need to be developed.

Initial implementation of worker-architecture

This issue tracks a basic implementation of the worker architecture.

Implement the following:

  • A dispatcher (GKE workload) that listens to registry events and creates tasks in the queue
  • A set of workers that execute registry tool commands. Initially, a worker can be a single binary that executes compute lint commands from the registry tool.

Scope:

For the initial implementation, the scope can be limited to responding to spec events in the registry and computing lint.

Expected outcome:

  • The above setup should work for singular uploads.
  • The compute lint workers should be able to calculate lint results for all supported spec formats (OpenAPI, Protos).
  • The whole end-to-end setup should be automated as far as possible, and there should be a top-level Makefile command to deploy it.

Compute external dependencies of protobuf APIs

Protocol Buffer-based API descriptions consist of potentially multiple .proto files that can use import statements to include other .proto files. Our current working definition of a protobuf spec is that it is a directory of .proto files that might refer to other files outside that directory. Many analysis tasks would require resolution of these references, and a first step to that would be to have a way to compute and store the internally-unresolved imports in a protobuf spec, i.e. the external dependencies.
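
A first step could be a simple scan (a sketch under the working definition above; full resolution would need the compiler's actual import paths):

package deps

import (
	"os"
	"path/filepath"
	"regexp"
	"strings"
)

// importRE matches proto import statements, including public/weak imports.
var importRE = regexp.MustCompile(`(?m)^\s*import\s+(?:public\s+|weak\s+)?"([^"]+)"\s*;`)

// externalImports returns the imports in the .proto files under dir that
// do not resolve to files within dir, i.e. the external dependencies.
func externalImports(dir string) ([]string, error) {
	local := map[string]bool{}
	var files []string
	err := filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() || !strings.HasSuffix(path, ".proto") {
			return err
		}
		rel, _ := filepath.Rel(dir, path)
		local[rel] = true
		files = append(files, path)
		return nil
	})
	if err != nil {
		return nil, err
	}
	seen := map[string]bool{}
	var external []string
	for _, f := range files {
		b, err := os.ReadFile(f)
		if err != nil {
			return nil, err
		}
		for _, m := range importRE.FindAllStringSubmatch(string(b), -1) {
			if imp := m[1]; !local[imp] && !seen[imp] {
				seen[imp] = true
				external = append(external, imp)
			}
		}
	}
	return external, nil
}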

List methods do not return complete pages

Although permitted by the AIPs, some client libraries interpret an incomplete page as a signal for end-of-collection. If possible we should return only complete pages while listing. In particular, listing with a filter should no longer return empty or near-empty pages.

Invalid identifiers in public API sets

Importing specs from the openapi-directory, we attempt to derive resource ids from the API, version, and spec names in the directory. This works well in most cases, but often requires lower-casing strings to satisfy our requirement that ids be lower-case only.

Even with that, a few identifiers still fail to match our required format (see the log below). There are two problems:

  1. identifiers that are longer than our 63-character limit
  2. identifiers that contain invalid characters

We can replace or remove some of the invalid characters on the client-side (particularly the parens), but some (the plus sign) might be worth allowing.

For the character counts, we might want to raise our limit to 80, 120, or some other value.

2021/05/18 21:57:10 error projects/openapi/apis/azure.com-enterpriseknowledgegraph-enterpriseknowledgegraphswagger: rpc error: code = InvalidArgument desc = invalid identifier "azure.com-enterpriseknowledgegraph-enterpriseknowledgegraphswagger": must be 63 characters or less
2021/05/18 21:57:21 error projects/openapi/apis/azure.com-machinelearningexperimentation-machinelearningexperimentation: rpc error: code = InvalidArgument desc = invalid identifier "azure.com-machinelearningexperimentation-machinelearningexperimentation": must be 63 characters or less
2021/05/18 21:57:34 error projects/openapi/apis/azure.com-sql-managedrestorabledroppeddatabasebackupshorttermretenion: rpc error: code = InvalidArgument desc = invalid identifier "azure.com-sql-managedrestorabledroppeddatabasebackupshorttermretenion": must be 63 characters or less
2021/05/18 21:57:35 error projects/openapi/apis/azure.com-sql-manageddatabasevulnerabilityassesmentrulebaselines: rpc error: code = InvalidArgument desc = invalid identifier "azure.com-sql-manageddatabasevulnerabilityassesmentrulebaselines": must be 63 characters or less
2021/05/18 21:57:39 upload openapi /home/tim/Desktop/openapi-directory/APIs/drchrono.com/v4 (Hunt Valley)/openapi.yaml: rpc error: code = InvalidArgument desc = invalid version name "projects/openapi/apis/drchrono.com/versions/v4-(hunt-valley)": must match "^projects/([A-Za-z0-9-.]+)/apis/([A-Za-z0-9-.]+)/versions/([A-Za-z0-9-.]+)$"
2021/05/18 21:57:41 upload openapi /home/tim/Desktop/openapi-directory/APIs/gitea.io/1.15.0+dev-317-gba76bd78b/swagger.yaml: rpc error: code = InvalidArgument desc = invalid version name "projects/openapi/apis/gitea.io/versions/1.15.0+dev-317-gba76bd78b": must match "^projects/([A-Za-z0-9-.]+)/apis/([A-Za-z0-9-.]+)/versions/([A-Za-z0-9-.]+)$"

Spec create_time is modified on update

The create_time field should not change when specs are updated. Currently when new revisions are created this timestamp is updated to the revision creation time, which has its own field revision_create_time.

Registry v1 API

After an internal API review, we've drafted protos for a v1 version of the Registry API. These are checked into apigee/registry/tree/v1/google/cloud/apigee/registry/v1 along with a preliminary set of changes that makes name changes and other minor adjustments to move from v1alpha1 to v1. These changes include:

  • renaming Version to ApiVersion
  • renaming Spec to ApiSpec
  • renaming Property to Artifact
  • simplifying the Artifact interface to represent all artifacts as binary objects with specified (named) types
  • removing the Label resource (see below for a replacement)
  • removing a method that lists all of the user-created revision tags for an ApiSpec (see below for a replacement)

Several changes remain to be done, and will be addressed with issues and PRs on the v1 branch:

  • adding a new labels field to Api, ApiVersion, and ApiSpec to support categorization and filtering of these resources
  • adding a new annotations field to Api, ApiVersion, and ApiSpec to store small amounts of string metadata
  • adding a new revision_tags field to ApiSpec to list the revision tags that point to the revision returned
  • updating hashing of ApiSpec and Artifact contents from SHA-1 to SHA-256
  • representing ApiSpec styles as mime_types (generally following IANA guidelines)
  • representing Artifact content types as mime_types (also generally following IANA guidelines)
  • specifying recommended values for ApiSpec and Artifact MIME types.

These changes are a high priority for us and we expect to accompany and follow them with more thorough testing and updates to the Registry Viewer so that we can quickly begin using the v1 service in production.

open config/sqlite.yml: no such file or directory

When trying to start the registry-server in local mode, there's an error. See below:

registry-server -c config/sqlite.yml
2020/12/23 09:31:25 open config/sqlite.yml: no such file or directory

On closer inspection, this looks like a typo in the README.md at the root of the project. The actual file is called sqlite.yaml:

ls config/
datastore.yaml postgres.yaml README.md registry.yaml sqlite.yaml

Setup REGISTRY_PROJECT_IDENTIFIER inside Makefile

Some commands in the Makefile require the environment variable REGISTRY_PROJECT_IDENTIFIER as a prerequisite. Currently only scripts under auth/.*.sh can set up the variable, but scripts like auth/CLOUDRUN.sh or auth/GKE.sh contain extra steps that are not required by the Makefile. To reduce local setup effort, we should modify the Makefile to set REGISTRY_PROJECT_IDENTIFIER=$(gcloud config list --format 'value(core.project)'), as sketched below.
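
For example, near the top of the Makefile (a sketch using the command above; ?= preserves any value already exported by the auth scripts):

REGISTRY_PROJECT_IDENTIFIER ?= $(shell gcloud config list --format 'value(core.project)')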

Spec revision_tags field is unused

This field should be removed or supported. Supporting it is beneficial because it allows consumers to see all the existing revision tags for a spec using ListApiSpecRevisions, and also any tags on an individual spec revision returned from any other spec operation. Without this field, consumers must "remember" all their revision tags because there is no way to access them.

Registry API: Optimize memory usage for database listing operations

Listing operations work by reading the entire table into an iterator then processing resources one at a time until a page is filled or the entire table has been read.

The iterator struct could be modified to fetch smaller batches of resources and automatically attempt to fetch another batch after all of its resources are accessed. The iterator behavior would not change, but the number of resources that are unnecessarily read from the database and stored in memory would be reduced.
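
A sketch of that change (the types here are illustrative, not the current implementation):

package storage

// Resource is a placeholder for a stored registry resource.
type Resource struct{ Name string }

// fetchFunc reads one batch of rows, e.g. via a LIMIT/OFFSET query.
type fetchFunc func(offset, limit int) ([]Resource, error)

const batchSize = 100

// Iterator yields resources one at a time, refilling its batch on demand
// instead of loading the entire table up front.
type Iterator struct {
	fetch  fetchFunc
	batch  []Resource
	offset int
	done   bool
}

// Next returns the next resource, or nil when the collection is exhausted.
func (it *Iterator) Next() (*Resource, error) {
	if len(it.batch) == 0 {
		if it.done {
			return nil, nil
		}
		batch, err := it.fetch(it.offset, batchSize)
		if err != nil {
			return nil, err
		}
		it.offset += len(batch)
		it.done = len(batch) < batchSize
		it.batch = batch
		if len(it.batch) == 0 {
			return nil, nil
		}
	}
	r := &it.batch[0]
	it.batch = it.batch[1:]
	return r, nil
}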

Update methods cannot clear field data with gorm storage backend

This bug does not exist when using datastore as the storage backend. It is confirmed to exist for the sqlite3 and postgres backends, which likely means the issue has to do with our use of gorm. I confirmed that the updated model reaches the gorm client layer correctly, but any fields that are empty are not persisted to the database.

The response from update methods is correct because the model itself is correctly updated and used to create the response. This bug reveals itself on a future request to retrieve the updated resource, where fields have not changed when they should have been cleared.
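
If GORM's update semantics are the cause, the distinction may be as simple as how updates are issued: GORM skips zero-value fields when updating from a struct, while map-based updates persist them. A sketch with an illustrative model (not the actual schema):

package storage

import "gorm.io/gorm"

// Spec is an illustrative model, not the actual schema.
type Spec struct {
	ID          string
	Description string
}

func updateSpec(db *gorm.DB, spec *Spec) error {
	// Buggy pattern: an empty Description is silently skipped.
	//   db.Model(spec).Updates(Spec{Description: spec.Description})

	// Map-based updates include zero values, so cleared fields persist.
	return db.Model(spec).Updates(map[string]interface{}{
		"description": spec.Description,
	}).Error
}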

Proposal: Drop Cloud Datastore API Support

The initial implementation of the Registry API used the Cloud Datastore API, which provided a managed key-value data store that was easy to configure and use. It was always expected that this would be replaced: originally we thought it would be replaced by an internal database (for an internal production implementation), and later by PostgreSQL, when we decided to focus on the open source implementation and expand its usefulness. This led to the addition of a gorm-based storage interface that replicated the Datastore interface and allowed SQLite and PostgreSQL storage backends (among others).

Now our focus is on using the PostgreSQL backend in production, and early conversations with other users suggest that PostgreSQL backends are of broader interest. We've recently been refactoring our storage interface to remove some of the awkwardness of calling PostgreSQL through what is essentially a replica of the Datastore API, but continuing to maintain the Datastore backend adds a maintenance and testing burden and limits our ability to use relational queries.

Dropping Datastore would require users to use a relational database, but the SQLite support that we get through gorm has actually given us an easier "try it" experience than we originally had with Datastore, and we think there are plenty of hosted PostgreSQL options available for production use.

First steps would be to rewrite the README and other examples to remove Datastore references; then we could remove the Datastore interface layer and continue refactoring and simplifying our calls through gorm.

Concerns?

Refactor worker implementation code.

There were some comments on the initial implementation of the worker architecture:

  1. Package utils in cmd/capabilities directory
    I think this is good advice about "utils" packages: https://dave.cheney.net/2019/01/08/avoid-package-names-like-base-util-or-common.

  2. Topic name for registry's pub/sub channel
    Maybe having a more specific name for the pub/sub topic is better. My understanding is that this will be used to publish registry events, so maybe something like registry-events?
    This will require a refactor on the registry-backend side too.

  3. For struct Dispatcher defined in dispatcher.go
    Do we intend on having this struct provide any functionality beyond setting up and starting a server? Requiring a call to setUp before its other methods indicates we're putting too much responsibility on the user of the type. It seems more appropriate to create a package dispatcher with exported method StartServer(context.Context) that handles any necessary setup internally. The main package would become very lightweight with its only responsibility being starting the server (possibly parsing configuration info in the future).

This issue is used to track these refactor efforts.

ListApiSpecs and ListApiSpecRevisions use inconsistent field naming

ListApiSpecsResponse has the field api_specs for returned specs. ListApiSpecRevisionsResponse has the field specs for returned specs. It seems like other listing responses tend to favor the former style, e.g. ListApiVersionsResponse uses api_versions instead of versions.

Registry tool: Merge the automatically-generated CLI into the registry tool.

Currently the project includes two command line interfaces:

  • apg, an automatically-generated CLI that is completely produced from the gRPC description of the API and supports direct calls to all of the API methods.
  • registry, a handwritten CLI that includes a range of capabilities, some very experimental and some that are closer to production-ready.

We imagined that users would write their own API-based tools and CLIs, possibly referring to the registry tool as sample code for Go implementations. More recently we have been thinking that the registry tool would be used in automated handlers that perform event-driven tasks like running linters. These don't preclude integrating all API access into a monolithic tool, but they also don't strongly support continuing to expand the registry tool - instead they seem to suggest that we divide it into a more highly-tested core and move experimental features into sibling tools.

Early users have pointed out that it's confusing to have two CLIs. This does seem to be a good reason to move apg (which is somewhat arbitrarily named) into registry. So as a trial proposal, let's consider adding the direct API access functions in apg to registry by mapping everything under apg registry to registry api. This would mean a call like apg registry get-status would become registry api get-status. This would require some handwritten code to patch the autogenerated parts into the registry implementation, perhaps by modifying the generated code either by hand or in a postprocessor.

Thoughts or concerns?

Registry tool: Protobuf Registry Features

A registry of Protocol Buffer-based APIs may have some requirements that aren't shared with other API styles. These could include:

  • Avoiding conflicts by ensuring that all message and rpc names in the registry are unique. This suggests that a global (flat) list of names is needed.
  • Checking imports in .proto files to resolve external references and detect unresolved references.
  • Running format converters to produce OpenAPI or other formats describing alternate representations of APIs, such as those produced by gRPC HTTP/JSON Transcoding.
  • Automation to produce generated artifacts such as client libraries, server stubs, and documentation (probably needed for all API description formats)

Registry API: Load Testing

We do not currently understand the behavior of the Registry API under heavy load. Commands such as spec bulk uploading currently produce the largest loads against the API. The addition of load testing would allow us to identify where the system bottlenecks exist so they can be addressed before they are discovered in a real-world use of the API.

Spec "Currency" is not safe for concurrent requests

Concurrent UpdateSpec calls can result in multiple revisions being marked as the current revision.

Since spec revisions have revision creation timestamps, we can instead use those timestamps to find the most recent revision. Rolling back to previous revisions will not be an issue in this case because a new revision is created as a copy of the rollback target and would have the most recent creation time.
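
A sketch of deriving the current revision from timestamps (illustrative model and column names, not the actual schema):

package storage

import "gorm.io/gorm"

// SpecRevision is an illustrative model with the revision timestamp.
type SpecRevision struct {
	SpecID             string
	RevisionID         string
	RevisionCreateTime int64
}

// currentRevision selects the newest revision rather than relying on a
// mutable "current" flag that concurrent updates can corrupt.
func currentRevision(db *gorm.DB, specID string) (*SpecRevision, error) {
	var rev SpecRevision
	err := db.Where("spec_id = ?", specID).
		Order("revision_create_time desc").
		First(&rev).Error
	if err != nil {
		return nil, err
	}
	return &rev, nil
}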

Support direct access of API specs and artifact contents

Currently API spec contents can be accessed as a field of the ApiSpec resource, which is returned by GetApiSpec and ListApiSpecs methods. Early users have pointed out that this can be awkward, particularly in transcoded interfaces. The situation is similar for artifacts. It seems that usage of both could be simplified with API methods that allow direct access to their contents.

Generate IDs for resources when unspecified in create requests

As a user of the CreateProject, CreateApi, CreateApiVersion, CreateApiSpec, and CreateArtifact methods, I would like to be able to create resources without providing a specific ID. My expectation is that a valid ID will be generated for me and returned as part of the response.

Deleting resources should delete any artifacts associated with them.

For example, deleting a spec (projects/PROJECT_ID/apis/API_ID/versions/VERSION_ID/specs/SPEC_ID) should delete any artifacts under that spec (projects/PROJECT_ID/apis/API_ID/versions/VERSION_ID/specs/SPEC_ID/artifacts/ARTIFACT_ID).

Currently we (crudely) delete all artifacts only when their containing project is deleted.

Support CSV Import/Export for API Specs

I would like to be able to import specs into the API Registry using a spreadsheet containing information about the APIs, versions, specs, and their location on disk (or within a directory).

I would also like to be able to export a CSV containing the same information for all the resources in my API Registry project. Exports should have a column that includes links to the spec contents as stored in the Registry so I have access to direct download links.

Registry tool: Representation for dependencies in a registry.

As API registries grow, we expect that users will want them to include generated artifacts that are computed automatically in response to additions and changes to registry entries. For example, a linter might be run on every new spec revision, and a summarization of linting results for a collection of APIs might be updated after each new linting run.

This could be supported with custom applications that watch pubsub events and trigger actions. However, since these "dispatchers" might share a lot of common code, it could be useful to have a generic dispatcher that could be configured with a representation of artifact dependencies. Without suggesting how a dispatcher might be implemented, here is a sketch of a possible dependency description:

# API Registry Manifest
#
# This file describes the desired state of the API Registry for a project.
# It is attached (serialized) to the project using the "manifest" artifact id.
# e.g. projects/myproject/artifacts/manifest with MIME Type
#    "application/octet-stream;type=google.cloud.apihub.applications.v1alpha1.Manifest"
#

# The artifacts field lists artifacts that we want to exist in the API Registry.
# Each artifact has dependencies and a worker action that generates it.
artifacts:

# A project-level summary of registry contents.
# No owning resource is specified,
# so this defaults to belong to the project that owns this manifest.
# In the declarations below, "$0" is the matching resource, the owner of the artifact.
- artifact: summary
  dependencies:
  - resource: apis/-
  - resource: apis/-/versions/-
  - resource: apis/-/versions/-/specs/-
  action: "compute summary $0"

# Spectral linter results for OpenAPI specs.
# The filter field limits this to specs with OpenAPI mime types.
- artifact: lint-spectral
  resource: apis/-/versions/-/specs/-
  filter: "mime_type.contains('openapi')"
  dependencies:
  - resource: $0
  action: "compute lint $0 --linter spectral"

# Spectral lintstats summaries for OpenAPI specs.
# "lintstats-" artifacts depend on and summarize "lint-" artifacts.
- artifact: lintstats-spectral
  resource: apis/-/versions/-/specs/-
  filter: "mime_type.contains('openapi')"
  dependencies:
  - artifact: lint-spectral
    resource: $0
  action: "compute lintstats $0 --linter spectral"

# Project-level Spectral lintstats.
# These depend on "lintstats-spectral" artifacts of individual specs.
- artifact: lintstats-spectral
  dependencies:
  - artifact: lintstats-spectral
    resource: apis/-/versions/-/specs/-
    filter: "mime_type.contains('openapi')"
  action: "compute lintstats $0 --linter spectral"

# AIP linter results for Protobuf specs.
# These are essentially just like the Spectral linter results (above).
- artifact: lint-aip
  resource: apis/-/versions/-/specs/-
  filter: "mime_type.contains('protobuf')"
  dependencies:
  - resource: $0
  action: "compute lint $0 --linter aip"

# Vocabulary of an API spec.
- artifact: vocabulary
  resource: apis/-/versions/-/specs/-
  dependencies:
  - resource: $0
  action: "compute vocabulary $0"

# Vocabulary of APIs with a particular owner.
- artifact: vocabulary-google
  dependencies:
  - resource: apis/-/versions/-/specs/-
    filter: "owner == 'google'"
  action: "compute vocabulary union $0 --filter \"owner == 'google'\""

- artifact: vocabulary-google-common
  dependencies:
  - resource: apis/-/versions/-/specs/-
    filter: "owner == 'google'"
  action: "compute vocabulary intersection $0 --filter \"owner == 'google'\""

# The labellings field lists resources with automatically-computed labels.
labellings:

# Compute labels indicating the owner, spec formats, and
# the number of specs and versions of the API
- resource: apis/-
  dependencies:
  - resource: $0/versions/-
  - resource: $0/versions/-/specs/-
  action: "compute labels $0"

# Workers can be specified alongside actions.
# If a worker is unspecified, the default worker should be used and must be in the list below.
workers:
  # The default worker runs the registry tool.
  - name: default
    host: myworker.mydomain.com
    path: /registry

Comments?

Generated apg client cannot be used with Envoy over insecure connections

The apg tool doesn't forward credentials to insecure services, so the locally running authz-server is unable to receive the credentials needed to authorize a request. Either the apg tool should be updated to forward the credentials (which may be difficult), or a workaround should be made to enable use of apg with Envoy.
