Repost: GraphQL Schema Stitching explained: Schema Delegation

Date: 2022-02-20 14:43:56

Reposted from the official documentation

In the last article, we discussed the ins and outs of remote (executable) schemas. These remote schemas are the foundation for a set of tools and techniques referred to as schema stitching.

Schema stitching is a brand new topic in the GraphQL community. In general, it refers to the act of combining and connecting multiple GraphQL schemas (or schema definitions) to create a single GraphQL API.

There are two major concepts in schema stitching:

  • Schema delegation: The core idea of schema delegation is to forward (delegate) the invocation of a specific resolver to another resolver. In essence, the respective fields of the schema definitions are being “rewired”.
  • Schema merging: Schema merging is the idea of creating the union of two (or more) existing GraphQL APIs. This is not problematic if the involved schemas are entirely disjoint. If they are not, there needs to be a way to resolve their naming conflicts.
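To make the two concepts more concrete, here is a minimal conceptual sketch in plain JavaScript. It works on bare resolver maps rather than real GraphQL schemas, and the names underlyingResolvers, delegatingResolvers and mergeQueryFields are made up for illustration; they are not part of any library.

```javascript
// Stand-in for the resolver map of an underlying API (hypothetical data).
const underlyingResolvers = {
  Query: {
    repositoryOwner: (parent, args) => ({ login: args.login, kind: 'Organization' }),
  },
}

// Delegation: the new field does no work of its own. It forwards the
// invocation to the existing resolver, here with a fixed argument.
const delegatingResolvers = {
  Query: {
    prismagraphql: (parent, args, context, info) =>
      underlyingResolvers.Query.repositoryOwner(parent, { login: 'prismagraphql' }, context, info),
  },
}

// Merging: build the union of two sets of `Query` fields and fail loudly
// on a naming conflict instead of silently overwriting one side.
function mergeQueryFields(a, b) {
  const merged = { ...a }
  for (const [name, resolver] of Object.entries(b)) {
    if (name in merged) throw new Error(`Naming conflict on Query.${name}`)
    merged[name] = resolver
  }
  return merged
}

const mergedQuery = mergeQueryFields(underlyingResolvers.Query, delegatingResolvers.Query)
console.log(Object.keys(mergedQuery).join(', ')) // repositoryOwner, prismagraphql
console.log(mergedQuery.prismagraphql(null, {}, {}, {}).login) // prismagraphql
```

Real schema stitching operates on executable schemas and has to deal with types, fragments and selection sets as well, so this only conveys the basic shape of the two ideas.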

Notice that in most cases, delegation and merging will actually be used together and we’ll end up with a hybrid approach that uses both. In this article series, we’ll cover them separately to make sure each concept can be well understood by itself.

Example: Building a custom GitHub API

Let’s start with an example based on the public GitHub GraphQL API. Assume we want to build a small app that provides information about the Prisma GitHub organization.

The API we need for the app should expose the following capabilities:

  • retrieve information about the Prisma organization (like its ID, email address, avatar URL or the pinned repositories)
  • retrieve a list of repositories from the Prisma organization by their names
  • retrieve a short description about the app itself

Let’s explore the Query type from GitHub’s GraphQL schema definition to see how we can map our requirements to the schema’s root fields.

Requirement 1: Retrieve info about the Prisma organization

The first feature, retrieving information about the Prisma organization, can be achieved by using the repositoryOwner root field on the Query type:

type Query {

  # ...

  # Lookup a repository owner (ie. either a User or an Organization) by login.
  repositoryOwner(
    # The username to lookup the owner by.
    login: String!
  ): RepositoryOwner

  # ...
}

We can send the following query to ask for information about the Prisma organization:

query {
  repositoryOwner(login: "prismagraphql") {
    id
    url
    pinnedRepositories(first: 100) {
      edges {
        node {
          name
        }
      }
    }
    # ask for more data here
  }
}

It works when we provide "prismagraphql" as the login to the repositoryOwner field.

One issue here is that we can’t ask for the email in a straightforward way, because RepositoryOwner is only an interface that doesn’t have an email field. However, since we know that the concrete type of the Prisma organization is indeed Organization, we can work around this issue by using an inline fragment inside the query:

query {
  repositoryOwner(login: "prismagraphql") {
    id
    ... on Organization {
      email
    }
  }
}

Ok, so this will work but we’re already hitting some friction points that don’t allow for a straightforward use of the GitHub GraphQL API for the purpose of our app.

Ideally, our API would just expose a root field that lets us ask directly for the info we want, without having to provide an argument on every query, and that lets us ask for fields on Organization directly:

type Query {
  prismagraphql: Organization!
}

Requirement 2: Retrieve list of Prisma repositories by name

How about the second requirement, retrieving a list of the Prisma repositories by their names? Looking at the Query type again, this becomes a bit more complicated. The API doesn’t allow retrieving a list of repositories directly. Instead, you can ask for single repositories by providing the owner and the repo’s name using the following root field:

type Query {

  # ...

  # Lookup a given repository by the owner and repository name.
  repository(
    # The login field of a user or organization
    owner: String!
    # The name of the repository
    name: String!
  ): Repository

  # ...
}

Here’s a corresponding query:

query {
  repository(owner: "prismagraphql", name: "graphql-yoga") {
    name
    description
    # ask for more data here
  }
}

However, what we actually want for our app (to avoid having to make multiple requests) is a root field looking as follows:

type Query {
  prismagraphqlRepositories(names: [String!]): [Repository!]!
}
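A rough sketch of how such a field could map onto GitHub’s single-repository lookup: the client sends one request, and the server performs one lookup per requested name with the owner fixed. lookupRepository is a hypothetical stand-in for the actual call to GitHub’s repository root field, and it is synchronous here only to keep the example self-contained.

```javascript
// Hypothetical stand-in for a call to GitHub's `repository` root field.
// A real implementation would send a request or delegate to a remote schema.
function lookupRepository({ owner, name }) {
  return { owner, name }
}

// Resolver for `prismagraphqlRepositories`: one lookup per requested name,
// always with the owner fixed to the Prisma organization.
function prismagraphqlRepositories(parent, { names }) {
  const requested = names || [] // `names` is nullable in the schema
  return requested.map(name => lookupRepository({ owner: 'prismagraphql', name }))
}

const repos = prismagraphqlRepositories(null, { names: ['graphql-yoga', 'prisma'] })
console.log(repos.map(repo => repo.name).join(', ')) // graphql-yoga, prisma
```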

Requirement 3: Retrieve short description about the app itself

Our API should be able to return a sentence describing our app, such as "This app provides information about the Prisma GitHub organization".

This is of course a completely custom requirement that we can’t fulfil with the GitHub API. We need to implement it ourselves, for example with a simple Query root field like this:

type Query {
  info: String!
}

Defining the application schema

We’re now aware of the required capabilities of our API and the ideal Query type we need to define for the schema:

type Query {
  prismagraphql: Organization!
  prismagraphqlRepositories(names: [String!]): [Repository!]!
  info: String!
}

Obviously, this schema definition in itself is incomplete: it lacks the definitions for the Organization and the Repository types. One straightforward way of solving this problem is to just manually copy and paste the definitions from GitHub’s schema definition.

This approach quickly becomes cumbersome, since these type definitions themselves depend on other types in the schema (for example, the Repository type has a field codeOfConduct of type CodeOfConduct) which you then need to manually copy over as well. There is no limit to how deep this dependency chain goes into the schema and you might even end up copying the full schema definition by hand.
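To illustrate how quickly the chain grows, the following sketch computes every type that has to be copied along with a given root type. The types map is a hypothetical, heavily simplified excerpt and not GitHub’s real schema; requiredTypes is an illustrative helper, not a library function.

```javascript
// Hypothetical, heavily simplified schema: type name -> field name -> field type.
const types = {
  Repository: { name: 'String', codeOfConduct: 'CodeOfConduct', owner: 'RepositoryOwner' },
  CodeOfConduct: { key: 'String', body: 'String' },
  RepositoryOwner: { login: 'String', avatarUrl: 'URI' },
  Organization: { login: 'String', email: 'String' },
}
const builtInScalars = new Set(['String', 'Int', 'Float', 'Boolean', 'ID'])

// Breadth-first walk over field types, collecting every type reachable from the root.
function requiredTypes(rootName) {
  const seen = new Set()
  const queue = [rootName]
  while (queue.length > 0) {
    const typeName = queue.shift()
    if (seen.has(typeName) || builtInScalars.has(typeName)) continue
    seen.add(typeName)
    // Types missing from the map (e.g. custom scalars) have no fields to follow.
    for (const fieldType of Object.values(types[typeName] || {})) queue.push(fieldType)
  }
  return [...seen]
}

console.log(requiredTypes('Repository').join(', '))
// Repository, CodeOfConduct, RepositoryOwner, URI
```

Even in this tiny excerpt, copying Repository pulls in three more definitions; in a full schema each of those would pull in more of their own.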

Note that when manually copying over types, there are three ways this can be done:

  • The entire type is copied over, no additional fields are added
  • The entire type is copied over and additional fields are added (or existing ones are renamed)
  • Only a subset of the type’s fields are copied over

The first approach of simply copying over the full type is the most straightforward. This can be automated using graphql-import, as explained in the next section.

If additional fields are added to the type definition or existing ones are renamed, you need to make sure to implement corresponding resolvers as the underlying API of course cannot take care of resolving these new fields.

Lastly, you might decide to only copy over a subset of the type’s fields. This can be desirable if you don’t want to expose all the fields of a type (the underlying schema might have a password field on the User type which you don’t want to be exposed in your application schema).

Importing GraphQL type definitions

The package graphql-import saves you from that manual work by letting you share type definitions across different .graphql files. You can import types from another GraphQL schema definition like so:

# import Repository from "./github.graphql"
# import Organization from "./github.graphql"

type Query {
  info: String!
  prismagraphqlRepositories(names: [String!]): [Repository!]!
  prismagraphql: Organization!
}

In your JavaScript code, you can now use the importSchema function and it will resolve the dependencies for you, ensuring your schema definition is complete.

Implementing the API

With the above schema definition, we’re only halfway there. What’s still missing is the schema’s implementation in the form of resolver functions.

If you’re feeling lost at this point, make sure to read this article which introduces the basic mechanics and inner workings of GraphQL schemas.

Let’s think about how to implement these resolvers! A first version could look as follows:

const { importSchema } = require('graphql-import')

// Import the application schema, including the
// types it depends on from `schemas/github.graphql`
const typeDefs = importSchema('schemas/app.graphql')

// Implement resolver functions for our three custom
// root fields on the `Query` type
const resolvers = {
  Query: {
    info: (parent, args) => 'This app provides information about the Prisma GitHub organization',
    prismagraphqlRepositories: (parent, { names }, context, info) => {
      // ???
    },
    prismagraphql: (parent, args, context, info) => {
      // ???
    }
  }
}

The resolver for info is trivial: we can return a simple string describing our app. But how do we deal with the ones for prismagraphql and prismagraphqlRepositories, where we actually need to return information from the GitHub GraphQL API?

The naive way of implementing this here would be to look at the info argument to retrieve the selection set of the incoming query — then construct another GraphQL query from scratch that has the same selection set and send it to the GitHub API. This can even be facilitated by creating a remote schema for the GitHub GraphQL API but overall is still quite a verbose and cumbersome process.
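As a rough illustration of that naive approach, the sketch below rebuilds a query string from a field node. The fieldNode object is a hand-built stand-in shaped like an entry of info.fieldNodes, and printSelections and buildGitHubQuery are hypothetical helpers. The sketch only handles plain fields and ignores arguments, aliases, fragments, variables and directives, which hints at why doing this properly is cumbersome.

```javascript
// Render the fields of a selection set back into query text.
// Only plain `Field` nodes are supported in this sketch.
function printSelections(selectionSet, indent) {
  return selectionSet.selections
    .map(field => {
      const name = field.name.value
      if (!field.selectionSet) return `${indent}${name}`
      const nested = printSelections(field.selectionSet, indent + '  ')
      return `${indent}${name} {\n${nested}\n${indent}}`
    })
    .join('\n')
}

// Build a fresh query against GitHub's `repositoryOwner` field that
// carries the same selection set as the incoming field.
function buildGitHubQuery(node) {
  const selections = printSelections(node.selectionSet, '    ')
  return `query {\n  repositoryOwner(login: "prismagraphql") {\n${selections}\n  }\n}`
}

// Hand-built stand-in for `info.fieldNodes[0]` of `{ prismagraphql { id url } }`.
const fieldNode = {
  kind: 'Field',
  name: { kind: 'Name', value: 'prismagraphql' },
  selectionSet: {
    kind: 'SelectionSet',
    selections: [
      { kind: 'Field', name: { kind: 'Name', value: 'id' } },
      { kind: 'Field', name: { kind: 'Name', value: 'url' } },
    ],
  },
}

console.log(buildGitHubQuery(fieldNode))
```

The resulting string would still have to be sent to the GitHub API and the response mapped back to the shape the client asked for.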

This is exactly where schema delegation comes into play! We saw before that GitHub’s schema exposes two root fields that (somewhat) cater to our requirements: repositoryOwner and repository. We can now leverage this to save the work of creating a completely new query and instead forward the incoming one.

Delegating to other schemas

So, rather than trying to construct a whole new query, we simply take the incoming query and delegate its execution to another schema. The API we’re going to use for that is called delegateToSchema provided by graphql-tools.

delegateToSchema receives seven arguments (in the following order):

  1. schema: An executable instance of GraphQLSchema (this is the target schema we want to delegate the execution to)
  2. fragmentReplacements: An object containing inline fragments (this is for more advanced cases we’ll not discuss in this article)
  3. operation: A string with one of three values ("query", "mutation" or "subscription") indicating to which root type we want to delegate
  4. fieldName: The name of the root field we want to delegate to
  5. args: The input arguments for the root field we’re delegating to
  6. context: The context object that’s passed through the resolver chain of the target schema
  7. info: An object containing information about the query to be delegated
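The following sketch shows how these seven positional arguments could line up inside a resolver for the prismagraphql field. It assumes the positional signature exactly as listed above; other graphql-tools releases may expose a different signature, so check the version you use. To keep the example independent of any installed package, delegateToSchema and schema are passed in, and makeResolvers and stubDelegate are hypothetical names.

```javascript
// `delegateToSchema` and `schema` are injected so the sketch does not
// depend on a particular graphql-tools version being installed.
function makeResolvers(delegateToSchema, schema) {
  return {
    Query: {
      prismagraphql: (parent, args, context, info) =>
        delegateToSchema(
          schema,                     // 1. target schema
          {},                         // 2. fragmentReplacements (unused here)
          'query',                    // 3. operation
          'repositoryOwner',          // 4. root field to delegate to
          { login: 'prismagraphql' }, // 5. args for that root field
          context,                    // 6. context
          info                        // 7. info about the incoming query
        ),
    },
  }
}

// Usage with a stub that only records what it receives.
const calls = []
const stubDelegate = (...received) => {
  calls.push(received)
  return { id: 'stub' }
}
const stubResolvers = makeResolvers(stubDelegate, 'GITHUB_SCHEMA_PLACEHOLDER')
stubResolvers.Query.prismagraphql(null, {}, { user: 'me' }, { fieldName: 'prismagraphql' })
console.log(calls[0].length) // 7
```

Note that the resolver ignores its own args and always supplies the fixed login, which is what removes the need for clients to pass an argument.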

In order for us to use this approach, we first need an executable instance of GraphQLSchema that represents the GitHub GraphQL API. We can obtain it using makeRemoteExecutableSchema from graphql-tools.

Notice that GitHub’s GraphQL API requires authentication, so you’ll need an authentication token to make this work. You can follow this guide to obtain one.

In order to create the remote schema for the GitHub API, we need two things:

  • its schema definition (in the form of a GraphQLSchema instance)
  • an HttpLink that knows how to fetch data from it

We can achieve this using the following code:

const fs = require('fs')
const { makeExecutableSchema, makeRemoteExecutableSchema } = require('graphql-tools')

// Read GitHub's schema definition from local file
const gitHubTypeDefs = fs.readFileSync('./schemas/github.graphql', { encoding: 'utf8' })
// Instantiate `GraphQLSchema` with schema definition
const introspectionSchema = makeExecutableSchema({ typeDefs: gitHubTypeDefs })
// Create `HttpLink` using a personal auth token
// (`GitHubLink` and `TOKEN` are assumed to be defined elsewhere)
const link = new GitHubLink(TOKEN)
// Create remote executable schema based on schema definition and link
const schema = makeRemoteExecutableSchema({
  schema: introspectionSchema,
  link,
})

GitHubLink is just a simple wrapper on top of HttpLink, providing a bit of convenience around creating the required Link component.
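The article does not show GitHubLink itself, so the following is only a guess at what such a wrapper might involve. The helper gitHubLinkOptions is hypothetical; it builds the options that a wrapper could hand to HttpLink, namely GitHub’s GraphQL endpoint and an Authorization header carrying the token.

```javascript
// Build the options a wrapper like `GitHubLink` might pass to `HttpLink`.
function gitHubLinkOptions(token) {
  if (!token) {
    throw new Error('No GitHub token provided')
  }
  return {
    uri: 'https://api.github.com/graphql',
    headers: { Authorization: `Bearer ${token}` },
  }
}

// With `HttpLink` from apollo-link-http in scope, the wrapper could look like this
// (on Node versions without a global `fetch`, a `fetch` option is needed as well):
//
//   class GitHubLink extends HttpLink {
//     constructor(token) {
//       super(gitHubLinkOptions(token))
//     }
//   }

console.log(gitHubLinkOptions('my-token').uri) // https://api.github.com/graphql
```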

Awesome, we now have an executable version of the GitHub GraphQL API that we can delegate to in our resolvers!