Clarification of how to handle errors that occur in a subscription event stream #995

Open
@Yogu

The GraphQL specification is quite explicit about the handling of errors that occur within a field resolver. It says little about errors that occur while generating the events of a subscription field. Is this an intentional omission, meant to let implementations come up with their own strategies for error handling, or should this be specified in more detail?

I asked this question two days ago in the Discord server in #graphql-spec, and @benjie suggested creating an issue, so here it is.

Event stream errors in the spec

Section 6.2.3 defines "event streams" and mentions the possibility of errors:

Event streams may complete in response to an error or simply because no more events will occur.

I did not find any other reference to errors with regard to event streams or their usage in subscriptions. This makes me wonder why the specification mentions these two cases (normal completion vs. completion due to an error). It could be interpreted as "if an error occurs, the event stream should be completed", or it could mean "if an error occurs, the event stream may be completed", or it could just be a non-normative hint.

Section 6.2.3.2 defines an operation to convert a source stream to a response stream. (By the way, why is it called MapSourceToResponseEvent and not MapSourceStreamToResponseStream?) This operation does not mention errors, but it defines that the response stream should be completed when the source stream completes. Section 6.2.3 defines Subscribe, which uses MapSourceToResponseEvent but just returns its result. This then bubbles up to the introduction of section 6, where it states that the result should be "formatted according to the Response section below". Section 7 (the "below") does not mention streams or subscriptions, so it does not further define how a completion would be formatted.
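To make the gap concrete, here is a minimal TypeScript sketch of how I read the mapping from source stream to response stream. It is a paraphrase, not the spec algorithm verbatim: `mapSourceToResponse` and `executeEvent` are hypothetical names, and async iterables are just one possible representation of an event stream. Normal completion of the source completes the response stream, as the spec says. What happens when the source fails is decided here by JavaScript's async generator semantics, not by anything in the spec.

```typescript
// Hypothetical sketch: map each source event to a response event.
async function* mapSourceToResponse<S, R>(
  source: AsyncIterable<S>,
  executeEvent: (event: S) => Promise<R>,
): AsyncGenerator<R> {
  for await (const event of source) {
    // One execution result per source event.
    yield await executeEvent(event);
  }
  // Falling off the end: the source completed, so the response stream completes.
  // If the source throws instead, the error simply propagates to whoever is
  // iterating the response stream. That is a language default, not specified behavior.
}
```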

Field resolver errors in the spec

In contrast, for errors that occur in resolvers, there is a whole subsection in the specification: 6.4.4 "Handling Field Errors". It defines where an error can occur, it defines what should happen, and 7.1.2 Errors defines how the errors should be serialized.

Comparing field resolver errors and event stream errors

Field resolvers and field streams are very similar in the specification. ResolveFieldValue "calls" the "internal function provided by objectType for determining the resolved value of a field named fieldName". ResolveFieldEventStream "calls" the "internal function provided by subscriptionType for determining the resolved event stream of a subscription field named fieldName". Both are intended to be implemented by users of the GraphQL implementation. This is application code, so they can produce all kinds of unforeseen errors.

The "internal function provided by subscriptionType for determining the resolved event stream of a subscription field" could produce errors in two ways:

a. The function itself could raise an error
b. The generated "resolved event stream" could experience an error while it is generating events.

I think both of these error scenarios are comparable to an error thrown by the "internal function provided by objectType for determining the resolved value of a field". However, only errors of field resolvers are "caught" in the specification (please correct me if I'm wrong). In particular, the handling of errors generated as described in b. is not specified.
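The two scenarios can be sketched in TypeScript as follows. Everything here is hypothetical and only for illustration: the `Tick` event type, the two subscribe functions, and the `observe` helper are made up, and async iterables are an assumed representation of event streams.

```typescript
// Hypothetical event type, for illustration only.
type Tick = { n: number };

// Scenario a: the function itself raises an error before any stream exists.
function subscribeFailsImmediately(): AsyncIterable<Tick> {
  throw new Error("cannot connect to event source");
}

// Scenario b: the function returns a stream that fails after emitting events.
async function* subscribeFailsLater(): AsyncGenerator<Tick> {
  yield { n: 1 };
  yield { n: 2 };
  throw new Error("event source went away");
}

// Records what a consumer of the stream would observe in each scenario.
async function observe(resolve: () => AsyncIterable<Tick>): Promise<string[]> {
  const log: string[] = [];
  let phase = "resolver";
  try {
    const stream = resolve(); // scenario a surfaces here
    phase = "stream";
    for await (const tick of stream) {
      log.push(`event ${tick.n}`); // scenario b surfaces while iterating
    }
    log.push("completed");
  } catch (e) {
    log.push(`${phase} error: ${(e as Error).message}`);
  }
  return log;
}
```

In this sketch the consumer can tell the two cases apart only because it wraps both the call and the iteration itself; nothing in the spec tells an implementation what to do with either error once it has been caught.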

What implementations and graphql servers can do

Errors raised while calling the "internal function provided by subscriptionType for determining the resolved event stream of a subscription field" could be handled as described in 6.4.4 Handling Field Errors. I think this is pretty reasonable, and I guess there are implementations that already do this. (Maybe it's already specced like this and I just don't understand it.)

Errors that occur during the execution of an event stream are a bit more complicated.

  • Servers could just ignore the errors and retry forever. However, this would take up resources, and the clients would not know what is going on.
  • Servers could complete the event stream. This would free up resources, and the client would know that something is going on. However, the client could no longer distinguish an error state from the case where there are simply no more events to come, so it would not know whether it should retry.
  • The type of the subscription field could be changed into a union type to include status updates alongside the regular data. This is probably a good choice for expected errors. However, extending each and every subscription field with a union only to cover error cases clutters up the schema. Note that the server might decide against communicating the kind of error, because it could be some internal error reaching origin servers, so the clients would not even benefit from an error definition in the schema. Also, there might be intermediate GraphQL servers that just proxy the subscription to another GraphQL server without changing data or schema. The union approach would not work for them.
  • The server could close the underlying transport that is used for the subscription (e.g. a WebSocket). However, this would also affect other operations that are using the same transport.
  • Wire protocols for GraphQL subscriptions could implement a way to signal errors of a single operation. If this is expected from wire protocols, I think it would be good to mention in the spec that this part is intentionally left to wire protocols.
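
As a rough illustration of the last option, here is a TypeScript sketch of per-operation signalling. The message shapes (`next`, `error`, `complete`) and the `pump` function are assumptions made up for this example and are not taken from any particular wire protocol. The point is only that a distinct terminal message lets the client tell "failed" apart from "no more events" without closing the shared transport.

```typescript
// Hypothetical message shapes; each message carries the operation id so that
// other operations on the same transport are unaffected.
type ServerMessage =
  | { id: string; type: "next"; payload: unknown }
  | { id: string; type: "error"; message: string }
  | { id: string; type: "complete" };

// Forwards one operation's event stream to the transport via `send`.
async function pump(
  id: string,
  stream: AsyncIterable<unknown>,
  send: (msg: ServerMessage) => void,
): Promise<void> {
  try {
    for await (const payload of stream) {
      send({ id, type: "next", payload });
    }
    // Normal completion: no more events will occur.
    send({ id, type: "complete" });
  } catch {
    // Error completion: deliberately generic, since the server may not want
    // to expose internal details to the client.
    send({ id, type: "error", message: "subscription failed" });
  }
}
```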

Summary

I think it would be great if the specification specified error handling. Alternatively, it could clarify that it is up to implementations and wire protocols to provide this. Currently, these might omit error handling and say "working as specified, we can't do anything".
