Getting Started with CloudEvents and AsyncAPI

In the previous blog post we went over a case study for Azure Service Bus. In this article we’ll look at two specs, CloudEvents and AsyncAPI, that you can use to solve some of the problems in your event-driven architectures.

Introduction

At the moment, quite a few tools and products have adopted CloudEvents or AsyncAPI. Knative Eventing is a tool that helps developers in a serverless context, and Azure Event Grid natively supports CloudEvents. More recently, Jenkins added integration with CloudEvents through a new plugin that allows users to configure Jenkins as a source and a sink for CloudEvents. There are also interesting integrations between Kubernetes and Azure Event Grid that are compliant with the CloudEvents v1.0 spec. Check out the GitHub repository or this blog post to learn more about them.

Postman has joined forces with AsyncAPI, along with organizations such as Salesforce, Slack and Solace. Postman in particular is publishing public collections related to AsyncAPI, for example a list of companies adopting AsyncAPI with links to their resources (GitHub repositories, websites, etc.).

I hope these projects have got you excited to learn more about these specs. Let’s dive into some of their details!

CloudEvents

The CloudEvents specification has been under the CNCF Serverless working group since 2018. The spec’s purpose is to describe event data in a common way. This is useful in many scenarios, for example routing events to the appropriate subscribers depending on the type of the event. Since applications can use many different transports to send and receive events, the CloudEvents spec is protocol-agnostic: it defines protocol bindings so that the event metadata is mapped correctly for HTTP, AMQP, Kafka, etc.
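To make the idea of a protocol binding more concrete, here is a minimal sketch of one event mapped to HTTP in both content modes. It uses only the Python standard library rather than an official SDK; the helper names (new_event, to_http_structured, to_http_binary) and the example source and type values are my own assumptions. It also skips details such as header value encoding, so treat it as an illustration and not a complete implementation of the binding.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Tuple

# The four context attributes that CloudEvents v1.0 requires on every event.
REQUIRED = ("specversion", "id", "source", "type")


def new_event(source: str, type_: str, data: dict) -> dict:
    """Build a CloudEvent as a plain dict (hypothetical helper, not an SDK call)."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": type_,
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,
    }


def to_http_structured(event: dict) -> Tuple[dict, bytes]:
    """Structured mode: the whole event travels in the body as JSON."""
    headers = {"content-type": "application/cloudevents+json"}
    return headers, json.dumps(event).encode("utf-8")


def to_http_binary(event: dict) -> Tuple[dict, bytes]:
    """Binary mode: attributes become ce-* headers and data becomes the body."""
    headers = {"content-type": event.get("datacontenttype", "application/json")}
    for name, value in event.items():
        if name not in ("data", "datacontenttype"):
            headers[f"ce-{name}"] = str(value)
    return headers, json.dumps(event["data"]).encode("utf-8")
```

In both modes the receiver ends up with the same attributes; only their location in the HTTP message changes, which is what allows the same event to move across transports.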

There are many use cases for the CloudEvents spec, but perhaps the main one is interoperability. Imagine applications across clouds communicating in an event-driven architecture, with producers of all kinds and consumers using all kinds of protocols (e.g. HTTP, AMQP, WebSockets). With CloudEvents, we can have middleware that connects these applications, adds E2E tracing and more. Of course we could connect the same applications without a common format, but that requires mapping between event formats, since cloud providers use different schemas. The middleware would also need to parse the event data to get specific information.
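As a rough sketch of that middleware idea, the router below dispatches events by looking only at the type context attribute and never opens the payload. The TypeRouter class and the example event type are invented for this post; a real broker would add filtering, retries and delivery guarantees on top.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class TypeRouter:
    """Dispatches events on the `type` context attribute only (illustrative)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def route(self, event: dict) -> int:
        """Deliver the event to matching handlers; return how many received it."""
        handlers = self._handlers.get(event["type"], [])
        for handler in handlers:
            handler(event)
        return len(handlers)


# Usage: the router never inspects `data`, so the payload can stay opaque.
router = TypeRouter()
received: List[dict] = []
router.subscribe("com.example.order.created", received.append)
delivered = router.route({
    "specversion": "1.0",
    "id": "1",
    "source": "/orders",
    "type": "com.example.order.created",
    "data": b"opaque payload the router does not need to understand",
})
```

Because routing depends only on the common metadata, the same middleware can serve producers whose payloads it knows nothing about.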

Another use case is SaaS (Software as a Service) that publishes events that clients are interested in, so they can integrate with their own systems. For example, hooking into the checkout flow in a Shopify storefront to add extra checks. By leveraging CloudEvents, these events can be consistent, opening the door for numerous integrations between third-party software.

Extensions

There are a few extensions worth mentioning. One of them is for distributed tracing. However, there seems to be some discussion around removing this extension from the spec (check this PR on GitHub). Some SDKs have open issues to support it, while others have already made changes to remove it. The future isn’t clear, but I’d argue it’s worth following closely, since tracing events is very important in an event-driven architecture.

The Partitioning extension is another interesting one. It defines an attribute, partitionkey, that message brokers can use to spread load by partition key. It is used, for example, in the Kafka protocol binding, which requires implementations to map the partitionkey attribute to the key of the Kafka message. In Kafka the concept of a partition is well known, so this maps really well.
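Here is a small sketch of that mapping, assuming the binary content mode of the Kafka binding, where context attributes become ce_-prefixed headers and the data becomes the record value. The function name to_kafka_binary and the example values are hypothetical, and no Kafka client library is involved; the function only builds the pieces a producer would send.

```python
import json
from typing import Dict, Optional, Tuple


def to_kafka_binary(event: dict) -> Tuple[Optional[bytes], Dict[str, bytes], bytes]:
    """Return (key, headers, value) for a Kafka record in binary content mode."""
    # The partitioning extension attribute becomes the record key, if present.
    partition_key = event.get("partitionkey")
    key = partition_key.encode("utf-8") if partition_key is not None else None

    # Context attributes travel as ce_-prefixed headers.
    headers = {
        f"ce_{name}": str(value).encode("utf-8")
        for name, value in event.items()
        if name not in ("data", "datacontenttype")
    }
    headers["content-type"] = event.get("datacontenttype", "application/json").encode("utf-8")

    # The event data is the record value.
    value = json.dumps(event.get("data")).encode("utf-8")
    return key, headers, value


key, headers, value = to_kafka_binary({
    "specversion": "1.0",
    "id": "7",
    "source": "/orders",
    "type": "com.example.order.created",
    "partitionkey": "customer-123",
    "data": {"orderId": 42},
})
```

Events that share a partitionkey get the same record key, so Kafka places them on the same partition and keeps their relative order.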

Relation with Serverless computing

Serverless computing has increased in popularity and use in the industry, especially for its cost model. But many FaaS (Functions as a Service) providers have their own function interface, which means developers can’t write a function in JavaScript and deploy it to two cloud providers without making changes. The CloudEvents specification improves portability between FaaS platforms, so that developers receive an event in the same format and can reuse libraries for handling the event.
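To illustrate the portability argument, the sketch below separates a thin adapter, which rebuilds the event from an incoming HTTP request, from a handler that only ever sees the CloudEvent. Both function names (from_http, handle) are made up for this post and do not correspond to any provider's function interface; the adapter is the only part that would change per platform.

```python
import json
from typing import Mapping


def from_http(headers: Mapping[str, str], body: bytes) -> dict:
    """Rebuild a CloudEvent dict from an HTTP request in either content mode."""
    lowered = {k.lower(): v for k, v in headers.items()}
    content_type = lowered.get("content-type", "")
    if content_type.startswith("application/cloudevents+json"):
        # Structured mode: the body is the whole event.
        return json.loads(body)
    # Binary mode: ce-* headers carry the attributes, the body is the data.
    event = {k[3:]: v for k, v in lowered.items() if k.startswith("ce-")}
    event["datacontenttype"] = content_type
    event["data"] = json.loads(body) if content_type.startswith("application/json") else body
    return event


def handle(event: dict) -> str:
    """Business logic that depends on the CloudEvent only, not on the platform."""
    return f"{event['type']} from {event['source']}"
```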

AsyncAPI

I’ll start with a description of what AsyncAPI is: a specification that describes and documents event-driven APIs in a machine-readable format. It’s protocol-agnostic like CloudEvents, so it can be used for APIs that work over many protocols, including MQTT, WebSockets, and Kafka.

The following is AsyncAPI’s vision stated on their website:

“AsyncAPI becomes the #1 API specification for defining and developing APIs. Any kind of APIs.”
(Taken from the roadmap on the AsyncAPI website)

I find this vision very interesting, mainly because of the last part: any kind of APIs. At first, you might wonder if this means the AsyncAPI spec will define rules and concepts for other kinds of APIs, such as those covered by GraphQL or OpenAPI. But this is not at all the case; the goal is to integrate with existing tools and specs! This is valuable for developers because enterprise architectures usually consist of a mix of technologies, each used for its appropriate use case. Developers nowadays don’t just interact with RESTful APIs in a request/response model. There are different demands and considerations for the ever-increasing range of devices people use, and the software we build needs to match these demands while remaining manageable, so that we can evolve it and create new applications that leverage the numerous APIs that exist internally or come from a third party.

Concepts

Version 2.1.0 of the spec defines a few concepts apart from the common Producer, Consumer and Message. A Channel can be seen as a topic, exchange or queue: an application sends messages to a channel, and consumers subscribe to it to receive them. The Operation object indicates whether it’s a publish or a subscribe operation and how an application can send or receive messages. A Binding (or “protocol binding”) is a mechanism for defining protocol-specific information, for example query parameters in a channel binding. For the AMQP protocol, we can specify that the channel is an exclusive queue like this:

"bindings": {
    "amqp": {
      "is": "queue",
      "queue": {
        "exclusive": true
      }
    }
}

Each protocol has its own JSON schema, and we can have bindings for Messages, Servers, Channels, Operations and others.
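To show how these concepts fit together, here is a hypothetical AsyncAPI 2.1.0 document written as a Python dict, with one channel, one subscribe operation, one message and the AMQP channel binding from above. The service, channel and message names are invented, and list_operations is just a small helper of mine for walking the document, not part of any AsyncAPI tool.

```python
# Hypothetical AsyncAPI 2.1.0 document; all names below are invented.
ASYNCAPI_DOC = {
    "asyncapi": "2.1.0",
    "info": {"title": "Orders service", "version": "1.0.0"},
    "channels": {
        "orders/created": {
            # Channel binding: protocol-specific details for AMQP.
            "bindings": {"amqp": {"is": "queue", "queue": {"exclusive": True}}},
            # Operation: the kind (publish or subscribe) is the key itself.
            "subscribe": {
                "operationId": "onOrderCreated",
                "message": {
                    "name": "OrderCreated",
                    "payload": {
                        "type": "object",
                        "properties": {"orderId": {"type": "integer"}},
                    },
                },
            },
        }
    },
}


def list_operations(doc: dict) -> list:
    """Return (channel, operation kind, message name) for every operation."""
    found = []
    for channel, item in doc["channels"].items():
        for kind in ("publish", "subscribe"):
            if kind in item:
                found.append((channel, kind, item[kind]["message"]["name"]))
    return found
```

Since the document is plain data, the same structure can be written in JSON or YAML and read by any tool that understands the spec.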

Components

You can define components to reuse across multiple AsyncAPI documents, and documents can reference other AsyncAPI documents. Let’s say you have two publishers that publish the same message, but with different values in one of the message properties. We’d have two AsyncAPI documents specifying the publishers, both referencing a third document that specifies the common message with its properties. The $ref field is a string that can be the path to another file, or a URL for an external file where the schema we want is defined. This reference object uses the same rules and format as JSON Reference, which opens the door for many possibilities (check the docs to know more).
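As a minimal sketch of how such references are followed, the code below resolves $ref values that point inside the same document (those starting with "#/"). File paths and URLs are deliberately left out, and circular references are not handled, so a real parser does considerably more. The function names resolve_pointer and dereference are my own.

```python
def resolve_pointer(doc: dict, ref: str):
    """Follow an in-document reference such as '#/components/messages/OrderCreated'."""
    if not ref.startswith("#/"):
        raise ValueError("this sketch only handles references inside the same document")
    node = doc
    for token in ref[2:].split("/"):
        # JSON Pointer escapes: ~1 stands for '/', ~0 stands for '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def dereference(node, root: dict):
    """Return a copy of node with every {'$ref': ...} replaced by its target.

    Assumes the references are not circular.
    """
    if isinstance(node, dict):
        if "$ref" in node:
            return dereference(resolve_pointer(root, node["$ref"]), root)
        return {key: dereference(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [dereference(item, root) for item in node]
    return node
```

With this in place, a channel can declare its message as {"$ref": "#/components/messages/OrderCreated"} and the message definition only has to be written once.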

When we start to have a lot of apps that depend on each other’s schemas, we can look at solutions for scaling out AsyncAPI documents. Perhaps we use Confluent’s Schema Registry for our JSON schemas and set up a catalog of events in our organization. This empowers new developers looking for ways to integrate with existing systems and event producers. We can also just store these components in a GitHub repository and reference them in our AsyncAPI documents.

Tooling

There are already quite a few tools, and the tooling ecosystem is growing! I’ve recently seen a repository that enables the creation of Postman collections from an AsyncAPI spec. I’ve also seen architecture documents being generated from multiple AsyncAPI specs. Having a tool that can understand the relations between applications and then output a diagram is pretty cool.

Generators for AsyncAPI

One kind of tooling that is often used is generators that produce documentation and code. For example, gRPC tools have this capability through the protocol buffer compiler. AsyncAPI generators can take an AsyncAPI document and generate client/server code or documentation in HTML and Markdown. Currently, whether server-side code is generated depends on the template we use; for example, the Node.js WebSocket template generates both server and client code. This can be improved and extended over time, especially because the generator is designed for extensibility, so we can have templates for many other languages that support more protocols. For example, for .NET Core there is currently only a NATS template… but perhaps in the future there could be more protocols supported for .NET Core and examples built for Azure 😃.
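To give a feel for what a template does, here is a toy renderer that turns an AsyncAPI-shaped dict into a short Markdown summary. This is not the AsyncAPI generator or one of its templates, only a sketch of the same idea (document in, text out); the function name render_markdown and the output layout are my own choices.

```python
def render_markdown(doc: dict) -> str:
    """Render a tiny Markdown summary of an AsyncAPI document."""
    lines = [f"# {doc['info']['title']} {doc['info']['version']}", ""]
    for channel, item in sorted(doc.get("channels", {}).items()):
        lines.append(f"## {channel}")
        for kind in ("publish", "subscribe"):
            operation = item.get(kind)
            if operation:
                name = operation.get("message", {}).get("name", "(unnamed)")
                lines.append(f"- {kind}: {name}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


summary = render_markdown({
    "asyncapi": "2.1.0",
    "info": {"title": "Orders", "version": "1.0.0"},
    "channels": {
        "orders/created": {"subscribe": {"message": {"name": "OrderCreated"}}}
    },
})
```

A code template works the same way, except that the output is source files for a given language and protocol instead of documentation.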

Conclusion

There is a lot of exciting stuff happening in the world of event-driven architectures 😄. We have only scratched the surface in this post. In the CloudEvents space there are new specs being designed and worked on: the Discovery, Subscription and Schema Registry APIs. Since AsyncAPI defines a document that you can use to describe your API, it’d be interesting to see how these relate to each other and how they can be used together.

I encourage you to join these communities and contribute to their open source projects, CloudEvents and AsyncAPI. Both specs are very community-driven, and collaboration between everyone is the way forward. With Hacktoberfest and AsyncAPI’s Hackathon coming up, searching for good first issues is a great way to start contributing 👍!

Let us know in the comments if you’re using these specs and what your thoughts on them are. The next blog post will be about a practical example for .NET Core, Azure and AMQP messaging using CloudEvents and AsyncAPI, so stay tuned!

Additional Links

Here are some links to talks, docs and blog posts that you may find useful if you’re interested to know more about CloudEvents and AsyncAPI:
