Building a Discord Command in Ruby on Google Cloud Functions: Part 1
This is the first of a four-part series on writing a Discord “slash” command in Ruby using Google Cloud Functions. In this part, we cover setting up a Discord bot, deploying a webhook to Cloud Functions, and validating webhook requests from Discord using the Ruby ed25519 library. At the end of this part, we’ll have a working webhook that Discord recognizes, but that doesn’t yet actually implement a command.
Creating a Discord application
Discord is a big system. Often used for gaming and streaming, but also increasingly for online community interaction, it includes a wide variety of features involving chat, voice, video, and content. Bots are an integral part of the ecosystem, and all bots start in the same place: with a Discord application.
Discord’s developer portal can be accessed at https://discord.com/developers. Once you’ve registered and logged in, it shows you a list of your applications. I created a new application here called scripture-bot.
Each application comes with a number of properties. These appear in the “general information” tab of the application, and include the following:
- The application ID is a unique number identifying your application. It’s kind of like your application’s “username”, and will be important later when we call the Discord API to register with servers and create commands.
- The public key will be used by your bot to authenticate requests—that is, to verify that HTTP requests you receive actually came from Discord and not from someone else trying to spoof Discord.
- The interactions endpoint URL is the URL of the webhook that will be called when someone invokes your command. It starts off empty because you need to fill it in. And that’s what we’ll be doing next.
There’s also a tab labeled “Bot” that includes information about the “bot user”. It also starts off empty, but we will create a bot user later when we install our command into a Discord server.
Writing a webhook
A Discord application can respond to commands in two ways: via the Gateway or by implementing a webhook. The Gateway communicates over a websocket, which is flexible and low-latency, but complicated to implement, and requires running a permanent process. For this project, we’ll opt for the simpler approach of providing a webhook that Discord will call whenever a command is invoked. The webhook option has limitations, but also a key advantage: it can be deployed as a serverless web app, and is thus likely to be inexpensive to run if it’s not heavily used.
Writing and deploying webhooks is quite easy with functions-as-a-service, or “FaaS”, a serverless architecture that models your app as a simple function that handles events. Many major cloud providers offer a FaaS environment, for example Lambda from AWS, or Cloud Functions from Google. For this article, we’ll use Cloud Functions.
Hello Functions
Deploying a hello-world app to Cloud Functions is quite simple even if you haven’t done it before. Create a project in the Google Cloud Console, and install the Google Cloud SDK, Google Cloud’s command-line tool. Then you can write a quick function called “discord_webhook” using the Functions Framework:
You can then deploy the function from the command line. Cloud Functions requires that an up-to-date Gemfile.lock file is present in order to deploy, so that means installing the bundle, then running the gcloud command to deploy a function:
Substitute your own project ID (or use gcloud config set project to set it globally). The command above deploys to the us-central1 region, and specifies a function that responds to HTTP requests using a Ruby 2.7 runtime. Note that it also disables Google’s default authentication. Instead, we will implement Discord’s authentication mechanism below.
If successful, the output of the gcloud deployment command will display the URL for the function. At this point you can use curl to send HTTP requests to the function and see the response.
Even though it’s very simple to get started, Google Cloud Functions has a long and growing list of features to make it easy to write and test your functions. You can run your function locally with a single command, and there’s a useful set of tools for running functions in isolation so you can write unit tests in Minitest or RSpec. I won’t cover the details here, but a lot of information is available in the Functions Framework documentation.
Responding to pings
Now that we have a working function, it’s time to configure it as the webhook endpoint for our Discord application. This is set in the “General Information” tab on your application’s page in the Discord console. However, if you just attempt to set the field now, Discord gives an error:
This is because verification failed. When you set up a webhook for an application, Discord will verify it is running correctly by sending it a ping message and expecting the proper reply. So we first need to update our function to handle pings.
Since we’re about to implement some real logic, let’s break it out into a separate class. The Functions Framework lets you define a function as a block, and you can put all the logic there. But for maintainability’s sake, it’s often a good idea to write separate Ruby classes encapsulating your application logic. So we’ll start by creating a Responder class to respond to HTTP requests sent by Discord, and refactoring our function to call it:
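A minimal sketch of what this refactoring might look like. The file name responder.rb, the method name respond, and the global name :responder are assumptions made for illustration. The Functions Framework wiring is shown as comments so that the class itself stays plain Ruby.

```ruby
# responder.rb (hypothetical file name)
# Encapsulates the logic for answering HTTP requests sent by Discord.
class Responder
  # Takes the Rack request passed to the function and returns a value that
  # the Functions Framework can turn into an HTTP response. For now it
  # ignores the request and returns the same greeting as before.
  def respond(_rack_request)
    "Hello, world!\n"
  end
end

# app.rb would then delegate to the Responder, roughly like this
# (requires the functions_framework gem, so it is left as a comment):
#
#   require "functions_framework"
#   require_relative "responder"
#
#   # Runs once per function instance, before any request is handled.
#   FunctionsFramework.on_startup do
#     set_global :responder, Responder.new
#   end
#
#   FunctionsFramework.http "discord_webhook" do |request|
#     global(:responder).respond(request)
#   end
```

Keeping the class free of framework calls means it can be exercised directly, for example with Responder.new.respond(nil), without starting a server.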
Note: The above code uses a startup block to instantiate our Responder and set it in a “global” that can be accessed by our function. The startup block and global storage are features of the Ruby Functions Framework. You could also use a Ruby global variable, or even a local variable scoped to the file, but the globals mechanism provided by the Functions Framework makes it easier to isolate runs when you write unit tests.
At this point, you can redeploy the function and verify that it still works. It should still just respond with the “Hello, world!” message. But we’ll change that now.
Discord’s messages, known in the Discord API as “interactions”, are sent as JSON and have a “type” field indicating the interaction type. Ping interactions have a type of 1, and when Discord sends you a ping, it expects you to respond with a similar JSON object, also with the “type” field set to 1. Let’s implement this in our Responder.
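One way the ping handling could be written is sketched below. It assumes the request object responds to body and returns an IO-like object, as a Rack request does; the error messages and the helper name bad_request are illustrative.

```ruby
require "json"

class Responder
  PING = 1 # interaction type that Discord uses for pings

  # rack_request is expected to respond to `body`, returning an IO-like
  # object that holds the JSON payload.
  def respond(rack_request)
    interaction = JSON.parse(rack_request.body.read)
    if interaction.is_a?(Hash) && interaction["type"] == PING
      { type: PING } # a hash return value is sent back as JSON
    else
      bad_request("Unrecognized interaction type")
    end
  rescue JSON::ParserError
    bad_request("Malformed JSON")
  end

  private

  # Builds a Rack response array: status, headers, body.
  def bad_request(message)
    [400, { "content-type" => "text/plain" }, [message]]
  end
end
```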
Notice the return types: we return a hash if we receive a ping, or a standard Rack response array to report a 400 Bad Request if we receive anything else. The Functions Framework recognizes a variety of return types: a string will be encoded as plain text, a hash will be encoded as JSON, and Rack response types are also recognized.
Redeploy to Cloud Functions, and you can test it there by posting a JSON request using curl and seeing the expected response:
Now you can go back to the Discord developer site, and fill in the interactions endpoint URL field with the URL of your function. And…
We’re still getting a verification failure. It turns out, even though we’re returning the correct response to a ping, Discord also requires that we verify request signatures correctly before it will let us set the endpoint. So we’ll turn our attention there next.
Validating Discord requests
When you write a web service, it’s always good practice to validate that any requests you receive are actually from whom you think they’re from. Before it lets you set your endpoint URL, Discord will enforce this practice by checking that you’ve implemented validation correctly. It does this by sending test requests to your endpoint with both correct and incorrect credentials, and making sure you respond appropriately.
So let’s implement this verification, following the instructions from Discord.
First, we’ll need a library that can validate Ed25519 signatures. There are several to choose from, but we’ll use the ed25519 gem because it doesn’t depend on outside C libraries, making it easier to deploy to serverless runtimes.
Run bundle install to install the gem and ensure that your Gemfile.lock is updated. If you forget to do this when you update your bundle, Cloud Functions will fail to deploy your app, and will report an error that your lockfile is out of date.
Then it’s time to write the signature verification code. First, create a verification key from the app’s public key (which is available from the General Information tab on Discord). Set this in the constructor for the Responder class because it’s the same for all requests.
In the above example code, substitute your app’s public key for mine.
Note: We’ve hard-coded the public key for now. This is not great practice, but it’s generally safe because a public key is not secret. In a real application you’ll likely want to load it from an environment variable or configuration file instead.
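As a rough sketch of the environment-variable approach: the public key shown in the Discord portal is a 64-character hex string, which has to be decoded into 32 raw bytes before a verification key can be built from it. The variable name DISCORD_PUBLIC_KEY and both helper names below are assumptions, not part of any library.

```ruby
# Hypothetical environment variable name; pick whatever suits your deployment.
PUBLIC_KEY_ENV_VAR = "DISCORD_PUBLIC_KEY"

# Converts the hex string shown in the Discord developer portal into the
# 32 raw bytes that an Ed25519 verification key is built from.
def decode_public_key(hex)
  unless hex.is_a?(String) && hex.match?(/\A\h{64}\z/)
    raise ArgumentError, "expected a 64-character hex string"
  end
  [hex].pack("H*")
end

# Reads and decodes the key, raising KeyError if the variable is not set.
# The env parameter defaults to ENV but accepts any hash-like object.
def load_public_key(env = ENV)
  decode_public_key(env.fetch(PUBLIC_KEY_ENV_VAR))
end

# With the ed25519 gem, the Responder constructor could then do:
#
#   @verify_key = Ed25519::VerifyKey.new(load_public_key)
```

On Cloud Functions, such a variable can be supplied at deploy time through gcloud’s --set-env-vars flag.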
Once you have a verification key, you can verify a request by checking the contents of the request against the signature sent by Discord, using your key. The signature will match only if it was created using the corresponding private key, which only Discord should have. Additionally, the request content will include a timestamp, and you should check that it is close to the current time, in order to prevent replay attacks. Here’s the final code:
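A sketch of how the pieces might fit together, under a few stated assumptions. Discord sends the signature and timestamp in the X-Signature-Ed25519 and X-Signature-Timestamp headers, and signs the timestamp concatenated with the raw request body. The verification key is passed in here so the sketch stays independent of any one library; with the ed25519 gem it would be an Ed25519::VerifyKey. The five-second tolerance and the injectable clock are illustrative choices.

```ruby
require "json"

class Responder
  # Maximum allowed difference, in seconds, between the request timestamp
  # and the local clock. This value is an assumption; tune it as needed.
  ALLOWED_CLOCK_SKEW = 5

  # verify_key: object whose verify(signature, message) returns true or
  #   raises on a bad signature (e.g. Ed25519::VerifyKey from the ed25519 gem).
  # clock: callable returning the current Unix time; injectable for tests.
  def initialize(verify_key, clock: -> { Time.now.to_i })
    @verify_key = verify_key
    @clock = clock
  end

  def respond(rack_request)
    body = rack_request.body.read
    return error(401, "Invalid request signature") unless valid_request?(rack_request.env, body)

    interaction = JSON.parse(body)
    # Answer a ping (type 1) with a ping; reject everything else for now.
    return { type: 1 } if interaction.is_a?(Hash) && interaction["type"] == 1
    error(400, "Unrecognized interaction type")
  rescue JSON::ParserError
    error(400, "Malformed JSON")
  end

  private

  def valid_request?(env, body)
    signature = env["HTTP_X_SIGNATURE_ED25519"].to_s
    timestamp = env["HTTP_X_SIGNATURE_TIMESTAMP"].to_s
    # An Ed25519 signature is 64 bytes, i.e. 128 hex characters.
    return false unless signature.match?(/\A\h{128}\z/) && timestamp.match?(/\A\d+\z/)
    # Reject stale or future-dated requests to limit replay attacks.
    return false if (@clock.call - timestamp.to_i).abs > ALLOWED_CLOCK_SKEW

    @verify_key.verify([signature].pack("H*"), timestamp + body) ? true : false
  rescue StandardError
    # The key signals a bad signature by raising; treat any failure as invalid.
    false
  end

  def error(status, message)
    [status, { "content-type" => "text/plain" }, [message]]
  end
end
```

Checking the header formats and the timestamp before calling verify keeps malformed requests from ever reaching the cryptographic check, and a 401 response is what tells Discord that a bad signature was rejected.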
A quick redeploy, and now at last we can set our Discord application’s endpoint URL. If you go look at the Cloud Functions logs in the Google Cloud Console, you’ll be able to see the test requests that Discord sends you. Typically it will send two requests when you attempt to set the webhook URL: one with a correct signature and one with an incorrect signature, just to make sure you have pings and verification implemented.
Note: It can take a few seconds, even after Cloud Functions finishes deploying your function, for the backend to “switch over” to the new deployment. So if you’re following along, and you believe you’ve implemented the verification, but Discord is still reporting a verification error, wait about a minute and then try setting the endpoint field in Discord again.
Now what?
So far so good. We have a working Discord application, deployed to Google Cloud Functions, and responding correctly to requests sent by Discord. Next we actually have to create a command. We’ll cover that in part 2.
Notes
I work at Google in my day job, so all code in this article is: