Using XMPP as a Message Bus

This highly technical post focuses on the idea of using XMPP as a generic message bus.

As a bit of history, XMPP (the Extensible Messaging and Presence Protocol) was originally created in 1999 under the name Jabber. Jabber started out as a ‘chat’ protocol, arriving when things like ICQ, AIM and GAIM were the buzzwords. As a standard for chatting, it quickly gained a following because it allowed a variety of different chat applications (most with their own proprietary protocols) to easily inter-communicate. By 2000, the IETF had formed a working group (IMPP), and by 2002 the official name XMPP (and a steering committee) had been settled. In 2004, the first RFCs (3920 and 3921) were released. A plethora of clients and servers were created, and many still exist today.

One of the key achievements of XMPP, however, was not its ability to chat, but its development of the XMPP Extension Protocols (or XEPs). Each XEP is a standard that describes a new piece of functionality that can be layered on top of (or around) the base XMPP protocol. To date, there are 392 XEPs (see the list). They range from security (OMEMO / XEP-0384, OTR / XEP-0364, Encryption / XEP-0116) to UI (Forms / XEP-0004, Forms Layout / XEP-0141, XHTML / XEP-0071) to IoT (Sensors / XEP-0323, IoT Control / XEP-0325) and everything in between. In fact, when you’re looking at an XMPP library or a client (like an app on an app store), the feature list is almost always just a list of which of the 392 XEPs it supports.

A lot of XMPP use today is in machine-to-machine communication. Some of the relevant XEPs are Forms / XEP-0004, RPC / XEP-0009, Service Discovery / XEP-0030, Ad-hoc Commands / XEP-0050, Search / XEP-0055, Publish-Subscribe / XEP-0060 and In-Band Registration / XEP-0077. I’ve personally found that leveraging In-Band Registration, Service Discovery, Ad-hoc Commands and Forms creates a very straightforward, extensible and performant model for building a scalable machine-to-machine (or even user-to-machine) message bus.

Registration

Everything starts with registration. A new instance of code (whether it be a Docker container, a mobile app or just some process running in your dev environment) needs to be able to connect to the XMPP server, and needs to establish itself under a given name. This is the process of registration. XEP-0077 already exists to allow In-Band Registration. What this means is that your code connects to the XMPP server anonymously, and then sends the user/password that it wants to be known as in the future, thus ‘self-registering’. The biggest issue with this is, of course, that there is very little control over ‘who’ is registering. With ‘wide open’ self-registration, you could easily have thousands of spam bots register themselves with your server and start sending millions of spam messages without you even being aware (at least until the angry messages start coming back).

The trick is to restrict who is allowed to self-register. This can best be done by having a shared secret between the XMPP server and the client. My particular favorite is to build a composite password that can be verified by the server.

UserName = Prefix + Base64({random name})
FirstPart = Base64({random password})
Password = FirstPart + '/' + Base64(Hash(ServerName + UserName + FirstPart + SharedSecret))

In the pseudo-code above, I define a shared secret that’s tied to a specific prefix. Every time I create a new product, or even a new version of a product, I change the prefix and the shared secret. This limits the damage if someone reverse engineers the code to find the shared secret. Let’s walk through an example:

A random name is created, base64’d (this reduces the character set nicely) and is added to a prefix (let’s say ‘MyAppV1-’). Thus, my UserName might be MyAppV1-c3VwZXJjYWxpZnJhZ2lzdGlj. Next, a truly random password is generated, giving us FirstPart = 'Tm90IFRoYXQgUmFuZG9t'. If our server is located at xmpp.example.com, and the shared secret tied to MyAppV1- is ‘Bob’, then we generate the composite string 'xmpp.example.comMyAppV1-c3VwZXJjYWxpZnJhZ2lzdGljTm90IFRoYXQgUmFuZG9tBob'. This is then hashed, base64-encoded and appended to the FirstPart after a '/' separator, giving us a final password of 'Tm90IFRoYXQgUmFuZG9t/ncwgNfpLhlWvnEt7UCovNRaqcpc='.

Because we’ve hashed the shared secret into the final password, there is no practical way to recover it from the password itself (provided the real secret is long and random, unlike ‘Bob’). Additionally, because the hash covers both the FirstPart and the UserName, someone can’t just copy and paste the second part and use it with other user ids.
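A minimal client-side sketch of this scheme in Java might look like the following. The pseudo-code above does not name the hash function, so SHA-256 is an assumption here, as are the class and method names. The random parts use the URL-safe Base64 alphabet (also an assumption) so that they can never contain '/', which is both the separator inside the password and a character that is not allowed in the local part of a JID.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Base64;

/** Client-side sketch of the composite password scheme (names are illustrative). */
final class CompositePassword {

    // ASSUMPTION: the hash function is not named in the post; SHA-256 is used here.
    static final String HASH_ALGORITHM = "SHA-256";

    private static final SecureRandom RANDOM = new SecureRandom();

    /**
     * Base64 of numBytes random bytes, usable for the random name and FirstPart.
     * ASSUMPTION: URL-safe alphabet without padding, so the token never contains '/'.
     */
    static String randomToken(int numBytes) {
        byte[] bytes = new byte[numBytes];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    /** Base64(Hash(ServerName + UserName + FirstPart + SharedSecret)). */
    static String proof(String serverName, String userName, String firstPart, String sharedSecret) {
        try {
            MessageDigest digest = MessageDigest.getInstance(HASH_ALGORITHM);
            String composite = serverName + userName + firstPart + sharedSecret;
            byte[] hash = digest.digest(composite.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(HASH_ALGORITHM + " is not available", e);
        }
    }

    /** Password = FirstPart + '/' + proof. */
    static String password(String serverName, String userName, String firstPart, String sharedSecret) {
        return firstPart + "/" + proof(serverName, userName, firstPart, sharedSecret);
    }
}
```

A new instance would pick its name with something like "MyAppV1-" + CompositePassword.randomToken(18), generate a FirstPart the same way, and then call CompositePassword.password(...) with the server name and the shared secret for that prefix. Because of the different hash assumption, the output will not necessarily match the example password shown earlier.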

Final registration message from the code to the server:

<iq type='set' id='reg2'>
  <query xmlns='jabber:iq:register'>
    <username>MyAppV1-c3VwZXJjYWxpZnJhZ2lzdGlj</username>
    <password>Tm90IFRoYXQgUmFuZG9t/ncwgNfpLhlWvnEt7UCovNRaqcpc=</password>
  </query>
</iq>

To make this all work, you need your XMPP server to be able to verify these passwords during the In-Band Registration. My personal favorite XMPP server is the venerable ejabberd. One of its many features is that you can easily add your own password management system. Due to a few hurdles in doing that within a Docker environment, I won’t go into the details here (perhaps a post for another day), but you can do that fairly easily. I have a standard Docker container / sidecar that works with ejabberd and verifies these passwords against a known list of Prefix/Shared Secret combinations.
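The verification itself is independent of how it is wired into ejabberd. As a rough sketch (not the actual sidecar, whose details are not covered here), the check could look like this in Java. The class name, the prefix-to-secret map and the use of SHA-256 are all assumptions for illustration; the hash must simply match whatever the client uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.Map;

/** Server-side sketch: checks a composite password against known prefix/secret pairs. */
final class RegistrationVerifier {

    private final String serverName;
    private final Map<String, String> secretsByPrefix; // hypothetical store, e.g. "MyAppV1-" -> secret

    RegistrationVerifier(String serverName, Map<String, String> secretsByPrefix) {
        this.serverName = serverName;
        this.secretsByPrefix = secretsByPrefix;
    }

    /** ASSUMPTION: SHA-256, matching whatever hash the client side uses. */
    static byte[] hash(String composite) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(composite.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    /** Returns true only if the password carries a valid proof for this user name. */
    boolean isValid(String userName, String password) {
        int slash = password.indexOf('/');
        if (slash < 0) {
            return false; // not a composite password
        }
        String firstPart = password.substring(0, slash);
        byte[] claimed;
        try {
            claimed = Base64.getDecoder().decode(password.substring(slash + 1));
        } catch (IllegalArgumentException e) {
            return false; // second part is not valid Base64
        }
        for (Map.Entry<String, String> entry : secretsByPrefix.entrySet()) {
            if (!userName.startsWith(entry.getKey())) {
                continue;
            }
            byte[] expected = hash(serverName + userName + firstPart + entry.getValue());
            // MessageDigest.isEqual compares in constant time
            if (MessageDigest.isEqual(expected, claimed)) {
                return true;
            }
        }
        return false;
    }
}
```

Retiring a product version then amounts to removing its prefix from the map, after which new registrations under that prefix are rejected.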

Discovery

Once your code is registered, it usually needs to find the other servers it wants to talk to. This is easily done via Search / XEP-0055 and Service Discovery / XEP-0030. The first step is to search for the JIDs of interest (perhaps searching by nickname):

<iq type='set' id='search2' xml:lang='en'
    from='MyAppV1-c3VwZXJjYWxpZnJhZ2lzdGlj@xmpp.example.com/device'
    to='xmpp.example.com'>
  <query xmlns='jabber:iq:search'>
    <nick>MyAppServer</nick>
  </query>
</iq>

The server would then respond with something like:

<iq type='result' id='search2' xml:lang='en'
    from='xmpp.example.com'
    to='MyAppV1-c3VwZXJjYWxpZnJhZ2lzdGlj@xmpp.example.com/device'>
  <query xmlns='jabber:iq:search'>
    <item jid='MyAppServer-abc11324312@xmpp.example.com'>
      <nick>MyAppServer</nick>
      <email>support@diamondq.com</email>
    </item>
  </query>
</iq>

This Search query found a single matching element with the given nickname. The code now has the full name (the JID). To make sure that this is the ‘right’ server, the next step would be to subscribe to the JID and then issue a Service Discovery request:

<iq type='get' id='info1'
    from='MyAppV1-c3VwZXJjYWxpZnJhZ2lzdGlj@xmpp.example.com/device'
    to='MyAppServer-abc11324312@xmpp.example.com/device'>
  <query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>

Assuming this is the right server, the response would look like:

<iq type='result' id='info1'
    from='MyAppServer-abc11324312@xmpp.example.com/device'
    to='MyAppV1-c3VwZXJjYWxpZnJhZ2lzdGlj@xmpp.example.com/device'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    <feature var='http://jabber.org/protocol/disco#info'/>
    <feature var='http://jabber.org/protocol/disco#items'/>
    <feature var='http://jabber.org/protocol/commands'/>
    <feature var='http://diamondq.com/protocol/MyAppServer'/>
  </query>
</iq>

The presence of the feature http://diamondq.com/protocol/MyAppServer confirms that this is the right server to talk to.
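In practice an XMPP library handles these stanzas for you, but to make the two checks concrete, here is a small sketch that pulls the JIDs out of a search result and tests a disco#info result for a feature, using only the JDK's XML parser. The class and method names are illustrative, and the stanzas are assumed to arrive as complete XML strings.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Sketch: reads the search and disco#info responses shown above. */
final class DiscoveryResponses {

    static final String SEARCH_NS = "jabber:iq:search";
    static final String DISCO_INFO_NS = "http://jabber.org/protocol/disco#info";

    /** Returns the jid attribute of every <item/> in a jabber:iq:search result. */
    static List<String> jidsFromSearch(String iqXml) {
        List<String> jids = new ArrayList<>();
        NodeList items = parse(iqXml).getElementsByTagNameNS(SEARCH_NS, "item");
        for (int i = 0; i < items.getLength(); i++) {
            jids.add(((Element) items.item(i)).getAttribute("jid"));
        }
        return jids;
    }

    /** Returns true if the disco#info result lists a <feature/> with the given var. */
    static boolean hasFeature(String iqXml, String featureVar) {
        NodeList features = parse(iqXml).getElementsByTagNameNS(DISCO_INFO_NS, "feature");
        for (int i = 0; i < features.getLength(); i++) {
            if (featureVar.equals(((Element) features.item(i)).getAttribute("var"))) {
                return true;
            }
        }
        return false;
    }

    private static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // element lookups above are by namespace
            // Stanzas never need a DOCTYPE, so refuse them outright
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a well-formed stanza", e);
        }
    }
}
```

With the two responses above, jidsFromSearch would yield the single MyAppServer JID, and hasFeature with http://diamondq.com/protocol/MyAppServer would confirm it is the right peer.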

Commands

Now that we’ve found the right server to talk to, and Service Discovery has also shown (via the feature http://jabber.org/protocol/commands) that it supports Ad-Hoc Commands / XEP-0050, we can begin to talk to it.

There are many different ways that the communication could occur, and perhaps the most common for machine-to-machine communication would be to use RPC / XEP-0009. However, using Ad-Hoc Commands / XEP-0050 has an added benefit: the same code can be used for humans as well. Ad-Hoc Commands is a little less efficient (slightly more XML is sent), but supporting both human and machine interaction over the same protocol makes it very easy to test, and allows manual commands when necessary.

Ad-Hoc Commands enables the list of valid commands to be dynamically queried. This is great for manual commands, but isn’t necessary for machine-to-machine use. Additionally, if you send an initial request to a command with no parameters, it will respond with all the parameters that it supports, along with help text. This is also great for manual commands, but again, isn’t needed for machine-to-machine use. However, having it all does provide a great ‘self-documenting’ feature for the command, so that even when you are coding machine-to-machine, you can easily get all the details by just sending a request with no parameters (or the wrong parameters).

Ad-Hoc Commands also allows for multi-step commands (i.e. you send the first bit of information, and the response carries the next set of questions you have to answer, which can continue as necessary). Again, this isn’t usually important for machine-to-machine use, as you can usually provide all the data in the first step.

All the data necessary to send for a command is usually encapsulated in a Form / XEP-0004. Forms can be very simple, with just the name, type and some basic help text, but can also be expanded with complex validation (XEP-0122), complex UI Layout (XEP-0141), CAPTCHA (XEP-0158), videos (XEP-0221), arbitrary XML (XEP-0315), color (XEP-0331), signatures (XEP-0348) and geolocation (XEP-0350).

For this example, we’ll actually submit an unfilled request (again, not normal for machine-to-machine), so we can see what kind of data can be returned. We’ll issue a request to the list-users command:

<iq type='set' id='exec1'
    from='MyAppV1-c3VwZXJjYWxpZnJhZ2lzdGlj@xmpp.example.com/device'
    to='MyAppServer-abc11324312@xmpp.example.com/device'>
  <command xmlns='http://jabber.org/protocol/commands'
           node='list-users'
           action='execute'/>
</iq>

The server responds with:

<iq type='result' id='exec1'
    from='MyAppServer-abc11324312@xmpp.example.com/device'
    to='MyAppV1-c3VwZXJjYWxpZnJhZ2lzdGlj@xmpp.example.com/device'>
  <command xmlns='http://jabber.org/protocol/commands'
           sessionid='config:20020923T213616Z-700'
           node='list-users'
           status='executing'>
    <actions execute='next'>
      <next/>
    </actions>
    <x xmlns='jabber:x:data' type='form'>
      <title>List Users</title>
      <instructions>Please select the type of user to list.</instructions>
      <field var='user-type' label='User Type' type='list-single'>
        <option><value>admins</value></option>
        <option><value>employees</value></option>
        <option><value>users</value></option>
      </field>
    </x>
  </command>
</iq>

Here you can see that the command requires a single field called user-type. It must be one of three possible values. In a graphical client, this might be displayed as a popup with the three choices. The actual submission (which would normally be the first message in a machine-to-machine scenario) would be:

<iq type='set' id='exec1'
    from='MyAppV1-c3VwZXJjYWxpZnJhZ2lzdGlj@xmpp.example.com/device'
    to='MyAppServer-abc11324312@xmpp.example.com/device'>
  <command xmlns='http://jabber.org/protocol/commands'
           node='list-users'
           action='execute'>
    <x xmlns='jabber:x:data' type='submit'>
      <field var='user-type'>
        <value>employees</value>
      </field>
    </x>
  </command>
</iq>

With a final response listing the two employees:

<iq type='result' id='exec1'
    from='MyAppServer-abc11324312@xmpp.example.com/device'
    to='MyAppV1-c3VwZXJjYWxpZnJhZ2lzdGlj@xmpp.example.com/device'>
  <command xmlns='http://jabber.org/protocol/commands'
           sessionid='list:20020923T213616Z-700'
           node='list-users'
           status='completed'>
    <x xmlns='jabber:x:data' type='result'>
      <title>List Users</title>
      <reported>
        <field var='name' label='Full Name'/>
        <field var='email' label='Email Address'/>
      </reported>
      <item>
        <field var='name'><value>Mike Mansell</value></field>
        <field var='email'><value>mmansell@diamondq.com</value></field>
      </item>
      <item>
        <field var='name'><value>Sonja McLellan</value></field>
        <field var='email'><value>smclella@diamondq.com</value></field>
      </item>
    </x>
  </command>
</iq>
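On the machine side, a result form like this is essentially a small table. The following sketch turns it into a list of rows using only the JDK's XML parser; the class name is hypothetical, and a real XMPP library would normally hand you an equivalent structure directly. Only the first value of each field is kept, which is enough for the single-valued fields shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Sketch: converts the <item/> rows of a jabber:x:data result form into maps. */
final class ResultForms {

    static final String DATA_FORMS_NS = "jabber:x:data";

    /** One map per <item/>, keyed by each field's var, in document order. */
    static List<Map<String, String>> rows(String iqXml) {
        List<Map<String, String>> rows = new ArrayList<>();
        NodeList items = parse(iqXml).getElementsByTagNameNS(DATA_FORMS_NS, "item");
        for (int i = 0; i < items.getLength(); i++) {
            Map<String, String> row = new LinkedHashMap<>();
            NodeList fields = ((Element) items.item(i)).getElementsByTagNameNS(DATA_FORMS_NS, "field");
            for (int j = 0; j < fields.getLength(); j++) {
                Element field = (Element) fields.item(j);
                NodeList values = field.getElementsByTagNameNS(DATA_FORMS_NS, "value");
                // Only the first <value/> is kept; multi-valued fields would need a list
                String value = values.getLength() > 0 ? values.item(0).getTextContent() : "";
                row.put(field.getAttribute("var"), value);
            }
            rows.add(row);
        }
        return rows;
    }

    private static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // element lookups above are by namespace
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a well-formed stanza", e);
        }
    }
}
```

The fields under reported are column headers rather than data, and since they are not inside an item element they are not returned as a row.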

Summary

While this has been a long post showing the different components to leveraging XMPP within a message bus scenario, it’s covered all the major topics (registration, discovery and commands).

Many people feel that the biggest issue with XMPP is the fact that it uses XML, with all the corresponding verbosity. However, there are several things to consider. First, almost all communication is actually done over a compressed transport (XEP-0138), and XML compresses very well (usually 10:1 or better). If even more compression is needed, or the device is severely resource-constrained (e.g. sensors), a binary format like EXI works very well (see Efficient XML Interchange (EXI) Format / XEP-0322).
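If you want to see how well your own traffic compresses, a quick way is to push representative stanzas through a single zlib stream, flushing after each one as a live connection has to. The sketch below does that with the JDK's Deflater; the class name is hypothetical, and the ratio you get depends entirely on how repetitive your stanzas are, so treat it as a measuring tool rather than a benchmark.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

/** Sketch: measures raw versus zlib-compressed size of a sequence of stanzas. */
final class StreamCompressionDemo {

    /** Returns {rawBytes, compressedBytes} for sending the stanzas over one zlib stream. */
    static long[] measure(Iterable<String> stanzas) {
        Deflater deflater = new Deflater(); // zlib format, default compression level
        byte[] buffer = new byte[8192];
        long raw = 0;
        long compressed = 0;
        try {
            for (String stanza : stanzas) {
                byte[] bytes = stanza.getBytes(StandardCharsets.UTF_8);
                raw += bytes.length;
                deflater.setInput(bytes);
                int written;
                do {
                    // SYNC_FLUSH emits everything for this stanza now, while keeping
                    // the compression dictionary for the stanzas that follow
                    written = deflater.deflate(buffer, 0, buffer.length, Deflater.SYNC_FLUSH);
                    compressed += written;
                } while (written == buffer.length); // full buffer means more output is pending
            }
        } finally {
            deflater.end();
        }
        return new long[] {raw, compressed};
    }
}
```

Because the dictionary is shared across the whole stream, the repeated JIDs, namespaces and element names that make XMPP look verbose on paper are exactly what the compressor removes.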

Additionally, almost all the complex XML is hidden behind an XMPP library for your language. For example, within the Discovery section, we were checking to see if the code was ‘our’ server (i.e. had the right feature). This can be done using the Babbler Java library with a couple of lines of code:

// get() blocks until the response arrives and can throw checked exceptions
boolean isOurServer = client.getManager(ServiceDiscoveryManager.class)
   .discoverInformation(Jid.of("MyAppServer-abc11324312@xmpp.example.com/device"))
   .thenApply(infoNode -> infoNode.getFeatures().contains("http://diamondq.com/protocol/MyAppServer"))
   .get();
if (isOurServer) {
   // All good
} else {
   // Wrong server
}

I’ve been using this process now for a few months, and have multiple projects using this methodology. It works well, scales nicely (XMPP / ejabberd can handle millions of devices communicating), is secure (all communication is over TLS and can even have end-to-end encryption using OMEMO / XEP-0384 or OTR / XEP-0364) and is incredibly easy to debug.
