Architecture and design

This section relates to the latest released version of the project.

The publish/subscribe model

For the purpose of the Laharsub project, publish/subscribe concepts are defined as follows. The server maintains a set of topics. A topic is a collection of messages kept in the order in which the server received them. Clients can create topics and publish messages to a topic. Topics and messages are each uniquely identified by an integer value. Message identifiers are assigned by the server and are guaranteed to increase in the order in which the server received the messages. Clients can subscribe to one or more topics. A subscription to a topic must specify the topic identifier and may specify the minimum identifier of a message the client is interested in receiving. This client-side cursor lets the client resume receiving messages from the point after the last message it received or processed.
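The concepts above can be illustrated with a small in-memory model. This is only a sketch for illustration, not the Laharsub implementation: the names createTopicStore, createTopic, publish, and read are hypothetical, and the sketch assumes that identifiers start at 1 and are numbered per topic.

```javascript
// Minimal in-memory model of topics and messages (illustration only).
// Assumption: identifiers start at 1 and message ids are scoped to a topic.
function createTopicStore() {
  const topics = new Map(); // topicId -> array of { id, body }
  let nextTopicId = 1;

  return {
    // Creates an empty topic and returns its integer identifier.
    createTopic() {
      const topicId = nextTopicId++;
      topics.set(topicId, []);
      return topicId;
    },
    // Appends a message; identifiers increase in arrival order.
    publish(topicId, body) {
      const messages = topics.get(topicId);
      if (!messages) throw new Error("Unknown topic " + topicId);
      const id = messages.length + 1;
      messages.push({ id, body });
      return id;
    },
    // Returns messages whose identifier is >= from (the client-side cursor).
    read(topicId, from = 1) {
      const messages = topics.get(topicId) || [];
      return messages.filter((m) => m.id >= from);
    },
  };
}

const store = createTopicStore();
const topic = store.createTopic();
store.publish(topic, "Hello, world!");
store.publish(topic, "Hello again!");

const lastSeen = 1; // the client has already processed message 1
const pending = store.read(topic, lastSeen + 1);
```

Here pending contains only the second message, because the cursor skips everything the client has already seen.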

High level architecture

The system has three tiers: the client (typically an application running in a web browser, but in general any environment from which HTTP calls can be made), the middle tier (the WCF HTTP service that implements the HTTP-based APIs and protocol), and the back end (storage technology with publish/subscribe logic).
[Figure: Laharsub architecture]

HTTP APIs and protocols

The APIs for topic creation and publishing to a topic are rather boring, but a few noteworthy design choices related to the subscription API are described below. You can also consult the WCF HTTP help page by pointing your browser to http://localhost/ps/memory/help (assuming default configuration of the Laharsub server).

An overview of the protocol and its behavior is shown in the picture below.

[Figure: Laharsub protocol]

Client subscriptions are modeled as HTTP long poll requests. A subscription is created when the server receives an HTTP long poll request, and is terminated when the server sends an HTTP response to that request. The server sends the response when messages matching the subscription are or become available, or when the HTTP long poll timeout configured on the server elapses (45 seconds by default), whichever happens first. The server does not maintain any client-specific state outside the lifetime of this short-lived subscription.

A client can subscribe to many topics with a single HTTP long poll request. The request specifies the subscription parameters as a set of tuples (topic identifier, minimum message identifier). The tuples are encoded as query parameters of an HTTP GET request using the notation employed by the jQuery.param() function (see also the "jquery.param demystified" post). For example, a subscription to topic 14 starting from message 1 would look like this:

GET http://laharsrv/ps/sql/subscriptions/volatile?subs[0][topicid]=14&subs[0][from]=1 HTTP/1.1
Host: laharsrv
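
A query string in this notation could be assembled roughly as follows. This is a sketch, not code taken from the project: buildSubscriptionQuery is a hypothetical helper, and it leaves the brackets unescaped to match the request shown above, whereas jQuery.param() percent-encodes them.

```javascript
// Encodes subscription tuples in the bracket notation shown above.
// Hypothetical helper; brackets are left unescaped to match the sample
// request, whereas jQuery.param() would percent-encode them.
function buildSubscriptionQuery(subs) {
  return subs
    .map((sub, i) => {
      const parts = ["subs[" + i + "][topicid]=" + encodeURIComponent(sub.topicid)];
      // The minimum message identifier is optional.
      if (sub.from !== undefined) {
        parts.push("subs[" + i + "][from]=" + encodeURIComponent(sub.from));
      }
      return parts.join("&");
    })
    .join("&");
}

const query = buildSubscriptionQuery([
  { topicid: 14, from: 1 },
  { topicid: 15 },
]);
// query === "subs[0][topicid]=14&subs[0][from]=1&subs[1][topicid]=15"
```

The second tuple shows a subscription that omits the minimum message identifier.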

The server may send multiple messages back to the client in response to a single HTTP long poll request. The multipart/mixed content type is used as the framing mechanism. Each MIME part specifies the original content type of the message, captured at the time it was published, as well as the topic identifier and message identifier, encoded in the Content-Description header of the MIME part. A sample response to the request above could look like this:

HTTP/1.1 200 OK
Content-Length: 406
Content-Type: multipart/mixed; boundary=1d69db84.154e.47f7.be93.cc8b65b6efd0
Server: Microsoft-HTTPAPI/2.0
Date: Tue, 18 May 2010 23:14:24 GMT

--1d69db84.154e.47f7.be93.cc8b65b6efd0
Content-Type: text/plain; charset=UTF-8
Content-Description: 14/929

Hello, world!

--1d69db84.154e.47f7.be93.cc8b65b6efd0
Content-Type: text/plain; charset=UTF-8
Content-Description: 14/930

Hello again!

--1d69db84.154e.47f7.be93.cc8b65b6efd0
Content-Type: text/plain; charset=UTF-8
Content-Description: 14/931

World?...

--1d69db84.154e.47f7.be93.cc8b65b6efd0--
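
A client has to split such a response back into individual messages. The sketch below shows one way this might be done; it is not the parser from jquery.pubsub.js. The function name parseMultipartMixed is hypothetical, and the code assumes the "topicId/messageId" form of Content-Description seen in the sample, text bodies, and a boundary string that never occurs inside a message body.

```javascript
// Splits a multipart/mixed body into messages (simplified sketch).
// Assumptions: text bodies, CRLF or LF line endings, the boundary never
// appears inside a body, and Content-Description is "topicId/messageId".
function parseMultipartMixed(body, boundary) {
  const messages = [];
  // After the split: [preamble, part 1, ..., part N, "--" + epilogue].
  const segments = body.split("--" + boundary);
  for (const segment of segments.slice(1)) {
    if (segment.startsWith("--")) break; // closing delimiter reached
    const text = segment.replace(/^\r?\n/, "");
    const separator = /\r?\n\r?\n/.exec(text); // blank line after headers
    if (!separator) continue; // skip malformed parts
    const headers = {};
    for (const line of text.slice(0, separator.index).split(/\r?\n/)) {
      const colon = line.indexOf(":");
      if (colon > 0) {
        headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
      }
    }
    // Simplification: all trailing line breaks are dropped from the body.
    const content = text
      .slice(separator.index + separator[0].length)
      .replace(/(\r?\n)+$/, "");
    const ids = (headers["content-description"] || "").split("/").map(Number);
    messages.push({
      topicId: ids[0],
      messageId: ids[1],
      contentType: headers["content-type"],
      body: content,
    });
  }
  return messages;
}

const sample = [
  "--b1",
  "Content-Type: text/plain; charset=UTF-8",
  "Content-Description: 14/929",
  "",
  "Hello, world!",
  "",
  "--b1",
  "Content-Type: text/plain; charset=UTF-8",
  "Content-Description: 14/930",
  "",
  "Hello again!",
  "",
  "--b1--",
  "",
].join("\r\n");

const parsed = parseMultipartMixed(sample, "b1");
```

In a real client the boundary would be read from the Content-Type header of the HTTP response rather than passed in as a literal.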

A client subscription is a GET request. Since all subscription parameters are captured in the query string, subscription URLs can be passed around or linked to directly.
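
Because the server keeps no client state between long polls, the client has to carry its cursor forward and put it into the query string of the next request. A minimal sketch of that bookkeeping follows. The helper advanceCursors and the shape of the message objects are hypothetical, and the sketch assumes that the minimum message identifier is inclusive, so the next poll should ask for the last received identifier plus one.

```javascript
// subs: array of { topicid, from }; messages: array of { topicId, messageId }.
// Returns the subscription tuples to use for the next long poll.
// Assumption: "from" is an inclusive minimum, so we resume at last id + 1.
function advanceCursors(subs, messages) {
  return subs.map((sub) => {
    let from = sub.from;
    for (const m of messages) {
      if (m.topicId === sub.topicid && m.messageId >= from) {
        from = m.messageId + 1;
      }
    }
    return { topicid: sub.topicid, from };
  });
}

const before = [
  { topicid: 14, from: 1 },
  { topicid: 15, from: 7 },
];
const received = [
  { topicId: 14, messageId: 929 },
  { topicId: 14, messageId: 931 },
];
const after = advanceCursors(before, received);
// after is [{ topicid: 14, from: 932 }, { topicid: 15, from: 7 }]
```

An empty response, such as one caused by the long poll timeout, leaves every cursor unchanged, so the client simply reissues the same request.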

Publish/subscribe back end

The Laharsub server provides a mechanism for plugging in a custom pub/sub backend implementation and as such abstracts this implementation with the following interface:

public interface IPubSubBackend
{
    IAsyncResult BeginCreateTopic(AsyncCallback callback, object state);
    int EndCreateTopic(IAsyncResult result);

    IAsyncResult BeginPublishMessage(int topicId, string contentType, Stream body, AsyncCallback callback, object state);
    int EndPublishMessage(IAsyncResult result);

    IAsyncResult BeginSubscribe(IDictionary<int, int> subscriptions, TimeSpan timeout, AsyncCallback callback, object state);
    IEnumerable<PubSubMessage> EndSubscribe(IAsyncResult result);
}

The CreateTopic API takes no parameters (right now) and returns the identifier of the newly created topic. The PublishMessage API accepts the topic identifier, the message content type, and the message itself (as a Stream), and returns the identifier of the message. Note that the backend is required to issue message identifiers that increase within the scope of a topic. The Subscribe API accepts the subscription specification as a dictionary that maps each topic identifier to the minimum message identifier within that topic the client is interested in receiving. The API also takes a timeout. The backend is expected to return either an enumeration of messages matching the subscription specification, or null if the timeout elapses, whichever comes first.

Note that this design pushes the problem of scale-out to the implementation of the IPubSubBackend interface. In particular, the implementation must handle the case where a message published to topic A is received on node N1 of a cluster while a subscription to topic A is pending on node N2 of the same cluster.

jQuery extension

The jQuery JavaScript library is currently the most popular JavaScript framework. Given that a fair amount of logic is involved in subscription multiplexing on HTTP long polls and parsing of multipart/mixed encoding on the client, Laharsub contains a jQuery extension that offers an easy set of APIs to get the work done. The src\client\jquery\jquery.pubsub.js file in the project contains the extension code.

Subscribing to a topic from an Ajax application that uses this extension looks like this:

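$.pubsub.subscribe({
    topicId: 14,
    from: 1,
    // Appends the body of a received message to the page.
    onMessageReceived: function (message) {
        $("#notifications").append("<p>" + message.body + "</p>");
    },
    // Replaces the page content with the error response text.
    onError: function (args) {
        $("body").html(args.httpRequest.responseText);
    }
});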

Publishing a message to a topic is similarly easy:

$.pubsub.publish({
    topicId: 14,
    contentType: "text/plain",
    body: "Hello, world!",
    onError: function (args) {
        $("body").html(args.httpRequest.responseText);
    }
});

The extension takes care of framing, multiplexing, and HTTP long poll management. Please consult the actual code of jquery.pubsub.js for more details.

Last edited Jul 12, 2010 at 4:02 AM by tjanczuk, version 4
