Event-Based Loose-Coupled Integration

This document describes loose-coupled integration based on event-driven synchronization, its implementation, the alternatives, and how standards could improve interoperability between applications for the whole web application community.

Please take a look at the overview of Gallery 2 Embedding.

What is Event-Based Loose-Coupled Integration

Event-based loose-coupled integration is a method to integrate or embed one web application in another, e.g. to integrate a photo album application in a content management system, or to merge a blog with a forum.

The two terms loose-coupled and event-based can be explained separately, but only the combination of the two provides a complete solution for integrating / merging two or more web applications.

Loose-Coupled Integration

Loose-coupled means that the two applications are unchanged and each application exposes an interface through which it should be accessed by other applications. The two applications do not depend on each other, but they can delegate responsibilities like login / authentication to another application by communicating through the interfaces.

Integrating application A into another application B is done by having some glue code on top / as a plugin of B which calls A to get the generated output (HTML) and then B puts the generated output in its own output and returns the whole output to the user.

 User  -- request --> Application B (e.g. a CMS)
                        |____ B initializes etc.
                        |____ B authenticates the 
                        |     active user
                        |____ If the request is for  -- B calls --> Application A (e.g. a forum)
                        |     an integrated page of                   |____ A handles the request 
                        |     A in B                                        and returns the output
                        |                            <- A returns -
                        |     
                        |____ B assembles the whole 
                        |     page output and embeds
                        |     A's output in it
                        |____ B returns the requested
                        |     page to the user
 User  <-- output ---   |

Application A relies on application B supplying enough information in its call to application A, most importantly information about the authenticated user. For web applications, it's also important that B tells A what the URLs that A generates should look like.
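As a minimal sketch of the URL requirement, assume A builds all of its links through a single helper that B configures with its own base URL. The class name EmbedUrlGenerator, the example URL and the parameter names are hypothetical and not taken from any real application.

```php
<?php
// Hypothetical helper inside application A: every link A emits goes through it.
class EmbedUrlGenerator
{
    private $baseUrl;

    // $baseUrl is supplied by application B and points at B's own entry point.
    public function __construct($baseUrl)
    {
        $this->baseUrl = $baseUrl;
    }

    // Build a link to one of A's pages that still routes through B.
    public function generate(array $params)
    {
        if (empty($params)) {
            return $this->baseUrl;
        }
        // Append with '&' if B's base URL already has a query string, else with '?'.
        $separator = (strpos($this->baseUrl, '?') === false) ? '?' : '&';
        return $this->baseUrl . $separator . http_build_query($params, '', '&');
    }
}

$urls = new EmbedUrlGenerator('http://example.com/cms/index.php?module=forum');
echo $urls->generate(array('view' => 'topic', 'id' => 42)), "\n";
```

Because A never hard-codes its own entry point, every link the user clicks leads back to B, which then calls A again.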

But it's important to note that A and B communicate with each other only through a clearly defined interface or protocol; they don't access each other's data structures or functions directly.

TODO: Say something about: interfaces, API, protocols, master-slave relationship

Event-Based Synchronization

Since the applications are loose-coupled and don't share the same data, they need to notify each other of changes or generally events that could affect the other application. Such events include:

Creation of a new user
Update of some user data like the email address
Deletion of a user
Login, Logout
Application settings
...

So each time a new user is created in one application, it notifies the other application of this event through the well-defined interface, passing along the user name, user data etc. The other application can then take appropriate action, which in this case is to create a corresponding user and to map the two users to each other by their name or id number.
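The notification and mapping step could be sketched as follows. This is an illustration only: the class SlaveUserStore, its method names and the in-memory arrays that stand in for the notified application's database are all assumptions.

```php
<?php
// Hypothetical user store of the notified application; a real one would use its database.
class SlaveUserStore
{
    private $users = array();   // internal id => user record
    private $idMap = array();   // external (notifying application's) user id => internal id
    private $nextId = 1;

    // Handle a "user created" notification and remember which users belong together.
    public function onUserCreated($externalId, $userName, $email)
    {
        if (isset($this->idMap[$externalId])) {
            return $this->idMap[$externalId];   // duplicate notification: nothing to do
        }
        $internalId = $this->nextId++;
        $this->users[$internalId] = array('name' => $userName, 'email' => $email);
        $this->idMap[$externalId] = $internalId;
        return $internalId;
    }

    // Handle a "user data changed" notification, e.g. a new email address.
    public function onUserUpdated($externalId, $email)
    {
        if (!isset($this->idMap[$externalId])) {
            return false;   // unknown user
        }
        $this->users[$this->idMap[$externalId]]['email'] = $email;
        return true;
    }

    // Handle a "user deleted" notification by removing the user and its mapping.
    public function onUserDeleted($externalId)
    {
        if (!isset($this->idMap[$externalId])) {
            return false;
        }
        unset($this->users[$this->idMap[$externalId]]);
        unset($this->idMap[$externalId]);
        return true;
    }

    // Look up the local user that corresponds to an external user id.
    public function findByExternalId($externalId)
    {
        return isset($this->idMap[$externalId]) ? $this->users[$this->idMap[$externalId]] : null;
    }
}

$store = new SlaveUserStore();
$id = $store->onUserCreated('cms-17', 'alice', 'alice@example.com');
$store->onUserUpdated('cms-17', 'alice@example.org');
```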

TODO: Say something about: hooks, events

Summary

Loose-coupled integration together with event-based notification ensures that the two applications:

can be developed independently and don't need to be forked / hacked
never get out of sync
are easy to integrate

Usually the integrations are one-sided, that is, as in this example A is embedded in B. B is the master, A is the slave and the communication / notification in this master-slave relationship is only simplex. B notifies A of changes, requests data from A, but it's never the other way around.

Why Loose-Coupled?

The alternative to a loose-coupled system is a tight integration where data structures (database tables, ...) and / or function calls of the two applications are interwoven to a point where the result becomes unmaintainable: cross-dependencies multiply all over the code and the resulting system is (almost) a fork of the two original applications.

What happens if one of the applications has a new release? The authors of the integration would then have to backport all the changes to their forked integration system. Or apply all the hacks to the new release.

This is not a fictional scenario; there are plenty of examples of such integrations, e.g. PNphpBB (PostNuke fork of phpBB).

Such tight integrations are not easy to maintain and require a dedicated team to keep them up to date and to port security patches and fixes from the original applications to the fork / hacked integration. Depending on a third, often very small and not well-organized team of developers for a production-level application is certainly not desirable. Developers lose interest, so it's important to have applications backed by large development teams that guarantee continued support and development. Being able to quickly apply critical security patches is also of great importance.

The main goal of loose-coupled integration is to provide a very well integrated experience for the end-user while keeping the involved applications as separate as possible in the back-end. Well defined interfaces and a good design guarantee that there are no drawbacks for the end-user compared to tight integrations.

Why Event-Based?

The alternative to event-based notification of changes is on-the-fly notification. That is, instead of being notified exactly when the event happens, the other application is notified on-the-fly when a user requests the next embedded page.

Example:

A new user is created in application B (e.g. a CMS)
The same user visits an embedded page of A in B
When A is called by B, A checks whether the user already exists in A, if not, the user is created.
A handles the request in any case and returns the data as usual

In this case we talk of on-the-fly user creation.
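The steps above could be sketched like this on the side of application A. The class OnTheFlySlave and its methods are hypothetical, and an in-memory array stands in for A's user table.

```php
<?php
// Hypothetical entry point of application A that creates missing users on demand.
class OnTheFlySlave
{
    private $usersByExternalId = array();   // B's user id => user record in A

    // B calls this on every embedded request with the authenticated user's id and name.
    public function init($externalId, $userName)
    {
        if (!isset($this->usersByExternalId[$externalId])) {
            // First embedded request by this user: create the matching account in A.
            $this->usersByExternalId[$externalId] = array('name' => $userName);
        }
        return $this->usersByExternalId[$externalId];   // the active user for this request
    }

    // Number of users A currently knows about.
    public function userCount()
    {
        return count($this->usersByExternalId);
    }
}

$slave = new OnTheFlySlave();
$slave->init(17, 'alice');   // first visit creates the user
$slave->init(17, 'alice');   // later visits find the existing user
```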

The problems of on-the-fly user creation are obvious:

As long as a user doesn't visit an embedded page of A in B, the changes are not synchronized. If A has features like mass-emails to all registered users, it won't send emails to all registered users since only a subset already have an account in A
Only a small subclass of changes can be synchronized on-the-fly. User deletion, user data changes (e.g. email address), logout, configuration changes, ... cannot be synchronized without either a workaround that carries a performance penalty, like checking a list of queued changes on each embedded request, or a trade-off solution, like running a synchronization task periodically
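To illustrate the periodic synchronization trade-off mentioned above, here is a rough sketch that compares the two user lists and reports what is out of sync. The function name plan_user_sync and the array layout (B's user id as key, email address as value) are assumptions made for this example.

```php
<?php
// Compare B's user list with A's copy and report which users are out of sync.
// Both arrays are keyed by B's user id and hold the email address as value.
function plan_user_sync(array $masterUsers, array $slaveUsers)
{
    $plan = array('create' => array(), 'update' => array(), 'delete' => array());

    foreach ($masterUsers as $id => $email) {
        if (!array_key_exists($id, $slaveUsers)) {
            $plan['create'][] = $id;          // exists in B only
        } elseif ($slaveUsers[$id] !== $email) {
            $plan['update'][] = $id;          // exists in both, data differs
        }
    }
    foreach ($slaveUsers as $id => $email) {
        if (!array_key_exists($id, $masterUsers)) {
            $plan['delete'][] = $id;          // was deleted in B
        }
    }
    return $plan;
}

$master = array(1 => 'a@example.com', 2 => 'b@example.org', 4 => 'd@example.com');
$slave  = array(1 => 'a@example.com', 2 => 'b@example.com', 3 => 'c@example.com');
$plan = plan_user_sync($master, $slave);
```

Such a task has to walk both user lists on every run, and between runs the two applications remain out of sync, which is exactly the trade-off described above.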

Therefore we classify on-the-fly synchronization as a low-tech fallback solution for applications that are not ready to provide events / hooks such that plugins or other applications can hook into core functionality like the user creation.

Implementation

In this example we use the same pattern of a master-slave relationship as the one used before.

Master Application

The master application is the principal application receiving and handling requests by users and serving the resulting pages. A typical example is a content management system.

Let's further assume the master application is modular and developers can extend its functionality with plugins (or modules, extensions, components, ...). To embed a slave application, e.g. a forum or a photo management system, in the master application, you would then write a new plugin for the master. The plugin consists of a small wrapper script which communicates with the slave.

Wrapper Script

Sample for the wrapper script:

 $userId = getUserId(); // get the userId of the current active user in the master application
 $baseUrl = getBaseUrl(); // get the base URL of all URLs that point to the master application
 include_once('some/path/to/slaveApplication/embed.php'); // include the interface exposed by the slave application (once per request)
 embed_init($userId, $baseUrl); // initialize the slave application with the authenticated user and the base URL
 $html = embed_run(); // let the slave application handle the current request and return the resulting HTML

Then the master would embed the generated $html in one of its own templates and return the complete rendered page to the user.

Event Synchronization

We also assume the master application has a built-in event or hook system that enables its plugins to execute their own code on events such as user creation, login or configuration changes. An illustrative implementation of an event system could look like this (we don't encourage the use of globals, but they keep the example short):

 function create_user($username, $password) {
     // ... create the user as usual

     global $hooks;
     if (!empty($hooks[CREATE_USER])) { // check if there are any registered event listeners / hooks
         foreach ($hooks[CREATE_USER] as $event_handler) {
             $event_handler->handle_event($username, $password);
         }
     }
 }

A plugin of the master application could register an event handler like this:

 $GLOBALS['hooks'][CREATE_USER][] = new my_create_user_handler();

And the handler itself would look like this:

 class my_create_user_handler {
     function handle_event($username, $password) {
         // synchronize user creation to our embedded application
         include_once('path/to/slaveApplication/embed.php'); // include the interface to the slave application (only once, to avoid redeclaring its functions)
         embed_init(); // initialize the slave application
         embed_create_user($username, $password); // create the user in the embedded application
     }
 }

Of course you wouldn't implement an event / hooking system that way, but the important part is that there is an event when a user is created in the master application and that plugins of the master application can register themselves as an event listener or hook somehow into core functionality such that they can execute their embed_create_user() function when a user is created.
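A slightly less ad-hoc variant without globals might look like the following sketch. The EventDispatcher class, the event name 'user.create' and the closure used as listener are hypothetical; they only illustrate the register / dispatch pattern and do not describe the event system of any particular application.

```php
<?php
// Hypothetical minimal event dispatcher of the master application.
class EventDispatcher
{
    private $listeners = array();   // event name => list of callables

    // Plugins call this to hook into an event.
    public function register($eventName, callable $listener)
    {
        $this->listeners[$eventName][] = $listener;
    }

    // Core code calls this when the event happens; returns the number of listeners notified.
    public function dispatch($eventName, array $data)
    {
        if (empty($this->listeners[$eventName])) {
            return 0;
        }
        foreach ($this->listeners[$eventName] as $listener) {
            call_user_func($listener, $data);
        }
        return count($this->listeners[$eventName]);
    }
}

$dispatcher = new EventDispatcher();
$createdInSlave = array();   // stands in for the users created in the slave application
$dispatcher->register('user.create', function (array $data) use (&$createdInSlave) {
    // A real plugin would call the slave's embed_create_user() here.
    $createdInSlave[] = $data['username'];
});
$notified = $dispatcher->dispatch('user.create', array('username' => 'alice'));
```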

Slave Application

The primary task of the slave application is to offer an interface or a protocol of integration-related methods. This interface must offer methods to:

Initialize the application (as an alternative to the normal index.php entry point)
Create / update / delete users
Log users in / out
...

A sample interface could look like this (the embed.php file referenced above):

 function embed_init($userId=null, $baseUrl=null) {
     include(..); // include some files, initialize the application
     $user = get_user_by_mapped_external_userid($userId);
     set_active_user($user);
 }
 function embed_run() {
     $mode = 'embedded';
     $html = main($mode); // call the application in embedded mode (return html instead of printing to browser, ...)
     return $html;
 }
 function embed_create_user($username, $password) {
     some_api_call_to_create_user($username, $password);
 }

The slave application must of course meet a lot of requirements that allow it to run in embedded mode. But the important bit here is that the inner architecture of the application is hidden and only a stable, simple interface is exposed to the master application.

Standardization

Integrating a web application into another can be quite complicated, since only a small percentage of applications are designed to also work when embedded, and only a few frameworks / applications are designed to offer the means to embed other applications into them.

If there were a standardized way of integrating applications, with specifications that must be met by the master and by the slave applications, writing an integration could be facilitated a lot. Everyone would profit:

There would be clear, standardized patterns and specifications to follow when designing an application for integration
Interoperability of a large range of applications would be guaranteed
Frameworks / CMS would profit from seamlessly integrating best-of-breed applications in their overall solution
Dedicated applications (forums, commerce, photo album, ...) would find their way into more CMSs
The end-user profits from better overall solutions
Tools common to all integrations could be developed

The key requirement for master applications is an event or hooking system. Without one, only a low-tech fallback solution like on-the-fly user creation with periodic synchronization runs is possible. How this event system is implemented does not matter at all.

The slave applications must expose a well-defined interface or protocol. The details of such an interface are not that important (for now, we can always improve the standardization at a later point), as long as the interface follows the same basic principles to offer functions to initialize, handle the request, create a user, etc.

If all interfaces were standardized one could assemble a CMS that has a feature-rich forum, gallery etc. by just writing a little glue-code. Seeing each CMS writing their own (mostly poor in features) forum solution or multimedia framework and thus wasting their own resources and making users less happy than with feature-rich solutions is reason enough for such a standard.

A standard should have different levels, meaning that not all applications can meet all goals of such a standard, but they should still work together, just on a lower level of standardization.
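One conceivable way to express such levels in code is a set of interfaces that build on each other. The following is only a sketch: the interface names, the two levels and the example class SimpleGuestbook are invented for illustration and are not part of any existing standard.

```php
<?php
// Level 1 (hypothetical): the application can be initialized and run in embedded mode.
interface EmbeddableApplication
{
    public function init($userId, $baseUrl);
    public function run();   // returns the generated HTML instead of printing it
}

// Level 2 (hypothetical): the application additionally accepts user synchronization events.
interface UserSynchronizable extends EmbeddableApplication
{
    public function createUser($userName, $email);
    public function deleteUser($userName);
}

// Let a master application find out how far an integration can go.
function supported_level($application)
{
    if ($application instanceof UserSynchronizable) {
        return 2;
    }
    if ($application instanceof EmbeddableApplication) {
        return 1;
    }
    return 0;   // not embeddable through this interface at all
}

// Example application that only meets the lower level.
class SimpleGuestbook implements EmbeddableApplication
{
    private $userId;

    public function init($userId, $baseUrl)
    {
        $this->userId = (int) $userId;
    }

    public function run()
    {
        return '<p>Guestbook for user ' . $this->userId . '</p>';
    }
}

$guestbook = new SimpleGuestbook();
$guestbook->init(17, 'http://example.com/cms/index.php?module=guestbook');
$level = supported_level($guestbook);
```

A master could then embed a level 1 application with on-the-fly user creation only, and register its event handlers just for applications that report level 2.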

CMS developers usually want to develop their own solution and favor their own forum or gallery solution etc., since they believe that one day they'll have all the features needed; plus it's then an integral part of their CMS, well designed, and their code is better anyway :) Of course we have to accept and appreciate this position, but with integration / interoperability standards, applications could achieve the same without having to develop specialized solutions like a forum system.

Applications Supporting Event-Based Integration

As a master application:

Gallery2 (90% events / hooks for user / group create / update / delete, logout, login, ..missing: configuration changes)
Xaraya (100% events / hooks for user / group create / update / delete, configuration changes, logout, login, ..)
Wordpress (90% events / hooks for user / update / delete, rewrites, logout, login, ..)
Drupal (80% maybe?)
Joomla (100% joomla 1.5; login, logout, create, update, delete user, block, activation, system before and after start)
Typo3 (50% maybe?, e.g. no unified create user event / hook)
TikiPro (?%)
...

As a slave application:

Gallery 2
Phorum.org
Pretty much all scripts, since you can always add a layer / interface which calls the internal functions and wrap output buffering around it. But they need to provide basic means to dictate the URL format etc.
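The output buffering layer mentioned in the last item could be sketched as follows, using PHP's standard ob_start() and ob_get_clean() functions. The functions legacy_main() and capture_output() are hypothetical stand-ins for a script's normal entry point and for the added layer.

```php
<?php
// Stand-in for a legacy script that prints its page directly to the browser.
function legacy_main()
{
    echo '<h1>Guestbook</h1>';
    echo '<p>3 entries</p>';
}

// Run an entry point and return what it printed instead of sending it to the browser.
function capture_output(callable $entryPoint)
{
    ob_start();
    try {
        call_user_func($entryPoint);
    } catch (Exception $e) {
        ob_end_clean();   // discard partial output before passing the exception on
        throw $e;
    }
    return ob_get_clean();
}

$html = capture_output('legacy_main');
```

The master application can then place $html into its own template, just like the return value of embed_run().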

References

Event-based loose integration has been an obvious and prominent approach to software integration for a long time now. These are a few papers and references on the topic:

Daniel J. Barrett, Lori A. Clarke, Peri L. Tarr, Alexander E. Wise, "A Framework for Event-Based Software Integration" (Postscript, 461K). ACM Transactions on Software Engineering and Methodology (TOSEM), Volume 5 Number 4, October 1996. Also see: http://www.blazemonger.com/publications.html#ACADEMIC

More links specific to web / PHP interoperability:

http://www.geeklog.net/article.php/2002111817094613 (from early 2002)
http://www.google.com/search?hl=en&q=cms+bridge
http://www.phorum.org/phorum5/read.php?28,53902,page=1 (2006)



From Gallery Codex
