Gallery2:GallerySession

GallerySession Docs for Developers

  • Persistent sessions are stored on the Gallery 2 server so that session state and session data persist across subsequent requests from the same user (G2 has to remember which user is logged in, what items are in the cart, what language was chosen, ...)
  • Persistent session data is stored in the database.
  • A session is initialized for each request and resides only in memory. If a persistent session already exists, it is loaded into the active session; if the session is new, we initialize a session with default attributes
  • G2 only creates persistent sessions for logged-in users and for guest users that have non-default data in their session. E.g. if a guest adds something to their shopping cart, a persistent session is created for guests too
  • G2 creates persistent sessions as late as possible, that is very close to the end of the request handling, since we have to evaluate the current active session to decide whether a persistent session is needed or not
  • G2 assumes a persistent session exists if the request has a sessionId parameter in the URL or if there's a GALLERYSID cookie. Else the session is new
  • G2 resets the sessionId to an empty value if there is no persistent session with the provided sessionId or if the persistent session has expired
  • Adding session data in immediate views or in a progress-bar view / controller call is discouraged, since we cannot create new sessions once we start outputting HTML to the browser (we can't send a cookie and we can't replace temporary sessionIds). Since progress-bar views are usually executed by logged-in users, who already have a persistent session, this is rarely a problem in practice. For guest users, however, it will not work: we can't send a cookie once HTML has been output, and the sessionId appended to URLs is not valid.
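
The "do we need a persistent session?" decision from the list above can be sketched as follows. This is a small Python model, not the G2 PHP code; ActiveSession, DEFAULT_DATA and needs_persistent_session() are hypothetical names, and the default attributes shown are only an assumption.

```python
import copy
from dataclasses import dataclass, field

# Hypothetical default attributes of a fresh in-memory session.
DEFAULT_DATA = {"language": None, "cart": []}


@dataclass
class ActiveSession:
    """In-memory session that exists for exactly one request."""
    is_guest: bool = True
    data: dict = field(default_factory=lambda: copy.deepcopy(DEFAULT_DATA))


def needs_persistent_session(session: ActiveSession) -> bool:
    """Logged-in users always get a persistent session; guests only
    if their session data differs from the defaults."""
    if not session.is_guest:
        return True
    return session.data != DEFAULT_DATA
```

A guest who only browses never triggers a database write; the same guest does as soon as something lands in the cart.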

A typical request

From the perspective of GallerySession, a normal G2 request goes like this:

1 init urlGenerator (could be called before or after init session)
2 init session (LOAD DATA OF EXISTING SESSION OR LOAD EMPTY SESSION, actually it's $gallery->initSession(); which calls $session = new GallerySession(); $session->init())
3 init translator (sets active language for the session)
4 init navigation (in url generator, depends on navigation data loaded from session)
5 setActiveUser($user) (sets the active user for the session)
6 handle request ($controller->handleRequest() and/or $view->loadTemplate(), maybe changes session data)
7 start session (creates a persistent session if necessary, actually it's $session->start();)
if new session {
  if persistent session is necessary {
    get new sessionId and save session (total 1 db query)
  } else { return; /* don't need a session */ }
}
send a cookie
8 $html = $session->replaceTempSessionIdIfNecessary($html) (only relevant for new sessions, see below for explanations)
if session is new {
  if persistent session is newly created {
    replace temp session id placeholder that is in the generated HTML with the real, new sessionId
  } else {
    remove the g2_GALLERYSID= temp session id placeholder from all URLs in the generated HTML
  } 
}
9 save session (only if necessary. if the persistent session was just created, we won't save it here again. if there's no sessionId (sessions of guest users don't have a sessionId, unless they are persistent sessions), we won't save the session)
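
Step 7 can be modelled roughly like this. The sketch is in Python and only mirrors the pseudocode above; SessionModel and its attributes are hypothetical, a dict stands in for the session table, and the id is generated with secrets.token_hex() purely to get a 32 character value.

```python
import secrets


class SessionModel:
    """Toy model of the start() step; not the real GallerySession."""

    def __init__(self, session_id="", needs_persistence=False):
        self.session_id = session_id          # empty string: session is new
        self.needs_persistence = needs_persistence
        self.db = {}                          # stands in for the session table
        self.db_queries = 0
        self.cookie_sent = False
        self.newly_created = False

    def is_new(self):
        return self.session_id == ""

    def start(self):
        if self.is_new():
            if not self.needs_persistence:
                return                        # no session, no query, no cookie
            # Acquire the id and store the session in one step (1 db query).
            self.session_id = secrets.token_hex(16)
            self.db[self.session_id] = {}
            self.db_queries += 1
            self.newly_created = True
        self.cookie_sent = True
```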

How to handle the three different types of Gallery requests

In G2 standalone, we can differentiate three basic types of requests in main.php (from the session's point of view):

  • a controller request followed by a redirect

We need to check if we need to create a new persistent session before redirecting and, if there's a sessionId in the redirect URL, replace it with the real, new one

  • a controller request followed by a delegate to a view, or simply a view request

We need to check if we need to create a new persistent session after fetching the rendered HTML. For a new persistent session we replace the pseudo temp sessionId with the real one; if no persistent session was created, we remove it. If we had a real sessionId at the beginning of the request, do nothing.

  • a controller request followed by a progress-bar view, or an immediate view request

Just before handing control to the progress-bar view / immediate view, create a new session if necessary. Then stop using the pseudo temp session id (getId() will return an empty string or the real sessionId), since we can't do additional session work during the immediate / progress-bar view.
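
For the redirect case, fixing up the redirect URL could look like the sketch below. It is written in Python with the standard urllib.parse module rather than in G2's PHP; fix_redirect_url() is a hypothetical helper and the value of SESSION_TEMP_ID is a made-up placeholder.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

SESSION_KEY = "g2_GALLERYSID"
SESSION_TEMP_ID = "TEMP_ID_PLACEHOLDER"  # hypothetical placeholder value


def fix_redirect_url(url, real_session_id):
    """Replace the temp id in a redirect URL with the real sessionId,
    or drop the parameter if no persistent session was created
    (real_session_id is an empty string in that case)."""
    parts = urlsplit(url)
    query = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == SESSION_KEY and value == SESSION_TEMP_ID:
            if not real_session_id:
                continue                      # remove the parameter entirely
            value = real_session_id
        query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

A URL that already carries a real sessionId passes through unchanged, which matches the "do nothing" rule for existing sessions.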

Explanations for the replaceTempSessionIdIfNecessary($html) stage

Problem

If we determine in $session->init() that the session is new / no persistent session exists yet, then we have a problem. On the one hand, we need a new sessionId as soon as possible since we need to append the new sessionId to all generated URLs since we don't yet know whether the user-agent (browser) accepts cookies or not. Without cookies, stateful browsing is only possible with a sessionId in all URLs that a user may click to request the next page.

On the other hand, we can't acquire a new, unique / collision-free sessionId without querying the database to check that the new, randomly chosen sessionId is not already in use. The chance of an md5 collision is very small (around 1 : 2^128), but we want a guarantee. Querying the database for a new sessionId costs 1 db query, and if we then find out that we don't need a session after all, we should delete the acquired sessionId at the end of the request. That makes 2 queries, and for most guest requests we don't need a session anyway.
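
To illustrate why a guaranteed-unique id needs a database round trip, here is a minimal sketch using Python's sqlite3 module. It assumes a table with the sessionId as primary key and simply retries on a collision; acquire_session_id(), create_session_store() and the new_id parameter are hypothetical and not taken from G2.

```python
import secrets
import sqlite3


def create_session_store(conn):
    """Hypothetical minimal session table: the id is the primary key."""
    conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, data TEXT)")


def acquire_session_id(conn, data, new_id=None, max_attempts=5):
    """Insert the session under a random 32 character id. The primary
    key constraint is what guarantees uniqueness; on a collision we
    try again with another candidate."""
    new_id = new_id or (lambda: secrets.token_hex(16))
    for _ in range(max_attempts):
        candidate = new_id()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO sessions (id, data) VALUES (?, ?)",
                    (candidate, data),
                )
            return candidate
        except sqlite3.IntegrityError:
            continue  # id already in use
    raise RuntimeError("could not acquire a unique session id")
```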

Solution

For new sessions, we use a temporary pseudo session ID during the whole request. All generated URLs contain 'g2_GALLERYSID=' . SESSION_TEMP_ID, and other modules get the SESSION_TEMP_ID too if they call $session->getId().

Once we reach the latest point in the request before we output HTML or redirect to another page, we evaluate the session data to find out whether a persistent session is required. If so, a persistent session is created.

  • If a persistent session was created, we need to replace SESSION_TEMP_ID in the whole source with the new, valid sessionId. That's a very cheap str_replace() operation.
  • If no persistent session was created, we need to remove all 'g2_GALLERYSID=' . SESSION_TEMP_ID and SESSION_TEMP_ID occurrences in the generated HTML / redirect URLs. That's also a very cheap operation.
  • For existing sessions, the whole replaceTempSessionIdIfNecessary() call does nothing and just returns
  • A nice side-effect is that search engines don't get any sessionId in their URLs at all. And guests don't get ugly sessionIds in the URLs of the first page that they visit.
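
The replace / remove step from the list above can be sketched with plain string replacement. This is a Python model, not the actual replaceTempSessionIdIfNecessary() code; the placeholder value and the handling of the "?", "&" and "&amp;" separators are assumptions.

```python
SESSION_KEY = "g2_GALLERYSID"
SESSION_TEMP_ID = "TEMP_ID_PLACEHOLDER"  # hypothetical placeholder value


def replace_temp_session_id(html, session_was_new, real_session_id):
    """Model of the replace step. real_session_id is an empty string
    if no persistent session was created during this request."""
    if not session_was_new:
        return html                           # existing session: nothing to do
    if real_session_id:
        return html.replace(SESSION_TEMP_ID, real_session_id)
    param = SESSION_KEY + "=" + SESSION_TEMP_ID
    # Remove the parameter together with its separator, then any
    # remaining bare occurrences of the placeholder.
    for pattern, replacement in (
        ("?" + param + "&amp;", "?"),
        ("?" + param + "&", "?"),
        ("?" + param, ""),
        ("&amp;" + param, ""),
        ("&" + param, ""),
        (param, ""),
        (SESSION_TEMP_ID, ""),
    ):
        html = html.replace(pattern, replacement)
    return html
```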

One could argue that we don't need a SESSION_TEMP_ID and should set sessionId for new sessions to a randomly chosen new sessionId. And once we call $session->start() to check if we really need the session, we can find out whether the id that we used throughout the request in URLs etc. is a new, unique id or not. Then in replaceTempSessionIdIfNecessary() we wouldn't have to replace the sessionId if a new persistent session was created. Only in the unlikely case of 1 : 2^128 (if a collision occurred) would we have to replace the used sessionId with a new, truly unique sessionId.

The reason why we choose a SESSION_TEMP_ID which doesn't have the format of our normal sessionIds is that it's easy to find out if some code uses an unreplaced sessionId or not. Also, the step to replace the SESSION_TEMP_ID is super cheap, so we wouldn't win that much.

Usage

If you want to create a new session and 'login' a specific user, do (omitting the obligatory error checks):

  1. $session = new GallerySession();
  2. $ret = $session->init();
  3. $session->setUserId($id);

Alternatively, use Gallery, but then you're interacting with the one and only $session object:

  1. global $gallery;
  2. $ret = $gallery->initSession();
  3. $gallery->setActiveUserId($id); or $gallery->setActiveUser($user);

If you want to create a new session and ensure that you've got a valid sessionId, do:

  1. $session = new GallerySession();
  2. $ret = $session->initEmpty(false, $userId);
This is useful e.g. in print modules, which need to create a session for printing offices so that they can fetch the images from G2 with sufficient privileges to download them.

Before outputting, we call in main.php:

  1. $ret = $session->start(); // which either creates a persistent session and then we have a valid sessionId, or it just returns
  2. $html = $session->replaceTempSessionIdIfNecessary($html);
to remove / replace the temp session id for new sessions

If the HTML is output directly to the browser, we can't replace / remove the temp session id and therefore have to check if we need a session before we start outputting. That is the case for immediate views and the progress-bar view (as an exception among the non-immediate views).

  1. $ret = $session->start();
  2. $session->doNotUseTempId();
which ensures that $session->getId() returns the actual sessionId which is either an empty string or a valid sessionId, but not the temp id.
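
The effect of doNotUseTempId() on getId() can be modelled as below. This is a Python sketch of the behaviour described above, with a hypothetical class name and a made-up placeholder value; it is not the PHP implementation.

```python
SESSION_TEMP_ID = "TEMP_ID_PLACEHOLDER"  # hypothetical placeholder value


class SessionIdModel:
    """Toy model of getId() / doNotUseTempId()."""

    def __init__(self, session_id=""):
        self._session_id = session_id         # empty: no persistent session
        self._use_temp_id = True

    def do_not_use_temp_id(self):
        self._use_temp_id = False

    def get_id(self):
        if self._session_id:
            return self._session_id           # real id always wins
        return SESSION_TEMP_ID if self._use_temp_id else ""
```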

Search Engines and Gallery Sessions

We don't create sessions / sessionIds for anonymous / guest users, and all search engines fall into this category in G2 since they don't explicitly log in to G2. That means that no generated URL contains a sessionId for search engines. This is particularly important, since search engines could otherwise fall into a loop of following the same links again and again just because the sessionId in the URL has changed. That is not the case in G2.

We also need to detect search engines since some pages in G2 could add some data to the session. Guests don't have a session in G2 as long as their session data has nothing important in it. As soon as a view / controller adds some special data to the session, we create a persistent session for this guest user. If we didn't detect search engines, we'd create a persistent session for them, which could be quite expensive if the search engines fetch hundreds or thousands of pages from G2. Also, in embedded G2, we'd add the sessionId to the core.DownloadItem URLs and the search engine would record ugly URLs with sessionIds in them. That's why we still need to detect search engines. When we detect one, we mark the user agent as using cookies, call $this->doNotUseTempId(), and remember not to create a persistent session.

As of G2.2, the following search engines are detected by G2:

Search Engine Identifier in G2    User Agent Must Contain
google                            gsa-crawler OR Google
yahoo                             Yahoo
askjeeves                         Ask Jeeves
microsoft                         msnbot
yandex                            Yandex
stackrambler                      StackRambler
convera                           ConveraCrawler

You can edit this list in the function identifySearchEngine() in modules/core/classes/GalleryUtilities.class.
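
The table boils down to a substring match on the user agent. The following Python sketch shows the idea; it is not a port of identifySearchEngine(), and treating the match as a case-sensitive substring test is an assumption.

```python
# Identifier in G2 -> substrings, one of which the user agent must contain.
SEARCH_ENGINES = {
    "google": ("gsa-crawler", "Google"),
    "yahoo": ("Yahoo",),
    "askjeeves": ("Ask Jeeves",),
    "microsoft": ("msnbot",),
    "yandex": ("Yandex",),
    "stackrambler": ("StackRambler",),
    "convera": ("ConveraCrawler",),
}


def identify_search_engine(user_agent):
    """Return the G2 identifier of the search engine, or None."""
    for engine, needles in SEARCH_ENGINES.items():
        if any(needle in user_agent for needle in needles):
            return engine
    return None
```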

Session expiration

Since we don't create persistent sessions for guest users unless they browse around a lot / pick a preferred language and such things, the problem of a time-consuming expiration operation to expire / remove old sessions is eased.

  • The expiration code is called for 2% of all requests in which a new persistent session was created. It is called in save() ($session->save() is called on every request).
  • The expiration code removes all sessions from the database that are older than the max lifetime or that haven't been modified for longer than the activity timeout limit
  • Additionally, all guest sessions older than a week are deleted
  • These two conditions are merged into a single database query
  • We delete at most 1000 sessions per _expire() call to avoid locking the session table for too long. Since the 'garbage collection' rate, i.e. the rate at which we call _expire(), is quite high, this should work quite well. The more sessions / requests there are, the higher the rate at which we call _expire() (1:1 relation).
  • The session table has an index over (userId, creationTimestamp, modificationTimestamp). Irrelevant for this query, it also has a PK on sessionId, an index on userId and an index on isUsingCookies (we might want to do some operations only on sessions that are using cookies?!). Actually, isUsingCookies could be dropped, your call.
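
The expiration rules above could be expressed as a single batched DELETE, sketched here with Python's sqlite3 module. The table layout, GUEST_USER_ID, and the decision to measure the guest age from creationTimestamp are assumptions; the real query in _expire() may differ.

```python
import random
import sqlite3

EXPIRE_PROBABILITY = 0.02        # run for roughly 2% of qualifying requests
MAX_EXPIRED_PER_CALL = 1000      # keep each delete batch small
WEEK = 7 * 24 * 3600
GUEST_USER_ID = 5                # hypothetical id of the guest user


def create_session_table(conn):
    conn.execute(
        "CREATE TABLE sessions (id TEXT PRIMARY KEY, userId INTEGER, "
        "creationTimestamp INTEGER, modificationTimestamp INTEGER)"
    )


def expire_sessions(conn, now, max_lifetime, inactivity_timeout):
    """Delete expired sessions in one query, at most one batch per call."""
    cur = conn.execute(
        """DELETE FROM sessions WHERE id IN (
               SELECT id FROM sessions
               WHERE creationTimestamp < :lifetime_cutoff
                  OR modificationTimestamp < :activity_cutoff
                  OR (userId = :guest AND creationTimestamp < :guest_cutoff)
               LIMIT :batch)""",
        {
            "lifetime_cutoff": now - max_lifetime,
            "activity_cutoff": now - inactivity_timeout,
            "guest": GUEST_USER_ID,
            "guest_cutoff": now - WEEK,
            "batch": MAX_EXPIRED_PER_CALL,
        },
    )
    conn.commit()
    return cur.rowcount


def maybe_expire(conn, now, max_lifetime, inactivity_timeout, rng=random.random):
    """Probabilistic garbage collection; returns the number of deleted rows."""
    if rng() < EXPIRE_PROBABILITY:
        return expire_sessions(conn, now, max_lifetime, inactivity_timeout)
    return 0
```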

Performance

Performance is not really an issue.

  • For guest visitors without special session data, no database query is done for sessions
  • The "load persistent session" query is only made if a sessionId is in the request variables or a cookie is present
  • Before a sessionId is used, it is sanitized; sessionIds that do not consist of exactly 32 valid digits/characters are rejected
  • Creating a new persistent session requires a single query to acquire a valid & unique sessionId and simultaneously store the session data
  • If a client sends an invalid / expired sessionId with a request, G2 sends back a "delete cookie" to make sure that G2 does not have to handle the invalid sessionId on each request (load query to find out that the sessionId is invalid).
  • Expiration of old sessions is done often, but only in small batches
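
The sanitizing step that keeps bogus ids away from the database can be sketched as a simple format check. This Python sketch assumes the 32 valid characters are lowercase hex digits (the format of an md5 hash); sanitize_session_id() is a hypothetical name.

```python
import re

# Assumption: a valid sessionId is exactly 32 lowercase hex characters.
_SESSION_ID_RE = re.compile(r"[0-9a-f]{32}")


def sanitize_session_id(raw):
    """Return the id if it is well-formed, else an empty string, so
    that no "load persistent session" query is made for garbage input."""
    if isinstance(raw, str) and _SESSION_ID_RE.fullmatch(raw):
        return raw
    return ""
```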

Issues

No Navigation (navId) Back links for Guests

Guests without a session don't get a navId / back links since we can't store the navigation data between requests. But as mindless said, guests usually don't have access to pages that would make use of the navigation data.

Check if we still get a valid back link when browsing as guest and adding something to the cart.

View Count is less reliable / accurate

If we wanted to keep the lastViewed data in the session, we'd have to create a session for all guests. Instead, we don't create a session just for the lastViewed data. But that means that we'd increment the view count again and again if a guest browses back and forth or views multiple resizes of a specific image.

I've added a check for the HTTP If-Modified-Since header in incrementViewCount() for the case that we don't have lastViewed data. I'm not checking the date of If-Modified-Since at all, just checking for its presence. It probably won't work that well, since the request URLs for different resizes and the fullsize are different.
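
The check could look roughly like this. It is a Python sketch of the idea only: should_increment_view_count() is a hypothetical name, and representing lastViewed as a set of item ids (or None when there is no session data) is an assumption.

```python
def should_increment_view_count(item_id, last_viewed, request_headers):
    """last_viewed is a set of recently viewed item ids, or None if we
    have no session data. request_headers maps header names to values."""
    if last_viewed is not None:
        return item_id not in last_viewed
    # No session data: the mere presence of If-Modified-Since hints that
    # the browser has fetched this URL before. The date is not inspected.
    has_header = any(
        name.lower() == "if-modified-since" for name in request_headers
    )
    return not has_header
```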

Alternatively, one could send another cookie, not a GALLERYSID cookie, just to keep data about things like navId and lastViewed.

Data that is set in controllers could have SESSION_TEMP_ID

Data that is set in the controller and passed on to the session or to the view could contain the SESSION_TEMP_ID. I'll also have to replace / remove the SESSION_TEMP_ID from the $form and the session data I guess.
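
One way to do that would be to walk the nested data and replace the placeholder in every string. The sketch below is in Python and only illustrates the idea; replace_temp_id() is a hypothetical helper, the placeholder value is made up, and only strings, dicts and lists are handled.

```python
SESSION_TEMP_ID = "TEMP_ID_PLACEHOLDER"  # hypothetical placeholder value


def replace_temp_id(value, real_session_id):
    """Return a copy of value in which every occurrence of the temp id
    in a string is replaced. Passing an empty real_session_id removes
    the placeholder instead."""
    if isinstance(value, str):
        return value.replace(SESSION_TEMP_ID, real_session_id)
    if isinstance(value, dict):
        return {k: replace_temp_id(v, real_session_id) for k, v in value.items()}
    if isinstance(value, list):
        return [replace_temp_id(v, real_session_id) for v in value]
    return value  # numbers, booleans, None are left untouched
```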

Not allowed to manipulate session during progress-bar / immediate view

During immediate views / progress-bar views, you're no longer allowed to manipulate the session. I don't think that's a problem, but it might be one. At least you can still manipulate the session if the session already exists. In such a case, it won't attempt to send a cookie and doesn't have to replace a pseudo id or something like that.

GallerySession.class is a lot of code to parse

Maybe split GallerySession.class into a GallerySession and a GallerySessionHelper_simple.class since in most cases, we only need to load the session data and nothing else.