Gallery2:Development:Performance and Scalability

From Gallery Codex

  • Intended audience: Gallery developers.
  • Purpose of this document: Plan and discuss future development concerning performance optimizations and refactorings.

Definitions:

  • Performance: Minimizing the response time of a stock Gallery with very few items
  • Load: Scaling of the response time with the number of concurrent requests
  • Scalability: Scaling of the performance with the number of items, albums, users, ...


Theory and Heuristics

Before we refactor major components (MPTT, permissions / ACL, ...), we need to collect some baseline data:

  • What kind of data needs to be stored?
  • How often does the data change?
  • What data is needed for each request?
  • What queries do we need to run?
  • How often do we run each of those queries?

Please collect this data here in the Codex so that we can base our decisions on it.

Test Framework

Develop a test methodology and a test framework for:

Tools

  • KCachegrind / Callgrind (profiling and profile visualization)
  • Xdebug (PHP profiling)
  • ab2 (ApacheBench, the Apache 2 benchmarking tool)
  • Benchmarking and load tools: Siege, httperf, JMeter, ...
  • G2's profile mode (to identify slow DB queries)
  • EXPLAIN plans of SQL queries
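
As an illustration of the last tool, here is a minimal sketch of reading a query plan before and after adding an index. It uses Python's sqlite3 module as a stand-in database; the table `item`, the column `parent_id` and the index name are hypothetical and not part of the G2 schema. The databases G2 actually runs on have their own EXPLAIN syntax and output format.

```python
import sqlite3

def explain(conn, sql, params=()):
    """Return the human-readable detail strings of SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail text is the last column of every plan row.
    return [row[-1] for row in rows]

conn = sqlite3.connect(":memory:")
# Hypothetical table, for illustration only.
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, parent_id INTEGER, title TEXT)")

query = "SELECT id FROM item WHERE parent_id = ?"

# Without an index on parent_id the plan is a full table scan.
plan_before = explain(conn, query, (1,))

# With an index the plan switches to an index search.
conn.execute("CREATE INDEX item_parent_idx ON item (parent_id)")
plan_after = explain(conn, query, (1,))

print(plan_before)
print(plan_after)
```

Comparing the two plans for each query that runs on a typical page view is one way to decide which indexes are worth their maintenance cost.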

Bottlenecks

  • # of ACL ids
  • LIKE parentSequence%
  • # of queries per page view
  • Bad page-level caching (too many updates? Too frequent expiration? Need for block-level caching?)
  • # of file operations per request
  • Code footprint per request
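
To put numbers on the "# of queries per page view" item, one option is to wrap the database connection and count statements while a page is rendered. The sketch below is only an illustration in Python with sqlite3; `CountingConnection`, `render_page` and the `item` table are made-up names, not G2 code.

```python
from collections import Counter
import sqlite3

class CountingConnection:
    """Thin wrapper that records every statement executed through it."""

    def __init__(self, conn):
        self._conn = conn
        self.statements = Counter()

    def execute(self, sql, params=()):
        # Key on SQL text plus parameters so exact repeats can be spotted.
        self.statements[(sql, tuple(params))] += 1
        return self._conn.execute(sql, params)

    def total(self):
        return sum(self.statements.values())

    def duplicates(self):
        return {key: n for key, n in self.statements.items() if n > 1}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO item VALUES (?, ?)", [(1, "a"), (2, "b")])

def render_page(db):
    """Made-up page view that accidentally fetches item 1 twice."""
    db.execute("SELECT title FROM item WHERE id = ?", (1,)).fetchone()
    db.execute("SELECT title FROM item WHERE id = ?", (2,)).fetchone()
    db.execute("SELECT title FROM item WHERE id = ?", (1,)).fetchone()

db = CountingConnection(conn)
render_page(db)
print(db.total(), db.duplicates())
```

Statements that show up in `duplicates()` are candidates for caching or for merging into a single query.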

ACL / Permissions

  • Maybe create ACL groups so that heterogeneous edit/... permissions don't lead to many view-related ACL ids
  • Allow more than one ACL id per itemId to find an equilibrium between too many ACL ids per user and too many ACL ids per item

Queries per Page View

  • Are there similar / duplicate queries?
  • More caching required?
  • Group information together so it can be fetched with a single query? (e.g. load the user together with the session)
  • A user reports that viewing large albums is much slower than viewing small ones. See:

Hierarchy Mapping

  • Current: ParentChildMap + parentSequence
    • Cons:
      • LIKE parentSequence% = slow (how important is that?)
  • ParentChildMap + MPTT
    • Cons:
      • Expensive add, move, delete operations
      • Workarounds seem to be very complex (space allocation, bubbles)
  • ParentChildMap + DescendantMap
    • Cons:
      • Expensive add, move, delete?

File operations

  • Caching:
    • Question the Gallery Data Cache: maybe it should be stored in the database
    • Maybe we need to group more cached data together
    • Maybe more caching is needed (finer/coarser grained?)
    • Idea: Abstract cache class + cache store implementations (shared mem cache, db cache, disk cache, …)
  • Modular design:
    • Modularity at admin / configuration time shouldn't have much influence on the runtime of normal usage -> compile / generate everything that is active into the current runtime configuration / application files.
      • No file including / loading of each module on each request
      • No registration of event handlers / other callbacks on each request
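
The "abstract cache class + cache store implementations" idea from the caching list above could look roughly like the following. This is a sketch in Python, not a proposal for the actual G2 API: `CacheStore`, `MemoryStore`, `DatabaseStore` and `get_or_compute` are hypothetical names, and sqlite3 stands in for whatever database backend would hold the cache.

```python
from abc import ABC, abstractmethod
import json
import sqlite3

class CacheStore(ABC):
    """Interface that every cache backend implements."""

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def put(self, key, value): ...

    @abstractmethod
    def delete(self, key): ...

class MemoryStore(CacheStore):
    """Per-process cache held in a dictionary."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)

class DatabaseStore(CacheStore):
    """Cache held in a database table, values serialized as JSON."""

    def __init__(self, conn):
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value TEXT)")

    def get(self, key):
        row = self._conn.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)).fetchone()
        return json.loads(row[0]) if row else None

    def put(self, key, value):
        self._conn.execute(
            "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
            (key, json.dumps(value)))

    def delete(self, key):
        self._conn.execute("DELETE FROM cache WHERE key = ?", (key,))

def get_or_compute(store, key, compute):
    """Return the cached value, computing and storing it on a miss.

    A stored None is treated as a miss in this simplified sketch.
    """
    value = store.get(key)
    if value is None:
        value = compute()
        store.put(key, value)
    return value

stores = [MemoryStore(), DatabaseStore(sqlite3.connect(":memory:"))]
for store in stores:
    print(get_or_compute(store, "album:7:children", lambda: [8, 9, 10]))
```

Callers only depend on `CacheStore`, so the backend (shared memory, database, disk) could be chosen per installation without touching the calling code.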

Code footprint

  • Commonly loaded files are too large (GalleryUtilities, GalleryCoreApi, GallerySession, GalleryUrlGeneration, main.php, GalleryDataCache, GalleryStorage, ADOdb, Smarty)
  • See above (file operations / modular design)