- Intended audience: Gallery developers.
- Purpose of this document: Plan and discuss future development concerning performance optimizations and refactorings.
Definitions:
- Performance: Minimizing the response time of a stock Gallery with very few items
- Load: Scaling of the response time with the number of concurrent requests
- Scalability: Scaling of the performance with the number of items, albums, users, ...
Theory and Heuristics
Before we refactor major components (MPTT, permissions / ACL, ...), we need to acquire some base data first.
- What kind of data needs to be stored?
- How often does the data change?
- What data is needed for each request?
- What queries do we need to run?
- How often do we run each of those queries?
Please collect this data here in the codex so that we can base our decisions on it.
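One possible way to gather the "what queries, how often" numbers is to log every SQL statement for a page view and group statements that differ only in their literal values. The following is a minimal sketch under that assumption. The table names, the log contents, and the helper names `normalize` and `tally` are hypothetical and are not taken from Gallery's code.

```python
import re
from collections import Counter


def normalize(sql: str) -> str:
    """Replace literals with '?' so queries that differ only in values group together."""
    sql = re.sub(r"'(?:[^']|'')*'", "?", sql)  # quoted string literals
    sql = re.sub(r"\b\d+\b", "?", sql)         # standalone numeric literals
    return re.sub(r"\s+", " ", sql).strip()    # collapse whitespace


def tally(queries):
    """Count how often each query shape occurs."""
    return Counter(normalize(q) for q in queries)


# Hypothetical query log for a single page view.
log = [
    "SELECT * FROM item WHERE id = 7",
    "SELECT * FROM item WHERE id = 12",
    "SELECT name FROM users WHERE id = 5",
    "SELECT * FROM item  WHERE id = 7",
]

counts = tally(log)
for shape, n in counts.most_common():
    print(n, shape)
```

Query shapes with a high count are candidates for caching or for being merged into a single query.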
Test Framework
Develop a test methodology and a test framework for:
Tools
- kcachegrind / Callgrind (profiling)
- xdebug (PHP profiling)
- ab2 (ApacheBench, the Apache 2 benchmarking tool)
- Benchmarking and load tools: Siege, httperf, JMeter, ...
- G2's profile mode (to identify slow DB queries)
- Explain plan of SQL queries
Bottlenecks
- # of ACL ids
- LIKE parentSequence%
- # of queries per page view
- Bad page-level caching (too many updates? Too frequent expiration? Need block-level caching?)
- # file operations per request
- Code footprint per request
ACL / Permissions
- Maybe create ACL groups such that heterogeneous edit/... permissions don't lead to many view-related ACL ids
- Allow more than one ACL id per itemId to find an equilibrium between too many ACL ids per user and too many ACL ids per item
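To illustrate why heterogeneous edit permissions can inflate the number of ACL ids, here is a simplified model. It assumes that every distinct set of (principal, permission) pairs receives its own ACL id, and that items with identical sets share one. This is an assumption made for illustration, not a description of Gallery's actual schema. The item names, principals, and helper functions are hypothetical.

```python
from itertools import count


def assign_acl_ids(item_perms):
    """Give every distinct permission set one ACL id; identical sets share an id."""
    ids_by_set = {}
    next_id = count(1)
    result = {}
    for item, perms in item_perms.items():
        key = frozenset(perms)
        if key not in ids_by_set:
            ids_by_set[key] = next(next_id)
        result[item] = ids_by_set[key]
    return result


def only(item_perms, permission):
    """Project each item's permission set onto a single permission type."""
    return {
        item: {(who, p) for who, p in perms if p == permission}
        for item, perms in item_perms.items()
    }


# Hypothetical albums: same view permission, different editors.
items = {
    "album1": {("everybody", "view"), ("alice", "edit")},
    "album2": {("everybody", "view"), ("bob", "edit")},
    "album3": {("everybody", "view"), ("carol", "edit")},
}

combined = assign_acl_ids(items)
view_only = assign_acl_ids(only(items, "view"))

print(len(set(combined.values())))   # 3 ACL ids when view and edit share one set
print(len(set(view_only.values())))  # 1 ACL id when view is tracked separately
```

In this model, splitting view permissions into their own group collapses three ACL ids into one for the view check, which is the check that runs on most requests.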
Queries per Page View
- Are there similar / duplicate queries?
- More caching required?
- Grouping information together to fetch it with a single query? (e.g. user into session)
- A user reports that viewing large albums is much slower than viewing small ones. See:
Hierarchy Mapping
- Current: ParentChildMap + parentSequence
- Cons:
- Queries using LIKE parentSequence% are slow (how important is that?)
- ParentChildMap + MPTT
- Cons:
- Expensive add, move, delete operations
- Workarounds seem to be very complex (space allocation, bubbles)
- ParentChildMap + DescendantMap
- Cons:
- Expensive add, move, delete?
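The three hierarchy mappings above can be compared on a toy tree. The sketch below uses an in-memory SQLite database and assumes that parentSequence stores the ancestor ids as a slash-terminated path, that MPTT uses the usual left/right numbering, and that DescendantMap stores one row per (ancestor, descendant) pair. The table and column names are hypothetical and do not reflect Gallery's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY, parent_sequence TEXT,
                   lft INTEGER, rgt INTEGER);
CREATE TABLE descendant_map (ancestor_id INTEGER, descendant_id INTEGER);
""")
# Tree: 1 is the root, 2 and 4 are children of 1, 3 is a child of 2.
conn.executemany("INSERT INTO item VALUES (?, ?, ?, ?)", [
    (1, "", 1, 8), (2, "1/", 2, 5), (3, "1/2/", 3, 4), (4, "1/", 6, 7)])
conn.executemany("INSERT INTO descendant_map VALUES (?, ?)",
                 [(1, 2), (1, 3), (1, 4), (2, 3)])


def ids(rows):
    return sorted(r[0] for r in rows)


def by_parent_sequence(item_id):
    """Descendants via a prefix match on the ancestor path."""
    (seq,) = conn.execute(
        "SELECT parent_sequence FROM item WHERE id = ?", (item_id,)).fetchone()
    prefix = f"{seq}{item_id}/"
    return ids(conn.execute(
        "SELECT id FROM item WHERE parent_sequence LIKE ?", (prefix + "%",)))


def by_mptt(item_id):
    """Descendants via a range check on the left/right numbers."""
    lft, rgt = conn.execute(
        "SELECT lft, rgt FROM item WHERE id = ?", (item_id,)).fetchone()
    return ids(conn.execute(
        "SELECT id FROM item WHERE lft > ? AND rgt < ?", (lft, rgt)))


def by_descendant_map(item_id):
    """Descendants via a plain equality lookup."""
    return ids(conn.execute(
        "SELECT descendant_id FROM descendant_map WHERE ancestor_id = ?",
        (item_id,)))


def mptt_add_leaf(new_id, parent_id):
    """Add a leaf in MPTT; returns how many row updates the renumbering needed."""
    (parent_rgt,) = conn.execute(
        "SELECT rgt FROM item WHERE id = ?", (parent_id,)).fetchone()
    updates = conn.execute(
        "UPDATE item SET rgt = rgt + 2 WHERE rgt >= ?", (parent_rgt,)).rowcount
    updates += conn.execute(
        "UPDATE item SET lft = lft + 2 WHERE lft > ?", (parent_rgt,)).rowcount
    conn.execute("INSERT INTO item VALUES (?, NULL, ?, ?)",
                 (new_id, parent_rgt, parent_rgt + 1))
    return updates


for node in (1, 2):
    print(node, by_parent_sequence(node), by_mptt(node), by_descendant_map(node))

updates = mptt_add_leaf(5, 3)
print("row updates for one MPTT insert:", updates)
```

All three lookups return the same descendants. The difference lies in the query shape (prefix LIKE, range comparison, equality) and in write cost: even in this four-item tree, adding one leaf under MPTT renumbers most of the existing rows.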
File operations
- Caching:
- Question the Gallery Data Cache: maybe it should be in the database
- Maybe we need to group more cached data together
- Maybe more caching is needed (finer/coarser grained?)
- Idea: Abstract cache class + cache store implementations (shared mem cache, db cache, disk cache, …)
- Modular design:
- Modularity at admin / configuration time shouldn't have much influence on normal runtime -> compile / generate everything that is active into the current runtime configuration / application files.
- No file including / loading of each module on each request
- No registration of event handlers / other callbacks on each request
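The "abstract cache class + cache store implementations" idea from the caching notes above could look roughly like the following. This is a sketch, not a proposal for the final API. The class names `CacheStore`, `MemoryStore`, `DbStore`, and `Cache`, the `cache` table, and the JSON serialization are all assumptions made for illustration. A shared-memory or disk store would implement the same three methods.

```python
import json
import sqlite3
from abc import ABC, abstractmethod


class CacheStore(ABC):
    """Storage backend interface; each store decides where entries live."""

    @abstractmethod
    def get(self, key):
        """Return the cached value, or None on a miss."""

    @abstractmethod
    def put(self, key, value):
        """Store a value under the key, replacing any previous entry."""

    @abstractmethod
    def delete(self, key):
        """Remove the entry if it exists."""


class MemoryStore(CacheStore):
    """Per-process store backed by a dict."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class DbStore(CacheStore):
    """Database-backed store; values are serialized as JSON text."""

    def __init__(self, conn):
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value TEXT)")

    def get(self, key):
        row = self._conn.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)).fetchone()
        return None if row is None else json.loads(row[0])

    def put(self, key, value):
        self._conn.execute(
            "INSERT OR REPLACE INTO cache VALUES (?, ?)", (key, json.dumps(value)))

    def delete(self, key):
        self._conn.execute("DELETE FROM cache WHERE key = ?", (key,))


class Cache:
    """Front end used by callers; it does not know which store is behind it."""

    def __init__(self, store):
        self._store = store

    def get_or_compute(self, key, compute):
        value = self._store.get(key)
        if value is None:  # a stored None is treated as a miss in this sketch
            value = compute()
            self._store.put(key, value)
        return value

    def invalidate(self, key):
        self._store.delete(key)


calls = []


def load_album():
    calls.append(1)  # stands in for an expensive query
    return {"title": "Holidays", "children": [2, 3]}


results = []
for store in (MemoryStore(), DbStore(sqlite3.connect(":memory:"))):
    cache = Cache(store)
    results.append(cache.get_or_compute("album:1", load_album))
    results.append(cache.get_or_compute("album:1", load_album))

print(len(calls))  # 2: one computation per store, the second lookup is a hit
```

Keeping the store behind one interface lets the disk, database, and shared-memory variants be benchmarked against each other without touching the calling code.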
- Common files are too large (GalleryUtilities, GalleryCoreApi, GallerySession, GalleryUrlGeneration, main.php, GalleryDataCache, GalleryStorage, adodb, smarty)
- See above (file operations / modular design)
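The "compile / generate into the runtime configuration" idea from the modular design notes could be sketched as follows: at configuration time, walk all active modules once and write their event listeners into a single generated file, so that a normal request only reads that one file. The module names, event names, handler names, and the JSON file format are hypothetical and chosen only for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical module registry as it might exist at admin / configuration time.
MODULES = {
    "comment": {"active": True,
                "listeners": {"Item::delete": ["comment.onItemDelete"]}},
    "rating": {"active": False,
               "listeners": {"Item::delete": ["rating.onItemDelete"]}},
    "search": {"active": True,
               "listeners": {"Item::delete": ["search.onItemDelete"],
                             "Item::save": ["search.onItemSave"]}},
}


def compile_runtime_config(modules, target: Path):
    """Configuration time: merge listeners of all active modules into one file."""
    table = {}
    for name in sorted(modules):
        module = modules[name]
        if not module["active"]:
            continue  # inactive modules cost nothing at request time
        for event, handlers in module["listeners"].items():
            table.setdefault(event, []).extend(handlers)
    target.write_text(json.dumps(table, indent=2, sort_keys=True))


def load_runtime_config(target: Path):
    """Request time: one file read instead of loading every module."""
    return json.loads(target.read_text())


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "runtime_config.json"
    compile_runtime_config(MODULES, target)
    runtime = load_runtime_config(target)

print(runtime["Item::delete"])
```

The generated file has to be rebuilt whenever a module is activated, deactivated, or upgraded, which moves the cost from every request to the comparatively rare configuration change.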