Gallery2:Modules:imagenotes - Gallery Codex

Gallery2:Modules:imagenotes

From Gallery Codex

Note: This project is temporarily on hold, as a new SoC project (2008) intends to add this functionality.


Imagenotes Module

This module (in development, not yet released) is intended to allow the selection of regions within your images for the purposes of annotation and tagging. Discussion of this module's development can be found in Facebook like image tagging.

Status

  • Continued design. More work to do here. When finished, back to the implementation.
  • Investigating extensibility.
  • Investigating non-rectangular (and complex 3d) image region selection (use SVG/VML with YUI abstraction?).
  • Investigating browser tab selection ordering and rendering layers. (YUI OverlayManager manages the z-order of Overlays, but this does not change the tab order as would be desired. OverlayManager needs to be modified to re-order the document.)
  • Client side rectangular region selection and editing (move, resize) complete.

Use cases

Simple text image tagging
  • A user wants to annotate an image by tagging the parts of the image that show the faces of his/her friends. When someone views the image the tags are drawn over the image, and hovering over a tag highlights the image region it represents. It is already known which tags the user has used before, which tags have been used in the album and which tags are already attached to the image, so a list of suggested tags is available to choose from. (The tag suggestion system can also hook into external systems to get relevant suggestions.)
  • Someone who doesn't have permission to annotate images recognises someone in a photograph and wants to suggest that an annotation is created. Just like creating an actual annotation, the user creates a 'suggestion' for an image region and tag annotation. A user with sufficient privileges will be able to view all pending suggestions (and possibly be notified via e-mail) and accept or reject each suggestion. In the meantime, the user that created the suggestion can modify it until it is accepted (how does this work with anonymous users?).
    If the suggestion is rejected the user who created it is somehow notified, maybe when they go to view their suggestions or maybe by e-mail.
  • Someone who doesn't have privilege notices that an image tag is incorrect. They suggest an amendment, which can be accepted or rejected by a user with sufficient privilege.
  • A user has some friends defined within an online social networking program and wants to tag their faces within G2 pictures, then have the social networking program (and their friends) informed about this.
  • A user wants to search for when a couple of their friends are in the same image together.
  • A user wants to search for images of a friend's face where the face takes up most of the image.
  • A user wants to search for pictures where none of their friends in the image are blinking.
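The suggestion use cases above imply a small life cycle: a suggestion stays editable by its author while pending, and a privileged user then accepts or rejects it. A minimal sketch of that life cycle follows. All names (`Suggestion`, `editSuggestion`, `moderateSuggestion`) are hypothetical and not part of the module; anonymous users and notification are left out because the page leaves them open.

```typescript
// Hypothetical model of the suggestion life cycle; none of these names exist in the module.
type SuggestionStatus = "pending" | "accepted" | "rejected";

interface Suggestion {
  id: number;
  authorId: string;
  tag: string;
  status: SuggestionStatus;
}

// The author may change a suggestion only while it is still pending.
function editSuggestion(s: Suggestion, userId: string, newTag: string): Suggestion {
  if (s.authorId !== userId) {
    throw new Error("only the author can edit a suggestion");
  }
  if (s.status !== "pending") {
    throw new Error("suggestion has already been moderated");
  }
  return { ...s, tag: newTag };
}

// A sufficiently privileged user accepts or rejects a pending suggestion.
function moderateSuggestion(s: Suggestion, canModerate: boolean, accept: boolean): Suggestion {
  if (!canModerate) {
    throw new Error("insufficient privilege to moderate");
  }
  if (s.status !== "pending") {
    throw new Error("suggestion has already been moderated");
  }
  return { ...s, status: accept ? "accepted" : "rejected" };
}

const pending: Suggestion = { id: 1, authorId: "alice", tag: "Bob", status: "pending" };
const edited = editSuggestion(pending, "alice", "Robert");
const accepted = moderateSuggestion(edited, true, true);
```

Once `accepted` has been moderated, a further call to `editSuggestion` on it throws, which matches the rule that a suggestion can only be modified until it is accepted.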
Objective image tagging
  • A user wants to specify properties of an object in the image, that may later be used for search purposes.
    This is an extension of the idea of applying a single tag. A tag defines an instance of a class, and that class in turn has a number of attributes that could be set for an image. For instance, if a person is tagged, that person could be looking in a certain direction, smiling and winking. That person may own certain clothes, and in the image they may be wearing some of those.
Notes about an image region
  • A user wants to annotate an image with notes about a specific region of the image. Notes are more detailed than simple tags and give descriptions of what lies within a certain region of the image. When the image is viewed a user can see that a region has been marked up, and on giving the region focus (mouse or keyboard) the note relevant to that region will be displayed. The user who creates the annotation can place the annotation and decide if the text is always shown or only when focus is given to the image region.
  • A user wants to search for images with certain text in the image note.
Drawing on/annotating an image
  • A user wants to spice up an image by adding some speech bubbles containing some text. The user can choose to have the speech bubble always displayed or triggered to display on certain events (such as mouseover a certain region). The speech bubble display in itself can trigger other events that react with other annotation objects (so a sequence of speech can be displayed).
  • A user wants to search for images where incredibly witty new things have been drawn on the image.
Image recognition
  • An image recognition program trawls through the image database and recognises some faces and objects. The program suggests what various regions of the images are to a user with sufficient privilege to view the suggestions. The privileged user can verify the image recognition program's analysis and notify the program whether its recognition is correct or false. The user can also add annotations based on any correct recognition and, when recognition was false, tell the program what the image area actually was.

Investigations

Investigating browser tab selection ordering and rendering layers
YUI OverlayManager manages the z-order of Overlays, but this does not change the browser tab order. The browser tab order relies on an element's position in the DOM. Making the tab order the same as the layer order would give a better feel.
OverlayManager needs to be modified to re-order the document rather than manipulate z-index. As well as benefiting tab ordering, this suits SVG (Scalable Vector Graphics), which does not handle z-ordering and so requires document ordering for elements to be rendered at the correct layer.
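As a rough illustration of re-ordering the document instead of manipulating z-index, the sketch below computes the order in which overlays would need to appear in the DOM so that document order, paint order and tab order all agree. It is pure list logic under assumed names (`OverlayInfo`, `documentOrder`, `bringToFront`); it does not call or reproduce the real YUI OverlayManager API, and actually moving the DOM nodes is left to the caller.

```typescript
// Hypothetical overlay record; `zIndex` stands in for the layer value tracked today.
interface OverlayInfo {
  id: string;
  zIndex: number;
}

// Return overlay ids in the order they should appear in the document:
// lowest layer first, so later siblings paint on top and tab order follows the layers.
function documentOrder(overlays: OverlayInfo[]): string[] {
  return overlays
    .map((overlay, index) => ({ overlay, index }))
    // Break zIndex ties by original position so the result is stable.
    .sort((a, b) => a.overlay.zIndex - b.overlay.zIndex || a.index - b.index)
    .map((entry) => entry.overlay.id);
}

// Bringing an overlay to the front becomes "move it to the end of the list".
function bringToFront(order: string[], id: string): string[] {
  return [...order.filter((other) => other !== id), id];
}

const order = documentOrder([
  { id: "note-a", zIndex: 3 },
  { id: "tag-b", zIndex: 1 },
  { id: "note-c", zIndex: 3 },
]);
const raised = bringToFront(order, "tag-b");
```

With this approach no z-index values need to be stored at all: the position in the list is the layer, which is also what an SVG renderer expects.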
Other systems implementing image region annotation
There are a number of external systems whose behaviours were mentioned in forums as being desirable for image region annotation, so an attempt should be made to implement the features of these and make it possible to recreate their exact functionality without too much difficulty for a G2 site administrator/themer. Investigation is here: Other systems investigations

Design

Overview

  • Keep extensible.
    It is hoped that this module will be extensible so that features not envisioned here can be included later and so that administrators can enable only the features they desire, minimising page load time, client side operation time and resource usage.

Concepts

Image regions
  • An image region is a specific area of an image that has some importance. This image region exists on all versions of the same image (i.e. on all derivatives of the source image). Note that a derivative image may have a different scale AND orientation to the original.
  • An image region is represented by a rectangular DIV or other 2d shape within the bounds of the image.
It should be possible to convert a rectangular DIV selection into a proper 2d shape.
  • All image regions, no matter what shape, will have a 2d bounding box, which will make collision detection faster.
  •  ?? An image region usually represents a complex 3 dimensional object in 2 dimensional space. Rather than just representing a 2d area, the image region may define the 3d properties of the object in the 2d space, such as the rotation of a face, head and torso. ?? Should this instead be an annotation/other association?
  •  ?? An image region should be able to provide a scalable view of that region for re-use in the page.
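The bounding box idea above can be sketched as follows: compute the smallest axis-aligned box around a shape's points, then use a box-against-box test as a cheap first check before any exact shape test. The type and function names (`Point`, `Box`, `boundingBox`, `boxesOverlap`) are illustrative assumptions, and co-ordinates are taken to be image pixels with the origin at the top left.

```typescript
interface Point {
  x: number;
  y: number;
}

interface Box {
  left: number;
  top: number;
  right: number;
  bottom: number;
}

// Smallest axis-aligned box that contains every point of a shape.
function boundingBox(points: Point[]): Box {
  if (points.length === 0) {
    throw new Error("a shape needs at least one point");
  }
  const xs = points.map((p) => p.x);
  const ys = points.map((p) => p.y);
  return {
    left: Math.min(...xs),
    top: Math.min(...ys),
    right: Math.max(...xs),
    bottom: Math.max(...ys),
  };
}

// Cheap rejection test: if the boxes do not overlap, the shapes cannot overlap either.
// Boxes that merely touch along an edge are treated as not overlapping.
function boxesOverlap(a: Box, b: Box): boolean {
  return a.left < b.right && b.left < a.right && a.top < b.bottom && b.top < a.bottom;
}

const triangle = boundingBox([{ x: 10, y: 10 }, { x: 50, y: 10 }, { x: 30, y: 40 }]);
const nearby: Box = { left: 45, top: 30, right: 80, bottom: 60 };
const distant: Box = { left: 100, top: 100, right: 120, bottom: 120 };
const mayCollide = boxesOverlap(triangle, nearby);
const cannotCollide = !boxesOverlap(triangle, distant);
```

A `true` result only says the shapes might collide; an exact test on the real shapes would still be needed for non-rectangular regions.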
Image region sets
  • An image region set is a grouping of related image regions. Because a 3d object may be represented by disjoint regions in 2d space, we need to be able to link separate image regions together. (This becomes important when, for instance, a person is being tagged and their arm goes behind someone else. We tag the image region set rather than each individual 2d image region)


Image region hierarchy... does it make sense to have an image region made up of image region parts, and a part can be a parent to another image region part? Image region always has a root, which is the bounding DIV and then 1 or more children? Does it make sense to do this?


Annotation
  • An annotation is something that is drawn in the browser (may not be visible due to style, but added into the DOM) and is associated with the image.
  •  ?? Does an image region need an annotation so that it can be displayed and interacted with? Or should we be able to attach style to an image region itself? If annotation required, need special circumstances to shape the annotation to the image region.
  • The annotation, unlike an image region, is not necessarily desired on all sizes of the same image.
  • An annotation may have a number of different style attributes depending on the derivative image (and Annotation set) it is being displayed with. Scale and rotation of the annotation may not be as simple as image regions, which are based purely on image co-ordinates.
e.g. A speech bubble annotation may not want to scale the same amount as the derivative image does.
  • An annotation may be, but is not necessarily associated with an image region.
i.e. Tags and notes are associated with an image region, but a quirky speech bubble isn't.
  • An annotation may be shown, hidden, or animated by some event.
  • An annotation can be displayed in various positions:
    • Absolutely positioned, relative to the image.
    • Absolutely positioned, relative to another annotation.
    • Absolutely positioned, relative to the cursor.
    • In the imagenotes block area?
    • In a special annotation region for the annotation type.
    • In other special block areas of the correct type?
 ??Annotation parts
  • An annotation may be made up of different parts and shapes and we may not want a database entry for every individual part. This may make the job of referencing different things more difficult however. And parts may want to be attached to different annotation regions. Examples:
    • Annotating a tag, we may want to display it at the cursor on mouse over, underneath the image (if a large group of people shown, a table of who's in the picture) and in the imagenotes block.
    • Quirky speech bubble animation contains multiple bubbles, animated together so when one bubble appears the other disappears. They together form a whole annotation, so may not want to be separated.
Annotation sets
  • We may have lots of data about an image and not want to display all the data at the same time. Also, we may want different behaviours for interacting with the image when it is being viewed.
  • An annotation belongs in one or more annotation sets.
  • Annotations may have different properties per annotation set.
  • Only one annotation set is viewable at a given time, otherwise different properties of an annotation being displayed may conflict. It should be possible to merge sets into one another, giving the opportunity to resolve conflicts.
  •  ?? Only one annotation set is rendered at a time (the others are not merely hidden, but don't exist in the DOM)?
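One way to picture merging annotation sets with conflict resolution is shown below. Each set maps an annotation ID to the properties that annotation has within that set; merging copies across anything new and reports properties where the two sets disagree, rather than silently picking a winner. The data shapes and names (`AnnotationSet`, `Conflict`, `mergeSets`) are assumptions made for this sketch only.

```typescript
// Hypothetical shapes: annotation id -> display properties within one set.
type AnnotationProps = Record<string, string>;
type AnnotationSet = Map<number, AnnotationProps>;

interface Conflict {
  annotationId: number;
  property: string;
  ours: string;
  theirs: string;
}

// Merge `source` into a copy of `target`. Values from `target` are kept when the
// sets disagree, and every disagreement is reported so it can be resolved later.
function mergeSets(
  target: AnnotationSet,
  source: AnnotationSet
): { merged: AnnotationSet; conflicts: Conflict[] } {
  const merged: AnnotationSet = new Map();
  target.forEach((props, id) => merged.set(id, { ...props }));
  const conflicts: Conflict[] = [];

  source.forEach((props, id) => {
    const existing = merged.get(id);
    if (existing === undefined) {
      merged.set(id, { ...props });
      return;
    }
    for (const [property, theirs] of Object.entries(props)) {
      const ours = existing[property];
      if (ours === undefined) {
        existing[property] = theirs;
      } else if (ours !== theirs) {
        conflicts.push({ annotationId: id, property, ours, theirs });
      }
    }
  });

  return { merged, conflicts };
}

const publicSet: AnnotationSet = new Map([[1, { visible: "always", colour: "red" }]]);
const familySet: AnnotationSet = new Map([
  [1, { visible: "hover", size: "large" }],
  [2, { visible: "hover" }],
]);
const result = mergeSets(publicSet, familySet);
```

Here annotation 1 gains the `size` property, annotation 2 is copied across, and the differing `visible` values are returned as a single conflict for the user to resolve.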
Image region associations
  • There may be reasons for creating associations with an image region that aren't visual markup. Not sure what yet though. :-)
Theming
  • As with the rest of G2, site owners may want to make imagenotes appear in different ways. Though CSS styles may control a lot, there are instances where non-simple elements are needed. For instance, a selected area may want the inside of the area transparent, whilst not making the border transparent.
Imagenotes regions
There are a number of potentially desirable regions for use with Imagenotes, all with different purposes.
  • Imagenotes block. When editing, all of the various image regions/annotations are shown here somehow. This allows you to interact with specific items potentially more easily than when editing over the image.
  • Image area. The area within which the image is rendered.
  • Sub image area. (Potentially another block?) Somewhere annotations such as who is in the image can be displayed.
  • Annotation set selection. When viewing the image, something to choose which annotation set is being displayed (possibly hidden when the user can only see one of the available sets).
  • By the cursor. This is a special region that moves everywhere the cursor does. Potentially some logic is needed here for keeping this area on screen when the cursor is near the browser/frame border. Also, don't want to call code for moving the region when there is nothing being displayed in it.
  • Toolbox. Buttons and/or labels for selection of viewing/editing modes and potentially all their sub toolkits.
  • Docking areas (provided by themes or blocks?). Areas where toolboxes can be dragged to and inserted into the flow of the page.
  • By an image region. When an image region is first created (or edited) and tags/notes are being applied, a convenient place to get the appropriate options is right next to the image region (or maybe just a modal dialog).
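The logic needed for the "by the cursor" region can be sketched as a small positioning function: offset the region from the cursor, clamp it so it stays inside the viewport, and skip the work entirely when the region has nothing to show. The names, the default 12-pixel offset and the plain-number viewport are assumptions for illustration; real code would read the sizes from the browser.

```typescript
interface BoxSize {
  width: number;
  height: number;
}

interface Position {
  left: number;
  top: number;
}

// Place a box next to the cursor and keep it fully inside the viewport.
function positionByCursor(
  cursor: Position,
  box: BoxSize,
  viewport: BoxSize,
  offset: number = 12
): Position {
  const clamp = (value: number, min: number, max: number): number =>
    Math.min(Math.max(value, min), max);
  return {
    left: clamp(cursor.left + offset, 0, Math.max(0, viewport.width - box.width)),
    top: clamp(cursor.top + offset, 0, Math.max(0, viewport.height - box.height)),
  };
}

// Avoid any positioning work when the cursor region is empty.
function updateCursorRegion(
  hasContent: boolean,
  cursor: Position,
  box: BoxSize,
  viewport: BoxSize
): Position | null {
  if (!hasContent) {
    return null;
  }
  return positionByCursor(cursor, box, viewport);
}

const viewportSize: BoxSize = { width: 800, height: 600 };
const tooltipSize: BoxSize = { width: 200, height: 50 };
const inMiddle = updateCursorRegion(true, { left: 100, top: 100 }, tooltipSize, viewportSize);
const nearCorner = updateCursorRegion(true, { left: 790, top: 590 }, tooltipSize, viewportSize);
const nothingShown = updateCursorRegion(false, { left: 100, top: 100 }, tooltipSize, viewportSize);
```

A mousemove handler would call `updateCursorRegion` and only touch the DOM when the result is not `null`.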
Manipulation
When adding/editing the following operations will probably be needed for both image regions and annotations.
  • Dragging
    • Constrained
  • Scaling
    • Constrained
    • Aspect ratio
    • Min and Max
  • Z-ordering (layering)
  • Grouping/ungrouping
    • Image region in multiple groups?
    • Set operations for shapes (Union, Intersection, Negation) Keep the operators?
  • Node moving (when not a simple rectangle, points of the shape can be moved individually)
    • Constrained. Does a shape stop a node being moved to certain places to maintain the shape?
  • Rotation
  • 3d manipulation
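Constrained dragging and constrained scaling (min/max and aspect ratio) from the list above could look roughly like this. It is a sketch with assumed names (`constrainDrag`, `constrainSize`, `ResizeLimits`); it works on plain numbers and is not tied to YUI DragDrop or any resize widget. When the aspect ratio is locked, height is derived from width and the height limits win, so width limits that are inconsistent with the ratio are not re-checked.

```typescript
interface Dimensions {
  width: number;
  height: number;
}

interface RegionRect extends Dimensions {
  left: number;
  top: number;
}

interface ResizeLimits {
  min: Dimensions;
  max: Dimensions;
  keepAspectRatio: boolean;
}

function clampValue(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

// Constrained dragging: keep the whole region inside the image bounds.
function constrainDrag(region: RegionRect, image: Dimensions): RegionRect {
  return {
    ...region,
    left: clampValue(region.left, 0, Math.max(0, image.width - region.width)),
    top: clampValue(region.top, 0, Math.max(0, image.height - region.height)),
  };
}

// Constrained scaling: apply min/max, then optionally restore the original aspect ratio.
function constrainSize(requested: Dimensions, original: Dimensions, limits: ResizeLimits): Dimensions {
  let width = clampValue(requested.width, limits.min.width, limits.max.width);
  let height = clampValue(requested.height, limits.min.height, limits.max.height);
  if (limits.keepAspectRatio) {
    const ratio = original.width / original.height;
    height = width / ratio;
    if (height > limits.max.height) {
      height = limits.max.height;
      width = height * ratio;
    } else if (height < limits.min.height) {
      height = limits.min.height;
      width = height * ratio;
    }
  }
  return { width, height };
}

const limits: ResizeLimits = {
  min: { width: 20, height: 20 },
  max: { width: 300, height: 100 },
  keepAspectRatio: true,
};
const locked = constrainSize({ width: 400, height: 10 }, { width: 100, height: 50 }, limits);
const free = constrainSize({ width: 400, height: 10 }, { width: 100, height: 50 }, { ...limits, keepAspectRatio: false });
const dragged = constrainDrag({ left: -5, top: 90, width: 40, height: 30 }, { width: 100, height: 100 });
```

The same two functions would apply to image regions and annotations alike, since both are treated here as simple rectangles.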
Modes
Certain modes (and sub-modes) will be in use when interacting with the image. The mode will determine the behaviour.
  • Viewing.
  • Editing:
    • Editing image regions.
    • Editing tags.
    • Editing notes.
    • Editing non-image region annotations (drawings?)
  • User suggestions (Can an image region be suggested without anything associated to it? Probably not):
    • Suggesting tags.
    • Suggesting notes.
    • Suggesting drawing.
  • Viewing suggestions? Viewing may look different to when you are editing suggestions.
  • Moderating (Suggestion response).
    • View shows who/what made the various suggestions. Can limit view to a specific suggester.

Client side

Goals

  • Make all actions responsive
    • Keep event handlers simple and quick.
    • Be wary of multiple super classes as call times may get out of hand.
    • Keep the item being displayed as simple as possible so there is less in the DOM. i.e. don't use templates for editing (with resize handles) when merely displaying.
    •  ? Destroy classes when changing between modes or not? Memory consumed by keeping, CPU consumed by recreating.
  • Minimise resource usage
  • Minimise page load time
  • Only load required features. Load features as required? (dynamically load JS then eval?)
  • Degrade gracefully?

Notes

  • This all really only works when JavaScript is enabled. It may be possible to render the markup server side to give at least some degradable functionality without it, but for now this is not being considered (it would require the theme to put things in the right place).
  • Image notes manager object handles all interactions with any given image (of course, within the scope of the imagenotes module) and is responsible for any page regions associated with the imagenotes (image, block area, toolkit)
  • The image notes manager will be instantiated when the imagenotes block is included on photo pages.
  • The image notes manager has to be pointed to the image it is managing. Currently a utility function searches out an <img> tag matching the item being displayed (this may not always work), with the image src URL passed in through the imagenotes block smarty template.
  • The image notes manager is responsible for putting the image notes into the normal flow of browser focus, so that editable regions and keyboard input work correctly when tabbing through the document.
  • The image notes manager is responsible for client/server communication, and plug-in modules will communicate through it. (Is this bad design? It may be more efficient for plugins to do their own thing, to save unnecessary code being loaded, but to structure the calls through an interface method.)
  • The image notes manager has a notion of 'mode'. The mode can be changed to allow editing of image regions, creation of annotations, and to generally change the view and behaviour of the imagenotes. An interface class will define the mode and other G2 modules will be able to add modes (collected through factory methods server side).
  • The image notes manager will have rubber band functionality (just within the image area?) that is available to modes. Initially intended only for dragging out an image region, it may be useful for selection.
  • Mode types: view, edit (/add/delete), suggest.
  • Themes can define placement of toolbox, dock areas, annotation areas.
  • A class will define an interface for a handler for imagenotes types. It handles a specific imagenote type. (e.g. for image tagging, the handler will look after the tag/image region relationship, even when an annotation doesn't exist)
  • A class will define an interface for imagenotes types. An imagenote type is something that may be rendered to the page? and will probably fit into the imagenotes manager mode system. Will have events for its selection, mouseover, etc.
  • Can create annotation areas that fit into page flow and are not absolute(?) Can we determine this with javascript? Does it have to be done per item being displayed? Definitely theme specific. Will it be easily upset by adding other blocks? Sounds too complicated! :-)
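The notion of 'mode' in the notes above, with an interface that other modules can implement to add modes, might be structured along these lines. This is only a sketch: `ImageNotesMode`, `ModeManager` and their methods are invented names, and the server-side factory collection of modes is not modelled.

```typescript
// Hypothetical interface that every mode, built-in or contributed, would implement.
interface ImageNotesMode {
  name: string;
  activate(): void;
  deactivate(): void;
}

// Hypothetical slice of the image notes manager that tracks the current mode.
class ModeManager {
  private modes = new Map<string, ImageNotesMode>();
  private current: ImageNotesMode | null = null;

  // Other modules add their modes here.
  register(mode: ImageNotesMode): void {
    if (this.modes.has(mode.name)) {
      throw new Error("mode already registered: " + mode.name);
    }
    this.modes.set(mode.name, mode);
  }

  // Switch modes, letting the old mode clean up before the new one starts.
  setMode(name: string): void {
    const next = this.modes.get(name);
    if (next === undefined) {
      throw new Error("unknown mode: " + name);
    }
    if (next === this.current) {
      return;
    }
    if (this.current !== null) {
      this.current.deactivate();
    }
    this.current = next;
    next.activate();
  }

  currentMode(): string | null {
    return this.current === null ? null : this.current.name;
  }
}

// Modes that only record what happened, to show the call order.
const log: string[] = [];
const makeMode = (name: string): ImageNotesMode => ({
  name,
  activate: () => { log.push("+" + name); },
  deactivate: () => { log.push("-" + name); },
});

const manager = new ModeManager();
manager.register(makeMode("view"));
manager.register(makeMode("edit"));
manager.setMode("view");
manager.setMode("edit");
```

After these calls `log` holds `+view`, `-view`, `+edit`, showing that the outgoing mode is always deactivated before the incoming one is activated. Whether a mode's objects are destroyed or kept on `deactivate` is the memory versus CPU question raised under Goals.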

Server side

Goals

  • Minimise page load time, whilst being extensible

Notes

  • Image region at client end should stick to co-ordinates on local image. Server side should convert co-ordinates to full size image co-ordinates for storage and to derivative size co-ordinates for client display.
  • Only include JavaScript in the page when necessary. Restrict it to the JS needed based on user permissions (i.e. edit code is not needed for someone that cannot edit).
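The co-ordinate conversion in the first note above comes down to scaling a rectangle between two image sizes. The sketch below shows the arithmetic in TypeScript for consistency with the other examples on this page, although the note places the conversion on the server. `scaleRect`, `toFullSize` and `toDerivative` are assumed names, and only scale is handled: a derivative with a different orientation would need an extra rotation step that is not shown.

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface ImageSize {
  width: number;
  height: number;
}

// Scale a rectangle from the co-ordinate space of one image size to another.
function scaleRect(rect: Rect, from: ImageSize, to: ImageSize): Rect {
  const scaleX = to.width / from.width;
  const scaleY = to.height / from.height;
  return {
    x: rect.x * scaleX,
    y: rect.y * scaleY,
    width: rect.width * scaleX,
    height: rect.height * scaleY,
  };
}

// Client co-ordinates on a derivative -> full size co-ordinates for storage.
function toFullSize(rect: Rect, derivative: ImageSize, full: ImageSize): Rect {
  return scaleRect(rect, derivative, full);
}

// Stored full size co-ordinates -> co-ordinates on the derivative being displayed.
function toDerivative(rect: Rect, full: ImageSize, derivative: ImageSize): Rect {
  return scaleRect(rect, full, derivative);
}

const fullImage: ImageSize = { width: 2560, height: 1920 };
const resized: ImageSize = { width: 640, height: 480 };
const drawn: Rect = { x: 64, y: 48, width: 160, height: 120 };
const stored = toFullSize(drawn, resized, fullImage);
const shownAgain = toDerivative(stored, fullImage, resized);
```

Storing only full size co-ordinates means a region drawn on one derivative can be displayed on any other by a single call to `toDerivative`.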

Database structure

Goals

  • Minimise page load time
  • Design so it's easy and quick to search for interesting things

Notes

  • Image region needs a unique ID, probably built from a key on the image it's attached to and its own unique ID for the particular image. Maybe image region ID should have a single unique key to simplify cross referencing from other tables. Image region needs layer information to ensure the desired z-ordering is applied on rendering.

Implementation

Client side

  • Created utility methods for creating instances of templates. That is, cloning a DOM node and applying IDs to the new node and its children.
  • Created TemplatedRubberBandRegion which allows a rubber band to be dragged over a specified region of the document and on mouseup a custom event is fired with the selected region
  • Created BoxModelRegionOverlay which extends the YUI Overlay widget and allows placement of an overlay based on the margin, border, padding or content parts of the box model. This is so that image region overlays can be positioned and sized by the content area.
  • Created TemplatedResizeOverlay which extends the BoxModelRegionOverlay widget and adds YUI DragDrop functionality and the ability to resize the region. The code searches a template passed in client side for certain classes to turn into resize and drag handles. Custom events are fired at the start and end of drag and resize operations.
  • Created an initial image region manager that sets up a TemplatedRubberBandRegion and listens for area selected events, then creates a TemplatedResizeOverlay. Multiple regions can be created, but only a single region can be focused at a time.
  • The image region manager ensures that the image regions fit into the page focus flow.
  • Created utility function for finding an <img> tag in the page with a specific src attribute to allow the image region manager to be attached.
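To give a feel for the template utilities listed above, the sketch below clones a template tree, gives every node that has an ID a new unique ID, and then searches the clone for nodes carrying a given class (as would be done to find resize and drag handles). It is not the module's code: a plain object tree (`TemplateNode`) stands in for DOM nodes so the example runs without a browser, and the function names and the ID suffix scheme are assumptions.

```typescript
// Minimal stand-in for a DOM element, so the sketch runs without a browser.
interface TemplateNode {
  tag: string;
  id: string | null;
  className: string;
  children: TemplateNode[];
}

// Deep-clone a template and append a suffix to every id so the clone's ids stay unique.
function instantiateTemplate(template: TemplateNode, suffix: string): TemplateNode {
  return {
    tag: template.tag,
    id: template.id === null ? null : template.id + "-" + suffix,
    className: template.className,
    children: template.children.map((child) => instantiateTemplate(child, suffix)),
  };
}

// Collect every node in the tree whose class list contains `className`.
function findByClass(node: TemplateNode, className: string): TemplateNode[] {
  const classes = node.className.split(/\s+/);
  const own = classes.indexOf(className) !== -1 ? [node] : [];
  return node.children.reduce(
    (found, child) => found.concat(findByClass(child, className)),
    own
  );
}

const regionTemplate: TemplateNode = {
  tag: "div",
  id: "region",
  className: "region",
  children: [
    { tag: "div", id: "region-handle-se", className: "handle resize-handle", children: [] },
    { tag: "span", id: null, className: "label", children: [] },
  ],
};

const instance = instantiateTemplate(regionTemplate, "7");
const handles = findByClass(instance, "resize-handle");
```

In the browser the same idea would use `cloneNode(true)` on the template element and then walk the clone to rewrite its `id` attributes.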

Server side

  • Block created for addition to item view page
  • No database structure created yet
  • No server endpoint for AJAX communication created yet


Hacks, or areas otherwise needing improvement

...


Reference

G2 Feature requests

Forum/Codex topics

Because image tagging is one of the goals, general tag discussions are also of interest:

Because within an image, not only do you have a viewing location, you can see various places:

External systems

There are a number of existing implementations of image tagging/annotation. These are some:

  • Fotonotes Adding of notes with titles to variable size regions of an image.
  • Flickr Adding of notes to variable size regions of an image (inspiration credited to Fotonotes).
  • Facebook Adding of tags (usually friends of the user) to fixed sized regions of an image.