Gallery2:Design Documents:Input Validation Framework - Gallery Codex
Personal tools

Gallery2:Design Documents:Input Validation Framework

From Gallery Codex

Input Validation Framework Design Document

Objective

Currently, Gallery2's input validation framework consists of GalleryUtilities::getRequestVariables() and GalleryUtilities::sanitizeInputValues(). There is a requirement for a better input validation framework to enhance security and robustness with the following goals:

  • enforce type-safety,
  • enforce stricter filtering,
  • policy of accept nothing (the controller / view must specify specify the name and the types parameters it accepts),
  • the implementation of the new framework should be backwards compatible not requiring a major API bump,
    • No, just the change has to be backwards-compatible, not the new framework / API methods. That is, we can keep the old methods around and remove them on the next major API bump. --Valiant 10:46, 1 July 2007 (PDT)
  • a code audit would be developed to insure that all official code conforms with the new policy,
    • That's not a top priority, just a nice to have. --Valiant 10:48, 1 July 2007 (PDT)
  • it must be able to handle data structures (i.e. arrays of parameters),
  • minimal additional code footprint, and
  • should be easy to implement.

Official modules and new plug-ins should use the input validation framework, while existing 3rd party plug-ins can continue to use the current methods until the next major API bump. The following is a minimal list of that should be implemented:

  • boolean: supports true or false values,
  • string: a sanitized string with the CR/LF removed.
  • text: a sanitized string.
  • integer / id: (id is an integer with a value > 0)
  • checkbox: validates the value as a 'checkbox' value and transforms it to boolean

The following options are needed:

  • whether the parameter is optional (we return null if it's not present, if present, we cast to the specified type)
    • Is there a "not" missing in the above sentence? --Valiant 10:51, 1 July 2007 (PDT)
  • whether to consider POST vars only (all our controller / AJAX code should be done as POSTs, without exceptions. Makes certain attacks harder)
  • whether to not apply htmlsafe (default to htmlsafe)

Approach Alternatives

Two (2) alternatives for the new input validation subsystem were evaluated:

  1. Instead of using the given $form variable, views & controllers must fetch each variable separately and specify the expected type etc. Existing 3rd party plug-ins could still use $form until the next major API bump.
  2. Add a method to each controller / view that specifies what request parameters (name, type, ...) it expects. When populating $form, we check if the controller / view implements that method and use it if defined or fall back to our old behavior if necessary.

Initial prototypes have shown that the second approach provides increased transparency and less impact on existing code. Conversion of existing code to the second approach is significantly easier and less intrusive.

--Valiant 10:57, 1 July 2007 (PDT) : What about error handling in case of approach #2? Say you declare in your "expectedParameters() method some required and some optional params. How should it error out if a required param isn't there? I guess we can't error out there, we have to let the controller handle this case since it probably wants to fail gracefully (e.g. missing / empty param from input form).

What I had thought to do is part of the return for the validation routine would an error structure, that mirrored the structure of the input parameter list. If an entry failed a filter, the error code/message would be stored in the corresponding entry. The validation routine would return a flag or a $ret to indicate that there was failures. The caller would the would look through this for detail information. Then then caller could use a new core api method, formatFilterMessages to format an error message.

This leads to some redundancy. First you declare a list of expected params. Then you have to check if the required params are actually there in the normal handleRequest logic.

Implementation

  1. Modify the GalleryController.class and GalleryViewController.class to contain a new method getRequestFilters. This method will return an array that contains 2 elements:
    1. The first element is an array of flags that indicates which of the PHP request arrays ($_POST, $_FILES, or $_GET) are to be used and in what order to resolve the requested variables from.
    2. The second array that contains the fields and edits to apply.

filters = array('arrays' => array(GALLERY_FETCH_POST, GALLERY_FETCH_GET), 'fields' => array('form' => array('field1' => 'integer:0:10',

'field2' => 'checkbox')));
--Valiant 11:04, 1 July 2007 (PDT) : Some forms have a list / array of an unknown number of input fields. e.g. in handleRequest, we check if isset($form[$i][$j][fileName]).
Would that work with this approach? BTW: If we go with this approach, we have to find a more intuitive name for getRequestDescriptor. :)
The output from the validation routine would match the structure of what was returned from getRequestFilters. Using the above example and assuming the request fields contained the following information field1 = 15 and field2 = 'FIELD2', the return would array would be:

form = array('form' => array('field1' => array('value' => null, 'error' => array(errorCode, 15, 10)),

'field2' => true)));
The caller could call formatFilterMessages with the error information above to produce a locally translated message 'The value 15 exceeds the maximum value of 10.'.
  1. Consolidate all the request handling routines from the GalleryUtilities.class into two new helper classes, GalleryRequestVariableHelper_medium and GalleryRequestVariableHelper_simple. GalleryRequestVariableHelper_medium provides the variable handling routines that are typically used on every request and GalleryRequestVariableHelper_simple will contain the methods that are not used on every request. The methods in the GalleryUtilities class will not be removed until the next major Api bump, but will be converted to calling the new methods in the GalleryCoreApi to reduce the number of lines of code to be loaded on each request.
--Valiant 11:04, 1 July 2007 (PDT) : I think you got it backwards. Simple is for all requests, medium for rare requests, advanced for even rarer operations.
  1. Expose the following methods through the GalleryCoreApi class:
    getValidatedForm
    Given the view or controller class, return all the requested request variables.
    getRequestVariables
    Return the requested variable
    putRequestVariable
    Put a request variable into the specified request array. It will default to $_POST, currently all put request use $_GET
    hasRequestVariable
    Check to see if a variable is in the request.
    removeRequestVariable
    Remove a request variable from the specified request array.
    unsanitizeInputValues
    Undo preprocessing from sanitizeInputValues.
    sanitizeInputValues
    Convert UTF-8 values to Unicode and remove any dangerous input that could affect Gallery security.
  2. The following methods have no equivalent in the new framework and will be deprecated. These functions will be converted to call the new routines until they can be removed at the next major API bump.
    getFile
    Convert to call GalleryCoreAPI::getRequestVariables() with the upload validation
--Valiant 11:09, 1 July 2007 (PDT) : You didn't mention the upload validation rule in the above list.
  1. getFormVariables
    Change to build a default descriptor and then call GalleryCoreAPI::getRequestVariables()
    array_merge_replace
    Won't change, but was only used by getFormVariables() and can be deleted after the next API bump.
    getRequestVariables
    Change to build a default descriptor and then call GalleryCoreAPI::getRequestVariables()
    getAllRequestVariables
    Remove the method. If controllers and views are expected to identify the form variables then a method to get all request variable doesn't make sense.
--Valiant 11:09, 1 July 2007 (PDT) : getAllRequestVariables = Remove a request variable?
  1. getRequestVariablesNoPrefix
    Change to build a default descriptor and then call GalleryCoreAPI::getRequestVariables() and specify a empty string as a prefix.
    putRequestVariable
    Change to call the GalleryCoreAPI::putRequestVariable
    _internalPutRequestVariable
    Private method and can be removed to save lines of code.
    hasRequestVariable
    Change to call the GalleryCoreAPI::hasRequestVariable
    removeRequestVariable
    Change to call the GalleryCoreAPI::removeRequestVariable
    _internalRemoveRequestVariable
    Private method and can be removed to save lines of code.
    _getRequestVariables
    Private method and can be removed to save lines of code.
    _internalGetRequestVariable
    Private method and can be removed to save lines of code.
    GalleryUtilities::isValidEmailString
    Replaced with specifying an email filter.
  2. The handling of the Gallery Form Prefix continue to be done as it is currently done, except that the public methods will have an optional $prefix parameter, which will default to GALLERY_FORM_VARIABLE_PREFIX instead of the true.

Planned Filters

Filters refers to the available edits that can be applied against an input field. Not all filtersare loaded for every request. Filters can be chained together to provide multiple edits. For example, a field validation could be:
optional:0;integer
which would specify that if no value was present, then set a default to zero and convert it to an integer.
--Valiant 11:11, 1 July 2007 (PDT) : Why again didn't we like the term "filter"? In the context you describe here, filter fits the description perfectly.

Loaded for Every Request

The following filters are loaded for each request:

integer integer:min:max

Converts to an integer and insures that it is in the range min to max.

id: id

Is really just a an integer validation with the following validation 'integer:0'.

string: string:min:max

Same as text but also strips CR/LF from the input string.

text: text:min:max

Validates for min and max length and then calls sanitizeInputValues().

Additional Filters(loaded as required)

The following Filters are loaded as required:

boolean: bool

Validates the value as boolean. If the value is an empty string, 0, 'false', the return value is false. If it is 1 or 'true' it return true.

checkbox: checkbox

Validates the value as a 'checkbox' and transforms it to a boolean.

email: email

Validates an email based on the following pattern:

^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$
enum: enum:value1:value2:value3

Checks the value against a list of possible values.

file: file:[exists]

Validates the value as a file name and optionally checks to ensure that the file exists.

optional: value]

Provide a default value, if the request variable doesn't exist. No parameter will set it to null, empty will set it to a empty string, or value will set it to that value.

preprocess: pre:[options]

Allows a value to be preprocessed. Multiple options can be specified and are processed from left to right. options can be any of:

trim trim the spaces left and right
upper make the alphabetic characters upper-case
lower make the alphabetic characters lower-case
alpha strip out all but alphabetical characters (a-z, A-Z)
alnum strip out all but alphanumeric characters (a-z, A-Z, 0-9)
num strip out all but numeric characters (0-9)
left:bytes truncate the input string to a length of bytes
regex: regex:expression

Validates the value is given by the regular expression.

upload: upload

Upload is used for uploaded files, it will normalize the slashes in the file name to forward slashes. This avoids any problems with calls to removeSlashes or addSlashes.

--Valiant 11:13, 1 July 2007 (PDT) : @ email: please add GalleryUtilities::isValidEmailString() to the old -> new method mapping.
--Valiant 11:17, 1 July 2007 (PDT) : How would "upload" be used? How is it different from "file"? Could they be combined?
--Valiant 11:17, 1 July 2007 (PDT) : @Preprocess: Do we need that? trim is probably a good idea. Not sure if we need the others.
--Valiant 11:17, 1 July 2007 (PDT) : @regex / enum: Do we need that? If it's not heavy (code footprint), I guess it makes sense to include it for completeness. Else I'm not sure.

Not Implemented

The following are not implemented, but could be if there was interest: date, currency, or float.

Question: Should the filter routine return a meaning error message or just an indicator that the edit failed. Currently, most form handling sets a flag in the templateData structure and the template displays the error message if that flag is set. Some form handling expects an error message. Providing an error message provides more flexibility, but adds to the overall line count.

--Valiant 11:21, 1 July 2007 (PDT) : I think that usually, we want to have a very specific error message. The error messages generated the generic validation code can only be quite... generic. On the other hand, if we could have a generic form building API, form validation API and form processing API (super high level building of new views/controller), it would make sense to have generic form error messages.
But we're not there. At this point I'd argue that the validation methods should return an error code / status or so, but not an error message. It can use gallery->debug() or the message param of GalleryStatus for unlocalized details.

Phased development

Ideally the input validation framework implementation breaks down into the following steps:

  1. Creating the validation framework and converting the GalleryUtilities methods to call the new GalleryCoreApi methods.
  2. Changing the Gallery2 infrastructure to use the new methods (main.php, init.inc, embed.php, GalleryController, GallerySession, etc.)
  3. The core module, (modules/core/*.inc)
  4. The other official modules
  5. The official themes.