Gallery3:Modules:calendar import - Gallery Codex

Calendar Import

Allows you to configure an import folder on your server that users can use as a "dumping ground" for their items.

Description

The Calendar Import module is a hybrid of a few other modules (calendarview, dupcheck and folder sync).

Calendar Import automatically imports items from the "dumping ground" and sorts them into a calendar-like structure based on the EXIF data from the items. It can also detect duplicate files and take appropriate action on those files.

The module depends on the EXIF module to function and is also aware of the keeporiginal module (for duplicate detection).

Features

  • Option to import into either the root gallery album or one of the top-level albums (allowing for specific permissions to be inherited)
  • Items unable to be sorted are imported into a specified top-level album
  • The import path ("dumping ground") could be accessed by something like a samba share, ftp, or some other third-party web uploader
  • Users don't necessarily need to use or log into gallery
  • Import folder can contain individual folders for users (and will be imported as "owned" by that user)
  • two styles of calendar structure/sorting supported (Year/Month and Decade/Year)
  • log of operations and pending imports allows you to see what the module is doing
  • Skip or Delete duplicate items (based on both MD5 and SHA1 checksums)
  • Resource limiting (of sorts) by setting the maximum number of operations cron can perform in one call

Automatic imports are handled by a cron job (just like in the folder sync module). The cron job needs to be called regularly, but the frequency will depend on how busy your gallery is. The cron job creates a lock file in the server's tmp folder and will automatically exit if a job is already running, so while it may not be wise, it is possible to call the job every minute.
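The "exit if a job is already running" behaviour can be illustrated with a small shell sketch. This is not the module's code (the real locking is done inside cron.php); run_exclusive is a hypothetical helper, and it assumes the flock command from util-linux is installed, as it normally is on Debian.

```shell
#!/usr/bin/env bash
# Illustration only: the real locking lives in the module's cron.php.
# run_exclusive is a hypothetical helper; flock(1) comes from util-linux.

# Run a command only if nobody else currently holds the lock file locked.
# Prints "busy" and returns immediately when the lock is already taken.
run_exclusive() {
    local lock_file=$1
    shift
    (
        # Try a non-blocking exclusive lock on file descriptor 9.
        if ! flock -n 9; then
            echo "busy"
            exit 0
        fi
        "$@"
    ) 9>>"$lock_file"
}

# Example:
#   run_exclusive /tmp/example.lock php /var/www/gallery3/modules/calendarimport/cron.php
```

Because the lock is released as soon as the command finishes, a job scheduled every minute simply backs off while an earlier run is still busy.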

Screen shots

None available.

Installation

Installation is a little involved but reasonably straightforward. The module may be downloaded as a zip file from the Gallery forums.

The following steps are recommended for installation.

  1. Download the archive file and unpack the "calendarimport" folder into your Gallery 3 modules folder.
  2. Log into your server's terminal and configure a cron job to call the ./calendarimport/cron.php file (see the notes below).
  3. While in the terminal, identify/create a folder (accessible to the www-data user) where files will be imported from (eg. /var/www/gallery3/var/uploads).
  4. Optional - inside the identified/created import folder, create subfolders for each user, using the username as the subfolder name (eg. /var/www/gallery3/var/uploads/bob, /var/www/gallery3/var/uploads/bharat, /var/www/gallery3/var/uploads/siburny, /var/www/gallery3/var/uploads/jnash, /var/www/gallery3/var/uploads/rwatcher).
  5. While still in the terminal, identify/create another folder (accessible to the www-data user) where error files will be moved to (eg. /var/www/gallery3/var/uploads/errors).
  6. Log into your Gallery as an administrator.
  7. At the root album, create or identify an album that you will use as the destination location for imported items.
  8. Go back to the root album again and create or identify another album that you will use as the destination for unsorted items.
  9. Go into the Admin -> Modules menu.
  10. Ensure the EXIF module is activated (install and activate it if it isn't).
  11. Activate the Calendar Import module (don't worry, nothing will happen yet if this is the first time you have installed Calendar Import).
  12. Go into the Admin -> Settings -> Calendar Import menu and configure the settings (see the notes below).
  13. When everything looks ready to begin, enable the setting "Cron jobs can perform imports".
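
The folder-related steps (3 to 5) could be scripted along the following lines. This is only a sketch: make_import_tree is a hypothetical helper, the paths in the example are the ones used in the steps above, and ownership still has to be handed to www-data afterwards.

```shell
#!/usr/bin/env bash
# Sketch: create the import folder, the error folder and optional
# per-user subfolders. make_import_tree is a hypothetical helper and
# is not part of the module.

make_import_tree() {
    local import_dir=$1
    shift
    # Import folder plus the "errors" folder from the example paths.
    mkdir -p "$import_dir" "$import_dir/errors"
    # One subfolder per Gallery username given as an extra argument.
    local user
    for user in "$@"; do
        mkdir -p "$import_dir/$user"
    done
}

# Example (run with enough rights to write under /var/www/gallery3/var):
#   make_import_tree /var/www/gallery3/var/uploads bob bharat
#   sudo chown -R www-data:www-data /var/www/gallery3/var/uploads
```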

Configuring cron for www-data

These notes are for a Debian-based system. The cron job needs to be run as the www-data user. In the terminal:

sudo su www-data
crontab -e

Then add a line to the crontab file, something like the one below (which in this case runs the job every minute):

*/1 * * * * php /var/www/gallery3/modules/calendarimport/cron.php

You can then check your Calendar Import settings and you should notice your new cron job listed just above the configurable settings.
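
If you prefer to confirm the entry from the terminal as well, a check along these lines may help. It is a sketch only: has_import_job is a hypothetical helper that reads crontab text on standard input and looks for an uncommented line mentioning calendarimport/cron.php.

```shell
#!/usr/bin/env bash
# Sketch: has_import_job is a hypothetical helper, not part of the module.
# It succeeds when the crontab text on stdin contains an active
# (uncommented) line that calls calendarimport/cron.php.

has_import_job() {
    # Drop comment lines first, then search for the cron.php path.
    grep -v '^[[:space:]]*#' | grep -q 'calendarimport/cron\.php'
}

# Example:
#   sudo crontab -u www-data -l | has_import_job && echo "job found"
```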

Configuring the Calendar Import settings

This section explains what each Calendar Import setting is for.

Cron jobs can perform imports: the cron job calls the Calendar Import module, but this setting enables or disables Calendar Import from actually performing any significant operations. This allows you to activate the module but prevent it from performing actions until you are satisfied with the rest of the settings. Basically if this setting is disabled, cron calls the module but the module does nothing.

Base album to create new albums from: this drop down list includes the Root album and all the other top-level albums under the root. The album you select here will dictate where Calendar Import starts creating its sub-albums from. So if you set this as "Root album", then the module might for example create a "/2013/November" album. However if you have an album called "Private" right under the root album, then the module might create a "/Private/2013/November" album.

Album to place images (that couldn't be sorted) into: this drop down list includes the Root album and all the other top-level albums under the root album. When Calendar Import calls an EXIF module function and the file contains no EXIF data, the file is imported into this album instead. The user can then move items from this album into a more suitable album.

Specify the threshold year: the module supports two styles of calendar structure. "All Yearly" is a "Year/Month" structure while "All Decadal" is a "Decade/Year" structure. The threshold year allows you to select a little bit of both methods. Items dated before the threshold year will be moved into the Decadal structure, while those dated on or after the threshold year will be moved into the Yearly structure. Note that EXIF was released in 1995, so if EXIF dates are found before 1995, then these have probably been entered manually or generated by some automated process. Maybe somebody is scanning the family negatives from 50 years ago and is manually entering their best guess of the year each negative was taken. This setting allows you to imply that the dates before the threshold may be less accurate by showing this as a change in the gallery structure.
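
The way the base album and the threshold year combine can be sketched as below. The function name album_path and the "1960s" style of decade label are assumptions made for illustration; the album names the module actually creates may differ.

```shell
#!/usr/bin/env bash
# Sketch of the threshold rule. album_path and the "1960s"-style decade
# label are assumptions; the module's real album names may differ.

# Usage: album_path BASE YEAR MONTH THRESHOLD
# BASE is "" for the root album, or e.g. "/Private".
album_path() {
    local base=$1 year=$2 threshold=$4
    local month=$(( 10#$3 ))   # force base 10 so "08" is not read as octal
    local months=(January February March April May June July
                  August September October November December)
    if (( year < threshold )); then
        # Before the threshold: Decade/Year structure.
        local decade=$(( year / 10 * 10 ))
        echo "$base/${decade}s/$year"
    else
        # On or after the threshold: Year/Month structure.
        echo "$base/$year/${months[month - 1]}"
    fi
}

# Example:
#   album_path "/Private" 2013 11 1995
```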

Path to import images from: this is simply the server path (just like in the serveradd module) that Calendar Import will import files from. Imported files from this location will be "owned" by the Admin. However you can also create subfolders for each username in your gallery under this path and the module will use that user as the "owner" of the items.

'Error' path to place images (that couldn't even be imported) into: if the module can't import a file for some reason, then this is where the file will ultimately be moved to. If the module is set to Skip duplicate items, then they will be ignored for 24 cron calls and will then be moved into this Error path.

Number of log entries to keep: the module keeps a log of its operations which can be useful for tracking down problems. This sets how many log items should be kept before older items are overwritten. Available options are Nothing (no log), 10, 100, 1000, 10000 or Everything. The 10000 and Everything options are not recommended since they may cause a timeout due to the number of entries being displayed. If you choose Everything, the log needs to be manually cleared.

Action to take when we are about to import a duplicate item: You can Skip or Delete the import of duplicate files into the gallery. If you choose to Skip, then the file will be ignored for 24 cron calls and it will then be set as a failed import (where it will be moved into the "Error" path). The purpose of this is to allow an active user a chance to go and find out why the file is a duplicate. If you choose to Delete, then the file will immediately be deleted from the import folder. Duplicate detection is performed on all images and movies in the gallery and is based on both MD5 and SHA1 checksums so there is realistically no chance of a unique file being deleted. However if you need peace of mind, then the Skip option exists.
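
To see why requiring both checksums to match makes a false positive so unlikely, the comparison can be sketched in shell. This is not the module's implementation: file_digests and is_duplicate are hypothetical helpers, and md5sum and sha1sum are assumed to be available (GNU coreutils).

```shell
#!/usr/bin/env bash
# Sketch: file_digests and is_duplicate are hypothetical helpers.
# md5sum and sha1sum are assumed to come from GNU coreutils.

# Print "<md5> <sha1>" for one file.
file_digests() {
    local md5 sha1
    md5=$(md5sum < "$1" | cut -d' ' -f1)
    sha1=$(sha1sum < "$1" | cut -d' ' -f1)
    echo "$md5 $sha1"
}

# Succeed only when BOTH digests of the two files match.
is_duplicate() {
    [ "$(file_digests "$1")" = "$(file_digests "$2")" ]
}

# Example:
#   is_duplicate incoming.jpg existing.jpg && echo "duplicate"
```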

Maximum number of operations to be performed during each call by cron: This setting allows you to perform some resource limiting. The module will limit itself to performing this number of operations on every cron call. The module loosely defines an operation as some sort of file movement or analysis. So for example if you are using a shared host and you have your cron job set to run every minute, then you can limit the module to performing perhaps only 1 operation per minute. Keep in mind that an import into the gallery may itself involve a lot of processing (such as thumbnailing, rotating or any other operations by other modules) so this is by no means foolproof. Obviously if your cron job is scheduled too regularly and this setting is too high, then the host could be constantly performing operations. Note that the module also does a fair amount of file and directory monitoring/checking (which it does not count as an operation), so if you have thousands of files waiting for import, then this itself is going to cause a lot of host operations. Therefore you need to try and balance how many files are expected to be located with how many you should process each time. The best method to limit resource use is with an appropriately scheduled cron job in tandem with this setting.
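
The idea of an operation budget per cron call can be pictured with the sketch below. It is a simplification under stated assumptions: process_batch is a hypothetical helper, and it counts one file handled as one operation, whereas the module's own definition of an operation is looser.

```shell
#!/usr/bin/env bash
# Sketch: process_batch is a hypothetical helper. It treats "one file
# handled" as one operation, which is a simplification.

# Usage: process_batch IMPORT_DIR MAX_OPS
process_batch() {
    local import_dir=$1 max_ops=$2
    local done_ops=0 file
    for file in "$import_dir"/*; do
        [ -f "$file" ] || continue             # skip folders and empty globs
        (( done_ops >= max_ops )) && break     # budget for this call is spent
        echo "processing $(basename "$file")"
        done_ops=$(( done_ops + 1 ))
    done
}

# Example: handle at most one file per call.
#   process_batch /var/www/gallery3/var/uploads 1
```

Files left over are simply picked up by a later call, which is why the cron schedule and this limit need to be tuned together.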

Deactivating and Changing Settings

There is not a lot of error checking in the module, so it is possible to cause problems by changing settings while the module is performing operations. It is also recommended that the module be treated as set and forget: try not to tinker with the settings once they are correct.

The recommended procedure for deactivating the module is:

  1. Disable the option "Cron jobs can perform imports".
  2. Make sure the module is not currently performing any operations (check the log).
  3. Deactivate the module.

Note that deactivating the module also drops all of its discoveries and checksum information. So if you then activate it again, all this information will have to be recreated (which could take some time).

The recommended procedure for changing significant settings (ie. locations, threshold and albums) is:

  1. Disable the option "Cron jobs can perform imports".
  2. Make sure the module is not currently performing any operations (check the log).
  3. Change the desired module settings and any other server/environment configuration that might be required (eg. if you are changing import locations).
  4. Optional - you might need to forget any discoveries the module has made if you have moved your import location (see Troubleshooting).
  5. Enable the option "Cron jobs can perform imports".

Hidden Settings

There are two hidden settings in the Admin -> Settings -> Advanced menu.

drop_log_on_deactivate: If you deactivate the module, all discoveries and checksums are dropped from the database. If you set this option, then it will also drop the log as well. This is as close as you will get to a clean deactivate.

selectable_decade: If you need to select a threshold year before 1970, this is the option you can change. Make sure you use a year divisible by 10 and less than whatever you currently have set for your threshold year.

Known Issues

If the autorotate module is activated, images are not rotated unless the autorotate code is patched. It is unknown whether this is a problem with Calendar Import as such or the library that autorotate uses. To perform the patch go into your Gallery 3 modules folder and edit the file at ./autorotate/lib/pel/PelEntryNumber.php. Locate the setValue function and you should find the parameter for the function is commented out. Re-instate the parameter by modifying the function declaration so that it looks like the below:

function setValue($value) {
    // Collect every argument passed to the call into an array.
    $value = func_get_args();
    $this->setValueArray($value);
}

Localisation and language support is poor. Most critical aspects of the module make some provision for translation, however there is none at all for the logs generated by Calendar Import.

Troubleshooting

There are a few options available to find out what the module is doing. The module keeps information that you can view in the Admin -> Statistics menu.

Calendar Import log: The log shows you in detail what operations it has been performing. It lists a log ID, level (ie. severity), time of entry, description and the calling function (in case you really need to trace through the code). If your cron call is operating correctly, then you should at least see a "Cron job has been involked" log entry here. If you have set your log to log everything, you can also clear the log with the button at the bottom of this page.

Calendar Import pending: This will show you a list of the items that the module has discovered. The top part displays all the files the module is keeping track of, where they were found, when they were last operated on and what the module is going to do with them. The bottom part shows all of the folders that it is checking for new files and the user who owns each folder. There is also a button at the bottom of this page that tells the module to forget everything it has discovered (it will then have to rediscover everything). This could be helpful when tracking down problems or where the pending items don't reconcile with what is actually in the import locations (which shouldn't happen).

There are a few things you can check if cron doesn't seem to start the job at all. First you should verify that your cron job has been created and is correct. If Calendar Import can detect your cron job, it is listed in the Calendar Import settings page. Otherwise a message is displayed asking you to check your cron file. If the job is listed, then it is likely that the problem is to do with Calendar Import, in which case you should read onward.

Just like the folder sync module, the cron.php file creates a lock file that it uses to track whether the module is operating. The module attempts to create the file in the server's tmp folder (usually /tmp/calendarimport.lock), so check to make sure the file exists and create it if necessary. If it does exist, check to make sure that it is owned by the www-data user (sudo chown www-data:www-data /tmp/calendarimport.lock).

If this doesn't help, then it might be time to try and call the cron.php file manually and hope that it outputs error messages. Try something like the below:

sudo su www-data
cd /var/www/gallery3/modules/calendarimport
php cron.php

This will manually start the operation and you can check the Calendar Import log to see what is happening. If there is nothing wrong, then after the module has completed its operations it will return to the terminal prompt without any output. If the module returns immediately, then it could be that it thinks an operation is already in progress (because the lock file is locked). If you are absolutely sure the module isn't performing any operations, is configured correctly and has a bunch of images in the import location, you could try deleting the /tmp/calendarimport.lock file and trying again.

If the cron job runs fine but no items are imported into your gallery, then check the Calendar Import log for an entry "Cron is not allowed import because it is disabled in the options". This means you haven't finished configuring the module yet (check the last step under "Installation").

One possible item of note is that the import folders would also need to be accessible by the www-data user. These notes have assumed that /var/www/gallery3/var/uploads would be used for that purpose so it is probably already configured. If you chose to use some other location, then check to make sure www-data can access it and that the files are not read-only.
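
A quick way to test this is a small check run as the www-data user. The sketch below is an assumption-laden illustration: check_import_dir is a hypothetical helper that only tests the top level of the folder, and the path in the example is the one assumed throughout these notes.

```shell
#!/usr/bin/env bash
# Sketch: check_import_dir is a hypothetical helper, not part of the
# module. It only inspects the top level of the given folder.

check_import_dir() {
    local dir=$1 file
    # The folder must exist and be readable, writable and searchable.
    [ -d "$dir" ] && [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ] || {
        echo "folder not accessible: $dir"
        return 1
    }
    # Files must be writable too, otherwise they cannot be moved or deleted.
    for file in "$dir"/*; do
        [ -f "$file" ] || continue
        [ -w "$file" ] || { echo "read-only file: $file"; return 1; }
    done
    echo "ok"
}

# Example (runs the check as www-data so the result reflects that account):
#   sudo -u www-data bash -c "$(declare -f check_import_dir); check_import_dir /var/www/gallery3/var/uploads"
```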

Discussion

http://galleryproject.org/node/112373