de
en

1. Why historisation?

  • Time is an important dimension in GIS (often neglected in the past)
  • Ability to reconstruct past state of a dataset
  • Simpler database management:
    • No file-based copies
    • Consistent, stable data structure throughout history of a dataset
  • Traceability: who changed when what data - and what did he change?
  • Law/bills: in some countries organisations are required to historise and archive geographic datasets. As an example in Switzerland the recently introduced "GeoIG" (GeoIV, Abschnitt 5) requests historisation of certain datasets.

2. Goals of a historisation implementation

The following requirements should be taken into account when implementing historisation:

  • Should be automatable
  • Should require only low maintenance
  • Should allow manual overriding (e.g. disabling triggers/rules)
  • Should not require adaptations of desktop GIS and Web-GIS
  • Should keep good performance and low storage demands in mind

3. The approaches

The following is common to both approaches:

  • Data is never deleted, but deleted or updated records are archived
  • All data is stored in a central table (both current and past data)
  • The current data state can be retrieved and manipulated through a view
  • Every record has at least the following three additional columns:
    • create_date
    • archive_date
    • id (in addition to "gid", allows traceability)
  • A record may have the following additional columns:
    • create_user
    • last_user ch columns had been changed)
  • Both approaches listed below don't yet have a conflict detection for the case that two or more people work on the same dataset at the same time

3.1. Repository approach

The repository approach developed by the Kanton of Solothurn (SO!GIS) allows data to be checked out, revised externally (potentially in completely different software) and to be checked in again. Alternatively, one can just import data from external production systems and store the data in a central, historised respository. The advantage is that data consistency is always guaranteed.

The function updatelayer() - see download below - first runs the following checks before doing the updates:

  • Are both tables existing in the specified schemas
  • What are the names of the geometry_columns, are they of the same geometry type and are they registered in the "geometry_columns" table?
  • Are the data structures equivalent?
  • Are the primary keys existing?
  • Are the columns necessary for the historisation exist in the historised repository table?

After the checks ran successfully the actual data synchronsation can take place. In a first step, the script finds all records that exist in the old repository table, but not in the new table to be checked in. These records are archived as deleted in the repository table. As a second step the script finds all records that are present in the new table, but not in the repository. Those records are retrieved into the repository and marked as new inserts.

3.2. Live historisation approach

For the live historisation approach developed by the City of Uster (based on existing work in PostgreSQL, e.g. TimeTravel by Varlena), we first need to import two trigger functions written in pl/Perl. You need to load the two trigger functions insert_timegis() and update_timegis() listed in the download section below. They are best imported into the "public" schema. Next you have to do the following 6 steps, which can be automated by the script add_history(), also available below:

1. Create additional columns required for the historisation (see above) 2. Create a delete rule on the master repository table 3. Create insert and update row-level before triggers for insert and update that call the two functions insert_timegis() and update_timegis() 4. Create indizes on the master table, specifically for create_date and archive_date 5. Create a view representing the current data state 6. Create rules (insert, update and delete) to enable data manipulation on the current view

4. Downloads

FOSSGIS 2009 article "[Historisierung von PostGIS-Daten als Grundlage zur Langzeitarchivierung von Geodaten http://www.kappasys.ch/pgtools/pghistory/postgis_historisierung_fossgis2009_neumann_duester.pdf" (german) explaining the historisation principles.

4.1. Repository approach

updatelayer()

Goal: synchronise external data source with historised repository.

Syntax:

updatelayer(new_schema.new_table,historised_schema.historised_table)

updatelayer(), pl/PgSQL, by Horst Duester, Kanton Solothurn

4.2. Live historisation approach

insert_timegis() trigger function

insert_timegis(), pl/Perl, by Andreas Neumann, City of Uster

update_timegis() trigger function

update_timegis(), pl/Perl, by Andreas Neumann, City of Uster

add_history() function

Goal: add necessary columns, triggers, rules, indizes and views for historised datasets.

Syntax:

add_history('name_schema','name_table','name_schema_view','name_view')

add_history(), pl/Perl, by Andreas Neumann, City of Uster

4.3. Sample data (SQL dump)

Goal: historised dataset to experiment with. Please create schema "test", as this SQL script creates a table and view in the schema "test". It also assumes that the above listed trigger functions "insert_timegis()" and "update_timegis()" are present. Assumes utf-8 encoding.

historised_table_example.sql, by Andreas Neumann, City of Uster