Database layout
A formal description of the data base layout used by the datapool.
Legend:
pk
= Primary Keyfk
= Foreign Keyuq
= Unique within tableA
field
in bold letters indicates a field which cannot be NULL
signal
This is the central table holding the measurements. Each row represents a value of a parameter measured at a given time and location (site). The coordinates of the signal may not correlate to entries in the site table.
field |
datatype |
description |
---|---|---|
signal_id |
integer (pk) |
|
value |
float |
the actual measured value of the parameter |
timestamp |
date_time |
time when the value was measured. |
parameter_id |
integer (fk) |
parameter |
source_id |
integer (fk) |
source |
site_id |
integer (fk) |
site |
coord_x |
string |
at a given site, a signal may origin from a specific place |
coord_y |
string |
the type of coordinate system is CH1903/LV03 |
coord_z |
string |
elevation |
binary_data
Store n-dimensional arrays as binary blob.
Use the poolkit's
binary_data_packing module for converting n-dimensional arrays to binary data
and the other way around.
field |
datatype |
description |
---|---|---|
binary_data_id |
integer (pk) |
pk |
data |
byte |
binary data |
timestamp |
timestamp |
parameter |
parameter_id |
string (fk) |
parameter |
source_id |
string (fk) |
source |
site_id |
string (fk) |
site |
lab_result
Store lab measurements. At the moment primarily designed for water samples.
field |
datatype |
description |
---|---|---|
lab_result_id |
integer (pk) |
pk |
lab_identifier |
string |
id stated by lab |
parameter_id |
string (fk) |
parameter |
sample_identifier |
string |
id of sample |
filter_lab |
string |
used filter |
dilution_lab |
float |
lab dilution |
method_lab |
string |
measurement method |
value_lab |
float |
measured value |
description_lab |
string |
comment from lab personnel |
timestamp_start_lab |
timestamp |
processing start |
timestamp_end_lab |
timestamp |
processing end |
site_id |
integer (fk) |
site |
filter_sample |
string |
filter size used for sample |
dilution_sample |
float |
sample dilution |
timestamp_sample |
timestamp |
time the sample was taken |
method_sample |
string |
sampling method |
description_sample |
string |
comment of person who took sample |
lab_result_person_lab_association
Store multiple people that worked on a single sample in the lab.
field |
datatype |
description |
---|---|---|
lab_result_id |
integer (pk) |
|
person_id |
string (fk) |
person |
lab_result_person_sample_association
Store multiple persons that were involved in the sampling process of a single sample.
field |
datatype |
description |
---|---|---|
lab_result_id |
integer (pk) |
|
person_id |
string (fk) |
person |
meta_log_type
A log type represents the top category, which a meta data history entry can belong to. Eg.: Maintenance, Installation, Malfunctions, …
field |
datatype |
description |
---|---|---|
meta_log_type_id |
integer (pk) |
|
name |
string (uq) |
|
description |
string |
meta_action_type
Action types represent a sub category to log types. Eg.: Cleaning (sub cat. from Maintenance), Battery (sub cat. from Maintenance), …
field |
datatype |
description |
---|---|---|
meta_action_type_id |
integer (pk) |
|
meta_log_type_id |
integer (fk) |
|
name |
string |
|
description |
string |
meta_flag
Flags are intended to mark certain entries. Eg.: low_operator_confidence (stating that operator of the sensor network, has seen issues in the field, …), …
field |
datatype |
description |
---|---|---|
meta_flag_id |
integer (pk) |
|
name |
string (uq) |
|
description |
string |
meta_data
The meta data table is intended to store “one time” information and are linked to sources.
Any static data that corresponds to a source, and does not fit into the other source related tables.
The json
field enables you to store any kind of json like information to ensure maximal flexibility.
Eg.: API-keys for a data backend, WebGIS information
field |
datatype |
description |
---|---|---|
meta_data_id |
integer (pk) |
|
source_id |
integer (fk) |
|
site_id |
integer (fk) |
|
description |
string |
|
additional_meta_info |
json |
field allows for storing any information |
meta_data_history
Continuous meta data can be stored here, like regular maintenance entries. Use log types and action_types (optional),
as well as flags (optional) to make your data easily searchable, after the fact.
The json
field enables you to store any kind of json like information to ensure maximal flexibility.
Eg.:
field |
datatype |
description |
---|---|---|
meta_data_history_id |
integer (pk) |
|
meta_data_id |
integer (fk) |
|
meta_log_type_id |
integer (fk) |
|
meta_action_type_id |
integer (fk) |
|
meta_flag_id |
integer (fk) |
|
person_id |
integer (fk) |
|
timestamp_start |
timestamp |
|
timestamp_end |
timestamp |
|
comment |
string |
|
additional_meta_info |
json |
field allows for storing any information |
meta_picture
Pictures provided with meta data will be stored here.
field |
datatype |
description |
---|---|---|
picture_id |
integer (pk) |
|
meta_data_id |
integer (fk) |
|
meta_data_history_id |
integer (fk) |
|
filename |
string |
|
description |
string |
|
data |
bytes |
person
Enter persons to be referenced in meta data or lab entries.
field |
datatype |
description |
---|---|---|
person_id |
integer (pk) |
|
abbreviation |
string (uq) |
|
name |
string (uq) |
|
string |
project
Create a project. Each source belongs to a project.
field |
datatype |
description |
---|---|---|
project_id |
integer (pk) |
|
title |
string (uq) |
|
description |
string |
site
A site is a location where measurements are made. At a given site, several measuring equipments (source) can be found. The location of the site is also described by its coordinates.
field |
datatype |
description |
---|---|---|
site_id |
integer (pk) |
|
name |
string (uq) |
Name of that site |
description |
string |
|
street |
string |
street and number |
postcode |
string |
postal code |
city |
string |
name of the city/village |
coord_x |
string |
x coordinate of a given site |
coord_y |
string |
y coordinate of a given site |
coord_z |
string |
elevation |
picture
Every site may contain a number of pictures. Filenames for each site
must be unique. The filetype (e.g. png, jpg, tiff) is determined by the filename
extenion of the filename
field.
field |
datatype |
description |
---|---|---|
picture_id |
integer (pk) |
|
site_id |
integer (fk) |
referring to the site |
filename |
string |
|
description |
string |
additional information about the picture |
data |
bytea |
contains the (binary) content of the file |
timestamp |
date_time |
creation date of the picture |
source
A (data-) source is a specific measuring equipment. Every measurement (signal) origins from a specific source. Sources are categorized into source_types. The name of a source must be unique.
field |
datatype |
description |
---|---|---|
source_id |
integer (pk) |
|
source_type_id |
integer (fk) |
source category |
site_id |
integer (fk) |
optional reference to a site (may be NULL) |
name |
string (uq) |
Name of that source. Usually is a combination of source_type and site name. |
description |
string |
|
serial |
string |
serial number. Is unique, if available |
source_type
Categorization of a given source.
field |
datatype |
description |
---|---|---|
source_type_id |
integer (pk) |
pk |
name |
string (uq) |
Name of that source |
description |
string |
|
manufacturer |
string |
company which produced that equipment |
special_value_definition
Certain source types produce categorical data, such as «dry», «wet»,
«n/a» and so on. This table is used to correlate categorical data and numeric
values for a given source type. For example the numerical value 1
might encode
the state «dry».
field |
datatype |
description |
---|---|---|
special_value_definition_id |
integer (pk) |
|
source_type_id |
integer (fk) |
source_type |
description |
string |
|
categorical_value |
string |
the catecorical value |
numerical_value |
float |
the numeric value it is mapped to. |
parameter
Every value in the signal table is connected to a specific parameter which describes and defines its unit.
field |
datatype |
description |
---|---|---|
parameter_id |
integer (pk) |
|
name |
string (uq) |
e.g. “rain intensity”, “absorbance 200.00”, etc. |
description |
string |
|
unit |
string |
the physical unit, e.g. “mm/h”, “m-1” |
signals_signal_quality_association
Table maps flags to signals.
field |
datatype |
description |
---|---|---|
signal_quality_id |
integer (fk) |
|
signal_id |
integer (fk) |
signal_quality
A signal may contain more than one quality flag (but not the same quality flag twice). The combination of signal_id and quality_id must be unique.
field |
datatype |
description |
---|---|---|
signal_quality_id |
integer (pk) |
|
quality_id |
integer (fk) |
|
timestamp |
date_time |
date when annotation was added |
author |
string |
username of the author who added the annotation |
quality
Measuring the environment is always error prone. This table holds the controlled vocabulary mentioned above. As some quality flags may be assigned programmatically the method field indicates the origin of such an quality entry.
field |
datatype |
description |
---|---|---|
quality_id |
integer (pk) |
|
flag |
string (uq) |
a textual description of quality_id |
method |
string |
a description how the quality flag is generated. |
Design principles
The design of the database follows the https://en.wikipedia.org/wiki/Star_schema to model multidimensional data with a https://en.wikipedia.org/wiki/Data_warehouse.
You find a graphical description of the star schema here
.
We follow these principles to assure a consistent layout of the underlying tables:
primary keys of a table are called
tablename\_id
instead ofid
table names are in singular
the star schema avoids too much normalization
a table should not contain too abstract information