Information Management Projects

SBC LTER Information Management

Below are descriptions of information management projects at SBC, both in production and in development. This list is representative, not exhaustive. Each project implements specific goals of the information management plan (see the overview below) and may be a web application, a database, or a library of scripts for a specific task such as data processing or metadata (EML) generation. Documentation and timelines may change; the most current information on the status of a project is available internally or by request.

SBC Information Management Plan and Overview (PDF)

Semantic Tools for Ecological Data Management (Semtools): SBC is a partner in the Semtools project. Existing approaches to managing data and associated metadata fail to adequately capture the semantics of the scientific process, thereby impeding the utility of those data. The goal of Semtools is to develop software tools to semantically annotate data and metadata. SBC will develop a measurement registry based on this semantic model. SBC also participates in the SONet project, a community-driven effort to define and develop the specifications and technologies needed for semantic interpretation and integration of observational data.

Publications database: Publications are currently stored in Metacat, but we are investigating other options. These include:

  1. Using a native XML database, eXist, to take advantage of its built-in support for the W3C specification XQuery.
  2. Using the Metabase bibliographic tables, since we will be using the Metabase database for data package metadata. Keeping publications in Metabase could make cross-referencing material easier.
  3. Using the LNO biblioDB. We are already required to send our bibliography to the LNO's database. If biblioDB has appropriate features, it would simplify SBC's management tasks, since we would not have to maintain redundant copies.

  • documentation available by request

Places database: SBC has more than 300 sampling sites, covering all habitats, and samples are often collected opportunistically. Centralized metadata storage for sampling sites will facilitate metadata publication in EML and also provides content for mapping applications (e.g., Google Maps, Google Earth), so that research can be coordinated. We developed a schema that is closely aligned with the EML geographicCoverage module and with the Open Geospatial Consortium's Geography Markup Language (GML). Site descriptions are stored in a native XML database with built-in REST web services. Web services have been created to add sampling sites to EML metadata automatically. Links to sample applications using the database are available by request.
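As an illustration of how a stored site record could be turned into EML, the sketch below builds a geographicCoverage element for a single-point site using the Python standard library. It is not the SBC web service: the function name, site id, description, and coordinates are all hypothetical, and only the element names come from the EML geographicCoverage module. A point site is expressed by repeating the longitude for the west and east bounds and the latitude for the north and south bounds.

```python
import xml.etree.ElementTree as ET


def site_to_geographic_coverage(site_id, description, lat, lon):
    """Build an EML geographicCoverage element for a single-point site.

    Hypothetical helper for illustration; not part of any SBC tool.
    """
    cov = ET.Element("geographicCoverage", {"id": site_id})
    ET.SubElement(cov, "geographicDescription").text = description
    box = ET.SubElement(cov, "boundingCoordinates")
    # EML orders the bounds west, east, north, south.
    for tag, value in (
        ("westBoundingCoordinate", lon),
        ("eastBoundingCoordinate", lon),
        ("northBoundingCoordinate", lat),
        ("southBoundingCoordinate", lat),
    ):
        ET.SubElement(box, tag).text = f"{value:.6f}"
    return cov


# Made-up example values, not a real SBC site.
site = site_to_geographic_coverage(
    "site_example_01", "Hypothetical reef sampling site", 34.4, -119.8
)
xml_text = ET.tostring(site, encoding="unicode")
print(xml_text)
```

Giving each element an id is what would allow other parts of an EML document to reference the site rather than repeat its coordinates.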

Building and maintaining EML metadata:

1. Scientific staff members use the EML editor Morpho to describe the unique parts of a data package; the information manager then adds the SBC-specific parts using scripts and centralized metadata.

[Figure: coordinating between labs and IM]

2. In some languages, data can also be published as the last step of data processing. This set of tools is a collaborative effort with PISCO.

[Figure: stream processing example]

EML Dataset Query Application (EDQ): We have developed a generic tool for loading EML-described datasets into a relational database so that the data can be queried with web forms. This tool can be applied to many datasets, with a few constraints: (1) only single-point sampling sites are supported, (2) the table must have a date column (or one that can be created from multiple columns), and (3) stations in a table must include attribute-level geographicCoverage trees that reference dataset-level trees.
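The general idea of loading a table and querying it by station and date can be sketched as follows. This is a minimal illustration in Python with an in-memory SQLite database, not the EDQ itself (which was written with the Java DML); the table name, column names, and sample values are all assumptions. It shows the second constraint in practice: a single date column is assembled from separate year, month, and day columns before loading.

```python
import sqlite3
from datetime import date

# Hypothetical rows, as they might be read from a data table
# whose date is split across three columns.
rows = [
    {"station": "S1", "year": 2009, "month": 3, "day": 14, "temp_c": 14.2},
    {"station": "S1", "year": 2009, "month": 4, "day": 2, "temp_c": 15.1},
    {"station": "S2", "year": 2009, "month": 3, "day": 20, "temp_c": 13.7},
]


def load(rows):
    """Load rows into an in-memory table with one ISO-formatted date column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE obs (station TEXT, obs_date TEXT, temp_c REAL)")
    conn.executemany(
        "INSERT INTO obs VALUES (?, ?, ?)",
        [
            (
                r["station"],
                # Combine year, month, and day into a single date value.
                date(r["year"], r["month"], r["day"]).isoformat(),
                r["temp_c"],
            )
            for r in rows
        ],
    )
    return conn


def query(conn, station, start, end):
    """Return (date, value) pairs for one station within a date range."""
    cur = conn.execute(
        "SELECT obs_date, temp_c FROM obs "
        "WHERE station = ? AND obs_date BETWEEN ? AND ? "
        "ORDER BY obs_date",
        (station, start, end),
    )
    return cur.fetchall()


conn = load(rows)
result = query(conn, "S1", "2009-03-01", "2009-03-31")
print(result)
```

Storing dates as ISO 8601 text keeps range comparisons correct under plain string ordering, which is why the sketch can use BETWEEN without a dedicated date type.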

Further development: The EDQ was written for EML 2.0.1 with a prototype version of the EML Data Manager Library, or DML (Java). EML is now at version 2.1, and the DML is being significantly upgraded as part of the NIS PASTA framework. The EDQ should be evaluated for an upgrade to both EML 2.1 and the improved DML. Some of its functionality may eventually be provided by PASTA.

Project DB: We have completed Phase I of our implementation of the network database, "projectDB". This database was developed by the network Information Managers Committee (IMC) to provide a common framework for exchanging information about active and legacy research projects. The database uses a modification of the EML specification for storage and includes a web service export compliant with EML 2.1. We now have 14 project "themes" in Metabase, which are exported as projectDB XML. In future phases of work, we will add information about specific project activities and experiments, which can be linked to these themes and to individual products such as data and papers.