ESS-DIVE funds a small number of community projects to work on critical data management needs. The current focus is on developing standardized formats, vocabulary and metadata requirements that will be broadly adopted by the ESS community.
Several teams were funded in 2019 to work on standards for different data types that are typically generated by ESS research activities. Each of these efforts involves extensive work to identify appropriate existing data/metadata standards, or to develop new standards that take into account input from other researchers within the DOE community who collect or use similar data. The work also involves developing crosswalks to other major standards relevant to the data type, and ultimately providing a recommendation for the standard of choice.
Email firstname.lastname@example.org if you are interested in working on data standards or other use of the Community Funds.
Generic standard for file-level metadata and comma-separated data files (Alison Boyer, ORNL)
This project will develop two standards for data types important to the ESS community: 1) a generic standard for the structure of data stored in comma-separated values (CSV) text format files, and 2) a metadata schema to describe the contents, scope, and structure of each data file within the ESS-DIVE repository. File-level metadata is the description of individual files that are part of a larger collection. This metadata will be fully consistent with, and will augment, the metadata collected to describe each data collection. File-level metadata provides granular information that enables data users to search for and locate files that contain specific measured parameters or variables, and to understand and compare the contents of each data file within an overarching data collection.
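As a rough illustration of what file-level metadata makes possible, the sketch below builds a small descriptive record from a CSV file's header and then searches those records for files that contain a given variable. It is only a sketch under stated assumptions: the record fields (`file_name`, `columns`, `n_columns`, `n_rows`), the function names, and the sample data are all hypothetical and are not taken from any ESS-DIVE schema.

```python
import csv
import io


def describe_csv(name, text):
    """Build a file-level metadata record for one CSV file.

    The keys of the returned dict are hypothetical, chosen for
    illustration only; they do not come from an ESS-DIVE standard.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)            # first row holds the column names
    n_rows = sum(1 for _ in reader)  # remaining rows are data rows
    return {
        "file_name": name,
        "columns": header,
        "n_columns": len(header),
        "n_rows": n_rows,
    }


def files_with_variable(records, variable):
    """Return the names of files whose header lists the given variable."""
    return [r["file_name"] for r in records if variable in r["columns"]]


# Made-up sample content standing in for a contributed data file.
sample = (
    "timestamp,soil_temp_c,co2_flux\n"
    "2019-06-01T00:00,14.2,1.8\n"
    "2019-06-01T01:00,13.9,1.7\n"
)
record = describe_csv("plot_a.csv", sample)
matches = files_with_variable([record], "co2_flux")
```

With records like this stored alongside each file, a search for a variable name only needs to read the metadata, not the data files themselves.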
Continuous soil respiration data and metadata (Ben Bond-Lamberty, PNNL)
This project is designing and finalizing standards and tools for a continuous soil respiration database. It will build on a prototype continuous soil respiration database (COSORE, see https://github.com/bpbond/cosore), in which individual contributed datasets are mapped to standardized data and metadata fields. This research project aims to (i) identify and develop a standard format for continuous soil respiration data important to the earth system science community, (ii) define the crosswalk between this standard and other soil respiration standards in use in the community, and (iii) test QA/QC and ingestion tools.
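The idea of mapping contributed datasets to standardized fields, followed by a simple QA/QC check, can be sketched as below. This is an illustrative sketch only and does not reproduce COSORE's actual code or field names: the crosswalk entries, the standardized names, the function names, and the range bounds are all assumptions made for the example.

```python
# Hypothetical crosswalk: contributed column name -> standardized field name.
CROSSWALK = {
    "Time": "timestamp",
    "Tsoil": "soil_temp_c",
    "Flux": "co2_flux",
}


def standardize(row, crosswalk):
    """Rename known columns and report the ones the crosswalk does not cover."""
    mapped, unmapped = {}, []
    for key, value in row.items():
        if key in crosswalk:
            mapped[crosswalk[key]] = value
        else:
            unmapped.append(key)
    return mapped, unmapped


def flux_in_range(record, low=-10.0, high=50.0):
    """Illustrative range check; the bounds are placeholders, not real QA/QC rules."""
    return low <= float(record["co2_flux"]) <= high


# Made-up contributed row with one column the crosswalk does not know.
contributed = {"Time": "2019-06-01T00:00", "Flux": "1.8", "Chamber": "3"}
mapped, unmapped = standardize(contributed, CROSSWALK)
ok = flux_in_range(mapped)
```

Reporting unmapped columns rather than silently dropping them is a design choice for the sketch: it lets a contributor see which fields still need a place in the standard.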
Hydrological Monitoring Data (Amy Goldman and Huiying Ren, PNNL)
The project will build on existing standards and/or well-established best practices for hydrologic monitoring data, including single-point-in-time and logged measurements. Variables included will be pressure, river stage, temperature, and conductivity (electrical and specific conductance). The scope may be expanded to other variables such as those measured by aquatic sondes (e.g., dissolved oxygen, pH, oxidation-reduction potential).
Storage of leaf level gas exchange data and metadata files (Alistair Rogers and Kim Ely, BNL)
This research project aims to identify and develop a standard format for storing leaf-level gas exchange data and metadata files. The work will develop standardized formats, vocabulary, and metadata requirements for uploading leaf-level gas exchange data to the ESS-DIVE archive. This includes all measurements made with a portable gas exchange system, such as survey-style measurements of photosynthesis, respiration, and stomatal conductance; response curves (e.g., light response curves, CO2 response curves, and temperature response curves); curves used to determine stomatal slope; and fitting of derived parameters (e.g., Vc,max from photosynthetic CO2 response curves). It also includes other derivations of key parameters, e.g., derivations of Vc,max from proxies such as “one-point” measurements or spectroscopy.
Water quality and soil/sediment samples data and metadata files (Kristin Boye, SLAC)
The aim of this work is to develop standardized formats, vocabulary, and metadata requirements for uploading data from water, soil, and sediment samples to the ESS-DIVE archive. It will build on existing standards and/or well-established best practices for water sample data (e.g. EPA’s WQX) and geological/soil sample data (e.g. National Cooperative Soil Survey and other USDA/USGS recommended/employed formats) to develop ESS-specific standards and templates for data and associated metadata uploads to ESS-DIVE.
16S amplicon data products standardization and KBase integration (Pamela Weisenhorn, ANL)
The aim of this work is to develop standardized formats, vocabulary, and metadata requirements for uploading 16S amplicon data products (specifically, operational taxonomic unit (OTU) and environmental sequence variant (ESV) tables) to the ESS-DIVE archive. It will build on existing standards and/or well-established best practices for this data product, which results from bioinformatic processing of 16S data. Multiple pipelines exist for generating this data product, and appropriate capture of the metadata associated with the analysis is critical in determining whether data sets generated by different researchers can be combined or compared for meta-analysis or modeling purposes. This work will involve determining best practices for preserving data provenance and metadata on bioinformatic processing, so that end users can determine whether the data products are appropriate for reuse. In addition to broad consultation with the ESS-funded university partners, this data standard will be developed in consultation with the JGI, the National Microbiome Data Collaborative (NMDC), KBase, MG-RAST, and other relevant stakeholders.