2024 ESS-DIVE Partner Projects Announcement

November 7, 2024 by Dylan O'Ryan

We are excited to announce that we have four new ESS-DIVE Partner Projects, led by Environmental System Science (ESS) researchers, to help advance DOE ESS data management and use. To learn more about these projects, join the fully virtual ESS-DIVE Annual Data Workshop session next week on “Data Curation to Enable Reuse and Integration with DOE ESS Projects,” November 15th, 11:30 am – 1:30 pm PT. Our project partners will discuss their planned work, answer questions, and get your input. Visit the workshop event page for the full workshop agenda, and register here to attend virtually over Zoom.

The 2024 ESS-DIVE Partner Projects include:

ESS-DIVE Data Curation Support

Kim Ely, LBNL

The ESS-DIVE Data Curation Support project and role will assist ESS-DIVE data contributors in curating high-quality datasets using ESS-DIVE reporting formats, improve data curation workflows, and advise and train ESS-DIVE data contributors. To request ESS-DIVE Data Curation support, open the ESS-DIVE Contact Us form, select the Other option, enter “Request data curation assistance,” and describe what you need help with.

A Workflow and Reporting Format for Processing Environmental Sensor Data and Automated Generation of ESS-DIVE Compliant Metadata

Stephanie Pennington and Ben Bond-Lamberty, PNNL

This project will develop an environmental sensor data reporting format, which will include proposing standard guidelines for versioning and processing levels of sensor data, and an automated processing pipeline to handle sensor data and generate ESS-DIVE compliant sensor data and metadata. Reporting format development will involve surveying other major ESS projects using data loggers and environmental sensors to synthesize best practices and commonly used formats. This project will also update the existing Soil Respiration Reporting Format to interact with these new sensor data processing tools.

Harmonization and Increased Usability of ESS-DIVE’s Hydrologic Monitoring and Soil, Sediment, and Water Chemistry Reporting Formats

Amy Goldman and Brieanne Forbes, PNNL

This project will update and harmonize the ESS-DIVE Hydrologic Monitoring and Water-Soil-Sediment Chemistry reporting formats to improve machine readability and interoperability with ESS-DIVE’s features, and develop programmatic tools for data contributors and users to aid in use of the two reporting formats.

Improving Advanced Terrestrial Simulator (ATS) model data managing and archiving standards

Ethan Coon (ORNL) and Zhi Li (PNNL)

This project will prototype standardized procedures for managing and archiving model data for the Advanced Terrestrial Simulator (ATS) model. This project will enhance, optimize, and standardize the existing Model Data Archiving Guidelines reporting format; and improve ATS’s ability to generate standardized Model Data Archives (MDA), including implementing CSV-standard compliant ATS observation files and developing scripts for setting up, automating, and submitting MDAs to ESS-DIVE including the semi-automated generation of File Level Metadata and Data Dictionary files.

2024 ESS-DIVE Annual Data Workshop

October 15, 2024 by Dylan O'Ryan

The Environmental System Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE) data archive, based at Lawrence Berkeley National Laboratory, is thrilled to announce the 2024 ESS-DIVE Annual Data Workshop on November 14th and 15th, 2024. This free and virtual event is a chance for DOE ESS researchers, data contributors, and data users to join forces, fostering the exchange of valuable insights and expertise on ESS data.

The theme this year will focus on “ESS Data Integration and Use”. The opening session will feature a keynote presentation from Kjiersten Fagnan, Chief Informatics Officer for the DOE Joint Genome Institute (JGI), on DOE’s Biological and Environmental Research (BER) data integration goals. We will then have presentations from ESS Synthesis projects on their scientific goals, data management approaches, needs, and challenges. We will hold discussions and tutorial sessions on data discovery, integration, publication, curation, and use. Whether you are new to DOE ESS or have been engaging with ESS-DIVE for a while, this is your opportunity to learn about the newest features of the ESS-DIVE data repository and help drive the future of ESS data management, integration, and use.

By attending this workshop, you will gain access to important information and networking opportunities, offering you the following benefits:

Enhance Data Discovery: Learn about ESS-DIVE tools for enhanced data discovery through search innovations. Learn about the Fusion Database and Deep Dive API for search within dataset files, and the planned search enhancements for data discovery. Participate in discussions led by the ESS Cyberinfrastructure Working Group on the needs for ESS data discovery.
Advance BER ESS Data Integration: Learn about ongoing data integration efforts across DOE BER, ESS-DIVE, and projects. Hear from Kjiersten Fagnan on the DOE BER plans to enable global search across BER data systems. Participate in discussions on data search and integration needs across ESS-DIVE, JGI, EMSL (Environmental Molecular Sciences Laboratory), NMDC (National Microbiome Data Collaborative), KBase (Systems Biology Knowledgebase), Ameriflux, and ARM (Atmospheric Radiation Measurement).
Curate Data to Enable Reuse and Integration: Meet the new ESS-DIVE Data Curation Support partner and provide input on your project data curation needs. Learn about other new ESS-DIVE partner projects, their planned updates and harmonization to existing reporting formats (hydrologic monitoring, water-soil-sediment chemistry, soil respiration), and a new reporting format and tools to handle and process environmental sensor data.
Network and Learn from Peers: Participate in a collaborative, interdisciplinary environment where you can learn from fellow scientists with a variety of expertise across data, computational, and domain sciences. Share experiences, challenges, and innovative solutions to advance open ESS data and reuse.

Register Now: Registration for the 2024 ESS-DIVE Annual Data Workshop is now open, and encouraged for anyone who is part of an ESS project. While participants are encouraged to attend both days for the full experience, they are also welcome to join sessions of their interest. To secure your spot, simply complete this registration form. Once registered, you will receive a personalized Zoom link, granting you access to the event.

Each session will provide ample opportunities for participants to engage with the ESS-DIVE team, ask questions, share comments, and provide feedback. Before the event, registered attendees will receive quick instructions on preparing for tutorials and discussions to make the most of their experience.

To discover more about the workshop or ESS-DIVE in general, please visit the workshop event page or contact ess-dive-support@lbl.gov.

Don’t miss the chance to be part of this transformative event and take your ESS projects to new heights. Join us at the 2024 ESS-DIVE Annual Data Workshop for two days of learning, networking, and innovation to advance ESS data integration and use.

ESS-DIVE Has Upgraded Publication Services and Updated File Level Metadata Reporting Format

September 23, 2024 by Dylan O'Ryan

We are excited to announce that ESS-DIVE’s publication services have been upgraded! We’ve been hard at work to make the publication experience on ESS-DIVE easier and improve functionality across the repository over the past year. Additionally, we are proud to announce that the latest File Level Metadata reporting format version is now live.

The latest upgrades are summarized here:

New interface to request publication, reserve DOIs, and view publication statuses
File Level Metadata v1.1.0 is now live and available on GitHub
Dataset API can perform queried searches and bulk publication requests
Alternate storage and upload methods to support large datasets

ESS-DIVE Upgrades

Intuitive new interface for publishing datasets!

ESS-DIVE has revamped the experience of starting a publication request to make the publication journey more intuitive. Additionally, dataset managers can reserve a DOI before publication with the click of a button.
Our new interface clearly displays instructions, considerations, and next steps before sending out a publication request for review or reserving a DOI.
An inactive DOI can be reserved for a private dataset at any time. It is activated automatically after your dataset is published.

Visit our documentation to learn more.

“Manage” permission is required to manage dataset publication. You must be a registered ESS-DIVE Data Contributor.

Transparent Dataset and DOI Status Badges

Dataset landing pages now display two new icons for your dataset progress and DOI status.
Statuses provide transparency on where you are in the publication process so you know what your next steps are to publish your dataset.
See our documentation for a full review of available statuses.

“Edit” or “Manage” permission is required to view the statuses. You must be a registered ESS-DIVE Data Contributor.

Enhancements

Removed publish year from header

The top row of landing page headers no longer lists the publication year and instead only includes the dataset DOI and version ID.
This update eliminates confusion between the metadata field and the status of your dataset.

Geographic Coordinate Validation in Editor

The editor now more thoroughly prevents the entry of invalid geographic coordinates.
This means no more bounding boxes with inverted latitude and longitude values or those that erroneously span the international dateline or include the poles. It’s a safeguard against common errors that can lead to issues in data interpretation and use.
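As an illustration of the kinds of checks described above, here is a minimal Python sketch. It is not ESS-DIVE's actual editor logic: the `BoundingBox` class, the `bounding_box_problems` function, and the exact rules (for example, treating a latitude of exactly ±90 as "including a pole") are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical bounding box in decimal degrees (not an ESS-DIVE type)."""
    north: float  # northern latitude
    south: float  # southern latitude
    east: float   # eastern longitude
    west: float   # western longitude


def bounding_box_problems(box: BoundingBox) -> list[str]:
    """Return human-readable problems; an empty list means the box passed."""
    problems = []
    for name, lat in (("north", box.north), ("south", box.south)):
        if not -90.0 <= lat <= 90.0:
            problems.append(f"{name} latitude {lat} is outside [-90, 90]")
        elif abs(lat) == 90.0:
            # Assumption: a box touching +/-90 counts as including a pole.
            problems.append(f"{name} latitude {lat} includes a pole")
    for name, lon in (("east", box.east), ("west", box.west)):
        if not -180.0 <= lon <= 180.0:
            problems.append(f"{name} longitude {lon} is outside [-180, 180]")
    if box.north < box.south:
        problems.append("north and south latitudes are inverted")
    if box.east < box.west:
        # East smaller than west means inverted values or a dateline crossing.
        problems.append("east and west longitudes are inverted or span the dateline")
    return problems


ok_box = BoundingBox(north=38.0, south=37.0, east=-121.0, west=-122.0)
bad_box = BoundingBox(north=37.0, south=38.0, east=-122.0, west=-121.0)
print(bounding_box_problems(ok_box))   # no problems reported
print(bounding_box_problems(bad_box))  # both inversions reported
```

Returning a list of messages rather than raising on the first failure lets a form show every problem at once.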

Bug Fixes

The following bugs were resolved in the latest release:

Critical patch for compatibility with latest Chrome browser
Dataset aggregation metrics displayed in portals have faster response times and more accurate loading messages
Searching by publish date would raise error in search result window
Contributor metadata field validation did not capture format issue when required last name was missing

Dataset API v1.11

Request publication and reserve DOIs in bulk with the API

ESS-DIVE’s new publication enhancements are available with the API!
Data contributors can now add publication endpoints to their workflows to programmatically request publication or reserve DOIs for multiple datasets at once.
You will still need to respond to revision requests via email and complete the publication process.
Visit our technical documentation for endpoint schema and expected format.

Try out the Dataset API using Jupyter Notebook tutorials

No coding experience required! Our tutorials are complete, ready-to-run tools that walk through the process of using the Dataset API from start to finish.
Each tutorial is designed to demonstrate a particular application of the API. Browse our selection of notebooks to see if one fits your needs: https://github.com/ess-dive/essdive-tutorials
Run the notebooks locally in Jupyter Notebook or click the “launch | binder” badge to run them directly in your browser.

New R code example to submit many files at once

ESS-DIVE has a new coding example that demonstrates how to submit a new dataset with metadata and many files at once in R.
Our example code is intended for data contributors who are building scripts to submit data to ESS-DIVE. Unlike the tutorial notebooks, it is written with the experienced coder in mind.
Download the code example from our GitHub repository: submit_metadata_with_multiple_files.r

Enhancements

Dataset searches with the API are much improved

Multiple enhancements in this release improve the Dataset API’s search capabilities, making searches easier to run and results more informative.
- Search for datasets by keyword
- Dataset previews now include the DOI in search results
- Preview the file sizes (in bytes) before download
- Enables search by older versions of dataset metadata
  - Adds the identifier for the next and previous version of dataset metadata to search results

Search parameter descriptions are easier to use

ESS-DIVE’s technical documentation for the API is a useful complementary resource to our tutorials.
The improved search documentation provides more details about available search parameters and format requirements to help you fit the tutorial search examples to specific use cases you are interested in.

Reporting Format: File Level Metadata (FLMD) v1.1.0

Version 1.1.0 (v1.1.0) of the File Level Metadata reporting format addresses usability challenges and feedback from early adopters. Importantly, v1.1.0 is backward compatible with the original version (v1.0.0), and as such, the Fusion Database can read and parse file level metadata (flmd) and data dictionary (dd) files following both versions.

If you would like to request a change to the File Level Metadata reporting format, please submit a GitHub issue using one of the templates we provide. If you have any questions, please email ESS-DIVE support at ess-dive-support@lbl.gov.

Standardized variable naming convention to snake_case

The original FLMD field names used both underscores and capitalizations and did not follow a specific naming convention. In v1.1.0, we have updated the formatting of field names to follow snake_case guidelines to increase consistency and harmonize across the reporting formats. For example:
- File_Name changed to file_name
- Column_or_Row_Name changed to column_or_row_name
No spelling or underscore placement has been revised.
While templates and instructions have been updated to reflect this convention change, the original capitalization from v1.0.0 will still be accepted as long as spelling and use of underscores are consistent.
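Because only capitalization changed, existing v1.0.0 field names can be converted by lowercasing them. The following Python sketch shows one way to do that; the function names `to_v1_1_field_name` and `normalize_header` are made up for this example and are not part of any ESS-DIVE tool.

```python
def to_v1_1_field_name(name: str) -> str:
    """Lowercase a v1.0.0 field name; spelling and underscores stay as they are."""
    return name.strip().lower()


def normalize_header(header: list[str]) -> list[str]:
    """Apply the snake_case convention to every field name in a header row."""
    return [to_v1_1_field_name(name) for name in header]


print(normalize_header(["File_Name", "Column_or_Row_Name"]))
# ['file_name', 'column_or_row_name']
```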

New terms for standard field

The recommended standard terms for ESS-DIVE reporting formats in the file level metadata file can be used to specify a standard or format that a file follows. The standardized terms are important to identify specific files using ESS-DIVE reporting formats.

Removed excess optional fields

We have removed the following optional fields from the documentation to streamline the creation of the file level metadata file:
- UTC_offset
- Contact
- Date_Start
- Date_End
- Latitude
- Longitude
- Northwest_Latitude_Coordinate
- Northwest_Longitude_Coordinate
- Southeast_Latitude_Coordinate
- Southeast_Longitude_Coordinate
The optional missing_value_code field, which can be used to specify the codes representing missing values, has been moved to the data dictionary file so that it can be used at the variable level.
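If you maintain file level metadata files created under v1.0.0, a short script can flag which columns are affected by this change. The Python sketch below is only an illustration: the function name `review_flmd_header` and the sample row are made up, and it assumes the file is a CSV whose first row holds the field names.

```python
import csv
import io

# Optional fields removed from the v1.1.0 documentation, lowercased for comparison.
REMOVED_FIELDS = {
    "utc_offset", "contact", "date_start", "date_end", "latitude", "longitude",
    "northwest_latitude_coordinate", "northwest_longitude_coordinate",
    "southeast_latitude_coordinate", "southeast_longitude_coordinate",
}
# Field that now belongs in the data dictionary file instead.
MOVED_TO_DATA_DICTIONARY = {"missing_value_code"}


def review_flmd_header(flmd_csv_text: str) -> dict[str, list[str]]:
    """Report header columns that v1.1.0 no longer documents in the FLMD file."""
    header = next(csv.reader(io.StringIO(flmd_csv_text)), [])
    columns = [column.strip() for column in header]
    return {
        "removed": [c for c in columns if c.lower() in REMOVED_FIELDS],
        "moved_to_data_dictionary": [
            c for c in columns if c.lower() in MOVED_TO_DATA_DICTIONARY
        ],
    }


# Made-up sample content, for illustration only.
sample = "File_Name,Contact,Latitude,missing_value_code\nsites.csv,Jane Doe,37.87,-9999\n"
report = review_flmd_header(sample)
print(report["removed"])                   # ['Contact', 'Latitude']
print(report["moved_to_data_dictionary"])  # ['missing_value_code']
```

The sketch only reports columns and does not delete anything, so you can decide how to handle each one.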

More exciting upgrades!

Globus: Upload large data with ease

Do you need faster upload times and more memory to upload large data volumes? You may be interested in using Globus!
Globus is a free, cloud-based transfer service that anyone can use to move data from your desktop or cloud endpoints to ESS-DIVE.
We encourage anyone with more than 500GB of data, more than 100 flat files, or anyone experiencing slow local internet speeds to use Globus.
Get in touch with us if you would like to explore this option (ess-dive-support@lbl.gov).
Learn more about how ESS-DIVE supports large data in our documentation.
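For scripted workflows, the guidance above can be expressed as a simple check. This Python sketch is an illustration only: the function name `globus_suggested` is made up, and it assumes that 500GB means decimal gigabytes (10^9 bytes).

```python
GIGABYTE = 10**9  # assumption: decimal gigabytes


def globus_suggested(total_bytes: int, file_count: int, slow_connection: bool = False) -> bool:
    """Return True when any of the three conditions in the guidance above applies."""
    return total_bytes > 500 * GIGABYTE or file_count > 100 or slow_connection


print(globus_suggested(total_bytes=20 * GIGABYTE, file_count=12))    # False
print(globus_suggested(total_bytes=20 * GIGABYTE, file_count=2500))  # True
```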

Webinar slides and recording

Tier 2: Download and browse large, hierarchical data with ease

Tier 2 is an alternate storage service developed by ESS-DIVE to accommodate:
- complex file hierarchies that cannot be compressed,
- large file volumes, or
- large quantities of files that are too cumbersome to download from dataset landing pages.
Get in touch with ESS-DIVE Support (ess-dive-support@lbl.gov) to begin the process of publishing data with Tier 2.
Learn more about how ESS-DIVE supports large data in our documentation.

This service should only be used if necessary.

Webinar on New Publication Interface for ESS-DIVE Datasets

August 29, 2024 by Dylan O'Ryan

ESS-DIVE Webinar

Tuesday, September 24 | 10:00-11:00 PT / 13:00-14:00 ET

View Webinar Recording / Link to Webinar Slides

Join this webinar to learn about ESS-DIVE’s new publication interface. ESS-DIVE’s latest updates allow data contributors greater transparency and agency over core aspects of the dataset publication process. The new interface is available to data contributors directly on dataset submission and landing pages. In this webinar, we will showcase this interface for requesting dataset review and reserving DOIs before publication and explore the new ESS-DIVE status badges. We will include time for participant questions and discussion around this new feature and ESS-DIVE’s publication process.

During the webinar, we will cover the following topics:

How datasets are published on ESS-DIVE
Showcase new dataset publication interface and status badges
Provide examples on how this can benefit publication workflows
Who has access to these new tools

Please encourage anyone from your project who may be interested to attend.

Fianna O’Brien, flobrien@lbl.gov

Computer Systems Engineer
Fianna is a computer systems engineer focusing on data publication and data lifecycle workflows for ESS-DIVE. She has experience managing publication data and creating data driven outreach tools for large-scale ESS projects. She holds a B.A. in Computational Linguistics from Hampshire College.

2024 ESS-DIVE Partner Project Funds Available

April 30, 2024 by Dylan O'Ryan

As announced at the 2024 Department of Energy (DOE) Environmental System Science (ESS) Program Principal Investigators (PI) meeting, ESS-DIVE has $1M of partner project funds available for members of DOE ESS who are interested in working on data management with ESS-DIVE.

Preference will be given to the Priority Topics below; however, other ideas that have clear value for DOE ESS are also welcome.

New reporting formats
New versions of existing reporting formats: Improve machine readability and compatibility with FusionDB, BASIN-3D etc., model data archiving for large outputs/ML
Data curators: Provide guidance to DOE ESS projects on best practices for submitting data to ESS-DIVE & help with adoption of reporting formats
Data integration: Tools to integrate ESS-DIVE and other BER data
Data products: Products using ESS-DIVE & BER data for broad scientific use

To apply, submit a white paper (maximum 2 pages) describing the proposed effort to ess-dive-leadership@lbl.gov by May 15, 2024. Please review our white paper template, with instructions for submission, available here.

You can email ess-dive-leadership@lbl.gov with questions about the white papers.