We are excited to announce that ESS-DIVE’s publication services have been upgraded! Over the past year, we have worked to make publishing on ESS-DIVE easier and to improve functionality across the repository. Additionally, we are proud to announce that the latest version of the File Level Metadata reporting format is now live.
The latest upgrades are summarized here:
- New interface to request publication, reserve DOIs, and view publication statuses
- File Level Metadata v1.1.0 is now live and available on GitHub
- Dataset API can perform queried searches and bulk publication requests
- Alternate storage and upload methods to support large datasets
ESS-DIVE Upgrades
Intuitive new interface for publishing datasets!
- ESS-DIVE has revamped the experience of starting a publication request to make the publication journey more intuitive. Additionally, dataset managers can reserve a DOI before publication with the click of a button.
- Our new interface clearly displays instructions, considerations, and next steps before sending out a publication request for review or reserving a DOI.
- Inactive DOIs can be reserved for private datasets at any time. A reserved DOI is automatically activated after your dataset is published.
Visit our documentation to learn more.
“Manage” permission is required to manage dataset publication. You must be a registered ESS-DIVE Data Contributor.
Transparent Dataset and DOI Status Badges
- Dataset landing pages now display two new icons for your dataset progress and DOI status.
- Statuses provide transparency on where you are in the publication process so you know what your next steps are to publish your dataset.
- See our documentation for a full review of available statuses.
Enhancements
Removed publish year from header
- The top row of each landing page header no longer lists the publication year and instead includes only the dataset DOI and version ID.
- This update eliminates confusion between the metadata field and the status of your dataset.
Geographic Coordinate Validation in Editor
- The editor now more thoroughly prevents the entry of invalid geographic coordinates.
- This means no more bounding boxes with inverted latitude and longitude values or those that erroneously span the international dateline or include the poles. It’s a safeguard against common errors that can lead to issues in data interpretation and use.
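The kinds of checks described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the editor's actual validation code: the function name `validate_bounding_box`, its parameters, and the exact rules (for example, treating a western edge greater than the eastern edge as a dateline-spanning box) are assumptions made for this example.

```python
def validate_bounding_box(north, south, east, west):
    """Return a list of problems found; an empty list means the box passed.

    Hypothetical helper: the rules below are assumptions that mirror the
    errors named in the text, not the editor's real implementation.
    """
    problems = []

    # Latitudes must fall within [-90, 90] and longitudes within [-180, 180].
    for name, lat in (("north", north), ("south", south)):
        if not -90.0 <= lat <= 90.0:
            problems.append(f"{name} latitude {lat} is outside [-90, 90]")
    for name, lon in (("east", east), ("west", west)):
        if not -180.0 <= lon <= 180.0:
            problems.append(f"{name} longitude {lon} is outside [-180, 180]")

    # Inverted latitudes: the southern edge sits above the northern edge.
    if south > north:
        problems.append("south latitude is greater than north latitude")

    # A western edge east of the eastern edge implies the box wraps
    # around the international dateline.
    if west > east:
        problems.append("west longitude is greater than east longitude (box spans the dateline)")

    # Flag boxes whose edges touch either pole.
    if abs(north) == 90.0 or abs(south) == 90.0:
        problems.append("box includes a pole")

    return problems


if __name__ == "__main__":
    # Swapped north and south values are reported as a problem.
    for problem in validate_bounding_box(north=35.0, south=40.0, east=-120.0, west=-125.0):
        print(problem)
```

Running a check like this on your coordinates before opening the editor can catch swapped values early.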
Bug Fixes
The following fixes are included in the latest release:
- Critical patch for compatibility with the latest Chrome browser
- Dataset aggregation metrics displayed in portals now have faster response times and more accurate loading messages
- Searching by publish date no longer raises an error in the search results window
- Contributor metadata field validation now catches the format issue when the required last name is missing
Dataset API v1.11
Request publication and reserve DOIs in bulk with the API
- ESS-DIVE’s new publication enhancements are available with the API!
- Data contributors can now add publication endpoints to their workflows to programmatically request publication or reserve DOIs for multiple datasets at once.
- You will still need to respond to revision requests via email and complete the publication process.
- Visit our technical documentation for endpoint schema and expected format.
Try out the Dataset API using Jupyter Notebook tutorials
- No coding experience required! Our tutorials are complete, ready-to-run tools that walk through the process of using the Dataset API from start to finish.
- Each tutorial is designed to demonstrate a particular application of the API. Browse our selection of notebooks to see if one fits your needs: https://github.com/ess-dive/essdive-tutorials
- Run the files in Jupyter Notebook locally or click the “launch | binder” badge to run them directly in your browser.
New R code example to submit many files at once
- ESS-DIVE has a new coding example that demonstrates how to submit a new dataset with metadata and many files at once in R.
- Our example code is intended for data contributors who are building scripts to submit data to ESS-DIVE. Unlike the tutorial notebooks, this example is written with the experienced coder in mind.
- Download the code example from our GitHub repository: submit_metadata_with_multiple_files.r
Enhancements
Dataset searches with the API are much improved
- Multiple enhancements in this release improve the Dataset API’s search capabilities, making searches easier to run and results more informative:
- Search for datasets by keyword
- Dataset previews now include the DOI in search results
- Preview the file sizes (in bytes) before download
- Search by older versions of dataset metadata
- Search results now include the identifiers for the next and previous versions of dataset metadata
Search parameter descriptions are easier to use
- ESS-DIVE’s technical documentation for the API is a useful complementary resource to our tutorials.
- The improved search documentation provides more details about available search parameters and format requirements to help you fit the tutorial search examples to specific use cases you are interested in.
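As a rough illustration of adapting a search to your own use case, the sketch below builds a keyword search URL without sending any request. The base URL and the parameter names `keywords`, `pageSize`, and `isPublic` are assumptions made for this example, and `build_search_url` is a hypothetical helper; check the technical documentation for the parameters and formats the Dataset API actually accepts.

```python
from urllib.parse import urlencode

# Assumed base URL for dataset searches; confirm it in the API documentation.
BASE_URL = "https://api.ess-dive.lbl.gov/packages"


def build_search_url(keyword, page_size=25, public_only=True, base_url=BASE_URL):
    """Build a search URL for a keyword query (hypothetical helper).

    The parameter names used here are assumptions for illustration only.
    """
    params = {
        "keywords": keyword,
        "pageSize": page_size,
        # Booleans are sent as lowercase strings in the query string.
        "isPublic": str(public_only).lower(),
    }
    # urlencode escapes spaces and special characters in the values.
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("soil carbon"))
```

Keeping URL construction in one small function makes it easy to swap in the documented parameter names once you have confirmed them.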
Reporting Format: File Level Metadata (FLMD) v1.1.0
Version 1.1.0 (v1.1.0) of the File Level Metadata reporting format addresses usability challenges and feedback from early adopters in the community. Importantly, v1.1.0 is backward compatible with the original version (v1.0.0), and as such, the Fusion Database can read and parse file level metadata (flmd) and data dictionary (dd) files following both versions.
If you would like to suggest a change to the File Level Metadata reporting format, please submit a GitHub issue using one of the templates we provide. If you have any questions, please email ESS-DIVE support at ess-dive-support@lbl.gov.
Standardized variable naming convention to snake_case
- The original FLMD field names mixed underscores and capitalization and did not follow a specific naming convention. In v1.1.0, we have updated the field names to follow snake_case guidelines to increase consistency and harmonize across the reporting formats. For example:
- File_Name changed to file_name
- Column_or_Row_Name changed to column_or_row_name
- Spelling and underscore placement are unchanged.
- While templates and instructions have been updated to reflect this convention change, the original capitalization from v1.0.0 will still be accepted as long as spelling and use of underscores are consistent.
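If you maintain scripts that generate file level metadata, the convention change amounts to lowercasing the field names. The sketch below shows one way to do that; `normalize_field_names` is a hypothetical helper written for this example, not part of any ESS-DIVE tool.

```python
def normalize_field_names(header):
    """Convert v1.0.0-style field names to the v1.1.0 snake_case form.

    Only capitalization changes; spelling and underscores are left as-is.
    Hypothetical helper for illustration.
    """
    return [name.strip().lower() for name in header]


if __name__ == "__main__":
    # The two example field names given in the text above.
    print(normalize_field_names(["File_Name", "Column_or_Row_Name"]))
```

Because only capitalization changes, the same function leaves headers that already follow v1.1.0 untouched.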
New terms for standard field
- The standard field in the file level metadata file specifies the standard or format that a file follows. v1.1.0 adds recommended standard terms for ESS-DIVE reporting formats. Using these standardized terms makes it possible to identify which files follow an ESS-DIVE reporting format.
Removed excess optional fields
- We have removed the following optional fields from the documentation to streamline the creation of the file level metadata file:
- UTC_offset
- Contact
- Date_Start
- Date_End
- Latitude
- Longitude
- Northwest_Latitude_Coordinate
- Northwest_Longitude_Coordinate
- Southeast_Latitude_Coordinate
- Southeast_Longitude_Coordinate
- The optional missing_value_code field, which can be used to specify the codes representing missing values, has been moved to the data dictionary file so that it can be used at the variable level.
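To see how an existing file level metadata header is affected by these changes, a small script can sort its fields into those that stay, those that were removed from the documentation, and the one that moved to the data dictionary. This is a sketch under stated assumptions: `review_flmd_header` and the shape of its return value are hypothetical, and the field lists are copied from the bullets above.

```python
# Optional fields removed from the v1.1.0 documentation (from the list above),
# stored in lowercase so the comparison ignores capitalization.
REMOVED_OPTIONAL_FIELDS = {
    "utc_offset", "contact", "date_start", "date_end",
    "latitude", "longitude",
    "northwest_latitude_coordinate", "northwest_longitude_coordinate",
    "southeast_latitude_coordinate", "southeast_longitude_coordinate",
}

# Field that now belongs in the data dictionary file instead.
MOVED_TO_DATA_DICTIONARY = {"missing_value_code"}


def review_flmd_header(header):
    """Group header fields by how v1.1.0 treats them (hypothetical helper)."""
    result = {"keep": [], "removed": [], "moved_to_data_dictionary": []}
    for name in header:
        key = name.strip().lower()
        if key in REMOVED_OPTIONAL_FIELDS:
            result["removed"].append(name)
        elif key in MOVED_TO_DATA_DICTIONARY:
            result["moved_to_data_dictionary"].append(name)
        else:
            result["keep"].append(name)
    return result


if __name__ == "__main__":
    print(review_flmd_header(["File_Name", "UTC_offset", "Contact", "missing_value_code"]))
```

The function only reports on fields and does not delete anything, so you can decide how to handle each flagged column.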
More exciting upgrades!
Globus: Upload large data with ease
- Do you need faster upload times and more memory to upload large data volumes? You may be interested in using Globus!
- Globus is a free, cloud-based transfer service that you can use to move data from your desktop or cloud endpoints to ESS-DIVE.
- We encourage anyone with more than 500 GB of data, more than 100 flat files, or slow local internet speeds to use Globus.
- Get in touch with us if you would like to explore this option (ess-dive-support@lbl.gov).
- Learn more about how ESS-DIVE supports large data in our documentation.
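The guidance above can be expressed as a simple check to run before planning an upload. The sketch below is illustrative only: `should_consider_globus` is a hypothetical helper, and reading "500 GB" as decimal gigabytes (500 × 10^9 bytes) is an assumption.

```python
# Thresholds taken from the guidance above; decimal gigabytes are assumed.
GLOBUS_SIZE_THRESHOLD_BYTES = 500 * 10**9
GLOBUS_FILE_COUNT_THRESHOLD = 100


def should_consider_globus(total_bytes, file_count, slow_connection=False):
    """Return True if any of the suggested conditions for using Globus applies."""
    return (
        total_bytes > GLOBUS_SIZE_THRESHOLD_BYTES
        or file_count > GLOBUS_FILE_COUNT_THRESHOLD
        or slow_connection
    )


if __name__ == "__main__":
    # 20 GB spread over 250 files exceeds the file-count guidance.
    print(should_consider_globus(total_bytes=20 * 10**9, file_count=250))
```

A result of True is only a prompt to contact ESS-DIVE Support about Globus, not a requirement to use it.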
Tier 2: Download and browse large, hierarchical data with ease
- Tier 2 is an alternate storage service developed by ESS-DIVE to accommodate:
- complex file hierarchies that cannot be compressed,
- large file volumes, or
- large quantities of files that are too cumbersome to download from dataset landing pages.
- Get in touch with ESS-DIVE Support (ess-dive-support@lbl.gov) to begin the process of publishing data with Tier 2.
- Learn more about how ESS-DIVE supports large data in our documentation.
This service should only be used if necessary.