This guide describes each step needed to prepare, format, and submit a dataset to ESS-DIVE. Before you begin this process, register as a data contributor and review ESS-DIVE’s contributor terms of use. Please see our Guide to Using ESS-DIVE for more detailed information.
If you are interested in using standardized Reporting Formats in your dataset, please visit our Data Reporting Formats page.
1. Choose and organize data
Components of a Dataset
Organizing Project Data into Datasets
2. Describe and format your dataset
Requirements for describing your dataset
Reporting formats
Author Guidelines
3. Upload data
Data Submission Web Form
4. Submit and review your dataset
Submit your dataset
Collaborate on datasets
Link to external data sources
Review automated reports
Request publication
1. Choose and organize your data
Components of a Dataset
A dataset includes both data files and accompanying metadata that describes the context, contents, and purpose of the data collected. A published dataset can be cited and is issued a Digital Object Identifier (DOI), providing a permanent link to the dataset.
Organize Project Data into Datasets
Your project may have a variety of data collected by multiple authors that needs to be organized and published in multiple different datasets. Some common ways to organize project data into datasets include:
- Author contributions: Datasets can be organized by the level of contributor effort for each portion of the data, since this may change the order of the author list and the citation.
- Data type: Particular data types from a project – e.g. continuously generated sensor data, sample data, data synthesis product – can be organized into separate datasets.
- Data related to a publication: Raw or processed data related to a publication can be organized into a single dataset for easy citation of data.
- Field campaign, time period, or season: Data from the same field campaign, time period, or season may be packaged together so that it can be viewed together.
For more guidance on how to organize your data into datasets, visit our Get Started documentation.
2. Describe and Format your Dataset
Metadata Requirements
ESS-DIVE’s dataset metadata requirements allow you to fully describe your dataset so that others can more easily find and use relevant data from dataset searches. Metadata for each dataset submitted should meet the guidelines listed below, and metadata completeness will be assessed during the dataset publication process using both automated and manual review workflows. Ensuring that your dataset has complete metadata before requesting publication will expedite the publication process.
The following checklist contains general descriptions and assessment criteria for each dataset metadata field. Additional information is available on our Dataset Requirements page. An Offline Metadata Google document can be copied and used to collaborate with team members on your dataset metadata before submission.
Overview
- Dataset Title: A brief title of 7 to 20 words that contains relevant information such as the topic, geographic location, dates, and scale of the data.
- Existing DOI(s) and Alternate Identifier(s): If this dataset has been previously published elsewhere, enter the DOI or alternate identifier. Identifiers are used to locate the dataset within your project’s data management system and can provide pertinent contextual information for users. Enter as many identifiers as needed.
- Abstract: The abstract should be at least 100 words long, written in full sentences, and understandable to someone who has not seen related manuscripts. Include a statement about why these data were generated and the research question they are intended to answer. A good abstract gives users enough information to determine whether the data are useful for their needs.
- Keywords: Add a minimum of three keywords or data variables, choosing from the GCMD controlled vocabulary where possible. Ensure that these terms differ from words in the title to increase the findability of your dataset in searches.
- Data Variables: Include the measurement variables present in your dataset. As with keywords, you can choose from the GCMD controlled vocabulary terms or enter your own.
- Publication Date: If you would like to specify a custom date or year when the dataset can be made publicly available, enter a date in YYYY-MM-DD or YYYY format. If left blank, this field will default to the current date.
- Data Usage Rights: Choose how you wish your data to be shared and reused by picking one of the available options. Usage rights for the metadata will always be Creative Commons Public Domain.
- Project Affiliation: Enter the name of the DOE project to associate with this dataset. The project is cited as the publisher of this dataset, in addition to ESS-DIVE. If multiple projects were involved, enter the project that made the largest contribution to this dataset. Choose from the drop-down list, which will appear as you start typing. If you are contributing data for an ESS project and it doesn’t appear in the list, contact ess-dive-support@lbl.gov.
- Funding Organizations: List the organizations that funded the work (in most cases “U.S. DOE > Office of Science > Biological and Environmental Research (BER)” should be selected). You can use the autocomplete feature to choose from existing funding organizations.
- DOE Contracts: If applicable, list the numbers of any DOE contracts under which the data in your dataset were funded. You can use the autocomplete feature to choose from existing DOE contracts.
- Related References: Add full citations (including DOIs, if available) for publications or datasets associated with the dataset you are creating.
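Several of the Overview criteria above are simple enough to check locally before you submit. The following is a minimal sketch, not ESS-DIVE's own assessment code: the function name `check_overview` is hypothetical, the thresholds (7 to 20 title words, at least 100 abstract words, at least three keywords) come from the checklist above, and counting words by splitting on whitespace is an assumption that may differ from how ESS-DIVE counts them.

```python
import re


def check_overview(title: str, abstract: str, keywords: list[str]) -> list[str]:
    """Return a list of problems found; an empty list means every check passed."""
    problems = []

    # Title: 7 to 20 words (words are counted by whitespace, an assumption).
    title_words = len(title.split())
    if not 7 <= title_words <= 20:
        problems.append(f"title has {title_words} words; expected 7 to 20")

    # Abstract: at least 100 words.
    abstract_words = len(abstract.split())
    if abstract_words < 100:
        problems.append(f"abstract has {abstract_words} words; expected at least 100")

    # Keywords: at least three distinct, non-empty terms.
    unique_keywords = {k.strip().lower() for k in keywords if k.strip()}
    if len(unique_keywords) < 3:
        problems.append(f"found {len(unique_keywords)} keywords; expected at least 3")

    # Keywords that simply repeat a title word add little to findability.
    title_terms = {w.lower() for w in re.findall(r"[A-Za-z0-9]+", title)}
    repeated = sorted(k for k in unique_keywords if k in title_terms)
    if repeated:
        problems.append("keywords repeat title words: " + ", ".join(repeated))

    return problems
```

Calling `check_overview(title, abstract, keywords)` on a draft returns the list of problems, so an empty list suggests these particular fields are ready for the automated and manual review.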
People
- Dataset Contact = Single person who should be listed as the primary contact for the dataset for the purposes of the DOI or for users seeking further information about the data. This person may or may not be one of the dataset authors. A valid ORCID is required for the dataset contact.
- Dataset Creators = The main researchers involved in producing the data who should be listed in the citation. The order that creators are added to the dataset metadata will be the order that they appear in the dataset citation. Creators may be dataset authors, owners, originators, or principal investigators. Valid ORCIDs are strongly recommended for dataset creators.
- Dataset Contributors = Any additional contributors involved in producing the data. These may include people who assisted in creating the dataset, but should not be considered authors for publication. Dataset contributors will not be included in the dataset citation. Valid ORCIDs are recommended, but not required, for dataset contributors.
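Because a valid ORCID is required for the dataset contact, it can help to catch typos before entering identifiers in the form. The sketch below checks only that an ORCID iD is well formed: four hyphen-separated groups of four characters, with a final check character computed using the ISO 7064 MOD 11-2 scheme that ORCID uses. The function names are hypothetical, and this check cannot confirm that an iD is registered or belongs to the right person; ESS-DIVE's own validation may do more.

```python
import re

# Four groups of four characters; only the last character may be "X".
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed_orcid(orcid: str) -> bool:
    """Return True if the iD has the right shape and a matching check character."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]
```

For example, `is_well_formed_orcid("0000-0002-1825-0097")` returns `True`, while changing the final digit to `8` makes it return `False`.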
Dates
- Start Date: Enter the earliest date in your dataset in ISO format (YYYY-MM-DD).
- End Date: Enter the last date in your dataset in ISO format (YYYY-MM-DD).
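A quick local check can confirm that both dates use the YYYY-MM-DD format and that the end date does not precede the start date. This is a minimal sketch with hypothetical function names; requiring the end date to be on or after the start date is an assumption implied by "earliest" and "last", not a rule stated by ESS-DIVE.

```python
import re
from datetime import date

# Require zero-padded YYYY-MM-DD exactly, e.g. 2019-06-01 rather than 2019-6-1.
ISO_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def parse_iso_date(text: str) -> date:
    """Parse a YYYY-MM-DD string, raising ValueError if it is malformed."""
    if not ISO_DATE.fullmatch(text):
        raise ValueError(f"{text!r} is not in YYYY-MM-DD format")
    # fromisoformat also rejects impossible dates such as 2019-02-30.
    return date.fromisoformat(text)


def check_date_range(start: str, end: str) -> tuple[date, date]:
    """Return (start, end) as dates, raising ValueError if end precedes start."""
    start_date = parse_iso_date(start)
    end_date = parse_iso_date(end)
    if end_date < start_date:
        raise ValueError(f"end date {end} is earlier than start date {start}")
    return start_date, end_date
```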
Locations
- Description: Include a short description of the location(s) where data was collected. This may include the location name, known identifiers if associated with a specific project, and ecosystem type involved.
- Bounding Box Coordinates: Latitude and longitude of the location(s) your data represents, in WGS84 decimal format. If the data is better represented by a shape, you can include a KML file in the file uploads.
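Bounding box coordinates in decimal degrees can be sanity-checked before entry. The sketch below assumes the box is described by its north, south, east, and west edges; those field names and the `BoundingBox` class are hypothetical rather than taken from the ESS-DIVE form. East is deliberately not required to exceed west, since a box that crosses the antimeridian legitimately has a smaller east value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """A bounding box in WGS84 decimal degrees (field names are illustrative)."""

    north: float
    south: float
    east: float
    west: float

    def problems(self) -> list[str]:
        """Return a list of problems found; an empty list means the box looks valid."""
        found = []
        for name in ("north", "south"):
            value = getattr(self, name)
            if not -90.0 <= value <= 90.0:
                found.append(f"{name} latitude {value} is outside -90 to 90")
        for name in ("east", "west"):
            value = getattr(self, name)
            if not -180.0 <= value <= 180.0:
                found.append(f"{name} longitude {value} is outside -180 to 180")
        # A single point (north == south) is allowed; an inverted box is not.
        if self.north < self.south:
            found.append("north latitude is below south latitude")
        return found
```

A common mistake this catches is swapped latitude and longitude, which often pushes a latitude value outside the range of -90 to 90.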
Methods
- Methods: Provide information about the methods employed in collecting or generating the data included in your dataset. The dataset methods should be thorough enough for your work to be reproduced. You may provide a citation for any related methods that have been previously published, but we strongly recommend still including methods text in your dataset metadata that fully describes the data collection and processing steps.
ESS-DIVE data reporting formats
When organizing your data files, consider using the metadata and formatting guidelines and templates developed by ESS-DIVE and our community partners, called reporting formats. Each reporting format was developed with extensive input from relevant experts across the Environmental System Science community. Such consistent data formatting and descriptions will enable both machines and humans to more readily understand and reuse valuable data both now and in the future. Planned ESS-DIVE tools for advanced data search, integration, and visualization will be based around these reporting formats.
Eleven reporting formats are available for use:
- Dataset metadata (required to publish on ESS-DIVE)
- File level metadata for all data types
- CSV Reporting Format
- Sample ID and Metadata
- Soil Respiration
- Leaf-level Gas Exchange
- Hydrologic Monitoring
- Water and Soil Chemistry
- Amplicon Sequencing
- Model Data
- Location Metadata
Learn more about reporting formats and how to use them in our Reporting Format Guide or visit our Community space on GitHub. Coming soon – Checklist for Describing and Formatting Dataset.
Author Guidelines
Similar to journal publication authors, your dataset creators/authors should include the main researchers involved in producing the data (see below for more details). ESS-DIVE uses ORCIDs, globally unique identifiers for researchers, to clearly identify dataset authors and contacts and their associated contact information. Because contact information often changes over time and can be updated through the ORCID record, this ensures that data users can still reach specific dataset contacts and creators.
3. Upload Data
When you are ready to submit a dataset, you can choose one of three methods, depending on the volume and number of files: the online submission form, the API, or the high bandwidth upload service.
The ESS-DIVE Online Data Submission Form is the easiest way to describe your data, upload files, and submit your dataset for publication. For bulk, programmatic submissions of multiple datasets, it is more convenient to upload files and submit your datasets through our API using R, Java, or Python; our tutorial documentation includes example scripts to provide metadata and upload files. For questions regarding the high bandwidth upload service, please contact support at ess-dive-support@lbl.gov.
Data Submission Web Form
Selecting “Submit Data” on the ESS-DIVE homepage will bring you to the Data Submission web form, where you can add your data files and complete associated metadata fields. Each field has a short description and an example. Fields that are required to submit your dataset are indicated with red asterisks.
4. Submit and Review Your Dataset
Submit your Dataset
Once you have completed the minimum required fields in the dataset submission form, submit your dataset to ESS-DIVE. The submit function saves your dataset; it does not make your dataset public or move it into the review process.
After clicking Submit Dataset, you will have the option to continue editing your saved dataset. If you are finished making changes, select the I’m Done option.
OPTIONAL FEATURES
Collaborate on datasets
Registered Data Contributors can share their datasets with registered team members to collaborate on data publications directly on ESS-DIVE.
When you share a dataset with someone, they will be able to access all dataset metadata and data files. Dataset permissions can be changed at any time. There are three permission types to choose from when granting share access to a registered team member*:
- View = This team member can read the dataset when it’s private. They cannot edit or share the dataset. This team member does not need to register, but does need to log in to ESS-DIVE. The dataset can be found on the search page when they log in to ESS-DIVE.
- Edit = This registered* team member can read the dataset when it’s private, change the dataset metadata, add/remove files, publish the dataset, and edit the dataset after it is published. They cannot share the dataset with others. The dataset can be found in their “My Data” list.
- Manage = This registered* team member has all the same access abilities as an editor, and they can share the dataset with other people or remove people from the dataset permissions. The dataset can be found in their “My Data” list.
*This person has completed the New Data Contributor form.
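The three permission levels above can be summarized as a small lookup table. This is an illustrative model only, not how ESS-DIVE implements permissions: the `Permission` enum, the action names (`read`, `edit_metadata`, `change_files`, `publish`, `share`), and the `can` helper are all hypothetical labels for the abilities described in the list.

```python
from enum import Enum


class Permission(Enum):
    VIEW = "view"
    EDIT = "edit"
    MANAGE = "manage"


# Hypothetical action names summarizing the abilities described for each level.
_EDIT_ACTIONS = frozenset({"read", "edit_metadata", "change_files", "publish"})

ALLOWED_ACTIONS = {
    Permission.VIEW: frozenset({"read"}),
    Permission.EDIT: _EDIT_ACTIONS,
    # Manage includes everything an editor can do, plus changing who has access.
    Permission.MANAGE: _EDIT_ACTIONS | {"share"},
}


def can(permission: Permission, action: str) -> bool:
    """Return True if the given permission level allows the action."""
    return action in ALLOWED_ACTIONS[permission]
```

For instance, `can(Permission.EDIT, "share")` is `False`, which reflects that only team members with Manage access can share a dataset or remove people from its permissions.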
Link to External Data Sources
You can link your ESS-DIVE dataset to data files and metadata that have already been published on another established data repository, or to data files that cannot be uploaded to ESS-DIVE. This enables data to be stored where it makes the most sense scientifically and practically, while still following the ESS Data Management and Sharing Policy to store searchable metadata on ESS-DIVE. Visit our guide for more information and instructions on how to create external links.
Review your Dataset with Automated Assessment Reports
After submitting your dataset, the “Assessment Report” button will become available. This report includes the results of a set of automated quality checks that help make your dataset findable, accessible, interoperable and reusable. These checks evaluate the presence of key fields, determine file types, and check that URLs included in metadata are resolvable, among other things. For a full list of the automated checks, please visit our Dataset Requirements page.
The automated checks begin once a dataset is submitted, and failed checks should be addressed by the dataset submitter before requesting publication. In combination with content-related manual checks, assessment reports are used by the ESS-DIVE publication team to review datasets before approving them for publication.
Please note that assessment reports can take anywhere from a few minutes to 24 hours to generate. Additional data management and curation, such as following ESS-DIVE reporting formats, is needed to make your data more reusable and interoperable.
Request Publication
When you are finished making revisions to your dataset, you can request publication. See our Publication documentation for instructions on how to make a request.
Your dataset will be reviewed by our publication team, and any necessary revisions will be requested over email. Metadata criteria used in ESS-DIVE’s manual review process are outlined in the “Metadata Requirements” section above, and full details can be found in our Dataset Requirements documentation. Once all requested revisions are completed, your dataset will be published.
Revisions can be made to published datasets at any time. Changes to dataset metadata will become live immediately, while editing dataset files will require approval from the ESS-DIVE publication team before becoming publicly available.