File Naming and Data Deposition Example

Continuing the example used above, seawater was collected from several depths at field station ST28 over 14 days during the KT-235 cruise. Seawater samples were processed for untargeted proteomics, targeted and untargeted metabolomics (positive and negative ionization modes), and shotgun metagenomic sequencing. Water for nutrient (organic and inorganic) measurements was also collected, and CTD casts were conducted during each collection. The internal C-CoMP dataset_no is CMP002.

For all data streams except proteomics, the numbers in the file names below are file numbers, not unique sample IDs. This choice is intentional: it avoids misnamed files, accounts for quality control samples, speeds up file naming, and accommodates replicates and samples that have to be re-analyzed. Sample IDs (or equivalent identifiers) should still be assigned across datasets so that samples can be linked between data streams, but they are recorded in the metadata tables rather than in the file names.

Proteomics file names are the exception: they contain Sample IDs because of existing lab procedures for this data stream.
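The distinction between file numbers and Sample IDs can be checked programmatically. The following is a minimal sketch, assuming the mzML names below follow an underscore-delimited layout of dataset_no, a six-digit date, an instrument label, a method label, and a trailing number or Sample ID. The field names (`instrument`, `method`, `suffix`) and the function `parse_mzml_name` are illustrative labels, not official C-CoMP terms.

```python
import re

# Assumed layout: <dataset_no>_<six-digit date>_<instrument>_<method>_<suffix>.mzML
# Group names are hypothetical labels chosen for this sketch.
MZML_NAME = re.compile(
    r"^(?P<dataset_no>CMP\d{3})_"
    r"(?P<date>\d{6})_"
    r"(?P<instrument>[A-Za-z0-9]+)_"
    r"(?P<method>[A-Za-z0-9]+)_"
    r"(?P<suffix>[A-Za-z]*\d+)\.mzML$"
)


def parse_mzml_name(file_name):
    """Split an mzML file name into its underscore-delimited parts."""
    match = MZML_NAME.match(file_name)
    if match is None:
        raise ValueError(f"unexpected file name: {file_name!r}")
    parts = match.groupdict()
    # A purely numeric suffix is treated as a file number;
    # a lettered one (e.g. SA20) is treated as a Sample ID.
    if parts["suffix"].isdigit():
        parts["suffix_kind"] = "file_number"
    else:
        parts["suffix_kind"] = "sample_id"
    return parts
```

Under these assumptions, `parse_mzml_name("CMP002_072122_Altis_RP_45.mzML")` reports a file number, while `parse_mzml_name("CMP002_072922_QE_2DDDA_SA20.mzML")` reports a Sample ID.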

Proteomics files → ProteomeXchange, linked to BCO-DMO and the Ocean Protein Portal (OPP)

File name

CMP002_072922_QE_2DDDA_SA20.mzML

CMP002_072922_QE_2DDDA_SA21.mzML

CMP002_072922_QE_2DDDA_SA22.mzML

Targeted metabolomics (LC-MS) → MetaboLights, linked to BCO-DMO

File name                          Ion mode
CMP002_072122_Altis_RP_45.mzML     positive ion mode
CMP002_072122_Altis_RP_46.mzML     positive ion mode
CMP002_072122_Altis_RP_47.mzML     positive ion mode
CMP002_072122_Altis_RP_90.mzML     negative ion mode
CMP002_072122_Altis_RP_91.mzML     negative ion mode
CMP002_072122_Altis_RP_92.mzML     negative ion mode

Untargeted metabolomics (LC-MS) → MetaboLights, linked to BCO-DMO

File name                          Ion mode
CMP002_072122_Lumos_RP_45.mzML     positive ion mode
CMP002_072122_Lumos_RP_46.mzML     positive ion mode
CMP002_072122_Lumos_RP_47.mzML     positive ion mode
CMP002_072122_Lumos_RP_90.mzML     negative ion mode
CMP002_072122_Lumos_RP_91.mzML     negative ion mode
CMP002_072122_Lumos_RP_92.mzML     negative ion mode

Whole metagenomic sequencing → NCBI SRA, linked to BCO-DMO

File name                            Read direction
CMP002_WMX_SA_20_R1_001.fastq.gz     Forward
CMP002_WMX_SA_20_R2_001.fastq.gz     Reverse
CMP002_WMX_SA_21_R1_001.fastq.gz     Forward
CMP002_WMX_SA_21_R2_001.fastq.gz     Reverse
CMP002_WMX_SA_22_R1_001.fastq.gz     Forward
CMP002_WMX_SA_22_R2_001.fastq.gz     Reverse
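Because each sequenced library should be deposited with both a forward (R1) and a reverse (R2) file, it can be useful to confirm that the pairs are complete before upload. The following is a minimal sketch, assuming the `CMP002_WMX_SA_<number>_R1/R2_001.fastq.gz` pattern shown in this table. The function name `find_missing_reads` is hypothetical, and the check does not detect a file name that is listed twice.

```python
import re
from collections import defaultdict

# Assumed pattern, taken from the example names in this section.
FASTQ_NAME = re.compile(
    r"^(?P<dataset_no>CMP\d{3})_WMX_(?P<sample>SA_\d+)_(?P<read>R[12])_001\.fastq\.gz$"
)


def find_missing_reads(file_names):
    """Return {sample label: list of missing read labels} for incomplete pairs."""
    seen = defaultdict(set)
    for name in file_names:
        match = FASTQ_NAME.match(name)
        if match is None:
            raise ValueError(f"unexpected file name: {name!r}")
        seen[match.group("sample")].add(match.group("read"))
    # Keep only the labels that lack a forward or a reverse file.
    return {
        sample: sorted({"R1", "R2"} - reads)
        for sample, reads in seen.items()
        if reads != {"R1", "R2"}
    }
```

An empty result means every label in the list has both read directions; otherwise the result names the label and the read file that is missing.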

Metadata (CTD and nutrient concentrations) → BCO-DMO

CMP002_KT235_ST28_CTD_nutrients.txt

Spreadsheet with one row per Sample_ID (a unique combination of depth × day) and columns for time, temperature, salinity, TOC, NO3, NO2+NO3, etc. If possible, include columns that link the individual biosample accession numbers (shotgun metagenomics) and the data file names back to the Sample_ID and other metadata.
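One way to record the link between file names and Sample_IDs is a small tab-delimited table kept alongside the metadata. The following is a minimal sketch only: the column names `data_stream` and `file_name`, the Sample_ID values `S001` and `S002`, and both helper functions are hypothetical placeholders, not part of the C-CoMP specification.

```python
import csv
import io

# Hypothetical linking rows: Sample_ID values and column names are placeholders.
LINK_ROWS = [
    {"Sample_ID": "S001", "data_stream": "metagenomics",
     "file_name": "CMP002_WMX_SA_20_R1_001.fastq.gz"},
    {"Sample_ID": "S001", "data_stream": "targeted_metabolomics",
     "file_name": "CMP002_072122_Altis_RP_45.mzML"},
    {"Sample_ID": "S002", "data_stream": "targeted_metabolomics",
     "file_name": "CMP002_072122_Altis_RP_46.mzML"},
]


def files_by_sample(rows):
    """Group file names under the Sample_ID they belong to."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["Sample_ID"], []).append(row["file_name"])
    return grouped


def to_tsv(rows):
    """Serialize the linking rows as tab-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["Sample_ID", "data_stream", "file_name"],
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Keeping the link in a table rather than in the file names lets a re-analyzed sample receive a new file number while still mapping to the same Sample_ID.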
