Sequencing Files

This file name refers to the forward reads (R1) of a biological sample, file number 8, generated by whole metagenomic sequencing in dataset CMP002.
Please use the file naming template for raw sequences as a guide when creating file names.
CCoMP_Sequence_File_Names.xlsx

Field Definitions

Each number in the example above corresponds to a field in the file name. Fields are separated by ‘_’ to enhance computer readability. Shortened column names used in the template above are provided in parentheses next to the appropriate field definition.
  1. Dataset Number (Dataset_No):
    • All C-CoMP datasets will be assigned an internal dataset number. Please request this number on the #dataset_number_requests Slack channel following the instructions provided above.
    • Metadata about the dataset (including Dataset number, method type, and data storage location) will be recorded in the C-CoMP Data Catalog.
  2. Approach (Approach):
    • The sequencing method used for this specific project and sample combination (see examples and abbreviations below).
  3. Sample type (Sample_Type):
    • Use this field to distinguish sample types. Sample type should fall into one of these categories: quality control (QC) or biological sample (SA). QC includes samples run as DNA extraction or sequencing controls to check for contamination during sample preparation.
  4. File number (File_No)
  5. Forward or Reverse Reads (Forward_Reverse):
    • Either the forward reads (R1) or reverse reads (R2) if applicable to the file type. Use ‘noR’ if this is not applicable.
  6. Sequencing number (Seq_no):
    • Used when the same sample and data type have been sequenced more than once. Change this field only if the sample is a technical replicate. A biological replicate, or a sample from a separate extraction process, is assigned a different sample_ID instead. The default number is 001.
  7. File-type extension
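As an illustration of how the seven fields fit together, the following Python sketch composes a file name from its parts. It is not an official C-CoMP tool. The function name `build_file_name` is hypothetical, and the sketch assumes that the first six fields are joined with ‘_’ and that the file-type extension is appended after a ‘.’. The file number `008` and the extension `fastq` in the example are illustrative values only; the guide does not prescribe a padding width or an extension.

```python
def build_file_name(dataset_no, approach, sample_type, file_no, extension,
                    forward_reverse="noR", seq_no="001"):
    """Compose a file name from the seven fields, in the order listed above.

    Assumptions (not confirmed by the guide): the first six fields are joined
    with '_' and the file-type extension is appended after a '.'.
    """
    fields = [dataset_no, approach, sample_type, file_no, forward_reverse, seq_no]
    for field in fields:
        # An empty field or a stray '_' would shift every later field.
        if not field or "_" in field:
            raise ValueError(f"invalid field value: {field!r}")
    return "_".join(fields) + "." + extension.lstrip(".")


# Hypothetical example: '008' and 'fastq' are illustrative values only.
print(build_file_name("CMP002", "WMX", "SA", "008", "fastq", forward_reverse="R1"))
# prints CMP002_WMX_SA_008_R1_001.fastq
```

The defaults follow the definitions above: ‘noR’ when read direction does not apply, and 001 for the sequencing number.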
Approach Abbreviations:
  • WMX - Whole Metagenomic Sequencing (environmental metagenomics)
  • WQX - Whole Genome Sequencing
  • AMP - Amplicon Sequencing (e.g. 16S rRNA)
  • TXX - Transcriptomics
  • MTX - Environmental Metatranscriptomics
Sample Type Abbreviations:
  • QC - Quality Control
  • SA - Biological Sample
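
The abbreviation lists above lend themselves to an automated check before files are uploaded. The Python sketch below splits a file name into its fields and rejects abbreviations that are not listed on this page. It is only a sketch: the function name `parse_file_name` and the dictionary keys are hypothetical, and it assumes that the first ‘.’ in the name starts the file-type extension and that the six preceding fields are separated by ‘_’. The example file name is illustrative.

```python
# Abbreviations copied from the lists above.
APPROACHES = {"WMX", "WQX", "AMP", "TXX", "MTX"}
SAMPLE_TYPES = {"QC", "SA"}
READ_DIRECTIONS = {"R1", "R2", "noR"}


def parse_file_name(name):
    """Split a file name into its fields and check the abbreviations.

    Assumption: the first '.' starts the extension, so 'fastq.gz' is kept whole.
    """
    stem, dot, extension = name.partition(".")
    parts = stem.split("_")
    if not dot or not extension or len(parts) != 6:
        raise ValueError(f"expected six '_'-separated fields plus an extension: {name!r}")
    dataset_no, approach, sample_type, file_no, forward_reverse, seq_no = parts
    if approach not in APPROACHES:
        raise ValueError(f"unknown approach abbreviation: {approach!r}")
    if sample_type not in SAMPLE_TYPES:
        raise ValueError(f"unknown sample type abbreviation: {sample_type!r}")
    if forward_reverse not in READ_DIRECTIONS:
        raise ValueError(f"expected R1, R2, or noR, got: {forward_reverse!r}")
    return {
        "dataset_no": dataset_no,
        "approach": approach,
        "sample_type": sample_type,
        "file_no": file_no,
        "forward_reverse": forward_reverse,
        "seq_no": seq_no,
        "extension": extension,
    }


# Hypothetical file name, used for illustration only.
fields = parse_file_name("CMP002_WMX_SA_008_R1_001.fastq.gz")
print(fields["approach"], fields["sample_type"], fields["extension"])
# prints WMX SA fastq.gz
```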