
# Internal C-CoMP Dataset Numbers

Each C-CoMP dataset is assigned an internal dataset number, which is used to organize C-CoMP metadata and data internally. Dataset numbers follow the format CMP### (for example, CMP004).
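As a minimal sketch, and assuming that `###` stands for exactly three digits (as in the CMP004 example used on this page), a dataset number could be checked and formatted in Python roughly like this. The function names are hypothetical and are not part of any C-CoMP tooling.

```python
import re

# Assumption: "CMP###" means the literal prefix "CMP" followed by exactly three ASCII digits.
DATASET_NUMBER_PATTERN = re.compile(r"CMP[0-9]{3}")


def is_valid_dataset_number(value: str) -> bool:
    """Return True if value looks like an internal dataset number, e.g. CMP004."""
    return DATASET_NUMBER_PATTERN.fullmatch(value) is not None


def format_dataset_number(n: int) -> str:
    """Format an integer as a zero-padded dataset number (hypothetical helper)."""
    if not 0 <= n <= 999:
        raise ValueError("dataset number must fit in three digits")
    return f"CMP{n:03d}"
```

A check like this can catch typos such as `CMP4` or `cmp004` before a number is written into file names or the Data Catalog. It does not replace requesting a number through the process described below.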

If you are a member of C-CoMP, join the #dataset\_number\_requests channel in the C-CoMP Slack workspace. To obtain a dataset number, request a new number in the channel and tag your primary C-CoMP collaborators as well as Laura Gray, the C-CoMP Digital Coordinator, in the message. For example, you could type “@Laura Gray - @name and I need a C-CoMP Dataset Number”. Laura Gray will assign you the next available dataset number (CMP###) by replying to your message. Once your number has been assigned, please record as much information as possible about your dataset (placeholders are fine) in the C-CoMP Data Catalog.

A blank template of the C-CoMP Data Catalog can be downloaded here:

{% file src="/files/QTIFrRRhQtjeDD2UIh8O" %}

## What is a dataset at C-CoMP?

At C-CoMP, a dataset is defined as a collection of data that relates back to the same original samples. Data generated using different methods or measurements are part of the same dataset if the measurements were conducted on the same samples. A subset of the data within a larger dataset can also be accessed and analyzed individually.

For example, suppose seawater was collected from different depths during a CTD cast. These seawater samples were processed for untargeted proteomics and metabolomics as well as shotgun metagenomic sequencing. Nutrient concentrations were also measured. According to our definition, all of these datastreams can be combined to create a comprehensive dataset that describes the chemical and biological properties of seawater across different depths. Parts of the dataset, such as the untargeted proteomics data, can also be accessed and analyzed individually. In this example, all of the data will be assigned the same internal C-CoMP dataset number (e.g. CMP004; located at the beginning of each file name) to facilitate collaboration, data integration, indexing, and tracking efforts. The file names for each combination of sample and datastream will differ according to the instructions outlined in [File Naming Conventions](/data-management-handbook/file-naming-conventions/lc-ms-metabolomics.md).
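Because the dataset number sits at the beginning of each file name, files from different datastreams can be grouped back into one dataset by that prefix. The following is a rough sketch only: it assumes the prefix is `CMP` followed by three digits, and the function name and the file names in the usage example are hypothetical and do not follow the actual File Naming Conventions.

```python
import re
from collections import defaultdict

# Assumption: every file name starts with "CMP" plus exactly three digits.
PREFIX = re.compile(r"^(CMP[0-9]{3})")


def group_by_dataset_number(file_names):
    """Map each dataset number to the file names that start with it."""
    groups = defaultdict(list)
    for name in file_names:
        match = PREFIX.match(name)
        # Files without a recognizable prefix are collected under None.
        key = match.group(1) if match else None
        groups[key].append(name)
    return dict(groups)


# Hypothetical file names, for illustration only.
files = ["CMP004_proteomics_a.raw", "CMP004_nutrients.csv", "CMP005_metagenome.fastq"]
for dataset_number, names in group_by_dataset_number(files).items():
    print(dataset_number, names)
```

Collecting unprefixed files under `None` instead of raising an error makes it easier to spot files that still need a dataset number.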

## What is a dataset at BCO-DMO?

When metadata and tabular data are submitted to BCO-DMO, datasets that include multiple datastreams are divided and submitted as separate datasets by datastream. In the example above, individual dataset landing pages would be created under the same BCO-DMO project for the untargeted proteomics, untargeted metabolomics, metagenomic sequencing, and nutrient measurements. Laura Gray will work with you to organize these submissions. Internal C-CoMP dataset numbers will not be referenced on BCO-DMO dataset landing pages.

