
Overview of Files in GRITS Project

Required Objects/Files for GRITS Project

1) DataHeader.java -> DataHeader.xml

Contains:
a) Method object (Method.java)

2) Method.java -> settings.xml

3) Data.java -> data.xml

Contains:
a) DataHeader object (DataHeader.java)
b) Method object (Method.java)
c) ScanFeatures (HashMap<Integer,ScanFeatures>, key is scan number, written as <scan_num>.xml)
d) Scans (HashMap<Integer,Scan>, key is scan number; the peaks in this object are written in <scan_num>.xml)

4) One or more ScanFeatures.java -> <scan_number>.xml
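The containment relationships listed above can be sketched in Java as follows. This is only an illustration: the class bodies are empty stand-ins, the field names are assumptions, and the helper scanFileName is hypothetical. Only the class names, the HashMap key type, and the <scan_num>.xml naming come from this page.

```java
import java.util.HashMap;
import java.util.Map;

// Empty stand-ins: the real GRITS classes have fields that this page does not show.
class Method { }

class Scan { }

class ScanFeatures { }

class DataHeader {
    Method method;   // 1a) a DataHeader contains a Method object (field name assumed)
}

class Data {
    DataHeader dataHeader;                                      // 3a)
    Method method;                                              // 3b)
    Map<Integer, ScanFeatures> scanFeatures = new HashMap<>();  // 3c) key is scan number
    Map<Integer, Scan> scans = new HashMap<>();                 // 3d) key is scan number
}

class FileNames {
    // Hypothetical helper: the per-scan file name, e.g. "1.xml" for scan 1.
    static String scanFileName(int scanNumber) {
        return scanNumber + ".xml";
    }
}
```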

Writing the Project Files

Through testing, we determined that GELATO performs best when it first creates temporary annotation files for each annotated glycan. For this reason, the object ScansAnnotation.java was created. During annotation, Annotation objects are instantiated for every structure that matches a precursor MS/MS scan, or any MSn scan that also has subscans, and are stored in a List in the Data object. Likewise, ScansAnnotation objects are instantiated for all annotated structures. A file for each annotated structure is written into a temp folder and named after the glycan (or fragment) structure (the annotation string ID).
Here is an example of the temporary GlycanScansAnnotation file for "GOG142":

In the image you can see that the outermost tag is specific to this particular glycan (GOG142) and that it contains the scan(s) that match the annotation (e.g. scanId="1").

Once all glycan-centric files are created, GELATO creates the scan-based files by iterating over all glycan/fragment annotations (the List of Annotation objects in Data) and populating a ScanFeatures object for each scan. The ScanFeatures objects are added to the Data object. The final files are then written into the GRITS project folder.
Here is an example of the data.xml:

In the image you can see that the outermost tag is the Data object, followed by all of the annotated structures in the project (shown is the fragment structure "GOG428-2").

Here is an example of the ScanFeatures xml file for a scan (1.xml for scan 1):

In the image you can see that the outermost tag is the ScanFeatures object. It contains first the peak list of the scan and then the features that map to the annotated structures in the data.xml file (e.g. annId="1" corresponds to the GlycanAnnotation with annId="1" in the data.xml file).
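The step from glycan-centric annotations to scan-based ScanFeatures can be sketched as a simple inversion of a list into a map keyed by scan number. This is a minimal sketch, not GELATO's actual code: AnnotationSketch, ScanFeaturesSketch, and ScanCentricBuilder are hypothetical stand-ins, and the real objects carry peaks and feature details that are omitted here. Only the annId link between a scan file and data.xml is taken from this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for one annotated structure and the scans it matched.
class AnnotationSketch {
    final int annId;
    final String structureId;
    final List<Integer> scanNumbers;

    AnnotationSketch(int annId, String structureId, List<Integer> scanNumbers) {
        this.annId = annId;
        this.structureId = structureId;
        this.scanNumbers = scanNumbers;
    }
}

// Hypothetical stand-in for ScanFeatures: keeps only the annIds that map to this scan.
class ScanFeaturesSketch {
    final int scanNumber;
    final List<Integer> annIds = new ArrayList<>();

    ScanFeaturesSketch(int scanNumber) {
        this.scanNumber = scanNumber;
    }
}

class ScanCentricBuilder {
    // Iterates over all annotations and populates one ScanFeaturesSketch per scan.
    static Map<Integer, ScanFeaturesSketch> build(List<AnnotationSketch> annotations) {
        Map<Integer, ScanFeaturesSketch> byScan = new TreeMap<>();
        for (AnnotationSketch ann : annotations) {
            for (int scan : ann.scanNumbers) {
                // Create the per-scan object on first use, then record the annotation id.
                byScan.computeIfAbsent(scan, ScanFeaturesSketch::new).annIds.add(ann.annId);
            }
        }
        return byScan;
    }
}
```

Each entry of the resulting map corresponds to one <scan_num>.xml file, and each stored annId points back to an annotation in data.xml.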

Please see the attached XML fragments for more information.

Support for Different MS Types

For non-LC-MS/MS data (direct infusion, MS Profile, and TIM) the data are organized as follows:

1) In the GRITS project annotation folder, a single archive (.zip) is created that contains all files:

a) data.xml, dataHeader.xml, settings.xml, and an XML file for each scan in the project.

* Note that the first MS1 scan XML file (e.g. 1.xml) is the MS1 "overview" file containing all precursor peaks.
* For TIM data (no MS1 scan), a simulated MS1 "overview" file is created (0.xml).
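The single-archive layout for non-LC-MS/MS data can be sketched with the standard java.util.zip classes. This is a hedged illustration and not the GRITS writer: SingleArchiveSketch and its methods are hypothetical, the XML content is passed in as plain strings, and the archive is built in memory rather than in the project folder.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class SingleArchiveSketch {
    // Name of the MS1 "overview" file: the first MS1 scan number, or 0 when
    // there is no MS1 scan (TIM data). Pass null for "no MS1 scan".
    static String overviewFileName(Integer firstMs1Scan) {
        return (firstMs1Scan == null ? 0 : firstMs1Scan) + ".xml";
    }

    // Packs every entry (file name -> XML text) into one in-memory .zip archive.
    static byte[] zip(Map<String, String> files) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, String> file : files.entrySet()) {
                zos.putNextEntry(new ZipEntry(file.getKey()));
                zos.write(file.getValue().getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    // Lists the entry names of an archive, in stored order.
    static List<String> entryNames(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry entry = zis.getNextEntry(); entry != null; entry = zis.getNextEntry()) {
                names.add(entry.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }
}
```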

For LC-MS/MS data, the data are organized as follows:

1) A .zip file (e.g. 8014.zip) at the root of the annotation folder containing the data.xml, dataHeader.xml, and settings.xml files
2) A folder with the name of the entry (e.g. 8014) which contains a compressed file (.zip) for every MS1 scan (e.g. 1.zip, 12.zip, etc.)
3) Each individual MS1 archive (e.g. 1.zip) contains the xml file for the MS1 scan, all sub-scans, and the data.xml for that MS1 scan

a) The individual MS1 archives are structured like the direct infusion archives, but each covers only the specified MS1 scan (e.g. 1.zip)
* There is no Peak data in the data.xml file in the main archive (e.g. 8014.zip). The Peak data are in the data.xml files inside the individual MS1 archives (e.g. 1.zip)
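The LC-MS/MS layout above can be expressed as two path rules. The sketch below uses java.nio.file to compute where the main archive and a per-MS1 archive would sit; LcMsLayoutSketch and its method names are hypothetical, and the folder name "annotation" in the usage example is an assumption.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class LcMsLayoutSketch {
    // Main archive at the root of the annotation folder, e.g. <annotationFolder>/8014.zip.
    // It holds data.xml, dataHeader.xml, and settings.xml.
    static Path mainArchive(Path annotationFolder, String entryName) {
        return annotationFolder.resolve(entryName + ".zip");
    }

    // Archive for one MS1 scan, e.g. <annotationFolder>/8014/1.zip.
    // It holds the XML files for the MS1 scan and its sub-scans, plus that scan's data.xml.
    static Path ms1Archive(Path annotationFolder, String entryName, int ms1ScanNumber) {
        return annotationFolder.resolve(entryName).resolve(ms1ScanNumber + ".zip");
    }

    public static void main(String[] args) {
        Path root = Paths.get("annotation");
        System.out.println(mainArchive(root, "8014"));
        System.out.println(ms1Archive(root, "8014", 12));
    }
}
```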

Last modified on 06/26/2018 04:41:11 PM

Attachments (9)
