wiki:CreatingAnnotion

Version 10 (modified by dbrentw, 6 years ago)

--

The GRITS plugin "org.grits.toolbox.importer.ms.annotation.glycan.simiansearch" executes a tool called GELATO, which annotates MS spectra with candidate glycans from a specified database. The GELATO plugin is "org.grits.toolbox.ms.annotation". It creates an archive that is readable by the GRITS GUI component (plugins "org.grits.toolbox.entry.ms.annotation" and "org.grits.toolbox.entry.ms.annotation.glycan"). While GELATO's algorithms match peaks in MS spectra to glycans, the archive that is created is an instance of the GRITS MS object model (plugin "org.grits.toolbox.ms.om"). To extend GRITS to other forms of MS data, it helps to understand what is required to create an archive file that can be opened in GRITS. This document is meant to serve that purpose.

Minimal steps to create a GRITS MS Glycan Annotation archive.

Step one: populating the data structures with MS data. The default data structure for GELATO is a HashMap<Integer, Scan> where the Integer is the scan number and the Scan object is an instance of org.grits.toolbox.ms.om.data.Scan. GRITS supports a variety of MS types: Direct Infusion (DI), Total Ion Mapping (TIM), LC-MS/MS, and MS Profile. For DI, TIM, and LC-MS/MS, it is assumed that you will have at least 1 MS1 scan and 1 MS/MS scan. For MS Profile, you will only need an MS1 scan.
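
The minimum-scan rule above can be sketched as a small check on the scan map. This is only an illustration: SketchScan is a hypothetical stand-in for org.grits.toolbox.ms.om.data.Scan that models nothing but the MS level, and hasMinimumScans is an invented helper, not part of GRITS. The "MS Profile" label is the value quoted from Method.java later in this document.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for org.grits.toolbox.ms.om.data.Scan; only the MS level is modeled.
class SketchScan {
    final int msLevel;

    SketchScan(int msLevel) {
        this.msLevel = msLevel;
    }
}

class ScanMapCheck {
    // MS type label as quoted from Method.java in this document.
    static final String MS_TYPE_MSPROFILE = "MS Profile";

    // Returns true if the map holds the minimum scans described above for the given MS type:
    // MS Profile needs an MS1 scan; every other type needs an MS1 scan and an MS/MS scan.
    static boolean hasMinimumScans(Map<Integer, SketchScan> scans, String msType) {
        boolean hasMs1 = false;
        boolean hasMsMs = false;
        for (SketchScan scan : scans.values()) {
            if (scan.msLevel == 1) {
                hasMs1 = true;
            } else if (scan.msLevel >= 2) {
                hasMsMs = true;
            }
        }
        if (MS_TYPE_MSPROFILE.equals(msType)) {
            return hasMs1;
        }
        return hasMs1 && hasMsMs;
    }

    public static void main(String[] args) {
        Map<Integer, SketchScan> scans = new HashMap<>();
        scans.put(1, new SketchScan(1)); // key is the scan number
        System.out.println("MS Profile ok: " + hasMinimumScans(scans, "MS Profile"));
        System.out.println("Direct Infusion ok: " + hasMinimumScans(scans, "Direct Infusion"));
    }
}
```

A check like this could be run before handing the map to GELATO, so that a missing MS/MS scan is caught early rather than showing up as an empty annotation.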

Assuming a Direct Infusion experiment, the minimum data for GELATO is:

1) Creating Test Data

a) Create an MS1 scan (example code)

   Scan ms1scan = new Scan();  // create new Scan object
   ms1scan.setScanNo(1);  // set the scan number to 1
   ms1scan.setMsLevel(1); // set the MS level to 1
   List<Peak> peakList = new ArrayList<>();  // create a new peak list for the MS1 scan
   Peak precursorPeak = new Peak();  // create a new peak that will be a precursor peak (generates an MS/MS event)
   precursorPeak.setMz(528.2);  // set the m/z value of the peak (could be anything)
   precursorPeak.setIntensity(10000.0);  // set the intensity to something > 0
   precursorPeak.setIsPrecursor(true);  // identify the peak as a precursor peak
   precursorPeak.setId(1);  // the peak MUST have a unique ID
   precursorPeak.setPrecursorMz(528.18);  // when the ion in the MS1 is trapped for fragmentation, this is the m/z of the ions trapped...could slightly differ from the peak in the full MS1 scan
   peakList.add(precursorPeak);  // add the precursor peak to the peak list
   ms1scan.setPeaklist(peakList);  // set the peak list of the full MS1 scan to the newly created peak list

b) Create an MS2 scan (example code)

   Scan ms2scan = new Scan();  // create new Scan object
   ms2scan.setScanNo(2);  // set the scan number to 2
   ms2scan.setMsLevel(2);  // set the MS level to 2
   ms2scan.setPrecursor(precursorPeak);  // set the precursor variable to the precursorPeak created above. This ties the MS2 scan to the precursor peak in the parent MS1 scan
   List<Peak> ms2PeakList = new ArrayList<>();  // if you want to id fragment peaks, create a new peak list for the MS2 scan (named differently from the MS1 list so both snippets compile in one scope)
   Peak fragPeak = new Peak(); // create a fragment peak
   fragPeak.setMz(940.2);
   fragPeak.setIntensity(5000.0);
   fragPeak.setIsPrecursor(false);
   fragPeak.setId(1);
   ms2PeakList.add(fragPeak);
   ms2scan.setPeaklist(ms2PeakList);
   ms2scan.setParentScan(1);   // set the parent scan of this MS/MS scan to the full MS scan

c) Populate the MS1 object with subscans

   List<Integer> subScans = new ArrayList<>();   // create a list of integers to contain all scan numbers that were subscans of the full MS1 scan
   subScans.add(2);  // scan number 2 is the only subscan in this example, so add the number 2
   ms1scan.setSubScans(subScans);  // add the subscan list to the full MS1 scan

d) Add the scans to the HashMap data structure

   HashMap<Integer, Scan> testData = new HashMap<>(); // create the data structure to be used by Gelato
   testData.put(1, ms1scan); // put the ms1 scan in the Hashmap, key is scan number 1
   testData.put(2, ms2scan); // put the ms2 scan in the Hashmap, key is scan number 2
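
Steps a) through d) link the scans in three places: the map key, the MS1 scan's subscan list, and the MS2 scan's parent scan number. The sketch below checks that these agree. LinkedScan is a hypothetical stand-in for the GRITS Scan class with only the fields needed here, and findBrokenLinks is an invented helper; neither is part of GRITS.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the GRITS Scan class; only the linking fields are modeled.
class LinkedScan {
    final int scanNo;
    final int msLevel;
    Integer parentScan; // null for a full MS1 scan
    final List<Integer> subScans = new ArrayList<>();

    LinkedScan(int scanNo, int msLevel) {
        this.scanNo = scanNo;
        this.msLevel = msLevel;
    }
}

class ScanLinkCheck {
    // Returns one message per inconsistency; an empty list means the links agree.
    static List<String> findBrokenLinks(Map<Integer, LinkedScan> scans) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<Integer, LinkedScan> entry : scans.entrySet()) {
            LinkedScan scan = entry.getValue();
            if (entry.getKey() != scan.scanNo) {
                problems.add("key " + entry.getKey() + " does not match scan number " + scan.scanNo);
            }
            for (Integer subNo : scan.subScans) {
                LinkedScan sub = scans.get(subNo);
                if (sub == null) {
                    problems.add("scan " + scan.scanNo + " lists missing subscan " + subNo);
                } else if (sub.parentScan == null || sub.parentScan != scan.scanNo) {
                    problems.add("subscan " + subNo + " does not point back to scan " + scan.scanNo);
                }
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        LinkedScan ms1 = new LinkedScan(1, 1);
        LinkedScan ms2 = new LinkedScan(2, 2);
        ms2.parentScan = 1;  // mirrors ms2scan.setParentScan(1)
        ms1.subScans.add(2); // mirrors subScans.add(2)

        Map<Integer, LinkedScan> testData = new HashMap<>();
        testData.put(1, ms1);
        testData.put(2, ms2);
        System.out.println("problems found: " + findBrokenLinks(testData));
    }
}
```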

2) Creating Test Glycan Objects

a) * You must first initialize the EuroCarbDB GlycanBuilder BuilderWorkspace even though you never use the instance

BuilderWorkspace bw = new BuilderWorkspace(new GlycanRendererAWT());

b) Create a test GlycanStructure object and a corresponding Glycan object. I'm using a static GlycoWorkbench (GWB) formatted sequence

   private static String GLYCAN_SEQUENCE1 = "freeEnd--??1D-GlcNAc,p--??1D-GlcNAc,p--??1D-Man,p(--??1D-Man,p--??1D-Man,p--??1D-Man,p--??1D-Glc,p--??1D-Glc,p)--??1D-Man,p(--??1D-Man,p--??1D-Man,p)--??1D-Man,p--??1D-Man,p$MONO,perMe,0,0,freeEnd";
   private static String GLYCAN_ID1 = "TEST_GOG1";
   
   private void createTestGlycanObjects() {
      testGlycanStructure1 = new GlycanStructure();   // create the GlycanStructure object
      testGlycanStructure1.setGWBSequence(GLYCAN_SEQUENCE1); // set the sequence
      testGlycanStructure1.setId(GLYCAN_ID1);  // set its ID

      testGlycan1 = Glycan.fromString(testGlycanStructure1.getGWBSequence());  // create the Glycan object using the "Glycan.fromString( <GWB Seq> )" method
      testGlycanStructure1.setSequence( testGlycan1.toGlycoCTCondensed() );   // set the alternative sequence (using GlycoCT Condensed)
      testGlycanStructure1.setSequenceFormat(GlycanAnnotation.SEQ_FORMAT_GLYCOCT_CONDENSED);  // set the type of sequence
   }

3) Create Method object and populate with applicable settings

   private void createTestMethod() {
      Method testMethod = new Method();

      // Set the MS Method type. Options: 
      /* (from the Method.java class)
        public static final String MS_TYPE_INFUSION = "Direct Infusion";
        public static final String MS_TYPE_LC = "LC-MS/MS";
        public static final String MS_TYPE_TIM = "Total Ion Mapping (TIM)";
        public static final String MS_TYPE_MSPROFILE = "MS Profile";
      */ 
      testMethod.setMsType(Method.MS_TYPE_INFUSION);

      // Right now, we only really support generic "glycan"
      testMethod.setAnnotationType( Method.ANNOTATION_TYPE_GLYCAN );

      // set accuracy information
      testMethod.setAccuracy( 1.0 );
      testMethod.setAccuracyPpm(false); // false means daltons

      testMethod.setFragAccuracy( 500.0 ); 
      testMethod.setFragAccuracyPpm( true ); // true means PPM

      testMethod.setShift(0.0); // not sure what this is for but you have to set it

      setMethod( testMethod ); // store the new method object in your class
   }
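
To make the two accuracy units concrete, here is a minimal sketch of how a tolerance given in Daltons or in PPM translates into an m/z window. It uses the standard PPM relation (tolerance = m/z * ppm / 1,000,000). How GELATO applies these settings internally is not shown in this document, so ToleranceSketch and its methods are illustrative assumptions, not GRITS code.

```java
class ToleranceSketch {
    // Converts an accuracy setting into an absolute tolerance in Daltons at the given m/z.
    // isPpm mirrors the boolean passed to setAccuracyPpm / setFragAccuracyPpm.
    static double toleranceInDaltons(double mz, double accuracy, boolean isPpm) {
        return isPpm ? mz * accuracy / 1_000_000.0 : accuracy;
    }

    // True if the observed m/z lies within the tolerance window around the theoretical m/z.
    static boolean withinTolerance(double observedMz, double theoreticalMz,
                                   double accuracy, boolean isPpm) {
        double tolerance = toleranceInDaltons(theoreticalMz, accuracy, isPpm);
        return Math.abs(observedMz - theoreticalMz) <= tolerance;
    }

    public static void main(String[] args) {
        // Precursor setting from the method above: 1.0 Dalton.
        System.out.println(withinTolerance(528.18, 528.2, 1.0, false));
        // Fragment setting from the method above: 500 PPM, evaluated at the example m/z 940.2.
        System.out.println(toleranceInDaltons(940.2, 500.0, true));
    }
}
```

Note that a PPM tolerance grows with m/z while a Dalton tolerance stays fixed, which is one reason to think about the two settings separately for precursor and fragment matching.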

4) Create Data and DataHeader objects

   private void createTestData() {
      Data testData = new Data();	// Data is part of GRITS object model. It is top-level object for linking MS data to annotation information	
      DataHeader dataHeader = new DataHeader(); // DataHeader, also part of object Model, tracks meta-data, in particular the CustomExtraData that are associated with a project
      dataHeader.setMethod( getMethod() );  // the testMethod was set in step 3. Store it in the DataHeader
      testData.setDataHeader(dataHeader);  // store the DataHeader in the Data object.
      setData( testData ); // keep track of new data object in your class
   }

5) Create AnalyteSettings object

   private void createTestAnalyteSettings() {
     AnalyteSettings testAnalyteSettings = new AnalyteSettings();
     GlycanSettings glycanSettings = new GlycanSettings(); 
     testAnalyteSettings.setGlycanSettings(glycanSettings);
     setAnalyteSettings( testAnalyteSettings );
   }

6) Create a list of adducts to consider

a) * Note that class GlycanPreDefinedOptions has static, pre-defined glycan-specific options (see comments)

   private void createTestAdducts() {
      List<IonSettings> testAdductsToAnalyze = new ArrayList<>();

      /*
        public class GlycanPreDefinedOptions {
	  public static Ion ION_ADDUCT_HYDROGEN = new Ion("H", new Double(1.007825032), "Hydrogen", Integer.valueOf(1), Boolean.TRUE);
	  public static Ion ION_ADDUCT_SODIUM = new Ion("Na", new Double(22.989769670), "Sodium", Integer.valueOf(1), Boolean.TRUE);
	  public static Ion ION_ADDUCT_POTASSIUM = new Ion("K", new Double(38.963706900), "Potassium", Integer.valueOf(1), Boolean.TRUE);
	  public static Ion ION_ADDUCT_CHLORINE = new Ion("Cl", new Double(34.968852710), "Chlorine", Integer.valueOf(1), Boolean.FALSE);
	  public static Ion ION_ADDUCT_LITHIUM = new Ion("Li", new Double(7.016004000), "Lithium", Integer.valueOf(1), Boolean.TRUE);
	  public static Ion ION_ADDUCT_ELECTRON = new Ion("e", new Double(0.0005486), "electron", Integer.valueOf(1), Boolean.TRUE);
	  public static Ion ION_ADDUCT_NEGHYDROGEN = new Ion("-H", new Double(-1.007825032), "Negative mode Hydrogen", Integer.valueOf(1), Boolean.FALSE);
	  public static Ion ION_ADDUCT_CALCIUM = new Ion("Ca", new Double(39.9625906), "Calcium", Integer.valueOf(2), Boolean.TRUE);
          ......
        }
      */
      testAdductsToAnalyze.add( GlycanPreDefinedOptions.ION_ADDUCT_SODIUM );  // add pre-defined sodium adduct
      
      IonSettings myAdduct = new IonSettings("TestAdduct", 1.0, "Test Glycan Adduct", 2, Boolean.TRUE);	// create your own adduct
      testAdductsToAnalyze.add( myAdduct );

      List<Integer> testAdductsToAnalyzeCnts = new ArrayList<>(); // the software must know how many adducts (and thus charges) to consider when matching
      testAdductsToAnalyzeCnts.add( Integer.valueOf(4) ); // assumes up to 4 sodiums since sodium is first element of testAdductsToAnalyze list
      testAdductsToAnalyzeCnts.add( Integer.valueOf(1) ); // assumes up to 1 myAdduct since myAdduct is second element of testAdductsToAnalyze list

      setAdducts( testAdductsToAnalyze ); // store the list of adducts in your class
      setAdductCounts( testAdductsToAnalyzeCnts); // store the list of adduct counts in your class
   }
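
The adduct counts matter because each additional adduct changes both the mass and the charge of the ion being matched. The sketch below enumerates candidate m/z values for one adduct type using the common relation m/z = (M + n * adductMass) / (n * charge). This is an assumption for illustration: GELATO's own calculation is not shown in this document, the sketch covers only positive-mode adducts, and it ignores electron-mass corrections. The sodium mass is the value quoted in the comment above; AdductMzSketch and the neutral mass used in main are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class AdductMzSketch {
    // Sodium mass as quoted from GlycanPreDefinedOptions in this document.
    static final double SODIUM_MASS = 22.989769670;

    // m/z of a neutral molecule carrying adductCount adducts, each with the given charge.
    static double mz(double neutralMass, double adductMass, int adductCharge, int adductCount) {
        if (adductCharge < 1 || adductCount < 1) {
            throw new IllegalArgumentException("charge and count must be at least 1");
        }
        return (neutralMass + adductCount * adductMass) / (adductCount * adductCharge);
    }

    // One candidate m/z per adduct count from 1 up to maxCount (the value stored in the counts list).
    static List<Double> candidateMzValues(double neutralMass, double adductMass,
                                          int adductCharge, int maxCount) {
        List<Double> candidates = new ArrayList<>();
        for (int n = 1; n <= maxCount; n++) {
            candidates.add(mz(neutralMass, adductMass, adductCharge, n));
        }
        return candidates;
    }

    public static void main(String[] args) {
        double neutralMass = 1000.0; // arbitrary illustrative mass, not a real glycan
        // Up to 4 sodiums, matching the count stored for the first adduct above.
        for (double candidate : candidateMzValues(neutralMass, SODIUM_MASS, 1, 4)) {
            System.out.println(candidate);
        }
    }
}
```

Because the adduct list and the count list are matched by position, it is worth keeping the two add calls for each adduct next to each other when building them.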

7) Set the final name of the archive

  • Setting the archive name / paths is an important consideration. For LC-MS/MS data, it is different (needs an overview file)!
       protected void setArchiveFilePaths() {
          this.sArchiveName = AnnotationWriter.getArchiveFilePath(getProjectName());  // the AnnotationWriter controls the file extension, etc., so it provides a method to create the final name
       }
    

8) Process data to annotate spectra

a) This is a sample method to annotate a single MS/MS spectrum with a specified glycan structure

   /* Method declaration. 
      Params:  iParentScan == MS1 scan,  iSubScanNum == MS2 scan
      Assumption:  iParentScan is parent of iSubScanNum. Scan iSubScanNum is in iParentScan's subscan list
   */
   private boolean processStructure( int iParentScan, int iSubScanNum ) {

     GlycanScansAnnotation glycanScanAnnotation = new GlycanScansAnnotation(); // Instantiate GlycanScansAnnotation object
     GlycanAnnotation glycanAnnotation = new GlycanAnnotation(); // Instantiate GlycanAnnotation object

     // note that the GlycanScansAnnotation and GlycanAnnotation objects have the same id!
     glycanScanAnnotation.setAnnotationId(getCurrentAnnotationId()); // Set new annotation id. This can be an incrementing integer value
     glycanAnnotation.setId(getCurrentAnnotationId()); // Set the Id of the glycanAnnotation 		
     setCurrentAnnotationId(getCurrentAnnotationId() + 1);  // now increment the id so the next time it is a new value

     // initialize values in glycanScanAnnotation object
     glycanScanAnnotation.setGlycanId(testGlycanStructure1.getId()); // set the glycan id of the structure to match
     getGlycanIDs().add(testGlycanStructure1.getId()); // also record the glycan id in the list of glycan ids kept by your class

     // initialize values in glycanAnnotation object
     glycanAnnotation.setGlycanId(testGlycanStructure1.getId());
     glycanAnnotation.setSequenceGWB(testGlycanStructure1.getGWBSequence());
     glycanAnnotation.setSequence(testGlycanStructure1.getSequence());

     // Set other glycan-specific options, if desired. Not required to create functioning archive
     //  glycanAnnotation.setPerDerivatisationType( "....." ); 
     //  glycanAnnotation.setReducingEnd( "....." );

		int iCurrentFeatureId = getTestData().getFeatureIndex();				
		//		boolean bRes = matchGlycanStructures(glycanScanAnnotation, glycanAnnotation);

		boolean bRes = matchSingleSubScan(iParentScan, iSubScanNum, iStrucNum, glycanScanAnnotation, glycanAnnotation); // note: iStrucNum is not declared in this excerpt and must come from the enclosing class
		if( bRes ) {
			if(iCurrentFeatureId != getTestData().getFeatureIndex()) { // the feature index changed, so new annotations were added using the given glycan structure
				getTestData().getAnnotation().add(glycanAnnotation);
			}
			if( ! glycanScanAnnotation.getScanAnnotations().isEmpty() ) {
				try {
					writer.writeAnnotationsPerGlycan(glycanScanAnnotation, getTempOutputPath());  // creates temp files that will be used to populate the archive in the next step
				} catch (IOException e) {
					// TODO Auto-generated catch block
					e.printStackTrace();
				}
			}
			return true;
		}
		return false;
	}

9) Create the archive that will be read by GRITS

a) Populate the Scan Feature object

	protected void populateScanFeatureData(String glycanFilesPath) {
		//define objects to gather the MS1 annotation while processing MS2
		AnnotationReader reader = new AnnotationReader(); // instantiate the reader, which will read the temp files created in Step 8
		GlycanScansAnnotation glycanAnnotation = new GlycanScansAnnotation();
		ScanFeatures scanFeatures = new ScanFeatures();  
		for(Integer scanId : getTestData().getScans().keySet()){	
			if( isCanceled() )
				return;
			scanFeatures = new ScanFeatures(); // Create new ScanFeatures object to add features to the scan if it was annotated
			scanFeatures.setScanId(scanId);
			scanFeatures.setScanPeaks(new HashSet<Peak>(getTestData().getScans().get(scanId).getPeaklist()));
			getTestData().getScanFeatures().put(scanId, scanFeatures);
			if(getTestData().getAnnotatedScan().get(scanId) != null) { // this scan was annotated
				for(String glycanId : getTestData().getAnnotatedScan().get(scanId)){
					glycanAnnotation = reader.readglycanAnnotation(glycanFilesPath, glycanId); // read the temp file for this particular glycan
					if(glycanAnnotation != null && glycanAnnotation.getScanAnnotations().get(scanId) != null) {
						for( GlycanFeature f : glycanAnnotation.getScanAnnotations().get(scanId) ) {
							if( ! scanFeatures.getFeatures().contains(f) ) {
								scanFeatures.getFeatures().add(f); // add each unique GlycanFeature (from ScanAnnotations) to the ScanFeatures list of features
							}
						}
					}
				}
			}
		}
	}
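
The inner loop above adds a feature only if the list does not already contain it. In Java, List.contains relies on equals, so the de-duplication works only if the feature class defines equality in the intended way; whether GlycanFeature overrides equals is not shown in this document. The sketch below illustrates the same merge with a hypothetical SketchFeature whose equality is based on an id, using a LinkedHashSet to keep insertion order. None of these names come from GRITS.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for GlycanFeature; two features are equal when their ids match.
final class SketchFeature {
    final String id;

    SketchFeature(String id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof SketchFeature && ((SketchFeature) other).id.equals(id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id);
    }
}

class FeatureMergeSketch {
    // Returns the existing features followed by any incoming features not already present.
    static List<SketchFeature> mergeUnique(List<SketchFeature> existing,
                                           List<SketchFeature> incoming) {
        Set<SketchFeature> merged = new LinkedHashSet<>(existing); // keeps first-seen order
        merged.addAll(incoming);
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        List<SketchFeature> existing = Arrays.asList(new SketchFeature("f1"), new SketchFeature("f2"));
        List<SketchFeature> incoming = Arrays.asList(new SketchFeature("f2"), new SketchFeature("f3"));
        for (SketchFeature feature : mergeUnique(existing, incoming)) {
            System.out.println(feature.id);
        }
    }
}
```

A set-based merge avoids the repeated linear scans of contains, which may matter when a scan has many features; the list-based version in the method above behaves the same for small inputs.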