Abstract Title: | Cumulative Neutral Loss Model for Fragment Deconvolution in Electrospray Ionization High-Resolution Mass Spectrometry Data |
Presenter Name: | Ms Denice van Herwerden |
Co-authors: | Dr Sascha Lege, Dr Jake O'Brien, Dr Bob Pirok, Prof Kevin Thomas, Dr Saer Samanipour |
Company/Organisation: | University of Amsterdam |
Country: | Netherlands |
Abstract Information:
Non-targeted analysis combined with high-resolution mass spectrometry (HRMS) is the most comprehensive approach to chemically characterize complex samples, including environmental and biological samples. These samples can contain thousands of structurally known and unknown compounds and are often analyzed using data-independent acquisition. To identify these chemicals, componentization is used to group MS ions potentially belonging to unique chemical constituents. This process includes the detection of (in-source) fragment ions, which are essential pieces of information for chemical identification. Detecting fragments is challenging because the chemical structure of the parent ion is generally unknown. Current approaches rely heavily on time domain information, including retention time (tr) matching and peak shape correlation analysis of the potential parent features and associated in-source fragments. However, to our knowledge, none of the algorithms use any information from the mass domain. In other words, no approach evaluates whether a detected fragment ion originates from the respective parent ion. To use the mass domain information for fragment deconvolution, a probabilistic cumulative neutral loss (CNL) model was developed. This model uses the mass difference between the parent ion and each potential fragment ion (i.e., the CNL) to assess the likelihood of an ion being a true fragment. Results showed that the likely CNLs indeed contained structural information. For example, the highest probability CNL had a mass of 18.01, corresponding to the loss of H2O. The CNL model was constructed using the high-resolution spectra (i.e., resolution >5000) from both MassBank EU and North America. The CNL model was optimized using database spectra, which yielded a CNL score threshold of -0.95 and a mass tolerance of 0.005 Da.
These settings corresponded to a true positive rate (TPr) of 95.0%, a false positive rate of 18.7%, a false discovery rate (FDr) of 25.6%, and a reduction rate of 39.9%. These optimized parameters were then used to assess the performance for real samples under different conditions (e.g., sample concentration, matrix effects, collision energies). Across all assessments, the TPr stayed above 95.0%, the FDr ranged between 45% and 77%, and the reduction rate ranged between 10% and 24%, showing that the model is able to retain the true fragments while removing false signals by only looking at the mass domain information. Finally, the false detection rates for fragment identification were compared using the CNL model, tr matching, and peak shape correlation. Overall, if time domain information is available, the combination of the CNL model with correlation analysis performed best, with a TPr, FDr, and reduction rate of 93.1%, 57.2%, and 42.6%, respectively.
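
The mass-domain idea behind the CNL can be illustrated with a minimal Python sketch. It is not the probabilistic model described above: it only computes the mass difference between a parent ion and each candidate fragment, then checks that difference against a reference list within the 0.005 Da tolerance. The function names and the reference list are hypothetical, the ions are assumed to be singly charged so that m/z differences equal mass differences, and the only reference entry is the H2O loss mentioned in the abstract.

```python
from typing import List, Tuple

# Mass tolerance reported in the abstract (Da).
MASS_TOLERANCE_DA = 0.005

# Hypothetical reference list of likely CNL masses (Da).
# Only the H2O loss (monoisotopic mass 18.0106 Da) is named in the abstract.
LIKELY_CNLS = [18.0106]


def flag_candidate_fragments(
    parent_mz: float,
    fragment_mzs: List[float],
    likely_cnls: List[float] = LIKELY_CNLS,
    tol: float = MASS_TOLERANCE_DA,
) -> List[Tuple[float, float, bool]]:
    """Return (fragment m/z, CNL, matched) for every ion lighter than the parent.

    Assumption: singly charged ions, so the m/z difference is the neutral loss mass.
    """
    results = []
    for mz in fragment_mzs:
        if mz >= parent_mz:
            # An ion heavier than the parent cannot be one of its fragments.
            continue
        cnl = parent_mz - mz
        # A simple tolerance match stands in for the probabilistic CNL score.
        matched = any(abs(cnl - ref) <= tol for ref in likely_cnls)
        results.append((mz, cnl, matched))
    return results


if __name__ == "__main__":
    for mz, cnl, matched in flag_candidate_fragments(200.1000, [182.0894, 150.0000, 250.0000]):
        print(f"fragment m/z {mz:.4f}  CNL {cnl:.4f}  likely: {matched}")
```

In this sketch the tolerance match is a binary stand-in; the model in the abstract instead assigns each CNL a score and retains ions above the optimized threshold.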