This short vignette shows how to correct overloaded signals in (i) an artificial test case and (ii) a provided real data set. To do this, we need to load the package functions as well as a small example data set in xcmsRaw format.

library(CorrectOverloadedPeaks)
data("xcmsRaw_data")

Let’s model a typical overloaded signal, as frequently occurs in GC-APCI-MS, using the provided function ModelGaussPeak.

pk <- CorrectOverloadedPeaks::ModelGaussPeak(height=10^7, width=3, scan_rate=10, e=0, ds=8*10^6, base_line=10^2)
plot(pk, main="Gaussian peak of true intensity 10^7 but cut off at 8*10^6")
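To make the notion of an overloaded signal concrete, the following base R sketch builds a clipped Gaussian by hand. It is an illustration only and not the implementation of ModelGaussPeak; the retention time grid, the peak width `sigma` and all variable names are assumptions chosen for this example, while the height, cut-off and baseline mirror the call above.

```r
# Hypothetical illustration of detector saturation; not the package's ModelGaussPeak code.
rt       <- seq(-10, 10, by = 0.1)   # assumed retention time grid (arbitrary units)
sigma    <- 1.5                      # assumed peak width parameter
height   <- 1e7                      # true apex intensity, as in the call above
cut_off  <- 8e6                      # detector saturation level, as in the call above
baseline <- 1e2                      # constant baseline, as in the call above

# True (unobservable) signal and the clipped signal a saturated detector would report
true_int <- height * exp(-rt^2 / (2 * sigma^2)) + baseline
obs_int  <- pmin(true_int, cut_off)

# Number of scans that hit the saturation level
n_clipped <- sum(obs_int >= cut_off)

plot(rt, obs_int, type = "b", xlab = "RT", ylab = "Intensity")
lines(rt, true_int, lty = 2)         # dashed line: the signal we would like to recover
```

The flat top in the plotted points corresponds to the scans counted in `n_clipped`; these are the values a correction has to replace.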

Now we roughly estimate the peak borders before applying the provided function FitGaussPeak to correct the peak data.

idx <- pk[,"int"]>0.005 * max(pk[,"int"])
tmp <- CorrectOverloadedPeaks::FitGaussPeak(x=pk[idx,"rt"], y=pk[idx,"int"], silent=FALSE, xlab="RT", ylab="Intensity")
## [1] "Number of converging sollutions: 10, keeping 1"
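The idea of recovering an apex from the non-saturated flanks can be sketched without the package. The example below deliberately uses a different, simpler technique than FitGaussPeak: because a Gaussian is a parabola on the log scale, a quadratic least-squares fit to the unclipped points yields an estimate of the apex height. The data are simulated without noise or baseline, and all names and values are assumptions for illustration.

```r
# Simplified sketch (log-parabola fit); this is NOT how FitGaussPeak works internally.
rt      <- seq(-5, 5, by = 0.1)                    # assumed retention time grid
cut_off <- 8e6                                     # assumed saturation level
obs     <- pmin(1e7 * exp(-rt^2 / (2 * 1.2^2)), cut_off)

# Keep points above 0.5% of the observed maximum (the border rule used above)
# and strictly below the saturation level
keep <- obs > 0.005 * max(obs) & obs < cut_off
d    <- data.frame(rt = rt[keep], y = obs[keep])

# Gaussian on the log scale: log(y) = a + b*rt + c*rt^2
fit <- lm(log(y) ~ rt + I(rt^2), data = d)
cf  <- unname(coef(fit))

# Apex of the fitted parabola, transformed back to the intensity scale
est_height <- exp(cf[1] - cf[2]^2 / (4 * cf[3]))
est_height
```

On real data with noise, a baseline and peak tailing, this naive fit would be less reliable, which is one reason to prefer the package functions.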

The QC plot generated by FitGaussPeak shows the optimal solution found (green line), the substituted intensity values (grey circles) and the obtained parameters (blue text), including the probable peak height (max_int=9.7*10^6), which is very close to the true peak height (10^7).

Now let’s extend this simplified process to peaks from a real data set. The following function call will generate (i) a PDF in the working directory with QC plots for 10 peaks from 5 chromatographic regions, (ii) processing information printed to the console and (iii) a new file “cor_df_all.RData” in the working directory containing all extracted but non-corrected mass traces.

tmp <- CorrectOverloadedPeaks::CorrectOverloadedPeaks(data=xcmsRaw_data, method="EMG", testing=TRUE)
## 
## Processing... S5_35_01_2241_Int+LM.mzXML 
## 
## Trying to correct 5 overloaded regions.
## [1] "Processing Region/Mass: 1 / 1"
## [1] "Number of converging sollutions: 157, keeping 5"
## [1] "Processing Region/Mass: 1 / 2"
## [1] "Number of converging sollutions: 110, keeping 63"
## [1] "Processing Region/Mass: 2 / 1"
## [1] "Number of converging sollutions: 128, keeping 11"
## [1] "Processing Region/Mass: 2 / 2"
## [1] "Number of converging sollutions: 110, keeping 6"
## [1] "Processing Region/Mass: 3 / 1"
## [1] "Number of converging sollutions: 126, keeping 6"
## [1] "Processing Region/Mass: 4 / 1"
## [1] "Number of converging sollutions: 120, keeping 66"
## [1] "Processing Region/Mass: 4 / 2"
## [1] "Number of converging sollutions: 112, keeping 12"
## [1] "Processing Region/Mass: 4 / 3"
## [1] "Number of converging sollutions: 119, keeping 64"
## [1] "Processing Region/Mass: 5 / 1"
## [1] "Number of converging sollutions: 58, keeping 4"
## [1] "Processing Region/Mass: 5 / 2"
## [1] "Number of converging sollutions: 161, keeping 37"
## [1] "Storing non-corrected data information in 'cor_df_all.RData'"

Let’s load these non-corrected mass traces to further illustrate the package capabilities. For instance, we can reprocess peak 2 from region 4 using the isotopic ratio approach:

load("cor_df_all.RData")
head(cor_df_all[[4]][[2]])
##     Scan      RT      mz0   int0      mz1   int1      mz2  int2 modified
## 188  188 599.514 350.1646   7374 351.1684   5589 352.1663  1277    FALSE
## 189  189 599.623 350.1636  19565 351.1668   7864 352.1631  3842    FALSE
## 190  190 599.732 350.1627  50418 351.1650  19183 352.1616  9141    FALSE
## 191  191 599.840 350.1646 118553 351.1664  38646 352.1633 18278    FALSE
## 192  192 599.951 350.1635 260899 351.1651  86333 352.1620 41024    FALSE
## 193  193 600.060 350.1637 528827 351.1651 167749 352.1619 77910    FALSE
tmp <- CorrectOverloadedPeaks::FitPeakByIsotopicRatio(cor_df=cor_df_all[[4]][[2]], silent=FALSE)

The extracted data contain RT and intensity information for the overloaded mass trace (mz=350.164) as well as for the isotopes of this mz, up to the first isotope which is not itself overloaded (M+2, green triangles). The ratio of this isotope to M+0 is determined in the peak front (15.9%), and this ratio is in turn used to scale up the overloaded data points of M+0 (grey circles), as indicated by the black line. The data could of course alternatively be processed using the Gauss method, as shown previously for the artificial data.

tmp <- CorrectOverloadedPeaks::FitGaussPeak(x=cor_df_all[[4]][[2]][,"RT"], y=cor_df_all[[4]][[2]][,"int0"], silent=FALSE, xlab="RT", ylab="Intensity")
## [1] "Number of converging sollutions: 10, keeping 2"
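The isotopic ratio approach described above can be sketched on synthetic data. This is a minimal illustration under stated assumptions, not the implementation of FitPeakByIsotopicRatio: the peak shape, the cut-off and the variable names are invented for the example, and the M+2/M+0 ratio of 0.159 is simply taken from the text.

```r
# Hypothetical sketch of isotope-ratio based correction on simulated traces.
rt      <- seq(595, 605, by = 0.1)                 # assumed retention times
true_m0 <- 1e7 * exp(-(rt - 600)^2 / (2 * 1^2))    # unobservable true M+0 signal
cut_off <- 8e6                                     # assumed saturation level

m0 <- pmin(true_m0, cut_off)                       # observed, overloaded M+0 trace
m2 <- 0.159 * true_m0                              # observed M+2 trace, not overloaded

# Overloaded scans, and scans in the peak front suitable for ratio estimation
overloaded <- m0 >= cut_off
front      <- !overloaded & rt < 600 & m0 > 0.005 * max(m0)

# Estimate the M+2 / M+0 ratio from the peak front
ratio <- median(m2[front] / m0[front])

# Replace overloaded M+0 values by the scaled M+2 intensities
m0_cor <- m0
m0_cor[overloaded] <- m2[overloaded] / ratio

max(m0_cor)
```

Taking the median of the per-scan ratios is a design choice of this sketch; it keeps single noisy scans in the peak front from dominating the estimate.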

Finally, we clean up the temporary files stored on the hard drive.

if(file.exists("cor_df_all.RData")) file.remove("cor_df_all.RData")
## [1] TRUE
if(file.exists("S5_35_01_2241_Int+LM.mzXML.pdf")) file.remove("S5_35_01_2241_Int+LM.mzXML.pdf")
## [1] TRUE