4 resultados para Log cabins.
em Digital Commons at Florida International University
Resumo:
This dissertation develops a new mathematical approach that overcomes the effect of a data processing phenomenon known as “histogram binning” inherent to flow cytometry data. A real-time procedure is introduced to prove the effectiveness and fast implementation of such an approach on real-world data. The histogram binning effect is a dilemma posed by two seemingly antagonistic developments: (1) flow cytometry data in its histogram form is extended in its dynamic range to improve its analysis and interpretation, and (2) the inevitable dynamic range extension introduces an unwelcome side effect, the binning effect, which skews the statistics of the data, undermining as a consequence the accuracy of the analysis and the eventual interpretation of the data. ^ Researchers in the field contended with such a dilemma for many years, resorting either to hardware approaches that are rather costly with inherent calibration and noise effects; or have developed software techniques based on filtering the binning effect but without successfully preserving the statistical content of the original data. ^ The mathematical approach introduced in this dissertation is so appealing that a patent application has been filed. The contribution of this dissertation is an incremental scientific innovation based on a mathematical framework that will allow researchers in the field of flow cytometry to improve the interpretation of data knowing that its statistical meaning has been faithfully preserved for its optimized analysis. Furthermore, with the same mathematical foundation, proof of the origin of such an inherent artifact is provided. ^ These results are unique in that new mathematical derivations are established to define and solve the critical problem of the binning effect faced at the experimental assessment level, providing a data platform that preserves its statistical content. ^ In addition, a novel method for accumulating the log-transformed data was developed. This new method uses the properties of the transformation of statistical distributions to accumulate the output histogram in a non-integer and multi-channel fashion. Although the mathematics of this new mapping technique seem intricate, the concise nature of the derivations allow for an implementation procedure that lends itself to a real-time implementation using lookup tables, a task that is also introduced in this dissertation. ^
Resumo:
Hydrophobicity as measured by Log P is an important molecular property related to toxicity and carcinogenicity. With increasing public health concerns for the effects of Disinfection By-Products (DBPs), there are considerable benefits in developing Quantitative Structure and Activity Relationship (QSAR) models capable of accurately predicting Log P. In this research, Log P values of 173 DBP compounds in 6 functional classes were used to develop QSAR models, by applying 3 molecular descriptors, namely, Energy of the Lowest Unoccupied Molecular Orbital (ELUMO), Number of Chlorine (NCl) and Number of Carbon (NC) by Multiple Linear Regression (MLR) analysis. The QSAR models developed were validated based on the Organization for Economic Co-operation and Development (OECD) principles. The model Applicability Domain (AD) and mechanistic interpretation were explored. Considering the very complex nature of DBPs, the established QSAR models performed very well with respect to goodness-of-fit, robustness and predictability. The predicted values of Log P of DBPs by the QSAR models were found to be significant with a correlation coefficient R2 from 81% to 98%. The Leverage Approach by Williams Plot was applied to detect and remove outliers, consequently increasing R 2 by approximately 2% to 13% for different DBP classes. The developed QSAR models were statistically validated for their predictive power by the Leave-One-Out (LOO) and Leave-Many-Out (LMO) cross validation methods. Finally, Monte Carlo simulation was used to assess the variations and inherent uncertainties in the QSAR models of Log P and determine the most influential parameters in connection with Log P prediction. The developed QSAR models in this dissertation will have a broad applicability domain because the research data set covered six out of eight common DBP classes, including halogenated alkane, halogenated alkene, halogenated aromatic, halogenated aldehyde, halogenated ketone, and halogenated carboxylic acid, which have been brought to the attention of regulatory agencies in recent years. Furthermore, the QSAR models are suitable to be used for prediction of similar DBP compounds within the same applicability domain. The selection and integration of various methodologies developed in this research may also benefit future research in similar fields.
Resumo:
This dissertation develops a new mathematical approach that overcomes the effect of a data processing phenomenon known as "histogram binning" inherent to flow cytometry data. A real-time procedure is introduced to prove the effectiveness and fast implementation of such an approach on real-world data. The histogram binning effect is a dilemma posed by two seemingly antagonistic developments: (1) flow cytometry data in its histogram form is extended in its dynamic range to improve its analysis and interpretation, and (2) the inevitable dynamic range extension introduces an unwelcome side effect, the binning effect, which skews the statistics of the data, undermining as a consequence the accuracy of the analysis and the eventual interpretation of the data. Researchers in the field contended with such a dilemma for many years, resorting either to hardware approaches that are rather costly with inherent calibration and noise effects; or have developed software techniques based on filtering the binning effect but without successfully preserving the statistical content of the original data. The mathematical approach introduced in this dissertation is so appealing that a patent application has been filed. The contribution of this dissertation is an incremental scientific innovation based on a mathematical framework that will allow researchers in the field of flow cytometry to improve the interpretation of data knowing that its statistical meaning has been faithfully preserved for its optimized analysis. Furthermore, with the same mathematical foundation, proof of the origin of such an inherent artifact is provided. These results are unique in that new mathematical derivations are established to define and solve the critical problem of the binning effect faced at the experimental assessment level, providing a data platform that preserves its statistical content. In addition, a novel method for accumulating the log-transformed data was developed. This new method uses the properties of the transformation of statistical distributions to accumulate the output histogram in a non-integer and multi-channel fashion. Although the mathematics of this new mapping technique seem intricate, the concise nature of the derivations allow for an implementation procedure that lends itself to a real-time implementation using lookup tables, a task that is also introduced in this dissertation.