866 results for Ordinary Least Squares Method
Abstract:
With its implications for vaccine discovery, the accurate prediction of T cell epitopes is one of the key aspirations of computational vaccinology. We have developed a robust multivariate statistical method, based on partial least squares, for the quantitative prediction of peptide binding to major histocompatibility complexes (MHC), the principal checkpoint on the antigen presentation pathway. As a service to the immunobiology community, we have made a Perl implementation of the method available via a World Wide Web server. We call this server MHCPred. Access to the server is freely available from the URL: http://www.jenner.ac.uk/MHCPred. We have exemplified our method with a model for peptides binding to the common human MHC molecule HLA-B*3501.
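For readers unfamiliar with the underlying regression, the following is a minimal sketch of partial least squares on synthetic peptide descriptor data using scikit-learn; it is not the MHCPred Perl implementation, and the descriptor matrix, response values, and number of latent variables are placeholder assumptions.

```python
# Sketch: PLS regression on synthetic peptide descriptors (not the MHCPred implementation).
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 180))                          # e.g. 9-mer peptides x amino-acid indicator features
y = X[:, :5].sum(axis=1) + 0.1 * rng.normal(size=200)    # synthetic log-affinity-like response

pls = PLSRegression(n_components=5)                      # latent variables; chosen by cross-validation in practice
print(cross_val_score(pls, X, y, cv=5, scoring="r2"))    # rough estimate of predictive performance
pls.fit(X, y)
y_hat = pls.predict(X)                                   # predicted binding affinities
```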
Abstract:
The determination of the displacement and the space-dependent force acting on a vibrating structure from measured final or time-averaged displacement observations is thoroughly investigated. Several aspects related to the existence and uniqueness of a solution of the linear but ill-posed inverse force problem are highlighted. Then, in order to capture the solution, a variational formulation is proposed and the gradient of the minimized least-squares functional is rigorously and explicitly derived. Numerical results obtained using the Landweber method and the conjugate gradient method are presented and discussed, illustrating the convergence of the iterative procedures for exact input data. For noisy data the semi-convergence phenomenon appears, as expected, and stability is restored by stopping the iterations according to the discrepancy principle criterion once the residual becomes close to the amount of noise. The present investigation will be significant to researchers concerned with wave propagation and the control of vibrating structures.
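As a rough illustration of the iterative regularization described above, the sketch below runs Landweber iteration with a discrepancy-principle stopping rule on a generic, synthetic ill-posed linear system A f = g; the forward operator, noise level, step size, and safety factor tau are assumptions for the example and do not come from the paper.

```python
# Sketch: Landweber iteration with discrepancy-principle stopping for A f = g (synthetic example).
import numpy as np

rng = np.random.default_rng(1)
n = 100
A = np.tril(np.ones((n, n))) / n              # synthetic smoothing (ill-conditioned) forward operator
f_true = np.sin(np.linspace(0, np.pi, n))     # "true" space-dependent force
delta = 1e-3                                  # assumed pointwise noise level
g = A @ f_true + delta * rng.normal(size=n)   # noisy observation

omega = 1.0 / np.linalg.norm(A, 2) ** 2       # step size, 0 < omega < 2 / ||A||^2
tau = 1.1                                     # discrepancy-principle safety factor
noise_norm = delta * np.sqrt(n)               # estimated norm of the noise
f = np.zeros(n)
for k in range(50000):
    residual = A @ f - g
    if np.linalg.norm(residual) <= tau * noise_norm:   # stop once residual ~ noise level
        break
    f -= omega * A.T @ residual               # Landweber update: f_{k+1} = f_k - omega * A^T (A f_k - g)
print(k, np.linalg.norm(f - f_true))          # iterations used and reconstruction error
```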
Abstract:
The paper provides an axiomatic analysis of some scoring procedures based on paired comparisons. Several methods have been proposed for these generalized tournaments, and some of them have also been characterized by a set of properties. The choice of an appropriate method is supported by a discussion of their theoretical properties. Here we focus on monotonicity with respect to the compared objects' performance, starting from the axioms of self-consistency and self-consistent monotonicity. Possible weakenings and extensions of these properties are presented, along with an impossibility theorem: self-consistency contradicts independence of irrelevant matches. The satisfiability of the properties is examined for three scoring procedures, the classical score, the generalized row sum that refines it, and the least squares method, each of which can be computed as the solution of a system of linear equations. The results contribute new considerations to the problem of choosing a proper paired-comparison-based scoring method.
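A minimal sketch of the least squares scoring procedure mentioned above, on invented paired-comparison data: the ratings q solve the Laplacian system L q = s under the normalization sum(q) = 0, where s is the score vector and L the Laplacian of the comparison multigraph.

```python
# Sketch: least squares ratings from paired comparisons (synthetic data).
import numpy as np

# results[i, j] = aggregate score of object i against object j (antisymmetric),
# matches[i, j] = number of comparisons between i and j; both are invented.
results = np.array([[ 0,  2,  1, -1],
                    [-2,  0,  1,  0],
                    [-1, -1,  0,  2],
                    [ 1,  0, -2,  0]], dtype=float)
matches = np.array([[0, 3, 2, 1],
                    [3, 0, 2, 1],
                    [2, 2, 0, 3],
                    [1, 1, 3, 0]], dtype=float)

s = results.sum(axis=1)                        # score vector
L = np.diag(matches.sum(axis=1)) - matches     # Laplacian of the comparison multigraph
# L is singular (its row sums are zero); adding the all-ones matrix enforces sum(q) = 0.
n = len(s)
q = np.linalg.solve(L + np.ones((n, n)), s)    # least squares ratings
print(q, q.sum())                              # ratings and their (zero) sum
```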
Abstract:
This study examined the effects of computer-assisted instruction (CAI), 1 hour per week for 18 weeks, on changes in computational scores and attitudes of developmental mathematics students at schools with predominantly Black enrollment. Comparisons were made between students using CAI with differing software (PLATO, CSR, or both together) and students using traditional instruction (TI) only. The study was conducted in the Dade County Public School System from February through June 1991 at two senior high schools. The dependent variables, the State Student Assessment Test (SSAT) and the School Subjects Attitude Scales (SSAS), measured students' computational scores and their attitudes toward mathematics in three categories (interest, usefulness, and difficulty), respectively. Univariate analyses of variance were performed on the least squares mean differences from pretest to posttest to test main effects and interactions, and a t-test measured significant main effects and interactions. Results were interpreted at the .01 level of significance. Null hypotheses 1, 2, and 3 compared versions of CAI with the control group for changes in mathematical computation scores measured with the SSAT. It could not be concluded that changes in standardized mathematics test scores of students using CAI with differing software 1 hour per week for 18 class hours combined with TI were significantly higher than changes in test scores for students receiving TI only. Null hypotheses 4, 5, and 6 tested the effects of CAI on attitudes toward mathematics for the experimental groups against the control groups, measured with the SSAS. Changes in attitudes toward mathematics of students using CAI with differing software 1 hour per week for 18 class hours combined with TI were not significantly higher than attitude changes for students receiving TI only. Teacher effect on students' computational scores was a more influential variable than CAI. No interaction was found between gender and learning method on standardized mathematics test scores (null hypothesis 7).
Abstract:
Digital systems can generate left and right audio channels that create the effect of virtual sound source placement (spatialization) by processing an audio signal through pairs of Head-Related Transfer Functions (HRTFs) or, equivalently, Head-Related Impulse Responses (HRIRs). The spatialization effect is better when individually measured HRTFs or HRIRs are used than when generic ones (e.g., from a mannequin) are used. However, the measurement process is not available to the majority of users. There is ongoing interest in finding mechanisms to customize HRTFs or HRIRs to a specific user in order to achieve an improved spatialization effect for that subject. Unfortunately, the current models used for HRTFs and HRIRs contain over a hundred parameters, and none of those parameters can be easily related to the characteristics of the subject. This dissertation proposes an alternative model for the representation of HRTFs, which contains at most 30 parameters, all of which have a defined functional significance. It also presents methods to obtain the values of the parameters in the model that make it approximately equivalent to an individually measured HRTF. This conversion is achieved by the systematic deconstruction of HRIR sequences through an augmented version of the Hankel Total Least Squares (HTLS) decomposition approach. An average 95% match (fit) was observed between the original HRIRs and those reconstructed from the Damped and Delayed Sinusoids (DDSs) found by the decomposition process, for ipsilateral source locations. The dissertation also introduces and evaluates an HRIR customization procedure, based on a multilinear model implemented through a 3-mode tensor, for mapping anatomical data from the subjects to the HRIR sequences at different sound source locations. This model uses the Higher-Order Singular Value Decomposition (HOSVD) method to represent the HRIRs and is capable of generating customized HRIRs from easily attainable anatomical measurements of a new intended user of the system. Listening tests were performed to compare the spatialization performance of customized, generic, and individually measured HRIRs when used for synthesized spatial audio. Statistical analysis of the results confirms that the type of HRIRs used for spatialization is a significant factor in spatialization success, with the customized HRIRs yielding better results than generic HRIRs.
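A simplified sketch of the kind of Hankel/SVD decomposition into damped sinusoids described above, applied to a synthetic impulse response. It uses an ordinary least squares shift-invariance step (HSVD-style) rather than the augmented total least squares variant of the dissertation, and the test signal and model order are assumptions.

```python
# Sketch: Hankel/SVD decomposition of an impulse response into damped sinusoids
# (HSVD-style; the dissertation's HTLS uses a total least squares shift-invariance step).
import numpy as np
from scipy.linalg import hankel

def damped_sinusoid_fit(x, order):
    """Approximate signal x by `order` damped complex exponentials."""
    N = len(x)
    L = N // 2
    H = hankel(x[:L], x[L - 1:])                    # L x (N - L + 1) Hankel matrix of the signal
    U, _, _ = np.linalg.svd(H, full_matrices=False)
    Ur = U[:, :order]                               # signal subspace
    A, *_ = np.linalg.lstsq(Ur[:-1], Ur[1:], rcond=None)   # shift-invariance (ordinary LS)
    poles = np.linalg.eigvals(A)                    # z_k = exp(-d_k + 1j * w_k)
    Z = np.vander(poles, N, increasing=True).T      # Vandermonde basis z_k ** n
    amps, *_ = np.linalg.lstsq(Z, x, rcond=None)    # complex amplitudes by least squares
    return (Z @ amps).real                          # reconstructed signal

rng = np.random.default_rng(2)
n = np.arange(256)
hrir = np.exp(-0.01 * n) * np.cos(0.3 * n) + 0.5 * np.exp(-0.02 * n) * np.cos(0.8 * n + 1.0)
fit = damped_sinusoid_fit(hrir, order=4)
print(np.corrcoef(hrir, fit)[0, 1])                 # match between original and reconstruction
```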
Abstract:
Quantitative Structure-Activity Relationship (QSAR) analysis has been applied extensively in predicting the toxicity of Disinfection By-Products (DBPs) in drinking water. Among many toxicological properties, the acute and chronic toxicities of DBPs have been widely used in health risk assessment of DBPs. These toxicities are correlated with molecular properties, which in turn are usually correlated with molecular descriptors. The primary goals of this thesis are: (1) to investigate the effects of molecular descriptors (e.g., chlorine number) on molecular properties such as the energy of the lowest unoccupied molecular orbital (ELUMO) via QSAR modelling and analysis; (2) to validate the models by using internal and external cross-validation techniques; and (3) to quantify the model uncertainties through Taylor and Monte Carlo simulation. One of the most important ways to predict molecular properties such as ELUMO is QSAR analysis. In this study, the number of chlorine atoms (NCl) and the number of carbon atoms (NC), as well as the energy of the highest occupied molecular orbital (EHOMO), are used as molecular descriptors. There are typically three approaches used in QSAR model development: (1) Linear or Multi-linear Regression (MLR); (2) Partial Least Squares (PLS); and (3) Principal Component Regression (PCR). In QSAR analysis, a very critical step is model validation, performed after QSAR models are established and before applying them to toxicity prediction. The DBPs studied comprise five chemical classes, including chlorinated alkanes, alkenes, and aromatics. In addition, validated QSARs are developed to describe the toxicity of selected groups of DBP chemicals (i.e., chloro-alkanes and aromatic compounds with a nitro or cyano group) to three types of organisms (e.g., fish, T. pyriformis, and P. phosphoreum) based on experimental toxicity data from the literature. The results show that: (1) QSAR models to predict molecular properties built by MLR, PLS, or PCR can be used either to select valid data points or to eliminate outliers; (2) the Leave-One-Out cross-validation procedure by itself is not enough to give a reliable representation of the predictive ability of the QSAR models, but Leave-Many-Out/k-fold cross-validation and external validation can be applied together to achieve more reliable results; (3) ELUMO is shown to correlate highly with NCl for several classes of DBPs; and (4) according to uncertainty analysis using the Taylor method, the uncertainty of the QSAR models is contributed mostly by NCl for all DBP classes.
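The sketch below illustrates the validation point made above: a multi-linear regression on synthetic stand-ins for the NCl, NC, and EHOMO descriptors, evaluated with both leave-one-out and k-fold cross-validation via scikit-learn. The data and coefficients are invented and are not the thesis data set.

```python
# Sketch: multi-linear QSAR-style regression with LOO and k-fold cross-validation (synthetic data).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, KFold, cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(40, 3))                                    # columns: hypothetical NCl, NC, EHOMO
y = 1.5 * X[:, 0] - 0.4 * X[:, 1] + 0.2 * X[:, 2] + 0.1 * rng.normal(size=40)   # e.g. ELUMO

model = LinearRegression()
mse_loo = -cross_val_score(model, X, y, cv=LeaveOneOut(),
                           scoring="neg_mean_squared_error").mean()
r2_kfold = cross_val_score(model, X, y, cv=KFold(n_splits=5, shuffle=True, random_state=0),
                           scoring="r2").mean()
# LOO alone can be over-optimistic; k-fold and external validation give a fuller picture.
print(mse_loo, r2_kfold)
```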
Abstract:
Ellipsometry is a well-known optical technique used for the characterization of reflective surfaces under study and of films between two media. It is based on measuring the change in the state of polarization that occurs as a beam of polarized light is reflected from or transmitted through the film. This change can be used to calculate parameters of a single-layer film, such as its thickness and refractive index. However, extracting these parameters of interest requires significant numerical processing because the governing equations are not directly invertible. Typically, this is done using least squares solvers, which are slow and adversely affected by local minima on the solution surface. This thesis describes the development and implementation of a new technique that uses only Artificial Neural Networks (ANNs) to calculate thin-film parameters. The new method is orders of magnitude faster than preceding methods, and convergence to local minima is completely eliminated.
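A toy sketch of the ANN-inversion idea: a small neural network is trained to map simulated measurements back to film parameters, in place of an iterative least squares fit. The forward function below is a deliberately simple placeholder, not the actual ellipsometric equations, and all parameter ranges are assumptions.

```python
# Sketch: neural-network inversion of a toy forward model (placeholder for the ellipsometry equations).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)
params = rng.uniform([50.0, 1.3], [500.0, 2.5], size=(5000, 2))   # thickness (nm), refractive index

def toy_forward(p):
    d, n_idx = p[:, 0], p[:, 1]
    psi = 0.001 * d * n_idx                    # placeholder stand-ins for the measured (psi, delta)
    delta = 0.01 * d - n_idx
    return np.column_stack([psi, delta])

measurements = toy_forward(params)
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
net.fit(measurements, params)                  # learn the inverse map directly from simulated data
# Roughly recovers thickness 200 nm and index 1.8 (accuracy depends on training).
print(net.predict(toy_forward(np.array([[200.0, 1.8]]))))
```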
Abstract:
In recent decades the study of integer-valued time series has gained notoriety due to its broad applicability (modeling the number of car accidents on a given highway, or the number of people infected by a virus, are two examples). One of the main interests of this area of study is forecasting, and for this reason it is very important to propose methods that produce forecasts consisting of nonnegative integer values, in keeping with the discrete nature of the data. In this work, we focus on the study and proposal of forecasts one, two, and h steps ahead for integer-valued second-order autoregressive conditional heteroskedasticity [INARCH(2)] processes, and on determining some theoretical properties of this model, such as the ordinary moments of its marginal distribution and the asymptotic distribution of its conditional least squares estimators. In addition, we study, via Monte Carlo simulation, the behavior of the estimators of the parameters of INARCH(2) processes obtained using three different methods (Yule-Walker, conditional least squares, and conditional maximum likelihood), in terms of mean squared error, mean absolute error, and bias. We present some forecast proposals for INARCH(2) processes, which are also compared via Monte Carlo simulation. As an application of the proposed theory, we model a dataset on the number of live male births to mothers living in the city of Riachuelo, in the state of Rio Grande do Norte, Brazil.
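A compact sketch of the conditional least squares estimation mentioned above for an INARCH(2) process: the series is simulated with assumed parameter values, the CLS estimates come from an ordinary regression of X_t on its two lags, and the one-step forecast is the rounded conditional mean.

```python
# Sketch: simulation of an INARCH(2) process and conditional least squares (CLS) estimation.
import numpy as np

rng = np.random.default_rng(5)
a0, a1, a2 = 2.0, 0.3, 0.2                  # assumed parameters (a1 + a2 < 1 for stationarity)
T = 2000
x = np.zeros(T, dtype=int)
x[:2] = rng.poisson(a0 / (1 - a1 - a2), size=2)
for t in range(2, T):
    lam = a0 + a1 * x[t - 1] + a2 * x[t - 2]         # conditional mean
    x[t] = rng.poisson(lam)

# CLS: regress x_t on (1, x_{t-1}, x_{t-2}) by ordinary least squares.
X = np.column_stack([np.ones(T - 2), x[1:-1], x[:-2]])
theta, *_ = np.linalg.lstsq(X, x[2:], rcond=None)
print(theta)                                          # CLS estimates of (alpha0, alpha1, alpha2)

# One-step-ahead forecast: conditional mean, rounded to keep the forecast integer-valued.
forecast = int(np.rint(theta[0] + theta[1] * x[-1] + theta[2] * x[-2]))
print(forecast)
```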
Abstract:
Trace gases are important to our environment even though they are present only in trace amounts, and their concentrations must be monitored so that any necessary interventions can be made at the right time. Their concentrations must stay within the lower and upper bounds that sustain favorable conditions for life, which makes monitoring trace gases an essential task, accomplished nowadays by many techniques. One of them is differential optical absorption spectroscopy (DOAS), which mathematically consists of a regression (the classical method uses least squares) to retrieve the trace gas concentrations. In order to achieve better results, many works have tried out different techniques instead of the classical approach. Some have preprocessed the signals to be analyzed with a denoising procedure, e.g. the discrete wavelet transform (DWT). This work presents a semi-empirical study to find the most suitable DWT family to be used in this denoising. The search considers many well-known wavelet families and seeks the one that best removes the noise while keeping the original signal's main features; by decreasing the noise, the residual left after the regression also decreases. The analysis takes into account the wavelet decomposition level, the threshold to be applied to the detail coefficients, and how it is applied (hard or soft thresholding). The signals used come from an open online database containing characteristic signals of commonly studied trace gases.
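A minimal sketch of the workflow discussed above, wavelet denoising followed by a classical least squares retrieval, using PyWavelets. The "cross sections", noise level, wavelet family (db4), decomposition level, and threshold are all assumptions for the example, not results of the study.

```python
# Sketch: DWT denoising followed by a least squares DOAS-style retrieval (synthetic data).
import numpy as np
import pywt

rng = np.random.default_rng(6)
wl = np.linspace(0, 1, 512)                                   # wavelength grid (arbitrary units)
sigma1 = np.exp(-((wl - 0.3) / 0.05) ** 2)                    # synthetic differential cross sections
sigma2 = np.exp(-((wl - 0.6) / 0.08) ** 2)
true_c = np.array([2.0, 1.0])                                 # "true" concentrations
noisy = true_c[0] * sigma1 + true_c[1] * sigma2 + 0.1 * rng.normal(size=wl.size)

# Denoise: decompose, soft-threshold the detail coefficients, reconstruct.
coeffs = pywt.wavedec(noisy, "db4", level=4)
thr = 0.1 * np.sqrt(2 * np.log(noisy.size))                   # universal threshold with assumed sigma
coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
denoised = pywt.waverec(coeffs, "db4")[: wl.size]

# Classical step: least squares regression against the reference cross sections.
A = np.column_stack([sigma1, sigma2])
c_hat, *_ = np.linalg.lstsq(A, denoised, rcond=None)
print(c_hat)                                                  # retrieved concentrations ~ [2.0, 1.0]
```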
Abstract:
Monoaromatic compounds are toxic substances present in petroleum derivatives and used broadly in the chemical and petrochemical industries. These compounds are continuously released into the environment, contaminating the soil and water sources and potentially rendering these water resources unusable because of their high carcinogenic and mutagenic potential, since even at low concentrations BTEX may cause serious health issues. Therefore, it is extremely important to develop and search for new methodologies that assist and enable the treatment of BTEX-contaminated matrices. Bioremediation consists of the use of microbial groups capable of degrading hydrocarbons, promoting mineralization, in other words the permanent destruction of the residues, eliminating the risk of future contamination. This work investigated the biodegradation kinetics of water-soluble monoaromatic compounds (benzene, toluene, and ethylbenzene) based on the evaluation of their consumption by the bacterium Pseudomonas aeruginosa, for concentrations varying from 40 to 200 mg/L. To do so, the performance of the Monod kinetic model for microbial growth was evaluated, and the material balance equations for batch operation were discretized and numerically solved by the fourth-order Runge-Kutta method. The kinetic parameters, obtained using the method of least squares as the statistical criterion, were consistent with those reported in the literature and showed that the microorganism has the greatest affinity for ethylbenzene. Thus, the Monod model can predict the experimental data for the individual biodegradation of the BTEX substrates and can be applied to the optimization of biodegradation processes for toxic compounds in different types of bioreactors and under different operating conditions.
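A sketch of the numerical workflow described above: the batch Monod balances are integrated with SciPy's Runge-Kutta solver (standing in for a hand-coded fourth-order scheme), and the kinetic parameters are recovered by least squares. All parameter values and the "measurements" are synthetic assumptions, not the study's data.

```python
# Sketch: batch Monod balances integrated by Runge-Kutta and fitted by least squares (synthetic data).
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

def monod_rhs(t, y, mu_max, Ks, Y):
    X, S = y                                    # biomass and substrate concentrations (mg/L)
    mu = mu_max * S / (Ks + S)                  # Monod specific growth rate
    return [mu * X, -mu * X / Y]                # dX/dt, dS/dt

t_obs = np.linspace(0, 10, 15)
true = (0.5, 30.0, 0.6)                         # assumed mu_max (1/h), Ks (mg/L), yield
sol = solve_ivp(monod_rhs, (0, 10), [10.0, 100.0], args=true, t_eval=t_obs, method="RK45")
rng = np.random.default_rng(7)
S_data = sol.y[1] + rng.normal(scale=1.0, size=t_obs.size)    # noisy substrate "measurements"

def residuals(p):
    mu_max, Ks = p
    fit = solve_ivp(monod_rhs, (0, 10), [10.0, 100.0], args=(mu_max, Ks, true[2]),
                    t_eval=t_obs, method="RK45")
    return fit.y[1] - S_data

res = least_squares(residuals, x0=[0.3, 20.0], bounds=([0.01, 1.0], [2.0, 200.0]))
print(res.x)                                    # estimated (mu_max, Ks)
```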
Abstract:
Based on the quantitative analysis of diatom assemblages preserved in 274 surface sediment samples recovered in the Pacific, Atlantic, and western Indian sectors of the Southern Ocean, we have defined a new reference database for the quantitative estimation of late-middle Pleistocene Antarctic sea ice fields using the transfer function technique. Detrended Canonical Analysis (DCA) of the diatom data set points to a unimodal distribution of the diatom assemblages. Canonical Correspondence Analysis (CCA) indicates that winter sea ice (WSI), and also summer sea surface temperature (SSST), represent the most prominent environmental variables controlling the spatial species distribution. To test the applicability of transfer functions for sea ice reconstruction in terms of concentration and occurrence probability, we applied four different methods, the Imbrie and Kipp Method (IKM), the Modern Analog Technique (MAT), Weighted Averaging (WA), and Weighted Averaging Partial Least Squares (WAPLS), using logarithm-transformed diatom data and satellite-derived (1981-2010) sea ice data as a reference. The best performance for IKM was obtained using a subset of 172 samples with 28 diatom taxa/taxa groups, quadratic regression, and a three-factor model (IKM-D172/28/3q), resulting in root mean square errors of prediction (RMSEP) of 7.27% and 11.4% for WSI and summer sea ice (SSI) concentration, respectively. MAT estimates were calculated with different numbers of analogs (4, 6) using a 274-sample/28-taxa reference data set (MAT-D274/28/4an, -6an), resulting in RMSEPs ranging from 5.52% (4an) to 5.91% (6an) for WSI and from 8.93% (4an) to 9.05% (6an) for SSI. WA and WAPLS performed less well with the D274 data set compared to MAT, achieving WSI concentration RMSEPs of 9.91% with WA and 11.29% with WAPLS, which recommends the use of IKM and MAT. The application of IKM and MAT to the surface sediment data revealed strong relations to the satellite-derived winter and summer sea ice fields. Sea ice reconstructions performed on an Atlantic and a Pacific Southern Ocean sediment core, both documenting sea ice variability over the past 150,000 years (MIS 1 - MIS 6), resulted in similar glacial/interglacial trends of the IKM- and MAT-based sea ice estimates. On average, however, the IKM estimates display smaller WSI and slightly higher SSI concentration and probability, at lower variability, in comparison with MAT. This pattern results from the different estimation techniques: IKM integrates the WSI and SSI signals into a single factor assemblage, whereas MAT selects specific individual samples and thus stays closer to the original diatom database and the variability it contains. In contrast to the estimation of WSI, reconstructions of past SSI variability remain weaker. Combined with diatom-based estimates, the abundance and flux patterns of biogenic opal represent an additional indication of the WSI and SSI extent.
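To make the transfer function idea concrete, the sketch below implements a Modern Analog Technique style estimate on synthetic data: for a fossil assemblage, the winter sea ice value is taken as a distance-weighted mean over the k closest modern analogs by squared chord distance. The dissimilarity metric, the distance weighting, and all data are assumptions for illustration, not the study's implementation.

```python
# Sketch: MAT-style sea ice estimate from the k nearest modern analogs (synthetic data).
import numpy as np

rng = np.random.default_rng(8)
modern = rng.dirichlet(np.ones(28), size=274)        # 274 surface samples x 28 taxa (relative abundances)
wsi = rng.uniform(0, 100, size=274)                  # satellite-derived winter sea ice concentration (%)
fossil = rng.dirichlet(np.ones(28))                  # one downcore sample

def squared_chord(a, b):
    """Squared chord distance between two relative-abundance vectors."""
    return np.sum((np.sqrt(a) - np.sqrt(b)) ** 2)

dists = np.array([squared_chord(fossil, m) for m in modern])
k = 4                                                # number of analogs (4 or 6 in the study)
nearest = np.argsort(dists)[:k]
estimate = np.average(wsi[nearest], weights=1.0 / (dists[nearest] + 1e-12))   # distance-weighted mean
print(estimate)                                      # reconstructed WSI for the fossil sample
```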