995 results for nonlinear multivariate analysis
Abstract:
2016
Abstract:
The top quark is one of the fundamental particles of the Standard Model and is observed at the LHC in the highest-energy collisions. In particular, the top-antitop pair (tt̄) is produced via the strong interaction from gluon-gluon (gg) events or from quark-antiquark (qq̄) collisions. The different production mechanisms yield pairs with different properties: one example is the spin state of the tt̄ pair, which near the production threshold is more strongly correlated in the case of a gg event. A study that aims to measure the size of these correlations is therefore made significantly easier by a method that discriminates the resulting pairs according to their production channel. The work presented here thus aims to provide a tool for performing this separation using multivariate analysis techniques. Such methods are often applied to separate a signal from a background that hinders the analysis, in this case gg and qq̄ events respectively; this is known as a classification problem. The performance of several analysis algorithms was therefore studied, examining the distributions of numerous variables associated with the tt̄ pair production process. The best one was then selected on the basis of its efficiency in recognizing signal events and its rejection of background events. For this thesis, the best-performing algorithm is Boosted Decision Trees, which makes it possible to obtain, from a sample with an initial purity of 0.81, a final purity of 0.92, at the cost of an efficiency reduced to 0.74.
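The figures quoted in this abstract can be related by simple arithmetic. The sketch below assumes that the quoted efficiency of 0.74 is the signal (gg) efficiency and that purity means the signal fraction of the sample; neither is stated explicitly in the abstract. Under those assumptions it derives the background (qq̄) rejection implied by the purity gain. The function name `implied_background_rejection` is hypothetical and not taken from the thesis.

```python
def implied_background_rejection(purity_in, purity_out, signal_eff):
    """Background rejection implied by a purity gain at a given signal efficiency.

    Assumes purity = S / (S + B) and that signal_eff is the fraction of
    signal events kept by the selection (an assumption, not stated in the abstract).
    """
    signal_in = purity_in              # signal fraction before the selection
    background_in = 1.0 - purity_in    # background fraction before the selection
    signal_out = signal_eff * signal_in
    # Solve purity_out = signal_out / (signal_out + background_out).
    background_out = signal_out * (1.0 - purity_out) / purity_out
    background_eff = background_out / background_in
    return 1.0 - background_eff


rejection = implied_background_rejection(0.81, 0.92, 0.74)
print(f"implied background rejection: {rejection:.2f}")
```

Read this way, raising the purity from 0.81 to 0.92 while keeping 74% of the signal corresponds to discarding roughly three quarters of the background.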
Abstract:
Principal curves were defined by Hastie and Stuetzle (JASA, 1989) as smooth curves passing through the middle of a multidimensional dataset. They are nonlinear generalizations of the first principal component, a characterization of which is the basis for the principal curves definition. In this paper we propose an alternative approach based on a different property of principal components. Consider a point in the space where a multivariate normal is defined and, for each hyperplane containing that point, compute the total variance of the normal distribution conditioned to belong to that hyperplane. Now choose the hyperplane minimizing this conditional total variance and look for the corresponding conditional mean. The first principal component of the original distribution passes through this conditional mean and is orthogonal to that hyperplane. This property is easily generalized to data sets with nonlinear structure. Repeating the search from different starting points, many points analogous to conditional means are found. We call them principal oriented points. When a one-dimensional curve runs through the set of these special points, it is called a principal curve of oriented points. Successive principal curves are recursively defined from a generalization of the total variance.
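The property of the first principal component described in this abstract can be written out explicitly for the normal case. What follows is a sketch of the standard conditional-normal calculation; the notation ($\mu$, $\Sigma$, $b$, $\lambda_i$, $v_i$, $\mathrm{TV}$) is introduced here for illustration and is not taken from the paper, and $\Sigma$ is assumed positive definite with a simple largest eigenvalue.

```latex
% Setting: X ~ N(mu, Sigma), eigenvalues lambda_1 > lambda_2 >= ... >= lambda_p > 0,
% unit eigenvectors v_1, ..., v_p; H(x, b) = { y : b^T y = b^T x } is the
% hyperplane through the point x with unit normal b.
\begin{align*}
\mathbb{E}\left[X \mid X \in H(x,b)\right]
  &= \mu + \frac{\Sigma b \, b^{\top}(x-\mu)}{b^{\top}\Sigma b}, \\
\operatorname{Cov}\left[X \mid X \in H(x,b)\right]
  &= \Sigma - \frac{\Sigma b \, b^{\top} \Sigma}{b^{\top}\Sigma b}, \\
\mathrm{TV}(b)
  &= \operatorname{tr}(\Sigma) - \frac{b^{\top}\Sigma^{2} b}{b^{\top}\Sigma b}.
\end{align*}
% Substituting c = Sigma^{1/2} b turns the last ratio into a Rayleigh quotient:
\begin{align*}
\frac{b^{\top}\Sigma^{2} b}{b^{\top}\Sigma b}
  = \frac{c^{\top}\Sigma c}{c^{\top} c} \le \lambda_1,
  \qquad \text{with equality if and only if } b = \pm v_1 .
\end{align*}
% Hence the minimizing hyperplane is orthogonal to v_1, and
\begin{align*}
\min_{b} \mathrm{TV}(b) = \sum_{i=2}^{p} \lambda_i,
\qquad
\mathbb{E}\left[X \mid X \in H(x, v_1)\right] = \mu + v_1 v_1^{\top}(x-\mu),
\end{align*}
% which is the orthogonal projection of x onto the line mu + t v_1,
% i.e. a point on the first principal component.
```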
Abstract:
Penetration resistance (PR) is a soil attribute that allows the identification of areas with restrictions due to compaction, which results in mechanical impedance to root growth and reduced crop yield. The aim of this study was to characterize the PR of an agricultural soil by geostatistical and multivariate analysis. Sampling was done randomly at 90 points down to a depth of 0.60 m. Spatial distribution models of PR were determined, and areas with mechanical impedance to root growth were defined. The PR showed a random distribution at depths of 0.55 and 0.60 m. PR at the other depths analyzed showed spatial dependence, fitted by exponential and spherical models. The cluster analysis of the sampling points made it possible to establish areas with compaction problems, which were identified in the maps produced by kriging interpolation. The principal component analysis identified three soil layers, of which the middle layer showed the highest values of PR.
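The exponential and spherical models mentioned in this abstract are standard semivariogram forms. The sketch below writes both out in plain Python. The abstract does not report the fitted nugget, sill, or range, so the parameter values used here are purely illustrative, and the exponential model is written with the practical-range convention (factor 3), which is only one of several conventions in use.

```python
import math


def spherical(h, nugget, partial_sill, range_):
    """Spherical semivariogram: reaches the sill exactly at h = range_."""
    if h <= 0:
        return 0.0  # gamma(0) = 0 by definition; the nugget applies for h > 0
    if h >= range_:
        return nugget + partial_sill
    r = h / range_
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)


def exponential(h, nugget, partial_sill, range_):
    """Exponential semivariogram (practical range): about 95% of the partial sill at h = range_."""
    if h <= 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-3.0 * h / range_))


# Illustrative parameters only (not values from the study).
for h in (0.0, 5.0, 10.0, 20.0):
    print(h,
          round(spherical(h, 0.1, 0.9, 10.0), 3),
          round(exponential(h, 0.1, 0.9, 10.0), 3))
```

A depth with a "random distribution" in the sense of the abstract corresponds to a pure nugget effect, where the semivariance shows no structured increase with distance.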
Abstract:
This paper reports the use of chromatographic profiles of volatiles to determine disease markers in plants - in this case, leaves of Eucalyptus globulus infected by the necrotrophic fungus Teratosphaeria nubilosa. The volatile fraction was isolated by headspace solid phase microextraction (HS-SPME) and analyzed by comprehensive two-dimensional gas chromatography-fast quadrupole mass spectrometry (GC×GC-qMS). To correlate the metabolic profile described by the chromatograms with the presence of the infection, unfolded-partial least squares discriminant analysis (U-PLS-DA) with orthogonal signal correction (OSC) was employed. The proposed method was verified to be independent of factors such as the age of the harvested plants. Manipulation of the resulting mathematical model also yielded graphic representations similar to real chromatograms, which allowed the tentative identification of more than 40 compounds potentially useful as disease biomarkers for this plant/pathogen pair. The proposed methodology can be considered highly reliable, since the diagnosis is based on the whole chromatographic profile rather than on the detection of a single analyte. © 2013 Elsevier B.V.
Abstract:
Concentrations of 39 organic compounds were determined in three fractions (head, heart and tail) obtained from the pot still distillation of fermented sugarcane juice. The results were evaluated using analysis of variance (ANOVA), Tukey's test, principal component analysis (PCA), hierarchical cluster analysis (HCA) and linear discriminant analysis (LDA). According to PCA and HCA, the experimental data lead to the formation of three clusters. The head fractions give rise to a more defined group. The heart and tail fractions showed some overlap, consistent with their acid composition. The predictive abilities in calibration and validation of the LDA model for classifying the three fractions were 90.5% and 100%, respectively. This model recognized as heart fractions twelve of the thirteen commercial cachaças (92.3%) with good sensory characteristics, thus showing potential for guiding the process of cuts.
Abstract:
The concentrations of 39 organic compounds were determined in three fractions (head, heart and tail) obtained from the pot still distillation of fermented sugarcane juice. The results were evaluated using analysis of variance (ANOVA), Tukey's test, principal component analysis (PCA), hierarchical cluster analysis (HCA) and linear discriminant analysis (LDA). According to PCA and HCA, the experimental data lead to the formation of three clusters. The head fractions gave rise to a more defined group. The heart and tail fractions showed some overlap, consistent with their acid composition. The predictive abilities in calibration and validation of the models generated by LDA for classifying the three fractions were 90.5% and 100%, respectively. This model recognized as heart twelve of thirteen commercial cachaças (92.3%) with good sensory characteristics, showing potential for guiding the process of cuts.
Abstract:
Questionnaire data may contain missing values because certain questions do not apply to all respondents. For instance, questions addressing particular attributes of a symptom, such as frequency, triggers or seasonality, are only applicable to those who have experienced the symptom, while for those who have not, responses to these items will be missing. This missing information does not fall into the category 'missing by design', rather the features of interest do not exist and cannot be measured regardless of survey design. Analysis of responses to such conditional items is therefore typically restricted to the subpopulation in which they apply. This article is concerned with joint multivariate modelling of responses to both unconditional and conditional items without restricting the analysis to this subpopulation. Such an approach is of interest when the distributions of both types of responses are thought to be determined by common parameters affecting the whole population. By integrating the conditional item structure into the model, inference can be based both on unconditional data from the entire population and on conditional data from subjects for whom they exist. This approach opens new possibilities for multivariate analysis of such data. We apply this approach to latent class modelling and provide an example using data on respiratory symptoms (wheeze and cough) in children. Conditional data structures such as that considered here are common in medical research settings and, although our focus is on latent class models, the approach can be applied to other multivariate models.
Abstract:
In this article, a model for the determination of displacements, strains, and stresses of a submarine pipeline during its construction is presented. The model is typically applied to polyethylene outfall pipelines. The process is followed from an initial floating situation to the final laying position on the seabed. The following control variables are considered in the laying process: the axial load in the pipe, the flooded inner length, and the distance of the control barge from the coast. External loads such as self-weight, dead loads, and forces due to currents and small waves are also taken into account. This paper describes both the conceptual framework for the proposed model and its practical application in a real engineering situation. The authors also consider how the model might be used as a tool to study how sensitive the behavior of the pipeline is to small changes in the values of the control variables. A detailed description of the actions is given, especially those related to the marine environment such as buoyancy, current, and sea waves. The structural behavior of the pipeline is simulated in the framework of a geometrically nonlinear dynamic analysis. The pipeline is assumed to be a two-dimensional Navier-Bernoulli beam. In the nonlinear analysis an updated Lagrangian formulation is used, and special care is taken regarding the numerical aspects of seabed contact, follower forces due to external water pressures, and dynamic actions. The paper concludes by describing the implementation of the proposed techniques, using the ANSYS computer program with a number of subroutines developed by the authors. This implementation permits simulation of the two-dimensional structural pipe behavior over the whole construction process. A sensitivity analysis of the bending moments, axial forces, and stresses for different values of the control variables is carried out. Using the techniques described, the engineer may optimize the construction steps in the pipe laying process.
Abstract:
The elemental analysis of Spanish palm dates by inductively coupled plasma atomic emission spectrometry and inductively coupled plasma mass spectrometry is reported for the first time. To complete the information about the mineral composition of the samples, C, H, and N are determined by elemental analysis. Dates from Israel, Tunisia, Saudi Arabia, Algeria and Iran have also been analyzed. The elemental composition has been used in multivariate statistical analysis to discriminate the dates according to their geographical origin. A total of 23 elements (As, Ba, C, Ca, Cd, Co, Cr, Cu, Fe, H, In, K, Li, Mg, Mn, N, Na, Ni, Pb, Se, Sr, V, and Zn) at concentrations from major to ultra-trace levels have been determined in 13 date samples (flesh and seeds). A careful inspection of the results indicates that Spanish samples show higher concentrations of Cd, Co, Cr, and Ni than the remaining ones. Multivariate statistical analysis of the results obtained, in both flesh and seed, indicates that the proposed approach can be successfully applied to discriminate the Spanish date samples from the rest of the samples tested.
Abstract:
Many multifactorial biologic effects, particularly in the context of complex human diseases, are still poorly understood. At the same time, the systematic acquisition of multivariate data has become increasingly easy. The use of such data to analyze and model complex phenotypes, however, remains a challenge. Here, a new analytic approach is described, termed coreferentiality, together with an appropriate statistical test. Coreferentiality is the indirect relation of two variables of functional interest with respect to whether they parallel each other in their respective relatedness to multivariate reference data, which can be informative for a complex effect or phenotype. It is shown that the power of coreferentiality testing is comparable to that of multiple regression analysis, sufficient even when reference data are informative only to a relatively small extent of 2.5%, and clearly exceeds the power of simple bivariate correlation testing. Thus, coreferentiality testing uses the increased power of multivariate analysis, but in order to address a more straightforwardly interpretable bivariate relatedness. Systematic application of this approach could substantially improve the analysis and modeling of complex phenotypes, particularly in the context of human studies, where addressing functional hypotheses by direct experimentation is often difficult.
Abstract:
Transportation Department, Office of University Research, Washington, D.C.
Abstract:
Platelet count is a highly heritable trait with genetic factors responsible for around 80% of the phenotypic variance. We measured platelet count longitudinally in 327 monozygotic and 418 dizygotic twin pairs at 12, 14 and 16 years of age. We also performed a genome-wide linkage scan of these twins and their families in an attempt to localize QTLs that influenced variation in platelet concentrations. Suggestive linkage was observed on chromosome 19q13.13-19q13.31 at 12 (LOD=2.12, P=0.0009), 14 (LOD=2.23, P=0.0007) and 16 (LOD=1.01, P=0.016) years of age, and multivariate analysis of counts at all three ages increased the LOD to 2.59 (P=0.0003). A possible candidate in this region is the gene for glycoprotein VI, a receptor involved in platelet aggregation. Smaller linkage peaks were also seen at 2p, 5p, 5q, 10p and 15q. There was little evidence for linkage to the chromosomal regions containing the genes for thrombopoietin (3q27) and the thrombopoietin receptor (1p34), suggesting that polymorphisms in these genes do not contribute substantially to variation in platelet count between healthy individuals.
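The pointwise P values quoted alongside each LOD score can be related to the LOD by a common conversion for a single-parameter linkage test, in which 2 ln(10) × LOD is referred to a 50:50 mixture of a point mass at zero and a chi-square distribution with one degree of freedom. The sketch below assumes that convention; the abstract does not state how the P values were obtained, and the multivariate test in particular may use a different reference distribution. `lod_to_p` is a hypothetical helper name.

```python
import math


def lod_to_p(lod):
    """One-sided pointwise p-value for a LOD score under a 1-df test.

    Uses chi2 = 2 * ln(10) * LOD and p = 0.5 * P(chi2_1 > chi2),
    where P(chi2_1 > x) = erfc(sqrt(x / 2)).
    """
    chi2 = 2.0 * math.log(10.0) * lod
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


# LOD scores quoted in the abstract.
for lod in (2.12, 2.23, 1.01, 2.59):
    print(f"LOD = {lod:.2f} -> p = {lod_to_p(lod):.4f}")
```

Under this assumption the computed values are close to the quoted P values, which suggests, but does not establish, that the authors used a conversion of this kind.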