997 results for Statistical structures
Abstract:
Correspondence Analysis was adopted as a tool for investigating the statistical structure of hydrochemical and weathering datasets of groundwater samples, with the main purpose of identifying impacts on mineral weathering caused by anthropogenic activities, namely the fertilizing of farmlands. The hydrochemical dataset comprised measured concentrations of major inorganic compounds dissolved in groundwater, namely bicarbonate and silica (usually by-products of chemical weathering) and chloride, sulphate and nitrate (typically atmospheric plus anthropogenic inputs). The weathering dataset consisted of calculated mass transfers of minerals being dissolved in loess sediments of a region in SW Hungary (Szigetvár area), namely Na-plagioclase, calcite and dolomite, and of pollution-related concentrations of sodium, magnesium and calcium. A first run of Correspondence Analysis described groundwater composition in the study area as a system of triple influence, where spots of domestic-effluent-dominated chemistries are surrounded by areas with agriculture-dominated chemistries, both imprinted over large regions of weathering-dominated chemistries. A second run revealed that nitrification of N-fertilizers is promoting mineral weathering through the nitric acid reaction (anthropogenic pathway), concurrently with the retreat of weathering by carbonic acid (natural pathway). It also indicated that dolomite and calcite are taking part in a dedolomitization process driven by dissolution of gypsum fertilizers and nitrification of N-fertilizers. © 2013 Elsevier Ltd.
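The correspondence-analysis step can be sketched numerically as an SVD of the standardized residuals of a sample-by-solute table; the concentrations below are purely illustrative, not from the Szigetvár dataset:

```python
import numpy as np

# Hypothetical table: rows = groundwater samples, columns = dissolved species
# (bicarbonate, silica, chloride, sulphate, nitrate); values are illustrative.
X = np.array([
    [4.1, 0.8, 0.5, 0.4, 0.2],
    [3.9, 0.7, 0.6, 0.5, 0.3],
    [1.2, 0.3, 2.1, 1.8, 3.5],
    [1.0, 0.2, 2.4, 1.6, 3.9],
    [2.0, 0.4, 1.5, 2.5, 1.0],
], dtype=float)

P = X / X.sum()                  # correspondence matrix
r = P.sum(axis=1)                # row masses
c = P.sum(axis=0)                # column masses
E = np.outer(r, c)               # expected proportions under independence
S = (P - E) / np.sqrt(E)         # standardized (Pearson) residuals

U, sv, Vt = np.linalg.svd(S, full_matrices=False)

row_coords = (U * sv) / np.sqrt(r)[:, None]     # principal row coordinates
col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]  # principal column coordinates

inertia = sv ** 2                # axis inertias; their sum is the total inertia
print("share of inertia on first axis:", inertia[0] / inertia.sum())
```

Plotting `row_coords` (samples) and `col_coords` (solutes) on the first two axes is what lets chemistries dominated by weathering, agriculture, or domestic effluents be told apart.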
Abstract:
The paper proves necessary and sufficient conditions for a Fisher statistical structure to admit a consistent estimation of its parameters.
Abstract:
We examine a method recently proposed by Hinich and Patterson (mimeo, University of Texas at Austin, 1995) for testing the validity of specifying a GARCH error structure for financial time series data in the context of a set of ten daily Sterling exchange rates. The results demonstrate that there are statistical structures present in the data that cannot be captured by a GARCH model, or any of its variants. This result has important implications for the interpretation of the recent voluminous literature which attempts to model financial asset returns using this family of models.
Abstract:
Background: Recent advances in high-throughput technologies have produced a vast amount of protein sequences, while the number of high-resolution structures has seen only a limited increase. This has impelled the development of many strategies to build protein structures from their sequences, generating a considerable number of alternative models. The selection of the model closest to the native conformation has thus become crucial for structure prediction. Several methods have been developed to score protein models by energies, knowledge-based potentials, and combinations of both.
Results: Here, we present and demonstrate a theory to split knowledge-based potentials into biologically meaningful scoring terms and to combine them into new scores to predict near-native structures. Our strategy allows circumventing the problem of defining the reference state. In this approach we give the proof for a simple and linear application that can be further improved by optimizing the combination of Z-scores. Using the simplest composite score we obtained predictions similar to those of state-of-the-art methods. Besides, our approach has the advantage of identifying the most relevant terms involved in the stability of the protein structure. Finally, we also use the composite Z-scores to assess the conformation of models and to detect local errors.
Conclusion: We have introduced a method to split knowledge-based potentials and to solve the problem of defining a reference state. The new scores have detected near-native structures as accurately as state-of-the-art methods and have been successful in identifying wrongly modeled regions of many near-native conformations.
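The core idea — standardizing each split-out scoring term against a decoy population and combining the Z-scores linearly — can be sketched as follows; the term names, energies, and equal weights are hypothetical, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-term energies for 100 decoy models, for three scoring
# terms split out of a knowledge-based potential, plus one near-native model.
terms = ["pair", "solvation", "torsion"]
decoys = {t: rng.normal(loc=0.0, scale=1.0, size=100) for t in terms}
near_native = {"pair": -2.5, "solvation": -1.8, "torsion": -2.1}

def zscore(value, population):
    """Z-score of one model's term energy against the decoy population."""
    return (value - population.mean()) / population.std()

# Simple linear combination with equal weights; the weights are the part
# that could be optimized, as the abstract suggests.
weights = {t: 1.0 for t in terms}
composite = sum(weights[t] * zscore(near_native[t], decoys[t]) for t in terms)
print("composite Z-score:", composite)
```

A strongly negative composite Z-score flags the model as unusually favorable relative to the decoys; computed per region, the same quantity can point at locally mis-modeled segments.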
Abstract:
The crystal structures of an aspartic proteinase from Trichoderma reesei (TrAsP) and of its complex with a competitive inhibitor, pepstatin A, were solved and refined to crystallographic R-factors of 17.9% (R(free)=21.2%) at 1.70 angstrom resolution and 15.81% (R(free) = 19.2%) at 1.85 angstrom resolution, respectively. The three-dimensional structure of TrAsP is similar to structures of other members of the pepsin-like family of aspartic proteinases. Each molecule is folded in a predominantly beta-sheet bilobal structure with the N-terminal and C-terminal domains of about the same size. Structural comparison of the native structure and the TrAsP-pepstatin complex reveals that the enzyme undergoes an induced-fit, rigid-body movement upon inhibitor binding, with the N-terminal and C-terminal lobes tightly enclosing the inhibitor. Upon recognition and binding of pepstatin A, amino acid residues of the enzyme active site form a number of short hydrogen bonds to the inhibitor that may play an important role in the mechanism of catalysis and inhibition. The structures of TrAsP were used as a template for performing statistical coupling analysis of the aspartic protease family. This approach permitted, for the first time, the identification of a network of structurally linked residues putatively mediating conformational changes relevant to the function of this family of enzymes. Statistical coupling analysis reveals coevolved continuous clusters of amino acid residues that extend from the active site into the hydrophobic cores of each of the two domains and include amino acid residues from the flap regions, highlighting the importance of these parts of the protein for its enzymatic activity. (C) 2008 Elsevier Ltd. All rights reserved.
Abstract:
The results of a questionnaire survey of 3,578 young protesters aged 15 to 24 were used to create a typology of the motive structures of the young globalization critics who participated in protests against the G8 summit in Heiligendamm in June 2007. Eight groups with different motive structures identified using cluster analysis reveal the spectrum of motives of the young demonstrators, ranging from social and political idealism to hedonistic fun-seeking and nationalist motives. Despite the diversity of motives, two cross-cluster motives can be identified: the results clearly show that the majority of respondents were motivated by political idealism and rejected violence. Two overlapping minorities were found: one where political idealism was largely lacking, and another where violence was a prominent motive.
Abstract:
Using fixed-point arithmetic is one of the most common design choices for systems where area, power or throughput are heavily constrained. In order to produce implementations where the cost is minimized without negatively impacting the accuracy of the results, a careful assignment of word-lengths is required. The problem of finding the optimal combination of fixed-point word-lengths for a given system is a combinatorial NP-hard problem to which developers devote between 25 and 50% of the design-cycle time. Reconfigurable hardware platforms such as FPGAs also benefit from the advantages of fixed-point arithmetic, as it compensates for the slower clock frequencies and less efficient area utilization of the hardware platform with respect to ASICs. As FPGAs become commonly used for scientific computation, designs constantly grow larger and more complex, up to the point where they cannot be handled efficiently by current signal and quantization noise modelling and word-length optimization methodologies. In this Ph.D. Thesis we explore different aspects of the quantization problem and present new methodologies for each of them. Techniques based on extensions of intervals have made it possible to obtain accurate models of signal and quantization noise propagation in systems with non-linear operations. We take this approach a step further by introducing elements of Multi-Element Generalized Polynomial Chaos (ME-gPC) and combining them with a state-of-the-art statistical Modified Affine Arithmetic (MAA) based methodology in order to model systems that contain control-flow structures. Our methodology produces the different execution paths automatically, determines the regions of the input domain that will exercise them, and extracts the system's statistical moments from the partial results. We use this technique to estimate both the dynamic range and the round-off noise in systems with the aforementioned control-flow structures, and we show the good accuracy of our approach, which in some case studies with non-linear operators deviates by only 0.04% with respect to the simulation-based reference values. A known drawback of techniques based on extensions of intervals is the combinatorial explosion of terms as the size of the targeted systems grows, which leads to scalability problems. To address this issue we present a clustered noise injection technique that groups the signals in the system, introduces the noise terms in each group independently, and then combines the results at the end. In this way, the number of noise sources in the system at any given time is controlled and the combinatorial explosion is minimized. We also present a multi-way partitioning algorithm aimed at minimizing the deviation of the results due to the loss of correlation between noise terms, in order to keep the results as accurate as possible. This Ph.D. Thesis also covers the development of methodologies for word-length optimization based on Monte-Carlo simulations that run in reasonable times. We do so by presenting two novel techniques that approach the reduction of execution time from two different angles. First, the interpolative method applies a simple but precise interpolator to estimate the sensitivity of each signal, which is later used to guide the optimization effort. Second, the incremental method revolves around the fact that, although we strictly need to guarantee a given confidence level for the final results of the optimization process, we can use more relaxed levels, which in turn imply a considerably smaller number of samples per simulation, in the initial stages of the process, when we are still far from the optimized solution. Through these two approaches we demonstrate that the execution time of classical greedy search algorithms can be accelerated by factors of up to ×240 for small/medium-sized problems. Finally, this thesis introduces HOPLITE, an automated, flexible and modular framework for quantization that includes the implementation of the previous techniques and is provided for public access. The aim is to offer developers and researchers a common ground for easily prototyping and verifying new techniques for system modelling and word-length optimization. We describe its workflow, justify the design decisions taken, explain its public API, and give a step-by-step demonstration of its execution. We also show, through a simple example, how new extensions to the flow should be connected to the existing interfaces in order to expand and improve the capabilities of HOPLITE.
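As an illustration of the Monte-Carlo measurement that word-length optimization relies on, a toy round-off-noise estimate is sketched below; the datapath, the word-length, and the input ranges are our assumptions, not taken from the thesis or from HOPLITE:

```python
import numpy as np

def quantize(x, frac_bits):
    """Round x to a fixed-point grid with the given number of fractional bits."""
    step = 2.0 ** -frac_bits
    return np.round(x / step) * step

def datapath(x, y, frac_bits):
    """Toy non-linear datapath: every intermediate result is re-quantized."""
    a = quantize(x * y, frac_bits)
    b = quantize(a + x, frac_bits)
    return quantize(b * b, frac_bits)

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 100_000)
y = rng.uniform(-1, 1, 100_000)

reference = (x * y + x) ** 2               # double-precision reference
fixed = datapath(x, y, frac_bits=10)       # candidate: 10 fractional bits

noise_power = np.mean((fixed - reference) ** 2)
print("estimated round-off noise power:", noise_power)
```

An optimizer would repeat this measurement for many word-length assignments, which is why the interpolative and incremental techniques above focus on cutting the number and the cost of such simulations.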
Abstract:
The paper proposes a new application of non-parametric statistical processing of signals recorded from vibration tests for damage detection and evaluation on I-section steel segments. The steel segments investigated constitute the energy-dissipating part of a new type of hysteretic damper that is used for passive control of buildings and civil engineering structures subjected to earthquake-type dynamic loadings. Two I-section steel segments with different levels of damage were instrumented with piezoceramic sensors and subjected to controlled white-noise random vibrations. The signals recorded during the tests were processed using two non-parametric methods (the power spectral density method and the frequency response function method) that had never previously been applied to hysteretic dampers. The appropriateness of these methods for quantifying the level of damage on the I-section steel segments is validated experimentally. Based on the results of the random vibrations, the paper proposes a new index that predicts the level of damage and the proximity of failure of the hysteretic damper.
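The PSD-based comparison behind such a damage index can be sketched on synthetic signals; the resonance frequencies, the frequency band, and the index definition below are illustrative assumptions, not the paper's:

```python
import numpy as np

fs = 1000.0                       # sampling rate, Hz
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(2)

def response(f_res, noise_scale=0.5):
    """Synthetic vibration response: a resonance buried in white noise."""
    return np.sin(2 * np.pi * f_res * t) + noise_scale * rng.normal(size=t.size)

def psd(x):
    """One-sided periodogram (a very basic PSD estimate)."""
    X = np.fft.rfft(x)
    return (np.abs(X) ** 2) / (fs * x.size)

healthy = psd(response(f_res=120.0))    # undamaged segment
damaged = psd(response(f_res=110.0))    # damage shifts the resonance down

freqs = np.fft.rfftfreq(t.size, 1 / fs)
# Illustrative damage index: normalized PSD discrepancy over a band of interest.
band = (freqs > 80) & (freqs < 160)
index = np.sum(np.abs(damaged[band] - healthy[band])) / np.sum(healthy[band])
print("damage index:", index)
```

A growing index signals a growing spectral discrepancy between the current and the reference (undamaged) state, which is the kind of quantity the proposed failure-proximity index tracks.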
Abstract:
This paper describes a variety of statistical methods for obtaining precise quantitative estimates of the similarities and differences in the structures of semantic domains in different languages. The methods include comparing mean correlations within and between groups, principal components analysis of interspeaker correlations, and analysis of variance of speaker-by-question data. Methods for graphical displays of the results are also presented. The methods give convergent results that are mutually supportive and equivalent under suitable interpretation. The methods are illustrated on the semantic domain of emotion terms in a comparison of the semantic structures of native English and native Japanese speaking subjects. We suggest that, in comparative studies concerning the extent to which semantic structures are universally shared or culture-specific, both similarities and differences should be measured and compared rather than placing total emphasis on one or the other polar position.
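The first of these methods — comparing mean inter-speaker correlations within and between language groups — can be sketched as follows; the speakers, questions, and rating data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic ratings: 10 speakers per group answer 15 semantic-similarity
# questions. Each group shares a latent answer profile plus individual noise.
profile_a = rng.normal(size=15)
profile_b = profile_a + rng.normal(scale=1.5, size=15)  # related but distinct
group_a = profile_a + rng.normal(scale=0.5, size=(10, 15))
group_b = profile_b + rng.normal(scale=0.5, size=(10, 15))

def mean_corr(X, Y, same=False):
    """Mean Pearson correlation over all speaker pairs drawn from X and Y."""
    cs = []
    for i in range(len(X)):
        for j in range(len(Y)):
            if same and i == j:
                continue          # skip self-correlations within one group
            cs.append(np.corrcoef(X[i], Y[j])[0, 1])
    return float(np.mean(cs))

within_a = mean_corr(group_a, group_a, same=True)
within_b = mean_corr(group_b, group_b, same=True)
between = mean_corr(group_a, group_b)
print(within_a, within_b, between)
```

A between-group mean correlation well below the within-group means indicates a culture-specific component of the semantic structure; comparable values indicate a largely shared structure.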
Abstract:
Constant technology advances have caused a data explosion in recent years. Accordingly, modern statistical and machine learning methods must be adapted to deal with complex and heterogeneous data types. This is particularly true for analyzing biological data. For example, DNA sequence data can be viewed as categorical variables, with each nucleotide taking one of four categories. Gene expression data, depending on the quantification technology, can be continuous numbers or counts. With the advancement of high-throughput technology, the abundance of such data has become unprecedentedly rich. Therefore efficient statistical approaches are crucial in this big data era.
Previous statistical methods for big data often aim to find low-dimensional structures in the observed data. For example, in a factor analysis model a latent Gaussian-distributed multivariate vector is assumed. With this assumption, a factor model produces a low-rank estimate of the covariance of the observed variables. Another example is the latent Dirichlet allocation model for documents, in which the mixture proportions of topics, represented by a Dirichlet-distributed variable, are assumed. This dissertation proposes several novel extensions of these statistical methods, developed to address challenges in big data. The novel methods are applied in multiple real-world applications, including the construction of condition-specific gene co-expression networks, estimating shared topics among newsgroups, analysis of promoter sequences, analysis of political-economic risk data, and estimating population structure from genotype data.
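The factor-analysis example can be illustrated with a simple low-rank-plus-diagonal covariance estimate; this eigenvalue-truncation sketch is a stand-in for full maximum-likelihood factor analysis, and the simulated data are ours:

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulate n observations of p variables driven by k latent factors.
n, p, k = 2000, 20, 3
loadings = rng.normal(size=(p, k))
factors = rng.normal(size=(n, k))
noise = rng.normal(scale=0.3, size=(n, p))
X = factors @ loadings.T + noise

S = np.cov(X, rowvar=False)                 # sample covariance, p x p

# Low-rank part: keep the top-k eigenpairs; then put the remaining
# variance of each variable back on the diagonal (the "uniquenesses").
vals, vecs = np.linalg.eigh(S)
top = np.argsort(vals)[-k:]
low_rank = vecs[:, top] @ np.diag(vals[top]) @ vecs[:, top].T
estimate = low_rank + np.diag(np.diag(S - low_rank))

err_full = np.linalg.norm(S - low_rank)
err_with_diag = np.linalg.norm(S - estimate)
print(err_full, err_with_diag)
```

The rank-k part captures the shared structure induced by the latent factors, while the diagonal correction absorbs variable-specific noise, which is exactly the decomposition a factor model encodes.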
Abstract:
This article describes a finite element-based formulation for the statistical analysis of the response of stochastic structural composite systems whose material properties are described by random fields. A first-order technique is used to obtain the second-order statistics of the structural response, considering means and variances of the displacement and stress fields of plate or shell composite structures. Propagation of uncertainties depends on sensitivities, taken as measures of variation effects. The adjoint variable method is used to obtain the sensitivity matrix; this method is appropriate for composite structures because of the large number of random input parameters. Dominant effects on the stochastic characteristics are studied by analyzing the influence of the different random parameters. In particular, a study of the influence of anisotropy on the propagation of uncertainties in angle-ply composites is carried out based on the proposed approach.
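The first-order propagation of input variances through sensitivities can be sketched in a few lines; the response function, the parameter values, and the finite-difference gradient below are illustrative assumptions (the article uses the adjoint variable method on a finite element model):

```python
import numpy as np

# Illustrative response: a displacement-like quantity depending on two
# random material parameters, E (stiffness) and t (thickness).
def response(params):
    E, t = params
    return 1.0 / (E * t**3)

mean = np.array([70e9, 0.01])              # mean values of E [Pa] and t [m]
cov = np.diag([(5e9)**2, (0.0005)**2])     # input covariance (independent here)

def gradient(f, x, rel_step=1e-6):
    """Central finite-difference gradient (an adjoint method would give the
    same result more cheaply when there are many parameters)."""
    g = np.zeros_like(x)
    for i in range(x.size):
        h = rel_step * x[i]
        xp, xm = x.copy(), x.copy()
        xp[i] += h
        xm[i] -= h
        g[i] = (f(xp) - f(xm)) / (2 * h)
    return g

g = gradient(response, mean)
mean_response = response(mean)             # first-order mean estimate
var_response = g @ cov @ g                 # first-order variance estimate
print(mean_response, var_response)
```

This is the first-order second-moment scheme in miniature: the response variance is the quadratic form of the sensitivity vector with the input covariance.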
Abstract:
An approach for the analysis of uncertainty propagation in reliability-based design optimization of composite laminate structures is presented. Using the Uniform Design Method (UDM), a set of design points is generated over a domain centered on the mean reference values of the random variables. A methodology based on inverse optimal design of composite structures to achieve a specified reliability level is proposed, and the corresponding maximum load is obtained as a function of ply angle. Using the generated UDM design points as input/output patterns, an Artificial Neural Network (ANN) is developed based on an evolutionary learning process. A Monte Carlo simulation using the developed ANN is then performed to simulate the behavior of the critical Tsai number, the structural reliability index, and their relative sensitivities as functions of the ply angle of the laminates. The results are generated for uniformly distributed random variables on a domain centered on the mean values. The statistical analysis of the results enables the study of the variability of the reliability index and of its sensitivity relative to the ply angle. Numerical examples showing the utility of the approach for the robust design of angle-ply laminates are presented.
Abstract:
A novel framework for probabilistic structural assessment of existing structures, which combines model identification and reliability assessment procedures and considers different sources of uncertainty in an objective way, is presented in this paper. A short description of structural assessment applications reported in the literature is given first. Then, the developed model identification procedure, supported by a robust optimization algorithm, is presented. Special attention is given to both experimental and numerical errors, which are considered in this algorithm's convergence criterion. An updated numerical model is obtained from this process. The reliability assessment procedure, which considers a probabilistic model for the structure under analysis, is then introduced, incorporating the results of the model identification procedure. The developed model is then updated, as new data are acquired, through a Bayesian inference algorithm that explicitly addresses statistical uncertainty. Finally, the developed framework is validated with a set of reinforced concrete beams, which were loaded up to failure in the laboratory.
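The Bayesian updating step can be illustrated with the simplest conjugate case — a normal prior on a scalar model parameter updated with normally distributed measurements; the prior, the noise variance, and the data below are hypothetical:

```python
import numpy as np

# Conjugate normal-normal update: prior belief about a structural parameter
# theta (e.g. a stiffness multiplier), updated with noisy test measurements.
prior_mean, prior_var = 1.0, 0.2**2          # illustrative prior
meas_var = 0.1**2                            # assumed measurement noise variance
observations = np.array([0.92, 0.95, 0.90])  # hypothetical test data

n = observations.size
post_var = 1.0 / (1.0 / prior_var + n / meas_var)
post_mean = post_var * (prior_mean / prior_var + observations.sum() / meas_var)
print(post_mean, post_var)
```

Each new measurement shrinks the posterior variance, which is how statistical uncertainty is explicitly tracked as data accumulates; the real framework applies the same principle to a full probabilistic structural model rather than a single scalar.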
Abstract:
The preceding two editions of CoDaWork included talks on the possible consideration of densities as infinite compositions: Egozcue and Díaz-Barrero (2003) extended the Euclidean structure of the simplex to a Hilbert space structure of the set of densities within a bounded interval, and van den Boogaart (2005) generalized this to the set of densities bounded by an arbitrary reference density. From the many variations of the Hilbert structures available, we work with three cases. For bounded variables, a basis derived from Legendre polynomials is used. For variables with a lower bound, we standardize them with respect to an exponential distribution and express their densities as coordinates in a basis derived from Laguerre polynomials. Finally, for unbounded variables, a normal distribution is used as reference, and coordinates are obtained with respect to a Hermite-polynomials-based basis.
To get the coordinates, several approaches can be considered. A numerical accuracy problem occurs if one estimates the coordinates directly by using discretized scalar products. Thus we propose to use a weighted linear regression approach, where all k-order polynomials are used as predictor variables and the weights are proportional to the reference density. Finally, for the case of 2nd-order Hermite polynomials (normal reference) and 1st-order Laguerre polynomials (exponential reference), one can also derive the coordinates from their relationships to the classical mean and variance.
Apart from these theoretical issues, this contribution focuses on the application of this theory to two main problems in sedimentary geology: the comparison of several grain size distributions, and the comparison, among different rocks, of the empirical distribution of a property measured on a batch of individual grains from the same rock or sediment, such as their composition.
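For the unbounded case (normal reference, Hermite basis), the weighted-regression route to the coordinates can be sketched as follows; the grid, the target density, and the truncation at second order are illustrative choices:

```python
import numpy as np

# Reference: standard normal. We expand log(f / f_ref) in probabilists'
# Hermite polynomials, estimating the coordinates by a regression
# weighted (proportionally) by the reference density.
x = np.linspace(-4, 4, 801)
f_ref = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

# Illustrative target density: normal with shifted mean and inflated variance.
mu, sigma = 0.5, 1.2
f = np.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

# He_1(x) = x and He_2(x) = x^2 - 1 evaluated on the grid.
H = np.column_stack([x, x**2 - 1])

y = np.log(f / f_ref)
w = f_ref                                  # weights proportional to the reference
W = np.sqrt(w)[:, None]                    # weighted least squares via rescaling
design = W * np.column_stack([np.ones_like(x), H])
coords, *_ = np.linalg.lstsq(design, W[:, 0] * y, rcond=None)

intercept, c1, c2 = coords
print("Hermite coordinates:", c1, c2)
```

Because the target here is itself normal, the log-ratio is exactly quadratic and the two coordinates recover the shift and scale analytically (c1 = mu/sigma^2, c2 = 1/2 − 1/(2 sigma^2)), which is the mean-and-variance shortcut mentioned in the text.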
Abstract:
Over the past two decades, CT arthrography has benefited from numerous technological advances and now represents an excellent alternative to magnetic resonance imaging (MRI) and/or MR arthrography in the evaluation of hip pathologies. However, it remains limited by its substantial exposure to ionizing radiation. Iterative reconstruction (IR) techniques have recently been implemented successfully in imaging; the literature shows that their use helps reduce the dose by roughly 40 to 55% in spine CT, compared with current protocols using filtered back projection (FBP). To our knowledge, the use of IR techniques in hip CT arthrography had not been evaluated until now. The aim of our study was to evaluate the impact of the ASIR technique (GE Healthcare) on objective and subjective image quality in hip CT arthrography, and to assess its potential in terms of dose reduction. To that end, thirty-seven patients examined by hip CT arthrography were randomized into three groups: standard dose (CTDIvol = 38.4 mGy) and two reduced-dose groups (CTDIvol = 24.6 or 15.4 mGy). Images were reconstructed with filtered back projection (FBP) and then with increasing percentages of ASIR (30, 50, 70 and 90%). Noise and the contrast-to-noise ratio (CNR) were measured. Two radiologists specialized in musculoskeletal imaging independently evaluated image quality for several anatomical structures using a four-grade scale. They also evaluated labral and articular cartilage lesions. The results show that noise increases (p = 0.0009) and CNR decreases (p = 0.001) significantly as the dose decreases.
Conversely, noise decreases (p = 0.0001) and the contrast-to-noise ratio increases (p < 0.003) significantly as the ASIR percentage increases; there is also a significant increase in image-quality scores for the labrum, the cartilage, the subchondral bone and the overall image quality (above 50% ASIR), as well as for noise (p < 0.04), and a significant reduction for trabecular bone and muscles (p < 0.03). Regardless of the dose level, there is no significant difference in the detection and characterization of labral lesions (n = 24, p = 1) or cartilage lesions (n = 40, p > 0.89) as a function of the ASIR percentage. Our work showed that using more than 50% ASIR significantly reduces the radiation dose received by the patient during hip CT arthrography while maintaining diagnostic image quality comparable to that of a standard-dose protocol using filtered back projection.