990 results for Statistical structures
Abstract:
Correspondence Analysis was adopted as a tool for investigating the statistical structure of hydrochemical and weathering datasets of groundwater samples, with the main purpose of identifying impacts of anthropogenic activities, namely the fertilization of farmland, on mineral weathering. The hydrochemical dataset comprised measured concentrations of major inorganic compounds dissolved in groundwater, namely bicarbonate and silica (usually by-products of chemical weathering) and chloride, sulphate and nitrate (typically atmospheric plus anthropogenic inputs). The weathering dataset consisted of calculated mass transfers of minerals dissolving in loess sediments of a region in SW Hungary (Szigetvár area), namely Na-plagioclase, calcite and dolomite, and of pollution-related concentrations of sodium, magnesium and calcium. A first run of Correspondence Analysis described groundwater composition in the study area as a system of triple influence, where spots of domestic-effluent-dominated chemistry are surrounded by areas of agriculture-dominated chemistry, both imprinted over large regions of weathering-dominated chemistry. A second run revealed that nitrification of N-fertilizers is promoting mineral weathering through the nitric acid reaction (anthropogenic pathway), concurrent with a retreat of weathering by carbonic acid (natural pathway). It also indicated that dolomite and calcite take part in a dedolomitization process driven by the dissolution of gypsum fertilizers and the nitrification of N-fertilizers. © 2013 Elsevier Ltd.
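The abstract above describes applying Correspondence Analysis to a samples-by-species concentration table. Below is a minimal numpy sketch of the standard CA decomposition (row and column masses, standardized residuals, SVD, principal coordinates); the sample values and species list are hypothetical placeholders, not the Hungarian data used in the study.

```python
import numpy as np

# Hypothetical samples (rows) x dissolved species (columns) concentration table;
# the real study used measured groundwater concentrations (HCO3, SiO2, Cl, SO4, NO3).
species = ["HCO3", "SiO2", "Cl", "SO4", "NO3"]
N = np.array([
    [320.0, 28.0,  15.0, 22.0,  5.0],   # weathering-dominated sample
    [250.0, 20.0,  60.0, 80.0, 95.0],   # agriculture-dominated sample
    [180.0, 15.0, 120.0, 65.0, 40.0],   # effluent-dominated sample
    [300.0, 25.0,  25.0, 30.0, 12.0],
])

# Correspondence Analysis: decompose departures from row/column independence.
P = N / N.sum()                                       # correspondence matrix
r = P.sum(axis=1)                                     # row masses
c = P.sum(axis=0)                                     # column masses
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))    # standardized residuals
U, sv, Vt = np.linalg.svd(S, full_matrices=False)

# Principal coordinates of rows (samples) and columns (species).
row_coords = (U * sv) / np.sqrt(r)[:, None]
col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
inertia = sv**2 / (sv**2).sum()                       # share of total inertia per axis

print("explained inertia:", np.round(inertia, 3))
for name, xy in zip(species, col_coords[:, :2]):
    print(f"{name}: axis1={xy[0]:+.3f}, axis2={xy[1]:+.3f}")
```

Species that plot close to a group of samples on the first two axes load on the same "influence" (weathering, agriculture or effluents), which is how the triple-influence picture described above can be read off a CA biplot.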
Abstract:
This report summarizes our results from a security analysis covering all 57 first-round candidates of the Competition for Authenticated Encryption: Security, Applicability, and Robustness (CAESAR) and over 210 implementations. We have manually identified security issues with three candidates, two of which are more serious, and these ciphers have been withdrawn from the competition. We have developed a testing framework, BRUTUS, to facilitate automatic detection of simple security lapses and susceptible statistical structures across all ciphers. From this testing, we have security usage notes on four submissions and statistical notes on a further four. We highlight that some of the CAESAR algorithms pose an elevated risk if employed in real-life protocols due to a class of adaptive chosen-plaintext attacks. Although authenticated encryption with associated data algorithms are often defined (and are best used) as discrete primitives that authenticate and transmit only complete messages, in practice they are easily implemented in a fashion that outputs observable ciphertext data before the algorithm has received all of the (attacker-controlled) plaintext. To an implementor, this strategy appears to offer harmless and compliant storage and latency advantages. If the algorithm uses the same state for secret keying information, encryption, and integrity protection, and the internal mixing permutation is not cryptographically strong, an attacker can exploit the ciphertext–plaintext feedback loop to reveal secret state information or even keying material. We conclude that the main advantages of exhaustive, automated cryptanalysis are that it acts as a very necessary sanity check for implementations and gives the cryptanalyst insights that can be used to focus more specific attack methods on given candidates.
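BRUTUS itself is not described in detail here; as a hedged illustration of the kind of "statistical structure" screening mentioned above, the sketch below applies two generic randomness checks (a chi-squared byte-frequency statistic and a NIST-style monobit test) to a stand-in ciphertext. The function names and the use of os.urandom as a placeholder cipher output are assumptions for illustration only, not part of the actual framework.

```python
import os
from collections import Counter
from math import erfc, sqrt

def byte_frequency_statistic(ciphertext: bytes) -> float:
    """Chi-squared statistic of the byte histogram against the uniform distribution.
    Large values suggest residual statistical structure in the ciphertext."""
    expected = len(ciphertext) / 256.0
    counts = Counter(ciphertext)
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))

def monobit_p_value(ciphertext: bytes) -> float:
    """NIST-style monobit test: p-value for the balance of 0/1 bits."""
    ones = sum(bin(b).count("1") for b in ciphertext)
    n_bits = 8 * len(ciphertext)
    s = abs(2 * ones - n_bits) / sqrt(n_bits)
    return erfc(s / sqrt(2))

# Stand-in for a candidate cipher's output; a real harness would call each
# CAESAR implementation's encryption routine over many chosen plaintexts.
sample = os.urandom(1 << 16)
print("chi2 over byte histogram:", round(byte_frequency_statistic(sample), 1))
print("monobit p-value:", round(monobit_p_value(sample), 4))
```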
Abstract:
We examine a method recently proposed by Hinich and Patterson (mimeo, University of Texas at Austin, 1995) for testing the validity of specifying a GARCH error structure for financial time series data in the context of a set of ten daily Sterling exchange rates. The results demonstrate that there are statistical structures present in the data that cannot be captured by a GARCH model, or any of its variants. This result has important implications for the interpretation of the recent voluminous literature which attempts to model financial asset returns using this family of models.
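As a rough illustration of testing whether a GARCH filter removes all the structure in a returns series, the sketch below fits a GARCH(1,1) with the third-party `arch` package (assumed installed) and evaluates a simple third-order bicorrelation statistic on the standardized residuals. This is a simplified stand-in for the Hinich and Patterson bicovariance test, not a reproduction of it, and the simulated returns are placeholders for the Sterling exchange-rate data.

```python
import numpy as np
from arch import arch_model  # third-party package, assumed available

rng = np.random.default_rng(0)
returns = rng.standard_t(df=6, size=2000) * 0.01   # stand-in for daily FX returns

# Fit GARCH(1,1) with Student-t innovations (returns scaled to aid convergence).
res = arch_model(returns * 100, vol="GARCH", p=1, q=1, dist="t").fit(disp="off")
z = res.resid / res.conditional_volatility          # standardized residuals

def bicorrelation(x, r, s):
    """Sample third-order moment E[x_t * x_{t-r} * x_{t-s}] of a standardized series.
    Values far from zero hint at nonlinear structure the GARCH model did not capture."""
    x = (x - x.mean()) / x.std()
    m = max(r, s)
    return np.mean(x[m:] * x[m - r:len(x) - r] * x[m - s:len(x) - s])

for (r, s) in [(1, 1), (1, 2), (2, 3)]:
    print(f"C3({r},{s}) =", round(float(bicorrelation(z, r, s)), 4))
```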
Abstract:
The effects of damping on energy sharing in coupled systems are investigated. The approach taken is to compute the forced response patterns of various idealised systems, and from these to calculate the parameters of a Statistical Energy Analysis (SEA) model for the systems using the matrix inversion approach [1]. It is shown that when SEA models are fitted by this procedure, the values of the coupling loss factors depend significantly on damping except when the damping is sufficiently high. For very lightly damped coupled systems, varying the damping causes the values of the coupling loss factor to vary in direct proportion to the internal loss factor; in the limit of zero damping, the coupling loss factors tend to zero. This view contrasts strongly with 'classical' SEA, in which coupling loss factors are determined by the nature of the coupling between subsystems, independent of subsystem damping. One implication of the strong damping dependency is that equipartition of modal energy under low damping does not in general occur, contrary to the classical SEA prediction that equipartition of modal energy always occurs if the damping can be reduced to a sufficiently small value. It is demonstrated that the use of this classical assumption can lead to gross overestimates of subsystem energy ratios, especially in multi-subsystem structures. © 1996 Academic Press Limited.
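The matrix inversion approach mentioned above fits SEA loss factors from forced-response energies. The following numpy sketch synthesizes the energy matrix of a two-subsystem system from assumed "true" loss factors and then recovers them by inverting the power-balance equations; all numerical values are illustrative, not taken from the paper.

```python
import numpy as np

# "True" loss factors used to synthesize a forced-response data set; in an
# experiment the energy matrix E below would come from measurements.
eta1, eta2 = 0.010, 0.020        # internal (damping) loss factors
eta12, eta21 = 0.004, 0.002      # coupling loss factors
omega = 2 * np.pi * 1000.0       # band centre frequency [rad/s]

# SEA power balance:  P = omega * L @ E, with L built from the loss factors.
L = np.array([[eta1 + eta12, -eta21],
              [-eta12,       eta2 + eta21]])

P_in = np.eye(2)                         # unit input power into each subsystem in turn
E = np.linalg.solve(omega * L, P_in)     # resulting subsystem energy matrix

# Matrix-inversion SEA: recover the loss-factor matrix from energies and powers.
L_est = P_in @ np.linalg.inv(E) / omega
eta12_est, eta21_est = -L_est[1, 0], -L_est[0, 1]
eta1_est = L_est[0, 0] - eta12_est
eta2_est = L_est[1, 1] - eta21_est
print("recovered loss factors:", eta1_est, eta2_est, eta12_est, eta21_est)
```

When the energies come from deterministic forced-response calculations rather than this synthetic model, the fitted coupling loss factors pick up the damping dependence the paper describes.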
Abstract:
When used correctly, Statistical Energy Analysis (SEA) can provide good predictions of high frequency vibration levels in built-up structures. Unfortunately, the assumptions that underlie SEA break down as the frequency of excitation is reduced, and the method does not yield accurate predictions at "medium" frequencies (and neither does the Finite Element Method, which is limited to low frequencies). A basic problem is that parts of the system have a short wavelength of deformation and meet the requirements of SEA, while other parts of the system do not - this is often referred to as the "mid-frequency" problem, and there is a broad class of mid-frequency vibration problems that are of great concern to industry. In this paper, a coupled deterministic-statistical approach referred to as the Hybrid Method (Shorter & Langley, 2004) is briefly described, and some results that demonstrate how the method overcomes the aforementioned difficulties are presented.
Abstract:
This paper presents an analysis of entropy-based molecular descriptors. Specifically, we use real chemical structures as well as synthetic isomeric structures and investigate, by statistical analysis, properties of and relations among the descriptors with respect to the data set used. Our numerical results provide evidence that synthetic chemical structures are notably different from real chemical structures and, hence, should not be used to investigate molecular descriptors; an analysis based on real chemical structures is preferable. Further, we find strong hints that molecular descriptors can be partitioned into distinct classes capturing complementary information.
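As a concrete, deliberately simple example of an entropy-based descriptor, the sketch below computes the Shannon entropy of the vertex-degree distribution of a hydrogen-suppressed molecular graph and evaluates it on two C4H10 isomers. This generic descriptor is an illustration only, not one of the specific descriptors analysed in the paper.

```python
import math
from collections import Counter

def degree_entropy(adjacency):
    """Shannon entropy of the vertex-degree distribution of a molecular graph.
    A simple example of an information-theoretic (entropy-based) descriptor."""
    degrees = [sum(row) for row in adjacency]
    counts = Counter(degrees)
    n = len(degrees)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hydrogen-suppressed graphs of n-butane (C-C-C-C) and isobutane (branched isomer).
n_butane  = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
isobutane = [[0, 1, 0, 0], [1, 0, 1, 1], [0, 1, 0, 0], [0, 1, 0, 0]]
print("n-butane :", round(degree_entropy(n_butane), 3))   # degrees 1,2,2,1
print("isobutane:", round(degree_entropy(isobutane), 3))  # degrees 1,3,1,1
```

The two isomers receive different values, which is the kind of discriminating behaviour between isomeric structures that such descriptors are meant to capture.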
A new speech analysis system: ASSESS (Automatic Statistical Summary of Elementary Speech Structures)
Abstract:
The crystal structures of an aspartic proteinase from Trichoderma reesei (TrAsP) and of its complex with a competitive inhibitor, pepstatin A, were solved and refined to crystallographic R-factors of 17.9% (R(free)=21.2%) at 1.70 angstrom resolution and 15.81% (R(free) = 19.2%) at 1.85 angstrom resolution, respectively. The three-dimensional structure of TrAsP is similar to structures of other members of the pepsin-like family of aspartic proteinases. Each molecule is folded in a predominantly beta-sheet bilobal structure with the N-terminal and C-terminal domains of about the same size. Structural comparison of the native structure and the TrAsP-pepstatin complex reveals that the enzyme undergoes an induced-fit, rigid-body movement upon inhibitor binding, with the N-terminal and C-terminal lobes tightly enclosing the inhibitor. Upon recognition and binding of pepstatin A, amino acid residues of the enzyme active site form a number of short hydrogen bonds to the inhibitor that may play an important role in the mechanism of catalysis and inhibition. The structures of TrAsP were used as a template for performing statistical coupling analysis of the aspartic protease family. This approach permitted, for the first time, the identification of a network of structurally linked residues putatively mediating conformational changes relevant to the function of this family of enzymes. Statistical coupling analysis reveals coevolved continuous clusters of amino acid residues that extend from the active site into the hydrophobic cores of each of the two domains and include amino acid residues from the flap regions, highlighting the importance of these parts of the protein for its enzymatic activity. (C) 2008 Elsevier Ltd. All rights reserved.
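Statistical coupling analysis operates on a multiple sequence alignment of the protein family. The sketch below uses a toy alignment and a simplified, correlation-based coupling matrix between positions; the real method applies conservation-based weighting, so this should be read only as a schematic stand-in, and the sequences shown are invented.

```python
import numpy as np

# Toy multiple sequence alignment (rows = homologous sequences, columns = positions);
# a real analysis would use an alignment of the aspartic proteinase family.
msa = np.array([list(s) for s in [
    "DTGSA",
    "DTGSS",
    "DSGTA",
    "ETGSA",
    "DTGAA",
    "DSGTS",
]])

n_seq, n_pos = msa.shape
amino_acids = sorted(set(msa.ravel()))

# One-hot encode residue identity per position, then measure how strongly pairs of
# positions co-vary across the family (a simplified, correlation-based stand-in for
# the weighted coupling matrix used in statistical coupling analysis).
onehot = np.stack([(msa == aa).astype(float) for aa in amino_acids], axis=2)
freq = onehot.mean(axis=0)            # per-position residue frequencies
centered = onehot - freq              # deviations from consensus residue usage

coupling = np.zeros((n_pos, n_pos))
for i in range(n_pos):
    for j in range(n_pos):
        coupling[i, j] = np.linalg.norm(centered[:, i, :].T @ centered[:, j, :]) / n_seq

print(np.round(coupling, 3))
```

Clusters of positions with large mutual coupling values correspond to the coevolved residue networks described above.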
Abstract:
The results of a questionnaire survey of 3,578 young protesters aged 15 to 24 were used to create a typology of the motive structures of the young globalization critics who participated in protests against the G8 summit in Heiligendamm in June 2007. Eight groups with different motive structures identified using cluster analysis reveal the spectrum of motives of the young demonstrators, ranging from social and political idealism to hedonistic fun-seeking and nationalist motives. Despite the diversity of motives, two cross-cluster motives can be identified: the results clearly show that the majority of respondents were motivated by political idealism and rejected violence. Two overlapping minorities were found: one where political idealism was largely lacking, and another where violence was a prominent motive.
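The typology above was built by cluster analysis of questionnaire responses; since the exact algorithm is not specified here, the sketch below uses k-means from scikit-learn on randomly generated agreement scores for a handful of hypothetical motive items, simply to show how cluster-wise item profiles characterize motive structures.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# Hypothetical agreement scores (1-5) on motive items for 300 respondents;
# the study clustered questionnaire answers of 3,578 protesters.
items = ["political_idealism", "social_justice", "fun_seeking", "violence", "nationalism"]
X = rng.integers(1, 6, size=(300, len(items))).astype(float)

# Standardize items, then partition respondents into motive-structure clusters.
Xs = StandardScaler().fit_transform(X)
km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(Xs)

# The mean item profile per cluster characterizes each motive structure.
for k in range(km.n_clusters):
    profile = X[km.labels_ == k].mean(axis=0)
    print(f"cluster {k}: " + ", ".join(f"{i}={v:.1f}" for i, v in zip(items, profile)))
```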
Abstract:
Using fixed-point arithmetic is one of the most common design choices for systems where area, power or throughput are heavily constrained. In order to produce implementations where the cost is minimized without negatively impacting the accuracy of the results, a careful assignment of word-lengths is required. The problem of finding the optimal combination of fixed-point word-lengths for a given system is a combinatorial NP-hard problem to which developers devote between 25 and 50% of the design-cycle time. Reconfigurable hardware platforms such as FPGAs also benefit from the advantages of fixed-point arithmetic, as it compensates for the slower clock frequencies and less efficient area utilization of the hardware platform with respect to ASICs. As FPGAs become commonly used for scientific computation, designs constantly grow larger and more complex, up to the point where they cannot be handled efficiently by current signal and quantization noise modelling and word-length optimization methodologies. In this Ph.D. Thesis we explore different aspects of the quantization problem and present new methodologies for each of them.

The techniques based on extensions of intervals have made it possible to obtain accurate models of signal and quantization noise propagation in systems with non-linear operations. We take this approach a step further by introducing elements of Multi-Element Generalized Polynomial Chaos (ME-gPC) and combining them with a state-of-the-art Statistical Modified Affine Arithmetic (MAA) based methodology in order to model systems that contain control-flow structures. Our methodology produces the different execution paths automatically, determines the regions of the input domain that will exercise them, and extracts the system statistical moments from the partial results. We use this technique to estimate both the dynamic range and the round-off noise in systems with the aforementioned control-flow structures, and we show the good accuracy of our approach, which in some case studies with non-linear operators deviates by only 0.04% with respect to the simulation-based reference values. A known drawback of the techniques based on extensions of intervals is the combinatorial explosion of terms as the size of the targeted systems grows, which leads to scalability problems. To address this issue we present a clustered noise injection technique that groups the signals in the system, introduces the noise terms in each group independently and then combines the results at the end. In this way, the number of noise sources in the system at a given time is controlled and, because of this, the combinatorial explosion is minimized. We also present a multi-way partitioning algorithm aimed at minimizing the deviation of the results due to the loss of correlation between noise terms, in order to keep the results as accurate as possible.

This Ph.D. Thesis also covers the development of methodologies for word-length optimization based on Monte Carlo simulations that run in reasonable times. We present two novel techniques that approach the reduction of execution time from different angles. First, the interpolative method applies a simple but precise interpolator to estimate the sensitivity of each signal, which is later used to guide the optimization effort. Second, the incremental method revolves around the fact that, although we strictly need to guarantee a given confidence level in the simulations for the final results of the optimization process, we can use more relaxed levels, which imply a considerably smaller number of samples, in the initial stages of the process, when we are still far from the optimized solution. Through these two approaches we demonstrate that the execution time of classical greedy techniques can be accelerated by factors of up to ×240 for small/medium-sized problems.

Finally, this book introduces HOPLITE, an automated, flexible and modular framework for quantization that implements the previous techniques and is publicly available. The aim is to offer developers and researchers a common ground for easily prototyping and verifying new techniques for system modelling and word-length optimization. We describe its workflow, justify the design decisions taken, explain its public API and give a step-by-step demonstration of its execution. We also show, through an example, how new extensions should be connected to the existing interfaces in order to expand and improve the capabilities of HOPLITE.
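As a hedged illustration of the classical greedy, Monte Carlo-driven word-length search that the thesis accelerates (and that HOPLITE automates), the sketch below optimizes the fractional word-lengths of a toy two-multiplier datapath against an RMS error bound. The datapath, the error bound and all function names are assumptions made for illustration and do not reflect HOPLITE's actual API.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(x, frac_bits):
    """Round to a fixed-point grid with the given number of fractional bits."""
    step = 2.0 ** -frac_bits
    return np.round(x / step) * step

def simulated_error(frac_bits, n_samples=2000):
    """Monte Carlo estimate of the output error of a toy datapath (0.7*x + 0.3*y)
    when each signal is quantized to its assigned fractional word-length."""
    x, y = rng.uniform(-1, 1, n_samples), rng.uniform(-1, 1, n_samples)
    ref = 0.7 * x + 0.3 * y
    out = quantize(0.7 * quantize(x, frac_bits[0]), frac_bits[2]) + \
          quantize(0.3 * quantize(y, frac_bits[1]), frac_bits[2])
    return np.sqrt(np.mean((ref - out) ** 2))

def greedy_wordlength_search(max_rms=1e-3, start_bits=16):
    """Classical greedy descent: repeatedly shave one fractional bit off any signal
    whose reduction keeps the Monte Carlo error estimate below the bound."""
    bits = [start_bits] * 3          # word-lengths for x, y and the adder output
    improved = True
    while improved:
        improved = False
        for i in range(len(bits)):
            trial = bits.copy()
            trial[i] -= 1
            if trial[i] > 0 and simulated_error(trial) <= max_rms:
                bits = trial
                improved = True
    return bits

print("optimized fractional word-lengths:", greedy_wordlength_search())
```

Every candidate reduction triggers a fresh Monte Carlo run, which is exactly the cost the interpolative and incremental methods described above aim to cut.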
Abstract:
The paper proposes a new application of non-parametric statistical processing of signals recorded during vibration tests for damage detection and evaluation on I-section steel segments. The steel segments investigated constitute the energy-dissipating part of a new type of hysteretic damper that is used for passive control of buildings and civil engineering structures subjected to earthquake-type dynamic loadings. Two I-section steel segments with different levels of damage were instrumented with piezoceramic sensors and subjected to controlled white-noise random vibrations. The signals recorded during the tests were processed using two non-parametric methods (the power spectral density method and the frequency response function method) that had not previously been applied to hysteretic dampers. The appropriateness of these methods for quantifying the level of damage in the I-section steel segments is validated experimentally. Based on the results of the random vibration tests, the paper proposes a new index that predicts the level of damage and the proximity of failure of the hysteretic damper.
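The two non-parametric methods named above can be sketched as follows: Welch power spectral densities of the baseline and damaged responses are compared through a simple normalized-difference index, and an H1 frequency response function is estimated from the cross- and auto-spectra. The single-degree-of-freedom response model, the natural frequencies and the damage index used here are illustrative assumptions, not the index proposed in the paper.

```python
import numpy as np
from scipy.signal import welch, csd

fs = 2000.0
rng = np.random.default_rng(0)

def sdof_response(force, fn, zeta, fs):
    """Crude single-degree-of-freedom response to a force history (stand-in for the
    piezoceramic sensor signal of an I-section segment under white-noise excitation)."""
    wn = 2 * np.pi * fn
    dt = 1.0 / fs
    x = np.zeros_like(force)
    v = 0.0
    for i in range(1, len(force)):
        a = force[i] - 2 * zeta * wn * v - wn**2 * x[i - 1]
        v += a * dt
        x[i] = x[i - 1] + v * dt
    return x

force = rng.standard_normal(40000)
baseline = sdof_response(force, fn=120.0, zeta=0.02, fs=fs)   # undamaged segment
damaged = sdof_response(force, fn=110.0, zeta=0.04, fs=fs)    # stiffness loss + extra damping

# Power spectral density method: compare Welch PSDs of the two states.
f, p_base = welch(baseline, fs=fs, nperseg=2048)
_, p_dam = welch(damaged, fs=fs, nperseg=2048)
psd_index = np.sum(np.abs(p_dam - p_base)) / np.sum(p_base)

# Frequency response function method: H1 estimate = cross-spectrum / input autospectrum.
_, s_xf = csd(baseline, force, fs=fs, nperseg=2048)
_, s_ff = welch(force, fs=fs, nperseg=2048)
frf = s_xf / s_ff

print("PSD-based damage index:", round(float(psd_index), 3))
print("baseline resonance [Hz]:", float(f[np.argmax(p_base)]))
```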
Abstract:
This paper describes a variety of statistical methods for obtaining precise quantitative estimates of the similarities and differences in the structures of semantic domains in different languages. The methods include comparing mean correlations within and between groups, principal components analysis of interspeaker correlations, and analysis of variance of speaker by question data. Methods for graphical displays of the results are also presented. The methods give convergent results that are mutually supportive and equivalent under suitable interpretation. The methods are illustrated on the semantic domain of emotion terms in a comparison of the semantic structures of native English and native Japanese speaking subjects. We suggest that, in comparative studies concerning the extent to which semantic structures are universally shared or culture-specific, both similarities and differences should be measured and compared rather than placing total emphasis on one or the other polar position.
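A minimal numpy sketch of two of the methods mentioned (mean within- and between-group correlations, and principal components of the inter-speaker correlation matrix) is given below, using randomly generated similarity judgements in place of the English and Japanese emotion-term data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical similarity judgements over pairs of emotion terms:
# rows = speakers (10 English, 10 Japanese), columns = term pairs.
n_pairs = 45
common = rng.normal(0, 1.0, n_pairs)        # structure shared across both languages
en_specific = rng.normal(0, 0.5, n_pairs)   # English-specific structure
jp_specific = rng.normal(0, 0.5, n_pairs)   # Japanese-specific structure
english = common + en_specific + rng.normal(0, 1.0, size=(10, n_pairs))
japanese = common + jp_specific + rng.normal(0, 1.0, size=(10, n_pairs))
data = np.vstack([english, japanese])

# Inter-speaker correlation matrix and mean correlations within/between groups.
R = np.corrcoef(data)
within_en = R[:10, :10][np.triu_indices(10, k=1)].mean()
within_jp = R[10:, 10:][np.triu_indices(10, k=1)].mean()
between = R[:10, 10:].mean()
print(f"mean r within English {within_en:.2f}, within Japanese {within_jp:.2f}, between {between:.2f}")

# Principal components of the inter-speaker correlations: a large first component
# shared by both groups indicates a common semantic structure.
eigvals, eigvecs = np.linalg.eigh(R)
print("share of variance on first component:", round(float(eigvals[-1] / eigvals.sum()), 2))
```

Comparing the within-group and between-group means captures both the shared and the culture-specific parts of the semantic structure, which is the balance the paper argues for.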
Abstract:
Constant technology advances have caused a data explosion in recent years. Accordingly, modern statistical and machine learning methods must be adapted to deal with complex and heterogeneous data types. This is particularly true for analyzing biological data. For example, DNA sequence data can be viewed as categorical variables, with each nucleotide taking one of four categories. Gene expression data, depending on the measurement technology, can be continuous values or counts. With the advancement of high-throughput technology, the abundance of such data has become unprecedentedly rich. Therefore, efficient statistical approaches are crucial in this big data era.
Previous statistical methods for big data often aim to find low-dimensional structures in the observed data. For example, in a factor analysis model a latent Gaussian-distributed multivariate vector is assumed; with this assumption, a factor model produces a low-rank estimate of the covariance of the observed variables. Another example is the latent Dirichlet allocation model for documents, in which the mixture proportions of topics are represented by a Dirichlet-distributed variable. This dissertation proposes several novel extensions to these statistical methods, developed to address challenges in big data. The novel methods are applied in multiple real-world applications, including the construction of condition-specific gene co-expression networks, estimating shared topics among newsgroups, analysis of promoter sequences, analysis of political-economic risk data, and estimating population structure from genotype data.
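To make the factor-analysis example concrete, the sketch below simulates data driven by a low-dimensional latent Gaussian factor and shows how the fitted model (here scikit-learn's FactorAnalysis, used purely for illustration) yields the low-rank-plus-diagonal covariance estimate Sigma ≈ Lambda Lambda^T + Psi described above.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Simulate observations driven by a low-dimensional latent Gaussian factor,
# mimicking the low-rank-plus-noise covariance structure described above.
n_samples, n_features, n_factors = 500, 20, 3
loadings = rng.normal(size=(n_features, n_factors))
latent = rng.normal(size=(n_samples, n_factors))
X = latent @ loadings.T + rng.normal(scale=0.5, size=(n_samples, n_features))

fa = FactorAnalysis(n_components=n_factors).fit(X)

# The fitted model gives Sigma ≈ Lambda Lambda^T + Psi (low rank plus diagonal noise).
low_rank = fa.components_.T @ fa.components_
sigma_hat = low_rank + np.diag(fa.noise_variance_)
print("rank of loading part:", np.linalg.matrix_rank(low_rank))
print("max |Sigma_hat - sample covariance|:",
      round(float(np.abs(sigma_hat - np.cov(X, rowvar=False)).max()), 2))
```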