8 results for Statistical analysis methods
at Universidad Politécnica de Madrid
Abstract:
Background: Several meta-analysis methods can be used to quantitatively combine the results of a group of experiments, including the weighted mean difference (WMD), statistical vote counting (SVC), the parametric response ratio (RR) and the non-parametric response ratio (NPRR). The software engineering (SE) community has focused on the weighted mean difference method. However, other meta-analysis methods have distinct strengths, such as being usable when variances are not reported. There are as yet no guidelines to indicate which method is best for use in each case. Aim: Compile a set of rules that SE researchers can use to ascertain which aggregation method is best for use in the synthesis phase of a systematic review. Method: Monte Carlo simulation varying the number of experiments in the meta-analyses, the number of subjects that they include, their variance and effect size. We empirically calculated the reliability and statistical power in each case. Results: WMD is generally reliable if the variance is low, whereas its power depends on the effect size and number of subjects per meta-analysis; the reliability of RR is generally unaffected by changes in variance, but it does require more subjects than WMD to be powerful; NPRR is the most reliable method, but it is not very powerful; SVC behaves well when the effect size is moderate, but is less reliable with other effect sizes. Detailed tables of results are annexed. Conclusions: Before undertaking statistical aggregation in software engineering, it is worthwhile checking whether there is any appreciable difference in the reliability and power of the methods. If there is, software engineers should select the method that optimizes both parameters.
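As a rough illustration of two of the aggregation methods named above, the sketch below pools a set of experiments with a fixed-effect, inverse-variance weighted mean difference and with a log response ratio. It uses the textbook formulas, not necessarily the exact variants simulated in the study; the `Experiment` record and the function names are hypothetical, introduced only for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Experiment:
    """Summary statistics of one experiment (hypothetical record layout)."""
    mean_t: float  # treatment group mean
    sd_t: float    # treatment group standard deviation
    n_t: int       # treatment group size
    mean_c: float  # control group mean
    sd_c: float    # control group standard deviation
    n_c: int       # control group size


def _inverse_variance_pool(effects, variances):
    """Fixed-effect pooling: weight each effect by 1 / variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * d for w, d in zip(weights, effects)) / total
    std_error = math.sqrt(1.0 / total)
    return estimate, std_error


def pooled_wmd(experiments):
    """Weighted mean difference; needs the variances of both groups."""
    effects = [e.mean_t - e.mean_c for e in experiments]
    variances = [e.sd_t ** 2 / e.n_t + e.sd_c ** 2 / e.n_c
                 for e in experiments]
    return _inverse_variance_pool(effects, variances)


def pooled_log_response_ratio(experiments):
    """Parametric response ratio on the log scale (means must be positive)."""
    effects = [math.log(e.mean_t / e.mean_c) for e in experiments]
    variances = [e.sd_t ** 2 / (e.n_t * e.mean_t ** 2)
                 + e.sd_c ** 2 / (e.n_c * e.mean_c ** 2)
                 for e in experiments]
    return _inverse_variance_pool(effects, variances)


if __name__ == "__main__":
    # Made-up summary statistics, for illustration only.
    data = [Experiment(12.0, 2.0, 10, 10.0, 2.0, 10),
            Experiment(12.0, 2.0, 10, 10.0, 2.0, 10)]
    print(pooled_wmd(data))
    print(pooled_log_response_ratio(data))
```

Both functions return an estimate and its standard error, so a confidence interval of the form estimate ± z · standard error can be built from either; exponentiating the log response ratio gives the ratio itself.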
Abstract:
Pragmatism is the leading motivation of regularization. We can understand regularization as a modification of the maximum-likelihood estimator so that a reasonable answer can be given in an unstable or ill-posed situation. Typical examples include fitting parametric or non-parametric models with more parameters than data, or estimating large covariance matrices. Regularization is also commonly used to improve the bias-variance tradeoff of an estimation. The definition of regularization is therefore quite general and, although the introduction of a penalty is probably the most popular type, it is just one of multiple forms of regularization. In this dissertation, we focus on the applications of regularization for obtaining sparse or parsimonious representations, where only a subset of the inputs is used. A particular form of regularization, L1-regularization, plays a key role in achieving sparsity. Most of the contributions presented here revolve around L1-regularization, although other forms of regularization are explored (also pursuing sparsity in some sense). In addition to presenting a compact review of L1-regularization and its applications in statistics and machine learning, we devise methodology for regression, supervised classification and structure induction of graphical models. Within the regression paradigm, we focus on kernel smoothing learning, proposing techniques for kernel design that are suitable for high-dimensional settings and sparse regression functions. We also present an application of regularized regression techniques for modeling the response of biological neurons. Supervised classification advances deal, on the one hand, with the application of regularization for obtaining a naïve Bayes classifier and, on the other hand, with a novel algorithm for brain-computer interface design that uses group regularization in an efficient manner.
Finally, we present a heuristic for inducing structures of Gaussian Bayesian networks using L1-regularization as a filter.
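To make concrete how an L1 penalty yields a sparse representation, here is a minimal sketch of lasso regression solved by cyclic coordinate descent with soft-thresholding. It is a generic textbook procedure, not one of the methods developed in the dissertation; the objective (1/(2n))·||y − Xβ||² + λ·||β||₁ and all function names are assumptions made for this example.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exactly zero inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic updates.

    X is a list of n rows with p features each, y a list of n responses.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta, since beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of feature j with the partial residual
            # (the residual with feature j's own contribution added back).
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


if __name__ == "__main__":
    # Toy design with one strong and one weak feature (made-up data).
    X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
    y = [2.0, -2.0, 0.1, -0.1]
    print(lasso_coordinate_descent(X, y, lam=0.1))
```

The soft-thresholding step is what sets weakly supported coefficients exactly to zero, so the fitted model uses only a subset of the inputs; a larger `lam` removes more of them.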
Abstract:
In Operational Modal Analysis (OMA) of a structure, the data acquisition process may be repeated many times. In these cases, the analyst has several similar records for the modal analysis of the structure that have been obtained at different time instants (multiple records). The solution obtained varies from one record to another, sometimes considerably. The differences are due to several reasons: statistical errors of estimation, changes in the external forces (unmeasured forces) that modify the output spectra, appearance of spurious modes, etc. Combining the results of the different individual analyses is not straightforward. To solve the problem, we propose to estimate the parameters jointly using all the records. This can be done in a very simple way using state space models and computing the estimates by maximum likelihood. The method provides a single result for the modal parameters that optimally combines all the records.
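One way to write the joint estimation down, as a sketch only: assume the M records are statistically independent and share the same linear Gaussian state space parameters θ = (A, C, Q, R). The abstract does not give this notation, so every symbol below is an assumption. Under these assumptions the joint log-likelihood is the sum of the per-record log-likelihoods, each obtained from the Kalman filter innovations:

```latex
\begin{align}
  x^{(m)}_{t+1} &= A\,x^{(m)}_t + w^{(m)}_t, & w^{(m)}_t &\sim \mathcal{N}(0, Q), \\
  y^{(m)}_t     &= C\,x^{(m)}_t + v^{(m)}_t, & v^{(m)}_t &\sim \mathcal{N}(0, R), \\
  \ell(\theta)  &= \sum_{m=1}^{M} \ell_m(\theta), \\
  \ell_m(\theta) &= -\frac{1}{2} \sum_{t=1}^{N_m}
     \Bigl[\, p \log 2\pi + \log\det S^{(m)}_t
       + \bigl(e^{(m)}_t\bigr)^{\!\top} \bigl(S^{(m)}_t\bigr)^{-1} e^{(m)}_t \Bigr].
\end{align}
```

Here m indexes the records, N_m is the length of record m, p is the number of measured outputs, and e and S are the one-step prediction error and its covariance produced by the Kalman filter for a given θ. Maximizing ℓ(θ) gives a single estimate of A and C, from which one set of modal parameters follows.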
Abstract:
Computing the modal parameters of large structures in Operational Modal Analysis often requires processing data from multiple sensor setups that were not recorded simultaneously. These setups share some sensors, the so-called reference sensors, which are fixed for all the measurements, while the other sensors are moved from one setup to the next. One possibility is to process the setups separately, which results in different modal parameter estimates for each setup. The reference sensors are then used to merge, or glue, the different parts of the mode shapes to obtain global modes, while the natural frequencies and damping ratios are usually averaged. In this paper we present a state space model that can be used to process all setups at once, so that the global mode shapes are obtained automatically and only a single value for the natural frequency and damping ratio of each mode is computed. We also show how this model can be estimated using maximum likelihood and the Expectation Maximization algorithm. We apply this technique to real data measured at a footbridge.
Abstract:
Following the success achieved in previous research projects using non-destructive methods to estimate the physical and mechanical aging of particle and fibre boards, this paper studies the relationships between aging and physical and mechanical changes, using non-destructive measurements of oriented strand board (OSB). 184 pieces of OSB from a French source were tested to analyze their actual physical and mechanical properties. The same properties were estimated using acoustic non-destructive methods (ultrasound and stress wave velocity) during a physical laboratory aging test. Wave propagation velocity was measured with the sensors aligned, edge to edge, and forming an angle of 45 degrees, with both sensors on the same face of the board. This is because aligned measurements are not possible on site. The velocity results are always higher in the 45-degree measurements. Given the results of the statistical analysis, it can be concluded that there is a strong relationship between acoustic measurements and the decline in physical and mechanical properties of the panels due to aging. The authors propose several models to estimate the physical and mechanical properties of the board, as well as its degree of aging. The best results are obtained using ultrasound, although the difference in comparison with the stress wave method is not very significant. A reliable prediction of the degree of deterioration (aging) of the board is presented.
Abstract:
This paper studies the relationship between aging, physical changes and the results of non-destructive testing of plywood. 176 pieces of plywood were tested to analyze their actual and estimated density using non-destructive methods (screw withdrawal force and ultrasound wave velocity) during a laboratory aging test. From the results of the statistical analysis it can be concluded that there is a strong relationship between the non-destructive measurements carried out and the decline in the physical properties of the panels due to aging. The authors propose several models to estimate board density. The best results are obtained with ultrasound. A reliable prediction of the degree of deterioration (aging) of the board is presented.
Abstract:
The thesis analyzes the morphological evolution of assemblies of living neurons as they self-organize from collections of separated cells into elaborate, clustered networks. In particular, it contributes the design and implementation of a graph-based unsupervised segmentation algorithm with a very low computational cost. The processing automatically retrieves the whole network structure from large-scale phase-contrast images taken at high resolution throughout the entire life of a cultured neuronal network. The network structure is represented by a mathematical object (a matrix) in which nodes are identified neurons or neuron clusters, and links are the reconstructed connections between them. The algorithm is also able to extract other relevant morphological information characterizing neurons and neurites. More importantly, and unlike other segmentation methods that require fluorescence imaging and immunocytochemistry techniques, our measurements are non-invasive and allow us to carry out a fully longitudinal analysis during the maturation of a single culture. In turn, a systematic statistical analysis of a group of topological observables makes it possible to quantify and track the progression of the main network characteristics during the self-organization process of the culture. Our results point to the existence of a particular state corresponding to a small-world network configuration, in which several relevant micro- and meso-scale graph properties emerge. Finally, we identify the main physical processes taking place during the culture's morphological transformations, and embed them into a simplified growth model that quantitatively reproduces the overall set of experimental observations.
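A small-world configuration is usually characterized by high clustering combined with a short average path length. The abstract does not list the topological observables used in the thesis, so the two measures below are an assumed, commonly used pair. The sketch computes them for an undirected graph stored as a dictionary of neighbor sets; the function names and the toy graph are hypothetical.

```python
from collections import deque


def average_clustering(adj):
    """Mean local clustering coefficient of an undirected graph.

    adj maps each node to the set of its neighbors. Nodes with fewer
    than two neighbors contribute a coefficient of zero.
    """
    total = 0.0
    for neighbors in adj.values():
        k = len(neighbors)
        if k < 2:
            continue
        nbrs = list(neighbors)
        # Count the links that exist between pairs of neighbors.
        links = sum(1 for a in range(k) for b in range(a + 1, k)
                    if nbrs[b] in adj[nbrs[a]])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives hop distances from source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    # Toy graph: a triangle (0, 1, 2) with one pendant node (3).
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(average_clustering(graph), average_path_length(graph))
```

In practice both values are compared against those of a random graph with the same number of nodes and links; tracking them over successive images of a culture is one plausible way to follow a network's progression toward a small-world state.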