906 results for statistical techniques


Relevance:

30.00%

Publisher:

Abstract:

Vietnam has developed rapidly over the past 15 years, but progress has not been uniformly distributed across the country. The availability, adequate visualization, and analysis of spatially explicit data on socio-economic and environmental aspects can support both research and policy towards sustainable development. Applying appropriate mapping techniques makes it possible to glean important information from tabular socio-economic data. Spatial analysis of socio-economic phenomena can yield insights into locally specific patterns and processes that cannot be generated by non-spatial applications. This paper presents techniques and applications for developing and analyzing highly spatially disaggregated socio-economic datasets. A number of examples show how such information can support informed decision-making and research in Vietnam.

Relevance:

30.00%

Publisher:

Abstract:

AIM: The aim of the present review was to systematically assess the dental literature on soft tissue grafting techniques. The focused question was: is one method superior to others for augmentation and stability of the augmented soft tissue, in terms of increasing the width of keratinized tissue (part 1) and gain in soft tissue volume (part 2)? METHODS: A Medline search was performed for human studies focusing on augmentation of keratinized tissue and/or soft tissue volume, complemented by additional hand searching. Relevant studies were identified, and statistical results were reported for meta-analyses, including the test-minus-control weighted mean differences with 95% confidence intervals, the I-squared statistic for tests of heterogeneity, and the number of significant studies. RESULTS: Twenty-five (part 1) and three (part 2) studies met the inclusion criteria; 14 studies (part 1) were eligible for comparison using meta-analyses. An apically positioned flap/vestibuloplasty (APF/V) procedure resulted in a statistically significantly greater gain in keratinized tissue than untreated controls. APF/V plus autogenous tissue revealed statistically significantly more attached gingiva compared with untreated controls, and borderline statistical significance compared with APF/V plus allogenic tissue. Statistically significantly more shrinkage was observed for APF/V plus allogenic graft compared with APF/V plus autogenous tissue. Patient-centered outcomes did not reveal any of the treatment methods to be superior regarding postoperative complications. The three studies reporting on soft tissue volume augmentation could not be compared due to lack of homogeneity. The use of subepithelial connective tissue grafts (SCTGs) resulted in statistically significantly more soft tissue volume gain compared with free gingival grafts (FGGs). CONCLUSIONS: APF/V is a successful treatment concept for increasing the width of keratinized tissue or attached gingiva around teeth. The addition of autogenous tissue statistically significantly increases the width of attached gingiva. For soft tissue volume augmentation, only limited data are available, favoring SCTGs over FGGs.
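The review reports its pooled results as test-minus-control weighted mean differences with 95% confidence intervals and uses the I-squared statistic to quantify heterogeneity. The sketch below shows how these quantities are conventionally computed under a fixed-effect, inverse-variance model; the per-study differences and variances are made up for illustration and are not taken from the review.

```python
import numpy as np

def fixed_effect_meta(differences, variances):
    """Minimal sketch of the meta-analytic quantities named above:
    inverse-variance weighted mean difference (test minus control) with
    a 95% CI, plus Cochran's Q and the I-squared heterogeneity statistic."""
    d = np.asarray(differences, float)
    w = 1.0 / np.asarray(variances, float)   # inverse-variance weights
    wmd = np.sum(w * d) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    ci = (wmd - 1.96 * se, wmd + 1.96 * se)
    q = np.sum(w * (d - wmd) ** 2)           # Cochran's Q
    df = len(d) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return wmd, ci, i2

# Hypothetical per-study gains in keratinized tissue width (mm) and variances.
wmd, ci, i2 = fixed_effect_meta([1.2, 0.8, 1.5, 1.1], [0.04, 0.09, 0.06, 0.05])
print(f"WMD = {wmd:.2f} mm, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), I2 = {i2:.0f}%")
```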

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: Harvesting techniques can affect the cellular parameters of autogenous bone grafts in vitro. Whether these differences translate into in vivo bone formation, however, remains unknown. OBJECTIVE: The purpose of this study was to assess the impact of different harvesting techniques on bone formation and graft resorption in vivo. MATERIAL AND METHODS: Four harvesting techniques were used: (i) corticocancellous blocks particulated by a bone mill; (ii) bone scraper; (iii) piezosurgery; and (iv) bone slurry collected from a filter device upon drilling. The grafts were placed into bone defects in the mandibles of 12 minipigs. The animals were sacrificed after 1, 2, 4 and 8 weeks of healing. Histological and histomorphometric analyses were performed to assess bone formation and graft resorption. An exploratory statistical analysis was performed. RESULTS: The amount of new bone increased, while the amount of residual bone graft decreased over time with all harvesting techniques. At all given time points, no significant advantage of any harvesting technique on bone formation was observed. The harvesting technique, however, affected bone formation and the amount of residual graft within the overall healing period. The Friedman test revealed an impact of the harvesting technique on residual bone graft after 2 and 4 weeks. At the later time point, post hoc testing showed more newly formed bone in association with bone graft processed by the bone mill than with graft harvested by bone scraper or piezosurgery. CONCLUSIONS: Transplantation of autogenous bone particles harvested with the four techniques in the present model resulted in moderate differences in bone formation and graft resorption.
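The exploratory analysis relies on the Friedman test to compare the four harvesting techniques on the same defects. A minimal sketch of such a comparison with SciPy is shown below; the histomorphometric values are simulated placeholders, not the study's data.

```python
import numpy as np
from scipy.stats import friedmanchisquare

# Hypothetical histomorphometric readings (% new bone) for the same 6
# defects, one column per harvesting technique.
rng = np.random.default_rng(0)
bone_mill = rng.normal(30, 4, 6)
bone_scraper = rng.normal(26, 4, 6)
piezosurgery = rng.normal(25, 4, 6)
bone_slurry = rng.normal(28, 4, 6)

# Friedman test: a non-parametric repeated-measures comparison of the
# four related samples, as used in the study's exploratory analysis.
stat, p = friedmanchisquare(bone_mill, bone_scraper, piezosurgery, bone_slurry)
print(f"Friedman chi-square = {stat:.2f}, p = {p:.3f}")
```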

Relevance:

30.00%

Publisher:

Abstract:

Long-term measurements of CO2 flux can be obtained using the eddy covariance technique, but these datasets contain gaps, which hinder the estimation of robust long-term means and annual ecosystem exchanges. We compare results obtained using three gap-filling techniques: multiple regression (MR), multiple imputation (MI), and artificial neural networks (ANNs), applied to a one-year dataset of hourly CO2 flux measurements collected in Lutjewad, over a flat agricultural area near the Wadden Sea dike in the north of the Netherlands. The dataset was separated into two subsets: a learning set and a validation set. The performance of the gap-filling techniques was analysed using several statistical criteria: the coefficient of determination (R2), root mean square error (RMSE), mean absolute error (MAE), maximum absolute error (MaxAE), and mean square bias (MSB). The gap-filling accuracy is seasonally dependent, with better results in cold seasons. The highest accuracy is obtained with the ANN technique, which is also less sensitive to environmental/seasonal conditions. We argue that filling gaps directly in the measured CO2 fluxes is more advantageous than the common approach of filling gaps in the calculated net ecosystem exchange, because the ANN is an empirical method and less scatter is expected when gap filling is applied directly to the measurements.
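As a rough illustration of the validation step, the sketch below computes the five reported criteria on a held-out set. The exact definition of MSB is not given in the abstract; here it is assumed to be the squared mean bias.

```python
import numpy as np

def gap_fill_scores(observed, predicted):
    """Validation metrics for a gap-filling model, computed on the
    held-out (validation) subset. MSB is taken here as the squared
    mean bias; the paper may define it slightly differently."""
    residuals = predicted - observed
    ss_res = np.sum(residuals**2)
    ss_tot = np.sum((observed - observed.mean())**2)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": np.sqrt(np.mean(residuals**2)),
        "MAE": np.mean(np.abs(residuals)),
        "MaxAE": np.max(np.abs(residuals)),
        "MSB": np.mean(residuals)**2,
    }

# Example: score a gap-filler against withheld flux measurements.
rng = np.random.default_rng(0)
obs = rng.normal(0.0, 2.0, size=500)           # hourly CO2 fluxes (validation set)
pred = obs + rng.normal(0.1, 0.5, size=500)    # model output with slight bias
print(gap_fill_scores(obs, pred))
```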

Relevance:

30.00%

Publisher:

Abstract:

Calcium levels in spines play a significant role in determining the sign and magnitude of synaptic plasticity. The magnitude of calcium influx into spines depends strongly on influx through N-methyl-D-aspartate (NMDA) receptors, and therefore on the number of postsynaptic NMDA receptors in each spine. We have previously calculated how the number of postsynaptic NMDA receptors determines the mean and variance of calcium transients in the postsynaptic density, and how this alters the shape of plasticity curves. However, the number of NMDA receptors in the postsynaptic density is not well known. Anatomical methods for estimating the number of NMDA receptors produce estimates that are very different from those produced by physiological techniques. The physiological techniques are based on the statistics of synaptic transmission, and it is difficult to estimate their precision experimentally. In this paper we use stochastic simulations to test the validity of a physiological estimation technique based on failure analysis. We find that the method is likely to underestimate the number of postsynaptic NMDA receptors, explain the source of the error, and re-derive a more precise estimation technique. We also show that the original failure analysis, as well as our improved formulas, are not robust to small estimation errors in key parameters.
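To make the failure-analysis idea concrete, the following sketch implements its textbook binomial form: if each of N receptors opens independently with probability p, the failure probability is (1 - p)^N, so N can be estimated from the observed failure fraction. This idealized model recovers N well; the paper's point is that with realistic stochastic channel gating the same formula becomes biased downward. All parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

N_true = 10      # actual number of postsynaptic NMDA receptors
p_open = 0.3     # assumed opening probability of a single receptor
n_trials = 5000

# Simulate trials: a "failure" is a trial in which no receptor opens.
openings = rng.binomial(N_true, p_open, size=n_trials)
failure_rate = np.mean(openings == 0)

# Binomial failure analysis: P(failure) = (1 - p_open)**N,
# so N is estimated from the observed failure fraction.
N_est = np.log(failure_rate) / np.log(1.0 - p_open)
print(f"true N = {N_true}, estimated N = {N_est:.2f}")
```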

Relevance:

30.00%

Publisher:

Abstract:

Nuclear morphometry (NM) uses image analysis to measure features of the cell nucleus, which are classified as bulk properties, shape or form, and DNA distribution. Studies have used these measurements as diagnostic and prognostic indicators of disease, with inconclusive results. The distributional properties of these variables have not been systematically investigated, although much medical data exhibit non-normal distributions. Measurements are made on several hundred cells per patient, so summary measures reflecting the underlying distribution are needed.

Distributional characteristics of 34 NM variables from prostate cancer cells were investigated using graphical and analytical techniques. Cells per sample ranged from 52 to 458. A small sample of patients with benign prostatic hyperplasia (BPH), representing non-cancer cells, was used for general comparison with the cancer cells.

Data transformations such as log, square root, and 1/x did not yield normality as measured by the Shapiro-Wilk test. A modulus transformation, used for distributions with abnormal kurtosis values, also did not produce normality.

Kernel density histograms of the 34 variables exhibited non-normality, and 18 variables also exhibited bimodality. A bimodality coefficient was calculated, and three variables (DNA concentration, shape, and elongation) showed the strongest evidence of bimodality and were studied further.

Two analytical approaches were used to obtain a summary measure per variable per patient: cluster analysis to determine significant clusters, and a mixture model analysis using a two-component Gaussian model with equal variances. The mixture component parameters were used to bootstrap the log-likelihood ratio to determine the significant number of components (1 or 2). These summary measures were used as predictors of disease severity in several proportional odds logistic regression models. The disease severity scale had five levels and was constructed from three components: extracapsular penetration (ECP), lymph node involvement (LN+), and seminal vesicle involvement (SV+), which represent surrogate measures of prognosis. The summary measures were not strong predictors of disease severity. There was some indication from the mixture model results of changes in the mean levels and proportions of the components at the lower severity levels.
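The dissertation does not state which bimodality coefficient was used; a common choice is the SAS-style coefficient based on sample skewness and kurtosis, sketched below on simulated unimodal and bimodal data.

```python
import numpy as np
from scipy import stats

def bimodality_coefficient(x):
    """SAS-style bimodality coefficient: values above ~0.555
    (the value for a uniform distribution) suggest bimodality."""
    n = len(x)
    skew = stats.skew(x)        # sample skewness
    kurt = stats.kurtosis(x)    # excess kurtosis
    return (skew**2 + 1) / (kurt + 3 * (n - 1)**2 / ((n - 2) * (n - 3)))

rng = np.random.default_rng(1)
unimodal = rng.normal(0, 1, 300)
bimodal = np.concatenate([rng.normal(-2, 0.5, 150), rng.normal(2, 0.5, 150)])
print(bimodality_coefficient(unimodal))  # well below 0.555
print(bimodality_coefficient(bimodal))   # well above 0.555
```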

Relevance:

30.00%

Publisher:

Abstract:

In this paper, the reconstruction of three-dimensional (3D) patient-specific models of a hip joint from two-dimensional (2D) calibrated X-ray images is addressed. Existing 2D-3D reconstruction techniques usually reconstruct a patient-specific model of a single anatomical structure without considering its relationship to neighboring structures. Thus, when those techniques are applied to the reconstruction of patient-specific models of a hip joint, the reconstructed models may penetrate each other due to the narrowness of the hip joint space and hence fail to represent the true hip joint of the patient. To address this problem we propose a novel 2D-3D reconstruction framework using an articulated statistical shape model (aSSM). Unlike previous work on constructing an aSSM, where the joint posture is modeled as articulation in a training set via statistical analysis, here it is modeled as a parametrized rotation of the femur around the joint center. The exact rotation of the hip joint, as well as the patient-specific models of the joint structures, i.e., the proximal femur and the pelvis, are then estimated by optimally fitting the aSSM to a limited number of calibrated X-ray images. Taking models segmented from CT data as the ground truth, we conducted validation experiments on both plastic and cadaveric bones. Qualitatively, the experimental results demonstrated that the proposed 2D-3D reconstruction framework preserved the hip joint structure, and no model penetration was found. Quantitatively, average reconstruction errors of 1.9 mm and 1.1 mm were found for the pelvis and the proximal femur, respectively.
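As an illustration of the posture parametrization described above (a rotation of the femur around the joint center), the sketch below applies a Rodrigues rotation to a point cloud; the axis, angle, and joint center are placeholders, not values from the paper.

```python
import numpy as np

def rotate_about_center(points, axis, angle, center):
    """Rotate a point cloud (e.g., femur surface vertices) by `angle`
    radians about a unit `axis` through `center` (the hip joint center),
    using Rodrigues' rotation formula."""
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    R = np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)
    return (points - center) @ R.T + center

# Example: flex a femur model by 10 degrees about a medio-lateral axis.
femur_vertices = np.random.rand(1000, 3)     # placeholder surface points
joint_center = np.array([0.5, 0.5, 0.5])
flexed = rotate_about_center(femur_vertices, np.array([1.0, 0.0, 0.0]),
                             np.deg2rad(10), joint_center)
```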

Relevance:

30.00%

Publisher:

Abstract:

The International Society for Clinical Densitometry (ISCD) has developed new official positions for the clinical use of computed tomography (CT) scans acquired without a calibration phantom, for example, CT scans obtained for other diagnoses such as colonography. This also addresses techniques suggested for opportunistic screening for osteoporosis. The ISCD task force for quantitative CT (QCT) reviewed the evidence for clinical applications of these new techniques and presented a report with recommendations at the 2015 ISCD Position Development Conference. Here we discuss the agreed-upon ISCD official positions with supporting medical evidence, rationale, controversy, and suggestions for further study. Advanced techniques summarized as statistical parameter mapping methods were also reviewed. Their future use is promising, but their clinical application is premature. The clinical use of QCT of the hip is addressed in part I, and finite element analysis of the hip and spine in part II.

Relevance:

30.00%

Publisher:

Abstract:

Mixed longitudinal designs are important study designs for many areas of medical research. Mixed longitudinal studies have several advantages over cross-sectional or pure longitudinal studies, including shorter completion time and the ability to separate time and age effects, and are thus an attractive choice. Statistical methodology for general longitudinal studies has developed rapidly within the last few decades. A common approach to statistical modeling in studies with mixed longitudinal designs has been the linear mixed-effects model incorporating an age or time effect. The general linear mixed-effects model is considered an appropriate choice for analyzing repeated-measurements data in longitudinal studies. However, common applications of the linear mixed-effects model to mixed longitudinal studies often incorporate age as the only random effect and fail to take the cohort effect into consideration when conducting statistical inferences on age-related trajectories of outcome measurements. We believe special attention should be paid to cohort effects when analyzing data from mixed longitudinal designs with multiple overlapping cohorts, so this has become an important statistical issue to address.

This research aims to address statistical issues related to mixed longitudinal studies. The proposed study examined the existing statistical analysis methods for mixed longitudinal designs and developed an alternative analytic method that incorporates effects from multiple overlapping cohorts as well as from subjects of different ages. The study used simulation to evaluate the performance of the proposed analytic method by comparing it with the commonly used model. Finally, the study applied the proposed analytic method to data collected by an existing study, Project HeartBeat!, which had been evaluated using traditional analytic techniques. Project HeartBeat! is a longitudinal study of cardiovascular disease (CVD) risk factors in childhood and adolescence using a mixed longitudinal design. The proposed model was used to evaluate four blood lipids, adjusting for age, gender, race/ethnicity, and endocrine hormones. The results of this dissertation suggest that the proposed analytic model could be a more flexible and reliable choice than the traditional model in terms of fitting the data and providing more accurate estimates in mixed longitudinal studies. Conceptually, the proposed model has useful features, including consideration of effects from multiple overlapping cohorts, and is an attractive approach for analyzing data from mixed longitudinal design studies.
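To make the modeling contrast concrete, the following sketch (using statsmodels, on simulated data with three overlapping cohorts) fits the common age-only linear mixed-effects model and a variant that additionally adjusts for cohort; the dissertation's actual model specification may differ.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated mixed longitudinal data: three overlapping age cohorts,
# each subject measured repeatedly as they age.
rng = np.random.default_rng(0)
rows = []
for subj in range(90):
    cohort = subj % 3                  # entry cohorts starting at ages 8/11/14
    start_age = 8 + 3 * cohort
    for visit in range(4):
        age = start_age + visit
        y = 2.0 + 0.5 * age + 1.5 * cohort + rng.normal(0, 1)
        rows.append({"subject": subj, "cohort": cohort, "age": age, "y": y})
df = pd.DataFrame(rows)

# Common model: random intercept per subject, age as the only effect.
m_age = smf.mixedlm("y ~ age", df, groups="subject").fit()

# Alternative in the spirit of the dissertation: also adjust for cohort.
m_cohort = smf.mixedlm("y ~ age + C(cohort)", df, groups="subject").fit()
print(m_age.params, m_cohort.params, sep="\n")
```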

Relevance:

30.00%

Publisher:

Abstract:

Accurate quantitative estimation of exposure using retrospective data has been one of the most challenging tasks in the exposure assessment field. To improve these estimates, models have been developed using published exposure databases together with their corresponding exposure determinants. These models are designed to be applied to reported exposure determinants obtained from study subjects, or to exposure levels assigned by an industrial hygienist, so that quantitative exposure estimates can be obtained.

In an effort to improve the prediction accuracy and generalizability of these models, and taking into account that the limitations encountered in previous studies might be due to limitations in the applicability of traditional statistical methods and concepts, the use of computer science-derived data analysis methods, predominantly machine learning approaches, was proposed and explored in this study.

The goal of this study was to develop a set of models using decision tree/ensemble and neural network methods to predict occupational exposure outcomes based on literature-derived databases and to compare, using cross-validation and data-splitting techniques, the resulting prediction capacity to that of traditional regression models. Two cases were addressed: the categorical case, where the exposure level was measured as an exposure rating following the American Industrial Hygiene Association guidelines, and the continuous case, where the exposure result is expressed as a concentration value. Previously developed literature-based exposure databases for 1,1,1-trichloroethane, methylene dichloride, and trichloroethylene were used.

When compared with regression estimates, the results showed better accuracy of decision tree/ensemble techniques for the categorical case, while neural networks were better for the estimation of continuous exposure values. Overrepresentation of classes and overfitting were the main causes of poor neural network performance and accuracy. Estimates based on literature-based databases using machine learning techniques might provide an advantage when applied to methodologies that combine 'expert inputs' with current exposure measurements, such as the Bayesian Decision Analysis tool. The use of machine learning techniques to estimate exposures from literature-based exposure databases more accurately might represent a starting point toward independence from expert judgment.
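The comparison protocol for the continuous case can be illustrated as below: cross-validated performance of a traditional regression model versus an ensemble method. The data are synthetic stand-ins; the actual literature-derived databases and model configurations are not reproduced here.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Stand-in for a literature-derived exposure database: rows are reported
# measurements, columns are exposure determinants, y is a concentration.
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)

for name, model in [("linear regression", LinearRegression()),
                    ("random forest", RandomForestRegressor(random_state=0))]:
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean CV R2 = {scores.mean():.3f}")
```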

Relevance:

30.00%

Publisher:

Abstract:

Measures of agro-ecosystem genetic variability are essential to sustain science-based actions and policies aimed at protecting the ecosystem services these systems provide. Building the genetic variability datum requires dealing with a large number of variables of different types. Molecular marker data are high-dimensional by nature, and additional types of information, such as morphological and physiological traits, are frequently obtained. Genetic variability studies are thus usually associated with the measurement of several traits on each entity. Multivariate methods aim at finding proximities between entities characterized by multiple traits by summarizing the information in a few synthetic variables. In this work we discuss and illustrate several multivariate methods used for different purposes in building the genetic variability datum. We include methods applied in studies exploring the spatial structure of genetic variability and the association of genetic data with other sources of information. Multivariate techniques allow the pursuit of the genetic variability datum as a unifying notion that merges concepts of the type, abundance, and distribution of variability at the gene level.
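As one concrete example of the multivariate summarization discussed above, the sketch below applies principal component analysis to a simulated marker matrix to obtain a few synthetic variables; PCA is only one of the several methods the paper reviews.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in marker matrix: 50 accessions x 200 biallelic markers (0/1/2).
rng = np.random.default_rng(0)
markers = rng.integers(0, 3, size=(50, 200)).astype(float)

# Summarize the high-dimensional marker data in a few synthetic variables
# and inspect how much variability each component captures.
pca = PCA(n_components=5)
scores = pca.fit_transform(markers)
print(pca.explained_variance_ratio_)
# scores[:, :2] gives coordinates for a proximity plot of accessions.
```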

Relevance:

30.00%

Publisher:

Abstract:

Stereo video techniques are effective for estimating the space-time wave dynamics over an area of the ocean. Indeed, a stereo camera view allows the retrieval of both spatial and temporal data whose statistical content is richer than that of time series retrieved from point wave probes. Classical epipolar techniques and modern variational methods for reconstructing the sea surface from the stereo pairs sequentially in time are reviewed, and current improvements to the variational methods are presented.
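The core geometric step of the classical epipolar approach is triangulating each matched pixel pair into a 3D sea-surface point. A minimal linear (DLT) triangulation sketch with toy camera matrices is given below; it is a generic illustration, not the authors' pipeline.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation: recover a 3D point from its pixel
    coordinates x1, x2 in two calibrated views with 3x4 projection
    matrices P1, P2 -- the core step of epipolar reconstruction."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Example with two toy cameras observing a point on the sea surface.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])               # reference camera
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0], [0]])])   # translated camera
X_true = np.array([0.2, -0.1, 5.0])
x1 = P1 @ np.append(X_true, 1); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1); x2 = x2[:2] / x2[2]
print(triangulate(P1, P2, x1, x2))  # ~ [0.2, -0.1, 5.0]
```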

Relevance:

30.00%

Publisher:

Abstract:

One important task in the design of an antenna is to carry out an analysis to find the antenna characteristics that best fulfill the specifications fixed by the application. After that, a prototype is manufactured, and the next stage in the design process is to check whether the radiation pattern differs from the designed one. Besides the radiation pattern, other radiation parameters such as directivity, gain, impedance, beamwidth, efficiency, and polarization must also be evaluated. For this purpose, accurate antenna measurement techniques are needed in order to know the actual electromagnetic behavior of the antenna under test. For this reason, most measurements are performed in anechoic chambers: closed, normally shielded areas covered with radiation-absorbing material that simulate free-space propagation conditions. Moreover, these facilities can be used regardless of weather conditions and allow measurements free from interference. Despite all the advantages of anechoic chambers, the results obtained from both far-field and near-field measurements are inevitably affected by errors. Thus, the main objective of this Thesis is to propose algorithms that improve the quality of antenna measurement results using post-processing techniques, without requiring additional measurements.

First, a thorough review of the state of the art was carried out to give a general view of the possibilities for characterizing or reducing the effects of errors in antenna measurements. Then, new methods to reduce the unwanted effects of four of the most common errors in antenna measurements are described and validated both theoretically and numerically. The basis of all of them is the same: to transform the data from the measurement surface to another domain where there is enough information to easily remove the contribution of the errors. The four errors analyzed are noise, reflections, truncation errors, and leakage, and the tools used to suppress them are mainly source reconstruction techniques, spatial and modal filtering, and iterative algorithms for extrapolating functions. The main idea of all the methods is therefore to modify the classical near-field-to-far-field transformations by including additional steps with which errors can be greatly suppressed. Moreover, the proposed methods are not computationally complex and, because they are applied in post-processing, no additional measurements are required.

Noise is the most widely studied error in this Thesis; a total of three alternatives are proposed to filter out an important noise contribution before obtaining the far-field pattern. The first is based on modal filtering. The second uses a source reconstruction technique to obtain the extreme near-field, where spatial filtering can be applied. The last is to back-propagate the measured field to a surface with the same geometry as the measurement surface but closer to the AUT, and then also apply spatial filtering. All the alternatives are analyzed in the three most common near-field systems, including comprehensive statistical noise analyses to deduce the signal-to-noise ratio improvement achieved in each case.
The method to suppress reflections in antenna measurements is also based on a source reconstruction technique; the main idea is to reconstruct the field over a surface larger than the antenna aperture in order to identify, and later suppress, the virtual sources related to the reflected waves. The truncation error present in the results obtained from planar, cylindrical, and partial spherical near-field measurements is the third error analyzed in this Thesis. The method to reduce this error is based on an iterative algorithm that extrapolates the reliable region of the far-field pattern from knowledge of the field distribution on the AUT plane. The proper termination point of this iterative algorithm, as well as other critical aspects of the method, are also studied. The last part of this work is dedicated to the detection and suppression of the two most common leakage sources in antenna measurements. A first method estimates the leakage bias constant added by the receiver's quadrature detector to every near-field datum and then suppresses its effect on the far-field pattern. The second method can be divided into two parts: the first finds the position of the faulty component that radiates or receives unwanted radiation, making its identification within the measurement environment and its later substitution easier; the second is able to computationally remove the leakage effect without requiring the substitution of the faulty component.
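As an illustration of the modal-filtering idea behind the first noise-suppression alternative, the sketch below transforms planar near-field data to the plane-wave spectrum and zeroes the modes outside the visible region, where the antenna cannot radiate propagating waves and noise dominates. This is a generic sketch, not the Thesis' exact algorithm; sampling and frequency values are placeholders.

```python
import numpy as np

def modal_filter(E_plane, dx, dy, wavelength):
    """Minimal sketch of modal filtering for planar near-field data:
    transform the measured field to the plane-wave spectrum with a 2-D
    FFT and zero the modes outside the visible region (kx^2 + ky^2 > k^2),
    where no propagating antenna modes lie and noise dominates."""
    k = 2 * np.pi / wavelength
    ny, nx = E_plane.shape
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=dx)
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=dy)
    KX, KY = np.meshgrid(kx, ky)
    spectrum = np.fft.fft2(E_plane)
    spectrum[KX**2 + KY**2 > k**2] = 0.0   # suppress non-propagating modes
    return np.fft.ifft2(spectrum)

# Example: filter a noisy 64x64 near-field scan sampled every half wavelength.
rng = np.random.default_rng(0)
wavelength = 0.03                          # 10 GHz, in meters
field = rng.normal(size=(64, 64)) + 1j * rng.normal(size=(64, 64))
filtered = modal_filter(field, dx=wavelength / 2, dy=wavelength / 2,
                        wavelength=wavelength)
```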

Relevance:

30.00%

Publisher:

Abstract:

Pragmatism is the leading motivation of regularization. We can understand regularization as a modification of the maximum-likelihood estimator so that a reasonable answer can be given in an unstable or ill-posed situation. To mention some typical examples, this happens when fitting parametric or non-parametric models with more parameters than data, or when estimating large covariance matrices. Regularization is also usually used to improve the bias-variance tradeoff of an estimate. The definition of regularization is therefore quite general and, although the introduction of a penalty is probably the most popular type, it is just one of multiple forms of regularization.

In this dissertation, we focus on the applications of regularization for obtaining sparse or parsimonious representations, where only a subset of the inputs is used. A particular form of regularization, L1-regularization, plays a key role in reaching sparsity. Most of the contributions presented here revolve around L1-regularization, although other forms of regularization are explored (also pursuing sparsity in some sense). In addition to presenting a compact review of L1-regularization and its applications in statistics and machine learning, we devise methodology for regression, supervised classification, and structure induction of graphical models. Within the regression paradigm, we focus on kernel smoothing learning, proposing techniques for kernel design that are suitable for high-dimensional settings and sparse regression functions. We also present an application of regularized regression techniques to modeling the response of biological neurons. The supervised classification advances deal, on the one hand, with the application of regularization for obtaining a naïve Bayes classifier and, on the other hand, with a novel algorithm for brain-computer interface design that uses group regularization in an efficient manner. Finally, we present a heuristic for inducing the structure of Gaussian Bayesian networks using L1-regularization as a filter.
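As a small illustration of how L1-regularization yields the sparse representations discussed above, the sketch below compares ordinary least squares with the lasso on synthetic data whose true coefficient vector is sparse; it is not drawn from the dissertation's experiments.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

# Sparse ground truth: only 3 of 50 inputs actually drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))
beta = np.zeros(50)
beta[[3, 17, 42]] = [2.0, -1.5, 1.0]
y = X @ beta + rng.normal(0, 0.5, size=100)

ols = LinearRegression().fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

# The L1 penalty drives most coefficients exactly to zero,
# yielding the parsimonious representation discussed above.
print("nonzero OLS coefficients:  ", np.sum(np.abs(ols.coef_) > 1e-8))
print("nonzero lasso coefficients:", np.sum(np.abs(lasso.coef_) > 1e-8))
```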

Relevance:

30.00%

Publisher:

Abstract:

One of the main drawbacks of wind energy is that it exhibits intermittent generation, greatly depending on environmental conditions. Wind power forecasting has proven to be an effective tool for facilitating wind power integration from both the technical and the economic perspective. Indeed, system operators and energy traders benefit from the use of forecasting techniques, because reducing the inherent uncertainty of wind power allows them to adopt optimal decisions. Wind power integration imposes new challenges as higher wind penetration levels are attained. Wind power ramp forecasting is an example of such a recent topic of interest. The term ramp refers to a large and rapid variation (1-4 hours) observed in the wind power output of a wind farm or portfolio. Ramp events can be caused by a broad number of meteorological processes that occur at different temporal/spatial scales, from the passage of large-scale frontal systems to local processes such as thunderstorms and thermally driven flows. Ramp events may also be conditioned by features of the wind-to-power conversion process, such as yaw misalignment, wind turbine shut-down, and the aerodynamic interaction between the wind turbines of a wind farm (wake effect).

This work is devoted to wind power ramp forecasting, with special focus on the connection between the global scale and ramp events observed at the wind farm level. The framework of this study is the point-forecasting approach. Time-series-based models were implemented for very short-term prediction, characterised by prediction horizons of up to six hours ahead. As a first step, a methodology to characterise ramps within a wind power time series was proposed. The so-called ramp function is based on the wavelet transform, and it provides a continuous index related to the ramp intensity at each time step. The underlying idea is that ramps are characterised by high power output gradients evaluated over different time scales. A number of state-of-the-art time-series-based models were considered, namely linear autoregressive (AR) models, varying-coefficient models (VCMs) and artificial neural networks (ANNs). This allowed us to gain insight into how the complexity of the model contributes to the accuracy of the wind power time series modelling. The models were trained using a mean squared error criterion, and the final set-up of each model was determined through cross-validation techniques.

In order to investigate the contribution of the global scale to wind power ramp forecasting, a methodology was proposed to identify features in raw atmospheric data that are relevant for explaining wind power ramp events. The proposed methodology is based on two techniques: principal component analysis (PCA) for atmospheric data compression and mutual information (MI) for assessing non-linear dependence between variables. The methodology was applied to reanalysis data generated with a general circulation model (GCM). This allowed the elaboration of explanatory variables meaningful for ramp forecasting that were used as exogenous variables by the forecasting models. The study covered two wind farms located in Spain. All the models outperformed the reference model (persistence) during both ramp and non-ramp situations. Adding atmospheric information had a noticeable impact on the forecasting performance, especially during ramp-down events. The results also suggested different levels of connection between ramp occurrence at the wind farm level and the global scale.
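The actual ramp function is built on the wavelet transform; the sketch below implements a simplified multi-scale variant of the same idea, scoring each time step by the largest mean power gradient over a set of time scales. The scales and the simulated power series are illustrative.

```python
import numpy as np

def ramp_function(power, scales):
    """Simplified version of the ramp function described above: instead
    of a full wavelet transform, characterise the ramp intensity at each
    time step as the largest mean power gradient observed over a set of
    time scales (window half-widths, in samples)."""
    n = len(power)
    index = np.zeros(n)
    for s in scales:
        grad = np.zeros(n)
        # Mean gradient over a window of half-width s around each step.
        grad[s:n - s] = np.abs(power[2 * s:] - power[:-2 * s]) / (2 * s)
        index = np.maximum(index, grad)
    return index

# Example: hourly wind farm output with an artificial ramp-up event.
rng = np.random.default_rng(0)
p = np.clip(0.3 + 0.02 * rng.normal(size=200).cumsum(), 0, 1)
p[100:104] += np.linspace(0, 0.4, 4)          # a 4-hour ramp event
r = ramp_function(p, scales=[1, 2, 3])
print("largest ramp index at hour:", int(np.argmax(r)))
```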