891 results for High-Dimensional Space Geometrical Informatics (HDSGI)


Relevance:

100.00% 100.00%

Publisher:

Abstract:

In the setting of high-dimensional linear models with Gaussian noise, we investigate the possibility of confidence statements connected to model selection. Although there exist numerous procedures for adaptive (point) estimation, the construction of adaptive confidence regions is severely limited (cf. Li in Ann Stat 17:1001–1008, 1989). The present paper sheds new light on this gap. We develop exact and adaptive confidence regions for the best approximating model in terms of risk. One of our constructions is based on a multiscale procedure and a particular coupling argument. Utilizing exponential inequalities for noncentral χ²-distributions, we show that the risk and quadratic loss of all models within our confidence region are uniformly bounded by the minimal risk times a factor close to one.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Displacements of the Earth’s surface caused by tidal and non-tidal loading forces are relevant in high-precision space geodesy. The international scientific community recommends applying some of the corrections at the observation level, e.g., ocean tidal loading (OTL) and atmospheric tidal loading (ATL). Non-tidal displacement corrections, in particular atmospheric non-tidal loading (ANTL) and oceanic and hydrological non-tidal corrections, are in general recommended not to be applied in the products of the International Earth Rotation and Reference Systems Service. We assess and compare the impact of OTL, ATL and ANTL on parameters derived from satellite laser ranging (SLR) by reprocessing 12 years of SLR data with and without the individual corrections. We show that loading displacements influence not only station long-term stability, but also geocenter coordinates, Earth Rotation Parameters, and satellite orbits. Applying the loading corrections reduces the amplitudes of annual signals in the time series of geocenter and station coordinates. The general improvements in SLR station 3D coordinate repeatability when applying OTL, ATL and ANTL corrections are 19.5 %, 0.2 % and 3.3 %, respectively, w.r.t. the solutions without loading corrections. ANTL corrections play a crucial role in the combination of optical (SLR) and microwave (GNSS, VLBI, DORIS) space geodetic observation techniques because of the so-called Blue-Sky effect: SLR measurements can be carried out only under cloudless sky conditions, typically during high air pressure conditions when the Earth’s crust is deformed, whereas microwave observations are weather-independent. Thus, applying the loading corrections at the observation level improves SLR-derived products as well as the consistency with microwave-based results. We assess the Blue-Sky effect on SLR stations and the consistency improvement between GNSS and SLR solutions when ANTL corrections are included. 
The omission of ANTL corrections may lead to inconsistencies between SLR and GNSS solutions of up to 2.5 mm for inland stations. As a result, the estimated GNSS–SLR coordinate differences correspond better to the local ties at the co-located stations when applying ANTL corrections.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Brain tumors are among the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of patients surviving more than 5 years after diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. To evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e., short- vs. long-term survival), and a small subset of genes whose expression values can be used to predict survival time is selected. Next, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Because of the large heterogeneity observed within the prognostic classes obtained by the Random Forest model, prediction can be improved by relating time-to-death to the gene expression profile directly. We propose a Bayesian ensemble model for survival prediction that is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model, which is flexible enough to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the CPH, Weibull, and AFT survival models. We overcome the lack of conjugacy by using a latent variable formulation to model the covariate effects, which decreases computation time for model fitting. 
Our proposed models also provide a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares the results obtained by the Random Forest and Bayesian ensemble methods from the biological and clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

An autonomous energy source within a human body is of key importance in the development of medical implants. This work deals with the modelling and the validation of an energy harvesting device which converts the myocardial contractions into electrical energy. The mechanism consists of a clockwork from a commercially available wrist watch. We developed a physical model which is able to predict the total amount of energy generated when applying an external excitation. For the validation of the model, a custom-made hexapod robot was used to accelerate the harvesting device along a given trajectory. We applied forward kinematics to determine the actual motion experienced by the harvesting device. The motion provides translational as well as rotational motion information for accurate simulations in three-dimensional space. The physical model could be successfully validated.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper proposes an automated three-dimensional (3D) lumbar intervertebral disc (IVD) segmentation strategy for Magnetic Resonance Imaging (MRI) data. Starting from two user-supplied landmarks, the geometrical parameters of all lumbar vertebral bodies and intervertebral discs are automatically extracted from a mid-sagittal slice using a template matching approach based on a graphical model. Based on the estimated two-dimensional (2D) geometrical parameters, a 3D variable-radius soft tube model of the lumbar spine column is built by model fitting to the 3D data volume. Taking the geometrical information from the 3D lumbar spine column as constraints and as segmentation initialization, the disc segmentation is achieved by a multi-kernel diffeomorphic registration between a 3D template of the disc and the observed MRI data. Experiments on 15 patient data sets showed the robustness and accuracy of the proposed algorithm.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Next-generation DNA sequencing platforms can effectively detect the entire spectrum of genomic variation and are emerging as a major tool for systematic exploration of the universe of variants and interactions in the entire genome. However, the data produced by next-generation sequencing technologies suffer from three basic problems: sequence errors, assembly errors, and missing data. Current statistical methods for genetic analysis are well suited for detecting the association of common variants, but are less suitable for rare variants. This poses a great challenge for sequence-based genetic studies of complex diseases. This dissertation used the genome continuum model as a general principle, and stochastic calculus and functional data analysis as tools, to develop novel and powerful statistical methods for the next generation of association studies of both qualitative and quantitative traits in the context of sequencing data, shifting the paradigm of association analysis from the current locus-by-locus analysis to the collective analysis of genome regions. In this project, functional principal component (FPC) methods coupled with high-dimensional data reduction techniques were used to develop novel and powerful methods for testing the associations of the entire spectrum of genetic variation within a segment of the genome or a gene, regardless of whether the variants are common or rare. Classical quantitative genetics methods suffer from high type I error rates and low power for rare variants. To overcome these limitations for resequencing data, this project used functional linear models with scalar response to develop statistics for identifying quantitative trait loci (QTLs) for both common and rare variants. To illustrate their application, the functional linear models were applied to five quantitative traits in the Framingham Heart Study. 
This project also proposed a novel concept of gene-gene co-association, in which a gene or a genomic region is taken as the unit of association analysis, and used stochastic calculus to develop a unified framework for testing the association of multiple genes or genomic regions for both common and rare alleles. The proposed methods were applied to gene-gene co-association analysis of psoriasis in two independent GWAS datasets, which led to the discovery of networks significantly associated with psoriasis.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

My dissertation focuses mainly on Bayesian adaptive designs for phase I and phase II clinical trials. It includes three specific topics: (1) proposing a novel two-dimensional dose-finding algorithm for biological agents, (2) developing Bayesian adaptive screening designs to provide more efficient and ethical clinical trials, and (3) incorporating missing late-onset responses to make an early stopping decision. Treating patients with novel biological agents is becoming a leading trend in oncology. Unlike cytotoxic agents, for which toxicity and efficacy monotonically increase with dose, biological agents may exhibit non-monotonic patterns in their dose-response relationships. Using a trial with two biological agents as an example, we propose a phase I/II trial design to identify the biologically optimal dose combination (BODC), which is defined as the dose combination of the two agents with the highest efficacy and tolerable toxicity. A change-point model is used to reflect the fact that the dose-toxicity surface of the combinational agents may plateau at higher dose levels, and a flexible logistic model is proposed to accommodate the possible non-monotonic pattern for the dose-efficacy relationship. During the trial, we continuously update the posterior estimates of toxicity and efficacy and assign patients to the most appropriate dose combination. We propose a novel dose-finding algorithm to encourage sufficient exploration of untried dose combinations in the two-dimensional space. Extensive simulation studies show that the proposed design has desirable operating characteristics in identifying the BODC under various patterns of dose-toxicity and dose-efficacy relationships. Trials of combination therapies for the treatment of cancer are playing an increasingly important role in the battle against this disease. 
To handle more efficiently the large number of combination therapies that must be tested, we propose a novel Bayesian phase II adaptive screening design to simultaneously select among possible treatment combinations involving multiple agents. Our design is based on formulating the selection procedure as a Bayesian hypothesis testing problem in which the superiority of each treatment combination is equated to a single hypothesis. During the trial, we use the current values of the posterior probabilities of all hypotheses to adaptively allocate patients to treatment combinations. Simulation studies show that the proposed design substantially outperforms the conventional multi-arm balanced factorial trial design. The proposed design yields a significantly higher probability of selecting the best treatment while at the same time allocating substantially more patients to efficacious treatments. The proposed design is most appropriate for trials that combine multiple agents and screen for the efficacious combination to be investigated further. The proposed Bayesian adaptive phase II screening design substantially outperformed the conventional complete factorial design. Our design allocates more patients to better treatments while at the same time providing higher power to identify the best treatment at the end of the trial. Phase II trials are usually single-arm trials conducted to test the efficacy of experimental agents and to decide whether the agents are promising enough to be sent to phase III trials. Interim monitoring is employed to stop the trial early for futility, to avoid assigning an unacceptable number of patients to inferior treatments. We propose a Bayesian single-arm phase II design with continuous monitoring for estimating the response rate of the experimental drug. 
To address the issue of late-onset responses, we use a piecewise exponential model to estimate the hazard function of the time-to-response data and handle the missing responses using multiple imputation. We evaluate the operating characteristics of the proposed method through extensive simulation studies. We show that the proposed method reduces the total trial duration and yields desirable operating characteristics for different physician-specified lower bounds of the response rate under different true response rates.
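
The continuous-monitoring idea in this abstract can be illustrated with a minimal sketch. It assumes a binary response, a Beta(1, 1) prior, and a futility rule of the form "stop if P(p > p0 | data) falls below a threshold". The function names, the prior, and the numeric values are illustrative assumptions, not the design proposed in the dissertation, and the sketch ignores late-onset (missing) responses entirely.

```python
import random


def posterior_prob_exceeds(responses, n_patients, p0,
                           prior_a=1.0, prior_b=1.0,
                           draws=20000, seed=0):
    """Monte Carlo estimate of P(p > p0 | data) under a Beta-Binomial model.

    With a Beta(prior_a, prior_b) prior and `responses` successes among
    `n_patients`, the posterior of the response rate p is
    Beta(prior_a + responses, prior_b + n_patients - responses).
    """
    rng = random.Random(seed)
    a = prior_a + responses
    b = prior_b + (n_patients - responses)
    hits = sum(rng.betavariate(a, b) > p0 for _ in range(draws))
    return hits / draws


def should_stop_for_futility(responses, n_patients, p0, threshold=0.05):
    """Hypothetical rule: stop when the drug is unlikely to beat p0."""
    return posterior_prob_exceeds(responses, n_patients, p0) < threshold
```

Calling `should_stop_for_futility(0, 20, 0.3)` after every new outcome would flag a trial with no responses in 20 patients, whereas 10 responses in 20 patients would let the trial continue. A real design would calibrate the threshold by simulation.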

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Knots are the defects that most reduce the strength of structural-size timber, because they cause not only a material discontinuity but also a deviation of the surrounding fibers. In the 1980s, the flow-grain analogy was introduced as a method that adequately approximates these deviations. However, in its three-dimensional application, geometrical differences in the direction perpendicular to the longitudinal axis of the structural members were never considered, which prevented the numerical simulation of some of the main types of knots and reduced the precision achieved for those knots whose modeling was feasible. 
This paper proposes a model programmed in the parametric language of a finite element software package which, under a more general three-dimensional formulation, allows an automated study of the structural behavior of timber under the influence of the main types of knots, based only on the visible geometry of the knots and the position of the pith in the member. The model has been validated experimentally and simulates very accurately the mechanical behavior of beams subjected to four-point bending tests.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics applications. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain.
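
The loop that the abstract describes (sample from a probabilistic model, select promising solutions, re-estimate the model) can be sketched with the univariate marginal distribution algorithm (UMDA), one of the simplest EDA variants. The toy OneMax objective, the parameter values, and the function names below are illustrative assumptions and are not taken from the reviewed paper.

```python
import random


def onemax(bits):
    """Toy objective (an assumption for illustration): number of ones."""
    return sum(bits)


def umda(n_bits=20, pop_size=100, n_select=50, generations=30, seed=1):
    """Univariate marginal distribution algorithm on bit strings.

    The probabilistic model is a vector of independent Bernoulli
    parameters, one per variable, re-estimated each generation from
    the best `n_select` individuals.
    """
    rng = random.Random(seed)
    probs = [0.5] * n_bits
    best = None
    for _ in range(generations):
        # Sample a new population from the current model.
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        # Truncation selection: keep the most promising solutions.
        population.sort(key=onemax, reverse=True)
        selected = population[:n_select]
        if best is None or onemax(selected[0]) > onemax(best):
            best = selected[0]
        # Re-estimate the marginals; clamp them so that no variable
        # becomes permanently fixed at 0 or 1.
        for j in range(n_bits):
            freq = sum(ind[j] for ind in selected) / n_select
            probs[j] = min(max(freq, 1.0 / n_bits), 1.0 - 1.0 / n_bits)
    return best, probs
```

More elaborate EDAs differ mainly in the model-estimation step, replacing the independent marginals with models that capture dependencies between variables, such as Bayesian networks.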

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Pragmatism is the leading motivation of regularization. We can understand regularization as a modification of the maximum-likelihood estimator so that a reasonable answer can be given in an unstable or ill-posed situation. Typical examples arise when fitting parametric or non-parametric models with more parameters than data, or when estimating large covariance matrices. Regularization is also commonly used to improve the bias-variance tradeoff of an estimation. The definition of regularization is therefore quite general and, although the introduction of a penalty is probably the most popular type, it is just one of multiple forms of regularization. In this dissertation, we focus on applications of regularization for obtaining sparse or parsimonious representations, where only a subset of the inputs is used. A particular form of regularization, L1-regularization, plays a key role in reaching sparsity. Most of the contributions presented here revolve around L1-regularization, although other forms of regularization are explored (also pursuing sparsity in some sense). In addition to presenting a compact review of L1-regularization and its applications in statistics and machine learning, we devise methodology for regression, supervised classification and structure induction of graphical models. Within the regression paradigm, we focus on kernel smoothing learning, proposing techniques for kernel design that are suitable for high-dimensional settings and sparse regression functions. We also present an application of regularized regression techniques for modeling the response of biological neurons. The advances in supervised classification deal, on the one hand, with the application of regularization for obtaining a naïve Bayes classifier and, on the other hand, with a novel algorithm for brain-computer interface design that uses group regularization in an efficient manner. 
Finally, we present a heuristic for inducing structures of Gaussian Bayesian networks using L1-regularization as a filter.
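
Why L1-regularization yields sparse representations can be seen in one special case. Assuming an orthonormal design matrix (an assumption made here for illustration, not a setting stated in the dissertation), the minimizer of ½‖y − Xβ‖² + λ‖β‖₁ is obtained by soft-thresholding each ordinary least-squares coefficient, so every coefficient whose magnitude is at most λ becomes exactly zero. The function names below are hypothetical.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: shrink z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_orthonormal(ols_coefs, lam):
    """L1-penalized least-squares solution for an orthonormal design.

    `ols_coefs` are the unpenalized least-squares coefficients; the
    orthonormality of the design is an assumption of this sketch.
    """
    return [soft_threshold(b, lam) for b in ols_coefs]


# Small coefficients are set exactly to zero; large ones are shrunk.
print(lasso_orthonormal([3.0, -0.5, 0.2, -2.0], 1.0))
# prints [2.0, 0.0, 0.0, -1.0]
```

For general designs there is no closed form, but the same operator appears as the per-coordinate update in coordinate-descent solvers.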

Relevance:

100.00% 100.00%

Publisher:

Abstract:

"After Einstein: An Architecture for a Theory". In 1905, three articles were published in the journal "Annalen der Physik". They revolutionized the physical sciences and threw into crisis the established Newtonian concepts of Space and Time. The formulation of the Theory of Relativity by Albert Einstein called into question the absolute value of these concepts and prompted new reflections on them within physics. Could this revolution be extrapolated to the field of Architecture, where Space and Time play a leading role? It is necessary to understand the complexity of architecture and the countless variables involved in its definition. This PhD thesis therefore studies one specific aspect: how a paradigm (the Theory of Relativity) can intervene in and modify, or not, Architecture. It proposes to go back to the origin and unravel the moment of interaction between the Theory of Relativity and the Theory of Architecture, in order to determine whether the former influenced the theoretical avant-garde writings applied to Architecture. "After Einstein. An Architecture for a Theory" searches for the points of connection between the Theory of Relativity and the architectural avant-garde theory of the early twentieth century, their mutual influence and contamination, and the possible architectural results of this interaction, capable of defining new formal arguments for a new architectural language. 
The thesis is divided into four chapters. The first describes the geographical and chronological setting in which the Theory of Relativity was developed and its theoretical repercussions for art, arising from a new definition of Space linked to Time, as an event that takes place in a four-dimensional space; the indeterminacy of measurements of space and time; and the importance of understanding "matter" as "energy". The second chapter examines the avant-garde movements contemporary with the emergence of the Theory of Relativity. Cubism is shown as an artistic movement that occasionally draws on mathematics and geometry, under the influence of Henri Poincaré and non-Euclidean geometries. Futurism explores the advances of science from a certain distance, with a lack of scientific rigor, to extract the laws of its new plastic constructive idealism. Scientific language is present in concepts such as "simultaneity" (Boccioni), "expanding light in space" (Severini and Carrà), "four-dimensional space", "space-time", "light-air-force" and "matter and energy", which parallel the operational concepts of Einstein's theory. While it is not possible to attribute to the Theory of Relativity a leading role as a reference for artistic thought, in 1936, with the publication of the Dimensionist Manifesto, the new ideas of space-time followed by cubists and futurists were explicitly attributed to Einstein's theories. 
The third chapter describes how the Theory of Relativity became a source of inspiration for architectural theory. Structured in three subsections, it studies the main author who, after studying and interpreting the Theory of Relativity, contributed concepts and ideas from it to the theory of architecture (Van Doesburg); where the influences and points of contact took place (Lissitzky, Eggeling, Moholy-Nagy); and how they were disseminated through architecture (the Einsteinturm, by Mendelsohn) and specialized journals. The fourth chapter draws the conclusions of the thesis, which could well be summarized by Moholy-Nagy in his text "Vision in Motion" (1946): Since "space-time" can be a misleading term, it especially has to be emphasized that the space-time problems in the arts are not necessarily based upon Einstein's Theory of Relativity. This is not meant to discount the relevance of his theory to the arts. But artists and laymen seldom have the mathematical knowledge to visualize in scientific formulae the analogies to their own work. Einstein's terminology of "space-time" and "relativity" has been absorbed by our daily language.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Probabilistic modeling is the de�ning characteristic of estimation of distribution algorithms (EDAs) which determines their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. `1-regularization is a type of this technique with the appealing variable selection property which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented, and analyzed from di�erent aspects when used for optimization in a high-dimensional setting, where the population size of EDA has a logarithmic scale with respect to the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve signi�cantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random �eld model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, speci�cally models inspired from multi-dimensional Bayesian network classi�ers. It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objectivevariable and objective-objective relationships. 
An extensive experimental study shows the effectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely α-degree Pareto dominance, is introduced and its properties are analyzed. We show that the ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on ℓ1-regularization for multi-objective feature subset selection in classification, where six different measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small-medium dimensionality, when using two different Bayesian classifiers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.
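
The sparsity-inducing effect of ℓ1-regularization mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the thesis's estimator, which is not specified in this listing. It only applies the soft-thresholding operator (the proximal operator of the ℓ1 norm) to the off-diagonal entries of a covariance matrix, so that weak dependencies are set exactly to zero. The function names `soft_threshold` and `sparsify_covariance` and the penalty `lam` are assumptions made for illustration.

```python
def soft_threshold(value, lam):
    """Proximal operator of the l1 norm: shrink toward zero by lam."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def sparsify_covariance(cov, lam):
    """Soft-threshold the off-diagonal entries of a square matrix.

    Hypothetical helper: the diagonal (variances) is left untouched,
    while small covariances are set exactly to zero.
    """
    n = len(cov)
    return [
        [cov[i][j] if i == j else soft_threshold(cov[i][j], lam)
         for j in range(n)]
        for i in range(n)
    ]


# Illustrative covariance matrix for three variables (made-up values).
cov = [
    [1.00, 0.50, 0.05],
    [0.50, 1.00, -0.10],
    [0.05, -0.10, 1.00],
]
sparse_cov = sparsify_covariance(cov, lam=0.2)
for row in sparse_cov:
    print(row)
```

Only the strong dependency between the first two variables survives the thresholding. A thresholded matrix is not guaranteed to remain positive definite, so a practical EDA would need an estimator that enforces this before sampling.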

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Many existing engineering works model the statistical characteristics of the entities under study as normal distributions. These models are eventually used for decision making, which in practice requires defining the classification region corresponding to the desired confidence level. Surprisingly, however, a great number of computer vision works using multidimensional normal models leave confidence regions unspecified or fail to establish correct ones, due to misconceptions about the features of Gaussian functions or to wrong analogies with the unidimensional case. The resulting regions incur deviations that can be unacceptable in high-dimensional models. Here we provide a comprehensive derivation of the optimal confidence regions for multivariate normal distributions of arbitrary dimensionality. To this end, we first derive the condition for region optimality of general continuous multidimensional distributions, and then we apply it to the widespread case of the normal probability density function. The obtained results are used to analyze the confidence error incurred by previous works related to vision research, showing that deviations caused by wrong regions may become unacceptable as dimensionality increases. To support the theoretical analysis, a quantitative example in the context of moving object detection by means of background modeling is given.
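
The dimensionality effect described in the abstract above can be checked numerically. The following Python sketch is not the authors' derivation. It assumes the confidence region is the ellipsoid of constant Mahalanobis distance and uses the standard fact that the squared Mahalanobis distance of a d-dimensional normal vector follows a chi-square distribution with d degrees of freedom. The function names are assumptions made for illustration.

```python
import math


def chi2_cdf(x, d):
    """CDF of the chi-square distribution with d degrees of freedom.

    Starts from the closed forms for d = 1 and d = 2 and applies the
    recurrence F_{k+2}(x) = F_k(x) - (x/2)^(k/2) e^(-x/2) / Gamma(k/2 + 1).
    Adequate for moderate d; not tuned for very large d.
    """
    if x <= 0.0:
        return 0.0
    if d % 2 == 1:
        value, k = math.erf(math.sqrt(x / 2.0)), 1
    else:
        value, k = 1.0 - math.exp(-x / 2.0), 2
    while k < d:
        value -= (x / 2.0) ** (k / 2.0) * math.exp(-x / 2.0) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return value


def coverage(radius, d):
    """Probability mass inside the Mahalanobis ellipsoid of the given radius."""
    return chi2_cdf(radius * radius, d)


def radius_for_confidence(confidence, d):
    """Mahalanobis radius whose ellipsoid holds the requested probability."""
    lo, hi = 0.0, 1.0
    while coverage(hi, d) < confidence:
        hi *= 2.0
    for _ in range(100):  # plain bisection
        mid = 0.5 * (lo + hi)
        if coverage(mid, d) < confidence:
            lo = mid
        else:
            hi = mid
    return hi


for d in (1, 2, 3, 10):
    print(d, round(coverage(3.0, d), 4), round(radius_for_confidence(0.9973, d), 3))
```

With the familiar one-dimensional "3 sigma" radius, coverage falls from about 99.7 % in one dimension to about 97.1 % in three dimensions and to less than one half in ten dimensions, which is why the radius has to be recomputed for each dimensionality.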

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This work arose in response to the increasing need of biologists and doctors for tools for the visual analysis of data. When dealing with multidimensional data, such as medical data, applying traditional data mining techniques can be a tedious and complex task, even for some medical experts. Therefore, it is necessary to develop useful visualization techniques that can complement the expert's criteria and, at the same time, visually stimulate and ease the process of obtaining knowledge from a dataset. Thus, the process of interpreting and understanding the data can be greatly enriched. Multidimensionality is inherent to any medical data, and obtaining a clinically useful outcome requires a time-consuming effort. Unfortunately, neither clinicians nor biologists are trained in managing more than four dimensions. Specifically, we aimed to design a 3D visual interface for gene profile analysis that is easy to use for both medical and biology experts. To this end, a new analysis method is proposed: MedVir. This is a simple and intuitive analysis mechanism based on the visualization of any multidimensional medical data in a three-dimensional space, which allows experts to interact with, collaborate on, and enrich this representation. In other words, MedVir performs a powerful reduction in data dimensionality in order to represent the original information in a three-dimensional environment. The experts can interact with the data and draw conclusions visually and quickly.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Territory or zone design processes entail partitioning a geographic space, organized as a set of areal units, into different regions or zones according to a specific set of criteria that are dependent on the application context. In most cases, the aim is to create zones of approximately equal sizes (zones with equal numbers of inhabitants, same average sales, etc.). However, some of the new applications that have emerged, particularly in the context of sustainable development policies, are aimed at defining zones of a predetermined, though not necessarily similar, size. In addition, the zones should be built around a given set of seeds. This type of partitioning has not been sufficiently researched; therefore, there are no known approaches for automated zone delimitation. This study proposes a new method based on a discrete version of the adaptive additively weighted Voronoi diagram that makes it possible to partition a two-dimensional space into zones of specific sizes, taking both the position and the weight of each seed into account. The method consists of repeatedly solving a traditional additively weighted Voronoi diagram, so that each seed's weight is updated at every iteration. The zones are geographically connected using a metric based on the shortest path. Tests conducted on the extensive farming system of three municipalities in Castile-La Mancha (Spain) have established that the proposed heuristic procedure is valid for solving this type of partitioning problem. Nevertheless, these tests confirmed that the given seed position determines the spatial configuration the method must solve and this may have a great impact on the resulting partition.
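
The iterative scheme described in the abstract above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' procedure. It uses Euclidean distance on a regular grid instead of the shortest-path metric, so zone connectivity is not enforced. The weight update rule, the `step` parameter, and all function names are hypothetical.

```python
import math


def assign(cells, seeds, weights):
    """Additively weighted Voronoi assignment on discrete cells.

    Each cell goes to the seed minimizing distance(cell, seed) - weight(seed);
    ties go to the lowest seed index.
    """
    labels = []
    for cx, cy in cells:
        costs = [math.hypot(cx - sx, cy - sy) - w
                 for (sx, sy), w in zip(seeds, weights)]
        labels.append(costs.index(min(costs)))
    return labels


def zone_sizes(labels, n_seeds):
    """Number of cells assigned to each seed."""
    sizes = [0] * n_seeds
    for k in labels:
        sizes[k] += 1
    return sizes


def fit_zones(cells, seeds, targets, step=0.05, iterations=200):
    """Re-solve the weighted diagram, updating seed weights each iteration.

    Assumed update rule: a zone that is too small gets a larger weight,
    a zone that is too large gets a smaller one. Returns the best
    (max size error, labels, weights) seen, since convergence is not guaranteed.
    """
    weights = [0.0] * len(seeds)
    best = None
    for _ in range(iterations):
        labels = assign(cells, seeds, weights)
        sizes = zone_sizes(labels, len(seeds))
        error = max(abs(s - t) for s, t in zip(sizes, targets))
        if best is None or error < best[0]:
            best = (error, labels, list(weights))
        if error == 0:
            break
        weights = [w + step * (t - s) for w, s, t in zip(weights, sizes, targets)]
    return best


# Illustrative 10 x 10 grid with two seeds and unequal target sizes.
cells = [(x, y) for x in range(10) for y in range(10)]
seeds = [(2, 2), (7, 7)]
targets = [70, 30]
error, labels, weights = fit_zones(cells, seeds, targets)
print(error, zone_sizes(labels, len(seeds)), weights)
```

Because the seed positions are fixed, only the weights can change the partition, which is consistent with the abstract's remark that the given seed positions strongly influence the result.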