938 results for inverse probability weights
Abstract:
Graphical techniques for modeling the dependencies of random variables have been explored in a variety of different areas including statistics, statistical physics, artificial intelligence, speech recognition, image processing, and genetics. Formalisms for manipulating these models have been developed relatively independently in these research communities. In this paper we explore hidden Markov models (HMMs) and related structures within the general framework of probabilistic independence networks (PINs). The paper contains a self-contained review of the basic principles of PINs. It is shown that the well-known forward-backward (F-B) and Viterbi algorithms for HMMs are special cases of more general inference algorithms for arbitrary PINs. Furthermore, the existence of inference and estimation algorithms for more general graphical models provides a set of analysis tools for HMM practitioners who wish to explore a richer class of HMM structures. Examples of relatively complex models to handle sensor fusion and coarticulation in speech recognition are introduced and treated within the graphical model framework to illustrate the advantages of the general approach.
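As a rough illustration of the two HMM algorithms this abstract names, the sketch below runs the forward recursion (a sum-product pass along a chain) and the Viterbi recursion (the max-product counterpart) on a toy two-state model. The model parameters `INIT`, `TRANS`, `EMIT` and the observation sequence `OBS` are hypothetical values chosen for the example, not taken from the paper, and the general PIN inference algorithms are not reproduced here.

```python
# Toy HMM; all numbers below are illustrative assumptions.
INIT = [0.6, 0.4]                           # P(state at t=0)
TRANS = [[0.7, 0.3], [0.4, 0.6]]            # TRANS[r][s] = P(s at t+1 | r at t)
EMIT = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]   # EMIT[s][o] = P(symbol o | state s)
OBS = [0, 1, 2]                             # observed symbol indices


def forward(init, trans, emit, obs):
    """Return P(obs) using the forward recursion (sums over predecessors)."""
    n = len(init)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            emit[s][o] * sum(alpha[r] * trans[r][s] for r in range(n))
            for s in range(n)
        ]
    return sum(alpha)


def viterbi(init, trans, emit, obs):
    """Return the most probable state path (max replaces sum)."""
    n = len(init)
    delta = [init[s] * emit[s][obs[0]] for s in range(n)]
    backpointers = []
    for o in obs[1:]:
        # Best predecessor of each state s at this step.
        prev = [max(range(n), key=lambda r: delta[r] * trans[r][s])
                for s in range(n)]
        delta = [delta[prev[s]] * trans[prev[s]][s] * emit[s][o]
                 for s in range(n)]
        backpointers.append(prev)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: delta[s])
    path = [state]
    for prev in reversed(backpointers):
        state = prev[state]
        path.append(state)
    return path[::-1]


likelihood = forward(INIT, TRANS, EMIT, OBS)
best_path = viterbi(INIT, TRANS, EMIT, OBS)
```

The two functions differ only in whether predecessors are summed or maximized, which is the kind of structural similarity the general graphical-model view makes explicit.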
Abstract:
Synechocystis PCC 6803 is a photosynthetic bacterium that has the potential to make bioproducts from carbon dioxide and light. Biochemical production from photosynthetic organisms is attractive because it replaces the typical bioprocessing steps of crop growth, milling, and fermentation with a one-step photosynthetic process. However, low yields and slow growth rates limit the economic potential of such endeavors. Rational metabolic engineering methods are hindered by limited cellular knowledge and inadequate models of Synechocystis. Instead, inverse metabolic engineering, a scheme based on combinatorial gene searches that does not require detailed cellular models but can exploit sequence data and existing molecular biological techniques, was used to find genes that (1) improve the production of the biopolymer poly-3-hydroxybutyrate (PHB) and (2) increase the growth rate. A fluorescence-activated cell sorting assay was developed to screen for high-PHB-producing clones. Separately, serial sub-culturing was used to select clones with an improved growth rate. Novel gene knock-outs were identified that increase PHB production, and others that increase the specific growth rate. These improvements make this system more attractive for industrial use and demonstrate the power of inverse metabolic engineering to identify novel phenotype-associated genes in poorly understood systems.
Abstract:
Compositional data analysis motivated the introduction of a complete Euclidean structure in the simplex of D parts. This was based on the early work of J. Aitchison (1986) and completed recently when the Aitchison distance in the simplex was associated with an inner product and orthonormal bases were identified (Aitchison and others, 2002; Egozcue and others, 2003). A partition of the support of a random variable generates a composition by assigning the probability of each interval to a part of the composition. One can imagine that the partition can be refined and the probability density would represent a kind of continuous composition of probabilities in a simplex of infinitely many parts. This intuitive idea would lead to a Hilbert space of probability densities by generalizing the Aitchison geometry for compositions in the simplex to the set of probability densities.
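The finite-dimensional starting point of this abstract can be sketched in a few lines: a partition of the support turns a distribution into a composition, and the Aitchison distance between two compositions is the Euclidean distance between their centered log-ratio (clr) coordinates. The helper names and the exponential distribution used as input are assumptions made for illustration; the Hilbert-space generalization itself is not implemented.

```python
import math


def closure(parts):
    """Rescale positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def composition_from_cdf(cdf, edges):
    """Assign the probability of each interval [edges[i], edges[i+1]) to a part."""
    return closure([cdf(b) - cdf(a) for a, b in zip(edges, edges[1:])])


def clr(comp):
    """Centered log-ratio transform: log parts minus their mean log."""
    logs = [math.log(p) for p in comp]
    mean = sum(logs) / len(logs)
    return [value - mean for value in logs]


def aitchison_distance(x, y):
    """Euclidean distance between the clr coordinates of x and y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))


# Hypothetical example: a unit-rate exponential split into three intervals.
def exp_cdf(x):
    return 1.0 - math.exp(-x)


comp = composition_from_cdf(exp_cdf, [0.0, 1.0, 2.0, math.inf])
uniform = [1.0 / 3.0] * 3
distance_to_uniform = aitchison_distance(comp, uniform)
```

Because clr subtracts the mean log, the distance is unchanged when either argument is rescaled, so unnormalized parts can be compared directly.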
Abstract:
The Aitchison vector space structure for the simplex is generalized to a Hilbert space structure A2(P) for distributions and likelihoods on arbitrary spaces. Central notions of statistics, such as information or likelihood, can be identified in the algebraic structure of A2(P) with their corresponding notions in compositional data analysis, such as the Aitchison distance or the centered log-ratio transform. In this way, very elaborate aspects of mathematical statistics can be understood easily in the light of a simple vector space structure and of compositional data analysis. For example, combinations of statistical information, such as Bayesian updating or the combination of likelihood and robust M-estimation functions, are simple additions/perturbations in A2(Pprior). Weighting observations corresponds to a weighted addition of the corresponding evidence. Likelihood-based statistics for general exponential families turn out to have a particularly easy interpretation in terms of A2(P). Regular exponential families form finite-dimensional linear subspaces of A2(P), and they correspond to finite-dimensional subspaces formed by their posterior in the dual information space A2(Pprior). The Aitchison norm can be identified with mean Fisher information. The closing constant itself is identified with a generalization of the cumulant function and shown to be the Kullback-Leibler directed information. Fisher information is the local geometry of the manifold induced by the A2(P) derivative of the Kullback-Leibler information, and the space A2(P) can therefore be seen as the tangential geometry of statistical inference at the distribution P. The discussion of A2(P)-valued random variables, such as estimation functions or likelihoods, gives a further interpretation of Fisher information as the expected squared norm of evidence and a scale-free understanding of unbiased reasoning.
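The claim that Bayesian updating is an addition can be checked in the simplest setting, a finite parameter set, where the perturbation operation is a componentwise product followed by closure. The sketch below is only that finite special case, with hypothetical prior and likelihood values; it does not construct A2(P) on an arbitrary space.

```python
import math


def closure(parts):
    """Rescale positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def perturb(x, y):
    """Aitchison perturbation: componentwise product, then closure."""
    return closure([a * b for a, b in zip(x, y)])


def power(x, weight):
    """Aitchison powering: raise each part to `weight`, then closure."""
    return closure([a ** weight for a in x])


def clr(comp):
    """Centered log-ratio transform."""
    logs = [math.log(p) for p in comp]
    mean = sum(logs) / len(logs)
    return [value - mean for value in logs]


# Hypothetical prior over three hypotheses and likelihood of one observation.
prior = [0.5, 0.3, 0.2]
likelihood = [0.1, 0.4, 0.5]

# Bayes' rule is exactly a perturbation of the prior by the likelihood.
posterior = perturb(prior, likelihood)

# In clr coordinates the update is a plain vector addition.
clr_sum = [a + b for a, b in zip(clr(prior), clr(likelihood))]

# Weighting an observation by 2 scales its clr coordinates (its evidence) by 2.
double_weight = power(likelihood, 2.0)
```

In this reading, `clr(posterior)` equals `clr_sum` up to rounding, and powering scales the clr vector, which mirrors the abstract's statement about weighted addition of evidence.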
Abstract:
The identification of compositional changes in fumarolic gases of active and quiescent volcanoes is one of the most important targets in monitoring programs. From a general point of view, many systematic (often cyclic) and random processes control the chemistry of gas discharges, making it difficult to produce convincing mathematical-statistical models. Changes in the chemical composition of volcanic gases sampled at Vulcano Island (Aeolian Arc, Sicily, Italy) from eight different fumaroles located in the northern sector of the summit crater (La Fossa) have been analysed by considering their dependence on time in the period 2000-2007. Each intermediate chemical composition has been considered as potentially derived from the contribution of the two temporal extremes, represented by the 2000 and 2007 samples respectively, by using inverse modelling methodologies for compositional data. Data pertaining to fumaroles F5 and F27, located on the rim and in the inner part of La Fossa crater, respectively, have been used to achieve the proposed aim. The statistical approach has allowed us to highlight the presence of random and non-random fluctuations, features useful for understanding how the volcanic system works, opening new perspectives in sampling strategies and in the evaluation of the natural risk related to a quiescent volcano.
Abstract:
In this paper, we define a new scheme to develop and evaluate protection strategies for building reliable GMPLS networks. This is based on what we have called the network protection degree (NPD). The NPD consists of an a priori evaluation, the failure sensibility degree (FSD), which provides the failure probability, and an a posteriori evaluation, the failure impact degree (FID), which determines the impact on the network in case of failure, in terms of packet loss and recovery time. With these components formulated mathematically, experimental results demonstrate the benefits of using the NPD to enhance some current QoS routing algorithms in order to offer a certain degree of protection.
Abstract:
In networks with small buffers, such as optical packet switching (OPS) based networks, the convolution approach (CA) is presented as one of the most accurate methods used for connection admission control. Admission control and resource management have been addressed in other works oriented to bursty traffic and ATM. This paper focuses on heterogeneous traffic in OPS-based networks. For heterogeneous traffic and bufferless networks, the enhanced convolution approach (ECA) is a good solution. However, both methods (CA and ECA) present a high computational cost for a high number of connections. Two new mechanisms (UMCA and ISCA) based on the Monte Carlo method are proposed to overcome this drawback. Simulation results show that our proposals achieve a lower computational cost than the enhanced convolution approach, with a small stochastic error in the probability estimation.
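The trade-off described here, an exact convolution that grows costly versus a sampled estimate with a small stochastic error, can be illustrated with a generic toy model. The sketch assumes independent on/off connections, each with a hypothetical activity probability and integer rate, and uses plain Monte Carlo sampling; it is not the UMCA or ISCA mechanism, whose details the abstract does not give.

```python
import random

# Hypothetical connections: (probability of being active, rate in bandwidth units).
CONNECTIONS = [(0.1, 2), (0.1, 2), (0.3, 1), (0.5, 3)]
CAPACITY = 4  # assumed link capacity in the same units


def convolve_loads(connections):
    """Exact distribution of the aggregate load, built one connection at a time."""
    dist = {0: 1.0}
    for p_on, rate in connections:
        nxt = {}
        for load, prob in dist.items():
            # Connection idle: load unchanged.
            nxt[load] = nxt.get(load, 0.0) + prob * (1.0 - p_on)
            # Connection active: load grows by its rate.
            nxt[load + rate] = nxt.get(load + rate, 0.0) + prob * p_on
        dist = nxt
    return dist


def overflow_exact(connections, capacity):
    """P(aggregate load > capacity) from the convolved distribution."""
    dist = convolve_loads(connections)
    return sum(prob for load, prob in dist.items() if load > capacity)


def overflow_monte_carlo(connections, capacity, samples, seed=0):
    """Estimate the same probability by sampling connection states."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        load = sum(rate for p_on, rate in connections if rng.random() < p_on)
        if load > capacity:
            hits += 1
    return hits / samples


exact = overflow_exact(CONNECTIONS, CAPACITY)
estimate = overflow_monte_carlo(CONNECTIONS, CAPACITY, samples=20000)
```

The exact distribution can acquire more support points with every added connection, while the sampling cost depends on the number of samples and connections, at the price of an error that shrinks roughly with the square root of the sample count.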
Abstract:
Exam questions and solutions in PDF
Abstract:
Exam questions and solutions in LaTeX. Diagrams for the questions are all together in the support.zip file, as .eps files
Abstract:
Exam questions and solutions in PDF
Abstract:
Exam questions and solutions in LaTeX. Diagrams for the questions are all together in the support.zip file, as .eps files
Abstract:
This work implements a methodology for including higher-order moments in portfolio selection, using the Generalized Hyperbolic Distribution, and then carries out a comparative analysis against the Markowitz model.
Abstract:
The incidence and prevalence of cardiovascular disease and cardiovascular risk (CVR) increase with age as a consequence of poor control of modifiable risk factors, for example sedentary behavior, which is observed mainly in office workers. The aim of this study was to identify the factors associated with increased CVR among workers at a state-owned company in Bogotá, Colombia, in 2013, through a descriptive cross-sectional study based on a database supplied by the company with information on 272 workers. Sociodemographic variables, occupational profile, risk factors, medical history, and metabolic measurements were included. The data were studied through univariate, bivariate, and multivariate binary logistic regression analysis. All of the employees (100%) have a permanent contract, and women predominate. CVR, present in 11.8% of the population, was found to be associated mainly with type 2 diabetes mellitus (adjusted OR 9.97; 95% CI 2.14-14.96, p=0.019), altered body mass index (adjusted OR 5.67; 95% CI 4.48-9.19, p=0.026), and systolic arterial hypertension (adjusted OR 3.44; 95% CI 2.21-4.01, p=0.037). There was also an inverse relationship with the Framingham score, where lower scores were associated with lower CVR (adjusted OR 0.04; 95% CI 0.02-0.71, p=0.029), once the model was adjusted for age, gender, and length of service at the company. No statistically significant relationship was found between CVR and job position or length of service.
It is concluded that in this working population, regardless of age, length of service at the company, and gender, the classic risk factors for CVR are present, and promotion and prevention measures should therefore be initiated to reduce the probability that the CVR found translates into a cardiovascular event, thereby optimizing productivity at this company.
Abstract:
Abstract in Portuguese, Spanish, and French. Abstract based on that of the publication
Abstract:
This thesis presents a new method for the inverse design of reflectors. We focus on three main topics: the use of real, complex light sources; the definition of a fast algorithm for computing the reflector's illumination; and the definition of an optimization algorithm to find the desired reflector more efficiently. The light sources are represented by near-field models, which are compressed with very small error, even for light sources with millions of rays and for objects to be illuminated that are very close. We then propose a fast method for obtaining the illumination distribution of a reflector and comparing it with the desired illumination, which runs entirely on the GPU. Finally, we propose a new global optimization method that finds the solution in fewer steps than many other classical optimization methods while avoiding local minima.