980 results for multilevel statistical modeling
Abstract:
The Upper Blue Nile River Basin (UBNRB), located in the western part of Ethiopia between 7° 45’ and 12° 45’ N and 34° 05’ and 39° 45’ E, has a total area of 174,962 km². More than 80% of the population in the basin is engaged in agricultural activities. Because of the particularly dry climate in the basin, as in most other regions of Ethiopia, agricultural productivity depends to a very large extent on the occurrence of the seasonal rains. This makes agriculture highly vulnerable to the potential climate hazards that are expected to afflict Africa as a whole and Ethiopia in particular. To analyze the possible impacts of future climate change on the water resources of the UBNRB, the first part of the thesis carries out climate projections for precipitation and for minimum and maximum temperatures in the basin, using downscaled predictors from three GCMs (ECHAM5, GFDL21 and CSIRO-MK3) under the SRES scenarios A1B and A2. The two statistical downscaling models used are SDSM and LARS-WG: SDSM is used to downscale ECHAM5 predictors alone, while LARS-WG is applied both in mono-model mode with predictors from ECHAM5 and in multi-model mode with combined predictors from ECHAM5, GFDL21 and CSIRO-MK3. For the calibration/validation of the downscaling models, observed as well as NCEP climate data from the 1970-2000 reference period are used. The future projections are made for two time periods: 2046-2065 (2050s) and 2081-2100 (2090s). For the 2050s, the downscaled climate predictions indicate a rise of 0.6°C to 2.7°C in the seasonal maximum temperatures (Tmax) and of 0.5°C to 2.44°C in the minimum temperatures (Tmin). Similarly, during the 2090s the seasonal Tmax increases by 0.9°C to 4.63°C and Tmin by 1°C to 4.6°C, with increases that are generally higher for the A2 than for the A1B scenario. For most sub-basins of the UBNRB, the predicted changes in Tmin are larger than those in Tmax.
Meanwhile, for precipitation, both downscaling tools predict large changes which, depending on the GCM employed, are such that the spring and summer seasons will see changes of between -36% and +1% (predominantly decreases) and the autumn and winter seasons changes of between -8% and +126% (predominantly increases) for the two future time periods, regardless of the SRES scenario used. In the second part of the thesis the semi-distributed, physically based hydrologic model SWAT (Soil and Water Assessment Tool) is used to evaluate the impacts of the above-predicted future climate change on the hydrology and water resources of the UBNRB. The downscaled future predictors are used as input to the SWAT model to predict the streamflow of the Upper Blue Nile as well as other relevant water resources parameters in the basin. Calibration and validation of the streamflow model is again done on 1970-2000 measured discharge at the outlet gage station Eldiem, with the most sensitive of the numerous “tuneable” calibration parameters in SWAT selected by means of a sophisticated sensitivity analysis. As a result, a good calibration/validation model performance with a high NSE coefficient of 0.89 is obtained. The future simulations of streamflow in the basin, using both SDSM- and LARS-WG-downscaled output in SWAT, reveal a decline of 10% to 61% in the future Blue Nile streamflow. As expected, these adverse effects on future UBNRB water availability are more pronounced for the 2090s than for the 2050s, regardless of the SRES scenario.
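The NSE coefficient quoted above is the Nash-Sutcliffe efficiency, a standard goodness-of-fit score for simulated versus observed discharge. A minimal sketch of how it is computed follows; the function name and the toy discharge values are illustrative assumptions and are not taken from the thesis.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of the observations.

    1.0 means a perfect fit; 0.0 means the model is no better than
    always predicting the observed mean; negative values are worse.
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observed series is constant; NSE is undefined")
    return 1.0 - sse / spread


# Hypothetical discharge values, for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
perfect = nash_sutcliffe(obs, obs)            # simulation equals observation
baseline = nash_sutcliffe(obs, [25.0] * 4)    # simulation equals observed mean
```

An NSE of 0.89, as reported for the calibrated SWAT model, means the squared simulation error is 11% of the variance of the observed discharge.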
Abstract:
Modeling and predicting co-occurrences of events is a fundamental problem of unsupervised learning. In this contribution we develop a statistical framework for analyzing co-occurrence data in a general setting where elementary observations are joint occurrences of pairs of abstract objects from two finite sets. The main challenge for statistical models in this context is to overcome the inherent data sparseness and to estimate the probabilities for pairs which were rarely observed or even unobserved in a given sample set. Moreover, it is often of considerable interest to extract grouping structure or to find a hierarchical data organization. A novel family of mixture models is proposed which explain the observed data by a finite number of shared aspects or clusters. This provides a common framework for statistical inference and structure discovery and also includes several recently proposed models as special cases. Adopting the maximum likelihood principle, EM algorithms are derived to fit the model parameters. We develop improved versions of EM which largely avoid overfitting problems and overcome the inherent locality of EM-based optimization. Among the broad variety of possible applications, e.g., in information retrieval, natural language processing, data mining, and computer vision, we have chosen document retrieval, the statistical analysis of noun/adjective co-occurrence, and the unsupervised segmentation of textured images to test and evaluate the proposed algorithms.
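To make the idea of fitting a mixture over shared aspects concrete, here is a minimal sketch of plain maximum-likelihood EM for a symmetric aspect model, P(x, y) = Σ_a P(a) P(x|a) P(y|a), on a small count matrix. It is not the improved EM variants the abstract refers to, and the function name, the parameterization, and the toy counts are all assumptions made for illustration.

```python
import math
import random


def fit_aspect_model(counts, n_aspects, n_iter=50, seed=0):
    """Plain EM for P(x, y) = sum_a P(a) P(x|a) P(y|a) on a count matrix."""
    rng = random.Random(seed)
    n_x, n_y = len(counts), len(counts[0])
    total = sum(sum(row) for row in counts)

    def random_dist(n):
        weights = [rng.random() + 0.1 for _ in range(n)]
        norm = sum(weights)
        return [w / norm for w in weights]

    p_a = [1.0 / n_aspects] * n_aspects
    p_x_a = [random_dist(n_x) for _ in range(n_aspects)]
    p_y_a = [random_dist(n_y) for _ in range(n_aspects)]
    history = []  # log-likelihood before each update

    for _ in range(n_iter):
        acc_a = [0.0] * n_aspects
        acc_x = [[0.0] * n_x for _ in range(n_aspects)]
        acc_y = [[0.0] * n_y for _ in range(n_aspects)]
        loglik = 0.0
        # E-step: distribute each observed count over the aspects.
        for i in range(n_x):
            for j in range(n_y):
                n = counts[i][j]
                if n == 0:
                    continue
                joint = [p_a[a] * p_x_a[a][i] * p_y_a[a][j]
                         for a in range(n_aspects)]
                p_xy = sum(joint)
                loglik += n * math.log(p_xy)
                for a in range(n_aspects):
                    resp = n * joint[a] / p_xy
                    acc_a[a] += resp
                    acc_x[a][i] += resp
                    acc_y[a][j] += resp
        history.append(loglik)
        # M-step: renormalize the expected counts.
        for a in range(n_aspects):
            if acc_a[a] > 0:
                p_x_a[a] = [v / acc_a[a] for v in acc_x[a]]
                p_y_a[a] = [v / acc_a[a] for v in acc_y[a]]
            p_a[a] = acc_a[a] / total
    return p_a, p_x_a, p_y_a, history


# Toy co-occurrence counts with two rough blocks (illustrative only).
toy_counts = [[5, 4, 0, 0], [6, 5, 0, 1], [0, 0, 7, 6], [0, 1, 5, 8]]
p_a, p_x_a, p_y_a, history = fit_aspect_model(toy_counts, n_aspects=2)
```

Because the fitted P(x, y) is a smooth mixture, pairs that were never observed still receive non-zero probability, which is how this family of models addresses sparseness. Plain EM as sketched here only finds a local optimum that depends on the random initialization.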
Abstract:
We present a new approach to model and classify breast parenchymal tissue. Given a mammogram, we first discover the distribution of the different tissue densities in an unsupervised manner, and then use this tissue distribution to perform the classification. We achieve this using a classifier based on local descriptors and probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature. We studied the influence of different descriptors, such as texture and SIFT features, at the classification stage, showing that textons outperform SIFT in all cases. Moreover, we demonstrate that pLSA automatically extracts meaningful latent aspects, generating a compact tissue representation based on their densities that is useful for mammogram classification. We show the results of tissue classification over the MIAS and DDSM datasets. We compare our method with approaches that classified these same datasets, and our proposal shows better performance.
Abstract:
This paper studies the mathematics and language results of 32,000 students from the city of Bogotá in the 2008 Saber 11 test. The analysis recognizes that individuals are nested within neighborhoods and schools, but that not all individuals from the same neighborhood attend the same school, and vice versa. To model this data structure, several econometric models are used, including a multilevel hierarchical regression with crossed effects. Our central objective is to identify to what extent, and which, neighborhood and school conditions are correlated with the educational outcomes of the target population, and which characteristics of neighborhoods and schools are most strongly associated with test results. We use data from the Saber 11 test, the C600 school census, the 2005 population census, and the Bogotá Metropolitan Police. Our estimates show that both the neighborhood and the school are correlated with test results, but the school effect appears to be much stronger than the neighborhood effect. The school characteristics most associated with test results are teacher education, the school shift (jornada), the tuition fee, and the socioeconomic context of the school. The neighborhood characteristics most associated with test results are the presence of universitarios (university students or graduates) in the UPZ, a cluster of high levels of education, and the level of crime in the neighborhood, which is negatively correlated. These results were obtained while controlling for family and personal characteristics.
Abstract:
This investigation deals with the question of when a particular population can be considered to be disease-free. The motivation is the case of BSE, where specific birth cohorts may present distinct disease-free subpopulations. The specific objective is to develop a statistical approach suitable for documenting freedom from disease, in particular freedom from BSE in birth cohorts. The approach is based upon a geometric waiting time distribution for the occurrence of positive surveillance results and formalizes the relationship between design prevalence, cumulative sample size and statistical power. The simple geometric waiting time model is further modified to account for the diagnostic sensitivity and specificity associated with the detection of disease. This is exemplified for BSE using two different models for the diagnostic sensitivity. The model is furthermore modified in such a way that a set of different values for the design prevalence in the surveillance streams can be accommodated (prevalence heterogeneity), and a general expression for the power function is developed. For illustration, numerical results for BSE suggest that currently (data status September 2004) a birth cohort of Danish cattle born after March 1999 is free from BSE with probability (power) of 0.8746 or 0.8509, depending on the choice of a model for the diagnostic sensitivity.
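The link between design prevalence, cumulative sample size and power can be illustrated with the textbook geometric waiting-time argument. The sketch below is a simplification rather than the paper's model: it assumes perfect specificity, a constant diagnostic sensitivity, and independent samples, and all function names and numbers are hypothetical.

```python
import math


def detection_power(design_prevalence, n_samples, sensitivity=1.0):
    """P(at least one positive in n samples) if disease is present
    at the design prevalence: 1 - (1 - p * Se)^n."""
    q = design_prevalence * sensitivity  # per-sample detection probability
    return 1.0 - (1.0 - q) ** n_samples


def power_heterogeneous(streams, sensitivity=1.0):
    """Same idea with several surveillance streams, each given as a
    (design_prevalence, n_samples) pair."""
    miss_all = 1.0
    for prevalence, n_samples in streams:
        miss_all *= (1.0 - prevalence * sensitivity) ** n_samples
    return 1.0 - miss_all


def required_samples(design_prevalence, target_power, sensitivity=1.0):
    """Smallest n whose power reaches the target."""
    q = design_prevalence * sensitivity
    return math.ceil(math.log(1.0 - target_power) / math.log(1.0 - q))


# Hypothetical inputs, for illustration only.
n_needed = required_samples(design_prevalence=0.001, target_power=0.95)
power_at_n = detection_power(0.001, n_needed)
```

If all samples test negative, the power computed this way is the confidence with which prevalence at or above the design value can be ruled out. Lower sensitivity or a lower design prevalence both increase the number of negative samples required.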
Abstract:
Covariation in the structural composition of the gut microbiome and the spectroscopically derived metabolic phenotype (metabotype) of a rodent model for obesity were investigated using a range of multivariate statistical tools. Urine and plasma samples from three strains of 10-week-old male Zucker rats (obese (fa/fa, n = 8), lean (fa/-, n = 8) and lean (-/-, n = 8)) were characterized via high-resolution ¹H NMR spectroscopy, and in parallel, the fecal microbial composition was investigated using fluorescence in situ hybridization (FISH) and denaturing gradient gel electrophoresis (DGGE) methods. All three Zucker strains had different relative abundances of the dominant members of their intestinal microbiota (FISH), with the novel observation of a Halomonas and a Sphingomonas species being present in the (fa/fa) obese strain on the basis of DGGE data. The two functionally and phenotypically normal Zucker strains (fa/- and -/-) were readily distinguished from the (fa/fa) obese rats on the basis of their metabotypes, with relatively lower urinary hippurate and creatinine, relatively higher levels of urinary isoleucine, leucine and acetate, and higher plasma LDL and VLDL levels typifying the (fa/fa) obese strain. Collectively, these data suggest a conditional host genetic involvement in selection of the microbial species in each host strain, and that both lean and obese animals could have specific metabolic phenotypes that are linked to their individual microbiomes.
Abstract:
The Boltzmann equation in the presence of boundary and initial conditions, which describes the general case of carrier transport in microelectronic devices, is analysed in terms of Monte Carlo theory. The classical Ensemble Monte Carlo algorithm, which was devised from merely phenomenological considerations of the initial and boundary carrier contributions, is now derived in a formal way. The approach makes it possible to suggest a set of event-biasing algorithms for statistical enhancement as an alternative to the population control technique, which is virtually the only algorithm currently used in particle simulators. The scheme of the self-consistent coupling of the Boltzmann and Poisson equations is considered for the case of weighted particles. It is shown that particles survive the successive iteration steps.
Abstract:
To understand the resilience of aquatic ecosystems to environmental change, it is important to determine how multiple, related environmental factors, such as near-surface air temperature and river flow, will change during the next century. This study develops a novel methodology that combines statistical downscaling and fish species distribution modeling to enhance the understanding of how global climate changes (modeled by global climate models at coarse resolution) may affect local riverine fish diversity. The novelty of this work is the downscaling framework developed to provide suitable future projections of fish habitat descriptors, focusing particularly on the hydrology, which has rarely been considered in previous studies. The proposed modeling framework was developed and tested in a major European system, the Adour-Garonne river basin (SW France, 116,000 km²), which covers distinct hydrological and thermal regions from the Pyrenees to the Atlantic coast. The simulations suggest that, by 2100, the mean annual stream flow is projected to decrease by approximately 15% and temperature to increase by approximately 1.2 °C, on average. As a consequence, the majority of cool- and warm-water fish species are projected to expand their geographical range within the basin, while the few cold-water species will experience a reduction in their distribution. The limitations and potential benefits of the proposed modeling approach are discussed. Copyright © 2012 Elsevier B.V. All rights reserved.
Abstract:
An incidence matrix analysis is used to model a three-dimensional network consisting of resistive and capacitive elements distributed across several interconnected layers. A systematic methodology for deriving a descriptor representation of the network with random allocation of the resistors and capacitors is proposed. Using a transformation of the descriptor representation into standard state-space form, amplitude and phase admittance responses of three-dimensional random RC networks are obtained. Such networks display an emergent behavior with a characteristic Jonscher-like response over a wide range of frequencies. A model approximation study of these networks is performed to infer the admittance response using integral and fractional order models. It was found that a fractional order model with only seven parameters can accurately describe the responses of networks composed of more than 70 nodes and 200 branches with 100 resistors and 100 capacitors. The proposed analysis can be used to model charge migration in amorphous materials, which may be associated with specific macroscopic or microscopic scale fractal geometrical structures in composites displaying a viscoelastic electromechanical response, as well as to model the collective responses of processes governed by random events described using statistical mechanics.
Abstract:
The growing energy consumption in the residential sector represents about 30% of global demand. This calls for Demand Side Management solutions that encourage changes in the behavior of end consumers, with the aim of reducing overall consumption as well as shifting it to periods in which demand is lower and the cost of generating energy is lower. Demand Side Management solutions require detailed knowledge about the patterns of energy consumption. The profile of electricity demand in the residential sector is highly correlated with the time of active occupancy of the dwellings; therefore, in this study the occupancy patterns in Spanish properties were determined using the 2009–2010 Time Use Survey (TUS), conducted by the National Statistical Institute of Spain. The survey identifies three peaks in active occupancy, which coincide with morning, noon and evening. This information has been used as input to a stochastic model which generates active occupancy profiles of dwellings, with the aim of simulating domestic electricity consumption. TUS data were also used to identify which appliance-related activities could be considered for Demand Side Management solutions during the three peaks of occupancy.
Abstract:
Advanced forecasting of space weather requires simulation of the whole Sun-to-Earth system, which necessitates driving magnetospheric models with the outputs from solar wind models. This presents a fundamental difficulty, as the magnetosphere is sensitive to both large-scale solar wind structures, which can be captured by solar wind models, and small-scale solar wind “noise,” which is far below typical solar wind model resolution and results primarily from stochastic processes. Following similar approaches in terrestrial climate modeling, we propose statistical “downscaling” of solar wind model results prior to their use as input to a magnetospheric model. As magnetospheric response can be highly nonlinear, this is preferable to downscaling the results of magnetospheric modeling. To demonstrate the benefit of this approach, we first approximate solar wind model output by smoothing solar wind observations with an 8 h filter, then add small-scale structure back in through the addition of random noise with the observed spectral characteristics. Here we use a very simple parameterization of noise based upon the observed probability distribution functions of solar wind parameters, but more sophisticated methods will be developed in the future. An ensemble of results from the simple downscaling scheme is tested using a model-independent method and shown to add value to the magnetospheric forecast, both improving the best estimate and quantifying the uncertainty. We suggest a number of features desirable in an operational solar wind downscaling scheme.
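The smooth-then-add-noise procedure described above can be sketched on a synthetic time series. Everything here is an assumption made for illustration: the series is generated rather than observed, the filter is a simple centered running mean, and the noise is resampled independently from the empirical residuals, so it reproduces their probability distribution but not their spectral characteristics.

```python
import math
import random
import statistics


def running_mean(series, window):
    """Centered running mean; the window shrinks near the ends."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def downscale_ensemble(smooth, residual_pool, n_members, seed=0):
    """Add independently resampled residuals to the smooth series."""
    rng = random.Random(seed)
    return [[value + rng.choice(residual_pool) for value in smooth]
            for _ in range(n_members)]


# Synthetic stand-in for hourly solar wind speed (km/s), illustration only.
rng = random.Random(1)
observed = [400.0 + 50.0 * math.sin(2.0 * math.pi * t / 100.0) + rng.gauss(0.0, 20.0)
            for t in range(200)]

smooth = running_mean(observed, window=9)          # proxy for model output
residuals = [o - s for o, s in zip(observed, smooth)]
ensemble = downscale_ensemble(smooth, residuals, n_members=20)

# Per-time-step ensemble mean and spread.
best_estimate = [statistics.mean(column) for column in zip(*ensemble)]
spread = [statistics.stdev(column) for column in zip(*ensemble)]
```

In the workflow the abstract describes, each ensemble member would drive a separate magnetospheric model run, and the spread of those runs would quantify the forecast uncertainty.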
Abstract:
Stochastic methods are a crucial area in contemporary climate research and are increasingly being used in comprehensive weather and climate prediction models as well as reduced order climate models. Stochastic methods are used as subgrid-scale parameterizations (SSPs) as well as for model error representation, uncertainty quantification, data assimilation, and ensemble prediction. The need to use stochastic approaches in weather and climate models arises because we still cannot resolve all necessary processes and scales in comprehensive numerical weather and climate prediction models. In many practical applications one is mainly interested in the largest and potentially predictable scales and not necessarily in the small and fast scales. For instance, reduced order models can simulate and predict large-scale modes. Statistical mechanics and dynamical systems theory suggest that in reduced order models the impact of unresolved degrees of freedom can be represented by suitable combinations of deterministic and stochastic components and non-Markovian (memory) terms. Stochastic approaches in numerical weather and climate prediction models also lead to the reduction of model biases. Hence, there is a clear need for systematic stochastic approaches in weather and climate modeling. In this review, we present evidence for stochastic effects in laboratory experiments. Then we provide an overview of stochastic climate theory from an applied mathematics perspective. We also survey the current use of stochastic methods in comprehensive weather and climate prediction models and show that stochastic parameterizations have the potential to remedy many of the current biases in these comprehensive models.