901 resultados para Asymptotic behaviour, Bayesian methods, Mixture models, Overfitting, Posterior concentration


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives---the likelihood-based and the Bayesian. The goal is two-fold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering, classification, and function approximation from incomplete data in a principled and efficient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster, Laird, and Rubin 1977)---both for the estimation of mixture components and for coping with the missing data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

La crisis que se desató en el mercado hipotecario en Estados Unidos en 2008 y que logró propagarse a lo largo de todo sistema financiero, dejó en evidencia el nivel de interconexión que actualmente existe entre las entidades del sector y sus relaciones con el sector productivo, dejando en evidencia la necesidad de identificar y caracterizar el riesgo sistémico inherente al sistema, para que de esta forma las entidades reguladoras busquen una estabilidad tanto individual, como del sistema en general. El presente documento muestra, a través de un modelo que combina el poder informativo de las redes y su adecuación a un modelo espacial auto regresivo (tipo panel), la importancia de incorporar al enfoque micro-prudencial (propuesto en Basilea II), una variable que capture el efecto de estar conectado con otras entidades, realizando así un análisis macro-prudencial (propuesto en Basilea III).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We propose and estimate a financial distress model that explicitly accounts for the interactions or spill-over effects between financial institutions, through the use of a spatial continuity matrix that is build from financial network data of inter bank transactions. Such setup of the financial distress model allows for the empirical validation of the importance of network externalities in determining financial distress, in addition to institution specific and macroeconomic covariates. The relevance of such specification is that it incorporates simultaneously micro-prudential factors (Basel 2) as well as macro-prudential and systemic factors (Basel 3) as determinants of financial distress. Results indicate network externalities are an important determinant of financial health of a financial institutions. The parameter that measures the effect of network externalities is both economically and statistical significant and its inclusion as a risk factor reduces the importance of the firm specific variables such as the size or degree of leverage of the financial institution. In addition we analyze the policy implications of the network factor model for capital requirements and deposit insurance pricing.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The biggest challenge in conservation biology is breaking down the gap between research and practical management. A major obstacle is the fact that many researchers are unwilling to tackle projects likely to produce sparse or messy data because the results would be difficult to publish in refereed journals. The obvious solution to sparse data is to build up results from multiple studies. Consequently, we suggest that there needs to be greater emphasis in conservation biology on publishing papers that can be built on by subsequent research rather than on papers that produce clear results individually. This building approach requires: (1) a stronger theoretical framework, in which researchers attempt to anticipate models that will be relevant in future studies and incorporate expected differences among studies into those models; (2) use of modern methods for model selection and multi-model inference, and publication of parameter estimates under a range of plausible models; (3) explicit incorporation of prior information into each case study; and (4) planning management treatments in an adaptive framework that considers treatments applied in other studies. We encourage journals to publish papers that promote this building approach rather than expecting papers to conform to traditional standards of rigor as stand-alone papers, and believe that this shift in publishing philosophy would better encourage researchers to tackle the most urgent conservation problems.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Preferred structures in the surface pressure variability are investigated in and compared between two 100-year simulations of the Hadley Centre climate model HadCM3. In the first (control) simulation, the model is forced with pre-industrial carbon dioxide concentration (1×CO2) and in the second simulation the model is forced with doubled CO2 concentration (2×CO2). Daily winter (December-January-February) surface pressures over the Northern Hemisphere are analysed. The identification of preferred patterns is addressed using multivariate mixture models. For the control simulation, two significant flow regimes are obtained at 5% and 2.5% significance levels within the state space spanned by the leading two principal components. They show a high pressure centre over the North Pacific/Aleutian Islands associated with a low pressure centre over the North Atlantic, and its reverse. For the 2×CO2 simulation, no such behaviour is obtained. At higher-dimensional state space, flow patterns are obtained from both simulations. They are found to be significant at the 1% level for the control simulation and at the 2.5% level for the 2×CO2 simulation. Hence under CO2 doubling, regime behaviour in the large-scale wave dynamics weakens. Doubling greenhouse gas concentration affects both the frequency of occurrence of regimes and also the pattern structures. The less frequent regime becomes amplified and the more frequent regime weakens. The largest change is observed over the Pacific where a significant deepening of the Aleutian low is obtained under CO2 doubling.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this article we review recent progress on the design, analysis and implementation of numerical-asymptotic boundary integral methods for the computation of frequency-domain acoustic scattering in a homogeneous unbounded medium by a bounded obstacle. The main aim of the methods is to allow computation of scattering at arbitrarily high frequency with finite computational resources.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Investigation of preferred structures of planetary wave dynamics is addressed using multivariate Gaussian mixture models. The number of components in the mixture is obtained using order statistics of the mixing proportions, hence avoiding previous difficulties related to sample sizes and independence issues. The method is first applied to a few low-order stochastic dynamical systems and data from a general circulation model. The method is next applied to winter daily 500-hPa heights from 1949 to 2003 over the Northern Hemisphere. A spatial clustering algorithm is first applied to the leading two principal components (PCs) and shows significant clustering. The clustering is particularly robust for the first half of the record and less for the second half. The mixture model is then used to identify the clusters. Two highly significant extratropical planetary-scale preferred structures are obtained within the first two to four EOF state space. The first pattern shows a Pacific-North American (PNA) pattern and a negative North Atlantic Oscillation (NAO), and the second pattern is nearly opposite to the first one. It is also observed that some subspaces show multivariate Gaussianity, compatible with linearity, whereas others show multivariate non-Gaussianity. The same analysis is also applied to two subperiods, before and after 1978, and shows a similar regime behavior, with a slight stronger support for the first subperiod. In addition a significant regime shift is also observed between the two periods as well as a change in the shape of the distribution. The patterns associated with the regime shifts reflect essentially a PNA pattern and an NAO pattern consistent with the observed global warming effect on climate and the observed shift in sea surface temperature around the mid-1970s.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A cross-sectional study of serum antibody responses of cattle to tick-borne pathogens (Theileria parva, Theileria mutans, Anaplasma marginale, Babesia bigemina and Babesia bovis) was conducted on smallholder dairy farms in Tanga and Iringa Regions of Tanzania. Seroprevalence was highest for T. parva (48% in Iringa and 23% in Tanga) and B. bigemina (43% in Iringa and 27% in Tanga) and lowest for B. bovis (12% in Iringa and 6% in Tanga). We use spatial and non-spatial models, fitted using classical and Bayesian methods, to explore risk factors associated with seroprevalence. These include both fixed effects (age, grazing history and breeding status) and random effects (farm and local spatial effects). In both regions, seroprevalence for all tick-borne pathogens increased significantly with age. Animals pasture grazed in the 3 months prior to the start of the sampling period were significantly more likely to be seropositive for Theileria spp. and Babesia spp. Pasture grazed animals were more likely to be seropositive than zero-grazed animals for A. marginale, but the relationship was weaker than that observed for the other four pathogens. This study did not detect any significant differences in seroprevalence associated with other management-related variables, including the method or frequency of acaricide application. After adjusting for age, there was weak evidence of localised (< 5 km) spatial correlation in exposure to some of the tick borne diseases. However, this was small compared with the 'farm-effect', suggesting that risk factors specific to the farm were more important than those common to the local neighbourhood. Many animals were seropositive for more than one pathogen and the correlation between exposure to the different pathogens remained after adjusting for the identified risk factors. Identifying the determinants of exposure to multiple tick-borne pathogens and characterizing local variation in risk will assist in the development of more effective control strategies for smallholder dairy farms. (c) 2005 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Population size estimation with discrete or nonparametric mixture models is considered, and reliable ways of construction of the nonparametric mixture model estimator are reviewed and set into perspective. Construction of the maximum likelihood estimator of the mixing distribution is done for any number of components up to the global nonparametric maximum likelihood bound using the EM algorithm. In addition, the estimators of Chao and Zelterman are considered with some generalisations of Zelterman’s estimator. All computations are done with CAMCR, a special software developed for population size estimation with mixture models. Several examples and data sets are discussed and the estimators illustrated. Problems using the mixture model-based estimators are highlighted.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon-known as heterotachy-can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We conducted the first molecular phylogenetic study of Ficus section Malvanthera (Moraceae; subgenus Urostigma) based on 32 Malvanthera accessions and seven outgroups representing other sections of Ficus subgenus Urostigma. We used DNA sequences from the nuclear ribosomal internal and external transcribed spacers (ITS and ETS), and the glyceraldehyde-3-phosphate dehydrogenase (G3pdh) region. Phylogenetic analysis using maximum parsimony, maximum likelihood and Bayesian methods recovered a monophyletic section Malvanthera to the exclusion of the rubber fig, Ficus elastica. The results of the phylogenetic analyses do not conform to any previously proposed taxonomic subdivision of the section and characters used for previous classification are homoplasious. Geographic distribution, however, is highly conserved and Melanesian Malvanthera are monophyletic. A new subdivision of section Malvanthera reflecting phylogenetic relationships is presented. Section Malvanthera likely diversified during a period of isolation in Australia and subsequently colonized New Guinea. Two Australian series are consistent with a pattern of dispersal out of rainforest habitat into drier habitats accompanied by a reduction in plant height during the transition from hemi-epiphytic trees to lithophytic trees and shrubs. In contradiction with a previous study of Pleistodontes phylogeny suggesting multiple changes in pollination behaviour, reconstruction of changes in pollination behaviour on Malvanthera, suggests only one or a few gains of active pollination within the section. (C) 2008 Elsevier Inc. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The aim of phase II single-arm clinical trials of a new drug is to determine whether it has sufficient promising activity to warrant its further development. For the last several years Bayesian statistical methods have been proposed and used. Bayesian approaches are ideal for earlier phase trials as they take into account information that accrues during a trial. Predictive probabilities are then updated and so become more accurate as the trial progresses. Suitable priors can act as pseudo samples, which make small sample clinical trials more informative. Thus patients have better chances to receive better treatments. The goal of this paper is to provide a tutorial for statisticians who use Bayesian methods for the first time or investigators who have some statistical background. In addition, real data from three clinical trials are presented as examples to illustrate how to conduct a Bayesian approach for phase II single-arm clinical trials with binary outcomes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Population size estimation with discrete or nonparametric mixture models is considered, and reliable ways of construction of the nonparametric mixture model estimator are reviewed and set into perspective. Construction of the maximum likelihood estimator of the mixing distribution is done for any number of components up to the global nonparametric maximum likelihood bound using the EM algorithm. In addition, the estimators of Chao and Zelterman are considered with some generalisations of Zelterman’s estimator. All computations are done with CAMCR, a special software developed for population size estimation with mixture models. Several examples and data sets are discussed and the estimators illustrated. Problems using the mixture model-based estimators are highlighted.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Enhanced release of CO2 to the atmosphere from soil organic carbon as a result of increased temperatures may lead to a positive feedback between climate change and the carbon cycle, resulting in much higher CO2 levels and accelerated lobal warming. However, the magnitude of this effect is uncertain and critically dependent on how the decomposition of soil organic C (heterotrophic respiration) responds to changes in climate. Previous studies with the Hadley Centre’s coupled climate–carbon cycle general circulation model (GCM) (HadCM3LC) used a simple, single-pool soil carbon model to simulate the response. Here we present results from numerical simulations that use the more sophisticated ‘RothC’ multipool soil carbon model, driven with the same climate data. The results show strong similarities in the behaviour of the two models, although RothC tends to simulate slightly smaller changes in global soil carbon stocks for the same forcing. RothC simulates global soil carbon stocks decreasing by 54 GtC by 2100 in a climate change simulation compared with an 80 GtC decrease in HadCM3LC. The multipool carbon dynamics of RothC cause it to exhibit a slower magnitude of transient response to both increased organic carbon inputs and changes in climate. We conclude that the projection of a positive feedback between climate and carbon cycle is robust, but the magnitude of the feedback is dependent on the structure of the soil carbon model.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The plethora, and mass take up, of digital communication tech- nologies has resulted in a wealth of interest in social network data collection and analysis in recent years. Within many such networks the interactions are transient: thus those networks evolve over time. In this paper we introduce a class of models for such networks using evolving graphs with memory dependent edges, which may appear and disappear according to their recent history. We consider time discrete and time continuous variants of the model. We consider the long term asymptotic behaviour as a function of parameters controlling the memory dependence. In particular we show that such networks may continue evolving forever, or else may quench and become static (containing immortal and/or extinct edges). This depends on the ex- istence or otherwise of certain infinite products and series involving age dependent model parameters. To test these ideas we show how model parameters may be calibrated based on limited samples of time dependent data, and we apply these concepts to three real networks: summary data on mobile phone use from a developing region; online social-business network data from China; and disaggregated mobile phone communications data from a reality mining experiment in the US. In each case we show that there is evidence for memory dependent dynamics, such as that embodied within the class of models proposed here.