965 results for Latent variable models
Abstract:
This paper is concerned with the selection of inputs for classification models based on ratios of measured quantities. For this purpose, all possible ratios are built from the quantities involved and variable selection techniques are used to choose a convenient subset of ratios. In this context, two selection techniques are proposed: one based on a pre-selection procedure and another based on a genetic algorithm. In an example involving the financial distress prediction of companies, the models obtained from ratios selected by the proposed techniques compare favorably to a model using ratios usually found in the financial distress literature.
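As an illustration of the ratio-construction and pre-selection steps described above, the following is a minimal sketch, not the authors' code: the quantity names, the zero handling and the simple correlation filter (standing in for the paper's actual pre-selection rule and genetic algorithm) are all assumptions.

```python
# Minimal sketch (not the authors' code): build all pairwise ratios from the
# measured quantities and pre-select a subset with a simple filter score.
from itertools import permutations

import numpy as np
import pandas as pd


def build_all_ratios(X: pd.DataFrame) -> pd.DataFrame:
    """Form every ratio q_i / q_j (i != j) from the measured quantities."""
    ratios = {}
    for num, den in permutations(X.columns, 2):
        ratios[f"{num}/{den}"] = X[num] / X[den].replace(0, np.nan)
    return pd.DataFrame(ratios)


def preselect(ratios: pd.DataFrame, y: pd.Series, k: int = 10) -> list:
    """Keep the k ratios most correlated (in absolute value) with the class label."""
    scores = ratios.apply(
        lambda col: abs(np.corrcoef(col.fillna(col.mean()), y)[0, 1])
    )
    return scores.nlargest(k).index.tolist()
```

A calling script would pass a DataFrame of raw measured quantities and a binary distress label, then fit any classifier on the ratios returned by preselect.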
Abstract:
In this paper we are mainly concerned with the development of efficient computer models capable of accurately predicting the propagation of low-to-middle frequency sound in the sea, in axially symmetric (2D) and in fully 3D environments. The major physical features of the problem, i.e. a variable bottom topography, elastic properties of the subbottom structure, volume attenuation and other range inhomogeneities, are efficiently treated. The computer models presented are based on normal mode solutions of the Helmholtz equation on the one hand, and on various types of numerical schemes for parabolic approximations of the Helmholtz equation on the other. A new coupled mode code is introduced to model sound propagation in range-dependent ocean environments with variable bottom topography, where the effects of an elastic bottom, of volume attenuation, and of surface and bottom roughness are taken into account. New computer models based on finite difference and finite element techniques for the numerical solution of parabolic approximations are also presented. They include an efficient modeling of the bottom influence via impedance boundary conditions, and they cover wide-angle propagation, elastic bottom effects, variable bottom topography and reverberation effects. All the models are validated on several benchmark problems and against experimental data. Results thus obtained were compared with analogous results from standard codes in the literature.
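For reference, the parabolic approximation mentioned above has the standard narrow-angle textbook form below (a sketch only; the paper's wide-angle and elastic-bottom schemes are not reproduced here).

```latex
% Helmholtz equation in cylindrical coordinates and its standard narrow-angle
% parabolic approximation (textbook form; not the paper's wide-angle schemes).
\frac{\partial^{2} p}{\partial r^{2}} + \frac{1}{r}\frac{\partial p}{\partial r}
  + \frac{\partial^{2} p}{\partial z^{2}} + k_{0}^{2} n^{2}(r,z)\, p = 0,
\qquad p(r,z) = \psi(r,z)\, H_{0}^{(1)}(k_{0} r),
% neglecting \partial^{2}\psi/\partial r^{2} (paraxial assumption) gives
2 i k_{0} \frac{\partial \psi}{\partial r}
  + \frac{\partial^{2} \psi}{\partial z^{2}}
  + k_{0}^{2} \bigl( n^{2}(r,z) - 1 \bigr)\, \psi = 0 .
```

Here n = c0/c is the index of refraction and k0 a reference wavenumber; the second equation follows from the first by factoring out an outgoing cylindrical wave and neglecting the second range derivative of ψ.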
Abstract:
This study tested the hypothesis that aggressive, localized infections and asymptomatic systemic infections were caused by distinct specialized groups of Botrytis cinerea, using microsatellite genotypes at nine loci of 243 isolates of B. cinerea obtained from four hosts (strawberry (Fragaria × ananassa), blackberry (Rubus fruticosus agg.), dandelion (Taraxacum officinale agg.) and primrose (Primula vulgaris)) in three regions in southern England (in the vicinities of Brighton, Reading and Bath). The populations were extremely variable, with up to 20 alleles per locus and high genic diversity. Each host in each region had a population of B. cinerea with distinctive genetic features, and there were also consistent host and regional distinctions. The B. cinerea population from strawberry was distinguished from that on other hosts, including blackberry, most notably by a common 154-bp amplicon at locus 5 (present in 35 of 77 samples) that was rare in isolates from other hosts (9/166), and by the rarity (3/77) of a 112-bp allele at locus 7 that was common (58/166) in isolates from other hosts. There was significant linkage disequilibrium overall within the B. cinerea populations on blackberry and strawberry, but with quite different patterns of association among isolates from the two hosts. No evidence was found for differentiation between populations of B. cinerea from systemically infected hosts and those from locally infected fruits.
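The "genic diversity" referred to above is usually quantified per locus by Nei's gene diversity; the formula below is that standard measure and an assumption here, since the abstract does not state the exact estimator used.

```latex
% Nei's gene diversity, the standard measure behind "genic diversity"; p_i is the
% frequency of the i-th of a alleles at a locus, averaged here over L loci.
H_{\mathrm{locus}} = 1 - \sum_{i=1}^{a} p_{i}^{2},
\qquad
\bar{H} = \frac{1}{L} \sum_{\ell=1}^{L} H_{\ell} .
```

In this study L = 9 loci, with up to a = 20 alleles observed at a single locus.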
Abstract:
A large number of urban surface energy balance models now exist with different assumptions about the important features of the surface and exchange processes that need to be incorporated. To date, no comparison of these models has been conducted; in contrast, models for natural surfaces have been compared extensively as part of the Project for Intercomparison of Land-surface Parameterization Schemes. Here, the methods and first results from an extensive international comparison of 33 models are presented. The aim of the comparison overall is to understand the complexity required to model energy and water exchanges in urban areas. The degree of complexity included in the models is outlined and impacts on model performance are discussed. During the comparison there have been significant developments in the models with resulting improvements in performance (root-mean-square error falling by up to two-thirds). Evaluation is based on a dataset containing net all-wave radiation, sensible heat, and latent heat flux observations for an industrial area in Vancouver, British Columbia, Canada. The aim of the comparison is twofold: to identify those modeling approaches that minimize the errors in the simulated fluxes of the urban energy balance and to determine the degree of model complexity required for accurate simulations. There is evidence that some classes of models perform better for individual fluxes but no model performs best or worst for all fluxes. In general, the simpler models perform as well as the more complex models based on all statistical measures. Generally the schemes have best overall capability to model net all-wave radiation and least capability to model latent heat flux.
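The fluxes named above are the terms of the urban surface energy balance, shown below in its usual textbook form (a reference sketch; individual models in the comparison treat or neglect the anthropogenic and advective terms differently).

```latex
% Urban surface energy balance in its usual textbook form; Q* is net all-wave
% radiation, Q_F anthropogenic heat, Q_H/Q_E turbulent sensible/latent heat,
% \Delta Q_S net storage change and \Delta Q_A net advection.
Q^{*} + Q_{F} = Q_{H} + Q_{E} + \Delta Q_{S} + \Delta Q_{A} .
```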
Abstract:
An input variable selection procedure is introduced for the identification and construction of multi-input multi-output (MIMO) neurofuzzy operating point dependent models. The algorithm is an extension of a forward modified Gram-Schmidt orthogonal least squares procedure for a linear model structure which is modified to accommodate nonlinear system modeling by incorporating piecewise locally linear model fitting. The proposed input node selection procedure effectively tackles the problem of the curse of dimensionality associated with lattice-based modeling algorithms such as radial basis function neurofuzzy networks, enabling the resulting neurofuzzy operating point dependent model to be widely applied in control and estimation. Some numerical examples are given to demonstrate the effectiveness of the proposed construction algorithm.
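A minimal sketch of the forward orthogonal least squares idea underlying such a procedure, for the plain linear case only (the paper's piecewise locally linear extension and neurofuzzy lattice structure are not reproduced; function and variable names are illustrative).

```python
# Minimal sketch (not the authors' code) of forward orthogonal least squares
# selection via the error-reduction ratio (ERR) for a linear-in-parameters model.
import numpy as np


def forward_ols_select(P: np.ndarray, y: np.ndarray, n_terms: int) -> list:
    """Greedily pick n_terms columns of the candidate regressor matrix P (n_samples x n_candidates)."""
    selected, basis = [], []          # chosen column indices / orthogonalised vectors
    yty = float(y @ y)
    for _ in range(n_terms):
        best_idx, best_err, best_w = -1, -1.0, None
        for i in range(P.shape[1]):
            if i in selected:
                continue
            w = P[:, i].astype(float)
            for b in basis:           # Gram-Schmidt against already-selected terms
                w -= (b @ P[:, i]) / (b @ b) * b
            denom = float(w @ w)
            if denom < 1e-12:         # candidate is (nearly) in the span already
                continue
            err = float(w @ y) ** 2 / (denom * yty)   # error-reduction ratio
            if err > best_err:
                best_idx, best_err, best_w = i, err, w
        if best_idx < 0:
            break
        selected.append(best_idx)
        basis.append(best_w)
    return selected
```

Each step orthogonalises the remaining candidates against the terms already chosen and keeps the one with the largest error-reduction ratio.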
Abstract:
This paper analyzes the use of linear and neural network models for financial distress classification, with emphasis on the issues of input variable selection and model pruning. A data-driven method for selecting input variables (financial ratios, in this case) is proposed. A case study involving 60 British firms in the period 1997-2000 is used for illustration. It is shown that the use of the Optimal Brain Damage pruning technique can considerably improve the generalization ability of a neural model. Moreover, the set of financial ratios obtained with the proposed selection procedure is shown to be an appropriate alternative to the ratios usually employed by practitioners.
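For reference, Optimal Brain Damage ranks weights by the saliency below, in its standard form from LeCun et al.; the retraining schedule used in the paper is not shown.

```latex
% Optimal Brain Damage saliency: estimated increase in training error E from
% deleting weight w_k, using only the diagonal Hessian term h_{kk}; weights with
% the smallest saliencies are pruned first.
s_{k} = \tfrac{1}{2}\, h_{kk}\, w_{k}^{2},
\qquad
h_{kk} = \frac{\partial^{2} E}{\partial w_{k}^{2}} .
```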
Abstract:
This paper is concerned with the use of a genetic algorithm to select financial ratios for corporate distress classification models. For this purpose, the fitness value associated with a set of ratios is made to reflect the requirements of maximizing the amount of information available for the model and minimizing the collinearity between the model inputs. A case study involving 60 failed and continuing British firms in the period 1997-2000 is used for illustration. The classification model based on ratios selected by the genetic algorithm compares favorably with a model employing ratios usually found in the financial distress literature.
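The following is an illustrative sketch, not the paper's actual fitness function: it combines an information term (correlation of each selected ratio with the class) with a collinearity penalty (mean absolute correlation between selected ratios); the weighting lambda_ and the specific terms are assumptions.

```python
# Illustrative GA fitness sketch: reward class-relevant information in the selected
# ratios, penalize collinearity between them. Not the paper's exact fitness.
import numpy as np


def fitness(X: np.ndarray, y: np.ndarray, mask: np.ndarray, lambda_: float = 1.0) -> float:
    """X: (n_samples, n_ratios); y: binary class labels; mask: boolean gene string."""
    Xs = X[:, mask]
    if Xs.shape[1] < 2:
        return -np.inf
    # information term: mean |correlation| of each selected ratio with the class
    info = np.mean([abs(np.corrcoef(Xs[:, j], y)[0, 1]) for j in range(Xs.shape[1])])
    # collinearity term: mean |correlation| between distinct selected ratios
    C = np.abs(np.corrcoef(Xs, rowvar=False))
    collin = (C.sum() - np.trace(C)) / (C.size - Xs.shape[1])
    return info - lambda_ * collin
```

A standard GA (selection, crossover, mutation over the boolean mask) would then maximize this score over candidate ratio subsets.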
Abstract:
Several studies using ocean–atmosphere general circulation models (GCMs) suggest that the atmospheric component plays a dominant role in the modelled El Niño-Southern Oscillation (ENSO). To help elucidate these findings, the two main atmosphere feedbacks relevant to ENSO, the Bjerknes positive feedback (μ) and the heat flux negative feedback (α), are here analysed in nine AMIP runs of the CMIP3 multimodel dataset. We find that these models generally have improved feedbacks compared to the coupled runs which were analysed in part I of this study. The Bjerknes feedback, μ, is increased in most AMIP runs compared to the coupled run counterparts, and exhibits both positive and negative biases with respect to ERA40. As in the coupled runs, the shortwave and latent heat flux feedbacks are the two dominant components of α in the AMIP runs. We investigate the mechanisms behind these two important feedbacks, in particular focusing on the strong 1997–1998 El Niño. Biases in the shortwave flux feedback, αSW, are the main source of model uncertainty in α. Most models do not successfully represent the negative αSW in the East Pacific, primarily due to an overly strong low-cloud positive feedback in the far eastern Pacific. Biases in the cloud response to dynamical changes dominate the modelled αSW biases, though errors in the large-scale circulation response to sea surface temperature (SST) forcing also play a role. Analysis of the cloud radiative forcing in the East Pacific reveals model biases in low cloud amount and optical thickness which may affect αSW. We further show that the negative latent heat flux feedback, αLH, exhibits less diversity than αSW and is primarily driven by variations in the near-surface specific humidity difference. However, biases in both the near-surface wind speed and humidity response to SST forcing can explain the inter-model αLH differences.
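In the convention implied by the abstract, the two feedbacks are linear regression coefficients of the form below (region definitions and anomaly construction omitted; this is a sketch of the definitions rather than the paper's exact methodology).

```latex
% Feedbacks as linear regression coefficients onto SST anomalies T' (region
% definitions omitted); a negative \alpha corresponds to damping of SST anomalies
% by the net surface heat flux.
\tau_{x}' \approx \mu \, T',
\qquad
Q_{\mathrm{net}}' \approx \alpha \, T',
\qquad
\alpha = \alpha_{SW} + \alpha_{LW} + \alpha_{LH} + \alpha_{SH} .
```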
Abstract:
An evaluation is undertaken of the statistics of daily precipitation as simulated by five regional climate models using comprehensive observations in the region of the European Alps. Four limited area models and one variable-resolution global model are considered, all with a grid spacing of 50 km. The 15-year integrations were forced from reanalyses and observed sea surface temperature and sea ice (global model from sea surface only). The observational reference is based on 6400 rain gauge records (10–50 stations per grid box). Evaluation statistics encompass mean precipitation, wet-day frequency, precipitation intensity, and quantiles of the frequency distribution. For mean precipitation, the models reproduce the characteristics of the annual cycle and the spatial distribution. The domain mean bias varies between −23% and +3% in winter and between −27% and −5% in summer. Larger errors are found for other statistics. In summer, all models underestimate precipitation intensity (by 16–42%) and there is a too low frequency of heavy events. This bias reflects too dry summer mean conditions in three of the models, while it is partly compensated by too many low-intensity events in the other two models. Similar intermodel differences are found for other European subregions. Interestingly, the model errors are very similar between the two models with the same dynamical core (but different parameterizations) and they differ considerably between the two models with similar parameterizations (but different dynamics). Despite considerable biases, the models reproduce prominent mesoscale features of heavy precipitation, which is a promising result for their use in climate change downscaling over complex topography.
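A minimal sketch of the evaluation statistics listed above, computed for a single daily precipitation series; the 1 mm/day wet-day threshold is a common convention and an assumption here, not necessarily the study's exact choice.

```python
# Sketch of the daily-precipitation evaluation statistics named in the abstract.
import numpy as np


def precip_stats(daily_mm: np.ndarray, wet_threshold: float = 1.0) -> dict:
    """Mean, wet-day frequency, intensity and upper quantiles of a daily series (mm/day)."""
    wet = daily_mm >= wet_threshold
    return {
        "mean_mm_per_day": float(daily_mm.mean()),
        "wet_day_frequency": float(wet.mean()),
        "intensity_mm_per_wet_day": float(daily_mm[wet].mean()) if wet.any() else 0.0,
        "q90_mm": float(np.quantile(daily_mm, 0.90)),
        "q99_mm": float(np.quantile(daily_mm, 0.99)),
    }
```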
Abstract:
This study presents a model intercomparison of four regional climate models (RCMs) and one variable resolution atmospheric general circulation model (AGCM) applied over Europe with special focus on the hydrological cycle and the surface energy budget. The models simulated the 15 years from 1979 to 1993 by using quasi-observed boundary conditions derived from ECMWF re-analyses (ERA). The model intercomparison focuses on two large catchments representing two different climate conditions covering two areas of major research interest within Europe. The first is the Danube catchment which represents a continental climate dominated by advection from the surrounding land areas. It is used to analyse the common model error of a too dry and too warm simulation of the summertime climate of southeastern Europe. This summer warming and drying problem is seen in many RCMs, and to a lesser extent in GCMs. The second area is the Baltic Sea catchment which represents maritime climate dominated by advection from the ocean and from the Baltic Sea. This catchment is a research area of many studies within Europe and also covered by the BALTEX program. The observed data used are monthly mean surface air temperature, precipitation and river discharge. For all models, these are used to estimate mean monthly biases of all components of the hydrological cycle over land. In addition, the mean monthly deviations of the surface energy fluxes from ERA data are computed. Atmospheric moisture fluxes from ERA are compared with those of one model to provide an independent estimate of the convergence bias derived from the observed data. These help to add weight to some of the inferred estimates and explain some of the discrepancies between them. An evaluation of these biases and deviations suggests possible sources of error in each of the models. For the Danube catchment, systematic errors in the dynamics cause the prominent summer drying problem for three of the RCMs, while for the fourth RCM this is related to deficiencies in the land surface parametrization. The AGCM does not show this drying problem. For the Baltic Sea catchment, all models similarly overestimate the precipitation throughout the year except during the summer. This model deficit is probably caused by the internal model parametrizations, such as the large-scale condensation and the convection schemes.
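The bias estimates for the hydrological cycle rest on the standard catchment water budget, sketched below with generic symbols (the study's exact closure and averaging choices are not reproduced).

```latex
% Catchment-mean water budget over land: precipitation P, evapotranspiration E,
% river discharge (runoff) R and terrestrial storage change dS/dt; over a full
% annual cycle the storage term is small, so P - E approximately balances R.
P - E - R = \frac{dS}{dt} .
```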
Abstract:
Undirected graphical models are widely used in statistics, physics and machine vision. However Bayesian parameter estimation for undirected models is extremely challenging, since evaluation of the posterior typically involves the calculation of an intractable normalising constant. This problem has received much attention, but very little of this has focussed on the important practical case where the data consists of noisy or incomplete observations of the underlying hidden structure. This paper specifically addresses this problem, comparing two alternative methodologies. In the first of these approaches particle Markov chain Monte Carlo (Andrieu et al., 2010) is used to efficiently explore the parameter space, combined with the exchange algorithm (Murray et al., 2006) for avoiding the calculation of the intractable normalising constant (a proof showing that this combination targets the correct distribution is found in a supplementary appendix online). This approach is compared with approximate Bayesian computation (Pritchard et al., 1999). Applications to estimating the parameters of Ising models and exponential random graphs from noisy data are presented. Each algorithm used in the paper targets an approximation to the true posterior due to the use of MCMC to simulate from the latent graphical model, in lieu of being able to do this exactly in general. The supplementary appendix also describes the nature of the resulting approximation.
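For reference, the exchange algorithm avoids the normalising constant through the acceptance probability below, shown in its basic single-auxiliary-variable form (the particle MCMC and noisy-observation extensions of the paper are not shown).

```latex
% Exchange-algorithm acceptance probability (basic form), with unnormalised
% likelihood f, prior p, proposal q and auxiliary draw y'; the intractable
% normalising constants Z(\theta), Z(\theta') cancel from the ratio.
y' \sim \frac{f(\cdot \mid \theta')}{Z(\theta')},
\qquad
A(\theta \to \theta') = \min\!\left\{ 1,\;
  \frac{p(\theta')\, q(\theta \mid \theta')\, f(y \mid \theta')\, f(y' \mid \theta)}
       {p(\theta)\, q(\theta' \mid \theta)\, f(y \mid \theta)\, f(y' \mid \theta')}
\right\} .
```

Because the auxiliary draw y' uses the same unnormalised likelihood f, the intractable constants cancel and the ratio can be evaluated exactly whenever y' can be simulated.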
Abstract:
A method to solve a quasi-geostrophic two-layer model including the variation of static stability is presented. The divergent part of the wind is incorporated by means of an iterative procedure. The procedure is rather fast and the time of computation is only 60–70% longer than for the usual two-layer model. The method of solution is justified by the conservation of the difference between the gross static stability and the kinetic energy. To eliminate the side-boundary conditions the experiments have been performed on a zonal channel model. The investigation falls mainly into three parts: The first part (section 5) contains a discussion of the significance of some physically inconsistent approximations. It is shown that physical inconsistencies are rather serious: for the inconsistent models studied, the total kinetic energy increased faster than the gross static stability. In the next part (section 6) we study the effect of a Jacobian difference operator which conserves the total kinetic energy. The use of this operator in two-layer models gives a slight improvement but probably does not have any practical use in short-period forecasts. It is also shown that the energy-conservative operator will change the wave speed in an erroneous way if the wave number or the grid length is large in the meridional direction. In the final part (section 7) we investigate the behaviour of baroclinic waves for some different initial states and for two energy-consistent models, one with constant and one with variable static stability. According to the linear theory the waves adjust rather rapidly in such a way that the temperature wave will lag behind the pressure wave independently of the initial configuration. Thus, both models give rise to a baroclinic development even if the initial state is quasi-barotropic. The effect of the variation of static stability is very small; qualitative differences in the development are only observed during the first 12 hours. For an amplifying wave we get a stabilization over the troughs and a destabilization over the ridges.
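The energy-conservation property discussed in section 6 concerns the Jacobian operator of the quasi-geostrophic equations; the continuous identity it mimics is sketched below (the specific finite-difference stencil of the operator is not reproduced).

```latex
% Continuous Jacobian of the quasi-geostrophic equations and the integral
% identity (periodic or channel boundaries) that an energy-conserving difference
% operator is built to mimic.
J(\psi, \zeta) = \frac{\partial \psi}{\partial x}\frac{\partial \zeta}{\partial y}
               - \frac{\partial \psi}{\partial y}\frac{\partial \zeta}{\partial x},
\qquad
\iint \psi \, J(\psi, \zeta)\, dx\, dy = 0 .
```

The vanishing integral is what prevents the advection term from changing the total kinetic energy; an energy-conservative difference operator is constructed so that a discrete analogue of it still holds.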
Abstract:
The redistribution of a finite amount of martian surface dust during global dust storms and in the intervening periods has been modelled in a dust lifting version of the UK Mars General Circulation Model. When using a constant, uniform threshold in the model’s wind stress lifting parameterisation and assuming an unlimited supply of surface dust, multiannual simulations displayed some variability in dust lifting activity from year to year, arising from internal variability manifested in surface wind stress, but dust storms were limited in size and formed within a relatively short seasonal window. Lifting thresholds were then allowed to vary at each model gridpoint, dependent on the rates of emission or deposition of dust. This enhanced interannual variability in dust storm magnitude and timing, such that model storms covered most of the observed ranges in size and initiation date within a single multiannual simulation. Peak storm magnitude in a given year was primarily determined by the availability of surface dust at a number of key sites in the southern hemisphere. The observed global dust storm (GDS) frequency of roughly one in every 3 years was approximately reproduced, but the model failed to generate these GDSs spontaneously in the southern hemisphere, where they have typically been observed to initiate. After several years of simulation, the surface threshold field—a proxy for net change in surface dust density—showed good qualitative agreement with the observed pattern of martian surface dust cover. The model produced a net northward cross-equatorial dust mass flux, which necessitated the addition of an artificial threshold decrease rate in order to allow the continued generation of dust storms over the course of a multiannual simulation. At standard model resolution, for the southward mass flux due to cross-equatorial flushing storms to offset the northward flux due to GDSs on a timescale of ∼3 years would require an increase in the former by a factor of 3–4. Results at higher model resolution and uncertainties in dust vertical profiles mean that quasi-periodic redistribution of dust on such a timescale nevertheless appears to be a plausible explanation for the observed GDS frequency.
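A schematic sketch of the kind of lifting scheme described above: dust is lifted where the surface wind stress exceeds a local threshold, and the threshold is raised where net emission has occurred and lowered where net deposition has occurred. All coefficients, array shapes and functional forms are placeholder assumptions, not the UK Mars GCM parameterisation.

```python
# Schematic threshold-based dust lifting with a threshold that tracks net
# emission/deposition (a proxy for surface dust depletion). Coefficients are
# placeholders, not model values.
import numpy as np


def step_dust_lifting(stress, threshold, deposition, k_lift=1e-4, k_adjust=1e-2):
    """One step on a 2-D grid of surface wind stress (all arrays have the same shape)."""
    excess = np.maximum(stress - threshold, 0.0)
    emission = k_lift * excess                     # lifted dust flux where stress exceeds threshold
    net_loss = emission - deposition               # net change in surface dust at each point
    threshold = threshold + k_adjust * net_loss    # depleted points become harder to lift from
    return emission, threshold
```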
Abstract:
The link between the Pacific/North American pattern (PNA) and the North Atlantic Oscillation (NAO) is investigated in reanalysis data (NCEP, ERA40) and multi-century CGCM runs for present-day climate using three versions of the ECHAM model. PNA and NAO patterns and indices are determined via rotated principal component analysis on monthly mean 500 hPa geopotential height fields using the varimax criterion. On average, the multi-century CGCM simulations show a significant anti-correlation between PNA and NAO. Further, multi-decadal periods with significantly enhanced (high anti-correlation, active phase) or weakened (low correlations, inactive phase) coupling are found in all CGCMs. In the simulated active phases, the storm track activity near Newfoundland has a stronger link with the PNA variability than during the inactive phases. On average, the reanalysis datasets show no significant anti-correlation between PNA and NAO indices, but during the sub-period 1973–1994 a significant anti-correlation is detected, suggesting that the present climate could correspond to an inactive period as detected in the CGCMs. An analysis of possible physical mechanisms suggests that the link between the patterns is established by the baroclinic waves forming the North Atlantic storm track. The geopotential height anomalies associated with negative PNA phases induce an increased advection of warm and moist air from the Gulf of Mexico and cold air from Canada. Both types of advection contribute to increase baroclinicity over eastern North America and also to increase the low level latent heat content of the warm air masses. Thus, growth conditions for eddies at the entrance of the North Atlantic storm track are enhanced. Considering the average temporal development during winter for the CGCM, results show an enhanced Newfoundland storm track maximum in the early winter for negative PNA, followed by a downstream enhancement of the Atlantic storm track in the subsequent months. In active (inactive) phases, this seasonal development is enhanced (suppressed). As the storm track over the central and eastern Atlantic is closely related to the NAO variability, this development can be explained by the shift of the NAO index to more positive values.
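A minimal sketch of the kind of diagnostic implied above: a running correlation between monthly PNA and NAO indices over multi-decadal windows, used to separate strongly anti-correlated (active) from weakly correlated (inactive) phases; the 31-year window length is an illustrative assumption.

```python
# Running correlation of two monthly teleconnection indices (e.g. PNA and NAO)
# over multi-decadal windows; window length is an illustrative choice.
import numpy as np


def running_correlation(pna: np.ndarray, nao: np.ndarray, window_years: int = 31) -> np.ndarray:
    """Centred running correlation of two monthly index series of equal length."""
    w = window_years * 12
    out = np.full(pna.size, np.nan)
    for t in range(w // 2, pna.size - w // 2):
        a = pna[t - w // 2 : t + w // 2]
        b = nao[t - w // 2 : t + w // 2]
        out[t] = np.corrcoef(a, b)[0, 1]
    return out
```

Sustained strongly negative values of the returned series would mark active phases of the PNA-NAO coupling; values near zero would mark inactive phases.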