170 resultados para outlier
Resumo:
This dissertation deals with aspects of sequential data assimilation (in particular ensemble Kalman filtering) and numerical weather forecasting. In the first part, the recently formulated Ensemble Kalman-Bucy (EnKBF) filter is revisited. It is shown that the previously used numerical integration scheme fails when the magnitude of the background error covariance grows beyond that of the observational error covariance in the forecast window. Therefore, we present a suitable integration scheme that handles the stiffening of the differential equations involved and doesn’t represent further computational expense. Moreover, a transform-based alternative to the EnKBF is developed: under this scheme, the operations are performed in the ensemble space instead of in the state space. Advantages of this formulation are explained. For the first time, the EnKBF is implemented in an atmospheric model. The second part of this work deals with ensemble clustering, a phenomenon that arises when performing data assimilation using of deterministic ensemble square root filters in highly nonlinear forecast models. Namely, an M-member ensemble detaches into an outlier and a cluster of M-1 members. Previous works may suggest that this issue represents a failure of EnSRFs; this work dispels that notion. It is shown that ensemble clustering can be reverted also due to nonlinear processes, in particular the alternation between nonlinear expansion and compression of the ensemble for different regions of the attractor. Some EnSRFs that use random rotations have been developed to overcome this issue; these formulations are analyzed and their advantages and disadvantages with respect to common EnSRFs are discussed. The third and last part contains the implementation of the Robert-Asselin-Williams (RAW) filter in an atmospheric model. The RAW filter is an improvement to the widely popular Robert-Asselin filter that successfully suppresses spurious computational waves while avoiding any distortion in the mean value of the function. Using statistical significance tests both at the local and field level, it is shown that the climatology of the SPEEDY model is not modified by the changed time stepping scheme; hence, no retuning of the parameterizations is required. It is found the accuracy of the medium-term forecasts is increased by using the RAW filter.
Resumo:
Ensemble clustering (EC) can arise in data assimilation with ensemble square root filters (EnSRFs) using non-linear models: an M-member ensemble splits into a single outlier and a cluster of M−1 members. The stochastic Ensemble Kalman Filter does not present this problem. Modifications to the EnSRFs by a periodic resampling of the ensemble through random rotations have been proposed to address it. We introduce a metric to quantify the presence of EC and present evidence to dispel the notion that EC leads to filter failure. Starting from a univariate model, we show that EC is not a permanent but transient phenomenon; it occurs intermittently in non-linear models. We perform a series of data assimilation experiments using a standard EnSRF and a modified EnSRF by a resampling though random rotations. The modified EnSRF thus alleviates issues associated with EC at the cost of traceability of individual ensemble trajectories and cannot use some of algorithms that enhance performance of standard EnSRF. In the non-linear regimes of low-dimensional models, the analysis root mean square error of the standard EnSRF slowly grows with ensemble size if the size is larger than the dimension of the model state. However, we do not observe this problem in a more complex model that uses an ensemble size much smaller than the dimension of the model state, along with inflation and localisation. Overall, we find that transient EC does not handicap the performance of the standard EnSRF.
Resumo:
An ensemble forecast is a collection of runs of a numerical dynamical model, initialized with perturbed initial conditions. In modern weather prediction for example, ensembles are used to retrieve probabilistic information about future weather conditions. In this contribution, we are concerned with ensemble forecasts of a scalar quantity (say, the temperature at a specific location). We consider the event that the verification is smaller than the smallest, or larger than the largest ensemble member. We call these events outliers. If a K-member ensemble accurately reflected the variability of the verification, outliers should occur with a base rate of 2/(K + 1). In operational forecast ensembles though, this frequency is often found to be higher. We study the predictability of outliers and find that, exploiting information available from the ensemble, forecast probabilities for outlier events can be calculated which are more skilful than the unconditional base rate. We prove this analytically for statistically consistent forecast ensembles. Further, the analytical results are compared to the predictability of outliers in an operational forecast ensemble by means of model output statistics. We find the analytical and empirical results to agree both qualitatively and quantitatively.
Resumo:
Surface-based GPS measurements of zenith path delay (ZPD) can be used to derive vertically integrated water vapor (IWV) of the atmosphere. ZPD data are collected in a global network presently consisting of 160 stations as part of the International GPS Service. In the present study, ZPD data from this network are converted into IWV using observed surface pressure and mean atmospheric water vapor column temperature obtained from the European Centre for Medium-Range Weather Forecasts' (ECMWF) operational analyses (OA). For the 4 months of January/July 2000/2001, the GPS-derived IWV values are compared to the IWV from the ECMWF OA, with a special focus on the monthly averaged difference (bias) and the standard deviation of daily differences. This comparison shows that the GPS-derived IWV values are well suited for the validation of OA of IWV. For most GPS stations, the IWV data agree quite well with the analyzed data indicating that they are both correct at these locations. Larger differences for individual days are interpreted as errors in the analyses. A dry bias in the winter is found over central United States, Canada, and central Siberia, suggesting a systematic analysis error. Larger differences were mainly found in mountain areas. These were related to representation problems and interpolation difficulties between model height and station height. In addition, the IWV comparison can be used to identify errors or problems in the observations of ZPD. This includes errors in the data itself, e.g., erroneous outlier in the measured time series, as well as systematic errors that affect all IWV values at a specific station. Such stations were excluded from the intercomparison. Finally, long-term requirements for a GPS-based water vapor monitoring system are discussed.
Resumo:
A range of possible changes in the frequency and characteristics of European wind storms under future climate conditions was investigated on the basis of a multi-model ensemble of 9 coupled global climate model (GCM) simulations for the 20th and 21st centuries following the IPCC SRES A1B scenario. A multi-model approach allowed an estimation of the (un)certainties of the climate change signals. General changes in large-scale atmospheric flow were analysed, the occurrence of wind storms was quantified, and atmospheric features associated with wind storm events were considered. Identified storm days were investigated according to atmospheric circulation, associated pressure patterns, cyclone tracks and wind speed patterns. Validation against reanalysis data revealed that the GCMs are in general capable of realistically reproducing characteristics of European circulation weather types (CWTs) and wind storms. Results are given with respect to frequency of occurrence, storm-associated flow conditions, cyclone tracks and specific wind speed patterns. Under anthropogenic climate change conditions (SRES A1B scenario), increased frequency of westerly flow during winter is detected over the central European investigation area. In the ensemble mean, the number of detected wind storm days increases between 19 and 33% for 2 different measures of storminess, only 1 GCM revealed less storm days. The increased number of storm days detected in most models is disproportionately high compared to the related CWT changes. The mean intensity of cyclones associated with storm days in the ensemble mean increases by about 10 (±10)% in the Eastern Atlantic, near the British Isles and in the North Sea. Accordingly, wind speeds associated with storm events increase significantly by about 5 (±5)% over large parts of central Europe, mainly on days with westerly flow. The basic conclusions of this work remain valid if different ensemble contructions are considered, leaving out an outlier model or including multiple runs of one particular model.
Resumo:
The book is concerned with the rise of the Greek Golden Dawn. Although most literature focuses on demand and supply-side explanations, this book progresses beyond the state of the art by examining the Golden Dawn as an outlier and focusing on political culture as an explanation for its dramatic rise.
Resumo:
Introduction: Resistance to anticoagulants in Norway rats (Rattus norvegicus) and house mice (Mus domesticus) has been studied in the UK since the early 1960s. In no other country in the world is our understanding of resistance phenomena so extensive and profound. Almost every aspect of resistance in the key rodent target species has been examined in laboratory and field trials and results obtained by independent researchers have been published. It is the principal purpose of this document to present a short synopsis of this information. More recently, however, the development of genetical techniques has provided a definitive means of detection of resistant genotypes among pest rodent populations. Preliminary information from a number of such surveys will also be presented. Resistance in Norway rats: A total of nine different anticoagulant resistance mutations (single nucleotide polymorphisms or SNPs) are found among Norway rats in the UK. In no other country worldwide are present so many different forms of Norway rat resistance. Among these nine SNPs, five are known to confer on rats that carry them a significant degree of resistance to anticoagulant rodenticides. These mutations are: L128Q, Y139S, L120Q, Y139C and Y139F. The latter three mutations confer, to varying degrees, practical resistance to bromadiolone and difenacoum, the two second-generation anticoagulants in predominant use in the UK. It is the recommendation of RRAG that bromadiolone and difenacoum should not be used against rats carrying the L120Q, Y139C and Y139F mutations because this will promote the spread of resistance and jeopardise the long-term efficacy of anticoagulants. Brodifacoum, flocoumafen and difethialone are effective against these three genotypes but cannot presently be used because of the regulatory restriction that they can only be applied against rats that are living and feeding predominantly indoors. Our understanding of the geographical distribution of Norway rat resistance in incomplete but is rapidly increasing. In particular, the mapping of the focus of L120Q Norway rat resistance in central-southern England by DNA sequencing is well advanced. We now know that rats carrying this resistance mutation are present across a large part of the counties of Hampshire, Berkshire and Wiltshire, and the resistance spreads into Avon, Oxfordshire and Surrey. It is also found, perhaps as outlier foci, in south-west Scotland and East Sussex. L120Q is currently the most severe form of anticoagulant resistance found in Norway rats and is prevalent over a considerable part of central-southern England. A second form of advanced Norway rat resistance is conferred by the Y139C mutation. This is noteworthy because it occurs in at least four different foci that are widely geographically dispersed, namely in Dumfries and Galloway, Gloucestershire, Yorkshire and Norfolk. Once again, bromadiolone and difenacoum are not recommended for use against rats carrying this genotype and a concern of RRAG is that continued applications of resisted active substances may result in Y139C becoming more or less ubiquitous across much of the UK. Another type of advanced resistance, the Y139F mutation, is present in Kent and Sussex. This means that Norway rats, carrying some degree of resistance to bromadiolone and difenacoum, are now found from the south coast of Kent, west into the city of Bristol, to Yorkshire in the north-east and to the south-west of Scotland. This difficult situation can only deteriorate further where these three genotypes exist and resisted anticoagulants are predominantly used against them. Resistance in house mice: House mouse is not so well understood but the presence in the UK of two resistant genotypes, L128S and Y139C, is confirmed. House mice are naturally tolerant to anticoagulants and such is the nature of this tolerance, and the presence of genetical resistance, that house mice resistant to the first-generation anticoagulants are considered to be widespread in the UK. Consequently, baits containing warfarin, sodium warfarin, chlorophacinone and coumatetralyl are not approved for use against mice. This regulatory position is endorsed by RRAG. Baits containing brodifacoum, flocoumafen and difethialone are effective against house mice and may be applied in practice because house mouse infestations are predominantly indoors. There are some reports of resistance among mice in some areas to the second-generation anticoagulant bromadiolone, while difenacoum remains largely efficacious. Alternatives to anticoagulants: The use of habitat manipulation, that is the removal of harbourage, denial of the availability of food and the prevention of ingress to structures, is an essential component of sustainable rodent pest management. All are of importance in the management of resistant rodents and have the advantage of not selecting for resistant genotypes. The use of these techniques may be particularly valuable in preventing the build-up of rat infestations. However, none can be used to remove any sizeable extant rat infestation and for practical reasons their use against house mice is problematic. Few alternative chemical interventions are available in the European Union because of the removal from the market of zinc phosphide, calciferol and bromethalin. Our virtual complete reliance on the use of anticoagulants for the chemical control of rodents in the UK, and more widely in the EU, calls for improved schemes for resistance management. Of course, these might involve the use of alternatives to anticoagulant rodenticides. Also important is an increasing knowledge of the distribution of resistance mutations in rats and mice and the use of only fully effective anticoagulants against them.
Resumo:
Outliers são observações que parecem ser inconsistentes com as demais. Também chamadas de valores atípicos, extremos ou aberrantes, estas inconsistências podem ser causadas por mudanças de política ou crises econômicas, ondas inesperadas de frio ou calor, erros de medida ou digitação, entre outras. Outliers não são necessariamente valores incorretos, mas, quando provenientes de erros de medida ou digitação, podem distorcer os resultados de uma análise e levar o pesquisador à conclusões equivocadas. O objetivo deste trabalho é estudar e comparar diferentes métodos para detecção de anormalidades em séries de preços do Índice de Preços ao Consumidor (IPC), calculado pelo Instituto Brasileiro de Economia (IBRE) da Fundação Getulio Vargas (FGV). O IPC mede a variação dos preços de um conjunto fixo de bens e serviços componentes de despesas habituais das famílias com nível de renda situado entre 1 e 33 salários mínimos mensais e é usado principalmente como um índice de referência para avaliação do poder de compra do consumidor. Além do método utilizado atualmente no IBRE pelos analistas de preços, os métodos considerados neste estudo são: variações do Método do IBRE, Método do Boxplot, Método do Boxplot SIQR, Método do Boxplot Ajustado, Método de Cercas Resistentes, Método do Quartil, do Quartil Modificado, Método do Desvio Mediano Absoluto e Algoritmo de Tukey. Tais métodos foram aplicados em dados pertencentes aos municípios Rio de Janeiro e São Paulo. Para que se possa analisar o desempenho de cada método, é necessário conhecer os verdadeiros valores extremos antecipadamente. Portanto, neste trabalho, tal análise foi feita assumindo que os preços descartados ou alterados pelos analistas no processo de crítica são os verdadeiros outliers. O Método do IBRE é bastante correlacionado com os preços alterados ou descartados pelos analistas. Sendo assim, a suposição de que os preços alterados ou descartados pelos analistas são os verdadeiros valores extremos pode influenciar os resultados, fazendo com que o mesmo seja favorecido em comparação com os demais métodos. No entanto, desta forma, é possível computar duas medidas através das quais os métodos são avaliados. A primeira é a porcentagem de acerto do método, que informa a proporção de verdadeiros outliers detectados. A segunda é o número de falsos positivos produzidos pelo método, que informa quantos valores precisaram ser sinalizados para um verdadeiro outlier ser detectado. Quanto maior for a proporção de acerto gerada pelo método e menor for a quantidade de falsos positivos produzidos pelo mesmo, melhor é o desempenho do método. Sendo assim, foi possível construir um ranking referente ao desempenho dos métodos, identificando o melhor dentre os analisados. Para o município do Rio de Janeiro, algumas das variações do Método do IBRE apresentaram desempenhos iguais ou superiores ao do método original. Já para o município de São Paulo, o Método do IBRE apresentou o melhor desempenho. Em trabalhos futuros, espera-se testar os métodos em dados obtidos por simulação ou que constituam bases largamente utilizadas na literatura, de forma que a suposição de que os preços descartados ou alterados pelos analistas no processo de crítica são os verdadeiros outliers não interfira nos resultados.
Resumo:
Esta tese versa sobre as cidades médias no atual contexto do desenvolvimento urbano brasileiro e nordestino. Na região Nordeste, o processo de urbanização foi lento, atomizado, geográfico e economicamente disperso, o que resultou numa rede urbana truncada, constituída principalmente por suas nove capitais regionais e cerca de duas dezenas de cidades de porte médio, em sua maioria, interiorizadas. Foi a partir dessa rede urbana nordestina interiorizada que nos propomos a estudar Pau dos Ferros, no Rio Grande do Norte e o papel que ela desempenha na rede urbana nordestina e potiguar. Compreender os determinantes da produção do espaço urbano-regional de Pau dos Ferros que o caracterizam como cidade média, com fins a refletir sobre o seu papel no desenvolvimento regional foi o objetivo geral desta pesquisa. Nossa hipótese é que, a despeito de um contingente populacional pequeno, Pau dos Ferros vem desempenhando na rede urbana do Nordeste e do Rio Grande do Norte as funções de cidade média, particularmente, na oferta dos serviços de educação superior e saúde, além da oferta de empregos, notadamente no comércio e nos serviços públicos, o que nos permitiu tratá-la à priori, a partir do conceito de cidade (inter) média. Para esta investigação, partimos da proposta de estudo e do pensamento de autores como Faria (1978), Benko (1999) e Brandão (2007), os quais propõem o estudo do urbano a partir de situações concretas que permitam compreender os fenomenos em sua múltiplas causalidades. Dessa forma, o fio condutor desta análise foi o modo como vem se reconfigurando as cidades médias e como essa reconfiguração tem afetado de diferentes formas as relações entre as cidades e entre as cidades e as regiões. Os resultados das análises apontaram que os investimentos públicos em saúde e educação têm contribuído para a atração de investimentos privados em suas respectivas áreas, e também em outras, o que tem ajudado a dinamizar a economia da cidade, inclusive modificado parcialmente sua estrutura ocupacional. Pau dos Ferros se destacou como um polo comercial e de serviços na rede urbana potiguar, formando um outlier no Alto Oeste, organizando uma bacia de empregos na sua área de influência que constatamos ser composta por 55 municípios do RN, CE e PB. Ao se consolidar como polo regional na oferta dos serviços de saúde e de educação superior, ampliou-se o fluxo de pessoas que realizam movimento pendular para trabalho e estudo. Em síntese, constatou-se que, a despeito do pequeno contingente populacional, a configuração urbano-regional de Pau dos Ferros, tanto em termos de sua dinâmica urbana, como de sua abrangência regional, fazem dela uma cidade média na rede urbana nordestina interiorizada
Resumo:
Parent, L. E., Natale, W. and Ziadi, N. 2009. Compositional nutrient diagnosis of corn using the Mahalanobis distance as nutrient imbalance index. Can. J. Soil Sci. 89: 383-390. Compositional nutrient diagnosis (CND) provides a plant nutrient imbalance index (CND - r(2)) with assumed chi(2) distribution. The Mahalanobis distance D(2), which detects outliers in compositional data sets, also has a chi(2) distribution. The objective of this paper was to compare D(2) and CND - r(2) nutrient imbalance indexes in corn (Zea mays L.). We measured grain yield as well as N, P, K, Ca, Mg, Cu, Fe, Mn, and Zn concentrations in the ear leaf at silk stage for 210 calibration sites in the St. Lawrence Lowlands [2300-2700 corn thermal units (CTU)] as well as 30 phosphorus (2300-2700 CTU; 10 sites) and 10 nitrogen (1900-2100 CTU; one site) replicated fertilizer treatments for validation. We derived CND norms as mean, standard deviation, and the inverse covariance matrix of centred log ratios (clr) for high yielding specimens (>= 9.0 Mg grain ha(-1) at 150 g H(2)O kg(-1) moisture content) in the 2300-2700 CTU zone. Using chi(2) = 17 (P < 0.05) with nine degrees of freedom (i.e., nine nutrients) as a rejection criterion for outliers and a yield threshold of 8.6 Mg ha(-1) after Cate-Nelson partitioning between low- and high-yielders in the P validation data set, D(2) misclassified two specimens compared with nine for CND -r(2). The D(2) classification was not significantly different from a chi(2) classification (P > 0.05), but the CND - r(2) classification differed significantly from chi(2) or D(2) (P < 0.001). A threshold value for nutrient imbalance could thus be derived probabilistically for conducting D(2) diagnosis, while the CND - r(2) nutrient imbalance threshold must be calibrated using fertilizer trials. In the proposed CND - D(2) procedure, D(2) is first computed to classify the specimen as possible outlier. Thereafter, nutrient indices are ranked in their order of limitation. The D(2) norms appeared less effective in the 1900-2100 CTU zone.
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
Anaerobic threshold (AT) is usually estimated as a change point problem by visual analysis of the cardiorespiratory response to incremental dynamic exercise. In this study, two phase linear (TPL) models of the linear-linear and linear-quadratic type were used for the estimation of AT. The correlation coefficient between the classical and statistical approaches was 0.88, and 0.89 after outlier exclusion. The TPL models provide a simple method for estimating AT that can be easily implemented using a digital computer for the automatic pattern recognition of AT.
Resumo:
Linear mixed effects models have been widely used in analysis of data where responses are clustered around some random effects, so it is not reasonable to assume independence between observations in the same cluster. In most biological applications, it is assumed that the distributions of the random effects and of the residuals are Gaussian. This makes inferences vulnerable to the presence of outliers. Here, linear mixed effects models with normal/independent residual distributions for robust inferences are described. Specific distributions examined include univariate and multivariate versions of the Student-t, the slash and the contaminated normal. A Bayesian framework is adopted and Markov chain Monte Carlo is used to carry out the posterior analysis. The procedures are illustrated using birth weight data on rats in a texicological experiment. Results from the Gaussian and robust models are contrasted, and it is shown how the implementation can be used for outlier detection. The thick-tailed distributions provide an appealing robust alternative to the Gaussian process in linear mixed models, and they are easily implemented using data augmentation and MCMC techniques.
Resumo:
In this work, a new approach for supervised pattern recognition is presented which improves the learning algorithm of the Optimum-Path Forest classifier (OPF), centered on detection and elimination of outliers in the training set. Identification of outliers is based on a penalty computed for each sample in the training set from the corresponding number of imputable false positive and false negative classification of samples. This approach enhances the accuracy of OPF while still gaining in classification time, at the expense of a slight increase in training time. © 2010 Springer-Verlag.
Resumo:
Pós-graduação em Medicina Veterinária - FMVZ