Abstract:
The performance of the SAOP potential for the calculation of NMR chemical shifts was evaluated. SAOP results show considerable improvement over previous potentials, such as VWN or BP86, at least for the carbon, nitrogen, oxygen, and fluorine chemical shifts. Furthermore, a few NMR calculations carried out on third-period atoms (S, P, and Cl) also improved when the SAOP potential was used.
Abstract:
In this article we present a hybrid approach to the automatic summarization of Spanish medical texts. Many systems perform automatic summarization using either statistical or linguistic techniques, but only a few combine both. Our premise is that a good summary requires the linguistic analysis of the text, but should also benefit from the advantages of statistical techniques. We have integrated the Cortex (Vector Space Model) and Enertex (statistical physics) systems, coupled with the Yate term extractor, and the Disicosum system (linguistics). We compared these systems and then integrated them into a hybrid approach. Finally, we applied this hybrid system to a corpus of medical articles and evaluated its performance, obtaining good results.
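A toy sketch of the statistical side of such systems: scoring sentences in a vector space by their cosine similarity to the whole document, in the spirit of a VSM summarizer like Cortex (an illustration only, not the Cortex metric):

```python
from collections import Counter
import math

def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def summarize(sentences, k=1):
    """Rank sentences by similarity of their term-frequency vector
    to the whole document's vector (toy VSM scorer, not Cortex)."""
    doc = Counter(w for s in sentences for w in s.lower().split())
    ranked = sorted(sentences,
                    key=lambda s: cosine(Counter(s.lower().split()), doc),
                    reverse=True)
    return ranked[:k]

doc = ["the patient presented fever and cough",
       "fever persisted for three days",
       "the weather was sunny"]
print(summarize(doc, k=1))  # the sentence sharing most vocabulary wins
```

A real hybrid system would combine such statistical scores with linguistic evidence (terms from Yate, discourse structure from Disicosum) before ranking.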
Abstract:
This paper presents and estimates a dynamic choice model in the attribute space considering rational consumers. In light of the evidence of several state-dependence patterns, the standard attribute-based model is extended by considering a general utility function in which pure inertia and pure variety-seeking behaviors can be explained as particular linear cases. The dynamics of the model are fully characterized by standard dynamic programming techniques. The model presents a stationary consumption pattern that can be inertial, where the consumer only buys one product, or variety-seeking, where the consumer shifts among varied products. We run some simulations to analyze the consumption paths out of the steady state. Under the hybrid utility assumption, the consumer behaves inertially among the unfamiliar brands for several periods, eventually switching to variety-seeking behavior as the stationary levels are approached. An empirical analysis is run using scanner databases for three different product categories: fabric softener, saltine crackers, and catsup. Non-linear specifications provide the best fit to the data, as hybrid functional forms are found in all product categories for most attributes and segments. These results reveal the statistical superiority of the non-linear structure and confirm the gradual trend toward variety seeking as familiarity with the purchased items increases.
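The inertia versus variety-seeking dichotomy can be reproduced in a toy dynamic program: two identical products, with the state being the last product bought. The paper's attribute space and hybrid functional forms are abstracted away, and theta is a hypothetical state-dependence parameter introduced here for illustration:

```python
def stationary_policy(theta, base=(1.0, 1.0), beta=0.9, iters=200):
    """Value iteration for a toy 2-product choice problem.
    theta > 0 rewards repeating the last purchase (inertia);
    theta < 0 rewards switching (variety seeking)."""
    V = [0.0, 0.0]
    for _ in range(iters):
        V = [max(base[j] + theta * (j == i) + beta * V[j] for j in (0, 1))
             for i in (0, 1)]
    # Optimal product to buy from each state (state = last product bought).
    return [max((0, 1), key=lambda j: base[j] + theta * (j == i) + beta * V[j])
            for i in (0, 1)]

print(stationary_policy(+0.5))  # [0, 1]: always repeat -> inertial pattern
print(stationary_policy(-0.5))  # [1, 0]: always switch -> variety seeking
```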
Abstract:
Cape Verde comprises 10 islands, of which Maio is the oldest in the archipelago, with an area of 269 km², a maximum length of 24,100 m, a maximum width of 16,300 m and a total population of 6,740 inhabitants. In terms of geomorphology and geology, the island is considered flat and is composed of eruptive and sedimentary formations, the sedimentary formations being dominant. It hosts the oldest formations in Cape Verde, of Jurassic and Cretaceous age; however, it lacks the most recent eruptive formations found on the other islands. Maio has an arid to semi-arid climate, with a mean temperature of 24.5 °C and an annual precipitation of 125.4 mm. Estimates based on a daily sequential water-balance model show that about 7% of the precipitation becomes surface runoff and 14.1% groundwater flow. Applying this model together with the chloride mass-balance method, the annually renewable groundwater resources of Maio island lie, in an average year, between 3.44 × 10⁶ m³ and 4.76 × 10⁶ m³. Total flow, in turn, is estimated at 7.8 × 10⁶ m³ per year, equivalent to about 21,400 m³/day. Groundwater flow on Maio is globally centrifugal, radiating from the elevations of the central massif. The hydraulic gradient ranges between 0.05% and 2.9%, with the lowest value occurring in the northern sector of the island, which favours saline intrusion. Regarding water quality, the samples collected correspond to highly mineralized waters, with electrical conductivity values between 832 μS/cm and 7730 μS/cm. TDS values, in turn, range between 705.8 mg/L and 4210.4 mg/L. Under these conditions, the groundwater analysed can be considered brackish.
The dominant hydrochemical facies is sodium chloride, and a large proportion of the samples can be classified as sodium chloride-bicarbonate. Assuming the sampling is statistically significant, it can be said that, in physico-chemical terms, about 20% of the groundwater is fit for human consumption. As for irrigation use, the waters analysed present a low to high soil alkalinization hazard and a high to very high salinization hazard. In summary, despite the arid character of Maio island, it has a non-negligible water-resources potential, possibly sufficient to meet the population's water needs. The study nevertheless shows the need to implement measures for the sustainable use of these resources, within the framework of the integrated water-resources management of Maio island.
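The runoff and recharge figures quoted above are internally consistent, as a quick arithmetic check shows (all numbers are taken from the abstract; the 7% and 14.1% fractions are applied directly to mean precipitation over the island's area):

```python
# Water-balance cross-check for Maio island (figures from the abstract).
AREA_M2 = 269e6      # island area: 269 km²
PRECIP_M = 0.1254    # mean annual precipitation: 125.4 mm

runoff = 0.070 * PRECIP_M * AREA_M2     # surface runoff, m³/yr
recharge = 0.141 * PRECIP_M * AREA_M2   # groundwater recharge, m³/yr
total = 7.8e6                           # reported total flow, m³/yr

print(f"runoff   ~ {runoff:.3g} m³/yr")    # ~2.36e6
print(f"recharge ~ {recharge:.3g} m³/yr")  # ~4.76e6, the upper estimate quoted
print(f"per day  ~ {total / 365:.0f} m³/day")  # ~21 400 m³/day, as reported
```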
Abstract:
Given $n$ independent replicates of a jointly distributed pair $(X,Y)\in {\cal R}^d \times {\cal R}$, we wish to select from a fixed sequence of model classes ${\cal F}_1, {\cal F}_2, \ldots$ a deterministic prediction rule $f: {\cal R}^d \to {\cal R}$ whose risk is small. We investigate the possibility of empirically assessing the {\em complexity} of each model class, that is, the actual difficulty of the estimation problem within each class. The estimated complexities are in turn used to define an adaptive model selection procedure, which is based on complexity-penalized empirical risk. The available data are divided into two parts. The first is used to form an empirical cover of each model class, and the second is used to select a candidate rule from each cover based on empirical risk. The covering radii are determined empirically to optimize a tight upper bound on the estimation error. An estimate is chosen from the list of candidates in order to minimize the sum of class complexity and empirical risk. A distinguishing feature of the approach is that the complexity of each model class is assessed empirically, based on the size of its empirical cover. Finite-sample performance bounds are established for the estimates, and these bounds are applied to several non-parametric estimation problems. The estimates are shown to achieve a favorable tradeoff between approximation and estimation error, and to perform as well as if the distribution-dependent complexities of the model classes were known beforehand. In addition, it is shown that the estimate can be consistent, and even possess near-optimal rates of convergence, when each model class has an infinite VC or pseudo dimension. For regression estimation with squared loss we modify our estimate to achieve a faster rate of convergence.
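A highly simplified sketch of the split-sample, complexity-penalized selection described above. Polynomial classes stand in for ${\cal F}_1, {\cal F}_2, \ldots$, and a crude dimension-based penalty stands in for the empirical-cover complexity; this is not the authors' estimator:

```python
import numpy as np

# Synthetic regression data: Y = sin(3X) + noise.
rng = np.random.default_rng(0)
n = 200
X = rng.uniform(-1, 1, n)
Y = np.sin(3 * X) + 0.3 * rng.normal(size=n)

# First half fits one candidate rule per class; second half estimates risk.
Xf, Yf, Xv, Yv = X[:n // 2], Y[:n // 2], X[n // 2:], Y[n // 2:]

best, best_score = None, np.inf
for deg in range(10):                       # model classes F_1, F_2, ...
    coef = np.polyfit(Xf, Yf, deg)          # candidate rule from class deg
    risk = np.mean((np.polyval(coef, Xv) - Yv) ** 2)  # empirical risk
    penalty = (deg + 1) / len(Xv)           # crude complexity proxy
    if risk + penalty < best_score:
        best, best_score = deg, risk + penalty
print("selected degree:", best)
```

The paper's estimator replaces the dimension penalty with a data-driven complexity computed from the size of an empirical cover of each class.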
Abstract:
Ground clutter caused by anomalous propagation (anaprop) can seriously affect radar rain-rate estimates, particularly in fully automatic radar processing systems, and, if not filtered, can produce frequent false alarms. A statistical study of anomalous propagation detected from two operational C-band radars in the northern Italian region of Emilia Romagna is discussed, paying particular attention to its diurnal and seasonal variability. The analysis shows a high incidence of anaprop in summer, mainly in the morning and evening, due to the humid and hot summer climate of the Po Valley, particularly in the coastal zone. A comparison between different techniques and datasets for retrieving the vertical profile of the refractive index gradient in the boundary layer is then presented; in particular, their capability to detect anomalous propagation conditions is compared. Furthermore, beam path trajectories are simulated using a multilayer ray-tracing model, and the influence of the propagation conditions on the beam trajectory and shape is examined. High-resolution radiosounding data are identified as the best available dataset for reproducing the local propagation conditions accurately, while lower-resolution standard TEMP data suffer from interpolation degradation, and Numerical Weather Prediction model data (Lokal Model) are able to retrieve a tendency towards superrefraction but not to detect ducting conditions. Observing the ray tracing of the centre, lower and upper limits of the radar antenna's 3-dB half-power main beam lobe, it is concluded that ducting layers produce a change in the measured volume and in the power distribution that can lead to an additional error in the reflectivity estimate and, subsequently, in the estimated rainfall rate.
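Propagation conditions of the kind discussed above are conventionally classified from the vertical refractivity gradient dN/dh; a minimal sketch using the commonly quoted thresholds of about -79 and -157 N-units/km (the paper's exact criteria may differ):

```python
def propagation_condition(dNdh):
    """Classify radio propagation from the refractivity gradient
    dN/dh in N-units per km (commonly used threshold values)."""
    if dNdh < -157:
        return "ducting"          # ray bends more than Earth's curvature
    if dNdh < -79:
        return "superrefraction"
    if dNdh <= 0:
        return "normal"           # a standard atmosphere is near -40 N/km
    return "subrefraction"

for g in (-40, -120, -200, 10):
    print(g, propagation_condition(g))
```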
Abstract:
With the advancement of high-throughput sequencing and the dramatic increase of available genetic data, statistical modeling has become an essential part of the field of molecular evolution. Statistical modeling has led to many interesting discoveries in the field, from the detection of highly conserved or diverse regions in a genome to the phylogenetic inference of species' evolutionary history. Among the different types of genome sequences, protein-coding regions are particularly interesting due to their impact on proteins. The building blocks of proteins, i.e. amino acids, are coded by triplets of nucleotides, known as codons. Accordingly, studying the evolution of codons leads to a fundamental understanding of how proteins function and evolve. Current codon models can be classified into three principal groups: mechanistic codon models, empirical codon models and hybrid ones. The mechanistic models attract particular attention due to the clarity of their underlying biological assumptions and parameters. However, they suffer from simplified assumptions that are required to overcome the burden of computational complexity. The main assumptions applied to current mechanistic codon models are (a) double and triple substitutions of nucleotides within codons are negligible, (b) there is no mutation variation among the nucleotides of a single codon and (c) the HKY nucleotide model is sufficient to capture the essence of transition-transversion rates at the nucleotide level. In this thesis, I develop a framework of mechanistic codon models, named the KCM-based model family framework, based on holding or relaxing the mentioned assumptions. Accordingly, eight different models are proposed from the eight combinations of holding or relaxing the assumptions, from the simplest one that holds them all to the most general one that relaxes them all.
The models derived from the proposed framework allow me to investigate the biological plausibility of the three simplified assumptions on real data sets, as well as to find the best model aligned with the underlying characteristics of each data set. Our experiments show that in none of the real data sets is it realistic to hold all three assumptions; using simple models that hold them can therefore be misleading and result in inaccurate parameter estimates. A second objective of the thesis is to develop a generalized mechanistic codon model that relaxes all three simplifying assumptions while remaining computationally efficient, by using a matrix operation called the Kronecker product. Our experiments show that, on a randomly chosen collection of data sets, the proposed generalized mechanistic codon model outperforms the other codon models with respect to the AICc metric in about half of the data sets. In addition, I show through several experiments that the proposed general model is biologically plausible.
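The role of the Kronecker product mentioned above can be illustrated for the single-substitution case: with one 4×4 nucleotide rate matrix per codon position (relaxing the uniform-mutation assumption (b)), a 64×64 codon generator in which exactly one position changes per event is a Kronecker sum. A minimal numpy sketch with toy matrices (not the thesis's parameterisation):

```python
import numpy as np

def toy_rate_matrix(rates):
    """Build a 4x4 nucleotide rate matrix (A, C, G, T) from
    off-diagonal rates; the diagonal is set so each row sums to zero."""
    Q = np.array(rates, dtype=float)
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, -Q.sum(axis=1))
    return Q

base = toy_rate_matrix([[0, 1, 2, 1],
                        [1, 0, 1, 2],
                        [2, 1, 0, 1],
                        [1, 2, 1, 0]])
# Position-specific matrices: mutation rates vary across codon positions.
Q1, Q2, Q3 = base, 0.5 * base, 2.0 * base

I = np.eye(4)
# Kronecker sum: 64x64 codon generator allowing exactly one
# nucleotide substitution per event (assumption (a) still held).
Q_codon = (np.kron(np.kron(Q1, I), I)
           + np.kron(np.kron(I, Q2), I)
           + np.kron(np.kron(I, I), Q3))
print(Q_codon.shape)  # (64, 64)
```

Relaxing assumption (a) as well would add terms coupling two or three positions; the Kronecker structure is what keeps such generalized models computationally tractable.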
Abstract:
Infiltration is the passage of water through the soil surface, influenced by the soil type and cultivation and by the soil roughness, surface cover and water content. Infiltration absorbs most of the rainwater and is therefore crucial for planning mechanical conservation practices to manage runoff. This study determined water infiltration in two soil types under different types of management and cultivation, with simulated rainfall of varying intensity and duration applied at different times, and fitted the empirical Horton model to the infiltration data. The study was conducted in southern Brazil, on Dystric Nitisol (Nitossolo Bruno aluminoférrico húmico) and Humic Cambisol (Cambissolo Húmico alumínico léptico) soils, to assess the following situations: on the Nitisol, rains were simulated from 2001 to 2012 in 31 treatments, differing in crop type, sowing direction, type of soil opener on the seeder, amount and type of crop residue and amount of liquid swine manure applied; on the Cambisol, rains were simulated from 2006 to 2012 and 18 treatments were evaluated, differing in crop, seeding direction and crop residue type. The final, constant water infiltration rate into the soil varies significantly with the soil type (30.2 mm h⁻¹ in the Nitisol and 6.6 mm h⁻¹ in the Cambisol), regardless of the management system, application time and rain intensity and duration. At the end of the rainfalls, soil-water infiltration varies significantly with the management system, the timing of application and the rain intensity and duration, with values ranging from 13 to 59 mm h⁻¹ in the two studied soils.
The characteristics of the sowing operation in terms of relief, crop type and the amount and type of crop residue influenced soil water infiltration: in the Nitisol, the values for contour and downhill seeding vary between 27 and 43 mm h⁻¹, respectively, with crop residues of corn, wheat and soybean, while in the Cambisol the variation is between 2 and 36 mm h⁻¹, respectively, for soybean and corn crops. The Horton model fits the values of the water infiltration rate into the soil, resulting in the equation i = 30.2 + (68.2 - 30.2) e^(-0.0371t) (R² = 0.94**) for the Nitisol and i = 6.6 + (64.5 - 6.6) e^(-0.0537t) (R² = 0.99**) for the Cambisol.
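The two fitted Horton equations can be evaluated directly; a minimal sketch (time is assumed to be in minutes, as is usual for simulated-rainfall trials; check the paper for the exact convention):

```python
import math

def horton(t, i0, i_f, k):
    """Horton infiltration rate i(t) = i_f + (i0 - i_f) * exp(-k*t),
    in mm/h, using the fitted constants reported in the abstract."""
    return i_f + (i0 - i_f) * math.exp(-k * t)

nitisol = dict(i0=68.2, i_f=30.2, k=0.0371)    # R² = 0.94
cambisol = dict(i0=64.5, i_f=6.6, k=0.0537)    # R² = 0.99

# Infiltration decays from i0 toward the final constant rate i_f.
for t in (0, 30, 60, 120):
    print(t, round(horton(t, **nitisol), 1), round(horton(t, **cambisol), 1))
```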
Abstract:
In this study we propose an evaluation of the angular effects altering the spectral response of land cover across multi-angle remote sensing image acquisitions. The shift in the statistical distribution of the pixels observed in an in-track sequence of WorldView-2 images is analyzed by means of a kernel-based measure of distance between probability distributions. Afterwards, the portability of supervised classifiers across the sequence is investigated by looking at the evolution of the classification accuracy as the observation angle changes. In this context, the efficiency of various physically and statistically based preprocessing methods in obtaining angle-invariant data spaces is compared and possible synergies are discussed.
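The abstract does not name the kernel-based distance used; a common choice is the squared maximum mean discrepancy (MMD) with an RBF kernel, sketched here on synthetic stand-ins for 8-band pixel samples:

```python
import numpy as np

def mmd2_rbf(X, Y, gamma=0.1):
    """Biased (V-statistic) estimate of the squared maximum mean
    discrepancy between samples X and Y, RBF kernel exp(-gamma * d²)."""
    def gram(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return gram(X, X).mean() + gram(Y, Y).mean() - 2 * gram(X, Y).mean()

rng = np.random.default_rng(1)
near_nadir = rng.normal(0.0, 1.0, (300, 8))  # stand-in for 8-band pixels
off_nadir = rng.normal(0.4, 1.0, (300, 8))   # angularly shifted distribution
same_angle = rng.normal(0.0, 1.0, (300, 8))

print(mmd2_rbf(near_nadir, off_nadir))   # larger: the distributions differ
print(mmd2_rbf(near_nadir, same_angle))  # near zero: same distribution
```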
Abstract:
Microsatellite loci mutate at an extremely high rate and are generally thought to evolve through a stepwise mutation model. Several differentiation statistics taking into account the particular mutation scheme of microsatellites have been proposed. The most commonly used is R(ST), which is independent of the mutation rate under a generalized stepwise mutation model. F(ST) and R(ST) are commonly reported in the literature, but often differ widely. Here we compare their statistical performance using individual-based simulations of a finite island model. The simulations were run under different levels of gene flow, mutation rates, and population numbers and sizes. In addition to the per-locus statistical properties, we compare two ways of combining R(ST) over loci. Our simulations show that even under a strict stepwise mutation model, no statistic is best overall. All estimators suffer to different extents from large bias and variance. While R(ST) better reflects population differentiation in populations characterized by very low gene exchange, F(ST) gives better estimates in cases of high levels of gene flow. The number of loci sampled (12, 24, or 96) has only a minor effect on the relative performance of the estimators under study. For all estimators there is a striking effect of the number of samples, with the differentiation estimates showing very odd distributions for two samples.
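The stepwise mutation model referred to above is easy to simulate; a minimal single-locus, single-population Wright-Fisher sketch (drift plus single-step mutation, not the authors' finite island-model setup):

```python
import random

random.seed(42)
pop = [20] * 100          # microsatellite repeat counts in one population
MU = 0.001                # per-copy, per-generation mutation rate

for _ in range(5000):     # Wright-Fisher generations: resample, then mutate
    parents = [random.choice(pop) for _ in range(len(pop))]
    pop = [a + random.choice((-1, 1)) if random.random() < MU else a
           for a in parents]

# Under the stepwise model, allele sizes random-walk in repeat number,
# which is why R(ST) weights allele-size differences.
print("allele range:", min(pop), "-", max(pop))
```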
Abstract:
Background: In longitudinal studies where subjects experience recurrent events over a period of time, such as respiratory infections, fever or diarrhea, statistical methods are required to take the within-subject correlation into account. Methods: For repeated-events data with censored failure times, the independent increment (AG), marginal (WLW) and conditional (PWP) models are three multiple-failure models that generalize Cox's proportional hazards model. In this paper, we review the efficiency, accuracy and robustness of all three models under simulated scenarios with varying degrees of within-subject correlation, censoring levels, maximum number of possible recurrences and sample size. We also study the methods' performance on a real dataset from a cohort study of bronchial obstruction. Results: We find substantial differences between the methods, and there is no single optimal method. AG and PWP seem preferable to WLW for low correlation levels, but the situation reverses for high correlations. Conclusions: All methods are stable in the presence of censoring, worsen with increasing recurrence levels, and share a bias problem which, among other consequences, makes asymptotic normal confidence intervals not fully reliable, although they are well developed theoretically.