937 results for Statistical methodologies
Abstract:
Most current methods for adult skeletal age-at-death estimation are based on American samples comprising individuals of European and African ancestry. Our limited understanding of population variability hampers efforts to apply these techniques to skeletal populations around the world, especially in global forensic contexts. Further, documented skeletal samples are rare, which limits our ability to test the techniques. The objective of this paper is to test three pelvic macroscopic methods ((1) Suchey-Brooks, (2) Lovejoy, (3) Buckberry and Chamberlain) on a documented modern Spanish sample. These methods were selected because they are popular among Spanish anthropologists and because they have never been tested on a Spanish sample. The study sample consists of 80 individuals (55 ♂ and 25 ♀) of known sex and age from the Valladolid collection. Results indicate that in all three methods, levels of bias and inaccuracy increase with age. The Lovejoy method performs poorly (27%) compared with Suchey-Brooks (71%) and Buckberry and Chamberlain (86%). However, the correlations between phases and chronological ages are low and comparable across the three methods (< 0.395). The apparent accuracy of the Suchey-Brooks and Buckberry and Chamberlain methods is largely due to the broad width of the methods' estimated intervals. This study suggests that before these three methodologies are applied systematically to Spanish populations, further statistical modeling and research into the covariance of chronological age with morphological change are necessary. Future methods should be developed specifically for various world populations, and should allow for both precision and flexibility in age estimation.
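The abstract above reports bias and inaccuracy without defining them. A minimal sketch, assuming the definitions commonly used in age-estimation studies (bias as the mean signed error, inaccuracy as the mean absolute error, both in years); the ages below are hypothetical and are not taken from the Valladolid sample:

```python
from statistics import mean

def bias_and_inaccuracy(estimated, actual):
    """Return (bias, inaccuracy) in years for paired age estimates.

    Assumed definitions (not stated in the abstract):
      bias       = mean(estimated - actual)    signed, shows over/under-ageing
      inaccuracy = mean(|estimated - actual|)  unsigned average error
    """
    errors = [e - a for e, a in zip(estimated, actual)]
    bias = mean(errors)
    inaccuracy = mean([abs(x) for x in errors])
    return bias, inaccuracy

# Hypothetical estimated and documented ages; the error grows with age.
estimated = [25, 38, 45, 52, 60]
actual = [23, 40, 50, 65, 80]

bias, inaccuracy = bias_and_inaccuracy(estimated, actual)
print(f"bias = {bias:.1f} years, inaccuracy = {inaccuracy:.1f} years")
```

A negative bias means the method tends to under-age individuals, which is the pattern these made-up values were chosen to show for older adults.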
Abstract:
Many people regard the concept of hypothesis testing as fundamental to inferential statistics. Various schools of thought, in particular frequentist and Bayesian, have promoted radically different solutions for deciding on the plausibility of competing hypotheses. Comprehensive philosophical comparisons of their advantages and drawbacks are widely available and continue to fuel extensive debate in the literature. More recently, a controversial discussion was initiated by the editorial decision of a scientific journal [1] to refuse any paper submitted for publication that contains null hypothesis testing procedures. Since the large majority of papers published in forensic journals propose the evaluation of statistical evidence based on so-called p-values, it is of interest to bring the discussion of this journal's decision to the forensic science community. This paper aims to provide forensic science researchers with a primer on the main concepts and their implications for making informed methodological choices.
Abstract:
This article proposes a checklist to improve statistical reporting in manuscripts submitted to Public Understanding of Science. In general, these guidelines will allow reviewers (and readers) to judge whether the evidence provided in a manuscript is relevant. The article ends with further suggestions for improving the statistical quality of the journal.
Abstract:
This thesis develops a comprehensive and flexible statistical framework for the analysis and detection of space, time and space-time clusters of environmental point data. The clustering methods developed were applied to both simulated datasets and real-world environmental phenomena; however, only the cases of forest fires in the Canton of Ticino (Switzerland) and in Portugal are expounded in this document. Normally, environmental phenomena can be modelled as stochastic point processes in which each event, e.g. the forest fire ignition point, is characterised by its spatial location and occurrence in time. Additionally, information such as burned area, ignition causes, land use, and topographic, climatic and meteorological features can also be used to characterise the studied phenomenon. The characterisation of the space-time pattern is thus a powerful tool for understanding the distribution and behaviour of the events and their correlation with underlying processes, for instance socio-economic, environmental and meteorological factors. Consequently, we propose a methodology based on the adaptation and application of statistical and fractal point process measures for both global (e.g. the Morisita Index, the Box-counting fractal method, the multifractal formalism and Ripley's K-function) and local (e.g. Scan Statistics) analysis. Many measures describing the space-time distribution of environmental phenomena have been proposed in a wide variety of disciplines; nevertheless, most of these measures are global in character and do not consider the complex spatial constraints, high variability and multivariate nature of the events. 
Therefore, we propose a statistical framework that takes into account the complexities of the geographical space in which phenomena take place, by introducing the Validity Domain concept and carrying out clustering analyses on data with different constrained geographical spaces, thereby assessing the relative degree of clustering of the real distribution. Moreover, exclusively for the forest fire case, this research proposes two new methodologies: one for defining and mapping the Wildland-Urban Interface (WUI), described as the interaction zone between burnable vegetation and anthropogenic infrastructures, and one for predicting fire ignition susceptibility. In this regard, the main objective of this Thesis was to carry out basic statistical/geospatial research with a strong applied component, to analyse and describe complex phenomena, and to overcome unsolved methodological problems in the characterisation of space-time patterns, in particular those of forest fire occurrences. Thus, this Thesis responds to the increasing demand for environmental monitoring and management tools for the assessment of natural and anthropogenic hazards and risks, sustainable development, retrospective success analysis, etc. The major contributions of this work were presented at national and international conferences and published in 5 scientific journals. National and international collaborations were also established and successfully accomplished. -- This thesis develops a complete and flexible statistical methodology for the analysis and detection of spatial, temporal and spatio-temporal structures in environmental data represented as point patterns. The methods developed here were applied to simulated datasets as well as to real environmental phenomena; however, only the cases of forest fires in the Canton of Ticino (Switzerland) and in Portugal are explained in this document. 
Normally, environmental phenomena can be modelled as stochastic point processes in which each event, e.g. the ignition point of a forest fire, is determined by its spatial location and its occurrence in time. In addition, information such as burned area, ignition causes, land use, and topographic, climatic and meteorological characteristics can also be used to characterise the phenomenon studied. Consequently, defining the space-time structure is a powerful tool for understanding the distribution of the phenomenon and its correlation with underlying processes such as socio-economic, environmental and meteorological factors. We therefore propose a methodology based on the adaptation and application of statistical and fractal point process measures for global analysis (e.g. the Morisita index, the box-counting fractal dimension, the multifractal formalism and Ripley's K-function) and local analysis (e.g. the scan statistic). Numerous measures describing the space-time structures of environmental phenomena can be found in the literature. Nevertheless, most of these measures are global in character and do not consider complex spatial constraints or the high variability and multivariate nature of the events. To this end, the methodology proposed here takes into account the complexities of the geographical space in which the phenomenon takes place, by introducing the concept of the Validity Domain and applying spatial analysis measures to data with different geographical constraints. This allows the relative degree of spatial/temporal aggregation of the structures of the observed phenomenon to be assessed. 
In addition, exclusively for the forest fire case, this research also proposes two new methodologies for defining and mapping peri-urban zones, described as anthropogenic spaces close to wild vegetation or forest, and for predicting fire ignition susceptibility. In this regard, the main objective of this Thesis was to carry out statistical/geospatial research with a strong application to real cases, in order to analyse and describe complex environmental phenomena and to overcome unresolved methodological problems in the characterisation of space-time structures, particularly those of forest fire occurrences. Thus, this Thesis responds to the growing demand for environmental management and monitoring and for tools to assess natural and anthropogenic risks and hazards. The major contributions of this work were presented at national and international conferences and were also published in 5 peer-reviewed international journals. National and international collaborations were also established and successfully completed.
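Of the global measures named in the abstract above, the Morisita index is the simplest to illustrate. A minimal sketch, assuming the standard quadrat-count form I = Q · Σ n_i(n_i − 1) / (N(N − 1)) on a regular grid over the unit square; it does not include the thesis's Validity Domain adaptation, and the function name and point sets are hypothetical:

```python
def morisita_index(points, cells_per_side):
    """Morisita index for points in the unit square on a regular grid of quadrats.

    Values well above 1 suggest clustering, values near 1 a random pattern,
    and values below 1 a regular (dispersed) pattern.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    q = cells_per_side
    counts = [0] * (q * q)
    for x, y in points:
        # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
        col = min(int(x * q), q - 1)
        row = min(int(y * q), q - 1)
        counts[row * q + col] += 1
    pairs_within_quadrats = sum(c * (c - 1) for c in counts)
    return q * q * pairs_within_quadrats / (n * (n - 1))

# Hypothetical ignition points: one tight cluster versus one point per quadrat.
clustered = [(0.10, 0.10), (0.12, 0.11), (0.11, 0.13), (0.13, 0.12)]
regular = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]

print(morisita_index(clustered, 2))  # 4.0
print(morisita_index(regular, 2))    # 0.0
```

In practice the index is computed over a range of quadrat sizes, since the scale at which it departs from 1 indicates the scale of clustering.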
Abstract:
Construction of multiple sequence alignments is a fundamental task in Bioinformatics. Multiple sequence alignments are a prerequisite for many Bioinformatics methods, and the quality of such methods can therefore depend critically on the quality of the alignment. However, automatic construction of a multiple sequence alignment for a set of remotely related sequences does not always provide biologically relevant alignments. Therefore, there is a need for an objective approach for evaluating the quality of automatically aligned sequences. The profile hidden Markov model is a powerful approach in comparative genomics. In the profile hidden Markov model, the symbol probabilities are estimated at each conserved alignment position. This can increase the dimension of the parameter space and cause an overfitting problem. These two research problems are both related to conservation. We have developed statistical measures for quantifying the conservation of multiple sequence alignments. Two types of methods are considered: those identifying conserved residues in an alignment position, and those calculating positional conservation scores. The positional conservation score was exploited in a statistical prediction model for assessing the quality of multiple sequence alignments. The residue conservation score was used as part of the emission probability estimation method proposed for profile hidden Markov models. The predicted alignment quality scores correlated highly with the correct alignment quality scores, indicating that our method is reliable for assessing the quality of any multiple sequence alignment. The comparison of the emission probability estimation method with the maximum likelihood method showed that the number of estimated parameters in the model was dramatically decreased, while the same level of accuracy was maintained. 
To conclude, we have shown that conservation can be successfully used in the statistical model for alignment quality assessment and in the estimation of emission probabilities in the profile hidden Markov models.
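The abstract above does not say which positional conservation score the authors developed. As an illustration only, the sketch below swaps in a widely used stand-in, a Shannon-entropy score normalised by the alphabet size; the function name, the gap handling and the toy alignment are all assumptions:

```python
import math
from collections import Counter

def column_conservation(column, alphabet_size=20):
    """Entropy-based conservation of one alignment column, in [0, 1].

    1.0 means a single residue type occupies the column; lower values mean
    more variability. Gap characters ('-') are ignored (an assumption).
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((k / total) * math.log2(k / total) for k in counts.values())
    return 1.0 - entropy / math.log2(alphabet_size)

# Hypothetical protein alignment: four sequences, four columns.
alignment = ["MKVL", "MKIL", "MRVL", "M-AL"]

# zip(*alignment) walks the alignment column by column.
scores = [column_conservation(col) for col in zip(*alignment)]
for position, score in enumerate(scores, start=1):
    print(position, round(score, 3))
```

Columns 1 and 4 are fully conserved and score 1.0, while column 3, with three residue types, scores lowest.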
Abstract:
In this thesis, X-ray tomography is discussed from the Bayesian statistical viewpoint. The unknown parameters are assumed to be random variables and, in contrast to traditional methods, the solution is obtained as a large sample from the distribution of all possible solutions. As an introduction to tomography, an inversion formula for the Radon transform on a plane is presented. The widely used filtered backprojection algorithm is derived. The traditional regularization methods are presented in sufficient detail to ground the Bayesian approach. The measurements are photon counts at the detector pixels, so the assumption of a Poisson-distributed measurement error is justified. Often the error is assumed to be Gaussian, although the electronic noise caused by the measurement device can change the error structure. The assumption of Gaussian measurement error is discussed. The thesis also discusses the use of different prior distributions in X-ray tomography. Especially in severely ill-posed problems, the choice of a suitable prior is the main part of the whole solution process. In the empirical part, the presented prior distributions are tested using simulated measurements, and the effects that the different prior distributions produce are shown. The use of a prior is shown to be essential in severely ill-posed problems.
Abstract:
This thesis focuses on statistical analysis methods and proposes the use of Bayesian inference to extract the information contained in experimental data by estimating the parameters of an Ebola model. The model is a system of differential equations expressing the behavior and dynamics of Ebola. Two sets of data (onset and death data) were both used to estimate the parameters, which was not done in previous work (Chowell, 2004). To be able to use both datasets, a new version of the model was built. Model parameters were estimated and then used to calculate the basic reproduction number and to study the disease-free equilibrium. The parameter estimates were useful for determining how well the model fits the data and how good the estimates were, in terms of the information they provided about the possible relationship between variables. The solution showed that the Ebola model fits the observed onset data at 98.95% and the observed death data at 93.6%. Since Bayesian inference cannot be performed analytically, the Markov chain Monte Carlo approach was used to generate samples from the posterior distribution over the parameters. The samples were used to check the accuracy of the model and other characteristics of the target posteriors.
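The abstract above does not specify which Markov chain Monte Carlo algorithm was used or give the model equations. To show the general idea of drawing posterior samples when no closed form is available, here is a minimal random-walk Metropolis sketch on a deliberately simple stand-in problem (a Poisson rate with a flat prior), not the thesis's Ebola model; the counts, function names and tuning values are hypothetical:

```python
import math
import random

def metropolis(log_post, start, n_samples, step, rng):
    """Random-walk Metropolis sampler for a one-dimensional log-posterior."""
    samples = []
    current, current_lp = start, log_post(start)
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior(proposal) / posterior(current)).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples

# Hypothetical daily case counts, modelled as Poisson(rate) with a flat prior.
counts = [3, 5, 4, 6, 2]

def log_posterior(rate):
    if rate <= 0:
        return float("-inf")  # the rate must be positive
    return sum(counts) * math.log(rate) - len(counts) * rate

rng = random.Random(0)  # fixed seed so the run is reproducible
chain = metropolis(log_posterior, start=1.0, n_samples=20000, step=1.0, rng=rng)
kept = chain[2000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
print(round(posterior_mean, 2))
```

For this toy problem the posterior is a Gamma distribution with mean (20 + 1) / 5 = 4.2, so the sample mean can be checked against a known answer; for a differential-equation model the same loop would evaluate the likelihood by solving the system for each proposed parameter set.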
Abstract:
The optimal design of a heat exchanger system is based on given model parameters together with given standard ranges for the machine design variables. The goals set were successfully met: minimizing the Life Cycle Cost (LCC) function, which represents the price of the saved energy; maximizing the momentary heat recovery output subject to the given constraints; and taking into account the uncertainty in the models. The Nondominated Sorting Genetic Algorithm II (NSGA-II) is presented for the design optimization of the system and implemented in the Matlab environment. Markov Chain Monte Carlo (MCMC) methods are also used to take into account the uncertainty in the models. Results show that the price of saved energy can be optimized. A wet heat exchanger is found to be more efficient and beneficial than a dry heat exchanger, even though its construction is expensive (160 EUR/m²) compared with that of a dry heat exchanger (50 EUR/m²). It was found that a longer lifetime favours higher CAPEX and lower OPEX, and vice versa, and the effect of the uncertainty in the models was identified in a simplified case of minimizing the area of a dry heat exchanger.
Abstract:
The target organisation of this study is the internal raw-material purchaser and supplier of a large industrial company. The study examines what the value of the target organisation's procurement customer relationships consists of, and how existing business data can be used to study, assess and classify the value of trades and customer relationships objectively, reliably and independently of time. The theoretical part presents approaches and methods for refining existing data into new stakeholder knowledge for business use, and reviews the principles and models of customer profitability analysis, portfolio analysis and customer segmentation. Based on these theories and models, a customer-relationship valuation and classification model is built, tailored to the target organisation and based on indexed price, volume and trade-recurrence variables. The valuation and classification model is tested on a sample of 389,336 trade rows drawn from business data for 2003–2007, containing 42,186 customer relationships to be assessed. The most significant finding is a group of about 5,000 customer relationships that are clearly more expensive than average. The data and its anomalies are tested with statistical methods to identify the factors that affect and explain the value of a customer relationship. Finally, the significance of the valuation model as a tool for more analytical purchasing and customer relationship management is discussed, and a few suggestions for improvement are presented.
Abstract:
When laboratory intercomparison exercises are conducted, there is no a priori dependence of the concentration of a certain compound determined in one laboratory on that determined in another laboratory or laboratories. The same applies when comparing different methodologies. An existing data set of total mercury readings in fish muscle samples from a Brazilian intercomparison exercise was used to show that correlation analysis is the most effective statistical tool for this kind of experiment. Problems associated with alternative analytical tools, such as comparison of means, the paired t-test and regression analysis, are discussed.
Abstract:
Four different pseudopotentials and three methodologies were employed to calculate the geometries and frequencies of metal complexes such as [M(NH3)2X2] [X=halogen, M=Zn, Cd] and [Hg(NH3)2]Cl2. The vibrational assignments were carefully checked and compared with the theoretically calculated ones. Graphical procedures were employed to estimate family errors and their average behavior. The calculated results show that the SBK-X basis set gives the best results for the geometries and calculated frequencies, both for individual species and in the statistical results. Its use is recommended, mainly if the neighbouring atoms are described with similar pseudopotentials. Excellent results were also obtained with the Hay and Wadt pseudopotential.
Abstract:
Two high-performance liquid chromatography (HPLC) methods for the quantitative determination of indinavir sulfate were tested, validated and statistically compared. Assays were carried out using, as mobile phases, mixtures of dibutylammonium phosphate buffer pH 6.5 and acetonitrile (55:45) at 1 mL/min or citrate buffer pH 5 and acetonitrile (60:40) at 1 mL/min, with an octylsilane column (RP-8) and a UV spectrophotometric detector at 260 nm. Both methods showed good sensitivity, linearity, precision and accuracy. Statistical analysis using Student's t-test for the determination of indinavir sulfate in raw material and capsules indicated no statistically significant difference between the two methods.
Abstract:
The publication of the fourth IPCC report, together with the number of research results reported in recent years on the regionalization of climate projections, motivated the update of the report on climate change in Catalonia. Specifically, the new IPCC report contains new climate projections at global and continental scales, while several international projects (especially the European projects PRUDENCE and ENSEMBLES) have produced continental-scale climate projections that make it possible to distinguish between European regions. For Spain, some of these results have been included in a document commissioned by the "State Agency of Meteorology". In addition, initiatives are being developed within Catalonia (in particular, by the Meteorological Service of Catalonia) to downscale climate projections for this area. The present paper synthesizes the results of these and other previously published studies, as well as our own analysis of results from the ENSEMBLES project. The aim is to propose scenarios of variation in temperature and rainfall in Catalonia during the 21st century. By the middle of this century, temperatures could rise by up to 2 °C compared with those of the late 20th century. These increases would probably be higher in summer than in winter, and generalized across the territory but less pronounced in coastal areas. Rainfall, however, would not change much, although it could decrease slightly. Towards the end of the 21st century, temperatures could rise to about 5 °C above those of the last century, while average rainfall could decrease by more than 10%. Increases in temperature would be higher in summer and in areas further from the coast. Rainfall would decrease especially during the summer, while it could even increase in winter in mountainous areas such as the Pyrenees.