946 results for Statistical Tools
Abstract:
Water quality data are often collected at different sites over time to improve water quality management. Such data usually exhibit the following characteristics: non-normal distribution, presence of outliers, missing values, values below detection limits (censored values), and serial dependence. Applying appropriate statistical methodology when analyzing water quality data is essential for drawing valid conclusions and hence providing useful advice for water management. In this chapter, we will provide and demonstrate various statistical tools for analyzing such data, and will also introduce how to use the statistical software R to apply them. A dataset collected from the Susquehanna River Basin will be used to demonstrate the statistical methods presented in this chapter. The dataset can be downloaded from the website http://www.srbc.net/programs/CBP/nutrientprogram.htm.
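As an illustration of the kind of analysis the chapter covers (the chapter itself works in R), here is a minimal Python sketch of a rank-based trend test on hypothetical concentrations that include values below a detection limit; the simple substitution of censored values used here is only a placeholder for the more rigorous censored-data methods the chapter describes.

```python
# Minimal sketch (hypothetical data, not the Susquehanna dataset): a
# non-parametric trend test that tolerates non-normal data and outliers.
import numpy as np
from scipy.stats import kendalltau

# Hypothetical monthly concentrations; None marks values below the detection limit.
raw = [0.12, 0.09, None, 0.15, 0.11, None, 0.18, 0.22, 0.19, 0.25, None, 0.30]
detection_limit = 0.05

# Crude substitution for censored values (rank-based and survival-analysis
# methods handle censoring more defensibly).
conc = np.array([detection_limit if v is None else v for v in raw])
time = np.arange(len(conc))

# Kendall's tau against time as a simple monotonic-trend test.
tau, p_value = kendalltau(time, conc)
print(f"Kendall tau = {tau:.2f}, p = {p_value:.3f}")
```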
Abstract:
Microfluidics has recently emerged as a new method of manufacturing liposomes, allowing reproducible mixing within milliseconds on the nanoliter scale. Here we investigate microfluidics-based manufacturing of liposomes. The aim of these studies was to assess the parameters of a microfluidic process by varying the total flow rate (TFR) and the flow rate ratio (FRR) of the solvent and aqueous phases. Design of experiments and multivariate data analysis were used to increase process understanding and to develop predictive and correlative models. A high FRR led to the bottom-up synthesis of liposomes and correlated strongly with vesicle size, demonstrating the ability to control liposome size in-process; liposomes of 50 nm were reproducibly manufactured. Furthermore, we demonstrate the potential for high-throughput manufacturing of liposomes using microfluidics, with a four-fold increase in volumetric flow rate while maintaining liposome characteristics. The efficacy of these liposomes was demonstrated in transfection studies and modelled using predictive modelling. Mathematical modelling identified FRR as the key variable in the microfluidic process, with the greatest impact on liposome size, polydispersity and transfection efficiency. This study demonstrates microfluidics as a robust, high-throughput method for the scalable and highly reproducible manufacture of size-controlled liposomes. Furthermore, the application of statistically based process control increases understanding and allows the generation of a design space for controlled particle characteristics.
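As a hedged illustration of the correlative modelling described above (synthetic design points, not the study's data), the sketch below fits a simple linear model of liposome size against FRR and TFR.

```python
# Minimal sketch: least-squares fit of size ~ intercept + b1*FRR + b2*TFR
# on hypothetical design-of-experiments data.
import numpy as np

# Hypothetical design points: columns are FRR (aqueous:solvent) and TFR (mL/min).
X = np.array([[1, 5], [1, 10], [3, 5], [3, 10], [5, 5], [5, 10]], dtype=float)
size_nm = np.array([120.0, 115.0, 75.0, 72.0, 52.0, 50.0])  # hypothetical vesicle sizes

A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, size_nm, rcond=None)
intercept, b_frr, b_tfr = coef
print(f"size ≈ {intercept:.1f} + {b_frr:.1f}*FRR + {b_tfr:.2f}*TFR")
# In this toy data the FRR term carries nearly all of the effect, mirroring
# the finding that FRR is the key variable for size.
```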
Abstract:
This thesis explored the development of statistical methods to support the monitoring and improvement of the quality of treatment delivered to patients undergoing coronary angioplasty procedures. To achieve this goal, a suite of outcome measures was identified to characterise the performance of the service, statistical tools were developed to monitor the various indicators, and measures to strengthen governance processes were implemented and validated. Although this work focused on these aims in the context of an angioplasty service located at a single clinical site, the tools and techniques were developed mindful of their potential application to other clinical specialties and a wider, potentially national, scope.
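The abstract does not specify which monitoring tools were developed; as one common choice for monitoring clinical outcome indicators, here is a minimal sketch of a simple CUSUM over binary procedural outcomes, with an assumed target rate and signal limit (illustrative values only).

```python
# Minimal sketch: a tabular upper CUSUM that accumulates excess adverse
# events above a target rate and signals when the sum crosses a limit.
import numpy as np

outcomes = np.array([0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1])  # 1 = adverse event (hypothetical)
target_rate = 0.2   # acceptable event rate (assumption)
threshold = 2.0     # signal limit (assumption; normally tuned to desired error rates)

cusum = 0.0
for i, y in enumerate(outcomes, start=1):
    cusum = max(0.0, cusum + (y - target_rate))
    if cusum > threshold:
        print(f"Signal at procedure {i}: cumulative excess = {cusum:.1f}")
        break
```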
Abstract:
Natural protein sequences are the net result of the interplay between mutation, natural selection, and stochastic drift over evolutionary time. Probabilistic models of molecular evolution that account for these different factors have been substantially improved in recent years. In particular, models explicitly incorporating protein structure and inter-site dependencies have been proposed, along with the statistical tools needed to assess their performance. However, despite significant advances in this direction, only highly simplified representations of protein structure have been used so far. In this context, the general subject of this thesis is the modelling of protein three-dimensional structure, taking into account the practical limitations imposed by the use of computationally intensive phylogenetic methods. First, a general statistical method is presented for optimizing the parameters of a statistical potential (a pseudo-energy measuring sequence-structure compatibility). The functional form of the potential is then refined, increasing the level of detail in the structural description without increasing computational cost. Several structural elements are explored: pairwise residue interactions, solvent accessibility, main-chain conformation, and flexibility. The potentials are then included in an evolutionary model and their performance is evaluated in terms of statistical fit to real data and contrasted with standard evolutionary models. Finally, the resulting structurally constrained model is used to better understand the relationships between gene expression level and the selection and conservation of the corresponding protein sequence.
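As an illustration of what a statistical potential evaluates (the energy table, residue classes and contact list below are hypothetical, not the thesis's optimized potential), a sequence-structure pseudo-energy can be scored as a sum of pairwise contact terms.

```python
# Minimal sketch: sequence-structure compatibility as a sum of pairwise
# contact pseudo-energies over positions that are in contact in the structure.
# Hypothetical pseudo-energies (arbitrary units) for coarse residue classes.
contact_energy = {
    ("HYD", "HYD"): -1.0,   # hydrophobic-hydrophobic contacts: favourable
    ("HYD", "POL"): 0.3,
    ("POL", "POL"): -0.2,
}

def pair_energy(a, b):
    """Look up the symmetric contact pseudo-energy for residue classes a and b."""
    return contact_energy.get((a, b), contact_energy.get((b, a), 0.0))

# Hypothetical sequence (as residue classes) threaded onto a structure whose
# contact map lists which positions are spatially close.
sequence = ["HYD", "POL", "HYD", "HYD", "POL"]
contacts = [(0, 2), (0, 3), (2, 3), (1, 4)]

score = sum(pair_energy(sequence[i], sequence[j]) for i, j in contacts)
print(f"Sequence-structure pseudo-energy: {score:.1f}")
```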
Abstract:
Pressing global environmental problems highlight the need to develop tools to measure progress towards "sustainability." However, some argue that any such attempt inevitably reflects the views of those creating the tools and produces only highly contested notions of "reality." To explore this tension, we critically assess the Environmental Sustainability Index (ESI), a well-publicized product of the World Economic Forum designed to measure 'sustainability' by ranking nations in league tables based on extensive databases of environmental indicators. By recreating this index, and then using statistical tools (principal components analysis) to test relations between its various components, we challenge the ways in which countries are ranked in the ESI. Based on this analysis, we suggest (1) that the approach taken to aggregate, interpret and present the ESI creates a misleading impression that Western countries are more sustainable than the developing world; (2) that unaccounted-for methodological biases allowed the authors of the ESI to over-generalize the relative 'sustainability' of different countries; and (3) that this has resulted in simplistic conclusions about the relation between economic growth and environmental sustainability. This criticism should not be interpreted as a call for the abandonment of efforts to create standardized, comparable data. Instead, this paper proposes that indicator selection and data collection should draw on a range of voices, including local stakeholders as well as international experts. We also propose that aggregating data into final league ranking tables is too prone to error and creates the illusion of absolute and categorical interpretations. (c) 2004 Elsevier Ltd. All rights reserved.
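A minimal sketch of the kind of principal components check described above, run on a synthetic country-by-indicator matrix rather than the ESI database:

```python
# Minimal sketch: PCA of a standardized country-by-indicator matrix to see
# how much variance a few components capture and how indicators load on them.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
indicators = rng.normal(size=(30, 6))   # 30 hypothetical countries, 6 made-up indicators

# Standardize so no single indicator dominates, then extract two components.
pca = PCA(n_components=2).fit(StandardScaler().fit_transform(indicators))
print("Variance explained:", np.round(pca.explained_variance_ratio_, 2))
print("Loadings on the first component:", np.round(pca.components_[0], 2))
```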
Abstract:
The main objective of this paper is to present the results obtained from the application of artificial neural networks and statistical tools to the automatic identification and classification of faults in electric power distribution systems. The techniques developed to address the problem integrate several approaches that contribute to reliable and safe fault detection. The compilation of results from practical experiments carried out on a pilot distribution feeder demonstrates that the developed techniques provide accurate results, efficiently identifying and classifying the various fault occurrences observed in the feeder. © 2006 IEEE.
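As a hedged illustration only (synthetic features and labels, not the pilot feeder measurements, and not necessarily the paper's architecture), a small neural-network fault classifier might look like this:

```python
# Minimal sketch: a small multilayer perceptron mapping feeder measurements
# to fault classes, trained on made-up data.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
# Hypothetical features, e.g. per-phase RMS currents and a zero-sequence quantity.
X = rng.normal(size=(200, 4))
# Hypothetical labels for three classes (e.g. phase-to-ground, phase-to-phase, no fault).
y = rng.integers(0, 3, size=200)

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=1)
clf.fit(X, y)
print("Training accuracy on synthetic data:", round(clf.score(X, y), 2))
```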
Abstract:
Statistical methods for analyzing agroecological data might not help agroecologists solve all of the current problems concerning crop and animal husbandry, but such methods could well help them assess, tackle, and resolve several agroecological issues in a more reliable and accurate manner. Our goal in this article is therefore to discuss the importance of statistical tools for alternative agronomic approaches, because approaches such as organic farming should be promoted not only by encouraging farmers to deploy agroecological techniques, but also by providing agroecologists with robust analyses based on rigorous statistical procedures.
Abstract:
Coastal sand dunes are an asset first of all as a defence against storm waves and saltwater ingression; moreover, these morphological elements constitute a unique transitional ecosystem between the sea and the land. Dune systems have been a strong focus of coastal science since the last century. Nowadays this branch has assumed even more importance, for two reasons: on one side, new technologies, especially those related to remote sensing, have expanded researchers' possibilities; on the other side, intense urbanization has strongly limited the dunes' capacity for development and fragmented what remained from the last century. This is particularly true in the Ravenna area, where industrialization, combined with the tourist economy and intense subsidence, has left only a few residual dune ridges still active. In this work, three different foredune ridges along the Ravenna coast were studied with laser scanner technology. The research was not limited to analysing volumetric or spatial differences, but also sought new ways and new features to monitor this environment. Moreover, a series of tests was planned to validate data from the Terrestrial Laser Scanner (TLS), with the additional aim of finalizing a methodology for testing 3D survey accuracy. The TLS data were then applied, on the one hand, to test new applications such as the Digital Shoreline Analysis System (DSAS) and Computational Fluid Dynamics (CFD) and to prove their efficacy in this field; on the other hand, the TLS data were used to look for correlations with meteorological indexes (forcing factors) linked to sea and wind (Fryberger's method), applying statistical tools such as Principal Component Analysis (PCA).
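One basic quantity that such multi-temporal TLS analyses rest on is the volume change between successive surveys; below is a minimal sketch on synthetic elevation grids (the cell size and elevations are assumptions, not the Ravenna data).

```python
# Minimal sketch: dune volume change estimated by differencing two gridded
# digital elevation models derived from successive surveys.
import numpy as np

cell_size = 0.5                          # grid resolution in metres (assumption)
rng = np.random.default_rng(2)
dem_2010 = 2.0 + rng.random((100, 100))  # hypothetical elevations (m)
dem_2011 = dem_2010 + rng.normal(0.02, 0.05, size=dem_2010.shape)  # small accretion + noise

# Elevation difference per cell, converted to a net volume change.
diff = dem_2011 - dem_2010
net_volume = diff.sum() * cell_size ** 2
print(f"Net volume change: {net_volume:.1f} m^3 over {diff.size} cells")
```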
Abstract:
There is remarkable and growing concern about quality control at present, which has led to the search for methods capable of effectively addressing reliability analysis as part of statistics. Managers, researchers and engineers must understand that 'statistical thinking' is not just a set of statistical tools. They should start considering 'statistical thinking' from a 'systems' perspective, that is, developing systems that combine specific statistical tools with other methodologies for an activity. The aim of this article is to encourage them (engineers, researchers and managers) to develop this new way of thinking.
Abstract:
Objective: To summarise the extent to which narrative text fields in administrative health data are used to gather information about the event resulting in presentation to a health care provider for treatment of an injury, and to highlight best-practice approaches to conducting narrative text interrogation for injury surveillance purposes.
Design: Systematic review.
Data sources: Electronic databases searched included CINAHL, Google Scholar, Medline, ProQuest, PubMed and PubMed Central. Snowballing strategies were employed by searching the bibliographies of retrieved references to identify relevant associated articles.
Selection criteria: Papers were selected if the study used a health-related database and if the study objectives were (a) to use text fields to identify injury cases or to extract additional information on injury circumstances not available from coded data, (b) to use text fields to assess the accuracy of coded data fields for injury-related cases, or (c) to describe methods or approaches for extracting injury information from text fields.
Methods: The papers identified through the search were independently screened by two authors for inclusion, resulting in 41 papers selected for review. Due to heterogeneity between studies, meta-analysis was not performed.
Results: The majority of papers reviewed (28 papers) focused on describing injury epidemiology trends using coded data with text fields as a supplement; these studies demonstrated the value of text data in providing more specific information beyond what had been coded, enabling case selection or providing circumstantial information. Caveats were expressed about the consistency and completeness of recorded text information, which leads to underestimates when using these data. Four coding-validation papers were reviewed; these studies showed the utility of text data for validating and checking the accuracy of coded data. Seven studies (9 papers) described methods for interrogating injury text fields for systematic extraction of information, using a combination of manual and semi-automated methods to refine and develop algorithms for extracting and classifying coded data from text. Quality assurance approaches to assessing the robustness of the text-extraction methods were discussed in only 8 of the epidemiology papers and 1 of the coding-validation papers. All of the text-interrogation methodology papers described systematic approaches to ensuring the quality of the approach.
Conclusions: Manual review and coding approaches, text search methods, and statistical tools have been utilised to extract data from narrative text and translate it into useable, detailed injury event information. These techniques can be, and have been, applied to administrative datasets to identify specific injury types and add value to previously coded injury datasets. Only a few studies thoroughly described the methods used for text mining, and fewer than half of the reviewed studies used or described quality assurance methods for ensuring the robustness of the approach. New techniques utilising semi-automated computerised approaches and Bayesian/clustering statistical methods offer the potential to further develop and standardise the analysis of narrative text for injury surveillance.
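As a minimal sketch of the simplest of the reviewed techniques, keyword-based flagging of injury mechanisms in free-text narratives (the narratives and keyword lists below are hypothetical, and real studies refine such lists iteratively):

```python
# Minimal sketch: flag injury mechanisms in narrative text with keyword patterns.
import re

narratives = [
    "Pt fell from ladder while cleaning gutters, landed on left arm",
    "Burn to right hand from hot oil while cooking",
    "Struck by cricket ball during training",
]

patterns = {
    "fall": re.compile(r"\b(fell|fall)\b", re.IGNORECASE),
    "burn": re.compile(r"\bburn(ed|t)?\b", re.IGNORECASE),
    "struck": re.compile(r"\b(struck|hit)\b", re.IGNORECASE),
}

for text in narratives:
    flags = [name for name, pat in patterns.items() if pat.search(text)]
    print(flags, "-", text)
```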
Abstract:
The detached housing scheme is a unique and exclusive segment of the residential property market in Malaysia. Generally, the product is expensive, and for many Malaysians who can afford one, owning a detached house is a once-in-a-lifetime opportunity. In spite of this, most owners fail to fully comprehend the specific needs of this type of housing scheme, increasing the risk of the project becoming problematic. Unlike other types of pre-designed 'mass housing' schemes, a detached housing scheme may be built specifically to cater to the needs and demands of its owner. Therefore, maximum owner participation as the development progresses is vital to guarantee the success of the project. In addition, due to its unique design, the house has to comply individually with the requirements and regulations of the relevant authorities. Failure of the owner to recognise this will result in delays, fines and penalties, disputes, and ultimately cost overruns. These circumstances highlight the need for a model to guide the owner through the entire development process of a detached house. Therefore, this research aims to develop a model for successful detached housing development in Malaysia by maximising owner participation during its various development stages. To achieve this, questionnaire surveys and case studies will be employed to capture owners' experiences in developing their detached houses in Malaysia. Relevant statistical tools will be applied to analyse the responses. The results of this study will be synthesised into a model of successful detached housing development for the reference of future detached housing owners in Malaysia.
Abstract:
Of the numerous factors that play a role in fatal pedestrian collisions, the time of day, day of the week, and time of year can be significant determinants. More than 60% of all pedestrian collisions in 2007 occurred at night, despite the presumed decrease in both pedestrian and automobile exposure during the night. Although this trend is partially explained by factors such as fatigue and alcohol consumption, prior analysis of the Fatality Analysis Reporting System database suggests that pedestrian fatalities increase as light decreases, after controlling for other factors. This study applies graphical cross-tabulation, a novel visual assessment approach, to explore the relationships among collision variables. The results reveal that twilight and the first hour of darkness typically see the greatest frequency of fatal pedestrian collisions. These hours are not necessarily the most risky on a per-mile-travelled basis, however, because pedestrian volumes are often still high. Additional analysis is needed to quantify the extent to which pedestrian exposure (walking/crossing activity) in these time periods plays a role in pedestrian crash involvement. Weekly patterns of fatal pedestrian collisions vary by time of year owing to seasonal changes in sunset time. In December, collisions are concentrated around twilight and the first hour of darkness throughout the week, while in June collisions are most heavily concentrated around twilight and the first hours of darkness on Friday and Saturday. Friday and Saturday nights in June may therefore be the most dangerous times for pedestrians. Knowing when pedestrian risk is highest is critically important for formulating effective mitigation strategies and for efficiently investing safety funds. This applied visual approach is a helpful tool for researchers intending to communicate with policy-makers and to identify relationships that can then be tested with more sophisticated statistical tools.
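A minimal sketch of the cross-tabulation underlying the graphical approach, on synthetic records rather than FARS data:

```python
# Minimal sketch: counts of fatal pedestrian collisions by day of week and
# hour of day; shading this grid gives the visual assessment described above.
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 500
records = pd.DataFrame({
    "day": rng.choice(["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"], size=n),
    "hour": rng.integers(0, 24, size=n),
})

table = pd.crosstab(records["day"], records["hour"])
print(table.loc[:, 17:20])   # the twilight / early-darkness hours
```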
Abstract:
Plant biosecurity requires statistical tools to interpret field surveillance data in order to manage pest incursions that threaten crop production and trade. Ultimately, management decisions need to be based on the probability that an area is infested or free of a pest. Current informal approaches to delimiting pest extent rely upon expert ecological interpretation of presence / absence data over space and time. Hierarchical Bayesian models provide a cohesive statistical framework that can formally integrate the available information on both pest ecology and data. The overarching method involves constructing an observation model for the surveillance data, conditional on the hidden extent of the pest and uncertain detection sensitivity. The extent of the pest is then modelled as a dynamic invasion process that includes uncertainty in ecological parameters. Modelling approaches to assimilate this information are explored through case studies on spiralling whitefly, Aleurodicus dispersus and red banded mango caterpillar, Deanolis sublimbalis. Markov chain Monte Carlo simulation is used to estimate the probable extent of pests, given the observation and process model conditioned by surveillance data. Statistical methods, based on time-to-event models, are developed to apply hierarchical Bayesian models to early detection programs and to demonstrate area freedom from pests. The value of early detection surveillance programs is demonstrated through an application to interpret surveillance data for exotic plant pests with uncertain spread rates. The model suggests that typical early detection programs provide a moderate reduction in the probability of an area being infested but a dramatic reduction in the expected area of incursions at a given time. Estimates of spiralling whitefly extent are examined at local, district and state-wide scales. The local model estimates the rate of natural spread and the influence of host architecture, host suitability and inspector efficiency. These parameter estimates can support the development of robust surveillance programs. Hierarchical Bayesian models for the human-mediated spread of spiralling whitefly are developed for the colonisation of discrete cells connected by a modified gravity model. By estimating dispersal parameters, the model can be used to predict the extent of the pest over time. An extended model predicts the climate restricted distribution of the pest in Queensland. These novel human-mediated movement models are well suited to demonstrating area freedom at coarse spatio-temporal scales. At finer scales, and in the presence of ecological complexity, exploratory models are developed to investigate the capacity for surveillance information to estimate the extent of red banded mango caterpillar. It is apparent that excessive uncertainty about observation and ecological parameters can impose limits on inference at the scales required for effective management of response programs. The thesis contributes novel statistical approaches to estimating the extent of pests and develops applications to assist decision-making across a range of plant biosecurity surveillance activities. Hierarchical Bayesian modelling is demonstrated as both a useful analytical tool for estimating pest extent and a natural investigative paradigm for developing and focussing biosecurity programs.
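The core calculation that such surveillance models embed, stripped of the invasion dynamics and ecological uncertainty, is Bayesian updating of the probability that an area is infested given repeated negative surveys with imperfect detection; a minimal sketch with illustrative numbers only:

```python
# Minimal sketch: posterior probability of infestation after successive
# negative surveys, given a prior and an assumed detection sensitivity.
prior_infested = 0.10      # prior probability the area is infested (assumption)
sensitivity = 0.6          # probability a survey detects the pest if present (assumption)

p_infested = prior_infested
for survey in range(1, 6):
    # A negative survey occurs either because the area is free, or because
    # the pest was present but missed (probability 1 - sensitivity).
    p_negative = p_infested * (1 - sensitivity) + (1 - p_infested)
    p_infested = p_infested * (1 - sensitivity) / p_negative
    print(f"After negative survey {survey}: P(infested) = {p_infested:.3f}")
```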
Abstract:
Pedestrians' use of MP3 players or mobile phones can pose the risk of being hit by motor vehicles. We present an approach for detecting a crash risk level using the computing power and the microphone of mobile devices, which can be used to alert the user in advance of an approaching vehicle so as to avoid a crash. A single feature-extractor classifier is not usually able to deal with the diversity of risky acoustic scenarios. In this paper, we address the problem of detecting vehicles approaching a pedestrian with a novel, simple, non-resource-intensive acoustic method. The method uses a set of existing statistical tools to mine signal features. Audio features are adaptively thresholded for relevance and classified with a three-component heuristic. The resulting Acoustic Hazard Detection (AHD) system has a very low false-positive detection rate. The results of this study could help mobile device manufacturers embed the presented features into future portable devices and contribute to road safety.
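As a hedged illustration of frame-wise feature mining and adaptive thresholding (a synthetic signal and made-up parameters, not the paper's AHD system or its three-component heuristic):

```python
# Minimal sketch: per-frame RMS energy of an audio signal, flagged against an
# adaptive baseline estimated from recent quiet history.
import numpy as np

fs = 8000                                    # sample rate in Hz (assumption)
rng = np.random.default_rng(4)
quiet = rng.normal(0, 0.01, fs)              # 1 s of hypothetical background noise
loud = rng.normal(0, 0.2, fs)                # 1 s of hypothetical approaching-vehicle noise
signal = np.concatenate([quiet, loud])

frame = 400                                  # 50 ms frames at 8 kHz
frames = signal[: len(signal) // frame * frame].reshape(-1, frame)
rms = np.sqrt((frames ** 2).mean(axis=1))    # per-frame RMS energy

# Adaptive threshold: compare each frame with a baseline taken here as the
# median energy of the first few (quiet) frames.
baseline = np.median(rms[:10])
alerts = np.nonzero(rms > 5.0 * baseline)[0]
print("Frames above threshold:", alerts[:5], "...")
```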
Abstract:
As proteins within cells are spatially organized according to their role, knowledge about protein localization gives insight into protein function. Here, we describe the LOPIT technique (localization of organelle proteins by isotope tagging) developed for the simultaneous and confident determination of the steady-state distribution of hundreds of integral membrane proteins within organelles. The technique uses a partial membrane fractionation strategy in conjunction with quantitative proteomics. Localization of proteins is achieved by measuring their distribution pattern across the density gradient using amine-reactive isotope tagging and comparing these patterns with those of known organelle residents. LOPIT relies on the assumption that proteins belonging to the same organelle will co-fractionate. Multivariate statistical tools are then used to group proteins according to the similarities in their distributions, and hence localization without complete centrifugal separation is achieved. The protocol requires approximately 3 weeks to complete and can be applied in a high-throughput manner to material from many varied sources.
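A minimal sketch of the co-fractionation logic behind LOPIT, using made-up gradient profiles rather than real quantitative proteomics data: an uncharacterized protein's distribution across fractions is compared with known organelle marker profiles and assigned to the most similar one.

```python
# Minimal sketch: assign a protein to an organelle by correlating its
# density-gradient distribution with marker protein profiles.
import numpy as np

# Hypothetical relative abundances across 4 density-gradient fractions.
markers = {
    "ER":    np.array([0.50, 0.30, 0.15, 0.05]),
    "Golgi": np.array([0.10, 0.20, 0.40, 0.30]),
    "PM":    np.array([0.05, 0.10, 0.25, 0.60]),
}
unknown = np.array([0.48, 0.32, 0.12, 0.08])   # profile of an uncharacterized protein

def corr(a, b):
    """Pearson correlation between two distribution profiles."""
    return np.corrcoef(a, b)[0, 1]

best = max(markers, key=lambda org: corr(unknown, markers[org]))
print({org: round(corr(unknown, markers[org]), 2) for org in markers})
print("Most similar organelle profile:", best)
```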