951 results for statistical distribution
Abstract:
In this paper, we carry out robust modeling and influence diagnostics in Birnbaum-Saunders (BS) regression models. Specifically, we present some aspects related to BS and log-BS distributions and their generalizations from the Student-t distribution, and develop BS-t regression models, including maximum likelihood estimation based on the EM algorithm and diagnostic tools. In addition, we apply the obtained results to real data from insurance, which shows the uses of the proposed model. Copyright (c) 2011 John Wiley & Sons, Ltd.
Abstract:
This thesis belongs to the field of statistical analysis and stochastic methods applied to DNA sequence analysis. Specifically, our work focuses on the study of the CG dinucleotide (CpG) within the human genome, where it is found clustered in specific regions called CpG islands. These are linked to DNA methylation, a process that plays a fundamental role in gene regulation. The first part of the study is devoted to a global characterisation of the content and distribution of the 16 different dinucleotides within the human genome: in particular, we study the distribution of distances between successive occurrences of the same dinucleotide along the sequence. The results are compared with several null models: random sequences generated with Markov chains of order zero (based on the relative frequencies of the nucleotides) and order one (based on the transition probabilities between nucleotides), and the geometric distribution for the distances. From this analysis the characteristic properties of the CpG dinucleotide emerge clearly, both in comparison with the other dinucleotides and with the random models. Following this first part, we chose to concentrate the subsequent analyses on regions of biological interest, studying the abundance and distribution of CpG within them (CpG islands, promoters and Lamina Associated Domains). In the first two cases a strong enrichment in CpG content is observed, and the distribution of distances is shifted towards lower values, indicating that this dinucleotide is clustered. Within LADs there are on average fewer CpGs, and these show larger distances. Finally, we adopted a random-walk representation of DNA, built from the positioning of the dinucleotides: the resulting walk shows drastically different characteristics inside and outside regions annotated as CpG islands.
We therefore believe that methods based on this approach could be exploited to improve the identification of these regions of interest in the genome of humans and other organisms.
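The distance analysis described in this abstract can be illustrated with a minimal sketch. The function names, the toy sequence, and the simple geometric null (which ignores the fact that some dinucleotides cannot overlap themselves) are assumptions made for illustration and are not taken from the thesis.

```python
def dinucleotide_gaps(seq, dinuc="CG"):
    """Distances between successive start positions of `dinuc` in `seq`."""
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]
    return [b - a for a, b in zip(positions, positions[1:])]


def geometric_pmf(k, p):
    """P(gap = k), k = 1, 2, ..., under a geometric null with per-site probability p."""
    return (1.0 - p) ** (k - 1) * p


# Toy sequence (hypothetical, for illustration only).
toy = "ACGTCGCGATTACGA"
gaps = dinucleotide_gaps(toy)

# Crude estimate of the per-site occurrence probability of CG.
n_sites = len(toy) - 1
p_hat = (len(gaps) + 1) / n_sites

for k in sorted(set(gaps)):
    observed = gaps.count(k) / len(gaps)
    print(k, round(observed, 3), round(geometric_pmf(k, p_hat), 3))
```

On real data the observed gap frequencies would be compared with the null over the whole range of distances; a shift of the observed distribution towards short gaps is what the abstract describes as clustering.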
Abstract:
The geometrical factors defining an adhesive joint are of great importance as its design greatly conditions the performance of the bonding. One of the most relevant geometrical factors is the thickness of the adhesive as it decisively influences the mechanical properties of the bonding and has a clear economic impact on the manufacturing processes or long runs. The traditional mechanical joints (riveting, welding, etc.) are characterised by a predictable performance, and are very reliable in service conditions. Thus, structural adhesive joints will only be selected in industrial applications with demanding mechanical requirements and adverse environmental conditions if suitable reliability (the same as or higher than that of the mechanical joints) is guaranteed. For this purpose, the objective of this paper is to analyse the influence of the adhesive thickness on the mechanical behaviour of the joint and, by means of a statistical analysis based on the Weibull distribution, propose the optimum thickness for the adhesive combining the best mechanical performance and high reliability. This procedure, which can be applied without great difficulty to other joints and adhesives, supports a more reliable use of adhesive bonds and, therefore, their better and wider use in industrial manufacturing processes.
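As a rough illustration of how a Weibull model links strength to reliability, the sketch below uses the standard two-parameter Weibull survival function. The function names and the scale and shape values are hypothetical; none of them come from the paper, and the paper's own fitting procedure is not reproduced here.

```python
import math


def weibull_reliability(x, scale, shape):
    """Probability that the joint survives a load x: R(x) = exp(-(x/scale)^shape)."""
    return math.exp(-((x / scale) ** shape))


def load_at_reliability(r, scale, shape):
    """Load that the joint survives with probability r (inverse of R)."""
    return scale * (-math.log(r)) ** (1.0 / shape)


# Hypothetical Weibull parameters for two adhesive thicknesses (illustration only).
candidates = {"thin": (20.0, 8.0), "thick": (22.0, 4.0)}

for name, (scale, shape) in candidates.items():
    x90 = load_at_reliability(0.90, scale, shape)
    print(name, round(x90, 2))
```

Comparing the load sustained at a fixed reliability, rather than the mean strength alone, is one way to weigh performance against scatter when choosing a thickness.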
Abstract:
Mode of access: Internet.
Abstract:
Traditional vegetation mapping methods use high-cost, labour-intensive aerial photography interpretation. This approach can be subjective and is limited by factors such as the extent of remnant vegetation, and the differing scale and quality of aerial photography over time. An alternative approach is proposed which integrates a data model, a statistical model and an ecological model using sophisticated Geographic Information Systems (GIS) techniques and rule-based systems to support fine-scale vegetation community modelling. This approach is based on a more realistic representation of vegetation patterns with transitional gradients from one vegetation community to another. Arbitrary, though often unrealistic, sharp boundaries can be imposed on the model by the application of statistical methods. This GIS-integrated multivariate approach is applied to the problem of vegetation mapping in the complex vegetation communities of the Innisfail Lowlands in the Wet Tropics bioregion of Northeastern Australia. The paper presents the full cycle of this vegetation modelling approach, including sampling sites, variable selection, model selection, model implementation, internal model assessment, model prediction assessments, integration of discrete vegetation community models to generate a composite pre-clearing vegetation map, independent data set model validation and assessment of the scale of model predictions. An accurate pre-clearing vegetation map of the Innisfail Lowlands was generated (r² = 0.83) through GIS integration of 28 separate statistical models. This modelling approach has good potential for wider application, including provision of vital information for conservation planning and management; a scientific basis for rehabilitation of disturbed and cleared areas; and a viable method for the production of adequate vegetation maps for conservation and forestry planning of poorly-studied areas. (c) 2006 Elsevier B.V. All rights reserved.
Abstract:
This thesis presents an analysis of the stability of complex distribution networks. We present a stability analysis against cascading failures. We propose a spin [binary] model, based on concepts of statistical mechanics. We test macroscopic properties of distribution networks with respect to various topological structures and distributions of microparameters. The equilibrium properties of the systems are obtained in a statistical mechanics framework by application of the replica method. We demonstrate the validity of our approach by comparing it with Monte Carlo simulations. We analyse the network properties in terms of phase diagrams and find both qualitative and quantitative dependence of the network properties on the network structure and macroparameters. The structure of the phase diagrams points to the existence of a phase transition and the presence of stable and metastable states in the system. We also present an analysis of robustness against overloading in the distribution networks. We propose a model that describes a distribution process in a network. The model incorporates the currents between any connected hubs in the network, local constraints in the form of Kirchhoff's law and a global optimization criterion. The flow of currents in the system is driven by the consumption. We study two principal types of model: infinite and finite link capacity. The key properties are the distributions of currents in the system. We again use a statistical mechanics framework to describe the currents in the system in terms of macroscopic parameters. In order to obtain observable properties we apply the replica method. We are able to assess the criticality of the level of demand with respect to the available resources and the architecture of the network. Furthermore, the parts of the system where critical currents may emerge can be identified. This, in turn, provides us with a characteristic description of the spread of overloading in the systems.
Abstract:
In this letter, we propose an analytical approach to model uplink intercell interference (ICI) in hexagonal grid based orthogonal frequency division multiple access (OFDMA) cellular networks. The key idea is that the uplink ICI from individual cells is approximated with a lognormal distribution with statistical parameters being determined analytically. Accordingly, the aggregated uplink ICI is approximated with another lognormal distribution and its statistical parameters can be determined from those of individual cells using the Fenton-Wilkinson method. Analytic expressions of uplink ICI are derived with two traditional frequency reuse schemes, namely integer frequency reuse schemes with factor 1 (IFR-1) and factor 3 (IFR-3). Uplink fractional power control and lognormal shadowing are modeled. System performances in terms of signal to interference plus noise ratio (SINR) and spectrum efficiency are also derived. The proposed model has been validated by simulations. © 2013 IEEE.
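The Fenton-Wilkinson step mentioned in this abstract approximates a sum of lognormal variables by a single lognormal whose mean and variance match those of the sum. The sketch below is a generic version of that moment matching, assuming independent components and natural-log parameters; the function name and the per-cell values are hypothetical and are not taken from the letter.

```python
import math


def fenton_wilkinson(mus, sigmas):
    """Moment-matched lognormal (mu, sigma) for a sum of independent lognormals.

    Each component is exp(Y_i) with Y_i ~ Normal(mus[i], sigmas[i]**2).
    """
    # Mean and variance of each lognormal component, summed over components.
    mean_sum = sum(math.exp(m + s * s / 2.0) for m, s in zip(mus, sigmas))
    var_sum = sum(
        (math.exp(s * s) - 1.0) * math.exp(2.0 * m + s * s)
        for m, s in zip(mus, sigmas)
    )
    # Lognormal parameters that reproduce the summed mean and variance.
    sigma_sq = math.log(1.0 + var_sum / mean_sum ** 2)
    mu = math.log(mean_sum) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)


# Hypothetical per-cell interference parameters (illustration only).
mu_agg, sigma_agg = fenton_wilkinson([-2.0, -2.5, -3.0], [0.8, 0.9, 1.0])
print(round(mu_agg, 4), round(sigma_agg, 4))
```

Propagation studies often state shadowing parameters in dB; converting them to natural-log units before applying the formulas would be a separate step not shown here.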
Abstract:
This work presents a computational code, called MOMENTS, developed for use in process control to determine a characteristic transfer function for industrial units when radiotracer techniques are applied to study the unit's performance. The methodology is based on measuring the residence time distribution (RTD) function and calculating the first and second temporal moments of the tracer data obtained by two NaI scintillation detectors positioned to register the complete tracer movement inside the unit. A nonlinear regression technique is used to fit various mathematical models, and a statistical test is used to select the best result for the transfer function. Using the MOMENTS code, twelve different models can be used to fit a curve and calculate technical parameters of the unit.
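The temporal moments mentioned in this abstract can be sketched as follows for a single detector curve. This is not the MOMENTS code: the function names, the trapezoidal integration, and the sample curve are assumptions chosen for illustration.

```python
def trapezoid(y, t):
    """Trapezoidal-rule integral of samples y over (possibly uneven) times t."""
    return sum((t[i + 1] - t[i]) * (y[i] + y[i + 1]) / 2.0 for i in range(len(t) - 1))


def rtd_moments(t, counts):
    """Mean residence time and variance from a tracer response curve.

    The curve is first normalised to unit area, giving E(t); the first moment
    is the mean and the second central moment is the variance.
    """
    area = trapezoid(counts, t)
    e = [c / area for c in counts]
    mean = trapezoid([ti * ei for ti, ei in zip(t, e)], t)
    variance = trapezoid([(ti - mean) ** 2 * ei for ti, ei in zip(t, e)], t)
    return mean, variance


# Hypothetical detector readings (illustration only).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.0, 1.0, 2.0, 1.0, 0.0]
print(rtd_moments(times, signal))
```

With two detectors, as in the abstract, the difference between the first moments of the two curves is a common estimate of the mean residence time between the measurement points.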
Abstract:
This dissertation proposes statistical methods to formulate, estimate and apply complex transportation models. Two main problems are part of the analyses conducted and presented in this dissertation. The first method solves an econometric problem and is concerned with the joint estimation of models that contain both discrete and continuous decision variables. The use of ordered models along with a regression is proposed and their effectiveness is evaluated with respect to unordered models. Procedures to calculate and optimize the log-likelihood functions of both discrete-continuous approaches are derived, and difficulties associated with the estimation of unordered models are explained. Numerical approximation methods based on the Genz algorithm are implemented in order to solve the multidimensional integral associated with the unordered modeling structure. The problems deriving from the lack of smoothness of the probit model around the maximum of the log-likelihood function, which makes the optimization and the calculation of standard deviations very difficult, are carefully analyzed. A methodology to perform out-of-sample validation in the context of a joint model is proposed. Comprehensive numerical experiments have been conducted on both simulated and real data. In particular, the discrete-continuous models are estimated and applied to vehicle ownership and use models on data extracted from the 2009 National Household Travel Survey. The second part of this work offers a comprehensive statistical analysis of free-flow speed distribution; the method is applied to data collected on a sample of roads in Italy. A linear mixed model that includes speed quantiles in its predictors is estimated. Results show that there is no road effect in the analysis of free-flow speeds, which is particularly important for model transferability. 
A very general framework to predict random effects with few observations and incomplete access to model covariates is formulated and applied to predict the distribution of free-flow speed quantiles. The speed distribution of most road sections is successfully predicted; jack-knife estimates are calculated and used to explain why some sections are poorly predicted. Finally, this work contributes to the literature in transportation modeling by proposing econometric model formulations for discrete-continuous variables, more efficient methods for the calculation of multivariate normal probabilities, and random effects models for free-flow speed estimation that take into account the survey design. All methods are rigorously validated on both real and simulated data.
Abstract:
Background: The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods: This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results: Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion: In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones), rather than objective reality. Bayesian analysis is (arguably) a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.
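The sensitivity to the assumed number of silent comparisons can be seen from the standard Dunn-Sidak adjustment, sketched below. The function names and the example p-value are hypothetical and are not taken from the two re-analysed investigations.

```python
def sidak_adjusted_p(p, m):
    """Dunn-Sidak adjusted p-value for one test among m independent comparisons."""
    return 1.0 - (1.0 - p) ** m


def sidak_alpha(alpha, m):
    """Per-comparison significance level that keeps the family-wise level at alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


# Hypothetical unadjusted p-value for a reported cluster (illustration only).
p_raw = 0.0005
for m in (1, 10, 100, 1000):
    print(m, round(sidak_adjusted_p(p_raw, m), 4))
```

Because the adjusted value depends entirely on m, and m can only be guessed in a one-off cluster report, the frequentist conclusion inherits that guess.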
Abstract:
1. Species' distribution modelling relies on adequate data sets to build reliable statistical models with high predictive ability. However, the money spent collecting empirical data might be better spent on management. A less expensive source of species' distribution information is expert opinion. This study evaluates expert knowledge and its source. In particular, we determine whether models built on expert knowledge apply over multiple regions or only within the region where the knowledge was derived.
2. The case study focuses on the distribution of the brush-tailed rock-wallaby Petrogale penicillata in eastern Australia. We brought together from two biogeographically different regions substantial and well-designed field data and knowledge from nine experts. We used a novel elicitation tool within a geographical information system to systematically collect expert opinions. The tool utilized an indirect approach to elicitation, asking experts simpler questions about observable rather than abstract quantities, with measures in place to identify uncertainty and offer feedback. Bayesian analysis was used to combine field data and expert knowledge in each region to determine: (i) how expert opinion affected models based on field data and (ii) how similar expert-informed models were within regions and across regions.
3. The elicitation tool effectively captured the experts' opinions and their uncertainties. Experts were comfortable with the map-based elicitation approach used, especially with graphical feedback. Experts tended to predict lower values of species occurrence compared with field data.
4. Across experts, consensus on effect sizes occurred for several habitat variables. Expert opinion generally influenced predictions from field data. However, south-east Queensland and north-east New South Wales experts had different opinions on the influence of elevation and geology, with these differences attributable to geological differences between these regions.
5. Synthesis and applications. When formulated as priors in Bayesian analysis, expert opinion is useful for modifying or strengthening patterns exhibited by empirical data sets that are limited in size or scope. Nevertheless, the ability of an expert to extrapolate beyond their region of knowledge may be poor. Hence there is significant merit in obtaining information from local experts when compiling species' distribution models across several regions.
Abstract:
The recent development of indoor wireless local area network (WLAN) standards at 2.45 GHz and 5 GHz has led to increased interest in propagation studies at these frequency bands. Within the indoor environment, human body effects can strongly reduce the quality of wireless communication systems. Human body effects can cause temporal variations and shadowing due to pedestrian movement and antenna-body interaction with portable terminals. This book presents a statistical characterisation, based on measurements, of human body effects on indoor narrowband channels at 2.45 GHz and at 5.2 GHz. A novel cumulative distribution function (CDF) that models the 5 GHz narrowband channel in populated indoor environments is proposed. This novel CDF describes the received envelope in terms of pedestrian traffic. In addition, a novel channel model for the populated indoor environment is proposed for the Multiple-Input Multiple-Output (MIMO) narrowband channel in the presence of pedestrians at 2.45 GHz. Results suggest that practical MIMO systems must be sufficiently adaptive if they are to benefit from the capacity enhancement caused by pedestrian movement.
Abstract:
In this thesis, the relationship between air pollution and human health has been investigated using a Geographic Information System (GIS) as an analysis tool. The research focused on how vehicular air pollution affects human health. The main objective of this study was to analyse the spatial variability of pollutants, taking Brisbane City in Australia as a case study, by identifying the areas of high concentration of air pollutants and their relationship with the number of deaths caused by air pollutants. A correlation test was performed to establish the relationship between air pollution, number of deaths from respiratory disease, and total distance travelled by road vehicles in Brisbane. GIS was used to investigate the spatial distribution of the air pollutants. The main finding of this research is the comparison between spatial and non-spatial analysis approaches, which indicated that correlation analysis and simple buffer analysis in GIS, using the average levels of air pollutants from a single monitoring station or a small group of monitoring stations, is a relatively simple method for assessing the health effects of air pollution. There was a significant positive correlation between the variables under consideration, and the research shows a decreasing trend in the concentration of nitrogen dioxide at the Eagle Farm and Springwood sites and an increasing trend at the CBD site. Statistical analysis shows that there is a positive relationship between the level of emission and the number of deaths, though the impact is not uniform, as certain sections of the population are more vulnerable to exposure. Further statistical tests found that people over 75 years of age and children aged 0-15 years are the most vulnerable to air pollution exposure. A non-spatial approach alone may be insufficient for an appropriate evaluation of the impact of air pollutant variables and their inter-relationships. 
It is important to evaluate the spatial features of air pollutants before modeling the air pollution-health relationships.