Digital Library

149 results for Multivariate generalized t-distribution

in University of Queensland eSpace - Australia

Estimating Income Inequality in China Using Grouped Data and the Generalized Beta Distribution

Relevance:

100.00%

Abstract:

There are two main types of data sources of income distributions in China: household survey data and grouped data. Household survey data are typically available for isolated years and individual provinces. In comparison, aggregate or grouped data are typically available more frequently and usually have national coverage. In principle, grouped data allow investigation of the change of inequality over longer, continuous periods of time, and the identification of patterns of inequality across broader regions. Nevertheless, a major limitation of grouped data is that only mean (average) income and income shares of quintile or decile groups of the population are reported. Directly using grouped data reported in this format is equivalent to assuming that all individuals in a quintile or decile group have the same income. This potentially distorts the estimate of inequality within each region. The aim of this paper is to apply an improved econometric method designed to use grouped data to study income inequality in China. A generalized beta distribution is employed to model income inequality in China at various levels and periods of time. The generalized beta distribution is more general and flexible than the lognormal distribution that has been used in past research, and also relaxes the assumption of a uniform distribution of income within quintile and decile groups of populations. The paper studies the nature and extent of inequality in rural and urban China over the period 1978 to 2002. Income inequality in the whole of China is then modeled using a mixture of province-specific distributions. The estimated results are used to study the trends in national inequality, and to discuss the empirical findings in the light of economic reforms, regional policies, and globalization of the Chinese economy.
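
The limitation described above can be made concrete with a short sketch. It computes a Gini coefficient directly from group income shares, which amounts to drawing a piecewise-linear Lorenz curve and therefore to assuming equal income within each group. The function name and the quintile shares are hypothetical illustrations, not figures from the paper, and this is not the generalized beta method the paper applies.

```python
def gini_from_group_shares(shares):
    """Gini coefficient from income shares of equal-sized population groups.

    `shares` must be ordered from poorest to richest group. Income is
    assumed equal within each group, so the Lorenz curve is piecewise
    linear and the result ignores within-group inequality.
    """
    total = sum(shares)
    k = len(shares)
    cumulative = 0.0
    area = 0.0
    for s in shares:
        previous = cumulative
        cumulative += s / total
        # Trapezoid under the Lorenz curve over a population slice of width 1/k.
        area += (previous + cumulative) / (2 * k)
    return 1.0 - 2.0 * area


# Hypothetical quintile shares (poorest to richest); not data from the paper.
quintile_shares = [0.05, 0.10, 0.15, 0.25, 0.45]
print(round(gini_from_group_shares(quintile_shares), 4))
```

Because within-group dispersion is ignored, the value obtained this way is a lower bound on the Gini coefficient of the underlying distribution, which is the distortion that fitting a parametric distribution to the grouped data is meant to reduce.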

R-estimator of location of the generalized secant hyperbolic distribution

Relevance:

40.00%

Abstract:

The generalized secant hyperbolic distribution (GSHD) proposed in Vaughan (2002) includes a wide range of unimodal symmetric distributions, with the Cauchy and uniform distributions being the limiting cases, and the logistic and hyperbolic secant distributions being special cases. The current article derives an asymptotically efficient rank estimator of the location parameter of the GSHD and suggests the corresponding one- and two-sample optimal rank tests. The rank estimator derived is compared to the modified MLE of location proposed in Vaughan (2002). By combining these two estimators, a computationally attractive method for constructing an exact confidence interval of the location parameter is developed. The statistical procedures introduced in the current article are illustrated by examples.

Two-sample scale rank procedures optimal for the generalized secant hyperbolic distribution

Relevance:

40.00%

Robust Mixture Modelling Using the t Distribution

Relevance:

30.00%

Abstract:

Normal mixture models are being increasingly used to model the distributions of a wide variety of random phenomena and to cluster sets of continuous multivariate data. However, for a set of data containing a group or groups of observations with longer than normal tails or atypical observations, the use of normal components may unduly affect the fit of the mixture model. In this paper, we consider a more robust approach by modelling the data by a mixture of t distributions. The use of the ECM algorithm to fit this t mixture model is described and examples of its use are given in the context of clustering multivariate data in the presence of atypical observations in the form of background noise.
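
A minimal sketch of why t components are more robust than normal ones, under strong simplifying assumptions: a single univariate component with the scale `sigma` and degrees of freedom `nu` held fixed. In the E-step of a t model each observation receives a weight (nu + p) / (nu + d), where p is the dimension and d is the squared Mahalanobis distance to the component mean, so distant observations contribute little to the location update. The function names and data are hypothetical, and this is not the full ECM algorithm described in the paper.

```python
def t_weights(y, mu, sigma, nu):
    """E-step weights for a univariate t component (dimension p = 1)."""
    p = 1
    return [(nu + p) / (nu + ((v - mu) / sigma) ** 2) for v in y]


def robust_location(y, sigma=1.0, nu=4.0, iters=50):
    """Iteratively reweighted mean; sigma and nu are fixed (an assumption)."""
    mu = sum(y) / len(y)  # start from the ordinary sample mean
    for _ in range(iters):
        u = t_weights(y, mu, sigma, nu)
        # Weighted-mean update of the location parameter.
        mu = sum(ui * yi for ui, yi in zip(u, y)) / sum(u)
    return mu


# Hypothetical data: four typical points and one atypical observation.
data = [-0.5, 0.0, 0.2, 0.4, 50.0]
print("sample mean:", sum(data) / len(data))
print("t-based location:", robust_location(data))
```

The sample mean is pulled far toward the atypical point, whereas the reweighted estimate stays near the bulk of the data because the outlier's weight shrinks as its distance grows.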

Maximum Likelihood Estimation of Mixture Densities for Binned and Truncated Multivariate Data

Relevance:

30.00%

Abstract:

Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44: 2, 571-578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure. Naive implementation of the procedure can lead to computationally inefficient results. To reduce the computational cost a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that with a sufficient number of bins and data points it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.
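
To illustrate the kind of objective involved, the sketch below evaluates a log-likelihood for binned and truncated data in the univariate case only, where each bin probability is a difference of mixture CDF values. Truncation is handled by normalizing the bin probabilities over the observed bins. All function names and the data are hypothetical, normal components are assumed, and the code shows only the likelihood, not the EM procedure or the multidimensional integration the paper addresses.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, using the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def mixture_cdf(x, weights, mus, sigmas):
    """CDF of a univariate normal mixture."""
    return sum(w * normal_cdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))


def binned_truncated_loglik(edges, counts, weights, mus, sigmas):
    """Log-likelihood of bin counts, conditional on falling in the observed bins.

    `edges` holds the len(counts) + 1 bin boundaries; data outside
    [edges[0], edges[-1]] are treated as truncated (never observed).
    """
    probs = [
        mixture_cdf(b, weights, mus, sigmas) - mixture_cdf(a, weights, mus, sigmas)
        for a, b in zip(edges[:-1], edges[1:])
    ]
    total = sum(probs)  # probability mass of the observed (untruncated) region
    return sum(n * math.log(p / total) for n, p in zip(counts, probs) if n > 0)


# Hypothetical counts in four bins between -2 and 2.
edges = [-2.0, -1.0, 0.0, 1.0, 2.0]
counts = [14, 34, 34, 14]
print(binned_truncated_loglik(edges, counts, [1.0], [0.0], [1.0]))
print(binned_truncated_loglik(edges, counts, [1.0], [1.0], [1.0]))
```

In one dimension each bin needs only two CDF evaluations. In several dimensions the corresponding quantity is an integral over a hyper-rectangle, which is where the computational cost discussed in the abstract arises.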

Changes in body water distribution during treatment with inhaled steroid in pre-school children

Relevance:

30.00%

Abstract:

Primary objective: The study aimed to examine the changes in water distribution in the soft tissue during systemic steroid activity. Research design: A three-way cross-over, randomized, placebo-controlled, double-blind trial was used, including 4 weeks of fluticasone propionate pMDI 200 µg b.i.d. delivered via Babyhaler®, budesonide pressurized metered dose inhaler (pMDI) 200 µg b.i.d. delivered via Nebuchamber® and placebo. Spacers were primed before use. In total, 40 children aged 1-3 years, with mild intermittent asthma were included. Twenty-five of the children completed all three treatments. At the end of each treatment period body impedance and skin ultrasonography were measured. Methods and procedures: We measured changes in water content of the soft tissues by two methods. Skin ultrasonography was used to detect small changes in dermal water content, and bioelectrical impedance was used to assess body water content and distribution. Main outcomes and results: We found an increase in skin density of the shin from fluticasone as measured by ultrasonography (p = 0.01). There was a tendency for a consistent elevation of impedance parameters from active treatments compared to placebo although overall this effect was not statistically significant (0.1 < p < 0.2). However, sub-analyses indicated a significant effect on whole-body and leg impedance from budesonide treatment (p < 0.05). Conclusion: Decreased growth during inhaled steroid treatment seems to partly reflect generalized changes in body water.

A new method for identification of protein (sub)families in a set of proteins based on hydropathy distribution in proteins

Relevance:

30.00%

Abstract:

Structural similarity among proteins is reflected in the distribution of hydropathicity along the amino acids in the protein sequence. Similarities in the hydropathy distributions are obvious for homologous proteins within a protein family. They also were observed for proteins with related structures, even when sequence similarities were undetectable. Here we present a novel method that employs the hydropathy distribution in proteins for identification of (sub)families in a set of (homologous) proteins. We represent proteins as points in a generalized hydropathy space, represented by vectors of specifically defined features. The features are derived from hydropathy of the individual amino acids. Projection of this space onto principal axes reveals groups of proteins with related hydropathy distributions. The groups identified correspond well to families of structurally and functionally related proteins. We found that this method accurately identifies protein families in a set of proteins, or subfamilies in a set of homologous proteins. Our results show that protein families can be identified by the analysis of hydropathy distribution, without the need for sequence alignment. © 2005 Wiley-Liss, Inc.

Designs for generalized linear models with several variables and model uncertainty

Relevance:

30.00%

Abstract:

Standard factorial designs sometimes may be inadequate for experiments that aim to estimate a generalized linear model, for example, for describing a binary response in terms of several variables. A method is proposed for finding exact designs for such experiments that uses a criterion allowing for uncertainty in the link function, the linear predictor, or the model parameters, together with a design search. Designs are assessed and compared by simulation of the distribution of efficiencies relative to locally optimal designs over a space of possible models. Exact designs are investigated for two applications, and their advantages over factorial and central composite designs are demonstrated.

Modelling pre-clearing vegetation distribution using GIS-integrated statistical, ecological and data models: A case study from the wet tropics of Northeastern Australia

Relevance:

30.00%

Abstract:

Traditional vegetation mapping methods use high-cost, labour-intensive aerial photography interpretation. This approach can be subjective and is limited by factors such as the extent of remnant vegetation, and the differing scale and quality of aerial photography over time. An alternative approach is proposed which integrates a data model, a statistical model and an ecological model using sophisticated Geographic Information Systems (GIS) techniques and rule-based systems to support fine-scale vegetation community modelling. This approach is based on a more realistic representation of vegetation patterns with transitional gradients from one vegetation community to another. Arbitrary, though often unrealistic, sharp boundaries can be imposed on the model by the application of statistical methods. This GIS-integrated multivariate approach is applied to the problem of vegetation mapping in the complex vegetation communities of the Innisfail Lowlands in the Wet Tropics bioregion of Northeastern Australia. The paper presents the full cycle of this vegetation modelling approach, including sampling sites, variable selection, model selection, model implementation, internal model assessment, model prediction assessment, integration of discrete vegetation community models to generate a composite pre-clearing vegetation map, model validation against an independent data set, and assessment of the scale of model predictions. An accurate pre-clearing vegetation map of the Innisfail Lowlands was generated (r² = 0.83) through GIS integration of 28 separate statistical models. This modelling approach has good potential for wider application, including the provision of vital information for conservation planning and management; a scientific basis for rehabilitation of disturbed and cleared areas; and a viable method for the production of adequate vegetation maps for conservation and forestry planning of poorly-studied areas. © 2006 Elsevier B.V. All rights reserved.

Liquid distribution as a means to describing the granule growth mechanism

Relevance:

20.00%

Abstract:

Copper concentrate (chalcopyrite) was granulated in a rotating drum with a diameter of 0.3 m and a length of 0.2 m. Water was used as the binder and it was sprayed onto the powder bed with a nozzle. This material exhibited induction type behaviour, which was defined by Iveson and Litster [AIChE J. 44 (1998) 1510]. Induction type behaviour is characterized by the occurrence of an induction stage, during which the granules are gradually being compacted and little or no growth occurs. At the end of this induction stage, binder liquid is squeezed from the interior of the granules onto the granule surface and the granules are then surface-wet. This results in a rapid growth rate of the granules. Different types of experiments were conducted. The influence of the nozzle pressure and the distance from the nozzle to the powder bed on the growth behaviour of the granules as well as on the binder distribution was examined. The results of these experiments led to the postulation of a modified mechanism for induction type behaviour: it was found that after the binder was delivered, there were large granules containing a high amount of binder and small granules containing less binder. During the induction stage, the granules are compacted and binder liquid continuously appears at the surface of the large granules. These wet spots that are continuously being formed pick up the dry and small granules. When all the small granules have been picked up, further expulsion of binder liquid onto the granules' surface results in granules that remain surface-wet. This phenomenon marks the end of the induction stage and it coincides with the disappearance of the small granules. The hypothesis was tested by selectively removing the smaller granules during an experiment. As expected, this resulted in a shorter induction time.

Distribution of major, minor, and trace metals in lake environments of Antarctica

Relevance:

20.00%

Abstract:

The concentrations of major, minor and trace metals were measured in water samples collected from five shallow Antarctic lakes (Carezza, Edmonson Point (No 14 and 15a), Inexpressible Island and Tarn Flat) found in Terra Nova Bay (northern Victoria Land, Antarctica) during the Italian Expeditions of 1993-2001. The total concentrations of a large suite of elements (Al, As, Ba, Ca, Cd, Ce, Co, Cr, Cs, Cu, Fe, Ga, Gd, K, La, Li, Mg, Mn, Mo, Na, Nd, Ni, Pb, Pr, Rb, Sc, Si, Sr, Ta, Ti, U, V, Y, W, Zn and Zr) were determined using spectroscopic techniques (ICP-AES, GF-AAS and ICP-MS). The results are similar to those obtained for the freshwater lakes of the Larsemann Hills, East Antarctica, and for the McMurdo Dry Valleys. Principal Component Analysis (PCA) and Cluster Analysis (CA) were performed to identify groups of samples with similar characteristics and to find correlations between the variables. The variability observed within the water samples is closely connected to the sea spray input; hence, it is primarily a consequence of geographical and meteorological factors, such as distance from the ocean and time of year. The trace element levels, in particular those of heavy metals, are very low, suggesting an origin from natural sources rather than from anthropogenic contamination.

Distribution, expression, and motif variability of ankyrin domain genes in Wolbachia pipientis

Relevance:

20.00%

Abstract:

The endosymbiotic bacterium Wolbachia pipientis infects a wide range of arthropods, in which it induces a variety of reproductive phenotypes, including cytoplasmic incompatibility (CI), parthenogenesis, male killing, and reversal of genetic sex determination. The recent sequencing and annotation of the first Wolbachia genome revealed an unusually high number of genes encoding ankyrin domain (ANK) repeats. These ANK genes are likely to be important in mediating the Wolbachia-host interaction. In this work we determined the distribution and expression of the different ANK genes found in the sequenced Wolbachia wMel genome in nine Wolbachia strains that induce different phenotypic effects in their hosts. A comparison of the ANK genes of wMel and the non-CI-inducing wAu Wolbachia strain revealed significant differences between the strains. This was reflected in sequence variability in shared genes that could result in alterations in the encoded proteins, such as motif deletions, amino acid insertions, and in some cases disruptions due to insertion of transposable elements and premature stops. In addition, one wMel ANK gene, which is part of an operon, was absent in the wAu genome. These variations are likely to affect the affinity, function, and cellular location of the predicted proteins encoded by these genes.

Tissue distribution and prevalence of Wolbachia infections in tsetse flies, Glossina spp.

Relevance:

20.00%

Cytoplasmic incompatibility in Drosophila populations: influence of assortative mating on symbiont distribution

Relevance:

20.00%

Abstract:

Cytoplasmic incompatibility is known to occur between strains of both Drosophila simulans and D. melanogaster. Incompatibility is associated with the infection of Drosophila with microorganismal endosymbionts. This paper reports survey work conducted on strains of D. simulans and D. melanogaster from diverse geographical locations finding that infected populations are relatively rare and scattered in their distribution. The distribution of infected populations of D. simulans appears to be at odds with deterministic models predicting the rapid spread of the infection through uninfected populations. Examination of isofemale lines from four localities in California where populations appear to be polymorphic for the infection failed to find evidence for consistent assortative mating preferences between infected and uninfected populations that may explain the basis for the observed polymorphism.

Sampling phylogenetic tree space with the generalized Gibbs sampler

Relevance:

20.00%

Abstract:

The generalized Gibbs sampler (GGS) is a recently developed Markov chain Monte Carlo (MCMC) technique that enables Gibbs-like sampling of state spaces that lack a convenient representation in terms of a fixed coordinate system. This paper describes a new sampler, called the tree sampler, which uses the GGS to sample from a state space consisting of phylogenetic trees. The tree sampler is useful for a wide range of phylogenetic applications, including Bayesian, maximum likelihood, and maximum parsimony methods. A fast new algorithm to search for a maximum parsimony phylogeny is presented, using the tree sampler in the context of simulated annealing. The mathematics underlying the algorithm is explained and its time complexity is analyzed. The method is tested on two large data sets consisting of 123 sequences and 500 sequences, respectively. The new algorithm is shown to compare very favorably in terms of speed and accuracy to the program DNAPARS from the PHYLIP package.
