981 results for Proportion Data


Relevance:

100.00%

Publisher:

Abstract:

Often in biomedical research, we deal with continuous (clustered) proportion responses ranging between zero and one that quantify the disease status of the cluster units. Interestingly, the study population might also consist of relatively disease-free as well as highly diseased subjects, contributing proportion values in the closed interval [0, 1]. Regression on a variety of parametric densities with support in (0, 1), such as beta regression, can assess important covariate effects. However, such densities are inappropriate in the presence of exact zeros and/or ones. To address this, we introduce a class of general proportion densities and augment it with point masses at zero and one, while controlling for the clustering. Our approach is Bayesian and presents a computationally convenient framework amenable to available freeware. Bayesian case-deletion influence diagnostics based on q-divergence measures are automatic from the Markov chain Monte Carlo output. The methodology is illustrated using both simulation studies and an application to a real dataset from a clinical periodontology study.
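
A minimal numerical sketch of the zero-one augmentation idea follows, using a plain beta density as the (0, 1) component; the paper's more general proportion density, covariates, and cluster random effects are omitted, and all names are illustrative.

```python
# Minimal sketch of a zero-one-augmented density for proportion data.
# The (0,1) part is a beta density in mean/precision parameterization;
# the paper's general proportion density and clustering are omitted.
import numpy as np
from scipy.stats import beta

def zoab_loglik(y, p0, p1, mu, phi):
    """Log-likelihood of a zero-one-augmented beta model.

    p0, p1 : point masses at 0 and 1 (p0 + p1 < 1)
    mu, phi: mean and precision of the beta component on (0, 1)
    """
    y = np.asarray(y, dtype=float)
    ll = np.where(y == 0.0, np.log(p0),
         np.where(y == 1.0, np.log(p1),
                  np.log1p(-(p0 + p1))
                  + beta.logpdf(y, mu * phi, (1.0 - mu) * phi)))
    return ll.sum()

# Example: mixed sample containing exact zeros and ones
sample = np.array([0.0, 0.12, 0.45, 0.88, 1.0, 0.63])
print(zoab_loglik(sample, p0=0.10, p1=0.05, mu=0.5, phi=5.0))
```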

Relevance:

100.00%

Publisher:

Abstract:

Joint generalized linear models and double generalized linear models (DGLMs) were designed to model outcomes whose variability can be explained using factors and/or covariates. When such factors operate, the usual normal regression models, which inherently assume constant variance, under-represent the variation in the data and hence may lead to erroneous inferences. For count and proportion data, such noise factors can generate a so-called overdispersion effect, and the use of binomial and Poisson models underestimates the variability and, consequently, incorrectly indicates significant effects. In this manuscript, we propose a DGLM from a Bayesian perspective, focusing on the case of proportion data, where the overdispersion can be modeled using a random effect that depends on some noise factors. The joint posterior density function was sampled using Markov chain Monte Carlo algorithms, allowing inferences on the model parameters. An application to a data set on apple tissue culture is presented, for which it is shown that the Bayesian approach is quite feasible even when limited prior information is available, thereby generating valuable insight for the researcher about the experimental results.
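
The sketch below illustrates this kind of model in PyMC: a binomial likelihood whose logit carries a noise-factor random effect, sampled by MCMC. It is an assumed toy setup, not the authors' specification; the data, priors, and variable names are all invented.

```python
# Minimal PyMC sketch (not the authors' implementation) of an
# overdispersed binomial model for proportion data: a logit-normal
# random effect u_j, indexed by a noise factor, inflates the variance
# beyond the plain binomial.
import numpy as np
import pymc as pm

y = np.array([3, 7, 14, 2, 9, 11])       # successes per experimental unit
n = np.array([20, 20, 20, 20, 20, 20])   # trials per unit
group = np.array([0, 0, 1, 1, 2, 2])     # noise-factor level of each unit

with pm.Model() as dglm:
    beta0 = pm.Normal("beta0", 0.0, 2.0)          # fixed intercept
    sigma = pm.HalfNormal("sigma", 1.0)           # scale of the random effect
    u = pm.Normal("u", 0.0, sigma, shape=3)       # random effect per level
    p = pm.math.invlogit(beta0 + u[group])
    pm.Binomial("y", n=n, p=p, observed=y)
    trace = pm.sample(1000, tune=1000, chains=2)  # MCMC over the posterior
```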

Relevance:

40.00%

Publisher:

Abstract:

The capability of a feature model of immediate memory (Nairne, 1990; Neath, 2000) to predict and account for a relationship between absolute and proportion scoring of immediate serial recall when memory load is varied (the list-length effect, LLE) is examined. The model correctly predicts the novel finding of an LLE in immediate serial order memory similar to that observed with free recall and previously assumed to be attributable to the long-term memory component of that procedure (Glanzer, 1972). The usefulness of formal models as predictive tools and the continuity between short-term serial order and longer term item memory are considered.

Relevance:

40.00%

Publisher:

Abstract:

We present experimental results for wavelength-division multiplexed (WDM) transmission performance using unbalanced proportions of 1s and 0s in pseudo-random bit sequence (PRBS) data. This investigation simulates the effect of local (in time) data unbalancing, which occurs in some coding systems such as forward error correction when extra bits are added to the WDM data stream. We show that such local unbalancing, which in practice gives a time-dependent error rate, can be employed to improve legacy long-haul WDM system performance if the system is allowed to operate in the nonlinear power region. We use a recirculating loop to simulate a long-haul fibre system.
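
As a rough software analogue of the experiment, the toy generator below produces a bit sequence with a chosen (unbalanced) proportion of 1s. The function name and `ones_fraction` parameter are illustrative; a real PRBS comes from a shift-register pattern generator, not a random draw.

```python
# Illustrative sketch: generate a bit sequence with an unbalanced
# proportion of 1s, mimicking the local data unbalancing that FEC
# overhead bits can introduce into a WDM data stream.
import numpy as np

def unbalanced_prbs(n_bits, ones_fraction, seed=0):
    rng = np.random.default_rng(seed)
    return (rng.random(n_bits) < ones_fraction).astype(np.uint8)

bits = unbalanced_prbs(2**15, ones_fraction=0.6)
print(bits.mean())   # empirical mark ratio, ~0.6
```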

Relevance:

30.00%

Publisher:

Abstract:

Multi-frequency bioimpedance analysis (MFBIA) was used to determine the impedance, reactance and resistance of 103 lamb carcasses (17.1-34.2 kg) immediately after slaughter and evisceration. Carcasses were halved and frozen, and one half was subsequently homogenized and analysed for water, crude protein and fat content. Three measures of carcass length were obtained. Diagonal length between the electrodes (right-side biceps femoris to left side of neck) explained a greater proportion of the variance in water mass than did estimates of spinal length, and was selected for use in the index L²/Z to predict the mass of chemical components in the carcass. Use of impedance (Z) measured at the characteristic frequency (Zc) instead of at 50 kHz (Z50) did not improve the power of the model to predict the mass of water, protein or fat in the carcass. While L²/Z50 explained a significant proportion of the variation in the masses of body water (r² = 0.64), protein (r² = 0.34) and fat (r² = 0.35), its inclusion in multivariate indices offered small or no increases in predictive capacity when hot carcass weight (HCW) and a measure of rib fat depth (GR) were present in the model. Optimized equations were able to account for 65-90% of the variance observed in the weight of chemical components in the carcass. It is concluded that single-frequency impedance data do not provide better prediction of carcass composition than can be obtained from measures of HCW and GR. Indices of intracellular water mass derived from impedance at zero frequency and the characteristic frequency explained a similar proportion of the variance in carcass protein mass as did the index L²/Z50.
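
To make the structure of the compared models concrete, here is a hedged sketch with fabricated placeholder data: water mass regressed on the index L²/Z50 alone, on HCW and GR, and on all three together. Only the model forms follow the abstract; every number below is invented.

```python
# Sketch of the prediction models compared in the study. All data are
# fabricated placeholders purely to show the regression structure.
import numpy as np

rng = np.random.default_rng(1)
n = 103
L = rng.uniform(0.9, 1.3, n)        # diagonal carcass length (m)
Z50 = rng.uniform(40.0, 70.0, n)    # impedance at 50 kHz (ohm)
hcw = rng.uniform(17.1, 34.2, n)    # hot carcass weight (kg)
gr = rng.uniform(5.0, 25.0, n)      # rib fat depth GR (mm)
water = 0.55 * hcw - 0.1 * gr + 200.0 * L**2 / Z50 + rng.normal(0, 1, n)

def r_squared(X, y):
    X = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - resid.var() / y.var()

print(r_squared(L**2 / Z50, water))                   # index alone
print(r_squared(np.column_stack([hcw, gr]), water))   # HCW + GR
print(r_squared(np.column_stack([L**2 / Z50, hcw, gr]), water))
```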

Relevance:

30.00%

Publisher:

Abstract:

In order to examine whether different populations show the same pattern of onset in the Southern Hemisphere, we examined the age-at-first-admission distribution for schizophrenia based on mental health registers from Australia and Brazil. Data on age at first admission for individuals with schizophrenia were extracted from two name-linked registers: (1) the Queensland Mental Health Statistics System, Australia (N = 7651, F = 3293, M = 4358), and (2) a psychiatric hospital register in Pelotas, Brazil (N = 4428, F = 2220, M = 2208). Age distributions were derived for males and females in both datasets. The general population structure for both countries was also obtained. There were significantly more males in the Queensland dataset (χ² = 56.9, df = 3, p < 0.0001). Both distributions were skewed to the right. Onset rose steeply after puberty to reach a modal age group of 20-29 for men and women, with a more gradual tail toward the older age groups. In Queensland, 68% of women with schizophrenia had their first admission after age 30, while the corresponding proportion in Brazil was 58%. Compared to the Australian dataset, the Brazilian dataset had a slightly greater proportion of first admissions under age 30 and a slightly smaller proportion over age 60. This reflects the underlying age distributions of the two populations. This study confirms the wide age range and gender differences in age-at-first-admission distributions for schizophrenia and identifies a significant difference in the gender ratio between the two datasets. Given widely differing health services, cultural practices, ethnic variability, and the different underlying population distributions, age at first admission in Queensland and Brazil showed more similarities than differences. Acknowledgments: The Stanley Foundation supported this project.
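
For illustration, the gender-by-dataset counts quoted above can be compared with a standard chi-square test. Note that this simple 2x2 table gives df = 1, so the abstract's df = 3 statistic evidently comes from a finer breakdown not reproduced here.

```python
# Gender-by-dataset comparison using the admission counts in the abstract.
from scipy.stats import chi2_contingency

table = [[3293, 4358],   # Queensland: females, males
         [2220, 2208]]   # Pelotas:    females, males
chi2, p, df, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, df = {df}, p = {p:.2e}")
```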

Relevance:

30.00%

Publisher:

Abstract:

Objective: To determine whether coinfection with sexually transmitted diseases (STDs) increases HIV shedding in genital-tract secretions, and whether STD treatment reduces this shedding. Design: Systematic review and data synthesis of cross-sectional and cohort studies meeting predefined quality criteria. Main Outcome Measures: Proportion of patients with and without an STD who had detectable HIV in genital secretions, HIV load in genital secretions, or change following STD treatment. Results: Of 48 identified studies, three cross-sectional and three cohort studies were included. HIV was detected significantly more frequently in participants infected with Neisseria gonorrhoeae (125 of 309 participants, 41%) than in those without N. gonorrhoeae infection (311 of 988 participants, 32%; P = 0.004). HIV was not detected significantly more frequently in persons infected with Chlamydia trachomatis (28 of 67 participants, 42%) than in those without C. trachomatis infection (375 of 1149 participants, 33%; P = 0.13). Median HIV load, reported in only one study, was greater in men with urethritis (12.4 x 10^4 versus 1.51 x 10^4 copies/ml; P = 0.04). In the only cohort study in which this could be fully assessed, treatment of women with any STD reduced the proportion with detectable HIV from 39% to 29% (P = 0.05), whereas this proportion remained stable among controls (15-17%). A second cohort study reported fully on HIV load; among men with urethritis, viral load fell from 12.4 x 10^4 to 4.12 x 10^4 copies/ml two weeks post-treatment, whereas viral load remained stable in those without urethritis. Conclusion: Few high-quality studies were found. HIV is detected moderately more frequently in the genital secretions of men and women with an STD, and HIV load is substantially increased among men with urethritis. Successful STD treatment reduces both of these parameters, but not to control levels. More high-quality studies are needed to explore this important relationship further.
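
The headline gonorrhoea comparison can be checked with a standard pooled two-proportion z-test, which reproduces the reported P = 0.004 from the counts given above:

```python
# Two-proportion z-test on the gonorrhoea comparison from the abstract:
# HIV detected in 125/309 participants with N. gonorrhoeae vs 311/988 without.
from math import sqrt
from scipy.stats import norm

x1, n1 = 125, 309   # HIV-positive secretions, with gonorrhoea
x2, n2 = 311, 988   # HIV-positive secretions, without gonorrhoea
p1, p2 = x1 / n1, x2 / n2
pooled = (x1 + x2) / (n1 + n2)
se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_value = 2 * norm.sf(abs(z))
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")   # ~0.004
```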

Relevance:

30.00%

Publisher:

Abstract:

The principle of using induction rules based on spatial environmental data to model a soil map has previously been demonstrated. Whilst the general pattern of classes of large spatial extent, and of those with a close association with geology, was delineated, small classes and the detailed spatial pattern of the map were less well rendered. Here we examine several strategies to improve the quality of the soil map models generated by rule induction. Terrain attributes that are better suited to landscape description at a resolution of 250 m are introduced as predictors of soil type. A map sampling strategy is developed. Classification error is reduced by using boosting rather than cross-validation to improve the model. Further, the benefit of incorporating the local spatial context of each environmental variable into the rule induction is examined. The best model was achieved by sampling in proportion to the spatial extent of the mapped classes, boosting the decision trees, and using spatial contextual information extracted from the environmental variables.
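
A hedged sketch of the winning strategy follows, with synthetic stand-in data: training pixels sampled in proportion to each class's spatial extent, then boosted decision trees on terrain covariates. sklearn's gradient boosting stands in for whatever rule-induction software was actually used.

```python
# Toy illustration: proportional sampling plus boosted trees.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 4))                       # terrain attributes at 250 m
soil = rng.choice(3, size=n, p=[0.6, 0.3, 0.1])   # classes of unequal extent

# Simple random sampling preserves the classes' spatial extents,
# unlike equal-size-per-class sampling.
idx = rng.choice(n, size=1000, replace=False)
model = GradientBoostingClassifier().fit(X[idx], soil[idx])
print(model.score(X, soil))
```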

Relevance:

30.00%

Publisher:

Abstract:

Historically, the cure rate model has been used for modeling time-to-event data in which a significant proportion of patients are assumed to be cured of illnesses including breast cancer, non-Hodgkin lymphoma, leukemia, prostate cancer, melanoma, and head and neck cancer. Perhaps the most popular type of cure rate model is the mixture model introduced by Berkson and Gage [1]. In this model, it is assumed that a certain proportion of the patients are cured, in the sense that they do not present the event of interest during a long period of time and can be considered immune to the cause of failure under study. In this paper, we propose a general hazard model which accommodates comprehensive families of cure rate models as particular cases, including the model proposed by Berkson and Gage. The maximum likelihood estimation procedure is discussed. A simulation study analyzes the coverage probabilities of the asymptotic confidence intervals for the parameters. A real data set on children exposed to HIV by vertical transmission illustrates the methodology.
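
For concreteness, the Berkson-Gage mixture takes the form S(t) = pi + (1 - pi) * S0(t), where pi is the cured fraction and S0 the survival of the susceptible patients. The snippet below evaluates it with an exponential susceptible component, an illustrative assumption rather than the paper's general hazard.

```python
# The Berkson-Gage mixture cure survival function, with an exponential
# susceptible component chosen purely for simplicity.
import numpy as np

def mixture_cure_survival(t, pi, rate):
    """S(t) for a cured fraction pi and exponential susceptibles."""
    return pi + (1.0 - pi) * np.exp(-rate * np.asarray(t))

t = np.linspace(0.0, 20.0, 5)
print(mixture_cure_survival(t, pi=0.3, rate=0.25))
# S(t) plateaus at pi as t grows: the long-term survivors are the cured.
```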

Relevance:

30.00%

Publisher:

Abstract:

In many occupational safety interventions, the objective is to reduce the injury incidence as well as the mean claims cost once injury has occurred. The claims cost data within a period typically contain a large proportion of zero observations (no claim). The distribution thus comprises a point mass at 0 mixed with a non-degenerate parametric component. Essentially, the likelihood function can be factorized into two orthogonal components. These two components relate respectively to the effect of covariates on the incidence of claims and the magnitude of claims, given that claims are made. Furthermore, the longitudinal nature of the intervention inherently imposes some correlation among the observations. This paper introduces a zero-augmented gamma random effects model for analysing longitudinal data with many zeros. Adopting the generalized linear mixed model (GLMM) approach reduces the original problem to the fitting of two independent GLMMs. The method is applied to evaluate the effectiveness of a workplace risk assessment teams program, trialled within the cleaning services of a Western Australian public hospital.
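
A minimal two-part sketch of this factorization, ignoring the random effects (and hence the longitudinal correlation) for brevity: a logistic GLM for claim incidence and a gamma GLM for positive claim costs, fitted independently on simulated data. All names and values are invented.

```python
# Two-part sketch of the zero-augmented gamma idea: the likelihood
# factorizes into (i) a logistic model for whether a claim occurs and
# (ii) a gamma model for the positive claim costs.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 400
x = rng.normal(size=n)                                  # intervention covariate
claim = rng.random(n) < 1 / (1 + np.exp(-(-1.0 + 0.5 * x)))
cost = np.where(claim, rng.gamma(2.0, np.exp(3.0 + 0.3 * x) / 2.0), 0.0)

X = sm.add_constant(x)
incidence = sm.GLM(claim.astype(float), X,
                   family=sm.families.Binomial()).fit()
magnitude = sm.GLM(cost[claim], X[claim],
                   family=sm.families.Gamma(sm.families.links.Log())).fit()
print(incidence.params, magnitude.params)
```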

Relevance:

30.00%

Publisher:

Abstract:

The tests that are currently available for measuring overexpression of human epidermal growth factor receptor 2 (HER2) in breast cancer have shown considerable problems in accuracy and interlaboratory reproducibility. Although these problems are partly alleviated by the use of validated, standardised 'kits', there may be considerable cost involved in their use. Prior to testing, it may therefore be an advantage to be able to predict from basic pathology data whether a cancer is likely to overexpress HER2. In this study, we have correlated pathology features of cancers with the frequency of HER2 overexpression assessed by immunohistochemistry (IHC) using HercepTest (Dako). In addition, fluorescence in situ hybridisation (FISH) was used to re-test the equivocal cancers, and interobserver variation in assessing HER2 overexpression was examined by a slide circulation scheme. Of the 1536 cancers, 1144 (74.5%) did not overexpress HER2. Unequivocal overexpression (3+ by IHC) was seen in 186 cancers (12%) and an equivocal result (2+ by IHC) in 206 cancers (13%). Of the 156 IHC 3+ cancers for which complete data were available, 149 (95.5%) were ductal NST and 152 (97%) were histological grade 2 or 3. Only 1 of 124 infiltrating lobular carcinomas (0.8%) showed HER2 overexpression, and none of the 49 'special types' of carcinoma did. Re-testing by FISH of a proportion of the IHC 2+ cancers showed that only 25 (23%) of those assessable exhibited HER2 gene amplification, whereas 46 of the 47 IHC 3+ cancers (98%) were confirmed as showing gene amplification. Circulating slides for the assessment of HER2 score showed a moderate level of agreement between pathologists (kappa = 0.4). As a result of this study, we would advocate consideration of a triage approach to HER2 testing. Infiltrating lobular and special-type carcinomas may not need to be routinely tested at presentation, nor may grade 1 NST carcinomas, in which only 1.4% have been shown to overexpress HER2. Testing of these carcinomas may be performed when HER2 status is required to assist in therapeutic or other clinical/prognostic decision-making. The highest yield of HER2-overexpressing carcinomas is in the grade 3 NST subgroup, in which 24% are positive by IHC.
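
Purely as an illustration of the advocated triage logic (a sketch, not a clinical tool), the rule might be coded as:

```python
# Illustrative triage rule suggested by the findings above: defer routine
# HER2 testing for infiltrating lobular, special-type, and grade 1 NST
# carcinomas, and test the rest up front.
def her2_test_at_presentation(histology: str, grade: int) -> bool:
    if histology in ("lobular", "special type"):
        return False   # test later only if clinically required
    if histology == "ductal NST" and grade == 1:
        return False   # only ~1.4% of these overexpress HER2
    return True        # grade 2-3 NST: highest yield (up to 24% IHC 3+)

print(her2_test_at_presentation("ductal NST", 3))   # True
```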

Relevance:

30.00%

Publisher:

Abstract:

Between October 1988 and April 1989, a cross-sectional survey was carried out in six of the eight blood banks of Goiânia, Central Brazil. Subjects attending for first-time blood donation in the mornings of the study period (n = 1358) were interviewed and screened for T. cruzi infection as part of a larger study among blood donors. Tests for anti-T. cruzi antibodies were performed simultaneously by indirect haemagglutination test (IHA) and complement fixation test (CFT). A subject was considered seropositive when either of the two tests showed a positive result. Information on age, sex, place of birth, migration and socio-economic level was recorded. Results from this survey were compared with seroprevalence rates obtained in previous studies in an attempt to analyse the trend of T. cruzi infection in an endemic urban area. The overall seroprevalence of T. cruzi infection among first-time donors was found to be 3.5% (95% confidence interval 2.5%-4.5%). The seroprevalence rate increased with age up to 45 years and then decreased. Migrants from rural areas had higher seroprevalence rates than subjects from urban counties (1.8%-16.2% vs. 0%-3.6%). A fourfold decrease in prevalence rates was observed when these rates were compared with those of fifteen years earlier. Two possible hypotheses were suggested to explain this difference: (1) a cohort effect related to the decrease of transmission in rural areas, and/or (2) a differential proportion of people of rural origin among blood donors between the two periods. The potential usefulness of blood banks as a source of epidemiological information to monitor trends of T. cruzi infection in an urban adult population was stressed.
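
The quoted confidence interval can be reproduced with a standard Wald interval for a proportion:

```python
# Reproducing the 95% confidence interval for the overall seroprevalence:
# 3.5% positive among 1358 first-time donors.
from math import sqrt

n, p = 1358, 0.035
se = sqrt(p * (1 - p) / n)
low, high = p - 1.96 * se, p + 1.96 * se
print(f"95% CI: {low:.3f} to {high:.3f}")   # ~0.025 to 0.045
```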

Relevance:

30.00%

Publisher:

Abstract:

The development of high-spatial-resolution airborne and spaceborne sensors has improved the capability of ground-based data collection in the fields of agriculture, geography, geology, mineral identification, detection [2, 3], and classification [4-8]. The signal read by the sensor from a given spatial element of resolution and at a given spectral band is a mixture of components originating from the constituent substances, termed endmembers, located at that element of resolution. This chapter addresses hyperspectral unmixing, which is the decomposition of the pixel spectra into a collection of constituent spectra, or spectral signatures, and their corresponding fractional abundances indicating the proportion of each endmember present in the pixel [9, 10]. Depending on the mixing scales at each pixel, the observed mixture is either linear or nonlinear [11, 12]. The linear mixing model holds when the mixing scale is macroscopic [13]; the nonlinear model holds when the mixing scale is microscopic (i.e., intimate mixtures) [14, 15]. The linear model assumes negligible interaction among distinct endmembers [16, 17], whereas the nonlinear model assumes that incident solar radiation is scattered by the scene through multiple bounces involving several endmembers [18].

Under the linear mixing model, and assuming that the number of endmembers and their spectral signatures are known, hyperspectral unmixing is a linear problem which can be addressed, for example, by the maximum likelihood setup [19], the constrained least-squares approach [20], spectral signature matching [21], the spectral angle mapper [22], and subspace projection methods [20, 23, 24]. Orthogonal subspace projection [23] reduces the data dimensionality, suppresses undesired spectral signatures, and detects the presence of a spectral signature of interest. The basic concept is to project each pixel onto a subspace that is orthogonal to the undesired signatures. As shown in Settle [19], the orthogonal subspace projection technique is equivalent to the maximum likelihood estimator. This projection technique was extended by three unconstrained least-squares approaches [24] (signature space orthogonal projection, oblique subspace projection, and target signature space orthogonal projection). Other works using the maximum a posteriori probability (MAP) framework [25] and projection pursuit [26, 27] have also been applied to hyperspectral data.

In most cases, the number of endmembers and their signatures are not known. Independent component analysis (ICA) is an unsupervised source separation process that has been applied with success to blind source separation, feature extraction, and unsupervised recognition [28, 29]. ICA consists of finding a linear decomposition of observed data yielding statistically independent components. Given that hyperspectral data are, in given circumstances, linear mixtures, ICA comes to mind as a possible tool to unmix this class of data. In fact, the application of ICA to hyperspectral data has been proposed in reference 30, where endmember signatures are treated as sources and the mixing matrix is composed of the abundance fractions, and in references 9, 25, and 31-38, where the sources are the abundance fractions of each endmember. In the first approach, we face two problems: (1) the number of samples is limited to the number of channels, and (2) the process of pixel selection, playing the role of mixed sources, is not straightforward.
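
Here is a small sketch of the linear mixing model just described, with a generic constrained least-squares solve (nonnegative abundances summing to one) standing in for the specialized estimators cited above; the endmember signatures and noise level are invented.

```python
# Linear mixing model: each pixel y = M a + n, with abundances a >= 0
# summing to one. A generic SLSQP solve recovers the fractions.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
bands, p = 50, 3
M = rng.uniform(0.0, 1.0, (bands, p))            # endmember signatures (columns)
a_true = np.array([0.6, 0.3, 0.1])               # fractional abundances
y = M @ a_true + rng.normal(0.0, 0.01, bands)    # observed pixel spectrum

res = minimize(lambda a: np.sum((M @ a - y) ** 2),
               x0=np.full(p, 1.0 / p),
               bounds=[(0.0, 1.0)] * p,
               constraints={"type": "eq", "fun": lambda a: a.sum() - 1.0},
               method="SLSQP")
print(res.x)   # close to a_true
```
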
In the second approach, ICA is based on the assumption of mutually independent sources, which is not the case for hyperspectral data, since the sum of the abundance fractions is constant, implying dependence among the abundances. This dependence compromises the applicability of ICA to hyperspectral images. In addition, hyperspectral data are immersed in noise, which degrades ICA performance. IFA [39] was introduced as a method for recovering independent hidden sources from their observed noisy mixtures. IFA implements two steps. First, source densities and noise covariance are estimated from the observed data by maximum likelihood. Second, sources are reconstructed by an optimal nonlinear estimator. Although IFA is a well-suited technique for unmixing independent sources under noisy observations, the dependence among abundance fractions in hyperspectral imagery compromises, as in the ICA case, IFA performance.

Considering the linear mixing model, hyperspectral observations lie in a simplex whose vertices correspond to the endmembers. Several approaches [40-43] have exploited this geometric feature of hyperspectral mixtures [42]. The minimum volume transform (MVT) algorithm [43] determines the simplex of minimum volume containing the data. MVT-type approaches are computationally complex: usually, these algorithms first find the convex hull defined by the observed data and then fit a minimum volume simplex to it. Aiming at lower computational complexity, algorithms such as vertex component analysis (VCA) [44], the pixel purity index (PPI) [42], and N-FINDR [45] also find the minimum volume simplex containing the data cloud, but they assume the presence in the data of at least one pure pixel of each endmember. This is a strong requisite that may not hold in some data sets; in any case, these algorithms find the set of most pure pixels in the data.

Hyperspectral sensors collect spatial images over many narrow contiguous bands, yielding large amounts of data. For this reason, the processing of hyperspectral data, including unmixing, is very often preceded by a dimensionality reduction step to reduce computational complexity and to improve the signal-to-noise ratio (SNR). Principal component analysis (PCA) [46], maximum noise fraction (MNF) [47], and singular value decomposition (SVD) [48] are three well-known projection techniques widely used in remote sensing in general and in unmixing in particular. A newly introduced method [49] exploits the structure of hyperspectral mixtures, namely the fact that spectral vectors are nonnegative. The computational complexity associated with these techniques is an obstacle to real-time implementations; to overcome this, band selection [50] and non-statistical [51] algorithms have been introduced.

This chapter addresses hyperspectral data source dependence and its impact on ICA and IFA performance. The study considers simulated and real data and is based on mutual information minimization. Hyperspectral observations are described by a generative model that takes into account the degradation mechanisms normally found in hyperspectral applications, namely signature variability [52-54], abundance constraints, topography modulation, and system noise. The computation of mutual information is based on fitting mixtures of Gaussians (MOG) to the data, with the MOG parameters (number of components, means, covariances, and weights) inferred using a minimum description length (MDL) based algorithm [55].
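
The dependence induced by the constant-sum constraint is easy to see numerically: abundances drawn from any Dirichlet distribution sum to one and are therefore negatively correlated, violating the independence assumption at the heart of ICA and IFA. A toy check:

```python
# Abundance fractions that sum to one cannot be mutually independent:
# the off-diagonal correlations of a symmetric Dirichlet are ~ -1/(K-1).
import numpy as np

rng = np.random.default_rng(0)
abundances = rng.dirichlet(alpha=[1.0, 1.0, 1.0], size=10000)
print(np.corrcoef(abundances.T))   # off-diagonal entries near -0.5, not 0
```
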
We study the behavior of the mutual information as a function of the unmixing matrix. The conclusion is that the unmixing matrix minimizing the mutual information may be very far from the true one. Nevertheless, some abundance fractions may be well separated, mainly in the presence of strong signature variability, a large number of endmembers, and high SNR. We end this chapter by sketching a new methodology to blindly unmix hyperspectral data, in which abundance fractions are modeled as a mixture of Dirichlet sources. This model enforces the positivity and constant-sum (full additivity) constraints on the sources. The mixing matrix is inferred by an expectation-maximization (EM)-type algorithm. This approach is in the vein of references 39 and 56, replacing the independent sources represented by MOG with a mixture of Dirichlet sources. Compared with the geometric-based approaches, the advantage of this model is that there is no need for pure pixels in the observations.

The chapter is organized as follows. Section 6.2 presents a spectral radiance model and formulates spectral unmixing as a linear problem accounting for abundance constraints, signature variability, topography modulation, and system noise. Section 6.3 presents a brief summary of the ICA and IFA algorithms. Section 6.4 illustrates the performance of IFA and of some well-known ICA algorithms with experimental data. Section 6.5 studies the limitations of ICA and IFA in unmixing hyperspectral data. Section 6.6 presents results of ICA based on real data. Section 6.7 describes the new blind unmixing scheme and some illustrative examples. Section 6.8 concludes with some remarks.
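
As a final illustration, the abundance prior in the sketched methodology can be simulated as a mixture of Dirichlet densities, every draw of which is nonnegative and fully additive; the weights and parameters below are made up.

```python
# Sampling abundances from a two-component mixture of Dirichlet densities:
# every sample is nonnegative and sums to one (full additivity).
import numpy as np

rng = np.random.default_rng(0)
weights = [0.7, 0.3]
alphas = [np.array([8.0, 2.0, 2.0]),   # component favouring endmember 1
          np.array([2.0, 2.0, 8.0])]   # component favouring endmember 3

comp = rng.choice(2, size=5, p=weights)
samples = np.stack([rng.dirichlet(alphas[k]) for k in comp])
print(samples, samples.sum(axis=1))    # rows sum to one
```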