55 results for ALS data-set


Relevance:

90.00%

Publisher:

Abstract:

Normal mixture models are often used to cluster continuous data. However, conventional approaches for fitting these models will have problems in producing nonsingular estimates of the component-covariance matrices when the dimension of the observations is large relative to the number of observations. In this case, methods such as principal components analysis (PCA) and the mixture of factor analyzers model can be adopted to avoid these estimation problems. We examine these approaches applied to the Cabernet wine data set of Ashenfelter (1999), considering the clustering of both the wines and the judges, and comparing our results with another analysis. The mixture of factor analyzers model proves particularly effective in clustering the wines, accurately classifying many of the wines by location.

Relevance:

90.00%

Publisher:

Abstract:

To account for the preponderance of zero counts and simultaneous correlation of observations, a class of zero-inflated Poisson mixed regression models is applicable for accommodating the within-cluster dependence. In this paper, a score test for zero-inflation is developed for assessing correlated count data with excess zeros. The sampling distribution and the power of the test statistic are evaluated by simulation studies. The results show that the test statistic performs satisfactorily under a wide range of conditions. The test procedure is further illustrated using a data set on recurrent urinary tract infections. Copyright (c) 2005 John Wiley & Sons, Ltd.
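As a minimal sketch of the distribution underlying the abstract above, the following shows the probability mass function of a plain zero-inflated Poisson (ZIP) variable. It does not implement the paper's mixed regression model or its score test; the function name `zip_pmf` and the parameter names `lam` (Poisson mean) and `pi` (probability of a structural zero) are illustrative assumptions.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson variable.

    With probability `pi` the outcome is a structural zero; otherwise it is
    drawn from a Poisson distribution with mean `lam`.
    (Names and parameterisation are assumptions for illustration.)
    """
    if k < 0:
        raise ValueError("k must be a non-negative integer")
    if not 0.0 <= pi <= 1.0:
        raise ValueError("pi must lie in [0, 1]")
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    extra_zero = pi if k == 0 else 0.0
    return extra_zero + (1.0 - pi) * poisson


# Setting pi = 0 recovers the ordinary Poisson probabilities, which is the
# null hypothesis that a test for zero-inflation examines.
p_zero_inflated = zip_pmf(0, lam=2.0, pi=0.3)
p_zero_poisson = zip_pmf(0, lam=2.0, pi=0.0)
```

Comparing `p_zero_inflated` with `p_zero_poisson` shows how the extra mass at zero grows with `pi` while the other counts are scaled down by `1 - pi`.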

Relevance:

90.00%

Publisher:

Abstract:

The paper investigates a Bayesian hierarchical model for the analysis of categorical longitudinal data from a large social survey of immigrants to Australia. Data for each subject are observed on three separate occasions, or waves, of the survey. One of the features of the data set is that observations for some variables are missing for at least one wave. A model for the employment status of immigrants is developed by introducing, at the first stage of a hierarchical model, a multinomial model for the response and then subsequent terms are introduced to explain wave and subject effects. To estimate the model, we use the Gibbs sampler, which allows missing data for both the response and the explanatory variables to be imputed at each iteration of the algorithm, given some appropriate prior distributions. After accounting for significant covariate effects in the model, results show that the relative probability of remaining unemployed diminished with time following arrival in Australia.

Relevance:

90.00%

Publisher:

Abstract:

Traditional vegetation mapping methods use high-cost, labour-intensive aerial photography interpretation. This approach can be subjective and is limited by factors such as the extent of remnant vegetation, and the differing scale and quality of aerial photography over time. An alternative approach is proposed which integrates a data model, a statistical model and an ecological model using sophisticated Geographic Information Systems (GIS) techniques and rule-based systems to support fine-scale vegetation community modelling. This approach is based on a more realistic representation of vegetation patterns with transitional gradients from one vegetation community to another. Arbitrary, though often unrealistic, sharp boundaries can be imposed on the model by the application of statistical methods. This GIS-integrated multivariate approach is applied to the problem of vegetation mapping in the complex vegetation communities of the Innisfail Lowlands in the Wet Tropics bioregion of Northeastern Australia. The paper presents the full cycle of this vegetation modelling approach, including sampling sites, variable selection, model selection, model implementation, internal model assessment, model prediction assessments, integration of discrete vegetation community models to generate a composite pre-clearing vegetation map, independent data set model validation and scale assessments of model predictions. An accurate pre-clearing vegetation map of the Innisfail Lowlands was generated (r^2 = 0.83) through GIS integration of 28 separate statistical models. This modelling approach has good potential for wider application, including provision of vital information for conservation planning and management; a scientific basis for rehabilitation of disturbed and cleared areas; and a viable method for the production of adequate vegetation maps for conservation and forestry planning of poorly-studied areas. (c) 2006 Elsevier B.V. All rights reserved.

Relevance:

90.00%

Publisher:

Abstract:

Objective: An estimation of cut-off points for the diagnosis of diabetes mellitus (DM) based on individual risk factors. Methods: A subset of the 1991 Oman National Diabetes Survey is used, including all patients with a 2h post glucose load >= 200 mg/dl (278 subjects) and a control group of 286 subjects. All subjects previously diagnosed as diabetic and all subjects with missing data values were excluded. The data set was analyzed by use of the SPSS Clementine data mining system. Decision Tree Learners (C5 and CART) and a method for mining association rules (the GRI algorithm) are used. The fasting plasma glucose (FPG), age, sex, family history of diabetes and body mass index (BMI) are input risk factors (independent variables), while diabetes onset (the 2h post glucose load >= 200 mg/dl) is the output (dependent variable). All three techniques used were tested by use of cross-validation (89.8%). Results: Rules produced for diabetes diagnosis are: A- GRI algorithm (1) FPG>=108.9 mg/dl, (2) FPG>=107.1 and age>39.5 years. B- CART decision trees: FPG >=110.7 mg/dl. C- The C5 decision tree learner: (1) FPG>=95.5 and 54, (2) FPG>=106 and 25.2 kg/m2. (3) FPG>=106 and =133 mg/dl. The three techniques produced rules which cover a significant number of cases (82%), with confidence between 74 and 100%. Conclusion: Our approach supports the suggestion that the present cut-off value of fasting plasma glucose (126 mg/dl) for the diagnosis of diabetes mellitus needs revision, and that individual risk factors such as age and BMI should be considered in defining the new cut-off value.
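To make the reported cut-offs concrete, the GRI and CART rules quoted in the abstract above can be written as simple threshold checks. This is only an illustrative sketch: the function names are hypothetical, combining the two GRI rules with a logical OR is an assumption, and the C5 rules are left out because their thresholds are incomplete in the text.

```python
def gri_rule_positive(fpg_mg_dl, age_years):
    """GRI rules as quoted: (1) FPG >= 108.9 mg/dl, or
    (2) FPG >= 107.1 mg/dl and age > 39.5 years.
    Treating the two rules as alternatives (OR) is an assumption."""
    rule_1 = fpg_mg_dl >= 108.9
    rule_2 = fpg_mg_dl >= 107.1 and age_years > 39.5
    return rule_1 or rule_2


def cart_rule_positive(fpg_mg_dl):
    """CART rule as quoted: FPG >= 110.7 mg/dl."""
    return fpg_mg_dl >= 110.7


# Hypothetical subject: fasting plasma glucose 107.5 mg/dl, aged 45 years.
flagged_by_gri = gri_rule_positive(107.5, 45)
flagged_by_cart = cart_rule_positive(107.5)
```

For this hypothetical subject the age-dependent GRI rule fires while the CART threshold does not, which illustrates why the abstract argues for taking individual risk factors into account.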

Relevance:

80.00%

Publisher:

Abstract:

A large number of models have been derived from the two-parameter Weibull distribution and are referred to as Weibull models. They exhibit a wide range of shapes for the density and hazard functions, which makes them suitable for modelling complex failure data sets. The WPP and IWPP plots allow one to determine in a systematic manner whether one or more of these models are suitable for modelling a given data set. This paper deals with this topic.
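A minimal sketch of the Weibull probability plot (WPP) idea mentioned above, for the standard two-parameter case only: with F(t) = 1 - exp(-(t/scale)^shape), plotting y = ln(-ln(1 - F)) against x = ln(t) gives a straight line with slope equal to the shape parameter. The function names, the use of Bernard's median-rank approximation for F, and the least-squares fit are assumptions made for illustration; the IWPP and the derived Weibull models discussed in the paper are not covered.

```python
import math


def wpp_points(times):
    """Return (ln t, ln(-ln(1 - F))) pairs for sorted failure times,
    with F estimated by Bernard's median-rank approximation."""
    ordered = sorted(times)
    n = len(ordered)
    points = []
    for i, t in enumerate(ordered, start=1):
        f = (i - 0.3) / (n + 0.4)
        points.append((math.log(t), math.log(-math.log(1.0 - f))))
    return points


def fit_line(points):
    """Ordinary least-squares slope and intercept through the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def weibull_from_wpp(times):
    """Estimate (shape, scale) from the straight line on the WPP.

    The line is y = shape * (x - ln(scale)), so the slope gives the shape
    and the intercept gives the scale."""
    slope, intercept = fit_line(wpp_points(times))
    return slope, math.exp(-intercept / slope)
```

If the points from `wpp_points` deviate systematically from a straight line, the plain two-parameter model is a poor fit, and that is the kind of visual check the plots are meant to support.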

Relevance:

80.00%

Publisher:

Abstract:

The present study details new turbulence field measurements conducted continuously at high frequency for 50 hours in the upper zone of a small subtropical estuary with semi-diurnal tides. Acoustic Doppler velocimetry was used, and the signal was post-processed thoroughly. The suspended sediment concentration was further deduced from the acoustic backscatter intensity. The field data set demonstrated some unique flow features of the upstream estuarine zone, including some low-frequency longitudinal oscillations induced by internal and external resonance. A striking feature of the data set is the large fluctuations in all turbulence properties and suspended sediment concentration during the tidal cycle. This feature has rarely been documented.

Relevance:

80.00%

Publisher:

Abstract:

In natural estuaries, scalar dispersion is rarely predicted accurately because of a lack of fundamental understanding of the turbulence structure in estuaries. Herein detailed turbulence field measurements were conducted continuously at high frequency for 50 hours in the upper zone of a small subtropical estuary with semi-diurnal tides. Acoustic Doppler velocimetry was deemed the most appropriate measurement technique for such shallow water depths (less than 0.4 m at low tides), and a thorough post-processing technique was applied. In addition, some experiments were conducted in the laboratory under controlled conditions using water and soil samples collected in the estuary to test the relationship between acoustic backscatter strength and suspended sediment load. A striking feature of the field data set was the large fluctuations in all turbulence characteristics during the tidal cycle, including the suspended sediment flux. This feature has rarely been documented.

Relevance:

80.00%

Publisher:

Abstract:

To simulate cropping systems, crop models must not only give reliable predictions of yield across a wide range of environmental conditions, they must also quantify water and nutrient use well, so that the status of the soil at maturity is a good representation of the starting conditions for the next cropping sequence. To assess the suitability for this task a range of crop models, currently used in Australia, were tested. The models differed in their design objectives, complexity and structure and were (i) tested on diverse, independent data sets from a wide range of environments and (ii) model components were further evaluated with one detailed data set from a semi-arid environment. All models were coded into the cropping systems shell APSIM, which provides a common soil water and nitrogen balance. Crop development was input, thus differences between simulations were caused entirely by difference in simulating crop growth. Under nitrogen non-limiting conditions between 73 and 85% of the observed kernel yield variation across environments was explained by the models. This ranged from 51 to 77% under varying nitrogen supply. Water and nitrogen effects on leaf area index were predicted poorly by all models resulting in erroneous predictions of dry matter accumulation and water use. When measured light interception was used as input, most models improved in their prediction of dry matter and yield. This test highlighted a range of compensating errors in all modelling approaches. Time course and final amount of water extraction was simulated well by two models, while others left up to 25% of potentially available soil water in the profile. Kernel nitrogen percentage was predicted poorly by all models due to its sensitivity to small dry matter changes. Yield and dry matter could be estimated adequately for a range of environmental conditions using the general concepts of radiation use efficiency and transpiration efficiency. 
However, leaf area and kernel nitrogen dynamics need to be improved to achieve better estimates of water and nitrogen use if such models are to be used to evaluate cropping systems. (C) 1998 Elsevier Science B.V.

Relevance:

80.00%

Publisher:

Abstract:

A simple theoretical framework is presented for bioassay studies using three-component in vitro systems. An equilibrium model is used to derive equations useful for predicting changes in biological response after addition of hormone-binding protein or as a consequence of increased hormone affinity. Sets of possible solutions for receptor occupancy and binding protein occupancy are found for typical values of receptor and binding protein affinity constants. Unique equilibrium solutions are dictated by the initial condition of total hormone concentration. According to the occupancy theory of drug action, increasing the affinity of a hormone for its receptor will result in a proportional increase in biological potency. However, the three-component model predicts that the magnitude of increase in biological potency will be a small fraction of the proportional increase in affinity. With typical initial conditions a two-fold increase in hormone affinity for its receptor is predicted to result in only a 33% increase in biological response. Under the same conditions an 11-fold increase in hormone affinity for receptor would be needed to produce a two-fold increase in biological potency. Some currently used bioassay systems may be unrecognized three-component systems and gross errors in biopotency estimates will result if the effect of binding protein is not calculated. An algorithm derived from the three-component model is used to predict changes in biological response after addition of binding protein to in vitro systems. The algorithm is tested by application to a published data set from an experimental study in an in vitro system (Lim et al., 1990, Endocrinology 127, 1287-1291). Predicted changes show good agreement (within 8%) with experimental observations. (C) 1998 Academic Press Limited.

Relevance:

80.00%

Publisher:

Abstract:

This study extends previous attempts to assess emotion with single adjective descriptors, by examining semantic as well as cognitive, motivational, and intensity features of emotions. The focus was on seven negative emotions common to several emotion typologies: anger, fear, sadness, shame, pity, jealousy, and contempt. For each of these emotions, seven items were generated corresponding to cognitive appraisal about the self, cognitive appraisal about the environment, action tendency, action fantasy, synonym, antonym, and intensity range of the emotion, respectively. A pilot study established that 48 of the 49 items were linked predominantly to the specific emotions as predicted. The main data set comprising 700 subjects' ratings of relatedness between items and emotions was subjected to a series of factor analyses, which revealed that 44 of the 49 items loaded on the emotion constructs as predicted. A final factor analysis of these items uncovered seven factors accounting for 39% of the variance. These emergent factors corresponded to the hypothesized emotion constructs, with the exception of anger and fear, which were somewhat confounded. These findings lay the groundwork for the construction of an instrument to assess emotions multicomponentially.

Relevance:

80.00%

Publisher:

Abstract:

We assembled a globally-derived data set for site-averaged foliar delta(15)N, the delta(15)N of whole surface mineral soil and corresponding site factors (mean annual rainfall and temperature, latitude, altitude and soil pH). The delta(15)N of whole soil was related to all of the site variables (including foliar delta(15)N) except altitude and, when regressed on latitude and rainfall, provided the best model of these data, accounting for 49% of the variation in whole soil delta(15)N. As single linear regressions, site-averaged foliar delta(15)N was more strongly related to rainfall than was whole soil delta(15)N. A smaller data set showed similar, negative correlations between whole soil delta(15)N, site-averaged foliar delta(15)N and soil moisture variations during a single growing season. The negative correlation between water availability (measured here by rainfall and temperature) and soil or plant delta(15)N fails at the landscape scale, where wet spots are delta(15)N-enriched relative to their drier surroundings. Here we present global and seasonal data, postulate a proximate mechanism for the overall relationship between water availability and ecosystem delta(15)N and, newly, a mechanism accounting for the highly delta(15)N-depleted values found in the foliage and soils of many wet/cold ecosystems. These hypotheses are complemented by documentation of the present gaps in knowledge, suggesting lines of research which will provide new insights into terrestrial N-cycling. Our conclusions are consistent with those of Austin and Vitousek (1998) that foliar (and soil) delta(15)N appear to be related to the residence time of whole ecosystem N.

Relevance:

80.00%

Publisher:

Abstract:

The use of long-term forecasts of pest pressure is central to better pest management. We relate the Southern Oscillation Index (SOI) and the Sea Surface Temperature (SST) to long-term light-trap catches of the two key moth pests of Australian agriculture, Helicoverpa punctigera (Wallengren) and H. armigera (Hubner), at Narrabri, New South Wales over 11 years, and for H. punctigera only at Turretfield, South Australia over 22 years. At Narrabri, the size of the first spring generation of both species was significantly correlated with the SOI in certain months, sometimes up to 15 months before the date of trapping. Differences in the SOI and SST between significant months were used to build composite variables in multiple regressions which gave fitted values of the trap catches to less than 25% of the observed values. The regressions suggested that useful forecasts of both species could be made 6-15 months ahead. The influence of the two weather variables on trap catches of H. punctigera at Turretfield was not as strong as at Narrabri, probably because the SOI is not as strongly related to rainfall in southern Australia as it is in eastern Australia. The best fits were again given by multiple regressions with SOI plus SST variables, to within 40% of the observed values. The reliability of both variables as predictors of moth numbers may be limited by the lack of stability in the SOI-rainfall correlation over the historical record. As no other data set is available to test the regressions, they can only be tested by future use. The use of long-term forecasts in pest management is discussed, and preliminary analyses of other long sets of insect numbers suggest that the Southern Oscillation Index may be a useful predictor of insect numbers in other parts of the world.

Relevance:

80.00%

Publisher:

Abstract:

To investigate whether there are gender differences in the bone geometry of the proximal femur during the adolescent years, we used an interactive computer program, "Hip Strength Analysis", developed by Beck and associates (Beck et al., Invest Radiol. 1990, 25:6-18) to derive femoral neck geometry parameters from DXA bone scans (Hologic 2000, array mode). We analyzed a longitudinal data-set collected on 70 boys and 68 girls over a seven-year period. Distance and velocity curves for height were fitted for each child utilizing a cubic spline procedure, and the age of peak height velocity (PHV) was determined. To control for maturational differences between children of the same chronological age and between boys and girls, section modulus (Z), an index of bending strength, cross-sectional area of bone (CSA), sub-periosteal width (SPW), and BMD values at the neck and shaft of the proximal femur were determined for points on each individual's curve at the age of PHV and one and two years on either side of peak. To control for size differences, height and weight were introduced as covariates in the two-way analyses of variance looking at gender over time measured at the maturational age points (-2, -1, age of PHV, +1, +2). The following figure presents the results of the analyses on two variables, BMD and Z at neck and shaft regions. After the age of peak linear growth (PHV), independent of body size, there was a gender difference in BMD at the shaft but not at the neck. Section modulus at both sites indicated that male bones became significantly stronger after PHV. Underlying these maturational changes, male bones became wider (SPW) after PHV in both the neck and shaft and enclosed more material (CSA) at all maturational age points at both regions. These results call into question the emphasis on using BMD as a measure of skeletal integrity in growing children.

Relevance:

80.00%

Publisher:

Abstract:

Background: A variety of methods for prediction of peptide binding to major histocompatibility complex (MHC) have been proposed. These methods are based on binding motifs, binding matrices, hidden Markov models (HMM), or artificial neural networks (ANN). There has been little prior work on the comparative analysis of these methods. Materials and Methods: We performed a comparison of the performance of six methods applied to the prediction of two human MHC class I molecules, including binding matrices and motifs, ANNs, and HMMs. Results: The selection of the optimal prediction method depends on the amount of available data (the number of peptides of known binding affinity to the MHC molecule of interest), the biases in the data set and the intended purpose of the prediction (screening of a single protein versus mass screening). When little or no peptide data are available, binding motifs are the most useful alternative to random guessing or use of a complete overlapping set of peptides for selection of candidate binders. As the number of known peptide binders increases, binding matrices and HMM become more useful predictors. ANN and HMM are the predictive methods of choice for MHC alleles with more than 100 known binding peptides. Conclusion: The ability of bioinformatic methods to reliably predict MHC binding peptides, and thereby potential T-cell epitopes, has major implications for clinical immunology, particularly in the area of vaccine design.
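Of the method families compared above, the binding matrix is the simplest to illustrate: each peptide position contributes a score for the residue found there, and candidate peptides from a protein are ranked by their total. The sketch below is a toy example only. The function names, the three-position matrix and its values are invented for illustration and are not taken from the study.

```python
# Toy position-specific scoring matrix: one dict per peptide position,
# mapping amino-acid letters to hypothetical scores. Residues that are not
# listed score 0. A real matrix would span the full peptide length.
TOY_MATRIX = [
    {"L": 2.0, "M": 1.5},
    {"A": 1.0},
    {"V": 3.0, "L": 2.5},
]


def score_peptide(peptide, matrix, default=0.0):
    """Sum the per-position scores of a peptide under a binding matrix."""
    if len(peptide) != len(matrix):
        raise ValueError("peptide length must match the matrix length")
    return sum(pos.get(aa, default) for aa, pos in zip(peptide, matrix))


def rank_candidates(protein, matrix, top=3):
    """Score every overlapping window of the protein and return the best."""
    width = len(matrix)
    windows = [protein[i:i + width] for i in range(len(protein) - width + 1)]
    ranked = sorted(windows, key=lambda p: score_peptide(p, matrix), reverse=True)
    return ranked[:top]


# Screening a single hypothetical protein sequence for its best candidate.
best = rank_candidates("GLAVK", TOY_MATRIX, top=1)
```

Scoring every overlapping window corresponds to the "complete overlapping set of peptides" mentioned in the abstract, with the matrix used to prioritise which of those peptides to test.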