936 results for Nonparametric discriminant analysis


Relevance: 30.00%

Abstract:

This paper presents an analysis of motor vehicle insurance claims relating to vehicle damage and to associated medical expenses. We use univariate severity distributions estimated with parametric and nonparametric methods. The methods are implemented using the statistical package R. Parametric analysis is limited to estimation of normal and lognormal distributions for each of the two claim types. The nonparametric analysis presented involves kernel density estimation. We illustrate the benefits of applying transformations to data prior to employing kernel-based methods. We use a log-transformation and an optimal transformation amongst a class of transformations that produces symmetry in the data. The central aim of this paper is to provide educators with material that can be used in the classroom to teach statistical estimation methods, goodness-of-fit analysis and, importantly, statistical computing in the context of insurance and risk management. To this end, we have included in the Appendix of this paper all the R code that has been used in the analysis so that readers, both students and educators, can fully explore the techniques described.
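The log-transformation idea in this abstract can be sketched as follows. This is a minimal illustration in Python rather than the paper's R code, which is not reproduced here. The function names, the normal-reference bandwidth rule, and the claim amounts are all assumptions made for the example.

```python
import math

def gaussian_kde(sample, bandwidth):
    """Return a function x -> Gaussian kernel density estimate at x."""
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density

def normal_reference_bandwidth(sample):
    """Rule-of-thumb bandwidth 1.06 * sd * n^(-1/5) (an assumed choice)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in sample) / (n - 1))
    return 1.06 * sd * n ** (-0.2)

def log_transformed_kde(claims):
    """Estimate the density of positive claims via a KDE on log(claims)."""
    logs = [math.log(c) for c in claims]
    g = gaussian_kde(logs, normal_reference_bandwidth(logs))
    # Change of variables: if Y = log X has density g,
    # then X has density g(log x) / x for x > 0.
    return lambda x: g(math.log(x)) / x if x > 0 else 0.0

# Hypothetical, made-up claim amounts (right-skewed, as severities often are).
claims = [120.0, 250.0, 310.0, 480.0, 950.0, 1800.0, 7200.0]
f = log_transformed_kde(claims)
print(f(500.0), f(5000.0))
```

Estimating on the log scale and mapping back keeps the estimate at zero for non-positive amounts and avoids spending a single fixed bandwidth on a long right tail.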

Relevance: 30.00%

Abstract:

Triatoma dimidiata is one of the major vectors of Chagas disease in Latin America. Its range includes Mexico, all countries of Central America, Colombia, and Ecuador. In light of recent genetic analysis suggesting that the possible origin of this species is the Yucatan peninsula, we have analyzed populations from the states of Yucatan, San Luis Potosi, and Veracruz in Mexico, and a population from the southern region of the Yucatan peninsula located in Northern Guatemala, the region of El Peten. Classical morphometric analyses were performed, including principal component and discriminant analysis, sexual dimorphism, and wing asymmetry. San Luis Potosi and Veracruz populations were indistinguishable while clearly separate from Yucatan and Peten populations. Despite important genetic differences, Yucatan and Peten populations were highly similar. Yucatan specimens were the smallest in size, while females were larger than males in all populations. Only head characters were necessary to distinguish population-level differences, although wing fluctuating asymmetry was present in all populations. These results are discussed in light of recent findings suggesting genetic polymorphism in most populations of Triatoma dimidiata south of Chiapas to Ecuador.

Relevance: 30.00%

Abstract:

BACKGROUND: Finding genes that are differentially expressed between conditions is an integral part of understanding the molecular basis of phenotypic variation. In the past decades, DNA microarrays have been used extensively to quantify the abundance of mRNA corresponding to different genes, and more recently high-throughput sequencing of cDNA (RNA-seq) has emerged as a powerful competitor. As the cost of sequencing decreases, it is conceivable that the use of RNA-seq for differential expression analysis will increase rapidly. To exploit the possibilities and address the challenges posed by this relatively new type of data, a number of software packages have been developed especially for differential expression analysis of RNA-seq data. RESULTS: We conducted an extensive comparison of eleven methods for differential expression analysis of RNA-seq data. All methods are freely available within the R framework and take as input a matrix of counts, i.e. the number of reads mapping to each genomic feature of interest in each of a number of samples. We evaluate the methods based on both simulated data and real RNA-seq data. CONCLUSIONS: Very small sample sizes, which are still common in RNA-seq experiments, impose problems for all evaluated methods and any results obtained under such conditions should be interpreted with caution. For larger sample sizes, the methods combining a variance-stabilizing transformation with the 'limma' method for differential expression analysis perform well under many different conditions, as does the nonparametric SAMseq method.

Relevance: 30.00%

Abstract:

The research considers the problem of spatial data classification using machine learning algorithms: probabilistic neural networks (PNN) and support vector machines (SVM). A simple k-nearest neighbor algorithm is considered as a benchmark model. PNN is a neural network reformulation of well-known nonparametric principles of probability density modeling using a kernel density estimator and Bayesian optimal or maximum a posteriori decision rules. PNN is well suited to problems where not only predictions but also quantification of accuracy and integration of prior information are necessary. An important property of PNNs is that they can easily be used in decision support systems dealing with problems of automatic classification. The support vector machine is an implementation of the principles of statistical learning theory for classification tasks. SVMs have recently been applied successfully to various environmental topics: classification of soil types and hydrogeological units, optimization of monitoring networks, and susceptibility mapping of natural hazards. In the present paper, both simulated and real data case studies (low- and high-dimensional) are considered. The main attention is paid to the detection and learning of spatial patterns by the algorithms applied.
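The description of PNN as kernel density estimation combined with a maximum a posteriori rule can be sketched as below. This is an illustrative toy, not the authors' implementation. The function name, the single shared kernel width `sigma`, the default priors, and the two-class coordinates are assumptions made for the example.

```python
import math

def pnn_posteriors(x, training, sigma, priors=None):
    """Posterior class probabilities from Gaussian Parzen density estimates.

    training: dict mapping class label -> list of feature tuples.
    priors:   optional dict label -> prior probability; defaults to the
              class frequencies in the training data (an assumption).
    """
    n_total = sum(len(points) for points in training.values())
    scores = {}
    for label, points in training.items():
        # Class-conditional density estimate. The Gaussian normalising
        # constant is identical for all classes when sigma is shared,
        # so it cancels in the posterior and is omitted here.
        kernel_sum = sum(
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2.0 * sigma ** 2))
            for p in points
        )
        prior = priors[label] if priors else len(points) / n_total
        scores[label] = prior * kernel_sum / len(points)
    total = sum(scores.values())
    if total == 0.0:
        # Every kernel underflowed: fall back to an uninformative answer.
        return {label: 1.0 / len(scores) for label in scores}
    return {label: s / total for label, s in scores.items()}

# Hypothetical spatial samples: (x, y) coordinates labelled by soil type.
training = {
    "sand": [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3)],
    "clay": [(1.0, 1.0), (0.9, 1.2), (1.1, 0.8)],
}
post = pnn_posteriors((0.1, 0.1), training, sigma=0.3)
predicted = max(post, key=post.get)  # maximum a posteriori decision
print(predicted, post)
```

Returning the full posterior rather than only the winning label reflects the point made in the abstract: the same computation yields both a prediction and a measure of its uncertainty, and prior information enters through `priors`.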

Relevance: 30.00%

Abstract:

PURPOSE: In contrast to other human tumors, a repression of the cell-surface glycoprotein CD44 on neuroblastoma is a marker of aggressiveness that usually correlates to N-myc amplification. We thus compared the prognostic value of both markers in the initial staging of 121 children treated for neuroblastoma in collaborative institutions. METHODS: Frozen samples were analyzed by a rapid and well-standardized technique of immunostaining with monoclonal antibodies (MoAbs) against epitopes in the CD44 constant region. RESULTS: In this retrospective series, CD44 was expressed on 102 specimens and strongly correlated with favorable tumor stages and histology, younger age, and normal N-myc copy numbers. In univariate analysis, CD44 expression and normal N-myc were the most powerful markers of favorable clinical outcome (P < 10^-6, χ² = 65.40 and P < 10^-6, χ² = 42.56, respectively), but analysis of CD44 affords significant prognostic discrimination in subgroups of patients with or without N-myc-amplified tumors. In the subgroup of stage IV neuroblastomas, CD44 was the only significant prognostic marker (P < .02, χ² = 5.76), whereas N-myc status was not discriminant. In multivariate analysis of five factors, i.e., N-myc amplification, CD44 expression, age, tumor stage, and histology, the only independent prognostic factors of event-free survival were CD44 expression and tumor stage. CONCLUSION: The analysis of CD44 cell-surface expression must be recommended as an additional biologic marker in the initial staging of the disease.

Relevance: 30.00%

Abstract:

Soil penetration resistance (PR) is a measure of soil compaction closely related to soil structure and plant growth. However, the variability in PR hampers statistical analyses. This study aimed to evaluate the effect of the variability of soil PR on the efficiency of parametric and nonparametric analyses in identifying significant effects of soil compaction and to classify the coefficient of variation of PR into low, intermediate, high and very high. On six dates, the PR of a typical dystrophic Red Ultisol under continuous no-tillage for 16 years was measured. Three tillage and/or traffic conditions were established with the application of: (i) no chiseling or additional traffic, (ii) additional compaction, and (iii) chiseling. On each date, the nineteen PR measurements (taken every 1.5 cm to a depth of 28.5 cm) were grouped in layers of different thickness. In each layer, the treatment effects were evaluated by analysis of variance (ANOVA) and Kruskal-Wallis analyses in a completely randomized design, and the coefficients of variation of all analyses were classified (low, intermediate, high and very high). The ANOVA performed better in discriminating the compaction effects, but the rejection rate of the null hypothesis decreased from 100 to 80 % when the coefficient of variation increased from 15 to 26 %. The values of 15 and 26 % were the thresholds separating the low/intermediate and the high/very high coefficient of variation classes of PR in this Ultisol.

Relevance: 30.00%

Abstract:

Purpose: Previous studies of the visual outcome in bilateral non-arteritic anterior ischemic optic neuropathy (NAION) have yielded conflicting results, specifically regarding congruity between fellow eyes. Prior studies have used measures of acuity and computerized perimetry but none has compared Goldmann visual field outcomes between fellow eyes. In order to better define the concordance of visual loss in this condition, we reviewed our cases of bilateral sequential NAION, including measures of visual acuity, pupillary function and both pattern and severity of visual field loss. Methods: We performed a retrospective chart review of 102 patients with a diagnosis of bilateral sequential NAION. Of the 102 patients, 86 were included in the study for analysis of final visual outcome between the affected eyes. Visual function was assessed using visual acuity, Goldmann visual fields, color vision and RAPD. A quantitative total visual field score and score per quadrant was analyzed for each eye using the numerical Goldmann visual field scoring method previously described by Esterman and colleagues. Based upon these scores, we calculated the total deviation and pattern deviation between fellow eyes and between eyes of different patients. Statistical significance was determined using nonparametric tests. Results: A statistically significant correlation was found between fellow eyes for multiple parameters, including logMAR visual acuity (P = 0.0101), global visual field (P = 0.0001), superior visual field (P = 0.0001), and inferior visual field (P = 0.0001). In addition, the mean deviation of both total (P = 0.0000000007) and pattern (P = 0.000000004) deviation analyses was significantly less between fellow eyes ("intra"-eyes) than between eyes of different patients ("inter"-eyes). Conclusions: Visual function between fellow eyes showed a fair to moderate correlation that was statistically significant. The pattern of vision loss was also more similar in fellow eyes than between eyes of different patients. These results may help allow better prediction of visual outcome for the second eye in patients with NAION. These findings may also be useful for evaluating efficacy of therapeutic interventions.

Relevance: 30.00%

Abstract:

Biological scaling analyses employing the widely used bivariate allometric model are beset by at least four interacting problems: (1) choice of an appropriate best-fit line with due attention to the influence of outliers; (2) objective recognition of divergent subsets in the data (allometric grades); (3) potential restrictions on statistical independence resulting from phylogenetic inertia; and (4) the need for extreme caution in inferring causation from correlation. A new non-parametric line-fitting technique has been developed that eliminates requirements for normality of distribution, greatly reduces the influence of outliers and permits objective recognition of grade shifts in substantial datasets. This technique is applied in scaling analyses of mammalian gestation periods and of neonatal body mass in primates. These analyses feed into a re-examination, conducted with partial correlation analysis, of the maternal energy hypothesis relating to mammalian brain evolution, which suggests links between body size and brain size in neonates and adults, gestation period and basal metabolic rate. Much has been made of the potential problem of phylogenetic inertia as a confounding factor in scaling analyses. However, this problem may be less severe than suspected earlier because nested analyses of variance conducted on residual variation (rather than on raw values) reveal that there is considerable variance at low taxonomic levels. In fact, limited divergence in body size between closely related species is one of the prime examples of phylogenetic inertia. One common approach to eliminating perceived problems of phylogenetic inertia in allometric analyses has been calculation of 'independent contrast values'. It is demonstrated that the reasoning behind this approach is flawed in several ways.
Calculation of contrast values for closely related species of similar body size is, in fact, highly questionable, particularly when there are major deviations from the best-fit line for the scaling relationship under scrutiny.

Relevance: 30.00%

Abstract:

We used biotinylated dextran amine (BDA) to anterogradely label individual axons projecting from primary somatosensory cortex (S1) to four different cortical areas in rats. A major goal was to determine whether axon terminals in these target areas shared morphometric similarities based on the shape of individual terminal arbors and the density of two bouton types: en passant (Bp) and terminaux (Bt). Evidence from tridimensional reconstructions of isolated axon terminal fragments (n=111) did support a degree of morphological heterogeneity establishing two broad groups of axon terminals. Morphological parameters associated with the complexity of terminal arbors and the proportion of beaded Bp vs stalked Bt were found to differ significantly in these two groups following a discriminant function statistical analysis across axon fragments. Interestingly, both groups occurred in all four target areas, possibly consistent with a commonality of presynaptic processing of tactile information. These findings lay the ground for additional work aiming to investigate synaptic function at the single bouton level and see how this might be associated with emerging properties in postsynaptic targets.

Relevance: 30.00%

Abstract:

McCausland (2004a) describes a new theory of random consumer demand. Theoretically consistent random demand can be represented by a "regular" "L-utility" function on the consumption set X. The present paper is about Bayesian inference for regular L-utility functions. We express prior and posterior uncertainty in terms of distributions over the indefinite-dimensional parameter set of a flexible functional form. We propose a class of proper priors on the parameter set. The priors are flexible, in the sense that they put positive probability in the neighborhood of any L-utility function that is regular on a large subset bar(X) of X; and regular, in the sense that they assign zero probability to the set of L-utility functions that are irregular on bar(X). We propose methods of Bayesian inference for an environment with indivisible goods, leaving the more difficult case of indefinitely divisible goods for another paper. We analyse individual choice data from a consumer experiment described in Harbaugh et al. (2001).

Relevance: 30.00%

Abstract:

In the analysis of tax reform, when equity is traded off against efficiency, the measurement of the latter requires us to know how tax-induced price changes affect quantities supplied and demanded. In this paper, we present various econometric procedures for estimating how taxes affect demand.

Relevance: 30.00%

Abstract:

Gait analysis has recently emerged as one of the most important medical fields. Marker-based systems are the most favored methods for human motion assessment and gait analysis; however, these systems require specific equipment and expertise and are cumbersome, costly, and difficult to use. Many recent approaches based on computer vision have been developed to reduce the cost of motion capture systems while ensuring highly accurate results. In this thesis, we present our new low-cost gait analysis system, which consists of two monocular video cameras placed on the left and right sides of a treadmill. A 2D model of each half of the human skeleton is reconstructed from each view based on dynamic color segmentation, and gait analysis is then performed on these two models. Validation against a state-of-the-art vision-based motion capture system (using the Microsoft Kinect) and against ground truth (with markers) was carried out to demonstrate the robustness and efficiency of our system. The mean error of the human skeleton model estimate relative to ground truth, for our method versus the Kinect, is very promising: the joint angles of the thighs (6.29° versus 9.68°), legs (7.68° versus 11.47°), and feet (6.14° versus 13.63°) and the stride length (6.14 cm versus 13.63 cm) are better and more stable than those of the Kinect, while the system maintains an accuracy fairly close to that of the Kinect for the arms (7.29° versus 6.12°), the lower arms (8.33° versus 8.04°), and the torso (8.69° versus 6.47°). Based on the skeleton model obtained by each method, we carried out a symmetry study on different joints (elbow, knee, and ankle) using each method on three different subjects to see which method more effectively distinguishes the symmetry/asymmetry characteristic of gait. In our test, our system gave a maximum knee angle of 8.97° and 13.86° for normal and asymmetric walks respectively, while the Kinect gave 10.58° and 11.94°. Compared with the ground truth, 7.64° and 14.34°, our system showed greater accuracy and discriminating power between the two cases.

Relevance: 30.00%

Abstract:

This thesis is entitled "Modelling and analysis of recurrent event data with multiple causes". Survival data is a term used to describe data that measure the time to occurrence of an event. In survival studies, the time to occurrence of an event is generally referred to as lifetime. Recurrent event data are commonly encountered in longitudinal studies when individuals are followed to observe the repeated occurrences of certain events. In many practical situations, individuals under study are exposed to failure due to more than one cause, and the eventual failure can be attributed to exactly one of these causes. The proposed model was useful in real-life situations to study the effect of covariates on recurrences of certain events due to different causes. In Chapter 3, an additive hazards model for gap time distributions of recurrent event data with multiple causes was introduced. The parameter estimation and asymptotic properties were discussed. In Chapter 4, a shared frailty model for the analysis of bivariate competing risks data was presented, and the estimation procedures for the shared gamma frailty model, without covariates and with covariates, using the EM algorithm were discussed. In Chapter 6, two nonparametric estimators for the bivariate survivor function of paired recurrent event data were developed. The asymptotic properties of the estimators were studied. The proposed estimators were applied to a real-life data set. Simulation studies were carried out to assess the efficiency of the proposed estimators.

Relevance: 30.00%

Abstract:

So far, in the bivariate setup, the analysis of lifetime (failure time) data with multiple causes of failure has been done by treating each cause of failure separately, with failures from other causes considered as independent censoring. This approach is unrealistic in many situations. For example, in the analysis of mortality data on married couples, one would be interested in comparing the hazards for the same cause of death as well as in checking whether death due to one cause is more important for the partner's risk of death from other causes. In reliability analysis, one often has systems with more than one component, and many systems, subsystems and components have more than one cause of failure. Design of high-reliability systems generally requires that the individual system components have extremely high reliability even after long periods of time. Knowledge of the failure behaviour of a component can lead to savings in its cost of production and maintenance and, in some cases, to the preservation of human life. For the purpose of improving reliability, it is necessary to identify the cause of failure down to the component level. By treating each cause of failure separately with failures from other causes considered as independent censoring, the analysis of lifetime data would be incomplete. Motivated by this, we introduce a new approach for the analysis of bivariate competing risk data using the bivariate vector hazard rate of Johnson and Kotz (1975).

Relevance: 30.00%

Abstract:

The paper considers meta-analysis of diagnostic studies that use a continuous score for classification of study participants into healthy or diseased groups. Classification is often done on the basis of a threshold or cut-off value, which might vary between studies. Consequently, conventional meta-analysis methodology focusing solely on separate analysis of sensitivity and specificity might be confounded by a potentially unknown variation of the cut-off value. To cope with this phenomenon, it is suggested to use, instead, an overall estimate of the misclassification error previously suggested and used as Youden's index; furthermore, it is argued that this index is less prone to between-study variation of cut-off values. A simple Mantel–Haenszel estimator as a summary measure of the overall misclassification error is suggested, which adjusts for a potential study effect. The measure of the misclassification error based on Youden's index is advantageous in that it easily allows an extension to a likelihood approach, which is then able to cope with unobserved heterogeneity via a nonparametric mixture model. All methods are illustrated by means of an example of a diagnostic meta-analysis on duplex doppler ultrasound, with angiography as the standard for stroke prevention.
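For readers unfamiliar with Youden's index, the sketch below computes it for a single study as J = sensitivity + specificity - 1, so that 1 - J is the sum of the false-negative and false-positive rates. It also shows how J changes with the cut-off. It does not reproduce the paper's Mantel–Haenszel or mixture-model estimators. The function names, the scores, and the rule "score >= cut-off means diseased" are assumptions made for the example.

```python
def youden_index(tp, fn, tn, fp):
    """Youden's J = sensitivity + specificity - 1 from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0

def best_cutoff(scores_diseased, scores_healthy):
    """Scan observed scores as cut-offs and return (cutoff, J) maximising J.

    A participant is classified as diseased when score >= cutoff
    (an assumed convention; some scores work the other way round).
    """
    candidates = sorted(set(scores_diseased) | set(scores_healthy))
    best = None
    for c in candidates:
        tp = sum(s >= c for s in scores_diseased)
        fn = len(scores_diseased) - tp
        fp = sum(s >= c for s in scores_healthy)
        tn = len(scores_healthy) - fp
        j = youden_index(tp, fn, tn, fp)
        if best is None or j > best[1]:
            best = (c, j)
    return best

# Hypothetical continuous diagnostic scores for one study.
diseased = [3.1, 4.0, 4.4, 5.2, 6.0]
healthy = [1.0, 2.2, 2.9, 3.0, 4.1]
cutoff, j = best_cutoff(diseased, healthy)
print(cutoff, j)
```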