933 results for Statistical factora analysis
Abstract:
Summary [French résumé below] Since the beginning of the 20th century, the world population has been confronted with human immunodeficiency virus 1 (HIV-1). This virus mutates rapidly, which allows it to evade and adapt to the human host. Our closest evolutionary relatives, the non-human primates, are less susceptible to HIV-1. In a broader sense, primates are differentially susceptible to various retroviruses. Species specificity may be due to genetic differences among primates. In the present study we applied evolutionary and comparative genetic techniques to characterize the evolutionary pattern of host cellular determinants of HIV-1 pathogenesis. Studying the evolution of genes coding for proteins that participate in the restriction or pathogenesis of HIV-1 may help in understanding the genetic basis of modern human susceptibility to infection. To perform comparative genetic analysis, we assembled a collection of primate DNA and RNA to allow generation of de novo sequences of gene orthologs. More recently, the public release of two new complete primate genomes (Bornean orang-utan and common marmoset), in addition to the three previously available genomes (human, chimpanzee and rhesus monkey), helped scale up the evolutionary and comparative genome analysis. Sequence analysis used phylogenetic and statistical methods for detecting molecular adaptation. We identified different selective pressures acting on host proteins involved in HIV-1 pathogenesis. Proteins with HIV-1 restriction properties in non-human primates were under strong positive selection, in particular in regions of interaction with viral proteins. These regions carried key residues for the antiviral activity. Proteins of the innate immunity presented an evolutionary pattern of conservation (purifying selection), but with signals of relaxed constraint when compared with the average profile of purifying selection of the primate genomes.
Large-scale analysis revealed patterns of evolutionary pressure according to molecular function, biological process and cellular distribution. The data generated by the various analyses served to guide the ancestral reconstruction of TRIM5α, a potent antiviral host factor. The resurrected TRIM5α from the common ancestor of Old World monkeys was effective against HIV-1, and the more recent resurrected hominoid variants were more effective against other retroviruses. Thus, as the result of trade-offs in the ability to restrict different retroviruses, humans might have been exposed to HIV-1 at a time when TRIM5α lacked the appropriate specific restriction activity. The application of evolutionary and comparative genetic tools should be considered for the systematic assessment of host proteins relevant in viral pathogenesis, and to guide biological and functional studies.
Résumé (translated from French) Since the beginning of the twentieth century, the world population has been confronted with human immunodeficiency virus 1 (HIV-1). This virus has a particularly high mutation rate, so it can evade and adapt to its host very efficiently. The organisms evolutionarily closest to humans, the non-human primates, are less susceptible to HIV-1. More generally, primates respond differently to retroviruses. This species specificity must lie in genetic differences among primates. In this study we applied evolutionary and comparative genetic techniques to characterize the evolutionary pattern of the cellular determinants involved in HIV-1 pathogenesis. Studying the evolution of genes coding for proteins involved in the restriction or pathogenesis of HIV-1 will help in understanding the genetic basis that recently made humans susceptible. For the comparative genetic analyses, we assembled a collection of primate DNA and RNA in order to obtain new sequences of orthologous genes. Recently, two new complete genomes were published (Bornean orang-utan and common marmoset) in addition to the three genomes already available (human, chimpanzee, rhesus macaque). This considerably extended the scope of the analysis. To detect molecular adaptation, we analysed the sequences with phylogenetic and statistical methods. We identified different selective pressures acting on the proteins involved in HIV-1 pathogenesis. Proteins with HIV-1 restriction properties in non-human primates show a particularly high rate of amino acid replacement (positive selection), in particular in the regions of interaction with viral proteins. These regions include key amino acids for restriction activity. Proteins belonging to innate immunity show an evolutionary pattern of conservation (purifying selection), but with traces of "relaxation" compared with the general profile of purifying selection of the primate genome. A large-scale analysis made it possible to classify patterns of evolutionary pressure according to molecular function, biological process and cellular distribution. The data generated by the various analyses enabled the ancestral reconstruction of TRIM5α, a potent antiretroviral factor. The resurrected TRIM5α corresponding to the common ancestor of the great apes and the catarrhine group is effective against modern HIV-1. The more recent resurrected TRIM5α variants, corresponding to the ancestors of the great apes, are more effective against other retroviruses. Thus, as a result of a trade-off in the ability to restrict different retroviruses, humans may have been exposed to HIV-1 at a time when TRIM5α lacked specific restriction activity against it. The application of evolutionary and comparative genetic techniques should be considered for the systematic assessment of proteins involved in viral pathogenesis, as well as to guide biological and functional studies.
Abstract:
One of the disadvantages of old age is that there is more past than future: this, however, may be turned into an advantage if the wealth of experience and, hopefully, wisdom gained in the past can be reflected upon and throw some light on possible future trends. To an extent, then, this talk is necessarily personal, certainly nostalgic, but also self-critical and inquisitive about our understanding of the discipline of statistics. A number of almost philosophical themes will run through the talk: search for appropriate modelling in relation to the real problem envisaged, emphasis on sensible balances between simplicity and complexity, the relative roles of theory and practice, the nature of communication of inferential ideas to the statistical layman, the inter-related roles of teaching, consultation and research. A list of keywords might be: identification of sample space and its mathematical structure, choices between transform and stay, the role of parametric modelling, the role of a sample space metric, the underused hypothesis lattice, the nature of compositional change, particularly in relation to the modelling of processes. While the main theme will be relevance to compositional data analysis we shall point to substantial implications for general multivariate analysis arising from experience of the development of compositional data analysis…
Abstract:
Background The prognostic potential of individual clinical and molecular parameters in stage II/III colon cancer has been investigated, but a thorough multivariable assessment of their relative impact is missing. Methods Tumors from patients (N = 1404) in the PETACC3 adjuvant chemotherapy trial were examined for BRAF and KRAS mutations, microsatellite instability (MSI), chromosome 18q loss of heterozygosity (18qLOH), and SMAD4 expression. Their importance in predicting relapse-free survival (RFS) and overall survival (OS) was assessed by Kaplan-Meier analyses, Cox regression models, and recursive partitioning trees. All statistical tests were two-sided. Results MSI-high status and SMAD4 focal loss of expression were identified as independent prognostic factors with better RFS (hazard ratio [HR] of recurrence = 0.54, 95% CI = 0.37 to 0.81, P = .003) and OS (HR of death = 0.43, 95% CI = 0.27 to 0.70, P = .001) for MSI-high status and worse RFS (HR = 1.47, 95% CI = 1.19 to 1.81, P < .001) and OS (HR = 1.58, 95% CI = 1.23 to 2.01, P < .001) for SMAD4 loss. 18qLOH did not have any prognostic value in RFS or OS. Recursive partitioning identified refinements of TNM into new clinically interesting prognostic subgroups. Notably, T3N1 tumors with MSI-high status and retained SMAD4 expression had outcomes similar to stage II disease. Conclusions Concomitant assessment of molecular and clinical markers in multivariable analysis is essential to confirm or refute their independent prognostic value. Including molecular markers with independent prognostic value might allow more accurate prediction of prognosis than TNM staging alone.
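The hazard ratios above are reported with 95% confidence intervals and two-sided P values. As a minimal sketch of how such figures relate to one another, the snippet below recovers an approximate log-scale standard error and Wald P value from a published HR and its interval. It assumes a Wald-type interval that is symmetric on the log scale, which the abstract does not state; the function name `wald_from_ci` is hypothetical, and small differences from the published P values are expected because the inputs are rounded.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate SE of log(HR), Wald z and two-sided p from an HR and its 95% CI.

    Assumes the interval was built as exp(log(HR) +/- z_crit * SE), which is an
    assumption about the source analysis, not something the abstract confirms.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# MSI-high status and relapse-free survival, values as given in the abstract
se, z, p = wald_from_ci(0.54, 0.37, 0.81)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, two-sided p = {p:.4f}")
```

A check of this kind is only a plausibility test of reported numbers; it does not reproduce the Cox models fitted in the study.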
Abstract:
Correspondence analysis, when used to visualize relationships in a table of counts (for example, abundance data in ecology), has been frequently criticized as being too sensitive to objects (for example, species) that occur with very low frequency or in very few samples. In this statistical report we show that this criticism is generally unfounded. We demonstrate this in several data sets by calculating the actual contributions of rare objects to the results of correspondence analysis and canonical correspondence analysis, both to the determination of the principal axes and to the chi-square distance. It is a fact that rare objects are often positioned as outliers in correspondence analysis maps, which gives the impression that they are highly influential, but their low weight offsets their distant positions and reduces their effect on the results. An alternative scaling of the correspondence analysis solution, the contribution biplot, is proposed as a way of mapping the results in order to avoid the problem of outlying and low contributing rare objects.
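The argument that low weight offsets a distant position can be illustrated with a small numerical sketch. In correspondence analysis, each row's share of the total inertia equals its mass times its squared chi-square distance to the centroid. The code below computes these quantities for a toy table; the data and the function name `ca_row_summary` are invented for illustration and are not taken from the report, and rows are assumed to play the role of the objects (for example, species).

```python
import numpy as np


def ca_row_summary(counts):
    """Return row masses, chi-square distances to the centroid, and each
    row's share of the total inertia for a two-way table of counts."""
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    row_inertia = (S ** 2).sum(axis=1)       # mass * squared chi-square distance
    chi2_dist = np.sqrt(row_inertia / r)
    return r, chi2_dist, row_inertia / row_inertia.sum()


# Hypothetical table: three common objects and one rare object (last row)
counts = [
    [80, 10, 10],
    [10, 80, 10],
    [10, 10, 80],
    [0, 0, 2],
]
mass, dist, share = ca_row_summary(counts)
for i in range(len(counts)):
    print(f"row {i}: mass={mass[i]:.4f}  chi2 distance={dist[i]:.3f}  inertia share={share[i]:.3f}")
```

In this toy table the rare row lies farthest from the centroid yet accounts for the smallest share of inertia, which is the mechanism the abstract describes. A single constructed example does not establish the general claim; the report bases that on contributions calculated in several real data sets.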
Abstract:
γ-Hydroxybutyric acid (GHB) is an endogenous short-chain fatty acid popular as a recreational drug due to sedative and euphoric effects, but also often implicated in drug-facilitated sexual assaults owing to disinhibition and amnesic properties. Whilst discrimination between endogenous and exogenous GHB as required in intoxication cases may be achieved by the determination of the carbon isotope content, such information has not yet been exploited to answer source inference questions of forensic investigation and intelligence interests. However, potential isotopic fractionation effects occurring through the whole metabolism of GHB may be a major concern in this regard. Thus, urine specimens from six healthy male volunteers who ingested prescription GHB sodium salt, marketed as Xyrem®, were analysed by means of gas chromatography/combustion/isotope ratio mass spectrometry to assess this particular topic. A very narrow range of δ¹³C values, spreading from -24.81‰ to -25.06‰, was observed, whilst the mean δ¹³C value of Xyrem® corresponded to -24.99‰. Since urine samples and prescription drug could not be distinguished by means of statistical analysis, carbon isotopic effects and subsequent influence on δ¹³C values through GHB metabolism as a whole could be ruled out. Thus, a link between GHB as a raw matrix and found in a biological fluid may be established, bringing relevant information regarding source inference evaluation. Therefore, this study supports a diversified scope of exploitation for stable isotopes characterized in biological matrices from investigations on intoxication cases to drug intelligence programmes.
Abstract:
Structural equation models are widely used in economic, social and behavioral studies to analyze linear interrelationships among variables, some of which may be unobservable or subject to measurement error. Alternative estimation methods that exploit different distributional assumptions are now available. The present paper deals with issues of asymptotic statistical inferences, such as the evaluation of standard errors of estimates and chi-square goodness-of-fit statistics, in the general context of mean and covariance structures. The emphasis is on drawing correct statistical inferences regardless of the distribution of the data and the method of estimation employed. A (distribution-free) consistent estimate of $\Gamma$, the matrix of asymptotic variances of the vector of sample second-order moments, will be used to compute robust standard errors and a robust chi-square goodness-of-fit statistic. Simple modifications of the usual estimate of $\Gamma$ will also permit correct inferences in the case of multi-stage complex samples. We will also discuss the conditions under which, regardless of the distribution of the data, one can rely on the usual (non-robust) inferential statistics. Finally, a multivariate regression model with errors-in-variables will be used to illustrate, by means of simulated data, various theoretical aspects of the paper.
Abstract:
Aim This study used data from temperate forest communities to assess: (1) five different stepwise selection methods with generalized additive models, (2) the effect of weighting absences to ensure a prevalence of 0.5, (3) the effect of limiting absences beyond the environmental envelope defined by presences, (4) four different methods for incorporating spatial autocorrelation, and (5) the effect of integrating an interaction factor defined by a regression tree on the residuals of an initial environmental model. Location State of Vaud, western Switzerland. Methods Generalized additive models (GAMs) were fitted using the grasp package (generalized regression analysis and spatial predictions, http://www.cscf.ch/grasp). Results Model selection based on cross-validation appeared to be the best compromise between model stability and performance (parsimony) among the five methods tested. Weighting absences returned models that perform better than models fitted with the original sample prevalence. This appeared to be mainly due to the impact of very low prevalence values on evaluation statistics. Removing zeroes beyond the range of presences on main environmental gradients changed the set of selected predictors, and potentially their response curve shape. Moreover, removing zeroes slightly improved model performance and stability when compared with the baseline model on the same data set. Incorporating a spatial trend predictor improved model performance and stability significantly. Even better models were obtained when including local spatial autocorrelation. A novel approach to include interactions proved to be an efficient way to account for interactions between all predictors at once. 
Main conclusions Models and spatial predictions of 18 forest communities were significantly improved by using either: (1) cross-validation as a model selection method, (2) weighted absences, (3) limited absences, (4) predictors accounting for spatial autocorrelation, or (5) a factor variable accounting for interactions between all predictors. The final choice of model strategy should depend on the nature of the available data and the specific study aims. Statistical evaluation is useful in searching for the best modelling practice. However, one should not neglect to consider the shapes and interpretability of response curves, as well as the resulting spatial predictions in the final assessment.
Abstract:
The primary purpose of this brief is to provide various statistical and institutional details on the development and current status of the public agricultural research system in Cape Verde. This information has been collected and presented in a systematic way in order to inform and thereby improve research policy formulation with regard to the Cape Verdean NARS. Most importantly, these data are assembled and reported in a way that makes them directly comparable with the data presented in the other country briefs in this series. And because institutions take time to develop and there are often considerable lags in the agricultural research process, it is necessary for many analytical and policy purposes to have access to longer-run series of data. NARSs vary markedly in their institutional structure and these institutional aspects can have a substantial and direct effect on their research performance. To provide a basis for analysis and cross-country, over-time comparisons, the various research agencies in a country have been grouped into five general categories; government, semi-public, private, academic, and supranational. A description of these categories is provided in table 1.
Abstract:
Although correspondence analysis is now widely available in statistical software packages and applied in a variety of contexts, notably the social and environmental sciences, there are still some misconceptions about this method as well as unresolved issues which remain controversial to this day. In this paper we hope to settle these matters, namely (i) the way CA measures variance in a two-way table and how to compare variances between tables of different sizes, (ii) the influence, or rather lack of influence, of outliers in the usual CA maps, (iii) the scaling issue and the biplot interpretation of maps, (iv) whether or not to rotate a solution, and (v) statistical significance of results.
Abstract:
The objective of this study was to identify the attributes and dimensions of service quality that affect the service performance of five-star resort hotels in the Cape Verde Islands. The research was motivated by the paramount role of resort hotels in the development of the travel and tourism sector in Cape Verde, and by the impact this sector now has on the country's economy. The research opens with a literature review of service quality theory in the hotel industry, from the classic service quality model and the SERVQUAL instrument of the mid-1980s to recent models of service quality measurement in the hotel industry, such as the scale of items developed in 2003 for the Lodging Quality Index (LQI). The study then analyzes the importance of travel and tourism activities in the Cape Verde Islands and shows how strongly those activities affect the performance of the Cape Verdean hotel industry. Next, the study analyzes the hotel industry of Cape Verde in detail, identifying the market size of five-star resort hotels and their current operators in that market. An online questionnaire was then developed and sent, through travel websites and online communities, to guests who had experienced the service of five-star resort hotels in the Cape Verde Islands. The questionnaire was designed to assess the attributes and dimensions of service quality in the five-star resort hotels of Cape Verde. The results were then analyzed with descriptive and applied statistics, using Microsoft Excel and the Statistical Package for the Social Sciences (SPSS). Content validity analysis, factor analysis, and reliability analysis of the factors were used to purify an initial scale of 47 service quality items.
The final instrument has three dimensions covering twenty-four attributes of service quality in the five-star resort hotels of Cape Verde. The three dimensions found were: staff competence; food and entertainment; and physical facilities. The study ends with brief comments on the status of service quality according to the identified dimensions and their attributes. The conclusion summarizes the whole work and gives some directions for future research.
Abstract:
The spatial variability of strongly weathered soils under sugarcane and soybean/wheat rotation was quantitatively assessed on 33 fields in two regions in São Paulo State, Brazil: Araras (15 fields with sugarcane) and Assis (11 fields with sugarcane and seven fields with soybean/wheat rotation). The statistical methods used were nested analysis of variance (for 11 fields), semivariance analysis, and analysis of variance within and between fields. Spatial levels from 50 m to several km were analyzed. Results are discussed with reference to a previously published study carried out in the surroundings of Passo Fundo (RS). Similar variability patterns were found for clay content, organic C content and cation exchange capacity. The fields studied are quite homogeneous with respect to these relatively stable soil characteristics. The spatial variability of other characteristics (resin-extractable P, pH, base and Al saturation, and soil colour) varies with region and/or land use management. Soil management for sugarcane seems to have induced modifications to greater depths than for soybean/wheat rotation. Surface layers of soils under soybean/wheat present relatively little variation, apparently as a result of very intensive soil management. The major part of within-field variation occurs at short distances (< 50 m) in all study areas. Hence, little extra information would be gained by increasing sampling density from, say, 1/km² to 1/50 m². For many purposes, the soils in the study regions can be mapped with the same observation density, but residual variance will not be the same in all areas. Bulk sampling may help to reveal spatial patterns between 50 and 1,000 m.
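Semivariance analysis, one of the methods named above, summarizes how dissimilar two observations are as a function of the distance separating them. The sketch below uses the classical estimator, in which the semivariance at lag h is half the mean squared difference between values of all point pairs roughly h apart. The abstract does not say which estimator or sampling layout the authors used, so the one-dimensional transect, the tolerance parameter and the function name `empirical_semivariogram` are all assumptions made for illustration.

```python
import numpy as np


def empirical_semivariogram(x, z, lags, tol):
    """Classical semivariance estimate along a 1-D transect.

    x    : sample positions (e.g. metres along a transect)
    z    : measured values at those positions (e.g. clay content)
    lags : separation distances at which to estimate the semivariance
    tol  : pairs whose separation is within +/- tol of a lag are used
    """
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    upper = np.triu_indices(len(x), k=1)           # each pair counted once
    dist = np.abs(x[:, None] - x[None, :])[upper]
    sq_diff = ((z[:, None] - z[None, :]) ** 2)[upper]
    gamma = []
    for h in lags:
        in_bin = np.abs(dist - h) <= tol
        # Half the mean squared difference; NaN when no pair falls in the bin
        gamma.append(0.5 * sq_diff[in_bin].mean() if in_bin.any() else np.nan)
    return np.array(gamma)


# Hypothetical transect sampled every 50 m with made-up values
positions = np.arange(0.0, 500.0, 50.0)
values = np.array([31.0, 29.5, 30.2, 33.1, 32.4, 30.8, 29.9, 31.7, 32.0, 30.4])
print(empirical_semivariogram(positions, values, lags=[50.0, 100.0, 200.0], tol=1.0))
```

With samples spaced 50 m apart, a semivariance at the 50 m lag that is already close to the value reached at long lags would indicate that most variation arises at distances shorter than the sampling interval. That is the kind of pattern the abstract reports within fields.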
Adenocarcinoma of the pancreas: Comparative single centre analysis between ductal and mucinous type.
Abstract:
1. Background
Adenocarcinomas of the pancreas are exocrine tumors that originate from the ductal system and include two morphologically distinct entities: ductal adenocarcinoma and mucinous adenocarcinoma. Ductal adenocarcinoma is by far the most frequent malignant tumor of the pancreas, representing about 90% of all pancreatic cancers. It carries a very poor prognosis because there are currently no biological markers or diagnostic tools for identifying the disease at an early stage. At the time of diagnosis the disease is usually extensive, with vascular and nerve involvement or metastatic spread (1). Survival is less than 5% at 5 years, making it the fifth leading cause of cancer death in the world (2). The mucinous form of pancreatic adenocarcinoma is less frequent and seems to have a better prognosis, with about 57% survival at 5 years (1)(3)(4).
Each morphologic type of pancreatic adenocarcinoma is associated with particular preneoplastic lesions. Two types of preneoplastic lesions are described: pancreatic intra-epithelial neoplasia (PanIN), which affects the small and peripheral pancreatic ducts, and intraductal papillary-mucinous neoplasm (IPMN), which involves the main pancreatic duct and its principal branches. Both preneoplastic lesions lead, by different mechanisms, to pancreatic adenocarcinoma (1)(2)(3)(4)(5)(6)(7)(8)(9)(10).
The purpose of our study was a retrospective analysis of various clinical and histo-morphological parameters in order to assess a difference in survival between these two morphological types of pancreatic adenocarcinoma.
1.2 Material and methods
We conducted a retrospective analysis of 35 patients (20 men and 15 women) who underwent surgical treatment for pancreatic adenocarcinoma at the Surgical Department of the University Hospital in Lausanne. The patients were treated between 2003 and 2008, allowing a mean follow-up of at least 5 years. For each patient the following parameters were analysed: age, gender, type of operation, type of preneoplastic lesion, TNM stage, histological grade of the tumor, vascular invasion, lymphatic and perineural invasion, resection margins, and adjuvant treatment.
These observations were included in univariate and multivariate statistical analyses and compared with overall survival, as well as with specific survival for each morphologic subtype of adenocarcinoma.
As the number of mucinous adenocarcinomas (n=5) was too low for a pertinent statistical analysis, we compared the data obtained from adenocarcinomas developed on PanIN with those from adenocarcinomas developed on IPMN, including both ductal and mucinous types.
1.3 Result
Our results show that adenocarcinomas developed on pre-existing IPMN, including both morphologic types (ductal and mucinous), are associated with better survival and prognosis than adenocarcinomas developed on PanIN.
1.4 Conclusion
This study suggests that the most relevant parameter for survival in pancreatic adenocarcinoma is the type of preneoplastic lesion. A significant difference in survival was noted between adenocarcinomas developed on PanIN and adenocarcinomas developed on IPMN precursor lesions. Ductal adenocarcinomas developed on IPMN show significantly longer survival than those developed on PanIN lesions (P = 0.01). We therefore suggest that the histological type of the preneoplastic lesion, rather than the histological type of the adenocarcinoma, is the determining prognostic factor for survival in pancreatic adenocarcinoma.
Abstract:
The development of statistical models for forensic fingerprint identification purposes has been the subject of increasing research attention in recent years. This can be partly seen as a response to a number of commentators who claim that the scientific basis for fingerprint identification has not been adequately demonstrated. In addition, key forensic identification bodies such as ENFSI [1] and IAI [2] have recently endorsed and acknowledged the potential benefits of using statistical models as an important tool in support of the fingerprint identification process within the ACE-V framework. In this paper, we introduce a new Likelihood Ratio (LR) model based on Support Vector Machines (SVMs) trained with features discovered via morphometric and spatial analyses of corresponding minutiae configurations for both match and close non-match populations often found in AFIS candidate lists. Computed LR values are derived from a probabilistic framework based on SVMs that discover the intrinsic spatial differences of match and close non-match populations. Lastly, experimentation performed on a set of over 120,000 publicly available fingerprint images (mostly sourced from the National Institute of Standards and Technology (NIST) datasets) and a distortion set of approximately 40,000 images, is presented, illustrating that the proposed LR model is reliably guiding towards the right proposition in the identification assessment of match and close non-match populations. Results further indicate that the proposed model is a promising tool for fingerprint practitioners to use for analysing the spatial consistency of corresponding minutiae configurations.
Abstract:
BACKGROUND: As part of EUROCAT's surveillance of congenital anomalies in Europe, a statistical monitoring system has been developed to detect recent clusters or long-term (10 year) time trends. The purpose of this article is to describe the system for the identification and investigation of 10-year time trends, conceived as a "screening" tool ultimately leading to the identification of trends which may be due to changing teratogenic factors. METHODS: The EUROCAT database consists of all cases of congenital anomalies including livebirths, fetal deaths from 20 weeks gestational age, and terminations of pregnancy for fetal anomaly. Monitoring of 10-year trends is performed for each registry for each of 96 non-independent EUROCAT congenital anomaly subgroups, while Pan-Europe analysis combines data from all registries. The monitoring results are reviewed, prioritized according to a prioritization strategy, and communicated to registries for investigation. Twenty-one registries covering over 4 million births, from 1999 to 2008, were included in monitoring in 2010. CONCLUSIONS: Significant increasing trends were detected for abdominal wall anomalies, gastroschisis, hypospadias, Trisomy 18 and renal dysplasia in the Pan-Europe analysis while 68 increasing trends were identified in individual registries. A decreasing trend was detected in over one-third of anomaly subgroups in the Pan-Europe analysis, and 16.9% of individual registry tests. Registry preliminary investigations indicated that many trends are due to changes in data quality, ascertainment, screening, or diagnostic methods. Some trends are inevitably chance phenomena related to multiple testing, while others seem to represent real and continuing change needing further investigation and response by regional/national public health authorities.