986 resultados para R-Statistical computing
Resumo:
”compositions” is a new R-package for the analysis of compositional and positive data.It contains four classes corresponding to the four different types of compositional andpositive geometry (including the Aitchison geometry). It provides means for computation,plotting and high-level multivariate statistical analysis in all four geometries.These geometries are treated in an fully analogous way, based on the principle of workingin coordinates, and the object-oriented programming paradigm of R. In this way,called functions automatically select the most appropriate type of analysis as a functionof the geometry. The graphical capabilities include ternary diagrams and tetrahedrons,various compositional plots (boxplots, barplots, piecharts) and extensive graphical toolsfor principal components. Afterwards, ortion and proportion lines, straight lines andellipses in all geometries can be added to plots. The package is accompanied by ahands-on-introduction, documentation for every function, demos of the graphical capabilitiesand plenty of usage examples. It allows direct and parallel computation inall four vector spaces and provides the beginner with a copy-and-paste style of dataanalysis, while letting advanced users keep the functionality and customizability theydemand of R, as well as all necessary tools to add own analysis routines. A completeexample is included in the appendix
Resumo:
This study is part of an ongoing collaborative effort between the medical and the signal processing communities to promote research on applying standard Automatic Speech Recognition (ASR) techniques for the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patients can receive early treatment. Effective ASR-based detection could dramatically cut medical testing time. Working with a carefully designed speech database of healthy and apnoea subjects, we describe an acoustic search for distinctive apnoea voice characteristics. We also study abnormal nasalization in OSA patients by modelling vowels in nasal and nonnasal phonetic contexts using Gaussian Mixture Model (GMM) pattern recognition on speech spectra. Finally, we present experimental findings regarding the discriminative power of GMMs applied to severe apnoea detection. We have achieved an 81% correct classification rate, which is very promising and underpins the interest in this line of inquiry.
Resumo:
Metabolic problems lead to numerous failures during clinical trials, and much effort is now devoted to developing in silico models predicting metabolic stability and metabolites. Such models are well known for cytochromes P450 and some transferases, whereas less has been done to predict the activity of human hydrolases. The present study was undertaken to develop a computational approach able to predict the hydrolysis of novel esters by human carboxylesterase hCES2. The study involved first a homology modeling of the hCES2 protein based on the model of hCES1 since the two proteins share a high degree of homology (congruent with 73%). A set of 40 known substrates of hCES2 was taken from the literature; the ligands were docked in both their neutral and ionized forms using GriDock, a parallel tool based on the AutoDock4.0 engine which can perform efficient and easy virtual screening analyses of large molecular databases exploiting multi-core architectures. Useful statistical models (e.g., r (2) = 0.91 for substrates in their unprotonated state) were calculated by correlating experimental pK(m) values with distance between the carbon atom of the substrate's ester group and the hydroxy function of Ser228. Additional parameters in the equations accounted for hydrophobic and electrostatic interactions between substrates and contributing residues. The negatively charged residues in the hCES2 cavity explained the preference of the enzyme for neutral substrates and, more generally, suggested that ligands which interact too strongly by ionic bonds (e.g., ACE inhibitors) cannot be good CES2 substrates because they are trapped in the cavity in unproductive modes and behave as inhibitors. The effects of protonation on substrate recognition and the contrasting behavior of substrates and products were finally investigated by MD simulations of some CES2 complexes.
Resumo:
First: A continuous-time version of Kyle's model (Kyle 1985), known as the Back's model (Back 1992), of asset pricing with asymmetric information, is studied. A larger class of price processes and of noise traders' processes are studied. The price process, as in Kyle's model, is allowed to depend on the path of the market order. The process of the noise traders' is an inhomogeneous Lévy process. Solutions are found by the Hamilton-Jacobi-Bellman equations. With the insider being risk-neutral, the price pressure is constant, and there is no equilibirium in the presence of jumps. If the insider is risk-averse, there is no equilibirium in the presence of either jumps or drifts. Also, it is analised when the release time is unknown. A general relation is established between the problem of finding an equilibrium and of enlargement of filtrations. Random announcement time is random is also considered. In such a case the market is not fully efficient and there exists equilibrium if the sensitivity of prices with respect to the global demand is time decreasing according with the distribution of the random time. Second: Power variations. it is considered, the asymptotic behavior of the power variation of processes of the form _integral_0^t u(s-)dS(s), where S_ is an alpha-stable process with index of stability 0&alpha&2 and the integral is an Itô integral. Stable convergence of corresponding fluctuations is established. These results provide statistical tools to infer the process u from discrete observations. Third: A bond market is studied where short rates r(t) evolve as an integral of g(t-s)sigma(s) with respect to W(ds), where g and sigma are deterministic and W is the stochastic Wiener measure. Processes of this type are particular cases of ambit processes. These processes are in general not of the semimartingale kind.
Resumo:
BACKGROUND Functional brain images such as Single-Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET) have been widely used to guide the clinicians in the Alzheimer's Disease (AD) diagnosis. However, the subjectivity involved in their evaluation has favoured the development of Computer Aided Diagnosis (CAD) Systems. METHODS It is proposed a novel combination of feature extraction techniques to improve the diagnosis of AD. Firstly, Regions of Interest (ROIs) are selected by means of a t-test carried out on 3D Normalised Mean Square Error (NMSE) features restricted to be located within a predefined brain activation mask. In order to address the small sample-size problem, the dimension of the feature space was further reduced by: Large Margin Nearest Neighbours using a rectangular matrix (LMNN-RECT), Principal Component Analysis (PCA) or Partial Least Squares (PLS) (the two latter also analysed with a LMNN transformation). Regarding the classifiers, kernel Support Vector Machines (SVMs) and LMNN using Euclidean, Mahalanobis and Energy-based metrics were compared. RESULTS Several experiments were conducted in order to evaluate the proposed LMNN-based feature extraction algorithms and its benefits as: i) linear transformation of the PLS or PCA reduced data, ii) feature reduction technique, and iii) classifier (with Euclidean, Mahalanobis or Energy-based methodology). The system was evaluated by means of k-fold cross-validation yielding accuracy, sensitivity and specificity values of 92.78%, 91.07% and 95.12% (for SPECT) and 90.67%, 88% and 93.33% (for PET), respectively, when a NMSE-PLS-LMNN feature extraction method was used in combination with a SVM classifier, thus outperforming recently reported baseline methods. CONCLUSIONS All the proposed methods turned out to be a valid solution for the presented problem. One of the advances is the robustness of the LMNN algorithm that not only provides higher separation rate between the classes but it also makes (in combination with NMSE and PLS) this rate variation more stable. In addition, their generalization ability is another advance since several experiments were performed on two image modalities (SPECT and PET).
Resumo:
In this article we introduce JULIDE, a software toolkit developed to perform the 3D reconstruction, intensity normalization, volume standardization by 3D image registration and voxel-wise statistical analysis of autoradiographs of mouse brain sections. This software tool has been developed in the open-source ITK software framework and is freely available under a GPL license. The article presents the complete image processing chain from raw data acquisition to 3D statistical group analysis. Results of the group comparison in the context of a study on spatial learning are shown as an illustration of the data that can be obtained with this tool.
Resumo:
BACKGROUND & AIMS Hy's Law, which states that hepatocellular drug-induced liver injury (DILI) with jaundice indicates a serious reaction, is used widely to determine risk for acute liver failure (ALF). We aimed to optimize the definition of Hy's Law and to develop a model for predicting ALF in patients with DILI. METHODS We collected data from 771 patients with DILI (805 episodes) from the Spanish DILI registry, from April 1994 through August 2012. We analyzed data collected at DILI recognition and at the time of peak levels of alanine aminotransferase (ALT) and total bilirubin (TBL). RESULTS Of the 771 patients with DILI, 32 developed ALF. Hepatocellular injury, female sex, high levels of TBL, and a high ratio of aspartate aminotransferase (AST):ALT were independent risk factors for ALF. We compared 3 ways to use Hy's Law to predict which patients would develop ALF; all included TBL greater than 2-fold the upper limit of normal (×ULN) and either ALT level greater than 3 × ULN, a ratio (R) value (ALT × ULN/alkaline phosphatase × ULN) of 5 or greater, or a new ratio (nR) value (ALT or AST, whichever produced the highest ×ULN/ alkaline phosphatase × ULN value) of 5 or greater. At recognition of DILI, the R- and nR-based models identified patients who developed ALF with 67% and 63% specificity, respectively, whereas use of only ALT level identified them with 44% specificity. However, the level of ALT and the nR model each identified patients who developed ALF with 90% sensitivity, whereas the R criteria identified them with 83% sensitivity. An equal number of patients who did and did not develop ALF had alkaline phosphatase levels greater than 2 × ULN. An algorithm based on AST level greater than 17.3 × ULN, TBL greater than 6.6 × ULN, and AST:ALT greater than 1.5 identified patients who developed ALF with 82% specificity and 80% sensitivity. CONCLUSIONS When applied at DILI recognition, the nR criteria for Hy's Law provides the best balance of sensitivity and specificity whereas our new composite algorithm provides additional specificity in predicting the ultimate development of ALF.
Resumo:
Au cours des deux dernières décennies, la technique d'imagerie arthro-scanner a bénéficié de nombreux progrès technologiques et représente aujourd'hui une excellente alternative à l'imagerie par résonance magnétique (IRM) et / ou arthro-IRM dans l'évaluation des pathologies de la hanche. Cependant, elle reste limitée par l'exposition aux rayonnements ionisants importante. Les techniques de reconstruction itérative (IR) ont récemment été mis en oeuvre avec succès en imagerie ; la littérature montre que l'utilisation ces dernières contribue à réduire la dose d'environ 40 à 55%, comparativement aux protocoles courants utilisant la rétroprojection filtrée (FBP), en scanner de rachis. A notre connaissance, l'utilisation de techniques IR en arthro-scanner de hanche n'a pas été évaluée jusqu'à présent. Le but de notre étude était d'évaluer l'impact de la technique ASIR (GE Healthcare) sur la qualité de l'image objective et subjective en arthro-scanner de hanche, et d'évaluer son potentiel en terme de réduction de dose. Pour cela, trente sept patients examinés par arthro-scanner de hanche ont été randomisés en trois groupes : dose standard (CTDIvol = 38,4 mGy) et deux groupes de dose réduite (CTDIvol = 24,6 ou 15,4 mGy). Les images ont été reconstruites en rétroprojection filtrée (FBP) puis en appliquant différents pourcentages croissants d'ASIR (30, 50, 70 et 90%). Le bruit et le rapport contraste sur bruit (CNR) ont été mesurés. Deux radiologues spécialisés en imagerie musculo-squelettique ont évalué de manière indépendante la qualité de l'image au niveau de plusieurs structures anatomiques en utilisant une échelle de quatre grades. Ils ont également évalué les lésions labrales et du cartilage articulaire. Les résultats révèlent que le bruit augmente (p = 0,0009) et le CNR diminue (p = 0,001) de manière significative lorsque la dose diminue. A l'inverse, le bruit diminue (p = 0,0001) et le contraste sur bruit augmente (p < 0,003) de manière significative lorsque le pourcentage d'ASIR augmente ; on trouve également une augmentation significative des scores de la qualité de l'image pour le labrum, le cartilage, l'os sous-chondral, la qualité de l'image globale (au delà de ASIR 50%), ainsi que le bruit (p < 0,04), et une réduction significative pour l'os trabuculaire et les muscles (p < 0,03). Indépendamment du niveau de dose, il n'y a pas de différence significative pour la détection et la caractérisation des lésions labrales (n=24, p = 1) et des lésions cartilagineuses (n=40, p > 0,89) en fonction du pourcentage d'ASIR. Notre travail a permis de montrer que l'utilisation de plus de 50% d'ASIR permet de reduire de manière significative la dose d'irradiation reçue par le patient lors d'un arthro-scanner de hanche tout en maintenant une qualité d'image diagnostique comparable par rapport à un protocole de dose standard utilisant la rétroprojection filtrée.
Resumo:
La tomodensitométrie (CT) est une technique d'imagerie dont l'intérêt n'a cessé de croître depuis son apparition dans le début des années 70. Dans le domaine médical, son utilisation est incontournable à tel point que ce système d'imagerie pourrait être amené à devenir victime de son succès si son impact au niveau de l'exposition de la population ne fait pas l'objet d'une attention particulière. Bien évidemment, l'augmentation du nombre d'examens CT a permis d'améliorer la prise en charge des patients ou a rendu certaines procédures moins invasives. Toutefois, pour assurer que le compromis risque - bénéfice soit toujours en faveur du patient, il est nécessaire d'éviter de délivrer des doses non utiles au diagnostic.¦Si cette action est importante chez l'adulte elle doit être une priorité lorsque les examens se font chez l'enfant, en particulier lorsque l'on suit des pathologies qui nécessitent plusieurs examens CT au cours de la vie du patient. En effet, les enfants et jeunes adultes sont plus radiosensibles. De plus, leur espérance de vie étant supérieure à celle de l'adulte, ils présentent un risque accru de développer un cancer radio-induit dont la phase de latence peut être supérieure à vingt ans. Partant du principe que chaque examen radiologique est justifié, il devient dès lors nécessaire d'optimiser les protocoles d'acquisitions pour s'assurer que le patient ne soit pas irradié inutilement. L'avancée technologique au niveau du CT est très rapide et depuis 2009, de nouvelles techniques de reconstructions d'images, dites itératives, ont été introduites afin de réduire la dose et améliorer la qualité d'image.¦Le présent travail a pour objectif de déterminer le potentiel des reconstructions itératives statistiques pour réduire au minimum les doses délivrées lors d'examens CT chez l'enfant et le jeune adulte tout en conservant une qualité d'image permettant le diagnostic, ceci afin de proposer des protocoles optimisés.¦L'optimisation d'un protocole d'examen CT nécessite de pouvoir évaluer la dose délivrée et la qualité d'image utile au diagnostic. Alors que la dose est estimée au moyen d'indices CT (CTDIV0| et DLP), ce travail a la particularité d'utiliser deux approches radicalement différentes pour évaluer la qualité d'image. La première approche dite « physique », se base sur le calcul de métriques physiques (SD, MTF, NPS, etc.) mesurées dans des conditions bien définies, le plus souvent sur fantômes. Bien que cette démarche soit limitée car elle n'intègre pas la perception des radiologues, elle permet de caractériser de manière rapide et simple certaines propriétés d'une image. La seconde approche, dite « clinique », est basée sur l'évaluation de structures anatomiques (critères diagnostiques) présentes sur les images de patients. Des radiologues, impliqués dans l'étape d'évaluation, doivent qualifier la qualité des structures d'un point de vue diagnostique en utilisant une échelle de notation simple. Cette approche, lourde à mettre en place, a l'avantage d'être proche du travail du radiologue et peut être considérée comme méthode de référence.¦Parmi les principaux résultats de ce travail, il a été montré que les algorithmes itératifs statistiques étudiés en clinique (ASIR?, VEO?) ont un important potentiel pour réduire la dose au CT (jusqu'à-90%). Cependant, par leur fonctionnement, ils modifient l'apparence de l'image en entraînant un changement de texture qui pourrait affecter la qualité du diagnostic. En comparant les résultats fournis par les approches « clinique » et « physique », il a été montré que ce changement de texture se traduit par une modification du spectre fréquentiel du bruit dont l'analyse permet d'anticiper ou d'éviter une perte diagnostique. Ce travail montre également que l'intégration de ces nouvelles techniques de reconstruction en clinique ne peut se faire de manière simple sur la base de protocoles utilisant des reconstructions classiques. Les conclusions de ce travail ainsi que les outils développés pourront également guider de futures études dans le domaine de la qualité d'image, comme par exemple, l'analyse de textures ou la modélisation d'observateurs pour le CT.¦-¦Computed tomography (CT) is an imaging technique in which interest has been growing since it first began to be used in the early 1970s. In the clinical environment, this imaging system has emerged as the gold standard modality because of its high sensitivity in producing accurate diagnostic images. However, even if a direct benefit to patient healthcare is attributed to CT, the dramatic increase of the number of CT examinations performed has raised concerns about the potential negative effects of ionizing radiation on the population. To insure a benefit - risk that works in favor of a patient, it is important to balance image quality and dose in order to avoid unnecessary patient exposure.¦If this balance is important for adults, it should be an absolute priority for children undergoing CT examinations, especially for patients suffering from diseases requiring several follow-up examinations over the patient's lifetime. Indeed, children and young adults are more sensitive to ionizing radiation and have an extended life span in comparison to adults. For this population, the risk of developing cancer, whose latency period exceeds 20 years, is significantly higher than for adults. Assuming that each patient examination is justified, it then becomes a priority to optimize CT acquisition protocols in order to minimize the delivered dose to the patient. Over the past few years, CT advances have been developing at a rapid pace. Since 2009, new iterative image reconstruction techniques, called statistical iterative reconstructions, have been introduced in order to decrease patient exposure and improve image quality.¦The goal of the present work was to determine the potential of statistical iterative reconstructions to reduce dose as much as possible without compromising image quality and maintain diagnosis of children and young adult examinations.¦The optimization step requires the evaluation of the delivered dose and image quality useful to perform diagnosis. While the dose is estimated using CT indices (CTDIV0| and DLP), the particularity of this research was to use two radically different approaches to evaluate image quality. The first approach, called the "physical approach", computed physical metrics (SD, MTF, NPS, etc.) measured on phantoms in well-known conditions. Although this technique has some limitations because it does not take radiologist perspective into account, it enables the physical characterization of image properties in a simple and timely way. The second approach, called the "clinical approach", was based on the evaluation of anatomical structures (diagnostic criteria) present on patient images. Radiologists, involved in the assessment step, were asked to score image quality of structures for diagnostic purposes using a simple rating scale. This approach is relatively complicated to implement and also time-consuming. Nevertheless, it has the advantage of being very close to the practice of radiologists and is considered as a reference method.¦Primarily, this work revealed that the statistical iterative reconstructions studied in clinic (ASIR? and VECO have a strong potential to reduce CT dose (up to -90%). However, by their mechanisms, they lead to a modification of the image appearance with a change in image texture which may then effect the quality of the diagnosis. By comparing the results of the "clinical" and "physical" approach, it was showed that a change in texture is related to a modification of the noise spectrum bandwidth. The NPS analysis makes possible to anticipate or avoid a decrease in image quality. This project demonstrated that integrating these new statistical iterative reconstruction techniques can be complex and cannot be made on the basis of protocols using conventional reconstructions. The conclusions of this work and the image quality tools developed will be able to guide future studies in the field of image quality as texture analysis or model observers dedicated to CT.
Resumo:
We study an adaptive statistical approach to analyze brain networks represented by brain connection matrices of interregional connectivity (connectomes). Our approach is at a middle level between a global analysis and single connections analysis by considering subnetworks of the global brain network. These subnetworks represent either the inter-connectivity between two brain anatomical regions or by the intra-connectivity within the same brain anatomical region. An appropriate summary statistic, that characterizes a meaningful feature of the subnetwork, is evaluated. Based on this summary statistic, a statistical test is performed to derive the corresponding p-value. The reformulation of the problem in this way reduces the number of statistical tests in an orderly fashion based on our understanding of the problem. Considering the global testing problem, the p-values are corrected to control the rate of false discoveries. Finally, the procedure is followed by a local investigation within the significant subnetworks. We contrast this strategy with the one based on the individual measures in terms of power. We show that this strategy has a great potential, in particular in cases where the subnetworks are well defined and the summary statistics are properly chosen. As an application example, we compare structural brain connection matrices of two groups of subjects with a 22q11.2 deletion syndrome, distinguished by their IQ scores.
Resumo:
The package HIERFSTAT for the statistical software R, created by the R Development Core Team, allows the estimate of hierarchical F-statistics from a hierarchy with any numbers of levels. In addition, it allows testing the statistical significance of population differentiation for these different levels, using a generalized likelihood-ratio test. The package HIERFSTAT is available at http://www.unil.ch/popgen/softwares/hierfstat.htm.
Resumo:
Accurate detection of subpopulation size determinations in bimodal populations remains problematic yet it represents a powerful way by which cellular heterogeneity under different environmental conditions can be compared. So far, most studies have relied on qualitative descriptions of population distribution patterns, on population-independent descriptors, or on arbitrary placement of thresholds distinguishing biological ON from OFF states. We found that all these methods fall short of accurately describing small population sizes in bimodal populations. Here we propose a simple, statistics-based method for the analysis of small subpopulation sizes for use in the free software environment R and test this method on real as well as simulated data. Four so-called population splitting methods were designed with different algorithms that can estimate subpopulation sizes from bimodal populations. All four methods proved more precise than previously used methods when analyzing subpopulation sizes of transfer competent cells arising in populations of the bacterium Pseudomonas knackmussii B13. The methods' resolving powers were further explored by bootstrapping and simulations. Two of the methods were not severely limited by the proportions of subpopulations they could estimate correctly, but the two others only allowed accurate subpopulation quantification when this amounted to less than 25% of the total population. In contrast, only one method was still sufficiently accurate with subpopulations smaller than 1% of the total population. This study proposes a number of rational approximations to quantifying small subpopulations and offers an easy-to-use protocol for their implementation in the open source statistical software environment R.
Resumo:
Analysis of variance is commonly used in morphometry in order to ascertain differences in parameters between several populations. Failure to detect significant differences between populations (type II error) may be due to suboptimal sampling and lead to erroneous conclusions; the concept of statistical power allows one to avoid such failures by means of an adequate sampling. Several examples are given in the morphometry of the nervous system, showing the use of the power of a hierarchical analysis of variance test for the choice of appropriate sample and subsample sizes. In the first case chosen, neuronal densities in the human visual cortex, we find the number of observations to be of little effect. For dendritic spine densities in the visual cortex of mice and humans, the effect is somewhat larger. A substantial effect is shown in our last example, dendritic segmental lengths in monkey lateral geniculate nucleus. It is in the nature of the hierarchical model that sample size is always more important than subsample size. The relative weight to be attributed to subsample size thus depends on the relative magnitude of the between observations variance compared to the between individuals variance.
Resumo:
BACKGROUND: As part of EUROCAT's surveillance of congenital anomalies in Europe, a statistical monitoring system has been developed to detect recent clusters or long-term (10 year) time trends. The purpose of this article is to describe the system for the identification and investigation of 10-year time trends, conceived as a "screening" tool ultimately leading to the identification of trends which may be due to changing teratogenic factors.METHODS: The EUROCAT database consists of all cases of congenital anomalies including livebirths, fetal deaths from 20 weeks gestational age, and terminations of pregnancy for fetal anomaly. Monitoring of 10-year trends is performed for each registry for each of 96 non-independent EUROCAT congenital anomaly subgroups, while Pan-Europe analysis combines data from all registries. The monitoring results are reviewed, prioritized according to a prioritization strategy, and communicated to registries for investigation. Twenty-one registries covering over 4 million births, from 1999 to 2008, were included in monitoring in 2010.CONCLUSIONS: Significant increasing trends were detected for abdominal wall anomalies, gastroschisis, hypospadias, Trisomy 18 and renal dysplasia in the Pan-Europe analysis while 68 increasing trends were identified in individual registries. A decreasing trend was detected in over one-third of anomaly subgroups in the Pan-Europe analysis, and 16.9% of individual registry tests. Registry preliminary investigations indicated that many trends are due to changes in data quality, ascertainment, screening, or diagnostic methods. Some trends are inevitably chance phenomena related to multiple testing, while others seem to represent real and continuing change needing further investigation and response by regional/national public health authorities.