931 results for Topological data analysis
Abstract:
Gene clustering is a useful exploratory technique to group together genes with similar expression levels under distinct cell cycle phases or distinct conditions. It helps the biologist to identify potentially meaningful relationships between genes. In this study, we propose a clustering method based on multivariate normal mixture models, where the number of clusters is predicted via sequential hypothesis tests: at each step, the method considers a mixture model of m components (m = 2 in the first step) and tests if in fact it should be m - 1. If the hypothesis is rejected, m is increased and a new test is carried out. The method continues (increasing m) until the hypothesis is accepted. The theoretical core of the method is the full Bayesian significance test, an intuitive Bayesian approach, which needs no model complexity penalization nor positive probabilities for sharp hypotheses. Numerical experiments were based on a cDNA microarray dataset consisting of expression levels of 205 genes belonging to four functional categories, for 10 distinct strains of Saccharomyces cerevisiae. To analyze the method's sensitivity to data dimension, we performed principal components analysis on the original dataset and predicted the number of classes using 2 to 10 principal components. Compared to Mclust (model-based clustering), our method shows more consistent results.
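The sequential procedure described above can be sketched as a simple loop. This is a minimal illustration, not the authors' implementation: `reject_simpler` is a hypothetical callback standing in for the full Bayesian significance test, and the `max_m` cap is an added safety assumption.

```python
from typing import Callable


def select_num_components(reject_simpler: Callable[[int], bool], max_m: int = 20) -> int:
    """Return the number of mixture components chosen by sequential testing.

    reject_simpler(m) is a hypothetical callback: it fits an m-component
    mixture and returns True when the hypothesis "m - 1 components suffice"
    is rejected.
    """
    m = 2  # the first step compares 2 components against 1
    while m <= max_m:
        if not reject_simpler(m):
            return m - 1  # hypothesis accepted: keep the simpler model
        m += 1  # hypothesis rejected: try a larger mixture
    return max_m  # safety cap (an assumption, not part of the abstract)


# Toy stand-in for the test: pretend the data support exactly 4 clusters.
print(select_num_components(lambda m: m <= 4))
```

In practice the callback would wrap the model fitting and the significance test; the loop itself only encodes the stopping rule.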
Abstract:
Purpose: To evaluate patellar kinematics of volunteers without knee pain at rest and during isometric contraction in open- and closed-kinetic-chain exercises. Methods: Twenty individuals took part in this study. All underwent magnetic resonance imaging (MRI) during rest and voluntary isometric contraction (VIC) in the open and closed kinetic chain at 15 degrees, 30 degrees, and 45 degrees of knee flexion. Through MRI and using medical e-film software, the following measurements were evaluated: sulcus angle, patellar-tilt angle, and bisect offset. The mixed-effects linear model was used for comparison between knee positions, between rest and isometric contractions, and between the exercises. Results: Data analysis revealed that the sulcus angle decreased as knee flexion increased and increased with isometric contractions in both the open and closed kinetic chain for all knee-flexion angles. The patellar-tilt angle decreased with isometric contractions in both the open and closed kinetic chain for every knee position. However, in the closed kinetic chain, patellar tilt increased significantly with the knee flexed at 15 degrees. The bisect offset increased with the knee flexed at 15 degrees during isometric contractions and decreased as knee flexion increased during both exercises. Conclusion: VIC in the last degrees of knee extension may compromise patellar dynamics. On the other hand, it is possible to favor patellar stability by performing muscle contractions with the knee flexed at 30 degrees and 45 degrees in either the open or closed kinetic chain.
Abstract:
The growing interest in solar twins is motivated by the possibility of comparing them directly to the Sun. To carry out this kind of analysis, we need to know their physical characteristics with precision. Our first objective is to use asteroseismology and interferometry on the brightest of them: 18 Sco. We observed the star during 12 nights with HARPS for seismology and used the PAVO beam-combiner at CHARA for interferometry. An average large frequency separation of 134.4 ± 0.3 μHz and angular and linear radii of 0.6759 ± 0.0062 mas and 1.010 ± 0.009 R⊙ were estimated. We used these values to derive the mass of the star, 1.02 ± 0.03 M⊙.
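The abstract does not say how the mass follows from the two measurements. One common route, assumed here purely for illustration, is the asteroseismic scaling relation in which the large frequency separation varies as sqrt(M / R^3) in solar units. The solar reference value of 135.1 μHz is also an assumption.

```python
DELTA_NU_SUN_UHZ = 135.1  # assumed solar large frequency separation, in microhertz


def mass_from_scaling(delta_nu_uhz: float, radius_rsun: float) -> float:
    """Mass in solar units from the scaling delta_nu ~ sqrt(M / R^3)."""
    return (delta_nu_uhz / DELTA_NU_SUN_UHZ) ** 2 * radius_rsun ** 3


# Values quoted in the abstract for 18 Sco.
mass = mass_from_scaling(134.4, 1.010)
print(f"M = {mass:.3f} solar masses")
```

Under these assumptions the result lands close to the quoted 1.02 M⊙; the published uncertainty would additionally require propagating the errors on both inputs.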
Abstract:
Aims. We derive lists of proper motions and kinematic membership probabilities for 49 open clusters and possible open clusters in the zone of the Bordeaux PM2000 proper motion catalogue (+11° ≤ δ ≤ +18°). We test different parametrisations of the proper motion and position distribution functions and select the most successful one. In the light of those results, we analyse some objects individually. Methods. We differentiate between cluster and field member stars, and assign membership probabilities, by applying a new and fully automated method based on both parametrisations of the proper motion and position distribution functions, and genetic algorithm optimization heuristics associated with a derivative-based hill climbing algorithm for the likelihood optimization. Results. We present a catalogue comprising kinematic parameters and associated membership probability lists for 49 open clusters and possible open clusters in the Bordeaux PM2000 catalogue region. We note that this is the first determination of proper motions for five open clusters. We confirm the non-existence of two kinematic populations in the region of 15 previously suspected non-existent objects.
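To make the notion of a kinematic membership probability concrete, the sketch below uses a two-component mixture in proper-motion space. It is only an illustration: the circular Gaussian densities and all parameter values are assumptions, and the abstract's actual parametrisations (which also use positions) and its genetic-algorithm fit are not reproduced.

```python
import math


def gauss2d(x, y, mx, my, sigma):
    """Circular bivariate normal density (an assumed parametrisation)."""
    r2 = (x - mx) ** 2 + (y - my) ** 2
    return math.exp(-0.5 * r2 / sigma ** 2) / (2.0 * math.pi * sigma ** 2)


def membership_probability(pm, cluster, field, frac_cluster):
    """P(cluster | proper motion) for one star.

    pm is (mu_x, mu_y); cluster and field are (mean_x, mean_y, sigma);
    frac_cluster is the assumed fraction of cluster stars in the sample.
    """
    pc = frac_cluster * gauss2d(pm[0], pm[1], *cluster)
    pf = (1.0 - frac_cluster) * gauss2d(pm[0], pm[1], *field)
    return pc / (pc + pf)


# Hypothetical parameters: a tight cluster distribution and a broad field one.
cluster_params = (0.0, 0.0, 1.0)
field_params = (3.0, 3.0, 5.0)
print(membership_probability((0.0, 0.0), cluster_params, field_params, 0.3))
print(membership_probability((8.0, 8.0), cluster_params, field_params, 0.3))
```

A star whose proper motion sits near the cluster mean receives a high probability, while one far out in the field distribution receives a probability near zero.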
Abstract:
Context. X-ray data analyses have found that fairly complex structures at cluster centres are more common than expected. Many of these structures have similar morphologies, which exhibit spiral-like substructure. Aims. It is not yet well known how these structures are formed or maintained. Understanding the origin of these spiral-like features at the centre of some clusters is the major motivation behind this work. Methods. We analyse deep Chandra observations of 15 nearby galaxy clusters (0.01 < z < 0.06), and use X-ray temperature and substructure maps to detect small features at the cores of the clusters. Results. We detect spiral-like features at the centre of 7 clusters: A85, A426, A496, Hydra A cluster, Centaurus, Ophiuchus, and A4059. These patterns are similar to those found in numerical hydrodynamic simulations of cluster mergers with non-zero impact parameter. In some clusters of our sample, a strong radio source also occupies the inner region of the cluster, which indicates a possible connection between the two. Our investigation implies that these spiral-like structures may be caused by off-axis minor mergers. Since these features occur in regions of high density, they may confine radio emission from the central galaxy, producing, in some cases, unusual radio morphology.
Abstract:
Context. We present spectroscopic ground-based observations of the early Be star HD 49330 obtained simultaneously with the CoRoT-LRA1 run just before the burst observed in the CoRoT data. Aims. Ground-based spectroscopic observations of the early Be star HD 49330 obtained during the precursor phase and just before the start of an outburst allow us to disentangle stellar and circumstellar contributions and identify modes of stellar pulsations in this rapidly rotating star. Methods. Time series analysis (TSA) is performed on photospheric line profiles of He I and Si III by means of the least squares method. Results. We find two main frequencies, f1 = 11.86 c d⁻¹ and f2 = 16.89 c d⁻¹, which can be associated with high-order p-mode pulsations. We also detect a frequency f3 = 1.51 c d⁻¹, which can be associated with a low-order g-mode. Moreover, we show that the stellar line profile variability changed over the spectroscopic run. These results are in agreement with the results of the CoRoT data analysis, as shown in Huat et al. (2009). Conclusions. Our study of mid- and short-term spectroscopic variability allows the identification of p- and g-modes in HD 49330. It also allows us to display changes in the line profile variability before the start of an outburst. This brings new constraints for the seismic modelling of this star.
Abstract:
Aims. In this work, we describe the pipeline for the fast supervised classification of light curves observed by the CoRoT exoplanet CCDs. We present the classification results obtained for the first four measured fields, which represent a one-year in-orbit operation. Methods. The basis of the adopted supervised classification methodology has been described in detail in a previous paper, as is its application to the OGLE database. Here, we present the modifications of the algorithms and of the training set to optimize the performance when applied to the CoRoT data. Results. Classification results are presented for the observed fields IRa01, SRc01, LRc01, and LRa01 of the CoRoT mission. Statistics on the number of variables and the number of objects per class are given and typical light curves of high-probability candidates are shown. We also report on new stellar variability types discovered in the CoRoT data. The full classification results are publicly available.
Abstract:
We study the star/galaxy classification efficiency of 13 different decision tree algorithms applied to photometric objects in the Sloan Digital Sky Survey Data Release Seven (SDSS-DR7). Each algorithm is defined by a set of parameters which, when varied, produce different final classification trees. We extensively explore the parameter space of each algorithm, using the set of 884,126 SDSS objects with spectroscopic data as the training set. The efficiency of star-galaxy separation is measured using the completeness function. We find that the Functional Tree algorithm (FT) yields the best results as measured by the mean completeness in two magnitude intervals: 14 <= r <= 21 (85.2%) and r >= 19 (82.1%). We compare the performance of the tree generated with the optimal FT configuration to the classifications provided by the SDSS parametric classifier, 2DPHOT, and Ball et al. We find that our FT classifier is comparable or better in completeness over the full magnitude range 15 <= r <= 21, with much lower contamination than all but the Ball et al. classifier. At the faintest magnitudes (r > 19), our classifier is the only one that maintains high completeness (> 80%) while simultaneously achieving low contamination (~2.5%). We also examine the SDSS parametric classifier (psfMag - modelMag) to see if the dividing line between stars and galaxies can be adjusted to improve the classifier. We find that currently stars in close pairs are often misclassified as galaxies, and suggest a new cut to improve the classifier. Finally, we apply our FT classifier to separate stars from galaxies in the full set of 69,545,326 SDSS photometric objects in the magnitude range 14 <= r <= 21.
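Completeness and contamination, the two figures of merit quoted above, can be computed from a labelled sample as in the following sketch. The definitions used are the usual ones and are assumed to match the paper's: completeness is the fraction of true galaxies recovered, and contamination is the fraction of objects labelled as galaxies that are actually stars. The label strings and toy data are hypothetical.

```python
def completeness_and_contamination(true_labels, predicted_labels, target="galaxy"):
    """Return (completeness, contamination) for the target class."""
    pairs = list(zip(true_labels, predicted_labels))
    true_target = sum(1 for t, _ in pairs if t == target)
    pred_target = sum(1 for _, p in pairs if p == target)
    correct = sum(1 for t, p in pairs if t == target and p == target)
    # Fraction of genuine targets that the classifier recovered.
    completeness = correct / true_target if true_target else float("nan")
    # Fraction of predicted targets that belong to another class.
    contamination = (pred_target - correct) / pred_target if pred_target else float("nan")
    return completeness, contamination


# Toy sample: spectroscopic truth versus classifier output.
truth = ["galaxy", "galaxy", "galaxy", "galaxy", "star", "star"]
pred = ["galaxy", "galaxy", "galaxy", "star", "galaxy", "star"]
print(completeness_and_contamination(truth, pred))
```

Evaluating the same function separately in magnitude bins would give the per-interval figures that the abstract reports.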
Abstract:
The efficacy of fluorescence spectroscopy to detect squamous cell carcinoma is evaluated in an animal model following laser excitation at 442 and 532 nm. Lesions are chemically induced with a topical DMBA application at the left lateral tongue of Golden Syrian hamsters. The animals are investigated every 2 weeks after the 4th week of induction until a total of 26 weeks. The right lateral tongue of each animal is considered as a control site (normal contralateral tissue) and the induced lesions are analyzed as a set of points covering the entire clinically detectable area. Based on fluorescence spectral differences, four indices are determined to discriminate normal and carcinoma tissues, based on intraspectral analysis. The spectral data are also analyzed using a multivariate data analysis and the results are compared with histology as the diagnostic gold standard. The best result achieved is for blue excitation using the KNN (K-nearest neighbor, an interspectral analysis) algorithm with a sensitivity of 95.7% and a specificity of 91.6%. These high indices indicate that fluorescence spectroscopy may constitute a fast noninvasive auxiliary tool for the diagnosis of cancer within the oral cavity. (C) 2008 Society of Photo-Optical Instrumentation Engineers.
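As a rough illustration of the evaluation described above, the sketch below pairs a plain k-nearest-neighbor vote with the sensitivity and specificity computation. The two-feature "spectra", the labels, and the choice of Euclidean distance with k = 3 are all assumptions made for the example; the abstract does not specify the study's actual features or settings.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def sensitivity_specificity(truth, predicted, positive="carcinoma"):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(t == positive and p == positive for t, p in zip(truth, predicted))
    fn = sum(t == positive and p != positive for t, p in zip(truth, predicted))
    tn = sum(t != positive and p != positive for t, p in zip(truth, predicted))
    fp = sum(t != positive and p == positive for t, p in zip(truth, predicted))
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical two-band intensity ratios for training points.
train_x = [(1.0, 0.2), (0.9, 0.3), (1.1, 0.25), (0.3, 0.9), (0.2, 1.0), (0.35, 0.8)]
train_y = ["normal"] * 3 + ["carcinoma"] * 3
print(knn_predict(train_x, train_y, (0.25, 0.95)))
```

Histology would supply the `truth` labels, so the two indices measure how often the spectral classifier agrees with the gold standard for diseased and healthy tissue respectively.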
Abstract:
PANI films were deposited on glass substrates by in-situ polymerization and characterized by UV-VIS spectroscopy and atomic force microscopy. A method is developed to accurately analyze ellipsometric data obtained for transparent glass substrates before and after modification with absorbing polymer films. Surface modification was made with an overlayer such as polyaniline (PANI), which exhibits different optical properties depending on its oxidation state. First, the issue of using transparent substrates for ellipsometry studies was examined, and then spectroscopic ellipsometry was used to characterize absorbing overlayers on transparent glasses. The same data analysis methodologies can also be applied to other absorbing films deposited on transparent substrates by different techniques.
Abstract:
Background: Feature selection is a pattern recognition approach to choose important variables according to some criteria in order to distinguish or explain certain phenomena (i.e., for dimensionality reduction). There are many genomic and proteomic applications that rely on feature selection to answer questions such as selecting signature genes which are informative about some biological state, e.g., normal tissues and several types of cancer; or inferring a prediction network among elements such as genes, proteins and external stimuli. In these applications, a recurrent problem is the lack of samples to perform an adequate estimate of the joint probabilities between element states. A myriad of feature selection algorithms and criterion functions have been proposed, although it is difficult to point to the best solution for each application. Results: The intent of this work is to provide an open-source multiplatform graphical environment for bioinformatics problems, which supports many feature selection algorithms, criterion functions and graphic visualization tools such as scatterplots, parallel coordinates and graphs. A feature selection approach for growing genetic networks from seed genes (targets or predictors) is also implemented in the system. Conclusion: The proposed feature selection environment allows data analysis using several algorithms, criterion functions and graphic visualization tools. Our experiments have shown the software's effectiveness in two distinct types of biological problems. Besides, the environment can be used in different pattern recognition applications, although the main concern regards bioinformatics tasks.
Abstract:
A rapid method for classification of mineral waters is proposed. The discrimination power was evaluated by a novel combination of chemometric data analysis and qualitative multi-elemental fingerprints of mineral water samples acquired from different regions of the Brazilian territory. The classification of mineral waters was assessed using only the wavelength emission intensities obtained by inductively coupled plasma optical emission spectrometry (ICP OES), monitoring different lines of Al, B, Ba, Ca, Cl, Cu, Co, Cr, Fe, K, Mg, Mn, Na, Ni, P, Pb, S, Sb, Si, Sr, Ti, V, and Zn, and Be, Dy, Gd, In, La, Sc and Y as internal standards. Data acquisition was done under robust (RC) and non-robust (NRC) conditions. Also, the combination of signal intensities of two or more emission lines for each element was evaluated instead of the individual lines. The performance of two classification algorithms, k-nearest neighbor (kNN) and soft independent modeling of class analogy (SIMCA), and two preprocessing algorithms, autoscaling and Pareto scaling, was evaluated for the ability to differentiate between the various samples in each approach tested (combination of robust or non-robust conditions with use of individual lines or sum of the intensities of emission lines). It was shown that qualitative ICP OES fingerprinting in combination with multivariate analysis is a promising analytical tool that has potential to become a recognized procedure for rapid authenticity and adulteration testing of mineral water samples or other material whose physicochemical properties (or origin) are directly related to mineral content.
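The two preprocessing options named above differ only in the divisor applied after mean-centering. The sketch below uses the standard definitions, which are assumed to be the ones the authors applied: autoscaling divides each column by its standard deviation, while Pareto scaling divides by the square root of the standard deviation. The intensity values are made up, and columns with zero variance are not handled.

```python
import math


def scale_columns(rows, method="auto"):
    """Mean-center each column, then divide by sd ("auto") or sqrt(sd) ("pareto")."""
    n = len(rows)
    scaled_cols = []
    for col in zip(*rows):
        mean = sum(col) / n
        # Sample standard deviation; assumed non-zero for every column.
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        divisor = sd if method == "auto" else math.sqrt(sd)
        scaled_cols.append([(v - mean) / divisor for v in col])
    # Transpose back to one row per sample.
    return [list(r) for r in zip(*scaled_cols)]


# Hypothetical emission intensities: a weak line and a strong line.
intensities = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
print(scale_columns(intensities, "auto"))
print(scale_columns(intensities, "pareto"))
```

Autoscaling gives every emission line equal weight, whereas Pareto scaling shrinks strong lines less aggressively, so high-intensity lines keep more influence on a distance-based classifier such as kNN.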
Abstract:
Aims: We aimed to evaluate whether the co-localisation of calcium and necrosis in intravascular ultrasound virtual histology (IVUS-VH) is due to artefact, and whether this effect can be mathematically estimated. Methods and results: We hypothesised that, if calcium induces an artefactual coding of necrosis, any addition in calcium content would generate an artificial increment in the necrotic tissue. Stent struts were used to simulate the "added calcium". The change in the amount and in the spatial localisation of necrotic tissue was evaluated before and after stenting (n=17 coronary lesions) by means of specially developed imaging software. The area of "calcium" increased from a median of 0.04 mm² at baseline to 0.76 mm² after stenting (p<0.01). In parallel, the median necrotic content increased from 0.19 mm² to 0.59 mm² (p<0.01). The "added" calcium strongly predicted a proportional increase in necrosis-coded tissue in the areas surrounding the calcium-like spots (model R²=0.70; p<0.001). Conclusions: Artificial addition of calcium-like elements to the atherosclerotic plaque led to an increase in necrotic tissue in virtual histology that is probably artefactual. The overestimation of necrotic tissue by calcium strictly followed a linear pattern, indicating that it may be amenable to mathematical correction.
Abstract:
In this paper we propose a new two-parameter lifetime distribution with increasing failure rate. The new distribution arises from a latent complementary risk problem. The properties of the proposed distribution are discussed, including a formal proof of its probability density function and explicit algebraic formulae for its reliability and failure rate functions, quantiles and moments, including the mean and variance. A simple EM-type algorithm for iteratively computing maximum likelihood estimates is presented. The Fisher information matrix is derived analytically in order to obtain the asymptotic covariance matrix. The methodology is illustrated on a real data set. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
The inverse Weibull distribution has the ability to model failure rates which are quite common in reliability and biological studies. A three-parameter generalized inverse Weibull distribution with decreasing and unimodal failure rate is introduced and studied. We provide a comprehensive treatment of the mathematical properties of the new distribution including expressions for the moment generating function and the rth generalized moment. The mixture model of two generalized inverse Weibull distributions is investigated. The identifiability property of the mixture model is demonstrated. For the first time, we propose a location-scale regression model based on the log-generalized inverse Weibull distribution for modeling lifetime data. In addition, we develop some diagnostic tools for sensitivity analysis. Two applications of real data are given to illustrate the potentiality of the proposed regression model.
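To illustrate the unimodal failure rate mentioned above, the sketch below evaluates a three-parameter generalized inverse Weibull model. The abstract gives no formulas, so the parametrisation F(t) = exp(-gamma * (alpha / t)^beta) is an assumption taken for illustration, and the parameter values are arbitrary.

```python
import math


def giw_cdf(t, alpha, beta, gamma):
    """Assumed CDF: F(t) = exp(-gamma * (alpha / t) ** beta), for t > 0."""
    return math.exp(-gamma * (alpha / t) ** beta)


def giw_pdf(t, alpha, beta, gamma):
    """Density obtained by differentiating the assumed CDF with respect to t."""
    return gamma * beta * alpha ** beta * t ** (-(beta + 1)) * giw_cdf(t, alpha, beta, gamma)


def giw_hazard(t, alpha, beta, gamma):
    """Failure rate h(t) = f(t) / (1 - F(t))."""
    # 1 - exp(-x) is computed with expm1 to stay accurate when x is small.
    survival = -math.expm1(-gamma * (alpha / t) ** beta)
    return giw_pdf(t, alpha, beta, gamma) / survival


# With alpha = 1, beta = 2, gamma = 1 the hazard rises and then falls.
for t in (0.1, 1.0, 10.0):
    print(t, giw_hazard(t, 1.0, 2.0, 1.0))
```

Setting gamma = 1 reduces this form to the ordinary two-parameter inverse Weibull distribution, which is one way to sanity-check an implementation.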