Digital Library

64 results for component classification

Phylogenetic relationships within the speciose family Characidae (Teleostei: Ostariophysi: Characiformes) based on multilocus analysis and extensive ingroup sampling

Relevance: 20.00%

Abstract:

Background: With nearly 1,100 species, the fish family Characidae represents more than half of the species of Characiformes, and is a key component of Neotropical freshwater ecosystems. The composition, phylogeny, and classification of Characidae are currently uncertain, despite significant efforts based on analysis of morphological and molecular data. No consensus about the monophyly of this group or its position within the order Characiformes has been reached, challenged by the fact that many key studies to date have non-overlapping taxonomic representation and focus only on subsets of this diversity. Results: In the present study we propose a new definition of the family Characidae and a hypothesis of relationships for the Characiformes based on phylogenetic analysis of DNA sequences of two mitochondrial and three nuclear genes (4,680 base pairs). The sequences were obtained from 211 samples representing 166 genera distributed among all 18 recognized families in the order Characiformes, all 14 recognized subfamilies in the Characidae, plus 56 of the genera so far considered incertae sedis in the Characidae. The phylogeny obtained is robust, with most lineages significantly supported by posterior probabilities in Bayesian analysis, and high bootstrap values from maximum likelihood and parsimony analyses. Conclusion: A monophyletic assemblage strongly supported in all our phylogenetic analyses is herein defined as the Characidae and includes the characiform species lacking a supraorbital bone and with a derived position of the emergence of the hyoid artery from the anterior ceratohyal. To recognize this and several other monophyletic groups within characiforms we propose changes in the limits of several families to facilitate future studies in the Characiformes and particularly the Characidae.
This work presents a new phylogenetic framework for a speciose and morphologically diverse group of freshwater fishes of significant ecological and evolutionary importance across the Neotropics and portions of Africa.

The International LAM Registry: A Component of an Innovative Web-Based Clinician, Researcher, and Patient-Driven Rare Disease Research Platform

Relevance: 20.00%

Abstract:

Background: A relative inability to capture a sufficiently large patient population in any one geographic location has traditionally limited research into rare diseases. Methods and Results: Clinicians interested in the rare disease lymphangioleiomyomatosis (LAM) have worked with the LAM Treatment Alliance, the MIT Media Lab, and Clozure Associates to cooperate in the design of a state-of-the-art data coordination platform that can be used for clinical trials and other research focused on the global LAM patient population. This platform is a component of a set of web-based resources, including a patient self-report data portal, aimed at accelerating research in rare diseases in a rigorous fashion. Conclusions: Collaboration between clinicians, researchers, advocacy groups, and patients can create essential community resource infrastructure to accelerate rare disease research. The International LAM Registry is an example of such an effort.

Fluorescence and reflectance spectroscopy for identification of atherosclerosis in human carotid arteries using principal components analysis

Relevance: 20.00%

Abstract:

Objectives: The aim of this work was to verify the differentiation between normal and pathological human carotid artery tissues by using fluorescence and reflectance spectroscopy in the 400- to 700-nm range and the spectral characterization by means of principal components analysis. Background Data: Atherosclerosis is the most common and serious pathology of the cardiovascular system. Principal components represent the main spectral characteristics that occur within the spectral data and could be used for tissue classification. Materials and Methods: Sixty postmortem carotid artery fragments (26 non-atherosclerotic and 34 atherosclerotic with non-calcified plaques) were studied. The excitation radiation consisted of a 488-nm argon laser. Two 600-μm core optical fibers were used, one for excitation and one to collect the fluorescence radiation from the samples. The reflectance system was composed of a halogen lamp coupled to an excitation fiber positioned in one of the ports of an integrating sphere that delivered 5 mW to the sample. The photo-reflectance signal was coupled to a 1/4-m spectrograph via an optical fiber. Euclidean distance was then used to classify each principal component score into one of two classes, normal and atherosclerotic tissue, for both fluorescence and reflectance. Results: The principal components analysis allowed classification of the samples with 81% sensitivity and 88% specificity for fluorescence, and 81% sensitivity and 91% specificity for reflectance. Conclusions: Our results showed that principal components analysis could be applied to differentiate between normal and atherosclerotic tissue with high sensitivity and specificity.
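
The classification step described in this abstract (assigning each principal component score to the nearer class by Euclidean distance, then reporting sensitivity and specificity) can be sketched as follows. This is a minimal illustration only: the two-component scores, the class centroids, and the function names are hypothetical, and the abstract does not say whether distances were measured to class means or to something else, so the nearest-centroid rule is an assumption.

```python
import math


def centroid(points):
    """Mean of a list of equal-length score vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def classify(score, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(score, centroids[label]))


def sensitivity_specificity(true_labels, predicted, positive):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(true_labels, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical principal component scores (two components per sample).
train = {
    "normal": [[0.0, 0.1], [0.2, -0.1], [0.1, 0.0]],
    "atherosclerotic": [[1.0, 0.9], [1.2, 1.1], [0.9, 1.0]],
}
centroids = {label: centroid(points) for label, points in train.items()}

test_scores = [[0.1, 0.1], [1.1, 1.0], [0.7, 0.6], [0.3, 0.2]]
test_labels = ["normal", "atherosclerotic", "normal", "atherosclerotic"]
predicted = [classify(score, centroids) for score in test_scores]
sens, spec = sensitivity_specificity(test_labels, predicted, "atherosclerotic")
```

The toy test set deliberately includes one misclassified sample per class, so both figures come out below 100%.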

NGC 7097: THE ACTIVE GALACTIC NUCLEUS AND ITS MIRROR, REVEALED BY PRINCIPAL COMPONENT ANALYSIS TOMOGRAPHY

Relevance: 20.00%

Abstract:

Three-dimensional spectroscopy techniques are becoming more and more popular, producing an increasing number of large data cubes. The challenge of extracting information from these cubes requires the development of new techniques for data processing and analysis. We apply the recently developed technique of principal component analysis (PCA) tomography to a data cube from the center of the elliptical galaxy NGC 7097 and show that this technique is effective in decomposing the data into physically interpretable information. We find that the first five principal components of our data are associated with distinct physical characteristics. In particular, we detect a low-ionization nuclear-emitting region (LINER) with a weak broad component in the Balmer lines. Two images of the LINER are present in our data, one seen through a disk of gas and dust, and the other after scattering by free electrons and/or dust particles in the ionization cone. Furthermore, we extract the spectrum of the LINER, decontaminated from stellar and extended nebular emission, using only the technique of PCA tomography. We anticipate that the scattered image has polarized light due to its scattered nature.
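
As a rough illustration of how PCA can be applied to a data cube, the sketch below flattens a tiny cube into a list of spectra (one per spatial pixel), extracts the leading eigenvector of the spectral covariance matrix (an "eigenspectrum"), and reshapes the projections back into an image (a "tomogram"). The function name, the toy cube, and the use of power iteration in place of a full eigendecomposition are assumptions made here for brevity; they are not taken from the paper.

```python
import math


def first_component_tomogram(cube, iterations=100):
    """cube[y][x] is a spectrum; returns (eigenspectrum, tomogram)."""
    ny, nx = len(cube), len(cube[0])
    spectra = [cube[y][x] for y in range(ny) for x in range(nx)]
    n, nl = len(spectra), len(spectra[0])

    # Subtract the mean spectrum so the covariance describes variations only.
    mean = [sum(s[k] for s in spectra) / n for k in range(nl)]
    centered = [[s[k] - mean[k] for k in range(nl)] for s in spectra]

    # Covariance between wavelength channels.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(nl)]
           for i in range(nl)]

    # Power iteration converges to the eigenvector with the largest eigenvalue
    # (assumption: the start vector is not orthogonal to it).
    v = [1.0 / math.sqrt(nl)] * nl
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(nl)) for i in range(nl)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            raise ValueError("cube has no variance along the start vector")
        v = [c / norm for c in w]

    # Project every spatial pixel onto the eigenspectrum and restore the grid.
    scores = [sum(r[k] * v[k] for k in range(nl)) for r in centered]
    tomogram = [scores[y * nx:(y + 1) * nx] for y in range(ny)]
    return v, tomogram


# Toy 2x2 cube with 3 wavelength channels: flat spectra everywhere,
# plus an "emission line" in channel 1 of the pixel at (0, 0).
cube = [
    [[1.0, 5.0, 1.0], [1.0, 1.0, 1.0]],
    [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]
eigenspectrum, tomogram = first_component_tomogram(cube)
```

In this toy case the eigenspectrum isolates the line channel and the tomogram highlights the pixel that contains it. In general the sign of an eigenvector is arbitrary, so a tomogram and its eigenspectrum have to be read together.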

Automated supervised classification of variable stars in the CoRoT programme: Method and application to the first four exoplanet fields

Relevance: 20.00%

Abstract:

Aims. In this work, we describe the pipeline for the fast supervised classification of light curves observed by the CoRoT exoplanet CCDs. We present the classification results obtained for the first four measured fields, which represent a one-year in-orbit operation. Methods. The basis of the adopted supervised classification methodology has been described in detail in a previous paper, as is its application to the OGLE database. Here, we present the modifications of the algorithms and of the training set to optimize the performance when applied to the CoRoT data. Results. Classification results are presented for the observed fields IRa01, SRc01, LRc01, and LRa01 of the CoRoT mission. Statistics on the number of variables and the number of objects per class are given and typical light curves of high-probability candidates are shown. We also report on new stellar variability types discovered in the CoRoT data. The full classification results are publicly available.

Hubble parameter reconstruction from a principal component analysis: minimizing the bias

Relevance: 20.00%

Abstract:

Aims. A model-independent reconstruction of the cosmic expansion rate is essential to a robust analysis of cosmological observations. Our goal is to demonstrate that current data are able to provide reasonable constraints on the behavior of the Hubble parameter with redshift, independently of any cosmological model or underlying gravity theory. Methods. Using type Ia supernova data, we show that it is possible to analytically calculate the Fisher matrix components in a Hubble parameter analysis without assumptions about the energy content of the Universe. We used a principal component analysis to reconstruct the Hubble parameter as a linear combination of the Fisher matrix eigenvectors (principal components). To suppress the bias introduced by the high redshift behavior of the components, we considered the value of the Hubble parameter at high redshift as a free parameter. We first tested our procedure using a mock sample of type Ia supernova observations, and then applied it to the real data compiled by the Sloan Digital Sky Survey (SDSS) group. Results. In the mock sample analysis, we demonstrate that it is possible to drastically suppress the bias introduced by the high redshift behavior of the principal components. Applying our procedure to the real data, we show that it allows us to determine the behavior of the Hubble parameter with reasonable uncertainty, without introducing any ad-hoc parameterizations. Beyond that, our reconstruction agrees with completely independent measurements of the Hubble parameter obtained from red-envelope galaxies.
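
For readers unfamiliar with principal component analysis of a Fisher matrix, the generic construction is sketched below. The notation is an assumption made here, not the paper's: $F$ is the Fisher matrix over $N$ redshift-bin parameters, $\lambda_i$ and $\mathbf{e}_i$ are its eigenvalues and orthonormal eigenvectors, and $\alpha_i$ are the fitted coefficients. The extra free parameter for the high-redshift value of $H$ mentioned in the abstract is not shown.

```latex
% Eigen-decomposition of the (symmetric) Fisher matrix
F = \sum_{i=1}^{N} \lambda_i \, \mathbf{e}_i \mathbf{e}_i^{T},
\qquad \mathbf{e}_i^{T} \mathbf{e}_j = \delta_{ij}

% Reconstruction as a linear combination of the first M eigenvectors
H(z) \simeq \sum_{i=1}^{M} \alpha_i \, e_i(z), \qquad M \le N

% The coefficients are uncorrelated, with uncertainties set by the eigenvalues
\sigma(\alpha_i) = \lambda_i^{-1/2}
```

Keeping only the $M$ best-determined modes (largest $\lambda_i$) lowers the variance of the reconstruction at the cost of a bias from the discarded modes, which is the trade-off the abstract refers to.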

CHARACTERIZATION OF THE TREE COMPONENT IN A SEMIDECIDUOUS FOREST IN THE ESPINHAÇO RANGE: A SUBSIDY TO CONSERVATION

Relevance: 20.00%

Abstract:

This study was conducted in the Private Reserve Mata do Jambreiro (912 ha), located in the Iron Quadrangle, Minas Gerais, in the southeastern portion of the Espinhaço Range, which is predominantly covered by semideciduous seasonal montane forest. Three topographically and physiognomically similar areas located within a continuous forest fragment, 1.3 to 1.5 km apart, were sampled by the point-quadrat method. In each area, 30 points were marked. Individuals with a minimum perimeter at breast height (PBH) of 15 cm were sampled, totaling 111 species belonging to 40 families. The most representative family was Fabaceae, with 14.29% of the total number of species. Low floristic similarity (5.3% to 34.4%) was observed between the areas, pointing out the importance of the distribution of sample units in continuous fragments. The Shannon diversity index (H') was 4.22 and Pielou evenness (J) was 0.894. Soil analysis showed some differences in chemical composition between the three studied areas and was an important component for the interpretation of the floristic variation found. The low floristic similarity observed here for nearby areas justifies the requirement of more detailed inventories by Brazilian environmental agencies in the legal authorization procedures prior to the establishment of new enterprises. Also, the professionals who conduct rapid inventories, mainly environmental consultants, should give more attention to this kind of floristic variation and to the methods used to inventory complex forests.
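
The two diversity measures quoted in this abstract follow from simple formulas: the Shannon index is H' = -Σ p_i ln p_i over the species proportions p_i, and Pielou's J is H' divided by its maximum, ln S, for S species. The sketch below assumes natural logarithms (the abstract does not state the base) and uses made-up abundance counts; the function names are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p * ln p) over species with count > 0."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """Pielou's J = H' / ln(S), where S is the number of species present."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        raise ValueError("evenness is undefined for fewer than two species")
    return shannon_index(counts) / math.log(richness)


# Made-up abundances: a perfectly even community and a strongly dominated one.
even_community = [10, 10, 10, 10]
dominated_community = [97, 1, 1, 1]

h_even = shannon_index(even_community)         # equals ln(4)
j_even = pielou_evenness(even_community)       # equals 1 for equal abundances
j_dominated = pielou_evenness(dominated_community)
```

With 111 species, the largest value H' could take under natural logarithms is ln 111 ≈ 4.71, which gives a sense of scale for the reported 4.22.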

Two-component Abelian sandpile models

Relevance: 20.00%

Abstract:

In one-component Abelian sandpile models, the toppling probabilities are independent quantities. This is not the case in multicomponent models. The condition of associativity of the underlying Abelian algebras imposes nonlinear relations among the toppling probabilities. These relations are derived for the case of two-component quadratic Abelian algebras. We show that Abelian sandpile models with two conservation laws have only trivial avalanches.

TESTING STATISTICAL HYPOTHESIS ON RANDOM TREES AND APPLICATIONS TO THE PROTEIN CLASSIFICATION PROBLEM

Relevance: 20.00%

Abstract:

Efficient automatic protein classification is of central importance in genomic annotation. As an independent way to check the reliability of the classification, we propose a statistical approach to test if two sets of protein domain sequences coming from two families of the Pfam database are significantly different. We model protein sequences as realizations of Variable Length Markov Chains (VLMC) and we use the context trees as a signature of each protein family. Our approach is based on a Kolmogorov-Smirnov-type goodness-of-fit test proposed by Balding et al. [Limit theorems for sequences of random trees (2008), DOI: 10.1007/s11749-008-0092-z]. The test statistic is a supremum over the space of trees of a function of the two samples; its computation grows, in principle, exponentially fast with the maximal number of nodes of the potential trees. We show how to transform this problem into a max-flow over a related graph which can be solved using a Ford-Fulkerson algorithm in polynomial time on that number. We apply the test to 10 randomly chosen protein domain families from the seed of Pfam-A database (high quality, manually curated families). The test shows that the distributions of context trees coming from different families are significantly different. We emphasize that this is a novel mathematical approach to validate the automatic clustering of sequences in any context. We also study the performance of the test via simulations on Galton-Watson related processes.
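
The max-flow step mentioned in this abstract can be illustrated with a generic solver. The sketch below uses the Edmonds-Karp variant of Ford-Fulkerson (augmenting paths found by breadth-first search), which is a choice made here; the abstract only names "a Ford-Fulkerson algorithm". The example graph is arbitrary, and the paper's construction of the graph from the space of trees is not reproduced.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Maximum flow for capacity = {u: {v: cap}} using BFS augmenting paths."""
    if source == sink:
        raise ValueError("source and sink must differ")

    # Residual graph: forward edges carry capacity, reverse edges start at 0.
    residual = {source: {}, sink: {}}
    for u, edges in capacity.items():
        for v, cap in edges.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual[v].setdefault(u, 0)

    flow = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path left

        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Arbitrary example network.
network = {
    "s": {"a": 3, "b": 2},
    "a": {"b": 1, "t": 2},
    "b": {"t": 3},
}
total = max_flow(network, "s", "t")
```

Choosing shortest augmenting paths bounds the running time polynomially in the graph size regardless of the capacity values, which is why this variant is a common default.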

LIPSCHITZ CLASSIFICATION OF FUNCTIONS ON A HÖLDER TRIANGLE

Relevance: 20.00%

Abstract:

The problem of semialgebraic Lipschitz classification of quasihomogeneous polynomials on a Hölder triangle is studied. For this problem, the "moduli" are described completely in certain combinatorial terms.

A Component of the Xanthomonadaceae Type IV Secretion System Combines a VirB7 Motif with a N0 Domain Found in Outer Membrane Transport Proteins

Relevance: 20.00%

Abstract:

Type IV secretion systems (T4SS) are used by Gram-negative bacteria to translocate protein and DNA substrates across the cell envelope and into target cells. Translocation across the outer membrane is achieved via a ringed tetradecameric outer membrane complex made up of a small VirB7 lipoprotein (normally 30 to 45 residues in the mature form) and the C-terminal domains of the VirB9 and VirB10 subunits. Several species from the genera of Xanthomonas phytopathogens possess an uncharacterized type IV secretion system with some distinguishing features, one of which is an unusually large VirB7 subunit (118 residues in the mature form). Here, we report the NMR and 1.0 Å X-ray structures of the VirB7 subunit from Xanthomonas citri subsp. citri (VirB7(XAC2622)) and its interaction with VirB9. NMR solution studies show that residues 27-41 of the disordered flexible N-terminal region of VirB7(XAC2622) interact specifically with the VirB9 C-terminal domain, resulting in a significant reduction in the conformational freedom of both regions. VirB7(XAC2622) has a unique C-terminal domain whose topology is strikingly similar to that of N0 domains found in proteins from different systems involved in transport across the bacterial outer membrane. We show that VirB7(XAC2622) oligomerizes through interactions involving conserved residues in the N0 domain and residues 42-49 within the flexible N-terminal region and that these homotropic interactions can persist in the presence of heterotropic interactions with VirB9. Finally, we propose that VirB7(XAC2622) oligomerization is compatible with the core complex structure in a manner such that the N0 domains form an extra layer on the perimeter of the tetradecameric ring.

Improving sample representativeness in environmental studies: a major component for the uncertainty budget

Relevance: 20.00%

Abstract:

For environmental quality assessment, INAA has been applied to determine chemical elements in small (200 mg) and large (200 g) samples of leaves from 200 trees. By applying Ingamells' constant, the expected percent standard deviation was estimated at 0.9-2.2% for 200 mg samples. For composite samples (200 g), by contrast, the expected standard deviation varied from 0.5 to 10% in spite of analytical uncertainties ranging from 2 to 30%. The results thereby suggest expressing the degree of representativeness as a source of uncertainty, which contributes to increasing the reliability of environmental studies, mainly in the case of composite samples.
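
Ingamells' sampling constant is commonly defined as K_s = R² · w, where R is the relative standard deviation (in percent) due to sampling and w is the sample mass, so K_s is the mass at which the sampling uncertainty is 1%. The sketch below only illustrates that definition with hypothetical numbers and function names; it does not reproduce the composite-sample figures quoted in the abstract, which depend on between-tree variability that this simple relation ignores.

```python
import math


def sampling_constant(rsd_percent, mass_g):
    """Ingamells' constant K_s = R^2 * w (R in percent, w in grams)."""
    return rsd_percent ** 2 * mass_g


def expected_rsd(ks, mass_g):
    """Expected relative standard deviation (percent) for a given sample mass."""
    return math.sqrt(ks / mass_g)


# Hypothetical case: a 2% sampling RSD observed for 0.2 g (200 mg) portions.
ks = sampling_constant(2.0, 0.2)          # K_s in grams
rsd_small = expected_rsd(ks, 0.2)         # recovers the 2% starting value
rsd_large = expected_rsd(ks, 200.0)       # same material, 1000 times the mass
```

Under this relation alone, increasing the mass by a factor of 1000 reduces the sampling RSD by a factor of √1000 ≈ 31.6, provided the material is otherwise the same.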

Laser-induced breakdown spectroscopy and chemometrics for classification of toys relying on toxic elements

Relevance: 20.00%

Abstract:

Quality control of toys to avoid children's exposure to potentially toxic elements is of utmost relevance, and it is a common requirement in national and/or international norms for health and safety reasons. Laser-induced breakdown spectroscopy (LIBS) was recently evaluated at the authors' laboratory for direct analysis of plastic toys, and one of the main difficulties for the determination of Cd, Cr and Pb was the variety of mixtures and types of polymers. As most norms rely on migration (lixiviation) protocols, chemometric classification models from LIBS spectra were tested for sampling toys that present potential risk of Cd, Cr and Pb contamination. The classification models were generated from the emission spectra of 51 polymeric toys and by using Partial Least Squares - Discriminant Analysis (PLS-DA), Soft Independent Modeling of Class Analogy (SIMCA) and K-Nearest Neighbor (KNN). The classification models and validations were carried out with 40 and 11 test samples, respectively. Best results were obtained when KNN was used, with correct predictions varying from 95% for Cd to 100% for Cr and Pb. (C) 2011 Elsevier B.V. All rights reserved.
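
Of the three classifiers named in this abstract, K-Nearest Neighbor is the simplest to show in a few lines: a new sample takes the majority label of its k closest training samples. The sketch below is a generic illustration; the two-feature vectors stand in for LIBS emission spectra, and the labels, the value of k, and the function name are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training points nearest to the query."""
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: two emission-line intensities per toy sample.
train_x = [
    [0.10, 0.20], [0.20, 0.10], [0.15, 0.15],
    [0.90, 0.80], [0.80, 0.90], [0.85, 0.85],
]
train_y = ["below limit"] * 3 + ["above limit"] * 3

low = knn_predict(train_x, train_y, [0.20, 0.20])
high = knn_predict(train_x, train_y, [0.70, 0.90])
```

An odd k avoids tied votes when there are only two classes. With real spectra, the choice of distance metric and preprocessing usually matters more than the voting rule.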

Kernel machines for epilepsy diagnosis via EEG signal classification: A comparative study

Relevance: 20.00%

Abstract:

Objective: We carry out a systematic assessment on a suite of kernel-based learning machines while coping with the task of epilepsy diagnosis through automatic electroencephalogram (EEG) signal classification. Methods and materials: The kernel machines investigated include the standard support vector machine (SVM), the least squares SVM, the Lagrangian SVM, the smooth SVM, the proximal SVM, and the relevance vector machine. An extensive series of experiments was conducted on publicly available data, whose clinical EEG recordings were obtained from five normal subjects and five epileptic patients. The performance levels delivered by the different kernel machines are contrasted in terms of the criteria of predictive accuracy, sensitivity to the kernel function/parameter value, and sensitivity to the type of features extracted from the signal. For this purpose, 26 values for the kernel parameter (radius) of two well-known kernel functions (namely, Gaussian and exponential radial basis functions) were considered as well as 21 types of features extracted from the EEG signal, including statistical values derived from the discrete wavelet transform, Lyapunov exponents, and combinations thereof. Results: We first quantitatively assess the impact of the choice of the wavelet basis on the quality of the features extracted. Four wavelet basis functions were considered in this study. Then, we provide the average accuracy (i.e., cross-validation error) values delivered by 252 kernel machine configurations; in particular, 40%/35% of the best-calibrated models of the standard and least squares SVMs reached 100% accuracy rate for the two kernel functions considered. Moreover, we show the sensitivity profiles exhibited by a large sample of the configurations whereby one can visually inspect their levels of sensitivity to the type of feature and to the kernel function/parameter value.
Conclusions: Overall, the results evidence that all kernel machines are competitive in terms of accuracy, with the standard and least squares SVMs prevailing more consistently. Moreover, the choice of the kernel function and parameter value as well as the choice of the feature extractor are critical decisions to be taken, albeit the choice of the wavelet family seems not to be so relevant. Also, the statistical values calculated over the Lyapunov exponents were good sources of signal representation, but not as informative as their wavelet counterparts. Finally, a typical sensitivity profile has emerged among all types of machines, involving some regions of stability separated by zones of sharp variation, with some kernel parameter values clearly associated with better accuracy rates (zones of optimality). (C) 2011 Elsevier B.V. All rights reserved.
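
The two kernel functions compared in this study differ only in whether the distance between samples is squared. The sketch below uses one common parameterization, with the radius written as `sigma` and a factor 2σ² in the denominator; the abstract does not give the exact form used, so this scaling, the function names, and the sample points are assumptions.

```python
import math


def gaussian_rbf(x, y, sigma):
    """Gaussian RBF: exp(-||x - y||^2 / (2 * sigma^2))."""
    return math.exp(-math.dist(x, y) ** 2 / (2 * sigma ** 2))


def exponential_rbf(x, y, sigma):
    """Exponential RBF: exp(-||x - y|| / (2 * sigma^2)), distance not squared."""
    return math.exp(-math.dist(x, y) / (2 * sigma ** 2))


def gram_matrix(points, kernel, sigma):
    """Kernel value for every pair of points (symmetric, ones on the diagonal)."""
    return [[kernel(p, q, sigma) for q in points] for p in points]


# Hypothetical feature vectors and a hypothetical radius value.
points = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]
k_gauss = gram_matrix(points, gaussian_rbf, sigma=5.0)
k_exp = gram_matrix(points, exponential_rbf, sigma=5.0)
```

Sweeping `sigma` over a grid and recomputing the Gram matrix for each value is the usual way to study sensitivity to the kernel parameter. As a side note, 6 machines × 21 feature types × 2 kernel functions gives 252 combinations, which matches the count quoted in the abstract, although the abstract does not spell out that breakdown.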

Improvements on ICA mixture models for image pre-processing and segmentation

Relevance: 20.00%

Abstract:

Today several different unsupervised classification algorithms are commonly used to cluster similar patterns in a data set based only on its statistical properties. Especially in image data applications, self-organizing methods for unsupervised classification have been successfully applied for clustering pixels or groups of pixels in order to perform segmentation tasks. The first important contribution of this paper refers to the development of a self-organizing method for data classification, named Enhanced Independent Component Analysis Mixture Model (EICAMM), which was built by proposing some modifications in the Independent Component Analysis Mixture Model (ICAMM). Such improvements were proposed by considering some of the model limitations as well as by analyzing how it should be improved in order to become more efficient. Moreover, a pre-processing methodology was also proposed, which is based on combining the Sparse Code Shrinkage (SCS) for image denoising and the Sobel edge detector. In the experiments of this work, the EICAMM and other self-organizing models were applied for segmenting images in their original and pre-processed versions. A comparative analysis showed satisfactory and competitive image segmentation results obtained by the proposals presented herein. (C) 2008 Published by Elsevier B.V.
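
Of the two pre-processing steps named in this abstract, the Sobel edge detector is easy to show directly: two 3×3 masks estimate the horizontal and vertical intensity gradients, and their magnitude marks edges. The sketch below is a plain-Python illustration on a tiny made-up image; leaving border pixels at zero is a simplification chosen here, and the Sparse Code Shrinkage step is not shown.

```python
import math

# Standard Sobel masks for horizontal (x) and vertical (y) gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Gradient magnitude for interior pixels; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Apply each mask to the 3x3 neighbourhood (no mask flipping;
            # the magnitude is the same either way).
            gx = sum(SOBEL_X[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = math.hypot(gx, gy)
    return out


# Made-up 4x4 image with a vertical step edge between columns 1 and 2.
step_image = [[0, 0, 1, 1] for _ in range(4)]
flat_image = [[1, 1, 1, 1] for _ in range(4)]

edges = sobel_magnitude(step_image)
no_edges = sobel_magnitude(flat_image)
```

The step image produces a strong response along the boundary columns, while the flat image produces none.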
