102 results for constrained clustering
Abstract:
Efficient automatic protein classification is of central importance in genomic annotation. As an independent way to check the reliability of the classification, we propose a statistical approach to test whether two sets of protein domain sequences coming from two families of the Pfam database are significantly different. We model protein sequences as realizations of Variable Length Markov Chains (VLMC) and we use the context trees as a signature of each protein family. Our approach is based on a Kolmogorov-Smirnov-type goodness-of-fit test proposed by Balding et al. [Limit theorems for sequences of random trees (2008), DOI: 10.1007/s11749-008-0092-z]. The test statistic is a supremum over the space of trees of a function of the two samples; its computational cost grows, in principle, exponentially fast with the maximal number of nodes of the potential trees. We show how to transform this problem into a max-flow problem over a related graph, which can be solved using a Ford-Fulkerson algorithm in time polynomial in that number. We apply the test to 10 randomly chosen protein domain families from the seed of the Pfam-A database (high quality, manually curated families). The test shows that the distributions of context trees coming from different families are significantly different. We emphasize that this is a novel mathematical approach to validate the automatic clustering of sequences in any context. We also study the performance of the test via simulations on Galton-Watson related processes.
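To make the max-flow step concrete, here is a minimal sketch of a Ford-Fulkerson computation using breadth-first augmenting paths (the Edmonds-Karp variant). The graph, the node names and the function name `max_flow` are illustrative assumptions; the abstract does not describe how the authors build their graph from the context trees, so this shows only the generic algorithm.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Maximum flow by Ford-Fulkerson with BFS augmenting paths (Edmonds-Karp).

    `capacity` is a dict of dicts: capacity[u][v] is the capacity of edge u -> v.
    This is a generic sketch, not the graph construction used in the paper.
    """
    if source == sink:
        raise ValueError("source and sink must differ")
    # Residual capacities, with zero-capacity reverse edges added.
    residual = {source: {}, sink: {}}
    for u, edges in capacity.items():
        for v, c in edges.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left: the flow is maximal
        # Bottleneck capacity along the path found.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck along the path and update reverse edges.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical toy network, for illustration only.
toy_graph = {
    "s": {"a": 3, "b": 2},
    "a": {"b": 1, "t": 2},
    "b": {"t": 3},
}
print(max_flow(toy_graph, "s", "t"))
```

With breadth-first search the number of augmentations is polynomial in the size of the graph, which matches the abstract's contrast between an exponential search and a polynomial-time max-flow computation.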
Abstract:
Many natural populations exploiting a wide range of resources are actually composed of relatively specialized individuals. This interindividual variation is thought to be a consequence of the invasion of 'empty' niches in depauperate communities, generally in temperate regions. If individual niches are constrained by functional trade-offs, the expansion of the population niche is only achieved by an increase in interindividual variation, consistent with the 'niche variation hypothesis'. According to this hypothesis, we should not expect interindividual variation in species belonging to highly diverse, packed communities. In the present study, we measured the degree of interindividual diet variation in four species of frogs of the highly diverse Brazilian Cerrado, using both gut contents and δ13C stable isotopes. We found evidence of significant diet variation in the four species, indicating that this phenomenon is not restricted to depauperate communities in temperate regions. The lack of correlations between the frogs' morphology and diet indicates that trade-offs do not depend on the morphological characters measured here and are probably not biomechanical. The nature of the trade-offs remains unknown, but they are likely to be cognitive or physiological. Finally, we found a positive correlation between the population niche width and the degree of diet variation, but a null model showed that this correlation can be generated by individuals sampling randomly from a common set of resources. Therefore, although our results are consistent with the niche variation hypothesis, they cannot be taken as evidence in its favour.
Abstract:
The k0-method of instrumental neutron activation analysis (k0-INAA) was employed for determining chemical elements in bird feathers. A collection was obtained taking into account several bird species from wet ecosystems in diverse regions of Brazil. For comparison, feathers were actively sampled in a riparian forest along the Marins Stream, Piracicaba, Sao Paulo State, using mist nets specific for capturing birds. Biological certified reference materials were used for assessing the quality of the analytical procedure. Quantification of chemical elements was performed using the k0-INAA Quantu Software. Sixteen chemical elements, including macro- and micronutrients and trace elements, were quantified in feathers, with analytical uncertainties varying from 2% to 40% depending on the chemical element mass fraction. Results indicated high mass fractions of Br (max = 7.9 mg/kg), Co (max = 0.47 mg/kg), Cr (max = 68 mg/kg), Hg (max = 2.79 mg/kg), Sb (max = 0.20 mg/kg), Se (max = 1.3 mg/kg) and Zn (max = 192 mg/kg) in bird feathers, probably associated with the degree of pollution of the areas evaluated. In order to corroborate the use of k0-INAA results in biomonitoring studies using avian communities, different factor analysis methods were used to check chemical element source apportionment and locality clustering based on feather chemical composition. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
Various molecular systems are available for epidemiological, genetic, evolutionary, taxonomic and systematic studies of innumerable fungal infections, especially those caused by the opportunistic pathogen C. albicans. A total of 75 independent oral isolates were selected in order to compare Multilocus Enzyme Electrophoresis (MLEE), Electrophoretic Karyotyping (EK) and Microsatellite Markers (Simple Sequence Repeats, SSRs) in their ability to differentiate and group C. albicans isolates (discriminatory power), and also to evaluate the concordance and similarity of the groups of strains determined by cluster analysis for each fingerprinting method. Isoenzyme typing was performed using eleven enzyme systems: Adh, Sdh, M1p, Mdh, Idh, Gdh, G6pdh, Asd, Cat, Po, and Lap (data previously published). The EK method consisted of chromosomal DNA separation by pulsed-field gel electrophoresis using a CHEF system. The microsatellite markers were investigated by PCR using three polymorphic loci: EF3, CDC3, and HIS3. Dendrograms were generated by the SAHN method and UPGMA algorithm based on similarity matrices (S(SM)). The discriminatory power of the three methods was over 95%; however, a paired analysis among them showed a parity of only 19.7-22.4% in the identification of strains. Weak correlation was also observed among the genetic similarity matrices (S(SM)(MLEE) x S(SM)(EK) x S(SM)(SSRs)). Clustering analyses showed a mean of 9 ± 12.4 isolates per cluster (3.8 ± 8 isolates/taxon) for MLEE, 6.2 ± 4.9 isolates per cluster (4 ± 4.5 isolates/taxon) for SSRs, and 4.1 ± 2.3 isolates per cluster (2.6 ± 2.3 isolates/taxon) for EK. A total of 45 (13%), 39 (11.2%), 5 (1.4%) and 3 (0.9%) cluster pairs out of 347 showed similarity (Si) of 0.1-10%, 10.1-20%, 20.1-30% and 30.1-40%, respectively. Clinical and molecular epidemiological correlations involving the opportunistic pathogen C. albicans may depend on the genotyping method used (i.e., MLEE, EK, and SSRs), supplemented with similarity and grouping analysis. Therefore, the use of genotyping systems whose results show minimum disparity, or the combination of the results of these systems, can provide greater security and consistency in the determination of strains and their genetic relationships. (C) 2010 Elsevier B.V. All rights reserved.
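As an illustration of how a similarity matrix of this kind can be built, the sketch below assumes that S(SM) denotes the simple matching coefficient computed on binary presence/absence profiles (for example, bands scored as 1 or 0). That reading, the toy profiles and the function names are assumptions made for the example; the abstract does not give the scoring scheme.

```python
def simple_matching(profile_a, profile_b):
    """Simple matching coefficient: share of positions where two profiles agree.

    Both shared presences (1, 1) and shared absences (0, 0) count as matches.
    """
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(profile_a, profile_b) if x == y)
    return matches / len(profile_a)


def similarity_matrix(profiles):
    """Symmetric matrix of pairwise simple matching coefficients."""
    return [[simple_matching(p, q) for q in profiles] for p in profiles]


# Hypothetical binary band profiles for three isolates.
isolates = [
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 1],
]
for row in similarity_matrix(isolates):
    print(row)
```

A matrix like this is the usual input to an agglomerative method such as UPGMA, which repeatedly merges the two clusters with the highest average similarity.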
Abstract:
Today several different unsupervised classification algorithms are commonly used to cluster similar patterns in a data set based only on its statistical properties. Especially in image data applications, self-organizing methods for unsupervised classification have been successfully applied for clustering pixels or groups of pixels in order to perform segmentation tasks. The first important contribution of this paper is the development of a self-organizing method for data classification, named Enhanced Independent Component Analysis Mixture Model (EICAMM), which was built by proposing some modifications to the Independent Component Analysis Mixture Model (ICAMM). These improvements were proposed by considering some of the model's limitations and by analyzing how it should be modified in order to become more efficient. Moreover, a pre-processing methodology is also proposed, based on combining Sparse Code Shrinkage (SCS) for image denoising with the Sobel edge detector. In the experiments of this work, the EICAMM and other self-organizing models were applied to segment images in their original and pre-processed versions. A comparative analysis showed satisfactory and competitive image segmentation results obtained by the proposals presented herein. (C) 2008 Published by Elsevier B.V.
Abstract:
Continuing our series of papers on the three-dimensional (3D) structure and accurate distances of planetary nebulae (PNe), we present here the results obtained for PN NGC 40. Using data from different sources and wavelengths, we construct 3D photoionization models and derive the physical quantities of the ionizing source and nebular gas. The procedure, discussed in detail in the previous papers, consists of the use of 3D photoionization codes constrained by observational data to derive the 3D nebular structure, physical and chemical characteristics, and ionizing star parameters of the objects by simultaneously fitting the integrated line intensities, the density map, the temperature map, and the observed morphologies in different emission lines. For this particular case we combined hydrodynamical simulations with the photoionization scheme in order to obtain self-consistent distributions of density and velocity of the nebular material. Combining the velocity field with the emission-line cubes, we also obtained synthetic position-velocity plots that are compared to the observations. Finally, using theoretical evolutionary tracks of intermediate- and low-mass stars, we derive the mass and age of the central star of NGC 40 as (0.567 ± 0.06) M⊙ and (5810 ± 600) yr, respectively. The distance obtained from the fitting procedure was (1150 ± 120) pc.
Abstract:
We investigated the effect of joint immobilization on postural sway during quiet standing. We hypothesized that the center of pressure (COP), rambling, and trembling trajectories would be affected by joint immobilization. Ten young adults stood on a force plate for 60 s without and with immobilized joints (only knees constrained, CK; knees and hips, CH; and knees, hips, and trunk, CT), with their eyes open (OE) or closed (CE). The root mean square deviation (RMS, the standard deviation from the mean) and mean speed of the COP, rambling, and trembling trajectories in the anterior-posterior and medial-lateral directions were analyzed. Similar effects of vision were observed for both directions: larger amplitudes for all variables were observed in the CE condition. In the anterior-posterior direction, postural sway increased only when the knees, hips, and trunk were immobilized. In the medial-lateral direction, the RMS and the mean speed of the COP, rambling, and trembling displacements decreased after immobilization of the knees and hips and of the knees, hips, and trunk. These findings indicate that the single inverted pendulum model is unable to completely explain the processes involved in the control of the quiet upright stance in the anterior-posterior and medial-lateral directions. (C) 2009 Elsevier B.V. All rights reserved.
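The two sway measures named in the abstract can be sketched for a single direction as follows. The abstract defines RMS as the standard deviation from the mean; the use of the population form (dividing by n), the definition of mean speed as path length over duration, the sampling rate and the toy signal are assumptions made for illustration.

```python
import math


def rms_deviation(signal):
    """Root mean square deviation of a trajectory about its own mean."""
    mean = sum(signal) / len(signal)
    return math.sqrt(sum((x - mean) ** 2 for x in signal) / len(signal))


def mean_speed(signal, sampling_rate_hz):
    """Total path length travelled divided by the recording duration."""
    path_length = sum(abs(b - a) for a, b in zip(signal, signal[1:]))
    duration_s = (len(signal) - 1) / sampling_rate_hz
    return path_length / duration_s


# Hypothetical COP displacement samples (cm) in one direction, sampled at 1 Hz.
cop = [0.0, 1.0, 0.0, -1.0, 0.0]
print(rms_deviation(cop), mean_speed(cop, sampling_rate_hz=1.0))
```

The same two functions would apply unchanged to the rambling and trembling components once the COP signal has been decomposed, a step this sketch does not attempt.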
Abstract:
The principal aim of studies of enzyme-mediated reactions has been to provide comparative and quantitative information on enzyme-catalyzed reactions under distinct conditions. The classic Michaelis-Menten model (Biochem Zeit 49:333, 1913) for enzyme kinetics has been widely used to determine important parameters involved in enzyme catalysis, particularly the Michaelis-Menten constant (K_M) and the maximum velocity of reaction (V_max). Subsequently, a detailed treatment of the mechanisms of enzyme catalysis was undertaken by Briggs and Haldane (Biochem J 19:338, 1925). These authors proposed the steady-state treatment, whose applicability is constrained to that condition. The present work describes an extended solution of the Michaelis-Menten model without the need for such a steady-state restriction. We provide the first analysis of all of the individual reaction constants calculated analytically. Using this approach, it is possible to accurately predict the results under new experimental conditions and to characterize and optimize industrial processes in the fields of chemical and food engineering, pharmaceuticals and biotechnology.
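For reference, the classical steady-state rate law that the abstract takes as its starting point can be written in a few lines. This sketch shows only the textbook Michaelis-Menten and Briggs-Haldane relations, not the extended non-steady-state solution the paper describes; the function names and numeric values are hypothetical.

```python
def michaelis_menten_rate(substrate, v_max, k_m):
    """Steady-state reaction velocity v = V_max * [S] / (K_M + [S])."""
    if substrate < 0 or k_m <= 0:
        raise ValueError("substrate must be >= 0 and k_m must be > 0")
    return v_max * substrate / (k_m + substrate)


def k_m_from_rate_constants(k1, k_minus1, k2):
    """Briggs-Haldane expression K_M = (k_-1 + k_2) / k_1.

    For E + S <-> ES -> E + P, k1 is the binding constant, k_minus1 the
    dissociation constant and k2 the catalytic constant.
    """
    return (k_minus1 + k2) / k1


# Hypothetical constants in arbitrary units.
k_m = k_m_from_rate_constants(k1=2.0, k_minus1=1.0, k2=3.0)
print(k_m, michaelis_menten_rate(substrate=k_m, v_max=10.0, k_m=k_m))
```

Setting the substrate concentration equal to K_M gives half of V_max, which is the usual operational definition of the Michaelis-Menten constant.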
Abstract:
The taxonomy of the N2-fixing bacteria belonging to the genus Bradyrhizobium is still poorly refined, mainly due to conflicting results obtained by the analysis of phenotypic and genotypic properties. This paper presents an application of a method aiming at the identification of possible new clusters within a Brazilian collection of 119 Bradyrhizobium strains showing phenotypic characteristics of B. japonicum and B. elkanii. The stability was studied as a function of the number of restriction enzymes used in the RFLP-PCR analysis of three ribosomal regions with three restriction enzymes per region. The method proposed here uses clustering algorithms with distances calculated by average-linkage clustering. The stability analysis is performed by introducing perturbations through sub-sampling techniques. The method showed efficacy in the grouping of the species B. japonicum and B. elkanii. Furthermore, two new clusters were clearly defined, indicating possible new species, as well as sub-clusters within each detected cluster. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
Hybrid active-passive damping treatments combine the reliability, low cost and robustness of viscoelastic damping treatments and the high-performance, modal selective and adaptive piezoelectric active control. Numerous hybrid damping treatments have been reported in the literature. They differ mainly by the relative positions of viscoelastic treatments, sensors and piezoelectric actuators. In this work we present an experimental analysis of three active-passive damping design configurations applied to a cantilever beam. In particular, two design configurations based on the extension mode of piezoelectric actuators combined with viscoelastic constrained layer damping treatments and one design configuration with shear piezoelectric actuators embedded in a sandwich beam with viscoelastic core are analyzed. For comparison purposes, a purely active design configuration with an extension piezoelectric actuator bonded to an elastic beam is also analyzed. The active-passive damping performance of the four design configurations is compared. Results show that active-passive design configurations provide more reliable and wider-range damping performance than the purely active configuration.
Abstract:
This paper analyses the presence of financial constraint in the investment decisions of 367 Brazilian firms from 1997 to 2004, using a Bayesian econometric model with group-varying parameters. The motivation for this paper is the use of clustering techniques to group firms in a totally endogenous way. In order to classify the firms we used a hybrid clustering method, that is, hierarchical and non-hierarchical clustering techniques jointly. A Bayesian approach was used to estimate the parameters. Prior distributions were assumed for the parameters, classifying the model as random or fixed effects. The ordinate predictive density criterion was used to select the model providing the best prediction. We tested thirty models, and the best prediction is obtained with 2 groups in the sample, assuming the fixed effects model with a Student t distribution with 20 degrees of freedom for the error. The results indicate robustness in the identification of financial constraint when the firms are classified by the clustering techniques. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
We consider a class of two-dimensional problems in classical linear elasticity for which material overlapping occurs in the absence of singularities. Of course, material overlapping is not physically realistic, and one possible way to prevent it uses a constrained minimization theory. In this theory, a minimization problem consists of minimizing the total potential energy of a linear elastic body subject to the constraint that the deformation field must be locally invertible. Here, we use an interior and an exterior penalty formulation of the minimization problem together with both a standard finite element method and classical nonlinear programming techniques to compute the minimizers. We compare both formulations by solving a plane problem numerically in the context of the constrained minimization theory. The problem has a closed-form solution, which is used to validate the numerical results. This solution is regular everywhere, including the boundary. In particular, we show numerical results which indicate that, for a fixed finite element mesh, the sequences of numerical solutions obtained with both the interior and the exterior penalty formulations converge to the same limit function as the penalization is enforced. This limit function yields an approximate deformation field to the plane problem that is locally invertible at all points in the domain. As the mesh is refined, this field converges to the exact solution of the plane problem.
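The idea of an exterior penalty formulation can be illustrated on a one-dimensional toy problem: minimize (x - 2)^2 subject to x <= 1. This is only an analogue chosen for the example; the objective, the constraint, the search interval and the function names are assumptions and have nothing to do with the elasticity problem or the finite element discretization of the paper.

```python
def minimize_1d(f, lo, hi, iterations=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0


def exterior_penalty_minimizer(mu):
    """Minimize (x - 2)^2 + mu * max(0, x - 1)^2 for a penalty weight mu."""

    def penalized(x):
        violation = max(0.0, x - 1.0)  # amount by which x <= 1 is violated
        return (x - 2.0) ** 2 + mu * violation ** 2

    return minimize_1d(penalized, -5.0, 5.0)


# The penalized minimizers approach the constrained solution x = 1 from the
# infeasible side as the penalty weight grows.
for mu in (1.0, 10.0, 100.0, 1e4):
    print(mu, exterior_penalty_minimizer(mu))
```

For this toy problem the penalized minimizer has the closed form (2 + mu) / (1 + mu), so the convergence to x = 1 as the penalization is enforced can be checked directly, in the same spirit as the closed-form validation mentioned in the abstract.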
Abstract:
Wireless Sensor Networks (WSNs) have a vast field of applications, including deployment in hostile environments. Thus, the adoption of security mechanisms is fundamental. However, the extremely constrained nature of sensors and the potentially dynamic behavior of WSNs hinder the use of key management mechanisms commonly applied in modern networks. For this reason, many lightweight key management solutions have been proposed to overcome these constraints. In this paper, we review the state of the art of these solutions and evaluate them based on metrics adequate for WSNs. We focus on pre-distribution schemes well-adapted for homogeneous networks (since this is a more general network organization), thus identifying generic features that can improve some of these metrics. We also discuss some challenges in the area and future research directions. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
In this paper, a framework for detection of human skin in digital images is proposed. This framework is composed of a training phase and a detection phase. A skin class model is learned during the training phase by processing several training images in a hybrid and incremental fuzzy learning scheme. This scheme combines unsupervised and supervised learning: unsupervised, by fuzzy clustering, to obtain clusters of color groups from training images; and supervised, to select groups that represent skin color. At the end of the training phase, aggregation operators are used to combine the selected groups into a skin model. In the detection phase, the learned skin model is used to detect human skin in an efficient way. Experimental results show robust and accurate human skin detection performed by the proposed framework.
Abstract:
This paper contains a new proposal for the definition of the fundamental operation of query under the Adaptive Formalism, one capable of locating functional nuclei from descriptions of their semantics. To demonstrate the method's applicability, an implementation of the query procedure constrained to a specific class of devices is shown, and its asymptotic computational complexity is discussed.