56 resultados para Topological classification
Resumo:
Background: Lynch syndrome (LS) is an autosomal dominant inherited cancer syndrome characterized by early onset cancers of the colorectum, endometrium and other tumours. A significant proportion of DNA variants in LS patients are unclassified. Reports on the pathogenicity of the c.1852_1853AA>GC (p.Lys618Ala) variant of the MLH1 gene are conflicting. In this study, we provide new evidence indicating that this variant has no significant implications for LS.Methods: The following approach was used to assess the clinical significance of the p.Lys618Ala variant: frequency in a control population, case-control comparison, co-occurrence of the p.Lys618Ala variant with a pathogenic mutation, co-segregation with the disease and microsatellite instability in tumours from carriers of the variant. We genotyped p.Lys618Ala in 1034 individuals (373 sporadic colorectal cancer [CRC] patients, 250 index subjects from families suspected of having LS [revised Bethesda guidelines] and 411 controls). Three well-characterized LS families that fulfilled the Amsterdam II Criteria and consisted of members with the p.Lys618Ala variant were included to assess co-occurrence and co-segregation. A subset of colorectal tumour DNA samples from 17 patients carrying the p.Lys618Ala variant was screened for microsatellite instability using five mononucleotide markers.Results: Twenty-seven individuals were heterozygous for the p.Lys618Ala variant; nine had sporadic CRC (2.41%), seven were suspected of having hereditary CRC (2.8%) and 11 were controls (2.68%). There were no significant associations in the case-control and case-case studies. The p.Lys618Ala variant was co-existent with pathogenic mutations in two unrelated LS families. In one family, the allele distribution of the pathogenic and unclassified variant was in trans, in the other family the pathogenic variant was detected in the MSH6 gene and only the deleterious variant co-segregated with the disease in both families. Only two positive cases of microsatellite instability (2/17, 11.8%) were detected in tumours from p.Lys618Ala carriers, indicating that this variant does not play a role in functional inactivation of MLH1 in CRC patients.Conclusions: The p.Lys618Ala variant should be considered a neutral variant for LS. These findings have implications for the clinical management of CRC probands and their relatives.
Resumo:
For the ∼1% of the human genome in the ENCODE regions, only about half of the transcriptionally active regions (TARs) identified with tiling microarrays correspond to annotated exons. Here we categorize this large amount of “unannotated transcription.” We use a number of disparate features to classify the 6988 novel TARs—array expression profiles across cell lines and conditions, sequence composition, phylogenetic profiles (presence/absence of syntenic conservation across 17 species), and locations relative to genes. In the classification, we first filter out TARs with unusual sequence composition and those likely resulting from cross-hybridization. We then associate some of those remaining with proximal exons having correlated expression profiles. Finally, we cluster unclassified TARs into putative novel loci, based on similar expression and phylogenetic profiles. To encapsulate our classification, we construct a Database of Active Regions and Tools (DART.gersteinlab.org). DART has special facilities for rapidly handling and comparing many sets of TARs and their heterogeneous features, synchronizing across builds, and interfacing with other resources. Overall, we find that ∼14% of the novel TARs can be associated with known genes, while ∼21% can be clustered into ∼200 novel loci. We observe that TARs associated with genes are enriched in the potential to form structural RNAs and many novel TAR clusters are associated with nearby promoters. To benchmark our classification, we design a set of experiments for testing the connectivity of novel TARs. Overall, we find that 18 of the 46 connections tested validate by RT-PCR and four of five sequenced PCR products confirm connectivity unambiguously.
Resumo:
Background: The cooperative interaction between transcription factors has a decisive role in the control of the fate of the eukaryotic cell. Computational approaches for characterizing cooperative transcription factors in yeast, however, are based on different rationales and provide a low overlap between their results. Because the wealth of information contained in protein interaction networks and regulatory networks has proven highly effective in elucidating functional relationships between proteins, we compared different sets of cooperative transcription factor pairs (predicted by four different computational methods) within the frame of those networks. Results: Our results show that the overlap between the sets of cooperative transcription factors predicted by the different methods is low yet significant. Cooperative transcription factors predicted by all methods are closer and more clustered in the protein interaction network than expected by chance. On the other hand, members of a cooperative transcription factor pair neither seemed to regulate each other nor shared similar regulatory inputs, although they do regulate similar groups of target genes. Conclusion: Despite the different definitions of transcriptional cooperativity and the different computational approaches used to characterize cooperativity between transcription factors, the analysis of their roles in the framework of the protein interaction network and the regulatory network indicates a common denominator for the predictions under study. The knowledge of the shared topological properties of cooperative transcription factor pairs in both networks can be useful not only for designing better prediction methods but also for better understanding the complexities of transcriptional control in eukaryotes.
Resumo:
Automatic classification of makams from symbolic data is a rarely studied topic. In this paper, first a review of an n-gram based approach is presented using various representations of the symbolic data. While a high degree of precision can be obtained, confusion happens mainly for makams using (almost) the same scale and pitch hierarchy but differ in overall melodic progression, seyir. To further improve the system, first n-gram based classification is tested for various sections of the piece to take into account a feature of the seyir that melodic progression starts in a certain region of the scale. In a second test, a hierarchical classification structure is designed which uses n-grams and seyir features in different levels to further improve the system.
Resumo:
Subjective language detection is one of the most important challenges in Sentiment Analysis. Because of the weight and frequency in opinionated texts, adjectives are considered a key piece in the opinion extraction process. These subjective units are more and more frequently collected in polarity lexicons in which they appear annotated with their prior polarity. However, at the moment, any polarity lexicon takes into account prior polarity variations across domains. This paper proves that a majority of adjectives change their prior polarity value depending on the domain. We propose a distinction between domain dependent and romain independent adjectives. Moreover, our analysis led us to propose a further classification related to subjectivity degree: constant, mixed and highly subjective adjectives. Following this classification, polarity values will be a better support for Sentiment Analysis.
Resumo:
The work we present here addresses cue-based noun classification in English and Spanish. Its main objective is to automatically acquire lexical semantic information by classifying nouns into previously known noun lexical classes. This is achieved by using particular aspects of linguistic contexts as cues that identify a specific lexical class. Here we concentrate on the task of identifying such cues and the theoretical background that allows for an assessment of the complexity of the task. The results show that, despite of the a-priori complexity of the task, cue-based classification is a useful tool in the automatic acquisition of lexical semantic classes.
Resumo:
In this paper we propose a Pyramidal Classification Algorithm,which together with an appropriate aggregation index producesan indexed pseudo-hierarchy (in the strict sense) withoutinversions nor crossings. The computer implementation of thealgorithm makes it possible to carry out some simulation testsby Monte Carlo methods in order to study the efficiency andsensitivity of the pyramidal methods of the Maximum, Minimumand UPGMA. The results shown in this paper may help to choosebetween the three classification methods proposed, in order toobtain the classification that best fits the original structureof the population, provided we have an a priori informationconcerning this structure.
Resumo:
The classical binary classification problem is investigatedwhen it is known in advance that the posterior probability function(or regression function) belongs to some class of functions. We introduceand analyze a method which effectively exploits this knowledge. The methodis based on minimizing the empirical risk over a carefully selected``skeleton'' of the class of regression functions. The skeleton is acovering of the class based on a data--dependent metric, especiallyfitted for classification. A new scale--sensitive dimension isintroduced which is more useful for the studied classification problemthan other, previously defined, dimension measures. This fact isdemonstrated by performance bounds for the skeleton estimate in termsof the new dimension.
Resumo:
We express the Lyubeznik numbers of the local ring of a complex isolated singularity in terms of Betti numbers of the associated real link.
Resumo:
The absolute K magnitudes and kinematic parameters of about 350 oxygen-rich Long-Period Variable stars are calibrated, by means of an up-to-date maximum-likelihood method, using HIPPARCOS parallaxes and proper motions together with radial velocities and, as additional data, periods and V-K colour indices. Four groups, differing by their kinematics and mean magnitudes, are found. For each of them, we also obtain the distributions of magnitude, period and de-reddened colour of the base population, as well as de-biased period-luminosity-colour relations and their two-dimensional projections. The SRa semiregulars do not seem to constitute a separate class of LPVs. The SRb appear to belong to two populations of different ages. In a PL diagram, they constitute two evolutionary sequences towards the Mira stage. The Miras of the disk appear to pulsate on a lower-order mode. The slopes of their de-biased PL and PC relations are found to be very different from the ones of the Oxygen Miras of the LMC. This suggests that a significant number of so-called Miras of the LMC are misclassified. This also suggests that the Miras of the LMC do not constitute a homogeneous group, but include a significant proportion of metal-deficient stars, suggesting a relatively smooth star formation history. As a consequence, one may not trivially transpose the LMC period-luminosity relation from one galaxy to the other.
Resumo:
During the period 1996-2000, forty-three heavy rainfall events have been detected in the Internal Basins of Catalonia (Northeastern of Spain). Most of these events caused floods and serious damage. This high number leads to the need for a methodology to classify them, on the basis of their surface rainfall distribution, their internal organization and their physical features. The aim of this paper is to show a methodology to analyze systematically the convective structures responsible of those heavy rainfall events on the basis of the information supplied by the meteorological radar. The proposed methodology is as follows. Firstly, the rainfall intensity and the surface rainfall pattern are analyzed on the basis of the raingauge data. Secondly, the convective structures at the lowest level are identified and characterized by using a 2-D algorithm, and the convective cells are identified by using a 3-D procedure that looks for the reflectivity cores in every radar volume. Thirdly, the convective cells (3-D) are associated with the 2-D structures (convective rainfall areas). This methodology has been applied to the 43 heavy rainfall events using the meteorological radar located near Barcelona and the SAIH automatic raingauge network.
Resumo:
The edge excitations and related topological orders of correlated states of a fast rotating Bose gas are studied. Using exact diagonalization of small systems, we compute the energies and number of edge excitations, as well as the boson occupancy near the edge for various states. The chiral Luttinger-liquid theory of Wen is found to be a good description of the edges of the bosonic Laughlin and other states identified as members of the principal Jain sequence for bosons. However, we find that in a harmonic trap the edge of the state identified as the Moore-Read (Pfaffian) state shows a number of anomalies. An experimental way of detecting these correlated states is also discussed.
Resumo:
We prove for any pure three-quantum-bit state the existence of local bases which allow one to build a set of five orthogonal product states in terms of which the state can be written in a unique form. This leads to a canonical form which generalizes the two-quantum-bit Schmidt decomposition. It is uniquely characterized by the five entanglement parameters. It leads to a complete classification of the three-quantum-bit states. It shows that the right outcome of an adequate local measurement always erases all entanglement between the other two parties.