930 results for Pattern classification


Relevance:

20.00% 20.00%

Publisher:

Abstract:

HAMAP (High-quality Automated and Manual Annotation of Proteins, available at http://hamap.expasy.org/) is a system for the automatic classification and annotation of protein sequences. HAMAP provides annotation of the same quality and detail as UniProtKB/Swiss-Prot, using manually curated profiles for protein sequence family classification and expert curated rules for functional annotation of family members. HAMAP data and tools are made available through our website and as part of the UniRule pipeline of UniProt, providing annotation for millions of unreviewed sequences of UniProtKB/TrEMBL. Here we report on the growth of HAMAP and updates to the HAMAP system since our last report in the NAR Database Issue of 2013. We continue to augment HAMAP with new family profiles and annotation rules as new protein families are characterized and annotated in UniProtKB/Swiss-Prot; the latest version of HAMAP (as of 3 September 2014) contains 1983 family classification profiles and 1998 annotation rules (up from 1780 and 1720). We demonstrate how the complex logic of HAMAP rules allows for precise annotation of individual functional variants within large homologous protein families. We also describe improvements to our web-based tool HAMAP-Scan, which simplify the classification and annotation of sequences, and the incorporation of an improved sequence-profile search algorithm.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Oxalate catabolism, which can have both medical and environmental implications, is performed by phylogenetically diverse bacteria. The formyl-CoA-transferase gene (frc) was chosen as a molecular marker of the oxalotrophic function. Degenerate primers were deduced from an alignment of frc gene sequences available in databases. The specificity of the primers was tested on a variety of frc-containing and frc-lacking bacteria. The frc primers were then used to develop PCR-DGGE and real-time SybrGreen PCR assays in soils containing various amounts of oxalate. Some PCR products from pure cultures and from soil samples were cloned and sequenced. Data were used to generate a phylogenetic tree showing that environmental PCR products belonged to the target physiological group. The extent of diversity visualised in the DGGE patterns was higher for soil samples containing carbonate resulting from oxalate catabolism. Moreover, the number of frc gene copies in the investigated soils was in the range of 1.64 × 10^7 to 1.75 × 10^8 per g of dry soil under oxalogenic trees (representing 0.5 to 1.2% of total 16S rRNA gene copies), whereas the number of frc gene copies in the reference soil was 6.4 × 10^6 (or 0.2% of 16S rRNA gene copies). This indicates that oxalotrophic bacteria are numerous and widespread in soils and that a relationship exists between the presence of the oxalogenic trees Milicia excelsa and Afzelia africana and the relative abundance of oxalotrophic guilds in the total bacterial communities. This is obviously related to the accomplishment of the oxalate-carbonate pathway, which explains the alkalinization and calcium carbonate accumulation occurring below these trees in an otherwise acidic soil. The molecular tools developed in this study will allow in-depth understanding of the functional implication of these bacteria in carbonate accumulation as a way of sequestering atmospheric CO2.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background: Lynch syndrome (LS) is an autosomal dominant inherited cancer syndrome characterized by early onset cancers of the colorectum, endometrium and other tumours. A significant proportion of DNA variants in LS patients are unclassified. Reports on the pathogenicity of the c.1852_1853AA>GC (p.Lys618Ala) variant of the MLH1 gene are conflicting. In this study, we provide new evidence indicating that this variant has no significant implications for LS. Methods: The following approach was used to assess the clinical significance of the p.Lys618Ala variant: frequency in a control population, case-control comparison, co-occurrence of the p.Lys618Ala variant with a pathogenic mutation, co-segregation with the disease and microsatellite instability in tumours from carriers of the variant. We genotyped p.Lys618Ala in 1034 individuals (373 sporadic colorectal cancer [CRC] patients, 250 index subjects from families suspected of having LS [revised Bethesda guidelines] and 411 controls). Three well-characterized LS families that fulfilled the Amsterdam II Criteria and included members with the p.Lys618Ala variant were studied to assess co-occurrence and co-segregation. A subset of colorectal tumour DNA samples from 17 patients carrying the p.Lys618Ala variant was screened for microsatellite instability using five mononucleotide markers. Results: Twenty-seven individuals were heterozygous for the p.Lys618Ala variant; nine had sporadic CRC (2.41%), seven were suspected of having hereditary CRC (2.8%) and 11 were controls (2.68%). There were no significant associations in the case-control and case-case studies. The p.Lys618Ala variant was co-existent with pathogenic mutations in two unrelated LS families. In one family, the allele distribution of the pathogenic and unclassified variant was in trans; in the other family, the pathogenic variant was detected in the MSH6 gene. Only the deleterious variant co-segregated with the disease in both families. 
Only two positive cases of microsatellite instability (2/17, 11.8%) were detected in tumours from p.Lys618Ala carriers, indicating that this variant does not play a role in functional inactivation of MLH1 in CRC patients. Conclusions: The p.Lys618Ala variant should be considered a neutral variant for LS. These findings have implications for the clinical management of CRC probands and their relatives.
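The case-control comparison reported above can be illustrated with a small calculation. The sketch below recomputes the carrier frequencies from the counts in the abstract and applies a two-sided Fisher's exact test to sporadic CRC patients versus controls. The choice of Fisher's exact test is an assumption made for illustration; the abstract does not say which test the authors used, and the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x carriers in the first group given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Counts taken from the abstract: carriers / group size.
sporadic_carriers, sporadic_total = 9, 373
control_carriers, control_total = 11, 411

sporadic_freq = 100 * sporadic_carriers / sporadic_total
control_freq = 100 * control_carriers / control_total

# 2x2 table: rows = group, columns = carrier / non-carrier.
a, b = sporadic_carriers, sporadic_total - sporadic_carriers
c, d = control_carriers, control_total - control_carriers
odds_ratio = (a * d) / (b * c)
p_value = fisher_exact_two_sided(a, b, c, d)

print(f"sporadic CRC carriers: {sporadic_freq:.2f}%")
print(f"control carriers:      {control_freq:.2f}%")
print(f"odds ratio: {odds_ratio:.3f}, two-sided p: {p_value:.3f}")
```

An odds ratio close to 1 with a large p-value is what one would expect if the variant is equally common in patients and controls, which is consistent with the abstract's conclusion.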

Relevance:

20.00% 20.00%

Publisher:

Abstract:

For the ∼1% of the human genome in the ENCODE regions, only about half of the transcriptionally active regions (TARs) identified with tiling microarrays correspond to annotated exons. Here we categorize this large amount of “unannotated transcription.” We use a number of disparate features to classify the 6988 novel TARs—array expression profiles across cell lines and conditions, sequence composition, phylogenetic profiles (presence/absence of syntenic conservation across 17 species), and locations relative to genes. In the classification, we first filter out TARs with unusual sequence composition and those likely resulting from cross-hybridization. We then associate some of those remaining with proximal exons having correlated expression profiles. Finally, we cluster unclassified TARs into putative novel loci, based on similar expression and phylogenetic profiles. To encapsulate our classification, we construct a Database of Active Regions and Tools (DART.gersteinlab.org). DART has special facilities for rapidly handling and comparing many sets of TARs and their heterogeneous features, synchronizing across builds, and interfacing with other resources. Overall, we find that ∼14% of the novel TARs can be associated with known genes, while ∼21% can be clustered into ∼200 novel loci. We observe that TARs associated with genes are enriched in the potential to form structural RNAs and many novel TAR clusters are associated with nearby promoters. To benchmark our classification, we design a set of experiments for testing the connectivity of novel TARs. Overall, we find that 18 of the 46 connections tested validate by RT-PCR and four of five sequenced PCR products confirm connectivity unambiguously.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Understanding the molecular mechanisms responsible for the regulation of the transcriptome present in eukaryotic cells is one of the most challenging tasks in the postgenomic era. In this regard, alternative splicing (AS) is a key phenomenon contributing to the production of different mature transcripts from the same primary RNA sequence. As a plethora of different transcript forms is available in databases, a first step to uncover the biology that drives AS is to identify the different types of reflected splicing variation. In this work, we present a general definition of the AS event along with a notation system that involves the relative positions of the splice sites. This nomenclature univocally and dynamically assigns a specific “AS code” to every possible pattern of splicing variation. On the basis of this definition and the corresponding codes, we have developed a computational tool (AStalavista) that automatically characterizes the complete landscape of AS events in a given transcript annotation of a genome, thus providing a platform to investigate the transcriptome diversity across genes, chromosomes, and species. Our analysis reveals that a substantial part (in human, more than a quarter) of the observed splicing variations are ignored in common classification pipelines. We have used AStalavista to investigate and to compare the AS landscape of different reference annotation sets in human and in other metazoan species and found that proportions of AS events change substantially depending on the annotation protocol, species-specific attributes, and coding constraints acting on the transcripts. The AStalavista system therefore provides a general framework to conduct specific studies investigating the occurrence, impact, and regulation of AS.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Introduction: Measures of the degree of lumbar spinal stenosis (LSS), such as antero-posterior (AP) diameter of the canal and dural sac cross-sectional area, vary and do not correlate with symptoms or results of surgery. We created a grading system, comprised of seven categories, based on the morphology of the dural sac and its contents as seen on T2 axial images. The categories take into account the ratio of rootlet/CSF content. Grade A indicates no significant compression; grade D is equivalent to a total myelographic block. We compared this classification with commonly used criteria of severity of stenosis. Methods: Fifty T2 axial MRI images taken at disc level from 27 symptomatic LSS patients undergoing decompressive surgery were classified twice by two radiologists and three spinal surgeons working at different institutions and countries. Dural sac cross-sectional surface area and AP diameter of the canal were measured both at disc and pedicle level from DICOM images using OsiriX software. Intra- and inter-observer reliability were assessed using Cohen's and Fleiss' kappa statistics and the t test. Results: For the morphological grading the average intra- and inter-observer kappas were 0.76 and 0.69+, respectively, for physicians working in the study originating country. Combining all observers, the kappa values were 0.57 ± 0.19 and 0.44 ± 0.19, respectively. AP diameter and dural sac cross-sectional area measurements showed no statistically significant differences between observers. No correlation between morphological grading and AP diameter or dural sac cross-sectional area was observed in 13 (26%) and 8 cases (16%), respectively. Discussion: The proposed morphological grading relies on the identification of the dural sac and CSF, which are better seen on full MRI series. This was not available to the external observers, which might explain the lower overall kappa values. 
Since no specific measurement tools are needed, the grading suits everyday clinical practice and favours communication of the degree of stenosis between practising physicians. The absence of a strict correlation with the dural sac surface suggests that measuring the surface alone might be insufficient in defining LSS, as it is essentially a mismatch between the spinal canal and its contents. This grading is now adopted in our unit, and further studies concentrating on the relation between morphology, clinical symptoms and surgical results are underway.
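The observer-agreement figures above are kappa statistics. As a rough illustration of what Cohen's kappa measures, the following sketch computes it for two raters. The grade lists are invented for the example and are not data from the study, and the function name `cohen_kappa` is hypothetical.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who graded the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same grade.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal grade frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[g] * counts_b[g] for g in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical grade throughout
    return (observed - expected) / (1.0 - expected)

# Invented grades for eight images (illustrative only).
first_reading = ["A", "B", "C", "D", "C", "B", "A", "D"]
second_reading = ["A", "B", "C", "C", "C", "B", "B", "D"]

kappa = cohen_kappa(first_reading, second_reading)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually lower than the simple percentage of matching grades. Fleiss' kappa, also named in the abstract, extends the same idea to more than two raters and is not shown here.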

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Collage is a pattern-based visual design authoring tool for the creation of collaborative learning scripts computationally modelled with IMS Learning Design (LD). The pattern-based visual approach aims to provide teachers with design ideas that are based on broadly accepted practices. It also seeks to hide the LD notation so that teachers can easily create their own designs. The use of visual representations supports both the understanding of the design ideas and the usability of the authoring tool. This paper presents a multicase study comprising three different cases that evaluate the approach from different perspectives. The first case includes workshops where teachers use Collage. The second case involves the design of a scenario proposed by a third party using related approaches. The third case analyzes a situation where students follow a design created with Collage. The cross-case analysis provides a global understanding of the possibilities and limitations of the pattern-based visual design approach.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Monitoring of posture allocations and activities enables accurate estimation of energy expenditure and may aid in obesity prevention and treatment. At present, accurate devices rely on multiple sensors distributed on the body and thus may be too obtrusive for everyday use. This paper presents a novel wearable sensor, which is capable of very accurate recognition of common postures and activities. The patterns of heel acceleration and plantar pressure uniquely characterize postures and typical activities while requiring minimal preprocessing and no feature extraction. The shoe sensor was tested in nine adults performing sitting and standing postures and while walking, running, stair ascent/descent and cycling. Support vector machines (SVMs) were used for classification. A fourfold validation of a six-class subject-independent group model showed 95.2% average accuracy of posture/activity classification on the full sensor set and over 98% on the optimized sensor set. Using a combination of acceleration and pressure also enabled a pronounced reduction of the sampling frequency (25 to 1 Hz) without significant loss of accuracy (98% versus 93%). Subjects had shoe sizes (US) M9.5-11 and W7-9 and body mass index from 18.1 to 39.4 kg/m², suggesting that the device can be used by individuals with varying anthropometric characteristics.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Automatic classification of makams from symbolic data is a rarely studied topic. In this paper, a review of an n-gram based approach is first presented using various representations of the symbolic data. While a high degree of precision can be obtained, confusion happens mainly for makams that use (almost) the same scale and pitch hierarchy but differ in overall melodic progression, seyir. To further improve the system, n-gram based classification is first tested for various sections of the piece, to take into account a feature of the seyir: that melodic progression starts in a certain region of the scale. In a second test, a hierarchical classification structure is designed which uses n-grams and seyir features at different levels to further improve the system.
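To make the n-gram idea concrete, here is a minimal sketch of a bigram classifier over symbolic note sequences, using add-one smoothing and log-likelihood scoring. It is an illustration under stated assumptions, not the system described in the abstract: the class labels `makam_X` and `makam_Y`, the toy note sequences, and the function names are all hypothetical.

```python
import math
from collections import Counter

def ngrams(seq, n):
    """All contiguous n-grams of a symbol sequence, as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

def train(labelled_pieces, n=2):
    """Build one n-gram count table per class label."""
    models = {}
    for label, seq in labelled_pieces:
        models.setdefault(label, Counter()).update(ngrams(seq, n))
    return models

def classify(models, seq, n=2):
    """Return the label whose smoothed n-gram model best explains seq."""
    grams = ngrams(seq, n)
    vocab = set(grams)
    for counts in models.values():
        vocab.update(counts)
    best_label, best_score = None, -math.inf
    for label, counts in models.items():
        # Add-one smoothing so unseen n-grams get a small nonzero probability.
        total = sum(counts.values()) + len(vocab)
        score = sum(math.log((counts[g] + 1) / total) for g in grams)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data (hypothetical classes and sequences).
training = [
    ("makam_X", ["C", "D", "E", "D", "C", "D", "E"]),
    ("makam_Y", ["C", "Eb", "F", "Eb", "C", "Eb", "F"]),
]
models = train(training)
print(classify(models, ["D", "E", "D", "C"]))
print(classify(models, ["Eb", "F", "Eb"]))
```

A model of this kind only sees local note-to-note transitions, which suggests why two classes sharing nearly the same scale can be confused: differences in overall melodic progression are not captured unless n-grams are computed per section or combined with other features, as the abstract proposes.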

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Introduction: Responses to external stimuli are typically investigated by averaging peri-stimulus electroencephalography (EEG) epochs in order to derive event-related potentials (ERPs) across the electrode montage, under the assumption that signals that are related to the external stimulus are fixed in time across trials. We demonstrate the applicability of a single-trial model based on patterns of scalp topographies (De Lucia et al, 2007) that can be used for ERP analysis at the single-subject level. The model is able to classify new trials (or groups of trials) with minimal a priori hypotheses, using information derived from a training dataset. The features used for the classification (the topography of responses and their latency) can be neurophysiologically interpreted, because a difference in scalp topography indicates a different configuration of brain generators. An above chance classification accuracy on test datasets implicitly demonstrates the suitability of this model for EEG data. Methods: The data analyzed in this study were acquired from two separate visual evoked potential (VEP) experiments. The first entailed passive presentation of checkerboard stimuli to each of the four visual quadrants (hereafter, "Checkerboard Experiment") (Plomp et al, submitted). The second entailed active discrimination of novel versus repeated line drawings of common objects (hereafter, "Priming Experiment") (Murray et al, 2004). Four subjects per experiment were analyzed, using approx. 200 trials per experimental condition. These trials were randomly separated in training (90%) and testing (10%) datasets in 10 independent shuffles. In order to perform the ERP analysis we estimated the statistical distribution of voltage topographies by a Mixture of Gaussians (MofGs), which reduces our original dataset to a small number of representative voltage topographies. 
We then evaluated statistically the degree of presence of these template maps across trials and whether and when this was different across experimental conditions. Based on these differences, single trials or sets of a few single trials were classified as belonging to one or the other experimental condition. Classification performance was assessed using the Receiver Operating Characteristic (ROC) curve. Results: For the Checkerboard Experiment, contrasts entailed left vs. right visual field presentations for upper and lower quadrants, separately. The average posterior probabilities, indicating the presence of the computed template maps in time and across trials, revealed significant differences starting at ~60-70 ms post-stimulus. The average ROC curve area across all four subjects was 0.80 and 0.85 for upper and lower quadrants, respectively, and was in all cases significantly higher than chance (unpaired t-test, p<0.0001). In the Priming Experiment, we contrasted initial versus repeated presentations of visual object stimuli. Their posterior probabilities revealed significant differences, which started at 250 ms post-stimulus onset. The classification accuracy rates with single-trial test data were at chance level. We therefore considered sub-averages based on five single trials. We found that for three out of four subjects classification rates were significantly above chance level (unpaired t-test, p<0.0001). Conclusions: The main advantage of the present approach is that it is based on topographic features that are readily interpretable along neurophysiologic lines. As these maps were previously normalized by the overall strength of the field potential on the scalp, a change in their presence across trials and between conditions necessarily reflects a change in the underlying generator configurations. The temporal periods of statistical difference between conditions were estimated for each training dataset for ten shuffles of the data. 
Across the ten shuffles and in both experiments, we observed a high level of consistency in the temporal periods over which the two conditions differed. With this method we are able to analyze ERPs at the single-subject level providing a novel tool to compare normal electrophysiological responses versus single cases that cannot be considered part of any cohort of subjects. This aspect promises to have a strong impact on both basic and clinical research.
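Classification performance in the abstract above is summarized by the area under the ROC curve. The sketch below shows one common way to compute that area directly from classifier scores, using its equivalence to the probability that a randomly chosen positive trial scores higher than a randomly chosen negative one. The scores and labels are invented for illustration and the function name `roc_auc` is hypothetical; this is not the authors' analysis code.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels are 1 for the positive condition and 0 for the negative one.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one trial from each condition")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Invented posterior-like scores for six test trials (illustrative only).
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]

auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

An area of 0.5 corresponds to chance-level separation of the two conditions and 1.0 to perfect separation, which gives a scale for reading the 0.80 and 0.85 values reported for the Checkerboard Experiment.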

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A new radiolarian order, Archaeospicularia, is proposed for some Lower Paleozoic radiolarians previously considered to belong to Spumellaria and to Collodaria. It is characterized by a globular shell made of several spicules which can be free, interlocked, or fused to form a latticed wall. The present paper gives the definition of this order and proposes a first classification. It is supposed that the Archaeospicularia represents the oldest radiolarian group and that in the Lower Paleozoic it gave rise to the orders Entactinaria, Albaillellaria, and probably Spumellaria by the reduction of the number of initial spicules. The origin of this order and its relationships with other groups of organisms with siliceous skeletons are also briefly discussed. (C) 2000 Academie des sciences / Editions scientifiques et medicales Elsevier SAS.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Subjective language detection is one of the most important challenges in Sentiment Analysis. Because of their weight and frequency in opinionated texts, adjectives are considered a key piece in the opinion extraction process. These subjective units are more and more frequently collected in polarity lexicons in which they appear annotated with their prior polarity. However, at the moment, no polarity lexicon takes into account prior polarity variations across domains. This paper proves that a majority of adjectives change their prior polarity value depending on the domain. We propose a distinction between domain-dependent and domain-independent adjectives. Moreover, our analysis led us to propose a further classification related to subjectivity degree: constant, mixed and highly subjective adjectives. Following this classification, polarity values will be a better support for Sentiment Analysis.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The work we present here addresses cue-based noun classification in English and Spanish. Its main objective is to automatically acquire lexical semantic information by classifying nouns into previously known noun lexical classes. This is achieved by using particular aspects of linguistic contexts as cues that identify a specific lexical class. Here we concentrate on the task of identifying such cues and the theoretical background that allows for an assessment of the complexity of the task. The results show that, despite the a priori complexity of the task, cue-based classification is a useful tool in the automatic acquisition of lexical semantic classes.