857 results for Classification (of information)
Abstract:
Background: The function of a protein can be deciphered with higher accuracy from its structure than from its amino acid sequence. Due to the huge gap between the available protein sequence space and structural space, tools that can generate functionally homogeneous clusters using only sequence information are of great importance. Traditional alignment-based tools work well in most cases, and clustering is performed on the basis of sequence similarity. In the case of multi-domain proteins, however, the alignment quality might be poor due to varied lengths of the proteins, domain shuffling or circular permutations. Multi-domain proteins are ubiquitous in nature, hence alignment-free tools, which overcome the shortcomings of alignment-based protein comparison methods, are required. Further, existing tools classify proteins using only domain-level information and hence miss out on the information encoded in the tethered regions or accessory domains. Our method, on the other hand, takes into account the full-length sequence of a protein, consolidating the complete sequence information to understand a given protein better.
Results: Our web-server, CLAP (Classification of Proteins), is one such alignment-free software for automatic classification of protein sequences. It utilizes a pattern-matching algorithm that assigns local matching scores (LMS) to residues that are part of the matched patterns between two sequences being compared. CLAP works on full-length sequences and does not require prior domain definitions. Pilot studies undertaken previously on protein kinases and immunoglobulins have shown that CLAP yields clusters that have high functional and domain architectural similarity. Moreover, parsing at a statistically determined cut-off resulted in clusters that corroborated the sub-family level classification of that particular domain family.
Conclusions: CLAP is a useful protein-clustering tool, independent of domain assignment, domain order, sequence length and domain diversity. Our method can be used for any set of protein sequences, yielding functionally relevant clusters with high domain architectural homogeneity. The CLAP web server is freely available for academic use at http://nslab.mbu.iisc.ernet.in/clap/.
Abstract:
Habitat mapping and characterization has been defined as a high-priority management issue for the Olympic Coast National Marine Sanctuary (OCNMS), especially for poorly known deep-sea habitats that may be sensitive to anthropogenic disturbance. As a result, a team of scientists from OCNMS, National Centers for Coastal Ocean Science (NCCOS), and other partnering institutions initiated a series of surveys to assess the distribution of deep-sea coral/sponge assemblages within the sanctuary and to look for evidence of potential anthropogenic impacts in these critical habitats. Initial results indicated that remotely delineating areas of hard bottom substrate through acoustic sensing could be a useful tool to increase the efficiency and success of subsequent ROV-based surveys of the associated deep-sea fauna. Accordingly, side scan sonar surveys were conducted in May 2004, June 2005, and April 2006 aboard the NOAA Ship McArthur II to: (1) obtain additional imagery of the seafloor for broader habitat-mapping coverage of sanctuary waters, and (2) help delineate suitable deep-sea coral/sponge habitat, in areas of both high and low commercial-fishing activity, to serve as sites for more detailed surveying with an ROV on subsequent cruises. Several regions of the sea floor throughout the OCNMS were surveyed and mosaicked at 1-meter pixel resolution. Imagery from the side scan sonar mapping efforts was integrated with other complementary data from a towed camera sled, ROVs, sedimentary samples, and bathymetry records to describe geological and (where possible) biological aspects of habitat. Using a hierarchical deep-water marine benthic classification scheme (Greene et al. 1999), we created a preliminary map of various habitat polygon features for use in a geographical information system (GIS). This report provides a description of the mapping and groundtruthing efforts as well as results of the image classification procedure for each of the areas surveyed.
(PDF contains 60 pages.)
Abstract:
In September 2002, side scan sonar was used to image a portion of the sea floor in the northern OCNMS, and the imagery was mosaicked at 1-meter pixel resolution using 100 kHz data collected at 300-meter range scale. Video from a remotely-operated vehicle (ROV), bathymetry data, sedimentary samples, and sonar mapping have been integrated to describe geological and biological aspects of habitat, and polygon features have been created and attributed with a hierarchical deep-water marine benthic classification scheme (Greene et al. 1999). The data can be used with geographic information system (GIS) software for display, query, and analysis. Textural analysis of the sonar images provided a relatively automated method for delineating substrate into three broad classes representing soft, mixed sediment, and hard bottom. Microhabitat and the presence of certain biological attributes were also populated into the polygon features, but only in areas where video groundtruthing occurred. Further groundtruthing work in specific areas would improve confidence in the classified habitat map. (PDF contains 22 pages.)
Abstract:
Fundación Zain is developing new built heritage assessment protocols. The goal is to objectivize and standardize the analysis and decision process that leads to determining the degree of protection of built heritage in the Basque Country. The ultimate step in this objectivization and standardization effort will be the development of an information and communication technology (ICT) tool for the assessment of built heritage. This paper presents the groundwork carried out to make this tool possible: the automatic, image-based delineation of stone masonry. This is a necessary first step in the development of the tool, as the built heritage that will be assessed consists of stone masonry construction, and many of the features analyzed can be characterized according to the geometry and arrangement of the stones. Much of the assessment is carried out through visual inspection. Thus, this process will be automated by applying image processing to digital images of the elements under inspection. The principal contribution of this paper is the proposed automatic delineation framework. The other contribution is the performance evaluation of this delineation as the input to a classifier for a geometrically characterized feature of a built heritage object. The element chosen to perform this evaluation is the stone arrangement of masonry walls. The validity of the proposed framework is assessed on real images of masonry walls.
Abstract:
The paper traces the history of the different documentation media used for information dissemination. Early media include clay tablets, papyrus, and the vellum or parchment codex. The invention of printing, however, revolutionized the information industry, enabling the production of books in multiple copies. Photography came into documentation mainly to preserve rare materials and those that easily deteriorate. This paper reports the efforts of National Institute for Freshwater Fisheries Research (NIFFR) and Kainji Lake Fisheries Promotion Project (KLFPPP), Nigeria, to develop an Object Oriented Database (OOD) using photographs. The photographs are stored in digitized form on commercial computers, using the program ACDSee 32 for classification, description and retrieval. Specifically, the paper focuses on photographs in fisheries as visual communication and expression. Presently, the database contains photo documents about the following aspects of Kainji Lake fisheries: fishing gears and crafts, fish preservation methods
Abstract:
The National Marine Fisheries Service is required by law to conduct social impact assessments of communities impacted by fishery management plans. To facilitate this process, we developed a technique for grouping communities based on common sociocultural attributes. Multivariate data reduction techniques (e.g. principal component analyses, cluster analyses) were used to classify Northeast U.S. fishing communities based on census and fisheries data. The derived clusters are then compared based on more detailed data from fishing community profiles. The comparisons indicate that the clusters represent real groupings that can be verified with the profiles. We then selected communities representative of different values on these multivariate dimensions for in-depth analysis. Ground-truthing (e.g. visiting the communities and collecting primary information) a sample of communities from three clusters (two overlapping geographically) indicates that the more remote techniques are sufficient for typing the communities for further in-depth analyses. The in-depth analyses provide additional important information which we contend is representative of all communities within the cluster.
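The data-reduction step mentioned in this abstract can be illustrated with a minimal sketch. It assumes two invented community variables (the numbers are hypothetical, not taken from the study) and uses plain power iteration rather than whatever statistics package the authors used. It standardizes the variables, finds the first principal component, and scores each community on it.

```python
import math

def standardize(rows):
    """Centre each column to mean 0 and scale it to unit sample variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def first_component(rows, iterations=200):
    """Leading eigenvector of the covariance matrix, found by power iteration.

    Assumes the rows are already centred and that the starting vector of
    ones is not orthogonal to the leading eigenvector.
    """
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(rows, v):
    """Score of each row on the component v."""
    return [sum(x * c for x, c in zip(row, v)) for row in rows]

# Hypothetical communities: (percent of jobs in fishing, landings value)
communities = [(2.0, 10.0), (3.0, 14.0), (20.0, 90.0), (22.0, 100.0)]
z = standardize(communities)
pc1 = first_component(z)
scores = project(z, pc1)
```

Communities with similar scores on the leading components would then be candidates for the same cluster; the clustering step itself is not shown here.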
Abstract:
We have applied a number of objective statistical techniques to define homogeneous climatic regions for the Pacific Ocean, using COADS (Woodruff et al. 1987) monthly sea surface temperature (SST) for 1950-1989 as the key variable. The basic data comprised all global 4°x4° latitude/longitude boxes with enough data available to yield reliable long-term means of monthly mean SST. An R-mode principal components analysis of these data, following a technique first used by Stidd (1967), yields information about harmonics of the annual cycles of SST. We used the spatial coefficients (one for each 4-degree box and eigenvector) as input to a K-means cluster analysis to classify the gridbox SST data into 34 global regions, of which 20 cover the Pacific and Indian oceans. Seasonal time series were then produced for each of these regions. For comparison purposes, the variance spectrum of each regional anomaly time series was calculated. Most of the significant spectral peaks occur near the biennial (2.1-2.2 years) and ENSO (~3-6 years) time scales in the tropical regions. Decadal scale fluctuations are important in the mid-latitude ocean regions.
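The clustering step described in this abstract, K-means applied to per-gridbox principal-component coefficients, can be sketched as follows. This is a minimal illustration of Lloyd's algorithm, not the authors' code: the coefficient values are invented, and the deterministic farthest-point initialisation is an assumption made here so that the example is reproducible.

```python
import math

def kmeans(points, k, iterations=50):
    """Lloyd's algorithm with farthest-point initialisation.

    points: list of equal-length tuples (e.g. PC coefficients per grid box).
    Returns (labels, centers).
    """
    # Start from the first point, then repeatedly add the point that is
    # farthest from its nearest already-chosen center.
    centers = [tuple(points[0])]
    while len(centers) < k:
        centers.append(tuple(max(
            points, key=lambda p: min(math.dist(p, c) for c in centers))))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers

# Hypothetical coefficients on two eigenvectors for six grid boxes
coeffs = [(0.9, 0.1), (1.0, 0.2), (0.8, 0.0),
          (-0.5, 0.9), (-0.6, 1.0), (-0.4, 1.1)]
labels, centers = kmeans(coeffs, k=2)
```

Grid boxes sharing a label would form one region; in the study the same idea is applied with 34 clusters.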
Abstract:
The taxonomy of the douc and snub-nosed langurs has changed several times during the 20th century. The controversy over the systematic position of these animals has been due in part to difficulties in studying them: both the doucs and the snub-nosed langurs are rare in the wild and are generally poorly represented in institutional collections. This review is based on a detailed examination of relatively large numbers of specimens of most of the species of langurs concerned. An attempt was made to draw upon as many types of information as were available in order to make an assessment of the phyletic relationships between the langur species under discussion. Toward this end, quantitative and qualitative features of the skeleton, specific features of visceral anatomy and characteristics of the pelage were utilized. The final data matrix comprised 178 characters. The matrix was analyzed using the program Hennig86. The results of the analysis support the following conclusions: (1) that the douc and snub-nosed langurs are generically distinct and should be referred to as species of Pygathrix and Rhinopithecus, respectively; (2) that the Tonkin snub-nosed langur be placed in its own subgenus as Rhinopithecus (Presbytiscus) avunculus and that the Chinese snub-nosed langurs thus be placed in the subgenus Rhinopithecus (Rhinopithecus); (3) that four extant species of Rhinopithecus be recognized: R. (Rhinopithecus) roxellana Milne Edwards, 1870; R. (Rhinopithecus) bieti Milne Edwards, 1897; R. (Rhinopithecus) brelichi Thomas, 1903, and R. (Presbytiscus) avunculus Dollman, 1912; (4) that the Chinese snub-nosed langurs fall into northern and southern subgroups divided by the Yangtze river; (5) that R. lantianensis Hu and Qi, 1978, is a valid fossil species, and (6) that the precise affinities and taxonomic status of the fossil species R. tingianus Matthew and Granger, 1923, are unclear because the type specimen is a subadult.
Abstract:
C. Shang and Q. Shen. Aiding classification of gene expression data with feature selection: a comparative study. Computational Intelligence Research, 1(1):68-76.
Abstract:
As a by-product of the ‘information revolution’ which is currently unfolding, lifetimes of man (and indeed computer) hours are being allocated for the automated and intelligent interpretation of data. This is particularly true in medical and clinical settings, where research into machine-assisted diagnosis of physiological conditions gains momentum daily. Of the conditions which have been addressed, however, automated classification of allergy has not been investigated, even though the numbers of allergic persons are rising, and undiagnosed allergies are most likely to elicit fatal consequences. On the basis of the observations of allergists who conduct oral food challenges (OFCs), activity-based analyses of allergy tests were performed. Algorithms were investigated and validated by a pilot study which verified that accelerometer-based inquiry of human movements is particularly well-suited for objective appraisal of activity. However, when these analyses were applied to OFCs, accelerometer-based investigations were found to provide very poor separation between allergic and non-allergic persons, and it was concluded that the accelerometer-based avenues explored in this thesis are inadequate for the classification of allergy. Heart rate variability (HRV) analysis is known to provide very significant diagnostic information for many conditions. Owing to this, electrocardiograms (ECGs) were recorded during OFCs for the purpose of assessing the effect that allergy induces on HRV features. It was found that with appropriate analysis, excellent separation between allergic and non-allergic subjects can be obtained. These results were, however, obtained with manual QRS annotations, and manual annotation is not a viable methodology for real-time diagnostic applications. Even so, this was the first work which has categorically correlated changes in HRV features to the onset of allergic events, and manual annotations yield undeniable affirmation of this.
Encouraged by the successful results which were obtained with manual annotations, automatic QRS detection algorithms were investigated to facilitate the fully automated classification of allergy. The results which were obtained by this process are very promising. Most importantly, the work that is presented in this thesis did not obtain any false positive classifications. This is a most desirable result for OFC classification, as it allows complete confidence to be attributed to classifications of allergy. Furthermore, these results could be particularly advantageous in clinical settings, as machine-based classification can detect the onset of allergy, which can allow for early termination of OFCs. Consequently, machine-based monitoring of OFCs has in this work been shown to possess the capacity to significantly and safely advance the current state of the clinical art of allergy diagnosis.
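To make the link between QRS annotations and HRV features concrete, here is a minimal sketch that derives R-R intervals from annotated R-peak times and computes two standard time-domain HRV measures, SDNN and RMSSD. The abstract does not say which HRV features the thesis used, so the choice of these two is an assumption, and the R-peak times are invented.

```python
import math

def rr_intervals_ms(r_peak_times_s):
    """Successive R-R intervals in milliseconds from R-peak times in seconds."""
    return [(b - a) * 1000.0
            for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]

def sdnn(rr_ms):
    """Sample standard deviation of the R-R intervals."""
    mean = sum(rr_ms) / len(rr_ms)
    return math.sqrt(sum((x - mean) ** 2 for x in rr_ms) / (len(rr_ms) - 1))

def rmssd(rr_ms):
    """Root mean square of successive differences between R-R intervals."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical R-peak annotations (seconds since the start of the recording)
r_peaks = [0.0, 0.8, 1.6, 2.5, 3.3]
rr = rr_intervals_ms(r_peaks)
features = {"sdnn_ms": sdnn(rr), "rmssd_ms": rmssd(rr)}
```

Features like these, computed over sliding windows during a challenge, would be the input to whatever classifier separates allergic from non-allergic responses.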
Abstract:
The detection of dense harmful algal blooms (HABs) by satellite remote sensing is usually based on analysis of chlorophyll-a as a proxy. However, this approach does not provide information about the potential harm of a bloom, nor can it identify the dominant species. The developed HAB risk classification method employs a fully automatic data-driven approach to identify key characteristics of water-leaving radiances and derived quantities, and to classify pixels into “harmful”, “non-harmful” and “no bloom” categories using Linear Discriminant Analysis (LDA). Discrimination accuracy is increased through the use of spectral ratios of water-leaving radiances, absorption and backscattering. To reduce the false alarm rate, data that cannot be reliably classified are automatically labelled as “unknown”. This method can be trained on different HAB species or extended to new sensors and then applied to generate independent HAB risk maps; these can be fused with other sensors to fill gaps or improve spatial or temporal resolution. The HAB discrimination technique has obtained accurate results on MODIS and MERIS data, correctly identifying 89% of Phaeocystis globosa HABs in the southern North Sea and 88% of Karenia mikimotoi blooms in the Western English Channel. A linear transformation of the ocean colour discriminants is used to estimate harmful cell counts, demonstrating greater accuracy than if based on chlorophyll-a; this will facilitate its integration into a HAB early warning system operating in the southern North Sea.
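The idea of LDA with an "unknown" fallback can be sketched in a simplified, single-feature form. The sketch below assumes one hypothetical spectral-ratio feature per pixel, invented training values, and a posterior-probability threshold as the rule for "cannot be reliably classified"; the abstract does not state how the published method makes that decision, and the real method uses several features.

```python
import math

def fit_lda_1d(samples):
    """Fit univariate LDA: class means, priors and one pooled variance.

    samples: dict mapping class label -> list of feature values.
    """
    n_total = sum(len(v) for v in samples.values())
    means = {k: sum(v) / len(v) for k, v in samples.items()}
    priors = {k: len(v) / n_total for k, v in samples.items()}
    ss = sum((x - means[k]) ** 2 for k, v in samples.items() for x in v)
    pooled_var = ss / (n_total - len(samples))
    return means, priors, pooled_var

def classify(x, model, min_posterior=0.8):
    """Return the most probable class, or "unknown" if its posterior is low."""
    means, priors, var = model
    # Linear discriminant score for each class (shared variance).
    scores = {k: x * m / var - m * m / (2 * var) + math.log(priors[k])
              for k, m in means.items()}
    top = max(scores.values())
    weights = {k: math.exp(s - top) for k, s in scores.items()}  # stable softmax
    label = max(weights, key=weights.get)
    posterior = weights[label] / sum(weights.values())
    return label if posterior >= min_posterior else "unknown"

# Hypothetical training pixels: one spectral-ratio value per pixel
training = {
    "no bloom": [0.9, 1.0, 1.1],
    "non-harmful": [1.9, 2.0, 2.1],
    "harmful": [2.9, 3.0, 3.1],
}
model = fit_lda_1d(training)
pixel_labels = [classify(x, model) for x in (1.0, 2.5, 3.0)]
```

A pixel whose feature value falls midway between two class means gets a posterior near 0.5 for each and is therefore labelled "unknown" rather than forced into a class, which is how a reject option lowers the false alarm rate.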