16 results for Ethnic and racial classification
at Indian Institute of Science - Bangalore - India
Pi-turns in proteins and peptides: Classification, conformation, occurrence, hydration and sequence.
Abstract:
The i + 5 → i hydrogen-bonded turn conformation (pi-turn) with the fifth residue adopting alpha L conformation is frequently found at the C-terminus of helices in proteins and hence is speculated to be a "helix termination signal." This paper reports an analysis of the occurrence of the i + 5 → i hydrogen-bonded turn conformation at any general position in proteins (not specifically at the helix C-terminus), using coordinates of 228 protein crystal structures determined by X-ray crystallography to better than 2.5 Å resolution. Of 486 detected pi-turn conformations, 367 have the (i + 4)th residue in alpha L conformation, generally occurring at the C-terminus of alpha-helices, consistent with previous observations. However, a significant number (111) of pi-turn conformations also occur with the (i + 4)th residue in alpha R conformation, generally as distortions in alpha-helices either at the termini or in the middle, a novel finding. These two sets of pi-turn conformations are referred to as pi alpha L-turns and pi alpha R-turns, respectively, depending upon whether the (i + 4)th residue adopts alpha L or alpha R conformation. Four pi-turns, named pi alpha L'-turns, were noticed to be mirror images of pi alpha L-turns, and four more pi-turns, which have the (i + 4)th residue in beta conformation and are denoted pi beta-turns, occur as part of a hairpin bend connecting twisted beta-strands. Consecutive pi-turns occur, but only with pi alpha R-turns. The preference for amino acid residues differs between pi alpha L-turns and pi alpha R-turns. However, both show a preference for Pro after the C-termini. Hydrophilic residues are preferred at positions i + 1, i + 2, and i + 3 of pi alpha L-turns, whereas positions i and i + 5 prefer hydrophobic residues. Residue i + 4 in pi alpha L-turns is mainly Gly and less often Asn. Although pi alpha R-turns generally occur as distortions in helices, their amino acid preference is different from that of helices. Poor helix formers, such as His, Tyr, and Asn, were also found to be preferred for pi alpha R-turns, whereas the good helix former Ala is not preferred. pi-Turns in peptides provide a picture of the pi-turn at atomic resolution. Only nine peptide-based pi-turns have been reported so far, and all of them belong to the pi alpha L-turn type with an achiral residue in position i + 4. The results are of importance for structure prediction, modeling, and de novo design of proteins.
Abstract:
Detecting and removing moving shadows from the extracted foreground regions of video frames aims to limit the risk of mistaking moving shadows for parts of moving objects. This operation thus improves the accuracy of detection and classification of moving objects. Following similar reasoning, the present paper proposes an efficient method for discriminating between moving object and moving shadow regions in a video sequence, with no human intervention. It also imposes a lower computational burden and works effectively under dynamic road traffic conditions on highways and streets (with and without marking lines). Further, we have used scale-invariant feature transform-based features for the classification of moving vehicles (with and without shadow regions), which enhances the effectiveness of the proposed method. The potential of the method is tested with various data sets collected from different road traffic scenarios, and its superiority over existing methods is evaluated. (C) 2013 Elsevier GmbH. All rights reserved.
Abstract:
Transductive SVM (TSVM) is a well known semi-supervised large margin learning method for binary text classification. In this paper we extend this method to multi-class and hierarchical classification problems. We point out that the determination of labels of unlabeled examples with fixed classifier weights is a linear programming problem. We devise an efficient technique for solving it. The method is applicable to general loss functions. We demonstrate the value of the new method using large margin loss on a number of multi-class and hierarchical classification datasets. For maxent loss we show empirically that our method is better than expectation regularization/constraint and posterior regularization methods, and competitive with the version of entropy regularization method which uses label constraints.
Abstract:
This paper focuses on optimisation algorithms inspired by swarm intelligence for classifying high-resolution multispectral satellite images. Amongst the multiple benefits and uses of remote sensing, one of the most important has been its use in solving the problem of land cover mapping. As the frontiers of space technology advance, the knowledge derived from satellite data has also grown in sophistication. Image classification forms the core of the solution to the land cover mapping problem. No single classifier can satisfactorily classify all the basic land cover classes of an urban region. In both supervised and unsupervised classification methods, evolutionary algorithms are not exploited to their full potential. This work tackles land cover mapping with Ant Colony Optimisation (ACO) and Particle Swarm Optimisation (PSO), which are arguably the most popular algorithms in this category. We present the results of classification techniques using swarm intelligence for the problem of land cover mapping for an urban region. High-resolution QuickBird data has been used for the experiments.
Abstract:
Background: The overwhelming majority of the Serine/Threonine protein kinases identified by gleaning archaeal and eubacterial genomes could not be classified into any of the well-known Hanks and Hunter subfamilies of protein kinases. This is because the Hanks and Hunter classification scheme was developed from eukaryotic protein kinases, which are highly divergent from their prokaryotic homologues. A large dataset of prokaryotic Serine/Threonine protein kinases recognized from genomes of prokaryotes has been used to develop a classification framework for prokaryotic Ser/Thr protein kinases. Methodology/Principal Findings: We have used traditional sequence alignment and phylogenetic approaches and clustered the prokaryotic kinases, which represent 72 subfamilies with at least 4 members in each. Such a clustering enables classification of prokaryotic Ser/Thr kinases and can be used as a framework to classify newly identified prokaryotic Ser/Thr kinases. After a series of searches in a comprehensive sequence database, we recognized that 38 subfamilies of prokaryotic protein kinases are associated with a specific taxonomic level. For example, 4, 6 and 3 subfamilies have been identified that are currently specific to the phyla proteobacteria, cyanobacteria and actinobacteria, respectively. Similarly, subfamilies that are specific to an order, sub-order, class, family and genus have also been identified. In addition to these, we also identify organism-diverse subfamilies. Members of these clusters are from organisms of different taxonomic levels, such as archaea, bacteria, eukaryotes and viruses. Conclusion/Significance: Interestingly, the occurrence of several taxonomic-level-specific subfamilies of prokaryotic kinases contrasts with the classification of eukaryotic protein kinases, in which most of the popular subfamilies occur diversely in several eukaryotes. Many prokaryotic Ser/Thr kinases exhibit a wide variety of modular organization, which indicates a degree of complexity and protein-protein interactions in the signaling pathways in these microbes.
Abstract:
Background: Two clinically relevant high-risk HPV (HR-HPV) types, 16 and 18, are etiologically associated with the development of cervical carcinoma and are also reported to be present in many other carcinomas at extra-genital organ sites. The presence of HPV has been reported in breast carcinoma, which is the second most common cancer in India and is showing a fast-rising trend in the urban population. The two early genes E6 and E7 of HPV type 16 have been shown to immortalize breast epithelial cells in vitro, but the role of HPV infection in breast carcinogenesis is highly controversial. The present study was therefore undertaken to analyze the prevalence of HPV infection in both breast cancer tissues and blood samples from a large number of Indian women with breast cancer from different geographic regions. Methods: The presence of DNA from all mucosal HPVs and from the most common high-risk HPV types 16 and 18 was tested by two different PCR methods: (i) conventional PCR assays using consensus primers (MY09/11 or GP5+/GP6+) or HPV16 E6/E7 primers and (ii) highly sensitive real-time PCR. A total of 228 biopsies and 142 corresponding blood samples collected prospectively from 252 patients from four different regions of India with significant socio-cultural, ethnic and demographic variations were tested. Results: None of the biopsies or blood samples of breast cancer patients showed positivity for HPV DNA sequences in conventional PCRs, either with MY09/11 or with GP5+/GP6+/HPV16 E6/E7 primers. Further testing of these samples by real-time PCR also failed to detect HPV DNA sequences. Conclusions: The lack of detection of HPV DNA in either the tumor or the blood DNA of breast cancer patients by both conventional and real-time PCR does not support a role of genital HPV in the pathogenesis of breast cancer in Indian women.
Abstract:
Urbanisation is the increase in the population of cities in proportion to the region's rural population. Urbanisation in India is very rapid, with the urban population growing at around 2.3 percent per annum. Urban sprawl refers to dispersed development along highways, around the city and in the rural countryside, with implications such as loss of agricultural land, open space and ecologically sensitive habitats. Sprawl is thus a pattern and pace of land use in which the rate of land consumed for urban purposes exceeds the rate of population growth, resulting in an inefficient and consumptive use of land and its associated resources. This unprecedented urbanisation trend, driven by a burgeoning population, has posed serious challenges to decision makers in the city planning and management process, involving a plethora of issues such as infrastructure development, traffic congestion, and basic amenities (electricity, water, and sanitation). In this context, to aid decision makers in following holistic approaches to city and urban planning, the analysis and visualization of the pattern of urban growth and its impact on natural resources have gained importance. This communication analyses urbanisation patterns and trends using temporal remote sensing data, based on supervised learning using maximum likelihood estimation of multivariate normal density parameters and a Bayesian classification approach. The technique is implemented for Greater Bangalore, one of the fastest growing cities in the world, with Landsat data of 1973, 1992 and 2000, IRS LISS-3 data of 1999 and 2006, and MODIS data of 2002 and 2007. The study shows that there has been a growth of 466% in urban areas of Greater Bangalore across 35 years (1973 to 2007). The study unravels the pattern of growth in Greater Bangalore and its implications for the local climate and natural resources, necessitating appropriate strategies for sustainable management.
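The classification step named in this abstract (maximum likelihood estimation of multivariate normal parameters followed by a Bayesian decision) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the two-band pixel vectors, the class names `urban` and `vegetation`, the toy reflectance values, and the use of training-set proportions as priors. It is not the authors' implementation.

```python
import math

def fit_gaussian(samples):
    """ML estimates of the mean and 2x2 covariance of two-band pixel vectors."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    # ML covariance divides by n (not n - 1).
    sxx = sum((s[0] - mx) ** 2 for s in samples) / n
    syy = sum((s[1] - my) ** 2 for s in samples) / n
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / n
    return (mx, my), (sxx, sxy, syy)

def log_density(pixel, mean, cov):
    """Log of the bivariate normal density at `pixel`."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = pixel[0] - mean[0], pixel[1] - mean[1]
    # Mahalanobis distance via the closed-form inverse of a 2x2 matrix.
    maha = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * maha

def train(training):
    """Fit one Gaussian per class; priors are the class frequencies (assumed)."""
    total = sum(len(v) for v in training.values())
    return {c: (fit_gaussian(v), len(v) / total) for c, v in training.items()}

def classify(pixel, model):
    """Bayes decision: pick the class with the largest log posterior."""
    return max(model, key=lambda c: log_density(pixel, *model[c][0])
               + math.log(model[c][1]))

# Hypothetical labelled training pixels (band 1, band 2), not real data.
training = {
    "urban": [(0.30, 0.20), (0.35, 0.22), (0.32, 0.27), (0.28, 0.24)],
    "vegetation": [(0.05, 0.50), (0.08, 0.55), (0.06, 0.45), (0.10, 0.52)],
}
model = train(training)
print(classify((0.31, 0.23), model))
```

Real multispectral data has more than two bands, so a practical version would use a general matrix inverse and determinant instead of the closed-form 2x2 expressions used here.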
Abstract:
The design and operation of the minimum cost classifier, where the total cost is the sum of the measurement cost and the classification cost, is computationally complex. Noting the difficulties associated with this approach, decision tree design directly from a set of labelled samples is proposed in this paper. The feature space is first partitioned to transform the problem to one of discrete features. The resulting problem is solved by a dynamic programming algorithm over an explicitly ordered state space of all outcomes of all feature subsets. The solution procedure is very general and is applicable to any minimum cost pattern classification problem in which each feature has a finite number of outcomes. These techniques are applied to (i) voiced, unvoiced, and silence classification of speech, and (ii) spoken vowel recognition. The resulting decision trees are operationally very efficient and yield attractive classification accuracies.
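The idea of minimising measurement cost plus classification cost over discrete feature outcomes can be sketched as follows. This is an illustrative reading of the abstract and not the paper's algorithm: it uses memoised recursion where the paper describes an explicitly ordered state space, it estimates probabilities by counting the labelled samples, and the function name `design_tree`, the cost values and the toy samples are all hypothetical.

```python
from functools import lru_cache

def design_tree(samples, measure_cost, error_cost=1.0):
    """Return best(state) -> (expected total cost, optimal action).

    A state is a tuple with one entry per feature: the observed outcome,
    or None if that feature has not been measured yet.
    """
    n_features = len(measure_cost)

    def matching(state):
        # Labelled samples consistent with everything observed so far.
        return [(x, y) for x, y in samples
                if all(s is None or s == xi for s, xi in zip(state, x))]

    @lru_cache(maxsize=None)
    def best(state):
        subset = matching(state)
        labels = [y for _, y in subset]
        # Option 1: stop now and predict the majority class.
        majority = max(sorted(set(labels)), key=labels.count)
        stop_cost = error_cost * (1 - labels.count(majority) / len(labels))
        choice = (stop_cost, ("predict", majority))
        # Option 2: pay to measure one more feature, then act optimally.
        for f in range(n_features):
            if state[f] is not None:
                continue
            cost = measure_cost[f]
            for outcome in sorted(set(x[f] for x, _ in subset)):
                p = sum(1 for x, _ in subset if x[f] == outcome) / len(subset)
                child = state[:f] + (outcome,) + state[f + 1:]
                cost += p * best(child)[0]
            if cost < choice[0]:
                choice = (cost, ("measure", f))
        return choice

    return best

# Hypothetical data: feature 0 determines the class, feature 1 is noise.
samples = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "b"), ((1, 1), "b")]
best = design_tree(samples, measure_cost=[0.1, 0.05])
print(best((None, None)))
```

Calling `best` on the child states reached after each measurement recovers the full decision tree, one node at a time.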
Abstract:
The presence of a large number of spectral bands in hyperspectral images increases the capability to distinguish between various physical structures. However, these images suffer from the high dimensionality of the data. Hence, the processing of hyperspectral images is carried out in two stages: dimensionality reduction and unsupervised classification. The high dimensionality of the data has been reduced with the help of Principal Component Analysis (PCA). The selected dimensions are classified using the Niche Hierarchical Artificial Immune System (NHAIS). The NHAIS combines a splitting method, which searches for the optimal cluster centers using a niching procedure, with a merging method, which groups the data points based on majority voting. Results are presented for two hyperspectral images, namely the EO-1 Hyperion image and the Indian Pines image. A performance comparison of the proposed hierarchical clustering algorithm with three earlier unsupervised algorithms is presented. From the results obtained, we deduce that the NHAIS is efficient.
Abstract:
Crop type classification using remote sensing data plays a vital role in planning cultivation activities and in the optimal usage of the available fertile land. A reliable and precise classification of agricultural crops can thus help improve agricultural productivity. Hence, this paper proposes a gene expression programming (GEP) based fuzzy logic approach for multiclass crop classification using multispectral satellite images. The purpose of this work is to utilize the optimization capabilities of GEP for tuning the fuzzy membership functions. The capabilities of GEP as a classifier are also studied. The proposed method is compared with Bayesian and maximum likelihood classifiers in terms of performance. From the results, we conclude that the proposed method is effective for classification.
Abstract:
Imaging flow cytometry is an emerging technology that combines the statistical power of flow cytometry with the spatial and quantitative morphology of digital microscopy. It allows high-throughput imaging of cells with good spatial resolution while they are in flow. This paper proposes a general framework for the processing and classification of cells imaged using an imaging flow cytometer. Each cell is localized by finding an accurate cell contour. Then, features reflecting cell size, circularity and complexity are extracted for classification using an SVM. Unlike conventional iterative, semi-automatic segmentation algorithms such as active contour, we propose a noniterative, fully automatic graph-based cell localization. In order to evaluate the performance of the proposed framework, we have successfully classified unstained label-free leukaemia cell-lines MOLT, K562 and HL60 from video streams captured using a custom-fabricated, cost-effective microfluidics-based imaging flow cytometer. The proposed system is a significant development in the direction of building a cost-effective cell analysis platform that would facilitate affordable mass screening camps that examine cellular morphology for disease diagnosis.
Lay description: In this article, we propose a novel framework for processing the raw data generated using microfluidics-based imaging flow cytometers. Microfluidics microscopy, or microfluidics-based imaging flow cytometry (mIFC), is a recent microscopy paradigm that combines the statistical power of flow cytometry with the spatial and quantitative morphology of digital microscopy, which allows us to image cells while they are in flow. In comparison to conventional slide-based imaging systems, mIFC is a nascent technology enabling high-throughput imaging of cells and is yet to take the form of a clinical diagnostic tool. The proposed framework processes the raw data generated by mIFC systems. The framework incorporates several steps: pre-processing of the raw video frames to enhance the contents of the cell, localising the cell by a novel, fully automatic, non-iterative graph-based algorithm, extraction of different quantitative morphological parameters, and subsequent classification of cells. In order to evaluate the performance of the proposed framework, we have successfully classified unstained label-free leukaemia cell-lines MOLT, K562 and HL60 from video streams captured using a cost-effective microfluidics-based imaging flow cytometer. The HL60, K562 and MOLT cell lines were obtained from ATCC (American Type Culture Collection) and were cultured separately in the lab. Thus, each culture contains cells from its own category alone and thereby provides the ground truth. Each cell is localised by finding a closed cell contour: a directed, weighted graph is defined from the Canny edge images of the cell such that the closed contour lies along the shortest weighted path surrounding the centroid of the cell, from a starting point on a good curve segment to an immediate endpoint. Once the cell is localised, morphological features reflecting the size, shape and complexity of the cells are extracted and used to develop a support vector machine based classification system. We could classify the cell-lines with good accuracy, and the results were quite consistent across different cross-validation experiments. We hope that imaging flow cytometers equipped with the proposed framework for image processing would enable cost-effective, automated and reliable disease screening in overloaded facilities, which cannot afford to hire skilled personnel in large numbers. Such platforms could potentially facilitate screening camps in low-income countries, thereby transforming current health care paradigms by enabling rapid, automated diagnosis of diseases like cancer.
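Once a closed cell contour is available, size and circularity features of the kind mentioned above can be computed directly from its vertices. The sketch below assumes the contour is given as an ordered list of (x, y) points and uses the common circularity definition 4πA/P²; the abstract does not state which definitions the authors used, and the function name `contour_features` is hypothetical. The complexity feature and the SVM stage are not covered.

```python
import math

def contour_features(contour):
    """Area, perimeter and circularity of a closed polygonal contour.

    `contour` is an ordered list of (x, y) vertices; the last vertex is
    implicitly joined back to the first.
    """
    n = len(contour)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0          # shoelace formula term
        perimeter += math.hypot(x1 - x0, y1 - y0)
    area = abs(twice_area) / 2
    # Circularity is 1 for a perfect circle and smaller for other shapes.
    circularity = 4 * math.pi * area / perimeter ** 2
    return {"area": area, "perimeter": perimeter, "circularity": circularity}

# A 2x2 square as a stand-in for a detected cell contour.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(contour_features(square))
```

Feature vectors built this way, one per cell, could then be passed to any standard classifier.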
Abstract:
Plants are sessile organisms that have evolved a variety of mechanisms to maintain their cellular homeostasis under stressful environmental conditions. Survival of plants under abiotic stress conditions requires a specialized group of heat shock protein machinery belonging to the Hsp70:J-protein family. These heat shock proteins are the most ubiquitous types of chaperone machinery, involved in diverse cellular processes including protein folding, translocation across cell membranes, and protein degradation. They play a crucial role in maintaining protein homeostasis by reestablishing functional native conformations under environmental stress conditions, thus providing protection to the cell. J-proteins are co-chaperones of the Hsp70 machine, which play a critical role by stimulating Hsp70's ATPase activity, thereby stabilizing its interaction with client proteins. Using genome-wide analysis of Arabidopsis thaliana, we outline here the identification and systematic classification of J-protein co-chaperones, which are key regulators of Hsp70 function. In comparison with the Saccharomyces cerevisiae model system, the domain structural organization, cellular localization, and functional diversity of A. thaliana J-proteins have also been comprehensively summarized. Electronic supplementary material: The online version of this article (doi:10.1007/s10142-009-0132-0) contains supplementary material, which is available to authorized users.
Abstract:
Heat shock protein information resource (HSPIR) is a concerted database of six major heat shock proteins (HSPs), namely, Hsp70, Hsp40, Hsp60, Hsp90, Hsp100 and small HSP. The HSPs are essential for the survival of all living organisms, as they protect the conformations of proteins on exposure to various stress conditions. They are a highly conserved group of proteins involved in diverse physiological functions, including de novo folding, disaggregation and protein trafficking. Moreover, their critical role in the control of disease progression made them a prime target of research. Presently, limited information is available on HSPs in reference to their identification and structural classification across genera. To that extent, HSPIR provides manually curated information on sequence, structure, classification, ontology, domain organization, localization and possible biological functions extracted from UniProt, GenBank, Protein Data Bank and the literature. The database offers interactive search with incorporated tools, which enhances the analysis. HSPIR is a reliable resource for researchers exploring structure, function and evolution of HSPs.
Abstract:
Comments constitute an important part of Web 2.0. In this paper, we consider comments on news articles. To simplify the task of relating comment content to the article content the comments are about, we propose the idea of showing comments alongside article segments and explore automatic mapping of comments to article segments. This task is challenging because of the vocabulary mismatch between the articles and the comments. We present supervised and unsupervised techniques for aligning comments to the segments of the article the comments are about. More specifically, we provide a novel formulation of the supervised alignment problem using the framework of structured classification. Our experimental results show that the structured classification model performs better than unsupervised matching and a binary classification model.
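To make the alignment task concrete, here is a minimal sketch of one possible unsupervised baseline: each comment is mapped to the article segment with the highest bag-of-words cosine similarity. The abstract does not say which unsupervised matching technique was used, so this choice, the function names and the sample texts are all assumptions. A pure word-overlap measure like this one is also exactly what suffers from the vocabulary mismatch the abstract mentions.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Lower-cased token counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def align(comment, segments):
    """Index of the article segment most similar to the comment."""
    c = bag_of_words(comment)
    scores = [cosine(c, bag_of_words(s)) for s in segments]
    return max(range(len(segments)), key=scores.__getitem__)

# Hypothetical article segments and comments.
segments = [
    "The council approved the new metro line budget.",
    "Local farmers reported heavy monsoon rain damage to crops.",
]
print(align("The metro budget is far too high", segments))
print(align("Rain ruined my crops too", segments))
```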
Abstract:
This paper discusses a novel high-speed approach for human action recognition in the H.264/AVC compressed domain. The proposed algorithm utilizes cues from quantization parameters and motion vectors extracted from the compressed video sequence for feature extraction and further classification using Support Vector Machines (SVM). The ultimate goal of our work is to deliver a much faster algorithm than pixel-domain counterparts, with comparable accuracy, utilizing only the sparse information from compressed video. Partial decoding avoids the complexity of full decoding and minimizes computational load and memory usage, which can result in reduced hardware utilization and fast recognition results. The proposed approach can handle illumination changes, scale, and appearance variations, and is robust in outdoor as well as indoor testing scenarios. We have tested our method on two benchmark action datasets and achieved more than 85% accuracy. The proposed algorithm classifies actions at a speed (>2000 fps) approximately 100 times higher than that of existing state-of-the-art pixel-domain algorithms.