32 resultados para Independent component analysis

em Indian Institute of Science - Bangalore - Índia


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a new application of two dimensional Principal Component Analysis (2DPCA) to the problem of online character recognition in Tamil Script. A novel set of features employing polynomial fits and quartiles in combination with conventional features are derived for each sample point of the Tamil character obtained after smoothing and resampling. These are stacked to form a matrix, using which a covariance matrix is constructed. A subset of the eigenvectors of the covariance matrix is employed to get the features in the reduced sub space. Each character is modeled as a separate subspace and a modified form of the Mahalanobis distance is derived to classify a given test character. Results indicate that the recognition accuracy using the 2DPCA scheme shows an approximate 3% improvement over the conventional PCA technique.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The transient changes in resistances of Cr0.8Fe0.2NbO4 thick film sensors towards specified concentrations of H-2, NH3, acetonitrile, acetone, alcohol, cyclohexane and petroleum gas at different operating temperatures were recorded. The analyte-specific characteristics such as slopes of the response and retrace curves, area under the curve and sensitivity deduced from the transient curve of the respective analyte gas have been used to construct a data matrix. Principal component analysis (PCA) was applied to this data and the score plot was obtained. Distinguishing one reducing gas from the other is demonstrated based on this approach, which otherwise is not possible by measuring relative changes in conductivity. This methodology is extended for three Cr0.8Fe0.2NbO4 thick film sensor array operated at different temperatures. (C) 2015 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Rice landraces are lineages developed by farmers through artificial selection during the long-term domestication process. Despite huge potential for crop improvement, they are largely understudied in India. Here, we analyse a suite of phenotypic characters from large numbers of Indian landraces comprised of both aromatic and non-aromatic varieties. Our primary aim was to investigate the major determinants of diversity, the strength of segregation among aromatic and non-aromatic landraces as well as that within aromatic landraces. Using principal component analysis, we found that grain length, width and weight, panicle weight and leaf length have the most substantial contribution. Discriminant analysis can effectively distinguish the majority of aromatic from non-aromatic landraces. More interestingly, within aromatic landraces long-grain traditional Basmati and short-grain non-Basmati aromatics remain morphologically well differentiated. The present research emphasizes the general patterns of phenotypic diversity and finds out the most important characters. It also confirms the existence of very unique short-grain aromatic landraces, perhaps carrying signatures of independent origin of an additional aroma quantitative trait locus in the indica group, unlike introgression of specific alleles of the BADH2 gene from the japonica group as in Basmati. We presume that this parallel origin and evolution of aroma in short-grain indica landraces are linked to the long history of rice domestication that involved inheritance of several traits from Oryza nivara, in addition to O. rufipogon. We conclude with a note that the insights from the phenotypic analysis essentially comprise the first part, which will likely be validated with subsequent molecular analysis.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Principal component analysis is applied to derive patterns of temporal variation of the rainfall at fifty-three stations in peninsular India. The location of the stations in the coordinate space determined by the amplitudes of the two leading eigenvectors is used to delineate them into eight clusters. The clusters obtained seem to be stable with respect to variations in the grid of stations used. Stations within any cluster occur in geographically contiguous areas.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Skew correction of complex document images is a difficult task. We propose an edge-based connected component approach for robust skew correction of documents with complex layout and content. The algorithm essentially consists of two steps - an 'initialization' step to determine the image orientation from the centroids of the connected components and a 'search' step to find the actual skew of the image. During initialization, we choose two different sets of points regularly spaced across the the image, one from the left to right and the other from top to bottom. The image orientation is determined from the slope between the two succesive nearest neighbors of each of the points in the chosen set. The search step finds succesive nearest neighbors that satisfy the parameters obtained in the initialization step. The final skew is determined from the slopes obtained in the 'search' step. Unlike other connected component based methods, the proposed method does not require any binarization step that generally precedes connected component analysis. The method works well for scanned documents with complex layout of any skew with a precision of 0.5 degrees.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Image fusion is a formal framework which is expressed as means and tools for the alliance of multisensor, multitemporal, and multiresolution data. Multisource data vary in spectral, spatial and temporal resolutions necessitating advanced analytical or numerical techniques for enhanced interpretation capabilities. This paper reviews seven pixel based image fusion techniques - intensity-hue-saturation, brovey, high pass filter (HPF), high pass modulation (HPM), principal component analysis, fourier transform and correspondence analysis.Validation of these techniques on IKONOS data (Panchromatic band at I m spatial resolution and Multispectral 4 bands at 4 in spatial resolution) reveal that HPF and HPM methods synthesises the images closest to those the corresponding multisensors would observe at the high resolution level.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Angiogenin is a protein belonging to the superfamily of RNase A. The RNase activity of this protein is essential for its angiogenic activity. Although members of the RNase A family carry out RNase activity, they differ markedly in their strength and specificity. In this paper, we address the problem of higher specificity of angiogenin towards cytosine against uracil in the first base binding position. We have carried out extensive nano-second level molecular dynamics(MD) computer simulations on the native bovine angiogenin and on the CMP and UMP complexes of this protein in aqueous medium with explicit molecular solvent. The structures thus generated were subjected to a rigorous free energy component analysis to arrive at a plausible molecular thermodynamic explanation for the substrate specificity of angiogenin.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Urbanisation is a dynamic complex phenomenon involving large scale changes in the land uses at local levels. Analyses of changes in land uses in urban environments provide a historical perspective of land use and give an opportunity to assess the spatial patterns, correlation, trends, rate and impacts of the change, which would help in better regional planning and good governance of the region. Main objective of this research is to quantify the urban dynamics using temporal remote sensing data with the help of well-established landscape metrics. Bangalore being one of the rapidly urbanising landscapes in India has been chosen for this investigation. Complex process of urban sprawl was modelled using spatio temporal analysis. Land use analyses show 584% growth in built-up area during the last four decades with the decline of vegetation by 66% and water bodies by 74%. Analyses of the temporal data reveals an increase in urban built up area of 342.83% (during 1973-1992), 129.56% (during 1992-1999), 106.7% (1999-2002), 114.51% (2002-2006) and 126.19% from 2006 to 2010. The Study area was divided into four zones and each zone is further divided into 17 concentric circles of 1 km incrementing radius to understand the patterns and extent of the urbanisation at local levels. The urban density gradient illustrates radial pattern of urbanisation for the period 1973-2010. Bangalore grew radially from 1973 to 2010 indicating that the urbanisation is intensifying from the central core and has reached the periphery of the Greater Bangalore. Shannon's entropy, alpha and beta population densities were computed to understand the level of urbanisation at local levels. Shannon's entropy values of recent time confirms dispersed haphazard urban growth in the city, particularly in the outskirts of the city. This also illustrates the extent of influence of drivers of urbanisation in various directions. Landscape metrics provided in depth knowledge about the sprawl. Principal component analysis helped in prioritizing the metrics for detailed analyses. The results clearly indicates that whole landscape is aggregating to a large patch in 2010 as compared to earlier years which was dominated by several small patches. The large scale conversion of small patches to large single patch can be seen from 2006 to 2010. In the year 2010 patches are maximally aggregated indicating that the city is becoming more compact and more urbanised in recent years. Bangalore was the most sought after destination for its climatic condition and the availability of various facilities (land availability, economy, political factors) compared to other cities. The growth into a single urban patch can be attributed to rapid urbanisation coupled with the industrialisation. Monitoring of growth through landscape metrics helps to maintain and manage the natural resources. (C) 2012 Elsevier B.V. All rights reserved.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Background: Recent research on glioblastoma (GBM) has focused on deducing gene signatures predicting prognosis. The present study evaluated the mRNA expression of selected genes and correlated with outcome to arrive at a prognostic gene signature. Methods: Patients with GBM (n = 123) were prospectively recruited, treated with a uniform protocol and followed up. Expression of 175 genes in GBM tissue was determined using qRT-PCR. A supervised principal component analysis followed by derivation of gene signature was performed. Independent validation of the signature was done using TCGA data. Gene Ontology and KEGG pathway analysis was carried out among patients from TCGA cohort. Results: A 14 gene signature was identified that predicted outcome in GBM. A weighted gene (WG) score was found to be an independent predictor of survival in multivariate analysis in the present cohort (HR = 2.507; B = 0.919; p < 0.001) and in TCGA cohort. Risk stratification by standardized WG score classified patients into low and high risk predicting survival both in our cohort (p = <0.001) and TCGA cohort (p = 0.001). Pathway analysis using the most differentially regulated genes (n = 76) between the low and high risk groups revealed association of activated inflammatory/immune response pathways and mesenchymal subtype in the high risk group. Conclusion: We have identified a 14 gene expression signature that can predict survival in GBM patients. A network analysis revealed activation of inflammatory response pathway specifically in high risk group. These findings may have implications in understanding of gliomagenesis, development of targeted therapies and selection of high risk cancer patients for alternate adjuvant therapies.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Latent variable methods, such as PLCA (Probabilistic Latent Component Analysis) have been successfully used for analysis of non-negative signal representations. In this paper, we formulate PLCS (Probabilistic Latent Component Segmentation), which models each time frame of a spectrogram as a spectral distribution. Given the signal spectrogram, the segmentation boundaries are estimated using a maximum-likelihood approach. For an efficient solution, the algorithm imposes a hard constraint that each segment is modelled by a single latent component. The hard constraint facilitates the solution of ML boundary estimation using dynamic programming. The PLCS framework does not impose a parametric assumption unlike earlier ML segmentation techniques. PLCS can be naturally extended to model coarticulation between successive phones. Experiments on the TIMIT corpus show that the proposed technique is promising compared to most state of the art speech segmentation algorithms.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The practice of Ayurveda, the traditional medicine of India, is based on the concept of three major constitutional types (Vata, Pitta and Kapha) defined as ``Prakriti''. To the best of our knowledge, no study has convincingly correlated genomic variations with the classification of Prakriti. In the present study, we performed genome-wide SNP (single nucleotide polymorphism) analysis (Affymetrix, 6.0) of 262 well-classified male individuals (after screening 3416 subjects) belonging to three Prakritis. We found 52 SNPs (p <= 1 x 10(-5)) were significantly different between Prakritis, without any confounding effect of stratification, after 10(6) permutations. Principal component analysis (PCA) of these SNPs classified 262 individuals into their respective groups (Vata, Pitta and Kapha) irrespective of their ancestry, which represent its power in categorization. We further validated our finding with 297 Indian population samples with known ancestry. Subsequently, we found that PGM1 correlates with phenotype of Pitta as described in the ancient text of Caraka Samhita, suggesting that the phenotypic classification of India's traditional medicine has a genetic basis; and its Prakriti-based practice in vogue for many centuries resonates with personalized medicine.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The development of techniques for scaling up classifiers so that they can be applied to problems with large datasets of training examples is one of the objectives of data mining. Recently, AdaBoost has become popular among machine learning community thanks to its promising results across a variety of applications. However, training AdaBoost on large datasets is a major problem, especially when the dimensionality of the data is very high. This paper discusses the effect of high dimensionality on the training process of AdaBoost. Two preprocessing options to reduce dimensionality, namely the principal component analysis and random projection are briefly examined. Random projection subject to a probabilistic length preserving transformation is explored further as a computationally light preprocessing step. The experimental results obtained demonstrate the effectiveness of the proposed training process for handling high dimensional large datasets.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Ninety-two strong-motion earthquake records from the California region, U.S.A., have been statistically studied using principal component analysis in terms of twelve important standardized strong-motion characteristics. The first two principal components account for about 57 per cent of the total variance. Based on these two components the earthquake records are classified into nine groups in a two-dimensional principal component plane. Also a unidimensional engineering rating scale is proposed. The procedure can be used as an objective approach for classifying and rating future earthquakes.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Nanotechnology is a new technology which is generating a lot of interest among academicians, practitioners and scientists. Critical research is being carried out in this area all over the world.Governments are creating policy initiatives to promote developments it the nanoscale science and technology developments. Private investment is also seeing a rising trend. Large number of academic institutions and national laboratories has set up research centers that are workingon the multiple applications of nanotechnology. Wide ranges of applications are claimed for nanotechnology. This consists of materials, chemicals, textiles, semiconductors, to wonder drug delivery systems and diagnostics. Nanotechnology is considered to be a next big wave of technology after information technology and biotechnology. In fact, nanotechnology holds the promise of advances that exceed those achieved in recent decades in computers and biotechnology. Much interest in nanotechnology also could be because of the fact that enormous monetary benefits are expected from nanotechnology based products. According to NSF, revenues from nanotechnology could touch $ 1 trillion by 2015. However much of the benefits are projected ones. Realizing claimed benefits require successful development of nanoscience andv nanotechnology research efforts. That is the journey of invention to innovation has to be completed. For this to happen the technology has to flow from laboratory to market. Nanoscience and nanotechnology research efforts have to come out in the form of new products, new processes, and new platforms.India has also started its Nanoscience and Nanotechnology development program in under its 10(th) Five Year Plan and funds worth Rs. One billion have been allocated for Nanoscience and Nanotechnology Research and Development. The aim of the paper is to assess Nanoscience and Nanotechnology initiatives in India. We propose a conceptual model derived from theresource based view of the innovation. We have developed a structured questionnaire to measure the constructs in the conceptual model. Responses have been collected from 115 scientists and engineers working in the field of Nanoscience and Nanotechnology. The responses have been analyzed further by using Principal Component Analysis, Cluster Analysis and Regression Analysis.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

While plants of a single species emit a diversity of volatile organic compounds (VOCs) to attract or repel interacting organisms, these specific messages may be lost in the midst of the hundreds of VOCs produced by sympatric plants of different species, many of which may have no signal content. Receivers must be able to reduce the babel or noise in these VOCs in order to correctly identify the message. For chemical ecologists faced with vast amounts of data on volatile signatures of plants in different ecological contexts, it is imperative to employ accurate methods of classifying messages, so that suitable bioassays may then be designed to understand message content. We demonstrate the utility of `Random Forests' (RF), a machine-learning algorithm, for the task of classifying volatile signatures and choosing the minimum set of volatiles for accurate discrimination, using datam from sympatric Ficus species as a case study. We demonstrate the advantages of RF over conventional classification methods such as principal component analysis (PCA), as well as data-mining algorithms such as support vector machines (SVM), diagonal linear discriminant analysis (DLDA) and k-nearest neighbour (KNN) analysis. We show why a tree-building method such as RF, which is increasingly being used by the bioinformatics, food technology and medical community, is particularly advantageous for the study of plant communication using volatiles, dealing, as it must, with abundant noise.