Digital Library

30 results for PRINCIPAL COMPONENT ANALYSIS

in Indian Institute of Science - Bangalore - India

Two Dimensional Principal Component Analysis for Online Tamil Character Recognition

Relevance:

100.00%

Abstract:

This paper presents a new application of two dimensional Principal Component Analysis (2DPCA) to the problem of online character recognition in Tamil Script. A novel set of features employing polynomial fits and quartiles in combination with conventional features are derived for each sample point of the Tamil character obtained after smoothing and resampling. These are stacked to form a matrix, using which a covariance matrix is constructed. A subset of the eigenvectors of the covariance matrix is employed to get the features in the reduced sub space. Each character is modeled as a separate subspace and a modified form of the Mahalanobis distance is derived to classify a given test character. Results indicate that the recognition accuracy using the 2DPCA scheme shows an approximate 3% improvement over the conventional PCA technique.
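
The covariance and projection steps described in this abstract can be sketched as follows. This is a minimal illustration of generic 2DPCA on synthetic data, not the authors' implementation: the array shapes, the function names `fit_2dpca` and `project_2dpca`, and the number of retained eigenvectors are assumptions, and the paper's feature set and modified Mahalanobis classifier are not reproduced.

```python
import numpy as np

def fit_2dpca(samples, n_components):
    """Fit 2DPCA on samples of shape (n_samples, n_points, n_features)."""
    mean = samples.mean(axis=0)
    centered = samples - mean
    # Covariance matrix built directly from the 2D feature matrices:
    # the average of A^T A over all centered samples A.
    cov = np.einsum("nij,nik->jk", centered, centered) / len(samples)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]

def project_2dpca(sample, projection):
    """Map one (n_points, n_features) matrix into the reduced subspace."""
    return sample @ projection

# Synthetic stand-in: 20 characters, 60 resampled points, 8 features per point.
rng = np.random.default_rng(0)
samples = rng.normal(size=(20, 60, 8))
mean, proj = fit_2dpca(samples, n_components=3)
reduced = project_2dpca(samples[0], proj)
print(reduced.shape)
```

Unlike conventional PCA, the feature matrix is never flattened into a single long vector, so the covariance matrix here is only 8 x 8.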

Application of principal component analysis to gas sensing characteristics of Cr0.8Fe0.2NbO4 thick film array

Relevance:

100.00%

Abstract:

The transient changes in resistance of Cr0.8Fe0.2NbO4 thick film sensors towards specified concentrations of H2, NH3, acetonitrile, acetone, alcohol, cyclohexane and petroleum gas at different operating temperatures were recorded. Analyte-specific characteristics, such as the slopes of the response and retrace curves, the area under the curve and the sensitivity deduced from the transient curve of the respective analyte gas, were used to construct a data matrix. Principal component analysis (PCA) was applied to this data and the score plot was obtained. This approach distinguishes one reducing gas from another, which is not possible by measuring relative changes in conductivity alone. The methodology is extended to an array of three Cr0.8Fe0.2NbO4 thick film sensors operated at different temperatures. (C) 2015 Elsevier B.V. All rights reserved.

Cluster analysis of rainfall stations of the Indian peninsula

Relevance:

100.00%

Abstract:

Principal component analysis is applied to derive patterns of temporal variation of the rainfall at fifty-three stations in peninsular India. The location of the stations in the coordinate space determined by the amplitudes of the two leading eigenvectors is used to delineate them into eight clusters. The clusters obtained seem to be stable with respect to variations in the grid of stations used. Stations within any cluster occur in geographically contiguous areas.
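
A rough sketch of the procedure in this abstract, under stated assumptions: the rainfall values are synthetic, the number of time steps is arbitrary, and since the abstract does not name the clustering rule, a plain k-means (the hypothetical helper `kmeans`) stands in for it.

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Very small k-means used only as a stand-in clustering rule."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

# Synthetic stand-in: 53 stations, 120 time steps of rainfall.
rng = np.random.default_rng(1)
rainfall = rng.gamma(shape=2.0, scale=50.0, size=(53, 120))

# Remove the all-station mean at each time step, then take the SVD.
centered = rainfall - rainfall.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)

# Rows of vt are temporal patterns; each station's amplitudes on the two
# leading patterns give its coordinates in a two-dimensional plane.
amplitudes = u[:, :2] * s[:2]

labels = kmeans(amplitudes, k=8)
print(amplitudes.shape, np.unique(labels).size)
```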

Fusion of Multisensor Data:Review and Comparative Analysis

Relevance:

100.00%

Abstract:

Image fusion is a formal framework which is expressed as means and tools for the alliance of multisensor, multitemporal, and multiresolution data. Multisource data vary in spectral, spatial and temporal resolutions, necessitating advanced analytical or numerical techniques for enhanced interpretation capabilities. This paper reviews seven pixel-based image fusion techniques: intensity-hue-saturation, Brovey, high pass filter (HPF), high pass modulation (HPM), principal component analysis, Fourier transform and correspondence analysis. Validation of these techniques on IKONOS data (panchromatic band at 1 m spatial resolution and 4 multispectral bands at 4 m spatial resolution) reveals that the HPF and HPM methods synthesise the images closest to those the corresponding multisensors would observe at the high resolution level.

Design of a FIR filter for image restoration using principal component neural networks

Relevance:

100.00%

Abstract:

Neural networks are used in many image denoising applications because of inherent characteristics such as nonlinear mapping and self-adaptiveness. The design of filters largely depends on a priori knowledge about the type of noise. Because of this, standard filters are application and image specific. Widely used filtering algorithms reduce noisy artifacts by smoothing. However, this operation normally results in smoothing of the edges as well. On the other hand, sharpening filters enhance the high frequency details, making the image non-smooth. An integrated general approach to designing a finite impulse response filter based on a principal component neural network (PCNN) is proposed in this study for image filtering, optimized in the sense of visual inspection and error metric. This algorithm exploits the inter-pixel correlation by iteratively updating the filter coefficients using the PCNN. It performs optimal smoothing of the noisy image while preserving high and low frequency features. Evaluation results show that the proposed filter is robust under various noise distributions. Further, the number of unknown parameters is very small and most of these parameters are adaptively obtained from the processed image.

Insights to urban dynamics through landscape spatial pattern analysis

Relevance:

100.00%

Abstract:

Urbanisation is a dynamic, complex phenomenon involving large-scale changes in land use at local levels. Analyses of changes in land use in urban environments provide a historical perspective of land use and give an opportunity to assess the spatial patterns, correlation, trends, rate and impacts of the change, which would help in better regional planning and good governance of the region. The main objective of this research is to quantify the urban dynamics using temporal remote sensing data with the help of well-established landscape metrics. Bangalore, being one of the rapidly urbanising landscapes in India, has been chosen for this investigation. The complex process of urban sprawl was modelled using spatio-temporal analysis. Land use analyses show 584% growth in built-up area during the last four decades, with a decline of vegetation by 66% and water bodies by 74%. Analyses of the temporal data reveal an increase in urban built-up area of 342.83% (during 1973-1992), 129.56% (during 1992-1999), 106.7% (1999-2002), 114.51% (2002-2006) and 126.19% (2006-2010). The study area was divided into four zones, and each zone was further divided into 17 concentric circles of 1 km incrementing radius to understand the patterns and extent of urbanisation at local levels. The urban density gradient illustrates a radial pattern of urbanisation for the period 1973-2010. Bangalore grew radially from 1973 to 2010, indicating that urbanisation is intensifying from the central core and has reached the periphery of Greater Bangalore. Shannon's entropy and alpha and beta population densities were computed to understand the level of urbanisation at local levels. Recent Shannon's entropy values confirm dispersed, haphazard urban growth in the city, particularly in its outskirts. This also illustrates the extent of influence of drivers of urbanisation in various directions. Landscape metrics provided in-depth knowledge about the sprawl. Principal component analysis helped in prioritizing the metrics for detailed analyses. The results clearly indicate that the whole landscape is aggregating into a large patch in 2010, compared to earlier years, which were dominated by several small patches. The large-scale conversion of small patches into a single large patch can be seen from 2006 to 2010. In 2010 patches are maximally aggregated, indicating that the city is becoming more compact and more urbanised in recent years. Bangalore was the most sought-after destination for its climatic conditions and the availability of various facilities (land availability, economy, political factors) compared to other cities. The growth into a single urban patch can be attributed to rapid urbanisation coupled with industrialisation. Monitoring of growth through landscape metrics helps to maintain and manage natural resources. (C) 2012 Elsevier B.V. All rights reserved.
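
The Shannon's entropy measure mentioned in this abstract can be illustrated with a short sketch. It assumes the common formulation H = -sum(p_i * log(p_i)), where p_i is the share of built-up area falling in ring i; the abstract does not state the exact formula used, and the ring values below are invented for illustration.

```python
import numpy as np

def shannon_entropy(built_up):
    """Entropy of the built-up area distribution across zones (natural log)."""
    p = np.asarray(built_up, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # zones with no built-up area contribute nothing
    return float(-(p * np.log(p)).sum())

n_rings = 17  # concentric circles per zone, as in the abstract

# Hypothetical extremes: all growth in the core ring vs. evenly spread growth.
concentrated = np.zeros(n_rings)
concentrated[0] = 1.0
dispersed = np.ones(n_rings)

print(shannon_entropy(concentrated))  # compact growth: entropy at its minimum
print(shannon_entropy(dispersed))     # dispersed growth: entropy equals log(17)
```

Values close to log(n) for n rings indicate dispersed growth, while values near zero indicate growth concentrated in a few rings.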

Genome-wide analysis correlates Ayurveda Prakriti

Relevance:

100.00%

Abstract:

The practice of Ayurveda, the traditional medicine of India, is based on the concept of three major constitutional types (Vata, Pitta and Kapha) defined as "Prakriti". To the best of our knowledge, no study has convincingly correlated genomic variations with the classification of Prakriti. In the present study, we performed genome-wide SNP (single nucleotide polymorphism) analysis (Affymetrix, 6.0) of 262 well-classified male individuals (after screening 3416 subjects) belonging to three Prakritis. We found that 52 SNPs (p <= 1 x 10^-5) were significantly different between Prakritis, without any confounding effect of stratification, after 10^6 permutations. Principal component analysis (PCA) of these SNPs classified the 262 individuals into their respective groups (Vata, Pitta and Kapha) irrespective of their ancestry, which demonstrates its power in categorization. We further validated our finding with 297 Indian population samples of known ancestry. Subsequently, we found that PGM1 correlates with the phenotype of Pitta as described in the ancient text of Caraka Samhita, suggesting that the phenotypic classification of India's traditional medicine has a genetic basis, and that its Prakriti-based practice, in vogue for many centuries, resonates with personalized medicine.

Speeding up AdaBoost Classifier with Random Projection

Relevance:

100.00%

Abstract:

The development of techniques for scaling up classifiers so that they can be applied to problems with large datasets of training examples is one of the objectives of data mining. Recently, AdaBoost has become popular in the machine learning community thanks to its promising results across a variety of applications. However, training AdaBoost on large datasets is a major problem, especially when the dimensionality of the data is very high. This paper discusses the effect of high dimensionality on the training process of AdaBoost. Two preprocessing options to reduce dimensionality, namely principal component analysis and random projection, are briefly examined. Random projection subject to a probabilistic length-preserving transformation is explored further as a computationally light preprocessing step. The experimental results demonstrate the effectiveness of the proposed training process for handling high-dimensional large datasets.
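
As an illustration of the preprocessing step described here, the sketch below applies a Gaussian random projection and checks that a pairwise distance is roughly preserved. It is a generic example on synthetic data: the function name `random_projection`, the dimensions and the Gaussian construction are assumptions, and the AdaBoost training stage from the paper is not shown.

```python
import numpy as np

def random_projection(X, target_dim, seed=0):
    """Project rows of X to target_dim dimensions with a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    # Scaling by 1/sqrt(target_dim) keeps squared lengths unchanged in expectation.
    R = rng.normal(size=(X.shape[1], target_dim)) / np.sqrt(target_dim)
    return X @ R

# Synthetic stand-in for a high-dimensional training set.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5000))
Z = random_projection(X, target_dim=500)

# Ratio of a pairwise distance after and before projection; expected near 1.
ratio = np.linalg.norm(Z[0] - Z[1]) / np.linalg.norm(X[0] - X[1])
print(Z.shape, round(ratio, 3))
```

Unlike PCA, no covariance matrix or eigendecomposition is needed, which is why the step is computationally light.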

Classification and rating of strong-motion earthquake records

Relevance:

100.00%

Abstract:

Ninety-two strong-motion earthquake records from the California region, U.S.A., have been statistically studied using principal component analysis in terms of twelve important standardized strong-motion characteristics. The first two principal components account for about 57 per cent of the total variance. Based on these two components the earthquake records are classified into nine groups in a two-dimensional principal component plane. Also a unidimensional engineering rating scale is proposed. The procedure can be used as an objective approach for classifying and rating future earthquakes.
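
The analysis outlined in this abstract, PCA on standardized characteristics followed by a two-dimensional score plane, can be sketched as below. The data are synthetic, so the variance fractions will not match the reported 57 per cent, and the grouping into nine classes is not reproduced.

```python
import numpy as np

# Synthetic stand-in: 92 records, 12 strong-motion characteristics.
rng = np.random.default_rng(3)
records = rng.normal(size=(92, 12))
records[:, 1] += 0.8 * records[:, 0]  # introduce some correlation

# Standardize each characteristic to zero mean and unit variance.
standardized = (records - records.mean(axis=0)) / records.std(axis=0)

# Eigendecomposition of the covariance of the standardized data.
cov = np.cov(standardized, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Fraction of total variance carried by each principal component.
explained = eigvals / eigvals.sum()

# Coordinates of each record in the plane of the first two components.
scores = standardized @ eigvecs[:, :2]
print(scores.shape, round(float(explained[:2].sum()), 3))
```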

Assessment of Nanoscience and Nanotechnology Initiatives in India

Relevance:

100.00%

Abstract:

Nanotechnology is a new technology which is generating a lot of interest among academicians, practitioners and scientists. Critical research is being carried out in this area all over the world. Governments are creating policy initiatives to promote developments in nanoscale science and technology. Private investment is also seeing a rising trend. A large number of academic institutions and national laboratories have set up research centers that are working on the multiple applications of nanotechnology. A wide range of applications is claimed for nanotechnology, from materials, chemicals, textiles and semiconductors to wonder drug delivery systems and diagnostics. Nanotechnology is considered to be the next big wave of technology after information technology and biotechnology. In fact, nanotechnology holds the promise of advances that exceed those achieved in recent decades in computers and biotechnology. Much of the interest in nanotechnology could also be because enormous monetary benefits are expected from nanotechnology-based products. According to NSF, revenues from nanotechnology could touch $1 trillion by 2015. However, many of the benefits are projected ones. Realizing the claimed benefits requires successful development of nanoscience and nanotechnology research efforts. That is, the journey from invention to innovation has to be completed. For this to happen, the technology has to flow from laboratory to market. Nanoscience and nanotechnology research efforts have to come out in the form of new products, new processes, and new platforms. India has also started its Nanoscience and Nanotechnology development program under its 10th Five Year Plan, and funds worth Rs. one billion have been allocated for Nanoscience and Nanotechnology Research and Development. The aim of the paper is to assess Nanoscience and Nanotechnology initiatives in India. We propose a conceptual model derived from the resource-based view of innovation. We have developed a structured questionnaire to measure the constructs in the conceptual model. Responses have been collected from 115 scientists and engineers working in the field of Nanoscience and Nanotechnology. The responses have been analyzed further by using Principal Component Analysis, Cluster Analysis and Regression Analysis.

Reducing the babel in plant volatile communication: using the forest to see the trees

Relevance:

100.00%

Abstract:

While plants of a single species emit a diversity of volatile organic compounds (VOCs) to attract or repel interacting organisms, these specific messages may be lost in the midst of the hundreds of VOCs produced by sympatric plants of different species, many of which may have no signal content. Receivers must be able to reduce the babel or noise in these VOCs in order to correctly identify the message. For chemical ecologists faced with vast amounts of data on volatile signatures of plants in different ecological contexts, it is imperative to employ accurate methods of classifying messages, so that suitable bioassays may then be designed to understand message content. We demonstrate the utility of 'Random Forests' (RF), a machine-learning algorithm, for the task of classifying volatile signatures and choosing the minimum set of volatiles for accurate discrimination, using data from sympatric Ficus species as a case study. We demonstrate the advantages of RF over conventional classification methods such as principal component analysis (PCA), as well as data-mining algorithms such as support vector machines (SVM), diagonal linear discriminant analysis (DLDA) and k-nearest neighbour (KNN) analysis. We show why a tree-building method such as RF, which is increasingly being used by the bioinformatics, food technology and medical communities, is particularly advantageous for the study of plant communication using volatiles, dealing, as it must, with abundant noise.

Nestedness pattern in stream diatom assemblages of central Western Ghats

Relevance:

100.00%

Abstract:

Community diversity and the population abundance of a particular group of species are controlled by the immediate environment, inter- and intra-species interactions, landscape conditions, historical events and evolutionary processes. Nestedness is a measure of order in an ecological system, referring to the order in which the number of species is related to area or other factors. In this study we have studied the nestedness pattern in stream diatom assemblages at 24 stream sites of the central Western Ghats, and report 98 taxa from the streams of the central Western Ghats region. The communities show a highly significant nested pattern. The Mantel test of matrix revealed a strong relationship between species assemblages and environmental conditions at the sites. A significant relationship between species assemblage and environmental condition was observed. Principal component analysis (PCA) indicated that environmental conditions differed markedly across the sampling sites, with the first three components explaining 78% of variance. Species composition of diatoms is significantly correlated with environmental distance across geographical extent. The current pattern suggests that micro-environment at regional levels influences the species composition of epilithic diatoms in streams. The nestedness shown by the diatom community was highly significant, even though it had a high proportion of idiosyncratic species, characterized by high numbers of cosmopolitan species, whereas the nested species were dominated by endemic species. PCA identifies ionic parameters and nutrients as the major features which determine the characteristics of the sampling sites. Hence the local water quality parameters are the major factors in deciding the diatom species assemblages.

Multivariate nonlinear ensemble prediction of daily chaotic rainfall with climate inputs

Relevance:

100.00%

Abstract:

The basic characteristic of a chaotic system is its sensitivity to the infinitesimal changes in its initial conditions. A limit to predictability in chaotic system arises mainly due to this sensitivity and also due to the ineffectiveness of the model to reveal the underlying dynamics of the system. In the present study, an attempt is made to quantify these uncertainties involved and thereby improve the predictability by adopting a multivariate nonlinear ensemble prediction. Daily rainfall data of Malaprabha basin, India for the period 1955-2000 is used for the study. It is found to exhibit a low dimensional chaotic nature with the dimension varying from 5 to 7. A multivariate phase space is generated, considering a climate data set of 16 variables. The chaotic nature of each of these variables is confirmed using false nearest neighbor method. The redundancy, if any, of this atmospheric data set is further removed by employing principal component analysis (PCA) method and thereby reducing it to eight principal components (PCs). This multivariate series (rainfall along with eight PCs) is found to exhibit a low dimensional chaotic nature with dimension 10. Nonlinear prediction employing local approximation method is done using univariate series (rainfall alone) and multivariate series for different combinations of embedding dimensions and delay times. The uncertainty in initial conditions is thus addressed by reconstructing the phase space using different combinations of parameters. The ensembles generated from multivariate predictions are found to be better than those from univariate predictions. The uncertainty in predictions is decreased or in other words predictability is increased by adopting multivariate nonlinear ensemble prediction. The restriction on predictability of a chaotic series can thus be altered by quantifying the uncertainty in the initial conditions and also by including other possible variables, which may influence the system. 
(C) 2011 Elsevier B.V. All rights reserved.
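
The phase-space reconstruction referred to in this abstract relies on embedding dimensions and delay times. A minimal sketch of time-delay embedding follows; the function name `delay_embed` and the toy series are illustrative assumptions, and the local approximation predictor and ensemble generation are not shown.

```python
import numpy as np

def delay_embed(series, dim, delay):
    """Build delay vectors [x(t), x(t + delay), ..., x(t + (dim - 1) * delay)]."""
    series = np.asarray(series)
    n_vectors = len(series) - (dim - 1) * delay
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and delay")
    return np.column_stack(
        [series[i * delay : i * delay + n_vectors] for i in range(dim)]
    )

# Toy series; a real study would use the daily rainfall record.
series = np.arange(10)
embedded = delay_embed(series, dim=3, delay=2)
print(embedded.shape)
print(embedded[0])
```

Repeating the embedding for several (dim, delay) combinations gives the different reconstructed phase spaces from which an ensemble of predictions could be drawn.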

Optimal feature extraction for bilingual OCR

Relevance:

100.00%

Abstract:

Feature extraction in bilingual OCR is handicapped by the increase in the number of classes or characters to be handled. This is evident in the case of Indian languages, whose alphabet set is large. It is expected that the complexity of the feature extraction process increases with the number of classes. Though the best set of features cannot be ascertained through any quantitative measure, the characteristics of the scripts can help decide on the feature extraction procedure. This paper describes a hierarchical feature extraction scheme for recognition of printed bilingual (Tamil and Roman) text. The scheme divides the combined alphabet set of both scripts into subsets by the extraction of certain spatial and structural features. Three features, viz. geometric moments, DCT-based features and wavelet transform-based features, are extracted from the grouped symbols, and a linear transformation is performed on them for the purpose of efficient representation in the feature space. The transformation is obtained by the maximization of certain criterion functions. Three techniques, namely principal component analysis, maximization of Fisher's ratio and maximization of a divergence measure, have been employed to estimate the transformation matrix. It has been observed that the proposed hierarchical scheme allows for easier handling of the alphabets and that there is an appreciable rise in the recognition accuracy as a result of the transformations.

Assessment of genetic diversity and identification of core collection in sandalwood germplasm using RAPDS

Relevance:

100.00%

Abstract:

Sandalwood is an economically important aromatic tree belonging to the family Santalaceae. The trees are used mainly for their fragrant heartwood and oil, which have immense potential for foreign exchange. Very little information is available on the genetic diversity in this species. Hence studies were initiated and genetic diversity was estimated using RAPD markers in 51 genotypes of Santalum album procured from different geographical regions of India and three exotic lines of S. spicatum from Australia. Eleven selected Operon primers (10-mer) generated a total of 156 consistent and unambiguous amplification products ranging from 200 bp to 4 kb. Rare and genotype-specific bands were identified which could be effectively used to distinguish the genotypes. Genetic relationships within the genotypes were evaluated by generating a dissimilarity matrix based on Ward's method (squared Euclidean distance). The phenetic dendrogram and the principal component analysis separated the 51 Indian genotypes from the three Australian lines. The cluster analysis indicated that sandalwood germplasm within India constitutes a broad genetic base, with values of genetic dissimilarity ranging from 15 to 91%. A core collection of 21 selected individuals revealed the same diversity as the entire population. The results show that RAPD analysis is an efficient marker technology for estimating genetic diversity and relatedness, thereby enabling the formulation of appropriate strategies for conservation, germplasm management, and selection of diverse parents for sandalwood improvement programmes.
