32 results for principal component regression

at Indian Institute of Science - Bangalore - India


Relevance:

100.00%

Publisher:

Abstract:

The neural network finds application in many image denoising tasks because of inherent characteristics such as nonlinear mapping and self-adaptiveness. The design of filters largely depends on a priori knowledge about the type of noise, so standard filters are application and image specific. Widely used filtering algorithms reduce noisy artifacts by smoothing; however, this operation normally smooths the edges as well. On the other hand, sharpening filters enhance the high-frequency details, making the image non-smooth. An integrated, general approach to designing a finite impulse response filter based on a principal component neural network (PCNN) is proposed in this study for image filtering, optimized in the sense of visual inspection and error metrics. The algorithm exploits the inter-pixel correlation by iteratively updating the filter coefficients using the PCNN, and it performs optimal smoothing of the noisy image while preserving both high- and low-frequency features. Evaluation results show that the proposed filter is robust under various noise distributions. Further, the number of unknown parameters is small, and most of them are adaptively obtained from the processed image.
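
As an illustration of the principal component neural network idea only (not the authors' exact filter design), the sketch below uses Oja's rule, a classic single-neuron principal component learner, to estimate the leading principal component of 3x3 neighbourhoods of a synthetic noisy image and then applies the learned weights as an FIR kernel; the image, learning rate and kernel size are all assumptions.

import numpy as np

rng = np.random.default_rng(0)
base = np.outer(np.linspace(0, 1, 64), np.linspace(0, 1, 64))
img = base + 0.1 * rng.normal(size=(64, 64))          # synthetic smooth image + noise

patches = np.lib.stride_tricks.sliding_window_view(img, (3, 3)).reshape(-1, 9)
patches = patches - patches.mean(axis=0)              # centre each pixel position

w = rng.normal(size=9)
w /= np.linalg.norm(w)
eta = 1e-2
for x in patches:                                     # Oja's rule: w += eta*y*(x - y*w)
    y = w @ x
    w += eta * y * (x - y * w)

kernel = (w / w.sum()).reshape(3, 3)                  # unit-gain FIR smoothing kernel
denoised = np.einsum('ijkl,kl->ij',                   # valid convolution over 3x3 windows
                     np.lib.stride_tricks.sliding_window_view(img, (3, 3)), kernel)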

Relevance:

100.00%

Publisher:

Abstract:

This paper presents a new application of two dimensional Principal Component Analysis (2DPCA) to the problem of online character recognition in Tamil Script. A novel set of features employing polynomial fits and quartiles in combination with conventional features are derived for each sample point of the Tamil character obtained after smoothing and resampling. These are stacked to form a matrix, using which a covariance matrix is constructed. A subset of the eigenvectors of the covariance matrix is employed to get the features in the reduced sub space. Each character is modeled as a separate subspace and a modified form of the Mahalanobis distance is derived to classify a given test character. Results indicate that the recognition accuracy using the 2DPCA scheme shows an approximate 3% improvement over the conventional PCA technique.
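
A minimal 2DPCA sketch under assumed dimensions (random matrices stand in for the stacked per-point features; the per-class subspaces and modified Mahalanobis distance of the paper are not reproduced):

import numpy as np

rng = np.random.default_rng(1)
samples = rng.random((50, 20, 8))            # 50 characters, each a 20x8 feature matrix
mean_mat = samples.mean(axis=0)

cov = np.zeros((8, 8))                       # "image covariance" matrix built row-wise
for A in samples:
    D = A - mean_mat
    cov += D.T @ D
cov /= len(samples)

vals, vecs = np.linalg.eigh(cov)             # eigh returns ascending eigenvalues
proj = vecs[:, -3:]                          # keep the 3 leading eigenvectors
features = np.array([A @ proj for A in samples])   # each sample reduced to a 20x3 matrix
# Classification would then compare a test matrix against per-class subspaces.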

Relevance:

100.00%

Publisher:

Abstract:

The transient changes in resistance of Cr0.8Fe0.2NbO4 thick film sensors towards specified concentrations of H-2, NH3, acetonitrile, acetone, alcohol, cyclohexane and petroleum gas at different operating temperatures were recorded. Analyte-specific characteristics, such as the slopes of the response and retrace curves, the area under the curve and the sensitivity deduced from the transient curve of the respective analyte gas, were used to construct a data matrix. Principal component analysis (PCA) was applied to these data and the score plot was obtained. Distinguishing one reducing gas from another is demonstrated based on this approach, which is otherwise not possible by measuring relative changes in conductivity alone. The methodology is extended to an array of three Cr0.8Fe0.2NbO4 thick film sensors operated at different temperatures. (C) 2015 Elsevier B.V. All rights reserved.
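
A hedged sketch of the score-plot step, with random placeholders for the transient-derived descriptors (response slope, retrace slope, area under the curve, sensitivity) and scikit-learn assumed for the PCA:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
X = rng.random((30, 4))                    # 30 exposures x 4 transient descriptors (placeholder values)
labels = rng.integers(0, 3, size=30)       # hypothetical gas identities

scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))
# Plotting scores[:, 0] against scores[:, 1], coloured by `labels`, gives the kind
# of score plot used to separate one reducing gas from another.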

Relevance:

90.00%

Publisher:

Abstract:

Nanotechnology is a new technology that is generating a lot of interest among academics, practitioners and scientists, and critical research is being carried out in this area all over the world. Governments are creating policy initiatives to promote developments in nanoscale science and technology, and private investment is also on a rising trend. A large number of academic institutions and national laboratories have set up research centers working on the multiple applications of nanotechnology. A wide range of applications is claimed for nanotechnology, from materials, chemicals, textiles and semiconductors to drug delivery systems and diagnostics. Nanotechnology is considered the next big wave of technology after information technology and biotechnology; in fact, it holds the promise of advances that exceed those achieved in recent decades in computers and biotechnology. Much of the interest in nanotechnology also stems from the enormous monetary benefits expected from nanotechnology-based products: according to the NSF, revenues from nanotechnology could touch $1 trillion by 2015. However, much of these benefits are still projections. Realizing the claimed benefits requires successful development of nanoscience and nanotechnology research efforts; that is, the journey from invention to innovation has to be completed, and the technology has to flow from laboratory to market. Nanoscience and nanotechnology research efforts have to result in new products, new processes and new platforms. India has also started its nanoscience and nanotechnology development program under its 10th Five Year Plan, and funds worth Rs 1 billion have been allocated for nanoscience and nanotechnology research and development. The aim of this paper is to assess nanoscience and nanotechnology initiatives in India. We propose a conceptual model derived from the resource-based view of innovation and have developed a structured questionnaire to measure the constructs in the conceptual model. Responses were collected from 115 scientists and engineers working in the field of nanoscience and nanotechnology and analyzed using principal component analysis, cluster analysis and regression analysis.

Relevance:

90.00%

Publisher:

Abstract:

Detecting and quantifying the presence of human-induced climate change in regional hydrology is important for studying the impacts of such changes on water resources systems, as well as for reliable future projections and policy making for adaptation. In this article a formal fingerprint-based detection and attribution analysis is attempted to study the changes in the observed monsoon precipitation and streamflow in the rain-fed Mahanadi River Basin in India, considering the variability across different climate models. This is achieved through the use of observations, several climate model runs, a principal component analysis and regression based statistical downscaling technique, and a Genetic Programming based rainfall-runoff model. It is found that the decreases in the observed hydrological variables across the second half of the 20th century lie outside the range expected from natural internal variability of climate alone at the 95% confidence level, for most of the climate models considered. For several climate models, such changes are consistent with those expected from anthropogenic emissions of greenhouse gases. However, unequivocal attribution to human-induced climate change cannot be claimed across all the climate models, and uncertainties in the detection procedure, arising from various sources including the use of models, cannot be ruled out. Changes in solar irradiance and volcanic activity are considered as other plausible natural external causes of climate change. The time evolution of the anthropogenic climate change "signal" in the hydrological observations, above the natural internal climate variability "noise", shows that detection is achieved earlier in streamflow than in precipitation for most of the climate models, suggesting larger impacts of human-induced climate change on streamflow than on precipitation at the river basin scale.
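
A deliberately simplified sketch of the detection logic only (not the optimal fingerprint formalism of the paper): an observed trend is compared with the spread of trends from unforced control runs, with all data synthetic here.

import numpy as np

rng = np.random.default_rng(3)
years = np.arange(50)
observed = 100 - 0.3 * years + rng.normal(0, 2, size=50)   # synthetic declining flow series
control = rng.normal(0, 2, size=(1000, 50))                # unforced internal variability only

def trend(series):
    return np.polyfit(years, series, 1)[0]                 # linear slope per year

obs_trend = trend(observed)
ctrl_trends = np.array([trend(run) for run in control])
lo, hi = np.percentile(ctrl_trends, [2.5, 97.5])
detected = not (lo <= obs_trend <= hi)      # outside the natural range at roughly the 95% level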

Relevance:

90.00%

Publisher:

Abstract:

The objective of this work is to develop downscaling methodologies to obtain a long time record of inundation extent at high spatial resolution, based on the existing low-spatial-resolution results of the Global Inundation Extent from Multi-Satellites (GIEMS) dataset. In semiarid regions, high-spatial-resolution a priori information can be provided by visible and infrared observations from the Moderate Resolution Imaging Spectroradiometer (MODIS). The study concentrates on the Inner Niger Delta, where MODIS-derived inundation extent has been estimated at 500-m resolution. The space-time variability is first analyzed using a principal component analysis (PCA), which is particularly effective for understanding the inundation variability, interpolating in time, and filling in missing values. Two innovative methods are developed (linear regression and matrix inversion), both based on the PCA representation. These GIEMS downscaling techniques have been calibrated using the 500-m MODIS data, and the downscaled fields show the expected space-time behaviors from MODIS. A 20-yr dataset of the inundation extent at 500 m is derived from this analysis for the Inner Niger Delta. The methods are very general and may be applied to many basins and to variables other than inundation, provided enough a priori high-spatial-resolution information is available. The derived high-spatial-resolution dataset will be used in the framework of the Surface Water and Ocean Topography (SWOT) mission to develop and test the instrument simulator, as well as to select the calibration/validation sites (with high space-time inundation variability). In addition, once SWOT observations are available, the downscaling methodology will be calibrated on them in order to downscale the GIEMS dataset and to extend the SWOT benefits back in time to 1993.
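
A conceptual sketch of the linear regression variant of the downscaling (random arrays stand in for the GIEMS and MODIS fields; scikit-learn is assumed):

import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(4)
T, coarse_px, fine_px = 120, 16, 400
giems = rng.random((T, coarse_px))          # low-resolution inundation fractions (placeholder)
modis = rng.random((T, fine_px))            # high-resolution training fields (placeholder)

pca = PCA(n_components=5).fit(modis)        # PCA of the high-resolution fields
pcs = pca.transform(modis)                  # their temporal principal components

reg = LinearRegression().fit(giems, pcs)    # relate the coarse fields to the PCs
downscaled = pca.inverse_transform(reg.predict(giems))   # reconstruct high-resolution fields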

Relevance:

90.00%

Publisher:

Abstract:

Several statistical downscaling models have been developed in the past couple of decades to assess the hydrologic impacts of climate change by projecting station-scale hydrological variables from the large-scale atmospheric variables simulated by general circulation models (GCMs). This paper presents and compares different statistical downscaling models that use multiple linear regression (MLR), positive coefficient regression (PCR), stepwise regression (SR), and support vector machine (SVM) techniques for estimating monthly rainfall amounts in the state of Florida. Mean sea level pressure, air temperature, geopotential height, specific humidity, U wind, and V wind are used as the explanatory variables/predictors in the downscaling models. Data for these variables are obtained from the National Centers for Environmental Prediction-National Center for Atmospheric Research (NCEP-NCAR) reanalysis dataset and from simulations of the Canadian Centre for Climate Modelling and Analysis (CCCma) Coupled Global Climate Model, version 3 (CGCM3). Principal component analysis (PCA) and the fuzzy c-means clustering method (FCM) are used as part of the downscaling model to reduce the dimensionality of the dataset and to identify clusters in the data, respectively. Evaluation of the performance of the models using different error and statistical measures indicates that the SVM-based model performed better than all the other models in reproducing most monthly rainfall statistics at 18 sites. Output from the third-generation CGCM3 GCM for the A1B scenario was used for future projections. For the projection period 2001-10, MLR was used to relate variables at the GCM and NCEP grid scales. Using MLR to link the predictor variables at the two grid scales yielded better reproduction of monthly rainfall statistics at most of the stations (12 out of 18) than the spatial interpolation technique used in earlier studies.
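
A minimal sketch of the PCA-plus-SVM downscaling chain, with random placeholders for the NCEP-NCAR predictors and station rainfall and with hyperparameters chosen arbitrarily:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVR

rng = np.random.default_rng(5)
predictors = rng.random((360, 60))          # 360 months x 60 grid-point predictor values
rainfall = rng.random(360) * 300            # monthly rainfall at one station (mm, placeholder)

model = make_pipeline(StandardScaler(), PCA(n_components=10), SVR(C=10.0))
model.fit(predictors[:300], rainfall[:300]) # calibrate on the first 300 months
projected = model.predict(predictors[300:]) # downscale the remaining months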

Relevance:

80.00%

Publisher:

Abstract:

The development of techniques for scaling up classifiers so that they can be applied to problems with large datasets of training examples is one of the objectives of data mining. Recently, AdaBoost has become popular in the machine learning community thanks to its promising results across a variety of applications. However, training AdaBoost on large datasets is a major problem, especially when the dimensionality of the data is very high. This paper discusses the effect of high dimensionality on the training process of AdaBoost. Two preprocessing options for reducing dimensionality, namely principal component analysis and random projection, are briefly examined. Random projection, subject to a probabilistic length-preserving transformation, is explored further as a computationally light preprocessing step. The experimental results obtained demonstrate the effectiveness of the proposed training process for handling high-dimensional large datasets.
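
A short sketch of the random projection preprocessing (Gaussian projections approximately preserve pairwise distances, per the Johnson-Lindenstrauss lemma) followed by AdaBoost, using scikit-learn on a synthetic high-dimensional dataset:

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.random_projection import GaussianRandomProjection

# Synthetic high-dimensional data standing in for a large real dataset.
X, y = make_classification(n_samples=1000, n_features=2000,
                           n_informative=50, random_state=0)

# Project to 200 dimensions before boosting to lighten training.
X_low = GaussianRandomProjection(n_components=200, random_state=0).fit_transform(X)
clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_low, y)
print(clf.score(X_low, y))                 # training accuracy on the reduced data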

Relevance:

80.00%

Publisher:

Abstract:

Principal component analysis is applied to derive patterns of temporal variation of the rainfall at fifty-three stations in peninsular India. The location of the stations in the coordinate space determined by the amplitudes of the two leading eigenvectors is used to delineate them into eight clusters. The clusters obtained seem to be stable with respect to variations in the grid of stations used. Stations within any cluster occur in geographically contiguous areas.
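
A minimal sketch of the grouping idea on synthetic data; k-means stands in here for the delineation of clusters in the plane of the two leading amplitudes, which the paper does not necessarily perform this way:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(6)
rain = rng.random((53, 480))               # 53 stations x 480 months of rainfall (placeholder)

amps = PCA(n_components=2).fit_transform(rain)        # station amplitudes on the two leading eigenvectors
groups = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(amps)   # eight station clusters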

Relevance:

80.00%

Publisher:

Abstract:

Ninety-two strong-motion earthquake records from the California region, U.S.A., have been statistically studied using principal component analysis in terms of twelve important standardized strong-motion characteristics. The first two principal components account for about 57 per cent of the total variance. Based on these two components, the earthquake records are classified into nine groups in a two-dimensional principal component plane, and a unidimensional engineering rating scale is also proposed. The procedure can be used as an objective approach for classifying and rating future earthquakes.

Relevance:

80.00%

Publisher:

Abstract:

Image fusion is a formal framework comprising means and tools for the alliance of multisensor, multitemporal, and multiresolution data. Multisource data vary in spectral, spatial and temporal resolution, necessitating advanced analytical or numerical techniques for enhanced interpretation capabilities. This paper reviews seven pixel-based image fusion techniques: intensity-hue-saturation, Brovey, high pass filter (HPF), high pass modulation (HPM), principal component analysis, Fourier transform and correspondence analysis. Validation of these techniques on IKONOS data (panchromatic band at 1 m spatial resolution and four multispectral bands at 4 m spatial resolution) reveals that the HPF and HPM methods synthesize the images closest to those the corresponding multisensors would observe at the high resolution level.
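
An illustrative sketch of the HPF and HPM fusion rules on synthetic arrays (a box filter is assumed as the low-pass step; the paper's exact filter is not specified here):

import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(7)
pan = rng.random((256, 256))               # stand-in for the 1 m panchromatic band
ms = rng.random((4, 256, 256))             # four multispectral bands resampled to the pan grid

lowpass = uniform_filter(pan, size=5)      # low-pass version of the pan band
detail = pan - lowpass                     # high-frequency spatial detail
fused_hpf = ms + detail                    # HPF: inject the detail additively into every band
fused_hpm = ms * (pan / (lowpass + 1e-6))  # HPM: modulate each band by the pan-to-lowpass ratio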

Relevance:

80.00%

Publisher:

Abstract:

While plants of a single species emit a diversity of volatile organic compounds (VOCs) to attract or repel interacting organisms, these specific messages may be lost in the midst of the hundreds of VOCs produced by sympatric plants of different species, many of which may have no signal content. Receivers must be able to reduce the babel or noise in these VOCs in order to correctly identify the message. For chemical ecologists faced with vast amounts of data on the volatile signatures of plants in different ecological contexts, it is imperative to employ accurate methods of classifying messages, so that suitable bioassays may then be designed to understand message content. We demonstrate the utility of `Random Forests' (RF), a machine-learning algorithm, for the task of classifying volatile signatures and choosing the minimum set of volatiles for accurate discrimination, using data from sympatric Ficus species as a case study. We demonstrate the advantages of RF over conventional classification methods such as principal component analysis (PCA), as well as data-mining algorithms such as support vector machines (SVM), diagonal linear discriminant analysis (DLDA) and k-nearest neighbour (KNN) analysis. We show why a tree-building method such as RF, which is increasingly being used by the bioinformatics, food technology and medical communities, is particularly advantageous for the study of plant communication using volatiles, dealing, as it must, with abundant noise.
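
A short sketch of the Random Forest workflow on a random placeholder profile matrix, using out-of-bag accuracy and feature importances to rank volatiles; the real volatile data and species labels are not reproduced:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(8)
profiles = rng.random((90, 60))            # 90 samples x 60 volatile compounds (placeholder)
species = rng.integers(0, 3, size=90)      # three hypothetical Ficus species

rf = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
rf.fit(profiles, species)
ranked = np.argsort(rf.feature_importances_)[::-1]   # volatiles ranked by discriminating power
print(rf.oob_score_, ranked[:10])          # out-of-bag accuracy and a candidate minimal set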

Relevance:

80.00%

Publisher:

Abstract:

Community diversity and the population abundance of a particular group of species are controlled by the immediate environment, inter- and intra-species interactions, landscape conditions, historical events and evolutionary processes. Nestedness is a measure of order in an ecological system, referring to the order in which the number of species is related to area or other factors. In this study we examined the nestedness pattern of stream diatom assemblages at 24 stream sites of the central Western Ghats and report 98 taxa from the streams of the region. The communities show a highly significant nested pattern. A Mantel test on the matrices revealed a strong relationship between species assemblages and environmental conditions at the sites. Principal component analysis (PCA) indicated that environmental conditions differed markedly across the sampling sites, with the first three components explaining 78% of the variance. Species composition of diatoms is significantly correlated with environmental distance across the geographical extent. The current pattern suggests that the micro-environment at regional levels influences the species composition of epilithic diatoms in streams. The nestedness shown by the diatom community was highly significant even though it had a high proportion of idiosyncratic species, characterized by high numbers of cosmopolitan species, whereas the nested species were dominated by endemic species. PCA identified ionic parameters and nutrients as the major features determining the characteristics of the sampling sites; hence local water quality parameters are the major factors deciding the diatom species assemblages.
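
A simplified permutation Mantel test sketch on random site-by-taxon and site-by-environment tables (Bray-Curtis and Euclidean distances are assumed choices, not necessarily those of the paper):

import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(9)
comm = squareform(pdist(rng.random((24, 98)), metric='braycurtis'))   # community distances, 24 sites
env = squareform(pdist(rng.random((24, 10)), metric='euclidean'))     # environmental distances

iu = np.triu_indices(24, k=1)
r_obs = np.corrcoef(comm[iu], env[iu])[0, 1]          # observed matrix correlation

perm_r = []
for _ in range(999):                                  # permute sites in one matrix
    p = rng.permutation(24)
    perm_r.append(np.corrcoef(comm[np.ix_(p, p)][iu], env[iu])[0, 1])
p_value = (np.sum(np.array(perm_r) >= r_obs) + 1) / 1000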

Relevance:

80.00%

Publisher:

Abstract:

The basic characteristic of a chaotic system is its sensitivity to infinitesimal changes in its initial conditions. A limit to predictability in a chaotic system arises mainly from this sensitivity and also from the ineffectiveness of the model in revealing the underlying dynamics of the system. In the present study, an attempt is made to quantify these uncertainties and thereby improve the predictability by adopting a multivariate nonlinear ensemble prediction. Daily rainfall data of the Malaprabha basin, India, for the period 1955-2000 are used for the study and found to exhibit a low-dimensional chaotic nature, with the dimension varying from 5 to 7. A multivariate phase space is generated, considering a climate dataset of 16 variables. The chaotic nature of each of these variables is confirmed using the false nearest neighbour method. The redundancy, if any, of this atmospheric dataset is then removed by employing the principal component analysis (PCA) method, reducing it to eight principal components (PCs). This multivariate series (rainfall along with the eight PCs) is found to exhibit a low-dimensional chaotic nature with dimension 10. Nonlinear prediction employing the local approximation method is carried out using the univariate series (rainfall alone) and the multivariate series for different combinations of embedding dimensions and delay times. The uncertainty in initial conditions is thus addressed by reconstructing the phase space using different combinations of parameters. The ensembles generated from multivariate predictions are found to be better than those from univariate predictions. The uncertainty in predictions is decreased, or in other words the predictability is increased, by adopting multivariate nonlinear ensemble prediction. The restriction on the predictability of a chaotic series can thus be relaxed by quantifying the uncertainty in the initial conditions and also by including other possible variables that may influence the system. (C) 2011 Elsevier B.V. All rights reserved.
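
A sketch of delay embedding plus local-approximation prediction on a synthetic chaotic series (the logistic map); embedding dimension, delay and neighbour count are arbitrary choices, not those tuned in the paper:

import numpy as np

x = np.empty(2000)
x[0] = 0.3
for t in range(1999):                       # chaotic logistic map standing in for rainfall
    x[t + 1] = 3.9 * x[t] * (1 - x[t])

m, tau, k = 3, 1, 5                         # embedding dimension, delay, neighbours
idx = np.arange((m - 1) * tau, len(x) - 1)  # indices with a full history and a known successor
states = np.stack([x[i - (m - 1) * tau:i + 1:tau] for i in idx])   # reconstructed phase space
targets = x[idx + 1]

query = states[-1]                          # the current state
d = np.linalg.norm(states[:-1] - query, axis=1)
nearest = np.argsort(d)[:k]                 # k nearest past states
prediction = targets[:-1][nearest].mean()   # local-approximation forecast of the next value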

Relevance:

80.00%

Publisher:

Abstract:

Feature extraction in bilingual OCR is handicapped by the increase in the number of classes or characters to be handled. This is evident in the case of Indian languages, whose alphabet sets are large. The complexity of the feature extraction process is expected to increase with the number of classes. Though the best set of features to use cannot be ascertained through any quantitative measure, the characteristics of the scripts can help decide on the feature extraction procedure. This paper describes a hierarchical feature extraction scheme for the recognition of printed bilingual (Tamil and Roman) text. The scheme divides the combined alphabet set of both scripts into subsets by the extraction of certain spatial and structural features. Three types of features, viz. geometric moments, DCT-based features and wavelet transform based features, are extracted from the grouped symbols, and a linear transformation is performed on them for efficient representation in the feature space. The transformation is obtained by maximizing certain criterion functions. Three techniques, namely principal component analysis, maximization of Fisher's ratio and maximization of a divergence measure, have been employed to estimate the transformation matrix. It has been observed that the proposed hierarchical scheme allows for easier handling of the alphabets, and there is an appreciable rise in recognition accuracy as a result of the transformations.
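
A sketch of the Fisher-ratio transformation step on random placeholder features: the matrix maximizing the ratio of between-class to within-class scatter comes from a generalized eigenproblem. The moment/DCT/wavelet features and symbol grouping of the paper are not reproduced.

import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(10)
feats = rng.random((200, 30))               # 200 grouped symbols x 30 features (placeholder)
cls = rng.integers(0, 5, size=200)          # five hypothetical symbol classes

mean_all = feats.mean(axis=0)
Sw = np.zeros((30, 30))                     # within-class scatter
Sb = np.zeros((30, 30))                     # between-class scatter
for c in np.unique(cls):
    Xc = feats[cls == c]
    mc = Xc.mean(axis=0)
    Sw += (Xc - mc).T @ (Xc - mc)
    d = (mc - mean_all)[:, None]
    Sb += len(Xc) * (d @ d.T)

vals, vecs = eigh(Sb, Sw)                   # generalized eigenvectors of (Sb, Sw)
W = vecs[:, -4:]                            # directions with the largest Fisher ratio
transformed = feats @ W                     # features in the transformed space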