30 results for principal component analysis (PCA)


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The development of techniques for scaling up classifiers so that they can be applied to problems with large datasets of training examples is one of the objectives of data mining. Recently, AdaBoost has become popular in the machine learning community thanks to its promising results across a variety of applications. However, training AdaBoost on large datasets is a major problem, especially when the dimensionality of the data is very high. This paper discusses the effect of high dimensionality on the training process of AdaBoost. Two preprocessing options for reducing dimensionality, namely principal component analysis and random projection, are briefly examined. Random projection subject to a probabilistic length-preserving transformation is explored further as a computationally light preprocessing step. The experimental results demonstrate the effectiveness of the proposed training process for handling large, high-dimensional datasets.
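A minimal sketch of a length-preserving random projection, for illustration only: it is not the paper's implementation. The Gaussian matrix scaled by 1/sqrt(k), the dimensions `d` and `k`, and all function names are assumptions chosen for the example.

```python
import math
import random


def random_projection_matrix(d, k, rng):
    """Return a k x d matrix of N(0, 1) entries scaled by 1/sqrt(k)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, x):
    """Multiply the projection matrix by the vector x."""
    return [sum(r * c for r, c in zip(row, x)) for row in matrix]


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))


rng = random.Random(0)          # fixed seed so the run is repeatable
d, k = 1000, 400                # hypothetical original and reduced dimensions
x = [rng.uniform(-1.0, 1.0) for _ in range(d)]
R = random_projection_matrix(d, k, rng)
y = project(R, x)

# The scaling makes the expected squared length of y equal that of x,
# so the ratio should be close to 1 with high probability.
ratio = norm(y) / norm(x)
print(f"length ratio after projection: {ratio:.3f}")
```

Unlike principal component analysis, this step needs no covariance estimate or eigendecomposition, which is what makes it computationally light.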

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Ninety-two strong-motion earthquake records from the California region, U.S.A., have been statistically studied using principal component analysis in terms of twelve important standardized strong-motion characteristics. The first two principal components account for about 57 per cent of the total variance. Based on these two components, the earthquake records are classified into nine groups in a two-dimensional principal component plane. A unidimensional engineering rating scale is also proposed. The procedure can be used as an objective approach for classifying and rating future earthquakes.
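As a worked illustration of the 57 per cent figure, assume the analysis was run on the correlation matrix of the twelve standardized characteristics (an assumption, not stated in the abstract). Each standardized variable then has unit variance, so the eigenvalues \lambda_i sum to 12, and the share explained by the first two components implies their combined eigenvalue:

```latex
\frac{\lambda_1 + \lambda_2}{\sum_{i=1}^{12} \lambda_i}
  = \frac{\lambda_1 + \lambda_2}{12} \approx 0.57
\quad\Longrightarrow\quad
\lambda_1 + \lambda_2 \approx 0.57 \times 12 = 6.84
```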

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Nanotechnology is a new technology that is generating a lot of interest among academicians, practitioners and scientists. Critical research is being carried out in this area all over the world. Governments are creating policy initiatives to promote developments in nanoscale science and technology. Private investment is also on the rise. A large number of academic institutions and national laboratories have set up research centers that are working on the multiple applications of nanotechnology. A wide range of applications is claimed for nanotechnology, from materials, chemicals, textiles and semiconductors to wonder drug delivery systems and diagnostics. Nanotechnology is considered to be the next big wave of technology after information technology and biotechnology. In fact, nanotechnology holds the promise of advances that exceed those achieved in recent decades in computers and biotechnology. Much of the interest in nanotechnology could also be because enormous monetary benefits are expected from nanotechnology-based products. According to the NSF, revenues from nanotechnology could touch $1 trillion by 2015. However, most of these benefits are projected ones. Realizing the claimed benefits requires successful development of nanoscience and nanotechnology research efforts. That is, the journey from invention to innovation has to be completed. For this to happen, the technology has to flow from laboratory to market. Nanoscience and nanotechnology research efforts have to come out in the form of new products, new processes, and new platforms. India has also started its Nanoscience and Nanotechnology development program under its 10th Five Year Plan, and funds worth Rs. one billion have been allocated for Nanoscience and Nanotechnology Research and Development. The aim of the paper is to assess Nanoscience and Nanotechnology initiatives in India. We propose a conceptual model derived from the resource-based view of innovation.
We have developed a structured questionnaire to measure the constructs in the conceptual model. Responses have been collected from 115 scientists and engineers working in the field of Nanoscience and Nanotechnology. The responses have been analyzed further using Principal Component Analysis, Cluster Analysis and Regression Analysis.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Feature extraction in bilingual OCR is handicapped by the increase in the number of classes or characters to be handled. This is evident in the case of Indian languages, whose alphabet sets are large. The complexity of the feature extraction process is expected to increase with the number of classes. Though the best set of features cannot be ascertained through any quantitative measure, the characteristics of the scripts can help decide on the feature extraction procedure. This paper describes a hierarchical feature extraction scheme for recognition of printed bilingual (Tamil and Roman) text. The scheme divides the combined alphabet set of both scripts into subsets by extracting certain spatial and structural features. Three feature types, viz. geometric moments, DCT-based features and wavelet-transform-based features, are extracted from the grouped symbols, and a linear transformation is performed on them for efficient representation in the feature space. The transformation is obtained by maximizing certain criterion functions. Three techniques have been employed to estimate the transformation matrix: principal component analysis, maximization of Fisher's ratio and maximization of a divergence measure. It has been observed that the proposed hierarchical scheme allows for easier handling of the alphabets and that the transformations yield an appreciable rise in recognition accuracy.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Sandalwood is an economically important aromatic tree belonging to the family Santalaceae. The trees are used mainly for their fragrant heartwood and oil, which have immense potential for foreign exchange. Very little information is available on the genetic diversity in this species. Hence studies were initiated and genetic diversity was estimated using RAPD markers in 51 genotypes of Santalum album procured from different geographical regions of India and three exotic lines of S. spicatum from Australia. Eleven selected Operon primers (10-mer) generated a total of 156 consistent and unambiguous amplification products ranging from 200 bp to 4 kb. Rare and genotype-specific bands were identified that could be used effectively to distinguish the genotypes. Genetic relationships among the genotypes were evaluated by generating a dissimilarity matrix based on Ward's method (squared Euclidean distance). The phenetic dendrogram and the principal component analysis separated the 51 Indian genotypes from the three Australian lines. The cluster analysis indicated that sandalwood germplasm within India constitutes a broad genetic base, with values of genetic dissimilarity ranging from 15 to 91%. A core collection of 21 selected individuals captured the same diversity as the entire population. The results show that RAPD analysis is an efficient marker technology for estimating genetic diversity and relatedness, thereby enabling the formulation of appropriate strategies for conservation, germplasm management, and selection of diverse parents for sandalwood improvement programmes.
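The dissimilarity matrix mentioned above can be sketched as follows, assuming each genotype is scored as a binary presence/absence profile over the RAPD bands. The genotype names and band profiles are hypothetical, and the Ward clustering step itself is not shown.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dissimilarity_matrix(profiles):
    """Return a dict mapping (name_a, name_b) to their squared distance."""
    names = list(profiles)
    return {
        (p, q): squared_euclidean(profiles[p], profiles[q])
        for p in names
        for q in names
    }


# Hypothetical band profiles: 1 = band present, 0 = band absent.
profiles = {
    "album_1": [1, 1, 0, 1, 0, 1],
    "album_2": [1, 0, 0, 1, 0, 1],
    "spicatum_1": [0, 0, 1, 0, 1, 1],
}

D = dissimilarity_matrix(profiles)
for pair, value in D.items():
    print(pair, value)
```

For binary profiles the squared Euclidean distance is simply the number of bands on which two genotypes differ, so the two hypothetical S. album profiles end up much closer to each other than to the S. spicatum profile.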

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Land-use changes influence local biodiversity directly and also contribute cumulatively to regional and global changes in natural systems and quality of life. As a consequence, direct impacts on the natural resources that support the health and integrity of living beings have become evident in recent times. The Western Ghats, one of the global biodiversity hotspots, is reeling under tremendous pressure from human-induced changes in the form of developmental projects such as hydel or thermal power plants, big dams, mining activities, unplanned agricultural practices, monoculture plantations, illegal timber logging, etc. This has fragmented the once contiguous forest habitats into patches, which in turn has led to shrinkage of the original wildlife habitat, changes in the hydrological regime of the catchment, decreased inflow in streams, human-animal conflicts, etc. Under such circumstances, proper management practice is called for, requiring suitable biological indicators to show the impact of these changes, to set priority regions and to develop models for conservation planning. Amphibians are regarded as one of the best biological indicators because of their sensitivity to even the slightest changes in the environment, and hence they could be used as surrogates in conservation and management practices. They are the predominant vertebrates, with a high degree of endemism (78%), in the Western Ghats. The present study is an attempt to assess the impacts of various land uses on anuran distribution in three river basins. Sampling for amphibians was carried out during all seasons of 2003-2006 in the basins of Sharavathi, Aghanashini and Bedthi. There are as many as 46 species in the region, one of which is new to science, and nearly 59% of them are endemic to the Western Ghats. They belong to nine families, with Dicroglossidae represented by 14 species, followed by Rhacophoridae (9 species) and Ranidae (5 species).
Species richness is highest in the Sharavathi river basin, with 36 species, followed by Bedthi (33) and Aghanashini (27). The impact of land-use changes was investigated in the upper catchment of the Sharavathi river basin. Species diversity indices, relative abundance values and the percentage of endemics gave a clear indication of differences among the sub-catchments. Karl Pearson's correlation coefficient (r) was calculated between species richness, endemics, environmental descriptors, land-use classes and fragmentation metrics. Principal component analysis was performed to depict the influence of these variables. Results show that sub-catchments with a lower percentage of forest, low canopy cover, a larger agricultural area and low rainfall have low species richness, fewer endemic species and abundant non-endemic species, whereas endemism, species richness and abundance of endemic species are higher in sub-catchments with high tree density, endemic trees, canopy cover and rainfall and a smaller area of agricultural fields. This analysis aided in prioritising regions in the Sharavathi river basin for further conservation measures.
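The correlation step can be illustrated with a short sketch of Karl Pearson's r. The sub-catchment values below are invented for the example and are not taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sub-catchment data: percentage forest cover and species richness.
forest_cover = [20.0, 35.0, 50.0, 65.0, 80.0]
species_richness = [8, 12, 15, 21, 24]

r = pearson_r(forest_cover, species_richness)
print(f"r = {r:.3f}")
```

A value of r near +1 for such a pair would match the pattern reported in the abstract, where richness rises with forest cover; computing r for every pair of variables gives the correlation matrix that a principal component analysis can then summarise.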

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Climate change vulnerability profiles are developed at the district level for the agriculture, water and forest sectors of the North East region of India for the current and projected future climates. An index-based approach was used, in which a set of indicators representing the key sectors of vulnerability (agriculture, forest, water) was selected using principal component analysis. The impacts of climate change on key sectors, as represented by the changes in the indicators, were derived from impact assessment models. These impacted indicators were used to calculate future vulnerability to climate change. Results indicate that the majority of districts in North East India are subject to climate-induced vulnerability, both currently and in the near future. This is a first-of-its-kind study that ranks the districts of North East India on the basis of vulnerability index values. The objective of such ranking is to assist in: (i) identifying and prioritizing the most vulnerable sectors and districts; (ii) identifying adaptation interventions; and (iii) mainstreaming adaptation in development programmes.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Land cover (LC) and land use (LU) dynamics induced by human and natural processes play a major role in global as well as regional patterns of landscapes, influencing biodiversity, hydrology, ecology and climate. Changes in LC features resulting in forest fragmentation have posed direct threats to biodiversity, endangering the sustainability of ecological goods and services. Habitat fragmentation is of added concern because the residual spatial patterns mitigate or exacerbate edge effects. LU dynamics are obtained by classifying temporal remotely sensed satellite imagery of different spatial and spectral resolutions. This paper reviews five different image classification algorithms using spatio-temporal data of a temperate watershed in Himachal Pradesh, India. The Gaussian Maximum Likelihood classifier was found to be apt for analysing spatial patterns at the regional scale, based on accuracy assessment through error matrices and ROC (receiver operating characteristic) curves. The LU information thus derived was then used to assess spatial changes from temporal data using image differencing based on principal component analysis and correspondence analysis. The forest area dynamics were further studied by analysing the different types of fragmentation through forest fragmentation models. The computed forest fragmentation and landscape metrics show a decline of interior intact forests with a substantial increase in patch forest during 1972-2007.
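The idea behind PCA-based change detection can be sketched for the simplest case of one band observed on two dates. This is an illustrative assumption about the general technique, not the paper's procedure: the first principal component captures what the two dates share, and the second component highlights pixels that changed. The pixel values and the function name are hypothetical.

```python
import math


def pca_change_scores(date1, date2):
    """Return the second-principal-component score of each pixel pair."""
    n = len(date1)
    mean1 = sum(date1) / n
    mean2 = sum(date2) / n
    x = [v - mean1 for v in date1]
    y = [v - mean2 for v in date2]
    var_x = sum(v * v for v in x) / n
    var_y = sum(v * v for v in y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n
    # Closed-form orientation of the principal axis of a 2 x 2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * cov, var_x - var_y)
    # Project each centered pixel pair onto the axis orthogonal to PC1.
    return [-math.sin(theta) * a + math.cos(theta) * b for a, b in zip(x, y)]


# Hypothetical brightness values; the pixel at index 4 changes between dates.
date1 = [10, 20, 30, 40, 50, 60]
date2 = [11, 19, 31, 41, 20, 61]

scores = pca_change_scores(date1, date2)
changed = max(range(len(scores)), key=lambda i: abs(scores[i]))
print("pixel with the largest change score:", changed)
```

With multi-band imagery the same idea is applied to the stacked bands of both dates, which requires a full eigendecomposition rather than the two-variable closed form used here.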

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We address the problem of recognizing and retrieving relatively weak industrial signals, such as Partial Discharges (PD), buried in excessive noise. The major bottleneck is the recognition and suppression of stochastic pulsive interference (PI), which has time-frequency characteristics similar to those of a PD pulse. Conventional frequency-based DSP techniques are therefore not useful in retrieving PD pulses. We employ statistical signal modeling based on a combination of a long-memory process and probabilistic principal component analysis (PPCA). A parametric analysis of the signal is carried out to extract the features of the desired pulses. We incorporate a wavelet-based bootstrap method for obtaining the noise training vectors from observed data. The procedure adopted in this work is completely different from the research reported in the literature, which is generally based on the desired signal frequency and the noise frequency.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Detecting and quantifying the presence of human-induced climate change in regional hydrology is important for studying the impacts of such changes on water resources systems as well as for reliable future projections and policy making for adaptation. In this article a formal fingerprint-based detection and attribution analysis has been attempted to study the changes in the observed monsoon precipitation and streamflow in the rain-fed Mahanadi River Basin in India, considering the variability across different climate models. This is achieved through the use of observations, several climate model runs, a principal component analysis and regression-based statistical downscaling technique, and a Genetic Programming based rainfall-runoff model. It is found that the decreases in observed hydrological variables across the second half of the 20th century lie outside the range expected from natural internal variability of climate alone at the 95% statistical confidence level, for most of the climate models considered. For several climate models, such changes are consistent with those expected from anthropogenic emissions of greenhouse gases. However, unequivocal attribution to human-induced climate change cannot be claimed across all the climate models, and uncertainties in our detection procedure, arising from various sources including the use of models, cannot be ruled out. Changes in solar irradiance and volcanic activities are considered as other plausible natural external causes of climate change. The time evolution of the anthropogenic climate change "signal" in the hydrological observations, above the natural internal climate variability "noise", shows that the signal is detected earlier in streamflow than in precipitation for most of the climate models, suggesting larger impacts of human-induced climate change on streamflow than on precipitation at the river basin scale.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background: Recent research on glioblastoma (GBM) has focused on deducing gene signatures that predict prognosis. The present study evaluated the mRNA expression of selected genes and correlated it with outcome to arrive at a prognostic gene signature. Methods: Patients with GBM (n = 123) were prospectively recruited, treated with a uniform protocol and followed up. Expression of 175 genes in GBM tissue was determined using qRT-PCR. A supervised principal component analysis followed by derivation of the gene signature was performed. Independent validation of the signature was done using TCGA data. Gene Ontology and KEGG pathway analysis was carried out among patients from the TCGA cohort. Results: A 14-gene signature was identified that predicted outcome in GBM. A weighted gene (WG) score was found to be an independent predictor of survival in multivariate analysis in the present cohort (HR = 2.507; B = 0.919; p < 0.001) and in the TCGA cohort. Risk stratification by standardized WG score classified patients into low- and high-risk groups, predicting survival both in our cohort (p < 0.001) and in the TCGA cohort (p = 0.001). Pathway analysis using the most differentially regulated genes (n = 76) between the low- and high-risk groups revealed an association of activated inflammatory/immune response pathways and the mesenchymal subtype with the high-risk group. Conclusion: We have identified a 14-gene expression signature that can predict survival in GBM patients. A network analysis revealed activation of the inflammatory response pathway specifically in the high-risk group. These findings may have implications for understanding gliomagenesis, developing targeted therapies and selecting high-risk cancer patients for alternate adjuvant therapies.
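The reported statistics can be related with a small sketch. In a Cox-type model the hazard ratio is the exponential of the coefficient B, which is consistent with the values quoted above (exp(0.919) is about 2.507). The weighted gene score is sketched as a weighted sum of expression values; the gene names, weights, expression values and the zero cut-off on the standardized score are all hypothetical and are not taken from the study.

```python
import math
import statistics


def hazard_ratio(beta):
    """Hazard ratio implied by a Cox regression coefficient."""
    return math.exp(beta)


def weighted_gene_score(expression, weights):
    """Weighted sum of expression values over the signature genes."""
    return sum(weights[gene] * expression[gene] for gene in weights)


def stratify(scores):
    """Standardize scores and label each as high or low risk (cut-off 0)."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    return ["high" if (s - mean) / sd > 0 else "low" for s in scores]


# Hypothetical three-gene signature and four patients.
weights = {"GENE_A": 0.8, "GENE_B": -0.5, "GENE_C": 0.3}
patients = [
    {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.5},
    {"GENE_A": 0.5, "GENE_B": 2.0, "GENE_C": 1.0},
    {"GENE_A": 1.0, "GENE_B": 1.0, "GENE_C": 1.0},
    {"GENE_A": 3.0, "GENE_B": 0.5, "GENE_C": 2.0},
]

scores = [weighted_gene_score(p, weights) for p in patients]
groups = stratify(scores)
print(groups)
print(f"HR for B = 0.919: {hazard_ratio(0.919):.3f}")
```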

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This study aimed to assess soil nutrient status and heavy metal content and their impact on the predominant soil bacterial communities of mangroves of the Mahanadi Delta. Mangrove soil of the Mahanadi Delta is slightly acidic, and the levels of soil nutrients such as carbon, nitrogen, phosphorus and potash vary with season and site. The seasonal average concentrations (µg/g) of the various heavy metals were in the ranges 14810-63370 (Fe), 2.8-32.6 (Cu), 13.4-55.7 (Ni), 1.8-7.9 (Cd), 16.6-54.7 (Pb), 24.4-132.5 (Zn) and 13.3-48.2 (Co). Among the heavy metals analysed, Co, Cu and Cd were above their permissible limits as prescribed by Indian Standards (Co = 17 µg/g, Cu = 30 µg/g, Cd = 3-6 µg/g), indicating pollution in the mangrove soil. A viable plate count revealed the presence of different groups of bacteria in the mangrove soil, i.e. heterotrophs, free-living N2 fixers, nitrifiers, denitrifiers, phosphate solubilisers, cellulose degraders and sulfur oxidisers. Principal component analysis showed a positive relationship between soil nutrients and microbial load, whereas the content of metals such as Cu, Co and Ni showed a negative impact on some of the soil bacteria studied.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Rice landraces are lineages developed by farmers through artificial selection during the long-term domestication process. Despite their huge potential for crop improvement, they are largely understudied in India. Here, we analyse a suite of phenotypic characters from a large number of Indian landraces comprising both aromatic and non-aromatic varieties. Our primary aim was to investigate the major determinants of diversity and the strength of segregation between aromatic and non-aromatic landraces as well as within aromatic landraces. Using principal component analysis, we found that grain length, width and weight, panicle weight and leaf length make the most substantial contributions. Discriminant analysis can effectively distinguish the majority of aromatic from non-aromatic landraces. More interestingly, within aromatic landraces, long-grain traditional Basmati and short-grain non-Basmati aromatics remain morphologically well differentiated. The present research emphasizes the general patterns of phenotypic diversity and identifies the most important characters. It also confirms the existence of unique short-grain aromatic landraces, perhaps carrying signatures of the independent origin of an additional aroma quantitative trait locus in the indica group, unlike the introgression of specific alleles of the BADH2 gene from the japonica group as in Basmati. We presume that this parallel origin and evolution of aroma in short-grain indica landraces is linked to the long history of rice domestication, which involved inheritance of several traits from Oryza nivara in addition to O. rufipogon. We conclude by noting that the insights from the phenotypic analysis comprise the first part of the work, which will likely be validated by subsequent molecular analysis.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We consider two variants of the classical gossip algorithm. The first variant is a version of asynchronous stochastic approximation. We highlight a fundamental difficulty associated with the classical asynchronous gossip scheme, viz., that it may not converge to a desired average, and suggest an alternative scheme based on reinforcement learning that has guaranteed convergence to the desired average. We then discuss a potential application to a wireless network setting with simultaneous link activation constraints. The second variant is a gossip algorithm for distributed computation of the Perron-Frobenius eigenvector of a nonnegative matrix. While the first variant draws upon a reinforcement learning algorithm for an average cost controlled Markov decision problem, the second variant draws upon a reinforcement learning algorithm for risk-sensitive control. We then discuss potential applications of the second variant to ranking schemes, reputation networks, and principal component analysis.
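For orientation, the Perron-Frobenius eigenvector can be computed centrally by plain power iteration. The sketch below shows only that baseline; it is not the distributed gossip scheme or the reinforcement learning algorithm described in the abstract, and the example matrix is hypothetical.

```python
def perron_vector(matrix, iterations=200):
    """Power iteration for the dominant eigenpair of a positive matrix.

    The vector is kept normalized so that its entries sum to 1.
    """
    n = len(matrix)
    v = [1.0 / n] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        # Because the entries of v sum to 1, the sum of w estimates the eigenvalue.
        eigenvalue = sum(w)
        v = [c / eigenvalue for c in w]
    return eigenvalue, v


# Hypothetical positive matrix with eigenvalues 4 and -1.
A = [[1.0, 2.0], [3.0, 2.0]]
eigenvalue, vector = perron_vector(A)
print(eigenvalue, vector)
```

The link to principal component analysis is that the first principal component is the dominant eigenvector of a covariance matrix; the same iteration applies there if the sum-based normalization is replaced by a norm-based one, since covariance entries can be negative.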

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Structural information over the entire course of binding interactions, based on analyses of energy landscapes, is described; it provides a framework for understanding the events involved in biomolecular recognition. The conformational dynamics underlying malectin's exquisite selectivity for diglucosylated N-glycan (Dig-N-glycan), a highly flexible oligosaccharide comprising numerous dihedral torsion angles, are described as an example. A novel approach based on hierarchical sampling is proposed for acquiring metastable molecular conformations that constitute low-energy minima, in order to understand the structural features involved in biological recognition. To this end, four variants of principal component analysis were employed recursively in both Cartesian space and dihedral angle space, characterized by free energy landscapes, to select the most stable conformational substates. Subsequently, a k-means clustering algorithm was implemented for geometric separation of the major native state to acquire a final ensemble of metastable conformers. A comparison of malectin complexes was then performed to characterize their conformational properties. Analyses of stereochemical metrics and other concerted binding events revealed surface complementarity, cooperative and bidentate hydrogen bonds, water-mediated hydrogen bonds, and carbohydrate-aromatic interactions, including CH-pi and stacking interactions, involved in this recognition. Additionally, a striking structural transition from loop to beta-strands in the malectin CRD upon specific binding to Dig-N-glycan is observed. The interplay of the above-mentioned binding events in malectin and Dig-N-glycan supports an extended conformational selection model as the underlying binding mechanism.
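One reason dihedral angle space needs separate handling is periodicity: angles of 179 and -179 degrees are nearly identical conformations but numerically far apart. A common preprocessing step for dihedral PCA, assumed here for illustration and not confirmed as one of the four variants used in the study, maps each angle to its cosine and sine before the analysis. The conformations below are hypothetical.

```python
import math


def dihedral_features(angles_deg):
    """Map each dihedral angle (degrees) to a (cos, sin) pair."""
    features = []
    for angle in angles_deg:
        radians = math.radians(angle)
        features.extend([math.cos(radians), math.sin(radians)])
    return features


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Two hypothetical conformations that differ by 2 degrees across the wrap-around.
conf_a = [179.0, -60.0]
conf_b = [-179.0, -60.0]

raw_gap = distance(conf_a, conf_b)
periodic_gap = distance(dihedral_features(conf_a), dihedral_features(conf_b))
print(f"raw angle gap: {raw_gap:.1f}, gap after cos/sin mapping: {periodic_gap:.4f}")
```

After this mapping, PCA and k-means clustering can treat the two conformations as neighbours, which the raw angle values would not allow.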