912 results for forward selection component analysis
Abstract:
The SCoTLASS problem, in which principal component analysis is modified so that the components satisfy the Least Absolute Shrinkage and Selection Operator (LASSO) constraint, is reformulated as a dynamical system on the unit sphere. The LASSO inequality constraint is handled by an exterior penalty function. A globally convergent algorithm is developed based on the projected gradient approach. The algorithm is illustrated numerically and discussed on a well-known data set. (c) 2004 Elsevier B.V. All rights reserved.
Abstract:
Locality to other nodes on a peer-to-peer overlay network can be established by means of a set of landmarks shared among the participating nodes. Each node independently collects a set of latency measures to landmark nodes, which are used as a multi-dimensional feature vector. Each peer node uses the feature vector to generate a unique scalar index which is correlated to its topological locality. A popular dimensionality reduction technique is the space filling Hilbert’s curve, as it possesses good locality preserving properties. However, there exists little comparison between Hilbert’s curve and other techniques for dimensionality reduction. This work carries out a quantitative analysis of their properties. Linear and non-linear techniques for scaling the landmark vectors to a single dimension are investigated. Hilbert’s curve, Sammon’s mapping and Principal Component Analysis have been used to generate a 1d space with locality preserving properties. This work provides empirical evidence to support the use of Hilbert’s curve in the context of locality preservation when generating peer identifiers by means of landmark vector analysis. A comparative analysis is carried out with an artificial 2d network model and with a realistic network topology model with a typical power-law distribution of node connectivity in the Internet. Nearest neighbour analysis confirms Hilbert’s curve to be very effective in both artificial and realistic network topologies. Nevertheless, the results in the realistic network model show that there is scope for improvements and better techniques to preserve locality information are required.
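The mapping from a landmark vector to a scalar index can be illustrated with a minimal sketch. It assumes a two-landmark case in which each latency has already been quantised to an integer cell of an n x n grid (n a power of two). The function name `hilbert_index`, the grid size, and the sample peers are illustrative assumptions and are not taken from the paper.

```python
def hilbert_index(n, x, y):
    """Return the position of cell (x, y) along the Hilbert curve that
    fills an n x n grid, where n is a power of two."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Hypothetical peers: quantised latencies to two landmarks on a 16 x 16 grid.
peers = {"peer_a": (3, 4), "peer_b": (3, 5), "peer_c": (12, 1)}
identifiers = {name: hilbert_index(16, x, y) for name, (x, y) in peers.items()}
```

Cells that are consecutive along the curve are always neighbours in the grid, which is the locality-preserving property the abstract refers to; the converse does not always hold, so some nearby cells can receive distant indices.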
Abstract:
This study clarifies the taxonomic status of Anemone coronaria and segregates the species and A. coronaria infraspecific variants using morphological and morphometric analyses. Principal component analysis of the A. coronaria group was performed on 25 quantitative and qualitative characters, and morphometric analysis of the A. coronaria infraspecific variants was performed on 21 quantitative and qualitative characters. The results showed that the A. coronaria group clustered into four major groups: A. coronaria L., A. biflora DC, A. bucharica (Regel) Juz. ex Komarov, and a final group including A. eranthioides Regel and A. tschernjaewii Regel. The data on the A. coronaria infraspecific variants clustered into six groups: A. coronaria L. var. coronaria L., var. cyanea Ard., var. albiflora Rouy & Fouc., var. parviflora Regel, var. ventreana Ard., and var. rissoana Ard. © 2007 The Linnean Society of London
Abstract:
Baking and 2-g mixograph analyses were performed for 55 cultivars (19 spring and 36 winter wheat) from various quality classes from the 2002 harvest in Poland. An instrumented 2-g direct-drive mixograph was used to study the mixing characteristics of the wheat cultivars. A number of parameters were extracted automatically from each mixograph trace and correlated with baking volume and flour quality parameters (protein content and high molecular weight glutenin subunit [HMW-GS] composition by SDS-PAGE) using multiple linear regression analysis. Principal component analysis of the mixograph data discriminated between four flour quality classes, and predictions of baking volume were obtained using several selected mixograph parameters, chosen using a best subsets regression routine, giving R² values of 0.862-0.866. In particular, three new spring wheat strains (CHD 502a-c) recently registered in Poland were highly discriminated and predicted to give high baking volume on the basis of two mixograph parameters: peak bandwidth and 10-min bandwidth.
Integrated cytokine and metabolic analysis of pathological responses to parasite exposure in rodents
Abstract:
Parasitic infections cause a myriad of responses in their mammalian hosts, at both the immune and the metabolic level. A multiplex panel of cytokines and metabolites derived from four parasite-rodent models, namely, Plasmodium berghei-mouse, Trypanosoma brucei brucei-mouse, Schistosoma mansoni-mouse, and Fasciola hepatica-rat, were statistically coanalyzed. 1H NMR spectroscopy and multivariate statistical analysis were used to characterize the urine and plasma metabolite profiles in infected and noninfected animals. Each parasite generated a unique metabolic signature in the host. Plasma cytokine concentrations were obtained using the ‘Meso Scale Discovery’ multi-cytokine assay platform. Multivariate data integration methods were subsequently used to elucidate the component of the metabolic signature which is associated with inflammation and to determine specific metabolic correlates with parasite-induced changes in plasma cytokine levels. For example, the relative levels of acetyl glycoproteins extracted from the plasma metabolite profile in the P. berghei-infected mice were statistically correlated with IFN-γ, whereas the same cytokine was anticorrelated with glucose levels. Both the metabolic and the cytokine data showed a similar spatial distribution in principal component analysis scores plots constructed for the combined murine data, with samples from all infected animals clustering according to the parasite species and with the protozoan infections (P. berghei and T. b. brucei) grouping separately from the helminth infection (S. mansoni). For S. mansoni, the main infection-responsive cytokines were IL-4 and IL-5, which covaried with lactate, choline, and D-3-hydroxybutyrate.
This study demonstrates that the inherently differential immune response to single and multicellular parasites not only manifests in the cytokine expression, but also consequently imprints on the metabolic signature, and calls for in-depth analysis to further explore direct links between immune features and biochemical pathways.
Abstract:
In vitro batch culture fermentations were conducted with grape seed polyphenols and human faecal microbiota, in order to monitor both changes in precursor flavan-3-ols and the formation of microbial-derived metabolites. By the application of UPLC-DAD-ESI-TQ MS, monomers, and dimeric and trimeric procyanidins were shown to be degraded during the first 10 h of fermentation, with notable inter-individual differences being observed between fermentations. This period (10 h) also coincided with the maximum formation of intermediate metabolites, such as 5-(3′,4′-dihydroxyphenyl)-γ-valerolactone and 4-hydroxy-5-(3′,4′-dihydroxyphenyl)-valeric acid, and of several phenolic acids, including 3-(3,4-dihydroxyphenyl)-propionic acid, 3,4-dihydroxyphenylacetic acid, 4-hydroxymandelic acid, and gallic acid (5–10 h maximum formation). Later phases of the incubations (10–48 h) were characterised by the appearance of mono- and non-hydroxylated forms of previous metabolites by dehydroxylation reactions. Of particular interest was the detection of γ-valerolactone, which was seen for the first time as a metabolite from the microbial catabolism of flavan-3-ols. Changes registered during fermentation were finally summarised by a principal component analysis (PCA). Results revealed that 5-(3′,4′-dihydroxyphenyl)-γ-valerolactone was a key metabolite in explaining inter-individual differences and delineating the rate and extent of the microbial catabolism of flavan-3-ols, which could finally affect absorption and bioactivity of these compounds.
Abstract:
During the transition from endo-dormancy to eco-dormancy and subsequent growth, the onion bulb undergoes the transition from sink organ to source, to sustain cell division in the meristematic tissue. The mechanisms controlling these processes are not fully understood. Here, a detailed analysis of whole onion bulb physiological, biochemical and transcriptional changes in response to sprouting is reported, enabling a better knowledge of the mechanisms regulating post-harvest onion sprout development. Biochemical and physiological analyses were conducted on different cultivars ('Wellington', 'Sherpa' and 'Red Baron') grown at different sites over 3 years, cured at different temperatures (20, 24 and 28 °C) and stored under different regimes (1, 3, 6 and 6 1 °C). In addition, the first onion oligonucleotide microarray was developed to determine differential gene expression in onion during curing and storage, so that transcriptional changes could support biochemical and physiological analyses. There were greater transcriptional differences between samples at harvest and before sprouting than between the samples taken before and after sprouting, with some significant changes occurring during the relatively short curing period. These changes are likely to represent the transition from endo-dormancy to sprout suppression, and suggest that endo-dormancy is a relatively short period ending just after curing. Principal component analysis of biochemical and physiological data identified the ratio of monosaccharides (fructose and glucose) to disaccharide (sucrose), along with the concentration of zeatin riboside, as important factors in discriminating between sprouting and pre-sprouting bulbs. These detailed analyses provide novel insights into key regulatory triggers for sprout dormancy release in onion bulbs and provide the potential for the development of biochemical or transcriptional markers for sprout initiation.
Evidence presented herein also suggests there is no detrimental effect on bulb storage life and quality caused by curing at 20 °C, producing a considerable saving in energy and costs.
Abstract:
An efficient two-level model identification method aiming at maximising a model's generalisation capability is proposed for a large class of linear-in-the-parameters models from the observational data. A new elastic net orthogonal forward regression (ENOFR) algorithm is employed at the lower level to carry out simultaneous model selection and elastic net parameter estimation. The two regularisation parameters in the elastic net are optimised using a particle swarm optimisation (PSO) algorithm at the upper level by minimising the leave-one-out (LOO) mean square error (LOOMSE). There are two original contributions. Firstly, an elastic net cost function is defined and applied based on orthogonal decomposition, which facilitates the automatic model structure selection process with no need to use a predetermined error tolerance to terminate the forward selection process. Secondly, it is shown that the LOOMSE based on the resultant ENOFR models can be analytically computed without actually splitting the data set, and the associated computational cost is small due to the ENOFR procedure. Consequently, a fully automated procedure is achieved without resort to any other validation data set for iterative model evaluation. Illustrative examples are included to demonstrate the effectiveness of the new approaches.
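The idea of computing a leave-one-out error without splitting the data has a well-known counterpart in ordinary least squares, where the LOO residual of sample i equals e_i / (1 - h_ii), with e_i the ordinary residual and h_ii the leverage. The sketch below illustrates that identity for a one-variable linear fit and checks it against brute-force refitting. It is not the ENOFR algorithm from the paper; the function names and the data are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    return ybar - b * xbar, b


def loo_mse_analytic(xs, ys):
    """LOO mean square error from a single fit on all the data."""
    n = len(xs)
    a, b = fit_line(xs, ys)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    total = 0.0
    for x, y in zip(xs, ys):
        leverage = 1.0 / n + (x - xbar) ** 2 / sxx  # h_ii for a line fit
        residual = y - (a + b * x)
        total += (residual / (1.0 - leverage)) ** 2
    return total / n


def loo_mse_bruteforce(xs, ys):
    """LOO mean square error obtained by refitting n times."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (a + b * xs[i])) ** 2
    return total / n


# Made-up data for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 0.9, 2.3, 2.8, 4.1, 5.2]
```

Both functions return the same value up to rounding, but the analytic version needs one fit instead of n, which is the kind of saving the abstract describes for its own model class.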
Abstract:
Background: The validity of ensemble averaging on event-related potential (ERP) data has been questioned, due to its assumption that the ERP is identical across trials. Thus, there is a need for preliminary testing for cluster structure in the data. New method: We propose a complete pipeline for the cluster analysis of ERP data. To increase the signal-to-noise ratio (SNR) of the raw single trials, we used a denoising method based on Empirical Mode Decomposition (EMD). Next, we used a bootstrap-based method to determine the number of clusters, through a measure called the Stability Index (SI). We then used a clustering algorithm based on a Genetic Algorithm (GA) to define initial cluster centroids for subsequent k-means clustering. Finally, we visualised the clustering results through a scheme based on Principal Component Analysis (PCA). Results: After validating the pipeline on simulated data, we tested it on data from two experiments: a P300 speller paradigm on a single subject and a language processing study on 25 subjects. Results revealed evidence for the existence of 6 clusters in one experimental condition from the language processing study. Further, a two-way chi-square test revealed an influence of subject on cluster membership.
Abstract:
The use of seven domains for the Oral Health Impact Profile (OHIP)-EDENT was not supported for its Brazilian version, making data interpretation in clinical settings difficult. Thus, the aim of this study was to assess patients' responses to the translated OHIP-EDENT in a group of edentulous subjects and to develop factor scales for application in future studies. Data from 103 conventional and implant-retained complete denture wearers (36 men, mean age of 69.1 ± 10.3 years) were assessed using the Brazilian version of the OHIP-EDENT. Oral health-related quality of life domains were identified by factor analysis using principal component analysis as the extraction method, followed by varimax rotation. Factor analysis identified four factors that accounted for 63% of the total variance of the 19 items, named masticatory discomfort and disability (four items), psychological discomfort and disability (five items), social disability (five items) and oral pain and discomfort (five items). Four factors/domains of the Brazilian OHIP-EDENT version represent patient-important aspects of oral health-related quality of life.
Abstract:
The impact of human activity on the sediments of Todos os Santos Bay in Brazil was evaluated by elemental analysis and ¹³C Nuclear Magnetic Resonance (¹³C NMR). This article reports a study of six sediment cores collected at different depths and regions of Todos os Santos Bay. The elemental profiles of cores collected on the eastern side of Frades Island suggest an abrupt change in the sedimentation regime. Auto-regressive Integrated Moving Average (ARIMA) analysis corroborates this result. The range of depths of the cores corresponds to about 50 years ago, coinciding with the implantation of major onshore industrial projects in the region. Principal Component Analysis of the ¹³C NMR spectra clearly differentiates sediment samples closer to the Subae estuary, which have high contents of terrestrial organic matter, from those closer to a local oil refinery. The results presented in this article illustrate several important aspects of the environmental impact of human activity on this bay. (C) 2011 Elsevier Ltd. All rights reserved.
Structural requirement for PPAR gamma binding revealed by a meta analysis of holo-crystal structures
Abstract:
PPAR gamma is a ligand-regulated transcription factor that modulates the transcription of several genes involved in fat and sugar metabolism. Due to its easy bacterial expression and crystallization, several crystal structures of holo-PPAR gamma have been reported and deposited in the Protein Data Bank. Here, we investigated the three-dimensional electrostatic properties of 55 PPAR gamma ligands and used this information to cluster them through principal component analysis. We found that, according to their electrostatic potential, these ligands can be separated into three groups with different binding features. We also observed that non-selective and selective ligands show different 3D electrostatic properties and are separated into different clusters. The relevance of this analysis for the development of new binders is discussed. (C) 2010 Elsevier Masson SAS. All rights reserved.
Abstract:
This paper presents the formulation of a combinatorial optimization problem with the following characteristics: (i) the search space is the power set of a finite set structured as a Boolean lattice; (ii) the cost function forms a U-shaped curve when applied to any lattice chain. This formulation applies to feature selection in the context of pattern recognition. The known approaches to this problem are branch-and-bound algorithms and heuristics that partially explore the search space. Branch-and-bound algorithms are equivalent to the full search, while heuristics are not. This paper presents a branch-and-bound algorithm that differs from the other known ones by exploring the lattice structure and the U-shaped chain curves of the search space. The main contribution of this paper is the architecture of this algorithm, which is based on the representation and exploration of the search space by new lattice properties proven here. Several experiments, with well-known public data, indicate the superiority of the proposed method to the sequential floating forward selection (SFFS), which is a popular heuristic that gives good results in very short computational time. In all experiments, the proposed method obtained better or equal results in similar or even smaller computational time. (C) 2009 Elsevier Ltd. All rights reserved.
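To make the contrast between heuristics and full search concrete, the following is a minimal sketch of plain greedy forward selection that walks one chain of the Boolean lattice and stops as soon as the cost stops decreasing, which is a natural stopping rule when the cost is U-shaped along that chain. It is neither the branch-and-bound algorithm of the paper nor SFFS (there is no floating backward step); the function name `forward_selection` and the toy cost function are assumptions made for illustration.

```python
def forward_selection(features, cost):
    """Greedy forward selection: add the feature that lowers the cost most,
    and stop at the first step where no addition lowers it."""
    selected = frozenset()
    best_cost = cost(selected)
    remaining = sorted(features)  # sorted for deterministic tie-breaking
    while remaining:
        candidate = min(remaining, key=lambda f: cost(selected | {f}))
        candidate_cost = cost(selected | {candidate})
        if candidate_cost >= best_cost:
            break  # rising branch of the U-shaped curve on this chain
        selected = selected | {candidate}
        best_cost = candidate_cost
        remaining.remove(candidate)
    return selected, best_cost


# Toy cost (hypothetical): an error term reduced by informative features,
# plus a penalty of 1.0 per selected feature.
GAIN = {"a": 5.0, "b": 3.0, "c": 0.5, "d": 0.0}


def toy_cost(subset):
    return 10.0 - sum(GAIN[f] for f in subset) + 1.0 * len(subset)


chosen, chosen_cost = forward_selection(GAIN.keys(), toy_cost)
```

Because the search follows a single chain, it evaluates only a small part of the power set and can miss the global minimum on less regular cost functions, which is the limitation that motivates exact methods such as branch-and-bound.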
Abstract:
Brazilian sugarcane spirits were analyzed to elucidate similarities and dissimilarities by principal component analysis. Nine aldehydes, six alcohols, and six metal cations were identified and quantified. Isobutanol (LD 202.9 µg L⁻¹), butyraldehyde (0.08-0.5 µg L⁻¹), ethanol (39-47% v/v), and copper (371-6068 µg L⁻¹) showed marked similarities, but the concentration levels of n-butanol (1.6-7.3 µg L⁻¹), sec-butanol (LD 89 µg L⁻¹), formaldehyde (0.1-0.74 µg L⁻¹), valeraldehyde (0.04-0.31 µg L⁻¹), iron (8.6-139.1 µg L⁻¹), and magnesium (LD 1149 µg L⁻¹) differed between samples.
Abstract:
Cannabinoid compounds have been widely employed because of their medicinal and psychotropic properties. These compounds are isolated from Cannabis sativa (or marijuana) and are used in several medical treatments, such as for glaucoma, nausea associated with chemotherapy, pain and many other conditions. More recently, their use as an appetite stimulant has been indicated in patients with cachexia or AIDS. In this work, the influence of several molecular descriptors on the psychoactivity of 50 cannabinoid compounds is analyzed with the aim of obtaining a model able to predict the psychoactivity of new cannabinoids. For this purpose, the selection of descriptors was initially carried out using the Fisher's weight, the correlation matrix among the calculated variables and principal component analysis. From these analyses, the following descriptors were considered the most relevant: E(LUMO) (energy of the lowest unoccupied molecular orbital), Log P (logarithm of the partition coefficient), VC4 (volume of the substituent at the C4 position) and LP1 (Lovasz-Pelikan index, a molecular branching index). Next, two neural network models were used to construct a more adequate model for classifying new cannabinoid compounds. The first model employed was a multi-layer perceptron trained with the back-propagation algorithm, and the second was the Kohonen network. The results obtained from both networks were compared and showed that both techniques achieved a high percentage of correct classifications when discriminating psychoactive from psychoinactive compounds. However, the Kohonen network was superior to the multi-layer perceptron.