39 results for Feature-extraction
Abstract:
Supercritical fluid extraction (SFE) of the volatile oil from Thymus vulgaris L. aerial flowering parts was performed under different conditions of pressure, temperature, mean particle size and CO2 flow rate, and the corresponding yield and composition were compared with those of the essential oil isolated by hydrodistillation (HD). Both oils were analyzed by GC and GC-MS, and 52 components were identified. The main volatile components obtained were p-cymene (10.0-42.6% for SFE and 28.9-34.8% for HD), gamma-terpinene (0.8-6.9% for SFE and 5.1-7.0% for HD), linalool (2.3-5.3% for SFE and 2.8-3.1% for HD), thymol (19.5-40.8% for SFE and 35.4-41.6% for HD), and carvacrol (1.4-3.1% for SFE and 2.6-3.1% for HD). The main difference was found to be the relative percentage of thymoquinone (not found in the essential oil) and carvacryl methyl ether (1.0-1.2% for HD versus t-0.4% for SFE), which can explain the higher antioxidant activity, assessed by the Rancimat test, of the SFE volatiles when compared with HD. Thymoquinone is considered a strong antioxidant compound.
Abstract:
The experimental data for the extraction of the volatile oil from six aromatic plants (coriander, fennel, savoury, winter savoury, cotton lavender and thyme) were modelled using five mathematical models based on differential mass balances. In all cases the extraction was internal diffusion controlled, and the internal mass transfer coefficient (k(s)) was found to change with pressure, temperature and particle size. For fennel, savoury and cotton lavender, the external mass transfer and the equilibrium phase also influenced the second extraction period, since k(s) changed with the tested flow rates. In general, the axial dispersion coefficient could be neglected for the conditions studied, since Peclet numbers were high. On the other hand, the solute-matrix interaction had to be considered in order to ensure a satisfactory description of the experimental data.
Abstract:
The present work involves the use of p-tert-butylcalix[4,6,8]arene carboxylic acid derivatives ((t)Butyl[4,6,8]CH2COOH) for selective extraction of hemoglobin. All three calixarenes extracted hemoglobin into the organic phase, exhibiting extraction parameters higher than 0.90. Evaluation of the solvent-accessible positively charged amino acid side chains of hemoglobin (PDB entry 1XZ2) revealed that there are 8 arginine, 44 lysine and 30 histidine residues on the protein surface which may be involved in the interactions with the calixarene molecules. The hemoglobin-(t)Butyl[6]CH2COOH complex had pseudoperoxidase activity, which catalysed the oxidation of syringaldazine in the presence of hydrogen peroxide in organic medium containing chloroform. The effect of pH, protein and substrate concentrations on biocatalysis was investigated using the hemoglobin-(t)Butyl[6]CH2COOH complex. This complex exhibited the highest specific activity of 9.92 × 10^-2 U (mg protein)^-1 at an initial pH of 7.5 in organic medium. Apparent kinetic parameters (V'(max), K'(m), k'(cat) and k'(cat)/K'(m)) for the pseudoperoxidase activity were determined in organic media for different pH values from a Michaelis-Menten plot. Furthermore, the stability of the protein-calixarene complex was investigated for different initial pH values, and half-life (t(1/2)) values were obtained in the range of 1.96 to 2.64 days. The hemoglobin-calixarene complex present in organic medium was recovered in fresh aqueous solutions at alkaline pH, with a recovery of pseudoperoxidase activity of over 100%. These results strongly suggest that the use of calixarene derivatives is an alternative technique for protein extraction and solubilisation in organic media for biocatalysis.
Abstract:
In music genre classification, most approaches rely on statistical characteristics of low-level features computed on short audio frames. In these methods, it is implicitly assumed that frames carry equally relevant information loads and that either individual frames, or distributions thereof, somehow capture the specificities of each genre. In this paper we study the representation space defined by short-term audio features with respect to class boundaries, and compare different processing techniques to partition this space. These partitions are evaluated in terms of accuracy on two genre classification tasks, with several types of classifiers. Experiments show that a randomized and unsupervised partition of the space, used in conjunction with a Markov Model classifier, leads to accuracies comparable to the state of the art. We also show that unsupervised partitions of the space tend to create fewer hubs.
Abstract:
This work discusses the most interesting results obtained in our laboratories during the supercritical CO2 extraction of bioactive compounds from microalgae and volatile oils from aromatic plants. Concerning the microalgae, the studies on Botryococcus braunii and Chlorella vulgaris were selected. Hydrocarbons from the first microalga, which are mainly linear alkadienes (C23-C31) with an odd number of carbon atoms, were selectively extracted at 313 K by increasing the pressure up to 30.0 MPa. These hydrocarbons are easily extracted at this pressure, since they are located outside the cellular walls. The extraction of carotenoids, mainly canthaxanthin and astaxanthin, from C. vulgaris is more difficult. The extraction yield of these components at 313 K and 35.0 MPa increased with the degree of crushing of the microalga, since they are not extracellular. On the other hand, for the extraction of volatile oils from aromatic plants, studies on Mentha pulegium and Satureja montana L. were chosen. For the first aromatic plant, the composition of the volatile and essential oils was similar, the main components being pulegone and menthone. However, this volatile oil contained small amounts of waxes, whose content decreased with decreasing particle size of the plant matrix. For S. montana L. it was also observed that both oils have a similar composition, the main components being carvacrol and thymol. The main difference is the relative amount of thymoquinone, whose content can be 15 times higher in the volatile oil. This oxygenated monoterpene has important biological activities. Moreover, experimental studies on the anticholinesterase activity of supercritical extracts of S. montana were also carried out. The supercritical nonvolatile fraction, which presented the highest content of the protocatechuic, vanillic, chlorogenic and (+)-catechin acids, is the most promising inhibitor of the enzyme butyrylcholinesterase. In contrast, the Soxhlet acetone extract did not affect the activity of this enzyme at the concentrations tested. © 2011 Elsevier B.V. All rights reserved.
Abstract:
This work describes a methodology to extract symbolic rules from trained neural networks. In our approach, patterns on the network are codified using formulas in a Lukasiewicz logic. For this we take advantage of the fact that every connective in this multi-valued logic can be evaluated by a neuron in an artificial network having, as activation function, the identity truncated to zero and one. This fact simplifies symbolic rule extraction and allows the easy injection of formulas into a network architecture. We trained this type of neural network using a back-propagation algorithm based on the Levenberg-Marquardt algorithm, where in each learning iteration we restricted the knowledge dissemination in the network structure. This makes the descriptive power of the produced neural networks similar to the descriptive power of the Lukasiewicz logic language, minimizing the information loss in the translation between connectionist and symbolic structures. To avoid redundancy in the generated networks, the method simplifies them in a pruning phase, using the "Optimal Brain Surgeon" algorithm. We tested this method on the task of finding the formula used in the generation of a given truth table. For real data tests, we selected the Mushrooms data set, available in the UCI Machine Learning Repository.
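As an illustration of the claim that each Lukasiewicz connective can be evaluated by a single neuron with a truncated-identity activation, the following minimal sketch encodes the standard connectives as weight/bias pairs. The function names (`truncated_identity`, `neuron`, `luk_not`, `luk_and`, `luk_or`, `luk_implies`) are hypothetical and are not taken from the paper; only the textbook definitions of the connectives are assumed.

```python
def truncated_identity(z):
    """Activation function: the identity clipped to the interval [0, 1]."""
    return max(0.0, min(1.0, z))


def neuron(weights, bias, inputs):
    """A single neuron: weighted sum plus bias, passed through the activation."""
    return truncated_identity(sum(w * x for w, x in zip(weights, inputs)) + bias)


# Standard Lukasiewicz connectives written as one neuron each (illustrative names).
def luk_not(x):
    return neuron([-1.0], 1.0, [x])            # 1 - x


def luk_and(x, y):
    return neuron([1.0, 1.0], -1.0, [x, y])    # max(0, x + y - 1)


def luk_or(x, y):
    return neuron([1.0, 1.0], 0.0, [x, y])     # min(1, x + y)


def luk_implies(x, y):
    return neuron([-1.0, 1.0], 1.0, [x, y])    # min(1, 1 - x + y)
```

On the truth values 0 and 1 these neurons reproduce the classical truth tables, which is consistent with the truth-table recovery task mentioned in the abstract.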
Abstract:
International Conference with Peer Review 2012 IEEE International Conference in Geoscience and Remote Sensing Symposium (IGARSS), 22-27 July 2012, Munich, Germany
Abstract:
In the last decade, local image features have been widely used in robot visual localization. To assess image similarity, a strategy exploiting these features compares raw descriptors extracted from the current image to those in the models of places. This paper addresses the ensuing step in this process, where a combining function must be used to aggregate results and assign each place a score. Casting the problem in the multiple classifier systems framework, we compare several candidate combiners with respect to their performance in the visual localization task. A deeper insight into the potential of the sum and product combiners is provided by testing two extensions of these algebraic rules: threshold and weighted modifications. In addition, a voting method, previously used in robot visual localization, is assessed. All combiners are tested on a visual localization task, carried out on a public dataset. It is experimentally demonstrated that the sum rule extensions globally achieve the best performance. The voting method, whilst competitive with the algebraic rules in their standard form, is shown to be outperformed by both of their modified versions.
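To make the role of the combining function concrete, here is a minimal sketch of the standard sum, product and voting combiners applied to a matrix of per-feature similarity scores. The function name `combine_scores`, the array layout and the example numbers are assumptions made for illustration; the threshold and weighted extensions studied in the paper are not reproduced here.

```python
import numpy as np


def combine_scores(similarities, rule="sum"):
    """Aggregate per-feature similarities into one score per place.

    similarities: array of shape (n_features, n_places); entry [i, j] is the
    similarity between feature i of the current image and place j.
    (Hypothetical layout, chosen only for this sketch.)
    """
    s = np.asarray(similarities, dtype=float)
    if rule == "sum":
        return s.sum(axis=0)
    if rule == "product":
        return s.prod(axis=0)
    if rule == "vote":
        # Each feature casts one vote for the place it matches best.
        winners = s.argmax(axis=1)
        return np.bincount(winners, minlength=s.shape[1]).astype(float)
    raise ValueError(f"unknown rule: {rule}")


# Illustrative scores: 3 features, 2 candidate places.
scores = [[0.9, 0.1],
          [0.2, 0.3],
          [0.2, 0.3]]
best_by_sum = int(np.argmax(combine_scores(scores, "sum")))
best_by_product = int(np.argmax(combine_scores(scores, "product")))
best_by_vote = int(np.argmax(combine_scores(scores, "vote")))
```

In this toy input the sum and product rules pick place 0, driven by one very confident feature, while voting picks place 1, which shows how the choice of combiner can change the localization decision.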
Abstract:
Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches having focused on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), where a set of latent variables indicates the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data, drawn from official statistics, shows its usefulness.
Abstract:
In visual sensor networks, local feature descriptors can be computed at the sensing nodes, which work collaboratively on the data obtained to make an efficient visual analysis. In fact, with a minimal amount of computational effort, the detection and extraction of local features, such as binary descriptors, can provide a reliable and compact image representation. In this paper, it is proposed to extract and code binary descriptors to meet the energy and bandwidth constraints at each sensing node. The major contribution is a binary descriptor coding technique that exploits the correlation using two different coding modes: Intra, which exploits the correlation between the elements that compose a descriptor; and Inter, which exploits the correlation between descriptors of the same image. The experimental results show bitrate savings of up to 35% without any impact on the performance of the image retrieval task. © 2014 EURASIP.
Abstract:
In research on Silent Speech Interfaces (SSI), different sources of information (modalities) have been combined, aiming at obtaining better performance than the individual modalities. However, when combining these modalities, the dimensionality of the feature space rapidly increases, yielding the well-known "curse of dimensionality". As a consequence, in order to extract useful information from this data, one has to resort to feature selection (FS) techniques to lower the dimensionality of the learning space. In this paper, we assess the impact of FS techniques for silent speech data, in a dataset with 4 non-invasive and promising modalities, namely: video, depth, ultrasonic Doppler sensing, and surface electromyography. We consider two supervised (mutual information and Fisher's ratio) and two unsupervised (mean-median and arithmetic mean geometric mean) FS filters. The evaluation was made by assessing the classification accuracy (word recognition error) of three well-known classifiers (k-nearest neighbors, support vector machines, and dynamic time warping). The key results of this study show that both unsupervised and supervised FS techniques improve the classification accuracy on both individual and combined modalities. For instance, on the video component, we attain relative performance gains of 36.2% in error rates. FS is also useful as pre-processing for feature fusion. Copyright © 2014 ISCA.
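As a rough illustration of one of the supervised filters named above, the sketch below ranks features by the textbook two-class Fisher's ratio, (difference of class means)^2 / (sum of class variances). It is restricted to two classes for simplicity, which is an assumption of this sketch; how the paper applies the criterion to its multi-word recognition task is not stated in the abstract. The names `fisher_ratio` and `select_top_k` and the toy data are hypothetical.

```python
import numpy as np


def fisher_ratio(X, y):
    """Two-class Fisher's ratio for every feature (column) of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError("this sketch handles exactly two classes")
    a, b = X[y == classes[0]], X[y == classes[1]]
    numerator = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    denominator = a.var(axis=0) + b.var(axis=0)
    # A small constant avoids division by zero for constant features.
    return numerator / (denominator + 1e-12)


def select_top_k(X, y, k):
    """Indices of the k features with the highest Fisher's ratio."""
    scores = fisher_ratio(X, y)
    return np.argsort(scores)[::-1][:k]


# Toy data: feature 0 separates the classes, feature 1 does not.
X_demo = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.2], [1.1, 0.9]]
y_demo = [0, 0, 1, 1]
top_feature = int(select_top_k(X_demo, y_demo, 1)[0])
```

Because the filter scores each feature independently of any classifier, it can be run once as a pre-processing step before the classifiers listed in the abstract.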
Abstract:
Discrete data representations are necessary, or at least convenient, in many machine learning problems. While feature selection (FS) techniques aim at finding relevant subsets of features, the goal of feature discretization (FD) is to find concise (quantized) data representations, adequate for the learning task at hand. In this paper, we propose two incremental methods for FD. The first method belongs to the filter family, in which the quality of the discretization is assessed by a (supervised or unsupervised) relevance criterion. The second method is a wrapper, where discretized features are assessed using a classifier. Both methods can be coupled with any static (unsupervised or supervised) discretization procedure and can be used to perform FS as pre-processing or post-processing stages. The proposed methods attain efficient representations suitable for binary and multi-class problems with different types of data, being competitive with existing methods. Moreover, using well-known FS methods with the features discretized by our techniques leads to better accuracy than with the features discretized by other methods or with the original features. © 2013 Elsevier B.V. All rights reserved.
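For readers unfamiliar with static discretization, the following minimal sketch shows one common unsupervised procedure, equal-width binning with a fixed number of bits per feature. It only illustrates the kind of static procedure the proposed incremental methods could be coupled with; it is not the paper's method, and the name `equal_width_discretize` and the `n_bits` parameter are assumptions of this sketch.

```python
import numpy as np


def equal_width_discretize(X, n_bits=2):
    """Quantize each feature (column) into 2**n_bits equal-width bins.

    Returns integer codes in the range [0, 2**n_bits - 1].
    """
    X = np.asarray(X, dtype=float)
    n_bins = 2 ** n_bits
    lo = X.min(axis=0)
    hi = X.max(axis=0)
    # Constant features get a span of 1 to avoid division by zero.
    span = np.where(hi > lo, hi - lo, 1.0)
    codes = np.floor((X - lo) / span * n_bins).astype(int)
    # The maximum value would land in bin n_bins, so clip it into the last bin.
    return np.clip(codes, 0, n_bins - 1)


codes_demo = equal_width_discretize([[0.0], [1.0], [2.0], [4.0]], n_bits=2)
```

An incremental scheme in the spirit of the abstract could, for example, raise `n_bits` one step at a time and stop when a relevance criterion or a classifier no longer improves, but that stopping rule is left out here because the abstract does not specify it.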
Abstract:
Dried flowers and leaves of Origanum glandulosum Desf. were submitted to hydrodistillation (HD) and supercritical fluid extraction with CO2 (SFE). The essential oils isolated by HD and the volatile oils obtained by SFE were analysed by GC and GC/MS. Total phenolics content and antioxidant effectiveness were determined. The main components of the essential oils from Bargou and Nefza were: p-cymene (40.4% and 39%), thymol (38.7% and 34.4%) and γ-terpinene (12.3% and 19.2%), respectively. The major components obtained by SFE in the volatile oil, from Bargou and Nefza, were: p-cymene (32.3% and 36.2%), thymol (41% and 40%) and γ-terpinene (20.3% and 13.3%). Total phenolic content, expressed in gallic acid equivalent (GAE) g kg^-1 dry weight, varied from 12 to 27 g kg^-1 dw, and the ability to scavenge the DPPH radicals, expressed by IC50, ranged from 44 to 143 mg L^-1.
Abstract:
Many learning problems require handling high-dimensional datasets with a relatively small number of instances. Learning algorithms are thus confronted with the curse of dimensionality, and need to address it in order to be effective. Examples of these types of data include the bag-of-words representation in text classification problems and gene expression data for tumor detection/classification. Usually, among the high number of features characterizing the instances, many may be irrelevant (or even detrimental) for the learning tasks. It is thus clear that there is a need for adequate techniques for feature representation, reduction, and selection, to improve both the classification accuracy and the memory requirements. In this paper, we propose combined unsupervised feature discretization and feature selection techniques, suitable for medium and high-dimensional datasets. The experimental results on several standard datasets, with both sparse and dense features, show the efficiency of the proposed techniques as well as improvements over previous related techniques.
Abstract:
Feature selection is a central problem in machine learning and pattern recognition. On large datasets (in terms of dimension and/or number of instances), using search-based or wrapper techniques can be computationally prohibitive. Moreover, many filter methods based on relevance/redundancy assessment also take a prohibitively long time on high-dimensional datasets. In this paper, we propose efficient unsupervised and supervised feature selection/ranking filters for high-dimensional datasets. These methods use low-complexity relevance and redundancy criteria, applicable to supervised, semi-supervised, and unsupervised learning, being able to act as pre-processors for computationally intensive methods to focus their attention on smaller subsets of promising features. The experimental results, with up to 10^5 features, show the time efficiency of our methods, with lower generalization error than state-of-the-art techniques, while being dramatically simpler and faster.
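The abstract does not name the specific relevance and redundancy criteria, so the sketch below only illustrates the general shape of a relevance/redundancy filter under stated assumptions: relevance is taken to be the per-feature variance (an unsupervised choice) and redundancy the absolute cosine similarity between feature columns. Both criteria, the function name `relevance_redundancy_filter` and the `max_similarity` threshold are assumptions of this sketch, not the paper's definitions.

```python
import numpy as np


def relevance_redundancy_filter(X, max_similarity=0.9):
    """Keep relevant features while skipping near-duplicates.

    Features are visited in decreasing order of variance (assumed relevance);
    a feature is dropped if its absolute cosine similarity to any feature
    already kept exceeds max_similarity (assumed redundancy test).
    """
    X = np.asarray(X, dtype=float)
    order = np.argsort(X.var(axis=0))[::-1]
    kept = []
    for j in order:
        col = X[:, j]
        redundant = False
        for i in kept:
            other = X[:, i]
            denom = np.linalg.norm(col) * np.linalg.norm(other)
            similarity = abs(col @ other) / denom if denom > 0 else 0.0
            if similarity > max_similarity:
                redundant = True
                break
        if not redundant:
            kept.append(int(j))
    return kept


# Column 1 is a scaled copy of column 0; column 2 is nearly orthogonal to both.
X_demo = np.array([[1.0, 2.0, 1.0],
                   [2.0, 4.0, -1.0],
                   [3.0, 6.0, 1.0],
                   [4.0, 8.0, -1.0]])
kept_demo = relevance_redundancy_filter(X_demo)
```

Comparing each candidate against every kept feature is the simplest formulation; a lower-cost variant could compare only against the most recently kept feature, trading some redundancy control for speed.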