61 results for Sample algorithms
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
There is an increasing interest in the application of Evolutionary Algorithms (EAs) to induce classification rules. This hybrid approach can benefit areas where classical methods for rule induction have not been very successful. One example is the induction of classification rules in imbalanced domains. Imbalanced data occur when one or more classes heavily outnumber other classes. Frequently, classical machine learning (ML) classifiers are not able to learn in the presence of imbalanced data sets, inducing classification models that always predict the most numerous classes. In this work, we propose a novel hybrid approach to deal with this problem. We create several balanced data sets with all minority class cases and a random sample of majority class cases. These balanced data sets are fed to classical ML systems that produce rule sets. The rule sets are combined creating a pool of rules and an EA is used to build a classifier from this pool of rules. This hybrid approach has some advantages over undersampling, since it reduces the amount of discarded information, and some advantages over oversampling, since it avoids overfitting. The proposed approach was experimentally analysed and the experimental results show an improvement in the classification performance measured as the area under the receiver operating characteristics (ROC) curve.
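The data-preparation step described in this abstract can be sketched as follows. This is only an illustration: the function name `make_balanced_sets`, the 1:1 class ratio, the `n_sets` and `seed` parameters, and the toy data are assumptions, not details taken from the paper. The rule induction and the evolutionary selection of rules are not shown.

```python
import random


def make_balanced_sets(minority, majority, n_sets, seed=0):
    """Build n_sets balanced data sets.

    Each set holds every minority case plus an equally sized random sample
    (without replacement) of majority cases. The 1:1 ratio is an assumption.
    """
    if len(majority) < len(minority):
        raise ValueError("majority class must be at least as large as the minority class")
    rng = random.Random(seed)
    balanced = []
    for _ in range(n_sets):
        sampled_majority = rng.sample(majority, len(minority))
        balanced.append(list(minority) + sampled_majority)
    return balanced


# Toy data (hypothetical): (feature, label) pairs, 3 minority vs 12 majority cases.
minority = [(x, 1) for x in range(3)]
majority = [(x, 0) for x in range(100, 112)]
sets = make_balanced_sets(minority, majority, n_sets=4)
```

Because each set draws a different random sample of the majority class, the sets taken together discard less majority-class information than a single undersampled set would.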
Abstract:
Context. A sample of 27 sources, cataloged as pre-main sequence stars by the Pico dos Dias Survey (PDS), is analyzed to investigate a possible contamination by post-AGB stars. The far-infrared excess due to dust present in the circumstellar envelope is typical of both categories: young stars and objects that have already left the main sequence and are undergoing severe mass loss. Aims. The two known post-AGB stars in our sample inspired us to search for other very likely or possible post-AGB objects among PDS sources previously suggested to be Herbig Ae/Be stars, by revisiting the observational database of this sample. Methods. In a comparative study with well known post-AGBs, several characteristics were evaluated: (i) parameters related to the circumstellar emission; (ii) spatial distribution to verify the background contribution from dark clouds; (iii) spectral features; and (iv) optical and infrared colors. Results. These characteristics suggest that seven objects of the studied sample are very likely post-AGBs, five are possible post-AGBs, eight are unlikely post-AGBs, and the nature of seven objects remains unclear.
Abstract:
Aims. Our goal is to study the physical properties of the circumstellar environment of young stellar objects (YSOs). In particular, the determination of the scattering mechanism can help us to constrain the optical depth of the disk and/or envelope in the near infrared. Methods. We used the IAGPOL imaging polarimeter along with the CamIV infrared camera at the LNA observatory to obtain near infrared polarimetry measurements in the H band of a sample of optically visible YSOs, namely, eleven T Tauri stars and eight Herbig Ae/Be stars. An independent determination of the disk (or jet) orientation was obtained for twelve objects from the literature. The circumstellar optical depth could then be estimated by comparing the integrated polarization position angle (PA) with the direction of the major axis of the disk projected onto the plane of the sky. Optically thin disks have, in general, a polarization PA that is perpendicular to the disk plane. In contrast, optically thick disks have polarization PAs parallel to the disks. Results. Among the T Tauri stars, three are consistent with having optically thin disks (AS 353A, RY Tau and UY Aur) and five with optically thick disks (V536 Aql, DG Tau, DO Tau, HL Tau and LkHα 358). Among the Herbig Ae/Be stars, two stars exhibit evidence of optically thin disks (Hen 3-1191 and VV Ser) and two of optically thick disks (PDS 453 and MWC 297). Our results seem consistent with optically thick disks at near infrared bands, which are more likely to be associated with younger YSOs. Marginal evidence of polarization reversal is found in RY Tau, RY Ori, WW Vul, and UY Aur. In the first three cases, this feature can be associated with the UXOR phenomenon. Correlations with the IRAS colors and the spectral index yielded evidence of an evolutionary segregation in which the disks tend to be optically thin when they are older.
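The comparison between the polarization PA and the disk orientation can be illustrated with a small sketch. Position angles are defined modulo 180 degrees, so the relevant quantity is the smallest angle between the two directions. The function names and the 30-degree tolerance are hypothetical choices for illustration and do not come from the paper.

```python
def pa_difference(pol_pa_deg, disk_pa_deg):
    """Smallest angle (0 to 90 degrees) between two position angles.

    Position angles are orientations, not directions, so they wrap at 180 degrees.
    """
    diff = abs(pol_pa_deg - disk_pa_deg) % 180.0
    return min(diff, 180.0 - diff)


def classify_disk(pol_pa_deg, disk_pa_deg, tolerance_deg=30.0):
    """Label a disk from the angle between polarization PA and disk major axis.

    tolerance_deg is an assumed cut-off, not a value from the abstract.
    """
    diff = pa_difference(pol_pa_deg, disk_pa_deg)
    if diff <= tolerance_deg:
        return "optically thick"   # polarization roughly parallel to the disk
    if diff >= 90.0 - tolerance_deg:
        return "optically thin"    # polarization roughly perpendicular to the disk
    return "indeterminate"


# Hypothetical angles, in degrees.
print(classify_disk(10.0, 170.0))
print(classify_disk(0.0, 90.0))
```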
Abstract:
Context. We study galaxy evolution and spatial patterns in the surroundings of a sample of 2dF groups. Aims. Our aim is to find evidence of galaxy evolution and clustering out to 10 times the virial radius of the groups and so redefine their properties according to the spatial patterns in the fields and relate them to galaxy evolution. Methods. Group members and interlopers were redefined after the identification of gaps in the redshift distribution. We then used exploratory spatial statistics based on the second moment of the Ripley function to probe the anisotropy in the galaxy distribution around the groups. Results. We found an important anticorrelation between anisotropy around groups and the fraction of early-type galaxies in these fields. Our results illustrate how the dynamical state of galaxy groups can be ascertained by the systematic study of their neighborhoods. This is an important achievement, since the correct estimate of the extent to which galaxies are affected by the group environment and follow large-scale filamentary structure is relevant to understanding the process of galaxy clustering and evolution in the Universe.
Abstract:
We propose and analyze two different Bayesian online algorithms for learning in discrete Hidden Markov Models and compare their performance with the already known Baldi-Chauvin Algorithm. Using the Kullback-Leibler divergence as a measure of generalization we draw learning curves in simplified situations for these algorithms and compare their performances.
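For reference, the Kullback-Leibler divergence between two discrete distributions can be computed as in the sketch below. How the authors apply it to Hidden Markov Models is not stated in the abstract, so this shows only the standard definition for distributions over the same finite set of outcomes. The example vectors are made up.

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) in nats for two discrete distributions over the same outcomes."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of outcomes")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue            # 0 * log(0 / q) is taken as 0
        if qi == 0.0:
            return math.inf     # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


# Hypothetical example: a "true" distribution versus a learned estimate.
true_dist = [0.5, 0.5]
estimate = [0.25, 0.75]
divergence = kl_divergence(true_dist, estimate)
```

A learning curve in this sense would plot such a divergence against the number of observations processed by the online algorithm.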
Abstract:
For environmental quality assessment, instrumental neutron activation analysis (INAA) has been applied for determining chemical elements in small (200 mg) and large (200 g) samples of leaves from 200 trees. By applying Ingamells' constant, the expected percent standard deviation was estimated at 0.9-2.2% for 200 mg samples. In contrast, for composite samples (200 g), the expected standard deviation varied from 0.5 to 10% in spite of analytical uncertainties ranging from 2 to 30%. The results thereby suggest expressing the degree of representativeness as a source of uncertainty, which contributes to increasing the reliability of environmental studies, mainly in the case of composite samples.
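Ingamells' sampling constant is commonly defined as K_s = w · R², where w is the sample mass and R is the relative standard deviation due to sampling, in percent. K_s is therefore the mass needed for a 1% relative sampling standard deviation. The sketch below applies that relation. The value of K_s used in the example is purely illustrative and is not taken from the study.

```python
import math


def sampling_constant(rsd_percent, sample_mass_g):
    """Ingamells' sampling constant K_s = w * R^2 (in grams, with R in percent)."""
    return sample_mass_g * rsd_percent ** 2


def expected_rsd_percent(sampling_constant_g, sample_mass_g):
    """Expected relative sampling standard deviation (percent) for a given mass."""
    return math.sqrt(sampling_constant_g / sample_mass_g)


# Illustrative K_s only (hypothetical, not a result from the abstract).
k_s = 0.8  # grams
rsd_small = expected_rsd_percent(k_s, 0.2)    # 200 mg sample
rsd_large = expected_rsd_percent(k_s, 200.0)  # 200 g sample
```

Under this relation the sampling contribution shrinks with the square root of the mass. For a composite sample, the spread observed in practice can also include analytical uncertainty and variation among the trees that were combined.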
Abstract:
Soils are an important component in the biogeochemical cycle of carbon, storing about four times more carbon than plant biomass and nearly three times more than the atmosphere. Moreover, the carbon content is directly related to the capacity for water retention and fertility, among other properties. Thus, soil carbon quantification in field conditions is an important challenge related to the carbon cycle and global climate change. Nowadays, Laser Induced Breakdown Spectroscopy (LIBS) can be used for qualitative elemental analyses without previous treatment of samples, and the results are obtained quickly. New optical technologies have made portable LIBS systems possible, and the great expectation now is the development of methods that make quantitative measurements with LIBS possible. The goal of this work is to calibrate a portable LIBS system to carry out quantitative measurements of carbon in whole tropical soil samples. For this, six samples from the Brazilian Cerrado region (Argisoil) were used. Tropical soils have large amounts of iron in their composition, so the carbon line at 247.86 nm suffers strong interference from this element (iron lines at 247.86 and 247.95 nm). For this reason, the carbon line at 193.03 nm was used in this work. Using statistical methods such as simple linear regression, multivariate linear regression, and cross-validation, it was possible to obtain correlation coefficients higher than 0.91. These results show the great potential of using portable LIBS systems for quantitative carbon measurements in tropical soils. © 2008 Elsevier B.V. All rights reserved.
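A univariate calibration with leave-one-out cross-validation, one of the simplest forms of the approach named in this abstract, might look like the sketch below. The intensity and carbon values are invented for illustration, and the function names are hypothetical. The paper's actual data, preprocessing, and multivariate models are not reproduced here.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def loo_predictions(x, y):
    """Predict each sample from a line fitted to all the other samples."""
    predictions = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        predictions.append(slope * x[i] + intercept)
    return predictions


# Invented data: emission-line intensity (arbitrary units) and reference carbon (%).
intensity = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
carbon = [0.6, 1.1, 1.4, 2.1, 2.4, 3.0]
r_cv = pearson_r(carbon, loo_predictions(intensity, carbon))
```

With only six samples, holding out one sample at a time is a natural way to check the calibration without setting aside a separate validation set.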
Abstract:
The quality of environmental studies depends on the use of an adequate sampling protocol and analytical method for obtaining reliable results and minimizing analytical uncertainties. In order to demonstrate the applicability of INAA for determining chemical element composition of invertebrates, this work evaluated sample representativeness in terms of subsampling and sample size. Br, Co, Fe, K, Na, Sc and Zn could be determined in very small samples despite the increase in analytical uncertainties. Special attention should be directed to invertebrate species with small structures because of the high chemical variation observed among the different sample sizes tested.
Abstract:
PURPOSE: The aim of this study was to analyze the prevalence of urinary incontinence (UI) in a community sample from the city of São Paulo. METHODS: This epidemiological survey was conducted at a family health program in São Paulo, Brazil, using randomized sampling. Data were collected by interviewing residents and were analyzed using Pearson's correlation coefficients, chi-square tests, and logistic regression analysis. RESULTS: Seventy (10.7%) of the 657 subjects currently presented UI, including 50.7% with sporadic UI and 74.3% with UI upon moderate effort. Ninety-three percent woke up during the night, 43.7% maintained continence until the bathroom, 63.4% had a sensation of wetness, and 77.5% reported no use of any continence aids. Female gender, advanced age, gynecologic or urologic surgery, dysuria, and urinary tract infection were correlated with UI (P < .001; r² = 0.572). CONCLUSION: The overall prevalence of UI was found to be high and was comparable to results from multiple countries.
Abstract:
Fourier transform near infrared (FT-NIR) spectroscopy was evaluated as an analytical tool for monitoring residual lignin, kappa number and hexenuronic acids (HexA) content in kraft pulps of Eucalyptus globulus. Sets of pulp samples were prepared under different cooking conditions to obtain a wide range of compound concentrations that were characterised by conventional wet chemistry analytical methods. The sample group was also analysed using FT-NIR spectroscopy in order to establish prediction models for the pulp characteristics. Several models were applied to correlate chemical composition in samples with the NIR spectral data by means of PCR or PLS algorithms. Calibration curves were built by using all the spectral data or selected regions. The best calibration models proposed for the quantification of lignin, kappa and HexA presented R² values of 0.99. Calibration models were used to predict pulp titers of 20 external samples in a validation set. The lignin concentration and kappa number in the range of 1.4-18% and 8-62, respectively, were predicted fairly accurately (standard error of prediction, SEP 1.1% for lignin and 2.9 for kappa). The HexA concentration (range of 5-71 mmol kg⁻¹ pulp) was more difficult to predict and the SEP was 7.0 mmol kg⁻¹ pulp in a model of HexA quantified by an ultraviolet (UV) technique and 6.1 mmol kg⁻¹ pulp in a model of HexA quantified by anion-exchange chromatography (AEC). Even in wet chemical procedures used for HexA determination, there is no good agreement between methods as demonstrated by the UV and AEC methods described in the present work. NIR spectroscopy did provide a rapid estimate of HexA content in kraft pulps prepared in routine cooking experiments.
Abstract:
Experiments based on a 2³ central composite full factorial design were carried out in 200-ml stainless-steel containers to study the pretreatment, with dilute sulfuric acid, of a sugarcane bagasse sample obtained from a local sugar-alcohol mill. The independent variables selected for study were temperature, varied from 112.5 °C to 157.5 °C, residence time, varied from 5.0 to 35.0 min, and sulfuric acid concentration, varied from 0.0% to 3.0% (w/v). Bagasse loading of 15% (w/w) was used in all experiments. Statistical analysis of the experimental results showed that all three independent variables significantly influenced the response variables, namely the bagasse solubilization, efficiency of xylose recovery in the hemicellulosic hydrolysate, efficiency of cellulose enzymatic saccharification, and percentages of cellulose, hemicellulose, and lignin in the pretreated solids. Temperature was the factor that influenced the response variables the most, followed by acid concentration and residence time, in that order. Although harsher pretreatment conditions promoted almost complete removal of the hemicellulosic fraction, the amount of xylose recovered in the hemicellulosic hydrolysate did not exceed 61.8% of the maximum theoretical value. Cellulose enzymatic saccharification was favored by more efficient removal of hemicellulose during the pretreatment. However, detoxification of the hemicellulosic hydrolysate was necessary for better bioconversion of the sugars to ethanol.
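The layout of a three-factor central composite design can be sketched as below: 2³ factorial points, six axial points, and replicated center points, all in coded units. Several details here are assumptions rather than facts from the abstract: the rotatable axial distance, the number of center points, and the idea that the reported ranges correspond to the axial extremes. The function names are hypothetical.

```python
from itertools import product


def central_composite_design(n_factors=3, alpha=None, n_center=3):
    """Return coded design points: factorial, axial, then center runs."""
    if alpha is None:
        alpha = (2 ** n_factors) ** 0.25  # rotatable design (assumed)
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=n_factors)]
    axial = []
    for i in range(n_factors):
        for level in (-alpha, alpha):
            point = [0.0] * n_factors
            point[i] = level
            axial.append(point)
    center = [[0.0] * n_factors for _ in range(n_center)]
    return factorial + axial + center


def decode(point, ranges, alpha):
    """Map coded levels to real units so that -alpha and +alpha hit each range end."""
    real = []
    for coded, (low, high) in zip(point, ranges):
        mid = (low + high) / 2.0
        half_width = (high - low) / 2.0
        real.append(mid + coded / alpha * half_width)
    return real


alpha = (2 ** 3) ** 0.25
# Ranges from the abstract: temperature (°C), residence time (min), acid (% w/v).
ranges = [(112.5, 157.5), (5.0, 35.0), (0.0, 3.0)]
design = central_composite_design(3, alpha, n_center=3)
runs = [decode(p, ranges, alpha) for p in design]
```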
Abstract:
This study aimed to correlate the efficiency of enzymatic hydrolysis of the cellulose contained in a sugarcane bagasse sample pretreated with dilute H₂SO₄ with the levels of independent variables such as initial content of solids and loadings of enzymes and surfactant (Tween 20), for two cellulolytic commercial preparations. The preparations, designated cellulase I and cellulase II, were characterized regarding the activities of total cellulases, endoglucanase, cellobiohydrolase, cellobiase, beta-glucosidase, xylanase, and phenoloxidases (laccase, manganese and lignin peroxidases), as well as protein contents. Both extracts showed complete cellulolytic complexes and considerable activities of xylanases, without activities of phenoloxidases. For the enzymatic hydrolyses, two 2³ central composite full factorial designs were employed to evaluate the effects caused by the initial content of solids (1.19-4.81%, w/w) and loadings of enzymes (1.9-38.1 FPU/g bagasse) and Tween 20 (0.0-0.1 g/g bagasse) on the cellulose digestibility. Within 24 h of enzymatic hydrolysis, all three independent variables influenced the conversion of cellulose by cellulase I. Using cellulase II, only enzyme and surfactant loadings showed significant effects on cellulose conversion. An additional experiment demonstrated the possibility of increasing the initial content of solids to values much higher than 4.81% (w/w) without compromising the efficiency of cellulose conversion, consequently improving the glucose concentration in the hydrolysate.
Abstract:
Voltage and current waveforms of a distribution or transmission power system are not pure sinusoids. There are distortions in these waveforms that can be represented as a combination of the fundamental frequency, harmonics and high frequency transients. This paper presents a novel approach to identifying harmonics in power system distorted waveforms. The proposed method is based on Genetic Algorithms, an optimization technique inspired by genetics and natural evolution. GOOAL, a specially designed intelligent algorithm for optimization problems, was successfully implemented and tested. Two kinds of chromosome representation are used: binary and real. The results show that the proposed method is more precise than the traditional Fourier Transform, especially with the real representation of the chromosomes.
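One way a real-valued chromosome could be scored in a harmonic-identification problem of this kind is sketched below. The chromosome layout (amplitude and phase per harmonic), the mean-squared-error fitness, and all names are assumptions made for illustration. They are not the encoding or fitness function used by GOOAL, which the abstract does not describe.

```python
import math


def synthesize(chromosome, times, f0):
    """Rebuild a waveform from a flat chromosome [A1, phi1, A2, phi2, ...].

    Gene pair h describes harmonic h + 1 of the fundamental frequency f0 (Hz).
    """
    n_harmonics = len(chromosome) // 2
    signal = []
    for t in times:
        value = 0.0
        for h in range(n_harmonics):
            amplitude = chromosome[2 * h]
            phase = chromosome[2 * h + 1]
            value += amplitude * math.sin(2.0 * math.pi * (h + 1) * f0 * t + phase)
        signal.append(value)
    return signal


def fitness(chromosome, times, measured, f0):
    """Higher is better: 1 / (1 + mean squared error) against the measured samples."""
    model = synthesize(chromosome, times, f0)
    mse = sum((m - s) ** 2 for m, s in zip(measured, model)) / len(measured)
    return 1.0 / (1.0 + mse)


# Hypothetical test signal: 60 Hz fundamental plus a weaker second harmonic.
f0 = 60.0
times = [k / 3840.0 for k in range(64)]  # one cycle at 64 samples per cycle
target = [1.0, 0.0, 0.2, 0.5]
measured = synthesize(target, times, f0)
score_exact = fitness(target, times, measured, f0)
score_off = fitness([0.8, 0.0, 0.0, 0.0], times, measured, f0)
```

A genetic algorithm would evolve a population of such chromosomes, using the fitness value to decide which candidates are selected, recombined, and mutated.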
Abstract:
This paper presents a strategy for solving the WDM optical network planning problem, specifically the Routing and Wavelength Allocation (RWA) problem, with the aim of minimizing the number of wavelengths used. In this case, the problem is known as the Min-RWA. Two meta-heuristics (Tabu Search and Simulated Annealing) are applied to obtain good-quality solutions with high performance. The key point is the degradation of the maximum load on the virtual links in favor of minimizing the number of wavelengths used; the objective is to find a good compromise between the metrics of the virtual topology (load in Gb/s) and of the physical topology (quantity of wavelengths). The simulations suggest good results when compared with some existing results in the literature.