20 results for Data selection

in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo


Relevance: 100.00%

Abstract:

Determination of the utility harmonic impedance based on measurements is a significant task for utility power-quality improvement and management. Compared with the well-established, accurate invasive methods, noninvasive methods are more desirable since they work with natural variations of the loads connected to the point of common coupling (PCC), so that no intentional disturbance is needed. However, the accuracy of these methods has to be improved. In this context, this paper first points out that the critical problem of the noninvasive methods is how to select the measurements that can be used with confidence for utility harmonic impedance calculation. Then, this paper presents a new measurement technique based on complex-data least-squares regression, combined with two data-selection techniques. Simulation and field test results show that the proposed noninvasive method is practical and robust, so that it can be used with confidence to determine utility harmonic impedances.
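The regression step can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes that the PCC voltage and current phasors at one harmonic order obey V = E_u - Z_u * I, it uses synthetic data, and it omits the two data-selection techniques, which the abstract does not describe. The names `complex_least_squares`, `e_true`, and `z_true` are hypothetical.

```python
import cmath
import random


def complex_least_squares(currents, voltages):
    """Fit voltages ~ intercept + slope * currents with complex coefficients."""
    n = len(currents)
    mean_i = sum(currents) / n
    mean_v = sum(voltages) / n
    # Ordinary least squares for complex data:
    # slope = sum(conj(I - mean_I) * (V - mean_V)) / sum(|I - mean_I|^2)
    num = sum((i - mean_i).conjugate() * (v - mean_v)
              for i, v in zip(currents, voltages))
    den = sum(abs(i - mean_i) ** 2 for i in currents)
    slope = num / den
    intercept = mean_v - slope * mean_i
    return intercept, slope


# Synthetic measurements (assumed values, for illustration only).
rng = random.Random(0)
e_true = complex(10.0, 1.0)   # hypothetical utility source voltage phasor
z_true = complex(0.5, 2.0)    # hypothetical utility harmonic impedance
currents = [cmath.rect(rng.uniform(1.0, 3.0), rng.uniform(-0.5, 0.5))
            for _ in range(200)]
voltages = [e_true - z_true * i
            + complex(rng.gauss(0, 0.01), rng.gauss(0, 0.01))
            for i in currents]

e_est, slope = complex_least_squares(currents, voltages)
z_est = -slope  # under the assumed sign convention V = E_u - Z_u * I
```

In practice the fit is only meaningful on measurements where the utility side is stable, which is the role the abstract assigns to data selection.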

Relevance: 60.00%

Abstract:

The objective of this study was to estimate (co)variance components using random regression on B-spline functions to weight records obtained from birth to adulthood. A total of 82 064 weight records of 8145 females obtained from the data bank of the Nellore Breeding Program (PMGRN/Nellore Brazil) which started in 1987, were used. The models included direct additive and maternal genetic effects and animal and maternal permanent environmental effects as random. Contemporary group and dam age at calving (linear and quadratic effect) were included as fixed effects, and orthogonal Legendre polynomials of age (cubic regression) were considered as random covariate. The random effects were modeled using B-spline functions considering linear, quadratic and cubic polynomials for each individual segment. Residual variances were grouped in five age classes. Direct additive genetic and animal permanent environmental effects were modeled using up to seven knots (six segments). A single segment with two knots at the end points of the curve was used for the estimation of maternal genetic and maternal permanent environmental effects. A total of 15 models were studied, with the number of parameters ranging from 17 to 81. The models that used B-splines were compared with multi-trait analyses with nine weight traits and to a random regression model that used orthogonal Legendre polynomials. A model fitting quadratic B-splines, with four knots or three segments for direct additive genetic effect and animal permanent environmental effect and two knots for maternal additive genetic effect and maternal permanent environmental effect, was the most appropriate and parsimonious model to describe the covariance structure of the data. Selection for higher weight, such as at young ages, should be performed taking into account an increase in mature cow weight. Particularly, this is important in most of Nellore beef cattle production systems, where the cow herd is maintained on range conditions. 
There is limited scope for modifying the growth curve of Nellore cattle when the aim is to select for rapid growth at young ages while keeping adult weight constant.
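As an illustration of the basis used in such models, the sketch below evaluates quadratic B-spline basis functions with four knots (three segments) through the Cox-de Boor recursion. The knot positions are hypothetical, the end knots are repeated to clamp the basis (an assumption, since the abstract does not give the knot vector), and the function name `bspline_basis` is invented for this example.

```python
def bspline_basis(x, knots, degree):
    """Return all B-spline basis values at x for the given distinct knots.

    The end knots are repeated `degree` times (clamped knot vector).
    """
    t = [knots[0]] * degree + list(knots) + [knots[-1]] * degree
    n_basis = len(t) - degree - 1

    def basis(i, p):
        if p == 0:
            left, right = t[i], t[i + 1]
            if left <= x < right:
                return 1.0
            # Close the last non-empty interval so x == last knot is covered.
            if x == t[-1] and left < right == t[-1]:
                return 1.0
            return 0.0
        total = 0.0
        d1 = t[i + p] - t[i]
        if d1 > 0:
            total += (x - t[i]) / d1 * basis(i, p - 1)
        d2 = t[i + p + 1] - t[i + 1]
        if d2 > 0:
            total += (t[i + p + 1] - x) / d2 * basis(i + 1, p - 1)
        return total

    return [basis(i, degree) for i in range(n_basis)]


# Four knots (three segments) on a rescaled age axis; values are hypothetical.
example_knots = [0.0, 1.0, 2.0, 3.0]
row = bspline_basis(1.5, example_knots, degree=2)
```

With three segments and degree 2 there are five basis functions, and their values at any age inside the knot range sum to one. Each row of such values would serve as the covariates for one record in a random regression model.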

Relevance: 40.00%

Abstract:

Background: One goal of gene expression profiling is to identify signature genes that robustly distinguish different types or grades of tumors. Several tumor classifiers based on expression profiling have been proposed using microarray techniques. Due to important differences in the probabilistic models of microarray and SAGE technologies, it is important to develop suitable techniques to select specific genes from SAGE measurements. Results: A new framework to select specific genes that distinguish different biological states based on the analysis of SAGE data is proposed. The new framework applies the bolstered error for the identification of strong genes that separate the biological states in a feature space defined by the gene expression of a training set. Credibility intervals defined from a probabilistic model of SAGE measurements are used to identify the genes that distinguish the different states with more reliability among all gene groups selected by the strong genes method. A score that takes into account the credibility and the bolstered error values is proposed to rank the groups of considered genes. Results obtained using SAGE data from gliomas are presented, corroborating the introduced methodology. Conclusion: The model representing counting data, such as SAGE, provides additional statistical information that allows a more robust analysis. This additional statistical information is incorporated in the methodology described in the paper. The introduced method is suitable for identifying signature genes that lead to a good separation of the biological states using SAGE, and may be adapted for other counting methods such as Massively Parallel Signature Sequencing (MPSS) or the recent Sequencing-By-Synthesis (SBS) technique. Some of the genes identified by the proposed method may be useful to generate classifiers.

Relevance: 30.00%

Abstract:

Experimental analyses of hermit crabs and their preferences for shells are essential to understand the intrinsic relationship of the crabs' dependence on shells, and may be useful to explain their shell use pattern in nature. The aim of this study was to evaluate the effect of crab species and site on the pattern of shell use, selection, and preference in the south-western Atlantic hermit crabs Pagurus brevidactylus and Pagurus criniticornis, comparing sympatric and allopatric populations. In contrast to the traditional approach of evaluating shell preference by simply determining the shell selection pattern (i.e., the number of shells of each type selected), preference was defined (according to [Liszka, D., Underwood, A.J., 1990. An experimental design to determine preferences for gastropod shells by a hermit-crab. J. Exp. Mar. Biol. Ecol., 137(1), 47-62]) by comparing the number of crabs changing to a particular shell type when three options were given (Cerithium atratum, Morula nodulosa, and Tegula viridula) with the number of crabs changing to this same type when only this type was offered. The effect of crab species was tested at Cabelo Gordo Beach, where P. brevidactylus was found occupying shells of C. atratum, M. nodulosa, and T. viridula in similar frequencies, whereas P. criniticornis predominantly occupied shells of C. atratum. In laboratory experiments the selection patterns of the two hermit-crab species for these three gastropods were different, with P. criniticornis selecting mainly shells of C. atratum, and P. brevidactylus selecting more shells of M. nodulosa. The shell preference was also dependent on crab species, with P. criniticornis showing a clear preference for shells of C. atratum, whereas P. brevidactylus did not show a preference for any of the tested shells. The effect of site was tested for the two species by comparing data from Cabelo Gordo with data from Preta (P. brevidactylus) and Araca beaches (P. criniticornis).
The pattern of shell use, selection, and preference was shown to depend on site only for P. brevidactylus. The results also showed that the shell use pattern of P. criniticornis can be explained by its preference at both sites, whereas for P. brevidactylus this held only at Cabelo Gordo, where the absence of preference was correlated with the similar use of the three gastropod species studied. Finally, the results showed that the shell selection pattern cannot be considered a measure of shell preference, since it overestimates crab selectivity. (C) 2009 Elsevier B.V. All rights reserved.

Relevance: 30.00%

Abstract:

Sugarcane-breeding programs take at least 12 years to develop new commercial cultivars. Molecular markers offer a possibility to study the genetic architecture of quantitative traits in sugarcane, and they may be used in marker-assisted selection to speed up artificial selection. Although the performance of sugarcane progenies in breeding programs is commonly evaluated across a range of locations and harvest years, many QTL detection methods ignore two- and three-way interactions between QTL, harvest, and location. In this work, a strategy for QTL detection in multi-harvest-location trial data, based on interval mapping and mixed models, is proposed and applied to map QTL effects in a segregating progeny from a biparental cross of pre-commercial Brazilian cultivars, evaluated at two locations and in three consecutive harvest years for cane yield (tonnes per hectare), sugar yield (tonnes per hectare), fiber percent, and sucrose content. In the mixed model, we included appropriate (co)variance structures for modeling heterogeneity and correlation of genetic effects and non-genetic residual effects. Forty-six QTLs were found: 13 QTLs for cane yield, 14 for sugar yield, 11 for fiber percent, and 8 for sucrose content. In addition, QTL by harvest, QTL by location, and QTL by harvest by location interaction effects were significant for all evaluated traits (30 QTLs showed some interaction, and 16 none). Our results contribute to a better understanding of the genetic architecture of complex traits related to biomass production and sucrose content in sugarcane.

Relevance: 30.00%

Abstract:

In this paper we propose a hybrid hazard regression model with threshold stress which includes the proportional hazards and the accelerated failure time models as particular cases. To express the behavior of lifetimes, the generalized gamma distribution is assumed and an inverse power law model with a threshold stress is considered. For parameter estimation we develop a sampling-based posterior inference procedure based on Markov chain Monte Carlo techniques. We assume proper but vague priors for the parameters of interest. A simulation study investigates the frequentist properties of the proposed estimators obtained under the assumption of vague priors. Further, model selection criteria are discussed. The methodology is illustrated on simulated and real lifetime data sets.

Relevance: 30.00%

Abstract:

Background: A current challenge in gene annotation is to define gene function in the context of a network of relationships rather than for single genes. The inference of gene networks (GNs) has emerged as an approach to better understand the biology of the system and to study how the components of this network interact with each other and keep their functions stable. However, in general there are not enough data to accurately recover GNs from expression levels, which leads to the curse of dimensionality, in which the number of variables is higher than the number of samples. One way to mitigate this problem is to integrate biological data instead of using only the expression profiles in the inference process. The use of several kinds of biological information in inference methods has increased significantly, in order to better recover the connections between genes and reduce false positives. What makes this strategy so interesting is the possibility of confirming known connections through the included biological data, and of discovering new relationships between genes from the expression data. Although several works on data integration have increased the performance of network inference methods, the real contribution of each type of biological information to the obtained improvement is not clear. Methods: We propose a methodology to include biological information in an inference algorithm in order to assess the prediction gain obtained by using biological information and expression profiles together. We also evaluated and compared the gain of adding four types of biological information: (a) protein-protein interaction, (b) Rosetta stone fusion proteins, (c) KEGG and (d) KEGG+GO. Results and conclusions: This work presents a first comparison of the gain from using prior biological information in the inference of GNs for a eukaryotic organism (P. falciparum).
Our results indicate that information based on direct interaction can produce a greater gain than data about a less specific relationship such as GO or KEGG. Also, as expected, the results show that the use of biological information is a very important approach for improving the inference. We also compared the gain in the inference of the global network and of the hubs only. The results indicate that the use of biological information can improve the identification of the most connected proteins.

Relevance: 30.00%

Abstract:

A data set from a commercial Nellore beef cattle selection program was used to compare breeding models that did or did not include marker effects to estimate breeding values when only a reduced number of animals have phenotypic, genotypic and pedigree information available. The complete data set for this herd was composed of 83,404 animals measured for weaning weight (WW), post-weaning gain (PWG), scrotal circumference (SC) and muscle score (MS), corresponding to 116,652 animals in the relationship matrix. Single-trait analyses were performed with MTDFREML software to estimate fixed and random effects solutions using the complete data. The estimated additive effects were taken as the reference breeding values for those animals. The individual observed phenotype of each trait was adjusted for fixed and random effects solutions, except for direct additive effects. The adjusted phenotype, composed of the additive and residual parts of the observed phenotype, was used as the dependent variable for model comparison. Among all measured animals of this herd, only 3160 animals were genotyped for 106 SNP markers. Three models were compared in terms of changes in animal rank, global fit and predictive ability. Model 1 included only polygenic effects, model 2 included only marker effects and model 3 included both polygenic and marker effects. Bayesian inference via Markov chain Monte Carlo methods, performed with TM software, was used to analyze the data for model comparison. Two different priors were adopted for marker effects in models 2 and 3: the first was a uniform distribution (U) and the second assumed that marker effects were normally distributed (N). Higher rank correlation coefficients were observed for models 3_U and 3_N, indicating a greater similarity between the animal ranks from these models and the rank based on the reference breeding values. Model 3_N presented a better global fit, as demonstrated by its low DIC.
The best models in terms of predictive ability were models 1 and 3_N. Differences due to the prior assumed for marker effects in models 2 and 3 could be attributed to the better ability of the normal prior to handle collinear effects. Models 2_U and 2_N presented the worst performance, indicating that this small set of markers should not be used to genetically evaluate animals with no data, since its predictive ability is restricted. In conclusion, model 3_N presented a slight superiority when a reduced number of animals have phenotypic, genotypic and pedigree information. This could be attributed to the variation retained by marker and polygenic effects fitted together and to the normal prior assumed for marker effects, which deals better with the collinearity between markers. (C) 2012 Elsevier B.V. All rights reserved.
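A rank comparison of this kind can be sketched as below. The abstract does not say which rank correlation coefficient was used, so Spearman's coefficient is an assumption here, and the breeding values are invented for illustration only.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
        for k in order[pos:end + 1]:
            ranks[k] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical reference and model-predicted breeding values for five animals.
reference = [2.1, 0.5, 1.7, -0.3, 0.9]
predicted = [1.8, 0.7, 1.9, -0.1, 0.6]
rho = spearman(reference, predicted)
```

A value of `rho` close to 1 would mean that the model orders the animals almost exactly as the reference breeding values do.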

Relevance: 30.00%

Abstract:

In this paper, we perform a thorough analysis of a spectral phase-encoded time spreading optical code division multiple access (SPECTS-OCDMA) system based on Walsh-Hadamard (W-H) codes, aiming not only at finding optimal code-set selections but also at assessing its loss of security due to crosstalk. We prove that an inadequate choice of codes can make the crosstalk between active users large enough for the data from the user of interest to be detected by another user. The proposed algorithm for code optimization targets code sets that produce the minimum bit error rate (BER) among all codes for a specific number of simultaneous users. This methodology allows us to find optimal code sets for any OCDMA system, regardless of the code family used and the number of active users. This procedure is crucial for circumventing the unexpected lack of security due to crosstalk. We also show that a SPECTS-OCDMA system based on W-H 32(64) fundamentally limits the number of simultaneous users to 4(8) with no security violation due to crosstalk. More importantly, we prove that only a small fraction of the available code sets is actually immune to crosstalk with acceptable BER (<10^-9), i.e., approximately 0.5% for W-H 32 with four simultaneous users, and about 1 x 10^-4 % for W-H 64 with eight simultaneous users.
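For reference, the W-H code family itself can be generated with the Sylvester construction, as in the minimal sketch below. It only builds the codes and checks their orthogonality; it does not model spectral phase encoding, crosstalk, or BER, and it is not the code-set optimization algorithm of the paper. The function name `hadamard` is chosen for this example.

```python
def hadamard(n):
    """Return the n x n Sylvester Hadamard matrix (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # Doubling step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


codes = hadamard(32)  # each row is one W-H 32 code with entries +1 / -1


def dot(a, b):
    """Inner product of two codes."""
    return sum(x * y for x, y in zip(a, b))
```

Distinct rows have zero inner product, so any subset of rows forms a candidate code set. The abstract's point is that orthogonality alone does not guarantee low crosstalk in the SPECTS system, which is why only a small fraction of code sets is acceptable.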

Relevance: 30.00%

Abstract:

The objective of this study was to compare the BLUP selection method with different selection strategies in F2:4 progenies and to assess the efficiency of this method for the early choice of the best common bean (Phaseolus vulgaris) lines. Fifty-one F2:4 progenies were produced from a cross between the lines CVIII8511 and RP-26. A randomized block design was used with 20 replications and one-plant field plots. Data on plant architecture and grain yield were obtained, and the sum of the standardized variables was then estimated for simultaneous selection of both traits. Analysis was carried out by mixed models (BLUP) and by the least squares method to compare different selection strategies, such as mass selection, stratified mass selection, and between- and within-progeny selection. The progenies selected by BLUP were assessed in advanced generations, always selecting the greatest and smallest sums of the standardized variables. Analyses by the least squares method and the BLUP procedure ranked the progenies in the same way. The coincidence of the individuals identified by BLUP and by between- and within-progeny selection was high, and it was greatest when BLUP was compared with mass selection. Although BLUP is the best estimator of genotypic value, its efficiency in the response to long-term selection is not different from that of any of the other methods, because it is also unable to predict the future effect of the progenies x environments interaction. It was inferred that selection success will always depend on the most accurate possible progeny assessment and on using alternatives to reduce the effect of the progenies x environments interaction.
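The sum of standardized variables used for simultaneous selection can be sketched as follows. The data are invented, and the sketch assumes that a higher score is better for both traits; if the architecture scale ran in the opposite direction, its standardized values would need their sign flipped first.

```python
from statistics import mean, stdev


def standardize(values):
    """Convert raw values to z-scores using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def sum_of_standardized(*traits):
    """Add the z-scores of several traits, one total per individual."""
    z_scores = [standardize(t) for t in traits]
    return [sum(per_individual) for per_individual in zip(*z_scores)]


# Hypothetical records for four progenies.
grain_yield = [10.0, 20.0, 30.0, 40.0]
architecture = [1.0, 2.0, 2.0, 5.0]

index = sum_of_standardized(grain_yield, architecture)
best = max(range(len(index)), key=lambda k: index[k])
```

Standardizing first puts both traits on the same scale, so neither dominates the index merely because of its units.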

Relevance: 30.00%

Abstract:

The reproductive performance of cattle may be influenced by several factors, but mineral imbalances are crucial in terms of direct effects on reproduction. Several studies have shown that elements such as calcium, copper, iron, magnesium, selenium, and zinc are essential for reproduction and can prevent oxidative stress. However, toxic elements such as lead, nickel, and arsenic can have adverse effects on reproduction. In this paper, we applied a simple and fast method of multi-element analysis to bovine semen samples from Zebu and European classes used in reproduction programs and artificial insemination. Samples were analyzed by inductively coupled plasma mass spectrometry (ICP-MS) using aqueous medium calibration, after being diluted 1:50 in a solution containing 0.01% (vol/vol) Triton X-100 and 0.5% (vol/vol) nitric acid. Rhodium, iridium, and yttrium were used as the internal standards for ICP-MS analysis. To develop a reliable method of tracing the class of bovine semen, we used data mining techniques that make it possible to classify unknown samples after checking the differentiation of known-class samples. Based on the determination of 15 elements in 41 samples of bovine semen, 3 machine-learning tools for classification were applied to determine cattle class. Our results demonstrate the potential of support vector machine (SVM), multilayer perceptron (MLP), and random forest (RF) chemometric tools to identify cattle class. Moreover, the selection tools made it possible to reduce the number of chemical elements needed from 15 to just 8.

Relevance: 30.00%

Abstract:

The objective of this work was to select adequate early-maturing sweet orange cultivars for the fresh fruit market and for industrial processing using performance indexes. Performance indexes for citrus were established from data collected in an experiment carried out in the southwest region of the state of Sao Paulo, involving 12 early-maturing sweet orange cultivars. New results were obtained by identifying cultivars with superior characteristics. In a comparison with 'Hamlin' sweet orange, a standard early-maturing cultivar, 'Valencia 2' and 'Salustiana' were considered better materials for the fresh fruit market, whereas 'Westin' sweet orange was identified as a superior cultivar for orange juice processing.

Relevance: 30.00%

Abstract:

Multi-element analysis of honey samples was carried out with the aim of developing a reliable method of tracing the origin of honey. Forty-two chemical elements were determined (Al, Cu, Pb, Zn, Mn, Cd, Tl, Co, Ni, Rb, Ba, Be, Bi, U, V, Fe, Pt, Pd, Te, Hf, Mo, Sn, Sb, P, La, Mg, I, Sm, Tb, Dy, Sd, Th, Pr, Nd, Tm, Yb, Lu, Gd, Ho, Er, Ce, Cr) by inductively coupled plasma mass spectrometry (ICP-MS). Then, three machine learning tools for classification and two for attribute selection were applied in order to prove that it is possible to use data mining tools to find the region where honey originated. Our results clearly demonstrate the potential of Support Vector Machine (SVM), Multilayer Perceptron (MLP) and Random Forest (RF) chemometric tools for honey origin identification. Moreover, the selection tools allowed a reduction from 42 trace element concentrations to only 5. (C) 2012 Elsevier Ltd. All rights reserved.

Relevance: 30.00%

Abstract:

Mangrove ecosystems are tropical environments that are characterized by the interaction between the land and the sea. As such, this ecosystem is vulnerable to oil spills. Here, we show a culture-independent survey of fungal communities that are found in the sediments of the following two mangroves that are located on the coast of Sao Paulo State (Brazil): (1) an oil-spill-affected mangrove and (2) a nearby unaffected mangrove. Samples were collected from each mangrove forest at three distinct locations (transect from sea to land), and the samples were analyzed by quantitative PCR and internal transcribed spacer (ITS)-based PCR-DGGE analysis. The abundance of fungi was found to be higher in the oil-affected mangrove. Visual observation and correspondence analysis (CA) of the ITS-based PCR-DGGE profiles revealed differences in the fungal communities between the sampled areas. Remarkably, the oil-spilled area was quite distinct from the unaffected sampling areas. On the basis of the ITS sequences, fungi that are associated with the Basidiomycota and Ascomycota taxa were most common and belonged primarily to the genera Epicoccum, Nigrospora, and Cladosporium. Moreover, the Nigrospora fungal species were shown to be sensitive to oil, whereas a group that was described as "uncultured Basidiomycota" was found more frequently in oil-contaminated areas. Our results showed an increase in fungal abundance in the oil-polluted mangrove regions, and these data indicated potential fungal candidates for remediation of the oil-affected mangroves.

Relevance: 30.00%

Abstract:

The fig tree (Ficus carica L.) is a fruit tree of great worldwide importance, so genetic improvement is an important field of research for the crop. It is necessary to gather information on this species, mainly regarding its genetic variability, so that appropriate propagation projects and management can be carried out. However, fig production in Brazil relies on only one cultivar, Roxo de Valinhos, which produces seedless fruit, making conventional breeding impossible. Fig breeding through induced mutagenesis therefore becomes a very important line of research, contributing greatly to the development of fig cultivation. The objective of this study was to select fig plants formed from cuttings treated with gamma rays. The plants used were obtained from buds of cv. Roxo de Valinhos. The cuttings were irradiated in a Gamma Cell irradiator at 10 cm from the tip of the cutting, at a dose of 30 Gy with a dose rate of 238 Gy/h. The experiment consisted of 450 treatments, where each plant formed was a treatment. The treatments were numbered sequentially from 1 to 450 and spaced 2.5 x 1.5 m. Vegetative and fruit characteristics and the incidence of major crop pests and diseases were evaluated. Data analysis showed that there is genetic variability among treatments and that plants number 1, 5, 20, 79, 164, 189, 194, 201, 221, 214, 258, 301, 322, 392, 433 and 440 are probably genetic mutants that should be tested in commercial orchards.