3 results for predictive models

in Repositório Alice (Acesso Livre à Informação Científica da Embrapa / Repository Open Access to Scientific Information from Embrapa)


Relevance:

60.00%

Publisher:

Abstract:

Discovery of microRNAs (miRNAs) relies on predictive models for characteristic features of miRNA precursors (pre-miRNAs). The short length of miRNA genes and the lack of pronounced sequence features complicate this task. To accommodate the peculiarities of plant and animal miRNA systems, tools for the two systems have evolved differently. However, these tools are biased towards the species for which they were primarily developed and, consequently, their predictive performance on data sets from other species of the same kingdom might be lower. While these biases are intrinsic to the species, their characterization can lead to computational approaches capable of diminishing their negative effect on the accuracy of pre-miRNA predictive models. In this study, we investigate how 45 predictive models induced from data sets of 45 species, distributed across eight subphyla/classes, perform when applied to a species different from the one used in their induction. Results: Our computational experiments show that the separability of pre-miRNA and pseudo pre-miRNA instances is species-dependent and that no feature set performs well for all species, even within the same subphylum/class. We show that an ensemble of classifiers mitigates this species dependency, reducing the classification errors for all 45 species. As the ensemble members were obtained using meaningful yet computationally viable feature sets, the ensembles also have a lower computational cost than individual classifiers that rely on energy stability parameters, which are prohibitively expensive to compute in large-scale applications. Conclusion: In this study, the combination of multiple pre-miRNA feature sets and multiple learning biases enhanced the predictive accuracy of pre-miRNA classifiers for 45 species. This is a promising approach to incorporate into miRNA discovery tools, making them more accurate and less species-dependent.
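The ensemble idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how the ensemble members are combined, so simple majority voting is assumed here; the function names and the toy labels are hypothetical and were constructed only to show the mechanics, not to reproduce any result from the study.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label lists from several classifiers into one label per instance."""
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # Break ties deterministically by taking the smallest tied label.
        combined.append(min(label for label, c in counts.items() if c == top))
    return combined


def error_rate(predicted, truth):
    """Fraction of instances whose predicted label differs from the true label."""
    return sum(p != t for p, t in zip(predicted, truth)) / len(truth)


# Toy labels (made up): 1 = pre-miRNA, 0 = pseudo pre-miRNA.
truth = [1, 0, 1, 1, 0, 0]
member_predictions = [
    [1, 0, 0, 1, 0, 1],  # hypothetical classifier trained on feature set A
    [1, 1, 1, 1, 0, 0],  # hypothetical classifier trained on feature set B
    [0, 0, 1, 1, 1, 0],  # hypothetical classifier trained on feature set C
]
ensemble = majority_vote(member_predictions)

for i, member in enumerate(member_predictions):
    print(f"member {i} error: {error_rate(member, truth):.3f}")
print(f"ensemble error: {error_rate(ensemble, truth):.3f}")
```

In this toy case the members make their mistakes on different instances, which is the situation in which voting helps; members that err on the same instances would gain nothing from it.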

Relevance:

60.00%

Publisher:

Abstract:

Forage species adapted to semi-arid conditions are an alternative for reducing the negative impact of seasonal forage availability on the ruminant production chain in the Brazilian Northeast, as well as for reducing the cost of supplying concentrate feeds. Among these species, mesquite pods (Prosopis juliflora (Sw.) DC.) and forage cactus (Opuntia and Nopalea) stand out because they tolerate water deficit and produce during periods when forage is scarce, have good nutritional value, and are well accepted by the animals. However, because their composition varies, using them in animal feeding requires detailed knowledge of that composition in order to formulate balanced diets. Because of the cost and time involved, producers do not routinely analyze the chemical-bromatological composition of feeds. Near-infrared reflectance spectroscopy (NIRS) is therefore an important alternative to traditional methods. The objective of this study was to develop and validate NIRS-based models for predicting the bromatological composition of mesquite pods and forage cactus, scanned on two instrument models and with different sample processing. Mesquite pod samples were collected in the states of Ceará, Bahia, Paraíba and Pernambuco, and forage cactus samples in the states of Ceará, Paraíba and Pernambuco, either fresh (as received) or pre-dried and ground. Spectra were obtained with two NIR instruments, a Perten DA 7250 and a FOSS 5000. The feeds were first scanned fresh on the Perten instrument, and The Unscrambler 10.2 software was used to select a group of samples for the calibration set, leaving out redundant ones. The selected samples were dried and ground, and scanned again on both the Perten and FOSS instruments.
Reference values were obtained by methods traditionally used in animal nutrition laboratories for dry matter (DM), mineral matter (MM), organic matter (OM), crude protein (CP), ether extract (EE), neutral detergent fiber (NDF), acid detergent fiber (ADF), hemicellulose (HEM) and in vitro dry matter digestibility (IVDMD). Model performance was evaluated using the root mean square error of calibration (RMSEC) and of cross-validation (RMSECV), the coefficient of determination (R²) and the ratio of performance to deviation (RPD). Exploratory data analysis, using spectral treatments and principal component analysis (PCA), showed that the data sets were similar to one another, which supported developing a single model per feed (mesquite pods and forage cactus) from all selected samples. The variation in the reference results for each parameter was consistent with that reported in the literature. Models developed with sample preprocessing (pre-drying and grinding) were more robust than those built with fresh samples. The Perten NIRS instrument performed similarly to the FOSS instrument, although the latter covers a wider spectral range at smaller reading intervals. NIR spectroscopy combined with multivariate calibration by partial least squares (PLS) regression proved reliable for predicting the chemical-bromatological composition of mesquite pods and forage cactus.
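The evaluation statistics named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: RPD is taken as the sample standard deviation of the reference values divided by the RMSE, which is a common definition but is not spelled out in the abstract, and the function names and numbers are made up. The same `rmse` function gives RMSEC when fed calibration fits and RMSECV when fed cross-validated predictions.

```python
import math
import statistics


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(reference, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_ref = statistics.fmean(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1 - ss_res / ss_tot


def rpd(reference, predicted):
    """Ratio of performance to deviation (assumed: sample SD of reference / RMSE)."""
    return statistics.stdev(reference) / rmse(reference, predicted)


# Made-up crude protein values (% of dry matter), for illustration only.
reference = [10.0, 12.0, 14.0, 16.0, 18.0]  # laboratory reference values
predicted = [11.0, 11.0, 15.0, 15.0, 18.0]  # hypothetical NIRS model predictions

print(f"RMSE = {rmse(reference, predicted):.3f}")
print(f"R2   = {r_squared(reference, predicted):.3f}")
print(f"RPD  = {rpd(reference, predicted):.3f}")
```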

Relevance:

30.00%

Publisher:

Abstract:

Genomic selection (GS) has been used to compute genomic estimated breeding values (GEBV) of individuals; however, due to high costs it has only been applied to animals and major plant crops. Moreover, in some crops breeding and selection are performed at the family level. We aimed to study the implementation of genome-wide family selection (GWFS) in two loblolly pine (Pinus taeda L.) populations: i) the breeding population CCLONES, composed of 63 families (5-20 individuals per family), phenotyped for four traits (stem diameter, stem rust susceptibility, tree stiffness and lignin content) and genotyped using an Illumina Infinium assay with 4740 polymorphic SNPs, and ii) a simulated population that reproduced the same pedigree as CCLONES, with 5000 polymorphic loci and two traits (oligogenic and polygenic). In both populations, phenotypic and genotypic data were pooled at the family level in silico: phenotypes were averaged across replicates for all individuals, and the allele frequency was computed for each SNP. Marker effects were estimated at the individual (GEBV) and family (GEFV) levels with Bayes-B using the BGLR package in R, and models were validated using 10-fold cross-validation. Predictive ability, computed by correlating phenotypes with GEBV and GEFV, was always higher for GEFV in both populations, even after standardizing GEFV predictions to be comparable to GEBV. The results reveal great potential for using GWFS in breeding programs that select families, such as those for most outbreeding forage species. Because only one sample per family is needed, the resulting drop in genotyping costs would allow GWFS to be applied to minor crops.
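The in silico family pooling and the predictive-ability statistic described in this abstract can be sketched roughly as below. The study itself used BGLR in R; this Python sketch only illustrates the pooling step and the correlation, and it assumes diploid genotypes coded as 0/1/2 counts of one allele. The record layout, function names and numbers are hypothetical.

```python
import math
from collections import defaultdict


def pool_by_family(records):
    """Pool individuals into families.

    records: iterable of (family, phenotype, genotype), where genotype is a
    list of 0/1/2 allele counts per SNP (assumed diploid coding).
    Returns {family: (mean phenotype, per-SNP allele frequencies)}.
    """
    grouped = defaultdict(list)
    for family, phenotype, genotype in records:
        grouped[family].append((phenotype, genotype))
    pooled = {}
    for family, members in grouped.items():
        n = len(members)
        mean_phenotype = sum(p for p, _ in members) / n
        n_snps = len(members[0][1])
        # Allele frequency = allele copies observed / (2 copies per individual).
        freqs = [sum(g[j] for _, g in members) / (2 * n) for j in range(n_snps)]
        pooled[family] = (mean_phenotype, freqs)
    return pooled


def pearson(x, y):
    """Pearson correlation, used here as the predictive-ability statistic."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up records: two families, two individuals each, three SNPs.
records = [
    ("F1", 10.0, [0, 1, 2]),
    ("F1", 12.0, [2, 1, 2]),
    ("F2", 20.0, [0, 0, 1]),
    ("F2", 22.0, [0, 2, 1]),
]
pooled = pool_by_family(records)
for family, (mean_phenotype, freqs) in pooled.items():
    print(family, mean_phenotype, freqs)
```

The pooled frequencies would then serve as the family-level marker inputs, and `pearson` would be applied to observed family means against the predicted family values from cross-validation.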