2 results for data redundancy

at Universidad Politécnica de Madrid


Relevance:

30.00%

Publisher:

Abstract:

Researchers in ecology commonly use multivariate analyses (e.g. redundancy analysis, canonical correspondence analysis, Mantel correlation, multivariate analysis of variance) to interpret patterns in biological data and relate these patterns to environmental predictors. There has been, however, little recognition of the errors associated with biological data and the influence that these may have on predictions derived from ecological hypotheses. We present a permutational method that assesses the effects of taxonomic uncertainty on the multivariate analyses typically used in the analysis of ecological data. The procedure is based on iterative randomizations that re-assign unidentified species in each site to any of the other species found in the remaining sites. After each re-assignment of species identities, the multivariate method in question is run and a parameter of interest is calculated. Consequently, one can estimate a range of plausible values for the parameter of interest under different scenarios of re-assigned species identities. We demonstrate the use of our approach in the calculation of two parameters with an example involving tropical tree species from western Amazonia: 1) the Mantel correlation between compositional similarity and environmental distances between pairs of sites; and 2) the variance explained by environmental predictors in redundancy analysis (RDA). We also investigated the effects of increasing taxonomic uncertainty (i.e. number of unidentified species) and of the taxonomic resolution at which morphospecies are determined (genus-resolution, family-resolution, or fully undetermined species) on the uncertainty range of these parameters. To achieve this, we performed simulations on a tree dataset from southern Mexico by randomly selecting a portion of the species contained in the dataset and classifying them as unidentified at each level of decreasing taxonomic resolution.
An analysis of covariance showed that both taxonomic uncertainty and resolution significantly influence the uncertainty range of the resulting parameters. Increasing taxonomic uncertainty widens the uncertainty range of the parameters estimated in both the Mantel test and RDA. The effects of increasing taxonomic resolution, however, are not as evident. The method presented in this study improves on traditional approaches to studying compositional change in ecological communities by accounting for some of the uncertainty inherent in biological data. We hope that this approach can be routinely used to estimate any parameter of interest obtained from compositional data tables when faced with taxonomic uncertainty.
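
The re-assignment procedure described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names are hypothetical, and several details are assumptions: Bray-Curtis is used as the compositional index, Euclidean distance for the environmental predictors, and candidate target species are restricted to identified species present in at least one other site. The parameter tracked here is the Mantel correlation; the same loop could wrap any other statistic.

```python
import numpy as np

def bray_curtis_similarity(X):
    """Pairwise Bray-Curtis similarity between sites (upper triangle as a vector).
    Assumes no site is empty."""
    iu = np.triu_indices(X.shape[0], k=1)
    num = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
    den = (X[:, None, :] + X[None, :, :]).sum(axis=2)
    return 1.0 - (num / den)[iu]

def euclidean_distance(E):
    """Pairwise Euclidean distance between sites (upper triangle as a vector)."""
    iu = np.triu_indices(E.shape[0], k=1)
    return np.sqrt(((E[:, None, :] - E[None, :, :]) ** 2).sum(axis=2))[iu]

def reassign_unidentified(X, unidentified, rng):
    """Move each unidentified species' abundance at a site onto a randomly
    chosen identified species found in the remaining sites (assumption)."""
    X = np.asarray(X, dtype=float)
    unidentified = np.asarray(unidentified, dtype=bool)
    out = X.copy()
    out[:, unidentified] = 0.0
    for site in range(X.shape[0]):
        others = np.delete(np.arange(X.shape[0]), site)
        candidates = np.flatnonzero(~unidentified & (X[others] > 0).any(axis=0))
        for sp in np.flatnonzero(unidentified & (X[site] > 0)):
            target = rng.choice(candidates)
            out[site, target] += X[site, sp]
    return out

def mantel_range(X, E, unidentified, n_iter=999, seed=0):
    """Range of Mantel correlations across random re-assignments."""
    rng = np.random.default_rng(seed)
    d_env = euclidean_distance(np.asarray(E, dtype=float))
    values = np.empty(n_iter)
    for i in range(n_iter):
        Xr = reassign_unidentified(X, unidentified, rng)
        values[i] = np.corrcoef(bray_curtis_similarity(Xr), d_env)[0, 1]
    return values.min(), values.max()
```

Here `X` is a sites-by-species abundance table, `E` a sites-by-predictors table, and `unidentified` a boolean mask over species columns. The returned minimum and maximum correspond to the "range of plausible values" mentioned above; quantiles could be reported instead.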

Relevance:

30.00%

Publisher:

Abstract:

Reducing duplication in ex-situ collections is complicated and requires good-quality genetic markers. This study was conducted to assess the value of endosperm proteins and SSRs for validating potential duplicates and monitoring intra-accession variability. Fifty durum wheat (Triticum turgidum ssp. durum) accessions grouped in 23 potential duplicates, and previously characterised for 30 agro-morphological traits, were analysed for gliadin and high molecular weight glutenin (HMWG) subunit alleles, total protein, and 24 SSRs covering a wide genome area. Similarity and dissimilarity matrices were generated based on protein and SSR alleles. For accessions heterogeneous at gliadin loci, the percent pattern homology (PH) between gliadin patterns and Nei's coefficient of genetic identity (I) were computed. Eighteen duplicates identical for proteins showed no or fewer than 3 unshared SSR alleles. For heterogeneous accessions, PH and I values lower than 80 clearly identified off-types with more than 3 unshared SSRs. Only those biotypes differing in no more than one protein-coding locus were confirmed with SSRs. Good concordance among proteins, morphological traits, and SSRs was detected. However, the discrepancy in similarity detected in some cases showed that it is advisable to evaluate redundancy through distinct approaches. The analysis of proteins together with SSR data is very useful for identifying duplicates, biotypes, closely related genotypes, and contamination.
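
As a brief illustration of the identity coefficient mentioned in this abstract, the sketch below computes Nei's genetic identity in its standard form, I = J_xy / sqrt(J_x · J_y), where each J is the mean over loci of the summed products of allele frequencies. The function name and the input layout (one frequency list per locus, alleles in the same order for both accessions) are assumptions for the example. The result is on a 0 to 1 scale, and the abstract does not say how the study scaled or applied the coefficient.

```python
import numpy as np

def nei_identity(freqs_x, freqs_y):
    """Nei's genetic identity between two accessions.

    freqs_x, freqs_y: one sequence of allele frequencies per locus,
    with alleles listed in the same order for both accessions.
    """
    j_xy = np.mean([np.dot(x, y) for x, y in zip(freqs_x, freqs_y)])
    j_x = np.mean([np.dot(x, x) for x in freqs_x])
    j_y = np.mean([np.dot(y, y) for y in freqs_y])
    return j_xy / np.sqrt(j_x * j_y)
```

Two accessions with identical allele frequencies at every locus give an identity of 1, and accessions sharing no alleles give 0.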