946 results for Legacy datasets
Abstract:
Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaption of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments.
Abstract:
A Geographic Information System (GIS) was used to model datasets of Leyte Island, the Philippines, to identify land which was suitable for a forest extension program on the island. The datasets were modelled to provide maps of the distance of land from cities and towns, land of suitable elevation and slope for smallholder forestry and land of various soil types. An expert group was used to assign numeric site suitabilities to the soil types and maps of site suitability were used to assist the selection of municipalities for the provision of extension assistance to smallholders. Modelling of the datasets was facilitated by recent developments of the ArcGIS® suite of computer programs and derivation of elevation and slope was assisted by the availability of digital elevation models (DEM) produced by the Shuttle Radar Topography Mission (SRTM). The usefulness of GIS software as a decision support tool for small-scale forestry extension programs is discussed.
Abstract:
The area of private land suitable and available for growing hoop pine (Araucaria cunninghamii) on the Atherton Tablelands in North Queensland was modelled using a geographic information system (GIS). In Atherton, Eacham and Herberton shires, approximately 64,700 ha of privately owned land were identified as having a mean annual rainfall and soil type similar to Forestry Plantations Queensland (FPQ) hoop pine growth plots with an approximate growth rate of 20 m³ per annum. Land with slope of over 25° and land covered with native vegetation were excluded in the estimation. If land which is currently used for high-value agriculture is also excluded, the net area of land potentially suitable and available for expansion of hoop pine plantations is approximately 22,900 ha. Expert silvicultural advice emphasized the role of site preparation and weed control in affecting the long-term growth rate of hoop pine. Hence, sites with less than optimal fertility and rainfall may be considered as being potentially suitable for growing hoop pine at a lower growth rate. The datasets had been prepared at various scales and differing precision for their description of land attributes. Therefore, the results of this investigation have limited applicability for planning at the individual farm level but are useful at the regional level to target areas for plantation expansion.
Abstract:
In order to examine whether different populations show the same pattern of onset in the Southern Hemisphere, we examined the age-at-first-admission distribution for schizophrenia based on mental health registers from Australia and Brazil. Data on age-at-first-admission for individuals with schizophrenia were extracted from two names-linked registers, (1) the Queensland Mental Health Statistics System, Australia (N=7651, F=3293, M=4358), and (2) a psychiatric hospital register in Pelotas, Brazil (N=4428, F=2220, M=2208). Age distributions were derived for males and females for both datasets. The general population structure for both countries was also obtained. There were significantly more males in the Queensland dataset (χ² = 56.9, df = 3, p < 0.0001). Both dataset distributions were skewed to the right. Onset rose steeply after puberty to reach a modal age group of 20-29 for men and women, with a more gradual tail toward the older age groups. In Queensland 68% of women with schizophrenia had their first admissions after age 30, while the proportion from Brazil was 58%. Compared to the Australian dataset, the Brazilian dataset had a slightly greater proportion of first admissions under the age of 30 and a slightly smaller proportion over the age of 60 years. This reflects the underlying age distributions of the two populations. This study confirms the wide age range and gender differences in age-at-first-admission distributions for schizophrenia and identified a significant difference in the gender ratio between the two datasets. Given widely differing health services, cultural practices, ethnic variability, and the different underlying population distributions, the age-at-first-admission in Queensland and Brazil showed more similarities than differences. Acknowledgments: The Stanley Foundation supported this project.
Abstract:
We re-mapped the soils of the Murray-Darling Basin (MDB) in 1995-1998 with a minimum of new fieldwork, making the most out of existing data. We collated existing digital soil maps and used inductive spatial modelling to predict soil types from those maps combined with environmental predictor variables. Lithology, Landsat Multi Spectral Scanner (Landsat MSS), the 9-s digital elevation model (DEM) of Australia and derived terrain attributes, all gridded to 250-m pixels, were the predictor variables. Because the basin-wide datasets were very large, data mining software was used for modelling. Rule induction by data mining was also used to define the spatial domain of extrapolation for the extension of soil-landscape models from existing soil maps. Procedures to estimate the uncertainty associated with the predictions and quality of information for the new soil-landforms map of the MDB are described. (C) 2002 Elsevier Science B.V. All rights reserved.
Abstract:
The Australian Soil Resources Information System (ASRIS) database compiles the best publicly available information from Commonwealth, State, and Territory agencies into a national database of soil profile data, digital soil and land resources maps, and climate, terrain, and lithology datasets. These datasets are described in detail in this paper. Most datasets are thematic grids that cover the intensively used agricultural zones in Australia.
Abstract:
Background: A major goal in the post-genomic era is to identify and characterise disease susceptibility genes and to apply this knowledge to disease prevention and treatment. Rodents and humans have remarkably similar genomes and share closely related biochemical, physiological and pathological pathways. In this work we utilised the latest information on the mouse transcriptome as revealed by the RIKEN FANTOM2 project to identify novel human disease-related candidate genes. We define a new term, patholog, to mean a homolog of a human disease-related gene encoding a product (transcript, anti-sense or protein) potentially relevant to disease. Rather than just focus on Mendelian inheritance, we applied the analysis to all potential pathologs regardless of their inheritance pattern. Results: Bioinformatic analysis and human curation of 60,770 RIKEN full-length mouse cDNA clones produced 2,578 sequences that showed similarity (70-85% identity) to known human-disease genes. Using a newly developed biological information extraction and annotation tool (FACTS) in parallel with human expert analysis of 17,051 MEDLINE scientific abstracts we identified 182 novel potential pathologs. Of these, 36 were identified by computational tools only, 49 by human expert analysis only and 97 by both methods. These pathologs were related to neoplastic (53%), hereditary (24%), immunological (5%), cardio-vascular (4%), or other (14%) disorders. Conclusions: Large scale genome projects continue to produce a vast amount of data with potential application to the study of human disease. For this potential to be realised we need intelligent strategies for data categorisation and the ability to link sequence data with relevant literature. This paper demonstrates the power of combining human expert annotation with FACTS, a newly developed bioinformatics tool, to identify novel pathologs from within large-scale mouse transcript datasets.
Abstract:
1. Cluster analysis of reference sites with similar biota is the initial step in creating River Invertebrate Prediction and Classification System (RIVPACS) and similar river bioassessment models such as Australian River Assessment System (AUSRIVAS). This paper describes and tests an alternative prediction method, Assessment by Nearest Neighbour Analysis (ANNA), based on the same philosophy as RIVPACS and AUSRIVAS but without the grouping step that some people view as artificial. 2. The steps in creating ANNA models are: (i) weighting the predictor variables using a multivariate approach analogous to principal axis correlations, (ii) calculating the weighted Euclidean distance from a test site to the reference sites based on the environmental predictors, (iii) predicting the faunal composition based on the nearest reference sites and (iv) calculating an observed/expected (O/E) ratio analogous to RIVPACS/AUSRIVAS. 3. The paper compares AUSRIVAS and ANNA models on 17 datasets representing a variety of habitats and seasons. First, it examines each model's regressions for Observed versus Expected number of taxa, including the r², intercept and slope. Second, the two models' assessments of 79 test sites in New Zealand are compared. Third, the models are compared on test and presumed reference sites along a known trace metal gradient. Fourth, ANNA models are evaluated for Western Australia, a geographically distinct region of Australia. The comparisons demonstrate that ANNA and AUSRIVAS are generally equivalent in performance, although ANNA turns out to be potentially more robust for the O versus E regressions and is potentially more accurate on the trace metal gradient sites. 4. The ANNA method is recommended for use in bioassessment of rivers, at least for corroborating the results of the well established AUSRIVAS- and RIVPACS-type models, if not to replace them.
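The four steps listed in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The predictor weights are passed in by hand instead of being derived by the multivariate weighting of step (i), the prediction is an unweighted frequency over the `k` nearest reference sites, and the 0.5 probability cut-off for counting a taxon as expected is an assumption. All function names and the parameters `k` and `threshold` are hypothetical.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two environmental vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def expected_taxa(test_env, reference_sites, weights, k=3):
    """Estimate taxon occurrence probabilities from the k nearest reference sites.

    reference_sites is a list of (environment_vector, set_of_taxa) pairs.
    The probability of a taxon is the fraction of the k nearest sites
    at which it was recorded (an unweighted choice made for this sketch).
    """
    ranked = sorted(
        reference_sites,
        key=lambda site: weighted_distance(test_env, site[0], weights),
    )
    nearest = ranked[:k]
    taxa = set().union(*(site[1] for site in nearest))
    return {t: sum(t in site[1] for site in nearest) / len(nearest) for t in taxa}


def o_over_e(observed_taxa, probabilities, threshold=0.5):
    """Observed/expected ratio over taxa predicted at or above the threshold."""
    kept = {t: p for t, p in probabilities.items() if p >= threshold}
    expected = sum(kept.values())
    observed = sum(1 for t in kept if t in observed_taxa)
    return observed / expected
```

A test site would be scored by calling `expected_taxa` with its environmental vector and then passing the resulting probabilities, together with the taxa actually collected, to `o_over_e`. Values well below 1 indicate that fewer of the predicted taxa were found than expected.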
Abstract:
Objective: To determine the age-standardised prevalence of peripheral arterial disease (PAD) and associated risk factors, particularly smoking. Method: Design: Cross-sectional survey of a randomly selected population. Setting: Metropolitan area of Perth, Western Australia. Participants: Men aged 65-83 years. Results: The adjusted response fraction was 77.2%. Of 4,470 men assessed, 744 were identified as having PAD by the Edinburgh Claudication Questionnaire and/or the ankle-brachial index of systolic blood pressure, yielding an age-standardised prevalence of PAD of 15.6% (95% confidence interval (CI): 14.5%, 16.6%). The main risk factors identified in univariate analyses were increasing age, current (OR=3.9, 95% CI 2.9-5.1) or former (OR=2.0, 95% CI 1.6-2.4) smoking, physical inactivity (OR=1.4, 95% CI 1.2-1.7), a history of angina (OR=2.2, 95% CI 1.8-2.7) and diabetes mellitus (OR=2.1, 95% CI 1.7-2.6). The multivariate analysis showed that the highest relative risk associated with PAD was current smoking of 25 or more cigarettes daily (OR=7.3, 95% CI 4.2-12.8). In this population, 32% of PAD was attributable to current smoking and a further 40% was attributable to past smoking by men who did not smoke currently. Conclusions: This large observational study shows that PAD is relatively common in older, urban Australian men. In contrast with its relationship to coronary disease and stroke, previous smoking appears to have a long legacy of increased risk of PAD. Implications: This research emphasises the importance of smoking as a preventable cause of PAD.
Abstract:
The 'reflexive thinking' concept is discussed in this article as a means of contextualizing John Dewey's intellectual legacy. 'Reflection' represents a fundamental element for the construction of the necessary competences for information seeking and use, and consequently for individual and collective development. Since the reflexive thinking habit in information literacy is a way of learning, some questions concerning teaching and learning processes are also investigated. The discussion is, therefore, supported by the supposition that reflexive thinking is a cognitive strategy that allows a deeper comprehension of related problems, phenomena, and processes by means of the perception of the relations and the identification of involved elements, as well as the analysis and interpretation of meanings, empowering the information literacy process.
Abstract:
Phylogenetic hypotheses are presented for Pultenaea based on cpDNA (trnL-F and ndhF) and nrDNA (ITS) sequence data. Pultenaea, as it is currently circumscribed, comprises six strongly supported lineages whose relationships with each other and with 18 closely related genera are weakly supported or conflicting among datasets. The lack of resolution among the six Pultenaea clades and their relatives appears to be the result of a rapid radiation, which is evident in molecular data from both the chloroplast and nuclear genomes. The molecular data provide no support for the monophyly of Pultenaea as it currently stands. Given these results, Pultenaea could be split into many smaller genera. We prefer the taxonomically stable alternative of subsuming all 19 genera currently recognised in Pultenaea sensu lato (= the Mirbelia group) into an expanded concept of Pultenaea that would comprise approximately 470 species.
Abstract:
Hepatitis B is a worldwide health problem affecting about 2 billion people, of whom more than 350 million are chronic carriers of the virus. Nine HBV genotypes (A to I) have been described. The geographical distribution of HBV genotypes is not completely understood due to the limited number of samples from some parts of the world. One such example is Colombia, in which few studies have described the HBV genotypes. In this study, we characterized HBV genotypes in 143 HBsAg-positive volunteer blood donors from Colombia. A fragment of 1306 bp partially comprising HBsAg and the DNA polymerase coding regions (S/POL) was amplified and sequenced. Bayesian phylogenetic analyses were conducted using the Markov Chain Monte Carlo (MCMC) approach to obtain the maximum clade credibility (MCC) tree using BEAST v.1.5.3. Of all samples, 68 were positive and 52 were successfully sequenced. Genotype F was the most prevalent in this population (77%) - subgenotypes F3 (75%) and F1b (2%). Genotype G (7.7%) and subgenotype A2 (15.3%) were also found. Genotype G sequence analysis suggests distinct introductions of this genotype in the country. Furthermore, we estimated the time of the most recent common ancestor (TMRCA) for each HBV/F subgenotype and also for Colombian F3 sequences using two different datasets: (i) 77 sequences comprising 1306 bp of S/POL region and (ii) 283 sequences comprising 681 bp of S/POL region. We also used two other previously estimated evolutionary rates: (i) 2.60 × 10⁻⁴ s/s/y and (ii) 1.5 × 10⁻⁵ s/s/y. Here we report the HBV genotypes circulating in Colombia and estimated the TMRCA for the four different subgenotypes of genotype F. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
Mitochondrial DNA (mtDNA) population data for forensic purposes are still scarce for some populations, which may limit the evaluation of forensic evidence especially when the rarity of a haplotype needs to be determined in a database search. In order to improve the collection of mtDNA lineages from the Iberian and South American subcontinents, we here report the results of a collaborative study involving nine laboratories from the Spanish and Portuguese Speaking Working Group of the International Society for Forensic Genetics (GHEP-ISFG) and EMPOP. The individual laboratories contributed population data that were generated throughout the past 10 years, but in the majority of cases have not been made available to the scientific community. A total of 1019 haplotypes from Iberia (Basque Country, 2 general Spanish populations, 2 North and 1 Central Portugal populations), and Latin America (3 populations from São Paulo) were collected, reviewed and harmonized according to defined EMPOP criteria. The majority of data ambiguities that were found during the reviewing process (41 in total) were transcription errors, confirming that the documentation process is still the most error-prone stage in reporting mtDNA population data, especially when performed manually. This GHEP-EMPOP collaboration has significantly improved the quality of the individual mtDNA datasets and adds mtDNA population data as a valuable resource to the EMPOP database (www.empop.org). (C) 2010 Elsevier Ireland Ltd. All rights reserved.
Abstract:
In this paper, we propose a method based on association rule-mining to enhance the diagnosis of medical images (mammograms). It combines low-level features automatically extracted from images and high-level knowledge from specialists to search for patterns. Our method analyzes medical images and automatically generates suggestions of diagnoses employing mining of association rules. The suggestions of diagnosis are used to accelerate the image analysis performed by specialists as well as to provide them an alternative to work on. The proposed method uses two new algorithms, PreSAGe and HiCARe. The PreSAGe algorithm combines, in a single step, feature selection and discretization, and reduces the mining complexity. Experiments performed on PreSAGe show that this algorithm is highly suitable to perform feature selection and discretization in medical images. HiCARe is a new associative classifier. The HiCARe algorithm has an important property that makes it unique: it assigns multiple keywords per image to suggest a diagnosis with high values of accuracy. Our method was applied to real datasets, and the results show high sensitivity (up to 95%) and accuracy (up to 92%), allowing us to claim that the use of association rules is a powerful means to assist in the diagnosing task.
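The general idea of mining association rules that link discretized image features to diagnosis keywords can be sketched in a few lines of Python. This is a generic support/confidence illustration and not the PreSAGe or HiCARe algorithms, whose details are not given in the abstract. The feature tokens, the restriction to single-feature antecedents, and all function names are assumptions made for the example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by antecedent support."""
    ante = support(transactions, antecedent)
    if ante == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / ante


def mine_rules(transactions, labels, min_support, min_confidence):
    """Return rules {feature} -> {label} that pass both thresholds.

    Each transaction is a set holding the discretized feature tokens of one
    image plus its diagnosis keyword. Only single-feature antecedents are
    explored, to keep the sketch short.
    """
    features = set().union(*transactions) - labels
    rules = []
    for feature in sorted(features):
        for label in sorted(labels):
            s = support(transactions, {feature, label})
            c = confidence(transactions, {feature}, {label})
            if s >= min_support and c >= min_confidence:
                rules.append((feature, label, s, c))
    return rules
```

Rules that survive both thresholds could then be matched against the features of a new image to suggest diagnosis keywords, which is the role the abstract assigns to the associative classifier.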
Abstract:
Human leukocyte antigen (HLA) haplotypes are frequently evaluated for population history inferences and association studies. However, the available typing techniques for the main HLA loci usually do not allow the determination of the allele phase and the constitution of a haplotype, which may be obtained by a very time-consuming and expensive family-based segregation study. Without the family-based study, computational inference by probabilistic models is necessary to obtain haplotypes. Several authors have used the expectation-maximization (EM) algorithm to determine HLA haplotypes, but high levels of erroneous inferences are expected because of the genetic distance among the main HLA loci and the presence of several recombination hotspots. In order to evaluate the efficiency of computational inference methods, 763 unrelated individuals stratified into three different datasets had their haplotypes manually defined in a family-based study of HLA-A, -B, -DRB1 and -DQB1 segregation, and these haplotypes were compared with the data obtained by the following three methods: the Expectation-Maximization (EM) and Excoffier-Laval-Balding (ELB) algorithms using the Arlequin 3.11 software, and the PHASE method. When comparing the methods, we observed that all algorithms showed a poor performance for haplotype reconstruction with distant loci, estimating incorrect haplotypes for 38%-57% of the samples considering all algorithms and datasets. We suggest that computational haplotype inferences involving low-resolution HLA-A, HLA-B, HLA-DRB1 and HLA-DQB1 haplotypes should be considered with caution.
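To make the EM approach mentioned in this abstract concrete, the following is a minimal Python sketch of EM haplotype-frequency estimation for a toy case of two biallelic loci. It is not the Arlequin or PHASE implementation, and real HLA loci are highly multi-allelic. The genotype coding (number of copies of allele "1" at each locus), the uniform starting frequencies, and all names are assumptions made for the example.

```python
from itertools import combinations_with_replacement

# The four possible haplotypes for two biallelic loci, coded as (locus1, locus2).
HAPS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def compatible_pairs(g1, g2):
    """Unordered haplotype pairs whose allele counts match genotype (g1, g2)."""
    return [
        (h1, h2)
        for h1, h2 in combinations_with_replacement(HAPS, 2)
        if h1[0] + h2[0] == g1 and h1[1] + h2[1] == g2
    ]


def em_haplotype_freqs(genotype_counts, n_iter=50):
    """Estimate haplotype frequencies from unphased genotype counts by EM.

    genotype_counts maps (g1, g2) to the number of individuals, where g1 and
    g2 are the counts (0, 1 or 2) of allele '1' at locus 1 and locus 2.
    """
    freqs = {h: 0.25 for h in HAPS}  # uniform start (an arbitrary choice)
    n_chrom = 2 * sum(genotype_counts.values())
    for _ in range(n_iter):
        counts = {h: 0.0 for h in HAPS}
        for (g1, g2), n in genotype_counts.items():
            pairs = compatible_pairs(g1, g2)
            # E-step: probability of each phase under current frequencies
            # (f^2 for a homozygous pair, 2*f1*f2 for a heterozygous pair).
            weights = [
                (1 if h1 == h2 else 2) * freqs[h1] * freqs[h2] for h1, h2 in pairs
            ]
            total = sum(weights)
            if total == 0:
                continue
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += n * w / total
                counts[h2] += n * w / total
        # M-step: expected haplotype counts become the new frequencies.
        freqs = {h: c / n_chrom for h, c in counts.items()}
    return freqs
```

Only double heterozygotes are phase-ambiguous in this toy setting, and the algorithm resolves them according to the frequencies estimated from the other individuals. That dependence on population frequencies, with no direct phase evidence, is one reason such inferences can be wrong for individual samples.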