163 results for structured data

in University of Queensland eSpace - Australia


Relevance:

30.00%

Abstract:

Data mining is the process of identifying valid, implicit, previously unknown, potentially useful and understandable information in large databases. It is an important step in the process of knowledge discovery in databases (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, semi-structured, or unstructured, and can take the form of text, categorical or numerical values. One of the important characteristics of data mining is its ability to deal with data that are large in volume, distributed, time-variant, noisy, and high-dimensional. A large number of data mining algorithms have been developed for different applications. For example, association rule mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in data mining, particularly for data mining applications in engineering fields. Together with regression, classification is mainly used for predictive modelling. A number of classification algorithms are in practical use. According to Sebastiani (2002), the main classification algorithms can be categorized as: decision tree and rule-based approaches such as C4.5 (Quinlan, 1996); probabilistic methods such as the Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten, 2001); neural network methods (Rumelhart, Hinton & Williams, 1986); example-based methods such as k-nearest neighbors (Duda & Hart, 1973); and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al., 1998) and Ensemble Classification (Tumer, 1996).

Relevance:

30.00%

Abstract:

The infection of insect cells with baculovirus was described in a mathematical model as part of a structured dynamic model describing whole animal cell metabolism. The model presented here is capable of simulating cell population dynamics, the concentrations of extracellular and intracellular viral components, and the heterologous product titers. The model describes the entire process of viral infection and the effect of the infection on host cell metabolism. Dynamic simulation of the model in batch and fed-batch mode gave good agreement between model predictions and experimental data. Optimum conditions for insect cell culture and viral infection in batch and fed-batch culture were studied using the model.

Relevance:

30.00%

Abstract:

The movement of chemicals through the soil to groundwater, or their discharge to surface waters, represents a degradation of these resources. In many cases, serious human and stock health implications are associated with this form of pollution. The chemicals of interest include nutrients, pesticides, salts, and industrial wastes. Recent studies have shown that current models and methods do not adequately describe the leaching of nutrients through soil, often underestimating the risk of groundwater contamination by surface-applied chemicals and overestimating the concentration of resident solutes. This inaccuracy results primarily from ignoring soil structure and nonequilibrium between soil constituents, water, and solutes. A multiple sample percolation system (MSPS), consisting of 25 individual collection wells, was constructed to study the effects of localized soil heterogeneities on the transport of nutrients (NO₃⁻, Cl⁻, PO₄³⁻) in the vadose zone of an agricultural soil dominated by clay. Highly significant variations in drainage patterns across a small spatial scale were observed (one-way ANOVA, p < 0.001), indicating considerable heterogeneity in water flow patterns and nutrient leaching. Using data collected from the multiple sample percolation experiments, this paper compares the performance of two mathematical models for predicting solute transport: the advective-dispersion model with a reaction term (ADR), and a two-region preferential flow model (TRM) suitable for modelling nonequilibrium transport. These results have implications for modelling solute transport and predicting nutrient loading on a larger scale. © 2001 Elsevier Science Ltd. All rights reserved.

Relevance:

30.00%

Abstract:

Structured soils are characterized by the presence of inter- and intra-aggregate pore systems and aggregates, which show varying chemical, physical, and biological properties depending on the aggregate type and land use system. How far these aspects also affect ion exchange processes, and to what extent the interaction between the carbon distribution and the kind of organic substances affects internal soil strength as well as hydraulic properties like wettability, are still under discussion. Thus, the objective of this research was to clarify the effect of soil aggregation on the physical and chemical properties of structured soils at two scales: homogenized material and single aggregates. Data obtained by sequentially peeling off soil aggregate layers revealed gradients in the chemical composition from the aggregate surface to the aggregate core. In aggregates from long-term untreated forest soils we found lower amounts of carbon in the external layer, while in arable soils the differentiation was not pronounced. However, soil aggregates originating from these sites exhibited higher microbial activity in the outer aggregate layer, declining towards the interior. Furthermore, soil depth and the vegetation type affected the wettability. Aggregate strength depended on water suction and differences in tillage treatments.

Relevance:

20.00%

Abstract:

Observational longitudinal research is particularly useful for assessing etiology and prognosis and for providing evidence for clinical decision making. However, there are no structured reporting requirements for studies of this design to assist authors, editors, and readers. The authors developed and tested a checklist of criteria related to threats to the internal and external validity of observational longitudinal studies. The checklist criteria concerned recruitment, data collection, biases, and data analysis and descriptive issues relevant to study rationale, study population, and generalizability. Two raters independently assessed 49 randomly selected articles describing stroke research published from 1999 to 2003 in six journals: American Journal of Epidemiology, Journal of Epidemiology and Community Health, Stroke, Annals of Neurology, Archives of Physical Medicine and Rehabilitation, and American Journal of Physical Medicine and Rehabilitation. On average, 17 of the 33 checklist criteria were reported. Criteria describing the study design were better reported than those related to internal validity. No relation was found between study type (etiologic or prognostic) or word count and quality of reporting. A flow diagram for summarizing participant flow through a study was developed. Editors and authors should consider using a checklist and flow diagram when reporting on observational longitudinal research.

Relevance:

20.00%

Abstract:

This document records the process of migrating eprints.org data to a Fez repository. Fez is a Web-based digital repository and workflow management system based on Fedora (http://www.fedora.info/). At the time of migration, the University of Queensland Library was using EPrints 2.2.1 [pepper] for its ePrintsUQ repository. Once we began to develop Fez, we did not upgrade to later versions of eprints.org software since we knew we would be migrating data from ePrintsUQ to the Fez-based UQ eSpace. Since this document records our experiences of migration from an earlier version of eprints.org, anyone seeking to migrate eprints.org data into a Fez repository might encounter some small differences. Moving UQ publication data from an eprints.org repository into a Fez repository (hereafter called UQ eSpace, http://espace.uq.edu.au/) was part of a plan to integrate metadata (and, in some cases, full texts) about all UQ research outputs, including theses, images, multimedia and datasets, in a single repository. This tied in with the plan to identify and capture the research output of a single institution, the main task of the eScholarshipUQ testbed for the Australian Partnership for Sustainable Repositories project (http://www.apsr.edu.au/). The migration could not occur at UQ until the functionality in Fez was at least equal to that of the existing ePrintsUQ repository. Accordingly, as Fez development proceeded throughout 2006, a list of eprints.org functionality not yet supported in Fez was created so that the corresponding development work could be planned and implemented.

Relevance:

20.00%

Abstract:

Parkinson’s disease (PD) is a progressive, degenerative, neurological disease. The progressive disability associated with PD results in substantial burdens for those with the condition, their families and society in terms of increased health resource use, earnings loss of affected individuals and family caregivers, poorer quality of life, caregiver burden, disrupted family relationships, decreased social and leisure activities, and deteriorating emotional well-being. Currently, no cure is available and the efficacy of available treatments, such as medication and surgical interventions, decreases with longer duration of the disease. Whilst the cause of PD is unknown, genetic and environmental factors are believed to contribute to its aetiology. Descriptive and analytical epidemiological studies have been conducted in a number of countries in an effort to elucidate the cause, or causes, of PD. Rural residency, farming, well water consumption, pesticide exposure, metals and solvents have been implicated as potential risk factors for PD in some previous epidemiological studies. However, there is substantial disagreement between the results of existing studies. Therefore, the role of environmental exposures in the aetiology of PD remains unclear. The main component of this thesis consists of a case-control study that assessed the contribution of environmental exposures to the risk of developing PD. An existing, previously unanalysed, dataset from a local case-control study was analysed to inform the design of the new case-control study. The analysis results suggested that regular exposure to pesticides and head injury were important risk factors for PD. However, due to the substantial limitations of this existing study, further confirmation of these results was desirable with a more robustly designed epidemiological study. 
A new exposure measurement instrument (a structured interviewer-delivered questionnaire) was developed for the new case-control study to obtain data on demographic, lifestyle, environmental and medical factors. Prior to its use in the case-control study, the questionnaire was assessed for test-retest repeatability in a series of 32 PD cases and 29 healthy sex-, age- and residential suburb-matched electoral roll controls. High repeatability was demonstrated for lifestyle exposures, such as smoking and coffee/tea consumption (kappas 0.70-1.00). The majority of environmental exposures, including use of pesticides, solvents and exposure to metal dusts and fumes, also showed high repeatability (kappas > 0.78). A consecutive series of 163 PD case participants was recruited from a neurology clinic in Brisbane. One hundred and fifty-one (151) control participants were randomly selected from the Australian Commonwealth Electoral Roll and individually matched to the PD cases on age (± 2 years), sex and current residential suburb. Participants ranged in age from 40 to 89 years (mean age 67 years). Exposure data were collected in face-to-face interviews. Odds ratios and 95% confidence intervals were calculated using conditional logistic regression for matched sets in SAS version 9.1. Consistent with previous studies, ever having been a regular smoker or coffee drinker was inversely associated with PD, with dose-response relationships evident for pack-years smoked and number of cups of coffee drunk per day. Passive smoking from ever having lived with a smoker or worked in a smoky workplace was also inversely related to PD. Ever having been a regular tea drinker was associated with decreased odds of PD. Hobby gardening was inversely associated with PD. However, use of fungicides in the home garden or occupationally was associated with increased odds of PD. Exposure to welding fumes, cleaning solvents, or thinners occupationally was associated with increased odds of PD.
Ever having resided in a rural or remote area was inversely associated with PD. Ever having resided on a farm was associated with only moderately increased odds of PD. Whilst the current study’s results suggest that environmental exposures on their own are only modest contributors to overall PD risk, the possibility that interaction with genetic factors may additively or synergistically increase risk should be considered. The results of this research support the theory that PD has a multifactorial aetiology and that environmental exposures are among a number of factors that contribute to PD risk. There was also evidence of interaction between some factors (e.g. smoking and welding) to moderate PD risk.
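The abstract reports odds ratios and 95% confidence intervals from conditional logistic regression in SAS. As a rough illustration only, and not the thesis's actual analysis, the sketch below covers the simplest special case: 1:1 matched pairs and a single binary exposure, where the conditional maximum-likelihood odds ratio reduces to the ratio of discordant pairs. The function name and the pair counts are hypothetical.

```python
import math

def matched_pair_odds_ratio(case_only, control_only, z=1.96):
    """Odds ratio and Wald-type 95% CI for 1:1 matched pairs.

    case_only:    pairs in which only the case was exposed
    control_only: pairs in which only the control was exposed
    Concordant pairs carry no information and are ignored.
    """
    if case_only <= 0 or control_only <= 0:
        raise ValueError("both discordant-pair counts must be positive")
    odds_ratio = case_only / control_only
    # Standard error of log(OR) based on the discordant pairs only.
    se_log_or = math.sqrt(1.0 / case_only + 1.0 / control_only)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, not taken from the study.
or_est, ci_low, ci_high = matched_pair_odds_ratio(30, 15)
print(f"OR = {or_est:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

With several exposures, covariates, or matched sets of unequal size, a full conditional logistic regression is needed and this shortcut no longer applies.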

Relevance:

20.00%

Abstract:

There is substantial disagreement among published epidemiological studies regarding environmental risk factors for Parkinson’s disease (PD). Differences in the quality of measurement of environmental exposures may contribute to this variation. The current study examined the test–retest repeatability of self-report data on risk factors for PD obtained from a series of 32 PD cases recruited from neurology clinics and 29 healthy sex-, age- and residential suburb-matched controls. Exposure data were collected in face-to-face interviews using a structured questionnaire derived from previous epidemiological studies. High repeatability was demonstrated for ‘lifestyle’ exposures, such as smoking and coffee/tea consumption (kappas 0.70–1.00). Environmental exposures that involved some action by the person, such as pesticide application and use of solvents and metals, also showed high repeatability (kappas > 0.78). Lower repeatability was seen for rural residency and bore water consumption (kappa 0.39–0.74). In general, we found that case and control participants provided similar rates of incongruent and missing responses for categorical and continuous occupational, domestic, lifestyle and medical exposures.
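The kappa values quoted above measure agreement beyond chance between the two interviews. The abstract does not say which kappa variant was used; the sketch below assumes the unweighted Cohen's kappa for a categorical item, kappa = (p_o - p_e) / (1 - p_e), and the response lists are hypothetical.

```python
from collections import Counter

def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two equally long lists of categorical answers."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty response lists of equal length")
    n = len(first)
    # Observed agreement: share of participants giving the same answer twice.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement from the marginal answer frequencies of each interview.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[cat] * c2[cat] for cat in c1) / (n * n)
    if expected == 1.0:
        return 1.0  # every answer in both interviews was the same category
    return (observed - expected) / (1.0 - expected)

# Hypothetical "ever used pesticides?" answers at test and retest.
interview_1 = ["yes", "yes", "no", "no"]
interview_2 = ["yes", "yes", "no", "yes"]
kappa = cohen_kappa(interview_1, interview_2)
print(round(kappa, 2))
```

Here three of four answers agree (p_o = 0.75) while chance agreement is 0.5, so kappa is lower than the raw agreement rate suggests.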

Relevance:

20.00%

Abstract:

The final-year project for Mechanical & Space Engineering students at UQ often involves the design and flight testing of an experiment. This report describes the design and use of a simple data logger that should be suitable for collecting data from the students' flight experiments. The exercise here was taken as far as the construction of a prototype device that is suitable for ground-based testing, say, the static firing of a hybrid rocket motor.

Relevance:

20.00%

Abstract:

Background and Purpose: What drives some athletes to achieve at the highest level whilst other athletes fail to achieve their physical potential? Why does the ‘fire’ burn so brightly for some elite athletes and not for others? A good understanding of an athlete’s motivation is critical to a coach designing an appropriate motivational climate to realize an athlete’s physical talent. This paper examines the motivational processes of elite athletes within the framework of three major social-cognitive theories of motivation. Method: Participants were five male and five female elite track and field athletes from Australia who had finished in the top ten at the Olympic Games and/or the World Championships in the last six years. Qualitative data were collected using semi-structured interviews. Results and Discussion: Inductive analyses revealed several major themes associated with the motivational processes of elite athletes: (a) they were highly driven by personal goals and achievement, (b) they had strong self-belief, and (c) track and field was central to their lives. The findings are discussed in light of recent social-cognitive theories of motivation, namely, self-determination theory, the hierarchical model of motivation, and achievement goal theory. Self-determined forms of motivation characterised the elite athletes in this study and, consistent with social-cognitive theories of motivation, it is suggested that goal accomplishment enhances perceptions of competence and consequently promotes self-determined forms of motivation.

Relevance:

20.00%

Abstract:

Three main models of parameter setting have been proposed: the Variational model proposed by Yang (2002; 2004), the Structured Acquisition model endorsed by Baker (2001; 2005), and the Very Early Parameter Setting (VEPS) model advanced by Wexler (1998). The VEPS model contends that parameters are set early. The Variational model supposes that children employ statistical learning mechanisms to decide among competing parameter values, so this model anticipates delays in parameter setting when critical input is sparse, and gradual setting of parameters. On the Structured Acquisition model, delays occur because parameters form a hierarchy, with higher-level parameters set before lower-level parameters. Assuming that children freely choose the initial value, children will sometimes mis-set parameters. However, when that happens, the input is expected to trigger a precipitous rise in one parameter value and a corresponding decline in the other value. We will point to the kind of child language data that is needed in order to adjudicate among these competing models.

Relevance:

20.00%

Abstract:

A combination of deductive reasoning, clustering, and inductive learning is given as an example of a hybrid system for exploratory data analysis. Visualization is replaced by a dialogue with the data.

Relevance:

20.00%

Abstract:

This paper reports a comparative study of Australian and New Zealand leadership attributes, based on the GLOBE (Global Leadership and Organizational Behavior Effectiveness) program. Responses from 344 Australian managers and 184 New Zealand managers in three industries were analyzed using exploratory and confirmatory factor analysis. Results supported some of the etic leadership dimensions identified in the GLOBE study, but also found some emic dimensions of leadership for each country. An interesting finding of the study was that the New Zealand data fitted the Australian model, but not vice versa, suggesting asymmetric perceptions of leadership in the two countries.

Relevance:

20.00%

Abstract:

In the context of cancer diagnosis and treatment, we consider the problem of constructing an accurate prediction rule on the basis of a relatively small number of tumor tissue samples of known type containing the expression data on very many (possibly thousands of) genes. Recently, results have been presented in the literature suggesting that it is possible to construct a prediction rule from only a few genes such that it has a negligible prediction error rate. However, in these results the test error or the leave-one-out cross-validated error is calculated without allowance for the selection bias. There is no allowance because either the rule is tested on tissue samples that were used in the first instance to select the genes being used in the rule, or the cross-validation of the rule is not external to the selection process; that is, gene selection is not performed in training the rule at each stage of the cross-validation process. We describe how in practice the selection bias can be assessed and corrected for by either performing a cross-validation or applying the bootstrap external to the selection process. We recommend using 10-fold rather than leave-one-out cross-validation, and concerning the bootstrap, we suggest using the so-called .632+ bootstrap error estimate designed to handle overfitted prediction rules. Using two published data sets, we demonstrate that when correction is made for the selection bias, the cross-validated error is no longer zero for a subset of only a few genes.
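The difference between cross-validation that is internal and external to gene selection can be sketched on synthetic data. The example below is a minimal illustration under stated assumptions, not the authors' procedure: the data are pure noise (so the true error rate is 50%), genes are ranked by the absolute difference in class means, and the prediction rule is a nearest-centroid classifier. All function names and settings are hypothetical.

```python
import random

def top_features(X, y, k):
    """Indices of the k features with the largest absolute class-mean difference."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        scores.append(abs(sum(a) / len(a) - sum(b) / len(b)))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def centroid_predict(X_train, y_train, x, feats):
    """Assign x to the class whose centroid (on the chosen features) is nearest."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [r for r, lab in zip(X_train, y_train) if lab == label]
        dist = sum((x[j] - sum(r[j] for r in rows) / len(rows)) ** 2 for j in feats)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def cv_error(X, y, k, n_folds, external):
    """Cross-validated error; selection is redone per fold only if external=True."""
    n = len(X)
    folds = [list(range(i, n, n_folds)) for i in range(n_folds)]
    # Biased variant: genes are chosen once, using every sample including test ones.
    fixed = None if external else top_features(X, y, k)
    errors = 0
    for test_idx in folds:
        train_idx = [i for i in range(n) if i not in test_idx]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        feats = top_features(X_tr, y_tr, k) if external else fixed
        for i in test_idx:
            errors += centroid_predict(X_tr, y_tr, X[i], feats) != y[i]
    return errors / n

rng = random.Random(0)
n_samples, n_genes = 20, 1000
X = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_samples)]
y = [0] * 10 + [1] * 10  # labels are unrelated to the noise features

biased_err = cv_error(X, y, k=10, n_folds=10, external=False)
external_err = cv_error(X, y, k=10, n_folds=10, external=True)
print("selection outside CV loop (biased):", biased_err)
print("selection inside each fold (external):", external_err)
```

Because the labels carry no signal, any error estimate well below 0.5 reflects selection bias rather than predictive ability; repeating gene selection inside each training fold removes that optimism.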