981 results for observational methods


Relevance:

20.00%

Publisher:

Abstract:

In this thesis we present and evaluate two pattern-matching-based methods for answer extraction in textual question answering systems. A textual question answering system seeks answers to natural language questions from unstructured text. Such systems are an important research problem because, as the amount of natural language text in digital form keeps growing, the need for novel methods for pinpointing important knowledge in vast textual databases becomes ever more urgent. We concentrate on developing methods for the automatic creation of answer extraction patterns, and a new type of extraction pattern is also developed. The pattern-matching approach is attractive because of its language and application independence. The answer extraction methods are developed in the framework of our own question answering system, with publicly available English datasets used as training and evaluation data. The techniques developed are based on the well-known methods of sequence alignment and hierarchical clustering, using a similarity metric based on edit distance. The main conclusion of the research is that answer extraction patterns consisting of the most important words of the question, together with information extracted from the answer context (plain words, part-of-speech tags, punctuation marks and capitalization patterns), can be used in the answer extraction module of a question answering system. These patterns and the two new methods for generating them yield average results compared with other systems evaluated on the same dataset. However, most answer extraction methods in those systems are hand-crafted and rely on a system-specific, fine-grained question classification, whereas the new methods developed in this thesis require no manual creation of answer extraction patterns. As a source of knowledge, they require only a dataset of sample questions and answers, together with a set of text documents that contain answers to most of the questions. The question classification used in the training data is a standard one, provided in the publicly available data.
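
As a rough illustration of the underlying technique (edit distance plus hierarchical clustering), not of the thesis's actual pattern-generation algorithm, the sketch below clusters token-level answer contexts by Levenshtein distance; the token sequences and the threshold are hypothetical:

```python
# Minimal sketch: clustering answer contexts by token-level edit distance.

def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[m][n]

def cluster(contexts, threshold):
    """Naive single-linkage agglomerative clustering of token sequences."""
    clusters = [[c] for c in contexts]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if any(edit_distance(a, b) <= threshold
                       for a in clusters[i] for b in clusters[j]):
                    clusters[i].extend(clusters.pop(j))
                    merged = True
                    break
            if merged:
                break
    return clusters

# Answer contexts as mixed tokens: words, POS tags, punctuation and a
# capitalization marker, in the spirit the abstract describes.
contexts = [
    ["CAP", "NNP", ",", "born", "in", "CD"],
    ["CAP", "NNP", ",", "born", "on", "CD"],
    ["the", "capital", "of", "CAP", "NNP"],
]
print(cluster(contexts, threshold=2))
```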

Relevance:

20.00%

Publisher:

Abstract:

The Minimum Description Length (MDL) principle is a general, well-founded theoretical formalization of statistical modeling. Its most important notion is the stochastic complexity, which can be interpreted as the shortest description length of a given data sample relative to a model class. The exact definition of the stochastic complexity has gone through several evolutionary steps; the latest instantiation is based on the so-called Normalized Maximum Likelihood (NML) distribution, which has been shown to possess several important theoretical properties. However, applications of this modern version of MDL have been quite rare because of computational complexity problems: for discrete data, the definition of NML involves an exponential sum, and for continuous data, a multi-dimensional integral that is usually infeasible to evaluate or even approximate accurately. In this doctoral dissertation, we present mathematical techniques for computing NML efficiently for certain model families over discrete data. We also show how these techniques can be used to apply MDL in two practical applications: histogram density estimation and clustering of multi-dimensional data.
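
For concreteness, the "exponential sum" is tractable in the simplest case: for the Bernoulli model class the NML normalizer reduces to a single sum over the possible counts of ones. A minimal sketch of this standard computation (not the dissertation's more advanced techniques):

```python
# Exact NML for the Bernoulli model class: the normalizing constant sums
# maximum-likelihood values over all 2^n sequences, grouped by their
# count of ones so only n+1 terms are needed.

from math import comb

def bernoulli_nml(data):
    n = len(data)
    k = sum(data)  # number of ones in the observed sequence
    def ml(k, n):
        # ML probability of a sequence with k ones (0.0 ** 0 == 1.0)
        return (k / n) ** k * ((n - k) / n) ** (n - k)
    C = sum(comb(n, j) * ml(j, n) for j in range(n + 1))
    return ml(k, n) / C

data = [1, 1, 0, 1, 0, 1, 1, 1]
print(bernoulli_nml(data))  # NML probability of this particular sequence
```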

Relevance:

20.00%

Publisher:

Abstract:

Matrix decompositions, in which a given matrix is represented as a product of two other matrices, are regularly used in data mining. Most matrix decompositions have their roots in linear algebra, but the needs of data mining are not always those of linear algebra. In data mining one needs results that are interpretable, and what counts as interpretable in data mining can be very different from what counts as interpretable in linear algebra. The purpose of this thesis is to study matrix decompositions that directly address the issue of interpretability. An example is a decomposition of binary matrices in which the factor matrices are also binary and the matrix multiplication is Boolean. The restriction to binary factor matrices increases interpretability, since the factor matrices are of the same type as the original matrix, and allows the use of Boolean matrix multiplication, which is often more intuitive than normal matrix multiplication over binary matrices. Several other decomposition methods are also described, and the computational complexity of computing them is studied together with the hardness of approximating the related optimization problems. Based on these studies, algorithms for constructing the decompositions are proposed. Constructing the decompositions turns out to be computationally hard, and the proposed algorithms are mostly based on various heuristics. Nevertheless, the algorithms are shown to be capable of finding good results in empirical experiments conducted with both synthetic and real-world data.
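
A minimal sketch of the Boolean setting described above: under Boolean multiplication, an entry of the product is 1 if any pair of corresponding factor entries are both 1. The factor matrices below are made up for illustration; finding good factors is the computationally hard part studied in the thesis.

```python
import numpy as np

def boolean_product(B, C):
    """Boolean matrix product: entry (i, j) is 1 if B[i, k] AND C[k, j]
    holds for any k (OR of ANDs instead of sum of products)."""
    return ((B @ C) > 0).astype(int)

def reconstruction_error(A, B, C):
    """Number of cells where the Boolean product disagrees with A."""
    return int(np.sum(A != boolean_product(B, C)))

A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]])
B = np.array([[1, 0],
              [1, 1],
              [0, 1]])
C = np.array([[1, 1, 0],
              [0, 1, 1]])
print(boolean_product(B, C))
print(reconstruction_error(A, B, C))  # 0: an exact Boolean decomposition
```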

Relevance:

20.00%

Publisher:

Abstract:

This thesis presents methods for locating and analyzing cis-regulatory DNA elements involved in the regulation of gene expression in multicellular organisms. The regulation of gene expression is carried out by the combined effort of several transcription factor proteins collectively binding the DNA at the cis-regulatory elements. Only sparse knowledge of the 'genetic code' of these elements exists today. An automatic tool for discovering putative cis-regulatory elements could aid their experimental analysis, which in turn would yield a more detailed view of cis-regulatory element structure and function. We have developed a computational model for the evolutionary conservation of cis-regulatory elements. The elements are modeled as evolutionarily conserved clusters of sequence-specific transcription factor binding sites. We give an efficient dynamic programming algorithm that locates the putative cis-regulatory elements and scores them according to the conservation model. A notable proportion of the high-scoring DNA sequences show transcriptional enhancer activity in transgenic mouse embryos. The conservation model includes four parameters whose optimal values are estimated with simulated annealing. With good parameter values the model discriminates well between DNA sequences containing evolutionarily conserved cis-regulatory elements and DNA sequences that have evolved neutrally. On further inquiry, the set of highest-scoring putative cis-regulatory elements was found to be sensitive to small variations in the parameter values. The statistical significance of the putative cis-regulatory elements is estimated with the Two-Component Extreme Value Distribution; the p-values grade how far the conservation of the cis-regulatory elements exceeds the neutral expectation. The parameters of the distribution are estimated by simulating neutral DNA evolution. The conservation of transcription factor binding sites can be used in the upstream analysis of regulatory interactions, which may provide mechanistic insight into transcription-level data from, e.g., microarray experiments. Here we give a method to predict shared transcriptional regulators for a set of co-expressed genes. The EEL (Enhancer Element Locator) software implements the method for locating putative cis-regulatory elements and supports both interactive use and distributed batch processing. We have used it to analyze the non-coding regions around all human genes with respect to the orthologous regions in various other species, including mouse. The data from these genome-wide analyses are stored in a relational database, which underlies the publicly available web services for upstream analysis and visualization of the putative cis-regulatory elements in the human genome.
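
The chaining idea behind such a dynamic programming formulation can be sketched as follows. This is a simplified illustration, not the EEL algorithm itself, and the hit positions and scores are hypothetical:

```python
# Simplified chaining DP: each conserved binding-site hit may extend an
# earlier cluster if the gap between them is small enough, and the best
# cluster score is the maximum over all chain endpoints.

def best_cluster_score(hits, max_gap):
    """hits: list of (position, score) pairs for conserved binding-site
    matches. Returns the best total score of a chained cluster."""
    hits = sorted(hits)
    best = [0.0] * len(hits)
    for i, (pos_i, score_i) in enumerate(hits):
        prev = 0.0
        for j in range(i):
            pos_j, _ = hits[j]
            if 0 < pos_i - pos_j <= max_gap:
                prev = max(prev, best[j])
        best[i] = score_i + prev
    return max(best, default=0.0)

# Hypothetical hits: (position in the alignment, conservation score)
hits = [(100, 2.1), (140, 1.7), (500, 3.0), (530, 2.2)]
print(best_cluster_score(hits, max_gap=100))  # chains nearby hits: 5.2
```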

Relevance:

20.00%

Publisher:

Abstract:

Ubiquitous computing is about making computers and computerized artefacts a pervasive part of our everyday lives, bringing more and more activities into the realm of information. This computationalization and informationalization of everyday activities increases not only our reach, efficiency and capabilities but also the amount and kinds of data gathered about us and our activities. In this thesis, I explore how information systems can be constructed so that they handle this personal data in a reasonable manner. The thesis provides two kinds of results: on the one hand, tools and methods for both the construction and the evaluation of ubiquitous and mobile systems; on the other, an evaluation of the privacy aspects of a ubiquitous social awareness system. The work emphasises real-world experiments as the most important way to study privacy. Additionally, the state of current information systems with regard to data protection is studied. The tools and methods in this thesis consist of three distinct contributions. An algorithm for locationing in cellular networks is proposed that does not require the location information to be revealed beyond the user's terminal. A prototyping platform for the creation of context-aware ubiquitous applications, called ContextPhone, is described and released as open source. Finally, a set of methodological findings on the use of smartphones in social scientific field research is reported. A central contribution of this thesis is the set of pragmatic tools that allow other researchers to carry out experiments. The evaluation of the ubiquitous social awareness application ContextContacts covers both the usage of the system in general and an analysis of its privacy implications. Based on several long-term field studies, the usage of the system is analyzed in terms of how users make inferences about others from the real-time contextual cues mediated by the system. The analysis of privacy implications draws together the social psychological theory of self-presentation and research on privacy in ubiquitous computing, deriving a set of design guidelines for such systems. The main findings from these studies can be summarized as follows. The fact that ubiquitous computing systems gather more data about users can be exploited not only to study the use of such systems in an effort to build better ones, but also to study previously unexamined phenomena, such as the dynamic change of social networks. Systems that let people create new ways of presenting themselves to others can be fun for the users, but self-presentation requires several thoughtful design decisions that allow the manipulation of the image mediated by the system. Finally, the growing amount of computational resources available to users can be used to let them work with the data themselves, rather than remain passive subjects of data gathering.
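
The abstract does not detail the locationing algorithm, but the privacy property can be illustrated: if the mapping from cell identifiers to coordinates is stored on the terminal, the position estimate never has to leave the device. The cell database and the weighting scheme below are hypothetical, not the algorithm developed in the thesis:

```python
# Sketch of terminal-side locationing: everything below runs locally,
# so no location information is revealed beyond the user's terminal.

# Locally stored map: cell ID -> (latitude, longitude); values made up.
LOCAL_CELL_DB = {
    "cell-a": (60.170, 24.940),
    "cell-b": (60.172, 24.945),
    "cell-c": (60.168, 24.950),
}

def estimate_position(observed):
    """observed: list of (cell_id, signal_strength) pairs. Returns a
    signal-strength-weighted centroid of the known cell positions."""
    lat = lon = total = 0.0
    for cell_id, strength in observed:
        if cell_id in LOCAL_CELL_DB:
            clat, clon = LOCAL_CELL_DB[cell_id]
            lat += strength * clat
            lon += strength * clon
            total += strength
    return (lat / total, lon / total) if total else None

print(estimate_position([("cell-a", 0.7), ("cell-b", 0.3)]))
```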

Relevance:

20.00%

Publisher:

Abstract:

The control of nanocrystal shape is crucial for using nanocrystals as building blocks in various applications. In this paper, we present a critical overview of the issues involved in the shape-controlled synthesis of nanostructures. In particular, we focus on the mechanisms by which anisotropic structures of high-symmetry materials (fcc crystals, for instance) can be realized. Such structures require an operative symmetry-breaking mechanism, which typically selects one facet or direction for growth over all the symmetry-equivalent crystallographic facets. We show how this selection can arise in the growth of one-dimensional structures, leading to ultrafine metal nanowires, and in the case of two-dimensional nanostructures, where layer-by-layer growth at low driving forces leads to plate-shaped structures. We illustrate morphology diagrams for predicting the formation of two-dimensional structures during wet chemical synthesis, and we show the generality of the method by extending it to predict the growth of plate-shaped inorganics produced by a precipitation reaction. Finally, we present the growth of crystals under high driving forces, which can lead to the formation of porous structures with large surface areas.

Relevance:

20.00%

Publisher:

Abstract:

A new approach to Penrose's twistor algebra is given. It is based on the use of a generalised quaternion algebra for translating statements in projective five-space into equivalent statements in twistor (conformal spinor) space. The formalism leads to SO(4,2)-covariant formulations of the Pauli-Kofink and Fierz relations among Dirac bilinears, and to generalisations of these relations.
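
For reference, these are the standard Dirac bilinears the relations constrain, together with the generic shape of a Fierz rearrangement; the paper's SO(4,2)-covariant generalisations are not reproduced here:

```latex
% Standard Dirac bilinears:
S = \bar\psi\,\psi, \quad
P = \bar\psi\,\gamma_5\,\psi, \quad
V^\mu = \bar\psi\,\gamma^\mu\,\psi, \quad
A^\mu = \bar\psi\,\gamma^\mu\gamma_5\,\psi, \quad
T^{\mu\nu} = \bar\psi\,\sigma^{\mu\nu}\,\psi .

% Generic Fierz rearrangement, exchanging spinors 2 and 4; the basis
% \{\Gamma_A\} = \{1,\ \gamma_5,\ \gamma^\mu,\ \gamma^\mu\gamma_5,\
% \sigma^{\mu\nu}\} and the numerical coefficients C_{AB} are fixed:
(\bar\psi_1 \Gamma_A \psi_2)(\bar\psi_3 \Gamma^A \psi_4)
  = \sum_B C_{AB}\,(\bar\psi_1 \Gamma_B \psi_4)(\bar\psi_3 \Gamma^B \psi_2).
```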

Relevance:

20.00%

Publisher:

Abstract:

Variety selection in perennial pasture crops involves identifying the best varieties from data collected at multiple harvest times in field trials. For accurate selection, the statistical methods used to analyse such data need to account for the spatial and temporal correlation typically present. This paper provides an approach for analysing multi-harvest data from variety selection trials in which there may be a large number of harvest times. Methods are presented for modelling the variety-by-harvest effects while accounting for the spatial and temporal correlation between observations. These methods improve model fit compared with separate analyses for each harvest, and provide insight into variety-by-harvest interactions. The approach is illustrated using two traits from a lucerne variety selection trial. The proposed method provides variety predictions that allow for the natural sources of variation and correlation in multi-harvest data.
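
A minimal sketch of the variety-by-harvest structure in Python's statsmodels, assuming hypothetical column names. The full spatial and temporal correlation models described in the paper typically require specialist mixed-model software (e.g., ASReml) and are not captured here:

```python
# Sketch: variety-by-harvest fixed effects with a random block effect.
# File name and columns are hypothetical; residual spatial/temporal
# correlation (e.g., autoregressive structures) is NOT modelled.

import pandas as pd
import statsmodels.formula.api as smf

trial = pd.read_csv("lucerne_trial.csv")
# expected columns: biomass, variety, harvest, block
# (one row per plot per harvest time)

model = smf.mixedlm(
    "biomass ~ C(variety) * C(harvest)",  # variety-by-harvest effects
    data=trial,
    groups=trial["block"],                # random intercept per block
)
result = model.fit()
print(result.summary())
```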

Relevance:

20.00%

Publisher:

Abstract:

Free and open source software development is an alternative to traditional software engineering as an approach to developing complex software systems. It is a way of developing software based on geographically distributed teams of volunteers, without an apparent central plan or traditional mechanisms of coordination. The purpose of this thesis is to summarize current knowledge about free and open source software development and to explore how further understanding of it could be gained. The results of research in the field, as well as the research methods, are introduced and discussed. The adaptation of software process metrics to the context of free and open source software development is also illustrated, and the possibility of using them as tools to validate other research is discussed.
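
As a toy example of one process metric adapted to open source projects, the sketch below counts commits per contributor from a repository's history; the repository path is hypothetical:

```python
# Commits per contributor from git history.
# `git log --pretty=format:%an` prints one author name per commit.

import subprocess
from collections import Counter

def commits_per_author(repo_path):
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--pretty=format:%an"],
        capture_output=True, text=True, check=True,
    )
    return Counter(log.stdout.splitlines())

for author, n in commits_per_author("some-project").most_common(5):
    print(f"{n:6d}  {author}")
```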

Relevance:

20.00%

Publisher:

Abstract:

While dehydration is common in older patients and is associated with poor outcomes, it has been infrequently studied in the hospital setting. Thus, the aim of this study was to identify potential barriers and enablers to the maintenance of adequate hydration in older patients in an acute hospital environment. An observational study involving patients aged 60 years and older admitted to an acute care hospital in Queensland, Australia, was undertaken. Forty-four patients were observed during mealtimes, and chart and room audits were performed to identify hydration management strategies, weight records and the presence or absence of fluid balance charts. The results revealed a number of system-related and practice-related barriers, including patients' difficulties with opening fluid containers and low levels of documentation of hydration management strategies. Addressing these issues is an important first step towards improving the management of hydration in medically ill older hospital patients.

Relevance:

20.00%

Publisher:

Abstract:

Unhealthy diets contribute at least 14% of Australia's disease burden and are driven by 'obesogenic' food environments. Compliance with dietary recommendations is particularly poor among disadvantaged populations, including low socioeconomic groups, those living in rural and remote areas, and Aboriginal and Torres Strait Islander peoples. The perception that healthy foods are expensive is a key barrier to healthy choices and a major determinant of diet-related health inequities. The available state, regional and local data (limited and non-comparable) suggest that, despite basic healthy foods not incurring GST, the cost of healthy food is higher and has risen more rapidly than that of unhealthy food over the last 15 years in Australia. However, there have been no nationally standardised tools or protocols to benchmark, compare or monitor food prices and affordability in Australia. Globally, we are leading work to develop and test approaches for assessing the price differential between healthy and less-healthy (current) diets under the food price module of the International Network for Food and Obesity/non-communicable diseases (NCDs) Research, Monitoring and Action Support (INFORMAS). This presentation describes the contextualisation of the INFORMAS approach to develop standardised Australian tools, survey protocols, and data collection and analysis systems. The 'healthy diet basket' was based on the Australian Foundation Diet [1], while the 'current diet basket', and the specific items included in each basket, were based on recent national dietary survey data [2]. Data collection methods were piloted, and the final tools and protocols were then applied to measure the price and affordability of healthy and less-healthy (current) diets for different household groups in diverse communities across the nation. We have compared results across geographical locations and population subgroups in Australia and assessed them against international INFORMAS benchmarks. The results inform the development of policy and practice, including in relation to mooted changes to the GST base, to promote nutrition and healthy weight and prevent chronic disease in Australia.
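
The affordability comparison itself is simple arithmetic: price each basket for a reference household, then express the cost as a share of household income. All figures in the sketch below are hypothetical, not INFORMAS results:

```python
# Illustrative basket arithmetic; every price and the income figure are
# made up for the example.

healthy_basket = {"vegetables": 62.0, "fruit": 45.0, "grains": 38.0,
                  "dairy": 41.0, "lean_meat": 74.0}           # $/fortnight
current_basket = {"vegetables": 31.0, "fruit": 20.0, "grains": 44.0,
                  "dairy": 39.0, "meat": 70.0,
                  "discretionary": 86.0}                      # $/fortnight
household_income = 1200.0                                     # $/fortnight

healthy_cost = sum(healthy_basket.values())
current_cost = sum(current_basket.values())
print(f"healthy diet: ${healthy_cost:.2f} "
      f"({100 * healthy_cost / household_income:.1f}% of income)")
print(f"current diet: ${current_cost:.2f} "
      f"({100 * current_cost / household_income:.1f}% of income)")
print(f"price differential: ${current_cost - healthy_cost:+.2f}")
```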

Relevance:

20.00%

Publisher:

Abstract:

PURPOSE The purpose of this study was to examine the relationship between objectively measured ambient light exposure and longitudinal changes in axial eye growth in childhood. METHODS A total of 101 children (41 myopes and 60 nonmyopes), 10 to 15 years of age, participated in this prospective longitudinal observational study. Axial eye growth was determined from measurements of ocular optical biometry collected at four study visits over an 18-month period. Each child's mean daily light exposure was derived from two periods (each 14 days long) of objective light exposure measurements from a wrist-worn light sensor. RESULTS Over the 18-month study period, a modest but statistically significant association between greater average daily light exposure and slower axial eye growth was observed (P = 0.047). Other significant predictors of axial eye growth in this population included children's refractive error group (P < 0.001), sex (P < 0.01), and age (P < 0.001). Categorized according to their objectively measured average daily light exposure and adjusting for potential confounders (age, sex, baseline axial length, parental myopia, near work, and physical activity), children experiencing low average daily light exposure (mean daily light exposure: 459 ± 117 lux; annual eye growth: 0.13 mm/y) exhibited significantly greater eye growth than children experiencing moderate (842 ± 109 lux, 0.060 mm/y) and high (1455 ± 317 lux, 0.065 mm/y) average daily light exposure levels (P = 0.01). CONCLUSIONS In this population of children, greater daily light exposure was associated with less axial eye growth over an 18-month period. These findings support the role of light exposure in the documented association between time spent outdoors and childhood myopia.

Relevance:

20.00%

Publisher:

Abstract:

PURPOSE To examine longitudinal changes in choroidal thickness and axial length in a population of children with a range of refractive errors. METHODS One hundred and one children (41 myopes and 60 nonmyopes) aged 10 to 15 years participated in this prospective, observational longitudinal study. For each child, 6-month measures of choroidal thickness (using enhanced depth imaging optical coherence tomography) and axial ocular biometry were collected four times over an 18-month period. Linear mixed models were used to examine the longitudinal changes in choroidal thickness and the relationship between changes in choroidal thickness and axial eye growth over the study period. RESULTS A significant group mean increase in subfoveal choroidal thickness was observed over 18 months (mean increase 13 ± 22 μm, P < 0.001). Myopic children exhibited significantly thinner choroids than nonmyopic children (P < 0.001), although there was no significant time-by-refractive-group interaction (P = 0.46), indicating similar changes in choroidal thickness over time in myopes and nonmyopes. However, a significant association between the change in choroidal thickness and the change in axial length over time was found (P < 0.001, β = −0.14). Children showing faster axial eye growth exhibited significantly less choroidal thickening over time than children showing slower axial eye growth. CONCLUSIONS A significant increase in choroidal thickness occurs over an 18-month period in normal 10- to 15-year-old children. Children undergoing faster axial eye growth exhibited less thickening and, in some cases, a thinning of the choroid. These findings support a potential role for the choroid in the mechanisms regulating eye growth in childhood.

Relevance:

20.00%

Publisher:

Abstract:

- Objective: To compare health service cost and length of stay between a traditional and an accelerated diagnostic approach for assessing acute coronary syndromes (ACS) among patients who presented to the emergency department (ED) of a large tertiary hospital in Australia.
- Design, setting and participants: This historically controlled study analysed data collected from two independent patient cohorts presenting to the ED with potential ACS. The first cohort of 938 patients was recruited in 2008–2010 and assessed using the traditional diagnostic approach detailed in the national guideline. The second cohort of 921 patients was recruited in 2011–2013 and assessed with the accelerated diagnostic approach named the Brisbane protocol. The Brisbane protocol applied early serial troponin testing at 0 and 2 h after presentation to the ED, compared with testing at 0 and 6 h in the traditional assessment process. The Brisbane protocol also defined a low-risk group of patients in whom no objective testing was performed. A decision tree model was used to compare the expected cost and length of stay in hospital between the two approaches, with probabilistic sensitivity analysis used to account for model uncertainty.
- Results: Compared with the traditional diagnostic approach, the Brisbane protocol was associated with a reduced expected cost of $1229 (95% CI −$1266 to $5122) and a reduced expected length of stay of 26 h (95% CI −14 to 136 h). The Brisbane protocol allowed physicians to discharge a higher proportion of low-risk and intermediate-risk patients from the ED within 4 h (72% vs 51%). Results from the sensitivity analysis suggested the Brisbane protocol had a high chance of being both cost-saving and time-saving.
- Conclusions: This study provides some evidence of cost savings from a decision to adopt the Brisbane protocol. Benefits would arise for the hospital as well as for patients and their families.
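
The decision-tree comparison rests on a simple expected-value calculation: each pathway's expected cost is the probability-weighted sum of its branch costs. A sketch with hypothetical probabilities and costs, not the study's estimates:

```python
# Expected cost of a diagnostic pathway as a probability-weighted sum.
# All probabilities and dollar costs below are invented for illustration.

def expected_cost(branches):
    """branches: list of (probability, cost) pairs; probabilities sum to 1."""
    assert abs(sum(p for p, _ in branches) - 1.0) < 1e-9
    return sum(p * cost for p, cost in branches)

# (probability, cost in $) for discharge / observation / admission
traditional = [(0.51, 800.0), (0.30, 2500.0), (0.19, 9000.0)]
brisbane    = [(0.72, 700.0), (0.12, 2500.0), (0.16, 9000.0)]

saving = expected_cost(traditional) - expected_cost(brisbane)
print(f"expected saving per patient: ${saving:.2f}")
```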