990 results for complex text layout engine
Abstract:
This paper presents the evaluation of a QA system for answering complex temporal questions. The system was implemented in a multilayered architecture in which complex temporal questions are first decomposed into simple questions, according to the temporal relations expressed in the original question. These simple questions are then processed independently by our standard Question Answering engine, and their respective answers are filtered to satisfy the temporal restrictions of each simple question. The answers to the simple decomposed questions are then combined, according to the temporal relations extracted from the original complex question, to give the final answer. This evaluation was performed as a pilot task in the Spanish QA Track of the Cross Language Evaluation Forum 2004.
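The decompose, answer, and recombine steps can be sketched as follows (a toy Python illustration; the signal words handled and all function names are assumptions, not the paper's implementation):

```python
def decompose(question):
    """Split a complex temporal question on a temporal signal word.

    Returns (simple_questions, relation). Only 'before'/'after' are
    handled in this toy sketch; a real system would parse many more
    temporal signals and nested structures.
    """
    for signal in ("before", "after"):
        if f" {signal} " in question:
            left, right = question.split(f" {signal} ", 1)
            return [left.strip(" ?"), right.strip(" ?")], signal
    return [question.strip(" ?")], None

def combine(answers, relation):
    """Filter/combine dated answers according to the temporal relation.

    `answers` is a list of (entity, year) pairs as a hypothetical QA
    engine might return them; the final answer is kept only if the
    dates satisfy the extracted relation.
    """
    if relation is None or len(answers) < 2:
        return answers[0]
    (a1, t1), (a2, t2) = answers
    ok = t1 < t2 if relation == "before" else t1 > t2
    return a1 if ok else None

subqs, rel = decompose("Who ruled Spain before the civil war started?")
print(subqs, rel)
```

The two simple questions would then each be answered independently before `combine` enforces the temporal restriction.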
Abstract:
With the development of information technology, the theory and methodology of complex networks has been introduced into language research, transforming the language system into a complex network of nodes and edges for quantitative analysis of language structure. The development of dependency grammar provides theoretical support for the construction of treebank corpora, making a statistical analysis of complex networks possible. This paper introduces the theory and methodology of complex networks and builds dependency syntactic networks based on a treebank of speeches from the EEE-4 oral test. By analysing the overall characteristics of the networks, including the number of edges, the number of nodes, the average degree, the average path length, network centrality and the degree distribution, it aims to find potential differences and similarities in the networks between various grades of speaking performance. Through clustering analysis, this research intends to demonstrate the discriminating power of the network parameters and provide a potential reference for scoring speaking performance.
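The global metrics listed above can be computed directly from an edge list. A self-contained Python sketch (pure standard library, fed a toy four-word dependency chain rather than real EEE-4 data):

```python
from collections import defaultdict, deque, Counter

def network_stats(edges):
    """Compute basic global metrics of an undirected dependency network."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    n, m = len(adj), len(edges)
    avg_degree = 2 * m / n                      # each edge touches two nodes
    degree_dist = Counter(len(nbrs) for nbrs in adj.values())

    def bfs_dists(src):
        """Shortest hop counts from src to every reachable node."""
        dist, q = {src: 0}, deque([src])
        while q:
            u = q.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    q.append(w)
        return dist

    # Average shortest-path length over all connected ordered pairs.
    total = pairs = 0
    for src in adj:
        for tgt, d in bfs_dists(src).items():
            if tgt != src:
                total += d
                pairs += 1
    return {"nodes": n, "edges": m, "avg_degree": avg_degree,
            "avg_path_length": total / pairs, "degree_dist": dict(degree_dist)}

# A tiny dependency network: words linked by head-dependent relations.
stats = network_stats([("I", "like"), ("like", "tea"), ("tea", "green")])
print(stats)
```

Centrality measures (e.g. degree or betweenness centrality) could be added on top of the same adjacency structure.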
Abstract:
With the dramatic growth of text information, there is an increasing need for powerful text mining systems that can automatically discover useful knowledge from text. Text is generally associated with many kinds of contextual information. Contexts can be explicit, such as the time and location where a blog article was written or the author(s) of a biomedical publication, or implicit, such as the positive or negative sentiment an author held when writing a product review; there may also be complex contexts, such as the social network of the authors. Many applications require analysis of topic patterns over different contexts. For instance, analysis of search logs in the context of the user can reveal how to improve the quality of a search engine by optimizing the search results for particular users; analysis of customer reviews in the context of positive and negative sentiments can help summarize public opinions about a product; and analysis of blogs or scientific publications in the context of a social network can facilitate discovery of more meaningful topical communities. Since context information significantly affects the topics and language chosen by authors, it is very important to incorporate it when analyzing and mining text data. Modeling the context in text and discovering contextual patterns of language units and topics from text, a general task which we refer to as Contextual Text Mining, has widespread applications. In this thesis, we provide a novel and systematic study of contextual text mining, a new paradigm of text mining that treats context information as a ``first-class citizen.'' We formally define the problem of contextual text mining and its basic tasks, and propose a general framework for contextual text mining based on generative modeling of text.
This conceptual framework provides general guidance on text mining problems with context information and can be instantiated into many real tasks, including the general problem of contextual topic analysis. We formally present a functional framework for contextual topic analysis, with a general contextual topic model and its various versions, which can effectively solve text mining problems in many real-world applications. We further introduce general components of contextual topic analysis: adding priors to contextual topic models to incorporate prior knowledge, regularizing contextual topic models with the dependency structure of context, and postprocessing contextual patterns to extract refined patterns. These refinements of the general contextual topic model naturally lead to a variety of probabilistic models which incorporate different types of context and various assumptions and constraints. These special versions of the contextual topic model have proved effective in a variety of real applications involving topics and explicit, implicit, and complex contexts. We then introduce a postprocessing procedure for contextual patterns that generates meaningful labels for multinomial context models. This method provides a general way to interpret text mining results for real users. By applying contextual text mining in the ``context'' of other text information management tasks, including ad hoc text retrieval and web search, we further demonstrate the effectiveness of contextual text mining techniques quantitatively on large-scale datasets. The framework of contextual text mining not only unifies many explorations of text analysis with context information, but also opens up many new possibilities for future research directions in text mining.
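As a much-simplified illustration of context-conditioned analysis (plain counting, not the generative contextual topic models the thesis actually proposes), one can estimate per-context word distributions like so:

```python
from collections import Counter, defaultdict

def contextual_word_dists(docs):
    """Estimate p(word | context) by simple counting.

    `docs` is a list of (context, text) pairs. This is only a crude
    stand-in for a contextual topic model, but it shows the core idea:
    the same corpus yields different language models when sliced by
    context (here, by year).
    """
    counts = defaultdict(Counter)
    for context, text in docs:
        counts[context].update(text.lower().split())
    dists = {}
    for context, c in counts.items():
        total = sum(c.values())
        dists[context] = {w: n / total for w, n in c.items()}
    return dists

docs = [("2004", "search engine ranking"),
        ("2004", "engine indexing"),
        ("2010", "social network mining")]
dists = contextual_word_dists(docs)
print(dists["2004"]["engine"])
```

A real contextual topic model would additionally share latent topics across contexts and smooth these estimates with priors, as the thesis describes.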
Abstract:
In this article, we take a close look at the literacy demands of one task from the ‘Marvellous Micro-organisms Stage 3 Life and Living’ Primary Connections unit (Australian Academy of Science, 2005). One lesson from the unit, ‘Exploring Bread’ (pp 4-8), asks students to ‘use bread labels to locate ingredient information and synthesise understanding of bread ingredients’. We draw upon a framework offered by the New London Group (2000), that of linguistic, visual and spatial design, to consider in more detail three bread wrappers and, from there, the complex literacies that students need to interrelate to undertake the required task. Our findings are that although bread wrappers are an example of an everyday science text, their linguistic, visual and spatial designs and their interrelationship are not trivial. We conclude by reinforcing the need for teachers of science to also consider how the complex design elements of everyday science texts and their interrelated literacies are made visible through instructional practice.
Abstract:
The research presented in this thesis addresses inherent problems in signature-based intrusion detection systems (IDSs) operating in heterogeneous environments. It proposes a solution to the difficulties associated with multi-step attack scenario specification and detection in such environments. The research has focused on two distinct problems: the representation of events derived from heterogeneous sources, and multi-step attack specification and detection. The first part of the research investigates the application of an event abstraction model to event logs collected from a heterogeneous environment. The event abstraction model comprises a hierarchy of events derived from different log sources such as system audit data, application logs, captured network traffic, and intrusion detection system alerts. Unlike existing event abstraction models, where low-level information may be discarded during the abstraction process, the event abstraction model presented in this work preserves all low-level information while also providing high-level information in the form of abstract events. The model was designed independently of any particular IDS and thus may be used by any IDS, intrusion forensic tool, or monitoring tool. The second part of the research investigates the use of unification for multi-step attack scenario specification and detection. Multi-step attack scenarios are hard to specify and detect, as they often involve the correlation of events from multiple sources which may be affected by time uncertainty. The unification algorithm provides a simple and straightforward scenario matching mechanism using variable instantiation, where variables represent events as defined in the event abstraction model. The third part of the research addresses time uncertainty. Clock synchronisation is crucial for detecting multi-step attack scenarios which involve logs from multiple hosts.
Issues involving time uncertainty have been largely neglected by intrusion detection research. The system presented in this research introduces two techniques for addressing time uncertainty: clock skew compensation and clock drift modelling using linear regression. An off-line IDS prototype for detecting multi-step attacks has been implemented. The prototype comprises two modules: an implementation of the abstract event system architecture (AESA) and the scenario detection module. The scenario detection module implements our signature language, developed based on the Python programming language syntax, and the unification-based scenario detection engine. The prototype has been evaluated using a publicly available dataset of real attack traffic and event logs, and a synthetic dataset. The distinctive feature of the public dataset is that it contains multi-step attacks involving multiple hosts with clock skew and clock drift. These features allow us to demonstrate the application and advantages of the contributions of this research. All instances of multi-step attacks in the dataset were correctly identified despite significant clock skew and drift in the dataset. Future work identified by this research is to develop a refined unification algorithm suitable for processing streams of events, to enable on-line detection. In terms of time uncertainty, identified future work is to develop mechanisms which allow automatic clock skew and clock drift identification and correction. The immediate application of the research presented in this thesis is the framework of an off-line IDS which processes events from heterogeneous sources using abstraction and which can detect multi-step attack scenarios that may involve time uncertainty.
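The clock drift modelling step can be illustrated with ordinary least squares. This Python sketch (function names and the offset model are illustrative assumptions, not the thesis's code) fits a skew-plus-drift line to observed clock offsets and uses it to correct timestamps:

```python
def fit_clock_drift(times, offsets):
    """Least-squares fit of the line offset = skew + drift * t.

    `times` are reference timestamps; `offsets` are the observed
    differences between a host clock and the reference at those times.
    The intercept estimates the constant clock skew and the slope
    estimates the clock drift rate.
    """
    n = len(times)
    mt = sum(times) / n
    mo = sum(offsets) / n
    drift = (sum((t - mt) * (o - mo) for t, o in zip(times, offsets))
             / sum((t - mt) ** 2 for t in times))
    skew = mo - drift * mt
    return skew, drift

def correct_timestamp(t, skew, drift):
    """Map a host timestamp back onto the reference timeline
    (first-order correction, adequate for small drift rates)."""
    return t - (skew + drift * t)

# Host clock starts 2 s ahead and drifts 1 ms per second.
skew, drift = fit_clock_drift([0, 10, 20, 30], [2.0, 2.01, 2.02, 2.03])
print(skew, drift)
```

Correlating log events across hosts after this correction is what makes the unification-based matching robust to clock skew and drift.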
Abstract:
The recent focus on literacy in Social Studies has been on linguistic design, particularly that related to the grammar of written and spoken text. When students are expected to produce complex hybridized genres such as timelines, a focus on the teaching and learning of linguistic design is necessary but not sufficient to complete the task. Theorizations of new literacies identify five interrelated meaning-making designs for text deconstruction and reproduction: linguistic, spatial, visual, gestural, and audio design. Homing in on the complexity of timelines, this paper casts a lens on the linguistic, visual, spatial, and gestural designs of three pairs of primary-school-aged Social Studies learners. Drawing on a functional metalanguage, we analyze the linguistic, visual, spatial, and gestural designs of their work. We also offer suggestions of their effect, and from there consider the importance of explicit instruction in text design choices for this Social Studies task. We conclude the analysis by suggesting the foci of explicit instruction for future lessons.
Abstract:
In computational linguistics, information retrieval and applied cognition, words and concepts are often represented as vectors in high-dimensional spaces computed from a corpus of text. These high-dimensional spaces are often referred to as Semantic Spaces. We describe a novel and efficient approach to computing these semantic spaces via the use of complex-valued vector representations. We report on the practical implementation of the proposed method and some associated experiments. We also briefly discuss how the proposed system relates to previous theoretical work in Information Retrieval and Quantum Mechanics, and how the notions of probability, logic and geometry are integrated within a single Hilbert space representation. In this sense the proposed system has more general application and gives rise to a variety of opportunities for future research.
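A minimal sketch of a complex-valued representation in this spirit (unit-phasor vectors with componentwise binding, an assumption borrowed from circular/holographic representations rather than the paper's exact construction):

```python
import math, random

def random_phasor(dim, rng):
    """A 'circular' vector: each component is a unit complex phasor."""
    return [complex(math.cos(a), math.sin(a))
            for a in (rng.uniform(0, 2 * math.pi) for _ in range(dim))]

def bind(u, v):
    """Associate two vectors by componentwise complex multiplication."""
    return [a * b for a, b in zip(u, v)]

def similarity(u, v):
    """Normalised magnitude of the Hermitian inner product <u, v>."""
    return abs(sum(a.conjugate() * b for a, b in zip(u, v))) / len(u)

rng = random.Random(0)
word, ctx = random_phasor(512, rng), random_phasor(512, rng)
pair = bind(word, ctx)
# Unbinding with the conjugate of ctx recovers the word exactly here,
# because every component has unit magnitude.
recovered = bind(pair, [c.conjugate() for c in ctx])
print(similarity(recovered, word), similarity(word, ctx))
```

Random phasor vectors are nearly orthogonal in high dimensions, which is what lets many word/context associations be superposed in one vector.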
Abstract:
This paper presents an experiment designed to investigate whether redundancy in an interface has any impact on the use of complex interfaces by older people and people with low prior experience of technology. The important findings of this study were that older people (65+ years) completed tasks on the words-only interface faster than on the redundant (text and symbols) interface, while the rest of the participants completed tasks significantly faster on the redundant interface. From a cognitive processing perspective, sustained attention (one of the functions of the central executive) emerged as an important factor in completing tasks on complex interfaces faster and with fewer errors.
Abstract:
Flexible design is a relatively new trend in airport terminal design which is believed to accommodate the ever-changing needs of a terminal. Architectural design processes become more complex every day with the introduction of new building technologies, and the concept of a flexible airport terminal makes the design process more complex still. Previous studies have demonstrated that the ever-growing aviation industry requires airport terminals to be planned, designed and constructed in ways that allow flexibility in the design process. In order to adopt the philosophy of ‘design for flexibility’, architects need to address a wide range of differing needs. An appropriate integration of process models, prior to the airport terminal design process, is expected to uncover the relationships that exist between spatial layouts and their corresponding functions. The current paper seeks to develop a way of sharing space-adjacency information obtained from Business Process Models (BPM) to assist in defining flexible airport terminal layouts. Critical design parameters are briefly investigated at this stage of the research, the available design alternatives are reviewed, and an evaluation framework is proposed. Information obtained from various design layouts should assist in identifying and defining flexible design matrices, allowing architects to interpret and apply them throughout the lifecycle of the terminal building.
Abstract:
Many older people have difficulties using modern consumer products due to increased product complexity, both in functionality and in interface design. Previous research has shown that older people have more difficulty than younger people in using complex devices intuitively. Furthermore, increased life expectancy and a falling birth rate have been catalysts for changes in world demographics over the past two decades. This trend also suggests a proportional increase of older people in the workforce. This realisation has led to research on the effective use of technology by older populations, in an effort to engage them more productively and to assist them in leading independent lives. Ironically, not enough attention has been paid to the development of interaction design strategies that would actually enable older users to better exploit new technologies. Previous research suggests that if products are designed to reflect people's prior knowledge, they will appear intuitive to use. Since intuitive interfaces utilise the domain-specific prior knowledge of users, they require minimal learning for effective interaction. However, older people are very diverse in their capabilities and domain-specific prior knowledge. In addition, ageing slows down the process of acquiring new knowledge. Keeping these suggestions and limitations in view, the aim of this study was to investigate possible approaches to developing interfaces that facilitate intuitive use by older people. In this quest to develop intuitive interfaces for older people, two experiments were conducted that systematically investigated redundancy (the use of both text and icons) in interface design, complexity of interface structure (nested versus flat), and personal user factors such as cognitive abilities, perceived self-efficacy and technology anxiety. All of these factors could interfere with intuitive use.
The results from the first experiment suggest that, contrary to what was hypothesised, older people (65+ years) completed the tasks on the text-only interface design faster than on the redundant interface design. The outcome of the second experiment showed that, as expected, older people took more time on a nested interface; however, they did not make significantly more errors compared with younger age groups. Contrary to what was expected, older age groups also did better under anxious conditions. The findings of this study also suggest that older age groups are more heterogeneous in their capabilities, and that their intuitive use of contemporary technological devices is mediated more by domain-specific technology prior knowledge and by their cognitive abilities than by chronological age. This makes it extremely difficult to develop product interfaces that are entirely intuitive to use. However, keeping the cognitive limitations of older people in view when interfaces are developed, and using simple text-based interfaces with a flat interface structure, would help them intuitively learn and use complex technological products successfully during early encounters with a product. These findings indicate that it might be more pragmatic to design interfaces for intuitive learning rather than for intuitive use. Based on this research and the existing literature, a model for adaptable interface design is proposed as a strategy for developing intuitively learnable product interfaces. An adaptable interface can initially use a simple text-only interface to help older users learn and successfully use the new system. Over time, this can be progressively changed to a symbols-based nested interface for more efficient and intuitive use.
Abstract:
Text categorisation is challenging due to the complex structure of documents, with heterogeneous, changing topics. The performance of text categorisation relies on the quality of samples, the effectiveness of document features, and the topic coverage of categories, depending on the strategies employed: supervised or unsupervised, single-labelled or multi-labelled. To deal with these reliability issues in text categorisation, we propose an unsupervised multi-labelled text categorisation approach that maps the local knowledge in documents to global knowledge in a world ontology to optimise the categorisation result. The conceptual framework of the approach consists of three modules: pattern mining for feature extraction, feature-subject mapping for categorisation, and concept generalisation for optimised categorisation. The approach has been evaluated promisingly against typical text categorisation methods, based on ground truth encoded by human experts.
Abstract:
Increases in the functionality, power and intelligence of modern engineered systems have led to complex systems with a large number of interconnected dynamic subsystems. In such machines, faults in one subsystem can cascade and affect the behavior of numerous other subsystems. This complicates traditional fault monitoring procedures because of the need to train models of the faults that the monitoring system needs to detect and recognize. Unavoidable design defects, quality variations and different usage patterns make it infeasible to foresee all possible faults, resulting in limited diagnostic coverage that can only deal with previously anticipated and modeled failures. This leads to missed detections and costly blind swapping of acceptable components because of one’s inability to accurately isolate the source of previously unseen anomalies. To circumvent these difficulties, a new paradigm for diagnostic systems is proposed and discussed in this paper. Its feasibility is demonstrated through application examples in automotive engine diagnostics.
Abstract:
Managing large cohorts of undergraduate student nurses during off-campus clinical placement is complex and challenging. Clinical facilitators are required to support and assess nursing students during clinical placement. Clear communication between university academic coordinators and clinical facilitators is therefore essential for consistency and prompt management of emerging issues. Increasing work demands require both coordinators and facilitators to have an efficient and effective mode of communication. The aim of this study was to explore the use of Short Message Service (SMS) texts, sent between mobile phones, for communication between university Unit Coordinators and off-campus Clinical Facilitators. This study used an after-only design. During a two-week clinical placement, 46 clinical facilitators working with first- and second-year Bachelor of Nursing students from a large metropolitan Australian university were regularly sent SMS texts with relevant updates and reminders from the university coordinator. A 15-item questionnaire comprising x 5-point Likert-scale items and 3 open-ended questions was then used to survey the clinical facilitators. The response rate was 47.8% (n=22). Correlations were found between the approachability of the coordinator and facilitators' perceptions a) that the coordinator understood issues on clinical placement (r=0.785, p<0.001) and b) of being part of the teaching team (r=0.768, p<0.001). Analysis of responses to the qualitative questions revealed three themes: connection, approachability and collaboration. Results indicate that SMS communication is convenient and appropriate in this setting. This after-only study found that regular SMS communication improves a sense of connection, approachability and collaboration.
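For reference, the reported r values are Pearson product-moment correlations, which can be computed as below (the 5-point Likert responses shown are invented for illustration, not the study's data):

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical Likert ratings: approachability vs. sense of connection.
print(round(pearson_r([5, 4, 4, 3, 2, 5], [5, 5, 4, 3, 2, 4]), 3))
```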
Abstract:
Background: The requirement for dual screening of titles and abstracts to select papers to examine in full text can create a huge workload, not least when the topic is complex and a broad search strategy is required, resulting in a large number of results. An automated system to reduce this burden, while still assuring high accuracy, has the potential to provide huge efficiency savings within the review process. Objectives: To undertake a direct comparison of manual screening with a semi‐automated process (priority screening) using a machine classifier. The research is being carried out as part of the current update of a population‐level public health review. Methods: Authors have hand-selected studies for the review update, in duplicate, using the standard Cochrane Handbook methodology. A retrospective analysis, simulating a quasi‐‘active learning’ process (whereby a classifier is repeatedly trained based on ‘manually’ labelled data) will be completed, using different starting parameters. Tests will be carried out to see how far different training sets, and the size of the training set, affect the classification performance; i.e. what percentage of papers would need to be manually screened to locate 100% of those papers included as a result of the traditional manual method. Results: From a search retrieval set of 9555 papers, authors excluded 9494 papers at title/abstract and 52 at full text, leaving 9 papers for inclusion in the review update. The ability of the machine classifier to reduce the percentage of papers that need to be manually screened to identify all the included studies, under different training conditions, will be reported. Conclusions: The findings of this study will be presented along with an estimate of any efficiency gains for the author team if the screening process can be semi‐automated using text mining methodology, together with a discussion of the implications for text mining in screening papers within complex health reviews.
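The retrospective simulation can be sketched as a simple loop: screen papers one at a time, retrain a ranker on the includes found so far, and count how many papers must be screened before all includes are located. A toy Python version (word-overlap ranking stands in for the study's actual machine classifier):

```python
def priority_screen(papers, included_ids):
    """Simulate priority screening with a toy ranker.

    `papers` maps paper id -> set of title/abstract words; `included_ids`
    is the ground-truth set of includes (the 'manual' labels). Papers
    are screened one at a time; once at least one include is known, the
    remaining papers are ranked by word overlap with the includes found
    so far. Returns how many papers had to be screened to locate every
    include.
    """
    remaining = dict(papers)
    screened = 0
    found = set()
    while found != set(included_ids):
        if found:
            vocab = set().union(*(papers[i] for i in found))
            pick = max(remaining, key=lambda i: len(remaining[i] & vocab))
        else:
            pick = min(remaining)  # no training data yet: arbitrary order
        del remaining[pick]
        screened += 1
        if pick in included_ids:
            found.add(pick)
    return screened

papers = {0: {"malaria", "trial"}, 1: {"malaria", "rct"},
          2: {"cooking"}, 3: {"gardening"}, 4: {"malaria", "net"}}
print(priority_screen(papers, {0, 1, 4}))
```

In the actual study the ranker is a trained classifier and the metric of interest is the percentage of the 9555 retrieved papers that must be screened to recover all 9 includes.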
Abstract:
For wind farm optimizations with land belonging to different owners, the traditional penalty method is highly dependent on the type of wind farm land division, and its application can be cumbersome if the divisions are complex. To overcome this disadvantage, a new method is proposed in this paper for the first time. Unlike the penalty method, which requires adding a penalizing term when evaluating the fitness function, the new method repairs infeasible solutions before fitness evaluation. To assess the effectiveness of the proposed method on wind farm optimization, the optimization results of the different methods are compared for three different types of wind farm division. Different wind scenarios are also incorporated during optimization: (i) constant wind speed and wind direction; (ii) varying wind speed and wind direction; and (iii) the more realistic Weibull distribution. Results show that the performance of the new method varies for different land plots in the tested cases. Nevertheless, optimum or at least close-to-optimum results can be obtained with a sequential land-plot study using the new method in all cases. It is concluded that satisfactory results can be achieved using the proposed method. In addition, it has the advantage of flexibility in managing the wind farm design: it not only frees users from defining the penalty parameter but also imposes no limitations on the wind farm division.
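The repair idea can be sketched in a one-dimensional toy setting (the plot intervals and nearest-boundary snapping rule below are illustrative assumptions, not the paper's actual repair operator):

```python
def repair(positions, plots):
    """Move each turbine to the nearest point inside an allowed plot.

    `plots` is a list of (start, end) intervals of land whose owners
    permit turbines; any position outside all intervals is snapped to
    the closest plot boundary. Repairing before fitness evaluation
    replaces the penalty term of the traditional method, so every
    evaluated layout is feasible by construction.
    """
    repaired = []
    for x in positions:
        if any(a <= x <= b for a, b in plots):
            repaired.append(x)
            continue
        # Snap to the closest plot boundary.
        candidates = [p for a, b in plots for p in (a, b)]
        repaired.append(min(candidates, key=lambda p: abs(p - x)))
    return repaired

print(repair([50, 120, 310], [(0, 100), (200, 300)]))
```

In an evolutionary optimizer this operator would be applied to every candidate layout just before the wake-model fitness function is evaluated.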