116 resultados para Query Complexity
Resumo:
While Business Process Management (BPM) is an established discipline, the increased adoption of BPM technology in recent years has introduced new challenges. One challenge concerns dealing with the ever-growing complexity of business process models. Mechanisms for dealing with this complexity can be classified into two categories: 1) those that are solely concerned with the visual representation of the model and 2) those that change its inner structure. While significant attention is paid to the latter category in the BPM literature, this paper focuses on the former category. It presents a collection of patterns that generalize and conceptualize various existing mechanisms to change the visual representation of a process model. Next, it provides a detailed analysis of the degree of support for these patterns in a number of state-of-the-art languages and tools. This paper concludes with the results of a usability evaluation of the patterns conducted with BPM practitioners.
Resumo:
The world we live in is well labeled for the benefit of humans but to date robots have made little use of this resource. In this paper we describe a system that allows robots to read and interpret visible text and use it to understand the content of the scene. We use a generative probabilistic model that explains spotted text in terms of arbitrary search terms. This allows the robot to understand the underlying function of the scene it is looking at, such as whether it is a bank or a restaurant. We describe the text spotting engine at the heart of our system that is able to detect and parse wild text in images, and the generative model, and present results from images obtained with a robot in a busy city setting.
Resumo:
The literature abounds with descriptions of failures in high-profile projects and a range of initiatives has been generated to enhance project management practice (e.g., Morris, 2006). Estimating from our own research, there are scores of other project failures that are unrecorded. Many of these failures can be explained using existing project management theory; poor risk management, inaccurate estimating, cultures of optimism dominating decision making, stakeholder mismanagement, inadequate timeframes, and so on. Nevertheless, in spite of extensive discussion and analysis of failures and attention to the presumed causes of failure, projects continue to fail in unexpected ways. In the 1990s, three U.S. state departments of motor vehicles (DMV) cancelled major projects due to time and cost overruns and inability to meet project goals (IT-Cortex, 2010). The California DMV failed to revitalize their drivers’ license and registration application process after spending $45 million. The Oregon DMV cancelled their five year, $50 million project to automate their manual, paper-based operation after three years when the estimates grew to $123 million; its duration stretched to eight years or more and the prototype was a complete failure. In 1997, the Washington state DMV cancelled their license application mitigation project because it would have been too big and obsolete by the time it was estimated to be finished. There are countless similar examples of projects that have been abandoned or that have not delivered the requirements.
Resumo:
Sample complexity results from computational learning theory, when applied to neural network learning for pattern classification problems, suggest that for good generalization performance the number of training examples should grow at least linearly with the number of adjustable parameters in the network. Results in this paper show that if a large neural network is used for a pattern classification problem and the learning algorithm finds a network with small weights that has small squared error on the training patterns, then the generalization performance depends on the size of the weights rather than the number of weights. For example, consider a two-layer feedforward network of sigmoid units, in which the sum of the magnitudes of the weights associated with each unit is bounded by A and the input dimension is n. We show that the misclassification probability is no more than a certain error estimate (that is related to squared error on the training set) plus A3 √((log n)/m) (ignoring log A and log m factors), where m is the number of training patterns. This may explain the generalization performance of neural networks, particularly when the number of training examples is considerably smaller than the number of weights. It also supports heuristics (such as weight decay and early stopping) that attempt to keep the weights small during training. The proof techniques appear to be useful for the analysis of other pattern classifiers: when the input domain is a totally bounded metric space, we use the same approach to give upper bounds on misclassification probability for classifiers with decision boundaries that are far from the training examples.
Resumo:
In fault detection and diagnostics, limitations coming from the sensor network architecture are one of the main challenges in evaluating a system’s health status. Usually the design of the sensor network architecture is not solely based on diagnostic purposes, other factors like controls, financial constraints, and practical limitations are also involved. As a result, it quite common to have one sensor (or one set of sensors) monitoring the behaviour of two or more components. This can significantly extend the complexity of diagnostic problems. In this paper a systematic approach is presented to deal with such complexities. It is shown how the problem can be formulated as a Bayesian network based diagnostic mechanism with latent variables. The developed approach is also applied to the problem of fault diagnosis in HVAC systems, an application area with considerable modeling and measurement constraints.
Resumo:
In the multi-view approach to semisupervised learning, we choose one predictor from each of multiple hypothesis classes, and we co-regularize our choices by penalizing disagreement among the predictors on the unlabeled data. We examine the co-regularization method used in the co-regularized least squares (CoRLS) algorithm, in which the views are reproducing kernel Hilbert spaces (RKHS's), and the disagreement penalty is the average squared difference in predictions. The final predictor is the pointwise average of the predictors from each view. We call the set of predictors that can result from this procedure the co-regularized hypothesis class. Our main result is a tight bound on the Rademacher complexity of the co-regularized hypothesis class in terms of the kernel matrices of each RKHS. We find that the co-regularization reduces the Rademacher complexity by an amount that depends on the distance between the two views, as measured by a data dependent metric. We then use standard techniques to bound the gap between training error and test error for the CoRLS algorithm. Experimentally, we find that the amount of reduction in complexity introduced by co regularization correlates with the amount of improvement that co-regularization gives in the CoRLS algorithm.
Resumo:
This study aimed to explore resilience and wellbeing among a group of eight refugee women originating from several countries (mainly African) and living in Brisbane, most of whom were single mothers. To challenge mostly quantitative and gender-blind explorations of mental health concepts among refugee groups, the project sought an emic and contextual understanding of resilience and wellbeing. Established perspectives, while useful, tend to overlook the complexities of refugee mental health experiences and can neglect the dense nature of individual stories. The purpose of my study was to contest relatively simplistic narratives of mental health constructs that tend to dominate migrant and refugee studies and influence practice paradigms in the human services field. In this ethnographic exploration of mental health constructs conducted in 2008 and 2009, the use of in-depth interviews, participant observations, and visual ethnographic elements provided an opportunity for refugee women to tell their own stories. The participants’ unique narratives of pre- and post-migration experiences, shaped by specific gender, age, social, cultural and political aspects prevailing in their lives, yielded ‘thick’ ethnographic description (Geertz, 1973) of their social worlds. The findings explored in this study, namely language issues, the impact of community dynamics, and the single status of refugee women, clearly demonstrate that mental health constructs are fluid, multifaceted and complex in reality. In fact, language, community dynamics, and being a single mother, represented both opportunities and barriers in the lives of participants. In some contexts, these factors were conducive to resilience and wellbeing, while in other circumstances, these three elements acted as a hindrance to positive mental health outcomes. There are multiple dimensions to the findings, signifying that the social worlds of refugee women cannot be simplified using set definitions and neat notions of resilience and wellbeing. Instead, the intricacies and complexities embedded in the mundane of the everyday highlight novel conceptualisations of resilience and wellbeing. Based on the particular circumstances of single refugee mothers, whose experiences differ from that of married women, this thesis presents novel articulations of mental health constructs, as an alternative view to existing trends in the literature on refugee issues. Rich and multi-dimensional meanings associated with the socio-cultural determinants of mental health emerged in the process. This thesis’ findings highlight a significant gap in diasporic studies as well as simplistic assumptions about refugee women’s resettlement experiences. Single refugee women’s distinct issues are so complex and dense, that a contextual approach is critical to yield accurate depictions of their circumstances. It is therefore essential to understand refugee lived experiences within broader socio-political contexts to truly appreciate the depth of these narratives. In this manner, critical aspects salient to refugee journeys can inform different understandings of resilience, wellbeing and mental health, and shape contemporary policy and human service practice paradigms.
Resumo:
In many applications, e.g., bioinformatics, web access traces, system utilisation logs, etc., the data is naturally in the form of sequences. People have taken great interest in analysing the sequential data and finding the inherent characteristics or relationships within the data. Sequential association rule mining is one of the possible methods used to analyse this data. As conventional sequential association rule mining very often generates a huge number of association rules, of which many are redundant, it is desirable to find a solution to get rid of those unnecessary association rules. Because of the complexity and temporal ordered characteristics of sequential data, current research on sequential association rule mining is limited. Although several sequential association rule prediction models using either sequence constraints or temporal constraints have been proposed, none of them considered the redundancy problem in rule mining. The main contribution of this research is to propose a non-redundant association rule mining method based on closed frequent sequences and minimal sequential generators. We also give a definition for the non-redundant sequential rules, which are sequential rules with minimal antecedents but maximal consequents. A new algorithm called CSGM (closed sequential and generator mining) for generating closed sequences and minimal sequential generators is also introduced. A further experiment has been done to compare the performance of generating non-redundant sequential rules and full sequential rules, meanwhile, performance evaluation of our CSGM and other closed sequential pattern mining or generator mining algorithms has also been conducted. We also use generated non-redundant sequential rules for query expansion in order to improve recommendations for infrequently purchased products.
Resumo:
Language Modeling (LM) has been successfully applied to Information Retrieval (IR). However, most of the existing LM approaches only rely on term occurrences in documents, queries and document collections. In traditional unigram based models, terms (or words) are usually considered to be independent. In some recent studies, dependence models have been proposed to incorporate term relationships into LM, so that links can be created between words in the same sentence, and term relationships (e.g. synonymy) can be used to expand the document model. In this study, we further extend this family of dependence models in the following two ways: (1) Term relationships are used to expand query model instead of document model, so that query expansion process can be naturally implemented; (2) We exploit more sophisticated inferential relationships extracted with Information Flow (IF). Information flow relationships are not simply pairwise term relationships as those used in previous studies, but are between a set of terms and another term. They allow for context-dependent query expansion. Our experiments conducted on TREC collections show that we can obtain large and significant improvements with our approach. This study shows that LM is an appropriate framework to implement effective query expansion.
Resumo:
In information retrieval, a user's query is often not a complete representation of their real information need. The user's information need is a cognitive construction, however the use of cognitive models to perform query expansion have had little study. In this paper, we present a cognitively motivated query expansion technique that uses semantic features for use in ad hoc retrieval. This model is evaluated against a state-of-the-art query expansion technique. The results show our approach provides significant improvements in retrieval effectiveness for the TREC data sets tested.
Resumo:
The growing importance and need of data processing for information extraction is vital for Web databases. Due to the sheer size and volume of databases, retrieval of relevant information as needed by users has become a cumbersome process. Information seekers are faced by information overloading - too many result sets are returned for their queries. Moreover, too few or no results are returned if a specific query is asked. This paper proposes a ranking algorithm that gives higher preference to a user’s current search and also utilizes profile information in order to obtain the relevant results for a user’s query.