403 resultados para random search algorithms
em Queensland University of Technology - ePrints Archive
Resumo:
Most web service discovery systems use keyword-based search algorithms and, although partially successful, sometimes fail to satisfy some users information needs. This has given rise to several semantics-based approaches that look to go beyond simple attribute matching and try to capture the semantics of services. However, the results reported in the literature vary and in many cases are worse than the results obtained by keyword-based systems. We believe the accuracy of the mechanisms used to extract tokens from the non-natural language sections of WSDL files directly affects the performance of these techniques, because some of them can be more sensitive to noise. In this paper three existing tokenization algorithms are evaluated and a new algorithm that outperforms all the algorithms found in the literature is introduced.
Resumo:
For people with cognitive disabilities, technology is more often thought of as a support mechanism, rather than a source of division that may require intervention to equalize access across the cognitive spectrum. This paper presents a first attempt at formalizing the digital gap created by the generalization of search engines. This was achieved through the development of a mapping of cognitive abilities required by users to execute low- level tasks during a standard Web search task. The mapping demonstrates how critical these abilities are to successfully use search engines with an adequate level of independence. It will lead to a set of design guidelines for search engine interfaces that will allow for the engagement of users of all abilities, and also, more importantly, in search algorithms such as query suggestion and measure of relevance (i.e. ranking).
Resumo:
In providing simultaneous information on expression profiles for thousands of genes, microarray technologies have, in recent years, been largely used to investigate mechanisms of gene expression. Clustering and classification of such data can, indeed, highlight patterns and provide insight on biological processes. A common approach is to consider genes and samples of microarray datasets as nodes in a bipartite graphs, where edges are weighted e.g. based on the expression levels. In this paper, using a previously-evaluated weighting scheme, we focus on search algorithms and evaluate, in the context of biclustering, several variations of Genetic Algorithms. We also introduce a new heuristic “Propagate”, which consists in recursively evaluating neighbour solutions with one more or one less active conditions. The results obtained on three well-known datasets show that, for a given weighting scheme,optimal or near-optimal solutions can be identified.
Resumo:
Design as seen from the designer's perspective is a series of amazing imaginative jumps or creative leaps. But design as seen by the design historian is a smooth progression or evolution of ideas that they seem self-evident and inevitable after the event. But the next step is anything but obvious for the artist/creator/inventor/designer stuck at that point just before the creative leap. They know where they have come from and have a general sense of where they are going, but often do not have a precise target or goal. This is why it is misleading to talk of design as a problem-solving activity - it is better defined as a problem-finding activity. This has been very frustrating for those trying to assist the design process with computer-based, problem-solving techniques. By the time the problem has been defined, it has been solved. Indeed the solution is often the very definition of the problem. Design must be creative-or it is mere imitation. But since this crucial creative leap seem inevitable after the event, the question must arise, can we find some way of searching the space ahead? Of course there are serious problems of knowing what we are looking for and the vastness of the search space. It may be better to discard altogether the term "searching" in the context of the design process: Conceptual analogies such as search, search spaces and fitness landscapes aim to elucidate the design process. However, the vastness of the multidimensional spaces involved make these analogies misguided and they thereby actually result in further confounding the issue. The term search becomes a misnomer since it has connotations that imply that it is possible to find what you are looking for. In such vast spaces the term search must be discarded. Thus, any attempt at searching for the highest peak in the fitness landscape as an optimal solution is also meaningless. Futhermore, even the very existence of a fitness landscape is fallacious. Although alternatives in the same region of the vast space can be compared to one another, distant alternatives will stem from radically different roots and will therefore not be comparable in any straightforward manner (Janssen 2000). Nevertheless we still have this tantalizing possibility that if a creative idea seems inevitable after the event, then somehow might the process be rserved? This may be as improbable as attempting to reverse time. A more helpful analogy is from nature, where it is generally assumed that the process of evolution is not long-term goal directed or teleological. Dennett points out a common minsunderstanding of Darwinism: the idea that evolution by natural selection is a procedure for producing human beings. Evolution can have produced humankind by an algorithmic process, without its being true that evolution is an algorithm for producing us. If we were to wind the tape of life back and run this algorithm again, the likelihood of "us" being created again is infinitesimally small (Gould 1989; Dennett 1995). But nevertheless Mother Nature has proved a remarkably successful, resourceful, and imaginative inventor generating a constant flow of incredible new design ideas to fire our imagination. Hence the current interest in the potential of the evolutionary paradigm in design. These evolutionary methods are frequently based on techniques such as the application of evolutionary algorithms that are usually thought of as search algorithms. It is necessary to abandon such connections with searching and see the evolutionary algorithm as a direct analogy with the evolutionary processes of nature. The process of natural selection can generate a wealth of alternative experiements, and the better ones survive. There is no one solution, there is no optimal solution, but there is continuous experiment. Nature is profligate with her prototyping and ruthless in her elimination of less successful experiments. Most importantly, nature has all the time in the world. As designers we cannot afford prototyping and ruthless experiment, nor can we operate on the time scale of the natural design process. Instead we can use the computer to compress space and time and to perform virtual prototyping and evaluation before committing ourselves to actual prototypes. This is the hypothesis underlying the evolutionary paradigm in design (1992, 1995).
Resumo:
This thesis investigates aspects of encoding the speech spectrum at low bit rates, with extensions to the effect of such coding on automatic speaker identification. Vector quantization (VQ) is a technique for jointly quantizing a block of samples at once, in order to reduce the bit rate of a coding system. The major drawback in using VQ is the complexity of the encoder. Recent research has indicated the potential applicability of the VQ method to speech when product code vector quantization (PCVQ) techniques are utilized. The focus of this research is the efficient representation, calculation and utilization of the speech model as stored in the PCVQ codebook. In this thesis, several VQ approaches are evaluated, and the efficacy of two training algorithms is compared experimentally. It is then shown that these productcode vector quantization algorithms may be augmented with lossless compression algorithms, thus yielding an improved overall compression rate. An approach using a statistical model for the vector codebook indices for subsequent lossless compression is introduced. This coupling of lossy compression and lossless compression enables further compression gain. It is demonstrated that this approach is able to reduce the bit rate requirement from the current 24 bits per 20 millisecond frame to below 20, using a standard spectral distortion metric for comparison. Several fast-search VQ methods for use in speech spectrum coding have been evaluated. The usefulness of fast-search algorithms is highly dependent upon the source characteristics and, although previous research has been undertaken for coding of images using VQ codebooks trained with the source samples directly, the product-code structured codebooks for speech spectrum quantization place new constraints on the search methodology. The second major focus of the research is an investigation of the effect of lowrate spectral compression methods on the task of automatic speaker identification. The motivation for this aspect of the research arose from a need to simultaneously preserve the speech quality and intelligibility and to provide for machine-based automatic speaker recognition using the compressed speech. This is important because there are several emerging applications of speaker identification where compressed speech is involved. Examples include mobile communications where the speech has been highly compressed, or where a database of speech material has been assembled and stored in compressed form. Although these two application areas have the same objective - that of maximizing the identification rate - the starting points are quite different. On the one hand, the speech material used for training the identification algorithm may or may not be available in compressed form. On the other hand, the new test material on which identification is to be based may only be available in compressed form. Using the spectral parameters which have been stored in compressed form, two main classes of speaker identification algorithm are examined. Some studies have been conducted in the past on bandwidth-limited speaker identification, but the use of short-term spectral compression deserves separate investigation. Combining the major aspects of the research, some important design guidelines for the construction of an identification model when based on the use of compressed speech are put forward.
Resumo:
Seaport container terminals are an important part of the logistics systems in international trades. This paper investigates the relationship between quay cranes, yard machines and container storage locations in a multi-berth and multi-ship environment. The aims are to develop a model for improving the operation efficiency of the seaports and to develop an analytical tool for yard operation planning. Due to the fact that the container transfer times are sequence-dependent and with the large number of variables involve, the proposed model cannot be solved in a reasonable time interval for realistically sized problems. For this reason, List Scheduling and Tabu Search algorithms have been developed to solve this formidable and NP-hard scheduling problem. Numerical implementations have been analysed and promising results have been achieved.
Resumo:
The main aim of this thesis is to analyse and optimise a public hospital Emergency Department. The Emergency Department (ED) is a complex system with limited resources and a high demand for these resources. Adding to the complexity is the stochastic nature of almost every element and characteristic in the ED. The interaction with other functional areas also complicates the system as these areas have a huge impact on the ED and the ED is powerless to change them. Therefore it is imperative that OR be applied to the ED to improve the performance within the constraints of the system. The main characteristics of the system to optimise included tardiness, adherence to waiting time targets, access block and length of stay. A validated and verified simulation model was built to model the real life system. This enabled detailed analysis of resources and flow without disruption to the actual ED. A wide range of different policies for the ED and a variety of resources were able to be investigated. Of particular interest was the number and type of beds in the ED and also the shift times of physicians. One point worth noting was that neither of these resources work in isolation and for optimisation of the system both resources need to be investigated in tandem. The ED was likened to a flow shop scheduling problem with the patients and beds being synonymous with the jobs and machines typically found in manufacturing problems. This enabled an analytic scheduling approach. Constructive heuristics were developed to reactively schedule the system in real time and these were able to improve the performance of the system. Metaheuristics that optimised the system were also developed and analysed. An innovative hybrid Simulated Annealing and Tabu Search algorithm was developed that out-performed both simulated annealing and tabu search algorithms by combining some of their features. The new algorithm achieves a more optimal solution and does so in a shorter time.
Resumo:
Multi-Objective optimization for designing of a benchmark cogeneration system known as CGAM cogeneration system has been performed. In optimization approach, the thermoeconomic and Environmental aspects have been considered, simultaneously. The environmental objective function has been defined and expressed in cost terms. One of the most suitable optimization techniques developed using a particular class of search algorithms known as; Multi-Objective Particle Swarm Optimization (MOPSO) algorithm has been used here. This approach has been applied to find the set of Pareto optimal solutions with respect to the aforementioned objective functions. An example of fuzzy decision-making with the aid of Bellman-Zadeh approach has been presented and a final optimal solution has been introduced.
Resumo:
Motivation: Gene silencing, also called RNA interference, requires reliable assessment of silencer impacts. A critical task is to find matches between silencer oligomers and sites in the genome, in accordance with one-to-many matching rules (G-U matching, with provision for mismatches). Fast search algorithms are required to support silencer impact assessments in procedures for designing effective silencer sequences.Results: The article presents a matching algorithm and data structures specialized for matching searches, including a kernel procedure that addresses a Boolean version of the database task called the skyline search. Besides exact matches, the algorithm is extended to allow for the location-specific mismatches applicable in plants. Computational tests show that the algorithm is significantly faster than suffix-tree alternatives. © The Author 2010. Published by Oxford University Press. All rights reserved.
Resumo:
Random Indexing K-tree is the combination of two algorithms suited for large scale document clustering.
Resumo:
Log-linear and maximum-margin models are two commonly-used methods in supervised machine learning, and are frequently used in structured prediction problems. Efficient learning of parameters in these models is therefore an important problem, and becomes a key factor when learning from very large data sets. This paper describes exponentiated gradient (EG) algorithms for training such models, where EG updates are applied to the convex dual of either the log-linear or max-margin objective function; the dual in both the log-linear and max-margin cases corresponds to minimizing a convex function with simplex constraints. We study both batch and online variants of the algorithm, and provide rates of convergence for both cases. In the max-margin case, O(1/ε) EG updates are required to reach a given accuracy ε in the dual; in contrast, for log-linear models only O(log(1/ε)) updates are required. For both the max-margin and log-linear cases, our bounds suggest that the online EG algorithm requires a factor of n less computation to reach a desired accuracy than the batch EG algorithm, where n is the number of training examples. Our experiments confirm that the online algorithms are much faster than the batch algorithms in practice. We describe how the EG updates factor in a convenient way for structured prediction problems, allowing the algorithms to be efficiently applied to problems such as sequence learning or natural language parsing. We perform extensive evaluation of the algorithms, comparing them to L-BFGS and stochastic gradient descent for log-linear models, and to SVM-Struct for max-margin models. The algorithms are applied to a multi-class problem as well as to a more complex large-scale parsing task. In all these settings, the EG algorithms presented here outperform the other methods.
Resumo:
The main aim of this paper is to describe an adaptive re-planning algorithm based on a RRT and Game Theory to produce an efficient collision free obstacle adaptive Mission Path Planner for Search and Rescue (SAR) missions. This will provide UAV autopilots and flight computers with the capability to autonomously avoid static obstacles and No Fly Zones (NFZs) through dynamic adaptive path replanning. The methods and algorithms produce optimal collision free paths and can be integrated on a decision aid tool and UAV autopilots.
Resumo:
In the mining optimisation literature, most researchers focused on two strategic-level and tactical-level open-pit mine optimisation problems, which are respectively termed ultimate pit limit (UPIT) or constrained pit limit (CPIT). However, many researchers indicate that the substantial numbers of variables and constraints in real-world instances (e.g., with 50-1000 thousand blocks) make the CPIT’s mixed integer programming (MIP) model intractable for use. Thus, it becomes a considerable challenge to solve the large scale CPIT instances without relying on exact MIP optimiser as well as the complicated MIP relaxation/decomposition methods. To take this challenge, two new graph-based algorithms based on network flow graph and conjunctive graph theory are developed by taking advantage of problem properties. The performance of our proposed algorithms is validated by testing recent large scale benchmark UPIT and CPIT instances’ datasets of MineLib in 2013. In comparison to best known results from MineLib, it is shown that the proposed algorithms outperform other CPIT solution approaches existing in the literature. The proposed graph-based algorithms leads to a more competent mine scheduling optimisation expert system because the third-party MIP optimiser is no longer indispensable and random neighbourhood search is not necessary.
Resumo:
This paper describes the approach taken to the clustering task at INEX 2009 by a group at the Queensland University of Technology. The Random Indexing (RI) K-tree has been used with a representation that is based on the semantic markup available in the INEX 2009 Wikipedia collection. The RI K-tree is a scalable approach to clustering large document collections. This approach has produced quality clustering when evaluated using two different methodologies.