Digital Library

933 results for GRAPH SEARCH ALGORITHMS

Matrix Decomposition Methods for Data Mining : Computational Complexity and Algorithms

Relevance:

20.00%

Abstract:

Matrix decompositions, where a given matrix is represented as a product of two other matrices, are regularly used in data mining. Most matrix decompositions have their roots in linear algebra, but the needs of data mining are not always those of linear algebra. In data mining one needs results that are interpretable, and what is considered interpretable in data mining can be very different from what is considered interpretable in linear algebra. The purpose of this thesis is to study matrix decompositions that directly address the issue of interpretability. An example is a decomposition of binary matrices where the factor matrices are assumed to be binary and the matrix multiplication is Boolean. The restriction to binary factor matrices increases interpretability (the factor matrices are of the same type as the original matrix) and allows the use of Boolean matrix multiplication, which is often more intuitive than normal matrix multiplication with binary matrices. Several other decomposition methods are also described, and the computational complexity of computing them is studied together with the hardness of approximating the related optimization problems. Based on these studies, algorithms for constructing the decompositions are proposed. Constructing the decompositions turns out to be computationally hard, and the proposed algorithms are mostly based on various heuristics. Nevertheless, the algorithms are shown to be capable of finding good results in empirical experiments conducted with both synthetic and real-world data.
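
To make the difference between Boolean and normal matrix multiplication concrete, here is a minimal sketch in plain Python. The function names and the two small example matrices are illustrative assumptions and are not taken from the thesis; only the definition of the Boolean product (OR over AND instead of sum over product) is standard.

```python
def boolean_product(a, b):
    """Boolean matrix product: entry (i, j) is 1 if some k has a[i][k] = b[k][j] = 1."""
    inner = len(b)
    cols = len(b[0])
    return [
        [int(any(a[i][k] and b[k][j] for k in range(inner))) for j in range(cols)]
        for i in range(len(a))
    ]


def ordinary_product(a, b):
    """Normal matrix product over the integers, for comparison."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


# Hypothetical binary factor matrices, chosen only for illustration.
A = [[1, 1],
     [0, 1]]
B = [[1, 0],
     [1, 1]]

print(ordinary_product(A, B))  # [[2, 1], [1, 1]] -- no longer a binary matrix
print(boolean_product(A, B))   # [[1, 1], [1, 1]] -- stays binary
```

With the Boolean product, the result of multiplying two binary factors is again binary, which is the interpretability property the abstract refers to.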

Algorithms for 13C metabolic flux analysis

Relevance:

20.00%

Abstract:

The metabolism of an organism consists of a network of biochemical reactions that transform small molecules, or metabolites, into others in order to produce energy and building blocks for essential macromolecules. The goal of metabolic flux analysis is to uncover the rates, or fluxes, of those biochemical reactions. In a steady state, the sum of the fluxes that produce an internal metabolite is equal to the sum of the fluxes that consume the same molecule. Thus the steady state imposes linear balance constraints on the fluxes. In general, the balance constraints imposed by the steady state are not sufficient to uncover all the fluxes of a metabolic network. The fluxes through cycles and alternative pathways between the same source and target metabolites remain unknown. More information about the fluxes can be obtained from isotopic labelling experiments, where a cell population is fed with labelled nutrients, such as glucose that contains 13C atoms. Labels are then transferred by biochemical reactions to other metabolites. The relative abundances of different labelling patterns in internal metabolites depend on the fluxes of the pathways producing them. Thus, the relative abundances of different labelling patterns contain information about the fluxes that cannot be uncovered from the balance constraints derived from the steady state. The field of research that estimates the fluxes using the measured constraints on the relative abundances of different labelling patterns induced by 13C-labelled nutrients is called 13C metabolic flux analysis. There are two approaches to 13C metabolic flux analysis. In the optimization approach, a non-linear optimization task is constructed in which candidate fluxes are iteratively generated until they fit the measured abundances of different labelling patterns.
In the direct approach, the linear balance constraints given by the steady state are augmented with linear constraints derived from the abundances of different labelling patterns of metabolites. Thus, mathematically involved non-linear optimization methods that can get stuck in local optima can be avoided. On the other hand, the direct approach may require more measurement data than the optimization approach to obtain the same flux information. Furthermore, the optimization framework can easily be applied regardless of the labelling measurement technology and with all network topologies. In this thesis we present a formal computational framework for direct 13C metabolic flux analysis. The aim of our study is to construct as many linear constraints on the fluxes as possible from the 13C labelling measurements, using only computational methods that avoid non-linear techniques and are independent of the type of measurement data, the labelling of external nutrients and the topology of the metabolic network. The presented framework is the first representative of the direct approach for 13C metabolic flux analysis that is free from restricting assumptions made about these parameters. In our framework, measurement data is first propagated from the measured metabolites to other metabolites. The propagation is facilitated by the flow analysis of metabolite fragments in the network. Then new linear constraints on the fluxes are derived from the propagated data by applying the techniques of linear algebra. Based on the results of the fragment flow analysis, we also present an experiment planning method that selects sets of metabolites whose relative abundances of different labelling patterns are most useful for 13C metabolic flux analysis. Furthermore, we give computational tools to process raw 13C labelling data produced by tandem mass spectrometry into a form suitable for 13C metabolic flux analysis.
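
The steady-state balance constraints mentioned in this abstract, and the reason they leave alternative pathways undetermined, can be illustrated with a toy network. The sketch below is an assumption made for illustration only: the four-reaction network, the matrix `S`, and the flux values are hypothetical and do not come from the thesis.

```python
def balance_residuals(stoichiometry, fluxes):
    """Net production of each internal metabolite; all zeros means steady state."""
    return [sum(c * v for c, v in zip(row, fluxes)) for row in stoichiometry]


# Hypothetical network: v1 produces A, v2 and v3 are alternative pathways
# from A to B, and v4 consumes B. Rows are internal metabolites (A, B),
# columns are reactions (v1, v2, v3, v4).
S = [
    [1, -1, -1, 0],   # A: produced by v1, consumed by v2 and v3
    [0, 1, 1, -1],    # B: produced by v2 and v3, consumed by v4
]

# Two different splits between the alternative pathways v2 and v3.
fluxes_a = [10, 7, 3, 10]
fluxes_b = [10, 2, 8, 10]

print(balance_residuals(S, fluxes_a))  # [0, 0]
print(balance_residuals(S, fluxes_b))  # [0, 0]
```

Both flux vectors satisfy the balance constraints, so the constraints alone cannot tell how the flux splits between the two pathways. This is the gap that labelling measurements are meant to close.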

Algorithms for Association-Based Gene Mapping

Relevance:

20.00%

FreeS: A fast algorithm to discover frequent free subtrees using a novel canonical form

Relevance:

20.00%

Abstract:

Web data can often be represented in free tree form; however, few free tree mining methods exist. In this paper, a computationally fast algorithm, FreeS, is presented to discover all frequently occurring free subtrees in a database of labelled free trees. FreeS is designed using an optimal canonical form, BOCF, that can uniquely represent free trees even in the presence of isomorphism. To avoid enumerating false positive candidates, it uses an enumeration approach based on a tree-structure guided scheme. This paper presents lemmas that introduce conditions governing the generation of free tree candidates during enumeration. An empirical study using both real and synthetic datasets shows that FreeS is scalable and significantly outperforms (i.e. is a few orders of magnitude faster than) the state-of-the-art frequent free tree mining algorithms, HybridTreeMiner and FreeTreeMiner.
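
The abstract does not define BOCF, so the following sketch does not attempt to reproduce it. Instead it shows a generic, well-known way of giving a labelled free tree a canonical string: root the tree at its centre (or try both centres), encode each subtree recursively with sorted child encodings, and take the smallest result. All function names and the example trees are assumptions made for illustration, and labels are assumed to contain no parentheses.

```python
def tree_centers(adj):
    """Return the one or two centre vertices of a free tree by peeling leaves."""
    if len(adj) <= 2:
        return list(adj)
    degree = {v: len(ns) for v, ns in adj.items()}
    leaves = [v for v, d in degree.items() if d == 1]
    remaining = len(adj)
    while remaining > 2:
        remaining -= len(leaves)
        next_leaves = []
        for leaf in leaves:
            for nb in adj[leaf]:
                degree[nb] -= 1
                if degree[nb] == 1:
                    next_leaves.append(nb)
        leaves = next_leaves
    return leaves


def encode(adj, labels, node, parent):
    """Encode the subtree rooted at node; sorting makes child order irrelevant."""
    children = sorted(
        encode(adj, labels, child, node) for child in adj[node] if child != parent
    )
    return labels[node] + "(" + "".join(children) + ")"


def canonical_form(adj, labels):
    """Smallest encoding over the tree's centres; equal for isomorphic trees."""
    return min(encode(adj, labels, c, None) for c in tree_centers(adj))


# Two isomorphic labelled trees with different vertex names (hypothetical data).
star_1 = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2]}
labels_1 = {1: "a", 2: "b", 3: "c", 4: "a"}
star_2 = {"x": ["y"], "y": ["x", "z", "w"], "z": ["y"], "w": ["y"]}
labels_2 = {"x": "c", "y": "b", "z": "a", "w": "a"}

# A path with the same multiset of labels but a different shape.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
path_labels = {1: "a", 2: "b", 3: "a", 4: "c"}

print(canonical_form(star_1, labels_1))   # b(a()a()c())
print(canonical_form(star_2, labels_2))   # b(a()a()c())
print(canonical_form(path, path_labels))  # a(b(a())c())
```

A miner can use such a string as a dictionary key so that isomorphic candidate subtrees are counted once. Whether BOCF works this way is not stated in the abstract.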

Systematic search for optical Barker codes with minimum length

Relevance:

20.00%

Abstract:

A method that yields optical Barker codes of smallest known lengths for given discrimination is described.

Location of concentrators in a computer communication network: a stochastic automaton search method

Relevance:

20.00%

Abstract:

The following problem is considered. Given the locations of the Central Processing Unit (CPU) and the terminals which have to communicate with it, determine the number and locations of the concentrators and assign the terminals to the concentrators in such a way that the total cost is minimized. There is also a fixed cost associated with each concentrator. There is an upper limit to the number of terminals which can be connected to a concentrator. The terminals can also be connected directly to the CPU. In this paper it is assumed that the concentrators can be located anywhere in the area A containing the CPU and the terminals. Then this becomes a multimodal optimization problem. In the proposed algorithm a stochastic automaton is used as a search device to locate the minimum of the multimodal cost function. The proposed algorithm involves the following. The area A containing the CPU and the terminals is divided into an arbitrary number of regions (say K). An approximate value for the number of concentrators is assumed (say m). The optimum number is determined by iteration later. The m concentrators can be assigned to the K regions in (mk) ways (m > K) or (km) ways (K > m). (All possible assignments are feasible, i.e. a region can contain 0, 1, …, m concentrators.) Each possible assignment is assumed to represent a state of the stochastic variable structure automaton. To start with, all the states are assigned equal probabilities. At each stage of the search the automaton visits a state according to the current probability distribution. At each visit the automaton selects a 'point' inside that state with uniform probability. The cost associated with that point is calculated and the average cost of that state is updated. Then the probabilities of all the states are updated.
The probabilities are taken to be inversely proportional to the average cost of the states. After a certain number of searches the search probabilities become stationary and the automaton visits a particular state again and again. Then the automaton is said to have converged to that state. Then, by conducting a local gradient search within that state, the exact locations of the concentrators are determined. This algorithm was applied to a set of test problems and the results were compared with those given by Cooper's (1964, 1967) EAC algorithm, and on average the proposed algorithm was found to perform better.
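
The search loop described in this abstract can be sketched on a much simpler problem. In the sketch below each state is just one of K equal intervals of a one-dimensional domain, not an assignment of concentrators to regions, and the cost function is a made-up positive function with two minima. The function names, the rule for states that have not been visited yet (they borrow the overall mean cost), and all parameter values are assumptions; the final local gradient search is omitted.

```python
import random


def automaton_search(cost, bounds, n_states=5, n_visits=2000, seed=0):
    """Visit states at random, keeping probabilities proportional to 1 / average cost."""
    rng = random.Random(seed)
    low, high = bounds
    width = (high - low) / n_states
    totals = [0.0] * n_states
    visits = [0] * n_states
    probs = [1.0 / n_states] * n_states  # all states start equally likely
    averages = [0.0] * n_states
    for _ in range(n_visits):
        state = rng.choices(range(n_states), weights=probs)[0]
        # Select a point inside the visited state with uniform probability.
        x = rng.uniform(low + state * width, low + (state + 1) * width)
        totals[state] += cost(x)
        visits[state] += 1
        overall = sum(totals) / sum(visits)
        # Assumption: unvisited states are given the overall mean cost so far.
        averages = [
            totals[i] / visits[i] if visits[i] else overall for i in range(n_states)
        ]
        inverse = [1.0 / a for a in averages]  # requires strictly positive costs
        norm = sum(inverse)
        probs = [w / norm for w in inverse]
    return probs, averages, visits


def two_minima_cost(x):
    """Hypothetical cost with a local minimum near x = 1 and a global one at x = 7."""
    return min(3.0 + (x - 1.0) ** 2, 1.0 + (x - 7.0) ** 2)


probs, averages, visits = automaton_search(two_minima_cost, (0.0, 10.0))
best_state = probs.index(max(probs))
print(best_state, [round(p, 3) for p in probs])
```

In this toy setting the interval [6, 8), which contains the global minimum, ends up with the lowest average cost and therefore the highest visiting probability. A local search restricted to that interval would then refine the location, as the abstract describes for the converged state.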

Dessy : desktop search and synchronization

Relevance:

20.00%

Abstract:

Current smartphones have a storage capacity of several gigabytes. More and more information is stored on mobile devices. To meet the challenge of information organization, we turn to desktop search. Users often possess multiple devices, and synchronize (subsets of) information between them. This makes file synchronization more important. This thesis presents Dessy, a desktop search and synchronization framework for mobile devices. Dessy uses desktop search techniques, such as indexing, query and index term stemming, and search relevance ranking. Dessy finds files by their content, metadata, and context information. For example, PDF files may be found by their author, subject, title, or text. EXIF data of JPEG files may be used in finding them. User-defined tags can be added to files to organize and retrieve them later. Retrieved files are ranked according to their relevance to the search query. The Dessy prototype uses the BM25 ranking function, which is widely used in information retrieval. Dessy provides an interface for locating files for both users and applications. Dessy is closely integrated with the Syxaw file synchronizer, which provides efficient file and metadata synchronization, optimizing network usage. Dessy supports synchronization of search results, individual files, and directory trees. It allows finding and synchronizing files that reside on remote computers or on the Internet. Dessy is designed to solve the problem of efficient mobile desktop search and synchronization, also supporting remote and Internet search. Remote searches may be carried out offline using a downloaded index, or while connected to the remote machine on a weak network. To secure user data, transmissions between the Dessy client and server are encrypted using symmetric encryption. Symmetric encryption keys are exchanged with RSA key exchange. Dessy emphasizes extensibility. The cryptography can also be extended. Users may tag their files with context tags and control custom file metadata.
Adding new indexed file types, metadata fields, ranking methods, and index types is easy. Finding files is done with virtual directories, which are views into the user’s files, browsable by regular file managers. On mobile devices, the Dessy GUI provides easy access to the search and synchronization system. This thesis includes results of Dessy synchronization and search experiments, including power usage measurements. Finally, Dessy has been designed with mobility and device constraints in mind. It requires only MIDP 2.0 Mobile Java with FileConnection support, and Java 1.5 on desktop machines.
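
The abstract names BM25 as the ranking function but gives no details of how Dessy implements it. The sketch below is therefore only a generic textbook form of BM25 over tokenised documents. The function name, the parameter defaults `k1 = 1.2` and `b = 0.75`, the non-negative IDF variant, and the tiny example corpus are all assumptions, and the document list is assumed to be non-empty.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Score each tokenised document against the query with a common BM25 form."""
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # number of documents containing each term
    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            # Non-negative IDF variant: rare terms weigh more.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalisation: long documents need more hits for the same score.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical corpus of already tokenised and stemmed file contents.
docs = [
    ["mobile", "search", "index"],
    ["file", "synchronization"],
    ["search", "search", "ranking", "search"],
]
scores = bm25_scores(["search"], docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)  # [2, 0, 1]
```

The third document ranks first because it mentions the query term most often, even after length normalisation, and the second scores zero because it does not contain the term at all.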

An algebraic approach to the restoration of images

Relevance:

20.00%

Abstract:

Two different matrix algorithms are described for the restoration of blurred pictures. These are illustrated by numerical examples.

A new multi-objective model to optimise rail transport scheduler

Relevance:

20.00%

Abstract:

The sugarcane transport system plays a critical role in the overall performance of Australia’s sugarcane industry. An inefficient sugarcane transport system interrupts the raw sugarcane harvesting process, delays the delivery of sugarcane to the mill, degrades the sugar quality, increases the usage of empty bins, and leads to additional sugarcane production costs. Due to these negative effects, there is an urgent need for an efficient sugarcane transport schedule to be developed by the rail schedulers. In this study, a multi-objective model using mixed integer programming (MIP) is developed to produce an industry-oriented scheduling optimiser for the sugarcane rail transport system. The exact MIP solver (IBM ILOG-CPLEX) is applied to minimise the makespan and the total operating time as multi-objective functions. Moreover, the so-called Siding Neighbourhood Search (SNS) algorithm is developed and integrated with the Sidings Satisfaction Priorities (SSP) and Rail Conflict Elimination (RCE) algorithms to solve the problem more efficiently. In implementation, the sugarcane transport system of Kalamia Sugar Mill, located in a coastal locality about 1050 km northwest of Brisbane, is investigated as a real case study. Computational experiments indicate that high-quality solutions are obtainable in industry-scale applications.

Standardization efforts: The relationship between knowledge dimensions, search processes and innovation outcomes

Relevance:

20.00%

Abstract:

We explore how a standardization effort (i.e., when a firm pursues standards to further innovation) involves different search processes for knowledge and innovation outcomes. Using an inductive case study of Vanke, a leading Chinese property developer, we show how varying degrees of knowledge complexity and codification combine to produce a typology of four types of search process: active, integrative, decentralized and passive, resulting in four types of innovation outcome: modular, radical, incremental and architectural. We argue that when the standardization effort in a firm involves highly codified knowledge, incremental and architectural innovation outcomes are fostered, while modular and radical innovations are hindered. We discuss how standardization efforts can result in a second-order innovation capability, and conclude by calling for comparative research in other settings to understand how standardization efforts can be suited to different types of search process in different industry contexts.

An Optimal Mechanism for Sponsored Search Auctions on the Web and Comparison With Other Mechanisms

Relevance:

20.00%

Abstract:

In this paper, we first describe a framework to model the sponsored search auction on the web as a mechanism design problem. Using this framework, we describe two well-known mechanisms for sponsored search auctions: Generalized Second Price (GSP) and Vickrey-Clarke-Groves (VCG). We then derive a new mechanism for sponsored search auctions, which we call the optimal (OPT) mechanism. The OPT mechanism maximizes the search engine's expected revenue, while achieving Bayesian incentive compatibility and individual rationality of the advertisers. We then undertake a detailed comparative study of the mechanisms GSP, VCG, and OPT. We compute and compare the expected revenue earned by the search engine under the three mechanisms when the advertisers are symmetric and some special conditions are satisfied. We also compare the three mechanisms in terms of incentive compatibility, individual rationality, and computational complexity. Note to Practitioners: The advertiser-supported web site is one of the successful business models in the emerging web landscape. When an Internet user enters a keyword (i.e., a search phrase) into a search engine, the user gets back a page with results, containing the links most relevant to the query and also sponsored links (also called paid advertisement links). When a sponsored link is clicked, the user is directed to the corresponding advertiser's web page. The advertiser pays the search engine in some appropriate manner for sending the user to its web page. For every search performed by any user on any keyword, the search engine faces the problem of matching a set of advertisers to the sponsored slots. In addition, the search engine also needs to decide on a price to be charged to each advertiser. Due to increasing demands for Internet advertising space, most search engines currently use auction mechanisms for this purpose. These are called sponsored search auctions.
A significant percentage of the revenue of Internet giants such as Google, Yahoo!, MSN, etc., comes from sponsored search auctions. In this paper, we study two auction mechanisms, GSP and VCG, which are quite popular in the sponsored auction context, and pursue the objective of designing a mechanism that is superior to these two mechanisms. In particular, we propose a new mechanism which we call the OPT mechanism. This mechanism maximizes the search engine's expected revenue subject to achieving Bayesian incentive compatibility and individual rationality. Bayesian incentive compatibility guarantees that it is optimal for each advertiser to bid his/her true value provided that all other agents also bid their respective true values. Individual rationality ensures that the agents participate voluntarily in the auction since they are assured of gaining a non-negative payoff by doing so.
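
As a rough illustration of how GSP and VCG payments differ, the sketch below uses a common simplified model that is an assumption here and not necessarily the framework of the paper: slots have fixed click-through rates in decreasing order, advertisers are ranked by bid alone, and a payment is the expected charge per search impression. The function names and the numbers are hypothetical. The OPT mechanism is not sketched because the abstract does not specify it.

```python
def gsp_payments(bids, click_rates):
    """GSP: the winner of slot i pays the next-highest bid for each click."""
    ranked = sorted(bids, reverse=True)
    n_slots = min(len(click_rates), len(ranked))
    payments = []
    for i in range(n_slots):
        next_bid = ranked[i + 1] if i + 1 < len(ranked) else 0.0
        payments.append(click_rates[i] * next_bid)
    return payments


def vcg_payments(bids, click_rates):
    """VCG: each winner pays the loss in value it imposes on lower-ranked bidders."""
    ranked = sorted(bids, reverse=True)
    n_slots = min(len(click_rates), len(ranked))
    rates = list(click_rates[:n_slots]) + [0.0]
    payments = [0.0] * n_slots
    running = 0.0
    # Work upwards from the last slot: p_i = (rate_i - rate_{i+1}) * bid_{i+1} + p_{i+1}.
    for i in range(n_slots - 1, -1, -1):
        next_bid = ranked[i + 1] if i + 1 < len(ranked) else 0.0
        running += (rates[i] - rates[i + 1]) * next_bid
        payments[i] = running
    return payments


# Hypothetical example: two slots, three advertisers.
click_rates = [0.5, 0.3]
bids = [4.0, 3.0, 1.0]
print(gsp_payments(bids, click_rates))
print(vcg_payments(bids, click_rates))
```

For these numbers the top slot costs about 1.5 under GSP and about 0.9 under VCG, while the last slot costs about 0.3 under both. For identical bids, VCG never charges more than GSP in this model. Real comparisons must account for the fact that advertisers bid differently under the two rules.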

Partial Integrated Guidance and Control of Surface-to-Air Interceptors for High Speed Targets

Relevance:

20.00%

Abstract:

An important limitation of the existing integrated guidance and control (IGC) algorithms is that they do not explicitly exploit the inherent time scale separation that exists in aerospace vehicles between rotational and translational motions, and hence can be ineffective. To address this issue, a two-loop partial integrated guidance and control (PIGC) scheme is proposed in this paper. In this design, the outer loop uses a recently developed, computationally efficient optimal control formulation named model predictive static programming. It gives the commanded pitch and yaw rates, whereas the necessary roll-rate command is generated from a roll-stabilization loop. The inner loop tracks the outer loop commands using the dynamic inversion philosophy. Unusually, the six-degree-of-freedom (Six-DOF) model is used directly in both loops. This intelligent manipulation preserves the inherent time scale separation property between the translational and rotational dynamics, and hence overcomes the deficiency of current IGC designs while preserving their benefits. Comparative studies of PIGC with one-loop IGC and the conventional three-loop design were carried out for engaging an incoming high-speed target. Simulation studies demonstrate the usefulness of this method.

Star Camera Calibration Combined with Independent Spacecraft Attitude Determination

Relevance:

20.00%

Abstract:

A methodology for determining spacecraft attitude and autonomously calibrating a star camera, both independently of each other, is presented in this paper. Unlike most attitude determination algorithms, where the attitude of the satellite depends on the camera calibration parameters (such as principal point offset and focal length), the proposed method has the advantage of computing spacecraft attitude independently of the camera calibration parameters, except lens distortion. In the proposed method both attitude estimation and star camera calibration are done together, independently of each other, by directly using the star coordinates in the image plane and the corresponding star vectors in the inertial coordinate frame. Satellite attitude, camera principal point offset, focal length (in pixels), and the lens distortion coefficient are found by a simple two-step method. In the first step, all parameters (except lens distortion) are estimated using a closed-form solution based on a distortion-free camera model. In the second step, the lens distortion coefficient is estimated by the linear least squares method, using the solution of the first step in the camera model that incorporates distortion. These steps are applied in an iterative manner to refine the estimated parameters. The whole procedure is fast enough for onboard implementation.
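
The second step of this abstract, estimating a lens distortion coefficient by linear least squares, can be sketched under a strong simplifying assumption. The abstract does not give the camera model, so the code assumes a single-coefficient radial model, r_d = r_u (1 + k r_u^2), where r_u is the distortion-free radius predicted from the first step and r_d is the measured radius. The function name and the synthetic data are hypothetical.

```python
def estimate_radial_distortion(ideal_radii, measured_radii):
    """Least-squares fit of k in r_d - r_u = k * r_u**3 (one unknown, closed form)."""
    numerator = sum(r ** 3 * (rd - r) for r, rd in zip(ideal_radii, measured_radii))
    denominator = sum(r ** 6 for r in ideal_radii)
    return numerator / denominator


# Synthetic check: generate measured radii from a known coefficient, then recover it.
true_k = -0.08
ideal = [0.1 * i for i in range(1, 10)]            # normalised radii 0.1 .. 0.9
measured = [r * (1 + true_k * r ** 2) for r in ideal]

estimated_k = estimate_radial_distortion(ideal, measured)
print(round(estimated_k, 6))  # -0.08
```

Because the model is linear in k once the distortion-free radii are fixed, no iterative solver is needed for this step. In the iterative scheme the abstract describes, the refined coefficient would then feed back into the next estimate of the other parameters.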

Essays on Search and Informational Asymmetry in Labor and Credit Markets

Relevance:

20.00%

Feature extraction for fingerprint classification

Relevance:

20.00%

Abstract:

This paper presents two algorithms for smoothing and feature extraction for fingerprint classification. Deutsch's (2) thinning algorithm (rectangular array) is used for thinning the digitized fingerprint (binary version). A simple algorithm is also suggested for classifying the fingerprints. Experimental results obtained using these algorithms are presented.
