98 resultados para Search problems


Relevância:

20.00% 20.00%

Publicador:

Resumo:

The analysis of sequential data is required in many diverse areas such as telecommunications, stock market analysis, and bioinformatics. A basic problem related to the analysis of sequential data is the sequence segmentation problem. A sequence segmentation is a partition of the sequence into a number of non-overlapping segments that cover all data points, such that each segment is as homogeneous as possible. This problem can be solved optimally using a standard dynamic programming algorithm. In the first part of the thesis, we present a new approximation algorithm for the sequence segmentation problem. This algorithm has smaller running time than the optimal dynamic programming algorithm, while it has bounded approximation ratio. The basic idea is to divide the input sequence into subsequences, solve the problem optimally in each subsequence, and then appropriately combine the solutions to the subproblems into one final solution. In the second part of the thesis, we study alternative segmentation models that are devised to better fit the data. More specifically, we focus on clustered segmentations and segmentations with rearrangements. While in the standard segmentation of a multidimensional sequence all dimensions share the same segment boundaries, in a clustered segmentation the multidimensional sequence is segmented in such a way that dimensions are allowed to form clusters. Each cluster of dimensions is then segmented separately. We formally define the problem of clustered segmentations and we experimentally show that segmenting sequences using this segmentation model, leads to solutions with smaller error for the same model cost. Segmentation with rearrangements is a novel variation to the segmentation problem: in addition to partitioning the sequence we also seek to apply a limited amount of reordering, so that the overall representation error is minimized. We formulate the problem of segmentation with rearrangements and we show that it is an NP-hard problem to solve or even to approximate. We devise effective algorithms for the proposed problem, combining ideas from dynamic programming and outlier detection algorithms in sequences. In the final part of the thesis, we discuss the problem of aggregating results of segmentation algorithms on the same set of data points. In this case, we are interested in producing a partitioning of the data that agrees as much as possible with the input partitions. We show that this problem can be solved optimally in polynomial time using dynamic programming. Furthermore, we show that not all data points are candidates for segment boundaries in the optimal solution.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

XML documents are becoming more and more common in various environments. In particular, enterprise-scale document management is commonly centred around XML, and desktop applications as well as online document collections are soon to follow. The growing number of XML documents increases the importance of appropriate indexing methods and search tools in keeping the information accessible. Therefore, we focus on content that is stored in XML format as we develop such indexing methods. Because XML is used for different kinds of content ranging all the way from records of data fields to narrative full-texts, the methods for Information Retrieval are facing a new challenge in identifying which content is subject to data queries and which should be indexed for full-text search. In response to this challenge, we analyse the relation of character content and XML tags in XML documents in order to separate the full-text from data. As a result, we are able to both reduce the size of the index by 5-6\% and improve the retrieval precision as we select the XML fragments to be indexed. Besides being challenging, XML comes with many unexplored opportunities which are not paid much attention in the literature. For example, authors often tag the content they want to emphasise by using a typeface that stands out. The tagged content constitutes phrases that are descriptive of the content and useful for full-text search. They are simple to detect in XML documents, but also possible to confuse with other inline-level text. Nonetheless, the search results seem to improve when the detected phrases are given additional weight in the index. Similar improvements are reported when related content is associated with the indexed full-text including titles, captions, and references. Experimental results show that for certain types of document collections, at least, the proposed methods help us find the relevant answers. Even when we know nothing about the document structure but the XML syntax, we are able to take advantage of the XML structure when the content is indexed for full-text search.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis studies optimisation problems related to modern large-scale distributed systems, such as wireless sensor networks and wireless ad-hoc networks. The concrete tasks that we use as motivating examples are the following: (i) maximising the lifetime of a battery-powered wireless sensor network, (ii) maximising the capacity of a wireless communication network, and (iii) minimising the number of sensors in a surveillance application. A sensor node consumes energy both when it is transmitting or forwarding data, and when it is performing measurements. Hence task (i), lifetime maximisation, can be approached from two different perspectives. First, we can seek for optimal data flows that make the most out of the energy resources available in the network; such optimisation problems are examples of so-called max-min linear programs. Second, we can conserve energy by putting redundant sensors into sleep mode; we arrive at the sleep scheduling problem, in which the objective is to find an optimal schedule that determines when each sensor node is asleep and when it is awake. In a wireless network simultaneous radio transmissions may interfere with each other. Task (ii), capacity maximisation, therefore gives rise to another scheduling problem, the activity scheduling problem, in which the objective is to find a minimum-length conflict-free schedule that satisfies the data transmission requirements of all wireless communication links. Task (iii), minimising the number of sensors, is related to the classical graph problem of finding a minimum dominating set. However, if we are not only interested in detecting an intruder but also locating the intruder, it is not sufficient to solve the dominating set problem; formulations such as minimum-size identifying codes and locating dominating codes are more appropriate. This thesis presents approximation algorithms for each of these optimisation problems, i.e., for max-min linear programs, sleep scheduling, activity scheduling, identifying codes, and locating dominating codes. Two complementary approaches are taken. The main focus is on local algorithms, which are constant-time distributed algorithms. The contributions include local approximation algorithms for max-min linear programs, sleep scheduling, and activity scheduling. In the case of max-min linear programs, tight upper and lower bounds are proved for the best possible approximation ratio that can be achieved by any local algorithm. The second approach is the study of centralised polynomial-time algorithms in local graphs these are geometric graphs whose structure exhibits spatial locality. Among other contributions, it is shown that while identifying codes and locating dominating codes are hard to approximate in general graphs, they admit a polynomial-time approximation scheme in local graphs.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Current smartphones have a storage capacity of several gigabytes. More and more information is stored on mobile devices. To meet the challenge of information organization, we turn to desktop search. Users often possess multiple devices, and synchronize (subsets of) information between them. This makes file synchronization more important. This thesis presents Dessy, a desktop search and synchronization framework for mobile devices. Dessy uses desktop search techniques, such as indexing, query and index term stemming, and search relevance ranking. Dessy finds files by their content, metadata, and context information. For example, PDF files may be found by their author, subject, title, or text. EXIF data of JPEG files may be used in finding them. User–defined tags can be added to files to organize and retrieve them later. Retrieved files are ranked according to their relevance to the search query. The Dessy prototype uses the BM25 ranking function, used widely in information retrieval. Dessy provides an interface for locating files for both users and applications. Dessy is closely integrated with the Syxaw file synchronizer, which provides efficient file and metadata synchronization, optimizing network usage. Dessy supports synchronization of search results, individual files, and directory trees. It allows finding and synchronizing files that reside on remote computers, or the Internet. Dessy is designed to solve the problem of efficient mobile desktop search and synchronization, also supporting remote and Internet search. Remote searches may be carried out offline using a downloaded index, or while connected to the remote machine on a weak network. To secure user data, transmissions between the Dessy client and server are encrypted using symmetric encryption. Symmetric encryption keys are exchanged with RSA key exchange. Dessy emphasizes extensibility. Also the cryptography can be extended. Users may tag their files with context tags and control custom file metadata. Adding new indexed file types, metadata fields, ranking methods, and index types is easy. Finding files is done with virtual directories, which are views into the user’s files, browseable by regular file managers. On mobile devices, the Dessy GUI provides easy access to the search and synchronization system. This thesis includes results of Dessy synchronization and search experiments, including power usage measurements. Finally, Dessy has been designed with mobility and device constraints in mind. It requires only MIDP 2.0 Mobile Java with FileConnection support, and Java 1.5 on desktop machines.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The aim of the present study is to analyze Confucian understandings of the Christian doctrine of salvation in order to find the basic problems in the Confucian-Christian dialogue. I will approach the task via a systematic theological analysis of four issues in order to limit the thesis to an appropriate size. They are analyzed in three chapters as follows: 1. The Confucian concept concerning the existence of God. Here I discuss mainly the issue of assimilation of the Christian concept of God to the concepts of Sovereign on High (Shangdi) and Heaven (Tian) in Confucianism. 2. The Confucian understanding of the object of salvation and its status in Christianity. 3. The Confucian understanding of the means of salvation in Christianity. Before beginning this analysis it is necessary to clarify the vast variety of controversies, arguments, ideas, opinions and comments expressed in the name of Confucianism; thus, clear distinctions among different schools of Confucianism are given in chapter 2. In the last chapter I will discuss the results of my research in this study by pointing out the basic problems that will appear in the analysis. The results of the present study provide conclusions in three related areas: the tacit differences in the ways of thinking between Confucians and Christians, the basic problems of the Confucian-Christian dialogue, and the affirmative elements in the dialogue. In addition to a summary, a bibliography and an index, there are also eight appendices, where I have introduced important background information for readers to understand the present study.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Ongoing habitat loss and fragmentation threaten much of the biodiversity that we know today. As such, conservation efforts are required if we want to protect biodiversity. Conservation budgets are typically tight, making the cost-effective selection of protected areas difficult. Therefore, reserve design methods have been developed to identify sets of sites, that together represent the species of conservation interest in a cost-effective manner. To be able to select reserve networks, data on species distributions is needed. Such data is often incomplete, but species habitat distribution models (SHDMs) can be used to link the occurrence of the species at the surveyed sites to the environmental conditions at these locations (e.g. climatic, vegetation and soil conditions). The probability of the species occurring at unvisited location is next predicted by the model, based on the environmental conditions of those sites. The spatial configuration of reserve networks is important, because habitat loss around reserves can influence the persistence of species inside the network. Since species differ in their requirements for network configuration, the spatial cohesion of networks needs to be species-specific. A way to account for species-specific requirements is to use spatial variables in SHDMs. Spatial SHDMs allow the evaluation of the effect of reserve network configuration on the probability of occurrence of the species inside the network. Even though reserves are important for conservation, they are not the only option available to conservation planners. To enhance or maintain habitat quality, restoration or maintenance measures are sometimes required. As a result, the number of conservation options per site increases. Currently available reserve selection tools do however not offer the ability to handle multiple, alternative options per site. This thesis extends the existing methodology for reserve design, by offering methods to identify cost-effective conservation planning solutions when multiple, alternative conservation options are available per site. Although restoration and maintenance measures are beneficial to certain species, they can be harmful to other species with different requirements. This introduces trade-offs between species when identifying which conservation action is best applied to which site. The thesis describes how the strength of such trade-offs can be identified, which is useful for assessing consequences of conservation decisions regarding species priorities and budget. Furthermore, the results of the thesis indicate that spatial SHDMs can be successfully used to account for species-specific requirements for spatial cohesion - in the reserve selection (single-option) context as well as in the multi-option context. Accounting for the spatial requirements of multiple species and allowing for several conservation options is however complicated, due to trade-offs in species requirements. It is also shown that spatial SHDMs can be successfully used for gaining information on factors that drive a species spatial distribution. Such information is valuable to conservation planning, as better knowledge on species requirements facilitates the design of networks for species persistence. This methods and results described in this thesis aim to improve species probabilities of persistence, by taking better account of species habitat and spatial requirements. Many real-world conservation planning problems are characterised by a variety of conservation options related to protection, restoration and maintenance of habitat. Planning tools therefore need to be able to incorporate multiple conservation options per site, in order to continue the search for cost-effective conservation planning solutions. Simultaneously, the spatial requirements of species need to be considered. The methods described in this thesis offer a starting point for combining these two relevant aspects of conservation planning.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

By detecting leading protons produced in the Central Exclusive Diffractive process, p+p → p+X+p, one can measure the missing mass, and scan for possible new particle states such as the Higgs boson. This process augments - in a model independent way - the standard methods for new particle searches at the Large Hadron Collider (LHC) and will allow detailed analyses of the produced central system, such as the spin-parity properties of the Higgs boson. The exclusive central diffractive process makes possible precision studies of gluons at the LHC and complements the physics scenarios foreseen at the next e+e− linear collider. This thesis first presents the conclusions of the first systematic analysis of the expected precision measurement of the leading proton momentum and the accuracy of the reconstructed missing mass. In this initial analysis, the scattered protons are tracked along the LHC beam line and the uncertainties expected in beam transport and detection of the scattered leading protons are accounted for. The main focus of the thesis is in developing the necessary radiation hard precision detector technology for coping with the extremely demanding experimental environment of the LHC. This will be achieved by using a 3D silicon detector design, which in addition to the radiation hardness of up to 5×10^15 neutrons/cm2, offers properties such as a high signal-to- noise ratio, fast signal response to radiation and sensitivity close to the very edge of the detector. This work reports on the development of a novel semi-3D detector design that simplifies the 3D fabrication process, but conserves the necessary properties of the 3D detector design required in the LHC and in other imaging applications.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This dissertation concerns the Punan Vuhang, former hunter-gatherers who are now part-time farmers living in an area of remote rainforest in the Malaysian state of Sarawak. It covers two themes: first, examining their methods of securing a livelihood in the rainforest, and second looking at their adaptation to a settled life and agriculture, and their response to rapid and large-scale commercial logging. This study engages the long-running debates among anthropologists and ecologists on whether recent hunting-gathering societies were able to survive in the tropical rainforest without dependence on farming societies for food resources. In the search for evidence, the study poses three questions: What food resources were available to rainforest hunter-gatherers? How did they hunt and gather these foods? How did they cope with periodic food shortages? In fashioning a life in the rainforest, the Punan Vuhang survived resource scarcity by developing adaptive strategies through intensive use of their knowledge of the forest and its resources. They also adopted social practices such as sharing and reciprocity, and resource tenure to sustain themselves without recourse to external sources of food. In the 1960s, the Punan Vuhang settled down in response to external influences arising in part from the Indonesian-Malaysian Confrontation. This, in turn, initiated a series of processes with political, economic and religious implications. However, elements of the traditional economy have remained resilient as the people continue to hunt, fish and gather, and are able to farm on an individual basis, unlike neighboring shifting cultivators who need to cooperate with each other. At the beginning of the 21st century, the Punan Vuhang face a new challenge arising from the issue of rights in the context of the state and national law and large-scale commercial logging in their forest habitat. The future seems bleak as they face the social problems of alcoholism, declining leadership, and dependence on cash income and commodities from the market.