911 results for Query complexity


Relevance: 20.00%

Abstract:

In today's fast-paced and interconnected digital world, the data generated by an increasing number of applications is being modeled as dynamic graphs. The graph structure encodes relationships among data items, while the structural changes to the graphs as well as the continuous stream of information produced by the entities in these graphs make them dynamic in nature. Examples include social networks where users post status updates, images, videos, etc.; phone call networks where nodes may send text messages or place phone calls; road traffic networks where the traffic behavior of the road segments changes constantly; and so on. There is tremendous value in storing, managing, and analyzing such dynamic graphs and deriving meaningful insights in real time. However, a majority of the work in graph analytics assumes a static setting, and there is a lack of systematic study of the various dynamic scenarios, the complexity they impose on the analysis tasks, and the challenges in building efficient systems that can support such tasks at a large scale. In this dissertation, I design a unified streaming graph data management framework and develop prototype systems to support increasingly complex tasks on dynamic graphs. In the first part, I focus on the management and querying of distributed graph data. I develop a hybrid replication policy that monitors the read-write frequencies of the nodes to decide dynamically what data to replicate, and whether to use eager or lazy replication, in order to minimize network communication and support low-latency querying. In the second part, I study parallel execution of continuous neighborhood-driven aggregates, where each node aggregates the information generated in its neighborhood. I build my system around the notion of an aggregation overlay graph, a pre-compiled data structure that enables sharing of partial aggregates across different queries and also allows partial pre-computation of the aggregates to minimize query latencies and increase throughput. Finally, I extend the framework to support continuous detection and analysis of activity-based subgraphs, where subgraphs can be specified using both graph structure and activity conditions on the nodes. Queries in my system are expressed using a set of active structural primitives, which allows the query evaluator to use a set of novel optimization techniques, thereby achieving high throughput. Overall, in this dissertation, I define and investigate a set of novel tasks on dynamic graphs, design scalable optimization techniques, build prototype systems, and show the effectiveness of the proposed techniques through extensive evaluation using large-scale real and synthetic datasets.
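To make the overlay idea concrete, here is a minimal, hypothetical sketch of sharing pre-computed partial aggregates across continuous 1-hop neighborhood queries; the class and method names are ours, not those of the dissertation's prototype systems:

```python
from collections import defaultdict

# A toy illustration (not the dissertation's system) of sharing partial
# aggregates across continuous 1-hop neighborhood queries.
class StreamingGraph:
    def __init__(self):
        self.adj = defaultdict(set)      # node -> neighbors
        self.partial = defaultdict(int)  # node -> pre-computed partial aggregate
                                         # (here: count of events at that node)

    def add_edge(self, u, v):
        self.adj[u].add(v)
        self.adj[v].add(u)

    def on_event(self, node):
        # Each streamed event updates one shared partial aggregate;
        # every query whose neighborhood contains `node` reuses it.
        self.partial[node] += 1

    def neighborhood_count(self, node):
        # Query latency stays low because per-node partials are maintained
        # continuously rather than recomputed from the raw stream.
        return self.partial[node] + sum(self.partial[n] for n in self.adj[node])

g = StreamingGraph()
g.add_edge("a", "b"); g.add_edge("b", "c")
for ev in ["a", "b", "b", "c"]:
    g.on_event(ev)
print(g.neighborhood_count("b"))  # 4: b's own events plus a's and c's
```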

Relevance: 20.00%

Abstract:

In this paper we extend recent results of Fiorini et al. on the extension complexity of the cut polytope and related polyhedra. We first describe a lifting argument that shows exponential extension complexity for a number of NP-complete problems, including subset-sum and three-dimensional matching. We then obtain a relationship between the extension complexity of the cut polytope of a graph and that of its graph minors. Using this, we are able to show exponential extension complexity for the cut polytope of a large number of graphs, including those used in quantum information and suspensions of cubic planar graphs.
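For readers outside the area, the central quantity can be stated compactly. The following is the standard definition of extension complexity (due to Yannakakis and used by Fiorini et al.), not a result of this paper:

```latex
% Standard definition: the extension complexity of a polytope P is the minimum
% number of facets over all higher-dimensional polytopes Q that project onto P.
\[
\mathrm{xc}(P) \;=\; \min\{\, \#\text{facets}(Q) \;:\; Q \text{ a polytope},\;
P = \pi(Q) \text{ for some affine map } \pi \,\}.
\]
% "Exponential extension complexity" for the cut polytope means
% xc(CUT(n)) = 2^{\Omega(n)} (Fiorini et al.); the lifting argument transfers
% such lower bounds to polytopes associated with other NP-complete problems.
```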

Relevance: 20.00%

Abstract:

Objectives: To analyze the relationship between pharmacotherapeutic complexity and attainment of therapeutic objectives in HIV+ patients on antiretroviral treatment and concomitant dyslipidemia therapy. Materials and methods: A retrospective observational study including HIV patients on stable antiretroviral treatment for the previous 6 months and on dyslipidemia treatment between January and December 2013. The complexity index was calculated with the tool developed by McDonald et al. Other variables analyzed were: age, gender, HIV risk factor, smoking, alcoholism and drug use, psychiatric disorders, adherence to antiretroviral treatment and lipid-lowering drugs, and clinical parameters (HIV viral load, CD4 count, plasma levels of total cholesterol, LDL, HDL, and triglycerides). To determine the predictive factors associated with attainment of therapeutic objectives, univariate analysis was conducted through logistic regression, followed by a multivariate analysis. Results: The study included 89 patients; 56.8% of them met the therapeutic objectives for dyslipidemia. The complexity index was significantly higher (p = 0.02) in those patients who did not reach the objective values (median 51.8 vs. 38.9). Adherence to lipid-lowering treatment was significantly associated with attainment of the therapeutic objectives established for dyslipidemia treatment. Overall, 67.0% of patients met the objectives for their antiretroviral treatment; however, the complexity index was not significantly higher (p = 0.06) in those patients who did not meet said objectives. Conclusions: Pharmacotherapeutic complexity is a key factor in achieving health objectives in HIV+ patients on treatment for dyslipidemia.
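The statistical workflow described above, univariate screening followed by a multivariate model, can be sketched as follows; this is an illustrative reconstruction with simulated data and hypothetical variable names, not the study's analysis code:

```python
import numpy as np
import statsmodels.api as sm

# Simulated stand-ins for the study's data (n = 89 patients).
rng = np.random.default_rng(0)
n = 89
X = np.column_stack([
    rng.normal(45, 20, n),      # complexity index (McDonald et al. tool)
    rng.integers(0, 2, n),      # adherence to lipid-lowering treatment (0/1)
])
y = rng.integers(0, 2, n)       # met dyslipidemia therapeutic objectives? (0/1)

# Univariate screening: fit one candidate predictor at a time.
for j, name in enumerate(["complexity_index", "adherence"]):
    m = sm.Logit(y, sm.add_constant(X[:, [j]])).fit(disp=0)
    print(name, "p =", round(m.pvalues[1], 3))

# Multivariate model with the retained predictors.
m_full = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
print(m_full.summary())
```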

Relevance: 20.00%

Abstract:

The ability to estimate the impact of ongoing climate change on the hydrological behaviour of hydro-systems is essential for anticipating the inevitable and necessary adaptations our societies must consider. In this context, this doctoral project evaluates the sensitivity of future hydrological projections to: (i) the non-robustness of hydrological model parameter identification, (ii) the use of several equifinal parameter sets, and (iii) the use of different hydrological model structures. To quantify the impact of the first source of uncertainty on model outputs, four climatically contrasted sub-periods are first identified within the observed records. The models are calibrated on each of these four periods, and the resulting outputs are analysed in calibration and in validation following the four configurations of the differential split-sample test (Klemeš, 1986; Wilby, 2005; Seiller et al., 2012; Refsgaard et al., 2014). To study the second source of uncertainty, the equifinality of parameter sets is then taken into account by considering, for each calibration type, the outputs associated with equifinal parameter sets. Finally, to assess the third source of uncertainty, five hydrological models of different levels of complexity (GR4J, MORDOR, HSAMI, SWAT and HYDROTEL) are applied to the Au Saumon catchment in Quebec. The three sources of uncertainty are evaluated both under past observed climatic conditions and under future climatic conditions. The results show that, given the evaluation method followed in this doctorate, the use of hydrological models of different levels of complexity is the main source of variability in discharge projections under future climatic conditions, followed by the lack of robustness of parameter identification. Hydrological projections generated by an ensemble of equifinal parameter sets remain close to those associated with the optimal parameter set. Consequently, more effort should be invested in improving model robustness for climate change impact studies, notably by developing more appropriate model structures and by proposing calibration procedures that increase their robustness. This work provides a detailed answer regarding our capacity to diagnose the impacts of climate change on the water resources of the Au Saumon basin, and proposes an original methodological framework that can be directly applied or adapted to other hydro-climatic contexts.
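The differential split-sample logic can be sketched generically; `calibrate`, `simulate` and `score` below are stand-in callables for whichever hydrological model and criterion is used, so this illustrates the structure of the test rather than the thesis's actual tooling:

```python
from itertools import permutations

# Generic differential split-sample test: calibrate on one climatically
# contrasted sub-period, validate on every other, and record the performance.
def dsst(subperiods, calibrate, simulate, score):
    """subperiods: dict mapping name -> (forcings, observed_flows)."""
    results = {}
    for cal, val in permutations(subperiods, 2):
        params = calibrate(*subperiods[cal])           # e.g. wet -> dry transfer
        simulated = simulate(params, subperiods[val][0])
        results[(cal, val)] = score(simulated, subperiods[val][1])
    return results  # low validation scores expose non-robust parameter sets

# Usage sketch: dsst({"wet": ..., "dry": ...}, my_calibrate, my_simulate, nse)
```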

Relevance: 20.00%

Abstract:

In this work we consider several instances of the following problem: "how complicated can the isomorphism relation for countable models be?" Using the Borel reducibility framework, we investigate this question with regard to the space of countable models of particular complete first-order theories. We also investigate to what extent this complexity is mirrored in the number of back-and-forth inequivalent models of the theory. We consider this question for two large and related classes of theories. First, we consider o-minimal theories, showing that if T is o-minimal, then the isomorphism relation is either Borel complete or Borel. Further, if it is Borel, we characterize exactly which values can occur, and when they occur. In all cases Borel completeness implies lambda-Borel completeness for all lambda. Second, we consider colored linear orders, which are (complete theories of) a linear order expanded by countably many unary predicates. We discover the same characterization as with o-minimal theories, taking the same values, with the exception that all finite values are possible except two. We characterize exactly when each possibility occurs, which is similar to the o-minimal case. Additionally, we extend Schirrman's theorem, showing that if the language is finite, then T is countably categorical or Borel complete. As before, in all cases Borel completeness implies lambda-Borel completeness for all lambda.
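The Borel reducibility framework referred to above rests on a single standard definition, recalled here for context (it is the usual notion, not something specific to this thesis):

```latex
% For equivalence relations E on X and F on Y (X, Y Polish spaces),
% E is Borel reducible to F when a Borel map carries E-classes to F-classes:
\[
E \le_B F \iff \exists\, f : X \to Y \text{ Borel such that }\;
x \mathrel{E} x' \;\Longleftrightarrow\; f(x) \mathrel{F} f(x')
\quad \text{for all } x, x' \in X.
\]
% A theory T is "Borel complete" when the isomorphism relation on its countable
% models is maximal under \le_B among isomorphism relations of classes of
% countable structures.
```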

Relevance: 20.00%

Abstract:

Edge-labeled graphs have proliferated rapidly over the last decade due to the increased popularity of social networks and the Semantic Web. In social networks, relationships between people are represented by edges, and each edge is labeled with a semantic annotation. Hence, a huge single graph can express many different relationships between entities. The Semantic Web represents each single fragment of knowledge as a triple (subject, predicate, object), which is conceptually identical to an edge from subject to object labeled with the predicate. A set of triples constitutes an edge-labeled graph on which knowledge inference is performed. Subgraph matching has been extensively used as a query language for patterns in the context of edge-labeled graphs. For example, in social networks, users can specify a subgraph matching query to find all people that have certain neighborhood relationships. Heavily used fragments of the SPARQL query language for the Semantic Web and graph queries of other graph DBMSs can also be viewed as subgraph matching over large graphs. Though subgraph matching has been extensively studied as a query paradigm in the Semantic Web and in social networks, a user can get a large number of answers in response to a query. These answers can be shown to the user in accordance with an importance ranking. In this thesis proposal, we present four different scoring models along with scalable algorithms to find the top-k answers via a suite of intelligent pruning techniques. The suggested models cover a practically important subset of the SPARQL query language augmented with some additional useful features. The first model, called Substitution Importance Query (SIQ), identifies the top-k answers whose scores are calculated from matched vertices' properties in each answer in accordance with a user-specified notion of importance. The second model, called Vertex Importance Query (VIQ), identifies important vertices in accordance with a user-defined scoring method that builds on top of various subgraphs articulated by the user. Approximate Importance Query (AIQ), our third model, allows partial and inexact matchings and returns the top-k of them under user-specified approximation terms and scoring functions. In the fourth model, called Probabilistic Importance Query (PIQ), a query consists of several sub-blocks: one mandatory block that must be mapped and other blocks that can be opportunistically mapped. The probability is calculated from various aspects of the answers, such as the number of mapped blocks and the vertices' properties in each block, and the top-k most probable answers are returned. An important distinguishing feature of our work is that we allow the user a huge amount of freedom in specifying: (i) what pattern and approximation he considers important, (ii) how to score answers, irrespective of whether they are vertices or substitutions, and (iii) how to combine and aggregate scores generated by multiple patterns and/or multiple substitutions. Because so much power is given to the user, indexing is more challenging than in situations where additional restrictions are imposed on the queries the user can ask. The proposed algorithms for the first model can also be used for answering SPARQL queries with ORDER BY and LIMIT, and the method for the second model also works for SPARQL queries with GROUP BY, ORDER BY and LIMIT. We test our algorithms on multiple real-world graph databases, showing that our algorithms are far more efficient than popular triple stores.
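The pruning idea common to top-k answering, stop scoring candidates once no remaining one can enter the top-k, can be illustrated with a generic threshold-style sketch; this is our own simplification, not the thesis's algorithms:

```python
import heapq

# Threshold-style top-k pruning: candidates arrive with an upper bound on
# their score; once k answers beat the best remaining bound, the rest are
# pruned without ever being scored exactly.
def topk_answers(candidates, exact_score, k):
    """candidates: list of (upper_bound, answer); exact_score: answer -> float."""
    candidates.sort(key=lambda c: -c[0])   # best upper bound first
    heap = []                              # min-heap of (score, answer)
    for ub, ans in candidates:
        if len(heap) == k and heap[0][0] >= ub:
            break                          # no unseen candidate can enter top-k
        s = exact_score(ans)
        if len(heap) < k:
            heapq.heappush(heap, (s, ans))
        elif s > heap[0][0]:
            heapq.heapreplace(heap, (s, ans))
    return sorted(heap, reverse=True)

print(topk_answers([(9, "a"), (7, "b"), (3, "c")],
                   {"a": 8, "b": 5, "c": 2}.get, 2))  # [(8, 'a'), (5, 'b')]
```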

Relevance: 20.00%

Abstract:

Homomorphic encryption is a particular type of encryption method that enables computing over encrypted data. This has a wide range of real-world applications, such as having a remote server compute a search result without learning the content of the query. In the first part of this thesis, we discuss how database search queries can be made secure using a homomorphic encryption scheme based on the ideas of Gahi et al. Gahi's method is built on the integer-based fully homomorphic encryption scheme proposed by van Dijk et al. We propose a new database search scheme called the Homomorphic Query Processing Scheme, which can be used with the ring-based fully homomorphic encryption scheme proposed by Brakerski. In the second part of this thesis, we discuss the cybersecurity of the smart electric grid. Specifically, we use the Homomorphic Query Processing scheme to construct a keyword search technique for the smart grid. Our work is based on the Public Key Encryption with Keyword Search (PEKS) method introduced by Boneh et al. and a Multi-Key Homomorphic Encryption scheme proposed by López-Alt et al. A summary of the results of this thesis (specifically the Homomorphic Query Processing Scheme) was published at the 14th Canadian Workshop on Information Theory (CWIT).
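The flavor of integer-based homomorphic computation underlying van Dijk et al.'s construction can be shown with a deliberately insecure toy version; the parameters below are far too small for any real use and serve only to illustrate why adding and multiplying ciphertexts acts on the hidden bits:

```python
import secrets

# Toy, insecure sketch of the integer-based somewhat-homomorphic idea:
# ciphertext c = m + 2r + p*q hides bit m under secret odd key p;
# (c mod p) mod 2 recovers m as long as the noise term 2r + m stays small.
p = 1000003                    # odd secret key (toy size)

def enc(m):
    r = secrets.randbelow(50)  # small noise
    q = secrets.randbelow(1000)
    return m + 2 * r + p * q

def dec(c):
    return (c % p) % 2

a, b = enc(1), enc(0)
print(dec(a + b))              # 1: homomorphic XOR of the plaintext bits
print(dec(a * b))              # 0: homomorphic AND of the plaintext bits
```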

Relevance: 20.00%

Abstract:

The farm-gate value of extensive beef production from the northern Gulf region of Queensland, Australia, is ~$150 million annually. Poor profitability and declining equity are common issues for most beef businesses in the region. The beef industry relies primarily on native pasture systems and studies continue to report a decline in the condition and productivity of important land types in the region. Governments and Natural Resource Management groups are investing significant resources to restore landscape health and productivity. Fundamental community expectations also include broader environmental outcomes such as reducing beef industry greenhouse gas emissions. Whole-of-business analysis results are presented from 18 extensive beef businesses (producers) to highlight the complex social and economic drivers of management decisions that impact on the natural resource and environment. Business analysis activities also focussed on improving enterprise performance. Profitability, herd performance and greenhouse emission benchmarks are documented and discussed.

Relevance: 20.00%

Abstract:

Nitrogen (N) is an essential plant nutrient in maize production and, when only natural sources are considered, is often the factor limiting grain yield worldwide. For this reason, many farmers around the world supplement available soil N with synthetic forms. Years of over-application of N fertilizer have led to increased N in groundwater and streams due to leaching and run-off from agricultural sites. In the Midwest Corn Belt, much of this excess N eventually makes its way to the Gulf of Mexico, leading to eutrophication (an increase in phytoplankton) and a hypoxic (reduced-oxygen) dead zone. Growing concerns about these problems and the desire for greater input-use efficiency have led to demand for crops with improved N use efficiency (NUE), which would allow reduced N fertilizer application rates and subsequently lower N pollution. It is well known that roots are responsible for N uptake by plants, but relatively little is known about how root architecture affects this ability. This research was conducted to better understand the influence of root complexity (RC) in maize on a plant's response to N stress, as well as the influence of RC on other above-ground plant traits. Thirty-one above-ground plant traits were measured for 64 recombinant inbred lines (RILs) from the intermated B73 x Mo17 (IBM) population and their backcrosses (BCs) to either parent, B73 or Mo17, under normal (182 kg N ha-1) and N-deficient (0 kg N ha-1) conditions. The RILs were selected based on results from an earlier experiment by Novais et al. (2011), which screened 232 RILs from the IBM population to obtain root complexity measurements. The 64 selected RILs comprised 31 of the lowest-complexity RILs (RC1) and 33 of the highest-complexity RILs (RC2) in terms of root architecture (characterized as fractal dimensions). The use of the parental BCs makes the experiment a Design III, an experimental design developed by Comstock and Robinson (1952) that allows estimation of the significance and degree of dominance. Of the 31 traits measured, 12 were whole-plant traits chosen for their documented response to N stress. The other 19 were ear traits commonly measured for their influence on yield. Results showed that genotypes from RC1 and RC2 differ significantly for several above-ground phenotypes. We also observed a difference in the number and magnitude of N treatment responses between the two RC classes. Differences in phenotypic trait correlations, and in how these correlations change in response to N, were also observed between the RC classes. RC did not show a strong correlation with calculated NUE (ΔYield/ΔN). Quantitative genetic analysis utilizing the Design III revealed significant dominance effects acting on several traits, as well as changes in dominance significance and level between N treatments. QTL were mapped for 26 of the 31 traits, and significant N effects were observed across the majority of the genome for some N-stress-indicative traits (e.g., stay-green). This research and related projects are essential steps toward a better understanding of plant N uptake and metabolism, which in turn is necessary for progress toward breeding crops with better NUE.
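The NUE measure used above is simply the yield response per unit of applied N; with the two treatments in this study (0 and 182 kg N ha-1) it reduces to a one-line calculation. The yields below are invented placeholders for illustration, not the study's data:

```python
# NUE = ΔYield / ΔN between the N-deficient (0 kg N/ha) and normal
# (182 kg N/ha) treatments. Yield values are hypothetical.
yield_high_n = 9.5   # t/ha at 182 kg N/ha  (hypothetical)
yield_zero_n = 5.2   # t/ha at   0 kg N/ha  (hypothetical)
nue = (yield_high_n - yield_zero_n) * 1000 / (182 - 0)  # kg grain per kg N
print(f"NUE = {nue:.1f} kg grain per kg N applied")     # ≈ 23.6
```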

Relevance: 20.00%

Abstract:

Purpose: Current thinking about 'patient safety' emphasises the causal relationship between the work environment and the delivery of clinical care. This research draws on the theory of Normal Accidents to extend this analysis and better understand the 'organisational factors' that threaten safety. Methods: Ethnographic research methods were used, with observations of the operating department setting over 18 months and interviews with 80 members of hospital staff. The setting for the study was the Operating Department of a large teaching hospital in the North-West of England. Results: The work of the operating department is determined by interdependent, 'tightly coupled' organisational relationships between hospital departments, based upon the timely exchange of information, services and resources required for the delivery of care. Failures within these processes, manifest as 'breakdowns' in inter-departmental relationships, lead to situations of constraint, rapid change and uncertainty in the work of the operating department that require staff to break with established routines and work under increased time and emotional pressures. This means that staff focus on working quickly, as opposed to working safely. Conclusion: Analysis of safety needs to move beyond a focus on the immediate work environment and individual practice to consider the more complex and deeply structured organisational systems of hospital activity. For departmental managers, the scope for service planning to control for safety may be limited, as the structured 'real world' situation of service delivery is shaped by inter-departmental and organisational factors that are perhaps beyond the scope of departmental management.

Relevance: 20.00%

Abstract:

Image (video) retrieval is the problem of retrieving images (videos) similar to a query. Images (videos) are represented in an input (feature) space, and similar items are obtained by finding nearest neighbors in that representation space. Numerous input representations, in both real-valued and binary spaces, have been proposed for conducting faster retrieval. In this thesis, we present techniques that obtain improved input representations for retrieval in both supervised and unsupervised settings, for images and videos. Supervised retrieval is the well-known problem of retrieving images of the same class as the query. In the first part, we address the practical aspects of achieving faster retrieval with binary codes as input representations for the supervised setting, where binary codes are used as addresses into hash tables. In practice, using binary codes as addresses does not guarantee fast retrieval, because similar images are not mapped to the same binary code (address). We address this problem by presenting an efficient supervised hashing (binary encoding) method that aims to explicitly map all images of the same class to, ideally, a unique binary code. We refer to the binary codes of the images as 'semantic binary codes' and to the unique code for all same-class images as the 'class binary code'. We also propose a new class-based Hamming metric that dramatically reduces retrieval times for larger databases, where Hamming distance is computed only to the class binary codes. We further propose a deep semantic binary code model, obtained by replacing the output layer of a popular convolutional neural network (AlexNet) with the class binary codes, and show that the hashing functions learned in this way outperform the state of the art while providing fast retrieval times. In the second part, we address the problem of supervised retrieval by taking into account the relationships between classes. For a given query image, we want to retrieve images that preserve the relative order, i.e., we want to retrieve all same-class images first, and then images of related classes before those of different classes. We learn such relationship-aware binary codes by minimizing the discrepancy between the inner products of the binary codes and the similarities between the classes. We calculate the similarity between classes using output embedding vectors, which are vector representations of classes. Our method deviates from other supervised binary encoding schemes in that it is the first to use output embeddings for learning hashing functions. We also introduce new performance metrics that take into account related-class retrieval results, and show significant gains over the state of the art. High-dimensional descriptors like Fisher Vectors or the Vector of Locally Aggregated Descriptors have been shown to improve the performance of many computer vision applications, including retrieval. In the third part, we discuss an unsupervised technique for compressing high-dimensional vectors into high-dimensional binary codes to reduce storage complexity. In this approach, we deviate from traditional hyperplane hashing functions and instead learn hyperspherical hashing functions. The proposed method overcomes the computational challenges of directly applying the spherical hashing algorithm, which is intractable for compressing high-dimensional vectors. A practical hierarchical model that uses divide-and-conquer techniques, via the Random Select and Adjust (RSA) procedure, to compress such high-dimensional vectors is presented. We show that our proposed high-dimensional binary codes outperform binary codes obtained using traditional hyperplane methods at higher compression ratios. In the last part of the thesis, we propose a retrieval-based solution to the zero-shot event classification problem, a setting where no training videos are available for the event. To do this, we learn a generic set of concept detectors and represent both videos and query events in the concept space. We then compute the similarity between the query event and each video in the concept space, and videos similar to the query event are classified as belonging to the event. We show that we significantly boost performance using concept features from other modalities.
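As a minimal illustration of retrieval with binary codes, the following toy sketch (random codes, not the thesis's learned hashing functions) ranks database items by Hamming distance to the query code; with class binary codes, the same computation would run against one code per class rather than one per item:

```python
import numpy as np

# Toy binary-code retrieval: store items as bit vectors and rank them by
# Hamming distance (XOR then popcount) to the query's code.
rng = np.random.default_rng(0)
db_codes = rng.integers(0, 2, size=(1000, 64), dtype=np.uint8)  # 1000 items
query = rng.integers(0, 2, size=64, dtype=np.uint8)             # 64-bit code

hamming = (db_codes ^ query).sum(axis=1)  # distance to every stored code
top_k = np.argsort(hamming)[:5]           # indices of the 5 nearest items
print(top_k, hamming[top_k])
```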


Relevance: 20.00%

Abstract:

In recent years, technological improvements have enabled internet users to retrieve and analyze data about Internet searches, and this data has been used in several fields of study. Some authors have used search engine query data to forecast economic variables, to detect areas of influenza activity, or to show that some patterns in stock market indexes can be captured. This paper presents an investment strategy that uses Google Trends' weekly query data for the constituents of major global stock market indexes. The results suggest that it is indeed possible to achieve higher Info Sharpe ratios than those provided by a buy-and-hold strategy over the period considered, especially for the major European stock market indexes.
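The evaluation logic can be sketched as follows; the weekly returns and the trading signal below are simulated stand-ins (the paper's actual signal construction from Google Trends data is not reproduced), so this only shows how a strategy's Sharpe ratio would be compared against buy-and-hold:

```python
import numpy as np

# Compare the annualized Sharpe ratio of a query-driven long/flat strategy
# against buy-and-hold, using simulated weekly data.
rng = np.random.default_rng(1)
weekly_returns = rng.normal(0.001, 0.02, 520)  # ~10 years of weekly returns
signal = rng.integers(0, 2, 520)               # stand-in for a Trends signal

def sharpe(r, periods_per_year=52):
    return r.mean() / r.std(ddof=1) * np.sqrt(periods_per_year)

strategy_returns = weekly_returns * signal     # invested only when signal is 1
print("buy-and-hold Sharpe:", round(sharpe(weekly_returns), 2))
print("strategy Sharpe:   ", round(sharpe(strategy_returns), 2))
```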
