931 results for Processing Graph
Abstract:
This thesis aims at empowering software customers with a tool to build software tests themselves, based on a gradual refinement of natural language scenarios into executable visual test models. The process is divided into five steps: 1. First, a natural language parser is used to extract a graph of grammatical relations from the textual scenario descriptions. 2. The resulting graph is transformed into an informal story pattern by interpreting structurization rules based on Fujaba Story Diagrams. 3. While the informal story pattern can already be used by humans, the diagram still lacks technical details, especially type information. To add them, a recommender-based framework uses web sites and other resources to generate formalization rules. 4. As a preparation for the code generation, the classes derived for formal story patterns are aligned across all story steps, substituting a class diagram. 5. Finally, a headless version of Fujaba is used to generate an executable JUnit test. The graph transformations used in the browser application are specified in a textual domain-specific language and visualized as story patterns. Last but not least, only the heavyweight parsing (step 1) and code generation (step 5) are executed on the server side. All graph transformation steps (2, 3 and 4) are executed in the browser by an interpreter written in JavaScript/GWT. This result paves the way for online collaboration between global teams of software customers, IT business analysts and software developers.
Abstract:
The Internet of Things (IoT) consists of a worldwide “network of networks,” composed of billions of interconnected heterogeneous devices denoted as things or “Smart Objects” (SOs). Significant research efforts have been dedicated to porting the experience gained in the design of the Internet to the IoT, with the goal of maximizing interoperability, using the Internet Protocol (IP) and designing specific protocols like the Constrained Application Protocol (CoAP), which have been widely accepted as drivers for the effective evolution of the IoT. This first wave of standardization can be considered successfully concluded and we can assume that communication with and between SOs is no longer an issue. At this time, to favor the widespread adoption of the IoT, it is crucial to provide mechanisms that facilitate IoT data management and the development of services enabling a real interaction with things. Several reference IoT scenarios have real-time or predictable latency requirements, dealing with billions of devices collecting and sending an enormous quantity of data. These features create a new need for architectures specifically designed to handle this scenario, here denoted as “Big Stream”. In this thesis a new Big Stream Listener-based Graph architecture is proposed. Another important step is to build more applications around the Web model, bringing about the Web of Things (WoT). As several IoT testbeds have been focused on evaluating lower-layer communication aspects, this thesis proposes a new WoT Testbed aiming at allowing developers to work with a high level of abstraction, without worrying about low-level details. Finally, an innovative SOs-driven User Interface (UI) generation paradigm for mobile applications in heterogeneous IoT networks is proposed, to simplify interactions between users and things.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
The period we live in marks the cusp of a strong and rapid evolution in natural language understanding, achieved mainly through the development of neural models. In information extraction, these advances have recently made it possible to effectively recognize complex semantic relations between entities mentioned in text, such as proteins, symptoms and drugs. This task, made possible by event modeling, is fundamental in biomedicine, where the exponential growth in the number of scientific publications further increases the need for systems that automatically extract the interactions contained in textual documents. Combining symbolic and sub-symbolic AI can allow known structured knowledge to be injected into language models, making them more robust, factual and interpretable. In this context, graph verbalization is one of the tasks on which the greatest expectations rest. Despite the importance of such contributions (from chatbot development to the formulation of new research hypotheses), to date there are no contributions capable of verbalizing the biomedical events expressed in the literature by learning the link between interactions expressed in graph form and their textual counterpart. The thesis proposes the first highly comprehensive dataset of event-text pairs, covering several biomedical sub-areas such as infectious diseases, cancer research and molecular biology. The dataset is used as the basis for training state-of-the-art generative models on the verbalization task, adopting a text-to-text approach and illustrating a formal technique for encoding event graphs as augmented text. Finally, the thesis demonstrates the value of events for improving the comprehension abilities of neural models on other NLP tasks, focusing on single-document summarization and multi-task learning.
Abstract:
With the rise of cloud computing, demand for data processing applications has grown, and achieving greater efficiency in data processing centers has therefore become important. The goal of this work is to obtain tools for analyzing the feasibility and profitability of designing data centers specialized for data processing, with an adapted architecture, cooling systems, and so on. Some data processing applications benefit from software architectures, while for others processing on a hardware architecture may be more efficient. Because software with very good results in graph processing already exists, such as the XPregel system, this project will develop a hardware architecture in VHDL that implements Google's PageRank algorithm in a scalable way. This algorithm was chosen because it could be more efficient on a hardware architecture, owing to specific characteristics that are described later. PageRank ranks pages by their relevance on the web using graph theory: each web page is a vertex of a graph, and the links between pages are the edges of that graph. The project will first analyze the state of the art. The implementation in XPregel, a graph processing system, is assumed to be one of the most efficient, so this implementation will be studied. However, because XPregel processes graph algorithms in general, it does not take certain characteristics of PageRank into account, so the implementation is not optimal. In PageRank, storing all the data that a single vertex sends is an unnecessary use of memory, since all the messages a vertex sends are equal to one another and equal to its PageRank.
The VHDL design will take this characteristic of the algorithm into account and avoid storing identical messages several times. PageRank was chosen for implementation in VHDL because current operating system architectures do not scale adequately; the aim is to evaluate whether another architecture yields better results. The design will start from scratch, using the automatically generated ROM memory from the Xilinx IP core (Xilinx being the VHDL development software). Four types of modules are planned so that processing can be done in parallel. The structure of XPregel will be simplified in an attempt to exploit the PageRank property mentioned above, which XPregel does not take full advantage of. The code will then be written with a scalable structure, since the computation involves millions of web pages. Next, the code will be synthesized and tested on an FPGA. The last step will be an evaluation of the implementation and of possible improvements in terms of consumption.
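The memory-saving observation in the abstract above can be illustrated in software. The following is a minimal Python sketch, not the VHDL design described in the abstract: it keeps one outgoing value per vertex instead of one message per edge. The function name, the damping factor of 0.85, the handling of vertices without out-links, and the standard convention of dividing a vertex's rank by its out-degree are all assumptions made for illustration.

```python
def pagerank(out_edges, damping=0.85, iterations=50):
    """Iterative PageRank; out_edges maps each vertex to the vertices it links to.

    Every link target must also appear as a key of out_edges.
    """
    vertices = list(out_edges)
    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}
    for _ in range(iterations):
        # One shared outgoing value per vertex, reused for all of its edges,
        # instead of a separately stored message per edge.
        contrib = {v: rank[v] / len(out_edges[v]) for v in vertices if out_edges[v]}
        # Rank held by vertices without out-links is spread evenly (assumption).
        dangling = sum(rank[v] for v in vertices if not out_edges[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in vertices}
        for v, targets in out_edges.items():
            for t in targets:
                new_rank[t] += damping * contrib[v]
        rank = new_rank
    return rank


# Hypothetical four-page web used only as an example.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
```

In this example page `c`, which receives links from all other pages, ends up with the highest rank, and the ranks sum to 1.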
Abstract:
Graph analytics is an important and computationally demanding class of data analytics. It is essential to balance scalability, ease of use and high performance in large-scale graph analytics. As such, it is necessary to hide the complexity of parallelism, data distribution and memory locality behind an abstract interface. The aim of this work is to build a scalable, NUMA-aware graph analytics framework that does not demand significant parallel programming experience.
The realization of such a system faces two key problems:
(i) how to develop a scale-free parallel programming framework that scales efficiently across NUMA domains; (ii) how to efficiently apply graph partitioning in order to create separate and largely independent work items that can be distributed among threads.
Abstract:
Edge-labeled graphs have proliferated rapidly over the last decade due to the increased popularity of social networks and the Semantic Web. In social networks, relationships between people are represented by edges and each edge is labeled with a semantic annotation. Hence, a huge single graph can express many different relationships between entities. The Semantic Web represents each single fragment of knowledge as a triple (subject, predicate, object), which is conceptually identical to an edge from subject to object labeled with the predicate. A set of triples constitutes an edge-labeled graph on which knowledge inference is performed. Subgraph matching has been extensively used as a query language for patterns in the context of edge-labeled graphs. For example, in social networks, users can specify a subgraph matching query to find all people that have certain neighborhood relationships. Heavily used fragments of the SPARQL query language for the Semantic Web and graph queries of other graph DBMS can also be viewed as subgraph matching over large graphs. Though subgraph matching has been extensively studied as a query paradigm in the Semantic Web and in social networks, a user can get a large number of answers in response to a query. These answers can be shown to the user in accordance with an importance ranking. In this thesis proposal, we present four different scoring models along with scalable algorithms to find the top-k answers via a suite of intelligent pruning techniques. The suggested models consist of a practically important subset of the SPARQL query language augmented with some additional useful features. The first model, called Substitution Importance Query (SIQ), identifies the top-k answers whose scores are calculated from matched vertices' properties in each answer in accordance with a user-specified notion of importance.
The second model, called Vertex Importance Query (VIQ), identifies important vertices in accordance with a user-defined scoring method that builds on top of various subgraphs articulated by the user. Approximate Importance Query (AIQ), our third model, allows partial and inexact matchings and returns the top-k of them with user-specified approximation terms and scoring functions. In the fourth model, called Probabilistic Importance Query (PIQ), a query consists of several sub-blocks: one mandatory block that must be mapped and other blocks that can be opportunistically mapped. The probability is calculated from various aspects of answers, such as the number of mapped blocks and vertices' properties in each block, and the top-k most probable answers are returned. An important distinguishing feature of our work is that we allow the user a huge amount of freedom in specifying: (i) what pattern and approximation the user considers important, (ii) how to score answers, irrespective of whether they are vertices or substitutions, and (iii) how to combine and aggregate scores generated by multiple patterns and/or multiple substitutions. Because so much power is given to the user, indexing is more challenging than in situations where additional restrictions are imposed on the queries the user can ask. The proposed algorithms for the first model can also be used for answering SPARQL queries with ORDER BY and LIMIT, and the method for the second model also works for SPARQL queries with GROUP BY, ORDER BY and LIMIT. We test our algorithms on multiple real-world graph databases, showing that our algorithms are far more efficient than popular triple stores.
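To make the idea of top-k subgraph matching over triples concrete, here is a small Python sketch. It enumerates every match exhaustively and then keeps the k best, so it does not reproduce the pruning or indexing techniques of the thesis. The function names, the `?`-prefix convention for variables, the sample triples and the `importance` table are all hypothetical, and the scoring function (the sum of the matched vertices' importance) is only one possible user-specified score.

```python
import heapq


def match(pattern, triples, binding=None):
    """Yield every variable binding under which all pattern triples occur in the graph."""
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    first, rest = pattern[0], pattern[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


def top_k(pattern, triples, score, k):
    """Return the k highest-scoring bindings (exhaustive search, no pruning)."""
    return heapq.nlargest(k, match(pattern, triples), key=score)


# Hypothetical data: (subject, predicate, object) triples and a vertex property.
triples = [
    ("alice", "follows", "bob"),
    ("bob", "follows", "carol"),
    ("alice", "follows", "carol"),
    ("carol", "follows", "dave"),
]
importance = {"alice": 1, "bob": 5, "carol": 3, "dave": 2}

# Two-edge path pattern: ?x follows ?y, and ?y follows ?z.
pattern = [("?x", "follows", "?y"), ("?y", "follows", "?z")]
best = top_k(pattern, triples, lambda b: sum(importance[v] for v in b.values()), k=2)
```

This plays roughly the role of a SPARQL basic graph pattern followed by ORDER BY and LIMIT; the value of a dedicated algorithm lies in avoiding the full enumeration that this sketch performs.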
Abstract:
Background: Various neuroimaging studies, both structural and functional, have provided support for the proposal that a distributed brain network is likely to be the neural basis of intelligence. The theory of Distributed Intelligent Processing Systems (DIPS), first developed in the field of Artificial Intelligence, was proposed to adequately model distributed neural intelligent processing. In addition, the neural efficiency hypothesis suggests that individuals with higher intelligence display more focused cortical activation during cognitive performance, resulting in lower total brain activation when compared with individuals who have lower intelligence. This may be understood as a property of the DIPS. Methodology and Principal Findings: In our study, a new EEG brain mapping technique, based on the neural efficiency hypothesis and the notion of the brain as a Distributed Intelligence Processing System, was used to investigate the correlations between IQ evaluated with WAIS (Wechsler Adult Intelligence Scale) and WISC (Wechsler Intelligence Scale for Children), and the brain activity associated with visual and verbal processing, in order to test the validity of a distributed neural basis for intelligence. Conclusion: The present results support these claims and the neural efficiency hypothesis.
Abstract:
The advances made in channel-capacity codes, such as turbo codes and low-density parity-check (LDPC) codes, have played a major role in the emerging distributed source coding paradigm. LDPC codes can be easily adapted to new source coding strategies due to their natural representation as bipartite graphs and the use of quasi-optimal decoding algorithms, such as belief propagation. This paper tackles a relevant scenario in distributed video coding: lossy source coding when multiple side information (SI) hypotheses are available at the decoder, each one correlated with the source according to different correlation noise channels. Thus, it is proposed to exploit multiple SI hypotheses through an efficient joint decoding technique with multiple LDPC syndrome decoders that exchange information to obtain coding efficiency improvements. At the decoder side, the multiple SI hypotheses are created with motion-compensated frame interpolation and fused together in a novel iterative LDPC-based Slepian-Wolf decoding algorithm. With the creation of multiple SI hypotheses and the proposed decoding algorithm, bitrate savings of up to 8.0% are obtained for similar decoded quality.
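The principle of syndrome-based Slepian-Wolf coding with side information can be shown on a toy scale. The Python sketch below is an illustration under strong assumptions and not the paper's algorithm: it uses the tiny (7,4) Hamming parity-check code as a stand-in for an LDPC code, a single side information hypothesis, and a one-bit syndrome lookup in place of belief propagation. The names `CHECKS`, `syndrome` and `decode_with_side_info` are hypothetical.

```python
# Bipartite-graph view of a parity-check code: each check node lists the
# variable (bit) nodes it is connected to. This is the (7,4) Hamming code,
# used here only as a small stand-in for an LDPC code.
CHECKS = [[0, 2, 4, 6], [1, 2, 5, 6], [3, 4, 5, 6]]


def syndrome(bits, checks=CHECKS):
    """Parity of the bits attached to each check node."""
    return [sum(bits[i] for i in check) % 2 for check in checks]


def decode_with_side_info(target_syndrome, side_info, checks=CHECKS):
    """Recover the source from its syndrome and side information.

    Assumes the side information differs from the source in at most one bit.
    """
    mismatch = [a ^ b for a, b in zip(target_syndrome, syndrome(side_info, checks))]
    if not any(mismatch):
        return list(side_info)
    for i in range(len(side_info)):
        # Column i of the parity-check matrix: which checks involve bit i.
        column = [1 if i in check else 0 for check in checks]
        if column == mismatch:
            corrected = list(side_info)
            corrected[i] ^= 1
            return corrected
    return list(side_info)


source = [1, 0, 1, 1, 0, 0, 1]
side_info = [1, 0, 1, 1, 1, 0, 1]  # the source with bit 4 flipped
recovered = decode_with_side_info(syndrome(source), side_info)
```

The encoder transmits only the 3-bit syndrome of the 7-bit source, and the decoder corrects the correlated side information until its syndrome matches.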
Abstract:
Vision, Speed, Electroencephalogram, Gamma Band Activity
Abstract:
The amount of biological data has grown exponentially in recent decades. Modern biotechnologies, such as microarrays and next-generation sequencing, are capable of producing massive amounts of biomedical data in a single experiment. As the amount of data is rapidly growing, there is an urgent need for reliable computational methods for analyzing and visualizing it. This thesis addresses this need by studying how to efficiently and reliably analyze and visualize high-dimensional data, especially that obtained from gene expression microarray experiments. First, we studied ways to improve the quality of microarray data by replacing (imputing) the missing data entries with estimated values for these entries. Missing value imputation is a method commonly used to make the original incomplete data complete, thus making it easier to analyze with statistical and computational methods. Our novel approach was to use curated external biological information as a guide for the missing value imputation. Secondly, we studied the effect of missing value imputation on downstream data analysis methods like clustering. We compared multiple recent imputation algorithms against 8 publicly available microarray data sets. It was observed that missing value imputation is indeed a rational way to improve the quality of biological data. The research revealed differences between the clustering results obtained with different imputation methods. On most data sets, the simple and fast k-NN imputation was good enough, but some cases called for more advanced imputation methods, such as Bayesian Principal Component Algorithm (BPCA). Finally, we studied the visualization of biological network data. Biological interaction networks are examples of the outcome of multiple biological experiments, such as those using gene microarray techniques.
Such networks are typically very large and highly connected, so there is a need for fast algorithms for producing visually pleasing layouts. A computationally efficient way to produce layouts of large biological interaction networks was developed. The algorithm uses multilevel optimization within the regular force-directed graph layout algorithm.
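The k-NN imputation mentioned in the abstract above can be sketched in a few lines of plain Python. This is a generic illustration, not the thesis's implementation: the function name, the use of `None` for missing entries, the root-mean-square distance over columns observed in both rows, and the unweighted mean of the neighbours are all assumptions.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k most similar rows."""

    def distance(a, b):
        # Compare two rows only on the columns where both have a value.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that have a value in column j.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Hypothetical expression matrix: rows are genes, columns are samples.
expression = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [0.9, 1.9, 2.8],
    [5.0, 6.0, 7.0],
]
completed = knn_impute(expression, k=2)
```

Here the missing entry is filled from the two rows closest to the incomplete one, while the distant fourth row is ignored.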
Abstract:
Recently, research projects such as PADLR and SWAP have developed tools like Edutella or Bibster, which are targeted at establishing peer-to-peer knowledge management (P2PKM) systems. In such a system, it is necessary to obtain brief semantic descriptions of peers, so that routing algorithms or matchmaking processes can make decisions about which communities peers should belong to, or to which peers a given query should be forwarded. This paper proposes the use of graph clustering techniques on knowledge bases for that purpose. Using this clustering, we can show that our strategy requires up to 58% fewer queries than the baselines to yield full recall in a bibliographic P2PKM scenario.
Abstract:
Early psychiatry investigated dreams to understand psychopathologies. Contemporary psychiatry, which neglects dreams, has been criticized for lack of objectivity. In search of quantitative insight into the structure of psychotic speech, we investigated speech graph attributes (SGA) in patients with schizophrenia, bipolar disorder type I, and non-psychotic controls as they reported waking and dream contents. Schizophrenic subjects spoke with reduced connectivity, in tight correlation with negative and cognitive symptoms measured by standard psychometric scales. Bipolar and control subjects were indistinguishable by waking reports, but in dream reports bipolar subjects showed significantly less connectivity. Dream-related SGA outperformed psychometric scores or waking-related data for group sorting. Altogether, the results indicate that online and offline processing, the two most fundamental modes of brain operation, produce nearly opposite effects on recollections: while dreaming exposes differences in the mnemonic records across individuals, waking dampens distinctions. The results also demonstrate the feasibility of the differential diagnosis of psychosis based on the analysis of dream graphs, pointing to a fast, low-cost and language-invariant tool for psychiatric diagnosis and the objective search for biomarkers. The Freudian notion that “dreams are the royal road to the unconscious” is clinically useful, after all.