977 results for Scalable Nanofabrication


Relevance:

20.00%

Publisher:

Abstract:

Collecting data via a questionnaire and analyzing the responses while preserving respondents’ privacy may increase the number of respondents and the truthfulness of their responses. It may also reduce the systematic differences between respondents and non-respondents. In this paper, we propose a privacy-preserving method for collecting and analyzing survey responses using secure multi-party computation (SMC). The method is secure under the semi-honest adversarial model. The proposed method computes a wide variety of statistics. Total and stratified statistical counts are computed using the secure protocols developed in this paper. Additional statistics, such as a contingency table, a chi-square test, an odds ratio, and logistic regression, are then computed within the R statistical environment using the statistical counts as building blocks. The method was evaluated on a questionnaire dataset of 3,158 respondents sampled for a medical study and on simulated questionnaire datasets of up to 50,000 respondents. The computation time for the statistical analyses scales linearly with the number of respondents. The results show that the method is efficient and scalable enough for practical use. It can also be applied to other settings in which categorical data are collected.
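
Since secure statistical counts are the building blocks here, the core idea can be illustrated with additive secret sharing. The sketch below is a toy illustration, not the paper's protocol; the field size, party count, and data are all made-up assumptions:

```python
import random

PRIME = 2 ** 61 - 1  # field modulus; any sufficiently large prime works for this toy

def share(value, n_parties):
    """Split an integer into additive shares that sum to value mod PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    return sum(shares) % PRIME

# Each respondent secret-shares a 0/1 answer indicator, so no single
# party ever sees an individual response; parties only add shares locally.
responses = [1, 0, 1, 1, 0]        # hypothetical binary answers
n_parties = 3
party_totals = [0] * n_parties
for r in responses:
    for p, s in enumerate(share(r, n_parties)):
        party_totals[p] = (party_totals[p] + s) % PRIME

# Only the aggregate count is reconstructed and released.
print(reconstruct(party_totals))   # -> 3
```

Downstream statistics such as the chi-square test can then be computed in the clear from the released counts, which matches the abstract's use of counts as building blocks for R-based analysis.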

Relevance:

20.00%

Publisher:

Abstract:

Graph analytics is an important and computationally demanding class of data analytics. It is essential to balance scalability, ease of use and high performance in large-scale graph analytics. As such, it is necessary to hide the complexity of parallelism, data distribution and memory locality behind an abstract interface. The aim of this work is to build a NUMA-aware, scalable graph analytics framework that does not demand significant parallel programming experience.
The realization of such a system faces two key problems:
(i) how to develop a scale-free parallel programming framework that scales efficiently across NUMA domains; and (ii) how to apply graph partitioning efficiently in order to create separate and largely independent work items that can be distributed among threads.
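
Problem (ii), carving the graph into largely independent work items, can be illustrated with a toy vertex-range partitioner. The contiguous-block scheme and function names below are illustrative assumptions, not the framework's actual strategy:

```python
def partition_vertices(num_vertices, num_domains):
    """Split the vertex ID range into contiguous, roughly equal chunks,
    one per NUMA domain, so each domain's threads touch mostly local data."""
    base, extra = divmod(num_vertices, num_domains)
    bounds, start = [], 0
    for d in range(num_domains):
        end = start + base + (1 if d < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds

def owner(v, bounds):
    """Map a vertex to the domain that owns its partition."""
    for d, (lo, hi) in enumerate(bounds):
        if lo <= v < hi:
            return d

print(partition_vertices(10, 3))   # [(0, 4), (4, 7), (7, 10)]
print(owner(5, partition_vertices(10, 3)))   # 1
```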

Relevance:

20.00%

Publisher:

Abstract:

Variability management is one of the major challenges in software product line adoption, since variability needs to be efficiently managed at various levels of the software product line development process (e.g., requirements analysis, design, implementation). One of the main challenges within variability management is the handling and effective visualization of large-scale (industry-size) models, which in many projects can number in the thousands, along with the dependency relationships that exist among them. These challenges have raised many concerns regarding the scalability of current variability management tools and techniques and their lack of industrial adoption. To address the scalability issues, this work employed a combination of quantitative and qualitative research methods to identify the reasons behind the limited scalability of existing variability management tools and techniques. In addition to producing a comprehensive catalogue of existing tools, the outcome of this stage helped clarify the major limitations of existing tools. Based on the findings, a novel approach to managing variability was created that employed two main principles for supporting scalability. First, the separation-of-concerns principle was employed by creating multiple views of variability models to alleviate information overload. Second, hyperbolic trees were used to visualise models (compared to the Euclidean-space trees traditionally used). The result was an approach that can represent models encompassing hundreds of variability points and complex relationships. These concepts were demonstrated by implementing them in an existing variability management tool and using it to model a real-life product line with over a thousand variability points. Finally, in order to assess the work, an evaluation framework was designed based on established usability assessment best practices and standards. The framework was then used with several case studies to benchmark the performance of this work against other existing tools.
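
The hyperbolic-tree idea can be sketched in a few lines: map each node's hyperbolic distance d from the root to a Euclidean radius r = tanh(d/2) in the Poincaré disk, so large models compress gracefully toward the rim. This toy layout is an assumption-laden illustration, not the thesis's tool:

```python
import math

def layout(node, angle_lo, angle_hi, depth=0, step=1.0, out=None):
    """Recursively place a tree in the Poincare disk: each node gets a
    wedge of its parent's angular sector, and hyperbolic distance from
    the origin grows with depth.  Mapping hyperbolic radius d to
    Euclidean radius r = tanh(d / 2) keeps every node inside the unit
    disk, so deep subtrees compress toward the rim."""
    if out is None:
        out = {}
    angle = (angle_lo + angle_hi) / 2
    r = math.tanh(depth * step / 2)          # Euclidean radius in the disk
    out[node["name"]] = (r * math.cos(angle), r * math.sin(angle))
    kids = node.get("children", [])
    width = (angle_hi - angle_lo) / max(len(kids), 1)
    for i, child in enumerate(kids):
        layout(child, angle_lo + i * width, angle_lo + (i + 1) * width,
               depth + 1, step, out)
    return out

# Hypothetical miniature variability model.
tree = {"name": "root", "children": [
    {"name": "featureA", "children": [{"name": "a1"}, {"name": "a2"}]},
    {"name": "featureB"}]}
print(layout(tree, 0.0, 2 * math.pi))
```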

Relevance:

20.00%

Publisher:

Abstract:

As the efficiency of parallel software increases, it is becoming common to measure near-linear speedup for many applications. For a problem of size N on P processors, with the application software running at O(N/P), the performance restrictions due to file I/O systems and mesh decomposition running at O(N) become increasingly apparent, especially for large P. For distributed-memory parallel systems, an additional limit to scalability results from the finite memory available for I/O scatter/gather operations. Simple strategies developed to address the scalability of scatter/gather operations for unstructured-mesh-based applications have been extended to provide scalable mesh decomposition through the development of a parallel graph partitioning code, JOSTLE [8]. The focus of this work is directed towards the development of generic strategies that can be incorporated into the Computer Aided Parallelisation Tools (CAPTools) project.
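
One simple strategy of the kind described, bounding root-side memory by streaming scatter data in fixed-size chunks instead of building an O(N) buffer, might look as follows with mpi4py. The chunk size, block decomposition, and data are placeholders, and this is a sketch rather than the CAPTools implementation:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
CHUNK = 1024  # entries per message; bounds root-side buffering

if rank == 0:
    mesh = list(range(10000))                 # stand-in for mesh entities
    # Contiguous block decomposition: process d owns one block of entities.
    per = -(-len(mesh) // size)               # ceiling division
    for dest in range(1, size):
        block = mesh[dest * per:(dest + 1) * per]
        for off in range(0, len(block), CHUNK):
            comm.send(block[off:off + CHUNK], dest=dest, tag=0)
        comm.send(None, dest=dest, tag=0)     # end-of-stream marker
    local = mesh[:per]
else:
    local = []
    while True:
        piece = comm.recv(source=0, tag=0)
        if piece is None:
            break
        local.extend(piece)

print(rank, len(local))
```

Run under mpiexec; the root's live buffer stays O(CHUNK) per destination rather than O(N).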

Relevance:

20.00%

Publisher:

Abstract:

In the past decade, systems that extract information from millions of Internet documents have become commonplace. Knowledge graphs -- structured knowledge bases that describe entities, their attributes and the relationships between them -- are a powerful tool for understanding and organizing this vast amount of information. However, a significant obstacle to knowledge graph construction is the unreliability of the extracted information, caused by noise and ambiguity in the underlying data and by errors made by the extraction system, together with the complexity of reasoning about the dependencies between these noisy extractions. My dissertation addresses these challenges by exploiting the interdependencies between facts to improve the quality of the knowledge graph in a scalable framework. I introduce a new approach called knowledge graph identification (KGI), which resolves the entities, attributes and relationships in the knowledge graph by incorporating uncertain extractions from multiple sources, entity co-references, and ontological constraints. I define a probability distribution over possible knowledge graphs and infer the most probable knowledge graph using a combination of probabilistic and logical reasoning. Such probabilistic models are frequently dismissed due to scalability concerns, but my implementation of KGI maintains tractable performance on large problems through the use of hinge-loss Markov random fields, which have a convex inference objective. This allows inference over large knowledge graphs with 4M facts and 20M ground constraints in 2 hours. To further scale the solution, I develop a distributed approach to the KGI problem which runs in parallel across multiple machines, reducing inference time by 90%. Finally, I extend my model to the streaming setting, where a knowledge graph is continuously updated by incorporating newly extracted facts. I devise a general approach for approximately updating inference in convex probabilistic models, and quantify the approximation error by defining and bounding inference regret for online models. Together, my work retains the attractive features of probabilistic models while providing the scalability necessary for large-scale knowledge graph construction. These models have been applied to a number of real-world knowledge graph projects, including the NELL project at Carnegie Mellon and the Google Knowledge Graph.
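
The hinge-loss construction that keeps inference convex can be shown on a toy pair of ground rules: each soft rule body => head contributes a weighted potential max(0, body - head) over truth values in [0, 1]. The rules, weights, and grid-based coordinate descent below are illustrative assumptions, not the dissertation's KGI system (which uses ADMM-style consensus optimization):

```python
def hinge(body, head):
    """Distance to satisfaction of a soft rule body => head."""
    return max(0.0, body - head)

# Hypothetical ground rules: (weight, body atom, head atom).
rules = [(1.0, "extracted(A, capitalOf, B)", "rel(A, capitalOf, B)"),
         (0.5, "rel(A, capitalOf, B)",       "rel(A, locatedIn, B)")]

truth = {"extracted(A, capitalOf, B)": 0.9,   # noisy extractor confidence
         "rel(A, capitalOf, B)": 0.0,
         "rel(A, locatedIn, B)": 0.0}

def energy(t):
    return sum(w * hinge(t[b], t[h]) for w, b, h in rules)

# Grid-based coordinate descent over each open atom; the objective is
# convex, so a few sweeps settle on the most probable soft assignment.
for _ in range(10):
    for atom in ("rel(A, capitalOf, B)", "rel(A, locatedIn, B)"):
        truth[atom] = min((energy({**truth, atom: v / 20}), v / 20)
                          for v in range(21))[1]

print(truth, energy(truth))   # both relations pulled up to 0.9, energy 0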

Relevance:

20.00%

Publisher:

Abstract:

The number of connected devices collecting and distributing real-world information through various systems is expected to soar in the coming years. As the number of such connected devices grows, it becomes increasingly difficult to store and share all these new sources of information. Several context representation schemes try to standardize this information, but none of them has been widely adopted. In previous work we addressed this challenge; however, our solution had some drawbacks: poor semantic extraction and limited scalability. In this paper we discuss ways to deal efficiently with the diversity of representation schemes and propose a novel d-dimension organization model. Our evaluation shows that the d-dimension model improves scalability and semantic extraction.

Relevance:

20.00%

Publisher:

Abstract:

Our research has shown that schedules can be built that mimic a human scheduler by using a set of rules involving domain knowledge. This chapter presents a Bayesian Optimization Algorithm (BOA) for the nurse scheduling problem that chooses suitable scheduling rules from such a set for each nurse's assignment. Following the idea of probabilistic model-building, the BOA constructs a Bayesian network over the set of promising solutions and samples this network to generate new candidate solutions. Computational results from 52 real data instances demonstrate the success of this approach. It is also suggested that the learning mechanism in the proposed algorithm may be suitable for other scheduling problems.
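
The learn-and-sample loop can be sketched compactly. For brevity the model below uses independent per-nurse rule probabilities (a UMDA-style simplification) rather than the full Bayesian network the BOA learns, and the encoding, sizes, and objective are invented for illustration:

```python
import random

# Estimation-of-distribution loop: fit a probabilistic model to the
# elite (promising) solutions, then sample new candidates from it.
N_NURSES, N_RULES, POP, ELITE, GENS = 12, 4, 60, 15, 40

def fitness(assign):
    # Hypothetical stand-in objective: prefer rule 0, penalize rule 3.
    return sum((r == 0) - (r == 3) for r in assign)

probs = [[1.0 / N_RULES] * N_RULES for _ in range(N_NURSES)]
for _ in range(GENS):
    pop = [[random.choices(range(N_RULES), w)[0] for w in probs]
           for _ in range(POP)]
    elite = sorted(pop, key=fitness, reverse=True)[:ELITE]
    for i in range(N_NURSES):                 # re-estimate the model
        counts = [1] * N_RULES                # Laplace smoothing
        for sol in elite:
            counts[sol[i]] += 1
        total = sum(counts)
        probs[i] = [c / total for c in counts]

best = max(pop, key=fitness)
print(best, fitness(best))
```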

Relevance:

20.00%

Publisher:

Abstract:

Poster presented at: 12th EUROPEAN SOFC & SOE FORUM 2016, 5–8 July 2016, KKL Lucerne, Switzerland.

Relevance:

20.00%

Publisher:

Abstract:

Since its identification in the 1990s, the RNA interference (RNAi) pathway has proven extremely useful in elucidating the function of proteins in the context of cells and even whole organisms. In particular, this sequence-specific and powerful loss-of-function approach has greatly simplified the study of the role of host cell factors implicated in the life cycle of viruses. Here, we detail the RNAi method we have developed and used to specifically knock down the expression of ezrin, an actin-binding protein that was identified by yeast two-hybrid screening as an interaction partner of the Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) spike (S) protein. This method was used to study the role of ezrin specifically during the entry stage of SARS-CoV infection.

Relevance:

20.00%

Publisher:

Abstract:

With the proliferation of new mobile devices and applications, the demand for ubiquitous wireless services has increased dramatically in recent years. The explosive growth in wireless traffic requires wireless networks to be scalable, so that they can be efficiently extended to meet communication demands. In a wireless network, the interference power typically grows with the number of devices unless they are coordinated, yet large-scale coordination is difficult due to the low-bandwidth, high-latency interfaces between access points (APs) in traditional wireless networks. To address this challenge, the cloud radio access network (C-RAN) has been proposed, in which a pool of baseband units (BBUs) is connected to distributed remote radio heads (RRHs) via high-bandwidth, low-latency links (the front-haul) and is responsible for all baseband processing. However, insufficient front-haul link capacity may limit the scale of a C-RAN and prevent it from fully realizing the benefits of centralized baseband processing. As a result, front-haul link capacity becomes a bottleneck in the scalability of C-RAN. In this dissertation, we explore scalable C-RAN designs that tackle this challenge. In the first part of this dissertation, we investigate the scalability issues in existing wireless networks and propose a novel time-reversal (TR) based scalable wireless network in which the interference power is naturally mitigated by the focusing effects of TR communications, without coordination among APs or terminal devices (TDs). Thanks to this property, the system can easily be extended to serve more TDs. Motivated by these properties of TR communications, in the second part of this dissertation we apply TR-based communications to the C-RAN and identify the TR tunneling effects, which alleviate the front-haul traffic load caused by the growing number of TDs. We further design waveforming schemes to optimize the downlink and uplink transmissions in the TR-based C-RAN, which are shown to improve downlink and uplink transmission accuracy. Consequently, the front-haul traffic load is further alleviated by reducing the re-transmissions caused by transmission errors. Moreover, inspired by the TR-based C-RAN, we propose a compressive quantization scheme for the uplink of multi-antenna C-RAN, so that more antennas can be utilized within the limited front-haul capacity, providing rich spatial diversity so that massive numbers of TDs can be served more efficiently.
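
The focusing effect the design leans on is easy to demonstrate numerically: pre-filtering with the time-reversed, conjugated channel impulse response concentrates the received energy into one dominant tap. The multipath channel below is randomly generated purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
L = 16                               # number of multipath taps
# Random complex channel impulse response, normalized to unit energy.
h = (rng.normal(size=L) + 1j * rng.normal(size=L)) / np.sqrt(2 * L)

g = np.conj(h[::-1])                 # TR prefilter: reversed conjugate of h
eff = np.convolve(g, h)              # effective channel seen by the TD

# The peak tap (the coherent sum of |h|^2) dominates the residual taps.
peak = np.max(np.abs(eff))
sidelobe = np.max(np.abs(np.delete(eff, np.argmax(np.abs(eff)))))
print(f"focusing gain: {20 * np.log10(peak / sidelobe):.1f} dB")
```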

Relevance:

20.00%

Publisher:

Abstract:

Multilevel clustering problems, where the content and contextual information are jointly clustered, are ubiquitous in modern datasets. Existing works on this problem are limited to small datasets due to the use of the Gibbs sampler. We address the problem of scaling up multilevel clustering under a Bayesian nonparametric setting, extending the MC2 model proposed in (Nguyen et al., 2014). We ground our approach in structured mean-field and stochastic variational inference (SVI) and develop a tree-structured SVI algorithm that exploits the interplay between content and context modeling. Our new algorithm avoids the need to repeatedly pass through the corpus, as the Gibbs sampler must. More crucially, our method is immediately amenable to parallelization, facilitating a scalable distributed implementation on the Apache Spark platform. We conduct extensive experiments in a variety of domains including text, images, and real-world user application activities. Direct comparison with the Gibbs sampler demonstrates that our method is an order of magnitude faster without loss of model quality. Our Spark-based implementation gains another order-of-magnitude speedup and can scale to large real-world datasets containing millions of documents and groups.
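
The generic SVI skeleton underlying such algorithms fits in a few lines: sample a mini-batch, fit its local variational parameters, scale the resulting sufficient statistics up to corpus size, and blend them into the global parameters with a decaying step size. Everything model-specific below (local_step, the toy data) is a hypothetical placeholder, not the paper's tree-structured algorithm:

```python
import random

def svi(corpus, lam0, local_step, batch=64, tau=1.0, kappa=0.7, iters=200):
    lam = list(lam0)                       # global variational parameters
    N = len(corpus)
    for t in range(1, iters + 1):
        docs = random.sample(corpus, batch)
        # Local step: per-document sufficient statistics given current lam.
        stats = [local_step(d, lam) for d in docs]
        # Intermediate estimate: pretend the batch were the whole corpus.
        lam_hat = [N / batch * sum(s[i] for s in stats)
                   for i in range(len(lam))]
        rho = (t + tau) ** (-kappa)        # Robbins-Monro step size
        lam = [(1 - rho) * a + rho * b for a, b in zip(lam, lam_hat)]
    return lam

# Toy usage: the "global parameter" is just the corpus total, so each
# document's sufficient statistic is its own value.
data = [random.gauss(5, 1) for _ in range(10000)]
print(svi(data, [0.0], lambda d, lam: [d]))   # approaches sum(data)
```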

Relevance:

20.00%

Publisher:

Abstract:

Many algorithms have been introduced to deterministically authenticate Radio Frequency Identification (RFID) tags, while little work has been done to address the scalability issue in batch authentication. Deterministic approaches verify tags one by one, so the communication overhead and time cost grow linearly with the number of tags. We design a fast and scalable counterfeit-estimation scheme, INformative Counting (INC), which achieves sublinear authentication time and communication cost in batch verifications. The key novelty of INC builds on an FM-sketch-variant authentication synopsis that captures key counting information using only sublinear space. With the help of this well-designed data structure, INC is able to provide authentication results with accurate estimates of the numbers of counterfeit and genuine tags, whereas previous batch authentication methods merely provide 0/1 results indicating the existence of counterfeits. We conduct detailed theoretical analysis and extensive experiments to examine this design, and the results show that INC significantly outperforms previous work in terms of effectiveness and efficiency.
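
The FM sketch that the synopsis builds on can be shown in miniature: hash each tag ID, set the bit given by the position of the hash's lowest set bit, and read off the first unset bit position as a log2-scale count estimate. This is the classic Flajolet-Martin primitive, not INC itself, and a single sketch is noisy (practical schemes average many):

```python
import hashlib

PHI = 0.77351  # Flajolet-Martin correction constant

def lowest_set_bit(x):
    return (x & -x).bit_length() - 1

def fm_estimate(ids, bits=32):
    bitmap = 0
    for tag in ids:
        h = int.from_bytes(hashlib.sha1(tag.encode()).digest()[:8], "big")
        # Record the position of the hash's lowest set bit (capped).
        bitmap |= 1 << min(lowest_set_bit(h | (1 << bits)), bits - 1)
    r = 0
    while bitmap >> r & 1:               # first unset bit position
        r += 1
    return 2 ** r / PHI

tags = [f"tag-{i}" for i in range(5000)]   # hypothetical tag IDs
print(f"true 5000, estimated {fm_estimate(tags):.0f}")
```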

Relevance:

20.00%

Publisher:

Abstract:

Virtually every sector of business and industry that uses computing, including financial analysis, search engines, and electronic commerce, incorporates Big Data analysis into its business model. Sophisticated clustering algorithms are popular for deducing the nature of data by assigning labels to unlabeled data. We address two main challenges in Big Data. First, by definition, the volume of Big Data is too large to be loaded into a computer's memory (this volume changes based on the computer used or available, but there is always a data set that is too large for any computer). Second, in real-time applications, the velocity of new incoming data prevents historical data from being stored and future data from being accessed. Therefore, we propose our Streaming Kernel Fuzzy c-Means (stKFCM) algorithm, which significantly reduces both computational complexity and space complexity. The proposed stKFCM requires only O(n²) memory, where n is the (predetermined) size of a data subset (or data chunk) at each time step, which makes the algorithm truly scalable (as n can be chosen based on the available memory). Furthermore, only 2n² elements of the full N × N (where N >> n) kernel matrix need to be calculated at each time step, reducing both the time spent producing kernel elements and the complexity of the FCM algorithm. Empirical results show that stKFCM, even with relatively small n, can provide clustering performance as accurate as kernel fuzzy c-means run on the entire data set while achieving a significant speedup.
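
A kernel-FCM pass on one chunk, the regime stKFCM works in, looks roughly as follows: only an n × n kernel block is ever materialized, and memberships are updated from kernel-space distances to the implicit cluster centers. The kernel choice, parameters, and data are illustrative, and this omits stKFCM's streaming carry-over between chunks:

```python
import numpy as np

def rbf(X, Y, gamma=0.5):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_fcm_chunk(X, c=3, m=2.0, iters=50, seed=0):
    n = len(X)
    K = rbf(X, X)                             # n x n kernel block only
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=n)     # fuzzy memberships, rows sum to 1
    for _ in range(iters):
        W = U ** m
        s = W.sum(0)
        # Squared kernel distance from each point to each implicit center:
        # d2[j,k] = K[j,j] - 2*(K@W)[j,k]/s[k] + (W[:,k]@K@W[:,k])/s[k]**2
        d2 = (np.diag(K)[:, None]
              - 2 * (K @ W) / s
              + np.einsum('ik,ij,jk->k', W, K, W) / s ** 2)
        d2 = np.maximum(d2, 1e-12)
        U = d2 ** (-1 / (m - 1))              # standard FCM membership update
        U /= U.sum(1, keepdims=True)
    return U

# Three synthetic 2-D clusters centered at 0, 3 and 6.
X = np.vstack([np.random.default_rng(1).normal(mu, 0.3, (30, 2))
               for mu in (0, 3, 6)])
print(kernel_fcm_chunk(X).argmax(1))
```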

Relevance:

10.00%

Publisher:

Abstract:

Decision support systems (DSS) have evolved rapidly during the last decade from stand-alone or limited networked solutions to online participatory solutions. One of the major enablers of this change is one of the fastest-growing areas of geographical information system (GIS) technology development: the use of the Internet as a means to access, display, and analyze geospatial data remotely. Worldwide, many federal, state, and particularly local governments are deploying interactive Internet map servers to facilitate data sharing. This new generation of DSS or planning support systems (PSS), the interactive Internet map server, is the solution for delivering dynamic maps and GIS data and services via the World Wide Web, and for providing public participatory GIS (PPGIS) opportunities to a wider community (Carver, 2001; Jankowski & Nyerges, 2001). It provides a highly scalable framework for GIS Web publishing, Web-based public participatory GIS (WPPGIS), which meets the needs of corporate intranets and the demands of worldwide Internet access (Craig, 2002). The establishment of WPPGIS provides spatial data access through a support centre or a GIS portal to facilitate efficient access to and sharing of related geospatial data (Yigitcanlar, Baum, & Stimson, 2003). As more and more public and private entities adopt WPPGIS technology, the importance and complexity of facilitating geospatial data sharing are growing rapidly (Carver, 2003). This article therefore focuses on the online public participation dimension of GIS technology. It provides an overview of recent literature on GIS and WPPGIS, and discusses the potential use of these technologies in providing a democratic platform for the public in decision-making.