983 results for scalable analysis


Relevance:

70.00%

Publisher:

Abstract:

Monitoring the environment with acoustic sensors is an effective method for understanding changes in ecosystems. Through extensive monitoring, large-scale, ecologically relevant datasets can be produced that can inform environmental policy. The collection of acoustic sensor data is a solved problem; the current challenge is the management and analysis of raw audio data to produce useful datasets for ecologists. This paper presents the applied research we use to analyze big acoustic datasets. Its core contribution is the presentation of practical large-scale acoustic data analysis methodologies. We describe details of the data workflows we use to provide both citizen scientists and researchers with practical access to large volumes of ecoacoustic data. Finally, we propose a work-in-progress large-scale architecture for analysis driven by a hybrid cloud-and-local production-grade website.

Relevance:

60.00%

Publisher:

Abstract:

In the past decade, the analysis of data has faced the challenge of dealing with very large and complex datasets and the real-time generation of data. Technologies to store and access these complex and large datasets are in place. However, robust and scalable analysis technologies are needed to extract meaningful information from them. The research field of Information Visualization and Visual Data Analytics addresses this need. Information visualization and data mining are often used to complement each other. Their common goal is the extraction of meaningful information from complex and possibly large data. However, whereas data mining focuses on the use of silicon hardware, visualization techniques also aim to harness the powerful image-processing capabilities of the human brain. This article highlights research on data visualization and visual analytics techniques. Furthermore, we highlight existing visual analytics techniques, systems, and applications, including a perspective on the field from the chemical process industry.

Relevance:

40.00%

Publisher:

Abstract:

The inverse temperature hyperparameter of the hidden Potts model governs the strength of spatial cohesion and therefore has a substantial influence over the resulting model fit. The difficulty arises from the dependence of an intractable normalising constant on the value of the inverse temperature; thus there is no closed-form solution for sampling from the distribution directly. We review three computational approaches for addressing this issue, namely pseudolikelihood, path sampling, and the approximate exchange algorithm. We compare the accuracy and scalability of these methods using a simulation study.
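To make the computational difficulty concrete, here is the standard formulation in generic notation (symbols are illustrative and not taken from the paper): for labels z_i on a lattice, the Potts model is

    p(z \mid \beta) = \frac{\exp\{\beta \sum_{i \sim j} \delta(z_i, z_j)\}}{\mathcal{C}(\beta)},
    \qquad
    \mathcal{C}(\beta) = \sum_{z' \in \mathcal{Z}} \exp\Big\{\beta \sum_{i \sim j} \delta(z'_i, z'_j)\Big\}

The normalising constant \mathcal{C}(\beta) sums over all k^n possible labellings of the n lattice sites, which is intractable for realistic images. Pseudolikelihood avoids it by replacing the joint distribution with the product of full conditionals \prod_i p(z_i \mid z_{\partial i}, \beta), each normalised over only k states, while path sampling and the approximate exchange algorithm estimate or cancel \mathcal{C}(\beta) via simulation.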

Relevance:

40.00%

Publisher:

Abstract:

Context-sensitive points-to analysis is critical for several program optimizations. However, as the number of contexts grows exponentially, the storage requirements of the analysis increase tremendously for large programs, making the analysis non-scalable. We propose a scalable flow-insensitive context-sensitive inclusion-based points-to analysis that uses a specially designed multi-dimensional Bloom filter to store the points-to information. Two key observations motivate our proposal: (i) points-to information (between pointer and object and between pointer and pointer) is sparse, and (ii) moving from an exact to an approximate representation of points-to information only reduces precision without affecting the correctness of the (may-points-to) analysis. By using an approximate representation, a multi-dimensional Bloom filter can significantly reduce the memory requirements with a probabilistic bound on the loss in precision. Experimental evaluation on SPEC 2000 benchmarks and two large open source programs reveals that with an average storage requirement of 4 MB, our approach achieves almost the same precision (98.6%) as the exact implementation. By increasing the average memory to 27 MB, it achieves precision up to 99.7% for these benchmarks. Using Mod/Ref analysis as the client, we find that the client analysis is rarely affected even when there is some loss of precision in the points-to representation. We find that the NoModRef percentage is within 2% of the exact analysis while requiring 4 MB (maximum 15 MB) of memory and less than 4 minutes on average for the points-to analysis. Another major advantage of our technique is that it allows precision to be traded off against the memory usage of the analysis.
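As a rough illustration of observation (ii), the sketch below stores may-point-to facts in a single flat Bloom filter rather than the specially designed multi-dimensional filter of the paper; all names and parameters are invented for illustration. Membership queries can return false positives (lost precision) but never false negatives, so a may-points-to client stays correct.

    import hashlib

    class PointsToBloom:
        """Approximate set of (pointer, context, object) facts."""

        def __init__(self, num_bits=1 << 20, num_hashes=4):
            self.num_bits = num_bits
            self.num_hashes = num_hashes
            self.bits = bytearray(num_bits // 8)

        def _indexes(self, pointer, context, obj):
            key = f"{pointer}|{context}|{obj}".encode()
            for i in range(self.num_hashes):
                digest = hashlib.sha256(key + bytes([i])).digest()
                yield int.from_bytes(digest[:8], "big") % self.num_bits

        def add(self, pointer, context, obj):
            for idx in self._indexes(pointer, context, obj):
                self.bits[idx // 8] |= 1 << (idx % 8)

        def may_point_to(self, pointer, context, obj):
            # A set bit pattern may collide with other facts (false
            # positive), but a stored fact is never reported absent.
            return all(self.bits[idx // 8] & (1 << (idx % 8))
                       for idx in self._indexes(pointer, context, obj))

    pts = PointsToBloom()
    pts.add("p", "main->foo", "heap@42")
    assert pts.may_point_to("p", "main->foo", "heap@42")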

Relevance:

40.00%

Publisher:

Abstract:

The ability to perform strong updates is the main contributor to the precision of flow-sensitive pointer analysis algorithms. Traditional flow-sensitive pointer analyses cannot strongly update pointers residing in the heap. This is a severe restriction for Java programs. In this paper, we propose a new flow-sensitive pointer analysis algorithm for Java that can perform strong updates on heap-based pointers effectively. Instead of points-to graphs, we represent our points-to information as maps from access paths to sets of abstract objects. We have implemented our analysis and run it on several large Java benchmarks. The results show considerable improvement in precision over the points-to graph based flow-insensitive and flow-sensitive analyses, with reasonable running time.
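A minimal sketch of why the access-path representation enables strong updates (names are hypothetical): when an assignment's target resolves to exactly one access path, the analysis may overwrite the old points-to set instead of unioning into it.

    def assign(state, lhs_paths, rhs_objects):
        """Transfer function for 'lhs = rhs', where `state` maps access
        paths such as ("x",) or ("x", "f") to sets of abstract objects."""
        if len(lhs_paths) == 1:
            # Unambiguous target: a strong update kills the old contents.
            state[lhs_paths[0]] = set(rhs_objects)
        else:
            # Ambiguous target: weak update, union to remain sound.
            for path in lhs_paths:
                state.setdefault(path, set()).update(rhs_objects)

    state = {("x",): {"o1"}, ("x", "f"): {"o2"}}
    assign(state, [("x", "f")], {"o3"})
    assert state[("x", "f")] == {"o3"}   # "o2" was strongly updated away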

Relevance:

40.00%

Publisher:

Abstract:

Today there is a growing interest in the integration of health monitoring applications in portable devices, necessitating the development of methods that improve the energy efficiency of such systems. In this paper, we present a systematic approach that enables energy-quality trade-offs in spectral analysis systems for bio-signals, which are useful in monitoring various health conditions such as those associated with the heart rate. To enable such trade-offs, the processed signals are expressed initially in a basis in which the significant components that carry most of the relevant information can be easily distinguished from the parts that influence the output to a lesser extent. Such a classification allows the pruning of operations associated with the less significant signal components, leading to power savings with minor quality loss, since only the less useful parts are pruned under the given requirements. To exploit the attributes of the modified spectral analysis system, thresholding rules are determined and adopted at design- and run-time, allowing the static or dynamic pruning of less useful operations based on the accuracy and energy requirements. The proposed algorithm is implemented on a typical sensor node simulator, and results show up to 82% energy savings when static pruning is combined with voltage and frequency scaling, compared to the conventional algorithm in which such trade-offs were not available. In addition, experiments with numerous cardiac samples of various patients show that such energy savings come with a 4.9% average accuracy loss, which does not affect the system's ability to detect sinus arrhythmia, which was used as a test case.
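A toy sketch of the underlying energy-quality knob, using a plain FFT basis and an illustrative threshold (the paper's basis, thresholding rules and run-time adaptation are more elaborate):

    import numpy as np

    def pruned_spectrum(x, keep_fraction=0.2):
        """Keep only the largest-magnitude spectral coefficients and
        zero the rest, so downstream work on them can be skipped."""
        coeffs = np.fft.rfft(x)
        k = max(1, int(keep_fraction * coeffs.size))
        threshold = np.sort(np.abs(coeffs))[-k]
        return np.where(np.abs(coeffs) >= threshold, coeffs, 0)

    # Toy heart-rate-like signal: one dominant tone plus noise.
    t = np.linspace(0, 10, 2000)
    x = np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(t.size)
    pruned = pruned_spectrum(x, keep_fraction=0.05)
    rel_error = (np.linalg.norm(np.fft.irfft(pruned, n=x.size) - x)
                 / np.linalg.norm(x))
    # Lowering keep_fraction saves operations (energy) at the cost of
    # a higher rel_error (quality), which is the trade-off in question.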

Relevance:

40.00%

Publisher:

Abstract:

Semantic interoperability is essential to facilitate efficient collaboration in heterogeneous multi-site healthcare environments. The deployment of a semantic interoperability solution has the potential to enable a wide range of informatics-supported applications in clinical care and research, both within a single healthcare organization and in a network of organizations. At the same time, building and deploying a semantic interoperability solution may require significant effort to carry out data transformation and to harmonize the semantics of the information in the different systems. Our approach to semantic interoperability leverages existing healthcare standards and ontologies, focusing first on specific clinical domains and key applications, and gradually expanding the solution when needed. An important objective of this work is to create a semantic link between clinical research and care environments to enable applications such as streamlining the execution of multi-centric clinical trials, including the identification of eligible patients for the trials. This paper presents an analysis of the suitability of several widely used medical ontologies in the clinical domain, SNOMED-CT, LOINC, and MedDRA, to capture the semantics of the clinical trial eligibility criteria, of the clinical trial data (e.g., Case Report Forms), and of the corresponding patient record data that would enable the automatic identification of eligible patients. In addition to the coverage provided by the ontologies, we evaluate and compare the sizes of the sets of relevant concepts and their relative frequency to estimate the cost of data transformation, of building the necessary semantic mappings, and of extending the solution to new domains. This analysis shows that our approach is both feasible and scalable.
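The coverage measure at the centre of such an analysis can be sketched directly; this is a deliberate simplification (the paper's evaluation also weighs concept-set sizes and relative frequencies):

    def coverage(criteria_concepts, ontology_concepts):
        """Fraction of the concepts needed by the eligibility criteria
        that a given ontology (e.g. SNOMED-CT) can represent."""
        needed = set(criteria_concepts)
        return len(needed & set(ontology_concepts)) / len(needed)

    # Higher coverage suggests less data transformation and fewer
    # custom semantic mappings when deploying the solution.
    print(coverage({"C1", "C2", "C3", "C4"}, {"C1", "C2", "C3"}))  # 0.75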

Relevance:

30.00%

Publisher:

Abstract:

Manually constructing domain-specific sentiment lexicons is extremely time consuming, and it may not even be feasible for domains where linguistic expertise is not available. Research on the automatic construction of domain-specific sentiment lexicons has therefore become a hot topic in recent years. The main contribution of this paper is the illustration of a novel semi-supervised learning method which exploits both term-to-term and document-to-term relations hidden in a corpus for the construction of domain-specific sentiment lexicons. More specifically, the proposed two-pass pseudo labeling method combines shallow linguistic parsing and corpus-based statistical learning to make domain-specific sentiment extraction scalable with respect to the sheer volume of opinionated documents archived on the Internet these days. Another novelty of the proposed method is that it can utilize the readily available user-contributed labels of opinionated documents (e.g., the user ratings of product reviews) to bootstrap the performance of sentiment lexicon construction. Our experiments show that the proposed method can generate high-quality domain-specific sentiment lexicons, as directly assessed by human experts. Moreover, the system-generated domain-specific sentiment lexicons improve polarity prediction tasks at the document level by 2.18% when compared to other well-known baseline methods. Our research opens the door to the development of practical and scalable methods for domain-specific sentiment analysis.
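A compressed sketch of two-pass pseudo labeling in this spirit; the pass structure and the smoothing weight are illustrative assumptions, not the paper's algorithm.

    from collections import defaultdict

    def build_lexicon(docs):
        """docs: list of (tokens, rating) pairs, ratings in [-1, 1].
        Pass 1 (document-to-term): a term inherits the mean polarity
        of the rated documents it occurs in. Pass 2 (term-to-term):
        each score is smoothed toward co-occurring terms' scores."""
        total, count = defaultdict(float), defaultdict(int)
        for tokens, rating in docs:
            for tok in set(tokens):
                total[tok] += rating
                count[tok] += 1
        score = {t: total[t] / count[t] for t in total}

        nb_sum, nb_n = defaultdict(float), defaultdict(int)
        for tokens, _ in docs:
            for tok in set(tokens):
                others = [score[t] for t in set(tokens) if t != tok]
                if others:
                    nb_sum[tok] += sum(others) / len(others)
                    nb_n[tok] += 1
        return {t: (0.5 * s + 0.5 * nb_sum[t] / nb_n[t]) if nb_n[t] else s
                for t, s in score.items()}

    lexicon = build_lexicon([(["battery", "lasts", "great"], 1.0),
                             (["battery", "died", "awful"], -1.0)])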

Relevance:

30.00%

Publisher:

Abstract:

Flexible information exchange is critical to successful design-analysis integration, but current top-down, standards-based and model-oriented strategies impose restrictions that contradict this flexibility. In this article we present a bottom-up, user-controlled and process-oriented approach to linking design and analysis applications that is more responsive to the varied needs of designers and design teams. Drawing on research into scientific workflows, we present a framework for integration that capitalises on advances in cloud computing to connect discrete tools via flexible and distributed process networks. We then discuss how a shared mapping process that is flexible and user friendly supports non-programmers in creating these custom connections. Adopting a services-oriented system architecture, we propose a web-based platform that enables data, semantics and models to be shared on the fly. We then discuss potential challenges and opportunities for its development as a flexible, visual, collaborative, scalable and open system.
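As a sketch of what a user-controlled, on-the-fly mapping between two tools might look like in such a process network (the tools, fields and functions below are hypothetical, not from the article):

    from typing import Callable, Dict

    Mapping = Dict[str, Callable[[dict], object]]

    def apply_mapping(design_record: dict, mapping: Mapping) -> dict:
        """Translate one tool's output record into the input fields
        another tool expects, field by user-defined field."""
        return {target: fn(design_record) for target, fn in mapping.items()}

    # e.g. feeding a daylight analysis tool from a geometry model
    to_analysis: Mapping = {
        "window_area_m2": lambda d: d["glazing"]["width"] * d["glazing"]["height"],
        "orientation_deg": lambda d: d["facade_azimuth"],
    }
    record = {"glazing": {"width": 2.0, "height": 1.5}, "facade_azimuth": 180}
    print(apply_mapping(record, to_analysis))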

Relevance:

30.00%

Publisher:

Abstract:

Flexible information exchange is critical to successful design integration, but current top-down, standards-based and model-oriented strategies impose restrictions that contradict this flexibility. In this paper we present a bottom-up, user-controlled and process-oriented approach to linking design and analysis applications that is more responsive to the varied needs of designers and design teams. Drawing on research into scientific workflows, we present a framework for integration that capitalises on advances in cloud computing to connect discrete tools via flexible and distributed process networks. Adopting a services-oriented system architecture, we propose a web-based platform that enables data, semantics and models to be shared on the fly. We discuss potential challenges and opportunities for its development as a flexible, visual, collaborative, scalable and open system.

Relevance:

30.00%

Publisher:

Abstract:

A monolithic stationary phase was prepared via free-radical co-polymerization of ethylene glycol dimethacrylate (EDMA) and glycidyl methacrylate (GMA) with a pore diameter tailored specifically for plasmid binding, retention and elution. The polymer was functionalized with 2-chloro-N,N-diethylethylamine hydrochloride (DEAE-Cl) for anion-exchange purification of plasmid DNA (pDNA) from clarified lysate obtained from an E. coli DH5α-pUC19 culture in a ribonuclease/protease-free environment. Characterization of the monolithic resin showed a porous material, with 68% of the pores in the matrix having diameters above 300 nm. The final product isolated from a single-stage 5 min anion-exchange purification was a pure and homogeneous supercoiled (SC) pDNA with no gDNA, RNA or protein contamination, as confirmed by ethidium bromide agarose gel electrophoresis (EtBr-AGE), enzyme restriction analysis and sodium dodecyl sulfate-polyacrylamide gel electrophoresis. This non-toxic technique is cGMP compatible and highly scalable for production of pDNA on a commercial level.

Relevance:

30.00%

Publisher:

Abstract:

Systems-level identification and analysis of cellular circuits in the brain will require the development of whole-brain imaging with single-cell resolution. To this end, we performed comprehensive chemical screening to develop a whole-brain clearing and imaging method, termed CUBIC (clear, unobstructed brain imaging cocktails and computational analysis). CUBIC is a simple and efficient method involving the immersion of brain samples in chemical mixtures containing aminoalcohols, which enables rapid whole-brain imaging with single-photon excitation microscopy. CUBIC is applicable to multicolor imaging of fluorescent proteins or immunostained samples in adult brains and is scalable from a primate brain to subcellular structures. We also developed a whole-brain cell-nuclear counterstaining protocol and a computational image analysis pipeline that, together with CUBIC reagents, enable the visualization and quantification of neural activities induced by environmental stimulation. CUBIC enables time-course expression profiling of whole adult brains with single-cell resolution.

Relevance:

30.00%

Publisher:

Abstract:

This thesis introduces a new way of using prior information in a spatial model and develops scalable algorithms for fitting this model to large imaging datasets. These methods are employed for image-guided radiation therapy and satellite-based classification of land use and water quality. The study uses a pre-computation step to achieve a hundredfold improvement in the elapsed runtime for model fitting. This makes it much more feasible to apply these models to real-world problems, and enables full Bayesian inference for images with a million or more pixels.
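The abstract does not spell out the pre-computation, but the general pattern, tabulating an expensive model quantity offline and interpolating it inside the fitting loop, can be sketched generically (all names below are placeholders):

    import numpy as np

    def precompute_table(expensive_fn, grid):
        """Offline: evaluate the costly term once per grid point."""
        return grid, np.array([expensive_fn(b) for b in grid])

    def cheap_lookup(table, value):
        """Online: linear interpolation stands in for the costly
        evaluation inside each iteration of model fitting."""
        grid, values = table
        return float(np.interp(value, grid, values))

    # Toy stand-in for a simulation-based quantity.
    table = precompute_table(lambda b: b ** 3, np.linspace(0.0, 2.0, 401))
    approx = cheap_lookup(table, 1.2345)   # fast inside the sampler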

Relevance:

30.00%

Publisher:

Abstract:

Large Display Arrays (LDAs) use Light Emitting Diodes (LEDs) to present information to a viewing audience. A matrix of individually driven LEDs allows the display area to show text, images and video. LDAs have undergone rapid development over the past 10 years in both the modular and semi-flexible formats. This thesis critically analyses the communication architecture and processor functionality of current LDAs and presents an alternative method: Scalable Flexible Large Display Arrays (SFLDAs). SFLDAs are more adaptable to a variety of applications because of enhancements in scalability and flexibility. Scalability is the ability to configure SFLDAs from 0.8 m² to 200 m². Flexibility is increased functionality within the processors to handle changes in configuration, and the use of a communication architecture that standardises two-way communication throughout the SFLDA. While common video platforms such as Digital Video Interface (DVI), Serial Digital Interface (SDI) and High Definition Multimedia Interface (HDMI) were considered as solutions for the communication architecture of SFLDAs, so too were modulation, fibre optics, capacitive coupling and Ethernet. From an analysis of these architectures, Ethernet was identified as the best solution. The use of Ethernet as the communication architecture in SFLDAs means that both hardware and software modules are capable of interfacing to SFLDAs. The Video to Ethernet Processor Unit (VEPU), Scoreboard, Image and Control Software (SICS), and Ethernet to LED Processor Unit (ELPU) were developed to form the key components in designing and implementing the first SFLDA. Data throughput rate and spectrophotometer tests were used to measure the effectiveness of Ethernet within the SFLDA constructs. The results of testing and analysis showed that Ethernet satisfactorily met the requirements of SFLDAs.
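To illustrate why Ethernet lets both hardware and software sources drive the array, here is a hypothetical sketch of pushing one LED module's frame over UDP; the packet layout is invented for illustration and is not the thesis's VEPU/ELPU protocol.

    import socket
    import struct

    PANEL_W, PANEL_H = 32, 32                 # one hypothetical module
    HEADER = struct.Struct("!HI")             # panel_id (u16), frame_no (u32)

    def send_frame(sock, addr, panel_id, frame_no, rgb):
        """Send one module's raw RGB frame as a single UDP datagram."""
        assert len(rgb) == PANEL_W * PANEL_H * 3
        sock.sendto(HEADER.pack(panel_id, frame_no) + rgb, addr)

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    black = bytes(PANEL_W * PANEL_H * 3)      # an all-off test frame
    send_frame(sock, ("192.0.2.10", 5000), 0, 1, black)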

Relevance:

30.00%

Publisher:

Abstract:

Relay selection for cooperative communications promises significant performance improvements and is, therefore, attracting considerable attention. While several criteria have been proposed for selecting one or more relays, distributed mechanisms that perform the selection have received relatively less attention. In this paper, we develop a novel, yet simple, asymptotic analysis of a splitting-based multiple access selection algorithm to find the single best relay. The analysis leads to simpler and alternate expressions for the average number of slots required to find the best user. By introducing a new 'contention load' parameter, the analysis shows that the parameter settings used in the existing literature can be improved upon. New and simple bounds are also derived. Furthermore, we propose a new algorithm that addresses the general problem of selecting the best Q >= 1 relays, and analyze and optimize it. Even for a large number of relays, the scalable algorithm selects the best two relays within 4.406 slots and the best three within 6.491 slots, on average. We also propose a new and simple scheme for the practically relevant case of discrete metrics. Altogether, our results develop a unifying perspective on the general problem of distributed selection in cooperative systems and several other multi-node systems.
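A simplified, binary-search flavour of splitting-based selection is sketched below; it illustrates only the idle/success/collision feedback mechanism, not the paper's algorithm, its 'contention load' parameter, or its slot-count analysis.

    import random

    def splitting_select(metrics, max_slots=64):
        """Find the index of the largest metric (assumed i.i.d. uniform
        on [0, 1)): in each slot, relays whose metric exceeds the probe
        threshold transmit, and 0/1/many feedback narrows the search."""
        lo, hi = 0.0, 1.0
        for slot in range(1, max_slots + 1):
            t = (lo + hi) / 2
            active = [i for i, m in enumerate(metrics) if m >= t]
            if len(active) == 1:      # success: a unique best relay
                return active[0], slot
            if not active:            # idle: the maximum lies below t
                hi = t
            else:                     # collision: the maximum lies above t
                lo = t
        return None, max_slots        # did not resolve (top two too close)

    metrics = [random.random() for _ in range(50)]
    best, slots = splitting_select(metrics)
    # `best` indexes the relay with the largest metric; `slots` counts
    # the mini-slots the distributed selection consumed.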