970 results for Semantics - Data processing
Abstract:
A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limit. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. This research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC makes a distinction between informational and functional content in which only the informational content is compressed. Thus, the compressed data is made transparent to existing software libraries which often rely on functional content to work. Secondly, a context-free bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets. 
In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.
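As a point of reference for the Huffman-based scheme mentioned above, the following is a minimal sketch of classic Huffman coding in Python. It is not the thesis's Approximated Huffman Compression, whose hybrid data structure and search support are not described in the abstract; the function names `huffman_code`, `encode` and `decode` are hypothetical and chosen only for illustration.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Build a classic Huffman code: a dict mapping each symbol to a bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker guarantees the dicts are never compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every code inside them.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, code):
    """Concatenate the bit strings of all symbols (bits kept as text for clarity)."""
    return "".join(code[ch] for ch in text)


def decode(bits, code):
    """Decode by accumulating bits until they match a codeword (codes are prefix-free)."""
    inverse = {v: k for k, v in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return "".join(out)
```

Frequent symbols receive short codewords, which is why the encoded text is shorter than a fixed 8-bit representation. Note that plain Huffman output is a bit stream without record boundaries, which illustrates why splitting compressed data safely across a distributed file system needs the extra design work the abstract describes.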
Abstract:
Cloud computing offers massive scalability and elasticity required by many scientific and commercial applications. Combining the computational and data handling capabilities of clouds with parallel processing also has the potential to tackle Big Data problems efficiently. Science gateway frameworks and workflow systems enable application developers to implement complex applications and make these available for end-users via simple graphical user interfaces. The integration of such frameworks with Big Data processing tools on the cloud opens new opportunities for application developers. This paper investigates how workflow systems and science gateways can be extended with Big Data processing capabilities. A generic approach based on infrastructure aware workflows is suggested and a proof of concept is implemented based on the WS-PGRADE/gUSE science gateway framework and its integration with the Hadoop parallel data processing solution based on the MapReduce paradigm in the cloud. The provided analysis demonstrates that the methods described to integrate Big Data processing with workflows and science gateways work well in different cloud infrastructures and application scenarios, and can be used to create massively parallel applications for scientific analysis of Big Data.
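The MapReduce paradigm referred to above can be illustrated with a small in-process word count. This is only a sketch of the map, shuffle and reduce phases in plain Python; it does not use Hadoop or WS-PGRADE/gUSE, and the function names are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


def word_count(records):
    """Run the three phases sequentially over an iterable of text records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())
```

In a real deployment the map and reduce calls run in parallel on different nodes and the shuffle moves data over the network; the sequential version only shows the data flow that a workflow node wrapping such a job would trigger.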
Abstract:
An array of Bio-Argo floats equipped with radiometric sensors has been recently deployed in various open ocean areas representative of the diversity of trophic and bio-optical conditions prevailing in the so-called Case 1 waters. Around solar noon and almost every day, each float acquires 0-250 m vertical profiles of Photosynthetically Available Radiation and downward irradiance at three wavelengths (380, 412 and 490 nm). Up until now, more than 6500 profiles for each radiometric channel have been acquired. As these radiometric data are collected out of the operator’s control and regardless of meteorological conditions, specific and automatic data processing protocols have to be developed. Here, we present a data quality-control procedure aimed at verifying profile shapes and providing near real-time data distribution. This procedure is specifically developed to: 1) identify the main measurement issues (i.e. dark signal, atmospheric clouds, spikes and wave-focusing occurrences); 2) validate the final data with a hierarchy of tests to ensure they are fit for scientific use. The procedure, adapted to each of the four radiometric channels, is designed to flag each profile in a way compliant with the data management procedure used by the Argo program. The main perturbations in the light field are identified by the new protocols with good performance over the whole dataset, which highlights the procedure's potential applicability at the global scale. Finally, the comparison with modeled surface irradiances allows assessing the accuracy of quality-controlled measured irradiance values and identifying any possible evolution over the float lifetime due to biofouling and instrumental drift.
Abstract:
BACKGROUND: The neonatal and pediatric antimicrobial point prevalence survey (PPS) of the Antibiotic Resistance and Prescribing in European Children project (http://www.arpecproject.eu/) aims to standardize a method for surveillance of antimicrobial use in children and neonates admitted to the hospital within Europe. This article describes the audit criteria used and reports overall country-specific proportions of antimicrobial use. An analytical review presents methodologies on antimicrobial use.
METHODS: A 1-day PPS on antimicrobial use in hospitalized children was organized in September 2011, using a previously validated and standardized method. The survey included all inpatient pediatric and neonatal beds and identified all children receiving an antimicrobial treatment on the day of survey. Mandatory data were age, gender, (birth) weight, underlying diagnosis, antimicrobial agent, dose and indication for treatment. Data were entered through a web-based system for data-entry and reporting, based on the WebPPS program developed for the European Surveillance of Antimicrobial Consumption project.
RESULTS: There were 2760 and 1565 pediatric versus 1154 and 589 neonatal inpatients reported among 50 European (n = 14 countries) and 23 non-European hospitals (n = 9 countries), respectively. Overall, pediatric and neonatal antibiotic use was significantly higher in non-European (43.8%; 95% confidence interval [CI]: 41.3-46.3% and 39.4%; 95% CI: 35.5-43.4%) compared with that in European hospitals (35.4%; 95% CI: 33.6-37.2% and 21.8%; 95% CI: 19.4-24.2%). Proportions of antibiotic use were highest in hematology/oncology wards (61.3%; 95% CI: 56.2-66.4%) and pediatric intensive care units (55.8%; 95% CI: 50.3-61.3%).
CONCLUSIONS: An Antibiotic Resistance and Prescribing in European Children standardized web-based method for a 1-day PPS was successfully developed and conducted in 73 hospitals worldwide. It offers a simple, feasible and sustainable way of data collection that can be used globally.
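The abstract does not state how its confidence intervals were computed. As an illustration only, the sketch below uses the normal-approximation (Wald) interval for a proportion, which is one common choice; the function name `wald_interval` is hypothetical.

```python
import math


def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation CI for a proportion p_hat observed in n subjects.

    z=1.96 corresponds to a two-sided 95% interval. The bounds are clipped
    to [0, 1] because the approximation can overshoot for extreme p_hat.
    """
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Non-European pediatric inpatients as reported above: 43.8% of 1565.
low, high = wald_interval(0.438, 1565)
print(f"{low:.1%} to {high:.1%}")
```

For the non-European pediatric figures (43.8% of 1565 inpatients) this gives roughly 41.3% to 46.3%, in line with the interval reported in the RESULTS. The Wald interval becomes unreliable for small samples or proportions near 0 or 1, where a Wilson or exact interval is usually preferred.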
Abstract:
Field-programmable gate arrays are ideal hosts to custom accelerators for signal, image, and data processing but demand manual register transfer level design if high performance and low cost are desired. High-level synthesis reduces this design burden but requires manual design of complex on-chip and off-chip memory architectures, a major limitation in applications such as video processing. This paper presents an approach to resolve this shortcoming. A constructive process is described that can derive such accelerators, including on- and off-chip memory storage from a C description such that a user-defined throughput constraint is met. By employing a novel statement-oriented approach, dataflow intermediate models are derived and used to support simple approaches for on-/off-chip buffer partitioning, derivation of custom on-chip memory hierarchies and architecture transformation to ensure user-defined throughput constraints are met with minimum cost. When applied to accelerators for full search motion estimation, matrix multiplication, Sobel edge detection, and fast Fourier transform, it is shown how real-time performance up to an order of magnitude in advance of existing commercial HLS tools is enabled whilst including all requisite memory infrastructure. Further, optimizations are presented that reduce the on-chip buffer capacity and physical resource cost by up to 96% and 75%, respectively, whilst maintaining real-time performance.
Abstract:
The astonishing development of diverse hardware platforms is twofold: on one side, the challenge of exascale performance for big data processing and management; on the other side, mobile and embedded devices for data collection and human-machine interaction. This has led to a highly hierarchical evolution of programming models. GVirtuS is a general virtualization system, developed in 2009 and first introduced in 2010, that provides a completely transparent layer between GPUs and VMs. This paper shows the latest achievements and developments of GVirtuS, which now supports CUDA 6.5, memory management and scheduling. Thanks to the new and improved remoting capabilities, GVirtuS now enables GPU sharing among physical and virtual machines based on x86 and ARM CPUs on local workstations, computing clusters and distributed cloud appliances.
Abstract:
A major weakness among loading models for pedestrians walking on flexible structures proposed in recent years is the various uncorroborated assumptions made in their development. This applies to spatio-temporal characteristics of pedestrian loading and the nature of multi-object interactions. To alleviate this problem, a framework for the determination of localised pedestrian forces on full-scale structures is presented using a wireless attitude and heading reference system (AHRS). An AHRS comprises a triad of tri-axial accelerometers, gyroscopes and magnetometers managed by a dedicated data processing unit, allowing motion in three-dimensional space to be reconstructed. A pedestrian loading model based on a single point inertial measurement from an AHRS is derived and shown to perform well against benchmark data collected on an instrumented treadmill. Unlike other models, the current model does not take any predefined form nor does it require any extrapolations as to the timing and amplitude of pedestrian loading. In order to assess correctly the influence of the moving pedestrian on the behaviour of a structure, an algorithm for tracking the point of application of pedestrian force is developed based on data from a single AHRS attached to a foot. A set of controlled walking tests with a single pedestrian is conducted on a real footbridge for validation purposes. A remarkably good match between the measured and simulated bridge response is found, confirming the applicability of the proposed framework.
Abstract:
This article is the result of a study that seeks to understand the relationship between socio-economic conditions, health and active ageing. Behaviours related to active ageing in relation to health were identified as were the strategies used in active ageing and their determinants. A qualitative methodology was adopted in the form of semi-structured interviews. Data processing consisted of thematic content analysis in interviews. Two socio-economic groups of elderly Cape Verdean men and women composed the study sample. Both groups totalled 22 cases. Findings indicated that the socio-economic status interferes directly in the affairs of active ageing rather than health issues. In the higher socio-economic group, it was found that status determines active ageing rather than health issues.
Abstract:
Objective: This study aims to characterize the quality of life of elderly people in the Leiria region, comparing those who live at home with those who live in institutions. To this end, we set out to characterize the study population sociodemographically; identify situational factors according to place of residence; assess levels of dependency, social support and family functioning; assess quality of life; and identify the relationship between the various variables and quality of life. Method: A questionnaire was administered to a total of 238 elderly people, 111 living in institutions and 127 living at home. Throughout the data collection process, the ethical requirements that govern our profession were observed. Descriptive and analytical statistical methods were used for data processing. Results: The results allowed the sociodemographic characterization of the elderly in the Leiria region. It was also possible to compare the two study groups, and no significant differences were found between them for the biopsychosocial variables. Conclusion: Most of the elderly people surveyed have quality of life, with those living at home showing a higher quality of life.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
Business Process Management (BPM) is able to organize and frame a company, focusing on the improvement or assurance of performance in order to gain competitive advantage. Although it is believed that BPM improves various aspects of organizational performance, there has been a lack of empirical evidence about this. The purpose of the present study is to develop a model that shows the impact of business process management on organizational performance. To accomplish that, the theoretical basis needed to identify the elements that make up BPM and the measures that can evaluate the success of BPM in terms of organizational performance is built through a systematic literature review (SLR). Then, a research model is proposed according to the SLR results. Empirical data will be collected from a survey of large and mid-sized industrial and service companies headquartered in Brazil. A quantitative analysis will be performed using structural equation modeling (SEM) to show whether the direct effects between BPM and organizational performance can be considered statistically significant. Finally, these results and their managerial and scientific implications will be discussed. Keywords: Business process management (BPM). Organizational performance. Firm performance. Business models. Structural Equation Modeling. Systematic Literature Review.
Abstract:
Recent advances in the massively parallel computational abilities of graphical processing units (GPUs) have increased their use for general purpose computation, as companies look to take advantage of big data processing techniques. This has given rise to the potential for malicious software targeting GPUs, which is of interest to forensic investigators examining the operation of software. The ability to carry out reverse-engineering of software is of great importance within the security and forensics fields, particularly when investigating malicious software or carrying out forensic analysis following a successful security breach. Due to the complexity of the Nvidia CUDA (Compute Unified Device Architecture) framework, it is not clear how best to approach the reverse engineering of a piece of CUDA software. We carry out a review of the different binary output formats which may be encountered from the CUDA compiler, and their implications on reverse engineering. We then demonstrate the process of carrying out disassembly of an example CUDA application, to establish the various techniques available to forensic investigators carrying out black-box disassembly and reverse engineering of CUDA binaries. We show that the Nvidia compiler, using default settings, leaks useful information. Finally, we demonstrate techniques to better protect intellectual property in CUDA algorithm implementations from reverse engineering.
Abstract:
Force plate (or pressure plate) analysis emerged as an innovative tool in biomechanics and sports medicine. It allows engineers, scientists and doctors to virtually reconstruct the way a person steps while running or walking, using a measuring system and a computer. With this information they can calculate and analyze a whole set of variables and factors that characterize the step. They are then able to make corrections and/or optimizations, designing appropriate shoes and insoles for the patient. The idea is to study and understand all the hardware and software implications of this process and all the components involved, and then propose an alternative solution. This solution should have at least similar performance to existing systems. It should increase the accuracy and/or the sampling frequency to obtain better results. By the end, there should be a working prototype of a pressure measuring system and a mathematical model to govern it. The cost of the system has to be lower than that of most systems on the market.
Abstract:
The marine diatom Haslea ostrearia [1] produces a water-soluble blue pigment named marennine [2] that is of economic interest. However, the lack of knowledge of the ecological conditions under which this microalga develops in its natural ecosystem, especially the interactions between bacteria and H. ostrearia, prevents any optimization of its culture in well-controlled conditions. The structure of the bacterial community was analyzed by PCR-TTGE before and after the isolation of H. ostrearia cells recovered from 4 localities, to distinguish the relative part of the biotope and the biocenose and eventually to describe the temporal dynamics of the structure of the bacterial community at two time-scales. The two H. ostrearia isolates with the largest differences in genetic fingerprints (HO-R and HO-BM) also showed the largest differences in bacterial structure [3], as the result of specific metabolomic profiles. The non-targeted metabolomic investigation showed that these profiles were more distinct in the case of bacteria-alga associations than for the H. ostrearia monoculture. Here we present a Q-TOF LC/MS metabolomic fingerprinting approach [3] to investigate differential metabolites of axenic versus non-axenic H. ostrearia cultures, and to focus on the specific metabolites of a bacterial surrounding associated with the activation or inhibition of microalgal growth. The Agilent suite of data processing software makes feature finding, statistical analysis, and identification easier. This enables rapid transformation of complex raw data into biologically relevant metabolite information.