876 results for bigdata, data stream processing, dsp, apache storm, cyber security
Abstract:
The purpose of this study was to evaluate the accuracy of the electronic apex locators Digital Signal Processing (DSP) and ProPex for root canal length determination in primary teeth. Fifteen primary molars (a total of 34 root canals) were divided into two groups: Group I - without physiological resorption (n = 16); and Group II - with physiological resorption (n = 18). The length of each canal was measured by introducing a file until its tip was visible and then retracting it 1 mm. For electronic measurement, the devices were set to 1 mm short of the apical resorption. The data were analysed statistically using the intraclass correlation coefficient (ICC). Results showed that the ICC was high for both electronic apex locators in all situations - with (ICC: DSP = 0.82 and ProPex = 0.89) or without resorption (ICC: DSP = 0.92 and ProPex = 0.90). Both apex locators were extremely accurate in determining the working length in primary teeth, both with and without physiological resorption.
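As an aside on the statistic reported above, the sketch below shows how an intraclass correlation coefficient of the ICC(2,1) form can be computed for paired length measurements; it is not the study's analysis code, and the example values are hypothetical.

```python
import numpy as np

def icc_2_1(ratings):
    """Two-way random, absolute-agreement, single-measure ICC(2,1).

    ratings: (n_subjects, k_raters) array, e.g. one column per
    measurement method (electronic device vs. reference length).
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)          # per-canal means
    col_means = x.mean(axis=0)          # per-method means

    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * np.sum((row_means - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((col_means - grand) ** 2) / (k - 1)
    resid = x - row_means[:, None] - col_means[None, :] + grand
    ms_err = np.sum(resid ** 2) / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Hypothetical example: reference vs. electronically determined lengths (mm).
reference = [17.5, 16.0, 18.2, 15.8]
device = [17.3, 16.2, 18.0, 15.9]
print(round(icc_2_1(np.column_stack([reference, device])), 2))
```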
Abstract:
In recent years, Reverse Engineering systems have attracted considerable interest for a wide range of applications. Consequently, much research activity focuses on the accuracy and precision of the acquired data and on improvements to the post-processing phase. In this context, this PhD thesis defines two novel methods for data post-processing and for data fusion between physical and geometrical information. In particular, a technique is defined for characterizing the error in the 3D point coordinates acquired by an optical triangulation laser scanner, with the aim of identifying suitable correction arrays to apply under different acquisition parameters and operating conditions. The systematic error in the acquired data is thus compensated, increasing accuracy. Moreover, the definition of a 3D thermogram is examined. The object's geometric information and its thermal properties, obtained from a thermographic inspection, are combined so that each recognizable point carries a temperature value. Data acquired by the optical triangulation laser scanner are also used to normalize the temperature values and make the thermal data independent of the thermal camera's point of view.
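The data-fusion step described above can be illustrated with a minimal sketch, assuming a simple pinhole camera model with known (hypothetical) intrinsics K and extrinsics R, t; it is not the thesis' actual method, only the general idea of attaching a temperature to each scanned 3D point.

```python
import numpy as np

def fuse_thermogram(points_xyz, thermal_img, K, R, t):
    """Attach a temperature to each 3D point by projecting it into the
    thermal image with a simple pinhole model (illustrative only).

    points_xyz : (N, 3) scanner points in world coordinates
    thermal_img: (H, W) temperature values from the thermal camera
    K, R, t    : assumed camera intrinsics and extrinsics
    """
    cam = R @ points_xyz.T + t.reshape(3, 1)          # world -> camera
    uvw = K @ cam                                     # camera -> image plane
    u = np.round(uvw[0] / uvw[2]).astype(int)
    v = np.round(uvw[1] / uvw[2]).astype(int)

    h, w = thermal_img.shape
    visible = (uvw[2] > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    temps = np.full(len(points_xyz), np.nan)
    temps[visible] = thermal_img[v[visible], u[visible]]
    # Result: one temperature value per recognizable 3D point.
    return np.column_stack([points_xyz, temps])

# Tiny usage with an identity pose and an assumed 2x2-pixel thermal image.
K = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]])
pts = np.array([[0.0, 0.0, 2.0], [5.0, 0.0, 2.0]])
img = np.array([[20.0, 21.0], [22.0, 23.0]])
print(fuse_thermogram(pts, img, K, np.eye(3), np.zeros(3)))
```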
Abstract:
Many applications in domains such as telecommunications, network security, and large-scale sensor networks require online processing of continuous data flows. These flows produce very high loads that require aggregating the processing capacity of many nodes. Current Stream Processing Engines do not scale with the input load due to single-node bottlenecks. Additionally, they are based on static configurations that lead to either under- or over-provisioning. In this paper, we present StreamCloud, a scalable and elastic stream processing engine for processing large data stream volumes. StreamCloud uses a novel parallelization technique that splits queries into subqueries that are allocated to independent sets of nodes in a way that minimizes the distribution overhead. Its elastic protocols exhibit low intrusiveness, enabling effective adjustment of resources to the incoming load. Elasticity is combined with dynamic load balancing to minimize the computational resources used. The paper presents the system design, implementation, and a thorough evaluation of the scalability and elasticity of the fully implemented system.
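A minimal sketch of the parallelization idea, not StreamCloud's implementation: each subquery gets its own set of nodes, and tuples are hash-partitioned on a key so that no single node becomes a bottleneck; the node names and the two-stage split are illustrative assumptions.

```python
import hashlib

class SubqueryCluster:
    """A set of nodes dedicated to one subquery (illustrative stand-in)."""
    def __init__(self, name, nodes):
        self.name, self.nodes = name, nodes

    def route(self, key):
        # Hash-partition on the grouping key so every node sees a
        # disjoint slice of the stream.
        h = int(hashlib.md5(str(key).encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]

    def scale(self, new_nodes):
        # Elasticity: enlarging or shrinking the node set changes the hash
        # range; a real engine would also migrate operator state.
        self.nodes = new_nodes

# A query split into two subqueries, each allocated to its own node set.
filter_stage = SubqueryCluster("filter", ["n1", "n2"])
aggregate_stage = SubqueryCluster("aggregate", ["n3", "n4", "n5"])

for src_ip in ["10.0.0.1", "10.0.0.2", "10.0.0.3"]:
    print(src_ip, "->", filter_stage.route(src_ip),
          "->", aggregate_stage.route(src_ip))
```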
Abstract:
Two complementary benchmarks have been proposed so far for the evaluation and continuous improvement of RDF stream processors: SRBench and LSBench. They put a special focus on different features of the evaluated systems, including coverage of the streaming extensions of SPARQL supported by each processor, query processing throughput, and an early analysis of query evaluation correctness based on comparing the results obtained by different processors for a set of queries. However, neither of them has analysed the operational semantics of these processors in order to assess the correctness of query evaluation results. In this paper, we propose a characterization of the operational semantics of RDF stream processors, adapting well-known models used in the stream processing engine community: CQL and SECRET. Through this formalization, we address correctness in RDF stream processor benchmarks, making it possible to determine the multiple answers that systems may correctly provide. Finally, we present CSRBench, an extension of SRBench that verifies query result correctness using an automatic method.
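The correctness issue can be illustrated with a toy time-based sliding window, not the CQL/SECRET formalization itself: two processors that align or report windows differently can both be correct yet emit different answer sequences.

```python
def sliding_window_counts(stream, width, slide, t_start, t_end):
    """Count tuples per time window [t, t + width), sliding by `slide`.

    stream: list of (timestamp, value) pairs.
    Returns one (window_start, count) pair per evaluated window.
    """
    answers = []
    t = t_start
    while t + width <= t_end:
        answers.append((t, sum(1 for ts, _ in stream if t <= ts < t + width)))
        t += slide
    return answers

stream = [(1, "a"), (2, "b"), (4, "c"), (5, "d"), (7, "e")]
# Two processors that align their windows differently can both be "correct"
# yet return different result sequences for the same query and stream:
print(sliding_window_counts(stream, width=4, slide=2, t_start=0, t_end=8))
print(sliding_window_counts(stream, width=4, slide=2, t_start=1, t_end=9))
```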
Abstract:
Collaborative Filtering is one of the most popular recommendation algorithms. Most Collaborative Filtering algorithms work with a static set of data. This paper introduces a novel approach to providing recommendations using Collaborative Filtering when user ratings are received over an incoming data stream. In an incoming stream, massive amounts of data arrive rapidly, making it impossible to save all the records for later analysis. By dynamically building a decision tree for every item as data arrive, the incoming data stream is used effectively, although an inevitable trade-off between accuracy and the amount of memory used is introduced. By adding a simple personalization step using a hierarchy of the items, it is possible to improve the ratings predicted by each decision tree and generate recommendations in real time. Empirical studies with the dynamically built decision trees show that the personalization step improves the overall prediction accuracy.
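A much simplified stand-in for the approach, not the paper's algorithm: instead of a full per-item decision tree, the sketch keeps one incremental decision stump per item (running means per user-attribute bucket), which only illustrates the bounded-memory streaming update and a hierarchy-based fallback; all names and values are hypothetical.

```python
from collections import defaultdict

class ItemStumpModel:
    """Per-item incremental predictor (a stand-in for per-item decision
    trees): one running mean of ratings per user-attribute bucket."""
    def __init__(self):
        # item_id -> bucket -> [sum_of_ratings, count]
        self.stats = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))

    def update(self, item_id, user_bucket, rating):
        s = self.stats[item_id][user_bucket]
        s[0] += rating
        s[1] += 1                      # only aggregates are kept, not records

    def predict(self, item_id, user_bucket, category_mean=3.0):
        s = self.stats[item_id].get(user_bucket)
        if s and s[1] > 0:
            return s[0] / s[1]
        # Personalization fallback: back off to the item's category
        # (hierarchy) average when the item or bucket is still unseen.
        return category_mean

model = ItemStumpModel()
for item, bucket, rating in [("i1", "heavy_rater", 4), ("i1", "heavy_rater", 5),
                             ("i1", "light_rater", 2)]:
    model.update(item, bucket, rating)      # ratings arrive as a stream
print(model.predict("i1", "heavy_rater"))   # 4.5
print(model.predict("i2", "heavy_rater"))   # falls back to the category mean
```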
Abstract:
The current optical communications network consists of point-to-point optical transmission paths interconnected with relatively low-speed electronic switching and routing devices. As the demand for capacity increases, higher-speed electronic devices will become necessary. It is, however, hard to realise electronic chip-sets above 10 Gbit/s, and therefore, to increase the achievable performance of the network, electro-optic and all-optical switching and routing architectures are being investigated. This thesis aims to provide a detailed experimental analysis of high-speed optical processing within an optical time division multiplexed (OTDM) network node. This includes the functions of demultiplexing, 'drop and insert' multiplexing, data regeneration, and clock recovery. It examines the possibilities of combining these tasks using a single device. Two optical switching technologies are explored. The first is an all-optical device known as the 'semiconductor optical amplifier-based nonlinear optical loop mirror' (SOA-NOLM). Switching is achieved by using an intense 'control' pulse to induce a phase shift in a low-intensity signal propagating through an interferometer. Simultaneous demultiplexing, data regeneration and clock recovery are demonstrated for the first time using a single SOA-NOLM. The second device is an electroabsorption (EA) modulator, which until this thesis had been used in a uni-directional configuration to achieve picosecond pulse generation, data encoding, demultiplexing, and 'drop and insert' multiplexing. This thesis presents results on the use of an EA modulator in a novel bi-directional configuration. Two independent channels are demultiplexed from a high-speed OTDM data stream using a single device. Simultaneous demultiplexing with stable, ultra-low-jitter clock recovery is demonstrated, and then used in a self-contained 40 Gbit/s 'drop and insert' node. Finally, a 10 GHz source is analysed that exploits the EA modulator's bi-directionality to increase the pulse extinction ratio to a level where it could be used in an 80 Gbit/s OTDM network.
Abstract:
Recent advances in our ability to watch the molecular and cellular processes of life in action, such as atomic force microscopy, optical tweezers and Förster fluorescence resonance energy transfer, raise challenges for digital signal processing (DSP) of the resulting experimental data. This article explores the unique properties of such biophysical time series that set them apart from other signals, such as the prevalence of abrupt jumps and steps, multi-modal distributions and autocorrelated noise. It exposes the problems with classical linear DSP algorithms applied to this kind of data, and describes new nonlinear and non-Gaussian algorithms that are able to extract information that is of direct relevance to biological physicists. It is argued that these new methods applied in this context typify the nascent field of biophysical DSP. Practical experimental examples are supplied.
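As a minimal illustration of the kind of nonlinear processing the article argues for, and not any specific method from it, the sketch below flags abrupt steps with running medians, which tolerate heavy-tailed, autocorrelated noise better than a linear (mean-based) filter; the synthetic trace and thresholds are assumptions.

```python
import numpy as np

def detect_steps(signal, half_window=25, threshold=2.0):
    """Flag abrupt jumps by comparing running medians before and after
    each sample; medians resist non-Gaussian noise better than means."""
    x = np.asarray(signal, dtype=float)
    scale = np.median(np.abs(np.diff(x))) + 1e-12   # robust noise scale
    jumps = []
    for i in range(half_window, len(x) - half_window):
        before = np.median(x[i - half_window:i])
        after = np.median(x[i:i + half_window])
        if abs(after - before) > threshold * scale:
            jumps.append(i)
    return jumps

# Synthetic stepping trace (e.g. a molecular motor advancing by one step).
rng = np.random.default_rng(0)
trace = np.concatenate([np.zeros(100), 8.0 * np.ones(100)]) + rng.normal(0, 1, 200)
print(detect_steps(trace)[:5])   # indices clustered around the true step near i = 100
```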
Abstract:
With the advent of peer-to-peer networks, and more importantly sensor networks, the desire to extract useful information from continuous and unbounded streams of data has become more prominent. For example, in tele-health applications, sensor-based data streaming systems are used to continuously and accurately monitor Alzheimer's patients and their surrounding environment. Typically, the requirements of such applications necessitate the cleaning and filtering of continuous, corrupted and incomplete data streams gathered wirelessly in dynamically varying conditions. Yet, existing data stream cleaning and filtering schemes are incapable of capturing the dynamics of the environment while simultaneously suppressing the losses and corruption introduced by uncertain environmental, hardware, and network conditions. Consequently, existing data cleaning and filtering paradigms are being challenged. This dissertation develops novel schemes for cleaning data streams received from a wireless sensor network operating under non-linear and dynamically varying conditions. The study establishes a paradigm for validating spatio-temporal associations among data sources to enhance data cleaning. To simplify the complexity of the validation process, the developed solution maps the requirements of the application onto a geometrical space and identifies the potential sensor nodes of interest. Additionally, this dissertation models a wireless sensor network data reduction system, showing that separating the data adaptation and prediction processes increases the data reduction rates. The schemes presented in this study are evaluated using simulation and information theory concepts. The results demonstrate that the dynamic conditions of the environment are better managed when validation is used for data cleaning. They also show that when a fast-convergent adaptation process is deployed, data reduction rates are significantly improved. Targeted applications of the developed methodology include machine health monitoring, tele-health, environment and habitat monitoring, intermodal transportation and homeland security.
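A toy illustration of spatial validation for stream cleaning, not the dissertation's scheme: an incoming reading is checked against the median of nearby nodes before being accepted into the cleaned stream; the node positions, radius and tolerance are made-up values.

```python
import math
import statistics

def clean_reading(node_id, value, positions, latest, radius=10.0, tol=3.0):
    """Validate one incoming reading against spatially nearby nodes.

    positions: node_id -> (x, y) location
    latest   : node_id -> most recent accepted value
    A reading that deviates strongly from its neighbours' median is
    treated as corrupted and replaced by that median (illustrative rule).
    """
    x0, y0 = positions[node_id]
    neighbours = [latest[n] for n, (x, y) in positions.items()
                  if n != node_id and n in latest
                  and math.hypot(x - x0, y - y0) <= radius]
    if len(neighbours) >= 3:
        med = statistics.median(neighbours)
        if abs(value - med) > tol:
            return med, False          # rejected: spatial association violated
    latest[node_id] = value
    return value, True                 # accepted into the cleaned stream

positions = {"s1": (0, 0), "s2": (2, 1), "s3": (3, 3), "s4": (1, 4)}
latest = {"s2": 21.0, "s3": 21.5, "s4": 20.8}
print(clean_reading("s1", 48.0, positions, latest))   # implausible spike -> neighbours' median
print(clean_reading("s1", 21.2, positions, latest))   # consistent -> accepted
```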
Abstract:
Data mining can be defined as the extraction of implicit, previously unknown, and potentially useful information from data. Numerous researchers have been developing security technology and exploring new methods to detect cyber-attacks with the DARPA 1998 dataset for Intrusion Detection and its modified versions, KDDCup99 and NSL-KDD, but until now no one has examined the performance of the top 10 data mining algorithms selected by experts in data mining. The classification learning algorithms compared in this thesis are C4.5, CART, k-NN and Naïve Bayes. The performance of these algorithms is compared in terms of accuracy, error rate and average cost on modified versions of the NSL-KDD train and test datasets, where the instances are classified into normal and four cyber-attack categories: DoS, Probing, R2L and U2R. Additionally, the most important features for detecting cyber-attacks overall and in each category are evaluated with Weka's Attribute Evaluator and ranked according to Information Gain. The results show that the classification algorithm with the best performance on the dataset is the k-NN algorithm. The most important features for detecting cyber-attacks are basic features such as the duration of a network connection in seconds, the protocol used for the connection, the network service used, the normal or error status of the connection, and the number of data bytes sent. The most important features for detecting DoS, Probing and R2L attacks are basic features, and the least important are content features, unlike U2R attacks, for which content features are the most important.
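The workflow can be sketched with scikit-learn instead of Weka (an assumption on my part; the thesis used Weka), using a hypothetical stand-in for NSL-KDD: fit k-NN, measure accuracy, and rank features by information gain estimated via mutual information.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical stand-in for NSL-KDD: rows = connections, columns = basic
# features, labels = normal plus the four attack categories.
rng = np.random.default_rng(1)
X_train = rng.random((200, 5))
y_train = rng.integers(0, 5, 200)          # normal, DoS, Probing, R2L, U2R
X_test, y_test = rng.random((50, 5)), rng.integers(0, 5, 50)

# k-NN was the best performer in the thesis; only the workflow is shown here.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, knn.predict(X_test)))

# Rank features by information gain, estimated via mutual information.
gain = mutual_info_classif(X_train, y_train, random_state=1)
names = ["duration", "protocol_type", "service", "flag", "src_bytes"]
for name, g in sorted(zip(names, gain), key=lambda p: -p[1]):
    print(f"{name}: {g:.3f}")
```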
Abstract:
Background: Gamma-band oscillations are prominently impaired in schizophrenia, but the nature of the deficit and its relationship to perceptual processes is unclear. Methods: Sixteen patients with chronic schizophrenia (ScZ) and 16 age-matched healthy controls completed a visual paradigm while magnetoencephalographic (MEG) data were recorded. Participants had to detect randomly occurring stimulus acceleration while viewing a concentric moving grating. MEG data were analyzed for spectral power (1-100 Hz) at sensor and source level to examine the brain regions involved in aberrant rhythmic activity, and for the contribution of differences in baseline activity to the generation of low- and high-frequency power. Results: Our data show reduced gamma-band power at sensor level in schizophrenia patients during stimulus processing, while alpha-band activity and the baseline spectrum were intact. Differences in oscillatory activity correlated with reduced behavioral detection rates in the schizophrenia group and higher scores on the "Cognitive Factor" of the Positive and Negative Syndrome Scale. Source reconstruction revealed that extra-striate (fusiform/lingual gyrus), but not striate (cuneus), visual cortices contributed to the reduced activity observed at sensor level in ScZ patients. Importantly, differences in stimulus-related activity were not due to differences in baseline activity. Conclusions: Our findings highlight that MEG-measured high-frequency oscillations during visual processing can be robustly identified in ScZ. Our data further suggest impairments that involve dysfunctions in ventral stream processing and a failure to increase gamma-band activity in a task context. Implications of these findings are discussed in the context of current theories of cortical-subcortical circuit dysfunctions and perceptual processing in ScZ.
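For orientation only, here is a minimal sketch of gamma-band power estimation on a single synthetic channel, not the study's MEG pipeline; the sampling rate, band limits and the simulated 60 Hz stimulus response are assumptions.

```python
import numpy as np
from scipy.signal import welch

def band_power(x, fs, fmin, fmax):
    """Average spectral power of x in [fmin, fmax] Hz (Welch estimate)."""
    f, pxx = welch(x, fs=fs, nperseg=fs)        # 1 s segments -> 1 Hz resolution
    band = (f >= fmin) & (f <= fmax)
    return np.trapz(pxx[band], f[band])

fs = 1000                                        # assumed sampling rate (Hz)
rng = np.random.default_rng(2)
baseline = rng.normal(0, 1, 2 * fs)              # pre-stimulus interval
t = np.arange(2 * fs) / fs
stimulus = rng.normal(0, 1, 2 * fs) + 0.5 * np.sin(2 * np.pi * 60 * t)  # induced 60 Hz

# Gamma-band (here 30-90 Hz) power during stimulation relative to baseline.
gamma_stim = band_power(stimulus, fs, 30, 90)
gamma_base = band_power(baseline, fs, 30, 90)
print("relative gamma change:", (gamma_stim - gamma_base) / gamma_base)
```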
Abstract:
This thesis reports on an investigation of the feasibility and usefulness of incorporating dynamic management facilities for managing sensed context data in a distributed context-aware mobile application. The investigation focuses on reducing the work required to integrate new sensed context streams into an existing context-aware architecture. Current architectures require integration work for each new stream and each new context that is encountered. This mode of operation is acceptable for current fixed architectures. However, as systems become more mobile, the number of discoverable streams increases. Without the ability to discover and use these new streams, the functionality of any given device will be limited to the streams that it knows how to decode. The integration of new streams requires that the sensed context data be understood by the current application. If the new source provides data of a type that an application currently requires, then the new source should be connected to the application without any prior knowledge of the new source. If the type is similar and can be converted, then this stream too should be appropriated by the application. Such applications are based on portable devices (phones, PDAs) providing semi-autonomous services that use data from sensors connected to the devices, plus data exchanged with other such devices and remote servers. They must handle input from a variety of sensors, refining the data locally and managing its communication from the device in volatile and unpredictable network conditions. The choice to focus on locally connected sensory input allows for the introduction of privacy and access controls; this local control can determine how the information is communicated to others. This investigation focuses on the evaluation of three approaches to sensor data management. The first system is characterised by its static management based on prepended metadata. This was the reference system: developed for a mobile system, the data were processed based on the attached metadata, and the code that performed the processing was static. The second system was developed to move away from static processing and introduce greater freedom in handling the data stream, which resulted in a heavyweight approach. The approach focused on pushing the processing of the data into a number of networked nodes rather than the monolithic design of the previous system. By creating a separate communication channel for the metadata, it is possible to be more flexible with the amount and type of data transmitted. The final system pulled the benefits of the other systems together: by providing a small management class that loads a separate handler based on the incoming data, dynamism was maximised whilst maintaining ease of code understanding. The three systems were then compared to highlight their ability to dynamically manage new sensed context. The evaluation took two approaches. The first is a quantitative analysis of the code to understand the relative complexity of the three systems; this was done by evaluating what changes to the system were involved for a new context. The second takes a qualitative view of the work required by the software engineer to reconfigure the systems to provide support for a new data stream. The evaluation highlights the scenarios in which each of the three systems is most suited. There is always a trade-off in the development of a system, and the three approaches highlight this fact. The creation of a statically bound system can be quick to develop but may need to be completely rewritten if the requirements move too far. Alternatively, a highly dynamic system may be able to cope with new requirements, but the developer time to create such a system may be greater than the creation of several simpler systems.
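The third system's dispatch idea can be sketched as follows, with hypothetical class and stream-type names: a small manager selects a handler from the incoming stream's metadata, so new stream types are supported by registering a handler at runtime rather than rewriting the application.

```python
class StreamManager:
    """Minimal manager that dispatches incoming sensed-context data to a
    handler chosen from the stream's metadata (hypothetical names)."""
    def __init__(self):
        self._handlers = {}

    def register(self, data_type, handler):
        # New stream types are supported by registering a handler at
        # runtime; the manager itself never changes.
        self._handlers[data_type] = handler

    def dispatch(self, metadata, payload):
        handler = self._handlers.get(metadata.get("type"))
        if handler is None:
            raise KeyError(f"no handler for stream type {metadata.get('type')!r}")
        return handler(payload)

manager = StreamManager()
manager.register("gps", lambda p: {"lat": p[0], "lon": p[1]})
manager.register("temperature", lambda p: {"celsius": (p - 32) * 5 / 9})

print(manager.dispatch({"type": "gps"}, (55.86, -4.25)))
print(manager.dispatch({"type": "temperature"}, 68.0))
```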
Abstract:
Urbanization and the ability to manage for a sustainable future present numerous challenges for geographers and planners in metropolitan regions. Remotely sensed data are inherently suited to provide information on urban land cover characteristics, and their change over time, at various spatial and temporal scales. Data models for establishing the range of urban land cover types and their biophysical composition (vegetation, soil, and impervious surfaces) are integrated to provide a hierarchical approach to classifying land cover within urban environments. These data also provide an essential component for current simulation models of urban growth patterns, as both calibration and validation data. The first stages of the approach have been applied to examine urban growth between 1988 and 1995 for a rapidly developing area in southeast Queensland, Australia. Landsat Thematic Mapper image data provided accurate (83% adjusted overall accuracy) classification of broad land cover types and their change over time. The combination of commonly available remotely sensed data, image processing methods, and emerging urban growth models highlights an important application for current and next generation moderate spatial resolution image data in studies of urban environments.
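A toy per-pixel rule, not the study's classification procedure, illustrating the hierarchical idea of labelling urban land cover from vegetation, impervious-surface and soil fractions; the fractions and thresholds are arbitrary.

```python
import numpy as np

def classify_vis(veg, imp, soil):
    """Assign a coarse land-cover label per pixel from its vegetation,
    impervious and soil fractions (toy thresholds, illustration only)."""
    labels = np.full(veg.shape, "mixed", dtype=object)
    labels[veg > 0.6] = "vegetation"
    labels[imp > 0.6] = "built-up"
    labels[soil > 0.6] = "bare soil"
    labels[(imp > 0.3) & (veg > 0.3)] = "residential"   # mixed urban class
    return labels

veg = np.array([[0.8, 0.1], [0.4, 0.2]])
imp = np.array([[0.1, 0.7], [0.4, 0.2]])
soil = np.array([[0.1, 0.2], [0.2, 0.7]])
print(classify_vis(veg, imp, soil))
```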
Abstract:
This paper presents an electricity medium voltage (MV) customer characterization framework supported by knowledge discovery in databases (KDD). The main idea is to identify typical load profiles (TLP) of MV consumers and to develop a rule set for the automatic classification of new consumers. To achieve our goal, a methodology is proposed consisting of several steps: data pre-processing; application of several clustering algorithms to segment the daily load profiles; selection of the best partition, corresponding to the best consumers' segmentation, based on the assessment of several clustering validity indices; and, finally, construction of a classification model based on the resulting clusters. To validate the proposed framework, a case study using a real database of MV consumers is performed.
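A sketch of the framework's steps under assumed data (hypothetical 24-hour load profiles), using k-means, the silhouette index as one possible validity index, and a decision tree as the classification model; the paper's actual algorithm choices may differ.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.tree import DecisionTreeClassifier

# Hypothetical daily load profiles: one row per MV consumer, 24 hourly values.
rng = np.random.default_rng(3)
profiles = np.vstack([
    rng.normal(1.0, 0.1, (40, 24)) * np.r_[np.ones(8), 3 * np.ones(10), np.ones(6)],      # daytime peak
    rng.normal(1.0, 0.1, (40, 24)) * np.r_[3 * np.ones(6), np.ones(12), 3 * np.ones(6)],  # night peak
])

# Try several partitions and keep the one with the best validity index.
best_k, best_score, best_labels = None, -1.0, None
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(profiles)
    score = silhouette_score(profiles, labels)
    if score > best_score:
        best_k, best_score, best_labels = k, score, labels

# Build a rule-based model that assigns new consumers to a typical load profile.
classifier = DecisionTreeClassifier(max_depth=3).fit(profiles, best_labels)
print("best k:", best_k, "silhouette:", round(best_score, 2))
print("new consumer ->", classifier.predict(profiles[:1]))
```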
Abstract:
Excessive energy consumption is faced by modern society as a socio-economic and environmental problem of the present day. The situation is worsening, as it is becoming clear that energy prices tend to increase every year. It is also noticeable that people who are not necessarily proficient in technology are unable to know where savings can be achieved, due to the absence of accessible awareness mechanisms. One of the home user's concerns is to balance the need to reduce energy consumption with carrying out the same activities with full comfort and work efficiency. The common techniques to reduce consumption are to use less wasteful equipment, to switch the equipment to a more economical programme, or to disconnect appliances that are not needed at the moment. However, there is no direct feedback from these actions, which leads to a situation where the user is not aware of the influence that these techniques have on the electricity bill. With the intention of giving users some control over home consumption, Energy Management Systems (EMS) were developed. These systems provide access to consumption information and help users understand where energy is wasted. However, some studies have shown that these systems present a clear mismatch between the information that is displayed and the information the user finds useful for daily life, leading users to lose motivation to use them. In order to create a solution more oriented towards users' demands, a specially tailored domain-specific language (DSL) was implemented. This solution allows users to acquire the information they consider useful by constructing questions about their energy consumption. The development of this language, following the Model Driven Development (MDD) approach, took into consideration the ideas of facility managers and home users in the design and validation phases. These opinions were gathered through meetings with experts and a survey conducted for the purpose of collecting statistics about what home users want to know.
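A toy interpreter for the kind of question such a DSL might accept, not the thesis' actual grammar; the question shape, appliance names and consumption log are invented for illustration.

```python
import re

# Hypothetical hourly consumption log: (appliance, hour, kWh).
LOG = [("fridge", h, 0.12) for h in range(24)] + \
      [("oven", 19, 1.8), ("oven", 20, 1.5), ("heater", 21, 2.2)]

def answer(question):
    """Interpret questions shaped like:
    'how much did <appliance> consume between <h1> and <h2>?'
    (a toy stand-in for a consumption-question DSL)."""
    m = re.match(r"how much did (\w+) consume between (\d+) and (\d+)", question)
    if not m:
        return "question not understood"
    appliance, h1, h2 = m.group(1), int(m.group(2)), int(m.group(3))
    kwh = sum(e for a, h, e in LOG if a == appliance and h1 <= h <= h2)
    return f"{appliance} used {kwh:.2f} kWh between {h1}h and {h2}h"

print(answer("how much did oven consume between 18 and 21?"))
print(answer("how much did fridge consume between 0 and 23?"))
```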
Abstract:
Of the approximately 25,000 bridges in Iowa, 28% are classified as structurally deficient, functionally obsolete, or both. Because many Iowa bridges require repair or replacement with a relatively limited funding base, there is a need to develop new bridge materials that may lead to longer life spans and reduced life-cycle costs. In addition, new and effective methods for determining the condition of structures are needed to identify when the useful life has expired or other maintenance is needed. Due to its unique alloy blend, high-performance steel (HPS) has been shown to have better weldability, weathering capability, and fracture toughness than conventional structural steels. Since the development of HPS in the mid-1990s, numerous bridges using HPS girders have been constructed, and many have been built economically. The East 12th Street Bridge, which replaced a deteriorated box girder bridge, is Iowa's first bridge constructed using HPS girders. The new structure is a two-span bridge that crosses I-235 in Des Moines, Iowa, providing one lane of traffic in each direction. A remote, continuous, fiber-optic-based structural health monitoring (SHM) system for the bridge was developed using off-the-shelf technologies. In the system, sensors strategically located on the bridge collect raw strain data and then transfer the data via wireless communication to a gateway system at a nearby secure facility. The data are integrated and converted to text files before being uploaded automatically to a website that provides live strain data and a live video stream. A data storage/processing system at the Bridge Engineering Center in Ames, Iowa, permanently stores and processes the data files. Several processes are performed to check the overall system's operation, eliminate temperature effects from the complete strain record, compute the global behavior of the bridge, and count strain cycles at the various sensor locations.
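Two of the listed processing steps can be sketched as follows (this is not the Bridge Engineering Center's software): temperature effects are removed by regressing strain on a temperature channel, and cycles are counted with a simplified range count rather than full rainflow counting; all signals and thresholds are hypothetical.

```python
import numpy as np

def remove_temperature_effects(strain, temperature):
    """Subtract the temperature-correlated component of the strain record
    via a least-squares fit (strain ~ a * temperature + b)."""
    a, b = np.polyfit(temperature, strain, 1)
    return strain - (a * temperature + b)

def count_ranges(strain, threshold):
    """Simplified cycle count: number of peak-to-valley excursions whose
    range exceeds `threshold` (a stand-in for full rainflow counting)."""
    d = np.diff(strain)
    turning = np.where(np.diff(np.sign(d)) != 0)[0] + 1     # local extrema
    ranges = np.abs(np.diff(strain[turning]))
    return int(np.sum(ranges > threshold))

# Hypothetical record: slow thermal drift plus occasional traffic-induced spikes.
t = np.linspace(0, 24, 24 * 60)                      # one day, 1-minute samples
temperature = 15 + 10 * np.sin(2 * np.pi * t / 24)
traffic = 40 * (np.sin(2 * np.pi * t * 6) > 0.995)   # sparse load events
strain = 8.0 * temperature + traffic + np.random.default_rng(4).normal(0, 1, t.size)

compensated = remove_temperature_effects(strain, temperature)
print("strain cycles above 20 (arbitrary units):", count_ranges(compensated, 20))
```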