944 results for data gathering algorithm


Relevance:

100.00%

Publisher:

Abstract:

Background: Since the multi-relational approach has emerged as an alternative for analyzing structured data such as relational databases, allowing data mining to be applied directly across multiple tables and thus avoiding expensive join operations and semantic losses, this work proposes an algorithm with a multi-relational approach. Methods: Aiming to compare the performance of the traditional and multi-relational approaches to mining association rules, this paper presents an empirical study of PatriciaMine, a traditional algorithm, and its proposed multi-relational counterpart, MR-Radix. Results: This work showed the performance advantages of the multi-relational approach over several tables, which avoids the high cost of join operations across multiple tables and the accompanying semantic losses. MR-Radix runs faster than PatriciaMine, despite handling complex multi-relational patterns. Memory usage shows a more conservative growth curve for MR-Radix than for PatriciaMine: an increasing number of frequent items in MR-Radix does not cause the significant growth in memory usage seen in PatriciaMine. Conclusion: The comparative study between PatriciaMine and MR-Radix confirmed the efficacy of the multi-relational approach to data mining, in terms of both execution time and memory usage. Moreover, the proposed multi-relational algorithm, unlike other algorithms of this approach, is efficient for use in large relational databases.
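
The core idea motivating this line of work can be illustrated with a toy example. The sketch below computes item support by following the foreign key that links two tables directly, never materializing the joined relation; it illustrates the general multi-relational idea only and is not the MR-Radix algorithm, and the table contents are hypothetical.

```python
# A toy illustration of multi-relational support counting: the support
# of each item is computed by following the foreign key that links the
# two tables, without materializing an expensive join. This sketches
# the general multi-relational idea only, not MR-Radix itself; the
# tables are hypothetical.
from collections import defaultdict

customers = {1: "alice", 2: "bob", 3: "carol"}  # customer table
orders = [                                      # order table: (customer_id, item)
    (1, "bread"), (1, "milk"),
    (2, "bread"), (2, "beer"),
    (3, "bread"), (3, "milk"),
]

def item_support(orders, n_customers):
    """Fraction of customers buying each item, via the foreign key."""
    buyers = defaultdict(set)
    for customer_id, item in orders:
        buyers[item].add(customer_id)
    return {item: len(ids) / n_customers for item, ids in buyers.items()}

support = item_support(orders, len(customers))
frequent = {i: s for i, s in support.items() if s >= 0.6}
print(frequent)  # {'bread': 1.0, 'milk': 0.67} (approximately)
```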

Relevance:

100.00%

Publisher:

Abstract:

In the era of the Internet of Everything, a user with a handheld or wearable device equipped with sensing capability has become a producer as well as a consumer of information and services. The more powerful these devices get, the more likely it is that they will generate and share content locally, leading to the presence of distributed information sources and the diminishing role of centralized servers. Current practice relies on infrastructure acting as an intermediary, providing access to the data. However, infrastructure-based connectivity might not always be available or the best alternative. Moreover, it is often the case that the data and the processes acting upon them are of local scope. Queries about a nearby object, an information source, a process, an experience, an ability, etc. could be answered locally without reliance on infrastructure-based platforms. The data might have limited temporal validity and be bound to a geographical area and/or the social context in which the user is immersed. In this envisioned scenario users could interact locally without the need for a central authority; hence the claim of an infrastructure-less, provider-less platform. The data is owned by the users and consulted locally, as opposed to the current approach of making it available globally and retaining it forever. From a technical viewpoint, this network resembles a Delay/Disruption Tolerant Network in which consumers and producers, possibly decoupled in space and time, exchange information with each other in an ad-hoc fashion. To this end, we propose novel data gathering and dissemination strategies for use in urban-wide environments which do not rely on strict infrastructure mediation. While preserving the general aspects of our study and without loss of generality, we focus our attention on practical application scenarios which help us capture the characteristics of opportunistic communication networks.
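
A minimal sketch of the store-carry-forward exchange underlying such a delay-tolerant, infrastructure-less design is given below; the class, node names, and message are illustrative assumptions, not part of the cited work.

```python
# Store-carry-forward dissemination in an infrastructure-less network:
# when two nodes meet opportunistically, each absorbs the messages the
# other carries, so content spreads epidemic-style without any server.
class Node:
    def __init__(self, name):
        self.name = name
        self.buffer = set()      # messages carried while moving around

    def publish(self, msg):
        self.buffer.add(msg)     # locally produced content

    def encounter(self, other):
        """Ad-hoc contact: both nodes synchronize their buffers."""
        self.buffer |= other.buffer
        other.buffer |= self.buffer

a, b, c = Node("a"), Node("b"), Node("c")
a.publish("road closed at 5th ave")
a.encounter(b)    # a meets b: b now carries the message
b.encounter(c)    # b later meets c: the message reaches c
print(c.buffer)   # {'road closed at 5th ave'}, delivered with no server
```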

Relevance:

100.00%

Publisher:

Abstract:

The overarching goal of the Pathway Semantics Algorithm (PSA) is to improve the in silico identification of clinically useful hypotheses about molecular patterns in disease progression. By framing biomedical questions within a variety of matrix representations, PSA has the flexibility to analyze combined quantitative and qualitative data over a wide range of stratifications. The resulting hypothetical answers can then move to in vitro and in vivo verification, research assay optimization, clinical validation, and commercialization. Herein PSA is shown to generate novel hypotheses about the significant biological pathways in two disease domains, shock/trauma and hemophilia A, and these hypotheses are validated experimentally in the latter. The PSA matrix algebra approach identified differential molecular patterns in biological networks over time and outcome that would not be easily found through direct assays, literature searches, or database searches. In this dissertation, Chapter 1 provides a broad overview of the background and motivation for the study, followed by Chapter 2 with a literature review of relevant computational methods. Chapters 3 and 4 describe PSA for node and edge analysis, respectively, and apply the method to disease progression in shock/trauma. Chapter 5 demonstrates the application of PSA to hemophilia A and the validation with experimental results. The work is summarized in Chapter 6, followed by extensive references and an Appendix with additional material.
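
The abstract gives no implementation detail, but the flavor of a matrix-algebra approach to differential network patterns can be suggested with a small sketch: compare adjacency matrices of a pathway under two conditions and read off gained and lost edges. Everything below (node names, matrices) is hypothetical, and this is not the published PSA procedure.

```python
# Differential network analysis via matrix algebra: represent a pathway
# as an adjacency matrix under two conditions (e.g. two time points or
# outcomes) and inspect the difference matrix for edge changes.
import numpy as np

nodes = ["IL6", "TNF", "F8", "THBD"]   # hypothetical molecules
A_early = np.array([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]])
A_late = np.array([[0, 1, 1, 0],
                   [1, 0, 0, 0],
                   [1, 0, 0, 1],
                   [0, 0, 1, 0]])

D = A_late - A_early   # +1 where an edge appears, -1 where one vanishes
gained = [(nodes[i], nodes[j]) for i, j in zip(*np.where(D > 0)) if i < j]
lost = [(nodes[i], nodes[j]) for i, j in zip(*np.where(D < 0)) if i < j]
print("gained:", gained, "lost:", lost)
```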

Relevance:

100.00%

Publisher:

Abstract:

Energy consumption has been a key concern of data gathering in wireless sensor networks. Previous research shows that modulation scaling is an efficient technique for reducing energy consumption. However, such a technique also affects both packet delivery latency and packet loss, and may therefore have adverse effects on application quality. In this paper, we study the problem of energy optimization through modulation scaling. A mathematical model is proposed to analyze the impact of modulation scaling on the overall energy consumption, end-to-end mean delivery latency, and mean packet loss rate. A centralized optimal management mechanism is developed based on the model, which adaptively adjusts the modulation levels to minimize energy consumption while ensuring the QoS for data gathering. Experimental results show that the management mechanism saves significant energy in all the investigated scenarios. Some valuable results are also observed in the experiments. © 2004 IEEE.
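
The trade-off the management mechanism navigates can be sketched with a common textbook energy model, not necessarily the paper's exact model: raising the modulation level sends each bit faster but demands exponentially more RF power. All constants and the latency budget below are illustrative assumptions.

```python
# Modulation scaling trade-off under a simple energy model: with b bits
# per symbol, each bit takes 1/(b*Rs) seconds, electronics power is
# roughly constant, and RF power grows roughly like 2**b - 1.
C_ELEC, C_RF, SYMBOL_RATE = 1e-7, 3e-8, 1e6   # hypothetical constants

def energy_per_bit(b):
    """Energy (J/bit) at modulation level b (bits per symbol)."""
    t_bit = 1.0 / (b * SYMBOL_RATE)            # transmit time per bit
    p_elec = C_ELEC * SYMBOL_RATE              # electronics power
    p_rf = C_RF * (2 ** b - 1) * SYMBOL_RATE   # RF output power
    return (p_elec + p_rf) * t_bit

def latency_per_bit(b):
    return 1.0 / (b * SYMBOL_RATE)

# Core decision of a centralized manager: the lowest-energy modulation
# level that still meets the per-link latency (QoS) budget.
latency_budget = 2.5e-7                        # seconds per bit (assumed)
feasible = [b for b in range(2, 11) if latency_per_bit(b) <= latency_budget]
best = min(feasible, key=energy_per_bit)
print(best, energy_per_bit(best))              # b = 4 under these constants
```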

Relevance:

100.00%

Publisher:

Abstract:

In data-gathering wireless sensor networks, data loss often happens due to external faults such as random link faults and hazard node faults, since sensor nodes have constrained resources and are often deployed in inhospitable environments. However, known fault tolerance mechanisms often introduce new internal faults (e.g. out-of-power faults and collisions on the wireless bandwidth) into the original network and dissipate considerable extra energy and time in reducing data loss. Therefore, we propose a novel Dual Cluster Heads Cooperation (CoDuch) scheme to tolerate external faults while introducing fewer internal faults and dissipating less extra energy and time. In the CoDuch scheme, dual cluster heads cooperate with each other to reduce extra costs by sending only one copy of sensed data to the Base Station; also, dual cluster heads check errors with each other during the data collection process. Two algorithms are developed based on the CoDuch scheme: CoDuch-l for tolerating link faults and CoDuch-b for tolerating both link faults and node faults; theoretical and experimental studies validate their effectiveness and efficiency. © 2010 The Author. Published by Oxford University Press on behalf of The British Computer Society. All rights reserved.
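
A toy rendering of the cooperation idea is sketched below: each member delivers its reading to two heads over independently lossy links, the heads cross-check the copies, and only one verified copy goes on to the base station. This illustrates the dual-head concept under assumed loss rates; it is not the published CoDuch protocol.

```python
# Dual-cluster-head cooperation, toy version: data is lost only when
# both independently faulty links drop the same reading, and the
# primary head forwards exactly one (cross-checked) copy per reading.
import random

def faulty_link(value, loss_prob=0.2):
    """Deliver value, or None if the link drops the packet."""
    return None if random.random() < loss_prob else value

def gather_round(readings, loss_prob=0.2):
    delivered = []
    for reading in readings:
        copy_a = faulty_link(reading, loss_prob)  # to primary head
        copy_b = faulty_link(reading, loss_prob)  # to secondary head
        if copy_a is not None and copy_b is not None and copy_a == copy_b:
            delivered.append(copy_a)              # cross-check passed
        elif copy_a is not None or copy_b is not None:
            delivered.append(copy_a if copy_a is not None else copy_b)
        # else: both copies lost -- the only case where data is lost
    return delivered   # one copy of each surviving reading goes onward

random.seed(1)
print(len(gather_round(list(range(100)))))  # ~96 of 100 readings survive
```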

Relevance:

100.00%

Publisher:

Abstract:

Adaptive filtering is a primary method for filtering the electrocardiogram (ECG), because it does not require knowledge of the signal's statistical characteristics. In this paper, an adaptive filtering technique for denoising the ECG based on a Genetic Algorithm (GA) tuned Sign-Data Least Mean Square (SD-LMS) algorithm is proposed. This technique minimizes the mean-squared error between the primary input, which is a noisy ECG, and a reference input, which can be either noise that is correlated in some way with the noise in the primary input or a signal that is correlated only with the ECG in the primary input. Noise is used as the reference signal in this work. The algorithm was applied to records from the MIT-BIH Arrhythmia database for removing baseline wander and 60 Hz power line interference. The proposed algorithm gave an average signal-to-noise ratio improvement of 10.75 dB for baseline wander and 24.26 dB for power line interference, which is better than previously reported work.
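
A minimal, self-contained sketch of an SD-LMS noise canceller of the kind described here is given below. The step size mu is fixed for simplicity (in the paper it is the quantity tuned by the GA), and the ECG and interference signals are synthetic stand-ins rather than MIT-BIH records.

```python
# Sign-data LMS adaptive noise cancellation: the primary input is noisy
# ECG, the reference input is correlated noise, and the filter's error
# signal is the denoised ECG. Update rule: w <- w + mu * e * sign(x).
import numpy as np

fs, n = 360, 2000                              # sample rate (Hz), length
t = np.arange(n) / fs
ecg = np.sin(2 * np.pi * 1.0 * t) ** 15        # crude ECG-like spikes
powerline = 0.5 * np.sin(2 * np.pi * 60 * t)   # 60 Hz interference
primary = ecg + powerline                      # noisy ECG (primary input)
reference = np.sin(2 * np.pi * 60 * t + 0.3)   # correlated noise reference

def sd_lms(primary, reference, order=8, mu=0.01):
    """Sign-data LMS: cheap because the data vector enters the weight
    update only through its sign."""
    w = np.zeros(order)
    out = np.zeros_like(primary)
    for i in range(order, len(primary)):
        x = reference[i - order:i][::-1]   # most recent samples first
        e = primary[i] - w @ x             # error = denoised ECG sample
        w += mu * e * np.sign(x)           # sign-data update rule
        out[i] = e
    return out

denoised = sd_lms(primary, reference)
# Variance of the 60 Hz residue before vs. after filtering:
print(np.var(primary[-500:] - ecg[-500:]), np.var(denoised[-500:] - ecg[-500:]))
```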

Relevance:

100.00%

Publisher:

Abstract:

The modelling of nonlinear stochastic dynamical processes from data involves solving the problems of data gathering, preprocessing, model architecture selection, learning or adaptation, parametric evaluation, and model validation. For a given model architecture such as associative memory networks, a common problem in nonlinear modelling is the curse of dimensionality. A series of complementary data-based constructive identification schemes, mainly based on, but not limited to, operating-point-dependent fuzzy models, are introduced in this paper with the aim of overcoming the curse of dimensionality. These include (i) a mixture-of-experts algorithm based on a forward constrained regression algorithm; (ii) an inherently parsimonious piecewise local linear modelling concept based on a Delaunay input-space partition; (iii) a neurofuzzy model constructive approach based on forward orthogonal least squares and optimal experimental design; and finally (iv) a neurofuzzy model construction algorithm based on Bézier-Bernstein polynomial basis functions and additive decomposition. Illustrative examples demonstrate their applicability, showing that the final major hurdle in data-based modelling has almost been removed.
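
The forward orthogonal least squares step shared by schemes (iii) and (iv) can be sketched compactly: candidate regressors are orthogonalized against the already-selected set and ranked by how much remaining output energy they explain. The data and candidate set below are illustrative, and this is a bare-bones version of the classical procedure rather than the paper's full method.

```python
# Forward orthogonal least squares (OLS) selection: greedily add the
# candidate whose component orthogonal to the chosen set explains the
# most remaining output energy (its error-reduction ratio).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))   # 10 candidate regressors (illustrative)
y = 2.0 * X[:, 3] - 1.5 * X[:, 7] + 0.1 * rng.normal(size=200)

def forward_ols(X, y, n_select):
    selected, Q = [], []
    for _ in range(n_select):
        best, best_err = None, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            q = X[:, j].copy()
            for qk in Q:                    # orthogonalize vs. chosen set
                q -= (qk @ q) / (qk @ qk) * qk
            err = (q @ y) ** 2 / (q @ q)    # energy of y explained by q
            if err > best_err:
                best, best_err, best_q = j, err, q
        selected.append(best)
        Q.append(best_q)
    return selected

print(forward_ols(X, y, 2))  # recovers columns 3 and 7
```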

Relevance:

90.00%

Publisher:

Abstract:

Wireless sensor networks generally rely on a many-to-one communication approach for data gathering. This approach is extremely susceptible to sinkhole attacks, in which an intruder attracts surrounding nodes with false routing information and subsequently performs selective forwarding or alters the data passing through it. A sinkhole attack poses a serious threat to sensor networks, especially considering that the sensor nodes are mostly spread out in open areas and have weak computation and battery power. In order to detect the intruder in a sinkhole attack, this paper suggests an algorithm which first finds a group of suspected nodes by analyzing the consistency of data. The intruder is then recognized efficiently within the group by checking the network flow information. The proposed algorithm's performance has been evaluated using numerical analysis and simulations, verifying its accuracy and efficiency.
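
The two stages can be mocked up in a few lines: flag nodes whose reported data breaks neighbourhood consistency, then pick the suspect attracting the most routing flows. The readings, topology, and threshold below are hypothetical, and the consistency test is a deliberately crude stand-in for the paper's analysis.

```python
# Two-stage sinkhole detection, toy version: (1) suspects = nodes whose
# readings deviate strongly from the area median, (2) intruder = the
# suspect that the most routing flows converge on.
import statistics

readings = {"n1": 20.1, "n2": 19.8, "n3": 35.0, "n4": 20.3, "n5": 34.2}
next_hop = {"n1": "n3", "n2": "n3", "n4": "n3", "n5": "n3"}

def find_intruder(readings, next_hop, threshold=5.0):
    med = statistics.median(readings.values())
    # Stage 1: nodes whose data breaks area consistency.
    suspects = {n for n, v in readings.items() if abs(v - med) > threshold}
    # Stage 2: the suspect drawing the most inbound flows.
    inflow = {n: sum(1 for h in next_hop.values() if h == n) for n in suspects}
    return max(inflow, key=inflow.get) if inflow else None

print(find_intruder(readings, next_hop))  # 'n3'
```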

Relevance:

90.00%

Publisher:

Abstract:

Ubiquitous computing is about making computers and computerized artefacts a pervasive part of our everyday lives, bringing more and more activities into the realm of information. The computationalization and informationalization of everyday activities increases not only our reach, efficiency and capabilities but also the amount and kinds of data gathered about us and our activities. In this thesis, I explore how information systems can be constructed so that they handle this personal data in a reasonable manner. The thesis provides two kinds of results: on the one hand, tools and methods for both the construction and the evaluation of ubiquitous and mobile systems; on the other hand, an evaluation of the privacy aspects of a ubiquitous social awareness system. The work emphasises real-world experiments as the most important way to study privacy. Additionally, the state of current information systems as regards data protection is studied.

The tools and methods in this thesis consist of three distinct contributions. An algorithm for locationing in cellular networks is proposed that does not require the location information to be revealed beyond the user's terminal. A prototyping platform for the creation of context-aware ubiquitous applications, called ContextPhone, is described and released as open source. Finally, a set of methodological findings for the use of smartphones in social scientific field research is reported. A central contribution of this thesis is the set of pragmatic tools that allow other researchers to carry out experiments.

The evaluation of the ubiquitous social awareness application ContextContacts covers both the usage of the system in general and an analysis of its privacy implications. The usage of the system is analyzed, based on several long-term field studies, in the light of how users make inferences about others from the real-time contextual cues mediated by the system. The analysis of privacy implications draws together the social psychological theory of self-presentation and research in privacy for ubiquitous computing, deriving a set of design guidelines for such systems.

The main findings from these studies can be summarized as follows. The fact that ubiquitous computing systems gather more data about users can be used not only to study the use of such systems in an effort to create better systems, but also to study previously unstudied phenomena, such as the dynamic change of social networks. Systems that let people create new ways of presenting themselves to others can be fun for the users, but self-presentation requires several thoughtful design decisions that allow the manipulation of the image mediated by the system. Finally, the growing amount of computational resources available to users can be used to let them work with the data themselves, rather than remain passive subjects of data gathering.

Relevance:

90.00%

Publisher:

Abstract:

In lake-rich regions, the gathering of information about water quality is challenging because only a small proportion of the lakes can be assessed each year by conventional methods. One technique for improving the spatial and temporal representativeness of lake monitoring is remote sensing from satellites and aircraft. The experimental material included detailed optical measurements in 11 lakes, airborne and spaceborne remote sensing measurements with concurrent field sampling, automatic raft measurements, and a national dataset of routine water quality measurements from over 1100 lakes. Analyses of the spatially high-resolution airborne remote sensing data from eutrophic and mesotrophic lakes showed that one or a few discrete water quality observations obtained by conventional monitoring can yield a clear over- or underestimation of the overall water quality in a lake. The use of TM-type satellite instruments in addition to routine monitoring substantially increases the number of lakes for which water quality information can be obtained. Preliminary results indicated that coloured dissolved organic matter (CDOM) can be estimated with TM-type satellite instruments, which could possibly be utilised as an aid in estimating the role of lakes in global carbon budgets. Based on the results of reflectance modelling and experimental data, the MERIS satellite instrument has optimal or near-optimal channels for the estimation of turbidity, chlorophyll a, and CDOM in Finnish lakes. MERIS images with a 300 m spatial resolution can provide water quality information for different parts of large and medium-sized lakes, filling in the gaps left by conventional monitoring. Algorithms that do not require simultaneous field data for algorithm training would increase the amount of remote sensing-based information available for lake monitoring. The MERIS Boreal Lakes processor, trained with the optical data and concentration ranges provided by this study, enabled turbidity estimation with good accuracy without the need for algorithm correction with field measurements, while chlorophyll a and CDOM estimation requires further development of the processor. The accuracy of interpreting chlorophyll a via semi-empirical algorithms can be improved by classifying lakes prior to interpretation according to their CDOM level and trophic status. Optical modelling indicated that the spectral diffuse attenuation coefficient can be estimated with reasonable accuracy from the measured water quality concentrations. This provides more detailed information on light attenuation from routine monitoring measurements than is available through Secchi disk transparency. The results of this study improve the interpretation of lake water quality by remote sensing and encourage the use of remote sensing in lake monitoring.
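
For readers unfamiliar with the semi-empirical algorithms mentioned above, the sketch below shows their typical shape: a band ratio regressed against concurrent field samples, then applied to new pixels. The band pair, sample values, and fitted coefficients are illustrative assumptions, not values from this study.

```python
# Typical shape of a semi-empirical water quality algorithm: fit a
# simple model chl = a * band_ratio + b against field samples, then
# apply it to new satellite pixels. All numbers are illustrative.
import numpy as np

# Hypothetical concurrent field samples: band ratio vs. measured chl-a.
ratio_705_665 = np.array([0.8, 1.0, 1.3, 1.7, 2.1])
chl_measured = np.array([3.0, 8.0, 16.0, 27.0, 38.0])  # ug/L

a, b = np.polyfit(ratio_705_665, chl_measured, 1)  # fit the model

def estimate_chl(ratio):
    """Apply the fitted band-ratio model to a new pixel."""
    return a * ratio + b

print(round(estimate_chl(1.5), 1))  # estimated chl-a for a new pixel
```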

Relevance:

90.00%

Publisher:

Abstract:

Based on an ultrasound-modulated optical tomography experiment, a direct, quantitative recovery of Young's modulus (E) is achieved from the modulation depth (M) in the intensity autocorrelation. The number of detector locations is limited to two in orthogonal directions, reducing the complexity of the data gathering step whilst guarding against impoverishment of the measurement by employing the ultrasound frequency as a parameter varied during data collection. M and E are related via two partial differential equations: the first connects M to the amplitude of vibration of the scattering centers in the focal volume, and the second connects this amplitude to E. A (composite) sensitivity matrix is arrived at, mapping the variation of M to that of E, and is used in a (barely regularized) Gauss-Newton algorithm to iteratively recover E. Reconstruction results showing the variation of E are presented. (C) 2015 Optical Society of America
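
The structure of such an iteration, stripped of the physics, is sketched below: a residual between measured and predicted modulation depth, a sensitivity (Jacobian) matrix, and a lightly regularized Gauss-Newton update. The forward model and all constants are simple stand-ins (the real model couples two PDEs), so the sketch shows the algorithmic skeleton only.

```python
# Skeleton of a lightly regularized Gauss-Newton recovery of a stiffness
# parameter vector E from modulation-depth data M, with ultrasound
# frequency as the varied parameter. Forward model and values are
# illustrative stand-ins.
import numpy as np

freqs = np.linspace(1.0, 2.0, 8)        # ultrasound frequencies (assumed units)

def forward(E, freqs):
    """Stand-in forward model: modulation depth for parameters E."""
    return E[0] / (E[1] + freqs)

E_true = np.array([3.0, 1.5])
M_meas = forward(E_true, freqs)         # noiseless synthetic data

E = np.array([1.0, 1.0])                # initial guess
for _ in range(20):
    r = M_meas - forward(E, freqs)      # residual
    # Composite sensitivity (Jacobian) matrix dM/dE, in closed form here:
    J = np.column_stack([1.0 / (E[1] + freqs),
                         -E[0] / (E[1] + freqs) ** 2])
    # Barely regularized normal equations (small Tikhonov term):
    dE = np.linalg.solve(J.T @ J + 1e-8 * np.eye(2), J.T @ r)
    E = E + dE
print(E)                                # converges to [3.0, 1.5]
```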

Relevance:

90.00%

Publisher:

Abstract:

Since streaming data keeps arriving continuously as an ordered sequence, massive amounts of data are created. A big challenge in handling data streams is the limitation of time and space. Prototype selection on streaming data requires the prototypes to be updated incrementally as new data comes in. We propose an incremental algorithm for prototype selection. The algorithm can also be used to handle very large datasets. Results are presented on a number of large datasets, and our method is compared to an existing algorithm for streaming data. Our algorithm saves time, and the selected prototypes give good classification accuracy.
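
One simple way to make prototype selection incremental, sketched below, is a streaming variant of Hart's condensed nearest neighbour rule: a new stream element is kept as a prototype only if the current prototypes would misclassify it. This illustrates the incremental-update requirement rather than the authors' algorithm, and the stream is hypothetical.

```python
# Streaming condensed-nearest-neighbour-style prototype selection: each
# element is seen once, and the prototype set grows only where the
# decision boundary needs support.
import math

def nearest_label(prototypes, x):
    """1-NN label of x among (point, label) prototypes."""
    return min(prototypes, key=lambda p: math.dist(p[0], x))[1]

def update_prototypes(prototypes, x, label):
    """Incremental step: absorb one stream element, keep it only if the
    current prototypes would misclassify it."""
    if not prototypes or nearest_label(prototypes, x) != label:
        prototypes.append((x, label))
    return prototypes

# Hypothetical 2-D stream of labelled points.
stream = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((5.0, 5.0), "b"),
          ((5.1, 4.9), "b"), ((0.2, 0.1), "a"), ((4.8, 5.2), "b")]
prototypes = []
for x, label in stream:
    update_prototypes(prototypes, x, label)
print(prototypes)  # only 2 prototypes needed for this easy stream
```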

Relevance:

90.00%

Publisher:

Abstract:

This thesis is an investigation into the nature of data analysis and computer software systems which support this activity.

The first chapter develops the notion of data analysis as an experimental science which has two major components: data-gathering and theory-building. The basic role of language in determining the meaningfulness of theory is stressed, and the informativeness of a language and data base pair is studied. The static and dynamic aspects of data analysis are then considered from this conceptual vantage point. The second chapter surveys the available types of computer systems which may be useful for data analysis. Particular attention is paid to the questions raised in the first chapter about the language restrictions imposed by the computer system and its dynamic properties.

The third chapter discusses the REL data analysis system, which was designed to satisfy the needs of the data analyzer in an operational relational data system. The major limitation on the use of such systems is the amount of access to data stored on a relatively slow secondary memory. This problem of the paging of data is investigated and two classes of data structure representations are found, each of which has desirable paging characteristics for certain types of queries. One representation is used by most of the generalized data base management systems in existence today, but the other is clearly preferred in the data analysis environment, as conceptualized in Chapter I.

This data representation has strong implications for a fundamental process of data analysis -- the quantification of variables. Since quantification is one of the few means of summarizing and abstracting, data analysis systems are under strong pressure to facilitate the process. Two implementations of quantification are studied: one analogous to the form of the lower predicate calculus and another more closely attuned to the data representation. A comparison of these indicates that the use of the "label class" method results in an orders-of-magnitude improvement over the lower predicate calculus technique.