944 results for data gathering algorithm


Relevance:

90.00%

Publisher:

Abstract:

This work proposes a system for classifying industrial steel pieces by means of a magnetic nondestructive device. The proposed classification system has two main stages: an online stage and an off-line stage. In the online stage, the system classifies inputs and saves misclassification information for later analysis. In the off-line optimization stage, the topology of a Probabilistic Neural Network is optimized by a Feature Selection algorithm combined with the Probabilistic Neural Network to increase the classification rate. The proposed Feature Selection algorithm searches the signal spectrogram by combining three basic elements: a Sequential Forward Selection algorithm, a Feature Cluster Grow algorithm with classification rate gradient analysis, and a Sequential Backward Selection algorithm. In addition, a trash-data recycling algorithm is proposed to obtain the optimal feedback samples from the misclassified ones.
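As an illustration of the Sequential Forward Selection element mentioned above, here is a minimal sketch of the generic greedy procedure. It is not the authors' implementation: the `score` callback stands in for a classification rate, and `toy_score` and the feature names are hypothetical.

```python
from typing import Callable, FrozenSet, List, Sequence


def sequential_forward_selection(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    max_features: int,
) -> List[str]:
    """Greedily add the single feature that most improves `score`."""
    selected: List[str] = []
    best_score = score(frozenset())
    while len(selected) < max_features:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        # Score every candidate when added to the current subset.
        scored = [(score(frozenset(selected + [f])), f) for f in candidates]
        top_score, top_feature = max(scored)
        if top_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        best_score = top_score
    return selected


def toy_score(subset: FrozenSet[str]) -> float:
    """Hypothetical stand-in for a classification rate (assumption)."""
    useful = {"f1": 0.5, "f3": 0.3}
    # Small penalty per feature so that useless features are never added.
    return sum(useful.get(f, 0.0) for f in subset) - 0.01 * len(subset)


chosen = sequential_forward_selection(["f1", "f2", "f3", "f4"], toy_score, 4)
```

In a real setting, `score` would train and evaluate the classifier on the candidate subset; Sequential Backward Selection is the mirror image, starting from all features and removing one at a time.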

Relevance:

90.00%

Publisher:

Abstract:

[EN] We present a new method, based on the meccano method and a novel T-mesh optimization procedure, to construct a T-spline parameterization of 2D geometries for isogeometric analysis. The proposed method requires only a boundary representation of the geometry as input data. As a result, the algorithm obtains a high-quality parametric transformation between 2D objects and the parametric domain, the unit square. First, we define a parametric mapping between the input boundary of the object and the boundary of the parametric domain. Then, we build a T-mesh adapted to the geometric singularities of the domain in order to preserve the features of the object boundary within a desired tolerance…

Relevance:

90.00%

Publisher:

Abstract:

Due to recent scientific and technological advances in information systems, it is now possible to run almost every application on a mobile device. The need to make such devices more intelligent opens an opportunity to design data mining algorithms that can execute autonomously on local devices and provide them with knowledge. The problem behind autonomous mining is the proper configuration of the algorithm to produce the most appropriate results. Contextual information, together with resource information about the device, has a strong impact on both the feasibility of a particular execution and the production of proper patterns. On the other hand, the performance of the algorithm, expressed in terms of efficacy and efficiency, depends heavily on the features of the dataset to be analyzed and on the parameter values of a particular implementation of the algorithm. However, few existing approaches deal with the autonomous configuration of data mining algorithms, and none of them deal with contextual or resource information. Both issues are particularly significant for social network applications. In fact, the widespread use of social networks, and consequently the amount of information shared, has made modeling context in social applications a priority. Resource consumption also plays a crucial role in such platforms, as users access social networks mainly on their mobile devices. This PhD thesis addresses the aforementioned open issues, focusing on i) analyzing the behavior of algorithms, ii) mapping contextual and resource information to find the most appropriate configuration, and iii) applying the model to the case of a social recommender. Four main contributions are presented:
- The EE-Model: predicts the behavior of a data mining algorithm in terms of the resources consumed and the accuracy of the mining model it will obtain.
- The SC-Mapper: maps a situation defined by the context and resource state to a data mining configuration.
- SOMAR: a social activity (event and informal ongoings) recommender for mobile devices.
- D-SOMAR: an evolution of SOMAR that incorporates the configurator in order to provide updated recommendations.
Finally, the experimental validation of the proposed contributions using synthetic and real datasets allows us to achieve the objectives and answer the research questions proposed for this dissertation.

Relevance:

90.00%

Publisher:

Abstract:

Ubiquitous computing software needs to be autonomous so that essential decisions, such as how to configure its particular execution, are self-determined. Moreover, data mining plays an important role in ubiquitous computing by providing intelligence to several types of ubiquitous computing applications. Thus, automating ubiquitous data mining is also crucial. We focus on the problem of automatically configuring the execution of a ubiquitous data mining algorithm. In our solution, we generate configuration decisions in a resource-aware and context-aware manner, since the algorithm executes in an environment in which the context often changes and computing resources are often severely limited. We propose to analyze the execution behavior of the data mining algorithm by mining its past executions. By doing so, we discover the effects of resource and context states, as well as parameter settings, on data mining quality. We argue that a classification model is appropriate for predicting the behavior of an algorithm's execution, and we concentrate on the decision tree classifier. We also define a taxonomy of data mining quality so that the tradeoff between prediction accuracy and classification specificity of each behavior model, each of which classifies by a different abstraction of quality, is scored for model selection. Behavior model constituents and class label transformations are formally defined, and experimental validation of the proposed approach is also performed.

Relevance:

90.00%

Publisher:

Abstract:

PAMELA (Phased Array Monitoring for Enhanced Life Assessment) SHM™ System is an integrated, embedded system based on ultrasonic guided waves, consisting of several electronic devices and one system manager controller. The data collected by all PAMELA devices in the system must be transmitted to the controller, which is responsible for carrying out the advanced signal processing needed to obtain SHM maps. PAMELA devices consist of hardware based on a Virtex 5 FPGA with a PowerPC 440 running an embedded Linux distribution. Therefore, in addition to performing tests and transmitting the collected data to the controller, PAMELA devices can perform local data processing or pre-processing (reduction, normalization, pattern recognition, feature extraction, etc.). Local data processing decreases the data traffic over the network and reduces the CPU load of the external computer. PAMELA devices can even run autonomously, performing scheduled tests and communicating with the controller only when structural damage is detected or when programmed to do so. Each PAMELA device integrates a software management application (SMA) that allows developers to download their own algorithm code and add the new data processing algorithm to the device. The SMA is developed in a virtual machine with an Ubuntu Linux distribution that includes all the software tools needed for the entire development cycle. The Eclipse IDE (Integrated Development Environment) is used to develop the SMA project and to write the code of each data processing algorithm. This paper presents the developed software architecture and describes the steps needed to add new data processing algorithms to the SMA in order to increase the processing capabilities of PAMELA devices. An example of basic damage index estimation using a delay-and-sum algorithm is provided.
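The delay-and-sum idea mentioned above can be sketched in a few lines. This is a generic illustration, not the PAMELA code: the integer sample delays, the function names and the energy-based `damage_index` are all assumptions made for the example, and the abstract does not say how the damage index is actually defined.

```python
from typing import List, Sequence


def delay_and_sum(signals: Sequence[Sequence[float]], delays: Sequence[int]) -> List[float]:
    """Advance each channel by its non-negative delay (in samples) and sum the channels."""
    if len(signals) != len(delays):
        raise ValueError("one delay per signal is required")
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative sample counts")
    # Only keep the samples that every shifted channel can provide.
    length = min(len(s) - d for s, d in zip(signals, delays))
    if length <= 0:
        return []
    return [sum(s[d + n] for s, d in zip(signals, delays)) for n in range(length)]


def damage_index(baseline: Sequence[float], current: Sequence[float]) -> float:
    """Hypothetical index: energy of the difference relative to the baseline energy."""
    diff_energy = sum((c - b) ** 2 for b, c in zip(baseline, current))
    base_energy = sum(b * b for b in baseline)
    return diff_energy / base_energy if base_energy else 0.0


# Three channels that see the same pulse at different times.
channels = [[0, 0, 1, 0, 0], [0, 1, 0, 0, 0], [1, 0, 0, 0, 0]]
focused = delay_and_sum(channels, [2, 1, 0])
```

With the correct delays the pulses line up and add coherently, which is why the focused output peaks at the first sample.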

Relevance:

90.00%

Publisher:

Abstract:

Paper presented at the XVI Jornadas de Ingeniería del Software y Bases de Datos, JISBD 2011, A Coruña, 5-7 September 2011.

Relevance:

90.00%

Publisher:

Abstract:

The modelling of inpatient length of stay (LOS) has important implications in health care studies. Finite mixture distributions are usually used to model the heterogeneous LOS distribution, because a certain proportion of patients sustain a longer stay. However, because the morbidity data are collected from hospitals, observations clustered within the same hospital are often correlated. The generalized linear mixed model approach is adopted to accommodate the inherent correlation via unobservable random effects. An EM algorithm is developed to obtain residual maximum quasi-likelihood estimation. The proposed hierarchical mixture regression approach enables the identification and assessment of factors influencing the long-stay proportion and the LOS for the long-stay patient subgroup. A neonatal LOS data set is used for illustration. © 2003 Elsevier Science Ltd. All rights reserved.
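To make the mixture idea concrete, the following sketch runs a plain EM algorithm on a two-component exponential mixture (short-stay and long-stay groups). It is deliberately much simpler than the method in the abstract: there are no covariates, no hospital random effects and no quasi-likelihood, and the exponential components, the starting values and the toy data are assumptions chosen only for illustration.

```python
import math
from typing import Sequence, Tuple


def em_two_exponentials(data: Sequence[float], iterations: int = 50) -> Tuple[float, float, float]:
    """Return (long-stay proportion, short-stay mean, long-stay mean) for positive data."""
    overall_mean = sum(data) / len(data)
    # Crude starting values (assumption): split around the overall mean.
    weight, mean_short, mean_long = 0.5, 0.5 * overall_mean, 2.0 * overall_mean
    for _ in range(iterations):
        # E-step: posterior probability that each stay belongs to the long-stay group.
        resp = []
        for x in data:
            p_short = (1.0 - weight) * math.exp(-x / mean_short) / mean_short
            p_long = weight * math.exp(-x / mean_long) / mean_long
            resp.append(p_long / (p_short + p_long))
        # M-step: update the proportion and the responsibility-weighted means.
        total_long = sum(resp)
        total_short = len(data) - total_long
        weight = total_long / len(data)
        mean_long = sum(r * x for r, x in zip(resp, data)) / total_long
        mean_short = sum((1.0 - r) * x for r, x in zip(resp, data)) / total_short
    return weight, mean_short, mean_long


# Toy lengths of stay in days (invented for the example).
stays = [1, 2, 1, 3, 2, 1, 2, 30, 40, 35]
long_fraction, short_mean, long_mean = em_two_exponentials(stays)
```

A useful sanity check is that the fitted mixture mean always equals the sample mean after an M-step, which follows directly from the update formulas.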

Relevance:

90.00%

Publisher:

Abstract:

Hannenhalli and Pevzner developed the first polynomial-time algorithm for the combinatorial problem of sorting signed genomic data. Their algorithm computes the minimum number of reversals required to rearrange one genome into another when there is no gene duplication. In this paper, we show how to extend the Hannenhalli-Pevzner approach to genomes with multigene families. We propose a new heuristic algorithm to compute the reversal distance between two genomes with multigene families via the concept of binary integer programming without removing gene duplicates. The experimental results on simulated and real biological data demonstrate that the proposed algorithm is able to find the reversal distance accurately. ©2005 IEEE
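To show what a signed reversal and the reversal distance are, here is a small brute-force sketch. It is neither the Hannenhalli-Pevzner algorithm nor the paper's integer-programming heuristic: it is an exhaustive breadth-first search that is only practical for very short genomes, and the tuple encoding of genomes is an assumption made for the example.

```python
from collections import deque
from typing import Tuple

Genome = Tuple[int, ...]


def reverse_segment(genome: Genome, i: int, j: int) -> Genome:
    """Reverse genome[i..j] (inclusive) and flip the sign of every gene in it."""
    flipped = tuple(-g for g in reversed(genome[i:j + 1]))
    return genome[:i] + flipped + genome[j + 1:]


def reversal_distance(source: Genome, target: Genome) -> int:
    """Minimum number of signed reversals from source to target, by exhaustive BFS."""
    if sorted(map(abs, source)) != sorted(map(abs, target)):
        raise ValueError("genomes must contain the same genes")
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        genome, dist = queue.popleft()
        if genome == target:
            return dist
        # Try every possible reversal of a contiguous segment.
        for i in range(len(genome)):
            for j in range(i, len(genome)):
                nxt = reverse_segment(genome, i, j)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    raise ValueError("target not reachable")


identity = (1, 2, 3)
one_step = reversal_distance((-2, -1, 3), identity)
two_steps = reversal_distance((-1, -2, 3), identity)
```

Because genomes are plain tuples, repeated gene labels (a multigene family) are accepted as well, although the search space grows exponentially with genome length.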

Relevance:

90.00%

Publisher:

Abstract:

Data refinements are refinement steps in which a program’s local data structures are changed. Data refinement proof obligations require the software designer to find an abstraction relation that relates the states of the original and new program. In this paper we describe an algorithm that helps a designer find an abstraction relation for a proposed refinement. Given sufficient time and space, the algorithm can find a minimal abstraction relation, and thus show that the refinement holds. As it executes, the algorithm displays mappings that cannot be in any abstraction relation. When the algorithm is not given sufficient resources to terminate, these mappings can help the designer find a suitable abstraction relation. The same algorithm can be used to test an abstraction relation supplied by the designer.

Relevance:

90.00%

Publisher:

Abstract:

Ad hoc wireless sensor networks (WSNs) are formed from self-organising configurations of distributed, energy-constrained, autonomous sensor nodes. The service lifetime of such sensor nodes depends on the power supply and the energy consumption, which is typically dominated by the communication subsystem. One of the key challenges in unlocking the potential of such data gathering sensor networks is conserving energy so as to maximize their post-deployment active lifetime. This thesis describes research on the continued development of the novel energy-efficient Optimised grids algorithm, which increases WSN lifetime and improves QoS parameters, yielding higher throughput and lower latency and jitter for the next generation of WSNs. Based on the relationship between range and traffic, the novel Optimised grids algorithm provides a robust, traffic-dependent, energy-efficient grid size that minimises the cluster head energy consumption in each grid and balances energy use throughout the network. Efficient spatial reusability allows the novel Optimised grids algorithm to improve network QoS parameters. The most important advantage of this model is that it can be applied to all one- and two-dimensional traffic scenarios where the traffic load may fluctuate due to sensor activities. During traffic fluctuations, the novel Optimised grids algorithm can be used to re-optimise the wireless sensor network to bring further benefits in energy reduction and QoS improvement. As idle energy becomes dominant at lower traffic loads, the new Sleep Optimised grids model incorporates sleep energy and idle energy duty cycles that can be implemented to achieve further network lifetime gains in all wireless sensor network models. Another key advantage of the novel Optimised grids algorithm is that it can be implemented with existing energy saving protocols such as GAF, LEACH, SMAC and TMAC to further enhance network lifetimes and improve QoS parameters. The novel Optimised grids algorithm does not interfere with these protocols, but creates an overlay to optimise the grid sizes and hence the transmission range of wireless sensor nodes.

Relevance:

90.00%

Publisher:

Abstract:

Very often, experimental data are the realization of a process that is fully determined by some unknown function but distorted by interference. Processing and analysis of experimental data are substantially easier if the data are represented as an analytical expression. This article presents an experimental data processing algorithm and an example of its use for the spectrographic analysis of oncological blood preparations.

Relevance:

90.00%

Publisher:

Abstract:

* This research is partly supported by the INTAS project 04-77-7173, http://www.intas.be

Relevance:

90.00%

Publisher:

Abstract:

Owing to their important roles in biogeochemical cycles, phytoplankton functional types (PFTs) have been the aim of an increasing number of ocean color algorithms. Yet, none of the existing methods are based on phytoplankton carbon (C) biomass, which is a fundamental biogeochemical and ecological variable and the "unit of accounting" in Earth system models. We present a novel bio-optical algorithm to retrieve size-partitioned phytoplankton carbon from ocean color satellite data. The algorithm is based on existing methods to estimate particle volume from a power-law particle size distribution (PSD). Volume is converted to carbon concentrations using a compilation of allometric relationships. We quantify absolute and fractional biomass in three PFTs based on size: picophytoplankton (0.5-2 µm in diameter), nanophytoplankton (2-20 µm) and microphytoplankton (20-50 µm). The mean spatial distributions of total phytoplankton C biomass and individual PFTs, derived from global SeaWiFS monthly ocean color data, are consistent with current understanding of oceanic ecosystems, i.e., oligotrophic regions are characterized by low biomass and dominance of picoplankton, whereas eutrophic regions have high biomass to which nanoplankton and microplankton contribute relatively larger fractions. Global climatological, spatially integrated phytoplankton carbon biomass standing stock estimates using our PSD-based approach yield approximately 0.25 Gt of C, consistent with analogous estimates from two other ocean color algorithms and several state-of-the-art Earth system models. Satisfactory in situ closure observed between PSD and POC measurements lends support to the theoretical basis of the PSD-based algorithm.
Uncertainty budget analyses indicate that absolute carbon concentration uncertainties are driven by the PSD parameter No, which determines particle number concentration to first order, while uncertainties in the PFTs' fractional contributions to total C biomass are mostly due to the allometric coefficients. The C algorithm presented here, which is not empirically constrained a priori, partitions biomass into size classes and improves on the assumptions of the other approaches. However, the range of phytoplankton C biomass spatial variability globally is larger than that estimated by any of the other models considered here, which suggests that an empirical correction to the No parameter is needed, based on PSD validation statistics. These corrected absolute carbon biomass concentrations validate well against in situ POC observations.
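The volume step of such a PSD-based approach can be illustrated with a short calculation. The sketch below assumes spherical particles and a power-law PSD of the form N(D) proportional to D^(-slope), so that the volume in a size class is proportional to the integral of D^3 · D^(-slope) over that class. It returns volume fractions only: the conversion to carbon needs allometric coefficients that the abstract does not give, and the function names and slope values are illustrative assumptions.

```python
import math
from typing import Dict

# Size classes in micrometres, as quoted in the abstract.
SIZE_CLASSES = {"pico": (0.5, 2.0), "nano": (2.0, 20.0), "micro": (20.0, 50.0)}


def volume_integral(d_min: float, d_max: float, slope: float) -> float:
    """Integral of D**3 * D**(-slope) over [d_min, d_max]; constant factors are dropped."""
    exponent = 4.0 - slope
    if math.isclose(exponent, 0.0):
        # Special case slope == 4: the integrand is 1/D.
        return math.log(d_max / d_min)
    return (d_max ** exponent - d_min ** exponent) / exponent


def volume_fractions(slope: float) -> Dict[str, float]:
    """Fraction of total particle volume in each size class for a given PSD slope."""
    volumes = {
        name: volume_integral(lo, hi, slope) for name, (lo, hi) in SIZE_CLASSES.items()
    }
    total = sum(volumes.values())
    return {name: v / total for name, v in volumes.items()}


flat = volume_fractions(4.0)   # equal volume per logarithmic size interval
steep = volume_fractions(5.0)  # steeper PSD, more volume in small particles
```

Because the constant factors (including the scaling parameter of the PSD) cancel in the ratio, the fractions depend only on the slope, which is consistent with the abstract's remark that the scaling parameter mainly affects absolute concentrations.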

Relevance:

90.00%

Publisher:

Abstract:

The MAREDAT atlas covers 11 types of plankton, ranging in size from bacteria to jellyfish. Together, these plankton groups determine the health and productivity of the global ocean and play a vital role in the global carbon cycle. Working within a uniform and consistent spatial and depth grid (map) of the global ocean, the researchers compiled thousands and tens of thousands of data points to identify regions of plankton abundance and scarcity as well as areas of data abundance and scarcity. At many of the grid points, the MAREDAT team accomplished the difficult conversion from abundance (numbers of organisms) to biomass (carbon mass of organisms). The MAREDAT atlas provides an unprecedented global data set for ecological and biochemical analysis and modeling as well as a clear mandate for compiling additional existing data and for focusing future data gathering efforts on key groups in key areas of the ocean. This is a gridded data product about diazotrophic organisms. There are 6 variables. Each variable is gridded on a dimension of 360 (longitude) * 180 (latitude) * 33 (depth) * 12 (month). The first group of 3 variables is: (1) number of biomass observations, (2) biomass, and (3) special nifH-gene-based biomass. The second group of 3 variables is the same as the first group except that it only grids non-zero data. We have constructed a database on diazotrophic organisms in the global pelagic upper ocean by compiling more than 11,000 direct field measurements, including 3 sub-databases: (1) nitrogen fixation rates, (2) cyanobacterial diazotroph abundances from cell counts and (3) cyanobacterial diazotroph abundances from qPCR assays targeting nifH genes. Biomass conversion factors are estimated based on cell sizes to convert abundance data to diazotrophic biomass. Data are assigned to 3 groups: Trichodesmium, unicellular diazotrophic cyanobacteria (group A, B and C when applicable) and heterocystous cyanobacteria (Richelia and Calothrix).
Total nitrogen fixation rates and diazotrophic biomass are calculated by summing the values from all the groups. Some of the nitrogen fixation rates are whole seawater measurements and are used as total nitrogen fixation rates. Both volumetric and depth-integrated values were reported. Depth-integrated values are also calculated for those vertical profiles with values at 3 or more depths.
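The rule for depth-integrated values can be sketched as follows. The abstract states only that profiles need values at 3 or more depths; the trapezoidal rule, the function name and the example numbers are assumptions made for this illustration, not the database's documented procedure.

```python
from typing import Optional, Sequence


def depth_integrate(depths_m: Sequence[float], values: Sequence[float]) -> Optional[float]:
    """Trapezoidal integral of a vertical profile, or None with fewer than 3 depths."""
    if len(depths_m) != len(values):
        raise ValueError("depths and values must have the same length")
    if len(depths_m) < 3:
        return None  # too few depths to integrate, following the rule in the text
    # Sort by depth so that profiles can be supplied in any order.
    profile = sorted(zip(depths_m, values))
    total = 0.0
    for (z0, v0), (z1, v1) in zip(profile, profile[1:]):
        total += 0.5 * (v0 + v1) * (z1 - z0)
    return total


# Volumetric values (per cubic metre) at 0, 10 and 20 m give a per-square-metre total.
uniform = depth_integrate([0, 10, 20], [2, 2, 2])
unsorted_profile = depth_integrate([20, 0, 10], [0, 4, 2])
too_short = depth_integrate([0, 10], [1, 1])
```

Integrating a volumetric quantity over depth converts it to an areal quantity, which is why both kinds of values can be reported for the same profile.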

Relevance:

90.00%

Publisher:

Abstract:

Internet users consume online targeted advertising based on information collected about them and voluntarily share personal information in social networks. Sensor information and data from smartphones are collected and used by applications, sometimes in unclear ways. As happens today with smartphones, in the near future sensors will be shipped in all types of connected devices, enabling ubiquitous information gathering from the physical environment and realizing the vision of Ambient Intelligence. The value of gathered data, if not obvious, can be harnessed through data mining techniques and put to use by enabling personalized and tailored services as well as business intelligence practices, fueling the digital economy. However, the ever-expanding gathering and use of information undermines the privacy conceptions of the past. Natural social practices of managing privacy in daily relations are overridden by socially awkward communication tools, service providers struggle with security issues resulting in harmful data leaks, governments use mass surveillance techniques, the incentives of the digital economy threaten consumer privacy, and the advancement of consumer-grade data-gathering technology enables new inter-personal abuses. A wide range of fields attempt to address technology-related privacy problems; however, they vary immensely in terms of assumptions, scope and approach. Privacy of future use cases is typically handled vertically, instead of building upon previous work that can be re-contextualized, while current privacy problems are typically addressed per type in a more focused way. Because significant effort was required to make sense of the relations and structure of privacy-related work, this thesis attempts to transmit a structured view of it. It is multi-disciplinary, ranging from cryptography to economics and including distributed systems and information theory, and addresses privacy issues of different natures.
As existing work is framed and discussed, the contributions to the state of the art made in the scope of this thesis are presented. The contributions add to five distinct areas: 1) identity in distributed systems; 2) future context-aware services; 3) event-based context management; 4) low-latency information flow control; 5) high-dimensional dataset anonymity. Finally, having laid out this landscape of privacy-preserving work, the current and future privacy challenges are discussed, considering not only technical but also socio-economic perspectives.