912 results for Data distribution
Abstract:
Unauthorized access to digital content is a serious threat to international security and informatics. We propose an offline oblivious data distribution framework that preserves the sender's security and the receiver's privacy using tamper-proof smart cards. The framework provides persistent protection of content against digital piracy and promises private content consumption.
Abstract:
Continuous field mapping has to address two conflicting remote sensing requirements when collecting training data. On one hand, continuous field mapping trains fractional land cover and thus favours mixed training pixels. On the other hand, the spectral signature should preferably be distinct, which favours pure training pixels. The aim of this study was to evaluate the sensitivity of the resulting mapping performance to the distribution of training data along fractional and spectral gradients. We derived four continuous fields (tree, shrub/herb, bare, water) from aerial photographs as response variables and processed corresponding spectral signatures from multitemporal Landsat 5 TM data as explanatory variables. Subsequent controlled experiments along fractional cover gradients were based on generalised linear models. The resulting fractional and spectral distributions differed between the individual continuous fields, but could be satisfactorily trained and mapped. Pixels with fractional cover, or without the respective cover, were much more critical than pure full-cover pixels. The error distribution of the continuous field models was non-uniform with respect to the horizontal and vertical spatial distribution of the target fields. We conclude that sampling of continuous field training data should be based on extent and densities in fractional and spectral space, rather than in real geographic space. Consequently, adequate training plots are most probably not systematically distributed in geographic space, but instead cover the gradient and covariate structure of the fractional and spectral space well. (C) 2009 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.
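To make the modelling step concrete, the sketch below fits a fractional-logit generalised linear model relating fractional tree cover to spectral predictors. It is only a minimal illustration of the model family named in the abstract: the band values, coefficients and sample size are synthetic, and statsmodels is assumed as the GLM library.

```python
# Hypothetical sketch: fractional tree cover (0..1) modelled from spectral bands
# with a binomial-family GLM (logit link), as in a fractional-cover training setup.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
bands = rng.uniform(0, 1, size=(n, 6))            # stand-in for 6 TM reflectance bands
true_beta = np.array([2.5, -1.0, 0.5, 0.0, -2.0, 1.5])
logit = bands @ true_beta - 0.3
tree_cover = 1 / (1 + np.exp(-logit))             # synthetic fractional response in (0, 1)

X = sm.add_constant(bands)
model = sm.GLM(tree_cover, X, family=sm.families.Binomial())  # fractional logit
fit = model.fit()
print(fit.summary())

# Predicted fractional cover for a few pixels
print(fit.predict(X[:5]))
```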
Abstract:
Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploiting the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in parallel data mining algorithms and, in particular, in the k-means algorithm for cluster analysis. In the straightforward parallel formulation of the k-means algorithm, data and computation loads are uniformly distributed over the processing nodes. This approach has excellent load balancing characteristics that may suggest it could scale up to large and extreme-scale parallel computing systems. However, at each iteration step the algorithm requires a global reduction operation, which hinders the scalability of the approach. This work studies a different parallel formulation of the algorithm in which the requirement of global communication is removed, while maintaining the same deterministic nature as the centralised algorithm. The proposed approach exploits a non-uniform data distribution which can either be found in real-world distributed applications or be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error, which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.
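The abstract notes that the non-uniform data distribution can be induced by means of multi-dimensional binary search trees. The sketch below shows one plausible way to build such a partition (recursive median splits, k-d-tree style) so that each processing node owns a spatially compact block of points; the communication scheme that actually replaces the global reduction is the paper's contribution and is not reproduced here.

```python
# Hypothetical sketch: inducing a non-uniform, spatially coherent partition of the
# data with a multi-dimensional binary search tree (median splits along alternating
# axes), so that each processing node owns a compact region of the feature space.
import numpy as np

def kd_partition(points, n_parts, depth=0):
    """Recursively split points along alternating axes at the median until
    n_parts blocks are produced (n_parts must be a power of two)."""
    if n_parts == 1:
        return [points]
    axis = depth % points.shape[1]
    order = np.argsort(points[:, axis])
    half = len(points) // 2
    left, right = points[order[:half]], points[order[half:]]
    return (kd_partition(left, n_parts // 2, depth + 1) +
            kd_partition(right, n_parts // 2, depth + 1))

rng = np.random.default_rng(1)
data = rng.normal(size=(10_000, 2))
blocks = kd_partition(data, 8)          # e.g. one block per processing node
print([len(b) for b in blocks])
```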
Abstract:
Data Distribution Management (DDM) is a core part of the High Level Architecture standard; its goal is to optimize the resources used by simulation environments to exchange data. It has to filter and match the information generated during a simulation so that each federate (i.e., each simulation entity) receives only the information it needs. It is important that this is done quickly and accurately in order to obtain good performance and to avoid transmitting irrelevant data, which would otherwise quickly saturate network resources. The main topic of this thesis is the implementation of an impartial (super partes) DDM testbed. It evaluates the quality of DDM approaches of all kinds: it supports both region-based and grid-based approaches, and it can also accommodate other methods yet to be devised. It ranks them by three factors: execution time, memory usage, and distance from the optimal solution. A prearranged set of instances is already available, but we also allow the creation of instances with user-provided parameters. The thesis is structured as follows. We start by introducing what DDM and HLA are and what they do in detail. In the first chapter we describe the state of the art, providing an overview of the best-known resolution approaches and the pseudocode of the most interesting ones. The third chapter describes how the testbed we implemented is structured. In the fourth chapter we present and compare the results obtained from the execution of the four approaches we implemented. The result of the work described in this thesis can be downloaded from SourceForge at the following link: https://sourceforge.net/projects/ddmtestbed/. It is licensed under the GNU General Public License version 3.0 (GPLv3).
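As a point of reference for what such a testbed measures, the sketch below implements the simplest region-based matching strategy: a brute-force comparison of every update extent against every subscription extent. The Extent layout and the example data are illustrative assumptions, not the testbed's own code.

```python
# Hypothetical sketch of brute-force region-based DDM matching: every update
# extent is compared against every subscription extent.
from dataclasses import dataclass

@dataclass
class Extent:
    lo: tuple  # lower bound per dimension
    hi: tuple  # upper bound per dimension

def overlaps(a: Extent, b: Extent) -> bool:
    """Two extents overlap iff their intervals overlap in every dimension."""
    return all(al <= bh and bl <= ah
               for al, ah, bl, bh in zip(a.lo, a.hi, b.lo, b.hi))

def brute_force_match(updates, subscriptions):
    return [(i, j) for i, u in enumerate(updates)
                   for j, s in enumerate(subscriptions) if overlaps(u, s)]

updates = [Extent((0, 0), (2, 2)), Extent((5, 5), (6, 6))]
subscriptions = [Extent((1, 1), (3, 3)), Extent((7, 0), (8, 1))]
print(brute_force_match(updates, subscriptions))   # -> [(0, 0)]
```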
Abstract:
Data Distribution Management (DDM) is a component of the High Level Architecture standard. Its task is to detect overlaps between update and subscription extents efficiently. This thesis discusses the need for a framework and the reasons why one was implemented. Testing algorithms under a fair comparison, libraries that ease the implementation of algorithms, and automation of the build phase were the fundamental motivations for starting the framework. The driving motivation was that, while surveying scientific articles on DDM and its algorithms, we noticed that each article created its own ad-hoc data for testing. A further goal of the framework is therefore to compare the algorithms on a consistent data set. We decided to test the framework on the Cloud to obtain a more reliable comparison between executions by different users. Two of the most widely used services were considered: Amazon AWS EC2 and Google App Engine. Their advantages and disadvantages are presented, along with the reason why Google App Engine was chosen. Four algorithms were developed: Brute Force, Binary Partition, Improved Sort, Interval Tree Matching. Tests were performed on execution time and peak memory usage. The results show that Interval Tree Matching and Improved Sort are the most efficient. All tests were performed on the sequential versions of the algorithms, so a further reduction in execution time is possible for the Interval Tree Matching algorithm.
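For contrast with the brute-force approach above, the following sketch shows the general idea behind sort-based matching: sort the interval endpoints once and sweep over them while maintaining the sets of currently open update and subscription intervals. It is a generic sweep written for this listing, not the thesis' Improved Sort or Interval Tree Matching implementations.

```python
# Hypothetical sketch of a sort-based sweep over interval endpoints, the general
# idea behind sort-based DDM matching in one dimension.
def sweep_match(updates, subscriptions):
    """updates/subscriptions: lists of (lo, hi) intervals. Returns overlapping pairs."""
    events = []
    for i, (lo, hi) in enumerate(updates):
        events.append((lo, 0, 'U', i))
        events.append((hi, 1, 'U', i))
    for j, (lo, hi) in enumerate(subscriptions):
        events.append((lo, 0, 'S', j))
        events.append((hi, 1, 'S', j))
    events.sort()                      # at ties, open events (0) precede close events (1)

    active_u, active_s, pairs = set(), set(), set()
    for _, kind, side, idx in events:
        if kind == 0:                  # interval opens
            if side == 'U':
                pairs.update((idx, s) for s in active_s)
                active_u.add(idx)
            else:
                pairs.update((u, idx) for u in active_u)
                active_s.add(idx)
        else:                          # interval closes
            (active_u if side == 'U' else active_s).discard(idx)
    return pairs

print(sweep_match([(0, 2), (5, 6)], [(1, 3), (7, 8)]))   # -> {(0, 0)}
```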
Abstract:
This work investigates the possibility of using the DDS (Data Distribution Service) standard, developed by the OMG (Object Management Group), for real-time monitoring of glucose levels in diabetic patients. The standard follows the publisher/subscriber pattern, so in the developed proof of concept the point-of-care sensors publish the patients' glucose values and different supervisors subscribe to that information. These supervisors react in the most appropriate way to the values and the evolution of the patient's glucose level, for example by logging the sample value or raising an alarm. The middleware that supports the data communication follows the DDS standard. This facilitates, on the one hand, the scalability and interoperability of the developed solution and, on the other, the monitoring of glucose levels and the activation of predefined protocols in real time. The research is part of the intramural PERSONA project of CIBER-BBN, whose objective is the design of decision-support tools for continuous, personalized patient monitoring, integrated into a technological platform for diabetes.
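The prototype's structure follows the publisher/subscriber pattern, which the minimal in-process sketch below illustrates: a sensor publishes glucose samples on a topic and two supervisors (a logger and an alarm) react to them. This is not the DDS API; the topic name, sample fields and alarm thresholds are illustrative assumptions, and a real deployment would rely on a DDS implementation.

```python
# Minimal in-process illustration of the publish/subscribe pattern (NOT a DDS API).
class Topic:
    def __init__(self, name):
        self.name, self.subscribers = name, []
    def subscribe(self, callback):
        self.subscribers.append(callback)
    def publish(self, sample):
        for cb in self.subscribers:
            cb(sample)

glucose = Topic("PatientGlucose")       # hypothetical topic name

def logger(sample):
    print(f"log: patient={sample['patient_id']} glucose={sample['mg_dl']} mg/dL")

def alarm(sample, low=70, high=180):    # hypothetical thresholds
    if not low <= sample['mg_dl'] <= high:
        print(f"ALARM: patient {sample['patient_id']} out of range: {sample['mg_dl']}")

glucose.subscribe(logger)
glucose.subscribe(alarm)

# A point-of-care sensor acts as the publisher of glucose samples.
glucose.publish({"patient_id": "P-001", "mg_dl": 95})
glucose.publish({"patient_id": "P-001", "mg_dl": 210})
```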
Abstract:
The International Molecular Exchange (IMEx) consortium is an international collaboration between major public interaction data providers to share literature-curation efforts and make a nonredundant set of protein interactions available in a single search interface on a common website (http://www.imexconsortium.org/). Common curation rules have been developed, and a central registry is used to manage the selection of articles to enter into the dataset. We discuss the advantages of such a service to the user, our quality-control measures and our data-distribution practices.
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Background: This paper addresses the prediction of the free energy of binding of a drug candidate with the enzyme InhA, associated with Mycobacterium tuberculosis. This problem arises in rational drug design, where interactions between drug candidates and target proteins are verified through molecular docking simulations. In this application, it is important not only to correctly predict the free energy of binding, but also to provide a comprehensible model that can be validated by a domain specialist. Decision-tree induction algorithms have been successfully used in drug-design-related applications, especially considering that decision trees are simple to understand, interpret, and validate. Several decision-tree induction algorithms are available for general use, but each one has a bias that makes it more suitable for a particular data distribution. In this article, we propose and investigate the automatic design of decision-tree induction algorithms tailored to particular drug-enzyme binding data sets. We investigate the performance of our new method for evaluating binding conformations of different drug candidates to InhA, and we analyze our findings with respect to decision-tree accuracy, comprehensibility, and biological relevance. Results: The empirical analysis indicates that our method is capable of automatically generating decision-tree induction algorithms that significantly outperform the traditional C4.5 algorithm with respect to both accuracy and comprehensibility. In addition, we provide a biological interpretation of the rules generated by our approach, reinforcing the importance of comprehensible predictive models in this particular bioinformatics application. Conclusions: We conclude that automatically designing a decision-tree algorithm tailored to molecular docking data is a promising alternative for predicting the free energy of binding of a drug candidate with a flexible receptor.
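As a baseline for what decision-tree induction looks like on this kind of data, the sketch below fits a standard regression tree to synthetic docking-style features and prints its rules. The features, coefficients and scikit-learn usage are illustrative assumptions; the paper's actual contribution, automatically designing the induction algorithm itself, is not reproduced here.

```python
# Baseline illustration only: a standard decision-tree regressor on synthetic
# docking-style features predicting a free energy of binding (FEB).
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(42)
n = 300
features = rng.normal(size=(n, 4))                 # synthetic stand-ins for docking descriptors
feb = -8.0 + 1.5 * features[:, 0] - 2.0 * features[:, 2] + rng.normal(0, 0.5, n)

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(features, feb)
print(export_text(tree, feature_names=[f"f{i}" for i in range(4)]))  # human-readable rules
```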
Abstract:
The continuous advancements and enhancements of wireless systems are enabling new compelling scenarios where mobile services can adapt to the current execution context, represented by the computational resources available at the local device, the current physical location, the people in physical proximity, and so forth. Such services, called context-aware services, require the timely delivery of all relevant information describing the current context, and that introduces several unsolved complexities, spanning from low-level context data transmission up to context data storage and replication in the mobile system. In addition, to ensure correct and scalable context provisioning, it is crucial to integrate and interoperate with different wireless technologies (WiFi, Bluetooth, etc.) and modes (infrastructure-based and ad-hoc), and to use decentralized solutions to store and replicate context data on mobile devices. These challenges call for novel middleware solutions, here called Context Data Distribution Infrastructures (CDDIs), capable of delivering relevant context data to mobile devices while hiding all the issues introduced by data distribution in heterogeneous and large-scale mobile settings. This dissertation thoroughly analyzes CDDIs for mobile systems, with the main goal of achieving a holistic approach to the design of this type of middleware. We discuss the main functions needed by context data distribution in large mobile systems, and we argue for the precise definition and strict enforcement of quality-based contracts between context consumers and the CDDI, used to reconfigure the main middleware components at runtime. We present the design and implementation of our proposals, both in simulation-based and in real-world scenarios, along with an extensive evaluation that confirms the technical soundness of the proposed CDDI solutions. Finally, we consider three highly heterogeneous scenarios, namely disaster areas, smart campuses, and smart cities, to further demonstrate the broad technical validity of our analysis and solutions under different network deployments and quality constraints.
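As a purely illustrative reading of the quality-based contracts mentioned above (not the dissertation's middleware), a contract could be modelled as a small record of delivery requirements that the infrastructure inspects to pick a dissemination strategy at runtime; all names and thresholds below are assumptions.

```python
# Illustrative sketch of a quality-based contract driving a runtime
# reconfiguration decision in a context data distribution infrastructure.
from dataclasses import dataclass

@dataclass
class QualityContract:
    max_staleness_s: float      # how old a delivered context datum may be
    min_reliability: float      # fraction of updates that must be delivered

def choose_strategy(contract: QualityContract) -> str:
    """Toy policy: strict contracts favour infrastructure-based delivery,
    relaxed ones tolerate opportunistic ad-hoc dissemination."""
    if contract.max_staleness_s < 1.0 or contract.min_reliability > 0.99:
        return "infrastructure-unicast"
    if contract.max_staleness_s < 10.0:
        return "infrastructure-broadcast"
    return "ad-hoc-gossip"

print(choose_strategy(QualityContract(max_staleness_s=0.5, min_reliability=0.999)))
print(choose_strategy(QualityContract(max_staleness_s=60.0, min_reliability=0.8)))
```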
Abstract:
Background and Study Aim: Judo is a very physiologically demanding sport, yet few sport-specific physical fitness tests exist. One of the most widely used judo-specific tests is the Special Judo Fitness Test (SJFT) proposed by Sterkowicz (1995). Although this test has been used by many coaches in different countries, no classificatory table was available to classify judo athletes according to their results. Thus, the aim of this work was to present a classificatory table for this test. Material/Methods: For this purpose, 141 judo athletes (mean +/- standard deviation: 21.3+/-4.5 years of age, 74.2+/-15.9 kg body mass and 176.7+/-8.2 cm height; judo rank between 3rd kyu and 3rd dan) familiarized with the SJFT performed it once in order to provide data to establish a classificatory table. Results: After analysis of the data distribution, a five-category table (20% for each classificatory category) was developed considering the variables used in the SJFT (number of throws, heart rate immediately after and 1 min after the test, and the index). Conclusions: The classificatory table can help coaches using the SJFT to classify their athletes' level and to monitor their physical fitness progress.
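A five-category table with 20% of the sample in each class corresponds to quintile cut points on each SJFT variable. The sketch below shows how such boundaries could be derived; the data are synthetic, and the index is computed here as the sum of the two heart rates divided by the number of throws, which is the commonly reported SJFT index but should be checked against the original test description.

```python
# Hypothetical sketch: deriving a five-category (20% each) classificatory table
# from a sample of SJFT results using percentile cut points.
import numpy as np

rng = np.random.default_rng(7)
throws = rng.normal(26, 3, 141).round()                 # synthetic throw counts
hr_after = rng.normal(182, 8, 141)                      # synthetic HR immediately after test
hr_1min = rng.normal(165, 10, 141)                      # synthetic HR 1 min after test
index = (hr_after + hr_1min) / throws                   # assumed SJFT index (lower is better)

# 20/40/60/80th percentiles give the boundaries of five equal-frequency classes.
cuts = np.percentile(index, [20, 40, 60, 80])
print("index class boundaries:", np.round(cuts, 2))

labels = np.digitize(index, cuts)                        # 0 = best 20%, 4 = worst 20%
print(np.bincount(labels))                               # roughly 28-29 athletes per class
```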
Abstract:
Core collections are of strategic importance as they allow the use of a small part of a germplasm collection that is representative of the total collection. The objective of this study was to develop a soybean core collection of the USDA Soybean Germplasm Collection by comparing the results of random, proportional, logarithmic, multivariate proportional and multivariate logarithmic sampling strategies. All but the random sampling strategy used stratification of the entire collection based on passport data and maturity group classification. The multivariate proportional and multivariate logarithmic strategies made further use of qualitative and quantitative trait data to select diverse accessions within each stratum. The 18 quantitative trait data distribution parameters were calculated for each core and for the entire collection for pairwise comparison to validate the sampling strategies. All strategies were adequate for assembling a core collection. The random core collection best represented the entire collection in statistical terms. Proportional and logarithmic strategies did not maximize statistical representation but were better in selecting maximum variability. Multivariate proportional and multivariate logarithmic strategies produced the best core collections as measured by maximum variability conservation. The soybean core collection was established using the multivariate proportional selection strategy. (C) 2010 Elsevier B.V. All rights reserved.
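To illustrate how the proportional and logarithmic strategies differ in spreading a core collection across strata, the sketch below allocates a hypothetical core of 400 accessions over synthetic stratum sizes; the numbers are assumptions, not the USDA Soybean Germplasm Collection.

```python
# Hypothetical sketch: proportional vs. logarithmic allocation of a core
# collection across strata (e.g. origin x maturity group), given stratum sizes.
import numpy as np

stratum_sizes = np.array([1200, 430, 90, 2600, 15])   # synthetic accession counts per stratum
core_size = 400                                        # assumed target core-collection size

prop = core_size * stratum_sizes / stratum_sizes.sum()
loga = core_size * np.log(stratum_sizes + 1) / np.log(stratum_sizes + 1).sum()

for name, alloc in [("proportional", prop), ("logarithmic", loga)]:
    print(name, np.round(alloc).astype(int))
# Logarithmic allocation flattens the differences, so small strata retain more
# representation in the core than under proportional allocation.
```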
Abstract:
Dissertation submitted to obtain the degree of Master in Computer Engineering (Engenharia Informática)
Abstract:
Dissertation submitted to obtain the degree of Master in Computer Engineering (Engenharia Informática)