996 results for Tag data confidentiality


Relevance:

30.00%

Publisher:

Abstract:

This research falls in the area of enhancing the quality of tag-based item recommendation systems. It aims to achieve this by employing a multi-dimensional user profile approach and by analyzing the semantic aspects of tags. Tag-based recommender systems have two characteristics that need to be carefully studied in order to build a reliable system. Firstly, the multi-dimensional correlation, called the tag assignment <user, item, tag>, should be appropriately modelled in order to create the user profiles [1]. Secondly, the semantics behind the tags should be considered properly, as the flexibility in their design can cause semantic problems such as synonymy and polysemy [2]. This research proposes to address these two challenges for building a tag-based item recommendation system by employing tensor modeling as the multi-dimensional user profile approach, and the topic model as the semantic analysis approach. The first objective is to optimize the tensor model reconstruction and to improve the model performance in generating quality recommendations. A novel Tensor-based Recommendation using Probabilistic Ranking (TRPR) method [3] has been developed. Results show this method to be scalable for large datasets and to outperform the benchmarking methods in terms of accuracy. The memory-efficient loop implements the n-mode block-striped (matrix) product for tensor reconstruction as an approximation of the initial tensor. The probabilistic ranking calculates the probability of users selecting candidate items using their tag preference list, based on the entries generated from the reconstructed tensor. The second objective is to analyse the tag semantics and utilize the outcome in building the tensor model. This research proposes to investigate the problem using a topic model approach to keep the tags' nature as the “social vocabulary” [4]. For the tag assignment data, topics can be generated from the occurrences of tags given for an item.
However, there is only a limited number of tags available to represent items as collections of topics, since an item might have been tagged using only a few tags. Consequently, the generated topics might not be able to represent the items appropriately. Furthermore, given that each tag can belong to any topic with various probability scores, the occurrence of tags cannot simply be mapped by the topics to build the tensor model. A standard weighting technique will not appropriately calculate the value of tagging activity, since it will define the context of an item using a tag instead of a topic.

Relevance:

30.00%

Publisher:

Abstract:

A tag-based item recommendation method generates an ordered list of items likely to interest a particular user, using the user's past tagging behaviour. However, users' tagging behaviour varies across tagging systems. A potential problem in generating quality recommendations is how to build user profiles that interpret user behaviour so that it can be used effectively in recommendation models. Generally, recommendation methods are made to work with specific types of user profiles, and may not work well with different datasets. In this paper, we investigate several tagging data interpretation and representation schemes that can lead to building an effective user profile. We discuss the various benefits a scheme brings to a recommendation method by highlighting the representative features of user tagging behaviours on a specific dataset. Empirical analysis shows that each interpretation scheme forms a distinct data representation, which eventually affects the recommendation result. Results on various datasets show that an interpretation scheme should be selected based on the dominant usage in the tagging data (i.e. either a higher number of tags or a higher number of items present). The usage represents the characteristic of user tagging behaviour in the system. The results also demonstrate how the scheme is able to address the cold-start user problem.

Relevance:

30.00%

Publisher:

Abstract:

This article considers the risk of disclosure in linked databases when statistical analysis of micro-data is permitted. The risk of disclosure needs to be balanced against the utility of the linked data. The current work specifically considers the disclosure risks in permitting regression analysis to be performed on linked data. A new attack based on partitioning of the database is presented.

Relevance:

30.00%

Publisher:

Abstract:

Discounted Cumulative Gain (DCG) is a well-known ranking evaluation measure for models built with data that have multiple relevance grades. By handling tagging data used in recommendation systems as an ordinal relevance set of {negative, null, positive}, we propose to build a DCG-based recommendation model. We present an efficient and novel learning-to-rank method that optimizes DCG for a recommendation model using the tagging data interpretation scheme. Evaluating the proposed method on real-world datasets, we demonstrate that the method is scalable and outperforms the benchmarking methods by generating a quality top-N item recommendation list.
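
As a rough illustration of the measure named in this abstract (not the authors' learning-to-rank method), the sketch below computes DCG and its normalized form for a ranked list. The mapping of {negative, null, positive} to the grades 0, 1, 2, the exponential gain 2^rel - 1, and the names `GRADES`, `dcg` and `ndcg` are assumptions made for the example.

```python
import math

# Assumed mapping of the ordinal relevance set to numeric grades.
GRADES = {"negative": 0, "null": 1, "positive": 2}


def dcg(relevances, k=None):
    """DCG of a ranked list, using gain (2^rel - 1) and discount log2(rank + 1)."""
    if k is not None:
        relevances = relevances[:k]
    return sum(
        (2 ** rel - 1) / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances, k=None):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


# A hypothetical ranked list of three items for one user.
ranked = [GRADES[label] for label in ("positive", "null", "negative")]
score = dcg(ranked)
normalized = ndcg(ranked)
```

Because the list above is already in ideal order, its normalized score is 1; reversing it lowers the score, which is the property a DCG-optimizing ranker exploits.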

Relevance:

30.00%

Publisher:

Abstract:

Single nucleotide polymorphisms (SNPs) are widely acknowledged as the marker of choice for many genetic and genomic applications because they show co-dominant inheritance, are highly abundant across genomes and are suitable for high-throughput genotyping. Here we evaluated the applicability of SNP markers developed from Crassostrea gigas and C. virginica expressed sequence tags (ESTs) in closely related Crassostrea and Ostrea species. A total of 213 putative interspecific-level SNPs were identified from re-sequencing data in six amplicons, yielding on average one interspecific-level SNP per seven bp. The high polymorphism levels observed and the high success rate of transferability show that genic EST-derived SNP markers provide an efficient method for rapid marker development and SNP discovery in closely related oyster species. The six EST-SNP markers identified here will provide useful molecular tools for addressing questions in molecular ecology and evolution studies, including stock analysis (pedigree monitoring) in related oyster taxa.

Relevance:

30.00%

Publisher:

Abstract:

The Fabens method is commonly used to estimate the growth parameters k and L∞ in the von Bertalanffy model from tag-recapture data. However, the Fabens method of estimation has an inherent bias when individual growth is variable. This paper presents an asymptotically unbiased method using a maximum likelihood approach that takes account of individual variability in both maximum length and age-at-tagging. It is assumed that each individual's growth follows a von Bertalanffy curve with its own maximum length and age-at-tagging. The parameter k is assumed to be a constant to ensure that the mean growth follows a von Bertalanffy curve and to avoid overparameterization. Our method also makes more efficient use of the measurements at tagging and recapture and includes diagnostic techniques for checking distributional assumptions. The method is reasonably robust and performs better than the Fabens method when individual growth differs from the von Bertalanffy relationship. When measurement error is negligible, the estimation involves maximizing the profile likelihood of one parameter only. The method is applied to tag-recapture data for the grooved tiger prawn (Penaeus semisulcatus) from the Gulf of Carpentaria, Australia.
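
For orientation, the sketch below shows the classical Fabens least-squares fit that this abstract takes as its baseline; it is not the maximum likelihood method the paper proposes. It assumes the standard von Bertalanffy increment form, in which the expected growth over a time at liberty dt is (L∞ - L_tag)(1 - exp(-k dt)). The function names, the grid search over k, and the noise-free synthetic records are all hypothetical choices for the example.

```python
import math


def expected_increment(l_tag, dt, k, l_inf):
    """Expected growth between tagging and recapture under von Bertalanffy."""
    return (l_inf - l_tag) * (1.0 - math.exp(-k * dt))


def fabens_fit(records, k_grid):
    """Least-squares estimate of (k, L_inf) from tag-recapture records.

    records: list of (length_at_tagging, time_at_liberty, length_at_recapture).
    For each candidate k the best L_inf has a closed form, so only k is searched.
    """
    best = None
    for k in k_grid:
        g = [1.0 - math.exp(-k * dt) for _, dt, _ in records]
        # Closed-form L_inf minimizing the squared increment residuals at this k.
        num = sum(gi * ((l2 - l1) + l1 * gi) for gi, (l1, _, l2) in zip(g, records))
        den = sum(gi * gi for gi in g)
        l_inf = num / den
        sse = sum(
            ((l2 - l1) - (l_inf - l1) * gi) ** 2
            for gi, (l1, _, l2) in zip(g, records)
        )
        if best is None or sse < best[0]:
            best = (sse, k, l_inf)
    return best[1], best[2]


# Noise-free synthetic records generated with k = 0.5 and L_inf = 40 (made-up values).
records = [
    (l1, dt, l1 + expected_increment(l1, dt, 0.5, 40.0))
    for l1 in (10.0, 20.0, 30.0)
    for dt in (0.5, 1.0, 2.0)
]
k_hat, l_inf_hat = fabens_fit(records, [i / 100 for i in range(1, 201)])
```

On noise-free data the fit recovers the generating parameters. The bias discussed in the abstract arises only when maximum length varies between individuals, which this simple sketch does not model.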

Relevance:

30.00%

Publisher:

Abstract:

In this paper we have proposed and implemented a joint Medium Access Control (MAC)-cum-routing scheme for environment data gathering sensor networks. The design principle trades maximization of node battery lifetime against a network that is capable of tolerating (a) a known percentage of combined packet losses due to packet collisions, network synchronization mismatch and channel impairments, and (b) a significant end-to-end delay of the order of a few seconds. We have achieved this with a loosely synchronized network of sensor nodes that implement a Slotted-Aloha MAC state machine together with route information. The scheme has given encouraging results in terms of energy savings compared to other popular implementations. The overall packet loss is about 12%. The battery lifetime increase compared to B-MAC varies from a minimum of 30% to about 90%, depending on the duty cycle.

Relevance:

30.00%

Publisher:

Abstract:

This work belongs to the field of computational high-energy physics (HEP). The key methods used in this thesis work to meet the challenges raised by the Large Hadron Collider (LHC) era experiments are object-orientation with software engineering, Monte Carlo simulation, the computer technology of clusters, and artificial neural networks. The first aspect discussed is the development of hadronic cascade models, used for the accurate simulation of medium-energy hadron-nucleus reactions, up to 10 GeV. These models are typically needed in hadronic calorimeter studies and in the estimation of radiation backgrounds. Various applications outside HEP include the medical field (such as hadron treatment simulations), space science (satellite shielding), and nuclear physics (spallation studies). Validation results are presented for several significant improvements released in the Geant4 simulation tool, and the significance of the new models for computing in the Large Hadron Collider era is estimated. In particular, we estimate the ability of the Bertini cascade to simulate the Compact Muon Solenoid (CMS) hadron calorimeter (HCAL). LHC test beam activity has a tightly coupled cycle of simulation-to-data analysis. Typically, a Geant4 computer experiment is used to understand test beam measurements. Thus another aspect of this thesis is a description of studies related to developing new CMS H2 test beam data analysis tools and performing data analysis on the basis of CMS Monte Carlo events. These events have been simulated in detail using Geant4 physics models, the full CMS detector description, and event reconstruction. Using the ROOT data analysis framework, we have developed an offline ANN-based approach to tag b-jets associated with heavy neutral Higgs particles, and we show that this kind of NN methodology can be successfully used to separate the Higgs signal from the background in the CMS experiment.

Relevance:

30.00%

Publisher:

Abstract:

We introduce a variation density function that profiles the relationship between multiple scalar fields over isosurfaces of a given scalar field. This profile serves as a valuable tool for multifield data exploration because it provides the user with cues to identify interesting isovalues of scalar fields. Existing isosurface-based techniques for scalar data exploration like Reeb graphs, contour spectra, isosurface statistics, etc., study a scalar field in isolation. We argue that the identification of interesting isovalues in a multifield data set should necessarily be based on the interaction between the different fields. We demonstrate the effectiveness of our approach by applying it to explore data from a wide variety of applications.

Relevance:

30.00%

Publisher:

Abstract:

In the direction of arrival (DOA) estimation problem, we encounter both finite data and insufficient knowledge of array characterization. It is therefore important to study how subspace-based methods perform in such conditions. We analyze the finite data performance of the multiple signal classification (MUSIC) and minimum norm (min. norm) methods in the presence of sensor gain and phase errors, and derive expressions for the mean square error (MSE) in the DOA estimates. These expressions are first derived assuming an arbitrary array and then simplified for the special case of a uniform linear array with isotropic sensors. When they are further simplified for the case of finite data only and sensor errors only, they reduce to the recent results given in [9-12]. Computer simulations are used to verify the closeness between the predicted and simulated values of the MSE.

Relevance:

30.00%

Publisher:

Abstract:

Employing multiple base stations is an attractive approach to enhance the lifetime of wireless sensor networks. In this paper, we address the fundamental question concerning the limits on the network lifetime in sensor networks when multiple base stations are deployed as data sinks. Specifically, we derive upper bounds on the network lifetime when multiple base stations are employed, and obtain optimum locations of the base stations (BSs) that maximize these lifetime bounds. For the case of two BSs, we jointly optimize the BS locations by maximizing the lifetime bound using a genetic-algorithm-based optimization. Joint optimization for a larger number of BSs is complex. Hence, for the case of three BSs, we optimize the third BS location using the previously obtained optimum locations of the first two BSs. We also provide simulation results that validate the lifetime bounds and the optimum locations of the BSs.

Relevance:

30.00%

Publisher:

Abstract:

This paper presents a fast algorithm for data exchange in a network of processors organized as a reconfigurable tree structure. For a given data exchange table, the algorithm generates a sequence of tree configurations in which the data exchanges are to be executed. A significant feature of the algorithm is that each exchange is executed in a tree configuration in which the source and destination nodes are adjacent to each other. It has been proved that for every pair of nodes in the reconfigurable tree structure, there always exist exactly two configurations in which these two nodes are adjacent to each other. The algorithm utilizes this fact and determines the solution so as to optimize both the number of configurations required and the time to perform the data exchanges. Analysis of the algorithm shows that it has linear time complexity and provides a large reduction in run-time compared to a previously proposed algorithm. This is confirmed by the experimental results obtained by executing a large number of randomly generated data exchange tables. Another significant feature of the algorithm is that the bit-size of the routing information code is always two bits, irrespective of the number of nodes in the tree. This not only increases the speed of the algorithm but also results in simpler hardware inside each node.

Relevance:

30.00%

Publisher:

Abstract:

In this paper, power management algorithms for energy harvesting sensors (EHS) that operate purely on energy harvested from the environment are proposed. To maintain energy neutrality, EHS nodes schedule their utilization of the harvested power so as to save/draw energy into/from an inefficient battery during peak/low energy harvesting periods, respectively. Under this constraint, one of the key system design goals is to transmit as much data as possible given the energy harvesting profile. For implementational simplicity, it is assumed that the EHS transmits at a constant data rate with power control when the channel is sufficiently good. By converting the data rate maximization problem into a convex optimization problem, the optimal load scheduling (power management) algorithm that maximizes the average data rate subject to energy neutrality is derived. Also, the energy storage requirements on the battery for implementing the proposed algorithm are calculated. Further, robust schemes that account for the insufficiency of battery storage capacity, or for errors in the prediction of the harvested power, are proposed. The superior performance of the proposed algorithms over conventional scheduling schemes is demonstrated through computations using numerical data from solar energy harvesting databases.

Relevance:

30.00%

Publisher:

Abstract:

We consider a setting in which several operators offer downlink wireless data access services in a certain geographical region. Each operator deploys several base stations or access points, and registers some subscribers. In such a situation, if operators pool their infrastructure and permit subscribers to be served by any of the cooperating operators, then there can be better overall user satisfaction and increased operator revenue. We use coalitional game theory to investigate such resource pooling and cooperation between operators. We use utility functions to model user satisfaction, and show that the resulting coalitional game has the property that if all operators cooperate (i.e., form a grand coalition), then there is an operating point that maximizes the sum utility over the operators while providing the operators with revenues such that no subset of operators has an incentive to break away from the coalition. We investigate whether such operating points can result in utility unfairness between users of the various operators. We also study other revenue sharing concepts, namely the nucleolus and the Shapley value. Such investigations throw light on criteria for operators to accept or reject subscribers, based on the service level agreements proposed by them. We also investigate the situation in which only certain subsets of operators may be willing to cooperate.
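
To make one of the revenue sharing concepts named above concrete, the sketch below computes the Shapley value of a small coalitional game by averaging each player's marginal contribution over all orderings. The three operators "A", "B", "C" and their coalition revenues are made-up numbers, not results from the paper, and `shapley_values` is a hypothetical helper name.

```python
from itertools import permutations


def shapley_values(players, value):
    """Shapley value of each player, given value(coalition) for any frozenset."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the players before it.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical revenue of every coalition of three operators.
revenue = {
    frozenset(): 0,
    frozenset("A"): 10,
    frozenset("B"): 20,
    frozenset("C"): 30,
    frozenset("AB"): 40,
    frozenset("AC"): 50,
    frozenset("BC"): 60,
    frozenset("ABC"): 90,
}
phi = shapley_values(["A", "B", "C"], revenue.__getitem__)
```

The shares always sum to the grand-coalition revenue. Enumerating all orderings is only practical for a handful of operators, since the number of orderings grows factorially.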