36 results for Data selection

at Indian Institute of Science - Bangalore - India


Relevance: 40.00%

Abstract:

A common and practical paradigm in cooperative communications is the use of a dynamically selected 'best' relay to decode and forward information from a source to a destination. Such a system consists of two core phases: a relay selection phase, in which the system expends resources to select the best relay, and a data transmission phase, in which it uses the selected relay to forward data to the destination. In this paper, we study and optimize the trade-off between the selection and data transmission phase durations. We derive closed-form expressions for the overall throughput of a non-adaptive system that includes the selection phase overhead, and then optimize the selection and data transmission phase durations. Corresponding results are also derived for an adaptive system in which the relays can vary their transmission rates. Our results show that the optimal selection phase overhead can be significant even for fast selection algorithms. Furthermore, the optimal selection phase duration depends on the number of relays and whether adaptation is used.

Relevance: 40.00%

Abstract:

Outlier detection in high dimensional categorical data has been a problem of much interest due to the extensive use of qualitative features for describing the data across various application areas. Though there exist various established methods for dealing with the dimensionality aspect through feature selection on numerical data, the categorical domain is actively being explored. As outlier detection is generally considered as an unsupervised learning problem due to lack of knowledge about the nature of various types of outliers, the related feature selection task also needs to be handled in a similar manner. This motivates the need to develop an unsupervised feature selection algorithm for efficient detection of outliers in categorical data. Addressing this aspect, we propose a novel feature selection algorithm based on the mutual information measure and the entropy computation. The redundancy among the features is characterized using the mutual information measure for identifying a suitable feature subset with less redundancy. The performance of the proposed algorithm in comparison with the information gain based feature selection shows its effectiveness for outlier detection. The efficacy of the proposed algorithm is demonstrated on various high-dimensional benchmark data sets employing two existing outlier detection methods.
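
As a rough illustration of how redundancy between two categorical features can be measured with mutual information, the sketch below computes Shannon entropy and I(X;Y) = H(X) + H(Y) - H(X,Y) from value counts. It is not the algorithm proposed in the abstract; the function names and the toy columns (`colour`, `shape`, `size`) are assumptions made only for this example.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two equally long categorical columns."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


# Hypothetical toy columns: 'shape' is fully determined by 'colour',
# while 'size' is independent of 'colour'.
colour = ["r", "r", "g", "g"]
shape = ["sq", "sq", "ci", "ci"]
size = ["s", "l", "s", "l"]

mi_redundant = mutual_information(colour, shape)    # high: redundant pair
mi_independent = mutual_information(colour, size)   # zero: independent pair
```

A feature whose mutual information with an already selected feature is high adds little new information, which is the sense in which such a pair is redundant.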

Relevance: 40.00%

Abstract:

Because streaming data arrives continuously as an ordered sequence, massive amounts of data are created. A big challenge in handling data streams is the limitation of time and space. Prototype selection on streaming data requires the prototypes to be updated incrementally as new data comes in. We propose an incremental algorithm for prototype selection. This algorithm can also be used to handle very large datasets. Results are presented on a number of large datasets, and our method is compared to an existing algorithm for streaming data. Our algorithm saves time, and the selected prototypes give good classification accuracy.

Relevance: 30.00%

Abstract:

Security in a mobile communication environment is always a matter for concern, even after deploying many security techniques at device, network, and application levels. The end-to-end security for mobile applications can be made robust by developing dynamic schemes at the application level that make use of existing security techniques varying in terms of space, time, and attack complexities. In this paper, we present a security technique selection scheme for mobile transactions, called the Transactions-Based Security Scheme (TBSS). The TBSS uses intelligence to study and analyze the security implications of transactions under execution based on criteria such as user behaviors, transaction sensitivity levels, credibility factors computed over the users' previous transactions, network vulnerability, and device characteristics. The TBSS identifies a suitable level of security techniques from the repository, which consists of symmetric and asymmetric security algorithms arranged in three complexity levels, covering various encryption/decryption techniques, digital signature schemes, and hashing techniques. From this identified level, one of the techniques is deployed randomly. The results show that there is a considerable reduction in security cost compared to static schemes, which employ pre-fixed security techniques to secure the transaction data.

Relevance: 30.00%

Abstract:

The motivation behind the fusion of Intrusion Detection Systems was the realization that with the increasing traffic and increasing complexity of attacks, none of the present day stand-alone Intrusion Detection Systems can meet the high demand for a very high detection rate and an extremely low false positive rate. Multi-sensor fusion can be used to meet these requirements by a refinement of the combined response of different Intrusion Detection Systems. In this paper, we show the design technique of sensor fusion to best utilize the useful response from multiple sensors by an appropriate adjustment of the fusion threshold. The threshold is generally chosen according to the past experiences or by an expert system. In this paper, we show that the choice of the threshold bounds according to the Chebyshev inequality principle performs better. This approach also helps to solve the problem of scalability and has the advantage of failsafe capability. This paper theoretically models the fusion of Intrusion Detection Systems for the purpose of proving the improvement in performance, supplemented with the empirical evaluation. The combination of complementary sensors is shown to detect more attacks than the individual components. Since the individual sensors chosen detect sufficiently different attacks, their result can be merged for improved performance. The combination is done in different ways like (i) taking all the alarms from each system and avoiding duplications, (ii) taking alarms from each system by fixing threshold bounds, and (iii) rule-based fusion with a priori knowledge of the individual sensor performance. A number of evaluation metrics are used, and the results indicate that there is an overall enhancement in the performance of the combined detector using sensor fusion incorporating the threshold bounds and significantly better performance using simple rule-based fusion.
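
To make the idea of a Chebyshev-based threshold concrete, the sketch below uses only the general inequality P(|X - mu| >= k*sigma) <= 1/k^2 to place a threshold k standard deviations above the mean of a set of scores. It is an illustration of the inequality, not the fusion design from the abstract; the function name, the sample scores, and the choice of a one-sided threshold are assumptions.

```python
from math import sqrt
from statistics import mean, pstdev


def chebyshev_threshold(scores, max_tail_prob):
    """Return mu + k*sigma with k = 1/sqrt(max_tail_prob).

    By Chebyshev's inequality, P(|X - mu| >= k*sigma) <= 1/k**2,
    so at most max_tail_prob of the probability mass lies this far from mu,
    whatever the distribution of the scores.
    """
    if not 0 < max_tail_prob <= 1:
        raise ValueError("max_tail_prob must be in (0, 1]")
    mu = mean(scores)
    sigma = pstdev(scores)          # population standard deviation
    k = 1 / sqrt(max_tail_prob)
    return mu + k * sigma


# Hypothetical anomaly scores observed on normal traffic.
normal_scores = [0.0, 2.0, 4.0, 6.0, 8.0]
threshold = chebyshev_threshold(normal_scores, max_tail_prob=0.25)  # k = 2
```

Because the bound holds for any distribution with finite variance, it tends to be conservative; that trade-off is the usual reason to prefer it when the score distribution is unknown.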

Relevance: 30.00%

Abstract:

Receive antenna selection (AS) reduces the hardware complexity of multi-antenna receivers by dynamically connecting an instantaneously best antenna element to the available radio frequency (RF) chain. Due to the hardware constraints, the channels at various antenna elements have to be sounded sequentially to obtain estimates that are required for selecting the "best" antenna and for coherently demodulating data. Consequently, the channel state information at different antennas is outdated by different amounts. We show that, for this reason, simply selecting the antenna with the highest estimated channel gain is not optimum. Rather, the channel estimates of different antennas should be weighted differently, depending on the training scheme. We derive closed-form expressions for the symbol error probability (SEP) of AS for MPSK and MQAM in time-varying Rayleigh fading channels for arbitrary selection weights, and validate them with simulations. We then derive an explicit formula for the optimal selection weights that minimize the SEP. We find that when selection weights are not used, the SEP need not improve as the number of antenna elements increases, which is in contrast to the ideal channel estimation case. However, the optimal selection weights remedy this situation and significantly improve performance.
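
The difference between unweighted and weighted selection can be sketched in a few lines. The example below picks the antenna that maximizes |w_i * h_i|^2, where h_i is the channel estimate and w_i a per-antenna weight. The selection metric, the function name, and the numeric weights are assumptions chosen for illustration; they are not the optimal weights derived in the paper.

```python
def select_antenna(estimates, weights=None):
    """Return the index of the antenna with the largest weighted estimated gain.

    estimates: complex channel estimates, one per antenna.
    weights:   real selection weights; defaults to 1.0 (unweighted selection).
    """
    if weights is None:
        weights = [1.0] * len(estimates)
    if len(weights) != len(estimates):
        raise ValueError("need exactly one weight per antenna")
    metrics = [abs(w * h) ** 2 for w, h in zip(weights, estimates)]
    return max(range(len(metrics)), key=metrics.__getitem__)


# Antenna 0 was sounded first, so its estimate is the most outdated.
estimates = [1.0 + 1.0j, 0.9 + 0.9j]
weights = [0.8, 1.0]   # hypothetical: discount the older estimate

unweighted_choice = select_antenna(estimates)           # picks antenna 0
weighted_choice = select_antenna(estimates, weights)    # picks antenna 1
```

In this toy case the older estimate looks slightly stronger, but once it is discounted the fresher estimate wins, which is the effect the abstract describes.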

Relevance: 30.00%

Abstract:

Hardware constraints, which motivate receive antenna selection, also require that various antenna elements at the receiver be sounded sequentially to obtain estimates required for selecting the 'best' antenna and for coherently demodulating data thereafter. Consequently, the channel state information at different antennas is outdated by different amounts and corrupted by noise. We show that, for this reason, simply selecting the antenna with the highest estimated channel gain is not optimum. Rather, a preferable strategy is to linearly weight the channel estimates of different antennas differently, depending on the training scheme. We derive closed-form expressions for the symbol error probability (SEP) of AS for MPSK and MQAM in time-varying Rayleigh fading channels for arbitrary selection weights, and validate them with simulations. We then characterize explicitly the optimal selection weights that minimize the SEP. We also consider packet reception, in which multiple symbols of a packet are received by the same antenna. New suboptimal, but computationally efficient weighted selection schemes are proposed for reducing the packet error rate. The benefits of weighted selection are also demonstrated using a practical channel code used in third generation cellular systems. Our results show that optimal weighted selection yields a significant performance gain over conventional unweighted selection.

Relevance: 30.00%

Abstract:

A common and practical paradigm in cooperative communication systems is the use of a dynamically selected 'best' relay to decode and forward information from a source to a destination. Such systems use two phases: a relay selection phase, in which the system uses transmission time and energy to select the best relay, and a data transmission phase, in which it uses the spatial diversity benefits of selection to transmit data. In this paper, we derive closed-form expressions for the overall throughput and energy consumption, and study the time and energy trade-off between the selection and data transmission phases. To this end, we analyze a baseline non-adaptive system and several adaptive systems that adapt the selection phase, relay transmission power, or transmission time. Our results show that while selection yields significant benefits, the selection phase's time and energy overhead can be significant. In fact, at the optimal point, the selection can be far from perfect, and depends on the number of relays and the mode of adaptation. The results also provide guidelines about the optimal system operating point for different modes of adaptation. The analysis also provides new insights into the fast splitting-based algorithm considered in this paper for relay selection.

Relevance: 30.00%

Abstract:

In receive antenna selection (AS), only signals from a subset of the antennas are processed at any time by the limited number of radio frequency (RF) chains available at the receiver. Hence, the transmitter needs to send pilots multiple times to enable the receiver to estimate the channel state of all the antennas and select the best subset. Conventionally, the sensitivity of coherent reception to channel estimation errors has been tackled by boosting the energy allocated to all pilots to ensure accurate channel estimates for all antennas. Energy for pilots received by unselected antennas is mostly wasted, especially since the selection process is robust to estimation errors. In this paper, we propose a novel training method uniquely tailored for AS that transmits one extra pilot symbol that generates accurate channel estimates for the antenna subset that actually receives data. Consequently, the transmitter can selectively boost the energy allocated to the extra pilot. We derive closed-form expressions for the proposed scheme's symbol error probability for MPSK and MQAM, and optimize the energy allocated to pilot and data symbols. Through an insightful asymptotic analysis, we show that the optimal solution achieves full diversity and is better than the conventional method.

Relevance: 30.00%

Abstract:

The K-means algorithm for clustering is very much dependent on the initial seed values. We use a genetic algorithm to find a near-optimal partitioning of the given data set by selecting proper initial seed values in the K-means algorithm. Results obtained are very encouraging and in most of the cases, on data sets having well separated clusters, the proposed scheme reached a global minimum.
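
The general idea of searching over initial seeds with a genetic algorithm can be sketched as follows. Each chromosome is a tuple of k indices into the data set, and its fitness is the sum of squared errors (SSE) that Lloyd's K-means reaches from those seeds. The operators used here (truncation selection, uniform crossover, single-gene mutation), the parameter values, and the toy data are assumptions; the abstract does not specify the authors' encoding or operators.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_sse(points, seeds, iters=20):
    """Run Lloyd's K-means from the given seed points and return the final SSE."""
    centroids = list(seeds)
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Recompute each centroid; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def ga_seed_search(points, k, pop_size=12, generations=15, mutation_rate=0.2, rng=None):
    """Search for seed indices whose K-means run gives a low SSE."""
    rng = rng or random.Random(0)

    def fitness(chromosome):
        return kmeans_sse(points, [points[i] for i in chromosome])

    population = [tuple(rng.sample(range(len(points)), k)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]      # keep the better half (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]   # uniform crossover
            if rng.random() < mutation_rate:
                child[rng.randrange(k)] = rng.randrange(len(points))  # mutate one gene
            children.append(tuple(child))
        population = parents + children
    best = min(population, key=fitness)
    return best, fitness(best)


# Hypothetical data: two well-separated clusters of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
best_seeds, best_sse = ga_seed_search(points, k=2)
```

Evaluating fitness by running K-means to completion is expensive, so a practical version would cache fitness values rather than recompute them on every sort.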

Relevance: 30.00%

Abstract:

Isoactivity lines for carbon with respect to diamond as the standard state have been calculated in the ternary system C-H-O at 1223 K to identify the diamond deposition domain. The gas composition is calculated by suppressing the formation of all condensed forms of carbon using the SOLGASMIX free-energy minimization program. Thirty six gas species were included in the calculation. From the gas composition, isoactivity lines are computed using recent data on the Gibbs energy of diamond. Except for activities less than 0.1, the isoactivity lines are almost linear on the C-H-O ternary diagram. Gas compositions which generate activity of diamond ranging from 1 to 100 at 1223 K fall inside a narrow wedge originating from the point representing CO. This wedge is very similar to the revised lens-shaped diamond growth domain identified by Bachman et al., using inputs from experiment. The small difference between the calculated and observed domains may be attributed to variation in the supersaturation required for diamond deposition with gas composition. The diamond solubility in the gas phase along the isoactivity line for a(di)=100 and P=6.7 kPa exhibits a minimum at 1280 K, which is close to the optimum temperature found experimentally. At higher supersaturations, non-diamond forms of carbon, including amorphous varieties, are expected. The results suggest that thermodynamic calculations can be useful for locating diamond growth domains in more complex CVD systems containing halogens, for which very little experimental data is available.

Relevance: 30.00%

Abstract:

Antenna selection (AS) provides most of the benefits of multiple-antenna systems at drastically reduced hardware costs. In receive AS, the receiver connects a dynamically selected subset of N available antennas to the L available RF chains. The "best" subset to be used for data reception is determined by means of channel estimates acquired using training sequences. Due to the nature of AS, the channel estimates at different antennas are obtained from different transmissions of the pilot sequence, and are, thus, outdated by different amounts in a time-varying channel. We show that a linear weighting of the estimates is optimum for the subset selection process, where the weights are related to the temporal correlation of the channel variations. When L is not an integer divisor of N, we highlight a new issue of "training voids", in which the last pilot transmission is not fully exploited by the receiver. We present a "void-filling" method for fully exploiting these voids, which essentially provides more accurate training for some antennas, and derive the optimal subset selection rule for any void-filling method. We also derive new closed-form equations for the performance of receive AS with optimal subset selection.
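
A small arithmetic sketch may help to see when a training void appears. It assumes one reading of the abstract: each pilot transmission lets the L RF chains sound L antennas, so ceil(N/L) transmissions are needed, and any chain slots left unused in the last transmission form the void. The function name and this way of counting are assumptions, not definitions taken from the paper.

```python
def training_schedule(n_antennas, n_chains):
    """Return (pilot_transmissions, unused_chain_slots_in_last_transmission)."""
    if n_antennas < 1 or n_chains < 1:
        raise ValueError("need at least one antenna and one RF chain")
    transmissions = -(-n_antennas // n_chains)        # integer ceil(N / L)
    void = transmissions * n_chains - n_antennas      # zero when L divides N
    return transmissions, void


no_void = training_schedule(4, 2)     # (2, 0): L divides N, no void
one_void = training_schedule(5, 2)    # (3, 1): one chain idle in the last pilot
two_void = training_schedule(7, 3)    # (3, 2): two chains idle in the last pilot
```

Under this reading, a void-filling method would use the idle slots to re-sound some antennas, so those antennas end up with more than one estimate.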

Relevance: 30.00%

Abstract:

We develop an optimal, distributed, and low feedback timer-based selection scheme to enable next generation rate-adaptive wireless systems to exploit multi-user diversity. In our scheme, each user sets a timer depending on its signal to noise ratio (SNR) and transmits a small packet to identify itself when its timer expires. When the SNR-to-timer mapping is monotone non-increasing, timers of users with better SNRs expire earlier. Thus, the base station (BS) simply selects the first user whose timer expiry it can detect, and transmits data to it at as high a rate as reliably possible. However, timers that expire too close to one another cannot be detected by the BS due to collisions. We characterize in detail the structure of the SNR-to-timer mapping that optimally handles these collisions to maximize the average data rate. We prove that the optimal timer values take only a discrete set of values, and that the rate adaptation policy strongly influences the optimal scheme's structure. The optimal average rate is very close to that of ideal selection, in which the BS always selects the highest-rate user, and is much higher than that of the popular, but ad hoc, timer schemes considered in the literature.
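
The mechanics of timer-based selection, including the collision case, can be sketched as below. The mapping used here is a simple linear one in which a higher SNR gives a shorter timer; it is a stand-in for illustration and not the optimal discrete mapping characterized in the paper. The function names, the SNR range, and the vulnerability window are assumptions.

```python
def timer_for_snr(snr_db, snr_max_db=30.0, t_max=1.0):
    """Hypothetical linear mapping: the higher the SNR, the shorter the timer."""
    clipped = min(max(snr_db, 0.0), snr_max_db)
    return t_max * (1.0 - clipped / snr_max_db)


def select_user(snrs_db, vulnerability_window=0.05):
    """Return the index of the first user whose timer expires, or None.

    None is returned when there are no users, or when the two earliest
    timers expire within the vulnerability window of each other, which
    the base station cannot resolve (a collision).
    """
    if not snrs_db:
        return None
    timers = sorted((timer_for_snr(s), i) for i, s in enumerate(snrs_db))
    if len(timers) > 1 and timers[1][0] - timers[0][0] < vulnerability_window:
        return None
    return timers[0][1]


selected = select_user([12.0, 27.0, 18.0])   # user 1 has the best SNR
collided = select_user([27.0, 26.5, 10.0])   # two best timers too close
```

With a continuous mapping like this one, users with nearly equal SNRs collide often, which hints at why a mapping onto a discrete, well-spaced set of timer values can perform better.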

Relevance: 30.00%

Abstract:

Receive antenna selection (AS) provides many benefits of multiple-antenna systems at drastically reduced hardware costs. In it, the receiver connects a dynamically selected subset of N available antennas to the L available RF chains. Due to the nature of AS, the channel estimates at different antennas, which are required to determine the best subset for data reception, are obtained from different transmissions of the pilot sequence. Consequently, they are outdated by different amounts in a time-varying channel. We show that a linear weighting of the estimates is necessary and optimum for the subset selection process, where the weights are related to the temporal correlation of the channel variations. When L is not an integer divisor of N, we highlight a new issue of "training voids", in which the last pilot transmission is not fully exploited by the receiver. We then present new "void-filling" methods that exploit these voids and greatly improve the performance of AS. The optimal subset selection rules with void-filling, in which different antennas turn out to have different numbers of estimates, are also explicitly characterized. Closed-form equations for the symbol error probability with and without void-filling are also developed.

Relevance: 30.00%

Abstract:

Feature selection is an important first step in regional hydrologic studies (RHYS). Over the past few decades, advances in data collection facilities have resulted in development of data archives on a variety of hydro-meteorological variables that may be used as features in RHYS. Currently there are no established procedures for selecting features from such archives. Therefore, hydrologists often use subjective methods to arrive at a set of features. This may lead to misleading results. To alleviate this problem, a probabilistic clustering method for regionalization is presented to determine appropriate features from the available dataset. The effectiveness of the method is demonstrated by application to regionalization of watersheds in conterminous United States for low flow frequency analysis. Plausible homogeneous regions that are formed by using the proposed clustering method are compared with those from conventional methods of regionalization using L-moment based homogeneity tests. Results show that the proposed methodology is promising for RHYS.