942 results for outlier detection, data mining, gpgpu, gpu computing, supercomputing


Relevance:

100.00% 100.00%

Publisher:

Abstract:

We present improved analytical results as part of the ongoing analysis of the Fugue-256 hash function, a second-round candidate in NIST’s SHA-3 competition. First, we improve Aumasson and Phan’s integral distinguisher on the final transformation of Fugue-256 from 5.5 rounds to 16.5 rounds. Next, we improve the designers’ meet-in-the-middle preimage attack on Fugue-256 from 2^480 time and memory to 2^416. Finally, we comment on possible methods to obtain free-start distinguishers and free-start collisions for Fugue-256.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In recent years, multidimensional data have received increasing attention from researchers seeking to build better recommender systems. Additional metadata gives algorithms more detail for understanding the interaction between users and items. While neighbourhood-based Collaborative Filtering (CF) approaches and latent factor models tackle this task effectively in various ways, each exploits only part of the structure of the data. In this paper, we seek to delve into different types of relations in data and to understand the interaction between users and items more holistically. We propose a generic multidimensional CF fusion approach for top-N item recommendations. The proposed approach is capable of incorporating not only localized user-user and item-item relations but also latent interaction between all dimensions of the data. Experimental results show significant improvements in recommendation accuracy with the proposed approach.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Over the last few years, investigations of human epigenetic profiles have identified the key elements of change as histone modifications, stable and heritable DNA methylation, and chromatin remodeling. These factors determine gene expression levels and characterise conditions leading to disease. Data mining and pattern recognition tools are widely used to extract information embedded in long DNA sequences, but to date there have been limited efforts to analyze epigenetic changes and their role as catalysts in disease onset. Useful insight, however, can be gained by investigating the associated dinucleotide distributions. The focus of this paper is to explore specific dinucleotide frequencies across defined regions within the human genome, and to identify new patterns between epigenetic mechanisms and DNA content. Signal processing methods, including Fourier and wavelet transformations, are employed and principal results are reported.
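As a rough illustration of the dinucleotide frequencies mentioned in this abstract, the following minimal sketch counts overlapping two-base windows in a DNA string. The function name `dinucleotide_frequencies` and the choice to skip windows containing symbols other than A, C, G and T are assumptions made for this example; the abstract does not describe the authors' actual procedure.

```python
from collections import Counter


def dinucleotide_frequencies(sequence: str) -> dict[str, float]:
    """Return relative frequencies of overlapping dinucleotides in a DNA string.

    Hypothetical helper for illustration only, not the paper's method.
    """
    seq = sequence.upper()
    # Slide a window of width 2 along the sequence (overlapping pairs).
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Assumption: ignore windows containing ambiguous symbols such as N.
    valid = [p for p in pairs if set(p) <= set("ACGT")]
    counts = Counter(valid)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


# "ACGCGT" contains the pairs AC, CG, GC, CG, GT, so CG makes up 2 of 5.
freqs = dinucleotide_frequencies("ACGCGT")
print(freqs["CG"])
```

Computing such a frequency profile per genomic region would give a numeric series per dinucleotide, which is the kind of input a Fourier or wavelet analysis could then be applied to.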

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Opsins are ancient molecules that enable animal vision by coupling to a vitamin-derived chromophore to form light-sensitive photopigments. The primary drivers of evolutionary diversification in opsins are thought to be visual tasks related to spectral sensitivity and color vision. Typically, only a few opsin amino acid sites affect photopigment spectral sensitivity. We show that opsin genes of the North American butterfly Limenitis arthemis have diversified along a latitudinal cline, consistent with natural selection due to environmental factors. We sequenced single nucleotide polymorphisms (SNPs) in the coding regions of the ultraviolet (UVRh), blue (BRh), and long-wavelength (LWRh) opsin genes from ten butterfly populations along the eastern United States and found that a majority of opsin SNPs showed significant clinal variation. Outlier detection and analysis of molecular variance indicated that many SNPs are under balancing selection and show significant population structure. This contrasts with what we found by analysing SNPs in the wingless and EF-1 alpha loci, and from neutral amplified fragment length polymorphisms, which show no evidence of significant locus-specific or genome-wide structure among populations. Using a combination of functional genetic and physiological approaches, including expression in cell culture, transgenic Drosophila, UV-visible spectroscopy, and optophysiology, we show that key BRh opsin SNPs that vary clinally have almost no effect on spectral sensitivity. Our results suggest that opsin diversification in this butterfly is more consistent with natural selection unrelated to spectral tuning. Some of the clinally varying SNPs may instead play a role in regulating opsin gene expression levels or the thermostability of the opsin protein. Lastly, we discuss the possibility that insect opsins might have important, yet-to-be-elucidated adaptive functions in mediating animal responses to abiotic factors, such as temperature or photoperiod.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Traditional text classification technology based on machine learning and data mining techniques has made considerable progress. However, drawing an exact decision boundary between relevant and irrelevant objects in binary classification remains a major problem because of the uncertainty produced by traditional algorithms. The proposed model, CTTC (Centroid Training for Text Classification), aims to build an uncertainty boundary that absorbs as many indeterminate objects as possible, so as to raise the certainty of the relevant and irrelevant groups through a centroid clustering and training process. The clustering starts from the two training subsets, labelled relevant and irrelevant respectively, and creates two principal centroid vectors by which all the training samples are further separated into three groups: POS, NEG and BND, with all the indeterminate objects absorbed into the uncertain decision boundary BND. Two pairs of centroid vectors are then trained and optimized through a subsequent iterative multi-learning process, and together they help predict the polarities of incoming objects. For the assessment of the proposed model, F1 and Accuracy have been chosen as the key evaluation measures. We stress the F1 measure because it displays the overall performance improvement of the final classifier better than Accuracy. A large number of experiments have been completed using the proposed model on the Reuters Corpus Volume 1 (RCV1), which is an important standard dataset in the field. The experimental results show that the proposed model significantly improves binary text classification performance in both F1 and Accuracy compared with three other influential baseline models.
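The three-way separation into POS, NEG and BND can be pictured with the sketch below. It is not the CTTC algorithm: the abstract does not state the separation rule, so this example assumes cosine similarity to the two class centroids and a hypothetical `margin` parameter, and it omits the iterative training of the two pairs of centroid vectors.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def three_way_split(relevant, irrelevant, margin=0.2):
    """Assign each training vector to POS, NEG or the boundary group BND.

    Assumed rule (not from the paper): a sample is indeterminate when its
    similarity to the two centroids differs by no more than `margin`.
    """
    c_pos = centroid(relevant)
    c_neg = centroid(irrelevant)
    groups = {"POS": [], "NEG": [], "BND": []}
    for vec in relevant + irrelevant:
        gap = cosine(vec, c_pos) - cosine(vec, c_neg)
        if gap > margin:
            groups["POS"].append(vec)
        elif gap < -margin:
            groups["NEG"].append(vec)
        else:
            groups["BND"].append(vec)
    return groups


# Toy two-term document vectors; [0.5, 0.5] sits between the two classes.
relevant = [[1.0, 0.0], [0.9, 0.1], [0.5, 0.5]]
irrelevant = [[0.0, 1.0], [0.1, 0.9]]
groups = three_way_split(relevant, irrelevant)
print(groups["BND"])
```

In this toy data the ambiguous vector lands in BND even though it was labelled relevant, which is the effect the abstract describes: indeterminate objects are pulled out so that the remaining POS and NEG groups are more certain.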

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Preface. The 9th Australasian Conference on Information Security and Privacy (ACISP 2004) was held in Sydney, 13–15 July, 2004. The conference was sponsored by the Centre for Advanced Computing – Algorithms and Cryptography (ACAC), Information and Networked Security Systems Research (INSS), Macquarie University and the Australian Computer Society. The aims of the conference are to bring together researchers and practitioners working in areas of information security and privacy from universities, industry and government sectors. The conference program covered a range of aspects including cryptography, cryptanalysis, systems and network security. The program committee accepted 41 papers from 195 submissions. The reviewing process took six weeks and each paper was carefully evaluated by at least three members of the program committee. We appreciate the hard work of the members of the program committee and external referees who gave many hours of their valuable time. Of the accepted papers, there were nine from Korea, six from Australia, five each from Japan and the USA, three each from China and Singapore, two each from Canada and Switzerland, and one each from Belgium, France, Germany, Taiwan, The Netherlands and the UK. All the authors, whether or not their papers were accepted, made valued contributions to the conference. In addition to the contributed papers, Dr Arjen Lenstra gave an invited talk, entitled Likely and Unlikely Progress in Factoring. This year the program committee introduced the Best Student Paper Award. The winner of the prize for the Best Student Paper was Yan-Cheng Chang from Harvard University for his paper Single Database Private Information Retrieval with Logarithmic Communication. We would like to thank all the people involved in organizing this conference. In particular we would like to thank members of the organizing committee for their time and efforts, Andrina Brennan, Vijayakrishnan Pasupathinathan, Hartono Kurnio, Cecily Lenton, and members from ACAC and INSS.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Public buildings and large infrastructure are typically monitored by tens or hundreds of cameras, all capturing different physical spaces and observing different types of interactions and behaviours. However, to date, in large part due to limited data availability, crowd monitoring and operational surveillance research has focused on single-camera scenarios, which are not representative of real-world applications. In this paper we present a new, publicly available database for large-scale crowd surveillance. Footage from 12 cameras for a full work day, covering the main floor of a busy university campus building (including an internal and external foyer, elevator foyers, and the main external approach), is provided, alongside annotation for crowd counting (single or multi-camera) and pedestrian flow analysis for 10 and 6 sites respectively. We describe how this large dataset can be used to perform distributed monitoring of building utilisation, and demonstrate the potential of this dataset to understand and learn the relationship between different areas of a building.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Driving on an approach to a signalized intersection while distracted is relatively risky, as potential vehicular conflicts and resulting angle collisions tend to be more severe than at other locations. Given the prevalence and importance of this particular scenario, the objective of this study was to examine the decisions and actions of distracted drivers during the onset of yellow lights. Driving simulator data were obtained from a sample of 69 drivers under baseline and handheld cell phone conditions at the University of Iowa – National Advanced Driving Simulator. Explanatory variables included age, gender, cell phone use, distance to stop-line, and speed. Although there is extensive research on drivers’ responses to yellow traffic signals, the examinations have been conducted from a traditional regression-based approach, which does not necessarily reveal the underlying relations and patterns among the sampled data. In this paper, we exploit the benefits of both classical statistical inference and data mining techniques to identify the a priori relationships among main effects, non-linearities, and interaction effects. Results suggest that the probability of yellow light running increases with driving speed at the onset of yellow. Both young (18–25 years) and middle-aged (30–45 years) drivers show a reduced propensity for yellow light running whilst distracted across the entire speed range, exhibiting possible risk compensation during this critical driving situation. The propensity for yellow light running is significantly higher for distracted older (50–60 years) drivers, both male and female. Driver experience, captured by age, interacts with distraction: the combination of slower physiological response and distraction is particularly risky.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Acoustic classification of anurans (frogs) has received increasing attention for its promising application in biological and environmental studies. In this study, a novel feature extraction method for frog call classification is presented based on the analysis of spectrograms. The frog calls are first automatically segmented into syllables. Then, spectral peak tracks are extracted to separate the desired signal (frog calls) from background noise. The spectral peak tracks are used to extract various syllable features, including syllable duration, dominant frequency, oscillation rate, frequency modulation, and energy modulation. Finally, a k-nearest neighbor classifier is used for classifying frog calls based on the results of principal component analysis. The experimental results show that syllable features can achieve an average classification accuracy of 90.5%, which outperforms Mel-frequency cepstral coefficient features (79.0%).
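The final classification step can be sketched as a plain k-nearest neighbor vote over syllable feature vectors. This is only an illustration: the principal component analysis step is omitted, the value of k is an assumption, and the feature values and labels (`species_a`, `species_b`) are invented for the example rather than taken from the study.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k closest training points.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical (syllable duration in s, dominant frequency in kHz) features.
train = [
    ((0.10, 2.1), "species_a"),
    ((0.12, 2.0), "species_a"),
    ((0.11, 2.2), "species_a"),
    ((0.40, 4.0), "species_b"),
    ((0.42, 4.1), "species_b"),
    ((0.38, 3.9), "species_b"),
]
prediction = knn_predict(train, (0.13, 2.3))
print(prediction)
```

Because Euclidean distance is sensitive to units, features measured on different scales would normally be standardised (or projected, as with PCA) before the distance computation.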

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Over the past few decades, frog species have been experiencing a dramatic decline around the world. The reasons for this decline include habitat loss, invasive species, and climate change, among others. To better understand the status of frog species, classifying frogs has become increasingly important. In this study, acoustic features are investigated for multi-level classification of Australian frogs by family, genus and species; the data comprise three families, eleven genera and eighty-five species collected from Queensland, Australia. For each frog species, six instances are selected, from which ten acoustic features are calculated. Then, the multicollinearity among the ten features is studied in order to select non-correlated features for subsequent analysis. A decision tree (DT) classifier is used to determine visually and explicitly which acoustic features are relatively important for classifying family, which for genus, and which for species. Finally, a weighted support vector machine (SVM) classifier is used for the multi-level classification, with the three most important acoustic features at each level. Our experimental results indicate that using different acoustic feature sets can successfully classify frogs at different levels, and the average classification accuracy can be up to 85.6%, 86.1% and 56.2% for family, genus and species respectively.
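One simple way to screen features for multicollinearity is a pairwise Pearson correlation filter, sketched below. The abstract does not say which test the authors used, so the greedy rule, the `threshold` of 0.9, and the feature names and values are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def select_uncorrelated(features, threshold=0.9):
    """Greedily keep features whose |r| with every kept feature is below threshold.

    `features` maps a feature name to its list of values across instances.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical values for six instances; duration_ms duplicates duration.
features = {
    "duration": [1, 2, 3, 4, 5, 6],
    "duration_ms": [1000, 2000, 3000, 4000, 5000, 6000],
    "dominant_freq": [3, 1, 4, 1, 5, 9],
}
selected = select_uncorrelated(features)
print(selected)
```

The greedy pass depends on the order of the features, so a feature listed earlier wins over a later one it is strongly correlated with; alternatives such as variance inflation factors avoid that dependence at the cost of more computation.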

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Document clustering is one of the prominent methods for mining important information from the vast amount of data available on the web. However, document clustering generally suffers from the curse of dimensionality. Fortunately, in high-dimensional space, data points tend to be more concentrated in some areas of clusters. We take advantage of this phenomenon by introducing a novel concept of dynamic cluster representation called loci. Clusters’ loci are efficiently calculated using documents’ ranking scores generated from a search engine. We propose a fast loci-based semi-supervised document clustering algorithm that uses clusters’ loci instead of conventional centroids for assigning documents to clusters. Empirical analysis on real-world datasets shows that the proposed method produces cluster solutions of promising quality and is substantially faster than several benchmarked centroid-based semi-supervised document clustering methods.