843 results for Feature grouping
Abstract:
MapReduce is a computation model for processing large data sets in parallel on large clusters of machines, in a reliable, fault-tolerant manner. A MapReduce computation is broken down into a number of map tasks and reduce tasks, which are performed by so-called mappers and reducers, respectively. The placement of the mappers and reducers on the machines directly affects the performance and cost of the MapReduce computation. From the computational point of view, the mappers/reducers placement problem is a generalization of the classical bin packing problem, which is NP-complete. Thus, in this paper we propose a new grouping genetic algorithm for the mappers/reducers placement problem in cloud computing. Compared with the original one, our grouping genetic algorithm uses an innovative coding scheme and also eliminates the inversion operator, which is an essential operator in the original grouping genetic algorithm. The new grouping genetic algorithm is evaluated by experiments, and the experimental results show that it is much more efficient than four popular algorithms for the problem, including the original grouping genetic algorithm.
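To make the underlying bin packing formulation concrete, the following is a minimal sketch of the classical first-fit decreasing heuristic, treating each mapper or reducer as an item with a single resource demand and each machine as a bin of fixed capacity. This is not the grouping genetic algorithm the abstract proposes; the function name, the single-resource model, and the example numbers are illustrative assumptions.

```python
def first_fit_decreasing(task_demands, machine_capacity):
    """Place tasks on machines with the first-fit decreasing heuristic.

    Assumption: each task has one scalar resource demand and all machines
    share the same capacity. Returns one list of demands per machine.
    """
    machines = []  # tasks placed on each machine
    loads = []     # current load of each machine
    for demand in sorted(task_demands, reverse=True):
        if demand > machine_capacity:
            raise ValueError("task demand exceeds machine capacity")
        for i, load in enumerate(loads):
            if load + demand <= machine_capacity:
                # First machine with enough spare capacity wins.
                machines[i].append(demand)
                loads[i] += demand
                break
        else:
            # No existing machine fits the task: open a new one.
            machines.append([demand])
            loads.append(demand)
    return machines


if __name__ == "__main__":
    # Hypothetical demands for six mappers/reducers, capacity 10 per machine.
    print(first_fit_decreasing([4, 8, 1, 4, 2, 1], 10))
```

A heuristic like this gives a quick baseline for the number of machines needed; a metaheuristic such as a grouping genetic algorithm searches over placements rather than committing to one greedy pass.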
Abstract:
Previous behavioral studies reported a robust effect of increased naming latencies when objects to be named were blocked within semantic category, compared to items blocked between category. This semantic context effect has been attributed to various mechanisms including inhibition or excitation of lexico-semantic representations and incremental learning of associations between semantic features and names, and is hypothesized to increase demands on verbal self-monitoring during speech production. Objects within categories also share many visual structural features, introducing a potential confound when interpreting the level at which the context effect might occur. Consistent with previous findings, we report a significant increase in response latencies when naming categorically related objects within blocks, an effect associated with increased perfusion fMRI signal bilaterally in the hippocampus and in the left middle to posterior superior temporal cortex. No perfusion changes were observed in the middle section of the left middle temporal cortex, a region associated with retrieval of lexical-semantic information in previous object naming studies. Although a manipulation of visual feature similarity did not influence naming latencies, we observed perfusion increases in the perirhinal cortex for naming objects with similar visual features that interacted with the semantic context in which objects were named. These results provide support for the view that the semantic context effect in object naming occurs due to an incremental learning mechanism, and involves increased demands on verbal self-monitoring.
Abstract:
Online reviews have become increasingly important in the decision-making process. In recent years, the problem of identifying useful reviews for users has attracted significant attention. For instance, in order to select reviews that focus on a particular feature, researchers proposed a method which extracts all associated words of this feature as the relevant information to evaluate and find appropriate reviews. However, the extraction of associated words is not very accurate due to the noise in free review text, and this affects the overall performance negatively. In this paper, we propose a method to select reviews according to a given feature by using a review model generated from a domain ontology called product feature taxonomy. The proposed review model provides relevant information about the hierarchical relationships of the features in the review, which captures the review characteristics accurately. Our experimental results on a real-world review dataset show that our approach is able to improve the review selection performance according to the given criteria effectively.
Abstract:
Sparse optical flow algorithms, such as the Lucas-Kanade approach, provide more robustness to noise than dense optical flow algorithms and are the preferred approach in many scenarios. Sparse optical flow algorithms estimate the displacement for a selected number of pixels in the image. These pixels can be chosen randomly. However, pixels in regions with more variance between the neighbours will produce more reliable displacement estimates. The selected pixel locations should therefore be chosen wisely. In this study, the suitability of Harris corners, Shi-Tomasi's “Good Features to Track”, SIFT and SURF interest point extractors, Canny edges, and random pixel selection for the purpose of frame-by-frame tracking using a pyramidal Lucas-Kanade algorithm is investigated. The evaluation considers the important factors of processing time, feature count, and feature trackability in indoor and outdoor scenarios using ground vehicles and unmanned aerial vehicles, and for the purpose of visual odometry estimation.
Abstract:
Trimeric autotransporter proteins (TAAs) are important virulence factors of many Gram-negative bacterial pathogens. A common feature of most TAAs is the ability to mediate adherence to eukaryotic cells or extracellular matrix (ECM) proteins via a cell surface-exposed passenger domain. Here we describe the characterization of EhaG, a TAA identified from enterohemorrhagic Escherichia coli (EHEC) O157:H7. EhaG is a positional orthologue of the recently characterized UpaG TAA from uropathogenic E. coli (UPEC). Similarly to UpaG, EhaG localized at the bacterial cell surface and promoted cell aggregation, biofilm formation, and adherence to a range of ECM proteins. However, the two orthologues display differential cellular binding: EhaG mediates specific adhesion to colorectal epithelial cells while UpaG promotes specific binding to bladder epithelial cells. The EhaG and UpaG TAAs contain extensive sequence divergence in their respective passenger domains that could account for these differences. Indeed, sequence analyses of UpaG and EhaG homologues from several E. coli genomes revealed grouping of the proteins in clades almost exclusively represented by distinct E. coli pathotypes. The expression of EhaG (in EHEC) and UpaG (in UPEC) was also investigated and shown to be significantly enhanced in an hns isogenic mutant, suggesting that H-NS acts as a negative regulator of both TAAs. Thus, while the EhaG and UpaG TAAs contain some conserved binding and regulatory features, they also possess important differences that correlate with the distinct pathogenic lifestyles of EHEC and UPEC.
Abstract:
Representation of facial expressions using continuous dimensions has been shown to be inherently more expressive and psychologically meaningful than using categorized emotions, and thus has gained increasing attention over recent years. Many sub-problems have arisen in this new field that remain only partially understood. A comparison of the regression performance of different texture and geometric features and investigation of the correlations between continuous dimensional axes and basic categorized emotions are two of these. This paper presents empirical studies addressing these problems, and it reports results from an evaluation of different methods for detecting spontaneous facial expressions within the arousal-valence dimensional space (AV). The evaluation compares the performance of texture features (SIFT, Gabor, LBP) against geometric features (FAP-based distances), and the fusion of the two. It also compares the prediction of arousal and valence, obtained using the best fusion method, to the corresponding ground truths. Spatial distribution, shift, similarity, and correlation are considered for the six basic categorized emotions (i.e. anger, disgust, fear, happiness, sadness, surprise). Using the NVIE database, results show that the fusion of LBP and FAP features performs the best. The results from the NVIE and FEEDTUM databases reveal novel findings about the correlations of arousal and valence dimensions to each of six basic emotion categories.
Abstract:
Double-pulse tests are commonly used as a method for assessing the switching performance of power semiconductor switches in a clamped inductive switching application. Data generated from these tests are typically in the form of sampled waveform data captured using an oscilloscope. In cases where it is of interest to explore a multi-dimensional parameter space and corresponding result space it is necessary to reduce the data into key performance metrics via feature extraction. This paper presents techniques for the extraction of switching performance metrics from sampled double-pulse waveform data. The reported techniques are applied to experimental data from characterisation of a cascode gate drive circuit applied to power MOSFETs.
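As an illustration of reducing sampled waveform data to a single metric, the sketch below computes a 10% to 90% rise time with linear interpolation between samples. The abstract does not list which metrics or algorithms the paper uses, so the choice of rise time, the thresholds, and the use of the waveform minimum and maximum as reference levels are all assumptions; a noisy oscilloscope capture would likely need filtering or averaged reference levels first.

```python
def crossing_time(times, values, level):
    """Return the interpolated time of the first upward crossing of level."""
    for i in range(1, len(values)):
        v0, v1 = values[i - 1], values[i]
        if v0 < level <= v1:
            # Linear interpolation between the two bracketing samples.
            frac = (level - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("waveform never crosses the requested level")


def rise_time(times, values, low=0.1, high=0.9):
    """Time taken to rise from the low to the high fraction of the swing.

    Assumption: the swing is measured between the waveform's minimum and
    maximum sample values.
    """
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        raise ValueError("waveform is constant")
    t_low = crossing_time(times, values, v_min + low * span)
    t_high = crossing_time(times, values, v_min + high * span)
    return t_high - t_low


if __name__ == "__main__":
    # Hypothetical ramp sampled at unit time steps.
    t = [float(i) for i in range(11)]
    v = [float(i) for i in range(11)]
    print(rise_time(t, v))
```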
Abstract:
This paper proposes a highly reliable fault diagnosis approach for low-speed bearings. The proposed approach first extracts wavelet-based fault features that represent diverse symptoms of multiple low-speed bearing defects. The most useful fault features for diagnosis are then selected by utilizing a genetic algorithm (GA)-based kernel discriminative feature analysis cooperating with one-against-all multicategory support vector machines (OAA MCSVMs). Finally, each support vector machine is individually trained with its own feature vector that includes the most discriminative fault features, offering the highest classification performance. In this study, the effectiveness of the proposed GA-based kernel discriminative feature analysis and the classification ability of individually trained OAA MCSVMs are addressed in terms of average classification accuracy. In addition, the proposed GA-based kernel discriminative feature analysis is compared with four other state-of-the-art feature analysis approaches. Experimental results indicate that the proposed approach is superior to other feature analysis methodologies, yielding an average classification accuracy of 98.06% and 94.49% under rotational speeds of 50 revolutions-per-minute (RPM) and 80 RPM, respectively. Furthermore, the individually trained MCSVMs with their own optimal fault features based on the proposed GA-based kernel discriminative feature analysis outperform the standard OAA MCSVMs, showing an average accuracy of 98.66% and 95.01% for bearings under rotational speeds of 50 RPM and 80 RPM, respectively.
Abstract:
We propose expected attainable discrimination (EAD) as a measure to select discrete-valued features for reliable discrimination between two classes of data. EAD is an average of the area under the ROC curves obtained when a simple histogram probability density model is trained and tested on many random partitions of a data set. EAD can be incorporated into various stepwise search methods to determine promising subsets of features, particularly when misclassification costs are difficult or impossible to specify. Experimental application to the problem of risk prediction in pregnancy is described.
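The description of EAD can be sketched for a single discrete feature as follows: repeatedly split the data at random, fit per-class histograms on the training part, score the test part by the likelihood ratio, and average the resulting AUC values. The abstract gives no implementation details, so the add-one smoothing, the 50/50 split, the number of partitions, and all function names here are assumptions rather than the authors' procedure.

```python
import random
from collections import Counter


def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def histogram_scores(train_x, train_y, test_x, values):
    """Score test items by a smoothed class-conditional likelihood ratio."""
    pos = Counter(x for x, y in zip(train_x, train_y) if y == 1)
    neg = Counter(x for x, y in zip(train_x, train_y) if y == 0)
    n_pos, n_neg, k = sum(pos.values()), sum(neg.values()), len(values)
    # Add-one smoothing (an assumption) keeps unseen values finite.
    return [((pos[x] + 1) / (n_pos + k)) / ((neg[x] + 1) / (n_neg + k))
            for x in test_x]


def expected_attainable_discrimination(x, y, n_partitions=200,
                                       test_fraction=0.5, seed=0):
    """Average test AUC of the histogram model over random partitions."""
    rng = random.Random(seed)
    values = sorted(set(x))
    idx = list(range(len(x)))
    n_test = max(1, int(len(x) * test_fraction))
    aucs = []
    for _ in range(n_partitions):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        test_y = [y[i] for i in test]
        if len(set(test_y)) < 2:
            continue  # AUC is undefined without both classes in the test set
        scores = histogram_scores([x[i] for i in train],
                                  [y[i] for i in train],
                                  [x[i] for i in test], values)
        aucs.append(auc([s for s, l in zip(scores, test_y) if l == 1],
                        [s for s, l in zip(scores, test_y) if l == 0]))
    if not aucs:
        raise ValueError("no partition contained both classes")
    return sum(aucs) / len(aucs)
```

Ranking candidate features by this value, or using it as the objective inside a stepwise search, avoids having to fix misclassification costs, since the AUC summarises performance over all decision thresholds.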
Abstract:
It is a big challenge to guarantee the quality of discovered relevance features in text documents for describing user preferences because of the large scale of terms and data patterns. Most existing popular text mining and classification methods have adopted term-based approaches. However, they have all suffered from the problems of polysemy and synonymy. Over the years, the hypothesis has often been held that pattern-based methods should perform better than term-based ones in describing user preferences; yet, how to effectively use large-scale patterns remains a hard problem in text mining. To make a breakthrough in this challenging issue, this paper presents an innovative model for relevance feature discovery. It discovers both positive and negative patterns in text documents as higher-level features and deploys them over low-level features (terms). It also classifies terms into categories and updates term weights based on their specificity and their distributions in patterns. Substantial experiments using this model on RCV1, TREC topics and Reuters-21578 show that the proposed model significantly outperforms both the state-of-the-art term-based methods and the pattern-based methods.
Abstract:
There has been a growing interest in alignment-free methods for phylogenetic analysis using complete genome data. Among them, the CVTree method, the feature frequency profiles method, and the dynamical language approach have been used to investigate the whole-proteome phylogeny of large dsDNA viruses. Using the data set of large dsDNA viruses from Gao and Qi (BMC Evol. Biol. 2007), the phylogenetic results based on the CVTree method and the dynamical language approach were compared in Yu et al. (BMC Evol. Biol. 2010). In this paper, we first apply the dynamical language approach to the data set of large dsDNA viruses from Wu et al. (Proc. Natl. Acad. Sci. USA 2009) and compare our phylogenetic results with those based on the feature frequency profiles method. Then we construct the whole-proteome phylogeny of the larger data set combining the above two data sets. The trees from our analyses are in good agreement with the latest classification of large dsDNA viruses according to the report of the International Committee on the Taxonomy of Viruses (ICTV).
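The general idea behind alignment-free comparison can be sketched with k-mer frequency profiles: each sequence is summarised by the relative frequencies of its length-k substrings, and two profiles are compared with a divergence measure. The sketch below uses the Jensen-Shannon divergence as that measure, which is an assumption made here for illustration; it is not the CVTree method or the dynamical language approach, and the abstract does not give the formulas any of these methods use.

```python
import math
from collections import Counter


def kmer_profile(sequence, k):
    """Relative frequencies of all overlapping k-mers in a sequence."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is shorter than k")
    return {kmer: c / total for kmer, c in counts.items()}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two sparse profiles."""
    div = 0.0
    for kmer in set(p) | set(q):
        pi, qi = p.get(kmer, 0.0), q.get(kmer, 0.0)
        mi = 0.5 * (pi + qi)  # mixture distribution
        if pi > 0:
            div += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            div += 0.5 * qi * math.log2(qi / mi)
    return div


if __name__ == "__main__":
    # Hypothetical toy protein fragments; real input would be whole proteomes.
    a = kmer_profile("MKVLAAGIVGLMKVLA", 3)
    b = kmer_profile("MKVLSAGIVALMKVLS", 3)
    print(jensen_shannon_divergence(a, b))
```

Pairwise divergences computed this way form a distance matrix, which a standard tree-building method such as neighbour joining can turn into a phylogeny.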
Abstract:
Mandatory reporting is a key aspect of Australia’s approach to protecting children and is incorporated into all jurisdictions’ legislation, albeit in a variety of forms. In this article we examine all major newspapers’ coverage of mandatory reporting during an 18-month period in 2008-2009, when high-profile tragedies and inquiries occurred and significant policy and reform agendas were being debated. Mass media utilise a variety of lenses to inform and shape public responses and attitudes to reported events. We use frame analysis to identify the ways in which stories were composed and presented, and how language portrayed this contested area of policy. The results indicate that within an overall portrayal of system failure and the need for reform, the coverage placed major responsibility on child protection agencies for the over-reporting, under-reporting, and overburdened system identified, along with the failure of mandatory reporting to reduce risk. The implications for ongoing reform are explored along with the need for robust research to inform debate about the merits of mandatory reporting.
Abstract:
This chapter takes as its central premise the human capacity to adapt to changing environments. It is an idea that is central to complexity theory but receives only modest attention in relation to learning. To do this we will draw from a range of fields and then consider some recent research in motor control that may extend the discussion in ways not yet considered, but that will build on advances already made within pedagogy and motor control synergies. Recent work in motor control indicates that humans have far greater capacity to adapt to the ‘product space’ than was previously thought, mainly through fast heuristics and on-line corrections. These are changes that can be made in real (movement) time and are facilitated by what are referred to as ‘feed-forward’ mechanisms that take advantage of ultra-fast ways of recognizing the likely outcomes of our movements and using this as a source of feedback. We conclude by discussing some possible ideas for pedagogy within the sport and physical activity domains, the implications of which would require a rethink on how motor skill learning opportunities might best be facilitated.
Abstract:
The mean shift tracker has achieved great success in visual object tracking due to its efficiency and its nonparametric nature. However, it is still difficult for the tracker to handle scale changes of the object. In this paper, we combine a scale-adaptive approach with the mean shift tracker. First, the target in the current frame is located by the mean shift tracker. Then, a feature point matching procedure is employed to obtain matched pairs of feature points between the target regions in the current frame and the previous frame. We employ the FAST-9 corner detector and the HOG descriptor for the feature matching. Finally, with the acquired matched pairs of feature points, the affine transformation between the target regions in the two frames is solved to obtain the current scale of the target. Experimental results show that the proposed tracker gives satisfactory results when the scale of the target changes, while remaining efficient.
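The final step, solving an affine transformation from matched point pairs and reading off a scale, can be sketched with an ordinary least-squares fit. The abstract does not say how the transformation is solved or how the scale is derived from it, so the least-squares formulation and the use of the square root of the determinant of the linear part as the scale factor are assumptions; a practical tracker would also need outlier rejection for bad matches.

```python
import math


def estimate_affine_scale(prev_pts, curr_pts):
    """Fit curr = A * prev + t by least squares; return (A, scale).

    A is returned as (a00, a01, a10, a11). The scale is taken as
    sqrt(|det A|), which is an assumption made for this sketch.
    """
    n = len(prev_pts)
    if n < 3 or n != len(curr_pts):
        raise ValueError("need at least 3 matched point pairs")
    # Centring both point sets removes the translation from the fit.
    mpx = sum(p[0] for p in prev_pts) / n
    mpy = sum(p[1] for p in prev_pts) / n
    mqx = sum(q[0] for q in curr_pts) / n
    mqy = sum(q[1] for q in curr_pts) / n
    sxx = sxy = syy = 0.0          # scatter of previous-frame points
    c00 = c01 = c10 = c11 = 0.0    # cross-scatter current x previous
    for (px, py), (qx, qy) in zip(prev_pts, curr_pts):
        dx, dy = px - mpx, py - mpy
        ex, ey = qx - mqx, qy - mqy
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
        c00 += ex * dx
        c01 += ex * dy
        c10 += ey * dx
        c11 += ey * dy
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("previous-frame points are collinear")
    # A = C * S^{-1}, using the closed-form inverse of the 2x2 scatter matrix.
    a00 = (c00 * syy - c01 * sxy) / det
    a01 = (c01 * sxx - c00 * sxy) / det
    a10 = (c10 * syy - c11 * sxy) / det
    a11 = (c11 * sxx - c10 * sxy) / det
    scale = math.sqrt(abs(a00 * a11 - a01 * a10))
    return (a00, a01, a10, a11), scale


if __name__ == "__main__":
    # Hypothetical matches: the target grew by 1.5x and shifted by (3, -2).
    prev = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    curr = [(1.5 * x + 3.0, 1.5 * y - 2.0) for x, y in prev]
    print(estimate_affine_scale(prev, curr)[1])
```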