Digital Library

927 results for feature inspection method

The heterogeneous cluster ensemble method using hubness for clustering text documents

Relevance:

30.00%

Publisher:

Abstract:

We propose a cluster ensemble method to map the corpus documents into the semantic space embedded in Wikipedia and group them using multiple types of feature space. A heterogeneous cluster ensemble is constructed with multiple types of relations, i.e., document-term, document-concept and document-category. A final clustering solution is obtained by exploiting associations between document pairs and hubness of the documents. Empirical analysis with various real data sets reveals that the proposed method outperforms state-of-the-art text clustering approaches.

Second-order inelastic analysis of composite framed structures based on the refined plastic hinge method

Relevance:

30.00%

Publisher:

Abstract:

Composite steel-concrete structures experience non-linear effects which arise from both instability-related geometric non-linearity and from material non-linearity in all of their component members. Because of this, conventional design procedures cannot capture the true behaviour of a composite frame throughout its full loading range, and so a procedure to account for those non-linearities is much needed. This paper therefore presents a numerical procedure capable of addressing geometric and material non-linearities at the strength limit state based on the refined plastic hinge method. Different material non-linearity for different composite structural components such as T-beams, concrete-filled tubular (CFT) and steel-encased reinforced concrete (SRC) sections can be treated using a routine numerical procedure for their section properties in this plastic hinge approach. Simple and conservative initial and full yield surfaces for general composite sections are proposed in this paper. The refined plastic hinge approach models springs at the ends of the element which are activated when the surface defining the interaction of bending and axial force at first yield is reached; a transition from the first yield interaction surface to the fully plastic interaction surface is postulated based on a proposed refined spring stiffness, which formulates the load-displacement relation for material non-linearity under the interaction of bending and axial actions. This produces a benign method for a beam-column composite element under general loading cases. Another main feature of this paper is that, for members containing a point of contraflexure, its location is determined with a simple application of the method herein and a node is then located at this position to reproduce the real flexural behaviour and associated material non-linearity of the member. 
Recourse is made to an updated Lagrangian formulation to consider geometric non-linear behaviour and to develop a non-linear solution strategy. The formulation with the refined plastic hinge approach is efficacious and robust, and so a full frame analysis incorporating geometric and material non-linearity is tractable. By way of contrast, the plastic zone approach possesses the drawback of strain-based procedures which rely on determining plastic zones within a cross-section and which require lengthwise integration. Following development of the theory, its application is illustrated with a number of varied examples.

Enhanced n-gram extraction using relevance feature discovery

Relevance:

30.00%

Publisher:

Abstract:

Guaranteeing the quality of extracted features that describe relevant knowledge to users or topics is a challenge because of the large number of extracted features. Most popular existing term-based feature selection methods suffer from noisy feature extraction, that is, they extract features that are irrelevant to the user's needs. One popular method is to extract phrases or n-grams to describe the relevant knowledge. However, extracted n-grams and phrases usually contain a lot of noise. This paper proposes a method for reducing the noise in n-grams. The method first extracts more specific features (terms) to remove noisy features. The method then uses an extended random set to accurately weight n-grams based on their distribution in the documents and the distribution of their terms in n-grams. The proposed approach not only reduces the number of extracted n-grams but also improves the performance. The experimental results on the Reuters Corpus Volume 1 (RCV1) data collection and TREC topics show that the proposed method significantly outperforms the state-of-the-art methods underpinned by Okapi BM25, tf*idf and Rocchio.
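For orientation, the kind of n-gram weighting that the abstract uses as a comparison point can be sketched as follows. This is a minimal tf*idf illustration, not the extended random set method proposed in the paper; the function names `ngrams` and `tfidf_ngrams`, the whitespace tokenisation and the unsmoothed idf formula are all assumptions made for the example.

```python
from collections import Counter
from math import log


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_ngrams(docs, n=2):
    """Weight each n-gram in each document by tf * log(N / df).

    docs is a list of strings; tokenisation is a plain lowercase
    whitespace split (an assumption made for this sketch).
    """
    counts_per_doc = [Counter(ngrams(doc.lower().split(), n)) for doc in docs]
    # Document frequency: in how many documents does each n-gram occur?
    doc_freq = Counter(gram for counts in counts_per_doc for gram in counts)
    total = len(docs)
    return [
        {gram: tf * log(total / doc_freq[gram]) for gram, tf in counts.items()}
        for counts in counts_per_doc
    ]
```

An n-gram that occurs in every document receives weight zero under this scheme, which is one simple way in which uninformative n-grams are suppressed.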

Predicting fault-prone software modules with rank sum classification

Relevance:

30.00%

Publisher:

Abstract:

The detection and correction of defects remains among the most time-consuming and expensive aspects of software development. Extensive automated testing and code inspections may mitigate their effect, but some code fragments are necessarily more likely to be faulty than others, and automated identification of fault-prone modules helps to focus testing and inspections, thus limiting wasted effort and potentially improving detection rates. However, software metrics data is often extremely noisy, with enormous imbalances in the size of the positive and negative classes. In this work, we present a new approach to predictive modelling of fault proneness in software modules, introducing a new feature representation to overcome some of these issues. This rank sum representation offers improved or at worst comparable performance to earlier approaches for standard data sets, and readily allows the user to choose an appropriate trade-off between precision and recall to optimise inspection effort to suit different testing environments. The method is evaluated using the NASA Metrics Data Program (MDP) data sets, and performance is compared with existing studies based on the Support Vector Machine (SVM) and Naïve Bayes (NB) classifiers, and with our own comprehensive evaluation of these methods.

Physical activity recognition from accelerometer data using a multi-scale ensemble method

Relevance:

30.00%

Publisher:

Abstract:

Accurate and detailed measurement of an individual's physical activity is a key requirement for helping researchers understand the relationship between physical activity and health. Accelerometers have become the method of choice for measuring physical activity due to their small size, low cost, convenience and their ability to provide objective information about physical activity. However, interpreting accelerometer data once it has been collected can be challenging. In this work, we applied machine learning algorithms to the task of physical activity recognition from triaxial accelerometer data. We employed a simple but effective approach of dividing the accelerometer data into short non-overlapping windows, converting each window into a feature vector, and treating each feature vector as an i.i.d. training instance for a supervised learning algorithm. In addition, we improved on this simple approach with a multi-scale ensemble method that did not need to commit to a single window size and was able to leverage the fact that physical activities produced time series with repetitive patterns and that discriminative features for physical activity occurred at different temporal scales.
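The windowing step described in this abstract can be sketched as below. The choice of features (per-axis mean and population standard deviation) and the names `window_features` and `multi_scale_features` are assumptions made for illustration; the abstract does not state which features or which ensemble combination rule the authors used.

```python
from statistics import mean, pstdev


def window_features(samples, window_size):
    """Split triaxial samples into non-overlapping windows of window_size.

    samples is a list of (x, y, z) tuples. Each full window becomes one
    feature vector [mean_x, std_x, mean_y, std_y, mean_z, std_z];
    a trailing partial window is dropped.
    """
    features = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        window = samples[start:start + window_size]
        vector = []
        for axis in zip(*window):  # iterate over the x, y and z columns
            vector.extend([mean(axis), pstdev(axis)])
        features.append(vector)
    return features


def multi_scale_features(samples, window_sizes):
    """Compute window features at several temporal scales (hypothetical helper)."""
    return {size: window_features(samples, size) for size in window_sizes}
```

Each feature vector would then be passed to a supervised classifier, and in a multi-scale setting one classifier per window size could be trained and their predictions combined.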

Unbalance identification in large steam turbo-generator unit using a model-based method

Relevance:

30.00%

Publisher:

Abstract:

Fault identification in industrial machines is a topic of major importance from an engineering point of view. In fact, the ability to identify not only the type, but also the severity and the position of a fault that has occurred along a shaft-line allows quick maintenance and shortens the downtime. This is especially important in the power generation industry, where the units are often several tens of meters long and where the rotors are enclosed by heavy, pressure-sealed casings. In this paper, an industrial experimental case is presented concerning the identification of the unbalance on a large steam turbine of about 1.3 GW belonging to a nuclear power plant. The case history is analyzed by considering the vibrations measured by the condition monitoring system of the unit. A model-based method in the frequency domain, developed by the authors, is introduced in detail and is then used to identify the position of the fault and its severity along the shaft-line. The complete model of the unit (rotor, modeled by means of finite elements; bearings, modeled by linearized damping and stiffness coefficients; and foundation, modeled by means of pedestals) is analyzed and discussed before being used for the fault identification. The actual fault was assessed by inspection during scheduled maintenance, and excellent correspondence was found with the fault identified by the authors' proposed method. Finally, a complete discussion is presented of the effectiveness of the method, even in the presence of a machine model that is not finely tuned and with only a few measuring planes for the machine vibration.

Representation of facial expression categories in continuous arousal-valence space: Feature and correlation

Relevance:

30.00%

Publisher:

Abstract:

Representation of facial expressions using continuous dimensions has been shown to be inherently more expressive and psychologically meaningful than using categorized emotions, and thus has gained increasing attention over recent years. Many sub-problems have arisen in this new field that remain only partially understood. A comparison of the regression performance of different texture and geometric features and investigation of the correlations between continuous dimensional axes and basic categorized emotions are two of these. This paper presents empirical studies addressing these problems, and it reports results from an evaluation of different methods for detecting spontaneous facial expressions within the arousal-valence dimensional space (AV). The evaluation compares the performance of texture features (SIFT, Gabor, LBP) against geometric features (FAP-based distances), and the fusion of the two. It also compares the prediction of arousal and valence, obtained using the best fusion method, to the corresponding ground truths. Spatial distribution, shift, similarity, and correlation are considered for the six basic categorized emotions (i.e., anger, disgust, fear, happiness, sadness, surprise). Using the NVIE database, results show that the fusion of LBP and FAP features performs the best. The results from the NVIE and FEEDTUM databases reveal novel findings about the correlations of arousal and valence dimensions to each of six basic emotion categories.

Automated power semiconductor switching performance feature extraction from experimental double-pulse waveform data

Relevance:

30.00%

Publisher:

Abstract:

Double-pulse tests are commonly used as a method for assessing the switching performance of power semiconductor switches in a clamped inductive switching application. Data generated from these tests are typically in the form of sampled waveform data captured using an oscilloscope. In cases where it is of interest to explore a multi-dimensional parameter space and corresponding result space it is necessary to reduce the data into key performance metrics via feature extraction. This paper presents techniques for the extraction of switching performance metrics from sampled double-pulse waveform data. The reported techniques are applied to experimental data from characterisation of a cascode gate drive circuit applied to power MOSFETs.

Correlation based method for phase identification in a three phase LV distribution network

Relevance:

30.00%

Publisher:

Abstract:

Low voltage distribution networks feature a high degree of load unbalance, and the addition of rooftop photovoltaics is driving further unbalances in the network. Single phase consumers are distributed across the phases, but even if the consumer distribution was well balanced when the network was constructed, changes will occur over time. Distribution transformer losses are increased by unbalanced loadings. The estimation of transformer losses is a necessary part of the routine upgrading and replacement of transformers, and the identification of the phase connections of households allows a precise estimation of the phase loadings and total transformer loss. This paper presents a new technique and preliminary test results for a method of automatically identifying the phase of each customer by correlating voltage information from the utility's transformer system with voltage information from customer smart meters. The techniques are novel as they are purely based upon a time series of electrical voltage measurements taken at the household and at the distribution transformer. Experimental results using a combination of electrical power and current of the real smart meter datasets demonstrate the performance of our techniques.
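The correlation idea in this abstract can be illustrated with a short sketch: compute the Pearson correlation between a customer's voltage time series and the voltage time series of each transformer phase, then assign the customer to the phase with the highest correlation. The function names `pearson` and `identify_phase`, the phase labels and the use of plain Pearson correlation are assumptions for the example; the abstract does not specify the exact correlation measure or any preprocessing.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def identify_phase(meter_voltage, transformer_voltages):
    """Return the phase label whose voltage series correlates best with the meter.

    transformer_voltages maps a phase label (e.g. "A") to a voltage series
    sampled at the same instants as meter_voltage.
    """
    return max(
        transformer_voltages,
        key=lambda phase: pearson(meter_voltage, transformer_voltages[phase]),
    )
```

In practice the two measurement sources would need to be time-aligned and sampled at the same rate before a comparison like this is meaningful.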

DroplIT, an improved image analysis method for droplet identification in high-throughput crystallization trials

Relevance:

30.00%

Publisher:

Abstract:

The application of robotics to protein crystallization trials has resulted in the production of millions of images. Manual inspection of these images to find crystals and other interesting outcomes is a major rate-limiting step. As a result there has been intense activity in developing automated algorithms to analyse these images. The very first step for most systems that have been described in the literature is to delineate each droplet. Here, a novel approach that reaches over 97% success rate and subsecond processing times is presented. This will form the seed of a new high-throughput system to scrutinize massive crystallization campaigns automatically. © 2010 International Union of Crystallography Printed in Singapore-all rights reserved.

Quality-aware review selection based on product feature taxonomy

Relevance:

30.00%

Publisher:

Abstract:

User-generated information such as online reviews has become increasingly significant for customers in the decision-making process. Meanwhile, as the volume of online reviews proliferates, there is a pressing demand to help users tackle the information overload problem. To extract useful information from an overwhelming number of reviews, considerable work has been proposed, such as review summarization and review selection. In particular, to avoid redundant information, researchers attempt to select a small set of reviews to represent the entire review corpus by preserving its statistical properties (e.g., opinion distribution). However, one significant drawback of the existing works is that they only measure the utility of the extracted reviews as a whole without considering the quality of each individual review. As a result, the set of chosen reviews may include low-quality ones even if its statistical properties are close to those of the original review corpus, which is not preferred by users. In this paper, we propose a review selection method that takes review quality into consideration during the selection process. Specifically, we examine the relationships between product features based upon a domain ontology to capture the review characteristics, based on which we select reviews that are of good quality and also preserve the opinion distribution. Our experimental results based on real-world review datasets demonstrate that our proposed approach is feasible and able to improve the performance of review selection effectively.

A two-stage SVM method to predict membrane protein types by incorporating amino acid classifications and physicochemical properties into a general form of Chou's PseAAC

Relevance:

30.00%

Publisher:

Abstract:

Membrane proteins play important roles in many biochemical processes and are also attractive targets of drug discovery for various diseases. The elucidation of membrane protein types provides clues for understanding the structure and function of proteins. Recently we developed a novel system for predicting protein subnuclear localizations. In this paper, we propose a simplified version of our system for predicting membrane protein types directly from primary protein structures, which incorporates amino acid classifications and physicochemical properties into a general form of pseudo-amino acid composition. In this simplified system, we design a two-stage multi-class support vector machine combined with a two-step optimal feature selection process, which proves very effective in our experiments. The performance of the present method is evaluated on two benchmark datasets consisting of five types of membrane proteins. The overall accuracies of prediction for the five types are 93.25% and 96.61% via the jackknife test and independent dataset test, respectively. These results indicate that our method is effective and valuable for predicting membrane protein types. A web server for the proposed method is available at http://www.juemengt.com/jcc/memty_page.php

Application of image processing techniques for frog call classification

Relevance:

30.00%

Publisher:

Abstract:

Frogs have received increasing attention due to their effectiveness as indicators of environmental change. Therefore, it is important to monitor and assess frogs. With the development of sensor techniques, large volumes of audio data (including frog calls) have been collected and need to be analysed. After the audio data are transformed into a spectrogram representation using the short-time Fourier transform, visual inspection of this representation motivates us to use image processing techniques for analysing audio data. By applying an acoustic event detection (AED) method to spectrograms, acoustic events are first detected, from which ridges are extracted. Three feature sets, Mel-frequency cepstral coefficients (MFCCs), the AED feature set and the ridge feature set, are then used for frog call classification with a support vector machine classifier. Fifteen frog species widespread in Queensland, Australia, are selected to evaluate the proposed method. The experimental results show that the ridge feature set achieves an average classification accuracy of 74.73%, which outperforms the MFCCs (38.99%) and the AED feature set (67.78%).
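The first step mentioned in this abstract, turning audio into a spectrogram with the short-time Fourier transform, can be sketched in plain Python as below. The function name `stft_magnitudes`, the Hann window and the direct (slow) DFT are assumptions chosen to keep the example self-contained; a real pipeline would normally use an FFT library, and the abstract gives no frame size or hop length.

```python
import cmath
import math


def stft_magnitudes(signal, frame_size, hop):
    """Return a magnitude spectrogram as a list of frames.

    Each frame holds the magnitudes of DFT bins 0..frame_size // 2 of a
    Hann-windowed slice of the signal. Frames start every `hop` samples.
    """
    window = [
        0.5 * (1 - math.cos(2 * math.pi * i / (frame_size - 1)))
        for i in range(frame_size)
    ]
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        windowed = [
            x * w for x, w in zip(signal[start:start + frame_size], window)
        ]
        spectrum = []
        for k in range(frame_size // 2 + 1):
            # Direct DFT of bin k; O(frame_size) per bin, fine for a sketch.
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * i / frame_size)
                for i, x in enumerate(windowed)
            )
            spectrum.append(abs(coeff))
        frames.append(spectrum)
    return frames
```

The resulting list of frames can be treated as a two-dimensional image (time by frequency), which is what makes image processing techniques applicable to it.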

SPSA Based Feature Relevance Estimation For Video Retrieval

Relevance:

30.00%

Publisher:

Abstract:

With the availability of a huge amount of video data from various sources, efficient video retrieval tools are increasingly in demand. Because video is multi-modal data, the perceptions of "relevance" between the user-provided query video (in the case of Query-By-Example video search) and the retrieved video clips are subjective in nature. We present an efficient video retrieval method that takes the user's feedback on the relevance of retrieved videos and iteratively reformulates the input query feature vectors (QFV) for improved video retrieval. The QFV reformulation is done by a simple but powerful feature weight optimization method based on the Simultaneous Perturbation Stochastic Approximation (SPSA) technique. A video retrieval system with video indexing, searching and relevance feedback (RF) phases is built to demonstrate the performance of the proposed method. The query and database videos are indexed using conventional video features such as color, texture, etc. However, we use comprehensive and novel methods of feature representation, and a spatio-temporal distance measure, to retrieve the top M videos that are similar to the query. In the feedback phase, the user-activated iterative feedback on the previously retrieved videos is used to automatically reformulate the QFV weights (measure of importance) so that they reflect the user's preference. It is our observation that a few iterations of such feedback are generally sufficient for retrieving the desired video clips. The novel application of SPSA-based RF for user-oriented feature weight optimization distinguishes the proposed method from existing ones. The experimental results show that the proposed RF-based video retrieval exhibits good performance.
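As background for the SPSA technique named in this abstract, the generic SPSA update can be sketched as follows: all weights are perturbed simultaneously by a random vector of +1/-1 entries, the loss is evaluated twice, and the difference gives a gradient estimate for every weight. The function name `spsa_minimize`, the gain constants and the quadratic stand-in loss in the usage note are assumptions; in the paper's setting the loss would be derived from user relevance feedback, which the abstract does not define.

```python
import random


def spsa_minimize(loss, theta, iterations=200, a=0.1, c=0.1, seed=0):
    """Minimise loss(theta) with basic SPSA and return the final weights.

    loss takes a list of floats and returns a float. a and c scale the
    step size and the perturbation size; both decay with the iteration
    count using the commonly quoted exponents 0.602 and 0.101.
    """
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(1, iterations + 1):
        a_k = a / k ** 0.602
        c_k = c / k ** 0.101
        # One random +/-1 perturbation shared by both loss evaluations.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Two loss evaluations yield a gradient estimate for every weight.
        theta = [t - a_k * diff / (2 * c_k * d) for t, d in zip(theta, delta)]
    return theta
```

For example, with a hypothetical loss `lambda w: sum((wi - ti) ** 2 for wi, ti in zip(w, [0.7, 0.3]))`, calling `spsa_minimize(loss, [0.0, 0.0], iterations=300, a=0.2)` moves the weights toward the target values. The appeal of SPSA for feature weighting is that each iteration needs only two loss evaluations regardless of the number of weights.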

Temporal variation in arthropod sampling effectiveness: The case for using the beat sheet method in cotton

Relevance:

30.00%

Publisher:

Abstract:

Predatory insects and spiders are key elements of integrated pest management (IPM) programmes in agricultural crops such as cotton. Management decisions in IPM programmes should be based on a reliable and efficient method for counting both predators and pests. Knowledge of the temporal constraints that influence sampling is required because arthropod abundance estimates are likely to vary over a growing season and within a day. Few studies have adequately quantified this effect using the beat sheet, a potentially important sampling method. We compared the commonly used methods of suction and visual sampling to the beat sheet, with reference to an absolute cage clamp method for determining the abundance of various arthropod taxa over 5 weeks. There were significantly more entomophagous arthropods recorded using the beat sheet and cage clamp methods than by using suction or visual sampling, and these differences were more pronounced as the plants grew. In a second trial, relative estimates of entomophagous and phytophagous arthropod abundance were made using beat sheet samples collected over a day. Beat sheet estimates of the abundance of only eight of the 43 taxa examined were found to vary significantly over a day. Beat sheet sampling is recommended in further studies of arthropod abundance in cotton, but researchers and pest management advisors should bear in mind the time-of-season and time-of-day effects.
