978 results for Rotating electrical machine
Abstract:
Topic modelling, such as Latent Dirichlet Allocation (LDA), was proposed to generate statistical models that represent multiple topics in a collection of documents, and it has been widely used in fields such as machine learning and information retrieval. Its effectiveness in information filtering, however, is rarely known. Patterns are generally thought to be more representative than single terms for representing documents. In this paper, a novel information filtering model, the Pattern-based Topic Model (PBTM), is proposed to represent text documents using not only topic distributions at a general level but also semantic pattern representations at a detailed, specific level, both of which contribute to accurate document representation and document relevance ranking. Extensive experiments are conducted to evaluate the effectiveness of PBTM by using the TREC data collection Reuters Corpus Volume 1. The results show that the proposed model achieves outstanding performance.
Abstract:
Textual document sets have become an important and rapidly growing information source on the web. Text classification is one of the crucial technologies for information organisation and management, and it has attracted wide attention from researchers in different fields. In this paper, feature selection methods, implementation algorithms and applications of text classification are introduced first. However, because the knowledge extracted by current data-mining techniques for text classification contains much noise, considerable uncertainty arises in the classification process from both knowledge extraction and knowledge usage; more innovative techniques and methods are therefore needed to improve the performance of text classification. Further improving knowledge extraction and making effective use of the extracted knowledge is a critical and challenging step. A Rough Set decision-making approach is proposed to classify more precisely the textual documents that are difficult to separate with classic text classification methods. The purpose of this paper is to give an overview of existing text classification technologies; to demonstrate Rough Set concepts and the decision-making approach based on Rough Set theory for building a more reliable and effective text classification framework with higher precision; to set up an innovative evaluation metric named CEI, which is very effective for assessing the performance of similar research; and to propose a promising research direction for addressing the challenging problems in text classification, text mining and other related fields.
Abstract:
The detection and correction of defects remains among the most time consuming and expensive aspects of software development. Extensive automated testing and code inspections may mitigate their effect, but some code fragments are necessarily more likely to be faulty than others, and automated identification of fault prone modules helps to focus testing and inspections, thus limiting wasted effort and potentially improving detection rates. However, software metrics data is often extremely noisy, with enormous imbalances in the size of the positive and negative classes. In this work, we present a new approach to predictive modelling of fault proneness in software modules, introducing a new feature representation to overcome some of these issues. This rank sum representation offers improved or at worst comparable performance to earlier approaches for standard data sets, and readily allows the user to choose an appropriate trade-off between precision and recall to optimise inspection effort to suit different testing environments. The method is evaluated using the NASA Metrics Data Program (MDP) data sets, and performance is compared with existing studies based on the Support Vector Machine (SVM) and Naïve Bayes (NB) Classifiers, and with our own comprehensive evaluation of these methods.
Abstract:
With the ever-increasing penetration level of wind power, the impacts of wind power on the power system are becoming more and more significant. Hence, it is necessary to systematically examine its impacts on small signal stability and transient stability in order to find countermeasures. As such, a comprehensive study is carried out to compare the dynamic performance of a power system with each of three widely used wind power generators. First, the dynamic models are described for the three types of wind power generators, i.e. the squirrel cage induction generator (SCIG), doubly fed induction generator (DFIG) and permanent magnet generator (PMG). Then, the impacts of these wind power generators on small signal stability and transient stability are compared with those of a substituted synchronous generator (SG) in the WSCC three-machine nine-bus system by eigenvalue analysis and dynamic time-domain simulations. Simulation results show that the impacts of different wind power generators are different under small and large disturbances.
Abstract:
Topic modelling has been widely used in the fields of information retrieval, text mining, machine learning, etc. In this paper, we propose a novel model, Pattern Enhanced Topic Model (PETM), which makes improvements to topic modelling by semantically representing topics with discriminative patterns, and also makes innovative contributions to information filtering by utilising the proposed PETM to determine document relevance based on topics distribution and maximum matched patterns proposed in this paper. Extensive experiments are conducted to evaluate the effectiveness of PETM by using the TREC data collection Reuters Corpus Volume 1. The results show that the proposed model significantly outperforms both state-of-the-art term-based models and pattern-based models.
Abstract:
Many mature term-based or pattern-based approaches have been used in the field of information filtering to generate users’ information needs from a collection of documents. A fundamental assumption for these approaches is that the documents in the collection are all about one topic. However, in reality users’ interests can be diverse and the documents in the collection often involve multiple topics. Topic modelling, such as Latent Dirichlet Allocation (LDA), was proposed to generate statistical models to represent multiple topics in a collection of documents, and this has been widely utilized in fields such as machine learning and information retrieval. But its effectiveness in information filtering has not been so well explored. Patterns are generally thought to be more discriminative than single terms for describing documents. However, the enormous number of discovered patterns hinders them from being effectively and efficiently used in real applications; therefore, selection of the most discriminative and representative patterns from the huge number of discovered patterns becomes crucial. To deal with the above-mentioned limitations and problems, in this paper, a novel information filtering model, Maximum matched Pattern-based Topic Model (MPBTM), is proposed. The main distinctive features of the proposed model include: (1) user information needs are generated in terms of multiple topics; (2) each topic is represented by patterns; (3) patterns are generated from topic models and are organized in terms of their statistical and taxonomic features; and (4) the most discriminative and representative patterns, called Maximum Matched Patterns, are proposed to estimate the document relevance to the user’s information needs in order to filter out irrelevant documents. Extensive experiments are conducted to evaluate the effectiveness of the proposed model by using the TREC data collection Reuters Corpus Volume 1. The results show that the proposed model significantly outperforms both state-of-the-art term-based models and pattern-based models.
Abstract:
MapReduce is a computation model for processing large data sets in parallel on large clusters of machines, in a reliable, fault-tolerant manner. A MapReduce computation is broken down into a number of map tasks and reduce tasks, which are performed by so-called mappers and reducers, respectively. The placement of the mappers and reducers on the machines directly affects the performance and cost of the MapReduce computation in cloud computing. From the computational point of view, the mappers/reducers placement problem is a generalization of the classical bin packing problem, which is NP-complete. Thus, in this paper we propose a new heuristic algorithm for the mappers/reducers placement problem in cloud computing and evaluate it by comparing it with several other heuristics on solution quality and computation time, solving a set of test problems with various characteristics. The computational results show that our heuristic algorithm is much more efficient than the other heuristics and that it can obtain a better solution in a reasonable time. Furthermore, we verify the effectiveness of our heuristic algorithm by comparing the mapper/reducer placement it generates for a benchmark problem with a conventional mapper/reducer placement, which puts a fixed number of mappers/reducers on each machine. The comparison results show that the computation using our mapper/reducer placement is much cheaper than the computation using the conventional placement while still satisfying the computation deadline.
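The abstract does not describe the proposed heuristic, so the following is only a sketch of the classical bin packing problem it refers to. It shows first-fit decreasing, a standard textbook heuristic, under simplifying assumptions that are not from the paper: each task has a single scalar resource demand and all machines share one capacity. The function name `first_fit_decreasing` and the sample numbers are hypothetical.

```python
from typing import List


def first_fit_decreasing(demands: List[float], capacity: float) -> List[List[float]]:
    """Greedy bin packing: place each task on the first machine with room.

    This is the textbook first-fit decreasing rule, NOT the heuristic
    proposed in the paper (the abstract does not specify it).
    """
    if any(d > capacity for d in demands):
        raise ValueError("a task demand exceeds the machine capacity")

    machines: List[List[float]] = []  # tasks assigned to each machine
    free: List[float] = []            # remaining capacity of each machine

    # Consider the largest tasks first; they are the hardest to fit.
    for demand in sorted(demands, reverse=True):
        for i, remaining in enumerate(free):
            if demand <= remaining:
                machines[i].append(demand)
                free[i] -= demand
                break
        else:
            # No open machine has room, so start a new one.
            machines.append([demand])
            free.append(capacity - demand)
    return machines


if __name__ == "__main__":
    # Hypothetical task demands and machine capacity.
    placement = first_fit_decreasing([4, 8, 1, 4, 2, 1], capacity=10)
    print(len(placement), placement)
```

A real mapper/reducer placement would also have to account for multiple resource dimensions, cost and deadlines, which this one-dimensional sketch ignores.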
Abstract:
Genomic sequences are fundamentally text documents, admitting various representations according to need and tokenization. Gene expression depends crucially on binding of enzymes to the DNA sequence at small, poorly conserved binding sites, limiting the utility of standard pattern search. However, one may exploit the regular syntactic structure of the enzyme's component proteins and the corresponding binding sites, framing the problem as one of detecting grammatically correct genomic phrases. In this paper we propose new kernels based on weighted tree structures, traversing the paths within them to capture the features which underpin the task. Experimentally, we find that these kernels provide performance comparable with state-of-the-art approaches for this problem, while offering significant computational advantages over earlier methods. The methods proposed may be applied to a broad range of sequence or tree-structured data in molecular biology and other domains.
Abstract:
Due to the health impacts of exposure to air pollutants in urban areas, monitoring and forecasting of air quality parameters have become an important topic in atmospheric and environmental research. Knowledge of the dynamics and complexity of air pollutant behavior has made artificial intelligence models a useful tool for more accurate pollutant concentration prediction. This paper focuses on an innovative method of daily air pollution prediction that combines a Support Vector Machine (SVM) as predictor with Partial Least Square (PLS) as a data selection tool, based on measured values of CO concentrations. The CO concentrations of the Rey monitoring station in the south of Tehran, from Jan. 2007 to Feb. 2011, have been used to test the effectiveness of this method. The hourly CO concentrations have been predicted using the SVM and the hybrid PLS–SVM models. Similarly, daily CO concentrations have been predicted based on the aforementioned four years of measured data. Results demonstrated that both models have good prediction ability; however, the hybrid PLS–SVM has better accuracy. In the analysis presented in this paper, statistical estimators including relative mean errors, root mean squared errors and the mean absolute relative error have been employed to compare the performance of the models. It has been concluded that the errors decrease after size reduction and that coefficients of determination increase from 56–81% for the SVM model to 65–85% for the hybrid PLS–SVM model. Also, it was found that the hybrid PLS–SVM model required less computational time than the SVM model, as expected, hence supporting the more accurate and faster prediction ability of the hybrid PLS–SVM model.
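Two of the error estimators named in this abstract can be sketched as follows. The abstract gives no formulas, so the definitions here are the commonly used ones and are an assumption. The function names and the sample values are hypothetical and are not data from the study.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error: sqrt(mean((predicted - observed)^2))."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mean_absolute_relative_error(observed: Sequence[float],
                                 predicted: Sequence[float]) -> float:
    """Mean of |predicted - observed| / |observed| (assumed definition)."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(o == 0 for o in observed):
        raise ValueError("observed values must be non-zero")
    ratios = [abs(p - o) / abs(o) for o, p in zip(observed, predicted)]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Hypothetical concentrations, for illustration only.
    obs = [2.0, 4.0]
    pred = [1.0, 5.0]
    print(rmse(obs, pred), mean_absolute_relative_error(obs, pred))
```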
Abstract:
Computer vision is increasingly becoming interested in the rapid estimation of object detectors. The canonical strategy of using Hard Negative Mining to train a Support Vector Machine is slow, since the large negative set must be traversed at least once per detector. Recent work has demonstrated that, with an assumption of signal stationarity, Linear Discriminant Analysis is able to learn comparable detectors without ever revisiting the negative set. Even with this insight, the time to learn a detector can still be on the order of minutes. Correlation filters, on the other hand, can produce a detector in under a second. However, this involves the unnatural assumption that the statistics are periodic, and requires the negative set to be re-sampled per detector size. These two methods differ chiefly in the structure which they impose on the covariance matrix of all examples. This paper is a comparative study which develops techniques (i) to assume periodic statistics without needing to revisit the negative set and (ii) to accelerate the estimation of detectors with aperiodic statistics. It is experimentally verified that periodicity is detrimental.
Abstract:
In this paper, an approach is presented for identifying a reduced model of coherent areas in power systems, using phasor measurement units to represent the inter-area oscillations of the system. The generators that are coherent over a wide range of operating conditions form the areas in power systems, and the reduced model is obtained by representing each area by an equivalent machine. The reduced nonlinear model is then identified based on the data obtained from the measurement units. The simulation is performed on three test systems, and the results show the high accuracy of the identification process.
Abstract:
Composites with carbon nanotubes are becoming increasingly used in energy storage and electronic devices, due to incorporated excellent properties from carbon nanotubes and polymers. Although their properties make them more attractive than conventional smart materials, their electrical properties are found to be temperature-dependent which is important to consider for the design of devices. To study the effects of temperature in electrically conductive multi-wall carbon nanotube/epoxy composites, thin films were prepared and the effect of temperature on the resistivity, thermal properties and Raman spectral characteristics of the composite films was evaluated. Resistivity-temperature profiles showed three distinct regions in as-cured samples and only two regions in samples whose thermal histories had been erased. In the vicinity of the glass transition temperature, the as-cured composites exhibited pronounced resistivity and enthalpic relaxation peaks, which both disappeared after erasing the composites’ thermal histories by temperature cycling. Combined DSC, Raman spectroscopy, and resistivity-temperature analyses indicated that this phenomenon can be attributed to the physical aging of the epoxy matrix and that, in the region of the observed thermal history-dependent resistivity peaks, structural rearrangement of the conductive carbon nanotube network occurs through a volume expansion/relaxation process. These results have led to an overall greater understanding of the temperature-dependent behaviour of conductive carbon nanotube/epoxy composites, including the positive temperature coefficient effect.
Abstract:
Objective: To develop and evaluate machine learning techniques that identify limb fractures and other abnormalities (e.g. dislocations) from radiology reports. Materials and Methods: 99 free-text reports of limb radiology examinations were acquired from an Australian public hospital. Two clinicians were employed to identify fractures and abnormalities from the reports; a third senior clinician resolved disagreements. These assessors found that, of the 99 reports, 48 referred to fractures or abnormalities of limb structures. Automated methods were then used to extract features from these reports that could be useful for their automatic classification. The Naive Bayes classification algorithm and two implementations of the support vector machine algorithm were formally evaluated using cross-fold validation over the 99 reports. Results: The Naive Bayes classifier accurately identifies fractures and other abnormalities from the radiology reports. These results were achieved when extracting stemmed token bigram and negation features, as well as using these features in combination with SNOMED CT concepts related to abnormalities and disorders. The latter feature has not been used in previous works that attempted to classify free-text radiology reports. Discussion: Automated classification methods have proven effective at identifying fractures and other abnormalities from radiology reports (F-Measure up to 92.31%). Key to the success of these techniques are features such as stemmed token bigrams, negations, and SNOMED CT concepts associated with morphologic abnormalities and disorders. Conclusion: This investigation shows early promising results, and future work will further validate and strengthen the proposed approaches.
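The F-Measure quoted in this abstract is conventionally the harmonic mean of precision and recall. A minimal sketch of that calculation from classification counts is given below, assuming the balanced (F1) form, which the abstract does not state explicitly. The function name and the counts in the usage example are hypothetical and are not the study's results.

```python
def f_measure(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if true_pos == 0:
        # No correct positive predictions: precision and recall are both 0.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(f_measure(true_pos=8, false_pos=2, false_neg=2))
```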
Abstract:
Bundle adjustment is one of the essential components of the computer vision toolbox. This paper revisits the resection-intersection approach, which has previously been shown to have inferior convergence properties. Modifications are proposed that greatly improve the performance of this method, resulting in a fast and accurate approach. Firstly, a linear triangulation step is added to the intersection stage, yielding higher accuracy and improved convergence rate. Secondly, the effect of parameter updates is tracked in order to reduce wasteful computation; only variables coupled to significantly changing variables are updated. This leads to significant improvements in computation time, at the cost of a small, controllable increase in error. Loop closures are handled effectively without the need for additional network modelling. The proposed approach is shown experimentally to yield comparable accuracy to a full sparse bundle adjustment (20% error increase) while computation time scales much better with the number of variables. Experiments on a progressive reconstruction system show the proposed method to be more efficient by a factor of 65 to 177, and 4.5 times more accurate (increasing over time) than a localised sparse bundle adjustment approach.
Abstract:
CSCW researchers have increasingly come to realize that the material work setting and its population of artefacts play a crucial part in the coordination of distributed or co-located work. This paper uses the notion of physicality as a basis to understand cooperative work. Using examples from ongoing fieldwork on cooperative design practices, it provides a conceptual understanding of physicality and shows that material settings and co-workers’ working practices play an important role in understanding the physicality of cooperative design.