979 results for Loss labeling (classification)


Relevance:

20.00%

Abstract:

Textual document sets have become an important and rapidly growing information source on the web, and text classification is one of the crucial technologies for information organisation and management. It has attracted wide attention from researchers in different fields. This paper first introduces feature selection methods, implementation algorithms and applications of text classification. However, the knowledge extracted by current data-mining techniques for text classification contains much noise, which introduces considerable uncertainty into both knowledge extraction and knowledge usage; more innovative techniques and methods are therefore needed to improve the performance of text classification. Further improving knowledge extraction and making effective use of the extracted knowledge is a critical and challenging step. A Rough Set decision-making approach is proposed to classify more precisely the textual documents that are difficult to separate with classic text classification methods. The purpose of this paper is to give an overview of existing text classification technologies; to demonstrate Rough Set concepts and the decision-making approach based on Rough Set theory for building a more reliable and effective text classification framework with higher precision; to set up an innovative evaluation metric named CEI, which is very effective for assessing the performance of similar research; and to propose a promising research direction for addressing the challenging problems in text classification, text mining and other related fields.
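As an illustration of the Rough Set concepts this abstract refers to, the sketch below computes the standard lower and upper approximations of a document class. It is not the paper's decision procedure: the function name `rough_approximations`, the toy documents and their two attributes are all hypothetical, chosen only to show how a boundary region of hard-to-separate documents arises.

```python
from collections import defaultdict


def rough_approximations(objects, target):
    """Return (lower, upper) approximations of `target`.

    objects: dict mapping object id -> tuple of attribute values.
    target:  set of object ids that belong to the concept (e.g. one class).
    """
    # Objects with identical attribute values are indiscernible and
    # fall into the same equivalence class.
    classes = defaultdict(set)
    for obj_id, attrs in objects.items():
        classes[attrs].add(obj_id)

    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:   # class lies entirely inside the concept
            lower |= members
        if members & target:    # class overlaps the concept
            upper |= members
    return lower, upper


# Hypothetical toy data: (topic keyword, length bucket) per document.
docs = {
    "d1": ("sport", "short"),
    "d2": ("sport", "short"),
    "d3": ("sport", "long"),
    "d4": ("finance", "long"),
}
positive = {"d1", "d3"}  # documents labelled as belonging to the class

lower, upper = rough_approximations(docs, positive)
boundary = upper - lower  # documents the attributes cannot separate
```

Here `d1` and `d2` share the same attribute values but carry different labels, so both land in the boundary region; only `d3` is certainly in the class.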

Relevance:

20.00%

Abstract:

Case note: Apache Energy Ltd v Alcoa of Australia Ltd (No 2) [2013]. In 2011, headlines were made when Alcoa sued Apache Energy and its partners for $158 million, a loss it claimed was a consequence of Apache Energy failing to adequately inspect and maintain the gas pipelines that supplied the gas used by Alcoa in its business. As the loss was not a consequence of any property damage or injury to Alcoa, the loss is characterised as pure economic loss...

Relevance:

20.00%

Abstract:

The detection and correction of defects remains among the most time consuming and expensive aspects of software development. Extensive automated testing and code inspections may mitigate their effect, but some code fragments are necessarily more likely to be faulty than others, and automated identification of fault prone modules helps to focus testing and inspections, thus limiting wasted effort and potentially improving detection rates. However, software metrics data is often extremely noisy, with enormous imbalances in the size of the positive and negative classes. In this work, we present a new approach to predictive modelling of fault proneness in software modules, introducing a new feature representation to overcome some of these issues. This rank sum representation offers improved or at worst comparable performance to earlier approaches for standard data sets, and readily allows the user to choose an appropriate trade-off between precision and recall to optimise inspection effort to suit different testing environments. The method is evaluated using the NASA Metrics Data Program (MDP) data sets, and performance is compared with existing studies based on the Support Vector Machine (SVM) and Naïve Bayes (NB) Classifiers, and with our own comprehensive evaluation of these methods.

Relevance:

20.00%

Abstract:

Object classification is plagued by the issue of session variation. Session variation describes any variation that makes one instance of an object look different to another, for instance due to pose or illumination variation. Recent work in the challenging task of face verification has shown that session variability modelling provides a mechanism to overcome some of these limitations. However, for computer vision purposes, it has only been applied in the limited setting of face verification. In this paper we propose a local region-based inter-session variability (ISV) modelling approach, termed Local ISV, so that local session variations can be modelled, and apply it to challenging real-world data. We then demonstrate the efficacy of this technique on a challenging real-world fish image database which includes images taken underwater, providing significant real-world session variations. This Local ISV approach provides a relative performance improvement of, on average, 23% on the challenging MOBIO, Multi-PIE and SCface face databases. It also provides a relative performance improvement of 35% on our challenging fish image dataset.

Relevance:

20.00%

Abstract:

A cell classification algorithm that uses first, second and third order statistics of pixel intensity distributions over pre-defined regions is implemented and evaluated. A cell image is segmented into 6 regions extending from a boundary layer to an inner circle. First, second and third order statistical features are extracted from histograms of pixel intensities in these regions. Third order statistical features used are one-dimensional bispectral invariants. 108 features were considered as candidates for Adaboost based fusion. The best 10 stage fused classifier was selected for each class and a decision tree constructed for the 6-class problem. The classifier is robust, accurate and fast by design.

Relevance:

20.00%

Abstract:

The experiences of the loss reduction projects in electric power distribution companies (EPDCs) of Iran are presented. The loss reduction methods proposed individually by 14 EPDCs are provided, together with the corresponding energy saving (ES), investment costs (IC), and loss rate reductions. To illustrate the effectiveness and performance of the loss reduction methods, three parameters are proposed: energy saving per investment costs (ESIC), energy saving per quantity (ESPQ), and investment costs per quantity (ICPQ). The overall ESIC of the 14 EPDCs, as well as the average and standard deviation of the ESIC for each method, are presented and compared. In addition, the average and standard deviation of the ESPQs and ICPQs are provided and investigated for each loss reduction method. These parameters are useful as a benchmark and as background for planning purposes for EPDCs that intend to reduce the electric losses in distribution networks.
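The three indicators named in this abstract are simple ratios, and a minimal sketch may make them concrete. Everything below is illustrative: the method names, the figures and the units are hypothetical, and computing the overall ESIC as total energy saving divided by total investment is an assumption, since the abstract does not state how the overall value is aggregated.

```python
from statistics import mean, stdev

# Hypothetical project records:
# (method, energy saving, investment cost, quantity installed)
projects = [
    ("reconductoring", 1200.0, 400.0, 10.0),
    ("reconductoring", 900.0, 450.0, 9.0),
    ("capacitor placement", 500.0, 100.0, 20.0),
    ("capacitor placement", 700.0, 175.0, 35.0),
]


def indicators(es, ic, qty):
    """Per-project indicators as plain ratios of the reported quantities."""
    return {
        "ESIC": es / ic,   # energy saving per investment cost
        "ESPQ": es / qty,  # energy saving per quantity
        "ICPQ": ic / qty,  # investment cost per quantity
    }


# Group the per-project indicators by loss reduction method.
by_method = {}
for method, es, ic, qty in projects:
    by_method.setdefault(method, []).append(indicators(es, ic, qty))

# Average and sample standard deviation of each indicator per method.
summary = {
    method: {
        name: (mean([r[name] for r in rows]), stdev([r[name] for r in rows]))
        for name in ("ESIC", "ESPQ", "ICPQ")
    }
    for method, rows in by_method.items()
}

# Assumed aggregation: total saving over total investment.
overall_esic = sum(p[1] for p in projects) / sum(p[2] for p in projects)
```

Comparing `summary` across methods gives the kind of per-method benchmark the abstract describes, while `overall_esic` is a single figure for the whole portfolio.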

Relevance:

20.00%

Abstract:

In this paper, a loss reduction plan for electric distribution networks is presented, based on the successful experiences of distribution utilities in Iran and some developed countries. The necessary technical and economic planning parameters are calculated from related projects in Iran. The cost, time, and benefits of every sub-program, covering seven loss reduction approaches, are determined. Finally, the loss reduction program, the benefit per cost, and the return on investment under optimistic and pessimistic conditions are introduced.

Relevance:

20.00%

Abstract:

This paper presents an efficient algorithm for multi-objective distribution feeder reconfiguration based on a Modified Honey Bee Mating Optimization (MHBMO) approach. The main objective of distribution feeder reconfiguration (DFR) is to minimize the real power loss and the deviation of the nodes’ voltage. Because the objectives are different and not commensurable, it is difficult to solve the problem with conventional approaches, which optimize a single objective. A metaheuristic algorithm has therefore been applied to this problem. This paper describes the full algorithm and the objective functions considered. Simulation results on a 32-bus distribution system are given and show the high accuracy of the proposed algorithm in power loss minimization.

Relevance:

20.00%

Abstract:

Real-time image analysis and classification onboard robotic marine vehicles, such as AUVs, is a key step in the realisation of adaptive mission planning for large-scale habitat mapping in previously unexplored environments. This paper describes a novel technique to train, process, and classify images collected onboard an AUV used in relatively shallow waters with poor visibility and non-uniform lighting. The approach utilises Förstner feature detectors and Laws texture energy masks for image characterisation, and a bag of words approach for feature recognition. To improve classification performance we propose a usefulness gain to learn the importance of each histogram component for each class. Experimental results illustrate the performance of the system in characterisation of a variety of marine habitats and its ability to operate onboard an AUV's main processor suitable for real-time mission planning.

Relevance:

20.00%

Abstract:

We introduce Kamouflage: a new architecture for building theft-resistant password managers. An attacker who steals a laptop or cell phone with a Kamouflage-based password manager is forced to carry out a considerable amount of online work before obtaining any user credentials. We implemented our proposal as a replacement for the built-in Firefox password manager, and provide performance measurements and the results from experiments with large real-world password sets to evaluate the feasibility and effectiveness of our approach. Kamouflage is well suited to become a standard architecture for password managers on mobile devices.

Relevance:

20.00%

Abstract:

Objective: To evaluate the effects of Optical Character Recognition (OCR) on the automatic cancer classification of pathology reports. Method: Scanned images of pathology reports were converted to electronic free-text using a commercial OCR system. A state-of-the-art cancer classification system, the Medical Text Extraction (MEDTEX) system, was used to automatically classify the OCR reports. Classifications produced by MEDTEX on the OCR versions of the reports were compared with the classifications from a human-amended version of the OCR reports. Results: The employed OCR system was found to recognise scanned pathology reports with up to 99.12% character accuracy and up to 98.95% word accuracy. Errors in the OCR processing were found to have minimal impact on the automatic classification of scanned pathology reports into notifiable groups. However, the impact of OCR errors is not negligible when considering the extraction of cancer notification items, such as primary site, histological type, etc. Conclusions: The automatic cancer classification system used in this work, MEDTEX, has proven to be robust to errors produced by the acquisition of free-text pathology reports from scanned images through OCR software. However, issues emerge when considering the extraction of cancer notification items.
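The abstract reports character and word accuracy without saying how they were computed. One common convention, assumed here, is one minus the Levenshtein edit distance divided by the length of the reference. The sketch below applies that convention at both character and word level; the function names and the sample strings are hypothetical and not taken from the study.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def accuracy(ref, hyp):
    """Assumed definition: 1 - edit_distance / len(reference)."""
    if not ref:
        raise ValueError("reference must not be empty")
    return 1.0 - edit_distance(ref, hyp) / len(ref)


# Hypothetical example: the OCR output confuses the letter 'o' with a zero.
reference = "invasive ductal carcinoma"
ocr_output = "invasive ductal carcin0ma"

char_acc = accuracy(reference, ocr_output)                  # character level
word_acc = accuracy(reference.split(), ocr_output.split())  # word level
```

A single wrong character costs one of 25 characters but one of three words, which is why word accuracy is typically the lower of the two figures, as in the reported results.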

Relevance:

20.00%

Abstract:

Objective: To develop a system for the automatic classification of pathology reports for Cancer Registry notifications. Method: A two-pass approach is proposed to classify whether pathology reports are cancer notifiable or not. The first pass queries pathology HL7 messages for known report types that are received by the Queensland Cancer Registry (QCR), while the second pass analyses the free-text reports and identifies those that are cancer notifiable. Cancer Registry business rules, natural language processing and symbolic reasoning using the SNOMED CT ontology were adopted in the system. Results: The system was developed on a corpus of 500 histology and cytology reports (with 47% notifiable reports) and evaluated on an independent set of 479 reports (with 52% notifiable reports). Results show that the system can reliably classify cancer notifiable reports with a sensitivity, specificity, and positive predictive value (PPV) of 0.99, 0.95, and 0.95, respectively, for the development set, and 0.98, 0.96, and 0.96 for the evaluation set. High sensitivity can be achieved at a slight expense in specificity and PPV. Conclusion: The system demonstrates how medical free-text processing enables the classification of cancer notifiable pathology reports with high reliability for potential use by Cancer Registries and pathology laboratories.
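For readers less familiar with the three figures reported in this abstract, the sketch below computes sensitivity, specificity and PPV from paired true and predicted labels using their standard definitions. The function name `notification_metrics` and the ten toy labels are hypothetical; they are not data from the study.

```python
def notification_metrics(y_true, y_pred):
    """Sensitivity, specificity and PPV for boolean labels (True = notifiable)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # notifiable, flagged
    fn = sum(1 for t, p in pairs if t and not p)      # notifiable, missed
    fp = sum(1 for t, p in pairs if not t and p)      # not notifiable, flagged
    tn = sum(1 for t, p in pairs if not t and not p)  # not notifiable, passed
    return {
        "sensitivity": tp / (tp + fn),  # share of notifiable reports found
        "specificity": tn / (tn + fp),  # share of other reports rejected
        "ppv": tp / (tp + fp),          # share of flagged reports that are right
    }


# Hypothetical toy labels: 4 notifiable reports, 6 non-notifiable ones.
y_true = [True] * 4 + [False] * 6
y_pred = [True, True, True, False, True, False, False, False, False, False]

metrics = notification_metrics(y_true, y_pred)
```

Lowering the decision threshold of a classifier typically turns missed cases into detections but also adds false alarms, which is the trade-off between sensitivity on one side and specificity and PPV on the other that the abstract mentions.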