5 results for Classification error rate

at Universidade Federal do Rio Grande do Norte (UFRN)


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Classifying proteins into structural classes, which involves inferring patterns in their 3D conformation, is one of the most important open problems in Molecular Biology. The main reason is that the function of a protein is intrinsically related to its spatial conformation. However, such conformations are very difficult to obtain experimentally in the laboratory, so the problem has drawn the attention of many researchers in Bioinformatics. Given the large gap between the number of known protein sequences and the number of experimentally determined three-dimensional structures, the demand for automated techniques for the structural classification of proteins is very high. In this context, computational tools, especially Machine Learning (ML) techniques, have become essential. In this work, the following ML techniques are used to recognize protein structural classes: Decision Trees, k-Nearest Neighbor, Naive Bayes, Support Vector Machine and Neural Networks. These methods were chosen because they represent different learning paradigms and have been widely used in the Bioinformatics literature. To improve the performance of these techniques (individual classifiers), homogeneous (Bagging and Boosting) and heterogeneous (Voting, Stacking and StackingC) multi-classification systems are used. Moreover, since the protein database used in this work has imbalanced classes, artificial class-balancing techniques (Random Undersampling, Tomek Links, CNN, NCL and OSS) are used to mitigate the problem. To evaluate the ML methods, a cross-validation procedure is applied, in which the accuracy of each classifier is measured by its mean classification error rate on independent test sets. These means are compared pairwise with a hypothesis test to determine whether the difference between them is statistically significant.
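The evaluation procedure described above can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not name the hypothesis test, so a paired t statistic over per-fold error rates is assumed here, and the toy data, the two stand-in classifiers and all function names are hypothetical.

```python
import math
import random
from statistics import mean, stdev

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def error_rates(classify, X, y, folds):
    """Classification error rate of `classify` on each held-out fold."""
    rates = []
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(X)) if i not in held_out]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        wrong = sum(classify(X_train, y_train, X[i]) != y[i] for i in test)
        rates.append(wrong / len(test))
    return rates

def nearest_neighbor(X_train, y_train, x):
    """1-NN stand-in classifier: label of the closest training point."""
    def dist(j):
        return sum((a - b) ** 2 for a, b in zip(X_train[j], x))
    return y_train[min(range(len(X_train)), key=dist)]

def majority_class(X_train, y_train, x):
    """Baseline stand-in: always predict the most frequent training label."""
    return max(set(y_train), key=y_train.count)

def paired_t_statistic(a, b):
    """Paired t statistic for two lists of per-fold error rates (assumed test)."""
    d = [p - q for p, q in zip(a, b)]
    sd = stdev(d)
    if sd == 0:
        return 0.0 if mean(d) == 0 else math.copysign(math.inf, mean(d))
    return mean(d) / (sd / math.sqrt(len(d)))

# Hypothetical toy data: two well-separated 2-D clusters, 20 points each.
rng = random.Random(1)
X = ([(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(20)]
     + [(rng.uniform(4, 6), rng.uniform(4, 6)) for _ in range(20)])
y = [0] * 20 + [1] * 20

folds = k_fold_indices(len(X), k=5)
nn_rates = error_rates(nearest_neighbor, X, y, folds)
baseline_rates = error_rates(majority_class, X, y, folds)
t_stat = paired_t_statistic(nn_rates, baseline_rates)
print(mean(nn_rates), mean(baseline_rates), t_stat)
```

Both classifiers are scored on the same folds, which is what makes a paired comparison of the two mean error rates meaningful.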
With respect to the individual classifiers, Support Vector Machine achieved the best accuracy. The multi-classification systems (homogeneous and heterogeneous) generally performed as well as or better than the individual classifiers, especially Boosting with Decision Tree and StackingC with Linear Regression as the meta-classifier. The Voting method, despite its simplicity, proved adequate for the problem addressed in this work. The class-balancing techniques, on the other hand, did not produce a significant improvement in the global classification error. Nevertheless, they did improve the classification error for the minority class. In this context, the NCL technique proved to be more appropriate.
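As an illustration of one of the class-balancing techniques named above, the following is a minimal sketch of Tomek-link undersampling. A Tomek link is a pair of points from different classes that are each other's nearest neighbor; the majority-class member of each link is removed. The data, labels and function names are hypothetical, and the abstract does not describe how the authors implemented the technique.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest_index(X, i):
    """Index of the point closest to X[i], excluding i itself."""
    others = (j for j in range(len(X)) if j != i)
    return min(others, key=lambda j: squared_distance(X[i], X[j]))

def tomek_links(X, y):
    """Pairs (i, j), i < j, of opposite-class mutual nearest neighbors."""
    links = []
    for i in range(len(X)):
        j = nearest_index(X, i)
        if i < j and y[i] != y[j] and nearest_index(X, j) == i:
            links.append((i, j))
    return links

def remove_majority_in_links(X, y, majority_label):
    """Drop the majority-class point of every Tomek link."""
    drop = {k for pair in tomek_links(X, y) for k in pair
            if y[k] == majority_label}
    keep = [k for k in range(len(X)) if k not in drop]
    return [X[k] for k in keep], [y[k] for k in keep]

# Hypothetical 1-D data: five majority points, two minority points.
X = [(0.0,), (0.1,), (0.3,), (0.6,), (1.0,), (1.05,), (2.0,)]
y = ["maj", "maj", "maj", "maj", "maj", "min", "min"]

links = tomek_links(X, y)
X_clean, y_clean = remove_majority_in_links(X, y, majority_label="maj")
print(links, y_clean)
```

Only the borderline majority point at 1.0 is removed, which is consistent with the observation that such techniques mainly affect the error on the minority class rather than the global error.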

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This dissertation describes the implementation of a WirelessHART network simulation module for Network Simulator 3, aiming at the acceptance of both in the current context of network research and industry. To validate the module, tests were implemented for attenuation, packet error rate, information transfer success rate and battery duration per station.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Context-aware applications are typically dynamic and use services provided by several sources, with different quality levels. The quality of context information is expressed in terms of Quality of Context (QoC) metadata, such as precision, correctness, refreshment, and resolution. Service quality, in turn, is expressed via Quality of Service (QoS) metadata such as response time, availability and error rate. To ensure that an application is using services and context information that meet its requirements, it is essential to monitor this metadata continuously. For this purpose, a QoS and QoC monitoring mechanism is needed that meets the following requirements: (i) to support measurement and monitoring of QoS and QoC metadata; (ii) to support synchronous and asynchronous operation, enabling the application both to gather the monitored metadata periodically and to be notified asynchronously whenever a given metadata item becomes available; (iii) to use ontologies to represent information in order to avoid ambiguous interpretation. This work presents QoMonitor, a module for QoS and QoC metadata monitoring that meets the above requirements. The architecture and implementation of QoMonitor are discussed. To support asynchronous communication, QoMonitor uses two protocols: JMS and Light-PubSubHubbub. To illustrate the use of QoMonitor in the development of ubiquitous applications, it was integrated into OpenCOPI (Open COntext Platform Integration), a middleware platform that integrates several context provision middleware. To validate QoMonitor, two applications were used as proofs of concept: an oil and gas monitoring application and a healthcare application. This work also presents a performance validation of QoMonitor for both synchronous and asynchronous requests.
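Requirement (ii) above, synchronous polling combined with asynchronous notification, can be illustrated with a minimal in-process sketch. This is not QoMonitor's API: the class and method names are hypothetical, and callbacks here run directly in the publishing call, whereas the abstract states that QoMonitor delivers notifications over JMS and Light-PubSubHubbub.

```python
from collections import defaultdict

class MetadataMonitor:
    """Hypothetical monitor holding the latest value of each metadata item."""

    def __init__(self):
        self._latest = {}
        self._subscribers = defaultdict(list)

    def subscribe(self, name, callback):
        """Push style: register a callback to be called on every new value."""
        self._subscribers[name].append(callback)

    def publish(self, name, value):
        """Record a new measurement and notify the subscribers of that item."""
        self._latest[name] = value
        for callback in self._subscribers[name]:
            callback(name, value)

    def get(self, name, default=None):
        """Pull style: return the most recent value, for periodic polling."""
        return self._latest.get(name, default)

monitor = MetadataMonitor()
received = []
monitor.subscribe("response_time_ms", lambda n, v: received.append((n, v)))

monitor.publish("response_time_ms", 120)  # triggers the callback
monitor.publish("error_rate", 0.02)       # no subscriber; only stored

print(received, monitor.get("error_rate"))
```

Keeping the latest value per metadata name lets the same component serve both access styles without the application needing to know which one produced the measurement.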

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Among the wastes generated during oil exploration and production, water stands out due to several factors, including the volume generated, the salt content, and the presence of oil and chemicals; the water associated with oil is called produced water. The chemical composition of this water is complex and depends strongly on the producing field, because the water has been in contact with the geological formation for thousands of years. This work aims at the hydrochemical characterization of the water produced in different areas of a field located in the Potiguar Basin. We collected 27 samples from 6 zones (400, 600, 400/600, 400/450/500, 350/400, A) of the producing field, called S, and measured 50 required parameters, divided among physical and chemical parameters, cations and anions. The tools used for the hydrochemical characterization were ionic ratio calculations, hydrochemical classification diagrams (the Piper diagram and the Stiff diagram) and statistics, which helped identify signature patterns for each production area, including the area that supplies the water injected into this field for secondary oil recovery. The ionic balance error was calculated to assess the quality of the analytical results, which was considered good because 89% of the samples had an error below 5%. The hydrochemical diagrams classified the waters as sodium chloride, with the exception of the samples from Area A, from the injection well, which were classified as sodium bicarbonate. Descriptive analysis and discriminant analysis yielded a function that chemically differentiates the production areas, with a good classification hit rate of 85%.
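The ionic balance check mentioned above can be sketched as follows. The abstract does not give the formula used, so the common charge-balance definition is assumed here: 100 × (Σcations − Σanions) / (Σcations + Σanions), with all concentrations in meq/L. The sample values are hypothetical and are not data from the study.

```python
def ionic_balance_error(cations_meq, anions_meq):
    """Percent charge-balance error; inputs map ion names to meq/L."""
    total_cations = sum(cations_meq.values())
    total_anions = sum(anions_meq.values())
    return 100.0 * (total_cations - total_anions) / (total_cations + total_anions)

# Hypothetical sample, concentrations already converted to meq/L.
cations = {"Na": 40.0, "Ca": 5.0, "Mg": 3.0, "K": 1.0}
anions = {"Cl": 42.0, "HCO3": 4.0, "SO4": 1.0}

error = ionic_balance_error(cations, anions)
acceptable = abs(error) < 5.0  # the 5% threshold quoted in the abstract
print(round(error, 2), acceptable)
```

A sample passes when the absolute error is below the threshold, since an excess of either cations or anions indicates an analytical imbalance.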
