2 results for REGRESSION TREES
at Universidade Federal do Rio Grande do Norte (UFRN)
Abstract:
Wireless telecommunications have developed considerably in recent years, leading researchers to conceive new ideas and techniques aimed at increasing the capacity and quality of the system's services. Ever smaller cells, ever higher frequencies, and increasingly complex environments all call for more accurate models. Propagation prediction techniques fit into this context and must deliver results with a margin of error compatible with the next generations of communication systems. The objective of this work is to present the results of a propagation measurement campaign aimed at characterizing mobile system coverage in the city of Natal (state of Rio Grande do Norte, Brazil). A mobile laboratory was set up using the infrastructure available and frequently used by ANATEL. Measurements were taken in three different areas, one of them characterized by tall buildings, high relief, the presence of trees, and towers of different heights. The areas covered the city's central zone, a suburban/rural zone, and a stretch of coast surrounded by sand dunes. It is important to highlight that the analysis took into account the current reality of cellular systems, in which coverage ranges are limited by reduced cell sizes in order to increase frequency reuse and telephone traffic capacity. Most of the telephone traffic per cell in the city of Natal occurs within 3 km of the radio base station. The frequency band used was 800 MHz, corresponding to the control channels of the respective sites, which adopt FSK modulation. This dissertation begins with an overview of the models used for propagation prediction. It then describes the measurement methodology; the measurements were made on the cellular system's own control channels.
The results obtained were compared with several existing prediction models, and some adaptations were developed using regression techniques in search of optimized solutions. Furthermore, according to regulations of the former Brazilian holding company Telebrás, a cellular system deployment must cover at least 90% of a previously determined area for 90% of the time. Reaching this value requires considerations and studies of the specific environment being covered. This work aims to contribute to that effort.
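As an illustration of how a prediction model might be adapted to measurements by regression, here is a minimal sketch. It assumes a log-distance path-loss model, PL(d) = PL0 + 10 n log10(d / d0), fitted by ordinary least squares. The abstract does not say which models or regression techniques the dissertation used, so the model, the function name `fit_log_distance`, and the synthetic data below are all assumptions made for illustration.

```python
import math


def fit_log_distance(distances_km, path_loss_db, d0_km=1.0):
    """Least-squares fit of PL(d) = pl0 + 10 * n * log10(d / d0).

    Returns (pl0, n): the loss in dB at the reference distance d0 and the
    path-loss exponent. Assumed model, not the one from the dissertation.
    """
    # The model is linear in x = 10 * log10(d / d0).
    xs = [10.0 * math.log10(d / d0_km) for d in distances_km]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(path_loss_db) / len(path_loss_db)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, path_loss_db))
    n = sxy / sxx               # slope: path-loss exponent
    pl0 = mean_y - n * mean_x   # intercept: loss at d0
    return pl0, n


# Synthetic, noise-free samples (NOT measured data), all within 3 km.
true_pl0, true_n = 120.0, 3.5
distances = [0.2, 0.5, 1.0, 1.5, 2.0, 3.0]
losses = [true_pl0 + 10.0 * true_n * math.log10(d) for d in distances]

pl0, n = fit_log_distance(distances, losses)
print(f"PL0 = {pl0:.2f} dB, n = {n:.2f}")
```

With real drive-test data, the fitted exponent would differ from area to area (urban centre, suburban/rural zone, dunes), which is the kind of environment-specific adaptation the abstract refers to.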
Abstract:
Classifying proteins into structural classes, which involves inferring patterns in their 3D conformation, is currently one of the most important open problems in Molecular Biology. The main reason is that the function of a protein is intrinsically related to its spatial conformation. However, such conformations are very difficult to obtain experimentally in the laboratory, so the problem has drawn the attention of many researchers in Bioinformatics. Given the large gap between the number of known protein sequences and the number of experimentally determined three-dimensional structures, the demand for automated techniques for the structural classification of proteins is very high. In this context, computational tools, especially Machine Learning (ML) techniques, have become essential for dealing with this problem. In this work, the following ML techniques are used to recognize protein structural classes: Decision Trees, k-Nearest Neighbor, Naive Bayes, Support Vector Machine and Neural Networks. These methods were chosen because they represent different learning paradigms and have been widely used in the Bioinformatics literature. To improve the performance of these techniques (individual classifiers), homogeneous (Bagging and Boosting) and heterogeneous (Voting, Stacking and StackingC) multi-classification systems are used. Moreover, since the protein database used in this work suffers from imbalanced classes, artificial class-balancing techniques (Random Undersampling, Tomek Links, CNN, NCL and OSS) are used to minimize this problem. To evaluate the ML methods, a cross-validation procedure is applied, in which the accuracy of the classifiers is measured by the mean classification error rate on independent test sets. These means are compared pairwise with a hypothesis test to determine whether the difference between them is statistically significant.
Among the individual classifiers, Support Vector Machine achieved the best accuracy. The multi-classification systems (homogeneous and heterogeneous) generally performed better than or similarly to the individual classifiers, especially Boosting with Decision Tree and StackingC with Linear Regression as meta-classifier. The Voting method, despite its simplicity, proved adequate for solving the problem addressed in this work. The class-balancing techniques, on the other hand, did not produce a significant improvement in the global classification error. Nevertheless, they did improve the classification error for the minority class. In this respect, the NCL technique proved to be the more appropriate choice.
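The pairwise comparison of mean error rates mentioned in the abstract can be sketched as follows. This is a minimal illustration that assumes a paired t-test over per-fold error rates. The abstract only says "hypothesis test", so the choice of test, the function name `paired_t_statistic`, and the fold error rates below are hypothetical.

```python
import math


def paired_t_statistic(errors_a, errors_b):
    """Paired t statistic for two classifiers evaluated on the same folds.

    errors_a and errors_b hold the classification error rate of each
    classifier on each cross-validation fold, in the same fold order.
    """
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    k = len(diffs)
    mean_d = sum(diffs) / k
    # Sample variance of the per-fold differences (k - 1 degrees of freedom).
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (k - 1)
    return mean_d / math.sqrt(var_d / k)


# Hypothetical error rates on 5 folds (NOT results from the thesis).
errors_a = [0.20, 0.22, 0.19, 0.21, 0.23]
errors_b = [0.17, 0.20, 0.18, 0.18, 0.20]

t = paired_t_statistic(errors_a, errors_b)
# With 4 degrees of freedom, the two-sided 5% critical value is about 2.776.
print(f"t = {t:.3f}, significant at 5%: {abs(t) > 2.776}")
```

A paired test fits this setting because both classifiers are scored on the same folds, so fold-to-fold difficulty cancels out in the differences. Note that fold error rates from cross-validation are not fully independent, so a plain paired t-test can overstate significance.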