3 results for nonlinear dimensionality reduction

at Universidade Federal do Rio Grande do Norte (UFRN)


Relevance:

80.00%

Publisher:

Abstract:

Self-organizing maps (SOM) are artificial neural networks widely used in data mining, mainly because the fixed grid of neurons associated with the network makes them a dimensionality reduction technique. To properly partition and visualize the SOM network, the various methods available in the literature must be applied in a post-processing stage, which consists of inferring, through the network's neurons, relevant characteristics of the data set. In general, applying such processing to the network neurons, instead of the entire database, reduces the computational cost thanks to vector quantization. This work proposes a post-processing of the SOM neurons in the input and output spaces, combining visualization techniques with algorithms based on gravitational forces and on the search for the shortest path with the greatest reward. These methods take into account the connection strength between neighbouring neurons and the characteristics of pattern density and distances among neurons, both associated with the positions that the neurons occupy in the data space after the network is trained. The goal is thus to define more clearly the arrangement of the clusters present in the data. Experiments were carried out to evaluate the proposed methods on various artificially generated data sets as well as real-world data sets. The results obtained were compared with those of a number of well-known methods from the literature.
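
The vector quantization mentioned in this abstract can be illustrated with a minimal sketch of a generic SOM, assuming NumPy. This is not the thesis's post-processing method: the function names `train_som` and `quantize`, the grid size, the decay schedules, and the synthetic data are all assumptions chosen for illustration. It only shows how a trained grid of neurons stands in for the full data set.

```python
import numpy as np

def train_som(data, grid_shape=(5, 5), n_iter=500, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    # Fixed grid coordinates of each neuron (output space).
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise neuron weights from randomly chosen samples (input space).
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)              # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5  # shrinking neighbourhood radius
        x = data[rng.integers(0, len(data))]
        # Best-matching unit: the neuron closest to x in the input space.
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
        # Gaussian neighbourhood measured on the grid, not in the data space.
        d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

def quantize(data, weights):
    """Map each sample to the index of its nearest neuron (vector quantization)."""
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

rng = np.random.default_rng(1)
# Two synthetic Gaussian blobs in 3-D (illustrative data only).
data = np.vstack([rng.normal(0.0, 0.3, (100, 3)), rng.normal(3.0, 0.3, (100, 3))])
weights = train_som(data)
labels = quantize(data, weights)
```

Any later clustering or visualization step can then operate on the 25 rows of `weights` rather than on the 200 samples, which is where the cost reduction described above comes from.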

Relevance:

80.00%

Publisher:

Abstract:

We constantly perform everyday actions, two of which are frequent and of great importance: classifying (sorting into classes) and making decisions. When we encounter problems with a relatively high degree of complexity, we tend to seek other opinions, usually from people who have some knowledge of the problem domain or, where possible, are experts in it, to help us in the decision-making process. In both the classification process and the decision-making process, we are guided by the characteristics involved in the specific problem. The characterization of a set of objects is part of the decision-making process in general. In Machine Learning, this classification is performed by a learning algorithm and the characterization is applied to databases. Classification algorithms can be employed individually or in machine committees. Choosing the best methods to use in the construction of a committee is a very arduous task. This work investigates meta-learning techniques for selecting the best configuration parameters of homogeneous committees for applications in various classification problems. These parameters are the base classifier, the architecture, and the size of this architecture. We investigated nine types of candidate inducers for the base classifier, two architecture generation methods, and nine medium-sized groups for the architecture. Dimensionality reduction techniques were applied to the metabases in search of improvement. Five classification methods are investigated as meta-learners in the process of choosing the best parameters of a homogeneous committee.
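
The three committee parameters named in this abstract (base classifier, architecture, size) can be made concrete with a minimal sketch in plain Python. It assumes a 1-nearest-neighbour base classifier, bagging as the architecture, and majority voting; none of these choices, nor the helper names or toy data, are taken from the work itself.

```python
import random
from collections import Counter

def nearest_neighbor_predict(train, x):
    """Base classifier: return the label of the closest training point (1-NN)."""
    closest = min(train, key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)))
    return closest[1]

def build_committee(train, size, seed=0):
    """Bagging architecture: each member is a bootstrap resample of the data."""
    rng = random.Random(seed)
    return [[rng.choice(train) for _ in train] for _ in range(size)]

def committee_predict(committee, x):
    """Combine the members' outputs by majority vote."""
    votes = Counter(nearest_neighbor_predict(member, x) for member in committee)
    return votes.most_common(1)[0][0]

# Illustrative toy data: (features, label) pairs.
train = [((0.0, 0.1), "a"), ((0.2, 0.0), "a"), ((0.1, 0.3), "a"),
         ((2.0, 2.1), "b"), ((2.2, 1.9), "b"), ((1.9, 2.3), "b")]
committee = build_committee(train, size=9)  # size is the third parameter
prediction = committee_predict(committee, (0.1, 0.1))
```

The committee is homogeneous because every member uses the same base classifier; swapping `nearest_neighbor_predict`, the resampling scheme, or `size` changes exactly one of the three parameters that the meta-learners are asked to choose.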

Relevance:

80.00%

Publisher:

Abstract:

Lung cancer is one of the most common types of cancer and has the highest mortality rate. Patient survival is highly correlated with early detection. Computed Tomography greatly aids the early detection of lung cancer by offering a minimally invasive medical diagnostic tool. However, the large amount of data per examination makes interpretation difficult, which leads human radiologists to miss nodules. This thesis presents the development of a computer-aided detection (CADe) tool for detecting lung nodules in Computed Tomography studies. The system, called LCD-OpenPACS (Lung Cancer Detection - OpenPACS), should be integrated into the OpenPACS system and meet all the requirements for use in the workflow of health facilities belonging to the SUS (Brazilian health system). LCD-OpenPACS uses image processing techniques (Region Growing and Watershed), feature extraction (Histogram of Oriented Gradients), dimensionality reduction (Principal Component Analysis), and a classifier (Support Vector Machine). The system was tested on 220 cases, totaling 296 pulmonary nodules, and achieved a sensitivity of 94.4% with 7.04 false positives per case. The total processing time was approximately 10 minutes per case. The system detected pulmonary nodules (solitary, juxtavascular, ground-glass opacity, and juxtapleural) between 3 mm and 30 mm.
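
The dimensionality reduction step named in this abstract, Principal Component Analysis, can be sketched as follows, assuming NumPy. The feature matrix is synthetic random data standing in for descriptor vectors, and the helper names, the 36-feature width, and the choice of 5 components are assumptions; the segmentation, feature extraction, and classifier stages of the system are not shown.

```python
import numpy as np

def pca_fit(features, n_components):
    """Fit PCA via SVD; returns the mean, principal directions, variance ratios."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    return mean, vt[:n_components], explained[:n_components]

def pca_transform(features, mean, components):
    """Project centred features onto the retained principal directions."""
    return (features - mean) @ components.T

rng = np.random.default_rng(0)
# Stand-in for descriptor vectors: 50 candidates x 36 features (synthetic).
features = rng.normal(size=(50, 36))
mean, components, explained = pca_fit(features, n_components=5)
reduced = pca_transform(features, mean, components)
```

In a pipeline of this kind, `reduced` rather than `features` would be passed to the classifier, so the classifier is trained on 5 values per candidate instead of 36.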