921 results for Document classification, Naive Bayes classifier, Verb-object pairs
Abstract:
Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies
Abstract:
Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies
Abstract:
Decision support methods and computer-aided diagnostic systems are increasingly applied across many areas of medicine. In breast cancer research, much work has sought to reduce false positives when such systems are used as a double-reading method. In this study, we present a set of data mining techniques applied to build a decision support system for breast cancer diagnosis. The method is intended to assist clinical practice in identifying mammographic findings such as microcalcifications, masses and normal tissue, in order to avoid misdiagnosis. We used a reliable database of 410 images from about 115 patients, with prior radiologist reviews labelling findings as microcalcifications, masses or normal tissue. Two feature extraction techniques were used: the gray level co-occurrence matrix and the gray level run length matrix. For classification, we considered several scenarios based on different lesion patterns, and several classifiers, in order to identify the best performer in each case. The classifiers used were Naïve Bayes, Support Vector Machines, k-Nearest Neighbors and Decision Trees (J48 and Random Forests). The results in distinguishing mammographic findings showed high positive predictive values (PPV) and very good accuracy. The study also reports related results for the classification of breast density and the BI-RADS® scale. The best predictive method for all tested groups was the Random Forest classifier, and the best performance was achieved in distinguishing microcalcifications. The conclusions drawn from the tested scenarios offer a new perspective on breast cancer diagnosis using data mining techniques.
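As an illustration of the first feature extraction technique named in this abstract, the following is a minimal sketch of a gray level co-occurrence matrix for a single pixel offset, with three texture features commonly derived from it. The function names, the choice of offset, the symmetric counting and the toy image are assumptions made for illustration; the abstract does not state which offsets, gray-level quantization or features the authors used.

```python
def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Normalized gray level co-occurrence matrix for one pixel offset.

    image  : list of rows of integer gray levels in [0, levels)
    dx, dy : column and row offset between the two pixels of a pair
    Assumes the image contains at least one valid pixel pair.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    # Count the pair in both directions.
                    counts[j][i] += 1
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Weights co-occurrences by the squared gray-level difference."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))


def energy(p):
    """Angular second moment: sum of squared matrix entries."""
    return sum(v * v for row in p for v in row)


def homogeneity(p):
    """Weights co-occurrences by closeness to the matrix diagonal."""
    return sum(v / (1 + (i - j) ** 2) for i, row in enumerate(p) for j, v in enumerate(row))


# Hypothetical 4x4 image with four gray levels (not data from the study).
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(toy_image, levels=4)
features = [contrast(p), energy(p), homogeneity(p)]
```

A feature vector like `features`, typically computed for several offsets, is what would then be passed to a classifier such as those listed in the abstract.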
Abstract:
Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Abstract:
Dissertation submitted to obtain the Degree of Master in Computer Engineering (Engenharia Informática)
Abstract:
A Work Project, presented as part of the requirements for the Award of a Masters Degree in Management from the NOVA – School of Business and Economics
Abstract:
Remote sensing, the acquisition of information about an object or phenomenon without making physical contact with it, is applied in many areas, including agriculture, forestry, cartography, hydrology, geology, meteorology and aerial traffic control, among many others. In agriculture, one application is crop detection, which makes it easier to monitor existing crops and supports the region’s strategic planning. In all of these areas there is an ongoing search for methods that yield better results. For over forty years, the Landsat program has used satellites to collect spectral information from Earth’s surface, creating a historical archive unmatched in quality, detail, coverage, and length. The most recent satellite was launched on February 11, 2013, with a number of improvements over its predecessors. This project aims to compare classification methods for crop detection in Portugal’s Ribatejo region. State-of-the-art algorithms will be applied to this region and their performance will be analyzed.
Abstract:
Given the limitations of different types of remote sensing images, automated land-cover classifications of the Amazon várzea may yield poor accuracy indices. One way to improve accuracy is to combine images from different sensors, by either image fusion or multi-sensor classification. The objective of this study was therefore to determine which method is more efficient in improving land cover classification accuracy for the Amazon várzea and similar wetland environments: (a) synthetically fused optical and SAR images or (b) multi-sensor classification of paired SAR and optical images. Land cover classifications based on images from a single sensor (Landsat TM or Radarsat-2) are compared with multi-sensor and image fusion classifications. Object-based image analysis (OBIA) and the J48 data-mining algorithm were used for automated classification, and classification accuracies were assessed using the kappa index of agreement and the recently proposed allocation and quantity disagreement measures. Overall, optical-based classifications had better accuracy than SAR-based classifications. Once both datasets were combined using the multi-sensor approach, there was a 2% decrease in allocation disagreement, as the method was able to overcome part of the limitations present in both images. Accuracy decreased, however, when image fusion methods were used. We therefore concluded that the multi-sensor classification method is more appropriate for classifying land cover in the Amazon várzea.
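The accuracy measures named in this abstract can all be derived from a confusion matrix. The sketch below computes the kappa index together with quantity and allocation disagreement as they are commonly defined; the function name, the rows-as-map and columns-as-reference convention, the use of raw sample counts as proportions, and the example matrix are assumptions for illustration and are not taken from the study.

```python
def agreement_measures(matrix):
    """Kappa, quantity disagreement and allocation disagreement.

    matrix[i][j] : number of samples mapped as class i whose reference
                   class is j (rows = map, columns = reference).
    Assumes sample counts can be used directly as population proportions.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p = [[v / total for v in row] for row in matrix]
    row_tot = [sum(p[g]) for g in range(n)]
    col_tot = [sum(p[i][g] for i in range(n)) for g in range(n)]

    observed = sum(p[g][g] for g in range(n))
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_tot[g] * col_tot[g] for g in range(n))
    kappa = (observed - expected) / (1 - expected)

    # Quantity disagreement: mismatch in the proportion of each class.
    quantity = sum(abs(row_tot[g] - col_tot[g]) for g in range(n)) / 2
    # Allocation disagreement: errors that swap places between classes.
    allocation = sum(
        2 * min(row_tot[g] - p[g][g], col_tot[g] - p[g][g]) for g in range(n)
    ) / 2

    return {
        "overall_agreement": observed,
        "kappa": kappa,
        "quantity_disagreement": quantity,
        "allocation_disagreement": allocation,
    }


# Hypothetical two-class confusion matrix (not data from the study).
m = agreement_measures([[40, 10], [5, 45]])
```

Under these definitions, quantity and allocation disagreement sum to one minus the overall agreement, which is a convenient consistency check.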
Abstract:
"See the abstract at the beginning of the document in the attached file."
Abstract:
Landscape classification tackles issues related to the representation and analysis of continuous and variable ecological data. In this study, a methodology is developed to define topo-climatic landscapes (TCL) in the north-west of Catalonia (north-east of the Iberian Peninsula). TCLs describe the ecological behaviour of a landscape in terms of topography, physiognomy and climate, which constitute the main drivers of an ecosystem. The selected variables are derived from different sources such as remote sensing and climatic atlases. The proposed methodology combines unsupervised iterative cluster classification with a supervised fuzzy classification. As a result, 28 TCLs were found for the study area, which may be differentiated in terms of vegetation physiognomy and vegetation altitudinal range type. Furthermore, a hierarchy among TCLs is established, enabling the merging of clusters and allowing for changes of scale. Through the topo-climatic landscape map, managers may identify patches with similar environmental conditions and at the same time assess the uncertainty involved.
Abstract:
Difficult tracheal intubation assessment is an important research topic in anesthesia, as failed intubations are important causes of mortality in anesthetic practice. The modified Mallampati score is widely used, alone or in conjunction with other criteria, to predict the difficulty of intubation. This work presents an automatic method to assess the modified Mallampati score from an image of a patient with the mouth wide open. For this purpose we propose a method based on active appearance models (AAM) and use linear support vector machines (SVM) to select a subset of relevant features obtained using the AAM. This feature selection step proves to be essential, as it drastically improves classification performance; classification itself is performed using an SVM with an RBF kernel and majority voting. We test our method on images of 100 patients undergoing elective surgery and achieve 97.9% accuracy in the leave-one-out cross-validation test, providing a key element of an automatic difficult intubation assessment system.
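The leave-one-out cross-validation protocol mentioned in this abstract can be sketched as follows. To keep the example self-contained, a 1-nearest-neighbour rule stands in for the classifier; this is a swapped-in technique, not the SVM with RBF kernel used in the work, and all function names and the toy data are hypothetical.

```python
def fit_1nn(train_x, train_y):
    """Stand-in 'training': a 1-nearest-neighbour model just stores the data."""
    return list(zip(train_x, train_y))


def predict_1nn(model, x):
    """Return the label of the stored sample closest to x (squared Euclidean)."""
    def dist(sample):
        return sum((a - b) ** 2 for a, b in zip(sample[0], x))
    return min(model, key=dist)[1]


def leave_one_out_accuracy(samples, labels, fit, predict):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        if predict(model, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


# Hypothetical one-dimensional feature vectors for two classes.
toy_x = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
toy_y = [0, 0, 0, 1, 1, 1]
accuracy = leave_one_out_accuracy(toy_x, toy_y, fit_1nn, predict_1nn)
```

Any feature selection step has to be repeated inside each fold, on the training samples only, for the resulting accuracy to be an unbiased estimate; the abstract does not say how this was handled.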
Abstract:
Report for the scientific sojourn at the Swiss Federal Institute of Technology Zurich, Switzerland, between September and December 2007. In order to make robots useful assistants for our everyday life, the ability to learn and recognize objects is of essential importance. However, object recognition in real scenes is one of the most challenging problems in computer vision, as it is necessary to deal with difficulties. Furthermore, in mobile robotics a new challenge is added to the list: computational complexity. In a dynamic world, information about the objects in the scene can become obsolete before it is ready to be used if the detection algorithm is not fast enough. Two recent object recognition techniques have achieved notable results: the constellation approach proposed by Lowe and the bag of words approach proposed by Nistér and Stewénius. The Lowe constellation approach is the one currently being used in the robot localization project of the COGNIRON project. This report is divided into two main sections. The first section briefly reviews the currently used object recognition system, the Lowe approach, and brings to light the drawbacks found for object recognition in the context of indoor mobile robot navigation. Additionally, the proposed improvements to the algorithm are described. The second section reviews the alternative bag of words method and describes several experiments conducted to evaluate its performance with our own object databases. Furthermore, some modifications to the original algorithm are proposed to make it suitable for object detection in unsegmented images.
Disentangling the effects of key innovations on the diversification of Bromelioideae (Bromeliaceae).
Abstract:
The evolution of key innovations, novel traits that promote diversification, is often seen as a major driver of the unequal distribution of species richness within the tree of life. In this study, we aim to determine the factors underlying the extraordinary radiation of the subfamily Bromelioideae, one of the most diverse clades in the neotropical plant family Bromeliaceae. Based on an extended molecular phylogenetic data set, we examine the effect of two putative key innovations, namely Crassulacean acid metabolism (CAM) and the water-impounding tank, on speciation and extinction rates. To this end, we develop a novel Bayesian implementation of the phylogenetic comparative method binary state speciation and extinction, which enables hypothesis testing by Bayes factors and accommodates uncertainty in model selection through Bayesian model averaging. Both CAM and tank habit were found to correlate with increased net diversification, thus fulfilling the criteria for key innovations. Our analyses further revealed that CAM photosynthesis is correlated with a twofold increase in speciation rate, whereas the evolution of the tank primarily affected extinction rates, which were found to be five times lower in tank-forming lineages than in tank-less clades. These differences are discussed in the light of biogeography, ecology, and past climate change.
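For readers unfamiliar with the two Bayesian tools named in this abstract, their textbook definitions are sketched below. The symbols are generic assumptions, not the paper's notation: $D$ is the data, $M_k$ a candidate model (for example, one in which speciation rates depend on the trait and one in which they do not), and $\theta$ a parameter of interest such as a speciation or extinction rate.

```latex
% Bayes factor comparing two models by their marginal likelihoods
\[
  \mathrm{BF}_{12} = \frac{p(D \mid M_1)}{p(D \mid M_2)}
\]

% Posterior model probabilities, used as weights in model averaging
\[
  p(M_k \mid D) = \frac{p(D \mid M_k)\, p(M_k)}{\sum_{l} p(D \mid M_l)\, p(M_l)}
\]

% Bayesian model averaging of the posterior of a parameter \theta
\[
  p(\theta \mid D) = \sum_{k} p(\theta \mid M_k, D)\, p(M_k \mid D)
\]
```

A Bayes factor well above 1 favours $M_1$, while the model-averaged posterior carries forward the uncertainty about which model is correct instead of conditioning on a single selected model.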
Abstract:
Introduction: As part of the MicroArray Quality Control (MAQC)-II project, this analysis examines how the choice of univariate feature-selection methods and classification algorithms may influence the performance of genomic predictors under varying degrees of prediction difficulty represented by three clinically relevant endpoints. Methods: We used gene-expression data from 230 breast cancers (grouped into training and independent validation sets), and we examined 40 predictors (five univariate feature-selection methods combined with eight different classifiers) for each of the three endpoints. Their classification performance was estimated on the training set by using two different resampling methods and compared with the accuracy observed in the independent validation set. Results: A ranking of the three classification problems was obtained, and the performance of 120 models was estimated and assessed on an independent validation set. The bootstrapping estimates were closer to the validation performance than were the cross-validation estimates. The required sample size for each endpoint was estimated, and both gene-level and pathway-level analyses were performed on the obtained models. Conclusions: We showed that genomic predictor accuracy is determined largely by an interplay between sample size and classification difficulty. Variations on univariate feature-selection methods and choice of classification algorithm have only a modest impact on predictor performance, and several statistically equally good predictors can be developed for any given classification problem.