744 results for Unsupervised classification
Abstract:
Identification of clouds from satellite images is now a routine task. Observation of clouds from the ground, however, is still needed to acquire a complete description of cloud conditions. Among the standard meteorological variables, solar radiation is the most affected by cloud cover. In this note, a method for using global and diffuse solar radiation data to classify sky conditions into several classes is suggested. A classical maximum-likelihood method is applied for clustering data. The method is applied to a series of four years of solar radiation data and human cloud observations at a site in Catalonia, Spain. With these data, the accuracy of the solar radiation method as compared with human observations is 45% when nine classes of sky conditions are to be distinguished, and it grows significantly to almost 60% when samples are classified into only five different classes. Most errors are explained by limitations in the database; therefore, further work is under way with a more suitable database.
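A maximum-likelihood assignment of sky classes can be sketched as follows. This is a minimal illustration, not the method of the note itself: the two features (diffuse fraction and a clearness index), the independent Gaussian model, and all class parameters in SKY_CLASSES are assumptions chosen only to make the example run.

```python
import math

def diffuse_fraction(global_rad, diffuse_rad):
    """Ratio of diffuse to global radiation (low for clear sky, near 1 for overcast)."""
    return diffuse_rad / global_rad

def log_likelihood(x, mean, var):
    """Log of a one-dimensional Gaussian density."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def classify(features, classes):
    """Return the class whose independent Gaussians give the highest likelihood."""
    def score(params):
        return sum(log_likelihood(x, m, v) for x, (m, v) in zip(features, params))
    return max(classes, key=lambda name: score(classes[name]))

# Hypothetical (mean, variance) pairs per class for
# [diffuse fraction, clearness index]; not taken from the note.
SKY_CLASSES = {
    "clear": [(0.15, 0.01), (0.70, 0.01)],
    "partly_cloudy": [(0.50, 0.02), (0.50, 0.02)],
    "overcast": [(0.95, 0.01), (0.20, 0.01)],
}

sample = [diffuse_fraction(800.0, 100.0), 0.72]
print(classify(sample, SKY_CLASSES))
```

In practice the class means and variances would be estimated from samples labelled by human observers, and the number of classes (nine or five in the note) would set how many entries the table holds.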
Abstract:
Fluent health information flow is critical for clinical decision-making. However, a considerable part of this information is free-form text, and the inability to use it creates risks to patient safety and cost-effective hospital administration. Methods for automated processing of clinical text are emerging. The aim of this doctoral dissertation is to study machine learning and clinical text in order to support health information flow. First, by analyzing the content of authentic patient records, the aim is to specify clinical needs in order to guide the development of machine learning applications. The contributions are a model of the ideal information flow, a model of the problems and challenges in reality, and a road map for the technology development. Second, by developing applications for practical cases, the aim is to concretize ways to support health information flow. Altogether five machine learning applications for three practical cases are described. The first two applications are binary classification and regression related to the practical case of topic labeling and relevance ranking. The third and fourth applications are supervised and unsupervised multi-class classification for the practical case of topic segmentation and labeling. These four applications are tested with Finnish intensive care patient records. The fifth application is multi-label classification for the practical task of diagnosis coding. It is tested with English radiology reports. The performance of all these applications is promising. Third, the aim is to study how the quality of machine learning applications can be reliably evaluated. The associations between performance evaluation measures and methods are addressed, and a new hold-out method is introduced. This method contributes not only to processing time but also to the evaluation diversity and quality. The main conclusion is that developing machine learning applications for text requires interdisciplinary, international collaboration.
Practical cases are very different, and hence the development must begin from genuine user needs and domain expertise. The technological expertise must cover linguistics, machine learning, and information systems. Finally, the methods must be evaluated both statistically and through authentic user feedback.
Abstract:
Twelve single-pustule isolates of Uromyces appendiculatus, the etiological agent of common bean rust, were collected in the state of Minas Gerais, Brazil, and classified according to the new international differential series and the binary nomenclature system proposed during the 3rd Bean Rust Workshop. These isolates have been used to select rust-resistant genotypes in a bean breeding program conducted by our group. The twelve isolates were classified into seven different physiological races: 21-3, 29-3, 53-3, 53-19, 61-3, 63-3 and 63-19. Races 61-3 and 63-3 were the most frequent in the area. They were represented by five and two isolates, respectively. The other races were represented by just one isolate. This is the first time the new international classification procedure has been used for U. appendiculatus physiological races in Brazil. The general adoption of this system will facilitate information exchange, allowing the cooperative use of the results obtained by different research groups throughout the world. The differential cultivars Mexico 309, Mexico 235 and PI 181996 showed resistance to all of the isolates that were characterized. It is suggested that these cultivars should be preferentially used as sources for resistance to rust in breeding programs targeting development lines adapted to the state of Minas Gerais.
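The binary nomenclature behind race names such as 63-19 can be sketched as follows. The sketch assumes the usual convention of the system: six Andean and six Mesoamerican differential cultivars, with the cultivar in position i carrying the value 2^(i-1), and each half of the race name being the sum of the values of the susceptible cultivars. The function name and the input format are hypothetical.

```python
def race_code(andean, mesoamerican):
    """Build a race name from susceptibility reactions.

    Each argument is a list of six booleans, one per differential cultivar
    in series order; True means the cultivar is susceptible to the isolate.
    """
    def value(reactions):
        # Position i (0-based) contributes 2**i when the cultivar is susceptible.
        return sum(2 ** i for i, susceptible in enumerate(reactions) if susceptible)
    return f"{value(andean)}-{value(mesoamerican)}"

# An isolate virulent on all six Andean differentials and on the first,
# second and fifth Mesoamerican differentials.
print(race_code([True] * 6, [True, True, False, False, True, False]))
```

Under this convention, a second number of 3 or 19 never includes the values 4, 8 or 32, which is consistent with three of the differential cultivars being resistant to every isolate.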
Abstract:
Software testing is an essential part of the software engineering process. The objective of the study was to describe software testing tools and their use. The thesis contains examples of software testing tool usage. The study was conducted as a literature study, with a focus on current software testing practices and quality assurance standards. A tool classifier was employed, and the testing tools presented in the study were classified according to it. We found that it is difficult to distinguish currently available tools by testing activity, as many of them contain functionality that exceeds the scope of a single testing type.
Abstract:
The purpose of this study is to view credit risk from the financier’s point of view in a theoretical framework. Results and aspects of previous studies on measuring credit risk with accounting-based scoring models are also examined. The theoretical framework and previous studies are then used to support the empirical analysis, which aims to develop a credit risk measure for a bank’s internal use or a risk management tool for a company to indicate its credit risk to the financier. The study covers a sample of Finnish companies from 12 different industries and four different company categories and employs their accounting information from 2004 to 2008. The empirical analysis consists of a six-stage process which uses measures of profitability, liquidity, capital structure and cash flow to determine the financier’s credit risk, define five significant risk classes and produce a risk classification model. The study is confidential until 15.10.2012.
Abstract:
Female sexual dysfunctions, including desire, arousal, orgasm and pain problems, have been shown to be highly prevalent among women around the world. The etiology of these dysfunctions is unclear but associations with health, age, psychological problems, and relationship factors have been identified. Genetic effects explain individual variation in orgasm function to some extent but until now quantitative behavior genetic analyses have not been applied to other sexual functions. In addition, behavior genetics can be applied to exploring the cause of any observed comorbidity between the dysfunctions. Discovering more about the etiology of the dysfunctions may further improve the classification systems which are currently under intense debate. The aims of the present thesis were to evaluate the psychometric properties of a Finnish-language version of a commonly used questionnaire for measuring female sexual function, the Female Sexual Function Index (FSFI), in order to investigate prevalence, comorbidity, and classification, and to explore the balance of genetic and environmental factors in the etiology as well as the associations of a number of biopsychosocial factors with female sexual functions. Female sexual functions were studied through survey methods in a population based sample of Finnish twins and their female siblings. There were two waves of data collection. The first data collection targeted 5,000 female twins aged 33–43 years and the second 7,680 female twins aged 18–33 and their over 18–year-old female siblings (n = 3,983). There was no overlap between the data collections. The combined overall response rate for both data collections was 53% (n = 8,868), with a better response rate in the second (57%) compared to the first (45%). In order to measure female sexual function, the FSFI was used. 
It includes 19 items which measure female sexual function during the previous four weeks in six subdomains: desire, subjective arousal, lubrication, orgasm, sexual satisfaction, and pain. In line with earlier research in clinical populations, a six-factor solution of the Finnish-language version of the FSFI received support. The internal consistencies of the scales were good to excellent. Some questions were raised about how to avoid overestimating the prevalence of extreme dysfunctions, since women were allocated a score of zero if they had had no sexual activity during the preceding four weeks. The prevalence of female sexual dysfunctions per se ranged from 11% for lubrication dysfunction to 55% for desire dysfunction. The prevalence rates for sexual dysfunction with concomitant sexual distress, in other words sexual disorders, were notably lower, ranging from 7% for lubrication disorder to 23% for desire disorder. The comorbidity between the dysfunctions was substantial, most notably between arousal and lubrication dysfunction, even if these two dysfunctions showed distinct patterns of associations with the other dysfunctions. Genetic influences on individual variation in the six subdomains of the FSFI were modest but significant, ranging from 3–11% for additive genetic effects and 5–18% for nonadditive genetic effects. The rest of the variation in sexual functions was explained by nonshared environmental influences. A correlated factor model including additive and nonadditive genetic effects and nonshared environmental effects had the best fit. All in all, every correlation between the genetic factors was significant except between lubrication and pain. All correlations between the nonshared environment factors were significant, showing that there is a substantial overlap in genetic and nonshared environmental influences between the dysfunctions.
In general, psychological problems, poor satisfaction with the relationship, sexual distress, and poor partner compatibility were associated with more sexual dysfunctions. Age was confounded with relationship length but had, over and above relationship length, a negative effect on desire and sexual satisfaction and a positive effect on orgasm and pain functions. Alcohol consumption in general was associated with better desire, arousal, lubrication, and orgasm function. Women pregnant with their first child had fewer pain problems than nulliparous nonpregnant women. Multiparous pregnant women had more orgasm problems compared to multiparous nonpregnant women. Having children was associated with fewer orgasm and pain problems. The conclusions were that desire, subjective arousal, lubrication, orgasm, sexual satisfaction, and pain are separate entities that have distinct associations with a number of different biopsychosocial factors. However, there is also considerable comorbidity between the dysfunctions, which is explained by overlap in additive genetic, nonadditive genetic and nonshared environmental influences. Sexual dysfunctions are highly prevalent and are not always associated with sexual distress, and this relationship might be moderated by a good relationship and compatibility with the partner. Regarding classification, the results support separate diagnoses for subjective arousal and genital arousal as well as the inclusion of pain under sexual dysfunctions.
Abstract:
The large and growing number of digital images is making manual image search laborious. Only a fraction of the images contain metadata that can be used to search for a particular type of image. Thus, the main research question of this thesis is whether it is possible to learn visual object categories directly from images. Computers process images as long lists of pixels that do not have a clear connection to high-level semantics which could be used in the image search. There are various methods introduced in the literature to extract low-level image features and also approaches to connect these low-level features with high-level semantics. One of these approaches is called Bag-of-Features which is studied in the thesis. In the Bag-of-Features approach, the images are described using a visual codebook. The codebook is built from the descriptions of the image patches using clustering. The images are described by matching descriptions of image patches with the visual codebook and computing the number of matches for each code. In this thesis, unsupervised visual object categorisation using the Bag-of-Features approach is studied. The goal is to find groups of similar images, e.g., images that contain an object from the same category. The standard Bag-of-Features approach is improved by using spatial information and visual saliency. It was found that the performance of the visual object categorisation can be improved by using spatial information of local features to verify the matches. However, this process is computationally heavy, and thus, the number of images must be limited in the spatial matching, for example, by using the Bag-of-Features method as in this study. Different approaches for saliency detection are studied and a new method based on the Hessian-Affine local feature detector is proposed. The new method achieves comparable results with current state-of-the-art. 
The visual object categorisation performance was improved by using foreground segmentation based on saliency information, especially when the background could be considered as clutter.
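The codebook matching step of the Bag-of-Features approach can be sketched as follows. This is a minimal illustration under stated assumptions: descriptors are plain numeric tuples, the codebook is given rather than learned by clustering, and matching uses squared Euclidean distance. The function names are hypothetical and none of this is taken from the thesis.

```python
def nearest_code(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[k])),
    )

def bof_histogram(descriptors, codebook):
    """Describe an image by the number of patch descriptors matched to each code."""
    hist = [0] * len(codebook)
    for descriptor in descriptors:
        hist[nearest_code(descriptor, codebook)] += 1
    return hist

# Toy two-dimensional codebook and patch descriptors (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 1.0)]
patches = [(0.1, 0.0), (0.9, 1.0), (1.2, 0.8)]
print(bof_histogram(patches, codebook))
```

Images with similar histograms can then be grouped by any clustering method, which is where the unsupervised categorisation takes place. The spatial verification and saliency steps described in the abstract are not covered by this sketch.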
Abstract:
A Geographic Information System (GIS) is an indispensable software tool in forest planning. In forestry transportation, GIS can manage the data on the road network and solve some problems in transportation, such as route planning. Therefore, the aim of this study was to determine the pattern of the road network and define transport routes using GIS technology. The present research was conducted in a forestry company in the state of Minas Gerais, Brazil. The criteria used to classify the pattern of forest roads were horizontal and vertical geometry, and pavement type. In order to determine transport routes, a network analysis data model was created in ArcGIS using the Network Analyst extension, which made it possible to find the shortest and the fastest routes. The results showed a predominance of the horizontal geometry classes average (3) and bad (4), indicating the presence of winding roads. For the vertical geometry criterion, the class of highly mountainous relief (4) had the greatest extent of roads. Regarding the type of pavement, the occurrence of secondary coating was highest (75%), followed by primary coating (20%) and asphalt pavement (5%). The best route was the one that allowed the transport vehicle to travel at a higher speed, as a function of the road pattern found in the study.
Abstract:
OBJECTIVE: To evaluate Crohn's disease recurrence and its possible predictors in patients undergoing surgical treatment. METHODS: We conducted a retrospective study with Crohn's disease (CD) patients undergoing surgical treatment between January 1992 and January 2012, and regularly monitored at the Bowel Clinic of the Hospital das Clínicas of the UFMG. RESULTS: We evaluated 125 patients, 50.4% female, with a mean age of 46.12 years, the majority (63.2%) diagnosed between 17 and 40 years of age. The ileum was involved in 58.4%, whereas stenotic behavior was observed in 44.8%, and penetrating behavior in 45.6%. We observed perianal disease in 26.4% of cases. The mean follow-up was 152.40 months. Surgical relapse occurred in 29.6%, with a median time of 68 months from the first operation. CONCLUSION: The ileocolic location, penetrating behavior and perianal involvement (L3B3p) were associated with increased risk of surgical recurrence.
Abstract:
Gestational trophoblastic neoplasia (GTN) is the term to describe a set of malignant placental diseases, including invasive mole, choriocarcinoma, placental site trophoblastic tumor and epithelioid trophoblastic tumor. Both invasive mole and choriocarcinoma respond well to chemotherapy, and cure rates are greater than 90%. Since the advent of chemotherapy, low-risk GTN has been treated with a single agent, usually methotrexate or actinomycin D. Cases of high-risk GTN, however, should be treated with multiagent chemotherapy, and the regimen usually selected is EMA-CO, which combines etoposide, methotrexate, actinomycin D, cyclophosphamide and vincristine. This study reviews the literature about GTN to discuss current knowledge about its diagnosis and treatment.
Abstract:
The purpose of the thesis is to classify suppliers and to enhance strategic purchasing in the case company. Supplier classification is conducted to fulfill the requirements of the company quality manual and international quality standards. To gain more benefit, a strategic purchasing tool, Kraljic’s purchasing portfolio, and the analytical hierarchy process are used as the basis for supplier classification. The purchasing portfolio is used to give quick and easy visual insight into product group management from the viewpoint of purchasing. On the basis of the purchasing portfolio, alternative purchasing and supplier strategies can be formed that enhance the strategic orientation of purchasing. Thus the purchasing portfolio forces the company to orient itself towards proactive and strategic purchasing. As a result, a survey method for implementing the purchasing portfolio in the company is developed that exploits the analytical hierarchy process. Experts from the company set the categorization criteria and, in addition, participate in the survey to categorize product groups on the portfolio. Alternative purchasing strategies are formed. Suppliers are classified depending on the importance and characteristics of the product groups supplied.
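How the analytical hierarchy process turns pairwise expert judgements into criterion weights can be sketched as follows. The sketch uses the row geometric mean approximation rather than the principal eigenvector, and the criteria and comparison values are hypothetical. It does not reproduce the survey method developed in the thesis.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric mean method.

    pairwise[i][j] states how many times more important criterion i is than
    criterion j; the matrix is expected to be reciprocal (a_ji = 1 / a_ij).
    """
    n = len(pairwise)
    geometric_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]

# Hypothetical judgement: the first criterion (for example supply risk) is
# three times as important as the second (for example profit impact).
weights = ahp_weights([[1.0, 3.0], [1.0 / 3.0, 1.0]])
print([round(w, 2) for w in weights])
```

For a perfectly consistent comparison matrix the geometric mean weights coincide with the eigenvector weights; with inconsistent judgements the two can differ slightly, so a consistency check is usually run alongside.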
Abstract:
Avian pathogenic Escherichia coli (APEC) is responsible for various pathological processes in birds and is considered one of the principal causes of morbidity and mortality, associated with economic losses to the poultry industry. The objective of this study was to demonstrate that it is possible to predict the antimicrobial resistance of 256 APEC samples using 38 different genes responsible for virulence factors, through an artificial neural network (ANN) computer program. A second objective was to find the relationship between the pathogenicity index (PI) and resistance to 14 antibiotics by statistical analysis. The results showed that the ANNs were able to classify the behavior of APEC samples correctly, with accuracy ranging from 74.22% to 98.44%, making it possible to predict antimicrobial resistance. The statistical analysis of the relationship between the pathogenicity index (PI) and resistance to 14 antibiotics showed that these variables are independent, i.e. peaks in PI can occur without a change in antimicrobial resistance, and antimicrobial resistance can change without a change in PI.
Abstract:
This work reviews the theoretical basis of support vector machines and examines the effect of different parameters on the classification of spectral data.