979 results for Text classification
Abstract:
The objective of this work was to assess and characterize two clones, 169 and 685, of Cabernet Sauvignon grapes and to evaluate the wine produced from these grapes. The experiment was carried out in São Joaquim, SC, Brazil, during the 2009 harvest season. During grape ripening, the evolution of physical-chemical properties, phenolic compounds, organic acids, and anthocyanins was evaluated. During grape harvest, yield components were determined for each clone. Individual and total phenolics, individual and total anthocyanins, and antioxidant activity were evaluated for wine. The clones were also assessed regarding the duration of their phenological cycle. During ripening, the evolution of phenolic compounds and of physical-chemical parameters was similar for both clones; however, during harvest, significant differences were observed regarding yield, number of bunches per plant and berries per bunch, leaf area, and organic acid, polyphenol, and anthocyanin content. The wines produced from these clones showed significant differences regarding chemical composition. The clones showed similar phenological cycle and responses to bioclimatic parameters. Principal component analysis shows that clone 685 is strongly correlated with color characteristics, mainly monomeric anthocyanins, while clone 169 is correlated with individual phenolic compounds.
Abstract:
Land use/cover classification is one of the most important applications in remote sensing. However, mapping accurate land use/cover spatial distribution is a challenge, particularly in moist tropical regions, due to the complex biophysical environment and the limitations of remote sensing data per se. This paper reviews a decade of experiments related to land use/cover classification in the Brazilian Amazon. Through comprehensive analysis of the classification results, it is concluded that spatial information inherent in remote sensing data plays an essential role in improving land use/cover classification. Incorporating suitable textural images into multispectral bands and using segmentation-based methods are valuable ways to improve land use/cover classification, especially for high spatial resolution images. Data fusion of multi-resolution images within optical sensor data is vital for visual interpretation, but may not improve classification performance. In contrast, integration of optical and radar data did improve classification performance when the proper data fusion method was used. Among the classification algorithms available, the maximum likelihood classifier is still an important method for providing reasonably good accuracy, but nonparametric algorithms, such as classification tree analysis, have the potential to provide better results. However, they often require more time to achieve parametric optimization. Proper use of hierarchical-based methods is fundamental for developing accurate land use/cover classification, mainly from historical remotely sensed data.
Abstract:
The objective of this work was to evaluate the biochemical composition of six berry types belonging to the genera Fragaria, Rubus, Vaccinium and Ribes. Fruit samples were collected in triplicate (50 fruit each) from 18 different species or cultivars of the mentioned genera, during three years (2008 to 2010). Contents of individual sugars, organic acids, flavonols, and phenolic acids were determined by high performance liquid chromatography (HPLC) analysis, while total phenolics (TPC) and total antioxidant capacity (TAC) were determined by spectrophotometry. Principal component analysis (PCA) and hierarchical cluster analysis (CA) were performed to evaluate the differences in fruit biochemical profile. The highest contents of bioactive components were found in Ribes nigrum and in Fragaria vesca, Rubus plicatus, and Vaccinium myrtillus. PCA and CA were able to partially discriminate between berries on the basis of their biochemical composition. Individual and total sugars, myricetin, ellagic acid, TPC and TAC showed the highest impact on the biochemical composition of the berry fruits. CA separated blackberry, raspberry, and blueberry as isolated groups, while strawberry, black currant and red currant were not classified into specific groups. There is a large variability both between and within the different types of berries. Metabolite fingerprinting of the evaluated berries showed unique biochemical profiles and specific combinations of bioactive compound contents.
Abstract:
Recent advances in machine learning methods increasingly enable the automatic construction of various types of computer-assisted methods that have been difficult or laborious for human experts to program. The tasks for which tools of this kind are needed arise in many areas, here especially in the fields of bioinformatics and natural language processing. Machine learning methods may not work satisfactorily if they are not appropriately tailored to the task in question. However, their learning performance can often be improved by taking advantage of deeper insight into the application domain or the learning problem at hand. This thesis considers developing kernel-based learning algorithms that incorporate this kind of prior knowledge of the task in question in an advantageous way. Moreover, computationally efficient algorithms for training the learning machines for specific tasks are presented. In the context of kernel-based learning methods, the incorporation of prior knowledge is often done by designing appropriate kernel functions. Another well-known way is to develop cost functions that fit the task under consideration. For disambiguation tasks in natural language, we develop kernel functions that take account of the positional information and the mutual similarities of words. It is shown that the use of this information significantly improves the disambiguation performance of the learning machine. Further, we design a new cost function that is better suited to the task of information retrieval and to more general ranking problems than the cost functions designed for regression and classification. We also consider other applications of the kernel-based learning algorithms, such as text categorization and pattern recognition in differential display. We develop computationally efficient algorithms for training the considered learning machines with the proposed kernel functions.
We also design a fast cross-validation algorithm for regularized least-squares type learning algorithms. Further, an efficient version of the regularized least-squares algorithm that can be used together with the new cost function for preference learning and ranking tasks is proposed. In summary, we demonstrate that the incorporation of prior knowledge is possible and beneficial, and that novel advanced kernels and cost functions can be used efficiently in algorithms.
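The abstract above does not describe its fast cross-validation algorithm, so the following is only a sketch of the classical leave-one-out shortcut for regularized least-squares, shown for the simplest possible case (one feature, no intercept, linear kernel). The function names and the toy data are hypothetical. The shortcut computes every leave-one-out residual from a single fit, as (y_i - w*x_i) / (1 - h_i), instead of retraining once per example.

```python
def ridge_fit(xs, ys, lam):
    """Fit y ~ w * x with an L2 penalty lam * w**2 (one feature, no intercept)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + lam)


def loo_residuals_fast(xs, ys, lam):
    """Leave-one-out residuals from a single fit, using the leverage h_i."""
    d = sum(x * x for x in xs) + lam
    w = ridge_fit(xs, ys, lam)
    # h_i = x_i**2 / d is the i-th diagonal entry of the hat matrix.
    return [(y - w * x) / (1.0 - x * x / d) for x, y in zip(xs, ys)]


def loo_residuals_naive(xs, ys, lam):
    """Reference implementation: retrain once per held-out example."""
    residuals = []
    for i in range(len(xs)):
        w = ridge_fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], lam)
        residuals.append(ys[i] - w * xs[i])
    return residuals


# Toy data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.1, 1.9, 3.2, 3.9]
fast = loo_residuals_fast(xs, ys, lam=0.5)
naive = loo_residuals_naive(xs, ys, lam=0.5)
```

Both functions return the same residuals up to rounding; the fast version needs one fit rather than one fit per example, which is the kind of saving a fast cross-validation method aims for.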
Abstract:
Objective: To evaluate the performance of diagnostic centers in the classification of mammography reports from an opportunistic screening undertaken by the Brazilian public health system (SUS) in the municipality of Goiânia, GO, Brazil in 2010. Materials and Methods: The present ecological study analyzed data reported to the Sistema de Informação do Controle do Câncer de Mama (SISMAMA) (Breast Cancer Management Information System) by diagnostic centers involved in the mammographic screening developed by the SUS. Based on the frequency of mammograms per BI-RADS® category and on the limits established for the present study, the authors calculated the rate of conformity for each diagnostic center. Diagnostic centers with equal rates of conformity were considered as having equal performance. Results: Fifteen diagnostic centers performed mammographic studies for SUS and reported 31,198 screening mammograms. Regarding BI-RADS classification, none of the diagnostic centers was in conformity for all categories: one center presented conformity in five categories, two centers in four categories, three centers in three categories, two centers in two categories, four centers in one category, and three centers in no category. Conclusion: The results of the present study demonstrate unevenness in the diagnostic centers' performance in the classification of mammograms reported to SISMAMA from the opportunistic screening undertaken by SUS.
Abstract:
Objective: To analyze quantitatively the chest radiographs of patients with and without chronic obstructive pulmonary disease (COPD) and to determine whether the data obtained from such radiographic images could classify such individuals according to the presence or absence of disease. Materials and Methods: Three groups of chest radiographic images were utilized, namely: group 1, including 25 individuals with COPD; group 2, including 27 individuals without COPD; and group 3 (utilized for the reclassification/validation of the analysis), including 15 individuals with COPD. The COPD classification was based on spirometry. The variables normalized by retrosternal height were the following: pulmonary width (LARGP); levels of right (ALBDIR) and left (ALBESQ) diaphragmatic eventration; costophrenic angle (ANGCF); and right (DISDIR) and left (DISESQ) intercostal distances. Results: When the radiographic images of patients with and without COPD were compared, statistically significant differences were observed between the two groups on the variables related to the diaphragm. In the COPD reclassification, the following variables presented the highest indices of correct classification: ANGCF (80%), ALBDIR (73.3%), ALBESQ (86.7%). Conclusion: The radiographic assessment of the chest demonstrated that the variables related to the diaphragm allow a better differentiation between individuals with and without COPD.
Abstract:
Renal cystic lesions are commonly diagnosed in radiological practice, and their characterization is therefore crucial to determine the clinical approach to be adopted and the prognosis. The Bosniak classification, based on computed tomography findings, has allowed for standardization and categorization of lesions in increasing order of malignancy (I, II, IIF, III and IV) in a simple and accurate way. The present iconographic essay, developed with multidetector computed tomography images of selected cases from the archives of the authors' institution, is aimed at describing imaging findings that can help in the diagnosis of renal cysts.
Abstract:
Renal cell carcinoma (RCC) is the seventh most common histological type of cancer in the Western world and has shown a sustained increase in its prevalence. The histological classification of RCCs is of utmost importance, considering the significant prognostic and therapeutic implications of its histological subtypes. Imaging methods play an outstanding role in the diagnosis, staging and follow-up of RCC. Clear cell, papillary and chromophobe are the most common histological subtypes of RCC, and their preoperative radiological characterization, either followed or not by confirmatory percutaneous biopsy, may be particularly useful in cases of poor surgical condition, metastatic disease, central mass in a solitary kidney, and in patients eligible for molecular targeted therapy. New strategies recently developed for treating renal cancer, such as cryoablation and radiofrequency ablation, molecularly targeted therapy and active surveillance, also require appropriate preoperative characterization of renal masses. Less common histological types, although sharing nonspecific imaging features, may be suspected on the basis of clinical and epidemiological data. The present study is aimed at reviewing the main clinical and imaging findings of histological RCC subtypes.
Abstract:
Objective: To assess the cutoff values established by ROC curves to classify 18F-NaF uptake as normal or malignant. Materials and Methods: PET/CT images were acquired 1 hour after administration of 185 MBq of 18F-NaF. Volumes of interest (VOIs) were drawn on three regions of the skeleton as follows: proximal right humerus diaphysis (HD), proximal right femoral diaphysis (FD) and first vertebral body (VB1), in a total of 254 patients, totalling 762 VOIs. The uptake in the VOIs was classified as normal or malignant on the basis of the radiopharmaceutical distribution pattern and of the CT images. A total of 675 volumes were classified as normal and 52 were classified as malignant. Thirty-five VOIs classified as indeterminate or nonmalignant lesions were excluded from analysis. The standardized uptake values (SUVs) measured in the VOIs were plotted on an ROC curve for each one of the three regions. The area under the ROC curve (AUC) as well as the best cutoff SUVs to classify the VOIs were calculated. The best cutoff values were established as those with the highest sum of sensitivity and specificity. Results: The AUCs were 0.933, 0.889 and 0.975 for HD, FD and VB1, respectively. The best SUV cutoffs were 9.0 (sensitivity: 73%; specificity: 99%), 8.4 (sensitivity: 79%; specificity: 94%) and 21.0 (sensitivity: 93%; specificity: 95%) for HD, FD and VB1, respectively. Conclusion: The best cutoff value varies according to the bone region analyzed, and it is not possible to establish one value for the whole body.
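The cutoff rule described in this abstract (choose the SUV threshold with the highest sum of sensitivity and specificity) can be sketched as follows. This is a minimal illustration, not the study's code: the function name is hypothetical, the SUV lists are invented toy values rather than study data, and the sketch assumes that a VOI is called malignant when its SUV is at or above the cutoff.

```python
def best_cutoff(normal_suvs, malignant_suvs):
    """Return (cutoff, sensitivity, specificity) with the highest sens + spec.

    Assumption: a VOI is classified as malignant when SUV >= cutoff.
    Every observed SUV is tried as a candidate cutoff.
    """
    candidates = sorted(set(normal_suvs) | set(malignant_suvs))
    best = None
    for cutoff in candidates:
        # Sensitivity: fraction of malignant VOIs at or above the cutoff.
        sens = sum(v >= cutoff for v in malignant_suvs) / len(malignant_suvs)
        # Specificity: fraction of normal VOIs below the cutoff.
        spec = sum(v < cutoff for v in normal_suvs) / len(normal_suvs)
        if best is None or sens + spec > best[1] + best[2]:
            best = (cutoff, sens, spec)
    return best


# Invented toy values, for illustration only (not data from the study).
normal = [3.1, 4.0, 5.2, 6.8, 7.5, 9.4]
malignant = [8.0, 9.0, 12.5, 15.1]
cutoff, sensitivity, specificity = best_cutoff(normal, malignant)
```

With these toy values the selected cutoff is 8.0: all four malignant values are at or above it, and five of the six normal values fall below it. Running the same function separately for each skeletal region would give one cutoff per region, which matches the per-region analysis in the abstract.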
Abstract:
Fluent health information flow is critical for clinical decision-making. However, a considerable part of this information is free-form text, and the inability to utilize it creates risks to patient safety and cost-effective hospital administration. Methods for automated processing of clinical text are emerging. The aim of this doctoral dissertation is to study machine learning and clinical text in order to support health information flow. First, by analyzing the content of authentic patient records, the aim is to specify clinical needs in order to guide the development of machine learning applications. The contributions are a model of the ideal information flow, a model of the problems and challenges in reality, and a road map for the technology development. Second, by developing applications for practical cases, the aim is to concretize ways to support health information flow. Altogether five machine learning applications for three practical cases are described: The first two applications are binary classification and regression related to the practical case of topic labeling and relevance ranking. The third and fourth applications are supervised and unsupervised multi-class classification for the practical case of topic segmentation and labeling. These four applications are tested with Finnish intensive care patient records. The fifth application is multi-label classification for the practical task of diagnosis coding. It is tested with English radiology reports. The performance of all these applications is promising. Third, the aim is to study how the quality of machine learning applications can be reliably evaluated. The associations between performance evaluation measures and methods are addressed, and a new hold-out method is introduced. This method contributes not only to processing time but also to the evaluation diversity and quality. The main conclusion is that developing machine learning applications for text requires interdisciplinary, international collaboration.
Practical cases are very different, and hence the development must begin from genuine user needs and domain expertise. The technological expertise must cover linguistics, machine learning, and information systems. Finally, the methods must be evaluated both statistically and through authentic user feedback.
Abstract:
Twelve single-pustule isolates of Uromyces appendiculatus, the etiological agent of common bean rust, were collected in the state of Minas Gerais, Brazil, and classified according to the new international differential series and the binary nomenclature system proposed during the 3rd Bean Rust Workshop. These isolates have been used to select rust-resistant genotypes in a bean breeding program conducted by our group. The twelve isolates were classified into seven different physiological races: 21-3, 29-3, 53-3, 53-19, 61-3, 63-3 and 63-19. Races 61-3 and 63-3 were the most frequent in the area. They were represented by five and two isolates, respectively. The other races were represented by just one isolate. This is the first time the new international classification procedure has been used for U. appendiculatus physiological races in Brazil. The general adoption of this system will facilitate information exchange, allowing the cooperative use of the results obtained by different research groups throughout the world. The differential cultivars Mexico 309, Mexico 235 and PI 181996 showed resistance to all of the isolates that were characterized. It is suggested that these cultivars should be preferentially used as sources for resistance to rust in breeding programs targeting development lines adapted to the state of Minas Gerais.
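The abstract names races such as 21-3 and 63-19 without spelling out how the binary nomenclature works. The sketch below rests on an assumption that the abstract does not confirm: that each of the two differential sets has six cultivars carrying the binary values 1, 2, 4, 8, 16 and 32, and that each half of a race name is the sum of the values of the cultivars susceptible to the isolate. The function name is hypothetical.

```python
# Assumed binary values for the six differential cultivars in each set.
BINARY_VALUES = (1, 2, 4, 8, 16, 32)


def decode_race(race):
    """Split a race name such as '21-3' into the binary values behind each half.

    Assumption: each half is the sum of the values assigned to the
    differential cultivars that were susceptible to the isolate.
    """
    def susceptible_values(total):
        return [value for value in BINARY_VALUES if total & value]

    first, second = (int(part) for part in race.split("-"))
    return susceptible_values(first), susceptible_values(second)


race_21_3 = decode_race("21-3")
race_63_19 = decode_race("63-19")
```

Under this assumption, race 21-3 decodes to values 1, 4 and 16 in the first set and 1 and 2 in the second, while 63 (the sum of all six values) would mean that every cultivar in the first set was susceptible.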
Abstract:
Geographic Information System (GIS) is an indispensable software tool in forest planning. In forestry transportation, GIS can manage the data on the road network and solve some problems in transportation, such as route planning. Therefore, the aim of this study was to determine the pattern of the road network and define transport routes using GIS technology. The present research was conducted in a forestry company in the state of Minas Gerais, Brazil. The criteria used to classify the pattern of forest roads were horizontal and vertical geometry, and pavement type. In order to determine transport routes, a network analysis data model was created in ArcGIS using the Network Analyst extension, making it possible to find the shortest and fastest routes. The results showed a predominance of the horizontal geometry classes average (3) and bad (4), indicating the presence of winding roads. For the vertical geometry criterion, the class of highly mountainous relief (4) had the greatest extent of roads. Regarding the type of pavement, secondary coating was the most frequent (75%), followed by primary coating (20%) and asphalt pavement (5%). The best route was the one that allowed the transport vehicle to travel at a higher speed, given the road pattern found in the study.
Abstract:
This study aimed to propose methods to identify croplands cultivated with winter cereals in the northern region of Rio Grande do Sul State, Brazil. To this end, temporal profiles of the Normalized Difference Vegetation Index (NDVI) from the MODIS sensor, from April to December of 2000 to 2008, were analyzed. First, crop masks were built by subtracting the minimum NDVI image (April to May) from the maximum NDVI image (June to October). Then, an unsupervised classification of NDVI images was carried out (Isodata), considering the crop mask areas. According to the results, crop masks allowed the identification of pixels with the greatest green biomass variation. This variation may or may not be associated with winter cereal areas established for grain production. The unsupervised classification generated classes in which NDVI temporal profiles were associated with water bodies, pastures, and winter cereals for grain production and for soil cover. Temporal NDVI profiles of the class winter cereals for grain production were in agreement with crop patterns in the region (developmental stage, management standard and sowing dates). Therefore, unsupervised classification based on crop masks allows distinguishing and monitoring winter cereal crops, which are similar in terms of morphology and phenology.
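The crop-mask step described above (maximum NDVI of June to October minus minimum NDVI of April to May) can be sketched per pixel as follows. This is only an illustration: the function names, the monthly toy profiles and the 0.3 amplitude threshold are assumptions, since the abstract does not state how the difference image was turned into a mask.

```python
LOW_MONTHS = ("Apr", "May")                       # minimum NDVI window
HIGH_MONTHS = ("Jun", "Jul", "Aug", "Sep", "Oct")  # maximum NDVI window


def ndvi_amplitude(profile):
    """Maximum NDVI (June to October) minus minimum NDVI (April to May)."""
    highest = max(profile[month] for month in HIGH_MONTHS)
    lowest = min(profile[month] for month in LOW_MONTHS)
    return highest - lowest


def crop_mask(profiles, threshold=0.3):
    """Flag pixels whose NDVI amplitude reaches a hypothetical threshold."""
    return [ndvi_amplitude(profile) >= threshold for profile in profiles]


# Invented monthly NDVI profiles for two pixels (not data from the study).
cereal_like = {"Apr": 0.30, "May": 0.25, "Jun": 0.45, "Jul": 0.70,
               "Aug": 0.85, "Sep": 0.75, "Oct": 0.40}
forest_like = {"Apr": 0.80, "May": 0.78, "Jun": 0.79, "Jul": 0.80,
               "Aug": 0.81, "Sep": 0.80, "Oct": 0.82}
mask = crop_mask([cereal_like, forest_like])
```

The cereal-like pixel has a large seasonal amplitude and is kept, while the forest-like pixel stays nearly constant and is masked out. In the workflow the abstract describes, only the kept pixels would then go on to the unsupervised (Isodata) classification.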