820 resultados para Data classification
Resumo:
Satellite image classification involves designing and developing efficient image classifiers. With satellite image data and image analysis methods multiplying rapidly, selecting the right mix of data sources and data analysis approaches has become critical to the generation of quality land-use maps. In this study, a new postprocessing information fusion algorithm for the extraction and representation of land-use information based on high-resolution satellite imagery is presented. This approach can produce land-use maps with sharp interregional boundaries and homogeneous regions. The proposed approach is conducted in five steps. First, a GIS layer - ATKIS data - was used to generate two coarse homogeneous regions, i.e. urban and rural areas. Second, a thematic (class) map was generated by use of a hybrid spectral classifier combining Gaussian Maximum Likelihood algorithm (GML) and ISODATA classifier. Third, a probabilistic relaxation algorithm was performed on the thematic map, resulting in a smoothed thematic map. Fourth, edge detection and edge thinning techniques were used to generate a contour map with pixel-width interclass boundaries. Fifth, the contour map was superimposed on the thematic map by use of a region-growing algorithm with the contour map and the smoothed thematic map as two constraints. For the operation of the proposed method, a software package is developed using programming language C. This software package comprises the GML algorithm, a probabilistic relaxation algorithm, TBL edge detector, an edge thresholding algorithm, a fast parallel thinning algorithm, and a region-growing information fusion algorithm. The county of Landau of the State Rheinland-Pfalz, Germany was selected as a test site. The high-resolution IRS-1C imagery was used as the principal input data.
Resumo:
Nowadays communication is switching from a centralized scenario, where communication media like newspapers, radio, TV programs produce information and people are just consumers, to a completely different decentralized scenario, where everyone is potentially an information producer through the use of social networks, blogs, forums that allow a real-time worldwide information exchange. These new instruments, as a result of their widespread diffusion, have started playing an important socio-economic role. They are the most used communication media and, as a consequence, they constitute the main source of information enterprises, political parties and other organizations can rely on. Analyzing data stored in servers all over the world is feasible by means of Text Mining techniques like Sentiment Analysis, which aims to extract opinions from huge amount of unstructured texts. This could lead to determine, for instance, the user satisfaction degree about products, services, politicians and so on. In this context, this dissertation presents new Document Sentiment Classification methods based on the mathematical theory of Markov Chains. All these approaches bank on a Markov Chain based model, which is language independent and whose killing features are simplicity and generality, which make it interesting with respect to previous sophisticated techniques. Every discussed technique has been tested in both Single-Domain and Cross-Domain Sentiment Classification areas, comparing performance with those of other two previous works. The performed analysis shows that some of the examined algorithms produce results comparable with the best methods in literature, with reference to both single-domain and cross-domain tasks, in $2$-classes (i.e. positive and negative) Document Sentiment Classification. However, there is still room for improvement, because this work also shows the way to walk in order to enhance performance, that is, a good novel feature selection process would be enough to outperform the state of the art. Furthermore, since some of the proposed approaches show promising results in $2$-classes Single-Domain Sentiment Classification, another future work will regard validating these results also in tasks with more than $2$ classes.
Resumo:
This thesis is aimed to assess similarities and mismatches between the outputs from two independent methods for the cloud cover quantification and classification based on quite different physical basis. One of them is the SAFNWC software package designed to process radiance data acquired by the SEVIRI sensor in the VIS/IR. The other is the MWCC algorithm, which uses the brightness temperatures acquired by the AMSU-B and MHS sensors in their channels centered in the MW water vapour absorption band. At a first stage their cloud detection capability has been tested, by comparing the Cloud Masks they produced. These showed a good agreement between two methods, although some critical situations stand out. The MWCC, in effect, fails to reveal clouds which according to SAFNWC are fractional, cirrus, very low and high opaque clouds. In the second stage of the inter-comparison the pixels classified as cloudy according to both softwares have been. The overall observed tendency of the MWCC method, is an overestimation of the lower cloud classes. Viceversa, the more the cloud top height grows up, the more the MWCC not reveal a certain cloud portion, rather detected by means of the SAFNWC tool. This is what also emerges from a series of tests carried out by using the cloud top height information in order to evaluate the height ranges in which each MWCC category is defined. Therefore, although the involved methods intend to provide the same kind of information, in reality they return quite different details on the same atmospheric column. The SAFNWC retrieval being very sensitive to the top temperature of a cloud, brings the actual level reached by this. The MWCC, by exploiting the capability of the microwaves, is able to give an information about the levels that are located more deeply within the atmospheric column.
Resumo:
PURPOSE: The aim of the present study was to assess the oral mucosal health status of young male adults (aged 18 to 24 years) in Switzerland and to correlate their clinical findings with self-reported risk factors such as tobacco use and alcohol consumption. MATERIALS AND METHODS: Data on the oral health status of 615 Swiss Army recruits were collected using a standardised self-reported questionnaire, followed by an intraoral examination. Positive clinical findings were classified as (1) common conditions and anatomical variants, (2) reactive lesions, (3) benign tumour lesions and (4) premalignant lesions. The main locations of the oral mucosal findings were recorded on a topographical classification chart. Using correlational statistics, the findings were further associated with the known risk factors such as tobacco use and alcohol consumption. RESULTS: A total of 468 findings were diagnosed in 327 (53.17%) of the 615 subjects. In total, 445 findings (95.09%) were classified as common conditions, anatomical variants and reactive soft-tissue lesions. In the group of reactive soft-tissue lesions, there was a significantly higher percentage of smokers (P < 0.001) and subjects with a combination of smoking and alcohol consumption (P < 0.001). Eight lesions were clinically diagnosed as oral leukoplakias associated with smokeless tobacco. The prevalence of precursor lesions in the population examined was over 1%. CONCLUSIONS: Among young male adults in Switzerland, a significant number of oral mucosal lesions can be identified, which strongly correlate with tobacco use. To improve primary and secondary prevention, young adults should therefore be informed more extensively about the negative effects of tobacco use on oral health.
Resumo:
We present an update on clinical evaluation, staging, classification and treatment of canal cholesteatoma, including a meta-analysis of clinical data of the last 30 years.
Resumo:
We conducted a qualitative, multicenter study using a focus group design to explore the lived experiences of persons with any kind of primary sleep disorder with regard to functioning and contextual factors using six open-ended questions related to the International Classification of Functioning, Disability and Health (ICF) components. We classified the results using the ICF as a frame of reference. We identified the meaningful concepts within the transcribed data and then linked them to ICF categories according to established linking rules. The six focus groups with 27 participants yielded a total of 6986 relevant concepts, which were linked to a total of 168 different second-level ICF categories. From the patient perspective, the ICF components: (1) Body Functions; (2) Activities & Participation; and (3) Environmental Factors were equally represented; while (4) Body Structures appeared poignantly less frequently. Out of the total number of concepts, 1843 concepts (26%) were assigned to the ICF component Personal Factors, which is not yet classified but could indicate important aspects of resource management and strategy development of those who have a sleep disorder. Therefore, treatment of patients with sleep disorders must not be limited to anatomical and (patho-)physiological changes, but should also consider a more comprehensive view that includes patient's demands, strategies and resources in daily life and the contextual circumstances surrounding the individual.
Resumo:
We conducted an explorative, cross-sectional, multi-centre study in order to identify the most common problems of people with any kind of (primary) sleep disorder in a clinical setting using the International Classification of Functioning, Disability and Health (ICF) as a frame of reference. Data were collected from patients using a structured face-to-face interview of 45-60 min duration. A case record form for health professionals containing the extended ICF Checklist, sociodemographic variables and disease-specific variables was used. The study centres collected data of 99 individuals with sleep disorders. The identified categories include 48 (32%) for body functions, 13 (9%) body structures, 55 (37%) activities and participation and 32 (22%) for environmental factors. 'Sleep functions' (100%) and 'energy and drive functions', respectively, (85%) were the most severely impaired second-level categories of body functions followed by 'attention functions' (78%) and 'temperament and personality functions' (77%). With regard to the component activities and participation, patients felt most restricted in the categories of 'watching' (e.g. TV) (82%), 'recreation and leisure' (75%) and 'carrying out daily routine' (74%). Within the component environmental factors the categories 'support of immediate family', 'health services, systems and policies' and 'products or substances for personal consumption [medication]' were the most important facilitators; 'time-related changes', 'light' and 'climate' were the most important barriers. The study identified a large variety of functional problems reflecting the complexity of sleep disorders. The ICF has the potential to provide a comprehensive framework for the description of functional health in individuals with sleep disorders in a clinical setting.
Resumo:
Membrane interactions of porphyrinic photosensitizers (PSs) are known to play a crucial role for PS efficiency in photodynamic therapy (PDT). In the current paper, the interactions between 15 different porphyrinic PSs with various hydrophilic/lipophilic properties and phospholipid bilayers were probed by NMR spectroscopy. Unilamellar vesicles consisting of dioleoyl-phosphatidyl-choline (DOPC) were used as membrane models. PS-membrane interactions were deduced from analysis of the main DOPC (1)H-NMR resonances (choline and lipid chain signals). Initial membrane adsorption of the PSs was indicated by induced changes to the DOPC choline signal, i.e. a split into inner and outer choline peaks. Based on this parameter, the PSs could be classified into two groups, Type-A PSs causing a split and the Type-B PSs causing no split. A further classification into two subgroups each, A1, A2 and B1, B2 was based on the observed time-dependent changes of the main DOPC NMR signals following initial PS adsorption. Four different time-correlated patterns were found indicating different levels and rates of PS penetration into the hydrophobic membrane interior. The type of interaction was mainly affected by the amphiphilicity and the overall lipophilicity of the applied PS structures. In conclusion, the NMR data provided valuable structural and dynamic insights into the PS-membrane interactions which allow deriving the structural constraints for high membrane affinity and high membrane penetration of a given PS. (C) 2011 Elsevier B.V. All rights reserved.
Resumo:
Current methods to characterize mesenchymal stem cells (MSCs) are limited to CD marker expression, plastic adherence and their ability to differentiate into adipogenic, osteogenic and chondrogenic precursors. It seems evident that stem cells undergoing differentiation should differ in many aspects, such as morphology and possibly also behaviour; however, such a correlation has not yet been exploited for fate prediction of MSCs. Primary human MSCs from bone marrow were expanded and pelleted to form high-density cultures and were then randomly divided into four groups to differentiate into adipogenic, osteogenic chondrogenic and myogenic progenitor cells. The cells were expanded as heterogeneous and tracked with time-lapse microscopy to record cell shape, using phase-contrast microscopy. The cells were segmented using a custom-made image-processing pipeline. Seven morphological features were extracted for each of the segmented cells. Statistical analysis was performed on the seven-dimensional feature vectors, using a tree-like classification method. Differentiation of cells was monitored with key marker genes and histology. Cells in differentiation media were expressing the key genes for each of the three pathways after 21 days, i.e. adipogenic, osteogenic and chondrogenic, which was also confirmed by histological staining. Time-lapse microscopy data were obtained and contained new evidence that two cell shape features, eccentricity and filopodia (= 'fingers') are highly informative to classify myogenic differentiation from all others. However, no robust classifiers could be identified for the other cell differentiation paths. The results suggest that non-invasive automated time-lapse microscopy could potentially be used to predict the stem cell fate of hMSCs for clinical application, based on morphology for earlier time-points. The classification is challenged by cell density, proliferation and possible unknown donor-specific factors, which affect the performance of morphology-based approaches. Copyright © 2012 John Wiley & Sons, Ltd.
Resumo:
Prediction of clinical outcome in cancer is usually achieved by histopathological evaluation of tissue samples obtained during surgical resection of the primary tumor. Traditional tumor staging (AJCC/UICC-TNM classification) summarizes data on tumor burden (T), presence of cancer cells in draining and regional lymph nodes (N) and evidence for metastases (M). However, it is now recognized that clinical outcome can significantly vary among patients within the same stage. The current classification provides limited prognostic information, and does not predict response to therapy. Recent literature has alluded to the importance of the host immune system in controlling tumor progression. Thus, evidence supports the notion to include immunological biomarkers, implemented as a tool for the prediction of prognosis and response to therapy. Accumulating data, collected from large cohorts of human cancers, has demonstrated the impact of immune-classification, which has a prognostic value that may add to the significance of the AJCC/UICC TNM-classification. It is therefore imperative to begin to incorporate the 'Immunoscore' into traditional classification, thus providing an essential prognostic and potentially predictive tool. Introduction of this parameter as a biomarker to classify cancers, as part of routine diagnostic and prognostic assessment of tumors, will facilitate clinical decision-making including rational stratification of patient treatment. Equally, the inherent complexity of quantitative immunohistochemistry, in conjunction with protocol variation across laboratories, analysis of different immune cell types, inconsistent region selection criteria, and variable ways to quantify immune infiltration, all underline the urgent requirement to reach assay harmonization. In an effort to promote the Immunoscore in routine clinical settings, an international task force was initiated. This review represents a follow-up of the announcement of this initiative, and of the J Transl Med. editorial from January 2012. Immunophenotyping of tumors may provide crucial novel prognostic information. The results of this international validation may result in the implementation of the Immunoscore as a new component for the classification of cancer, designated TNM-I (TNM-Immune).
Resumo:
With recent advances in mass spectrometry techniques, it is now possible to investigate proteins over a wide range of molecular weights in small biological specimens. This advance has generated data-analytic challenges in proteomics, similar to those created by microarray technologies in genetics, namely, discovery of "signature" protein profiles specific to each pathologic state (e.g., normal vs. cancer) or differential profiles between experimental conditions (e.g., treated by a drug of interest vs. untreated) from high-dimensional data. We propose a data analytic strategy for discovering protein biomarkers based on such high-dimensional mass-spectrometry data. A real biomarker-discovery project on prostate cancer is taken as a concrete example throughout the paper: the project aims to identify proteins in serum that distinguish cancer, benign hyperplasia, and normal states of prostate using the Surface Enhanced Laser Desorption/Ionization (SELDI) technology, a recently developed mass spectrometry technique. Our data analytic strategy takes properties of the SELDI mass-spectrometer into account: the SELDI output of a specimen contains about 48,000 (x, y) points where x is the protein mass divided by the number of charges introduced by ionization and y is the protein intensity of the corresponding mass per charge value, x, in that specimen. Given high coefficients of variation and other characteristics of protein intensity measures (y values), we reduce the measures of protein intensities to a set of binary variables that indicate peaks in the y-axis direction in the nearest neighborhoods of each mass per charge point in the x-axis direction. We then account for a shifting (measurement error) problem of the x-axis in SELDI output. After these pre-analysis processing of data, we combine the binary predictors to generate classification rules for cancer, benign hyperplasia, and normal states of prostate. Our approach is to apply the boosting algorithm to select binary predictors and construct a summary classifier. We empirically evaluate sensitivity and specificity of the resulting summary classifiers with a test dataset that is independent from the training dataset used to construct the summary classifiers. The proposed method performed nearly perfectly in distinguishing cancer and benign hyperplasia from normal. In the classification of cancer vs. benign hyperplasia, however, an appreciable proportion of the benign specimens were classified incorrectly as cancer. We discuss practical issues associated with our proposed approach to the analysis of SELDI output and its application in cancer biomarker discovery.
Resumo:
The advances in computational biology have made simultaneous monitoring of thousands of features possible. The high throughput technologies not only bring about a much richer information context in which to study various aspects of gene functions but they also present challenge of analyzing data with large number of covariates and few samples. As an integral part of machine learning, classification of samples into two or more categories is almost always of interest to scientists. In this paper, we address the question of classification in this setting by extending partial least squares (PLS), a popular dimension reduction tool in chemometrics, in the context of generalized linear regression based on a previous approach, Iteratively ReWeighted Partial Least Squares, i.e. IRWPLS (Marx, 1996). We compare our results with two-stage PLS (Nguyen and Rocke, 2002A; Nguyen and Rocke, 2002B) and other classifiers. We show that by phrasing the problem in a generalized linear model setting and by applying bias correction to the likelihood to avoid (quasi)separation, we often get lower classification error rates.
Resumo:
OBJECT: In this study, 1H magnetic resonance (MR) spectroscopy was prospectively tested as a reliable method for presurgical grading of neuroepithelial brain tumors. METHODS: Using a database of tumor spectra obtained in patients with histologically confirmed diagnoses, 94 consecutive untreated patients were studied using single-voxel 1H spectroscopy (point-resolved spectroscopy; TE 135 msec, TE 135 msec, TR 1500 msec). A total of 90 tumor spectra obtained in patients with diagnostic 1H MR spectroscopy examinations were analyzed using commercially available software (MRUI/VARPRO) and classified using linear discriminant analysis as World Health Organization (WHO) Grade I/II, WHO Grade III, or WHO Grade IV lesions. In all cases, the classification results were matched with histopathological diagnoses that were made according to the WHO classification criteria after serial stereotactic biopsy procedures or open surgery. Histopathological studies revealed 30 Grade I/II tumors, 29 Grade III tumors, and 31 Grade IV tumors. The reliability of the histological diagnoses was validated considering a minimum postsurgical follow-up period of 12 months (range 12-37 months). Classifications based on spectroscopic data yielded 31 tumors in Grade I/II, 32 in Grade III, and 27 in Grade IV. Incorrect classifications included two Grade II tumors, one of which was identified as Grade III and one as Grade IV; two Grade III tumors identified as Grade II; two Grade III lesions identified as Grade IV; and six Grade IV tumors identified as Grade III. Furthermore, one glioblastoma (WHO Grade IV) was classified as WHO Grade I/II. This represents an overall success rate of 86%, and a 95% success rate in differentiating low-grade from high-grade tumors. CONCLUSIONS: The authors conclude that in vivo 1H MR spectroscopy is a reliable technique for grading neuroepithelial brain tumors.
Resumo:
In 1998-2001 Finland suffered the most severe insect outbreak ever recorded, over 500,000 hectares. The outbreak was caused by the common pine sawfly (Diprion pini L.). The outbreak has continued in the study area, Palokangas, ever since. To find a good method to monitor this type of outbreaks, the purpose of this study was to examine the efficacy of multi-temporal ERS-2 and ENVISAT SAR imagery for estimating Scots pine (Pinus sylvestris L.) defoliation. Three methods were tested: unsupervised k-means clustering, supervised linear discriminant analysis (LDA) and logistic regression. In addition, I assessed if harvested areas could be differentiated from the defoliated forest using the same methods. Two different speckle filters were used to determine the effect of filtering on the SAR imagery and subsequent results. The logistic regression performed best, producing a classification accuracy of 81.6% (kappa 0.62) with two classes (no defoliation, >20% defoliation). LDA accuracy was with two classes at best 77.7% (kappa 0.54) and k-means 72.8 (0.46). In general, the largest speckle filter, 5 x 5 image window, performed best. When additional classes were added the accuracy was usually degraded on a step-by-step basis. The results were good, but because of the restrictions in the study they should be confirmed with independent data, before full conclusions can be made that results are reliable. The restrictions include the small size field data and, thus, the problems with accuracy assessment (no separate testing data) as well as the lack of meteorological data from the imaging dates.
Resumo:
Utilizing remote sensing methods to assess landscape-scale ecological change are rapidly becoming a dominant force in the natural sciences. Powerful and robust non-parametric statistical methods are also actively being developed to compliment the unique characteristics of remotely sensed data. The focus of this research is to utilize these powerful, robust remote sensing and statistical approaches to shed light on woody plant encroachment into native grasslands--a troubling ecological phenomenon occurring throughout the world. Specifically, this research investigates western juniper encroachment within the sage-steppe ecosystem of the western USA. Western juniper trees are native to the intermountain west and are ecologically important by means of providing structural diversity and habitat for many species. However, after nearly 150 years of post-European settlement changes to this threatened ecosystem, natural ecological processes such as fire regimes no longer limit the range of western juniper to rocky refugia and other areas protected from short fire return intervals that are historically common to the region. Consequently, sage-steppe communities with high juniper densities exhibit negative impacts, such as reduced structural diversity, degraded wildlife habitat and ultimately the loss of biodiversity. Much of today's sage-steppe ecosystem is transitioning to juniper woodlands. Additionally, the majority of western juniper woodlands have not reached their full potential in both range and density. The first section of this research investigates the biophysical drivers responsible for juniper expansion patterns observed in the sage-steppe ecosystem. The second section is a comprehensive accuracy assessment of classification methods used to identify juniper tree cover from multispectral 1 m spatial resolution aerial imagery.