940 results for Feature Classification
Abstract:
These Facts sheets have been developed to provide a multitude of information about executive branch agencies/departments on a single sheet of paper. The Facts provides general information, contact information, workforce data, leave & benefits information, and affirmative action data. This is the most recent update of information for the fiscal year 2007.
Abstract:
Vulvar cancer is a rare disease, and its screening depends on the quality and relevance of the clinical examination. The incidence of vulvar cancer, and especially of its precancerous lesions, vulvar intraepithelial neoplasias (VIN), has increased in recent years. The new terminology of vulvar intraepithelial neoplasia will help to identify the high-risk groups that could develop cancer: usual and differentiated VIN. Early diagnosis is essential for proposing adequate treatment. Management is a major concern given the rising incidence of these lesions in younger women. Until a benefit from vaccination against human papillomavirus can be observed, the quality of screening must be improved through careful examination of the vulva.
Batch effect confounding leads to strong bias in performance estimates obtained by cross-validation.
Abstract:
BACKGROUND: With the large amount of biological data that is currently publicly available, many investigators combine multiple data sets to increase the sample size and potentially also the power of their analyses. However, technical differences ("batch effects") as well as differences in sample composition between the data sets may significantly affect the ability to draw generalizable conclusions from such studies. FOCUS: The current study focuses on the construction of classifiers, and the use of cross-validation to estimate their performance. In particular, we investigate the impact of batch effects and differences in sample composition between batches on the accuracy of the classification performance estimate obtained via cross-validation. The focus on estimation bias is a main difference compared to previous studies, which have mostly focused on the predictive performance and how it relates to the presence of batch effects. DATA: We work on simulated data sets. To have realistic intensity distributions, we use real gene expression data as the basis for our simulation. Random samples from this expression matrix are selected and assigned to group 1 (e.g., 'control') or group 2 (e.g., 'treated'). We introduce batch effects and select some features to be differentially expressed between the two groups. We consider several scenarios for our study, most importantly different levels of confounding between groups and batch effects. METHODS: We focus on well-known classifiers: logistic regression, Support Vector Machines (SVM), k-nearest neighbors (kNN) and Random Forests (RF). Feature selection is performed with the Wilcoxon test or the lasso. Parameter tuning and feature selection, as well as the estimation of the prediction performance of each classifier, are performed within a nested cross-validation scheme. The estimated classification performance is then compared to what is obtained when applying the classifier to independent data.
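The bias described in this abstract can be illustrated with a small simulation. The sketch below is not the study's pipeline: it assumes purely synthetic Gaussian features instead of real expression data, a nearest-centroid classifier instead of the listed classifiers, and plain k-fold rather than nested cross-validation. All function names and parameter values (`simulate`, `confounding`, `batch_shift`, and so on) are hypothetical. The data carry no true group signal, only a batch shift that is fully confounded with the group label.

```python
import random


def simulate(n, n_features, confounding, batch_shift, rng):
    """Simulate samples with NO true group signal, only a batch shift.

    `confounding` is the probability that a sample's batch equals its group
    (1.0 = fully confounded, 0.5 = batch independent of group).
    """
    X, y = [], []
    for _ in range(n):
        group = rng.randint(0, 1)
        batch = group if rng.random() < confounding else 1 - group
        shift = batch_shift if batch == 1 else 0.0
        X.append([rng.gauss(0.0, 1.0) + shift for _ in range(n_features)])
        y.append(group)
    return X, y


def fit_centroids(X, y):
    """Nearest-centroid classifier: one mean vector per group."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label of the closest centroid (squared Euclidean distance)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


def accuracy(centroids, X, y):
    hits = sum(predict(centroids, x) == lab for x, lab in zip(X, y))
    return hits / len(y)


def cross_val_accuracy(X, y, k=5):
    """Plain k-fold cross-validation with interleaved folds."""
    n = len(y)
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        X_tr = [X[i] for i in range(n) if i not in test_idx]
        y_tr = [y[i] for i in range(n) if i not in test_idx]
        X_te = [X[i] for i in sorted(test_idx)]
        y_te = [y[i] for i in sorted(test_idx)]
        scores.append(accuracy(fit_centroids(X_tr, y_tr), X_te, y_te))
    return sum(scores) / k


rng = random.Random(0)

# Training data: group and batch fully confounded.
X, y = simulate(200, 20, confounding=1.0, batch_shift=1.0, rng=rng)
cv_acc = cross_val_accuracy(X, y)

# Independent data: batch is unrelated to group, so the learned rule fails.
model = fit_centroids(X, y)
X_new, y_new = simulate(200, 20, confounding=0.5, batch_shift=1.0, rng=rng)
indep_acc = accuracy(model, X_new, y_new)

print(f"cross-validated accuracy: {cv_acc:.2f}")
print(f"independent-data accuracy: {indep_acc:.2f}")
```

Under these assumptions the classifier learns the batch rather than the group, so the cross-validated estimate is optimistic while accuracy on independent data stays near chance.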
Abstract:
Staphylococcus aureus, especially when it is methicillin resistant, has been recognised as a major cause of nosocomial and community-acquired infections. It has also been shown that certain strains are able to cause clonal epidemics whereas others show a more incidental occurrence. On the basis of this behavioural distinction, a genetic feature underlying this difference in epidemicity can be assumed. Understanding the difference will not only contribute to the development of markers for the identification of epidemic strains but will also shed light on the evolution of clones. Genomes of strains from two independent collections (n=18 and n=10 strains) were analysed. Both collections were composed of carefully selected, genetically diverse strains with clinically well-defined epidemic and sporadic behaviour. Comparative genome hybridisation (CGH) was performed using an Agilent array for one collection (up to 11 probes per open reading frame, ORF), and an Affymetrix array for the other (up to 30 probes per ORF). Presence and absence information of probe homologues and ORFs was taken for analysis of molecular variance (AMOVA) at the strain and behaviour levels. Not a single probe showed 100% concordant differences between epidemic and sporadic strains. Moreover, probe differences between groups were always smaller than those within groups. This was also true when the analysis was focussed on presence versus absence of ORFs or when probe information was transformed into allelic profiles. These findings present strong evidence against the presence or absence of a single common specific genetic factor differentiating epidemic from sporadic S. aureus clones.
Abstract:
The absolute K magnitudes and kinematic parameters of about 350 oxygen-rich Long-Period Variable stars are calibrated, by means of an up-to-date maximum-likelihood method, using HIPPARCOS parallaxes and proper motions together with radial velocities and, as additional data, periods and V-K colour indices. Four groups, differing by their kinematics and mean magnitudes, are found. For each of them, we also obtain the distributions of magnitude, period and de-reddened colour of the base population, as well as de-biased period-luminosity-colour relations and their two-dimensional projections. The SRa semiregulars do not seem to constitute a separate class of LPVs. The SRb appear to belong to two populations of different ages. In a PL diagram, they constitute two evolutionary sequences towards the Mira stage. The Miras of the disk appear to pulsate on a lower-order mode. The slopes of their de-biased PL and PC relations are found to be very different from the ones of the Oxygen Miras of the LMC. This suggests that a significant number of so-called Miras of the LMC are misclassified. This also suggests that the Miras of the LMC do not constitute a homogeneous group, but include a significant proportion of metal-deficient stars, suggesting a relatively smooth star formation history. As a consequence, one may not trivially transpose the LMC period-luminosity relation from one galaxy to the other.
Abstract:
During the period 1996-2000, forty-three heavy rainfall events were detected in the Internal Basins of Catalonia (northeastern Spain). Most of these events caused floods and serious damage. This high number creates the need for a methodology to classify them on the basis of their surface rainfall distribution, their internal organization and their physical features. The aim of this paper is to present a methodology for systematically analyzing the convective structures responsible for those heavy rainfall events, on the basis of the information supplied by the meteorological radar. The proposed methodology is as follows. Firstly, the rainfall intensity and the surface rainfall pattern are analyzed on the basis of the raingauge data. Secondly, the convective structures at the lowest level are identified and characterized by using a 2-D algorithm, and the convective cells are identified by using a 3-D procedure that looks for the reflectivity cores in every radar volume. Thirdly, the convective cells (3-D) are associated with the 2-D structures (convective rainfall areas). This methodology has been applied to the 43 heavy rainfall events using the meteorological radar located near Barcelona and the SAIH automatic raingauge network.
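The second and third steps of this methodology can be sketched in a few lines. The example below is only an illustration under stated assumptions: the abstract does not give the algorithm's details, so the reflectivity threshold (43 dBZ here), the 4-connectivity rule, the toy grid, and the names `label_structures` and `associate_cells` are all hypothetical. It labels contiguous above-threshold areas in the lowest radar level as 2-D structures, then assigns each 3-D cell (represented only by the grid position of its reflectivity core) to the structure it falls in.

```python
from collections import deque


def label_structures(grid, threshold):
    """Label 4-connected areas where reflectivity >= threshold.

    Returns (labels, count); label 0 means "not part of any structure".
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] < threshold or labels[r][c]:
                continue
            # Start a new structure and flood-fill it breadth-first.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                i, j = queue.popleft()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not labels[ni][nj]
                            and grid[ni][nj] >= threshold):
                        labels[ni][nj] = count
                        queue.append((ni, nj))
    return labels, count


def associate_cells(cell_cores, labels):
    """Map each 3-D cell to the 2-D structure containing its core position."""
    return {name: labels[r][c] for name, (r, c) in cell_cores.items()}


# Toy lowest-level reflectivity field in dBZ (hypothetical values).
lowest_level = [
    [10, 45, 50, 10, 10],
    [10, 48, 12, 10, 44],
    [10, 10, 10, 10, 47],
]
# Hypothetical (row, col) positions of 3-D reflectivity cores.
cell_cores = {"cell_a": (0, 2), "cell_b": (2, 4), "cell_c": (2, 0)}

labels, n_structures = label_structures(lowest_level, threshold=43)
cell_to_structure = associate_cells(cell_cores, labels)
print(n_structures, cell_to_structure)
```

A cell mapped to 0 has its core outside every 2-D structure; a real procedure would presumably also use the vertical extent of each core, which this sketch ignores.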
Abstract:
INTRODUCTION: The 2004 version of the World Health Organization classification subdivides thymic epithelial tumors into A, AB, B1, B2, and B3 (and rare other) thymomas and thymic carcinomas (TC). Due to a morphological continuum between some thymoma subtypes and some morphological overlap between thymomas and TC, a variable proportion of cases may pose problems in classification, contributing to the poor interobserver reproducibility in some studies. METHODS: To overcome this problem, hematoxylin-eosin-stained and immunohistochemically processed sections of prototypic, "borderland," and "combined" thymomas and TC (n = 72) were studied by 18 pathologists at an international consensus slide workshop supported by the International Thymic Malignancy Interest Group. RESULTS: Consensus was achieved on refined criteria for decision making at the A/AB borderland, the distinction between B1, B2, and B3 thymomas and the separation of B3 thymomas from TCs. "Atypical type A thymoma" is tentatively proposed as a new type A thymoma variant. New reporting strategies for tumors with more than one histological pattern are proposed. CONCLUSION: These guidelines can set the stage for reproducibility studies and the design of a clinically meaningful grading system for thymic epithelial tumors.
Abstract:
Summary: International classification of the acid sulfate soils of Finland
Abstract:
111 patients with acute leukemia, including 29 children, were classified according to the surface markers and cytochemistry of their blasts. The acute leukemias were separated into two major groups (lymphoid and non-lymphoid) depending on the presence or absence of specific lymphoid markers. On the basis of these criteria a correlation of 94% with the hematological diagnosis was obtained. Acute lymphoblastic leukemia (ALL) was divisible into three sub-groups: 11 cases expressing T-cell specific markers were classified as T-ALL and 33 cases expressing the common ALL antigen (CALLA) as c-ALL. 18 of the latter expressed an additional marker, DSA (Daudi surface antigen), splitting c-ALL cases into two subgroups. Cytochemistry of the cases lacking specific surface markers (n = 67) served to diagnose 41 acute myeloid leukemia (AML) cases and 8 monoblastic leukemias. The remaining 18 cases could not be classified. The presence or absence of HLA-DR (Ia) antigens served to subdivide AML into two major subgroups. The prognostic significance of these new diagnostic splits is under active study.
Abstract:
BACKGROUND: Extensive research exists estimating the effect of hazardous alcohol use on morbidity and mortality, but little research quantifies the association between alcohol consumption and utility scores in patients with alcohol dependence. In the context of comparative research, the World Health Organisation (WHO) proposed to categorise the risk for alcohol-related acute and chronic harm according to patients' average daily alcohol consumption. OBJECTIVES: To estimate utility scores associated with each category of the WHO drinking risk-level classification in patients with alcohol dependence (AD). METHODS: We used data from CONTROL, an observational cohort study including 143 AD patients from the Alcohol Treatment Center at Lausanne University Hospital, followed for 12 months. Average daily alcohol consumption was assessed monthly using the Timeline Followback method and patients were categorised according to the WHO drinking risk-level classification: abstinent, low, medium, high and very high. Other measures, such as sociodemographic characteristics and utility scores derived from the EuroQoL 5-Dimensions questionnaire (EQ-5D), were collected every three months. Mixed models for repeated measures were used to estimate mean utility scores associated with WHO drinking risk-level categories. RESULTS: A total of 143 patients were included, and the 12-month follow-up permitted the assessment of 1318 person-months. At baseline the mean age of the patients was 44.6 (SD 11.8) and the majority of patients were male (63.6%). Using repeated measures analysis, utility scores decreased with increasing drinking levels, ranging from 0.80 in abstinent patients to 0.62 in patients with very high risk drinking level (p < 0.0001). CONCLUSIONS: In this sample of patients with alcohol dependence undergoing specialized care, utility scores estimated from the EQ-5D appeared to substantially and consistently vary according to patients' WHO drinking level.
Abstract:
CD34/QBEND10 immunostaining has been assessed in 150 bone marrow biopsies (BMB) including 91 myelodysplastic syndromes (MDS), 16 MDS-related AML, 25 reactive BMB, and 18 cases where RA could neither be established nor ruled out. All cases were reviewed and classified according to the clinical and morphological FAB criteria. The percentage of CD34-positive (CD34 +) hematopoietic cells and the number of clusters of CD34+ cells in 10 HPF were determined. In most cases the CD34+ cell count was similar to the blast percentage determined morphologically. In RA, however, not only typical blasts but also less immature hemopoietic cells lying morphologically between blasts and promyelocytes were stained with CD34. The CD34+ cell count and cluster values were significantly higher in RA than in BMB with reactive changes (p<0.0001 for both), in RAEB than in RA (p=0.0006 and p=0.0189, respectively), in RAEBt than in RAEB (p=0.0001 and p=0.0038), and in MDS-AML than in RAEBt (p<0.0001 and p=0.0007). Presence of CD34+ cell clusters in RA correlated with increased risk of progression of the disease. We conclude that CD34 immunostaining in BMB is a useful tool for distinguishing RA from other anemias, assessing blast percentage in MDS cases, classifying them according to FAB, and following their evolution.
Abstract:
BACKGROUND: Several studies have established Glioblastoma Multiforme (GBM) prognostic and predictive models based on age and Karnofsky Performance Status (KPS), while very few studies evaluated the prognostic and predictive significance of preoperative MR-imaging. However, to date, there is no simple preoperative GBM classification that also correlates with a highly prognostic genomic signature. Thus, we present for the first time a biologically relevant, and clinically applicable tumor Volume, patient Age, and KPS (VAK) GBM classification that can easily and non-invasively be determined upon patient admission. METHODS: We quantitatively analyzed the volumes of 78 GBM patient MRIs present in The Cancer Imaging Archive (TCIA) corresponding to patients in The Cancer Genome Atlas (TCGA) with VAK annotation. The variables were then combined using a simple 3-point scoring system to form the VAK classification. A validation set (N = 64) from both the TCGA and Rembrandt databases was used to confirm the classification. Transcription factor and genomic correlations were performed using the gene pattern suite and Ingenuity Pathway Analysis. RESULTS: VAK-A and VAK-B classes showed significant median survival differences in discovery (P = 0.007) and validation sets (P = 0.008). VAK-A is significantly associated with P53 activation, while VAK-B shows significant P53 inhibition. Furthermore, a molecular gene signature comprised of a total of 25 genes and microRNAs was significantly associated with the classes and predicted survival in an independent validation set (P = 0.001). A favorable MGMT promoter methylation status resulted in a 10.5 months additional survival benefit for VAK-A compared to VAK-B patients. 
CONCLUSIONS: The non-invasively determined VAK classification with its implication of VAK-specific molecular regulatory networks, can serve as a very robust initial prognostic tool, clinical trial selection criteria, and important step toward the refinement of genomics-based personalized therapy for GBM patients.
Abstract:
A semisupervised support vector machine is presented for the classification of remote sensing images. The method exploits the wealth of unlabeled samples for regularizing the training kernel representation locally by means of cluster kernels. The method learns a suitable kernel directly from the image and thus avoids assuming a priori signal relations by using a predefined kernel structure. Good results are obtained in image classification examples when few labeled samples are available. The method scales almost linearly with the number of unlabeled samples and provides out-of-sample predictions.
Abstract:
Among the types of remote sensing acquisitions, optical images are certainly one of the most widely relied upon data sources for Earth observation. They provide detailed measurements of the electromagnetic radiation reflected or emitted by each pixel in the scene. Through a process termed supervised land-cover classification, this makes it possible to distinguish objects at the surface of our planet automatically yet accurately. In this respect, when producing a land-cover map of the surveyed area, the availability of training examples representative of each thematic class is crucial for the success of the classification procedure. However, in real applications, due to several constraints on the sample collection process, labeled pixels are usually scarce. When analyzing an image for which those key samples are unavailable, a viable solution consists in resorting to the ground truth data of other previously acquired images. This option is attractive, but several factors such as atmospheric, ground and acquisition conditions can cause radiometric differences between the images, thereby hindering the transfer of knowledge from one image to another. The goal of this thesis is to supply remote sensing image analysts with suitable processing techniques to ensure a robust portability of the classification models across different images. The ultimate purpose is to map the land-cover classes over large spatial and temporal extents with minimal ground information. To overcome, or simply quantify, the observed shifts in the statistical distribution of the spectra of the materials, we study four approaches drawn from the field of machine learning. First, we propose a strategy to intelligently sample the image of interest so that labels are collected only at the most useful pixels. This iterative routine is based on a constant evaluation of how pertinent the initial training data, which actually belong to a different image, are to the new image.
Second, an approach to reduce the radiometric differences among the images by projecting the respective pixels in a common new data space is presented. We analyze a kernel-based feature extraction framework suited for such problems, showing that, after this relative normalization, the cross-image generalization abilities of a classifier are highly increased. Third, we test a new data-driven measure of distance between probability distributions to assess the distortions caused by differences in the acquisition geometry affecting series of multi-angle images. Also, we gauge the portability of classification models through the sequences. In both exercises, the efficacy of classic physically- and statistically-based normalization methods is discussed. Finally, we explore a new family of approaches based on sparse representations of the samples to reciprocally convert the data space of two images. The projection function bridging the images allows a synthesis of new pixels with more similar characteristics ultimately facilitating the land-cover mapping across images.