995 results for CLASS DISCOVERY
Abstract:
Array technologies have made it possible to record simultaneously the expression patterns of thousands of genes. A fundamental problem in the analysis of gene expression data is the identification of highly relevant genes that either discriminate between phenotypic labels or are important with respect to the cellular process studied in the experiment: for example cell cycle or heat shock in yeast experiments, chemical or genetic perturbations of mammalian cell lines, and genes involved in class discovery for human tumors. In this paper we focus on the task of unsupervised gene selection. The problem of selecting a small subset of genes is particularly challenging because the datasets involved are typically characterized by a very small sample size (on the order of a few tens of tissue samples) and by a very large feature space, as the number of genes tends to be in the high thousands. We propose a model-independent approach which scores candidate gene selections using spectral properties of the candidate affinity matrix. The algorithm is very straightforward to implement yet has a number of remarkable properties which guarantee consistent sparse selections. To illustrate the value of our approach we applied our algorithm to five different datasets. The first consists of time course data from four well-studied hematopoietic cell lines (HL-60, Jurkat, NB4, and U937). The other four datasets include three well-studied treatment outcomes (large cell lymphoma, childhood medulloblastomas, breast tumors) and one unpublished dataset (lymph status). We compared our approach both with other unsupervised methods (SOM, PCA, GS) and with supervised methods (SNR, RMB, RFE). The results clearly show that our approach considerably outperforms all the other unsupervised approaches in our study, is competitive with supervised methods, and in some cases even outperforms supervised approaches.
Abstract:
Clustering is a difficult task: there is no single cluster definition and the data can have more than one underlying structure. Pareto-based multi-objective genetic algorithms (e.g., MOCK, Multi-Objective Clustering with automatic K-determination, and MOCLE, Multi-Objective Clustering Ensemble) were proposed to tackle these problems. However, the output of such algorithms can often contain a high number of partitions, making it difficult for an expert to analyze all of them manually. To deal with this problem, we present two selection strategies, both based on the corrected Rand index, to choose a subset of solutions. To test them, they are applied to the set of solutions produced by MOCK and MOCLE in the context of several datasets. The study was also extended to select a reduced set of partitions from the initial population of MOCLE. These analyses show that both proposed selection strategies are very effective. They can significantly reduce the number of solutions and, at the same time, keep the quality and the diversity of the partitions in the original set of solutions. © 2010 Elsevier B.V. All rights reserved.
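The corrected Rand index mentioned in this abstract compares two partitions of the same items, correcting the plain Rand index for chance agreement. The sketch below computes it from pair counts and then uses it in a simple greedy filter that drops near-duplicate partitions. The filter (`select_diverse`) and its `threshold` parameter are illustrative assumptions, not the selection strategies proposed in the paper.

```python
from collections import Counter
from math import comb


def corrected_rand(labels_a, labels_b):
    """Corrected (adjusted) Rand index between two partitions of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same items")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to compare
    # Pairs of items placed together in both partitions, in A only, in B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are one single cluster
    return (together_both - expected) / (maximum - expected)


def select_diverse(partitions, threshold=0.9):
    """Hypothetical greedy filter: keep a partition only if it is sufficiently
    different (index below `threshold`) from every partition already kept."""
    kept = []
    for candidate in partitions:
        if all(corrected_rand(candidate, k) < threshold for k in kept):
            kept.append(candidate)
    return kept


partitions = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
selected = select_diverse(partitions)
print(len(selected))
```

The second partition is the first one with its cluster labels swapped, so the index between them is 1 and the filter discards it; the third partition disagrees with the first on every pair and is kept.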
Abstract:
The enzyme dihydroorotate dehydrogenase (DHODH) has been suggested as a promising target for the design of trypanocidal agents. We report here the discovery of novel inhibitors of Trypanosoma cruzi DHODH identified by a combination of virtual screening and ITC methods. Monitoring of the enzymatic reaction in the presence of selected ligands, together with structural information obtained from X-ray crystallography analysis, has allowed the identification and validation of a novel site of interaction (S2 site). This has provided important structural insights for the rational design of T. cruzi and Leishmania major DHODH inhibitors. The most potent compound (1) in the investigated series inhibits the TcDHODH enzyme with a K_i(app) value of 19.28 μM and possesses a ligand efficiency of 0.54 kcal mol^-1 per non-H atom. The compounds described in this work are promising hits for further development. © 2010 Elsevier Masson SAS. All rights reserved.
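Ligand efficiency is commonly defined as the binding free energy divided by the number of non-hydrogen atoms, with the free energy approximated as RT ln(K_i). The sketch below applies that common definition to the figures quoted in the abstract. The temperature (298.15 K) and the use of the apparent K_i in place of a true K_i are assumptions, and the abstract does not state the heavy-atom count of compound 1, so the value derived here is only what the quoted numbers would imply under those assumptions.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal mol^-1 K^-1


def binding_free_energy(ki_molar, temperature_k=298.15):
    """Approximate binding free energy in kcal/mol as RT * ln(Ki)."""
    return GAS_CONSTANT * temperature_k * math.log(ki_molar)


def ligand_efficiency(ki_molar, heavy_atoms, temperature_k=298.15):
    """Free energy of binding per non-hydrogen atom (kcal/mol per atom)."""
    return -binding_free_energy(ki_molar, temperature_k) / heavy_atoms


# Values quoted in the abstract: Ki(app) = 19.28 uM, ligand efficiency = 0.54.
delta_g = binding_free_energy(19.28e-6)
implied_heavy_atoms = -delta_g / 0.54  # assumption-dependent estimate
print(f"dG = {delta_g:.2f} kcal/mol, implied heavy atoms = {implied_heavy_atoms:.1f}")
```

Under these assumptions the free energy comes out near -6.4 kcal/mol, which is consistent with the reported efficiency for a compound of roughly a dozen non-hydrogen atoms.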
Abstract:
We extend the standard price discovery analysis to estimate the information share of dual-class shares across domestic and foreign markets. By examining both common and preferred shares, we aim to extract information not only about the fundamental value of the firm, but also about the dual-class premium. In particular, our interest lies in the price discovery mechanism regulating the prices of common and preferred shares in the BM&FBovespa as well as the prices of their ADR counterparts in the NYSE and in the Arca platform. However, in the presence of contemporaneous correlation between the innovations, the standard information share measure depends heavily on the ordering we attribute to prices in the system. To remain agnostic about which are the leading share class and market, one could for instance compute some weighted average information share across all possible orderings. This is extremely inconvenient given that we are dealing with 2 share prices in Brazil, 4 share prices in the US, plus the exchange rate (and hence over 5,000 permutations!). We thus develop a novel methodology to carry out price discovery analyses that does not impose any ex-ante assumption about which share class or trading platform conveys more information about shocks in the fundamental price. As such, our procedure yields a single measure of information share, which is invariant to the ordering of the variables in the system. Simulations of a simple market microstructure model show that our information share estimator works well in practice. We then employ transactions data to study price discovery in two dual-class Brazilian stocks and their ADRs. We uncover two interesting findings. First, the foreign market is at least as informative as the home market. Second, shocks in the dual-class premium entail a permanent effect in normal times, but a transitory one in periods of financial distress. We argue that the latter is consistent with the expropriation of preferred shareholders as a class.
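The count of orderings quoted in this abstract follows directly from the number of variables in the system. As a quick check, assuming each price series and the exchange rate enters as one variable:

```python
from math import factorial

brazil_prices = 2   # common and preferred shares in Brazil
us_prices = 4       # ADR prices in the US venues
exchange_rate = 1

n_variables = brazil_prices + us_prices + exchange_rate
n_orderings = factorial(n_variables)
print(n_variables, n_orderings)  # 7 5040
```

With 7 variables there are 7! = 5,040 possible orderings, which matches the "over 5,000 permutations" in the text and shows why averaging an order-dependent measure over all of them is impractical.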
Abstract:
We have evaluated two synthetic epothilone analogues lacking the 12,13-epoxide functionality, 12,13-desoxyepothilone B (dEpoB) and 12,13-desoxyepothilone F (dEpoF). The concentrations required for 50% growth inhibition (IC50) for a variety of anticancer agents were measured in CCRF-CEM/VBL1000 cells (2,048-fold resistance to vinblastine). For dEpoB, dEpoF, aza-EpoB, and paclitaxel, the IC50 values were 0.029, 0.092, 2.99, and 5.17 μM, respectively. These values represent 4-, 33.5-, 1,423-, and 3,133-fold resistance, respectively, when compared with the corresponding IC50 values in the parent [non-multiple-drug-resistant (non-MDR)] CCRF-CEM cells. We then produced MDR human lung carcinoma A549 cells by continuous exposure of the tumor cells to sublethal concentrations of dEpoB (1.8 yr), vinblastine (1.2 yr), and paclitaxel (1.8 yr). This continued exposure led to the development of 2.1-, 4,848-, and 2,553-fold resistance to each drug, respectively. The therapeutic effects of dEpoB and paclitaxel were also compared in vivo in a mouse model by using various tumor xenografts. dEpoB was much more effective than paclitaxel in reducing tumor sizes in all MDR tumors tested. Analysis of dEpoF, an analogue possessing greater aqueous solubility than dEpoB, showed curative effects similar to those of dEpoB against K562, CCRF-CEM, and MX-1 xenografts. These results indicate that dEpoB and dEpoF are efficacious antitumor agents with both a broad chemotherapeutic spectrum and wide safety margins.
Abstract:
We present Herschel PACS 100 and 160 μm observations of the solar-type stars α Men, HD 88230 and HD 210277, which form part of the FGK stars sample of the Herschel open time key programme (OTKP) DUNES (DUst around NEarby Stars). Our observations show small infrared excesses at 160 μm for all three stars. HD 210277 also shows a small excess at 100 μm, while the 100 μm fluxes of α Men and HD 88230 agree with the stellar photospheric predictions. We attribute these infrared excesses to a new class of cold, faint debris discs. Both α Men and HD 88230 are spatially resolved in the PACS 160 μm images, while HD 210277 is point-like at that wavelength. The projected linear sizes of the extended emission lie in the range from ~115 to ≤ 250 AU. The estimated black body temperatures from the 100 and 160 μm fluxes are ≲22 K, and the fractional luminosity of the cold dust is L_dust/L_⋆ ~ 10^-6, close to the luminosity of the Solar System's Kuiper belt. These debris discs are the coldest and faintest discs discovered so far around mature stars, so they cannot easily be explained by invoking “classical” debris disc models.
Abstract:
The integration of computer technologies into everyday classroom life continues to provide pedagogical challenges for school systems, teachers and administrators. Data from an exploratory case study of one teacher and a multiage class of children in the first years of schooling in Australia show that when young children use computers for set tasks in small groups, they require ongoing support from teachers and need to engage in peer interactions that are meaningful and productive. Classroom organization and the nature of teacher-child talk are key factors in engaging children in set tasks and producing desirable learning and teaching outcomes.
Abstract:
Weta possess typical Ensifera ears. Each ear comprises three functional parts: two equally sized tympanal membranes, an underlying system of modified tracheal chambers, and the auditory sensory organ, the crista acustica. This organ sits within an enclosed channel filled with a fluid previously presumed to be hemolymph. The role this channel plays in insect hearing is unknown. We discovered that the fluid within the channel is not actually hemolymph, but a medium composed principally of lipid from a new class. Three-dimensional imaging of this lipid channel revealed a previously undescribed tissue structure within the channel, which we refer to as the olivarius organ. Investigations into the function of the olivarius reveal de novo lipid synthesis, indicating that it produces these lipids in situ from acetate. The auditory role of this lipid channel was investigated using laser Doppler vibrometry of the tympanal membrane, which shows that the displacement of the membrane is significantly increased when the lipid is removed from the auditory system. Neural sensitivity of the system, however, decreased upon removal of the lipid, a surprising result considering that in a typical auditory system the mechanical and auditory sensitivity are positively correlated. These two results, coupled with 3D modelling of the auditory system, lead us to hypothesize a model for weta audition that relies strongly on the presence of the lipid channel. This is the first instance of lipids being associated with an auditory system outside of the Odontocete cetaceans, demonstrating convergence for the use of lipids in hearing.
Abstract:
To identify susceptibility loci for visceral leishmaniasis, we undertook genome-wide association studies in two populations: 989 cases and 1,089 controls from India and 357 cases in 308 Brazilian families (1,970 individuals). The HLA-DRB1-HLA-DQA1 locus was the only region to show strong evidence of association in both populations. Replication at this region was undertaken in a second Indian population comprising 941 cases and 990 controls, and combined analysis across the three cohorts for rs9271858 at this locus showed P_combined = 2.76 × 10^-17 and odds ratio (OR) = 1.41, 95% confidence interval (CI) = 1.30-1.52. A conditional analysis provided evidence for multiple associations within the HLA-DRB1-HLA-DQA1 region, and a model in which risk differed between three groups of haplotypes better explained the signal and was significant in the Indian discovery and replication cohorts. In conclusion, the HLA-DRB1-HLA-DQA1 HLA class II region contributes to visceral leishmaniasis susceptibility in India and Brazil, suggesting shared genetic risk factors for visceral leishmaniasis that cross the epidemiological divides of geography and parasite species. © 2013 Nature America, Inc. All rights reserved.
Abstract:
Bayesian networks are compact, flexible, and interpretable representations of a joint distribution. When the network structure is unknown but there are observational data at hand, one can try to learn the network structure. This is called structure discovery. This thesis contributes to two areas of structure discovery in Bayesian networks: space-time tradeoffs and learning ancestor relations. The fastest exact algorithms for structure discovery in Bayesian networks are based on dynamic programming and use excessive amounts of space. Motivated by the space usage, several schemes for trading space against time are presented. These schemes are presented in a general setting for a class of computational problems called permutation problems; structure discovery in Bayesian networks is seen as a challenging variant of the permutation problems. The main contribution in the area of space-time tradeoffs is the partial order approach, in which the standard dynamic programming algorithm is extended to run over partial orders. In particular, a certain family of partial orders called parallel bucket orders is considered. A partial order scheme that provably yields an optimal space-time tradeoff within parallel bucket orders is presented. Practical issues concerning parallel bucket orders are also discussed. Learning ancestor relations, that is, directed paths between nodes, is motivated by the need for robust summaries of the network structures when there are unobserved nodes at work. Ancestor relations are nonmodular features, and hence learning them is more difficult than learning modular features. A dynamic programming algorithm is presented for computing posterior probabilities of ancestor relations exactly. Empirical tests suggest that ancestor relations can be learned from observational data almost as accurately as arcs, even in the presence of unobserved nodes.
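To make the space cost mentioned in this abstract concrete, the following is a minimal sketch of the textbook dynamic programming over node subsets for exact structure search, not the partial order approach developed in the thesis. It assumes a decomposable score supplied by the caller; `local_score` and the toy scores below are hypothetical. Both tables are indexed by subsets of the nodes, so their size grows as 2^n, which is the space usage that motivates the tradeoffs.

```python
from itertools import combinations


def optimal_network(nodes, local_score):
    """Exact structure search by dynamic programming over node subsets.

    local_score(node, parents) is a caller-supplied (hypothetical) function
    returning the score of `node` with the frozenset `parents`; higher is better.
    Returns (best total score, {node: best parent set}).
    """
    n = len(nodes)
    index = {v: i for i, v in enumerate(nodes)}

    # best_parents[v][mask]: best (score, parents) for v with parents inside mask.
    best_parents = {}
    for v in nodes:
        table = {}
        others = [u for u in nodes if u != v]
        for size in range(len(others) + 1):  # smaller candidate sets first
            for cand in combinations(others, size):
                mask = sum(1 << index[u] for u in cand)
                best = (local_score(v, frozenset(cand)), frozenset(cand))
                # Otherwise reuse the best choice found in a smaller candidate set.
                for u in cand:
                    sub = table[mask & ~(1 << index[u])]
                    if sub[0] > best[0]:
                        best = sub
                table[mask] = best
        best_parents[v] = table

    # best_net[mask]: best (score, parent map) for a network on the nodes in mask.
    best_net = {0: (0.0, {})}
    for mask in range(1, 1 << n):
        best = None
        for i in range(n):
            if not (mask >> i) & 1:
                continue
            sink = nodes[i]                # sink has no children inside mask
            rest = mask & ~(1 << i)        # its parents must come from the rest
            sub_score, sub_map = best_net[rest]
            p_score, p_set = best_parents[sink][rest]
            total = sub_score + p_score
            if best is None or total > best[0]:
                best = (total, {**sub_map, sink: p_set})
        best_net[mask] = best
    return best_net[(1 << n) - 1]


def toy_score(node, parents):
    # Hypothetical scores: B prefers parent A, C prefers parent B,
    # and every parent carries a complexity penalty of 1.
    preferred = {"B": {"A"}, "C": {"B"}}
    hits = len(preferred.get(node, set()) & parents)
    return 3.0 * hits - 1.0 * len(parents)


score, parents = optimal_network(["A", "B", "C"], toy_score)
print(score, parents)
```

Choosing a sink for each subset is equivalent to searching over node orderings, which is why the abstract treats structure discovery as a variant of a permutation problem.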
Abstract:
Development of effective therapies to eradicate persistent, slowly replicating M. tuberculosis (Mtb) represents a significant challenge to controlling the global TB epidemic. To develop such therapies, it is imperative to translate information from metabolome and proteome adaptations of persistent Mtb into drug discovery screening platforms. To this end, reductive sulfur metabolism is genetically and pharmacologically implicated in survival, pathogenesis, and redox homeostasis of persistent Mtb. Therefore, inhibitors of this pathway are expected to serve as powerful tools in its preclinical and clinical validation as a therapeutic target for eradicating persisters. Here, we establish a first functional HTS platform for identification of inhibitors of APS reductase (APSR), a critical enzyme in the assimilation of sulfate for the biosynthesis of cysteine and other essential sulfur-containing molecules. Our HTS campaign involving 38,350 compounds led to the discovery of three distinct structural classes of APSR inhibitors. A class of bioactive compounds with known pharmacology displayed potent bactericidal activity in wild-type Mtb as well as MDR and XDR clinical isolates. Top compounds showed markedly diminished potency in a conditional ΔAPSR mutant, which could be restored by complementation with Mtb APSR. Furthermore, ITC studies on representative compounds provided evidence for direct engagement of the APSR target. Finally, potent APSR inhibitors significantly decreased the cellular levels of key reduced sulfur-containing metabolites and also induced an oxidative shift in the mycothiol redox potential of live Mtb, thus providing functional validation of our screening data. In summary, we have identified first-in-class inhibitors of APSR that can serve as molecular probes in unraveling the links between Mtb persistence, antibiotic tolerance, and sulfate assimilation, in addition to their potential therapeutic value.
Abstract:
TYPICAL is a package for describing and making automatic inferences about a broad class of SCHEME predicate functions. These functions, called types following popular usage, delineate classes of primitive SCHEME objects, composite data structures, and abstract descriptions. TYPICAL types are generated by an extensible combinator language from either existing types or primitive terminals. These generated types are located in a lattice of predicate subsumption which captures necessary entailment between types; if satisfaction of one type necessarily entails satisfaction of another, the first type is below the second in the lattice. The inferences made by TYPICAL compute the position of the new definition within the lattice and establish it there. This information is then accessible to both later inferences and other programs (reasoning systems, code analyzers, etc.) which may need the information for their own purposes. TYPICAL was developed as a representation language for the discovery program Cyrano; particular examples are given of TYPICAL's application in the Cyrano program.
Abstract:
Mapping novel terrain from sparse, complex data often requires the resolution of conflicting information from sensors working at different times, locations, and scales, and from experts with different goals and situations. Information fusion methods help resolve inconsistencies in order to distinguish correct from incorrect answers, as when evidence variously suggests that an object's class is car, truck, or airplane. The methods developed here consider a complementary problem, supposing that information from sensors and experts is reliable though inconsistent, as when evidence suggests that an object's class is car, vehicle, or man-made. Underlying relationships among objects are assumed to be unknown to the automated system or the human user. The ARTMAP information fusion system uses distributed code representations that exploit the neural network's capacity for one-to-many learning in order to produce self-organizing expert systems that discover hierarchical knowledge structures. The system infers multi-level relationships among groups of output classes, without any supervised labeling of these relationships. The procedure is illustrated with two image examples.
Abstract:
Classifying novel terrain or objects from sparse, complex data may require the resolution of conflicting information from sensors working at different times, locations, and scales, and from sources with different goals and situations. Information fusion methods can help resolve inconsistencies, as when evidence variously suggests that an object's class is car, truck, or airplane. The methods described here consider a complementary problem, supposing that information from sensors and experts is reliable though inconsistent, as when evidence suggests that an object's class is car, vehicle, and man-made. Underlying relationships among objects are assumed to be unknown to the automated system or the human user. The ARTMAP information fusion system uses distributed code representations that exploit the neural network's capacity for one-to-many learning in order to produce self-organizing expert systems that discover hierarchical knowledge structures. The system infers multi-level relationships among groups of output classes, without any supervised labeling of these relationships.
Abstract:
Classifying novel terrain or objects from sparse, complex data may require the resolution of conflicting information from sensors working at different times, locations, and scales, and from sources with different goals and situations. Information fusion methods can help resolve inconsistencies, as when evidence variously suggests that an object's class is car, truck, or airplane. The methods described here address a complementary problem, supposing that information from sensors and experts is reliable though inconsistent, as when evidence suggests that an object's class is car, vehicle, and man-made. Underlying relationships among classes are assumed to be unknown to the automated system or the human user. The ARTMAP information fusion system uses distributed code representations that exploit the neural network's capacity for one-to-many learning in order to produce self-organizing expert systems that discover hierarchical knowledge structures. The fusion system infers multi-level relationships among groups of output classes, without any supervised labeling of these relationships. The procedure is illustrated with two image examples, but is not limited to the image domain.