Digital Library

607 results for Équation de Korteweg-De Vries

Document clustering evaluation : Divergence from a random baseline

Relevance: 10.00%

Abstract:

Divergence from a random baseline is a technique for the evaluation of document clustering. It ensures that cluster quality measures are performing work, which prevents ineffective clusterings that provide no useful result from receiving high scores. These concepts are defined and analysed using intrinsic and extrinsic approaches to the evaluation of document cluster quality. This includes the classical clusters-to-categories approach and a novel approach that uses ad hoc information retrieval. The divergence from a random baseline approach is able to differentiate ineffective clusterings encountered in the INEX XML Mining track. It also appears to perform a normalisation similar to the Normalised Mutual Information (NMI) measure, but it can be applied to any measure of cluster quality. When it is applied to the intrinsic measure of distortion as measured by RMSE, subtraction from a random baseline provides a clear optimum that is not apparent otherwise. This approach can be applied to any clustering evaluation; this paper describes its use in the context of document clustering evaluation.
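
The idea in this abstract can be illustrated with a minimal sketch. It assumes purity as the quality measure and assumes the random baseline is built by shuffling cluster assignments so that cluster sizes are preserved; the function names `purity` and `divergence_from_random` are hypothetical and are not taken from the paper or from any released software.

```python
import random
from collections import Counter

def purity(clusters, categories):
    """Fraction of documents that belong to the majority category of their cluster."""
    by_cluster = {}
    for cluster_id, category in zip(clusters, categories):
        by_cluster.setdefault(cluster_id, Counter())[category] += 1
    return sum(max(c.values()) for c in by_cluster.values()) / len(clusters)

def divergence_from_random(clusters, categories, measure=purity, trials=20, seed=0):
    """Score of the clustering minus the mean score of size-preserving random baselines."""
    rng = random.Random(seed)
    baseline = 0.0
    for _ in range(trials):
        shuffled = list(clusters)
        rng.shuffle(shuffled)  # same cluster sizes, random membership (assumption)
        baseline += measure(shuffled, categories)
    return measure(clusters, categories) - baseline / trials

categories = [0, 0, 0, 0, 1, 1, 1, 1]
singletons = list(range(8))            # one document per cluster: purity 1.0 but useless
perfect = [0, 0, 0, 0, 1, 1, 1, 1]     # clusters match the categories
print(divergence_from_random(singletons, categories))
print(divergence_from_random(perfect, categories))
```

In this sketch the one-document-per-cluster solution scores perfect purity, but its random baseline scores the same, so its divergence is zero, while the clustering that matches the categories keeps a positive divergence.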

Pairwise similarity of TopSig document signatures

Relevance: 10.00%

Abstract:

This paper analyses the pairwise distances of signatures produced by the TopSig retrieval model on two document collections. The distribution of the distances is compared to that of purely random signatures. It explains why TopSig is only competitive with state-of-the-art retrieval models at early precision. Only the local neighbourhood of the signatures is interpretable. We suggest this is a common property of vector space models.
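
The random-signature reference distribution mentioned above can be sketched as follows. This is only an illustration: it assumes signatures are fixed-width bit strings compared by Hamming distance, and the 1024-bit width, the sample size, and all function names are assumptions rather than details from the paper.

```python
import random
from itertools import combinations
from statistics import mean, pstdev

def random_signatures(count, bits, seed=0):
    """Generate `count` uniformly random bit signatures, stored as Python ints."""
    rng = random.Random(seed)
    return [rng.getrandbits(bits) for _ in range(count)]

def hamming(a, b):
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")

def pairwise_distances(signatures):
    """Hamming distance for every unordered pair of signatures."""
    return [hamming(a, b) for a, b in combinations(signatures, 2)]

BITS = 1024  # assumed signature width
distances = pairwise_distances(random_signatures(200, BITS))
# For random signatures each bit differs with probability 1/2, so distances
# concentrate around BITS / 2 with standard deviation sqrt(BITS) / 2.
print(mean(distances), pstdev(distances))
```

Distances between real document signatures would be compared against this tightly concentrated distribution; pairs that fall well below it are the ones a signature model can distinguish from chance.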

Mathematics learning in prep

Relevance: 10.00%

Abstract:

Most teachers recognise the importance of mathematics teaching and learning in the early years, but there is no consensus on how and when this learning should occur. Young-Loveridge (cited in de Vries, Thomas, and Warren, 2010) suggests that quality early mathematical experiences are a key determinant of later achievement.

ClusterEval 1.0 : Cluster quality Evaluation software

Relevance: 10.00%

Abstract:

This report describes the available functionality and use of the ClusterEval evaluation software. It implements novel and standard measures for the evaluation of cluster quality. This software has been used at the INEX XML Mining track and in the MediaEval Social Event Detection task.

Genome-wide association study reveals three susceptibility loci for common migraine in the general population

Relevance: 10.00%

Abstract:

Migraine is a common, heterogeneous and heritable neurological disorder. Its pathophysiology is incompletely understood, and its genetic influences at the population level are unknown. In a population-based genome-wide analysis including 5,122 migraineurs and 18,108 non-migraineurs, rs2651899 (1p36.32, PRDM16), rs10166942 (2q37.1, TRPM8) and rs11172113 (12q13.3, LRP1) were among the top seven associations (P < 5 × 10⁻⁶) with migraine. These SNPs were significant in a meta-analysis among three replication cohorts and met genome-wide significance in a meta-analysis combining the discovery and replication cohorts (rs2651899, odds ratio (OR) = 1.11, P = 3.8 × 10⁻⁹; rs10166942, OR = 0.85, P = 5.5 × 10⁻¹²; and rs11172113, OR = 0.90, P = 4.3 × 10⁻⁹). The associations at rs2651899 and rs10166942 were specific for migraine compared with non-migraine headache. None of the three SNP associations was preferential for migraine with aura or without aura, nor were any associations specific for migraine features. TRPM8 has been the focus of neuropathic pain models, whereas LRP1 modulates neuronal glutamate signaling, plausibly linking both genes to migraine pathophysiology.

Social event detection at MediaEval 2013 : challenges, datasets and evaluation

Relevance: 10.00%

Abstract:

In this paper, we provide an overview of the Social Event Detection (SED) task that is part of the MediaEval Benchmark for Multimedia Evaluation 2013. This task requires participants to discover social events and organize the related media items in event-specific clusters within a collection of Web multimedia. Social events are events that are planned by people, attended by people and for which the social multimedia are also captured by people. We describe the challenges, datasets, and the evaluation methodology.

Report on INEX 2008

Relevance: 10.00%

Abstract:

INEX investigates focused retrieval from structured documents by providing large test collections of structured documents, uniform evaluation measures, and a forum for organizations to compare their results. This paper reports on the INEX 2008 evaluation campaign, which consisted of a wide range of tracks: Ad hoc, Book, Efficiency, Entity Ranking, Interactive, QA, Link the Wiki, and XML Mining.

Document clustering algorithms, representations and evaluation for information retrieval

Relevance: 10.00%

Abstract:

This thesis presents new methods for classification and thematic grouping of billions of web pages, at scales previously not achievable. This process is also known as document clustering, where similar documents are automatically associated with clusters that represent various distinct topics. These automatically discovered topics are in turn used to improve search engine performance by only searching the topics that are deemed relevant to particular user queries.

Clustering and labeling a web scale document collection using Wikipedia clusters

Relevance: 10.00%

Abstract:

Clustering is an important technique in organising and categorising web scale documents. The main challenges faced in clustering the billions of documents available on the web are the processing power required and the sheer size of the datasets available. More importantly, it is nigh impossible to generate the labels for a general web document collection containing billions of documents and a vast taxonomy of topics. However, document clusters are most commonly evaluated by comparison to a ground truth set of labels for documents. This paper presents a clustering and labeling solution where Wikipedia is clustered and hundreds of millions of web documents in ClueWeb12 are mapped on to those clusters. This solution is based on the assumption that Wikipedia contains such a wide range of diverse topics that it represents a small scale web. We found that it was possible to perform the web scale document clustering and labeling process on one desktop computer in under a couple of days for the Wikipedia clustering solution containing about 1000 clusters. It takes longer to execute a solution with finer granularity clusters such as 10,000 or 50,000. These results were evaluated using a set of external data.

Parallel streaming signature EM-tree: A clustering algorithm for web scale applications

Relevance: 10.00%

Abstract:

The proliferation of the web presents an unsolved problem of automatically analyzing billions of pages of natural language. We introduce a scalable algorithm that clusters hundreds of millions of web pages into hundreds of thousands of clusters. It does this on a single mid-range machine using efficient algorithms and compressed document representations. It is applied to two web-scale crawls covering tens of terabytes. ClueWeb09 and ClueWeb12 contain 500 and 733 million web pages and were clustered into 500,000 to 700,000 clusters. To the best of our knowledge, such fine grained clustering has not been previously demonstrated. Previous approaches clustered a sample that limits the maximum number of discoverable clusters. The proposed EM-tree algorithm uses the entire collection in clustering and produces several orders of magnitude more clusters than the existing algorithms. Fine grained clustering is necessary for meaningful clustering in massive collections where the number of distinct topics grows linearly with collection size. These fine-grained clusters show an improved cluster quality when assessed with two novel evaluations using ad hoc search relevance judgments and spam classifications for external validation. These evaluations solve the problem of assessing the quality of clusters where categorical labeling is unavailable and unfeasible.

Genetics of rheumatoid arthritis contributes to biology and drug discovery

Relevance: 10.00%

Abstract:

A major challenge in human genetics is to devise a systematic strategy to integrate disease-associated variants with diverse genomic and biological data sets to provide insight into disease pathogenesis and guide drug discovery for complex traits such as rheumatoid arthritis (RA) [1]. Here we performed a genome-wide association study meta-analysis in a total of >100,000 subjects of European and Asian ancestries (29,880 RA cases and 73,758 controls), by evaluating ~10 million single-nucleotide polymorphisms. We discovered 42 novel RA risk loci at a genome-wide level of significance, bringing the total to 101 (refs 2, 3, 4). We devised an in silico pipeline using established bioinformatics methods based on functional annotation [5], cis-acting expression quantitative trait loci [6] and pathway analyses [7, 8, 9], as well as novel methods based on genetic overlap with human primary immunodeficiency, haematological cancer somatic mutations and knockout mouse phenotypes, to identify 98 biological candidate genes at these 101 risk loci. We demonstrate that these genes are the targets of approved therapies for RA, and further suggest that drugs approved for other indications may be repurposed for the treatment of RA. Together, this comprehensive genetic study sheds light on fundamental genes, pathways and cell types that contribute to RA pathogenesis, and provides empirical evidence that the genetics of RA can provide important information for drug discovery.

Phylogeny of the Quambalariaceae fam. nov., including important Eucalyptus pathogens in South Africa and Australia

Relevance: 10.00%

Abstract:

The genus Quambalaria consists of plant-pathogenic fungi causing disease on leaves and shoots of species of Eucalyptus and its close relative, Corymbia. The phylogenetic relationship of Quambalaria spp., previously classified in genera such as Sporothrix and Ramularia, has never been addressed. It has, however, been suggested that they belong to the basidiomycete orders Exobasidiales or Ustilaginales. The aim of this study was thus to consider the ordinal relationships of Q. eucalypti and Q. pitereka using ribosomal LSU sequences. Sequence data from the ITS nrDNA were used to determine the phylogenetic relationship of the two Quambalaria species together with Fugomyces (= Cerinosterus) cyanescens. In addition to sequence data, the ultrastructure of the septal pores of the species in question was compared. From the LSU sequence data it was concluded that Quambalaria spp. and F. cyanescens form a monophyletic clade in the Microstromatales, an order of the Ustilaginomycetes. Sequences from the ITS region confirmed that Q. pitereka and Q. eucalypti are distinct species. The ex-type isolate of F. cyanescens, together with another isolate from Eucalyptus in Australia, constitute a third species of Quambalaria, Q. cyanescens (de Hoog & G.A. de Vries) Z.W. de Beer, Begerow & R. Bauer comb. nov. Transmission electron-microscopic studies of the septal pores confirm that all three Quambalaria spp. have dolipores with swollen lips, which differ from other members of the Microstromatales (i.e. the Microstromataceae and Volvocisporiaceae) that have simple pores with more or less rounded pore lips. Based on their unique ultrastructural features and the monophyly of the three Quambalaria spp. in the Microstromatales, a new family, Quambalariaceae Z.W. de Beer, Begerow & R. Bauer fam. nov., is described.

Generalized Burgers equations and Euler-Painlevé transcendents. I

Relevance: 10.00%

Abstract:

Initial-value problems for the generalized Burgers equation (GBE) $u_t + u^{\beta} u_x + \lambda u^{\alpha} = (\delta/2) u_{xx}$ are discussed for single-hump initial data, both continuous and discontinuous. The numerical solution is carried to the self-similar "intermediate asymptotic" regime, where the solution is given analytically by the self-similar form. The nonlinear (transformed) ordinary differential equations (ODEs) describing the self-similar form are generalizations of a class discussed by Euler and Painlevé and quoted by Kamke. These ODEs are new, and it is postulated that they characterize GBEs in the same manner as the Painlevé equations categorize the Korteweg-de Vries (KdV) type. A connection problem for some related ODEs satisfying proper asymptotic conditions at $x = \pm\infty$ is solved. The range of the amplitude parameter for which the solution of the connection problem exists is found. The other solutions of the above GBE, which display several interesting features such as peaking, breaking, and a long shelf on the left for negative values of the damping coefficient $\lambda$, are also discussed. The results are compared with those holding for the modified KdV equation with damping. Journal of Mathematical Physics is copyrighted by The American Institute of Physics.
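
For readers who want to experiment with the initial-value problem described above, here is a minimal numerical sketch. It is not the scheme used in the paper: it assumes non-negative single-hump data, an explicit upwind difference for the convective term, a central difference for the diffusive term, and illustrative parameter values (beta = alpha = 1, lambda = delta = 0.1). The name `gbe_step` is hypothetical.

```python
import math

def gbe_step(u, dx, dt, beta=1.0, alpha=1.0, lam=0.1, delta=0.1):
    """One explicit step of u_t + u^beta u_x + lam u^alpha = (delta/2) u_xx.

    Assumes u >= 0, so the wave speed u^beta is non-negative and a
    backward (upwind) difference is appropriate for u_x.
    """
    n = len(u)
    new = [0.0] * n  # boundary values are held at zero
    for i in range(1, n - 1):
        ui = max(u[i], 0.0)
        advect = ui ** beta * (u[i] - u[i - 1]) / dx
        diffuse = 0.5 * delta * (u[i + 1] - 2.0 * u[i] + u[i - 1]) / dx ** 2
        new[i] = u[i] + dt * (-advect - lam * ui ** alpha + diffuse)
    return new

dx, dt = 0.1, 0.01                       # chosen to satisfy the explicit stability limits
x = [-5.0 + i * dx for i in range(101)]
u0 = [math.exp(-xi * xi) for xi in x]    # single-hump initial profile (assumed Gaussian)
u = u0
for _ in range(200):
    u = gbe_step(u, dx, dt)
print(max(u0), max(u))
```

With these step sizes the update is a positive combination of neighbouring values, so the hump stays non-negative and its peak decays under the combined damping and diffusion. Negative damping coefficients, which the abstract associates with peaking and breaking, would need a more careful scheme.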

Genetic analysis for a shared biological basis between migraine and coronary artery disease

Relevance: 10.00%

Abstract:

Objective: To apply genetic analysis of genome-wide association data to study the extent and nature of a shared biological basis between migraine and coronary artery disease (CAD). Methods: Four separate methods for cross-phenotype genetic analysis were applied to data from 2 large-scale genome-wide association studies of migraine (19,981 cases, 56,667 controls) and CAD (21,076 cases, 63,014 controls). The first 2 methods quantified the extent of overlapping risk variants and assessed the load of CAD risk loci in migraineurs. Genomic regions of shared risk were then identified by analysis of covariance patterns between the 2 phenotypes and by querying known genome-wide significant loci. Results: We found a significant overlap of genetic risk loci for migraine and CAD. When stratified by migraine subtype, this was limited to migraine without aura, and the overlap was protective in that patients with migraine had a lower load of CAD risk alleles than controls. Genes indicated by 16 shared risk loci point to mechanisms with potential roles in migraine pathogenesis and CAD, including endothelial dysfunction (PHACTR1) and insulin homeostasis (GIP). Conclusions: The results suggest that shared biological processes contribute to risk of migraine and CAD, but surprisingly this commonality is restricted to migraine without aura and the impact is in opposite directions. Understanding the mechanisms underlying these processes and their opposite relationship to migraine and CAD may improve our understanding of both disorders.

Gene-based pleiotropy across migraine with aura and migraine without aura patient groups

Relevance: 10.00%

Abstract:

Introduction: It is unclear whether patients diagnosed according to International Classification of Headache Disorders criteria for migraine with aura (MA) and migraine without aura (MO) experience distinct disorders or whether their migraine subtypes are genetically related. Aim: Using a novel gene-based (statistical) approach, we aimed to identify individual genes and pathways associated both with MA and MO. Methods: Gene-based tests were performed using genome-wide association summary statistic results from the most recent International Headache Genetics Consortium study comparing 4505 MA cases with 34,813 controls and 4038 MO cases with 40,294 controls. After accounting for non-independence of gene-based test results, we examined the significance of the proportion of shared genes associated with MA and MO. Results: We found a significant overlap in genes associated with MA and MO. Of the total 1514 genes with a nominally significant gene-based p value (p_gene-based ≤ 0.05) in the MA subgroup, 107 also produced p_gene-based ≤ 0.05 in the MO subgroup. The proportion of overlapping genes is almost double the empirically derived null expectation, producing significant evidence of gene-based overlap (pleiotropy) (p_binomial-test = 1.5 × 10⁻⁴). Combining results across MA and MO, six genes produced genome-wide significant gene-based p values. Four of these genes (TRPM8, UFL1, FHL5 and LRP1) were located in close proximity to previously reported genome-wide significant SNPs for migraine, while two genes, TARBP2 and NPFF, separated by just 259 bp on chromosome 12q13.13, represent a novel risk locus. The genes overlapping in both migraine types were enriched for functions related to inflammation, the cardiovascular system and connective tissue. Conclusions: Our results provide novel insight into the likely genes and biological mechanisms that underlie both MA and MO, and when combined with previous data, highlight the neuropeptide FF-amide peptide encoding gene (NPFF) as a novel candidate risk gene for both types of migraine.
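
The overlap test described in this abstract can be illustrated with a one-sided exact binomial tail. The sketch below is an assumption-laden illustration, not the study's procedure: the abstract does not state the empirical null proportion, so `null_rate` is a hypothetical placeholder, and the published p value also accounts for non-independence of gene-based tests, which this sketch ignores. The name `binom_sf` is hypothetical.

```python
import math

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    total = 0.0
    for i in range(k, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return total

observed, total_genes = 107, 1514  # counts quoted in the abstract
null_rate = 0.04                   # hypothetical placeholder, NOT taken from the abstract
print(observed / total_genes)      # observed proportion of overlapping genes
print(binom_sf(observed, total_genes, null_rate))
```

Because the null rate is a placeholder and the genes are treated as independent, the printed tail probability should not be expected to match the value reported in the abstract.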
