975 results for les k-uplets de premiers
Abstract:
This paper describes the approach taken to the clustering task at INEX 2009 by a group at the Queensland University of Technology. The Random Indexing (RI) K-tree has been used with a representation that is based on the semantic markup available in the INEX 2009 Wikipedia collection. The RI K-tree is a scalable approach to clustering large document collections. This approach has produced quality clustering when evaluated using two different methodologies.
Abstract:
Digital collections are growing exponentially in size as the information age takes a firm grip on all aspects of society. As a result, Information Retrieval (IR) has become an increasingly important area of research. It promises to provide new and more effective ways for users to find information relevant to their search intentions. Document clustering is one of the many tools in the IR toolbox and is far from being perfected. It groups documents that share common features. This grouping allows a user to quickly identify relevant information. If these groups are misleading then valuable information can accidentally be ignored. Therefore, the study and analysis of the quality of document clustering is important. With more and more digital information available, the performance of these algorithms is also of interest. An algorithm with a time complexity of O(n^2) can quickly become impractical when clustering a corpus containing millions of documents. Therefore, the investigation of algorithms and data structures to perform clustering in an efficient manner is vital to its success as an IR tool. Document classification is another tool frequently used in the IR field. It predicts categories of new documents based on an existing database of (document, category) pairs. Support Vector Machines (SVM) have been found to be effective when classifying text documents. As the algorithms for classification are both efficient and of high quality, the largest gains can be made from improvements to representation. Document representations are vital for both clustering and classification. Representations exploit the content and structure of documents. Dimensionality reduction can improve the effectiveness of existing representations in terms of quality and run-time performance. Research into these areas is another way to improve the efficiency and quality of clustering and classification results. Evaluating document clustering is a difficult task.
Intrinsic measures of quality such as distortion only indicate how well an algorithm minimised a similarity function in a particular vector space. Intrinsic comparisons are inherently limited by the given representation and are not comparable between different representations. Extrinsic measures of quality compare a clustering solution to a “ground truth” solution. This allows comparison between different approaches. As the “ground truth” is created by humans, it can suffer from the fact that not every human interprets a topic in the same manner. Whether a document belongs to a particular topic or not can be subjective.
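To make the intrinsic/extrinsic distinction concrete, the following is a minimal sketch that is not taken from the work itself. It assumes distortion means the mean squared Euclidean distance from each point to its cluster centroid, and it uses purity against human-assigned labels as one possible extrinsic measure. The function names and the toy data are illustrative assumptions.

```python
from collections import Counter, defaultdict


def distortion(points, assignments):
    """Intrinsic: mean squared Euclidean distance from each point to its cluster centroid."""
    clusters = defaultdict(list)
    for point, cluster_id in zip(points, assignments):
        clusters[cluster_id].append(point)
    total = 0.0
    for members in clusters.values():
        dims = len(members[0])
        # Centroid is the per-dimension mean of the cluster members.
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dims)]
        total += sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dims)) for p in members
        )
    return total / len(points)


def purity(assignments, labels):
    """Extrinsic: fraction of points whose label is the majority label of their cluster."""
    by_cluster = defaultdict(Counter)
    for cluster_id, label in zip(assignments, labels):
        by_cluster[cluster_id][label] += 1
    majority = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return majority / len(assignments)


# Hypothetical toy data: four 2-D points, two clusters, human-assigned topics.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
assignments = [0, 0, 1, 1]
labels = ["sport", "sport", "music", "sport"]

print("distortion:", distortion(points, assignments))
print("purity:", purity(assignments, labels))
```

Distortion depends only on the vector space, so it changes if the representation changes. Purity depends only on the cluster assignments and the labels, which is why it can be compared across representations.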
Abstract:
The Texas Transportation Commission (“the Commission”) is responsible for planning and making policies for the location, construction, and maintenance of a comprehensive system of highways and public roads in Texas. In order for the Commission to carry out its legislative mandate, the Texas Constitution requires that most revenue generated by motor vehicle registration fees and motor fuel taxes be used for constructing and maintaining public roadways and other designated purposes. The Texas Department of Transportation (TxDOT) assists the Commission in executing state transportation policy. It is the responsibility of the legislature to appropriate money for TxDOT’s operation and maintenance expenses. All money authorized to be appropriated for TxDOT’s operations must come from the State Highway Fund (also known as Fund 6, Fund 006, or Fund 0006). The Commission can then use the balance in the fund to fulfill its responsibilities. However, the value of the revenue received in Fund 6 is not keeping pace with growing demand for transportation infrastructure in Texas. Additionally, diversion of revenue to nontransportation uses now exceeds $600 million per year. As shown in Figure 1.1, revenues and expenditures of the State Highway Fund per vehicle mile traveled (VMT) in Texas have remained almost flat since 1993. In the meantime, construction cost inflation has gone up more than 100%, effectively halving the value of expenditure.
Abstract:
This research report documents work conducted by the Center for Transportation Research (CTR) at The University of Texas at Austin in analyzing the Joint Analysis using the Combined Knowledge (J.A.C.K.) program. This program was developed by the Texas Department of Transportation (TxDOT) to make projections of revenues and expenditures. This research effort was to span from September 2008 to August 2009, but the bulk of the work was completed and presented by December 2008. J.A.C.K. was subsequently renamed TRENDS, but for consistency with the scope of work, the original name is used throughout this report.
Abstract:
In this paper we present pyktree, an implementation of the K-tree algorithm in the Python programming language. The K-tree algorithm provides highly balanced search trees for vector quantization that scale up to very large data sets. Pyktree is highly modular and well suited for rapid prototyping of novel distance measures and centroid representations. It is easy to install and provides a Python package for library use as well as command-line tools.
Abstract:
PCR-based cancer diagnosis requires detection of rare mutations in k-ras, p53 or other genes. The assumption has been that mutant and wild-type sequences amplify with near equal efficiency, so that they are eventually present in proportions representative of the starting material. Work on factor IX suggests that this assumption is invalid for one case of near-sequence identity. To test the generality of this phenomenon and its relevance to cancer diagnosis, primers distant from point mutations in p53 and k-ras were used to amplify wild-type and mutant sequences from these genes. A substantial bias against PCR amplification of mutants was observed for two regions of the p53 gene and one region of k-ras. For k-ras and p53, bias was observed when the wild-type and mutant sequences were amplified separately or when mixed in equal proportions before PCR. Bias was present with proofreading and non-proofreading polymerase. Mutant and wild-type segments of the factor V, cystic fibrosis transmembrane conductance regulator and prothrombin genes were amplified and did not exhibit PCR bias. Therefore, the assumption of equal PCR efficiency for point mutant and wild-type sequences is invalid in several systems. Quantitative or diagnostic PCR will require validation for each locus, and enrichment strategies may be needed to optimize detection of mutants.
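A back-of-the-envelope sketch shows why even a small difference in amplification efficiency matters. It assumes idealised exponential amplification, in which each template is multiplied by (1 + efficiency) per cycle. The efficiencies and the cycle count are hypothetical values chosen for illustration and are not data from the study.

```python
def mutant_fraction(initial_fraction, eff_wild, eff_mutant, cycles):
    """Mutant share of the product after idealised exponential amplification.

    Each template is assumed to grow by a factor of (1 + efficiency) per cycle,
    with no plateau phase. This is a simplification, not the study's model.
    """
    wild = (1.0 - initial_fraction) * (1.0 + eff_wild) ** cycles
    mutant = initial_fraction * (1.0 + eff_mutant) ** cycles
    return mutant / (wild + mutant)


# Hypothetical numbers: a 1:1 starting mixture amplified for 30 cycles.
equal = mutant_fraction(0.5, eff_wild=0.95, eff_mutant=0.95, cycles=30)
biased = mutant_fraction(0.5, eff_wild=0.95, eff_mutant=0.85, cycles=30)

print(f"equal efficiency:  mutant fraction = {equal:.3f}")
print(f"biased efficiency: mutant fraction = {biased:.3f}")
```

With equal efficiencies the product keeps the starting proportions. With a modest per-cycle disadvantage, the mutant share drops well below one half, because the efficiency ratio is compounded over every cycle.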
Abstract:
The molecular structure of the mineral archerite ((K,NH4)H2PO4) has been determined and compared with that of biphosphammite ((NH4,K)H2PO4). Raman spectroscopy and infrared spectroscopy have been used to characterise these ‘cave’ minerals. Both minerals originated from the Murra-el-elevyn Cave, Eucla, Western Australia. The mineral is formed by the reaction of the chemicals in bat guano with calcite substrates. Raman and infrared bands are assigned to H2PO4⁻, OH and NH stretching vibrations. The Raman band at 981 cm⁻¹ is assigned to the HOP stretching vibration. Bands in the 1200 to 1800 cm⁻¹ region are associated with NH4⁺ bending modes. The molecular structure of the two minerals appears to be very similar, and it is therefore concluded that the two minerals are identical.
Abstract:
SUMMARY. Communication disorders are generally accommodated in information retrieval systems, such as those found on the Web, through interfaces that use modalities not involving reading and writing. Few applications exist to help users who have difficulty in the textual modality. We propose taking phonological awareness into account to assist users who have difficulty writing queries (dysorthographia) or reading documents (dyslexia). First, a system for rewriting and interpreting the queries typed by the user is proposed: based on the causes of dysorthographia and on the examples available to us, a system combining an editorial approach (of the spell-checker type) and an oral approach (an automatic transcription system) proved more appropriate. Second, a machine learning method uses specific criteria, such as grapho-phonemic cohesion, to estimate the readability of a sentence and then of a text. ABSTRACT. Most applications intend to help disabled users in the information retrieval process by proposing non-textual modalities. This paper introduces specific parameters linked to phonological awareness in the textual modality. This will enhance the ability of systems to deal with orthographic issues and with the adaptation of results to the reader when, for example, the reader is dyslexic. We propose a phonology-based sentence-level rewriting system that combines spelling correction, speech synthesis and automatic speech recognition. This has been evaluated on a corpus of questions obtained from dyslexic children. We propose a specific sentence readability measure that involves phonetic parameters such as grapho-phonemic cohesion. This has been learned on a corpus of reading times of sentences read by dyslexic children.
Abstract:
Resolving a noted open problem, we show that the Undirected Feedback Vertex Set problem, parameterized by the size of the solution set of vertices, is in the parameterized complexity class Poly(k), that is, polynomial-time pre-processing is sufficient to reduce an initial problem instance (G, k) to a decision-equivalent simplified instance (G', k') where k' ≤ k, and the number of vertices of G' is bounded by a polynomial function of k. Our main result shows an O(k^11) kernelization bound.
Abstract:
Many phosphate-containing minerals are found in the Jenolan Caves. Such minerals are formed by the reaction of bat guano and clays from the caves. Among these cave minerals is the mineral taranakite (K,NH4)Al3(PO4)3(OH)•9(H2O), which has been identified by X-ray diffraction. Jenolan Caves taranakite has been characterised by Raman spectroscopy. Raman and infrared bands are assigned to H2PO4⁻, OH and NH stretching vibrations. By using a combination of XRD and Raman spectroscopy, the existence of taranakite in the caves has been proven.
Abstract:
Purpose: Web search engines are frequently used by people to locate information on the Internet. However, not all queries have an informational goal. Instead of information, some people may be looking for specific web sites or may wish to conduct transactions with web services. This paper aims to focus on automatically classifying the different user intents behind web queries. Design/methodology/approach: For the research reported in this paper, 130,000 web search engine queries are categorized as informational, navigational, or transactional using a k-means clustering approach based on a variety of query traits. Findings: The research findings show that more than 75 percent of web queries (clustered into eight classifications) are informational in nature, with about 12 percent each for navigational and transactional. Results also show that web queries fall into eight clusters, six primarily informational, and one each of primarily transactional and navigational. Research limitations/implications: This study provides an important contribution to web search literature because it provides information about the goals of searchers and a method for automatically classifying the intents of the user queries. Automatic classification of user intent can lead to improved web search engines by tailoring results to specific user needs. Practical implications: The paper discusses how web search engines can use automatically classified user queries to provide more targeted and relevant results in web searching by implementing a real time classification method as presented in this research. Originality/value: This research investigates a new application of a method for automatically classifying the intent of user queries. There has been limited research to date on automatically classifying the user intent of web queries, even though the pay-off for web search engines can be quite beneficial. © Emerald Group Publishing Limited.
Abstract:
"ORIGO Stepping Stones gives mathematics teachers the best of both worlds by delivering lessons and teacher guides on a digital platform blended with the more traditional printed student journals." -- Publisher website
Abstract:
This presentation highlights Multiple Literacies Theory (MLT) and the importance of reading, reading the world and reading oneself, with the aim of transforming oneself in a plurilingual context. The first part of this talk will be devoted to presenting the fundamental principles of MLT. It is true that the literacy valued by schools is often the one most prized in research and teaching. MLT removes school literacy from its privileged position and places it within an assemblage of literacies at home, at school and in the community. Literacies, as a construct, refer to words, gestures, attitudes or, more precisely, to ways of speaking, reading, writing and valuing the realities of life. They are a way of becoming with the world. Literacies constitute texts in the broad sense (for example, music, art, physics and mathematics) that can be visual, oral, written, tactile, olfactory or digital. They fuse with sociopolitical, cultural, economic, gendered and racialised contexts which, through their mobile and fluid character, transform literacies that generate speakers, writers, artists, avatars and communities. Literacies take on their meaning in context, in the time and space in which one finds oneself. As a result, their actualisation is not predetermined and is unpredictable. MLT is interested in the roles played by literacies. Reading, reading the world and reading oneself have the important function, among others, of transforming a life, a community and a society. The second part of this talk will be devoted to a research project that aims to explore how children simultaneously acquire two or more writing systems. Children aged 5 to 8 took part in filmed activities in the classroom, at home and in their neighbourhood.
Interviews were then conducted with the children, their parents and their teachers. This project allows us to better grasp what literacies as processes mean in a plurilingual context. The project is concerned with what reading, reading the world and reading oneself involve at school, at home and in the community. In a pluralistic society, we are more aware than ever of the particular contexts in which reading, reading the world and reading oneself are actualised, whether for a newcomer or for a person living in a minority setting.