915 results for Databases, Bibliographic


Relevance:

20.00% 20.00%

Publisher:

Abstract:

C. difficile causes gastrointestinal infections in humans, including severe diarrhea. It is implicated in 20%-30% of cases of antibiotic-associated diarrhea, in 50%-70% of cases of antibiotic-associated colitis, and in >90% of cases of antibiotic-associated pseudomembranous colitis. Exposure to antimicrobial agents, hospitalization and age are some of the risk factors that predispose to C. difficile infection (CDI). Virtually all hospitalized patients with nosocomially acquired CDI have a history of treatment with antimicrobials or antineoplastic agents within the previous 2 months. CDI usually develops during treatment with antibiotics or some weeks after the course of antibiotics is completed. After exposure to the organism (often in a hospital), the median incubation period is less than 1 week, with a median time of onset of 2 days. Differences in the time between antibiotic use and the development of the disease relate to the timing of exogenous acquisition of C. difficile. This paper reviewed the literature from 1984 to 2012 for studies on different classes of antibiotics in association with the rates of primary CDI and recurrent CDI (RCDI). The databases searched in this systematic review were PubMed (National Library of Medicine) and Medline (R) (Ovid). RefWorks was used to store bibliographic data. The search strategy yielded 733 studies after removing all duplicates: 692 articles from Ovid Medline (R) and 41 articles from PubMed. Only 11 were included as high-quality studies. Of the 11 studies reviewed, 6 described the development of CDI in non-CDI patients taking antibiotics for other purposes and 5 identified the risk factors associated with the development of recurrent CDI after exposure to antibiotics. The risk of developing CDI in non-CDI patients receiving beta-lactam antibiotics was 2.35%, while fluoroquinolones, clindamycin/macrolides and other antibiotics were associated with risks of 2.64%, 2.54% and 2.35%, respectively.
Of those who received a beta-lactam antibiotic, 26.7% developed RCDI, compared with 36.8% of those who received any fluoroquinolone, 26.5% of those who received either clindamycin or macrolides, and 29.1% of those who received other antibiotics. Continued use of non-C. difficile antibiotics, especially fluoroquinolones, was identified as an important risk factor for both primary and recurrent CDI.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents an overview of the Web 3.0 era and the fundamental questions it raises. It analyzes the current characteristics of bibliographic systems structured on the entity-relationship model. The conceptual, logical and physical levels of computer systems are defined; the characteristics of FRBR are then presented, and the relationships between work and document in the FRBR conceptual model are examined. FRBRoo is described as an object-oriented interpretation of the same functional requirements. Finally, future trends are outlined, such as moving from entity-relationship modeling to object modeling, making content explicit through consistent semantic annotation, mapping existing bibliographic databases, and developing ontologies so that documentary systems can be integrated into the Semantic Web.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Correct species identifications are of tremendous importance for invasion ecology, as mistakes could lead to misdirecting limited resources against harmless species or to inaction against problematic ones. DNA barcoding is becoming a promising and reliable tool for species identification; however, the efficacy of such molecular taxonomy depends on gene region(s) that provide a unique sequence to differentiate among species and on the availability of reference sequences in existing genetic databases. Here, we assembled a list of aquatic and terrestrial non-indigenous species (NIS) and checked two leading genetic databases for corresponding sequences of six genome regions used for DNA barcoding. The genetic databases were checked in 2010, 2012, and 2016. All four aquatic kingdoms (Animalia, Chromista, Plantae and Protozoa) were initially about equally represented in the genetic databases, with 64, 65, 69, and 61% of NIS included, respectively. Sequences for terrestrial NIS were present at rates of 58 and 78% for Animalia and Plantae, respectively. Six years later, the share of aquatic NIS with sequences had increased to 75, 75, 74, and 63%, respectively, while that of terrestrial NIS had increased to 74 and 88%, respectively. Genetic databases are marginally better populated with sequences of terrestrial plant NIS than with those of aquatic NIS and terrestrial animal NIS. The rate at which sequences are added to databases is not equal among taxa. Although some groups of NIS, mostly aquatic ones, are not detectable at all based on available data, the current availability of sequences for taxa with environmental and/or economic impact is encouragingly good and continues to increase with time.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Despite the fact that input–output (IO) tables form a central part of the System of National Accounts, each individual country's national IO table exhibits more or less different features and characteristics, reflecting the country's socioeconomic idiosyncrasies. Consequently, the compilers of a multi-regional input–output table (MRIOT) are advised to thoroughly examine the conceptual as well as methodological differences among countries in the estimation of basic statistics for national IO tables and, if necessary, to carry out pre-adjustment of these tables into a common format prior to the MRIOT compilation. The objective of this study is to provide a practical guide for harmonizing national IO tables to construct a consistent MRIOT, referring to the adjustment practices used by the Institute of Developing Economies, JETRO (IDE-JETRO) in compiling the Asian International Input–Output Table.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Geographic Information Systems are developed to handle enormous volumes of data and are equipped with numerous functionalities intended to capture, store, edit, organise, process, analyse and represent geographically referenced information. Industrial simulators for driver training, on the other hand, are real-time applications that require a virtual environment, either geospecific, geogeneric or a combination of the two, over which the simulation programs are run. Ultimately, this environment constitutes a geographic location with its specific characteristics of geometry, appearance, functionality, topography, etc. The set of elements that makes up the virtual simulation environment, and within which the simulator user can move, is usually called the Visual Database (VDB). The work addresses a topic of major interest in the field of industrial training simulators: the problem of analysing, structuring and describing the virtual environments to be used in large driving simulators. This paper sets out a methodology that uses the capabilities and benefits of Geographic Information Systems for organising, optimising and managing the Visual Database of the simulator and for generally enhancing the quality and performance of the simulator.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The need to refine models for best-estimate calculations, based on good-quality experimental data, has been expressed in many recent meetings in the field of nuclear applications. The modeling needs arising in this respect should not be limited to the currently available macroscopic methods but should be extended to next-generation analysis techniques that focus on more microscopic processes. One of the most valuable databases identified for thermal-hydraulics modeling was developed by the Nuclear Power Engineering Corporation (NUPEC), Japan. From 1987 to 1995, NUPEC performed steady-state and transient critical power and departure from nucleate boiling (DNB) test series based on equivalent full-size mock-ups. Considering the reliability not only of the measured data but also of other relevant parameters such as the system pressure, inlet sub-cooling and rod surface temperature, these test series supplied the first substantial database for the development of truly mechanistic and consistent models for boiling transition and critical heat flux. Over the last few years the Pennsylvania State University (PSU), under the sponsorship of the U.S. Nuclear Regulatory Commission (NRC), has prepared, organized, conducted and summarized the OECD/NRC Full-size Fine-mesh Bundle Tests (BFBT) Benchmark. The international benchmark activities have been conducted in cooperation with the Nuclear Energy Agency/Organization for Economic Co-operation and Development (NEA/OECD) and the Japan Nuclear Energy Safety (JNES) organization. Consequently, JNES has made the Boiling Water Reactor (BWR) NUPEC database available for the purposes of the benchmark. Based on the success of the OECD/NRC BFBT benchmark, JNES has also decided to release the data from the NUPEC Pressurized Water Reactor (PWR) subchannel and bundle tests for a follow-up international benchmark entitled the OECD/NRC PWR Subchannel and Bundle Tests (PSBT) benchmark.
This paper presents an application of the joint Penn State University/Technical University of Madrid (UPM) version of the well-known subchannel code COBRA-TF, namely CTF, to the critical power and departure from nucleate boiling (DNB) exercises of the OECD/NRC BFBT and PSBT benchmarks.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Over the last few years, the Pennsylvania State University (PSU) under the sponsorship of the US Nuclear Regulatory Commission (NRC) has prepared, organized, conducted, and summarized two international benchmarks based on the NUPEC data—the OECD/NRC Full-Size Fine-Mesh Bundle Test (BFBT) Benchmark and the OECD/NRC PWR Sub-Channel and Bundle Test (PSBT) Benchmark. The benchmarks’ activities have been conducted in cooperation with the Nuclear Energy Agency/Organization for Economic Co-operation and Development (NEA/OECD) and the Japan Nuclear Energy Safety (JNES) Organization. This paper presents an application of the joint Penn State University/Technical University of Madrid (UPM) version of the well-known sub-channel code COBRA-TF (Coolant Boiling in Rod Array-Two Fluid), namely, CTF, to the steady state critical power and departure from nucleate boiling (DNB) exercises of the OECD/NRC BFBT and PSBT benchmarks. The goal is two-fold: firstly, to assess these models and to examine their strengths and weaknesses; and secondly, to identify the areas for improvement.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This thesis studies the distribution of vascular plants in a territory spanning the provinces of Cuenca, Guadalajara, Madrid and Toledo (Spain). The study area lies in the Southern Sub-plateau (Submeseta Sur) of the Iberian Peninsula, south of the Central System, north of the Montes de Toledo and west of the Iberian System, in sedimentary zones with little relief and a Mediterranean climate marked by strong temperature contrasts and very irregular rainfall. It coincides with the natural regions known as "Alcarrias", "Campiñas de Madrid y Guadalajara" and "Vegas de los ríos de la Cuenca del Tajo"; the author calls it the Middle Tagus Basin (Cuenca Media del Tajo).
In a first phase, the regional flora was studied through bibliographic and herbarium consultations, using the databases available for the MA, MACB, MAF, JACA, AH, ABH, VAL, SALA and EMMA herbaria. The author's own contributions on the territory in recent years, relating to the topics of the thesis, were also reviewed. Field work consisted of flora presence inventories. Collecting herbarium specimens was essential for correctly identifying the species recorded in the inventories; in this way the author assembled his own herbarium, JML, which holds close to 15,000 numbers from the sampled territory across the four provinces. The territory was sampled systematically, yielding about 6,000 plant lists. The sampling unit was the 1 km UTM grid square, one hundredth of a 10 km UTM grid square. Criteria were followed to standardize the sampling, and the time spent and the estimated area sampled were noted for each data collection. The minimum requirement for every grid square in the study area was that each 5 km UTM square contain at least 5 inventories, made in 5 different 1 km UTM squares and taking at least one hour. The unit of comparison was the 5 km UTM square.
The field inventories were computerized in a database named TESIS, created in Microsoft Office Access. The main tables are LOCALIDAD, which records the characteristics of each sampled site, and ESPECIES, which lists the plant species considered in the four provinces of the study. The tables were filled in through forms; the table ESPECIE INVENTARIO, which relates ESPECIES and LOCALIDAD, currently holds about 165,000 records. The table ESPECIES_FPVI displays the species compiled. An indicator called FPVI ("Flora permanentemente visible identificable", permanently visible identifiable flora) was created; it assigns each species indices that indicate whether a given plant can be detected at any time of year.
The results presented are the TESIS database and the Floristic Catalogue of the Middle Tagus Basin, which is the catalogue of the flora of the four provinces from the beginning of the systematic sequence up to the Saxifragaceae; in total, 1,028 taxa in 77 families were compiled. The FPVI index was calculated for the species in the catalogue; the index is designed to allow territories to be compared. Both results depended on the development of the ESPECIES_PVI table of the TESIS database, in which the ecological characteristics of each species are noted and the available bibliographic information is reviewed; the main sources consulted were Flora iberica, the "Anthos" project and the herbarium databases. For each species it is noted whether it was seen, whether it is protected and whether it is an endemic.
Other results include the location of the 10 km UTM squares with the largest number of endemic or singular species, and hence the greatest botanical value; two example species autecology studies, for Teucrium pumilum and Clematis recta; species distribution maps; and the JML herbarium. A simple tool is presented for recording floristic inventories, chorological records, autecology queries and herbarium sheet labels. Finally, the author collaborated on the development of a software application for visualizing, analysing and studying the distribution of plant taxa, which used a substantial share of the data obtained for this thesis as its input.
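The table layout and the minimum sampling criterion described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the thesis used Microsoft Access, whereas the sketch swaps in SQLite through Python's `sqlite3` module; every column name, the underscore in `ESPECIE_INVENTARIO`, the grid-square codes and the sample rows are hypothetical; and "at least one hour" is read here as total sampling time per 5 km square, which the abstract does not state explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ESPECIES (              -- species considered in the study
    especie_id INTEGER PRIMARY KEY,
    nombre     TEXT NOT NULL UNIQUE
);
CREATE TABLE LOCALIDAD (             -- one row per field inventory (hypothetical columns)
    localidad_id INTEGER PRIMARY KEY,
    utm_1km      TEXT NOT NULL,      -- 1 km UTM square that was sampled
    utm_5km      TEXT NOT NULL,      -- enclosing 5 km UTM square (unit of comparison)
    minutos      INTEGER NOT NULL    -- time spent on the inventory
);
CREATE TABLE ESPECIE_INVENTARIO (    -- link table relating ESPECIES and LOCALIDAD
    especie_id   INTEGER NOT NULL REFERENCES ESPECIES(especie_id),
    localidad_id INTEGER NOT NULL REFERENCES LOCALIDAD(localidad_id),
    PRIMARY KEY (especie_id, localidad_id)
);
""")

# Hypothetical sample data: square S1 has 5 inventories, square S2 only 2.
conn.executemany("INSERT INTO ESPECIES VALUES (?, ?)",
                 [(1, "Teucrium pumilum"), (2, "Clematis recta")])
conn.executemany("INSERT INTO LOCALIDAD VALUES (?, ?, ?, ?)", [
    (1, "S1-00", "S1", 15), (2, "S1-01", "S1", 15), (3, "S1-02", "S1", 15),
    (4, "S1-03", "S1", 15), (5, "S1-04", "S1", 15),
    (6, "S2-00", "S2", 30), (7, "S2-01", "S2", 30),
])
conn.executemany("INSERT INTO ESPECIE_INVENTARIO VALUES (?, ?)",
                 [(1, 1), (2, 3), (1, 6)])


def squares_meeting_minimum(conn, min_inventories=5, min_minutes=60):
    """Return the 5 km squares with enough inventories, distinct 1 km squares and time."""
    rows = conn.execute("""
        SELECT utm_5km FROM LOCALIDAD
        GROUP BY utm_5km
        HAVING COUNT(*) >= ?
           AND COUNT(DISTINCT utm_1km) >= ?
           AND SUM(minutos) >= ?
        ORDER BY utm_5km
    """, (min_inventories, min_inventories, min_minutes)).fetchall()
    return [row[0] for row in rows]


def species_in_square(conn, utm_5km):
    """List the species recorded in any inventory of the given 5 km square."""
    rows = conn.execute("""
        SELECT DISTINCT e.nombre
        FROM ESPECIES e
        JOIN ESPECIE_INVENTARIO ei ON ei.especie_id = e.especie_id
        JOIN LOCALIDAD l ON l.localidad_id = ei.localidad_id
        WHERE l.utm_5km = ?
        ORDER BY e.nombre
    """, (utm_5km,)).fetchall()
    return [row[0] for row in rows]


valid_squares = squares_meeting_minimum(conn)
s1_species = species_in_square(conn, "S1")
```

With the sample rows above, only the hypothetical square `S1` satisfies the minimum criterion, and the join through the link table returns the species recorded there.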

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Machine learning and scientometrics are the scientific disciplines covered in this dissertation. Machine learning deals with the construction and study of algorithms that can learn from data, whereas scientometrics is mainly concerned with the analysis of science from a quantitative perspective. Advances in machine learning now provide the mathematical and statistical tools for working properly with the vast amount of scientometric data stored in bibliographic databases. In this context, the use of novel machine learning methods in scientometric applications is the focus of this dissertation, which proposes new machine learning contributions that could shed light on the field of scientometrics. These contributions are divided into three parts.
First, several supervised cost-(in)sensitive models are learned to predict the scientific success of articles and researchers. Cost-sensitive models do not aim to maximize classification accuracy, but to minimize the expected total cost of classification errors. In this context, publishers of scientific journals could have a tool capable of predicting the future citation count of an article before it is published, whereas promotion committees could predict the annual increase in the h-index of researchers during their first few years. These predictive models could pave the way for new assessment systems. Second, several probabilistic graphical models are learned to exploit and discover new relationships among the vast number of existing bibliometric indices. In this context, the scientific community could measure how some indices influence others in probabilistic terms and perform evidence propagation and abductive inference to answer bibliometric questions. The scientific community could also uncover which bibliometric indices have the greatest predictive power. This is a multi-output regression problem in which the role of each variable, predictive or response, is unknown beforehand. The resulting indices could be very useful for prediction purposes: once their values are known, knowledge of any other index value provides no further information for predicting the remaining bibliometric indices.
Third, a scientometric study of Spanish computer science research is performed under the publish-or-perish culture. This study is based on a cluster analysis methodology that characterizes research activity in terms of productivity, visibility, quality, prestige and international collaboration. It also analyzes the effects of collaboration on productivity and visibility under different circumstances.
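The distinction drawn in this abstract between maximizing accuracy and minimizing expected cost can be illustrated with a small sketch. It is not the dissertation's model: the function name `min_expected_cost_class`, the two classes, the probabilities and the cost matrix are all hypothetical values chosen for illustration. Given class probabilities from any classifier, the cost-sensitive decision picks the prediction with the lowest expected cost rather than the most probable class.

```python
def min_expected_cost_class(probs, cost):
    """Pick the prediction with the lowest expected cost.

    probs[j]   -- estimated probability that the true class is j
    cost[i][j] -- cost of predicting class i when the true class is j
    Returns (best_class, expected_costs).
    """
    expected = [
        sum(cost[i][j] * probs[j] for j in range(len(probs)))
        for i in range(len(cost))
    ]
    best = min(range(len(expected)), key=expected.__getitem__)
    return best, expected


# Hypothetical example: class 0 = "few citations", class 1 = "highly cited".
probs = [0.7, 0.3]

# Hypothetical costs: missing a highly cited article (predict 0, truth 1)
# is assumed to be ten times worse than a false alarm (predict 1, truth 0).
cost = [
    [0, 10],  # predict 0
    [1, 0],   # predict 1
]

most_probable = max(range(len(probs)), key=probs.__getitem__)
cost_sensitive, expected = min_expected_cost_class(probs, cost)
```

Under these assumed numbers the most probable class is 0, but the cost-sensitive decision is class 1, because its expected cost (0.7) is lower than that of predicting class 0 (3.0).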

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The main objective of this project was to introduce machine learning into the FleSe application. FleSe is a web application for running fuzzy queries over crisp databases. To do so, it uses criteria that define the fuzzy concepts used in the queries, and it lets users change (personalize) these criteria. This is where machine learning is introduced: the default criteria change and learn from the personalizations that users make. The secondary objectives were to become familiar with web development and design, and to refresh and extend knowledge of fuzzy logic and the logic programming language Ciao-Prolog. Over the course of the project, and especially after studying the results, it became clear that grouping users is what distinguishes this version of the application from the previous one. The underlying idea is as follows: a machine learning algorithm could be applied to the personalized criteria of all users, but the wide diversity of user opinions can lead the algorithm to erroneous or unrepresentative criteria. To address this, users are first clustered so that each group shares the same opinion or criterion about a concept; the machine learning algorithm is then used to refine the default criterion for each group of users.
One possible improvement for future versions of FleSe is better control and handling of the plserver executable, which caused many problems at the start of this project and also produced significant errors in the previous version. This file lets the web application use Ciao-Prolog to carry out the fuzzy logic behind the queries. One of its most serious problems is that it blocks the execution thread when it tries to load a file with errors; if this happens repeatedly, it blocks all subsequent requests and, with them, the application. With users and potential customers in mind, it would also be important for FleSe to work with SQL databases instead of storing the database in Prolog files.
Another possible improvement is to cluster users on different features depending on the fuzzy concepts used in the queries. Each fuzzy concept would then produce its own groups of users holding different opinions about that concept, yielding more precise default criteria for each user and each fuzzy concept.
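The cluster-then-learn idea in this abstract can be sketched as follows. The abstract does not name the clustering or learning algorithm FleSe uses, so this sketch swaps in a plain one-dimensional k-means and a per-group mean as stand-ins; it also assumes, hypothetically, that each user's personalization reduces to a single numeric threshold for one fuzzy concept (for example, the price below which an item counts as "cheap"). The function names and the data are illustrative only.

```python
def assign(values, centers):
    """Group each value with its nearest center."""
    groups = [[] for _ in centers]
    for v in values:
        nearest = min(range(len(centers)), key=lambda c: abs(v - centers[c]))
        groups[nearest].append(v)
    return groups


def kmeans_1d(values, k=2, iterations=20):
    """Plain 1-D k-means with a deterministic start (centers spread from min to max)."""
    lo, hi = min(values), max(values)
    if k == 1:
        centers = [sum(values) / len(values)]
    else:
        centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        groups = assign(values, centers)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, assign(values, centers)


# Hypothetical personalizations: six users' thresholds for the concept "cheap".
thresholds = [90, 100, 110, 480, 500, 520]

# A single default learned from all users represents nobody well.
global_default = sum(thresholds) / len(thresholds)

# Clustering first, then learning one default per group.
group_defaults, groups = kmeans_1d(thresholds, k=2)
```

With this assumed data, the single global default falls at 300, far from every user's own threshold, while the two per-group defaults land at 100 and 500, each close to the users in its group.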

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Expressed sequence tags (ESTs) are randomly sequenced cDNA clones. Currently, nearly 3 million human and 2 million mouse ESTs provide valuable resources that enable researchers to investigate the products of gene expression. The EST databases have proven to be useful tools for detecting homologous genes, for exon mapping, revealing differential splicing, etc. With the increasing availability of large amounts of poorly characterised eukaryotic (notably human) genomic sequence, ESTs have now become a vital tool for gene identification, sometimes yielding the only unambiguous evidence for the existence of a gene expression product. However, BLAST-based Web servers available to the general user have not kept pace with these developments and do not provide appropriate tools for querying EST databases with large highly spliced genes, often spanning 50 000–100 000 bases or more. Here we describe Gene2EST (http://woody.embl-heidelberg.de/gene2est/), a server that brings together a set of tools enabling efficient retrieval of ESTs matching large DNA queries and their subsequent analysis. RepeatMasker is used to mask dispersed repetitive sequences (such as Alu elements) in the query, BLAST2 for searching EST databases and Artemis for graphical display of the findings. Gene2EST combines these components into a Web resource targeted at the researcher who wishes to study one or a few genes to a high level of detail.