937 results for Astronomical Data Bases
Abstract:
Graduate Program in Planning and Analysis of Public Policies - FCHS
Abstract:
The vast majority of known proteins have not yet been experimentally characterized, and little is known about their function. The design and implementation of computational tools can provide insight into the function of proteins based on their sequence, their structure, their evolutionary history and their association with other proteins. Knowledge of the three-dimensional (3D) structure of a protein can lead to a deep understanding of its mode of action and interaction, but currently the structures of fewer than 1% of sequences have been experimentally solved. For this reason, it has become urgent to develop new methods able to computationally extract relevant information from protein sequence and structure. The starting point of my work has been the study of the properties of contacts between protein residues, since they constrain protein folding and characterize different protein structures. Prediction of residue contacts in proteins is an interesting problem whose solution may be useful in protein fold recognition and de novo design. Predicting these contacts requires studying the inter-residue distances associated with each specific type of amino-acid pair, information encoded in the so-called contact map. A new way of analyzing these structures emerged with the introduction of network studies, with pivotal papers demonstrating that protein contact networks also exhibit small-world behavior. In order to highlight constraints for the prediction of protein contact maps, and for applications in protein structure prediction and/or reconstruction from experimentally determined contact maps, I studied to what extent the characteristic path length and the clustering coefficient of the protein contact network reveal characteristic features of protein contact maps. Provided that residue contacts are known for a protein sequence, the major features of its 3D structure can be deduced by combining this knowledge with correctly predicted secondary-structure motifs. In the second part of my work I focused on a particular protein structural motif, the coiled-coil, known to mediate a variety of fundamental biological interactions. Coiled-coils are found in a variety of structural forms and in a wide range of proteins including, for example, small units such as the leucine zippers that drive the dimerization of many transcription factors, and more complex structures such as the family of viral proteins responsible for virus-host membrane fusion. The coiled-coil structural motif is estimated to account for 5-10% of the protein sequences in the various genomes. Given their biological importance, in my work I introduced a Hidden Markov Model (HMM) that exploits the evolutionary information derived from multiple sequence alignments to predict coiled-coil regions and to discriminate coiled-coil sequences. The results indicate that the new HMM outperforms the existing programs and can be adopted for coiled-coil prediction and for large-scale genome annotation. Genome annotation is a key issue in modern computational biology, being the starting point towards understanding the complex processes involved in biological networks. The rapid growth in the number of available protein sequences and structures poses new fundamental problems that still await interpretation. Nevertheless, these data are at the basis of the design of new strategies for tackling problems such as the prediction of protein structure and function.
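As a minimal illustration of the quantities discussed above, the sketch below builds a binary contact map from Cα coordinates and computes the characteristic path length and clustering coefficient of the resulting contact network. The 8 Å cutoff and the random toy coordinates are illustrative assumptions, not the thesis's actual definitions or data.

```python
import numpy as np
import networkx as nx  # pip install networkx

def contact_map(ca_coords, cutoff=8.0):
    # Residues i and j are in contact when their C-alpha distance is below
    # `cutoff` (in Angstrom); 8 A is a common convention, assumed here.
    d = np.linalg.norm(ca_coords[:, None, :] - ca_coords[None, :, :], axis=-1)
    return (d < cutoff) & ~np.eye(len(ca_coords), dtype=bool)

def small_world_stats(cmap):
    # Characteristic path length and clustering coefficient of the
    # protein contact network, restricted to its largest component.
    g = nx.from_numpy_array(cmap.astype(int))
    g = g.subgraph(max(nx.connected_components(g), key=len))
    return nx.average_shortest_path_length(g), nx.average_clustering(g)

# Toy example: random points stand in for a real C-alpha trace.
coords = np.random.rand(60, 3) * 25.0
path_len, clustering = small_world_stats(contact_map(coords))
print(f"characteristic path length {path_len:.2f}, clustering {clustering:.2f}")
```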
Experimental determination of the functions of all these proteins would be a hugely time-consuming and costly task and, in most instances, has not been carried out. As an example, only approximately 20% of the annotated proteins in the Homo sapiens genome have currently been experimentally characterized. A commonly adopted procedure for annotating protein sequences relies on "inheritance through homology", based on the notion that similar sequences share similar functions and structures. This procedure consists in assigning sequences to a specific group of functionally related sequences, previously grouped by clustering techniques. The clustering procedure is based on suitable similarity rules, since predicting protein structure and function from sequence largely depends on the value of sequence identity. However, additional levels of complexity are due to multi-domain proteins, to proteins that share common domains but do not necessarily share the same function, and to the finding that different combinations of shared domains can lead to different biological roles. In the last part of this study I developed and validated a system that contributes to sequence annotation by taking advantage of a validated inheritance-based transfer procedure for molecular functions and structural templates. After a cross-genome comparison with the BLAST program, clusters were built on the basis of two stringent constraints on sequence identity and on the coverage of the alignment. The adopted measure explicitly addresses the problem of annotating multi-domain proteins and allows a fine-grained division of the whole set of proteomes used, which ensures cluster homogeneity in terms of sequence length. A high coverage of structural templates over the length of the protein sequences within clusters ensures that multi-domain proteins, when present, can serve as templates for sequences of similar length. This annotation procedure includes the possibility of reliably transferring statistically validated functions and structures to sequences, considering the information available in the present databases of molecular functions and structures.
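A minimal sketch of the clustering constraint described above: two sequences join the same cluster only if their BLAST alignment passes both a sequence-identity threshold and a coverage threshold on both sequences. The thresholds, field names and toy hits are illustrative assumptions; the thesis's exact values are not reproduced here.

```python
import networkx as nx

def passes(hit, min_identity=40.0, min_coverage=0.9):
    # Coverage is checked on BOTH query and subject, so that a single
    # shared domain in a multi-domain protein cannot link two clusters.
    cov_q = hit["align_len"] / hit["query_len"]
    cov_s = hit["align_len"] / hit["subject_len"]
    return hit["identity"] >= min_identity and min(cov_q, cov_s) >= min_coverage

def build_clusters(hits):
    # Single-linkage clusters: connected components of the graph whose
    # edges are the BLAST hits satisfying both constraints.
    g = nx.Graph()
    for h in hits:
        if passes(h):
            g.add_edge(h["query"], h["subject"])
    return list(nx.connected_components(g))

hits = [
    {"query": "P1", "subject": "P2", "identity": 62.0,
     "align_len": 280, "query_len": 300, "subject_len": 295},
    {"query": "P1", "subject": "P3", "identity": 35.0,   # fails identity
     "align_len": 290, "query_len": 300, "subject_len": 310},
]
print(build_clusters(hits))  # [{'P1', 'P2'}]
```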
Abstract:
This work dealt with the purification, heterologous expression, characterization, molecular analysis, mutation and crystallization of the enzyme vinorine synthase. The enzyme plays an important role in ajmaline biosynthesis, since in an acetyl-CoA-dependent reaction it catalyzes the conversion of the sarpagan alkaloid 16-epi-vellosimine to vinorine, forming the ajmalan skeleton. After purification of vinorine synthase from hybrid cell cultures of Rauvolfia serpentina/Rhazya stricta using five chromatographic separation methods (anion-exchange chromatography on SOURCE 30Q, hydrophobic-interaction chromatography on SOURCE 15PHE, chromatography on MacroPrep Ceramic Hydroxyapatite, anion-exchange chromatography on Mono Q, and size-exclusion chromatography on Superdex 75), vinorine synthase could be enriched 991-fold from 2 kg of cell-culture tissue. The SDS gel prepared after purification allowed a clear assignment of the protein band as vinorine synthase. Digestion of the enzyme band with the endoproteinase LysC and subsequent sequencing of the cleavage peptides yielded four peptide sequences. A database comparison (SwissProt) showed no homology to sequences of known plant enzymes. Using degenerate primers derived from one of the peptide fragments obtained and from a conserved region of known acetyltransferases, a first cDNA fragment of vinorine synthase was amplified. The nucleotide sequence was completed by RACE-PCR, yielding a full-length cDNA clone of 1263 bp encoding a protein of 421 amino acids (46 kDa). The vinorine synthase gene was ligated into the pQE2 expression vector, which encodes an N-terminal 6x His-tag, and the enzyme was then for the first time successfully expressed in E. coli at the mg scale and purified to homogeneity. The successful overexpression allowed a thorough characterization of vinorine synthase. The KM value for the substrate gardneral was determined as 20 µM and 41.2 µM, respectively, and Vmax was 1 pkat and 1.71 pkat, respectively. After successful cleavage of the His-tag, the kinetic parameters were determined again (KM 7.5 µM and 27.52 µM, Vmax 0.7 pkat and 1.21 pkat, respectively). The co-substrate shows a KM value of 60.5 µM (Vmax 0.6 pkat). Vinorine synthase has a temperature optimum of 35 °C and a pH optimum of 7.8. Homology comparisons with other enzymes showed that vinorine synthase belongs to a still small family of, so far, 10 acetyltransferases. In all enzymes of this family an HxxxD motif and a DFGWG motif are 100% conserved. Based on these homology comparisons and on inhibitor studies, 11 amino acids conserved in this protein family were exchanged for alanine in order to identify the residues of a catalytic triad (Ser/Cys-His-Asp) postulated in the literature. Mutating all the conserved serines and cysteines yielded no mutant with a complete loss of enzyme activity; only the mutations H160A and D164A resulted in a complete loss of activity. This result refutes the theory of a catalytic triad and shows that the residues H160 and D164 are the ones exclusively involved in the catalytic reaction. To verify these results and to fully elucidate the reaction mechanism, vinorine synthase was crystallized.
The crystals obtained so far (crystal size in µm: x 150, y 200, z 200) belong to space group P212121 (primitive orthorhombic) and diffract to 3.3 Å. Since no crystal structure of a protein homologous to vinorine synthase is available yet, the structure could not be fully solved. To solve the phase problem, the method of multiple-wavelength anomalous dispersion (MAD) is now being used in an attempt to elucidate the first crystal structure in this enzyme family.
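For orientation, the KM and Vmax values reported above are the parameters of the standard Michaelis–Menten rate law, a textbook relation not restated in the original abstract:

$$ v = \frac{V_{\max}\,[S]}{K_M + [S]} $$

where [S] is the substrate concentration (here, gardneral or the acetyl-CoA co-substrate) and KM is the concentration at which the rate v reaches half of Vmax.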
Abstract:
The research project presented here arises from the fruitful combination of theory and teaching practice in the spirit of action research. The aim of this work is to develop a teaching path for training in specialized translation in the medical-scientific, technical and economic-legal domains for the Spanish-Italian language pair, within the concrete institutional framework of the Italian university today. Our training proposal is based on three elements: a survey of the current translation market for the indicated language pair; the identification of training objectives according to the chosen model of translation competence; and the development of a competence-based teaching path built on the task-based approach (enfoque por tareas) to translation. In designing the teaching methods, two aspects define the proposed path: the concept of specialized text genre for translation, and information management through new technologies (corpora, terminological and phraseological databases, translation memories, controlled translation). This work is organized in two parts: the first part (four chapters) presents the theoretical framework within which the reflection on the didactics of specialized translation develops; the second part (two chapters) presents the methodological and analytical framework within which our teaching proposal is elaborated. The first chapter illustrates the relationships between translation and the professional world; the second chapter presents the concept of translation competence as a bridge between training and the world of professional translation; the third chapter retraces the main stages in the evolution of the didactics of general translation; the fourth chapter illustrates some of the most recent and comprehensive teaching proposals for specialized translation in the technical, medical-scientific and economic-legal domains. The fifth chapter introduces the concept of specialized text genre for translation, and the sixth chapter illustrates the teaching proposal for specialized translation from Spanish into Italian that motivated this work.
Abstract:
Dietary supplements (DS) are easily available and increasingly used, and adverse hepatic reactions have been reported following their intake. The aim of this review was to critically examine the literature on liver injury caused by DS, to delineate patterns and mechanisms of injury, and to increase awareness of this cause of acute and chronic liver damage. Studies and case reports on liver injury specifically attributed to DS published between 1990 and 2010 were searched in the PubMed and EMBASE databases using the terms 'dietary/nutritional supplements', 'adverse hepatic reactions', 'liver injury', 'hepatitis', 'liver failure', 'vitamin A' and 'retinoids', and were reviewed for further, previously unidentified publications. Significant liver injury was reported after intake of Herbalife and Hydroxycut products, tea extracts from Camellia sinensis, products containing usnic acid or high contents of vitamin A, anabolic steroids, and others. No uniform pattern of hepatotoxicity has been identified, and severity may range from asymptomatic elevations of serum liver enzymes to hepatic failure and death. Exact estimates of how frequently adverse hepatic reactions occur as a result of DS cannot be provided. Liver injury from DS mimicking other liver diseases is increasingly recognized. Measures to reduce risk include tighter regulation of their production and distribution and increased awareness among users and professionals of the potential risks.
Abstract:
Gap junctions are clustered channels between contacting cells through which direct intercellular communication via diffusion of ions and metabolites can occur. Two hemichannels, each built up of six connexin protein subunits in the plasma membranes of adjacent cells, can dock to each other to form conduits between cells. We have recently screened mouse and human genomic databases and have found 19 connexin (Cx) genes in the mouse genome and 20 connexin genes in the human genome. One mouse connexin gene and two human connexin genes do not appear to have orthologs in the other genome. With three exceptions, the characterized connexin genes comprise two exons, with the complete reading frame located on the second exon. Targeted ablation of eleven mouse connexin genes has revealed basic insights into the functional diversity of the connexin gene family. In addition, the phenotypes of human genetic disorders caused by mutated connexin genes further complement our understanding of connexin functions in the human organism. In this review we compare the currently identified connexin genes in the mouse and human genomes and discuss the functions of gap junctions deduced from targeted mouse mutants and human genetic disorders.
Abstract:
Objective. This study examines the structure, processes, and data necessary to assess the outcome variables, length of stay and total cost, for a pediatric practice guideline. The guideline was developed by a group of physicians and ancillary staff members representing the services that most commonly provide treatment for asthma patients at Texas Children's Hospital, as a means of standardizing care. Outcomes needed to be assessed to determine the practice guideline's effectiveness. Data sources and study design. Data for the study were collected retrospectively from multiple hospital databases and from inpatient chart reviews. All patients in this quasi-experimental study had a diagnosis of asthma (ICD-9-CM code 493.91) at the time of admission. The study examined data for 100 patients admitted between September 15, 1995 and November 15, 1995, whose physician had elected to apply the asthma practice guideline at the time of the patient's admission, and for 66 inpatients admitted in the same period whose physician elected not to apply it. The principal outcome variables were length of stay and cost. Principal findings. The mean length of stay was 2.3 days for the group in which the practice guideline was applied and 3.1 days for the comparison group, which did not receive care directed by the practice guideline; the difference was statistically significant (p = 0.008). There was no demonstrable difference in risk factors, health status, or quality of care between the groups. Although not statistically significant in the univariate analysis, private insurance showed a significant effect in the logistic regression models, with an elevated odds ratio (OR = 2.2 for a hospital stay ≤2 days, rising to OR = 4.7 for a hospital stay ≤3 days), indicating that patients with private insurance had a higher likelihood of a shorter hospital stay than patients with public insurance in each model. Public insurance included Medicaid, Medicare, and charity cases; private insurance included group, individual, and managed-care policies. The cost of an admission was significantly less for the group in which the practice guideline was applied, with a mean difference between the two groups of $1307 per patient. Conclusion. The implementation and utilization of a pediatric practice guideline for asthma inpatients at Texas Children's Hospital significantly reduced both the total cost and the length of the hospital stay for asthma patients admitted to the hospital.
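As a minimal sketch of how such odds ratios are typically estimated, the following fits a logistic regression of a short-stay indicator on an insurance indicator. The column names and the randomly generated data are purely illustrative stand-ins, not the study's data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
# Hypothetical data: one row per admission (names are illustrative).
df = pd.DataFrame({
    "short_stay": rng.binomial(1, 0.5, 166),         # 1 if stay <= 2 days
    "private_insurance": rng.binomial(1, 0.4, 166),  # 1 if privately insured
})
X = sm.add_constant(df[["private_insurance"]])
fit = sm.Logit(df["short_stay"], X).fit(disp=0)
or_hat = np.exp(fit.params["private_insurance"])
ci_low, ci_high = np.exp(fit.conf_int().loc["private_insurance"])
print(f"OR = {or_hat:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```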
Abstract:
The characterization of exoplanetary atmospheres has come of age in the last decade, as astronomical techniques now allow albedos, chemical abundances, temperature profiles and maps, rotation periods and even wind speeds to be measured. Atmospheric dynamics sets the background state of density, temperature and velocity that determines or influences the spectral and temporal appearance of an exoplanetary atmosphere. Hot exoplanets are most amenable to these characterization techniques; in the present review, we focus on highly irradiated, large exoplanets (the "hot Jupiters"), as astronomical data begin to confront theoretical questions. We summarize the basic atmospheric quantities inferred from the astronomical observations. We review the state of the art by addressing a series of current questions and look towards the future by considering a separate set of exploratory questions. Attaining the next level of understanding will require a concerted effort to construct multi-faceted, multi-wavelength datasets for benchmark objects. Understanding clouds presents a formidable obstacle, as they introduce degeneracies into the interpretation of spectra, yet their properties and existence are directly influenced by atmospheric dynamics. Confronting general circulation models with these multi-faceted, multi-wavelength datasets will help us understand these and other degeneracies. The coming decade will witness a decisive confrontation of theory and simulation by the next generation of astronomical data.
Abstract:
BACKGROUND AND PURPOSE: Previous studies have suggested that advanced age predicts worse outcome following mechanical thrombectomy. We assessed outcomes from 2 recent large prospective studies to determine the association among TICI, age, and outcome. MATERIALS AND METHODS: Data from the Solitaire FR Thrombectomy for Acute Revascularization (STAR) trial, an international multicenter prospective single-arm thrombectomy study, and from the Solitaire arm of the Solitaire FR With the Intention For Thrombectomy (SWIFT) trial were pooled. TICI was determined by core laboratory review. Good outcome was defined as an mRS score of 0-2 at 90 days. We analyzed the association among clinical outcome, successful-versus-unsuccessful reperfusion (TICI 2b-3 versus TICI 0-2a), and age (dichotomized across the median). RESULTS: Two hundred sixty-nine of 291 patients treated with Solitaire in the STAR and SWIFT databases for whom TICI and 90-day outcome data were available were included. The median age was 70 years (interquartile range, 60-76 years), with an age range of 25-88 years. The mean age was 59 years among patients 70 years of age or younger and 77 years among patients older than 70 years. There was no significant difference in baseline NIHSS scores or procedure time metrics. Hemorrhage and device-related complications were more common in the younger age group but did not reach statistical significance. In absolute terms, the rate of good outcome was higher in the younger population (64% versus 44%, P < .001). However, the magnitude of benefit from successful reperfusion was higher in the older group (OR, 4.82; 95% CI, 1.32-17.63 in patients 70 years or younger, versus OR, 7.32; 95% CI, 1.73-30.99 in patients older than 70 years). CONCLUSIONS: Successful reperfusion is the strongest predictor of good outcome following mechanical thrombectomy, and the magnitude of benefit is highest in the patient population older than 70 years of age.
Abstract:
In recent decades, a striking amount of hydrographic data, covering most of the Mediterranean basin, has been generated by the efforts made to characterize the oceanography and ecology of the basin. At the same time, improvements in technology, and the consequent refinement of sampling and analytical techniques, have provided data far more reliable than in the past. Nutrient data fall fully within this context, but suffer from having been produced by a large number of uncoordinated research programs and from often deficient quality control, with databases lacking intercalibration. In this study we present a computational procedure, based on robust statistical parameters and on the physical dynamic properties of the Mediterranean Sea and its morphological characteristics, to partially overcome these limits in the existing data sets. Through data pre-filtering based on outlier analysis, followed by shape analysis, the procedure flags inconsistent data and identifies, for each basin area, a characteristic set of shapes (vertical profiles). By rejecting all the profiles that do not match any of the identified shapes, the procedure retains the reliable profiles and allows us to obtain a data set that can be considered more internally consistent than the existing ones.
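A minimal sketch of the outlier pre-filtering step, using the modified z-score based on the median and the median absolute deviation (MAD), a common robust screen; this is an illustrative stand-in, not the paper's exact procedure, and the nutrient values are invented.

```python
import numpy as np

def mad_outliers(values, thresh=3.5):
    # Flag points whose modified z-score exceeds `thresh`; the 0.6745
    # factor makes the MAD comparable to a standard deviation.
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    if mad == 0:
        return np.zeros(values.shape, dtype=bool)
    return np.abs(0.6745 * (values - median) / mad) > thresh

# Example: nitrate concentrations (uM) along one vertical profile.
profile = [0.1, 0.2, 0.5, 1.8, 3.9, 45.0, 5.2, 5.6, 5.9]
print(mad_outliers(profile))  # flags the implausible 45.0 value
```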
Abstract:
A software package for simulating bruise occurrence in fruit grading lines, SIMLIN 2.0, is presented, together with application examples on the simulated handling of Sudanell peaches. SIMLIN 2.0 provides algorithms for selecting logistic bruise-prediction models fitted on the basis of user-designed laboratory tests. The handled fruits are characterized for simulation by means of statistical features of the independent variables of the logistic model. SIMLIN 2.0 can display different line designs, establishing their aggressiveness from internal databases; aggressiveness is characterized in terms of data gathered with IS-100-type electronic devices. The software provides graphical outputs that support decision making on line-improvement strategies and on the selection of the product to be handled.
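A minimal sketch of the kind of logistic bruise-prediction model the simulator selects: the probability of bruising as a logistic function of an impact variable. The coefficients and the impact-energy predictor are illustrative assumptions, not SIMLIN 2.0's fitted values.

```python
import numpy as np

def bruise_probability(impact_energy_mj, beta0=-4.0, beta1=0.05):
    # Logistic model: P(bruise) = 1 / (1 + exp(-(b0 + b1 * E))).
    # b0, b1 would be fitted from user-designed laboratory impact tests.
    z = beta0 + beta1 * np.asarray(impact_energy_mj, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))

# Example: bruise probability for impacts of 20, 80 and 150 mJ.
print(bruise_probability([20, 80, 150]).round(3))  # [0.047 0.5 0.971]
```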
Abstract:
Introduction: Previous systematic reviews of the literature on the effects of Tai Chi Chuan (TCC) on balance have focused either on determining the quality of the research design or have provided just a general description of the studies. To the best of our knowledge, none has approached this topic by analyzing the factors which affect balance. It is important to present this perspective, as it will help to guide future research in this field. Methodology: Seven electronic databases were searched for publications dated between 1996 and 2012. The inclusion criteria were: randomized controlled trials (RCT) written in English. Results: From a total of 397 articles identified, 27 randomized controlled trials were eligible for the analysis. Conclusions: The studies reviewed appear to confirm that TCC improves static and dynamic balance, as well as the functional factors which affect balance, in persons over 55 years of age. Only one study was identified on people affected by problems with the vestibular system, and no studies were identified on the influence of TCC on balance in individuals suffering from deteriorated brain function.
Abstract:
This Final Degree Project is framed within the activities of the GRyS (Grupo de Redes y Servicios de Próxima Generación) on Smart Grids. Current research on Smart Grids aims to achieve the following objectives:
- To effectively integrate renewable energy sources.
- To increase management efficiency by dynamically matching demand and supply.
- To reduce carbon emissions by giving priority to green energy sources.
- To raise energy-consumption awareness by monitoring products and services.
- To stimulate the development of a leading-edge market for energy-efficient technologies with new business models.
Within the context of Smart Grids, the interest of the GRyS basically extends to the creation of semantic middleware and related technologies, such as service ontologies and semantic databases.
The objective of this Final Degree Project has been to design and develop an application for devices running the Android operating system, which implements a graphical interface and the methods needed to obtain and represent service-registry information from a Service-Oriented Architecture (SOA) platform. The application allows users to:
- Represent information related to services and devices registered in a Smart Grid.
- Save, load and share HTML files with the above information by email.
- Represent the location of devices on a map.
- Represent measurements (voltage, temperature, etc.) in real time.
- Apply filters by device id, model or manufacturer.
- Query semantic databases using SPARQL.
- Save and load SPARQL queries in text files stored on the SD card.
The application, developed in Java, is open source and uses open, standard technologies such as HTML, XML, SPARQL and RESTful services. It has been tested in a real environment using the infrastructure of the European project e-Gotham (Sustainable-Smart Grid Open System for the Aggregated Control, Monitoring and Management of Energy), in which 17 partners from 5 countries participate: Spain, Italy, Estonia, Finland and Norway. This report details the study of the state of the art and the technologies used in the development of the project; the implementation, design and architecture of the application; and the tests performed and the results obtained.
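A minimal sketch of the kind of SPARQL query the application issues against a semantic database, shown here in Python with SPARQLWrapper for brevity; the endpoint URL and the sg: vocabulary are hypothetical placeholders, not the actual e-Gotham registry.

```python
from SPARQLWrapper import SPARQLWrapper, JSON  # pip install sparqlwrapper

endpoint = SPARQLWrapper("http://example.org/smartgrid/sparql")  # hypothetical
endpoint.setQuery("""
    PREFIX sg: <http://example.org/smartgrid#>
    SELECT ?device ?manufacturer WHERE {
        ?device a sg:Device ;
                sg:manufacturer ?manufacturer .
    }
    LIMIT 10
""")
endpoint.setReturnFormat(JSON)
results = endpoint.query().convert()
for row in results["results"]["bindings"]:
    print(row["device"]["value"], row["manufacturer"]["value"])
```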
Abstract:
Accurate forecasting in the field of transportation logistics is of vital importance for suitable planning and optimization of the necessary means and resources. Up to now, port planning studies have mainly relied on empirical models, used to plan new terminals and develop master plans when no initial data are available; on analytical models, more connected to queuing theory and waiting times, whose complex mathematical formulations require significant simplifications to keep the model practical and easy to handle; or on simulation models, which require a significant investment in computer codes and complex developments to produce acceptable results.
Data Mining (DM) is a modern interdisciplinary field that encompasses techniques that operate automatically (almost no human intervention is required) and are highly efficient when dealing with practical problems characterized by huge databases containing significant amounts of information. The practical application of these disciplines extends to many commercial and research fields, dealing with forecasting, classification or diagnosis problems. Among the different Data Mining techniques, Artificial Neural Networks (ANN) and probabilistic, or Bayesian, networks (BN) allow the joint modeling of all the information relevant to a given problem. This PhD work analyzes their application to two practical cases in the ports field, specifically container terminals. It details how ANN have been developed as a tool to produce traffic and resource forecasts for several ports from exploitation variables, yielding continuous values. For the Bayesian networks (BN), a similar development has been carried out, yielding discrete values (an interval). The main finding is the possibility of using both ANN and BN to estimate the future needs of a port's or terminal's physical parameters, as well as the relationships between them within a specific terminal, which allows a correct assignment of the necessary means and thus an increase in the terminal's productive efficiency. As a final step, a short-term complementarity study of both models is carried out, confirming the good agreement of the results obtained. It can thus be stated that these prediction methods can be a very useful tool in port planning.
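A minimal sketch of the ANN forecasting idea described above: predicting a terminal's future traffic from exploitation variables with a small feed-forward network. The variables, synthetic data and network size are illustrative assumptions; the thesis's actual inputs and models are not reproduced here.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Hypothetical exploitation variables: berth length (m), cranes, year index.
X = rng.uniform([500, 2, 0], [2000, 12, 20], size=(200, 3))
# Synthetic traffic target (thousand TEU) with noise, for demonstration only.
y = 0.0008 * X[:, 0] + 0.12 * X[:, 1] + 0.035 * X[:, 2] + rng.normal(0, 0.05, 200)

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)
model.fit(X[:150], y[:150])                # train on the first 150 cases
print(model.predict(X[150:155]).round(2))  # continuous-valued forecasts
```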
Abstract:
Part of current biomedical research is focused on the analysis of heterogeneous data. These data may differ in origin, structure and semantics. A large quantity of data of interest to researchers is contained in public databases, which gather information from different sources and make it freely available to the community. In order to homogenize these public data sources with others of private origin, there are tools and techniques that allow the processes of integrating heterogeneous data to be automated.
The Biomedical Informatics Group (GIB) of the Universidad Politécnica de Madrid cooperates in the European project P-medicine, whose main purpose is to create an infrastructure and models to facilitate the transition from current medical practice to personalized medicine. One of the project tasks the group is in charge of is the development of tools that help users integrate data from diverse sources. Some of these sources are biomedical public databases from the NCBI (National Center for Biotechnology Information) platform. One of the tools the group is currently working on for the integration of data sources is called Ontology Annotator. In this tool there is a phase in which the user has to retrieve information from a public database and manually select the relevant data it contains. To automate this process of searching and selecting, there is, on the one hand, an interest in automatically generating queries that lead to results as precise as possible; on the other hand, there is an interest in retrieving relevant information from large quantities of documents, which requires systems that analyze and weigh the data in order to locate the relevant items. Within the computer-science field of artificial intelligence, in the branch of information retrieval, there are diverse studies on query expansion from relevance feedback that could be used to solve this problem. Their main purpose is to obtain a set of results as close as possible to the information the user really wants to retrieve; to this end, different techniques reformulate or expand the initial query, using as feedback the results that were relevant to the user in a first pass, so that the new result set is closer to what the user actually desires. The goal of this final dissertation project is the study, implementation and experimentation of methods that automate the extraction of relevant information from documents, using it to expand or reformulate queries and thereby improve the precision and ranking of the associated results. These methods will be integrated into the Ontology Annotator tool and will focus on the PubMed data source.
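A minimal sketch of Rocchio-style query expansion from relevance feedback, the family of techniques discussed above; the documents, query and weights are illustrative, not the Ontology Annotator's actual implementation or PubMed data.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "gene expression in cancer cells",
    "protein structure prediction methods",
    "cancer cell gene regulation pathways",
]
relevant = [0, 2]        # documents the user marked as relevant
query = "cancer gene"

vec = TfidfVectorizer()
D = vec.fit_transform(docs).toarray()
q = vec.transform([query]).toarray()[0]

alpha, beta = 1.0, 0.75  # classic Rocchio weights (positive feedback only)
q_new = alpha * q + beta * D[relevant].mean(axis=0)

# Highest-weighted terms of the expanded query vector:
terms = np.array(vec.get_feature_names_out())
print(terms[np.argsort(q_new)[::-1][:5]])
```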