15 results for HTLV-1 database
in Aston University Research Archive
Abstract:
Derivational morphology proposes meaningful connections between words and is largely unrepresented in lexical databases. This thesis presents a project to enrich a lexical database with morphological links and to evaluate their contribution to disambiguation. A lexical database with sense distinctions was required. WordNet was chosen because of its free availability and widespread use. Its suitability was assessed through critical evaluation with respect to specifications and criticisms, using a transparent, extensible model. The identification of serious shortcomings suggested a portable enrichment methodology, applicable to alternative resources. Although 40% of the most frequent words are prepositions, they have been largely ignored by computational linguists, so addition of prepositions was also required. The preferred approach to morphological enrichment was to infer relations from phenomena discovered algorithmically. Both existing databases and existing algorithms can capture regular morphological relations, but cannot capture exceptions correctly; neither provides any semantic information. Some morphological analysis algorithms are subject to the fallacy that morphological analysis can be performed simply by segmentation. Morphological rules, grounded in observation and etymology, govern associations between and attachment of suffixes and contribute to defining the meaning of morphological relationships. Specifying character substitutions circumvents the segmentation fallacy. Morphological rules are prone to undergeneration, minimised through a variable lexical validity requirement, and overgeneration, minimised by rule reformulation and restricting monosyllabic output. Rules take into account the morphology of ancestor languages through co-occurrences of morphological patterns. Multiple rules applicable to an input suffix need their precedence established.
The resistance of prefixations to segmentation has been addressed by identifying linking vowel exceptions and irregular prefixes. The automatic affix discovery algorithm applies heuristics to identify meaningful affixes and is combined with morphological rules into a hybrid model, fed only with empirical data, collected without supervision. Further algorithms apply the rules optimally to automatically pre-identified suffixes and break words into their component morphemes. To handle exceptions, stoplists were created in response to initial errors and fed back into the model through iterative development, leading to 100% precision, contestable only on lexicographic criteria. Stoplist length is minimised by special treatment of monosyllables and reformulation of rules. 96% of words and phrases are analysed. 218,802 directed derivational links have been encoded in the lexicon rather than the wordnet component of the model because the lexicon provides the optimal clustering of word senses. Both links and analyser are portable to an alternative lexicon. The evaluation uses the extended gloss overlaps disambiguation algorithm. The enriched model outperformed WordNet in terms of recall without loss of precision. Failure of all experiments to outperform disambiguation by frequency reflects on WordNet sense distinctions.
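As a rough illustration of the character-substitution idea described in the abstract above (not the thesis's actual rule set or algorithm), the following Python sketch treats each rule as a suffix replacement and accepts a derivation only if the resulting base form exists in a lexicon. The rules, relation labels, toy lexicon and function name are all hypothetical.

```python
from typing import NamedTuple, Optional, Set, Tuple, List


class SuffixRule(NamedTuple):
    source: str    # suffix found on the derived word
    target: str    # characters substituted for it to obtain the candidate base
    relation: str  # hypothetical label for the derivational link


# Hypothetical rules, for illustration only.
RULES: List[SuffixRule] = [
    SuffixRule("ification", "ify", "nominalisation"),
    SuffixRule("ation", "ate", "nominalisation"),
    SuffixRule("iness", "y", "quality-of"),
]

# Toy lexicon standing in for a real lexical database.
LEXICON: Set[str] = {
    "simplify", "simplification", "create", "creation",
    "happy", "happiness", "nation",
}


def derive_base(word: str,
                rules: List[SuffixRule] = RULES,
                lexicon: Set[str] = LEXICON) -> Optional[Tuple[str, str]]:
    """Return (base, relation) for the first rule whose output is a known word.

    Longer suffixes are tried first as a simple stand-in for rule precedence.
    The lexicon check plays the role of a lexical validity requirement: it
    rejects outputs such as "nate" for "nation".
    """
    for rule in sorted(rules, key=lambda r: len(r.source), reverse=True):
        if word.endswith(rule.source) and len(word) > len(rule.source):
            candidate = word[: -len(rule.source)] + rule.target
            if candidate in lexicon:
                return candidate, rule.relation
    return None
```

Substituting characters rather than merely cutting at a boundary is what lets "happiness" map to "happy" instead of the non-word "happi"; a stoplist of exceptions, as the abstract mentions, would be layered on top of a check like this.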
Abstract:
The continuing threat of infectious disease and future pandemics, coupled to the continuous increase of drug-resistant pathogens, makes the discovery of new and better vaccines imperative. For effective vaccine development, antigen discovery and validation is a prerequisite. The compilation of information concerning pathogens, virulence factors and antigenic epitopes has resulted in many useful databases. However, most such immunological databases focus almost exclusively on antigens where epitopes are known and ignore those for which epitope information was unavailable. We have compiled more than 500 antigens into the AntigenDB database, making use of the literature and other immunological resources. These antigens come from 44 important pathogenic species. In AntigenDB, a database entry contains information regarding the sequence, structure, origin, etc. of an antigen with additional information such as B and T-cell epitopes, MHC binding, function, gene-expression and post translational modifications, where available. AntigenDB also provides links to major internal and external databases. We shall update AntigenDB on a rolling basis, regularly adding antigens from other organisms and extra data analysis tools. AntigenDB is available freely at http://www.imtech.res.in/raghava/antigendb and its mirror site http://www.bic.uams.edu/raghava/antigendb.
Abstract:
Computer-Based Learning systems of one sort or another have been in existence for almost 20 years, but they have yet to achieve real credibility within Commerce, Industry or Education. A variety of reasons could be postulated for this, typically cost, complexity, inefficiency, inflexibility and tedium. Obviously different systems deserve different levels and types of criticism, but it still remains true that Computer-Based Learning (CBL) is falling significantly short of its potential. Experience of a small, but highly successful CBL system within a large, geographically distributed industry (the National Coal Board) prompted an investigation into currently available packages, the original intention being to purchase the most suitable software and run it on existing computer hardware, alongside existing software systems. It became apparent that none of the available CBL packages were suitable, and a decision was taken to develop an in-house Computer-Assisted Instruction system according to the following criteria: cheap to run; easy to author course material; easy to use; requires no computing knowledge to use (as either an author or student); efficient in the use of computer resources; has a comprehensive range of facilities at all levels. This thesis describes the initial investigation, resultant observations and the design, development and implementation of the SCHOOL system. One of the principal characteristics of SCHOOL is that it uses a hierarchical database structure for the storage of course material, thereby inherently providing a great deal of the power, flexibility and efficiency originally required. Trials using the SCHOOL system on IBM 303X series equipment are also detailed, along with proposed and current development work on what is essentially an operational CBL system within a large-scale Industrial environment.
Abstract:
This thesis describes the development of a complete data visualisation system for large tabular databases, such as those commonly found in a business environment. A state-of-the-art 'cyberspace cell' data visualisation technique was investigated and a powerful visualisation system using it was implemented. Although allowing databases to be explored and conclusions drawn, it had several drawbacks, the majority of which were due to the three-dimensional nature of the visualisation. A novel two-dimensional generic visualisation system, known as MADEN, was then developed and implemented, based upon a 2-D matrix of 'density plots'. MADEN allows an entire high-dimensional database to be visualised in one window, while permitting close analysis in 'enlargement' windows. Selections of records can be made and examined, and dependencies between fields can be investigated in detail. MADEN was used as a tool for investigating and assessing many data processing algorithms, firstly data-reducing (clustering) methods, then dimensionality-reducing techniques. These included a new 'directed' form of principal components analysis, several novel applications of artificial neural networks, and discriminant analysis techniques which illustrated how groups within a database can be separated. To illustrate the power of the system, MADEN was used to explore customer databases from two financial institutions, resulting in a number of discoveries which would be of interest to a marketing manager. Finally, the database of results from the 1992 UK Research Assessment Exercise was analysed. Using MADEN allowed both universities and disciplines to be graphically compared, and supplied some startling revelations, including empirical evidence of the 'Oxbridge factor'.
Abstract:
Database systems have a user interface, one component of which is normally a query language based on a particular data model. Typically data models provide primitives to define, manipulate and query databases. Often these primitives are designed to form self-contained query languages. This thesis describes a prototype implementation of a system which allows users to specify queries against the database in a query language whose primitives are not those provided by the actual model on which the database system is based, but those provided by a different data model. The implementation chosen is the Functional Query Language Front End (FQLFE). This uses the Daplex functional data model and query language. Using FQLFE, users can specify the underlying database (based on the relational model) in terms of Daplex. Queries against this specified view can then be made in Daplex. FQLFE transforms these queries into the query language (Quel) of the underlying target database system (Ingres). The automation of part of the Daplex function definition phase is also described and its implementation discussed.
Abstract:
An implementation of a Lexical Functional Grammar (LFG) natural language front-end to a database is presented, and its capabilities demonstrated by reference to a set of queries used in the Chat-80 system. The potential of LFG for such applications is explored. Other grammars previously used for this purpose are briefly reviewed and contrasted with LFG. The basic LFG formalism is fully described, both as to its syntax and semantics, and the deficiencies of the latter for database access application shown. Other current LFG implementations are reviewed and contrasted with the LFG implementation developed here specifically for database access. The implementation described here allows a natural language interface to a specific Prolog database to be produced from a set of grammar rule and lexical specifications in an LFG-like notation. In addition to this the interface system uses a simple database description to compile metadata about the database for later use in planning the execution of queries. Extensions to LFG's semantic component are shown to be necessary to produce a satisfactory functional analysis and semantic output for querying a database. A diverse set of natural language constructs are analysed using LFG and the derivation of Prolog queries from the F-structure output of LFG is illustrated. The functional description produced from LFG is proposed as sufficient for resolving many problems of quantification and attachment.
Abstract:
This thesis presents a new approach to designing large organizational databases. The approach emphasizes the need for a holistic approach to the design process. The development of the proposed approach was based on a comprehensive examination of the issues of relevance to the design and utilization of databases. Such issues include conceptual modelling, organization theory, and semantic theory. The conceptual modelling approach presented in this thesis is developed over three design stages, or model perspectives. In the semantic perspective, concept definitions were developed based on established semantic principles. Such definitions rely on meaning - provided by intension and extension - to determine intrinsic conceptual definitions. A tool, called meaning-based classification (MBC), is devised to classify concepts based on meaning. Concept classes are then integrated using concept definitions and a set of semantic relations which rely on concept content and form. In the application perspective, relationships are semantically defined according to the application environment. Relationship definitions include explicit relationship properties and constraints. The organization perspective introduces a new set of relations specifically developed to maintain conformity of conceptual abstractions with the nature of information abstractions implied by user requirements throughout the organization. Such relations are based on the stratification of work hierarchies, defined elsewhere in the thesis. Finally, an example of an application of the proposed approach is presented to illustrate the applicability and practicality of the modelling approach.
Abstract:
We have developed a novel multilocus sequence typing (MLST) scheme and database (http://pubmlst.org/pacnes/) for Propionibacterium acnes based on the analysis of seven core housekeeping genes. The scheme, which was validated against previously described antibody, single locus and random amplification of polymorphic DNA typing methods, displayed excellent resolution and differentiated 123 isolates into 37 sequence types (STs). An overall clonal population structure was detected with six eBURST groups representing the major clades I, II and III, along with two singletons. Two highly successful and global clonal lineages, ST6 (type IA) and ST10 (type IB1), representing 64% of this current MLST isolate collection were identified. The ST6 clone and closely related single locus variants, which comprise a large clonal complex CC6, dominated isolates from patients with acne, and were also significantly associated with ophthalmic infections. Our data therefore support an association between acne and P. acnes strains from the type IA cluster and highlight the role of a widely disseminated clonal genotype in this condition. Characterization of type I cell surface-associated antigens that are not detected in ST10 or strains of type II and III identified two dermatan-sulphate-binding proteins with putative phase/antigenic variation signatures. We propose that the expression of these proteins by type IA organisms contributes to their role in the pathophysiology of acne and helps explain the recurrent nature of the disease. The MLST scheme and database described in this study should provide a valuable platform for future epidemiological and evolutionary studies of P. acnes.
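To make the idea of a sequence type concrete, here is a minimal Python sketch under stated assumptions: each isolate is reduced to a tuple of seven allele numbers (one per housekeeping locus), and identical tuples share a sequence type. The locus placeholders, allele numbers and ST numbering below are invented for illustration; in a real MLST scheme, allele and ST identifiers are assigned by the curated database, not in order of first appearance.

```python
from typing import Dict, Tuple

# Placeholder locus names; the abstract does not list the seven genes used.
LOCI = ("locus1", "locus2", "locus3", "locus4", "locus5", "locus6", "locus7")

Profile = Tuple[int, ...]  # one allele number per locus, in LOCI order


def assign_sequence_types(isolates: Dict[str, Profile]) -> Dict[str, int]:
    """Give every distinct allelic profile its own sequence type number.

    Numbers are handed out in order of first appearance, which is an
    assumption of this sketch rather than how a curated scheme works.
    """
    st_of_profile: Dict[Profile, int] = {}
    st_of_isolate: Dict[str, int] = {}
    for name, profile in isolates.items():
        if len(profile) != len(LOCI):
            raise ValueError(f"{name}: expected {len(LOCI)} alleles")
        if profile not in st_of_profile:
            st_of_profile[profile] = len(st_of_profile) + 1
        st_of_isolate[name] = st_of_profile[profile]
    return st_of_isolate


def is_single_locus_variant(a: Profile, b: Profile) -> bool:
    """True if two profiles differ at exactly one locus."""
    return sum(x != y for x, y in zip(a, b)) == 1


# Invented example data.
example = {
    "isolate_A": (1, 1, 1, 1, 1, 1, 1),
    "isolate_B": (1, 1, 1, 1, 1, 1, 1),
    "isolate_C": (1, 1, 2, 1, 1, 1, 1),
}
sequence_types = assign_sequence_types(example)
```

Grouping profiles that are single locus variants of one another is the basic step behind clonal complexes such as the CC6 complex mentioned in the abstract, although eBURST itself applies further rules not shown here.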
Abstract:
This paper discusses the use of a Model developed by Aston Business School to record the work load of its academic staff. By developing a database to register annual activity in all areas of teaching, administration and research the School has created a flexible tool which can be used for facilitating both day-to-day managerial and longer term strategic decisions. This paper gives a brief outline of the Model and discusses the factors which were taken into account when setting it up. Particular attention is paid to the uses made of the Model and the problems encountered in developing it. The paper concludes with an appraisal of the Model’s impact and of additional developments which are currently being considered. Aston Business School has had a Load Model in some form for many years. The Model has, however, been refined over the past five years, so that it has developed into a form which can be used for a far greater number of purposes within the School. The Model is coordinated by a small group of academic and administrative staff, chaired by the Head of the School. This group is responsible for the annual cycle of collecting and inputting data, validating returns, carrying out analyses of the raw data, and presenting the material to different sections of the School. The authors of this paper are members of this steering group.
Abstract:
The Teallach project has adapted model-based user-interface development techniques to the systematic creation of user-interfaces for object-oriented database applications. Model-based approaches aim to provide designers with a more principled approach to user-interface development using a variety of underlying models, and tools which manipulate these models. Here we present the results of the Teallach project, describing the tools developed and the flexible design method supported. Distinctive features of the Teallach system include provision of database-specific constructs, comprehensive facilities for relating the different models, and support for a flexible design method in which models can be constructed and related by designers in different orders and in different ways, to suit their particular design rationales. The system then creates the desired user-interface as an independent, fully functional Java application, with automatically generated help facilities.
Abstract:
A protein's isoelectric point or pI corresponds to the solution pH at which its net surface charge is zero. Since the early days of solution biochemistry, the pI has been recorded and reported, and thus literature reports of pI abound. The Protein Isoelectric Point database (PIP-DB) has collected and collated these data to provide an increasingly comprehensive database for comparison and benchmarking purposes. A web application has been developed to warehouse this database and provide public access to this unique resource. PIP-DB is a web-enabled SQL database with an HTML GUI front-end. PIP-DB is fully searchable across a range of properties.
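The definition of pI given above can be illustrated with a simple sequence-based estimate. The Python sketch below is not how PIP-DB values were obtained (those are experimental values collated from the literature); it assumes a Henderson-Hasselbalch model in which every ionizable group titrates independently, and the pKa values are approximate textbook figures chosen only for illustration. It also ignores structure, so it estimates overall net charge rather than surface charge specifically.

```python
# Approximate, illustrative pKa values (assumptions of this sketch).
POSITIVE_PKA = {"N_TERM": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"C_TERM": 2.0, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def net_charge(sequence: str, ph: float) -> float:
    """Estimated net charge of a peptide at a given pH."""
    # One free N-terminus and one free C-terminus per chain.
    charge = 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA["N_TERM"]))
    charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA["C_TERM"] - ph))
    for residue in sequence.upper():
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence: str) -> float:
    """Find the pH at which net_charge is zero by bisection on [0, 14].

    The modelled charge decreases monotonically with pH, so bisection
    converges to the single zero crossing.
    """
    low, high = 0.0, 14.0
    for _ in range(60):
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0
```

Estimates of this kind are exactly what a benchmarking resource such as PIP-DB can be used to check against measured values.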
Abstract:
Introduction-The design of the UK MPharm curriculum is driven by the Royal Pharmaceutical Society of Great Britain (RPSGB) accreditation process and the EU directive (85/432/EEC).[1] Although the RPSGB is informed about teaching activity in UK Schools of Pharmacy (SOPs), there is no database which aggregates information to provide the whole picture of pharmacy education within the UK. The aim of the teaching, learning and assessment study [2] was to document and map current programmes in the 16 established SOPs. Recent developments in programme delivery have resulted in a focus on deep learning (for example, through problem based learning approaches) and on being more student centred and less didactic through lectures. The specific objectives of this part of the study were (a) to quantify the content and modes of delivery of material as described in course documentation and (b) having categorised the range of teaching methods, ask students to rate how important they perceived each one for their own learning (using a three point Likert scale: very important, fairly important or not important). Material and methods-The study design compared three datasets: (1) quantitative course document review, (2) qualitative staff interview and (3) quantitative student self completion survey. All 16 SOPs provided a set of their undergraduate course documentation for the year 2003/4. The documentation variables were entered into Excel tables. A self-completion questionnaire was administered to all year four undergraduates, using a pragmatic mixture of methods, (n=1847) in 15 SOPs within Great Britain. The survey data were analysed (n=741) using SPSS, excluding non-UK students who may have undertaken part of their studies within a non-UK university. Results and discussion-Interviews showed that individual teachers and course module leaders determine the choice of teaching methods used. 
Content review of the documentary evidence showed that 51% of the taught element of the course was delivered using lectures, 31% using practicals (includes computer aided learning) and 18% small group or interactive teaching. There was high uniformity across the schools for the first three years; variation in the final year was due to the project. The average number of hours per year across 15 schools (data for one school were not available) was: year 1: 408 hours; year 2: 401 hours; year 3: 387 hours; year 4: 401 hours. The survey showed that students perceived lectures to be the most important method of teaching after dispensing or clinical practicals. Taking the very important rating only: 94% (n=694) dispensing or clinical practicals; 75% (n=558) lectures; 52% (n=386) workshops, 50% (n=369) tutorials, 43% (n=318) directed study. Scientific laboratory practices were rated very important by only 31% (n=227). The study shows that teaching of pharmacy to undergraduates in the UK is still essentially didactic through a high proportion of formal lectures and with high levels of staff-student contact. Schools consider lectures still to be the most cost effective means of delivering the core syllabus to large cohorts of students. However, this does limit the scope for any optionality within teaching, the scope for small group work is reduced as is the opportunity to develop multi-professional learning or practice placements. Although novel teaching and learning techniques such as e-learning have expanded considerably over the past decade, schools of pharmacy have concentrated on lectures as the best way of coping with the huge expansion in student numbers. References [1] Council Directive. Concerning the coordination of provisions laid down by law, regulation or administrative action in respect of certain activities in the field of pharmacy. Official Journal of the European Communities 1985;85/432/EEC. [2] Wilson K, Jesson J, Langley C, Clarke L, Hatfield K. 
MPharm Programmes: Where are we now? Report commissioned by the Pharmacy Practice Research Trust, 2005.
Abstract:
The Protein pKa Database (PPD) v1.0 provides a compendium of protein residue-specific ionization equilibria (pKa values), as collated from the primary literature, in the form of a web-accessible PostgreSQL relational database. Ionizable residues play key roles in the molecular mechanisms that underlie many biological phenomena, including protein folding and enzyme catalysis. The PPD serves as a general protein pKa archive and as a source of data that allows for the development and improvement of pKa prediction systems. The database is accessed through an HTML interface, which offers two fast, efficient search methods: an amino acid-based query and a Basic Local Alignment Search Tool search. Entries also give details of experimental techniques and links to other key databases, such as National Center for Biotechnology Information and the Protein Data Bank, providing the user with considerable background information.
Abstract:
The IUPHAR database (IUPHAR-DB) integrates peer-reviewed pharmacological, chemical, genetic, functional and anatomical information on the 354 nonsensory G protein-coupled receptors (GPCRs), 71 ligand-gated ion channel subunits and 141 voltage-gated-like ion channel subunits encoded by the human, rat and mouse genomes. These genes represent the targets of approximately one-third of currently approved drugs and are a major focus of drug discovery and development programs in the pharmaceutical industry. IUPHAR-DB provides a comprehensive description of the genes and their functions, with information on protein structure and interactions, ligands, expression patterns, signaling mechanisms, functional assays and biologically important receptor variants (e.g. single nucleotide polymorphisms and splice variants). In addition, the phenotypes resulting from altered gene expression (e.g. in genetically altered animals or in human genetic disorders) are described. The content of the database is peer reviewed by members of the International Union of Basic and Clinical Pharmacology Committee on Receptor Nomenclature and Drug Classification (NC-IUPHAR); the data are provided through manual curation of the primary literature by a network of over 60 subcommittees of NC-IUPHAR. Links to other bioinformatics resources, such as NCBI, Uniprot, HGNC and the rat and mouse genome databases are provided. IUPHAR-DB is freely available at http://www.iuphar-db.org. © 2008 The Author(s).
Abstract:
Objective: Vomiting in pregnancy is a common condition affecting 80% of pregnant women. Hyperemesis is at one end of the spectrum, seen in 0.5–2% of the pregnant population. Known factors such as nulliparity, younger age and high body mass index are associated with an increased risk of this condition in the first trimester. Late pregnancy complications attributable to hyperemesis, the pathogenesis of which is poorly understood, have not been studied in large population-based studies in the United Kingdom. The objective of this study was to determine a plausible association between hyperemesis and pregnancy complications, such as pregnancy-related hypertension, gestational diabetes and liver problems in pregnancy, and the rates of elective (ElCS) and emergency caesarean section (EmCS). Methods: Using a database based on ICD-10 classification, anonymised data of admissions to a large multi-ethnic hospital in Manchester, UK between 2000 and 2012 were examined. Notwithstanding the obvious limitations with hospital database-based research, this large volume of datasets allows powerful studies of disease trends and complications. Results: Between 2000 and 2012, 156 507 women aged 45 or under were admitted to hospital. Of these, 1111 women were coded for hyperemesis (0.4%). A greater proportion of women with hyperemesis than without hyperemesis were coded for hypertensive disorders in pregnancy such as pregnancy-induced hypertension, pre-eclampsia and eclampsia (2.7% vs 1.5%; P=0.001). The proportion of gestational diabetes and liver disorders in pregnancy was similar for both groups (diabetes: 0.5% vs. 0.4%; P=0.945, liver disorders: 0.2% vs. 0.1%; P=0.662). Hyperemesis patients had a higher proportion of elective and emergency caesarean sections compared with the non-hyperemesis group (ElCS: 3.3% vs. 2%; P=0.002, EmCS: 5% vs. 3%; P=0.00).
Conclusions: There was a higher rate of emergency and elective caesarean section in women with hyperemesis, which could reflect the higher prevalence of pregnancy-related hypertensive disorders (but not diabetes or liver disorders) in this group. The factors contributing to the higher prevalence of hypertensive disorders are not known, but these findings lead us to question whether there is a similar pathogenesis in the development of both the conditions and hence whether further study in this area is warranted.
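The group comparisons reported above are differences between two proportions. The abstract does not say which statistical test the authors used, so the Python sketch below simply shows one common choice, a pooled two-proportion z-test. The event counts are not given in the abstract; the ones used here are approximations back-calculated from the rounded percentages for hypertensive disorders (2.7% of 1111 and 1.5% of the remaining 155 396 women) and are an assumption of this example, not study data.

```python
import math
from typing import Tuple


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float]:
    """Pooled two-proportion z-test.

    x1, x2 are event counts and n1, n2 are group sizes.
    Returns (z statistic, two-sided p-value under the normal approximation).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Group sizes from the abstract: 1111 with hyperemesis out of 156 507 women.
n_hyperemesis = 1111
n_other = 156507 - 1111

# Approximate counts reconstructed from rounded percentages (assumption).
hypertensive_hyperemesis = 30   # about 2.7% of 1111
hypertensive_other = 2331       # about 1.5% of 155 396

z_stat, p_val = two_proportion_z_test(
    hypertensive_hyperemesis, n_hyperemesis, hypertensive_other, n_other
)
```

Because the counts are reconstructed from rounded figures, the resulting p-value should be read only as a plausibility check on the order of magnitude, not as a reproduction of the published analysis.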