992 results for Database accession number
Abstract:
Background The 'database search problem', that is, the strengthening of a case - in terms of probative value - against an individual who is found as a result of a database search, has been approached during the last two decades with substantial mathematical analyses, accompanied by lively debate and diametrically opposed conclusions. This represents a challenging obstacle in teaching and also hinders a balanced and coherent discussion of the topic within the wider scientific and legal community. This paper revisits and tracks the associated mathematical analyses in terms of Bayesian networks. Their derivation and discussion for capturing the probabilistic arguments that explain the database search problem are outlined in detail. The resulting Bayesian networks offer a distinct, clearer view on the main debated issues. Methods As a general framework for representing and analyzing formal arguments in probabilistic reasoning about uncertain target propositions (that is, whether or not a given individual is the source of a crime stain), this paper relies on graphical probability models, in particular, Bayesian networks. This graphical probability modeling approach is used to capture, within a single model, a series of key variables, such as the number of individuals in a database, the size of the population of potential crime stain sources, and the rarity of the corresponding analytical characteristics in a relevant population. Results This paper demonstrates the feasibility of deriving Bayesian network structures for analyzing, representing, and tracking the database search problem. The output of the proposed models can be shown to agree with existing, exclusively formulaic approaches. Conclusions The proposed Bayesian networks allow one to capture and analyze the currently best-supported but reputedly counter-intuitive and difficult solution to the database search problem in a way that goes beyond traditional, purely formulaic expressions.
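The best-supported solution referred to in the conclusions can be illustrated numerically. The following is a minimal sketch under assumptions of our own (a flat prior over N potential sources, a database of n profiles in which only the suspect matches and the other n - 1 members are excluded, random match probability gamma, and no typing error); it is not one of the paper's Bayesian networks:

```python
# Minimal toy model (assumed, not the paper's Bayesian networks):
# posterior probability that the matching individual is the crime stain
# source after a database search that excluded the other database members.
def posterior_source(N, n, gamma):
    """N: potential sources; n: database size (n - 1 excluded, 1 match);
    gamma: random match probability. Assumes flat priors and no error."""
    return 1.0 / (1.0 + (N - n) * gamma)

p_search = posterior_source(1_000_000, 10_000, 1e-6)  # search excluded 9,999
p_single = posterior_source(1_000_000, 1, 1e-6)       # no search, suspect only
# p_search > p_single: under these assumptions the search strengthens,
# rather than weakens, the case against the matching individual.
```

Under these assumptions the posterior 1/(1 + (N - n)·gamma) slightly exceeds the no-search value 1/(1 + (N - 1)·gamma), because the search eliminates n - 1 alternative sources.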
The method's graphical environment, along with its computational and probabilistic architecture, offers analysts and discussants additional modes of interaction, concise representation, and coherent communication.
Abstract:
Teaching and research are organised differently across subject domains. Attempts to construct typologies of higher education institutions, however, often do not include quantitative indicators of subject mix that would allow systematic comparisons of large numbers of higher education institutions across countries, because the availability of data for such indicators is limited. In this paper, we present an exploratory approach to the construction of such indicators. The database constructed in the AQUAMETH project, which also includes data disaggregated at the disciplinary level, is explored with the aim of understanding patterns of subject mix. For six European countries, an exploratory and descriptive analysis of staff composition divided into four broad domains (medical sciences; engineering and technology; natural sciences; and social sciences and humanities) is performed, which leads to a classification distinguishing between specialist and generalist institutions. Among the latter, a further distinction is made based on the presence or absence of a medical department. Preliminary exploration of this classification and its comparison with other indicators show the influence of long-term dynamics on the subject mix of individual higher education institutions, but also underline disciplinary differences, for example in student-to-staff ratios, as well as national patterns, for example in the number of PhD degrees per 100 undergraduate students. Despite its many limitations, this exploratory approach allows the definition of a classification of higher education institutions that accounts for a large share of the differences between the analysed institutions.
Abstract:
Heriot-Watt University uses a software package called Syllabus Plus for its timetabling. Although this package can perform scheduling functions, it is currently employed only as a room-booking system. In academic session 2008-2009 the university will restructure its academic year from three terms of 10 weeks into semesters of 14 weeks, so major changes will be required to the timetabling information. This project has two functions, both with practical and relevant applications to the timetabling of the university. The aims of the project are the ability to change the population number of modules and activities, to delete term 3 modules and activities, to change module and activity names, and to change the teaching week pattern from the semester
Abstract:
Since 2008, intelligence units of six states of western Switzerland have shared a common database for the analysis of high-volume crime. On a daily basis, events reported to the police are analysed, filtered and classified to detect crime repetitions and interpret the crime environment. Several forensic outcomes are integrated into the system, such as matches of traces with persons, and links between scenes detected by the comparison of forensic case data. Systematic procedures have been established to integrate links inferred mainly through DNA profiles, shoemark patterns and images. A statistical outlook on a retrospective dataset of series from 2009 to 2011 informs, for instance, on the number of repetitions detected or confirmed and augmented by forensic case data. The time needed to obtain forensic intelligence, depending on the type of marks treated, is seen as a critical issue. Furthermore, the underlying integration of forensic intelligence into the crime intelligence database raised several difficulties regarding the acquisition of data and the models used in the forensic databases. The solutions found and the operational procedures adopted are described and discussed. This process forms the basis for much further research aimed at developing forensic intelligence models.
Abstract:
OBJECTIVE: Methylphenidate is prescribed for children and adolescents to treat ADHD. As in many Western countries, the increase in methylphenidate consumption is a public concern in Switzerland. The article discusses the authors' assessment of prescription prevalence in 2002 and 2005 for school-aged children in the canton of Vaud. METHOD: Pharmacy prescription information is available from the regional public health authority. Descriptive analyses are conducted on an anonymized database for the years 2002 and 2005. Data for each year are compared to assess trends in methylphenidate prescription prevalence. RESULTS: The findings show an increase from 0.74% to 1.02% in the prevalence of prescriptions for 5- to 14-year-old children, particularly in prescriptions for girls. Data also show important geographical differences in prescription. CONCLUSION: The prevalence of methylphenidate prescription is lower in Switzerland than in other Western countries, particularly the United States. However, some aspects of prevalence are similar, including the year-on-year increase and the demographic and geographic characteristics.
Abstract:
Selenoproteins are a diverse group of proteins usually misidentified and misannotated in sequence databases. The presence of an in-frame UGA (stop) codon in the coding sequence of selenoprotein genes precludes their identification and correct annotation. The in-frame UGA codons are recoded to cotranslationally incorporate selenocysteine, a rare selenium-containing amino acid. The development of ad hoc experimental and, more recently, computational approaches has allowed the efficient identification and characterization of the selenoproteomes of a growing number of species. Today, dozens of selenoprotein families have been described and more are being discovered in recently sequenced species, but the correct genomic annotation is not available for the majority of these genes. SelenoDB is a long-term project that aims to provide, through the collaborative effort of experimental and computational researchers, automatic and manually curated annotations of selenoprotein genes, proteins and SECIS elements. Version 1.0 of the database includes an initial set of eukaryotic genomic annotations, with special emphasis on the human selenoproteome, for immediate inspection by selenium researchers or incorporation into more general databases. SelenoDB is freely available at http://www.selenodb.org.
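The annotation problem described above can be made concrete with a small sketch: standard translation stops at the in-frame UGA and truncates the predicted protein, whereas recoding UGA as selenocysteine recovers the full sequence. The toy codon table and sequence below are illustrative assumptions, not SelenoDB content:

```python
# Illustrative sketch (not SelenoDB code): why standard translation
# truncates selenoprotein ORFs, and how recoding UGA -> Sec avoids it.
# Toy subset of the standard codon table, DNA alphabet (TGA = UGA).
CODON_TABLE = {"ATG": "M", "TGT": "C", "TGA": "*",
               "GGA": "G", "AAA": "K", "TAA": "*"}

def translate(seq, recode_uga_as_sec=False):
    protein = []
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        aa = CODON_TABLE.get(codon, "X")
        if aa == "*":
            if codon == "TGA" and recode_uga_as_sec:
                protein.append("U")  # selenocysteine (Sec)
                continue
            break                    # ordinary stop: translation ends
        protein.append(aa)
    return "".join(protein)

seq = "ATGTGTTGAGGAAAATAA"              # ATG TGT TGA GGA AAA TAA
translate(seq)                          # -> "MC"    (UGA read as stop)
translate(seq, recode_uga_as_sec=True)  # -> "MCUGK" (UGA read as Sec)
```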
Abstract:
This study reviewed the data on the Brazilian Ephemeroptera, based on the studies published before July 2013, estimated the number of species still to be described, and identified which regions of the country have been the subject of least research. More than half the species are known from the description of only one developmental stage, with imagoes being described more frequently than nymphs. The Brazilian Northeast is the region with the weakest database. Body size affected description rates, with a strong tendency for the larger species to be described first. The estimated number of unknown Brazilian species was accentuated by the fact that so few species have been described so far. The steep slope of the asymptote and the considerable confidence interval of the estimate reinforce the conclusion that a large number of species are still to be described. This emphasizes the need for investments in the training of specialists in systematics and ecology for all regions of Brazil to correct these deficiencies, given the role of published papers as a primary source of information and the fundamental importance of taxonomic knowledge for the development of effective measures for the conservation of ephemeropterans and the aquatic ecosystems they depend on.
Abstract:
The aim of this work is to introduce a systematic press database on natural hazards and climate change in Catalonia (NE Spain) and to analyze its potential application to social-impact studies. For this reason, a review of the concepts of risk, hazard, vulnerability and social perception is also included. This database has been built for the period 1982–2007 and contains all the news related to those issues published by the oldest still-active newspaper in Catalonia. Some parameters are registered for each article and for each event, including criteria that enable us to determine the importance accorded to it by the newspaper, and a compilation of information about it. This Access database allows each article to be classified on the basis of the seven defined topics and key words, as well as summary information about the format and structure of the news item itself, the social impact of the event, and data about the magnitude or intensity of the event. The coverage given to this type of news has been assessed because of its influence on the construction of the social perception of natural risk and climate change, and as a potential source of information about them. The treatment accorded by the press to different risks is also considered. More than 14 000 press articles have been classified. Results show that the largest number of news items for the period 1982–2007 relates to forest fires and droughts, followed by floods and heavy rainfall, although floods are the major risk in the region of study. Two flood events recorded in 2002 have been analyzed in order to show an example of the role of press information as an indicator of risk perception.
Abstract:
One of the aims of the MEDEX project is to improve knowledge of high-impact weather events in the Mediterranean. According to the guidelines of this project, a pilot study was carried out in two regions of Spain (the Balearic Islands and Catalonia) by the Social Impact Research group of MEDEX. The main goal is to propose general, suitable criteria for analysing the requests received by meteorological services arising out of the damage caused by weather events. Thus, all the requests received between 2000 and 2002 at the Servei Meteorològic de Catalunya as well as at the Division of AEMET in the Balearic Islands were analysed. Firstly, the criteria proposed for building the database are defined and discussed. Secondly, the temporal distribution of the requests for damage claims is analysed: on average, almost half of them were received during the first month after the event happened, and within the first six months the percentage rises to 90%. Thirdly, various factors are taken into account to determine the impact of specific events on society. It is remarkable that the greatest number of requests corresponds to episodes with simultaneous heavy rain and strong wind, and, finally, to those linked to high population density.
Abstract:
The NW Mediterranean region experiences heavy rainfall and flash floods every year that occasionally produce catastrophic damage. Less frequent are floods that affect large regions. Although a large number of databases exist that are devoted exclusively to floods or that consider all kinds of natural hazards, they usually record only catastrophic flood events. This paper deals with the new flood database that is being developed within the framework of the HYMEX project. Results are focused on four regions representative of the NW sector of Mediterranean Europe: Catalonia, Spain; the Balearic Islands, Spain; Calabria, Italy; and Languedoc-Roussillon, Midi-Pyrénées and PACA, France. The common available 30-yr period starts in 1981 and ends in 2010. The paper shows the database structure and criteria, a comparison with other flood databases, some statistics on spatial and temporal distribution, and an identification of the most important events. The paper also provides a table with the date and affected region of all the catastrophic events identified in the regions of study, in order to make this information available to all audiences.
Abstract:
Volumes of data used in science and industry are growing rapidly. When researchers face the challenge of analyzing them, the data format is often the first obstacle: the lack of standardized ways of exploring different data layouts requires an effort to solve the problem from scratch each time. The ability to access data in a rich, uniform manner, e.g. using the Structured Query Language (SQL), would offer expressiveness and user-friendliness. Comma-separated values (CSV) are one of the most common data storage formats. Despite the format's simplicity, handling it becomes non-trivial as file size grows. Importing CSVs into existing databases is time-consuming and troublesome, or even impossible if the horizontal dimension reaches thousands of columns. Most databases are optimized for handling large numbers of rows rather than columns; therefore, performance on datasets with non-typical layouts is often unacceptable. Other challenges include schema creation, updates and repeated data imports. To address the above-mentioned problems, I present a system for accessing very large CSV-based datasets by means of SQL. It is characterized by: a "no copy" approach - data stay mostly in the CSV files; "zero configuration" - no need to specify a database schema; implementation in C++, with boost [1], SQLite [2] and Qt [3], requiring no installation and having a very small size; query rewriting, dynamic creation of indices for appropriate columns, and static data retrieval directly from CSV files, which together ensure efficient plan execution; effortless support for millions of columns; per-value typing, which makes mixed text/number data easy to use; and a very simple network protocol that provides an efficient interface for MATLAB and reduces implementation time for other languages. The software is available as freeware along with educational videos on its website [4]. It does not need any prerequisites to run, as all of the libraries are included in the distribution package.
I test it against existing database solutions using a battery of benchmarks and discuss the results.
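For comparison, the basic SQL-over-CSV idea (though not the described system's "no copy", "zero configuration" design, which keeps data in the CSV files) can be sketched with the standard library alone; the table and query below are invented for illustration:

```python
# Minimal sketch of SQL over CSV data using only the standard library.
# Unlike the system described above, this copies the CSV into an
# in-memory SQLite table; it illustrates only the querying idea.
import csv
import io
import sqlite3

csv_text = "id,value\n1,10\n2,30\n3,20\n"
rows = list(csv.reader(io.StringIO(csv_text)))
header, data = rows[0], rows[1:]

con = sqlite3.connect(":memory:")
cols = ", ".join(f'"{c}"' for c in header)
con.execute(f"CREATE TABLE t ({cols})")          # no declared column types
con.executemany(f"INSERT INTO t VALUES ({','.join('?' * len(header))})", data)

# SQLite's flexible typing tolerates mixed text/number columns, loosely
# analogous to the per-value typing mentioned above.
result = con.execute(
    "SELECT id FROM t WHERE CAST(value AS INT) > 15 ORDER BY id"
).fetchall()
# result -> [('2',), ('3',)]
```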
Abstract:
Selectome (http://selectome.unil.ch/) is a database of positive selection based on a branch-site likelihood test. This model estimates the number of nonsynonymous substitutions (dN) and synonymous substitutions (dS) to evaluate the variation in selective pressure (the dN/dS ratio) over branches and over sites. Since the original release of Selectome, we have benchmarked and implemented a thorough quality-control procedure on multiple sequence alignments, aiming to minimise false-positive results. We have also improved the computational efficiency of the branch-site test implementation, allowing larger data sets and more frequent updates. Release 6 of Selectome includes all gene trees from Ensembl for Primates and Glires, as well as a large set of vertebrate gene trees. A total of 6810 gene trees show some evidence of positive selection. Finally, the web interface has been improved to be more responsive and to facilitate searches and browsing.
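How a dN/dS ratio is conventionally read can be sketched as follows. This is an illustrative threshold rule only: Selectome's actual inference uses a branch-site likelihood ratio test over branches and sites, not a simple cut-off, and the numbers below are invented:

```python
# Illustrative reading of the dN/dS (omega) ratio, the quantity whose
# variation the branch-site test evaluates. Not Selectome's test itself.
def omega(dn, ds):
    if ds == 0:
        raise ValueError("dS of zero: ratio undefined")
    return dn / ds

def selection_regime(dn, ds, tol=1e-9):
    w = omega(dn, ds)
    if w > 1 + tol:
        return "positive (diversifying) selection"
    if w < 1 - tol:
        return "purifying selection"
    return "neutral evolution"

selection_regime(0.8, 0.2)  # dN/dS = 4.0 -> positive selection
selection_regime(0.1, 0.5)  # dN/dS = 0.2 -> purifying selection
```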
Abstract:
The recognition that colorectal cancer (CRC) is a heterogeneous disease in terms of clinical behaviour and response to therapy translates into an urgent need for robust molecular disease subclassifiers that can explain this heterogeneity beyond current parameters (MSI, KRAS, BRAF). Attempts to fill this gap are emerging. The Cancer Genome Atlas (TCGA) reported two main CRC groups, based on the incidence and spectrum of mutated genes, and another paper reported a subgroup defined by an EMT expression signature. We performed a prior-free analysis of CRC heterogeneity on 1113 CRC gene expression profiles and compared our findings with established molecular determinants and clinical, histopathological and survival data. Unsupervised clustering based on gene modules allowed us to distinguish at least five different gene expression CRC subtypes, which we call surface crypt-like, lower crypt-like, CIMP-H-like, mesenchymal and mixed. A gene set enrichment analysis combined with a literature search of gene module members identified distinct biological motifs in the different subtypes. The subtypes, which were not derived based on outcome, nonetheless showed differences in prognosis. Known gene copy number variations and mutations in key cancer-associated genes differed between subtypes, but the subtypes provided molecular information beyond that contained in these variables. Morphological features differed significantly between subtypes. The objective existence of the subtypes and their clinical and molecular characteristics were validated in an independent set of 720 CRC expression profiles. Our subtypes provide a novel perspective on the heterogeneity of CRC. The proposed subtypes should be further explored retrospectively on existing clinical trial datasets and, when sufficiently robust, be prospectively assessed for clinical relevance in terms of prognosis and treatment-response predictive capacity.
Original microarray data were uploaded to the ArrayExpress database (http://www.ebi.ac.uk/arrayexpress/) under Accession Nos E-MTAB-990 and E-MTAB-1026. © 2013 Swiss Institute of Bioinformatics. Journal of Pathology published by John Wiley & Sons Ltd on behalf of Pathological Society of Great Britain and Ireland.
Abstract:
For well over 100 years, the Working Stress Design (WSD) approach has been the traditional basis for geotechnical design with regard to settlements or failure conditions. However, considerable effort has been put forth over the past couple of decades in relation to the adoption of the Load and Resistance Factor Design (LRFD) approach into geotechnical design. With the goal of producing engineered designs with consistent levels of reliability, the Federal Highway Administration (FHWA) issued a policy memorandum on June 28, 2000, requiring all new bridges initiated after October 1, 2007, to be designed according to the LRFD approach. Likewise, regionally calibrated LRFD resistance factors were permitted by the American Association of State Highway and Transportation Officials (AASHTO) to improve the economy of bridge foundation elements. Thus, projects TR-573, TR-583 and TR-584 were undertaken by a research team at Iowa State University’s Bridge Engineering Center with the goal of developing resistance factors for pile design using available pile static load test data. To accomplish this goal, the available data were first analyzed for reliability and then placed in a newly designed relational database management system termed PIle LOad Tests (PILOT), to which this first volume of the final report for project TR-573 is dedicated. PILOT is an amalgamated, electronic source of information consisting of both static and dynamic data for pile load tests conducted in the State of Iowa. The database, which includes historical data on pile load tests dating back to 1966, is intended for use in the establishment of LRFD resistance factors for design and construction control of driven pile foundations in Iowa. 
Although a considerable amount of geotechnical and pile load test data is available in literature as well as in various State Department of Transportation files, PILOT is one of the first regional databases to be exclusively used in the development of LRFD resistance factors for the design and construction control of driven pile foundations. Currently providing an electronically organized assimilation of geotechnical and pile load test data for 274 piles of various types (e.g., steel H-shaped, timber, pipe, Monotube, and concrete), PILOT (http://srg.cce.iastate.edu/lrfd/) is on par with such familiar national databases used in the calibration of LRFD resistance factors for pile foundations as the FHWA’s Deep Foundation Load Test Database. By narrowing geographical boundaries while maintaining a high number of pile load tests, PILOT exemplifies a model for effective regional LRFD calibration procedures.
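The LRFD principle that PILOT supports, factored resistance at least equal to the sum of factored loads (phi·Rn >= sum of gamma_i·Q_i), can be sketched as follows; the resistance factor and load factors used here are illustrative assumptions, not values calibrated from PILOT:

```python
# Schematic LRFD pile check (illustrative numbers only; phi and the load
# factors below are assumptions, not PILOT-calibrated values).
def lrfd_ok(nominal_resistance, phi, factored_loads):
    """factored_loads: list of (load_factor, nominal_load) pairs.
    Returns True when phi * Rn >= sum(gamma_i * Q_i)."""
    demand = sum(gamma * q for gamma, q in factored_loads)
    return phi * nominal_resistance >= demand

# Example: Rn = 1200 kN, phi = 0.65, dead load 300 kN (gamma = 1.25),
# live load 250 kN (gamma = 1.75): 780 kN < 812.5 kN, so the check fails.
lrfd_ok(1200, 0.65, [(1.25, 300), (1.75, 250)])  # -> False
```

In contrast, the WSD approach cited above would compare Rn divided by a single global factor of safety against the unfactored load sum; LRFD splits the uncertainty between a resistance factor and per-load-type load factors, which is what regional calibration from databases such as PILOT refines.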
Abstract:
BACKGROUND: Over the last decade, several European HIV observational databases have accumulated a substantial number of resistance test results and developed large sample repositories. There is a need to link these efforts together. We here describe the development of a novel tool that binds these databases together in a distributed fashion, in which control and data remain with the cohorts rather than in a classic data merger. METHODS: As proof of concept we entered two basic queries into the tool: available resistance tests and available samples. We asked for patients still alive after 1998-01-01 and between 180 and 195 cm in height, and how many samples or resistance tests would be available for these patients. The queries were uploaded with the tool to a central web server, from which each participating cohort downloaded the queries with the tool and ran them against their database. The numbers gathered were then submitted back to the server, where we accumulated the number of available samples and resistance tests. RESULTS: We obtained the following results from the cohorts on available samples/resistance tests: EuResist: not available/11,194; EuroSIDA: 20,716/1,992; ICONA: 3,751/500; Rega: 302/302; SHCS: 53,783/1,485. In total, 78,552 samples and 15,473 resistance tests were available amongst these five cohorts.
Once these data items have been identified, it is trivial to generate lists of relevant samples that would be useful for ultra-deep sequencing in addition to the already available resistance tests. Soon the tool will include small analysis packages that allow each cohort to pull a report on its cohort profile and also survey emerging resistance trends in its own cohort. CONCLUSIONS: We plan to provide this tool to all cohorts within the Collaborative HIV and Anti-HIV Drug Resistance Network (CHAIN) and will provide it free of charge to others for any non-commercial use. The potential of this tool is to ease collaborations, for instance in projects requiring data, by speeding up the identification of novel resistance mutations through an increased number of observations across multiple cohorts, instead of waiting for single cohorts or studies to reach the critical number needed to address such issues.
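The distributed pattern described in the methods, queries run locally by each cohort with only aggregate counts returned to the central server, can be sketched as follows; the cohort contents below are invented for illustration:

```python
# Toy sketch of the distributed-query pattern described above: the
# central server sees only per-cohort counts, never patient-level data.
# The cohort names and records here are invented for illustration.
def run_query_locally(database, predicate):
    """Each cohort evaluates the query against its own records."""
    return sum(1 for record in database if predicate(record))

cohorts = {
    "A": [{"alive_after_1998": True, "height_cm": 185},
          {"alive_after_1998": True, "height_cm": 170}],
    "B": [{"alive_after_1998": False, "height_cm": 190},
          {"alive_after_1998": True, "height_cm": 182}],
}

# The query from the proof of concept: alive after 1998-01-01 and
# between 180 and 195 cm in height.
query = lambda r: r["alive_after_1998"] and 180 <= r["height_cm"] <= 195

# Each cohort runs the query locally; the server only sums the counts.
counts = {name: run_query_locally(db, query) for name, db in cohorts.items()}
total = sum(counts.values())  # counts -> {"A": 1, "B": 1}, total -> 2
```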