903 results for Data-driven Methods
Abstract:
This study investigated the spatial, spectral, temporal and functional properties of functional brain connections involved in the concurrent execution of unrelated visual perception and working memory tasks. Electroencephalography data were analysed using a novel data-driven approach assessing source coherence at the whole-brain level. Three connections in the beta-band (18-24 Hz) and one in the gamma-band (30-40 Hz) were modulated by dual-task performance. Beta-coherence increased within two dorsofrontal-occipital connections in dual-task conditions compared to the single-task condition, with the highest coherence seen during low working memory load trials. In contrast, beta-coherence in a prefrontal-occipital functional connection and gamma-coherence in an inferior frontal-occipitoparietal connection were not affected by the addition of the second task and only showed elevated coherence under high working memory load. Analysis of coherence as a function of time suggested that the dorsofrontal-occipital beta-connections were relevant to working memory maintenance, while the prefrontal-occipital beta-connection and the inferior frontal-occipitoparietal gamma-connection were involved in top-down control of concurrent visual processing. The fact that increased coherence in the gamma-connection, from low to high working memory load, was negatively correlated with faster reaction time on the perception task supports this interpretation. Together, these results demonstrate that dual-task demands trigger non-linear changes in functional interactions between frontal-executive and occipitoparietal-perceptual cortices.
Abstract:
In this paper, we develop a data-driven methodology to characterize the likelihood of orographic precipitation enhancement using sequences of weather radar images and a digital elevation model (DEM). Geographical locations with topographic characteristics favorable to enforce repeatable and persistent orographic precipitation such as stationary cells, upslope rainfall enhancement, and repeated convective initiation are detected by analyzing the spatial distribution of a set of precipitation cells extracted from radar imagery. Topographic features such as terrain convexity and gradients computed from the DEM at multiple spatial scales as well as velocity fields estimated from sequences of weather radar images are used as explanatory factors to describe the occurrence of localized precipitation enhancement. The latter is represented as a binary process by defining a threshold on the number of cell occurrences at particular locations. Both two-class and one-class support vector machine classifiers are tested to separate the presumed orographic cells from the nonorographic ones in the space of contributing topographic and flow features. Site-based validation is carried out to estimate realistic generalization skills of the obtained spatial prediction models. Due to the high class separability, the decision function of the classifiers can be interpreted as a likelihood or susceptibility of orographic precipitation enhancement. The developed approach can serve as a basis for refining radar-based quantitative precipitation estimates and short-term forecasts or for generating stochastic precipitation ensembles conditioned on the local topography.
Abstract:
At the University of Lausanne third-year medical students are given the task of spending a month investigating a question of community medicine. In 2009, four students evaluated the legitimacy of health insurers intervening in the management of depression. They found that health insurers put pressure on public authorities during the development of legislation governing the health system and reimbursement for treatment. This fact emerged during the scientific investigation led jointly by the team in the course of the "module of immersion in community medicine." This paper presents each step of their study. The example chosen illustrates the learning objectives covered by the module.
Abstract:
This paper presents multiple kernel learning (MKL) regression as an exploratory spatial data analysis and modelling tool. The MKL approach is introduced as an extension of support vector regression, where MKL uses dedicated kernels to divide a given task into sub-problems and to treat them separately in an effective way. It provides better interpretability to non-linear robust kernel regression at the cost of a more complex numerical optimization. In particular, we investigate the use of MKL as a tool that allows us to avoid using ad-hoc topographic indices as covariables in statistical models in complex terrains. Instead, MKL learns these relationships from the data in a non-parametric fashion. A study on data simulated from real terrain features confirms the ability of MKL to enhance the interpretability of data-driven models and to aid feature selection without degrading predictive performances. Here we examine the stability of the MKL algorithm with respect to the number of training data samples and to the presence of noise. The results of a real case study are also presented, where MKL is able to exploit a large set of terrain features computed at multiple spatial scales, when predicting mean wind speed in an Alpine region.
Abstract:
BACKGROUND: The annotation of protein post-translational modifications (PTMs) is an important task of UniProtKB curators and, with continuing improvements in experimental methodology, an ever greater number of articles are being published on this topic. To help curators cope with this growing body of information we have developed a system which extracts information from the scientific literature for the most frequently annotated PTMs in UniProtKB. RESULTS: The procedure uses a pattern-matching and rule-based approach to extract sentences with information on the type and site of modification. A ranked list of protein candidates for the modification is also provided. For PTM extraction, precision varies from 57% to 94%, and recall from 75% to 95%, according to the type of modification. The procedure was used to track new publications on PTMs and to recover potential supporting evidence for phosphorylation sites annotated based on the results of large-scale proteomics experiments. CONCLUSIONS: The information retrieval and extraction method we have developed in this study forms the basis of a simple tool for the manual curation of protein post-translational modifications in UniProtKB/Swiss-Prot. Our work demonstrates that even simple text-mining tools can be effectively adapted for database curation tasks, provided that a thorough understanding of the working process and requirements is first obtained. This system can be accessed at http://eagl.unige.ch/PTM/.
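To make the pattern-matching idea concrete, here is a minimal sketch of how a rule might pull the type and site of a modification out of a sentence. It is not the system described in the abstract: the regular expression, the keyword and residue lists, the function name `extract_ptm_mentions`, and the example sentence are all assumptions chosen for illustration.

```python
import re

# Hypothetical rule: a modification keyword followed, within the same
# sentence, by a residue name and position such as "Ser-473" or "Tyr 15".
PTM_PATTERN = re.compile(
    r"\b(?P<type>phosphorylat\w*|acetylat\w*|methylat\w*|ubiquitinat\w*)\b"
    r"[^.]*?"
    r"\b(?P<residue>Ser|Thr|Tyr|Lys|Arg)[- ]?(?P<position>\d+)\b",
    re.IGNORECASE,
)


def extract_ptm_mentions(text):
    """Return (modification keyword, residue, position) tuples found in text."""
    return [
        (
            match.group("type").lower(),
            match.group("residue").capitalize(),
            int(match.group("position")),
        )
        for match in PTM_PATTERN.finditer(text)
    ]


# Illustrative sentence, not taken from the curated literature set.
example = "Akt1 is phosphorylated at Ser-473 by mTORC2."
print(extract_ptm_mentions(example))
```

A real curation rule set would need many more patterns (one-letter residue codes, enumerated sites, negation), which is presumably where the reported differences in precision and recall between modification types come from.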
Abstract:
Many states are striving to keep their deer population to a sustainable and controllable level while maximizing public safety. In Iowa, measures to control the deer population include annual deer hunts and special deer herd management plans in urban areas. While these plans may reduce the deer population, traffic safety in these areas has not been fully assessed. Using deer population data from the Iowa Department of Natural Resources and data on deer-vehicle crashes and deer carcass removals from the Iowa Department of Transportation, the authors examined the relationship between deer-vehicle collisions, deer density, and land use in three urban areas in Iowa that have deer management plans in place (Cedar Rapids, Dubuque, and Iowa City) over the period 2002 to 2007. First, a comparison of deer-vehicle crash counts and deer carcass removal counts was conducted at the county level. Further, the authors estimated econometric models to investigate the factors that influence the frequency and severity of deer-vehicle crashes in these zones. Overall, the number of deer carcasses removed on the primary roads in these counties was greater than the number of reported deer-vehicle crashes on those roads. These differences can be attributed to a number of reasons, including variability in data reporting and data collection practices. In addition, high rates of underreporting of crashes were found on major routes that carry high volumes of traffic. This study also showed that multiple factors affect deer-vehicle crashes and corresponding injury outcomes in urban management zones. The identified roadway and non-roadway factors could be useful for identifying locations on the transportation system that significantly impact deer species and safety and for determining appropriate countermeasures for mitigation. Efforts to reduce deer density adjacent to roads and developed land and to provide wider shoulders on undivided roads are recommended.
Improving the consistency and accuracy of deer carcass and deer-vehicle collision data collection methods and practices is also desirable.
Abstract:
Many transportation agencies maintain grade as an attribute in roadway inventory databases; however, the information is often in an aggregated format. Cross slope is rarely included in large roadway inventories. Accurate methods available to collect grade and cross slope include global positioning systems, traditional surveying, and mobile mapping systems. However, most agencies do not have the resources to utilize these methods to collect grade and cross slope on a large scale. This report discusses the use of LIDAR to extract roadway grade and cross slope for large-scale inventories. Current data collection methods and their advantages and disadvantages are discussed. A pilot study to extract grade and cross slope from a LIDAR data set, including methodology, results, and conclusions, is presented. This report describes the regression methodology used to extract and evaluate the accuracy of grade and cross slope from three dimensional surfaces created from LIDAR data. The use of LIDAR data to extract grade and cross slope on tangent highway segments was evaluated and compared against grade and cross slope collected using an automatic level for 10 test segments along Iowa Highway 1. Grade and cross slope were measured from a surface model created from LIDAR data points collected for the study area. While grade could be estimated to within 1%, study results indicate that cross slope cannot practically be estimated using a LIDAR derived surface model.
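The abstract does not spell out the regression used, so the following is only a sketch of one plausible reading: fit an ordinary least-squares line to elevation against distance along a tangent segment and report the slope as percent grade. The function name `fit_grade_percent` and the sample points are hypothetical.

```python
def fit_grade_percent(stations_m, elevations_m):
    """Least-squares slope of elevation against distance, as percent grade.

    stations_m   -- distances along the segment, in metres
    elevations_m -- surface elevations at those distances, in metres
    """
    n = len(stations_m)
    if n < 2 or n != len(elevations_m):
        raise ValueError("need two or more matching station/elevation pairs")
    mean_x = sum(stations_m) / n
    mean_z = sum(elevations_m) / n
    sxx = sum((x - mean_x) ** 2 for x in stations_m)
    if sxx == 0:
        raise ValueError("stations must not all be identical")
    sxz = sum(
        (x - mean_x) * (z - mean_z) for x, z in zip(stations_m, elevations_m)
    )
    # Slope is rise over run; multiply by 100 to express it in percent.
    return 100.0 * sxz / sxx


# Hypothetical points rising 0.2 m every 10 m, i.e. roughly a 2% grade.
print(fit_grade_percent([0, 10, 20, 30], [100.0, 100.2, 100.4, 100.6]))
```

The same fit applied across the lane width would give cross slope; because that run is only a few metres long, small vertical errors in the surface model translate into large slope errors, which is consistent with the abstract's finding that cross slope could not practically be estimated.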
Abstract:
Land plants have had the reputation of being problematic for DNA barcoding for two general reasons: (i) the standard DNA regions used in algae, animals and fungi have exceedingly low levels of variability and (ii) the typically used land plant plastid phylogenetic markers (e.g. rbcL, trnL-F, etc.) appear to have too little variation. However, no one has assessed how well current phylogenetic resources might work in the context of identification (versus phylogeny reconstruction). In this paper, we make such an assessment, particularly with two of the markers commonly sequenced in land plant phylogenetic studies, plastid rbcL and internal transcribed spacers of the large subunits of nuclear ribosomal DNA (ITS), and find that both of these DNA regions perform well even though the data currently available in GenBank/EBI were not produced to be used as barcodes and BLAST searches are not an ideal tool for this purpose. These results bode well for the use of even more variable regions of plastid DNA (such as, for example, psbA-trnH) as barcodes, once they have been widely sequenced. In the short term, efforts to bring land plant barcoding up to the standards being used now in other organisms should make swift progress. There are two categories of DNA barcode users, scientists in fields other than taxonomy and taxonomists. For the former, the use of mitochondrial and plastid DNA, the two most easily assessed genomes, is at least in the short term a useful tool that permits them to get on with their studies, which depend on knowing roughly which species or species groups they are dealing with, but these same DNA regions have important drawbacks for use in taxonomic studies (i.e. studies designed to elucidate species limits). For these purposes, DNA markers from uniparentally (usually maternally) inherited genomes can only provide half of the story required to improve taxonomic standards being used in DNA barcoding. 
In the long term, we will need to develop more sophisticated barcoding tools, which would be multiple, low-copy nuclear markers with sufficient genetic variability and PCR-reliability; these would permit the detection of hybrids and permit researchers to identify the 'genetic gaps' that are useful in assessing species limits.
Abstract:
BACKGROUND: PCR has the potential to detect and precisely quantify specific DNA sequences, but it is not yet often used as a fully quantitative method. A number of data collection and processing strategies have been described for the implementation of quantitative PCR. However, they can be experimentally cumbersome, their relative performances have not been evaluated systematically, and they often remain poorly validated statistically and/or experimentally. In this study, we evaluated the performance of known methods, and compared them with newly developed data processing strategies in terms of resolution, precision and robustness. RESULTS: Our results indicate that simple methods that do not rely on the estimation of the efficiency of the PCR amplification may provide reproducible and sensitive data, but that they do not quantify DNA with precision. Other evaluated methods based on sigmoidal or exponential curve fitting were generally of both poor resolution and precision. A statistical analysis of the parameters that influence efficiency indicated that it depends mostly on the selected amplicon and to a lesser extent on the particular biological sample analyzed. Thus, we devised various strategies based on individual or averaged efficiency values, which were used to assess the regulated expression of several genes in response to a growth factor. CONCLUSION: Overall, qPCR data analysis methods differ significantly in their performance, and this analysis identifies methods that provide DNA quantification estimates of high precision, robustness and reliability. These methods allow reliable estimations of relative expression ratio of two-fold or higher, and our analysis provides an estimation of the number of biological samples that have to be analyzed to achieve a given precision.
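As a worked illustration of why amplification efficiency matters, the sketch below computes an efficiency-corrected relative expression ratio in the commonly used form E_target^(ΔCt_target) / E_ref^(ΔCt_ref). This is a standard formulation and is assumed here for illustration; the abstract does not state that it is one of the strategies the authors evaluated. The function name and the Ct values are hypothetical.

```python
def relative_expression_ratio(
    e_target, ct_target_control, ct_target_treated,
    e_ref, ct_ref_control, ct_ref_treated,
):
    """Efficiency-corrected expression ratio of treated versus control.

    e_target, e_ref -- amplification factor per cycle (2.0 = perfect doubling)
    ct_*            -- threshold cycles measured for each gene and condition
    """
    # A lower Ct in the treated sample means more starting template.
    target_change = e_target ** (ct_target_control - ct_target_treated)
    ref_change = e_ref ** (ct_ref_control - ct_ref_treated)
    return target_change / ref_change


# Hypothetical values: the target gene crosses threshold two cycles earlier
# after treatment while the reference gene is unchanged.
print(relative_expression_ratio(2.0, 25.0, 23.0, 2.0, 20.0, 20.0))
# The same Ct shift with a lower assumed efficiency gives a smaller ratio.
print(relative_expression_ratio(1.8, 25.0, 23.0, 1.8, 20.0, 20.0))
```

Comparing the two calls shows how an error in the assumed efficiency propagates into the estimated ratio, which is why methods that ignore efficiency can be reproducible without being precise.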
Abstract:
This report is concerned with the prediction of the long-time creep and shrinkage behavior of concrete. It is divided into three main areas: (1) the development of general prediction methods that can be used by a design engineer when specific experimental data are not available; (2) the development of prediction methods based on experimental data, which take advantage of equations developed in item 1 and can be used to accurately predict creep and shrinkage after only 28 days of data collection; and (3) experimental verification of items 1 and 2, and the development of specific prediction equations for four sand-lightweight aggregate concretes tested in the experimental program. The general prediction equations and methods are developed in Chapter II. Standard equations to estimate the creep of normal weight concrete (Eq. 9), sand-lightweight concrete (Eq. 12), and lightweight concrete (Eq. 15) are recommended. These equations are developed for standard conditions (see Sec. 2.1), and the correction factors required to convert creep coefficients obtained from Equations 9, 12, and 15 to valid predictions for other conditions are given in Equations 17 through 23. The correction factors are shown graphically in Figs. 6 through 13. Similar equations and methods are developed for the prediction of the shrinkage of moist cured normal weight concrete (Eq. 30), moist cured sand-lightweight concrete (Eq. 33), and moist cured lightweight concrete (Eq. 36). For steam cured concrete the equations are Eq. 42 for normal weight concrete and Eq. 45 for lightweight concrete. Correction factors are given in Equations 47 through 52 and Figs. 18 through 24. Chapter III summarizes and illustrates, by examples, the prediction methods developed in Chapter II. Chapters IV and V describe an experimental program in which specific prediction equations are developed for concretes made with Haydite manufactured by Hydraulic Press Brick Co. (Eqs. 53 and 54), Haydite manufactured by Buildex Inc. (Eqs. 55 and 56), Haydite manufactured by The Cater-Waters Corp. (Eqs. 57 and 58), and Idealite manufactured by Idealite Co. (Eqs. 59 and 60). General prediction equations are also developed from the data obtained in the experimental program (Eqs. 61 and 62) and are compared to similar equations developed in Chapter II. Creep and shrinkage prediction methods based on 28-day experimental data are developed in Chapter VI. The methods are verified by comparing predicted and measured values of the long-time creep and shrinkage of specimens tested at the University of Iowa (see Chapters IV and V) and elsewhere. The accuracy obtained is shown to be superior to other similar methods available to the design engineer.
Abstract:
BACKGROUND: Several European HIV observational databases have, over the last decade, accumulated a substantial number of resistance test results and developed large sample repositories. There is a need to link these efforts together. Here we describe the development of a novel tool that links these databases together in a distributed fashion, in which control and data remain with the cohorts, rather than through classic data mergers. METHODS: As proof of concept we entered two basic queries into the tool: available resistance tests and available samples. We asked for patients still alive after 1998-01-01 and between 180 and 195 cm in height, and for how many samples or resistance tests would be available for these patients. The queries were uploaded with the tool to a central web server, from which each participating cohort downloaded the queries with the tool and ran them against its own database. The numbers gathered were then submitted back to the server, where we could accumulate the number of available samples and resistance tests. RESULTS: We obtained the following results from the cohorts on available samples/resistance tests: EuResist: not available/11,194; EuroSIDA: 20,716/1,992; ICONA: 3,751/500; Rega: 302/302; SHCS: 53,783/1,485. In total, 78,552 samples and 15,473 resistance tests were available amongst these five cohorts.
Once these data items have been identified, it is trivial to generate lists of relevant samples that would be useful for ultra-deep sequencing in addition to the already available resistance tests. Soon the tool will include small analysis packages that allow each cohort to pull a report on its cohort profile and also survey emerging resistance trends in its own cohort. CONCLUSIONS: We plan on providing this tool to all cohorts within the Collaborative HIV and Anti-HIV Drug Resistance Network (CHAIN) and will provide the tool free of charge to others for any non-commercial use. The tool has the potential to ease collaborations, for example in projects that require data to speed up the identification of novel resistance mutations by increasing the number of observations across multiple cohorts instead of waiting for single cohorts or studies to reach the critical number needed to address such issues.
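The distributed workflow in this abstract (each cohort runs the query locally and returns only counts, which the server accumulates) can be sketched as follows. This is an illustration under stated assumptions, not the CHAIN tool itself: the record fields (`last_alive`, `height_cm`, `n_samples`, `n_tests`) and both function names are hypothetical. The per-cohort figures are the ones reported in the RESULTS, read as samples/resistance tests.

```python
from datetime import date


def count_locally(patients, alive_after=date(1998, 1, 1), height_cm=(180, 195)):
    """Run inside one cohort: return aggregate counts only, never patient rows."""
    low, high = height_cm
    selected = [
        p for p in patients
        if p["last_alive"] > alive_after and low <= p["height_cm"] <= high
    ]
    return (
        sum(p["n_samples"] for p in selected),
        sum(p["n_tests"] for p in selected),
    )


def accumulate(per_cohort_counts):
    """Run on the central server: sum the counts submitted by each cohort."""
    # A count of None (reported as not available) contributes nothing.
    total_samples = sum(s or 0 for s, _ in per_cohort_counts.values())
    total_tests = sum(t or 0 for _, t in per_cohort_counts.values())
    return total_samples, total_tests


# (samples, resistance tests) per cohort, as reported in the abstract.
REPORTED = {
    "EuResist": (None, 11194),
    "EuroSIDA": (20716, 1992),
    "ICONA": (3751, 500),
    "Rega": (302, 302),
    "SHCS": (53783, 1485),
}

print(accumulate(REPORTED))  # (78552, 15473), matching the reported totals
```

Because only the two integers leave each site, the central server never holds patient-level data, which is the property the abstract contrasts with classic data mergers.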
Abstract:
The hydrological and biogeochemical processes that operate in catchments influence the ecological quality of freshwater systems through delivery of fine sediment, nutrients and organic matter. Most models that seek to characterise the delivery of diffuse pollutants from land to water are reductionist. The multitude of processes that are parameterised in such models to ensure generic applicability make them complex and difficult to test on available data. Here, we outline an alternative - data-driven - inverse approach. We apply SCIMAP, a parsimonious risk based model that has an explicit treatment of hydrological connectivity. We take a Bayesian approach to the inverse problem of determining the risk that must be assigned to different land uses in a catchment in order to explain the spatial patterns of measured in-stream nutrient concentrations. We apply the model to identify the key sources of nitrogen (N) and phosphorus (P) diffuse pollution risk in eleven UK catchments covering a range of landscapes. The model results show that: 1) some land use generates a consistently high or low risk of diffuse nutrient pollution; but 2) the risks associated with different land uses vary both between catchments and between nutrients; and 3) that the dominant sources of P and N risk in the catchment are often a function of the spatial configuration of land uses. Taken on a case-by-case basis, this type of inverse approach may be used to help prioritise the focus of interventions to reduce diffuse pollution risk for freshwater ecosystems. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
BACKGROUND: Used in conjunction with biological surveillance, behavioural surveillance provides data allowing for a more precise definition of HIV/STI prevention strategies. In 2008, mapping of behavioural surveillance in EU/EFTA countries was performed on behalf of the European Centre for Disease Prevention and Control. METHOD: Nine questionnaires were sent to all 31 member States and EEA/EFTA countries requesting data on the overall behavioural and second generation surveillance system and on surveillance in the general population, youth, men having sex with men (MSM), injecting drug users (IDU), sex workers (SW), migrants, people living with HIV/AIDS (PLWHA), and sexually transmitted infection (STI) clinic patients. Requested data included information on system organisation (e.g. sustainability, funding, institutionalisation), topics covered in surveys and main indicators. RESULTS: Twenty-eight of the 31 countries contacted supplied data. Sixteen countries reported an established behavioural surveillance system, and 13 a second generation surveillance system (combination of biological surveillance of HIV/AIDS and STI with behavioural surveillance). There were wide differences as regards the year of survey initiation, number of populations surveyed, data collection methods used, organisation of surveillance and coordination with biological surveillance. The populations most regularly surveyed are the general population, youth, MSM and IDU. SW, patients of STI clinics and PLWHA are surveyed less regularly and in only a small number of countries, and few countries have undertaken behavioural surveys among migrant or ethnic minority populations. In many cases, the identification of populations with risk behaviour and the selection of populations to be included in a BS system have not been formally conducted, or are incomplete. Topics most frequently covered are similar across countries, although many different indicators are used.
In most countries, sustainability of surveillance systems is not assured. CONCLUSION: Although many European countries have established behavioural surveillance systems, there is little harmonisation as regards the methods and indicators adopted. The main challenge now faced is to build and maintain organised and functional behavioural and second generation surveillance systems across Europe, to increase collaboration, to promote robust, sustainable and cost-effective data collection methods, and to harmonise indicators.