988 results for Geo-scientific processing
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Biomedical natural language processing (BioNLP) is a subfield of natural language processing, an area of computational linguistics concerned with developing programs that work with natural language: written texts and speech. Biomedical relation extraction concerns the detection of semantic relations, such as protein-protein interactions (PPI), from scientific texts. The aim is to enhance information retrieval by detecting relations between concepts, not just individual concepts as with a keyword search. In recent years, events have been proposed as a more detailed alternative to simple pairwise PPI relations. Events provide a systematic, structural representation for annotating the content of natural language texts. Events are characterized by annotated trigger words, directed and typed arguments, and the ability to nest other events. For example, the sentence “Protein A causes protein B to bind protein C” can be annotated with the nested event structure CAUSE(A, BIND(B, C)). Converted to such formal representations, the information in natural language texts can be used by computational applications. Biomedical event annotations were introduced by the BioInfer and GENIA corpora, and event extraction was popularized by the BioNLP'09 Shared Task on Event Extraction. In this thesis we present a method for automated event extraction, implemented as the Turku Event Extraction System (TEES). A unified graph format is defined for representing event annotations, and the problem of extracting complex event structures is decomposed into a number of independent classification tasks. These classification tasks are solved using SVM and RLS classifiers, utilizing rich feature representations built from full dependency parsing. Building on earlier work on pairwise relation extraction and using a generalized graph representation, the resulting TEES system is capable of detecting binary relations as well as complex event structures. We show that this event extraction system performs well, achieving first place in the BioNLP'09 Shared Task on Event Extraction. Subsequently, TEES has achieved several first ranks in the BioNLP'11 and BioNLP'13 Shared Tasks, and has shown competitive performance in the binary-relation Drug-Drug Interaction Extraction 2011 and 2013 shared tasks. The Turku Event Extraction System is published as a freely available open-source project, documenting the research in detail and making the method available for practical applications. In particular, we describe the application of the event extraction method to PubMed-scale text mining, showing that the developed approach not only performs well but is also generalizable and applicable to large-scale real-world text mining projects. Finally, we discuss related literature, summarize the contributions of the work, and present some thoughts on future directions for biomedical event extraction. This thesis includes and builds on six original research publications. The first of these introduces the analysis of dependency parses that led to the development of TEES. The entries in the three BioNLP Shared Tasks, as well as in the DDIExtraction 2011 task, are covered in four publications, and the sixth demonstrates the application of the system to PubMed-scale text mining.
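To make the annotation scheme concrete, below is a minimal sketch of how such a nested event could be represented in Python; the class and field names are hypothetical illustrations of the idea, not TEES's actual graph format.

    from dataclasses import dataclass, field
    from typing import List, Union

    @dataclass
    class Protein:
        name: str

    @dataclass
    class Event:
        trigger: str        # annotated trigger word, e.g. "causes"
        type: str           # event type, e.g. "CAUSE" or "BIND"
        args: List[Union["Event", Protein]] = field(default_factory=list)

    # "Protein A causes protein B to bind protein C" -> CAUSE(A, BIND(B, C))
    bind = Event(trigger="bind", type="BIND", args=[Protein("B"), Protein("C")])
    cause = Event(trigger="causes", type="CAUSE", args=[Protein("A"), bind])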
Abstract:
The nucleus tractus solitarii (NTS) receives afferent projections from the arterial baroreceptors, carotid chemoreceptors and cardiopulmonary receptors and as a function of this information produces autonomic adjustments in order to maintain arterial blood pressure within a narrow range of variation. The activation of each of these cardiovascular afferents produces a specific autonomic response by the excitation of neuronal projections from the NTS to the ventrolateral areas of the medulla (nucleus ambiguus, caudal and rostral ventrolateral medulla). The neurotransmitters at the NTS level as well as the excitatory amino acid (EAA) receptors involved in the processing of the autonomic responses in the NTS, although extensively studied, remain to be completely elucidated. In the present review we discuss the role of the EAA L-glutamate and its different receptor subtypes in the processing of the cardiovascular reflexes in the NTS. The data presented in this review related to the neurotransmission in the NTS are based on experimental evidence obtained in our laboratory in unanesthetized rats. The two major conclusions of the present review are that a) the excitation of the cardiovagal component by cardiovascular reflex activation (chemo- and Bezold-Jarisch reflexes) or by L-glutamate microinjection into the NTS is mediated by N-methyl-D-aspartate (NMDA) receptors, and b) the sympatho-excitatory component of the chemoreflex and the pressor response to L-glutamate microinjected into the NTS are not affected by an NMDA receptor antagonist, suggesting that the sympatho-excitatory component of these responses is mediated by non-NMDA receptors.
Abstract:
The amount of biological data has grown exponentially in recent decades. Modern biotechnologies, such as microarrays and next-generation sequencing, are capable of producing massive amounts of biomedical data in a single experiment. As the amount of data grows rapidly, there is an urgent need for reliable computational methods for analyzing and visualizing it. This thesis addresses this need by studying how to efficiently and reliably analyze and visualize high-dimensional data, especially data obtained from gene expression microarray experiments. First, we study ways to improve the quality of microarray data by replacing (imputing) missing data entries with estimated values. Missing value imputation is commonly used to make the original incomplete data complete, thus making it easier to analyze with statistical and computational methods. Our novel approach was to use curated external biological information as a guide for the missing value imputation. Secondly, we studied the effect of missing value imputation on downstream data analysis methods such as clustering. We compared multiple recent imputation algorithms on 8 publicly available microarray data sets. It was observed that missing value imputation is indeed a rational way to improve the quality of biological data. The research revealed differences between the clustering results obtained with different imputation methods. On most data sets, the simple and fast k-NN imputation was good enough, but there was also a need for more advanced imputation methods, such as Bayesian Principal Component Analysis (BPCA). Finally, we studied the visualization of biological network data. Biological interaction networks are examples of the outcome of multiple biological experiments, such as those using gene microarray techniques. Such networks are typically very large and highly connected, so there is a need for fast algorithms that produce visually pleasing layouts. A computationally efficient way to produce layouts of large biological interaction networks was developed. The algorithm uses multilevel optimization within the standard force-directed graph layout algorithm.
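As a point of reference for the imputation step, a minimal k-NN imputation sketch is given below; it assumes a genes-by-samples matrix with NaN marking missing entries, and illustrates only the general technique, not the specific algorithms or the biology-guided variant evaluated in the thesis.

    import numpy as np

    def knn_impute(X, k=5):
        """Fill NaN entries of each incomplete row with the column-wise
        mean of its k nearest complete rows (Euclidean distance computed
        over the observed columns only)."""
        X = X.copy()
        incomplete = np.where(np.isnan(X).any(axis=1))[0]
        complete = np.where(~np.isnan(X).any(axis=1))[0]
        for i in incomplete:
            obs = ~np.isnan(X[i])
            d = np.sqrt(((X[complete][:, obs] - X[i, obs]) ** 2).sum(axis=1))
            neighbors = complete[np.argsort(d)[:k]]
            X[i, ~obs] = X[neighbors][:, ~obs].mean(axis=0)
        return X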
Abstract:
In this thesis, the suitability of different trackers for finger tracking in high-speed videos was studied. The tracked finger trajectories were post-processed and analysed using various filtering and smoothing methods, and position derivatives of the trajectories, i.e., speed and acceleration, were extracted for the purposes of hand motion analysis. Overall, two methods, Kernelized Correlation Filters (KCF) and Spatio-Temporal Context Learning tracking, performed better than the others in the tests. Both achieved high accuracy on the selected high-speed videos and also allowed real-time processing, being able to process over 500 frames per second. In addition, the results showed that different filtering methods can be applied to produce more appropriate velocity and acceleration curves from the tracking data. Local Regression filtering and the Unscented Kalman Smoother gave the best results in the tests. Furthermore, the results show that such tracking and filtering methods are suitable for high-speed hand tracking and trajectory-data post-processing.
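A minimal sketch of a KCF tracking loop with OpenCV is shown below, assuming an initial bounding box around the finger; the file name and box coordinates are placeholders, and the full thesis pipeline (high-speed capture, filtering, derivative estimation) is not reproduced here.

    import cv2

    cap = cv2.VideoCapture("highspeed_clip.avi")   # hypothetical input video
    ok, frame = cap.read()
    tracker = cv2.TrackerKCF_create()
    tracker.init(frame, (210, 140, 40, 40))        # x, y, w, h around the finger

    trajectory = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        found, (x, y, w, h) = tracker.update(frame)
        if found:
            trajectory.append((x + w / 2.0, y + h / 2.0))  # finger centre point

    # speed and acceleration are then derived from the (smoothed) trajectory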
Abstract:
This study was designed to evaluate the effect of different conditions of collection, transport and storage on the quality of blood samples from normal individuals in terms of the activity of the enzymes β-glucuronidase, total hexosaminidase, hexosaminidase A, arylsulfatase A and β-galactosidase. The enzyme activities were not affected by the different materials used for collection (plastic syringes or vacuum glass tubes). In the evaluation of different heparin concentrations (10% heparin, 5% heparin, and heparinized syringe) in the syringes, it was observed that higher doses resulted in an increase of at least 1-fold in the activities of β-galactosidase, total hexosaminidase and hexosaminidase A in leukocytes, and β-glucuronidase in plasma. When the effects of time and means of transportation were studied, samples that had been kept at room temperature showed higher deterioration with time (72 and 96 h) before processing, and in this case it was impossible to isolate leukocytes from most samples. Comparison of heparin and acid citrate-dextrose (ACD) as anticoagulants revealed that β-glucuronidase and hexosaminidase activities in plasma reached levels near the lower normal limits when ACD was used. In conclusion, we observed that heparin should be used as the preferable anticoagulant when measuring these lysosomal enzyme activities, and we recommend that, when transport time is more than 24 h, samples should be shipped by air in a styrofoam box containing wet ice.
Abstract:
Results of subgroup analysis (SA) reported in randomized clinical trials (RCT) cannot be adequately interpreted without information about the methods used in the study design and the data analysis. Our aim was to show how often inaccurate or incomplete reports occur. First, we selected eight methodological aspects of SA on the basis of their importance to a reader in determining the confidence that should be placed in the author's conclusions regarding such analysis. Then, we reviewed the current practice of reporting these methodological aspects of SA in clinical trials in four leading journals, i.e., the New England Journal of Medicine (NEJM), the Journal of the American Medical Association (JAMA), the Lancet, and the American Journal of Public Health (AJPH). Eight consecutive reports from each journal published after July 1, 1998 were included. Of the 32 trials surveyed, 17 (53%) had at least one SA. Overall, the proportion of RCT reporting a particular methodological aspect ranged from 23 to 94%. Information on whether the SA was planned before or after the data analysis was reported in only 7 (41%) of the studies. Of the total possible number of items to be reported, NEJM, JAMA, Lancet and AJPH clearly mentioned 59, 67, 58 and 72%, respectively. We conclude that current reporting of SA in RCT is incomplete and inaccurate. The results of such SA may have harmful effects on treatment recommendations if accepted without judicious scrutiny. We recommend that editors improve the reporting of SA in RCT by giving authors a list of the important items to be reported.
Abstract:
We present a critical analysis of the generalized use of the "impact factor". By means of the Kruskal-Wallis test, it was shown that distinct disciplines cannot be compared using the impact factor without adjustment. After assigning the value of one (1.000) to the median journal of each discipline, the adjusted impact factor of each journal was calculated by rule of three (simple proportion). The adjusted values were homogeneous, thus permitting comparison among distinct disciplines.
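A small sketch of this median-based adjustment follows; the numbers are made up for illustration only. Dividing every impact factor by its discipline's median sets the median journal to 1.000, which is the rule-of-three rescaling described above.

    import statistics

    def adjust(impact_factors):
        """Rescale so the discipline's median journal gets the value 1.000."""
        m = statistics.median(impact_factors)
        return [round(f / m, 3) for f in impact_factors]

    discipline_a = [0.8, 1.6, 2.4, 3.2, 6.0]   # hypothetical raw impact factors
    discipline_b = [2.0, 4.0, 6.0, 9.0, 12.0]  # hypothetical raw impact factors
    print(adjust(discipline_a))   # [0.333, 0.667, 1.0, 1.333, 2.5]
    print(adjust(discipline_b))   # [0.333, 0.667, 1.0, 1.5, 2.0]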
Abstract:
The aim of this master’s thesis is to research and analyze how purchase invoice processing can be automated and streamlined in a system renewal project. The impact of workflow automation on invoice handling is studied in terms of time, cost and quality. Purchase invoice processing has a lot of potential for automation because of its labor-intensive and repetitive nature. As a case study combining both qualitative and quantitative methods, the topic is approached from a business process management point of view. The current process was first explored through interviews and workshop meetings to create a holistic understanding of the process at hand. Requirements for process streamlining were then researched, focusing on specified vendors and their purchase invoices, which helped to identify the critical factors for successful invoice automation. To optimize the flow from invoice receipt to approval for payment, the invoice receiving process was outsourced and the automation functionalities of the new system were utilized in invoice handling. The quality of invoice data and the need for simply structured purchase order (PO) invoices were emphasized in the system testing phase. Hence, consolidated invoices containing references to multiple PO or blanket release numbers should be simplified in order to use automated PO matching. With non-PO invoices, it is important to receive the buyer reference details in an applicable invoice data field so that automation rules can be created to route invoices to a review and approval flow. At the beginning of the project, invoice processing was ineffective both time- and cost-wise, and it required a lot of manual labor to carry out all tasks. Based on the testing results, it was estimated that over half of the invoices could be automated within a year after system implementation. Processing times could be reduced remarkably, which would in turn yield savings of up to 40% in annual processing costs. Owing to several advancements in the purchase invoice process, business process quality could also be perceived as improved.
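As an illustration of the routing rules described above, here is a deliberately simplified sketch; the field names, approver table, and routing labels are hypothetical and are not taken from the actual system.

    # Hypothetical routing rule: PO invoices go to automated matching,
    # non-PO invoices are routed to an approval flow by their buyer reference.
    APPROVAL_FLOWS = {"B-100": "buyer-a-approval", "B-200": "buyer-b-approval"}

    def route_invoice(invoice: dict) -> str:
        if invoice.get("po_number"):            # PO invoice: try automated matching
            return "auto-po-matching"
        ref = invoice.get("buyer_reference")    # non-PO invoice: route by reference
        return APPROVAL_FLOWS.get(ref, "manual-review")

    print(route_invoice({"po_number": "PO-123"}))        # auto-po-matching
    print(route_invoice({"buyer_reference": "B-100"}))   # buyer-a-approval
    print(route_invoice({}))                             # manual-review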
Abstract:
We studied the action of high pressure processing on the inactivation of two foodborne pathogens, Staphylococcus aureus ATCC 6538 and Salmonella enteritidis ATCC 13076, suspended in a culture medium and inoculated into caviar samples. The baroresistance of the two pathogens in a tryptic soy broth suspension at a concentration of 10^8-10^9 colony-forming units/ml was tested under continuous and cycled pressurization in the 150- to 550-MPa range, with 15-min treatments at room temperature. Increasing the number of cycles permitted a reduction in the pressure level required to totally inactivate both microorganisms in the tryptic soy broth suspension, whereas the effect of different procedure times on complete inactivation of the microorganisms inoculated into caviar was similar.
Abstract:
This paper analyzes the profile of Brazilian output in the field of multiple sclerosis from 1981 to 2004. The search was conducted through the MEDLINE and LILACS databases, selecting papers in which the term "multiple sclerosis" was defined as the main topic and "Brazil" or "Brasil" as secondary topics. The data were analyzed with regard to the themes, the Brazilian state and institution where the papers were produced, the journals where the papers were published, the journals' impact factors, and language. The search disclosed 141 documents (91 from MEDLINE and LILACS, and 50 from LILACS only) published in 44 different journals (23 of them MEDLINE-indexed). A total of 111 documents were produced by 17 public universities, 29 by 3 private medical schools and 1 by a non-governmental organization. There were 65 original contributions, 37 case reports, 20 reviews, 6 PhD dissertations, 5 guidelines, 2 validation studies, 2 clinical trials, 2 chapters in textbooks, 1 Master of Science thesis, and 1 patient education handout. The journal impact factor ranged from 0.0217 to 6.039 (median 3.03). Of the 91 papers from MEDLINE, 65 were published in Arquivos de Neuro-Psiquiatria. More than 90% of the papers were written in Portuguese. São Paulo was the most productive state in the country, followed by Rio de Janeiro, Minas Gerais and Paraná. Eighty-two percent of the Brazilian output came from the Southeastern region.
Abstract:
Feature extraction is the part of pattern recognition where the sensor data is transformed into a form more suitable for the machine to interpret. The purpose of this step is also to reduce the amount of information passed to the later stages of the system while preserving the information essential for discriminating the data into different classes. For instance, in image analysis the raw image intensities are vulnerable to various environmental effects, such as lighting changes, and feature extraction can be used as a means of detecting features that are invariant to certain types of illumination change. Finally, classification makes decisions based on the previously transformed data. The main focus of this thesis is on developing new methods for embedded feature extraction based on local non-parametric image descriptors. Feature analysis is also carried out for the selected image features. Low-level Local Binary Pattern (LBP) based features play the main role in the analysis. In the embedded domain, the pattern recognition system must usually meet strict performance constraints, such as high speed, compact size and low power consumption. The characteristics of the final system can be seen as a trade-off between these metrics, which is largely determined by the decisions made during the implementation phase. The implementation alternatives of LBP-based feature extraction are explored in the embedded domain in the context of focal-plane vision processors. In particular, the thesis demonstrates LBP extraction with the MIPA4k massively parallel focal-plane processor IC. Higher-level processing is also incorporated into this setting by means of a framework for implementing a single-chip face recognition system. Furthermore, a new method for determining optical flow based on LBPs, designed particularly for the embedded domain, is presented. Inspired by some of the principles observed through the feature analysis of Local Binary Patterns, an extension to the well-known non-parametric rank transform is proposed, and its performance is evaluated in face recognition experiments with a standard dataset. Finally, an a priori model in which LBPs are seen as combinations of n-tuples is presented.
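For readers unfamiliar with the descriptor, a minimal sketch of the basic 3x3 LBP operator follows: each pixel is encoded by thresholding its eight neighbours against the centre value. This is plain LBP only, without the rotation-invariant or uniform-pattern extensions, and it is not the focal-plane implementation discussed in the thesis.

    import numpy as np

    def lbp_3x3(img):
        """Basic 8-bit LBP code per pixel: bit b is set when the b-th
        neighbour (clockwise from the top-left) is >= the centre pixel."""
        img = img.astype(np.int32)
        out = np.zeros(img.shape, dtype=np.uint8)
        offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                   (1, 1), (1, 0), (1, -1), (0, -1)]
        for bit, (dy, dx) in enumerate(offsets):
            # shifted[y, x] == img[y + dy, x + dx] (borders wrap; crop in practice)
            shifted = np.roll(np.roll(img, -dy, axis=0), -dx, axis=1)
            out |= ((shifted >= img).astype(np.uint8) << bit)
        return out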
Abstract:
Brazilian scientific output exhibited a 4-fold increase in the last two decades because of the stability of investment in research and development activities and changes in the policies of the main funding agencies. Most of this production is concentrated in public universities and research institutes located in the richest part of the country. Among all areas of knowledge, the most productive are the Health and Biological Sciences. During the 1998-2002 period, these areas presented heterogeneous growth ranging from 4.5% (Pharmacology) to 191% (Psychiatry), with a median growth rate of 47.2%. In order to identify and rank the 20 most prolific institutions in these areas, searches were made in three databases (DataCAPES, ISI and MEDLINE), which permitted the identification of 109,507 original articles produced by the 592 Graduate Programs in Health and Biological Sciences offered by 118 public universities and research institutes. The 20 most productive centers, ranked according to the total number of ISI-indexed articles published during the 1998-2003 period, produced 78.7% of the papers in these areas and are strongly concentrated in the Southern part of the country, mainly in São Paulo State.