1000 resultados para machine independent
Resumo:
Biomedical natural language processing (BioNLP) is a subfield of natural language processing, an area of computational linguistics concerned with developing programs that work with natural language: written texts and speech. Biomedical relation extraction concerns the detection of semantic relations such as protein-protein interactions (PPI) from scientific texts. The aim is to enhance information retrieval by detecting relations between concepts, not just individual concepts as with a keyword search. In recent years, events have been proposed as a more detailed alternative for simple pairwise PPI relations. Events provide a systematic, structural representation for annotating the content of natural language texts. Events are characterized by annotated trigger words, directed and typed arguments and the ability to nest other events. For example, the sentence “Protein A causes protein B to bind protein C” can be annotated with the nested event structure CAUSE(A, BIND(B, C)). Converted to such formal representations, the information of natural language texts can be used by computational applications. Biomedical event annotations were introduced by the BioInfer and GENIA corpora, and event extraction was popularized by the BioNLP'09 Shared Task on Event Extraction. In this thesis we present a method for automated event extraction, implemented as the Turku Event Extraction System (TEES). A unified graph format is defined for representing event annotations and the problem of extracting complex event structures is decomposed into a number of independent classification tasks. These classification tasks are solved using SVM and RLS classifiers, utilizing rich feature representations built from full dependency parsing. Building on earlier work on pairwise relation extraction and using a generalized graph representation, the resulting TEES system is capable of detecting binary relations as well as complex event structures. We show that this event extraction system has good performance, reaching the first place in the BioNLP'09 Shared Task on Event Extraction. Subsequently, TEES has achieved several first ranks in the BioNLP'11 and BioNLP'13 Shared Tasks, as well as shown competitive performance in the binary relation Drug-Drug Interaction Extraction 2011 and 2013 shared tasks. The Turku Event Extraction System is published as a freely available open-source project, documenting the research in detail as well as making the method available for practical applications. In particular, in this thesis we describe the application of the event extraction method to PubMed-scale text mining, showing how the developed approach not only shows good performance, but is generalizable and applicable to large-scale real-world text mining projects. Finally, we discuss related literature, summarize the contributions of the work and present some thoughts on future directions for biomedical event extraction. This thesis includes and builds on six original research publications. The first of these introduces the analysis of dependency parses that leads to development of TEES. The entries in the three BioNLP Shared Tasks, as well as in the DDIExtraction 2011 task are covered in four publications, and the sixth one demonstrates the application of the system to PubMed-scale text mining.
Resumo:
A spectral angle based feature extraction method, Spectral Clustering Independent Component Analysis (SC-ICA), is proposed in this work to improve the brain tissue classification from Magnetic Resonance Images (MRI). SC-ICA provides equal priority to global and local features; thereby it tries to resolve the inefficiency of conventional approaches in abnormal tissue extraction. First, input multispectral MRI is divided into different clusters by a spectral distance based clustering. Then, Independent Component Analysis (ICA) is applied on the clustered data, in conjunction with Support Vector Machines (SVM) for brain tissue analysis. Normal and abnormal datasets, consisting of real and synthetic T1-weighted, T2-weighted and proton density/fluid-attenuated inversion recovery images, were used to evaluate the performance of the new method. Comparative analysis with ICA based SVM and other conventional classifiers established the stability and efficiency of SC-ICA based classification, especially in reproduction of small abnormalities. Clinical abnormal case analysis demonstrated it through the highest Tanimoto Index/accuracy values, 0.75/98.8%, observed against ICA based SVM results, 0.17/96.1%, for reproduced lesions. Experimental results recommend the proposed method as a promising approach in clinical and pathological studies of brain diseases
Resumo:
Developing successful navigation and mapping strategies is an essential part of autonomous robot research. However, hardware limitations often make for inaccurate systems. This project serves to investigate efficient alternatives to mapping an environment, by first creating a mobile robot, and then applying machine learning to the robot and controlling systems to increase the robustness of the robot system. My mapping system consists of a semi-autonomous robot drone in communication with a stationary Linux computer system. There are learning systems running on both the robot and the more powerful Linux system. The first stage of this project was devoted to designing and building an inexpensive robot. Utilizing my prior experience from independent studies in robotics, I designed a small mobile robot that was well suited for simple navigation and mapping research. When the major components of the robot base were designed, I began to implement my design. This involved physically constructing the base of the robot, as well as researching and acquiring components such as sensors. Implementing the more complex sensors became a time-consuming task, involving much research and assistance from a variety of sources. A concurrent stage of the project involved researching and experimenting with different types of machine learning systems. I finally settled on using neural networks as the machine learning system to incorporate into my project. Neural nets can be thought of as a structure of interconnected nodes, through which information filters. The type of neural net that I chose to use is a type that requires a known set of data that serves to train the net to produce the desired output. Neural nets are particularly well suited for use with robotic systems as they can handle cases that lie at the extreme edges of the training set, such as may be produced by "noisy" sensor data. Through experimenting with available neural net code, I became familiar with the code and its function, and modified it to be more generic and reusable for multiple applications of neural nets.
Resumo:
This thesis presents DCE, or Dynamic Conditional Execution, as an alternative to reduce the cost of mispredicted branches. The basic idea is to fetch all paths produced by a branch that obey certain restrictions regarding complexity and size. As a result, a smaller number of predictions is performed, and therefore, a lesser number of branches are mispredicted. DCE fetches through selected branches avoiding disruptions in the fetch flow when these branches are fetched. Both paths of selected branches are executed but only the correct path commits. In this thesis we propose an architecture to execute multiple paths of selected branches. Branches are selected based on the size and other conditions. Simple and complex branches can be dynamically predicated without requiring a special instruction set nor special compiler optimizations. Furthermore, a technique to reduce part of the overhead generated by the execution of multiple paths is proposed. The performance achieved reaches levels of up to 12% when comparing a Local predictor used in DCE against a Global predictor used in the reference machine. When both machines use a Local predictor, the speedup is increased by an average of 3-3.5%.
Resumo:
Delineating brain tumor boundaries from magnetic resonance images is an essential task for the analysis of brain cancer. We propose a fully automatic method for brain tissue segmentation, which combines Support Vector Machine classification using multispectral intensities and textures with subsequent hierarchical regularization based on Conditional Random Fields. The CRF regularization introduces spatial constraints to the powerful SVM classification, which assumes voxels to be independent from their neighbors. The approach first separates healthy and tumor tissue before both regions are subclassified into cerebrospinal fluid, white matter, gray matter and necrotic, active, edema region respectively in a novel hierarchical way. The hierarchical approach adds robustness and speed by allowing to apply different levels of regularization at different stages. The method is fast and tailored to standard clinical acquisition protocols. It was assessed on 10 multispectral patient datasets with results outperforming previous methods in terms of segmentation detail and computation times.
Resumo:
BACKGROUND AND PURPOSE Reproducible segmentation of brain tumors on magnetic resonance images is an important clinical need. This study was designed to evaluate the reliability of a novel fully automated segmentation tool for brain tumor image analysis in comparison to manually defined tumor segmentations. METHODS We prospectively evaluated preoperative MR Images from 25 glioblastoma patients. Two independent expert raters performed manual segmentations. Automatic segmentations were performed using the Brain Tumor Image Analysis software (BraTumIA). In order to study the different tumor compartments, the complete tumor volume TV (enhancing part plus non-enhancing part plus necrotic core of the tumor), the TV+ (TV plus edema) and the contrast enhancing tumor volume CETV were identified. We quantified the overlap between manual and automated segmentation by calculation of diameter measurements as well as the Dice coefficients, the positive predictive values, sensitivity, relative volume error and absolute volume error. RESULTS Comparison of automated versus manual extraction of 2-dimensional diameter measurements showed no significant difference (p = 0.29). Comparison of automated versus manual segmentation of volumetric segmentations showed significant differences for TV+ and TV (p<0.05) but no significant differences for CETV (p>0.05) with regard to the Dice overlap coefficients. Spearman's rank correlation coefficients (ρ) of TV+, TV and CETV showed highly significant correlations between automatic and manual segmentations. Tumor localization did not influence the accuracy of segmentation. CONCLUSIONS In summary, we demonstrated that BraTumIA supports radiologists and clinicians by providing accurate measures of cross-sectional diameter-based tumor extensions. The automated volume measurements were comparable to manual tumor delineation for CETV tumor volumes, and outperformed inter-rater variability for overlap and sensitivity.
Resumo:
Abstract machines provide a certain separation between platformdependent and platform-independent concerns in compilation. Many of the differences between architectures are encapsulated in the speciflc abstract machine implementation and the bytecode is left largely architecture independent. Taking advantage of this fact, we present a framework for estimating upper and lower bounds on the execution times of logic programs running on a bytecode-based abstract machine. Our approach includes a one-time, programindependent proflling stage which calculates constants or functions bounding the execution time of each abstract machine instruction. Then, a compile-time cost estimation phase, using the instruction timing information, infers expressions giving platform-dependent upper and lower bounds on actual execution time as functions of input data sizes for each program. Working at the abstract machine level makes it possible to take into account low-level issues in new architectures and platforms by just reexecuting the calibration stage instead of having to tailor the analysis for each architecture and platform. Applications of such predicted execution times include debugging/veriflcation of time properties, certiflcation of time properties in mobile code, granularity control in parallel/distributed computing, and resource-oriented specialization.
Resumo:
We present a parallel graph narrowing machine, which is used to implement a functional logic language on a shared memory multiprocessor. It is an extensión of an abstract machine for a purely functional language. The result is a programmed graph reduction machine which integrates the mechanisms of unification, backtracking, and independent and-parallelism. In the machine, the subexpressions of an expression can run in parallel. In the case of backtracking, the structure of an expression is used to avoid the reevaluation of subexpressions as far as possible. Deterministic computations are detected. Their results are maintained and need not be reevaluated after backtracking.
Resumo:
This paper proposes an architecture, based on statistical machine translation, for developing the text normalization module of a text to speech conversion system. The main target is to generate a language independent text normalization module, based on data and flexible enough to deal with all situa-tions presented in this task. The proposed architecture is composed by three main modules: a tokenizer module for splitting the text input into a token graph (tokenization), a phrase-based translation module (token translation) and a post-processing module for removing some tokens. This paper presents initial exper-iments for numbers and abbreviations. The very good results obtained validate the proposed architecture.
Resumo:
A method is given for determining the time course and spatial extent of consistently and transiently task-related activations from other physiological and artifactual components that contribute to functional MRI (fMRI) recordings. Independent component analysis (ICA) was used to analyze two fMRI data sets from a subject performing 6-min trials composed of alternating 40-sec Stroop color-naming and control task blocks. Each component consisted of a fixed three-dimensional spatial distribution of brain voxel values (a “map”) and an associated time course of activation. For each trial, the algorithm detected, without a priori knowledge of their spatial or temporal structure, one consistently task-related component activated during each Stroop task block, plus several transiently task-related components activated at the onset of one or two of the Stroop task blocks only. Activation patterns occurring during only part of the fMRI trial are not observed with other techniques, because their time courses cannot easily be known in advance. Other ICA components were related to physiological pulsations, head movements, or machine noise. By using higher-order statistics to specify stricter criteria for spatial independence between component maps, ICA produced improved estimates of the temporal and spatial extent of task-related activation in our data compared with principal component analysis (PCA). ICA appears to be a promising tool for exploratory analysis of fMRI data, particularly when the time courses of activation are not known in advance.
Resumo:
Optimism is growing that the near future will witness rapid growth in human-computer interaction using voice. System prototypes have recently been built that demonstrate speaker-independent real-time speech recognition, and understanding of naturally spoken utterances with vocabularies of 1000 to 2000 words, and larger. Already, computer manufacturers are building speech recognition subsystems into their new product lines. However, before this technology can be broadly useful, a substantial knowledge base is needed about human spoken language and performance during computer-based spoken interaction. This paper reviews application areas in which spoken interaction can play a significant role, assesses potential benefits of spoken interaction with machines, and compares voice with other modalities of human-computer interaction. It also discusses information that will be needed to build a firm empirical foundation for the design of future spoken and multimodal interfaces. Finally, it argues for a more systematic and scientific approach to investigating spoken input and performance with future language technology.
Resumo:
'Free will' and its corollary, the concept of individual responsibility are keystones of the justice system. This paper shows that if we accept a physics that disallows time reversal, the concept of 'free will' is undermined by an integrated understanding of the influence of genetics and environment on human behavioural responses. Analysis is undertaken by modelling life as a novel statistico-deterministic version of a Turing machine, i.e. as a series of transitions between states at successive instants of time. Using this model it is proven by induction that the entire course of life is independent of the action of free will. Although determined by prior state, the probability of transitions between states in response to a standard environmental stimulus is not equal to 1 and the transitions may differ quantitatively at the molecular level and qualitatively at the level of the whole organism. Transitions between states correspond to behaviours. It is shown that the behaviour of identical twins (or clones), although determined, would be incompletely predictable and non-identical, creating an illusion of the operation of 'free will'. 'Free will' is a convenient construct for current judicial systems and social control because it allows rationalization of punishment for those whose behaviour falls outside socially defined norms. Indeed, it is conceivable that maintenance of ideas of free will has co-evolved with community morality to reinforce its operation. If the concept is free will is to be maintained it would require revision of our current physical theories.
Resumo:
The thesis describes an investigation into methods for the specification, design and implementation of computer control systems for flexible manufacturing machines comprising multiple, independent, electromechanically-driven mechanisms. An analysis is made of the elements of conventional mechanically-coupled machines in order that the operational functions of these elements may be identified. This analysis is used to define the scope of requirements necessary to specify the format, function and operation of a flexible, independently driven mechanism machine. A discussion of how this type of machine can accommodate modern manufacturing needs of high-speed and flexibility is presented. A sequential method of capturing requirements for such machines is detailed based on a hierarchical partitioning of machine requirements from product to independent drive mechanism. A classification of mechanisms using notations, including Data flow diagrams and Petri-nets, is described which supports capture and allows validation of requirements. A generic design for a modular, IDM machine controller is derived based upon hierarchy of control identified in these machines. A two mechanism experimental machine is detailed which is used to demonstrate the application of the specification, design and implementation techniques. A computer controller prototype and a fully flexible implementation for the IDM machine, based on Petri-net models described using the concurrent programming language Occam, is detailed. The ability of this modular computer controller to support flexible, safe and fault-tolerant operation of the two intermittent motion, discrete-synchronisation independent drive mechanisms is presented. The application of the machine development methodology to industrial projects is established.
Resumo:
Traditional machinery for manufacturing processes are characterised by actuators powered and co-ordinated by mechanical linkages driven from a central drive. Increasingly, these linkages are replaced by independent electrical drives, each performs a different task and follows a different motion profile, co-ordinated by computers. A design methodology for the servo control of high speed multi-axis machinery is proposed, based on the concept of a highly adaptable generic machine model. In addition to the dynamics of the drives and the loads, the model includes the inherent interactions between the motion axes and thus provides a Multi-Input Multi-Output (MIMO) description. In general, inherent interactions such as structural couplings between groups of motion axes are undesirable and needed to be compensated. On the other hand, imposed interactions such as the synchronisation of different groups of axes are often required. It is recognised that a suitable MIMO controller can simultaneously achieve these objectives and reconciles their potential conflicts. Both analytical and numerical methods for the design of MIMO controllers are investigated. At present, it is not possible to implement high order MIMO controllers for practical reasons. Based on simulations of the generic machine model under full MIMO control, however, it is possible to determine a suitable topology for a blockwise decentralised control scheme. The Block Relative Gain array (BRG) is used to compare the relative strength of closed loop interactions between sub-systems. A number of approaches to the design of the smaller decentralised MIMO controllers for these sub-systems has been investigated. For the purpose of illustration, a benchmark problem based on a 3 axes test rig has been carried through the design cycle to demonstrate the working of the design methodology.
Resumo:
Background: DNA-binding proteins play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. Identification of DNA-binding proteins is one of the major challenges in the field of genome annotation. There have been several computational methods proposed in the literature to deal with the DNA-binding protein identification. However, most of them can't provide an invaluable knowledge base for our understanding of DNA-protein interactions. Results: We firstly presented a new protein sequence encoding method called PSSM Distance Transformation, and then constructed a DNA-binding protein identification method (SVM-PSSM-DT) by combining PSSM Distance Transformation with support vector machine (SVM). First, the PSSM profiles are generated by using the PSI-BLAST program to search the non-redundant (NR) database. Next, the PSSM profiles are transformed into uniform numeric representations appropriately by distance transformation scheme. Lastly, the resulting uniform numeric representations are inputted into a SVM classifier for prediction. Thus whether a sequence can bind to DNA or not can be determined. In benchmark test on 525 DNA-binding and 550 non DNA-binding proteins using jackknife validation, the present model achieved an ACC of 79.96%, MCC of 0.622 and AUC of 86.50%. This performance is considerably better than most of the existing state-of-the-art predictive methods. When tested on a recently constructed independent dataset PDB186, SVM-PSSM-DT also achieved the best performance with ACC of 80.00%, MCC of 0.647 and AUC of 87.40%, and outperformed some existing state-of-the-art methods. Conclusions: The experiment results demonstrate that PSSM Distance Transformation is an available protein sequence encoding method and SVM-PSSM-DT is a useful tool for identifying the DNA-binding proteins. A user-friendly web-server of SVM-PSSM-DT was constructed, which is freely accessible to the public at the web-site on http://bioinformatics.hitsz.edu.cn/PSSM-DT/.