838 results for Automated algorithms
Abstract:
Images of a scene, static or dynamic, are generally acquired at different epochs and from different viewpoints. They potentially gather information about the whole scene and its motion relative to the acquisition device. Data from different visual sources (different in the spatial or temporal domain) can be fused to provide a single consistent representation of the whole scene, even recovering the third dimension, which permits a more complete understanding of the scene content. Moreover, the pose of the acquisition device can be recovered by estimating the relative motion parameters linking different views, thus providing localization information for automatic guidance purposes. Image registration uses pattern recognition techniques to match corresponding parts of different views of the acquired scene. Depending on hypotheses or prior information about the sensor model, the motion model and/or the scene model, these matches can be used to estimate global or local geometrical mapping functions between different images or different parts of them. These mapping functions contain the relative motion parameters between the scene and the sensor(s) and can be used to integrate information coming from the different sources into a wider or even augmented representation of the scene. Because of these scene reconstruction and pose estimation capabilities, multi-view image registration techniques are attracting growing interest from the scientific and industrial communities. Depending on the application domain, the accuracy, robustness, and computational load of the algorithms are important issues to be addressed, and generally a trade-off among them has to be reached. Moreover, on-line performance is desirable in order to guarantee direct interaction of the vision device with human actors or control systems.
This thesis follows a general research approach to cope with these issues, almost independently of the scene content, under the constraint of rigid motions. The approach is motivated by portability to very different domains, which is a highly desirable property. A general image registration approach suitable for on-line applications has been devised and assessed through two challenging case studies in different application domains. The first case study concerns scene reconstruction through on-line mosaicing of optical microscopy cell images acquired with non-automated equipment, while the microscope holder is moved manually. Registering the images widens the field of view of the microscope while preserving the resolution, so that the whole cell culture is reconstructed and the microscopist can explore it interactively. In the second case study, the registration of terrestrial satellite images acquired by a camera rigidly attached to the satellite is used to estimate the satellite's three-dimensional orientation from visual data, for automatic guidance purposes. Critical aspects of these applications are emphasized and the choices adopted are motivated accordingly. Results are discussed in view of promising future developments.
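As a rough illustration of what "estimating the relative motion parameters" under a rigid-motion constraint can look like, the sketch below recovers a 2D rotation and translation from already matched point pairs using the standard closed-form least-squares solution. It is not the thesis's method: the function names `estimate_rigid_2d` and `apply_rigid_2d` are hypothetical, and the matching step that produces the point pairs is assumed to have been done elsewhere.

```python
import math

def estimate_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src onto dst.

    src and dst are equal-length lists of matched (x, y) points.
    """
    n = len(src)
    cxs = sum(x for x, _ in src) / n
    cys = sum(y for _, y in src) / n
    cxd = sum(x for x, _ in dst) / n
    cyd = sum(y for _, y in dst) / n
    dot = cross = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        # Work with coordinates centred on each point set's centroid.
        xs, ys, xd, yd = xs - cxs, ys - cys, xd - cxd, yd - cyd
        dot += xs * xd + ys * yd
        cross += xs * yd - ys * xd
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the target centroid.
    tx = cxd - (c * cxs - s * cys)
    ty = cyd - (s * cxs + c * cys)
    return theta, tx, ty

def apply_rigid_2d(points, theta, tx, ty):
    """Rotate points by theta (radians) and translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

# Illustrative data only: four points moved by a known rigid motion.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
dst = apply_rigid_2d(src, math.radians(30.0), 5.0, -2.0)
theta, tx, ty = estimate_rigid_2d(src, dst)
```

With noise-free matches the recovered parameters reproduce the motion that generated `dst`; with real matches a robust wrapper to reject outliers would normally be needed, which this sketch omits.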
Abstract:
The first phase of the research activity concerned the state of the art of cycling infrastructure, bicycle use and evaluation methods. In this part, the candidate studied the "bicycle system" in countries with high bicycle use, in particular the Netherlands. An evaluation was carried out of the questionnaires from the survey on general mobility conducted within the European project BICY in 13 cities of the participating countries. The questionnaire was designed, tested and implemented, and was later validated by a test in Bologna. The results were corrected using demographic information and compared with official data. The cycling infrastructure analysis was based on information from the OpenStreetMap database. The activity consisted of programming algorithms in Python that extract the infrastructure data for a region from the database, then sort and filter the cycling infrastructure and calculate attributes such as the length of the path arcs. The results obtained were compared with official data where available. The structure of the thesis is as follows: 1. Introduction: description of the state of cycling in several advanced countries, and of the methods of analysis and their importance for implementing appropriate cycling policies. Supply and demand of bicycle infrastructure. 2. Survey on mobility: details of the investigation carried out and of the evaluation method. The results obtained are presented and compared with official data. 3. Analysis of cycling infrastructure based on information from the OpenStreetMap database: describes the methods and algorithms developed during the PhD. The results obtained by the algorithms are compared with official data. 4. Discussion: the above results are discussed and compared. In particular, cycling demand is compared with the length of the cycle network within a city. 5. Conclusions
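The filtering and length calculation described above can be sketched as follows. This is a minimal illustration, not the thesis's code: it assumes the OpenStreetMap data have already been loaded into plain dictionaries (`nodes` mapping node id to latitude/longitude, `ways` carrying a node list and tags), and the helper names are hypothetical. The tags `highway=cycleway` and `bicycle=designated` are real OpenStreetMap tags, but which tags the thesis actually filtered on is not stated.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def is_cycling_way(tags):
    """Assumed filter: dedicated cycleways or ways designated for bicycles."""
    return tags.get("highway") == "cycleway" or tags.get("bicycle") == "designated"

def way_length_m(way, nodes):
    """Sum the segment lengths along the way's ordered node list."""
    coords = [nodes[n] for n in way["nodes"]]
    return sum(haversine_m(*a, *b) for a, b in zip(coords, coords[1:]))

def cycling_network_length_m(ways, nodes):
    """Total length of all ways that pass the cycling filter."""
    return sum(way_length_m(w, nodes) for w in ways if is_cycling_way(w["tags"]))

# Illustrative data only: two short ways, one of which is a cycleway.
nodes = {1: (44.4940, 11.3420), 2: (44.4950, 11.3420), 3: (44.4950, 11.3440)}
ways = [
    {"nodes": [1, 2], "tags": {"highway": "cycleway"}},
    {"nodes": [2, 3], "tags": {"highway": "residential"}},
]
total_m = cycling_network_length_m(ways, nodes)
```

Summing per-segment great-circle distances keeps the sketch free of projection libraries; a production pipeline would more likely project to a local metric coordinate system first.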
Abstract:
The main goal of this thesis is to facilitate the development of industrial automated systems by applying formal methods to ensure system reliability. A new formulation of the distributed diagnosability problem in terms of Discrete Event Systems theory and the automata framework is presented, which is then used to enforce the desired property of the system rather than just verifying it. This approach tackles the state explosion problem with modeling patterns and new algorithms aimed at verifying the diagnosability property in the context of the distributed diagnosability problem. The concepts are validated with a newly developed software tool.
Abstract:
The variability of results from different automated methods of detection and tracking of extratropical cyclones is assessed in order to identify uncertainties related to the choice of method. Fifteen international teams applied their own algorithms to the same dataset: the period 1989-2009 of interim European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-Interim) data. This experiment is part of the community project Intercomparison of Mid Latitude Storm Diagnostics (IMILAST; see www.proclim.ch/imilast/index.html). The spread of results for cyclone frequency, intensity, life cycle, and track location is presented to illustrate the impact of using different methods. Globally, methods agree well for geographical distribution in large oceanic regions, interannual variability of cyclone numbers, geographical patterns of strong trends, and distribution shape for many life cycle characteristics. In contrast, the largest disparities exist for the total numbers of cyclones, the detection of weak cyclones, and distribution in some densely populated regions. Consistency between methods is better for strong cyclones than for shallow ones. Two case studies of relatively large, intense cyclones reveal that the identification of the most intense part of the life cycle of these events is robust between methods, but considerable differences exist during the development and the dissolution phases.
Abstract:
INTRODUCTION Native MR angiography (N-MRA) is considered an imaging alternative to contrast-enhanced MR angiography (CE-MRA) for patients with renal insufficiency. Lower intraluminal contrast in N-MRA often leads to failure of the segmentation process in commercial algorithms. This study introduces an in-house 3D model-based segmentation approach used to compare both sequences by automatic 3D lumen segmentation, allowing differences in aortic lumen diameter and in length between the two acquisition techniques to be evaluated at every possible location. METHODS AND MATERIALS Sixteen healthy volunteers underwent 1.5-T MR angiography (MRA). For each volunteer, two different MR sequences were performed: CE-MRA (gradient echo Turbo FLASH sequence) and N-MRA (respiratory- and cardiac-gated, T2-weighted 3D SSFP). Datasets were segmented using a 3D model-based ellipse-fitting approach with a single seed point placed manually above the celiac trunk. The segmented volumes were manually cropped from the left subclavian artery to the celiac trunk to avoid errors due to side branches. Diameters, volumes and centerline length were computed for intraindividual comparison. The Wilcoxon signed-rank test was used for statistical analysis. RESULTS Average centerline length based on N-MRA was 239.0±23.4 mm compared to 238.6±23.5 mm for CE-MRA, with no significant difference (P=0.877). Average maximum diameter based on N-MRA was 25.7±3.3 mm compared to 24.1±3.2 mm for CE-MRA (P<0.001). In agreement with the difference in diameters, volumes based on N-MRA (100.1±35.4 cm³) were consistently and significantly larger than those based on CE-MRA (89.2±30.0 cm³) (P<0.001). CONCLUSIONS 3D morphometry shows highly similar centerline lengths for N-MRA and CE-MRA, but systematically higher diameters and volumes for N-MRA.
Abstract:
BACKGROUND Precise detection of volume change allows the biological behavior of lung nodules to be estimated more reliably. Postprocessing tools with automated detection, segmentation, and volumetric analysis of lung nodules may expedite radiological processes and give radiologists additional confidence. PURPOSE To compare two different postprocessing software algorithms (LMS Lung, Median Technologies; LungCARE®, Siemens) in CT volumetric measurement and to analyze the effect of soft (B30) and hard (B70) reconstruction filters on automated volume measurement. MATERIAL AND METHODS Between January 2010 and April 2010, 45 patients with a total of 113 pulmonary nodules were included. The CT exam was performed on a 64-row multidetector CT scanner (Somatom Sensation, Siemens, Erlangen, Germany) with the following parameters: collimation, 24x1.2 mm; pitch, 1.15; voltage, 120 kVp; reference tube current-time, 100 mAs. Automated volumetric measurement of each lung nodule was performed with the two different postprocessing algorithms based on two reconstruction filters (B30 and B70). The average relative volume measurement difference (VME%) and the limits of agreement between the two methods were used for comparison. RESULTS With soft reconstruction filters, the LMS system produced mean nodule volumes that were 34.1% (P < 0.0001) larger than those of the LungCARE® system. The VME% was 42.2% with limits of agreement between -53.9% and 138.4%. The volume measured with soft filters (B30) was significantly larger than with hard filters (B70): 11.2% for LMS and 1.6% for LungCARE®, respectively (both with P < 0.05). LMS measured greater volumes with both filters, 13.6% for soft and 3.8% for hard filters, respectively (P < 0.01 and P > 0.05).
CONCLUSION There is a substantial inter-software (LMS/LungCARE®) as well as intra-software variability (B30/B70) in lung nodule volume measurement; therefore, it is mandatory to use the same equipment with the same reconstruction filter for the follow-up of lung nodule volume.
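The abstract does not define how VME% or the limits of agreement were computed. The sketch below shows one common convention, stated here as an assumption: the relative difference of each nodule is taken against the mean of the two measurements, and the limits of agreement are the mean difference plus or minus 1.96 standard deviations (the Bland-Altman convention). The reported limits (-53.9% and 138.4%) are roughly symmetric about the reported 42.2%, which is at least consistent with a mean ± k·SD construction. Function names and the sample volumes are hypothetical.

```python
import statistics

def relative_differences_pct(volumes_a, volumes_b):
    """Per-nodule relative difference in percent, relative to the pair mean (assumed)."""
    return [100.0 * (a - b) / ((a + b) / 2.0) for a, b in zip(volumes_a, volumes_b)]

def limits_of_agreement(diffs, k=1.96):
    """Return (mean, lower, upper) using mean +/- k * sample standard deviation."""
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return mean, mean - k * sd, mean + k * sd

# Illustrative volumes in mm^3 only; these are not data from the study.
software_a = [120.0, 310.0, 95.0, 540.0]
software_b = [100.0, 290.0, 99.0, 470.0]
diffs = relative_differences_pct(software_a, software_b)
mean_diff, lower, upper = limits_of_agreement(diffs)
```

Wide limits relative to the mean difference, as reported in the study, indicate that the two programs cannot be used interchangeably for follow-up of the same nodule.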
Abstract:
PURPOSE Quantification of retinal layers using automated segmentation of optical coherence tomography (OCT) images allows for longitudinal studies of retinal and neurological disorders in mice. The purpose of this study was to compare the performance of automated retinal layer segmentation algorithms with data from manual segmentation in mice using the Spectralis OCT. METHODS Spectral domain OCT images from a total of 55 mice from three different mouse strains were analyzed. The OCT scans from 22 C57Bl/6, 22 BALB/c, and 11 C3A.Cg-Pde6b(+)Prph2(Rd2)/J mice were automatically segmented using three commercially available automated retinal segmentation algorithms and compared to manual segmentation. RESULTS Fully automated segmentation performed well in mice and showed coefficients of variation (CV) below 5% for the total retinal volume. However, all three automated segmentation algorithms yielded much higher total retinal thickness values than manual segmentation (P < 0.0001) due to segmentation errors in the basement membrane. CONCLUSIONS Whereas the automated retinal segmentation algorithms performed well for the inner layers, the retinal pigment epithelium (RPE) was delineated within the sclera, leading to consistently thicker measurements of the photoreceptor layer and the total retina. TRANSLATIONAL RELEVANCE The introduction of spectral domain OCT allows for accurate imaging of the mouse retina. Exact quantification of retinal layer thicknesses in mice is important for studying layers of interest under various pathological conditions.
Abstract:
This paper presents a shallow dialogue analysis model, aimed at human-human dialogues in the context of staff or business meetings. Four components of the model are defined, and several machine learning techniques are used to extract features from dialogue transcripts: maximum entropy classifiers for dialogue acts, latent semantic analysis for topic segmentation, and decision tree classifiers for discourse markers. A rule-based approach is proposed for resolving cross-modal references to meeting documents. The methods are trained and evaluated using a common data set and annotation format. The integration of the components into an automated shallow dialogue parser opens the way to multimodal meeting processing and retrieval applications.
Abstract:
We present tools for rapid and quantitative detection of sediment lamination. The BMPix tool extracts color and gray-scale curves from images at pixel resolution. The PEAK tool uses the gray-scale curve and performs, for the first time, fully automated counting of laminae based on three methods. The maximum count algorithm counts every bright peak of a couplet of two laminae (annual resolution) in a smoothed curve. The zero-crossing algorithm counts every positive and negative halfway-passage of the curve through a wide moving average, separating the record into bright and dark intervals (seasonal resolution). The same is true for the frequency truncation method, which uses Fourier transformation to decompose the curve into its frequency components before counting positive and negative passages. We applied the new methods successfully to tree rings, to well-dated and already manually counted marine varves from Saanich Inlet, and to marine laminae from the Antarctic continental margin. In combination with AMS ¹⁴C dating, we found convincing evidence that laminations in Weddell Sea sites represent varves, deposited continuously over several millennia during the last glacial maximum. The new tools offer several advantages over previous methods. The counting procedures are based on a moving average generated from gray-scale curves instead of manual counting. Hence, results are highly objective and rely on reproducible mathematical criteria. Also, the PEAK tool measures the thickness of each year or season. Since all information required is displayed graphically, interactive optimization of the counting algorithms can be achieved quickly and conveniently.
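The first two counting methods can be sketched in a few lines. This is an illustration of the idea only, not the PEAK tool's implementation: the function names, the centred moving average that shrinks at the ends of the record, and the window sizes are all assumptions.

```python
def moving_average(values, window):
    """Centred moving average; the window shrinks near the ends of the record."""
    half = window // 2
    out = []
    for i in range(len(values)):
        segment = values[max(0, i - half): i + half + 1]
        out.append(sum(segment) / len(segment))
    return out

def count_bright_peaks(gray, smooth_window=3):
    """Maximum-count idea: count local maxima of the smoothed gray-scale curve."""
    s = moving_average(gray, smooth_window)
    # A flat-topped peak is counted once, at its leading edge.
    return sum(1 for i in range(1, len(s) - 1) if s[i - 1] < s[i] >= s[i + 1])

def count_zero_crossings(gray, wide_window):
    """Zero-crossing idea: count passages of the curve through a wide moving average."""
    baseline = moving_average(gray, wide_window)
    above = [v > b for v, b in zip(gray, baseline)]
    return sum(1 for a, b in zip(above, above[1:]) if a != b)

# Synthetic record: four dark/bright couplets, five pixels per lamina.
gray = ([0.0] * 5 + [10.0] * 5) * 4
couplets = count_bright_peaks(gray)           # one bright peak per couplet
passages = count_zero_crossings(gray, 21)     # transitions between dark and bright
```

On this synthetic record the peak count equals the number of couplets, while the crossing count reflects every dark-to-bright and bright-to-dark transition, which is why the abstract associates the former with annual and the latter with seasonal resolution.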
Abstract:
Due to the relative transparency of its embryos and larvae, the zebrafish is an ideal model organism for bioimaging approaches in vertebrates. Novel microscope technologies allow the imaging of developmental processes in unprecedented detail, and they enable the use of complex image-based read-outs for high-throughput/high-content screening. Such applications can easily generate terabytes of image data, the handling and analysis of which becomes a major bottleneck in extracting the targeted information. Here, we describe the current state of the art in computational image analysis in the zebrafish system. We discuss the challenges encountered when handling high-content image data, especially with regard to data quality, annotation, and storage. We survey methods for preprocessing image data for further analysis, and describe selected examples of automated image analysis, including the tracking of cells during embryogenesis, heartbeat detection, identification of dead embryos, recognition of tissues and anatomical landmarks, and quantification of behavioral patterns of adult fish. We review recent examples of applications using such methods, such as the comprehensive analysis of cell lineages during early development, the generation of a three-dimensional brain atlas of zebrafish larvae, and high-throughput drug screens based on movement patterns. Finally, we identify future challenges for the zebrafish image analysis community, notably those concerning the compatibility of algorithms and data formats for the assembly of modular analysis pipelines.
Abstract:
Scientific workflows provide the means to define, execute and reproduce computational experiments. However, reusing existing workflows still poses challenges for workflow designers. Workflows are often too large and too specific to reuse in their entirety, so reuse is more likely to happen for fragments of workflows. These fragments may be identified manually by users as sub-workflows, or detected automatically. In this paper we present the FragFlow approach, which detects workflow fragments automatically by analyzing existing workflow corpora with graph mining algorithms. FragFlow detects the most common workflow fragments, links them to the original workflows and visualizes them. We evaluate our approach by comparing FragFlow results against user-defined sub-workflows from three different corpora of the LONI Pipeline system. Based on this evaluation, we discuss how automated workflow fragment detection could facilitate workflow reuse.
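To make the idea of detecting common fragments and linking them back to their workflows concrete, the sketch below uses the smallest possible fragment, a single edge between two processing steps, and keeps those that occur in at least `min_support` workflows. This is a deliberately simplified stand-in: FragFlow relies on graph mining algorithms that find larger subgraphs, and the function name, data layout and step names here are hypothetical.

```python
from collections import defaultdict

def common_fragments(workflows, min_support=2):
    """Map each edge (step_a, step_b) to the workflows that contain it.

    workflows maps a workflow name to a list of directed edges between steps.
    Only edges found in at least min_support workflows are returned.
    """
    occurrences = defaultdict(set)
    for name, edges in workflows.items():
        for edge in edges:
            occurrences[edge].add(name)
    return {
        edge: sorted(names)
        for edge, names in occurrences.items()
        if len(names) >= min_support
    }

# Illustrative corpus only; step names are made up.
corpus = {
    "wf1": [("align", "smooth"), ("smooth", "segment")],
    "wf2": [("align", "smooth"), ("smooth", "register")],
    "wf3": [("convert", "align"), ("align", "smooth")],
}
fragments = common_fragments(corpus, min_support=2)
```

Keeping the list of source workflows alongside each fragment mirrors the linking step described in the abstract and is what would let a designer jump from a frequent fragment to concrete examples of its use.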
Abstract:
Acquired brain injury (ABI) is a serious social and health problem of growing magnitude and great diagnostic and therapeutic complexity. Its high incidence, together with increased patient survival once the acute phase has passed, also makes it a highly prevalent problem. Specifically, according to the World Health Organization (WHO), ABI will be among the 10 most common causes of disability by 2020. Neurorehabilitation improves both cognitive and functional deficits and increases the autonomy of people with ABI. The incorporation of new technological solutions into the neurorehabilitation process aims at a new paradigm in which treatments can be designed to be intensive, personalized, monitored and evidence-based, since these four characteristics are what ensure that treatments are effective. Unlike in most medical disciplines, there are no associations of symptoms and signs of cognitive impairment that help guide therapy. Currently, neurorehabilitation treatments are designed on the basis of the results of a neuropsychological assessment battery that evaluates the degree of impairment of each cognitive function (memory, attention, executive functions, etc.). The research line to which this work belongs aims to design and develop a cognitive profile based not only on the results of that test battery, but also on theoretical information covering both anatomical structures and functional relationships, and on anatomical information obtained from imaging studies. In this way, the cognitive profile used to design treatments integrates personalized, evidence-based information.
Neuroimaging techniques are a fundamental tool for identifying lesions in order to generate these cognitive profiles. The classical approach to lesion identification is to manually delineate anatomical brain regions. This approach suffers from several problems related to inconsistent criteria among clinicians, reproducibility and time. Automating this procedure is therefore essential to ensure objective extraction of information. Automatic delineation of anatomical regions is performed by registration against atlases as well as against imaging studies of other subjects. However, the pathological changes associated with ABI are always accompanied by intensity abnormalities and/or changes in the location of structures. As a result, traditional intensity-based registration algorithms do not work correctly and require the clinician to select certain points (which in this thesis we call singular points). Moreover, these algorithms do not allow large, distributed deformations, which can also occur in the presence of lesions caused by a stroke or a traumatic brain injury (TBI). This thesis focuses on the design, development and implementation of a methodology for the automatic detection of injured structures that integrates algorithms whose main objective is to generate reproducible and objective results. The methodology is divided into four stages: pre-processing, singular point identification, registration and lesion detection. The work carried out and the results achieved in this thesis are as follows: Pre-processing. The aim of this first stage is to homogenize all input data so that valid conclusions can be drawn from the results.
This stage therefore has a great impact on the final results. It consists of three operations: skull stripping, intensity normalization and spatial normalization. Singular point identification. The aim of this stage is to automate the identification of anatomical points (singular points). This stage is equivalent to the manual identification of anatomical points by the clinician, and it makes it possible to identify a larger number of points, which translates into more information; to eliminate the factor associated with inter-subject variability, so that the results are reproducible and objective; and to eliminate the time spent on manual point marking. This research work proposes a singular point identification algorithm (descriptor) based on a multi-detector solution and containing multi-parametric information: spatial and intensity-related. The algorithm has been compared with similar algorithms from the state of the art. Registration. This stage aims to bring two imaging studies of different subjects/patients into spatial correspondence. The algorithm proposed in this work is based on descriptors, and its main objective is to compute a vector field that allows distributed deformations to be introduced in the image (in different regions of the image), as large as the associated deformation vector indicates. The proposed algorithm has been compared with other registration algorithms used in neuroimaging applications with studies of control subjects. The results obtained are promising and represent a new context for the automatic identification of structures. Lesion identification. In this last stage, structures are identified whose characteristics related to spatial location and to area or volume have been modified with respect to a normal situation.
To this end, a statistical study of the atlas to be used is carried out, and the statistical normality parameters associated with location and area are established. Depending on the structures delineated in the atlas, more or fewer anatomical structures can be identified, our methodology being independent of the selected atlas. Overall, this doctoral thesis corroborates the research hypotheses postulated regarding the automatic identification of lesions using structural medical imaging studies, specifically magnetic resonance studies. On these foundations, new research fields can be opened that contribute to improving lesion detection.
ABSTRACT Brain injury constitutes a serious social and health problem of increasing magnitude and of great diagnostic and therapeutic complexity. Its high incidence and survival rate, after the initial critical phases, make it a prevalent problem that needs to be addressed. In particular, according to the World Health Organization (WHO), brain injury will be among the 10 most common causes of disability by 2020. Neurorehabilitation improves both cognitive and functional deficits and increases the autonomy of brain injury patients. The incorporation of new technologies into neurorehabilitation aims at a new paradigm focused on designing intensive, personalized, monitored and evidence-based treatments, since these four characteristics ensure the effectiveness of treatments. In contrast to most medical disciplines, there are no established associations between symptoms and cognitive disorder syndromes to guide the therapist. Currently, neurorehabilitation treatments are planned considering the results obtained from a neuropsychological assessment battery, which evaluates the functional impairment of each cognitive function (memory, attention, executive functions, etc.).
The research line to which this PhD belongs aims to design and develop a cognitive profile based not only on the results obtained in the assessment battery, but also on theoretical information that includes both anatomical structures and functional relationships, and on anatomical information obtained from medical imaging studies such as magnetic resonance. Therefore, the cognitive profile used to design these treatments integrates personalized and evidence-based information. Neuroimaging techniques are an essential tool for identifying lesions and generating this type of cognitive dysfunction profile. Manual delineation is the classical approach to identifying brain anatomical regions. Manual approaches present several problems related to inconsistencies across clinicians, time and repeatability. Automated delineation is done by registering brains to one another or to a template. However, when imaging studies contain lesions, intensity abnormalities and location alterations reduce the performance of most intensity-based registration algorithms. Thus, specialists may have to interact manually with imaging studies to select landmarks (called singular points in this PhD) or identify regions of interest. These two solutions have the same drawbacks as the manual approaches mentioned above. Moreover, these registration algorithms do not allow large, distributed deformations, which may also appear when a stroke or a traumatic brain injury (TBI) occurs. This PhD focuses on the design, development and implementation of a new methodology to automatically identify lesions in anatomical structures. The methodology integrates algorithms whose main objective is to generate objective and reproducible results. It is divided into four stages: pre-processing, singular point identification, registration and lesion detection. Pre-processing stage.
In this first stage, the aim is to standardize all input data so that valid conclusions can be drawn from the results. This stage therefore has a direct impact on the final results. It consists of three steps: skull stripping, intensity normalization and spatial normalization. Singular point identification. This stage aims to automate the identification of anatomical points (singular points). It is equivalent to the manual identification of anatomical points by the clinician. Automatic identification makes it possible to identify a greater number of points, which results in more information; to remove the factor associated with inter-subject variability, so that the results are reproducible and objective; and to eliminate the time spent on manual marking. This PhD proposes an algorithm to automatically identify singular points (descriptor) based on a multi-detector approach. The algorithm contains multi-parametric (spatial and intensity) information and has been compared with similar algorithms from the state of the art. Registration. The goal of this stage is to bring two imaging studies of different subjects/patients into spatial correspondence. The algorithm proposed in this PhD is based on descriptors. Its main objective is to compute a vector field that introduces distributed deformations (changes in different image regions), as large as the deformation vector indicates. The proposed algorithm has been compared with other registration algorithms used in neuroimaging applications with control subjects. The results obtained are promising and represent a new context for the automatic identification of anatomical structures. Lesion identification. This final stage aims to identify those anatomical structures whose characteristics associated with spatial location and area or volume have been modified with respect to a normal state.
A statistical study of the atlas to be used is performed to establish the statistical parameters associated with the normal state. The anatomical structures that can be identified depend on the structures delineated in the atlas. The proposed methodology is independent of the selected atlas. Overall, this PhD corroborates the investigated research hypotheses regarding the automatic identification of lesions based on structural medical imaging studies (magnetic resonance studies). Based on these foundations, new research fields to improve the automatic identification of lesions in brain injury can be proposed.
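The lesion identification stage compares each structure against statistical parameters of normality. The abstract does not say which statistics or thresholds are used, so the sketch below assumes the simplest case: a per-structure mean and standard deviation of volume, and a z-score threshold. The function name, structure names, threshold and all numbers are hypothetical and purely illustrative.

```python
def flag_abnormal_structures(measured, normative, z_threshold=2.0):
    """Return (structure, z-score) pairs whose volume deviates from the norm.

    measured maps a structure name to its volume in the registered study.
    normative maps a structure name to (mean, standard deviation) of that volume.
    """
    flagged = []
    for name, volume in measured.items():
        mean, sd = normative[name]
        z = (volume - mean) / sd
        if abs(z) > z_threshold:
            flagged.append((name, round(z, 2)))
    return flagged

# Made-up normative values (cm^3) and one made-up patient.
normative = {"hippocampus_left": (3.5, 0.4), "thalamus_left": (7.0, 0.6)}
patient = {"hippocampus_left": 2.3, "thalamus_left": 6.8}
flagged = flag_abnormal_structures(patient, normative)
```

The same pattern would apply to spatial location by replacing volume with, for example, the distance of a structure's centroid from its expected position, again under whatever normality statistics the atlas study provides.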
Abstract:
Texas Department of Transportation, Austin
Abstract:
Derivational morphology proposes meaningful connections between words and is largely unrepresented in lexical databases. This thesis presents a project to enrich a lexical database with morphological links and to evaluate their contribution to disambiguation. A lexical database with sense distinctions was required. WordNet was chosen because of its free availability and widespread use. Its suitability was assessed through critical evaluation with respect to specifications and criticisms, using a transparent, extensible model. The identification of serious shortcomings suggested a portable enrichment methodology, applicable to alternative resources. Although 40% of the most frequent words are prepositions, they have been largely ignored by computational linguists, so the addition of prepositions was also required. The preferred approach to morphological enrichment was to infer relations from phenomena discovered algorithmically. Both existing databases and existing algorithms can capture regular morphological relations, but cannot capture exceptions correctly; neither provides any semantic information. Some morphological analysis algorithms are subject to the fallacy that morphological analysis can be performed simply by segmentation. Morphological rules, grounded in observation and etymology, govern associations between and attachment of suffixes and contribute to defining the meaning of morphological relationships. Specifying character substitutions circumvents the segmentation fallacy. Morphological rules are prone to undergeneration, minimised through a variable lexical validity requirement, and overgeneration, minimised by rule reformulation and restricting monosyllabic output. Rules take into account the morphology of ancestor languages through co-occurrences of morphological patterns. Multiple rules applicable to an input suffix need their precedence established.
The resistance of prefixations to segmentation has been addressed by identifying linking vowel exceptions and irregular prefixes. The automatic affix discovery algorithm applies heuristics to identify meaningful affixes and is combined with morphological rules into a hybrid model, fed only with empirical data, collected without supervision. Further algorithms apply the rules optimally to automatically pre-identified suffixes and break words into their component morphemes. To handle exceptions, stoplists were created in response to initial errors and fed back into the model through iterative development, leading to 100% precision, contestable only on lexicographic criteria. Stoplist length is minimised by special treatment of monosyllables and reformulation of rules. 96% of words and phrases are analysed. 218,802 directed derivational links have been encoded in the lexicon rather than the wordnet component of the model because the lexicon provides the optimal clustering of word senses. Both links and analyser are portable to an alternative lexicon. The evaluation uses the extended gloss overlaps disambiguation algorithm. The enriched model outperformed WordNet in terms of recall without loss of precision. Failure of all experiments to outperform disambiguation by frequency reflects on WordNet sense distinctions.
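The combination of character-substitution rules, a lexical validity requirement, rule precedence and a stoplist can be illustrated with a toy analyser. Everything here is hypothetical: the three rules, the relation labels, the tiny lexicon and the longest-suffix-first precedence are assumptions chosen for the example, not the rules or ordering used in the thesis.

```python
# Each rule replaces one suffix with another and names the relation it implies.
RULES = [
    ("ness", "", "quality-of-adjective"),
    ("iness", "y", "quality-of-adjective"),
    ("ation", "ate", "act-of-verb"),
]

def derive(word, rules, lexicon, stoplist=frozenset()):
    """Return (base_word, relation) for the first valid rule, or None."""
    if word in stoplist:
        return None  # known exception, blocked explicitly
    # Assumed precedence: try the longest matching suffix first.
    for old, new, relation in sorted(rules, key=lambda r: len(r[0]), reverse=True):
        if word.endswith(old) and len(word) > len(old):
            candidate = word[: len(word) - len(old)] + new
            if candidate in lexicon:  # lexical validity requirement
                return candidate, relation
    return None

lexicon = {"happy", "busy", "create", "kind"}
stoplist = frozenset({"business"})  # "business" is not "the quality of being busy"

link_a = derive("happiness", RULES, lexicon, stoplist)
link_b = derive("business", RULES, lexicon, stoplist)
link_c = derive("creation", RULES, lexicon, stoplist)
```

Plain segmentation would split "happiness" into "happi" + "ness" and find no valid base; substituting "iness" with "y" reaches "happy" directly. The stoplist entry shows how an exception that passes the lexical check can still be excluded.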
Abstract:
Automatically generating maps of a measured variable of interest can be problematic. In this work we focus on the monitoring network context where observations are collected and reported by a network of sensors, and are then transformed into interpolated maps for use in decision making. Using traditional geostatistical methods, estimating the covariance structure of data collected in an emergency situation can be difficult. Variogram determination, whether by method-of-moment estimators or by maximum likelihood, is very sensitive to extreme values. Even when a monitoring network is in a routine mode of operation, sensors can sporadically malfunction and report extreme values. If these extreme data destabilise the model, causing the covariance structure of the observed data to be incorrectly estimated, the generated maps will be of little value, and the uncertainty estimates in particular will be misleading. Marchant and Lark [2007] propose a REML estimator for the covariance, which is shown to work on small data sets with a manual selection of the damping parameter in the robust likelihood. We show how this can be extended to allow treatment of large data sets together with an automated approach to all parameter estimation. The projected process kriging framework of Ingram et al. [2007] is extended to allow the use of robust likelihood functions, including the two-component Gaussian and the Huber function. We show how our algorithm is further refined to reduce the computational complexity while at the same time minimising any loss of information. To show the benefits of this method, we use data collected from radiation monitoring networks across Europe. We compare our results to those obtained from traditional kriging methodologies and include comparisons with Box-Cox transformations of the data.
We discuss the issue of whether to treat or ignore extreme values, making the distinction between the robust methods which ignore outliers and transformation methods which treat them as part of the (transformed) process. Using a case study based on an extreme radiological event over a large area, we show how radiation data collected from monitoring networks can be analysed automatically and then used to generate reliable maps to inform decision making. We show the limitations of the methods and discuss potential extensions to remedy these.
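The effect of a Huber-type function on extreme values can be seen in a much simpler setting than kriging. The sketch below defines the standard Huber function and uses it in an iteratively reweighted estimate of a single location parameter. It is not the projected process kriging algorithm of the abstract: the function names are hypothetical, the threshold `c` is assumed to be given in data units, and the sensor readings are made up.

```python
import statistics

def huber_rho(r, c):
    """Huber function: quadratic for small residuals, linear beyond threshold c."""
    a = abs(r)
    return 0.5 * r * r if a <= c else c * (a - 0.5 * c)

def huber_weight(r, c):
    """Weight implied by the Huber function: 1 inside the threshold, c/|r| outside."""
    a = abs(r)
    return 1.0 if a <= c else c / a

def huber_location(values, c, iterations=50):
    """Robust location estimate by iteratively reweighted averaging."""
    mu = statistics.median(values)  # robust starting point
    for _ in range(iterations):
        weights = [huber_weight(v - mu, c) for v in values]
        mu = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return mu

# Made-up readings: three plausible values and one malfunctioning sensor.
readings = [1.0, 2.0, 3.0, 100.0]
plain_mean = statistics.mean(readings)
robust_mean = huber_location(readings, c=1.5)
```

The extreme reading pulls the ordinary mean far from the bulk of the data, whereas its influence on the Huber estimate is capped at the threshold, which is the behaviour a robust likelihood is meant to bring to covariance estimation.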