22 results for extraction and separation techniques
Abstract:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of two other major areas, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly, its best-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools still have some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitation stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
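The idea of combining annotations produced for a common level can be illustrated with a minimal sketch. It assumes three hypothetical POS taggers (`tagger_a`, `tagger_b`, `tagger_c`), each with its own ad hoc tagset, and hand-written mappings to a shared tagset; a simple majority vote then stands in for the combination step. None of these names or mappings come from the source, and the vote is only one possible combination strategy, not the one the model necessarily uses.

```python
from collections import Counter

# Hypothetical mappings from each tool's ad hoc tagset to a shared tagset.
TAG_MAPS = {
    "tagger_a": {"NN": "NOUN", "VB": "VERB", "JJ": "ADJ"},
    "tagger_b": {"N": "NOUN", "V": "VERB", "A": "ADJ"},
    "tagger_c": {"noun": "NOUN", "verb": "VERB", "adj": "ADJ"},
}


def to_common(tool, tag):
    """Translate a tool-specific tag into the shared tagset."""
    return TAG_MAPS[tool].get(tag, "UNKNOWN")


def combine(annotations):
    """Combine per-token tags from several tools by strict majority vote.

    `annotations` maps a tool name to its list of tags (one per token).
    Tokens without a strict majority are marked "AMBIGUOUS".
    """
    tools = sorted(annotations)
    length = len(annotations[tools[0]])
    if any(len(annotations[tool]) != length for tool in tools):
        raise ValueError("all tools must annotate the same tokens")

    combined = []
    for i in range(length):
        votes = Counter(to_common(tool, annotations[tool][i]) for tool in tools)
        best, count = votes.most_common(1)[0]
        combined.append(best if count * 2 > len(tools) else "AMBIGUOUS")
    return combined
```

In this sketch, a single tagger's error on a token is outvoted by the other two, while a token on which the tools split evenly is flagged rather than silently resolved.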
In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by a higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. There again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should also help solve the aforementioned problems and limitations of linguistic annotation tools.
Thus, to summarise, the main aim of the present work was to combine these separate approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based
Abstract:
The objective of this thesis is to model some natural processes, such as evolution and co-evolution, and to propose techniques that ensure that these learning processes actually take place and are useful for solving complex problems such as the game of Go. Go is an ancient and very complex game with simple rules that remains a challenge for Artificial Intelligence. This dissertation covers some approaches that have been applied to this problem and proposes solving it with competitive and cooperative co-evolutionary learning methods and other techniques proposed by the author. To study, implement and test these methods, several neural network structures and a freely available framework were used, and many programs were coded. The proposed techniques were coded by the author, who performed many experiments to find the best configuration for ensuring that co-evolution progresses, and discussed the results. Co-evolutionary learning processes can exhibit some pathologies that may hinder co-evolutionary progress. This dissertation introduces techniques to address pathologies such as loss of gradients, cycling dynamics and forgetting. According to some authors, one remedy for these co-evolution pathologies is to introduce more diversity into the evolving populations. This thesis proposes techniques for introducing more diversity, as well as diversity measurements for neural network structures to monitor diversity during co-evolution. The evolved genotype diversity was analyzed in terms of its impact on the global fitness of the evolved strategies and on their generalization. Additionally, a memory mechanism was introduced into the neural network structures to reinforce some strategies in the genes of the evolved neurons, so that good strategies that have been learned are not forgotten. This dissertation also presents work by other authors in which cooperative and competitive co-evolution has been applied.
The Go board size used in this thesis was 9x9, but the approach can easily be scaled to larger boards. The author believes that the programs coded and the techniques introduced in this dissertation can be used in other domains.
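As an illustration of what a genotype diversity measurement for a population of neural networks might look like, the sketch below computes the mean pairwise Euclidean distance between flattened weight vectors. This particular measure is an assumption chosen for simplicity; the abstract does not say which diversity measurements the thesis actually uses.

```python
import math
from itertools import combinations


def genotype_diversity(population):
    """Mean pairwise Euclidean distance between genotypes.

    `population` is a list of equal-length weight vectors, one per
    individual (a hypothetical flat encoding of each network).
    Returns 0.0 for populations with fewer than two individuals.
    """
    if len(population) < 2:
        return 0.0
    pairs = list(combinations(population, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)
```

Tracking such a value over generations would show whether the population is collapsing towards a single genotype, which is one situation in which loss of gradients or forgetting might be expected.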
Abstract:
Mealiness (woolliness in peaches) is a negative attribute of sensory texture that combines the sensation of a disaggregated tissue with the sensation of lack of juiciness. In this study, 24 apples cv. Top Red and 8 peaches cv. Maycrest, submitted to 3 and 2 different storage conditions respectively, were tested by mechanical and MRI techniques to assess mealiness. This study validates the results obtained on apples in a previous work using mathematical features from the histograms of the T2 maps: the histograms of mealy apples are more skewed and show a tail, similar to internal breakdown. In peaches, MRI techniques can also be used to identify woolly fruits. Not all the changes found in the histograms of woolly peaches are similar to those observed in mealy apples, pointing to a different underlying physiological change in the two disorders.
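One histogram feature of the kind mentioned above is skewness. The sketch below computes the standard moment-based sample skewness of a list of T2 values; the function name and the sample values are hypothetical, and the abstract does not state which exact features or estimators the authors used.

```python
def skewness(values):
    """Moment-based skewness: third central moment over variance**1.5.

    Positive values indicate a tail towards high values, negative
    values a tail towards low values. Returns 0.0 for constant data.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0
    return m3 / m2 ** 1.5
```

A symmetric set of T2 values gives a skewness of zero, whereas a few pixels with much longer T2 produce a positive value, which is the kind of tail the abstract associates with mealy apples.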
Abstract:
In this paper, we propose a system for authenticating local bee pollen against fraudulent samples using image processing and classification techniques. Our system is based on the colour properties of bee pollen loads and the use of one-class classifiers to reject unknown pollen samples. The latter classification techniques allow us to tackle the major difficulty of the problem, the existence of many possible fraudulent pollen types. Also presented is a multi-classifier model with an ambiguity discovery process to fuse the output of the one-class classifiers. The method is validated by authenticating Spanish bee pollen types, with the final system reaching an overall accuracy of 94%. Therefore, the system is able to rapidly reject the non-local pollen samples with inexpensive hardware and without the need to send the product to the laboratory.
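The general scheme of one-class classifiers with a fusion step can be sketched as follows. This is a deliberately simple stand-in, assuming each pollen type is described by a mean colour and an acceptance radius, and that ambiguities are resolved by the nearest class centre. The class names, colours and radii are hypothetical, and the paper's actual classifiers and ambiguity discovery process are not specified in the abstract.

```python
import math


class OneClassColour:
    """Accepts a sample if its colour lies within `radius` of the class mean."""

    def __init__(self, label, training_colours, radius):
        self.label = label
        n = len(training_colours)
        # Mean colour of the known (local) samples of this pollen type.
        self.centre = tuple(sum(c[i] for c in training_colours) / n for i in range(3))
        self.radius = radius

    def distance(self, colour):
        return math.dist(self.centre, colour)

    def accepts(self, colour):
        return self.distance(colour) <= self.radius


def classify(colour, classifiers):
    """Fuse one-class outputs: reject, accept, or resolve an ambiguity."""
    accepted = [c for c in classifiers if c.accepts(colour)]
    if not accepted:
        return "rejected"  # no known local type claims the sample
    # If several classifiers accept the sample, pick the nearest centre.
    return min(accepted, key=lambda c: c.distance(colour)).label
```

Because each classifier is trained only on samples of its own type, a fraudulent pollen type never has to be seen in advance: it is rejected simply because no local class accepts it.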
Abstract:
Each day, the design and development of vehicle suspension systems relies more on computer-aided design and computer-aided engineering tools, which make it possible to anticipate problems and solve them ahead of time. Dynamic behavior and characteristics are thus simulated accurately and inexpensively, with moderate computational times and resources. There is, however, an iterative component in the process, which involves the manual definition of designs in a trial-and-error manner. This Thesis takes a step towards the development of an efficient simulation framework capable of simulating, analyzing and evaluating vehicle suspension designs, and automatically improving them by varying the design parameters towards the optimal solution. The multibody systems approach is used here to model a three-dimensional 18-degrees-of-freedom coach in a comprehensive yet efficient way. The suspension geometry and characteristics match those of the real vehicle, as do the rest of the vehicle parameters. In order to simulate vehicle dynamics, an efficient, state-of-the-art multibody formulation based on Maggi’s equations is employed, and a three-dimensional graphics viewer is developed. As a result, vehicle maneuvers can be simulated faster than real time. Once the dynamics are available, a sensitivity analysis is crucial for a robust and efficient optimization. To that end, a mathematical technique is introduced that allows the dynamic variables to be differentiated within the multibody formulation in a way that is general, algorithmic, accurate to machine precision and reasonably efficient: automatic differentiation. This method propagates the derivatives with respect to the design parameters throughout the computer code, with little user interaction. In contrast with other attempts in the literature, most of which are not general-purpose, a benchmarking of libraries is carried out, a hybrid direct-automatic differentiation approach for the computation of sensitivities is developed, and several real-life examples are analyzed. Finally, a design optimization process of the aforementioned vehicle is carried out. Four different types of dynamic response optimization are presented: parameter identification, handling optimization, ride comfort optimization and multi-objective optimization, all of which are applied to the design of the coach. Together with analytical and visual proof of the results, efficiency considerations are made. In summary, the dynamic behavior of vehicles is improved by using the multibody systems approach, along with advanced differentiation and optimization techniques, enabling an automatic, accurate and efficient tuning of design parameters.
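How automatic differentiation propagates derivatives with respect to a design parameter through code can be shown with a minimal forward-mode sketch based on dual numbers. The `Dual` class, the `dsqrt` helper and the single-degree-of-freedom `natural_frequency` example are illustrative assumptions; they are not the libraries benchmarked in the thesis nor its hybrid direct-automatic formulation.

```python
import math


class Dual:
    """A value together with its derivative with respect to one parameter."""

    def __init__(self, value, deriv=0.0):
        self.value = float(value)
        self.deriv = float(deriv)

    @staticmethod
    def lift(x):
        # Plain numbers are constants: their derivative is zero.
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

    def __truediv__(self, other):
        other = Dual.lift(other)
        return Dual(self.value / other.value,
                    (self.deriv * other.value - self.value * other.deriv)
                    / other.value ** 2)

    def __rtruediv__(self, other):
        return Dual.lift(other) / self


def dsqrt(x):
    """Square root with the chain rule applied to the derivative part."""
    x = Dual.lift(x)
    root = math.sqrt(x.value)
    return Dual(root, x.deriv / (2.0 * root))


def natural_frequency(stiffness, mass):
    """Undamped natural frequency sqrt(k/m) of a mass-spring system."""
    return dsqrt(stiffness / mass)
```

Seeding the stiffness with a derivative of 1, as in `natural_frequency(Dual(40000.0, 1.0), 400.0)`, yields both the frequency and its sensitivity to the stiffness in a single evaluation, without finite-difference truncation error.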
Abstract:
Cover crop characterization may allow comparing the suitability of different species to provide ecological services such as erosion control, nutrient recycling or fodder production. Different techniques to characterize plant canopy were studied under field conditions in order to establish a methodology for measuring and comparing the canopies of the most common cover crops. A field trial was established in Madrid (central Spain) to determine the relationship between leaf area index (LAI) and ground cover (GC) in a grass, a legume and a crucifer crop. Twelve plots were sown with either barley (Hordeum vulgare L.), vetch (Vicia sativa L.), or rape (Brassica napus L.). On 10 sampling dates the LAI (both direct and LAI-2000 estimations), the fraction of intercepted photosynthetically active radiation (FIPAR) and GC were measured.
A two-year field experiment (October-April) was established in the same location to evaluate different species (Hordeum vulgare L., Secale cereale L., x Triticosecale Whim, Sinapis alba L., Vicia sativa L.) and cultivars (20) according to their suitability to be used as cover crops. GC was monitored through digital image analysis with 21 and 22 samplings, and biomass was measured 8 and 10 times, respectively for each season. A Gompertz model characterized ground cover until the decay observed after frosts, while biomass was fitted to Gompertz, logistic and linear-exponential equations. At the end of the experiment C, N, and fiber (neutral detergent, acid detergent and lignin) contents, and the N fixed by the legumes were determined. Multicriteria decision analysis (MCDA) was applied in order to rank the species and cultivars according to their suitability to perform as cover crops in four different modalities: cover crop, catch crop, green manure and fodder.
Intercropping legumes and non-legumes may affect the root growth and N uptake of both components in the mixture. The knowledge of how specific root systems affect the growth of the individual species is useful for understanding the interactions in intercrops as well as for planning cover cropping strategies. In a third trial rhizotron studies were combined with root extraction and species identification by microscopy and with studies of growth, N uptake and 15N uptake from deeper soil layers. The root interactions in root growth and N foraging were studied for two of the best ranked cultivars in the previous study: a barley (Hordeum vulgare L. cv. Hispanic) and a vetch (Vicia sativa L. cv. Aitana). N was added at 0 (N0), 50 (N1) and 150 (N2) kg N ha-1.
In the first study, linear and quadratic models fitted the relationship between the GC and LAI well for all of the crops, but they reached a plateau in the grass when the LAI > 4. Before reaching full cover, the slope of the linear relationship between both variables was within the range of 0.025 to 0.030. The LAI-2000 readings were linearly correlated with the LAI but tended to overestimate it. Corrections based on the clumping effect reduced the root mean square error of the LAI estimated from the LAI-2000 readings from 1.2 to less than 0.50 for the crucifer and the legume, but were not effective for barley. For this reason, only the GC and biomass were measured in the following studies.
In the second experiment, the grasses reached the highest ground cover (83-99%) and biomass (1226-1928 g/m2) at the end of the experiment. The grasses had the highest C/N ratio (27-39) and dietary fiber (53-60%) and the lowest residue quality (~68%). The mustard presented high GC, biomass and N uptake in the warmer year, similar to the grasses, but low fodder capability in both years. The vetch presented the lowest N uptake (2.4-0.7 g N/m2) due to N fixation (9.8-1.6 g N/m2) and low biomass accumulation. The thermal time until reaching 30% ground cover was a good indicator of early coverage species. Variable quantification allowed finding variability among the species and provided information for further decisions involving cover crop selection and management. Aggregation of these variables through utility functions allowed ranking species and cultivars for each usage. Grasses were the most suitable for the cover crop, catch crop and fodder uses, while the vetches were the best as green manures. The mustard attained high ranks as cover and catch crop in the first season, but its rank fell in the second due to low performance in cold winters. Hispanic was the most suitable barley cultivar as cover and catch crop, and Albacete as fodder. The triticale Titania attained the highest rank as cover crop, catch crop and fodder. Vetches Aitana and BGE014897 showed good aptitudes as green manures and catch crops. MCDA allowed comparison among species and cultivars and might provide relevant information for cover crop selection and management.
In the rhizotron study the intercrop and the barley attained slightly higher root intensity (RI) and root depth (RD) than the vetch, with values around 150 crosses m-1 and 1.4 m respectively, compared to 50 crosses m-1 and 0.9 m for the vetch. At deep soil layers, intercropping showed slightly larger RI values compared to the sole cropped barley. The barley and the intercropping had larger root length density (RLD) values (200-600 m m-3) than the vetch (25-130) at 0.8-1.2 m depth. The topsoil N supply did not show a clear effect on the RI, RD or RLD; however, increasing topsoil N favored the proliferation of vetch roots in the intercropping at deep soil layers, with the barley/vetch root ratio ranging from 25 at N0 to 5 at N2. The N uptake of the barley was enhanced in the intercropping at the expense of the vetch (from ~100 mg plant-1 to 200). The intercropped barley roots took up more labeled nitrogen (0.6 mg 15N plant-1) than the sole-cropped barley roots (0.3 mg 15N plant-1) from deep layers.
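The Gompertz description of ground cover and the "thermal time to 30% ground cover" indicator can be sketched as follows. The parametrization `gc_max * exp(-b * exp(-k * t))` is one common form of the Gompertz curve and is an assumption here, as are the parameter names and any values passed in; the abstract does not give the fitted equation or its coefficients.

```python
import math


def gompertz_cover(thermal_time, gc_max, b, k):
    """Ground cover (%) after `thermal_time` degree-days (assumed form)."""
    return gc_max * math.exp(-b * math.exp(-k * thermal_time))


def thermal_time_to_cover(target, gc_max, b, k):
    """Invert the Gompertz curve: thermal time at which cover equals `target`.

    Only defined for 0 < target < gc_max, since the curve approaches
    gc_max asymptotically and never reaches it.
    """
    if not 0 < target < gc_max:
        raise ValueError("target must lie strictly between 0 and gc_max")
    return -math.log(-math.log(target / gc_max) / b) / k
```

With fitted parameters for each species, `thermal_time_to_cover(30.0, gc_max, b, k)` would give a single number per species, and a smaller value would indicate faster ground coverage.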
Abstract:
Three-dimensional Direct Numerical Simulations combined with Particle Image Velocimetry experiments have been performed on a hemisphere-cylinder at Reynolds number 1000 and angle of attack 20°. At these flow conditions, a pair of vortices, so-called “horn” vortices, are found to be associated with flow separation. In order to understand the highly complex phenomena associated with this fully three-dimensional massively separated flow, different structural analysis techniques have been employed: Proper Orthogonal Decomposition (POD) and Dynamic Mode Decomposition (DMD), as well as critical-point theory. A single dominant frequency associated with the von Karman vortex shedding has been identified in both the experimental and the numerical results. POD and DMD modes associated with this frequency were recovered in the analysis. Flow separation was also found to be intrinsically linked to the observed modes. In addition, critical-point theory has been applied in order to highlight possible links between the topology patterns over the surface of the body and the computed modes. Critical points and separation lines on the body surface show in detail the presence of different flow patterns in the base flow: a three-dimensional separation bubble and two pairs of unsteady vortex systems, the horn vortices mentioned above and the so-called “leeward” vortices. The horn vortices emerge perpendicularly from the body surface at the separation region. The leeward vortices, in contrast, originate downstream of the separation bubble, as a result of the boundary layer separation. The frequencies associated with these vortical structures have been quantified.
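For readers unfamiliar with POD, the sketch below extracts only the leading POD mode from a small set of snapshots using the method of snapshots, with a power iteration in place of a full eigendecomposition. It is a toy illustration under stated assumptions (plain Python lists, a hypothetical `leading_pod_mode` function, a fixed iteration count) and is not the analysis pipeline used in the study; practical work would rely on a linear algebra library.

```python
import math


def leading_pod_mode(snapshots, iterations=200):
    """Return (mode, energy_fraction) for the most energetic POD mode.

    `snapshots` is a list of m snapshots, each a list of n values
    sampled at the same spatial points.
    """
    m, n = len(snapshots), len(snapshots[0])
    # POD is applied to fluctuations about the temporal mean.
    mean = [sum(s[i] for s in snapshots) / m for i in range(n)]
    fluct = [[s[i] - mean[i] for i in range(n)] for s in snapshots]
    # Snapshot correlation matrix: inner products between snapshots.
    corr = [[sum(a * b for a, b in zip(fluct[j], fluct[k])) for k in range(m)]
            for j in range(m)]
    total = sum(corr[j][j] for j in range(m))
    if total == 0:
        raise ValueError("snapshots contain no fluctuations")
    # Power iteration for the dominant eigenvector. The start vector is a
    # ramp because a constant vector lies in the null space of `corr`.
    vec = [1.0 + 0.1 * j for j in range(m)]
    for _ in range(iterations):
        nxt = [sum(corr[j][k] * vec[k] for k in range(m)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(vec[j] * sum(corr[j][k] * vec[k] for k in range(m))
                     for j in range(m))
    # The spatial mode is the eigenvector-weighted sum of the snapshots.
    mode = [sum(vec[k] * fluct[k][i] for k in range(m)) for i in range(n)]
    norm = math.sqrt(sum(v * v for v in mode))
    return [v / norm for v in mode], eigenvalue / total
```

The returned energy fraction indicates how much of the fluctuation energy the leading mode captures; the sign of the mode is arbitrary, as is usual for POD.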