989 results for Text similarity measures


Relevance:

90.00% 90.00%

Publisher:

Abstract:

The thesis addresses several aspects of computing quantum similarity and its application to the rationalization and prediction of drug activity. Two important advances stand out in the development of new methodologies that facilitate the computation of quantum similarity measures. First, describing molecules by means of approximate PASA (Promolecular Atomic Shell Approximation) density functions has made it possible to describe the electron density of the molecular systems analyzed with sufficient accuracy, substantially reducing the computation time of the similarity measures. Second, the development of molecular superposition techniques specific to quantum similarity measures has solved the problem of aligning the compared compounds in space. Refining these new procedures and the mathematical algorithms associated with molecular quantum similarity measures has been essential for progress in several disciplines of computational chemistry, especially those concerned with quantitative analyses relating molecular structures to their biological activities, known as QSAR (Quantitative Structure-Activity Relationships). In the area of structure-activity relationships specifically, two approaches based on molecular quantum similarity have been presented, which arise from two different representations of molecules. The first description considers the global electron density of the molecules; important factors here include, among others, the arrangement of the compared objects in space and their three-dimensional conformation. The result is a similarity matrix containing the similarity measures for all pairs of compounds in the set studied. The second description is based on partitioning the global density of the molecules into fragments.
Self-similarity measures are used to analyze the basic requirements of a given activity from the standpoint of quantum similarity. The process allows detection of the molecular regions responsible for a high biological response. This yields a pattern of active regions that is of evident interest for drug design. In short, it has been shown that computer simulation and manipulation of molecules in three dimensions can provide essential information for studying the interaction between drugs and their macromolecular receptors.
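
The pairwise similarity matrix described in this abstract can be illustrated with a toy model. The sketch below is not the thesis code: it assumes each molecule is represented as a short list of spherical 1s Gaussians (weight, exponent, center), that the molecules are already aligned, and that the similarity is the overlap-like integral of the product of the two densities. All numeric values and the names `gaussian_overlap`, `similarity` and `similarity_matrix` are hypothetical.

```python
import math

def gaussian_overlap(a, A, b, B):
    """Analytic integral over all space of exp(-a|r-A|^2) * exp(-b|r-B|^2)."""
    p = a + b
    dist2 = sum((x - y) ** 2 for x, y in zip(A, B))
    return (math.pi / p) ** 1.5 * math.exp(-a * b / p * dist2)

def similarity(mol_a, mol_b):
    """Overlap-like similarity: integral of rho_A(r) * rho_B(r) over space."""
    return sum(wa * wb * gaussian_overlap(a, A, b, B)
               for wa, a, A in mol_a
               for wb, b, B in mol_b)

def similarity_matrix(molecules):
    """Symmetric matrix holding the similarity of every pair of molecules."""
    n = len(molecules)
    return [[similarity(molecules[i], molecules[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical toy "molecules": lists of (weight, exponent, center) terms.
toy = [
    [(1.0, 1.0, (0.0, 0.0, 0.0)), (1.0, 1.0, (1.4, 0.0, 0.0))],
    [(1.0, 1.0, (0.0, 0.0, 0.0)), (8.0, 2.0, (1.8, 0.0, 0.0))],
    [(6.0, 1.5, (0.0, 0.0, 0.0))],
]
Z = similarity_matrix(toy)
```

The diagonal entries `Z[i][i]` play the role of the self-similarities mentioned in the abstract. A real calculation would also optimize the relative orientation of each pair before evaluating the integral.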

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Although framed within the theory of Molecular Quantum Similarity Measures (MQSM), this thesis branches into three clearly defined areas: (1) the creation of Molecular IsoDensity Contours (MIDCOs) from fitted electron densities; (2) the development of a molecular superposition method that offers an alternative to the maximum similarity rule; and (3) Quantitative Structure-Activity Relationships (QSAR). The objective in the MIDCO field is to apply fitted density functions, originally devised to reduce the cost of MQSM calculations, to obtain MIDCOs. To this end, a comparative graphical study is carried out between different density functions fitted to different basis sets and densities obtained from ab initio calculations. The visual analogy between the fitted and ab initio functions across the range of density representations obtained, together with the fully comparable values of the similarity measures obtained previously, supports the use of these fitted functions. Beyond the initial purpose, two studies complementary to the simple representation of densities were carried out: curvature analysis and the extension to macromolecules. The first checks not only the similarity of the MIDCOs but also the consistency of their behavior in terms of curvature, making it possible to observe inflection points in the density representation and to see graphically the regions where the density is concave or convex. This first study reveals that the fitted densities and those calculated at the ab initio level behave in a fully analogous way. In the second part of this work, the method was extended to larger molecules of up to about 2500 atoms. Finally, part of the MEDLA philosophy is applied.
Since the electron density decays rapidly with distance from the nuclei, its calculation can be skipped at large distances from them. It was therefore proposed to partition space and to compute the fitted functions of each atom only in a small region surrounding that atom. This reduces the computation time, and the process becomes linear in the number of atoms in the molecule treated. The part devoted to molecular superposition covers the creation of an algorithm, and its implementation as a program named Topo-Geometrical Superposition Algorithm (TGSA), intended to provide alignments that agree with chemical intuition. The result is a computer program, written in Fortran 90, that aligns molecules pairwise considering only atomic numbers and interatomic distances. The complete absence of theoretical parameters makes it possible to develop a general molecular superposition method that provides an intuitive superposition and, importantly, does so quickly and with little user intervention. TGSA has mostly been used to compute similarities for later use in QSAR; these generally do not match the values that would be obtained with the maximum similarity rule, especially when heavy atoms are involved. Finally, the last part, devoted to Quantum Similarity within the QSAR framework, addresses three different aspects. (1) Use of similarity matrices. This involves the so-called similarity matrix, computed from the pairwise similarities within a set of molecules. After suitable processing, this matrix is used as a source of molecular descriptors for QSAR studies. Within this area, several correlation studies of pharmacological and toxicological interest, as well as of various physical properties, have been carried out. (2) Application of the electron-electron interaction energy, treated as a form of self-similarity.
This modest contribution consists, briefly, of taking the value of this quantity and, by analogy with the notation of molecular quantum self-similarity, treating it as a particular case of that measure. This interaction energy is easily obtained from quantum-mechanical software and is ideal for a first preliminary correlation study in which it is used as the sole descriptor. (3) Calculation of self-similarities in which the density has been modified to enhance the role of a substituent. Previous work with fragment densities, despite giving very good results, lacks some conceptual rigor because it isolates a fragment, supposedly responsible for the molecular activity, from the molecular structure as a whole, even though the densities associated with this fragment already differ because they belong to skeletons with different substitutions. One way to fill the gap left by simple fragment separation, which considers the whole molecule (by computing its self-similarity) while avoiding undesired self-similarity values caused by heavy atoms, is to use Fermi hole densities, which are defined around the fragment of interest. This procedure modifies the density so that it is mostly concentrated in the region of interest, yet it still yields a density function that behaves mathematically like the regular electron density and can therefore be incorporated into the molecular similarity framework. Self-similarities computed with this methodology have led to good correlations for substituted aromatic acids, thereby offering an explanation for their behavior. From another point of view, conceptual contributions have also been made.
A new similarity measure, the kinetic energy measure, has been implemented. It takes the recently developed kinetic energy density function which, because it behaves mathematically like regular electron densities, has been incorporated into the similarity framework. Satisfactory QSAR models have been obtained from this measure for different molecular sets. Regarding the treatment of similarity matrices, the so-called stochastic transformation has been implemented as an alternative to the Carbó index. This transformation of the similarity matrix yields a new, non-symmetric matrix, which can subsequently be processed to build QSAR models.
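
The two matrix treatments named at the end of this abstract can be contrasted on a small example. The Carbó index is the usual cosine-like normalization of a similarity matrix by its diagonal. The abstract does not define the stochastic transformation, so the sketch below assumes it means dividing each row by its sum, which is one common way to obtain a non-symmetric (row-stochastic) matrix. The matrix values and function names are hypothetical.

```python
import math

def carbo_index(Z):
    """Carbo index: C_ij = Z_ij / sqrt(Z_ii * Z_jj); symmetric, unit diagonal."""
    n = len(Z)
    return [[Z[i][j] / math.sqrt(Z[i][i] * Z[j][j]) for j in range(n)]
            for i in range(n)]

def stochastic_transform(Z):
    """Assumed form of the stochastic transformation: each row divided by its sum."""
    return [[z / sum(row) for z in row] for row in Z]

# Hypothetical symmetric similarity matrix for three molecules.
Z = [[4.0, 2.0, 1.0],
     [2.0, 9.0, 3.0],
     [1.0, 3.0, 16.0]]

C = carbo_index(Z)           # stays symmetric
S = stochastic_transform(Z)  # rows sum to 1, symmetry is lost
```

Here `S[0][1]` is 2/7 while `S[1][0]` is 2/14, which shows why the transformed matrix is no longer symmetric.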

Relevance:

90.00% 90.00%

Publisher:

Abstract:

This doctoral thesis, entitled Computational Development of Molecular Quantum Similarity, deals fundamentally with the computational aspects of similarity measures based on the comparison of electron density functions. The first chapter, Quantum Similarity, is introductory. It describes electron probability density functions and their significance within quantum mechanics. Their essential aspects and the mathematical conditions they must satisfy are made explicit, with a view to a better understanding of the electron density models proposed. Electron densities are presented, with mention of the Hohenberg-Kohn theorems and an outline of Bader's theory, as fundamental quantities for describing molecules and understanding their properties. The chapter Models of Molecular Electron Densities presents original computational procedures for fitting density functions to models expanded in terms of 1s Gaussians centered on the nuclei. The physical and mathematical constraints associated with probability distributions are introduced rigorously in the procedure called the Atomic Shell Approximation (ASA). This procedure, implemented in the ASAC program, starts from a nearly complete function space, from which the functions or shells of the expansion are selected variationally in accordance with the non-negativity requirements. The quality of these densities and of the derived similarity measures is verified extensively. The ASA model is extended to dynamic representations, which are physically more accurate because they are affected by nuclear vibrations, in order to explore the effect of damping the nuclear peaks on molecular similarity measures. Comparing dynamic with static densities reveals a rearrangement in the dynamic densities, consistent with what would constitute a manifestation of the quantum Le Chatelier principle.
The ASA procedure, explicitly consistent with the N-representability conditions, is also applied to the direct determination of hydrogen-like electron densities in a density functional theory context. The chapter Global Maximization of the Similarity Function presents original algorithms for determining the maximum superposition of molecular electron densities. Molecular quantum similarity measures are identified with the maximum overlap, so that the distance between molecules is measured independently of the reference frames in which the electron densities are defined. Starting from the global solution in the limit of densities infinitely compacted onto the nuclei, three levels of approximation are proposed for the systematic, non-stochastic exploration of the similarity function, enabling efficient identification of the global maximum as well as of the various local maxima. An original parametrization of the overlap integrals through fits to Lorentzian functions is also proposed as a computational acceleration technique. In the practice of structure-activity relationships, these advances enable the efficient implementation of quantitative similarity measures and, in parallel, provide a fully automatic molecular alignment methodology. The chapter Similarities of Atoms in Molecules describes an algorithm for comparing Bader atoms, that is, three-dimensional regions bounded by zero-flux surfaces of the electron density function. The quantitative character of these similarities enables rigorous measurement of the chemical notion of transferability of atoms and functional groups. The zero-flux surfaces and the integration algorithms used have been published recently and constitute the most accurate approach to computing atomic properties.
Finally, the chapter Similarities in Crystal Structures proposes an original definition of similarity, specific to the comparison of the concepts of softness in the distribution of phonons associated with the crystal structure. These concepts appear in studies of superconductivity because of the influence of electron-phonon interactions on the transition temperatures to the superconducting state. Applying this methodology to the analysis of BEDT-TTF salts reveals structural correlations between superconducting and non-superconducting salts, in line with hypotheses put forward in the literature about the relevance of certain interactions. The thesis concludes with an appendix containing the ASAC program, an implementation of the ASA algorithm, and a final chapter of bibliographic references.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Regional flood frequency techniques are commonly used to estimate flood quantiles when flood data is unavailable or the record length at an individual gauging station is insufficient for reliable analyses. These methods compensate for limited or unavailable data by pooling data from nearby gauged sites. This requires the delineation of hydrologically homogeneous regions in which the flood regime is sufficiently similar to allow the spatial transfer of information. It is generally accepted that hydrologic similarity results from similar physiographic characteristics, and thus these characteristics can be used to delineate regions and classify ungauged sites. However, as currently practiced, the delineation is highly subjective and dependent on the similarity measures and classification techniques employed. A standardized procedure for delineation of hydrologically homogeneous regions is presented herein. Key aspects are a new statistical metric to identify physically discordant sites, and the identification of an appropriate set of physically based measures of extreme hydrological similarity. A combination of multivariate statistical techniques applied to multiple flood statistics and basin characteristics for gauging stations in the Southeastern U.S. revealed that basin slope, elevation, and soil drainage largely determine the extreme hydrological behavior of a watershed. Use of these characteristics as similarity measures in the standardized approach for region delineation yields regions which are more homogeneous and more efficient for quantile estimation at ungauged sites than those delineated using alternative physically-based procedures typically employed in practice. The proposed methods and key physical characteristics are also shown to be efficient for region delineation and quantile development in alternative areas composed of watersheds with statistically different physical composition. 
In addition, the use of aggregated values of key watershed characteristics was found to be sufficient for the regionalization of flood data; the added time and computational effort required to derive spatially distributed watershed variables does not increase the accuracy of quantile estimators for ungauged sites. This dissertation also presents a methodology by which flood quantile estimates in Haiti can be derived using relationships developed for data-rich regions of the U.S. As currently practiced, regional flood frequency techniques can only be applied within the predefined area used for model development. However, results presented herein demonstrate that the regional flood distribution can successfully be extrapolated to areas of similar physical composition located beyond the extent of that used for model development, provided that differences in precipitation are accounted for and the site in question can be appropriately classified within a delineated region.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

An integrated approach for multi-spectral segmentation of MR images is presented. The method is based on the fuzzy c-means (FCM) algorithm; it includes bias field correction and contextual constraints over the spatial intensity distribution, and it accounts for non-spherical cluster shapes in the feature space. The bias field is modeled as a linear combination of smooth polynomial basis functions for fast computation in the clustering iterations. Regularization terms for the neighborhood continuity of intensity are added to the FCM cost functions. To reduce the computational complexity, the contextual regularizations are separated from the clustering iterations. Since the feature space is not isotropic, the distance measure adopted in the Gustafson-Kessel (G-K) algorithm is used instead of the Euclidean distance to account for the non-spherical shape of the clusters in the feature space. These algorithms are quantitatively evaluated on MR brain images using similarity measures.
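
For readers unfamiliar with the baseline that this abstract extends, the following is a minimal sketch of plain fuzzy c-means on one-dimensional intensities. It deliberately omits the bias field, the contextual regularization and the G-K distance described above, so it is not the authors' method. The function name `fcm_1d`, the sample intensities and the initial centers are hypothetical.

```python
def fcm_1d(data, centers, m=2.0, iterations=50, eps=1e-12):
    """Plain fuzzy c-means on scalars with fuzzifier m and absolute distance."""
    centers = list(centers)
    c = len(centers)
    memberships = []
    for _ in range(iterations):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2 / (m - 1)).
        memberships = []
        for x in data:
            d = [abs(x - v) + eps for v in centers]
            memberships.append([
                1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
                for i in range(c)
            ])
        # Center update: mean of the data weighted by u_ik^m.
        for i in range(c):
            num = sum((u[i] ** m) * x for u, x in zip(memberships, data))
            den = sum(u[i] ** m for u in memberships)
            centers[i] = num / den
    return centers, memberships

# Hypothetical intensities drawn from two well-separated tissue classes.
intensities = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
centers, memberships = fcm_1d(intensities, centers=[0.0, 6.0])
```

Each row of `memberships` sums to one, and the two centers settle close to the two intensity groups. The extensions in the abstract change the cost function and the distance, not this alternating update structure.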

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Intensity non-uniformity (bias field) correction, contextual constraints over the spatial intensity distribution, and non-spherical cluster shapes in the feature space are incorporated into the fuzzy c-means (FCM) algorithm for segmentation of three-dimensional multi-spectral MR images. The bias field is modeled by a linear combination of smooth polynomial basis functions for fast computation in the clustering iterations. Regularization terms for the neighborhood continuity of either intensity or membership are added to the FCM cost functions. Since the feature space is not isotropic, distance measures other than the Euclidean distance are used to account for the shape and volumetric effects of clusters in the feature space. The performance of segmentation is improved by combining the adaptive FCM scheme with the criteria used in the Gustafson-Kessel (G-K) and Gath-Geva (G-G) algorithms through the inclusion of the cluster scatter measure. The performance of this integrated approach is quantitatively evaluated on normal MR brain images using similarity measures. The improvement in the quality of segmentation obtained with our method is also demonstrated by comparing our results with those produced by FSL (FMRIB Software Library), a software package that is commonly used for tissue classification.
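
To show why a non-Euclidean distance matters for elongated clusters, the sketch below evaluates the Gustafson-Kessel squared distance in two dimensions, taking the standard form with unit cluster volume: the square root of the determinant of the fuzzy covariance matrix times the Mahalanobis-style quadratic form. It is a toy illustration rather than the paper's implementation. The points, weights and function names are hypothetical, and the weights stand in for memberships raised to the fuzzifier power.

```python
def fuzzy_covariance_2d(points, weights, center):
    """Weighted 2x2 covariance: sum_k w_k (x_k - v)(x_k - v)^T / sum_k w_k."""
    total = sum(weights)
    sxx = sxy = syy = 0.0
    for (x, y), w in zip(points, weights):
        dx, dy = x - center[0], y - center[1]
        sxx += w * dx * dx
        sxy += w * dx * dy
        syy += w * dy * dy
    return [[sxx / total, sxy / total], [sxy / total, syy / total]]

def gk_distance2(point, center, cov):
    """Squared G-K distance in 2D: det(F)^(1/2) * d^T F^-1 d (unit volume)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = point[0] - center[0], point[1] - center[1]
    # Quadratic form d^T F^-1 d written out for a 2x2 matrix F.
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return det ** 0.5 * quad

# Hypothetical cluster elongated along the x axis, with equal weights.
points = [(-2.0, 0.0), (2.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
weights = [1.0, 1.0, 1.0, 1.0]
center = (0.0, 0.0)
cov = fuzzy_covariance_2d(points, weights, center)

along = gk_distance2((2.0, 0.0), center, cov)   # along the long axis
across = gk_distance2((0.0, 2.0), center, cov)  # across the long axis
```

Both test points are at Euclidean distance 2 from the center, yet the G-K distance treats the point along the cluster's long axis as closer (`along` is 2.0, `across` is 8.0).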

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Short text messages, a.k.a. Microposts (e.g. Tweets), have proven to be an effective channel for revealing information about trends and events, ranging from those related to Disaster (e.g. hurricane Sandy) to those related to Violence (e.g. Egyptian revolution). Being informed about such events as they occur could be extremely important to authorities and emergency professionals, allowing them to respond immediately. In this work we study the problem of topic classification (TC) of Microposts, which aims to automatically classify short messages based on the subject(s) discussed in them. Accurate TC of Microposts is, however, a challenging task, since the limited number of tokens in a post often implies a lack of sufficient contextual information. In order to provide contextual information to Microposts, we present and evaluate several graph structures surrounding concepts present in linked knowledge sources (KSs). Traditional TC techniques enrich the content of Microposts with features extracted only from the Microposts' content. In contrast, our approach relies on the generation of different weighted semantic meta-graphs extracted from linked KSs. We introduce a new semantic graph, called the category meta-graph. This novel meta-graph provides a more fine-grained categorisation of concepts, providing a set of novel semantic features. Our findings show that such category meta-graph features effectively improve the performance of a topic classifier of Microposts. Furthermore, our goal is also to understand which semantic features contribute to the performance of a topic classifier. For this reason we propose an approach for automatic estimation of the accuracy loss of a topic classifier on new, unseen Microposts. We introduce and evaluate novel topic similarity measures, which capture the similarity between the KS documents and Microposts at a conceptual level, considering the enriched representation of these documents.
Extensive evaluation in the context of Emergency Response (ER) and Violence Detection (VD) revealed that our approach outperforms previous approaches using a single KS without linked data, and those using Twitter data only, by up to 31.4% in terms of F1 measure. Our main findings indicate that the new category graph contains useful information for TC and achieves comparable results to previously used semantic graphs. Furthermore, our results also indicate that the accuracy of a topic classifier can be accurately predicted using the enhanced text representation, outperforming previous approaches based on content-based similarity measures. © 2014 Elsevier B.V. All rights reserved.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

We have proposed a similarity matching method (SMM) to obtain the change of the Brillouin frequency shift (BFS), in which the change of BFS is determined from the frequency difference between the detected spectrum and a selected reference spectrum by comparing their similarity. We have also compared three similarity measures in simulation, which showed that the correlation coefficient determines the change of BFS most accurately. Compared with other methods of determining the change of BFS, the SMM is more suitable for complex Brillouin spectrum profiles. More precise results and much faster processing have been verified in our simulations and experiments. The experimental results show that the measurement uncertainty of the BFS is improved to 0.72 MHz by using the SMM, which is almost one-third of that obtained with the curve fitting method, and that deriving the BFS change with the SMM is 120 times faster than with the curve fitting method.
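
The matching idea in this abstract can be sketched as a search over shifted reference spectra scored by the correlation coefficient. The example below is an illustration under stated assumptions, not the authors' procedure: it assumes a Lorentzian reference profile, a fixed linewidth, a 1 MHz frequency grid and a 0.5 MHz candidate step, and every numeric value and function name is hypothetical.

```python
import math

def lorentzian(f, f0, width=30.0):
    """Lorentzian profile centered at f0 (MHz) with full width `width` (MHz)."""
    half = width / 2.0
    return half ** 2 / ((f - f0) ** 2 + half ** 2)

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def match_bfs_change(measured, freqs, f_ref, candidate_shifts):
    """Return the candidate shift whose shifted reference correlates best."""
    best_shift, best_r = None, -2.0
    for shift in candidate_shifts:
        reference = [lorentzian(f, f_ref + shift) for f in freqs]
        r = pearson(measured, reference)
        if r > best_r:
            best_shift, best_r = shift, r
    return best_shift, best_r

# Hypothetical setup: a scaled, offset spectrum shifted by 12 MHz.
freqs = [10700.0 + i for i in range(301)]
f_ref = 10850.0
measured = [0.8 * lorentzian(f, f_ref + 12.0) + 0.05 for f in freqs]
shifts = [s * 0.5 for s in range(-60, 61)]
best_shift, best_r = match_bfs_change(measured, freqs, f_ref, shifts)
```

Because the correlation coefficient ignores overall gain and offset, the search recovers the shift without fitting amplitude or baseline parameters, which is one plausible reason such matching can be faster than curve fitting.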

Relevance:

90.00% 90.00%

Publisher:

Abstract:

This dissertation develops an image processing framework with unique feature extraction and similarity measurements for human face recognition in the thermal mid-wave infrared portion of the electromagnetic spectrum. The goal of this research is to design specialized algorithms that extract facial vasculature information, create a thermal facial signature and identify the individual. The objective is to use such findings in support of a biometrics system for human identification with a high degree of accuracy and reliability. This last assertion rests on the minimal to nonexistent risk of alteration of the intrinsic physiological characteristics seen through thermal infrared imaging. The proposed thermal facial signature recognition is fully integrated and consolidates the main and critical steps of feature extraction, registration, matching through similarity measures, and validation through testing our algorithm on a database, referred to as C-X1, provided by the Computer Vision Research Laboratory at the University of Notre Dame. Feature extraction was accomplished by first registering the infrared images to a reference image using the Functional MRI of the Brain (FMRIB) Linear Image Registration Tool (FLIRT), modified to suit thermal infrared images. This was followed by segmentation of the facial region using an advanced localized contouring algorithm applied to anisotropically diffused thermal images. Thermal feature extraction from facial images was attained by performing morphological operations such as opening and top-hat segmentation to yield thermal signatures for each subject. Four thermal images taken over a period of six months were used to generate thermal signatures and a thermal template for each subject; the thermal template contains only the most prevalent and consistent features.
Finally, a similarity measure technique was used to match signatures to templates, and Principal Component Analysis (PCA) was used to validate the results of the matching process. Thirteen subjects were used for testing the developed technique on an in-house thermal imaging system. Matching with a Euclidean-based similarity measure showed 88% accuracy for skeletonized signatures and templates and 90% accuracy for anisotropically diffused signatures and templates. We also employed the Manhattan-based similarity measure and obtained an accuracy of 90.39% for skeletonized and diffused templates and signatures. An average improvement of 18.9% in the similarity measure was obtained when using diffused templates. The Euclidean- and Manhattan-based similarity measures were also applied to skeletonized signatures and templates of 25 subjects in the C-X1 database. The highly accurate results obtained in the matching process, along with the generalized design process, clearly demonstrate that the thermal infrared system can be used on other thermal imaging based systems and related databases. A novel user-initialized registration of thermal facial images has been successfully implemented. Furthermore, the novel approach of developing a thermal signature template from four images taken at various times ensured that unforeseen changes in the vasculature did not affect the biometric matching process, as it relied on consistent thermal features.
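
The matching step can be pictured with a small nearest-template example. This is a sketch under assumptions, not the dissertation's pipeline: it assumes signatures and templates have already been registered and flattened into equal-length binary vectors, and that the best match is simply the template at the smallest distance. The vectors, subject names and function names are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def best_match(signature, templates, distance):
    """Name of the template with the smallest distance to the signature."""
    return min(templates, key=lambda name: distance(signature, templates[name]))

# Hypothetical flattened binary vasculature maps (1 = vessel pixel).
templates = {
    "subject_a": [1, 0, 1, 1, 0, 0],
    "subject_b": [0, 1, 0, 1, 1, 0],
}
signature = [1, 0, 1, 0, 0, 0]

match_l2 = best_match(signature, templates, euclidean)
match_l1 = best_match(signature, templates, manhattan)
```

On binary vectors the Manhattan distance counts differing pixels and the Euclidean distance is its square root, so both rank templates identically here. They can disagree on grey-level inputs such as anisotropically diffused images.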

Relevance:

90.00% 90.00%

Publisher:

Abstract:

The dependency of word similarity in vector space models on the frequency of words has been noted in a few studies, but has received very little attention. We study the influence of word frequency in a set of 10 000 randomly selected word pairs for a number of different combinations of feature weighting schemes and similarity measures. We find that the similarity of word pairs for all methods, except for the one using singular value decomposition to reduce the dimensionality of the feature space, is determined to a large extent by the frequency of the words. In a binary classification task of pairs of synonyms and unrelated words we find that for all similarity measures the results can be improved when we correct for the frequency bias.
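
As a reminder of what a similarity measure in a vector space model computes, here is a minimal cosine similarity over sparse co-occurrence vectors. The abstract does not say which weighting schemes or measures were compared, nor how the frequency correction is done, so the use of cosine on raw counts is an assumption made for illustration, and the words and counts are invented.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical co-occurrence counts (context word -> count).
vectors = {
    "car": {"drive": 4.0, "road": 3.0, "engine": 2.0},
    "automobile": {"drive": 2.0, "road": 1.0, "engine": 1.0},
    "banana": {"eat": 3.0, "yellow": 2.0},
}

sim_synonyms = cosine(vectors["car"], vectors["automobile"])
sim_unrelated = cosine(vectors["car"], vectors["banana"])
```

Frequent words tend to have denser context vectors than rare ones, which is one route by which word frequency can leak into scores like these.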

Relevance:

80.00% 80.00%

Publisher:

Abstract:

International Scientific Forum, ISF 2013, 12-14 December 2013, Tirana.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Feature selection is a central problem in machine learning and pattern recognition. On large datasets (in terms of dimension and/or number of instances), using search-based or wrapper techniques can be computationally prohibitive. Moreover, many filter methods based on relevance/redundancy assessment also take a prohibitively long time on high-dimensional datasets. In this paper, we propose efficient unsupervised and supervised feature selection/ranking filters for high-dimensional datasets. These methods use low-complexity relevance and redundancy criteria, applicable to supervised, semi-supervised, and unsupervised learning, and can act as pre-processors that let computationally intensive methods focus their attention on smaller subsets of promising features. The experimental results, with up to 10^5 features, show the time efficiency of our methods, with lower generalization error than state-of-the-art techniques, while being dramatically simpler and faster.
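
A relevance/redundancy filter of the general kind described here can be sketched in a few lines. The abstract does not name the criteria used, so the choices below are assumptions made for illustration: variance as an unsupervised relevance score and absolute cosine between feature columns as the redundancy measure. The data, threshold and function names are hypothetical.

```python
import math

def variance(column):
    """Population variance of one feature column."""
    mean = sum(column) / len(column)
    return sum((x - mean) ** 2 for x in column) / len(column)

def abs_cosine(a, b):
    """Absolute cosine between two feature columns (redundancy proxy)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return abs(dot) / (na * nb) if na and nb else 0.0

def select_features(columns, max_similarity=0.95, max_features=None):
    """Rank by relevance, then skip features too similar to any kept one."""
    order = sorted(range(len(columns)),
                   key=lambda j: variance(columns[j]), reverse=True)
    kept = []
    for j in order:
        if any(abs_cosine(columns[j], columns[k]) > max_similarity for k in kept):
            continue  # redundant with an already selected feature
        kept.append(j)
        if max_features is not None and len(kept) == max_features:
            break
    return kept

# Hypothetical dataset stored column-wise (one list per feature).
columns = [
    [1.0, 2.0, 3.0, 4.0],  # feature 0
    [2.0, 4.0, 6.0, 8.0],  # feature 1: scaled copy of feature 0
    [4.0, 1.0, 3.0, 0.0],  # feature 2
    [1.0, 1.0, 1.0, 1.1],  # feature 3: nearly constant
]
selected = select_features(columns)
top_two = select_features(columns, max_features=2)
```

Feature 0 is dropped because it is a scaled copy of the higher-variance feature 1. Each score needs only one pass over the data, which is the kind of low cost that makes a filter usable as a pre-processing step.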

Relevance:

80.00% 80.00%

Publisher:

Abstract:

To meet the increasing demands of complex inter-organizational processes and the demand for continuous innovation and internationalization, new forms of organisation are being adopted that foster more intensive collaboration processes and sharing of resources, in what can be called collaborative networks (Camarinha-Matos, 2006:03). Information and knowledge are crucial resources in collaborative networks, and their management is a fundamental process to optimize. Knowledge organisation and collaboration systems are thus important instruments for the success of collaborative networks of organisations, and they have been researched over the last decade in the areas of computer science, information science, management sciences, terminology and linguistics. Nevertheless, research in this area has paid little attention to multilingual contexts of collaboration, which pose specific and challenging problems. Access to and representation of knowledge will increasingly take place in multilingual settings, which implies overcoming the difficulties inherent in the presence of multiple languages through processes such as the localization of ontologies. Although localization, like other processes that involve multilingualism, is a rather well-developed practice whose methodologies and tools are fruitfully employed by the language industry in the development and adaptation of multilingual content, it has not yet been sufficiently explored as an element of support for the development of knowledge representations, in particular ontologies, expressed in more than one language. Multilingual knowledge representation is thus an open research area calling for cross-contributions from knowledge engineering, terminology, ontology engineering, cognitive sciences, computational linguistics, natural language processing, and management sciences.
This workshop brought together researchers interested in multilingual knowledge representation in a multidisciplinary environment to debate the possibilities of cross-fertilization between knowledge engineering, terminology, ontology engineering, cognitive sciences, computational linguistics, natural language processing, and management sciences, applied to contexts where multilingualism continuously creates new and demanding challenges for current knowledge representation methods and techniques. In this workshop six papers dealing with different approaches to multilingual knowledge representation are presented, most of them describing tools, approaches and results obtained in ongoing projects. In the first paper, Andrés Domínguez Burgos, Koen Kerremans and Rita Temmerman present a software module that is part of a workbench for terminological and ontological mining: Termontospider, a wiki crawler that aims to optimally traverse Wikipedia in search of domain-specific texts from which to extract terminological and ontological information. The crawler is part of a tool suite for automatically developing multilingual termontological databases, i.e. ontologically underpinned multilingual terminological databases. In this paper the authors describe the basic principles behind the crawler and summarize the research setting in which the tool is currently being tested. In the second paper, Fumiko Kano presents work comparing four feature-based similarity measures derived from cognitive sciences. The purpose of the comparative analysis is to identify the potentially most effective model for mapping independent ontologies in a culturally influenced domain. To that end, datasets based on standardized pre-defined feature dimensions and values, obtainable from the UNESCO Institute for Statistics (UIS), were used for the comparative analysis of the similarity measures.
The purpose of the comparison is to verify the similarity measures against these objectively developed datasets. According to the author, the results demonstrate that the Bayesian Model of Generalization provides the most effective cognitive model for identifying the most similar corresponding concepts for a targeted socio-cultural community. In another presentation, Thierry Declerck, Hans-Ulrich Krieger and Dagmar Gromann present ongoing work and propose an approach to the automatic extraction of information from multilingual financial Web resources, to provide candidate terms for building ontology elements or instances of ontology concepts. The authors present an approach that complements the direct localization/translation of ontology labels: terminologies are acquired by accessing and harvesting the multilingual Web presences of structured information providers in the field of finance. This leads to the detection of candidate terms in various multilingual sources in the financial domain, which can be used not only as labels of ontology classes and properties but also for the possible generation of (multilingual) domain ontologies themselves. In the next paper, Manuel Silva, António Lucas Soares and Rute Costa claim that, despite the availability of tools, resources and techniques aimed at the construction of ontological artifacts, developing a shared conceptualization of a given reality still raises questions about the principles and methods that support the initial phases of conceptualization. According to the authors, these questions become more complex when the conceptualization occurs in a multilingual setting.
To tackle these issues the authors present a collaborative platform, conceptME, in which terminological and knowledge representation processes support domain experts throughout a conceptualization framework. The platform allows the inclusion of multilingual data as a way to promote knowledge sharing, enhance conceptualization and support a multilingual ontology specification. In another presentation, Frieda Steurs and Hendrik J. Kockaert present TermWise, a large project dealing with legal terminology and phraseology for the Belgian public services, i.e. the translation office of the ministry of justice. The project aims to develop an advanced tool that includes expert knowledge in the algorithms that extract specialized language from textual data (legal documents); its outcome is a knowledge database of Dutch/French equivalents for legal concepts, enriched with the phraseology related to the terms under discussion. Finally, Deborah Grbac, Luca Losito, Andrea Sada and Paolo Sirito report on the preliminary results of a pilot project currently ongoing at UCSC Central Library, in which they propose to adapt the model used by translators working within European Union institutions to subject librarians employed in large, multilingual academic institutions. The authors are using User Experience (UX) Analysis to provide subject librarians with visual support by means of “ontology tables” that depict the conceptual links and connections of words with concepts, presented according to their semantic and linguistic meaning. The organizers hope that the selection of papers presented here will be of interest to a broad audience and will be a starting point for further discussion and cooperation.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Limited information is available regarding the methodology required to characterize hashish seizures in order to assess the presence or absence of a chemical link between two seizures. This casework report presents the methodology applied to assess whether two different police seizures came from the same block before it was split. The chemical signature was extracted using GC-MS analysis. The implemented methodology consists of a study of the intra- and inter-variability distributions, based on measuring the similarity of the chemical profiles across a number of hashish seizures by calculating the Pearson correlation coefficient. Different statistical scenarios (i.e., combinations of data pretreatment techniques and selections of target compounds) were tested to find the most discriminating one. Seven compounds showing high discrimination capabilities were selected, and a specific statistical data pretreatment was applied to them. Based on the results, the statistical model built for comparing the hashish seizures leads to low error rates. Therefore, the implemented methodology is suitable for the chemical profiling of hashish seizures.
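
As an illustration of the kind of comparison described in this abstract, the following is a minimal sketch (not the authors' implementation) of how Pearson correlations between chemical profiles could be split into intra-block and inter-block distributions. The block names, the seven-value profiles and the midpoint threshold are all invented for the example; the abstract does not give the actual compounds, the pretreatment or the decision rule.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_correlations(profiles_by_block):
    """Return (intra, inter) lists of pairwise correlations.

    profiles_by_block maps a hypothetical block label to a list of
    profiles, each profile being a list of target-compound responses.
    """
    labelled = [(block, p) for block, ps in profiles_by_block.items() for p in ps]
    intra, inter = [], []
    for (block_1, p1), (block_2, p2) in combinations(labelled, 2):
        # Same block: intra-variability; different blocks: inter-variability.
        (intra if block_1 == block_2 else inter).append(pearson(p1, p2))
    return intra, inter


# Invented data: two samples from each of two hypothetical blocks,
# seven compound responses per profile.
profiles = {
    "block_A": [[10, 20, 30, 25, 5, 8, 2], [11, 19, 31, 24, 6, 8, 2]],
    "block_B": [[30, 5, 10, 2, 25, 20, 8], [29, 6, 11, 2, 24, 21, 8]],
}

intra, inter = split_correlations(profiles)

# Assumed decision rule for this toy data only: the midpoint between the
# lowest intra-block and the highest inter-block correlation.
threshold = (min(intra) + max(inter)) / 2
```

With real casework data the two distributions usually overlap, so the threshold would be chosen from the observed error rates rather than from a simple midpoint.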

Relevance:

80.00% 80.00%

Publisher:

Abstract:

A method for selecting atomic orbitals, related to the theory of Quantum Molecular Similarity, is presented; it allows the active space to be reduced when carrying out a calculation at the Configuration Interaction level for the helium atom