17 results for wsd
Abstract:
In this paper, a new word sense disambiguation (WSD) approach focused on high precision is proposed, which not only attempts to identify the proper sense for a word but also provides a probabilistic evaluation of the identification confidence at the same time. A novel Instance Knowledge Network (IKN) is built to generate and maintain semantic knowledge at the word, type synonym set and instance levels. Related algorithms based on graph matching are developed to train the IKN with probabilistic knowledge and to use the IKN for probabilistic word sense disambiguation. Based on the Senseval-3 all-words task, we run extensive experiments to show the performance enhancements in different precision ranges and the rationality of probability-based automatic confidence evaluation of disambiguation. We combine our WSD algorithm with the five best WSD algorithms in the Senseval-3 all-words task. The results show that the combined algorithms all outperform the corresponding original algorithms.
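One way to picture how a confidence-aware WSD system might be combined with another algorithm is a simple threshold rule. The sketch below is only an illustration under stated assumptions: the abstract does not describe the combination scheme, and the function name `combine_with_confidence`, the callable signatures and the threshold value are all hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical interface: the high-precision system returns (sense, confidence)
# or None when it abstains; the fallback system always returns a sense label.
PreciseWSD = Callable[[str, Sequence[str]], Optional[Tuple[str, float]]]
FallbackWSD = Callable[[str, Sequence[str]], str]


def combine_with_confidence(
    word: str,
    context: Sequence[str],
    precise_wsd: PreciseWSD,
    fallback_wsd: FallbackWSD,
    threshold: float = 0.8,  # assumed cut-off, not taken from the paper
) -> str:
    """Use the confident answer when available, otherwise defer to the fallback."""
    result = precise_wsd(word, context)
    if result is not None:
        sense, confidence = result
        if confidence >= threshold:
            return sense
    return fallback_wsd(word, context)
```

Raising the threshold makes the combined system defer to the fallback algorithm more often; lowering it trusts the probabilistic system on more instances.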
Abstract:
This thesis examines protein behaviours that occur during cereal fermentations. The focus is on the prolamin degradation in sourdoughs. The thesis also looks at what happens to the oat globulins during an oat bran acidification process. The cereal prolamins are unique proteins in many respects. The wheat prolamins (glutenins and gliadins) are responsible for the formation of the gluten that provides the viscoelastic properties to wheat doughs whereas the rye prolamins (secalins) are unable to develop gluten-like structures. In addition, many baking technological features, such as flavour, shelf-life and dough properties are affected by the protein degradation that might occur during processing. On the other hand, the prolamins contain protein structures that are harmful to gluten sensitive people. It is thus evident that the degradation of the prolamins in sourdough processes may be approached from various aspects. This thesis describes some of these approaches. Four different cereal fermentations were carried out. Wheat sourdough (WSD) and rye sourdough (RSD) fermentations represented traditional sourdoughs. A germinated-wheat sourdough (GWSD) was a novel sourdough type that was prepared using germinated wheat grains that had high and diverse proteolytic activities. The oat bran fermentation (OBF) represented a fermentation system that lacked functional cereal proteases. The high molecular weight glutenins and rye secalins were degraded during the WSD and RSD fermentations, respectively. It was noteworthy that in WSD only a very limited degradation of the gliadins occurred. The gliadins were, however, hydrolysed very extensively during the GWSD fermentation. No protein degradation was observable in the OBF system. Instead the acidification altered the solubility of the oat globulins and this finally led to their aggregation. This thesis confirms that the endogenous proteases of cereals hydrolyse cereal prolamins in sourdoughs. 
The thesis also shows that the proteolytic activity of the used cereal raw material determines the extent of proteolysis that occurs in sourdough. This means that bakers may adjust the protein degradation in their sourdoughs by selecting the raw material based on its proteolytic activity. The thesis also demonstrates that by using germinated grains, with high and diverse proteolytic activity in sourdough preparations, the prolamins can be extensively degraded. Whether such highly proteolytic food technology could be used to manufacture new gluten-free cereal-based products for gluten sensitive people remains to be solved.
Abstract:
It is well known that language ambiguity has a detrimental effect on the results of Information Retrieval (IR) systems. However, research efforts to integrate Word Sense Disambiguation (WSD) techniques into IR have not borne fruit. Most studies on the subject obtain negative or unconvincing results. Moreover, investigations based on adding artificial ambiguity conclude that very high disambiguation precision would be needed to achieve a positive effect. This thesis aims to develop new, more effective and efficient approaches, focusing on the use of co-occurrence statistics to build context models. These models can then be used to perform sense discrimination between a query and the documents of a collection. In this two-part thesis, we first investigate the strength of the relationship between a word and the words present in its context, proposing a method for learning the weight of a context word as a function of its distance from the modelled word in the document. This method rests on the idea that context models built from random samples of words in context should be similar. Experiments in English and Japanese show that the strength of the relationship as a function of distance generally follows a negative power law. The weights resulting from the experiments are then used to build Naive Bayes WSD systems. Evaluations of these systems on Semeval workshop data, in English for the Semeval-2007 English Lexical Sample task and in Japanese for the Semeval-2010 Japanese WSD task, show that the systems achieve results comparable to the state of the art, even though they are much more lightweight and do not depend on linguistic tools or resources.
The second part of this thesis aims to adapt the methods developed to Information Retrieval applications. These applications have the additional difficulty of not being able to rely on manually created data. We therefore propose latent-variable context models based on Latent Dirichlet Allocation (LDA). These are combined with the query likelihood method based on language models. Evaluating the resulting system on three collections from the TREC (Text REtrieval Conference) conference, we observe an average relative improvement of 12% in MAP and 23% in GMAP. The gains come mostly from difficult queries, increasing the stability of the results. These experiments appear to be the first positive application of WSD techniques to standard IR tasks.
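The idea of weighting context words by a negative power law of their distance and feeding those weights to a Naive Bayes classifier can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the class name `WeightedNaiveBayesWSD`, the exponent `alpha`, the add-one smoothing and the way the weights enter the counts and the score are all assumptions.

```python
import math
from collections import defaultdict


def distance_weight(distance: int, alpha: float = 1.0) -> float:
    """Negative power law: closer context words weigh more (alpha is assumed)."""
    return distance ** -alpha


class WeightedNaiveBayesWSD:
    """Toy Naive Bayes sense classifier with distance-weighted context words."""

    def __init__(self, alpha: float = 1.0, smoothing: float = 1.0):
        self.alpha = alpha
        self.smoothing = smoothing
        self.sense_counts = defaultdict(float)
        self.word_counts = defaultdict(lambda: defaultdict(float))
        self.vocab = set()

    def _weighted_context(self, tokens, target_index):
        # Yield every token except the target, with its distance-based weight.
        for i, tok in enumerate(tokens):
            if i != target_index:
                yield tok, distance_weight(abs(i - target_index), self.alpha)

    def train(self, tokens, target_index, sense):
        self.sense_counts[sense] += 1.0
        for tok, weight in self._weighted_context(tokens, target_index):
            self.word_counts[sense][tok] += weight
            self.vocab.add(tok)

    def predict(self, tokens, target_index):
        total = sum(self.sense_counts.values())
        best_sense, best_score = None, -math.inf
        for sense, count in self.sense_counts.items():
            counts = self.word_counts[sense]
            denom = sum(counts.values()) + self.smoothing * len(self.vocab)
            score = math.log(count / total)  # log prior of the sense
            for tok, weight in self._weighted_context(tokens, target_index):
                prob = (counts.get(tok, 0.0) + self.smoothing) / denom
                score += weight * math.log(prob)  # weighted log-likelihood
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense
```

Because the model only needs tokenized text and sense labels, it stays independent of parsers or lexical resources, which matches the lightweight design the abstract emphasizes.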
Abstract:
Fenneropenaeus indicus could be protected from white spot disease (WSD) caused by white spot syndrome virus (WSSV) using a formalin-inactivated viral preparation (IVP) derived from WSSV-infected shrimp tissue. The lowest test quantity of lyophilized IVP, coated onto feed at 0.025 g⁻¹ (dry weight) and administered at a rate of 0.035 g feed g⁻¹ body weight d⁻¹ for 7 consecutive days, was sufficient to provide protection from WSD for a short period (10 d after cessation of IVP administration). Shrimp that survived challenges on the 5th and 10th days after cessation of IVP administration survived repeated challenges, although they were sometimes positive for the presence of WSSV by a polymerase chain reaction (PCR) assay specific for WSSV. These results suggest that F. indicus can be protected from WSD by simple oral administration of IVP.
Abstract:
Identifying the correct sense of a word in context is crucial for many tasks in natural language processing (machine translation is an example). State-of-the-art methods for Word Sense Disambiguation (WSD) build models using hand-crafted features that usually capture shallow linguistic information. Complex background knowledge, such as semantic relationships, is typically either not used or used in a specialised manner, due to the limitations of the feature-based modelling techniques used. On the other hand, empirical results from the use of Inductive Logic Programming (ILP) systems have repeatedly shown that they can use diverse sources of background knowledge when constructing models. In this paper, we investigate whether this ability of ILP systems could be used to improve the predictive accuracy of models for WSD. Specifically, we examine the use of a general-purpose ILP system as a method to construct a set of features using semantic, syntactic and lexical information. This feature set is then used by a common modelling technique in the field (a support vector machine) to construct a classifier for predicting the sense of a word. In our investigation we examine one-shot and incremental approaches to feature-set construction applied to monolingual and bilingual WSD tasks. The monolingual tasks use 32 verbs and 85 verbs and nouns (in English) from the SENSEVAL-3 and SemEval-2007 benchmarks, while the bilingual WSD task consists of 7 highly ambiguous verbs in translating from English to Portuguese. The results are encouraging: the ILP-assisted models show substantial improvements over those that simply use shallow features. In addition, incremental feature-set construction appears to identify smaller and better sets of features. Taken together, the results suggest that the use of ILP with diverse sources of background knowledge provides a way of making substantial progress in the field of WSD.
Abstract:
Fuel is a material used to produce heat or power by burning, and lubricity is the capacity for reducing friction. The aim of this work is to evaluate the lubricity of eight fossil and renewable fuels used in Diesel engines by means of an HFRR tester, following the ASTM D 6079-04 standard. In this setup, a sphere of AISI 52100 steel (diameter 6.00 ± 0.05 mm, Ra 0.05 ± 0.005 μm, E = 210 GPa, HRC 62 ± 4, HV0.2 631 ± 47) is subjected to a reciprocating motion under a normal load of 2 N at a frequency of 50 Hz, producing a wear track length of 1.1 ± 0.1 mm on a flat disc of AISI 52100 steel (HV0.05 184 ± 10, Ra 0.02 ± 0.005 μm). Each test lasted 75 minutes (225,000 cycles) and was repeated six times. The results were derived from the intrinsic signatures of the signals for lubricant film percentage, friction coefficient, contact heating and Sound Pressure Level, SPL [dB]. These signal signatures were obtained by two thermocouples and a portable sound level meter coupled to a data acquisition system and to the HFRR system. The wettability of a diesel fuel droplet in thermal equilibrium on the horizontal surface of a virgin flat disc of 52100 steel (Ra 0.02 ± 0.005 μm) was measured by its contact angle, 7.0 ± 3.5°, while the results obtained for the biodiesel B5, B20 and B100 blends produced by the ethylic transesterification of soybean oil were, respectively, 7.5 ± 3.5°, 13.5 ± 3.5° and 19.0 ± 1.0°; for distilled water, 78.0 ± 6.0°; and for the biodiesel B5, B20 and B100 blends produced by the ethylic transesterification of sunflower oil, respectively, 7.0 ± 4.0°, 8.5 ± 4.5° and 19.5 ± 2.5°. Lubricant films of different thicknesses were formed and measured as a percentage by means of the contact resistance technique, suggesting several regimes, from boundary to hydrodynamic lubrication. All oils analyzed in this study produced ball wear scars with diameters smaller than 400 μm.
The lowest values were observed for balls lubricated with the B100, B20 and B5 sunflower blends and the B20 and B5 soybean blends (wear scar diameter, WSD < 215 μm).
Abstract:
The automatic disambiguation of word senses (i.e., the identification of which of the meanings is used in a given context for a word that has multiple meanings) is essential for such applications as machine translation and information retrieval, and represents a key step for developing the so-called Semantic Web. Humans disambiguate words in a straightforward fashion, but this does not apply to computers. In this paper we address the problem of Word Sense Disambiguation (WSD) by treating texts as complex networks, and show that word senses can be distinguished upon characterizing the local structure around ambiguous words. Our goal was not to obtain the best possible disambiguation system, but we nevertheless found that in half of the cases our approach outperforms traditional shallow methods. We show that the hierarchical connectivity and clustering of words are usually the most relevant features for WSD. The results reported here shed light on the relationship between semantic and structural parameters of complex networks. They also indicate that when combined with traditional techniques the complex network approach may be useful to enhance the discrimination of senses in large texts. Copyright (C) EPLA, 2012
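To make the notion of "local structure around ambiguous words" concrete, the following sketch builds a simple word adjacency network and computes the clustering coefficient of a word. It is a simplified illustration under assumptions: the network here is undirected and unweighted, links only consecutive tokens, and the function names are hypothetical; the paper's actual network construction and hierarchical measures may differ.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency_network(tokens):
    """Link each pair of consecutive, distinct tokens (undirected, unweighted)."""
    graph = defaultdict(set)
    for left, right in zip(tokens, tokens[1:]):
        if left != right:
            graph[left].add(right)
            graph[right].add(left)
    return graph


def clustering_coefficient(graph, node):
    """Fraction of pairs of a node's neighbours that are themselves linked."""
    neighbours = graph.get(node, set())
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))
```

Computed separately for each occurrence context of an ambiguous word, measures like this one could serve as features for a standard classifier that separates the senses.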
Abstract:
Cognitive Radio is a device able to react to changes in the radio environment in which it operates, autonomously and dynamically modifying its own functional parameters, including frequency, transmission power and modulation. The basic principle of these devices is dynamic access to potentially unused radio resources, whereby unlicensed users can exploit frequencies that are not in use during a given time span, taking care not to interfere with the users who have privileges over that part of the spectrum. The so-called "spectrum holes" or "white spaces", portions of spectrum that are assigned but not used, must therefore be identified; the devices take their name from them (white space devices, WSDs). One way for a Cognitive Radio to identify spectrum holes is to try to detect the signal intended for primary users; this technique is known as Spectrum Sensing and essentially yields a measurement within the channel under consideration in order to determine whether a protected service is present. The sensing technique employed by a WSD operating autonomously is, however, not very effective, since it does not guarantee good protection for DTT receivers using the same channel on which the WSD intends to transmit. At the European level, the solution considered most reliable for avoiding interference with DTT receivers is the use of a geo-location database operating in collaboration with the cognitive device. The aim of this thesis is to present an algorithm that combines the two approaches, geo-location database and sensing, to define the power levels at which a WSD may transmit.
Abstract:
Word Sense Disambiguation is a computational problem belonging to the field of Natural Language Processing, which consists in determining the sense of a word according to the context in which it is used. While such a process may appear trivial for a human being, it can prove extraordinarily complicated when one tries to encode it as a series of instructions executable by a machine. The first and foremost problem to be tackled is that of knowledge: to disambiguate the terms of a text, a computer must be able to draw on a lexicon that is as consistent as possible with that of a human being. Although there are other ways to proceed, creating a machine-readable knowledge source is certainly the method that addresses the problem most directly. This thesis first seeks to explain what Word Sense Disambiguation consists of, through a description of the problem that is brief but as detailed as possible. Chapter 1 presents the problem starting from some historical background and then moves on to describe the fundamental components to be taken into account during the work. Concepts taken up later are illustrated, ranging from the normalization of the input text to a summary of the classification methods commonly used in this field. Chapter 2 is devoted to the description of BabelNet, a recently built multilingual lexical-semantic resource created at the Università La Sapienza di Roma. First, the two sources from which BabelNet draws its knowledge, WordNet and Wikipedia, are described. Then the steps of its creation are illustrated, from the mapping between the two base resources to the definition of all the relations linking the sets of terms within the lexicon.
Finally, a series of experiments is proposed that aims to put BabelNet to the test, first to verify the consistency of its construction method, and then to compare it, in terms of performance, with other state-of-the-art systems by subjecting it to various tasks drawn from SemEval, the international events dedicated to the evaluation of WSD problems, which in effect define the standards of this field. The final chapter develops some considerations on disambiguation, introduced by a list of the main application areas of the problem. Possible future developments of the research are outlined here, as well as the known problems and the paths recently taken to try to push the performance of Word Sense Disambiguation beyond the limits defined so far.
Abstract:
Geotechnical properties of sediment from Ocean Drilling Program Leg 164 are presented as: (1) normalized shipboard strength ratios from the Cape Fear Diapir, the Blake Ridge Diapir, and the Blake Ridge; and (2) Atterberg limit, vane shear strength, pocket-penetrometer strength, and constant-rate-of-strain consolidation results from Hole 995A, located on the Blake Ridge. This study was conducted to understand the stress history in a region characterized by high sedimentation rates and the presence of gas hydrates. Collectively, the results indicate that sediment from the Blake Ridge exhibits significant underconsolidated behavior, except near the seafloor. At least 10 m of additional overburden was removed by erosion or mass wasting at Hole 993A on the Cape Fear Diapir, compared to nearby sites.
Abstract:
MedFlux sampling was carried out at the French JGOFS DYFAMED (DYnamique des Flux Atmospheriques en MEDiterranee) site in the Ligurian Sea (northwestern Mediterranean), 52 km off Nice (43°20′N, 7°40′E) in 2300 m water depth. In 2003, a mooring with sediment trap arrays was deployed on 6 March (day of year, DOY 65) and recovered on 6 May (DOY 126); this trap deployment will be referred to as Period 1 (P1). The array was redeployed a week later on 14 May (DOY 134) and recovered again on 30 June (DOY 181); this trap deployment will be referred to as Period 2 (P2). Indented-rotating sphere (IRS) valve traps were fitted with TS carousels to determine temporal variability of particulate matter flux. TS traps were fitted with "dimpled" spheres. Vertical flux at 200 m depth is considered to be equivalent to new or export production, and traps sampled at 238 and 117 m during P1 and P2, respectively. We also collected TS material at 711 m during P1 and at 1918 m during P2. Upon recovery, samples were split using a McLane™ WSD splitter to allow multiple chemical analyses. Here we report 2003 data on TS particulate mass, and the contributions of organic carbon (OC), opal, lithogenic material and calcium carbonate to mass. In 2005, traps were deployed as described above for 55 d during a single period from 4 March (DOY 63) to 1 May (DOY 121). TS traps were fitted with "dimpled" spheres. TS particulate matter was collected from 313 to 924 m.
Abstract:
In this paper we explore the use of semantic classes in an existing information retrieval system in order to improve its results. We use two different ontologies of semantic classes (WordNet domains and Basic Level Concepts) to re-rank the retrieved documents and obtain better recall and precision. Finally, we implement a new method for weighting the expanded terms that takes into account the weights of the original query terms and their WordNet relations to the new terms (which have been shown to improve the results). The evaluation of these approaches was carried out in the CLEF Robust-WSD Task, obtaining an improvement of 1.8% in GMAP for the semantic classes approach and 10% in MAP for the WordNet term weighting approach.
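For readers unfamiliar with the two reported metrics, the sketch below shows how MAP and GMAP are commonly computed from per-query average precision. It is a generic illustration, not the evaluation code of the CLEF task: the function names are hypothetical, and flooring AP values at a small `epsilon` before taking logarithms is one common convention assumed here.

```python
import math


def average_precision(ranked_doc_ids, relevant_ids):
    """Mean of the precision values at the rank of each relevant document."""
    if not relevant_ids:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids)


def mean_average_precision(ap_values):
    """Arithmetic mean of per-query AP: dominated by well-performing queries."""
    return sum(ap_values) / len(ap_values)


def geometric_map(ap_values, epsilon=1e-5):
    """Geometric mean of per-query AP: rewards gains on the hardest queries."""
    logs = [math.log(max(ap, epsilon)) for ap in ap_values]
    return math.exp(sum(logs) / len(logs))
```

Because the geometric mean is pulled down sharply by near-zero queries, a method that mainly helps difficult queries tends to move GMAP more than MAP.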
Abstract:
In this work we present a semantic framework suitable for use as a support tool for recommender systems. Our purpose is to use the semantic information provided by a set of integrated resources to enrich texts by conducting different NLP tasks: WSD, domain classification, semantic similarity and sentiment analysis. After obtaining the textual semantic enrichment, we are able to recommend similar content or even to rate texts according to different dimensions. First, we describe the main characteristics of the integrated semantic resources with an exhaustive evaluation. Next, we demonstrate the usefulness of our resource in different NLP tasks and campaigns. Moreover, we present a combination of different NLP approaches that provides enough knowledge to be used as a support tool for recommender systems. Finally, we illustrate a case study with information related to movies and TV series to demonstrate that our framework works properly.
Abstract:
Cutting fluids are lubricants used in machining processes because they bring many benefits to different processes. They have many functions, such as lubrication, cooling and improvement of surface finish; they also decrease tool wear and protect the tool against corrosion. Owing to new environmental laws and the demand for green products, new cutting fluids must be developed. These should be biodegradable, non-toxic and safe for the environment and for operator health. Vegetable oils are therefore a good option for replacing mineral oils. In this context, this work aimed to develop an emulsion cutting fluid from epoxidized vegetable oil that promotes better lubrication and cooling in machining processes while being environmentally friendly. The methodology was divided into five steps. The first was the synthesis of the biolubricant by an epoxidation reaction. Next, the biolubricant was characterized in terms of density, acidity, iodine value, oxirane index, viscosity, thermal stability and chemical composition. The third step was to develop an oil-in-water (O/W) emulsion with different oil concentrations (10, 20 and 25%) and surfactant concentrations (1, 2.5 and 5%). Emulsion stability was also studied. The tribological performance of the emulsions was evaluated in an HFRR (High Frequency Reciprocating Rig), which consists of a ball-on-disc contact. Results showed that the vegetable-based lubricant can be synthesized by an epoxidation reaction; the spectra showed 100% conversion of the unsaturations into epoxy rings. Regarding the tribological assessment, the percentage of oil in the emulsion directly influenced film formation and the coefficient of friction: at higher concentrations, the film formation process is slow and unstable. High concentrations of surfactant did not improve the tribological performance of the emulsions.
The best friction-reduction performance was observed for the emulsion with 10% oil and 5% surfactant, whose average wear scar diameter was 202 μm.
Study of white spot disease in four native species in Persian Gulf by histopathology and PCR methods
Abstract:
Serious disease outbreaks caused by a new virus (WSV) have occurred among cultured penaeid shrimps in Asian countries such as China since 1993, and later in Latin American countries. During June and July 2002, rapid and high mortality in cultured Penaeus indicus in the Abadan region in the south of Iran, with typical signs and symptoms of White Spot Syndrome Virus, was confirmed by histopathology, PCR, TEM and virology studies. This study was conducted in 2005 to determine the prevalence (rate of infection, ROI) and to grade the severity of infection (SOI) of WSD in five species, using 150 samples of captured shrimps and 90 samples of cultured ones: Penaeus indicus, P. semisulcatus, P. merguiensis, Parapenaeopsis styliferus and Metapenaeus affinis. Of the 240 samples, 136 showed clinical and macroscopic signs and symptoms, including white spots on the carapace (0.5-2 mm), easy removal of the cuticle, fragility of the hepatopancreas and red coloration of the locomotory limbs. Histopathological changes such as specific intranuclear inclusion bodies (Cowdry type A) were observed in all target tissues (gill, epidermis, haemolymph and midgut) but not in the hepatopancreas, both in shrimps collected from various farms in the south and in those captured from the Persian Gulf, even those without clinical signs. ROI among species was estimated using the formula of Natividad & Lightner (1992b), and SOI was graded using a generalized scheme for assigning a numerical qualitative value to the severity grade of infection provided by Lightner (1996), based on histopathology and counts of specific inclusion bodies at different stages (as modified by B. Gholamhoseini). Samples with clinical signs showed grades above 2. Most of the P. semisulcatus and M. affinis samples showed grade 3, whereas grade 4 was observed in most P. styliferus samples, which may suggest that the species differ in sensitivity.
All samples were tested by nested PCR with the IQ2000™ WSSV kit; 183 of 240 samples were positive, and the 3 levels of infection shown by this PCR confirmed our SOI grades, although they were more specific.