35 results for Learning from text
in Aston University Research Archive
Abstract:
Text classification is essential for narrowing down the number of documents relevant to a particular topic for further study, especially when searching through large biomedical databases. Protein-protein interactions are an example of such a topic, with databases devoted specifically to them. This paper proposes a semi-supervised learning algorithm via local learning with class priors (LL-CP) for biomedical text classification, in which unlabeled data points are classified in a vector space based on their proximity to labeled nodes. The algorithm has been evaluated on a corpus of biomedical documents to identify abstracts containing information about protein-protein interactions, with promising results. Experimental results show that LL-CP outperforms traditional semi-supervised learning algorithms such as SVM, and that it also performs better than local learning without incorporating class priors.
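As a rough illustration of classifying unlabeled points by proximity to labeled ones while taking class priors into account, the sketch below uses a plain k-nearest-neighbour vote reweighted by priors. This is not the LL-CP algorithm from the paper: the function name `classify`, the reweighting rule, the toy vectors and the prior values are all assumptions made for the example.

```python
import math
from collections import Counter

def classify(point, labeled, priors, k=3):
    """Label `point` from its k nearest labeled neighbours.

    Each neighbour casts one vote; votes are then multiplied by the
    class prior (an assumed, simplistic way to inject prior knowledge).
    """
    neighbours = sorted(labeled, key=lambda item: math.dist(point, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    scores = {label: votes[label] * priors[label] for label in priors}
    return max(scores, key=scores.get)

# Toy 2-D document vectors (hypothetical data).
LABELED = [
    ((0.0, 0.0), "ppi"),
    ((0.1, 0.2), "ppi"),
    ((1.0, 1.0), "other"),
    ((0.9, 1.1), "other"),
    ((1.2, 0.8), "other"),
]

uniform = classify((0.2, 0.1), LABELED, {"ppi": 0.5, "other": 0.5})
skewed = classify((0.2, 0.1), LABELED, {"ppi": 0.2, "other": 0.8})
print(uniform, skewed)
```

With uniform priors the two nearby "ppi" neighbours decide the label; with a prior skewed strongly towards "other", the single "other" neighbour outweighs them. The toy case only shows that priors can shift a proximity-based decision, not how the paper's method does so.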
Abstract:
Ontology construction for any domain is a labour intensive and complex process. Any methodology that can reduce the cost and increase efficiency has the potential to make a major impact in the life sciences. This paper describes an experiment in ontology construction from text for the animal behaviour domain. Our objective was to see how much could be done in a simple and relatively rapid manner using a corpus of journal papers. We used a sequence of pre-existing text processing steps, and here describe the different choices made to clean the input, to derive a set of terms and to structure those terms in a number of hierarchies. We describe some of the challenges, especially that of focusing the ontology appropriately given a starting point of a heterogeneous corpus. Results - Using mainly automated techniques, we were able to construct an 18055 term ontology-like structure with 73% recall of animal behaviour terms, but a precision of only 26%. We were able to clean unwanted terms from the nascent ontology using lexico-syntactic patterns that tested the validity of term inclusion within the ontology. We used the same technique to test for subsumption relationships between the remaining terms to add structure to the initially broad and shallow structure we generated. All outputs are available at http://thirlmere.aston.ac.uk/~kiffer/animalbehaviour/. Conclusion - We present a systematic method for the initial steps of ontology or structured vocabulary construction for scientific domains that requires limited human effort and can make a contribution both to ontology learning and maintenance. The method is useful both for the exploration of a scientific domain and as a stepping stone towards formally rigorous ontologies. The filtering of recognised terms from a heterogeneous corpus to focus upon those that are the topic of the ontology is identified as one of the main challenges for research in ontology learning.
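The abstract does not list the lexico-syntactic patterns that were used. As a minimal sketch of the general idea, the example below tests a candidate subsumption pair against a few Hearst-style templates; the templates, the function name `pattern_hits` and the sample sentences are assumptions for illustration only.

```python
import re

# Illustrative templates (assumed, not taken from the paper).
# {parent} and {child} are filled in with the candidate terms.
TEMPLATES = [
    r"{parent}s? such as {child}",
    r"{child} and other {parent}s?",
    r"{child} is a (?:kind|type|form) of {parent}",
]

def pattern_hits(parent, child, sentences):
    """Count sentences that match any template for the (parent, child) pair."""
    regexes = [
        re.compile(
            t.format(parent=re.escape(parent), child=re.escape(child)),
            re.IGNORECASE,
        )
        for t in TEMPLATES
    ]
    return sum(1 for s in sentences if any(r.search(s) for r in regexes))

# Hypothetical corpus sentences.
SENTENCES = [
    "Social behaviours such as grooming reduce tension.",
    "Grooming and other behaviours were recorded.",
    "Grooming was recorded in table 3.",
]

print(pattern_hits("behaviour", "grooming", SENTENCES))
print(pattern_hits("table", "grooming", SENTENCES))
```

A pair with no supporting matches, such as ("table", "grooming") here, would be a candidate for removal, while a pair with several matches is evidence for a subsumption link. Any real threshold would have to be tuned on the corpus.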
Abstract:
Ontology construction for any domain is a labour intensive and complex process. Any methodology that can reduce the cost and increase efficiency has the potential to make a major impact in the life sciences. This paper describes an experiment in ontology construction from text for the Animal Behaviour domain. Our objective was to see how much could be done in a simple and rapid manner using a corpus of journal papers. We used a sequence of text processing steps, and describe the different choices made to clean the input, to derive a set of terms and to structure those terms in a hierarchy. We were able in a very short space of time to construct a 17000 term ontology with a high percentage of suitable terms. We describe some of the challenges, especially that of focusing the ontology appropriately given a starting point of a heterogeneous corpus.
Abstract:
The use of ontologies as representations of knowledge is widespread but their construction, until recently, has been entirely manual. We argue in this paper for the use of text corpora and automated natural language processing methods for the construction of ontologies. We delineate the challenges and present criteria for the selection of appropriate methods. We distinguish three major steps in ontology building: associating terms, constructing hierarchies and labelling relations. A number of methods are presented for these purposes but we conclude that the issue of data-sparsity is still a major challenge. We argue for the use of resources external to the domain specific corpus.
Abstract:
We study the dynamics of on-line learning in multilayer neural networks where training examples are sampled with repetition and where the number of examples scales with the number of network weights. The analysis is carried out using the dynamical replica method aimed at obtaining a closed set of coupled equations for a set of macroscopic variables from which both training and generalization errors can be calculated. We focus on scenarios whereby training examples are corrupted by additive Gaussian output noise and regularizers are introduced to improve the network performance. The dependence of the dynamics on the noise level, with and without regularizers, is examined, as well as that of the asymptotic values obtained for both training and generalization errors. We also demonstrate the ability of the method to approximate the learning dynamics in structurally unrealizable scenarios. The theoretical results show good agreement with those obtained by computer simulations.
Abstract:
Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework called joint sentiment-topic (JST) model based on latent Dirichlet allocation (LDA), which detects sentiment and topic simultaneously from text. A reparameterized version of the JST model called Reverse-JST, obtained by reversing the sequence of sentiment and topic generation in the modeling process, is also studied. Although JST is equivalent to Reverse-JST without a hierarchical prior, extensive experiments show that when sentiment priors are added, JST performs consistently better than Reverse-JST. Besides, unlike supervised approaches to sentiment classification which often fail to produce satisfactory performance when shifting to other domains, the weakly supervised nature of JST makes it highly portable to other domains. This is verified by the experimental results on data sets from five different domains where the JST model even outperforms existing semi-supervised approaches in some of the data sets despite using no labeled documents. Moreover, the topics and topic sentiment detected by JST are indeed coherent and informative. We hypothesize that the JST model can readily meet the demand of large-scale sentiment analysis from the web in an open-ended fashion.
Abstract:
We explore how openness in terms of external linkages generates learning effects, which enable firms to generate more innovation outputs from any given breadth of external linkages. Openness to external knowledge sources, whether through search activity or linkages to external partners in new product development, involves a process of interaction and information processing. Such activities are likely to be subject to a learning process, as firms learn which knowledge sources and collaborative linkages are most useful to their particular needs, and which partnerships are most effective in delivering innovation performance. Using panel data from Irish manufacturing plants, we find evidence of such learning effects: establishments with substantial experience of external collaborations in previous periods derive more innovation output from openness in the current period. © 2013 The Authors. Strategic Management Journal published by John Wiley & Sons Ltd.
Abstract:
Post-disaster housing reconstruction projects face several challenges. Resources and material supplies are often scarce, and several different types of organizations are involved, while projects must be completed as quickly as possible to foster recovery. Within this context, the chapter aims to increase the understanding of relief supply chain design in reconstruction. In addition, the chapter introduces a community-based and beneficiary perspective to relief supply chains by evaluating the implications of local components for supply chain design in reconstruction. This is achieved by means of secondary data analysis based on the evaluation reports of two major housing reconstruction projects that took place in Europe in the last decade. A comparative analysis of the organizational designs of these projects highlights the ways in which users can be involved. The performance of reconstruction supply chains seems to depend to a large extent on the way beneficiaries are integrated in supply chain design, which impacts positively on the effectiveness of reconstruction supply chains.
Abstract:
Networked Learning, e-Learning and Technology Enhanced Learning have each been defined in different ways, as people's understanding about technology in education has developed. Yet each could also be considered as a terminology competing for a contested conceptual space. Theoretically this can be a ‘fertile trans-disciplinary ground for represented disciplines to affect and potentially be re-orientated by others’ (Parchoma and Keefer, 2012), as differing perspectives on terminology and subject disciplines yield new understandings. Yet when used in government policy texts to describe connections between humans, learning and technology, terms tend to become fixed in less fertile positions linguistically. A deceptively spacious policy discourse that suggests people are free to make choices conceals an economically-based assumption that implementing new technologies, in themselves, determines learning. Yet it actually narrows choices open to people as one route is repeatedly in the foreground and humans are not visibly involved in it. An impression that the effective use of technology for endless improvement is inevitable cuts off critical social interactions and new knowledge for multiple understandings of technology in people's lives. This paper explores some findings from a corpus-based Critical Discourse Analysis of UK policy for educational technology during the last 15 years, to help to illuminate the choices made. This is important when through political economy, hierarchical or dominant neoliberal logic promotes a single ‘universal model’ of technology in education, without reference to a wider social context (Rustin, 2013). Discourse matters, because it can ‘mould identities’ (Massey, 2013) in narrow, objective economically-based terms which 'colonise discourses of democracy and student-centredness' (Greener and Perriton, 2005:67). 
This undermines subjective social, political, material and relational (Jones, 2012: 3) contexts for those learning when humans are omitted. Critically confronting these structures is not considered a negative activity. Whilst deterministic discourse for educational technology may leave people unconsciously restricted, I argue that, through a close analysis, it offers a deceptively spacious theoretical tool for debate about the wider social and economic context of educational technology. Methodologically it provides insights about ways technology, language and learning intersect across disciplinary borders (Giroux, 1992), as powerful, mutually constitutive elements, ever-present in networked learning situations. In sharing a replicable approach for linguistic analysis of policy discourse I hope to contribute to visions others have for a broader theoretical underpinning for educational technology, as a developing field of networked knowledge and research (Conole and Oliver, 2002; Andrews, 2011).
Abstract:
An overview of the antioxidant role of the biologically active form of vitamin E, α-tocopherol, in polyolefins is presented. The effect of the vitamin antioxidant on the melt and colour stability of polyethylene (PE) and polypropylene (PP) is highlighted. It is shown that tocopherol is a highly effective antioxidant that results in superior melt stabilisation of polyolefins, particularly when used at much lower concentration than that needed for conventional synthetic hindered phenol processing stabilisers. As with other hindered phenols, α-tocopherol also imparts some colour to the polymer but this is shown to be reduced drastically in the presence of other antioxidants, such as phosphites, or other additives, such as polyhydric alcohols.
Abstract:
In this chapter, the way in which varied terms such as Networked learning, e-learning and Technology Enhanced Learning (TEL) have each become colonised to support a dominant, economically-based world view of educational technology is discussed. Critical social theory about technology, language and learning is brought into dialogue with examples from a corpus-based Critical Discourse Analysis (CDA) of UK policy texts for educational technology between 1997 and 2012. Though these policy documents offer much promise for enhancement of people’s performance via technology, the human presence to enact such innovation is missing. Given that ‘academic workload’ is a ‘silent barrier’ to the implementation of TEL strategies (Gregory and Lodge, 2015), analysis further exposes, through empirical examples, that the academic labour of both staff and students appears to be unacknowledged. Global neoliberal capitalist values have strongly territorialised the contemporary university (Hayes & Jandric, 2014), utilising existing naïve, utopian arguments about what technology alone achieves. Whilst the chapter reveals how humans are easily ‘evicted’, even from discourse about their own learning (Hayes, 2015), it also challenges staff and students to seek to re-occupy the important territory of policy to subvert the established order. We can use the very political discourse that has disguised our networked learning practices, in new explicit ways, to restore our human visibility.
Abstract:
Automatic Term Recognition (ATR) is a fundamental processing step preceding more complex tasks such as semantic search and ontology learning. Of the large number of methodologies available in the literature, only a few are able to handle both single and multi-word terms. In this paper we present a comparison of five such algorithms and propose a combined approach using a voting mechanism. We evaluated the six approaches using two different corpora and show how the voting algorithm performs best on one corpus (a collection of texts from Wikipedia) and less well using the Genia corpus (a standard life science corpus). This indicates that choice and design of corpus has a major impact on the evaluation of term recognition algorithms. Our experiments also showed that single-word terms can be equally important and occupy a fairly large proportion in certain domains. As a result, algorithms that ignore single-word terms may cause problems for tasks built on top of ATR. Effective ATR systems also need to take into account both the unstructured text and the structured aspects, which means information extraction techniques need to be integrated into the term recognition process.
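The abstract does not say how the voting mechanism is defined. One plausible reading, sketched below purely as an assumption, is that each term recognition algorithm returns a ranked candidate list and a term receives one vote for every algorithm that ranks it within its top k. The function name `vote`, the cut-off `top_k` and the toy rankings are all hypothetical.

```python
from collections import Counter

def vote(rankings, top_k):
    """Combine ranked candidate lists (best first) by simple voting.

    A term gets one vote per algorithm that places it in its top_k.
    Ties are broken alphabetically so the output is deterministic.
    """
    votes = Counter()
    for ranking in rankings:
        votes.update(set(ranking[:top_k]))
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical outputs of three term recognition algorithms.
RANKINGS = [
    ["gene expression", "protein", "cell line", "study"],
    ["protein", "gene expression", "result", "cell line"],
    ["protein", "study", "gene expression", "cell line"],
]

combined = vote(RANKINGS, top_k=2)
print(combined)
# [('protein', 3), ('gene expression', 2), ('study', 1)]
```

Because both single-word ("protein") and multi-word ("gene expression") candidates are ordinary strings here, the scheme treats them uniformly, in line with the abstract's point that single-word terms should not be ignored.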
Abstract:
This article aims to gain a greater understanding of relevant and successful methods of stimulating an ICT culture and skills development in rural areas. The paper distils good practice activities, utilizing criteria derived from a review of the rural dimensions of ICT learning, from a range of relevant initiatives and programmes. These good practice activities cover: community resource centres providing opportunities for ‘tasting’ ICTs; video games and Internet Cafés as tools removing ‘entry barriers’; emphasis on ‘user management’ as a means of creating ownership; service delivery beyond fixed locations; use of ICT capacities in the delivery of general services; and selected use of financial support.