19 results for Machine Learning, Natural Language Processing, Descriptive Text Mining, POIROT, Transformer

in CentAUR: Central Archive University of Reading - UK


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Aircraft Maintenance, Repair and Overhaul (MRO) feedback commonly includes an engineer’s complex text-based inspection report. Capturing and normalizing the content of these textual descriptions is vital to cost and quality benchmarking, and provides information to facilitate continuous improvement of MRO processes and analytics. As data analysis and mining tools require highly normalized data, raw textual data is inadequate. This paper offers a text-mining solution to efficiently analyse bulk textual feedback data. Even when the same parts and/or sub-parts are replaced, the actual service cost for a repair often differs markedly from that of similar previous jobs. Regular expression algorithms were combined with an aircraft MRO glossary dictionary to provide additional information about the reasons for cost variation. Professional terms and conventions were included in the dictionary to avoid ambiguity and improve the results. Testing shows that most descriptive inspection reports can be appropriately interpreted, allowing extraction of highly normalized data. This additional normalized data strongly supports data analysis and data mining, whilst also increasing the accuracy of future quotation costing. This solution has been used effectively by a large aircraft MRO agency with positive results.
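The combination of regular expressions with a glossary dictionary described in this abstract could look roughly like the following minimal Python sketch. The glossary entries, the defect pattern, and the function names are all hypothetical illustrations, not taken from the paper.

```python
import re

# Hypothetical glossary: maps shop-floor shorthand to normalized terms.
GLOSSARY = {
    "corr": "corrosion",
    "crk": "crack",
    "repl": "replaced",
    "insp": "inspected",
}

# Hypothetical pattern: a defect keyword followed by a size in millimetres.
DEFECT_PATTERN = re.compile(
    r"\b(?P<defect>corrosion|crack)\b.*?(?P<size>\d+(?:\.\d+)?)\s*mm",
    re.IGNORECASE,
)


def expand_terms(text):
    """Replace glossary abbreviations (whole words only) with full terms."""
    def swap(match):
        word = match.group(0)
        return GLOSSARY.get(word.lower(), word)
    return re.sub(r"[A-Za-z]+", swap, text)


def extract_record(report):
    """Return a normalized record for the first defect found, or None."""
    normalized = expand_terms(report)
    match = DEFECT_PATTERN.search(normalized)
    if match is None:
        return None
    return {
        "defect": match.group("defect").lower(),
        "size_mm": float(match.group("size")),
    }
```

Calling `extract_record("Insp blade, crk found 2.5 mm at root")` first expands the shorthand and then pulls out the defect type and size as a structured record; a report with no recognizable defect yields `None`.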

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We construct a mapping from complex recursive linguistic data structures to spherical wave functions using Smolensky's filler/role bindings and tensor product representations. Syntactic language processing is then described by the transient evolution of these spherical patterns whose amplitudes are governed by nonlinear order parameter equations. Implications of the model in terms of brain wave dynamics are indicated.
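As a rough illustration of the filler/role binding step only (not the spherical wave function mapping or the order parameter dynamics), a tensor product representation sums the outer products of filler and role vectors. With orthonormal roles, a filler can be recovered by multiplying the structure by its role vector. The vectors and function names below are assumptions chosen for the sketch.

```python
def outer(u, v):
    """Outer product of two vectors as a list-of-lists matrix."""
    return [[a * b for b in v] for a in u]


def add(m1, m2):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


def bind(pairs):
    """Sum of filler (x) role outer products over all (filler, role) pairs."""
    total = None
    for filler, role in pairs:
        term = outer(filler, role)
        total = term if total is None else add(total, term)
    return total


def unbind(structure, role):
    """Recover the filler bound to an orthonormal role (matrix times role)."""
    return [sum(x * r for x, r in zip(row, role)) for row in structure]


# Hypothetical toy vectors: two orthonormal roles, two fillers.
ROLE_LEFT, ROLE_RIGHT = [1, 0], [0, 1]
FILLER_A, FILLER_B = [1, 0, 0], [0, 1, 0]

structure = bind([(FILLER_A, ROLE_LEFT), (FILLER_B, ROLE_RIGHT)])
```

Here `unbind(structure, ROLE_LEFT)` returns `FILLER_A`, because the roles are orthonormal and the cross term vanishes.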

Relevance:

100.00% 100.00%

Publisher:

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Accurate single-trial P300 classification lends itself to fast and accurate control of Brain Computer Interfaces (BCIs). Highly accurate classification of single-trial P300 ERPs is achieved by characterizing the EEG via corresponding stationary and time-varying Wackermann parameters. Subsets of maximally discriminating parameters are then selected using the Network Clustering feature selection algorithm and classified with Naive-Bayes and Linear Discriminant Analysis classifiers. The method is assessed on two different data-sets from BCI competitions and is shown to produce accuracies of between approximately 70% and 85%. This is promising for the use of Wackermann parameters as features in the classification of single-trial ERP responses.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Social networks have gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook, LinkedIn and Google+ through the internet and Web 2.0 technologies has become more affordable. People are becoming more interested in, and reliant on, social networks for information, news and the opinions of other users on diverse subject matters. This heavy reliance causes social network sites to generate massive data characterised by three computational issues, namely size, noise and dynamism. These issues often make social network data very complex to analyse manually, which makes computational means of analysis necessary. Data mining provides a wide range of techniques for detecting useful knowledge, such as trends, patterns and rules, from massive datasets [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis, and data interpretation processes in the course of data analysis. This survey discusses different data mining techniques used in mining diverse aspects of social networks over the decades, from the historical techniques to up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in Table 1, including the tools employed as well as the names of their authors.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article considers the issue of low levels of motivation for foreign language learning in England by exploring how language learning is conceptualised by different key voices in that country through the examination of written data: policy documents and reports on the UK's language needs, curriculum documents, and press articles. The extent to which this conceptualisation has changed over time is explored, through the consideration of documents from two time points, before and after a change in government in the UK. The study uses corpus analysis methods in this exploration. The picture that emerges is a complex one regarding how the 'problems' and 'solutions' surrounding language learning in that context are presented in public discourse. This, we conclude, has implications for the likely success of measures adopted to increase language learning uptake in that context.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Human brain imaging techniques, such as Magnetic Resonance Imaging (MRI) or Diffusion Tensor Imaging (DTI), have been established as scientific and diagnostic tools and their adoption is growing. Statistical methods, machine learning and data mining algorithms have successfully been adopted to extract predictive and descriptive models from neuroimage data. However, the knowledge discovery process typically also requires pre-processing, post-processing and visualisation techniques in complex data workflows. Currently, a main problem for the integrated pre-processing and mining of MRI data is the lack of comprehensive platforms that avoid the manual invocation of pre-processing and mining tools, which leads to an error-prone and inefficient process. In this work we present K-Surfer, a novel plug-in for the Konstanz Information Miner (KNIME) workbench that automates the pre-processing of brain images and leverages the mining capabilities of KNIME in an integrated way. K-Surfer supports the importing, filtering, merging and pre-processing of neuroimage data from FreeSurfer, a tool for human brain MRI feature extraction and interpretation. K-Surfer automates the steps for importing FreeSurfer data, reducing time costs, eliminating human errors and enabling the design of complex analytics workflows for neuroimage data by leveraging the rich functionality available in the KNIME workbench.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

It has been suggested that Assessment for Learning (AfL) plays a significant role in enhancing teaching and learning in mainstream educational contexts. However, little empirical evidence can support these claims. As AfL has been shown to be enacted predominantly through interactions in primary classes, there is a need to understand if it is appropriate, whether it can be efficiently used in teaching English to Young Learners (TEYL) and how it can facilitate learning in such a context. This emerging research focus gains currency especially in the light of SLA research, which suggests the important role of interactions in foreign language learning. This mixed-method, descriptive and exploratory study aims to investigate how teachers of learners aged 7-11 understand AfL; how they implement it; and the impact that such implementation could have on interactions which occur during lessons. The data were collected through lesson observations, scrutiny of school documents, semi-structured interviews and a focus group interview with teachers. The findings indicate that fitness for purpose guides the implementation of AfL in TEYL classrooms. Significantly, the study has revealed differences in the implementation of AfL between classes of 7-9 and 10-11 year olds within each of the three purposes (setting objectives and expectations; monitoring performance; and checking achievement) identified through the data. Another important finding of this study is the empirical evidence suggesting that the use of AfL could facilitate creating conditions conducive to learning in TEYL classes during collaborative and expert/novice interactions. The findings suggest that teachers’ understanding of AfL is largely aligned with the theoretical frameworks (Black & Wiliam, 2009; Swaffield, 2011) already available. However, they also demonstrate that there are TEYL specific characteristics. This research has important pedagogical implications and indicates a number of areas for further research.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article reports on an investigation into the language learning beliefs of students of French in England, aged 16 to 18. It focuses on qualitative data from two groups of learners (10 in total). While both groups had broadly similar levels of achievement in French in terms of examination success, they differed greatly in the self-image they had of themselves as language learners, with one group displaying low levels of self-efficacy beliefs regarding the possibility of future success. The implications of such beliefs for students' levels of motivation and persistence are discussed, together with their possible causes. The article concludes by suggesting changes in classroom practice that might help students develop a more positive image of themselves as language learners.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Aircraft Maintenance, Repair and Overhaul (MRO) agencies rely largely on row-data based quotation systems to select the best suppliers for their customers (airlines). Data quantity and quality become key issues in determining the success of an MRO job, since cost and quality benchmarks must be met. This paper introduces a data mining approach to create an MRO quotation system that enhances data quantity and data quality, and enables significantly more precise MRO job quotations. Regular expressions were used to analyse descriptive textual feedback (i.e. engineer’s reports) in order to extract more referable, highly normalised data for job quotation. A text-mining-based key influencer analysis function enables the user to proactively select sub-parts, defects and possible solutions to make queries more accurate. Implementation results show that system data would improve cost quotation in 40% of MRO jobs and would reduce service cost without causing a drop in service quality.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Recently, major processor manufacturers have announced a dramatic shift in their paradigm for increasing computing power over the coming years. Instead of focusing on faster clock speeds and more powerful single-core CPUs, the trend clearly goes towards multi-core systems. This will also result in a paradigm shift for the development of algorithms for computationally expensive tasks, such as data mining applications. Obviously, work on parallel algorithms is not new per se, but concentrated efforts in the many application domains are still missing. Multi-core systems, but also clusters of workstations and even large-scale distributed computing infrastructures, provide new opportunities and pose new challenges for the design of parallel and distributed algorithms. Since data mining and machine learning systems rely on high performance computing systems, research on the corresponding algorithms must be at the forefront of parallel algorithm research in order to keep pushing data mining and machine learning applications to be more powerful and, especially for the former, interactive. To bring together researchers and practitioners working in this exciting field, a workshop on parallel data mining was organized as part of PKDD/ECML 2006 (Berlin, Germany). The six contributions selected for the program describe various aspects of data mining and machine learning approaches featuring low to high degrees of parallelism. The first contribution addresses the classic problem of distributed association rule mining and focuses on communication efficiency to improve the state of the art. After this, a parallelization technique for speeding up decision tree construction by means of thread-level parallelism for shared memory systems is presented. The next paper discusses the design of a parallel approach to the frequent subgraph mining problem for distributed memory systems. This approach is based on a hierarchical communication topology to solve issues related to multi-domain computational environments. The fourth paper describes the combined use and customization of software packages to facilitate top-down parallelism in the tuning of Support Vector Machines (SVMs), and the next contribution presents an interesting idea concerning parallel training of Conditional Random Fields (CRFs) and motivates their use in labeling sequential data. The last contribution focuses on very efficient feature selection. It describes a parallel algorithm for feature selection from random subsets. Selecting the papers included in this volume would not have been possible without the help of an international Program Committee that provided detailed reviews for each paper. We would also like to thank Matthew Otey, who helped with publicity for the workshop.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

25 monolingual (L1) children with Specific Language Impairment (SLI), 32 sequential bilingual (L2) children, and 29 L1 controls completed the Test of Active & Passive Sentences-Revised (van der Lely, 1996) and the self-paced listening task with picture verification for actives and passives (Marinis, 2007). These revealed important between-group differences in both tasks. The children with SLI showed difficulties in both actives and passives when they had to reanalyse thematic roles on-line. Their error pattern provided evidence for working memory limitations. The L2 children showed difficulties only in passives both on-line and off-line. We suggest that these relate to the complex syntactic algorithm in passives and reflect an earlier developmental stage due to reduced exposure to the L2. The results are discussed in relation to theories of SLI and can be best accommodated within accounts proposing that difficulties in the comprehension of passives stem from processing limitations.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This cross-sectional study examines the role of L1-L2 differences and structural distance in the processing of gender and number agreement by English-speaking learners of Spanish at three different levels of proficiency. Preliminary results show that differences between the L1 and L2 impact L2 development, as sensitivity to gender agreement violations, as opposed to number agreement violations, emerges only in learners at advanced levels of proficiency. Results also show that the establishment of agreement dependencies is impacted by the structural distance between the agreeing elements for native speakers and for learners at intermediate and advanced levels of proficiency but not for low proficiency. The overall pattern of results suggests that the linguistic factors examined here impact development but do not constrain ultimate attainment; for advanced learners, results suggest that second language processing is qualitatively similar to native processing.