955 results for Feature learning
Abstract:
This qualitative study investigated how a team of 7 hospital educators collaborated to develop e-curriculum units to pilot for a newly acquired learning management system at a large, multisite academic health sciences centre. A case study approach was used to examine how the e-Curriculum Team was structured, how the educators worked together to develop strategies to better utilize e-learning in their own practice, what e-curriculum they chose to develop, and how they determined their priorities for e-curriculum development. It also inquired into how they planned to involve other educators in using e-learning. One set of semi-structured interviews with the 6 hospital educators involved in the project, as well as minutes of team meetings and the researcher's journal, were analyzed (the researcher was also a hospital educator on the team). Project management structure, educator support, and organizational pressures on the implementation project feature prominently in the case study. This study suggests that implementation of e-learning will be more successful if (a) educators involved in the development of e-learning curriculum are supported in their role as change agents, (b) the pain of unlearning current educational practice is considered, (c) the limitations of the software being implemented are recognized, (d) time is spent learning about best practice, and (e) the project is protected as much as possible from organizational pressures and distractions.
Abstract:
Summer vacation creates a significant breach in the learning cycle, with student achievement levels decreasing over the course of the summer (Cooper et al., 2000). In a review of 39 studies, Cooper and colleagues (1996) specified that the summer learning shortfall equals at least one month's loss of instruction, as measured by grade-level equivalents on standardized test scores. The achievement gap has a more profound effect on children as they grow older, with a steady deterioration in the knowledge and skills retained over the summer months (Cooper et al., 1996; Kerry & Davies, 1998). While some stakeholders believe that the benefits of a summer vacation overshadow its reversing effect on achievement, it is the impact of the summer learning gap on vulnerable children, including children who are disadvantaged as a result of special educational needs, children from low socioeconomic backgrounds, and children learning English as a second language, that is most problematic. More specifically, research has demonstrated that it is children's literacy-based skills that are most affected during the summer months. Children from high socioeconomic backgrounds repeatedly showed gains in reading achievement over the summer, whereas disadvantaged children repeatedly showed significant losses. Consequently, the summer learning gap was deemed to exacerbate the inequality experienced by children from low socioeconomic backgrounds. Ultimately, the summer learning gap was found to have the most profound effect on vulnerable children, placing them at increased risk of academic failure. A primary feature of this research project was to include primary caregivers as authentic partners in a summer family literacy program designed to scaffold their children's literacy-based needs. To this end, the research team adapted and implemented a published program entitled Learning Begins at Home (LBH): A Research-Based Family Literacy Program Curriculum, designed by researchers at the Ontario Institute for Studies in Education, University of Toronto. The LBH program incorporated the flexibility required to adapt it to the needs of each participating child and his or her primary caregiver. As is well documented in the research, the role primary caregivers play in an intervention program is the most influential factor in a child's future literacy success or failure (Timmons, 2008). Consequently, participation in the summer family literacy program required the commitment of one child and one of his or her primary caregivers. The primary caregiver played a fundamental role in the intervention program through participation in workshop activities prior to and following hands-on work with their child. The purpose of including the primary caregiver as an authentic partner in the program was to encourage a definitive shift in the family, whereby caregivers would begin to implement literacy activities in their home on a daily basis. The intervention program was socially constructed through collaborative knowledge-building. The author's role in the study was that of researcher, in charge of analyzing and interpreting the results. There were a total of thirty-six (36) participants in the study: nineteen (19) in the intervention group and seventeen (17) in the control group.
All of the children who participated in the study were enrolled in junior kindergarten classrooms within the Niagara Catholic District School Board. Once children were referred to the program, a Speech and Language Pathologist assessed each individual child to determine whether they met the eligibility requirements for participation in the summer family literacy intervention program. To be eligible to participate, children were required to demonstrate significant literacy needs (i.e., below the 25th percentile on the Test of Preschool Early Literacy described below). Children with low-incidence disabilities (such as Autism or Intellectual Disabilities) and children with significant English as a Second Language difficulties were excluded from the study. The research team utilized a standard pre-test/post-test comparison-group design whereby all participating children were assessed with the Test of Preschool Early Literacy (Lonigan et al., 2007) and a standard measure of letter identification and letter-sound understanding. Pre-intervention assessments were conducted two weeks before the intervention program commenced, and the first set of post-intervention assessments was administered immediately following the completion of the intervention program. The follow-up post-intervention assessments took place in December 2010 to measure the sustainability of the gains obtained from the intervention program. As a result of the program, the children in the intervention group scored statistically significantly higher than the control group on Print Knowledge, Letter Identification, and Letter-Sound Understanding at the post-intervention assessment point (immediately following the completion of the program) and at the December post-intervention assessment point. For Phonological Awareness, there was no statistically significant difference between the intervention group and the control group at the post-intervention assessment point; however, a statistically significant difference was found between the two groups at the December post-intervention assessment point. In general, these results indicate that the summer family literacy intervention program made an immediate impact on the emergent literacy skills of the participating children. Moreover, these results indicate that the summer family literacy intervention program has the ability to foster the emergent literacy skills of vulnerable children, potentially reversing the negative effect the summer learning gap has on these children.
Abstract:
The curse of dimensionality is a major problem in the fields of machine learning, data mining and knowledge discovery. Exhaustive search for the optimal subset of relevant features in a high-dimensional dataset is NP-hard. Sub-optimal population-based stochastic algorithms such as GP and GA are good choices for searching through large search spaces, and are usually more feasible than exhaustive and deterministic search algorithms. On the other hand, population-based stochastic algorithms often suffer from premature convergence on mediocre sub-optimal solutions. The Age Layered Population Structure (ALPS) is a novel metaheuristic for overcoming the problem of premature convergence in evolutionary algorithms and for improving search in the fitness landscape. The ALPS paradigm uses an age measure to control breeding and competition between individuals in the population. This thesis uses a modification of the ALPS GP strategy called Feature Selection ALPS (FSALPS) for feature subset selection and classification of varied supervised learning tasks. FSALPS uses a novel frequency-count system to rank features in the GP population based on evolved feature frequencies. The ranked features are translated into probabilities, which are used to control evolutionary processes such as terminal-symbol selection for the construction of GP trees/sub-trees. The FSALPS metaheuristic continuously refines the feature subset selection process while simultaneously evolving efficient classifiers through a non-converging evolutionary process that favors selection of features with high discrimination of class labels. We investigated and compared the performance of canonical GP, ALPS and FSALPS on high-dimensional benchmark classification datasets, including a hyperspectral image. Using Tukey's HSD ANOVA test at a 95% confidence interval, ALPS and FSALPS dominated canonical GP in evolving smaller but efficient trees with less bloat. FSALPS significantly outperformed canonical GP, ALPS and some feature selection strategies reported in the related literature on dimensionality reduction.
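To make the frequency-count idea concrete, the sketch below (a minimal illustration, not the thesis implementation) counts how often each feature terminal appears across a toy GP population, normalises the counts into a sampling distribution, and draws terminals from it when building trees. The feature names x1 to x4 and the toy population are illustrative assumptions.

```python
import random
from collections import Counter

def feature_probabilities(population, features):
    """Rank features by how often they appear across evolved GP trees,
    then normalise the counts into a sampling distribution."""
    counts = Counter(f for tree in population for f in tree if f in features)
    total = sum(counts[f] + 1 for f in features)   # +1 smoothing keeps unseen features reachable
    return {f: (counts[f] + 1) / total for f in features}

def sample_terminal(probs):
    """Pick a terminal symbol for tree construction, biased toward
    features that fit individuals use frequently."""
    features, weights = zip(*probs.items())
    return random.choices(features, weights=weights, k=1)[0]

# toy population: each "tree" is reduced to the multiset of feature terminals it uses
population = [["x1", "x3", "x3"], ["x3", "x2"], ["x1", "x3"]]
probs = feature_probabilities(population, ["x1", "x2", "x3", "x4"])
print(probs, sample_terminal(probs))
```

In FSALPS itself this bookkeeping is embedded in the evolutionary loop; the sketch only shows how evolved frequencies can bias terminal-symbol selection.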
Abstract:
Uptake is the learner's immediate response to the teacher's feedback (Lyster & Ranta, 1997). This study investigates the relationship between uptake and the learning of possessive determiners and questions in L2 English. It also examines the effect of implicit and explicit recasts in terms of uptake and learning. Two intensive Grade 6 ESL classes (N=53) in Montreal participated in this study. The two classes were assigned to two groups: explicit recasts and implicit recasts. The intervention consisted of communicative activities. The students were tested on the target forms immediately before and after the pedagogical treatment using oral tasks. The results confirmed the superior effect of explicit recasts in terms of both uptake and learning, and showed that the effect of recasts depends on the target form. This study also showed that uptake can facilitate learning and that its absence is not a sign that learning has not occurred.
Abstract:
The goal of this thesis by articles is to modestly present a few steps along the path that will (we hope) lead to a general solution to the problem of artificial intelligence. The thesis contains four articles, each presenting a new method of perceptual inference using machine learning and, more specifically, deep neural networks. Each of these papers demonstrates the usefulness of its proposed method on a computer vision task. The methods are applicable in a more general context, and in some cases have been applied elsewhere, but this is not covered in this thesis. In the first article, we present two new variational inference algorithms for the generative image model known as spike-and-slab sparse coding (S3C). These faster inference methods allow us to use S3C models of much larger size than before. We show that they are better at extracting feature detectors when very few labeled examples are available for training. Starting from an S3C model, we then build a deep architecture, the partially directed deep Boltzmann machine (PD-DBM). This model was designed to simplify the training of deep Boltzmann machines, which normally requires a greedy layer-wise pre-training phase. That problem is solved to some extent, but the cost of inference in the new model is too high for practical use. In the second article, we return to the problem of jointly training deep Boltzmann machines. This time, instead of changing the model family, we introduce a new training criterion that gives rise to the multi-prediction deep Boltzmann machine (MP-DBM). MP-DBMs can be trained in a single stage and achieve better classification accuracy than classical DBMs. They can also be trained with standard variational methods, rather than requiring a discriminative classifier to obtain good classification accuracy. One drawback of such models is their inability to generate samples, but this is not too serious, since the classification performance of deep Boltzmann machines is no longer a priority given the recent advances in supervised learning. Despite this, MP-DBMs remain interesting because they can accomplish certain tasks that purely supervised models cannot, such as classifying incomplete data or intelligently filling in the missing information in such data. The work presented in this thesis took place during a period of major transformation in the field of deep neural network learning, triggered by Geoffrey Hinton's discovery of the dropout algorithm. Dropout makes it possible to train feed-forward architectures in a purely supervised fashion without the danger of overfitting. The third article in this thesis introduces a new activation function designed specifically to work with dropout. This activation function, called maxout, allows multi-channel pooling to be used in a purely supervised learning setting. We show how several object recognition tasks are better accomplished using maxout. Finally, we present a real industrial use case: the transcription of multi-digit house numbers. By combining maxout with a new kind of output layer for convolutional neural networks, we show that it is possible to reach human-level accuracy on a challenging dataset of photos taken by Google's cars. This system has been successfully deployed at Google to read approximately one hundred million house-number addresses.
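As a rough illustration of the maxout idea described above, the NumPy sketch below computes k affine feature maps and takes their element-wise maximum. The shapes, the value of k, and the random inputs are illustrative assumptions, not the configuration used in the thesis.

```python
import numpy as np

def maxout(x, W, b):
    """Maxout activation: take the element-wise maximum over k affine
    feature maps. W has shape (k, d_in, d_out), b has shape (k, d_out)."""
    z = np.einsum("ni,kio->nko", x, W) + b   # (n, k, d_out) pre-activations
    return z.max(axis=1)                     # max over the k linear pieces

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                  # batch of 4 inputs, 8 features
W = rng.normal(size=(3, 8, 5))               # k=3 linear pieces, 5 output units
b = np.zeros((3, 5))
print(maxout(x, W, b).shape)                 # (4, 5)
```

Because the output is the maximum of several learned linear functions, maxout can approximate a wide range of convex activation shapes, which is part of what makes it pair well with dropout.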
Abstract:
Treating e-mail filtering as a binary text classification problem, researchers have applied several statistical learning algorithms to e-mail corpora with promising results. This paper examines the performance of a Naive Bayes classifier using different approaches to feature selection and tokenization on different e-mail corpora.
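A minimal sketch of the kind of pipeline such a comparison involves, using scikit-learn's CountVectorizer for tokenization, chi-squared feature selection, and a multinomial Naive Bayes classifier. The toy corpus, labels, and parameter choices are illustrative only and not taken from the paper.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.naive_bayes import MultinomialNB

# toy corpus: 1 = spam, 0 = ham (illustrative only)
emails = ["win money now", "meeting at noon", "cheap pills online", "project report attached"]
labels = [1, 0, 1, 0]

clf = Pipeline([
    ("tokenize", CountVectorizer(lowercase=True)),   # tokenization choices go here
    ("select",   SelectKBest(chi2, k=5)),            # keep the k most informative tokens
    ("nb",       MultinomialNB()),                   # the Naive Bayes classifier under study
])
clf.fit(emails, labels)
print(clf.predict(["free money pills"]))
```

Swapping the tokenizer settings or the feature-selection step is exactly the kind of variation whose effect on accuracy the paper measures.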
Abstract:
The increasing interconnection of information and communication systems leads to a further increase in complexity and thus also to a further growth in security vulnerabilities. Classical protection mechanisms such as firewall systems and anti-malware solutions have long ceased to provide adequate protection against intrusions into IT infrastructures. Intrusion Detection Systems (IDS) have established themselves as a very effective instrument for protection against cyber attacks. Such systems collect and analyse information from network components and hosts in order to detect unusual behaviour and security violations automatically. While signature-based approaches can only detect already known attack patterns, anomaly-based IDS are also able to recognise new, previously unknown attacks (zero-day attacks) at an early stage. The core problem of intrusion detection systems, however, lies in the optimal processing of the enormous volume of network data and in the development of an adaptive detection model that works in real time. To address these challenges, this dissertation provides a framework that consists of two main parts. The first part, called OptiFilter, uses a dynamic queuing concept to process the large volume of incoming network data, continuously assembles network connections, and exports structured input data for the IDS. The second part is an adaptive classifier that comprises a classifier model based on an Enhanced Growing Hierarchical Self-Organizing Map (EGHSOM), a model of normal network behaviour (NNB), and an update model. In OptiFilter, tcpdump and SNMP traps are used to continuously aggregate network packets and host events. These aggregated network packets and host events are further analysed and transformed into connection vectors. To improve the detection rate of the adaptive classifier, the artificial neural network GHSOM is studied intensively and substantially extended. Different approaches are proposed and discussed in this dissertation: a classification-confidence margin threshold is defined to uncover unknown malicious connections, the stability of the growing topology is increased through novel approaches for initialising the weight vectors and strengthening the winner neurons, and a self-adaptive procedure is introduced to keep the model continuously up to date. Furthermore, the main task of the NNB model is to further examine the unknown connections detected by the EGHSOM and to check whether they are in fact normal. However, because of the concept-drift phenomenon, network traffic data change constantly, which leads to the generation of non-stationary network data in real time. This phenomenon is handled by the update model. The EGHSOM model can effectively detect new anomalies, and the NNB model adapts well to changes in the network data. In the experimental evaluations, the framework showed promising results. In the first experiment, the framework was evaluated in offline mode: OptiFilter was evaluated with offline, synthetic, and realistic data, and the adaptive classifier was evaluated with 10-fold cross-validation to estimate its accuracy.
In the second experiment, the framework was installed on a 1 to 10 GB network link and evaluated online in real time. OptiFilter successfully transformed the enormous volume of network data into structured connection vectors, and the adaptive classifier classified them precisely. A comparative study between the developed framework and other well-known IDS approaches shows that the proposed IDS framework outperforms all other approaches. This can be attributed to the following key points: the processing of the collected network data, the achievement of the best performance (e.g. overall accuracy), the detection of unknown connections, and the development of a real-time intrusion detection model.
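As a schematic illustration of the classification-confidence margin idea mentioned above (not the EGHSOM implementation itself), the sketch below assigns a connection vector to the label of its nearest prototype, but reports it as unknown, for further inspection by something like the NNB model, when the margin between the two closest prototypes falls below a threshold. The prototypes, the margin definition, and the threshold value are illustrative assumptions.

```python
import numpy as np

def classify_with_margin(x, prototypes, labels, margin_threshold=0.15):
    """Label a connection vector by its nearest prototype, but return
    'unknown' when the confidence margin between the two closest
    prototypes is too small; such connections go to further inspection."""
    d = np.linalg.norm(prototypes - x, axis=1)
    order = np.argsort(d)
    margin = d[order[1]] - d[order[0]]
    return labels[order[0]] if margin >= margin_threshold else "unknown"

prototypes = np.array([[0.1, 0.2], [0.9, 0.8], [0.5, 0.5]])   # e.g. learned map units
labels     = ["normal", "attack", "normal"]
print(classify_with_margin(np.array([0.48, 0.52]), prototypes, labels))
```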
Abstract:
Variations on the standard Kohonen feature map can enable an ordering of the map state space using only a limited subset of the complete input vector. It is also possible to employ merely a local adaptation procedure to order the map, rather than having to rely on global variables and objectives. Such variations have been included as part of a hybrid learning system (HLS) which has arisen out of a genetic-based classifier system. In the paper, a description of the modified feature map, which constitutes the HLS's long-term memory, is given, and results on the control of a simple maze-running task are presented, thereby demonstrating the value of goal-related feedback within the overall network.
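A minimal sketch of the kind of variation described above: the winner of a one-dimensional Kohonen map is found using only a masked subset of the input vector, and adaptation is purely local to the winner and its immediate neighbours. The map size, mask, and learning parameters are illustrative assumptions, not those of the HLS.

```python
import numpy as np

def som_step(weights, x, mask, lr=0.1, radius=1):
    """One local adaptation step of a 1-D Kohonen map. The winner is found
    using only the masked subset of the input vector, and only the winner
    and its immediate neighbours are moved; no global variables or objectives."""
    d = np.linalg.norm((weights - x) * mask, axis=1)   # distance on the visible components only
    win = int(np.argmin(d))
    for j in range(max(0, win - radius), min(len(weights), win + radius + 1)):
        weights[j] += lr * mask * (x - weights[j])     # adapt only the masked components
    return win

rng = np.random.default_rng(1)
weights = rng.random((10, 4))                          # 10 map units, 4-dimensional inputs
mask = np.array([1.0, 1.0, 0.0, 0.0])                  # order the map using only the first two components
win = som_step(weights, rng.random(4), mask)
print("winner unit:", win)
```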
Abstract:
This paper presents an enhanced hypothesis verification strategy for 3D object recognition. A new learning methodology is presented which integrates the traditionally dichotomic object-centred and appearance-based representations in computer vision, giving improved hypothesis verification under iconic matching. The "appearance" of a 3D object is learnt using an eigenspace representation obtained as it is tracked through a scene. The feature representation implicitly models the background and the objects observed, enabling the segmentation of the objects from the background. The method is shown to enhance model-based tracking, particularly in the presence of clutter and occlusion, and to provide a basis for identification. The unified approach is discussed in the context of the traffic surveillance domain. The approach is demonstrated on real-world image sequences and compared to previous (edge-based) iconic evaluation techniques.
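To illustrate the eigenspace appearance representation in the simplest possible terms, the sketch below builds a low-dimensional basis from vectorised patches gathered during tracking and scores a hypothesis by its reconstruction residual. The patch size, number of eigenvectors, and random data are illustrative assumptions rather than the paper's actual pipeline.

```python
import numpy as np

def build_eigenspace(patches, k=8):
    """Learn a low-dimensional 'appearance' eigenspace from vectorised
    image patches gathered while the object is tracked through the scene."""
    mean = patches.mean(axis=0)
    _, _, vt = np.linalg.svd(patches - mean, full_matrices=False)
    return mean, vt[:k]                    # mean appearance and top-k eigenvectors

def verification_score(patch, mean, basis):
    """Score a hypothesis by how well the observed patch is reconstructed
    from the eigenspace: a small residual supports the hypothesis."""
    coeff = basis @ (patch - mean)
    residual = patch - mean - basis.T @ coeff
    return np.linalg.norm(residual)

rng = np.random.default_rng(0)
tracked = rng.random((50, 16 * 16))        # 50 vectorised 16x16 patches from tracking
mean, basis = build_eigenspace(tracked, k=8)
print(verification_score(tracked[0], mean, basis))
```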
Abstract:
Advances in hardware and software over the past decade make it possible to capture, record and process fast data streams at a large scale. The research area of data stream mining has emerged as a consequence of these advances in order to cope with the real-time analysis of potentially large and changing data streams. Examples of data streams include Google searches, credit card transactions, telemetric data and data from continuous chemical production processes. In some cases the data can be processed in batches by traditional data mining approaches. However, some applications require the data to be analysed in real time as soon as it is captured, for example if the data stream is infinite, fast changing, or simply too large to be stored. One of the most important data mining techniques on data streams is classification. This involves training the classifier on the data stream in real time and adapting it to concept drift. Most data stream classifiers are based on decision trees. However, it is well known in the data mining community that there is no single optimal algorithm: an algorithm may work well on one or several datasets but badly on others. This paper introduces eRules, a new rule-based adaptive classifier for data streams, based on an evolving set of rules. eRules induces a set of rules that is constantly evaluated and adapted to changes in the data stream by adding new rules and removing old ones. It differs from the more popular decision-tree-based classifiers in that it tends to leave data instances unclassified rather than forcing a classification that could be wrong. The ongoing development of eRules aims to improve its accuracy further through dynamic parameter setting, which will also address the problem of changing feature domain values.
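A very small sketch of the abstaining, self-pruning behaviour described above (not the eRules induction algorithm itself): the classifier predicts only when some rule covers the instance, scores each rule against incoming labels, and drops rules whose accuracy decays under drift. The rules, thresholds, and data format are illustrative assumptions.

```python
class Rule:
    def __init__(self, condition, label):
        self.condition, self.label = condition, label
        self.hits, self.correct = 0, 0

    def accuracy(self):
        # untested rules are kept until evidence accumulates
        return self.correct / self.hits if self.hits else 1.0

class TinyRuleClassifier:
    """Predict only when some rule covers the instance (otherwise abstain),
    score every rule against the incoming labels, and drop rules whose
    accuracy decays as the stream drifts."""
    def __init__(self, rules, min_accuracy=0.6):
        self.rules, self.min_accuracy = list(rules), min_accuracy

    def predict(self, x):
        for r in self.rules:
            if r.condition(x):
                return r.label
        return None                       # leave the instance unclassified

    def update(self, x, y):
        for r in self.rules:
            if r.condition(x):
                r.hits += 1
                r.correct += (r.label == y)
        self.rules = [r for r in self.rules if r.accuracy() >= self.min_accuracy]

clf = TinyRuleClassifier([Rule(lambda x: x["duration"] > 100, "slow"),
                          Rule(lambda x: x["duration"] <= 100, "fast")])
print(clf.predict({"duration": 30}))
clf.update({"duration": 30}, "fast")
```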
Abstract:
Many studies have widely accepted the assumption that learning processes can be promoted when teaching styles and learning styles are well matched. In this study, the relationships between learning styles, learning patterns, and gender as a selected demographic feature and learners' performance were quantitatively investigated in a blended learning setting. This environment adopts a traditional 'one-size-fits-all' teaching approach without considering individual users' preferences and attitudes. Hence, evidence can be provided about the value of taking such factors into account in Adaptive Educational Hypermedia Systems (AEHSs). Felder and Soloman's Index of Learning Styles (ILS) was used to identify the learning styles of 59 undergraduate students at the University of Babylon. Five hypotheses were investigated in the experiment. Our findings show that some of the assessed factors had no statistically significant effect. However, the processing dimension, the total number of hits on the course website, and gender had a statistically significant effect on learners' performance. This finding needs more investigation in order to identify the factors affecting students' achievement that should be considered in Adaptive Educational Hypermedia Systems (AEHSs).
Abstract:
Extinction-resistant fear is considered to be a central feature of pathological anxiety. Here we sought to determine whether individual differences in Intolerance of Uncertainty (IU), a potential risk factor for anxiety disorders, underlie compromised fear extinction. We tested this hypothesis by recording electrodermal activity in 38 healthy participants during fear acquisition and extinction. We assessed the temporality of fear extinction by examining early and late extinction learning. During early extinction, low IU was associated with larger skin conductance responses to learned threat vs. safety cues, whereas high IU was associated with skin conductance responding to both threat and safety cues, but no cue discrimination. During late extinction, low IU showed no difference in skin conductance between learned threat and safety cues, whilst high IU predicted continued fear expression to learned threat, indexed by larger skin conductance to threat vs. safety cues. These findings suggest a critical role of uncertainty-based mechanisms in the maintenance of learned fear.
Abstract:
Identifying the correct sense of a word in context is crucial for many tasks in natural language processing (machine translation is an example). State-of-the-art methods for Word Sense Disambiguation (WSD) build models using hand-crafted features that usually capture shallow linguistic information. Complex background knowledge, such as semantic relationships, is typically either not used, or used in a specialised manner, due to the limitations of the feature-based modelling techniques employed. On the other hand, empirical results from the use of Inductive Logic Programming (ILP) systems have repeatedly shown that they can use diverse sources of background knowledge when constructing models. In this paper, we investigate whether this ability of ILP systems can be used to improve the predictive accuracy of models for WSD. Specifically, we examine the use of a general-purpose ILP system to construct a set of features using semantic, syntactic and lexical information. This feature set is then used by a common modelling technique in the field (a support vector machine) to construct a classifier for predicting the sense of a word. In our investigation we examine one-shot and incremental approaches to feature-set construction applied to monolingual and bilingual WSD tasks. The monolingual tasks use 32 verbs and 85 verbs and nouns (in English) from the SENSEVAL-3 and SemEval-2007 benchmarks, while the bilingual WSD task consists of 7 highly ambiguous verbs in translating from English to Portuguese. The results are encouraging: the ILP-assisted models show substantial improvements over those that simply use shallow features. In addition, incremental feature-set construction appears to identify smaller and better sets of features. Taken together, the results suggest that the use of ILP with diverse sources of background knowledge provides a way of making substantial progress in the field of WSD.
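A minimal sketch of the second stage described above: once an ILP system has proposed a set of boolean features, each example is mapped to a feature vector and a support vector machine is trained on it. The three toy "features" below stand in for ILP-constructed clauses and, like the data, are purely illustrative.

```python
from sklearn.svm import SVC

def to_vector(example, features):
    """Evaluate each constructed boolean feature on an example."""
    return [1 if f(example) else 0 for f in features]

# stand-ins for ILP-constructed clauses over background knowledge
features = [
    lambda e: e["pos_next"] == "NOUN",            # syntactic feature
    lambda e: "bank_river" in e["hypernyms"],     # semantic feature
    lambda e: e["lemma_prev"] == "money",         # lexical feature
]
train = [
    ({"pos_next": "NOUN", "hypernyms": ["bank_river"], "lemma_prev": "the"}, "shore"),
    ({"pos_next": "VERB", "hypernyms": ["bank_finance"], "lemma_prev": "money"}, "institution"),
]
X = [to_vector(e, features) for e, _ in train]
y = [sense for _, sense in train]
clf = SVC(kernel="linear").fit(X, y)
print(clf.predict([to_vector(train[0][0], features)]))
```

The incremental variant discussed in the paper would grow the feature list over several rounds rather than fixing it in advance.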
Abstract:
Condition monitoring of wooden railway sleepers is generally carried out by visual inspection and, if necessary, some impact acoustic examination is carried out intuitively by skilled personnel. In this work, a pattern recognition solution has been proposed to automate the process and achieve robust results. The study presents a comparison of several pattern recognition techniques together with various non-stationary feature extraction techniques for classification of impact acoustic emissions. Pattern classifiers such as the multilayer perceptron, learning vector quantization and Gaussian mixture models are combined with non-stationary feature extraction techniques such as the Short-Time Fourier Transform, Continuous Wavelet Transform, Discrete Wavelet Transform and Wigner-Ville Distribution. Given the presence of several different feature extraction and classification techniques, data fusion has been investigated. Data fusion in the current case has mainly been investigated at two levels, the feature level and the classifier level. Fusion at the feature level demonstrated the best results, with an overall accuracy of 82% when compared to the human operator.
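As a schematic illustration of feature-level fusion, the sketch below concatenates features from a short-time Fourier transform with a crude second feature set (a stand-in for a wavelet-based one) before classification with a small multilayer perceptron. The synthetic "impact" signals and all parameter choices are illustrative assumptions, not the study's data or configuration.

```python
import numpy as np
from scipy.signal import stft
from sklearn.neural_network import MLPClassifier

def fused_features(signal, fs=1000):
    """Concatenate two feature sets from the same signal (feature-level fusion)."""
    _, _, Z = stft(signal, fs=fs, nperseg=64)
    stft_feats = np.abs(Z).mean(axis=1)          # average spectrum over time
    coarse = signal.reshape(-1, 8).mean(axis=1)  # crude stand-in for a second (e.g. wavelet) feature set
    return np.concatenate([stft_feats, coarse])

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1024, endpoint=False)
good = [np.sin(2 * np.pi * 50 * t) + 0.1 * rng.normal(size=t.size) for _ in range(10)]
bad  = [np.sin(2 * np.pi * 120 * t) + 0.1 * rng.normal(size=t.size) for _ in range(10)]
X = np.array([fused_features(s) for s in good + bad])
y = [0] * 10 + [1] * 10                          # 0 = healthy sleeper, 1 = degraded (synthetic labels)
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X, y)
print(clf.score(X, y))
```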