12 results for Visual Odometry, Transformer, Deep learning
in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Abstract:
Convolutional Neural Networks (CNNs) have become the state-of-the-art methods for many large-scale visual recognition tasks. For many practical applications, CNN architectures have a restrictive requirement: a huge amount of labeled data is needed for training. The idea of generative pretraining is to obtain initial weights of the network by training the network in a completely unsupervised way and then fine-tune the weights for the task at hand using supervised learning. In this thesis, a general introduction to Deep Neural Networks and algorithms is given, and these methods are applied to classification tasks of handwritten digits and natural images for developing unsupervised feature learning. The goal of this thesis is to find out if the effect of pretraining is damped by recent practical advances in optimization and regularization of CNNs. The experimental results show that pretraining is still a substantial regularizer, but not a necessary step in training Convolutional Neural Networks with rectified activations. On handwritten digits, the proposed pretraining model achieved a classification accuracy comparable to the state-of-the-art methods.
Abstract:
A new area of machine learning research called deep learning has moved machine learning closer to one of its original goals: artificial intelligence and a general learning algorithm. The key idea is to pretrain models in a completely unsupervised way and then fine-tune them for the task at hand using supervised learning. In this thesis, a general introduction to deep learning models and algorithms is given, and these methods are applied to facial keypoint detection. The task is to predict the positions of 15 keypoints on grayscale face images. Each predicted keypoint is specified by an (x,y) real-valued pair in the space of pixel indices. In the experiments, we pretrained deep belief networks (DBNs) and then performed discriminative fine-tuning. We varied the depth and size of the architecture. We tested both deterministic and sampled hidden activations and the effect of additional unlabeled data on pretraining. The experimental results show that our model provides better results than publicly available benchmarks for the dataset.
Abstract:
In this thesis, we propose to infer pixel-level labelling in video by utilising only object category information, exploiting the intrinsic structure of video data. Our motivation is the observation that image-level labels are much easier to acquire than pixel-level labels, and it is natural to find a link between image-level recognition and pixel-level classification in video data, which would transfer learned recognition models from one domain to the other. To this end, this thesis proposes two domain adaptation approaches to adapt the deep convolutional neural network (CNN) image recognition model trained from labelled image data to the target domain, exploiting both semantic evidence learned from the CNN and the intrinsic structures of unlabelled video data. Our proposed approaches explicitly model and compensate for the domain adaptation from the source domain to the target domain, which in turn underpins a robust semantic object segmentation method for natural videos. We demonstrate the superior performance of our methods by presenting extensive evaluations on challenging datasets in comparison with the state-of-the-art methods.
Abstract:
In this work, image based estimation methods, also known as direct methods, are studied which avoid feature extraction and matching completely. Cost functions use raw pixels as measurements and the goal is to produce precise 3D pose and structure estimates. The cost functions presented minimize the sensor error, because measurements are not transformed or modified. In photometric camera pose estimation, 3D rotation and translation parameters are estimated by minimizing a sequence of image based cost functions, which are non-linear due to perspective projection and lens distortion. In image based structure refinement, on the other hand, 3D structure is refined using a number of additional views and an image based cost metric. Image based estimation methods are particularly useful in conditions where the Lambertian assumption holds, and the 3D points have constant color regardless of viewing angle. The goal is to improve image based estimation methods, and to produce computationally efficient methods which can be accommodated into real-time applications. The developed image-based 3D pose and structure estimation methods are finally demonstrated in practice in indoor 3D reconstruction use, and in a live augmented reality application.
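As an illustration of the kind of image based cost function described in this abstract, the following minimal sketch evaluates a photometric cost for one candidate camera pose. It is not the thesis's implementation: the function names (`bilinear`, `photometric_cost`), the plain pinhole model without lens distortion, and the toy data are all assumptions made for this example, and the minimization itself (for instance by Gauss-Newton iterations) is omitted.

```python
def bilinear(img, u, v):
    """Sample a row-major grayscale image (at least 2x2) at subpixel (u, v).

    Returns None when the position falls outside the image.
    """
    h, w = len(img), len(img[0])
    if not (0.0 <= u <= w - 1 and 0.0 <= v <= h - 1):
        return None
    # Clamp the base pixel so the 2x2 neighbourhood stays inside the image.
    u0, v0 = min(int(u), w - 2), min(int(v), h - 2)
    du, dv = u - u0, v - v0
    top = (1 - du) * img[v0][u0] + du * img[v0][u0 + 1]
    bottom = (1 - du) * img[v0 + 1][u0] + du * img[v0 + 1][u0 + 1]
    return (1 - dv) * top + dv * bottom


def photometric_cost(points, intensities, img, R, t, fx, fy, cx, cy):
    """Sum of squared intensity residuals for the pose (R, t).

    points      -- 3D points in the reference frame, as (X, Y, Z) tuples
    intensities -- reference intensity of each point (constant under the
                   Lambertian assumption)
    R, t        -- 3x3 rotation (nested lists) and translation of the pose
    fx, fy, cx, cy -- assumed pinhole intrinsics; lens distortion is ignored
    """
    cost = 0.0
    for (X, Y, Z), i_ref in zip(points, intensities):
        # Rigid transform into the current camera frame.
        x = R[0][0] * X + R[0][1] * Y + R[0][2] * Z + t[0]
        y = R[1][0] * X + R[1][1] * Y + R[1][2] * Z + t[1]
        z = R[2][0] * X + R[2][1] * Y + R[2][2] * Z + t[2]
        if z <= 0.0:
            continue  # point is behind the camera
        # Perspective projection: the source of the non-linearity.
        u = fx * x / z + cx
        v = fy * y / z + cy
        i_cur = bilinear(img, u, v)
        if i_cur is None:
            continue  # projection falls outside the image
        # Raw pixel values are compared directly; no features are extracted.
        cost += (i_cur - i_ref) ** 2
    return cost


# Toy data: an intensity ramp, which bilinear sampling reproduces exactly.
image = [[u + 10.0 * v for u in range(5)] for v in range(5)]
points = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0), (0.0, 0.1, 2.0)]
reference = [22.0, 23.0, 27.0]  # ramp values at the identity-pose projections
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
intrinsics = dict(fx=10.0, fy=10.0, cx=2.0, cy=2.0)

cost_true = photometric_cost(points, reference, image, identity,
                             (0.0, 0.0, 0.0), **intrinsics)
cost_off = photometric_cost(points, reference, image, identity,
                            (0.05, 0.0, 0.0), **intrinsics)
```

In this toy setup the cost vanishes at the pose that generated the reference intensities and grows when the translation is perturbed, which is the property a pose optimizer would exploit.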
Abstract:
Learning from demonstration is becoming increasingly popular as an efficient way of robot programming. The inspiration is not only scientific interest but also the possibility of producing machines that would find application in different areas of life: robots helping with daily routines at home, high-performance automata in industry or friendly toys for children. One way to teach a robot to fulfill complex tasks is to start with simple training exercises and combine them to form more difficult behavior. The objective of the Master’s thesis work was to study robot programming with visual input. Dynamic movement primitives (DMPs) were chosen as a tool for motion learning and generation. Treating a movement as a spring system driven by an external force, DMPs represent the motion as a set of non-linear differential equations. During the experiments the properties of DMPs, such as temporal and spatial invariance, were examined. The effect of the DMP parameters, including spring coefficient, damping factor and temporal scaling, on the generated trajectory was studied.
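The spring-system view of a DMP can be sketched in a few lines. The example below assumes a commonly used one-dimensional formulation (a spring-damper pulled toward the goal, plus a forcing term gated by a decaying phase variable); the abstract does not give the thesis's exact equations, so the function name `rollout_dmp`, the parameter names and the default values are hypothetical.

```python
import math


def rollout_dmp(x0, g, K=100.0, D=None, tau=1.0, alpha=4.0,
                forcing=None, dt=0.001, duration=1.5):
    """Integrate a one-dimensional DMP with explicit Euler steps.

    Assumed dynamics (one common formulation):
        tau * dv/dt = K * (g - x) - D * v + (g - x0) * s * forcing(s)
        tau * dx/dt = v
        tau * ds/dt = -alpha * s
    K is the spring coefficient, D the damping factor and tau the temporal
    scaling. The rollout lasts duration * tau seconds.
    """
    if D is None:
        D = 2.0 * math.sqrt(K)  # critical damping: no overshoot
    if forcing is None:
        forcing = lambda s: 0.0  # pure spring-damper, no learned shape
    x, v, s = x0, 0.0, 1.0
    trajectory = [x]
    steps = int(round(duration * tau / dt))
    for _ in range(steps):
        f = (g - x0) * s * forcing(s)  # external force, fades as s -> 0
        dv = (K * (g - x) - D * v + f) / tau
        dx = v / tau
        ds = -alpha * s / tau
        v += dv * dt
        x += dx * dt
        s += ds * dt
        trajectory.append(x)
    return trajectory


base = rollout_dmp(x0=0.0, g=1.0)
slow = rollout_dmp(x0=0.0, g=1.0, tau=2.0)  # same shape, twice as long
scaled = rollout_dmp(x0=0.0, g=2.0)         # same shape, twice the amplitude
```

Comparing `slow` and `scaled` against `base` shows the two invariances in this sketch: changing `tau` stretches the motion in time, and changing the goal rescales it in space, because every force term is proportional to a distance to the goal.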
Abstract:
The human language-learning ability persists throughout life, indicating considerable flexibility at the cognitive and neural level. This ability spans from expanding the vocabulary in the mother tongue to acquisition of a new language with its lexicon and grammar. The present thesis consists of five studies that tap both of these aspects of adult language learning by using magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI) during language processing and language learning tasks. The thesis shows that learning novel phonological word forms, either in the native tongue or when exposed to a foreign phonology, activates the brain in similar ways. The results also show that novel native words readily become integrated in the mental lexicon. Several studies in the thesis highlight the left temporal cortex as an important brain region in learning and accessing phonological forms. Incidental learning of foreign phonological word forms was reflected in functionally distinct temporal lobe areas that, respectively, reflected short-term memory processes and more stable learning that persisted to the next day. In a study where explicitly trained items were tracked for ten months, it was found that enhanced naming-related temporal and frontal activation one week after learning was predictive of good long-term memory. The results suggest that memory maintenance is an active process that depends on mechanisms of reconsolidation, and that these processes vary considerably between individuals. The thesis puts special emphasis on studying language learning in the context of language production. The neural foundation of language production has been studied considerably less than that of perceptive language, especially on the sentence level. A well-known paradigm in language production studies is picture naming, also used as a clinical tool in neuropsychology.
This thesis shows that accessing the meaning and phonological form of a depicted object are subserved by different neural implementations. Moreover, a comparison between action and object naming from identical images indicated that the grammatical class of the retrieved word (verb, noun) is less important than the visual content of the image. In the present thesis, the picture naming was further modified into a novel paradigm in order to probe sentence-level speech production in a newly learned miniature language. Neural activity related to grammatical processing did not differ between the novel language and the mother tongue, but stronger neural activation for the novel language was observed during the planning of the upcoming output, likely related to more demanding lexical retrieval and short-term memory. In sum, the thesis aimed at examining language learning by combining different linguistic domains, such as phonology, semantics, and grammar, in a dynamic description of language processing in the human brain.
Abstract:
This thesis was part of a lean adaptation project started at the Outotec Lappeenranta factory in early 2013. The purpose of this thesis was to develop and propose lean tools that could be used in daily management, visual management and continuous improvement. This thesis took an “outsider’s” view and, as such, did not study the current processes deeply. As a result of this thesis, two different Daily Management boards were designed, one for parallel processes and one for sequential processes. In addition, methods for continuous improvement and daily task accountability were framed and standard work for the leaders was outlined. The tools presented in this thesis are general tools which support work in a lean environment. They are visual and, if used correctly, they provide a basis from which continuous improvement can be done. Lean philosophy emphasizes a deep understanding of the current situation, and it would be against the lean principles to blindly implement anything developed “on the outside”. The tools presented should be reviewed and modified further by the people working on the factory floor.
Abstract:
Workshop at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
The number of persons with visual impairment in Tanzania is estimated at over 1.6 million. About half a million of these persons are children aged 7-13. Only about 1% of these children are enrolled in schools. The special schools and units are too few and in most cases they are far away from the children’s homes. More and more regular schools are enrolling children with visual impairment, but the schools lack financial resources, tactile teaching materials and trained special education teachers. Children with visual impairment enrolled in regular schools seldom get enough support and often fail in examinations. The general aim of this study was to contribute to increased knowledge and understanding about how teachers can change their teaching practices and thus facilitate the learning of children with visual impairment included in regular classrooms as they participate in an action research project. The project was conducted in a primary school in a poor rural region with a high frequency of blindness and visual impairment. The school was poorly resourced and the average number of pupils per class was 90. The teachers who participated in the collaborative action research project were the 14 teachers who taught blind or visually impaired pupils in grades 4 and 6, in total 6 pupils. The action research project was conducted during a period of 6 months and was carried out in five cycles. The teachers were actively involved in all the project activities: identifying challenges, planning solutions, producing teaching materials, reflecting on outcomes, collaborating and evaluating. Empirical data was collected with questionnaires, interviews, observations and focus group discussions. The findings of the study show that the teachers managed to change their teaching practices through systematic reflection, analysis and collaboration. The teachers produced a variety of tactile teaching materials, which facilitated the learning of the pupils with visual impairment.
The pupils learned better and felt more included in the regular classes. The teachers gained new knowledge and skills. They grew professionally and started to collaborate with each other. The study contributes to new knowledge of how collaborative action research can be conducted in the area of special education in a Tanzanian school context. The study is also relevant to the planning of school-based professional development programs and teacher education programs in Tanzania and in other low-income countries. The results also point to strategies which can promote inclusion of children with disabilities in regular schools.
Abstract:
This thesis investigates the matter of race in the context of Finnish language acquisition among adult migrants in Finland. Here matter denotes both the materiality of race and how race comes to matter. Drawing primarily on an auto/ethno/graphic account of learning the Finnish language as a participant in the Finnish for foreigners classes, this thesis problematises the ontology and epistemology of race, i.e., what race is, how it is known, and what an engagement with race entails. Taking cues from the bodily practices of learning the Finnish trill or the rolling r, this study proposes a notion of “trilling race” and argues for an onto-epistemological dis/continuity that marks race’s arrival. The notion of dis/continuity reworks the distinction between continuity and discontinuity, and asks about the how of the arrival of any identity, the where, and the when. In so doing, an analysis of “trilling race” engages with one of the major problematics that has exercised much critical attention, namely: how to read race differently. That is, to rethink the conundrum of the need to counter “representational weight” (Puar 2007, 191) of race on the one hand, and to account for the racialised lived realities on the other. The link between a study of the phenomenon of host country language acquisition and an examination of the question of race is not as obvious as it might seem. For example, what does the argument that the process of language learning is racialised actually imply? Does it mean that race, as a process of racialisation or an ongoing configuration of sets of power relations, exerts force from an outside on the otherwise neutral process of learning the host country language? Or does it mean that race, as an identity category, presents as among the analytical perspectives, along with gender and class for instance, of the phenomenon of host country language acquisition? 
With these questions in mind, and to foreground the examination of the question of race in the context of Finnish language acquisition among adult migrants, this thesis opens with a discussion of the art installation Finnexia by Lisa Erdman. Finnexia is a fictitious drug said to facilitate Finnish language learning through accelerating the cognitive learning process and reducing the anxiety of speaking the Finnish language. Not only does the Finnexia installation make visible the ways in which the lack of skill in Finnish is figured as the threshold – a border that separates the inside from the outside – to integration, but also, and importantly, it raises questions about the nature of difference, and the process of differentiation that separates the individual from the social, fact from fiction, nature from culture. These puzzles animate much of the analysis in this dissertation. These concerns continue to be addressed in the rest of part one. Whereas chapter two offers a reconsideration of the ambiguities of ethnisme/ethnicity and race, chapter three dilates on the methodological implications of a conception of the dis/continuity of race. Part two focuses on the matter of race and examines the political economy of visual-aural encounters, whereas part three shifts the focus and rethinks the possibilities and limitations of transforming racialised and normative constraints. Taking up these particular problematics, this thesis as a whole argues that race trills itself: its identity/difference is simultaneously made possible and impossible.
Abstract:
This study aims to extend prior knowledge on the learning and developmental outcomes of David Kolb's experiential learning cycle by analysing its practical realization at Team Academy. The study is based on the constructivist approach to learning and considers, among others, the concepts of autonomy support, Nonaka and Takeuchi's knowledge creation model, Luft and Ingham's Johari Window and Deci and Ryan's Self-determination theory. For the investigation, in-depth interviews were carried out with the participants of Team Academy, both learners and coaches. Taking the interview results and the above-described theories into consideration, this study concludes that experiential learning results not only in effective learning, but also in remarkable soft skill acquisition, self-development and an increase in motivation with an internal locus of causality. Real-life projects permit the learners to experience real challenges. Through the practical activities and teamwork they also get the opportunity to discover their personal strengths, weaknesses and unique capacities.
Abstract:
The general aim of the thesis was to study university students’ learning from the perspective of regulation of learning and text processing. The data were collected from the two academic disciplines of medical and teacher education, which share the features of highly scheduled study, a multidisciplinary character, a complex relationship between theory and practice and a professional nature. Contemporary information society poses new challenges for learning, as it is not possible to learn all the information needed in a profession during a study programme. Therefore, it is increasingly important to learn how to think and learn independently, how to recognise gaps in and update one’s knowledge and how to deal with the huge amount of constantly changing information. In other words, it is critical to regulate one’s learning and to process text effectively. The thesis comprises five sub-studies that employed cross-sectional, longitudinal and experimental designs and multiple methods, from surveys to eye tracking. Study I examined the connections between students’ study orientations and the ways they regulate their learning. In total, 410 second-, fourth- and sixth-year medical students from two Finnish medical schools participated in the study by completing a questionnaire measuring both general study orientations and regulation strategies. The students were generally deeply oriented towards their studies. However, they regulated their studying externally. Several interesting and theoretically reasonable connections between the variables were found. For instance, self-regulation was positively correlated with deep orientation and achievement orientation and was negatively correlated with non-commitment. However, external regulation was likewise positively correlated with deep orientation and achievement orientation but also with surface orientation and systematic orientation. 
It is argued that external regulation might function as an effective coping strategy in the cognitively loaded medical curriculum. Study II focused on medical students’ regulation of learning and their conceptions of the learning environment in an innovative medical course where traditional lectures were combined with problem-based learning (PBL) group work. First-year medical and dental students (N = 153) completed a questionnaire assessing their regulation strategies of learning and views about the PBL group work. The results indicated that external regulation and self-regulation of the learning content were the most typical regulation strategies among the participants. In line with previous studies, self-regulation was connected with study success. Strictly organised PBL sessions were not considered as useful as lectures, although the students’ views of the teacher/tutor and the group were mainly positive. Therefore, developers of teaching methods are challenged to think of new solutions that facilitate reflection of one’s learning and that improve the development of self-regulation. In Study III, a person-centred approach to studying regulation strategies was employed, in contrast to the traditional variable-centred approach used in Study I and Study II. The aim of Study III was to identify different regulation strategy profiles among medical students (N = 162) across time and to examine to what extent these profiles predict study success in preclinical studies. Four regulation strategy profiles were identified, and connections with study success were found. Students with the lowest self-regulation and with an increasing lack of regulation performed worse than the other groups. As the person-centred approach enables us to individualise students with diverse regulation patterns, it could be used in supporting student learning and in facilitating the early diagnosis of learning difficulties.
In Study IV, 91 student teachers participated in a pre-test/post-test design where they answered open-ended questions about a complex science concept both before and after reading either a traditional, expository science text or a refutational text that prompted the reader to change his/her beliefs according to scientific beliefs about the phenomenon. The student teachers completed a questionnaire concerning their regulation and processing strategies. The results showed that the students’ understanding improved after the text reading intervention and that the refutational text promoted understanding better than the traditional text. Additionally, regulation and processing strategies were found to be connected with understanding the science phenomenon. A weak trend showed that weaker learners would benefit more from the refutational text. It seems that learners with effective learning strategies are able to pick out the relevant content regardless of the text type, whereas weaker learners might benefit from refutational parts that contrast the most typical misconceptions with scientific views. The purpose of Study V was to use eye tracking to determine how third-year medical students (n = 39) and internal medicine residents (n = 13) read and solve patient case texts. The results revealed differences between medical students and residents in processing patient case texts; compared to the students, the residents were more accurate in their diagnoses and processed the texts significantly faster and with a lower number of fixations. Different reading patterns were also found. The observed differences between medical students and residents in processing patient case texts could be used in medical education to model expert reasoning and to teach how a good medical text should be constructed.
The main findings of the thesis indicate that even among very selected student populations, such as high-achieving medical students or student teachers, there seems to be a lot of variation in regulation strategies of learning and text processing. As these learning strategies are related to successful studying, students enter educational programmes with rather different chances of managing and achieving success. Further, the ways of engaging in learning seldom centre on a single strategy or approach; rather, students seem to combine several strategies to a certain degree. Sometimes, it can be a matter of perspective of which way of learning can be considered best; therefore, the reality of studying in higher education is often more complicated than the simplistic view of self-regulation as a good quality and external regulation as a harmful quality. The beginning of university studies may be stressful for many, as the gap between high school and university studies is huge and those strategies that were adequate during high school might not work as well in higher education. Therefore, it is important to map students’ learning strategies and to encourage them to engage in using high-quality learning strategies from the beginning. Instead of separate courses on learning skills, the integration of these skills into course contents should be considered. Furthermore, learning complex scientific phenomena could be facilitated by paying attention to high-quality learning materials and texts and other support from the learning environment also in the university. Eye tracking seems to have great potential in evaluating performance and growing diagnostic expertise in text processing, although more research using texts as stimulus is needed. 
Both medical and teacher education programmes and the professions themselves are challenging in terms of their multidisciplinary nature and increasing amounts of information and therefore require good lifelong learning skills during the study period and later in work life.