11 results for Visual identification tasks
in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Abstract:
Convolutional Neural Networks (CNNs) have become the state-of-the-art methods for many large-scale visual recognition tasks. For many practical applications, CNN architectures have a restrictive requirement: a huge amount of labeled data is needed for training. The idea of generative pretraining is to obtain initial weights of the network by training it in a completely unsupervised way and then fine-tune the weights for the task at hand using supervised learning. In this thesis, a general introduction to Deep Neural Networks and algorithms is given, and these methods are applied to classification tasks on handwritten digits and natural images in order to develop unsupervised feature learning. The goal of this thesis is to find out whether the effect of pretraining is dampened by recent practical advances in optimization and regularization of CNNs. The experimental results show that pretraining is still a substantial regularizer but not a necessary step in training Convolutional Neural Networks with rectified activations. On handwritten digits, the proposed pretraining model achieved a classification accuracy comparable to the state-of-the-art methods.
Abstract:
This bachelor's thesis was carried out as part of the PulpVision research project, which aims to develop image-based computation and classification methods for pulp quality control in papermaking. A method for finding curvilinear structures in images had been developed earlier in the project and was used to find fibers in images. That method served as the starting point for this thesis. The aim of the work was to study whether the species of the fibers in an image can be identified using features computed from different fiber images. The fiber images contained fibers from four tree species and one other plant: acacia, birch, pine, eucalyptus and wheat. For each species, 100 fiber images were selected and divided into two groups, the first used as a training set and the second as a test set. Using the training set, descriptive features were computed for each fiber species, and these features were then used to try to identify the fiber species in the test-set images. The images were produced by CEMIS-Oulu (Center for Measurement and Information Systems), a unit of the University of Oulu specializing in measurement technology. For each individual training image, the means and standard deviations of three features (length, width and curvature) were computed, along with the number of fibers found in the image. Identification accuracy was tested on the test-set images with different combinations of these features, using the k-nearest-neighbor method and a naive Bayes classifier. The tests gave promising results: for example, using the means of length and width, an accuracy of up to about 98% was achieved with both algorithms. Mean fiber length appeared to be the feature that best characterized the fiber images. There was no large difference in accuracy between the algorithms used. Based on the test results, identifying fiber images is feasible.
According to the tests, only two features need to be computed from the fiber images to identify the fibers accurately. The classification algorithms used were very simple, but they performed well in the tests.
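As a rough illustration of the classification step described above, the following sketch applies a k-nearest-neighbor vote to two features per image (mean fiber length and mean fiber width). Everything numeric here is an assumption made for the example: the species centres, the noise level, the equal 50/50 split and k = 3 are hypothetical, and the data is synthetic rather than taken from the fiber images.

```python
import math
import random
from collections import Counter

# Hypothetical, unitless (mean length, mean width) centres per species.
# These are synthetic values for illustration, not measured fiber data.
CENTRES = {
    "acacia": (1.0, 1.0),
    "birch": (2.0, 2.0),
    "eucalyptus": (1.0, 3.0),
    "pine": (3.0, 1.0),
    "wheat": (3.0, 3.0),
}


def make_images(rng, per_species):
    """Return synthetic (features, label) pairs, one pair per fiber image."""
    return [
        ((rng.gauss(cx, 0.05), rng.gauss(cy, 0.05)), species)
        for species, (cx, cy) in CENTRES.items()
        for _ in range(per_species)
    ]


def knn_predict(train, query, k=3):
    """Majority vote among the k training images closest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


rng = random.Random(0)
train = make_images(rng, 50)  # assumed equal split of 100 images per species
test = make_images(rng, 50)

correct = sum(knn_predict(train, feats) == label for feats, label in test)
accuracy = correct / len(test)
print(f"k-NN accuracy on synthetic data: {accuracy:.2f}")
```

With real measurements the two features would normally be rescaled to comparable ranges before computing distances, since length and width are unlikely to share a scale.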
Abstract:
This thesis investigates the strategy implementation process of enterprises, a process which has lacked academic attention compared with the rich research tradition on strategy formation. Strategy implementation is viewed as a process ensuring that the strategies of an organisation are realised fully and quickly, yet with constant consideration of changing circumstances. The aim of this study is to provide a framework for identifying, analysing and removing the strategy implementation bottleneck of an organisation and thus for intensifying its strategy process. The study is opened by specifying the concept, tasks and key actors of the strategy implementation process; in particular, arguments for the critical implementation role of top management are provided. In order to facilitate the analysis and synthesis of the core findings of a scattered doctrine, six characteristic approaches to the strategy implementation phenomenon are identified and compared. The Bottleneck Framework is introduced as an instrument for arranging potential strategy realisation problems, prioritising an organisation's implementation obstacles and focusing the improvement measures accordingly. The SUCCESS Framework is introduced as a mnemonic of the seven critical factors to be taken into account when promoting strategy implementation. Both frameworks are empirically tested by applying them to a real strategy implementation intensification process in an international, industrial, group-structured case enterprise.
Abstract:
Learning from demonstration is becoming increasingly popular as an efficient way of programming robots. The inspiration is not only scientific interest but also the possibility of producing machines that find application in different areas of life: robots helping with daily routines at home, high-performance automata in industry, or friendly toys for children. One way to teach a robot to fulfill complex tasks is to start with simple training exercises and combine them to form more difficult behavior. The objective of this Master’s thesis was to study robot programming with visual input. Dynamic movement primitives (DMPs) were chosen as a tool for motion learning and generation. Treating a movement as a spring system driven by an external force, DMPs represent the motion as a set of non-linear differential equations. During the experiments, the properties of DMPs, such as temporal and spatial invariance, were examined. The effect of the DMP parameters, including the spring coefficient, damping factor and temporal scaling, on the generated trajectory was studied.
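To make the spring-system picture concrete, here is a minimal one-dimensional sketch under assumed equations: tau * dv/dt = K (g - x) - D v + (g - x0) f(s), tau * dx/dt = v, and a decaying phase tau * ds/dt = -alpha s. This is one common way of writing a DMP and may differ from the formulation used in the thesis. The parameter values, the `bump` forcing function and the plain Euler integration are all hypothetical choices for illustration; in practice the forcing term is learned from a demonstration.

```python
def bump(s):
    """Hypothetical forcing term; it vanishes as the phase s decays to 0."""
    return 50.0 * s * (1.0 - s)


def rollout(x0, g, tau=1.0, K=100.0, D=20.0, alpha=4.0,
            forcing=None, dt=0.001, steps=2000):
    """Euler-integrate a 1-D spring-damper DMP from x0 towards goal g.

    K is the spring coefficient, D the damping factor and tau the
    temporal scaling. Returns the list of positions, one per step.
    """
    x, v, s = x0, 0.0, 1.0
    trajectory = [x]
    for _ in range(steps):
        f = forcing(s) * (g - x0) if forcing else 0.0
        dv = (K * (g - x) - D * v + f) / tau
        dx = v / tau
        ds = -alpha * s / tau
        v += dv * dt
        x += dx * dt
        s += ds * dt
        trajectory.append(x)
    return trajectory


# Reference movement from 0 to 1 lasting 2 seconds.
base = rollout(x0=0.0, g=1.0, forcing=bump)
# Temporal scaling: tau = 2 stretches the same shape over 4 seconds.
slow = rollout(x0=0.0, g=1.0, tau=2.0, dt=0.002, forcing=bump)
# Spatial scaling: doubling the goal doubles every point of the path.
scaled = rollout(x0=0.0, g=2.0, forcing=bump)

print(f"final position: {base[-1]:.4f}")
```

In this sketch, changing `tau` only changes how fast the same path is traversed, and changing `g` rescales the path, which is what temporal and spatial invariance refer to. Raising `K` without raising `D` makes the system overshoot the goal; D = 2 * sqrt(K), as used here, gives critical damping.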
Abstract:
Local features are used in many computer vision tasks, including visual object categorization, content-based image retrieval and object recognition, to name a few. Local features are points, blobs or regions in images that are extracted using a local feature detector. To make use of the extracted local features, the localized interest points are described using a local feature descriptor. A descriptor histogram vector is a compact representation of an image and can be used for searching and matching images in databases. In this thesis, the performance of local feature detectors and descriptors is evaluated for the object class detection task. Features are extracted from image samples belonging to several object classes. Matching features are then searched for using random image pairs of the same class. The goal of this thesis is to find out which detector and descriptor methods are best for this task in terms of detector repeatability and descriptor matching rate.
Abstract:
Visual data mining (VDM) tools employ information visualization techniques in order to represent large amounts of high-dimensional data graphically and to involve the user in exploring data at different levels of detail. The users are looking for outliers, patterns and models (in the form of clusters, classes, trends, and relationships) in different categories of data, e.g., financial and business information. The focus of this thesis is the evaluation of multidimensional visualization techniques, especially from the business user’s perspective. We address three research problems. The first problem is the evaluation of projection-based visualizations with respect to their effectiveness in preserving the original distances between data points and the clustering structure of the data. In this respect, we propose the use of existing clustering validity measures. We illustrate their usefulness in evaluating five visualization techniques: Principal Components Analysis (PCA), Sammon’s Mapping, Self-Organizing Map (SOM), Radial Coordinate Visualization and Star Coordinates. The second problem is concerned with evaluating different visualization techniques as to their effectiveness in visual data mining of business data. For this purpose, we propose an inquiry evaluation technique and conduct the evaluation of nine visualization techniques. The visualizations under evaluation are Multiple Line Graphs, Permutation Matrix, Survey Plot, Scatter Plot Matrix, Parallel Coordinates, Treemap, PCA, Sammon’s Mapping and the SOM. The third problem is the evaluation of quality of use of VDM tools. We provide a conceptual framework for evaluating the quality of use of VDM tools and apply it to the evaluation of the SOM. In the evaluation, we use an inquiry technique for which we developed a questionnaire based on the proposed framework. The contributions of the thesis consist of three new evaluation techniques and the results obtained by applying these evaluation techniques.
The thesis provides a systematic approach to evaluation of various visualization techniques. In this respect, first, we performed and described the evaluations in a systematic way, highlighting the evaluation activities, and their inputs and outputs. Secondly, we integrated the evaluation studies in the broad framework of usability evaluation. The results of the evaluations are intended to help developers and researchers of visualization systems to select appropriate visualization techniques in specific situations. The results of the evaluations also contribute to the understanding of the strengths and limitations of the visualization techniques evaluated and further to the improvement of these techniques.
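The first research problem concerns how well a projection preserves the original distances between data points. As a small illustration of what such a check can look like, the sketch below computes Sammon's stress, a standard distance-preservation measure. It is not one of the clustering validity measures proposed in the thesis, and the three sample points are made up for the example.

```python
import math
from itertools import combinations


def sammon_stress(high, low):
    """Sammon's stress between original points and their projections.

    0 means every pairwise distance is preserved exactly; larger values
    mean the projection distorts distances more.
    """
    weighted_error = 0.0
    total_distance = 0.0
    for i, j in combinations(range(len(high)), 2):
        d_high = math.dist(high[i], high[j])
        d_low = math.dist(low[i], low[j])
        if d_high == 0.0:
            continue  # duplicate points carry no distance information
        weighted_error += (d_high - d_low) ** 2 / d_high
        total_distance += d_high
    return weighted_error / total_distance


# Three made-up collinear points in 2-D.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
# Projection along the line keeps all distances.
along_line = [(0.0,), (5.0,), (10.0,)]
# Projection onto the x-axis shortens every distance.
on_x_axis = [(0.0,), (3.0,), (6.0,)]

stress_line = sammon_stress(points, along_line)
stress_x = sammon_stress(points, on_x_axis)
print(stress_line, stress_x)
```

Because each squared error is divided by the original distance, the measure penalizes distortion of small distances more heavily than distortion of large ones.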
Abstract:
Centrifugal pumps are a notable end-consumer of electrical energy. A typical application of a centrifugal pump is the filling or emptying of a reservoir tank, where the pump is often operated at a constant speed until the process is completed. Installing a frequency converter to control the motor replaces the traditional fixed-speed pumping system, allows the rotational speed profile to be optimized for the pumping task, and enables the rotational speed and shaft torque of an induction motor to be estimated without any additional measurements from the motor shaft. Variable-speed operation makes it possible to decrease the overall energy consumption of the pumping task. The static head of the pumping process may change during the pumping task. In such systems, the minimum rotational speed changes during reservoir filling or emptying, and the minimum energy consumption cannot be achieved with a fixed rotational speed. This thesis presents embedded algorithms to automatically identify, optimize and monitor pumping processes between supply and destination reservoirs, and evaluates the optimization method based on the changing static head.
Abstract:
This dissertation examined skill development in music reading by focusing on the visual processing of music notation in different music-reading tasks. Each of the three experiments of this dissertation addressed one of the three types of music reading: (i) sight-reading, i.e. reading and performing completely unknown music, (ii) rehearsed reading, during which the performer is already familiar with the music being played, and (iii) silent reading with no performance requirements. The use of the eye-tracking methodology allowed the readers’ eye movements during music reading to be recorded with extreme precision. Due to the lack of coherence in the small number of prior studies on eye movements in music reading, the dissertation also had a heavy methodological emphasis. The present dissertation thus had two major aims: (1) to investigate the eye-movement indicators of skill and skill development in sight-reading, rehearsed reading and silent reading, and (2) to develop and test suitable methods that can be used by future studies on the topic. Experiment I focused on the eye-movement behaviour of adults during their first steps of learning to read music notation. The longitudinal experiment spanned a nine-month-long music-training period, during which 49 participants (university students taking part in a compulsory music course) sight-read and performed a series of simple melodies in three measurement sessions. Participants with no musical background were labelled “novices”, whereas “amateurs” had had musical training prior to the experiment. The main issue of interest was the changes in the novices’ eye movements and performances across the measurements, while the amateurs offered a point of reference for the assessment of the novices’ development. The experiment showed that the novices tended to sight-read in a more stepwise fashion than the amateurs, the latter group manifesting more back-and-forth eye movements.
The novices’ skill development was reflected by the faster identification of note symbols involved in larger melodic intervals. Across the measurements, the novices also began to show sensitivity to the melodies’ metrical structure, which the amateurs demonstrated from the very beginning. The stimulus melodies consisted of quarter notes, making the effects of meter and larger melodic intervals distinguishable from effects caused by, say, different rhythmic patterns. Experiment II explored the eye movements of 40 experienced musicians (music education students and music performance students) during temporally controlled rehearsed reading. This cross-sectional experiment focused on the eye-movement effects of one-bar-long melodic alterations placed within a familiar melody. The synchronizing of the performance and eye-movement recordings enabled the investigation of the eye-hand span, i.e., the temporal gap between a performed note and the point of gaze. The eye-hand span was typically found to remain around one second. Music performance students demonstrated increased processing efficiency through their shorter average fixation durations as well as in the two examined eye-hand span measures: these participants used larger eye-hand spans more frequently and inspected more of the musical score during the performance of one metrical beat than students of music education. Although all participants produced performances almost indistinguishable in terms of their auditory characteristics, the altered bars indeed affected the reading of the score: the general effects of expertise in terms of the two eye-hand span measures, demonstrated by the music performance students, disappeared in the face of the melodic alterations. Experiment III was a longitudinal experiment designed to examine the differences between adult novice and amateur musicians’ silent reading of music notation, as well as the changes the 49 participants manifested during a nine-month-long music course.
From a methodological perspective, an opening to research on eye movements in music reading was the inclusion of a verbal protocol in the research design: after viewing the musical image, the readers were asked to describe what they had seen. A two-way categorization for verbal descriptions was developed in order to assess the quality of extracted musical information. More extensive musical background was related to shorter average fixation duration, more linear scanning of the musical image, and more sophisticated verbal descriptions of the music in question. No apparent effects of skill development were observed for the novice music readers alone, but all participants improved their verbal descriptions towards the last measurement. Apart from the background-related differences between groups of participants, combining verbal and eye-movement data in a cluster analysis identified three styles of silent reading. The finding demonstrated individual differences in how the freely defined silent-reading task was approached. This dissertation is among the first presentations of a series of experiments systematically addressing the visual processing of music notation in various types of music-reading tasks and focusing especially on the eye-movement indicators of developing music-reading skill. Overall, the experiments demonstrate that the music-reading processes are affected not only by “top-down” factors, such as musical background, but also by the “bottom-up” effects of specific features of music notation, such as pitch heights, metrical division, rhythmic patterns and unexpected melodic events. From a methodological perspective, the experiments emphasize the importance of systematic stimulus design, temporal control during performance tasks, and the development of complementary methods, for easing the interpretation of the eye-movement data. 
To conclude, this dissertation suggests that advances in comprehending the cognitive aspects of music reading, the nature of expertise in this musical task, and the development of educational tools can be attained through the systematic application of the eye-tracking methodology also in this specific domain.
Abstract:
The ongoing global financial crisis has demonstrated the importance of a systemwide, or macroprudential, approach to safeguarding financial stability. An essential part of macroprudential oversight concerns the tasks of early identification and assessment of risks and vulnerabilities that eventually may lead to a systemic financial crisis. Thriving tools are crucial as they allow early policy actions to decrease or prevent further build-up of risks or to otherwise enhance the shock absorption capacity of the financial system. In the literature, three types of systemic risk can be identified: i) build-up of widespread imbalances, ii) exogenous aggregate shocks, and iii) contagion. Accordingly, the systemic risks are matched by three categories of analytical methods for decision support: i) early-warning, ii) macro stress-testing, and iii) contagion models. Stimulated by the prolonged global financial crisis, today's toolbox of analytical methods includes a wide range of innovative solutions to the two tasks of risk identification and risk assessment. Yet, the literature lacks a focus on the task of risk communication. This thesis discusses macroprudential oversight from the viewpoint of all three tasks: within analytical tools for risk identification and risk assessment, the focus concerns a tight integration of means for risk communication. Data and dimension reduction methods, and their combinations, hold promise for representing multivariate data structures in easily understandable formats. The overall task of this thesis is to represent high-dimensional data concerning financial entities on low-dimensional displays. The low-dimensional representations have two subtasks: i) to function as a display for individual data concerning entities and their time series, and ii) to use the display as a basis to which additional information can be linked. The final nuance of the task is, however, set by the needs of the domain, data and methods.
The following five questions comprise subsequent steps addressed in the process of this thesis: 1. What are the needs for macroprudential oversight? 2. What form do macroprudential data take? 3. Which data and dimension reduction methods hold most promise for the task? 4. How should the methods be extended and enhanced for the task? 5. How should the methods and their extensions be applied to the task? Based upon the Self-Organizing Map (SOM), this thesis not only creates the Self-Organizing Financial Stability Map (SOFSM), but also lays out a general framework for mapping the state of financial stability. This thesis also introduces three extensions to the standard SOM for enhancing the visualization and extraction of information: i) fuzzifications, ii) transition probabilities, and iii) network analysis. Thus, the SOFSM functions as a display for risk identification, on top of which risk assessments can be illustrated. In addition, this thesis puts forward the Self-Organizing Time Map (SOTM) to provide means for visual dynamic clustering, which in the context of macroprudential oversight concerns the identification of cross-sectional changes in risks and vulnerabilities over time. Rather than automated analysis, the aim of visual means for identifying and assessing risks is to support disciplined and structured judgmental analysis based upon policymakers' experience and domain intelligence, as well as external risk communication.
Abstract:
Identification of low-dimensional structures and main sources of variation from multivariate data are fundamental tasks in data analysis. Many methods aimed at these tasks involve the solution of an optimization problem. Thus, the objective of this thesis is to develop computationally efficient and theoretically justified methods for solving such problems. Most of the thesis is based on a statistical model in which ridges of the density estimated from the data are considered as relevant features. Finding ridges, which are generalized maxima, necessitates the development of advanced optimization methods. An efficient and convergent trust region Newton method for projecting a point onto a ridge of the underlying density is developed for this purpose. The method is utilized in a differential equation-based approach for tracing ridges and computing projection coordinates along them. The density estimation is done nonparametrically by using Gaussian kernels. This allows application of ridge-based methods with only mild assumptions on the underlying structure of the data. The statistical model and the ridge finding methods are adapted to two different applications. The first one is extraction of curvilinear structures from noisy data mixed with background clutter. The second one is a novel nonlinear generalization of principal component analysis (PCA) and its extension to time series data. The methods have a wide range of potential applications in which most of the earlier approaches are inadequate. Examples include identification of faults from seismic data and identification of filaments from cosmological data. Applicability of the nonlinear PCA to climate analysis and to the reconstruction of periodic patterns from noisy time series data is also demonstrated. Other contributions of the thesis include the development of an efficient semidefinite optimization method for embedding graphs into the Euclidean space.
The method produces structure-preserving embeddings that maximize interpoint distances. It is primarily developed for dimensionality reduction, but it also has potential applications in graph theory and various areas of physics, chemistry and engineering. The asymptotic behaviour of ridges and maxima of Gaussian kernel densities is also investigated as the kernel bandwidth approaches infinity. The results are applied to the nonlinear PCA and to finding significant maxima of such densities, which is a typical problem in visual object tracking.
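For readers unfamiliar with maxima of Gaussian kernel densities, the following one-dimensional sketch locates a maximum with the mean-shift iteration. This is a simpler, well-known technique swapped in for illustration; it is not the trust region Newton method developed in the thesis, and it finds maxima rather than ridges. The sample values and bandwidths are made up.

```python
import math


def kde(x, data, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-((x - xi) ** 2) / (2.0 * h * h)) for xi in data) / norm


def mean_shift_mode(start, data, h, tol=1e-10, max_iter=1000):
    """Climb from `start` to a local maximum of the Gaussian KDE.

    Each step moves x to the kernel-weighted average of the samples.
    """
    x = start
    for _ in range(max_iter):
        weights = [math.exp(-((x - xi) ** 2) / (2.0 * h * h)) for xi in data]
        new_x = sum(w * xi for w, xi in zip(weights, data)) / sum(weights)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Made-up samples forming two clusters, around 1 and around 5.
data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]

left = mean_shift_mode(0.5, data, h=0.3)    # climbs to the mode near 1
right = mean_shift_mode(5.5, data, h=0.3)   # climbs to the mode near 5
wide = mean_shift_mode(0.5, data, h=100.0)  # very large bandwidth

print(left, right, wide)
```

With the small bandwidth the estimate has one maximum per cluster, and the result depends on the starting point. With the very large bandwidth the two clusters blur into a single maximum near the sample mean, which gives a feel for why the behaviour of maxima at large bandwidths is worth studying.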
Abstract:
The importance of package design as a marketing tool is growing as competition in the retail environment increases. However, there is a lack of studies on how each element of package design affects consumer decisions in different countries. The objective of this thesis is to study the role of package design for Japanese consumers. The research was conducted through an experiment with a sample of 37 Japanese female participants. They were divided into two groups and given different tasks: one group had to choose a chocolate for themselves, and the other for a group of friends. The participants were presented with 15 different Finnish chocolate boxes to choose from. The qualitative data was gathered through observation and semi-structured interviews. In addition, data from questionnaires was quantified and all the data was triangulated. The empirical results suggest that visual elements strongly affect the decision making of Japanese consumers. The image was the most important element, acting as both a visual and an informational aspect in the experiment. Informational elements, on the other hand, have little effect, especially when the text is written in a foreign language. However, informational elements did affect participants who were choosing chocolates for a group of friends. A unique finding was the importance of kawaii (cuteness) to Japanese consumers.