12 results for visual data analysis
in Helda - Digital Repository of University of Helsinki
Abstract:
The aim of this thesis is to develop a fully automatic lameness detection system that operates in a milking robot. The instrumentation, measurement software, algorithms for data analysis and a neural network model for lameness detection were developed. Automatic milking has become a common practice in dairy husbandry, and in the year 2006 about 4000 farms worldwide used over 6000 milking robots. There is a worldwide movement with the objective of fully automating every process from feeding to milking. The increase in automation is a consequence of increasing farm sizes, the demand for more efficient production and the growth of labour costs. As the level of automation increases, the time that the cattle keeper spends monitoring animals often decreases. This has created a need for systems that automatically monitor the health of farm animals. The popularity of milking robots also offers a new and unique possibility to monitor animals in a single confined space up to four times daily. Lameness is a crucial welfare issue in the modern dairy industry. Limb disorders cause serious welfare, health and economic problems, especially in loose housing of cattle. Lameness causes losses in milk production and leads to early culling of animals. These costs could be reduced with early identification and treatment. At present, only a few methods for automatically detecting lameness have been developed, and the most common methods used for lameness detection and assessment are various visual locomotion scoring systems. The problem with locomotion scoring is that it requires experience to be conducted properly, it is labour intensive as an on-farm method and the results are subjective. A four-balance system for measuring the leg load distribution of dairy cows during milking in order to detect lameness was developed and set up at the University of Helsinki research farm Suitia.
The leg weights of 73 cows were successfully recorded during almost 10,000 robotic milkings over a period of 5 months. The cows were locomotion scored weekly, and the lame cows were inspected clinically for hoof lesions. Unsuccessful measurements, caused by cows standing outside the balances, were removed from the data with a special algorithm, and the mean leg loads and the number of kicks during milking were calculated. In order to develop an expert system to automatically detect lameness cases, a model was needed. A probabilistic neural network (PNN) classifier model was chosen for the task. The data were divided into two parts, and 5,074 measurements from 37 cows were used to train the model. The model was then evaluated for its ability to detect lameness in the validation dataset, which had 4,868 measurements from 36 cows. The model was able to classify 96% of the measurements correctly as sound or lame cows, and 100% of the lameness cases in the validation data were identified. The proportion of measurements causing false alarms was 1.1%. The developed model has the potential to be used for on-farm decision support and can be used in a real-time lameness monitoring system.
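As a rough illustration of how a probabilistic neural network assigns a class, the sketch below implements a generic Parzen-window PNN in plain Python. It is not the thesis's model: the feature layout (fraction of body weight on each of four legs), the toy training values, the function name `pnn_classify` and the smoothing parameter `sigma` are all assumptions made for illustration.

```python
import math


def pnn_classify(x, training, sigma=0.05):
    """Return (label, class_scores) for feature vector x.

    training maps each class label to a list of feature vectors.
    Each class score is the mean Gaussian kernel between x and the
    stored patterns of that class (a Parzen-window density estimate),
    and the class with the highest score wins (equal priors assumed).
    """
    scores = {}
    for label, patterns in training.items():
        total = 0.0
        for p in patterns:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
        scores[label] = total / len(patterns)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy data: fraction of body weight carried by each leg.
TRAINING = {
    "sound": [(0.27, 0.27, 0.23, 0.23), (0.28, 0.26, 0.23, 0.23)],
    "lame": [(0.30, 0.30, 0.28, 0.12), (0.29, 0.29, 0.30, 0.12)],
}

label, scores = pnn_classify((0.29, 0.30, 0.28, 0.13), TRAINING)
print(label, scores)
```

A PNN needs no iterative training: the stored patterns are the model, and `sigma` is the only parameter to tune, typically on held-out data.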
Abstract:
In this thesis, we develop theory and methods for computational data analysis. The problems in data analysis are approached from three perspectives: statistical learning theory, the Bayesian framework, and the information-theoretic minimum description length (MDL) principle. Contributions in statistical learning theory address the possibility of generalization to unseen cases, and regression analysis with partially observed data, with an application to mobile device positioning. In the second part of the thesis, we discuss so-called Bayesian network classifiers, and show that they are closely related to logistic regression models. In the final part, we apply the MDL principle to tracing the history of old manuscripts, and to noise reduction in digital signals.
Abstract:
This work belongs to the field of computational high-energy physics (HEP). The key methods used in this thesis work to meet the challenges raised by the Large Hadron Collider (LHC) era experiments are object-orientation with software engineering, Monte Carlo simulation, the computer technology of clusters, and artificial neural networks. The first aspect discussed is the development of hadronic cascade models, used for the accurate simulation of medium-energy hadron-nucleus reactions, up to 10 GeV. These models are typically needed in hadronic calorimeter studies and in the estimation of radiation backgrounds. Various applications outside HEP include the medical field (such as hadron treatment simulations), space science (satellite shielding), and nuclear physics (spallation studies). Validation results are presented for several significant improvements released in the Geant4 simulation tool, and the significance of the new models for computing in the Large Hadron Collider era is estimated. In particular, we estimate the ability of the Bertini cascade to simulate the Compact Muon Solenoid (CMS) hadron calorimeter (HCAL). LHC test beam activity has a tightly coupled cycle of simulation-to-data analysis. Typically, a Geant4 computer experiment is used to understand test beam measurements. Thus another aspect of this thesis is a description of studies related to developing new CMS H2 test beam data analysis tools and performing data analysis on the basis of CMS Monte Carlo events. These events have been simulated in detail using Geant4 physics models, a full CMS detector description, and event reconstruction. Using the ROOT data analysis framework, we have developed an offline ANN-based approach to tag b-jets associated with heavy neutral Higgs particles, and we show that this kind of NN methodology can be successfully used to separate the Higgs signal from the background in the CMS experiment.
Abstract:
Accelerator mass spectrometry (AMS) is an ultrasensitive technique for measuring the concentration of a single isotope. The electric and magnetic fields of an electrostatic accelerator system are used to filter out other isotopes from the ion beam. The high velocity means that molecules can be destroyed and removed from the measurement background. As a result, concentrations down to one atom in 10^16 atoms are measurable. This thesis describes the construction of the new AMS system in the Accelerator Laboratory of the University of Helsinki. The system is described in detail along with the relevant ion optics. System performance and some of the 14C measurements done with the system are described. In a second part of the thesis, a novel statistical model for the analysis of AMS data is presented. Bayesian methods are used in order to make the best use of the available information. In the new model, instrumental drift is modelled with a continuous first-order autoregressive process. This enables rigorous normalization to standards measured at different times. The Poisson statistical nature of a 14C measurement is also taken into account properly, so that uncertainty estimates are much more stable. It is shown that, overall, the new model improves both the accuracy and the precision of AMS measurements. In particular, the results can be improved for samples with very low 14C concentrations or measured only a few times.
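To make the idea of a continuous first-order autoregressive drift concrete, the following sketch samples such a process at irregular measurement times. It is a generic illustration, not the thesis's Bayesian model: the parametrisation by a correlation time `tau` and a stationary standard deviation, the function names, and the example times are assumptions.

```python
import math
import random


def ar1_step_params(dt, tau, stationary_sd):
    """Decay factor and innovation s.d. for a time gap dt.

    For a continuous AR(1) process with correlation time tau, the
    correlation over a gap dt is exp(-dt / tau); the innovation
    variance is chosen so the stationary variance is preserved.
    """
    phi = math.exp(-dt / tau)
    innovation_sd = stationary_sd * math.sqrt(1.0 - phi ** 2)
    return phi, innovation_sd


def simulate_drift(times, tau, stationary_sd, rng):
    """Sample the drift at the given (possibly irregular) times."""
    drift = [rng.gauss(0.0, stationary_sd)]
    for t_prev, t_next in zip(times, times[1:]):
        phi, sd = ar1_step_params(t_next - t_prev, tau, stationary_sd)
        drift.append(phi * drift[-1] + rng.gauss(0.0, sd))
    return drift


# Hypothetical measurement times (hours) with uneven spacing.
times = [0.0, 1.0, 3.0, 3.5, 8.0]
print(simulate_drift(times, tau=2.0, stationary_sd=0.01, rng=random.Random(0)))
```

Because the correlation depends only on the time gap, standards and samples measured at different, unevenly spaced times can share one drift model, which is what makes normalization across times possible.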
Abstract:
Aims: Develop and validate tools to estimate residual noise covariance in Planck frequency maps. Quantify signal error effects and compare different techniques to produce low-resolution maps. Methods: We derive analytical estimates of the covariance of the residual noise contained in low-resolution maps produced using a number of map-making approaches. We test these analytical predictions using Monte Carlo simulations and assess their impact on angular power spectrum estimation. We use simulations to quantify the level of signal errors incurred in the different resolution downgrading schemes considered in this work. Results: We find excellent agreement between the optimal residual noise covariance matrices and Monte Carlo noise maps. For destriping map-makers, the extent of agreement is dictated by the knee frequency of the correlated noise component and the chosen baseline offset length. The effect of signal striping is shown to be insignificant when properly dealt with. In map resolution downgrading, we find that a carefully selected window function is required to reduce aliasing to the sub-percent level at multipoles ell > 2 Nside, where Nside is the HEALPix resolution parameter. We show that sufficient characterization of the residual noise is essential if one is to draw reliable constraints on large-scale anisotropy. Conclusions: We have described how to compute the low-resolution maps, with a controlled sky signal level, and a reliable estimate of the covariance of the residual noise. We have also presented a method to smooth the residual noise covariance matrices to describe the noise correlations in smoothed, bandwidth-limited maps.
Abstract:
The aim of the study was to analyze and facilitate collaborative design in a virtual learning environment (VLE). Discussions of virtual design in design education have typically focused on technological or communication issues, not on pedagogical issues. Yet in order to facilitate collaborative design, it is also necessary to address the pedagogical issues related to the virtual design process. In this study, the progressive inquiry model of collaborative designing was used to give a structural level of facilitation to students working in the VLE. According to this model, all aspects of inquiry, such as creating the design context, constructing a design idea, evaluating the idea, and searching for new information, can be shared in a design community. The study consists of three design projects: 1) designing clothes for premature babies, 2) designing conference bags for an international conference, and 3) designing tactile books for visually impaired children. These design projects constituted a continuum of design experiments, each of which highlighted certain perspectives on collaborative designing. The design experiments were organized so that the participants worked in design teams, both face-to-face and virtually. The first design experiment focused on peer collaboration among textile teacher students in the VLE. The second design experiment took end-users' needs into consideration by using a participatory design approach. The third design experiment intensified computer-supported collaboration between students and domain experts. The virtual learning environments in these design experiments were designed to support knowledge-building pedagogy and progressive inquiry learning. These environments enabled a detailed recording of all computer-mediated interactions and data related to virtual designing. The data analysis was based on qualitative content analysis of design statements in the VLE.
This study indicated four crucial issues concerning collaborative design in the VLE in craft and design education. Firstly, using the collaborative design process in craft and design education gives rise to special challenges of building learning communities, creating appropriate design tasks for them, and providing tools for collaborative activities. Secondly, the progressive inquiry model of collaborative designing can be used as a scaffold support for design thinking and for reflection on the design process. Thirdly, participation and distributed expertise can be facilitated by considering the key stakeholders who are related to the design task or design context, and getting them to participate in virtual designing. Fourthly, in the collaborative design process, it is important that team members create and improve visual and technical ideas together, not just agree or disagree about proposed ideas. Therefore, viewing the VLE as a medium for collaborative construction of the design objects appears crucial in order to understand and facilitate the complex processes in collaborative designing.
Abstract:
The paradigm of computational vision hypothesizes that any visual function -- such as the recognition of your grandparent -- can be replicated by computational processing of the visual input. What are these computations that the brain performs? What should or could they be? Working on the latter question, this dissertation takes the statistical approach, in which suitable computations are learned from the natural visual data itself. In particular, we empirically study the computational processing that emerges from the statistical properties of the visual world and the constraints and objectives specified for the learning process. This thesis consists of an introduction and 7 peer-reviewed publications, where the purpose of the introduction is to illustrate the area of study to a reader who is not familiar with computational vision research. In the scope of the introduction, we briefly review the primary challenges to visual processing, and recall some of the current opinions on visual processing in the early visual systems of animals. Next, we describe the methodology we have used in our research, and discuss the presented results. We have included some additional remarks, speculations and conclusions in this discussion that were not featured in the original publications. We present the following results in the publications of this thesis. First, we empirically demonstrate that luminance and contrast are strongly dependent in natural images, contradicting previous theories suggesting that luminance and contrast were processed separately in natural systems due to their independence in the visual data. Second, we show that simple-cell-like receptive fields of the primary visual cortex can be learned in the nonlinear contrast domain by maximization of independence.
Further, we provide first-time reports of the emergence of conjunctive (corner-detecting) and subtractive (opponent orientation) processing due to nonlinear projection pursuit with simple objective functions related to sparseness and response energy optimization. Then, we show that attempting to extract independent components of nonlinear histogram statistics of a biologically plausible representation leads to projection directions that appear to differentiate between visual contexts. Such processing might be applicable for priming, i.e. the selection and tuning of later visual processing. We continue by showing that a different kind of thresholded low-frequency priming can be learned and used to make object detection faster with little loss in accuracy. Finally, we show that in a computational object detection setting, nonlinearly gain-controlled visual features of medium complexity can be acquired sequentially as images are encountered and discarded. We present two online algorithms to perform this feature selection, and propose the idea that for artificial systems, some processing mechanisms could be selectable from the environment without optimizing the mechanisms themselves. In summary, this thesis explores learning visual processing on several levels. The learning can be understood as an interplay of input data, model structures, learning objectives, and estimation algorithms. The presented work adds to the growing body of evidence showing that statistical methods can be used to acquire intuitively meaningful visual processing mechanisms. The work also presents some predictions and ideas regarding biological visual processing.
Abstract:
The core aim of machine learning is to make a computer program learn from experience. Learning from data is usually defined as the task of learning regularities or patterns in data in order to extract useful information, or to learn the underlying concept. An important sub-field of machine learning is multi-view learning, where the task is to learn from multiple data sets or views describing the same underlying concept. A typical example of such a scenario would be to study a biological concept using several biological measurements, like gene expression, protein expression and metabolic profiles, or to classify web pages based on their content and the contents of their hyperlinks. In this thesis, novel problem formulations and methods for multi-view learning are presented. The contributions include a linear data fusion approach for exploratory data analysis, a new measure to evaluate different kinds of representations for textual data, and an extension of multi-view learning to novel scenarios where the correspondence of samples in the different views or data sets is not known in advance. In order to infer the one-to-one correspondence of samples between two views, a novel concept of multi-view matching is proposed. The matching algorithm is completely data-driven and is demonstrated in several applications, such as matching of metabolites between humans and mice, and matching of sentences between documents in two languages.
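The notion of a one-to-one correspondence between two views can be illustrated with a deliberately simple sketch. It is not the thesis's matching algorithm: it assumes a precomputed similarity matrix (hypothetical values below) and finds the best pairing by exhaustive search, which is only feasible for a handful of samples.

```python
from itertools import permutations


def best_one_to_one_matching(similarity):
    """Exhaustively find the pairing that maximizes total similarity.

    similarity[i][j] scores pairing sample i of view A with sample j
    of view B. Returns (matching, total), where matching[i] is the
    index in view B assigned to sample i. Cost grows as n!, so this
    is for illustration only.
    """
    n = len(similarity)
    best_perm, best_score = None, float("-inf")
    for perm in permutations(range(n)):
        score = sum(similarity[i][perm[i]] for i in range(n))
        if score > best_score:
            best_perm, best_score = perm, score
    return list(best_perm), best_score


# Hypothetical similarity scores between three samples in each view.
SIM = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.1, 0.7, 0.2],
]

matching, total = best_one_to_one_matching(SIM)
print(matching)  # [0, 2, 1]
```

Each sample in one view is paired with exactly one sample in the other, which is the constraint that distinguishes matching from independently picking the nearest neighbour of every sample.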
Abstract:
Tourism is one of the important livelihoods in Lapland. Christmas tourism was launched in the early 1980s and became a success story, being labelled the most epochal tourism product in Finland. Hence, today Christmas tourists are one of the most significant foreign groups arriving in Lapland during the winter season and contributing considerably to the economy of the northeastern periphery of the EU. Christmas tourism concentrates around Father Christmas, who uses reindeer for transportation. The Sámi are the only indigenous people in the EU. They are all stereotypically perceived to be reindeer herders. Somehow these three, that is, Santa Claus, reindeer and the Sámi, have been incorporated into the same fairytale dominion. In practice, this has happened by using the most visible cultural, but also significant, identity marker of the Sámi, the Sámi costume. This, in turn, has created controversy over authenticity due to the manner in which the costume is used in tourism, often in imitational, mismatched forms by non-Sámi. In this thesis, after a review of the relevant literature, I intend to establish how the Sámi are represented in Christmas tourism through visual data consisting of ten images from three foreign sources. Then I clarify why and to whom it matters how the Sámi are represented in Christmas tourism with the aid of 65 questionnaires and nineteen expert interviews collected mainly in the Finnish Sámi Home Region in October 2009. Through the multiplicity of the voices of various interest and ethnic groups and by using critical discourse analysis, I attempt to give an overview of the respondents' opinions and look at some preliminary solutions to the controversy. Based on my data, the non-Sámi appear to accept the Sámi costume usage in Christmas tourism most readily.
Consequently, respect and attitudinal changes have become the respondents' propositions, in addition to a common set of rules on how the Sámi image could be appropriated without violating the integrity of the Sámi people, or a system similar to the Sámi Duodji trademark guaranteeing the authenticity of the tourism products. Additionally, though half of the interviewees explain the Sámi presence in Christmas tourism as adding local flavour to an otherwise commercial enterprise, the other half see no rationale for connecting facts with fiction, that is, the Sámi with Santa Claus.
Abstract:
Aerosol particles have an effect on climate, visibility, air quality and human health. However, the strength with which aerosol particles affect our everyday life is not well described or entirely understood. Therefore, investigations of different processes and phenomena, including e.g. primary particle sources, initial steps of secondary particle formation and growth, the significance of charged particles in particle formation, as well as redistribution mechanisms in the atmosphere, are required. In this work, sources, sinks and concentrations of air ions (charged molecules, clusters and particles) were investigated directly by measuring air molecule ionising components (i.e. radon activity concentrations and external radiation dose rates) and charged particle size distributions, as well as based on a literature review. The obtained results gave a comprehensive and valuable picture of the spatial and temporal variation of the air ion sources, sinks and concentrations to use as input parameters in local and global scale climate models. Newly developed air ion spectrometers (Airel Ltd.) offered a possibility to investigate atmospheric (charged) particle formation and growth at sub-3 nm sizes. Therefore, new visual classification schemes for charged particle formation events were developed, and a newly developed particle growth rate method was tested with a dataset covering more than one year. These data analysis methods have been widely utilised by other researchers since their introduction. This thesis revealed interesting characteristics of atmospheric particle formation and growth: e.g. particle growth may sometimes be suppressed before the detection limit (~3 nm) of traditional aerosol instruments, particle formation may take place during daytime as well as in the evening, and growth rates of sub-3 nm particles were quite constant throughout the year, while growth rates of larger particles (3-20 nm in diameter) were higher during summer than in winter.
These observations were thought to be a consequence of the availability of condensing vapours. The observations of this thesis offered new understanding of particle formation in the atmosphere. However, the role of ions in particle formation, which is not well understood with current knowledge, requires further research in the future.