15 resultados para Robust methods
em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Resumo:
Human activity recognition in everyday environments is a critical, but challenging task in Ambient Intelligence applications to achieve proper Ambient Assisted Living, and key challenges still remain to be dealt with to realize robust methods. One of the major limitations of the Ambient Intelligence systems today is the lack of semantic models of those activities on the environment, so that the system can recognize the speci c activity being performed by the user(s) and act accordingly. In this context, this thesis addresses the general problem of knowledge representation in Smart Spaces. The main objective is to develop knowledge-based models, equipped with semantics to learn, infer and monitor human behaviours in Smart Spaces. Moreover, it is easy to recognize that some aspects of this problem have a high degree of uncertainty, and therefore, the developed models must be equipped with mechanisms to manage this type of information. A fuzzy ontology and a semantic hybrid system are presented to allow modelling and recognition of a set of complex real-life scenarios where vagueness and uncertainty are inherent to the human nature of the users that perform it. The handling of uncertain, incomplete and vague data (i.e., missing sensor readings and activity execution variations, since human behaviour is non-deterministic) is approached for the rst time through a fuzzy ontology validated on real-time settings within a hybrid data-driven and knowledgebased architecture. The semantics of activities, sub-activities and real-time object interaction are taken into consideration. The proposed framework consists of two main modules: the low-level sub-activity recognizer and the high-level activity recognizer. The rst module detects sub-activities (i.e., actions or basic activities) that take input data directly from a depth sensor (Kinect). The main contribution of this thesis tackles the second component of the hybrid system, which lays on top of the previous one, in a superior level of abstraction, and acquires the input data from the rst module's output, and executes ontological inference to provide users, activities and their in uence in the environment, with semantics. This component is thus knowledge-based, and a fuzzy ontology was designed to model the high-level activities. Since activity recognition requires context-awareness and the ability to discriminate among activities in di erent environments, the semantic framework allows for modelling common-sense knowledge in the form of a rule-based system that supports expressions close to natural language in the form of fuzzy linguistic labels. The framework advantages have been evaluated with a challenging and new public dataset, CAD-120, achieving an accuracy of 90.1% and 91.1% respectively for low and high-level activities. This entails an improvement over both, entirely data-driven approaches, and merely ontology-based approaches. As an added value, for the system to be su ciently simple and exible to be managed by non-expert users, and thus, facilitate the transfer of research to industry, a development framework composed by a programming toolbox, a hybrid crisp and fuzzy architecture, and graphical models to represent and con gure human behaviour in Smart Spaces, were developed in order to provide the framework with more usability in the nal application. As a result, human behaviour recognition can help assisting people with special needs such as in healthcare, independent elderly living, in remote rehabilitation monitoring, industrial process guideline control, and many other cases. This thesis shows use cases in these areas.
Resumo:
The aim of this thesis was to observe possibilities to enhance the development of manufacturing costs savings and competitiveness related to the compact KONE Renova Slim elevator door. Compact slim doors are especially designed for EMEA markets. EMEA market area is characterized by highly competitive pricing and lead times which are manifested as pressures to decrease manufacturing costs and lead times of the compact elevator door. The new elevator safety code EN81-20 coming live during the spring 2016 will also have a negative impact on the cost and competitiveness development making the situation more acute. As a sheet metal product the KONE Renova slim is highly variable. Manufacturing methods utilized in the production are common and robust methods. Due to the low volumes, high variability and tight lead times the manufacturing of the doors is facing difficulties. Manufacturing of the doors is outsourced to two individual suppliers Stera and Wittur. This thesis was implemented in collaboration with Stera. KONE and Stera pursue a long term and close partnership where the benefits reached by the collaboration are shared equally. Despite the aims, the collaboration between companies is not totally visible and various barriers are hampering the development towards more efficient ways of working. Based on the empirical studies related to this thesis, an efficient standardized (A+) process was developed for the main variations of the compact elevator door. Using the standardized process KONE is able to order the most important AMDS door variations from Stera with increased quality, lower manufacturing costs and manufacturing lead time compared to the current situation. In addition to all the benefits, the standardized (A+) process also includes risks in practice. KONE and the door supplier need to consider these practical risks together before decisions are made.
Resumo:
Learning of preference relations has recently received significant attention in machine learning community. It is closely related to the classification and regression analysis and can be reduced to these tasks. However, preference learning involves prediction of ordering of the data points rather than prediction of a single numerical value as in case of regression or a class label as in case of classification. Therefore, studying preference relations within a separate framework facilitates not only better theoretical understanding of the problem, but also motivates development of the efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics, natural language processing, etc. For example, algorithms that learn to rank are frequently used in search engines for ordering documents retrieved by the query. Preference learning methods have been also applied to collaborative filtering problems for predicting individual customer choices from the vast amount of user generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from well founded and robust class of regularized least-squares methods and have many attractive computational properties. In order to improve the performance of our methods, we introduce several non-linear kernel functions. Thus, contribution of this thesis is twofold: kernel functions for structured data that are used to take advantage of various non-vectorial data representations and the preference learning algorithms that are suitable for different tasks, namely efficient learning of preference relations, learning with large amount of training data, and semi-supervised preference learning. Proposed kernel-based algorithms and kernels are applied to the parse ranking task in natural language processing, document ranking in information retrieval, and remote homology detection in bioinformatics domain. Training of kernel-based ranking algorithms can be infeasible when the size of the training set is large. This problem is addressed by proposing a preference learning algorithm whose computation complexity scales linearly with the number of training data points. We also introduce sparse approximation of the algorithm that can be efficiently trained with large amount of data. For situations when small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only the problem of the efficient training of the algorithms but also fast regularization parameter selection, multiple output prediction, and cross-validation. Furthermore, proposed algorithms lead to notably better performance in many preference learning tasks considered.
Resumo:
Recent years have produced great advances in the instrumentation technology. The amount of available data has been increasing due to the simplicity, speed and accuracy of current spectroscopic instruments. Most of these data are, however, meaningless without a proper analysis. This has been one of the reasons for the overgrowing success of multivariate handling of such data. Industrial data is commonly not designed data; in other words, there is no exact experimental design, but rather the data have been collected as a routine procedure during an industrial process. This makes certain demands on the multivariate modeling, as the selection of samples and variables can have an enormous effect. Common approaches in the modeling of industrial data are PCA (principal component analysis) and PLS (projection to latent structures or partial least squares) but there are also other methods that should be considered. The more advanced methods include multi block modeling and nonlinear modeling. In this thesis it is shown that the results of data analysis vary according to the modeling approach used, thus making the selection of the modeling approach dependent on the purpose of the model. If the model is intended to provide accurate predictions, the approach should be different than in the case where the purpose of modeling is mostly to obtain information about the variables and the process. For industrial applicability it is essential that the methods are robust and sufficiently simple to apply. In this way the methods and the results can be compared and an approach selected that is suitable for the intended purpose. Differences in data analysis methods are compared with data from different fields of industry in this thesis. In the first two papers, the multi block method is considered for data originating from the oil and fertilizer industries. The results are compared to those from PLS and priority PLS. The third paper considers applicability of multivariate models to process control for a reactive crystallization process. In the fourth paper, nonlinear modeling is examined with a data set from the oil industry. The response has a nonlinear relation to the descriptor matrix, and the results are compared between linear modeling, polynomial PLS and nonlinear modeling using nonlinear score vectors.
Resumo:
Forest inventories are used to estimate forest characteristics and the condition of forest for many different applications: operational tree logging for forest industry, forest health state estimation, carbon balance estimation, land-cover and land use analysis in order to avoid forest degradation etc. Recent inventory methods are strongly based on remote sensing data combined with field sample measurements, which are used to define estimates covering the whole area of interest. Remote sensing data from satellites, aerial photographs or aerial laser scannings are used, depending on the scale of inventory. To be applicable in operational use, forest inventory methods need to be easily adjusted to local conditions of the study area at hand. All the data handling and parameter tuning should be objective and automated as much as possible. The methods also need to be robust when applied to different forest types. Since there generally are no extensive direct physical models connecting the remote sensing data from different sources to the forest parameters that are estimated, mathematical estimation models are of "black-box" type, connecting the independent auxiliary data to dependent response data with linear or nonlinear arbitrary models. To avoid redundant complexity and over-fitting of the model, which is based on up to hundreds of possibly collinear variables extracted from the auxiliary data, variable selection is needed. To connect the auxiliary data to the inventory parameters that are estimated, field work must be performed. In larger study areas with dense forests, field work is expensive, and should therefore be minimized. To get cost-efficient inventories, field work could partly be replaced with information from formerly measured sites, databases. The work in this thesis is devoted to the development of automated, adaptive computation methods for aerial forest inventory. The mathematical model parameter definition steps are automated, and the cost-efficiency is improved by setting up a procedure that utilizes databases in the estimation of new area characteristics.
Resumo:
In this thesis, a classi cation problem in predicting credit worthiness of a customer is tackled. This is done by proposing a reliable classi cation procedure on a given data set. The aim of this thesis is to design a model that gives the best classi cation accuracy to e ectively predict bankruptcy. FRPCA techniques proposed by Yang and Wang have been preferred since they are tolerant to certain type of noise in the data. These include FRPCA1, FRPCA2 and FRPCA3 from which the best method is chosen. Two di erent approaches are used at the classi cation stage: Similarity classi er and FKNN classi er. Algorithms are tested with Australian credit card screening data set. Results obtained indicate a mean classi cation accuracy of 83.22% using FRPCA1 with similarity classi- er. The FKNN approach yields a mean classi cation accuracy of 85.93% when used with FRPCA2, making it a better method for the suitable choices of the number of nearest neighbors and fuzziness parameters. Details on the calibration of the fuzziness parameter and other parameters associated with the similarity classi er are discussed.
Resumo:
The papermaking industry has been continuously developing intelligent solutions to characterize the raw materials it uses, to control the manufacturing process in a robust way, and to guarantee the desired quality of the end product. Based on the much improved imaging techniques and image-based analysis methods, it has become possible to look inside the manufacturing pipeline and propose more effective alternatives to human expertise. This study is focused on the development of image analyses methods for the pulping process of papermaking. Pulping starts with wood disintegration and forming the fiber suspension that is subsequently bleached, mixed with additives and chemicals, and finally dried and shipped to the papermaking mills. At each stage of the process it is important to analyze the properties of the raw material to guarantee the product quality. In order to evaluate properties of fibers, the main component of the pulp suspension, a framework for fiber characterization based on microscopic images is proposed in this thesis as the first contribution. The framework allows computation of fiber length and curl index correlating well with the ground truth values. The bubble detection method, the second contribution, was developed in order to estimate the gas volume at the delignification stage of the pulping process based on high-resolution in-line imaging. The gas volume was estimated accurately and the solution enabled just-in-time process termination whereas the accurate estimation of bubble size categories still remained challenging. As the third contribution of the study, optical flow computation was studied and the methods were successfully applied to pulp flow velocity estimation based on double-exposed images. Finally, a framework for classifying dirt particles in dried pulp sheets, including the semisynthetic ground truth generation, feature selection, and performance comparison of the state-of-the-art classification techniques, was proposed as the fourth contribution. The framework was successfully tested on the semisynthetic and real-world pulp sheet images. These four contributions assist in developing an integrated factory-level vision-based process control.
Resumo:
The recent rapid development of biotechnological approaches has enabled the production of large whole genome level biological data sets. In order to handle thesedata sets, reliable and efficient automated tools and methods for data processingand result interpretation are required. Bioinformatics, as the field of studying andprocessing biological data, tries to answer this need by combining methods and approaches across computer science, statistics, mathematics and engineering to studyand process biological data. The need is also increasing for tools that can be used by the biological researchers themselves who may not have a strong statistical or computational background, which requires creating tools and pipelines with intuitive user interfaces, robust analysis workflows and strong emphasis on result reportingand visualization. Within this thesis, several data analysis tools and methods have been developed for analyzing high-throughput biological data sets. These approaches, coveringseveral aspects of high-throughput data analysis, are specifically aimed for gene expression and genotyping data although in principle they are suitable for analyzing other data types as well. Coherent handling of the data across the various data analysis steps is highly important in order to ensure robust and reliable results. Thus,robust data analysis workflows are also described, putting the developed tools andmethods into a wider context. The choice of the correct analysis method may also depend on the properties of the specific data setandthereforeguidelinesforchoosing an optimal method are given. The data analysis tools, methods and workflows developed within this thesis have been applied to several research studies, of which two representative examplesare included in the thesis. The first study focuses on spermatogenesis in murinetestis and the second one examines cell lineage specification in mouse embryonicstem cells.
Resumo:
Tiivistelmä: Harvennusmenetelmien vertailu ojitetun turvemaan männikössä. Simulointitutkimus
Resumo:
Summary