37 resultados para Graph-based methods
em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Resumo:
Learning of preference relations has recently received significant attention in machine learning community. It is closely related to the classification and regression analysis and can be reduced to these tasks. However, preference learning involves prediction of ordering of the data points rather than prediction of a single numerical value as in case of regression or a class label as in case of classification. Therefore, studying preference relations within a separate framework facilitates not only better theoretical understanding of the problem, but also motivates development of the efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics, natural language processing, etc. For example, algorithms that learn to rank are frequently used in search engines for ordering documents retrieved by the query. Preference learning methods have been also applied to collaborative filtering problems for predicting individual customer choices from the vast amount of user generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from well founded and robust class of regularized least-squares methods and have many attractive computational properties. In order to improve the performance of our methods, we introduce several non-linear kernel functions. Thus, contribution of this thesis is twofold: kernel functions for structured data that are used to take advantage of various non-vectorial data representations and the preference learning algorithms that are suitable for different tasks, namely efficient learning of preference relations, learning with large amount of training data, and semi-supervised preference learning. Proposed kernel-based algorithms and kernels are applied to the parse ranking task in natural language processing, document ranking in information retrieval, and remote homology detection in bioinformatics domain. Training of kernel-based ranking algorithms can be infeasible when the size of the training set is large. This problem is addressed by proposing a preference learning algorithm whose computation complexity scales linearly with the number of training data points. We also introduce sparse approximation of the algorithm that can be efficiently trained with large amount of data. For situations when small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only the problem of the efficient training of the algorithms but also fast regularization parameter selection, multiple output prediction, and cross-validation. Furthermore, proposed algorithms lead to notably better performance in many preference learning tasks considered.
Resumo:
Metaheuristic methods have become increasingly popular approaches in solving global optimization problems. From a practical viewpoint, it is often desirable to perform multimodal optimization which, enables the search of more than one optimal solution to the task at hand. Population-based metaheuristic methods offer a natural basis for multimodal optimization. The topic has received increasing interest especially in the evolutionary computation community. Several niching approaches have been suggested to allow multimodal optimization using evolutionary algorithms. Most global optimization approaches, including metaheuristics, contain global and local search phases. The requirement to locate several optima sets additional requirements for the design of algorithms to be effective in both respects in the context of multimodal optimization. In this thesis, several different multimodal optimization algorithms are studied in regard to how their implementation in the global and local search phases affect their performance in different problems. The study concentrates especially on variations of the Differential Evolution algorithm and their capabilities in multimodal optimization. To separate the global and local search search phases, three multimodal optimization algorithms are proposed, two of which hybridize the Differential Evolution with a local search method. As the theoretical background behind the operation of metaheuristics is not generally thoroughly understood, the research relies heavily on experimental studies in finding out the properties of different approaches. To achieve reliable experimental information, the experimental environment must be carefully chosen to contain appropriate and adequately varying problems. The available selection of multimodal test problems is, however, rather limited, and no general framework exists. As a part of this thesis, such a framework for generating tunable test functions for evaluating different methods of multimodal optimization experimentally is provided and used for testing the algorithms. The results demonstrate that an efficient local phase is essential for creating efficient multimodal optimization algorithms. Adding a suitable global phase has the potential to boost the performance significantly, but the weak local phase may invalidate the advantages gained from the global phase.
Resumo:
New luminometric particle-based methods were developed to quantify protein and to count cells. The developed methods rely on the interaction of the sample with nano- or microparticles and different principles of detection. In fluorescence quenching, timeresolved luminescence resonance energy transfer (TR-LRET), and two-photon excitation fluorescence (TPX) methods, the sample prevents the adsorption of labeled protein to the particles. Depending on the system, the addition of the analyte increases or decreases the luminescence. In the dissociation method, the adsorbed protein protects the Eu(III) chelate on the surface of the particles from dissociation at a low pH. The experimental setups are user-friendly and rapid and do not require hazardous test compounds and elevated temperatures. The sensitivity of the quantification of protein (from 40 to 500 pg bovine serum albumin in a sample) was 20-500-fold better than in most sensitive commercial methods. The quenching method exhibited low protein-to-protein variability and the dissociation method insensitivity to the assay contaminants commonly found in biological samples. Less than ten eukaryotic cells were detected and quantified with all the developed methods under optimized assay conditions. Furthermore, two applications, the method for detection of the aggregation of protein and the cell viability test, were developed by utilizing the TR-LRET method. The detection of the aggregation of protein was allowed at a more than 10,000 times lower concentration, 30 μg/L, compared to the known methods of UV240 absorbance and dynamic light scattering. The TR-LRET method was combined with a nucleic acid assay with cell-impermeable dye to measure the percentage of dead cells in a single tube test with cell counts below 1000 cells/tube.
Resumo:
Virtual environments and real-time simulators (VERS) are becoming more and more important tools in research and development (R&D) process of non-road mobile machinery (NRMM). The virtual prototyping techniques enable faster and more cost-efficient development of machines compared to use of real life prototypes. High energy efficiency has become an important topic in the world of NRMM because of environmental and economic demands. The objective of this thesis is to develop VERS based methods for research and development of NRMM. A process using VERS for assessing effects of human operators on the life-cycle efficiency of NRMM was developed. Human in the loop simulations are ran using an underground mining loader to study the developed process. The simulations were ran in the virtual environment of the Laboratory of Intelligent Machines of Lappeenranta University of Technology. A physically adequate real-time simulation model of NRMM was shown to be reliable and cost effective in testing of hardware components by the means of hardware-in-the-loop (HIL) simulations. A control interface connecting integrated electro-hydraulic energy converter (IEHEC) with virtual simulation model of log crane was developed. IEHEC consists of a hydraulic pump-motor and an integrated electrical permanent magnet synchronous motorgenerator. The results show that state of the art real-time NRMM simulators are capable to solve factors related to energy consumption and productivity of the NRMM. A significant variation between the test drivers is found. The results show that VERS can be used for assessing human effects on the life-cycle efficiency of NRMM. HIL simulation responses compared to that achieved with conventional simulation method demonstrate the advances and drawbacks of various possible interfaces between the simulator and hardware part of the system under study. Novel ideas for arranging the interface are successfully tested and compared with the more traditional one. The proposed process for assessing the effects of operators on the life-cycle efficiency will be applied for wider group of operators in the future. Driving styles of the operators can be analysed statistically from sufficient large result data. The statistical analysis can find the most life-cycle efficient driving style for the specific environment and machinery. The proposed control interface for HIL simulation need to be further studied. The robustness and the adaptation of the interface in different situations must be verified. The future work will also include studying the suitability of the IEHEC for different working machines using the proposed HIL simulation method.
Resumo:
Paperin pinnan karheus on yksi paperin laatukriteereistä. Sitä mitataan fyysisestipaperin pintaa mittaavien laitteiden ja optisten laitteiden avulla. Mittaukset vaativat laboratorioolosuhteita, mutta nopeammille, suoraan linjalla tapahtuville mittauksilla olisi tarvetta paperiteollisuudessa. Paperin pinnan karheus voidaan ilmaista yhtenä näytteelle kohdistuvana karheusarvona. Tässä työssä näyte on jaettu merkitseviin alueisiin, ja jokaiselle alueelle on laskettu erillinen karheusarvo. Karheuden mittaukseen on käytetty useita menetelmiä. Yleisesti hyväksyttyä tilastollista menetelmää on käytetty tässä työssä etäisyysmuunnoksen lisäksi. Paperin pinnan karheudenmittauksessa on ollut tarvetta jakaa analysoitava näyte karheuden perusteella alueisiin. Aluejaon avulla voidaan rajata näytteestä selvästi karheampana esiintyvät alueet. Etäisyysmuunnos tuottaa alueita, joita on analysoitu. Näistä alueista on muodostettu yhtenäisiä alueita erilaisilla segmentointimenetelmillä. PNN -menetelmään (Pairwise Nearest Neighbor) ja naapurialueiden yhdistämiseen perustuvia algoritmeja on käytetty.Alueiden jakamiseen ja yhdistämiseen perustuvaa lähestymistapaa on myös tarkasteltu. Segmentoitujen kuvien validointi on yleensä tapahtunut ihmisen tarkastelemana. Tämän työn lähestymistapa on verrata yleisesti hyväksyttyä tilastollista menetelmää segmentoinnin tuloksiin. Korkea korrelaatio näiden tulosten välillä osoittaa onnistunutta segmentointia. Eri kokeiden tuloksia on verrattu keskenään hypoteesin testauksella. Työssä on analysoitu kahta näytesarjaa, joidenmittaukset on suoritettu OptiTopolla ja profilometrillä. Etäisyysmuunnoksen aloitusparametrit, joita muutettiin kokeiden aikana, olivat aloituspisteiden määrä ja sijainti. Samat parametrimuutokset tehtiin kaikille algoritmeille, joita käytettiin alueiden yhdistämiseen. Etäisyysmuunnoksen jälkeen korrelaatio oli voimakkaampaa profilometrillä mitatuille näytteille kuin OptiTopolla mitatuille näytteille. Segmentoiduilla OptiTopo -näytteillä korrelaatio parantui voimakkaammin kuin profilometrinäytteillä. PNN -menetelmän tuottamilla tuloksilla korrelaatio oli paras.
Resumo:
Fluid handling systems such as pump and fan systems are found to have a significant potential for energy efficiency improvements. To deliver the energy saving potential, there is a need for easily implementable methods to monitor the system output. This is because information is needed to identify inefficient operation of the fluid handling system and to control the output of the pumping system according to process needs. Model-based pump or fan monitoring methods implemented in variable speed drives have proven to be able to give information on the system output without additional metering; however, the current model-based methods may not be usable or sufficiently accurate in the whole operation range of the fluid handling device. To apply model-based system monitoring in a wider selection of systems and to improve the accuracy of the monitoring, this paper proposes a new method for pump and fan output monitoring with variable-speed drives. The method uses a combination of already known operating point estimation methods. Laboratory measurements are used to verify the benefits and applicability of the improved estimation method, and the new method is compared with five previously introduced model-based estimation methods. According to the laboratory measurements, the new estimation method is the most accurate and reliable of the model-based estimation methods.
Resumo:
Fluid handling systems account for a significant share of the global consumption of electrical energy. They also suffer from problems, which reduce their energy efficiency and increase life-cycle costs. Detecting or predicting these problems in time can make fluid handling systems more environmentally and economically sustainable to operate. In this Master’s Thesis, significant problems in fluid systems were studied and possibilities to develop variable-speed-drive-based detection methods for them was discussed. A literature review was conducted to find significant problems occurring in fluid handling systems containing pumps, fans and compressors. To find case examples for evaluating the feasibility of variable-speed-drive-based methods, queries were sent to industrial companies. As a result of this, the possibility to detect heat exchanger fouling with a variable-speed drive was analysed with data from three industrial cases. It was found that a mass flow rate estimate, which can be generated with a variable speed drive, can be used together with temperature measurements to monitor a heat exchanger’s thermal performance. Secondly, it was found that the fouling-related increase in the pressure drop of a heat exchanger can be monitored with a variable speed drive. Lastly, for systems where the flow device is speed controlled with by a pressure measurement, it was concluded that increasing rotational speed can be interpreted as progressing fouling in the heat exchanger.
Resumo:
Computational model-based simulation methods were developed for the modelling of bioaffinity assays. Bioaffinity-based methods are widely used to quantify a biological substance in biological research, development and in routine clinical in vitro diagnostics. Bioaffinity assays are based on the high affinity and structural specificity between the binding biomolecules. The simulation methods developed are based on the mechanistic assay model, which relies on the chemical reaction kinetics and describes the forming of a bound component as a function of time from the initial binding interaction. The simulation methods were focused on studying the behaviour and the reliability of bioaffinity assay and the possibilities the modelling methods of binding reaction kinetics provide, such as predicting assay results even before the binding reaction has reached equilibrium. For example, a rapid quantitative result from a clinical bioaffinity assay sample can be very significant, e.g. even the smallest elevation of a heart muscle marker reveals a cardiac injury. The simulation methods were used to identify critical error factors in rapid bioaffinity assays. A new kinetic calibration method was developed to calibrate a measurement system by kinetic measurement data utilizing only one standard concentration. A nodebased method was developed to model multi-component binding reactions, which have been a challenge to traditional numerical methods. The node-method was also used to model protein adsorption as an example of nonspecific binding of biomolecules. These methods have been compared with the experimental data from practice and can be utilized in in vitro diagnostics, drug discovery and in medical imaging.
Resumo:
Identification of low-dimensional structures and main sources of variation from multivariate data are fundamental tasks in data analysis. Many methods aimed at these tasks involve solution of an optimization problem. Thus, the objective of this thesis is to develop computationally efficient and theoretically justified methods for solving such problems. Most of the thesis is based on a statistical model, where ridges of the density estimated from the data are considered as relevant features. Finding ridges, that are generalized maxima, necessitates development of advanced optimization methods. An efficient and convergent trust region Newton method for projecting a point onto a ridge of the underlying density is developed for this purpose. The method is utilized in a differential equation-based approach for tracing ridges and computing projection coordinates along them. The density estimation is done nonparametrically by using Gaussian kernels. This allows application of ridge-based methods with only mild assumptions on the underlying structure of the data. The statistical model and the ridge finding methods are adapted to two different applications. The first one is extraction of curvilinear structures from noisy data mixed with background clutter. The second one is a novel nonlinear generalization of principal component analysis (PCA) and its extension to time series data. The methods have a wide range of potential applications, where most of the earlier approaches are inadequate. Examples include identification of faults from seismic data and identification of filaments from cosmological data. Applicability of the nonlinear PCA to climate analysis and reconstruction of periodic patterns from noisy time series data are also demonstrated. Other contributions of the thesis include development of an efficient semidefinite optimization method for embedding graphs into the Euclidean space. The method produces structure-preserving embeddings that maximize interpoint distances. It is primarily developed for dimensionality reduction, but has also potential applications in graph theory and various areas of physics, chemistry and engineering. Asymptotic behaviour of ridges and maxima of Gaussian kernel densities is also investigated when the kernel bandwidth approaches infinity. The results are applied to the nonlinear PCA and to finding significant maxima of such densities, which is a typical problem in visual object tracking.
Resumo:
In this paper, we review the advances of monocular model-based tracking for last ten years period until 2014. In 2005, Lepetit, et. al, [19] reviewed the status of monocular model based rigid body tracking. Since then, direct 3D tracking has become quite popular research area, but monocular model-based tracking should still not be forgotten. We mainly focus on tracking, which could be applied to aug- mented reality, but also some other applications are covered. Given the wide subject area this paper tries to give a broad view on the research that has been conducted, giving the reader an introduction to the different disciplines that are tightly related to model-based tracking. The work has been conducted by searching through well known academic search databases in a systematic manner, and by selecting certain publications for closer examination. We analyze the results by dividing the found papers into different categories by their way of implementation. The issues which have not yet been solved are discussed. We also discuss on emerging model-based methods such as fusing different types of features and region-based pose estimation which could show the way for future research in this subject.
Resumo:
Tannins, typically segregated into two major groups, the hydrolyzable tannins (HTs) and the proanthocyanidins (PAs), are plant polyphenolic secondary metabolites found throughout the plant kingdom. On one hand, tannins may cause harmful nutritional effects on herbivores, for example insects, and hence they work as plants’ defense against plant-eating animals. On the other hand, they may affect positively some herbivores, such as mammals, for example by their antioxidant, antimicrobial, anti-inflammatory or anticarcinogenic activities. This thesis focuses on understanding the bioactivity of plant tannins, their anthelmintic properties and the tools used for the qualitative and quantitative analysis of this endless source of structural diversity. The first part of the experimental work focused on the development of ultra-high performance liquid chromatography−tandem mass spectrometry (UHPLC-MS/MS) based methods for the rapid fingerprint analysis of bioactive polyphenols, especially tannins. In the second part of the experimental work the in vitro activity of isolated and purified HTs and their hydrolysis product, gallic acid, was tested against egg hatching and larval motility of two larval developmental stages, L1 and L2, of a common ruminant gastrointestinal parasite, Haemonchus contortus. The results indicated clear relationships between the HT structure and the anthelmintic activity. The activity of the studied compounds depended on many structural features, including size, functional groups present in the structure, and the structural rigidness. To further understand tannin bioactivity on a molecular level, the interaction between bovine serum albumin (BSA), and seven HTs and epigallocatechin gallate was examined. The objective was to define the effect of pH on the formation on tannin–protein complexes and to evaluate the stability of the formed complexes by gel electrophoresis and MALDI-TOF-MS. The results indicated that more basic pH values had a stabilizing effect on the tannin–protein complexes and that the tannin oxidative activity was directly linked with their tendency to form covalently stabilized complexes with BSA at increased pH.
Resumo:
The estimating of the relative orientation and position of a camera is one of the integral topics in the field of computer vision. The accuracy of a certain Finnish technology company’s traffic sign inventory and localization process can be improved by utilizing the aforementioned concept. The company’s localization process uses video data produced by a vehicle installed camera. The accuracy of estimated traffic sign locations depends on the relative orientation between the camera and the vehicle. This thesis proposes a computer vision based software solution which can estimate a camera’s orientation relative to the movement direction of the vehicle by utilizing video data. The task was solved by using feature-based methods and open source software. When using simulated data sets, the camera orientation estimates had an absolute error of 0.31 degrees on average. The software solution can be integrated to be a part of the traffic sign localization pipeline of the company in question.
Resumo:
Mass spectrometry (MS)-based proteomics has seen significant technical advances during the past two decades and mass spectrometry has become a central tool in many biosciences. Despite the popularity of MS-based methods, the handling of the systematic non-biological variation in the data remains a common problem. This biasing variation can result from several sources ranging from sample handling to differences caused by the instrumentation. Normalization is the procedure which aims to account for this biasing variation and make samples comparable. Many normalization methods commonly used in proteomics have been adapted from the DNA-microarray world. Studies comparing normalization methods with proteomics data sets using some variability measures exist. However, a more thorough comparison looking at the quantitative and qualitative differences of the performance of the different normalization methods and at their ability in preserving the true differential expression signal of proteins, is lacking. In this thesis, several popular and widely used normalization methods (the Linear regression normalization, Local regression normalization, Variance stabilizing normalization, Quantile-normalization, Median central tendency normalization and also variants of some of the forementioned methods), representing different strategies in normalization are being compared and evaluated with a benchmark spike-in proteomics data set. The normalization methods are evaluated in several ways. The performance of the normalization methods is evaluated qualitatively and quantitatively on a global scale and in pairwise comparisons of sample groups. In addition, it is investigated, whether performing the normalization globally on the whole data or pairwise for the comparison pairs examined, affects the performance of the normalization method in normalizing the data and preserving the true differential expression signal. In this thesis, both major and minor differences in the performance of the different normalization methods were found. Also, the way in which the normalization was performed (global normalization of the whole data or pairwise normalization of the comparison pair) affected the performance of some of the methods in pairwise comparisons. Differences among variants of the same methods were also observed.
Resumo:
Decisions taken in modern organizations are often multi-dimensional, involving multiple decision makers and several criteria measured on different scales. Multiple Criteria Decision Making (MCDM) methods are designed to analyze and to give recommendations in this kind of situations. Among the numerous MCDM methods, two large families of methods are the multi-attribute utility theory based methods and the outranking methods. Traditionally both method families require exact values for technical parameters and criteria measurements, as well as for preferences expressed as weights. Often it is hard, if not impossible, to obtain exact values. Stochastic Multicriteria Acceptability Analysis (SMAA) is a family of methods designed to help in this type of situations where exact values are not available. Different variants of SMAA allow handling all types of MCDM problems. They support defining the model through uncertain, imprecise, or completely missing values. The methods are based on simulation that is applied to obtain descriptive indices characterizing the problem. In this thesis we present new advances in the SMAA methodology. We present and analyze algorithms for the SMAA-2 method and its extension to handle ordinal preferences. We then present an application of SMAA-2 to an area where MCDM models have not been applied before: planning elevator groups for high-rise buildings. Following this, we introduce two new methods to the family: SMAA-TRI that extends ELECTRE TRI for sorting problems with uncertain parameter values, and SMAA-III that extends ELECTRE III in a similar way. An efficient software implementing these two methods has been developed in conjunction with this work, and is briefly presented in this thesis. The thesis is closed with a comprehensive survey of SMAA methodology including a definition of a unified framework.
Resumo:
Tässä diplomityössä suunnitellaan yksivaiheisen turbiinin ylisooninen staattori ja alisooninen roottori, tulo-osa ja diffuusori. Työn alussa tarkastellaan aksiaaliturbiinin käyttökohteita ja teoriaa, jonka jälkeen esitetään suunnittelun perustana olevat menetelmät ja periaatteet. Perussuunnittelu tehdään Traupelinmenetelmällä WinAxtu 1.1 suunnitteluohjelmalla ja hyötysuhde arvioidaan lisäksiExcel-pohjaisella laskennalla. Ylisooninen staattori suunnitellaan perussuunnittelun tuloksiin perustuen, soveltamalla karakteristikoiden menetelmää suuttimen laajenevaan osaan ja pinta-alasuhteita suppenevaan osaan. Roottorin keskiviiva piirretään Sahlbergin menetelmällä ja siiven muoto määritetään A3K7 paksuusjakauman sekä tiheän siipihilan muotoilun periaatteita yhdistämällä. Tulo-osa suunnitellaan mahdollisimman jouhevaksi geometriatietojen ja kirjallisuuden esimerkkien mukaisesti. Lopuksi tulo-osaa mallinnetaan CFD-laskennalla. Diffuusori suunnitellaan käyttämällä soveltuvin osin kirjallisuudessa esitettyjätietoja, tulo-osan geometriaa ja CFD-laskentaa. Suunnittelutuloksia verrataan lopuksi kirjallisuudessa esitettyihin tuloksiin ja arvioidaan suunnittelun onnistumista sekä mahdollisia ongelmakohtia.