897 results for estimation and filtering
Abstract:
Machine learning provides tools for automated construction of predictive models in data-intensive areas of engineering and science. The family of regularized kernel methods has in recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is to learn, from a set of past observations, a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings; examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods based on this approach has in the past proven to be challenging. Moreover, it is not clear which techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how the techniques can be implemented efficiently. The contributions of this thesis are as follows.
First, we develop RankRLS, a computationally efficient kernel method for learning to rank that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the best-established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results; Part II consists of the five original research articles that are the main contribution of this thesis.
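The link between pairwise ranking and AUC mentioned in this abstract can be illustrated with a minimal sketch. It computes AUC as the fraction of positive-negative pairs that a scoring function orders correctly, counting ties as half. The function name `pairwise_auc` and the toy data are assumptions made for illustration; this is not the thesis's RankRLS or leave-pair-out code.

```python
from itertools import product


def pairwise_auc(scores, labels):
    """AUC as the fraction of correctly ordered (positive, negative) pairs.

    Ties between a positive and a negative score count as half a correct pair.
    Labels are assumed to be 1 (positive) or 0 (negative).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    correct = 0.0
    for p, n in product(positives, negatives):
        if p > n:
            correct += 1.0
        elif p == n:
            correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy example: three of the four positive-negative pairs are ordered correctly.
example_auc = pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The quadratic loop over pairs is kept for clarity; a rank-based formulation would give the same value with less work on large data sets.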
Abstract:
The determination of the volumetric water content of soils is an important factor in irrigation management. Among the indirect estimation methods, the time-domain reflectometry (TDR) technique has received significant attention. Like any other technique, it has advantages and disadvantages; its greatest disadvantages are the need for calibration and the high cost of acquisition. The main goal of this study was to establish a calibration model for the TDR equipment, Trase System Model 6050X1, to estimate the volumetric water content in a Distroferric Red Latosol. The calibration was carried out in a laboratory with disturbed samples of the soil under study, packed in PVC columns with a volume of 0.0078 m³. The TDR probes were handcrafted with three rods, each 0.20 m long. They were installed vertically in the soil columns, with five probes per column and sixteen columns. Weighings were carried out on a digital scale, while daily readings of the dielectric constant were obtained with the TDR equipment. The linear model θν = 0.0103 Ka + 0.1900 for estimating the volumetric water content showed an excellent coefficient of determination (0.93), enabling the use of the probes for indirect estimation of soil moisture.
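As a small worked example, the reported calibration line can be applied directly to a dielectric constant reading. The coefficients come from the abstract; the function names and the sample reading of Ka = 10 are assumptions for illustration, and the model should only be trusted within the range of the original calibration, which the abstract does not state.

```python
SLOPE = 0.0103      # coefficient of Ka in the reported linear model
INTERCEPT = 0.1900  # intercept of the reported linear model


def volumetric_water_content(ka: float) -> float:
    """Estimate volumetric water content from the dielectric constant Ka."""
    return SLOPE * ka + INTERCEPT


def dielectric_constant(theta_v: float) -> float:
    """Invert the calibration line: Ka expected for a given water content."""
    return (theta_v - INTERCEPT) / SLOPE


# Hypothetical TDR reading, chosen only to show the arithmetic:
# 0.0103 * 10 + 0.1900 = 0.293
theta = volumetric_water_content(10.0)
```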
Abstract:
This study investigates futures market efficiency and optimal hedge ratio estimation. First, cointegration between spot and futures prices is studied using the Johansen method, with two different model specifications. If prices are found to be cointegrated, restrictions on the cointegrating vector and adjustment coefficients are imposed to account for the unbiasedness, weak exogeneity and prediction hypotheses. Second, optimal hedge ratios are estimated using static OLS and the time-varying DVEC and CCC models. In-sample and out-of-sample results for one, two and five periods ahead are reported. The futures used in the thesis are the RTS index, the EUR/RUB exchange rate and Brent oil, traded on Futures and Options on RTS (FORTS). For the in-sample period, data points were acquired from the start of trading of each futures contract (RTS index from August 2005, EUR/RUB exchange rate from March 2009 and Brent oil from October 2008) until the end of May 2011. The out-of-sample period runs from the start of June 2011 until the end of December 2011. Our results indicate that all three asset pairs, spot and futures, are cointegrated. We found RTS index futures to be an unbiased predictor of the spot price, mixed evidence for the exchange rate, and no support for unbiasedness of Brent oil futures. Weak exogeneity results for all pairs indicated that the spot price leads the price discovery process. The prediction hypothesis (unbiasedness and weak exogeneity of futures) was rejected for all asset pairs. Variance reduction results varied between assets, in the range of 40-85 percent in-sample and 40-96 percent out-of-sample. Differences between models were found to be small, except for Brent oil, for which OLS clearly dominated. Out-of-sample results indicated exceptionally high variance reduction for the RTS index, approximately 95 percent.
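The static OLS hedge ratio and the variance reduction measure referred to in this abstract can be sketched as follows. The sketch assumes the textbook definitions: the hedge ratio is the slope of a regression of spot returns on futures returns, and variance reduction compares the hedged position with the unhedged spot position. The function names and the toy return series are hypothetical; the time-varying DVEC and CCC models are not covered.

```python
import numpy as np


def ols_hedge_ratio(spot_returns, futures_returns):
    """Static hedge ratio: cov(spot, futures) / var(futures)."""
    s = np.asarray(spot_returns, dtype=float)
    f = np.asarray(futures_returns, dtype=float)
    cov = np.cov(s, f, ddof=1)  # 2x2 sample covariance matrix
    return cov[0, 1] / cov[1, 1]


def variance_reduction(spot_returns, futures_returns, hedge_ratio):
    """1 - var(hedged position) / var(unhedged spot position)."""
    s = np.asarray(spot_returns, dtype=float)
    f = np.asarray(futures_returns, dtype=float)
    hedged = s - hedge_ratio * f
    return 1.0 - np.var(hedged, ddof=1) / np.var(s, ddof=1)


# Toy data (not from the study): spot moves exactly half as much as futures,
# so the hedge ratio is 0.5 and the hedge removes all of the variance.
futures = np.array([0.01, -0.02, 0.03, -0.01])
spot = 0.5 * futures
h = ols_hedge_ratio(spot, futures)
vr = variance_reduction(spot, futures, h)
```

For an out-of-sample evaluation of the kind the abstract describes, the ratio would be estimated on the in-sample window and `variance_reduction` evaluated on later data.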
Abstract:
Parameter estimation remains a challenge in many important applications. There is a need to develop methods that exploit the growing capabilities of modern computational systems. For this reason, different kinds of Evolutionary Algorithms are becoming an especially promising field of research. The main aim of this thesis is to explore theoretical aspects of a specific class of Evolutionary Algorithms, the Differential Evolution (DE) method, and to implement this algorithm as code capable of solving a wide range of problems. Matlab, a numerical computing environment provided by MathWorks Inc., has been used for this purpose. Our implementation empirically demonstrates the benefits of stochastic optimizers over deterministic optimizers for stochastic and chaotic problems. Furthermore, the advanced features of Differential Evolution are discussed and taken into account in the Matlab implementation. Toy test examples are presented to show the advantages and disadvantages of the additional aspects involved in extensions of the basic algorithm. Another aim of this work is to apply the DE approach to the parameter estimation problem of a system exhibiting chaotic behavior, with the well-known Lorenz system with a specific set of parameter values taken as an example. Finally, the DE approach to the estimation of chaotic dynamics is compared with the Ensemble prediction and parameter estimation system (EPPES) approach, which was recently proposed as a possible solution for similar problems.
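For readers unfamiliar with Differential Evolution, the basic scheme (often called DE/rand/1/bin) can be sketched in a few lines. This is a generic illustration written in Python rather than the Matlab implementation described in the abstract; the function name, the default settings (F = 0.8, CR = 0.9, population of 20) and the sphere test function are assumptions, and none of the thesis's advanced features are included.

```python
import numpy as np


def differential_evolution(objective, bounds, pop_size=20, F=0.8, CR=0.9,
                           generations=200, seed=0):
    """Minimize `objective` over box `bounds` with basic DE/rand/1/bin."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    lo, hi = bounds[:, 0], bounds[:, 1]
    dim = len(bounds)

    # Random initial population inside the box, and its objective values.
    pop = lo + rng.random((pop_size, dim)) * (hi - lo)
    cost = np.array([objective(x) for x in pop])

    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct members other than i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)

            # Binomial crossover; force at least one component from the mutant.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])

            # Greedy selection: keep the trial only if it is not worse.
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost

    best = int(np.argmin(cost))
    return pop[best], cost[best]


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return float(np.sum(x ** 2))


best_x, best_cost = differential_evolution(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

For a parameter estimation task such as the Lorenz example, `objective` would instead measure the mismatch between simulated and observed trajectories; that cost function is not specified in the abstract and is left out here.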
Abstract:
The thesis addresses image-based characterization of fibers in pulp suspension during the papermaking process. The papermaking industry is focusing on process control optimization and automation, which makes it possible to manufacture high-quality products in a resource-efficient way. As a part of process control, pulp suspension analysis makes it possible to predict and modify properties of the end product. This work is a part of the tree species identification task and focuses on the analysis of fiber parameters in the pulp suspension at the wet stage of paper production. The existing machine vision methods for pulp characterization were investigated, and a method exploiting direction-sensitive filtering, non-maximum suppression, hysteresis thresholding, tensor voting, and curve extraction from tensor maps was developed. Application of the method to microscopic grayscale pulp images made it possible to detect curves corresponding to fibers in the pulp image and to compute their morphological characteristics. The performance of the method was evaluated against manually produced ground truth data. The accuracy of fiber characteristic estimation for the acacia pulp images was found to be 84, 85, and 60% for length, width, and curvature, respectively.
Abstract:
This thesis presents a set of methods and models for estimation of iron and slag flows in the blast furnace hearth and taphole. The main focus was on predicting taphole flow patterns and estimating the effects of various taphole conditions on the drainage behavior of the blast furnace hearth. All models were based on a general understanding of the typical tap cycle of an industrial blast furnace. Some of the models were evaluated on short-term process data from the reference furnace. A computational fluid dynamics (CFD) model was built and applied to simulate the complicated hearth flows and thus to predict the regions of the hearth exposed to erosion under various operating conditions. Key boundary variables of the CFD model were provided by a simplified drainage model based on first principles. By examining the evolution of liquid outflow rates measured from the furnace studied, the drainage model was improved to include the effects of taphole diameter and length. The estimated slag delays showed good agreement with the observed ones. The liquid flows in the taphole were further studied using two different models, and the results of both models indicated that separated flow of iron and slag is more likely to occur in the taphole when the liquid outflow rates are comparable during tapping. The drainage process was simulated with an integrated model based on an overall balance analysis: the high in-furnace overpressure can compensate for the resistances induced by the liquid flows in the hearth and through the taphole. Finally, a multiphase CFD model including interfacial forces between immiscible liquids was developed, and both the actual iron-slag system and a laboratory-scale water-oil system were simulated. The model was demonstrated to be a useful tool for simulating hearth flows and for gaining understanding of the complex phenomena in the drainage of the blast furnace.
Abstract:
The recent emergence of low-cost RGB-D sensors has brought new opportunities for robotics by offering affordable devices that provide synchronized images with both color and depth information. In this thesis, recent work on pose estimation utilizing RGB-D sensors is reviewed. In addition, a pose recognition system for rigid objects using RGB-D data is implemented. The implementation uses half-edge primitives extracted from the RGB-D images for pose estimation. The system is based on the probabilistic object representation framework by Detry et al., which utilizes Nonparametric Belief Propagation for pose inference. Experiments are performed on household objects to evaluate the performance and robustness of the system.
Abstract:
More discussion is required on how and which types of biomass should be used to achieve a significant reduction in the carbon load released into the atmosphere in the short term. The energy sector is one of the largest greenhouse gas (GHG) emitters and thus its role in climate change mitigation is important. Replacing fossil fuels with biomass has been a simple way to reduce carbon emissions because the carbon bonded to biomass is considered carbon neutral. With this in mind, this thesis has the following objectives: (1) to study the significance of the different GHG emission sources related to energy production from peat and biomass, (2) to explore opportunities to develop more climate-friendly biomass energy options and (3) to discuss the importance of biogenic emissions of biomass systems. The discussion on biogenic carbon and other GHG emissions comprises four case studies, of which two consider peat utilization, one forest biomass and one cultivated biomass. Various biomass types (peat, pine logs and forest residues, palm oil, rapeseed oil and jatropha oil) are used as examples to demonstrate the importance of biogenic carbon to life cycle GHG emissions. The biogenic carbon emissions of biomass are defined as the difference in the carbon stock between the utilization and the non-utilization scenarios of biomass. Forestry-drained peatlands were studied by using the high emission values of the peatland types in question to discuss the emission reduction potential of the peatlands. The results are presented in terms of global warming potential (GWP) values. Based on the results, the climate impact of peat production can be reduced by selecting high-emission-level peatlands for peat production. The comparison of the two different types of forest biomass in integrated ethanol production in a pulp mill shows that the type of forest biomass affects the biogenic carbon emissions of biofuel production.
The assessment of cultivated biomasses demonstrates that several selections made in the production chain significantly affect the GHG emissions of biofuels. The emissions caused by biofuel can exceed the emissions from fossil-based fuels in the short term if biomass is in part consumed in the process itself and does not end up in the final product. Including biogenic carbon and other land use carbon emissions into the carbon footprint calculations of biofuel reveals the importance of the time frame and of the efficiency of biomass carbon content utilization. As regards the climate impact of biomass energy use, the net impact on carbon stocks (in organic matter of soils and biomass), compared to the impact of the replaced energy source, is the key issue. Promoting renewable biomass regardless of biogenic GHG emissions can increase GHG emissions in the short term and also possibly in the long term.
Abstract:
The power rating of wind turbines is constantly increasing; however, keeping the voltage rating at the low-voltage level results in high kilo-ampere currents. An alternative for increasing the power levels without raising the voltage level is provided by multiphase machines. Multiphase machines are used for instance in ship propulsion systems, aerospace applications, electric vehicles, and in other high-power applications including wind energy conversion systems. A machine model in an appropriate reference frame is required in order to design an efficient control for the electric drive. Modeling of multiphase machines poses a challenge because of the mutual couplings between the phases. Mutual couplings degrade the drive performance unless they are properly considered. In certain multiphase machines there is also a problem of high current harmonics, which are easily generated because of the small current path impedance of the harmonic components. However, multiphase machines provide special characteristics compared with their three-phase counterparts: multiphase machines have a better fault tolerance, and are thus more robust. In addition, the controlled power can be divided among more inverter legs by increasing the number of phases. Moreover, the torque pulsation can be decreased and the harmonic frequency of the torque ripple increased by an appropriate multiphase configuration. By increasing the number of phases it is also possible to obtain more torque per RMS ampere for the same volume, and thus to increase the power density. In this doctoral thesis, a decoupled d–q model of double-star permanent-magnet (PM) synchronous machines is derived based on the inductance matrix diagonalization. The double-star machine is a special type of multiphase machine. Its armature consists of two three-phase winding sets, which are commonly displaced by 30 electrical degrees. In this study, the displacement angle between the sets is considered a parameter.
The diagonalization of the inductance matrix results in a simplified model structure, in which the mutual couplings between the reference frames are eliminated. Moreover, the current harmonics are mapped into a reference frame in which they can be easily controlled. The work also presents methods to determine the machine inductances by a finite-element analysis and by voltage-source inverters on-site. The derived model is validated by experimental results obtained with an example double-star interior PM (IPM) synchronous machine having the sets displaced by 30 electrical degrees. The derived transformation, and consequently the decoupled d–q machine model, are shown to model the behavior of an actual machine with acceptable accuracy. Thus, the proposed model is suitable for the model-based control design of electric drives consisting of double-star IPM synchronous machines.
Abstract:
Growing concerns about toxicity and the development of resistance against synthetic herbicides have prompted the search for alternative weed management approaches. Allelopathy has gained support as a potential approach to sustainable weed management. Aqueous extracts of six plant species (sunflower, rice, mulberry, maize, brassica and sorghum) in different combinations, alone or in mixture with a 75% reduced dose of herbicides, were evaluated for two consecutive years under field conditions. A weedy check and S-metolachlor with atrazine (pre-emergence) and atrazine alone (post-emergence) at recommended rates were included for comparison. Weed dynamics, maize growth indices and yield were assessed following standard procedures. All aqueous plant extract combinations suppressed weed growth and biomass. Moreover, the suppressive effect was more pronounced when aqueous plant extracts were supplemented with reduced doses of herbicides. The brassica-sunflower-sorghum combination suppressed weeds by 74-80, 78-70, 65-68% during both years of study, which was similar to S-metolachlor with a half dose of atrazine and to a full dose of atrazine alone. Crop growth rate and dry matter accumulation attained peak values of 32.68 and 1,502 g m-2 d-1 for the brassica-sunflower-sorghum combination at 60 and 75 days after sowing. Curve-fitting regression for growth and yield traits predicted a strong positive correlation with grain yield and a negative correlation with weed dry biomass under allelopathic weed management in maize.
Abstract:
We have studied the metabolism of diglycine and triglycine in the isolated non-filtering rat kidney. Kidneys from adult male Wistar Kyoto rats weighing 250-350 g were perfused with Krebs-Henseleit solution containing either 1 mM diglycine or triglycine. The analysis of the peptide residues and their components was performed using an amino acid microanalyzer utilizing ion exchange chromatography. Diglycine was degraded to a final concentration of 0.09 mM after 120 min (91%); this degradation occurred predominantly during the first hour, with a 56% reduction of the initial concentration. The metabolism of triglycine occurred similarly, with a final concentration of 0.18 mM (82%); during the first hour there was a 67% reduction of the initial concentration of the tripeptide. Both peptides produced glycine in increasing concentrations, but there was a slightly lower recovery of glycine, suggesting its utilization by the kidney as fuel. The hydrolysis of triglycine also produced diglycine, which was also hydrolyzed to glycine. The results of the present study show the existence of functional endothelial or contraluminal membrane peptidases which may be important during parenteral nutrition.
Abstract:
The use of limiting dilution assay (LDA) for assessing the frequency of responders in a cell population is a method extensively used by immunologists. A series of studies addressing the statistical method of choice in an LDA have been published. However, none of these studies has addressed the point of how many wells should be employed in a given assay. The objective of this study was to demonstrate how a researcher can predict the number of wells that should be employed in order to obtain results with a given accuracy, and, therefore, to help in choosing a better experimental design to fulfill one's expectations. We present the rationale underlying the expected relative error computation based on simple binomial distributions. A series of simulated in machina experiments were performed to test the validity of the a priori computation of expected errors, thus confirming the predictions. The step-by-step procedure of the relative error estimation is given. We also discuss the constraints under which an LDA must be performed.
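One way to see how the number of wells drives the expected error is a first-order calculation for a single dilution. The sketch below assumes the usual single-hit Poisson model, in which a well containing c cells is negative with probability exp(-f c) for responder frequency f, and it uses a delta-method approximation on the binomial count of negative wells. These modelling choices and all function names are assumptions made for illustration; the study's own step-by-step procedure is not reproduced here.

```python
import math


def negative_well_probability(frequency, cells_per_well):
    """Single-hit Poisson model: probability that a well has no responder."""
    return math.exp(-frequency * cells_per_well)


def expected_relative_error(frequency, cells_per_well, n_wells):
    """Approximate relative standard error of the frequency estimate.

    The estimate is f_hat = -ln(negative fraction) / cells_per_well, with the
    number of negative wells following Binomial(n_wells, p0). To first order,
    SE(f_hat) / f = sqrt((1 - p0) / (n_wells * p0)) / (f * cells_per_well).
    """
    p0 = negative_well_probability(frequency, cells_per_well)
    mean_responders = frequency * cells_per_well
    return math.sqrt((1.0 - p0) / (n_wells * p0)) / mean_responders


def wells_needed(frequency, cells_per_well, target_relative_error):
    """Smallest well count whose approximate relative error meets the target."""
    p0 = negative_well_probability(frequency, cells_per_well)
    mean_responders = frequency * cells_per_well
    n = (1.0 - p0) / (p0 * (mean_responders * target_relative_error) ** 2)
    return math.ceil(n)


# Hypothetical design: 1 responder per 1000 cells, 1000 cells per well.
n_for_10_percent = wells_needed(0.001, 1000, 0.10)
```

The approximation degrades when almost all wells are expected to be positive or negative, which is consistent with the abstract's remark that an LDA must be performed under certain constraints.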
Abstract:
The aim of this work is to apply approximate Bayesian computation in combination with Markov chain Monte Carlo methods in order to estimate the parameters of tuberculosis transmission. The methods are applied to San Francisco data and the results are compared with the outcomes of previous works. Moreover, a methodological idea aimed at reducing computational time is also described. Although this approach is shown to work appropriately, further analysis is needed to understand and test its behaviour in different cases. Some related suggestions for its further enhancement are described in the corresponding chapter.
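The general idea of combining approximate Bayesian computation with a Markov chain can be shown on a toy problem. The sketch below assumes a uniform prior and a symmetric random-walk proposal, so a proposed value is accepted whenever it lies inside the prior range and data simulated from it have a summary statistic within `eps` of the observed one. The toy model (estimating the mean of a normal sample), the function names and all settings are assumptions; the tuberculosis transmission model and the San Francisco data are not represented.

```python
import numpy as np


def abc_mcmc(observed, simulate, summary, prior_bounds, eps, n_steps,
             step_sd, theta0, seed=0):
    """ABC-MCMC for one parameter with a uniform prior and Gaussian proposal."""
    rng = np.random.default_rng(seed)
    s_obs = summary(observed)
    lo, hi = prior_bounds
    theta = theta0
    chain = np.empty(n_steps)

    for t in range(n_steps):
        proposal = theta + rng.normal(0.0, step_sd)
        # Outside the uniform prior the prior density is zero: reject.
        if lo <= proposal <= hi:
            s_sim = summary(simulate(proposal, rng))
            # Accept only if simulated and observed summaries are close.
            if abs(s_sim - s_obs) <= eps:
                theta = proposal
        chain[t] = theta  # on rejection the chain repeats its current value
    return chain


def simulate_normal(theta, rng, size=50):
    """Toy simulator: a sample from a normal with mean theta and unit variance."""
    return rng.normal(theta, 1.0, size=size)


# Toy usage: recover the mean of synthetic data generated with mean 3.
data = np.random.default_rng(1).normal(3.0, 1.0, size=50)
samples = abc_mcmc(data, simulate_normal, np.mean, (-10.0, 10.0),
                   eps=0.2, n_steps=2000, step_sd=0.5,
                   theta0=float(np.mean(data)))
```

Starting the chain at a value already compatible with the data avoids a long initial stretch of rejections; a smaller `eps` gives a closer approximation to the posterior at the cost of a lower acceptance rate.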
Abstract:
Fluid handling systems such as pump and fan systems are found to have a significant potential for energy efficiency improvements. To deliver the energy saving potential, there is a need for easily implementable methods to monitor the system output. This is because information is needed to identify inefficient operation of the fluid handling system and to control the output of the pumping system according to process needs. Model-based pump or fan monitoring methods implemented in variable speed drives have proven to be able to give information on the system output without additional metering; however, the current model-based methods may not be usable or sufficiently accurate in the whole operation range of the fluid handling device. To apply model-based system monitoring in a wider selection of systems and to improve the accuracy of the monitoring, this paper proposes a new method for pump and fan output monitoring with variable-speed drives. The method uses a combination of already known operating point estimation methods. Laboratory measurements are used to verify the benefits and applicability of the improved estimation method, and the new method is compared with five previously introduced model-based estimation methods. According to the laboratory measurements, the new estimation method is the most accurate and reliable of the model-based estimation methods.