930 results for kernel estimates
Abstract:
Machine learning provides tools for the automated construction of predictive models in data-intensive areas of engineering and science. The family of regularized kernel methods has in recent years become one of the mainstream approaches to machine learning, owing to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance in a wide range of application domains. Historically, the problems of classification and regression have received the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is to learn, from a set of past observations, a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we recover the bipartite ranking problem, which corresponds to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings; examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction, and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods based on this approach has in the past proven to be challenging. Moreover, it is not clear which techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, or how these techniques can be implemented efficiently. The contributions of this thesis are as follows.
First, we develop RankRLS, a computationally efficient kernel method for learning to rank that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the best-established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results; Part II consists of the five original research articles that are the main contribution of this thesis.
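The pairwise criterion described in this abstract can be illustrated with a short sketch. The code below is not RankRLS or the thesis's leave-pair-out procedure; it only computes the fraction of incorrectly ordered pairs for a given set of scores, and shows that in the bipartite (binary label) case one minus this error equals the AUC. The function names and the convention of counting tied scores as half an error are assumptions made for illustration.

```python
from itertools import combinations


def pairwise_ranking_error(y_true, scores):
    """Fraction of comparable pairs that the scores order incorrectly.

    A pair is comparable when its true labels differ. A tie in the
    predicted scores is counted as half an error (an assumed convention).
    """
    errors = 0.0
    n_pairs = 0
    for i, j in combinations(range(len(y_true)), 2):
        if y_true[i] == y_true[j]:
            continue  # equal labels: no preferred order for this pair
        n_pairs += 1
        true_diff = y_true[i] - y_true[j]
        pred_diff = scores[i] - scores[j]
        if pred_diff == 0:
            errors += 0.5
        elif (true_diff > 0) != (pred_diff > 0):
            errors += 1.0
    if n_pairs == 0:
        raise ValueError("no comparable pairs in y_true")
    return errors / n_pairs


def auc(y_true, scores):
    """AUC for binary labels, computed as 1 - pairwise ranking error."""
    return 1.0 - pairwise_ranking_error(y_true, scores)


# Hypothetical labels and scores, for illustration only.
labels = [1, 1, 0, 0]
preds = [0.9, 0.4, 0.6, 0.1]
print(pairwise_ranking_error(labels, preds), auc(labels, preds))
```

With these illustrative values, one of the four positive-negative pairs is misordered, so the error is 0.25 and the AUC is 0.75.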
Abstract:
Eucalyptus plantations represent a short-term and cost-efficient alternative for sequestering carbon dioxide from the atmosphere. Despite the known potential of forest plantations of fast-growing species to store carbon in biomass, relatively few studies include precise estimates of the amount of carbon in these plantations. This study determined the carbon content in the stems, branches, leaves and roots of a clonal Eucalyptus grandis plantation in southeastern Brazil. We developed allometric equations to estimate the total amount of carbon and total biomass, and produced an estimate of the carbon stock at the stand level. Altogether, 23 sample trees were selected for aboveground biomass assessment. The roots of 9 of the 23 sampled trees were partially excavated to assess the belowground biomass at the single-tree level. Two models with DBH, H and DBH²H were tested. The average relative share of carbon content in the stem, branch, leaf and root compartments was 44.6%, 43.0%, 46.1% and 37.8%, respectively, which is smaller than the generic value commonly used (50%). The best-fit allometric equations to estimate the total amount of carbon and total biomass had DBH²H as the independent variable. The root-to-shoot ratio was relatively stable (C.V. = 27.5%), probably because the sub-sample was composed of clones. Total stand carbon stock in the Eucalyptus plantation was estimated to be 73.38 Mg C ha⁻¹, which is within the carbon stock range for Eucalyptus plantations.
Abstract:
Knowledge of natural water availability, which is characterized by low flows, is essential for the planning and management of water resources. One of the most widely used hydrological techniques to determine streamflow is regionalization, but extrapolating regionalization equations beyond the limits of the sample data is not recommended. This paper proposes a new method for reducing the overestimation errors associated with the extrapolation of regionalization equations for low flows. The method is based on the use of a threshold value for the maximum specific low flow discharge estimated at the gauging sites that are used in the regionalization. When a specific low flow estimated using the regionalization equation exceeds the threshold value, the low flow is obtained by multiplying the drainage area by the threshold value. This restriction imposes a physical limit on the low flow, which reduces the error of overestimating flows in regions of extrapolation. A case study was carried out in the Urucuia river basin, in Brazil, and the results showed that the regionalization equation performed well in reducing the risk of extrapolation.
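The threshold rule described in this abstract can be sketched in a few lines. This is a minimal illustration of the stated rule only, not the authors' implementation; the function name, argument names, units and numeric values are assumptions.

```python
def capped_low_flow(q_regional, drainage_area, max_specific_flow):
    """Limit a regionalized low-flow estimate by a specific-flow threshold.

    q_regional        -- low flow from the regionalization equation (assumed m^3/s)
    drainage_area     -- drainage area of the site (assumed km^2)
    max_specific_flow -- largest specific low flow observed at the gauging
                         sites used in the regionalization (assumed m^3/s per km^2)
    """
    if drainage_area <= 0:
        raise ValueError("drainage_area must be positive")
    specific_flow = q_regional / drainage_area
    if specific_flow > max_specific_flow:
        # Above the threshold: use drainage area times the threshold value.
        return max_specific_flow * drainage_area
    return q_regional


# Hypothetical values, for illustration only.
print(capped_low_flow(12.0, 1000.0, 0.01))  # estimate exceeds the cap
print(capped_low_flow(8.0, 1000.0, 0.01))   # estimate is kept as is
```

In the first call the specific flow (0.012) exceeds the threshold (0.010), so the estimate is replaced by the threshold times the area; in the second call the regionalized value is returned unchanged.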
Abstract:
Transportation of fluids is one of the most common and energy-intensive processes in the industrial and HVAC sectors. Pumping systems are frequently subject to engineering malpractice when dimensioned, which can lead to poor operational efficiency. Moreover, pump monitoring requires dedicated measuring equipment, which implies costly investments. Inefficient pump operation and improper maintenance can increase energy costs substantially and even lead to pump failure. A centrifugal pump is commonly driven by an induction motor. Driving the induction motor with a frequency converter can reduce energy consumption in pump drives and provide better control of a process. In addition, induction machine signals can be estimated by modern frequency converters, dispensing with the use of sensors. If the estimates are accurate enough, a pump can be modelled and integrated into the frequency converter control scheme. This opens the possibility of joint motor and pump monitoring and diagnostics, thereby allowing the detection of reliability-reducing operating states that can lead to additional maintenance costs. The goal of this work is to study the accuracy of the rotational speed, torque and shaft power estimates calculated by a frequency converter. Laboratory tests were performed to observe the behaviour of the estimates in both steady-state and transient operation. An induction machine driven by a vector-controlled frequency converter, coupled with another induction machine acting as a load, was used in the tests. The estimated quantities were obtained through the frequency converter’s Trend Recorder software. A high-precision HBM T12 torque-speed transducer was used to measure the actual values of the aforementioned variables. The effect of the flux optimization energy-saving feature on the estimate quality was also studied. A processing function was developed in MATLAB for comparison of the obtained data.
The results confirm that this particular converter provides sufficiently accurate estimates for pumping applications.
Abstract:
This work aimed to develop allometric equations for tree biomass estimation, and to determine the site biomass in different "cerrado" ecosystems. Destructive sampling in a "campo cerrado" (open savanna) was carried out at the Biological Reserve of Moji-Guaçu, State of São Paulo, southeastern Brazil. This "campo cerrado" grows under a tropical climate and on acid, low-nutrient soils. Sixty woody plants were cut to ground level and measurements of diameter, height and weight of leaves and stems were taken. We selected the best equations among the most commonly used mathematical relations according to R² values, significance, and standard error. Both diameter (D) and height (H) showed a good relationship with plant biomass, but the use of these two parameters together (DH and D²H) provided the best predictor variables. The best equations were linear, but power and exponential equations also showed high R² and significance. The applicability of these equations is discussed and biomass estimates are compared with other types of tropical savannas. Mineral mass was also estimated. "Cerrados" proved to be very important carbon reservoirs due to their great extent. In addition, the high rate of land-use change currently taking place in the "cerrado" biome may significantly affect the global carbon cycle.
Abstract:
Estimates of genetic and phenotypic parameters were obtained by using data from families of a recurrent selection program in rice. An experiment using population CNA-IRAT 4ME/1/1 was conducted at two locations (Lambari and Cambuquira) in the State of Minas Gerais, Brazil. At Lambari, families S0:2 and S0:3 were assessed during crop seasons 1992/1993 and 1993/1994, respectively. In the Cambuquira trial, only S0:3 families were tested in 1993/1994. The experimental design was a 10 x 10 lattice with three replications. The following traits were assessed: grain yield (GY), mean number of days to flowering (FL), plant height (PH), and the incidence of neck blast (NB) caused by Pyricularia grisea and grain staining (GS) caused by Drechslera oryzae. This population proved to be promising for recurrent selection, as it had high average yield and genetic variability. Heritability estimates obtained using variance components were generally greater than estimates of realized heritability, and heritability obtained by parent-offspring regression
Abstract:
Defatted Brazil nut kernel flour, a rich source of high-quality proteins, is presently being utilized in the formulation of animal feeds. One possible way to improve its utilization for human consumption is through improvement of its functional properties. In the present study, changes in some of the functional properties of Brazil nut kernel globulin were evaluated after acetylation at 58.6, 66.2 and 75.3% levels. The solubility of acetylated globulin was improved above pH 6.0 but was reduced in the pH range of 3.0-4.0. Water and oil absorption capacity, as well as viscosity, increased with increasing level of acetylation. The level of modification also influenced the emulsifying capacity: it decreased at pH 3.0, but increased at pH 7.0 and 9.0. The highest emulsion activity (approximately 62.2%) was observed at pH 3.0, followed by pH 9.0 and pH 7.0, and the lowest (about 11.8%) at pH 5.0. Emulsion stability followed a pattern similar to that of emulsion activity.
Abstract:
The purpose of this study was to investigate and model the water absorption process by corn kernels with different levels of mechanical damage. Corn kernels of the AG 1510 variety with a moisture content of 14.2 (% d.b.) were used. Different mechanical damage levels were indirectly evaluated by electrical conductivity measurements. The absorption process was based on the industrial corn wet milling process, in which the product was soaked in a solution of 0.2% sulfur dioxide (SO2) and 0.55% lactic acid (C3H6O3) in distilled water, under controlled temperatures of 40, 50, 60, and 70 °C and different mechanical damage levels. The Peleg model was used for the analysis and modeling of the water absorption process. The conclusion is that the structural changes caused by the mechanical damage to the corn kernels influenced the initial rates of water absorption, which were higher for the most damaged kernels, and also changed the equilibrium moisture contents of the kernels. The Peleg model fitted the experimental data well, with satisfactory values of the analyzed statistical parameters for all temperatures, regardless of the damage level of the corn kernels.
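The abstract does not state the model equation. A minimal sketch, assuming the standard two-parameter Peleg form M(t) = M0 + t / (k1 + k2 t), is given below. In that form the initial absorption rate is 1/k1 and the equilibrium moisture content is M0 + 1/k2, which are the two quantities the abstract reports as affected by mechanical damage. The function names, the linearized least-squares fit, and the values of k1, k2 and the soaking times are hypothetical; only the initial moisture content of 14.2 % d.b. comes from the abstract.

```python
def peleg_moisture(t, m0, k1, k2):
    """Moisture content at soaking time t under the assumed Peleg form."""
    return m0 + t / (k1 + k2 * t)


def fit_peleg(times, moistures, m0):
    """Estimate k1 and k2 by ordinary least squares on the linearized
    form t / (M - M0) = k1 + k2 * t (points at t = 0 are skipped)."""
    xs, ys = [], []
    for t, m in zip(times, moistures):
        if t > 0:
            xs.append(t)
            ys.append(t / (m - m0))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k2 = sxy / sxx               # slope of the linearized relation
    k1 = mean_y - k2 * mean_x    # intercept of the linearized relation
    return k1, k2


# Synthetic data generated from hypothetical constants, for illustration only.
m0 = 14.2                        # initial moisture, % d.b. (from the abstract)
true_k1, true_k2 = 0.05, 0.02    # hypothetical Peleg constants
times = [0.5, 1.0, 2.0, 4.0, 8.0]  # hypothetical soaking times (h)
data = [peleg_moisture(t, m0, true_k1, true_k2) for t in times]

k1, k2 = fit_peleg(times, data, m0)
initial_rate = 1.0 / k1          # initial absorption rate
equilibrium = m0 + 1.0 / k2      # equilibrium moisture content
print(k1, k2, initial_rate, equilibrium)
```

Fitting the linearized form keeps the sketch dependency-free; a nonlinear least-squares fit on the original equation would weight the observations differently and may be preferable for noisy measurements.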
Abstract:
Solid lipid particles have been investigated by food researchers due to their ability to enhance the incorporation and bioavailability of lipophilic bioactives in aqueous formulations. The objectives of this study were to evaluate the physicochemical stability and digestibility of lipid microparticles produced with tristearin and palm kernel oil. The motivation for conducting this study was the fact that mixing lipids can prevent the expulsion of the bioactive from the lipid core and enhance the digestibility of lipid structures. The lipid microparticles containing different palm kernel oil contents were stable after 60 days of storage according to the particle size and zeta potential data. Their calorimetric behavior indicated that they were composed of a very heterogeneous lipid matrix. Lipid microparticles were stable under various conditions of ionic strength, sugar concentration, temperature, and pH. Digestibility assays indicated no differences in the release of free fatty acids, which was approximately 30% in all analyses. The in vitro digestibility tests showed that the amount of palm kernel oil in the particles did not affect the percentage of lipolysis, probably due to the high amount of surfactants used and/or the solid state of the microparticles.
Abstract:
Clipping from a Town Council meeting at which estimates of the costs of Railway Line no. 1 and Line no. 2 were submitted by the office of Port Dalhousie and Thorold Railway. The estimate was submitted by S.D. Woodruff and George Rykert, president. There is also a disclaimer in which Calvin Phelps claims to have resigned as director of the Port Dalhousie and Thorold Railway when he discovered that the company had no intention to adhere to the original plan for building and running the road, Aug. 1854.
Abstract:
Chart of Port Dalhousie and Thorold Railway from St. Catharines to Thorold estimates. This document is slightly burnt and torn. Text is slightly affected. It is signed by S.D. Woodruff, Dec. 27, 1856.
Abstract:
Abstract of estimates, June 13, 1856.
Abstract:
Report by Jacob Misner on setting contracts for deepening and clearing ditches and estimates of quantities and costs of marsh drainage (3 ½ pages, handwritten). This is marked as a copy, July 14, 1855.
Abstract:
Estimates for marsh drainage sent to Dexter Deverardo from S.D. Woodruff for Alexander Cook and Andrew Mains, Nov. 30, 1855.