953 results for Linear multivariate methods
Abstract:
Forest inventories are used to estimate forest characteristics and the condition of forest for many different applications: operational tree logging for forest industry, forest health state estimation, carbon balance estimation, land-cover and land use analysis in order to avoid forest degradation etc. Recent inventory methods are strongly based on remote sensing data combined with field sample measurements, which are used to define estimates covering the whole area of interest. Remote sensing data from satellites, aerial photographs or aerial laser scannings are used, depending on the scale of inventory. To be applicable in operational use, forest inventory methods need to be easily adjusted to local conditions of the study area at hand. All the data handling and parameter tuning should be objective and automated as much as possible. The methods also need to be robust when applied to different forest types. Since there generally are no extensive direct physical models connecting the remote sensing data from different sources to the forest parameters that are estimated, mathematical estimation models are of "black-box" type, connecting the independent auxiliary data to dependent response data with linear or nonlinear arbitrary models. To avoid redundant complexity and over-fitting of the model, which is based on up to hundreds of possibly collinear variables extracted from the auxiliary data, variable selection is needed. To connect the auxiliary data to the inventory parameters that are estimated, field work must be performed. In larger study areas with dense forests, field work is expensive, and should therefore be minimized. To get cost-efficient inventories, field work could partly be replaced with information from formerly measured sites, databases. The work in this thesis is devoted to the development of automated, adaptive computation methods for aerial forest inventory. 
The mathematical model parameter definition steps are automated, and the cost-efficiency is improved by setting up a procedure that utilizes databases in the estimation of new area characteristics.
Abstract:
Nutritional status of eight 1.0 and 4.7 years old clones of Eucalyptus grandis, cultivated in a medium textured Ustults - US - and a Quartzipsamments - PS - soils, in Lençóis Paulista, São Paulo, were evaluated by the Diagnosis and Recommendation Integrated System (DRIS) and Critical Level (CL) methods. Based on multivariate discriminant analysis, the DRIS indices described the nutritional status of trees better in relation to tree age and soil type than in relation to nutrient composition. Spearman's correlation coefficients showed statistically significant relationships between volumetric tree growth and nutrients when applying DRIS indices or foliar nutrient concentrations. However, the DRIS indices indicated a lower number of trees with nutritional deficiencies, in relation to the CL method. According to the CL method, P, S, and Ca were deficient in the majority of the soils and tree age categories. By the DRIS method, Ca was the only deficient nutrient in PS soils, and appeared to be particularly limited in one-year-old trees. In conclusion, the DRIS method was more efficient than the CL method in evaluating the nutritional status of eucalyptus trees.
Abstract:
Machine learning provides tools for automated construction of predictive models in data intensive areas of engineering and science. The family of regularized kernel methods have in the recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is from a set of past observations to learn a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings, examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods, based on this approach, has in the past proven to be challenging. Moreover, it is not clear what techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how the techniques can be implemented efficiently. The contributions of this thesis are as follows. 
First, we develop RankRLS, a computationally efficient kernel method for learning to rank, that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the most well established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results, Part II consists of the five original research articles that are the main contribution of this thesis.
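The link between pairwise ranking and AUC described above can be sketched in a few lines. This is a minimal illustration, not the thesis's RankRLS implementation: `pairwise_auc` and `pairwise_squared_loss` are hypothetical helper names, and the loss shown is one common unregularized form of a pairwise least-squares criterion, assumed here because the abstract does not give the exact definition.

```python
from itertools import combinations, product


def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ordered correctly.

    Ties in score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p, n in product(pos, neg):
        if p > n:
            correct += 1.0
        elif p == n:
            correct += 0.5
    return correct / (len(pos) * len(neg))


def pairwise_squared_loss(scores, targets):
    """Assumed form: sum over unordered pairs of the squared difference
    between the target gap and the predicted score gap (no regularizer)."""
    loss = 0.0
    for i, j in combinations(range(len(scores)), 2):
        diff = (targets[i] - targets[j]) - (scores[i] - scores[j])
        loss += diff * diff
    return loss
```

A perfect ranking gives an AUC of 1.0 and a pairwise loss of 0 when the score gaps equal the target gaps; the quadratic cost in the number of examples is what makes efficient algorithms for this setting non-trivial.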
Abstract:
The spatial variability of soil resistance to penetration (RSP) was studied in the 0.0-0.1 m, 0.1-0.2 m and 0.2-0.3 m depth layers using univariate statistical methods, i.e., traditional geostatistics, producing thematic maps by ordinary kriging for each layer. The RSP in the 0.2-0.3 m layer was also analyzed with a spatial linear model (SLM) that used the 0.0-0.1 m and 0.1-0.2 m layers as covariables, yielding an estimation model and a thematic map by universal kriging. The thematic maps of the RSP in the 0.2-0.3 m layer constructed by both methods were compared using accuracy measures obtained from the error matrix and the confusion matrix. The thematic maps were similar, and all of them showed that the RSP is higher in the northern region.
Abstract:
Evapotranspiration is the loss of water from vegetated soil through evaporation and transpiration, and it may be estimated by various empirical methods. This study evaluated the performance of the following methods: Blaney-Criddle, Jensen-Haise, Linacre, Solar Radiation, Hargreaves-Samani, Makkink, Thornthwaite, Camargo, Priestley-Taylor and Original Penman in estimating potential evapotranspiration, compared with the standard Penman-Monteith method (FAO56), under the climatic conditions of Uberaba, state of Minas Gerais, Brazil. A 21-year set of monthly data (1990 to 2010) was used, covering the climatic elements temperature, relative humidity, wind speed and insolation. The empirical methods for estimating reference evapotranspiration were compared with the standard method using linear regression, simple statistical analysis, the Willmott agreement index (d) and the performance index (c). The Makkink and Camargo methods showed the best performance, with "c" values of 0.75 and 0.66, respectively. The Hargreaves-Samani method presented a better linear relation with the standard method, with a correlation coefficient (r) of 0.88.
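The agreement statistics named above can be sketched as follows. This assumes the usual definitions: Willmott's d = 1 - Σ(E - O)² / Σ(|E - Ō| + |O - Ō|)², with O the standard-method values and E the estimates, and the performance index taken as c = r × d. The abstract does not spell these formulas out, so the function names and the exact forms are assumptions.

```python
import math


def pearson_r(obs, est):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_e = sum(est) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_e = sum((e - mean_e) ** 2 for e in est)
    return cov / math.sqrt(var_o * var_e)


def willmott_d(obs, est):
    """Willmott agreement index (assumed original 1981 form)."""
    mean_o = sum(obs) / len(obs)
    num = sum((e - o) ** 2 for o, e in zip(obs, est))
    den = sum((abs(e - mean_o) + abs(o - mean_o)) ** 2 for o, e in zip(obs, est))
    return 1.0 - num / den


def performance_c(obs, est):
    """Performance index, assumed to be the product r * d."""
    return pearson_r(obs, est) * willmott_d(obs, est)
```

With this form, an estimate that tracks the standard perfectly in correlation but carries a constant bias still scores r = 1 while d, and therefore c, drops below 1, which is why the index is reported alongside r.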
Abstract:
One approach to verifying the adequacy of methods for estimating reference evapotranspiration (ET0) is comparison with the Penman-Monteith method, recommended by the Food and Agriculture Organization of the United Nations (FAO) as the standard method for estimating ET0. This study aimed to compare three methods for estimating ET0, Makkink (MK), Hargreaves (HG) and Solar Radiation (RS), with Penman-Monteith (PM). For this purpose, we used daily data of global solar radiation, air temperature, relative humidity and wind speed for the year 2010, obtained from the automatic meteorological station of the National Institute of Meteorology (latitude 18° 91' 66" S, longitude 48° 25' 05" W, altitude 869 m), located on the campus of the Federal University of Uberlandia - MG, Brazil. The results for the period were analyzed on a daily basis using regression analysis with the linear model y = ax, where the dependent variable was the Penman-Monteith estimate and the independent variable was the ET0 estimated by the method under evaluation. A methodology was also used to check the influence of the standard deviation of daily ET0 on the comparison of methods. The evaluation indicated that the Solar Radiation and Penman-Monteith methods cannot be compared, whereas the Hargreaves method gave the most efficient fit for estimating ET0.
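For the zero-intercept model y = ax, the least-squares coefficient has the closed form a = Σxy / Σx². The sketch below illustrates that formula only; `slope_through_origin` is a hypothetical name, and the numbers in the usage note are made up, not data from the study.

```python
def slope_through_origin(x, y):
    """Least-squares slope a for the model y = a * x (no intercept).

    x: ET0 values from the method under evaluation
    y: ET0 values from the reference method
    """
    sum_xx = sum(v * v for v in x)
    if sum_xx == 0:
        raise ValueError("x must contain at least one non-zero value")
    sum_xy = sum(a * b for a, b in zip(x, y))
    return sum_xy / sum_xx
```

For example, `slope_through_origin([1, 2, 3], [2, 4, 6])` gives 2.0; a value of a close to 1 would indicate that the evaluated method neither over- nor underestimates the reference on average.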
Abstract:
Singular Value Decomposition (SVD), Principal Component Analysis (PCA) and Multiple Linear Regression (MLR) are some of the mathematical preliminaries that are discussed prior to explaining PLS and PCR models. Both PLS and PCR are applied to real spectral data and their differences and similarities are discussed in this thesis. The challenge lies in establishing the optimum number of components to be included in either of the models, but this has been overcome by using various diagnostic tools suggested in this thesis. Correspondence analysis (CA) and PLS were applied to ecological data. The idea of CA was to correlate the macrophyte species and lakes. The differences between the PLS model for ecological data and the PLS model for spectral data are noted and explained in this thesis.
Abstract:
The purpose of this thesis is twofold. The first and major part is devoted to sensitivity analysis of various discrete optimization problems, while the second part addresses methods applied for calculating measures of solution stability and solving multicriteria discrete optimization problems. Despite numerous approaches to stability analysis of discrete optimization problems, two major directions can be singled out: quantitative and qualitative. Qualitative sensitivity analysis is conducted for multicriteria discrete optimization problems with minisum, minimax and minimin partial criteria. The main results obtained here are necessary and sufficient conditions for different stability types of optimal solutions (or a set of optimal solutions) of the considered problems. Within the framework of the quantitative direction, various measures of solution stability are investigated. A formula for a quantitative characteristic called the stability radius is obtained for the generalized equilibrium situation invariant to changes of game parameters in the case of the Hölder metric. The quality of the problem solution can also be described in terms of robustness analysis. In this work the concepts of accuracy and robustness tolerances are presented for a strategic game with a finite number of players where initial coefficients (costs) of linear payoff functions are subject to perturbations. Investigation of the stability radius also aims to devise methods for its calculation. A new metaheuristic approach is derived for calculating the stability radius of an optimal solution to the shortest path problem. The main advantage of the developed method is that it is potentially applicable to calculating stability radii of NP-hard problems. The last chapter of the thesis focuses on deriving innovative methods based on an interactive optimization approach for solving multicriteria combinatorial optimization problems. 
The key idea of the proposed approach is to utilize a parameterized achievement scalarizing function for solution calculation and to direct the interactive procedure by changing the weighting coefficients of this function. In order to illustrate the introduced ideas, a decision-making process is simulated for a three-objective median location problem. The concepts, models, and ideas collected and analyzed in this thesis provide good and relevant grounds for developing more complicated and integrated models of postoptimal analysis and for solving the most computationally challenging problems related to it.
Abstract:
PURPOSE: To identify the factors associated with weight retention after pregnancy. METHODS: A cohort study was performed with 145 women receiving maternity care at a hospital in Caxias do Sul, Rio Grande do Sul, Brazil, aged 19 to 45 years, between weeks 38 and 42 of pregnancy. The patients were evaluated at one month, three months, and six months after delivery. Student's t-test or one-way analysis of variance (ANOVA) was used to compare groups, as indicated; correlations were assessed with Pearson's and Spearman's tests, as indicated; to identify and evaluate confounders independently associated with total weight loss, a multivariate linear regression analysis was performed and statistical significance was set at p≤0.05. RESULTS: There was a significant positive association of total weight gain, and a negative association of physical exercise during pregnancy, with total weight loss. Higher parity, inter-pregnancy interval, calorie intake, pre-pregnancy body mass index (BMI), weight gain related to pre-pregnancy BMI, presence and severity of depression, and lack of exclusive breastfeeding were directly associated with lower weight loss. Among nominal variables, level of education and marital status were significantly associated with total weight loss. CONCLUSION: In the present study, lower weight retention in the postpartum period was associated with higher educational attainment and with being married. Normal or below-normal pre-pregnancy BMI, physical activity and adequate weight gain during pregnancy, lower parity, exclusive breastfeeding for a longer period, appropriate or low calorie intake, and absence of depression were also determinants of reduced weight retention.
Abstract:
In this article a two-dimensional transient boundary element formulation based on the mass matrix approach is discussed. The implicit formulation of the method to deal with elastoplastic analysis is considered, as well as the way to deal with viscous damping effects. The time integration processes are based on the Newmark rho and Houbolt methods, while the domain integrals for mass, elastoplastic and damping effects are carried out by the well-known cell approximation technique. The boundary element algebraic relations are also coupled with finite element frame relations to solve stiffened domains. Some examples to illustrate the accuracy and efficiency of the proposed formulation are also presented.
Abstract:
Linear prediction is one of the established numerical methods of signal processing. In the field of optical spectroscopy it is used mainly for extrapolating known parts of an optical signal in order to obtain a longer one or to deduce missing signal samples. The former is needed particularly when narrowing spectral lines for the purpose of spectral information extraction. In the present paper, coherent anti-Stokes Raman scattering (CARS) spectra were investigated. The spectra were significantly distorted by the presence of a nonlinear nonresonant background. In addition, line shapes were far from Gaussian/Lorentz profiles. To overcome these disadvantages, the maximum entropy method (MEM) for phase spectrum retrieval was used. The resulting broad MEM spectra then underwent linear prediction analysis in order to be narrowed.
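The extrapolation idea behind linear prediction can be illustrated with a deliberately small sketch: fit coefficients so that each sample is predicted from the previous ones, then run the recursion forward. The version below is restricted to order 2 and uses a plain least-squares fit; the prediction order, the fitting method and the function names are assumptions for illustration and are not taken from the paper.

```python
import math


def fit_lp2(signal):
    """Least-squares fit of x[n] ~ a1 * x[n-1] + a2 * x[n-2]."""
    r11 = r12 = r22 = b1 = b2 = 0.0
    for n in range(2, len(signal)):
        x0, x1, x2 = signal[n], signal[n - 1], signal[n - 2]
        r11 += x1 * x1
        r12 += x1 * x2
        r22 += x2 * x2
        b1 += x0 * x1
        b2 += x0 * x2
    det = r11 * r22 - r12 * r12
    if det == 0:
        raise ValueError("normal equations are singular for this signal")
    # Solve the 2x2 normal equations by Cramer's rule.
    a1 = (b1 * r22 - b2 * r12) / det
    a2 = (b2 * r11 - b1 * r12) / det
    return a1, a2


def extrapolate_lp2(signal, extra):
    """Append `extra` predicted samples to a copy of the known signal."""
    a1, a2 = fit_lp2(signal)
    out = list(signal)
    for _ in range(extra):
        out.append(a1 * out[-1] + a2 * out[-2])
    return out


# Usage: a pure cosine obeys an exact order-2 recursion, so it extrapolates well.
known = [math.cos(0.3 * n) for n in range(40)]
longer = extrapolate_lp2(known, 10)
```

A longer time-domain signal corresponds to narrower lines after transformation to the spectral domain, which is the motivation given in the abstract; real spectra with several damped components would need a higher order than this sketch uses.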
Abstract:
Baroreflex sensitivity was studied in the same group of conscious rats using vasoactive drugs (phenylephrine and sodium nitroprusside) administered by three different approaches: 1) bolus injection, 2) steady-state (blood pressure (BP) changes produced in steps), 3) ramp infusion (30 s, brief infusion). The heart rate (HR) responses were evaluated by the mean index (mean ratio of all HR changes and mean arterial pressure (MAP) changes), by linear regression and by the logistic method (maximum gain of the sigmoid curve by a logistic function). The experiments were performed on three consecutive days. Basal MAP and resting HR were similar on all days of the study. Bradycardic responses evaluated by the mean index (-1.5 ± 0.2, -2.1 ± 0.2 and -1.6 ± 0.2 bpm/mmHg) and linear regression (-1.8 ± 0.3, -1.4 ± 0.3 and -1.7 ± 0.2 bpm/mmHg) were similar for all three approaches used to change blood pressure. The tachycardic responses to decreases of MAP were similar when evaluated by linear regression (-3.9 ± 0.8, -2.1 ± 0.7 and -3.8 ± 0.4 bpm/mmHg). However, the tachycardic mean index (-3.1 ± 0.4, -6.6 ± 1 and -3.6 ± 0.5 bpm/mmHg) was higher when assessed by the steady-state method. The average gain evaluated by logistic function (-3.5 ± 0.6, -7.6 ± 1.3 and -3.8 ± 0.4 bpm/mmHg) was similar to the reflex tachycardic values, but different from the bradycardic values. Since different ways to change BP may alter the afferent baroreceptor function, the MAP changes obtained during short periods of time (up to 30 s: bolus and ramp infusion) are more appropriate to prevent the acute resetting. Assessment of the baroreflex sensitivity by mean index and linear regression permits a separate analysis of gain for reflex bradycardia and reflex tachycardia. Although two values of baroreflex sensitivity cannot be evaluated by a single symmetric logistic function, this method has the advantage of better comparing the baroreflex sensitivity of animals with different basal blood pressures.
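The three gain measures compared above can be sketched as follows. The mean index follows the abstract's description (mean of the HR change to MAP change ratios). The logistic parametrization HR = P4 + P1 / (1 + exp(P2 (MAP - P3))) is an assumption, since the abstract does not give the fitted equation; under that form the slope is steepest at MAP = P3 and equals -P1 P2 / 4. All function names are hypothetical.

```python
def mean_index(delta_hr, delta_map):
    """Mean of the individual HR-change / MAP-change ratios (bpm/mmHg)."""
    ratios = [h / p for h, p in zip(delta_hr, delta_map)]
    return sum(ratios) / len(ratios)


def regression_slope(delta_map, delta_hr):
    """Ordinary least-squares slope of HR change on MAP change (bpm/mmHg)."""
    n = len(delta_map)
    mean_p = sum(delta_map) / n
    mean_h = sum(delta_hr) / n
    sxy = sum((p - mean_p) * (h - mean_h) for p, h in zip(delta_map, delta_hr))
    sxx = sum((p - mean_p) ** 2 for p in delta_map)
    return sxy / sxx


def logistic_max_gain(p1, p2):
    """Maximum slope of the assumed sigmoid
    HR = P4 + P1 / (1 + exp(P2 * (MAP - P3))), reached at MAP = P3."""
    return -p1 * p2 / 4.0
```

The mean index and the regression slope can be computed separately for pressure rises and pressure falls, whereas a single symmetric sigmoid yields one maximum gain for the whole curve, which matches the trade-off the abstract describes.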
Abstract:
Several methods are used to estimate the anaerobic threshold (AT) during exercise. The aim of the present study was to compare the AT obtained by a graphic visual method for the estimate of ventilatory and metabolic variables (gold standard) with that obtained by a bi-segmental linear regression mathematical model of Hinkley's algorithm applied to heart rate (HR) and carbon dioxide output (VCO2) data. Thirteen young (24 ± 2.63 years old) and 16 postmenopausal (57 ± 4.79 years old) healthy and sedentary women underwent a continuous incremental ergospirometric test on an electromagnetically braked cycle ergometer with 10 to 20 W/min increases until physical exhaustion. The ventilatory variables were recorded breath-by-breath and HR was obtained beat-by-beat in real time. Data were analyzed by the nonparametric Friedman test and Spearman correlation test with the level of significance set at 5%. Power output (W), HR (bpm), oxygen uptake (VO2; mL kg⁻¹ min⁻¹), VO2 (mL/min), VCO2 (mL/min), and minute ventilation (VE; L/min) data observed at the AT level were similar for both methods and groups studied (P > 0.05). The VO2 (mL kg⁻¹ min⁻¹) data showed significant correlation (P < 0.05) between the gold standard method and the mathematical model when applied to HR (r_s = 0.75) and VCO2 (r_s = 0.78) data for the subjects as a whole (N = 29). The proposed mathematical method for the detection of changes in response patterns of VCO2 and HR was adequate and promising for AT detection in young and middle-aged women, representing a semi-automatic, non-invasive and objective AT measurement.
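The idea of locating a change in response pattern with two line segments can be sketched with an exhaustive search: try every admissible split point, fit a straight line to each side, and keep the split with the smallest total squared error. This is a plain two-segment least-squares search and is not Hinkley's algorithm as used in the study; the function names and the `min_points` parameter are assumptions.

```python
def fit_line_sse(x, y):
    """Fit y = intercept + slope * x by least squares; return the residual sum of squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))


def two_segment_breakpoint(x, y, min_points=3):
    """Return the index of the first sample of the second segment."""
    best_index = None
    best_sse = None
    for k in range(min_points, len(x) - min_points + 1):
        total = fit_line_sse(x[:k], y[:k]) + fit_line_sse(x[k:], y[k:])
        if best_sse is None or total < best_sse:
            best_index, best_sse = k, total
    if best_index is None:
        raise ValueError("not enough samples for two segments")
    return best_index


# Usage with synthetic data: slope 1 up to x = 4, then a steeper second line.
x = list(range(10))
y = [0, 1, 2, 3, 4, 10, 13, 16, 19, 22]
breakpoint_index = two_segment_breakpoint(x, y)
```

Applied to HR or VCO2 plotted against time or power output, the returned index would mark the sample at which the second response pattern begins; noisy physiological data would normally need smoothing or a restricted search window first.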
Abstract:
The objectives of this study were to evaluate and compare the use of linear and nonlinear methods for analysis of heart rate variability (HRV) in healthy subjects and in patients after acute myocardial infarction (AMI). Heart rate (HR) was recorded for 15 min in the supine position in 10 patients with AMI taking β-blockers (aged 57 ± 9 years) and in 11 healthy subjects (aged 53 ± 4 years). HRV was analyzed in the time domain (RMSSD and RMSM), the frequency domain using low- and high-frequency bands in normalized units (nu; LFnu and HFnu) and the LF/HF ratio and approximate entropy (ApEn) were determined. There was a correlation (P < 0.05) of RMSSD, RMSM, LFnu, HFnu, and the LF/HF ratio index with the ApEn of the AMI group on the 2nd (r = 0.87, 0.65, 0.72, 0.72, and 0.64) and 7th day (r = 0.88, 0.70, 0.69, 0.69, and 0.87) and of the healthy group (r = 0.63, 0.71, 0.63, 0.63, and 0.74), respectively. The median HRV indexes of the AMI group on the 2nd and 7th day differed from the healthy group (P < 0.05): RMSSD = 10.37, 19.95, 24.81; RMSM = 23.47, 31.96, 43.79; LFnu = 0.79, 0.79, 0.62; HFnu = 0.20, 0.20, 0.37; LF/HF ratio = 3.87, 3.94, 1.65; ApEn = 1.01, 1.24, 1.31, respectively. There was agreement between the methods, suggesting that these have the same power to evaluate autonomic modulation of HR in both AMI patients and healthy subjects. AMI contributed to a reduction in cardiac signal irregularity, higher sympathetic modulation and lower vagal modulation.
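Two of the indexes above, one linear and one nonlinear, can be sketched briefly. RMSSD is the root mean square of successive RR-interval differences. ApEn is written here in the standard form with embedding dimension m and tolerance r, counting self-matches; the abstract does not state the parameters used, so m = 2 and an absolute tolerance r supplied by the caller (often chosen as a fraction of the series' standard deviation) are assumptions.

```python
import math


def rmssd(rr_intervals):
    """Root mean square of successive differences of RR intervals."""
    diffs = [b - a for a, b in zip(rr_intervals, rr_intervals[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def approximate_entropy(series, m=2, r=0.2):
    """Approximate entropy with absolute tolerance r (self-matches included)."""
    n = len(series)

    def phi(dim):
        count = n - dim + 1
        templates = [series[i:i + dim] for i in range(count)]
        total = 0.0
        for a in templates:
            # Count templates within r of `a` under the maximum-coordinate distance.
            matches = sum(
                1 for b in templates
                if max(abs(u - v) for u, v in zip(a, b)) <= r
            )
            total += math.log(matches / count)
        return total / count

    return phi(m) - phi(m + 1)
```

A perfectly regular series gives an ApEn of 0, and more irregular series give larger values, which is the sense in which the abstract speaks of reduced signal irregularity after AMI.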
Abstract:
DNA extraction is a critical step in the analysis of Genetically Modified Organisms based on real-time PCR. In this study, the CTAB and DNeasy methods provided good quality and quantity of DNA from the texturized soy protein, infant formula, and soy milk samples. Concerning the Certified Reference Material consisting of 5% Roundup Ready® soybean, neither method yielded DNA of good quality. However, the dilution test applied to the CTAB extracts showed no interference from inhibitory substances. The PCR efficiencies of lectin target amplification were not statistically different, and the coefficients of correlation (R²) demonstrated a high degree of correlation between the copy numbers and the threshold cycle (Ct) values. ANOVA showed suitable adjustment of the regression and absence of significant linear deviations. The efficiencies of the p35S amplification were not statistically different, and all R² values using DNeasy extracts were above 0.98 with no significant linear deviations. Two out of three R² values using CTAB extracts were lower than 0.98, corresponding to a lower degree of correlation, and the lack-of-fit test showed significant linear deviation in one run. The comparative analysis of the Ct values for the p35S and lectin targets demonstrated no statistically significant differences between the analytical curves of each target.
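The relation between a real-time PCR standard curve and amplification efficiency can be sketched as below. It assumes the usual approach of regressing Ct on log10(copy number) and computing efficiency as E = 10^(-1/slope) - 1, so that a slope near -3.32 corresponds to about 100% efficiency. The function name and the synthetic dilution series are illustrative assumptions, not data from the study.

```python
import math


def standard_curve(copies, ct_values):
    """Regress Ct on log10(copies); return slope, intercept, R^2 and efficiency."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct_values) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    syy = sum((v - mean_y) ** 2 for v in ct_values)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, r_squared, efficiency


# Usage: an ideal ten-fold dilution series in which the product doubles every cycle.
copies = [10, 100, 1000, 10000]
ideal_ct = [35.0 - math.log10(c) / math.log10(2) for c in copies]
slope, intercept, r_squared, efficiency = standard_curve(copies, ideal_ct)
```

In this idealized series the efficiency comes out at 1.0 (100%) and R² at 1; measured curves fall below that, and the 0.98 threshold mentioned in the abstract is applied to the R² value.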