21 results for Multiple Additive Regression Trees (MART)
in CentAUR: Central Archive University of Reading - UK
Abstract:
Multiple regression analysis is a statistical technique that allows a dependent variable to be predicted from more than one independent variable and the influential independent variables to be identified. In this study, multiple regression analysis is applied to experimental data to predict the room mean velocity and to determine the parameters that most influence it. More than 120 experiments for four different heat source locations were carried out in a test chamber with a high-level wall-mounted air supply terminal at air change rates of 3-6 ach. The influence of environmental parameters such as supply air momentum, room heat load, Archimedes number and local temperature ratio was examined by two methods: a simple regression analysis incorporated into scatter matrix plots, and multiple stepwise regression analysis. It is concluded that, when a heat source is located along the jet centre line, the supply momentum mainly influences the room mean velocity regardless of the plume strength. However, when the heat source is located outside the jet region, the local temperature ratio (the inverse of the local heat removal effectiveness) is a major influencing parameter.
Abstract:
Real-time rainfall monitoring in Africa is of great practical importance for operational applications in hydrology and agriculture. Satellite data have been used in this context for many years because of the lack of surface observations. This paper describes an improved artificial neural network algorithm for operational applications. The algorithm combines numerical weather model information with the satellite data. Using this algorithm, daily rainfall estimates were derived for 4 yr of the Ethiopian and Zambian main rainy seasons and were compared with two other algorithms: a multiple linear regression making use of the same information as that of the neural network, and a satellite-only method. All algorithms were validated against rain gauge data. Overall, the neural network performs best, but the extent to which it does so depends on the calibration/validation protocol. The advantages of the neural network are most evident when calibration data are numerous and close in space and time to the validation data. This result emphasizes the importance of a real-time calibration system.
Abstract:
Suprathermal electrons (E > 80 eV) carry heat flux away from the Sun. Processes controlling the heat flux are not well understood. To gain insight into these processes, we model heat flux as a linear dependence on two independent parameters: electron number flux and electron pitch angle anisotropy. Pitch angle anisotropy is further modeled as a linear dependence on two solar wind components: magnetic field strength and plasma density. These components show no correlation with number flux, reinforcing its independence from pitch angle anisotropy. Multiple linear regression applied to 2 years of Wind data shows good correspondence between modeled and observed heat flux and anisotropy. The results suggest that the interplay of solar wind parameters and electron number flux results in distinctive heat flux dropouts at heliospheric features like plasma sheets but that these parameters continuously modify heat flux. This is inconsistent with magnetic disconnection as the primary cause of heat flux dropouts. Analysis of fast and slow solar wind regimes separately shows that electron number flux and pitch angle anisotropy are equally correlated with heat flux in slow wind but that number flux is the dominant correlative in fast wind. Also, magnetic field strength correlates better with pitch angle anisotropy in slow wind than in fast wind. The energy dependence of the model fits suggests different scattering processes in fast and slow wind.
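The linear model described in this abstract (heat flux as a linear function of two parameters) can be illustrated with a minimal ordinary least squares sketch. This is not the authors' code: the function names `solve_linear_system` and `fit_mlr`, the data values and the "true" coefficients are all made up for illustration, and the fit uses the textbook normal equations rather than any particular statistics package.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so that row operations act on both sides at once.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(predictors, response):
    """Ordinary least squares; returns [intercept, slope_1, ..., slope_k]."""
    rows = [[1.0] + list(p) for p in predictors]  # prepend intercept column
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical (number flux, pitch angle anisotropy) pairs, arbitrary units.
predictors = [(1.0, 0.5), (2.0, 0.1), (3.0, 0.9), (4.0, 0.3), (5.0, 0.7)]
# Synthetic response built from known coefficients so the fit can be checked.
heat_flux = [0.5 + 2.0 * f + 3.0 * a for f, a in predictors]

coefficients = fit_mlr(predictors, heat_flux)
print([round(c, 6) for c in coefficients])
```

Because the synthetic response is noise-free, the fit should recover the intercept and the two slopes used to generate it. With real observations the residuals would also need to be inspected before interpreting the coefficients.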
Abstract:
Baking and 2-g mixograph analyses were performed for 55 cultivars (19 spring and 36 winter wheat) from various quality classes from the 2002 harvest in Poland. An instrumented 2-g direct-drive mixograph was used to study the mixing characteristics of the wheat cultivars. A number of parameters were extracted automatically from each mixograph trace and correlated with baking volume and flour quality parameters (protein content and high molecular weight glutenin subunit [HMW-GS] composition by SDS-PAGE) using multiple linear regression statistical analysis. Principal component analysis of the mixograph data discriminated between four flour quality classes, and predictions of baking volume were obtained using several selected mixograph parameters, chosen using a best subsets regression routine, giving R² values of 0.862-0.866. In particular, three new spring wheat strains (CHD 502a-c) recently registered in Poland were highly discriminated and predicted to give high baking volume on the basis of two mixograph parameters: peak bandwidth and 10-min bandwidth.
Abstract:
Multiple linear regression is used to diagnose the signal of the 11-yr solar cycle in zonal-mean zonal wind and temperature in the 40-yr ECMWF Re-Analysis (ERA-40) dataset. The results of previous studies are extended to 2008 using data from ECMWF operational analyses. This analysis confirms that the solar signal found in previous studies is distinct from that of volcanic aerosol forcing resulting from the eruptions of El Chichón and Mount Pinatubo, but it highlights the potential for confusion of the solar signal and lower-stratospheric temperature trends. A correction to an error that is present in previous results of Crooks and Gray, stemming from the use of a single daily analysis field rather than monthly averaged data, is also presented.
Abstract:
The recent global economic crisis is often associated with the development and pricing of mortgage-backed securities (MBSs) and underlying products (i.e. sub-prime mortgages). This work uses a rich database of MBS issues and represents the first attempt to price commercial MBSs (CMBSs) in the European market. Our results are consistent with research carried out in the US market and we find that bond-, mortgage-, real estate-related and multinational characteristics show different degrees of significance in explaining European CMBS spreads at issuance. Multiple linear regression analysis using a databank of CMBSs issued between 1997 and 2007 indicates a strong relationship with bond-related factors, followed by real estate and mortgage market conditions. We also find that multinational factors are significant, with country of issuance, collateral location and access to more liquid markets all being important in explaining the cost of secured funding for real estate companies. As floater coupon tranches tend to be riskier and exhibit higher spreads, we also estimate a model using this subset of the data; the results hold, reinforcing our findings. Finally, we estimate our model for both tranches A and B and find that real estate factors become relatively more important for the riskier investment products.
Abstract:
The estimation of prediction quality is important because without quality measures, it is difficult to determine the usefulness of a prediction. Currently, methods for ligand binding site residue predictions are assessed in the function prediction category of the biennial Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiment, utilizing the Matthews Correlation Coefficient (MCC) and Binding-site Distance Test (BDT) metrics. However, the assessment of ligand binding site predictions using such metrics requires the availability of solved structures with bound ligands. Thus, we have developed a ligand binding site quality assessment tool, FunFOLDQA, which utilizes protein feature analysis to predict ligand binding site quality prior to the experimental solution of the protein structures and their ligand interactions. The FunFOLDQA feature scores were combined using simple linear combinations, multiple linear regression and a neural network. The neural network produced significantly better results for correlations to both the MCC and BDT scores, according to Kendall’s τ, Spearman’s ρ and Pearson’s r correlation coefficients, when tested on both the CASP8 and CASP9 datasets. The neural network also produced the largest Area Under the Curve score (AUC) when Receiver Operator Characteristic (ROC) analysis was undertaken for the CASP8 dataset. Furthermore, the FunFOLDQA algorithm incorporating the neural network is shown to add value to FunFOLD when both methods are employed in combination. This results in a statistically significant improvement over all of the best server methods, the FunFOLD method (6.43%), and one of the top manual groups (FN293) tested on the CASP8 dataset. The FunFOLDQA method was also found to be competitive with the top server methods when tested on the CASP9 dataset.
To the best of our knowledge, FunFOLDQA is the first attempt to develop a method that can be used to assess ligand binding site prediction quality, in the absence of experimental data.
Abstract:
An analysis of the attribution of past and future changes in stratospheric ozone and temperature to anthropogenic forcings is presented. The analysis is an extension of the study of Shepherd and Jonsson (2008) who analyzed chemistry-climate simulations from the Canadian Middle Atmosphere Model (CMAM) and attributed both past and future changes to changes in the external forcings, i.e. the abundances of ozone-depleting substances (ODS) and well-mixed greenhouse gases. The current study is based on a new CMAM dataset and includes two important changes. First, we account for the nonlinear radiative response to changes in CO2. It is shown that over centennial time scales the radiative response in the upper stratosphere to CO2 changes is significantly nonlinear and that failure to account for this effect leads to a significant error in the attribution. To our knowledge this nonlinearity has not been considered before in attribution analysis, including multiple linear regression studies. For the regression analysis presented here the nonlinearity was taken into account by using CO2 heating rate, rather than CO2 abundance, as the explanatory variable. This approach yields considerable corrections to the results of the previous study and can be recommended to other researchers. Second, an error in the way the CO2 forcing changes are implemented in the CMAM was corrected, which significantly affects the results for the recent past. As the radiation scheme, based on Fomichev et al. (1998), is used in several other models we provide some description of the problem and how it was fixed.
Abstract:
A continuous tropospheric and stratospheric vertically resolved ozone time series, from 1850 to 2099, has been generated to be used as forcing in global climate models that do not include interactive chemistry. A multiple linear regression analysis of SAGE I+II satellite observations and polar ozonesonde measurements is used for the stratospheric zonal mean dataset during the well-observed period from 1979 to 2009. In addition to terms describing the mean annual cycle, the regression includes terms representing equivalent effective stratospheric chlorine (EESC) and the 11-yr solar cycle variability. The EESC regression fit coefficients, together with pre-1979 EESC values, are used to extrapolate the stratospheric ozone time series backward to 1850. While a similar procedure could be used to extrapolate into the future, coupled chemistry climate model (CCM) simulations indicate that future stratospheric ozone abundances are likely to be significantly affected by climate change, and capturing such effects through a regression model approach is not feasible. Therefore, the stratospheric ozone dataset is extended into the future (merged in 2009) with multimodel mean projections from 13 CCMs that performed a simulation until 2099 under the SRES (Special Report on Emission Scenarios) A1B greenhouse gas scenario and the A1 adjusted halogen scenario in the second round of the Chemistry-Climate Model Validation (CCMVal-2) Activity. The stratospheric zonal mean ozone time series is merged with a three-dimensional tropospheric data set extracted from simulations of the past by two CCMs (CAM3.5 and GISS-PUCCINI) and of the future by one CCM (CAM3.5). The future tropospheric ozone time series continues the historical CAM3.5 simulation until 2099 following the four different Representative Concentration Pathways (RCPs).
Generally good agreement is found between the historical segment of the ozone database and satellite observations, although it should be noted that total column ozone is overestimated in the southern polar latitudes during spring and tropospheric column ozone is slightly underestimated. Vertical profiles of tropospheric ozone are broadly consistent with ozonesondes and in-situ measurements, with some deviations in regions of biomass burning. The tropospheric ozone radiative forcing (RF) from the 1850s to the 2000s is 0.23 W m−2, lower than previous results. The lower value is mainly due to (i) a smaller increase in biomass burning emissions; (ii) a larger influence of stratospheric ozone depletion on upper tropospheric ozone at high southern latitudes; and possibly (iii) a larger influence of clouds (which act to reduce the net forcing) compared to previous radiative forcing calculations. Over the same period, decreases in stratospheric ozone, mainly at high latitudes, produce an RF of −0.08 W m−2, which is more negative than the central Intergovernmental Panel on Climate Change (IPCC) Fourth Assessment Report (AR4) value of −0.05 W m−2, but which is within the stated range of −0.15 to +0.05 W m−2. The more negative value is explained by the fact that the regression model simulates significant ozone depletion prior to 1979, in line with the increase in EESC and as confirmed by CCMs, while the AR4 assumed no change in stratospheric RF prior to 1979. A negative RF of similar magnitude persists into the future, although its location shifts from high latitudes to the tropics. This shift is due to increases in polar stratospheric ozone, but decreases in tropical lower stratospheric ozone, related to a strengthening of the Brewer-Dobson circulation, particularly through the latter half of the 21st century.
Differences in trends in tropospheric ozone among the four RCPs are mainly driven by different methane concentrations, resulting in a range of tropospheric ozone RFs between 0.4 and 0.1 W m−2 by 2100. The ozone dataset described here has been released for the Coupled Model Intercomparison Project (CMIP5) model simulations in netCDF Climate and Forecast (CF) Metadata Convention at the PCMDI website (http://cmip-pcmdi.llnl.gov/).
Abstract:
Low variability of crop production from year to year is desirable for many reasons, including reduced income risk and stability of supplies. Therefore, it is important to understand the nature of yield variability, whether it is changing through time, and how it varies between crops and regions. Previous studies have shown that national crop yield variability has changed in the past, with the direction and magnitude dependent on crop type and location. Whilst such studies acknowledge the importance of climate variability in determining yield variability, it has been assumed that its magnitude and its effect on crop production have not changed through time and, hence, that changes to yield variability have been due to non-climatic factors. We address this assumption by jointly examining yield and climate variability for three major crops (rice, wheat and maize) over the past 50 years. National yield time series and growing season temperature and precipitation were de-trended and related using multiple linear regression. Yield variability changed significantly in half of the crop–country combinations examined. For several crop–country combinations, changes in yield variability were related to changes in climate variability.
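The de-trend-then-regress step mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: a straight-line trend is removed from each series (the abstract does not say which de-trending method was used), the helper names `detrend` and `two_predictor_slopes` are hypothetical, and all numbers are synthetic.

```python
def detrend(series):
    """Remove a least-squares straight line fitted against the time index."""
    n = len(series)
    t_mean = (n - 1) / 2.0
    y_mean = sum(series) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]


def two_predictor_slopes(x1, x2, y):
    """Slopes of y on x1 and x2 for zero-mean inputs (2x2 normal equations)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12  # zero if the two predictors are collinear
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det


# Synthetic year-to-year fluctuations (made-up values).
temp_wiggle = [0.3, -0.5, 0.1, 0.6, -0.2, -0.4, 0.5, 0.0]
precip_wiggle = [10.0, -20.0, 30.0, -5.0, 15.0, -25.0, 5.0, -10.0]

# Each series is a linear trend plus its fluctuation.
temperature = [20.0 + 0.02 * t + w for t, w in enumerate(temp_wiggle)]
precipitation = [500.0 - 1.0 * t + v for t, v in enumerate(precip_wiggle)]
# Yield has its own trend plus assumed sensitivities of -0.3 and +0.02.
crop_yield = [2.0 + 0.05 * t - 0.3 * temperature[t] + 0.02 * precipitation[t]
              for t in range(len(temperature))]

# De-trended series have zero mean, so no intercept term is needed.
b_temp, b_precip = two_predictor_slopes(
    detrend(temperature), detrend(precipitation), detrend(crop_yield))
print(round(b_temp, 6), round(b_precip, 6))
```

Since the synthetic yield is an exact linear function of the climate series plus a trend, the regression on de-trended data should return the sensitivities used to build it; real yield series would leave a residual that reflects non-climatic variability.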
Abstract:
We discuss the modeling of dielectric responses of electromagnetically excited networks which are composed of a mixture of capacitors and resistors. Such networks can be employed as lumped-parameter circuits to model the response of composite materials containing conductive and insulating grains. The dynamics of the excited network systems are studied using a state space model derived from a randomized incidence matrix. Time and frequency domain responses from synthetic data sets generated from state space models are analyzed for the purpose of estimating the fraction of capacitors in the network. Good results were obtained by using either the time-domain response to a pulse excitation or impedance data at selected frequencies. A chemometric framework based on a Successive Projections Algorithm (SPA) enables the construction of multiple linear regression (MLR) models which can efficiently determine the ratio of conductive to insulating components in composite material samples. The proposed method avoids restrictions commonly associated with Archie’s law, the application of percolation theory or Kohlrausch-Williams-Watts models and is applicable to experimental results generated by either time domain transient spectrometers or continuous-wave instruments. Furthermore, it is quite generic and applicable to tomography, acoustics as well as other spectroscopies such as nuclear magnetic resonance, electron paramagnetic resonance and, therefore, should be of general interest across the dielectrics community.
Abstract:
The occurrence of mid-latitude windstorms is related to strong socio-economic effects. For detailed and reliable regional impact studies, large datasets of high-resolution wind fields are required. In this study, a statistical downscaling approach in combination with dynamical downscaling is introduced to derive storm related gust speeds on a high-resolution grid over Europe. Multiple linear regression models are trained using reanalysis data and wind gusts from regional climate model simulations for a sample of 100 top ranking windstorm events. The method is computationally inexpensive and reproduces individual windstorm footprints adequately. Compared to observations, the results for Germany are at least as good as pure dynamical downscaling. This new tool can be easily applied to large ensembles of general circulation model simulations and thus contribute to a better understanding of the regional impact of windstorms based on decadal and climate change projections.
Abstract:
We present a method for deriving the radiative effects of absorbing aerosols in cloudy scenes from satellite retrievals only. We use data from 2005–2007 from various passive sensors aboard satellites of the “A-Train” constellation. The study area is restricted to the tropical and subtropical Atlantic Ocean. To identify the dependence of the local planetary albedo in cloudy scenes on cloud liquid water path and aerosol optical depth (AOD), we perform a multiple linear regression. The OMI UV Aerosol Index serves as an indicator for absorbing-aerosol presence. In our method, the aerosol influences the local planetary albedo through direct (scattering and absorption) and indirect (Twomey) aerosol effects. We find an increase of the local planetary albedo (LPA) with increasing AOD of mostly scattering aerosol and a decrease of the LPA with increasing AOD of mostly absorbing aerosol. These results allow us to derive the direct aerosol effect of absorbing aerosols in cloudy scenes, with the effect of cloudy-scene aerosol absorption in the tropical and subtropical Atlantic contributing (+21.2±11.1)×10−3 W m−2 to the global top of the atmosphere radiative forcing.
Abstract:
The objective of this study was to evaluate the association of PPARG coactivator 1 alpha (PPARGC1A), peroxisome proliferator activated receptor gamma (PPARG), and uncoupling protein 1 (UCP1) gene polymorphisms with the metabolic syndrome (MS) in an Asian Indian population. Nine common polymorphisms were genotyped via polymerase chain reaction restriction fragment length polymorphism and direct sequencing in 950 normal glucose-tolerant subjects and 550 type 2 diabetic subjects, chosen randomly from the Chennai Urban Rural Epidemiological Study, an ongoing population-based study in Southern India. Among the 9 polymorphisms examined, only the Thr394Thr variant of the PPARGC1A gene was significantly associated with diabetes and obesity. The genotype frequency of GA of the Thr394Thr variant was 16% (138/887) in the non-MS group and 22% (136/613) in the MS group, and this genotype frequency was significantly higher with MS both in males (p = 0.01) and females (p = 0.05), compared to the non-MS group. Logistic regression analysis revealed that the odds ratio for MS for the susceptible genotype GA of Thr394Thr was 1.411 [95% CI: 1.03-1.84, p = 0.012]. In the multiple logistic regression analysis, however, there was no association of this polymorphism as an independent factor with MS. Hence, the study shows that the polymorphisms in the PPARGC1A, PPARG and UCP1 genes are not associated with MS in Asian Indians.
Abstract:
Sixteen years (1994–2009) of ozone profiling by ozonesondes at Valentia Meteorological and Geophysical Observatory, Ireland (51.94° N, 10.23° W) along with a co-located MkIV Brewer spectrophotometer for the period 1993–2009 are analyzed. Simple and multiple linear regression methods are used to infer the recent trend, if any, in stratospheric column ozone over the station. The decadal trend from 1994 to 2010 is also calculated from the monthly mean data of Brewer and column ozone data derived from satellite observations. Both of these show a 1.5 % increase per decade during this period with an uncertainty of about ±0.25 %. Monthly mean data for March show a much stronger trend of ~4.8 % increase per decade for both ozonesonde and Brewer data. The ozone profile is divided into three vertical slots of 0–15 km, 15–26 km, and 26 km to the top of the atmosphere and an 11-year running average is calculated. Ozone values for the month of March only are observed to increase at each level with a maximum change of +9.2 ± 3.2 % per decade (between years 1994 and 2009) being observed in the vertical region from 15 to 26 km. In the tropospheric region from 0 to 15 km, the trend is positive but with poor statistical significance. However, for the top level above 26 km the trend is significantly positive at about 4 % per decade. The March integrated ozonesonde column ozone during this period is found to increase at a rate of ~6.6 % per decade compared with the Brewer and satellite positive trends of ~5 % per decade.
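A trend quoted in percent per decade, as in this abstract, can be obtained from a simple linear regression of monthly means against time. The sketch below assumes a plain least-squares line with the classical standard error of the slope; the function name `linear_trend` and the synthetic series are hypothetical and are not the Valentia data. The standard error formula assumes uncorrelated residuals, which real ozone series may not satisfy.

```python
import math


def linear_trend(times, values):
    """Least-squares slope and its classical standard error."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    slope = sum((t - t_mean) * (v - v_mean)
                for t, v in zip(times, values)) / sxx
    intercept = v_mean - slope * t_mean
    # Residual sum of squares around the fitted line.
    sse = sum((v - (intercept + slope * t)) ** 2
              for t, v in zip(times, values))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope, std_err


# Synthetic monthly column ozone over 16 years: 300 DU baseline,
# an assumed trend of 0.45 DU per year, and a small deterministic scatter.
times = [i / 12.0 for i in range(192)]            # time in years
scatter = [2.0 * ((i * 7) % 5 - 2) for i in range(192)]
ozone = [300.0 + 0.45 * t + e for t, e in zip(times, scatter)]

slope, std_err = linear_trend(times, ozone)
mean_ozone = sum(ozone) / len(ozone)
# Convert DU per year to percent of the period mean per decade.
trend_pct_per_decade = 100.0 * 10.0 * slope / mean_ozone
err_pct_per_decade = 100.0 * 10.0 * std_err / mean_ozone
print(round(trend_pct_per_decade, 2), round(err_pct_per_decade, 2))
```

Expressing the slope relative to the period mean is one common convention; other studies normalise by the value at the start of the record, which changes the quoted percentage slightly.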