136 results for bootstrapping
Abstract:
Estimation of economic relationships often requires imposition of constraints such as positivity or monotonicity on each observation. Methods to impose such constraints, however, vary depending upon the estimation technique employed. We describe a general methodology to impose (observation-specific) constraints for the class of linear regression estimators using a method known as constraint weighted bootstrapping. While this method has received attention in the nonparametric regression literature, we show how it can be applied for both parametric and nonparametric estimators. A benefit of this method is that imposing numerous constraints simultaneously can be performed seamlessly. We apply this method to Norwegian dairy farm data to estimate both unconstrained and constrained parametric and nonparametric models.
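As a rough illustration of the idea (not the authors' exact formulation), constraint weighted bootstrapping can be sketched as choosing observation weights that stay as close as possible to uniform while forcing the weighted fit to satisfy the constraint. Here the constraint is positivity of a linear regression slope, and the data, distance metric, and solver are all illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 40)
y = -0.05 * x + rng.normal(0, 0.5, 40)     # unconstrained slope may be negative

X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept
n = len(x)
p0 = np.full(n, 1.0 / n)                   # uniform starting weights

def slope(p):
    """Slope of the weighted least-squares fit under observation weights p."""
    W = np.diag(p)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta[1]

res = minimize(
    lambda p: np.sum((p - p0) ** 2),       # stay as close to uniform as possible
    p0,
    method="SLSQP",
    bounds=[(0.0, 1.0)] * n,
    constraints=[
        {"type": "eq", "fun": lambda p: p.sum() - 1.0},  # weights sum to one
        {"type": "ineq", "fun": slope},                  # positivity: slope >= 0
    ],
)
print(round(slope(res.x), 4))              # constrained slope, non-negative
```

If the unconstrained fit already satisfies the constraint, the optimal weights are simply uniform; otherwise the weights are perturbed just enough to make the constraint bind, which is what makes imposing many constraints simultaneously straightforward.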
Abstract:
This paper proposes a constrained nonparametric method of estimating an input distance function. A regression function is estimated via kernel methods without functional form assumptions. To guarantee that the estimated input distance function satisfies its properties, monotonicity constraints are imposed on the regression surface via the constraint weighted bootstrapping method borrowed from statistics literature. The first, second, and cross partial analytical derivatives of the estimated input distance function are derived, and thus the elasticities measuring input substitutability can be computed from them. The method is then applied to a cross-section of 3,249 Norwegian timber producers.
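A minimal sketch of the kernel-regression building block: a Nadaraya-Watson estimator and its analytic first derivative, the kind of quantity the abstract differentiates to obtain elasticities. The Gaussian kernel, bandwidth, and data are illustrative assumptions, not the paper's specification:

```python
import numpy as np

rng = np.random.default_rng(5)
x = np.sort(rng.uniform(0, 3, 60))         # synthetic single input
y = np.log1p(x) + rng.normal(0, 0.05, 60)  # monotone response plus noise
h = 0.3                                    # illustrative bandwidth

def nw(x0):
    """Nadaraya-Watson estimate at x0 with a Gaussian kernel."""
    w = np.exp(-0.5 * ((x0 - x) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

def nw_deriv(x0):
    """Analytic first derivative of nw at x0 (quotient rule on the ratio)."""
    w = np.exp(-0.5 * ((x0 - x) / h) ** 2)
    dw = -w * (x0 - x) / h**2              # derivative of each kernel weight
    S, Sy = w.sum(), (w * y).sum()
    return ((dw @ y) * S - Sy * dw.sum()) / S**2

print(round(nw(1.5), 3), round(nw_deriv(1.5), 3))
```

Because the estimator is a smooth ratio of kernel sums, second and cross partial derivatives follow from repeated application of the same quotient rule.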
Abstract:
We present in this article an automated framework that extracts product adopter information from online reviews and incorporates the extracted information into feature-based matrix factorization for more effective product recommendation. Specifically, we propose a bootstrapping approach for the extraction of product adopters from review text and categorize them into a number of different demographic categories. The aggregated demographic information of many product adopters can be used to characterize both products and users in the form of distributions over different demographic categories. We further propose a graph-based method to iteratively update user- and product-related distributions more reliably in a heterogeneous user-product graph and incorporate them as features into the matrix factorization approach for product recommendation. Our experimental results on a large dataset crawled from JINGDONG, the largest B2C e-commerce website in China, show that our proposed framework outperforms a number of competitive baselines for product recommendation.
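The extraction step can be illustrated with a toy version of pattern-based bootstrapping: seed adopter terms yield context patterns, which in turn harvest new adopter terms from the reviews. The corpus, the seeds, and the one-word context patterns are all hypothetical simplifications of what such a system would use:

```python
# Toy corpus; phrases like "for my X" signal a product adopter.
reviews = [
    "bought this for my daughter and she loves it",
    "got it for my son last week",
    "my wife uses it every day",
    "perfect gift for my wife",
]

seeds = {"daughter", "son"}            # hypothetical seed adopter terms

def learn_patterns(texts, terms):
    """Collect the word immediately preceding each known adopter term."""
    patterns = set()
    for t in texts:
        words = t.split()
        for i in range(1, len(words)):
            if words[i] in terms:
                patterns.add(words[i - 1])
    return patterns

def extract(texts, patterns, known):
    """Apply the learned one-word contexts to harvest new adopter terms."""
    found = set(known)
    for t in texts:
        words = t.split()
        for i in range(len(words) - 1):
            if words[i] in patterns:
                found.add(words[i + 1])
    return found

for _ in range(2):                     # two bootstrapping iterations
    patterns = learn_patterns(reviews, seeds)
    seeds = extract(reviews, patterns, seeds)
print(sorted(seeds))                   # → ['daughter', 'son', 'wife']
```

A real system would use richer patterns and confidence scoring to avoid semantic drift, but the alternation between pattern learning and term extraction is the essence of the bootstrapping loop.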
Abstract:
The purpose of this study is to produce a model to be used by state regulating agencies to assess demand for subacute care. In accomplishing this goal, the study refines the definition of subacute care, demonstrates a method for bed need assessment, and measures the effectiveness of this new level of care. This was the largest study of subacute care to date. Research focused on 19 subacute units in 16 states, each of which provides high-intensity rehabilitative and/or restorative care carried out in a high-tech unit. Each of the facilities was based in a nursing home, but utilized separate staff, equipment, and services. Because these facilities are under local control, it was possible to study regional differences in subacute care demand. Using this data, a model for predicting demand for subacute care services was created, building on earlier models submitted by John Whitman for the American Hospital Association and Robin E. MacStravic. The Broderick model uses the "bootstrapping" method and takes advantage of high technology: computers and software, databases in business and government, publicly available databases from providers or commercial vendors, professional organizations, and other information sources. Using newly available sources of information, this new model addresses the problems and needs of health care planners as they approach the challenges of the 21st century.
Abstract:
The composition and abundance of algal pigments provide information on phytoplankton community characteristics such as photoacclimation, overall biomass and taxonomic composition. In particular, pigments play a major role in photoprotection and in the light-driven part of photosynthesis. Most phytoplankton pigments can be measured by high-performance liquid chromatography (HPLC) techniques applied to filtered water samples. This method, as well as other laboratory analyses, is time consuming and therefore limits the number of samples that can be processed in a given time. In order to obtain information on phytoplankton pigment composition with a higher temporal and spatial resolution, we have developed a method to assess pigment concentrations from continuous optical measurements. The method applies an empirical orthogonal function (EOF) analysis to remote-sensing reflectance data derived from ship-based hyperspectral underwater radiometry and from multispectral satellite data (using the Medium Resolution Imaging Spectrometer - MERIS - Polymer product developed by Steinmetz et al., 2011, doi:10.1364/OE.19.009783) measured in the Atlantic Ocean. Subsequently we developed multiple linear regression models with measured (collocated) pigment concentrations as the response variable and EOF loadings as predictor variables. The model results show that surface concentrations of a suite of pigments and pigment groups can be well predicted from the ship-based reflectance measurements, even when only a multispectral resolution is chosen (i.e., eight bands, similar to those used by MERIS). Based on the MERIS reflectance data, concentrations of total and monovinyl chlorophyll a and the groups of photoprotective and photosynthetic carotenoids can be predicted with high quality.
As a demonstration of the utility of the approach, the fitted model based on satellite reflectance data as input was applied to 1 month of MERIS Polymer data to predict the concentration of those pigment groups for the whole eastern tropical Atlantic area. Bootstrapping explorations of cross-validation error indicate that the method can produce reliable predictions with relatively small data sets (e.g., < 50 collocated values of reflectance and pigment concentration). The method allows for the derivation of time series from continuous reflectance data of various pigment groups at various regions, which can be used to study variability and change of phytoplankton composition and photophysiology.
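The bootstrapping exploration of cross-validation error can be sketched with an out-of-bag bootstrap on a small synthetic regression, standing in for the < 50 collocated reflectance/pigment samples. The data, model size, and error metric here are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40                                     # fewer than 50 collocated samples
X = rng.normal(size=(n, 3))                # stand-in for EOF loadings
y = X @ np.array([1.0, -0.5, 0.2]) + rng.normal(0, 0.1, n)   # pigment proxy

def fit_predict(Xtr, ytr, Xte):
    """OLS fit on the training rows, predictions for the test rows."""
    A = np.column_stack([np.ones(len(Xtr)), Xtr])
    beta, *_ = np.linalg.lstsq(A, ytr, rcond=None)
    return np.column_stack([np.ones(len(Xte)), Xte]) @ beta

errs = []
for _ in range(500):
    idx = rng.integers(0, n, n)            # bootstrap resample of row indices
    oob = np.setdiff1d(np.arange(n), idx)  # out-of-bag rows serve as test set
    if oob.size == 0:
        continue
    pred = fit_predict(X[idx], y[idx], X[oob])
    errs.append(np.sqrt(np.mean((pred - y[oob]) ** 2)))

print(round(float(np.mean(errs)), 3))      # out-of-bag RMSE estimate
```

Because each resample leaves roughly a third of the rows out-of-bag, the averaged RMSE approximates out-of-sample error without setting aside a fixed validation set, which is what makes the approach attractive for small collocated data sets.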
Abstract:
We present the market practice for the construction of interest rate yield curves and the pricing of interest rate derivatives. We then give a brief description of the Vasicek and Hull-White models, with an example of calibration to market data. We generalize the classical Black-Scholes-Merton pricing formulas to more general cases such as perfect or partial collateral, derivatives on a dividend-paying asset subject to repo funding, and multiple currencies. Finally, we derive generic pricing formulae for different combinations of cash flow and collateral currencies, apply the results to the pricing of FX swaps and CCS, and discuss curve bootstrapping.
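In its simplest single-curve textbook form, curve bootstrapping solves sequentially for discount factors from par swap rates, shortest maturity first. A minimal sketch assuming annual fixed payments and illustrative (not market) rates:

```python
# maturity (years) -> par swap rate, annual fixed leg; illustrative numbers
par = {1: 0.020, 2: 0.023, 3: 0.025}

df = {}                                    # maturity -> discount factor
for T in sorted(par):
    s = par[T]
    annuity = sum(df[t] for t in range(1, T))   # previously bootstrapped factors
    # par condition: s * (annuity + df[T]) = 1 - df[T]  =>  solve for df[T]
    df[T] = (1 - s * annuity) / (1 + s)

for T, d in sorted(df.items()):
    print(T, round(d, 6))
```

Each step uses only already-known shorter-dated discount factors, so the curve is built outward one pillar at a time; multi-curve and collateralized settings, as discussed in the paper, generalize the same recursion.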
Abstract:
Background: Most large acute stroke trials have been neutral. Functional outcome is usually analysed using a yes or no answer, e.g. death or dependency vs. independence. We assessed which statistical approaches are most efficient in analysing outcomes from stroke trials. Methods: Individual patient data from acute, rehabilitation and stroke unit trials studying the effects of interventions which alter functional outcome were assessed. Outcomes included modified Rankin Scale, Barthel Index, and ‘3 questions’. Data were analysed using a variety of approaches which compare two treatment groups. The results for each statistical test for each trial were then compared. Results: Data from 55 datasets were obtained (47 trials, 54,173 patients). The test results differed substantially so that approaches which use the ordered nature of functional outcome data (ordinal logistic regression, t-test, robust ranks test, bootstrapping the difference in mean rank) were more efficient statistically than those which collapse the data into 2 groups (chi square) (ANOVA p<0.001). The findings were consistent across different types and sizes of trial and for the different measures of functional outcome. Conclusions: When analysing functional outcome from stroke trials, statistical tests which use the original ordered data are more efficient and more likely to yield reliable results. Suitable approaches included ordinal logistic regression, t-test, and robust ranks test.
Abstract:
Background and Purpose—Vascular prevention trials mostly count “yes/no” (binary) outcome events, eg, stroke/no stroke. Analysis of ordered categorical vascular events (eg, fatal stroke/nonfatal stroke/no stroke) is clinically relevant and could be more powerful statistically. Although this is not a novel idea in the statistical community, ordinal outcomes have not been applied to stroke prevention trials in the past. Methods—Summary data on stroke, myocardial infarction, combined vascular events, and bleeding were obtained by treatment group from published vascular prevention trials. Data were analyzed using 10 statistical approaches which allow comparison of 2 ordinal or binary treatment groups. The results for each statistical test for each trial were then compared using Friedman 2-way analysis of variance with multiple comparison procedures. Results—Across 85 trials (335 305 subjects) the test results differed substantially so that approaches which used the ordinal nature of stroke events (fatal/nonfatal/no stroke) were more efficient than those which combined the data to form 2 groups (P<0.0001). The most efficient tests were bootstrapping the difference in mean rank, Mann–Whitney U test, and ordinal logistic regression; 4- and 5-level data were more efficient still. Similar findings were obtained for myocardial infarction, combined vascular outcomes, and bleeding. The findings were consistent across different types, designs and sizes of trial, and for the different types of intervention. Conclusions—When analyzing vascular events from prevention trials, statistical tests which use ordered categorical data are more efficient and are more likely to yield reliable results than binary tests. This approach gives additional information on treatment effects by severity of event and will allow trials to be smaller. (Stroke. 2008;39:000-000.)
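One of the most efficient tests named above, bootstrapping the difference in mean rank, can be sketched as follows on synthetic ordinal outcomes. The scores, group sizes, and effect size are illustrative, not trial data:

```python
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(2)
treat = rng.integers(0, 6, 120)            # hypothetical ordinal scores 0-5, treated arm
ctrl = rng.integers(1, 7, 120)             # hypothetical scores 1-6, control arm (worse)

def mean_rank_diff(a, b):
    ranks = rankdata(np.concatenate([a, b]))   # joint ranks, ties averaged
    return ranks[: len(a)].mean() - ranks[len(a):].mean()

diffs = []
for _ in range(2000):
    ra = rng.choice(treat, len(treat))     # resample each arm with replacement
    rb = rng.choice(ctrl, len(ctrl))
    diffs.append(mean_rank_diff(ra, rb))

lo, hi = np.percentile(diffs, [2.5, 97.5])
print(round(lo, 1), round(hi, 1))          # 95% CI; negative favours treatment
```

A confidence interval that excludes zero indicates a treatment effect, while using the full ordered scale rather than a collapsed yes/no outcome.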
Abstract:
Background and Purpose—Most large acute stroke trials have been neutral. Functional outcome is usually analyzed using a yes or no answer, eg, death or dependency versus independence. We assessed which statistical approaches are most efficient in analyzing outcomes from stroke trials. Methods—Individual patient data from acute, rehabilitation and stroke unit trials studying the effects of interventions which alter functional outcome were assessed. Outcomes included modified Rankin Scale, Barthel Index, and “3 questions”. Data were analyzed using a variety of approaches which compare 2 treatment groups. The results for each statistical test for each trial were then compared. Results—Data from 55 datasets were obtained (47 trials, 54 173 patients). The test results differed substantially so that approaches which use the ordered nature of functional outcome data (ordinal logistic regression, t test, robust ranks test, bootstrapping the difference in mean rank) were more efficient statistically than those which collapse the data into 2 groups (χ2; ANOVA, P<0.001). The findings were consistent across different types and sizes of trial and for the different measures of functional outcome. Conclusions—When analyzing functional outcome from stroke trials, statistical tests which use the original ordered data are more efficient and more likely to yield reliable results. Suitable approaches included ordinal logistic regression, t test, and robust ranks test.
Abstract:
Family businesses are special in many respects. By examining their financial characteristics, one can come to unique conclusions. This paper explores the general characteristics of the financial behaviour of family businesses, presents the main findings of the INSIST project’s company case studies concerning financing issues and strategies, and intends to identify the financial characteristics of company succession. The whole existence of family businesses is characterized by a duality of the family and business dimensions, and this remains the case in their financial affairs. The financial decisions in family businesses (especially SMEs) are affected by a duality of goals rather than exclusively profitability, the simultaneous presence of family and business financial needs, and the preferential handling of family needs at the expense of business needs (although it has to be said that there is evidence of family investments being postponed for the sake of the business, too). Family businesses, beyond their actual effectiveness, are guided by individual goals like securing living standards, ensuring workplaces for family members, stability of operation, preservation of the company’s good reputation, and keeping the company’s size at a level that the immediate family can control and manage. The INSIST project’s company case studies revealed some interesting traits of family business finances, like the importance of financial support from the founder’s family during the establishment of the company, the use of bootstrapping techniques, the financial characteristics of succession, and the role of family members in financial management.
Abstract:
Social exchange theory and notions of reciprocity have long been assumed to explain the relationship between psychological contract breach and important employee outcomes. To date, however, there has been no explicit testing of these assumptions. This research, therefore, explores the mediating role of negative, generalized, and balanced reciprocity, in the relationships between psychological contract breach and employees’ affective organizational commitment and turnover intentions. A survey of 247 Pakistani employees of a large public university was analyzed using structural equation modeling and bootstrapping techniques, and provided excellent support for our model. As predicted, psychological contract breach was positively related to negative reciprocity norms and negatively related to generalized and balanced reciprocity norms. Negative and generalized (but not balanced) reciprocity were negatively and positively (respectively) related to employees’ affective organizational commitment and fully mediated the relationship between psychological contract breach and affective organizational commitment. Moreover, affective organizational commitment fully mediated the relationship between generalized and negative reciprocity and employees’ turnover intentions. Implications for theory and practice are discussed.
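The bootstrapping step in mediation analyses of this kind is typically a percentile bootstrap of the indirect effect a×b. A minimal sketch on synthetic standardized data; the variable names, effect sizes, and model are assumptions for illustration, not the study's estimates:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 247                                    # sample size matching the survey
X = rng.normal(size=n)                     # stand-in for contract breach
M = 0.5 * X + rng.normal(size=n)           # stand-in mediator (a-path ~ 0.5)
Y = -0.4 * M + rng.normal(size=n)          # stand-in commitment (b-path ~ -0.4)

def coefs(y, *cols):
    """OLS coefficients with an intercept column prepended."""
    A = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(A, y, rcond=None)[0]

ab = []
for _ in range(2000):
    i = rng.integers(0, n, n)              # resample cases with replacement
    a = coefs(M[i], X[i])[1]               # a-path: mediator on predictor
    b = coefs(Y[i], M[i], X[i])[1]         # b-path: outcome on mediator, controlling X
    ab.append(a * b)

lo, hi = np.percentile(ab, [2.5, 97.5])
print(round(lo, 2), round(hi, 2))          # CI excluding 0 supports mediation
```

The percentile bootstrap is preferred over normal-theory tests of a×b because the sampling distribution of a product of coefficients is generally skewed.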
Abstract:
This study focuses on multiple linear regression models relating six climate indices (temperature humidity (THI), environmental stress (ESI), equivalent temperature (ETI), heat load (HLI), modified heat load (HLInew), and respiratory rate predictor (RRP)) to three main components of cow’s milk (yield, fat, and protein) for cows in Iran. The least absolute shrinkage and selection operator (LASSO) and the Akaike information criterion (AIC) techniques are applied to select the best model for the milk predictands with the smallest number of climate predictors. Uncertainty is estimated by applying bootstrapping through resampling, and cross validation is used to avoid over-fitting. Climatic parameters are calculated from the NASA-MERRA global atmospheric reanalysis. Milk data for the months from April to September, 2002 to 2010, are used. The best linear regression models are found in spring between milk yield as the predictand and THI, ESI, ETI, HLI, and RRP as predictors, with p-value < 0.001 and R2 values of 0.50 and 0.49, respectively. In summer, milk yield with the independent variables THI, ETI, and ESI shows the strongest relationship (p-value < 0.001), with an R2 of 0.69. For fat and protein the results are only marginal. This method is suggested for impact studies of climate variability/change in the agriculture and food science fields when short time series or data with large uncertainty are available.
Abstract:
The main purpose of this study is to assess the relationship between four bioclimatic indices for cattle (environmental stress, heat load, modified heat load, and respiratory rate predictor indices) and three main milk components (fat, protein, and milk yield) considering uncertainty. The climate parameters used to calculate the climate indices were taken from the NASA-Modern Era Retrospective-Analysis for Research and Applications (NASA-MERRA) reanalysis from 2002 to 2010. Cow milk data were considered for the same period from April to September when the cows use the natural pasture. The study is based on a linear regression analysis using correlations as a summarizing diagnostic. Bootstrapping is used to represent uncertainty information in the confidence intervals. The main results identify an interesting relationship between the milk compounds and climate indices under all climate conditions. During spring, there are reasonably high correlations between the fat and protein concentrations vs. the climate indices, whereas there are insignificant dependencies between the milk yield and climate indices. During summer, the correlation between the fat and protein concentrations with the climate indices decreased in comparison with the spring results, whereas the correlation for the milk yield increased. This methodology is suggested for studies investigating the impacts of climate variability/change on food and agriculture using short term data considering uncertainty.
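The bootstrap confidence intervals for a correlation between a climate index and a milk component can be sketched as follows; the index values, milk yields, and sample size are synthetic stand-ins for the study's data:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 54                                     # e.g. monthly values, Apr-Sep, 2002-2010
thi = rng.normal(70, 5, n)                 # synthetic temperature-humidity index
milk = -0.3 * (thi - 70) + rng.normal(0, 2, n)   # synthetic milk yield anomaly

r_boot = []
for _ in range(2000):
    i = rng.integers(0, n, n)              # resample paired observations
    r_boot.append(np.corrcoef(thi[i], milk[i])[0, 1])

lo, hi = np.percentile(r_boot, [2.5, 97.5])
print(round(lo, 2), round(hi, 2))          # 95% percentile bootstrap CI for r
```

Resampling index/yield pairs (rather than residuals) preserves the joint distribution of the two series, which is what makes the interval honest about uncertainty in short records.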
Abstract:
The main purpose of this study is to assess the relationship between six bioclimatic indices for cattle (temperature humidity (THI), environmental stress (ESI), equivalent temperature (ETI), heat load (HLI), modified heat load (HLInew), and respiratory rate predictor (RRP)) and fundamental milk components (fat, protein, and milk yield), considering uncertainty. The climate parameters used to calculate the climate indices were taken from the NASA-Modern Era Retrospective-Analysis for Research and Applications (NASA-MERRA) reanalysis from 2002 to 2010. Cow milk data were considered for the same period from April to September, when cows use natural pasture and may choose either to stay in the barn or to graze on the pasture. The study is based on a linear regression analysis using correlations as a summarizing diagnostic. Bootstrapping through resampling is used to represent uncertainty in the confidence intervals. To find the relationships between the climate indices (THI, ETI, HLI, HLInew, ESI, and RRP) and the main components of cow milk (fat, protein, and yield), multiple linear regression is applied. The least absolute shrinkage and selection operator (LASSO) and the Akaike information criterion (AIC) techniques are applied to select the best model for the milk predictands with the smallest number of climate predictors. Cross validation is used to avoid over-fitting. Based on the results of investigating the effect of each heat stress index on the milk components separately, we suggest the use of ESI and RRP in summer and ESI in spring. THI and HLInew are suggested for fat content, and HLInew is also suggested for protein content, in the spring season. The best linear models are found in spring between milk yield as the predictand and THI, ESI, HLI, ETI, and RRP as predictors, with p-value < 0.001 and R2 values of 0.50 and 0.49.
For fat and protein the results are only marginal. It is strongly suggested that new indices are needed to monitor critical heat stress conditions, considering more predictors of the effect of climate variability on animal products, such as sunshine duration, quality of pasture, the number of days of stress (NDS), and skin colour (with attention to large black spots), as well as categorical predictors such as breed, welfare facility, and management system. This methodology is suggested for studies investigating the impacts of climate variability/change on food quality/security, animal science, and agriculture using short-term data, when data carry large uncertainty, or when data collection is expensive or difficult or the data contain gaps.