832 results for Poisson generalized linear mixed models
Abstract:
Short summary: This study was undertaken to assess the diversity of plant resources utilized by the local population in south-western Madagascar, the social, ecological and biophysical conditions that drive their uses and availability, and possible alternative strategies for their sustainable use in the region. The study region, the ‘Mahafaly region’, located in south-western Madagascar, is one of the country’s most economically, educationally and climatically disadvantaged regions. With an arid steppe climate, agricultural production is limited by low water availability and low levels of soil nutrients and soil organic carbon. The region comprises the recently extended Tsimanampetsotsa National Park, with numerous sacred and community forests, which are threatened by slash-and-burn agriculture and overexploitation of forest resources. The present study analyzed the availability of wild yams and medicinal plants, and their importance for the livelihood of the local population in this region. An ethnobotanical survey was conducted recording the diversity, local knowledge and use of wild yams and medicinal plants utilized by the local communities in five villages in the Mahafaly region. A total of 250 households were randomly selected for semi-structured interviews on the socio-economic characteristics of the households. The data allowed us to characterize sociocultural and socioeconomic factors that determine the local use of wild yams and medicinal plants, and to identify their role in the livelihoods of local people. Species-environment relationships and the current spatial distribution of the wild yams were investigated and predicted using ordination methods and a niche-based habitat modelling approach. Species response curves along edaphic gradients allowed us to understand the habitat requirements of the species.
We thus investigated various alternative methods to enhance wild yam regeneration for local conservation and sustainable use in the Mahafaly region. Altogether, six species of wild yams and a total of 214 medicinal plant species from 68 families and 163 genera were identified in the study region. Results of the cluster and discriminant analysis indicated a clear pattern in resource use, resulting in two groups of households characterized by differences in livestock numbers, off-farm activities, agricultural land and harvests. A generalized linear model highlighted that economic factors significantly affect the collection intensity of wild yams, while the use of medicinal plants depends to a higher degree on socio-cultural factors. The gradient analysis of the distribution of the wild yam species revealed a clear pattern for species habitats. Species models based on NPMR (Nonparametric Multiplicative Regression analysis) indicated the importance of vegetation structure, human interventions, and soil characteristics in determining wild yam species distribution. The prediction of the current availability of wild yam resources showed that abundant wild yam resources are scarce and face high harvest intensity. Experiments on yam cultivation revealed that germination of seeds was enhanced by pre-germination treatments before planting, and that vegetative regeneration performed better with the upper part of the tubers (corms) than with the sets of tubers. In-situ regeneration was possible for the upper parts of the wild tubers, but the success depended significantly on the type of soil. The use of manure (10-20 t ha⁻¹) increased the yield of D. alata and D. alatipes by 40%. We thus suggest the promotion of other cultivated varieties of D. alata found in regions neighbouring the Mahafaly Plateau.
Abstract:
We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIM's). Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parameters of the architecture. We also develop an on-line learning algorithm in which the parameters are updated incrementally. Comparative simulation results are presented in the robot dynamics domain.
Abstract:
The statistical analysis of literary style is the part of stylometry that compares measurable characteristics in a text that are rarely controlled by the author with those in other texts. When the goal is to settle authorship questions, these characteristics should relate to the author’s style and not to the genre, epoch or editor, and they should be such that their variation between authors is larger than the variation within comparable texts from the same author. For an overview of the literature on stylometry and some of the techniques involved, see for example Mosteller and Wallace (1964, 82), Herdan (1964), Morton (1978), Holmes (1985), Oakes (1998) or Lebart, Salem and Berry (1998). Tirant lo Blanc, a chivalry book, is the main work in Catalan literature and it was hailed as “the best book of its kind in the world” by Cervantes in Don Quixote. Considered by writers like Vargas Llosa or Damaso Alonso to be the first modern novel in Europe, it has been translated several times into Spanish, Italian and French, with modern English translations by Rosenthal (1996) and La Fontaine (1993). The main body of this book was written between 1460 and 1465, but it was not printed until 1490. There is an intense and long-lasting debate around its authorship stemming from its first edition, where the introduction states that the whole book is the work of Martorell (1413?-1468), while at the end it is stated that the last fourth of the book is by Galba (?-1490), after the death of Martorell. Some of the authors that support the theory of single authorship are Riquer (1990), Chiner (1993) and Badia (1993), while some of those supporting the double authorship are Riquer (1947), Coromines (1956) and Ferrando (1995). For an overview of this debate, see Riquer (1990). Neither of the two candidate authors left any text comparable to the one under study, and therefore discriminant analysis cannot be used to help classify chapters by author.
By using sample texts encompassing about ten percent of the book, and looking at word length and at the use of 44 conjunctions, prepositions and articles, Ginebra and Cabos (1998) detect heterogeneities that might indicate the existence of two authors. By analyzing the diversity of the vocabulary, Riba and Ginebra (2000) estimate that stylistic boundary to be near chapter 383. Following the lead of the extensive literature, this paper looks into word length, the use of the most frequent words and the use of vowels in each chapter of the book. Given that the features selected are categorical, this leads to three contingency tables of ordered rows and therefore to three sequences of multinomial observations. Section 2 explores these sequences graphically, observing a clear shift in their distribution. Section 3 describes the problem of estimating a sudden change-point in those sequences. In the following sections we propose various ways to estimate change-points in multinomial sequences: the method in Section 4 involves fitting models for polytomous data; the one in Section 5 fits gamma models onto the sequence of chi-square distances between each row profile and the average profile; and the one in Section 6 fits models onto the sequence of values taken by the first component of the correspondence analysis as well as onto sequences of other summary measures like the average word length. In Section 7 we fit models onto the marginal binomial sequences to identify the features that distinguish the chapters before and after that boundary. Most methods rely heavily on the use of generalized linear models.
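Illustrative sketch (not taken from the paper): the abstract above frames authorship attribution as estimating a single change-point in a sequence of multinomial observations. One simple way to do this, assumed here purely for illustration and not one of the paper's Section 4 to 7 methods, is to try every split of the sequence and keep the one that maximizes the sum of the two segment log-likelihoods. The chapter counts below are hypothetical.

```python
import math

def multinomial_loglik(rows):
    """Log-likelihood (up to a constant) of rows that share one category profile.

    The shared profile is estimated by pooling the counts of all rows.
    """
    totals = [sum(col) for col in zip(*rows)]
    n = sum(totals)
    # Categories with zero pooled count contribute nothing to the sum.
    return sum(c * math.log(c / n) for c in totals if c > 0)

def estimate_change_point(rows):
    """Return k such that the split rows[:k] | rows[k:] has maximal likelihood."""
    best_k, best_ll = None, -math.inf
    for k in range(1, len(rows)):
        ll = multinomial_loglik(rows[:k]) + multinomial_loglik(rows[k:])
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k

# Hypothetical counts per chapter for three word-length classes
# (made up for illustration; not data from Tirant lo Blanc).
chapters = [
    [30, 50, 20], [28, 52, 20], [31, 49, 20], [29, 51, 20],
    [20, 40, 40], [19, 41, 40], [21, 39, 40],
]
change_point = estimate_change_point(chapters)
print("estimated boundary before chapter index", change_point)
```

With real data the likelihood surface is rarely this clean, which is presumably why the paper compares several model-based estimators rather than relying on one.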
Abstract:
Even though antenatal care is universally regarded as important, the determinants of demand for antenatal care have not been widely studied. Evidence on which socioeconomic conditions influence whether a pregnant woman attends at least one antenatal consultation, and on how these factors affect absences from antenatal consultations, is very limited. In order to generate this evidence, a two-stage analysis was performed with data from the Demographic and Health Survey carried out by Profamilia in Colombia during 2005. The first stage was run as a logit model showing the marginal effects on the probability of attending the first visit, and an ordinary least squares model was fitted for the second stage. It was found that mothers living in the Pacific region, as well as young mothers, seem to have a lower probability of attending the first visit, but these factors are not related to the number of absences from antenatal consultations once the first visit has taken place. The effect of health insurance was surprising because of the differing effects across health insurers. Some family and personal conditions, such as willingness to have the last child and the number of previous children, proved to be important in determining demand. The mother’s educational attainment proved important whereas the father’s did not. This paper provides some elements for policy making aimed at increasing demand for antenatal care, as well as stimulating research on demand for specific health issues.
Abstract:
We study the role of natural resource windfalls in explaining the efficiency of public expenditures. Using a rich dataset of expenditures and public good provision for 1,836 municipalities in Peru for the period 2001-2010, we estimate a non-monotonic relationship between the efficiency of public good provision and the level of natural resource transfers. Local governments that were extremely favored by the boom in mineral prices were more efficient in using fiscal windfalls, whereas those that benefited from modest transfers were more inefficient. These results can be explained by the increase in political competition associated with the boom. However, the fact that increases in efficiency were related to reductions in public good provision casts doubt on the beneficial effects of political competition in promoting efficiency.
Abstract:
Models of the dynamics of nitrogen in soil (soil-N) can be used to aid the fertilizer management of a crop. The predictions of soil-N models can be validated by comparison with observed data. Validation generally involves calculating non-spatial statistics of the observations and predictions, such as their means, their mean squared-difference, and their correlation. However, when the model predictions are spatially distributed across a landscape the model requires validation with spatial statistics. There are three reasons for this: (i) the model may be more or less successful at reproducing the variance of the observations at different spatial scales; (ii) the correlation of the predictions with the observations may be different at different spatial scales; (iii) the spatial pattern of model error may be informative. In this study we used a model, parameterized with spatially variable input information about the soil, to predict the mineral-N content of soil in an arable field, and compared the results with observed data. We validated the performance of the N model spatially with a linear mixed model of the observations and model predictions, estimated by residual maximum likelihood. This novel approach allowed us to describe the joint variation of the observations and predictions as: (i) independent random variation that occurred at a fine spatial scale; (ii) correlated random variation that occurred at a coarse spatial scale; (iii) systematic variation associated with a spatial trend. The linear mixed model revealed that, in general, the performance of the N model changed depending on the spatial scale of interest. At the scales associated with random variation, the N model underestimated the variance of the observations, and the predictions were correlated poorly with the observations. At the scale of the trend, the predictions and observations shared a common surface. 
The spatial pattern of the error of the N model suggested that the observations were affected by the local soil condition, but this was not accounted for by the N model. In summary, the N model would be well-suited to field-scale management of soil nitrogen, but suited poorly to management at finer spatial scales. This information was not apparent with a non-spatial validation. (c) 2007 Elsevier B.V. All rights reserved.
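Illustrative sketch (not taken from the paper): the abstract above contrasts its spatial linear mixed model with the usual non-spatial validation, which compares the means, the mean squared difference and the correlation of observations and predictions. Those baseline statistics can be computed as below. The function name and the soil mineral-N values are hypothetical, and the spatial model itself is not reproduced here.

```python
import math

def validation_stats(observed, predicted):
    """Non-spatial validation statistics for paired observations and predictions."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    # Mean squared difference between each observation and its prediction.
    msd = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    # Pearson correlation from centred sums of squares and cross-products.
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    corr = cov / math.sqrt(ss_obs * ss_pred)
    return {
        "mean_observed": mean_obs,
        "mean_predicted": mean_pred,
        "mean_squared_difference": msd,
        "correlation": corr,
    }

# Hypothetical mineral-N values at five sampling points (illustration only).
observed = [10.0, 12.0, 15.0, 11.0, 14.0]
predicted = [11.0, 12.0, 14.0, 12.0, 13.0]
stats = validation_stats(observed, predicted)
print(stats)
```

These summaries ignore where each sampling point lies, so they cannot show whether agreement differs between fine and coarse spatial scales, which is the gap the abstract's spatial approach addresses.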
Abstract:
A physically motivated statistical model is used to diagnose variability and trends in wintertime (October-March) Global Precipitation Climatology Project (GPCP) pentad (5-day mean) precipitation. Quasi-geostrophic theory suggests that extratropical precipitation amounts should depend multiplicatively on the pressure gradient, saturation specific humidity, and the meridional temperature gradient. This physical insight has been used to guide the development of a suitable statistical model for precipitation using a mixture of generalized linear models: a logistic model for the binary occurrence of precipitation and a Gamma distribution model for the wet day precipitation amount. The statistical model allows for the investigation of the role of each factor in determining variations and long-term trends. Saturation specific humidity q(s) has a generally negative effect on global precipitation occurrence and on the tropical wet pentad precipitation amount, but has a positive relationship with the pentad precipitation amount at mid- and high latitudes. The North Atlantic Oscillation, a proxy for the meridional temperature gradient, is also found to have a statistically significant positive effect on precipitation over much of the Atlantic region. Residual time trends in wet pentad precipitation are extremely sensitive to the choice of the wet pentad threshold because of increasing trends in low-amplitude precipitation pentads; too low a choice of threshold can lead to a spurious decreasing trend in wet pentad precipitation amounts. However, for thresholds that are not too small, it is found that the meridional temperature gradient is an important factor for explaining part of the long-term trend in Atlantic precipitation.
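Illustrative sketch (not taken from the paper): the abstract above combines a logistic model for precipitation occurrence with a Gamma model for the wet-pentad amount. A minimal way to read such a two-part model is that expected precipitation equals the probability of a wet pentad times the mean amount given that it is wet. The log link for the Gamma part is an assumption made here because it turns additive predictors into multiplicative effects; the abstract does not state the link, and all coefficients below are hypothetical.

```python
import math

def linear_predictor(x, coef):
    """Intercept coef[0] plus the weighted sum of the predictors in x."""
    return coef[0] + sum(b * xi for b, xi in zip(coef[1:], x))

def occurrence_probability(x, coef):
    """Logistic (logit-link) model for the probability of a wet pentad."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(x, coef)))

def wet_amount_mean(x, coef):
    """Mean of the Gamma amount model under an assumed log link."""
    return math.exp(linear_predictor(x, coef))

def expected_precipitation(x, occ_coef, amt_coef):
    """Two-part expectation: P(wet) times E[amount | wet]."""
    return occurrence_probability(x, occ_coef) * wet_amount_mean(x, amt_coef)

# Hypothetical coefficients for two standardized predictors (illustration only).
occ_coef = [0.0, 0.5, -0.2]
amt_coef = [0.0, 0.3, 0.1]
baseline = expected_precipitation([0.0, 0.0], occ_coef, amt_coef)
print("expected precipitation at baseline:", baseline)
```

Under the assumed log link, raising the first predictor by one unit multiplies the wet-pentad mean by exp(0.3), which mirrors the multiplicative dependence the abstract attributes to quasi-geostrophic theory.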
Abstract:
An extensive statistical ‘downscaling’ study is done to relate large-scale climate information from a general circulation model (GCM) to local-scale river flows in SW France for 51 gauging stations ranging from nival (snow-dominated) to pluvial (rainfall-dominated) river-systems. This study helps to select the appropriate statistical method at a given spatial and temporal scale to downscale hydrology for future climate change impact assessment of hydrological resources. The four proposed statistical downscaling models use large-scale predictors (derived from climate model outputs or reanalysis data) that characterize precipitation and evaporation processes in the hydrological cycle to estimate summary flow statistics. The four statistical models used are generalized linear (GLM) and additive (GAM) models, aggregated boosted trees (ABT) and multi-layer perceptron neural networks (ANN). These four models were each applied at two different spatial scales, namely at that of a single flow-gauging station (local downscaling) and that of a group of flow-gauging stations having the same hydrological behaviour (regional downscaling). For each statistical model and each spatial resolution, three temporal resolutions were considered, namely the daily mean flows, the summary statistics of fortnightly flows and a daily ‘integrated approach’. The results show that flow sensitivity to atmospheric factors is significantly different between nival and pluvial hydrological systems, which are mainly influenced, respectively, by shortwave solar radiation and atmospheric temperature. The non-linear models (i.e. GAM, ABT and ANN) performed better than the linear GLM when simulating fortnightly flow percentiles. The aggregated boosted trees method showed higher and less variable R² values when downscaling the hydrological variability in both nival and pluvial regimes.
Based on GCM cnrm-cm3 and scenarios A2 and A1B, future relative changes of fortnightly median flows were projected based on the regional downscaling approach. The results suggest a global decrease of flow in both pluvial and nival regimes, especially in spring, summer and autumn, whatever the considered scenario. The discussion considers the performance of each statistical method for downscaling flow at different spatial and temporal scales as well as the relationship between atmospheric processes and flow variability.
Abstract:
Few studies have linked density dependence of parasitism and the tritrophic environment within which a parasitoid forages. In the non-crop plant-aphid system Centaurea nigra-Uroleucon jaceae, mixed patterns of density-dependent parasitism by the parasitoids Aphidius funebris and Trioxys centaureae were observed in a survey of a natural population. Breakdown of density-dependent parasitism revealed that density dependence was inverse in smaller colonies but direct in large colonies (>20 aphids), suggesting there is a threshold effect in parasitoid response to aphid density. The CV² of searching parasitoids was estimated from parasitism data using a hierarchical generalized linear model, and CV²>1 for A. funebris between plant patches, while for T. centaureae CV²>1 within plant patches. In both cases, density-independent heterogeneity was more important than density-dependent heterogeneity in parasitism. Parasitism by T. centaureae increased with increasing plant patch size. Manipulation of aphid colony size and plant patch size revealed that parasitism by A. funebris was directly density dependent at the range of colony sizes tested (50-200 initial aphids), and had a strong positive relationship with plant patch size. The effects of plant patch size detected for both species indicate that the tritrophic environment provides a source of host density-independent heterogeneity in parasitism, and can modify density-dependent responses. (c) 2007 Gesellschaft für Ökologie. Published by Elsevier GmbH. All rights reserved.
Abstract:
We introduce a procedure for association-based analysis of nuclear families that allows for dichotomous and more general measurements of phenotype and inclusion of covariate information. Standard generalized linear models are used to relate phenotype and its predictors. Our test procedure, based on the likelihood ratio, unifies the estimation of all parameters through the likelihood itself and yields maximum likelihood estimates of the genetic relative risk and interaction parameters. Our method has advantages in modelling the covariate and gene-covariate interaction terms over recently proposed conditional score tests that include covariate information via a two-stage modelling approach. We apply our method in a study of human systemic lupus erythematosus and the C-reactive protein that includes sex as a covariate.
Abstract:
BACKGROUND: The widespread occurrence of feminized male fish downstream of some wastewater treatment works has led to substantial interest from ecologists and public health professionals. This concern stems from the view that the effects observed have a parallel in humans, and that both phenomena are caused by exposure to mixtures of contaminants that interfere with reproductive development. The evidence for a "wildlife-human connection" is, however, weak: Testicular dysgenesis syndrome, seen in human males, is most easily reproduced in rodent models by exposure to mixtures of antiandrogenic chemicals. In contrast, the accepted explanation for feminization of wild male fish is that it results mainly from exposure to steroidal estrogens originating primarily from human excretion. OBJECTIVES: We sought to further explore the hypothesis that endocrine disruption in fish is multi-causal, resulting from exposure to mixtures of chemicals with both estrogenic and antiandrogenic properties. METHODS: We used hierarchical generalized linear and generalized additive statistical modeling to explore the associations between modeled concentrations and activities of estrogenic and antiandrogenic chemicals in 30 U.K. rivers and feminized responses seen in wild fish living in these rivers. RESULTS: In addition to the estrogenic substances, antiandrogenic activity was prevalent in almost all treated sewage effluents tested. Further, the results of the modeling demonstrated that feminizing effects in wild fish could be best modeled as a function of their predicted exposure to both anti-androgens and estrogens or to antiandrogens alone. CONCLUSION: The results provide a strong argument for a multicausal etiology of widespread feminization of wild fish in U.K. rivers involving contributions from both steroidal estrogens and xeno-estrogens and from other (as yet unknown) contaminants with antiandrogenic properties. 
These results may add further credence to the hypothesis that endocrine-disrupting effects seen in wild fish and in humans are caused by similar combinations of endocrine-disrupting chemical cocktails.
Abstract:
Objectives: To assess the potential source of variation that the surgeon may add to patient outcome in a clinical trial of surgical procedures. Methods: Two large (n = 1380) parallel multicentre randomized surgical trials, involving 43 surgeons, were undertaken to compare laparoscopically assisted hysterectomy with conventional methods of abdominal and vaginal hysterectomy. The primary end point of the trial was the occurrence of at least one major complication. Patients were nested within surgeons, giving the data set a hierarchical structure. A total of 10% of patients had at least one major complication, that is, a sparse binary outcome variable. A linear mixed logistic regression model (with logit link function) was used to model the probability of a major complication, with surgeon fitted as a random effect. Models were fitted using the method of maximum likelihood in SAS(R). Results: There were many convergence problems. These were resolved using a variety of approaches, including: treating all effects as fixed for the initial model building; modelling the variance of a parameter on a logarithmic scale; and centring continuous covariates. The initial model building process indicated no significant 'type of operation' by surgeon interaction effect in either trial; the 'type of operation' term was highly significant in the abdominal trial, and the 'surgeon' term was not significant in either trial. Conclusions: The analysis did not find a surgeon effect, but it is difficult to conclude that there was no difference between surgeons. The statistical test may have lacked sufficient power; the variance estimates were small with large standard errors, indicating that the precision of the variance estimates may be questionable.
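Illustrative sketch (not taken from the paper, which fitted its models in SAS): two of the convergence remedies named in the abstract above are centring continuous covariates and modelling a variance on a logarithmic scale. The snippet shows what each transformation does, together with the logit-link probability for a patient under a surgeon-level random intercept. All function names and numbers are hypothetical, and no model is actually fitted here.

```python
import math

def centre(values):
    """Subtract the sample mean so the covariate is centred at zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def variance_from_log(theta):
    """Map an unconstrained parameter theta to a variance exp(theta) > 0.

    An optimizer can then search over all real theta without ever
    proposing a negative variance.
    """
    return math.exp(theta)

def complication_probability(fixed_part, surgeon_effect):
    """Inverse logit of the fixed-effects predictor plus a surgeon random intercept."""
    return 1.0 / (1.0 + math.exp(-(fixed_part + surgeon_effect)))

# Hypothetical patient ages (illustration only).
ages_centred = centre([40.0, 50.0, 60.0])

# Baseline log-odds matching the 10% complication rate quoted in the abstract.
baseline_logit = math.log(0.1 / 0.9)
p_average_surgeon = complication_probability(baseline_logit, 0.0)
p_other_surgeon = complication_probability(baseline_logit, 0.5)  # hypothetical effect
print(ages_centred, p_average_surgeon, p_other_surgeon)
```

A very negative theta corresponds to a surgeon variance close to zero, which is consistent with the small variance estimates the abstract reports.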
Abstract:
This work analyzes the use of linear discriminant models, multi-layer perceptron neural networks and wavelet networks for corporate financial distress prediction. Although simple and easy to interpret, linear models require statistical assumptions that may be unrealistic. Neural networks are able to discriminate patterns that are not linearly separable, but the large number of parameters involved in a neural model often causes generalization problems. Wavelet networks are classification models that implement nonlinear discriminant surfaces as the superposition of dilated and translated versions of a single "mother wavelet" function. In this paper, an algorithm is proposed to select dilation and translation parameters that yield a wavelet network classifier with good parsimony characteristics. The models are compared in a case study involving failed and continuing British firms in the period 1997-2000. Problems associated with over-parameterized neural networks are illustrated and the Optimal Brain Damage pruning technique is employed to obtain a parsimonious neural model. The results, supported by a re-sampling study, show that both neural and wavelet networks may be a valid alternative to classical linear discriminant models.
Abstract:
Many studies warn that climate change may undermine global food security. Much work on this topic focuses on modelling crop-weather interactions, but these models do not generally account for the ways in which socio-economic factors influence how harvests are affected by weather. To address this gap, this paper uses a quantitative harvest vulnerability index based on annual soil moisture and grain production data as the dependent variables in a Linear Mixed Effects model with national scale socio-economic data as independent variables for the period 1990-2005. Results show that rice, wheat and maize production in middle income countries were especially vulnerable to droughts. By contrast, harvests in countries with higher investments in agriculture (e.g. higher amounts of fertilizer use) were less vulnerable to drought. In terms of differences between the world's major grain crops, factors that made rice and wheat crops vulnerable to drought were quite consistent, whilst those of maize crops varied considerably depending on the type of region. This is likely because maize is produced under very different conditions worldwide. One recommendation for reducing drought vulnerability risks is coordinated development and adaptation policies, including institutional support that enables farmers to take proactive action.