979 results for Multivariate generalized t-distribution
Abstract:
In longitudinal studies of disease, patients may experience several events during a follow-up period. In these studies, the sequentially ordered events are often of interest and lead to problems that have received much attention recently. Issues of interest include the estimation of bivariate survival, marginal distributions and the conditional distribution of gap times. In this work we consider the estimation of the survival function conditional on a previous event. Different nonparametric approaches will be considered for estimating these quantities, all based on the Kaplan-Meier estimator of the survival function. We explore the finite sample behavior of the estimators through simulations. The different methods proposed in this article are applied to a data set from a German Breast Cancer Study. The methods are used to obtain predictors for the conditional survival probabilities as well as to study the influence of recurrence on overall survival.
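The Kaplan-Meier estimator that the abstract names as the common building block can be sketched in a few lines. This is a minimal illustration only, not the article's conditional estimators; the function name `kaplan_meier` and the toy data are assumptions made for the example.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  -- follow-up time of each subject
    events -- 1 if the event was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)      # every subject leaves the risk set at its time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]     # remove events and censorings observed at t
    return curve


# Hypothetical toy data: five subjects, two of them censored.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
```

A conditional version would apply the same product-limit step to a suitably selected or weighted subsample; how that selection is done is exactly where the approaches compared in the article differ, so it is left out here.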
Abstract:
1. Model-based approaches have been used increasingly in conservation biology over recent years. Species presence data used for predictive species distribution modelling are abundant in natural history collections, whereas reliable absence data are sparse, most notably for vagrant species such as butterflies and snakes. As predictive methods such as generalized linear models (GLM) require absence data, various strategies have been proposed to select pseudo-absence data. However, only a few studies exist that compare different approaches to generating these pseudo-absence data. 2. Natural history collection data are usually available for long periods of time (decades or even centuries), thus allowing historical considerations. However, this historical dimension has rarely been assessed in studies of species distribution, although there is great potential for understanding current patterns, i.e. the past is the key to the present. 3. We used GLM to model the distributions of three 'target' butterfly species, Melitaea didyma, Coenonympha tullia and Maculinea teleius, in Switzerland. We developed and compared four strategies for defining pools of pseudo-absence data and applied them to natural history collection data from the last 10, 30 and 100 years. Pools included: (i) sites without target species records; (ii) sites where butterfly species other than the target species were present; (iii) sites without butterfly species but with habitat characteristics similar to those required by the target species; and (iv) a combination of the second and third strategies. Models were evaluated and compared by the total deviance explained, the maximized Kappa and the area under the curve (AUC). 4. Among the four strategies, model performance was best for strategy 3. Contrary to expectations, strategy 2 resulted in even lower model performance compared with models with pseudo-absence data simulated totally at random (strategy 1). 5. 
Independent of the strategy, model performance was enhanced when sites with historical species presence data were not considered as pseudo-absence data. Therefore, the combination of strategy 3 with species records from the last 100 years achieved the highest model performance. 6. Synthesis and applications. The protection of suitable habitat for species survival or reintroduction in rapidly changing landscapes is a high priority among conservationists. Model-based approaches offer planning authorities the possibility of delimiting priority areas for species detection or habitat protection. The performance of these models can be enhanced by fitting them with pseudo-absence data relying on large archives of natural history collection species presence data rather than using randomly sampled pseudo-absence data.
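Two of the evaluation measures named in this abstract, the area under the curve (AUC) and the maximized Kappa, can be computed directly from predicted scores at presence sites and at pseudo-absence sites. The sketch below is an assumption-laden illustration: the function names and the toy scores are hypothetical, and the study's own software and thresholds are not stated in the abstract.

```python
def auc(pres_scores, abs_scores):
    """Probability that a random presence site scores above a random
    pseudo-absence site; ties count one half (Mann-Whitney form)."""
    wins = 0.0
    for p in pres_scores:
        for a in abs_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(pres_scores) * len(abs_scores))


def max_kappa(pres_scores, abs_scores):
    """Largest Cohen's kappa over all thresholds taken from the scores.
    A site is predicted 'present' when its score is >= the threshold."""
    n = len(pres_scores) + len(abs_scores)
    best = float("-inf")
    for thr in sorted(set(pres_scores) | set(abs_scores)):
        tp = sum(s >= thr for s in pres_scores)
        fn = len(pres_scores) - tp
        fp = sum(s >= thr for s in abs_scores)
        tn = len(abs_scores) - fp
        po = (tp + tn) / n                                   # observed agreement
        pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2  # chance
        kappa = (po - pe) / (1 - pe) if pe < 1 else 0.0
        best = max(best, kappa)
    return best


# Hypothetical model scores for three presence and two pseudo-absence sites.
presence = [0.9, 0.8, 0.4]
pseudo_absence = [0.5, 0.3]
print(auc(presence, pseudo_absence), max_kappa(presence, pseudo_absence))
```

Because both measures depend on which sites are treated as absences, the same fitted model can score differently under each pseudo-absence strategy, which is the comparison the abstract describes.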
Abstract:
1. Landscape modification is often considered the principal cause of population decline in many bat species. Thus, schemes for bat conservation rely heavily on knowledge about species-landscape relationships. So far, however, few studies have quantified the possible influence of landscape structure on large-scale spatial patterns in bat communities. 2. This study presents quantitative models that use landscape structure to predict (i) spatial patterns in overall community composition and (ii) individual species' distributions through canonical correspondence analysis and generalized linear models, respectively. A geographical information system (GIS) was then used to draw up maps of (i) overall community patterns and (ii) distribution of potential species' habitats. These models relied on field data from the Swiss Jura mountains. 3. Eight descriptors of landscape structure accounted for 30% of the variation in bat community composition. For some species, more than 60% of the variance in distribution could be explained by landscape structure. Elevation, forest or woodland cover, lakes and suburbs were the most frequent predictors. 4. This study shows that community composition in bats is related to landscape structure through species-specific relationships to resources. Due to their nocturnal activities and the difficulties of remote identification, a comprehensive bat census is rarely possible, and we suggest that predictive modelling of the type described here provides an indispensable conservation tool.
Abstract:
Species distribution models (SDMs) are widely used to explain and predict species ranges and environmental niches. They are most commonly constructed by inferring species' occurrence-environment relationships using statistical and machine-learning methods. The variety of methods that can be used to construct SDMs (e.g. generalized linear/additive models, tree-based models, maximum entropy, etc.), and the variety of ways that such models can be implemented, permits substantial flexibility in SDM complexity. Building models with an appropriate amount of complexity for the study objectives is critical for robust inference. We characterize complexity as the shape of the inferred occurrence-environment relationships and the number of parameters used to describe them, and search for insights into whether additional complexity is informative or superfluous. By building 'under fit' models, having insufficient flexibility to describe observed occurrence-environment relationships, we risk misunderstanding the factors shaping species distributions. By building 'over fit' models, with excessive flexibility, we risk inadvertently ascribing pattern to noise or building opaque models. However, model selection can be challenging, especially when comparing models constructed under different modeling approaches. Here we argue for a more pragmatic approach: researchers should constrain the complexity of their models based on study objective, attributes of the data, and an understanding of how these interact with the underlying biological processes. We discuss guidelines for balancing under fitting with over fitting and consequently how complexity affects decisions made during model building. Although some generalities are possible, our discussion reflects differences in opinions that favor simpler versus more complex models. We conclude that combining insights from both simple and complex SDM building approaches best advances our knowledge of current and future species ranges.
Abstract:
In Alzheimer's disease (AD), synaptic alterations play a major role and are often correlated with cognitive changes. In order to better understand synaptic modifications, we compared alterations in NMDA receptors and postsynaptic protein PSD-95 expression in the entorhinal cortex (EC) and frontal cortex (FC; area 9) of AD and control brains. We combined immunohistochemical and image analysis methods to quantify on consecutive sections the distribution of PSD-95 and NMDA receptors GluN1, GluN2A and GluN2B in EC and FC from 25 AD and control cases. The density of stained receptors was analyzed using multivariate statistical methods to assess the effect of neurodegeneration. In both regions, the number of neuronal profiles immunostained for the GluN1 receptor subunit and PSD-95 protein was significantly increased in AD compared to controls (3-6 fold), while the number of neuronal profiles stained for the GluN2A and GluN2B receptor subunits was, on the contrary, decreased (3-4 fold). The increase in marked neuronal profiles was more prominent in a cortical band corresponding to layers 3 to 5 with large pyramidal cells. Neurons positive for GluN1 or PSD-95 staining were often found in the same localization on consecutive sections and they were also reactive for the anti-tau antibody AD2, indicating a neurodegenerative process. Differences in the density of immunoreactive puncta representing neuropile were not statistically significant. Altogether these data indicate that GluN1 and PSD-95 accumulate in the neuronal perikarya, but this is not the case for GluN2A and GluN2B, while the neuropile compartment is less subject to modifications. Thus, important variations in the pattern of distribution of the NMDA receptor subunits and PSD-95 represent a marker in AD and, by impairing the neuronal network, contribute to functional deterioration.
Abstract:
Epoetin-delta (Dynepo, Shire Pharmaceuticals, Basingstoke, UK) is a synthetic form of erythropoietin (EPO) whose resemblance to endogenous EPO makes it hard to identify using the classical identification criteria. Urine samples collected from six healthy volunteers treated with epoetin-delta injections and from a control population were immuno-purified and analyzed with the usual IEF method. On the basis of the EPO profiles integration, a linear multivariate model was computed for discriminant analysis. For each sample, a pattern classification algorithm returned a bands distribution and intensity score (bands intensity score) indicating how representative this sample is of one of the two classes, positive or negative. Effort profiles were also integrated in the model. The method yielded a good sensitivity versus specificity relation and was used to determine the detection window of the molecule following multiple injections. The bands intensity score, which can be generalized to epoetin-alpha and epoetin-beta, is proposed as an alternative criterion and as supplementary evidence for the identification of EPO abuse.
Abstract:
INTRODUCTION: Red cell distribution width was recently identified as a predictor of cardiovascular and all-cause mortality in patients with previous stroke. Red cell distribution width is also higher in patients with stroke compared with those without. However, there are no data on the association of red cell distribution width, assessed during the acute phase of ischemic stroke, with stroke severity and functional outcome. In the present study, we sought to investigate this relationship and ascertain the main determinants of red cell distribution width in this population. METHODS: We used data from the Acute Stroke Registry and Analysis of Lausanne for patients between January 2003 and December 2008. Red cell distribution width was generated at admission by the Sysmex XE-2100 automated cell counter from ethylene diamine tetraacetic acid blood samples stored at room temperature until measurement. A χ² test was performed to compare frequencies of categorical variables between different red cell distribution width quartiles, and one-way analysis of variance for continuous variables. The effect of red cell distribution width on severity and functional outcome was investigated in univariate and multivariate robust regression analysis. Level of significance was set at 95%. RESULTS: There were 1504 patients (72±15·76 years, 43·9% females) included in the analysis. Red cell distribution width was significantly associated with NIHSS (β-value=0·24, P=0·01) and functional outcome (odds ratio=10·73 for poor outcome, P<0·001) in univariate but not in multivariate analysis. Prehospital Rankin score (β=0·19, P<0·001), serum creatinine (β=0·008, P<0·001), hemoglobin (β=-0·009, P<0·001), mean platelet volume (β=0·09, P<0·05), age (β=0·02, P<0·001), low ejection fraction (β=0·66, P<0·001) and antihypertensive treatment (β=0·32, P<0·001) were independent determinants of red cell distribution width. 
CONCLUSIONS: Red cell distribution width, assessed during the early phase of acute ischemic stroke, does not predict severity or functional outcome.
Abstract:
We studied the distribution of Palearctic green toads (Bufo viridis subgroup), an anuran species group with three ploidy levels, inhabiting the Central Asian Amudarya River drainage. Various approaches (one-way, multivariate, components variance analyses and maximum entropy modelling) were used to estimate the effect of altitude, precipitation, temperature and land vegetation covers on the distribution of toads. It is usually assumed that polyploid species occur in regions with harsher climatic conditions (higher latitudes, elevations, etc.), but for the green toads complex, we revealed a more intricate situation. The diploid species (Bufo shaartusiensis and Bufo turanensis) inhabit the arid lowlands (from 44 to 789 m a.s.l.), while tetraploid Bufo pewzowi were recorded in mountainous regions (340-3492 m a.s.l.) with usually lower temperatures and higher precipitation rates than in the region inhabited by diploid species. The triploid species Bufo baturae was found in the Pamirs (Tajikistan) at the highest altitudes (2503-3859 m a.s.l.) under the harshest climatic conditions.
Abstract:
Human immunodeficiency virus (HIV)-positive patients have a greater prevalence of coinfection with human papillomavirus (HPV) of high oncogenic risk. Indeed, the presence of the virus favours intraepithelial squamous cell lesion progression and may induce cancer. The aim of this study was to evaluate the prevalence of HPV infection, distribution of HPV types and risk factors among HIV-positive patients. Cervical samples from 450 HIV-positive patients were analysed with regard to oncotic cytology, colposcopy and HPV presence and type by means of polymerase chain reaction and sequencing. The results were analysed by comparing demographic data and data relating to HPV and HIV infection. The prevalence of HPV was 47.5%. Among the HPV-positive samples, 59% included viral types of high oncogenic risk. Multivariate analysis showed an association between HPV infection and the presence of cytological alterations (p = 0.003), age greater than or equal to 35 years (p = 0.002), number of partners greater than three (p = 0.002), CD4+ lymphocyte count < 200/mm3 (p = 0.041) and alcohol abuse (p = 0.004). Although high-risk HPV was present in the majority of the lesions studied, the low frequency of HPV 16 (3.3%), low occurrence of cervical lesions and preserved immunological state in most of the HIV-positive patients were factors that may explain the low occurrence of precancerous cervical lesions in this population.
Abstract:
Aims Food-deceptive pollination, in which plants do not offer any food reward to their pollinators, is common within the Orchidaceae. As food-deceptive orchids are poorer competitors for pollinator visitation than rewarding orchids, their occurrence in a given habitat may be more constrained than that of rewarding orchids. In particular, the success of deceptive orchids strongly relies on several biotic factors such as interactions with co-flowering rewarding species and pollinators, which may vary with altitude and over time. Our study compares generalized food-deceptive (i.e. excluding sexually deceptive) and rewarding orchids to test whether (i) deceptive orchids flower earlier compared to their rewarding counterparts and whether (ii) the relative occurrence of deceptive orchids decreases with increasing altitude. Methods To compare the flowering phenology of rewarding and deceptive orchids, we analysed data compiled from the literature at the species level over the occidental Palaearctic area. Since flowering phenology can be constrained by the latitudinal distribution of the species and by their phylogenetic relationships, we accounted for these factors in our analysis. To compare the altitudinal distribution of rewarding and deceptive orchids, we used field observations made over the entire Swiss territory and over two Swiss mountain ranges. Important Findings We found that deceptive orchid species start flowering earlier than rewarding orchids do, which is in accordance with the hypotheses of exploitation of naive pollinators and/or avoidance of competition with rewarding co-occurring species. Also, the relative frequency of deceptive orchids decreases with altitude, suggesting that deception may be less profitable at high compared to low altitude.
Abstract:
We present models predicting the potential distribution of a threatened ant species, Formica exsecta Nyl., in the Swiss National Park (SNP). Data to fit the models have been collected according to a random-stratified design with an equal number of replicates per stratum. The basic aim of such a sampling strategy is to allow the formal testing of biological hypotheses about those factors most likely to account for the distribution of the modeled species. The stratifying factors used in this study were: vegetation, slope angle and slope aspect, the latter two being used as surrogates of solar radiation, considered one of the basic requirements of F. exsecta. Results show that, although the basic stratifying predictors account for more than 50% of the deviance, the incorporation of additional non-spatially explicit predictors into the model, as measured in the field, allows for an increased model performance (up to nearly 75%). However, this was not corroborated by permutation tests. Implementation on a national scale was made for one model only, due to the difficulty of obtaining similar predictors on this scale. The resulting map on the national scale suggests that the species might once have had a broader distribution in Switzerland. Reasons for its particular abundance within the SNP might possibly be related to habitat fragmentation and vegetation transformation outside the SNP boundaries.
Abstract:
An important statistical development of the last 30 years has been the advance in regression analysis provided by generalized linear models (GLMs) and generalized additive models (GAMs). Here we introduce a series of papers prepared within the framework of an international workshop entitled: Advances in GLMs/GAMs modeling: from species distribution to environmental management, held in Riederalp, Switzerland, 6-11 August 2001. We first discuss some general uses of statistical models in ecology, as well as provide a short review of several key examples of the use of GLMs and GAMs in ecological modeling efforts. We next present an overview of GLMs and GAMs, and discuss some of their related statistics used for predictor selection, model diagnostics, and evaluation. Included is a discussion of several new approaches applicable to GLMs and GAMs, such as ridge regression, an alternative to stepwise selection of predictors, and methods for the identification of interactions by a combined use of regression trees and several other approaches. We close with an overview of the papers and how we feel they advance our understanding of their application to ecological modeling.
Abstract:
This paper combines multivariate density forecasts of output growth, inflation and interest rates from a suite of models. An out-of-sample weighting scheme based on the predictive likelihood as proposed by Eklund and Karlsson (2005) and Andersson and Karlsson (2007) is used to combine the models. Three classes of models are considered: a Bayesian vector autoregression (BVAR), a factor-augmented vector autoregression (FAVAR) and a medium-scale dynamic stochastic general equilibrium (DSGE) model. Using Australian data, we find that, at short forecast horizons, the Bayesian VAR model is assigned the most weight, while at intermediate and longer horizons the factor model is preferred. The DSGE model is assigned little weight at all horizons, a result that can be attributed to the DSGE model producing density forecasts that are very wide when compared with the actual distribution of observations. While a density forecast evaluation exercise reveals little formal evidence that the optimally combined densities are superior to those from the best-performing individual model, or a simple equal-weighting scheme, this may be a result of the short sample available.
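The weighting idea can be illustrated with a small sketch: each model's weight is taken proportional to the exponential of its summed out-of-sample log predictive density. This assumes equal prior model probabilities and is not claimed to reproduce the paper's exact scheme; the function names and all numbers below are hypothetical.

```python
import math


def predictive_likelihood_weights(log_scores):
    """Map {model: summed out-of-sample log predictive density} to
    normalized weights (equal prior model probabilities assumed)."""
    m = max(log_scores.values())            # subtract max for numerical stability
    unnorm = {k: math.exp(v - m) for k, v in log_scores.items()}
    total = sum(unnorm.values())
    return {k: u / total for k, u in unnorm.items()}


def combined_density(weights, densities, y):
    """Linear pool: weighted average of the models' predictive densities at y."""
    return sum(weights[k] * densities[k](y) for k in weights)


def normal_pdf(mean, sd):
    """Univariate normal density, used here only as a stand-in forecast."""
    return lambda y: math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


# Hypothetical log scores: the wide (DSGE-like) density scores worst.
weights = predictive_likelihood_weights({"BVAR": -12.0, "FAVAR": -12.5, "DSGE": -16.0})
pool = combined_density(
    weights,
    {"BVAR": normal_pdf(2.0, 0.5), "FAVAR": normal_pdf(2.2, 0.6), "DSGE": normal_pdf(2.0, 2.0)},
    2.1,
)
```

In this toy setup the widest density receives a small weight, which mirrors the qualitative finding reported for the DSGE model without reproducing any of the paper's figures.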
Abstract:
Aim This study used data from temperate forest communities to assess: (1) five different stepwise selection methods with generalized additive models, (2) the effect of weighting absences to ensure a prevalence of 0.5, (3) the effect of limiting absences beyond the environmental envelope defined by presences, (4) four different methods for incorporating spatial autocorrelation, and (5) the effect of integrating an interaction factor defined by a regression tree on the residuals of an initial environmental model. Location State of Vaud, western Switzerland. Methods Generalized additive models (GAMs) were fitted using the grasp package (generalized regression analysis and spatial predictions, http://www.cscf.ch/grasp). Results Model selection based on cross-validation appeared to be the best compromise between model stability and performance (parsimony) among the five methods tested. Weighting absences returned models that perform better than models fitted with the original sample prevalence. This appeared to be mainly due to the impact of very low prevalence values on evaluation statistics. Removing zeroes beyond the range of presences on main environmental gradients changed the set of selected predictors, and potentially their response curve shape. Moreover, removing zeroes slightly improved model performance and stability when compared with the baseline model on the same data set. Incorporating a spatial trend predictor improved model performance and stability significantly. Even better models were obtained when including local spatial autocorrelation. A novel approach to include interactions proved to be an efficient way to account for interactions between all predictors at once. 
Main conclusions Models and spatial predictions of 18 forest communities were significantly improved by using either: (1) cross-validation as a model selection method, (2) weighted absences, (3) limited absences, (4) predictors accounting for spatial autocorrelation, or (5) a factor variable accounting for interactions between all predictors. The final choice of model strategy should depend on the nature of the available data and the specific study aims. Statistical evaluation is useful in searching for the best modelling practice. However, one should not neglect to consider the shapes and interpretability of response curves, as well as the resulting spatial predictions in the final assessment.
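Weighting absences so that the effective prevalence equals 0.5 amounts to giving every absence a case weight of (number of presences) / (number of absences) while presences keep weight 1. The sketch below shows only that arithmetic; the function names are hypothetical and it does not use or imitate the grasp package.

```python
def absence_weights(labels):
    """Case weights that bring weighted prevalence to 0.5.

    labels -- iterable of 1 (presence) and 0 (absence)
    """
    labels = list(labels)
    n_pres = sum(1 for y in labels if y == 1)
    n_abs = len(labels) - n_pres
    if n_pres == 0 or n_abs == 0:
        raise ValueError("need at least one presence and one absence")
    w_abs = n_pres / n_abs      # total absence weight then equals n_pres
    return [1.0 if y == 1 else w_abs for y in labels]


def weighted_prevalence(labels, weights):
    """Share of the total weight carried by presences."""
    pres = sum(w for y, w in zip(labels, weights) if y == 1)
    return pres / sum(weights)


# Hypothetical sample with raw prevalence 0.25.
obs = [1, 0, 0, 0, 0, 1, 0, 0]
w = absence_weights(obs)
```

Such weights would typically be passed to the model-fitting routine as case weights; which routine accepts them, and under what argument name, depends on the software used.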
Abstract:
Principal curves have been defined by Hastie and Stuetzle (JASA, 1989) as smooth curves passing through the middle of a multidimensional data set. They are nonlinear generalizations of the first principal component, a characterization of which is the basis for the principal curves definition. In this paper we propose an alternative approach based on a different property of principal components. Consider a point in the space where a multivariate normal is defined and, for each hyperplane containing that point, compute the total variance of the normal distribution conditioned to belong to that hyperplane. Choose now the hyperplane minimizing this conditional total variance and look for the corresponding conditional mean. The first principal component of the original distribution passes through this conditional mean and is orthogonal to that hyperplane. This property is easily generalized to data sets with nonlinear structure. Repeating the search from different starting points, many points analogous to conditional means are found. We call them principal oriented points. When a one-dimensional curve runs through the set of these special points, it is called a principal curve of oriented points. Successive principal curves are recursively defined from a generalization of the total variance.
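The property of the multivariate normal described in this abstract can be checked with a short derivation. The notation below (μ, Σ, b, x, λ_i, v_i) is introduced here for illustration and is not taken from the paper; it assumes Σ is positive definite with a strictly largest eigenvalue.

```latex
% Setting: X ~ N_p(\mu, \Sigma), a point x, a unit vector b, and the
% hyperplane H(x,b) = \{ y : b^{\top}(y - x) = 0 \}.
\begin{align*}
\operatorname{E}\!\left[X \mid b^{\top}X = b^{\top}x\right]
  &= \mu + \frac{\Sigma b\, b^{\top}(x-\mu)}{b^{\top}\Sigma b}, \\
\operatorname{Cov}\!\left[X \mid b^{\top}X = b^{\top}x\right]
  &= \Sigma - \frac{\Sigma b\, b^{\top}\Sigma}{b^{\top}\Sigma b}, \\
\operatorname{TV}(b)
  &= \operatorname{tr}(\Sigma) - \frac{b^{\top}\Sigma^{2} b}{b^{\top}\Sigma b}.
\end{align*}
% Write \Sigma = \sum_i \lambda_i v_i v_i^{\top} with
% \lambda_1 > \lambda_2 \ge \dots \ge \lambda_p > 0 and b = \sum_i c_i v_i:
\begin{align*}
\frac{b^{\top}\Sigma^{2} b}{b^{\top}\Sigma b}
  = \frac{\sum_i c_i^{2}\lambda_i^{2}}{\sum_i c_i^{2}\lambda_i}
  \le \lambda_1 ,
  \qquad \text{with equality if and only if } b = \pm v_1 .
\end{align*}
% Hence TV(b) is minimized by b = v_1, and the conditional mean becomes
\begin{align*}
\operatorname{E}\!\left[X \mid v_1^{\top}X = v_1^{\top}x\right]
  = \mu + v_1 v_1^{\top}(x-\mu),
\end{align*}
% the orthogonal projection of x onto the line \{\mu + t v_1\}, which is
% the first principal component and is orthogonal to H(x, v_1).
```

The ratio in the second display is a weighted average of the eigenvalues, which is why it cannot exceed the largest one; the data-driven generalization in the paper replaces these population quantities with local sample analogues.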