917 results for Generalized Least Squares Estimation
Abstract:
We studied the relationship between flower size and nectar properties of hummingbird-visited flowers in the Brazilian Atlantic Forest. We analysed nectar volume and concentration as a function of corolla length and the average bill size of visitors for 150 plant species, using phylogenetic generalized least squares (PGLS) to control for phylogenetic signal in the data. We found that nectar volume is positively correlated with corolla length due to phylogenetic allometry. We also demonstrated that larger flowers provide better rewards for long-billed hummingbirds. Regardless of the causal mechanisms, our results support the hypothesis that the morphological floral traits that drive partitioning among hummingbirds correspond to the quantity of resources produced by the flowers in the Atlantic Forest. We demonstrate that the relationship between nectar properties and flower size is affected by phylogenetic constraints, and thus future studies assessing interactions between floral traits need to control for phylogenetic signal in the data.
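To make the PGLS step concrete, the sketch below sets up a generalized least squares fit whose error covariance is a phylogenetic (Brownian-motion) matrix, as in a PGLS of nectar volume on corolla length. This is an illustrative toy, not the authors' analysis: the species covariance matrix, sample size, and coefficients are all invented.

```python
# Minimal PGLS sketch: GLS with a phylogenetic (Brownian-motion) covariance.
# All data and the covariance matrix are made up for illustration.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5  # number of species (toy example)

# Hypothetical phylogenetic covariance: shared branch length between species
# on an ultrametric tree. In practice this is computed from a dated phylogeny.
Sigma = np.array([
    [1.0, 0.8, 0.3, 0.3, 0.1],
    [0.8, 1.0, 0.3, 0.3, 0.1],
    [0.3, 0.3, 1.0, 0.6, 0.1],
    [0.3, 0.3, 0.6, 1.0, 0.1],
    [0.1, 0.1, 0.1, 0.1, 1.0],
])

corolla_length = rng.uniform(10, 40, n)  # predictor (mm), hypothetical
nectar_volume = 0.5 * corolla_length + rng.multivariate_normal(np.zeros(n), Sigma)

X = sm.add_constant(corolla_length)
pgls = sm.GLS(nectar_volume, X, sigma=Sigma).fit()  # GLS with phylogenetic Sigma
print(pgls.params, pgls.pvalues)
```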
Abstract:
It is shown that variance-balanced designs can be obtained from Type I orthogonal arrays for many general models with two kinds of treatment effects, including ones for interference, with general dependence structures. These designs can be used to obtain optimal and efficient designs. Some examples and design comparisons are given.
Abstract:
In this article we investigate the asymptotic and finite-sample properties of predictors in regression models with autocorrelated errors. We prove new theorems on the predictive efficiency of generalized least squares (GLS) and incorrectly structured GLS predictors. We also establish the form of their predictive mean squared errors, as well as the magnitude of these errors relative to each other and to those generated by the ordinary least squares (OLS) predictor. A large simulation study is used to evaluate the finite-sample performance of forecasts generated from models using different corrections for the serial correlation.
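As a hedged illustration of the contrast studied here (not the paper's own simulation design), the following sketch simulates a regression with AR(1) errors and compares the OLS fit with a feasible GLS fit; all parameter values are arbitrary.

```python
# Toy comparison of OLS vs feasible GLS when errors are AR(1).
# Simulated data; rho and the coefficients are illustrative choices.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n, rho = 200, 0.7

x = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):                 # AR(1) disturbance
    e[t] = rho * e[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + e

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()
gls = sm.GLSAR(y, X, rho=1).iterative_fit(maxiter=10)  # estimates rho, then GLS

print("OLS:", ols.params, ols.bse)
print("GLS:", gls.params, gls.bse)    # typically more efficient under AR(1)
```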
Abstract:
Analysis of risk measures associated with price series movements and their prediction is of strategic importance in financial markets as well as to policy makers, in particular for short- and long-term planning in setting economic growth targets. For example, oil-price risk management focuses primarily on when and how an organization can best prevent costly exposure to price risk. Value-at-Risk (VaR) is the commonly practised instrument for measuring risk and is evaluated by analysing the negative/positive tail of the probability distribution of returns (profit or loss). In modelling applications, least-squares estimation (LSE)-based linear regression models are often employed for modelling and analysing correlated data. These linear models are optimal and perform relatively well under conditions such as errors following normal or approximately normal distributions, being free of large outliers, and satisfying the Gauss-Markov assumptions. However, in practical situations the LSE-based linear regression models often fail to provide optimal results, for instance in non-Gaussian situations, especially when the errors follow fat-tailed distributions and may not possess a finite variance. This is the situation in risk analysis, which involves analysing tail distributions. Thus, applications of LSE-based regression models may be questioned for appropriateness and may have limited applicability. We have carried out a risk analysis of Iranian crude oil price data based on Lp-norm regression models and have noted that the LSE-based models do not always perform best. We discuss results from the L1-, L2- and L∞-norm based linear regression models. ACM Computing Classification System (1998): B.1.2, F.1.3, F.2.3, G.3, J.2.
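The sketch below illustrates the Lp-norm regression idea on simulated fat-tailed data, not the Iranian crude oil series analysed in the paper, fitting the same linear model under the L1, L2 and L∞ norms.

```python
# Sketch of L1-, L2- and Linf-norm linear regression on toy fat-tailed data.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n = 100
x = rng.uniform(0, 10, n)
y = 3.0 + 1.5 * x + rng.standard_t(df=2, size=n)   # heavy-tailed noise

X = np.column_stack([np.ones(n), x])

def fit(norm):
    # Minimize the chosen norm of the residual vector over (intercept, slope).
    obj = lambda b: np.linalg.norm(y - X @ b, ord=norm)
    return minimize(obj, x0=np.zeros(2), method="Nelder-Mead").x

for norm, name in [(1, "L1"), (2, "L2"), (np.inf, "Linf")]:
    print(name, fit(norm))   # L1 is typically the most outlier-robust here
```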
Abstract:
This dissertation examines the monetary models of exchange rate determination for Brazil, Canada, and two countries in the Caribbean, namely the Dominican Republic and Jamaica. With the exception of Canada, the others adopted the floating regime during the past ten years. The empirical validity of four seminal models in exchange rate economics was determined. Three of these models were entirely classical (Bilson and Frenkel) or Keynesian (Dornbusch) in nature. The fourth model (the Real Interest Differential Model) was a mixture of the two schools of economic theory. There is no clear empirical evidence of the validity of the monetary models. However, the signs of the coefficients of the nominal interest differential variable were as predicted by the Keynesian hypothesis in the case of Canada, and as predicted by the Chicago theorists in the remaining countries. Moreover, in the case of Brazil, due to hyperinflation, the exchange rate is heavily influenced by the domestic money supply. I also tested purchasing power parity (PPP) for this same set of countries. For both the monetary and the PPP hypotheses, I tested for cointegration and applied the ordinary least squares estimation procedure. The error correction model was also used for the PPP model, to determine convergence to equilibrium. The validity of PPP is also questionable for my set of countries. Endogeneity among the regressors, as well as the lack of proper price indices, are contributing factors. More importantly, Central Bank intervention negates rapid adjustment of prices and exchange rates to their equilibrium values. However, its forecasting capability for the period 1993-1994 is superior to that of the monetary models in two of the four cases. I conclude that in spite of the questionable validity of these models, the monetary models give better results in the case of "smaller" economies like the Dominican Republic and Jamaica, where monetary influences swamp the other determinants of the exchange rate.
Influence of environmental conditions on the greenness of Caatinga vegetation in the face of climate change
Abstract:
The Caatinga biome, a semi-arid ecosystem in northeast Brazil, presents a low rainfall regime and strong seasonality. It has the most alarming climate change projections within the country, with air temperature rising and rainfall declining at rates stronger than the global average predictions. Climate change can have detrimental effects in this biome, reducing vegetation cover and changing its distribution, altering ecosystem functioning as a whole, and ultimately influencing species diversity. In this context, the purpose of this study is to model the environmental conditions (rainfall and temperature) that influence the productivity of the Caatinga biome and to predict the consequences of those conditions for vegetation dynamics under future climate change scenarios. The Enhanced Vegetation Index (EVI) was used to estimate vegetation greenness (presence and density) in the area. Given the strong spatial and temporal autocorrelation as well as the heterogeneity of the data, various GLS models were developed and compared to obtain the model that best reflects the influence of rainfall and temperature on vegetation greenness. When new climate change scenarios were applied in the model, the modified environmental determinants (rainfall and temperature) negatively influenced vegetation greenness in the Caatinga biome. The model was used to create potential vegetation maps of current and future Caatinga cover, considering a 20% decrease in precipitation and a 1 °C increase in temperature until 2040, a 35% decrease in precipitation and a 2.5 °C increase in temperature in the period 2041-2070, and a 50% decrease in precipitation and a 4.5 °C increase in temperature in the period 2071-2100. The results suggest that ecosystem functioning will be affected under the future climate change scenarios, with vegetation greenness decreasing by 5.9% until 2040, 14.2% until 2070 and 24.3% by the end of the century. The Caatinga vegetation in lower-altitude areas (most of the biome) will be the most affected by climatic changes.
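A minimal sketch of how such a GLS model with temporal autocorrelation could be specified is given below; the EVI, rainfall and temperature values and the AR(1) correlation structure are hypothetical stand-ins for the study's data.

```python
# Hedged sketch: GLS of vegetation greenness (EVI) on rainfall and temperature
# with an explicit AR(1) temporal covariance. All values are hypothetical.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n, rho = 120, 0.5                          # e.g. 120 monthly observations

rainfall = rng.gamma(2.0, 30.0, n)         # mm
temperature = 25 + 3 * rng.normal(size=n)  # degrees C
evi = 0.2 + 0.002 * rainfall - 0.01 * (temperature - 25) + rng.normal(0, 0.05, n)

# AR(1) covariance: corr(t, s) = rho ** |t - s|
lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
Sigma = rho ** lags

X = sm.add_constant(np.column_stack([rainfall, temperature]))
model = sm.GLS(evi, X, sigma=Sigma).fit()
print(model.params)   # intercept, rainfall effect, temperature effect
```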
Abstract:
The evolution of reproductive strategies involves a complex calculus of costs and benefits to both parents and offspring. Many marine animals produce embryos packaged in tough egg capsules or gelatinous egg masses attached to benthic surfaces. While these egg structures can protect against environmental stresses, the packaging is energetically costly for parents to produce. In this series of studies, I examined a variety of ecological factors affecting the evolution of benthic development as a life history strategy. I used marine gastropods as my model system because they are incredibly diverse and abundant worldwide, and they exhibit a variety of reproductive and developmental strategies.
The first study examines predation on benthic egg masses. I investigated: 1) behavioral mechanisms of predation when embryos are targeted (rather than the whole egg mass); 2) the specific role of gelatinous matrix in predation. I hypothesized that gelatinous matrix does not facilitate predation. One study system was the sea slug Olea hansineensis, an obligate egg mass predator, feeding on the sea slug Haminoea vesicula. Olea fed intensely and efficiently on individual Haminoea embryos inside egg masses but showed no response to live embryos removed from gel, suggesting that gelatinous matrix enables predation. This may be due to mechanical support of the feeding predator by the matrix. However, Haminoea egg masses outnumber Olea by two orders of magnitude in the field, and each egg mass can contain many tens of thousands of embryos, so predation pressure on individuals is likely not strong. The second system involved the snail Nassarius vibex, a non-obligate egg mass predator, feeding on the polychaete worm Clymenella mucosa. Gel neither inhibits nor promotes embryo predation for Nassarius, but because it cannot target individual embryos inside an egg mass, its feeding is slow and inefficient, and feeding rates in the field are quite low. However, snails that compete with Nassarius for scavenged food have not been seen to eat egg masses in the field, leaving Nassarius free to exploit the resource. Overall, egg mass predation in these two systems likely benefits the predators much more than it negatively affects the prey. Thus, selection for environmentally protective aspects of egg mass production may be much stronger than selection for defense against predation.
In the second study, I examined desiccation resistance in intertidal egg masses made by Haminoea vesicula, which preferentially attaches its flat, ribbon-shaped egg masses to submerged substrata. Egg masses occasionally detach and become stranded on exposed sand at low tide. Unlike adults, the encased embryos cannot avoid desiccation by selectively moving about the habitat, and the egg mass shape has a high surface-area-to-volume ratio that should make it prone to drying out. Thus, I hypothesized that the embryos would not survive stranding. I tested this by deploying individual egg masses of two age classes on exposed sand bars for the duration of low tide. After rehydration, embryos midway through development showed higher rates of survival than newly-laid embryos, though for both stages survival rates over 25% were frequently observed. Laboratory desiccation trials showed that >75% survival is possible in an egg mass that has lost 65% of its water weight, and some survival (<25%) was observed even after 83% of water weight had been lost. Although many surviving embryos in both experiments showed damage, these data demonstrate that egg mass stranding is not necessarily fatal to embryos. They may be able to survive a far greater range of conditions than they normally encounter, compensating for their lack of ability to move. Also, desiccation tolerance of embryos may reduce pressure on parents to find optimal laying substrata.
The third study takes a big-picture approach to investigating the evolution of different developmental strategies in cone snails, the largest genus of marine invertebrates. Cone snail species hatch out of their capsules as either swimming larvae or non-dispersing forms, and their developmental mode has direct consequences for biogeographic patterns. Variability in life history strategies among taxa may be influenced by biological, environmental, or phylogenetic factors, or a combination of these. While most prior research has examined these factors individually, my aim was to investigate the effects of a host of intrinsic, extrinsic, and historical factors on two fundamental aspects of life history: egg size and egg number. I used phylogenetic generalized least-squares regression models to examine relationships between these two egg traits and a variety of hypothesized intrinsic and extrinsic variables. Adult shell morphology and spatial variability in productivity and salinity across a species' geographic range had the strongest effects on egg diameter and number of eggs per capsule. Phylogeny had no significant influence. Developmental mode in Conus appears to be influenced mostly by species-level adaptations and niche specificity rather than phylogenetic conservatism. Patterns of egg size and egg number appear to reflect energetic tradeoffs with body size and specific morphologies as well as adaptations to variable environments. Overall, this series of studies highlights the importance of organism-scale biotic and abiotic interactions in evolutionary patterns.
Abstract:
1. Genome-wide association studies (GWAS) enable detailed dissections of the genetic basis for organisms' ability to adapt to a changing environment. In long-term studies of natural populations, individuals are often marked at one point in their life and then repeatedly recaptured. It is therefore essential that a method for GWAS include the process of repeated sampling. In a GWAS, the effects of thousands of single-nucleotide polymorphisms (SNPs) need to be fitted, and any model development is constrained by the computational requirements. A method is therefore required that can fit a highly hierarchical model while remaining computationally fast enough to be useful. 2. Our method fits fixed SNP effects in a linear mixed model that can include both random polygenic effects and permanent environmental effects. In this way, the model can correct for population structure and model repeated measures. The covariance structure of the linear mixed model is first estimated and subsequently used in a generalized least squares setting to fit the SNP effects. The method was evaluated in a simulation study based on observed genotypes from a long-term study of collared flycatchers in Sweden. 3. The method we present here was successful in estimating permanent environmental effects from simulated repeated-measures data. Additionally, we found that, especially for phenotypes with large between-year variation, the repeated-measurements model gives a substantial increase in power compared to a model using average phenotypes as the response. 4. The method is available in the R package RepeatABEL. It increases the power of GWAS with repeated measures, especially for long-term studies of natural populations, and the R implementation is expected to facilitate modelling of longitudinal data for studies of both animal and human populations.
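A toy sketch of the two-stage idea, an estimated phenotype covariance reused in a GLS fit of each SNP effect, is given below. This is not RepeatABEL itself: the covariance matrix, genotypes and phenotypes are simulated placeholders.

```python
# Sketch of the two-stage idea: given an estimated phenotype covariance V
# (from a mixed model with polygenic + permanent-environment effects),
# each SNP effect is fitted by GLS. V and the data are simulated stand-ins.
import numpy as np

rng = np.random.default_rng(4)
n_obs, n_snps = 300, 5

# Stand-in covariance: in the two-stage method this comes from the mixed model.
A = rng.normal(size=(n_obs, n_obs))
V = A @ A.T / n_obs + np.eye(n_obs)
Vinv = np.linalg.inv(V)

y = rng.normal(size=n_obs)
snps = rng.integers(0, 3, size=(n_obs, n_snps)).astype(float)  # 0/1/2 genotypes

for j in range(n_snps):
    X = np.column_stack([np.ones(n_obs), snps[:, j]])
    # GLS estimator: (X' V^-1 X)^-1 X' V^-1 y
    beta = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
    print(f"SNP {j}: effect = {beta[1]:+.3f}")
```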
Abstract:
Generalized linear mixed models (GLMMs) provide an elegant framework for the analysis of correlated data. Because the likelihood has no closed form, GLMMs are often fit by computational procedures like penalized quasi-likelihood (PQL). Special cases of these models are generalized linear models (GLMs), which are often fit using algorithms like iterative weighted least squares (IWLS). High computational costs and memory constraints often make it difficult to apply these iterative procedures to data sets with a very large number of cases. This paper proposes a computationally efficient strategy based on the Gauss-Seidel algorithm that iteratively fits sub-models of the GLMM to subsetted versions of the data. Additional gains in efficiency are achieved for Poisson models, commonly used in disease mapping problems, because of their special collapsibility property, which allows data reduction through summaries. Convergence of the proposed iterative procedure is guaranteed for canonical link functions. The strategy is applied to investigate the relationship between ischemic heart disease, socioeconomic status and age/gender category in New South Wales, Australia, based on outcome data consisting of approximately 33 million records. A simulation study demonstrates the algorithm's reliability in analyzing a data set with 12 million records for a (non-collapsible) logistic regression model.
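For the GLM special case mentioned above, the sketch below implements plain IWLS for a logistic regression on toy data; it illustrates the iterative weighted least-squares step, not the paper's Gauss-Seidel subsetting strategy.

```python
# Minimal IWLS (iteratively reweighted least squares) for logistic regression,
# the GLM special case mentioned in the abstract. Toy data only.
import numpy as np

rng = np.random.default_rng(5)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_beta = np.array([-0.5, 1.0, -2.0])
p = 1 / (1 + np.exp(-X @ true_beta))
y = rng.binomial(1, p)

beta = np.zeros(3)
for _ in range(25):                       # IWLS iterations
    eta = X @ beta
    mu = 1 / (1 + np.exp(-eta))
    W = mu * (1 - mu)                     # variance weights
    z = eta + (y - mu) / W                # working response
    # Weighted least-squares step: beta = (X'WX)^-1 X'Wz
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
print(beta)                               # should approach true_beta
```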
Distributed Estimation Over an Adaptive Incremental Network Based on the Affine Projection Algorithm
Abstract:
We study the problem of distributed estimation based on the affine projection algorithm (APA), which is developed from Newton's method for minimizing a cost function. The proposed solution is formulated to ameliorate the limited convergence properties of least-mean-square (LMS) type distributed adaptive filters with colored inputs. The transient and steady-state performance at each individual node within the network is analysed using a weighted spatial-temporal energy conservation relation and confirmed by computer simulations. The simulation results also verify that the proposed algorithm provides not only a faster convergence rate but also improved steady-state performance compared to an LMS-based scheme. In addition, the new approach attains acceptable misadjustment performance at lower computational and memory cost than a distributed recursive-least-squares (RLS) based method, provided the number of regressor vectors and the filter length are appropriately chosen.
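The per-node APA weight update takes the form w <- w + mu * U'(UU' + eps*I)^{-1} e. Below is a hedged single-node sketch on colored input: the distributed incremental version circulates the weight estimate through the nodes, but each local update has this same form. The filter length, projection order and step size here are illustrative choices.

```python
# Sketch of the affine projection algorithm (APA) for one adaptive filter.
# Toy colored (AR(1)) input, the case where APA outperforms LMS.
import numpy as np

rng = np.random.default_rng(6)
M, K, mu, eps = 8, 4, 0.5, 1e-4     # filter length, projection order, step, reg.
w_true = rng.normal(size=M)

u = np.zeros(2000)
for t in range(1, len(u)):
    u[t] = 0.9 * u[t - 1] + rng.normal()   # colored input signal

w = np.zeros(M)
for t in range(M + K, len(u)):
    # U: the K most recent regressor vectors (rows), d: desired responses.
    U = np.array([u[t - k - M + 1 : t - k + 1][::-1] for k in range(K)])
    d = U @ w_true + 0.01 * rng.normal(size=K)
    e = d - U @ w
    w += mu * U.T @ np.linalg.solve(U @ U.T + eps * np.eye(K), e)  # APA update

print(np.round(w - w_true, 3))       # should be near zero
```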
Abstract:
The international literature analyzing the factors that drive related-party transactions is concentrated in the United Kingdom, the United States, and Asia, with Brazil remaining a little-investigated setting. This research aims to investigate both the determinants of related-party contracts and the impact of these transactions on the performance of Brazilian companies. Recent studies investigating the determinants of related-party transactions (RPTs), as well as their impacts on firm performance, have considered the two strands presented by Gordon, Henry and Palia (2004): (a) conflicts of interest, which support the view that RPTs are harmful to minority shareholders, implying expropriation of their wealth by controlling (majority) shareholders; and (b) efficient transactions, which can be beneficial to companies, thereby serving their underlying economic objectives. This research is grounded in the conflict-of-interest strand, based on agency theory and on the fact that the Brazilian setting is characterized by concentrated ownership structures and by being an emerging country whose legal environment offers weak protection to minority shareholders. To operationalize the research, an initial sample of 70 companies listed on the BM&FBovespa was used, covering the period from 2010 to 2012. Related-party contracts were identified and quantified in two ways, following the methodology applied by Kohlbeck and Mayhew (2004; 2010) and Silveira, Prado and Sasso (2009). As the main determinants, proxies were investigated to capture the effects of corporate governance mechanisms and the legal environment, firm performance, deviations between control rights and cash-flow rights, and excess executive compensation. Control variables were also added to isolate intrinsic firm characteristics. In the econometric analyses, models were estimated by Poisson, pooled cross-section (pooled OLS) and logit methods. Estimation was performed by ordinary least squares (OLS), and to increase the robustness of the econometric estimates, instrumental variables estimated by the generalized method of moments (GMM) were used. The evidence indicates that the investigated factors affect the various RPT measures of the analyzed companies differently. Related-party contracts were found, in general, to be harmful to companies, negatively affecting their performance, a performance that is enhanced by the presence of effective corporate governance mechanisms. The results on the impact of corporate governance measures and intrinsic firm characteristics on performance are robust to the presence of endogeneity, based on the instrumental-variables regressions.
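As a hedged illustration of the instrumental-variables logic invoked above (the study itself used GMM-based estimation on firm-level data), the sketch below runs a textbook two-stage least squares on simulated data with one endogenous regressor.

```python
# Hedged 2SLS sketch: one endogenous regressor, one instrument, simulated data.
import numpy as np

rng = np.random.default_rng(7)
n = 1000
z = rng.normal(size=n)                      # instrument
u = rng.normal(size=n)                      # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)        # endogenous regressor
y = 1.0 + 0.5 * x + u + rng.normal(size=n)  # outcome (u biases OLS)

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])

# 2SLS: project X on Z, then regress y on the fitted values.
Xhat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
beta_iv = np.linalg.lstsq(Xhat, y, rcond=None)[0]
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print("OLS :", beta_ols)   # slope biased upward by the confounder
print("2SLS:", beta_iv)    # slope close to the true 0.5
```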
Abstract:
The portfolio generating the iTraxx EUR index is modeled by coupled Markov chains. Each of the industries of the portfolio evolves according to its own Markov transition matrix. Using a variant of the method of moments, the model parameters are estimated from a Standard and Poor's data set. Swap spreads are evaluated by Monte-Carlo simulations. Along with an actuarially fair spread, a least-squares spread is considered.
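The sketch below is only a loose toy of the Monte-Carlo side: it simulates independent rating paths under a single invented transition matrix and reads off an undiscounted fair spread, whereas the paper couples the industry chains and calibrates to Standard and Poor's data.

```python
# Toy Monte-Carlo sketch: simulate rating paths of a portfolio under a Markov
# transition matrix and estimate a fair spread from simulated default losses.
# The matrix, portfolio size, and contract terms are invented for illustration.
import numpy as np

rng = np.random.default_rng(8)
P = np.array([[0.90, 0.08, 0.02],     # states: good, watch, default
              [0.10, 0.80, 0.10],
              [0.00, 0.00, 1.00]])    # default is absorbing
names, horizon, n_paths = 125, 5, 2000

losses = np.zeros(n_paths)
for i in range(n_paths):
    states = np.zeros(names, dtype=int)        # all names start in "good"
    for _ in range(horizon):
        for j in range(names):
            if states[j] != 2:
                states[j] = rng.choice(3, p=P[states[j]])
    losses[i] = np.mean(states == 2)           # fraction defaulted

# Actuarially fair (undiscounted) spread per year: expected loss / horizon.
print("fair spread per year:", losses.mean() / horizon)
```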
Abstract:
This study addresses the issue of the presence of a unit root in growth rate estimation by the least-squares approach. We argue that when the log of a variable contains a unit root, i.e., it is not stationary, the growth rate estimate from the log-linear trend model is not a valid representation of the actual growth of the series. In fact, under such a situation, we show that the growth of the series is the cumulative impact of a stochastic process. As such, the growth estimate from such a model is just a spurious representation of the actual growth of the series, which we refer to as a "pseudo growth rate". Hence such an estimate should be interpreted with caution. On the other hand, we highlight that the statistical representation of a series as containing a unit root is not easy to separate from an alternative description which represents the series as fundamentally deterministic (no unit root) but containing a structural break. In search of a way around this, our study presents a survey of both the theoretical and empirical literature on unit root tests that take possible structural breaks into account. We show that when a series is trend-stationary with breaks, it is possible to use the log-linear trend model to obtain well-defined estimates of growth rates for sub-periods which are valid representations of the actual growth of the series. Finally, to highlight the above issues, we carry out an empirical application in which we estimate meaningful growth rates of real wages per worker for 51 industries from the organised manufacturing sector in India for the period 1973-2003, estimates which are not only unbiased but also asymptotically efficient. We use these growth rate estimates to highlight the evolving inter-industry wage structure in India.
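The sketch below reproduces the pitfall in miniature: an ADF test for a unit root, followed by the log-linear trend regression whose slope would be reported as a growth rate. The simulated series is a random walk with drift, so the resulting estimate is exactly the "pseudo growth rate" the study warns about; the wage data themselves are not reproduced.

```python
# Sketch: test for a unit root, then estimate a growth rate from a
# log-linear trend. Simulated series, not the Indian wage data.
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(9)
T = 200
log_y = np.cumsum(0.02 + 0.05 * rng.normal(size=T))  # random walk with drift

stat, pvalue, *_ = adfuller(log_y, regression="ct")  # trend-stationarity test
print(f"ADF p-value: {pvalue:.3f}")   # high p-value: unit root not rejected

t = np.arange(T)
trend = sm.OLS(log_y, sm.add_constant(t)).fit()
growth = np.exp(trend.params[1]) - 1
# With a unit root present, this is the spurious "pseudo growth rate".
print(f"log-linear trend growth estimate: {growth:.4f}")
```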
Abstract:
Customer satisfaction and retention are key issues for organizations in today's competitive marketplace. As such, much research and revenue has been invested in developing accurate ways of assessing consumer satisfaction at both the macro (national) and micro (organizational) level, facilitating comparisons in performance both within and between industries. Since the inception of the national customer satisfaction indices (CSI), partial least squares (PLS) has been used to estimate the CSI models in preference to structural equation models (SEM) because it does not rely on strict assumptions about the data. However, this choice was based on some misconceptions about the use of SEMs and does not take into consideration more recent advances in SEM, including estimation methods that are robust to non-normality and missing data. In this paper, the SEM and PLS approaches are compared by evaluating perceptions of the Isle of Man Post Office's products and customer service using a CSI format. The new robust SEM procedures were found to be advantageous over PLS. Product quality was found to be the only driver of customer satisfaction, while image and satisfaction were the only predictors of loyalty, thus arguing for the specificity of postal services.
Abstract:
Machine learning provides tools for the automated construction of predictive models in data-intensive areas of engineering and science. The family of regularized kernel methods has in recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem: learning to rank. In learning to rank, the aim is to learn, from a set of past observations, a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings; examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction, and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods based on this approach has in the past proven to be challenging. Moreover, it is not clear which techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how the techniques can be implemented efficiently. The contributions of this thesis are as follows. First, we develop RankRLS, a computationally efficient kernel method for learning to rank that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the most well-established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts: Part I provides the background for the research work and summarizes the most central results, while Part II consists of the five original research articles that are the main contribution of this thesis.
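As a hedged sketch of the pairwise least-squares idea behind RankRLS (the linear special case, without the thesis's matrix-algebra shortcuts), the following fits a ranking model by ridge regression over all pairwise feature and label differences.

```python
# Linear sketch of the pairwise least-squares ranking idea: fit w so that
# score differences match label differences over all pairs, with ridge
# regularization. Naive O(n^2) construction; RankRLS avoids forming the
# pairs explicitly via matrix-algebra shortcuts.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(10)
n, d, lam = 60, 4, 1.0
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)   # relevance scores

pairs = list(combinations(range(n), 2))
D = np.array([X[i] - X[j] for i, j in pairs])   # pairwise feature differences
t = np.array([y[i] - y[j] for i, j in pairs])   # pairwise label differences

# Ridge solution of min_w ||D w - t||^2 + lam ||w||^2
w = np.linalg.solve(D.T @ D + lam * np.eye(d), D.T @ t)

# Ranking quality: fraction of pairs ordered correctly (related to AUC).
correct = np.mean((D @ w) * t > 0)
print("pairwise accuracy:", round(correct, 3))
```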