950 results for akaike information criterion
Abstract:
We predicted that the probability of egg occurrence of the salamander Salamandrina perspicillata depended on stream features and on predation by the native crayfish Austropotamobius fulcisianus and the introduced trout Salmo trutta. We assessed the presence of S. perspicillata at 54 sites within a natural reserve of southern Tuscany, Italy. Generalized linear models with binomial errors were constructed using egg presence/absence and altitude, mean stream size and slope, electrical conductivity, water pH and temperature, and a predation factor, defined according to the presence/absence of crayfish and trout. Some competing models also included an autocovariate term, which estimated how much the response variable at any one sampling point reflected response values at surrounding points. The resulting models were compared using Akaike's information criterion. Model selection led to a subset of 14 models with ΔAICc < 7 (i.e., models ranging from substantial support to considerably less support), and all but one of these included an effect of predation. Models with the autocovariate term had considerably more support than those without the term. According to multimodel inference, the presence of trout and crayfish reduced the probability of egg occurrence from a mean level of 0.90 (SE limits: 0.98-0.55) to 0.12 (SE limits: 0.34-0.04). The presence of crayfish alone had no detectable effects (SE limits: 0.86-0.39). The results suggest that introduced trout have a detrimental effect on the reproductive output of S. perspicillata and confirm the fundamental importance of distinguishing the roles of endogenous and exogenous forces that act on population distribution.
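The ranking by small-sample AIC differences described in this abstract can be illustrated with a short sketch. It is not the authors' code: the model names, log-likelihoods and parameter counts below are hypothetical, and only the sample size of 54 sites comes from the abstract.

```python
import math

def aicc(log_lik, k, n):
    """Small-sample corrected AIC for a model with k parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = -2.0 * log_lik + 2.0 * k
    return aic + 2.0 * k * (k + 1) / (n - k - 1)

def akaike_weights(scores):
    """Return (differences from the best score, normalized Akaike weights)."""
    best = min(scores)
    deltas = [s - best for s in scores]
    rel = [math.exp(-0.5 * d) for d in deltas]  # relative likelihood of each model
    total = sum(rel)
    return deltas, [r / total for r in rel]

N_SITES = 54  # number of sampling sites reported in the abstract

# Hypothetical (log-likelihood, number of parameters) pairs, for illustration only.
candidates = {
    "predation + autocovariate": (-20.1, 4),
    "predation": (-24.8, 3),
    "intercept only": (-33.0, 1),
}

scores = [aicc(ll, k, N_SITES) for ll, k in candidates.values()]
deltas, weights = akaike_weights(scores)
for name, d, w in zip(candidates, deltas, weights):
    print(f"{name}: delta AICc = {d:.2f}, weight = {w:.3f}")
```

Models whose difference from the best score falls below a chosen cutoff (7 in the abstract) would be retained for multimodel inference.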
Abstract:
We tested whether the distribution of three common springtail species (Gressittacantha terranova, Gomphiocephalus hodgsoni and Friesea grisea) in Victoria Land (Antarctica) could be modelled as a function of latitude, longitude, altitude and distance from the sea.
Victoria Land, Ross Dependency, Antarctica.
Generalized linear models were constructed using species presence/absence data relative to geographical features (latitude, longitude, altitude, distance from sea) across the species' entire ranges. Model results were then integrated with the known phylogeography of each species and hypotheses were generated on the role of climate as a major driver of Antarctic springtail distribution.
Based on model selection using Akaike's information criterion, the species' distributions were: hump-shaped relative to longitude and monotonic with altitude for Gressittacantha terranova; hump-shaped relative to latitude and monotonic with altitude for Gomphiocephalus hodgsoni; and hump-shaped relative to longitude and monotonic with latitude, altitude and distance from the sea for Friesea grisea.
No single distributional pattern was shared by the three species. While distributions were partially a response to climatic spatial clines, the patterns observed strongly suggest that past geological events have influenced the observed distributions. Accordingly, present-day spatial patterns are likely to have arisen from the interaction of historical and environmental drivers. Future studies will need to integrate a range of spatial and temporal scales to further quantify their respective roles.
Abstract:
Research on cluster analysis for categorical data continues to develop, with new clustering algorithms being proposed. However, in this context, the determination of the number of clusters is rarely addressed. We propose a new approach in which clustering and the estimation of the number of clusters are done simultaneously for categorical data. We assume that the data originate from a finite mixture of multinomial distributions and use a minimum message length (MML) criterion to select the number of clusters (Wallace and Bolton, 1986). For this purpose, we implement an EM-type algorithm (Silvestre et al., 2008) based on the approach of Figueiredo and Jain (2002). The novelty of the approach rests on the integration of the model estimation and selection of the number of clusters in a single algorithm, rather than selecting this number based on a set of pre-estimated candidate models. The performance of our approach is compared with the use of the Bayesian Information Criterion (BIC) (Schwarz, 1978) and Integrated Completed Likelihood (ICL) (Biernacki et al., 2000) using synthetic data. The obtained results illustrate the capacity of the proposed algorithm to attain the true number of clusters while outperforming BIC and ICL since it is faster, which is especially relevant when dealing with large data sets.
Abstract:
The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon, known as heterotachy, can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and that it serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.
Abstract:
We consider the finite sample properties of model selection by information criteria in conditionally heteroscedastic models. Recent theoretical results show that certain popular criteria are consistent in that they will select the true model asymptotically with probability 1. To examine the empirical relevance of this property, Monte Carlo simulations are conducted for a set of non–nested data generating processes (DGPs) with the set of candidate models consisting of all types of model used as DGPs. In addition, not only is the best model considered but also those with similar values of the information criterion, called close competitors, thus forming a portfolio of eligible models. To supplement the simulations, the criteria are applied to a set of economic and financial series. In the simulations, the criteria are largely ineffective at identifying the correct model, either as best or a close competitor, the parsimonious GARCH(1, 1) model being preferred for most DGPs. In contrast, asymmetric models are generally selected to represent actual data. This leads to the conjecture that the properties of parameterizations of processes commonly used to model heteroscedastic data are more similar than may be imagined and that more attention needs to be paid to the behaviour of the standardized disturbances of such models, both in simulation exercises and in empirical modelling.
Abstract:
São Paulo is the most developed state in Brazil and contains few fragments of native ecosystems, generally surrounded by intensive agricultural land. Despite this, some areas still shelter large native animals. We aimed at understanding how medium and large carnivores use a mosaic landscape of forest/savanna and agroecosystems, and how the species respond to different landscape parameters (percentage of landcover and edge density), in a multi-scale perspective. The response variables were: species richness, carnivore frequency and frequency for the three most recorded species (Puma concolor, Chrysocyon brachyurus and Leopardus pardalis). We compared 11 competing models using Akaike's information criterion (AIC) and assessed model support using AIC weights. Competing models were combinations of landcover types (native vegetation, "cerrado" formations, "cerrado" and eucalypt plantation), landscape feature (percentage of landcover and edge density) and spatial scale. Herein, spatial scale refers to the radius around a sampling point defining a circular landscape. The scales analyzed were 250 (fine), 1,000 (medium) and 2,000 m (coarse). The shape of curves for response variables (linear, exponential and power) was also assessed. Our results indicate that species with high mobility, P. concolor and C. brachyurus, were best explained by edge density of the native vegetation at a coarse scale (2,000 m). The frequencies of P. concolor and C. brachyurus showed a negative power-shaped response to the explanatory variables. This general trend was also observed for species richness and carnivore frequency. Species richness and P. concolor frequency were also well explained by a second competing model: edge density of cerrado at the fine (250 m) scale. A different response was recorded for L. pardalis, as the frequency was best explained by the amount of cerrado at the fine (250 m) scale. The response curve was linear and positive. The contrasting results (P. concolor and C. brachyurus vs L. pardalis) may be due to the much higher mobility of the first two species, in comparison with the third. In addition, L. pardalis requires higher-quality habitat than the other two species. This study highlights the importance of considering multiple spatial scales when evaluating species responses to different habitats. An important and new finding was the prevalence of edge density over habitat extent in explaining overall carnivore distribution, key information for the planning and management of protected areas.
Abstract:
1. Analyses of species association have major implications for selecting indicators for freshwater biomonitoring and conservation, because they allow for the elimination of redundant information and focus on taxa that can be easily handled and identified. These analyses are particularly relevant in the debate about using speciose groups (such as the Chironomidae) as indicators in the tropics, because they require difficult and time-consuming analysis, and their responses to environmental gradients, including anthropogenic stressors, are poorly known. 2. Our objective was to show whether chironomid assemblages in Neotropical streams include clear associations of taxa and, if so, how well these associations could be explained by a set of models containing information from different spatial scales. For this, we formulated a priori models that allowed for the influence of local, landscape and spatial factors on chironomid taxon associations (CTA). These models represented biological hypotheses capable of explaining associations between chironomid taxa. For instance, CTA could be best explained by local variables (e.g. pH, conductivity and water temperature) or by processes acting at wider landscape scales (e.g. percentage of forest cover). 3. Biological data were taken from 61 streams in Southeastern Brazil, 47 of which were in well-preserved regions, and 14 of which drained areas severely affected by anthropogenic activities. We adopted a model selection procedure using Akaike's information criterion to determine the most parsimonious models for explaining CTA. 4. Applying Kendall's coefficient of concordance, seven genera (Tanytarsus/Caladomyia, Ablabesmyia, Parametriocnemus, Pentaneura, Nanocladius, Polypedilum and Rheotanytarsus) were identified as associated taxa. The best-supported model explained 42.6% of the total variance in the abundance of associated taxa. This model combined local and landscape environmental filters and spatial variables (which were derived from eigenfunction analysis). However, the model with local filters and spatial variables also had a good chance of being selected as the best model. 5. Standardised partial regression coefficients of local and landscape filters, including spatial variables, derived from model averaging allowed an estimation of which variables were best correlated with the abundance of associated taxa. In general, the abundance of the associated genera tended to be lower in streams characterised by a high percentage of forest cover (landscape scale), lower proportion of muddy substrata and high values of pH and conductivity (local scale). 6. Overall, our main result adds to the increasing number of studies that have indicated the importance of local and landscape variables, as well as the spatial relationships among sampling sites, for explaining aquatic insect community patterns in streams. Furthermore, our findings open new possibilities for the elimination of redundant data in the assessment of anthropogenic impacts on tropical streams.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Fourteen random regression models were used to fit 86,598 test-day milk yield records from 2,155 first lactations of Caracu cows, truncated at 305 days. The models included the fixed effects of contemporary group and the covariate age of cow at calving. A cubic orthogonal regression was used to model the mean trajectory of the population. Additive genetic and permanent environmental effects were modelled by random regressions on cubic Legendre orthogonal polynomials. Different residual variance structures were tested, using classes containing 1, 10, 15 and 43 residual variances and variance functions (VF) based on ordinary and orthogonal polynomials whose orders ranged from quadratic to sixth order. The models were compared using the likelihood ratio test, Akaike's information criterion and Schwarz's Bayesian information criterion. The tests indicated that the higher the order of the variance function, the better the fit. Among the ordinary polynomials, the sixth-order function was superior. Models with residual variance classes were apparently superior to those with variance functions. The model with homogeneous variances was inadequate. The model with 15 heterogeneous classes gave the best fit to the residual variances; however, the estimated genetic parameters were very similar for the models with 10, 15 or 43 variance classes or with a sixth-order VF.
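The Legendre covariates used in random regressions of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how days in milk were rescaled or whether the polynomials were normalized, so the rescaling to [-1, 1], the sqrt((2j + 1) / 2) normalization and the day range 5 to 305 are all assumptions.

```python
import math

def standardize(t, t_min, t_max):
    """Map a time point (e.g. days in milk) linearly onto [-1, 1]."""
    return -1.0 + 2.0 * (t - t_min) / (t_max - t_min)

def legendre_basis(x, degree, normalized=True):
    """Legendre polynomials P_0..P_degree at x, via Bonnet's recurrence."""
    p = [1.0, x]
    for n in range(1, degree):
        # (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    p = p[: degree + 1]
    if normalized:
        # Assumed normalization: unit norm on [-1, 1].
        p = [math.sqrt((2 * j + 1) / 2.0) * pj for j, pj in enumerate(p)]
    return p

# Hypothetical test days between day 5 and day 305 of lactation.
for day in (5, 155, 305):
    x = standardize(day, 5, 305)
    row = legendre_basis(x, degree=3)  # cubic: four covariates per record
    print(day, [round(v, 4) for v in row])
```

Each record then contributes one such row of covariates to the design matrices of the additive genetic and permanent environmental effects.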
Abstract:
To estimate genetic parameters for test-day milk yield (TDMY), 2,440 first lactations of dairy Gir cows with calvings recorded between 1990 and 2005 were used. TDMY records were grouped into ten monthly classes and analysed with a random regression model (RRM) that included additive genetic, permanent environmental and residual effects as random effects and, as fixed effects, contemporary group (CG), the covariate age of cow at calving (linear and quadratic effects) and the mean lactation curve of the population. Additive genetic and permanent environmental effects were modelled using the Wilmink (WIL) and Ali and Schaeffer (AS) functions. Residual variances were modelled using 1, 4, 6 or 10 classes. Contemporary groups were defined as herd-year-season of test and contained at least three animals. The tests indicated that the model with four variance classes using the AS parametric function gave the best fit to the data. Heritability estimates ranged from 0.21 to 0.33 for the AS function and from 0.17 to 0.30 for WIL, and were higher in the first half of lactation. Genetic correlations between TDMY were positive and high between adjacent test days and decreased as the interval between test days increased. For the best model, breeding values were estimated for milk yield accumulated up to 305 days and, for partial periods of lactation, were obtained as means of the breeding values predicted in that period. The breeding values were compared, by rank correlation, with the breeding value predicted for yield accumulated up to 305 days by the traditional method. The correlations between breeding values indicated that animals may be ranked differently under the criteria studied.
Abstract:
A total of 35,732 weight records from birth to 660 days of age from 8,458 Tabapuã animals were used to estimate covariance functions using random regression models on Legendre polynomials. The models included: as random effects, the direct additive genetic, maternal genetic, animal permanent environmental and maternal permanent environmental effects; as fixed effects, contemporary group; as covariates, age of animal at weighing and age of cow at calving (linear and quadratic); and, on age at weighing, a Legendre orthogonal polynomial (cubic regression) to model the mean curve of the population. The residual was modelled with seven variance classes, and the models were compared using Schwarz's Bayesian and Akaike's information criteria. The best model had orders 4, 3, 6 and 3 for the direct additive genetic, maternal genetic, animal permanent environmental and maternal permanent environmental effects, respectively. Covariance and heritability estimates obtained with the two-trait model and with the random regression model were similar. Heritability estimates for the direct additive genetic effect obtained with the random regression model increased from birth (0.15) to 660 days of age (0.45). The highest maternal heritability estimates were obtained for weights measured shortly after birth. Genetic correlations ranged from moderate to high and decreased as the interval between weighings increased. Selection for heavier weights at any age promotes greater weight gain from birth to 660 days of age.
Abstract:
A total of 19,770 body weight records of Guzerá cattle from birth to 365 days of age, from the database of the Associação Brasileira dos Criadores de Zebu (ABCZ), were analysed in order to compare different residual variance structures, considering 1, 18, 28 and 53 residual classes and variance functions of quadratic to quintic order; and to estimate covariance functions of different orders for the direct additive genetic, maternal genetic, animal permanent environmental and maternal permanent environmental effects, as well as genetic parameters for body weights, using random regression models. Random effects were modelled by polynomial regressions on the Legendre scale with orders ranging from linear to quartic. Models were compared by the likelihood ratio test and by Akaike's information criterion and Schwarz's Bayesian information criterion. The model with 18 heterogeneous classes gave the best fit to the residual variances according to the statistical tests; however, the model with a fifth-order variance function also proved appropriate. The direct heritability estimates were higher than those found in the literature, ranging from 0.04 to 0.53, but followed the same trend as those estimated by single-trait analyses. Selection for weight at any age would improve weight at all ages in the interval studied.
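The three comparison tools named in this abstract can be written down in a few lines. The sketch below uses the textbook definitions; the log-likelihoods and parameter counts are hypothetical, only the record count of 19,770 comes from the abstract, and the abstract does not say which sample-size convention was used for BIC (REML software often uses the number of records minus the rank of the fixed-effects design matrix).

```python
import math

def aic(log_lik, n_params):
    """Akaike's information criterion (lower is better)."""
    return -2.0 * log_lik + 2.0 * n_params

def bic(log_lik, n_params, n_obs):
    """Schwarz's Bayesian information criterion (lower is better)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)

def lrt_statistic(log_lik_reduced, log_lik_full):
    """Likelihood ratio statistic for nested models; it is referred to a
    chi-square distribution with df = difference in number of parameters."""
    return 2.0 * (log_lik_full - log_lik_reduced)

N_RECORDS = 19770  # number of weight records reported in the abstract

# Hypothetical fits: (log-likelihood, number of (co)variance parameters).
homogeneous = (-25400.0, 41)  # one residual variance
classes_18 = (-25150.0, 58)   # 18 residual variance classes

for label, (ll, p) in (("1 class", homogeneous), ("18 classes", classes_18)):
    print(label, "AIC =", round(aic(ll, p), 1),
          "BIC =", round(bic(ll, p, N_RECORDS), 1))
print("LRT =", lrt_statistic(homogeneous[0], classes_18[0]),
      "on", classes_18[1] - homogeneous[1], "df")
```

BIC penalizes each extra parameter by ln(n) rather than 2, so with this many records it tends to favour more parsimonious residual structures than AIC does.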
Abstract:
The objective of this study was to compare the goodness of fit of four non-linear growth models, i.e. Brody, Gompertz, Logistic and Von Bertalanffy, in West African Dwarf (WAD) sheep. A total of 5274 monthly weight records from birth up to 180 days of age from 889 lambs, collected from 2001 to 2004 at the Betecoucou breeding farm in Benin, were used. In the preliminary analysis, the General Linear Model procedure of the Statistical Analysis Systems Institute was applied to the dataset to identify the significant effects of the sex of lamb (male and female), type of birth (single and twin), season of birth (rainy season and dry season), parity of dam (1, 2 and 3) and year of birth (2001, 2002, 2003 and 2004) on the observed birth weight and monthly weight up to 6 months of age. The model parameters (A, B and k), coefficient of determination (R²) and mean square error (MSE) were calculated using the Matlab® 2006 technical computing package. The mean values of A, B and k were substituted into each model to calculate the corresponding Akaike's Information Criterion (AIC). Among the four growth functions, the Brody model was selected for its accuracy of fit according to its higher R², lower MSE and lower AIC. Finally, the parameters A, B and k were adjusted in Matlab® 2006 for the sex of lamb, year of birth, season of birth, birth type and the parity of ewe, providing a specific slope of the Brody growth curve. The results of this study suggest that the Brody model can be useful for WAD sheep breeding in Betecoucou farm conditions through growth monitoring.
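The step of substituting parameter values into each growth model and computing an AIC can be sketched as below. The abstract does not give the model equations, the AIC formula or any data, so everything here is an assumption: the common three-parameter forms of the four curves, the least-squares AIC n ln(RSS/n) + 2p, and all weights and parameter values, which are invented for illustration.

```python
import math

# Common three-parameter forms (assumed): A = asymptotic weight,
# B = integration constant, k = maturation rate, t = age in days.
def brody(t, A, B, k):
    return A * (1.0 - B * math.exp(-k * t))

def gompertz(t, A, B, k):
    return A * math.exp(-B * math.exp(-k * t))

def logistic(t, A, B, k):
    return A / (1.0 + B * math.exp(-k * t))

def von_bertalanffy(t, A, B, k):
    return A * (1.0 - B * math.exp(-k * t)) ** 3

def aic_least_squares(observed, predicted, n_params=3):
    """Least-squares AIC, up to an additive constant: n ln(RSS / n) + 2p."""
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return n * math.log(rss / n) + 2 * n_params

# Hypothetical monthly weights (kg): a Brody curve plus a small alternating error.
ages = [0, 30, 60, 90, 120, 150, 180]
weights = [brody(t, 25.0, 0.9, 0.01) + (0.1 if i % 2 == 0 else -0.1)
           for i, t in enumerate(ages)]

# Hypothetical parameter values (A, B, k) for each model.
fits = {
    "Brody": (brody, (25.0, 0.9, 0.01)),
    "Gompertz": (gompertz, (22.0, 2.2, 0.02)),
    "Logistic": (logistic, (21.0, 7.0, 0.03)),
    "Von Bertalanffy": (von_bertalanffy, (23.0, 0.5, 0.015)),
}

aics = {name: aic_least_squares(weights, [f(t, *par) for t in ages])
        for name, (f, par) in fits.items()}
for name in sorted(aics, key=aics.get):
    print(f"{name}: AIC = {aics[name]:.2f}")
```

Because all four models have the same number of parameters, ranking them by this AIC is equivalent to ranking them by residual sum of squares.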