898 resultados para Exponential Random Graph Model
Resumo:
In this paper is presented a region-based methodology for Digital Elevation Model segmentation obtained from laser scanning data. The methodology is based on two sequential techniques, i.e., a recursive splitting technique using the quad tree structure followed by a region merging technique using the Markov Random Field model. The recursive splitting technique starts splitting the Digital Elevation Model into homogeneous regions. However, due to slight height differences in the Digital Elevation Model, region fragmentation can be relatively high. In order to minimize the fragmentation, a region merging technique based on the Markov Random Field model is applied to the previously segmented data. The resulting regions are firstly structured by using the so-called Region Adjacency Graph. Each node of the Region Adjacency Graph represents a region of the Digital Elevation Model segmented and two nodes have connectivity between them if corresponding regions share a common boundary. Next it is assumed that the random variable related to each node, follows the Markov Random Field model. This hypothesis allows the derivation of the posteriori probability distribution function whose solution is obtained by the Maximum a Posteriori estimation. Regions presenting high probability of similarity are merged. Experiments carried out with laser scanning data showed that the methodology allows to separate the objects in the Digital Elevation Model with a low amount of fragmentation.
Resumo:
Studies investigating the use of random regression models for genetic evaluation of milk production in Zebu cattle are scarce. In this study, 59,744 test-day milk yield records from 7,810 first lactations of purebred dairy Gyr (Bos indicus) and crossbred (dairy Gyr × Holstein) cows were used to compare random regression models in which additive genetic and permanent environmental effects were modeled using orthogonal Legendre polynomials or linear spline functions. Residual variances were modeled considering 1, 5, or 10 classes of days in milk. Five classes fitted the changes in residual variances over the lactation adequately and were used for model comparison. The model that fitted linear spline functions with 6 knots provided the lowest sum of residual variances across lactation. On the other hand, according to the deviance information criterion (DIC) and Bayesian information criterion (BIC), a model using third-order and fourth-order Legendre polynomials for additive genetic and permanent environmental effects, respectively, provided the best fit. However, the high rank correlation (0.998) between this model and that applying third-order Legendre polynomials for additive genetic and permanent environmental effects, indicates that, in practice, the same bulls would be selected by both models. The last model, which is less parameterized, is a parsimonious option for fitting dairy Gyr breed test-day milk yield records. © 2013 American Dairy Science Association.
Resumo:
Random regression models have been widely used to estimate genetic parameters that influence milk production in Bos taurus breeds, and more recently in B. indicus breeds. With the aim of finding appropriate random regression model to analyze milk yield, different parametric functions were compared, applied to 20,524 test-day milk yield records of 2816 first-lactation Guzerat (B. indicus) cows in Brazilian herds. The records were analyzed by random regression models whose random effects were additive genetic, permanent environmental and residual, and whose fixed effects were contemporary group, the covariable cow age at calving (linear and quadratic effects), and the herd lactation curve. The additive genetic and permanent environmental effects were modeled by the Wilmink function, a modified Wilmink function (with the second term divided by 100), a function that combined third-order Legendre polynomials with the last term of the Wilmink function, and the Ali and Schaeffer function. The residual variances were modeled by means of 1, 4, 6, or 10 heterogeneous classes, with the exception of the last term of the Wilmink function, for which there were 1, from 0.20 to 0.33. Genetic correlations between adjacent records were high values (0.83-0.99), but they declined when the interval between the test-day records increased, and were negative between the first and last records. The model employing the Ali and Schaeffer function with six residual variance classes was the most suitable for fitting the data. © FUNPEC-RP.
Resumo:
The Brazilian Association of Simmental and Simbrasil Cattle Farmers provided 29,510 records from 10,659 Simmental beef cattle; these were used to estimate (co)variance components and genetic parameters for weights in the growth trajectory, based on multi-trait (MTM) and random regression models (RRM). The (co)variance components and genetic parameters were estimated by restricted maximum likelihood. In the MTM analysis, the likelihood ratio test was used to determine the significance of random effects included in the model and to define the most appropriate model. All random effects were significant and included in the final model. In the RRM analysis, different adjustments of polynomial orders were compared for 5 different criteria to choose the best fit model. An RRM of third order for the direct additive genetic, direct permanent environmental, maternal additive genetic, and maternal permanent environment effects was sufficient to model variance structures in the growth trajectory of the animals. The (co)variance components were generally similar in MTM and RRM. Direct heritabilities of MTM were slightly lower than RRM and varied from 0.04 to 0.42 and 0.16 to 0.45, respectively. Additive direct correlations were mostly positive and of high magnitude, being highest at closest ages. Considering the results and that pre-adjustment of the weights to standard ages is not required, RRM is recommended for genetic evaluation of Simmental beef cattle in Brazil. ©FUNPEC-RP.
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
High pressure NMR spectroscopy has developed into an important tool for studying conformational equilibria of proteins in solution. We have studied the amide proton and nitrogen chemical shifts of the 20 canonical amino acids X in the random-coil model peptide Ac-Gly-Gly-X-Ala-NH2, in a pressure range from 0.1 to 200 MPa, at a proton resonance frequency of 800 MHz. The obtained data allowed the determination of first and second order pressure coefficients with high accuracy at 283 K and pH 6.7. The mean first and second order pressure coefficients <B-1(15N)> and <B-2(15N)> for nitrogen are 2.91 ppm/GPa and -2.32 ppm/GPa(2), respectively. The corresponding values <B-1(1H)> and <B-2(1H)> for the amide protons are 0.52 ppm/GPa and -0.41 ppm/GPa(2). Residual dependent (1)J(1H15N)-coupling constants are shown.
Discriminating Different Classes of Biological Networks by Analyzing the Graphs Spectra Distribution
Resumo:
The brain's structural and functional systems, protein-protein interaction, and gene networks are examples of biological systems that share some features of complex networks, such as highly connected nodes, modularity, and small-world topology. Recent studies indicate that some pathologies present topological network alterations relative to norms seen in the general population. Therefore, methods to discriminate the processes that generate the different classes of networks (e. g., normal and disease) might be crucial for the diagnosis, prognosis, and treatment of the disease. It is known that several topological properties of a network (graph) can be described by the distribution of the spectrum of its adjacency matrix. Moreover, large networks generated by the same random process have the same spectrum distribution, allowing us to use it as a "fingerprint". Based on this relationship, we introduce and propose the entropy of a graph spectrum to measure the "uncertainty" of a random graph and the Kullback-Leibler and Jensen-Shannon divergences between graph spectra to compare networks. We also introduce general methods for model selection and network model parameter estimation, as well as a statistical procedure to test the nullity of divergence between two classes of complex networks. Finally, we demonstrate the usefulness of the proposed methods by applying them to (1) protein-protein interaction networks of different species and (2) on networks derived from children diagnosed with Attention Deficit Hyperactivity Disorder (ADHD) and typically developing children. We conclude that scale-free networks best describe all the protein-protein interactions. Also, we show that our proposed measures succeeded in the identification of topological changes in the network while other commonly used measures (number of edges, clustering coefficient, average path length) failed.
Resumo:
We extend the random permutation model to obtain the best linear unbiased estimator of a finite population mean accounting for auxiliary variables under simple random sampling without replacement (SRS) or stratified SRS. The proposed method provides a systematic design-based justification for well-known results involving common estimators derived under minimal assumptions that do not require specification of a functional relationship between the response and the auxiliary variables.
Resumo:
Monte Carlo simulation was used to evaluate properties of a simple Bayesian MCMC analysis of the random effects model for single group Cormack-Jolly-Seber capture-recapture data. The MCMC method is applied to the model via a logit link, so parameters p, S are on a logit scale, where logit(S) is assumed to have, and is generated from, a normal distribution with mean μ and variance σ2 . Marginal prior distributions on logit(p) and μ were independent normal with mean zero and standard deviation 1.75 for logit(p) and 100 for μ ; hence minimally informative. Marginal prior distribution on σ2 was placed on τ2=1/σ2 as a gamma distribution with α=β=0.001 . The study design has 432 points spread over 5 factors: occasions (t) , new releases per occasion (u), p, μ , and σ . At each design point 100 independent trials were completed (hence 43,200 trials in total), each with sample size n=10,000 from the parameter posterior distribution. At 128 of these design points comparisons are made to previously reported results from a method of moments procedure. We looked at properties of point and interval inference on μ , and σ based on the posterior mean, median, and mode and equal-tailed 95% credibility interval. Bayesian inference did very well for the parameter μ , but under the conditions used here, MCMC inference performance for σ was mixed: poor for sparse data (i.e., only 7 occasions) or σ=0 , but good when there were sufficient data and not small σ .
Resumo:
We present a novel framework for encoding latency analysis of arbitrary multiview video coding prediction structures. This framework avoids the need to consider an specific encoder architecture for encoding latency analysis by assuming an unlimited processing capacity on the multiview encoder. Under this assumption, only the influence of the prediction structure and the processing times have to be considered, and the encoding latency is solved systematically by means of a graph model. The results obtained with this model are valid for a multiview encoder with sufficient processing capacity and serve as a lower bound otherwise. Furthermore, with the objective of low latency encoder design with low penalty on rate-distortion performance, the graph model allows us to identify the prediction relationships that add higher encoding latency to the encoder. Experimental results for JMVM prediction structures illustrate how low latency prediction structures with a low rate-distortion penalty can be derived in a systematic manner using the new model.
Resumo:
Probabilistic modeling is the de�ning characteristic of estimation of distribution algorithms (EDAs) which determines their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. `1-regularization is a type of this technique with the appealing variable selection property which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented, and analyzed from di�erent aspects when used for optimization in a high-dimensional setting, where the population size of EDA has a logarithmic scale with respect to the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve signi�cantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random �eld model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, speci�cally models inspired from multi-dimensional Bayesian network classi�ers. It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objectivevariable and objective-objective relationships. An extensive experimental study shows the e�ectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely �-degree Pareto dominance, is introduced and its properties are analyzed. We show that the ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on `1-regularization for multi-objective feature subset selection in classi�cation, where six di�erent measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small-medium dimensionality, when using two di�erent Bayesian classi�ers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.
Resumo:
We present MBIS (Multivariate Bayesian Image Segmentation tool), a clustering tool based on the mixture of multivariate normal distributions model. MBIS supports multi-channel bias field correction based on a B-spline model. A second methodological novelty is the inclusion of graph-cuts optimization for the stationary anisotropic hidden Markov random field model. Along with MBIS, we release an evaluation framework that contains three different experiments on multi-site data. We first validate the accuracy of segmentation and the estimated bias field for each channel. MBIS outperforms a widely used segmentation tool in a cross-comparison evaluation. The second experiment demonstrates the robustness of results on atlas-free segmentation of two image sets from scan-rescan protocols on 21 healthy subjects. Multivariate segmentation is more replicable than the monospectral counterpart on T1-weighted images. Finally, we provide a third experiment to illustrate how MBIS can be used in a large-scale study of tissue volume change with increasing age in 584 healthy subjects. This last result is meaningful as multivariate segmentation performs robustly without the need for prior knowledge.
Resumo:
Niche apportionment models have only been applied once to parasite communities. Only the random assortment model (RA), which indicates that species abundances are independent from each other and that interspecific competition is unimportant, provided a good fit to 3 out of 6 parasite communities investigated. The generality of this result needs to be validated, however. In this study we apply 5 niche apportionment models to the parasite communities of 14 fish species from the Great Barrier Reef. We determined which model fitted the data when using either numerical abundance or biomass as an estimate of parasite abundance, and whether the fit of niche apportionment models depends on how the parasite community is defined (e.g. ecto, endoparasites or all parasites considered together). The RA model provided a good fit for the whole community of parasites in 7 fish species when using biovolume (as a surrogate of biomass) as a measure of species abundance. The RA model also fitted observed data when ecto- and endoparasites were considered separately, using abundance or biovolume, but less frequently. Variation in fish sizes among species was not associated with the probability of a model fitting the data. Total numerical abundance and biovolume of parasites were not related across host species, suggesting that they capture different aspects of abundance. Biovolume is not only a better measurement to use with niche-orientated models, it should also be the preferred descriptor to analyse parasite community structure in other contexts. Most of the biological assumptions behind the RA model, i.e. randomness in apportioning niche space, lack of interspecific competition, independence of abundance among different species, and species with variable niches in changeable environments, are in accordance with some previous findings on parasite communities. Thus, parasite communities may generally be unsaturated with species, with empty niches, and interspecific interactions may generally be unimportant in determining parasite community structure.
Resumo:
In Statnote 9, we described a one-way analysis of variance (ANOVA) ‘random effects’ model in which the objective was to estimate the degree of variation of a particular measurement and to compare different sources of variation in space and time. The illustrative scenario involved the role of computer keyboards in a University communal computer laboratory as a possible source of microbial contamination of the hands. The study estimated the aerobic colony count of ten selected keyboards with samples taken from two keys per keyboard determined at 9am and 5pm. This type of design is often referred to as a ‘nested’ or ‘hierarchical’ design and the ANOVA estimated the degree of variation: (1) between keyboards, (2) between keys within a keyboard, and (3) between sample times within a key. An alternative to this design is a 'fixed effects' model in which the objective is not to measure sources of variation per se but to estimate differences between specific groups or treatments, which are regarded as 'fixed' or discrete effects. This statnote describes two scenarios utilizing this type of analysis: (1) measuring the degree of bacterial contamination on 2p coins collected from three types of business property, viz., a butcher’s shop, a sandwich shop, and a newsagent and (2) the effectiveness of drugs in the treatment of a fungal eye infection.
Resumo:
This paper provides the most fully comprehensive evidence to date on whether or not monetary aggregates are valuable for forecasting US inflation in the early to mid 2000s. We explore a wide range of different definitions of money, including different methods of aggregation and different collections of included monetary assets. We use non-linear, artificial intelligence techniques, namely, recurrent neural networks, evolution strategies and kernel methods in our forecasting experiment. In the experiment, these three methodologies compete to find the best fitting US inflation forecasting models and are then compared to forecasts from a naive random walk model. The best models were non-linear autoregressive models based on kernel methods. Our findings do not provide much support for the usefulness of monetary aggregates in forecasting inflation. There is evidence in the literature that evolutionary methods can be used to evolve kernels hence our future work should combine the evolutionary and kernel methods to get the benefits of both.