987 results for Model trees


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Background: The vast sequence divergence among different virus groups has presented a great challenge to alignment-based analysis of virus phylogeny. Because of the uncertainty in alignment, existing tools for phylogenetic analysis based on multiple alignment cannot be directly applied to whole-genome comparison and phylogenomic studies of viruses. There has been a growing interest in alignment-free methods for phylogenetic analysis using complete genome data. Among the alignment-free methods, a dynamical language (DL) method proposed by our group has been successfully applied to the phylogenetic analysis of bacterial and chloroplast genomes. Results: In this paper, the DL method is used to analyze the whole-proteome phylogeny of 124 large dsDNA viruses and 30 parvoviruses, two data sets with a large difference in genome size. The trees from our analyses are in good agreement with the latest classification of large dsDNA viruses and parvoviruses by the International Committee on Taxonomy of Viruses (ICTV). Conclusions: The present method provides a new way of recovering the phylogeny of large dsDNA viruses and parvoviruses, and also some insights into the affiliation of a number of unclassified viruses. In comparison, some alignment-free methods such as the CV Tree method can be used for recovering the phylogeny of large dsDNA viruses, but they are not suitable for resolving the phylogeny of parvoviruses, which have a much smaller genome size.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Real-world business process models may consist of hundreds of elements and have sophisticated structure. Although there are tasks where such models are valuable and appreciated, in general complexity has a negative influence on model comprehension and analysis. Thus, means for managing the complexity of process models are needed. One approach is abstraction of business process models: the creation of a process model which preserves the main features of the initial elaborate process model, but leaves out insignificant details. In this paper we study the structural aspects of process model abstraction and introduce an abstraction approach based on process structure trees (PST). The developed approach assures that the abstracted process model preserves the ordering constraints of the initial model. It surpasses pattern-based process model abstraction approaches because it can handle graph-structured process models of arbitrary structure. We also provide an evaluation of the proposed approach.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Motivated by the analysis of the Australian Grain Insect Resistance Database (AGIRD), we develop a Bayesian hurdle modelling approach to assess trends in strong resistance of stored grain insects to phosphine over time. The binary response variable from AGIRD indicating presence or absence of strong resistance is characterized by a majority of absence observations, and the hurdle model is a two-step approach that is useful when analyzing such a binary response dataset. The proposed hurdle model first uses Bayesian classification trees to identify covariates and covariate levels pertaining to possible presence or absence of strong resistance. Second, generalized additive models (GAMs) with spike and slab priors for variable selection are fitted to the subset of the dataset identified from the Bayesian classification tree as indicating possible presence of strong resistance. From the GAM we assess trends, biosecurity issues and site-specific variables influencing the presence of strong resistance using a variable selection approach. The proposed Bayesian hurdle model is compared to its frequentist counterpart, and also to a naive Bayesian approach which fits a GAM to the entire dataset. The Bayesian hurdle model has the benefit of providing a set of good trees for use in the first step and appears to provide enough flexibility to represent the influence of variables on strong resistance compared to the frequentist model, but also captures the subtle changes in the trend that are missed by the frequentist and naive Bayesian models.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

There is a concern that high densities of elephants in southern Africa could lead to the overall reduction of other forms of biodiversity. We present a grid-based model of elephant-savanna dynamics, which differs from previous elephant-vegetation models by accounting for woody plant demographics, tree-grass interactions, stochastic environmental variables (fire and rainfall), and spatial contagion of fire and tree recruitment. The model projects changes in height structure and spatial pattern of trees over periods of centuries. The vegetation component of the model produces long-term tree-grass coexistence, and the emergent fire frequencies match those reported for southern African savannas. Including elephants in the savanna model had the expected effect of reducing woody plant cover, mainly via increased adult tree mortality, although at an elephant density of 1.0 elephant/km², woody plants still persisted for over a century. We tested three different scenarios in addition to our default assumptions. (1) Reducing mortality of adult trees after elephant use, mimicking a more browsing-tolerant tree species, mitigated the detrimental effect of elephants on the woody population. (2) Coupling germination success (increased seedling recruitment) to elephant browsing further increased tree persistence, and (3) a faster growing woody component allowed some woody plant persistence for at least a century at a density of 3 elephants/km². Quantitative models of the kind presented here provide a valuable tool for exploring the consequences of management decisions involving the manipulation of elephant population densities. © 2005 by the Ecological Society of America.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Despite longstanding concern with the dimensionality of the service quality construct as measured by ServQual and IS-ServQual instruments, variations on the IS-ServQual instrument have been enduringly prominent in both academic research and practice in the field of IS. We explain the continuing popularity of the instrument based on the salience of the item set for predicting overall customer satisfaction, suggesting that the preoccupation with the dimensions has been a distraction. The implicit mutual exclusivity of the items suggests a more appropriate conceptualization of IS-ServQual as a formative index. This conceptualization resolves the paradox in IS-ServQual research, that of how an instrument with such well-known and well-documented weaknesses continues to be very influential and widely used by academics and practitioners. A formative conceptualization acknowledges and addresses the criticisms of IS-ServQual, while simultaneously explaining its enduring salience by focusing on the items rather than the “dimensions.” By employing an opportunistic sample and adopting the most recent IS-ServQual instrument published in a leading IS journal (virtually any valid IS-ServQual sample in combination with a previously tested instrument variant would suffice for study purposes), we demonstrate that when re-specified as both first-order and second-order formatives, IS-ServQual has good model quality metrics and high predictive power on customer satisfaction. We conclude that this formative specification has higher practical use and is more defensible theoretically.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

(1) Species-accumulation curves for woody plants were calculated in three tropical forests, based on fully mapped 50-ha plots in wet, old-growth forest in Peninsular Malaysia, in moist, old-growth forest in central Panama, and in dry, previously logged forest in southern India. A total of 610 000 stems were identified to species and mapped to < 1 m accuracy. Mean species number and stem number were calculated in quadrats as small as 5 m × 5 m to as large as 1000 m × 500 m, for a variety of stem sizes above 10 mm in diameter. Species-area curves were generated by plotting species number as a function of quadrat size; species-individual curves were generated from the same data, but using stem number as the independent variable rather than area. (2) Species-area curves had different forms for stems of different diameters, but species-individual curves were nearly independent of diameter class. With < 10⁴ stems, species-individual curves were concave downward on log-log plots, with curves from different forests diverging, but beyond about 10⁴ stems, the log-log curves became nearly linear, with all three sites having a similar slope. This indicates an asymptotic difference in richness between forests: the Malaysian site had 2.7 times as many species as Panama, which in turn was 3.3 times as rich as India. (3) Other details of the species-accumulation relationship were remarkably similar between the three sites. Rectangular quadrats had 5-27% more species than square quadrats of the same area, with longer and narrower quadrats increasingly diverse. Random samples of stems drawn from the entire 50 ha had 10-30% more species than square quadrats with the same number of stems. At both Pasoh and BCI, but not Mudumalai, species richness was slightly higher among intermediate-sized stems (50-100 mm in diameter) than in either smaller or larger sizes. These patterns reflect aggregated distributions of individual species, plus weak density-dependent forces that tend to smooth the species abundance distribution and 'loosen' aggregations as stems grow. (4) The results provide support for the view that within each tree community, many species have their abundance and distribution guided more by random drift than deterministic interactions. The drift model predicts that the species-accumulation curve will have a declining slope on a log-log plot, reaching a slope of 0.1 in about 50 ha. No other model of community structure can make such a precise prediction. (5) The results demonstrate that diversity studies based on different stem diameters can be compared by sampling identical numbers of stems. Moreover, they indicate that stem counts < 1000 in tropical forests will underestimate the percentage difference in species richness between two diverse sites. Fortunately, standard diversity indices (Fisher's α, Shannon-Wiener) captured diversity differences in small stem samples more effectively than raw species richness, but both were sample size dependent. Two nonparametric richness estimators (Chao, jackknife) performed poorly, greatly underestimating true species richness.
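
The species-individual relationship for random samples of stems can be illustrated with the standard hypergeometric rarefaction formula. The abstract does not say how its random-sample curves were computed, so the following is only a minimal sketch under that assumption; the function name `expected_species` and the toy abundances are hypothetical.

```python
from math import comb


def expected_species(abundances, n):
    """Expected number of species in a random sample of n stems.

    abundances: stem count per species in the whole plot (illustrative input).
    Stems are drawn without replacement, so a species with N_i stems is
    missed exactly when all n stems come from the other (N - N_i) stems.
    """
    total = sum(abundances)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total stem count")
    all_samples = comb(total, n)
    return sum(1 - comb(total - n_i, n) / all_samples for n_i in abundances)


if __name__ == "__main__":
    # Toy community: one common species and several rare ones.
    toy_plot = [500, 120, 40, 10, 5, 2, 1, 1]
    for sample_size in (10, 100, 500):
        print(sample_size, round(expected_species(toy_plot, sample_size), 2))
```

Plotting the returned value against `n` on log-log axes gives a species-individual curve of the kind the abstract describes, though only for random samples, not for spatially contiguous quadrats.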

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The parasitic weed Orobanche crenata inflicts major damage on faba bean, lentil, pea and other crops in Mediterranean environments. The development of methods to control O. crenata is to a large extent hampered by the complexity of host-parasite systems. Using a model of host-parasite interactions can help to explain and understand this intricacy. This paper reports on the evaluation and application of a model simulating host-parasite competition as affected by environment and management that was implemented in the framework of the Agricultural Production Systems Simulator (APSIM). Model-predicted faba bean and O. crenata growth and development were evaluated against independent data. The APSIM-Fababean and -Parasite modules displayed a good capability to reproduce effects of pedoclimatic conditions, faba bean sowing date and O. crenata infestation on host-parasite competition. The r² values throughout exceeded 0.84 (RMSD: 5.36 days) for phenological, 0.85 (RMSD: 223.00 g m⁻²) for host growth and 0.78 (RMSD: 99.82 g m⁻²) for parasite growth parameters. Inaccuracies of simulated faba bean root growth that caused some bias of predicted parasite number and host yield loss may be dealt with by more flexibly simulating vertical root distribution. The model was applied in simulation experiments to determine optimum sowing windows for infected and non-infected faba bean in Mediterranean environments. Simulation results proved realistic and testified to the capability of APSIM to contribute to the development of tactical approaches in parasitic weed control.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Species distribution modelling (SDM) typically analyses species’ presence together with some form of absence information. Ideally absences comprise observations or are inferred from comprehensive sampling. When such information is not available, then pseudo-absences are often generated from the background locations within the study region of interest containing the presences, or else absence is implied through the comparison of presences to the whole study region, e.g. as is the case in Maximum Entropy (MaxEnt) or Poisson point process modelling. However, the choice of which absence information to include can be both challenging and highly influential on SDM predictions (e.g. Oksanen and Minchin, 2002). In practice, the use of pseudo- or implied absences often leads to an imbalance where absences far outnumber presences. This leaves analysis highly susceptible to ‘naughty-noughts’: absences that occur beyond the envelope of the species, which can exert strong influence on the model and its predictions (Austin and Meyers, 1996). Also known as ‘excess zeros’, naughty noughts can be estimated via an overall proportion in simple hurdle or mixture models (Martin et al., 2005). However, absences, especially those that occur beyond the species envelope, can often be more diverse than presences. Here we consider an extension to excess zero models. The two-staged approach first exploits the compartmentalisation provided by classification trees (CTs) (as in O’Leary, 2008) to identify multiple sources of naughty noughts and simultaneously delineate several species envelopes. Then SDMs can be fit separately within each envelope, and for this stage, we examine both CTs (as in Falk et al., 2014) and the popular MaxEnt (Elith et al., 2006). We introduce a wider range of model performance measures to improve treatment of naughty noughts in SDM. 
We retain an overall measure of model performance, the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, but focus on its constituent measures of false negative rate (FNR) and false positive rate (FPR), and how these relate to the threshold in the predicted probability of presence that delimits predicted presence from absence. We also propose error rates more relevant to users of predictions: false omission rate (FOR), the chance that a predicted absence corresponds to (and hence wastes) an observed presence, and the false discovery rate (FDR), reflecting those predicted (or potential) presences that correspond to absence. A high FDR may be desirable since it could help target future search efforts, whereas zero or low FOR is desirable since it indicates none of the (often valuable) presences have been ignored in the SDM. For illustration, we chose Bradypus variegatus, a species that has previously been published as an exemplar species for MaxEnt, proposed by Phillips et al. (2006). We used CTs to increasingly refine the species envelope, starting with the whole study region (E0), eliminating more and more potential naughty noughts (E1–E3). When combined with an SDM fit within the species envelope, the best CT SDM had similar AUC and FPR to the best MaxEnt SDM, but otherwise performed better. The FNR and FOR were greatly reduced, suggesting that CTs handle absences better. Interestingly, MaxEnt predictions showed low discriminatory performance, with the most common predicted probability of presence being in the same range (0.00-0.20) for both true absences and presences. In summary, this example shows that SDMs can be improved by introducing an initial hurdle to identify naughty noughts and partition the envelope before applying SDMs. This improvement was barely detectable via AUC and FPR yet visible in FOR, FNR, and the comparison of the distributions of predicted probability of presence for presences and absences.
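
The four error rates named in this abstract can be written out from a confusion matrix at a chosen threshold. The sketch below uses the standard definitions of FNR, FPR, FOR and FDR; the function names, the `>=` thresholding rule and the toy data are assumptions for illustration, not taken from the paper.

```python
def confusion_counts(observed, predicted_prob, threshold):
    """Count TP, FP, TN, FN when presence is predicted for prob >= threshold.

    observed: 1 for an observed presence, 0 for an (implied) absence.
    """
    tp = fp = tn = fn = 0
    for obs, prob in zip(observed, predicted_prob):
        predicted_presence = prob >= threshold
        if predicted_presence and obs:
            tp += 1
        elif predicted_presence and not obs:
            fp += 1
        elif not predicted_presence and obs:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def error_rates(tp, fp, tn, fn):
    """Return FNR, FPR, FOR and FDR; a rate with a zero denominator is NaN."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "FNR": ratio(fn, fn + tp),  # observed presences predicted absent
        "FPR": ratio(fp, fp + tn),  # observed absences predicted present
        "FOR": ratio(fn, fn + tn),  # predicted absences that were presences
        "FDR": ratio(fp, fp + tp),  # predicted presences that were absences
    }


if __name__ == "__main__":
    observed = [1, 1, 1, 0, 0, 0, 0, 0]
    probs = [0.9, 0.6, 0.2, 0.7, 0.1, 0.3, 0.4, 0.05]
    print(error_rates(*confusion_counts(observed, probs, threshold=0.5)))
```

FNR and FPR are conditioned on the observed class, whereas FOR and FDR are conditioned on the predicted class, which is why the latter pair speaks more directly to someone acting on the predictions.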

Relevance:

30.00% 30.00%

Publisher:

Abstract:

When exposed to hot (22-35 °C) and dry climatic conditions in the field during the final 4-6 weeks of pod filling, peanuts (Arachis hypogaea L.) can accumulate highly carcinogenic and immuno-suppressing aflatoxins. Forecasting of the risk posed by these conditions can assist in minimizing pre-harvest contamination. A model was therefore developed as part of the Agricultural Production Systems Simulator (APSIM) peanut module, which calculated an aflatoxin risk index (ARI) using four temperature response functions when fractional available soil water was <0.20 and the crop was in the last 0.40 of the pod-filling phase. ARI explained 0.95 (P ≤ 0.05) of the variation in aflatoxin contamination, which varied from 0 to c. 800 µg/kg in 17 large-scale sowings in tropical and four sowings in sub-tropical environments carried out in Australia between 13 November and 16 December 2007. ARI also explained 0.96 (P ≤ 0.01) of the variation in the proportion of aflatoxin-contaminated loads (>15 µg/kg) of peanuts in the Kingaroy region of Australia during the period between the 1998/99 and 2007/08 seasons. Simulation of ARI using historical climatic data from 1890 to 2007 indicated a three-fold increase in its value since 1980 compared to the entire previous period. The increase was associated with increases in ambient temperature and decreases in rainfall. To facilitate routine monitoring of aflatoxin risk by growers in near real time, a web interface of the model was also developed. The ARI predicted using this interface for eight growers correlated significantly with the level of contamination in crops (r = 0.95, P ≤ 0.01). These results suggest that ARI simulated by the model is a reliable indicator of aflatoxin contamination that can be used in aflatoxin research as well as a decision-support tool to monitor pre-harvest aflatoxin risk in peanuts.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Motivated by the analysis of the Australian Grain Insect Resistance Database (AGIRD), we develop a Bayesian hurdle modelling approach to assess trends in strong resistance of stored grain insects to phosphine over time. The binary response variable from AGIRD indicating presence or absence of strong resistance is characterized by a majority of absence observations, and the hurdle model is a two-step approach that is useful when analyzing such a binary response dataset. The proposed hurdle model first uses Bayesian classification trees to identify covariates and covariate levels pertaining to possible presence or absence of strong resistance. Second, generalized additive models (GAMs) with spike and slab priors for variable selection are fitted to the subset of the dataset identified from the Bayesian classification tree as indicating possible presence of strong resistance. From the GAM we assess trends, biosecurity issues and site-specific variables influencing the presence of strong resistance using a variable selection approach. The proposed Bayesian hurdle model is compared to its frequentist counterpart, and also to a naive Bayesian approach which fits a GAM to the entire dataset. The Bayesian hurdle model has the benefit of providing a set of good trees for use in the first step and appears to provide enough flexibility to represent the influence of variables on strong resistance compared to the frequentist model, but also captures the subtle changes in the trend that are missed by the frequentist and naive Bayesian models. © 2014 Springer Science+Business Media New York.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

BACKGROUND: In order to rapidly and efficiently screen potential biofuel feedstock candidates for quintessential traits, robust high-throughput analytical techniques must be developed and honed. The traditional methods of measuring lignin syringyl/guaiacyl (S/G) ratio can be laborious, involve hazardous reagents, and/or be destructive. Vibrational spectroscopy can furnish high-throughput instrumentation without the limitations of the traditional techniques. Spectral data from mid-infrared, near-infrared, and Raman spectroscopies was combined with S/G ratios, obtained using pyrolysis molecular beam mass spectrometry, from 245 different eucalypt and Acacia trees across 17 species. Iterations of spectral processing allowed the assembly of robust predictive models using partial least squares (PLS). RESULTS: The PLS models were rigorously evaluated using three different randomly generated calibration and validation sets for each spectral processing approach. Root mean standard errors of prediction for validation sets were lowest for models comprised of Raman (0.13 to 0.16) and mid-infrared (0.13 to 0.15) spectral data, while near-infrared spectroscopy led to more erroneous predictions (0.18 to 0.21). Correlation coefficients (r) for the validation sets followed a similar pattern: Raman (0.89 to 0.91), mid-infrared (0.87 to 0.91), and near-infrared (0.79 to 0.82). These statistics signify that Raman and mid-infrared spectroscopy led to the most accurate predictions of S/G ratio in a diverse consortium of feedstocks. CONCLUSION: Eucalypts present an attractive option for biofuel and biochemical production. Given the assortment of over 900 different species of Eucalyptus and Corymbia, in addition to various species of Acacia, it is necessary to isolate those possessing ideal biofuel traits. 
This research has demonstrated the validity of vibrational spectroscopy to efficiently partition different potential biofuel feedstocks according to lignin S/G ratio, significantly reducing experiment and analysis time and expense while providing non-destructive, accurate, global, predictive models encompassing a diverse array of feedstocks.
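
The two validation statistics quoted in this abstract can be sketched for a single validation set as follows. This assumes that the "root mean standard error of prediction" is the usual root-mean-square error of prediction and that r is Pearson's correlation between reference and predicted S/G ratios; the abstract does not give the formulas, and the numbers below are made up for illustration.

```python
from math import sqrt


def rmsep(reference, predicted):
    """Root-mean-square error of prediction over a validation set."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    return sqrt(squared / len(reference))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical reference S/G ratios and model predictions.
    reference_sg = [2.0, 2.5, 3.0, 3.5]
    predicted_sg = [2.1, 2.4, 3.2, 3.4]
    print(rmsep(reference_sg, predicted_sg), pearson_r(reference_sg, predicted_sg))
```

A lower RMSEP and a higher r on held-out samples both point to better predictions, which is the sense in which the abstract ranks Raman and mid-infrared models above near-infrared ones.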

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Birds represent the most diverse extant tetrapod clade, with ca. 10,000 extant species, and the timing of the crown avian radiation remains hotly debated. The fossil record supports a primarily Cenozoic radiation of crown birds, whereas molecular divergence dating analyses generally imply that this radiation was well underway during the Cretaceous. Furthermore, substantial differences have been noted between published divergence estimates. These have been variously attributed to clock model, calibration regime, and gene type. One underappreciated phenomenon is that disparity between fossil ages and molecular dates tends to be proportionally greater for shallower nodes in the avian Tree of Life. Here, we explore potential drivers of disparity in avian divergence dates through a set of analyses applying various calibration strategies and coding methods to a mitochondrial genome dataset and an 18-gene nuclear dataset, both sampled across 72 taxa. Our analyses support the occurrence of two deep divergences (i.e., the Palaeognathae/Neognathae split and the Galloanserae/Neoaves split) well within the Cretaceous, followed by a rapid radiation of Neoaves near the K-Pg boundary. However, 95% highest posterior density intervals for most basal divergences in Neoaves cross the boundary, and we emphasize that, barring unreasonably strict prior distributions, distinguishing between a rapid Early Paleocene radiation and a Late Cretaceous radiation may be beyond the resolving power of currently favored divergence dating methods. In contrast to recent observations for placental mammals, constraining all divergences within Neoaves to occur in the Cenozoic does not result in unreasonably high inferred substitution rates. 
Comparisons of nuclear DNA (nDNA) versus mitochondrial DNA (mtDNA) datasets and NT- versus RY-coded mitochondrial data reveal patterns of disparity that are consistent with substitution model misspecifications that result in tree compression/tree extension artifacts, which may explain some discordance between previous divergence estimates based on different sequence types. Comparisons of fully calibrated and nominally calibrated trees support a correlation between body mass and apparent dating error. Overall, our results are consistent with (but do not require) a Paleogene radiation for most major clades of crown birds.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The aim of this study was to evaluate and test methods which could improve local estimates of a general model fitted to a large area. In the first three studies, the intention was to divide the study area into sub-areas that were as homogeneous as possible according to the residuals of the general model, and in the fourth study, the localization was based on the local neighbourhood. According to spatial autocorrelation (SA), points closer together in space are more likely to be similar than those that are farther apart. Local indicators of SA (LISAs) test the similarity of data clusters. A LISA was calculated for every observation in the dataset, and together with the spatial position and residual of the global model, the data were segmented using two different methods: classification and regression trees (CART) and the multiresolution segmentation algorithm (MS) of the eCognition software. The general model was then re-fitted (localized) to the formed sub-areas. In kriging, the SA is modelled with a variogram, and the spatial correlation is a function of the distance (and direction) between the observation and the point of calculation. A general trend is corrected with the residual information of the neighbourhood, whose size is controlled by the number of the nearest neighbours. Nearness is measured as Euclidean distance. With all methods, the root mean square errors (RMSEs) were lower, but with the methods that segmented the study area, the deviance in single localized RMSEs was wide. Therefore, an element capable of controlling the division or localization should be included in the segmentation-localization process. Kriging, on the other hand, provided stable estimates when the number of neighbours was sufficient (over 30), thus offering the best potential for further studies. Even CART could be combined with kriging or non-parametric methods, such as most similar neighbours (MSN).
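
As an illustration of a LISA computed for every observation, the sketch below evaluates local Moran's I on model residuals. The abstract does not say which LISA or which spatial weights were used, so local Moran's I with row-standardised k-nearest-neighbour weights (Euclidean distance) is an assumption here, and the function name and toy data are hypothetical.

```python
from math import dist


def local_morans_i(points, residuals, k=2):
    """Local Moran's I for each observation.

    points: (x, y) coordinates; residuals: residuals of the global model.
    Weights are 1/k for each of the k nearest neighbours and 0 otherwise.
    Positive values flag an observation surrounded by similar residuals.
    """
    n = len(residuals)
    if len(points) != n or not 0 < k < n:
        raise ValueError("need matching inputs and 0 < k < number of points")
    mean = sum(residuals) / n
    z = [r - mean for r in residuals]
    m2 = sum(d * d for d in z) / n  # variance of the residuals
    if m2 == 0:
        raise ValueError("residuals are constant; the statistic is undefined")
    result = []
    for i, p in enumerate(points):
        others = (j for j in range(n) if j != i)
        nearest = sorted(others, key=lambda j: dist(p, points[j]))[:k]
        spatial_lag = sum(z[j] for j in nearest) / k
        result.append(z[i] * spatial_lag / m2)
    return result


if __name__ == "__main__":
    # Two spatial clusters with opposite residual signs.
    xy = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
    res = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
    print(local_morans_i(xy, res, k=2))
```

Values like these, together with position and residual, are the kind of per-observation inputs that a segmentation step such as CART could then split on.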

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In order to reduce the motion artifacts in DSA, non-rigid image registration is commonly used before subtracting the mask from the contrast image. Since DSA registration requires a set of spatially non-uniform control points, a conventional MRF model is not very efficient. In this paper, we introduce the concept of pivotal and non-pivotal control points to address this, and propose a non-uniform MRF for DSA registration. We use quad-trees in a novel way to generate the non-uniform grid of control points. Our MRF formulation produces a smooth displacement field and therefore results in better artifact reduction than that of registering the control points independently. We achieve improved computational performance using pivotal control points without compromising on the artifact reduction. We have tested our approach using several clinical data sets, and have presented the results of quantitative analysis, clinical assessment and performance improvement on a GPU. © 2013 Elsevier Ltd. All rights reserved.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A gradient in the density of hyperpolarization-activated cyclic-nucleotide gated (HCN) channels is necessary for the emergence of several functional maps within hippocampal pyramidal neurons. Here, we systematically analyzed the impact of dendritic atrophy on nine such functional maps, related to input resistance and local/transfer impedance properties, using conductance-based models of hippocampal pyramidal neurons. We introduced progressive dendritic atrophy in a CA1 pyramidal neuron reconstruction through a pruning algorithm, measured all functional maps in each pruned reconstruction, and arrived at functional forms for the dependence of underlying measurements on dendritic length. We found that, across frequencies, atrophied neurons responded with higher efficiency to incoming inputs, and the transfer of signals across the dendritic tree was more effective in an atrophied reconstruction. Importantly, despite the presence of identical HCN-channel density gradients, spatial gradients in input resistance, local/transfer resonance frequencies and impedance profiles were significantly constricted in reconstructions with dendrite atrophy, where these physiological measurements across dendritic locations converged to similar values. These results revealed that, in atrophied dendritic structures, the presence of an ion channel density gradient alone was insufficient to sustain homologous functional maps along the same neuronal topograph. We assessed the biophysical basis for these conclusions and found that this atrophy-induced constriction of functional maps was mediated by an enhanced spatial spread of the influence of an HCN-channel cluster in atrophied trees. These results demonstrated that the influence fields of ion channel conductances need to be localized for channel gradients to express themselves as homologous functional maps, suggesting that ion channel gradients are necessary but not sufficient for the emergence of functional maps within single neurons.