947 results for hierarchical tree-structure


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Genomic sequences are fundamentally text documents, admitting various representations according to need and tokenization. Gene expression depends crucially on binding of enzymes to the DNA sequence at small, poorly conserved binding sites, limiting the utility of standard pattern search. However, one may exploit the regular syntactic structure of the enzyme's component proteins and the corresponding binding sites, framing the problem as one of detecting grammatically correct genomic phrases. In this paper we propose new kernels based on weighted tree structures, traversing the paths within them to capture the features which underpin the task. Experimentally, we find that these kernels provide performance comparable with state-of-the-art approaches for this problem, while offering significant computational advantages over earlier methods. The methods proposed may be applied to a broad range of sequence or tree-structured data in molecular biology and other domains.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Process models define allowed process execution scenarios. The models are usually depicted as directed graphs, with gateway nodes regulating the control flow routing logic and with edges specifying the execution order constraints between tasks. While arbitrarily structured control flow patterns in process models complicate model analysis, they also permit creativity and full expressiveness when capturing non-trivial process scenarios. This paper gives a classification of arbitrarily structured process models based on the hierarchical process model decomposition technique. We identify a structural class of models consisting of block structured patterns which, when combined, define complex execution scenarios spanning across the individual patterns. We show that complex behavior can be localized by examining structural relations of loops in hidden unstructured regions of control flow. The correctness of the behavior of process models within these regions can be validated in linear time. These observations allow us to suggest techniques for transforming hidden unstructured regions into block-structured ones.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A mixed species reforestation program known as the Rainforestation Farming system was undertaken in the Philippines to develop forms of farm forestry more suitable for smallholders than the simple monocultural plantations commonly used then. In this study, we describe the subsequent changes in stand structure and floristic composition of these plantations in order to learn from the experience and develop improved prescriptions for reforestation systems likely to be attractive to smallholders. We investigated stands aged from 6 to 11 years old on three successive occasions over a 6 year period. We found the number of species originally present in the plots as trees >5 cm dbh decreased from an initial total of 76 species to 65 species at the end of study period. But, at the same time, some new species reached the size class threshold and were recruited into the canopy layer. There was a substantial decline in tree density from an estimated stocking of about 5000 trees per ha at the time of planting to 1380 trees per ha at the time of the first measurement; the density declined by a further 4.9% per year. Changes in composition and stand structure were indicated by a marked shift in the Importance Value Index of species. Over six years, shade-intolerant species became less important and the native shade-tolerant species (often Dipterocarps) increased in importance. Based on how the Rainforestation Farming plantations developed in these early years, we suggest that mixed-species plantations elsewhere in the humid tropics should be around 1000 trees per ha or less, that the proportion of fast growing (and hence early maturing) trees should be about 30–40% of this initial density and that any fruit tree component should only be planted on the plantation margin where more light and space are available for crowns to develop.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Effective control of dense, high-quality carbon nanotube arrays using hierarchical multilayer catalyst patterns is demonstrated. Scanning/transmission electron microscopy, atomic force microscopy, Raman spectroscopy, and numerical simulations show that by changing the secondary and tertiary layers one can control the properties of the nanotube arrays. The arrays with the highest surface density of vertically aligned nanotubes are produced using a hierarchical stack of iron nanoparticles and alumina and silica layers differing in thickness by one order of magnitude from one another. The results are explained in terms of the catalyst structure effect on carbon diffusivity.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Spatial data are now prevalent in a wide range of fields including environmental and health science. This has led to the development of a range of approaches for analysing patterns in these data. In this paper, we compare several Bayesian hierarchical models for analysing point-based data based on the discretization of the study region, resulting in grid-based spatial data. The approaches considered include two parametric models and a semiparametric model. We highlight the methodology and computation for each approach. Two simulation studies are undertaken to compare the performance of these models for various structures of simulated point-based data which resemble environmental data. A case study of a real dataset is also conducted to demonstrate a practical application of the modelling approaches. Goodness-of-fit statistics are computed to compare estimates of the intensity functions. The deviance information criterion is also considered as an alternative model evaluation criterion. The results suggest that the adaptive Gaussian Markov random field model performs well for highly sparse point-based data where there are large variations or clustering across the space, whereas the discretized log Gaussian Cox process produces a good fit for dense and clustered point-based data. One should generally consider the nature and structure of the point-based data in order to choose the appropriate method for modelling discretized spatial point-based data.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We present a novel method for improving hierarchical speaker clustering in the tasks of speaker diarization and speaker linking. In hierarchical clustering, a tree can be formed that demonstrates various levels of clustering. We propose a ratio that expresses the impact of each cluster on the formation of this tree and use this to rescale cluster scores. This provides score normalisation based on the impact of each cluster. We use a state-of-the-art speaker diarization and linking system across the SAIVT-BNEWS corpus to show that our proposed impact ratio can provide a relative improvement of 16% in diarization error rate (DER).
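
The clustering tree mentioned in this abstract can be illustrated with a minimal sketch of agglomerative clustering. It does not implement the proposed impact ratio, which the abstract does not define; the function names, the average-linkage distance and the 1-D sample values are all assumptions made for illustration.

```python
from itertools import combinations

def average_linkage(a, b, points):
    """Mean absolute distance between all cross-cluster pairs of 1-D points."""
    return sum(abs(points[i] - points[j]) for i in a for j in b) / (len(a) * len(b))

def agglomerate(points):
    """Repeatedly merge the two closest clusters and record each merge.

    Each entry is (left_members, right_members, distance); read in order,
    the list describes the clustering tree from the leaves up to the root.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(pair[0], pair[1], points))
        merges.append((a, b, average_linkage(a, b, points)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical 1-D values standing in for speaker segments (assumption).
merges = agglomerate([0.0, 0.1, 5.0, 5.2, 9.0])
for left, right, distance in merges:
    print(left, right, round(distance, 2))
```

Cutting the merge list at a chosen distance yields one level of the tree; a score rescaling such as the one the abstract proposes would operate on the clusters recorded here.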

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Recently, attempts to improve decision making in species management have focussed on uncertainties associated with modelling temporal fluctuations in populations. Reducing model uncertainty is challenging; while larger samples improve estimation of species trajectories and reduce statistical errors, they typically amplify variability in observed trajectories. In particular, traditional modelling approaches aimed at estimating population trajectories usually do not account well for nonlinearities and uncertainties associated with multi-scale observations characteristic of large spatio-temporal surveys. We present a Bayesian semi-parametric hierarchical model for simultaneously quantifying uncertainties associated with model structure and parameters, and scale-specific variability over time. We estimate uncertainty across a four-tiered spatial hierarchy of coral cover from the Great Barrier Reef. Coral variability is well described; however, our results show that, in the absence of additional model specifications, conclusions regarding coral trajectories become highly uncertain when considering multiple reefs, suggesting that management should focus more at the scale of individual reefs. The approach presented facilitates the description and estimation of population trajectories and associated uncertainties when variability cannot be attributed to specific causes and origins. We argue that our model can unlock value contained in large-scale datasets, provide guidance for understanding sources of uncertainty, and support better informed decision making.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

There is a concern that high densities of elephants in southern Africa could lead to the overall reduction of other forms of biodiversity. We present a grid-based model of elephant-savanna dynamics, which differs from previous elephant-vegetation models by accounting for woody plant demographics, tree-grass interactions, stochastic environmental variables (fire and rainfall), and spatial contagion of fire and tree recruitment. The model projects changes in height structure and spatial pattern of trees over periods of centuries. The vegetation component of the model produces long-term tree-grass coexistence, and the emergent fire frequencies match those reported for southern African savannas. Including elephants in the savanna model had the expected effect of reducing woody plant cover, mainly via increased adult tree mortality, although at an elephant density of 1.0 elephant/km², woody plants still persisted for over a century. We tested three different scenarios in addition to our default assumptions. (1) Reducing mortality of adult trees after elephant use, mimicking a more browsing-tolerant tree species, mitigated the detrimental effect of elephants on the woody population. (2) Coupling germination success (increased seedling recruitment) to elephant browsing further increased tree persistence, and (3) a faster growing woody component allowed some woody plant persistence for at least a century at a density of 3 elephants/km². Quantitative models of the kind presented here provide a valuable tool for exploring the consequences of management decisions involving the manipulation of elephant population densities. © 2005 by the Ecological Society of America.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Objectives: Demonstrate the application of decision trees – classification and regression trees (CARTs), and their cousins, boosted regression trees (BRTs) – to understand structure in missing data. Setting: Data taken from employees at three different industry sites in Australia. Participants: 7915 observations were included. Materials and Methods: The approach was evaluated using an occupational health dataset comprising results of questionnaires, medical tests, and environmental monitoring. Statistical methods included standard statistical tests and the ‘rpart’ and ‘gbm’ packages for CART and BRT analyses, respectively, from the statistical software ‘R’. A simulation study was conducted to explore the capability of decision tree models in describing data with missingness artificially introduced. Results: CART and BRT models were effective in highlighting a missingness structure in the data, related to the type of data (medical or environmental), the site in which it was collected, the number of visits and the presence of extreme values. The simulation study revealed that CART models were able to identify variables and values responsible for inducing missingness. There was greater variation in variable importance for unstructured compared to structured missingness. Discussion: Both CART and BRT models were effective in describing structural missingness in data. CART models may be preferred over BRT models for exploratory analysis of missing data, and selecting variables important for predicting missingness. BRT models can show how values of other variables influence missingness, which may prove useful for researchers. Conclusion: Researchers are encouraged to use CART and BRT models to explore and understand missing data.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The Korean black scraper, Thamnaconus modestus, is one of the most economically important maricultural fish species in Korea. However, the annual catch of this fish has been continuously declining over the past several decades. In this study, the genetic diversity and relationships among four wild populations and two hatchery stocks of Korean black scraper were assessed based on 16 microsatellite (MS) markers. A total of 319 different alleles were detected over all loci with an average of 19.94 alleles per locus. The hatchery stocks [mean number of alleles (NA) = 12, allelic richness (AR) = 12, expected heterozygosity (He) = 0.834] showed a slight reduction (P > 0.05) in genetic variability in comparison with wild populations (mean NA = 13.86, AR = 12.35, He = 0.844), suggesting a sufficient level of genetic variation in the hatchery populations. Similarly low levels of inbreeding and significant Hardy–Weinberg equilibrium deviations were detected in both wild and hatchery populations. The genetic subdivision among all six populations was low but significant (overall FST = 0.008, P < 0.01). Pairwise FST, a phylogenetic tree, and multidimensional scaling analysis suggested the existence of three geographically structured populations based on different sea basin origins, although the isolation-by-distance model was rejected. This result was corroborated by an analysis of molecular variance. This genetic differentiation may result from the co-effects of various factors, such as historical dispersal, local environment and ocean currents. These three geographical groups can be considered as independent management units. Our results show that MS markers may be suitable not only for the genetic monitoring of hatchery stocks but also for revealing the population structure of Korean black scraper populations. These results will provide critical information for breeding programs, the management of cultured stocks and the conservation of this species.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Several techniques are known for searching an ordered collection of data. The techniques and analyses of retrieval methods based on primary attributes are straightforward. Retrieval using secondary attributes depends on several factors. For secondary attribute retrieval, the linear structures (inverted lists, multilists, doubly linked lists) and the recently proposed nonlinear tree structures, the multiple attribute tree (MAT) and the K-d tree (kdT), have their individual merits. It is shown in this paper that, of the two tree structures, MAT possesses several features of a systematic data structure for external file organisation which make it superior to kdT. Analytic estimates for the complexity of node searches in MAT and kdT, for several types of queries, are developed and compared.
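
As an illustration of secondary-attribute retrieval with one of the two tree structures named above, here is a minimal k-d tree sketch with a partial match query. It is not taken from the paper and does not cover MAT; the record layout `(author_id, year)` and all function names are hypothetical.

```python
from collections import namedtuple

Node = namedtuple("Node", "point left right")

def build_kd(points, depth=0):
    """Build a k-d tree by splitting on the median, cycling through attributes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return Node(points[mid],
                build_kd(points[:mid], depth + 1),
                build_kd(points[mid + 1:], depth + 1))

def partial_match(node, query, depth=0, found=None):
    """Collect records matching `query`; None in a position means "any value"."""
    if found is None:
        found = []
    if node is None:
        return found
    if all(q is None or q == v for q, v in zip(query, node.point)):
        found.append(node.point)
    axis = depth % len(query)
    want = query[axis]
    # An unspecified attribute cannot prune, so both subtrees are visited.
    # Equal keys may sit on either side of the median, hence <= and >=.
    if want is None or want <= node.point[axis]:
        partial_match(node.left, query, depth + 1, found)
    if want is None or want >= node.point[axis]:
        partial_match(node.right, query, depth + 1, found)
    return found

# Hypothetical records: (author_id, year).
records = [(3, 1990), (1, 1985), (2, 1990), (3, 1985), (1, 1990)]
tree = build_kd(records)
print(sorted(partial_match(tree, (None, 1990))))
print(sorted(partial_match(tree, (3, None))))
```

The pruning happens only at levels whose splitting attribute is specified in the query, which is why the cost of a partial match depends on how many attributes are left open.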

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A variety of data structures such as inverted file, multi-lists, quad tree, k-d tree, range tree, polygon tree, quintary tree, multidimensional tries, segment tree, doubly chained tree, the grid file, d-fold tree, super B-tree, Multiple Attribute Tree (MAT), etc. have been studied for multidimensional searching and related problems. Physical database organization, which is an important application of multidimensional searching, is traditionally and mostly handled by employing an inverted file. This study proposes the MAT data structure for bibliographic file systems, by illustrating the superiority of the MAT data structure over the inverted file. Both methods are compared in terms of preprocessing, storage and query costs. Worst-case complexity analysis of both methods, for a partial match query, is carried out in two cases: (a) when the directory resides in main memory, (b) when the directory resides in secondary memory. In both cases, the MAT data structure is shown to be more efficient than the inverted file method. Arguments are given to illustrate the superiority of the MAT data structure in the average case also. An efficient adaptation of the MAT data structure, which exploits the special features of the MAT structure and bibliographic files, is proposed for bibliographic file systems. In this adaptation, suitable techniques for fixing and ranking the attributes for the MAT data structure are proposed. Conclusions and proposals for future research are presented.
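
For readers unfamiliar with the baseline in this comparison, the following sketch shows how an inverted file can answer a partial match query by intersecting posting sets. It is a generic illustration, not the study's implementation; the record fields and function names are hypothetical.

```python
from collections import defaultdict

def build_inverted_file(records):
    """Map each (attribute, value) pair to the set of record ids holding it."""
    index = defaultdict(set)
    for rid, record in enumerate(records):
        for attribute, value in record.items():
            index[(attribute, value)].add(rid)
    return index

def partial_match(index, n_records, **conditions):
    """Return sorted ids of records satisfying every specified attribute."""
    result = set(range(n_records))
    for attribute, value in conditions.items():
        # One posting-list lookup per specified attribute, then intersect.
        result &= index.get((attribute, value), set())
    return sorted(result)

# Hypothetical bibliographic records (assumption, for illustration only).
records = [
    {"author": "A", "year": 1985, "subject": "trees"},
    {"author": "B", "year": 1990, "subject": "trees"},
    {"author": "A", "year": 1990, "subject": "graphs"},
    {"author": "A", "year": 1990, "subject": "trees"},
]
index = build_inverted_file(records)
print(partial_match(index, len(records), author="A", year=1990))
```

Each specified attribute costs one directory lookup plus a set intersection, which is the kind of query cost the abstract compares against MAT.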

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Hierarchical Bayesian models can assimilate surveillance and ecological information to estimate both invasion extent and model parameters for invading plant pests spread by people. A reliability analysis framework that can accommodate multiple dispersal modes is developed to estimate human-mediated dispersal parameters for an invasive species. Uncertainty in the observation process is modelled by accounting for local natural spread and population growth within spatial units. Broad scale incursion dynamics are based on a mechanistic gravity model with a Weibull distribution modification to incorporate a local pest build-up phase. The model uses Markov chain Monte Carlo simulations to infer the probability of colonisation times for discrete spatial units and to estimate connectivity parameters between these units. The hierarchical Bayesian model with observational and ecological components is applied to a surveillance dataset for a spiralling whitefly (Aleurodicus dispersus) invasion in Queensland, Australia. The model structure provides a useful application that draws on surveillance data and ecological knowledge that can be used to manage the risk of pest movement.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This research is a step forward in discovering knowledge from databases with complex structures such as trees or graphs. Several data mining algorithms are developed, based on a novel representation called Balanced Optimal Search, for extracting implicit, unknown and potentially useful information such as patterns, similarities and various relationships from tree data; the algorithms are also shown to be advantageous in analysing big data. This thesis focuses on analysing unordered tree data, which is robust to data inconsistency, irregularity and swift information changes and has therefore become a popular and widely used data model in the era of big data.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The aim of this study was to explore soil microbial activities related to C and N cycling and the occurrence and concentrations of two important groups of plant secondary compounds, terpenes and phenolic compounds, under silver birch (Betula pendula Roth), Norway spruce (Picea abies (L.) Karst) and Scots pine (Pinus sylvestris L.) as well as to study the effects of volatile monoterpenes and tannins on soil microbial activities. The study site, located in Kivalo, northern Finland, included ca. 70-year-old adjacent stands dominated by silver birch, Norway spruce and Scots pine. Originally the soil was very probably similar in all three stands. All forest floor layers (litter (L), fermentation layer (F) and humified layer (H)) under birch and spruce showed higher rates of CO2 production, greater net mineralisation of nitrogen and higher amounts of carbon and nitrogen in microbial biomass than did the forest floor layers under pine. Concentrations of mono-, sesqui-, di- and triterpenes were higher under both conifers than under birch, while the concentration of total water-soluble phenolic compounds as well as the concentration of condensed tannins tended to be higher or at least as high under spruce as under birch or pine. In general, differences between tree species in soil microbial activities and in concentrations of secondary compounds were smaller in the H layer than in the upper layers. The rate of CO2 production and the amount of carbon in the microbial biomass correlated highly positively with the concentration of total water-soluble phenolic compounds and positively with the concentration of condensed tannins. Exposure of soil to volatile monoterpenes and tannins extracted and fractionated from spruce and pine needles affected carbon and nitrogen transformations in soil, but the effects were dependent on the compound and its molecular structure. 
Monoterpenes decreased net mineralisation of nitrogen and probably had a toxic effect on part of the microbial population in soil, while another part of the microbes seemed to be able to use monoterpenes as a carbon source. With tannins, low-molecular-weight compounds (also compounds other than tannins) increased soil CO2 production and nitrogen immobilisation by soil microbes while the higher-molecular-weight condensed tannins had inhibitory effects. In conclusion, plant secondary compounds may have a great potential in regulation of C and N transformations in forest soils, but the real magnitude of their significance in soil processes is impossible to estimate.