997 results for Random Rooted Labeled Trees


Relevance: 100.00%

Abstract:

The maximum M of a critical Bienaymé-Galton-Watson process conditioned on the total progeny N is studied. The process is imbedded in a random walk. A limit theorem for the distribution of M as N → ∞ is proved. The result is transferred to non-critical processes. A corollary for the maximal strata of a random rooted labeled tree is obtained.
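The corollary mentioned in this abstract can be explored numerically. The following is a minimal sketch, not the paper's method: it assumes that a "stratum" means the set of vertices at a fixed distance from the root, and it samples a uniform random rooted labeled tree by decoding a random Prüfer sequence and picking a uniform root. The function names are hypothetical.

```python
import heapq
import random
from collections import deque


def prufer_to_edges(seq, n):
    """Decode a Pruefer sequence (length n-2, values in 0..n-1) into tree edges."""
    degree = [1] * n
    for x in seq:
        degree[x] += 1
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for x in seq:
        leaf = heapq.heappop(leaves)  # smallest remaining leaf
        edges.append((leaf, x))
        degree[x] -= 1
        if degree[x] == 1:
            heapq.heappush(leaves, x)
    # Exactly two leaves remain; they form the last edge.
    u = heapq.heappop(leaves)
    v = heapq.heappop(leaves)
    edges.append((u, v))
    return edges


def random_rooted_labeled_tree(n, rng):
    """Return (root, edges) for a uniform random rooted labeled tree on 0..n-1."""
    if n == 1:
        return 0, []
    seq = [rng.randrange(n) for _ in range(n - 2)]
    return rng.randrange(n), prufer_to_edges(seq, n)


def strata(root, edges, n):
    """Return the list of stratum sizes: vertex counts at distance 0, 1, 2, ... from the root."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    depth = [-1] * n
    depth[root] = 0
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if depth[v] < 0:
                depth[v] = depth[u] + 1
                queue.append(v)
    sizes = [0] * (max(depth) + 1)
    for d in depth:
        sizes[d] += 1
    return sizes


if __name__ == "__main__":
    rng = random.Random(2024)
    root, edges = random_rooted_labeled_tree(1000, rng)
    print("maximal stratum size:", max(strata(root, edges, 1000)))
```

Repeating the sampling for growing n gives an empirical picture of how the maximal stratum scales, which can then be compared against the limit theorem.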

Relevance: 100.00%

Abstract:

"UIUCDCS-R-74-679"

Relevance: 100.00%

Abstract:

A hyperplane arrangement is a finite set of hyperplanes in a real affine space. An especially important arrangement is the braid arrangement, which is the set of all hyperplanes $x_i - x_j = 1$, 1 […] trees. For instance, the number of labeled interval orders that can be obtained from $n$ intervals $I_1, \dots, I_n$ of generic lengths is counted. There is also discussed an arrangement due to N. Linial whose number of regions is the number of alternating (or intransitive) trees, as defined by Gelfand, Graev, and Postnikov [Gelfand, I. M., Graev, M. I., and Postnikov, A. (1995), preprint]. Finally, a refinement is given, related to counting labeled trees by number of inversions, of a result of Shi [Shi, J.-Y. (1986), Lecture Notes in Mathematics, no. 1179, Springer-Verlag] that a certain deformation of the braid arrangement has $(n + 1)^{n-1}$ regions.

Relevance: 100.00%

Abstract:

Extracting frequent subtrees from tree-structured data has important applications in Web mining. In this paper, we introduce a novel canonical form for rooted labelled unordered trees called the balanced-optimal-search canonical form (BOCF) that can handle the isomorphism problem efficiently. Using BOCF, we define a tree structure guided scheme based enumeration approach that systematically enumerates only the valid subtrees. Finally, we present the balanced optimal search tree miner (BOSTER) algorithm, based on BOCF and the proposed enumeration approach, for finding frequent induced subtrees in a database of labelled rooted unordered trees. Experiments on real datasets compare the efficiency of BOSTER with that of two state-of-the-art algorithms for mining induced unordered subtrees, HybridTreeMiner and UNI3. The results are encouraging.
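To illustrate why a canonical form helps with the isomorphism problem for rooted labelled unordered trees, here is a minimal sketch. It is not BOCF, whose definition is not given in the abstract; it uses the textbook idea of recursively sorting the encodings of the children, so that two trees that differ only in sibling order receive the same encoding. The tree representation `(label, [children])` and the function names are assumptions made for this example.

```python
def canonical_form(tree):
    """Return an order-independent encoding of a rooted labelled tree.

    A tree is a pair (label, children), where children is a list of trees.
    NOTE: this is a generic sorted-children encoding, not BOCF.
    """
    label, children = tree
    # Encode every child first, then sort so sibling order no longer matters.
    encoded_children = sorted(canonical_form(child) for child in children)
    return (label, tuple(encoded_children))


def are_isomorphic(tree_a, tree_b):
    """Two rooted labelled unordered trees are isomorphic iff their encodings match."""
    return canonical_form(tree_a) == canonical_form(tree_b)


if __name__ == "__main__":
    t1 = ("a", [("b", []), ("c", [("d", [])])])
    t2 = ("a", [("c", [("d", [])]), ("b", [])])  # same tree, siblings swapped
    print(are_isomorphic(t1, t2))
```

With such an encoding, a frequent-subtree miner can use the canonical form as a dictionary key when counting occurrences, instead of running a pairwise isomorphism test.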

Relevance: 100.00%

Abstract:

This paper presents an algorithm for mining unordered embedded subtrees using the balanced-optimal-search canonical form (BOCF). A tree structure guided scheme based enumeration approach is defined using BOCF for systematically enumerating only the valid subtrees. Based on this canonical form and enumeration technique, the balanced optimal search embedded subtree mining algorithm (BEST) is introduced for mining embedded subtrees from a database of labelled rooted unordered trees. Extensive experiments on both synthetic and real datasets demonstrate the efficiency of BEST over two state-of-the-art algorithms for mining embedded unordered subtrees, SLEUTH and U3.

Relevance: 100.00%

Abstract:

We consider the two-dimensional version of a drainage network model introduced in Gangopadhyay, Roy and Sarkar (2004), and show that the appropriately rescaled family of its paths converges in distribution to the Brownian web. We do so by verifying the convergence criteria proposed in Fontes, Isopi, Newman and Ravishankar (2002).

Relevance: 100.00%

Abstract:

XML similarity evaluation has become a central issue in the database and information communities, its applications ranging over document clustering, version control, data integration and ranked retrieval. Various algorithms for comparing hierarchically structured data, XML documents in particular, have been proposed in the literature. Most of them make use of techniques for finding the edit distance between tree structures, XML documents being commonly modeled as Ordered Labeled Trees. Yet, a thorough investigation of current approaches led us to identify several similarity aspects, i.e., sub-tree related structural and semantic similarities, which are not sufficiently addressed while comparing XML documents. In this paper, we provide an integrated and fine-grained comparison framework to deal with both structural and semantic similarities in XML documents (detecting the occurrences and repetitions of structurally and semantically similar sub-trees), and to allow the end-user to adjust the comparison process according to her requirements. Our framework consists of four main modules for (i) discovering the structural commonalities between sub-trees, (ii) identifying sub-tree semantic resemblances, (iii) computing tree-based edit operations costs, and (iv) computing tree edit distance. Experimental results demonstrate higher comparison accuracy with respect to alternative methods, while timing experiments reflect the impact of semantic similarity on overall system performance.

Relevance: 100.00%

Abstract:

Universal trees based on sequences of single gene homologs cannot be rooted. Iwabe et al. [Iwabe, N., Kuma, K.-I., Hasegawa, M., Osawa, S. & Miyata, T. (1989) Proc. Natl. Acad. Sci. USA 86, 9355-9359] circumvented this problem by using ancient gene duplications that predated the last common ancestor of all living things. Their separate, reciprocally rooted gene trees for elongation factors and ATPase subunits showed Bacteria (eubacteria) as branching first from the universal tree with Archaea (archaebacteria) and Eucarya (eukaryotes) as sister groups. Given its topical importance to evolutionary biology and concerns about the appropriateness of the ATPase data set, an evaluation of the universal tree root using other ancient gene duplications is essential. In this study, we derive a rooting for the universal tree using aminoacyl-tRNA synthetase genes, an extensive multigene family whose divergence likely preceded that of prokaryotes and eukaryotes. An approximately 1600-bp conserved region was sequenced from the isoleucyl-tRNA synthetases of several species representing deep evolutionary branches of eukaryotes (Nosema locustae), Bacteria (Aquifex pyrophilus and Thermotoga maritima) and Archaea (Pyrococcus furiosus and Sulfolobus acidocaldarius). In addition, a new valyl-tRNA synthetase was characterized from the protist Trichomonas vaginalis. Different phylogenetic methods were used to generate trees of isoleucyl-tRNA synthetases rooted by valyl- and leucyl-tRNA synthetases. All isoleucyl-tRNA synthetase trees showed Archaea and Eucarya as sister groups, providing strong confirmation for the universal tree rooting reported by Iwabe et al. As well, there was strong support for the monophyly (sensu Hennig) of Archaea. The valyl-tRNA synthetase gene from Tr. vaginalis clustered with other eukaryotic ValRS genes, which may have been transferred from the mitochondrial genome to the nuclear genome, suggesting that this amitochondrial trichomonad once harbored an endosymbiotic bacterium.

Relevance: 100.00%

Abstract:

Data aggregation in wireless sensor networks is employed to reduce the communication overhead and prolong the network lifetime. However, an adversary may compromise some sensor nodes and use them to forge false values as the aggregation result. Previous secure data aggregation schemes have tackled this problem from different angles. The goal of those algorithms is to ensure that the Base Station (BS) does not accept any forged aggregation results. But none of them has tried to detect the nodes that inject bogus aggregation results into the network. Moreover, most of them have a communication overhead that is (at best) logarithmic per node. In this paper, we propose a secure and energy-efficient data aggregation scheme that can detect the malicious nodes with a constant per-node communication overhead. In our solution, all aggregation results are signed with the private keys of the aggregators so that they cannot be altered by others. Nodes on each link additionally use their pairwise shared key for secure communications. Each node receives the aggregation results from its parent (sent by the parent of its parent) and its siblings (via its parent node), and verifies the aggregation result of the parent node. Theoretical analysis of energy consumption and communication overhead accords with our comparison-based simulation study over random data aggregation trees.

Relevance: 40.00%

Abstract:

We consider a random tree and introduce a metric in the space of trees to define the "mean tree" as the tree minimizing the average distance to the random tree. When the resulting metric space is compact we have laws of large numbers and central limit theorems for sequences of independent identically distributed random trees. As an application we propose tests to check if two samples of random trees have the same law.

Relevance: 40.00%

Abstract:

An analytic solution to the multi-target Bayes recursion known as the δ-Generalized Labeled Multi-Bernoulli (δ-GLMB) filter has been recently proposed by Vo and Vo in [“Labeled Random Finite Sets and Multi-Object Conjugate Priors,” IEEE Trans. Signal Process., vol. 61, no. 13, pp. 3460-3475, 2014]. As a sequel to that paper, the present paper details efficient implementations of the δ-GLMB multi-target tracking filter. Each iteration of this filter involves an update operation and a prediction operation, both of which result in weighted sums of multi-target exponentials with an intractably large number of terms. To truncate these sums, the ranked assignment and K-th shortest path algorithms are used in the update and prediction, respectively, to determine the most significant terms without exhaustively computing all of the terms. In addition, using tools derived from the same framework, such as probability hypothesis density filtering, we present inexpensive (relative to the δ-GLMB filter) look-ahead strategies to reduce the number of computations. A characterization of the L1-error in the multi-target density arising from the truncation is presented.

Relevance: 30.00%

Abstract:

Biopanning of phage-displayed random peptide libraries is a powerful technique for identifying peptides that mimic epitopes (mimotopes) for monoclonal antibodies (mAbs). However, peptides derived using polyclonal antisera may represent epitopes for a diverse range of antibodies. Hence following screening of phage libraries with polyclonal antisera, including autoimmune disease sera, a procedure is required to distinguish relevant from irrelevant phagotopes. We therefore applied the multiple sequence alignment algorithm PILEUP together with a matrix for scoring amino acid substitutions based on physicochemical properties to generate guide trees depicting relatedness of selected peptides. A random heptapeptide library was biopanned nine times using no selecting antibodies, immunoglobulin G (IgG) from sera of subjects with autoimmune diseases (primary biliary cirrhosis (PBC) and type 1 diabetes) and three murine ascites fluids that contained mAbs to overlapping epitope(s) on the Ross River Virus envelope protein 2. Peptides randomly sampled from the library were distributed throughout the guide tree of the total set of peptides whilst many of the peptides derived in the absence of selecting antibody aligned to a single cluster. Moreover peptides selected by different sources of IgG aligned to separate clusters, each with a different amino acid motif. These alignments were validated by testing all of the 53 phagotopes derived using IgG from PBC sera for reactivity by capture ELISA with antibodies affinity purified on the E2 subunit of the pyruvate dehydrogenase complex (PDC-E2), the major autoantigen in PBC: only those phagotopes that aligned to PBC-associated clusters were reactive. Hence the multiple sequence alignment procedure discriminates relevant from irrelevant phagotopes and thus a major difficulty with biopanning phage-displayed random peptide libraries with polyclonal antibodies is surmounted.

Relevance: 30.00%

Abstract:

1. Species-accumulation curves for woody plants were calculated in three tropical forests, based on fully mapped 50-ha plots in wet, old-growth forest in Peninsular Malaysia, in moist, old-growth forest in central Panama, and in dry, previously logged forest in southern India. A total of 610 000 stems were identified to species and mapped to < 1 m accuracy. Mean species number and stem number were calculated in quadrats as small as 5 m × 5 m to as large as 1000 m × 500 m, for a variety of stem sizes above 10 mm in diameter. Species-area curves were generated by plotting species number as a function of quadrat size; species-individual curves were generated from the same data, but using stem number as the independent variable rather than area.
2. Species-area curves had different forms for stems of different diameters, but species-individual curves were nearly independent of diameter class. With < 10^4 stems, species-individual curves were concave downward on log-log plots, with curves from different forests diverging, but beyond about 10^4 stems, the log-log curves became nearly linear, with all three sites having a similar slope. This indicates an asymptotic difference in richness between forests: the Malaysian site had 2.7 times as many species as Panama, which in turn was 3.3 times as rich as India.
3. Other details of the species-accumulation relationship were remarkably similar between the three sites. Rectangular quadrats had 5-27% more species than square quadrats of the same area, with longer and narrower quadrats increasingly diverse. Random samples of stems drawn from the entire 50 ha had 10-30% more species than square quadrats with the same number of stems. At both Pasoh and BCI, but not Mudumalai, species richness was slightly higher among intermediate-sized stems (50-100 mm in diameter) than in either smaller or larger sizes. These patterns reflect aggregated distributions of individual species, plus weak density-dependent forces that tend to smooth the species abundance distribution and 'loosen' aggregations as stems grow.
4. The results provide support for the view that within each tree community, many species have their abundance and distribution guided more by random drift than deterministic interactions. The drift model predicts that the species-accumulation curve will have a declining slope on a log-log plot, reaching a slope of 0.1 in about 50 ha. No other model of community structure can make such a precise prediction.
5. The results demonstrate that diversity studies based on different stem diameters can be compared by sampling identical numbers of stems. Moreover, they indicate that stem counts < 1000 in tropical forests will underestimate the percentage difference in species richness between two diverse sites. Fortunately, standard diversity indices (Fisher's α, Shannon-Wiener) captured diversity differences in small stem samples more effectively than raw species richness, but both were sample size dependent. Two nonparametric richness estimators (Chao, jackknife) performed poorly, greatly underestimating true species richness.

Relevance: 30.00%

Abstract:

Let $G = (V, E)$ be a finite, simple and undirected graph. For $S \subset V$, let $\delta(S, G) = \{(u, v) \in E : u \in S \text{ and } v \in V - S\}$ be the edge boundary of $S$. Given an integer $i$, $1 \le i \le |V|$, let the edge isoperimetric value of $G$ at $i$ be defined as $b_e(i, G) = \min_{S \subset V : |S| = i} |\delta(S, G)|$. The edge isoperimetric peak of $G$ is defined as $b_e(G) = \max_{1 \le j \le |V|} b_e(j, G)$. Let $b_v(G)$ denote the vertex isoperimetric peak defined in a corresponding way. The problem of determining a lower bound for the vertex isoperimetric peak in complete $t$-ary trees was recently considered in [Y. Otachi, K. Yamazaki, A lower bound for the vertex boundary-width of complete k-ary trees, Discrete Mathematics, in press (doi: 10.1016/j.disc.2007.05.014)]. In this paper we provide bounds which improve those in the above cited paper. Our results can be generalized to arbitrary (rooted) trees. The depth $d$ of a tree is the number of nodes on the longest path starting from the root and ending at a leaf. In this paper we show that for a complete binary tree of depth $d$ (denoted as $T_d^2$), $c_1 d \le b_e(T_d^2) \le d$ and $c_2 d \le b_v(T_d^2) \le d$ where $c_1$, $c_2$ are constants. For a complete $t$-ary tree of depth $d$ (denoted as $T_d^t$) and $d \ge c \log t$ where $c$ is a constant, we show that $c_1 \sqrt{t}\, d \le b_e(T_d^t) \le t d$ and $c_2 d / \sqrt{t} \le b_v(T_d^t) \le d$ where $c_1$, $c_2$ are constants. At the heart of our proof we have the following theorem which works for an arbitrary rooted tree and not just for a complete $t$-ary tree. Let $T = (V, E, r)$ be a finite, connected and rooted tree, the root being the vertex $r$. Define a weight function $w : V \to \mathbb{N}$ where the weight $w(u)$ of a vertex $u$ is the number of its successors (including itself) and let the weight index $\eta(T)$ be defined as the number of distinct weights in the tree, i.e. $\eta(T) = |\{w(u) : u \in V\}|$. For a positive integer $k$, let $\ell(k) = |\{i \in \mathbb{N} : 1 \le i \le |V|,\ b_e(i, G) \le k\}|$. We show that $\ell(k) \le 2\binom{2\eta + k}{k}$.
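The definitions of the edge isoperimetric value and peak in this abstract can be checked by brute force on very small trees. The sketch below is only an illustration of the definitions, not the paper's proof technique; it enumerates all vertex subsets, so it is exponential in the number of vertices. The function names and the heap-style vertex numbering of the complete binary tree are assumptions made for this example.

```python
from itertools import combinations


def edge_boundary_size(subset, edges):
    """Number of edges with exactly one endpoint in the subset."""
    inside = set(subset)
    return sum(1 for u, v in edges if (u in inside) != (v in inside))


def edge_isoperimetric_value(i, vertices, edges):
    """Minimum edge-boundary size over all vertex subsets of size i."""
    return min(edge_boundary_size(s, edges) for s in combinations(vertices, i))


def edge_isoperimetric_peak(vertices, edges):
    """Maximum of the edge isoperimetric value over all subset sizes."""
    sizes = range(1, len(vertices) + 1)
    return max(edge_isoperimetric_value(i, vertices, edges) for i in sizes)


def complete_binary_tree(depth):
    """Complete binary tree whose longest root-to-leaf path has `depth` nodes.

    Vertices are numbered 1..2**depth - 1 in heap order (parent of v is v // 2).
    """
    n = 2 ** depth - 1
    vertices = list(range(1, n + 1))
    edges = [(v // 2, v) for v in range(2, n + 1)]
    return vertices, edges


if __name__ == "__main__":
    for d in range(1, 5):
        vertices, edges = complete_binary_tree(d)
        print(d, edge_isoperimetric_peak(vertices, edges))
```

For depth 3 (seven vertices), the values at sizes 1 through 7 are 1, 2, 1, 1, 2, 1, 0, so the peak is 2, which is consistent with the upper bound of $d$ stated in the abstract.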

Relevance: 30.00%

Abstract:

While plants of a single species emit a diversity of volatile organic compounds (VOCs) to attract or repel interacting organisms, these specific messages may be lost in the midst of the hundreds of VOCs produced by sympatric plants of different species, many of which may have no signal content. Receivers must be able to reduce the babel or noise in these VOCs in order to correctly identify the message. For chemical ecologists faced with vast amounts of data on volatile signatures of plants in different ecological contexts, it is imperative to employ accurate methods of classifying messages, so that suitable bioassays may then be designed to understand message content. We demonstrate the utility of 'Random Forests' (RF), a machine-learning algorithm, for the task of classifying volatile signatures and choosing the minimum set of volatiles for accurate discrimination, using data from sympatric Ficus species as a case study. We demonstrate the advantages of RF over conventional classification methods such as principal component analysis (PCA), as well as data-mining algorithms such as support vector machines (SVM), diagonal linear discriminant analysis (DLDA) and k-nearest neighbour (KNN) analysis. We show why a tree-building method such as RF, which is increasingly being used by the bioinformatics, food technology and medical communities, is particularly advantageous for the study of plant communication using volatiles, dealing, as it must, with abundant noise.