110 results for SUBSPACES


Relevance:

10.00% 10.00%

Publisher:

Abstract:

The configuration space available to randomly cyclized polymers is divided into subspaces accessible to individual knot types. A phantom chain utilized in numerical simulations of polymers can explore all subspaces, whereas a real closed chain forming a figure-of-eight knot, for example, is confined to a subspace corresponding to this knot type only. One can conceptually compare the assembly of configuration spaces of various knot types to a complex foam where individual cells delimit the configuration space available to a given knot type. Neighboring cells in the foam harbor knots that can be converted into each other by just one intersegmental passage. Such a segment-segment passage occurring at the level of knotted configurations corresponds to a passage through the interface between neighboring cells in the foamy knot space. Using a DNA topoisomerase-inspired simulation approach we characterize here the effective interface area between neighboring knot spaces as well as the surface-to-volume ratio of individual knot spaces. These results provide a reference system required for better understanding mechanisms of action of various DNA topoisomerases.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Molecular shape has long been known to be an important property for the process of molecular recognition. Previous studies postulated the existence of a drug-like shape space that could be used to artificially bias the composition of screening libraries, with the aim of increasing the chance of success in Hit Identification. In this work, it was analysed to what extent this assumption holds true. Normalized Principal Moments of Inertia Ratios (NPRs) have been used to describe the molecular shape of small molecules. It was investigated whether active molecules of diverse targets are located in preferred subspaces of the NPR shape space. Results illustrated a significantly stronger clustering than could be expected by chance, with parts of the space unlikely to be occupied by active compounds. Furthermore, a strong enrichment of elongated, rather flat shapes could be observed, while globular compounds were highly underrepresented. This was confirmed for a wide range of small molecule datasets from different origins. Active compounds exhibited a high overlap in their shape distributions across different targets, making a purely shape-based discrimination very difficult. An additional perspective was provided by comparing the shapes of protein binding pockets with those of their respective ligands. Although more globular than their ligands, binding site shapes were observed to exhibit a similarly skewed distribution in shape space: spherical shapes were highly underrepresented. This was different for unoccupied binding pockets of smaller size, which were on the contrary found to possess a more globular shape. The relation between shape complementarity and exhibited bioactivity was analysed; a moderate correlation between bioactivity and parameters including pocket coverage, distance in shape space, and others could be identified, which reflects the importance of shape complementarity. 
However, this also suggests that other aspects are of relevance for molecular recognition. A subsequent analysis assessed if and how shape and volume information retrieved from pocket or respective reference ligands could be used as a pre-filter in a virtual screening approach. In Lead Optimization, compounds need to be optimized with respect to a variety of parameters. Here, the availability of past success stories is very valuable, as they can guide medicinal chemists during their analogue synthesis plans. However, although of tremendous interest for the public domain, so far only large corporations had the ability to mine historical knowledge in their proprietary databases. With the aim to provide such information, the SwissBioisostere database was developed and released during this thesis. This database contains information on 21,293,355 performed substructural exchanges, corresponding to 5,586,462 unique replacements that have been measured in 35,039 assays against 1,948 molecular targets representing 30 target classes, and on their impact on bioactivity. A user-friendly interface was developed that provides facile access to these data and is accessible at http://www.swissbioisostere.ch. The ChEMBL database was used as primary data source of bioactivity information. Matched molecular pairs have been identified in the extracted and cleaned data. Success-based scores were developed and integrated into the database to allow re-ranking of proposed replacements by their past outcomes. It was analysed to which degree these scores correlate with chemical similarity of the underlying fragments. An unexpectedly weak relationship was detected and further investigated. 
Use cases of this database were envisioned, and functionalities implemented accordingly: replacement outcomes are aggregatable at the assay level, and it was shown that an aggregation at the target or target class level could also be performed, but should be accompanied by a careful case-by-case assessment. It was furthermore observed that replacement success depends on the activity of the starting compound A within a matched molecular pair A-B. With increasing potency the probability to lose bioactivity through any substructural exchange was significantly higher than in low-affinity binders. A potential existence of a publication bias could be refuted. Furthermore, often performed medicinal chemistry strategies for structure-activity-relationship exploration were analysed using the acquired data. Finally, data originating from pharmaceutical companies were compared with those reported in the literature. It could be seen that industrial medicinal chemistry can access replacement information not available in the public domain. In contrast, a large amount of often-performed replacements within companies could also be identified in literature data. Preferences for particular replacements differed between these two sources. The value of combining different endpoints in an evaluation of molecular replacements was investigated. The performed studies highlighted furthermore that there seems to exist no universal substructural replacement that always retains bioactivity irrespective of the biological environment. A generalization of bioisosteric replacements seems therefore not possible. - The three-dimensional shape of molecules has long been recognized as an important property for the process of molecular recognition. Previous studies postulated that drugs preferentially occupy a subset of the molecular shape space. 
This subset could be used to bias the composition of screening libraries, with the aim of increasing the chances of identifying Hits. The analysis and validation of this assertion is the subject of this first part. Normalized Principal Moments of Inertia Ratios (NPRs) were used to describe the three-dimensional shape of small drug-like molecules. It was investigated whether molecules active on different targets co-localize in preferred subspaces of the shape space. The results show clusters of molecules incompatible with a random distribution, with some parts of the space unlikely to be occupied by active compounds. In addition, a strong enrichment in elongated and rather flat shapes could be observed, while globular compounds were strongly underrepresented. This was confirmed for a large set of molecule collections of different origins. The shape distributions of molecules active on different targets overlap widely, making a discrimination based solely on shape very difficult. An additional perspective was added by comparing the shapes of ligands with those of their binding sites (pockets) in their respective proteins. Although more globular than their ligands, the pocket shapes were observed to show a distribution in shape space with the same kind of asymmetry as that observed for the ligands: spherical shapes are strongly underrepresented. A different result was obtained for smaller pockets crystallized without a ligand: they had a more globular shape. 
The relation between shape complementarity and bioactivity was also analysed; a moderate correlation between bioactivity and parameters such as pocket filling, distance in shape space, and others could be identified. This reflects the importance of shape complementarity, but also the involvement of other factors. A subsequent analysis assessed whether and how the shape and volume of a pocket or of its reference ligands could be used as a pre-filter in a virtual screening approach. During the optimization of a Lead, many parameters must be optimized simultaneously. In this context, the availability of examples of successful optimizations is valuable, as they can guide medicinal chemists in their analogue synthesis plans. However, although of extreme interest to researchers in the public domain, only large pharmaceutical companies have so far had the capacity to exploit such knowledge within their internal databases. With the aim of remedying this limitation, the SwissBioisostere database was developed and released into the public domain during this thesis. This database contains information on 21,293,355 observed substructural exchanges, corresponding to 5,586,462 unique replacements measured in 35,039 assays against 1,948 targets representing 30 families, as well as on their impact on bioactivity. An interface was developed to allow easy access to these data, accessible at http://www.swissbioisostere.ch. The ChEMBL database was used as the source of bioactivity data. A modified version of the algorithm of Hussain and Rea was implemented to identify Matched Molecular Pairs (MMPs) in the previously prepared data. 
Success scores were developed and integrated into the database to allow a re-ranking of the proposed replacements according to their previously observed outcomes. The correlation between these scores and the chemical similarity of the corresponding fragments was studied. A weaker correlation than expected was detected and analysed. Different use cases of this database were envisioned, and the corresponding functionalities implemented: the aggregation of replacement outcomes is performed at the level of each assay, and it was shown that it could also be performed at the level of the target or target class, subject to a case-by-case analysis. It was furthermore found that the success of a replacement depends on the activity of compound A within a pair A-B. It was shown that the probability of losing bioactivity following any molecular replacement is higher among the most active molecules than among molecules of lower activity. The potential existence of a bias linked to the process of publication in articles could be refuted. In addition, frequent medicinal chemistry strategies for exploring structure-activity relationships were analysed using the acquired data. Finally, data from pharmaceutical companies were compared with those reported in the literature. It could be seen that medicinal chemists in industry can access replacements that are not available in the public domain. On the other hand, a large number of replacements frequently observed in industry data could also be identified in literature data. Preferences for certain particular replacements differ between these two sources. The value of evaluating molecular replacements simultaneously according to several parameters (e.g. bioactivity and metabolic stability) was also studied. 
The studies carried out underlined that there seems to exist no universal substructural replacement that always retains bioactivity whatever the biological context. A generalization of bioisosteric replacements therefore does not seem possible.
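
To make the NPR shape descriptor from the abstract above concrete, the following is a minimal sketch, assuming the three principal moments of inertia of a molecule have already been computed elsewhere. The function name `normalized_pmi_ratios` is hypothetical and not taken from the thesis. With the moments sorted as I1 ≤ I2 ≤ I3, the two ratios are I1/I3 and I2/I3, so an ideal rod maps to (0, 1), an ideal disc to (0.5, 0.5) and an ideal sphere to (1, 1).

```python
def normalized_pmi_ratios(moments):
    """Return (NPR1, NPR2) = (I1/I3, I2/I3) for three principal moments of inertia.

    The moments may be given in any order; they are sorted so that I1 <= I2 <= I3.
    """
    i1, i2, i3 = sorted(moments)
    if i3 <= 0:
        raise ValueError("the largest principal moment must be positive")
    return i1 / i3, i2 / i3


# Idealised reference shapes (textbook moments of inertia, arbitrary units).
rod = normalized_pmi_ratios((1.0, 0.0, 1.0))     # thin rod: no inertia about its own axis
disc = normalized_pmi_ratios((1.0, 1.0, 2.0))    # flat disc: I_z = 2 * I_x
sphere = normalized_pmi_ratios((1.0, 1.0, 1.0))  # sphere: all moments equal
```

Elongated, flat molecules of the kind the abstract reports as enriched would fall near the rod-disc edge of this triangle, far from the sphere corner.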

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We give a sufficient condition for a set of block subspaces in an infinite-dimensional Banach space to be weakly Ramsey. Using this condition we prove that in the Levy-collapse of a Mahlo cardinal, every projective set is weakly Ramsey. This, together with a construction of W. H. Woodin, is used to show that the Axiom of Projective Determinacy implies that every projective set is weakly Ramsey. In the case of $c_0$ we prove similar results for a stronger Ramsey property. And for hereditarily indecomposable spaces we show that the Axiom of Determinacy plus the Axiom of Dependent Choices imply that every set is weakly Ramsey. These results are the generalizations to the class of projective sets of some theorems from W. T. Gowers, and our paper "Weakly Ramsey sets in Banach spaces."

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Linear spaces consisting of σ-finite probability measures and infinite measures (improper priors and likelihood functions) are defined. The commutative group operation, called perturbation, is the updating given by Bayes' theorem; the inverse operation is the Radon-Nikodym derivative. Bayes spaces of measures are sets of classes of proportional measures. In this framework, basic notions of mathematical statistics get a simple algebraic interpretation. For example, exponential families appear as affine subspaces with their sufficient statistics as a basis. Bayesian statistics, in particular some well-known properties of conjugated priors and likelihood functions, are revisited and slightly extended.
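
The perturbation operation described above can be illustrated on a finite support, where a measure is just a list of positive weights. This is a minimal sketch under that assumption; the function names `closure`, `perturb` and `inverse_perturb` are hypothetical, and the pointwise ratio stands in for the Radon-Nikodym derivative in this discrete setting only.

```python
def closure(weights):
    """Pick the representative of a class of proportional measures that sums to one."""
    total = sum(weights)
    return [w / total for w in weights]


def perturb(prior, likelihood):
    """Bayes updating as perturbation: pointwise product, then closure."""
    return closure([p * l for p, l in zip(prior, likelihood)])


def inverse_perturb(posterior, prior):
    """Inverse operation: pointwise ratio, defined up to a proportionality constant."""
    return closure([q / p for q, p in zip(posterior, prior)])


prior = [0.5, 0.5]
likelihood = [0.8, 0.2]          # need not be normalised
posterior = perturb(prior, likelihood)
recovered = inverse_perturb(posterior, prior)  # proportional to the likelihood
```

Because the product is commutative, `perturb(prior, likelihood)` and `perturb(likelihood, prior)` agree, which is the group property the abstract refers to.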

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In numerical linear algebra, students encounter early the iterative power method, which finds eigenvectors of a matrix from an arbitrary starting point through repeated normalization and multiplications by the matrix itself. In practice, more sophisticated methods are used nowadays, threatening to make the power method a historical and pedagogic footnote. However, in the context of communication over a time-division duplex (TDD) multiple-input multiple-output (MIMO) channel, the power method takes a special position. It can be viewed as an intrinsic part of the uplink and downlink communication switching, enabling estimation of the eigenmodes of the channel without extra overhead. Generalizing the method to vector subspaces, communication in the subspaces with the best receive and transmit signal-to-noise ratio (SNR) is made possible. In exploring this intrinsic subspace convergence (ISC), we show that several published and new schemes can be cast into a common framework where all members benefit from the ISC.
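
For readers who have not met the power method mentioned above, here is a minimal textbook sketch in plain Python. It is not the TDD MIMO scheme of the paper: the matrix is an ordinary list of rows, and the names `matvec` and `power_method` are illustrative. The loop repeats exactly the two steps the abstract names, multiplication by the matrix and normalization.

```python
import math


def matvec(a, x):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def power_method(a, x0, iterations=100):
    """Approximate the dominant eigenpair of `a`, starting from the vector `x0`."""
    x = list(x0)
    for _ in range(iterations):
        y = matvec(a, x)
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            raise ValueError("iterate collapsed to the zero vector")
        x = [v / norm for v in y]
    # Rayleigh quotient; x has unit norm, so no division is needed.
    eigenvalue = sum(u * v for u, v in zip(x, matvec(a, x)))
    return eigenvalue, x


# Symmetric 2x2 example with eigenvalues 3 and 1.
eigenvalue, eigenvector = power_method([[2.0, 1.0], [1.0, 2.0]], [1.0, 0.0])
```

Convergence is geometric in the ratio of the two largest eigenvalue magnitudes, here 1/3, so a modest number of iterations suffices for this example.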

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The interaction mean free path between neutrons and TRISO particles is simulated using scripts written in MATLAB in order to address the error in the reactor physics code Serpent, which grows as the packing factor increases. The neutrons' movement is tracked in both an unbounded and a bounded space. Depending on the program, their track is calculated in one of three ways: linearly, directly using the position vectors of the neutrons and the surface equations of all the fuel particles; by dividing the space into multiple subspaces, each of which contains a fraction of the total number of particles, and choosing the particles from those subspaces through which the neutron passes; or by choosing the particles that lie within an infinite cylinder formed around the movement axis of the neutron. The estimate of the mean free path from the current analytical model used by Serpent, which is based on an exponential distribution, serves as the reference result. The results from the implicit model in Serpent imply a mean free path that is too long at high packing factors. The results obtained here support this observation by producing, with a packing factor of 17 %, an approximately 2.46 % shorter mean free path than the reference model. This is supported by the packing factor experienced by the neutron, the simulation of which resulted in a 17.29 % packing factor. It was also observed that the neutrons leaving from the surfaces of the fuel particles, in contrast to those starting inside the moderator, do not follow the exponential distribution. The current model, as it is, is thus not valid for determining the free path lengths of the neutrons.
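
The reference model above assumes exponentially distributed free path lengths. As a minimal sketch of what that assumption means (written in Python rather than the MATLAB used in the study, and not reproducing Serpent's model or the thesis scripts), the snippet below draws path lengths by inverse-transform sampling. The function name `sample_free_paths` and the mean free path value of 2.0 are purely illustrative.

```python
import math
import random


def sample_free_paths(mean_free_path, count, seed=0):
    """Draw `count` path lengths from an exponential distribution with the given mean.

    Inverse-transform sampling: if u is uniform on [0, 1), then
    -mean * ln(1 - u) is exponentially distributed with that mean.
    """
    rng = random.Random(seed)
    return [-mean_free_path * math.log(1.0 - rng.random()) for _ in range(count)]


paths = sample_free_paths(mean_free_path=2.0, count=100_000)
sample_mean = sum(paths) / len(paths)  # close to 2.0 for a large sample
```

A histogram of explicitly tracked path lengths could be compared against such a sample to check whether the exponential assumption holds, which is the kind of comparison the abstract reports failing for neutrons that start on particle surfaces.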

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Supervised learning of large-scale hierarchical networks is currently enjoying tremendous success. Despite this momentum, unsupervised learning is still regarded by many researchers as a key component of Artificial Intelligence, where agents must learn from a potentially limited amount of data. This thesis follows that line of thought and addresses several research topics related to the problem of density estimation with Boltzmann machines (BMs), the probabilistic graphical models at the heart of deep learning. Our contributions touch on sampling, partition function estimation, optimization, and the learning of invariant representations. The thesis begins by presenting a new adaptive sampling algorithm that automatically adjusts the temperature of the simulated Markov chains in order to maintain a high convergence speed throughout learning. When used in the context of stochastic maximum likelihood (SML) learning, our algorithm yields increased robustness to the choice of learning rate as well as faster convergence. Our results are presented for BMs, but the method is general and applicable to the learning of any probabilistic model that relies on Markov chain sampling. While the maximum likelihood gradient can be approximated by sampling, evaluating the log-likelihood requires an estimate of the partition function. In contrast to traditional approaches, which treat a given model as a black box, we instead propose to exploit the dynamics of learning by estimating the successive changes in log-partition incurred at each parameter update. 
The estimation problem is reformulated as an inference problem similar to the Kalman filter, but on a two-dimensional graph whose dimensions correspond to the time axis and the temperature parameter. On the topic of optimization, we also present an algorithm that makes it possible to apply the natural gradient efficiently to Boltzmann machines with thousands of units. Until now, its adoption was limited by its high computational cost and memory requirements. Our algorithm, Metric-Free Natural Gradient (MFNG), avoids explicitly computing the Fisher information matrix (and its inverse) by exploiting a linear solver combined with an efficient matrix-vector product. The algorithm is promising: in terms of the number of function evaluations, MFNG converges faster than SML. Unfortunately, its implementation remains inefficient in computation time. This work also explores the mechanisms underlying the learning of invariant representations. To this end, we use the family of "spike & slab" restricted Boltzmann machines (ssRBM), which we modify so that it can model binary and sparse distributions. The binary latent variables of the ssRBM can be made invariant to a vector subspace by associating with each of them a vector of continuous latent variables (called "slabs"). This results in increased invariance at the representation level and a better classification rate when little labeled data is available. We end the thesis on an ambitious topic: learning representations that can separate the factors of variation present in the input signal. We propose a solution based on a bilinear ssRBM (with two groups of latent factors) and formulate the problem as one of "pooling" in complementary vector subspaces.
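
The idea of obtaining a natural-gradient direction from a linear solver and a matrix-vector product, without ever forming or inverting the Fisher matrix, can be sketched as follows. The abstract does not say which solver MFNG uses; conjugate gradient is an assumption made here for illustration, and `fisher_vector_product` is a hypothetical stand-in that multiplies by a small fixed matrix. This is not the thesis implementation.

```python
def conjugate_gradient(matvec, b, iterations=50, tol=1e-10):
    """Solve A x = b for symmetric positive-definite A, given only v -> A v.

    Iteration stops once the squared residual norm drops below `tol`.
    """
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with x = 0 initially
    p = list(r)            # current search direction
    rs_old = sum(v * v for v in r)
    for _ in range(iterations):
        if rs_old < tol:
            break
        ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(v * v for v in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def fisher_vector_product(v):
    """Hypothetical stand-in for F v; a real model would compute this without storing F."""
    fisher = [[4.0, 1.0], [1.0, 3.0]]
    return [sum(f * vi for f, vi in zip(row, v)) for row in fisher]


gradient = [1.0, 2.0]
natural_direction = conjugate_gradient(fisher_vector_product, gradient)  # solves F d = g
```

Only the callable is passed to the solver, so memory use scales with the parameter vector rather than with the square of its length.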

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper presents a computation of the $V_\gamma$ dimension for regression in bounded subspaces of Reproducing Kernel Hilbert Spaces (RKHS) for the Support Vector Machine (SVM) regression $\epsilon$-insensitive loss function, and general $L_p$ loss functions. Finiteness of the $V_\gamma$ dimension is shown, which also proves uniform convergence in probability for regression machines in RKHS subspaces that use the $L_\epsilon$ or general $L_p$ loss functions. This paper presents a novel proof of this result also for the case that a bias is added to the functions in the RKHS.
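
For reference, the two families of loss functions named above have simple closed forms: the $\epsilon$-insensitive loss is $\max(0, |y - f(x)| - \epsilon)$ and the $L_p$ loss is $|y - f(x)|^p$. The sketch below only evaluates them pointwise; the function names are illustrative and nothing here relates to the $V_\gamma$ dimension computation itself.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """SVM regression loss: errors smaller than epsilon cost nothing."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def lp_loss(y_true, y_pred, p):
    """General L_p loss: absolute error raised to the power p."""
    return abs(y_true - y_pred) ** p


inside_tube = epsilon_insensitive_loss(1.0, 1.05, epsilon=0.1)   # error within the tube
outside_tube = epsilon_insensitive_loss(3.0, 1.0, epsilon=0.5)   # error of 2.0, minus 0.5
squared = lp_loss(3.0, 1.0, p=2)
```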

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The Aitchison vector space structure for the simplex is generalized to a Hilbert space structure A2(P) for distributions and likelihoods on arbitrary spaces. Central notions of statistics, such as Information or Likelihood, can be identified in the algebraic structure of A2(P), together with their corresponding notions in compositional data analysis, such as Aitchison distance or centered log ratio transform. In this way very elaborate aspects of mathematical statistics can be understood easily in the light of a simple vector space structure and of compositional data analysis. E.g. combination of statistical information such as Bayesian updating, combination of likelihood and robust M-estimation functions are simple additions/perturbations in A2(Pprior). Weighting observations corresponds to a weighted addition of the corresponding evidence. Likelihood based statistics for general exponential families turns out to have a particularly easy interpretation in terms of A2(P). Regular exponential families form finite dimensional linear subspaces of A2(P) and they correspond to finite dimensional subspaces formed by their posterior in the dual information space A2(Pprior). The Aitchison norm can be identified with mean Fisher information. The closing constant itself is identified with a generalization of the cumulant function and shown to be Kullback-Leibler's directed information. Fisher information is the local geometry of the manifold induced by the A2(P) derivative of the Kullback-Leibler information and the space A2(P) can therefore be seen as the tangential geometry of statistical inference at the distribution P. The discussion of A2(P) valued random variables, such as estimation functions or likelihoods, gives a further interpretation of Fisher information as the expected squared norm of evidence and a scale free understanding of unbiased reasoning.
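
The finite-dimensional notions that this abstract generalizes, the centered log ratio (clr) transform and the Aitchison distance, can be sketched for a composition with strictly positive parts. The snippet covers only the ordinary simplex, not the space A2(P), and the function names are illustrative. It also shows why perturbation (pointwise multiplication) becomes plain addition after the clr transform.

```python
import math


def clr(composition):
    """Centered log ratio transform: log of each part minus the mean log."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def aitchison_distance(x, y):
    """Euclidean distance between the clr images of two compositions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))


x = [1.0, 2.0, 3.0]
y = [0.2, 0.5, 0.3]
product = [a * b for a, b in zip(x, y)]                # perturbation, before closure
sum_of_clr = [a + b for a, b in zip(clr(x), clr(y))]   # equals clr(product)
scaled_distance = aitchison_distance(x, [2.0, 4.0, 6.0])  # proportional vectors coincide
```

Since rescaling a composition leaves its clr image unchanged, the closure constant never needs to be computed when working in clr coordinates.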

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A joint distribution of two discrete random variables with finite support can be displayed as a two way table of probabilities adding to one. Assume that this table has n rows and m columns and all probabilities are non-null. This kind of table can be seen as an element in the simplex of n · m parts. In this context, the marginals are identified as compositional amalgams, conditionals (rows or columns) as subcompositions. Also, simplicial perturbation appears as Bayes theorem. However, the Euclidean elements of the Aitchison geometry of the simplex can also be translated into the table of probabilities: subspaces, orthogonal projections, distances. Two important questions are addressed: a) given a table of probabilities, which is the nearest independent table to the initial one? b) which is the largest orthogonal projection of a row onto a column? or, equivalently, which is the information in a row explained by a column, thus explaining the interaction? To answer these questions three orthogonal decompositions are presented: (1) by columns and a row-wise geometric marginal, (2) by rows and a columnwise geometric marginal, (3) by independent two-way tables and fully dependent tables representing row-column interaction. An important result is that the nearest independent table is the product of the two (row and column)-wise geometric marginal tables. A corollary is that, in an independent table, the geometric marginals conform with the traditional (arithmetic) marginals. These decompositions can be compared with standard log-linear models. Key words: balance, compositional data, simplex, Aitchison geometry, composition, orthonormal basis, arithmetic and geometric marginals, amalgam, dependence measure, contingency table
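
The main result stated above, that the nearest independent table is the product of the row-wise and column-wise geometric marginals, can be sketched for a small table with strictly positive entries. The function names are illustrative, and renormalising the product so that it sums to one is an assumption about how the result is reported. "Nearest" is meant in the Aitchison geometry, as in the abstract.

```python
import math


def geometric_marginals(table):
    """Return the geometric means of each row and of each column."""
    n, m = len(table), len(table[0])
    rows = [math.exp(sum(math.log(p) for p in row) / m) for row in table]
    cols = [math.exp(sum(math.log(table[i][j]) for i in range(n)) / n)
            for j in range(m)]
    return rows, cols


def nearest_independent_table(table):
    """Outer product of the geometric marginals, renormalised to sum to one."""
    rows, cols = geometric_marginals(table)
    raw = [[r * c for c in cols] for r in rows]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


# An independent table (outer product of [0.3, 0.7] and [0.2, 0.8]) maps to itself.
independent = [[0.06, 0.24], [0.14, 0.56]]
fixed_point = nearest_independent_table(independent)

# A dependent table maps to some independent table of the same size.
dependent = [[0.4, 0.1], [0.1, 0.4]]
projected = nearest_independent_table(dependent)
```

The first example also illustrates the corollary in the abstract: for an independent table the geometric marginals are proportional to the usual arithmetic ones.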

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A select-divide-and-conquer variational method to approximate configuration interaction (CI) is presented. Given an orthonormal set made up of occupied orbitals (Hartree-Fock or similar) and suitable correlation orbitals (natural or localized orbitals), a large N-electron target space S is split into subspaces S0, S1, S2, ..., SR. S0, of dimension d0, contains all configurations K with attributes (energy contributions, etc.) above thresholds T0={T0egy, T0etc.}; the CI coefficients in S0 remain always free to vary. S1 accommodates Ks with attributes above T1≤T0. An eigenproblem of dimension d0+d1 for S0+S1 is solved first, after which the last d1 rows and columns are contracted into a single row and column, thus freezing the last d1 CI coefficients hereinafter. The process is repeated with successive Sj (j≥2) chosen so that corresponding CI matrices fit random access memory (RAM). Davidson's eigensolver is used R times. The final energy eigenvalue (lowest or excited one) is always above the corresponding exact eigenvalue in S. Threshold values {Tj; j=0, 1, 2, ..., R} regulate accuracy; for large-dimensional S, high accuracy requires S0+S1 to be solved outside RAM. From there on, however, usually a few Davidson iterations in RAM are needed for each step, so that Hamiltonian matrix-element evaluation becomes rate determining. One μhartree accuracy is achieved for an eigenproblem of order 24 × 10^6, involving 1.2 × 10^12 nonzero matrix elements, and 8.4 × 10^9 Slater determinants.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

An implicitly parallel method for integral-block driven restricted active space self-consistent field (RASSCF) algorithms is presented. The approach is based on a model space representation of the RAS active orbitals with an efficient expansion of the model subspaces. The applicability of the method is demonstrated with a RASSCF investigation of the first two excited states of indole.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This thesis studies the problem of motion segmentation. It presents a review of the main motion segmentation algorithms, analyzes their main characteristics, and proposes a classification of the most recent and important techniques. Segmentation can be understood as a manifold clustering problem. This study tackles some of the most difficult challenges of motion segmentation through manifold clustering. New algorithms are proposed for estimating the rank of the trajectory matrix, a similarity measure between subspaces is presented, problems related to the behavior of canonical angles are addressed, and a generic tool is developed to estimate how many motions appear in a sequence. The last part of the study is devoted to correcting the initial estimate of a segmentation. This correction is carried out by combining the problems of motion segmentation and structure from motion.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Investigation of preferred structures of planetary wave dynamics is addressed using multivariate Gaussian mixture models. The number of components in the mixture is obtained using order statistics of the mixing proportions, hence avoiding previous difficulties related to sample sizes and independence issues. The method is first applied to a few low-order stochastic dynamical systems and data from a general circulation model. The method is next applied to winter daily 500-hPa heights from 1949 to 2003 over the Northern Hemisphere. A spatial clustering algorithm is first applied to the leading two principal components (PCs) and shows significant clustering. The clustering is particularly robust for the first half of the record and less so for the second half. The mixture model is then used to identify the clusters. Two highly significant extratropical planetary-scale preferred structures are obtained within the state space of the first two to four EOFs. The first pattern shows a Pacific-North American (PNA) pattern and a negative North Atlantic Oscillation (NAO), and the second pattern is nearly opposite to the first one. It is also observed that some subspaces show multivariate Gaussianity, compatible with linearity, whereas others show multivariate non-Gaussianity. The same analysis is also applied to two subperiods, before and after 1978, and shows a similar regime behavior, with slightly stronger support for the first subperiod. In addition, a significant regime shift is also observed between the two periods as well as a change in the shape of the distribution. The patterns associated with the regime shifts reflect essentially a PNA pattern and an NAO pattern consistent with the observed global warming effect on climate and the observed shift in sea surface temperature around the mid-1970s.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We investigated patterns of bryophyte species richness and community structure, and their relation to roof variables, on thatched roofs of the Holnicote Estate, South Somerset. Thirty-two bryophyte species were recorded from 28 sampled roofs, including the globally rare and endangered thatch moss, Leptodontium gemmascens. Multiple regression analyses revealed that thatch age has a highly significant positive effect on the number of species present, accounting for nearly half the observed variation in species richness after removal of outliers. Aspect has a slight and marginally significant effect on species diversity (accounting for an additional 6% of variation), with north-facing samples having slightly more species. Age also has a significant impact on total bryophyte cover after removal of outlying observations. TWINSPAN analysis of bryophyte cover data suggests the existence of at least five discrete communities. Simple Discriminant Analyses indicate that these communities occupy different ecological subspaces as defined by the measured roof variables, with pitch, aspect and thatch age emerging as especially significant attributes. Contingency Analysis indicates that some communities are disfavoured by water reed as compared to wheat straw. The findings are significant for understanding the structure of bryophyte communities, for evaluating the effect of bryophyte cover on thatch performance, and for conservation of thatch communities, especially those harbouring rare species.