937 results for rank-based procedure


Relevance:

80.00%

Publisher:

Abstract:

A 200-kDa guanine nucleotide-exchange protein (p200 or GEP) for ADP-ribosylation factors 1 and 3 (ARF1 and ARF3) that was inhibited by brefeldin A (BFA) was purified earlier from cytosol of bovine brain cortex. Amino acid sequences of four tryptic peptides were 47% identical to those of Sec7 from Saccharomyces cerevisiae, which is involved in vesicular trafficking in the Golgi. A PCR-based procedure with two degenerate primers representing sequences of these peptides generated a product similar in size to Sec7 that contained the peptide sequences. Two oligonucleotides based on this product were used to screen a bovine brain library, yielding one clone that was a partial cDNA for p200. The remainder of the cDNA was obtained by 5′ and 3′ rapid amplification of cDNA ends (RACE). The ORF of the cDNA encodes a protein of 1,849 amino acids (≈208 kDa) that is 33% identical to yeast Sec7 and 50% identical in the Sec7 domain region. On Northern blot analysis of bovine tissues, a ≈7.4-kb mRNA was identified that hybridized with a p200 probe; it was abundant in kidney, somewhat less abundant in lung, spleen, and brain, and still less abundant in heart. A six-His-tagged fusion protein synthesized in baculovirus-infected Sf9 cells demonstrated BFA-inhibited GEP activity, confirming that BFA sensitivity is an intrinsic property of this ARF GEP and not conferred by another protein component of the complex from which p200 was originally purified.
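
For illustration, the short Python sketch below shows what a "degenerate" primer means computationally: a primer written with IUPAC ambiguity codes stands for every concrete oligonucleotide obtained by expanding those codes. The primer here is hypothetical, not one of the actual p200 primers.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes used in degenerate primers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(primer):
    """Enumerate every concrete sequence a degenerate primer encodes."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in primer.upper()))]

# Hypothetical primer back-translated from a peptide (R = A/G, Y = C/T):
print(expand_degenerate("GARAAYTTYGA"))  # 2 * 2 * 2 = 8 concrete primers
```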

Relevance:

80.00%

Publisher:

Abstract:

Accurately identifying accessible sites in RNA is a critical prerequisite for optimising the cleavage efficiency of hammerhead ribozymes and other small nucleozymes. Here we describe a simple RNase H-based procedure to rapidly identify hammerhead ribozyme-accessible sites in gene-length RNAs. Twelve semi-randomised RNA–DNA–RNA chimeric oligonucleotide probes, known as 'gapmers', were used to direct RNase H cleavage of transcripts with the specificity expected for hammerhead ribozymes, i.e. after NUH sites (where N is any nucleotide and H is A, C or U). Cleavage sites were identified simply by the mobility of the RNase H cleavage products relative to RNA markers in denaturing polyacrylamide gels. Sites were identified in transcripts encoding human interleukin-2 and platelet-derived growth factor. Thirteen minimised hammerhead ribozymes, or miniribozymes (Mrz), were synthesised, and the in vitro cleavage efficiency (37°C, pH 7.6 and 1 mM MgCl2) at each site was analysed. Of the 13 Mrz, five were highly effective, demonstrating good initial rate constants and extents of cleavage. The speed and accuracy of this method commend its use in screening for hammerhead-accessible sites.
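
As a computational companion to the cleavage rule above, the Python sketch below scans a transcript for NUH triplets, the motifs after which hammerhead ribozymes (and the gapmer-directed RNase H cleavages) cut; the sequence is a toy example, not the IL-2 or PDGF transcript used in the study.

```python
def nuh_sites(rna):
    """Return 0-based positions of NUH triplets (N = any nucleotide,
    U fixed, H = A, C or U); cleavage occurs 3' of the H."""
    rna = rna.upper().replace("T", "U")
    return [i for i in range(len(rna) - 2)
            if rna[i + 1] == "U" and rna[i + 2] in "ACU"]

# Toy transcript (5'->3').
seq = "GGAUGCUCAGUUCGGAAUCC"
for i in nuh_sites(seq):
    print(f"site at {i}: {seq[i:i+3]}")   # e.g. CUC, GUU, UUC, AUC
```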

Relevance:

80.00%

Publisher:

Abstract:

Special-effect coatings show a strong dependence of color on the illumination/viewing geometry, which gives them their appealing appearance. An open question is the minimum number of measurement geometries required to completely characterize their observed color shift. A recently published procedure based on principal components analysis (PCA), which estimates the color of special-effect coatings at any geometry from measurements at a reduced set of geometries, was tested in this work using the measurement geometries of the commercial portable multiangle spectrophotometers X-Rite MA98, Datacolor FX10, and BYK-mac as the reduced sets. The performance of the proposed PCA procedure in estimating the color shift from these commercial geometries was examined for 15 special-effect coatings. Our results suggest that the color accuracy obtained with this procedure may be sufficient for rendering the color appearance of 3D objects covered with special-effect coatings, especially if the geometries of the X-Rite MA98 or Datacolor FX10 are used.
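
The abstract does not reproduce the PCA procedure itself, but the underlying idea can be sketched: learn the principal components of the colour response over a full set of geometries from training coatings, then estimate a new coating's full response by least-squares fitting its component scores to measurements at the reduced set. A minimal numpy sketch on synthetic, low-rank data, under these assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: coatings x measurement geometries (one colour value per
# geometry). Synthetic low-rank stand-in for real multiangle measurements.
n_samples, n_geom, n_comp = 40, 25, 3
X = rng.normal(size=(n_samples, n_comp)) @ rng.normal(size=(n_comp, n_geom))

mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
V = Vt[:n_comp].T                        # (n_geom, n_comp) loadings

reduced = [0, 4, 9, 14, 19, 24]          # the few geometries actually measured
x_meas = X[0, reduced]                   # new coating, reduced measurements

# Fit component scores from the reduced set, reconstruct all geometries.
scores, *_ = np.linalg.lstsq(V[reduced], x_meas - mean[reduced], rcond=None)
x_full = mean + V @ scores
print(np.max(np.abs(x_full - X[0])))     # near-zero reconstruction error
```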

Relevance:

80.00%

Publisher:

Abstract:

Schizophrenia is a common disorder with high heritability and a 10-fold increase in risk to siblings of probands. Reports of significant genetic linkage have been replicated inconsistently. To assess the evidence for linkage across studies, rank-based genome scan meta-analysis (GSMA) was applied to data from 20 schizophrenia genome scans. Each marker in each scan was assigned to 1 of 120 30-cM bins, with the bins ranked by linkage scores (1 = most significant), the ranks averaged across studies (R-avg), and the averages weighted for sample size (√N, with N the number of affected cases). A permutation test was used to compute the probability of observing, by chance, each bin's average rank (P-AvgRnk), or of observing it for a bin with the same place (first, second, etc.) in the order of average ranks in each permutation (P-ord). The GSMA produced significant genomewide evidence for linkage on chromosome 2q (P-AvgRnk
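
A minimal Python sketch of the GSMA permutation logic, on synthetic ranks and with the √N study weighting omitted for brevity: each study's bin ranks are shuffled independently to build the null distribution of average ranks, from which both the pointwise P-AvgRnk and the order-based P-ord can be read.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bins, n_studies, n_perm = 120, 20, 2000

# ranks[s, b]: rank of bin b in study s (1 = most significant linkage).
ranks = np.array([rng.permutation(n_bins) + 1 for _ in range(n_studies)])

avg_rank = ranks.mean(axis=0)                 # R-avg for each bin
best = avg_rank.argmin()                      # first-place bin

# Null: shuffle each study's ranks across bins independently.
null = np.empty((n_perm, n_bins))
for p in range(n_perm):
    null[p] = np.array([rng.permutation(r) for r in ranks]).mean(axis=0)

p_avgrnk = (null <= avg_rank[best]).mean()            # any bin this good
p_ord = (null.min(axis=1) <= avg_rank[best]).mean()   # first place this good
print(best, avg_rank[best], p_avgrnk, p_ord)
```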

Relevance:

80.00%

Publisher:

Abstract:

Cylpebs are slightly tapered cylindrical grinding media with a length-to-diameter ratio of unity. The manufacturers have made conflicting claims about the milling performance of Cylpebs in comparison with balls, one major point of interest being which grinds finer under the same operating conditions. Comparison is difficult because of the shape difference: the two grinding media differ in surface area, bulk density and contact mechanism in the grinding action. Comparative tests were conducted using the two types of grinding media in a laboratory Bond ball mill under various conditions of equality, such as media mass, size distribution, surface area and input specific energy. The laboratory results indicate that at the same specific energy input level the Cylpebs produce a product with slightly less oversize, owing to their greater surface area, but essentially the same sizing at the fine end as that produced with the balls. The likely reason is that the advantage of greater surface area is balanced by the line-contact and area-contact grinding actions of the Cylpebs. A new ball mill scale-up procedure [Man, Y.T., 2001. Model-based procedure for scale-up of wet, overflow ball mills, Part 1: outline of the methodology. Minerals Engineering 14 (10), 1237-1246] was employed to predict the grinding performance of an industrial mill from the laboratory test results. The predicted full-scale operation was compared with plant survey data, and some problems in the original scale-up procedure were identified. The scale-up procedure was therefore modified so that the predicted ball mill performance matched the observed one, and the calibrated procedure was then used to predict the Cylpebs' performance in the full-scale industrial mill from the laboratory test results.

Relevance:

80.00%

Publisher:

Abstract:

Motivation: The clustering of gene profiles across experimental conditions of interest contributes significantly to the elucidation of unknown gene function, the validation of gene discoveries and the interpretation of biological processes. This clustering problem is not straightforward, however, as the profiles of the genes are not all independently distributed and the expression levels may have been obtained from an experimental design involving replicated arrays. Ignoring the dependence between the gene profiles and the structure of the replicated data can mean that important sources of variability in the experiments are overlooked in the analysis, with the consequent possibility of misleading inferences. We propose a random-effects model that provides a unified approach to the clustering of genes with correlated expression levels measured in a wide variety of experimental situations. Our model extends the normal mixture model to account for the correlations between the gene profiles and to enable covariate information to be incorporated into the clustering process. Hence the model is applicable to longitudinal studies with or without replication (for example, time-course experiments, by using time as a covariate) and to cross-sectional experiments, by using categorical covariates to represent the different experimental classes. Results: We show that our random-effects model can be fitted by maximum likelihood via the EM algorithm, for which the E (expectation) and M (maximization) steps can be implemented in closed form. Hence our model can be fitted deterministically, without the need for time-consuming Monte Carlo approximations. The effectiveness of our model-based procedure for the clustering of correlated gene profiles is demonstrated on three real datasets, representing typical microarray experimental designs covering time-course, repeated-measurement and cross-sectional data. In these examples, relevant clusters of the genes are obtained, which are supported by existing gene-function annotation. A synthetic dataset is also considered.
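
The random-effects extension is beyond a short example, but the sketch below implements the plain normal mixture EM that it extends (spherical covariances for brevity); both the E- and M-steps are in closed form, so no Monte Carlo approximation is needed.

```python
import numpy as np

def em_normal_mixture(X, k, n_iter=100, seed=0):
    """Minimal EM for a k-component spherical Gaussian mixture.
    X: (n_genes, n_conditions) matrix of gene profiles."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, k, replace=False)]        # initial means
    var, pi = np.full(k, X.var()), np.full(k, 1 / k)
    for _ in range(n_iter):
        # E-step: responsibilities from log densities (numerically stable).
        sq = ((X[:, None, :] - mu[None]) ** 2).sum(-1)        # (n, k)
        logp = np.log(pi) - 0.5 * (d * np.log(2 * np.pi * var) + sq / var)
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: closed-form updates of weights, means and variances.
        nk = r.sum(axis=0)
        pi, mu = nk / n, (r.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
        var = (r * sq).sum(axis=0) / (d * nk)
    return pi, mu, var, r

# Toy profiles: two clusters of "genes" over 5 conditions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (30, 5)), rng.normal(3, 1, (30, 5))])
pi, mu, var, r = em_normal_mixture(X, k=2)
print(pi.round(2), mu.round(1))
```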

Relevance:

80.00%

Publisher:

Abstract:

Analysing the molecular polymorphism and interactions of DNA, RNA and proteins is of fundamental importance in biology. Predicting the functions of polymorphic molecules is important for the design of more effective medicines, and analysing major histocompatibility complex (MHC) polymorphism is important for mate choice, epitope-based vaccine design, transplantation rejection and so on. Most existing exploratory approaches cannot analyse these datasets because of the large number of molecules and the high number of descriptors per molecule. This thesis develops novel methods for data projection in order to explore high-dimensional biological datasets by visualising them in a low-dimensional space. With increasing dimensionality, some existing data visualisation methods, such as generative topographic mapping (GTM), become computationally intractable. We propose variants of these methods in which log-transformations are used at certain steps of the expectation-maximisation (EM) based parameter learning process, to make them tractable for high-dimensional datasets. We demonstrate these variants on both synthetic data and an electrostatic-potential dataset of MHC class I. We also propose extending a latent trait model (LTM), suitable for visualising high-dimensional discrete data, to estimate feature saliency simultaneously, as an integrated part of the parameter learning process of the visualisation model. This LTM variant not only gives a better visualisation, by modifying the projection map according to feature relevance, but also helps users assess the significance of each feature. Another problem little addressed in the literature is the visualisation of mixed-type data. We propose combining GTM and LTM in a principled way, with an appropriate noise model for each type of data, in order to visualise mixed-type data in a single plot; we call this model a generalised GTM (GGTM). We also extend the GGTM model to estimate feature saliencies while training the visualisation model, giving GGTM with feature saliency (GGTM-FS). We demonstrate the effectiveness of these models on both synthetic and real datasets. We evaluate visualisation quality using metrics such as a distance-distortion measure and rank-based measures: trustworthiness, continuity, and mean relative rank errors with respect to data space and latent space. Where the labels are known, we also use KL divergence and the nearest-neighbour classification error to determine the separation between classes. We demonstrate the efficacy of these models on both synthetic and real biological datasets, with a main focus on the MHC class I dataset.
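
One concrete instance of the log-transformation idea is computing EM responsibilities entirely in log space with the log-sum-exp trick, so that likelihoods which underflow to zero in high dimensions remain usable. A minimal sketch (not the thesis code):

```python
import numpy as np

def responsibilities(log_lik):
    """Posterior responsibilities from an (n_points, n_components) matrix
    of log-likelihoods, computed in log space so exp() never underflows."""
    m = log_lik.max(axis=1, keepdims=True)              # log-sum-exp shift
    log_norm = m + np.log(np.exp(log_lik - m).sum(axis=1, keepdims=True))
    return np.exp(log_lik - log_norm)

# In high dimensions raw likelihoods underflow (exp(-4500) == 0.0), yet
# the responsibilities are still well defined and sum to 1 per row.
log_lik = np.array([[-4500.0, -4503.0], [-9000.0, -8990.0]])
print(responsibilities(log_lik))
```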

Relevance:

80.00%

Publisher:

Abstract:

In the last decade, large numbers of social media services have emerged and become widely used in people's daily life as important tools for sharing and acquiring information. With a substantial amount of user-contributed text data on social media, it becomes necessary to develop methods and tools for analysing this emerging type of text, in order to better use it to deliver meaningful information to users. Previous work on text analytics over the last several decades has mainly focused on traditional types of text such as emails, news and academic literature, and several issues critical to text data on social media have not been well explored: 1) how to detect sentiment in text on social media; 2) how to make use of social media's real-time nature; 3) how to address information overload for flexible information needs. In this dissertation, we focus on these three problems. First, to detect the sentiment of text on social media, we propose a non-negative matrix tri-factorization (tri-NMF) based dual active supervision method that minimizes human labeling effort for this new type of data. Second, to make use of social media's real-time nature, we propose approaches to detect events from text streams on social media. Third, to address information overload for flexible information needs, we propose two summarization frameworks: a dominating-set-based summarization framework and a learning-to-rank-based summarization framework. The dominating-set-based framework can be applied to different types of summarization problems, while the learning-to-rank-based framework uses existing training data to guide new summarization tasks. In addition, we integrate these techniques in an application study of event summarization for sports games, as an example of how to better utilize social media data.
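
The dominating-set framework is named but not detailed above; a common reading, sketched below under that assumption, is to build a sentence-similarity graph and greedily pick sentences until every sentence is covered by (similar to) a selected one, a standard approximation of a minimum dominating set.

```python
import numpy as np

def greedy_dominating_summary(sim, threshold=0.3):
    """Greedy minimum-dominating-set approximation on a sentence
    similarity graph. sim: (n, n) pairwise similarity matrix."""
    n = sim.shape[0]
    adj = sim >= threshold                 # edge if similar enough
    np.fill_diagonal(adj, True)            # a sentence covers itself
    uncovered, summary = set(range(n)), []
    while uncovered:
        # Pick the sentence covering the most still-uncovered sentences.
        gain = [(len(uncovered & set(np.flatnonzero(adj[i]))), i)
                for i in range(n) if i not in summary]
        _, best = max(gain)
        summary.append(best)
        uncovered -= set(np.flatnonzero(adj[best]))
    return summary

# Hypothetical similarities between 5 "sentences" (two topics).
sim = np.array([[1, .6, .5, .1, .0],
                [.6, 1, .4, .0, .1],
                [.5, .4, 1, .2, .1],
                [.1, .0, .2, 1, .7],
                [.0, .1, .1, .7, 1]])
print(greedy_dominating_summary(sim))      # one sentence per topic: [2, 4]
```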

Relevance:

80.00%

Publisher:

Abstract:

Abundant turtle material from the early Oligocene site of Boutersem-TGV (Boutersem, Belgium) is presented here; no information on the turtles found there was previously available. All the specimens are attributable to a single freshwater taxon, identified as a member of the geoemydid genus Cuvierichelys. It is the first representative of the 'Palaeochelys s. l.–Mauremys' group recognized in the Belgian Paleogene record. This material, which documents all the elements of both the carapace and the plastron of the taxon, cannot be attributed to the only species of the genus so far identified in the Oligocene, the Spanish form Cuvierichelys iberica. The taxon from Boutersem is instead recognized as Cuvierichelys parisiensis. Both the paleobiogeographic and the biostratigraphic distributions of Cuvierichelys parisiensis are thus extended, its presence being confirmed for the first time outside the French Eocene record. The validity of some European forms is refuted, and several characters previously proposed as distinguishing Cuvierichelys iberica from Cuvierichelys parisiensis are shown to be subject to intraspecific variability.

Relevance:

80.00%

Publisher:

Abstract:

Here, we describe gene expression compositional assignment (GECA), a powerful yet simple method based on compositional statistics that can validate the transfer of prior knowledge, such as gene lists, into independent data sets, platforms and technologies. Transcriptional profiling has been used to derive gene lists that stratify patients into prognostic molecular subgroups and to assess biomarker performance in the pre-clinical setting. Archived public data sets are an invaluable resource for subsequent in silico validation, though their use can lead to data integration issues. We show that GECA can be used without normalising expression levels between data sets and can outperform rank-based correlation methods. To validate GECA, we demonstrate its success in the cross-platform transfer of gene lists in different domains, including bladder cancer staging, tumour site of origin and mislabelled cell lines. We also show its effectiveness in transferring an epithelial ovarian cancer prognostic gene signature across technologies, from a microarray to a next-generation sequencing setting. In a final case study, we predict the tumour site of origin and histopathology of epithelial ovarian cancer cell lines. In particular, we identify and validate the commonly used cell line OVCAR-5 as non-ovarian, being gastrointestinal in origin. GECA is available as an open-source R package.
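
GECA itself is not specified in the abstract, but the rank-based correlation baseline it is compared against can be sketched: assign a sample to the class whose centroid gives the highest Spearman correlation over the gene list, a criterion invariant to monotone cross-platform scaling. A minimal sketch with hypothetical signatures:

```python
import numpy as np
from scipy.stats import spearmanr

def rank_assign(sample, centroids):
    """Assign a sample to the class whose expression centroid has the
    highest Spearman correlation over the gene list; being rank-based,
    no cross-platform normalisation of expression levels is required."""
    scores = {c: spearmanr(sample, profile).correlation
              for c, profile in centroids.items()}
    return max(scores, key=scores.get), scores

# Hypothetical 6-gene signature centroids for two molecular subgroups.
centroids = {"subgroup_A": np.array([8.1, 2.0, 5.5, 7.2, 1.1, 3.3]),
             "subgroup_B": np.array([1.2, 7.8, 2.1, 3.0, 8.5, 6.6])}
sample = np.array([14.0, 3.1, 9.9, 12.5, 2.2, 5.0])  # other platform's scale
print(rank_assign(sample, centroids))                # ('subgroup_A', ...)
```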

Relevance:

80.00%

Publisher:

Abstract:

This thesis is concerned with modelling the dependence between risks in non-life insurance, more particularly in the context of reserving methods and ratemaking. We set out the current context and the issues surrounding dependence modelling, and the importance of such an approach with the advent of new standards and requirements from regulatory bodies regarding the solvency of general insurance companies. Recently, Shi and Frees (2011) suggested incorporating the dependence between two lines of business through a bivariate copula that captures the dependence between two equivalent cells of two development triangles. We propose two different approaches to generalize this model. The first is based on hierarchical Archimedean copulas, the second on random effects and the Sarmanov family of bivariate distributions. In Chapter 2 we first consider a model using the class of hierarchical Archimedean copulas, more precisely the family of partially nested copulas, to capture the dependence within and between two lines of business through calendar-year effects. We then consider an alternative model, drawn from another class of the hierarchical Archimedean copula family, the fully nested copulas, to model the dependence between more than two lines of business. An approach to risk aggregation based on a tree of bivariate copulas is also explored. An important feature of the approach described in Chapter 3 is that inference on the dependence is carried out through the ranks of the residuals, to guard against a possible misspecification of the marginal distributions and of the copula governing the dependence. As a second approach, we also model the dependence through random effects. To this end, we consider the Sarmanov family of bivariate distributions, which allows flexible modelling within and between the lines of business through calendar-year, accident-year and development-period effects. Closed-form expressions for the joint distribution, together with an empirical illustration on development triangles, are presented in Chapter 4. We also propose a model with dynamic random effects, in which more weight is given to the most recent years and the information from the correlated line is used to better predict the risk. This last approach is studied in Chapter 5, through a numerical application to claim counts illustrating the usefulness of such a model for ratemaking. We conclude the thesis by recalling its scientific contributions and suggesting avenues for extending this work.
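
A minimal sketch of the rank-based inference step mentioned for Chapter 3: the residuals are mapped to pseudo-observations via their normalised ranks, and a bivariate Archimedean copula (Clayton here, purely as an example family) is fitted by maximum pseudo-likelihood, so the fit does not depend on the marginal laws.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import rankdata

def pseudo_obs(x):
    """Normalised ranks r_i/(n+1): rank-based margins, immune to a
    misspecification of the marginal distributions."""
    return rankdata(x) / (len(x) + 1)

def clayton_neg_loglik(theta, u, v):
    """Negative log pseudo-likelihood of the bivariate Clayton copula."""
    t = u ** -theta + v ** -theta - 1
    logc = (np.log1p(theta) - (theta + 1) * (np.log(u) + np.log(v))
            - (2 + 1 / theta) * np.log(t))
    return -logc.sum()

# Synthetic positively dependent "residuals" from two lines of business.
rng = np.random.default_rng(0)
z = rng.normal(size=500)
x = z + 0.5 * rng.normal(size=500)
y = np.exp(z) + 0.5 * rng.normal(size=500)     # strongly non-normal margin

u, v = pseudo_obs(x), pseudo_obs(y)
res = minimize_scalar(clayton_neg_loglik, bounds=(0.01, 20),
                      args=(u, v), method="bounded")
print(f"estimated Clayton theta: {res.x:.2f}")  # dependence strength
```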

Relevance:

80.00%

Publisher:

Abstract:

In recent years genetic algorithms have emerged as a useful tool for the heuristic solution of complex discrete optimisation problems. In particular there has been considerable interest in their use in tackling problems arising in the areas of scheduling and timetabling. However, the classical genetic algorithm paradigm is not well equipped to handle constraints, and successful implementations usually require some modification that enables the search to exploit problem-specific knowledge in order to overcome this shortcoming. This paper is concerned with the development of a family of genetic algorithms for the solution of a nurse rostering problem at a major UK hospital. The hospital is made up of wards of up to 30 nurses. Each ward has its own group of nurses whose shifts have to be scheduled on a weekly basis. In addition to fulfilling the minimum demand for staff over three daily shifts, nurses' wishes and qualifications have to be taken into account. The schedules must also be seen to be fair, in that unpopular shifts have to be spread evenly amongst all nurses, and other restrictions, such as team nursing and special conditions for senior staff, have to be satisfied. The basis of the family of genetic algorithms is a classical genetic algorithm consisting of n-point crossover, single-bit mutation and rank-based selection. The solution space consists of all schedules in which each nurse works the required number of shifts; the remaining constraints, both hard and soft, are relaxed and penalised in the fitness function. The talk will start with a detailed description of the problem and the initial implementation, and will go on to highlight the shortcomings of such an approach in terms of the key challenge of balancing feasibility, i.e. covering the demand and work regulations, against quality, as measured by the nurses' preferences. A series of experiments involving parameter adaptation, niching, intelligent weights, delta coding, local hill climbing, migration and special selection rules will then be outlined, and it will be shown how a combination of these enhancements was able to eradicate these difficulties. Results based on several months' real data will be used to measure the impact of each modification, and to show that the final algorithm is able to compete with a tabu search approach currently employed at the hospital. The talk will conclude with some observations on the overall quality of this approach to this and similar problems.
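
A minimal sketch of the GA core just described (rank-based selection, n-point crossover, single-bit mutation, penalised fitness) on a toy bitstring problem standing in for the roster encoding; the staffing demand is caricatured as "exactly 20 ones".

```python
import random

def rank_select(pop, fit):
    """Rank-based selection: probability proportional to the rank of an
    individual's fitness (best rank = n), not to the raw fitness value."""
    order = sorted(range(len(pop)), key=lambda i: fit(pop[i]))
    weights = [r + 1 for r in range(len(pop))]       # worst=1 ... best=n
    return pop[random.choices(order, weights=weights)[0]]

def n_point_crossover(a, b, n=2):
    cuts = sorted(random.sample(range(1, len(a)), n))
    child, src, last = [], 0, 0
    for c in cuts + [len(a)]:
        child += (a if src == 0 else b)[last:c]
        src, last = 1 - src, c
    return child

def mutate(bits, p=0.01):
    return [b ^ 1 if random.random() < p else b for b in bits]

# Toy penalised fitness: reward ones, penalise violating the (caricatured)
# demand constraint of exactly 20 ones, mimicking relaxed hard constraints.
def fit(bits):
    return sum(bits) - 5 * abs(sum(bits) - 20)

random.seed(0)
pop = [[random.randint(0, 1) for _ in range(30)] for _ in range(50)]
for _ in range(200):
    pop = [mutate(n_point_crossover(rank_select(pop, fit),
                                    rank_select(pop, fit)))
           for _ in range(len(pop))]
print(fit(max(pop, key=fit)))    # approaches the optimum of 20
```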

Relevance:

80.00%

Publisher:

Abstract:

Historical evidence shows that chemical, process, and Oil&Gas facilities where dangerous substances are stored or handled are targets of deliberate malicious attacks (security attacks) aimed at interfering with normal operations. Physical attacks and cyber-attacks may generate events with consequences for people, property, and the surrounding environment that are comparable to those of major accidents with safety-related causes. The security aspects of these facilities are commonly addressed using Security Vulnerability/Risk Assessment (SVA/SRA) methodologies. Most of these are semi-quantitative, non-systematic approaches that rely strongly on expert judgment, leading to security assessments that are not reproducible; moreover, they do not consider synergies with the safety domain. The present three-year research aims to fill this gap by providing knowledge on security attacks, as well as rigorous and systematic methods supporting existing SVA/SRA studies, suitable for the chemical, process, and Oil&Gas industry. The different nature of cyber and physical attacks led to the development of different methods for the two domains. The first part of the research was devoted to the development and statistical analysis of security databases, from which new knowledge and lessons learnt on security threats were derived. On this background, a Bow-Tie-based procedure and two reverse-HazOp-based methodologies were developed as hazard identification approaches for physical and cyber threats, respectively. To support the quantitative estimation of the security risk, a procedure based on a Bayesian Network was developed for calculating the probability of success of physical security attacks. All the methods developed have been applied to case studies addressing chemical, process and Oil&Gas facilities (offshore and onshore), demonstrating the quality of the results that can be achieved in improving site security. Furthermore, the outcomes represent a step forward in developing synergies and promoting integration between safety and security management.
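
The Bayesian Network procedure is not detailed in the abstract; the toy sketch below illustrates the general idea with hypothetical probabilities, computing the probability of a successful physical attack by enumerating detection, barrier and response outcomes.

```python
from itertools import product

# Toy network for a physical attack: detection influences whether the
# response force arrives in time; success requires beating the delay
# barrier while the response is late. All probabilities hypothetical.
P_detect = {True: 0.75, False: 0.25}
P_beats_barrier = {True: 0.40, False: 0.60}
P_response_in_time = {True: {True: 0.80, False: 0.20},    # if detected
                      False: {True: 0.05, False: 0.95}}   # if not

def p_success():
    total = 0.0
    for det, barrier, resp in product([True, False], repeat=3):
        p = (P_detect[det] * P_beats_barrier[barrier]
             * P_response_in_time[det][resp])
        if barrier and not resp:          # attack succeeds
            total += p
    return total

print(f"P(attack success) = {p_success():.3f}")   # 0.155 with these numbers
```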

Relevance:

80.00%

Publisher:

Abstract:

This thesis focuses on automating the time-consuming task of manually counting activated neurons in fluorescence microscopy images, a task used to study the mechanisms underlying torpor. Because the traditional method of manual annotation can introduce bias and delay the outcome of experiments, the author investigates a deep-learning-based procedure to automate it. Two of the main state-of-the-art convolutional neural network (CNN) architectures are explored, the UNet and ResUNet model families, using a counting-by-segmentation strategy that provides a justification for the objects considered during the counting process. The author also explores a weakly supervised learning strategy that exploits only dot annotations, and quantifies the advantages, in terms of data reduction and counting-performance gains, obtainable with a transfer-learning approach and, specifically, a fine-tuning procedure. The dataset used for the supervised use case and all the pre-trained models were released, and a web application was designed to share both the counting pipeline developed in this work and the models pre-trained on the dataset analysed in this work.
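
A minimal sketch of the counting-by-segmentation step, assuming the network outputs a per-pixel probability map: threshold it, label connected components, and count those above a minimum area (a synthetic map stands in for a UNet output).

```python
import numpy as np
from scipy import ndimage

def count_neurons(prob_map, threshold=0.5, min_area=5):
    """Binarise the network's probability map, label connected
    components, and keep those above a minimum area, so every counted
    object can be inspected and justified."""
    mask = prob_map >= threshold
    labels, n = ndimage.label(mask)
    areas = ndimage.sum_labels(mask, labels, index=np.arange(1, n + 1))
    return int(np.sum(areas >= min_area)), labels

# Synthetic stand-in for a UNet output on a fluorescence image.
rng = np.random.default_rng(0)
prob = rng.random((64, 64)) * 0.3          # sub-threshold background
prob[10:16, 10:16] = 0.90                  # two "activated neurons"
prob[40:45, 30:36] = 0.95
count, labels = count_neurons(prob)
print("counted:", count)                   # -> 2
```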