70 results for Software Package Data Exchange (SPDX)


Relevance:

30.00%

Publisher:

Abstract:

Gene expression patterns are a key feature in understanding gene function, notably in development. Comparing gene expression patterns between animals is a major step in the study of gene function as well as of animal evolution. It also provides a link between genes and phenotypes. Thus we have developed Bgee, a database designed to compare expression patterns between animals, by implementing ontologies describing anatomies and developmental stages of species, and then designing homology relationships between anatomies and comparison criteria between developmental stages. To define homology relationships between anatomical features we have developed the software Homolonto, which uses a modified ontology alignment approach to propose homology relationships between ontologies. Bgee then uses these aligned ontologies, onto which heterogeneous expression data types are mapped. These already include microarrays and ESTs.

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: Human immunodeficiency virus (HIV) takes advantage of multiple host proteins to support its own replication. The gene ZNRD1 (zinc ribbon domain-containing 1) has been identified as encoding a potential host factor that influenced disease progression in HIV-positive individuals in a genomewide association study and also significantly affected HIV replication in a large-scale in vitro short interfering RNA (siRNA) screen. Genes and polymorphisms identified by large-scale analysis need to be followed up by means of functional assays and resequencing efforts to more precisely map causal genes. METHODS: Genotyping and ZNRD1 gene resequencing for 208 HIV-positive subjects (119 who experienced long-term nonprogression [LTNP] and 89 who experienced normal disease progression) was done by either TaqMan genotyping assays or direct sequencing. Genetic association analysis was performed with the SNPassoc package and Haploview software. siRNA and short hairpin RNA (shRNA) specifically targeting ZNRD1 were used to transiently or stably down-regulate ZNRD1 expression in both lymphoid and nonlymphoid cells. Cells were infected with X4 and R5 HIV strains, and efficiency of infection was assessed by reporter gene assay or p24 assay. RESULTS: Genetic association analysis found a strong statistically significant correlation with the LTNP phenotype (single-nucleotide polymorphism rs1048412; [Formula: see text]), independently of HLA-A10 influence. siRNA-based functional analysis showed that ZNRD1 down-regulation by siRNA or shRNA impaired HIV-1 replication at the transcription level in both lymphoid and nonlymphoid cells. CONCLUSION: Genetic association analysis unequivocally identified ZNRD1 as an independent marker of LTNP to AIDS. Moreover, in vitro experiments pointed to viral transcription as the inhibited step. Thus, our data strongly suggest that ZNRD1 is a host cellular factor that influences HIV-1 replication and disease progression in HIV-positive individuals.

Relevance:

30.00%

Publisher:

Abstract:

Aims: A rapid and simple HPLC-MS method was developed for the simultaneous determination of antidementia drugs, including donepezil, galantamine, rivastigmine and its major metabolite NAP 226-90, and memantine, for Therapeutic Drug Monitoring (TDM). In the elderly population treated with antidementia drugs, the presence of several comorbidities, drug interactions resulting from polypharmacy, and variations in drug metabolism and elimination are possible factors leading to the observed high interindividual variability in plasma levels. Although evidence for the benefit of TDM for antidementia drugs still remains to be demonstrated, an individually adapted dosage through TDM might contribute to minimize the risk of adverse reactions and to increase the probability of efficient therapeutic response. Methods: A solid-phase extraction procedure with a mixed-mode cation exchange sorbent was used to isolate the drugs from 0.5 mL of plasma. The compounds were analyzed on a reverse-phase column with a gradient elution consisting of an ammonium acetate buffer at pH 9.3 and acetonitrile and detected by mass spectrometry in the single ion monitoring mode. Isotope-labeled internal standards were used for quantification where possible. The validated method was used to measure the plasma levels of antidementia drugs in 300 patients treated with these drugs. Results: The method was validated according to international standards of validation, including the assessment of the trueness (-8 to 11%), the imprecision (repeatability: 1-5%, intermediate imprecision: 2-9%), selectivity and matrix effects variability (less than 6%). Furthermore, short and long-term stability of the analytes in plasma was ascertained. The method proved to be robust in the calibrated ranges of 1-300 ng/mL for rivastigmine and memantine and 2-300 mg/mL for donepezil, galantamine and NAP 226-90. We recently published a full description of the method (1). We found a high interindividual variability in plasma levels of these drugs in a study population of 300 patients. The plasma level measurements, with some preliminary clinical and pharmacogenetic results, will be presented. Conclusion: A simple LC-MS method was developed for plasma level determination of antidementia drugs which was successfully used in a clinical study with 300 patients.

Relevance:

30.00%

Publisher:

Abstract:

The package HIERFSTAT for the statistical software R, created by the R Development Core Team, allows the estimation of hierarchical F-statistics from a hierarchy with any number of levels. In addition, it allows testing the statistical significance of population differentiation for these different levels, using a generalized likelihood-ratio test. The package HIERFSTAT is available at http://www.unil.ch/popgen/softwares/hierfstat.htm.

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: Finding genes that are differentially expressed between conditions is an integral part of understanding the molecular basis of phenotypic variation. In the past decades, DNA microarrays have been used extensively to quantify the abundance of mRNA corresponding to different genes, and more recently high-throughput sequencing of cDNA (RNA-seq) has emerged as a powerful competitor. As the cost of sequencing decreases, it is conceivable that the use of RNA-seq for differential expression analysis will increase rapidly. To exploit the possibilities and address the challenges posed by this relatively new type of data, a number of software packages have been developed especially for differential expression analysis of RNA-seq data. RESULTS: We conducted an extensive comparison of eleven methods for differential expression analysis of RNA-seq data. All methods are freely available within the R framework and take as input a matrix of counts, i.e. the number of reads mapping to each genomic feature of interest in each of a number of samples. We evaluate the methods based on both simulated data and real RNA-seq data. CONCLUSIONS: Very small sample sizes, which are still common in RNA-seq experiments, impose problems for all evaluated methods and any results obtained under such conditions should be interpreted with caution. For larger sample sizes, the methods combining a variance-stabilizing transformation with the 'limma' method for differential expression analysis perform well under many different conditions, as does the nonparametric SAMseq method.

Relevance:

30.00%

Publisher:

Abstract:

SUMMARY: ExpressionView is an R package that provides an interactive graphical environment to explore transcription modules identified in gene expression data. A sophisticated ordering algorithm is used to present the modules with the expression in a visually appealing layout that provides an intuitive summary of the results. From this overview, the user can select individual modules and access biologically relevant metadata associated with them. AVAILABILITY: http://www.unil.ch/cbg/ExpressionView. Screenshots, tutorials and sample data sets can be found on the ExpressionView web site.

Relevance:

30.00%

Publisher:

Abstract:

The coverage and volume of geo-referenced datasets are extensive and incessantly growing. The systematic capture of geo-referenced information generates large volumes of spatio-temporal data to be analyzed. Clustering and visualization play a key role in the exploratory data analysis and the extraction of knowledge embedded in these data. However, new challenges in visualization and clustering are posed when dealing with the special characteristics of this data, for instance its complex structures, large quantity of samples, variables involved in a temporal context, high dimensionality and large variability in cluster shapes. The central aim of my thesis is to propose new algorithms and methodologies for clustering and visualization, in order to assist the knowledge extraction from spatio-temporal geo-referenced data, thus improving decision-making processes. I present two original algorithms, one for clustering: the Fuzzy Growing Hierarchical Self-Organizing Networks (FGHSON), and the second for exploratory visual data analysis: the Tree-structured Self-organizing Maps Component Planes. In addition, I present methodologies that, combined with FGHSON and the Tree-structured SOM Component Planes, allow the integration of space and time seamlessly and simultaneously in order to extract knowledge embedded in a temporal context. The originality of the FGHSON lies in its capability to reflect the underlying structure of a dataset in a hierarchical fuzzy way. A hierarchical fuzzy representation of clusters is crucial when data include complex structures with large variability of cluster shapes, variances, densities and number of clusters. The most important characteristics of the FGHSON include: (1) It does not require an a-priori setup of the number of clusters. (2) The algorithm executes several self-organizing processes in parallel. Hence, when dealing with large datasets the processes can be distributed, reducing the computational cost. (3) Only three parameters are necessary to set up the algorithm. In the case of the Tree-structured SOM Component Planes, the novelty of this algorithm lies in its ability to create a structure that allows the visual exploratory data analysis of large high-dimensional datasets. This algorithm creates a hierarchical structure of Self-Organizing Map Component Planes, arranging similar variables' projections in the same branches of the tree. Hence, similarities in variables' behavior can be easily detected (e.g. local correlations, maximal and minimal values and outliers). Both FGHSON and the Tree-structured SOM Component Planes were applied in several agroecological problems, proving to be very efficient in the exploratory analysis and clustering of spatio-temporal datasets. In this thesis I also tested three soft competitive learning algorithms. Two of them are well-known unsupervised soft competitive algorithms, namely the Self-Organizing Maps (SOMs) and the Growing Hierarchical Self-Organizing Maps (GHSOMs); the third was our original contribution, the FGHSON. Although the algorithms presented here have been used in several areas, to my knowledge there is no work applying and comparing the performance of those techniques when dealing with spatiotemporal geospatial data, as is presented in this thesis. I propose original methodologies to explore spatio-temporal geo-referenced datasets through time. Our approach uses time windows to capture temporal similarities and variations by using the FGHSON clustering algorithm. The developed methodologies are used in two case studies. In the first, the objective was to find similar agroecozones through time, and in the second it was to find similar environmental patterns shifted in time. Several results presented in this thesis have led to new contributions to agroecological knowledge, for instance in sugar cane and blackberry production. Finally, in the framework of this thesis we developed several software tools: (1) a Matlab toolbox that implements the FGHSON algorithm, and (2) a program called BIS (Bio-inspired Identification of Similar agroecozones), an interactive graphical user interface tool that integrates the FGHSON algorithm with Google Earth in order to show zones with similar agroecological characteristics.

Relevance:

30.00%

Publisher:

Abstract:

A recurring task in the analysis of mass genome annotation data from high-throughput technologies is the identification of peaks or clusters in a noisy signal profile. Examples of such applications are the definition of promoters on the basis of transcription start site profiles, the mapping of transcription factor binding sites based on ChIP-chip data and the identification of quantitative trait loci (QTL) from whole genome SNP profiles. Input to such an analysis is a set of genome coordinates associated with counts or intensities. The output consists of a discrete number of peaks with respective volumes, extensions and center positions. We have developed for this purpose a flexible one-dimensional clustering tool, called MADAP, which we make available as a web server and as a standalone program. A set of parameters enables the user to customize the procedure to a specific problem. The web server, which returns results in textual and graphical form, is useful for small to medium-scale applications, as well as for evaluation and parameter tuning in view of large-scale applications that require a local installation. The program, written in C++, can be freely downloaded from ftp://ftp.epd.unil.ch/pub/software/unix/madap. The MADAP web server can be accessed at http://www.isrec.isb-sib.ch/madap/.
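
The input and output described in this abstract (genome coordinates with counts in, peaks with a volume, an extension and a center position out) can be illustrated with a minimal Python sketch. It is not MADAP's algorithm, which the abstract does not detail. It assumes a simple gap-based rule instead, and the names Peak, cluster_1d and max_gap are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    start: int      # leftmost coordinate in the peak
    end: int        # rightmost coordinate in the peak
    volume: float   # sum of the counts in the peak
    center: float   # count-weighted mean coordinate


def _summarize(members):
    """Collapse a list of (position, count) pairs into one Peak."""
    volume = sum(count for _, count in members)
    center = sum(pos * count for pos, count in members) / volume
    return Peak(members[0][0], members[-1][0], volume, center)


def cluster_1d(signal, max_gap):
    """Group (position, count) pairs into peaks.

    Assumed rule (not taken from the source): a new peak starts whenever
    the distance to the previous position exceeds max_gap. Pairs with a
    non-positive count are ignored so every peak has a positive volume.
    """
    peaks = []
    current = []
    for pos, count in sorted(signal):
        if count <= 0:
            continue
        if current and pos - current[-1][0] > max_gap:
            peaks.append(_summarize(current))
            current = []
        current.append((pos, count))
    if current:
        peaks.append(_summarize(current))
    return peaks


example = [(100, 2), (105, 6), (110, 2), (500, 1), (510, 3)]
for peak in cluster_1d(example, max_gap=20):
    print(peak)
```

A single gap threshold is the simplest possible customization parameter; a real tool would likely expose more, such as a minimum volume for reporting a peak.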

Relevance:

30.00%

Publisher:

Abstract:

The present research deals with an application of artificial neural networks for multitask learning from spatial environmental data. The real case study (sediment contamination of Lake Geneva) consists of 8 pollutants. There are different relationships between these variables, from linear correlations to strong nonlinear dependencies. The main idea is to construct subsets of pollutants which can be efficiently modeled together within the multitask framework. The proposed two-step approach is based on: 1) the criterion of nonlinear predictability of each variable "k" by analyzing all possible models composed from the rest of the variables by using a General Regression Neural Network (GRNN) as a model; 2) a multitask learning of the best model using multilayer perceptron and spatial predictions. The results of the study are analyzed using both machine learning and geostatistical tools.
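
For readers unfamiliar with the GRNN named in step 1, the following Python sketch shows its standard form: a kernel-weighted average of the training targets, equivalent to Nadaraya-Watson regression with a Gaussian kernel. It is only an illustration under stated assumptions. The function name grnn_predict, the isotropic bandwidth sigma and the toy data are hypothetical and are not taken from the study.

```python
import math


def grnn_predict(train_x, train_y, query, sigma):
    """Predict a target at `query` as a Gaussian-weighted mean of train_y.

    train_x: list of feature tuples (e.g. the other pollutants' values)
    train_y: list of target values (the pollutant being predicted)
    sigma:   kernel bandwidth, the only free parameter in this sketch
    """
    weights = []
    for x in train_x:
        squared_distance = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-squared_distance / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all training points for this sigma")
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Toy data: one predictor, two training points.
xs = [(0.0,), (1.0,)]
ys = [0.0, 2.0]
print(grnn_predict(xs, ys, (0.5,), sigma=0.5))
```

A predictability criterion of the kind the abstract mentions could, for example, compare the cross-validated error of such a predictor across different subsets of input variables; that selection loop is left out here.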

Relevance:

30.00%

Publisher:

Abstract:

Due to the existence of free software and pedagogical guides, the use of data envelopment analysis (DEA) has been further democratized in recent years. Nowadays, it is quite usual for practitioners and decision makers with little or no knowledge of operational research to run their own efficiency analyses. Within DEA, several alternative models allow for an environment adjustment. Five alternative models, each of them easily accessible to and achievable by practitioners and decision makers, are applied to the empirical case of the 90 primary schools of the State of Geneva, Switzerland. As the State of Geneva practices an upstream positive discrimination policy towards disadvantaged schools, this empirical case is particularly appropriate for an environment adjustment. The majority of the alternative DEA models deliver divergent results. This is a matter of concern for applied researchers and a matter of confusion for practitioners and decision makers. From a political standpoint, these diverging results could lead to potentially opposite decisions, depending on the model on which they are based.

Relevance:

30.00%

Publisher:

Abstract:

We sequenced 1077 bp of the mitochondrial cytochrome b gene and 511 bp of the nuclear Apolipoprotein B gene in bicoloured shrew (Crocidura leucodon, Soricidae) populations ranging from France to Georgia. The aims of the study were to identify the main genetic clades within this species and the influence of Pleistocene climatic variations on the respective clades. The mitochondrial analyses revealed a European clade distributed from France eastwards to north-western Turkey and a Near East clade distributed from Georgia to Romania; the two clades separated during the Middle Pleistocene. We clearly identified a population expansion after a bottleneck for the European clade based on mitochondrial and nuclear sequencing data; this expansion was not observed for the eastern clade. We hypothesize that the western population was confined to a small Italo-Balkanic refugium, whereas the eastern population subsisted in several refugia along the southern coast of the Black Sea.

Relevance:

30.00%

Publisher:

Abstract:

Introduction: Therapeutic drug monitoring (TDM) aims at optimizing treatment by individualizing dosage regimen based on measurement of blood concentrations. Maintaining concentrations within a target range requires pharmacokinetic and clinical capabilities. Bayesian calculation represents a gold standard in the TDM approach but requires computing assistance. In the last decades computer programs have been developed to assist clinicians in this assignment. The aim of this benchmarking was to assess and compare computer tools designed to support TDM clinical activities. Method: A literature and Internet search was performed to identify software. All programs were tested on a common personal computer. Each program was scored against a standardized grid covering pharmacokinetic relevance, user-friendliness, computing aspects, interfacing, and storage. A weighting factor was applied to each criterion of the grid to consider its relative importance. To assess the robustness of the software, six representative clinical vignettes were also processed through all of them. Results: 12 software tools were identified, tested and ranked. This represents a comprehensive review of the available software's characteristics. The number of drugs handled varies widely, and 8 programs offer users the ability to add their own drug model. 10 computer programs are able to compute Bayesian dosage adaptation based on a blood concentration (a posteriori adjustment) while 9 are also able to suggest an a priori dosage regimen (prior to any blood concentration measurement), based on individual patient covariates such as age, gender and weight. Among those applying Bayesian analysis, one uses the non-parametric approach. The top 2 programs emerging from this benchmark are MwPharm and TCIWorks. Other programs evaluated also have good potential but are less sophisticated (e.g. in terms of storage or report generation) or less user-friendly. Conclusion: Whereas 2 integrated programs are at the top of the ranked list, such complex tools would possibly not fit all institutions, and each software tool must be regarded with respect to the individual needs of hospitals or clinicians. Interest in computing tools to support therapeutic monitoring is still growing. Although developers have put effort into them in recent years, there is still room for improvement, especially in terms of institutional information system interfacing, user-friendliness, capacity of data storage and report generation.
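
The difference between a priori and a posteriori dosage adjustment can be made concrete with a deliberately simplified Python sketch. It is not the method of any program evaluated in this benchmark. It assumes a toy steady-state model (average concentration = dose rate / clearance), a log-normal population prior on clearance and log-normal measurement error, for which the maximum a posteriori (MAP) estimate of log-clearance has a closed form. All function names, parameters and numbers are hypothetical.

```python
import math


def a_priori_rate(target_css, cl_pop):
    """Dose rate (mg/h) from the population clearance alone (no measurement)."""
    return target_css * cl_pop


def map_clearance(cl_pop, omega, dose_rate, c_obs, sigma):
    """MAP estimate of an individual's clearance (L/h) from one measurement.

    cl_pop:    typical population clearance (L/h)
    omega:     between-subject SD of log-clearance
    dose_rate: current dose rate (mg/h)
    c_obs:     measured steady-state concentration (mg/L)
    sigma:     residual SD of the log-concentration
    """
    prior_log_cl = math.log(cl_pop)
    data_log_cl = math.log(dose_rate / c_obs)  # clearance implied by the measurement
    w_prior = 1.0 / omega ** 2
    w_data = 1.0 / sigma ** 2
    # Precision-weighted average of the prior and the data on the log scale.
    post_log_cl = (w_prior * prior_log_cl + w_data * data_log_cl) / (w_prior + w_data)
    return math.exp(post_log_cl)


def a_posteriori_rate(target_css, cl_pop, omega, dose_rate, c_obs, sigma):
    """Dose rate (mg/h) after updating clearance with the measured concentration."""
    return target_css * map_clearance(cl_pop, omega, dose_rate, c_obs, sigma)


# Hypothetical patient: population clearance 5 L/h, target 10 mg/L.
print(a_priori_rate(10.0, 5.0))
print(a_posteriori_rate(10.0, 5.0, omega=0.3, dose_rate=50.0, c_obs=20.0, sigma=0.3))
```

With a precise assay (small sigma) the estimate follows the measurement; with a noisy one it stays close to the population value. Real TDM software works with full compartmental models and covariates, which this sketch omits.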

Relevance:

30.00%

Publisher:

Abstract:

Objectives: Therapeutic drug monitoring (TDM) aims at optimizing treatment by individualizing dosage regimen based on blood concentration measurement. Maintaining concentrations within a target range requires pharmacokinetic (PK) and clinical capabilities. Bayesian calculation represents a gold standard in the TDM approach but requires computing assistance. The aim of this benchmarking was to assess and compare computer tools designed to support TDM clinical activities. Methods: Literature and Internet were searched to identify software. Each program was scored against a standardized grid covering pharmacokinetic relevance, user-friendliness, computing aspects, interfacing, and storage. A weighting factor was applied to each criterion of the grid to consider its relative importance. To assess the robustness of the software, six representative clinical vignettes were also processed through all of them. Results: 12 software tools were identified, tested and ranked. This represents a comprehensive review of the available software characteristics. The number of drugs handled varies from 2 to more than 180, and integration of different population types is available for some programs. Nevertheless, 8 programs offer the ability to add new drug models based on population PK data. 10 computer tools incorporate Bayesian computation to predict dosage regimen (individual parameters are calculated based on population PK models). All of them are able to compute Bayesian a posteriori dosage adaptation based on a blood concentration while 9 are also able to suggest an a priori dosage regimen, based only on individual patient covariates. Among those applying Bayesian analysis, MM-USC*PACK uses a non-parametric approach. The top 2 programs emerging from this benchmark are MwPharm and TCIWorks. Other programs evaluated also have good potential but are less sophisticated or less user-friendly. Conclusions: Whereas 2 software packages are ranked at the top of the list, such complex tools would possibly not fit all institutions, and each program must be regarded with respect to the individual needs of hospitals or clinicians. Programs should be easy and fast for routine activities, including for non-experienced users. Although interest in TDM tools is growing and effort has been put into them in recent years, there is still room for improvement, especially in terms of institutional information system interfacing, user-friendliness, capability of data storage and automated report generation.

Relevance:

30.00%

Publisher:

Abstract:

We investigate the population genetic structure of the Maghrebian bat, Myotis punicus, between the mainland and islands to assess the island colonization pattern and current gene flow between nearby islands and within the mainland. Location: North Africa and the Mediterranean islands of Corsica and Sardinia. Methods: We sequenced part of the control region (HVII) of 79 bats across 11 colonies. The phylogeographical pattern was assessed by analysing molecular diversity indices, examining differentiation among populations and estimating divergence time. In addition, we genotyped 182 bats across 10 colonies at seven microsatellite loci. We used analysis of molecular variance and a Bayesian approach to infer nuclear population structure. Finally, we estimated sex-specific dispersal between Corsica and Sardinia. Results: Mitochondrial analyses indicated that colonies between Corsica, Sardinia and North Africa are highly differentiated. Within islands there was no difference between colonies, while at the continental level Moroccan and Tunisian populations were highly differentiated. Analyses with seven microsatellite loci showed a similar pattern. The sole difference was the lack of nuclear differentiation between populations in North Africa, suggesting a male-biased dispersal over the continental area. The divergence time of Sardinian and Corsican populations was estimated to date back to the early and mid-Pleistocene. Main conclusions: Island colonization by the Maghrebian bats seems to have occurred in a stepping-stone manner and certainly pre-dated human colonization. Currently, open water seems to prevent exchange of bats between the two islands, despite their ability to fly and the narrowness of the strait of Bonifacio. Corsican and Sardinian populations are thus currently isolated from any continental gene pool and must therefore be considered as different evolutionarily significant units (ESU).

Relevance:

30.00%

Publisher:

Abstract:

Volumes of data used in science and industry are growing rapidly. When researchers face the challenge of analyzing them, their format is often the first obstacle. The lack of standardized ways of exploring different data layouts requires an effort each time to solve the problem from scratch. The possibility of accessing data in a rich, uniform manner, e.g. using Structured Query Language (SQL), would offer expressiveness and user-friendliness. Comma-separated values (CSV) files are one of the most common data storage formats. Despite their simplicity, handling them becomes non-trivial as file size grows. Importing CSVs into existing databases is time-consuming and troublesome, or even impossible if the horizontal dimension reaches thousands of columns. Most databases are optimized for handling large numbers of rows rather than columns; therefore, performance for datasets with non-typical layouts is often unacceptable. Other challenges include schema creation, updates and repeated data imports. To address the above-mentioned problems, I present a system for accessing very large CSV-based datasets by means of SQL. It is characterized by: a "no copy" approach, in which data stay mostly in the CSV files; "zero configuration", with no need to specify a database schema; an implementation written in C++ with boost [1], SQLite [2] and Qt [3] that does not require installation and has a very small size; query rewriting, dynamic creation of indices for appropriate columns and static data retrieval directly from CSV files, which together ensure efficient plan execution; effortless support for millions of columns; per-value typing, which makes using mixed text/number data easy; and a very simple network protocol that provides an efficient interface for MATLAB and reduces implementation time for other languages. The software is available as freeware along with educational videos on its website [4]. It does not need any prerequisites to run, as all of the libraries are included in the distribution package. I test it against existing database solutions using a battery of benchmarks and discuss the results.
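
To make the idea of querying CSV data with SQL and per-value typing concrete, here is a minimal Python sketch using only the standard csv and sqlite3 modules. It is not the system described in this abstract: it copies the data into an in-memory SQLite table, which is exactly the import step the author's "no copy" design avoids. The helper names parse_value and load_csv and the sample data are hypothetical.

```python
import csv
import io
import sqlite3


def parse_value(text):
    """Type each value on its own: try int, then float, else keep the text."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def load_csv(conn, table, csv_text):
    """Create `table` from the CSV header (no declared column types) and fill it."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Quote identifiers so arbitrary header names are accepted.
    quoted = ['"' + name.replace('"', '""') + '"' for name in header]
    columns = ", ".join(quoted)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    rows = ([parse_value(value) for value in row] for row in reader)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)
    return header


conn = sqlite3.connect(":memory:")
load_csv(conn, "data", "sample,score\na,10\nb,n/a\nc,2.5\n")
rows = conn.execute(
    "SELECT sample FROM data "
    "WHERE typeof(score) IN ('integer', 'real') AND score > 3"
).fetchall()
print(rows)
```

Because the columns are created without declared types, SQLite stores each value with the type it was inserted with, so the numeric and text entries in the score column coexist and the query can filter on typeof(score).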