790 results for Datasets
Abstract:
This work complements some of the results appearing in the article "Publishing Performance in Economics: Spanish Rankings" by Dolado et al. Specifically, we focus on the robustness of the results to the time span considered, the effect of the choice of a particular database on the final results, and the effects of changes in the unit of institutional measure (departments versus institutions as a whole). Differences are significant when we expand the time period considered. There are also significant but small differences if we combine datasets to derive the rankings. Finally, department rankings offer a more precise picture of the situation of Spanish academics, although results do not differ substantially from those obtained when overall institutions are considered.
Abstract:
The anatomical structures and mechanisms linking genes to neuropsychiatric disorders have not been deciphered. Reciprocal copy number variants at the 16p11.2 BP4-BP5 locus offer a unique opportunity to study the intermediate phenotypes in carriers at high risk for autism spectrum disorder (ASD) or schizophrenia (SZ). We investigated the variation in brain anatomy in 16p11.2 deletion and duplication carriers. Beyond gene dosage effects on global brain metrics, we show that the number of genomic copies correlates negatively with gray matter volume and white matter tissue properties in cortico-subcortical regions implicated in reward, language and social cognition. Despite the near absence of ASD or SZ diagnoses in our 16p11.2 cohort, the pattern of brain anatomy changes in carriers spatially overlaps with the well-established structural abnormalities in ASD and SZ. Using measures of peripheral mRNA levels, we confirm our genomic copy number findings. This combined molecular, neuroimaging and clinical approach, applied to larger datasets, will help interpret the relative contributions of genes to neuropsychiatric conditions by measuring their effect on local brain anatomy. Molecular Psychiatry advance online publication, 25 November 2014; doi:10.1038/mp.2014.145.
Abstract:
Introduction: Non-invasive brain imaging techniques often contrast experimental conditions across a cohort of participants, obfuscating distinctions in individual performance and brain mechanisms that are better characterised by the inter-trial variability. To overcome such limitations, we developed topographic analysis methods for single-trial EEG data [1]. So far, single-trial analysis has typically been based on time-frequency analysis of single-electrode data or single independent components. The method's efficacy is demonstrated for event-related responses to environmental sounds, hitherto studied at an average event-related potential (ERP) level. Methods: Nine healthy subjects participated in the experiment. Meaningful sounds of common objects were used for a target detection task [2]. In each block, subjects were asked to discriminate target sounds, which were living or man-made auditory objects. Continuous 64-channel EEG was acquired during the task. Two datasets were considered for each subject, comprising the single trials of the two conditions, living and man-made. The analysis comprised two steps. In the first step, a mixture of Gaussians analysis [3] provided representative topographies for each subject. In the second step, conditional probabilities for each Gaussian provided statistical inference on the structure of these topographies across trials, time, and experimental conditions. A similar analysis was conducted at group level. Results: Results show that the occurrence of each map is structured in time and consistent across trials at both the single-subject and the group level. By conducting separate analyses of ERPs at single-subject and group levels, we could quantify the consistency of identified topographies and their time course of activation within and across participants as well as experimental conditions. A general agreement was found with previous analysis at the average ERP level.
Conclusions: This novel approach to single-trial analysis promises to have impact on several domains. In clinical research, it gives the possibility to statistically evaluate single-subject data, an essential tool for analysing patients with specific deficits and impairments and their deviation from normative standards. In cognitive neuroscience, it provides a novel tool for understanding behaviour and brain activity interdependencies at both single-subject and at group levels. In basic neurophysiology, it provides a new representation of ERPs and promises to cast light on the mechanisms of its generation and inter-individual variability.
Abstract:
BACKGROUND: Zebrafish is a clinically-relevant model of heart regeneration. Unlike mammals, it has a remarkable heart repair capacity after injury, and promises novel translational applications. Amputation and cryoinjury models are key research tools for understanding injury response and regeneration in vivo. An understanding of the transcriptional responses following injury is needed to identify key players of heart tissue repair, as well as potential targets for boosting this property in humans. RESULTS: We investigated amputation and cryoinjury in vivo models of heart damage in the zebrafish through unbiased, integrative analyses of independent molecular datasets. To detect genes with potential biological roles, we derived computational prediction models with microarray data from heart amputation experiments. We focused on a top-ranked set of genes highly activated in the early post-injury stage, whose activity was further verified in independent microarray datasets. Next, we performed independent validations of expression responses with qPCR in a cryoinjury model. Across in vivo models, the top candidates showed highly concordant responses at 1 and 3 days post-injury, which highlights the predictive power of our analysis strategies and the possible biological relevance of these genes. Top candidates are significantly involved in cell fate specification and differentiation, and include heart failure markers such as periostin, as well as potential new targets for heart regeneration. For example, ptgis and ca2 were overexpressed, while usp2a, a regulator of the p53 pathway, was down-regulated in our in vivo models. Interestingly, a high activity of ptgis and ca2 has been previously observed in failing hearts from rats and humans. CONCLUSIONS: We identified genes with potential critical roles in the response to cardiac damage in the zebrafish. Their transcriptional activities are reproducible in different in vivo models of cardiac injury.
Abstract:
The project aims to achieve two objectives. First, we are analysing the labour market implications of the assumption that firms cannot pay similarly qualified employees differently according to when they joined the firm. For example, if the general situation for workers improves, a firm that seeks to hire new workers may feel it has to pay more to new hires. However, if the firm must pay the same wage to new hires and incumbents due to equal treatment, it would either have to raise the wage of the incumbents, or offer new workers a lower wage than the firm would do otherwise. This is very different from the standard assumption in economic analysis that firms are free to treat newly hired workers independently of existing hires. Second, we will use detailed data on individual wages to try to gauge whether (and to what extent) equity is a feature of actual labour markets. To investigate this, we are using two matched employer-employee panel datasets, one from Portugal and the other from Brazil. These unique datasets provide objective records on millions of workers and their firms over a long period of time, so that we can identify which firms employ which workers at each time. The datasets also include a large number of firm and worker variables.
Abstract:
MOTIVATION: Microarray results accumulated in public repositories are widely reused in meta-analytical studies and secondary databases. The quality of the data obtained with this technology varies from experiment to experiment, and an efficient method for quality assessment is necessary to ensure their reliability. RESULTS: The lack of a good benchmark has hampered evaluation of existing methods for quality control. In this study, we propose a new independent quality metric that is based on evolutionary conservation of expression profiles. We show, using 11 large organ-specific datasets, that IQRray, a new quality metric developed by us, exhibits the highest correlation with this reference metric among 14 metrics tested. IQRray outperforms other methods in identification of poor quality arrays in datasets composed of arrays from many independent experiments. In contrast, the performance of methods designed for detecting outliers in a single experiment, such as Normalized Unscaled Standard Error and Relative Log Expression, was low because of the inability of these methods to detect datasets containing only low-quality arrays and because the scores cannot be directly compared between experiments. AVAILABILITY AND IMPLEMENTATION: The R implementation of IQRray is available at: ftp://lausanne.isb-sib.ch/pub/databases/Bgee/general/IQRray.R. CONTACT: Marta.Rosikiewicz@unil.ch SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Abstract:
PECUBE is a three-dimensional thermal-kinematic code capable of solving the heat production-diffusion-advection equation under a temporally varying surface boundary condition. It was initially developed to assess the effects of time-varying surface topography (relief) on low-temperature thermochronological datasets. Thermochronometric ages are predicted by tracking the time-temperature histories of rock-particles ending up at the surface and by combining these with various age-prediction models. In the decade since its inception, the PECUBE code has been under continuous development as its use became wider and addressed different tectonic-geomorphic problems. This paper describes several major recent improvements in the code, including its integration with an inverse-modeling package based on the Neighborhood Algorithm, the incorporation of fault-controlled kinematics, several different ways to address topographic and drainage change through time, the ability to predict subsurface (tunnel or borehole) data, prediction of detrital thermochronology data and a method to compare these with observations, and the coupling with landscape-evolution (or surface-process) models. Each new development is described together with one or several applications, so that the reader and potential user can clearly assess and make use of the capabilities of PECUBE. We end with describing some developments that are currently underway or should take place in the foreseeable future. (C) 2012 Elsevier B.V. All rights reserved.
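For orientation, the heat production-diffusion-advection equation named in this abstract can be written in a general textbook form. This is a sketch, not the exact formulation used in PECUBE, which may differ in coordinates and simplifications. All symbols are assumptions introduced here: T is temperature, t time, v the rock velocity field, ρ density, c heat capacity, k thermal conductivity and H radiogenic heat production per unit mass.

```latex
\rho c \left( \frac{\partial T}{\partial t} + \mathbf{v} \cdot \nabla T \right)
  = \nabla \cdot \left( k \, \nabla T \right) + \rho H
```

The temporally varying surface boundary condition mentioned in the abstract would then fix T on a surface whose elevation changes with time.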
Abstract:
The aim of this study is to perform a thorough comparison of quantitative susceptibility mapping (QSM) techniques and their dependence on the assumptions made. The compared methodologies were: two iterative single orientation methodologies minimizing the l2, l1TV norm of the prior knowledge of the edges of the object, one over-determined multiple orientation method (COSMOS) and a newly proposed modulated closed-form solution (MCF). The performance of these methods was compared using a numerical phantom and in-vivo high resolution (0.65 mm isotropic) brain data acquired at 7T using a new coil combination method. For all QSM methods, the relevant regularization and prior-knowledge parameters were systematically changed in order to evaluate the optimal reconstruction in the presence and absence of a ground truth. Additionally, the QSM contrast was compared to conventional gradient recalled echo (GRE) magnitude and R2* maps obtained from the same dataset. The QSM reconstruction results of the single orientation methods show comparable performance. The MCF method has the highest correlation (corrMCF = 0.95, r²MCF = 0.97) with the state-of-the-art method (COSMOS), with the additional advantage of extremely fast computation time. The L-curve method gave the visually most satisfactory balance between reduction of streaking artifacts and over-regularization, with the latter being overemphasized when using the COSMOS susceptibility maps as ground truth. R2* and susceptibility maps, when calculated from the same datasets, although based on distinct features of the data, have a comparable ability to distinguish deep gray matter structures.
Quantitative comparison of reconstruction methods for intra-voxel fiber recovery from diffusion MRI.
Abstract:
Validation is arguably the bottleneck in the diffusion magnetic resonance imaging (MRI) community. This paper evaluates and compares 20 algorithms for recovering the local intra-voxel fiber structure from diffusion MRI data and is based on the results of the "HARDI reconstruction challenge" organized in the context of the "ISBI 2012" conference. Evaluated methods encompass a mixture of classical techniques well known in the literature such as diffusion tensor, Q-Ball and diffusion spectrum imaging, algorithms inspired by the recent theory of compressed sensing and also brand new approaches proposed for the first time at this contest. To quantitatively compare the methods under controlled conditions, two datasets with known ground-truth were synthetically generated and two main criteria were used to evaluate the quality of the reconstructions in every voxel: correct assessment of the number of fiber populations and angular accuracy in their orientation. This comparative study investigates the behavior of every algorithm with varying experimental conditions and highlights strengths and weaknesses of each approach. This information can be useful not only for enhancing current algorithms and developing the next generation of reconstruction methods, but also for assisting physicians in the choice of the most adequate technique for their studies.
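The second criterion, angular accuracy, can be illustrated with a minimal sketch. It assumes that each fiber population is represented by a 3-D direction vector and that a direction and its negation describe the same fiber, hence the absolute value of the dot product. The function name `angular_error_deg` is hypothetical, and the challenge's exact scoring rule is not given in the abstract.

```python
import math


def angular_error_deg(estimated, reference):
    """Angle in degrees between two fiber directions, ignoring sign.

    Assumption: fiber orientations are axes, so v and -v are equivalent.
    """
    dot = sum(a * b for a, b in zip(estimated, reference))
    norm_est = math.sqrt(sum(a * a for a in estimated))
    norm_ref = math.sqrt(sum(b * b for b in reference))
    if norm_est == 0.0 or norm_ref == 0.0:
        raise ValueError("direction vectors must be non-zero")
    # Clamp to 1.0 to guard against rounding slightly above the valid range.
    cosine = min(1.0, abs(dot) / (norm_est * norm_ref))
    return math.degrees(math.acos(cosine))
```

With this convention the error always lies between 0 and 90 degrees, so an estimate pointing exactly opposite to the reference direction counts as a perfect match.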
Abstract:
SUMMARY : Eukaryotic DNA interacts with the nuclear proteins using non-covalent ionic interactions. Proteins can recognize specific nucleotide sequences based on the steric interactions with the DNA and these specific protein-DNA interactions are the basis for many nuclear processes, e.g. gene transcription, chromosomal replication, and recombination. A new technology termed ChIP-Seq has recently been developed for the analysis of protein-DNA interactions on a whole genome scale; it is based on immunoprecipitation of chromatin and a high-throughput DNA sequencing procedure. ChIP-Seq is a novel technique with great potential to replace older techniques for mapping of protein-DNA interactions. In this thesis, we bring some new insights into ChIP-Seq data analysis. First, we point out some common and so far unknown artifacts of the method. Sequence tag distribution in the genome does not follow a uniform distribution and we have found extreme hot-spots of tag accumulation over specific loci in the human and mouse genomes. These artifactual sequence tag accumulations will create false peaks in every ChIP-Seq dataset and we propose different filtering methods to reduce the number of false positives. Next, we propose random sampling as a powerful analytical tool in ChIP-Seq data analysis that could be used to infer biological knowledge from the massive ChIP-Seq datasets. We created an unbiased random sampling algorithm and we used this methodology to reveal some of the important biological properties of Nuclear Factor I DNA binding proteins. Finally, by analyzing the ChIP-Seq data in detail, we revealed that Nuclear Factor I transcription factors mainly act as activators of transcription, and that they are associated with specific chromatin modifications that are markers of open chromatin. We speculate that NFI factors only interact with the DNA wrapped around the nucleosome.
We also found multiple loci that indicate possible chromatin barrier activity of NFI proteins, which could suggest the use of NFI binding sequences as chromatin insulators in biotechnology applications. RESUME : Eukaryotic DNA interacts with nuclear proteins through non-covalent ionic interactions. Proteins can recognize specific nucleotide sequences based on steric interaction with the DNA, and these specific interactions control many nuclear processes, e.g. gene transcription, chromosomal replication, and recombination. A new technology called ChIP-Seq has recently been developed for the analysis of protein-DNA interactions at the whole-genome scale; this approach is based on chromatin immunoprecipitation and on high-throughput DNA sequencing. The new ChIP-Seq approach therefore has strong potential to replace older techniques for mapping protein-DNA interactions. In this thesis, we bring new perspectives to the analysis of ChIP-Seq data. First, we identified very common artifacts associated with this method that were previously unsuspected. The distribution of sequences in the genome is not uniform, and we observed extreme positions of sequence accumulation at specific regions of the human and mouse genomes. These artifactual sequence accumulations will create false peaks in all ChIP-Seq data, and we propose different filtering methods to reduce the number of false positives. Next, we propose random sampling as a powerful tool for ChIP-Seq data analysis, which could increase the biological knowledge gained from ChIP-Seq data.
We created a random sampling algorithm and used this method to reveal some of the important biological properties of the DNA-binding proteins called Nuclear Factor I (NFI). Finally, by analyzing in detail the ChIP-Seq data for the Nuclear Factor I family of transcription factors, we revealed that these proteins act mainly as activators of transcription, and that they are associated with specific chromatin modifications that are markers of open chromatin. We think that NFI factors interact only with DNA wrapped around the nucleosome. We also observed several genomic regions that indicate a possible chromatin barrier activity of NFI proteins, which could suggest the use of NFI binding sequences as insulator sequences in biotechnology applications.
Abstract:
Images obtained from high-throughput mass spectrometry (MS) contain information that remains hidden when looking at a single spectrum at a time. Image processing of liquid chromatography-MS datasets can be extremely useful for quality control, experimental monitoring and knowledge extraction. The importance of imaging in differential analysis of proteomic experiments has already been established through two-dimensional gels and can now be foreseen with MS images. We present MSight, new software designed to construct and manipulate MS images, as well as to facilitate their analysis and comparison.
Abstract:
In many fields, the spatial clustering of sampled data points has many consequences. Therefore, several indices have been proposed to assess the level of clustering affecting datasets (e.g. the Morisita index, Ripley's K-function and Rényi's generalized entropy). The classical Morisita index measures how many times more likely it is to select two measurement points from the same quadrat (the data set is covered by a regular grid of changing size) than it would be in the case of a random distribution generated from a Poisson process. The multipoint version (k-Morisita) takes into account k points with k >= 2. The present research deals with a new development of the k-Morisita index for (1) monitoring network characterization and for (2) detection of patterns in monitored phenomena. From a theoretical perspective, a connection between the k-Morisita index and multifractality has also been found and highlighted on a mathematical multifractal set.
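The quadrat-counting idea behind the index can be sketched in a few lines. The code uses the standard form of the k-Morisita index, Q^(k-1) times the sum over quadrats of n(n-1)...(n-k+1), divided by N(N-1)...(N-k+1), where Q is the number of quadrats, n the count in a quadrat and N the total number of points. The function name, the restriction to 2-D points in the unit square and the `cells_per_side` parameter are assumptions made for illustration and are not taken from the abstract.

```python
from collections import Counter


def falling_factorial(n, k):
    """Return n * (n - 1) * ... * (n - k + 1)."""
    result = 1
    for j in range(k):
        result *= n - j
    return result


def k_morisita_index(points, cells_per_side, k=2):
    """k-Morisita index for 2-D points assumed to lie in [0, 1] x [0, 1].

    The unit square is covered by a regular grid of
    cells_per_side x cells_per_side quadrats. k=2 gives the classical index.
    """
    total = len(points)
    if k < 2 or total < k:
        raise ValueError("need k >= 2 and at least k points")
    n_quadrats = cells_per_side ** 2
    last = cells_per_side - 1
    # Count points per quadrat; points on the upper border go to the last cell.
    counts = Counter(
        (min(int(x * cells_per_side), last), min(int(y * cells_per_side), last))
        for x, y in points
    )
    numerator = sum(falling_factorial(n, k) for n in counts.values())
    return n_quadrats ** (k - 1) * numerator / falling_factorial(total, k)
```

Values near 1 are expected for a random (Poisson) pattern, larger values for clustered patterns and smaller values for regularly dispersed ones. Evaluating the index over a range of `cells_per_side` values corresponds to the "grid of changing size" mentioned in the abstract.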
Abstract:
On December 4th 2007, a 3-Mm³ landslide occurred along the northwestern shore of Chehalis Lake. The initiation zone is located at the intersection of the main valley slope and the northern sidewall of a prominent gully. The slope failure caused a displacement wave that ran up to 38 m on the opposite shore of the lake. The landslide is temporally associated with a rain-on-snow meteorological event which is thought to have triggered it. This paper describes the Chehalis Lake landslide and presents a comparison of discontinuity orientation datasets obtained using three techniques: field measurements, terrestrial photogrammetric 3D models and an airborne LiDAR digital elevation model to describe the orientation and characteristics of the five discontinuity sets present. The discontinuity orientation data are used to perform kinematic, surface wedge limit equilibrium and three-dimensional distinct element analyses. The kinematic and surface wedge analyses suggest that the location of the slope failure (intersection of the valley slope and a gully wall) has facilitated the development of the unstable rock mass which initiated as a planar sliding failure. Results from the three-dimensional distinct element analyses suggest that the presence, orientation and high persistence of a discontinuity set dipping obliquely to the slope were critical to the development of the landslide and led to a failure mechanism dominated by planar sliding. The three-dimensional distinct element modelling also suggests that the presence of a steeply dipping discontinuity set striking perpendicular to the slope and associated with a fault exerted a significant control on the volume and extent of the failed rock mass but not on the overall stability of the slope.
Abstract:
Given the rapid increase of species with a sequenced genome, the need to identify orthologous genes between them has emerged as a central bioinformatics task. Many different methods exist for orthology detection, which makes it difficult to decide which one to choose for a particular application. Here, we review the latest developments and issues in the orthology field, and summarize the most recent results reported at the third 'Quest for Orthologs' meeting. We focus on community efforts such as the adoption of reference proteomes, standard file formats and benchmarking. Progress in these areas is good, and they are already beneficial to both orthology consumers and providers. However, a major current issue is that the massive increase in complete proteomes poses computational challenges to many of the ortholog database providers, as most orthology inference algorithms scale at least quadratically with the number of proteomes. The Quest for Orthologs consortium is an open community with a number of working groups that join efforts to enhance various aspects of orthology analysis, such as defining standard formats and datasets, documenting community resources and benchmarking. AVAILABILITY AND IMPLEMENTATION: All such materials are available at http://questfororthologs.org. CONTACT: erik.sonnhammer@scilifelab.se or c.dessimoz@ucl.ac.uk.
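The quadratic scaling mentioned above follows from simple counting: if an algorithm compares every pair of proteomes, n proteomes require n(n-1)/2 comparisons. The sketch below only counts pairs and is not tied to any particular orthology method. The function name is hypothetical.

```python
def pairwise_comparisons(n_proteomes):
    """Number of unordered proteome pairs in an all-against-all comparison."""
    if n_proteomes < 0:
        raise ValueError("number of proteomes must be non-negative")
    return n_proteomes * (n_proteomes - 1) // 2
```

Going from 100 to 200 proteomes raises the count from 4950 to 19900 pairs, roughly a fourfold increase for a doubling of the input.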
Abstract:
To add value over bone mineral density (BMD), a clinical risk factor (CRF) for osteoporotic fracture must be predictive of fracture, independent of BMD, reversible and quantifiable. Many major recognized CRFs exist, and many of them are indirect factors of bone quality. The trabecular bone score (TBS) predicts fracture independently of BMD, as demonstrated in previous studies. The aim of this study is to verify whether TBS can be considered a major CRF for osteoporotic fracture. Existing validated datasets of Caucasian women were analyzed. These datasets stem from different studies performed by the authors of this report or provided to our group. However, the level of evidence of these studies varies. Thus, the different datasets were weighted differently according to their design. This meta-like analysis involves more than 32000 women (≥50 years) with 2000 osteoporotic fractures from two prospective studies (OFELY and MANITOBA) and 7 cross-sectional studies. Weighted relative risk (RR) for TBS was expressed for each decrease of one standard deviation as well as per tertile difference (TBS=1.300 and 1.200) and compared with those obtained for the major CRFs included in FRAX®. Overall TBS RR obtained (adjusted for age) was 1.79 [95% CI 1.37-2.37]. For all women combined, RR for fracture for the lowest compared with the middle TBS tertile was 1.55 [1.46-1.68] and for the lowest compared with the highest TBS tertile was 2.8 [2.70-3.00]. TBS is comparable to most of the major CRFs and thus could be used as one of them. Further studies have to be conducted to confirm these first findings.