962 results for data complexity
Abstract:
Peptide toxins synthesized by venomous animals have been extensively studied in recent decades. To be useful to the scientific community, this knowledge has been stored, annotated and made easy to retrieve by several databases. The aim of this article is to present the type of information users can access from each database. ArachnoServer and ConoServer focus on spider toxins and cone snail toxins, respectively. UniProtKB, a generalist protein knowledgebase, has an animal toxin-dedicated annotation program that covers toxins from all venomous animals. Finally, the ATDB metadatabase compiles data and annotations from other databases and provides a toxin ontology.
Abstract:
The aim of this study was to assess a population of patients with diabetes mellitus by means of the INTERMED, a classification system for case complexity integrating biological, psychosocial and health-care-related aspects of disease. The main hypothesis was that the INTERMED would identify distinct clusters of patients with different degrees of case complexity and different clinical outcomes. Patients (n=61) referred to a tertiary reference care centre were evaluated with the INTERMED and followed for 9 months for HbA1c values and for 6 months for health care utilisation. Cluster analysis revealed two clusters: cluster 1 (62%), consisting of complex patients with high INTERMED scores, and cluster 2 (38%), consisting of less complex patients with lower INTERMED scores. Cluster 1 patients showed significantly higher HbA1c values and a tendency toward increased health care utilisation. Total INTERMED scores were significantly related to HbA1c and explained 21% of its variance. In conclusion, clusters of patients with different degrees of case complexity were identified by the INTERMED, allowing the detection of highly complex patients at risk for poor diabetes control. The INTERMED therefore provides an objective basis for clinical and scientific progress in diabetes mellitus. Ongoing intervention studies will have to confirm these preliminary data and evaluate whether management strategies based on INTERMED profiles improve outcomes.
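As a rough illustration of the two-step analysis this abstract describes, the sketch below clusters synthetic INTERMED-style item scores into two groups and then regresses HbA1c on the total score to estimate explained variance. KMeans stands in for the unspecified cluster-analysis method; all data and variable names are invented for illustration.

```python
# Minimal sketch: cluster patients on INTERMED item scores, then ask how much
# HbA1c variance the total score explains. Synthetic data, not the study's.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
intermed_scores = rng.integers(0, 4, size=(61, 20))  # 61 patients, 20 items scored 0-3
hba1c = rng.normal(8.0, 1.5, size=61)                # placeholder outcome values

# Step 1: partition patients into two clusters (complex vs. less complex).
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(intermed_scores)

# Step 2: variance of HbA1c explained by the total INTERMED score (R^2).
total = intermed_scores.sum(axis=1).reshape(-1, 1)
r2 = LinearRegression().fit(total, hba1c).score(total, hba1c)
print(f"cluster sizes: {np.bincount(labels)}, R^2 = {r2:.2f}")
```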
Abstract:
The present research deals with an important public health threat: the pollution created by radon gas accumulation inside dwellings. The spatial modeling of indoor radon in Switzerland is particularly complex and challenging because of the many influencing factors that should be taken into account. Indoor radon data analysis must be addressed from both a statistical and a spatial point of view. As a multivariate process, it was important at first to define the influence of each factor. In particular, it was important to define the influence of geology, which is closely associated with indoor radon. This association was indeed observed for the Swiss data but not proved to be the sole determinant for the spatial modeling. The statistical analysis of the data, at both the univariate and multivariate levels, was followed by an exploratory spatial analysis. Many tools proposed in the literature were tested and adapted, including fractality, declustering and moving windows methods. The use of the Quantité Morisita Index (QMI) as a procedure to evaluate data clustering as a function of the radon level was proposed. The existing declustering methods were revised and applied in an attempt to approach the global histogram parameters. The exploratory phase comes along with the definition of multiple scales of interest for indoor radon mapping in Switzerland. The analysis was done with a top-down resolution approach, from regional to local levels, in order to find the appropriate scales for modeling. In this sense, the data partition was optimized in order to cope with the stationarity conditions of geostatistical models. Common methods of spatial modeling such as K Nearest Neighbors (KNN), variography and General Regression Neural Networks (GRNN) were proposed as exploratory tools. In the following section, different spatial interpolation methods were applied to a particular dataset. A bottom-up method-complexity approach was adopted and the results were analyzed together in order to find common definitions of continuity and neighborhood parameters. Additionally, a data filter based on cross-validation (the CVMF) was tested with the purpose of reducing noise at the local scale. At the end of the chapter, a series of tests for data consistency and method robustness was performed. This led to conclusions about the importance of data splitting and the limitations of generalization methods for reproducing statistical distributions. The last section was dedicated to modeling methods with probabilistic interpretations. Data transformation and simulations thus allowed the use of multigaussian models and helped take the uncertainty of the indoor radon pollution data into consideration. The categorization transform was presented as a solution for modeling extreme values through classification. Simulation scenarios were proposed, including an alternative proposal for the reproduction of the global histogram based on the sampling domain. Sequential Gaussian simulation (SGS) was presented as the method giving the most complete information, while classification performed in a more robust way. An error measure was defined in relation to the decision function for hardening the data classification. Within the classification methods, probabilistic neural networks (PNN) showed themselves to be better adapted for modeling high-threshold categorization and for automation. Support vector machines (SVM), on the contrary, performed well under balanced category conditions.
In general, it was concluded that no single prediction or estimation method is better under all conditions of scale and neighborhood definition. Simulations should be the basis, while other methods can provide complementary information to support efficient decision making on indoor radon.
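The following sketch illustrates two of the exploratory tools named above, K Nearest Neighbors (KNN) interpolation and a cross-validation-based noise filter in the spirit of the CVMF, on synthetic data. The thesis's actual implementation is not reproduced here; names, parameters and the flagging threshold are assumptions.

```python
# Hedged sketch: KNN interpolation of indoor radon over coordinates, with
# leave-one-out cross-validation residuals as the basis for a noise filter.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(1)
coords = rng.uniform(0, 100, size=(200, 2))            # measurement locations (km)
radon = rng.lognormal(mean=4.0, sigma=0.8, size=200)   # Bq/m^3, synthetic

knn = KNeighborsRegressor(n_neighbors=8, weights="distance")
pred = cross_val_predict(knn, coords, radon, cv=LeaveOneOut())

# Flag points whose cross-validation residual is extreme; dropping them is a
# crude stand-in for the cross-validation-based filter (CVMF) idea.
residual = np.abs(radon - pred)
noisy = residual > np.quantile(residual, 0.95)
print(f"{noisy.sum()} of {len(radon)} points flagged as local noise")
```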
Abstract:
Depth-averaged velocities and unit discharges within a 30 km reach of one of the world's largest rivers, the Rio Parana, Argentina, were simulated using three hydrodynamic models with different process representations: a reduced-complexity (RC) model that neglects most of the physics governing fluid flow, a two-dimensional model based on the shallow water equations, and a three-dimensional model based on the Reynolds-averaged Navier-Stokes equations. Flow characteristics simulated using all three models were compared with data obtained by acoustic Doppler current profiler surveys at four cross sections within the study reach. This analysis demonstrates that, surprisingly, the performance of the RC model is generally equal to, and in some instances better than, that of the physics-based models in terms of the statistical agreement between simulated and measured flow properties. In addition, in contrast to previous applications of RC models, the present study demonstrates that the RC model can successfully predict measured flow velocities. The strong performance of the RC model reflects, in part, the simplicity of the depth-averaged mean flow patterns within the study reach and the dominant role of channel-scale topographic features in controlling the flow dynamics. Moreover, the very low water surface slopes that typify large sand-bed rivers enable flow depths to be estimated reliably in the RC model using a simple fixed-lid planar water surface approximation. This approach overcomes a major problem encountered in the application of RC models in environments characterised by shallow flows and steep bed gradients. The RC model is four orders of magnitude faster than the physics-based models when performing steady-state hydrodynamic calculations. However, the iterative nature of the RC model calculations implies a reduction in computational efficiency relative to some other RC models. A further implication is that, if used to simulate channel morphodynamics, the present RC model may offer only a marginal advantage in computational efficiency over approaches based on the shallow water equations. These observations illustrate the trade-off between model realism and efficiency that is a key consideration in RC modelling. Moreover, this outcome highlights a need to rethink the use of RC morphodynamic models in fluvial geomorphology and to move away from existing grid-based approaches, such as the popular cellular automata (CA) models, that remain essentially reductionist in nature. In the case of the world's largest sand-bed rivers, this might be achieved by implementing the RC model outlined here as one element within a hierarchical modelling framework that would enable computationally efficient simulation of the morphodynamics of large rivers over millennial time scales.
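A minimal sketch of the fixed-lid planar water surface approximation mentioned above, assuming an invented bed-elevation grid, slope and Manning roughness: flow depth is the planar water surface minus the bed elevation, and unit discharge follows from a Manning-type relation. This illustrates the principle only and is not the authors' RC model.

```python
# Sketch: depths from a planar "fixed lid" water surface, unit discharge via
# a Manning relation q = h^(5/3) * sqrt(S) / n. All grids/values invented.
import numpy as np

nx, ny = 200, 50
rng = np.random.default_rng(2)
bed = np.cumsum(np.full((nx, ny), -1e-4), axis=0) + 0.01 * rng.standard_normal((nx, ny))

slope = 1e-5                       # very low slope, typical of large sand-bed rivers
x = np.arange(nx)[:, None]
water_surface = 5.0 - slope * x    # planar water surface, inclined along-stream

depth = np.clip(water_surface - bed, 0.0, None)   # dry cells get zero depth
n_manning = 0.025
unit_q = depth ** (5.0 / 3.0) * np.sqrt(slope) / n_manning  # m^2/s per unit width

print(f"mean depth {depth.mean():.2f} m, mean unit discharge {unit_q.mean():.2f} m^2/s")
```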
Abstract:
Understanding the extent of genomic transcription and its functional relevance is a central goal in genomics research. However, detailed genome-wide investigations of transcriptome complexity in major mammalian organs have been scarce. Here, using extensive RNA-seq data, we show that transcription of the genome is substantially more widespread in the testis than in other organs across representative mammals. Furthermore, we reveal that meiotic spermatocytes and especially postmeiotic round spermatids have remarkably diverse transcriptomes, which explains the high transcriptome complexity of the testis as a whole. The widespread transcriptional activity in spermatocytes and spermatids encompasses protein-coding and long noncoding RNA genes but also poorly conserved intergenic sequences, suggesting that it may not be of immediate functional relevance. Rather, our analyses of genome-wide epigenetic data suggest that this prevalent transcription, which most likely promoted the birth of new genes during evolution, is facilitated by an overall permissive chromatin state in these germ cells that results from extensive chromatin remodeling.
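One simple way to make "widespread transcription" concrete is transcriptome breadth: the fraction of genomic positions covered by at least a minimum number of RNA-seq reads, compared between organs. The sketch below uses synthetic per-base coverage arrays; real analyses would derive coverage from aligned reads, and this is not the study's pipeline.

```python
# Sketch: fraction of genomic positions transcribed, per organ. Synthetic
# Poisson coverage stands in for real per-base RNA-seq coverage.
import numpy as np

rng = np.random.default_rng(3)
coverage = {
    "testis": rng.poisson(0.8, size=1_000_000),  # broader, low-level transcription
    "liver":  rng.poisson(0.2, size=1_000_000),
}

MIN_READS = 1
for organ, cov in coverage.items():
    breadth = np.mean(cov >= MIN_READS)
    print(f"{organ}: {breadth:.1%} of positions covered by >= {MIN_READS} read")
```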
Abstract:
BACKGROUND: There is an ever-increasing volume of data on host genes that are modulated during HIV infection, influence disease susceptibility or carry genetic variants that impact HIV infection. We created GuavaH (Genomic Utility for Association and Viral Analyses in HIV, http://www.GuavaH.org), a public resource that supports multipurpose analysis of genome-wide genetic variation and gene expression profiles across multiple phenotypes relevant to HIV biology. FINDINGS: We included original data from 8 genome and transcriptome studies addressing viral and host responses in vivo and ex vivo. These studies cover phenotypes such as HIV acquisition, plasma viral load, disease progression, viral replication cycle, latency and viral-host genome interaction. This represents genome-wide association data from more than 4,000 individuals, exome sequencing data from 392 individuals, in vivo transcriptome microarray data from 127 patients/conditions, and 60 sets of RNA-seq data. Additionally, GuavaH allows visualization of protein variation in ~8,000 individuals from the general population. The publicly available GuavaH framework supports queries on (i) unique single nucleotide polymorphisms across different HIV-related phenotypes, (ii) gene structure and variation, (iii) in vivo gene expression in the setting of human infection (CD4+ T cells), and (iv) in vitro gene expression data in models of permissive infection, latency and reactivation. CONCLUSIONS: The complexity of the analysis of host genetic influences on HIV biology and pathogenesis calls for comprehensive research tools built on curated data. The tool developed here allows queries and supports validation of the rapidly growing body of host genomic information pertinent to HIV research.
Abstract:
New stratigraphic data along a profile from the Helvetic Gotthard massif to the remnants of the North Penninic basin in eastern Ticino and Graubünden are presented. The stratigraphic record, together with existing geochemical and structural data, motivates a new interpretation of the fossil European distal margin. We introduce a new group of Triassic facies, the North Penninic Triassic (NPT), which is characterised by the Ladinian "dolomie bicolori". The NPT was located between the Briançonnais carbonate platform and the Helvetic lands. The observed horizontal transition, coupled with the stratigraphic superposition of a Helvetic Liassic on a Briançonnais Triassic in the Luzzone-Terri nappe, links, prior to Jurassic rifting, the Briançonnais paleogeographic domain to the Helvetic margin, south of the Gotthard. Our observations suggest that the Jurassic rifting separated the Briançonnais domain from the Helvetic margin by complex and protracted extension. The syn-rift stratigraphic record in the Adula nappe and surroundings suggests the presence of a diffuse rising area with only moderately subsiding basins above a thinned continental and proto-oceanic crust. Strong subsidence occurred in a second phase following protracted extension and the resulting delamination of the rising area. The stratigraphic coherence of the Adula's Mesozoic record calls into question the idea of a lithospheric mélange in the eclogitic Adula nappe, which is more likely a coherent Alpine tectonic unit. The structural and stratigraphic observations in the Piz Terri-Lunschania zone suggest the activity of syn-rift detachments. During the Alpine collision these faults were reactivated (and inverted) and played a major role in allowing the Adula subduction and the "Penninic Thrust" above it, and in creating the structural complexity of the Central Alps.
Abstract:
The increase in publicly available sequencing data has allowed rapid progress in our understanding of genome composition. As new information becomes available, we should constantly be updating and reanalyzing existing and newly acquired data. In this report we focus on transposable elements (TEs), which make up a significant portion of nearly all sequenced genomes. Our ability to accurately identify and classify these sequences is critical to understanding their impact on host genomes. At the same time, as we demonstrate in this report, problems with existing classification schemes have led to significant misunderstandings of the evolution of both TE sequences and their host genomes. In a pioneering publication, Finnegan (1989) proposed classifying all TE sequences into two classes based on transposition mechanisms and structural features: the retrotransposons (class I) and the DNA transposons (class II). We have retraced how ideas regarding TE classification and annotation in both the prokaryotic and eukaryotic scientific communities have changed over time. This has led us to observe that: (1) a number of TEs have convergent structural features and/or transposition mechanisms that have led to misleading conclusions regarding their classification; (2) the evolution of TEs is similar to that of viruses in having several unrelated origins; and (3) there might be at least 8 classes and 12 orders of TEs, including 10 novel orders. In an effort to address these classification issues we propose: (1) the outline of a universal TE classification; (2) a set of methods and classification rules that could be used by all scientific communities involved in the study of TEs; and (3) a 5-year schedule for the establishment of an International Committee for Taxonomy of Transposable Elements (ICTTE).
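As a sketch of how such a hierarchical classification could be represented, the snippet below models classes and orders as simple data structures. Only Finnegan's two historical classes are filled in, since the abstract does not enumerate the proposed 8 classes and 12 orders; the structure, not the content, is the point.

```python
# Sketch of a hierarchical TE taxonomy as data structures. Example entries
# use well-known orders (LTR, TIR); the full proposed taxonomy is not listed
# in the abstract and is therefore not reproduced here.
from dataclasses import dataclass, field

@dataclass
class TEOrder:
    name: str
    transposition_mechanism: str

@dataclass
class TEClass:
    name: str
    orders: list[TEOrder] = field(default_factory=list)

taxonomy = [
    TEClass("Class I (retrotransposons)",
            [TEOrder("LTR retrotransposons", "copy-and-paste via RNA intermediate")]),
    TEClass("Class II (DNA transposons)",
            [TEOrder("TIR transposons", "cut-and-paste via DNA intermediate")]),
]

for te_class in taxonomy:
    for order in te_class.orders:
        print(f"{te_class.name} -> {order.name}: {order.transposition_mechanism}")
```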
Abstract:
PURPOSE: Signal detection on 3D medical images depends on many factors, such as foveal and peripheral vision, the type of signal, background complexity, and the speed at which the frames are displayed. In this paper, the authors focus on the speed with which radiologists and naïve observers search through medical images. Prior to the study, the authors asked the radiologists to estimate the speed at which they scrolled through CT sets; they gave a subjective estimate of 5 frames per second (fps). The aim of this paper is to measure and analyze the speed with which humans scroll through image stacks, presenting a method to visually display observer behavior during the search as well as measuring the accuracy of the decisions. This information will be useful in the development of model observers, mathematical algorithms that can be used to evaluate diagnostic imaging systems. METHODS: The authors performed a series of 3D 4-alternative forced-choice lung nodule detection tasks on volumetric stacks of chest CT images iteratively reconstructed with a lung algorithm. The strategy used by three radiologists and three naïve observers was assessed using an eye-tracker in order to establish where their gaze was fixed during the experiment and to verify that, when a decision was made, a correct answer was not due only to chance. In a first set of experiments, the observers were restricted to reading the images at three fixed speeds of image scrolling and were allowed to see each alternative once. In the second set of experiments, the subjects were allowed to scroll through the image stacks at will with no time or gaze limits. In both the fixed-speed and free-scrolling conditions, the four image stacks were displayed simultaneously. All trials were shown at two different image contrasts. RESULTS: The authors determined a histogram of scrolling speeds in frames per second. The scrolling speed of the naïve observers and the radiologists at the moment the signal was detected was measured at 25-30 fps. For the task chosen, the performance of the observers was not affected by the contrast or the experience of the observer. However, the naïve observers exhibited a different pattern of scrolling than the radiologists, including a tendency toward a higher number of direction changes and more slices viewed. CONCLUSIONS: The authors have determined a distribution of speeds for volumetric detection tasks. The speed at detection was higher than that subjectively estimated by the radiologists before the experiment. The measured speed information will be useful in the development of 3D model observers, especially anthropomorphic model observers that try to mimic human behavior.
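A hedged sketch of how scrolling speed in fps could be derived from timestamped slice-change events, of the kind a viewer or eye-tracker log would provide. The event list and bin edges are invented; the original apparatus and logs are not public.

```python
# Sketch: instantaneous scrolling speed from (timestamp, slice) events,
# assuming each event advances exactly one slice.
import numpy as np

# (timestamp in seconds, slice index) for one observer scrolling a CT stack
events = [(0.00, 10), (0.04, 11), (0.08, 12), (0.12, 13), (0.50, 14), (0.54, 15)]

times = np.array([t for t, _ in events])
speeds = 1.0 / np.diff(times)          # frames per second between consecutive events
print(f"median scrolling speed: {np.median(speeds):.0f} fps")

# A histogram over many observers/trials would give a speed distribution of
# the kind reported above (with detection-time speeds near 25-30 fps).
hist, edges = np.histogram(speeds, bins=[0, 5, 10, 20, 30, 50])
print(dict(zip(edges[:-1].tolist(), hist.tolist())))
```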
Abstract:
CONTEXT: Complex steroid disorders such as P450 oxidoreductase deficiency or apparent cortisone reductase deficiency may be recognized by steroid profiling using chromatographic mass spectrometric methods. These methods are highly specific and sensitive, and provide a complete spectrum of steroid metabolites in a single measurement of one sample, which makes them superior to immunoassays. The steroid metabolome during the fetal-neonatal transition is characterized by (a) the metabolites of the fetal-placental unit at birth, (b) the fetal adrenal androgens until the gland's involution 3-6 months postnatally, and (c) the steroid metabolites produced by the developing endocrine organs. All these developmental events change the steroid metabolome in an age- and sex-dependent manner during the first year of life. OBJECTIVE: The aim of this study was to provide normative values for the urinary steroid metabolome of healthy newborns at short time intervals in the first year of life. METHODS: We conducted a prospective, longitudinal study to measure 67 urinary steroid metabolites in 21 male and 22 female healthy term newborn infants at 13 time points from week 1 to week 49 of life. Urine samples were collected from newborn infants before discharge from hospital and from healthy infants at home. Steroid metabolites were measured by gas chromatography-mass spectrometry (GC-MS), and steroid concentrations corrected for urinary creatinine excretion were calculated. RESULTS: 61 steroids showed age specificity and 15 showed sex specificity. The highest urinary steroid concentrations were found in both sexes for progesterone derivatives, in particular 20α-DH-5α-DH-progesterone, and for highly polar 6α-hydroxylated glucocorticoids. These steroids peaked at week 3 and had decreased by ∼80% at week 25 in both sexes. The decline of progestins, androgens and estrogens was more pronounced than that of glucocorticoids, whereas the excretion of corticosterone and its metabolites and of mineralocorticoids remained constant during the first year of life. CONCLUSION: The urinary steroid profile changes dramatically during the first year of life and correlates with the physiologic developmental changes of the fetal-neonatal transition. Detailed normative data for this period thus permit the use of steroid profiling as a powerful diagnostic tool.
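The creatinine correction mentioned in the methods reduces, per metabolite, to dividing the measured urinary concentration by the urinary creatinine of the same sample, so that values from spot samples become comparable. A minimal sketch with invented numbers:

```python
# Sketch of creatinine correction for one urinary steroid metabolite.
# Values are illustrative only; units follow common practice (ug/mmol).
steroid_ug_per_l = 140.0      # GC-MS result for one metabolite, microgram/L
creatinine_mmol_per_l = 2.5   # urinary creatinine in the same sample

corrected = steroid_ug_per_l / creatinine_mmol_per_l
print(f"{corrected:.1f} microgram steroid per mmol creatinine")
```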
Abstract:
In recent years, new analytical tools have allowed researchers to extract historical information contained in molecular data, which has fundamentally transformed our understanding of processes ruling biological invasions. However, the use of these new analytical tools has been largely restricted to studies of terrestrial organisms despite the growing recognition that the sea contains ecosystems that are amongst the most heavily affected by biological invasions, and that marine invasion histories are often remarkably complex. Here, we studied the routes of invasion and colonisation histories of an invasive marine invertebrate Microcosmus squamiger (Ascidiacea) using microsatellite loci, mitochondrial DNA sequence data and 11 worldwide populations. Discriminant analysis of principal components, clustering methods and approximate Bayesian computation (ABC) methods showed that the most likely source of the introduced populations was a single admixture event that involved populations from two genetically differentiated ancestral regions - the western and eastern coasts of Australia. The ABC analyses revealed that colonisation of the introduced range of M. squamiger consisted of a series of non-independent introductions along the coastlines of Africa, North America and Europe. Furthermore, we inferred that the sequence of colonisation across continents was in line with historical taxonomic records - first the Mediterranean Sea and South Africa from an unsampled ancestral population, followed by sequential introductions in California and, more recently, the NE Atlantic Ocean. We revealed the most likely invasion history for world populations of M. squamiger, which is broadly characterized by the presence of multiple ancestral sources and non-independent introductions within the introduced range. The results presented here illustrate the complexity of marine invasion routes and identify a cause-effect relationship between human-mediated transport and the success of widespread marine non-indigenous species, which benefit from stepping-stone invasions and admixture processes involving different sources for the spread and expansion of their range.
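A toy sketch of approximate Bayesian computation (ABC) by rejection, the family of methods used above to compare invasion scenarios: draw parameters from a prior, simulate a summary statistic, and keep draws whose simulated summary falls close to the observed one. The simulator and all numbers are stand-ins; real analyses use coalescent simulators and many summary statistics.

```python
# Toy ABC rejection sampler: one parameter, one summary statistic.
import numpy as np

rng = np.random.default_rng(4)
observed_summary = 0.42          # e.g., an observed genetic summary, invented

def simulate_summary(theta):
    # Stand-in for a population-genetic simulator.
    return theta / (1.0 + theta) + rng.normal(0.0, 0.02)

prior_draws = rng.uniform(0.1, 5.0, size=50_000)
summaries = np.array([simulate_summary(t) for t in prior_draws])

tolerance = 0.01
accepted = prior_draws[np.abs(summaries - observed_summary) < tolerance]
print(f"accepted {accepted.size} draws; posterior mean theta = {accepted.mean():.2f}")
```

Scenario comparison, as in the study, would repeat this for each candidate invasion history and weigh scenarios by how often their simulations land within the tolerance.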
Abstract:
As a result of the growing interest in studying employee well-being as a complex process that shows high levels of within-individual variability and evolves over time, the present study considers the experience of flow in the workplace from a nonlinear dynamical systems approach. Our goal is to offer new ways to move the study of employee well-being beyond linear approaches. With nonlinear dynamical systems theory as the backdrop, we conducted a longitudinal study using the experience sampling method and qualitative semi-structured interviews for data collection; 6,981 records were collected from a sample of 60 employees. The obtained time series were analyzed using various techniques derived from nonlinear dynamical systems theory (i.e., recurrence analysis and surrogate data) together with multiple correspondence analyses. The results revealed the following: 1) flow in the workplace presents a high degree of within-individual variability, and this variability is characterized as chaotic in most cases (75%); 2) high levels of flow are associated with chaos; and 3) different dimensions of the flow experience (e.g., merging of action and awareness) as well as individual (e.g., age) and job characteristics (e.g., job tenure) are associated with the emergence of different dynamic patterns (chaotic, linear and random).
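To make two of the named techniques concrete, the sketch below computes a recurrence rate from a thresholded distance matrix and builds a phase-randomized surrogate series against which the observed structure can be compared. The time series is synthetic and no delay embedding is used, for brevity; the study's experience-sampling data are not reproduced.

```python
# Sketch: recurrence rate and a phase-randomized surrogate for a 1D series.
import numpy as np

rng = np.random.default_rng(5)
flow = np.sin(np.linspace(0, 20, 300)) + 0.3 * rng.standard_normal(300)

def recurrence_rate(x, eps):
    # Fraction of pairs of time points closer than eps (no embedding here).
    d = np.abs(x[:, None] - x[None, :])
    return np.mean(d < eps)

def phase_randomized_surrogate(x):
    # Keep the amplitude spectrum, randomize phases: destroys nonlinear
    # structure while preserving linear correlations.
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0, 2 * np.pi, size=spectrum.size)
    phases[0] = 0.0  # keep the mean untouched
    return np.fft.irfft(np.abs(spectrum) * np.exp(1j * phases), n=x.size)

eps = 0.2 * np.std(flow)
print(f"data:      RR = {recurrence_rate(flow, eps):.3f}")
print(f"surrogate: RR = {recurrence_rate(phase_randomized_surrogate(flow), eps):.3f}")
```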
Abstract:
Anders Söderbäck's presentation at Kirjastoverkkopäivät (Library Network Days), 26 October 2011, Helsinki.
Abstract:
This thesis describes an approach to overcoming the complexity of software product management (SPM) and consists of several studies that investigate the activities and roles in product management, as well as issues related to the adoption of SPM. The thesis focuses on organizations that have started adopting SPM but have faced difficulties due to its complexity and fuzziness, and it suggests frameworks for overcoming these challenges using the principles of decomposition and iterative improvement. The research process consisted of three phases, each of which provided complementary results and empirical observations on the problem of overcoming the complexity of SPM. Overall, the product management processes and practices of 13 companies were studied and analysed, and additional data was collected through a worldwide survey. The collected data were analysed using grounded theory (GT) to identify possible ways to overcome the complexity of SPM. Complementary research methods, such as elements of the Theory of Constraints, were used for deeper data analysis. The results of the thesis indicate that decomposing SPM activities according to the specific characteristics of companies and roles is a useful approach for simplifying existing SPM frameworks. Companies can benefit from the results by adopting SPM activities more efficiently and effectively, spending fewer resources on adoption by concentrating on the most important SPM activities.
Abstract:
The brain is a complex system, which produces emergent properties such as those associated with activity-dependent plasticity in processes of learning and memory. Therefore, understanding the integrated structures and functions of the brain is well beyond the scope of either superficial or extremely reductionistic approaches. Although a combination of zoom-in and zoom-out strategies is desirable when the brain is studied, constructing the appropriate interfaces to connect all levels of analysis is one of the most difficult challenges of contemporary neuroscience. Is it possible to build appropriate models of brain function and dysfunction with computational tools? Among the best-known brain dysfunctions, epilepsies are neurological syndromes that involve a variety of networks, from widespread anatomical brain circuits to local molecular environments. One logical question would be: do those complex brain networks always produce maladaptive emergent properties compatible with epileptogenic substrates? The present review addresses this question and attempts to answer it by illustrating several points from the literature and from our laboratory data, with examples at the behavioral, electrophysiological, cellular and molecular levels. We conclude that, because the brain is a complex system compatible with the production of emergent properties, including plasticity, its functions should be approached using an integrated view. Concepts such as brain networks, graph theory, neuroinformatics, and e-neuroscience are discussed as new transdisciplinary approaches for dealing with the continuous growth of information about brain physiology and its dysfunctions. The epilepsies are discussed as neurobiological models of complex systems displaying maladaptive plasticity.