926 results for Genomic data integration


Relevance:

30.00%

Publisher:

Abstract:

Integrating information in the molecular biosciences involves more than the cross-referencing of sequences or structures. Experimental protocols, results of computational analyses, annotations and links to relevant literature form integral parts of this information, and impart meaning to sequence or structure. In this review, we examine some existing approaches to integrating information in the molecular biosciences. We consider not only technical issues concerning the integration of heterogeneous data sources and the corresponding semantic implications, but also the integration of analytical results. Within the broad range of strategies for integration of data and information, we distinguish between platforms and developments. We discuss two current platforms and six current developments, and identify what we believe to be their strengths and limitations. We identify key unsolved problems in integrating information in the molecular biosciences, and discuss possible strategies for addressing them including semantic integration using ontologies, XML as a data model, and graphical user interfaces as integrative environments.
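
As a concrete illustration of one strategy the review discusses - XML as a data model - the following minimal Python sketch bundles a sequence, an annotation, and experimental provenance into a single self-describing record; all element and attribute names here are hypothetical, not taken from any of the reviewed platforms.

```python
# Minimal sketch of XML as an integrative data model: a sequence, its
# annotations, and provenance carried in one self-describing record.
# All element and attribute names are hypothetical illustrations.
import xml.etree.ElementTree as ET

record = ET.Element("entry", id="P12345")
ET.SubElement(record, "sequence", type="protein").text = "MKTAYIAKQR"

annotations = ET.SubElement(record, "annotations")
ET.SubElement(annotations, "feature", type="domain",
              start="1", end="10").text = "kinase-like"

provenance = ET.SubElement(record, "provenance")
ET.SubElement(provenance, "protocol").text = "BLASTP vs. nr, e-value < 1e-5"
ET.SubElement(provenance, "literature", pmid="0000000")

print(ET.tostring(record, encoding="unicode"))
```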

Relevance:

30.00%

Publisher:

Abstract:

In the last decade, with the expansion of organizational scope and the tendency towards outsourcing, there has been an increasing need for Business Process Integration (BPI), understood as the sharing of data and applications among business processes. The research efforts and development paths in BPI pursued by many academic groups and system vendors, targeting heterogeneous system integration, continue to face several conceptual and technological challenges. This article begins with a brief review of major approaches and emerging standards addressing BPI. We then introduce a rule-driven messaging approach to BPI, based on the harmonization of messages in order to compose a new, often cross-organizational, process. We go on to introduce the design of a temporal first-order language (Harmonized Messaging Calculus) that provides the formal foundation for general rules governing business process execution. Definitions of the language's terms, formulae, safety, and expressiveness are introduced and considered in detail.
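
To make the rule-driven messaging idea concrete, here is a minimal Python sketch (an illustration of the general approach, not the paper's Harmonized Messaging Calculus): a rule matches incoming messages and rewrites them into the schema a partner process expects. The schemas and field names are invented for the example.

```python
# Illustrative sketch of rule-driven message harmonization: a rule fires on
# an incoming message, renames fields to the target schema, and emits a
# harmonized message for the cross-organizational process.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    applies: Callable[[dict], bool]    # guard: which messages the rule matches
    transform: Callable[[dict], dict]  # action: harmonized output message

# Hypothetical schemas: supplier says 'qty'/'sku', buyer expects 'quantity'/'item_id'.
supplier_to_buyer = Rule(
    applies=lambda m: m.get("type") == "order" and "sku" in m,
    transform=lambda m: {"type": "order", "item_id": m["sku"], "quantity": m["qty"]},
)

def harmonize(message: dict, rules: list[Rule]) -> dict:
    for rule in rules:
        if rule.applies(message):
            return rule.transform(message)
    return message  # no rule matched: pass through unchanged

print(harmonize({"type": "order", "sku": "A-17", "qty": 3}, [supplier_to_buyer]))
```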

Relevance:

30.00%

Publisher:

Abstract:

Background: Current methods for finding significantly under- and over-represented gene ontology (GO) terms in a set of genes treat the genes as equally probable balls in a bag, as may be appropriate for transcripts in micro-array data. However, due to the varying lengths of genes and intergenic regions, that approach is inappropriate for deciding whether any GO terms are correlated with a set of genomic positions. Results: We present an algorithm - GONOME - that can determine which GO terms are significantly associated with a set of genomic positions, given a genome annotated with (at least) the starts and ends of genes. We show that certain GO terms may appear to be significantly associated with a set of randomly chosen positions in the human genome if gene lengths are not considered, and that these same terms have been reported as significantly over-represented in a number of recent papers. This apparent over-representation disappears when gene lengths are considered, as GONOME does. For example, we show that, when gene length is taken into account, the term 'development' is not significantly enriched in genes associated with human CpG islands, in contradiction to a previous report. We further demonstrate the efficacy of GONOME by showing that occurrences of the proteasome-associated control element (PACE) upstream activating sequence in the S. cerevisiae genome associate significantly with appropriate GO terms. An extension of this approach yields a whole-genome motif discovery algorithm that allows identification of many other promoter sequences linked to different types of genes, including a large group of previously unknown motifs significantly associated with the terms 'translation' and 'translational elongation'. Conclusion: GONOME is an algorithm that correctly extracts over-represented GO terms from a set of genomic positions. By explicitly considering gene size, GONOME avoids a systematic bias toward GO terms linked to large genes. Inappropriate use of existing algorithms that do not take gene size into account has led to erroneous or suspect conclusions. Reciprocally, GONOME may be used to identify new features in genomes that are significantly associated with particular categories of genes.
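
The core statistical point - that a length-aware null model must sample genomic positions rather than genes - can be sketched in a few lines of Python. This is an illustration of the principle, not the GONOME implementation, and the toy annotation below is invented.

```python
# Length-aware null model: random genomic positions hit long genes more often,
# so the null distribution for a GO term must come from sampling positions,
# not from sampling genes uniformly.
import random

# Hypothetical annotation: (gene, start, end, set of GO terms).
genes = [("g1", 0, 1_000_000, {"development"}),        # one very long gene
         ("g2", 1_000_000, 1_010_000, {"translation"}),
         ("g3", 1_010_000, 1_020_000, {"translation"})]
GENOME_LEN = 1_020_000

def term_hits(positions, term):
    return sum(any(s <= p < e and term in gos for _, s, e, gos in genes)
               for p in positions)

observed = [50_000, 500_000, 900_000]          # positions of interest
obs = term_hits(observed, "development")

null = [term_hits(random.sample(range(GENOME_LEN), len(observed)), "development")
        for _ in range(1000)]
p_value = (1 + sum(n >= obs for n in null)) / 1001
# Random positions land in the long 'development' gene almost every time,
# so the apparent enrichment is not significant under this null.
print(f"development: observed={obs}, permutation p={p_value:.3f}")
```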

Relevance:

30.00%

Publisher:

Abstract:

Traditional vegetation mapping methods rely on high-cost, labour-intensive aerial photography interpretation. This approach can be subjective and is limited by factors such as the extent of remnant vegetation and the differing scale and quality of aerial photography over time. An alternative approach is proposed which integrates a data model, a statistical model and an ecological model using sophisticated Geographic Information Systems (GIS) techniques and rule-based systems to support fine-scale vegetation community modelling. This approach is based on a more realistic representation of vegetation patterns, with transitional gradients from one vegetation community to another; arbitrary, and often unrealistic, sharp boundaries can otherwise be imposed on the model by the application of statistical methods. This GIS-integrated multivariate approach is applied to the problem of vegetation mapping in the complex vegetation communities of the Innisfail Lowlands in the Wet Tropics bioregion of northeastern Australia. The paper presents the full cycle of this vegetation modelling approach, including site sampling, variable selection, model selection, model implementation, internal model assessment, assessment of model predictions, integration of the discrete vegetation community models to generate a composite pre-clearing vegetation map, model validation against an independent data set, and scale assessment of the model predictions. An accurate pre-clearing vegetation map of the Innisfail Lowlands was generated (r² = 0.83) through GIS integration of 28 separate statistical models. This modelling approach has good potential for wider application, including: provision of vital information for conservation planning and management; a scientific basis for rehabilitation of disturbed and cleared areas; and a viable method for producing adequate vegetation maps for conservation and forestry planning in poorly studied areas. (c) 2006 Elsevier B.V. All rights reserved.
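
A plausible sketch of the final compositing step, under the assumption that each of the 28 statistical models yields a probability surface for one vegetation community and each grid cell is assigned the most probable community (the paper's exact integration rules may differ):

```python
# Compositing per-community models into one map: each model predicts a
# probability surface; each cell takes the community with highest probability.
import numpy as np

rng = np.random.default_rng(0)
n_communities, height, width = 28, 100, 100

# Stand-ins for the 28 fitted models' predictions over the study grid.
prob_surfaces = rng.random((n_communities, height, width))

composite = np.argmax(prob_surfaces, axis=0)   # community index per cell
confidence = np.max(prob_surfaces, axis=0)     # winning probability per cell

print(composite.shape, composite[0, :5], confidence.mean().round(3))
```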

Relevance:

30.00%

Publisher:

Abstract:

A complete workflow specification requires careful integration of many different process characteristics. Decisions must be made as to the definitions of individual activities, their scope, the order of execution that maintains the overall business process logic, the rules governing the discipline of work-list scheduling to performers, the identification of time constraints, and more. The goal of this paper is to address an important but long-neglected dimension of workflow modelling and specification: data flow, and its modelling, specification and validation. Research in process analysis has mainly focussed on structural considerations, with only limited verification checks. In this paper, we identify and justify the importance of data modelling in overall workflow specification and verification. We illustrate and define several potential data flow problems that, if not detected prior to workflow deployment, may prevent the process from executing correctly, cause it to execute on inconsistent data, or even lead to process suspension. A discussion of the essential requirements a workflow data model must meet in order to support data validation is also given.
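
One of the data flow problems the paper points to - an activity reading data that no earlier activity produces - is easy to check mechanically. The sketch below, with an invented three-activity workflow, illustrates such a validation pass; it is not the paper's verification method.

```python
# Missing-data check: flag any activity that reads a datum which no activity
# earlier in the control flow writes. Activity names and the linear control
# flow are hypothetical.
workflow = [  # activities in execution order: (name, reads, writes)
    ("receive_order", set(),            {"order"}),
    ("check_credit",  {"order"},        {"credit_ok"}),
    ("ship",          {"order", "credit_ok", "invoice"}, set()),  # 'invoice' never written
]

def missing_data(activities):
    available, problems = set(), []
    for name, reads, writes in activities:
        for datum in sorted(reads - available):
            problems.append(f"{name} reads '{datum}' before any activity writes it")
        available |= writes
    return problems

for p in missing_data(workflow):
    print(p)
```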

Relevance:

30.00%

Publisher:

Abstract:

Many variables of interest in social science research are nominal variables with two or more categories, such as employment status, occupation, political preference, or self-reported health status. With longitudinal survey data it is possible to analyse the transitions of individuals between different employment states or occupations, for example. In the statistical literature, models for analysing categorical dependent variables with repeated observations belong to the family of generalized linear mixed models (GLMMs). The specific GLMM for a dependent variable with three or more categories is the multinomial logit random effects model. For these models the marginal distribution of the response does not have a closed-form solution, so numerical integration must be used to obtain maximum likelihood estimates of the model parameters. Techniques for implementing the numerical integration are available, but they are computationally intensive, requiring processing time that increases with the number of clusters (or individuals) in the data, and they are not always readily accessible to the practitioner in standard software. For the purposes of analysing categorical response data from a longitudinal social survey, there is therefore a clear need to evaluate the existing procedures for estimating multinomial logit random effects models in terms of accuracy, efficiency and computing time; computing time in particular has significant implications for which approach researchers will prefer. In this paper we evaluate statistical software procedures that utilise adaptive Gaussian quadrature and MCMC methods, with specific application to modelling the employment status of women, using a GLMM, over three waves of the HILDA survey.
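
The numerical integration at issue can be illustrated with a stripped-down case: a binary (rather than multinomial) logit with a single random intercept, whose cluster-level marginal likelihood is approximated by ordinary (non-adaptive) Gauss-Hermite quadrature. The data and parameter values below are invented.

```python
# Marginal likelihood of one cluster: integrate the conditional likelihood
# over the random effect u ~ N(0, sigma^2) with Gauss-Hermite quadrature.
import numpy as np
from numpy.polynomial.hermite_e import hermegauss  # probabilists' Hermite

def cluster_marginal_lik(y, x, beta, sigma, n_points=15):
    nodes, weights = hermegauss(n_points)          # weight function exp(-u^2/2)
    total = 0.0
    for u, w in zip(nodes, weights):
        eta = beta[0] + beta[1] * x + sigma * u    # linear predictor with RE
        p = 1.0 / (1.0 + np.exp(-eta))
        total += w * np.prod(p**y * (1 - p)**(1 - y))
    return total / np.sqrt(2 * np.pi)              # normalize to N(0,1) density

# One hypothetical individual observed over three waves:
y = np.array([1, 0, 1]); x = np.array([0.0, 1.0, 2.0])
print(cluster_marginal_lik(y, x, beta=(0.2, -0.1), sigma=0.8))
```

Adaptive quadrature, as used by the procedures the paper evaluates, additionally recenters and rescales the nodes per cluster; the sketch keeps the ordinary version for brevity.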

Relevance:

30.00%

Publisher:

Abstract:

Multidimensional compound optimization is a new paradigm in the drug discovery process, yielding efficiencies during early stages and reducing attrition in the later stages of drug development. The success of this strategy relies heavily on understanding the resulting multidimensional data and extracting useful information from them. This paper demonstrates how principled visualization algorithms can be used to understand and explore a large data set created in the early stages of drug discovery. The experiments presented are performed on a real-world data set comprising biological activity data and some whole-molecule physicochemical properties. Data visualization is a popular way of presenting complex data in a simpler form. We have applied powerful principled visualization methods, such as generative topographic mapping (GTM) and hierarchical GTM (HGTM), to help domain experts (screening scientists, chemists, biologists, etc.) understand the data and make informed decisions. We also benchmark these principled methods against better-known visualization approaches - principal component analysis (PCA), Sammon's mapping, and self-organizing maps (SOMs) - to demonstrate their enhanced power in helping the user visualize the large multidimensional data sets encountered in the early stages of the drug discovery process. The results reported clearly show that the GTM and HGTM algorithms allow the user to cluster active compounds for different targets and understand them better than the benchmarks do. An interactive software tool supporting these visualization algorithms was provided to the domain experts. The tool facilitates their exploration of the projections obtained from the visualization algorithms, providing facilities such as parallel coordinate plots, magnification factors, directional curvatures, and integration with industry-standard software. © 2006 American Chemical Society.
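
GTM and HGTM require dedicated implementations, but the simplest of the benchmarks the paper compares against, PCA, takes only a few lines; the sketch below uses random stand-in compound descriptors rather than the paper's data set.

```python
# PCA benchmark projection: reduce compound descriptors to a 2-D map for
# visual inspection, as one of the baselines GTM/HGTM are compared against.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
descriptors = rng.normal(size=(500, 30))     # 500 compounds x 30 properties

pca = PCA(n_components=2)
projection = pca.fit_transform(descriptors)  # 2-D coordinates per compound

print(projection.shape, pca.explained_variance_ratio_.round(3))
```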

Relevance:

30.00%

Publisher:

Abstract:

Purpose: The purpose of this study is to explore the nature of human resource management in publicly listed finance sector companies in Nepal. In particular, it explores the extent to which HR practice is integrated into organisational strategy and devolved to line management. Design/methodology/approach: A structured interview was conducted with the senior executive responsible for human resource management in 26 commercial banks and insurance companies in Nepal. Findings: The degree of integration of HR practice appears to be increasing within this sector, but this is dependent on the maturity of the organisations. The devolvement of responsibility to line managers is at best partial, and in the case of the insurance companies, it is more out of necessity due to the absence of a strong central HR function. Research limitations/implications: The survey is inevitably based on a small sample; however this represents 90 per cent of the relevant population. The data suggest that Western HR is making inroads into more developed aspects of Nepalese business. Compared with Nepalese business as a whole, the financial sector appears relatively Westernised, although Nepal still lags India in its uptake of HR practices. Practical implications: It appears unlikely from a cultural perspective that the devolvement of responsibility will be achieved as a result of HR strategy. National cultural, political and social factors continue to be highly influential in shaping the Nepalese business environment. Originality/value: Few papers have explored HR practice in Nepal. This paper contributes to the overall assessment of HR uptake globally and highlights emic features impacting on that uptake. © Emerald Group Publishing Limited.

Relevance:

30.00%

Publisher:

Abstract:

Signal integration determines cell fate at the cellular level, affects cognitive processes and affective responses at the behavioural level, and is likely to be involved in the psychoneurobiological processes underlying mood disorders. Interactions between stimuli may be subject to time effects, and such time-dependencies typically lead to complex responses at both the cellular and the behavioural level. We show that both three-factor models and time series models can be used to uncover such time-dependencies. However, we argue that for short longitudinal data the three-factor modelling approach is more suitable. To illustrate both approaches, we re-analysed previously published short longitudinal data sets. We found that in human embryonic kidney (HEK) 293 cells the interaction effect in the regulation of extracellular signal-regulated kinase (ERK) 1 signalling activation by insulin and epidermal growth factor is subject to a time effect and decays dramatically at peak values of ERK activation. In contrast, we found that the interaction effect induced by hypoxia and tumour necrosis factor-alpha on the transcriptional activity of the human cyclo-oxygenase-2 promoter in HEK 293 cells is time-invariant, at least within the first 12-hour window after stimulation. Furthermore, we applied the three-factor model to previously reported animal studies in which memory storage was found to be subject to an interaction effect of the beta-adrenoceptor agonist clenbuterol and certain antagonists acting on the alpha-1-adrenoceptor/glucocorticoid-receptor system. Our model-based analysis suggests that the interaction effect is relevant only if the antagonist drug is administered within a critical time window.
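
The three-factor idea - two stimuli plus time, with all interaction terms - can be sketched as an ordinary regression on synthetic data; the variable names are illustrative, and the decaying interaction is built into the simulated response so the three-way term has something to detect.

```python
# Three-factor model sketch: response ~ stimulus A x stimulus B x time.
# A negative three-way coefficient indicates an interaction decaying in time.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 200
df = pd.DataFrame({
    "stim_a": rng.integers(0, 2, n),   # e.g. insulin present/absent
    "stim_b": rng.integers(0, 2, n),   # e.g. EGF present/absent
    "time":   rng.uniform(0, 12, n),   # hours after stimulation
})
# Synthetic response whose A x B interaction decays over time.
df["response"] = (df.stim_a + df.stim_b
                  + 2 * df.stim_a * df.stim_b * np.exp(-df.time / 4)
                  + rng.normal(scale=0.5, size=n))

model = smf.ols("response ~ stim_a * stim_b * time", data=df).fit()
print(model.params.round(2))  # stim_a:stim_b:time approximates the decay
```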

Relevance:

30.00%

Publisher:

Abstract:

The inclusion of high-level scripting functionality in state-of-the-art rendering APIs indicates a movement toward data-driven methodologies for structuring next-generation rendering pipelines. A similar theme can be seen in the use of composition languages to deploy component software through the selection and configuration of collaborating component implementations. In this paper we introduce the Fluid framework, which places particular emphasis on the use of high-level data manipulations to develop component-based software that is flexible, extensible, and expressive. We introduce a data-driven, object-oriented programming methodology for component-based software development, and demonstrate how a rendering system with a similar focus on abstract manipulations can be incorporated to build a visualization application for geospatial data. In particular, we describe a novel SAS script integration layer that provides access to vertex and fragment programs, producing a very controllable, responsive rendering system. The proposed system is very similar to developments speculatively planned for DirectX 10, but uses open standards and has cross-platform applicability. © The Eurographics Association 2007.
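
A minimal sketch of composition-by-configuration in the spirit described, where a declarative data structure selects and wires registered component implementations at run time (the names and registry mechanism are invented, not the Fluid framework's API):

```python
# Data-driven component composition: implementations register under names,
# and a declarative description assembles the pipeline at run time.
from typing import Callable

REGISTRY: dict[str, Callable] = {}

def component(name):
    def register(cls):
        REGISTRY[name] = cls
        return cls
    return register

@component("renderer.wireframe")
class Wireframe:
    def draw(self, scene): print(f"wireframe pass over {scene}")

@component("renderer.shaded")
class Shaded:
    def draw(self, scene): print(f"shaded pass over {scene}")

config = {"pipeline": ["renderer.shaded", "renderer.wireframe"]}  # data, not code

for name in config["pipeline"]:          # assemble the pipeline from data
    REGISTRY[name]().draw("terrain_tile_42")
```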

Relevance:

30.00%

Publisher:

Abstract:

Two key issues defined the focus of this research into manufacturing plasmid DNA for use in human gene therapy: first, the processing of E.coli bacterial cells to effect the separation of therapeutic plasmid DNA from cellular debris and adventitious material; second, the affinity purification of the plasmid DNA in a simple one-stage process. The need arises from concerns recently voiced by the FDA about the scalability and reproducibility of current manufacturing processes in meeting the quality criteria of purity, potency, efficacy, and safety for a recombinant drug substance for use in humans. To develop a preliminary purification procedure, an EFD cross-flow micro-filtration module was assessed for its ability to effect the 20-fold concentration, 6-fold diafiltration, and final clarification of plasmid DNA from the cell lysate derived from a 1 liter E.coli bacterial cell culture. Historically, the employment of cross-flow filtration modules within procedures for harvesting cells from bacterial cultures has failed to reach the standards set by existing continuous centrifuge technologies, frequently resulting in rapid blinding of the membrane with bacterial cells that substantially reduces the permeate flux. The EFD module, containing six helically wound tubular membranes that promote the centrifugal instabilities known as Dean vortices, was challenged with distilled water at Dean numbers between 187Dn and 818Dn and transmembrane pressures (TMP) of 0 to 5 psi. The data demonstrated that the fluid dynamics significantly influenced the permeation rate, displaying a maximum at 227Dn (312 LMH; L m⁻² h⁻¹) and a minimum at 818Dn (130 LMH) for a transmembrane pressure of 1 psi. Numerical studies indicated that the initial increase and subsequent decrease resulted from a competition between the centrifugal and viscous forces that create the Dean vortices. At Dean numbers between 187Dn and 227Dn, the forces combine constructively to increase the apparent strength and influence of the Dean vortices. However, as the Dean number increases above 227Dn, the centrifugal force dominates the viscous forces, compressing the Dean vortices into the membrane walls and reducing their influence on the radial transmembrane pressure, i.e. the permeate flux is reduced. When investigating the action of the Dean vortices in controlling the fouling rate by E.coli bacterial cells, it was demonstrated that the optimum cross-flow rate at which to effect the concentration of a bacterial cell culture was 579Dn at 3 psi TMP, processing in excess of 400 LMH for 20 minutes (i.e., concentrating a 1 L culture to 50 ml in 10 minutes at an average of 450 LMH). The data revealed a conflict between the Dean number at which the shear rate could control cell fouling and the Dean number at which the optimum flux enhancement was found; hence the internal geometry of the EFD module was shown to be sub-optimal for this application. At 579Dn and 3 psi TMP, the 6-fold diafiltration occupied 3.6 minutes of process time, processing at an average flux of 400 LMH. Again at 579Dn and 3 psi TMP, clarification of the plasmid from the resulting freeze-thaw cell lysate was achieved at 120 LMH, passing 83% (2.5 mg) of the plasmid DNA (6.3 ng μl⁻¹), 10.8 mg of genomic DNA (∼23,000 bp, 36 ng μl⁻¹), and 7.2 mg of cellular proteins (5-100 kDa, 21.4 ng μl⁻¹) into the post-EFD process stream. Hence the EFD module was shown to be effective, achieving the desired objectives in approximately 25 minutes.
On the basis of its ability to intercalate into the low-molecular-weight dsDNA present in dilute cell lysates, and to be electrophoresed through agarose, the fluorophore PicoGreen was selected for the development of a suitable dsDNA assay. It was assessed for its accuracy and reliability in determining the concentration and identity of DNA present in samples electrophoresed through agarose gels. The signal emitted by intercalated PicoGreen was shown to be constant and linear, and the mobility of the PicoGreen-DNA complex was not affected by the intercalation. Concerning the secondary purification procedure, various anion-exchange membranes were assessed for their ability to capture plasmid DNA from the post-EFD process stream. For a commercially available Sartorius Sartobind Q15 membrane, the reduction in the equilibrium binding capacity for ctDNA in buffers of increasing ionic strength demonstrated that DNA was being adsorbed by electrostatic interactions only. However, problems with fluid distribution across the membrane demonstrated that the membrane housing was the predominant cause of the erratic breakthrough curves; this would need to be rectified before such a membrane could be integrated into the current system, or indeed be scaled beyond laboratory scale. Moreover, when challenged with the process material, the data showed that considerable quantities of protein (1150 μg) were adsorbed preferentially to the plasmid DNA (44 μg). The same was shown for derivatised Pall Gelman UltraBind US450 membranes that had been functionalised with poly-L-lysine and polyethyleneimine ligands of varying molecular weight. Hence the anion-exchange membranes were shown to be ineffective in capturing plasmid DNA from the process stream. Finally, work was performed to integrate a sequence-specific DNA-binding protein into a single-stage DNA chromatography step, isolating plasmid DNA from E.coli cells whilst minimising contamination by genomic DNA and cellular protein. Preliminary work demonstrated that the fusion protein was capable of isolating pUC19 DNA into which the recognition sequence for the fusion protein had been inserted (pTS DNA), even in the presence of the conditioned process material. Although the pTS recognition sequence differs from native pUC19 sequences by only 2 bp, the fusion protein was shown to act as a highly selective affinity ligand for pTS DNA alone. Subsequently, the process was scaled up 25-fold and positioned directly after the EFD system. In conclusion, the integration of the EFD micro-filtration system and the zinc-finger affinity purification technique allowed approximately 1 mg of plasmid DNA to be purified from 1 L of E.coli culture in a simple two-stage process, with complete removal of genomic DNA and removal of 96.7% of cellular protein in less than 1 hour of process time.
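
For reference, the dimensionless group underlying this discussion is the Dean number, De = Re·√(d/2Rc), where Re is the Reynolds number, d the tube diameter and Rc the coil radius of curvature. The sketch below computes it for hypothetical geometry and flow values, not the EFD module's actual dimensions.

```python
# Dean number for flow in a curved tube: De = Re * sqrt(d / (2 * Rc)),
# with Re = rho * v * d / mu. Geometry and flow values are hypothetical.
import math

def dean_number(velocity, tube_diameter, coil_radius,
                density=998.0, viscosity=1.0e-3):   # water at ~20 °C
    reynolds = density * velocity * tube_diameter / viscosity
    return reynolds * math.sqrt(tube_diameter / (2.0 * coil_radius))

# Example: 1 m/s through a 5 mm tube helically wound at a 25 mm coil radius.
print(f"De = {dean_number(1.0, 0.005, 0.025):.0f}")
```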

Relevance:

30.00%

Publisher:

Abstract:

Relational demographers and dissimilarity researchers contend that group members who are dissimilar (vs. similar) to their peers in terms of a given diversity attribute (e.g. demographics, attitudes, values or traits) feel less attached to their work group, experience less satisfying and more conflicted relationships with their colleagues, and consequently are less effective. However, qualitative reviews suggest the empirical findings tend to be weak and inconsistent (Chattopadhyay, Tluchowska and George, 2004; Riordan, 2000; Tsui and Gutek, 1999), and it remains unclear when, how and to what extent such differences (i.e. relational diversity) affect group members' social integration (i.e. attachment to their work group, and satisfying versus conflicted relationships with their peers) and effectiveness (Riordan, 2000). This absence of meta-analytically derived effect size estimates, and the lack of an integrative theoretical framework, leave practitioners with inconclusive advice as to whether the effects elicited by relational diversity are practically relevant, and if so how they should be managed. The current research develops an integrative theoretical framework, which it tests using meta-analytic techniques and two further empirical studies. The first study reports a meta-analytic integration of the results of 129 tests of the relationship of relational diversity with social integration and individual effectiveness. Using meta-analytic and structural equation modelling techniques, it shows different effects of surface-level and deep-level relational diversity on social integration: low levels of interdependence accentuated the negative effects of surface-level relational diversity on social integration, while high levels of interdependence accentuated the negative effects of deep-level relational diversity on social integration. The second study builds on a social self-regulation framework (Abrams, 1994) and suggests that under high levels of interdependence relational diversity is not one thing but two: visibility and separation. Using ethnicity as a prominent example, it was proposed that separation has a negative effect on group members' effectiveness, leading to overall positive additive effects for those high in visibility and low in separation, and to overall negative additive effects for those low in visibility and high in separation. These propositions were sustained in a sample of 621 business students working in 135 ethnically diverse work groups in a business simulation course over a period of 24 weeks. The third study suggests that visibility has a positive effect on group members' self-monitoring, while separation has a negative effect, and proposes that high levels of visibility and low levels of separation lead to overall positive additive effects on self-monitoring, while low levels of visibility and high levels of separation lead to overall negative additive effects. Results from four waves of data on 261 business students working in 69 ethnically diverse work groups in a business simulation course held over a period of 24 weeks support these propositions.

Relevance:

30.00%

Publisher:

Abstract:

Objective: Recently, much research has proposed using nature-inspired algorithms to perform complex machine learning tasks. Ant colony optimization (ACO) is one such algorithm, based on swarm intelligence and derived from a model inspired by the collective foraging behavior of ants. Taking advantage of ACO traits such as self-organization and robustness, this paper investigates ant-based algorithms for gene expression data clustering and associative classification. Methods and material: An ant-based clustering algorithm (Ant-C) and an ant-based association rule mining algorithm (Ant-ARM) are proposed for gene expression data analysis. The proposed algorithms make use of natural ant behaviors such as cooperation and adaptation to allow for a flexible, robust search for good candidate solutions. Results: Ant-C has been tested on three datasets selected from the Stanford Genomic Resource Database and achieved relatively high accuracy compared to other classical clustering methods. Ant-ARM has been tested on the acute lymphoblastic leukemia (ALL)/acute myeloid leukemia (AML) dataset and generated about 30 classification rules with high accuracy. Conclusions: Ant-C can generate an optimal number of clusters without incorporating any other algorithm such as K-means or agglomerative hierarchical clustering. For associative classification, while well-known algorithms such as Apriori, FP-growth and Magnum Opus are unable to mine any association rules from the ALL/AML dataset within a reasonable period of time, Ant-ARM is able to extract associative classification rules.
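
As a generic illustration of ant-based clustering (a deliberately simplified Lumer-Faieta-style variant, not the paper's Ant-C), the sketch below lets a single "ant" repeatedly pick up items that sit among dissimilar neighbours and re-scatter them, so that similar items end up grouped without the number of clusters being fixed in advance.

```python
# Simplified ant clustering on a 1-D "grid": an item poorly matched to its
# neighbourhood is likely to be picked up and moved; well-placed items stay.
import random

random.seed(3)
items = ([random.gauss(0, 1) for _ in range(20)] +
         [random.gauss(8, 1) for _ in range(20)])      # two synthetic clusters
positions = {i: random.uniform(0, 100) for i in range(len(items))}

def local_fit(i, radius=5.0, alpha=2.0):
    """Mean similarity of item i to the items currently placed near it."""
    near = [j for j in positions
            if j != i and abs(positions[j] - positions[i]) < radius]
    if not near:
        return 0.0
    return sum(max(0.0, 1 - abs(items[i] - items[j]) / alpha)
               for j in near) / len(near)

for _ in range(5000):                      # one ant working for many steps
    i = random.randrange(len(items))
    if random.random() < (1 - local_fit(i)) ** 2:   # poorly placed -> move it
        positions[i] = random.uniform(0, 100)

# After the walk, items drawn from the same distribution sit close together.
for i in sorted(positions, key=positions.get)[::4]:
    print(f"pos={positions[i]:5.1f}  value={items[i]:6.2f}")
```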

Relevance:

30.00%

Publisher:

Abstract:

The efficacy, quality, responsiveness, and value of healthcare services are increasingly attracting the attention, and the questioning, of governments, payers, patients, and healthcare providers. Investment in integration technologies and in the integration of supply chain processes has been considered a way of removing inefficiencies in the sector. This chapter first provides an in-depth analysis of the healthcare supply chain, presenting its core entities, processes, and flows. It then explores the concept of integration in the context of the healthcare sector, and identifies the integration drivers as well as the challenges.

Relevance:

30.00%

Publisher:

Abstract:

This article provides a unique contribution to the debates about archived qualitative data by drawing on two uses of the same data - British Migrants in Spain: the Extent and Nature of Social Integration, 2003-2005 - by Jones (2009) and Oliver and O'Reilly (2010), both of which utilise Bourdieu's concepts analytically and produce broadly similar findings. We argue that whilst the insights and experiences of those researchers directly involved in data collection are important resources for developing contextual knowledge used in data analysis, other kinds of critical distance can also facilitate credible data use. We therefore challenge the assumption that the idiosyncratic relationship between context, reflexivity and interpretation limits the future use of data. Moreover, regardless of the complex genealogy of the data itself, given the number of contingencies shaping the qualitative research process and thus the potential for partial or inaccurate interpretation, contextual familiarity need not be privileged over other aspects of qualitative praxis such as sustained theoretical insight, sociological imagination and methodological rigour. © Sociological Research Online, 1996-2012.