894 results for DATA INTEGRATION


Relevance: 30.00%

Abstract:

Molecular markers have been demonstrated to be useful for the estimation of stock mixture proportions where the origin of individuals is determined from baseline samples. Bayesian statistical methods are widely recognized as providing a preferable strategy for such analyses. In general, Bayesian estimation is based on standard latent class models using data augmentation through Markov chain Monte Carlo techniques. In this study, we introduce a novel approach based on recent developments in the estimation of genetic population structure. Our strategy combines analytical integration with stochastic optimization to identify stock mixtures. An important enhancement over previous methods is the possibility of appropriately handling data where only partial baseline sample information is available. We address the potential use of nonmolecular, auxiliary biological information in our Bayesian model.
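As a point of reference for the standard latent class approach that this study builds on (not the authors' analytical-integration method itself), the following is a minimal sketch of data-augmentation Gibbs sampling for mixture proportions. It assumes that per-individual genotype likelihoods under each baseline stock have already been computed from baseline allele frequencies; the function name and the Dirichlet prior parameter are illustrative.

```python
import numpy as np

def gibbs_mixture_proportions(lik, alpha=1.0, n_iter=2000, burn=500, seed=0):
    """Estimate stock mixture proportions from an (n_individuals x n_stocks)
    matrix of genotype likelihoods via data-augmentation Gibbs sampling.

    lik[i, k] is the likelihood of individual i's multilocus genotype under
    the allele frequencies of baseline stock k (assumed precomputed).
    """
    rng = np.random.default_rng(seed)
    n, k = lik.shape
    pi = np.full(k, 1.0 / k)           # current mixture proportions
    draws = []
    for it in range(n_iter):
        # 1) Sample latent stock-of-origin labels for each individual.
        post = lik * pi                # unnormalised membership probabilities
        post /= post.sum(axis=1, keepdims=True)
        z = np.array([rng.choice(k, p=row) for row in post])
        # 2) Sample proportions from their Dirichlet full conditional.
        counts = np.bincount(z, minlength=k)
        pi = rng.dirichlet(alpha + counts)
        if it >= burn:
            draws.append(pi)
    return np.mean(draws, axis=0)      # posterior mean proportions
```

The strategy described in the abstract instead combines analytical integration over the latent assignments with stochastic optimization, rather than sampling the labels explicitly as above.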

Relevance: 30.00%

Abstract:

NOAA's Coral Reef Conservation Program (CRCP) develops coral reef management priorities by bringing together various partners to better understand threats to coral reef ecosystems, with the goal of conserving, protecting and restoring these resources. The place-based and ecosystem-based management approaches employed by the CRCP require that spatially explicit information about benthic habitats and fish utilization is available to characterize coral reef ecosystems and set conservation priorities. To accomplish this, seafloor habitat mapping of coral reefs around the U.S. Virgin Islands (USVI) and Puerto Rico has been ongoing since 2004. In 2008, fishery acoustics surveys were added to NOAA survey missions in the USVI and Puerto Rico to assess fish distribution and abundance in relation to benthic habitats in high-priority conservation areas. NOAA's National Centers for Coastal Ocean Science (NCCOS) have developed fisheries acoustics survey capabilities onboard the NOAA ship Nancy Foster to complement the CRCP seafloor habitat mapping effort spearheaded by the Center for Coastal Monitoring and Assessment Biogeography Branch (CCMA-BB). The integration of these activities has evolved on the Nancy Foster over the three years summarized in this report, and a strategy for improved operations and products has emerged over that time. Not only has the concurrent operation of multibeam and fisheries acoustics surveys been beneficial in terms of optimizing ship time and resources, but this joint effort has also advanced an integrated approach to characterizing bottom and mid-water habitats and the fishes associated with them. CCMA conducts multibeam surveys to systematically map and characterize coral reef ecosystems, resulting in products such as high-resolution bathymetric maps, backscatter information, and benthic habitat classification maps. These products focus on benthic features and the live-bottom habitats associated with them. NCCOS centers (the Center for Coastal Fisheries and Habitat Research and the Center for Coastal Environmental Health and Biomolecular Research) characterize coral reef ecosystems by using fisheries acoustics methods to capture biological information through the entire water column. Spatially explicit information on marine resources derived from fisheries acoustics surveys, such as maps of fish density, supports marine spatial planning strategies and decision making by providing a biological metric for evaluating coral reef ecosystems and assessing impacts from pollution, fishing pressure, and climate change. Data from fisheries acoustics surveys address management needs by providing a measure of biomass in management areas, detecting spatial and temporal responses in distribution relative to natural and anthropogenic impacts, and identifying hotspots that support high fish abundance or fish aggregations. Fisheries acoustics surveys conducted alongside multibeam mapping efforts inherently couple water column data with information on benthic habitats and provide information on the heterogeneity of both benthic habitats and biota in the water column. Building on this information serves to inform resource managers regarding how fishes are organized around habitat structure and the scale at which these relationships are important.
Where resource managers require place-based assessments regarding the location of critical habitats along with high abundances of fish, concurrent multibeam and fisheries acoustics surveys serve as an important tool for characterizing and prioritizing coral reef ecosystems. This report summarizes the evolution of fisheries acoustics surveys onboard the NOAA ship Nancy Foster from 2008 to 2010, in conjunction with multibeam data collection, aimed at characterizing benthic and mid-water habitats in high priority conservation areas around the USVI and Puerto Rico. It also serves as a resource for the continued development of consistent data products derived from acoustic surveys. By focusing on the activities of 2010, this report highlights the progress made to date and illustrates the potential application of fisheries data derived from acoustic surveys to the management of coral reef ecosystems.

Relevance: 30.00%

Abstract:

This book explores the processes for retrieval, classification, and integration of construction images in AEC/FM model-based systems. The author describes a combination of techniques from the areas of image and video processing, computer vision, information retrieval, statistics and content-based image and video retrieval that have been integrated into a novel method for the retrieval of related construction site image data from components of a project model. Automated methods for the integration of construction images are therefore important for construction information management. This method has been tested on available construction site images from a variety of sources, such as past and current building construction and transportation projects, and is able to automatically classify, store, integrate and retrieve image data files in inter-organizational systems so as to allow their usage in project-management-related tasks.
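As an illustration of the kind of content-based retrieval step described above, and not the author's actual method (which combines several image, video and information-retrieval techniques), here is a minimal sketch that ranks candidate site images against images already linked to a project-model component using a simple global colour-histogram feature; the file paths and the query/candidate split are hypothetical.

```python
import numpy as np
from PIL import Image

def colour_histogram(path, bins=8):
    """A simple global feature: a normalised joint RGB histogram."""
    img = np.asarray(Image.open(path).convert("RGB"))
    hist, _ = np.histogramdd(img.reshape(-1, 3), bins=(bins, bins, bins),
                             range=((0, 256),) * 3)
    hist = hist.flatten()
    return hist / hist.sum()

def retrieve(query_paths, candidate_paths, top_k=5):
    """Rank candidate site images by cosine similarity to the query set,
    e.g. images already associated with a given project-model component."""
    q = np.mean([colour_histogram(p) for p in query_paths], axis=0)
    scores = []
    for path in candidate_paths:
        c = colour_histogram(path)
        cos = float(q @ c / (np.linalg.norm(q) * np.linalg.norm(c) + 1e-12))
        scores.append((cos, path))
    return sorted(scores, reverse=True)[:top_k]
```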

Relevance: 30.00%

Abstract:

A novel integration method for the production of cost-effective optoelectronic printed circuit boards (OE PCBs) is presented. The proposed integration method allows fabrication of OE PCBs with manufacturing processes common to the electronics industry while enabling direct attachment of electronic components onto the board with solder reflow processes as well as board assembly with automated pick-and-place tools. The OE PCB design is based on the use of polymer multimode waveguides, end-fired optical coupling schemes, and simple electro-optic connectors, eliminating the need for additional optical components in the optical layer, such as micro-mirrors and micro-lenses. A proof-of-concept low-cost optical transceiver produced with the proposed integration method is presented. This transceiver is fabricated on a low-cost FR4 substrate, comprises a polymer Y-splitter together with the electronic circuitry of the transmitter and receiver modules, and achieves error-free 10-Gb/s bidirectional data transmission. Theoretical studies on the optical coupling efficiencies and alignment tolerances achieved with the employed end-fired coupling schemes are presented, while experimental results on the optical transmission characteristics, frequency response, and data transmission performance of the integrated optical links are reported. The demonstrated optoelectronic unit can be used as a front-end optical network unit in short-reach data communication links. © 2011-2012 IEEE.

Relevance: 30.00%

Abstract:

With long-term marine surveys and research, and especially with the development of new marine environment monitoring technologies, prodigious amounts of complex marine environmental data are generated and continue to grow rapidly. These data are massive in volume, widely distributed, multi-source, heterogeneous, multi-dimensional, and dynamic in structure and time. The present study recommends an integrative visualization solution for these data, to enhance the visual display of data and data archives and to support joint use of data distributed among different organizations and communities. The study also analyses web services technologies, defines the concept of a marine information grid, and proposes a process-oriented spatiotemporal visualization method. We discuss how marine environmental data can be organized based on this spatiotemporal visualization method, and how the organized data are represented for use with web services and stored in a reusable fashion. In addition, we present an integrative visualization architecture based on these technologies. Finally, we describe a prototype system for marine environmental data of the South China Sea that visualizes Argo floats, sea surface temperature fields, current fields, salinity, in situ investigation data, and ocean stations. The prototype illustrates the integrative visualization architecture, highlights the process-oriented spatiotemporal visualization method, and demonstrates the benefits of the architecture and methods described in this study.
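To make the idea of organizing spatiotemporal observations for web-service delivery concrete, the following minimal sketch groups heterogeneous observations into (variable, time) layers of the kind a map-visualization client might request; the field names and grouping keys are illustrative assumptions rather than the paper's actual schema.

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class Observation:
    variable: str      # e.g. "sea_surface_temperature", "salinity"
    lon: float
    lat: float
    depth: float
    time: str          # ISO 8601 timestamp
    value: float

def build_layers(observations):
    """Group observations into (variable, time) layers, the unit a
    visualization client would request from a web service."""
    layers = defaultdict(list)
    for ob in observations:
        layers[(ob.variable, ob.time)].append(
            {"lon": ob.lon, "lat": ob.lat, "depth": ob.depth, "value": ob.value})
    return layers
```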

Relevance: 30.00%

Abstract:

Li, Longzhuang, Liu, Yonghuai, Obregon, A., Weatherston, M. Visual Segmentation-Based Data Record Extraction From Web Documents. Proceedings of IEEE International Conference on Information Reuse and Integration, 2007, pp. 502-507. Sponsorship: IEEE

Relevance: 30.00%

Abstract:

BACKGROUND: In the current climate of high-throughput computational biology, the inference of a protein's function from related measurements, such as protein-protein interaction relations, has become a canonical task. Most existing technologies pursue this task as a classification problem, on a term-by-term basis, for each term in a database such as the Gene Ontology (GO) database, a popular rigorous vocabulary for biological functions. However, ontology structures are essentially hierarchies, with certain top-to-bottom annotation rules that protein function predictions should in principle follow. Currently, the most common approach to imposing these hierarchical constraints on network-based classifiers is to apply a transitive closure to the predictions. RESULTS: We propose a probabilistic framework to integrate information in relational data, in the form of a protein-protein interaction network, and a hierarchically structured database of terms, in the form of the GO database, for the purpose of protein function prediction. At the heart of our framework is a factorization of local neighborhood information in the protein-protein interaction network across successive ancestral terms in the GO hierarchy. We introduce a classifier within this framework, with a computationally efficient implementation, that produces GO-term predictions that naturally obey a hierarchical 'true-path' consistency from root to leaves, without the need for further post-processing. CONCLUSION: A cross-validation study, using data from the yeast Saccharomyces cerevisiae, shows our method offers substantial improvements over both standard 'guilt-by-association' (i.e., nearest-neighbor) and more refined Markov random field methods, whether in their original form or when post-processed to artificially impose 'true-path' consistency. Further analysis of the results indicates that these improvements are associated with increased predictive capabilities (i.e., increased positive predictive value), and that this increase is consistent across GO-term depths. Additional in silico validation on a collection of new annotations recently added to GO confirms the advantages suggested by the cross-validation study. Taken as a whole, our results show that a hierarchical approach to network-based protein function prediction, which exploits the ontological structure of protein annotation databases in a principled manner, can offer substantial advantages over the successive application of 'flat' network-based methods.
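The 'transitive closure' post-processing that the abstract contrasts with can be stated compactly: whenever a term is predicted for a protein, every ancestor term in the GO DAG must be predicted too. A minimal sketch of that closure step follows (the proposed classifier makes it unnecessary by construction); the term identifiers and the parents mapping are assumed inputs.

```python
def transitive_closure(predicted_terms, parents):
    """Enforce the GO 'true-path' rule on a flat prediction set: whenever a
    term is predicted for a protein, all of its ancestors must be too.

    predicted_terms : set of GO term ids predicted by a flat classifier
    parents         : dict mapping each term id to its parent term ids (the DAG)
    """
    closed = set()
    stack = list(predicted_terms)
    while stack:
        term = stack.pop()
        if term in closed:
            continue
        closed.add(term)
        stack.extend(parents.get(term, ()))   # walk up toward the root
    return closed
```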

Relevance: 30.00%

Abstract:

How do brain mechanisms carry out the motion integration and segmentation processes that compute unambiguous global motion percepts from ambiguous local motion signals? Consider, for example, a deer running at variable speeds behind forest cover. The forest cover is an occluder that creates apertures through which fragments of the deer's motion signals are intermittently experienced. The brain coherently groups these fragments into a trackable percept of the deer and its trajectory. Form and motion processes are needed to accomplish this, using feedforward and feedback interactions both within and across cortical processing streams. The cortical areas V1, V2, MT, and MST are all involved in these interactions. Figure-ground processes in the form stream through V2, such as the separation of the occluding boundaries of the forest cover from the boundaries of the deer, select the motion signals that determine global object-motion percepts in the motion stream through MT. Sparse but unambiguous feature-tracking signals are amplified before they propagate across position and are integrated with far more numerous ambiguous motion signals. Figure-ground and integration processes together determine the global percept. A neural model predicts the processing stages that embody these form and motion interactions. Model concepts and data are summarized about motion grouping across apertures in response to a wide variety of displays, and probabilistic decision making in parietal cortex in response to random-dot displays.

Relevance: 30.00%

Abstract:

Hydrologic research is a very demanding application of fiber-optic distributed temperature sensing (DTS) in terms of precision, accuracy and calibration. The physics behind the most frequently used DTS instruments is considered as it applies to four calibration methods for single-ended DTS installations. The new methods presented are more accurate than the instrument-calibrated data, achieving accuracies on the order of tenths of a degree root mean square error (RMSE) and mean bias. Effects of localized non-uniformities that violate the assumptions of single-ended calibration are explored and quantified. Experimental design considerations, such as selection of integration times or selection of the length of the reference sections, are discussed, and the impacts of these considerations on calibrated temperatures are explored in two case studies.
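For orientation, one commonly used form of the single-ended calibration equation converts the Stokes/anti-Stokes backscatter ratio to temperature with three parameters fitted from reference sections; sign conventions and parameter definitions vary between instruments and papers, so the sketch below is illustrative rather than a statement of the methods evaluated in the study.

```python
import numpy as np

def dts_temperature(stokes, anti_stokes, z, gamma, C, delta_alpha):
    """Convert single-ended Raman DTS backscatter to temperature (kelvin).

    One common form of the single-ended calibration equation is
        T(z) = gamma / ( ln(P_S/P_aS) + C - delta_alpha * z ),
    where gamma (K) reflects the Raman energy shift, C is a dimensionless
    instrument constant, and delta_alpha (1/m) is the differential
    attenuation between the Stokes and anti-Stokes wavelengths. All three
    parameters are obtained from reference sections (e.g. calibration baths).
    """
    return gamma / (np.log(stokes / anti_stokes) + C - delta_alpha * np.asarray(z))
```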

Relevance: 30.00%

Abstract:

An enterprise information system (EIS) is an integrated data-applications platform characterized by diverse, heterogeneous, and distributed data sources. For many enterprises, a number of business processes still depend heavily on static rule-based methods and extensive human expertise. Enterprises are faced with the need for optimizing operation scheduling, improving resource utilization, discovering useful knowledge, and making data-driven decisions.

This thesis research is focused on real-time optimization and knowledge discovery that addresses workflow optimization, resource allocation, as well as data-driven predictions of process-execution times, order fulfillment, and enterprise service-level performance. In contrast to prior work on data analytics techniques for enterprise performance optimization, the emphasis here is on realizing scalable and real-time enterprise intelligence based on a combination of heterogeneous system simulation, combinatorial optimization, machine-learning algorithms, and statistical methods.

On-demand digital-print service is a representative enterprise requiring a powerful EIS. We use real-life data from Reischling Press, Inc. (RPI), a digital print service provider (PSP), to evaluate our optimization algorithms.

In order to handle the increase in volume and diversity of demands, we first present a high-performance, scalable, and real-time production scheduling algorithm for production automation based on an incremental genetic algorithm (IGA). The objective of this algorithm is to optimize the order dispatching sequence and balance resource utilization. Compared to prior work, this solution is scalable for a high volume of orders and it provides fast scheduling solutions for orders that require complex fulfillment procedures. Experimental results highlight its potential benefit in reducing production inefficiencies and enhancing the productivity of an enterprise.
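As a rough illustration of a genetic algorithm applied to order dispatching (not the thesis's IGA), the sketch below evolves a dispatch sequence that minimizes makespan on identical parallel resources; seeding the population with the previous best sequence when new orders arrive loosely approximates the incremental restart idea. Per-order durations and single-stage resources are simplifying assumptions.

```python
import random

def makespan(sequence, durations, n_machines):
    """Greedy dispatch: each order in the sequence goes to the resource that
    frees up earliest; the fitness to minimise is the overall makespan."""
    load = [0.0] * n_machines
    for order in sequence:
        m = load.index(min(load))
        load[m] += durations[order]
    return max(load)

def genetic_schedule(durations, n_machines, pop_size=40, generations=200,
                     seed_sequence=None, rng=None):
    """A small permutation GA for the order-dispatching problem. Passing the
    previous best sequence as seed_sequence when new orders arrive restarts
    the search from a good solution, mimicking an incremental GA."""
    rng = rng or random.Random(0)
    orders = list(range(len(durations)))
    pop = [rng.sample(orders, len(orders)) for _ in range(pop_size)]
    if seed_sequence:
        kept = [o for o in seed_sequence if o < len(durations)]
        pop[0] = kept + [o for o in orders if o not in kept]
    for _ in range(generations):
        pop.sort(key=lambda s: makespan(s, durations, n_machines))
        parents = pop[:pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(orders))
            child = a[:cut] + [o for o in b if o not in a[:cut]]   # order crossover
            i, j = rng.sample(range(len(orders)), 2)
            child[i], child[j] = child[j], child[i]                # swap mutation
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda s: makespan(s, durations, n_machines))
```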

We next discuss analysis and prediction of different attributes involved in hierarchical components of an enterprise. We start from a study of the fundamental processes related to real-time prediction. Our process-execution time and process-status prediction models integrate statistical methods with machine-learning algorithms. In addition to improved prediction accuracy compared to stand-alone machine-learning algorithms, they also provide a probabilistic estimation of the predicted status. An order generally consists of multiple serial and parallel processes. We next introduce an order-fulfillment prediction model that combines the advantages of multiple classification models by incorporating flexible decision-integration mechanisms. Experimental results show that adopting due dates recommended by the model can significantly reduce the enterprise's late-delivery ratio. Finally, we investigate service-level attributes that reflect the overall performance of an enterprise. We analyze and decompose time-series data into different components according to their hierarchical periodic nature, perform correlation analysis, and develop univariate prediction models for each component as well as multivariate models for correlated components. Predictions for the original time series are aggregated from the predictions of its components. In addition to a significant increase in mid-term prediction accuracy, this distributed modeling strategy also improves short-term time-series prediction accuracy.
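A minimal sketch of the component-wise forecasting idea follows: the series is decomposed additively into a trend and a periodic component, each component is predicted separately, and the forecasts are aggregated. The thesis's actual hierarchical decomposition, correlation analysis, and multivariate models are not reproduced here.

```python
import numpy as np

def decompose_and_forecast(y, period=7, horizon=14):
    """Additive decomposition of a univariate series into a periodic component
    and a trend, with the forecast aggregated from per-component predictions.

    y : 1-D array of equally spaced observations (e.g. daily order volume)
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    # Trend component: a simple least-squares line (stand-in for a richer model).
    slope, intercept = np.polyfit(t, y, 1)
    trend = slope * t + intercept
    # Periodic component: average detrended value for each phase of the cycle.
    detrended = y - trend
    seasonal = np.array([detrended[p::period].mean() for p in range(period)])
    # Forecast each component separately, then aggregate.
    t_future = np.arange(len(y), len(y) + horizon)
    trend_fc = slope * t_future + intercept
    seasonal_fc = seasonal[t_future % period]
    return trend_fc + seasonal_fc
```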

In summary, this thesis research has led to a set of characterization, optimization, and prediction tools for an EIS to derive insightful knowledge from data and use it as guidance for production management. It is expected to provide solutions for enterprises to increase reconfigurability, accomplish more automated procedures, and obtain data-driven recommendations for effective decisions.

Relevance: 30.00%

Abstract:

BACKGROUND: The wealth of phenotypic descriptions documented in the published articles, monographs, and dissertations of phylogenetic systematics is traditionally reported in a free-text format, and it is therefore largely inaccessible for linkage to biological databases for genetics, development, and phenotypes, and difficult to manage for large-scale integrative work. The Phenoscape project aims to represent these complex and detailed descriptions with rich and formal semantics that are amenable to computation and integration with phenotype data from other fields of biology. This entails reconceptualizing the traditional free-text characters into the computable Entity-Quality (EQ) formalism using ontologies. METHODOLOGY/PRINCIPAL FINDINGS: We used ontologies and the EQ formalism to curate a collection of 47 phylogenetic studies on ostariophysan fishes (including catfishes, characins, minnows, and knifefishes) and their relatives, with the goal of integrating these complex phenotype descriptions with information from an existing model organism database (zebrafish, http://zfin.org). We developed a curation workflow for the collection of character, taxonomic and specimen data from these publications. A total of 4,617 phenotypic characters (10,512 states) for 3,449 taxa, primarily species, were curated into the EQ formalism (for a total of 12,861 EQ statements) using anatomical and taxonomic terms from teleost-specific ontologies (the Teleost Anatomy Ontology and the Teleost Taxonomy Ontology) in combination with terms from a quality ontology (the Phenotype and Trait Ontology). Standards and guidelines for consistently and accurately representing phenotypes were developed in response to the challenges that were evident from two annotation experiments and from feedback from curators. CONCLUSIONS/SIGNIFICANCE: The challenges we encountered, and many of the curation standards and methods for improving consistency that we developed, are generally applicable to any effort to represent phenotypes using ontologies. This is because an ontological representation of the detailed variations in phenotype, whether between mutant and wildtype, among individual humans, or across the diversity of species, requires a process by which a precise combination of terms from domain ontologies is selected and organized according to logical relations. The efficiencies that we have developed in this process will be useful for any attempt to annotate complex phenotypic descriptions using ontologies. We also discuss some ramifications of EQ representation for the domain of systematics.
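A minimal data-structure sketch of an EQ statement is shown below to make the formalism concrete; the ontology term identifiers and the example character are placeholders, not actual Phenoscape annotations.

```python
from dataclasses import dataclass

@dataclass
class OntologyTerm:
    ontology: str   # e.g. "TAO" (Teleost Anatomy Ontology) or "PATO"
    term_id: str    # identifier within that ontology (placeholder values below)
    label: str

@dataclass
class EQStatement:
    """One computable phenotype annotation: an anatomical Entity bearing a Quality."""
    taxon: str
    entity: OntologyTerm
    quality: OntologyTerm
    publication: str

# Hypothetical example: a "serrated dorsal-fin spine" character for a catfish.
annotation = EQStatement(
    taxon="Ictalurus punctatus",
    entity=OntologyTerm("TAO", "TAO:0000000", "dorsal fin spine"),   # placeholder id
    quality=OntologyTerm("PATO", "PATO:0000000", "serrated"),        # placeholder id
    publication="Author et al. (hypothetical study)",
)
```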

Relevance: 30.00%

Abstract:

Building on the planning efforts of the RCN4GSC project, a workshop was convened in San Diego to bring together experts from genomics and metagenomics, biodiversity, ecology, and bioinformatics with the charge to identify potential for positive interactions and progress, especially building on successes at establishing data standards by the GSC and by the biodiversity and ecological communities. Until recently, the contribution of microbial life to the biomass and biodiversity of the biosphere was largely overlooked (because it was resistant to systematic study). Now, emerging genomic and metagenomic tools are making investigation possible. Initial research findings suggest that major advances are in the offing. Although different research communities share some overlapping concepts and traditions, they differ significantly in sampling approaches, vocabularies and workflows. Likewise, their definitions of 'fitness for use' for data differ significantly, as this concept stems from the specific research questions of most importance in the different fields. Nevertheless, there is little doubt that there is much to be gained from greater coordination and integration. As a first step toward interoperability of the information systems used by the different communities, participants agreed to conduct a case study on two of the leading data standards from the two formerly disparate fields: (a) GSC's standard checklists for genomics and metagenomics and (b) TDWG's Darwin Core standard, used primarily in taxonomy and systematic biology.

Relevance: 30.00%

Abstract:

In chimpanzees, most females disperse from the community in which they were born to reproduce in a new community, thereby eliminating the risk of inbreeding with close kin. However, across sites, some females breed in their natal community, raising questions about the flexibility of dispersal, the costs and benefits of different strategies and the mitigation of costs associated with dispersal and integration. In this dissertation I address these questions by combining long-term behavioral data and recent field observations on maturing and young adult females in Gombe National Park with an experimental manipulation of relationship formation in captive apes in the Congo.

To assess the risk of inbreeding for females who do and do not disperse, 129 chimpanzees were genotyped and relatedness between each dyad was calculated. Natal females were more closely related to adult community males than were immigrant females. By examining the parentage of 58 surviving offspring, I found that natal females were not more related to the sires of their offspring than were immigrant females, despite three instances of close inbreeding. The sires of all offspring were less related to the mothers than non-sires regardless of the mother’s residence status. These results suggest that chimpanzees are capable of detecting relatedness and that, even when remaining natal, females can largely avoid, though not eliminate, inbreeding.

Next, I examined whether dispersal was associated with energetic, social, physiological and/or reproductive costs by comparing immigrant (n=10) and natal (n=9) females of similar age using 2358 hours of observational data. Natal and immigrant females did not differ in any energetic metric. Immigrant females received aggression from resident females more frequently than natal females. Immigrants spent less time in social grooming and more time self-grooming than natal females. Immigrant females primarily associated with resident males, had more social partners and lacked close social allies. There was no difference in levels of fecal glucocorticoid metabolites in immigrant and natal females. Immigrant females gave birth 2.5 years later than natal females, though the survival of their first offspring did not differ. These results indicate that immigrant females in Gombe National Park do not face energetic deficits upon transfer, but they do enter a hostile social environment and have a delayed first birth.

Next, I examined whether chimpanzees use condition- and phenotype-dependent cues in making dispersal decisions. I examined the effect of social and environmental conditions present at the time females of known age matured (n=25) on the females’ dispersal decisions. Females were more likely to disperse if they had more male maternal relatives and thus, a high risk of inbreeding. Females with a high ranking mother and multiple maternal female kin tended to disperse less frequently, suggesting that a strong female kin network provides benefits to the maturing daughter. Females were also somewhat less likely to disperse when fewer unrelated males were present in the group. Habitat quality and intrasexual competition did not affect dispersal decisions. Using a larger sample of 62 females observed as adults in Gombe, I also detected an effect of phenotypic differences in personality on the female’s dispersal decisions; extraverted, agreeable and open females were less likely to disperse.

Natural observations show that apes use grooming and play as social currency, but no experimental manipulations have been carried out to measure the effects of these behaviors on relationship formation, an essential component of integration. Thirty chimpanzees and 25 bonobos were given a choice between an unfamiliar human who had recently groomed or played with them over one who did not. Both species showed a preference for the human that had interacted with them, though the effect was driven by males. These results support the idea that grooming and play act as social currency in great apes that can rapidly shape social relationships between unfamiliar individuals. Further investigation is needed to elucidate the use of social currency in female apes.

I conclude that dispersal in female chimpanzees is flexible and the balance of costs and benefits varies for each individual. Females likely take into account social cues present at maturity and their own phenotype in choosing a settlement path and are especially sensitive to the presence of maternal male kin. The primary cost associated with philopatry is inbreeding risk and the primary cost associated with dispersal is delay in the age at first birth, presumably resulting from intense social competition. Finally, apes may strategically make use of affiliative behavior in pursuing particular relationships, something that should be useful in the integration process.

Relevance: 30.00%

Abstract:

A cross-domain workflow application may be constructed using a standard reference model such as the one by the Workflow Management Coalition (WfMC) [7], but the requirements for this type of application are inherently different from one organization to another. The existing models, and the systems built around them, meet some but not all of the requirements of the organizations involved in a collaborative process. Furthermore, the requirements change over time. This makes the applications difficult to develop and distribute. Service Oriented Architecture (SOA) based approaches, such as BPEL (the Business Process Execution Language), intend to provide a solution but fail to address the problems sufficiently, especially in situations where the expectations and level of skills of the users (e.g. the participants of the processes) in different organizations are likely to differ. In this paper, we discuss a design pattern that provides a novel approach towards a solution. In the solution, business users can design the applications at a high level of abstraction: the use cases and user interactions. The designs are documented and used, together with the data and events captured later that represent the user interactions with the systems, to feed an intermediate component local to the users, the IFM (InterFace Mapper), which bridges the gaps between the users and the systems. We discuss the main issues faced in the design and prototyping. The approach alleviates the need for re-programming against the APIs of any back-end service, thus easing the development and distribution of the applications.
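A minimal sketch of the InterFace Mapper idea is shown below: use-case-level actions captured from user interactions are dispatched to whatever back-end service currently implements them, so rebinding a service does not require re-programming the front end. The class, action names, and handler are hypothetical illustrations of the pattern, not the paper's implementation.

```python
from typing import Any, Callable, Dict

class InterfaceMapper:
    """A minimal stand-in for the IFM idea: user-facing actions, described at
    the level of use cases, are mapped onto whatever back-end service calls
    currently implement them, so the front end never binds to a service API."""

    def __init__(self) -> None:
        self._bindings: Dict[str, Callable[..., Any]] = {}

    def bind(self, action: str, handler: Callable[..., Any]) -> None:
        """Associate a use-case action (e.g. 'submit_claim') with a handler
        that wraps the current back-end service call."""
        self._bindings[action] = handler

    def perform(self, action: str, **event_data: Any) -> Any:
        """Dispatch a captured user interaction to the bound service."""
        if action not in self._bindings:
            raise KeyError(f"no service bound for action '{action}'")
        return self._bindings[action](**event_data)

# Hypothetical usage: rebinding an action to a new service needs no UI change.
ifm = InterfaceMapper()
ifm.bind("submit_claim", lambda claimant, amount: {"status": "queued",
                                                   "claimant": claimant,
                                                   "amount": amount})
print(ifm.perform("submit_claim", claimant="A. User", amount=120.0))
```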

Relevance: 30.00%

Abstract:

Remotely sensed airborne hyperspectral data are routinely used for applications including algorithm development for satellite sensors, environmental monitoring and atmospheric studies. Single flight lines of airborne hyperspectral data are often in the region of tens of gigabytes in size, which means that a single aircraft can collect terabytes of remotely sensed hyperspectral data during a single year. Before these data can be used for scientific analyses, they need to be radiometrically calibrated, synchronised with the aircraft's position and attitude, and then geocorrected. To enable efficient processing of these large datasets, the UK Airborne Research and Survey Facility has recently developed a software suite, the Airborne Processing Library (APL), for processing airborne hyperspectral data acquired from the Specim AISA Eagle and Hawk instruments. The APL toolbox allows users to radiometrically calibrate, geocorrect, reproject and resample airborne data. Each stage of the toolbox outputs data in the common band-interleaved-by-line (BIL) format, which allows its integration with other standard remote sensing software packages. APL was developed to be user-friendly and suitable for use on a workstation PC as well as for the facility's automated processing; to this end APL can be used under both Windows and Linux environments, on a single desktop machine or through a Grid engine. A graphical user interface also exists. In this paper we describe the Airborne Processing Library software, its algorithms and approach. We present example results from using APL with an AISA Eagle sensor, and we assess its spatial accuracy using data from multiple flight lines collected during a campaign in 2008 together with in situ surveyed ground control points.
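To illustrate two of the concepts mentioned above without reference to APL's actual interfaces (which are not described here), the sketch below shows the generic dark-subtraction-and-gain form of radiometric calibration and how a band-interleaved-by-line file can be memory-mapped; the array shapes, function names, and calibration inputs are illustrative assumptions.

```python
import numpy as np

def radiometric_calibration(raw_dn, dark_current, gain):
    """Convert raw digital numbers to at-sensor radiance, band by band, using
    the generic dark-subtraction and gain-scaling form of the correction.

    raw_dn       : (lines, samples, bands) array of raw digital numbers
    dark_current : (bands,) dark-signal offset recorded with the shutter closed
    gain         : (bands,) laboratory-derived radiometric gain per band
    """
    return (np.asarray(raw_dn, dtype=float) - dark_current) * gain

def read_bil(path, lines, samples, bands, dtype=np.uint16):
    """Memory-map a band-interleaved-by-line (BIL) file: each scan line stores
    all samples of band 1, then all samples of band 2, and so on."""
    data = np.memmap(path, dtype=dtype, mode="r", shape=(lines, bands, samples))
    return np.transpose(data, (0, 2, 1))   # reorder to (lines, samples, bands)
```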