861 results for Multi Domain Information Model
Abstract:
Recent research trends in computer-aided drug design have shown an increasing interest in advanced approaches able to deal with large amounts of data. This demand arose from the awareness of the complexity of biological systems and from the availability of data provided by high-throughput technologies. As a consequence, drug research has embraced this paradigm shift, exploiting approaches such as those based on networks. Indeed, the process of drug discovery can benefit from the implementation of network-based methods at different steps, from target identification to drug repurposing. From this broad range of opportunities, this thesis focuses on three main topics: (i) chemical space networks (CSNs), which are designed to represent and characterize bioactive compound data sets; (ii) drug-target interaction (DTI) prediction through a network-based algorithm that predicts missing links; and (iii) COVID-19 drug research, explored by implementing COVIDrugNet, a network-based tool for COVID-19 related drugs. The main highlight emerging from this thesis is that network-based approaches are useful methodologies for tackling different issues in drug research. In detail, CSNs are valuable coordinate-free, graphically accessible representations of the structure-activity relationships of bioactive compound data sets, especially for medium-to-large libraries of molecules. DTI prediction through the random walk with restart algorithm on heterogeneous networks can be a helpful method for target identification. COVIDrugNet is an example of the usefulness of network-based approaches for studying drugs related to a specific condition, i.e., COVID-19, and the same ‘systems-based’ approaches can be used for other diseases. To conclude, network-based tools are proving suitable for many applications in drug research and provide the opportunity to model and analyze diverse drug-related data sets, even large ones, while also integrating multi-domain information.
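The abstract names the random walk with restart (RWR) algorithm for DTI prediction. As a hedged illustration, not the thesis's code, a minimal RWR on a single adjacency matrix (the network, restart probability and seed handling below are assumptions) can be written as:

```python
import numpy as np

def random_walk_with_restart(A, seed_idx, restart_prob=0.15, tol=1e-8, max_iter=1000):
    """Steady-state RWR scores on a network given its adjacency matrix A."""
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    W = A / col_sums                       # column-normalized transition matrix
    p0 = np.zeros(A.shape[0])              # restart vector: mass on the seed node(s)
    p0[seed_idx] = 1.0 / len(np.atleast_1d(seed_idx))
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart_prob) * W @ p + restart_prob * p0
        if np.abs(p_next - p).sum() < tol:
            break
        p = p_next
    return p

# Toy 4-node network: drugs {0, 1} and targets {2, 3} with one known DTI edge.
A = np.array([[0., 1., 1., 0.],
              [1., 0., 0., 0.],
              [1., 0., 0., 1.],
              [0., 0., 1., 0.]])
scores = random_walk_with_restart(A, seed_idx=0)
print(np.round(scores, 3))   # high-scoring unlinked targets become candidate DTIs
```

On a heterogeneous drug-target network, A would be a block matrix combining drug-drug similarity, target-target similarity, and known drug-target edges.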
Abstract:
Digital business ecosystems (DBE) are becoming an increasingly popular concept for modelling and building distributed systems in heterogeneous, decentralized and open environments. Information and communication technology (ICT)-enabled business solutions have created an opportunity for automated business relations and transactions. The deployment of ICT in business-to-business (B2B) integration seeks to improve competitiveness by establishing real-time information and offering better information visibility to business ecosystem actors. The product, component and raw material flows in supply chains are traditionally studied in logistics research. In this study, we expand the research to cover the parallel service and information flows, treated as information logistics integration. In this thesis, we show how better integration and automation of information flows enhance the speed of processes and thus provide cost savings and other benefits for organizations. Investments in DBE are intended to add value through business automation and are key decisions in building up information logistics integration. Business solutions that build on automation are important sources of value in networks that promote and support business relations and transactions. Value is created through improved productivity and effectiveness when new, more efficient collaboration methods are discovered and integrated into the DBE. Organizations, business networks and collaborations, even with competitors, form DBEs in which information logistics integration plays a significant role as a value driver. However, traditional economic and computing theories do not treat digital business ecosystems as a separate form of organization, and they do not provide conceptual frameworks for exploring digital business ecosystems as value drivers: combined internal management and external coordination mechanisms for information logistics integration are not yet part of companies' strategic processes. In this thesis, we have developed and tested a framework for exploring digital business ecosystems and a coordination model for digital business ecosystem integration; moreover, we have analysed the value of information logistics integration. The research is based on a case study and on mixed methods, in which we use the Delphi method and Internet-based tools for idea generation and development. We conducted many interviews with key experts, which we recorded, transcribed and coded to identify success factors. Quantitative analyses were based on a Monte Carlo simulation, which estimated cost savings, and on Real Option Valuation, which sought an optimal investment program at the ecosystem level. This study provides valuable knowledge regarding information logistics integration by utilizing a suitable business process information model for collaboration. The information model is based on business process scenarios and on detailed transactions for the mapping and automation of product, service and information flows. The research results illustrate the current gap in the understanding of information logistics integration in digital business ecosystems. Based on the success factors, we were able to illustrate how specific coordination mechanisms related to network management and orchestration could be designed. We also pointed out the potential of information logistics integration in value creation. With the help of global standardization experts, we designed the core information model for B2B integration.
We built this quantitative analysis using the Monte Carlo-based simulation model and the Real Option Valuation model. This research covers relevant new research disciplines, such as information logistics integration and digital business ecosystems, for which the current literature is still limited. The research was executed by high-level experts and managers responsible for global business network B2B integration. However, it was dominated by one industry domain, and therefore a more comprehensive exploration should be undertaken to cover a larger population of business sectors. Building on this research, a new quantitative survey could provide new possibilities to examine information logistics integration in digital business ecosystems. The value activities indicate that further studies should continue, especially with regard to collaboration issues in integration, focusing on a user-centric approach. We should better understand how real-time information supports customer value creation by embedding the information into the lifetime value of products and services. The aim of this research was to build competitive advantage through B2B integration to support a real-time economy. For practitioners, this research created several tools and concepts to improve value activities, information logistics integration design, and management and orchestration models. Based on the results, the companies were able to better understand the formulation of the digital business ecosystem and the importance of joint efforts in collaboration. However, the challenge of incorporating this new knowledge into strategic processes in a multi-stakeholder environment remains. This challenge has been noted, and new projects have been established in pursuit of a real-time economy.
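The quantitative part of this work rests on Monte Carlo simulation of integration cost savings. A minimal sketch of that style of analysis, with entirely hypothetical distributions and per-transaction figures (the thesis's actual inputs are not given here), might look like:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 100_000  # number of Monte Carlo draws

# Hypothetical per-transaction figures for a B2B integration scenario.
transactions_per_year = rng.normal(50_000, 5_000, N)
manual_cost = rng.triangular(8.0, 12.0, 20.0, N)     # EUR per manual transaction
automated_cost = rng.triangular(0.5, 1.0, 2.5, N)    # EUR per automated transaction
adoption_rate = rng.uniform(0.4, 0.9, N)             # share of flows actually integrated

savings = transactions_per_year * adoption_rate * (manual_cost - automated_cost)

print(f"mean annual savings : {savings.mean():,.0f} EUR")
print(f"5th-95th percentile : {np.percentile(savings, 5):,.0f} - "
      f"{np.percentile(savings, 95):,.0f} EUR")
```

The resulting savings distribution, rather than a single point estimate, is what feeds a Real Option Valuation of the ecosystem-level investment program.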
Abstract:
An extensive off-line evaluation of the Noah/Single Layer Urban Canopy Model (Noah/SLUCM) urban land-surface model is presented using data from 15 sites to assess (1) the ability of the scheme to reproduce the surface energy balance observed in a range of urban environments, including seasonal changes, and (2) the impact of increasing complexity of input parameter information. Model performance is found to be most dependent on representation of vegetated surface area cover; refinement of other parameter values leads to smaller improvements. Model biases in net all-wave radiation and trade-offs between turbulent heat fluxes are highlighted using an optimization algorithm. Here we use the Urban Zones to characterize Energy partitioning (UZE) as the basis to assign default SLUCM parameter values. A methodology (FRAISE) to assign sites (or areas) to one of these categories based on surface characteristics is evaluated. Using three urban sites from the Basel Urban Boundary Layer Experiment (BUBBLE) dataset, an independent evaluation of the model performance with the parameter values representative of each class is performed. The scheme copes well with both seasonal changes in the surface characteristics and intra-urban heterogeneities in energy flux partitioning, with RMSE performance comparable to similar state-of-the-art models for all fluxes, sites and seasons. The potential of the methodology for high-resolution atmospheric modelling application using the Weather Research and Forecasting (WRF) model is highlighted. This analysis supports the recommendations that (1) three classes are appropriate to characterize the urban environment, and (2) that the parameter values identified should be adopted as default values in WRF.
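The evaluation above is summarized by RMSE on the surface energy fluxes. As a small generic illustration (the flux values and gap handling below are assumptions, not BUBBLE data):

```python
import numpy as np

def rmse(observed, modelled):
    """Root-mean-square error between observed and modelled flux series."""
    observed, modelled = np.asarray(observed), np.asarray(modelled)
    mask = ~np.isnan(observed) & ~np.isnan(modelled)   # skip gaps in tower data
    return float(np.sqrt(np.mean((modelled[mask] - observed[mask]) ** 2)))

# Hypothetical hourly sensible heat flux (W m^-2) for one site and season.
q_h_obs = [120.0, 180.5, np.nan, 95.2]
q_h_mod = [110.3, 170.1, 150.0, 100.8]
print(f"RMSE Q_H: {rmse(q_h_obs, q_h_mod):.1f} W m^-2")
```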
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Knowledge resource reuse has become a popular approach within the ontology engineering field, mainly because it can speed up the ontology development process, saving time and money and promoting the application of good practices. The NeOn Methodology provides guidelines for reuse, including the selection of the most appropriate knowledge resources for reuse in ontology development. This is a complex decision-making problem in which different conflicting objectives, such as reuse cost, understandability, integration workload and reliability, have to be taken into account simultaneously. GMAA is a PC-based decision support system based on an additive multi-attribute utility model, intended to allay the operational difficulties involved in the Decision Analysis methodology. The paper illustrates how GMAA can be applied to select multimedia ontologies for reuse in developing a new ontology in the multimedia domain. It also demonstrates that the sensitivity analyses provided by GMAA are useful tools for making a final recommendation.
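An additive multi-attribute utility model of the kind GMAA implements scores each alternative as a weighted sum of single-attribute utilities, U(a) = sum_i w_i * u_i(x_i(a)). A sketch with hypothetical weights and normalized utilities for two candidate ontologies (none of these values come from the paper):

```python
# Additive multi-attribute utility: U(a) = sum_i w_i * u_i(x_i(a))
weights = {"reuse_cost": 0.35, "understandability": 0.25,
           "integration_workload": 0.20, "reliability": 0.20}

# Hypothetical single-attribute utilities u_i in [0, 1] for two candidates.
candidates = {
    "OntologyA": {"reuse_cost": 0.70, "understandability": 0.60,
                  "integration_workload": 0.80, "reliability": 0.50},
    "OntologyB": {"reuse_cost": 0.40, "understandability": 0.90,
                  "integration_workload": 0.55, "reliability": 0.85},
}

def additive_utility(scores, weights):
    """Weighted sum of normalized single-attribute utilities."""
    return sum(weights[attr] * scores[attr] for attr in weights)

for name, scores in candidates.items():
    print(f"{name}: U = {additive_utility(scores, weights):.3f}")
```

A sensitivity analysis, as in GMAA, would then perturb the weights and check whether the ranking of alternatives is stable.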
Abstract:
Network building and exchange of information by people within networks is crucial to the innovation process. Contrary to older models, in social networks the flow of information is noncontinuous and nonlinear. There are critical barriers to information flow that operate in a problematic manner. New models and new analytic tools are needed for these systems. This paper introduces the concept of virtual circuits and draws on recent concepts of network modelling and design to introduce a probabilistic switch theory that can be described using matrices. It can be used to model multistep information flow between people within organisational networks, to provide formal definitions of efficient and balanced networks and to describe distortion of information as it passes along human communication channels. The concept of multi-dimensional information space arises naturally from the use of matrices. The theory and the use of serial diagonal matrices have applications to organisational design and to the modelling of other systems. It is hypothesised that opinion leaders or creative individuals are more likely to emerge at information-rich nodes in networks. A mathematical definition of such nodes is developed and it does not invariably correspond with centrality as defined by early work on networks.
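The matrix view of multistep information flow can be illustrated by propagating a starting distribution through powers of a stochastic "switch" matrix; the matrix below is hypothetical, and this sketch omits the paper's serial diagonal matrices and distortion terms:

```python
import numpy as np

# Row-stochastic switch matrix: entry (i, j) is the probability that
# information held by person i is passed to person j in one step.
P = np.array([
    [0.0, 0.6, 0.4],
    [0.2, 0.0, 0.8],
    [0.5, 0.5, 0.0],
])

x0 = np.array([1.0, 0.0, 0.0])   # information starts at person 0

# Multistep flow: after k steps the distribution is x0 @ P^k.
for k in range(1, 4):
    xk = x0 @ np.linalg.matrix_power(P, k)
    print(f"step {k}: {np.round(xk, 3)}")
```

Information-rich nodes, in the paper's sense, would be those that accumulate probability mass across many such paths, which need not coincide with classical centrality.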
Abstract:
We propose a novel framework where an initial classifier is learned by incorporating prior information extracted from an existing sentiment lexicon. Preferences on the expected sentiment labels of those lexicon words are expressed using generalized expectation criteria. Documents classified with high confidence are then used as pseudo-labeled examples for automatic domain-specific feature acquisition. The word-class distributions of such self-learned features are estimated from the pseudo-labeled examples and are used to train another classifier by constraining the model's predictions on unlabeled instances. Experiments on both the movie review data and the multi-domain sentiment dataset show that our approach attains comparable or better performance than existing weakly-supervised sentiment classification methods despite using no labeled documents.
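As a hedged sketch of the self-learning step described above, the following replaces generalized expectation criteria with simple lexicon voting for the initial labels and keeps only fairly confident predictions as pseudo-labeled examples (the lexicon, corpus and threshold are all invented):

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny hypothetical lexicon (1 = positive, 0 = negative) and unlabeled corpus.
lexicon = {"good": 1, "great": 1, "excellent": 1, "bad": 0, "awful": 0, "boring": 0}
docs = ["a great and excellent film", "an awful boring mess",
        "good acting, great pace", "bad script",
        "the plot twists were great", "boring from start to finish"]

def lexicon_label(doc):
    """Majority vote of lexicon words; None if the document has no lexicon word."""
    votes = [lexicon[w] for w in doc.split() if w in lexicon]
    return int(round(np.mean(votes))) if votes else None

seed = [(d, y) for d in docs if (y := lexicon_label(d)) is not None]

vec = CountVectorizer()
X = vec.fit_transform([d for d, _ in seed])
clf = LogisticRegression().fit(X, [y for _, y in seed])

# Keep only fairly confident predictions as pseudo-labeled examples.
threshold = 0.6
proba = clf.predict_proba(vec.transform(docs))
pseudo = [(d, int(p.argmax())) for d, p in zip(docs, proba) if p.max() >= threshold]
print(pseudo)
```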
Abstract:
The joint sentiment-topic (JST) model was previously proposed to detect sentiment and topic simultaneously from text. The only supervision required for JST model learning is domain-independent polarity word priors. In this paper, we modify the JST model by incorporating word polarity priors through modified topic-word Dirichlet priors. We study the polarity-bearing topics extracted by JST and show that, by augmenting the original feature space with polarity-bearing topics, in-domain supervised classifiers learned from the augmented feature representation achieve state-of-the-art performance of 95% on the movie review data and an average of 90% on the multi-domain sentiment dataset. Furthermore, using feature augmentation and selection according to the information gain criteria for cross-domain sentiment classification, our proposed approach performs better than or comparably to previous approaches. Nevertheless, our approach is much simpler and does not require difficult parameter tuning.
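Feature selection "according to the information gain criteria" can be approximated with mutual information between terms and class labels; a tiny sketch using scikit-learn, with a toy corpus and mutual information standing in for information gain:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import mutual_info_classif

docs = ["great movie", "awful movie", "great plot", "awful plot"]
y = np.array([1, 0, 1, 0])   # sentiment labels

vec = CountVectorizer()
X = vec.fit_transform(docs)

# Mutual information between each term and the class label; the highest-
# scoring terms would be kept as features for cross-domain classification.
ig = mutual_info_classif(X, y, discrete_features=True)
ranked = sorted(zip(vec.get_feature_names_out(), ig), key=lambda t: -t[1])
print(ranked)  # 'great'/'awful' should outrank the class-neutral words
```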
Abstract:
In this paper, we formulate the electricity retailers’ short-term decision-making problem in a liberalized retail market as a multi-objective optimization model. Retailers with light physical assets, such as generation and storage units in the distribution network, are considered. Following advances in smart grid technologies, electricity retailers are becoming able to employ incentive-based demand response (DR) programs in addition to their physical assets to effectively manage the risks of market price and load variations. In this model, DR scheduling is performed simultaneously with the dispatch of generation and storage units. The ultimate goal is to find the optimal values of the hourly financial incentives offered to end-users. The proposed model considers the capacity obligations imposed on retailers by the grid operator. The profit-seeking retailer also aims to minimize peak demand in order to avoid high capacity charges in the form of grid tariffs or penalties. The non-dominated sorting genetic algorithm II (NSGA-II), a fast and elitist multi-objective evolutionary algorithm, is used to solve the multi-objective problem. A case study is solved to illustrate the efficient performance of the proposed methodology. Simulation results show the effectiveness of the model for designing incentive-based DR programs and indicate the efficiency of NSGA-II in solving the retailers’ multi-objective problem.
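A hedged sketch of a two-objective setup of this kind (incentive payments vs. peak demand), using the pymoo library's NSGA-II implementation; the load profile, linear DR response and incentive bounds are invented, not the paper's model:

```python
import numpy as np
from pymoo.core.problem import Problem
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.optimize import minimize

class RetailerDR(Problem):
    """Toy stand-in problem: x holds 24 hourly incentive rates (EUR/kWh)."""
    def __init__(self):
        super().__init__(n_var=24, n_obj=2, xl=0.0, xu=0.10)

    def _evaluate(self, x, out, *args, **kwargs):
        base_load = 100 + 40 * np.sin(np.linspace(0, 2 * np.pi, 24))  # MW profile
        reduction = 300.0 * x                  # assumed linear DR response to incentives
        load = base_load - reduction
        cost = (x * reduction).sum(axis=1)     # objective 1: incentive payments
        peak = load.max(axis=1)                # objective 2: peak demand
        out["F"] = np.column_stack([cost, peak])

res = minimize(RetailerDR(), NSGA2(pop_size=50), ("n_gen", 100), seed=1, verbose=False)
print(res.F[:5])   # sample of the Pareto front: (incentive cost, peak demand)
```

NSGA-II returns the whole Pareto front, so the retailer can trade off incentive spending against capacity charges rather than commit to a single weighting in advance.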
Abstract:
Amnestic mild cognitive impairment (aMCI) is characterized by memory deficits alone (single-domain, sd-aMCI) or associated with other cognitive disabilities (multi-domain, md-aMCI). The present study assessed the patterns of electroencephalographic (EEG) activity during the encoding and retrieval phases of short-term memory in these two aMCI subtypes, to identify potential functional differences according to the neuropsychological profile. Continuous EEG was recorded in 43 aMCI patients, of whom 16 had sd-aMCI and 27 md-aMCI, and in 36 age-matched controls (EC) during delayed match-to-sample tasks with face and letter stimuli. At encoding, attended stimuli elicited a parietal alpha (8-12 Hz) power decrease (desynchronization), whereas distracting stimuli were associated with an alpha power increase (synchronization) over right central sites. No difference was observed in parietal alpha desynchronization among the three groups. For attended faces, the alpha synchronization underlying suppression of distracting letters was reduced in both aMCI subgroups, but more severely in md-aMCI cases, which differed significantly from EC. At retrieval, the early N250r recognition effect was significantly reduced for faces in md-aMCI as compared to both sd-aMCI and EC. The results suggest a differential alteration of working memory cerebral processes for faces in the two aMCI subtypes, with face covert recognition processes being specifically altered in md-aMCI.
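Alpha-band (8-12 Hz) power of the kind analysed here is commonly estimated with Welch's method; a minimal sketch on synthetic data (event-related desynchronization would additionally compare this against a pre-stimulus baseline):

```python
import numpy as np
from scipy.signal import welch

fs = 250.0                     # sampling rate (Hz), hypothetical
t = np.arange(0, 4, 1 / fs)
# Fake parietal channel: 10 Hz alpha rhythm plus noise.
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)

f, psd = welch(eeg, fs=fs, nperseg=512)
alpha = (f >= 8) & (f <= 12)
alpha_power = np.trapz(psd[alpha], f[alpha])   # integrate PSD over the alpha band
print(f"alpha-band (8-12 Hz) power: {alpha_power:.3f}")
```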
Abstract:
In the literature, CO2 liquefaction is well studied with steady-state modeling. Steady-state modeling gives an overview of the process, but it does not give information about process behavior during transients. In this master's thesis, three dynamic models of CO2 liquefaction were built and tested: a straight multi-stage compression model and two compression-liquid-pumping models, one with and one without cold energy recovery. The models were made with the Apros software and were also used to verify that Apros is capable of modeling phase changes and the supercritical state of CO2. The models were verified against compressor manufacturer's data and simulation results presented in the literature. Of the models made in this thesis, the straight compression model was found to be the most energy efficient and the fastest to react to transients. Apros was also found to be a capable tool for dynamic liquefaction modeling.
Abstract:
Context awareness, dynamic reconfiguration at runtime and heterogeneity are key characteristics of future distributed systems, particularly in ubiquitous and mobile computing scenarios. The main contributions of this dissertation are theoretical as well as architectural concepts facilitating information exchange and fusion in heterogeneous and dynamic distributed environments. Our main focus is on bridging the heterogeneity issues and, at the same time, considering uncertain, imprecise and unreliable sensor information in information fusion and reasoning approaches. A domain ontology is used to establish a common vocabulary for the exchanged information. We thereby explicitly support different representations for the same kind of information and provide Inter-Representation Operations that convert between them. Special account is taken of the conversion of associated meta-data that express uncertainty and imprecision. The Unscented Transformation, for example, is applied to propagate Gaussian normal distributions across highly non-linear Inter-Representation Operations. Uncertain sensor information is fused using the Dempster-Shafer Theory of Evidence, as it allows explicit modelling of partial and complete ignorance. We also show how to incorporate the Dempster-Shafer Theory of Evidence into probabilistic reasoning schemes such as Hidden Markov Models in order to be able to consider the uncertainty of sensor information when deriving high-level information from low-level data. For all these concepts we provide architectural support as a guideline for developers of innovative information exchange and fusion infrastructures that are particularly targeted at heterogeneous dynamic environments. Two case studies serve as proof of concept. The first case study focuses on heterogeneous autonomous robots that have to spontaneously form a cooperative team in order to achieve a common goal. The second case study is concerned with an approach for user activity recognition which serves as the baseline for a context-aware adaptive application. Both case studies demonstrate the viability and strengths of the proposed solution and emphasize that the Dempster-Shafer Theory of Evidence should be preferred to pure probability theory in applications involving non-linear Inter-Representation Operations.
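Dempster's rule of combination, central to the fusion approach above, normalizes the conjunctive combination of two mass functions by the non-conflicting mass. A self-contained sketch (the frame, sensors and mass values are hypothetical):

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions over frozenset focal elements (Dempster's rule)."""
    combined, conflict = {}, 0.0
    for (a, w1), (b, w2) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + w1 * w2
        else:
            conflict += w1 * w2                    # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Two sensors over the frame {walking, standing}; mass on the full frame
# models ignorance, which pure probabilities cannot express directly.
W, S = frozenset({"walking"}), frozenset({"standing"})
m1 = {W: 0.6, W | S: 0.4}
m2 = {W: 0.5, S: 0.2, W | S: 0.3}
print(dempster_combine(m1, m2))
```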
Abstract:
The research of this dissertation covers developments and applications of short- and long-term climate predictions. The short-term prediction emphasizes monthly and seasonal climate, i.e. forecasting from the next month over a season to up to a year ahead. The long-term predictions pertain to the analysis of inter-annual and decadal climate variations over the whole 21st century. These two climate prediction methods are validated and applied in the study area, namely the Khlong Yai (KY) water basin located on the eastern seaboard of Thailand, a major industrial zone of the country that has been suffering from severe drought and water shortage in recent years. Since water resources are essential for further industrial development in this region, a thorough analysis of the potential climate change and its subsequent impact on the water supply in the area is at the heart of this thesis research. The short-term forecast of the next-season climate, such as temperatures and rainfall, offers a potential general guideline for water management and reservoir operation. To that end, statistical models based on autoregressive techniques, i.e. AR, ARIMA and ARIMAex models, the latter including additional external regressors (sketched in code after this abstract), as well as multiple linear regression (MLR) models, are developed and applied in the study region. Teleconnections between ocean states and the local climate are investigated and used as extra external predictors in the ARIMAex and MLR models, and are shown to enhance the accuracy of the short-term predictions significantly. However, as the teleconnective relationships between ocean states and local climate provide only a one- to four-month lead time, the ocean state indices can support only a one-season-ahead forecast. Hence, GCM climate predictors are also suggested as an additional predictor set for a more reliable and somewhat longer short-term forecast. For the preparation of “pre-warning” information on possible future climate change with potential adverse hydrological impacts in the study region, the long-term climate prediction methodology is applied. The latter is based on the downscaling of climate predictions from several single- and multi-domain GCMs, using the two well-known downscaling methods SDSM and LARS-WG and a newly developed MLR-downscaling technique that allows the incorporation of a multitude of monthly or daily climate predictors from one or several (multi-domain) parent GCMs. The numerous downscaling experiments indicate that the MLR method is more accurate than SDSM and LARS-WG in predicting the recent past 20th-century (1971-2000) long-term monthly climate in the region. The MLR model is consequently employed to downscale 21st-century GCM climate predictions under SRES scenarios A1B, A2 and B1. However, since the hydrological watershed model requires daily-scale climate input data, a new stochastic daily climate generator is developed to rescale monthly observed or predicted climate series to daily series, while adhering to the statistical and geospatial distributional attributes of observed (past) daily climate series in the calibration phase. Employing this daily climate generator, 30 realizations of future daily climate series from downscaled monthly GCM climate predictor sets are produced and used as input to the SWAT distributed watershed model, to simulate future streamflow and other hydrological water budget components in the study region in a multi-realization manner.
In addition to a general examination of the future changes of the hydrological regime in the KY basin, potential future changes of the water budgets of the three main reservoirs in the basin are analysed, as these are a major source of water supply in the study region. The results of the long-term 21st-century downscaled climate predictions provide evidence that, compared with the 20th-century reference period, the future climate in the study area will be more extreme, particularly for SRES A1B. Temperatures will be higher and will exhibit larger fluctuations. Although the future intensity of the rainfall is nearly constant, its spatial distribution across the region is partially changing. There is further evidence that sequential rainfall occurrence will decrease, so that short periods of high intensities will be followed by longer dry spells. This change in the sequential rainfall pattern will also lead to seasonal reductions of the streamflow and seasonal decreases of the water storage in the reservoirs. In any case, these predicted future climate changes and their hydrological impacts should encourage water planners and policy makers to develop adaptation strategies to properly manage the future water supply in this area, following the guidelines suggested in this study.
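An ARIMAex-style model, i.e. ARIMA with external regressors such as an ocean-state index, can be sketched with statsmodels' SARIMAX (the data, model orders and future index values below are invented, not the thesis's configuration):

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(0)
n = 240  # 20 years of monthly data, hypothetical

# Fake monthly rainfall partly driven by an ocean-state index (e.g. an SST anomaly).
sst_index = rng.standard_normal(n)
rainfall = (100 + 30 * np.sin(2 * np.pi * np.arange(n) / 12)
            + 15 * sst_index + rng.normal(0, 10, n))

# ARIMA with an external regressor ("ARIMAex"-style); orders are illustrative only.
model = SARIMAX(rainfall, exog=sst_index, order=(1, 0, 1),
                seasonal_order=(1, 0, 0, 12))
fit = model.fit(disp=False)

# One-season-ahead forecast, conditional on assumed future index values.
future_sst = np.array([[0.5], [0.4], [0.1]])
print(fit.forecast(steps=3, exog=future_sst))
```

The one- to four-month lead time of the teleconnection shows up here as the need to supply future index values: beyond that horizon the exogenous predictor itself must be forecast or replaced by GCM predictors.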
Abstract:
Information modelling is a topic that has been researched a great deal, but many questions around it remain unsolved. An information model is essential in the design of a database, which is the core of an information system. Currently, most databases deal only with information that represents facts, or asserted information. The ability to capture semantic aspects has to be improved, and other types of information, such as temporal and intentional information, should also be considered. Semantic Analysis, a method of information modelling, offers a way to handle these various aspects of information. It employs domain knowledge and communication acts as sources for information modelling. It lends itself to a uniform structure whereby semantic, temporal and intentional information can be captured, which builds a sound foundation for a semantic temporal database.
Abstract:
The vast majority of known proteins have not yet been experimentally characterized and little is known about their function. The design and implementation of computational tools can provide insight into the function of proteins based on their sequence, their structure, their evolutionary history and their association with other proteins. Knowledge of the three-dimensional (3D) structure of a protein can lead to a deep understanding of its mode of action and interaction, but currently the structures of <1% of sequences have been experimentally solved. For this reason, it has become urgent to develop new methods able to computationally extract relevant information from protein sequence and structure. The starting point of my work has been the study of the properties of contacts between protein residues, since they constrain protein folding and characterize different protein structures. Prediction of residue contacts in proteins is an interesting problem whose solution may be useful in protein fold recognition and de novo design. The prediction of these contacts requires the study of protein inter-residue distances for specific types of amino acid pairs, as encoded in the so-called contact map. An interesting new way of analyzing these structures emerged when network studies were introduced, with pivotal papers demonstrating that protein contact networks also exhibit small-world behavior. In order to highlight constraints for the prediction of protein contact maps, and for applications in the field of protein structure prediction and/or reconstruction from experimentally determined contact maps, I studied to what extent the characteristic path length and clustering coefficient of the protein contact network reveal characteristic features of protein contact maps. Provided that residue contacts are known for a protein sequence, the major features of its 3D structure can be deduced by combining this knowledge with correctly predicted motifs of secondary structure. In the second part of my work I focused on a particular protein structural motif, the coiled-coil, known to mediate a variety of fundamental biological interactions. Coiled-coils are found in a variety of structural forms and in a wide range of proteins including, for example, small units such as leucine zippers that drive the dimerization of many transcription factors, or more complex structures such as the family of viral proteins responsible for virus-host membrane fusion. The coiled-coil structural motif is estimated to account for 5-10% of the protein sequences in the various genomes. Given their biological importance, in my work I introduced a Hidden Markov Model (HMM) that exploits the evolutionary information derived from multiple sequence alignments to predict coiled-coil regions and to discriminate coiled-coil sequences. The results indicate that the new HMM outperforms all the existing programs and can be adopted for coiled-coil prediction and for large-scale genome annotation. Genome annotation is a key issue in modern computational biology, being the starting point towards the understanding of the complex processes involved in biological networks. The rapid growth in the number of available protein sequences and structures poses new fundamental problems that still await interpretation. Nevertheless, these data are at the basis of the design of new strategies for tackling problems such as the prediction of protein structure and function.
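The two small-world descriptors studied here, characteristic path length and clustering coefficient, are straightforward to compute once a contact map is viewed as a graph; a sketch on a random stand-in contact map (the 8 Å cutoff is a common convention, but the distances below are fake):

```python
import networkx as nx
import numpy as np

# Hypothetical contact map: residues i, j are "in contact" if their
# C-alpha distance falls below a cutoff (distances here are random).
rng = np.random.default_rng(7)
n_res = 60
D = rng.uniform(2.0, 20.0, (n_res, n_res))
D = (D + D.T) / 2                               # symmetrize the fake distance matrix
contact_map = (D < 8.0) & ~np.eye(n_res, dtype=bool)

G = nx.from_numpy_array(contact_map.astype(int))

# Small-world descriptors used to characterize protein contact networks.
if nx.is_connected(G):
    print("characteristic path length:", nx.average_shortest_path_length(G))
print("clustering coefficient    :", nx.average_clustering(G))
```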
Experimental determination of the functions of all these proteins would be a hugely time-consuming and costly task and, in most instances, has not been carried out. For example, currently only approximately 20% of annotated proteins in the Homo sapiens genome have been experimentally characterized. A commonly adopted procedure for annotating protein sequences relies on "inheritance through homology", based on the notion that similar sequences share similar functions and structures. This procedure consists of assigning sequences to a specific group of functionally related sequences which have been grouped through clustering techniques. The clustering procedure is based on suitable similarity rules, since predicting protein structure and function from sequence largely depends on the value of sequence identity. However, additional levels of complexity are due to multi-domain proteins, to proteins that share common domains but do not necessarily share the same function, and to the finding that different combinations of shared domains can lead to different biological roles. In the last part of this study I developed and validated a system that contributes to sequence annotation by taking advantage of a validated transfer-through-inheritance procedure for molecular functions and structural templates. After a cross-genome comparison with the BLAST program, clusters were built on the basis of two stringent constraints on sequence identity and coverage of the alignment. The adopted measure explicitly addresses the problem of multi-domain protein annotation and allows a fine-grained division of the whole set of proteomes used, which ensures cluster homogeneity in terms of sequence length. A high level of coverage of structure templates over the length of protein sequences within clusters ensures that multi-domain proteins, when present, can be templates for sequences of similar length. This annotation procedure includes the possibility of reliably transferring statistically validated functions and structures to sequences, considering the information available in the present databases of molecular functions and structures.
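The clustering constraints described, stringent thresholds on both sequence identity and alignment coverage of both sequences, can be sketched as a filter over BLAST tabular hits (the identifiers, thresholds and hit values below are hypothetical):

```python
# Hypothetical BLAST tabular hits: (query, subject, pct_identity, aln_len, qlen, slen)
hits = [
    ("P001", "Q101", 92.0, 310, 320, 330),
    ("P001", "Q102", 45.0, 150, 320, 400),
    ("P002", "Q103", 88.0, 200, 205, 210),
]

MIN_IDENTITY = 80.0   # stringent identity threshold (illustrative value)
MIN_COVERAGE = 0.90   # alignment must cover this fraction of BOTH sequences

def passes(ident, aln, qlen, slen):
    # Requiring coverage on both sequences guards against multi-domain false
    # transfers: a single shared domain cannot cover most of a longer protein.
    return (ident >= MIN_IDENTITY
            and aln / qlen >= MIN_COVERAGE
            and aln / slen >= MIN_COVERAGE)

edges = [(q, s) for q, s, i, a, ql, sl in hits if passes(i, a, ql, sl)]
print(edges)   # only near-full-length, high-identity pairs become cluster edges
```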