952 results for Scientific Data Visualisation
Abstract:
This paper aims to cast light on the dynamics of knowledge networks in developing countries by analyzing the scientific production of the largest university in the Northeast of Brazil and its influence on some of the other regional research institutions in the state of Bahia. As a test of the methodology to be employed in a larger project, the scientific production of the Universidade Federal da Bahia (UFBA) (Federal University of Bahia), the Universidade do Estado da Bahia (Uneb) (State University of Bahia), and the Universidade Estadual de Santa Cruz (Uesc) (Santa Cruz State University) is examined in one of their traditionally most expressive fields of academic production, namely chemistry. Social network analysis of co-authorship networks is used to investigate the existence of small-world phenomena and their importance for research performance at these three universities. The results obtained so far bring to light data of considerable interest concerning scientific production in unconsolidated research universities. They show the substantial participation of the UFBA network in the composition of the research networks of the other two public universities, indicate a possible occurrence of small-world phenomena in the UFBA and Uesc networks, and underline the importance of individual researchers in consolidating research networks at peripheral universities. The article also suggests that the methodology employed is adequate insofar as scientific production may be used as a proxy for scientific knowledge.
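To illustrate the kind of small-world test described above, here is a minimal sketch using the networkx library on a hypothetical co-authorship edge list; the researcher names are placeholders, not data from the study. A network is considered small-world when its clustering is much higher than that of a random graph of the same size while its average path length remains comparably short.

```python
# Minimal small-world check on a toy co-authorship network.
# Names and edges are illustrative placeholders, not study data.
import networkx as nx

# Each edge links two researchers who co-authored at least one paper.
coauthorships = [
    ("A. Silva", "B. Santos"),
    ("A. Silva", "C. Souza"),
    ("B. Santos", "C. Souza"),
    ("C. Souza", "D. Lima"),
    ("D. Lima", "E. Costa"),
]
G = nx.Graph(coauthorships)

# Small-world networks combine high clustering with short average
# path lengths; compare both against a same-size random graph.
C = nx.average_clustering(G)
L = nx.average_shortest_path_length(G)

R = nx.gnm_random_graph(G.number_of_nodes(), G.number_of_edges(), seed=42)
C_rand = nx.average_clustering(R)
# The random graph may be disconnected, so measure its largest component.
largest = R.subgraph(max(nx.connected_components(R), key=len))
L_rand = nx.average_shortest_path_length(largest)

print(f"clustering: {C:.3f} (random: {C_rand:.3f})")
print(f"avg path length: {L:.3f} (random: {L_rand:.3f})")
```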
Abstract:
The paper discusses the difficulties in judging the quality of scientific manuscripts and describes some common pitfalls that should be avoided when preparing a paper for submission to a peer-reviewed journal. Peer review is an imperfect system, with less than optimal reliability and uncertain validity. However, as it is likely to remain the principal process for screening papers for publication, authors should avoid some common mistakes when preparing a report based on empirical findings of human research. Among these are: excessively long abstracts, extensive use of abbreviations, failure to report results of parsimonious data analyses, and misinterpretation of statistical associations identified in observational studies as causal. Another common problem in many manuscripts is their excessive length, which makes them more difficult to evaluate and, if published, to read. The evaluation of papers after publication with a view to their inclusion in a systematic review is also discussed, and the limitations of the impact factor as a criterion for judging the quality of a paper are reviewed.
Abstract:
Data analytic applications are characterized by large data sets that are subject to a series of processing phases. Some of these phases are executed sequentially, but others can be executed concurrently or in parallel on clusters, grids, or clouds. The MapReduce programming model has been applied to process large data sets in cluster and cloud environments. To develop an application using MapReduce, one needs to install, configure, and access specific frameworks such as Apache Hadoop or Elastic MapReduce in the Amazon Cloud. It would be desirable to have more flexibility in adjusting such configurations to the application's characteristics. Furthermore, composing the multiple phases of a data analytic application requires the specification of all the phases and their orchestration. The original MapReduce model and environment lack flexible support for such configuration and composition. Recognizing that scientific workflows have been successfully applied to modeling complex applications, this paper describes our experiments on implementing MapReduce as subworkflows in the AWARD framework (Autonomic Workflow Activities Reconfigurable and Dynamic). A text mining data analytic application is modeled as a complex workflow with multiple phases, where individual workflow nodes support MapReduce computations. As in typical MapReduce environments, the end user only needs to define the application algorithms for input data processing and for the map and reduce functions. We present experimental results from using the AWARD framework to execute MapReduce workflows deployed over multiple Amazon EC2 (Elastic Compute Cloud) instances.
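As a sketch of the division of labour described above, the user-defined parts of a word-count text mining job reduce to a map function and a reduce function; the sequential runner below is a toy stand-in for the framework's shuffle and orchestration, and the function names are illustrative, not the AWARD API.

```python
# Toy MapReduce word count: only map_fn and reduce_fn are the
# user-supplied parts; run_mapreduce mimics the framework's
# shuffle-and-reduce orchestration sequentially.
from collections import defaultdict
from itertools import chain

def map_fn(line):
    # Emit (word, 1) pairs for each word in one input line.
    for word in line.lower().split():
        yield word, 1

def reduce_fn(word, counts):
    # Sum the partial counts gathered for one word.
    return word, sum(counts)

def run_mapreduce(lines):
    # Shuffle phase: group intermediate pairs by key.
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map_fn(l) for l in lines):
        groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())

print(run_mapreduce(["to be or not to be", "to see or not"]))
# {'to': 3, 'be': 2, 'or': 2, 'not': 2, 'see': 1}
```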
Abstract:
Workflows have been successfully applied to express the decomposition of complex scientific applications. However, existing tools still lack adequate support for several important aspects, namely decoupling the enactment engine from the task specification, decentralizing the control of workflow activities so that their tasks can run on distributed infrastructures, and supporting dynamic workflow reconfiguration. We present the AWARD (Autonomic Workflow Activities Reconfigurable and Dynamic) model of computation, based on Process Networks, in which the workflow activities (AWAs) are autonomic processes with independent control that can run in parallel on distributed infrastructures. Each AWA executes a task developed as a Java class with a generic interface, allowing end users to code their applications without low-level details. The data-driven coordination of AWA interactions is based on a shared tuple space that also enables dynamic workflow reconfiguration. For evaluation, we describe experimental results of AWARD workflow executions in several application scenarios, mapped to the Amazon EC2 (Elastic Compute Cloud).
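AWARD tasks are written as Java classes, but the coordination idea can be sketched in a few lines of Python: two independent activities that synchronize only through tuples placed in a shared space. This is a conceptual sketch of tuple-space coordination in general, assuming a single shared queue as the space; it is not the AWARD interface.

```python
# Conceptual sketch of data-driven coordination through a shared
# tuple space (not the AWARD API): a downstream activity blocks
# until a tuple carrying its input appears.
import queue
import threading

tuple_space = queue.Queue()  # stands in for the shared tuple space

def producer_activity():
    # An autonomous activity publishing its result as a tuple.
    tuple_space.put(("tokens", ["alpha", "beta"]))

def consumer_activity():
    # A downstream activity driven purely by the arrival of data.
    tag, payload = tuple_space.get()  # blocks until a tuple is available
    print(f"consumed {tag}: {payload}")

threads = [threading.Thread(target=f)
           for f in (producer_activity, consumer_activity)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```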
Abstract:
Harnessing the idle CPU cycles, storage space, and other resources of networked computers for collaborative work is a central focus of all major grid computing research projects. Most university computer labs are now equipped with powerful desktop PCs, yet most of the time these machines sit idle, their computing power wasted. For complex problems and for analyzing very large amounts of data, however, substantial computational resources are required. One may run such analysis algorithms on very powerful and expensive computers, which reduces the number of users who can afford such data analysis tasks. Instead of using single expensive machines, distributed computing systems offer the possibility of using a set of much less expensive machines to do the same task. The BOINC and Condor projects have been successfully used to support real scientific research around the world at low cost. The main goal of this work is to explore both distributed computing platforms, Condor and BOINC, and to use them to harness idle PC resources for academic researchers to apply in their own work. In this thesis, data mining tasks were performed by implementing several machine learning algorithms on the distributed computing environment.
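As a hedged illustration of the kind of job such a system distributes, the script below is what one Condor or BOINC work unit might execute on an idle lab PC: train one machine learning model on one shard of the data and write the result back. The sample data set, file naming, and command-line convention are assumptions for the sketch, not the thesis's actual setup.

```python
# Hypothetical worker script for one distributed work unit:
# train a classifier on this node's slice of the data set and
# serialize the model for collection by the submit machine.
import sys
import pickle

from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier

def main(shard_id: int, n_shards: int) -> None:
    # Each node trains on its own interleaved slice of the data.
    X, y = load_digits(return_X_y=True)
    X_shard, y_shard = X[shard_id::n_shards], y[shard_id::n_shards]

    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_shard, y_shard)

    # The serialized model is shipped back to the submit machine.
    with open(f"model_{shard_id}.pkl", "wb") as f:
        pickle.dump(model, f)

if __name__ == "__main__":
    main(int(sys.argv[1]), int(sys.argv[2]))
```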
Abstract:
Dissertation submitted to obtain the Degree of Master in Informatics Engineering
Abstract:
The ethical aspects of Brazilian publications about human Chagas disease (CD) produced between 1996 and 2010 and the policies adopted by Brazilian medical journals were analyzed. Articles were selected from the SciELO Brazil database, and the evaluation of ethical aspects was based on the normative contents about ethics in research involving human experimentation according to Brazilian National Health Council Resolution no. 196/1996. The editorial policies stated in the "Instructions to authors" sections were also analyzed. In the period 1996-2012, 58.9% of articles involving human Chagas disease did not refer to the fulfillment of the ethical requirements concerning research with human beings. In 80% of the journals, the requirement and confirmation of information about ethical aspects in studies of human CD were not observed. Although failures of this kind are still observed, awareness has been raised among federal agencies, educational and research institutions, and publishing groups of the need to standardize the procedures and ethical requirements for Brazilian journals, reinforcing compliance with the ethical parameters of NHC Resolution no. 196/1996.
Abstract:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Abstract:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Abstract:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Abstract:
INTRODUCTION: With the ease provided by current computational programs, medical and scientific journals use bar graphs to describe continuous data. METHODS: This manuscript discusses the inadequacy of bar graphs for presenting continuous data. RESULTS: Simulated data show that box plots and dot plots are more feasible tools for describing continuous data. CONCLUSIONS: These plots are preferable for representing continuous variables, since they effectively describe the range, shape, and variability of the observations and clearly identify outliers. By contrast, bar graphs address only measures of central tendency. Bar graphs should be used only to describe qualitative data.
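A minimal sketch of the comparison the paper makes, using simulated skewed data with matplotlib: the same sample drawn as a box plot and as a dot plot, both of which expose the spread and outliers that a single bar of the mean would hide.

```python
# Simulated skewed sample shown as a box plot and a dot plot,
# the two alternatives to bar graphs recommended above.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
sample = rng.lognormal(mean=0.0, sigma=0.75, size=40)  # skewed data

fig, (ax_box, ax_dot) = plt.subplots(1, 2, figsize=(8, 3))

ax_box.boxplot(sample)
ax_box.set_title("Box plot: median, quartiles, outliers")

# Dot plot: every observation, with slight horizontal jitter.
jitter = rng.uniform(-0.05, 0.05, size=sample.size)
ax_dot.scatter(1 + jitter, sample, alpha=0.6)
ax_dot.set_xlim(0.5, 1.5)
ax_dot.set_title("Dot plot: every observation")

plt.tight_layout()
plt.show()
```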
Abstract:
Transcriptional Regulatory Networks (TRNs) are a powerful tool for representing several of the interactions that occur within a cell. Recent studies have provided information to help researchers in the tasks of building and understanding these networks. One of the major sources of information for building TRNs is the biomedical literature. However, due to the rapidly increasing number of scientific papers, it is quite difficult to analyse the large number of papers that have been published on this subject. This fact has heightened the importance of Biomedical Text Mining approaches to this task. Also, owing to the lack of adequate standards, as the number of databases increases, inconsistencies concerning gene and protein names and identifiers are common. In this work, we developed an integrated approach for the reconstruction of TRNs that retrieves the relevant information from important biological databases and inserts it into a single repository, named KREN. We also applied text mining techniques over this integrated repository to build TRNs. To do so, it was necessary to create a dictionary of names and synonyms associated with these entities, and to develop an approach that retrieves all the abstracts of the related scientific papers stored in PubMed, in order to create a corpus of data about genes. Furthermore, these tasks were integrated into @Note, a software system that provides methods from the Biomedical Text Mining field, including algorithms for Named Entity Recognition (NER), extraction of all relevant terms from publication abstracts, and extraction of relationships between biological entities (genes, proteins, and transcription factors). Finally, this tool was extended to allow the reconstruction of Transcriptional Regulatory Networks from the scientific literature.
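The abstract-retrieval step can be sketched with Biopython's Entrez interface to PubMed; the query string and e-mail address below are placeholders for illustration, not the @Note configuration.

```python
# Sketch of corpus building: search PubMed for papers on a topic
# and fetch their abstracts as plain text via NCBI Entrez.
from Bio import Entrez

Entrez.email = "researcher@example.org"  # required by NCBI; placeholder

# Find papers mentioning a regulatory topic of interest.
handle = Entrez.esearch(db="pubmed",
                        term="transcription factor AND Escherichia coli",
                        retmax=5)
ids = Entrez.read(handle)["IdList"]
handle.close()

# Fetch the corresponding abstracts as plain text for the corpus.
handle = Entrez.efetch(db="pubmed", id=",".join(ids),
                       rettype="abstract", retmode="text")
print(handle.read())
handle.close()
```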
Abstract:
Integrated master's dissertation in Information Systems Engineering and Management
Abstract:
Currently, the quality of the Indonesian national road network is inadequate due to several constraints, including overcapacity and overloaded trucks. The high deterioration rate of road infrastructure in developing countries, along with major budgetary restrictions and high growth in traffic, has led to an emerging need to improve the performance of highway maintenance systems. However, the large number of intervening factors and their complex effects require advanced tools to solve this problem successfully. The high learning capabilities of Data Mining (DM) make it a powerful solution. In the past, these tools have been successfully applied to complex and multi-dimensional problems in various scientific fields. Therefore, it is expected that DM can be used to analyze the large amount of pavement and traffic data, identify the relationships between variables, and provide predictions from the data. In this paper, we present a new approach to predicting the International Roughness Index (IRI) of pavement based on DM techniques. DM was used to analyze the initial IRI data, including age, Equivalent Single Axle Load (ESAL), cracking, potholes, rutting, and long cracks. This model was developed and verified using data from the Integrated Indonesia Road Management System (IIRMS), measured with the National Association of Australian State Road Authorities (NAASRA) roughness meter. The results of the proposed approach are compared with the IIRMS analytical model adapted to the IRI, and the advantages of the new approach are highlighted. We show that the novel data-driven model is able to learn, with high accuracy, the complex relationships between the IRI and the contributing factors of overloaded trucks.
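As an illustration of such a data-driven model, here is a hedged scikit-learn sketch; the column names mirror the factors listed above, but the CSV layout and the choice of a random forest regressor are assumptions for the example, not the paper's exact method.

```python
# Sketch of a data-driven IRI predictor: fit a regressor on pavement
# condition factors and score it on held-out road sections.
# "pavement_sections.csv" and its columns are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("pavement_sections.csv")  # hypothetical IIRMS export
features = ["age", "esal", "crack", "potholes", "rutting", "long_cracks"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["iri"], test_size=0.25, random_state=0)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
print("R^2 on held-out sections:", r2_score(y_test, model.predict(X_test)))
```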
Abstract:
Integrated master's dissertation in Information Systems Engineering and Management