971 results for Linked Data
Abstract:
Aflatoxin B1 (AFB1) is considered by several international agencies to be genotoxic and a potent hepatocarcinogen. However, although the fungi that produce this compound are detected in some work environments, AFB1 is rarely monitored in occupational settings. The aim of the present investigation was to assess the exposure to AFB1 of workers from one Portuguese waste company located on the outskirts of Lisbon. Occupational exposure to AFB1 was assessed with a biomarker of internal dose that measures AFB1 in serum by enzyme-linked immunosorbent assay. Forty-one workers from the waste company were enrolled in this study (26 from sorting; 9 from composting; 6 from incineration). A control group (n = 30) was also considered in order to determine the AFB1 background levels for the Portuguese population. All the workers showed detectable levels of AFB1, with values ranging from 2.5 ng ml−1 to 25.9 ng ml−1 and a median value of 9.9±5.4 ng ml−1. All of the controls showed values below the method’s detection limit. The values obtained were much higher (8-fold higher) than those found in other Portuguese settings already studied, such as poultry and swine production. Besides this mycotoxin, other mycotoxins are probably present in this occupational setting, and this aspect should be taken into consideration in the risk assessment process because of possible synergistic reactions. The data obtained suggest that exposure to AFB1 occurs in a waste management setting and call attention to the need to apply preventive and protective safety measures.
Abstract:
Seismic data are difficult to analyze, and classical mathematical tools reveal strong limitations in exposing hidden relationships between earthquakes. In this paper, we study earthquake phenomena from the perspective of complex systems. Global seismic data covering the period from 1962 to 2011 are analyzed. The events, characterized by their magnitude, geographic location and time of occurrence, are divided into groups, either according to the Flinn-Engdahl (F-E) seismic regions of Earth or using a rectangular grid based on latitude and longitude coordinates. Two methods of analysis are considered and compared in this study. In the first method, the distributions of magnitudes are approximated by Gutenberg-Richter (G-R) distributions, and the fitted parameters are used to reveal the relationships among regions. In the second method, the mutual information is calculated and adopted as a measure of similarity between regions. In both cases, using clustering analysis, visualization maps are generated, providing an intuitive and useful representation of the complex relationships that are present among seismic data. Such relationships might not be perceived on classical geographic maps. Therefore, the generated charts are a valid alternative to other visualization tools for understanding the global behavior of earthquakes.
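The two per-region quantities named in the abstract above can be illustrated with a minimal Python sketch. The abstract does not say how the G-R parameters or the mutual information were estimated, so the choices here are assumptions: the b-value uses the Aki maximum-likelihood estimator for continuous magnitudes above a cutoff `m_min`, and the mutual information is computed from paired discrete sequences (for example, binned event counts of two regions). The function names are hypothetical.

```python
import math
from collections import Counter


def gutenberg_richter_b(magnitudes, m_min):
    """Aki maximum-likelihood estimate of the G-R b-value.

    Assumes log10 N(M) = a - b*M holds for magnitudes >= m_min
    and that magnitudes are continuous (no binning correction).
    """
    sample = [m for m in magnitudes if m >= m_min]
    if not sample:
        raise ValueError("no magnitudes at or above m_min")
    mean_excess = sum(sample) / len(sample) - m_min
    if mean_excess <= 0:
        raise ValueError("need at least one magnitude above m_min")
    return math.log10(math.e) / mean_excess


def mutual_information(xs, ys):
    """Mutual information (in nats) of two paired discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi
```

Either quantity could then feed a distance matrix for the clustering step; how the study actually builds that matrix is not stated in the abstract.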
Abstract:
Learning and teaching processes, like all human activities, can be mediated through the use of tools. Information and communication technologies are now widespread within education. Their use in the daily life of teachers and learners affords engagement with educational activities at any place and time and not necessarily linked to an institution or a certificate. In the absence of formal certification, learning under these circumstances is known as informal learning. Despite the lack of certification, learning with technology in this way presents opportunities to gather information about and present new ways of exploiting an individual’s learning. Cloud technologies provide ways to achieve this through new architectures, methodologies, and workflows that facilitate semantic tagging, recognition, and acknowledgment of informal learning activities. The transparency and accessibility of cloud services mean that institutions and learners can exploit existing knowledge to their mutual benefit. The TRAILER project facilitates this aim by providing a technological framework using cloud services, a workflow, and a methodology. The services facilitate the exchange of information and knowledge associated with informal learning activities ranging from the use of social software through widgets, computer gaming, and remote laboratory experiments. Data from these activities are shared among institutions, learners, and workers. The project demonstrates the possibility of gathering information related to informal learning activities independently of the context or tools used to carry them out.
Abstract:
Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches having focused on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), where a set of latent variables indicates the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data from official statistics shows its usefulness.
Abstract:
Research on cluster analysis for categorical data continues to develop, with new clustering algorithms being proposed. However, in this context, the determination of the number of clusters is rarely addressed. We propose a new approach in which clustering and the estimation of the number of clusters are done simultaneously for categorical data. We assume that the data originate from a finite mixture of multinomial distributions and use a minimum message length (MML) criterion to select the number of clusters (Wallace and Boulton, 1968). For this purpose, we implement an EM-type algorithm (Silvestre et al., 2008) based on the approach of Figueiredo and Jain (2002). The novelty of the approach rests on the integration of the model estimation and the selection of the number of clusters in a single algorithm, rather than selecting this number based on a set of pre-estimated candidate models. The performance of our approach is compared with the use of the Bayesian Information Criterion (BIC) (Schwarz, 1978) and the Integrated Completed Likelihood (ICL) (Biernacki et al., 2000) using synthetic data. The obtained results illustrate the capacity of the proposed algorithm to attain the true number of clusters while being faster than BIC and ICL, which is especially relevant when dealing with large data sets.
Abstract:
Cluster analysis for categorical data has been an active area of research. A well-known problem in this area is the determination of the number of clusters, which is unknown and must be inferred from the data. In order to estimate the number of clusters, one often resorts to information criteria, such as BIC (Bayesian information criterion), MML (minimum message length, proposed by Wallace and Boulton, 1968), and ICL (integrated classification likelihood). In this work, we adopt the approach developed by Figueiredo and Jain (2002) for clustering continuous data. They use an MML criterion to select the number of clusters and a variant of the EM algorithm to estimate the model parameters. This EM variant seamlessly integrates model estimation and selection in a single algorithm. For clustering categorical data, we assume a finite mixture of multinomial distributions and implement a new EM algorithm, following a previous version (Silvestre et al., 2008). Results obtained with synthetic datasets are encouraging. The main advantage of the proposed approach, when compared to the criteria mentioned above, is the speed of execution, which is especially relevant when dealing with large data sets.
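As a point of reference for the mixture model mentioned above, the following is a minimal Python sketch of plain EM for a finite mixture of categorical (multinomial, one trial per feature) distributions with a fixed number of components `k`. It is not the MML-based variant of Figueiredo and Jain that the abstract describes: there is no component annihilation and no selection of the number of clusters. The function name, the tiny smoothing constant and the random initialisation are assumptions made for illustration.

```python
import math
import random


def em_categorical_mixture(data, n_categories, k, n_iter=50, seed=0):
    """Plain EM for a mixture of independent categorical features.

    data: list of rows, each a list of integer category codes.
    n_categories: number of categories of each feature.
    Returns (weights, theta, history), where theta[c][j][v] is the
    probability of value v for feature j in component c and history
    holds the log-likelihood computed at each E-step.
    """
    rng = random.Random(seed)
    n, d = len(data), len(n_categories)
    weights = [1.0 / k] * k
    # Random initial category probabilities, normalised per feature.
    theta = []
    for _ in range(k):
        comp = []
        for j in range(d):
            raw = [rng.random() + 0.1 for _ in range(n_categories[j])]
            total = sum(raw)
            comp.append([r / total for r in raw])
        theta.append(comp)

    history = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each row.
        resp = []
        loglik = 0.0
        for row in data:
            joint = []
            for c in range(k):
                p = weights[c]
                for j, v in enumerate(row):
                    p *= theta[c][j][v]
                joint.append(p)
            total = sum(joint)
            loglik += math.log(total)
            resp.append([p / total for p in joint])
        history.append(loglik)
        # M-step: re-estimate mixing weights and category probabilities.
        for c in range(k):
            weights[c] = sum(r[c] for r in resp) / n
            for j in range(d):
                # Tiny pseudo-count (an assumption) keeps probabilities > 0.
                counts = [1e-9] * n_categories[j]
                for row, r in zip(data, resp):
                    counts[row[j]] += r[c]
                total = sum(counts)
                theta[c][j] = [x / total for x in counts]
    return weights, theta, history
```

With a criterion such as BIC, a routine like this would be run once per candidate `k` and the fits compared afterwards; the approach in the abstract avoids that repeated fitting by selecting the number of clusters inside a single EM run.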
Abstract:
Dissertation submitted for the degree of Master in Mathematics Education in Pre-School Education and in the 1st and 2nd Cycles of Basic Education
Abstract:
Consider the problem of disseminating data from an arbitrary source node to all other nodes in a distributed computer system, such as a Wireless Sensor Network (WSN). We assume that wireless broadcast is used and that nodes do not know the topology. We propose new protocols which disseminate data faster and use fewer broadcasts than the simple broadcast protocol.
Abstract:
Glass fibre-reinforced plastics (GFRP), nowadays commonly used in the construction, transportation and automobile sectors, have been considered inherently difficult to recycle due to both the cross-linked nature of thermoset resins, which cannot be remoulded, and the complex composition of the composite itself, which includes glass fibres, polymer matrix and different types of inorganic fillers. Hence, to date, most of the thermoset based GFRP waste is being incinerated or landfilled leading to negative environmental impacts and additional costs to producers and suppliers. With an increasing awareness of environmental matters and the subsequent desire to save resources, recycling would convert an expensive waste disposal into a profitable reusable material. In this study, the effect of the incorporation of mechanically recycled GFRP pultrusion wastes on flexural and compressive behaviour of polyester polymer mortars (PM) was assessed. For this purpose, different contents of GFRP recyclates (0%, 4%, 8% and 12%, w/w), with distinct size grades (coarse fibrous mixture and fine powdered mixture), were incorporated into polyester PM as sand aggregates and filler replacements. The effect of the incorporation of a silane coupling agent was also assessed. Experimental results revealed that GFRP waste filled polymer mortars show improved mechanical behaviour over unmodified polyester based mortars, thus indicating the feasibility of GFRP waste reuse as raw material in concrete-polymer composites.
Abstract:
Nowadays, due to the remarkable growth of the mobile device market, the limitations of mobile devices must be considered when implementing client-server applications. In this paper we discuss which is the most reliable and fastest way to exchange information between a server and an Android mobile application. This is an important issue because a responsive application makes the user experience more enjoyable. We present a study that tests and evaluates two data transfer protocols (socket and HTTP) and three data serialization formats (XML, JSON and Protocol Buffers), using different environments and mobile devices, to determine which is the most practical and fastest to use.
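One dimension of such a comparison, payload size, can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not the benchmark used in the study: the record, its field names and the helper functions are hypothetical, Protocol Buffers is left out because it needs a third-party package and a schema, and transfer time over socket or HTTP is not measured.

```python
import json
import xml.etree.ElementTree as ET


def to_json_bytes(record):
    """Encode a flat dict as compact UTF-8 JSON."""
    return json.dumps(record, separators=(",", ":")).encode("utf-8")


def to_xml_bytes(record, root_tag="record"):
    """Encode a flat dict as XML, one child element per key."""
    root = ET.Element(root_tag)
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="utf-8")


# Hypothetical sample record; real payloads would come from the application.
sample = {"id": 42, "name": "sensor-a", "value": 3.14}
json_payload = to_json_bytes(sample)
xml_payload = to_xml_bytes(sample)
print(len(json_payload), len(xml_payload))
```

For a flat record like this one, the XML encoding repeats every field name in an opening and a closing tag, which is the main reason its payload comes out larger than the JSON one.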
Abstract:
Scientific dissertation submitted for the degree of Master in Civil Engineering, specialization in Buildings
Abstract:
Final Master's project submitted for the degree of Master in Mechanical Engineering
Abstract:
The goal of this paper is to show that the DGPS data Internet service we designed and developed provides campus-wide real-time access to Differential GPS (DGPS) data and thus supports precise outdoor navigation. First, we describe the developed distributed system in terms of architecture (a three-tier client/server application), services provided (real-time DGPS data transport from remote DGPS sources and campus-wide data dissemination) and transmission modes implemented (raw and frame mode over TCP and UDP). Then we present and discuss the results obtained and, finally, we draw some conclusions.
Abstract:
OBJECTIVE: The objective of this study was to evaluate whether adolescent pregnancy is a risk factor for low birth weight (LBW) babies. METHODS: This was a cross-sectional study of mothers and their newborns from a birth cohort in Aracaju, Northeastern Brazil. Data were collected consecutively from March to July 2005. Information collected included socioeconomic, biological and reproductive aspects of the mothers, using a standardized questionnaire. The impact of early pregnancy on birth weight was evaluated by multiple logistic regression. RESULTS: We studied 4,746 pairs of mothers and their babies. Of these, 20.6% were adolescents (< 20 years of age). Adolescent mothers had worse socioeconomic and reproductive conditions and perinatal outcomes when compared to other age groups. Having no prenatal care and smoking during pregnancy were the risk factors associated with low birth weight. Adolescent pregnancy, when linked to marital status "without partner", was associated with an increased proportion of low birth weight babies. CONCLUSIONS: Adolescence was a risk factor for LBW only for mothers without partners. Smoking during pregnancy and lack of prenatal care were considered to be independent risk factors for LBW.
Abstract:
The principal topic of this work is the application of data mining techniques, in particular machine learning, to the discovery of knowledge in a protein database. The first chapter presents general background. Section 1.1 gives an overview of the methodology of a data mining project and its main algorithms. Section 1.2 introduces proteins and their supporting file formats. The chapter concludes with section 1.3, which defines the main problem we intend to address in this work: determining whether an amino acid is exposed or buried in a protein, in a discrete (i.e., not continuous) way, for five exposure levels: 2%, 10%, 20%, 25% and 30%. The second chapter, following the CRISP-DM methodology closely, presents the whole process of constructing the database that supported this work. Namely, it describes the process of loading data from the Protein Data Bank, DSSP and SCOP. An initial data exploration is then performed, and a simple prediction model (baseline) of the relative solvent accessibility of an amino acid is introduced. The Data Mining Table Creator, a program developed to produce the data mining tables required for this problem, is also introduced. In the third chapter, the results obtained are analyzed with statistical significance tests. First, the classifiers used (Neural Networks, C5.0, CART and CHAID) are compared, and it is concluded that C5.0 is the most suitable for the problem at hand. The influence of parameters such as the amino acid information level, the amino acid window size and the SCOP class type on the accuracy of the predictive models is also compared. The fourth chapter starts with a brief review of the literature on amino acid relative solvent accessibility. We then summarize the main results achieved and finally discuss possible future work. The fifth and last chapter consists of appendices. Appendix A has the schema of the database that supported this thesis. Appendix B has a set of tables with additional information. Appendix C describes the software provided on the DVD accompanying this thesis, which allows the present work to be reconstructed.
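The discrete exposed/buried target described in the abstract above can be sketched in Python as follows. The five thresholds come from the abstract; everything else is an assumption made for illustration: the function names, the use of a strict `>` comparison, and the fact that the per-residue maximum accessibility is passed in by the caller rather than looked up from a reference table.

```python
# Exposure thresholds from the abstract, expressed as fractions.
THRESHOLDS = (0.02, 0.10, 0.20, 0.25, 0.30)


def relative_accessibility(absolute_asa, max_asa):
    """Relative solvent accessibility: absolute value over a reference maximum.

    max_asa is supplied by the caller (assumption); it depends on the
    residue type and on the reference scale chosen.
    """
    if max_asa <= 0:
        raise ValueError("max_asa must be positive")
    return absolute_asa / max_asa


def burial_labels(rsa, thresholds=THRESHOLDS):
    """Label a residue as exposed or buried at each threshold.

    A residue is called exposed when its relative accessibility is
    strictly above the threshold (assumed convention).
    """
    return {t: ("exposed" if rsa > t else "buried") for t in thresholds}
```

Each threshold yields its own binary classification problem, so a single residue can be exposed at the 20% level and buried at the 25% level.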