832 resultados para databases and data mining
Resumo:
Data mining, as a heatedly discussed term, has been studied in various fields. Its possibilities in refining the decision-making process, realizing potential patterns and creating valuable knowledge have won attention of scholars and practitioners. However, there are less studies intending to combine data mining and libraries where data generation occurs all the time. Therefore, this thesis plans to fill such a gap. Meanwhile, potential opportunities created by data mining are explored to enhance one of the most important elements of libraries: reference service. In order to thoroughly demonstrate the feasibility and applicability of data mining, literature is reviewed to establish a critical understanding of data mining in libraries and attain the current status of library reference service. The result of the literature review indicates that free online data resources other than data generated on social media are rarely considered to be applied in current library data mining mandates. Therefore, the result of the literature review motivates the presented study to utilize online free resources. Furthermore, the natural match between data mining and libraries is established. The natural match is explained by emphasizing the data richness reality and considering data mining as one kind of knowledge, an easy choice for libraries, and a wise method to overcome reference service challenges. The natural match, especially the aspect that data mining could be helpful for library reference service, lays the main theoretical foundation for the empirical work in this study. Turku Main Library was selected as the case to answer the research question: whether data mining is feasible and applicable for reference service improvement. In this case, the daily visit from 2009 to 2015 in Turku Main Library is considered as the resource for data mining. In addition, corresponding weather conditions are collected from Weather Underground, which is totally free online. Before officially being analyzed, the collected dataset is cleansed and preprocessed in order to ensure the quality of data mining. Multiple regression analysis is employed to mine the final dataset. Hourly visits are the independent variable and weather conditions, Discomfort Index and seven days in a week are dependent variables. In the end, four models in different seasons are established to predict visiting situations in each season. Patterns are realized in different seasons and implications are created based on the discovered patterns. In addition, library-climate points are generated by a clustering method, which simplifies the process for librarians using weather data to forecast library visiting situation. Then the data mining result is interpreted from the perspective of improving reference service. After this data mining work, the result of the case study is presented to librarians so as to collect professional opinions regarding the possibility of employing data mining to improve reference services. In the end, positive opinions are collected, which implies that it is feasible to utilizing data mining as a tool to enhance library reference service.
Resumo:
When it comes to information sets in real life, often pieces of the whole set may not be available. This problem can find its origin in various reasons, describing therefore different patterns. In the literature, this problem is known as Missing Data. This issue can be fixed in various ways, from not taking into consideration incomplete observations, to guessing what those values originally were, or just ignoring the fact that some values are missing. The methods used to estimate missing data are called Imputation Methods. The work presented in this thesis has two main goals. The first one is to determine whether any kind of interactions exists between Missing Data, Imputation Methods and Supervised Classification algorithms, when they are applied together. For this first problem we consider a scenario in which the databases used are discrete, understanding discrete as that it is assumed that there is no relation between observations. These datasets underwent processes involving different combina- tions of the three components mentioned. The outcome showed that the missing data pattern strongly influences the outcome produced by a classifier. Also, in some of the cases, the complex imputation techniques investigated in the thesis were able to obtain better results than simple ones. The second goal of this work is to propose a new imputation strategy, but this time we constrain the specifications of the previous problem to a special kind of datasets, the multivariate Time Series. We designed new imputation techniques for this particular domain, and combined them with some of the contrasted strategies tested in the pre- vious chapter of this thesis. The time series also were subjected to processes involving missing data and imputation to finally propose an overall better imputation method. In the final chapter of this work, a real-world example is presented, describing a wa- ter quality prediction problem. The databases that characterized this problem had their own original latent values, which provides a real-world benchmark to test the algorithms developed in this thesis.
3D Surveying and Data Management towards the Realization of a Knowledge System for Cultural Heritage
Resumo:
The research activities involved the application of the Geomatic techniques in the Cultural Heritage field, following the development of two themes: Firstly, the application of high precision surveying techniques for the restoration and interpretation of relevant monuments and archaeological finds. The main case regards the activities for the generation of a high-fidelity 3D model of the Fountain of Neptune in Bologna. In this work, aimed to the restoration of the manufacture, both the geometrical and radiometrical aspects were crucial. The final product was the base of a 3D information system representing a shared tool where the different figures involved in the restoration activities shared their contribution in a multidisciplinary approach. Secondly, the arrangement of 3D databases for a Building Information Modeling (BIM) approach, in a process which involves the generation and management of digital representations of physical and functional characteristics of historical buildings, towards a so-called Historical Building Information Model (HBIM). A first application was conducted for the San Michele in Acerboli’s church in Santarcangelo di Romagna. The survey was performed by the integration of the classical and modern Geomatic techniques and the point cloud representing the church was used for the development of a HBIM model, where the relevant information connected to the building could be stored and georeferenced. A second application regards the domus of Obellio Firmo in Pompeii, surveyed by the integration of the classical and modern Geomatic techniques. An historical analysis permitted the definitions of phases and the organization of a database of materials and constructive elements. The goal is the obtaining of a federate model able to manage the different aspects: documental, analytic and reconstructive ones.
Resumo:
In geophysics and seismology, raw data need to be processed to generate useful information that can be turned into knowledge by researchers. The number of sensors that are acquiring raw data is increasing rapidly. Without good data management systems, more time can be spent in querying and preparing datasets for analyses than in acquiring raw data. Also, a lot of good quality data acquired at great effort can be lost forever if they are not correctly stored. Local and international cooperation will probably be reduced, and a lot of data will never become scientific knowledge. For this reason, the Seismological Laboratory of the Institute of Astronomy, Geophysics and Atmospheric Sciences at the University of Sao Paulo (IAG-USP) has concentrated fully on its data management system. This report describes the efforts of the IAG-USP to set up a seismology data management system to facilitate local and international cooperation.
Resumo:
Maltose-binding protein is the periplasmic component of the ABC transporter responsible for the uptake of maltose/maltodextrins. The Xanthomonas axonopodis pv. citri maltose-binding protein MalE has been crystallized at 293 Kusing the hanging-drop vapour-diffusion method. The crystal belonged to the primitive hexagonal space group P6(1)22, with unit-cell parameters a = 123.59, b = 123.59, c = 304.20 angstrom, and contained two molecules in the asymetric unit. It diffracted to 2.24 angstrom resolution.
Resumo:
Objective: To determine whether coinfection with sexually transmitted diseases (STD) increases HIV shedding in genital-tract secretions, and whether STD treatment reduces this shedding. Design: Systematic review and data synthesis of cross-sectional and cohort studies meeting. predefined quality criteria. Main Outcome Measures: Proportion of patients with and without a STD who had detectable HIV in genital secretions, HIV toad in genital secretions, or change following STD treatment. Results: Of 48 identified studies, three cross-sectional and three cohort studies were included. HIV was detected significantly more frequently in participants infected with Neisseria gonorrhoeae (125 of 309 participants, 41%) than in those without N gonorrhoeae infection (311 of 988 participants, 32%; P = 0.004). HIV was not significantly more frequently detected in persons infected with Chlamydia trachomatis (28 of 67 participants, 42%) than in those without C trachomatis infection (375 of 1149 participants, 33%; P = 0.13). Median HIV load reported in only one study was greater in men with urethritis (12.4 x 10(4) versus 1.51 x 10(4) copies/ml; P = 0.04). In the only cohort study in which this could be fully assessed, treatment of women with any STD reduced the proportion of those with detectable HIV from 39% to 29% (P = 0.05), whereas this proportion remained stable among controls (15-17%), A second cohort study reported fully on HIV load; among men with urethritis, viral load fell from 12.4 to 4.12 x 10(4) copies/ml 2 weeks posttreatment, whereas viral load remained stable in those without urethritis. Conclusion: Few high-quality studies were found. HIV is detected moderately more frequently in genital secretions of men and women with a STD, and HIV load is substantially increased among men with urethritis, Successful STD treatment reduces both of these parameters, but not to control levels. More high-quality studies are needed to explore this important relationship further.
Resumo:
P>The determination of normal parameters is an important procedure in the evaluation of the stomatognathic system. We used the surface electromyography standardization protocol described by Ferrario et al. (J Oral Rehabil. 2000;27:33-40, 2006;33:341) to determine reference values of the electromyographic standardized indices for the assessment of muscular symmetry (left and right side, percentage overlapping coefficient, POC), potential lateral displacing components (unbalanced contractile activities of contralateral masseter and temporalis muscles, TC), relative activity (most prevalent pair of masticatory muscles, ATTIV) and total activity (integrated areas of the electromyographic potentials over time, IMPACT) in healthy Brazilian young adults, and the relevant data reproducibility. Electromyography of the right and left masseter and temporalis muscles was performed during maximum teeth clenching in 20 healthy subjects (10 women and 10 men, mean age 23 years, s.d. 3), free from periodontal problems, temporomandibular disorders, oro-facial myofunctional disorder, and with full permanent dentition (28 teeth at least). Data reproducibility was computed for 75% of the sample. The values obtained were POC Temporal (88 center dot 11 +/- 1 center dot 45%), POC masseter (87 center dot 11 +/- 1 center dot 60%), TC (8 center dot 79 +/- 1 center dot 20%), ATTIV (-0 center dot 33 +/- 9 center dot 65%) and IMPACT (110 center dot 40 +/- 23 center dot 69 mu V/mu V center dot s %). There were no statistical differences between test and retest values (P > 0 center dot 05). The Technical Errors of Measurement (TEM) for 50% of subjects assessed during the same session were 1 center dot 5, 1 center dot 39, 1 center dot 06, 3 center dot 83 and 10 center dot 04. For 25% of the subjects assessed after a 6-month interval, the TEM were 0 center dot 80, 1 center dot 03, 0 center dot 73, 12 center dot 70 and 19 center dot 10. For all indices, there was good reproducibility. These electromyographic indices could be used in the assessment of patients with stomatognathic dysfunction.
Resumo:
The progressive aging of the population requires new kinds of social and medical intervention and the availability of different services provided to the elder population. New applications have been developed and some services are now provided at home, allowing the older people to stay home instead of having to stay in hospitals. But an adequate response to the needs of the users will imply a high percentage of use of personal data and information, including the building up and maintenance of user profiles, feeding the systems with the data and information needed for a proactive intervention in scheduling of events in which the user may be involved. Fundamental Rights may be at stake, so a legal analysis must also be considered.
Resumo:
A gestão do conhecimento abrange toda a forma de gerar, armazenar, distribuir e utilizar o conhecimento, tornando necessária a utilização de tecnologias de informação para facilitar esse processo, devido ao grande aumento no volume de dados. A descoberta de conhecimento em banco de dados é uma metodologia que tenta solucionar esse problema e o data mining é uma técnica que faz parte dessa metodologia. Este artigo desenvolve, aplica e analisa uma ferramenta de data mining, para extrair conhecimento referente à produção científica das pessoas envolvidas com a pesquisa na Universidade Federal de Lavras. A metodologia utilizada envolveu a pesquisa bibliográfica, a pesquisa documental e o método do estudo de caso. As limitações encontradas na análise dos resultados indicam que ainda é preciso padronizar o modo do preenchimento dos currículos Lattes para refinar as análises e, com isso, estabelecer indicadores. A contribuição foi gerar um banco de dados estruturado, que faz parte de um processo maior de desenvolvimento de indicadores de ciência e tecnologia, para auxiliar na elaboração de novas políticas de gestão científica e tecnológica e aperfeiçoamento do sistema de ensino superior brasileiro.
Resumo:
Este trabalho consiste no desenvolvimento de um Sistema de Apoio à Criminologia – SAC, onde se pretende ajudar os detectives/analistas na prevenção proactiva da criminalidade e na gestão dos seus recursos materiais e humanos, bem como impulsionar estudos sobre a alta incidência de determinados tipos de crime numa dada região. Historicamente, a resolução de crimes tem sido uma prerrogativa da justiça penal e dos seus especialistas e, com o aumento da utilização de sistemas computacionais no sistema judicial para registar todos os dados que dizem respeito a ocorrências de crimes, dados de suspeitos e vítimas, registo criminal de indivíduos e outros dados que fluem dentro da organização, cresce a necessidade de transformar estes dados em informação proveitosa no combate à criminalidade. O SAC tira partido de técnicas de extracção de conhecimento de informação e aplica-as a um conjunto de dados de ocorrências de crimes numa dada região e espaço temporal, bem como a um conjunto de variáveis que influenciam a criminalidade, as quais foram estudadas e identificadas neste trabalho. Este trabalho é constituído por um modelo de extracção de conhecimento de informação e por uma aplicação que permite ao utilizador fornecer um conjunto de dados adequado, garantindo a máxima eficácia do modelo.
Resumo:
Controlled fires in forest areas are frequently used in most Mediterranean countries as a preventive technique to avoid severe wildfires in summer season. In Portugal, this forest management method of fuel mass availability is also used and has shown to be beneficial as annual statistical reports confirm that the decrease of wildfires occurrence have a direct relationship with the controlled fire practice. However prescribed fire can have serious side effects in some forest soil properties. This work shows the changes that occurred in some forest soils properties after a prescribed fire action. The experiments were carried out in soil cover over a natural site of Andaluzitic schist, in Gramelas, Caminha, Portugal, that had not been burn for four years. The composed soil samples were collected from five plots at three different layers (0-3cm, 3-6cm and 6-18cm) during a three-year monitoring period after the prescribed burning. Principal Component Analysis was used to reach the presented conclusions.
Resumo:
The industrial activity is inevitably associated with a certain degradation of the environmental quality, because is not possible to guarantee that a manufacturing process can be totally innocuous. The eco-efficiency concept is globally accepted as a philosophy of entreprise management, that encourages the companies to become more competitive, innovative and environmentally responsible by promoting the link between its companies objectives for excellence and its objectives of environmental excellence issues. This link imposes the creation of an organizational methodology where the performance of the company is concordant with the sustainable development. The main propose of this project is to apply the concept of eco-efficiency to the particular case of the metallurgical and metal workshop industries through the development of the particular indicators needed and to produce a manual of procedures for implementation of the accurate solution.
Resumo:
Dissertação apresentada na Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa para obtenção do grau de Mestre em Engenharia Electrotécnica e de Computadores
Resumo:
A thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Systems.