989 results for Topic modeling
Abstract:
This paper addresses the importance of computer-assisted text analysis. It presents the most widely used technique for this kind of analysis: topic modeling. Some of the most commonly used algorithms are outlined and their main goals are described. It also introduces Web Mining for extracting information available on the web, focusing on a particular technique called Web Scraping. The final section describes a case study on the subject of privatization. The study is divided into three phases: the first concerns the search for documents and articles from the newspaper La Repubblica to be analyzed; in the second, the document collection is analyzed using the MALLET software; and in the last step, the topics produced by the program are analyzed and assigned labels that identify the subtopics present in the documents of the collection.
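The labeling step of the third phase can be sketched as follows. This is a minimal illustration, assuming the topic model writes a topic-keys file with one topic per line and tab-separated fields (topic id, weight, top words), which is the layout MALLET normally produces; the sample rows and the labels are hypothetical and are not taken from the study.

```python
def parse_topic_keys(text):
    """Parse topic-keys text into {topic_id: [top words]}.

    Assumed layout per line: <topic id> TAB <weight> TAB <space-separated words>.
    """
    topics = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        topic_id, _weight, words = line.split("\t", 2)
        topics[int(topic_id)] = words.split()
    return topics


def label_topics(topics, labels):
    """Pair each topic with a human-chosen label; unlabeled topics get None."""
    return {tid: (labels.get(tid), words) for tid, words in topics.items()}


# Hypothetical sample rows, for illustration only.
SAMPLE = "0\t0.5\tbank shares market\n1\t0.5\tgovernment law parliament\n"
topics = parse_topic_keys(SAMPLE)
labeled = label_topics(topics, {0: "finance", 1: "politics"})
```

Keeping the labels in a separate mapping leaves the model output untouched, so the analyst can revise labels without re-running the topic model.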
Abstract:
When something unfamiliar emerges, or when something familiar does something unexpected, people need to make sense of what is emerging or going on in order to act. Social representations theory suggests how individuals and society make sense of the unfamiliar and hence how the resultant social representations (SRs) cognitively, emotionally, and actively orient people and enable communication. SRs are social constructions that emerge through individual and collective engagement with media and with everyday conversations among people. Recent developments in text analysis techniques, and in particular topic modeling, provide a potentially powerful analytical method to examine the structure and content of SRs using large samples of narrative or text. In this paper I describe the methods and results of applying topic modeling to 660 micronarratives collected from Australian academics / researchers, government employees, and members of the public in 2010-2011. The narrative fragments focused on adaptation to climate change (CC) and hence provide an example of Australian society making sense of an emerging and conflict-ridden phenomenon. The results of the topic modeling reflect elements of SRs of adaptation to CC that are consistent with findings in the literature and are reasonably robust predictors of classes of action in response to CC. Bayesian Network (BN) modeling was used to identify relationships among the topics (SR elements) and in particular to identify relationships among topics, sentiment, and action. Finally, the resulting model and topic modeling results are used to highlight differences in the salience of SR elements among social groups. Linking topic modeling and BN modeling offers a new and encouraging approach to analysis for ongoing research on SRs.
Abstract:
The combined use of algorithms for topic discovery in document collections with techniques for visualizing how those topics evolve over time enables the exploration of thematic patterns in large corpora through compact visual representations. This research investigated the visualization requirements of data on the thematic composition of documents obtained through topic modeling (data that is sparse and multi-attribute) at different levels of detail, comparatively, through the development of a purpose-built visualization technique and the use of an open-source data visualization library. For the topic-flow visualization problem studied, conflicting visualization requirements were observed at different data resolutions, which led to a detailed investigation of ways to manipulate and display the data. From that investigation, the hypothesis defended was that the integrated use of more than one visualization technique, chosen according to the data resolution, broadens the possibilities for exploring the object under study compared with what a single technique would provide. Showing the limits of these techniques according to the resolution at which the data is explored is the main contribution of this work, intended to support the development of new applications.
Abstract:
Topic 6. Text Mining with Topic Modeling.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-07
Abstract:
The amount of information on the Internet has exploded in recent decades. As more and more news, blogs, and other kinds of articles are published on the Internet, categorization of articles and documents is increasingly desired. Among the approaches to categorizing articles, labeling is one of the most common methods; it provides a relatively intuitive and effective way to separate articles into different categories. However, manual labeling is limited by its efficiency, even though manually selected labels are of relatively high quality. This report explores the topic modeling approach of Online Latent Dirichlet Allocation (Online-LDA). Additionally, a method is implemented that automatically labels articles with their latent topics by combining the Online-LDA posterior with a probabilistic automatic labeling algorithm. The goal of this report is to examine the accuracy of the labels generated automatically by a topic model and a probabilistic relevance algorithm for a set of real-world, dynamically updated articles from an online Rich Site Summary (RSS) service.
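One way to picture combining a topic posterior with a probabilistic labeling step is sketched below. This is an assumption for illustration, not the report's algorithm: it supposes a document's posterior over topics and a hypothetical table of label probabilities per topic, and scores each label by the mixture sum over topics. All names and numbers are invented.

```python
def label_document(doc_topic_posterior, topic_label_probs):
    """Score labels as sum_k P(topic k | doc) * P(label | topic k).

    Returns the highest-scoring label and the full score table.
    Both inputs are hypothetical structures used only for this sketch.
    """
    scores = {}
    for topic, p_topic in doc_topic_posterior.items():
        for label, p_label in topic_label_probs.get(topic, {}).items():
            scores[label] = scores.get(label, 0.0) + p_topic * p_label
    if not scores:
        raise ValueError("no candidate labels for the given topics")
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical posterior for one article and hypothetical label tables.
posterior = {0: 0.7, 1: 0.3}
label_probs = {
    0: {"sports": 0.9, "politics": 0.1},
    1: {"sports": 0.2, "politics": 0.8},
}
best_label, label_scores = label_document(posterior, label_probs)
```

Because the score is a mixture over topics, an article split between two topics can still receive a single label, with the score table showing how close the alternatives were.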
Abstract:
In the scholarly publishing domain, a retraction is raised when a specific publication is considered erroneous by the venue in which it appeared after it was published. The aim of this work is to uncover new insights and important new information to help us understand the retraction phenomenon in the arts and humanities domain. Our investigation is based on a methodology defined using quantitative and qualitative measures derived from previous studies in the transdisciplinary research field of “science of science” (SciSci). The designed methodology takes into account a general case of retraction and applies a citation analysis based on five phases. Citations to retracted publications (before and after their retraction) are gathered and characterized with a set of attributes, including general metadata and information extracted from citing entities’ full text. The annotated characteristics are further considered for a statistical and a textual analysis (i.e., a topic modeling analysis). The contribution of this thesis is grounded in addressing the following research questions: (RQ1) How did scholarly research cite retracted humanities publications before and after their retraction? (RQ2) Did all the humanities areas behave similarly concerning the retraction phenomenon? (RQ3) What are the main differences and similarities in the retraction dynamics between the humanities domain and the STEM disciplines? RQ1 and RQ2 are addressed by tuning the methodology and applying it to the analysis of retracted publications in the humanities domain. RQ3 is addressed on two levels, i.e., considering and comparing: (L1) the outcomes of past studies on retraction in STEM, and (L2) the results obtained from an analysis of a retraction case in STEM using the defined methodology.
Abstract:
The MAP-i Doctoral Programme in Informatics, of the Universities of Minho, Aveiro and Porto
Abstract:
Mathematical and computational models play an essential role in understanding cellular metabolism. They are used as platforms to integrate current knowledge on a biological system and to systematically test and predict the effect of manipulations to such systems. Recent advances in genome sequencing techniques have facilitated the reconstruction of genome-scale metabolic networks for a wide variety of organisms, from microbes to human cells. These models have been successfully used in multiple biotechnological applications. Despite these advances, modeling cellular metabolism still presents many challenges. The aim of this Research Topic is not only to expose and consolidate the state of the art in metabolic modeling approaches, but also to push this frontier beyond the current edge through the introduction of innovative solutions. The articles presented in this e-book address some of the main challenges in the field, including the integration of different modeling formalisms, the integration of heterogeneous data sources into metabolic models, explicit representation of other biological processes during phenotype simulation, and standardization efforts in the representation of metabolic models and simulation results.
Abstract:
OBJECTIVE: To evaluate the public health impact of statin prescribing strategies based on the Justification for the Use of Statins in Primary Prevention: An Intervention Trial Evaluating Rosuvastatin Study (JUPITER). METHODS: We studied 2268 adults aged 35-75 without cardiovascular disease in a population-based study in Switzerland in 2003-2006. We assessed the eligibility for statins according to the Adult Treatment Panel III (ATPIII) guidelines, and by adding "strict" (hs-CRP ≥2.0 mg/L and LDL-cholesterol <3.4 mmol/L) and "extended" (hs-CRP ≥2.0 mg/L alone) JUPITER-like criteria. We estimated the proportion of CHD deaths potentially prevented over 10 years in the Swiss population. RESULTS: Fifteen percent were already taking statins, 42% were eligible by ATPIII guidelines, 53% by adding "strict", and 62% by adding "extended" criteria, with a total of 19% newly eligible. The number needed to treat with statins to avoid one CHD death over 10 years was 38 for ATPIII, 84 for "strict" and 92 for "extended" JUPITER-like criteria. ATPIII would prevent 17% of CHD deaths, compared with 20% for ATPIII + "strict" and 23% for ATPIII + "extended" criteria (+6%). CONCLUSION: Implementing JUPITER-like strategies would make statin prescribing for primary prevention more common and less efficient than it is with current guidelines.
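The number needed to treat (NNT) reported above is conventionally the reciprocal of the absolute risk reduction over the same time horizon. A minimal sketch of that arithmetic follows; the risk values are hypothetical and are not the study's data.

```python
def number_needed_to_treat(risk_untreated, risk_treated):
    """Return NNT = 1 / ARR, where ARR = risk_untreated - risk_treated.

    Both risks are probabilities over the same period (for example 10 years).
    """
    arr = risk_untreated - risk_treated
    if arr <= 0:
        raise ValueError("treatment must lower risk for NNT to be defined")
    return 1.0 / arr


# Hypothetical 10-year CHD death risks without and with treatment.
nnt = number_needed_to_treat(0.10, 0.075)
```

A larger NNT means more people must be treated to prevent one event, which is why the abstract describes the JUPITER-like strategies as less efficient.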
Abstract:
This paper traces the development of credit risk modeling over the past 10 years. Our work can be divided into two parts: selecting articles and summarizing results. On the one hand, by constructing an ordered logit model on the historical Journal of Economic Literature (JEL) codes of articles about credit risk modeling, we identify the articles most closely related to our topic. The result indicates that JEL codes have become the standard for classifying research in credit risk modeling. On the other hand, comparing with the classical review by Altman and Saunders (1998), we observe some important changes in credit risk research methods. The main finding is that the current focus of credit risk modeling has moved from static individual-level models to dynamic portfolio models.
Abstract:
Semi-automatic building detection and extraction is a topic of growing interest due to its potential application in such areas as cadastral information systems, cartographic revision, and GIS. One of the existing strategies for building extraction is to use a digital surface model (DSM) represented by a cloud of known points on a visible surface, and comprising features such as trees or buildings. Conventional surface modeling using stereo-matching techniques has its drawbacks, the most obvious being the effect of building height on perspective, shadows, and occlusions. The laser scanner, a recently developed technological tool, can collect accurate DSMs with high spatial frequency. This paper presents a methodology for semi-automatic modeling of buildings which combines a region-growing algorithm with line-detection methods applied over the DSM.
Abstract:
Blast traumatic brain injury (BTBI) has become an important topic of study because of the increase of such incidents, especially due to the recent growth of improvised explosive devices (IEDs). This thesis discusses a project in which laboratory testing of BTBI was made possible by performing blast loading on experimental models simulating the human head. Three versions of experimental models were prepared – one having a simple geometry and the other two having geometry similar to a human head. For developing the head models, three important parts of the head were considered for material modeling and analysis – the skin, skull and brain. The materials simulating skin, skull and brain went through many testing procedures including dynamic mechanical analysis (DMA). For finding a suitable brain simulant, several materials were tested under low and high frequencies. Step response analysis, rheometry and DMA tests were performed on materials such as water based gels, oil based mixtures and silicone gels cured at different temperatures. The gelatins and silicone gels showed promising results toward their use as brain surrogate materials. Temperature degradation tests were performed on gelatins, indicating the fast degradation of gelatins at room temperature. Silicone gels were much more stable compared to the water based gels. Silicone gels were further processed using a thinner-type additive gel to bring the dynamic modulus values closer to those of human brain matter. The obtained values from DMA were compared to the values for human brain as found in literature. Then a silicone rubber brain mold was prepared to give the brain model accurate geometry. All the components were put together to make the entire head model. A steel mount was prepared to attach the head for testing at the end of the shock tube. Instrumentation was implemented in the head model to obtain effective results for understanding more about the possible mechanisms of BTBI. 
The final head model was named the Realistic Explosive Dummy Head or the “RED Head.” The RED Head offered potential for realistic experimental testing in blast loading conditions by virtue of its material properties and geometrical accuracy.
Abstract:
Slope failure occurs in many areas throughout the world, and it becomes an important problem when it interferes with human activity, where disasters cause loss of life and property damage. In this research we investigate slope failure through centrifuge modeling, in which a reduced-scale model, N times smaller than the full-scale structure (the prototype), is subjected to an acceleration N times that of gravity in order to preserve the stress and strain behavior. The aims of this research, “Centrifuge modeling of sandy slopes”, are in brief: 1) to test the reliability of centrifuge modeling as a tool for investigating the failure behavior of a sandy slope; 2) to understand how the failure mechanism is affected by changing the slope angle and to obtain useful information for design. To achieve these aims we arranged the work as follows. Chapter one: centrifuge modeling of slope failure. In this chapter we provide a general view of the context in which we are working. We explain what a slope failure is, how it happens, and which tools are available to investigate this phenomenon. We then introduce the technology used to study this topic, namely the geotechnical centrifuge. Chapter two: testing apparatus. In the first section of this chapter we describe all the procedures and facilities used to perform a test in the centrifuge. We then explain the characteristics of the soil (Nevada sand), such as the dry unit weight, water content, relative density, and its strength parameters (c, φ), which were determined in the laboratory through the triaxial test. Chapter three: centrifuge tests. This part of the document presents all the results from the tests performed in the centrifuge, namely the acceleration at failure for each model tested and its failure surface. In our case study we tested models with the same soil and geometric characteristics but different slope angles.
The angles tested in this research were 60°, 75° and 90°. Chapter four: slope stability analysis. We introduce the features and the concept of the software ReSSA (2.0), which allows us to calculate the theoretical failure surfaces of the prototypes. In this section we then compare the experimental failure surfaces of the prototypes, traced in the laboratory, with those calculated by the software. Chapter five: conclusion. The conclusion presents the results obtained in relation to the two main aims mentioned above.
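The scaling idea behind centrifuge modeling described in this abstract can be checked with a short calculation. The sketch below assumes the simple self-weight relation for vertical stress (density times acceleration times depth); the scale factor, soil density, and depth are hypothetical values chosen only for illustration.

```python
G = 9.81  # gravitational acceleration in m/s^2


def vertical_stress(density, acceleration, depth):
    """Self-weight vertical stress in Pa: density * acceleration * depth."""
    return density * acceleration * depth


# Hypothetical values: scale factor N, soil density (kg/m^3), model depth (m).
n = 50
density = 1600.0
model_depth = 0.2

# Model: depth reduced N times, acceleration increased N times.
model_stress = vertical_stress(density, n * G, model_depth)
# Prototype: full-scale depth under normal gravity.
prototype_stress = vertical_stress(density, G, n * model_depth)
```

The two stresses agree up to floating-point rounding because the factor N cancels, which is the sense in which the reduced-scale model preserves the prototype's stress state at corresponding points.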
Abstract:
Design parameters, process flows, electro-thermal-fluidic simulations and experimental characterizations of Micro-Electro-Mechanical-Systems (MEMS) suited for gas-chromatographic (GC) applications are presented and thoroughly described in this thesis. Its topic belongs to the research activities in which the Institute for Microelectronics and Microsystems (IMM)-Bologna has been involved for several years, i.e. the development of micro-systems for chemical analysis, based on silicon micro-machining techniques and able to analyze complex gaseous mixtures, especially in the field of environmental monitoring. In this regard, attention has been focused on the development of micro-fabricated devices to be employed in a portable mini-GC system for the analysis of aromatic Volatile Organic Compounds (VOC) like Benzene, Toluene, Ethyl-benzene and Xylene (BTEX), i.e. chemical compounds which can significantly affect the environment and human health because of their demonstrated carcinogenicity (benzene) or toxicity (toluene, xylene) even at parts per billion (ppb) concentrations. The most significant results achieved through the laboratory functional characterization of the mini-GC system are reported, together with the results of in-field analyses carried out at a station of the Bologna air monitoring network and compared with those provided by a commercial GC system. The development of more advanced prototypes of micro-fabricated devices specifically suited for FAST-GC is also presented (silicon capillary columns, Ultra-Low-Power (ULP) Metal OXide (MOX) sensor, Thermal Conductivity Detector (TCD)), together with the technological processes for their fabrication.
The experimentally demonstrated very high sensitivity of ULP-MOX sensors to VOCs, coupled with their extremely low power consumption, makes the developed ULP-MOX sensor the best-performing metal oxide sensor reported in the literature to date, while preliminary test results showed that the developed silicon capillary columns are capable of performance comparable to that of the best fused silica capillary columns. Finally, the development and validation of a coupled electro-thermal Finite Element Model suited for both steady-state and transient analysis of the micro-devices is described; the model was subsequently extended with a fluidic part to investigate device behaviour in the presence of a gas flowing at given volumetric flow rates.