4 results for digital text

in DRUM (Digital Repository at the University of Maryland)


Relevance:

30.00%

Publisher:

Abstract:

Using scientific methods in the humanities is at the forefront of objective literary analysis. However, processing big data is particularly complex when the subject matter is qualitative rather than numerical. Large volumes of text require specialized tools to produce quantifiable data from ideas and sentiments. Our team researched the extent to which tools such as Weka and MALLET can test hypotheses about qualitative information. We examined the claim that literary commentary exists within political environments, using US periodical articles concerning Russian literature in the early twentieth century as a case study. These tools generated useful quantitative data that allowed us to run stepwise binary logistic regressions. These statistical tests allowed for time series experiments using sea change and emergency models of history, as well as classification experiments with regard to author characteristics, social issues, and sentiment expressed. Both types of experiments supported our claim to varying degrees but, more importantly, served as a definitive demonstration that digitally enhanced quantitative forms of analysis can apply to qualitative data. Our findings set the foundation for further experiments in the emerging field of digital humanities.

Relevance:

30.00%

Publisher:

Abstract:

Human relationships have long been studied by scientists from domains such as sociology, psychology, and literature to understand people's desires, goals, actions, and expected behaviors. In this dissertation we study inter-personal relationships as expressed in natural language text. Modeling inter-personal relationships from text finds application in general natural language understanding, as well as in real-world domains such as social networks, discussion forums, and intelligent virtual agents. We propose that the study of relationships should incorporate not only linguistic cues in text, but also the contexts in which these cues appear. Our investigations, backed by empirical evaluation, support this thesis and demonstrate that the task benefits from using structured models that incorporate both types of information. We present such structured models to address the task of modeling the nature of relationships between any two given characters from a narrative. To begin with, we assume that relationships are of two types: cooperative and non-cooperative. We first describe an approach to jointly infer relationships between all characters in the narrative, and demonstrate how the task of characterizing the relationship between two characters can benefit from including information about their relationships with other characters in the narrative. We next formulate the relationship-modeling problem as a sequence prediction task to acknowledge the evolving nature of human relationships, and demonstrate the need to model the history of a relationship in predicting its evolution. Thereafter, we present a data-driven method to automatically discover various types of relationships, such as familial, romantic, and hostile. As before, we address the task of modeling evolving relationships, but we do not restrict ourselves to two types of relationships. We also demonstrate the need to incorporate not only local historical context but also global context while solving this problem.
Lastly, we demonstrate a practical application of modeling inter-personal relationships in the domain of online educational discussion forums. Such forums offer opportunities for their users to interact and form deeper relationships. With this view, we address the task of identifying the initiation of such deeper relationships between a student and the instructor. Specifically, we analyze the contents of the forums to automatically suggest threads to the instructors that require their intervention. By highlighting scenarios that need direct instructor-student interactions, we alleviate the need for the instructor to manually peruse all threads of the forum and also assist students who have limited avenues for communicating with instructors. We do this by incorporating the discourse structure of the thread through latent variables that abstractly represent the contents of individual posts and model the flow of information in the thread. Such latent structured models, which incorporate linguistic cues without losing their context, can be helpful in other related natural language understanding tasks as well. We demonstrate this by using the model for a very different task: identifying whether a stated desire has been fulfilled by the end of a story.

Relevance:

30.00%

Publisher:

Abstract:

This dissertation applies statistical methods to the evaluation of automatic summarization using data from the Text Analysis Conferences in 2008-2011. Several aspects of the evaluation framework itself are studied, including the statistical testing used to determine significant differences, the assessors, and the design of the experiment. In addition, a family of evaluation metrics is developed to predict the score an automatically generated summary would receive from a human judge, and its results are demonstrated at the Text Analysis Conference. Finally, variations on the evaluation framework are studied and their relative merits considered. An overarching theme of this dissertation is the application of standard statistical methods to data that do not conform to the usual testing assumptions.

Relevance:

30.00%

Publisher:

Abstract:

African American women bear a disproportionate burden of cervical cancer incidence and mortality compared with non-Hispanic White women. Cervical cancer is one of the most preventable types of cancer, and women can be screened for it with a routine Pap test. Given that religion occupies an essential place in African American lives, framing health messages with important spiritual themes and delivering them through a popular communication channel may allow for a more culturally relevant and accessible technology-based approach to promoting cervical cancer educational content to African American women. Using community-engaged research as a framework, the purpose of this multiple methods study was to develop, pilot test, and evaluate the feasibility, acceptability, and initial efficacy of a spiritually-based SMS text messaging intervention to increase cervical cancer awareness and Pap test screening intention among African American women. The study recruited church-attending African American women ages 21-65 and was conducted in three phases. Phases 1 and 2 consisted of a series of focus group discussions (n=15), cognitive response interviews (n=8), and initial usability testing that were conducted to inform the intervention development and modifications. Phase 3 utilized a non-experimental one-group pretest-posttest design to pilot test the 16-day text messaging intervention (n=52). Of the individuals enrolled, 46 completed the posttest (retention rate=88%). Findings provided evidence for the early feasibility, high acceptability, and some initial efficacy of the CervixCheck intervention. There were significant pre-post increases observed for knowledge about cervical cancer and the Pap test (p = .001) and subjective norms (p = .006).
Additionally, post-intervention results revealed that 83% of participants reported being either “satisfied” or “very satisfied” with the program and 85% found the text messages either “useful” or “very useful”. Likewise, 85% of the participants indicated that they would “likely” or “very likely” share the information they learned from the intervention with the women around them, with 39% indicating that they had already shared some of the information they received with others they knew. A spiritually-based SMS text messaging intervention could be a culturally appropriate and cost-effective method of promoting cervical cancer early detection information to African American women.