7 results for Graph operations
in DigitalCommons@The Texas Medical Center
Abstract:
The biomedical literature is extensively catalogued and indexed in MEDLINE. MEDLINE indexing is done by trained human indexers, who identify the most important concepts in each article, and is expensive and inconsistent. Automating the indexing task is difficult: the National Library of Medicine produces the Medical Text Indexer (MTI), which suggests potential indexing terms to the indexers. MTI’s output is not good enough to work unattended. In my thesis, I propose a different way to approach the indexing task called MEDRank. MEDRank creates graphs representing the concepts in biomedical articles and their relationships within the text, and applies graph-based ranking algorithms to identify the most important concepts in each article. I evaluate the performance of several automated indexing solutions, including my own, by comparing their output to the indexing terms selected by the human indexers. MEDRank outperformed all other evaluated indexing solutions, including MTI, in general indexing performance and precision. MEDRank can be used to cluster documents, index any kind of biomedical text with standard vocabularies, or could become part of MTI itself.
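The graph-based ranking idea described above can be illustrated with a minimal sketch. The abstract does not say which ranking algorithm or graph construction MEDRank uses, so this example assumes a PageRank-style power iteration over an undirected, weighted concept co-occurrence graph. The function name `pagerank`, the concept names, and the edge weights are all hypothetical.

```python
from collections import defaultdict

def pagerank(edges, damping=0.85, iterations=50):
    """Rank the nodes of an undirected weighted graph by power iteration."""
    neighbors = defaultdict(dict)
    for a, b, weight in edges:
        neighbors[a][b] = neighbors[a].get(b, 0.0) + weight
        neighbors[b][a] = neighbors[b].get(a, 0.0) + weight
    nodes = sorted(neighbors)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            # Each neighbor passes on its rank in proportion to the edge weight.
            incoming = sum(
                rank[other] * weight / sum(neighbors[other].values())
                for other, weight in neighbors[node].items()
            )
            new_rank[node] = (1.0 - damping) / n + damping * incoming
        rank = new_rank
    return rank

# Hypothetical co-occurrence edges: (concept, concept, weight).
edges = [
    ("myocardial infarction", "aspirin", 3.0),
    ("myocardial infarction", "mortality", 2.0),
    ("myocardial infarction", "hospitalization", 1.0),
    ("aspirin", "mortality", 1.0),
]
ranks = pagerank(edges)
top_concepts = sorted(ranks, key=ranks.get, reverse=True)
```

In this toy graph the most heavily connected concept ranks first, which is the behavior an indexing system would rely on when proposing candidate terms.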
Abstract:
People often use tools to search for information. In order to improve the quality of an information search, it is important to understand how internal information, which is stored in the user’s mind, and external information, represented by the interface of tools, interact with each other. How information is distributed between internal and external representations significantly affects information search performance. However, few studies have examined the relationship between types of interface and types of search task in the context of information search. For a distributed information search task, how data are distributed, represented, and formatted significantly affects the user's search performance in terms of response time and accuracy. Guided by UFuRT (User, Function, Representation, Task), a human-centered process, I propose a search model and a task taxonomy. The model defines its relationship with other existing information models. The taxonomy clarifies the legitimate operations for each type of search task of relational data. Based on the model and taxonomy, I have also developed prototype interfaces for the search tasks of relational data. These prototypes were used for experiments. The experiments described in this study are of a within-subject design with a sample of 24 participants recruited from the graduate schools located in the Texas Medical Center. Participants performed one-dimensional nominal search tasks over nominal, ordinal, and ratio displays, and searched one-dimensional nominal, ordinal, interval, and ratio tasks over table and graph displays. Participants also performed the same task and display combination for two-dimensional searches. Distributed cognition theory has been adopted as a theoretical framework for analyzing and predicting the search performance of relational data. It has been shown that the representation dimensions and data scales, as well as the search task types, are the main factors in determining search efficiency and effectiveness.
In particular, the more external representations are used, the better the search task performance; the results suggest that ideal search performance occurs when the question type and the corresponding data scale representation match. The implications of the study lie in contributing to the effective design of search interfaces for relational data, especially laboratory results, which are often used in healthcare activities.
Abstract:
A face-to-face survey addressing environmental risk perception was conducted from January through March 2010. The 35-question survey was administered to a random sample of 73 households in El Paso, Texas. The instrument, administered in two adjacent residential communities neighboring an inactive copper smelter, solicited responses about manmade and naturally occurring health risks and sources of health information that might be utilized by respondents. The objective of the study was to determine if intervention which occurred in one of the communities increased residents' perception of risk to themselves and their families. The study was undertaken subsequent to increased attention from news media and public debate surrounding the request to reopen the smelter's operations. Results of the study indicated that the perception of environment-related health concerns was not significantly correlated with residence in a community receiving outreach and intervention. Both communities identified sun exposure as their greatest perceived environmental risk, followed by cigarette smoking. Though industrial by-products and chemical pollution were high-ranking concerns, respondents indicated they felt that the decision not to reopen the smelter reduced risk in these areas. Residents expressed confidence in information received from the local health district, though most indicated they received very little information from that source, indicating an opportunity for public health education in this community as a strategy to address future health concerns.
Abstract:
This dissertation develops and tests a comparative effectiveness methodology utilizing a novel approach to the application of Data Envelopment Analysis (DEA) in health studies. The concept of performance tiers (PerT) is introduced as terminology to express a relative risk class for individuals within a peer group, and the PerT calculation is implemented with operations research (DEA) and spatial algorithms. The analysis results in the discrimination of the individual data observations into a relative risk classification by the DEA-PerT methodology. The performance of two distance measures, kNN (k-nearest neighbor) and Mahalanobis, was subsequently tested to classify new entrants into the appropriate tier. The methods were applied to subject data for the 14-year-old cohort in the Project HeartBeat! study. The concepts presented herein represent a paradigm shift in the potential for public health applications to identify and respond to individual health status. The resultant classification scheme provides descriptive, and potentially prescriptive, guidance to assess and implement treatments and strategies to improve the delivery and performance of health systems.
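The step of assigning a new entrant to an existing tier with the two distance measures can be sketched as follows. This is only an illustration under stated assumptions: the tiers are taken as already computed (the DEA-PerT step is not shown), each subject is described by two made-up features, and the function names `knn_tier`, `mahalanobis_2d`, and `mahalanobis_tier` are hypothetical rather than taken from the dissertation.

```python
import math
from collections import Counter

def knn_tier(point, labeled_points, k=3):
    """Majority tier among the k nearest labeled points (Euclidean distance)."""
    nearest = sorted(labeled_points, key=lambda item: math.dist(point, item[0]))[:k]
    return Counter(tier for _, tier in nearest).most_common(1)[0][0]

def mahalanobis_2d(point, members):
    """Mahalanobis distance from a 2-D point to the mean of a group of 2-D points."""
    n = len(members)
    mean_x = sum(x for x, _ in members) / n
    mean_y = sum(y for _, y in members) / n
    # Sample covariance matrix entries for the group.
    sxx = sum((x - mean_x) ** 2 for x, _ in members) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in members) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in members) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mean_x, point[1] - mean_y
    # d^2 = [dx dy] * inverse(covariance) * [dx dy]^T, written out for the 2x2 case.
    return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)

def mahalanobis_tier(point, tiers):
    """Tier whose members are closest to the point in Mahalanobis distance."""
    return min(tiers, key=lambda name: mahalanobis_2d(point, tiers[name]))

# Hypothetical, already-tiered subjects described by two made-up features.
tiers = {
    "tier 1": [(1.0, 1.0), (1.5, 2.0), (2.0, 1.2), (1.2, 1.8)],
    "tier 2": [(5.0, 5.0), (5.5, 6.0), (6.0, 5.2), (5.2, 5.9)],
}
labeled = [(p, name) for name, members in tiers.items() for p in members]
new_entrant = (1.6, 1.5)
knn_result = knn_tier(new_entrant, labeled)
mahalanobis_result = mahalanobis_tier(new_entrant, tiers)
```

kNN looks only at the nearest individual subjects, whereas the Mahalanobis distance accounts for the spread and correlation of each tier as a whole, so the two measures can disagree near tier boundaries.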
Abstract:
Refugee populations suffer poor health status and yet the activities of refugee relief agencies in the public health sector have not been subjected previously to comprehensive evaluation. The purpose of this study was to examine the effectiveness and cost of the major public health service inputs of the international relief operation for Indochinese refugees in Thailand coordinated by the United Nations High Commissioner for Refugees (UNHCR). The investigator collected data from surveillance reports and agency records pertaining to 11 old refugee camps administered by the Government of Thailand Ministry of Interior (MOI) since an earlier refugee influx, and five new Khmer holding centers administered directly by UNHCR, from November, 1979, to March, 1982. Generous international funding permitted UNHCR to maintain a higher level of public health service inputs than refugees usually enjoyed in their countries of origin or than Thais around them enjoyed. Annual per capita expenditure for public health inputs averaged approximately US$151. Indochinese refugees in Thailand, for the most part, had access to adequate general food rations, to supplementary feeding programs, and to preventive health measures, and enjoyed high-quality medical services. Old refugee camps administered by MOI consistently received public health inputs of lower quantity and quality compared with new UNHCR-administered holding centers, despite comparable per capita expenditure after both types of camps had stabilized (static phase). Mortality and morbidity rates among new Khmer refugees were catastrophic during the emergency and transition phases of camp development. Health status in the refugee population during the static phase, however, was similar to, or better than, health status in the refugees' countries of origin or the Thai communities surrounding the camps.
During the static phase, mortality and morbidity generally remained stable at roughly the same low levels in both types of camps. Furthermore, the results of multiple regression analyses demonstrated that combined public health inputs accounted for between one and 23 percent of the variation in refugee mortality and morbidity. The direction of associations between some public health inputs and specific health outcome variables demonstrated no clear pattern.
Abstract:
Personnel involved in natural or man-made disaster response and recovery efforts may be exposed to a wide variety of physical and mental stressors that can lead to long-lasting and detrimental psychopathological outcomes. In a disaster situation, huge numbers of "secondary" responders can be involved in contaminant clean-up and debris removal and can be at risk of developing stress-related mental health outcomes. The Occupational Safety and Health Administration (OSHA) worker training hierarchy typically required for response workers, known as "Hazardous Waste Operations and Emergency Response" (HAZWOPER), does not address the mental health and safety concerns of workers. This study focused on the prevalence of traumatic stress experienced by secondary responders who had received or expressed interest in receiving HAZWOPER training through the National Institute of Environmental Health Sciences Worker Education and Training Program (NIEHS WETP). The study involved the modification of two preexisting and validated survey tools to assess secondary responder awareness of physical, mental, and traumatic stressors on mental health and sought to determine if a need existed to include traumatic stress-related mental health education in the current HAZWOPER training regimen. The study evaluated post-traumatic stress disorder (PTSD), resiliency, mental distress, and negative effects within a secondary responder population of 176 respondents. Elevated PTSD levels were seen in the study population as compared to a general responder population (32.9% positive vs. 8%-22.5% positive). Results indicated that HAZWOPER-trained disaster responders were likely to test positive for PTSD, whereas untrained responders with no disaster experience and responders who possessed either training or disaster experience only were likely to test PTSD negative. A majority (68.75%) of the population tested below the mean resiliency to cope score (80.4) of the average worker population.
Results indicated that those who were trained only or who possessed both training and disaster work experience were more likely to have lower resiliency scores than those with no training or experience. There were direct correlations between being PTSD positive and having worked at a disaster site and experiencing mental distress and negative effects. However, HAZWOPER training status does not significantly correlate with mental distress or negative effect. The survey indicated clear support (91% of respondents) for mental health education. The development of a pre- and post-deployment training module is recommended. Such training could provide responders with the necessary knowledge and skills to recognize the symptomology of PTSD, mental stressors, and physical and traumatic stressors, thus empowering them to employ protective strategies or seek professional help if needed. It is further recommended that pre-deployment mental health education be included in the current HAZWOPER 24- and 40-hour course curricula, and that consideration be given to integrating a stand-alone post-deployment mental health education training course into the current HAZWOPER hierarchy.
Abstract:
The first manuscript, entitled "Time-Series Analysis as Input for Clinical Predictive Modeling: Modeling Cardiac Arrest in a Pediatric ICU" lays out the theoretical background for the project. There are several core concepts presented in this paper. First, traditional multivariate models (where each variable is represented by only one value) provide single point-in-time snapshots of patient status: they are incapable of characterizing deterioration. Since deterioration is consistently identified as a precursor to cardiac arrests, we maintain that the traditional multivariate paradigm is insufficient for predicting arrests. We identify time series analysis as a method capable of characterizing deterioration in an objective, mathematical fashion, and describe how to build a general foundation for predictive modeling using time series analysis results as latent variables. Building a solid foundation for any given modeling task involves addressing a number of issues during the design phase. These include selecting the proper candidate features on which to base the model, and selecting the most appropriate tool to measure them. We also identified several unique design issues that are introduced when time series data elements are added to the set of candidate features. One such issue is in defining the duration and resolution of time series elements required to sufficiently characterize the time series phenomena being considered as candidate features for the predictive model. Once the duration and resolution are established, there must also be explicit mathematical or statistical operations that produce the time series analysis result to be used as a latent candidate feature. In synthesizing the comprehensive framework for building a predictive model based on time series data elements, we identified at least four classes of data that can be used in the model design. 
The first two classes are shared with traditional multivariate models: multivariate data and clinical latent features. Multivariate data is represented by the standard one value per variable paradigm and is widely employed in a host of clinical models and tools. These are often represented by a number present in a given cell of a table. Clinical latent features are derived, rather than directly measured, data elements that more accurately represent a particular clinical phenomenon than any of the directly measured data elements in isolation. The second two classes are unique to the time series data elements. The first of these is the raw data elements. These are represented by multiple values per variable, and constitute the measured observations that are typically available to end users when they review time series data. These are often represented as dots on a graph. The final class of data results from performing time series analysis. This class of data represents the fundamental concept on which our hypothesis is based. The specific statistical or mathematical operations are up to the modeler to determine, but we generally recommend that a variety of analyses be performed in order to maximize the likelihood that a representation of the time series data elements is produced that is able to distinguish between two or more classes of outcomes. The second manuscript, entitled "Building Clinical Prediction Models Using Time Series Data: Modeling Cardiac Arrest in a Pediatric ICU", provides a detailed description, start to finish, of the methods required to prepare the data and to build and validate a predictive model that uses the time series data elements determined in the first paper. One of the fundamental tenets of the second paper is that manual implementations of time series based models are unfeasible due to the relatively large number of data elements and the complexity of preprocessing that must occur before data can be presented to the model.
Each of the seventeen steps is analyzed from the perspective of how it may be automated, when necessary. We identify the general objectives and available strategies of each of the steps, and we present our rationale for choosing a specific strategy for each step in the case of predicting cardiac arrest in a pediatric intensive care unit. Another issue brought to light by the second paper is that the individual steps required to use time series data for predictive modeling are more numerous and more complex than those used for modeling with traditional multivariate data. Even after complexities attributable to the design phase (addressed in our first paper) have been accounted for, the management and manipulation of the time series elements (the preprocessing steps in particular) are issues that are not present in a traditional multivariate modeling paradigm. In our methods, we present the issues that arise from the time series data elements: defining a reference time; imputing and reducing time series data in order to conform to a predefined structure that was specified during the design phase; and normalizing variable families rather than individual variable instances. The final manuscript, entitled: "Using Time-Series Analysis to Predict Cardiac Arrest in a Pediatric Intensive Care Unit" presents the results that were obtained by applying the theoretical construct and its associated methods (detailed in the first two papers) to the case of cardiac arrest prediction in a pediatric intensive care unit. Our results showed that utilizing the trend analysis from the time series data elements reduced the number of classification errors by 73%. The area under the Receiver Operating Characteristic curve increased from a baseline of 87% to 98% by including the trend analysis. 
In addition to the performance measures, we were also able to demonstrate that adding raw time series data elements without their associated trend analyses improved classification accuracy as compared to the baseline multivariate model, but diminished classification accuracy as compared to when just the trend analysis features were added (i.e., without adding the raw time series data elements). We believe this phenomenon was largely attributable to overfitting, which is known to increase as the ratio of candidate features to class examples rises. Furthermore, although we employed several feature reduction strategies to counteract the overfitting problem, they failed to improve the performance beyond that which was achieved by exclusion of the raw time series elements. Finally, our data demonstrated that pulse oximetry and systolic blood pressure readings tend to start diminishing about 10-20 minutes before an arrest, whereas heart rates tend to diminish rapidly less than 5 minutes before an arrest.
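The contrast between a single point-in-time value and a trend feature derived from time series analysis can be made concrete with a small sketch. The abstract leaves the specific statistical operation to the modeler, so this example assumes an ordinary least-squares slope over a fixed window. The function name `trend_slope`, the feature names, and the pulse oximetry readings are hypothetical and not taken from the study.

```python
def trend_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    variance = sum((t - mean_t) ** 2 for t in times)
    return covariance / variance

# Hypothetical pulse oximetry readings (%), sampled every 5 minutes
# over the 20 minutes before a reference time (minute 0).
minutes = [-20, -15, -10, -5, 0]
deteriorating = [98, 98, 96, 94, 92]
stable = [92, 92, 92, 92, 92]

def features(values):
    """One snapshot feature (last value) plus one trend feature (slope)."""
    return {"spo2_last": values[-1], "spo2_slope_per_min": trend_slope(minutes, values)}

deteriorating_features = features(deteriorating)
stable_features = features(stable)
```

Both hypothetical patients share the same last reading, so a one-value-per-variable model sees them as identical; only the slope separates the deteriorating series from the stable one.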