9 results for Semantic Publishing, Linked Data, Bibliometrics, Informetrics, Data Retrieval, Citations
in DigitalCommons@The Texas Medical Center
Abstract:
Colorectal cancer (CRC) is the third leading cancer in both incidence and mortality in Texas. This study investigated the adherence of CRC treatment to standard treatment guidelines and the association between standard treatment and CRC survival in Texas. The author used Texas Cancer Registry (TCR) and Medicare linked data to study CRC treatment patterns and the factors associated with standard treatment in patients who were more than 65 years old and were diagnosed from 2001 through 2007. We also determined whether adherence to standard treatment affects patients' survival. Multiple logistic regression and Cox regression analysis were used to analyze the data. Both regression models were adjusted for demographic and tumor characteristics. We found that of the 3977 regional colon cancer patients 80 years old or younger, 60.2% received chemotherapy, in adherence to the recommended treatment guidelines. Patients with younger age, female gender, higher education, and lower comorbidity score were more likely to adhere to this chemotherapy guideline. Patients in this cohort who adhered to chemotherapy had better survival than those who did not (HR: 0.76, 95% CI: 0.68-0.84). Of the 12709 colon cancer patients treated with surgery, 49.3% had more than 12 lymph nodes removed, in adherence to the treatment guidelines. Patients with younger age, female gender, higher education, regional stage, larger tumor size, and lower comorbidity score were more likely to adhere to this surgery guideline. Patients with more than 12 lymph nodes removed in this cohort had better survival (HR: 0.86, 95% CI: 0.82-0.91). Of the 1211 regional rectal cancer patients 80 years old or younger, 63.2% were adherent to radiation treatment. Patients with smaller tumor size and lower comorbidity score were more likely to adhere to this radiation guideline. There was no significant survival difference between radiation-adherent and non-adherent patients (HR: 1.03, 95% CI: 0.82-1.29). Of the 1122 regional rectal cancer patients 80 years old or younger who were treated with surgery, 76.0% received postoperative chemotherapy, in adherence to the treatment guidelines. Younger age and a lower comorbidity score were associated with a higher adherence rate. Patients in this cohort who adhered to adjuvant chemotherapy had better survival than those who did not (HR: 0.60, 95% CI: 0.45-0.79).
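The hazard ratios in this abstract are reported with 95% confidence intervals, and an interval that excludes 1 corresponds to a survival difference that is significant at the 5% level. The following minimal Python sketch makes that reading explicit using only the figures quoted in the abstract. The cohort labels and the helper names `excludes_one` and `log_se` are illustrative, not part of the study, and the standard error is back-calculated on the assumption that each interval is symmetric on the log scale.

```python
import math

# (HR, lower, upper) exactly as quoted in the abstract; labels are illustrative.
RESULTS = {
    "colon, chemotherapy": (0.76, 0.68, 0.84),
    "colon, >12 lymph nodes": (0.86, 0.82, 0.91),
    "rectal, radiation": (1.03, 0.82, 1.29),
    "rectal, adjuvant chemotherapy": (0.60, 0.45, 0.79),
}


def excludes_one(lower, upper):
    """True when the confidence interval lies entirely on one side of HR = 1."""
    return upper < 1.0 or lower > 1.0


def log_se(lower, upper, z=1.96):
    """Approximate SE of log(HR), assuming a symmetric interval on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


for label, (hr, lo, hi) in RESULTS.items():
    verdict = "significant" if excludes_one(lo, hi) else "not significant"
    print(f"{label}: HR {hr:.2f}, SE(log HR) ~ {log_se(lo, hi):.3f}, {verdict}")
```

Only the radiation interval (0.82-1.29) spans 1, which matches the abstract's statement that radiation adherence showed no significant survival difference.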
Abstract:
OBJECTIVE: To determine whether algorithms developed for the World Wide Web can be applied to the biomedical literature in order to identify articles that are important as well as relevant. DESIGN AND MEASUREMENTS: A direct comparison of eight algorithms: simple PubMed queries, clinical queries (sensitive and specific versions), vector cosine comparison, citation count, journal impact factor, PageRank, and machine learning based on polynomial support vector machines. The objective was to prioritize important articles, defined as being included in a pre-existing bibliography of important literature in surgical oncology. RESULTS: Citation-based algorithms were more effective than noncitation-based algorithms at identifying important articles. The most effective strategies were simple citation count and PageRank, which on average identified over six important articles in the first 100 results compared to 0.85 for the best noncitation-based algorithm (p < 0.001). The authors saw similar differences between citation-based and noncitation-based algorithms at 10, 20, 50, 200, 500, and 1,000 results (p < 0.001). Citation lag affects the performance of PageRank more than that of simple citation count. However, in spite of citation lag, citation-based algorithms remain more effective than noncitation-based algorithms. CONCLUSION: Algorithms that have proved successful on the World Wide Web can be applied to biomedical information retrieval. Citation-based algorithms can help identify important articles within large sets of relevant results. Further studies are needed to determine whether citation-based algorithms can effectively meet actual user information needs.
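To illustrate the two citation-based strategies this abstract found most effective, here is a minimal Python sketch of simple citation count and PageRank over a toy citation graph. It is not the authors' implementation: the article identifiers, the damping factor of 0.85, and the fixed iteration count are assumptions chosen for illustration.

```python
def all_articles(citations):
    """Every article that cites or is cited in the graph."""
    return set(citations) | {c for cited in citations.values() for c in cited}


def citation_counts(citations):
    """Number of incoming citations per article."""
    counts = {a: 0 for a in all_articles(citations)}
    for cited in citations.values():
        for c in cited:
            counts[c] += 1
    return counts


def pagerank(citations, damping=0.85, iterations=50):
    """Power-iteration PageRank; articles citing nothing spread rank evenly."""
    nodes = all_articles(citations)
    n = len(nodes)
    rank = {a: 1.0 / n for a in nodes}
    for _ in range(iterations):
        # Rank held by articles with no outgoing citations (dangling nodes).
        dangling = sum(rank[a] for a in nodes if not citations.get(a))
        new = {a: (1.0 - damping) / n + damping * dangling / n for a in nodes}
        for a, cited in citations.items():
            if cited:
                share = damping * rank[a] / len(cited)
                for c in cited:
                    new[c] += share
        rank = new
    return rank


# Hypothetical graph: each key cites the articles in its list.
TOY = {"A": ["C"], "B": ["C"], "C": ["D"], "D": []}
counts = citation_counts(TOY)
ranks = pagerank(TOY)
print(sorted(counts, key=counts.get, reverse=True))
print(sorted(ranks, key=ranks.get, reverse=True))
```

In this toy graph, C has the most citations, but PageRank places D first because D is cited by the highly cited C. That difference in weighting is what separates the two strategies.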
Abstract:
Information overload is a significant problem for modern medicine. Searching MEDLINE for common topics often retrieves more relevant documents than users can review. Therefore, we must identify documents that are not only relevant, but also important. Our system ranks articles using citation counts and the PageRank algorithm, incorporating data from the Science Citation Index. However, citation data is usually incomplete. Therefore, we explore the relationship between the quantity of citation information available to the system and the quality of the result ranking. Specifically, we test the ability of citation count and PageRank to identify "important articles" as defined by experts from large result sets with decreasing citation information. We found that PageRank performs better than simple citation counts, but both algorithms are surprisingly robust to information loss. We conclude that even an incomplete citation database is likely to be effective for importance ranking.
Abstract:
The current state of health and biomedicine includes an enormous number of heterogeneous data "silos", collected for different purposes and represented differently, that are presently impossible to share or analyze in toto. The greatest challenge for large-scale and meaningful analyses of health-related data is to achieve a uniform data representation for data extracted from heterogeneous source representations. Based upon an analysis and categorization of heterogeneities, a process for achieving comparable data content by using a uniform terminological representation is developed. This process addresses the types of representational heterogeneities that commonly arise in healthcare data integration problems. Specifically, this process uses a reference terminology and associated "maps" to transform heterogeneous data to a standard representation for comparability and secondary use. Capturing the quality and precision of the "maps" between local terms and reference terminology concepts enhances the meaning of the aggregated data, empowering end users with better-informed queries for subsequent analyses. A data integration case study in the domain of pediatric asthma illustrates the development and use of a reference terminology for creating comparable data from heterogeneous source representations. The contribution of this research is a generalized process for the integration of data from heterogeneous source representations, and this process can be applied and extended to other problems where heterogeneous data needs to be merged.
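The idea of mapping local terms to a reference terminology, while recording how precise each map is, can be sketched in a few lines of Python. Everything named here is hypothetical: the source names, the local terms, the reference concepts, and the two quality labels are invented for illustration and are not taken from the study's case data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TermMap:
    concept: str  # reference-terminology concept (hypothetical label)
    quality: str  # precision of the map: "exact" or "broader"


# Hypothetical maps from (source, local term) to a reference concept.
MAPS = {
    ("clinic_a", "asthma"): TermMap("Asthma", "exact"),
    ("clinic_a", "mild intermittent asthma"): TermMap("Asthma", "broader"),
    ("clinic_b", "J45"): TermMap("Asthma", "exact"),
    ("clinic_b", "wheeze NOS"): TermMap("Wheezing", "exact"),
}


def to_reference(source, local_term):
    """Return the TermMap for a local term, or None when no map exists."""
    return MAPS.get((source, local_term.strip()))


def aggregate(records, exact_only=False):
    """Count records per reference concept, optionally keeping exact maps only."""
    counts = Counter()
    for source, term in records:
        term_map = to_reference(source, term)
        if term_map is None:
            continue  # unmapped local terms are left out of the aggregate
        if exact_only and term_map.quality != "exact":
            continue
        counts[term_map.concept] += 1
    return counts


records = [
    ("clinic_a", "asthma"),
    ("clinic_a", "mild intermittent asthma"),
    ("clinic_b", "J45"),
    ("clinic_b", "wheeze NOS"),
    ("clinic_b", "unmapped term"),
]
print(aggregate(records))
print(aggregate(records, exact_only=True))
```

Keeping the map quality alongside the concept lets a later query choose between a larger aggregate and a stricter one, which is the kind of better-informed querying the abstract describes.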
Abstract:
People often use tools to search for information. To improve the quality of an information search, it is important to understand how internal information, which is stored in the user's mind, and external information, which is represented by the interface of the tools, interact with each other. How information is distributed between internal and external representations significantly affects information search performance. However, few studies have examined the relationship between types of interface and types of search task in the context of information search. For a distributed information search task, how data are distributed, represented, and formatted significantly affects user search performance in terms of response time and accuracy. Guided by UFuRT (User, Function, Representation, Task), a human-centered process, I propose a search model and a task taxonomy. The model defines its relationship with other existing information models. The taxonomy clarifies the legitimate operations for each type of search task on relational data. Based on the model and taxonomy, I have also developed prototype interfaces for search tasks on relational data. These prototypes were used for experiments. The experiments described in this study are of a within-subject design with a sample of 24 participants recruited from the graduate schools located in the Texas Medical Center. Participants performed one-dimensional nominal search tasks over nominal, ordinal, and ratio displays, and performed one-dimensional nominal, ordinal, interval, and ratio tasks over table and graph displays. Participants also performed the same task and display combinations for two-dimensional searches. Distributed cognition theory has been adopted as a theoretical framework for analyzing and predicting the search performance of relational data. It has been shown that the representation dimensions and data scales, as well as the search task types, are the main factors in determining search efficiency and effectiveness. In particular, the more external representations are used, the better the search task performance, and the results suggest that ideal search performance occurs when the question type and the corresponding data scale representation match. The implications of the study lie in contributing to the effective design of search interfaces for relational data, especially laboratory results, which are often used in healthcare activities.
Abstract:
Background. Increased incidence of cancer is documented in immunosuppressed transplant patients. Likewise, as survival increases for persons infected with the Human Immunodeficiency Virus (HIV), we expect their incidence of cancer to increase. The objective of this study was to examine the current gender-specific spectrum of cancer in an HIV-infected cohort (especially malignancies not currently associated with Acquired Immunodeficiency Syndrome (AIDS)) in relation to the general population. Methods. Cancer incidence data were collected for residents of Harris County, Texas who were diagnosed with a malignancy between 1975 and 1994. These data were linked to HIV/AIDS registry data to identify malignancies in an HIV-infected cohort of 14,986 persons. A standardized incidence ratio (SIR) analysis was used to compare the incidence of cancer in this cohort to that in the general population. Risk factors such as mode of HIV infection, age, race, and gender were evaluated for their contribution to the development of cancer within the HIV cohort, using Cox regression techniques. Findings. Of those in the HIV-infected cohort, 2289 persons (15%) were identified as having one or more malignancies. The linkage identified 29.5% of these malignancies (males 28.7%, females 60.9%). HIV-infected men and women had incidences of cancer that were 16.7 (16.1, 17.3) and 2.9 (2.3, 3.7) times that expected for the general population of Harris County, Texas, adjusting for age. Significant SIRs were observed for the AIDS-defining malignancies of Kaposi's sarcoma, non-Hodgkin's lymphoma, primary lymphoma of the brain, and cancer of the cervix. Additionally, significant SIRs were detected for non-melanotic skin cancer in males, 6.9 (4.8, 9.5), and colon cancer in females, 4.0 (1.1, 10.2). Among the HIV-infected cohort, race/ethnicity of White (relative risk 2.4 with 95% confidence interval 2.0, 2.8) or Spanish Surname, 2.2 (1.9, 2.7), and an infection route of male-to-male sex with, 3.0 (1.9, 4.9), or without, 3.4 (2.1, 5.5), intravenous drug use increased the risk of having a diagnosis of an incident cancer. Interpretation. There appears to be an increased risk of developing cancer if infected with HIV. In addition to the malignancies routinely associated with HIV infection, there appears to be an increased risk of being diagnosed with non-melanotic skin cancer in males and colon cancer in females.
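A standardized incidence ratio compares the number of cancers observed in a cohort with the number expected if the cohort had experienced the age-specific rates of the general population. The Python sketch below shows that calculation under stated assumptions. The abstract does not say how its confidence intervals were computed, so Byar's approximation is swapped in here as a common choice, and the person-years, rates, and observed count are invented numbers, not data from the study.

```python
import math


def expected_cases(strata):
    """Expected count: sum of person-years times reference rate per age group."""
    return sum(person_years * rate for person_years, rate in strata)


def sir_with_ci(observed, expected, z=1.96):
    """Return (SIR, lower, upper) using Byar's approximation to the Poisson CI."""
    if observed <= 0 or expected <= 0:
        raise ValueError("observed and expected must both be positive")
    sir = observed / expected
    lower = observed * (
        1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))
    ) ** 3 / expected
    plus_one = observed + 1
    upper = plus_one * (
        1 - 1 / (9 * plus_one) + z / (3 * math.sqrt(plus_one))
    ) ** 3 / expected
    return sir, lower, upper


# Hypothetical age strata: (cohort person-years, general-population rate).
strata = [(1000, 0.002), (2000, 0.0015)]
expected = expected_cases(strata)
sir, low, high = sir_with_ci(observed=10, expected=expected)
print(f"expected {expected:.1f}, SIR {sir:.2f} ({low:.2f}, {high:.2f})")
```

As with the intervals quoted in the abstract, an SIR is read as significant when its interval excludes 1.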
Abstract:
Of the large clinical trials evaluating screening mammography efficacy, none included women ages 75 and older. Recommendations on an upper age limit at which to discontinue screening are based on indirect evidence and are not consistent. Screening mammography is evaluated using observational data from the SEER-Medicare linked database. Measuring the benefit of screening mammography is difficult due to the impact of lead-time bias, length bias and over-detection. The underlying conceptual model divides the disease into two stages: pre-clinical (T0) and symptomatic (T1) breast cancer. Treating the time in these phases as a pair of dependent bivariate observations, (t0,t1), estimates are derived to describe the distribution of this random vector. To quantify the effect of screening mammography, statistical inference is made about the mammography parameters that correspond to the marginal distribution of the symptomatic phase duration (T1). This shows the hazard ratio of death from breast cancer comparing women with screen-detected tumors to those detected at their symptom onset is 0.36 (0.30, 0.42), indicating a benefit among the screen-detected cases.