22 results for Information Preservation Method

in DigitalCommons@The Texas Medical Center


Relevance:

100.00%

Publisher:

Abstract:

In population studies, most current methods focus on identifying one outcome-related SNP at a time by testing for differences in genotype frequencies between disease and healthy groups or among different population groups. However, testing a great number of SNPs simultaneously raises the problem of multiple testing and yields false-positive results. Although this problem can be dealt with effectively through approaches such as Bonferroni correction, permutation testing, and false discovery rates, patterns of joint effects from several genes, each with a weak effect, may remain undetectable. With the availability of high-throughput genotyping technology, searching for multiple scattered SNPs over the whole genome and modeling their joint effect on the target variable has become possible. An exhaustive search of all SNP subsets is computationally infeasible for the millions of SNPs in a genome-wide study. Several effective feature selection methods combined with classification functions have been proposed to search for an optimal SNP subset in large data sets where the number of feature SNPs far exceeds the number of observations. In this study, we took two steps to achieve this goal. First, we selected 1,000 SNPs through an effective filter method; then we performed feature selection wrapped around a classifier to identify an optimal SNP subset for predicting disease. We also developed a novel classification method, the sequential information bottleneck method, wrapped inside different search algorithms to identify an optimal subset of SNPs for classifying the outcome variable. This new method was compared with classical linear discriminant analysis in terms of classification performance. Finally, we performed a chi-square test to examine the relationship between each SNP and disease from another point of view.
In general, our results show that filtering features using the harmonic mean of sensitivity and specificity (HMSS) through linear discriminant analysis (LDA) is better than using LDA training accuracy or mutual information in our study. Our results also demonstrate that an exhaustive search of small subsets (one SNP, two SNPs, or three-SNP subsets based on the best 100 composite 2-SNPs) can find an optimal subset, and that further inclusion of more SNPs through a heuristic algorithm does not always increase the performance of SNP subsets. Although sequential forward floating selection can be applied to prevent the nesting effect of forward selection, it does not always outperform the latter, owing to overfitting from observing more complex subset states. Our results also indicate that HMSS, as a criterion for evaluating the classification ability of a function, can be used on imbalanced data without modifying the original dataset, unlike classification accuracy. Our four studies suggest that the sequential information bottleneck (sIB), a new unsupervised technique, can be adopted to predict the outcome, and its ability to detect the target status is superior to that of traditional LDA in this study. From our results, the best test probability-HMSS for predicting CVD, stroke, CAD, and psoriasis through sIB is 0.59406, 0.641815, 0.645315, and 0.678658, respectively. In terms of group prediction accuracy, the highest test accuracy of sIB for diagnosing a normal status among controls reaches 0.708999, 0.863216, 0.639918, and 0.850275, respectively, in the four studies if the test accuracy among cases is required to be no less than 0.4. Conversely, the highest test accuracy of sIB for diagnosing disease among cases reaches 0.748644, 0.789916, 0.705701, and 0.749436, respectively, in the four studies if the test accuracy among controls is required to be at least 0.4.
A further genome-wide association study through the chi-square test shows that no significant SNPs are detected at the cut-off level 9.09451E-08 in the Framingham Heart Study of CVD. Study results in WTCCC detect only two significant SNPs associated with CAD. In the genome-wide study of psoriasis, most of the top 20 SNP markers with impressive classification accuracy are also significantly associated with the disease through the chi-square test at the cut-off value 1.11E-07. Although our classification methods can achieve high accuracy in the study, complete descriptions of those classification results (95% confidence intervals or statistical tests of differences) require more cost-effective methods or a more efficient computing system, neither of which can currently be accomplished in our genome-wide study. We should also note that the purpose of this study is to identify subsets of SNPs with high prediction ability, and SNPs with good discriminant power are not necessarily causal markers for the disease.
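
The HMSS criterion used above to filter features and evaluate classifiers on imbalanced data can be sketched directly from its definition as the harmonic mean of sensitivity and specificity; the function below is an illustrative implementation, not the study's code.

```python
def hmss(tp, fn, tn, fp):
    """Harmonic mean of sensitivity and specificity (HMSS).

    Unlike raw accuracy, HMSS penalizes classifiers that favor the
    majority class, which is why it suits imbalanced case/control data.
    """
    sensitivity = tp / (tp + fn)  # fraction of cases correctly called
    specificity = tn / (tn + fp)  # fraction of controls correctly called
    if sensitivity + specificity == 0:
        return 0.0
    return 2 * sensitivity * specificity / (sensitivity + specificity)

# A classifier that calls almost everything "case" on imbalanced data:
# accuracy looks acceptable (0.55), but HMSS exposes the weak specificity.
score = hmss(90, 10, 20, 80)  # sensitivity 0.9, specificity 0.2
```

With sensitivity 0.9 and specificity 0.2, HMSS is about 0.327, far below the 0.55 accuracy, illustrating why the criterion needs no rebalancing of the original dataset.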

Relevance:

40.00%

Publisher:

Abstract:

Currently more than half of Electronic Health Record (EHR) projects fail. Most of these failures are not due to flawed technology, but rather to the lack of systematic consideration of human issues. Among the barriers to EHR adoption, function mismatching among users, activities, and systems is a major area that has not been systematically addressed from a human-centered perspective. A theoretical framework called the Functional Framework was developed for identifying and reducing functional discrepancies among users, activities, and systems. The Functional Framework is composed of three models – the User Model, the Designer Model, and the Activity Model. The User Model was developed by conducting a survey (N = 32) that identified the functions needed and desired from the user's perspective. The Designer Model was developed by conducting a systematic review of an Electronic Dental Record (EDR) and its functions. The Activity Model was developed using an ethnographic method called shadowing, in which EDR users (5 dentists, 5 dental assistants, 5 administrative personnel) were followed quietly and observed during their activities. These three models were combined to form a unified model. From the unified model, the work domain ontology was developed by asking users to rate the functions in the unified model (a total of 190 functions) along the dimensions of frequency and criticality in a survey. The functional discrepancies, as indicated by the regions of the Venn diagrams formed by the three models, were consistent with the survey results, especially with user satisfaction. The survey for the Functional Framework indicated the preference of one system over the other (R = 0.895). The results of this project showed that the Functional Framework provides a systematic method for identifying, evaluating, and reducing functional discrepancies among users, systems, and activities. Limitations and generalizability of the Functional Framework were discussed.
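
The discrepancy regions given by the Venn diagram of the three models can be illustrated with plain set operations; the function names below (charting, audit_log, etc.) are hypothetical stand-ins, not items from the 190-function survey.

```python
# Hypothetical illustration of the three-model Venn-diagram regions.
user = {"charting", "scheduling", "imaging", "billing"}      # User Model
designer = {"charting", "imaging", "billing", "audit_log"}   # Designer Model
activity = {"charting", "scheduling", "billing"}             # Activity Model

# Region where users, system, and observed activity all agree.
supported_and_used = user & designer & activity
# Functions users want that the system does not provide.
wanted_but_missing = user - designer
# Functions built into the system that nobody wants or uses.
built_but_unused = designer - (user | activity)
```

Each region of the diagram corresponds to a kind of functional discrepancy, which is what makes the unified model actionable for redesign.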

Relevance:

40.00%

Publisher:

Abstract:

This exploratory descriptive study presents a content analysis of all (N = 22) Family Preservation Journal (FPJ) articles published from its inception (1995) until today. Three raters independently used an analysis template to ascertain trends from these articles and assessed information about their purposes, methods, and findings/implications. The main findings were that fewer than half of the articles were deemed 'research'; few used standardized or outcome measures; none compared family preservation to another method; descriptive knowledge was the most likely to be generated; and the articles were primarily targeted to practitioners and other researchers. Given the relatively short history of FPJ, the majority of these findings were considered typical and consistent with the literature. The recommendations call for more comprehensive practice descriptions, more research, and more rigorous research-oriented studies.

Relevance:

30.00%

Publisher:

Abstract:

Despite current enthusiasm for the investigation of gene-gene and gene-environment interactions, the essential issue of how to define and detect gene-environment interactions remains unresolved. In this report, we define gene-environment interactions as a stochastic dependence in the context of the effects of genetic and environmental risk factors on the cause of phenotypic variation among individuals. We use mutual information, which is widely used in communication and complex-system analysis, to measure gene-environment interactions. We investigate how gene-environment interactions generate the large difference in the information measure of gene-environment interactions between the general population and a diseased population, which motivates us to develop mutual information-based statistics for testing gene-environment interactions. We validated the null distribution and calculated the type I error rates for the mutual information-based statistics using extensive simulation studies. We found that the new test statistics were more powerful than traditional logistic regression under several disease models. Finally, to further evaluate the performance of our new method, we applied the mutual information-based statistics to three real examples. Our results showed that P-values for the mutual information-based statistics were much smaller than those obtained by other approaches, including logistic regression models.
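
A minimal sketch of the mutual-information measure underlying the proposed statistics, computed from a joint genotype-by-exposure contingency table. This illustrates the dependence measure itself, not the full test statistic or its null distribution.

```python
import math

def mutual_information(joint):
    """Mutual information I(G;E) from a joint genotype x exposure count table.

    `joint` is a list of rows of counts; rows index genotypes, columns
    index environmental exposure levels.  I(G;E) = 0 exactly when
    genotype and exposure are independent, so a large value signals the
    stochastic dependence used above to define gene-environment interaction.
    """
    total = sum(sum(row) for row in joint)
    row_p = [sum(row) / total for row in joint]
    col_p = [sum(joint[i][j] for i in range(len(joint))) / total
             for j in range(len(joint[0]))]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, n in enumerate(row):
            if n == 0:
                continue  # 0 * log(0) contributes nothing
            p = n / total
            mi += p * math.log(p / (row_p[i] * col_p[j]))
    return mi  # in nats
```

An independent table yields 0; a perfectly dependent 2x2 table yields log 2, the maximum for two binary variables.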

Relevance:

30.00%

Publisher:

Abstract:

Purpose: The purpose of the Camp For All Connection project is to facilitate access to electronic health information resources at the Camp For All facility. Setting/Participants/Resources: Camp For All is a barrier-free camp working in partnership with organizations to enrich the lives of children and adults with chronic illnesses and disabilities and their families by providing camping and retreat experiences. The camp facility is located on 206 acres in Burton, Texas. The project partners are Texas Woman's University, Houston Academy of Medicine-Texas Medical Center Library, and Camp For All. Brief Description: The Camp For All Connection project placed Internet-connected workstations at the camp's health center in the main lodge and provided training in the use of electronic health information resources. A train-the-trainer approach was used to provide training to Camp For All staff. Results/Outcome: Project workstations are being used by health care providers and camp staff for communication purposes and to make better informed health care decisions for Camp For All campers. Evaluation Method: A post-training evaluation was administered at the end of the train-the-trainer session. In addition, a series of site visits and interviews was conducted with camp staff members involved in the project. The site visits and interviews allowed for ongoing dialog between project staff and project participants.

Relevance:

30.00%

Publisher:

Abstract:

A nonlinear viscoelastic image registration algorithm based on the demons paradigm and incorporating an inverse consistent constraint (ICC) is implemented. An inverse-consistent, symmetric cost function using mutual information (MI) as the similarity measure is employed. The cost function also includes regularization of the transformation and the inverse consistent error (ICE). The uncertainties in balancing the various terms in the cost function are avoided by alternately minimizing the similarity measure, the regularization of the transformation, and the ICE terms. Diffeomorphic registration, which prevents folding and/or tearing in the deformation, is achieved by the composition scheme. The quality of image registration is first demonstrated by constructing a brain atlas from 20 adult brains (age range 30-60 years). It is shown that with this registration technique (1) the Jacobian determinant is positive for all voxels and (2) the average ICE is around 0.004 voxels, with a maximum value below 0.1 voxels. Further, deformation-based segmentation on the Internet Brain Segmentation Repository, a publicly available dataset, yielded a high Dice similarity index (DSI) of 94.7% for the cerebellum and 74.7% for the hippocampus, attesting to the quality of our registration method.
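
The Dice similarity index reported above has a simple definition that can be sketched for binary segmentations represented as sets of voxel coordinates:

```python
def dice_index(seg_a, seg_b):
    """Dice similarity index between two binary segmentations.

    Each segmentation is a set of voxel coordinates; the index is
    2|A ∩ B| / (|A| + |B|), ranging from 0 (disjoint) to 1 (identical).
    """
    if not seg_a and not seg_b:
        return 1.0  # two empty segmentations agree trivially
    return 2 * len(seg_a & seg_b) / (len(seg_a) + len(seg_b))

# Two 4-voxel segmentations sharing 2 voxels give DSI = 0.5.
overlap = dice_index({(0, 0), (0, 1), (1, 0), (1, 1)},
                     {(1, 0), (1, 1), (2, 0), (2, 1)})
```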

Relevance:

30.00%

Publisher:

Abstract:

In spite of new legislation and much public and professional interest, intensive family preservation service (IFPS) remains in a vulnerable position compared to other child welfare services. This article details a method to project ideal IFPS caseloads as a function of the children who are at risk of placement by various referral sources. Using this approach, resource allocation for IFPS can be placed on more nearly equal ground with traditional child welfare functions, helping IFPS assume its needed place as a core service in the child welfare continuum.
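
The projection idea can be sketched as a small computation: ideal caseload as a function of children at risk of placement, by referral source. All figures below (referral fractions, worker capacity) are illustrative assumptions, not values from the article.

```python
# Hypothetical sketch of projecting an ideal IFPS caseload from
# placement-risk counts by referral source.  Every number here is an
# illustrative assumption for the sake of the arithmetic.
at_risk = {"CPS": 400, "juvenile_justice": 150, "mental_health": 90}
referral_fraction = {"CPS": 0.25, "juvenile_justice": 0.10, "mental_health": 0.20}
families_per_worker_year = 20  # assumed annual IFPS worker capacity

# Projected families needing IFPS = sum over sources of at-risk x fraction.
projected_families = sum(at_risk[s] * referral_fraction[s] for s in at_risk)
workers_needed = projected_families / families_per_worker_year
```

Tying staffing to at-risk counts in this way is what lets IFPS compete for resources on the same footing as traditional child welfare functions.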

Relevance:

30.00%

Publisher:

Abstract:

Family preservation has been criticized for implementing programs that are not theoretically founded. One result of this circumstance is a lack of information regarding the processes and outcomes of family preservation services. The knowledge base of family preservation is thus rather limited at present and will remain so unless theory is consistently integrated within individual programs. A model for conceptualizing how theoretical consistency may be implemented within programs is presented and applied to family preservation. It is also necessary for programs to establish theoretical consistency before theoretical diversity, both within individual programs and across multiple programs, in order to advance the field in meaningful ways. A developmental cycle of knowledge generation is presented and applied to family preservation.

Relevance:

30.00%

Publisher:

Abstract:

Family preservation programs designed to prevent the out-of-home placement of children depend on the coordination of services from multiple agencies. Little is known about how this coordination occurs; this case study examined the issue. Information was sought from all workers who provided services to each of five families and from the families' case records. Thirty-one workers were interviewed with a semi-structured interview schedule containing rating scales and open-ended questions. Case records were reviewed with a case record review form. Analyses of the data revealed the following: services were coordinated to a moderate degree, but coordination deteriorated over time. Workers elaborated on how aspects of communities, human service agencies, workers, and families affected coordination. Implications of the findings for future research were drawn.

Relevance:

30.00%

Publisher:

Abstract:

We developed a novel combinatorial method termed restriction endonuclease protection selection and amplification (REPSA) to identify consensus binding sites of DNA-binding ligands. REPSA uses a unique enzymatic selection based on the inhibition of cleavage by a type IIS restriction endonuclease, an enzyme that cleaves DNA at a site distal from its recognition sequence. Sequences bound by a ligand are protected from cleavage, while unprotected sequences are cleaved. This enzymatic selection occurs in solution under mild conditions and depends only on the DNA-binding ability of the ligand. Thus, REPSA is useful for a broad range of ligands, including all classes of DNA-binding ligands, weakly binding ligands, mixed populations of ligands, and unknown ligands. Here I describe REPSA and the application of this method to select the consensus DNA-binding sequences of three representative DNA-binding ligands: a nucleic acid (triplex-forming single-stranded DNA), a protein (the TATA-binding protein), and a small molecule (distamycin A). These studies generated new information regarding the specificity of these ligands in addition to establishing their DNA-binding sequences.
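
Stripped of the wet-lab details, REPSA's selection logic amounts to retaining only templates protected from cleavage by a bound ligand. The toy round below models "binding" as containing a hypothetical consensus motif; it is a conceptual sketch, not the laboratory protocol.

```python
# Toy model of one REPSA selection round: a type IIS endonuclease would
# cleave every template at a fixed offset unless a ligand is bound over
# the cleavage site.  Here "ligand bound" is approximated as the
# randomized region containing a hypothetical consensus motif.
def repsa_round(pool, motif):
    """Return the sequences protected from cleavage (ligand-bound)."""
    return [seq for seq in pool if motif in seq]

pool = ["GGATCCAATT", "GGTTAAATTT", "AATTGGATCC", "CCCCGGGGCC"]
survivors = repsa_round(pool, "AATT")  # 'AATT' stands in for a consensus site
```

Iterating such rounds with amplification enriches the pool toward the ligand's consensus, which is the point of the selection.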

Relevance:

30.00%

Publisher:

Abstract:

The purpose of this research and development project was to develop a method, a design, and a prototype for gathering, managing, and presenting data about occupational injuries. State-of-the-art systems analysis and design methodologies were applied to the long-standing problem in the field of occupational safety and health of processing workplace injury data into information for safety and health program management, as well as for preliminary research on accident etiologies. A top-down planning and bottom-up implementation approach was used to design an occupational injury management information system. A description of a managerial control system and a comprehensive system to integrate safety and health program management was provided. The project showed that current management information systems (MIS) theory and methods can be applied successfully to the problems of employee injury surveillance and control program performance evaluation. The model developed in the first section was applied at The University of Texas Health Science Center at Houston (UTHSCH). The system in current use at the UTHSCH was described and evaluated, and a prototype was developed for the UTHSCH. The prototype incorporated procedures for collecting, storing, and retrieving records of injuries, along with the procedures necessary to prepare reports, analyses, and graphics for management in the Health Science Center. Examples of reports, analyses, and graphics presenting UTHSCH and computer-generated data were included. It was concluded that a pilot test of this MIS should be implemented and evaluated at the UTHSCH and in other settings. Further research and development efforts on total safety and health management information systems, control systems, component systems, and variable selection should be pursued. Finally, integration of the safety and health program MIS into the comprehensive or executive MIS was recommended.
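
A minimal sketch of the kind of injury-record structure and management report such a prototype would produce; the field names and figures are illustrative assumptions, not the UTHSCH schema.

```python
# Illustrative injury-record store and a per-department management report.
from collections import Counter
from dataclasses import dataclass

@dataclass
class InjuryRecord:
    department: str   # unit where the injury occurred
    injury_type: str  # e.g. laceration, strain, needlestick
    days_lost: int    # absentee days attributed to the injury

records = [
    InjuryRecord("Facilities", "laceration", 2),
    InjuryRecord("Facilities", "strain", 5),
    InjuryRecord("Lab Services", "needlestick", 0),
]

# Report: injury counts and total days lost, by department.
counts = Counter(r.department for r in records)
days_lost = Counter()
for r in records:
    days_lost[r.department] += r.days_lost
```

Aggregations like these are the raw material for the surveillance reports and program-performance evaluations described above.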

Relevance:

30.00%

Publisher:

Abstract:

Early Employee Assistance Programs (EAPs) had their origin in humanitarian motives, and there was little concern for their cost/benefit ratios; however, as some programs began accumulating data and analyzing it over time, even with single variables such as absenteeism, it became apparent that the humanitarian reasons for a program could be reinforced by cost savings, particularly when the existence of the program was subject to justification. Today there is general agreement that cost/benefit analyses of EAPs are desirable, but specific models for such analyses, particularly those making use of sophisticated yet simple computer-based data management systems, are few. The purpose of this research and development project was to develop a method, a design, and a prototype for gathering, managing, and presenting information about EAPs. This scheme provides information retrieval and analyses relevant to such aspects of EAP operations as (1) EAP personnel activities, (2) supervisory training effectiveness, (3) client population demographics, (4) assessment and referral effectiveness, (5) treatment network efficacy, and (6) the economic worth of the EAP. The scheme has been implemented and operational at The University of Texas Employee Assistance Programs for more than three years. Application of the scheme in the various programs has defined certain variables that remain necessary in all programs. Depending on the degree of aggressiveness in data acquisition maintained by program personnel, other program-specific variables are also defined.
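
The single-variable (absenteeism) cost/benefit reasoning mentioned above can be sketched in a few lines; every figure below is an illustrative assumption, not data from the University of Texas programs.

```python
# Illustrative single-variable cost/benefit calculation for an EAP,
# using absenteeism as the lone benefit measure.  All figures are
# assumptions for the sake of the arithmetic.
program_cost = 50_000.0        # assumed annual EAP operating cost ($)
days_saved = 400               # assumed absentee days avoided post-EAP
cost_per_absent_day = 180.0    # assumed employer cost of one absent day ($)

benefit = days_saved * cost_per_absent_day  # dollar value of days saved
ratio = benefit / program_cost              # benefit per dollar of cost
```

A ratio above 1.0 on absenteeism alone is the kind of result that, as noted above, reinforced the humanitarian case when programs had to justify their existence.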

Relevance:

30.00%

Publisher:

Abstract:

Historically, morphological features were the primary means of classifying organisms. The age of molecular genetics, however, has allowed us to approach this field from the perspective of the organism's genetic code. Early work used highly conserved sequences, such as ribosomal RNA. The increasing number of complete genomes in public data repositories provides the opportunity to look not only at a single gene but at an organism's entire parts list. Here the Sequence Comparison Index (SCI) and the Organism Comparison Index (OCI), algorithms and methods for comparing proteins and proteomes, are presented. The complete proteomes of 104 sequenced organisms were compared. Over 280 million full Smith-Waterman alignments were performed on sequence pairs that had a reasonable expectation of being related. From these alignments a whole-proteome phylogenetic tree was constructed. This method was also used to compare the small subunit (SSU) rRNA from each organism, and a tree was constructed from these results. The SSU rRNA tree by the SCI/OCI method looks very much like accepted SSU rRNA trees from sources such as the Ribosomal Database Project, thus validating the method. The SCI/OCI proteome tree showed a number of small but significant differences when compared to the SSU rRNA tree and to proteome trees constructed by other methods. Horizontal gene transfer does not appear to affect the SCI/OCI trees until the transferred genes make up a large portion of the proteome. As part of this work, the Database of Related Local Alignments (DaRLA) was created; it contains over 81 million rows of sequence alignment information. DaRLA, while primarily used to build the whole-proteome trees, can also be applied to shared gene content analysis, gene order analysis, and the creation of individual protein trees. Finally, the standard BLAST method for analyzing shared gene content was compared to the SCI method using four spirochetes. The SCI system performed flawlessly, finding all proteins from one organism against itself and finding all the ribosomal proteins between organisms. The BLAST system missed some proteins from its respective organism and failed to detect small ribosomal proteins between organisms.
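
The abstract does not give the SCI/OCI formulas, but the general step of turning pairwise alignment scores into organism-level distances can be sketched; the self-score normalization below is an assumption for illustration, not the dissertation's definition.

```python
# Illustrative conversion of pairwise alignment scores into a distance,
# in the spirit of whole-proteome comparison.  The normalization by the
# larger self-alignment score is an assumed, simplified form.
def alignment_distance(score_ab, score_aa, score_bb):
    """Distance in [0, 1] from an alignment score and the two self-scores.

    Identical sequences align as well as they align to themselves,
    giving distance 0; unrelated sequences score near 0, giving
    distance near 1.  A matrix of such distances over all organism
    pairs is what a tree-building method consumes.
    """
    return 1.0 - score_ab / max(score_aa, score_bb)

d_related = alignment_distance(80.0, 100.0, 90.0)   # fairly similar pair
d_identical = alignment_distance(100.0, 100.0, 100.0)
```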

Relevance:

30.00%

Publisher:

Abstract:
Resumo:

Purpose. To examine the association between living in proximity to Toxics Release Inventory (TRI) facilities and the incidence of childhood cancer in the State of Texas.

Design. This is a secondary data analysis utilizing the publicly available Toxics Release Inventory (TRI), maintained by the U.S. Environmental Protection Agency, which lists the facilities that release any of the 650 TRI chemicals. Total childhood cancer cases and childhood cancer rates (ages 0-14 years) by county for the years 1995-2003 were taken from the Texas Cancer Registry, available on the Texas Department of State Health Services website. Setting: This study was limited to the child population of the State of Texas.

Method. Analysis was done using Stata version 9 and SPSS version 15.0. SaTScan was used for geographical spatial clustering of childhood cancer cases based on county centroids, using the Poisson clustering algorithm, which adjusts for population density. Maps were created using MapInfo Professional version 8.0.

Results. One hundred and twenty-five counties had no TRI facilities in their region, while 129 counties had at least one TRI facility. An increasing trend in the number of facilities and in total disposal was observed, except for the highest category based on cancer-rate quartiles. Linear regression using log transformations of the number of facilities and total disposal to predict cancer rates was computed; however, neither variable was a significant predictor. Seven significant geographical spatial clusters of counties with high childhood cancer rates (p < 0.05) were indicated. Binomial logistic regression, categorizing the cancer rate into two groups (<=150 and >150), indicated an odds ratio of 1.58 (CI 1.127, 2.222) for the natural log of the number of facilities.

Conclusion. We used a unique methodology combining GIS and spatial clustering techniques with existing statistical approaches to examine the association between living in proximity to TRI facilities and the incidence of childhood cancer in the State of Texas. Although a concrete association was not indicated, further studies examining specific TRI chemicals are required. This information can enable researchers and the public to identify potential concerns, gain a better understanding of potential risks, and work with industry and government to reduce toxic chemical use, disposal, or other releases and the risks associated with them. TRI data, in conjunction with other information, can be used as a starting point in evaluating exposures and risks.
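
The dichotomized analysis reported above (cancer rate <=150 vs. >150) can be sketched as an odds-ratio computation on a 2x2 table; the county counts below are illustrative, not the study's data.

```python
import math

# Illustrative 2x2 table of counties: cancer rate dichotomized at 150,
# exposure dichotomized as having any TRI facility vs. none.
#                 (rate > 150, rate <= 150)
with_tri = (40, 89)       # counties with at least one TRI facility
without_tri = (25, 100)   # counties with no TRI facility

odds_exposed = with_tri[0] / with_tri[1]
odds_unexposed = without_tri[0] / without_tri[1]
odds_ratio = odds_exposed / odds_unexposed  # > 1 suggests higher odds near TRI
log_or = math.log(odds_ratio)               # scale used by logistic regression
```

The study's reported coefficient is on the natural log of the facility count rather than a simple exposed/unexposed split, but the odds-ratio interpretation is the same.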

Relevance:

30.00%

Publisher:

Abstract:

Microarray technology is a high-throughput method for genotyping and gene expression profiling. Limited sensitivity and specificity are among the essential problems of this technology. Most existing methods of microarray data analysis have an apparent limitation in that they deal only with the numerical part of microarray data and make little use of gene sequence information. Because it is the gene sequences that precisely define the physical objects being measured by a microarray, it is natural to make the gene sequences an essential part of the data analysis. This dissertation focused on the development of free-energy models to integrate sequence information into microarray data analysis. The models were used to characterize the mechanism of hybridization on microarrays and to enhance the sensitivity and specificity of microarray measurements. Cross-hybridization is a major obstacle to the sensitivity and specificity of microarray measurements. In this dissertation, we evaluated the scope of the cross-hybridization problem on short-oligo microarrays. The results showed that cross-hybridization on arrays is mostly caused by oligo fragments with a run of 10 to 16 nucleotides complementary to the probes. Furthermore, a free-energy-based model was proposed to quantify the amount of cross-hybridization signal on each probe. This model treats cross-hybridization as an integral effect of the interactions between a probe and various off-target oligo fragments. Using public spike-in datasets, the model showed high accuracy in predicting the cross-hybridization signals on probes whose intended targets are absent from the sample. Several prospective models were proposed to improve the Positional Dependent Nearest-Neighbor (PDNN) model for better quantification of gene expression and cross-hybridization. The problem addressed in this dissertation is fundamental to microarray technology. We expect that this study will help us understand the detailed mechanism that determines sensitivity and specificity on microarrays. Consequently, this research will have a wide impact on how microarrays are designed and how the data are interpreted.
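
The free-energy models above build on nearest-neighbor stacking terms for the probe-target duplex; a minimal sketch of such a sum is below, with placeholder stacking parameters rather than the fitted PDNN values.

```python
# Minimal nearest-neighbor free-energy sum for a DNA duplex.  The
# per-stack values are illustrative placeholders, NOT fitted microarray
# parameters; only the bookkeeping (summing dinucleotide stacks, with a
# stack sharing its reverse-complement's parameter) is the point.
NN_DG = {  # illustrative delta-G per dinucleotide stack (kcal/mol)
    "AA": -1.0, "AT": -0.9, "TA": -0.6, "CA": -1.4, "GT": -1.4,
    "CT": -1.3, "GA": -1.3, "CG": -2.2, "GC": -2.4, "GG": -1.8,
}

def duplex_dg(seq):
    """Sum nearest-neighbor terms over a probe sequence (5'->3')."""
    total = 0.0
    for i in range(len(seq) - 1):
        stack = seq[i:i + 2]
        # A stack and its reverse complement contribute the same term.
        comp = stack[::-1].translate(str.maketrans("ACGT", "TGCA"))
        total += NN_DG.get(stack, NN_DG.get(comp, 0.0))
    return total
```

A model of this shape, extended with position-dependent weights along the probe, is the kind of term the PDNN framework uses to separate gene-specific signal from cross-hybridization.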