931 results for CHD Prediction, Blood Serum Data Chemometrics Methods
Abstract:
Currently, the quality of the Indonesian national road network is inadequate due to several constraints, including overcapacity and overloaded trucks. The high deterioration rate of the road infrastructure in developing countries, along with major budgetary restrictions and high growth in traffic, has led to an emerging need for improving the performance of the highway maintenance system. However, the high number of intervening factors and their complex effects require advanced tools to successfully solve this problem. The high learning capabilities of Data Mining (DM) are a powerful solution to this problem. In the past, these tools have been successfully applied to solve complex and multi-dimensional problems in various scientific fields. Therefore, it is expected that DM can be used to analyze the large amount of data regarding the pavement and traffic, identify the relationship between variables, and provide information regarding the prediction of the data. In this paper, we present a new approach to predict the International Roughness Index (IRI) of pavement based on DM techniques. DM was used to analyze the initial IRI data, including age, Equivalent Single Axle Load (ESAL), cracking, potholes, rutting, and long cracks. This model was developed and verified using data from an Integrated Indonesia Road Management System (IIRMS) that was measured with the National Association of Australian State Road Authorities (NAASRA) roughness meter. The results of the proposed approach are compared with the IIRMS analytical model adapted to the IRI, and the advantages of the new approach are highlighted. We show that the novel data-driven model is able to learn (with high accuracy) the complex relationships between the IRI and the contributing factors, including overloaded trucks.
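The kind of data-driven IRI model described above can be sketched in miniature. The section records and feature values below are hypothetical, and a k-nearest-neighbour regressor stands in for whichever DM technique the authors actually used:

```python
import math

# Hypothetical pavement sections: (age_years, ESAL_millions, cracked_area_pct, IRI m/km)
sections = [
    (2, 0.5, 1.0, 2.1),
    (5, 1.2, 4.0, 3.0),
    (8, 2.0, 9.0, 4.2),
    (12, 3.5, 15.0, 6.0),
    (15, 4.0, 22.0, 7.5),
]

def predict_iri(age, esal, crack, k=2):
    """k-nearest-neighbour estimate of IRI from historical sections."""
    def dist(s):
        return math.sqrt((s[0] - age) ** 2 + (s[1] - esal) ** 2 + (s[2] - crack) ** 2)
    nearest = sorted(sections, key=dist)[:k]
    return sum(s[3] for s in nearest) / k
```

With more sections and features, the same idea lets the model pick up nonlinear relationships between roughness and the contributing factors without an explicit analytical formula.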
Abstract:
Recently, there has been a growing interest in the field of metabolomics, materialized by a remarkable growth in experimental techniques, available data and related biological applications. Indeed, techniques such as Nuclear Magnetic Resonance, Gas or Liquid Chromatography, Mass Spectrometry, Infrared and UV-visible spectroscopies have provided extensive datasets that can help in tasks such as biological and biomedical discovery, biotechnology and drug development. However, as happens with other omics data, the analysis of metabolomics datasets poses multiple challenges, both in terms of methodologies and in the development of appropriate computational tools. Indeed, none of the available software tools addresses the multiplicity of existing techniques and data analysis tasks. In this work, we make available a novel R package, named specmine, which provides a set of methods for metabolomics data analysis, including data loading in different formats, pre-processing, metabolite identification, univariate and multivariate data analysis, machine learning, and feature selection. Importantly, the implemented methods provide adequate support for the analysis of data from diverse experimental techniques, integrating a large set of functions from several R packages in a powerful, yet simple to use environment. The package, already available in CRAN, is accompanied by a web site where users can deposit datasets, scripts and analysis reports to be shared with the community, promoting the efficient sharing of metabolomics data analysis pipelines.
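specmine itself is an R package; as a language-neutral illustration of the univariate analysis step such pipelines include, here is a minimal Welch t-statistic in Python (the metabolite intensity values are invented):

```python
import statistics

# Hypothetical intensities of one metabolite in two sample groups
control = [10.1, 9.8, 10.4, 10.0]
treated = [12.3, 12.0, 12.6, 12.1]

def welch_t(a, b):
    """Welch's t-statistic for two groups with unequal variances."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    return (ma - mb) / ((va / len(a) + vb / len(b)) ** 0.5)
```

In a real pipeline this would be run per metabolite, with multiple-testing correction applied to the resulting p-values.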
Abstract:
BACKGROUND: Finding genes that are differentially expressed between conditions is an integral part of understanding the molecular basis of phenotypic variation. In the past decades, DNA microarrays have been used extensively to quantify the abundance of mRNA corresponding to different genes, and more recently high-throughput sequencing of cDNA (RNA-seq) has emerged as a powerful competitor. As the cost of sequencing decreases, it is conceivable that the use of RNA-seq for differential expression analysis will increase rapidly. To exploit the possibilities and address the challenges posed by this relatively new type of data, a number of software packages have been developed especially for differential expression analysis of RNA-seq data. RESULTS: We conducted an extensive comparison of eleven methods for differential expression analysis of RNA-seq data. All methods are freely available within the R framework and take as input a matrix of counts, i.e. the number of reads mapping to each genomic feature of interest in each of a number of samples. We evaluate the methods based on both simulated data and real RNA-seq data. CONCLUSIONS: Very small sample sizes, which are still common in RNA-seq experiments, pose problems for all evaluated methods and any results obtained under such conditions should be interpreted with caution. For larger sample sizes, the methods combining a variance-stabilizing transformation with the 'limma' method for differential expression analysis perform well under many different conditions, as does the nonparametric SAMseq method.
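The count-matrix input these methods share can be illustrated with a log-CPM (counts per million) transform, one simple variance-stabilizing-style transformation; the gene names and counts below are invented:

```python
import math

# Hypothetical count matrix: gene -> reads in each of four samples
counts = {
    "geneA": [150, 180, 900, 1000],   # apparently up-regulated in samples 3-4
    "geneB": [300, 280, 310, 290],    # roughly constant
}
lib_sizes = [sum(col) for col in zip(*counts.values())]  # reads per sample

def log_cpm(gene):
    """log2 counts-per-million with a small offset to avoid log(0)."""
    return [math.log2((c + 0.5) / (n + 1.0) * 1e6)
            for c, n in zip(counts[gene], lib_sizes)]
```

Normalizing by library size before comparing samples is the essential first step; the evaluated packages differ mainly in how they model the count variance afterwards.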
Abstract:
BACKGROUND: Chest pain can be caused by various conditions, with life-threatening cardiac disease being of greatest concern. Prediction scores to rule out coronary artery disease have been developed for use in emergency settings. We developed and validated a simple prediction rule for use in primary care. METHODS: We conducted a cross-sectional diagnostic study in 74 primary care practices in Germany. Primary care physicians recruited all consecutive patients who presented with chest pain (n = 1249) and recorded symptoms and findings for each patient (derivation cohort). An independent expert panel reviewed follow-up data obtained at six weeks and six months on symptoms, investigations, hospital admissions and medications to determine the presence or absence of coronary artery disease. Adjusted odds ratios of relevant variables were used to develop a prediction rule. We calculated measures of diagnostic accuracy for different cut-off values for the prediction scores using data derived from another prospective primary care study (validation cohort). RESULTS: The prediction rule contained five determinants (age/sex, known vascular disease, patient assumes pain is of cardiac origin, pain is worse during exercise, and pain is not reproducible by palpation), with the score ranging from 0 to 5 points. The area under the curve (receiver operating characteristic curve) was 0.87 (95% confidence interval [CI] 0.83-0.91) for the derivation cohort and 0.90 (95% CI 0.87-0.93) for the validation cohort. The best overall discrimination was with a cut-off value of 3 (positive result 3-5 points; negative result ≤ 2 points), which had a sensitivity of 87.1% (95% CI 79.9%-94.2%) and a specificity of 80.8% (77.6%-83.9%). INTERPRETATION: The prediction rule for coronary artery disease in primary care proved to be robust in the validation cohort. It can help to rule out coronary artery disease in patients presenting with chest pain in primary care.
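The five-determinant rule translates directly into code. The function and argument names below are mine; the one-point-per-determinant scoring and the cut-off of 3 follow the abstract:

```python
def chest_pain_score(age_sex_positive, known_vascular_disease,
                     patient_assumes_cardiac, pain_worse_on_exercise,
                     pain_not_reproducible_by_palpation):
    """Sum of the five determinants from the derivation cohort (0-5 points)."""
    return sum([age_sex_positive, known_vascular_disease,
                patient_assumes_cardiac, pain_worse_on_exercise,
                pain_not_reproducible_by_palpation])

def cad_likely(score, cutoff=3):
    """Best overall discrimination: positive 3-5 points, negative <= 2 points."""
    return score >= cutoff
```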
Abstract:
Background: Detection rates for adenoma and early colorectal cancer (CRC) are unsatisfactory due to low compliance towards invasive screening procedures such as colonoscopy. There is a large unmet screening need calling for an accurate, non-invasive and cost-effective test to screen for early neoplastic and pre-neoplastic lesions. Our goal is to identify effective biomarker combinations to develop a screening test aimed at detecting precancerous lesions and early CRC stages, based on a multigene assay performed on peripheral blood mononuclear cells (PBMC). Methods: A pilot study was conducted on 92 subjects. Colonoscopy revealed 21 CRC, 30 adenomas larger than 1 cm and 41 healthy controls. A panel of 103 biomarkers was selected by two approaches: a candidate gene approach based on literature review and whole transcriptome analysis of a subset of this cohort by Illumina TAG profiling. Blood samples were taken from each patient and PBMC purified. Total RNA was extracted and the 103 biomarkers were tested by multiplex RT-qPCR on the cohort. Different univariate and multivariate statistical methods were applied to the PCR data and 60 biomarkers, with significant p-value (< 0.01) for most of the methods, were selected. Results: The 60 biomarkers are involved in several different biological functions, such as cell adhesion, cell motility, cell signaling, cell proliferation, development and cancer. Two distinct molecular signatures derived from the biomarker combinations were established based on penalized logistic regression to separate patients without lesion from those with CRC or adenoma. These signatures were validated using a bootstrapping method, leading to a separation of patients without lesion from those with CRC (Se 67%, Sp 93%, AUC 0.87) and from those with adenoma larger than 1 cm (Se 63%, Sp 83%, AUC 0.77).
In addition, the organ and disease specificity of these signatures was confirmed by means of patients with other cancer types and inflammatory bowel diseases. Conclusions: The two defined biomarker combinations effectively detect the presence of CRC and adenomas larger than 1 cm with high sensitivity and specificity. A prospective, multicentric, pivotal study is underway in order to validate these results in a larger cohort.
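The signatures are built with penalized logistic regression; a minimal, single-biomarker ridge-penalized version can be sketched as follows (the expression values and labels are synthetic, and the real model combines many biomarkers):

```python
import math

# Hypothetical expression of one biomarker: label 1 = lesion, 0 = no lesion
data = [(0.2, 0), (0.4, 0), (0.5, 0), (1.5, 1), (1.7, 1), (2.0, 1)]

def fit_penalized_logistic(data, lam=0.1, lr=0.5, steps=2000):
    """Gradient descent on the logistic loss with an L2 (ridge) penalty."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x
            gb += (p - y)
        w -= lr * (gw / n + lam * w)   # penalty shrinks the coefficient
        b -= lr * gb / n
    return w, b

def prob_lesion(x, w, b):
    return 1.0 / (1.0 + math.exp(-(w * x + b)))
```

With dozens of biomarkers the penalty additionally performs implicit feature selection, which is why penalized regression is a natural fit for signature building.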
Abstract:
This report is concerned with the prediction of the long-time creep and shrinkage behavior of concrete. It is divided into three main areas. 1. The development of general prediction methods that can be used by a design engineer when specific experimental data are not available. 2. The development of prediction methods based on experimental data. These methods take advantage of equations developed in item 1, and can be used to accurately predict creep and shrinkage after only 28 days of data collection. 3. Experimental verification of items 1 and 2, and the development of specific prediction equations for four sand-lightweight aggregate concretes tested in the experimental program. The general prediction equations and methods are developed in Chapter II. Standard equations to estimate the creep of normal weight concrete (Eq. 9), sand-lightweight concrete (Eq. 12), and lightweight concrete (Eq. 15) are recommended. These equations are developed for standard conditions (see Sec. 2.1), and correction factors required to convert creep coefficients obtained from Equations 9, 12, and 15 to valid predictions for other conditions are given in Equations 17 through 23. The correction factors are shown graphically in Figs. 6 through 13. Similar equations and methods are developed for the prediction of the shrinkage of moist cured normal weight concrete (Eq. 30), moist cured sand-lightweight concrete (Eq. 33), and moist cured lightweight concrete (Eq. 36). For steam cured concrete the equations are Eq. 42 for normal weight concrete and Eq. 45 for lightweight concrete. Correction factors are given in Equations 47 through 52 and Figs. 18 through 24. Chapter III summarizes and illustrates, by examples, the prediction methods developed in Chapter II. Chapters IV and V describe an experimental program in which specific prediction equations are developed for concretes made with Haydite manufactured by Hydraulic Press Brick Co. (Eqs. 53 and 54), Haydite manufactured by Buildex Inc.
(Eqs. 55 and 56), Haydite manufactured by The Cater-Waters Corp. (Eqs. 57 and 58), and Idealite manufactured by Idealite Co. (Eqs. 59 and 60). General prediction equations are also developed from the data obtained in the experimental program (Eqs. 61 and 62) and are compared to similar equations developed in Chapter II. Creep and shrinkage prediction methods based on 28-day experimental data are developed in Chapter VI. The methods are verified by comparing predicted and measured values of the long-time creep and shrinkage of specimens tested at the University of Iowa (see Chapters IV and V) and elsewhere. The accuracy obtained is shown to be superior to that of other similar methods available to the design engineer.
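The specific equations are only cited by number here; as an illustration of the general time-ratio form this line of work led to (the hyperbolic curves later standardized in ACI 209, with typical average ultimate values assumed as defaults below):

```python
def creep_coefficient(t_days, ultimate_creep=2.35):
    """Creep coefficient at t days after loading: time-ratio form
    t^0.6 / (10 + t^0.6), approaching the ultimate value asymptotically."""
    return t_days ** 0.6 / (10.0 + t_days ** 0.6) * ultimate_creep

def shrinkage_strain(t_days, ultimate_shrinkage=780e-6):
    """Shrinkage of moist-cured concrete at t days after curing ends:
    time-ratio form t / (35 + t)."""
    return t_days / (35.0 + t_days) * ultimate_shrinkage
```

The correction factors mentioned in the report would multiply these standard-condition values to account for humidity, member size, slump, and similar deviations.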
Abstract:
In order to improve the efficacy and safety of treatments, drug dosage needs to be adjusted to the actual needs of each patient in a truly personalized medicine approach. Key for widespread dosage adjustment is the availability of point-of-care devices able to measure plasma drug concentration in a simple, automated, and cost-effective fashion. In the present work, we introduce and test a portable, palm-sized transmission-localized surface plasmon resonance (T-LSPR) setup, comprised of off-the-shelf components and coupled with DNA-based aptamers specific to the antibiotic tobramycin (467 Da). The core of the T-LSPR setup is aptamer-functionalized gold nanoislands (NIs) deposited on a glass slide covered with fluorine-doped tin oxide (FTO), which acts as a biosensor. The gold NIs exhibit localized plasmon resonance in the visible range matching the sensitivity of the complementary metal oxide semiconductor (CMOS) image sensor employed as a light detector. The combination of gold NIs on the FTO substrate, causing NIs size and pattern irregularity, might reduce the overall sensitivity but confers extremely high stability in high-ionic solutions, allowing it to withstand numerous regeneration cycles without sensing losses. With this rather simple T-LSPR setup, we show real-time label-free detection of tobramycin in buffer, measuring concentrations down to 0.5 μM. We determined an affinity constant of the aptamer-tobramycin pair consistent with the value obtained using a commercial propagating-wave based SPR. Moreover, our label-free system can detect tobramycin in filtered undiluted blood serum, measuring concentrations down to 10 μM with a theoretical detection limit of 3.4 μM. While the association signal of tobramycin onto the aptamer is masked by the serum injection, the quantification of the captured tobramycin is possible during the dissociation phase and leads to a linear calibration curve for the concentrations over the tested range (10-80 μM).
The plasmon shift following surface binding is calculated in terms of both plasmon peak location and hue, with the latter allowing faster data processing and real-time display of the results. The presented T-LSPR system shows for the first time label-free direct detection and quantification of a small molecule in the complex matrix of filtered undiluted blood serum. Its uncomplicated construction and compact size, together with its remarkable performance, represent a leap forward toward effective point-of-care devices for therapeutic drug concentration monitoring.
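The linear calibration over 10-80 μM reduces to an ordinary least-squares line relating dissociation-phase plasmon shift to concentration, which can then be inverted to quantify an unknown sample. The shift values below are invented for illustration:

```python
# Hypothetical calibration points: (concentration in uM, plasmon shift in nm)
calib = [(10, 0.20), (20, 0.41), (40, 0.79), (80, 1.62)]

def fit_line(points):
    """Closed-form simple linear regression (least squares)."""
    n = len(points)
    sx = sum(x for x, _ in points); sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points); sxy = sum(x * y for x, y in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

def concentration_from_shift(shift, slope, intercept):
    """Invert the calibration line to quantify an unknown sample."""
    return (shift - intercept) / slope
```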
Abstract:
PURPOSE: The purpose of our study was to assess whether a model combining clinical factors, MR imaging features, and genomics would better predict overall survival of patients with glioblastoma (GBM) than either individual data type. METHODS: The study was conducted leveraging The Cancer Genome Atlas (TCGA) effort supported by the National Institutes of Health. Six neuroradiologists reviewed MRI images from The Cancer Imaging Archive (http://cancerimagingarchive.net) of 102 GBM patients using the VASARI scoring system. The patients' clinical and genetic data were obtained from the TCGA website (http://www.cancergenome.nih.gov/). Patient outcome was measured in terms of overall survival time. The association between different categories of biomarkers and survival was evaluated using Cox analysis. RESULTS: The features that were significantly associated with survival were: (1) clinical factors: chemotherapy; (2) imaging: proportion of tumor contrast enhancement on MRI; and (3) genomics: HRAS copy number variation. The combination of these three biomarkers resulted in an incremental increase in the strength of prediction of survival, with the model that included clinical, imaging, and genetic variables having the highest predictive accuracy (area under the curve 0.679±0.068, Akaike's information criterion 566.7, P<0.001). CONCLUSION: A combination of clinical factors, imaging features, and HRAS copy number variation best predicts survival of patients with GBM.
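Discrimination of survival models like the one above is commonly summarized by Harrell's concordance index (a survival analogue of the AUC reported in the abstract); a minimal sketch, with all patient data invented:

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs in which the higher-risk patient
    has the shorter survival time (1.0 = perfect ranking, 0.5 = random)."""
    concordant = comparable = 0.0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Pair is comparable if patient i had the event before time j
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable
```

In the study's setting, the risk score would come from the fitted Cox model combining chemotherapy status, contrast-enhancement proportion, and HRAS copy number.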
Abstract:
Vancomycin is a glycopeptide antibiotic employed in the treatment of infections caused by certain methicillin-resistant staphylococci. It is also indicated for patients allergic to penicillin or when there is no response to penicillins or cephalosporins. Adequate vancomycin concentrations in blood serum lie between 5 and 10 mg/L. Higher values are toxic, causing mainly nephrotoxicity and ototoxicity. Various analytical methods are described in the literature: spectrophotometric, immunologic, biologic and chromatographic methods. This paper reviews the main analytical methods for vancomycin determination in biological fluids and in pharmaceutical preparations.
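The 5-10 mg/L range quoted in the review translates into a trivial level check (the function and category names are mine):

```python
THERAPEUTIC_RANGE_MG_L = (5.0, 10.0)  # serum range cited in the review

def classify_serum_level(mg_per_l):
    """Classify a measured vancomycin serum concentration."""
    low, high = THERAPEUTIC_RANGE_MG_L
    if mg_per_l < low:
        return "subtherapeutic"
    if mg_per_l > high:
        return "potentially toxic"  # risk of nephro- and ototoxicity
    return "therapeutic"
```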
Abstract:
Recent years have produced great advances in instrumentation technology. The amount of available data has been increasing due to the simplicity, speed and accuracy of current spectroscopic instruments. Most of these data are, however, meaningless without a proper analysis. This has been one of the reasons for the ever-growing success of multivariate handling of such data. Industrial data are commonly not designed data; in other words, there is no exact experimental design, but rather the data have been collected as a routine procedure during an industrial process. This makes certain demands on the multivariate modeling, as the selection of samples and variables can have an enormous effect. Common approaches in the modeling of industrial data are PCA (principal component analysis) and PLS (projection to latent structures or partial least squares), but there are also other methods that should be considered. The more advanced methods include multi-block modeling and nonlinear modeling. In this thesis it is shown that the results of data analysis vary according to the modeling approach used, thus making the selection of the modeling approach dependent on the purpose of the model. If the model is intended to provide accurate predictions, the approach should be different than in the case where the purpose of modeling is mostly to obtain information about the variables and the process. For industrial applicability it is essential that the methods are robust and sufficiently simple to apply. In this way the methods and the results can be compared and an approach selected that is suitable for the intended purpose. Differences in data analysis methods are compared with data from different fields of industry in this thesis. In the first two papers, the multi-block method is considered for data originating from the oil and fertilizer industries. The results are compared to those from PLS and priority PLS.
The third paper considers the applicability of multivariate models to process control for a reactive crystallization process. In the fourth paper, nonlinear modeling is examined with a data set from the oil industry. The response has a nonlinear relation to the descriptor matrix, and the results are compared between linear modeling, polynomial PLS and nonlinear modeling using nonlinear score vectors.
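PCA, the workhorse method discussed above, amounts to finding the leading eigenvectors of the data covariance. A minimal power-iteration sketch on invented, mean-centred process measurements:

```python
import math

# Hypothetical mean-centred process measurements (rows = samples, two variables)
X = [(2.0, 1.9), (-1.0, -1.1), (1.5, 1.4), (-2.5, -2.2)]

def first_principal_component(X, iters=100):
    """Power iteration on X'X to find the leading PCA loading vector."""
    d = len(X[0])
    v = [1.0] * d
    for _ in range(iters):
        w = [0.0] * d                      # w = (X'X) v
        for row in X:
            proj = sum(r * vi for r, vi in zip(row, v))
            for k in range(d):
                w[k] += row[k] * proj
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v
```

With the two variables strongly correlated, the leading loading vector weights them almost equally; scores along this direction would feed a PLS-style regression or a control chart.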
Abstract:
Learning Disability (LD) is a general term that describes specific kinds of learning problems. It is a neurological condition that affects a child's brain and impairs the ability to carry out one or many specific tasks. The learning disabled children are neither slow nor mentally retarded. This disorder can make it problematic for a child to learn as quickly or in the same way as a child who isn't affected by a learning disability. An affected child can have normal or above average intelligence. They may have difficulty paying attention, with reading or letter recognition, or with mathematics. It does not mean that children who have learning disabilities are less intelligent. In fact, many children who have learning disabilities are more intelligent than an average child. Learning disabilities vary from child to child. One child with LD may not have the same kind of learning problems as another child with LD. There is no cure for learning disabilities and they are life-long. However, children with LD can be high achievers and can be taught ways to get around the learning disability. In this research work, data mining using machine learning techniques is used to analyze the symptoms of LD, establish interrelationships between them and evaluate the relative importance of these symptoms. To increase the diagnostic accuracy of learning disability prediction, a knowledge based tool based on statistical machine learning or data mining techniques, with high accuracy, according to the knowledge obtained from the clinical information, is proposed. The basic idea of the developed knowledge based tool is to increase the accuracy of the learning disability assessment and reduce the time used for the same. Different statistical machine learning techniques in data mining are used in the study.
Identifying the important parameters of LD prediction using the data mining techniques, identifying the hidden relationships between the symptoms of LD and estimating the relative significance of each symptom of LD are also among the objectives of this research work. The developed tool has many advantages compared to the traditional methods of using check lists in the determination of learning disabilities. For improving the performance of various classifiers, we developed some preprocessing methods for the LD prediction system. A new system based on fuzzy and rough set models is also developed for LD prediction. Here also the importance of pre-processing is studied. A Graphical User Interface (GUI) is designed for developing an integrated knowledge based tool for prediction of LD as well as its degree. The designed tool stores the details of the children in the student database and retrieves their LD report as and when required. The present study undoubtedly proves the effectiveness of the tool developed based on various machine learning techniques. It also identifies the important parameters of LD and accurately predicts the learning disability in school age children. This thesis makes several major contributions in technical, general and social areas. The results are found to be very beneficial to the parents, teachers and the institutions. They are able to diagnose the child's problem at an early stage and can go for the proper treatments/counseling at the correct time so as to avoid the academic and social losses.
Abstract:
The precision farmer wants to manage the variation in soil nutrient status continuously, which requires reliable predictions at places between sampling sites. Ordinary kriging can be used for prediction if the data are spatially dependent and there is a suitable variogram model. However, even if data are spatially correlated, there are often few soil sampling sites in relation to the area to be managed. If intensive ancillary data are available and these are coregionalized with the sparse soil data, they could be used to increase the accuracy of predictions of the soil properties by methods such as cokriging, kriging with external drift and regression kriging. This paper compares the accuracy of predictions of the plant available N properties (mineral N and potentially available N) for two arable fields in Bedfordshire, United Kingdom, from ordinary kriging, cokriging, kriging with external drift and regression kriging. For the last three, intensive elevation data were used with the soil data. The mean squared errors of prediction from these methods of kriging were determined at validation sites where the values were known. Kriging with external drift resulted in the smallest mean squared error for two of the three properties examined, and cokriging for the other. The results suggest that the use of intensive ancillary data can increase the accuracy of predictions of soil properties in arable fields provided that the variables are related spatially. (c) 2005 Elsevier B.V. All rights reserved.
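Ordinary kriging, the baseline method compared above, solves a small linear system for the interpolation weights at each prediction point. A one-dimensional sketch with an assumed linear variogram (all coordinates and soil values invented):

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def ordinary_kriging(xs, zs, x0, gamma=lambda h: h):
    """Ordinary kriging prediction at x0 (linear variogram assumed)."""
    n = len(xs)
    # Kriging system: variogram matrix plus unbiasedness constraint
    A = [[gamma(abs(xs[i] - xs[j])) for j in range(n)] + [1.0] for i in range(n)]
    A.append([1.0] * n + [0.0])
    b = [gamma(abs(xi - x0)) for xi in xs] + [1.0]
    w = solve(A, b)[:n]  # drop the Lagrange multiplier
    return sum(wi * zi for wi, zi in zip(w, zs))
```

Cokriging and kriging with external drift extend this same system with cross-variograms or a trend in the ancillary variable (elevation, in the paper).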
Abstract:
Data assimilation is predominantly used for state estimation: combining observational data with model predictions to produce an updated model state that most accurately approximates the true system state whilst keeping the model parameters fixed. This updated model state is then used to initiate the next model forecast. Even with perfect initial data, inaccurate model parameters will lead to the growth of prediction errors. To generate reliable forecasts we need good estimates of both the current system state and the model parameters. This paper presents research into data assimilation methods for morphodynamic model state and parameter estimation. First, we focus on state estimation and describe implementation of a three-dimensional variational (3D-Var) data assimilation scheme in a simple 2D morphodynamic model of Morecambe Bay, UK. The assimilation of observations of bathymetry derived from SAR satellite imagery and a ship-borne survey is shown to significantly improve the predictive capability of the model over a 2 year run. Here, the model parameters are set by manual calibration; this is laborious and is found to produce different parameter values depending on the type and coverage of the validation dataset. The second part of this paper considers the problem of model parameter estimation in more detail. We explain how, by employing the technique of state augmentation, it is possible to use data assimilation to estimate uncertain model parameters concurrently with the model state. This approach removes inefficiencies associated with manual calibration and enables more effective use of observational data. We outline the development of a novel hybrid sequential 3D-Var data assimilation algorithm for joint state-parameter estimation and demonstrate its efficacy using an idealised 1D sediment transport model.
The results of this study are extremely positive and suggest that there is great potential for the use of data assimilation-based state-parameter estimation in coastal morphodynamic modelling.
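State augmentation can be illustrated with a toy example: a scalar decay model whose coefficient is unknown is estimated jointly with the state by appending the parameter to the state vector. A simple extended Kalman filter stands in here for the paper's hybrid sequential 3D-Var scheme, and all numbers are invented:

```python
# True system: x_{k+1} = a_true * x_k, observed directly each step.
a_true, x_true = 0.9, 10.0

# Augmented state [x, a] with covariance P, estimated jointly by an EKF.
x, a = 10.0, 0.5                 # poor initial guess for the parameter a
P = [[1.0, 0.0], [0.0, 0.1]]
Q = [[1e-6, 0.0], [0.0, 1e-6]]   # small model noise keeps the filter responsive
R = 0.01                         # observation-error variance

for _ in range(20):
    x_true *= a_true
    y = x_true                   # noiseless observation of the state

    # Predict: propagate augmented state; Jacobian F = [[a, x], [0, 1]]
    x_pred = a * x
    F = [[a, x], [0.0, 1.0]]
    P = [[sum(F[i][k] * P[k][l] for k in range(2)) for l in range(2)]
         for i in range(2)]                                        # F P
    P = [[sum(P[i][k] * F[j][k] for k in range(2)) + Q[i][j]
          for j in range(2)] for i in range(2)]                    # (F P) F' + Q

    # Update: observation operator H = [1, 0]; cross-covariance P[1][0]
    # is what lets the state observation correct the parameter.
    S = P[0][0] + R
    K = [P[0][0] / S, P[1][0] / S]
    innov = y - x_pred
    x = x_pred + K[0] * innov
    a = a + K[1] * innov
    P = [[P[i][j] - K[i] * P[0][j] for j in range(2)] for i in range(2)]
```

The key mechanism is the cross-covariance between state and parameter built up during the predict step: it converts state-observation misfits into parameter corrections, exactly the effect state augmentation exploits in the morphodynamic setting.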