Biblioteca Digital

947 results for stochastic search variable selection

Principled Sure Independence Screening for Cox Models with Ultra-high-dimensional Covariates

Relevance: 100.00%

Survival Analysis with Large Dimensional Covariates: An Application in Microarray Studies

Relevance: 100.00%

Abstract:

Use of microarray technology often leads to high-dimensional, low-sample-size data settings. Over the past several years, a variety of novel approaches have been proposed for variable selection in this context. However, only a small number of these have been adapted for time-to-event data where censoring is present. Among standard variable selection methods shown both to have good predictive accuracy and to be computationally efficient is the elastic net penalization approach. In this paper, adaptation of the elastic net approach is presented for variable selection both under the Cox proportional hazards model and under an accelerated failure time (AFT) model. Assessment of the two methods is conducted through simulation studies and through analysis of microarray data obtained from a set of patients with diffuse large B-cell lymphoma where time to survival is of interest. The approaches are shown to match or exceed the predictive performance of a Cox-based and an AFT-based variable selection method. The methods are moreover shown to be much more computationally efficient than their respective Cox- and AFT-based counterparts.
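As a rough illustration of the elastic net penalty under the Cox model mentioned in this abstract, the sketch below evaluates a penalized negative log partial likelihood with NumPy. The function name, the `lam`/`alpha` parameterization, the division by the sample size, and the lack of special handling for tied event times are all assumptions of this sketch, not details taken from the paper.

```python
import numpy as np

def cox_elastic_net_objective(beta, X, time, event, lam=0.1, alpha=0.5):
    """Negative log partial likelihood plus an elastic net penalty.

    Hypothetical helper: X is (n, p), time holds follow-up times,
    event is 1 for an observed event and 0 for a censored subject.
    """
    eta = X @ beta  # linear predictor for every subject
    nll = 0.0
    for i in np.flatnonzero(event):
        at_risk = time >= time[i]  # risk set at the i-th event time
        nll -= eta[i] - np.log(np.sum(np.exp(eta[at_risk])))
    # alpha = 1 gives the lasso penalty, alpha = 0 the ridge penalty
    penalty = lam * (alpha * np.sum(np.abs(beta))
                     + 0.5 * (1.0 - alpha) * np.sum(beta ** 2))
    return nll / len(time) + penalty
```

A variable selection procedure would minimize this objective over `beta` for a grid of `lam` values and keep the covariates whose coefficients stay nonzero; the AFT variant is not shown.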

Effectively Selecting a Target Population for a Future Comparative Study

Relevance: 100.00%

Abstract:

When comparing a new treatment with a control in a randomized clinical study, the treatment effect is generally assessed by evaluating a summary measure over a specific study population. The success of the trial heavily depends on the choice of such a population. In this paper, we show a systematic, effective way to identify a promising population, for which the new treatment is expected to have a desired benefit, using the data from a current study involving similar comparator treatments. Specifically, with the existing data we first create a parametric scoring system using multiple covariates to estimate subject-specific treatment differences. Using this system, we specify a desired level of treatment difference and create a subgroup of patients, defined as those whose estimated scores exceed this threshold. An empirically calibrated group-specific treatment difference curve across a range of threshold values is constructed. The population of patients with any desired level of treatment benefit can then be identified accordingly. To avoid any "self-serving" bias, we utilize a cross-training-evaluation method for implementing the above two-step procedure. Lastly, we show how to select the best scoring system among all competing models. The proposals are illustrated with the data from two clinical trials in treating AIDS and cardiovascular diseases. Note that if we are not interested in designing a new study for comparing similar treatments, the new procedure can also be quite useful for the management of future patients who would receive nontrivial benefits to compensate for the risk or cost of the new treatment.
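The threshold-based subgroup step described in this abstract can be sketched as follows. This is a minimal illustration that assumes the scores were already estimated on separate training data; it does not implement the cross-training-evaluation scheme, and the function and argument names are hypothetical.

```python
import numpy as np

def subgroup_difference_curve(score, treated, outcome, thresholds):
    """Treated-minus-control mean outcome among subjects with score >= c.

    Returns one value per threshold c; NaN when an arm is empty.
    """
    curve = []
    for c in thresholds:
        keep = score >= c  # subgroup defined by the score threshold
        in_treated = outcome[keep & (treated == 1)]
        in_control = outcome[keep & (treated == 0)]
        if len(in_treated) and len(in_control):
            curve.append(in_treated.mean() - in_control.mean())
        else:
            curve.append(np.nan)
    return np.array(curve)
```

Reading the curve from left to right shows how the observed treatment difference changes as the subgroup is restricted to patients with higher estimated scores.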

Towards greater accuracy in individual-tree mortality regression

Relevance: 100.00%

Abstract:

Background mortality is an essential component of any forest growth and yield model. Forecasts of mortality contribute largely to the variability and accuracy of model predictions at the tree, stand and forest level. In the present study, I implement and evaluate state-of-the-art techniques to increase the accuracy of individual tree mortality models, similar to those used in many of the current variants of the Forest Vegetation Simulator, using data from North Idaho and Montana. The first technique addresses methods to correct for bias induced by measurement error typically present in competition variables. The second implements survival regression and evaluates its performance against the traditional logistic regression approach. I selected the regression calibration (RC) algorithm as a good candidate for addressing the measurement error problem. Two logistic regression models for each species were fitted, one ignoring the measurement error, which is the “naïve” approach, and the other applying RC. The models fitted with RC outperformed the naïve models in terms of discrimination when the competition variable was found to be statistically significant. The effect of RC was more obvious where measurement error variance was large and for more shade-intolerant species. The process of model fitting and variable selection revealed that past emphasis on DBH as a predictor variable for mortality, while producing models with strong metrics of fit, may make models less generalizable. The evaluation of the error variance estimator developed by Stage and Wykoff (1998), and core to the implementation of RC, in different spatial patterns and diameter distributions, revealed that the Stage and Wykoff estimate notably overestimated the true variance in all simulated stands except those that are clustered. Results show a systematic bias even when all the assumptions made by the authors are guaranteed.
I argue that this is the result of the Poisson-based estimate ignoring the overlapping area of potential plots around a tree. The effects of the variance estimate, especially in the application phase, justify the suggested future efforts to improve its accuracy. The second technique implemented and evaluated is a survival regression model that accounts for the time-dependent nature of variables, such as diameter and competition variables, and the interval-censored nature of data collected from remeasured plots. The performance of the model is compared with the traditional logistic regression model as a tool to predict individual tree mortality. Validation of both approaches shows that the survival regression approach discriminates better between dead and alive trees for all species. In conclusion, I showed that the proposed techniques do increase the accuracy of individual tree mortality models, and are a promising first step towards the next generation of background mortality models. I have also identified the next steps to undertake in order to advance mortality models further.

Predicting smoking cessation and its relapse in HIV-infected patients: the Swiss HIV Cohort Study

Relevance: 100.00%

Abstract:

OBJECTIVES: The aim of the study was to assess whether prospective follow-up data within the Swiss HIV Cohort Study can be used to predict which patients stop smoking or, among smokers who stop, which start smoking again. METHODS: We built prediction models first using clinical reasoning ('clinical models') and then by selecting from numerous candidate predictors using advanced statistical methods ('statistical models'). Our clinical models were based on literature that suggests that motivation drives smoking cessation, while dependence drives relapse in those attempting to stop. Our statistical models were based on automatic variable selection using additive logistic regression with component-wise gradient boosting. RESULTS: Of 4833 smokers, 26% stopped smoking, at least temporarily; among those who stopped, 48% started smoking again. The predictive performance of our clinical and statistical models was modest. A basic clinical model for cessation, with patients classified into three motivational groups, was nearly as discriminatory as a constrained statistical model with just the most important predictors (the ratio of nonsmoking visits to total visits, alcohol or drug dependence, psychiatric comorbidities, recent hospitalization and age). A basic clinical model for relapse, based on the maximum number of cigarettes per day prior to stopping, was not as discriminatory as a constrained statistical model with just the ratio of nonsmoking visits to total visits. CONCLUSIONS: Predicting smoking cessation and relapse is difficult, so that simple models are nearly as discriminatory as complex ones. Patients with a history of attempting to stop and those known to have stopped recently are the best candidates for an intervention.

A chrysophyte-based quantitative reconstruction of winter severity from varved lake sediments in NE Poland during the past millennium and its relationship to natural climate variability

Relevance: 100.00%

Abstract:

Chrysophyte cysts are recognized as powerful proxies of cold-season temperatures. In this paper we use the relationship between chrysophyte assemblages and the number of days below 4 °C (DB4 °C) in the epilimnion of a lake in northern Poland to develop a transfer function and to reconstruct winter severity in Poland for the last millennium. DB4 °C is a climate variable related to the length of the winter. Multivariate ordination techniques were used to study the distribution of chrysophytes from sediment traps of 37 low-land lakes distributed along a variety of environmental and climatic gradients in northern Poland. Of all the environmental variables measured, stepwise variable selection and individual redundancy analyses (RDA) identified DB4 °C as the most important variable for chrysophytes, explaining a portion of variance independent of variables related to water chemistry (conductivity, chlorides, K, sulfates), which were also important. A quantitative transfer function was created to estimate DB4 °C from sedimentary assemblages using partial least squares regression (PLS). The two-component model (PLS-2) had a cross-validated coefficient of determination (R²cross) of 0.58, with a root mean squared error of prediction (RMSEP, based on leave-one-out) of 3.41 days. The resulting transfer function was applied to an annually-varved sediment core from Lake Żabińskie, providing a new sub-decadal quantitative reconstruction of DB4 °C with high chronological accuracy for the period AD 1000–2010. During Medieval Times (AD 1180–1440) winters were generally shorter (warmer) except for a decade with very long and severe winters around AD 1260–1270 (following the AD 1258 volcanic eruption). The 16th and 17th centuries and the beginning of the 19th century experienced very long severe winters.
Comparison with other European cold-season reconstructions and atmospheric indices for this region indicates that a large part of the winter variability (reconstructed DB4 °C) is due to the interplay between the oscillations of the zonal flow controlled by the North Atlantic Oscillation (NAO) and the influence of continental anticyclonic systems (Siberian High, East Atlantic/Western Russia pattern). Differences with other European records are attributed to geographic climatological differences between Poland and Western Europe (Low Countries, Alps). Striking correspondence between the combined volcanic and solar forcing and the DB4 °C reconstruction prior to the 20th century suggests that winter climate in Poland responds mostly to natural forced variability (volcanic and solar) and the influence of unforced variability is low.

Prophylactic antibiotics or G(M)-CSF for the prevention of infections and improvement of survival in cancer patients receiving myelotoxic chemotherapy.

Relevance: 100.00%

Abstract:

BACKGROUND Febrile neutropenia (FN) and other infectious complications are some of the most serious treatment-related toxicities of chemotherapy for cancer, with a mortality rate of 2% to 21%. The two main types of prophylactic regimens are granulocyte (macrophage) colony-stimulating factors (G(M)-CSF) and antibiotics, frequently quinolones or cotrimoxazole. Current guidelines recommend the use of colony-stimulating factors when the risk of febrile neutropenia is above 20%, but they do not mention the use of antibiotics. However, both regimens have been shown to reduce the incidence of infections. Since no systematic review has compared the two regimens, a systematic review was undertaken. OBJECTIVES To compare the efficacy and safety of G(M)-CSF compared to antibiotics in cancer patients receiving myelotoxic chemotherapy. SEARCH METHODS We searched The Cochrane Library, MEDLINE, EMBASE, databases of ongoing trials, and conference proceedings of the American Society of Clinical Oncology and the American Society of Hematology (1980 to December 2015). We planned to include both full-text and abstract publications. Two review authors independently screened search results. SELECTION CRITERIA We included randomised controlled trials (RCTs) comparing prophylaxis with G(M)-CSF versus antibiotics for the prevention of infection in cancer patients of all ages receiving chemotherapy. All study arms had to receive identical chemotherapy regimes and other supportive care. We included full-text, abstracts, and unpublished data if sufficient information on study design, participant characteristics, interventions and outcomes was available. We excluded cross-over trials, quasi-randomised trials and post-hoc retrospective trials. DATA COLLECTION AND ANALYSIS Two review authors independently screened the results of the search strategies, extracted data, assessed risk of bias, and analysed data according to standard Cochrane methods. 
We did final interpretation together with an experienced clinician. MAIN RESULTS In this updated review, we included no new randomised controlled trials. We included two trials in the review, one with 40 breast cancer patients receiving high-dose chemotherapy and G-CSF compared to antibiotics, a second one evaluating 155 patients with small-cell lung cancer receiving GM-CSF or antibiotics. We judge the overall risk of bias as high in the G-CSF trial, as neither patients nor physicians were blinded and not all included patients were analysed as randomised (7 out of 40 patients). We considered the overall risk of bias in the GM-CSF trial to be moderate, because of the risk of performance bias (neither patients nor personnel were blinded), but low risk of selection and attrition bias. For the trial comparing G-CSF to antibiotics, all-cause mortality was not reported. There was no evidence of a difference for infection-related mortality, with zero events in each arm. Microbiologically or clinically documented infections, severe infections, quality of life, and adverse events were not reported. There was no evidence of a difference in frequency of febrile neutropenia (risk ratio (RR) 1.22; 95% confidence interval (CI) 0.53 to 2.84). The quality of the evidence for the two reported outcomes, infection-related mortality and frequency of febrile neutropenia, was very low, due to the low number of patients evaluated (high imprecision) and the high risk of bias. There was no evidence of a difference in terms of median survival time in the trial comparing GM-CSF and antibiotics. Two-year survival times were 6% (0 to 12%) in both arms (high imprecision, low quality of evidence). There were four toxic deaths in the GM-CSF arm and three in the antibiotics arm (3.8%), without evidence of a difference (RR 1.32; 95% CI 0.30 to 5.69; P = 0.71; low quality of evidence).
There were 28% grade III or IV infections in the GM-CSF arm and 18% in the antibiotics arm, without any evidence of a difference (RR 1.55; 95% CI 0.86 to 2.80; P = 0.15, low quality of evidence). There were 5 episodes out of 360 cycles of grade IV infections in the GM-CSF arm and 3 episodes out of 334 cycles in the cotrimoxazole arm (0.8%), with no evidence of a difference (RR 1.55; 95% CI 0.37 to 6.42; P = 0.55; low quality of evidence). There was no significant difference between the two arms for non-haematological toxicities like diarrhoea, stomatitis, infections, neurologic, respiratory, or cardiac adverse events. Grade III and IV thrombopenia occurred significantly more frequently in the GM-CSF arm (60.8%) compared to the antibiotics arm (28.9%); (RR 2.10; 95% CI 1.41 to 3.12; P = 0.0002; low quality of evidence). Neither infection-related mortality, incidence of febrile neutropenia, nor quality of life were reported in this trial. AUTHORS' CONCLUSIONS As we only found two small trials with 195 patients altogether, no conclusion for clinical practice is possible. More trials are necessary to assess the benefits and harms of G(M)-CSF compared to antibiotics for infection prevention in cancer patients receiving chemotherapy.
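The risk ratios and confidence intervals quoted in this abstract can be checked with a short calculation. The sketch below uses the standard Wald interval on the log scale and the grade IV infection counts given above (5 of 360 cycles versus 3 of 334 cycles); treating cycles as independent units and using z = 1.96 are assumptions of this sketch, and the function name is hypothetical.

```python
import math

def risk_ratio_ci(events_a, total_a, events_b, total_b, z=1.96):
    """Risk ratio of arm A versus arm B with a log-scale Wald interval."""
    rr = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(RR) for two independent binomial proportions
    se_log = math.sqrt(1 / events_a - 1 / total_a
                       + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Grade IV infections: GM-CSF arm versus cotrimoxazole arm
rr, lower, upper = risk_ratio_ci(5, 360, 3, 334)
print(f"RR {rr:.2f}; 95% CI {lower:.2f} to {upper:.2f}")
# prints "RR 1.55; 95% CI 0.37 to 6.42"
```

The result agrees with the reported RR 1.55 (95% CI 0.37 to 6.42), and the interval spanning 1 is what "no evidence of a difference" refers to.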

AN OCCUPATIONAL INJURY MANAGEMENT INFORMATION SYSTEM FOR THE UNIVERSITY OF TEXAS HEALTH SCIENCE CENTER AT HOUSTON

Relevance: 100.00%

Abstract:

The purpose of this research and development project was to develop a method, a design, and a prototype for gathering, managing, and presenting data about occupational injuries. State-of-the-art systems analysis and design methodologies were applied to the long-standing problem in the field of occupational safety and health of processing workplace injuries data into information for safety and health program management as well as preliminary research about accident etiologies. The top-down planning and bottom-up implementation approach was utilized to design an occupational injury management information system. A description of a managerial control system and a comprehensive system to integrate safety and health program management was provided. The project showed that current management information systems (MIS) theory and methods could be applied successfully to the problems of employee injury surveillance and control program performance evaluation. The model developed in the first section was applied at The University of Texas Health Science Center at Houston (UTHSCH). The system in current use at the UTHSCH was described and evaluated, and a prototype was developed for the UTHSCH. The prototype incorporated procedures for collecting, storing, and retrieving records of injuries and the procedures necessary to prepare reports, analyses, and graphics for management in the Health Science Center. Examples of reports, analyses, and graphics presenting UTHSCH and computer generated data were included. It was concluded that a pilot test of this MIS should be implemented and evaluated at the UTHSCH and other settings. Further research and development efforts for the total safety and health management information systems, control systems, component systems, and variable selection should be pursued. Finally, integration of the safety and health program MIS into the comprehensive or executive MIS was recommended.

Adaptive clinical trial design for targeted therapy development

Relevance: 100.00%

Abstract:

The development of targeted therapy involves many challenges. Our study will address some of the key issues involved in biomarker identification and clinical trial design. In our study, we propose two biomarker selection methods, and then apply them in two different clinical trial designs for targeted therapy development. In particular, we propose a Bayesian two-step lasso procedure for biomarker selection in the proportional hazards model in Chapter 2. In the first step of this strategy, we use the Bayesian group lasso to identify the important marker groups, wherein each group contains the main effect of a single marker and its interactions with treatments. In the second step, we zoom in to select each individual marker and the interactions between markers and treatments in order to identify prognostic or predictive markers using the Bayesian adaptive lasso. In Chapter 3, we propose a Bayesian two-stage adaptive design for targeted therapy development while implementing the variable selection method given in Chapter 2. In Chapter 4, we propose an alternate frequentist adaptive randomization strategy for situations where a large number of biomarkers need to be incorporated in the study design. We also propose a new adaptive randomization rule, which takes into account the variations associated with the point estimates of survival times. In all of our designs, we seek to identify the key markers that are either prognostic or predictive with respect to treatment. We are going to use extensive simulation to evaluate the operating characteristics of our methods.

BAYESIAN STATISTICAL METHODS IN GENE-ENVIRONMENT AND GENE-GENE INTERACTION STUDIES

Relevance: 100.00%

Abstract:

Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Due to the rapid development of genotyping and sequencing technologies, we are now able to more accurately assess causal effects of many genetic and environmental factors. Genome-wide association studies have been able to localize many causal genetic variants predisposing to certain diseases. However, these studies only explain a small portion of variations in the heritability of diseases. More advanced statistical models are urgently needed to identify and characterize some additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to increasing computational capabilities and novel statistical developments, Bayesian methods have been widely applied in genetics/genomics research and have demonstrated superiority over some standard approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods may fully exert their functionality and advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, as well as extending some existing methods for gene-environment interactions to other related areas. It includes three sections: (1) deriving the Bayesian variable selection framework for the hierarchical gene-environment and gene-gene interactions; (2) developing the Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending the applications of two Bayesian statistical methods, which were developed for gene-environment interaction studies, to other related types of studies such as adaptive borrowing of historical data.
We propose a Bayesian hierarchical mixture model framework that allows us to investigate the genetic and environmental effects, gene by gene interactions (epistasis) and gene by environment interactions in the same model. It is well known that, in many practical situations, there exists a natural hierarchical structure between the main effects and interactions in the linear model. Here we propose a model that incorporates this hierarchical structure into the Bayesian mixture model, such that the irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious and powerful models. We evaluate both of the 'strong hierarchical' and 'weak hierarchical' models, which specify that both or one of the main effects between interacting factors must be present for the interactions to be included in the model. The extensive simulation results show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and yield a powerful approach to identify the predisposing main effects and interactions in the studies with complex gene-environment and gene-gene interactions. We also compare these two models with the 'independent' model that does not impose this hierarchical constraint and observe their superior performances in most of the considered situations. The proposed models are implemented in the real data analysis of gene and environment interactions in the cases of lung cancer and cutaneous melanoma case-control studies. The Bayesian statistical models enjoy the properties of being allowed to incorporate useful prior information in the modeling process. Moreover, the Bayesian mixture model outperforms the multivariate logistic model in terms of the performances on the parameter estimation and variable selection in most cases. 
Our proposed models impose the hierarchical constraints, which further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions and successfully identifying the reported associations. This is practically appealing for studies investigating causal factors from a moderate number of candidate genetic and environmental factors along with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects have previously been developed to provide an analysis framework, by which the estimates of effects for a quantitative trait are statistically orthogonal regardless of the existence of Hardy-Weinberg Equilibrium (HWE) within loci. Ma et al. (2012) recently developed a NOIA model for gene-environment interaction studies and have shown the advantages of using the model for detecting the true main effects and interactions, compared with the usual functional model. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA statistical model and the usual functional model. The proposed Bayesian NOIA model demonstrates more power at detecting the non-null effects with higher marginal posterior probabilities. Also, we review two Bayesian statistical models (Bayesian empirical shrinkage-type estimator and Bayesian model averaging), which were developed for gene-environment interaction studies. Inspired by these Bayesian models, we develop two novel statistical methods that are able to handle related problems such as borrowing data from historical studies. The proposed methods are analogous to the methods for gene-environment interactions in that they balance statistical efficiency and bias in a unified model.
By extensive simulation studies, we compare the operating characteristics of the proposed models with the existing models including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow the historical data in a data-driven way. These novel models may have a broad range of statistical applications in both genetic/genomic and clinical studies.
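The difference between the 'strong hierarchical', 'weak hierarchical' and 'independent' models described in this abstract can be illustrated with a small sketch. It treats the hierarchy as a hard filter on which pairwise interactions may enter the model; the dissertation encodes the structure within a Bayesian mixture model instead, so the function and its arguments are hypothetical and serve only to show the rule.

```python
from itertools import combinations

def allowed_interactions(factors, active_main, rule="strong"):
    """List pairwise interactions that a hierarchy rule permits."""
    allowed = []
    for a, b in combinations(factors, 2):
        # Number of parent main effects currently in the model (0, 1 or 2)
        n_active = (a in active_main) + (b in active_main)
        if rule == "strong" and n_active == 2:
            allowed.append((a, b))  # both main effects must be present
        elif rule == "weak" and n_active >= 1:
            allowed.append((a, b))  # at least one main effect present
        elif rule == "independent":
            allowed.append((a, b))  # no hierarchical constraint
    return allowed
```

With factors `G1`, `G2`, `E` and only `G1` active, the strong rule admits no interaction, the weak rule admits `G1:G2` and `G1:E`, and the independent rule admits all three pairs.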

Identification of a biomarker panel for colorectal cancer diagnosis

Relevance: 100.00%

Abstract:

Background Malignancies arising in the large bowel cause the second largest number of deaths from cancer in the Western World. Despite progress made during the last decades, colorectal cancer remains one of the most frequent and deadly neoplasias in Western countries. Methods A genomic study of human colorectal cancer has been carried out on a total of 31 tumoral samples, corresponding to different stages of the disease, and 33 non-tumoral samples. The study was carried out by hybridisation of the tumour samples against a reference pool of non-tumoral samples using Agilent Human 1A 60-mer oligo microarrays. The results obtained were validated by qRT-PCR. In the subsequent bioinformatics analysis, gene networks were built by means of Bayesian classifiers, variable selection and bootstrap resampling. The consensus among all the induced models produced a hierarchy of dependences and, thus, of variables. Results After an exhaustive process of pre-processing to ensure data quality--missing value imputation, probe quality, data smoothing and intraclass variability filtering--the final dataset comprised a total of 8,104 probes. Next, a supervised classification approach and data analysis were carried out to obtain the most relevant genes. Two of them are directly involved in cancer progression and in particular in colorectal cancer. Finally, a supervised classifier was induced to classify new unseen samples. Conclusions We have developed a tentative model for the diagnosis of colorectal cancer based on a biomarker panel. Our results indicate that the gene profile described herein can discriminate between non-cancerous and cancerous samples with 94.45% accuracy using different supervised classifiers (AUC values ranging from 0.955 to 0.997).

Detecting reliable gene interactions by a hierarchy of Bayesian network classifiers

Relevance: 100.00%

Abstract:

The main purpose of a gene interaction network is to map the relationships of the genes that are out of sight when a genomic study is tackled. DNA microarrays allow the expression of thousands of genes to be measured at the same time. These data constitute the numeric seed for the induction of the gene networks. In this paper, we propose a new approach to build gene networks by means of Bayesian classifiers, variable selection and bootstrap resampling. The interactions induced by the Bayesian classifiers are based both on the expression levels and on the phenotype information of the supervised variable. Feature selection and bootstrap resampling add reliability and robustness to the overall process by removing false positive findings. The consensus among all the induced models produces a hierarchy of dependences and, thus, of variables. Biologists can define the depth level of the model hierarchy so the set of interactions and genes involved can vary from a sparse to a dense set. Experimental results show that these networks perform well on classification tasks. The biological validation matches previous biological findings and opens new hypotheses for future studies.

Regularized model learning in EDAs for continuous and multi-objective optimization

Relevance: 100.00%

Abstract:

Probabilistic modeling is the defining characteristic of estimation of distribution algorithms (EDAs) which determines their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. ℓ1-regularization is a type of this technique with the appealing variable selection property which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented, and analyzed from different aspects when used for optimization in a high-dimensional setting, where the population size of EDA has a logarithmic scale with respect to the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve significantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random field model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, specifically models inspired from multi-dimensional Bayesian network classifiers. It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objective-variable and objective-objective relationships.
An extensive experimental study shows the effectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely α-degree Pareto dominance, is introduced and its properties are analyzed. We show that the ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on ℓ1-regularization for multi-objective feature subset selection in classification, where six different measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small-medium dimensionality, when using two different Bayesian classifiers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.
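The variable selection property of ℓ1-regularization mentioned in this abstract comes from the fact that the penalty sets small coefficients exactly to zero. A minimal sketch of that mechanism is the soft-thresholding operator, which is the lasso solution for an orthonormal design; it is not the thesis's regularized Gaussian model estimation, and the function name is an assumption of this example.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink each entry of z toward zero by lam, zeroing small entries."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Unpenalized estimates for four variables (illustrative values)
z = np.array([3.0, -0.5, 1.0, -2.0])
sparse = soft_threshold(z, lam=1.0)
print(np.count_nonzero(sparse))  # number of variables kept in the model
```

Entries whose magnitude does not exceed `lam` are dropped entirely, which is why larger penalties produce sparser model estimates.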

Identification of a biomarker panel for colorectal cancer diagnosis

Abstract:

Background: Malignancies arising in the large bowel cause the second largest number of deaths from cancer in the Western world. Despite the progress made during the last decades, colorectal cancer remains one of the most frequent and deadly neoplasias in Western countries. Methods: A genomic study of human colorectal cancer was carried out on a total of 31 tumoral samples, corresponding to different stages of the disease, and 33 non-tumoral samples. The study was carried out by hybridisation of the tumour samples against a reference pool of non-tumoral samples using Agilent Human 1A 60-mer oligo microarrays. The results obtained were validated by qRT-PCR. In the subsequent bioinformatics analysis, gene networks were built by means of Bayesian classifiers, variable selection and bootstrap resampling. The consensus among all the induced models produced a hierarchy of dependences and, thus, of variables. Results: After an exhaustive pre-processing stage to ensure data quality (missing value imputation, probe quality, data smoothing and intraclass variability filtering), the final dataset comprised a total of 8,104 probes. Next, a supervised classification approach and data analysis were carried out to obtain the most relevant genes. Two of them are directly involved in cancer progression and in particular in colorectal cancer. Finally, a supervised classifier was induced to classify new unseen samples. Conclusions: We have developed a tentative model for the diagnosis of colorectal cancer based on a biomarker panel. Our results indicate that the gene profile described herein can discriminate between non-cancerous and cancerous samples with 94.45% accuracy using different supervised classifiers (AUC values ranging from 0.955 to 0.997).
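The idea of combining variable selection with bootstrap resampling and then taking a consensus can be sketched as a selection-frequency count. This is only an illustration under stated assumptions: the study used Bayesian classifiers, whereas this sketch swaps in a simple class-mean-difference filter as a stand-in, and the toy data and all function names are hypothetical.

```python
import random

# Hypothetical toy data: 3 features per sample; only feature 0 separates the classes.
TOY_SAMPLES = [
    [5.1, 1.0, 0.2], [4.9, 1.2, 0.1], [5.3, 0.9, 0.3], [5.0, 1.1, 0.2],  # label 1
    [1.0, 1.1, 0.2], [1.2, 0.9, 0.3], [0.9, 1.0, 0.1], [1.1, 1.2, 0.2],  # label 0
]
TOY_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def mean(values):
    return sum(values) / len(values)


def top_features(samples, labels, top_k):
    """Rank features by absolute difference of class means; return the top_k indices."""
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        pos = [s[j] for s, y in zip(samples, labels) if y == 1]
        neg = [s[j] for s, y in zip(samples, labels) if y == 0]
        scores.append(abs(mean(pos) - mean(neg)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:top_k]


def selection_frequencies(samples, labels, top_k=1, n_resamples=200, seed=0):
    """Fraction of valid bootstrap resamples in which each feature is selected."""
    rng = random.Random(seed)
    n = len(samples)
    counts = [0] * len(samples[0])
    valid = 0
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        boot_labels = [labels[i] for i in idx]
        if len(set(boot_labels)) < 2:
            continue  # skip resamples that contain a single class
        valid += 1
        for j in top_features([samples[i] for i in idx], boot_labels, top_k):
            counts[j] += 1
    return [c / valid for c in counts] if valid else [0.0] * len(counts)


if __name__ == "__main__":
    print(selection_frequencies(TOY_SAMPLES, TOY_LABELS))
```

On this toy data, feature 0 is selected in every valid resample, so its frequency is 1.0 and the others are 0.0. With real expression data the frequencies would spread out, and ordering variables by frequency is one simple way to obtain a consensus ranking.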


Test management for video equipment (Gestión de pruebas para equipos de vídeo)

Abstract:

For many years, pay television has been one of the most in-demand telecommunications services in Spain, complementing and extending the audiovisual content usually offered free of charge by analog television and, more recently, by digital terrestrial television (TDT). These video services have traditionally been offered by satellite operators, cable operators and other telecommunications operators that delivered their content over IP through a data connection (ADSL, VDSL or optical fiber). The evolution and improvement of the technology used for broadcasting content over IP means that television is now conceived as an Over The Top (OTT) service, independent of the transmission medium, which allows any agent to distribute its audiovisual content easily to all its customers anywhere in the world; only an internet connection is required. The project therefore revolves around the StormTest tool from S3Group, purchased by CENTUM Solutions (a company specialized in engineering services for communications, control and signal intelligence systems) to meet the needs of its customers, and used to carry out this project. The main objective of the project is the definition and implementation of a test bench that optimizes technical validation processes, improving execution times and concentrating the engineers' activity on higher-value tasks. Several objectives were set for this purpose. The main ones are:
• Analysis of the current situation, in which technical acceptance processes devote many working hours to repetitive, low-value tests that could be automated with tools available on the market.
• Search for and selection of a tool that meets the testing needs.
• Installation in the laboratories.
• Configuration and adaptation of the tool to specific needs and projects.
With all this, the project will deliver the following achievements:
• Reduce the execution time of test campaigns by automating a large part of them.
• Perform complex subjective and objective quality measurements that cannot be carried out by people.
• Improve and automate the systems for reporting results.

