481 results for Bayesian LASSO


Relevance:

60.00% 60.00%

Publisher:

Abstract:

Pós-graduação em Genética e Melhoramento Animal - FCAV

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Pós-graduação em Genética e Melhoramento Animal - FCAV

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We implemented least absolute shrinkage and selection operator (LASSO) regression to evaluate gene effects in genome-wide association studies (GWAS) of brain images, using an MRI-derived temporal lobe volume measure from 729 subjects scanned as part of the Alzheimer's Disease Neuroimaging Initiative (ADNI). Sparse groups of SNPs in individual genes were selected by LASSO, which identifies efficient sets of variants influencing the data. These SNPs were considered jointly when assessing their association with neuroimaging measures. We discovered 22 genes that passed genome-wide significance for influencing temporal lobe volume. This was a substantially greater number of significant genes than was found with standard, univariate GWAS. These top genes are all expressed in the brain and include genes previously related to brain function or neuropsychiatric disorders such as MACROD2, SORCS2, GRIN2B, MAGI2, NPAS3, CLSTN2, GABRG3, NRXN3, PRKAG2, GAS7, RBFOX1, ADARB2, CHD4, and CDH13. The top genes we identified with this method also displayed significant and widespread post hoc effects on voxelwise, tensor-based morphometry (TBM) maps of the temporal lobes. The most significantly associated gene was an autism susceptibility gene known as MACROD2. We were able to replicate the effect of the MACROD2 gene in an independent cohort of 564 young, healthy Australian adult twins and siblings scanned with MRI (mean age: 23.8±2.2 SD years). Our approach powerfully complements univariate techniques in detecting influences of genes on the living brain.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Historically, non-exercise models for predicting maximal oxygen uptake (VO2max) have been built by frequentist linear regression, using standard model selection techniques. However, there is uncertainty about the statistical structure in the model selection process. This study set out to build a non-exercise model to predict VO2max in performance-oriented athletes, accounting for model uncertainty through Bayesian Model Averaging (BMA). An additional objective was to compare the predictive performance of BMA with that of models derived from several common frequentist variable selection techniques. To that end, repeated stratified random subsampling was implemented. The data included observations of the response variable (in L·min⁻¹) as well as records of Gender, Sport, Age, Weight, Height and Body mass index (BMI) (Age = 22.1 ± 4.9 years, mean ± SD; n = 272). A classification of sports was proposed for inclusion in the model-building process: Combat, Game, Endurance 1 and Endurance 2. The BMA approach was implemented using two methods: Occam's window and Markov Chain Monte Carlo Model Composition (MC²). Discrepancies in variable selection were observed among the frequentist procedures. Both BMA methods produced very similar results. Models that included Gender and the dummy variables for Endurance 1 and Endurance 2 accumulated virtually all of the posterior model probability. Weight was the continuous predictor with the highest posterior inclusion probability (less than 0.8). Combinations of variables involving highly multicollinear predictors were discredited. Models with a substantial contribution to the BMA showed an appreciable fit (adjusted R² less than 0.8). Among the models selected by frequentist strategies, the one obtained by stepwise regression with alpha equal to 0.05 was the most supported by the data, in terms of posterior model probability. Consistent with the literature, BMA had better out-of-sample predictive performance than the models selected by frequentist techniques, as measured by the coverage of the 90 percent prediction interval. The classification of sports yielded consistent results.
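
As a rough illustration of the model-averaging idea in this abstract, the sketch below turns BIC values into approximate posterior model probabilities (assuming equal prior model probabilities), applies a simplified Occam's window that drops models far less probable than the best one, and averages the predictions of the surviving models. The BIC values, the predictions, the cutoff ratio of 20, and the function names are hypothetical; this is not the authors' implementation, and the full Occam's window procedure includes a further rule that is omitted here.

```python
import math


def bma_weights(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probability for every model; weights are
    proportional to exp(-BIC / 2).
    """
    best = min(bics)  # subtract the minimum for numerical stability
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


def occams_window(weights, ratio=20.0):
    """Keep models whose weight is within a factor `ratio` of the best one.

    Returns a dict {model index: renormalized weight}. This is only the
    first rule of Occam's window, used here as a simplification.
    """
    top = max(weights)
    kept = {i: w for i, w in enumerate(weights) if w * ratio >= top}
    total = sum(kept.values())
    return {i: w / total for i, w in kept.items()}


# Hypothetical BIC values and VO2max predictions (L/min) for three models.
bics = [100.0, 101.0, 110.0]
preds = [2.9, 3.1, 3.6]

w = occams_window(bma_weights(bics))
bma_pred = sum(w[i] * preds[i] for i in w)
print(sorted(w), round(bma_pred, 3))
```

With these made-up numbers the third model falls outside the window, so the averaged prediction lies between the predictions of the first two models.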

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We present a new iterative approach called Line Adaptation for the Singular Sources Objective (LASSO) to object or shape reconstruction based on the singular sources method (or probe method) for the reconstruction of scatterers from the far-field pattern of scattered acoustic or electromagnetic waves. The scheme is based on the construction of an indicator function given by the scattered field for incident point sources in its source point from the given far-field patterns for plane waves. The indicator function is then used to drive the contraction of a surface which surrounds the unknown scatterers. A stopping criterion for those parts of the surfaces that touch the unknown scatterers is formulated. A splitting approach for the contracting surfaces is formulated, such that scatterers consisting of several separate components can be reconstructed. Convergence of the scheme is shown, and its feasibility is demonstrated using a numerical study with several examples.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The lasso procedure is an estimator-shrinkage and variable selection method. This paper shows that there always exists an interval of tuning parameter values such that the corresponding mean squared prediction error for the lasso estimator is smaller than for the ordinary least squares estimator. For an estimator satisfying some condition such as unbiasedness, the paper defines the corresponding generalized lasso estimator. Its mean squared prediction error is shown to be smaller than that of the estimator for values of the tuning parameter in some interval. This implies that all unbiased estimators are not admissible. Simulation results for five models support the theoretical results.
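
To make the "shrinkage and variable selection" behaviour concrete, here is a minimal sketch under a simplifying assumption that the abstract does not make: an orthonormal design (X^T X = I) and the objective (1/2)·||y − Xb||² + λ·||b||₁. In that special case each lasso coefficient is the soft-thresholded OLS coefficient. The coefficient values and λ are hypothetical.

```python
def soft_threshold(z, lam):
    """Lasso estimate of one coefficient under an orthonormal design.

    Shrinks the OLS estimate z toward zero by lam and sets it exactly
    to zero when |z| <= lam.
    """
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical OLS estimates and tuning parameter.
ols = [3.0, -1.5, 0.4, -0.2]
lam = 0.5

lasso = [soft_threshold(b, lam) for b in ols]
print(lasso)  # [2.5, -1.0, 0.0, 0.0]
```

Large coefficients are shrunk by λ while small ones are set exactly to zero, which is the trade of a little bias for lower variance that underlies the comparison of prediction error described in the abstract.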

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Determining the causal relation among attributes in a domain is a key task in data mining and knowledge discovery. The Minimum Message Length (MML) principle has demonstrated its ability in discovering linear causal models from training data. To explore the ways to improve efficiency, this paper proposes a novel Markov Blanket identification algorithm based on the Lasso estimator. For each variable, this algorithm first generates a Lasso tree, which represents a pruned candidate set of possible feature sets. The Minimum Message Length principle is then employed to evaluate all those candidate feature sets, and the feature set with minimum message length is chosen as the Markov Blanket. Our experiment results show the ability of this algorithm. In addition, this algorithm can be used to prune the search space of causal discovery, and further reduce the computational cost of those score-based causal discovery algorithms.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Modern healthcare is being reshaped by the growth of Electronic Medical Records (EMR). Recently, these records have been shown to be of great value for building clinical prediction models. In EMR data, patients' diseases and hospital interventions are captured through a set of diagnosis and procedure codes. These codes are usually represented in a tree form (e.g. ICD-10 tree) and the codes within a tree branch may be highly correlated. These codes can be used as features to build a prediction model, and appropriate feature selection can inform a clinician about important risk factors for a disease. Traditional feature selection methods (e.g. Information Gain, T-test, etc.) consider each variable independently and usually end up with a long feature list. Recently, Lasso and related l1-penalty based feature selection methods have become popular due to their joint feature selection property. However, Lasso is known to select one feature at random from a group of correlated features. This prevents clinicians from arriving at a stable feature set, which is crucial for the clinical decision-making process. In this paper, we solve this problem by using a recently proposed Tree-Lasso model. Since the stability behavior of Tree-Lasso is not well understood, we study it and compare it with other feature selection methods. Using one synthetic and two real-world datasets (Cancer and Acute Myocardial Infarction), we show that Tree-Lasso based feature selection is significantly more stable than Lasso and comparable to other methods, e.g. Information Gain, ReliefF and T-test. We further show that, using different types of classifiers such as logistic regression, naive Bayes, support vector machines, decision trees and Random Forest, the classification performance of Tree-Lasso is comparable to Lasso and better than other methods. Our results have implications for identifying stable risk factors for many healthcare problems and can therefore potentially assist clinical decision making for accurate medical prognosis.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

One of the greatest challenges in demography today is to obtain consistent mortality estimates, especially for small areas. The lack of this information hinders public health actions and impairs the quality of the classification of deaths, raising concern among demographers and epidemiologists about obtaining reliable mortality statistics for the country. In this context, the objective of this work is to estimate adjustment factors for deaths in order to correct adult mortality, by state, meso-region and age group, in the Northeast region in 2010. The proposal is based on two lines of observation, one demographic and one statistical, and considers two levels of coverage within the states of the Northeast region: meso-regions as larger areas and counties as small areas. The methodological principle is to use the General Equation and Balancing demographic method, or General Growth Balance, to correct the observed deaths in the larger areas (meso-regions) of the states, since these are less prone to violations of the method's assumptions. Next, the empirical Bayesian estimator is applied, taking the deaths corrected by the demographic method as the sum of deaths in the meso-regions, and the observed deaths in the small areas (counties) as the reference observation for the smaller areas. This combination produces a smoothing effect on the degree of coverage of deaths, due to the association with the empirical Bayesian estimator, and makes it possible to evaluate the degree of coverage of deaths by age group at the county, meso-region and state levels, with the advantage of estimating adjustment factors at the desired level of aggregation. The results grouped by state point to a significant improvement in the degree of coverage of deaths from combining the methods, with values above 80%: Alagoas (0.88), Bahia (0.90), Ceará (0.90), Maranhão (0.84), Paraíba (0.88), Pernambuco (0.93), Piauí (0.85), Rio Grande do Norte (0.89) and Sergipe (0.92). Advances in the control of registry information in the health system, together with improvements in socioeconomic conditions and urbanization of the counties over the last decade, have led to better-quality registration of deaths in small areas.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)