950 results for akaike information criterion
Abstract:
Background: Recently, with access to low-toxicity biological and targeted therapies, evidence is emerging of a long-term survival subpopulation among cancer patients. We studied an unselected population with advanced lung cancer to look for evidence of multimodality in the survival distribution and to estimate the proportion of long-term survivors. Methods: We used survival data of 4944 patients with non-small-cell lung cancer (NSCLC), stages IIIb-IV at diagnosis, registered in the National Cancer Registry of Cuba (NCRC) between January 1998 and December 2006. We fitted a one-component survival model and two-component mixture models to identify short- and long-term survivors. The Bayesian information criterion was used for model selection. Results: For all of the selected parametric distributions, the two-component model gave the best fit. The short-term survival population (median survival of almost 4 months) comprised 64% of patients. The long-term survival population comprised 35% of patients and showed a median survival of around 12 months. No short-term survivor was still alive at month 24, whereas 10% of the long-term survivors died after that point. Conclusions: There is a subgroup showing long-term evolution among patients with advanced lung cancer. As survival rates continue to improve with the new generation of therapies, prognostic models that account for short- and long-term survival subpopulations should be considered in clinical research.
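The comparison described above, a one-component survival model against a two-component mixture scored by BIC, can be sketched as follows. This is not the paper's actual model (the authors fit several parametric families to NSCLC registry data); it is a minimal illustration on synthetic exponential survival times, with all parameter values invented.

```python
# Minimal sketch: one-component vs two-component survival model via BIC.
# All data are synthetic; the mixture is fit with a short EM loop.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
# Synthetic cohort: ~65% short-term survivors, ~35% long-term survivors.
z = rng.random(n) < 0.65
t = np.where(z, rng.exponential(4.0, n), rng.exponential(17.0, n))  # months

# One-component exponential: closed-form MLE, rate = 1/mean.
lam1 = 1.0 / t.mean()
ll1 = np.sum(np.log(lam1) - lam1 * t)
bic1 = 1 * np.log(n) - 2 * ll1

# Two-component exponential mixture fitted by EM.
pi = 0.5
lam_a = 1.0 / np.quantile(t, 0.25)
lam_b = 1.0 / np.quantile(t, 0.75)
for _ in range(200):
    fa = pi * lam_a * np.exp(-lam_a * t)
    fb = (1 - pi) * lam_b * np.exp(-lam_b * t)
    w = fa / (fa + fb)                   # E-step: responsibilities
    pi = w.mean()                        # M-step: weighted MLEs
    lam_a = w.sum() / (w * t).sum()
    lam_b = (1 - w).sum() / ((1 - w) * t).sum()
ll2 = np.sum(np.log(pi * lam_a * np.exp(-lam_a * t)
                    + (1 - pi) * lam_b * np.exp(-lam_b * t)))
bic2 = 3 * np.log(n) - 2 * ll2           # 3 free parameters: pi, lam_a, lam_b

print(bic1, bic2)  # the model with the lower BIC is preferred
```

With well-separated subpopulations, the heavier BIC penalty on the mixture is outweighed by its much higher likelihood, mirroring the abstract's finding that the two-component model wins.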
Abstract:
In response to declining biomass of Northeast Pacific groundfish in the late 1990s, and to improve the scientific basis for management of the fishery, the Northwest Fisheries Science Center standardized and enhanced its annual bottom trawl survey in 2003. The survey was expanded to include the entire area along the U.S. west coast at depths of 55–1280 m. Coast-wide biomass and species richness significantly decreased during the first eight years (2003–10) of this fishery-independent survey. We observed an overall tendency toward declining biomass for 62 dominant taxa combined (fishery target and nontarget species) and for four of seven subgroups (cartilaginous fishes, flatfishes, shelf rockfishes, and other shelf species), despite increasing or variable biomass trends in individual species. These decreases occurred during a period of reduced groundfish catch along the shelf and upper slope relative to historical rates. We used information from multiple stock assessments to aggregate species into three groups: 1) species with strong recruitment in 1999, 2) species without strong recruitment in 1999, and 3) species with unknown recruitment. For each group, we evaluated whether declining biomass was primarily related to depletion (using year as a proxy) or to environmental factors (i.e., variation in the Pacific Decadal Oscillation). According to Akaike's information criterion, changes in aggregate biomass for species with strong recruitment were more closely related to year, whereas those for species without strong recruitment were more closely related to climate. The significant decline in biomass for species without strong recruitment indicates that factors other than depletion of the exceptional 1999 year class may be responsible for the observed decrease in biomass along the U.S. west coast.
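The AIC comparison described in the abstract, a regression on year (a depletion proxy) against a regression on a climate index, can be illustrated with ordinary least squares. The data below are entirely synthetic stand-ins, not the survey's biomass series or the actual Pacific Decadal Oscillation.

```python
# Hedged sketch: rank two competing single-predictor explanations of a
# biomass trend by AIC. Synthetic data only.
import numpy as np

def ols_aic(y, x):
    """AIC of a Gaussian simple linear regression: n*ln(RSS/n) + 2k,
    where k counts intercept, slope, and the noise variance."""
    X1 = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    rss = np.sum((y - X1 @ beta) ** 2)
    k = X1.shape[1] + 1
    return len(y) * np.log(rss / len(y)) + 2 * k

rng = np.random.default_rng(1)
year = np.arange(2003, 2011, dtype=float)        # survey years 2003-10
pdo = rng.normal(size=year.size)                 # hypothetical climate index
# Synthetic aggregate biomass driven by a steady decline over years.
biomass = 100.0 - 3.0 * (year - 2003) + rng.normal(0, 1.0, year.size)

aic_year = ols_aic(biomass, year)
aic_pdo = ols_aic(biomass, pdo)
print(aic_year, aic_pdo)  # the lower AIC identifies the better explanation
```

For this depletion-driven synthetic series the year model should dominate, corresponding to the abstract's strong-recruitment group; a climate-driven series would flip the comparison.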
Abstract:
The common octopus, Octopus vulgaris, is an r-selected mollusk found off the coast of North Carolina that interests commercial fishermen because of its market value and the cost-effectiveness of the unbaited pots that can catch it. This study sought to: 1) determine the gear and environmental factors that influenced catch rates of octopi, and 2) evaluate the feasibility of small-scale commercial operations for this species. Pots were fished from August 2010 through September 2011, set in strings over hard and sandy bottom in waters 18 to 30 m deep in Onslow Bay, N.C. Three pot types were fished in each string: octopus pots with and without lids, and conch pots. Proportional catch was modeled as a function of gear design and environmental factors (location, soak time, bottom type, and sea surface water temperature) using binomially distributed generalized linear models (GLMs); the parsimony of each GLM was assessed with Akaike's Information Criterion (AIC). A total of 229 octopi were caught throughout the study. Pots with lids, pots without lids, and conch pots caught an average of 0.15, 0.17, and 0.11 octopi, respectively, with high variability in catch rates for each pot type. The GLM that best fit the data described proportional catch as a function of sea surface temperature, soak time, and station; the greatest proportional catches occurred over short soak times, at the warmest temperatures, and at less well-known reef areas. Due to operating expenses (fuel, crew time, and maintenance), low catch rates of octopi, and high gear loss, a directed fishery for this species is not economically feasible at the catch rates found in this study. The model fitting to determine the factors most influential on catch rates should help fishermen identify seasons and gear soak times that are likely to maximize catch rates.
Potting for octopi may be commercially practical as a supplemental activity when targeting demersal fish species that are found in similar habitats and depth ranges in coastal waters off North Carolina.
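The binomial-GLM-plus-AIC workflow the study describes can be sketched with a hand-rolled logistic regression (Newton's method) and the standard AIC formula. Covariate names, effect sizes, and data are all invented for illustration, not the Onslow Bay pot records.

```python
# Hedged sketch: binomially distributed GLM (logistic regression) fitted by
# Newton's method, with candidate models ranked by AIC. Synthetic data.
import numpy as np

def logistic_aic(y, X):
    """Fit a logistic regression and return its AIC = 2k - 2*loglik."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(X1.shape[1])
    for _ in range(50):                      # Newton-Raphson iterations
        p = 1.0 / (1.0 + np.exp(-X1 @ beta))
        W = p * (1 - p)
        beta = beta + np.linalg.solve(X1.T @ (W[:, None] * X1),
                                      X1.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X1 @ beta))
    ll = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    return 2 * X1.shape[1] - 2 * ll

rng = np.random.default_rng(2)
n = 400
sst = rng.normal(25, 3, n)       # hypothetical sea surface temperature (C)
soak = rng.exponential(5, n)     # hypothetical soak time (days)
logit = -6 + 0.25 * sst - 0.20 * soak      # invented "true" effects
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)

aic_full = logistic_aic(y, np.column_stack([sst, soak]))
aic_sst = logistic_aic(y, sst[:, None])
print(aic_full, aic_sst)  # lower AIC = better parsimony/fit tradeoff
```

Because the synthetic catch probability depends on both covariates, the two-covariate model should achieve the lower AIC, matching the abstract's selection of a multi-factor model.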
Abstract:
We present a multispectral photometric stereo method for capturing the geometry of deforming surfaces. A novel photometric calibration technique allows calibration of scenes containing multiple piecewise-constant chromaticities. The method estimates per-pixel photometric properties, then uses a RANSAC-based approach to estimate the dominant chromaticities in the scene. A likelihood term is developed linking surface normal, image intensity, and photometric properties, which allows estimation of the number of chromaticities present in a scene to be framed as a model-selection problem. The Bayesian information criterion is applied to automatically estimate the number of chromaticities present during calibration. A two-camera stereo system provides low-resolution geometry, allowing the likelihood term to be used to segment new images into regions of constant chromaticity. This segmentation is carried out in a Markov random field framework and allows the correct photometric properties to be used at each pixel to estimate a dense normal map. Results are shown on several challenging real-world sequences, demonstrating state-of-the-art results using only two cameras and three light sources. Quantitative evaluation is provided against synthetic ground-truth data. © 2011 IEEE.
Abstract:
Fuzzy-neural-network-based inference systems are well-known universal approximators which can produce linguistically interpretable results. Unfortunately, their dimensionality can be extremely high due to an excessive number of inputs and rules, which raises the need for overall structure optimization. In the literature, various input selection methods are available, but they are applied separately from rule selection, often without considering the fuzzy structure. This paper proposes an integrated framework to optimize the number of inputs and the number of rules simultaneously. First, a method is developed to select the most significant rules, along with a refinement stage to remove unnecessary correlations. An improved information criterion is then proposed to find an appropriate number of inputs and rules to include in the model, leading to a balanced tradeoff between interpretability and accuracy. Simulation results confirm the efficacy of the proposed method.
Abstract:
We consider the local order estimation of nonlinear autoregressive systems with exogenous inputs (NARX), which may have different local dimensions at different points. By minimizing the kernel-based local information criterion introduced in this paper, strongly consistent estimates of the local orders of the NARX system at the points of interest are obtained. A modification of the criterion and a simple procedure for searching for its minimum are also discussed. The theoretical results derived here are tested on simulation examples.
Abstract:
We present a novel method for the light-curve characterization of Pan-STARRS1 Medium Deep Survey (PS1 MDS) extragalactic sources into stochastic variables (SVs) and burst-like (BL) transients, using multi-band image-differencing time-series data. We select detections in difference images associated with galaxy hosts using a star/galaxy catalog extracted from the deep PS1 MDS stacked images, and adopt a maximum a posteriori formulation to model their difference-flux time series in the four Pan-STARRS1 photometric bands gP1, rP1, iP1, and zP1. We use three deterministic light-curve models to fit BL transients: a Gaussian, a Gamma distribution, and an analytic supernova (SN) model; and one stochastic light-curve model, the Ornstein-Uhlenbeck process, to fit the variability that is characteristic of active galactic nuclei (AGNs). We assess the quality of fit of the models band-wise and source-wise using their estimated leave-one-out cross-validation likelihoods and corrected Akaike information criteria (AICc). We then apply a K-means clustering algorithm to these statistics to determine the source classification in each band. The final source classification is derived as a combination of the individual filter classifications, resulting in two measures of classification quality, taken from the averages across the photometric filters of (1) the classifications determined from the closest K-means cluster centers, and (2) the squared distances from the cluster centers in the K-means clustering spaces. For a verification set of AGNs and SNe, we show that SV and BL sources occupy distinct regions in the plane defined by these measures. We use our clustering method to characterize 4361 extragalactic sources detected in image differences in the first 2.5 yr of the PS1 MDS into 1529 BL and 2262 SV, with a purity of 95.00% for AGNs and 90.97% for SNe, based on our verification sets.
We combine our light-curve classifications with their nuclear or off-nuclear host galaxy offsets, to define a robust photometric sample of 1233 AGNs and 812 SNe. With these two samples, we characterize their variability and host galaxy properties, and identify simple photometric priors that would enable their real-time identification in future wide-field synoptic surveys.
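The corrected Akaike information criterion (AICc) used above adds a small-sample penalty to the usual AIC. A minimal sketch, assuming Gaussian residuals and using polynomial fits to a synthetic burst-like light curve in place of the paper's SN/AGN models:

```python
# Hedged sketch of AICc-based model comparison on a synthetic light curve.
import numpy as np

def aicc_gaussian(y, yhat, k):
    """AICc for a least-squares fit with k mean parameters; the noise
    variance counts as one more. AICc = AIC + 2k(k+1)/(n-k-1)."""
    n = len(y)
    rss = np.sum((y - yhat) ** 2)
    k = k + 1                                    # include sigma^2
    aic = n * np.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)   # small-sample correction

rng = np.random.default_rng(3)
t = np.linspace(-3, 3, 40)
flux = np.exp(-0.5 * t**2) + rng.normal(0, 0.05, t.size)  # Gaussian burst

scores = {}
for deg in (1, 2, 4, 6):
    coef = np.polyfit(t, flux, deg)
    scores[deg] = aicc_gaussian(flux, np.polyval(coef, t), deg + 1)
print(scores)  # the lowest AICc marks the preferred model complexity
```

A straight line cannot follow the burst, while high orders pay the AICc penalty, so an intermediate order wins; the same logic picks between the paper's deterministic and stochastic light-curve models per band.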
Abstract:
Background: EpHA2 is a 130 kD transmembrane glycoprotein belonging to the ephrin receptor subfamily and involved in angiogenesis/tumour neovascularisation. A high EpHA2 mRNA level has recently been implicated in cetuximab resistance. Previously, we found high EpHA2 levels in a panel of invasive colorectal cancer (CRC) cells, which were associated with high levels of the stem-cell marker CD44. Our aim was to investigate the prognostic value of EpHA2 and subsequently correlate expression levels with known clinico-pathological variables in early-stage CRC. Methods: Tissue samples from 509 CRC patients were analysed. EpHA2 expression was measured using IHC. Kaplan-Meier graphs were used. Univariate and multivariate analyses employed the Cox proportional hazards ratio (HR) method. A backward selection method (Akaike's information criterion) was used to determine a refined multivariate model. Results: EpHA2 was highly expressed in CRC adenocarcinoma compared to matched normal colon tissue. In support of our preclinical invasive models, a strong correlation was found between EpHA2 expression and CD44 and Lgr5 staining (p<0.001). In addition, high EpHA2 expression significantly correlated with vascular invasion (p=0.03). The HR for OS for stage II/III patients with high EpHA2 expression was 1.69 (95%CI: 1.164-2.439; p=0.003). When stage II/III was broken down into individual stages, there was a significant correlation between high EpHA2 expression and poor 5-year OS in stage II patients (HR: 2.18; 95%CI: 1.28-3.71; p=0.005). The HR in the stage III group showed a trend toward statistical significance (HR: 1.48; 95%CI: 0.87-2.51; p=0.05). In both univariate and multivariate analyses of stage II patients, high EpHA2 expression was the only significant factor and was retained in the final multivariate model. Higher levels of EpHA2 were noted in our RAS- and BRAF-mutant CRC cells, and silencing EpHA2 resulted in significant decreases in migration/invasion in parental and invasive CRC sublines.
Correlation between KRAS/NRAS/BRAF mutational status and EpHA2 expression in clinical samples is ongoing. Conclusions: Taken together, our study is the first to indicate that EpHA2 expression is a predictor of poor clinical outcome and a potential novel target in early-stage CRC.
Abstract:
OBJECTIVES: The purpose of this study was to evaluate the association between inflammation and heart failure (HF) risk in older adults. BACKGROUND: Inflammation is associated with HF risk factors and also directly affects myocardial function. METHODS: The association of baseline serum concentrations of interleukin (IL)-6, tumor necrosis factor-alpha, and C-reactive protein (CRP) with incident HF was assessed with Cox models among 2,610 older persons without prevalent HF enrolled in the Health ABC (Health, Aging, and Body Composition) study (age 73.6 +/- 2.9 years; 48.3% men; 59.6% white). RESULTS: During follow-up (median 9.4 years), HF developed in 311 (11.9%) participants. In models controlling for clinical characteristics, ankle-arm index, and incident coronary heart disease, doubling of IL-6, tumor necrosis factor-alpha, and CRP concentrations was associated with 29% (95% confidence interval: 13% to 47%; p < 0.001), 46% (95% confidence interval: 17% to 84%; p = 0.001), and 9% (95% confidence interval: -1% to 24%; p = 0.087) increases in HF risk, respectively. In models including all 3 markers, IL-6 and tumor necrosis factor-alpha, but not CRP, remained significant. These associations were similar across sex and race and persisted in models accounting for death as a competing event. Post-HF ejection fraction was available in 239 (76.8%) cases; inflammatory markers had a stronger association with HF with preserved ejection fraction. Repeat IL-6 and CRP determinations at 1-year follow-up did not provide incremental information. Addition of IL-6 to the clinical Health ABC HF model improved model discrimination (C index from 0.717 to 0.734; p = 0.001) and fit (decreased Bayes information criterion by 17.8; p < 0.001). CONCLUSIONS: Inflammatory markers are associated with HF risk among older adults and may improve HF risk stratification.
Abstract:
This thesis concerns the Bayesian analysis of functional data in a hydrological context. The main objective is to model streamflow data parsimoniously while adequately reproducing their statistical characteristics. Functional data analysis leads us to treat streamflow time series as functions to be modeled with a nonparametric method. First, the functions are made more homogeneous by synchronizing them. Then, given a sample of homogeneous curves, we model their statistical characteristics using Bayesian regression splines within a fairly general probabilistic framework. More specifically, we study a family of continuous distributions, which includes those of the exponential family, from which the observations may arise. Moreover, in order to have a flexible nonparametric modeling tool, we treat the interior knots, which define the elements of the regression-spline basis, as random quantities. We then use reversible-jump MCMC to explore the posterior distribution of the interior knots. To simplify this procedure within our general modeling framework, we consider approximations of the marginal distribution of the observations, namely an approximation based on the Schwarz information criterion and another based on the Laplace approximation. In addition to modeling the central tendency of a sample of curves, we also propose a methodology for modeling the central tendency and the dispersion of these curves simultaneously, within our general probabilistic framework. Finally, since we study a variety of statistical distributions at the observation level, we put forward an approach for determining the most suitable distributions for a given sample of curves.
The simulations were implemented in Java.
Abstract:
In survival analysis, frailty is often used to model heterogeneity between individuals or correlation within clusters. Typically, frailty is taken to be a continuous random effect, yielding a continuous mixture distribution for survival times. A Bayesian analysis of a correlated frailty model is discussed in the context of inverse Gaussian frailty. An MCMC approach is adopted, and the deviance information criterion is used to compare models. As an illustration of the approach, a bivariate data set of corneal graft survival times is analysed. (C) 2006 Elsevier B.V. All rights reserved.
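The deviance information criterion used above is computed from posterior draws as DIC = D_bar + pD, where D_bar is the posterior mean deviance and pD = D_bar - D(theta_bar) is the effective number of parameters. A minimal sketch, drawing directly from a known conjugate posterior (a normal mean with unit variance) instead of running a frailty-model MCMC:

```python
# Hedged sketch: DIC from posterior samples of a one-parameter normal model.
import numpy as np

rng = np.random.default_rng(5)
y = rng.normal(1.0, 1.0, 50)                 # data with known unit variance
n = y.size

# Posterior of the mean under a flat prior: Normal(ybar, 1/n).
theta = rng.normal(y.mean(), 1.0 / np.sqrt(n), 4000)  # stand-in "MCMC" draws

def deviance(mu):
    """-2 * log-likelihood of the data at mean mu (unit variance)."""
    return np.sum((y - mu) ** 2) + n * np.log(2 * np.pi)

d_samples = np.array([deviance(m) for m in theta])
d_bar = d_samples.mean()                     # posterior mean deviance
p_d = d_bar - deviance(theta.mean())         # effective number of parameters
dic = d_bar + p_d
print(dic, p_d)                              # p_d should be close to 1
```

For this one-parameter model pD lands near 1, which is the standard sanity check; in a frailty model pD also absorbs the partially pooled random effects, which is why DIC is convenient there.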
Abstract:
Estimation of a population size by means of capture-recapture techniques is an important problem occurring in many areas of the life and social sciences. We consider the frequencies-of-frequencies situation, where a count variable is used to summarize how often a unit has been identified in the target population of interest. The distribution of this count variable is zero-truncated, since zero identifications do not occur in the sample. As an application we consider the surveillance of scrapie in Great Britain. In this case study, holdings with scrapie that are not identified (zero counts) do not enter the surveillance database. The count variable of interest is the number of scrapie cases per holding. A common model for count distributions is the Poisson distribution and, to adjust for potential heterogeneity, a discrete mixture of Poisson distributions is used. Mixtures of Poissons usually provide an excellent fit, as will be demonstrated in the application of interest. However, as has recently been demonstrated, mixtures also suffer from the so-called boundary problem, resulting in overestimation of population size. It is suggested here to select the mixture model on the basis of the Bayesian information criterion. This strategy is further refined by employing a bagging procedure leading to a series of estimates of population size. Using the median of this series, highly influential size estimates are avoided. In limited simulation studies, it is shown that the procedure leads to estimates with remarkably small bias.
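The core idea, estimating the unseen zero class from a zero-truncated count model and then stabilizing the estimate with a bagging median, can be sketched with a single zero-truncated Poisson in place of the paper's BIC-selected Poisson mixture. Data and the true population size are synthetic.

```python
# Hedged sketch: population-size estimation from zero-truncated counts,
# with a bagging median over bootstrap resamples.
import numpy as np

def ztp_lambda(counts, iters=200):
    """Zero-truncated Poisson MLE via the fixed point
    lambda = mean * (1 - exp(-lambda))."""
    m = counts.mean()
    lam = m
    for _ in range(iters):
        lam = m * (1.0 - np.exp(-lam))
    return lam

def pop_size(counts):
    """Horvitz-Thompson-style estimate: N_hat = n / P(count > 0)."""
    lam = ztp_lambda(counts)
    return len(counts) / (1.0 - np.exp(-lam))

rng = np.random.default_rng(6)
N = 500                                   # true number of holdings (unknown)
full = rng.poisson(1.2, N)                # cases per holding
obs = full[full > 0]                      # zero counts never reach the database

single = pop_size(obs)
bagged = np.median([pop_size(rng.choice(obs, obs.size))  # resample w/ replacement
                    for _ in range(200)])
print(single, bagged)  # both should land near the true N of 500
```

Taking the median across resamples damps the influence of unstable fits, which is the role the bagging step plays for the boundary-prone mixture estimates in the abstract.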
Abstract:
The calculation of interval forecasts for highly persistent autoregressive (AR) time series based on the bootstrap is considered. Three methods are considered for countering the small-sample bias of least-squares estimation for processes with roots close to the unit circle: a bootstrap bias-corrected OLS estimator; the use of the Roy–Fuller estimator in place of OLS; and the use of the Andrews–Chen estimator in place of OLS. All three methods of bias correction yield superior results to the bootstrap without bias correction. Of the three correction methods, the bootstrap prediction intervals based on the Roy–Fuller estimator are generally superior to the other two. The small-sample performance of bootstrap prediction intervals based on the Roy–Fuller estimator is investigated when the order of the AR model is unknown and has to be determined using an information criterion.
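The first of the three approaches, a bootstrap bias-corrected OLS estimator feeding a bootstrap prediction interval, can be sketched for an AR(1). This is an illustrative simplification: the Roy–Fuller and Andrews–Chen estimators the abstract favors are not implemented here, and all tuning choices (series length, persistence, replication counts) are invented.

```python
# Hedged sketch: bootstrap bias correction of the OLS AR(1) slope, then a
# residual-bootstrap one-step-ahead prediction interval. Synthetic data.
import numpy as np

rng = np.random.default_rng(7)

def fit_ar1(y):
    """OLS for y_t = c + phi*y_{t-1} + e_t; returns c, phi, centered resids."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    (c, phi), *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    resid = y[1:] - c - phi * y[:-1]
    return c, phi, resid - resid.mean()

# Simulate a highly persistent AR(1): OLS phi is biased downward here.
n, phi_true = 60, 0.95
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi_true * y[t - 1] + rng.normal()

c, phi, resid = fit_ar1(y)

# Bootstrap bias correction: phi_bc = phi - (mean(phi*) - phi).
boot_phis = []
for _ in range(500):
    yb = np.zeros(n)
    e = rng.choice(resid, n)
    for t in range(1, n):
        yb[t] = c + phi * yb[t - 1] + e[t]
    boot_phis.append(fit_ar1(yb)[1])
phi_bc = min(0.999, 2 * phi - np.mean(boot_phis))  # cap below a unit root

# One-step-ahead 95% interval using the corrected slope and resampled shocks.
paths = c + phi_bc * y[-1] + rng.choice(resid, 2000)
lo, hi = np.percentile(paths, [2.5, 97.5])
print(phi, phi_bc, (lo, hi))
```

Because the small-sample OLS bias is toward zero, the corrected slope is pushed back toward the true persistence, widening intervals in the direction the uncorrected bootstrap understates.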
Abstract:
Various studies have indicated a relationship between enteric methane (CH4) production and milk fatty acid (FA) profiles of dairy cattle. However, the number of studies investigating such a relationship is limited and the direct relationships reported are mainly obtained by variation in CH4 production and milk FA concentration induced by dietary lipid supplements. The aim of this study was to perform a meta-analysis to quantify relationships between CH4 yield (per unit of feed and unit of milk) and milk FA profile in dairy cattle and to develop equations to predict CH4 yield based on milk FA profile of cows fed a wide variety of diets. Data from 8 experiments encompassing 30 different dietary treatments and 146 observations were included. Yield of CH4 measured in these experiments was 21.5 ± 2.46 g/kg of dry matter intake (DMI) and 13.9 ± 2.30 g/kg of fat- and protein-corrected milk (FPCM). Correlation coefficients were chosen as effect size of the relationship between CH4 yield and individual milk FA concentration (g/100 g of FA). Average true correlation coefficients were estimated by a random-effects model. Milk FA concentrations of C6:0, C8:0, C10:0, C16:0, and C16:0-iso were significantly or tended to be positively related to CH4 yield per unit of feed. Concentrations of trans-6+7+8+9 C18:1, trans-10+11 C18:1, cis-11 C18:1, cis-12 C18:1, cis-13 C18:1, trans-16+cis-14 C18:1, and cis-9,12 C18:2 in milk fat were significantly or tended to be negatively related to CH4 yield per unit of feed. Milk FA concentrations of C10:0, C12:0, C14:0-iso, C14:0, cis-9 C14:1, C15:0, and C16:0 were significantly or tended to be positively related to CH4 yield per unit of milk. Concentrations of C4:0, C18:0, trans-10+11 C18:1, cis-9 C18:1, cis-11 C18:1, and cis-9,12 C18:2 in milk fat were significantly or tended to be negatively related to CH4 yield per unit of milk.
Mixed model multiple regression and a stepwise selection procedure of milk FA based on the Bayesian information criterion to predict CH4 yield with milk FA as input (g/100 g of FA) resulted in the following prediction equations: CH4 (g/kg of DMI) = 23.39 + 9.74 × C16:0-iso – 1.06 × trans-10+11 C18:1 – 1.75 × cis-9,12 C18:2 (R2 = 0.54), and CH4 (g/kg of FPCM) = 21.13 – 1.38 × C4:0 + 8.53 × C16:0-iso – 0.22 × cis-9 C18:1 – 0.59 × trans-10+11 C18:1 (R2 = 0.47). This indicated that milk FA profile has a moderate potential for predicting CH4 yield per unit of feed and a slightly lower potential for predicting CH4 yield per unit of milk. Key words: methane, milk fatty acid profile, meta-analysis, dairy cattle
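The two BIC-selected prediction equations reported above can be transcribed directly as functions. The coefficients come from the abstract; the function and argument names are my own, and the example input values are illustrative, not measured fatty acid profiles.

```python
# The abstract's prediction equations, transcribed as plain functions.
# Inputs are milk fatty acid concentrations in g/100 g of FA.

def ch4_per_kg_dmi(c16_0_iso, t10_11_c18_1, c9_12_c18_2):
    """CH4 (g/kg of DMI) = 23.39 + 9.74*C16:0-iso
    - 1.06*trans-10+11 C18:1 - 1.75*cis-9,12 C18:2 (R^2 = 0.54)."""
    return 23.39 + 9.74 * c16_0_iso - 1.06 * t10_11_c18_1 - 1.75 * c9_12_c18_2

def ch4_per_kg_fpcm(c4_0, c16_0_iso, c9_c18_1, t10_11_c18_1):
    """CH4 (g/kg of FPCM) = 21.13 - 1.38*C4:0 + 8.53*C16:0-iso
    - 0.22*cis-9 C18:1 - 0.59*trans-10+11 C18:1 (R^2 = 0.47)."""
    return (21.13 - 1.38 * c4_0 + 8.53 * c16_0_iso
            - 0.22 * c9_c18_1 - 0.59 * t10_11_c18_1)

# Illustrative (not measured) fatty acid values:
print(ch4_per_kg_dmi(0.25, 1.0, 1.2))
print(ch4_per_kg_fpcm(3.5, 0.25, 20.0, 1.0))
```

The moderate R-squared values quoted in the abstract (0.54 and 0.47) are a reminder that these equations give rough screening predictions, not precise per-cow methane measurements.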