924 results for Limited dependent variable regression
Abstract:
Ordinal outcomes are frequently employed in diagnosis and clinical trials. Clinical trials of Alzheimer's disease (AD) treatments are a case in point, using the status of mild, moderate or severe disease as outcome measures. As in many other outcome-oriented studies, the disease status may be misclassified. This study estimates the extent of misclassification in an ordinal outcome such as disease status. It also estimates the extent of misclassification of a predictor variable such as genotype status. An ordinal logistic regression model is commonly used to model the relationship between disease status, the effect of treatment, and other predictive factors. A simulation study was done. First, data were created based on a set of hypothetical parameters and hypothetical rates of misclassification. Next, the maximum likelihood method was employed to generate likelihood equations accounting for misclassification. The Nelder-Mead Simplex method was used to solve for the misclassification and model parameters. Finally, this method was applied to an AD dataset to detect the amount of misclassification present. The estimates of the ordinal regression model parameters were close to the hypothetical parameters: β1 was hypothesized at 0.50 and the mean estimate was 0.488; β2 was hypothesized at 0.04 and the mean of the estimates was 0.04. Although the estimates for the rates of misclassification of X1 were not as close as those for β1 and β2, they validate this method. The X1 0-1 misclassification was hypothesized as 2.98% and the mean of the simulated estimates was 1.54% and, in the best case, the misclassification of k from high to medium was hypothesized at 4.87% and had a sample mean of 3.62%. In the AD dataset, the estimate for the odds ratio of X1 of having both copies of the APOE 4 allele changed from 1.377 to 1.418, demonstrating that the estimate of the odds ratio changed when the analysis included adjustment for misclassification.
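The misclassification-adjusted likelihood described in this abstract can be illustrated with a minimal sketch. It is not the study's implementation: it assumes a proportional-odds model with three outcome categories, handles misclassification of the outcome only (not of the predictor), and the cutpoints, the misclassification matrix and all function names are hypothetical. Only the coefficient values 0.50 and 0.04 are taken from the abstract.

```python
import math

def ordinal_probs(x, beta, cutpoints):
    """True category probabilities under a proportional-odds model.

    Assumed convention: P(Y <= j) = logistic(cutpoint_j - x.beta),
    with cutpoints in increasing order.
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    cum = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

def observed_probs(true_p, misclass):
    """Mix true probabilities through a misclassification matrix.

    misclass[t][o] = P(observed category o | true category t); rows sum to 1.
    """
    k = len(true_p)
    return [sum(true_p[t] * misclass[t][o] for t in range(k)) for o in range(k)]

def neg_log_lik(data, beta, cutpoints, misclass):
    """Negative log-likelihood of observed (possibly misclassified) outcomes."""
    total = 0.0
    for x, y_obs in data:
        p = observed_probs(ordinal_probs(x, beta, cutpoints), misclass)
        total -= math.log(p[y_obs])
    return total

# Hypothetical inputs: two covariates, three ordered categories (0, 1, 2).
beta = [0.50, 0.04]
cutpoints = [-0.5, 1.0]
misclass = [[0.95, 0.05, 0.00],
            [0.03, 0.94, 0.03],
            [0.00, 0.05, 0.95]]
data = [([1.0, 2.0], 0), ([0.0, 1.0], 2)]
print(neg_log_lik(data, beta, cutpoints, misclass))
```

A derivative-free optimizer such as Nelder-Mead (for example `scipy.optimize.minimize` with `method="Nelder-Mead"`) could then minimize `neg_log_lik` jointly over the regression and misclassification parameters, provided the misclassification rates are constrained or transformed to stay in [0, 1].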
Abstract:
It is well known that an identification problem exists in the analysis of age-period-cohort data because of the linear relationship among the three factors (date of birth + age at death = date of death). Numerous approaches to analyzing such data have been suggested, but none has been satisfactory. The purpose of this study is to provide another analytic method by extending Cox's life-table regression model with time-dependent covariates. The new approach has the following features: (1) It is based on the conditional maximum likelihood procedure using a proportional hazard function described by Cox (1972), treating the age factor as the underlying hazard to estimate the parameters for the cohort and period factors. (2) The model is flexible, so that both the cohort and period factors can be treated as dummy or continuous variables, and parameter estimates can be obtained for numerous combinations of variables, as in a regression analysis. (3) The model is applicable even when the time periods are unequally spaced. Two specific models are considered to illustrate the new approach and are applied to U.S. prostate cancer data. We find that there are significant differences between all cohorts and that there is a significant period effect for both whites and nonwhites. The underlying hazard increases exponentially with age, indicating that old people have a much higher risk than young people. A log transformation of relative risk shows that the prostate cancer risk declined in recent cohorts for both models. However, prostate cancer risk declined 5 cohorts (25 years) earlier for whites than for nonwhites under the period factor model (0 0 0 1 1 1 1). These latter results are similar to those of the previous study by Holford (1983). The new approach offers a general method to analyze age-period-cohort data without using any arbitrary constraint in the model.
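One way to write down the model outlined in features (1) to (3) is sketched below. The notation is an assumption made for illustration, not the study's own: a is age (the time scale of the baseline hazard), c is birth cohort, p is calendar period, z_c and z_p are dummy or continuous codings of cohort and period, and β and γ are their coefficient vectors.

```latex
% Age carries the unspecified baseline hazard h_0; cohort is a fixed covariate,
% and period is a time-dependent covariate because it changes as the subject ages.
\begin{align}
h(a \mid c) &= h_0(a)\,
  \exp\!\bigl(\boldsymbol{\beta}^{\top}\mathbf{z}_{c}
            + \boldsymbol{\gamma}^{\top}\mathbf{z}_{p(a)}\bigr), \\
p(a) &= c + a .
\end{align}
```

Because h_0(a) is left unspecified and drops out of the conditional (partial) likelihood, no explicit linear age term has to be estimated alongside cohort and period, which appears to be how the approach sidesteps an arbitrary identifying constraint.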
Abstract:
The problem of analyzing data with updated measurements in the time-dependent proportional hazards model arises frequently in practice. One available option is to reduce the number of intervals (or updated measurements) to be included in the Cox regression model. We empirically investigated the bias of the estimator of the time-dependent covariate while varying the failure rate, sample size, true values of the parameters and the number of intervals. We also evaluated how often a time-dependent covariate needs to be collected and assessed the effect of sample size and failure rate on the power of testing a time-dependent effect. A time-dependent proportional hazards model with two binary covariates was considered. The time axis was partitioned into k intervals. The baseline hazard was assumed to be 1 so that the failure times were exponentially distributed in the ith interval. A type II censoring model was adopted to characterize the failure rate. The factors of interest were sample size (500, 1000), type II censoring with failure rates of 0.05, 0.10, and 0.20, and three values for each of the non-time-dependent and time-dependent covariates (1/4, 1/2, 3/4). The mean of the bias of the estimator of the coefficient of the time-dependent covariate decreased as sample size and number of intervals increased, whereas the mean of the bias increased as failure rate and true values of the covariates increased. The mean of the bias of the estimator of the coefficient was smallest when all of the updated measurements were used in the model, compared with two models that used selected measurements of the time-dependent covariate. For the model that included all the measurements, the coverage rates of the estimator of the coefficient of the time-dependent covariate were in most cases 90% or more, except when the failure rate was high (0.20).
The power associated with testing a time-dependent effect was highest when all of the measurements of the time-dependent covariate were used. An example from the Systolic Hypertension in the Elderly Program Cooperative Research Group is presented.
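The simulation setup described above (baseline hazard 1, time axis split into k intervals, exponential failure times within each interval) can be sketched as follows. This is an illustrative assumption of how such failure times might be drawn, not the study's code: it uses equal interval widths, extends the last interval's hazard indefinitely, omits the type II censoring step, and all names and parameter values are hypothetical.

```python
import math
import random

def failure_time(target, hazards, width):
    """Invert a piecewise-constant cumulative hazard.

    target  : a unit-exponential draw (the cumulative hazard at failure)
    hazards : hazard in each of the k intervals; the last one is extended
              beyond the final interval
    width   : common interval width
    """
    t = 0.0
    for i, h in enumerate(hazards):
        is_last = i == len(hazards) - 1
        if is_last or target <= h * width:
            return t + target / h
        target -= h * width  # cumulative hazard used up by this interval
        t += width

def simulate_subject(beta_fixed, beta_td, x_fixed, x_td_path, width, rng):
    """Draw one failure time; x_td_path holds the binary time-dependent
    covariate value in each interval."""
    # Baseline hazard is 1, so the hazard is exp(linear predictor).
    hazards = [math.exp(beta_fixed * x_fixed + beta_td * x) for x in x_td_path]
    return failure_time(rng.expovariate(1.0), hazards, width)

rng = random.Random(0)
print(simulate_subject(0.5, 0.25, 1, [0, 0, 1, 1], 1.0, rng))
```

Drawing the cumulative hazard first and then walking through the intervals keeps the sampler exact for any number of intervals, so the same routine can serve the "all measurements" and "selected measurements" scenarios by changing only `x_td_path`.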
Abstract:
Atoll islands are subject to a variety of processes that influence their geomorphological development. Analysis of historical shoreline changes using remotely sensed images has become an efficient approach to both quantify past changes and estimate future island response. However, the detection of long-term changes in beach width is challenging mainly for two reasons: first, data availability is limited for many remote Pacific islands. Second, beach environments are highly dynamic and strongly influenced by seasonal or episodic shoreline oscillations. Consequently, remote-sensing studies on beach morphodynamics of atoll islands deal with dynamic features covered by a low sampling frequency. Here we present a study of beach dynamics for nine islands on Takú Atoll, Papua New Guinea, over a seven-decade period. A considerable chronological gap between aerial photographs and satellite images was addressed by applying a new method that reweighted positions of the beach limit by identifying "outlier" shoreline positions. On top of natural beach variability observed along the reweighted beach sections, we found that one third of the analyzed islands show a statistically significant decrease in reweighted beach width since 1943. The total loss of beach area for all islands corresponds to 44% of the initial beach area. Variable shoreline trajectories suggest that changes in beach width on Takú Atoll are dependent on local control (that is, human activity and longshore sediment transport). Our results show that remote imagery with a low sampling frequency may be sufficient to characterize prominent morphological changes in planform beach configuration of reef islands.
Abstract:
This paper addresses the question of maximizing classifier accuracy for classifying task-related mental activity from magnetoencephalography (MEG) data. We propose the use of different sources of information and introduce an automatic channel selection procedure. To determine an informative set of channels, our approach combines a variety of machine learning algorithms: feature subset selection methods, classifiers based on regularized logistic regression, information fusion, and multiobjective optimization based on probabilistic modeling of the search space. The experimental results show that our proposal improves classification accuracy compared with approaches whose classifiers use only one type of MEG information or for which the set of channels is fixed a priori.
Abstract:
Sequences of the variable heavy (VH) and κ (Vκ) domains of Ig structures were divided into 21 fragments that correspond to strands, loops, or parts of these structural units of the variable domains. Amino acid sequences of fragments (termed "words") were collected from the 1,172 human heavy and 668 human κ chains available in the Kabat database. Statistical analysis of words of 17 fragments was performed (fragments that comprise the complementarity-determining regions are not discussed in this paper). The number of different words (those with different residues in at least one position) ranged, for various fragments, from 11 to 75 in the κ chains, and from 23 to 189 in the heavy chains. The main result of this study is that very few keywords, or main patterns of words, were necessary to describe over 90% of the sequences (no more than two keywords per fragment in the κ and no more than five per fragment in the heavy chains). No identical keywords were found for different fragments of the variable domains. Keywords of aligned fragments of the VH and Vκ domains were different in all but two instances. Thus, knowing the keywords, one can determine whether any given small part of a sequence belongs to a heavy or κ chain and predict its precise localization in the sequence. In addition, by using all of the keywords obtained through analysis of the Kabat database, it was possible to describe completely the sequences of the human VH and Vκ germ-line segments.
Abstract:
To determine whether T-cell-receptor (TCR) usage by T cells recognizing a defined human tumor antigen in the context of the same HLA molecule is conserved, we analyzed the TCR diversity of autologous HLA-A2-restricted cytotoxic T-lymphocyte (CTL) clones derived from five patients with metastatic melanoma and specific for the common melanoma antigen Melan-A/MART-1. These clones were first identified among HLA-A2-restricted anti-melanoma CTL clones by their ability to specifically release tumor necrosis factor in response to HLA-A2.1+ COS-7 cells expressing this tumor antigen. A PCR with variable (V)-region gene subfamily-specific primers was performed on cDNA from each clone followed by DNA sequencing. TCRAV2S1 was the predominant alpha-chain V region, being transcribed in 6 out of 9 Melan-A/MART-1-specific CTL clones obtained from the five patients. beta-chain V-region usage was also restricted, with either TCRBV14 or TCRBV7 expressed by all but one clone. In addition, a conserved TCRAV2S1/TCRBV14 combination was expressed in four CTL clones from three patients. None of these V-region genes was found in a group of four HLA-A2-restricted CTL clones recognizing different antigens (e.g., tyrosinase) on the autologous tumor. TCR joining regions were heterogeneous, although conserved structural features were observed in the complementarity-determining region 3 sequences. These results indicate that a selective repertoire of TCR genes is used in anti-melanoma responses when the response is narrowed to major histocompatibility complex-restricted antigen-specific interactions.
Abstract:
AIM: To evaluate the prediction error in intraocular lens (IOL) power calculation for a rotationally asymmetric refractive multifocal IOL and the impact on this error of the optimization of the keratometric estimation of the corneal power and the prediction of the effective lens position (ELP). METHODS: Retrospective study including a total of 25 eyes of 13 patients (age, 50 to 83y) with previous cataract surgery with implantation of the Lentis Mplus LS-312 IOL (Oculentis GmbH, Germany). In all cases, an adjusted IOL power (PIOLadj) was calculated based on Gaussian optics using a variable keratometric index value (nkadj) for the estimation of the corneal power (Pkadj) and on a new value for ELP (ELPadj) obtained by multiple regression analysis. This PIOLadj was compared with the IOL power implanted (PIOLReal) and the value proposed by three conventional formulas (Haigis, Hoffer Q and Holladay). RESULTS: PIOLReal was not significantly different than PIOLadj and Holladay IOL power (P>0.05). In the Bland and Altman analysis, PIOLadj showed lower mean difference (-0.07 D) and limits of agreement (of 1.47 and -1.61 D) when compared to PIOLReal than the IOL power value obtained with the Holladay formula. Furthermore, ELPadj was significantly lower than ELP calculated with other conventional formulas (P<0.01) and was found to be dependent on axial length, anterior chamber depth and Pkadj. CONCLUSION: Refractive outcomes after cataract surgery with implantation of the multifocal IOL Lentis Mplus LS-312 can be optimized by minimizing the keratometric error and by estimating ELP using a mathematical expression dependent on anatomical factors.
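The Bland and Altman analysis mentioned in the results can be illustrated with a minimal sketch, assuming the conventional definition of the limits of agreement as the mean difference ± 1.96 standard deviations of the paired differences. The reported limits (1.47 and -1.61 D) are symmetric about the reported mean difference (-0.07 D) with a half-width of 1.54 D, which is consistent with that convention. The sample values and the function name below are hypothetical, not data from the study.

```python
from statistics import mean, stdev

def bland_altman(a, b):
    """Return (mean difference, lower limit, upper limit) for paired readings.

    Limits of agreement are taken as mean difference +/- 1.96 sample SD.
    """
    diffs = [x - y for x, y in zip(a, b)]
    d = mean(diffs)
    s = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return d, d - 1.96 * s, d + 1.96 * s

# Hypothetical IOL powers in diopters: calculated vs. implanted.
calculated = [21.0, 20.5, 22.0, 19.5]
implanted = [21.5, 20.0, 22.5, 19.0]
print(bland_altman(calculated, implanted))
```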
Abstract:
Gaussian processes provide natural non-parametric prior distributions over regression functions. In this paper we consider regression problems where there is noise on the output, and the variance of the noise depends on the inputs. If we assume that the noise is a smooth function of the inputs, then it is natural to model the noise variance using a second Gaussian process, in addition to the Gaussian process governing the noise-free output value. We show that prior uncertainty about the parameters controlling both processes can be handled and that the posterior distribution of the noise rate can be sampled from using Markov chain Monte Carlo methods. Our results on a synthetic data set give a posterior noise variance that well-approximates the true variance.
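The two-process model described in this abstract can be written schematically as below. The symbols (f, r, k_f, k_r, μ_0) and the log link on the noise variance are assumptions introduced here to keep the variance positive; the abstract itself only states that a second Gaussian process models the noise variance.

```latex
\begin{align}
y_i &= f(x_i) + \varepsilon_i, &
\varepsilon_i &\sim \mathcal{N}\bigl(0,\; r(x_i)\bigr), \\
f &\sim \mathcal{GP}\bigl(0,\; k_f(x, x')\bigr), &
\log r &\sim \mathcal{GP}\bigl(\mu_0,\; k_r(x, x')\bigr).
\end{align}
```

Under this reading, the Markov chain Monte Carlo step would sample the latent values of log r at the training inputs together with the hyperparameters of both kernels.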
Abstract:
In most treatments of the regression problem it is assumed that the distribution of target data can be described by a deterministic function of the inputs, together with additive Gaussian noise having constant variance. The use of maximum likelihood to train such models then corresponds to the minimization of a sum-of-squares error function. In many applications a more realistic model would allow the noise variance itself to depend on the input variables. However, the use of maximum likelihood to train such models would give highly biased results. In this paper we show how a Bayesian treatment can allow for an input-dependent variance while overcoming the bias of maximum likelihood.
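The correspondence between maximum likelihood and sum-of-squares error, and the change introduced by an input-dependent variance, can be made explicit. The notation is assumed for illustration: t_n are targets, y(x_n; w) is the regression function with parameters w, and σ is the noise standard deviation.

```latex
\begin{align}
% Constant variance: minimizing over w is minimizing the sum of squares.
-\ln p(\mathbf{t} \mid \mathbf{w}, \sigma)
  &= \frac{1}{2\sigma^{2}} \sum_{n=1}^{N} \bigl(t_n - y(\mathbf{x}_n; \mathbf{w})\bigr)^{2}
     + N \ln \sigma + \frac{N}{2} \ln 2\pi, \\
% Input-dependent variance: each residual is weighted by its own variance.
-\ln p(\mathbf{t} \mid \mathbf{w}, \sigma(\cdot))
  &= \sum_{n=1}^{N} \left[
       \frac{\bigl(t_n - y(\mathbf{x}_n; \mathbf{w})\bigr)^{2}}{2\sigma^{2}(\mathbf{x}_n)}
       + \ln \sigma(\mathbf{x}_n) \right]
     + \frac{N}{2} \ln 2\pi .
\end{align}
```

The bias is easiest to see in the constant-variance linear case with M fitted parameters, where the maximum likelihood variance estimate has expectation (N - M)/N times the true variance, because the fitted function has already absorbed part of the noise. Letting the variance vary with the input gives the model more freedom to shrink it wherever the fit is locally tight.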
Abstract:
As the population of the United States becomes more diverse and the immigrant Hispanic, limited English proficient (LEP) school-age population continues to grow, understanding and addressing the needs of these students becomes a pressing question. The purpose of this study was to investigate the effects of group counseling, by a bilingual counselor, on the self-esteem, attendance and counselor utilization of Hispanic LEP high school students. The study used a quasi-experimental design. The experimental and control groups consisted of one class from each of the four levels of English for Speakers of Other Languages (ESOL), I-IV. The counseling intervention, the independent variable, was delivered by a bilingual counselor once a week for fifteen weeks. A total of 112 immigrant Hispanic LEP students selected from the total ESOL student population participated in the study. The experimental and control groups were administered the Culture Free Self Esteem Inventory (CFSEI) Form AD as a pretest and posttest. The Background Information Questionnaire (BIQ) was utilized to gather information on counselor utilization and demographic data. Attendance data were obtained from the students' computer records. At the conclusion of the study, the differences between the experimental and control groups on the three dependent variables were compared. Statistical analyses of the data were done using SPSS statistical software. A multivariate analysis of variance (MANOVA) was utilized to determine if there were significant differences in the self-esteem scores, attendance and counselor utilization.
Correlational analysis was utilized to determine if there was a relationship between English language proficiency and self-esteem and between acculturation level and self-esteem. The study results indicate that there were no significant differences in the self-esteem scores and attendance of the subjects in the experimental group at the completion of the group counseling treatment. The difference in counselor utilization was statistically significant for the targeted population. A relationship was found between English language proficiency level and self-esteem scores for students in ESOL levels II, III and IV. No significant correlation was found between acculturation and self-esteem. Research on the dropout rates of LEP students, coupled with the results of this study, shows that students at the intermediate and advanced levels of ESOL (III and IV) exhibit more positive self-esteem and achieve higher graduation rates than those at levels I and II. LEP students at levels I and II, once they became familiar with the role and function of school counselors through group counseling, utilized their services.
Abstract:
Negative-ion mode electrospray ionization, ESI(-), with Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) was coupled to Partial Least Squares (PLS) regression and variable selection methods to estimate the total acid number (TAN) of Brazilian crude oil samples. Generally, ESI(-)-FT-ICR mass spectra present a resolving power of ca. 500,000 and a mass accuracy better than 1 ppm, producing a data matrix containing over 5700 variables per sample. These variables correspond to heteroatom-containing species detected as deprotonated molecules, [M − H]⁻ ions, which are identified primarily as naphthenic acids, phenols and carbazole analog species. The TAN values for all samples ranged from 0.06 to 3.61 mg of KOH g⁻¹. To facilitate the spectral interpretation, three methods of variable selection were studied: variable importance in the projection (VIP), interval partial least squares (iPLS) and elimination of uninformative variables (UVE). The UVE method seems to be more appropriate for selecting important variables, reducing the number of variables to 183 and producing a root mean square error of prediction of 0.32 mg of KOH g⁻¹. By reducing the size of the data, it was possible to relate the selected variables to their corresponding molecular formulas, thus identifying the main chemical species responsible for the TAN values.
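The root mean square error of prediction quoted above follows the usual definition: the square root of the mean squared difference between reference and predicted values on a held-out prediction set. A minimal sketch is given below; the TAN values in it are hypothetical and the function name is an assumption.

```python
import math

def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a held-out sample set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

# Hypothetical reference and predicted TAN values (mg of KOH per g).
reference = [0.10, 1.25, 2.40, 3.60]
predicted = [0.35, 1.00, 2.70, 3.30]
print(rmsep(reference, predicted))
```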
The qWR star HD 45166 - II. Fundamental stellar parameters and evidence of a latitude-dependent wind
Abstract:
Context. The enigmatic object HD 45166 is a qWR star in a binary system with an orbital period of 1.596 days, and presents a rich emission-line spectrum in addition to absorption lines from the companion star (B7 V). As the system inclination is very small (i = 0.77° ± 0.09°), HD 45166 is an ideal laboratory for wind-structure studies. Aims. The goal of the present paper is to determine the fundamental stellar and wind parameters of the qWR star. Methods. A radiative transfer model for the wind and photosphere of the qWR star was calculated using the non-LTE code CMFGEN. The wind asymmetry was also analyzed using a recently developed version of CMFGEN to compute the emerging spectrum in two-dimensional geometry. The temporal-variance spectrum (TVS) was calculated to study the line-profile variations. Results. Abundances and stellar and wind parameters of the qWR star were obtained. The qWR star has an effective temperature of T_eff = 50 000 ± 2000 K, a luminosity of log(L/L_⊙) = 3.75 ± 0.08, and a corresponding photospheric radius of R_phot = 1.00 R_⊙. The star is helium-rich (N(H)/N(He) = 2.0), while the CNO abundances are anomalous when compared either to solar values, to planetary nebulae, or to WR stars. The mass-loss rate is Ṁ = 2.2 × 10⁻⁷ M_⊙ yr⁻¹, and the wind terminal velocity is v_∞ = 425 km s⁻¹. The comparison between the observed line profiles and models computed under different latitude-dependent wind densities strongly suggests the presence of an oblate wind density enhancement, with a density contrast of at least 8:1 from equator to pole. If a high-velocity polar wind is present (~1200 km s⁻¹), the minimum density contrast is reduced to 4:1. Conclusions. The wind parameters determined are unusual when compared to O-type stars or to typical WR stars. While for WR stars v_∞/v_esc > 1.5, in the case of HD 45166 it is much smaller (v_∞/v_esc = 0.32).
In addition, the efficiency of momentum transfer is η = 0.74, which is at least 4 times smaller than in a typical WR star. We find evidence for the presence of a wind compression zone, since the equatorial wind density is significantly higher than the polar wind density. The TVS supports the presence of such a latitude-dependent wind and a variable absorption/scattering gas near the equator.
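The quoted photospheric radius can be cross-checked against the quoted temperature and luminosity with the Stefan-Boltzmann relation L = 4πR²σT⁴, scaled to solar values. The sketch below is only a consistency check, not part of the paper's modelling; the solar effective temperature of 5772 K and the function name are assumptions introduced here.

```python
import math

T_SUN = 5772.0  # K, assumed nominal solar effective temperature

def radius_in_solar_units(log_l, t_eff):
    """R/R_sun from L = 4*pi*R^2*sigma*T^4 with L and T in solar-scaled form.

    log_l : log10 of the luminosity in solar units
    t_eff : effective temperature in kelvin
    """
    return math.sqrt(10.0 ** log_l) * (T_SUN / t_eff) ** 2

# Values quoted in the abstract: log(L/L_sun) = 3.75, T_eff = 50 000 K.
r = radius_in_solar_units(3.75, 50000.0)
print(round(r, 2))
```

With these inputs the result is close to one solar radius, in line with the radius stated in the abstract.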
Abstract:
The soil bacterium Pseudomonas fluorescens Pf-5 produces two siderophores, a pyoverdine and enantio-pyochelin, and its proteome includes 45 TonB-dependent outer-membrane proteins, which commonly function in uptake of siderophores and other substrates from the environment. The 45 proteins share the conserved beta-barrel and plug domains of TonB-dependent proteins but only 18 of them have an N-terminal signaling domain characteristic of TonB-dependent transducers (TBDTs), which participate in cell-surface signaling systems. Phylogenetic analyses of the 18 TBDTs and 27 TonB-dependent receptors (TBDRs), which lack the N-terminal signaling domain, suggest a complex evolutionary history including horizontal transfer among different microbial lineages. Putative functions were assigned to certain TBDRs and TBDTs in clades including well-characterized orthologs from other Pseudomonas spp. A mutant of Pf-5 with deletions in pyoverdine and enantio-pyochelin biosynthesis genes was constructed and characterized for iron-limited growth and utilization of a spectrum of siderophores. The mutant could utilize as iron sources a large number of pyoverdines with diverse structures as well as ferric citrate, heme, and the siderophores ferrichrome, ferrioxamine B, enterobactin, and aerobactin. The diversity and complexity of the TBDTs and TBDRs with roles in iron uptake clearly indicate the importance of iron in the fitness and survival of Pf-5 in the environment.