895 results for Nonparametric regression
Abstract:
The average availability of a repairable system is the expected proportion of time that the system is operating in the interval [0, t]. The present article discusses the nonparametric estimation of the average availability when (i) the data on 'n' complete cycles of system operation are available, (ii) the data are subject to right censoring, and (iii) the process is observed up to a specified time 'T'. In each case, a nonparametric confidence interval for the average availability is also constructed. Simulations are conducted to assess the performance of the estimators.
Abstract:
This thesis is entitled "Modelling and analysis of recurrent event data with multiple causes". Survival data is a term used for describing data that measure the time to occurrence of an event. In survival studies, the time to occurrence of an event is generally referred to as lifetime. Recurrent event data are commonly encountered in longitudinal studies when individuals are followed to observe the repeated occurrences of certain events. In many practical situations, individuals under study are exposed to failure due to more than one cause, and the eventual failure can be attributed to exactly one of these causes. The proposed model was useful in real-life situations to study the effect of covariates on recurrences of certain events due to different causes. In Chapter 3, an additive hazards model for gap time distributions of recurrent event data with multiple causes was introduced. The parameter estimation and asymptotic properties were discussed. In Chapter 4, a shared frailty model for the analysis of bivariate competing risks data was presented, and the estimation procedures for the shared gamma frailty model, without covariates and with covariates, using the EM algorithm were discussed. In Chapter 6, two nonparametric estimators for the bivariate survivor function of paired recurrent event data were developed. The asymptotic properties of the estimators were studied. The proposed estimators were applied to a real-life data set. Simulation studies were carried out to assess the efficiency of the proposed estimators.
Abstract:
So far, in the bivariate setup, the analysis of lifetime (failure time) data with multiple causes of failure has been done by treating each cause of failure separately, with failures from other causes considered as independent censoring. This approach is unrealistic in many situations. For example, in the analysis of mortality data on married couples, one would be interested in comparing the hazards for the same cause of death as well as in checking whether death due to one cause is more important for the partner's risk of death from other causes. In reliability analysis, one often has systems with more than one component, and many systems, subsystems and components have more than one cause of failure. Design of high-reliability systems generally requires that the individual system components have extremely high reliability even after long periods of time. Knowledge of the failure behaviour of a component can lead to savings in its cost of production and maintenance and, in some cases, to the preservation of human life. For the purpose of improving reliability, it is necessary to identify the cause of failure down to the component level. By treating each cause of failure separately, with failures from other causes considered as independent censoring, the analysis of lifetime data would be incomplete. Motivated by this, we introduce a new approach for the analysis of bivariate competing risk data using the bivariate vector hazard rate of Johnson and Kotz (1975).
Abstract:
An improved color video super-resolution technique using kernel regression and fuzzy enhancement is presented in this paper. A high-resolution frame is computed from a set of low-resolution video frames by kernel regression using an adaptive Gaussian kernel. A fuzzy smoothing filter is proposed to enhance the regression output. The proposed technique is a low-cost software solution for resolution enhancement of color video in multimedia applications. The performance of the proposed technique is evaluated using several color videos, and it is found to be better than other techniques in producing high-quality, high-resolution color videos.
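As a rough illustration of the kernel regression step mentioned in this abstract, the following is a minimal one-dimensional Nadaraya-Watson sketch with a Gaussian kernel. It is not the paper's method: the fixed `bandwidth` (the abstract describes an adaptive kernel), the function name, and the 1-D setting are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel_regression(x_train, y_train, x_query, bandwidth=1.0):
    """Nadaraya-Watson estimate with a fixed-bandwidth Gaussian kernel.

    Assumption: a single global bandwidth is used here for simplicity;
    an adaptive scheme would vary it per sample or per query point.
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)

    # Pairwise differences: one row per query point, one column per sample.
    diffs = x_query[:, None] - x_train[None, :]
    # Gaussian weights; samples near the query point count the most.
    weights = np.exp(-0.5 * (diffs / bandwidth) ** 2)
    # Weighted average of the observed values for each query point.
    return (weights @ y_train) / weights.sum(axis=1)
```

Each estimate is a weighted average of the observed values, so a smaller bandwidth follows the data more closely and a larger one smooths more strongly.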
Abstract:
In our study we use a kernel-based technique, Support Vector Machine Regression, for predicting the melting point of drug-like compounds in terms of topological descriptors, topological charge indices, connectivity indices and 2D autocorrelations. The machine learning model was designed, trained and tested using a dataset of 100 compounds, and it was found that an SVMReg model with an RBF kernel could predict the melting point with a mean absolute error of 15.5854 and a root mean squared error of 19.7576.
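For readers unfamiliar with the two error measures quoted in this abstract, here is a minimal sketch of how mean absolute error and root mean squared error are computed. The function names and the toy numbers are illustrative assumptions and have no connection to the study's data.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average magnitude of the prediction errors."""
    errors = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(errors)))

def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared prediction error."""
    errors = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(errors ** 2)))

# Toy values (hypothetical, for illustration only).
observed = [1.0, 2.0, 3.0]
predicted = [2.0, 2.0, 5.0]
mae = mean_absolute_error(observed, predicted)
rmse = root_mean_squared_error(observed, predicted)
```

Because RMSE squares the errors before averaging, it is never smaller than MAE and penalises large individual errors more heavily.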
Abstract:
Short summary: This study was undertaken to assess the diversity of plant resources utilized by the local population in south-western Madagascar, the social, ecological and biophysical conditions that drive their uses and availability, and possible alternative strategies for their sustainable use in the region. The study region, the 'Mahafaly region', located in south-western Madagascar, is one of the country's most economically, educationally and climatically disadvantaged regions. With an arid steppe climate, agricultural production is limited by low water availability and low levels of soil nutrients and soil organic carbon. The region comprises the recently extended Tsimanampetsotsa National Park, with numerous sacred and community forests, which are threatened by slash-and-burn agriculture and overexploitation of forest resources. The present study analyzed the availability of wild yams and medicinal plants, and their importance for the livelihood of the local population in this region. An ethnobotanical survey was conducted recording the diversity, local knowledge and use of wild yams and medicinal plants utilized by the local communities in five villages in the Mahafaly region. 250 households were randomly selected, followed by semi-structured interviews on the socio-economic characteristics of the households. The data allowed us to characterize sociocultural and socioeconomic factors that determine the local use of wild yams and medicinal plants, and to identify their role in the livelihoods of local people. Species-environment relationships and the current spatial distribution of the wild yams were investigated and predicted using ordination methods and a niche-based habitat modelling approach. Species response curves along edaphic gradients allowed us to understand the species' requirements on habitat conditions.
We thus investigated various alternative methods to enhance wild yam regeneration for their local conservation and their sustainable use in the Mahafaly region. Altogether, six species of wild yams and a total of 214 medicinal plant species from 68 families and 163 genera were identified in the study region. Results of the cluster and discriminant analysis indicated a clear pattern of resource use, resulting in two groups of households characterized by differences in livestock numbers, off-farm activities, agricultural land and harvests. A generalized linear model highlighted that economic factors significantly affect the collection intensity of wild yams, while the use of medicinal plants depends to a higher degree on socio-cultural factors. The gradient analysis on the distribution of the wild yam species revealed a clear pattern for species habitats. Species models based on NPMR (Nonparametric Multiplicative Regression) indicated the importance of vegetation structure, human interventions, and soil characteristics in determining wild yam species distribution. The prediction of the current availability of wild yam resources showed that abundant wild yam resources are scarce and face high harvest intensity. Experiments on yam cultivation revealed that germination of seeds was enhanced by using pre-germination treatments before planting, and that vegetative regeneration performed better with the upper part of the tubers (corms) than with the sets of tubers. In-situ regeneration was possible for the upper parts of the wild tubers, but the success depended significantly on the type of soil. The use of manure (10-20 t ha⁻¹) increased the yield of D. alata and D. alatipes by 40%. We thus suggest the promotion of other cultivated varieties of D. alata found in regions neighbouring the Mahafaly Plateau.
Abstract:
We study the relation between support vector machines (SVMs) for regression (SVMR) and SVMs for classification (SVMC). We show that for a given SVMC solution there exists an SVMR solution which is equivalent for a certain choice of the parameters. In particular, our result is that for $\epsilon$ sufficiently close to one, the optimal hyperplane and threshold for the SVMC problem with regularization parameter $C_c$ are equal to $(1-\epsilon)^{-1}$ times the optimal hyperplane and threshold for SVMR with regularization parameter $C_r = (1-\epsilon)C_c$. A direct consequence of this result is that SVMC can be seen as a special case of SVMR.
Abstract:
Support Vector Machines Regression (SVMR) is a regression technique which has been recently introduced by V. Vapnik and his collaborators (Vapnik, 1995; Vapnik, Golowich and Smola, 1996). In SVMR the goodness of fit is measured not by the usual quadratic loss function (the mean square error), but by a different loss function called Vapnik's $\epsilon$-insensitive loss function, which is similar to the "robust" loss functions introduced by Huber (Huber, 1981). The quadratic loss function is well justified under the assumption of Gaussian additive noise. However, the noise model underlying the choice of Vapnik's loss function is less clear. In this paper the use of Vapnik's loss function is shown to be equivalent to a model of additive and Gaussian noise, where the variance and mean of the Gaussian are random variables. The probability distributions for the variance and mean will be stated explicitly. While this work is presented in the framework of SVMR, it can be extended to justify non-quadratic loss functions in any Maximum Likelihood or Maximum A Posteriori approach. It applies not only to Vapnik's loss function, but to a much broader class of loss functions.
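To make the contrast between the two loss functions concrete, the sketch below implements the standard form of the $\epsilon$-insensitive loss, max(0, |r| - epsilon) for a residual r, next to the quadratic loss. The function names and sample residuals are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def epsilon_insensitive_loss(residual, epsilon):
    """Zero inside the band |r| <= epsilon, linear in |r| outside it."""
    residual = np.asarray(residual, dtype=float)
    return np.maximum(0.0, np.abs(residual) - epsilon)

def quadratic_loss(residual):
    """Usual squared-error loss, shown for comparison."""
    return np.asarray(residual, dtype=float) ** 2

# Hypothetical residuals: one inside the band, two outside.
residuals = np.array([-0.05, 0.3, -1.0])
eps_loss = epsilon_insensitive_loss(residuals, epsilon=0.1)
sq_loss = quadratic_loss(residuals)
```

Residuals smaller than epsilon incur no penalty at all, and large residuals grow only linearly, which is the sense in which the loss resembles robust losses.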
Abstract:
This paper presents a computation of the $V_\gamma$ dimension for regression in bounded subspaces of Reproducing Kernel Hilbert Spaces (RKHS) for the Support Vector Machine (SVM) regression $\epsilon$-insensitive loss function, and general $L_p$ loss functions. Finiteness of the $V_\gamma$ dimension is shown, which also proves uniform convergence in probability for regression machines in RKHS subspaces that use the $L_\epsilon$ or general $L_p$ loss functions. This paper presents a novel proof of this result also for the case that a bias is added to the functions in the RKHS.
Abstract:
We propose a nonparametric method for estimating derivative financial asset pricing formulae using learning networks. To demonstrate feasibility, we first simulate Black-Scholes option prices and show that learning networks can recover the Black-Scholes formula from a two-year training set of daily options prices, and that the resulting network formula can be used successfully to both price and delta-hedge options out-of-sample. For comparison, we estimate models using four popular methods: ordinary least squares, radial basis functions, multilayer perceptrons, and projection pursuit. To illustrate practical relevance, we also apply our approach to S&P 500 futures options data from 1987 to 1991.
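The simulation step described in this abstract relies on the Black-Scholes formula for a European call. A minimal sketch of that formula and its delta follows, assuming no dividends and a constant rate and volatility. The function and parameter names are illustrative, and nothing here reproduces the paper's learning networks.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes_call(spot, strike, rate, vol, maturity):
    """Price and delta of a European call on a non-dividend-paying asset.

    Assumptions: continuously compounded risk-free rate, constant
    volatility, maturity in years, and maturity > 0.
    """
    d1 = (log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt(maturity))
    d2 = d1 - vol * sqrt(maturity)
    price = spot * norm_cdf(d1) - strike * exp(-rate * maturity) * norm_cdf(d2)
    delta = norm_cdf(d1)  # sensitivity of the price to the spot price
    return price, delta

# Hypothetical at-the-money example.
price, delta = black_scholes_call(spot=100.0, strike=100.0, rate=0.05, vol=0.2, maturity=1.0)
```

Prices generated this way over a grid of inputs could serve as a synthetic training set, and the delta is the hedge ratio a fitted model would need to approximate for delta-hedging.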
Abstract:
Time series regression models are especially suitable in epidemiology for evaluating short-term effects of time-varying exposures on health. The problem is that the potential for confounding in time series regression is very high. Thus, it is important that trend and seasonality are properly accounted for. Our paper reviews the statistical models commonly used in time-series regression, especially those allowing for serial correlation, which makes them potentially useful for selected epidemiological purposes. In particular, we discuss the use of time-series regression for counts using a wide range of Generalised Linear Models as well as Generalised Additive Models (GAMs). In addition, critical points in using statistical software for GAMs were recently stressed, and reanalyses of time series data on air pollution and health were performed in order to update results already published. Applications are offered through an example on the relationship between asthma emergency admissions and photochemical air pollutants.
Abstract:
It is well known that regression analyses involving compositional data need special attention because the data are not of full rank. For a regression analysis where both the dependent and independent variables are components, we propose a transformation of the components emphasizing their role as dependent and independent variables. A simple linear regression can be performed on the transformed components. The regression line can be depicted in a ternary diagram, facilitating the interpretation of the analysis in terms of components. An example with time-budgets illustrates the method and the graphical features.
Abstract:
In CoDaWork'05, we presented an application of discriminant function analysis (DFA) to 4 different compositional datasets and modelled the first canonical variable using a segmented regression model solely based on an observation about the scatter plots. In this paper, multiple linear regressions are applied to different datasets to confirm the validity of our proposed model. In addition to dating the unknown tephras by calibration as discussed previously, another method of mapping the unknown tephras into samples of the reference set or missing samples in between consecutive reference samples is proposed. The application of these methodologies is demonstrated with both simulated and real datasets. This newly proposed methodology provides an alternative, more acceptable approach for geologists, as their focus is on mapping the unknown tephra to relevant eruptive events rather than estimating the age of the unknown tephra. Key words: Tephrochronology; Segmented regression
Abstract:
Based on Rijt-Plooij and Plooij's (1992) research on the emergence of regression periods in the first two years of life, the presence of such periods in a group of 18 babies (10 boys and 8 girls, aged between 3 weeks and 14 months) from a Catalonian population was analyzed. The measurements were a questionnaire filled in by the infants' mothers, a semi-structured weekly tape-recorded interview, and observations in their homes. The procedure and the instruments used in the project follow those proposed by Rijt-Plooij and Plooij. Our results confirm the existence of regression periods in the first year of children's life. Inter-coder agreement for trained coders was 78.2% and within-coder agreement was 90.1%. The discussion comments on the possible meaning and relevance of regression periods for understanding development from a psychobiological and social framework.
Abstract:
Abstract taken from the publication