948 resultados para Non-parametric methods


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents an analysis of motor vehicle insurance claims relating to vehicle damage and to associated medical expenses. We use univariate severity distributions estimated with parametric and non-parametric methods. The methods are implemented using the statistical package R. Parametric analysis is limited to estimation of normal and lognormal distributions for each of the two claim types. The nonparametric analysis presented involves kernel density estimation. We illustrate the benefits of applying transformations to data prior to employing kernel based methods. We use a log-transformation and an optimal transformation amongst a class of transformations that produces symmetry in the data. The central aim of this paper is to provide educators with material that can be used in the classroom to teach statistical estimation methods, goodness of fit analysis and importantly statistical computing in the context of insurance and risk management. To this end, we have included in the Appendix of this paper all the R code that has been used in the analysis so that readers, both students and educators, can fully explore the techniques described

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Land cover classification is a key research field in remote sensing and land change science as thematic maps derived from remotely sensed data have become the basis for analyzing many socio-ecological issues. However, land cover classification remains a difficult task and it is especially challenging in heterogeneous tropical landscapes where nonetheless such maps are of great importance. The present study aims to establish an efficient classification approach to accurately map all broad land cover classes in a large, heterogeneous tropical area of Bolivia, as a basis for further studies (e.g., land cover-land use change). Specifically, we compare the performance of parametric (maximum likelihood), non-parametric (k-nearest neighbour and four different support vector machines - SVM), and hybrid classifiers, using both hard and soft (fuzzy) accuracy assessments. In addition, we test whether the inclusion of a textural index (homogeneity) in the classifications improves their performance. We classified Landsat imagery for two dates corresponding to dry and wet seasons and found that non-parametric, and particularly SVM classifiers, outperformed both parametric and hybrid classifiers. We also found that the use of the homogeneity index along with reflectance bands significantly increased the overall accuracy of all the classifications, but particularly of SVM algorithms. We observed that improvements in producer’s and user’s accuracies through the inclusion of the homogeneity index were different depending on land cover classes. Earlygrowth/degraded forests, pastures, grasslands and savanna were the classes most improved, especially with the SVM radial basis function and SVM sigmoid classifiers, though with both classifiers all land cover classes were mapped with producer’s and user’s accuracies of around 90%. Our approach seems very well suited to accurately map land cover in tropical regions, thus having the potential to contribute to conservation initiatives, climate change mitigation schemes such as REDD+, and rural development policies.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The work presented here is part of a larger study to identify novel technologies and biomarkers for early Alzheimer disease (AD) detection and it focuses on evaluating the suitability of a new approach for early AD diagnosis by non-invasive methods. The purpose is to examine in a pilot study the potential of applying intelligent algorithms to speech features obtained from suspected patients in order to contribute to the improvement of diagnosis of AD and its degree of severity. In this sense, Artificial Neural Networks (ANN) have been used for the automatic classification of the two classes (AD and control subjects). Two human issues have been analyzed for feature selection: Spontaneous Speech and Emotional Response. Not only linear features but also non-linear ones, such as Fractal Dimension, have been explored. The approach is non invasive, low cost and without any side effects. Obtained experimental results were very satisfactory and promising for early diagnosis and classification of AD patients.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

AbstractLiterature has unveiled that a paper has not been published yet on using non-parametric stability statistics (NPSSs) for evaluating genotypic stability in dough properties of wheat. Accordingly, the effects of genotype (G), environment (E) and GE interaction (GEI) on alveograph parameters, i.e. dough baking strength (W) and its tenacity (P)/extensibility (L), of 18 wheat (T. aestivum L.) genotypes were studied under irrigated field conditions in an 8-year trial (2006-2014) in central Turkey. Furthermore, genotypic stability for W and P/L was determined using 8 NPSSs viz. RM-Rank mean, RSD-Rank’s standard deviation, RS-Rank Sum, TOP-Ranking, Si(1), Si(2), Si(3) and Si(6) rank statistics. The ANOVA revealed that W and P/L were primarily controlled by E, although G and GEI also had significant effects. Among the 8 NPSSs, only RM, RS and TOP statistics were suitable for detecting the genotypes with high stable and bread making quality (e.g. G1 and G17). In conclusion, using RM, RS and TOP statistics is advisable to select for dough quality in wheat under multi-environment trials (METs).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Study on variable stars is an important topic of modern astrophysics. After the invention of powerful telescopes and high resolving powered CCD’s, the variable star data is accumulating in the order of peta-bytes. The huge amount of data need lot of automated methods as well as human experts. This thesis is devoted to the data analysis on variable star’s astronomical time series data and hence belong to the inter-disciplinary topic, Astrostatistics. For an observer on earth, stars that have a change in apparent brightness over time are called variable stars. The variation in brightness may be regular (periodic), quasi periodic (semi-periodic) or irregular manner (aperiodic) and are caused by various reasons. In some cases, the variation is due to some internal thermo-nuclear processes, which are generally known as intrinsic vari- ables and in some other cases, it is due to some external processes, like eclipse or rotation, which are known as extrinsic variables. Intrinsic variables can be further grouped into pulsating variables, eruptive variables and flare stars. Extrinsic variables are grouped into eclipsing binary stars and chromospheri- cal stars. Pulsating variables can again classified into Cepheid, RR Lyrae, RV Tauri, Delta Scuti, Mira etc. The eruptive or cataclysmic variables are novae, supernovae, etc., which rarely occurs and are not periodic phenomena. Most of the other variations are periodic in nature. Variable stars can be observed through many ways such as photometry, spectrophotometry and spectroscopy. The sequence of photometric observa- xiv tions on variable stars produces time series data, which contains time, magni- tude and error. The plot between variable star’s apparent magnitude and time are known as light curve. If the time series data is folded on a period, the plot between apparent magnitude and phase is known as phased light curve. The unique shape of phased light curve is a characteristic of each type of variable star. One way to identify the type of variable star and to classify them is by visually looking at the phased light curve by an expert. For last several years, automated algorithms are used to classify a group of variable stars, with the help of computers. Research on variable stars can be divided into different stages like observa- tion, data reduction, data analysis, modeling and classification. The modeling on variable stars helps to determine the short-term and long-term behaviour and to construct theoretical models (for eg:- Wilson-Devinney model for eclips- ing binaries) and to derive stellar properties like mass, radius, luminosity, tem- perature, internal and external structure, chemical composition and evolution. The classification requires the determination of the basic parameters like pe- riod, amplitude and phase and also some other derived parameters. Out of these, period is the most important parameter since the wrong periods can lead to sparse light curves and misleading information. Time series analysis is a method of applying mathematical and statistical tests to data, to quantify the variation, understand the nature of time-varying phenomena, to gain physical understanding of the system and to predict future behavior of the system. Astronomical time series usually suffer from unevenly spaced time instants, varying error conditions and possibility of big gaps. This is due to daily varying daylight and the weather conditions for ground based observations and observations from space may suffer from the impact of cosmic ray particles. Many large scale astronomical surveys such as MACHO, OGLE, EROS, xv ROTSE, PLANET, Hipparcos, MISAO, NSVS, ASAS, Pan-STARRS, Ke- pler,ESA, Gaia, LSST, CRTS provide variable star’s time series data, even though their primary intention is not variable star observation. Center for Astrostatistics, Pennsylvania State University is established to help the astro- nomical community with the aid of statistical tools for harvesting and analysing archival data. Most of these surveys releases the data to the public for further analysis. There exist many period search algorithms through astronomical time se- ries analysis, which can be classified into parametric (assume some underlying distribution for data) and non-parametric (do not assume any statistical model like Gaussian etc.,) methods. Many of the parametric methods are based on variations of discrete Fourier transforms like Generalised Lomb-Scargle peri- odogram (GLSP) by Zechmeister(2009), Significant Spectrum (SigSpec) by Reegen(2007) etc. Non-parametric methods include Phase Dispersion Minimi- sation (PDM) by Stellingwerf(1978) and Cubic spline method by Akerlof(1994) etc. Even though most of the methods can be brought under automation, any of the method stated above could not fully recover the true periods. The wrong detection of period can be due to several reasons such as power leakage to other frequencies which is due to finite total interval, finite sampling interval and finite amount of data. Another problem is aliasing, which is due to the influence of regular sampling. Also spurious periods appear due to long gaps and power flow to harmonic frequencies is an inherent problem of Fourier methods. Hence obtaining the exact period of variable star from it’s time series data is still a difficult problem, in case of huge databases, when subjected to automation. As Matthew Templeton, AAVSO, states “Variable star data analysis is not always straightforward; large-scale, automated analysis design is non-trivial”. Derekas et al. 2007, Deb et.al. 2010 states “The processing of xvi huge amount of data in these databases is quite challenging, even when looking at seemingly small issues such as period determination and classification”. It will be beneficial for the variable star astronomical community, if basic parameters, such as period, amplitude and phase are obtained more accurately, when huge time series databases are subjected to automation. In the present thesis work, the theories of four popular period search methods are studied, the strength and weakness of these methods are evaluated by applying it on two survey databases and finally a modified form of cubic spline method is intro- duced to confirm the exact period of variable star. For the classification of new variable stars discovered and entering them in the “General Catalogue of Vari- able Stars” or other databases like “Variable Star Index“, the characteristics of the variability has to be quantified in term of variable star parameters.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The effect of temperature on the degradation of blackcurrant anthocyanins in a model juice system was determined over a temperature range of 4–140 °C. The thermal degradation of anthocyanins followed pseudo first-order kinetics. From 4–100 °C an isothermal method was used to determine the kinetic parameters. In order to mimic the temperature profile in retort systems, a non-isothermal method was applied to determine the kinetic parameters in the model juice over the temperature range 110–140 °C. The results from both isothermal and non-isothermal methods fit well together, indicating that the non-isothermal procedure is a reliable mathematical method to determine the kinetics of anthocyanin degradation. The reaction rate constant (k) increased from 0.16 (±0.01) × 10−3 to 9.954 (±0.004) h−1 at 4 and 140 °C, respectively. The temperature dependence of the rate of anthocyanin degradation was modelled by an extension of the Arrhenius equation, which showed a linear increase in the activation energy with temperature.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper deals with the estimation and testing of conditional duration models by looking at the density and baseline hazard rate functions. More precisely, we foeus on the distance between the parametric density (or hazard rate) function implied by the duration process and its non-parametric estimate. Asymptotic justification is derived using the functional delta method for fixed and gamma kernels, whereas finite sample properties are investigated through Monte Carlo simulations. Finally, we show the practical usefulness of such testing procedures by carrying out an empirical assessment of whether autoregressive conditional duration models are appropriate to oIs for modelling price durations of stocks traded at the New York Stock Exchange.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper gives a first step toward a methodology to quantify the influences of regulation on short-run earnings dynamics. It also provides evidence on the patterns of wage adjustment adopted during the recent high inflationary experience in Brazil.The large variety of official wage indexation rules adopted in Brazil during the recent years combined with the availability of monthly surveys on labor markets makes the Brazilian case a good laboratory to test how regulation affects earnings dynamics. In particular, the combination of large sample sizes with the possibility of following the same worker through short periods of time allows to estimate the cross-sectional distribution of longitudinal statistics based on observed earnings (e.g., monthly and annual rates of change).The empirical strategy adopted here is to compare the distributions of longitudinal statistics extracted from actual earnings data with simulations generated from minimum adjustment requirements imposed by the Brazilian Wage Law. The analysis provides statistics on how binding were wage regulation schemes. The visual analysis of the distribution of wage adjustments proves useful to highlight stylized facts that may guide future empirical work.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Documenting the presence and abundance of the neotropical mammals is the first step for understanding their population ecology, behavior and genetic dynamics in designing conservation plans. The combination of field research with molecular genetics techniques are new tools that provide valuable biological information avoiding the disturbance in the ecosystems, trying to minimize the human impact in the process to gather biological information. The objective of this paper is to review the available non invasive sampling techniques that have been used in Neotropical mammal studies to apply to determine the presence and abundance, population structure, sex ratio, taxonomic diagnostic using mitochondrial markers, and assessing genetic variability using nuclear markers. There are a wide range of non invasive sampling techniques used to determine the species identification that inhabit an area such as searching for tracks, feces, and carcasses. Other useful equipment is the camera traps that can generate an image bank that can be valuable to assess species presence and abundance by morphology. With recent advances in molecular biology, it is now possible to use the trace amounts of DNA in feces and amplify it to analyze the species diversity in an area, and the genetic variability at intraspecific level. This is particularly helpful in cases of sympatric and cryptic species in which morphology failed to diagnose the taxonomic status of several species of brocket deer of the genus Mazama.