946 resultados para statistical classification
Resumo:
Frogs have received increasing attention due to their effectiveness for indicating the environment change. Therefore, it is important to monitor and assess frogs. With the development of sensor techniques, large volumes of audio data (including frog calls) have been collected and need to be analysed. After transforming the audio data into its spectrogram representation using short-time Fourier transform, the visual inspection of this representation motivates us to use image processing techniques for analysing audio data. Applying acoustic event detection (AED) method to spectrograms, acoustic events are firstly detected from which ridges are extracted. Three feature sets, Mel-frequency cepstral coefficients (MFCCs), AED feature set and ridge feature set, are then used for frog call classification with a support vector machine classifier. Fifteen frog species widely spread in Queensland, Australia, are selected to evaluate the proposed method. The experimental results show that ridge feature set can achieve an average classification accuracy of 74.73% which outperforms the MFCCs (38.99%) and AED feature set (67.78%).
Resumo:
Over past few decades, frog species have been experiencing dramatic decline around the world. The reason for this decline includes habitat loss, invasive species, climate change and so on. To better know the status of frog species, classifying frogs has become increasingly important. In this study, acoustic features are investigated for multi-level classification of Australian frogs: family, genus and species, including three families, eleven genera and eighty five species which are collected from Queensland, Australia. For each frog species, six instances are selected from which ten acoustic features are calculated. Then, the multicollinearity between ten features are studied for selecting non-correlated features for subsequent analysis. A decision tree (DT) classifier is used to visually and explicitly determine which acoustic features are relatively important for classifying family, which for genus, and which for species. Finally, a weighted support vector machines (SVMs) classifier is used for the multi- level classification with three most important acoustic features respectively. Our experiment results indicate that using different acoustic feature sets can successfully classify frogs at different levels and the average classification accuracy can be up to 85.6%, 86.1% and 56.2% for family, genus and species respectively.
Resumo:
The main objective of statistical analysis of experi- mental investigations is to make predictions on the basis of mathematical equations so as the number of experiments. Abrasive jet machining (AJM) is an unconventional and novel machining process wherein microabrasive particles are propelled at high veloc- ities on to a workpiece. The resulting erosion can be used for cutting, etching, cleaning, deburring, drilling and polishing. In the study completed by the authors, statistical design of experiments was successfully employed to predict the rate of material removal by AJM. This paper discusses the details of such an approach and the findings.
Resumo:
A complete list of homogeneous operators in the Cowen-Douglas class B-n(D) is given. This classification is obtained from an explicit realization of all the homogeneous Hermitian holomorphic vector bundles on the unit disc under the action of the universal covering group of the bi-holomorphic automorphism group of the unit disc.
Resumo:
This paper proposes new metrics and a performance-assessment framework for vision-based weed and fruit detection and classification algorithms. In order to compare algorithms, and make a decision on which one to use fora particular application, it is necessary to take into account that the performance obtained in a series of tests is subject to uncertainty. Such characterisation of uncertainty seems not to be captured by the performance metrics currently reported in the literature. Therefore, we pose the problem as a general problem of scientific inference, which arises out of incomplete information, and propose as a metric of performance the(posterior) predictive probabilities that the algorithms will provide a correct outcome for target and background detection. We detail the framework through which these predicted probabilities can be obtained, which is Bayesian in nature. As an illustration example, we apply the framework to the assessment of performance of four algorithms that could potentially be used in the detection of capsicums (peppers).
Resumo:
The export of sediments from coastal catchments can have detrimental impacts on estuaries and near shore reef ecosystems such as the Great Barrier Reef. Catchment management approaches aimed at reducing sediment loads require monitoring to evaluate their effectiveness in reducing loads over time. However, load estimation is not a trivial task due to the complex behaviour of constituents in natural streams, the variability of water flows and often a limited amount of data. Regression is commonly used for load estimation and provides a fundamental tool for trend estimation by standardising the other time specific covariates such as flow. This study investigates whether load estimates and resultant power to detect trends can be enhanced by (i) modelling the error structure so that temporal correlation can be better quantified, (ii) making use of predictive variables, and (iii) by identifying an efficient and feasible sampling strategy that may be used to reduce sampling error. To achieve this, we propose a new regression model that includes an innovative compounding errors model structure and uses two additional predictive variables (average discounted flow and turbidity). By combining this modelling approach with a new, regularly optimised, sampling strategy, which adds uniformity to the event sampling strategy, the predictive power was increased to 90%. Using the enhanced regression model proposed here, it was possible to detect a trend of 20% over 20 years. This result is in stark contrast to previous conclusions presented in the literature. (C) 2014 Elsevier B.V. All rights reserved.
Resumo:
We consider the development of statistical models for prediction of constituent concentration of riverine pollutants, which is a key step in load estimation from frequent flow rate data and less frequently collected concentration data. We consider how to capture the impacts of past flow patterns via the average discounted flow (ADF) which discounts the past flux based on the time lapsed - more recent fluxes are given more weight. However, the effectiveness of ADF depends critically on the choice of the discount factor which reflects the unknown environmental cumulating process of the concentration compounds. We propose to choose the discount factor by maximizing the adjusted R-2 values or the Nash-Sutcliffe model efficiency coefficient. The R2 values are also adjusted to take account of the number of parameters in the model fit. The resulting optimal discount factor can be interpreted as a measure of constituent exhaustion rate during flood events. To evaluate the performance of the proposed regression estimators, we examine two different sampling scenarios by resampling fortnightly and opportunistically from two real daily datasets, which come from two United States Geological Survey (USGS) gaging stations located in Des Plaines River and Illinois River basin. The generalized rating-curve approach produces biased estimates of the total sediment loads by -30% to 83%, whereas the new approaches produce relatively much lower biases, ranging from -24% to 35%. This substantial improvement in the estimates of the total load is due to the fact that predictability of concentration is greatly improved by the additional predictors.
Resumo:
Power calculation and sample size determination are critical in designing environmental monitoring programs. The traditional approach based on comparing the mean values may become statistically inappropriate and even invalid when substantial proportions of the response values are below the detection limits or censored because strong distributional assumptions have to be made on the censored observations when implementing the traditional procedures. In this paper, we propose a quantile methodology that is robust to outliers and can also handle data with a substantial proportion of below-detection-limit observations without the need of imputing the censored values. As a demonstration, we applied the methods to a nutrient monitoring project, which is a part of the Perth Long-Term Ocean Outlet Monitoring Program. In this example, the sample size required by our quantile methodology is, in fact, smaller than that by the traditional t-test, illustrating the merit of our method.
Resumo:
The charge at which adsorption of orgamc compounds attains a maximum ( \sigma MAX M) at an electrochenucal interface is analysed using several multi-state models in a hierarchical manner The analysis is based on statistical mechamcal results for the following models (A) two-state site parity, (B) two-state muhl-slte, and (C) three-state site parity The coulombic interactions due to permanent and reduced dipole effects (using mean field approximation), electrostatic field effects and specific substrate interactions have been taken into account. The simplest model in the hierarchy (two-state site parity) yields the exphcit dependence of ( \sigma MAX M) on the permanent dipole moment, polarizability of the solvent and the adsorbate, lattice spacing, effective coordination number, etc Other models in the baerarchy bring to hght the influence of the solvent structure and the role of substrate interactions, etc As a result of this approach, the "composition" of oM.x m terms of the fundamental molecular constants becomes clear. With a view to use these molecular results to maxamum advantage, the derived results for ( \sigma MAX M) have been converted into those involving experimentally observable parameters lake Co, C 1, E N, etc Wherever possible, some of the earlier phenomenologlcal relations reported for ( \sigma MAX M), notably by Parsons, Damaskm and Frumkln, and Trasattl, are shown to have a certain molecular basis, vlz a simple two-state sate panty model.As a corollary to the hxerarcbacal modelling, \sigma MAX M and the potential corresponding to at (Emax) are shown to be constants independent of 0max or Corg for all models The lmphcatlon of our analysis f o r OmMa x with respect to that predicted by the generalized surface layer equation (which postulates Om~ and Ema x varlaUon with 0) is discussed in detail Finally we discuss an passing o M. and the electrosorptlon valency an this context.
Resumo:
Ninety-two strong-motion earthquake records from the California region, U.S.A., have been statistically studied using principal component analysis in terms of twelve important standardized strong-motion characteristics. The first two principal components account for about 57 per cent of the total variance. Based on these two components the earthquake records are classified into nine groups in a two-dimensional principal component plane. Also a unidimensional engineering rating scale is proposed. The procedure can be used as an objective approach for classifying and rating future earthquakes.
Resumo:
Many statistical forecast systems are available to interested users. In order to be useful for decision-making, these systems must be based on evidence of underlying mechanisms. Once causal connections between the mechanism and their statistical manifestation have been firmly established, the forecasts must also provide some quantitative evidence of `quality’. However, the quality of statistical climate forecast systems (forecast quality) is an ill-defined and frequently misunderstood property. Often, providers and users of such forecast systems are unclear about what ‘quality’ entails and how to measure it, leading to confusion and misinformation. Here we present a generic framework to quantify aspects of forecast quality using an inferential approach to calculate nominal significance levels (p-values) that can be obtained either by directly applying non-parametric statistical tests such as Kruskal-Wallis (KW) or Kolmogorov-Smirnov (KS) or by using Monte-Carlo methods (in the case of forecast skill scores). Once converted to p-values, these forecast quality measures provide a means to objectively evaluate and compare temporal and spatial patterns of forecast quality across datasets and forecast systems. Our analysis demonstrates the importance of providing p-values rather than adopting some arbitrarily chosen significance levels such as p < 0.05 or p < 0.01, which is still common practice. This is illustrated by applying non-parametric tests (such as KW and KS) and skill scoring methods (LEPS and RPSS) to the 5-phase Southern Oscillation Index classification system using historical rainfall data from Australia, The Republic of South Africa and India. The selection of quality measures is solely based on their common use and does not constitute endorsement. We found that non-parametric statistical tests can be adequate proxies for skill measures such as LEPS or RPSS. The framework can be implemented anywhere, regardless of dataset, forecast system or quality measure. Eventually such inferential evidence should be complimented by descriptive statistical methods in order to fully assist in operational risk management.
Resumo:
A review was carried out of the radiographs of twenty-five infants with birth weights under 1000 G, who survived for more than twenty-eight days; eighteen of these had enough suitable films for a survey of the progressive bone changes which occur in these infants, including estimation of humeral cortical cross-sectional area. The incidence of the changes has been assessed and a typical progression of radiographic appearances has been shown, with a suggested system of staging. All infants showed some loss of bone mineral, with frank changes of rickets occurring in forty-four percent. Aetiological factors are mainly concerned with the difficulty of supplying and ensuring absorption of sufficient bone mineral (calcium and phosphate) and vitamin D. Liver immaturity may be another factor. Disease states additional to prematurity accentuate the problem. Rib fractures occurring around 80–90 days post-nataEy commonly draw attention to the bone disorder and are probably the major clinical factor of importance; there is a high incidence of associated lung disease of uncertain pathology. Attention is drawn to possible confusion with other bone disorders in the post-natal period.
Resumo:
A pressed-plate Fe electrode for alkalines storage batteries, designed using a statistical method (fractional factorial technique), is described. Parameters such as the configuration of the base grid, electrode compaction temperature and pressure, binder composition, mixing time, etc. have been optimised using this method. The optimised electrodes have a capacity of 300 plus /minus 5 mA h/g of active material (mixture of Fe and magnetite) at 7 h rate to a cut-off voltage of 8.86V vs. Hg/HgO, OH exp 17 ref.
Resumo:
The statistical minimum risk pattern recognition problem, when the classification costs are random variables of unknown statistics, is considered. Using medical diagnosis as a possible application, the problem of learning the optimal decision scheme is studied for a two-class twoaction case, as a first step. This reduces to the problem of learning the optimum threshold (for taking appropriate action) on the a posteriori probability of one class. A recursive procedure for updating an estimate of the threshold is proposed. The estimation procedure does not require the knowledge of actual class labels of the sample patterns in the design set. The adaptive scheme of using the present threshold estimate for taking action on the next sample is shown to converge, in probability, to the optimum. The results of a computer simulation study of three learning schemes demonstrate the theoretically predictable salient features of the adaptive scheme.