819 resultados para Classification error rate
Resumo:
Frog protection has become increasingly essential due to the rapid decline of its biodiversity. Therefore, it is valuable to develop new methods for studying this biodiversity. In this paper, a novel feature extraction method is proposed based on perceptual wavelet packet decomposition for classifying frog calls in noisy environments. Pre-processing and syllable segmentation are first applied to the frog call. Then, a spectral peak track is extracted from each syllable if possible. Track duration, dominant frequency and oscillation rate are directly extracted from the track. With k-means clustering algorithm, the calculated dominant frequency of all frog species is clustered into k parts, which produce a frequency scale for wavelet packet decomposition. Based on the adaptive frequency scale, wavelet packet decomposition is applied to the frog calls. Using the wavelet packet decomposition coefficients, a new feature set named perceptual wavelet packet decomposition sub-band cepstral coefficients is extracted. Finally, a k-nearest neighbour (k-NN) classifier is used for the classification. The experiment results show that the proposed features can achieve an average classification accuracy of 97.45% which outperforms syllable features (86.87%) and Mel-frequency cepstral coefficients (MFCCs) feature (90.80%).
Resumo:
This paper presents the site classification of Bangalore Mahanagar Palike (BMP) area using geophysical data and the evaluation of spectral acceleration at ground level using probabilistic approach. Site classification has been carried out using experimental data from the shallow geophysical method of Multichannel Analysis of Surface wave (MASW). One-dimensional (1-D) MASW survey has been carried out at 58 locations and respective velocity profiles are obtained. The average shear wave velocity for 30 m depth (Vs(30)) has been calculated and is used for the site classification of the BMP area as per NEHRP (National Earthquake Hazards Reduction Program). Based on the Vs(30) values major part of the BMP area can be classified as ``site class D'', and ``site class C'. A smaller portion of the study area, in and around Lalbagh Park, is classified as ``site class B''. Further, probabilistic seismic hazard analysis has been carried out to map the seismic hazard in terms spectral acceleration (S-a) at rock and the ground level considering the site classes and six seismogenic sources identified. The mean annual rate of exceedance and cumulative probability hazard curve for S. have been generated. The quantified hazard values in terms of spectral acceleration for short period and long period are mapped for rock, site class C and D with 10% probability of exceedance in 50 years on a grid size of 0.5 km. In addition to this, the Uniform Hazard Response Spectrum (UHRS) at surface level has been developed for the 5% damping and 10% probability of exceedance in 50 years for rock, site class C and D These spectral acceleration and uniform hazard spectrums can be used to assess the design force for important structures and also to develop the design spectrum.
Resumo:
Species distribution modelling (SDM) typically analyses species’ presence together with some form of absence information. Ideally absences comprise observations or are inferred from comprehensive sampling. When such information is not available, then pseudo-absences are often generated from the background locations within the study region of interest containing the presences, or else absence is implied through the comparison of presences to the whole study region, e.g. as is the case in Maximum Entropy (MaxEnt) or Poisson point process modelling. However, the choice of which absence information to include can be both challenging and highly influential on SDM predictions (e.g. Oksanen and Minchin, 2002). In practice, the use of pseudo- or implied absences often leads to an imbalance where absences far outnumber presences. This leaves analysis highly susceptible to ‘naughty-noughts’: absences that occur beyond the envelope of the species, which can exert strong influence on the model and its predictions (Austin and Meyers, 1996). Also known as ‘excess zeros’, naughty noughts can be estimated via an overall proportion in simple hurdle or mixture models (Martin et al., 2005). However, absences, especially those that occur beyond the species envelope, can often be more diverse than presences. Here we consider an extension to excess zero models. The two-staged approach first exploits the compartmentalisation provided by classification trees (CTs) (as in O’Leary, 2008) to identify multiple sources of naughty noughts and simultaneously delineate several species envelopes. Then SDMs can be fit separately within each envelope, and for this stage, we examine both CTs (as in Falk et al., 2014) and the popular MaxEnt (Elith et al., 2006). We introduce a wider range of model performance measures to improve treatment of naughty noughts in SDM. We retain an overall measure of model performance, the area under the curve (AUC) of the Receiver-Operating Curve (ROC), but focus on its constituent measures of false negative rate (FNR) and false positive rate (FPR), and how these relate to the threshold in the predicted probability of presence that delimits predicted presence from absence. We also propose error rates more relevant to users of predictions: false omission rate (FOR), the chance that a predicted absence corresponds to (and hence wastes) an observed presence, and the false discovery rate (FDR), reflecting those predicted (or potential) presences that correspond to absence. A high FDR may be desirable since it could help target future search efforts, whereas zero or low FOR is desirable since it indicates none of the (often valuable) presences have been ignored in the SDM. For illustration, we chose Bradypus variegatus, a species that has previously been published as an exemplar species for MaxEnt, proposed by Phillips et al. (2006). We used CTs to increasingly refine the species envelope, starting with the whole study region (E0), eliminating more and more potential naughty noughts (E1–E3). When combined with an SDM fit within the species envelope, the best CT SDM had similar AUC and FPR to the best MaxEnt SDM, but otherwise performed better. The FNR and FOR were greatly reduced, suggesting that CTs handle absences better. Interestingly, MaxEnt predictions showed low discriminatory performance, with the most common predicted probability of presence being in the same range (0.00-0.20) for both true absences and presences. In summary, this example shows that SDMs can be improved by introducing an initial hurdle to identify naughty noughts and partition the envelope before applying SDMs. This improvement was barely detectable via AUC and FPR yet visible in FOR, FNR, and the comparison of predicted probability of presence distribution for pres/absence.
Resumo:
The use of near infrared (NIR) hyperspectral imaging and hyperspectral image analysis for distinguishing between hard, intermediate and soft maize kernels from inbred lines was evaluated. NIR hyperspectral images of two sets (12 and 24 kernels) of whole maize kernels were acquired using a Spectral Dimensions MatrixNIR camera with a spectral range of 960-1662 nm and a sisuChema SWIR (short wave infrared) hyperspectral pushbroom imaging system with a spectral range of 1000-2498 nm. Exploratory principal component analysis (PCA) was used on absorbance images to remove background, bad pixels and shading. On the cleaned images. PCA could be used effectively to find histological classes including glassy (hard) and floury (soft) endosperm. PCA illustrated a distinct difference between glassy and floury endosperm along principal component (PC) three on the MatrixNIR and PC two on the sisuChema with two distinguishable clusters. Subsequently partial least squares discriminant analysis (PLS-DA) was applied to build a classification model. The PLS-DA model from the MatrixNIR image (12 kernels) resulted in root mean square error of prediction (RMSEP) value of 0.18. This was repeated on the MatrixNIR image of the 24 kernels which resulted in RMSEP of 0.18. The sisuChema image yielded RMSEP value of 0.29. The reproducible results obtained with the different data sets indicate that the method proposed in this paper has a real potential for future classification uses.
Resumo:
Management of the commercial harvest of kangaroos relies on quotas set annually as a proportion of regular estimates of population size. Surveys to generate these estimates are expensive and, in the larger states, logistically difficult; a cheaper alternative is desirable. Rainfall is a disappointingly poor predictor of kangaroo rate of increase in many areas, but harvest statistics (sex ratio, carcass weight, skin size and animals shot per unit time) potentially offer cost-effective indirect monitoring of population abundance (and therefore trend) and status (i.e. under-or overharvest). Furthermore, because harvest data are collected continuously and throughout the harvested areas, they offer the promise of more intensive and more representative coverage of harvest areas than aerial surveys do. To be useful, harvest statistics would need to have a close and known relationship with either population size or harvest rate. We assessed this using longterm (11-22 years) data for three kangaroo species (Macropus rufus, M. giganteus and M. fuliginosus) and common wallaroos (M. robustus) across South Australia, New South Wales and Queensland. Regional variation in kangaroo body size, population composition, shooter efficiency and selectivity required separate analyses in different regions. Two approaches were taken. First, monthly harvest statistics were modelled as a function of a number of explanatory variables, including kangaroo density, harvest rate and rainfall. Second, density and harvest rate were modelled as a function of harvest statistics. Both approaches incorporated a correlated error structure. Many but not all regions had relationships with sufficient precision to be useful for indirect monitoring. However, there was no single relationship that could be applied across an entire state or across species. Combined with rainfall-driven population models and applied at a regional level, these relationships could be used to reduce the frequency of aerial surveys without compromising decisions about harvest management.
Resumo:
Environmental changes have put great pressure on biological systems leading to the rapid decline of biodiversity. To monitor this change and protect biodiversity, animal vocalizations have been widely explored by the aid of deploying acoustic sensors in the field. Consequently, large volumes of acoustic data are collected. However, traditional manual methods that require ecologists to physically visit sites to collect biodiversity data are both costly and time consuming. Therefore it is essential to develop new semi-automated and automated methods to identify species in automated audio recordings. In this study, a novel feature extraction method based on wavelet packet decomposition is proposed for frog call classification. After syllable segmentation, the advertisement call of each frog syllable is represented by a spectral peak track, from which track duration, dominant frequency and oscillation rate are calculated. Then, a k-means clustering algorithm is applied to the dominant frequency, and the centroids of clustering results are used to generate the frequency scale for wavelet packet decomposition (WPD). Next, a new feature set named adaptive frequency scaled wavelet packet decomposition sub-band cepstral coefficients is extracted by performing WPD on the windowed frog calls. Furthermore, the statistics of all feature vectors over each windowed signal are calculated for producing the final feature set. Finally, two well-known classifiers, a k-nearest neighbour classifier and a support vector machine classifier, are used for classification. In our experiments, we use two different datasets from Queensland, Australia (18 frog species from commercial recordings and field recordings of 8 frog species from James Cook University recordings). The weighted classification accuracy with our proposed method is 99.5% and 97.4% for 18 frog species and 8 frog species respectively, which outperforms all other comparable methods.
Resumo:
In this paper, we present the design and characterization of a vibratory yaw rate MEMS sensor that uses in-plane motion for both actuation and sensing. The design criterion for the rate sensor is based on a high sensitivity and low bandwidth. The required sensitivity of the yawrate sensor is attained by using the inplane motion in which the dominant damping mechanism is the fluid loss due to slide film damping i.e. two-three orders of magnitude less than the squeeze-film damping in other rate sensors with out-of-plane motion. The low bandwidth is achieved by matching the drive and the sense mode frequencies. Based on these factors, the yaw rate sensor is designed and finally realized using surface micromachining. The inplane motion of the sensor is experimentally characterized to determine the sense and the drive mode frequencies, and corresponding damping ratios. It is found that the experimental results match well with the numerical and the analytical models with less than 5% error in frequencies measurements. The measured quality factor of the sensor is approximately 467, which is two orders of magnitude higher than that for a similar rate sensor with out-of-plane sense direction.
Resumo:
Mutation and recombination are the fundamental processes leading to genetic variation in natural populations. This variation forms the raw material for evolution through natural selection and drift. Therefore, studying mutation rates may reveal information about evolutionary histories as well as phylogenetic interrelationships of organisms. In this thesis two molecular tools, DNA barcoding and the molecular clock were examined. In the first part, the efficiency of mutations to delineate closely related species was tested and the implications for conservation practices were assessed. The second part investigated the proposition that a constant mutation rate exists within invertebrates, in form of a metabolic-rate dependent molecular clock, which can be applied to accurately date speciation events. DNA barcoding aspires to be an efficient technique to not only distinguish between species but also reveal population-level variation solely relying on mutations found on a short stretch of a single gene. In this thesis barcoding was applied to discriminate between Hylochares populations from Russian Karelia and new Hylochares findings from the greater Helsinki region in Finland. Although barcoding failed to delineate the two reproductively isolated groups, their distinct morphological features and differing life-history traits led to their classification as two closely related, although separate species. The lack of genetic differentiation appears to be due to a recent divergence event not yet reflected in the beetles molecular make-up. Thus, the Russian Hylochares was described as a new species. The Finnish species, previously considered as locally extinct, was recognized as endangered. Even if, due to their identical genetic make-up, the populations had been regarded as conspecific, conservation strategies based on prior knowledge from Russia would not have guaranteed the survival of the Finnish beetle. Therefore, new conservation actions based on detailed studies of the biology and life-history of the Finnish Hylochares were conducted to protect this endemic rarity in Finland. The idea behind the strict molecular clock is that mutation rates are constant over evolutionary time and may thus be used to infer species divergence dates. However, one of the most recent theories argues that a strict clock does not tick per unit of time but that it has a constant substitution rate per unit of mass-specific metabolic energy. Therefore, according to this hypothesis, molecular clocks have to be recalibrated taking body size and temperature into account. This thesis tested the temperature effect on mutation rates in equally sized invertebrates. For the first dataset (family Eucnemidae, Coleoptera) the phylogenetic interrelationships and evolutionary history of the genus Arrhipis had to be inferred before the influence of temperature on substitution rates could be studied. Further, a second, larger invertebrate dataset (family Syrphidae, Diptera) was employed. Several methodological approaches, a number of genes and multiple molecular clock models revealed that there was no consistent relationship between temperature and mutation rate for the taxa under study. Thus, the body size effect, observed in vertebrates but controversial for invertebrates, rather than temperature may be the underlying driving force behind the metabolic-rate dependent molecular clock. Therefore, the metabolic-rate dependent molecular clock does not hold for the here studied invertebrate groups. This thesis emphasizes that molecular techniques relying on mutation rates have to be applied with caution. Whereas they may work satisfactorily under certain conditions for specific taxa, they may fail for others. The molecular clock as well as DNA barcoding should incorporate all the information and data available to obtain comprehensive estimations of the existing biodiversity and its evolutionary history.
Resumo:
The problem of constructing space-time (ST) block codes over a fixed, desired signal constellation is considered. In this situation, there is a tradeoff between the transmission rate as measured in constellation symbols per channel use and the transmit diversity gain achieved by the code. The transmit diversity is a measure of the rate of polynomial decay of pairwise error probability of the code with increase in the signal-to-noise ratio (SNR). In the setting of a quasi-static channel model, let n(t) denote the number of transmit antennas and T the block interval. For any n(t) <= T, a unified construction of (n(t) x T) ST codes is provided here, for a class of signal constellations that includes the familiar pulse-amplitude (PAM), quadrature-amplitude (QAM), and 2(K)-ary phase-shift-keying (PSK) modulations as special cases. The construction is optimal as measured by the rate-diversity tradeoff and can achieve any given integer point on the rate-diversity tradeoff curve. An estimate of the coding gain realized is given. Other results presented here include i) an extension of the optimal unified construction to the multiple fading block case, ii) a version of the optimal unified construction in which the underlying binary block codes are replaced by trellis codes, iii) the providing of a linear dispersion form for the underlying binary block codes, iv) a Gray-mapped version of the unified construction, and v) a generalization of construction of the S-ary case corresponding to constellations of size S-K. Items ii) and iii) are aimed at simplifying the decoding of this class of ST codes.
Resumo:
In this paper. we propose a novel method using wavelets as input to neural network self-organizing maps and support vector machine for classification of magnetic resonance (MR) images of the human brain. The proposed method classifies MR brain images as either normal or abnormal. We have tested the proposed approach using a dataset of 52 MR brain images. Good classification percentage of more than 94% was achieved using the neural network self-organizing maps (SOM) and 98% front support vector machine. We observed that the classification rate is high for a Support vector machine classifier compared to self-organizing map-based approach.
Resumo:
A posteriori error estimation and adaptive refinement technique for fracture analysis of 2-D/3-D crack problems is the state-of-the-art. The objective of the present paper is to propose a new a posteriori error estimator based on strain energy release rate (SERR) or stress intensity factor (SIF) at the crack tip region and to use this along with the stress based error estimator available in the literature for the region away from the crack tip. The proposed a posteriori error estimator is called the K-S error estimator. Further, an adaptive mesh refinement (h-) strategy which can be used with K-S error estimator has been proposed for fracture analysis of 2-D crack problems. The performance of the proposed a posteriori error estimator and the h-adaptive refinement strategy have been demonstrated by employing the 4-noded, 8-noded and 9-noded plane stress finite elements. The proposed error estimator together with the h-adaptive refinement strategy will facilitate automation of fracture analysis process to provide reliable solutions.
Resumo:
The minimum distance of linear block codes is one of the important parameter that indicates the error performance of the code. When the code rate is less than 1/2, efficient algorithms are available for finding minimum distance using the concept of information sets. When the code rate is greater than 1/2, only one information set is available and efficiency suffers. In this paper, we investigate and propose a novel algorithm to find the minimum distance of linear block codes with the code rate greater than 1/2. We propose to reverse the roles of information set and parity set to get virtually another information set to improve the efficiency. This method is 67.7 times faster than the minimum distance algorithm implemented in MAGMA Computational Algebra System for a (80, 45) linear block code.
Resumo:
For an n(t) transmit, n(r) receive antenna system (n(t) x nr system), a full-rate space time block code (STBC) transmits min(n(t), n(r)) complex symbols per channel use. In this paper, a scheme to obtain a full-rate STBC for 4 transmit antennas and any n(r), with reduced ML-decoding complexity is presented. The weight matrices of the proposed STBC are obtained from the unitary matrix representations of a Clifford Algebra. By puncturing the symbols of the STBC, full rate designs can be obtained for n(r) < 4. For any value of n(r), the proposed design offers the least ML-decoding complexity among known codes. The proposed design is comparable in error performance to the well known Perfect code for 4 transmit antennas while offering lower ML-decoding complexity. Further, when n(r) < 4, the proposed design has higher ergodic capacity than the punctured Perfect code. Simulation results which corroborate these claims are presented.
Resumo:
Convolutional network-error correcting codes (CNECCs) are known to provide error correcting capability in acyclic instantaneous networks within the network coding paradigm under small field size conditions. In this work, we investigate the performance of CNECCs under the error model of the network where the edges are assumed to be statistically independent binary symmetric channels, each with the same probability of error pe(0 <= p(e) < 0.5). We obtain bounds on the performance of such CNECCs based on a modified generating function (the transfer function) of the CNECCs. For a given network, we derive a mathematical condition on how small p(e) should be so that only single edge network-errors need to be accounted for, thus reducing the complexity of evaluating the probability of error of any CNECC. Simulations indicate that convolutional codes are required to possess different properties to achieve good performance in low p(e) and high p(e) regimes. For the low p(e) regime, convolutional codes with good distance properties show good performance. For the high p(e) regime, convolutional codes that have a good slope ( the minimum normalized cycle weight) are seen to be good. We derive a lower bound on the slope of any rate b/c convolutional code with a certain degree.
Resumo:
This paper studies the problem of constructing robust classifiers when the training is plagued with uncertainty. The problem is posed as a Chance-Constrained Program (CCP) which ensures that the uncertain data points are classified correctly with high probability. Unfortunately such a CCP turns out to be intractable. The key novelty is in employing Bernstein bounding schemes to relax the CCP as a convex second order cone program whose solution is guaranteed to satisfy the probabilistic constraint. Prior to this work, only the Chebyshev based relaxations were exploited in learning algorithms. Bernstein bounds employ richer partial information and hence can be far less conservative than Chebyshev bounds. Due to this efficient modeling of uncertainty, the resulting classifiers achieve higher classification margins and hence better generalization. Methodologies for classifying uncertain test data points and error measures for evaluating classifiers robust to uncertain data are discussed. Experimental results on synthetic and real-world datasets show that the proposed classifiers are better equipped to handle data uncertainty and outperform state-of-the-art in many cases.