15 results for Classification error rate
in University of Queensland eSpace - Australia
Abstract:
A quantum circuit implementing 5-qubit quantum-error correction on a linear-nearest-neighbor architecture is described. The canonical decomposition is used to construct fast and simple gates that incorporate the necessary swap operations allowing the circuit to achieve the same depth as the current least depth circuit. Simulations of the circuit's performance when subjected to discrete and continuous errors are presented. The relationship between the error rate of a physical qubit and that of a logical qubit is investigated with emphasis on determining the concatenated error correction threshold.
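As a rough illustration of what a concatenated error correction threshold means, the sketch below uses the textbook scaling for a code that corrects one error per block: after k levels of concatenation the logical error rate is p_th * (p / p_th) ** (2 ** k). The threshold value `P_TH` is a placeholder and not a result from this paper, and the function name is hypothetical.

```python
def logical_error_rate(p: float, p_th: float, levels: int) -> float:
    """Logical error rate after `levels` of concatenation.

    Assumes the textbook model p_k = p_th * (p / p_th) ** (2 ** k),
    which is an idealisation and not the simulation used in the paper.
    """
    return p_th * (p / p_th) ** (2 ** levels)


P_TH = 1e-3  # hypothetical threshold, chosen only for illustration

for p in (5e-4, 2e-3):  # one physical rate below threshold, one above
    rates = [logical_error_rate(p, P_TH, k) for k in range(4)]
    print(f"p = {p:g}: " + ", ".join(f"{r:.3g}" for r in rates))
```

Under this model, a physical error rate below the threshold is suppressed doubly exponentially with each level, while a rate above it grows.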
Abstract:
We describe a scheme for quantum-error correction that employs feedback and weak measurement rather than the standard tools of projective measurement and fast controlled unitary gates. The advantage of this scheme over previous protocols [for example, Ahn Phys. Rev. A 65, 042301 (2001)] is that it requires little side processing while remaining robust to measurement inefficiency, and is therefore considerably more practical. We evaluate the performance of our scheme by simulating the correction of bit flips. We also consider implementation in a solid-state quantum-computation architecture and estimate the maximal error rate that could be corrected with current technology.
Abstract:
This letter presents an analytical model for evaluating the Bit Error Rate (BER) of a Direct Sequence Code Division Multiple Access (DS-CDMA) system, with M-ary orthogonal modulation and noncoherent detection, employing an array antenna operating in a Nakagami fading environment. An expression of the Signal to Interference plus Noise Ratio (SINR) at the output of the receiver is derived, which allows the BER to be evaluated using a closed form expression. The analytical model is validated by comparing the obtained results with simulation results.
The effects of task complexity and practice on dual-task interference in visuospatial working memory
Abstract:
Although the n-back task has been widely applied to neuroimaging investigations of working memory (WM), the role of practice effects on behavioural performance of this task has not yet been investigated. The current study aimed to investigate the effects of task complexity and familiarity on the n-back task. Seventy-seven participants (39 male, 38 female) completed a visuospatial n-back task four times, twice in each of two testing sessions separated by a week. Participants were required to remember either the first, second or third (n-back) most recent letter positions in a continuous sequence and to indicate whether the current item matched or did not match the remembered position. A control task, with no working memory requirements, required participants to match to a predetermined stimulus position. In both testing sessions, reaction time (RT) and error rate increased with increasing WM load. An exponential slope for RTs in the first session indicated dual-task interference at the 3-back level. However, a linear slope in the second session indicated a reduction of dual-task interference. Attenuation of interference in the second session suggested a reduction in executive demands of the task with practice. This suggested that practice effects occur within the n-back task and need to be controlled for in future neuroimaging research using the task.
Abstract:
Univariate linkage analysis is used routinely to localise genes for human complex traits. Often, many traits are analysed but the significance of linkage for each trait is not corrected for multiple trait testing, which increases the experiment-wise type-I error rate. In addition, univariate analyses do not realise the full power provided by multivariate data sets. Multivariate linkage is the ideal solution but it is computationally intensive, so genome-wide analysis and evaluation of empirical significance are often prohibitive. We describe two simple methods that efficiently alleviate these caveats by combining P-values from multiple univariate linkage analyses. The first method estimates empirical pointwise and genome-wide significance between one trait and one marker when multiple traits have been tested. It is as robust as an appropriate Bonferroni adjustment, with the advantage that no assumptions are required about the number of independent tests performed. The second method estimates the significance of linkage between multiple traits and one marker and, therefore, it can be used to localise regions that harbour pleiotropic quantitative trait loci (QTL). We show that this method has greater power than individual univariate analyses to detect a pleiotropic QTL across different situations. In addition, when traits are moderately correlated and the QTL influences all traits, it can outperform formal multivariate VC analysis. This approach is computationally feasible for any number of traits and was not affected by the residual correlation between traits. We illustrate the utility of our approach with a genome scan of three asthma traits measured in families with a twin proband.
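The inflation of the experiment-wise type-I error rate and the Bonferroni adjustment used as the comparison point can be sketched as follows. This is only the standard baseline, not the two P-value combination methods the authors describe; the function names and example P-values are hypothetical, and the inflation formula assumes independent tests.

```python
def experimentwise_error(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni(p_values: list[float]) -> list[float]:
    """Bonferroni-adjusted P-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Three traits tested at a nominal 5% level (made-up P-values).
print(experimentwise_error(0.05, 3))
print(bonferroni([0.01, 0.04, 0.5]))
```

Bonferroni requires the analyst to state the number of independent tests, which is the assumption the first method in the abstract is said to avoid.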
Abstract:
Presence-absence surveys are a commonly used method for monitoring broad-scale changes in wildlife distributions. However, the lack of power of these surveys for detecting population trends is problematic for their application in wildlife management. Options for improving power include increasing the sampling effort or arbitrarily relaxing the type I error rate. We present an alternative, whereby targeted sampling of particular habitats in the landscape using information from a habitat model increases power. The advantage of this approach is that it does not require a trade-off with either cost or the Pr(type I error) to achieve greater power. We use a demographic model of koala (Phascolarctos cinereus) population dynamics and simulations of the monitoring process to estimate the power to detect a trend in occupancy for a range of strategies, thereby demonstrating that targeting particular habitat qualities can improve power substantially. If the objective is to detect a decline in occupancy, the optimal strategy is to sample high-quality habitats. Alternatively, if the objective is to detect an increase in occupancy, the optimal strategy is to sample intermediate-quality habitats. The strategies with the highest power remained the same under a range of parameter assumptions, although observation error had a strong influence on the optimal strategy. Our approach specifically applies to monitoring for detecting long-term trends in occupancy or abundance. This is a common and important monitoring objective for wildlife managers, and we provide guidelines for more effectively achieving it.
Abstract:
Traditionally, machine learning algorithms have been evaluated in applications where assumptions can be reliably made about class priors and/or misclassification costs. In this paper, we consider the case of imprecise environments, where little may be known about these factors and they may well vary significantly when the system is applied. Specifically, the use of precision-recall analysis is investigated and compared to the more well known performance measures such as error-rate and the receiver operating characteristic (ROC). We argue that while ROC analysis is invariant to variations in class priors, this invariance in fact hides an important factor of the evaluation in imprecise environments. Therefore, we develop a generalised precision-recall analysis methodology in which variation due to prior class probabilities is incorporated into a multi-way analysis of variance (ANOVA). The increased sensitivity and reliability of this approach is demonstrated in a remote sensing application.
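The claim that ROC analysis is invariant to class priors while precision is not can be checked with a few lines of arithmetic. The sketch below holds a classifier's true-positive and false-positive rates fixed (one ROC operating point) and varies only the prior of the positive class; the rates and priors are made-up values, and the function name is hypothetical.

```python
def precision(tpr: float, fpr: float, prior: float) -> float:
    """Precision implied by a fixed ROC operating point and a positive-class prior."""
    true_pos = prior * tpr            # expected fraction of true positives
    false_pos = (1.0 - prior) * fpr   # expected fraction of false positives
    return true_pos / (true_pos + false_pos)


TPR, FPR = 0.9, 0.1  # the ROC point does not move when the prior changes
for prior in (0.5, 0.1, 0.01):
    print(f"prior = {prior:<4}: precision = {precision(TPR, FPR, prior):.3f}")
```

The same ROC point gives high precision with balanced classes and low precision when positives are rare, which is the variation a prior-aware precision-recall analysis is meant to expose.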
Abstract:
A test of the ability of a probabilistic neural network to classify deposits into types on the basis of deposit tonnage and average Cu, Mo, Ag, Au, Zn, and Pb grades is conducted. The purpose is to examine whether this type of system might serve as a basis for integrating geoscience information available in large mineral databases to classify sites by deposit type. Benefits of proper classification of many sites in large regions are relatively rapid identification of terranes permissive for deposit types and recognition of specific sites perhaps worthy of exploring further. Total tonnages and average grades of 1,137 well-explored deposits identified in published grade and tonnage models representing 13 deposit types were used to train and test the network. Tonnages were transformed by logarithms and grades by square roots to reduce effects of skewness. All values were scaled by subtracting the variable's mean and dividing by its standard deviation. Half of the deposits were selected randomly to be used in training the probabilistic neural network and the other half were used for independent testing. Tests were performed with a probabilistic neural network employing a Gaussian kernel and separate sigma weights for each class (type) and each variable (grade or tonnage). Deposit types were selected to challenge the neural network. For many types, tonnages or average grades are significantly different from other types, but individual deposits may plot in the grade and tonnage space of more than one type. Porphyry Cu, porphyry Cu-Au, and porphyry Cu-Mo types have similar tonnages and relatively small differences in grades. Redbed Cu deposits typically have tonnages that could be confused with those of porphyry Cu deposits and also contain Cu and, in some situations, Ag. Cyprus and kuroko massive sulfide types have about the same tonnages and Cu, Zn, Ag, and Au grades. Polymetallic vein, sedimentary exhalative Zn-Pb, and Zn-Pb skarn types contain many of the same metals. Sediment-hosted Au, Comstock Au-Ag, and low-sulfide Au-quartz vein types are principally Au deposits with differing amounts of Ag. Given the intent to test the neural network under the most difficult conditions, an overall 75% agreement between the experts and the neural network is considered excellent. Among the largest classification errors are skarn Zn-Pb and Cyprus massive sulfide deposits classed by the neural network as kuroko massive sulfides (24 and 63% error, respectively). Another large error is the classification of 92% of porphyry Cu-Mo as porphyry Cu deposits. Most of the larger classification errors involve 25 or fewer training deposits, suggesting that some errors might be the result of small sample size. About 91% of the gold deposit types were classed properly and 98% of porphyry Cu deposits were classed as some type of porphyry Cu deposit. An experienced economic geologist would not make many of the classification errors that were made by the neural network because the geologic settings of deposits would be used to reduce errors. In a separate test, the probabilistic neural network correctly classed 93% of 336 deposits in eight deposit types when trained with presence or absence of 58 minerals and six generalized rock types. The overall success rate of the probabilistic neural network when trained on tonnage and average grades would probably be more than 90% with additional information on the presence of a few rock types.
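The preprocessing and classifier described in this abstract can be sketched in a few lines. This is a minimal sketch, not the authors' implementation: it assumes base-10 logarithms, equal class priors, and a product Gaussian kernel with one sigma per class and variable. The function names, toy training points, and sigma values are hypothetical, and the toy features are taken to be already standardised.

```python
import math


def transform_deposit(tonnage: float, grades: list[float]) -> list[float]:
    """Log-transform tonnage and square-root grades to reduce skewness."""
    return [math.log10(tonnage)] + [math.sqrt(g) for g in grades]


def pnn_classify(x, training, sigmas):
    """Assign x to the class with the largest Gaussian kernel density estimate.

    training: {label: [feature vectors]}
    sigmas:   {label: [one smoothing width per variable]}
    """
    scores = {}
    for label, examples in training.items():
        s = sigmas[label]
        total = 0.0
        for ex in examples:
            d2 = sum(((xi - ei) / si) ** 2 for xi, ei, si in zip(x, ex, s))
            total += math.exp(-0.5 * d2)
        # The (2*pi)**(d/2) factor is common to all classes and is omitted.
        scores[label] = total / (len(examples) * math.prod(s))
    return max(scores, key=scores.get)


# Toy standardised features for two made-up deposit types.
training = {"A": [[0.0, 0.0], [0.2, -0.1]], "B": [[2.0, 2.0], [1.8, 2.2]]}
sigmas = {"A": [0.5, 0.5], "B": [0.5, 0.5]}
print(pnn_classify([0.1, 0.1], training, sigmas))
print(transform_deposit(1000.0, [0.25, 4.0]))
```

In practice the standardisation step (subtracting each variable's mean and dividing by its standard deviation) would sit between the transform and the classifier, using statistics from the training half only.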
Abstract:
The expectation-maximization (EM) algorithm has been of considerable interest in recent years as the basis for various algorithms in application areas of neural networks such as pattern recognition. However, there exist some misconceptions concerning its application to neural networks. In this paper, we clarify these misconceptions and consider how the EM algorithm can be adopted to train multilayer perceptron (MLP) and mixture of experts (ME) networks in applications to multiclass classification. We identify some situations where the application of the EM algorithm to train MLP networks may be of limited value and discuss some ways of handling the difficulties. For ME networks, it is reported in the literature that networks trained by the EM algorithm, using the iteratively reweighted least squares (IRLS) algorithm in the inner loop of the M-step, often performed poorly in multiclass classification. However, we found that the convergence of the IRLS algorithm is stable and that the log likelihood is monotonically increasing when a learning rate smaller than one is adopted. Also, we propose the use of an expectation-conditional maximization (ECM) algorithm to train ME networks. Its performance is demonstrated to be superior to the IRLS algorithm on some simulated and real data sets.
Abstract:
Genetic assignment methods use genotype likelihoods to draw inference about where individuals were or were not born, potentially allowing direct, real-time estimates of dispersal. We used simulated data sets to test the power and accuracy of Monte Carlo resampling methods in generating statistical thresholds for identifying F0 immigrants in populations with ongoing gene flow, and hence for providing direct, real-time estimates of migration rates. The identification of accurate critical values required that resampling methods preserved the linkage disequilibrium deriving from recent generations of immigrants and reflected the sampling variance present in the data set being analysed. A novel Monte Carlo resampling method taking into account these aspects was proposed and its efficiency was evaluated. Power and error were relatively insensitive to the frequency assumed for missing alleles. Power to identify F0 immigrants was improved by using large sample size (up to about 50 individuals) and by sampling all populations from which migrants may have originated. A combination of plotting genotype likelihoods and calculating mean genotype likelihood ratios (D_LR) appeared to be an effective way to predict whether F0 immigrants could be identified for a particular pair of populations using a given set of markers.
Abstract:
Eastern curlews Numenius madagascariensis spending the nonbreeding season in eastern Australia foraged on three intertidal decapods: soldier crab Mictyris longicarpus, sentinel crab Macrophthalmus crassipes and ghost-shrimp Trypaea australiensis. Due to their ecology, these crustaceans were spatially segregated (i.e. distributed in 'patches') and the curlews intermittently consumed more than one prey type. It was predicted that if the curlews behaved as intake rate maximizers, the time spent foraging on a particular prey (patch) would reflect relative availabilities of the prey types and thus prey-specific intake rates would be equal. During the mid-nonbreeding period (November-December), Mictyris and Macrophthalmus were primarily consumed and prey-specific intake rates were statistically indistinguishable (8.8 versus 10.1 kJ min⁻¹). Prior to migration (February), Mictyris and Trypaea were hunted and the respective intake rates were significantly different (8.9 versus 2.3 kJ min⁻¹). Time allocation to Trypaea-hunting was independent of the availability of Mictyris. Thus, consumption of Trypaea depressed the overall intake rate. Six hypotheses for consuming Trypaea before migration were examined. Five hypotheses (possible error by the predator, prey specialization, observer overestimation of time spent hunting Trypaea, supplementary prey, and the choice of higher quality prey due to a digestive bottleneck) were deemed unsatisfactory. The explanation deemed plausible for consumption of a low intake-rate but high quality prey (Trypaea) was diet optimisation by the curlews in response to the pre-migratory modulation (decrease in size/processing capacity) of their digestive system. With a seasonal decrease in the average intake rate, the estimated intake per low tide increased from 1233 to 1508 kJ between the mid-nonbreeding and pre-migratory periods by increasing the overall time spent on the sandflats and the proportion of time spent foraging.
Abstract:
The use of presence/absence data in wildlife management and biological surveys is widespread. There is a growing interest in quantifying the sources of error associated with these data. Using simulated data, we show that false-negative errors (failure to record a species when in fact it is present) can have a significant impact on statistical estimation of habitat models. Then we introduce an extension of logistic modeling, the zero-inflated binomial (ZIB) model, that permits the estimation of the rate of false-negative errors and the correction of estimates of the probability of occurrence for false-negative errors by using repeated visits to the same site. Our simulations show that even relatively low rates of false negatives bias statistical estimates of habitat effects. The method with three repeated visits eliminates the bias, but estimates are relatively imprecise. Six repeated visits improve precision of estimates to levels comparable to that achieved with conventional statistics in the absence of false-negative errors. In general, when error rates are ≤50%, greater efficiency is gained by adding more sites, whereas when error rates are >50% it is better to increase the number of repeated visits. We highlight the flexibility of the method with three case studies, clearly demonstrating the effect of false-negative errors for a range of commonly used survey methods.
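The idea behind a zero-inflated binomial model can be illustrated with its likelihood for a single site. The sketch below uses the standard form, in which a site with zero detections is either unoccupied or occupied but missed on every visit. It is an assumption that this matches the authors' exact parameterisation, and the function and parameter names are hypothetical. The per-visit false-negative rate corresponds to `1 - detection`.

```python
from math import comb


def zib_likelihood(detections: int, visits: int,
                   occupancy: float, detection: float) -> float:
    """Likelihood of `detections` out of `visits` under a zero-inflated binomial.

    occupancy: probability that the site is occupied
    detection: per-visit probability of recording the species when present
    """
    binom = (comb(visits, detections)
             * detection ** detections
             * (1.0 - detection) ** (visits - detections))
    if detections == 0:
        # Either truly absent, or present but missed on every visit.
        return (1.0 - occupancy) + occupancy * binom
    return occupancy * binom


# Three visits, no detections, with made-up parameter values.
print(zib_likelihood(0, 3, occupancy=0.6, detection=0.5))
```

With more repeated visits, the "present but always missed" term shrinks, which is why repeated visits let the model separate true absence from non-detection.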
Abstract:
Systematic protocols that use decision rules or scores are seen to improve consistency and transparency in classifying the conservation status of species. When applying these protocols, assessors are typically required to decide on estimates for attributes that are inherently uncertain. Input data and resulting classifications are usually treated as though they are exact and hence without operator error. We investigated the impact of data interpretation on the consistency of protocols of extinction risk classifications and diagnosed causes of discrepancies when they occurred. We tested three widely used systematic classification protocols employed by the World Conservation Union, NatureServe, and the Florida Fish and Wildlife Conservation Commission. We provided 18 assessors with identical information for 13 different species to infer estimates for each of the required parameters for the three protocols. The threat classification of several of the species varied from low risk to high risk, depending on who did the assessment. This occurred across the three protocols investigated. Assessors tended to agree on their placement of species in the highest (50-70%) and lowest risk categories (20-40%), but there was poor agreement on which species should be placed in the intermediate categories. Furthermore, the correspondence between the three classification methods was unpredictable, with large variation among assessors. These results highlight the importance of peer review and consensus among multiple assessors in species classifications and the need to be cautious with assessments carried out by a single assessor. Greater consistency among assessors requires wide use of training manuals and formal methods for estimating parameters that allow uncertainties to be represented, carried through chains of calculations, and reported transparently.
Abstract:
Background: The residue-wise contact order (RWCO) describes the sequence separations between the residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the three-dimensional structure of a protein from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. Results: We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. Moreover, by incorporating global features such as molecular weight and amino acid composition, we could further improve the prediction performance to a CC of 0.57 and an RMSE of 0.79. In addition, adding the secondary structure predicted by PSIPRED was found to significantly improve the prediction performance and yielded the best prediction accuracy, with a CC of 0.60 and an RMSE of 0.78, which is at least comparable to that of other existing methods. Conclusion: The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
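The two evaluation measures reported here, the Pearson correlation coefficient and the root mean square error between predicted and observed values, can be computed as in the sketch below. The sample values are made up and the function names are hypothetical. Whether the authors normalised RWCO values before computing the RMSE is not stated in the abstract.

```python
import math


def pearson_cc(predicted: list[float], observed: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)


def rmse(predicted: list[float], observed: list[float]) -> float:
    """Root mean square error between predictions and observations."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


predicted = [0.2, 0.5, 0.9, 1.4]  # made-up per-residue predictions
observed = [0.1, 0.7, 0.8, 1.6]   # made-up observed values
print(pearson_cc(predicted, observed), rmse(predicted, observed))
```

The two measures are complementary: CC captures whether the predicted profile rises and falls with the observed one, while RMSE captures the size of the per-residue deviations.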