909 resultados para data accuracy
Resumo:
Results of two experiments are reported that examined how people respond to rectangular targets of different sizes in simple hitting tasks. If a target moves in a straight line and a person is constrained to move along a linear track oriented perpendicular to the targetrsquos motion, then the length of the target along its direction of motion constrains the temporal accuracy and precision required to make the interception. The dimensions of the target perpendicular to its direction of motion place no constraints on performance in such a task. In contrast, if the person is not constrained to move along a straight track, the targetrsquos dimensions may constrain the spatial as well as the temporal accuracy and precision. The experiments reported here examined how people responded to targets of different vertical extent (height): the task was to strike targets that moved along a straight, horizontal path. In experiment 1 participants were constrained to move along a horizontal linear track to strike targets and so target height did not constrain performance. Target height, length and speed were co-varied. Movement time (MT) was unaffected by target height but was systematically affected by length (briefer movements to smaller targets) and speed (briefer movements to faster targets). Peak movement speed (Vmax) was influenced by all three independent variables: participants struck shorter, narrower and faster targets harder. In experiment 2, participants were constrained to move in a vertical plane normal to the targetrsquos direction of motion. In this task target height constrains the spatial accuracy required to contact the target. Three groups of eight participants struck targets of different height but of constant length and speed, hence constant temporal accuracy demand (different for each group, one group struck stationary targets = no temporal accuracy demand). On average, participants showed little or no systematic response to changes in spatial accuracy demand on any dependent measure (MT, Vmax, spatial variable error). The results are interpreted in relation to previous results on movements aimed at stationary targets in the absence of visual feedback.
Resumo:
Different interceptive tasks and modes of interception (hitting or capturing) do not necessarily involve similar control processes. Control based on preprogramming of movement parameters is possible for actions with brief movement times but is now widely rejected; continuous perceptuomotor control models are preferred for all types of interception. The rejection of preprogrammed control and acceptance of continuous control is evaluated for the timing of rapidly executed, manual hitting actions. It is shown that a preprogrammed control model is capable of providing a convincing account of observed behavior patterns that avoids many of the arguments that have been raised against it. Prominent continuous perceptual control models are analyzed within a common framework and are shown to be interpretable as feedback control strategies. Although these models can explain observations of on-line adjustments to movement, they offer only post hoc explanations for observed behavior patterns in hitting tasks and are not directly supported by data. It is proposed that rapid manual hitting tasks make up a class of interceptions for which a preprogrammed strategy is adopted-a strategy that minimizes the role of visual feedback. Such a strategy is effective when the task demands a high degree of temporal accuracy.
Resumo:
A combination of deductive reasoning, clustering, and inductive learning is given as an example of a hybrid system for exploratory data analysis. Visualization is replaced by a dialogue with the data.
Resumo:
This paper reports a comparative study of Australian and New Zealand leadership attributes, based on the GLOBE (Global Leadership and Organizational Behavior Effectiveness) program. Responses from 344 Australian managers and 184 New Zealand managers in three industries were analyzed using exploratory and confirmatory factor analysis. Results supported some of the etic leadership dimensions identified in the GLOBE study, but also found some emic dimensions of leadership for each country. An interesting finding of the study was that the New Zealand data fitted the Australian model, but not vice versa, suggesting asymmetric perceptions of leadership in the two countries.
Resumo:
In the context of cancer diagnosis and treatment, we consider the problem of constructing an accurate prediction rule on the basis of a relatively small number of tumor tissue samples of known type containing the expression data on very many (possibly thousands) genes. Recently, results have been presented in the literature suggesting that it is possible to construct a prediction rule from only a few genes such that it has a negligible prediction error rate. However, in these results the test error or the leave-one-out cross-validated error is calculated without allowance for the selection bias. There is no allowance because the rule is either tested on tissue samples that were used in the first instance to select the genes being used in the rule or because the cross-validation of the rule is not external to the selection process; that is, gene selection is not performed in training the rule at each stage of the cross-validation process. We describe how in practice the selection bias can be assessed and corrected for by either performing a cross-validation or applying the bootstrap external to the selection process. We recommend using 10-fold rather than leave-one-out cross-validation, and concerning the bootstrap, we suggest using the so-called. 632+ bootstrap error estimate designed to handle overfitted prediction rules. Using two published data sets, we demonstrate that when correction is made for the selection bias, the cross-validated error is no longer zero for a subset of only a few genes.
Resumo:
The generalized Gibbs sampler (GGS) is a recently developed Markov chain Monte Carlo (MCMC) technique that enables Gibbs-like sampling of state spaces that lack a convenient representation in terms of a fixed coordinate system. This paper describes a new sampler, called the tree sampler, which uses the GGS to sample from a state space consisting of phylogenetic trees. The tree sampler is useful for a wide range of phylogenetic applications, including Bayesian, maximum likelihood, and maximum parsimony methods. A fast new algorithm to search for a maximum parsimony phylogeny is presented, using the tree sampler in the context of simulated annealing. The mathematics underlying the algorithm is explained and its time complexity is analyzed. The method is tested on two large data sets consisting of 123 sequences and 500 sequences, respectively. The new algorithm is shown to compare very favorably in terms of speed and accuracy to the program DNAPARS from the PHYLIP package.
Resumo:
Genetic recombination can produce heterogeneous phylogenetic histories within a set of homologous genes. Delineating recombination events is important in the study of molecular evolution, as inference of such events provides a clearer picture of the phylogenetic relationships among different gene sequences or genomes. Nevertheless, detecting recombination events can be a daunting task, as the performance of different recombination-detecting approaches can vary, depending on evolutionary events that take place after recombination. We previously evaluated the effects of post-recombination events on the prediction accuracy of recombination-detecting approaches using simulated nucleotide sequence data. The main conclusion, supported by other studies, is that one should not depend on a single method when searching for recombination events. In this paper, we introduce a two-phase strategy, applying three statistical measures to detect the occurrence of recombination events, and a Bayesian phylogenetic approach to delineate breakpoints of such events in nucleotide sequences. We evaluate the performance of these approaches using simulated data, and demonstrate the applicability of this strategy to empirical data. The two-phase strategy proves to be time-efficient when applied to large datasets, and yields high-confidence results.
Resumo:
Data mining is the process to identify valid, implicit, previously unknown, potentially useful and understandable information from large databases. It is an important step in the process of knowledge discovery in databases, (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, seme-structured, or unstructured. Data can be in text, categorical or numerical values. One of the important characteristics of data mining is its ability to deal data with large volume, distributed, time variant, noisy, and high dimensionality. A large number of data mining algorithms have been developed for different applications. For example, association rules mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in the data mining, particularly for data mining applications into engineering fields. Together with regression, classification is mainly for predictive modelling. So far, there have been a number of classification algorithms in practice. According to (Sebastiani, 2002), the main classification algorithms can be categorized as: decision tree and rule based approach such as C4.5 (Quinlan, 1996); probability methods such as Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten 2001), neural networks methods (Rumelhart, Hinton & Wiliams, 1986); example-based methods such as k-nearest neighbors (Duda & Hart, 1973), and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al, 1998) and Ensemble Classification (Tumer, 1996).
Resumo:
There are many techniques for electricity market price forecasting. However, most of them are designed for expected price analysis rather than price spike forecasting. An effective method of predicting the occurrence of spikes has not yet been observed in the literature so far. In this paper, a data mining based approach is presented to give a reliable forecast of the occurrence of price spikes. Combined with the spike value prediction techniques developed by the same authors, the proposed approach aims at providing a comprehensive tool for price spike forecasting. In this paper, feature selection techniques are firstly described to identify the attributes relevant to the occurrence of spikes. A simple introduction to the classification techniques is given for completeness. Two algorithms: support vector machine and probability classifier are chosen to be the spike occurrence predictors and are discussed in details. Realistic market data are used to test the proposed model with promising results.
Resumo:
An investigation was undertaken to test the effectiveness of two procedures for recording boundaries and plot positions for scientific studies on farms on Leyte Island, the Philippines. The accuracy of a Garmin 76 Global Positioning System (GPS) unit and a compass and chain was checked under the same conditions. Tree canopies interfered with the ability of the satellite signal to reach the GPS and therefore the GPS survey was less accurate than the compass and chain survey. Where a high degree of accuracy is required, a compass and chain survey remains the most effective method of surveying land underneath tree canopies, providing operator error is minimised. For a large number of surveys and thus large amounts of data, a GPS is more appropriate than a compass and chain survey because data are easily up-loaded into a Geographic Information System (GIS). However, under dense canopies where satellite signals cannot reach the GPS, it may be necessary to revert to a compass survey or a combination of both methods.
Resumo:
Background: This study used household survey data on the prevalence of child, parent and family variables to establish potential targets for a population-level intervention to strengthen parenting skills in the community. The goals of the intervention include decreasing child conduct problems, increasing parental self-efficacy, use of positive parenting strategies, decreasing coercive parenting and increasing help-seeking, social support and participation in positive parenting programmes. Methods: A total of 4010 parents with a child under the age of 12 years completed a statewide telephone survey on parenting. Results: One in three parents reported that their child had a behavioural or emotional problem in the previous 6 months. Furthermore, 9% of children aged 2–12 years meet criteria for oppositional defiant disorder. Parents who reported their child's behaviour to be difficult were more likely to perceive parenting as a negative experience (i.e. demanding, stressful and depressing). Parents with greatest difficulties were mothers without partners and who had low levels of confidence in their parenting roles. About 20% of parents reported being stressed and 5% reported being depressed in the 2 weeks prior to the survey. Parents with personal adjustment problems had lower levels of parenting confidence and their child was more difficult to manage. Only one in four parents had participated in a parent education programme. Conclusions: Implications for the setting of population-level goals and targets for strengthening parenting skills are discussed.
Resumo:
Despite the increasing prevalence of salinity world-wide, the measurement of exchangeable cation concentrations in saline soils remains problematic. Two soil types (Mollisol and Vertisol) were equilibrated with a range of sodium adsorption ratio (SAR) solutions at various ionic strengths. The concentrations of exchangeable cations were then determined using several different types of methods, and the measured exchangeable cation concentrations compared to reference values. At low ionic strength (low salinity), the concentration of exchangeable cations can be accurately estimated from the total soil extractable cations. In saline soils, however, the presence of soluble salts in the soil solution precludes the use of this method. Leaching of the soil with a pre-wash solution (such as alcohol) was found to effectively remove the soluble salts from the soil, thus allowing the accurate measurement of the effective cation exchange capacity (ECEC). However, the dilution associated with this pre-washing increased the exchangeable Ca concentrations while simultaneously decreasing exchangeable Na. In contrast, when calculated as the difference between the total extractable cations and the soil solution cations, good correlations were found between the calculated exchangeable cation concentrations and the reference values for both Na (Mollisol: y=0.873x and Vertisol: y=0.960x) and Ca (Mollisol: y=0.901x and Vertisol: y=1.05x). Therefore, for soils with a soil solution ionic strength greater than 50 mM (electrical conductivity of 4 dS/m) (in which exchangeable cation concentrations are overestimated by the assumption they can be estimated as the total extractable cations), concentrations can be calculated as the difference between total extractable cations and soluble cations.
Resumo:
This paper discusses a multi-layer feedforward (MLF) neural network incident detection model that was developed and evaluated using field data. In contrast to published neural network incident detection models which relied on simulated or limited field data for model development and testing, the model described in this paper was trained and tested on a real-world data set of 100 incidents. The model uses speed, flow and occupancy data measured at dual stations, averaged across all lanes and only from time interval t. The off-line performance of the model is reported under both incident and non-incident conditions. The incident detection performance of the model is reported based on a validation-test data set of 40 incidents that were independent of the 60 incidents used for training. The false alarm rates of the model are evaluated based on non-incident data that were collected from a freeway section which was video-taped for a period of 33 days. A comparative evaluation between the neural network model and the incident detection model in operation on Melbourne's freeways is also presented. The results of the comparative performance evaluation clearly demonstrate the substantial improvement in incident detection performance obtained by the neural network model. The paper also presents additional results that demonstrate how improvements in model performance can be achieved using variable decision thresholds. Finally, the model's fault-tolerance under conditions of corrupt or missing data is investigated and the impact of loop detector failure/malfunction on the performance of the trained model is evaluated and discussed. The results presented in this paper provide a comprehensive evaluation of the developed model and confirm that neural network models can provide fast and reliable incident detection on freeways. (C) 1997 Elsevier Science Ltd. All rights reserved.