991 results for Statistical decision


Relevance: 30.00%

Abstract:

Feature selection plays an important role in knowledge discovery and data mining. In traditional rough set theory, feature selection using reducts (minimal discerning sets of attributes) is an important area. However, the original definition of a reduct is restrictive, so earlier research proposed taking into account not only the horizontal reduction of information by feature selection, but also a vertical reduction that considers suitable subsets of the original set of objects. Following that work, a new approach to generating bireducts using a multi-objective genetic algorithm was proposed. Although genetic algorithms have been used to calculate reducts in some previous works, we did not find any work in which genetic algorithms were adopted to calculate bireducts. Compared with earlier work in this area, the proposed method has less randomness in generating bireducts. The genetic algorithm estimated the quality of each bireduct by the values of two objective functions as evolution progressed, so that a set of bireducts with optimized values of these objectives was obtained. Different fitness evaluation methods and genetic operators, such as crossover and mutation, were applied and the prediction accuracies were compared. Five datasets were used to test the proposed method and two datasets were used for a comparison study. Statistical analysis using the one-way ANOVA test was performed to determine whether the differences between the results were significant. The experiments showed that the proposed method was able to reduce the number of bireducts needed to achieve good prediction accuracy. The influence of different genetic operators and fitness evaluation strategies on prediction accuracy was also analyzed. The prediction accuracies of the proposed method were shown to be comparable with the best results in the machine learning literature, and some of them outperformed those results.
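The abstract does not define the two objective functions, so the following is only a minimal sketch. It assumes the usual reading of a bireduct as a pair (attribute subset, object subset) in which the attributes discern every pair of chosen objects with different decisions, and it assumes, hypothetically, that the two objectives are the number of attributes (fewer is better) and the number of covered objects (more is better). All function names are illustrative.

```python
def is_consistent(rows, decisions, attrs, objects):
    """True if `attrs` discern every pair of `objects` that differ in decision."""
    seen = {}
    for i in objects:
        key = tuple(rows[i][a] for a in attrs)
        # Two objects with equal attribute values but different decisions
        # cannot be told apart by `attrs`.
        if seen.setdefault(key, decisions[i]) != decisions[i]:
            return False
    return True


def objectives(attrs, objects):
    """Hypothetical objective pair: (attributes used, objects covered)."""
    return len(attrs), len(objects)


rows = [[0, 1], [0, 0], [1, 1]]
decisions = [0, 1, 1]
print(is_consistent(rows, decisions, [0], [0, 1, 2]))
print(is_consistent(rows, decisions, [0], [0, 2]))
```

A genetic algorithm would search over such pairs; minimality of the attribute set and maximality of the object set are not checked here.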

Relevance: 30.00%

Abstract:

Decision trees are very powerful tools for classification in data mining tasks that involve different types of attributes. Numeric data sets are usually first converted to categorical types and then classified using information gain. Information gain is a popular and useful concept that indicates whether splitting on a given attribute yields any benefit in terms of information content. However, this process is computationally intensive for large data sets. In addition, popular decision tree algorithms such as ID3 cannot handle numeric data sets. This paper proposes statistical variance as an alternative to information gain, together with the statistical mean, to split attributes in completely numerical data sets. Using the ANOVA test, the new algorithm has been shown to be competitive with its information gain counterpart C4.5 and with many existing decision tree algorithms on the standard UCI benchmark datasets. The specific advantages of the proposed algorithm are that it avoids the computational overhead of information gain computation for large data sets with many attributes, and that it avoids converting huge numeric data sets to categorical data, which is also a time-consuming task. In summary, huge numeric datasets can be submitted directly to this algorithm without any attribute mappings or information gain computations. It also blends two closely related fields, statistics and data mining.
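The abstract does not give the exact splitting rule, so the following is a minimal sketch under stated assumptions: each numeric attribute is split at its mean, and the split is scored by how much it reduces the variance of numerically encoded class labels. The function names and the scoring choice are hypothetical, not the paper's algorithm.

```python
from statistics import mean, pvariance


def mean_split_score(values, labels):
    """Split one numeric attribute at its mean; return (threshold, variance reduction)."""
    threshold = mean(values)
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return threshold, 0.0  # degenerate split, nothing gained
    n = len(labels)
    weighted = (len(left) * pvariance(left) + len(right) * pvariance(right)) / n
    return threshold, pvariance(labels) - weighted


def best_attribute(rows, labels):
    """Return (column index, threshold, score) of the best mean split."""
    best = None
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        threshold, score = mean_split_score(column, labels)
        if best is None or score > best[2]:
            best = (j, threshold, score)
    return best


rows = [[1.0, 5.0], [2.0, 1.0], [8.0, 5.2], [9.0, 1.1]]
labels = [0, 0, 1, 1]
print(best_attribute(rows, labels))
```

No categorical conversion and no entropy computation is needed, which is the advantage the abstract claims for a variance-based criterion.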

Relevance: 30.00%

Abstract:

The characterization and grading of glioma tumors via image-derived features, for diagnosis, prognosis, and treatment response, has been an active research area in medical image computing. This paper presents a novel method for automatic detection and classification of glioma from conventional T2-weighted MR images. Automatic detection of the tumor was established using a newly developed method called the Adaptive Gray level Algebraic set Segmentation Algorithm (AGASA). Statistical features were extracted from the detected tumor texture using first-order statistics and second-order statistical methods based on the gray level co-occurrence matrix (GLCM). The statistical significance of the features was determined by a t-test and its corresponding p-value. A decision system was developed for glioma grade detection using these selected features and their p-values. The detection performance of the decision system was validated using the receiver operating characteristic (ROC) curve. The diagnosis and grading of glioma using this non-invasive method can contribute promising results in medical image computing.
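As an illustration of the second-order statistics mentioned above, here is a minimal sketch of a gray level co-occurrence matrix and two common features derived from it. The abstract does not say which GLCM features or pixel offsets were used; the offset (one pixel to the right), the non-symmetric counting, and the choice of contrast and angular second moment are assumptions.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized (non-symmetric) co-occurrence matrix for pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    n_rows, n_cols = len(image), len(image[0])
    for r in range(n_rows):
        for c in range(n_cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))  # assumes at least one valid pixel pair
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Sum of p(i, j) * (i - j)^2: large when neighbors differ strongly."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def angular_second_moment(p):
    """Sum of p(i, j)^2: large for uniform textures."""
    return sum(v * v for row in p for v in row)


image = [[0, 0, 1],
         [0, 0, 1]]
p = glcm(image, levels=2)
print(contrast(p), angular_second_moment(p))
```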

Relevance: 30.00%

Abstract:

This paper describes a new statistical, model-based approach to building a contact state observer. The observer uses measurements of the contact force and position, and prior information about the task encoded in a graph, to determine the current location of the robot in the task configuration space. Each node represents what the measurements will look like in a small region of configuration space by storing a predictive, statistical measurement model. This approach assumes that the measurements are statistically block independent conditioned on knowledge of the model, which is a fairly good model of the actual process. Arcs in the graph represent possible transitions between models. Beam Viterbi search is used to match the measurement history against possible paths through the model graph in order to estimate the most likely path for the robot. The resulting approach provides a new decision process that can be used as an observer for event-driven manipulation programming. The decision procedure is significantly more robust than simple threshold decisions because the measurement history is used to make decisions. The approach can be used to enhance the capabilities of autonomous assembly machines and in quality control applications.
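The search over the model graph can be sketched as follows. This is a minimal illustration, not the paper's observer: it assumes all arcs are equally likely (no transition weights), that staying in a node requires an explicit self-arc, and that `log_likelihood(node, measurement)` is a user-supplied measurement model. All names are hypothetical.

```python
def beam_viterbi(observations, arcs, log_likelihood, start_nodes, beam_width=3):
    """Most likely node path, keeping only the best `beam_width` hypotheses per step."""
    # Each hypothesis is (log score, path so far).
    beam = [(log_likelihood(n, observations[0]), [n]) for n in start_nodes]
    beam = sorted(beam, key=lambda h: h[0], reverse=True)[:beam_width]
    for obs in observations[1:]:
        best = {}  # best hypothesis ending in each node (the Viterbi recursion)
        for score, path in beam:
            for nxt in arcs.get(path[-1], []):
                s = score + log_likelihood(nxt, obs)
                if nxt not in best or s > best[nxt][0]:
                    best[nxt] = (s, path + [nxt])
        # Pruning step: discard all but the top-scoring hypotheses.
        beam = sorted(best.values(), key=lambda h: h[0], reverse=True)[:beam_width]
    return beam[0][1] if beam else []


# Toy task graph: the robot moves freely, then makes contact.
arcs = {"free": ["free", "contact"], "contact": ["contact"]}
means = {"free": 0.0, "contact": 5.0}  # expected force in each state


def gaussian_log_likelihood(node, force):
    return -((force - means[node]) ** 2) / 2.0


path = beam_viterbi([0.1, 0.2, 4.8, 5.1], arcs, gaussian_log_likelihood, ["free"])
print(path)
```

Because each decision is scored over the whole measurement history, a single noisy reading does not flip the estimated state the way a fixed threshold would.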

Relevance: 30.00%

Abstract:

Abstract taken from the publication. With financial support from the MIDE department of UNED. Includes an appendix of questions.

Relevance: 30.00%

Abstract:

Many different individuals, each with their own expertise and criteria for decision making, are involved in making decisions on construction projects. Decision-making processes are thus significantly affected by communication, in which the dynamic interplay of human intentions leads to unpredictable outcomes. In order to theorise decision-making processes that include communication, it is argued here that these processes resemble evolutionary dynamics in terms of both selection and mutation, which can be expressed by the replicator-mutator equation. To support this argument, a mathematical model of decision making has been built by analogy with evolutionary dynamics, with three variables: initial support rate, business hierarchy, and power of persuasion. In addition, a survey of decision-making patterns in construction projects was conducted through a self-administered mail questionnaire sent to construction practitioners. A comparison between the numerical analysis of the mathematical model and the statistical analysis of the empirical data has shown significant potential for the replicator-mutator equation as a tool for studying the dynamic properties of intentions in communication.
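For reference, the standard replicator-mutator equation is dx_i/dt = sum_j x_j f_j Q_ji - phi x_i, where x_i is the share of option i, f_j is the fitness of option j, Q_ji is the probability that a supporter of j switches to i, and phi = sum_j x_j f_j is the average fitness. The sketch below integrates it with forward Euler steps. How the paper maps initial support rate, business hierarchy, and power of persuasion onto x, f, and Q is not stated in the abstract; the constant fitness values and the mutation matrix here are purely illustrative assumptions.

```python
def replicator_mutator_step(x, f, Q, dt=0.01):
    """One forward Euler step of dx_i/dt = sum_j x_j f_j Q[j][i] - phi x_i."""
    n = len(x)
    phi = sum(x[j] * f[j] for j in range(n))  # average fitness
    dx = [sum(x[j] * f[j] * Q[j][i] for j in range(n)) - phi * x[i]
          for i in range(n)]
    return [x[i] + dt * dx[i] for i in range(n)]


# Two competing proposals with equal initial support (assumed values).
x = [0.5, 0.5]
f = [1.0, 2.0]                    # proposal 2 is more persuasive
Q = [[0.9, 0.1], [0.1, 0.9]]      # each row sums to 1
for _ in range(1000):
    x = replicator_mutator_step(x, f, Q)
print(x)
```

With rows of Q summing to one, the shares keep summing to one, so x can be read as the fraction of participants backing each proposal over time.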

Relevance: 30.00%

Abstract:

A decision support system (DSS) was implemented based on a fuzzy logic inference system (FIS) to provide assistance in dose alteration of Duodopa infusion in patients with advanced Parkinson’s disease, using data from motor state assessments and dosage. A three-tier architecture with an object-oriented approach was used. The DSS has a web-enabled graphical user interface that presents alerts indicating non-optimal dosage and states, new recommendations, namely typical advice with typical dose, and statistical measurements. One data set was used for design and tuning of the FIS and another data set was used for evaluating performance compared with the actual given dose. Overall goodness-of-fit for the new patients (design data) was 0.65 and for the ongoing patients (evaluation data) 0.98. User evaluation is now ongoing. The system could work as an assistant to clinical staff for Duodopa treatment in advanced Parkinson’s disease.

Relevance: 30.00%

Abstract:

Data mining can be used in the healthcare industry to “mine” clinical data and discover hidden information for intelligent and effective decision making. Hidden patterns and relationships often go unexploited, and advanced data mining techniques can help remedy this. This thesis mainly deals with Intelligent Prediction of Chronic Renal Disease (IPCRD). The data cover blood tests, urine tests, and external symptoms used to predict chronic renal disease. Data from the database are first converted for use in Weka (3.6), and the Chi-Square method is used for feature selection. After normalizing the data, three classifiers were applied and the efficiency of their output was evaluated: Decision Tree, Naïve Bayes, and the K-Nearest Neighbour (KNN) algorithm. Results show that each technique has its own strength in realizing the objectives of the defined mining goals. The efficiency of Decision Tree and KNN was almost the same, but Naïve Bayes showed a comparative edge over the others. In addition, sensitivity and specificity tests are used as statistical measures to examine the performance of a binary classification. Sensitivity (also called recall rate in some fields) measures the proportion of actual positives that are correctly identified, while specificity measures the proportion of negatives that are correctly identified. The CRISP-DM methodology is applied to build the mining models. It consists of six major phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.
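The two measures defined above can be computed directly from predicted and actual labels. The following minimal sketch uses made-up labels, not data from the thesis; the function name is illustrative.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a binary classification."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against a class that never occurs in y_true.
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
print(sensitivity_specificity(y_true, y_pred))
```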

Relevance: 30.00%

Abstract:

As the service life of water supply networks (WSN) increases, the aging of pipe networks has become an exceedingly serious problem. Because an urban water supply network is a hidden underground asset, it is difficult for monitoring staff to classify pipe network faults directly by means of modern detection technology. In this paper, the decision tree algorithm C4.5 is applied to basic property data of the water supply network (e.g. diameter, material, pressure, distance to pump, distance to tank, load) to classify the condition of water supply pipelines. Part of the historical data was used to build a decision tree classification model, and the remaining historical data was used to validate it. Statistical methods were used to assess the decision tree model, including basic statistical methods, the Receiver Operating Characteristic (ROC) and Recall-Precision Curves (RPC). These methods have been used successfully to assess the accuracy of the classification model of the water pipe network. The purpose of the classification model was to classify the specific condition of the water pipe network. It is important to maintain the pipeline according to the classification results, which are asset unserviceable (AU), near perfect condition (NPC) and serious deterioration (SD). Finally, this research focused on pipe classification, which will play a significant role in maintaining water supply networks in the future.
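A ROC curve of the kind used to assess such a classifier can be built from scores and true labels as sketched below. This is a generic illustration, not the paper's evaluation code: it assumes a binary view (for instance one condition class against the rest), labels encoded as 0 and 1 with both present, and made-up scores.

```python
def roc_points(scores, labels):
    """(false positive rate, true positive rate) pairs, threshold swept from high to low."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda sl: sl[0], reverse=True)
    points, tp, fp, i = [(0.0, 0.0)], 0, 0, 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


scores = [0.9, 0.8, 0.7, 0.6]
print(auc(roc_points(scores, [1, 1, 0, 0])))
print(auc(roc_points(scores, [1, 0, 1, 0])))
```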

Relevance: 30.00%

Abstract:

We are investigating the combination of wavelets and decision trees to detect ships and other maritime surveillance targets from medium resolution SAR images. Wavelets have inherent advantages to extract image descriptors while decision trees are able to handle different data sources. In addition, our work aims to consider oceanic features such as ship wakes and ocean spills. In this incipient work, Haar and Cohen-Daubechies-Feauveau 9/7 wavelets obtain detailed descriptors from targets and ocean features and are inserted with other statistical parameters and wavelets into an oblique decision tree. © 2011 Springer-Verlag.

Relevance: 30.00%

Abstract:

The identification of tree species is a key step in sustainable management plans for forest resources, as well as in several other applications that are based on such surveys. However, the currently available techniques depend on the presence of tree structures such as flowers, fruits, and leaves, which limits the identification process to certain periods of the year. Therefore, this article introduces a study on the application of statistical parameters to the texture classification of tree trunk images. For that purpose, 540 samples from five Brazilian native deciduous species were acquired, and measures of entropy, uniformity, smoothness, asymmetry (third moment), mean, and standard deviation were obtained from the textures. Using a decision tree, a biometric species identification system was constructed, resulting in a 0.84 average precision rate for species classification with 0.83 accuracy and 0.79 agreement. Thus, the use of texture in trunk images can represent an important advance in tree identification, since the limitations of the current techniques can be overcome.
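The six measures listed above are all first-order statistics of the gray level histogram. The sketch below follows common textbook definitions; the article's exact formulas are not given in the abstract, so the normalization of the variance by (levels - 1)^2 in the smoothness measure and the base-2 logarithm in the entropy are assumptions.

```python
import math
from collections import Counter


def texture_statistics(pixels, levels=256):
    """First-order texture measures from a flat list of gray levels."""
    n = len(pixels)
    p = {z: c / n for z, c in Counter(pixels).items()}  # normalized histogram
    mean = sum(z * pz for z, pz in p.items())
    variance = sum((z - mean) ** 2 * pz for z, pz in p.items())
    scale = (levels - 1) ** 2  # maps the variance into [0, 1]
    return {
        "mean": mean,
        "std": math.sqrt(variance),
        "smoothness": 1 - 1 / (1 + variance / scale),
        "third_moment": sum((z - mean) ** 3 * pz for z, pz in p.items()),
        "uniformity": sum(pz ** 2 for pz in p.values()),
        "entropy": -sum(pz * math.log2(pz) for pz in p.values()),
    }


stats = texture_statistics([0, 0, 255, 255])
print(stats)
```

A feature vector like this, computed per trunk image, is the kind of input a decision tree classifier could be trained on.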

Relevance: 30.00%

Abstract:

A common interest in gene expression data analysis is to identify, from a large pool of candidate genes, the genes that present significant changes in expression levels between a treatment and a control biological condition. Usually, this is done using a statistic and a cutoff value that together separate differentially expressed genes from non-differentially expressed ones. In this paper, we propose a Bayesian approach to identifying differentially expressed genes by sequentially calculating credibility intervals from predictive densities. These densities are constructed using the sampled mean treatment effect from all genes in the study, excluding the treatment effect of genes previously identified with statistical evidence of a difference. We compare our Bayesian approach with the standard ones based on the t-test and modified t-tests via a simulation study, using the small sample sizes that are common in gene expression data analysis. The results provide evidence that the proposed approach performs better than the standard ones, especially for cases with mean differences and increases in treatment variance relative to control variance. We also apply the methodologies to a well-known publicly available data set on the bacterium Escherichia coli.

Relevance: 30.00%

Abstract:

Many developing countries are facing a crisis in water management due to population growth, water scarcity, water contamination, and the effects of the world economic crisis. Water distribution systems in developing countries face many challenges in efficient repair and rehabilitation because information on the water network is very limited, which makes rehabilitation assessment planning very difficult. In developed countries, sufficient information and advanced technology make rehabilitation assessment easy. Developing countries have great difficulty assessing the water network, which leads to system failure, deterioration of mains, and poor water quality in the network due to pipe corrosion and deterioration. The limited information has brought into focus the urgent need to develop an economical rehabilitation assessment for water distribution systems that is adapted to water utilities. The Gaza Strip is the first case study; it suffers from a severe shortage in water supply, environmental problems, and contamination of underground water resources. This research focuses on improving the water supply network to reduce water losses, based on a limited database, using ArcGIS and commercial water network software (WaterCAD). A new approach to the rehabilitation of water pipes is presented in the Gaza city case study. An integrated rehabilitation assessment model has been developed for water pipes, comprising three components: a hydraulic assessment model, a physical assessment model, and a structural assessment model. A WaterCAD model integrated with ArcGIS was developed to produce the hydraulic assessment model for the water network. The model was designed around pipe condition assessment, with a maximum score of 100 points for pipe condition. The results of this model indicate that 40% of the water pipelines score below 50 points and about 10% of the total pipe length scores below 30 points.
Using this model, rehabilitation plans for each region in Gaza city can be drawn up based on the available budget and the condition of the pipes. The second case study is Kuala Lumpur, representing semi-developed countries, which was used to develop an approach to improving the water network under critical conditions using advanced statistical and GIS techniques. Kuala Lumpur (KL) has water losses of about 40% and a high failure rate, which is a severe problem. This case can represent cases in South Asian countries. Kuala Lumpur has faced big challenges in reducing water losses in its network during the last 5 years. One of these challenges is the high deterioration of asbestos cement (AC) pipes. More than 6500 km of AC pipes need to be replaced, which requires a huge budget. Asbestos cement is subject to deterioration due to various chemical processes that either leach out the cement material or penetrate the concrete to form products that weaken the cement matrix. This case presents a geo-statistical approach to modelling pipe failures in a water distribution network. The database of Syabas Company (the Kuala Lumpur water company) was used in developing the model. The statistical models have been calibrated, verified, and used to predict failures for both networks and individual pipes. The mathematical formulation developed for failure frequency in Kuala Lumpur was based on different pipeline characteristics, reflecting several factors such as pipe diameter, length, pressure, and failure history. Generalized linear models have been applied to predict pipe failures at the District Meter Zone (DMZ) and individual pipe levels. The Kuala Lumpur case study has yielded several outputs and implications. Correlations between spatial and temporal intervals of pipe failures have also been analyzed using ArcGIS software.
A Water Pipe Assessment Model (WPAM) has been developed from the analysis of historical pipe failures in Kuala Lumpur; it prioritizes pipe rehabilitation candidates based on a ranking system. The Frankfurt water network in Germany is the third main case study. This case gives an overview of the survival analysis and neural network methods used for water networks. Rehabilitation strategies for water pipes have been developed for the Frankfurt water network in cooperation with Mainova (the Frankfurt water company). This thesis also presents a methodology for the technical condition assessment of plastic pipes based on simple analysis. This thesis aims to contribute to improving the prediction of pipe failures in water networks using a Geographic Information System (GIS) and a Decision Support System (DSS). The output of the technical condition assessment model can be used to estimate future budget needs for rehabilitation and to identify pipes in poor condition that have high priority for replacement.