876 results for classification and regression tree


Relevance:

100.00% 100.00%

Abstract:

Correlation and regression are two of the statistical procedures most widely used by optometrists. However, these tests are often misused or interpreted incorrectly, leading to erroneous conclusions from clinical experiments. This review examines the major statistical tests concerned with correlation and regression that are most likely to arise in clinical investigations in optometry. First, the use, interpretation and limitations of Pearson's product moment correlation coefficient are described. Second, the least squares method of fitting a linear regression to data and methods for testing how well a regression line fits the data are described. Third, the problems of using linear regression methods in observational studies, when there are errors in measuring the independent variable, and of predicting a new value of Y for a given X are discussed. Finally, methods for testing whether a non-linear relationship provides a better fit to the data and for comparing two or more regression lines are considered.
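
The two procedures named in this abstract can be illustrated with a minimal sketch. It is not taken from the review itself: the function names and the sample data are hypothetical, and it assumes paired measurements with non-zero variance in both variables.

```python
from math import sqrt


def _centered_sums(xs, ys):
    """Return (Sxy, Sxx, Syy, mean_x, mean_y) for paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy, mean_x, mean_y


def pearson_r(xs, ys):
    """Pearson's product moment correlation coefficient."""
    sxy, sxx, syy, _, _ = _centered_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Slope and intercept of the least squares line y = slope * x + intercept."""
    sxy, sxx, _, mean_x, mean_y = _centered_sums(xs, ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical paired measurements, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(xs, ys)
slope, intercept = least_squares_line(xs, ys)
```

Note that the least squares fit here treats X as measured without error, which is exactly the assumption the abstract flags as problematic in observational studies.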

Relevance:

100.00% 100.00%

Abstract:

Time, cost and quality achievements on large-scale construction projects are uncertain because of technological constraints, involvement of many stakeholders, long durations, large capital requirements and improper scope definitions. Projects that are exposed to such an uncertain environment can effectively be managed with the application of risk management throughout the project life cycle. Risk is by nature subjective. However, managing risk subjectively poses the danger of non-achievement of project goals. Moreover, risk analysis of the overall project also poses the danger of developing inappropriate responses. This article demonstrates a quantitative approach to construction risk management through an analytic hierarchy process (AHP) and decision tree analysis. The entire project is classified to form a few work packages. With the involvement of project stakeholders, risky work packages are identified. As all the risk factors are identified, their effects are quantified by determining probability (using AHP) and severity (guess estimate). Various alternative responses are generated, listing the cost implications of mitigating the quantified risks. The expected monetary values are derived for each alternative in a decision tree framework and subsequent probability analysis helps to make the right decision in managing risks. In this article, the entire methodology is explained by using a case application of a cross-country petroleum pipeline project in India. The case study demonstrates the project management effectiveness of using AHP and DTA.
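
The expected monetary value step mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the article's model: the alternatives, probabilities and cost figures are invented, and the AHP step that would supply the probabilities is not shown.

```python
def expected_monetary_value(outcomes):
    """Expected cost of a chance node given (probability, cost) pairs."""
    total_p = sum(p for p, _ in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * cost for p, cost in outcomes)


def total_expected_cost(alternative):
    """Cost of the response itself plus the expected cost of the residual risk."""
    return alternative["response_cost"] + expected_monetary_value(
        alternative["outcomes"]
    )


# Hypothetical response alternatives for one risky work package.
# Costs are in arbitrary monetary units; probabilities are assumed inputs.
alternatives = {
    "accept risk": {"response_cost": 0.0, "outcomes": [(0.3, 10.0), (0.7, 0.0)]},
    "mitigate": {"response_cost": 1.5, "outcomes": [(0.1, 10.0), (0.9, 0.0)]},
}

# Choose the alternative with the lowest total expected cost.
best = min(alternatives, key=lambda name: total_expected_cost(alternatives[name]))
```

With these invented figures, mitigation has the lower total expected cost because the reduction in expected loss outweighs the cost of the response.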

Relevance:

100.00% 100.00%

Abstract:

Conventional feed forward Neural Networks have used the sum-of-squares cost function for training. A new cost function is presented here with a description length interpretation based on Rissanen's Minimum Description Length principle. It is a heuristic that has a rough interpretation as the number of data points fit by the model. Not concerned with finding optimal descriptions, the cost function prefers to form minimum descriptions in a naive way for computational convenience. The cost function is called the Naive Description Length cost function. Finding minimum description models will be shown to be closely related to the identification of clusters in the data. As a consequence, the minimum of this cost function approximates the most probable mode of the data, whereas the minimum of the sum-of-squares cost function approximates the mean. The new cost function is shown to provide information about the structure of the data. This is done by inspecting the dependence of the error on the amount of regularisation. This structure provides a method of selecting regularisation parameters as an alternative or supplement to Bayesian methods. The new cost function is tested on a number of multi-valued problems such as a simple inverse kinematics problem. It is also tested on a number of classification and regression problems. The mode-seeking property of this cost function is shown to improve prediction in time series problems. Description length principles are used in a similar fashion to derive a regulariser to control network complexity.

Relevance:

100.00% 100.00%

Abstract:

Task classification is introduced as a method for the evaluation of monitoring behaviour in different task situations. On the basis of an analysis of different monitoring tasks, a task classification system comprising four task 'dimensions' is proposed. The perceptual speed and flexibility of closure categories, which are identified with signal discrimination type, comprise the principal dimension in this taxonomy, the others being sense modality, the time course of events, and source complexity. It is also proposed that decision theory provides the most complete method for the analysis of performance in monitoring tasks. Several different aspects of decision theory in relation to monitoring behaviour are described. A method is also outlined whereby both accuracy and latency measures of performance may be analysed within the same decision theory framework. Eight experiments and an organizational study are reported. The results show that a distinction can be made between the perceptual efficiency (sensitivity) of a monitor and his criterial level of response, and that in most monitoring situations, there is no decrement in efficiency over the work period, but an increase in the strictness of the response criterion. The range of tasks exhibiting either or both of these performance trends can be specified within the task classification system. In particular, it is shown that a sensitivity decrement is only obtained for 'speed' tasks with a high stimulation rate. A distinctive feature of 'speed' tasks is that target detection requires the discrimination of a change in a stimulus relative to preceding stimuli, whereas in 'closure' tasks, the information required for the discrimination of targets is presented at the same point in time. In the final study, the specification of tasks yielding sensitivity decrements is shown to be consistent with a task classification analysis of the monitoring literature.
It is also demonstrated that the signal type dimension has a major influence on the consistency of individual differences in performance in different tasks. The results provide an empirical validation for the 'speed' and 'closure' categories, and suggest that individual differences are not completely task specific but are dependent on the demands common to different tasks. Task classification is therefore shown to enable improved generalizations to be made of the factors affecting 1) performance trends over time, and 2) the consistency of performance in different tasks. A decision theory analysis of response latencies is shown to support the view that criterion shifts are obtained in some tasks, while sensitivity shifts are obtained in others. The results of a psychophysiological study also suggest that evoked potential latency measures may provide temporal correlates of criterion shifts in monitoring tasks. Among other results, the finding that the latencies of negative responses do not increase over time is taken to invalidate arousal-based theories of performance trends over a work period. An interpretation in terms of expectancy, however, provides a more reliable explanation of criterion shifts. Although the mechanisms underlying the sensitivity decrement are not completely clear, the results rule out 'unitary' theories such as observing response and coupling theory. It is suggested that an interpretation in terms of the memory data limitations on information processing provides the most parsimonious explanation of all the results in the literature relating to sensitivity decrement. Task classification therefore enables the refinement and selection of theories of monitoring behaviour in terms of their reliability in generalizing predictions to a wide range of tasks. It is thus concluded that task classification and decision theory provide a reliable basis for the assessment and analysis of monitoring behaviour in different task situations.

Relevance:

100.00% 100.00%

Abstract:

This study proposes an integrated analytical framework for effective management of project risks using a combined multiple criteria decision-making technique and decision tree analysis. First, a conceptual risk management model was developed through a thorough literature review. The model was then applied through action research on a petroleum oil refinery construction project in the central part of India in order to demonstrate its effectiveness. Oil refinery construction projects are risky because of technical complexity, resource unavailability, involvement of many stakeholders and strict environmental requirements. Although project risk management has been researched extensively, a practical and easily adoptable framework is missing. In the proposed framework, risks are identified using a cause-and-effect diagram and analysed using the analytic hierarchy process, and responses are developed using the risk map. Additionally, decision tree analysis allows modelling of various options for risk response development and optimises the selection of a risk mitigation strategy. The proposed risk management framework could be easily adopted and applied in any project and integrated with other project management knowledge areas.

Relevance:

100.00% 100.00%

Abstract:

This paper presents an approach to developing an intelligent search system and automatic document classification and cataloguing tools for a CASE system based on metadata. The described method combines the advantages of the ontology approach with those of the traditional keyword-based approach. The method has powerful intelligent means and can be integrated with existing document search systems.

Relevance:

100.00% 100.00%

Abstract:

This research establishes new optimization methods for pattern recognition and classification of different white blood cells in actual patient data to enhance the process of diagnosis. Beckman-Coulter Corporation supplied flow cytometry data of numerous patients that are used as training sets to exploit the different physiological characteristics of the different samples provided. The methods of Support Vector Machines (SVM) and Artificial Neural Networks (ANN) were used as promising pattern classification techniques to identify different white blood cell samples and provide information to medical doctors in the form of diagnostic references for a specific disease state, leukemia. The obtained results prove that when a neural network classifier is well configured and trained with cross-validation, it can perform better than support vector classifiers alone for this type of data. Furthermore, a new unsupervised learning algorithm, the Density based Adaptive Window Clustering algorithm (DAWC), was designed to process large volumes of data for finding the location of high-density data clusters in real time. It reduces the computational load to ∼O(N) computations, thus making the algorithm more attractive and faster than current hierarchical algorithms.

Relevance:

100.00% 100.00%

Abstract:

Annual precipitation for the last 2,500 years was reconstructed for northeastern Qinghai from living and archaeological juniper trees. A dominant feature of the precipitation of this area is a high degree of variability in mean rainfall at annual, decadal, and centennial scales, with many wet and dry periods that are corroborated by other paleoclimatic indicators. Reconstructed values of annual precipitation vary mostly from 100 to 300 mm and thus are no different from the modern instrumental record in Dulan. However, relatively dry years with below-average precipitation occurred more frequently in the past than in the present. Periods of relatively dry years occurred during 74-25 BC, AD 51-375, 426-500, 526-575, 626-700, 1100-1225, 1251-1325, 1451-1525, 1651-1750 and 1801-1825. Periods with a relatively wet climate occurred during AD 376-425, 576-625, 951-1050, 1351-1375, 1551-1600 and the present. This variability is probably related to latitudinal positions of winter frontal storms. Another key feature of precipitation in this area is an apparently direct relationship between interannual variability in rainfall with temperature, whereby increased warming in the future might lead to increased flooding and droughts. Such increased climatic variability might then impact human societies of the area, much as the climate has done for the past 2,500 years.

Relevance:

100.00% 100.00%

Abstract:

Research across several countries has shown that degree classification (i.e. the final grade awarded to students successfully completing university) is an important determinant of graduates’ first destination outcome. Graduates leaving university with higher degree classifications have better employment opportunities and a higher likelihood of continuing education relative to those with lower degree classifications. This article investigates whether one of the reasons for this result is that employers and higher education institutions use degree classification as a signalling device for the ability that recent graduates may possess. Given the large number of applicants and the amount of time and resources typically required to assess their skills, employers and higher education institutions may decide to rely on this measure when forming beliefs about recent graduates’ abilities. Using data on two cohorts of recent graduates from a UK university, results suggest that an Upper Second degree classification may have a signalling role.

Relevance:

100.00% 100.00%

Abstract:

Malware detection is a growing problem particularly on the Android mobile platform due to its increasing popularity and accessibility to numerous third party app markets. This has also been made worse by the increasingly sophisticated detection avoidance techniques employed by emerging malware families. This calls for more effective techniques for detection and classification of Android malware. Hence, in this paper we present an n-opcode analysis based approach that utilizes machine learning to classify and categorize Android malware. This approach enables automated feature discovery that eliminates the need for applying expert or domain knowledge to define the needed features. Our experiments on 2520 samples that were performed using up to 10-gram opcode features showed that an f-measure of 98% is achievable using this approach.
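
The feature extraction idea behind an n-opcode approach can be sketched as counting every run of n consecutive opcodes. This is a minimal illustration, not the paper's pipeline: the function name and the short opcode sequence are hypothetical, and disassembly and classifier training are not shown.

```python
from collections import Counter


def opcode_ngrams(opcodes, n):
    """Count every run of n consecutive opcodes in a sequence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


# Hypothetical opcode sequence from one disassembled method.
sample = [
    "const", "invoke-virtual", "move-result",
    "const", "invoke-virtual", "return-void",
]
bigrams = opcode_ngrams(sample, 2)
```

Each distinct n-gram becomes one feature and its count the feature value, so no expert has to define the features by hand. The number of possible n-grams grows quickly with n, which is why some form of feature selection is usually needed before training.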

Relevance:

100.00% 100.00%

Abstract:

In order to optimize frontal detection in sea surface temperature fields at 4 km resolution, a combined statistical and expert-based approach is applied to test different spatial smoothing of the data prior to the detection process. Fronts are usually detected at 1 km resolution using the histogram-based, single image edge detection (SIED) algorithm developed by Cayula and Cornillon in 1992, with a standard preliminary smoothing using a median filter and a 3 × 3 pixel kernel. Here, detections are performed in three study regions (off Morocco, the Mozambique Channel, and north-western Australia) and across the Indian Ocean basin using the combination of multiple windows (CMW) method developed by Nieto, Demarcq and McClatchie in 2012, which improves on the original Cayula and Cornillon algorithm. Detections at 4 km and 1 km resolution are compared. Fronts are divided into two intensity classes (“weak” and “strong”) according to their thermal gradient. A preliminary smoothing is applied prior to the detection using different convolutions: three types of filters (median, average and Gaussian) combined with four kernel sizes (3 × 3, 5 × 5, 7 × 7, and 9 × 9 pixels) and three detection window sizes (16 × 16, 24 × 24 and 32 × 32 pixels) to test the effect of these smoothing combinations on reducing the background noise of the data and therefore on improving the frontal detection. The performance of the combinations on 4 km data is evaluated using two criteria: detection efficiency and front length. We find that the optimal combination of preliminary smoothing parameters in enhancing detection efficiency and preserving front length includes a median filter, a 16 × 16 pixel window size, and a 5 × 5 pixel kernel for strong fronts and a 7 × 7 pixel kernel for weak fronts. Results show an improvement in detection performance (from largest to smallest window size) of 71% for strong fronts and 120% for weak fronts.
Despite the small window used (16 × 16 pixels), the length of the fronts has been preserved relative to that found with 1 km data. This optimal preliminary smoothing and the CMW detection algorithm on 4 km sea surface temperature data are then used to describe the spatial distribution of the monthly frequencies of occurrence for both strong and weak fronts across the Indian Ocean basin. In general, strong fronts are observed in coastal areas whereas weak fronts, with some seasonal exceptions, are mainly located in the open ocean. This study shows that adequate noise reduction by a preliminary smoothing of the data considerably improves the frontal detection efficiency as well as the global quality of the results. Consequently, the use of 4 km data enables frontal detections similar to 1 km data (using a standard median 3 × 3 convolution) in terms of detectability, length and location. This method, using 4 km data, is easily applicable to large regions or at the global scale with far fewer constraints of data manipulation and processing time relative to 1 km data.
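
The preliminary smoothing step described in this abstract can be illustrated with a minimal median filter sketch. It is not the authors' implementation: the function name and the tiny temperature grid are hypothetical, and truncating the kernel at the image border is an assumption made here for simplicity.

```python
from statistics import median


def median_filter(grid, kernel=3):
    """Smooth a 2D grid with a square median kernel of odd size.

    Near the borders the kernel is truncated to the pixels that exist
    (an assumption of this sketch, not a detail from the abstract).
    """
    if kernel < 1 or kernel % 2 == 0:
        raise ValueError("kernel size must be a positive odd number")
    half = kernel // 2
    rows, cols = len(grid), len(grid[0])
    smoothed = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            row.append(median(window))
        smoothed.append(row)
    return smoothed


# Hypothetical 3 x 3 sea surface temperature patch (degrees Celsius)
# with one noisy pixel in the centre.
sst = [
    [20.0, 20.0, 20.0],
    [20.0, 35.0, 20.0],
    [20.0, 20.0, 20.0],
]
smoothed = median_filter(sst, kernel=3)
```

A median kernel removes isolated outliers like the centre pixel while leaving step edges largely intact, which is why it is a common choice before edge detection. Larger kernels remove more noise but can also erase short or weak fronts.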

Relevance:

100.00% 100.00%

Abstract:

Endogenous and environmental variables are fundamental in explaining variations in fish condition. Based on more than 20 yr of fish weight and length data, relative condition indices were computed for anchovy and sardine caught in the Gulf of Lions. Classification and regression trees (CART) were used to identify endogenous factors affecting fish condition, and to group years of similar condition. Both species showed a similar annual cycle with condition being minimal in February and maximal in July. CART identified 3 groups of years where the fish populations generally showed poor, average and good condition and within which condition differed between age classes but not according to sex. In particular, during the period of poor condition (mostly recent years), sardines older than 1 yr appeared to be more strongly affected than younger individuals. Time-series were analyzed using generalized linear models (GLMs) to examine the effects of oceanographic abiotic (temperature, Western Mediterranean Oscillation [WeMO] and Rhone outflow) and biotic (chlorophyll a and 6 plankton classes) factors on fish condition. The selected models explained 48 and 35% of the variance of anchovy and sardine condition, respectively. Sardine condition was negatively related to temperature but positively related to the WeMO and mesozooplankton and diatom concentrations. A positive effect of mesozooplankton and Rhone runoff on anchovy condition was detected. The importance of increasing temperatures and reduced water mixing in the NW Mediterranean Sea, affecting planktonic productivity and thus fish condition by bottom-up control processes, was highlighted by these results. Changes in plankton quality, quantity and phenology could lead to insufficient or inadequate food supply for both species.
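
The core step of a regression tree, as used in CART, is choosing the split that minimizes the summed squared error of the two resulting groups. The sketch below shows that step for a single predictor. It is an illustration only: the function names, years and condition values are invented, and a full CART would apply the split recursively over several predictors and then prune the tree.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Find the threshold on xs that minimizes SSE(left) + SSE(right)."""
    pairs = sorted(zip(xs, ys))
    best_threshold, best_cost = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_threshold, best_cost = threshold, cost
    return best_threshold, best_cost


# Hypothetical relative condition index by year, for illustration only.
years = [2000, 2001, 2002, 2003, 2004, 2005]
condition = [1.05, 1.04, 1.06, 0.95, 0.94, 0.96]
threshold, cost = best_split(years, condition)
```

With these invented values the best split falls between 2002 and 2003, separating a group of years with higher condition from a group with lower condition, which is the kind of grouping of years the abstract describes.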