868 resultados para Data mining, alberi decisionali, incertezza, classificazione
Resumo:
对于一个企业来说,质量是产品和服务的生命。质量受企业生产经营管理活动中多种因素的影响,是企业各项工作的综合反映。目前企业产品质量指标的检测大多是在产品生产出来后才进行的,检测需要成本,有时还需要进行破坏性试验,如测量产品的抗拉强度,要做拉断检验。这样滞后的质量数据对生产过程的实时质量控制帮助不大,而且当发现产品质量不合格时,损失已无法挽回,这样极大地影响了企业的生产质量和效益。贯彻预防原则是现代质量管理的核心与精髓,要保证和提高产品质量,必须把影响质量的各个指标全面系统地管理起来。因而,如何将这些生产过程参数与产品质量特性关联起来成为企业生产故障预测及诊断的瓶颈问题。 本文首先从功能结构组织,数据库逻辑结构设计,类关系等多个方面描述了制造执行系统(MES)平台统计过程控制(SPC)子系统开发的相关工作。并针对所开发的SPC子系统在异常原因识别方面的不足,将数据挖掘与统计过程控制(SPC)技术结合起来,对变速箱总装线质量信息进行统计分析和深度挖掘,提出一种生产过程在线质量控制和诊断模型。该模型运用SPC对生产异常状态进行监测,并基于数据挖掘技术对大量过程检测数据进行分析,找到最有可能出现问题的工序和加工设备,将控制图异常状态与生产过程参数关联起来,实现异常状态的实时检测与诊断。研究表明,数据挖掘的理论和方法适合于质量控制领域,可以为产品质量控制提供一种新的途径。
Resumo:
On the issue of geological hazard evaluation(GHE), taking remote sensing and GIS systems as experimental environment, assisting with some programming development, this thesis combines multi-knowledges of geo-hazard mechanism, statistic learning, remote sensing (RS), high-spectral recognition, spatial analysis, digital photogrammetry as well as mineralogy, and selects geo-hazard samples from Hong Kong and Three Parallel River region as experimental data, to study two kinds of core questions of GHE, geo-hazard information acquiring and evaluation model. In the aspect of landslide information acquiring by RS, three detailed topics are presented, image enhance for visual interpretation, automatic recognition of landslide as well as quantitative mineral mapping. As to the evaluation model, the latest and powerful data mining method, support vector machine (SVM), is introduced to GHE field, and a serious of comparing experiments are carried out to verify its feasibility and efficiency. Furthermore, this paper proposes a method to forecast the distribution of landslides if rainfall in future is known baseing on historical rainfall and corresponding landslide susceptibility map. The details are as following: (a) Remote sensing image enhancing methods for geo-hazard visual interpretation. The effect of visual interpretation is determined by RS data and image enhancing method, for which the most effective and regular technique is image merge between high-spatial image and multi-spectral image, but there are few researches concerning the merging methods of geo-hazard recognition. By the comparing experimental of six mainstream merging methods and combination of different remote sensing data source, this thesis presents merits of each method ,and qualitatively analyzes the effect of spatial resolution, spectral resolution and time phase on merging image. (b) Automatic recognition of shallow landslide by RS image. The inventory of landslide is the base of landslide forecast and landslide study. If persistent collecting of landslide events, updating the geo-hazard inventory in time, and promoting prediction model incessantly, the accuracy of forecast would be boosted step by step. RS technique is a feasible method to obtain landslide information, which is determined by the feature of geo-hazard distribution. An automatic hierarchical approach is proposed to identify shallow landslides in vegetable region by the combination of multi-spectral RS imagery and DEM derivatives, and the experiment is also drilled to inspect its efficiency. (c) Hazard-causing factors obtaining. Accurate environmental factors are the key to analyze and predict the risk of regional geological hazard. As to predict huge debris flow, the main challenge is still to determine the startup material and its volume in debris flow source region. Exerting the merits of various RS technique, this thesis presents the methods to obtain two important hazard-causing factors, DEM and alteration mineral, and through spatial analysis, finds the relationship between hydrothermal clay alteration minerals and geo-hazards in the arid-hot valleys of Three Parallel Rivers region. (d) Applying support vector machine (SVM) to landslide susceptibility mapping. Introduce the latest and powerful statistical learning theory, SVM, to RGHE. SVM that proved an efficient statistic learning method can deal with two-class and one-class samples, with feature avoiding produce ‘pseudo’ samples. 55 years historical samples in a natural terrain of Hong Kong are used to assess this method, whose susceptibility maps obtained by one-class SVM and two-class SVM are compared to that obtained by logistic regression method. It can conclude that two-class SVM possesses better prediction efficiency than logistic regression and one-class SVM. However, one-class SVM, only requires failed cases, has an advantage over the other two methods as only "failed" case information is usually available in landslide susceptibility mapping. (e) Predicting the distribution of rainfall-induced landslides by time-series analysis. Rainfall is the most dominating factor to bring in landslides. More than 90% losing and casualty by landslides is introduced by rainfall, so predicting landslide sites under certain rainfall is an important geological evaluating issue. With full considering the contribution of stable factors (landslide susceptibility map) and dynamic factors (rainfall), the time-series linear regression analysis between rainfall and landslide risk mapis presented, and experiments based on true samples prove that this method is perfect in natural region of Hong Kong. The following 4 practicable or original findings are obtained: 1) The RS ways to enhance geo-hazards image, automatic recognize shallow landslides, obtain DEM and mineral are studied, and the detailed operating steps are given through examples. The conclusion is practical strongly. 2) The explorative researching about relationship between geo-hazards and alteration mineral in arid-hot valley of Jinshajiang river is presented. Based on standard USGS mineral spectrum, the distribution of hydrothermal alteration mineral is mapped by SAM method. Through statistic analysis between debris flows and hazard-causing factors, the strong correlation between debris flows and clay minerals is found and validated. 3) Applying SVM theory (especially one-class SVM theory) to the landslide susceptibility mapping and system evaluation for its performance is also carried out, which proves that advantages of SVM in this field. 4) Establishing time-serial prediction method for rainfall induced landslide distribution. In a natural study area, the distribution of landslides induced by a storm is predicted successfully under a real maximum 24h rainfall based on the regression between 4 historical storms and corresponding landslides.
Resumo:
Population research is a front area concerned by domestic and overseas, especially its researches on its spatial visualization and its geo-visualization system design, which provides a sound base for understanding and analysis of the regional difference in population distribution and its spatial rules. With the development of GIS, the theory of geo-visualization more and more plays an important role in many research fields, especially in population information visualization, and has been made the big achievements recently. Nevertheless, the current research is less attention paid to the system design for statistical-geo visualization for population information. This paper tries to explore the design theories and methodologies for statistical-geo-visualization system for population information. The researches are mainly focused on the framework, the methodologies and techniques for the system design and construction. The purpose of the research is developed a platform for population atlas by the integration of the former owned copy software of the research group in statistical mapping system. As a modern tool, the system will provide a spatial visual environment for user to analyze the characteristics of population distribution and differentiate the interrelations of the population components. Firstly, the paper discusses the essentiality of geo-visualization for population information and brings forward the key issue in statistical-geo visualization system design based on the analysis of inland and international trends. Secondly, the geo-visualization system for population design, including its structure, functionality, module, user interface design, is studied based on the concepts of theory and technology of geo-visualization. The system design is proposed and further divided into three parts: support layer, technical layer, user layer. The support layer is a basic operation module and main part of the system. The technical layer is a core part of the system, supported by database and function modules. The database module mainly include the integrated population database (comprises spatial data, attribute data and geographical features information), the cartographic symbol library, the color library, the statistical analysis model. The function module of the system consists of thematic map maker component, statistical graph maker component, database management component and statistical analysis component. The user layer is an integrated platform, which provides the functions to design and implement a visual interface for user to query, analysis and management the statistic data and the electronic map. Based on the above, China's E-atlas for population was designed and developed by the integration of the national fifth census data with 1:400 million scaled spatial data. The atlas illustrates the actual development level of the population nowadays in China by about 200 thematic maps relating with 10 map categories(environment, population distribution, sex and age, immigration, nation, family and marriage, birth, education, employment, house). As a scientific reference tool, China's E-atlas for population has already received the high evaluation after published in early 2005. Finally, the paper makes the deep analysis of the sex ratio in China, to show how to use the functions of the system to analyze the specific population problem and how to make the data mining. The analysis results showed that: 1. The sex ratio has been increased in many regions after fourth census in 1990 except the cities in the east region, and the high sex ratio is highly located in hilly and low mountain areas where with the high illiteracy rate and the high poor rate; 2. The statistical-geo visualization system is a powerful tool to handle population information, which can be used to reflect the regional differences and the regional variations of population in China and indicate the interrelations of the population with other environment factors. Although the author tries to bring up a integrate design frame of the statistical-geo visualization system, there are still many problems needed to be resolved with the development of geo-visualization studies.
Resumo:
Nonlinear multivariate statistical techniques on fast computers offer the potential to capture more of the dynamics of the high dimensional, noisy systems underlying financial markets than traditional models, while making fewer restrictive assumptions. This thesis presents a collection of practical techniques to address important estimation and confidence issues for Radial Basis Function networks arising from such a data driven approach, including efficient methods for parameter estimation and pruning, a pointwise prediction error estimator, and a methodology for controlling the "data mining'' problem. Novel applications in the finance area are described, including customized, adaptive option pricing and stock price prediction.
Resumo:
O Sistema de Indução C4.5. Requerimentos-chave para a utilização do software. Um exemplo ilustrativo. Algumas dicas de uso.
Resumo:
Clare, A. and King R.D. (2003) Predicting gene function in Saccharomyces cerevisiae. 2nd European Conference on Computational Biology (ECCB '03). (published as a journal supplement in Bioinformatics 19: ii42-ii49)
Resumo:
King, R. D. and Ouali, M. (2004) Poly-transformation. In proceedings of 5th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 2004). Springer LNCS 3177 p99-107
Resumo:
Urquhart, C. (editor for JUSTEIS team), Spink, S., Thomas, R., Yeoman, A., Durbin, J., Turner, J., Armstrong, A., Lonsdale, R. & Fenton, R. (2003). JUSTEIS (JISC Usage Surveys: Trends in Electronic Information Services) Strand A: survey of end users of all electronic information services (HE and FE), with Action research report. Final report 2002/2003 Cycle Four. Aberystwyth: Department of Information Studies, University of Wales Aberystwyth with Information Automation Ltd (CIQM). Sponsorship: JISC
Resumo:
Q. Shen and R. Jensen, 'Rough sets, their extensions and applications,' International Journal of Automation and Computing (IJAC), vol. 4, no. 3, pp. 217-218, 2007.
Resumo:
R. Jensen, 'Performing Feature Selection with ACO. Swarm Intelligence and Data Mining,' A. Abraham, C. Grosan and V. Ramos (eds.), Studies in Computational Intelligence, vol. 34, pp. 45-73. 2006.
Resumo:
R. Jensen, Q. Shen and A. Tuson, 'Finding Rough Set Reducts with SAT,' Proceedings of the 10th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing, LNAI 3641, pp. 194-203, 2005.
Resumo:
M. Galea, Q. Shen and J. Levine. Evolutionary approaches to fuzzy modelling. Knowledge Engineering Review, 19(1):27-59, 2004.
Resumo:
M. Galea and Q. Shen. Simultaneous ant colony optimisation algorithms for learning linguistic fuzzy rules. A. Abraham, C. Grosan and V. Ramos (Eds.), Swarm Intelligence in Data Mining, pages 75-99.
Resumo:
R. Jensen and Q. Shen, 'Fuzzy-Rough Feature Significance for Fuzzy Decision Trees,' in Proceedings of the 2005 UK Workshop on Computational Intelligence, pp. 89-96, 2005.
Resumo:
The problem of discovering frequent poly-regions (i.e. regions of high occurrence of a set of items or patterns of a given alphabet) in a sequence is studied, and three efficient approaches are proposed to solve it. The first one is entropy-based and applies a recursive segmentation technique that produces a set of candidate segments which may potentially lead to a poly-region. The key idea of the second approach is the use of a set of sliding windows over the sequence. Each sliding window covers a sequence segment and keeps a set of statistics that mainly include the number of occurrences of each item or pattern in that segment. Combining these statistics efficiently yields the complete set of poly-regions in the given sequence. The third approach applies a technique based on the majority vote, achieving linear running time with a minimal number of false negatives. After identifying the poly-regions, the sequence is converted to a sequence of labeled intervals (each one corresponding to a poly-region). An efficient algorithm for mining frequent arrangements of intervals is applied to the converted sequence to discover frequently occurring arrangements of poly-regions in different parts of DNA, including coding regions. The proposed algorithms are tested on various DNA sequences producing results of significant biological meaning.