8 results for DATA QUALITY
in Chinese Academy of Sciences Institutional Repositories Grid Portal
Abstract:
There is increasing evidence that many of the mitochondrial DNA (mtDNA) databases published in the fields of forensic science and molecular anthropology are flawed. An a posteriori phylogenetic analysis of the sequences could help to eliminate most of the errors and thus greatly improve data quality. However, previously published caveats and recommendations along these lines were not yet picked up by all researchers. Here we call for stringent quality control of mtDNA data by haplogroup-directed database comparisons. We take some problematic databases of East Asian mtDNAs, published in the Journal of Forensic Sciences and Forensic Science International, as examples to demonstrate the process of pinpointing obvious errors. Our results show that data sets are not only notoriously plagued by base shifts and artificial recombination but also by lab-specific phantom mutations, especially in the second hypervariable region (HVR-II). (C) 2003 Elsevier Ireland Ltd. All rights reserved.
Abstract:
Mitochondrial disease is currently receiving increasing attention. However, the case-control design commonly adopted in this field is vulnerable to genetic background, population stratification and poor data quality. Although the phylogenetic analysis could
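Abstract:
ETL is the process of extracting data from distributed data sources (databases, application systems, file systems and so on), transforming, integrating and transferring it, and finally loading it into a target system. Traditional ETL mainly serves the data warehouse as part of an enterprise decision support system. With the development of data integration technology and the emergence of lightweight data integration middleware, ETL is now widely used in enterprise data integration and data exchange systems. Data quality control is an essential basic component and function of ETL: it inspects, transforms and cleans the data being integrated to keep "dirty" data out of the target system. Without effective data quality control in the ETL process, a data integration project may fall short of its goals or fail outright. This thesis focuses on designing and implementing a data quality control system for the ETL process to address these problems. It classifies the data quality problems that can arise at each ETL stage, models the quality control requirements, and proposes an ETL-oriented data quality control framework. The framework uses analysis of source data to guide ETL design, filters, transforms and cleans data through a flexible, configurable and extensible data processing mechanism, and supports monitoring of the entire data quality process. On this basis the thesis examines four aspects in particular: the flexible data processing mechanism, data analysis, data filtering and data cleaning. For data processing, it proposes a mechanism based on a plug-in metamodel that meets users' customization needs for filtering, transformation and cleaning and is highly extensible. For data analysis, it computes statistics by category according to field type and, for statistical analysis of large data volumes, proposes automatically configurable statistics strategies. For data filtering, it rewrites the SQL statements that extract the data so that tuples violating integrity constraints are filtered out. For data cleaning, it presents a method that uses statistical information to determine attribute similarity weights dynamically, improving the domain-independent variant of the field-based similar-record detection algorithm and raising detection accuracy. Building on this work, a data quality control system was designed and implemented in the data integration middleware OnceDI, with design patterns applied to strengthen its extensibility.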
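The data-cleaning idea in the abstract above, field-based similar-record detection with attribute weights derived from statistics, can be illustrated with a minimal sketch. The abstract does not say which statistics are used, so the weighting rule here (each field weighted by its share of distinct values) is purely an assumption, as are the function names `field_weights`, `record_similarity` and `find_similar_pairs` and the 0.85 threshold. It is not the OnceDI implementation.

```python
from difflib import SequenceMatcher


def field_weights(records, fields):
    """Assumed heuristic: weight each field by its ratio of distinct values,
    so that more discriminating fields count more. Weights sum to 1."""
    ratios = {}
    for f in fields:
        values = [r[f] for r in records]
        ratios[f] = len(set(values)) / len(values)
    total = sum(ratios.values())
    return {f: ratios[f] / total for f in fields}


def record_similarity(a, b, weights):
    """Weighted sum of per-field string similarities, each in [0, 1]."""
    return sum(
        w * SequenceMatcher(None, str(a[f]), str(b[f])).ratio()
        for f, w in weights.items()
    )


def find_similar_pairs(records, fields, threshold=0.85):
    """Return index pairs whose weighted similarity reaches the threshold.
    Compares all pairs, which is only suitable for small inputs."""
    weights = field_weights(records, fields)
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if record_similarity(records[i], records[j], weights) >= threshold:
                pairs.append((i, j))
    return pairs
```

Calling `find_similar_pairs(records, ["name", "city"])` on a list of dictionaries returns candidate duplicate pairs. A production system would normally sort or block the records first rather than compare every pair.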
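Abstract:
Research on sloping farmland is central to soil erosion control and soil and water conservation planning, and the development of GIS and its successful application in many fields offers new technical means for this research. Overlaying vector base maps to obtain unit polygons is a basic GIS function, but the uncertainty of the overlaid polygons is a disastrous factor for GIS data quality. Using MapGIS as the base platform and a DOM as the control base map for updating information, this study obtains unit polygons by vector-raster overlay and thematic attributes by operating on attribute tables in a rapid survey of sloping farmland in small watersheds, demonstrating the marked advantages of vector-raster overlay in information compositing.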
Abstract:
Through extensive study and design, this paper lays out a technical plan for establishing the exploration database center that combines imported and self-developed techniques. Through research and repeated experiments, a modern database center has been set up with high-performance hardware and network, a well-configured system, complete data storage and management, and fast, direct data support. Based on a study of decision theory, methods and models, an exploration decision support scheme is designed, and one of its components, the well location decision support system, has been evaluated and put into operation. 1. Study on the establishment of the Shengli exploration database center. Research covers the hardware configuration of the database center, including its workstations and all connected hardware and systems. The hardware is formed by connecting workstations, microcomputer workstations, disk arrays, and the equipment used for seismic processing and interpretation. Research on data storage and management includes analysis of the contents to be managed, data flow, data standards, data QC, data backup and restore policy, and optimization of the database system. A sound data management regulation and workflow were drawn up and a scientific exploration data management system was created. Data loading followed a schedule worked out in advance; in the end more than 200 seismic survey projects were loaded, amounting to 25 TB. 2. Exploration work support system and its application. Support for the seismic data processing system has the following features: automatic extraction of seismic attributes, GIS navigation, data ordering, extraction of data cubes of any size, a pseudo huge-capacity disk array, a standard output exchange format, and so on. Prestack data can be accessed by the processing system directly or transferred to other processing systems through the standard exchange format. 
Support for the seismic interpretation system includes features such as automatic scanning and storage of interpretation results and internal data quality control. The interpretation system is connected directly to the database center to get real-time support for seismic data, formation data and well data. Comprehensive geological study is supported over the intranet, with the ability to query or graphically display data on the navigation system under given geological constraints. The production management support system is mainly used to collect, analyze and display production data; its core technologies are controlled data collection and the creation of multiple standard forms. 3. Exploration decision support system design. By classifying the workflow and data flow of all exploration stages and studying decision theory and methods, the target of each decision step, and the decision models and requirements, three conceptual models have been formed for the Shengli exploration decision support system: the exploration distribution support system, the well location support system and the production management support system. The well location decision support system has passed evaluation and been put into operation. 4. Technical advances. High-performance hardware and software are matched for the database center: a parallel computer system, database server, huge-capacity ATL, disk array, network and firewall are combined to create the first exploration database center in China that is reasonably configured, performs well and can manage the whole set of exploration data. A technology for managing huge volumes of exploration data has been formed, in which exploration data standards and management regulations guarantee data quality, safety and security. A multifunction query and support system provides comprehensive exploration information support, covering geological study, seismic processing and interpretation, and production management. 
In this system many new database and computer technologies have been used to provide real-time information support for exploration work. The last advance is the design of the Shengli exploration decision support system. 5. Application and benefit. Data storage has reached 25 TB, and thousands of users in the Shengli oil field access the data, improving work efficiency several times over. The technology has also been applied by many other units of SINOPEC. In the project named Exploration achievements and Evaluation of Favorable Targets in Hekou Area, data supplied by the database center shortened the data preparation period from 30 days to 2 days and enriched data abundance by 15 percent, with full information support from the database center. In the project named Pre-stack depth migration in Guxi fracture zone, supplying previously processed results reduced repeated processing, shortened the work period by one month, improved processing precision and quality, and saved 30 million yuan of data processing investment. In the project named Geological and seismic study of southern slope zone of Dongying Sag, automatically providing a project database shortened data preparation time, giving researchers more time for research and thereby improving interpretation precision and quality.
Abstract:
The exploration and development of natural gas in the north of the Ordos Basin is an important part of China’s energy strategy. Reservoirs in the upper Palaeozoic group are lithological traps, and predicting them is the crux of a series of tasks. Building on earlier seismic reservoir prediction, the seismic data are re-processed with some optical methods and pre-stack information is used in the corresponding inversions. Through the application of diverse methods, a series of techniques for reservoir prediction has been developed. The main results are as follows: 1. A set of log processing and interpretation methods is developed. Porosity, permeability and gas saturation models are rebuilt. 2. Based on petrophysical analysis of the reservoirs in the upper Palaeozoic group, equations relating lithology, reservoir properties, hydrocarbons and elastic parameters are established. 3. Forward modeling based on elastic wave theory is applied in the study area for the first time and increases the resolution of the modeling results. 4. A series of techniques such as pre-stack time migration are combined to improve the data quality. 5. Pre-stack seismic inversion is employed in the north of the Ordos Basin for the first time and yields EI, P-impedance, S-impedance and other elastic parameters. 6. In post-stack inversion, logs indicating reservoir parameters are rebuilt, which boosts the resolution of lithology inversion. 7. Amplitude, coherence, frequency-decomposed amplitude, waveform and other sensitive attributes are extracted to describe sand distribution. Seismic modes representing the sands of P1x3 and P1x2 are established. 8. Of 9 proposed wells, 8 encountered sands and became production wells. The output of DK13 amounts to 510,000 m3 per day. Keywords: the north of Ordos Basin, reservoir prediction, pre-stack inversion, post-stack inversion, seismic attributes.
Abstract:
The Maichen Depression lies between the Leizhou Peninsula and the Qiongzhou Strait. Oil and gas have been discovered in the Weixinan, Wushi and Fushan Depressions, which belong to the same basin as the Maichen Depression, the North Sea Basin. Jiangsu Oil started exploration in 2002. Drilling of the first well began in November 2004, after the gravity survey, electrical prospecting and 2D seismic exploration had been finished. The drilling verified source rock and hydrocarbon shows, and a low-yield oil flow was tested. 3D seismic exploration started in November 2005. My thesis topic came from the actual needs of our exploration in the Maichen Depression. In the thesis I focus on analysing the particular seismic and geologic conditions of the Maichen Depression. Through field tests we chose methods to overcome or weaken the unfavourable effects of these conditions, and finally obtained usable seismic data. 1. Owing to multiphase eruptive rocks from the Quaternary Period, the near-surface layers are very inhomogeneous. By testing short refraction, radial-source uphole surveys and surface-source uphole surveys simultaneously at the same points, the most appropriate method was identified: the radial-source uphole survey is regarded as the most practicable means in this complex area. Accurate knowledge of the surficial geology was very helpful in choosing acquisition methods and parameters. An appropriate seismic acquisition method has essentially been established for the Maichen area. 2. The raw seismic data contain many kinds of very strong, complex noise. After analysing noise characteristics in different domains, many denoising methods were compared, both individually and in combination. As a result, a pre-stack multidomain joint denoising flow proved to be the appropriate method; it improves the seismic signal-to-noise ratio. 3. The static correction problem in the Maichen Depression is very conspicuous. 
Many static correction methods were tested individually and in combination. Seismic data quality improved after the appropriate combination of static correction flows was chosen. 4. Although the above processing is effective, the quality of the seismic profiles is only passable. The continuity of some reflectors and the imaging of fault zones remain ambiguous. To reduce structural ambiguity during seismic interpretation, it proved useful to build a geologic model consistent with the real geologic character through an areal structure study based on backbone seismic profiles. In the same way, traps were assessed and drilling targets selected.
Abstract:
The Junggar Basin has a large amount of recoverable reserves. However, owing to unfavorable factors such as poor seismic data quality, complex structure with many faults, and few wells, oil and gas exploration is still relatively limited, so advanced theoretical guidance and effective technical support are needed. Based on sedimentological theory and comprehensive study of outcrops, seismic data, drilling data and the geological setting of the area, the paper establishes an isochronous correlation framework, analyzes the sedimentary facies types and provenance direction, and obtains profile and plan maps of the sedimentary facies in combination with logging-constrained inversion. The paper then analyzes the reservoir controlling factors, reservoir lithology, four-property relationship and sensitivity on the basis of the sedimentary facies research, and sets up a 3D geological model using facies-controlled modeling. Finally, the paper selects favorable target areas based on the conclusions on reservoir, structure and reservoir formation. First, the paper establishes the isochronous correlation framework from the seismic data, drilling data and geological setting of the area. According to analysis of lithology, logging facies and the sedimentary structures of outcrops, the sedimentary facies in the Tai13 well block are braided river and meandering river. The concept of “wetland” is put forward for the first time. The provenance direction of the Badaowan and Qigu formations is obtained from the geological setting, sedimentary setting and paleocurrent direction. The paper obtains the profile and plan maps of the sedimentary facies from the sand values of the wells and the sand thickness maps from the logging-constrained inversion. 
Then the paper analyzes the characteristics and controlling factors of the Jurassic reservoirs through thin-section observation and scanning transmission electron microscope observation, and determines the petrological characteristics, pore-space types and lithofacies division of the reservoir. In this area primary pores dominate the reservoir pore space, which suggests that sedimentation played the most important role in reservoir quality, while diagenesis is a minor factor influencing secondary porosity. Using stochastic modeling, the paper builds a quantitative 3D model of reservoir parameters. Finally, combining the study of structure and reservoir formation, the regularity of reservoir distribution is summarized: (a) structures control reservoir formation and accumulation; (b) reservoirs are located in favorable sedimentary facies belts. Areas that meet these conditions are good targets for exploration.