891 results for DISCRIMINANT-ANALYSIS
Abstract:
Sedimentation and diagenesis are the key factors controlling the formation and distribution of hydrocarbon reservoirs. For a long time, most research on sediment-diagenesis facies has focused on qualitative analysis. As oilfield exploration has advanced, qualitative analysis alone can no longer meet the increasingly complex requirements of oil and gas exploration, so quantitative analysis of sediment-diagenesis facies and the related facies modelling have become more and more important. Building on stratigraphic and sedimentary research on the Putaohua Oil Layer Group in the GuLong Area, on the basic principles of sedimentology, and on field core and mining research results, this thesis studies the sediment types, the spatial framework of the sand bodies and the evolution of diagenesis, centering on sedimentary system analysis and diagenetic deformation. It further classifies the sediment-diagenesis facies quantitatively, building on the qualitative work, discusses a new way of dividing sediment-diagenesis facies, and offers a new basis for reservoir exploration. Using statistical theory, including factor analysis, cluster analysis and discriminant analysis, the thesis divides the sediment-diagenesis facies quantitatively. This is an innovative method for studying sediment-diagenesis facies. First, factor analysis examines the main mechanisms behind the correlated variables of the geological body and identifies the factors controlling fluid and reservoir capability in the studied layer.
Second, cluster analysis on the selected main parameters bases the classification of diagenesis on the data, so the investigator's subjective judgement is removed and the results are more quantitative; this supports correlated statistical analysis and allows further study of the quantitative relations among the sediment-diagenesis facies types. Finally, discriminant analysis checks the reliability of the cluster results, and the discriminant probabilities are used to draw a chart, so that the thesis can present a chorisogram of the sediment-diagenesis facies for planar analysis, which yields more dependable results. The research shows that combining these multivariate statistical methods gives a quantitative analysis of reservoir sediment-diagenesis facies, and that the final result is more reliable and easier to apply.
Abstract:
The Ordos Basin is a large cratonic superimposed basin in the west of the North China platform and has been a hotspot of interior-basin exploration and development. The Qiaozhen oil field is located in the Ganquan region in the south-central Ordos Basin. Based on existing research data, and drawing on new theory and progress in sedimentology, sequence stratigraphy, reservoir sedimentology, petroleum geology and related fields, this paper systematically analyzes the sedimentary and reservoir characteristics of the chang2 and chang1 oil-bearing strata groups of the Yanchang formation. On the basis of stratigraphic classification and correlation, the chang2 and chang1 strata were divided into five intervals. Applying the single-factor and dominant-aspect mapping method, we drew contour maps of sand thickness and of the ratio of sand thickness to stratum thickness. We discuss the distribution of the reservoir sand bodies and the evolution of the sedimentary facies and microfacies. Combining the field type section, lithologic characteristics, sedimentary structures, single-well sedimentary facies and particle size analysis, and according to the features of the different sequences, the study area was divided into one sedimentary facies, three subfacies and ten microfacies. The author examines the characteristics of each facies, subfacies and microfacies and their spatial and temporal distribution. The petrologic characteristics of the reservoir, diagenesis types, pore types, sand-body distribution, physical properties, oil-bearing properties, reservoir heterogeneity and interlayer characteristics are studied comprehensively, leading to a synthetic classification and evaluation of the reservoir. The reservoir is classified into four types (Ⅰ, Ⅱ, Ⅲ and Ⅳ), and into pore type and fracture-porosity type.
Taking the reservoir's average thickness, porosity, permeability, oil saturation and shale content as parameters, clustering analysis and discriminant analysis classify the reservoir into three groups. Based on this evaluation, the reservoir quality, the sealing ability of the cap rock, the trap types and the reservoir-forming model are synthesized to analyze the rules governing oil and gas accumulation. Finally, favorable zones were identified for the chang23, chang223, chang222, chang221, chang212, chang12 and chang11 intervals; there are twenty-two favorable zones in the research area. A scheme for the next stage of deployment is also proposed.
Abstract:
Based on social survey data collected by a local research group in several counties of China over roughly the past five years, the author posed and solved two core problems in the field of social situation forecasting: i) how can attitude data at the individual level be integrated with social situation data at the macro level; and ii) how can the power of forecasting models built with different statistical methods be compared? Five integrative statistics were applied: 1) arithmetic mean (MEAN); 2) standard deviation (SD); 3) coefficient of variation (CV); 4) mixed second moment (M2); 5) tendency (TD). To solve the first problem, the five statistics were used to synthesize the individual-level and macro-level data on social situations at the county level and to form new integrative datasets. On that basis, the second problem was addressed: Multiple Regression Analysis (MRA), Discriminant Analysis (DA) and Support Vector Machine (SVM) were used to construct several forecasting models. Meta-analysis and power analysis were then used to compare the predictive power of each model within and across modeling methods, along the dimensions of stepwise vs. enter, short-term vs. long-term forecasting, and the different integrative (statistic) models. The dissertation concludes that: 1) there are significant differences among the integrative (statistic) models, with tendency (TD) models having the highest power and coefficient of variation (CV) models the lowest; 2) there is no significant difference in power between stepwise and enter models, or between short-term and long-term forecasting models; 3) there are significant differences among models built with different methods, with the support vector machine (SVM) having the highest statistical power.
This research lays a broad foundation for exploring optimal forecasting models of social situations in greater depth; furthermore, it is the first time that meta-analysis and power analysis have been applied to the assessment of such forecasting models.
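As an illustration of the first three integrative statistics named in the abstract above, the following minimal sketch collapses individual-level scores from one county into MEAN, SD and CV. The function name `integrate_responses`, the use of the sample (n-1) standard deviation and the example scores are assumptions made for illustration; the abstract does not define M2 or TD, so they are left out.

```python
import statistics


def integrate_responses(scores):
    """Collapse individual-level scores for one county into summary statistics.

    Hypothetical helper: the abstract does not say whether SD is the sample
    or the population standard deviation; the sample form is assumed here.
    """
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1)
    cv = sd / mean if mean != 0 else float("nan")  # coefficient of variation
    return {"MEAN": mean, "SD": sd, "CV": cv}


# Made-up attitude scores for a single county.
county_scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
summary = integrate_responses(county_scores)
print(summary)
```

Each county would contribute one such row, which could then be joined to the macro-level indicators before a forecasting model is fitted.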
Abstract:
In the early part of this century, as the seller's market gave way to a buyer's market, competition between companies shifted from product and sales competition to corporate image competition, and companies began to build corporate reputation deliberately through the fast-developing mass media. As a result, a series of methods for building corporate image emerged, such as advertising, public relations and the corporate identity system (CIS), which in turn promoted research on corporate image. The factors of corporate image have been the central issue of that research, because investigating them matters both to the development of corporate image theory and to the practice of corporate image building. As far as the literature we have gathered is concerned, the existing research on this topic either remains at the level of qualitative investigation and induction or is limited to a particular industry. Therefore, there has been no commonly accepted corporate image theory so far. In recent years, with the introduction of competition mechanisms and the establishment of the company as a subject in the market, corporate image building has developed quickly in our country, and practice urgently needs the guidance of scientific theory. Building on an analysis and summary of previous research, the present dissertation investigates the common and individual characteristics of the corporate image factors of companies in different industries in our country. A questionnaire survey is used. The sample was gathered on the basis of convenience and feasibility, with some consideration also given to the principles of stratified randomization.
The subjects are asked to select one of the companies most familiar to them, to rate the importance of each questionnaire item to the selected company (i.e. the importance assessment), and then to rate the selected company on every item (i.e. the image assessment). Discriminant analysis of corporate image across industries: the selected sample is grouped and coded according to the standard industry classification, and the discriminant analysis is done with the selected companies as the sample and the image assessment grades as the variables. The result indicates that the industry variable is an important criterion for classifying corporate image, and that companies in the same industry are more similar in corporate image. Analysis of the common and individual characteristics of corporate image across industries: first, in every industry, the items are screened according to their importance assessment grades, and exploratory factor analysis is done with the image assessment grades on the selected items as the variables. Second, the factors drawn from every industry are ordered by importance. The result indicates that the corporate images of different industries share some common characteristics, since common factors exist across industries. At the same time, the corporate image of each industry has individual characteristics: there are differences in the domain of the factors and in their order (including differences in the principal factor).
Abstract:
The standard early markers for identifying and grading HIE severity are not sufficient to ensure that all children who would benefit from treatment are identified in a timely fashion. The aim of this thesis was to explore potential early biomarkers of HIE. Methods: To achieve this, a cohort of infants with perinatal depression was prospectively recruited. All infants had cord blood samples drawn and biobanked, and were assessed with standardised neurological examination and early continuous multi-channel EEG. Cord samples from a control cohort of healthy infants were used for comparison. Biomarkers studied included multiple inflammatory proteins using multiplex assay, the metabolomics profile using LC/MS, and the miRNA profile using microarray. Results: Eighty-five infants with perinatal depression were recruited. Analysis of inflammatory proteins consisted of exploratory analysis of 37 analytes conducted in a sub-population, followed by validation of all significantly altered analytes in the remaining population. IL-6 and IL-6 differed significantly in infants with a moderate/severely abnormal vs. a normal-mildly abnormal EEG in both cohorts (Exploratory: p=0.016, p=0.005; Validation: p=0.024, p=0.039, respectively). Metabolomic analysis demonstrated a perturbation in 29 metabolites. A cross-validated Partial Least Squares Discriminant Analysis model was developed, which accurately predicted HIE with an AUC of 0.92 (95% CI: 0.84-0.97). Analysis of the miRNA profile found 70 miRNA significantly altered between moderate/severely encephalopathic infants and controls. miRNA target prediction databases identified potential targets for the altered miRNA in pathways involved in cellular metabolism, cell cycle and apoptosis, cell signaling, and the inflammatory cascade.
Conclusion: This thesis has demonstrated that the recruitment of a large cohort of asphyxiated infants, with cord blood carefully biobanked and detailed early neurophysiological and clinical assessment recorded, is feasible. Additionally, the results described provide potential alternative and novel blood-based biomarkers for the identification and assessment of HIE.
Abstract:
As more diagnostic testing options become available to physicians, it becomes more difficult to combine various types of medical information in order to optimize the overall diagnosis. To improve diagnostic performance, here we introduce an approach to optimize a decision-fusion technique to combine heterogeneous information, such as from different modalities, feature categories, or institutions. For classifier comparison we used two performance metrics: the area under the receiver operating characteristic (ROC) curve (AUC) and the normalized partial area under the curve (pAUC). This study used four classifiers: linear discriminant analysis (LDA), artificial neural network (ANN), and two variants of our decision-fusion technique, AUC-optimized (DF-A) and pAUC-optimized (DF-P) decision fusion. We applied each of these classifiers with 100-fold cross-validation to two heterogeneous breast cancer data sets: one of mass lesion features and a much more challenging one of microcalcification lesion features. For the calcification data set, DF-A outperformed the other classifiers in terms of AUC (p < 0.02) and achieved AUC=0.85 +/- 0.01. The DF-P surpassed the other classifiers in terms of pAUC (p < 0.01) and reached pAUC=0.38 +/- 0.02. For the mass data set, DF-A outperformed both the ANN and the LDA (p < 0.04) and achieved AUC=0.94 +/- 0.01. Although for this data set there were no statistically significant differences among the classifiers' pAUC values (pAUC=0.57 +/- 0.07 to 0.67 +/- 0.05, p > 0.10), the DF-P did significantly improve specificity versus the LDA at both 98% and 100% sensitivity (p < 0.04). In conclusion, decision fusion directly optimized clinically significant performance measures, such as AUC and pAUC, and sometimes outperformed two well-known machine-learning techniques when applied to two different breast cancer data sets.
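To make the AUC metric in the abstract above concrete, here is a minimal sketch that computes it through its rank interpretation: the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case, with ties counted as one half. The function name `auc` and the scores are illustrative assumptions, not the study's data or code, and the normalized pAUC is not covered because the abstract does not give its definition.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(scores_pos) * len(scores_neg))


# Made-up classifier outputs for malignant (positive) and benign (negative) cases.
malignant = [0.9, 0.8, 0.4]
benign = [0.7, 0.3, 0.2]
print(auc(malignant, benign))
```

The pairwise form is quadratic in the number of cases, which is fine for a sketch; a rank-based implementation would be preferable for large data sets.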
Abstract:
A shearing quotient (SQ) is a way of quantitatively representing the Phase I shearing edges on a molar tooth. Ordinary or phylogenetic least squares regression is fit to data on log molar length (independent variable) and log sum of measured shearing crests (dependent variable). The derived linear equation is used to generate an 'expected' shearing crest length from molar length of included individuals or taxa. Following conversion of all variables to real space, the expected value is subtracted from the observed value for each individual or taxon. The result is then divided by the expected value and multiplied by 100. SQs have long been the metric of choice for assessing dietary adaptations in fossil primates. Not all studies using SQ have used the same tooth position or crests, nor have all computed regression equations using the same approach. Here we focus on re-analyzing the data of one recent study to investigate the magnitude of effects of variation in 1) shearing crest inclusion, and 2) details of the regression setup. We assess the significance of these effects by the degree to which they improve or degrade the association between computed SQs and diet categories. Though altering regression parameters for SQ calculation has a visible effect on plots, numerous iterations of statistical analyses vary surprisingly little in the success of the resulting variables for assigning taxa to dietary preference. This is promising for the comparability of patterns (if not casewise values) in SQ between studies. We suggest that differences in apparent dietary fidelity of recent studies are attributable principally to tooth position examined.
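The SQ calculation described in the abstract above can be sketched as follows, assuming ordinary (not phylogenetic) least squares, natural logarithms, and a regression fitted to the same individuals that are then scored. The function names and any measurements passed in are hypothetical.

```python
import math


def fit_ols(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def shearing_quotients(molar_lengths, crest_sums):
    """SQ = 100 * (observed - expected) / expected, computed in real space."""
    log_len = [math.log(v) for v in molar_lengths]
    log_crest = [math.log(v) for v in crest_sums]
    slope, intercept = fit_ols(log_len, log_crest)
    quotients = []
    for length, observed in zip(molar_lengths, crest_sums):
        # Back-transform the predicted log crest length to real space.
        expected = math.exp(intercept + slope * math.log(length))
        quotients.append(100.0 * (observed - expected) / expected)
    return quotients


# Made-up measurements (mm): molar length and summed shearing crest length.
print(shearing_quotients([2.0, 4.0, 8.0, 5.0], [3.1, 5.8, 12.5, 7.0]))
```

A positive SQ means more shearing crest than expected for the molar length, a negative SQ less. Changing which individuals enter `fit_ols` changes the expected values, which is one of the regression-setup effects the abstract refers to.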
Abstract:
Automatic taxonomic categorisation of 23 species of dinoflagellates was demonstrated using field-collected specimens. These dinoflagellates have been responsible for the majority of toxic and noxious phytoplankton blooms which have occurred in the coastal waters of the European Union in recent years and have a severe impact on the aquaculture industry. The performance of human 'expert' ecologists/taxonomists in identifying these species was compared to that achieved by 2 artificial neural network classifiers (multilayer perceptron and radial basis function networks) and 2 other statistical techniques, k-Nearest Neighbour and Quadratic Discriminant Analysis. The neural network classifiers outperform the classical statistical techniques. Over extended trials, the human experts averaged 85% while the radial basis network achieved a best performance of 83%, the multilayer perceptron 66%, k-Nearest Neighbour 60%, and the Quadratic Discriminant Analysis 56%.
Abstract:
The detection of dense harmful algal blooms (HABs) by satellite remote sensing is usually based on analysis of chlorophyll-a as a proxy. However, this approach does not provide information about the potential harm of a bloom, nor can it identify the dominant species. The developed HAB risk classification method employs a fully automatic data-driven approach to identify key characteristics of water-leaving radiances and derived quantities, and to classify pixels into “harmful”, “non-harmful” and “no bloom” categories using Linear Discriminant Analysis (LDA). Discrimination accuracy is increased through the use of spectral ratios of water-leaving radiances, absorption and backscattering. To reduce the false alarm rate, data that cannot be reliably classified are automatically labelled as “unknown”. This method can be trained on different HAB species or extended to new sensors and then applied to generate independent HAB risk maps; these can be fused with other sensors to fill gaps or improve spatial or temporal resolution. The HAB discrimination technique has obtained accurate results on MODIS and MERIS data, correctly identifying 89% of Phaeocystis globosa HABs in the southern North Sea and 88% of Karenia mikimotoi blooms in the Western English Channel. A linear transformation of the ocean colour discriminants is used to estimate harmful cell counts, demonstrating greater accuracy than if based on chlorophyll-a; this will facilitate its integration into a HAB early warning system operating in the southern North Sea.
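The abstract above does not say how unreliable pixels are detected. One common way to obtain an “unknown” label, shown here purely as an assumed sketch, is a reject option on the class posterior probabilities produced by a discriminant classifier. The function name `classify_with_reject` and the threshold of 0.8 are hypothetical.

```python
def classify_with_reject(posteriors, threshold=0.8):
    """Return the most probable class, or "unknown" if it is not probable enough.

    posteriors: dict mapping class name -> posterior probability for one pixel.
    threshold:  assumed confidence cut-off, not taken from the abstract.
    """
    label, prob = max(posteriors.items(), key=lambda item: item[1])
    return label if prob >= threshold else "unknown"


# Made-up posterior probabilities for two pixels.
confident = {"harmful": 0.90, "non-harmful": 0.05, "no bloom": 0.05}
ambiguous = {"harmful": 0.50, "non-harmful": 0.40, "no bloom": 0.10}
print(classify_with_reject(confident))  # harmful
print(classify_with_reject(ambiguous))  # unknown
```

Raising the threshold trades coverage for a lower false alarm rate, which matches the motivation stated in the abstract.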
Abstract:
The histological grading of cervical intraepithelial neoplasia (CIN) remains subjective, resulting in inter- and intra-observer variation and poor reproducibility in the grading of cervical lesions. This study has attempted to develop an objective grading system using automated machine vision. The architectural features of cervical squamous epithelium are quantitatively analysed using a combination of computerized digital image processing and Delaunay triangulation analysis; 230 images digitally captured from cases previously classified by a gynaecological pathologist included normal cervical squamous epithelium (n = 30), koilocytosis (n = 46), CIN 1 (n = 52), CIN 2 (n = 56), and CIN 3 (n=46). Intra- and inter-observer variation had kappa values of 0.502 and 0.415, respectively. A machine vision system was developed in KS400 macro programming language to segment and mark the centres of all nuclei within the epithelium. By object-oriented analysis of image components, the positional information of nuclei was used to construct a Delaunay triangulation mesh. Each mesh was analysed to compute triangle dimensions including the mean triangle area, the mean triangle edge length, and the number of triangles per unit area, giving an individual quantitative profile of measurements for each case. Discriminant analysis of the geometric data revealed the significant discriminatory variables from which a classification score was derived. The scoring system distinguished between normal and CIN 3 in 98.7% of cases and between koilocytosis and CIN 1 in 76.5% of cases, but only 62.3% of the CIN cases were classified into the correct group, with the CIN 2 group showing the highest rate of misclassification. Graphical plots of triangulation data demonstrated the continuum of morphological change from normal squamous epithelium to the highest grade of CIN, with overlapping of the groups originally defined by the pathologists. 
This study shows that automated location of nuclei in cervical biopsies using computerized image analysis is possible. Analysis of positional information enables quantitative evaluation of architectural features in CIN using Delaunay triangulation meshes, which is effective in the objective classification of CIN. This demonstrates the future potential of automated machine vision systems in diagnostic histopathology. Copyright (C) 2000 John Wiley and Sons, Ltd.
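The mesh measurements named in the abstract above (mean triangle area, mean triangle edge length, triangles per unit area) can be sketched as below. The triangulation is assumed to be already available as index triples into a list of nucleus centres, for example from the `simplices` attribute of `scipy.spatial.Delaunay`. The function names, the choice to count each shared edge once, and the `region_area` argument are assumptions, not details of the KS400 system.

```python
import math


def triangle_area(a, b, c):
    """Area of a triangle from its three (x, y) vertices."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def mesh_profile(points, triangles, region_area):
    """Summarise a triangulation of nucleus centres.

    points:      list of (x, y) nucleus centres
    triangles:   list of (i, j, k) index triples into points
    region_area: area of the epithelium region analysed (assumed to be known)
    """
    areas = []
    edge_lengths = {}  # keyed by unordered vertex pair, so shared edges count once
    for i, j, k in triangles:
        areas.append(triangle_area(points[i], points[j], points[k]))
        for u, v in ((i, j), (j, k), (k, i)):
            edge_lengths[frozenset((u, v))] = math.dist(points[u], points[v])
    return {
        "mean_triangle_area": sum(areas) / len(areas),
        "mean_edge_length": sum(edge_lengths.values()) / len(edge_lengths),
        "triangles_per_unit_area": len(triangles) / region_area,
    }


# Toy example: four nuclei at the corners of a unit square, two triangles.
nuclei = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
mesh = [(0, 1, 2), (0, 2, 3)]
print(mesh_profile(nuclei, mesh, region_area=1.0))
```

A profile like this, computed per case, is the kind of feature vector that a discriminant analysis could then take as input.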
Abstract:
Logistic regression and Gaussian mixture model (GMM) classifiers have been trained to estimate the probability of acute myocardial infarction (AMI) in patients based upon the concentrations of a panel of cardiac markers. The panel consists of two new markers, fatty acid binding protein (FABP) and glycogen phosphorylase BB (GPBB), in addition to the traditional cardiac troponin I (cTnI), creatine kinase MB (CKMB) and myoglobin. The effect of using principal component analysis (PCA) and Fisher discriminant analysis (FDA) to preprocess the marker concentrations was also investigated. The need for classifiers to give an accurate estimate of the probability of AMI is argued and three categories of performance measure are described, namely discriminatory ability, sharpness, and reliability. Numerical performance measures for each category are given and applied. The optimum classifier, based solely upon the samples taken on admission, was the logistic regression classifier using FDA preprocessing. This gave an accuracy of 0.85 (95% confidence interval: 0.78-0.91) and a normalised Brier score of 0.89. When samples at both admission and a further time, 1-6 h later, were included, the performance increased significantly, showing that logistic regression classifiers can indeed use the information from the five cardiac markers to accurately and reliably estimate the probability of AMI. © Springer-Verlag London Limited 2008.
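For readers unfamiliar with the Brier score mentioned in the abstract above, the following sketch computes the plain (unnormalised) version: the mean squared difference between each predicted probability and the observed 0/1 outcome, where lower is better. The abstract reports a normalised score without stating the normalisation, so that step is not reproduced here. The function name and the numbers are illustrative only.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Made-up predicted AMI probabilities and confirmed diagnoses (1 = AMI).
predicted = [0.9, 0.2, 0.6]
diagnosed = [1, 0, 1]
print(brier_score(predicted, diagnosed))
```

Because the score penalises confident wrong probabilities heavily, it measures the quality of the probability estimates themselves rather than only the thresholded decisions, which is the property the abstract argues for.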
Abstract:
The concentration of organic acids in anaerobic digesters is one of the most critical parameters for monitoring and advanced control of anaerobic digestion processes. Thus, a reliable online-measurement system is absolutely necessary. A novel approach to obtaining these measurements indirectly and online using UV/vis spectroscopic probes, in conjunction with powerful pattern recognition methods, is presented in this paper. A UV/vis spectroscopic probe from S::CAN is used in combination with a custom-built dilution system to monitor the absorption of fully fermented sludge over a spectrum from 200 to 750 nm. Advanced pattern recognition methods are then used to map the non-linear relationship between measured absorption spectra and laboratory measurements of organic acid concentrations. Linear discriminant analysis, generalized discriminant analysis (GerDA), support vector machines (SVM), relevance vector machines, random forest and neural networks are investigated for this purpose and their performance compared. To validate the approach, online measurements have been taken at a full-scale 1.3-MW industrial biogas plant. Results show that whereas some of the methods considered do not yield satisfactory results, accurate prediction of organic acid concentration ranges can be obtained with both GerDA and SVM-based classifiers, with classification rates in excess of 87% achieved on test data.
Abstract:
A study combining high resolution mass spectrometry (liquid chromatography-quadrupole time-of-flight-mass spectrometry, UPLC-QTof-MS) and chemometrics for the analysis of post-mortem brain tissue from subjects with Alzheimer’s disease (AD) (n = 15) and healthy age-matched controls (n = 15) was undertaken. The huge potential of this metabolomics approach for distinguishing AD cases is underlined by the correct prediction of disease status in 94–97% of cases. Predictive power was confirmed in a blind test set of 60 samples, reaching 100% diagnostic accuracy. The approach also indicated compounds significantly altered in concentration following the onset of human AD. Using orthogonal partial least-squares discriminant analysis (OPLS-DA), a multivariate model was created for both modes of acquisition explaining the maximum amount of variation between sample groups (Positive Mode-R2 = 97%; Q2 = 93%; root mean squared error of validation (RMSEV) = 13%; Negative Mode-R2 = 99%; Q2 = 92%; RMSEV = 15%). In brain extracts, 1264 and 1457 ions of interest were detected for the different modes of acquisition (positive and negative, respectively). Incorporation of gender into the model increased predictive accuracy and decreased RMSEV values. High resolution UPLC-QTof-MS has not previously been employed to biochemically profile post-mortem brain tissue, and the novel methods described and validated herein prove its potential for making new discoveries related to the etiology, pathophysiology, and treatment of degenerative brain disorders.
Abstract:
Purpose: The purpose of this paper is to present an artificial neural network (ANN) model that predicts earthmoving trucks' condition level using simple predictors; the model's performance is compared to the respective predictive accuracy of the statistical method of discriminant analysis (DA).
Design/methodology/approach: An ANN-based predictive model is developed. The condition level predictors selected are the capacity, age, kilometers travelled and maintenance level. The relevant data set was provided by two Greek construction companies and includes the characteristics of 126 earthmoving trucks.
Findings: Data processing identifies a particularly strong connection of kilometers travelled and maintenance level with the earthmoving trucks' condition level. Moreover, the validation process reveals that the predictive efficiency of the proposed ANN model is very high. Similar findings emerge from the application of DA to the same data set using the same predictors.
Originality/value: Sound prediction of earthmoving trucks' condition level reduces downtime and its adverse impact on earthmoving duration and cost, while also enhancing the effectiveness of maintenance and replacement policies. This research shows that a sound condition level prediction for earthmoving trucks is achievable using easy-to-collect data, and provides a comparative evaluation of the results of two widely applied predictive methods.
Abstract:
In this study, 137 corn distillers dried grains with solubles (DDGS) samples from a range of different geographical origins (Jilin Province of China, Heilongjiang Province of China, USA and Europe) were collected and analysed. Different near infrared spectrometers combined with different chemometric packages were used in two independent laboratories to investigate the feasibility of classifying the geographical origin of DDGS. Based on the same dataset, one laboratory developed a partial least squares discriminant analysis model and the other laboratory developed an orthogonal partial least squares discriminant analysis model. Results showed that both models could perfectly classify DDGS samples from different geographical origins. These promising results encourage the development of larger scale efforts to produce datasets which can be used to differentiate the geographical origin of DDGS, and such efforts are required to provide higher level food security measures on a global scale.