27 results for Statistical method
Abstract:
The techniques of principal component analysis (PCA) and partial least squares (PLS) are introduced from the point of view of providing a multivariate statistical method for modelling process plants. The advantages and limitations of PCA and PLS are discussed from the perspective of the type of data and problems that might be encountered in this application area. These concepts are exemplified by two case studies, dealing first with data from a continuous stirred tank reactor (CSTR) simulation and second with a literature source describing a low-density polyethylene (LDPE) reactor simulation.
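A minimal sketch of the two techniques on simulated process data, using scikit-learn stand-ins; the dimensions, data, and variable names are illustrative assumptions, not the case-study data:

```python
# Illustrative sketch: PCA and PLS on synthetic process data (not the
# authors' code; all sizes and variable names are hypothetical).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))             # 12 process measurements
Y = X[:, :3] @ rng.normal(size=(3, 2))     # 2 quality variables driven by X
Y += 0.1 * rng.normal(size=Y.shape)

pca = PCA(n_components=3).fit(X)           # unsupervised: structure of X only
print("explained variance:", pca.explained_variance_ratio_.round(3))

pls = PLSRegression(n_components=3).fit(X, Y)  # supervised: X-Y covariance
print("R^2 on training data:", round(pls.score(X, Y), 3))
```

The contrast is the essential point of the abstract: PCA summarises the process measurements alone, while PLS extracts the components of X most predictive of the quality variables Y.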
Abstract:
Slurries with high penetrability for the production of self-consolidating Slurry Infiltrated Fiber Concrete (SIFCON) were investigated in this study. A factorial experimental design was adopted to assess the combined effects of five independent variables on the mini-slump test, plate cohesion meter, induced bleeding test, J-fiber penetration test, and compressive strength at 7 and 28 days. The independent variables investigated were the proportions of limestone powder (LSP) and sand, the dosages of superplasticiser (SP) and viscosity agent (VA), and the water-to-binder ratio (w/b). A two-level fractional factorial statistical method was used to model the influence of key parameters on properties affecting the behaviour of fresh cement slurry and on compressive strength. The models are valid for mixes with 10 to 50% LSP as replacement of cement, 0.02 to 0.06% VA by mass of cement, 0.6 to 1.2% SP and 50 to 150% sand (% mass of binder), and 0.42 to 0.48 w/b. The influences of LSP, SP, VA, sand and w/b were characterised and analysed using polynomial regression, which identifies the primary factors and their interactions affecting the measured properties. Mathematical polynomials were developed for mini-slump, plate cohesion meter, J-fiber penetration test, induced bleeding and compressive strength as functions of LSP, SP, VA, sand and w/b. The estimated results of mini-slump, induced bleeding test and compressive strength from the derived models are compared with results obtained from previously proposed models developed for cement paste. The proposed response models of the self-consolidating SIFCON offer useful information regarding mix optimization to secure a highly penetrable slurry with low compressive strength.
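As an illustration of the modelling step, the sketch below fits a polynomial with main effects and two-factor interactions to a response over coded factor levels. It is a generic stand-in under stated assumptions: a full 2^5 factorial is used for simplicity (the study used a fractional design), and the response values are synthetic.

```python
# Two-level factorial modelling sketch: polynomial with main effects and
# two-factor interactions fitted by least squares. Data are invented.
import numpy as np
from itertools import combinations, product

factors = ["LSP", "SP", "VA", "sand", "w/b"]
X = np.array(list(product([-1, 1], repeat=5)), dtype=float)  # coded 2^5 design, 32 runs
rng = np.random.default_rng(1)
y = 2.0 + 0.8 * X[:, 0] - 0.5 * X[:, 4] + rng.normal(0, 0.1, len(X))  # synthetic "mini-slump"

# Design matrix: intercept, 5 main effects, 10 two-factor interactions.
cols = [np.ones(len(X))] + [X[:, i] for i in range(5)]
cols += [X[:, i] * X[:, j] for i, j in combinations(range(5), 2)]
A = np.column_stack(cols)
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print(dict(zip(["intercept"] + factors, coef[:6].round(3))))
```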
Abstract:
Query processing over the Internet involving autonomous data sources is a major task in data integration. It requires estimating the costs of candidate queries in order to select the one with the minimum cost. In this context, the cost of a query is affected by three factors: network congestion, server contention state, and complexity of the query. In this paper, we study the effects of both the network congestion and the server contention state on the cost of a query. We refer to these two factors together as system contention states. We present a new approach to determining the system contention states by clustering the costs of a sample query. For each system contention state, we construct two cost formulas, for unary and join queries respectively, using the multiple regression process. When a new query is submitted, its system contention state is first estimated using either the time slides method or the statistical method. The cost of the query is then calculated using the corresponding cost formulas. The estimated cost of the query is further adjusted to improve its accuracy. Our experiments show that our methods can produce quite accurate cost estimates of the submitted queries to remote data sources over the Internet.
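A rough sketch of the two-stage idea as described, with invented numbers and scikit-learn stand-ins (not the paper's implementation): cluster sample-query costs into contention states, then fit one regression-based cost formula per state.

```python
# Contention-state clustering plus per-state cost regression (illustrative).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
probe_costs = np.concatenate([rng.normal(1.0, 0.1, 50),    # light contention
                              rng.normal(3.0, 0.3, 50)])   # heavy contention
states = KMeans(n_clusters=2, n_init=10).fit(probe_costs.reshape(-1, 1))

# One regression per state: cost ~ operand size + result size (unary query).
models = {}
for s in range(2):
    X = rng.uniform(1, 100, size=(40, 2))        # hypothetical query features
    cost = X @ [0.05 * (s + 1), 0.02] + rng.normal(0, 0.1, 40)
    models[s] = LinearRegression().fit(X, cost)

state = states.predict([[2.8]])[0]               # estimate current state from a probe
print(models[state].predict([[50, 10]]))         # cost estimate under that state
```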
Abstract:
Face recognition with unknown, partial distortion and occlusion is a practical problem with a wide range of applications, including security and multimedia information retrieval. The authors present a new approach to face recognition subject to unknown, partial distortion and occlusion. The new approach is based on a probabilistic decision-based neural network, enhanced by a statistical method called the posterior union model (PUM). PUM is an approach for ignoring severely mismatched local features and focusing the recognition mainly on the reliable local features. It thereby improves robustness while assuming no prior information about the corruption. We call the new approach the posterior union decision-based neural network (PUDBNN). The new PUDBNN model has been evaluated on three face image databases (XM2VTS, AT&T and AR) using testing images subjected to various types of simulated and realistic partial distortion and occlusion. The new system has been compared to other approaches and has demonstrated improved performance.
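The following toy sketch conveys one reading of the posterior-union idea, not the authors' PUDBNN: combine local-feature match scores while discounting the most severely mismatched regions.

```python
# Simplified illustration (my interpretation of the abstract): score a face
# using only the best-matching fraction of its local features, so that
# occluded or distorted regions do not dominate the decision.
import numpy as np

def union_score(local_log_likelihoods, keep=0.7):
    """Mean of the top `keep` fraction of local-feature log-likelihoods."""
    scores = np.sort(np.asarray(local_log_likelihoods))[::-1]
    k = max(1, int(keep * len(scores)))
    return scores[:k].mean()        # mismatched regions fall into the tail

# e.g. 10 local regions, 3 of them occluded and scoring very poorly
print(union_score([-1.2, -0.8, -1.0, -0.9, -1.1, -1.3, -0.7, -9.5, -8.2, -7.9]))
```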
Abstract:
Purpose: The purpose of this paper is to present an artificial neural network (ANN) model that predicts the condition level of earthmoving trucks from simple predictors; the model's performance is compared with the predictive accuracy of the statistical method of discriminant analysis (DA).
Design/methodology/approach: An ANN-based predictive model is developed. The condition level predictors selected are the capacity, age, kilometers travelled and maintenance level. The relevant data set was provided by two Greek construction companies and includes the characteristics of 126 earthmoving trucks.
Findings: Data processing identifies a particularly strong connection of kilometers travelled and maintenance level with the earthmoving trucks' condition level. Moreover, the validation process reveals that the predictive efficiency of the proposed ANN model is very high. Similar findings emerge from the application of DA to the same data set using the same predictors.
Originality/value: Sound prediction of earthmoving trucks' condition level reduces downtime and its adverse impact on earthmoving duration and cost, while also enhancing the effectiveness of maintenance and replacement policies. This research shows that a sound condition level prediction for earthmoving trucks is achievable using easy-to-collect data, and it provides a comparative evaluation of the results of two widely applied predictive methods.
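A hedged sketch of this comparison using common scikit-learn stand-ins (MLPClassifier for the ANN, LinearDiscriminantAnalysis for DA); the four predictors follow the abstract, but the data are synthetic and the network architecture is an assumption.

```python
# ANN vs. discriminant analysis on synthetic truck-condition data.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
X = rng.uniform(size=(126, 4))     # capacity, age, km travelled, maintenance level
y = (0.6 * X[:, 2] + 0.4 * (1 - X[:, 3])
     + rng.normal(0, 0.1, 126) > 0.5).astype(int)   # condition level (binary here)

ann = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                                  random_state=0))
da = LinearDiscriminantAnalysis()
print("ANN accuracy:", cross_val_score(ann, X, y, cv=5).mean().round(3))
print("DA  accuracy:", cross_val_score(da, X, y, cv=5).mean().round(3))
```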
Abstract:
Several one-dimensional design methods have been used to predict the off-design performance of three modern centrifugal compressors for automotive turbocharging. The three methods used are single-zone, two-zone, and a more recent statistical method. The predicted results from each method are compared against empirical data taken from standard hot gas stand tests for each turbocharger. The automotive turbochargers considered in this study have notably different geometries and applications. Because of the non-adiabatic test conditions, the empirical data have been corrected for the effect of heat transfer to ensure comparability with the 1D models. Each method is evaluated for usability and accuracy in both pressure ratio and efficiency prediction. The paper presents an insight into the limitations of each of these models when applied to one-dimensional automotive turbocharger design, and proposes that a corrected single-zone modelling approach has the greatest potential for further development, whilst the statistical method could be immediately introduced to a design process where design variations are limited.
Abstract:
One of the major challenges in systems biology is to understand the complex responses of a biological system to external perturbations or internal signalling, depending on its biological conditions. Genome-wide transcriptomic profiling of cellular systems under various chemical perturbations allows certain features of the chemicals to manifest through their transcriptomic expression profiles. The insights obtained may help to establish connections between human diseases, associated genes and therapeutic drugs. The main objective of this study was to systematically analyse cellular gene expression data under various drug treatments to elucidate drug-feature-specific transcriptomic signatures. We first extracted drug-related information (drug features) from the collected textual description of DrugBank entries using text-mining techniques. A novel statistical method employing orthogonal least squares learning was proposed to obtain drug-feature-specific signatures by integrating gene expression with DrugBank data. To obtain robust signatures from noisy input datasets, a stringent ensemble approach was applied, combining three techniques: resampling, leave-one-out cross-validation, and aggregation. The validation experiments showed that the proposed method is capable of extracting biologically meaningful drug-feature-specific gene expression signatures. Regulatory network analysis also showed that most of the signature genes are connected with common hub genes, and Gene Ontology analysis further related these common hub genes to general drug metabolism. Each set of genes has relatively few interactions with other sets, indicating the modular nature of each signature and its drug-feature specificity. Based on Gene Ontology analysis, we also found that each set of drug-feature (DF)-specific genes was indeed enriched in biological processes related to the drug feature. The results of these experiments demonstrate the potential of the method for predicting certain features of new drugs using their transcriptomic profiles, providing a useful methodological framework and a valuable resource for drug development and characterization.
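The following is a generic orthogonal-least-squares forward-selection sketch in the spirit of the method described, not the paper's exact algorithm; the expression matrix and target are synthetic.

```python
# Greedy OLS-style selection: repeatedly pick the regressor with the largest
# error-reduction ratio against the residual, then orthogonalize the rest.
import numpy as np

def ols_select(X, y, n_terms):
    X = X.astype(float).copy()
    residual, chosen = y.astype(float).copy(), []
    for _ in range(n_terms):
        norms = np.einsum("ij,ij->j", X, X)
        err = (X.T @ residual) ** 2 / ((norms + 1e-12) * (residual @ residual))
        j = int(np.argmax(err))                  # best-explaining column
        chosen.append(j)
        w = X[:, j] / np.linalg.norm(X[:, j])
        residual -= (w @ residual) * w           # deflate the target
        X -= np.outer(w, w @ X)                  # orthogonalize remaining columns
        X[:, j] = 0.0                            # retire the used column
    return chosen

rng = np.random.default_rng(4)
G = rng.normal(size=(100, 20))                   # expression of 20 genes, 100 samples
y = 2 * G[:, 3] - 1.5 * G[:, 7] + 0.1 * rng.normal(size=100)
print(ols_select(G, y, n_terms=2))               # expect columns 3 and 7
```

In the study this selection step is wrapped in the resampling / leave-one-out / aggregation ensemble to stabilise which genes enter each signature.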
Abstract:
This paper presents a new series of AMS dates on ultrafiltered bone gelatin extracted from identified cutmarked or humanly-modified bones and teeth from the site of Abri Pataud, in the French Dordogne. The sequence of 32 new determinations provides a coherent and reliable chronology for the site's early Upper Palaeolithic levels 5-14, excavated by Hallam Movius. The results show that there were some problems with the previous series of dates, with many underestimating the real age. The new results, when calibrated and modelled using a Bayesian statistical method, allow a detailed understanding of the pace of cultural changes within the Aurignacian I and II levels of the site, something not achievable before. In the future, the sequence of dates will allow wider comparison with similarly dated contexts elsewhere in Europe. High-precision dating is only possible by using large suites of AMS dates from humanly-modified material within well-understood archaeological sequences modelled using a Bayesian statistical method.
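In practice such modelling is done with dedicated calibration software (e.g. OxCal); the toy sketch below only illustrates the Bayesian principle of combining each determination's likelihood with a stratigraphic ordering prior, using invented dates and errors.

```python
# Toy Bayesian sequence model: Gaussian date likelihoods constrained so that
# deeper levels are older, sampled with a simple Metropolis scheme.
import numpy as np

rng = np.random.default_rng(5)
dets = [(33000, 300), (32500, 300), (31800, 300)]   # (mean, error) per level, invented

def log_post(ages):
    if not all(ages[i] >= ages[i + 1] for i in range(len(ages) - 1)):
        return -np.inf                               # ordering prior: deeper = older
    return sum(-0.5 * ((a - m) / s) ** 2 for a, (m, s) in zip(ages, dets))

ages = np.array([m for m, _ in dets], float)
samples = []
for _ in range(20000):
    prop = ages + rng.normal(0, 100, size=3)
    if np.log(rng.uniform()) < log_post(prop) - log_post(ages):
        ages = prop
    samples.append(ages.copy())
print(np.mean(samples[5000:], axis=0).round())       # posterior mean per level
```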
Abstract:
Single-component geochemical maps are the most basic representation of spatial elemental distributions and are commonly used in environmental and exploration geochemistry. However, the compositional nature of geochemical data imposes several limitations on how the data should be presented. The problems relate to the constant-sum problem (closure) and to the inherently multivariate, relative information conveyed by compositional data. Well known, for instance, is the tendency of all heavy metals to show lower values in soils with significant contributions of diluting elements (e.g., the quartz dilution effect), or the contrary effect, apparent enrichment in many elements due to removal of potassium during weathering. The validity of classical single-component maps is thus investigated, and reasonable alternatives that honour the compositional character of geochemical concentrations are presented. The first recommended method relies on knowledge-driven log-ratios, chosen to highlight certain geochemical relations or to filter known artefacts (e.g. dilution with SiO2 or volatiles); this is similar to the classical approach of normalising to a single element. The second approach uses so-called log-contrasts, which employ suitable statistical methods (such as classification techniques, regression analysis, principal component analysis, clustering of variables, etc.) to extract potentially interesting geochemical summaries. The caution from this work is that if a compositional approach is not used, it becomes difficult to guarantee that any identified pattern, trend or anomaly is not an artefact of the constant-sum constraint. In summary, the authors recommend a chain of enquiry that involves searching for the appropriate statistical method that can answer the required geological or geochemical question whilst maintaining the integrity of the compositional nature of the data. The required log-ratio transformations should be applied, followed by the chosen statistical method. Interpreting the results may require a closer working relationship between statisticians, data analysts and geochemists.
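A brief sketch of the recommended first approach, with invented concentrations: replace a raw single-element map value with a knowledge-driven log-ratio against the diluting component, shown alongside the centred log-ratio (clr) transform for comparison.

```python
# Knowledge-driven log-ratio vs. centred log-ratio on toy compositions.
import numpy as np

comp = np.array([[650000., 120., 40.],    # SiO2, Zn, Pb in ppm (invented values)
                 [850000.,  60., 20.]])   # a quartz-diluted sample
si, zn = comp[:, 0], comp[:, 1]
# Zn relative to the diluent: insensitive to the constant-sum constraint.
print("log-ratio Zn/SiO2:", np.log(zn / si).round(2))

# Centred log-ratio: each part relative to the geometric mean of the sample.
clr = np.log(comp) - np.log(comp).mean(axis=1, keepdims=True)
print("clr:", clr.round(2))
```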
Abstract:
Reliability has emerged as a critical design constraint, especially in memories. Designers go to great lengths to guarantee fault-free operation of the underlying silicon by adopting redundancy-based techniques, which essentially try to detect and correct every single error. However, such techniques come at the cost of large area, power and performance overheads, leading many researchers to doubt their efficiency, especially for error-resilient systems where 100% accuracy is not always required. In this paper, we present an alternative method focusing on the confinement of the output error induced by any reliability issues. Focusing on memory faults, rather than correcting every single error, the proposed method exploits the statistical characteristics of the target application and replaces any erroneous data with the best available estimate of that data. To realize the proposed method, a RISC processor is augmented with custom instructions and special-purpose functional units. We apply the method on the enhanced processor by studying the statistical characteristics of the various algorithms involved in a popular multimedia application. Our experimental results show that, in contrast to state-of-the-art fault-tolerance approaches, we are able to reduce runtime and area overhead by 71.3% and 83.3%, respectively.
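Conceptually, the confinement step can be modelled in software as below; the paper realises it with custom instructions and functional units, and the data, fault rate, and estimator here are illustrative assumptions.

```python
# Error confinement sketch: substitute a statistical estimate for a value
# whose read is flagged as faulty, bounding the output error instead of
# correcting every bit.
import numpy as np

rng = np.random.default_rng(6)
frame = rng.integers(0, 256, size=1000)           # e.g. pixel data
expected = frame.mean()                           # application-level statistic

def confined_read(value, fault_detected):
    # Rather than correcting the error, bound its effect on the output.
    return expected if fault_detected else value

faults = rng.random(1000) < 0.01                  # ~1% of reads flagged faulty
out = np.array([confined_read(v, f) for v, f in zip(frame, faults)])
print("max error introduced:", np.abs(out - frame)[faults].max())
```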
Abstract:
The work in this paper is of particular significance since it considers the problem of modelling cross- and auto-correlation in statistical process monitoring. The presence of both types of correlation can lead to fault insensitivity or false alarms, although in the published literature to date only autocorrelation has been broadly considered. The proposed method, which uses a Kalman innovation model, effectively removes both correlations. The paper (and Part 2 [2]) has emerged from work supported by EPSRC grant GR/S84354/01 and is of direct relevance to problems in several application areas, including chemical, electrical, and mechanical process monitoring.
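A scalar stand-in for the whitening idea, assuming a known AR(1) measurement model rather than the paper's full Kalman innovation model: the one-step prediction errors (innovations) are monitored instead of the raw, autocorrelated signal.

```python
# Innovation whitening sketch: an AR(1) signal is strongly autocorrelated,
# but its one-step prediction errors are (approximately) white.
import numpy as np

rng = np.random.default_rng(7)
phi, n = 0.9, 2000
x = np.zeros(n)
for t in range(1, n):                      # autocorrelated process measurement
    x[t] = phi * x[t - 1] + rng.normal()

# One-step predictor for a known AR(1): innovation e_t = x_t - phi * x_{t-1}
innov = x[1:] - phi * x[:-1]
lag1 = np.corrcoef(x[:-1], x[1:])[0, 1]
lag1_innov = np.corrcoef(innov[:-1], innov[1:])[0, 1]
print(f"raw lag-1 autocorrelation: {lag1:.2f}, innovations: {lag1_innov:.2f}")
```

Monitoring statistics computed on the innovations, rather than on x itself, avoid the fault insensitivity and false alarms that correlation induces.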
Abstract:
This paper builds on work presented in the first paper, Part 1 [1], and is of equal significance. The paper proposes a novel compensation method to preserve the integrity of step-fault signatures prevalent in various processes, which can be masked during the removal of both auto- and cross-correlation. Using industrial data, the paper demonstrates the benefit of the proposed method, which is applicable to chemical, electrical, and mechanical process monitoring. This paper (and Part 1 [1]) has led to further work supported by EPSRC grant GR/S84354/01 involving kernel PCA methods.
Abstract:
Background: Results from clinical trials are usually summarized in the form of sampling distributions. When full information (mean, SEM) about these distributions is given, performing a meta-analysis is straightforward. However, when some of the sampling distributions only have mean values, a challenging issue is to decide how to use such distributions in the meta-analysis. Currently, the most common approaches are either ignoring such trials or, for each trial with a missing SEM, finding a similar trial and taking its SEM value as the missing SEM. Both approaches have drawbacks. As an alternative, this paper develops and tests two new methods, the prognostic method and the interval method, to estimate any missing SEMs from a set of sampling distributions with full information. A merging method is also proposed to handle clinical trials with partial information when simulating meta-analysis.
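As a naive illustration only (not the proposed prognostic or interval methods), a missing SEM can be imputed from the complete trials and the trials then pooled with standard inverse-variance weights:

```python
# Impute a missing SEM from complete trials, then pool (fixed-effect model).
import numpy as np

full = [(5.2, 0.8), (4.7, 0.6), (5.9, 1.1)]      # (mean, SEM): full information
partial_means = [5.0]                             # trials reporting mean only

# Naive imputation: borrow the typical SEM-to-mean ratio of complete trials.
ratio = np.mean([s / m for m, s in full])
trials = full + [(m, m * ratio) for m in partial_means]

w = np.array([1 / s**2 for _, s in trials])       # inverse-variance weights
means = np.array([m for m, _ in trials])
pooled = (w * means).sum() / w.sum()
print(f"pooled mean {pooled:.2f}, SE {w.sum()**-0.5:.2f}")
```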
Abstract:
Background
Interaction of a drug or chemical with a biological system can result in a gene-expression profile or signature characteristic of the event. Using a suitably robust algorithm, these signatures can potentially be used to connect molecules with similar pharmacological or toxicological properties by gene expression profile. Lamb et al. first proposed the Connectivity Map [Lamb et al (2006), Science 313, 1929–1935] to make successful connections among small molecules, genes, and diseases using genomic signatures.
Results
Here we have built on the principles of the Connectivity Map to present a simpler and more robust method for the construction of reference gene-expression profiles and for the connection scoring scheme, which importantly allows the evaluation of the statistical significance of all observed connections. We tested the new method with two randomly generated gene signatures and three experimentally derived gene signatures (for HDAC inhibitors, estrogens, and immunosuppressive drugs, respectively). Our testing indicates that the method achieves a higher level of specificity and sensitivity and so advances the original method.
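An illustrative connection score in this spirit, using a simple rank-based statistic with a permutation p-value rather than the authors' exact scoring scheme:

```python
# Rank-based connection score: near +1 when the query-signature genes sit at
# the top of the reference ranking; significance by permutation.
import numpy as np

rng = np.random.default_rng(8)
n_genes = 1000
ranks = rng.permutation(n_genes)    # ranks[g] = rank of gene g (0 = most up-regulated)
sig = np.arange(20)                 # hypothetical query signature genes

def score(ranks, sig):
    return 1 - 2 * ranks[sig].mean() / (len(ranks) - 1)

obs = score(ranks, sig)
null = [score(rng.permutation(n_genes), sig) for _ in range(1000)]
p = (np.sum(np.abs(null) >= abs(obs)) + 1) / 1001
print(f"connection score {obs:.2f}, permutation p = {p:.3f}")
```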
Conclusion
The method presented here not only offers more principled statistical procedures for testing connections but, more importantly, provides an effective safeguard against false connections while achieving increased sensitivity. With its robust performance, the method has potential use in the drug development pipeline for the early recognition of pharmacological and toxicological properties in chemicals and new drug candidates, and also more broadly in other 'omics sciences.