850 results for Local classification method
Abstract:
An X-ray visualization technique has been used for the quantitative determination of local liquid holdup distribution and liquid holdup hysteresis in a nonwetting two-dimensional (2-D) packed bed. A medical diagnostic X-ray unit has been used to image the local holdups in a 2-D cold model having a random packing of expanded polystyrene beads. An aqueous barium chloride solution was used as the fluid to achieve good contrast on X-ray images. To quantify the local liquid holdup, a simple calibration technique has been developed that can be used for most radiological methods, such as gamma-ray and neutron radiography. The global value of total liquid holdup obtained by the X-ray method has been compared with two conventional methods: drainage and tracer response. The X-ray technique, after validation, has been used to visualize and quantify the liquid hysteresis phenomena in a packed bed. The liquid flows in preferred paths or channels that carry droplets/rivulets of increasing size and number as the liquid flow rate is increased. When the flow is reduced, these paths are retained, and the higher liquid holdup that persists in these regions leads to the holdup hysteresis effect. Holdup in some regions of the packed bed may be an order of magnitude higher than average at a particular flow rate. (c) 2005 American Institute of Chemical Engineers
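The abstract does not spell out the calibration, so the following is only a plausible sketch of a standard radiographic approach: Beer-Lambert attenuation interpolated between two reference images, a dry bed (zero holdup) and a fully flooded bed (holdup 1). All intensities and the function name are illustrative, not the authors' procedure.

```python
# Hypothetical radiographic holdup calibration (Beer-Lambert, two references).
import numpy as np

def local_holdup(I_wet, I_dry, I_flooded):
    """Estimate local liquid holdup per pixel from X-ray intensity images.

    Under Beer-Lambert attenuation, log-intensity varies linearly with the
    liquid path length, so holdup interpolates between the two references.
    """
    eps = 1e-12  # guard against log(0)
    num = np.log(np.asarray(I_dry) + eps) - np.log(np.asarray(I_wet) + eps)
    den = np.log(np.asarray(I_dry) + eps) - np.log(np.asarray(I_flooded) + eps)
    return np.clip(num / den, 0.0, 1.0)

# Synthetic demo: a pixel whose intensity sits at the log-space midpoint
# between the dry and flooded references has holdup ~0.5.
I_dry, I_flooded = 1000.0, 100.0
I_wet = np.sqrt(I_dry * I_flooded)  # geometric mean -> log midpoint
print(local_holdup(I_wet, I_dry, I_flooded))  # ~0.5
```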
Abstract:
Background: Protein tertiary structure can be partly characterized via each amino acid's contact number measuring how residues are spatially arranged. The contact number of a residue in a folded protein is a measure of its exposure to the local environment, and is defined as the number of C-beta atoms in other residues within a sphere around the C-beta atom of the residue of interest. Contact number is partly conserved between protein folds and thus is useful for protein fold and structure prediction. In turn, each residue's contact number can be partially predicted from the primary amino acid sequence, assisting tertiary fold analysis from sequence data. In this study, we provide a more accurate contact number prediction method from protein primary sequence. Results: We predict contact number from protein sequence using a novel support vector regression algorithm. Using protein local sequences with multiple sequence alignments (PSI-BLAST profiles), we demonstrate a correlation coefficient between predicted and observed contact numbers of 0.70, which outperforms previously achieved accuracies. Including additional information about sequence weight and amino acid composition further improves prediction accuracies significantly, with the correlation coefficient reaching 0.73. If residues are classified as being either contacted or non-contacted, the prediction accuracies are all greater than 77%, regardless of the choice of classification thresholds. Conclusion: The successful application of support vector regression to the prediction of protein contact number reported here, together with previous applications of this approach to the prediction of protein accessible surface area and B-factor profile, suggests that a support vector regression approach may be very useful for determining the structure-function relation between primary sequence and higher-order protein structural and functional properties.
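As a rough illustration of the setup described (not the authors' tuned pipeline), the sketch below trains a support vector regressor on sliding-window features standing in for PSI-BLAST profile rows. The window size, kernel, and hyperparameters are assumed values, and the data are random placeholders.

```python
# SVR on windowed per-residue profile features (synthetic stand-ins).
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
L, W = 200, 7                       # sequence length, window size (assumed)
profile = rng.random((L, 20))       # stand-in for a PSI-BLAST profile (L x 20)
contact_no = rng.random(L) * 20     # stand-in for observed contact numbers

half = W // 2
padded = np.vstack([np.zeros((half, 20)), profile, np.zeros((half, 20))])
X = np.array([padded[i:i + W].ravel() for i in range(L)])  # windowed features

model = SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(X, contact_no)
pred = model.predict(X)
print("correlation:", np.corrcoef(pred, contact_no)[0, 1])
```

The contacted/non-contacted classification the abstract mentions would follow by thresholding `pred` against a chosen cutoff.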
Abstract:
In this paper we apply a new method for the determination of the surface area of carbonaceous materials, using local surface excess isotherms obtained from Grand Canonical Monte Carlo (GCMC) simulation and a concept of area distribution in terms of the energy well-depth of the solid-fluid interaction. The range of well-depth considered in our GCMC simulations is from 10 to 100 K, which is wide enough to cover all carbon surfaces that we dealt with (for comparison, the well-depth for a perfect graphite surface is about 58 K). Having the set of local surface excess isotherms and the differential area distribution, the overall adsorption isotherm can be obtained in an integral form. Thus, given experimental data for nitrogen or argon adsorption on a carbon material, the differential area distribution can be obtained by an inversion process, using the regularization method. The total surface area is then obtained as the area under this distribution. We test this approach against a number of data sets from the literature and compare our GCMC surface area with that obtained from the classical BET method. In general, we find that the difference between these two surface areas is about 10%, underscoring the need to determine the surface area with a consistent and reliable method. We therefore suggest the approach of this paper as an alternative to the BET method, given the long-recognized unrealistic assumptions of the BET theory. Besides the surface area, this method also provides the differential area distribution versus well-depth. This information could be used as a microscopic fingerprint of the carbon surface; samples prepared from different precursors and different activation conditions are expected to have distinct fingerprints. We illustrate this with Cabot BP120, 280 and 460 samples: the differential area distributions obtained from argon adsorption at 77 K and from nitrogen adsorption at 77 K have exactly the same patterns, suggesting that they capture characteristics intrinsic to each carbon.
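The inversion step lends itself to a short sketch. The version below substitutes a synthetic Langmuir-like kernel for the GCMC local isotherms and recovers the differential area distribution by Tikhonov-regularized non-negative least squares; the pressure and well-depth grids and the regularization weight are illustrative assumptions.

```python
# Regularized inversion of overall isotherm = integral(local isotherm * f).
import numpy as np
from scipy.optimize import nnls

pressures = np.logspace(-4, 0, 40)          # reduced pressures (assumed grid)
well_depths = np.linspace(10, 100, 30)      # K, the range used in the paper

# Synthetic local isotherms: deeper wells fill at lower pressure (stand-in).
K = np.array([[p / (p + np.exp(-e / 30.0)) for e in well_depths]
              for p in pressures])

true_f = np.exp(-0.5 * ((well_depths - 58.0) / 10.0) ** 2)  # peak near graphite
rng = np.random.default_rng(1)
data = K @ true_f + 0.01 * rng.normal(size=len(pressures))  # "experimental" data

# Tikhonov regularization: append lambda*I rows so nnls solves
# min ||K f - data||^2 + lambda^2 ||f||^2 subject to f >= 0.
lam = 0.05
A = np.vstack([K, lam * np.eye(len(well_depths))])
b = np.concatenate([data, np.zeros(len(well_depths))])
f, _ = nnls(A, b)
step = well_depths[1] - well_depths[0]
print("recovered total area (arbitrary units):", f.sum() * step)
```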
Abstract:
In the English literature, facial approximation methods have been commonly classified into three types: Russian, American, or Combination. These categorizations are based on the protocols used, for example, whether methods use average soft-tissue depths (American methods) or require face muscle construction (Russian methods). However, literature searches outside the usual realm of English publications reveal key papers that demonstrate that the Russian category above has been founded on distorted views. In reality, Russian methods are based on limited face muscle construction, with heavy reliance on modified average soft-tissue depths. A closer inspection of the American method also reveals inconsistencies with the recognized classification scheme. This investigation thus demonstrates that all major methods of facial approximation depend on both face anatomy and average soft-tissue depths, rendering common method classification schemes redundant. The best way forward appears to be for practitioners to describe the methods they use (including the weight each one gives to average soft-tissue depths and deep face tissue construction) without placing them in any categorical classificatory group or giving them an ambiguous name. The state of this situation may need to be reviewed in the future in light of new research results and paradigms.
Abstract:
The Gauss-Marquardt-Levenberg (GML) method of computer-based parameter estimation, in common with other gradient-based approaches, suffers from the drawback that it may become trapped in local objective function minima, and thus report optimized parameter values that are not, in fact, optimized at all. This can seriously degrade its utility in the calibration of watershed models, where local optima abound. Nevertheless, the method also has advantages, chief among these being its model-run efficiency and its ability to report useful information on parameter sensitivities and covariances as a by-product of its use. It is also easily adapted to maintain this efficiency in the face of potential numerical problems (which adversely affect all parameter estimation methodologies) caused by parameter insensitivity and/or parameter correlation. This paper presents two algorithmic enhancements to the GML method that retain its strengths but overcome its weaknesses in the face of local optima. Using the first of these methods, an intelligent search for better parameter sets is conducted in parameter subspaces of decreasing dimensionality when progress of the parameter estimation process is slowed either by numerical instability incurred through problem ill-posedness or by encountering a local objective function minimum. The second methodology minimizes the chance of successive GML parameter estimation runs finding the same objective function minimum by starting successive runs at points that are maximally removed from previous parameter trajectories. As well as enhancing the ability of a GML-based method to find the global objective function minimum, the latter technique can also be used to find the locations of many non-global optima (should they exist) in parameter space. This can provide a useful means of inquiring into the well-posedness of a parameter estimation problem and of detecting the presence of bimodal parameter and predictive probability distributions. The new methodologies are demonstrated by calibrating a Hydrological Simulation Program-FORTRAN (HSPF) model against a time series of daily flows. Comparison with the SCE-UA method in this calibration context demonstrates a high level of comparative model-run efficiency for the new method. (c) 2006 Elsevier B.V. All rights reserved.
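A minimal sketch of the second enhancement, under the assumption that "maximally removed" means farthest from all previously visited parameter points: repeated local Levenberg-Marquardt runs are started at such points so that successive runs tend to fall into different minima. The toy objective is a stand-in for a watershed-model misfit, not HSPF.

```python
# Multi-start local optimization with maximally-distant restart points.
import numpy as np
from scipy.optimize import least_squares

def residuals(p):
    # Toy multimodal misfit standing in for a watershed-model objective.
    x, y = p
    return [x**2 + y**2 - 1.0, np.sin(3 * x) + 0.5 * y]

bounds_lo, bounds_hi = np.array([-2.0, -2.0]), np.array([2.0, 2.0])
rng = np.random.default_rng(0)
visited, best = [], None

for run in range(5):
    if not visited:
        start = rng.uniform(bounds_lo, bounds_hi)
    else:
        # Pick the candidate farthest from every previously visited point.
        cands = rng.uniform(bounds_lo, bounds_hi, size=(500, 2))
        dists = np.min(np.linalg.norm(
            cands[:, None, :] - np.array(visited)[None, :, :], axis=2), axis=1)
        start = cands[np.argmax(dists)]
    res = least_squares(residuals, start, method="lm")  # local LM run
    visited.extend([start, res.x])
    if best is None or res.cost < best.cost:
        best = res

print("best parameters:", best.x, "cost:", best.cost)
```

Collecting the distinct `res.x` values across runs is one way to enumerate the non-global optima the abstract mentions.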
Abstract:
Background: The residue-wise contact order (RWCO) describes the sequence separations between a residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure descriptor that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing a protein's three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and could give deep insights into protein sequence-structure relationships. Results: We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance: local sequence in the form of PSI-BLAST profiles; local sequence plus amino acid composition; local sequence plus molecular weight; local sequence plus secondary structure predicted by PSIPRED; local sequence plus molecular weight and amino acid composition; local sequence plus molecular weight and predicted secondary structure; and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55 and a root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. Moreover, by incorporating global features such as molecular weight and amino acid composition, we could further improve the prediction performance, raising the CC to 0.57 with an RMSE of 0.79. In addition, combining the secondary structure predicted by PSIPRED was found to significantly improve the prediction performance, yielding the best prediction accuracy with a CC of 0.60 and an RMSE of 0.78, at least comparable with other existing methods. Conclusion: The SVR method shows prediction performance competitive with, or at least comparable to, previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the view that support vector regression is a powerful tool for extracting the protein sequence-structure relationship and for estimating protein structural profiles from amino acid sequences.
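One of the listed encoding schemes, local sequence plus amino acid composition and molecular weight, reduces to feature concatenation and can be sketched briefly. Everything below is a synthetic stand-in for the real PSI-BLAST and PSIPRED inputs; the window size and molecular weight are invented.

```python
# SVR on local window features concatenated with global features.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
L, W, F = 150, 7, 20
profile = rng.random((L, F))                 # per-residue profile (stand-in)
rwco = rng.random(L)                         # observed RWCO values (stand-in)
aa_comp = rng.random(F); aa_comp /= aa_comp.sum()   # global AA composition
mol_weight = np.array([17.3])                # global molecular weight (fake)

half = W // 2
padded = np.vstack([np.zeros((half, F)), profile, np.zeros((half, F))])
X = np.array([np.concatenate([padded[i:i + W].ravel(), aa_comp, mol_weight])
              for i in range(L)])            # local + global features per residue

model = SVR(kernel="rbf").fit(X, rwco)
pred = model.predict(X)
print("CC:", np.corrcoef(pred, rwco)[0, 1])
```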
Abstract:
Ecological regions are increasingly used as spatial units for planning and environmental management. It is important to define these regions in a scientifically defensible way, to justify any decisions made on the basis that they are representative of broad environmental assets. The paper describes a methodology and tool to identify cohesive bioregions. The methodology applies an elicitation process to obtain geographical descriptions of bioregions, each of which is transformed into a Normal density estimate on the environmental variables within that region. This prior information is balanced against a data-driven classification of environmental datasets using a Bayesian statistical modelling approach to objectively map ecological regions. The method is called model-based clustering because it fits a Normal mixture model to the clusters associated with regions, and it addresses uncertainty in environmental datasets due to overlapping clusters.
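A lightweight way to see the prior-plus-data balance (not the paper's full Bayesian treatment) is to seed a Normal mixture with elicited regional means and let the fit adjust them against the environmental data. The sketch below does this with scikit-learn's mixture model; the environmental variables and elicited means are invented.

```python
# Normal mixture seeded with expert-elicited regional descriptions.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic environmental grid: two variables (e.g. rainfall, temperature).
data = np.vstack([rng.normal([300, 18], [40, 1.5], (200, 2)),
                  rng.normal([900, 24], [60, 2.0], (200, 2))])

# Elicited prior means for two bioregions (expert-supplied, hypothetical).
elicited_means = np.array([[350.0, 17.0], [850.0, 25.0]])

gmm = GaussianMixture(n_components=2, means_init=elicited_means,
                      covariance_type="full", random_state=0).fit(data)
labels = gmm.predict(data)   # region membership for each grid cell
print("fitted regional means:\n", gmm.means_)
```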
Abstract:
In this paper, we describe the evaluation of a method for building detection by Dempster-Shafer fusion of LIDAR data and multispectral images. For that purpose, ground truth was digitised for two test sites with quite different characteristics. Using these data sets, the heuristic model for the probability mass assignments of the method is validated, and rules for tuning the parameters of this model are discussed. Further, we evaluate the contributions of the individual cues used in the classification process to the quality of the classification results. Our results show the degree to which the overall correctness of the results can be improved by fusing LIDAR data with multispectral images.
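Dempster's rule of combination, the core of the fusion, is compact enough to sketch. The two cues and their mass values below are illustrative, not the paper's heuristic model.

```python
# Dempster's rule of combination over the frame {building, other}.
from itertools import product

def combine(m1, m2):
    """Combine two mass functions keyed by frozenset focal elements."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    norm = 1.0 - conflict  # renormalize by the non-conflicting mass
    return {k: v / norm for k, v in combined.items()}

B, O = frozenset({"building"}), frozenset({"other"})
Theta = B | O  # mass on the full frame expresses ignorance

m_lidar = {B: 0.7, O: 0.1, Theta: 0.2}   # e.g. tall, flat surface cue
m_image = {B: 0.5, O: 0.2, Theta: 0.3}   # e.g. low vegetation response cue
print(combine(m_lidar, m_image))
```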
Abstract:
Racing algorithms have recently been proposed as a general-purpose method for performing model selection in machine learning algorithms. In this paper, we present an empirical study of the Hoeffding racing algorithm for selecting the k parameter in a simple k-nearest neighbor classifier. Fifteen widely used classification datasets from the UCI repository are used, and experiments are conducted across different confidence levels for racing. The results reveal significant sensitivity of the k-NN classifier to its model parameter value. The Hoeffding racing algorithm also varies widely in its performance, in terms of the computational savings gained over an exhaustive evaluation. While in some cases the savings are quite small, the racing algorithm proved highly robust against erroneously eliminating the optimal models. All results were strongly dependent on the datasets used.
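A minimal sketch of racing over k follows, assuming a standard Hoeffding bound with a union correction over the candidate set; the dataset, confidence level, and candidate k values are all illustrative rather than taken from the paper.

```python
# Hoeffding racing over the k parameter of a k-NN classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_tr, y_tr, X_te, y_te = X[:400], y[:400], X[400:], y[400:]

ks = [1, 3, 5, 7, 9, 15]
preds = {k: KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr).predict(X_te)
         for k in ks}

delta = 0.05                     # racing confidence level (illustrative)
alive, errs = set(ks), {k: 0 for k in ks}
for n in range(1, len(X_te) + 1):
    for k in alive:
        errs[k] += int(preds[k][n - 1] != y_te[n - 1])
    eps = np.sqrt(np.log(2 * len(ks) / delta) / (2 * n))  # Hoeffding bound
    means = {k: errs[k] / n for k in alive}
    best = min(means.values())
    # Eliminate any k whose lower bound exceeds the leader's upper bound.
    alive = {k for k in alive if means[k] - eps <= best + eps}

print("surviving k values after racing:", sorted(alive))
```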
Abstract:
Finite element analysis (FEA) of nonlinear problems in solid mechanics is a time-consuming process, but it can deal rigorously with the geometric, contact and material nonlinearities that occur in roll forming. The simulation time limits the application of nonlinear FEA to these problems in industrial practice, so most applications of nonlinear FEA are in theoretical studies, engineering consulting or troubleshooting. Instead, quick methods based on a global assumption of the deformed shape have been used by the roll-forming industry. These approaches are of limited accuracy. This paper proposes a new form-finding method - a relaxation method - to solve the nonlinear problem of predicting the deformed shape due to plastic deformation in roll forming. This method involves applying a small perturbation to each discrete node in order to update the local displacement field while minimizing plastic work; this is applied iteratively to update the positions of all nodes. Because the method assumes a local displacement field, the strain and stress components at each node are calculated explicitly. Continued perturbation of the nodes leads to optimisation of the displacement field. Another important feature of this paper is a new approach to the consideration of strain history. For a stable, continuous process such as rolling and roll forming, the strain history of a point is represented spatially by the states at a row of nodes leading up to the current one in the rolling direction. Therefore the increments of the strain components and the work increment at a point can be found without moving the object forward. Using this method we can find the solution for rolling or roll forming in just one step. This method is expected to be faster than commercial finite element packages because it eliminates the repeated solution of large sets of simultaneous equations and the need to update boundary conditions that represent the rolls.
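The node-perturbation loop can be illustrated on a toy problem. The "plastic work" functional below is a crude surrogate (stretching energy plus a target-profile penalty), not the paper's constitutive model, and the boundary conditions and step size are invented.

```python
# Greedy node-by-node relaxation of a discretised strip (toy surrogate).
import numpy as np

n = 21
x = np.linspace(0.0, 1.0, n)
z = np.zeros(n)                      # vertical positions of strip nodes
target = 0.2 * np.sin(np.pi * x)     # desired formed profile (toy)

def work(z):
    stretch = np.sum(np.diff(z) ** 2)      # surrogate deformation energy
    forming = np.sum((z - target) ** 2)    # surrogate tool/target penalty
    return stretch + 5.0 * forming

step = 0.01
for sweep in range(200):
    for i in range(1, n - 1):              # end nodes held fixed (toy BC)
        for dz in (+step, -step):
            trial = z.copy()
            trial[i] += dz                 # perturb one node at a time
            if work(trial) < work(z):      # keep only work-reducing moves
                z = trial
print("residual work:", work(z))
```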
Abstract:
Government agencies responsible for riparian environments are assessing the combined utility of field survey and remote sensing for mapping and monitoring indicators of riparian zone condition. The objective of this work was to compare the Tropical Rapid Appraisal of Riparian Condition (TRARC) method to a satellite-image-based approach. TRARC was developed for rapid assessment of the environmental condition of savanna riparian zones. The comparison assessed mapping accuracy, representativeness of the TRARC assessment, cost-effectiveness, and suitability for multi-temporal analysis. Two multi-spectral QuickBird images captured in 2004 and 2005, and coincident field data covering sections of the Daly River in the Northern Territory, Australia, were used in this work. Both field and image data were processed to map riparian health indicators (RHIs), including percentage canopy cover, organic litter, canopy continuity, stream bank stability, and extent of tree clearing. Spectral vegetation indices, image segmentation and supervised classification were used to produce RHI maps. QuickBird image data were used to examine whether the spatial distribution of TRARC transects provided a representative sample of ground-based RHI measurements. Results showed that TRARC transects were required to cover at least 3% of the study area to obtain a representative sample. The mapping accuracy and costs of the image-based approach were compared to those of the ground-based TRARC approach. Results showed that TRARC was more cost-effective at smaller scales (1-100 km), while image-based assessment becomes more feasible at regional scales (100-1000 km). Finally, the ability to use both the image- and field-based approaches for multi-temporal analysis of RHIs was assessed. Change detection analysis demonstrated that image data can provide detailed information on gradual change, while the TRARC method was only able to identify grosser-scale changes. In conclusion, results from the two methods were found to complement each other when used at appropriate spatial scales.
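One small, assumed fragment of the image-based pipeline, a spectral vegetation index thresholded into a canopy-cover indicator, can be sketched as follows; the reflectance values and the 0.4 threshold are placeholders, not calibrated values from the study.

```python
# NDVI from red/NIR bands, thresholded into a canopy-cover indicator.
import numpy as np

rng = np.random.default_rng(0)
red = rng.uniform(0.02, 0.3, (100, 100))   # stand-in red reflectance band
nir = rng.uniform(0.2, 0.6, (100, 100))    # stand-in near-infrared band

ndvi = (nir - red) / (nir + red + 1e-9)    # normalized difference index
canopy = ndvi > 0.4                        # placeholder canopy threshold
print("percent canopy cover:", 100.0 * canopy.mean())
```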
Abstract:
The Java programming language supports concurrency. Concurrent programs are hard to test due to their inherent non-determinism. This paper presents a classification of concurrency failures that is based on a model of Java concurrency. The model and failure classification are used to justify coverage of the synchronization primitives of concurrent components. This is achieved by constructing concurrency flow graphs for each method call. A producer-consumer monitor is used to demonstrate how the approach can be used to measure coverage of concurrency primitives and thereby assist in determining test sequences for deterministic execution.
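The paper's example is a Java monitor; the sketch below is an analogous Python producer-consumer monitor, instrumented with crude counters over its wait/notify primitives of the kind a concurrency-coverage measure would track. The instrumentation scheme is an assumption for illustration, not the paper's flow-graph construction.

```python
# Producer-consumer monitor with counters over synchronization primitives.
import threading
from collections import Counter

coverage = Counter()

class BoundedBuffer:
    def __init__(self, capacity=2):
        self.items, self.capacity = [], capacity
        self.lock = threading.Condition()

    def put(self, item):
        with self.lock:
            while len(self.items) >= self.capacity:
                coverage["put.wait"] += 1    # blocked: buffer full
                self.lock.wait()
            self.items.append(item)
            coverage["put.notify"] += 1
            self.lock.notify_all()

    def get(self):
        with self.lock:
            while not self.items:
                coverage["get.wait"] += 1    # blocked: buffer empty
                self.lock.wait()
            item = self.items.pop(0)
            coverage["get.notify"] += 1
            self.lock.notify_all()
            return item

buf = BoundedBuffer()
producer = threading.Thread(target=lambda: [buf.put(i) for i in range(10)])
consumer = threading.Thread(target=lambda: [buf.get() for _ in range(10)])
producer.start(); consumer.start(); producer.join(); consumer.join()
print(dict(coverage))   # which synchronization branches were exercised
```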
Abstract:
We describe a method of recognizing handwritten digits by fitting generative models that are built from deformable B-splines with Gaussian "ink generators" spaced along the length of the spline. The splines are adjusted using a novel elastic matching procedure based on the Expectation Maximization (EM) algorithm that maximizes the likelihood of the model generating the data. This approach has many advantages. (1) After identifying the model most likely to have generated the data, the system produces not only a classification of the digit but also a rich description of the instantiation parameters, which can yield information such as the writing style. (2) During the process of explaining the image, generative models can perform recognition-driven segmentation. (3) The method involves a relatively small number of parameters, and hence training is relatively easy and fast. (4) Unlike many other recognition schemes, it does not rely on some form of pre-normalization of input images, but can handle arbitrary scalings, translations and a limited degree of image rotation. We have demonstrated that our method of fitting models to images does not get trapped in poor local minima. The main disadvantage of the method is that it requires much more computation than more standard OCR techniques.
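A heavily stripped-down sketch of the elastic-matching idea: EM fits Gaussian "ink generators" to ink-pixel coordinates. Here the generator means move freely, whereas in the paper they are constrained to lie along a deformable B-spline; the data, generator count, and variance are synthetic.

```python
# EM for Gaussian "ink generators" over ink-pixel coordinates (simplified).
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "ink": pixels scattered around three stroke locations.
ink = rng.normal([[0, 0], [1, 1], [2, 0]], 0.15, (60, 3, 2)).reshape(-1, 2)

G, sigma2 = 3, 0.25
means = ink[rng.choice(len(ink), G, replace=False)]   # initial generators

for _ in range(30):
    # E-step: responsibility of each generator for each ink pixel.
    d2 = ((ink[:, None, :] - means[None, :, :]) ** 2).sum(-1)
    logr = -0.5 * d2 / sigma2
    r = np.exp(logr - logr.max(axis=1, keepdims=True))  # stable softmax
    r /= r.sum(axis=1, keepdims=True)
    # M-step: move each generator to its responsibility-weighted mean.
    means = (r.T @ ink) / r.sum(axis=0)[:, None]

print("fitted generator positions:\n", means.round(2))
```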
Abstract:
Solving many scientific problems requires effective regression and/or classification models for large high-dimensional datasets. Experts from these problem domains (e.g. biologists, chemists, financial analysts) have insights into the domain which can be helpful in developing powerful models, but they need a modelling framework that helps them to use these insights. Data visualisation is an effective technique for presenting data and obtaining feedback from the experts. A single global regression model can rarely capture the full behavioural variability of a huge multi-dimensional dataset. Instead, local regression models, each focused on a separate area of input space, often work better, since the behaviour of different areas may vary. Classical local models such as Mixture of Experts segment the input space automatically, which is not always effective, and they do not involve the domain experts in guiding a meaningful segmentation of the input space. In this paper we address this issue by allowing domain experts to interactively segment the input space using data visualisation. The segmentation output obtained is then used to develop effective local regression models.
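The core mechanism, one regression model per expert-drawn segment, can be sketched with the "expert segmentation" replaced by a hard-coded split on one input variable; the data and split point are invented stand-ins for an interactively drawn boundary.

```python
# One local regression model per (expert-defined) input-space segment.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, (300, 1))
y = np.where(X[:, 0] < 5, 2 * X[:, 0], 20 - X[:, 0]) + rng.normal(0, 0.3, 300)

segments = {"low": X[:, 0] < 5, "high": X[:, 0] >= 5}   # expert-drawn split
models = {name: LinearRegression().fit(X[mask], y[mask])
          for name, mask in segments.items()}

def predict(x):
    name = "low" if x[0] < 5 else "high"    # route to the local model
    return models[name].predict([x])[0]

print(predict([2.0]), predict([8.0]))   # ~4.0 and ~12.0
```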
Abstract:
We consider the problem of assigning an input vector x to one of m classes by predicting P(c|x) for c = 1, ..., m. For a two-class problem, the probability of class 1 given x is estimated by s(y(x)), where s(y) = 1/(1 + e^{-y}). A Gaussian process prior is placed on y(x), and is combined with the training data to obtain predictions for new x points. We provide a Bayesian treatment, integrating over uncertainty in y and in the parameters that control the Gaussian process prior; the necessary integration over y is carried out using Laplace's approximation. The method is generalized to multi-class problems (m > 2) using the softmax function. We demonstrate the effectiveness of the method on a number of datasets.
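This construction is close to what scikit-learn's GaussianProcessClassifier implements for the binary case (a GP prior on the latent function, logistic link, and Laplace approximation), so a usage sketch is possible, with the caveat that scikit-learn handles multi-class problems one-vs-rest rather than via the softmax treatment described here.

```python
# GP classification with Laplace approximation (binary case), via scikit-learn.
from sklearn.datasets import make_moons
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0),
                                random_state=0).fit(X, y)
print("P(class 1 | x) for two test points:",
      gpc.predict_proba([[0.0, 0.5], [2.0, -0.5]])[:, 1])
```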