21 results for Bankruptcy prediction methods
at Indian Institute of Science - Bangalore - India
Abstract:
The significance of treating rainfall as a chaotic system instead of a stochastic system for a better understanding of the underlying dynamics has been taken up by various studies recently. However, an important limitation of all these approaches is their dependence on a single method for identifying the chaotic nature and the parameters involved. Many of these approaches aim only at analyzing the chaotic nature, not at its prediction. In the present study, an attempt is made to identify chaos using various techniques, and prediction is also carried out by generating ensembles in order to quantify the uncertainty involved. Daily rainfall data of three regions with contrasting characteristics (mainly in the spatial area covered), Malaprabha, Mahanadi and All-India, for the period 1955-2000 are used for the study. Auto-correlation and mutual information methods are used to determine the delay time for the phase space reconstruction. The optimum embedding dimension is determined using the correlation dimension, the false nearest neighbour algorithm and also nonlinear prediction methods. The low embedding dimensions obtained from these methods indicate the existence of low-dimensional chaos in the three rainfall series. The correlation dimension method is also applied to the phase-randomized and first-derivative versions of the data series to check whether the saturation of the dimension is due to the inherent linear correlation structure or to low-dimensional dynamics. The positive Lyapunov exponents obtained prove the exponential divergence of the trajectories and hence the limits to predictability. A surrogate data test is also done to further confirm the nonlinear structure of the rainfall series. A range of plausible parameters is used for generating an ensemble of predictions of rainfall for each year separately for the period 1996-2000, using the data up to the preceding year.
To analyze the sensitivity to initial conditions, predictions are made from two different months in a year, viz., from the beginning of January and of June. The reasonably good predictions obtained indicate the efficiency of the nonlinear prediction method for predicting the rainfall series. Also, the rank probability skill score and the rank histograms show that the ensembles generated are reliable, with a good spread and skill. A comparison of results across the three regions indicates that although all are chaotic in nature, spatial averaging over a large area can increase the dimension and improve the predictability, thus destroying the chaotic nature. (C) 2010 Elsevier Ltd. All rights reserved.
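The delay-embedding and correlation-dimension machinery described in this abstract can be sketched in a few lines. The sketch below is a generic illustration on a toy series, not the study's code: `delay_embed` is a Takens-style reconstruction, and `correlation_sum` is the Grassberger-Procaccia statistic whose saturating log-log slope signals low-dimensional dynamics.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Takens-style delay embedding: turn a scalar series x into
    m-dimensional state vectors (x[i], x[i+tau], ..., x[i+(m-1)*tau])."""
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(m)])

def correlation_sum(vectors, r):
    """Grassberger-Procaccia correlation sum C(r): the fraction of
    distinct point pairs closer than r. The slope of log C(r) vs log r
    at small r estimates the correlation dimension; saturation of that
    estimate as the embedding dimension grows is the low-dimensional
    signature the study looks for."""
    d = np.sqrt(((vectors[:, None, :] - vectors[None, :, :]) ** 2).sum(-1))
    iu = np.triu_indices(len(vectors), k=1)   # distinct pairs only
    return float((d[iu] < r).mean())

# Toy series (a noiseless sine, not rainfall), just to exercise the code.
x = np.sin(np.linspace(0.0, 20.0 * np.pi, 500))
vecs = delay_embed(x, m=3, tau=8)
```

In practice the delay `tau` would come from the auto-correlation or mutual information methods the abstract mentions, and `m` would be increased until the estimated dimension saturates.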
Abstract:
Protein structure validation is an important step in computational modeling and structure determination. Stereochemical assessment of protein structures examines internal parameters such as bond lengths and Ramachandran (phi, psi) angles. Gross structure prediction methods, such as the inverse folding procedure, and structure determination, especially at low resolution, can sometimes give rise to models that are incorrect due to assignment of misfolds or mistracing of electron density maps. Such errors are not reflected as strain in internal parameters. HARMONY is a procedure that examines the compatibility between the sequence and the structure of a protein by assigning scores to individual residues and their amino acid exchange patterns after considering their local environments. Local environments are described by the backbone conformation, solvent accessibility and hydrogen bonding patterns. We are now providing HARMONY through a web server, such that users can submit their protein structure files and, if required, an alignment of homologous sequences. Scores are mapped onto the structure for subsequent examination, which is also useful for recognizing regions of possible local error in protein structures. The HARMONY server is located at http://caps.ncbs.res.in/harmony/
Abstract:
Perfect or even mediocre weather predictions over a long period are almost impossible because of the ultimate growth of a small initial error into a significant one. Even though sensitivity to initial conditions limits the predictability of chaotic systems, an ensemble of predictions from different possible initial conditions, together with a prediction algorithm capable of resolving the fine structure of the chaotic attractor, can reduce the prediction uncertainty to some extent. The traditional chaotic prediction methods in hydrology are all based on single-optimum-initial-condition local models, which model the sudden divergence of the trajectories with different local functions. Conceptually, global models are ineffective in modeling the highly unstable structure of the chaotic attractor. This paper focuses on an ensemble prediction approach that reconstructs the phase space using different combinations of the chaotic parameters, i.e., embedding dimension and delay time, to quantify the uncertainty in initial conditions. The ensemble approach is implemented through a local-learning wavelet network model with a global feed-forward neural network structure for the phase space prediction of chaotic streamflow series. Uncertainty in future predictions is quantified by creating an ensemble of predictions with the wavelet network using a range of plausible embedding dimensions and delay times. The ensemble approach proved to be 50% more efficient than single prediction for both the local approximation and wavelet network approaches, and the wavelet network approach proved to be 30%-50% superior to the local approximation approach. Compared to the traditional local approximation approach with a single initial condition, the total predictive uncertainty in the streamflow is reduced when modeled with ensemble wavelet networks for different lead times.
The localization property of wavelets, utilizing different dilation and translation parameters, helps in capturing most of the statistical properties of the observed data. The need to take into account all plausible initial conditions, and to bring together the characteristics of both local and global approaches to model the unstable yet ordered chaotic attractor of a hydrologic series, is clearly demonstrated.
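The ensemble-over-parameters idea can be illustrated with a minimal sketch. A simple nearest-neighbour local approximation stands in here for the paper's wavelet network; the function names and parameter values are illustrative, not the study's.

```python
import numpy as np

def local_predict(x, m, tau):
    """One-step-ahead local approximation: embed the series with
    embedding dimension m and delay tau, find the nearest historical
    neighbour of the current state, and return that neighbour's
    successor. (A stand-in for the paper's wavelet network; the
    ensemble-over-parameters idea is the same.)"""
    n = len(x) - (m - 1) * tau
    vecs = np.column_stack([x[i * tau : i * tau + n] for i in range(m)])
    current, history = vecs[-1], vecs[:-1]
    j = np.argmin(np.linalg.norm(history - current, axis=1))
    return x[j + (m - 1) * tau + 1]   # successor of neighbour j

def ensemble_predict(x, dims, delays):
    """Ensemble of predictions over plausible (m, tau) pairs; the spread
    of the ensemble quantifies the reconstruction-parameter uncertainty."""
    return [local_predict(x, m, tau) for m in dims for tau in delays]
```

Each (embedding dimension, delay) pair yields one member, so the ensemble size is `len(dims) * len(delays)`; the study additionally varies the starting point to capture initial-condition uncertainty.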
Abstract:
Depth measures the extent of atom/residue burial within a protein. It correlates with properties such as protein stability, hydrogen exchange rate, protein-protein interaction hot spots, post-translational modification sites and sequence variability. Our server, DEPTH, accurately computes depth and solvent-accessible surface area (SASA) values. We show that depth can be used to predict small-molecule ligand binding cavities in proteins. Often, some of the residues lining a ligand binding cavity are both deep and solvent exposed. Using the depth-SASA pair values for a residue, its likelihood to form part of a small-molecule binding cavity is estimated. The parameters of the method were calibrated over a training set of 900 high-resolution X-ray crystal structures of single-domain proteins bound to small molecules (molecular weight < 1.5 kDa). The prediction accuracy of DEPTH is comparable to that of other geometry-based prediction methods, including LIGSITE, SURFNET and Pocket-Finder (all with a Matthews correlation coefficient of ~0.4), over a testing set of 225 single- and multi-chain protein structures. Users have the option of tuning several parameters to detect cavities of different sizes, for example, geometrically flat binding sites. The input to the server is a protein 3D structure in PDB format. Users can also tune the values of four parameters associated with the computation of residue depth and the prediction of binding cavities. The computed depths, SASA and binding cavity predictions are displayed in 2D plots and mapped onto 3D representations of the protein structure using Jmol. Links are provided to download the outputs. Our server is useful for all structural analyses based on residue depth and SASA, such as guiding site-directed mutagenesis experiments and small-molecule docking exercises, in the context of protein functional annotation and drug discovery.
Abstract:
The rapidly growing structure databases enhance the probability of finding identical sequences sharing structural similarity. Structure prediction methods are being used extensively to bridge the gap between known protein sequences and solved structures, which is essential to understand their specific biochemical and cellular functions. In this work, we study the ambiguity in the sequence-structure relationship and examine whether sequentially identical peptide fragments adopt similar three-dimensional structures. Fragments of varying lengths (five to ten residues) were used to observe the behavior of sequences and their three-dimensional structures. The STAMP program was used to superpose the three-dimensional structures, and two parameters (the Sequence Structure Similarity Score (Sc) and the Root Mean Square Deviation value) were employed to classify them into three categories: similar, intermediate and dissimilar structures. Furthermore, the same approach was carried out on all the three-dimensional protein structures solved in two organisms, Mycobacterium tuberculosis and Plasmodium falciparum, to validate our results.
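A minimal sketch of the superposition step: STAMP itself performs a full structural alignment, but for two equal-length fragments the RMSD after optimal superposition can be computed with the standard Kabsch algorithm. This numpy version is a generic illustration, not STAMP's implementation.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two equal-length coordinate sets (rows are atoms)
    after optimal rigid superposition via the Kabsch algorithm - the
    quantity used to call a fragment pair similar or dissimilar."""
    P = P - P.mean(axis=0)               # centre both point sets
    Q = Q - Q.mean(axis=0)
    H = P.T @ Q                          # 3x3 covariance matrix
    U, S, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(U @ Vt))   # avoid improper rotation
    R = U @ np.diag([1.0, 1.0, d]) @ Vt  # optimal rotation
    return float(np.sqrt(np.mean(np.sum((P @ R - Q) ** 2, axis=1))))
```

With coordinates for two five- to ten-residue fragments, a low `kabsch_rmsd` value (together with a high similarity score) would place the pair in the "similar" category.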
Abstract:
Further improvement in performance, to achieve near-transparent-quality LSF quantization, is shown to be possible by using a higher-order two-dimensional (2-D) prediction in the coefficient domain. The prediction is performed in a closed-loop manner, so that the LSF reconstruction error is the same as the quantization error of the prediction residual. We show that an optimum 2-D predictor, exploiting both inter-frame and intra-frame correlations, performs better than existing predictive methods. A computationally efficient split vector quantization technique is used to implement the proposed 2-D prediction based method. We show further improvement in performance by using a weighted Euclidean distance.
Abstract:
NDDO-based (AM1) configuration interaction (CI) calculations have been used to calculate the wavelengths and oscillator strengths of electronic absorptions in organic molecules, and the results used in a sum-over-states treatment to calculate second-order hyperpolarizabilities. The results for both spectra and hyperpolarizabilities are of acceptable quality as long as a suitable CI expansion is used. We have found that using an active space of eight electrons in eight orbitals and including all single and pair-double excitations in the CI leads to results that agree well with experiment and that do not change significantly with increasing active space for most organic molecules. Calculated second-order hyperpolarizabilities using this type of CI within a sum-over-states calculation appear to be of useful accuracy.
Abstract:
Lateral or transaxial truncation of cone-beam data can occur either due to the field-of-view limitation of the scanning apparatus or due to region-of-interest tomography. In this paper, we suggest two new methods to handle lateral truncation in helical scan CT. It is seen that reconstruction with laterally truncated projection data, assuming it to be complete, gives severe artifacts which even penetrate into the field of view. A row-by-row data completion approach using linear prediction is introduced for helical scan truncated data. An extension of this technique, known as the windowed linear prediction approach, is also introduced. The efficacy of the two techniques is shown using simulations with standard phantoms. A quantitative image quality measure of the resulting reconstructed images is used to evaluate the performance of the proposed methods against an extension of a standard existing technique.
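A hedged sketch of the row-by-row completion idea: fit linear-prediction (AR) coefficients to the known part of a projection row by least squares, then extrapolate into the truncated region. This is a generic AR extrapolation, not the paper's exact method or its windowed variant.

```python
import numpy as np

def lp_extend(row, order, n_missing):
    """Fit an AR model row[t] = sum_k c_k * row[t-1-k] to the known
    samples of a projection row via least squares, then extrapolate
    n_missing samples beyond the truncation boundary."""
    # Design matrix: column k holds row[t-1-k] for t = order..len(row)-1.
    A = np.column_stack(
        [row[order - k - 1 : len(row) - k - 1] for k in range(order)]
    )
    b = row[order:]
    coefs, *_ = np.linalg.lstsq(A, b, rcond=None)
    out = list(row)
    for _ in range(n_missing):
        # Predict from the last `order` samples, most recent first.
        out.append(float(np.dot(coefs, out[-1 : -order - 1 : -1])))
    return np.array(out)
```

In the completion scheme described above, each truncated detector row would be extended this way before reconstruction, so the filtered backprojection no longer sees an abrupt data cutoff.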
Abstract:
Non-stationary signal modeling is a well-addressed problem in the literature. Many methods have been proposed to model non-stationary signals, such as time-varying linear prediction and AM-FM modeling, the latter being more popular. Estimation techniques to determine the AM-FM components of a narrow-band signal, such as the Hilbert transform, DESA1, DESA2, the auditory processing approach, the ZC approach, etc., are prevalent, but their robustness to noise is not clearly addressed in the literature. This is critical for most practical applications, such as in communications. We explore the robustness of different AM-FM estimators in the presence of white Gaussian noise. We also propose three new methods for IF estimation based on non-uniform samples of the signal and multi-resolution analysis. Experimental results show that ZC-based methods give better results than popular methods such as DESA in both clean and noisy conditions.
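The ZC idea can be sketched as follows: locate the signal's sign changes by linear interpolation and convert the spacing of successive crossings into local frequency estimates. This is a minimal single-resolution version with our own function name and interface, not the paper's multi-resolution estimators.

```python
import numpy as np

def zc_instantaneous_frequency(x, fs):
    """Zero-crossing (ZC) frequency estimation: each sign change of x is
    located by linear interpolation between the straddling samples, and
    every half-period between successive crossings yields one local
    frequency estimate f = fs / (2 * spacing_in_samples)."""
    s = np.sign(x)
    idx = np.where(s[:-1] * s[1:] < 0)[0]   # samples straddling a crossing
    frac = x[idx] / (x[idx] - x[idx + 1])   # sub-sample offset in [0, 1)
    crossings = idx + frac                  # crossing positions in samples
    half_periods = np.diff(crossings)
    return fs / (2.0 * half_periods)        # one estimate per half-period
```

For an AM-FM signal the sequence of estimates tracks the instantaneous frequency at the zero-crossing instants, i.e. at non-uniform sample points, which is the starting point for the non-uniform-sampling methods the abstract mentions.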
Abstract:
The determination of the overconsolidation ratio (OCR) of clay deposits is an important task in geotechnical engineering practice. This paper examines the potential of a support vector machine (SVM) for predicting the OCR of clays from piezocone penetration test data. SVM is a statistical learning method based on the structural risk minimization principle, which minimizes both error and weight terms. The five input variables used for the SVM model for prediction of OCR are the corrected cone resistance (qt), vertical total stress (sigmav), hydrostatic pore pressure (u0), pore pressure at the cone tip (u1), and pore pressure just above the cone base (u2). A sensitivity analysis has been performed to investigate the relative importance of each of the input parameters. From the sensitivity analysis, it is clear that qt is the primary in situ datum influenced by OCR, followed by sigmav, u0, u2, and u1. A comparison between SVM and some of the traditional interpretation methods is also presented. The results of this study show that the SVM approach has the potential to be a practical tool for determination of OCR.
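Assuming scikit-learn is available, an SVM regression with the five inputs named above might be set up as follows. The training data here are synthetic stand-ins with a made-up target relation, not the paper's piezocone records, and the hyperparameter values are illustrative.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Hypothetical piezocone records: rows of [qt, sigmav, u0, u1, u2],
# with a toy OCR target. Real field data would replace these values.
rng = np.random.default_rng(0)
X = rng.uniform(low=[500, 50, 20, 60, 40],
                high=[5000, 400, 200, 600, 400], size=(80, 5))
ocr = 0.002 * X[:, 0] / (0.01 * X[:, 1] + 1.0) + rng.normal(0.0, 0.1, 80)

# Scale the inputs, then fit an RBF-kernel SVM regressor; C trades off
# fit error against model complexity (structural risk minimization).
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.05))
model.fit(X, ocr)
```

A sensitivity analysis like the paper's could then be approximated by perturbing each input column in turn and measuring the change in `model.predict`.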
Abstract:
The swelling pressure of soil depends upon various soil parameters such as mineralogy, clay content, Atterberg's limits, dry density, moisture content and initial degree of saturation, along with structural and environmental factors. It is very difficult to model and analyze swelling pressure effectively taking all of the above aspects into consideration. Various statistical/empirical methods have been attempted to predict the swelling pressure based on index properties of soil. In this paper, the computational intelligence techniques of artificial neural networks and support vector machines have been used to develop models, based on the set of available experimental results, to predict swelling pressure from the inputs: natural moisture content, dry density, liquid limit, plasticity index, and clay fraction. The generalization of the models to data other than the training set, which is required for the successful application of a model, is discussed. A detailed study of the relative performance of the computational intelligence techniques has been carried out based on different statistical performance criteria.
Abstract:
Representation and quantification of uncertainty in climate change impact studies are difficult tasks. Several sources of uncertainty arise in studies of the hydrologic impacts of climate change, such as those due to the choice of general circulation models (GCMs), scenarios and downscaling methods. Recently, much work has focused on uncertainty quantification and modeling in regional climate change impacts. In this paper, an uncertainty modeling framework is evaluated which uses a generalized uncertainty measure to combine GCM, scenario and downscaling uncertainties. The Dempster-Shafer (D-S) evidence theory is used for representing and combining uncertainty from various sources. A significant advantage of the D-S framework over the traditional probabilistic approach is that it allows for the allocation of a probability mass to sets or intervals, and can hence handle both aleatory (stochastic) uncertainty and epistemic (subjective) uncertainty. This paper shows how D-S theory can be used to represent beliefs in hypotheses such as hydrologic drought or wet conditions, describe uncertainty and ignorance in the system, and give a quantitative measure of belief and plausibility in the results. The D-S approach has been used in this work for information synthesis using various evidence combination rules having different conflict modeling approaches. A case study is presented for hydrologic drought prediction using downscaled streamflow in the Mahanadi River at Hirakud in Orissa, India. Projections of the n most likely monsoon streamflow sequences are obtained from a conditional random field (CRF) downscaling model, using an ensemble of three GCMs for three scenarios, and are converted to monsoon standardized streamflow index (SSFI-4) series. This range is used to specify the basic probability assignment (bpa) for a Dempster-Shafer structure, which represents the uncertainty associated with each of the SSFI-4 classifications.
These uncertainties are then combined across GCMs and scenarios using various evidence combination rules given by the D-S theory. A Bayesian approach is also presented for this case study, which models the uncertainty in projected frequencies of SSFI-4 classifications by deriving a posterior distribution for the frequency of each classification, using an ensemble of GCMs and scenarios. Results from the D-S and Bayesian approaches are compared, and relative merits of each approach are discussed. Both approaches show an increasing probability of extreme, severe and moderate droughts and decreasing probability of normal and wet conditions in Orissa as a result of climate change. (C) 2010 Elsevier Ltd. All rights reserved.
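Dempster's rule of combination, the core of the evidence-combination step described above, can be sketched directly. The frame of discernment and the mass values below are illustrative, not the study's results.

```python
def dempster_combine(m1, m2):
    """Dempster's rule: combine two basic probability assignments (bpa)
    over subsets of a frame of discernment, represented here as
    frozensets of outcome labels. Mass falling on the empty set is
    conflict; it is discarded and the rest renormalized by 1 - K."""
    combined = {}
    conflict = 0.0
    for A, a in m1.items():
        for B, b in m2.items():
            C = A & B
            if C:
                combined[C] = combined.get(C, 0.0) + a * b
            else:
                conflict += a * b
    k = 1.0 - conflict
    return {C: v / k for C, v in combined.items()}

# Toy frame {drought, normal, wet}: two evidence sources (think: two
# GCM/scenario combinations) assign mass to subsets of outcomes.
m1 = {frozenset({"drought"}): 0.6, frozenset({"drought", "normal"}): 0.4}
m2 = {frozenset({"drought"}): 0.5, frozenset({"normal", "wet"}): 0.5}
m = dempster_combine(m1, m2)
```

Belief and plausibility of a hypothesis then follow from the combined bpa: belief sums the mass of all subsets of the hypothesis, plausibility the mass of all subsets intersecting it. Other combination rules mentioned in the abstract differ mainly in how the conflict mass `K` is redistributed.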
Abstract:
The importance of long-range prediction of rainfall pattern for devising and planning agricultural strategies cannot be overemphasized. However, the prediction of rainfall pattern remains a difficult problem and the desired level of accuracy has not been reached. The conventional methods for prediction of rainfall use either dynamical or statistical modelling. In this article we report the results of a new modelling technique using artificial neural networks. Artificial neural networks are especially useful where the dynamical processes and their interrelations for a given phenomenon are not known with sufficient accuracy. Since conventional neural networks were found to be unsuitable for simulating and predicting rainfall patterns, a generalized structure of a neural network was then explored and found to provide consistent prediction (hindcast) of all-India annual mean rainfall with good accuracy. Performance and consistency of this network are evaluated and compared with those of other (conventional) neural networks. It is shown that the generalized network can make consistently good prediction of annual mean rainfall. Immediate application and potential of such a prediction system are discussed.
Abstract:
Molecular understanding of disease processes can be accelerated if all interactions between the host and the pathogen are known. The unavailability of experimental methods for large-scale detection of interactions across host and pathogen organisms hinders this process. Here we apply a simple method to predict protein-protein interactions between a host and a pathogen. We use homology detection approaches against the protein-protein interaction databases DIP and iPfam in order to predict interacting protein pairs in a host-pathogen system. In the present work, we first applied this approach to the test cases of phage T4 - Escherichia coli and phage lambda - E. coli, and show that previously known interactions could be recognized using our approach. We then applied the approach to predict interactions between human and three pathogens: E. coli, Salmonella enterica Typhimurium and Yersinia pestis. We identified several novel interactions involving host or pathogen proteins that could be highly relevant to the disease process. Serendipitously, many interactions involve hypothetical proteins of as yet unknown function. Hypothetical proteins are predicted from computational analysis of genome sequences, with no laboratory analysis of their functions yet available. The predicted interactions involving such proteins could provide hints to their functions. (C) 2011 Elsevier B.V. All rights reserved.
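The homology-transfer (interolog) idea behind this approach can be sketched as a simple lookup: if a host protein and a pathogen protein are homologous to a known interacting pair, predict an interaction between them. In practice the homology maps would come from sequence searches against DIP/iPfam templates; the protein names below are hypothetical.

```python
def predict_interologs(known_interactions, homologs_host, homologs_pathogen):
    """Interolog-style transfer: given known template interactions
    (pairs of template protein IDs) and homology maps from host and
    pathogen proteins to template proteins, predict host-pathogen
    interacting pairs."""
    known = {frozenset(pair) for pair in known_interactions}
    predictions = set()
    for h, a in homologs_host.items():
        for p, b in homologs_pathogen.items():
            if frozenset((a, b)) in known:
                predictions.add((h, p))
    return predictions

# Illustrative toy data (not from DIP/iPfam).
known = [("A", "B"), ("C", "D")]
host_map = {"hostP1": "A", "hostP2": "C"}        # host protein -> template
pathogen_map = {"pathX": "B", "pathY": "E"}      # pathogen protein -> template
preds = predict_interologs(known, host_map, pathogen_map)
```

Real pipelines add filters (alignment coverage, E-value cutoffs, domain-pair support from iPfam) before accepting a transferred interaction.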