22 results for feature analysis
at Indian Institute of Science - Bangalore - India
Abstract:
Classification of a large document collection involves dealing with a huge feature space where each distinct word is a feature. In such an environment, classification is a costly task in terms of both running time and computing resources. Further, it does not guarantee optimal results, because a classifier that considers every feature is likely to overfit. In such a context, feature selection is unavoidable. This work analyses the feature selection methods, explores the relations among them and attempts to find a minimal subset of features which are discriminative for document classification.
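As a rough illustration of word-level feature selection, the sketch below scores each word with the chi-square statistic of its 2x2 word/class contingency table and keeps the highest-scoring words. The abstract does not say which selection methods were analysed, so the choice of chi-square, the function names `chi_square_scores` and `select_top_k`, and the toy documents are all assumptions made for this example.

```python
from collections import Counter


def chi_square_scores(docs, labels, positive):
    """Score each word by the chi-square statistic of its 2x2 word/class table."""
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = n - n_pos
    df_pos, df_neg = Counter(), Counter()
    for words, y in zip(docs, labels):
        # Count document frequency, so each word counts once per document.
        (df_pos if y == positive else df_neg).update(set(words))
    scores = {}
    for w in set(df_pos) | set(df_neg):
        a = df_pos[w]      # positive documents containing w
        b = df_neg[w]      # negative documents containing w
        c = n_pos - a      # positive documents without w
        d = n_neg - b      # negative documents without w
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[w] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k best-scoring words, breaking ties alphabetically."""
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


# Hypothetical toy corpus, used only to exercise the functions.
docs = [
    "cheap pills buy now".split(),
    "buy cheap watches now".split(),
    "meeting agenda for monday".split(),
    "monday project meeting notes".split(),
]
labels = ["spam", "spam", "ham", "ham"]
scores = chi_square_scores(docs, labels, positive="spam")
top = select_top_k(scores, 4)
```

Words that appear in every document of one class and in none of the other receive the highest score, while words seen only once score lower, which is the kind of ranking a minimal discriminative subset would be drawn from.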
Abstract:
The minimum cost classifier when general cost functions are associated with the tasks of feature measurement and classification is formulated as a decision graph which does not reject class labels at intermediate stages. Noting its complexities, a heuristic procedure to simplify this scheme to a binary decision tree is presented. The optimization of the binary tree in this context is carried out using dynamic programming. This technique is applied to the voiced-unvoiced-silence classification in speech processing.
Abstract:
The importance of interlaminar stresses has prompted a fresh look at the theory of laminated plates. An important feature in modelling such laminates is the need to provide for continuity of some strains and stresses, while at the same time allowing for the discontinuities in the others. A new modelling possibility is examined in this paper. The procedure allows for discontinuities in the in-plane stresses and transverse strains and continuity in the in-plane strains and transverse stresses. This theory is in the form of a hierarchy of formulations, each representing an iterative step. Application of the theory is illustrated by considering the example of an infinite laminated strip subjected to sinusoidal loading.
Abstract:
A method to identify β-sheets in globular proteins from extended strands, using only α-carbon positions, has been developed. The strands that form β-sheets are picked up by means of simple distance criteria. The method has been tested by applying it to three proteins with accurately known secondary structures. It has also been applied to ten other proteins wherein only α-carbon coordinates are available, and the list of β-sheets obtained. The following points are worth noting: (i) The sheets identified by the algorithm are found to agree satisfactorily with the reported ones based on backbone hydrogen bonding, wherever this information is available. (ii) β-Strands that do not form parts of any sheet are a common feature of protein structures. (iii) Such isolated β-strands tend to be short. (iv) The conformation corresponding to the preferred right-handed twist of the sheet is overwhelmingly observed in both the sheet-forming and isolated β-strands.
Abstract:
The application of Gaussian Quadrature (GQ) procedures to the evaluation of i-E curves in linear sweep voltammetry is advocated. It is shown that a high degree of precision is achieved with these methods and that the values obtained through GQ are in good agreement with (and even better than) the values reported in the literature, for example by Nicholson and Shain. Another welcome feature of GQ is that it can also be interpreted as an elegant, efficient analytic approximation scheme. A comparison of the values obtained by this approach and by a recent scheme based on series approximation proposed by Oldham is made, and excellent agreement is shown to exist.
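To show how few function evaluations Gaussian quadrature needs for high precision, here is a minimal sketch of the 3-point Gauss-Legendre rule. It is not the voltammetric calculation from the paper: the integrands are stand-ins chosen because their exact integrals are known, and the helper names `gauss_legendre_3` and `composite` are assumptions of this example.

```python
import math

# Nodes and weights of the 3-point Gauss-Legendre rule on [-1, 1].
NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def gauss_legendre_3(f, a, b):
    """Approximate the integral of f over [a, b] with three evaluations."""
    half = 0.5 * (b - a)
    mid = 0.5 * (a + b)
    # Map each reference node from [-1, 1] onto [a, b].
    return half * sum(w * f(mid + half * x) for x, w in zip(NODES, WEIGHTS))


def composite(f, a, b, panels):
    """Apply the 3-point rule on equal sub-intervals and add the results."""
    h = (b - a) / panels
    return sum(gauss_legendre_3(f, a + i * h, a + (i + 1) * h)
               for i in range(panels))


# Stand-in integrands with known exact values (not the i-E integrals).
poly = gauss_legendre_3(lambda x: x**5 + x**4, 0.0, 1.0)  # exact value 11/30
smooth = composite(math.exp, 0.0, 1.0, 4)                 # exact value e - 1
```

The 3-point rule integrates polynomials up to degree five exactly, so `poly` matches 11/30 to rounding error, and twelve evaluations of the exponential already agree with e - 1 to better than 1e-8.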
Abstract:
In an earlier paper [1], it has been shown that velocity ratio, defined with reference to the analogous circuit, is a basic parameter in the complete analysis of a linear one-dimensional dynamical system. In this paper it is shown that the terms constituting velocity ratio can be readily determined by means of an algebraic algorithm developed from a heuristic study of the process of transfer matrix multiplication. The algorithm permits the set of most significant terms at a particular frequency of interest to be identified from a knowledge of the relative magnitudes of the impedances of the constituent elements of a proposed configuration. This feature makes the algorithm a potential tool in a first approach to a rational design of a complex dynamical filter. This algorithm is particularly suited for the desk analysis of a medium size system with lumped as well as distributed elements.
Abstract:
Mycobacterium leprae recA harbors an in-frame insertion sequence that encodes an intein homing endonuclease (PI-MleI). Most inteins (intein endonucleases) possess two conserved LAGLIDADG (DOD) motifs at their active center. A common feature of LAGLIDADG-type homing endonucleases is that they recognize and cleave the same or very similar DNA sequences. However, PI-MleI is distinct from other members of the family of LAGLIDADG-type HEases in its modular structure, with functionally separable domains for DNA binding and cleavage, each with distinct sequence preferences. Sequence alignment analyses of PI-MleI revealed three putative LAGLIDADG motifs; however, there are conflicting bioinformatics data regarding their identity and specific location within the intein polypeptide. To resolve this conflict and to determine the active-site residues essential for DNA target site recognition and double-stranded DNA cleavage, we performed site-directed mutagenesis of presumptive catalytic residues in the LAGLIDADG motifs. Analysis of target DNA recognition and kinetic parameters of the wild-type PI-MleI and its variants disclosed that the two amino acid residues, Asp(122) (in Block C) and Asp(193) (in functional Block E), are crucial to the double-stranded DNA endonuclease activity, whereas Asp(218) (in pseudo-Block E) is not. However, despite the reduced catalytic activity, the PI-MleI variants, like the wild-type PI-MleI, generated a footprint of the same length around the insertion site. The D122T variant showed significantly reduced catalytic activity, and although the D122A and D193A mutations did not affect DNA-binding affinity, they abolished the double-stranded DNA cleavage activity. On the other hand, the D122C variant showed approximately twofold higher double-stranded DNA cleavage activity than the wild-type PI-MleI.
These results provide compelling evidence that Asp(122) and Asp(193) in DOD motif I and II, respectively, are bona fide active-site residues essential for DNA cleavage activity. The implications of these results are discussed in this report.
Abstract:
With the increased utilization of advanced composites in strategic industries, the concept of Structural Health Monitoring (SHM), with its inherent advantages, is gaining ground over the conventional methods of NDE and NDI. The most attractive feature of this concept is on-line evaluation using embedded sensors. Consequently, developing methodologies and identifying appropriate sensors, such as PVDF films, become the key to exploiting the new concept. Of the methods used for on-line evaluation, acoustic emission has been the most effective. Thus, Acoustic Emission (AE) generated during static tensile loading of glass fiber reinforced plastic composites was monitored using a polyvinylidene fluoride (PVDF) film sensor. The frequency response of the film sensor was obtained with pencil lead breakage tests to choose the appropriate band of operation. The specimens considered for the experiments were chosen to characterize the differences in the operation of the failure mechanisms through AE parametric analysis. The results of the investigations can be characterized using AE parameters, indicating that a PVDF film sensor is effective as an AE sensor for on-line structural health monitoring.
Abstract:
802.11 WLANs are characterized by high bit error rate and frequent changes in network topology. The key feature that distinguishes WLANs from wired networks is the multi-rate transmission capability, which helps to accommodate a wide range of channel conditions. This has a significant impact on higher layers such as the routing and transport levels. While many WLAN products provide rate control at the hardware level to adapt to the channel conditions, some chipsets, such as Atheros, do not support automatic rate control. We first present a design and implementation of an FER-based automatic rate control state machine, which utilizes the statistics available at the device driver to find the optimal rate. The results show that the proposed rate switching mechanism adapts quite fast to the channel conditions. The hop count metric used by current routing protocols has proven itself for single-rate networks, but it fails to take into account other important factors in a multi-rate network environment. We propose transmission time as a better path quality metric to guide routing decisions. It incorporates the effects of contention for the channel, the air time to send the data and the asymmetry of links. In this paper, we present a new design for a multi-rate mechanism as well as a new routing metric that is responsive to the rate. We address the issues involved in using transmission time as a metric and present a comparison of the performance of different metrics for dynamic routing.
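The difference between hop count and a transmission-time metric can be sketched with a shortest-path search run under two cost functions. The abstract does not give the metric's formula, so the cost used here (air time of one frame divided by the success probability 1 - FER) is an assumed simplification that ignores channel contention. The names `link_time` and `best_path`, the 12000-bit packet size and the example topology are likewise hypothetical.

```python
import heapq


def link_time(rate_mbps, fer, packet_bits=12000):
    """Assumed cost: air time of one attempt in microseconds, scaled by the
    expected number of attempts 1 / (1 - FER). Contention is not modelled."""
    return packet_bits / rate_mbps / (1.0 - fer)


def best_path(graph, src, dst, cost):
    """Dijkstra over a directed graph {node: {neighbor: (rate_mbps, fer)}}."""
    queue = [(0.0, src, [src])]
    done = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == dst:
            return total, path
        if node in done:
            continue
        done.add(node)
        for nxt, (rate, fer) in graph.get(node, {}).items():
            if nxt not in done:
                heapq.heappush(queue, (total + cost(rate, fer), nxt, path + [nxt]))
    return float("inf"), []


# Hypothetical topology: one slow, lossy direct link and three fast hops.
graph = {
    "A": {"D": (1.0, 0.2), "B": (54.0, 0.05)},
    "B": {"C": (54.0, 0.05)},
    "C": {"D": (54.0, 0.05)},
}
hop_cost, hop_path = best_path(graph, "A", "D", lambda rate, fer: 1.0)
time_cost, time_path = best_path(graph, "A", "D", link_time)
```

Under hop count the direct 1 Mb/s link wins, while the time-based cost prefers the three 54 Mb/s hops. Because the graph is directed, an asymmetric link can be represented by giving each direction its own rate and FER.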
Abstract:
Extraction of text areas from document images with complex content and layout is a challenging task. A few texture-based techniques have already been proposed for extracting such text blocks. Most of these techniques are computationally expensive and hence far from being realizable in real time. In this work, we propose a modification to two of the existing texture-based techniques to reduce the computation. This is accomplished with Harris corner detectors. The efficiency of these two texture-based algorithms, one based on Gabor filters and the other on log-polar wavelet signatures, is compared. A combination of Gabor-feature-based texture classification performed on a smaller set of Harris-corner-detected points is observed to deliver both accuracy and efficiency.
Abstract:
Homomorphic analysis and pole-zero modeling of electrocardiogram (ECG) signals are presented in this paper. Four typical ECG signals are considered and deconvolved into their minimum and maximum phase components through cepstral filtering, with a view to studying the possibility of more efficient feature selection from the component signals for diagnostic purposes. The complex cepstra of the signals are linearly filtered to extract the basic wavelet and the excitation function. The ECG signals are, in general, mixed phase and hence exponential weighting is done to aid deconvolution of the signals. The basic wavelet for normal ECG approximates the action potential of the muscle fiber of the heart, and the excitation function corresponds to the excitation pattern of the heart muscles during a cardiac cycle. The ECG signals and their components are pole-zero modeled, and the pole-zero pattern of the models can give a clue to classify the normal and abnormal signals. Besides, storing only the parameters of the model can result in a data reduction of more than 3:1 for normal signals sampled at a moderate 128 samples/s.
Abstract:
Low-humidity monoclinic lysozyme, resulting from a water-mediated transformation, has one of the lowest solvent contents (22% by volume) observed in a protein crystal. Its structure has been solved by the molecular replacement method and refined to an R value of 0.175 for 7684 observed reflections in the 10–1.75 Å resolution shell. 90% of the solvent in the well ordered crystals could be located. Favourable sites of hydration on the protein surface include side chains with multiple hydrogen-bonding centres, and regions between short hydrophilic side chains and the main-chain CO or NH groups of the same or nearby residues. Major secondary structural features are not disrupted by hydration. However, the free CO groups at the C termini and, to a lesser extent, the NH groups at the N termini of helices provide favourable sites for water interactions, as do reverse turns and regions which connect β-structure and helices. The hydration shell consists of discontinuous networks of water molecules, the maximum number of molecules in a network being ten. The substrate-binding cleft is heavily hydrated, as is the main loop region which is stabilized by water interactions. The protein molecules are close packed in the crystals with a molecular coordination number of 14. Arginyl residues are extensively involved in intermolecular hydrogen bonds and water bridges. The water molecules in the crystal are organized into discrete clusters. A distinctive feature of the clusters is the frequent occurrence of three-membered rings. The protein molecules undergo substantial rearrangement during the transformation from the native to the low-humidity form. The main-chain conformations in the two forms are nearly the same, but differences exist in the side-chain conformation. The differences are particularly pronounced in relation to Trp 62 and Trp 63. The shift in Trp 62 is especially interesting as it is also known to move during inhibitor binding.
Abstract:
An intelligent computer aided defect analysis (ICADA) system, based on artificial intelligence techniques, has been developed to identify design, process or material parameters which could be responsible for the occurrence of defective castings in a manufacturing campaign. The data on defective castings for a particular time frame, which is an input to the ICADA system, has been analysed. It was observed that a large proportion, i.e. 50-80% of all the defective castings produced in a foundry, have two, three or four types of defects occurring above a threshold proportion, say 10%. Also, a large number of defect types are either not found at all or found in a very small proportion, with a threshold value below 2%. An important feature of the ICADA system is the recognition of this pattern in the analysis. Thirty casting defect types and a large number of causes, numbering between 50 and 70 for each, as identified in the AFS analysis of casting defects (the standard reference source for a casting process), constituted the foundation for building the knowledge base. The scientific rationale underlying the formation of a defect during the casting process was identified and 38 metacauses were coded. Process, material and design parameters which contribute to the metacauses were systematically examined and 112 were identified as root causes. The interconnections between defects, metacauses and root causes were represented as a three-tier structured graph, and the handling of uncertainty in the occurrence of events such as defects, metacauses and root causes was achieved by Bayesian analysis. The hill climbing search technique, associated with forward reasoning, was employed to recognize one or several root causes.
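The Bayesian step can be pictured with a single application of Bayes' rule that ranks candidate causes once a defect has been observed. This is only a one-level sketch, not the three-tier graph or the hill climbing search of the ICADA system. The cause names, the prior and likelihood values and the function name `posterior` are all hypothetical.

```python
def posterior(priors, likelihoods):
    """Bayes' rule: P(cause | defect) is proportional to
    P(cause) * P(defect | cause), normalised over the listed causes."""
    joint = {cause: priors[cause] * likelihoods[cause] for cause in priors}
    total = sum(joint.values())
    return {cause: value / total for cause, value in joint.items()}


# Hypothetical numbers, chosen only to exercise the function.
priors = {
    "low pouring temperature": 0.2,
    "high moisture in sand": 0.5,
    "poor venting": 0.3,
}
likelihoods = {  # assumed P(observed defect | cause)
    "low pouring temperature": 0.1,
    "high moisture in sand": 0.6,
    "poor venting": 0.3,
}
post = posterior(priors, likelihoods)
most_likely = max(post, key=post.get)
```

Ranking causes by posterior probability gives a search procedure a natural ordering in which to examine them.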
Abstract:
Feature extraction in bilingual OCR is handicapped by the increase in the number of classes or characters to be handled. This is evident in the case of Indian languages, whose alphabet set is large. It is expected that the complexity of the feature extraction process increases with the number of classes. Though the best set of features cannot be determined through any quantitative measure, the characteristics of the scripts can help decide on the feature extraction procedure. This paper describes a hierarchical feature extraction scheme for recognition of printed bilingual (Tamil and Roman) text. The scheme divides the combined alphabet set of both the scripts into subsets by the extraction of certain spatial and structural features. Three features, viz. geometric moments, DCT-based features and wavelet-transform-based features, are extracted from the grouped symbols, and a linear transformation is performed on them for the purpose of efficient representation in the feature space. The transformation is obtained by the maximization of certain criterion functions. Three techniques (principal component analysis, maximization of Fisher's ratio and maximization of a divergence measure) have been employed to estimate the transformation matrix. It has been observed that the proposed hierarchical scheme allows for easier handling of the alphabets and that there is an appreciable rise in the recognition accuracy as a result of the transformations.
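Of the three criteria, maximization of Fisher's ratio is the easiest to show in a few lines. The sketch below handles only the two-class, two-feature case, where the maximizing direction has the closed form w = Sw^-1 (m_a - m_b), with Sw the pooled within-class scatter matrix. The paper's multi-class transformation matrix is not reproduced here, and the sample points and function names are assumptions for illustration.

```python
def mean(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points, m):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix about the mean m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fisher_direction(class_a, class_b):
    """Direction w = Sw^-1 (m_a - m_b); assumes Sw is invertible."""
    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sxx, sxy, syy = (sa[i] + sb[i] for i in range(3))
    det = sxx * syy - sxy * sxy
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # Multiply the mean difference by the explicit inverse of the 2x2 matrix.
    return ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)


def fisher_ratio(class_a, class_b, w):
    """Squared distance between projected means over the summed scatter."""
    pa = [p[0] * w[0] + p[1] * w[1] for p in class_a]
    pb = [p[0] * w[0] + p[1] * w[1] for p in class_b]
    ma, mb = sum(pa) / len(pa), sum(pb) / len(pb)
    sa = sum((v - ma) ** 2 for v in pa)
    sb = sum((v - mb) ** 2 for v in pb)
    return (ma - mb) ** 2 / (sa + sb)


# Hypothetical feature vectors for two symbol classes.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 2.0)]
class_b = [(5.0, 1.0), (6.0, 2.0), (7.0, 2.0), (6.0, 1.0)]
w = fisher_direction(class_a, class_b)
ratio_w = fisher_ratio(class_a, class_b, w)
ratio_x = fisher_ratio(class_a, class_b, (1.0, 0.0))
```

Projecting onto `w` gives a larger Fisher's ratio than using the first feature alone, which is the sense in which such a transformation improves class separation in the feature space.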
Transient analysis in Al-doped barium strontium titanate thin films grown by pulsed laser deposition
Abstract:
Thin films of (Ba0.5Sr0.5)TiO3 (BST) with different concentrations of Al doping were grown using a pulsed laser deposition technique. The dc leakage properties were studied as a function of Al doping level and compared to those of undoped BST films. With an initial Al doping level of 0.1 at. %, which substitutes Ti in the lattice site, the films showed a decrease in the leakage current; for a 1 at. % Al doping level, however, the leakage current was found to be relatively higher. Current-time measurements at elevated temperatures on 1 at. % Al doped BST films revealed space-charge transient type characteristics. A complete analysis of the transient characteristics was carried out to identify the charge transport process through variation of applied electric field and ambient temperature. The result revealed a very low mobility process comparable to ionic motion, which was found to be responsible for the observed feature. Calculation from ionic diffusivity and charge transport revealed a conduction process associated with an activation energy of around 1 eV. The low mobility charge carriers were identified as oxygen vacancies in motion under the application of electric field. Thus a comprehensive understanding of the charge transport process in highly acceptor doped BST was developed, and it was concluded that the excess of oxygen vacancies created by intentional Al doping gives rise to space-charge transient type characteristics. © 2001 American Institute of Physics.