919 results for feature analysis
Abstract:
The estimation of prediction quality is important because without quality measures, it is difficult to determine the usefulness of a prediction. Currently, methods for ligand binding site residue predictions are assessed in the function prediction category of the biennial Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiment, utilizing the Matthews Correlation Coefficient (MCC) and Binding-site Distance Test (BDT) metrics. However, the assessment of ligand binding site predictions using such metrics requires the availability of solved structures with bound ligands. Thus, we have developed a ligand binding site quality assessment tool, FunFOLDQA, which utilizes protein feature analysis to predict ligand binding site quality prior to the experimental solution of the protein structures and their ligand interactions. The FunFOLDQA feature scores were combined using simple linear combinations, multiple linear regression and a neural network. The neural network produced significantly better results for correlations to both the MCC and BDT scores, according to Kendall’s τ, Spearman’s ρ and Pearson’s r correlation coefficients, when tested on both the CASP8 and CASP9 datasets. The neural network also produced the largest Area Under the Curve (AUC) score when Receiver Operating Characteristic (ROC) analysis was undertaken for the CASP8 dataset. Furthermore, the FunFOLDQA algorithm incorporating the neural network is shown to add value to FunFOLD when both methods are employed in combination. This results in a statistically significant improvement over all of the best server methods, the FunFOLD method (6.43%), and one of the top manual groups (FN293) tested on the CASP8 dataset. The FunFOLDQA method was also found to be competitive with the top server methods when tested on the CASP9 dataset. To the best of our knowledge, FunFOLDQA is the first attempt to develop a method that can be used to assess ligand binding site prediction quality in the absence of experimental data.
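A minimal sketch of the kind of feature combination and evaluation described above, assuming hypothetical per-prediction feature scores: a small neural-network regressor predicts quality, and the predictions are compared with observed MCC values using the three correlation coefficients mentioned. Feature names, data and network size are illustrative, not FunFOLDQA's own.

```python
# Sketch only: hypothetical feature scores combined with a small neural network,
# then evaluated against observed MCC with Kendall's tau, Spearman's rho, Pearson's r.
import numpy as np
from scipy.stats import kendalltau, spearmanr, pearsonr
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.random((200, 5))                     # 200 predictions x 5 hypothetical feature scores
y_mcc = np.clip(0.2 + 0.5 * X[:, 0] + 0.3 * X[:, 1]
                + 0.1 * rng.normal(size=200), 0.0, 1.0)   # stand-in for observed MCC

net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
net.fit(X, y_mcc)
pred = net.predict(X)

print("Kendall tau :", kendalltau(pred, y_mcc)[0])
print("Spearman rho:", spearmanr(pred, y_mcc)[0])
print("Pearson r   :", pearsonr(pred, y_mcc)[0])
```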
Abstract:
The submerged entry nozzle (SEN) is used to transport molten steel from a tundish to a mould. Its main purpose is to prevent oxygen and nitrogen pick-up by the molten steel from the surrounding gas and to achieve the desired flow conditions in the mould. The SEN can therefore be considered a vital factor for a stable casting process and for steel quality. In addition, steelmaking processes occur at high temperatures around 1873 K, so the interaction between the refractory materials of the SEN and the molten steel is unavoidable. Therefore, knowledge of the SEN behaviors during the preheating and casting processes is necessary for the design of the steelmaking processes. The internal surfaces of modern SENs are coated with a glass/silicon powder layer to prevent oxidation of the SEN graphite during preheating. The effects of the interaction between the coating layer and the SEN base refractory materials on clogging were studied. A large number of accretion samples formed inside clogged alumina-graphite SENs were examined using FEG-SEM-EDS and Feature Analysis. The internally coated SENs were used for continuous casting of stainless steel grades alloyed with Rare Earth Metals (REM). The post-mortem study results clearly revealed the formation of a multi-layer accretion. A harmful effect of SEN decarburization on the accretion thickness was also indicated. In addition, the results indicated penetration of the formed alkaline-rich glaze into the alumina-graphite base refractory. More specifically, the alkaline-rich glaze reacts with graphite to form carbon monoxide gas. Thereafter, dissociation of CO takes place at the interface between the SEN and the molten metal. This leads to reoxidation of dissolved alloying elements such as REM. This reoxidation forms “in situ” REM oxides at the interface between the SEN and the REM-alloyed molten steel. Also, the interaction of the penetrated glaze with alumina in the SEN base refractory materials leads to the formation of a highly viscous alumina-rich glaze during the SEN preheating process. This, in turn, creates a very uneven SEN internal surface. Furthermore, these uneven areas react with dissolved REM in the molten steel to form REM aluminates, REM silicates and REM alumina-silicates. The formation of the large “in situ” REM oxides and the reaction of the REM alloying elements with the aforementioned uneven areas of the SEN may provide a large REM-rich surface in contact with the primary inclusions in the molten steel. This may facilitate the attraction and agglomeration of the primary REM oxide inclusions on the SEN internal surface and, thereafter, clogging. The study revealed the disadvantages of glass/silicon powder coating applications and of SEN decarburization. The decarburization behaviors of Al2O3-C, ZrO2-C and MgO-C refractory materials from a commercial Submerged Entry Nozzle (SEN) were also investigated for different gas atmospheres consisting of CO2, O2 and Ar. The gas ratios were kept the same as those in a propane combustion flue gas at different Air-Fuel Ratio (AFR) values for both Air-Fuel and Oxygen-Fuel combustion systems. Laboratory experiments were carried out under non-isothermal conditions followed by isothermal heating. The decarburization ratio (α) values of all three refractory types were determined by measuring the real-time weight losses of the samples.
The results showed higher decarburization ratio (α) values for the MgO-C refractory when changing from Air-Fuel to Oxygen-Fuel combustion at the same AFR value. For Al2O3-C samples, the results substantiate the advantage of preheating the SEN at higher temperatures for shorter holding times, compared to heating at lower temperatures for longer holding times. Diffusion models were proposed for estimation of the decarburization rate of an Al2O3-C refractory in the SEN. Two different methods were studied to prevent SEN decarburization during preheating: the effect of a ZrSi2 antioxidant, and the coexistence of an antioxidant additive and a (4B2O3·BaO) glass powder, on carbon oxidation under non-isothermal and isothermal heating conditions in a controlled atmosphere. The coexistence of 8 wt% ZrSi2 and 15 wt% (4B2O3·BaO) glass powder, relative to the total alumina-graphite refractory base materials, provided the most effective resistance to carbon oxidation. The 121% volume expansion due to zircon formation during heating, together with the filling of open pores by a (4B2O3·BaO) glaze during green-body sintering, led to excellent carbon oxidation resistance. The effects of a plasma spray-PVD coating of Yttria-Stabilized Zirconia (YSZ) powder on the carbon oxidation of coated Al2O3-C samples were also investigated. Trials were performed under non-isothermal heating conditions in a controlled atmosphere, and the temperature profile applied in the laboratory trials was defined based on the industrial preheating trials. The controlled atmospheres consisted of CO2, O2 and Ar. The thicknesses of the decarburized layers were measured and examined using light optical microscopy, FEG-SEM and EDS. A 250-290 μm YSZ coating is suggested to be appropriate, as it provides both an even surface and prevention of decarburization, even during heating in air. In addition, the interactions of the YSZ-coated alumina-graphite refractory base materials with a cerium-alloyed molten stainless steel were surveyed. The YSZ coating completely prevented the reduction of alumina by cerium and, therefore, the formation of the first clogging product on the surface of the SEN refractory base materials. The YSZ plasma-PVD coating can therefore be recommended for coating the hot surface of commercial SENs.
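A minimal sketch of how a decarburization ratio could be derived from the real-time weight-loss measurements mentioned above, assuming all mass loss comes from carbon oxidation; the thesis's exact definition of α is not given in the abstract, so the formula and numbers are illustrative only.

```python
# Sketch only: decarburization ratio as the fraction of the initial carbon lost,
# assuming the measured weight loss is entirely due to carbon oxidation.
def decarburization_ratio(m0_g, m_t_g, carbon_fraction):
    """Fraction of the initial carbon content lost at time t."""
    mass_loss = m0_g - m_t_g
    return mass_loss / (m0_g * carbon_fraction)

# Example: a 50 g Al2O3-C sample with 25 wt% carbon that has lost 2.5 g on heating.
print(decarburization_ratio(50.0, 47.5, 0.25))   # alpha = 0.2
```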
Abstract:
A feature represents a functional requirement fulfilled by a system. Since many maintenance tasks are expressed in terms of features, it is important to establish the correspondence between a feature and its implementation in source code. Traditional approaches to establishing this correspondence exercise features to generate a trace of runtime events, which is then processed by post-mortem analysis. These approaches typically generate large amounts of data to analyze. Due to their static nature, these approaches do not support incremental and interactive analysis of features. We propose a radically different approach called live feature analysis, which provides a runtime model of features. Our approach analyzes features on a running system, makes it possible to grow feature representations by exercising different scenarios of the same feature, and identifies execution elements down to the sub-method level. We describe how live feature analysis is implemented effectively by annotating structural representations of code based on abstract syntax trees. We illustrate our live analysis with a case study where we achieve a more complete feature representation by exercising and merging variants of feature behavior, and demonstrate the efficiency of our technique with benchmarks.
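A minimal Python analogue of the idea, not the authors' implementation (which annotates AST-based structural representations of code): record which code locations execute while one scenario of a feature runs, and merge the traces of further scenarios so the feature representation grows incrementally.

```python
# Sketch only: map a feature name to the set of (function, line) pairs executed
# while scenarios of that feature are exercised on the running program.
import sys
from collections import defaultdict

feature_map = defaultdict(set)            # feature name -> executed locations

def trace_feature(feature, scenario):
    def tracer(frame, event, arg):
        if event == "line":
            feature_map[feature].add((frame.f_code.co_name, frame.f_lineno))
        return tracer
    sys.settrace(tracer)
    try:
        scenario()                        # exercise one scenario of the feature
    finally:
        sys.settrace(None)

def greet(name):
    return "hello " + name

trace_feature("greeting", lambda: greet("world"))
trace_feature("greeting", lambda: greet(""))   # a second scenario grows the representation
print(sorted(feature_map["greeting"]))
```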
Abstract:
Preclinical studies using animal models have shown that grey matter plasticity in both perilesional and distant neural networks contributes to behavioural recovery of sensorimotor functions after ischaemic cortical stroke. Whether such morphological changes can be detected after human cortical stroke is not yet known, but this would be essential to better understand post-stroke brain architecture and its impact on recovery. Using serial behavioural and high-resolution magnetic resonance imaging (MRI) measurements, we tracked recovery of dexterous hand function in 28 patients with ischaemic stroke involving the primary sensorimotor cortices. We were able to classify three recovery subgroups (fast, slow, and poor) using response feature analysis of individual recovery curves. To detect areas with significant longitudinal grey matter volume (GMV) change, we performed tensor-based morphometry of MRI data acquired in the subacute phase, i.e. after the stage compromised by acute oedema and inflammation. We found significant GMV expansion in the perilesional premotor cortex, ipsilesional mediodorsal thalamus, and caudate nucleus, and GMV contraction in the contralesional cerebellum. According to an interaction model, patients with fast recovery had more perilesional than subcortical expansion, whereas the contrary was true for patients with impaired recovery. Also, there were significant voxel-wise correlations between motor performance and ipsilesional GMV contraction in the posterior parietal lobes and expansion in dorsolateral prefrontal cortex. In sum, perilesional GMV expansion is associated with successful recovery after cortical stroke, possibly reflecting the restructuring of local cortical networks. Distant changes within the prefrontal-striato-thalamic network are related to impaired recovery, probably indicating higher demands on cognitive control of motor behaviour.
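A minimal sketch of response-feature analysis on an individual recovery curve, assuming a simple exponential recovery model; the study's actual model, response features and subgroup boundaries are not given in the abstract, so the data and thresholds below are illustrative.

```python
# Sketch only: fit an exponential recovery model to one patient's scores and
# group the patient by the fitted response features (plateau and rate).
import numpy as np
from scipy.optimize import curve_fit

def recovery(t, plateau, rate):
    return plateau * (1.0 - np.exp(-rate * t))

t = np.array([1, 4, 12, 26, 52], float)          # weeks after stroke
scores = np.array([10, 35, 55, 62, 64], float)   # hand-function score (illustrative)

(plateau, rate), _ = curve_fit(recovery, t, scores, p0=[60.0, 0.1])
print(f"plateau={plateau:.1f}, rate={rate:.3f}")

if plateau > 50 and rate > 0.1:                  # illustrative thresholds
    print("fast recovery")
elif plateau > 50:
    print("slow recovery")
else:
    print("poor recovery")
```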
Abstract:
The lexical items like and well can serve as discourse markers (DMs), but can also play numerous other roles, such as verb or adverb. Identifying the occurrences that function as DMs is an important step for language understanding by computers. In this study, automatic classifiers using lexical, prosodic/positional and sociolinguistic features are trained over transcribed dialogues, manually annotated with DM information. The resulting classifiers improve state-of-the-art performance of DM identification, at about 90% recall and 79% precision for like (84.5% accuracy, κ = 0.69), and 99% recall and 98% precision for well (97.5% accuracy, κ = 0.88). Automatic feature analysis shows that lexical collocations are the most reliable indicators, followed by prosodic/positional features, while sociolinguistic features are marginally useful for the identification of DM like and not useful for well. The differentiated processing of each type of DM improves classification accuracy, suggesting that these types should be treated individually.
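A minimal sketch of this kind of classifier for occurrences of like: simple lexical-collocation and positional/prosodic features are vectorized and fed to a linear model. The toy features, data and model choice are illustrative, not the study's setup.

```python
# Sketch only: classify whether an occurrence of "like" functions as a discourse marker.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

examples = [                                     # (features, 1 = discourse marker)
    ({"prev": "um", "next": "you", "utt_initial": True,  "pause_before": True},  1),
    ({"prev": "i",  "next": "it",  "utt_initial": False, "pause_before": False}, 0),
    ({"prev": "so", "next": "i",   "utt_initial": True,  "pause_before": True},  1),
    ({"prev": "would", "next": "to", "utt_initial": False, "pause_before": False}, 0),
]
X = [features for features, label in examples]
y = [label for features, label in examples]

clf = make_pipeline(DictVectorizer(), LogisticRegression())
clf.fit(X, y)
print(clf.predict([{"prev": "and", "next": "you", "utt_initial": True, "pause_before": True}]))
```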
Abstract:
This paper presents an automatic strategy to decide how to pronounce a Capital Letter Sequence (CLS) in a Text-to-Speech (TTS) system. If the CLS is well known to the TTS, it can be expanded into several words. But when the CLS is unknown, the system has two alternatives: spelling it out (abbreviation) or pronouncing it as a new word (acronym). In Spanish, there is a close relationship between letters and phonemes. Because of this, when a CLS is similar to other words in Spanish, there is a strong tendency to pronounce it as a standard word. This paper proposes an automatic method for detecting acronyms. Additionally, it analyses the discrimination capability of several features, and several strategies for combining them in order to obtain the best classifier. For the best classifier, the classification error is 8.45%. Regarding the feature analysis, the best features were the Letter Sequence Perplexity and the Average N-gram Order.
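A minimal sketch of a letter-sequence perplexity feature of the kind named above: a character bigram model trained on ordinary words assigns lower perplexity to pronounceable, word-like sequences (candidate acronyms) than to unpronounceable ones (candidate spelled abbreviations). The tiny lexicon, model order and smoothing are illustrative, not the paper's configuration.

```python
# Sketch only: character bigram perplexity of a capital letter sequence.
import math
from collections import Counter

lexicon = ["radar", "laser", "modem", "banco", "datos", "red", "sistema"]
bigrams, contexts = Counter(), Counter()
for word in lexicon:
    padded = "^" + word + "$"
    contexts.update(padded[:-1])
    bigrams.update(zip(padded, padded[1:]))

V = len(set("".join(lexicon)) | {"^", "$"})

def letter_sequence_perplexity(seq):
    padded = "^" + seq.lower() + "$"
    log_p = 0.0
    for a, b in zip(padded, padded[1:]):
        p = (bigrams[(a, b)] + 1) / (contexts[a] + V)   # add-one smoothing
        log_p += math.log(p)
    return math.exp(-log_p / (len(padded) - 1))

print(letter_sequence_perplexity("OTAN"), letter_sequence_perplexity("FBI"))
# lower value -> more word-like -> more likely pronounced as an acronym
```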
Object-related activity revealed by functional magnetic resonance imaging in human occipital cortex.
Abstract:
The stages of integration leading from local feature analysis to object recognition were explored in human visual cortex by using the technique of functional magnetic resonance imaging. Here we report evidence for object-related activation. Such activation was located at the lateral-posterior aspect of the occipital lobe, just abutting the posterior aspect of the motion-sensitive area MT/V5, in a region termed the lateral occipital complex (LO). LO showed preferential activation to images of objects, compared to a wide range of texture patterns. This activation was not caused by a global difference in the Fourier spatial frequency content of objects versus texture images, since object images produced enhanced LO activation compared to textures matched in power spectra but randomized in phase. The preferential activation to objects also could not be explained by different patterns of eye movements: similar levels of activation were observed when subjects fixated on the objects and when they scanned the objects with their eyes. Additional manipulations such as spatial frequency filtering and a 4-fold change in visual size did not affect LO activation. These results suggest that the enhanced responses to objects were not a manifestation of low-level visual processing. A striking demonstration that activity in LO is uniquely correlated to object detectability was produced by the "Lincoln" illusion, in which blurring of objects digitized into large blocks paradoxically increases their recognizability. Such blurring led to significant enhancement of LO activation. Despite the preferential activation to objects, LO did not seem to be involved in the final, "semantic," stages of the recognition process. Thus, objects varying widely in their recognizability (e.g., famous faces, common objects, and unfamiliar three-dimensional abstract sculptures) activated it to a similar degree. These results are thus evidence for an intermediate link in the chain of processing stages leading to object recognition in human visual cortex.
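A minimal sketch of the control stimulus mentioned above: an image matched to an object image in Fourier power spectrum but with randomized phase. The study's actual stimulus-generation procedure is not described in the abstract; this only illustrates the general technique.

```python
# Sketch only: phase-scrambled texture with the same power spectrum as the source image.
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((128, 128))                 # stand-in for a grayscale object image

amplitude = np.abs(np.fft.fft2(image))         # keep the original power spectrum
random_phase = np.angle(np.fft.fft2(rng.random(image.shape)))  # conjugate-symmetric phase

scrambled = np.real(np.fft.ifft2(amplitude * np.exp(1j * random_phase)))
print(image.shape, scrambled.shape)            # same spectrum, no recognizable object
```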
Abstract:
The availability and pervasiveness of new communication services, such as mobile networks and multimedia communication over digital networks, have resulted in strong demand for approaches to modeling and realizing customized communication systems. The stovepipe approach used to develop today's communication applications is no longer effective because it results in a lengthy and expensive development cycle. To address this need, the Communication Virtual Machine (CVM) technology has been developed by researchers at Florida International University. The CVM technology includes the Communication Modeling Language (CML) and the platform, CVM, to model and rapidly realize communication models. In this dissertation, we investigate the basic communication primitives needed to capture and specify an end-user's requirements for communication-intensive applications, and how these specifications can be automatically realized. To identify the basic communication primitives, we perform a feature analysis on a set of communication-intensive scenarios from the healthcare domain. Based on the feature analysis, we define a new version of CML that includes the meta-model definition (abstract syntax and static semantics) and a partial behavior model (operational semantics). To validate our CML definition, we present a case study that shows how one of the scenarios from the healthcare domain is modeled and automatically realized.
Abstract:
The main goal of this research is to design an efficient compression algorithm for fingerprint images. The wavelet transform technique is the principal tool used to reduce interpixel redundancies and to obtain a parsimonious representation for these images. A specific fixed decomposition structure is designed to be used by the wavelet packet in order to save on the computation, transmission, and storage costs. This decomposition structure is based on analysis of information packing performance of several decompositions, two-dimensional power spectral density, effect of each frequency band on the reconstructed image, and the human visual sensitivities. This fixed structure is found to provide the "most" suitable representation for fingerprints, according to the chosen criteria. Different compression techniques are used for different subbands, based on their observed statistics. The decision is based on the effect of each subband on the reconstructed image according to the mean square criteria as well as the sensitivities in human vision. To design an efficient quantization algorithm, a precise model for distribution of the wavelet coefficients is developed. The model is based on the generalized Gaussian distribution. A least squares algorithm on a nonlinear function of the distribution model shape parameter is formulated to estimate the model parameters. A noise shaping bit allocation procedure is then used to assign the bit rate among subbands. To obtain high compression ratios, vector quantization is used. In this work, the lattice vector quantization (LVQ) is chosen because of its superior performance over other types of vector quantizers. The structure of a lattice quantizer is determined by its parameters known as truncation level and scaling factor. In lattice-based compression algorithms reported in the literature the lattice structure is commonly predetermined, leading to a nonoptimized quantization approach. In this research, a new technique for determining the lattice parameters is proposed. In the lattice structure design, no assumption about the lattice parameters is made and no training and multi-quantizing is required. The design is based on minimizing the quantization distortion by adapting to the statistical characteristics of the source in each subimage. Since LVQ is a multidimensional generalization of uniform quantizers, it produces minimum distortion for inputs with uniform distributions. In order to take advantage of the properties of LVQ and its fast implementation, while considering the i.i.d. nonuniform distribution of wavelet coefficients, the piecewise-uniform pyramid LVQ algorithm is proposed. The proposed algorithm quantizes almost all of the source vectors without the need to project these on the lattice outermost shell, while it properly maintains a small codebook size. It also resolves the wedge region problem commonly encountered with sharply distributed random sources. These represent some of the drawbacks of the algorithm proposed by Barlaud [26]. The proposed algorithm handles all types of lattices, not only the cubic lattices, as opposed to the algorithms developed by Fischer [29] and Jeong [42]. Furthermore, no training and multi-quantizing (to determine lattice parameters) is required, as opposed to Powell's algorithm [78]. For coefficients with high-frequency content, the positive-negative mean algorithm is proposed to improve the resolution of reconstructed images.
For coefficients with low-frequency content, a lossless predictive compression scheme is used to preserve the quality of reconstructed images. A method to reduce the bit requirements of the necessary side information is also introduced. Lossless entropy coding techniques are subsequently used to remove coding redundancy. The algorithms result in high quality reconstructed images with better compression ratios than other available algorithms. To evaluate the proposed algorithms, their objective and subjective performance comparisons with other available techniques are presented. The quality of the reconstructed images is important for reliable identification. Enhancement and feature extraction on the reconstructed images are also investigated in this research. A structural-based feature extraction algorithm is proposed in which the unique properties of fingerprint textures are used to enhance the images and improve the fidelity of their characteristic features. The ridges are extracted from enhanced grey-level foreground areas based on the local ridge dominant directions. The proposed ridge extraction algorithm properly preserves the natural shape of grey-level ridges as well as the precise locations of the features, as opposed to the ridge extraction algorithm in [81]. Furthermore, it is fast and operates only on foreground regions, as opposed to the adaptive floating average thresholding process in [68]. Spurious features are subsequently eliminated using the proposed post-processing scheme.
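A minimal sketch of one ingredient of the quantizer design described above: fitting a generalized Gaussian model to wavelet subband coefficients. The thesis formulates a least-squares fit on a nonlinear function of the shape parameter; the simpler moment-matching variant below is illustrative only.

```python
# Sketch only: estimate the generalized Gaussian shape parameter of subband
# coefficients from the moment ratio E|x|^2 / E[x^2] (Laplacian ~ 1, Gaussian ~ 2).
import numpy as np
from scipy.special import gamma
from scipy.optimize import brentq

def fit_ggd_shape(coeffs):
    coeffs = np.asarray(coeffs, float)
    ratio = np.mean(np.abs(coeffs)) ** 2 / np.mean(coeffs ** 2)
    f = lambda b: gamma(2.0 / b) ** 2 / (gamma(1.0 / b) * gamma(3.0 / b)) - ratio
    return brentq(f, 0.1, 10.0)

rng = np.random.default_rng(0)
print(fit_ggd_shape(rng.laplace(size=50_000)))       # close to 1
print(fit_ggd_shape(rng.standard_normal(50_000)))    # close to 2
```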
Abstract:
It is a big challenge to guarantee the quality of discovered relevance features in text documents for describing user preferences, because of the large number of terms, patterns, and noise. Most existing popular text mining and classification methods have adopted term-based approaches. However, they have all suffered from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern-based methods should perform better than term-based ones in describing user preferences, but many experiments do not support this hypothesis. This research presents a promising method, Relevance Feature Discovery (RFD), for solving this challenging issue. It discovers both positive and negative patterns in text documents as high-level features in order to accurately weight low-level features (terms) based on their specificity and their distributions in the high-level features. The thesis also introduces an adaptive model (called ARFD) to enhance the flexibility of using RFD in an adaptive environment. ARFD automatically updates the system's knowledge based on a sliding window over new incoming feedback documents. It can efficiently decide which incoming documents bring new knowledge into the system. Substantial experiments using the proposed models on Reuters Corpus Volume 1 and TREC topics show that the proposed models significantly outperform both the state-of-the-art term-based methods underpinned by Okapi BM25, Rocchio or Support Vector Machine, and other pattern-based methods.
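A minimal sketch of the general idea of pattern-based term weighting: terms occurring in frequent patterns of relevant documents are promoted, and terms occurring in patterns of non-relevant documents are penalized. This is an illustration of the flavour of the approach, not the RFD weighting scheme itself.

```python
# Sketch only: weight terms by the frequent patterns (term pairs) they appear in,
# mined separately from relevant and non-relevant documents.
from collections import Counter
from itertools import combinations

def frequent_patterns(docs, min_support=2, size=2):
    counts = Counter()
    for doc in docs:
        counts.update(frozenset(p) for p in combinations(sorted(set(doc)), size))
    return {p for p, c in counts.items() if c >= min_support}

relevant = [["solar", "panel", "energy"], ["solar", "energy", "grid"], ["solar", "panel", "grid"]]
irrelevant = [["solar", "eclipse", "moon"], ["solar", "eclipse", "sun"]]

weights = Counter()
for pattern in frequent_patterns(relevant):
    weights.update(pattern)            # support from positive patterns
for pattern in frequent_patterns(irrelevant):
    weights.subtract(pattern)          # penalty from negative patterns

print(weights.most_common())           # terms from relevant patterns outrank "eclipse"
```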
Abstract:
Online business or Electronic Commerce (EC) is becoming popular among customers today, and as a result a large number of product reviews have been posted online by customers. This information is very valuable, not only for prospective customers making decisions on buying products, but also for companies gathering information about customers’ satisfaction with their products. Opinion mining is used to capture customer reviews and to separate them into subjective expressions (sentiment words) and objective expressions (non-sentiment words). This paper proposes a novel, multi-dimensional model for opinion mining, which integrates customers’ characteristics and their opinions about products. The model captures subjective expressions from product reviews and transfers them to a fact table before representing them along multiple dimensions, namely customers, products, time and location. Data warehouse techniques such as OLAP and data cubes were used to analyze opinionated sentences. A comprehensive way to calculate customers’ orientation towards products’ features and attributes is presented in this paper.
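A minimal sketch of the multi-dimensional idea: opinionated sentences stored as facts with customer, product, time and location dimensions, then rolled up OLAP-style to an average orientation per product feature. The schema and scores are illustrative.

```python
# Sketch only: a tiny fact table of opinionated sentences and one cube roll-up.
import pandas as pd

facts = pd.DataFrame([
    {"customer": "c1", "product": "phone-x", "feature": "battery", "location": "UK", "month": "2024-01", "orientation": +1},
    {"customer": "c2", "product": "phone-x", "feature": "battery", "location": "US", "month": "2024-01", "orientation": -1},
    {"customer": "c3", "product": "phone-x", "feature": "screen",  "location": "UK", "month": "2024-02", "orientation": +1},
    {"customer": "c1", "product": "phone-y", "feature": "screen",  "location": "UK", "month": "2024-02", "orientation": +1},
])

# Average orientation per product and feature (one slice of the cube).
cube = facts.pivot_table(index="product", columns="feature",
                         values="orientation", aggfunc="mean")
print(cube)
```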
Abstract:
One of the major challenges in systems biology is to understand the complex responses of a biological system to external perturbations or internal signalling, depending on its biological conditions. Genome-wide transcriptomic profiling of cellular systems under various chemical perturbations allows certain features of the chemicals to manifest through their transcriptomic expression profiles. The insights obtained may help to establish the connections between human diseases, associated genes and therapeutic drugs. The main objective of this study was to systematically analyse cellular gene expression data under various drug treatments to elucidate drug-feature-specific transcriptomic signatures. We first extracted drug-related information (drug features) from the collected textual descriptions of DrugBank entries using text-mining techniques. A novel statistical method employing orthogonal least squares learning was proposed to obtain drug-feature-specific signatures by integrating gene expression with DrugBank data. To obtain robust signatures from noisy input datasets, a stringent ensemble approach was applied, combining three techniques: resampling, leave-one-out cross-validation, and aggregation. The validation experiments showed that the proposed method has the capacity to extract biologically meaningful drug-feature-specific gene expression signatures. It was also shown that most of the signature genes are connected with common hub genes by regulatory network analysis. The common hub genes were further shown to be related to general drug metabolism by Gene Ontology analysis. Each set of genes has relatively few interactions with other sets, indicating the modular nature of each signature and its drug-feature specificity. Based on Gene Ontology analysis, we also found that each set of drug feature (DF)-specific genes was indeed enriched in biological processes related to the drug feature. The results of these experiments demonstrated the potential of the method for predicting certain features of new drugs using their transcriptomic profiles, providing a useful methodological framework and a valuable resource for drug development and characterization.
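A minimal sketch of orthogonal-least-squares-style forward selection, the flavour of learning named above: genes are added one at a time according to how much of the drug-feature response their orthogonalized expression explains. The data are synthetic, and the ensemble steps (resampling, leave-one-out cross-validation, aggregation) are not shown.

```python
# Sketch only: greedy forward selection with Gram-Schmidt orthogonalization,
# ranking candidate genes by their error-reduction contribution.
import numpy as np

def ols_forward_select(X, y, n_select):
    X = X - X.mean(axis=0)
    y = y - y.mean()
    selected, Q = [], []
    for _ in range(n_select):
        best, best_w, best_err = None, None, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            w = X[:, j].copy()
            for q in Q:                        # orthogonalize against chosen genes
                w -= (q @ X[:, j]) / (q @ q) * q
            if w @ w < 1e-12:
                continue
            err = (w @ y) ** 2 / (w @ w)       # explained sum of squares
            if err > best_err:
                best, best_w, best_err = j, w, err
        selected.append(best)
        Q.append(best_w)
    return selected

rng = np.random.default_rng(0)
X = rng.standard_normal((60, 30))              # 60 samples x 30 genes
y = 2.0 * X[:, 3] - 1.5 * X[:, 7] + 0.1 * rng.standard_normal(60)
print(ols_forward_select(X, y, 2))             # expected to recover genes 3 and 7
```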
Abstract:
For the tracking of extrema associated with weather systems to be applied to a broad range of fields it is necessary to remove a background field that represents the slowly varying, large spatial scales. The sensitivity of the tracking analysis to the form of background field removed is explored for the Northern Hemisphere winter storm tracks for three contrasting fields from an integration of the U. K. Met Office's (UKMO) Hadley Centre Climate Model (HadAM3). Several methods are explored for the removal of a background field from the simple subtraction of the climatology, to the more sophisticated removal of the planetary scales. Two temporal filters are also considered in the form of a 2-6-day Lanczos filter and a 20-day high-pass Fourier filter. The analysis indicates that the simple subtraction of the climatology tends to change the nature of the systems to the extent that there is a redistribution of the systems relative to the climatological background resulting in very similar statistical distributions for both positive and negative anomalies. The optimal planetary wave filter removes total wavenumbers less than or equal to a number in the range 5-7, resulting in distributions more easily related to particular types of weather system. For the temporal filters the 2-6-day bandpass filter is found to have a detrimental impact on the individual weather systems, resulting in the storm tracks having a weak waveguide type of behavior. The 20-day high-pass temporal filter is less aggressive than the 2-6-day filter and produces results falling between those of the climatological and 2-6-day filters.
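A minimal sketch of the 2-6-day band-pass Lanczos filter compared above, built with the standard Lanczos weight construction for 6-hourly data; the window length and sampling are illustrative, not necessarily the study's configuration.

```python
# Sketch only: Lanczos band-pass weights (Duchon-style) applied to a time series.
import numpy as np

def lanczos_bandpass_weights(nwts, low_cut, high_cut):
    """nwts: odd number of weights; cutoffs in cycles per time step."""
    half = nwts // 2
    k = np.arange(-half, half + 1)
    sigma = np.sinc(k / (half + 1))                       # Lanczos smoothing factor
    return (2 * high_cut * np.sinc(2 * high_cut * k)
            - 2 * low_cut * np.sinc(2 * low_cut * k)) * sigma

steps_per_day = 4                                         # 6-hourly data
weights = lanczos_bandpass_weights(61,
                                   low_cut=1 / (6 * steps_per_day),   # 6-day period
                                   high_cut=1 / (2 * steps_per_day))  # 2-day period

rng = np.random.default_rng(0)
series = rng.standard_normal(2000)                        # e.g. a vorticity anomaly series
filtered = np.convolve(series, weights, mode="same")
print(round(weights.sum(), 6), round(filtered.std(), 3))  # band-pass weights sum to ~0
```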