978 results for Multi-class steganalysis
Abstract:
Microarray technology provides a high-throughput technique for studying gene expression. Microarrays can help us diagnose different types of cancers, understand biological processes, assess host responses to drugs and pathogens, find markers for specific diseases, and much more. Microarray experiments generate large amounts of data. Thus, effective data processing and analysis are critical for making reliable inferences from the data.
The first part of the dissertation addresses the problem of finding an optimal set of genes (biomarkers) to classify a set of samples as diseased or normal. Three statistical gene selection methods (GS, GS-NR, and GS-PCA) were developed to identify a set of genes that best differentiate between samples. A comparative study on different classification tools was performed and the best combinations of gene selection and classifiers for multi-class cancer classification were identified. For most of the benchmark cancer data sets, the gene selection method proposed in this dissertation, GS, outperformed other gene selection methods. The classifiers based on Random Forests, neural network ensembles, and K-nearest neighbor (KNN) showed consistently good performance. A striking commonality among these classifiers is that they all use a committee-based approach, suggesting that ensemble classification methods are superior.
The same biological problem may be studied at different research labs and/or performed using different lab protocols or samples. In such situations, it is important to combine results from these efforts. The second part of the dissertation addresses the problem of pooling the results from different independent experiments to obtain improved results. Four statistical pooling techniques (Fisher inverse chi-square method, Logit method, Stouffer's Z transform method, and Liptak-Stouffer weighted Z-method) were investigated in this dissertation. These pooling techniques were applied to the problem of identifying cell cycle-regulated genes in two different yeast species. As a result, improved sets of cell cycle-regulated genes were identified. The last part of the dissertation explores the effectiveness of wavelet data transforms for the task of clustering. Discrete wavelet transforms, with an appropriate choice of wavelet bases, were shown to be effective in producing clusters that were biologically more meaningful.
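As a rough illustration of how two of the pooling techniques named above combine independent p-values, the following sketch implements Fisher's inverse chi-square method and Stouffer's Z transform (with optional Liptak-style weights). The function names and the example p-values are hypothetical, not taken from the dissertation, and the inputs are assumed to be independent one-sided p-values strictly between 0 and 1.

```python
import math
from statistics import NormalDist


def fisher_combined_p(p_values):
    """Fisher's method: X = -2 * sum(ln p_i) follows chi-square with 2k df."""
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2
    # Closed-form chi-square survival function for an even number of
    # degrees of freedom (2k): exp(-x/2) * sum_{j<k} (x/2)^j / j!
    return math.exp(-half_x) * sum(
        half_x ** j / math.factorial(j) for j in range(k)
    )


def stouffer_combined_p(p_values, weights=None):
    """Stouffer's Z method; passing weights gives the weighted Z variant."""
    normal = NormalDist()
    if weights is None:
        weights = [1.0] * len(p_values)
    # Convert each one-sided p-value to a Z score.
    z_scores = [normal.inv_cdf(1.0 - p) for p in p_values]
    z_combined = sum(w * z for w, z in zip(weights, z_scores)) / math.sqrt(
        sum(w * w for w in weights)
    )
    return 1.0 - normal.cdf(z_combined)


# Hypothetical p-values for one gene from three independent experiments.
example = [0.04, 0.10, 0.07]
pooled_fisher = fisher_combined_p(example)
pooled_stouffer = stouffer_combined_p(example, weights=[1.0, 2.0, 1.0])
```

With a single p-value both functions return that p-value unchanged, which is a quick sanity check on the formulas.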
Abstract:
The total time a customer spends in the business process system, called the customer cycle-time, is a major contributor to overall customer satisfaction. Business process analysts and designers are frequently asked to design process solutions with optimal performance. Simulation models have been very popular for quantitatively evaluating business processes; however, simulation is time-consuming and requires extensive modeling experience to develop simulation models. Moreover, simulation models neither provide recommendations nor yield optimal solutions for business process design. A queueing network model is a good analytical approach to business process analysis and design, and can provide a useful abstraction of a business process. However, the existing queueing network models were developed based on telephone systems or applied to manufacturing processes in which machine servers dominate the system. In a business process, the servers are usually people. The characteristics of human servers, namely specialization and coordination, should be taken into account by the queueing model.
The research described in this dissertation develops an open queueing network model for quick analysis of business processes. Additionally, optimization models are developed to provide optimal business process designs. The queueing network model extends and improves upon existing multi-class open-queueing network models (MOQN) so that the customer flow in human-server oriented processes can be modeled. The optimization models help business process designers find the optimal design of a business process with consideration of specialization and coordination.
The main findings of the research are as follows. First, parallelization can reduce the cycle-time for those customer classes that require more than one parallel activity; however, under highly utilized servers the coordination time due to parallelization overwhelms the savings because the waiting time increases significantly, and thus the cycle-time increases. Third, the level of industrial technology employed by a company and the coordination time needed to manage the tasks have the strongest impact on business process design: when the level of industrial technology employed by the company is high, more division is required to improve the cycle-time; when the coordination time required is high, consolidation is required to improve the cycle-time.
Abstract:
The objective of this study was to develop a GIS-based multi-class index overlay model to determine areas susceptible to inland flooding during extreme precipitation events in Broward County, Florida. Data layers used in the method include Airborne Laser Terrain Mapper (ALTM) elevation data, excess precipitation depth determined through a Soil Conservation Service (SCS) Curve Number (CN) analysis, and the slope of the terrain. The method includes a calibration procedure that uses "weights and scores" criteria obtained from Hurricane Irene (1999) records, a reported 100-year precipitation event, Doppler radar data and documented flooding locations. Results are displayed in maps of Eastern Broward County depicting types of flooding scenarios for a 100-year, 24-hour storm based on the soil saturation conditions. As expected, the results of the multi-class index overlay analysis showed that the potential for inland flooding increases under a higher antecedent moisture condition. The proposed method shows some potential as a predictive tool for flooding susceptibility based on a relatively simple approach.
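The excess precipitation depth mentioned above is conventionally obtained from the standard SCS Curve Number runoff equation. The sketch below shows that calculation in inches; it assumes the customary initial abstraction ratio of 0.2, which the abstract does not state, and the function name and the rainfall and CN values are hypothetical.

```python
def scs_runoff_depth(rainfall_in, curve_number, ia_ratio=0.2):
    """Excess precipitation (runoff) depth in inches via the SCS CN method.

    ia_ratio=0.2 is the conventional initial abstraction ratio and is an
    assumption here, not a value reported in the study.
    """
    if not 0 < curve_number <= 100:
        raise ValueError("curve number must be in (0, 100]")
    # Potential maximum retention S (inches) from the curve number.
    retention = 1000.0 / curve_number - 10.0
    initial_abstraction = ia_ratio * retention
    if rainfall_in <= initial_abstraction:
        return 0.0  # all rainfall is absorbed before runoff begins
    effective = rainfall_in - initial_abstraction
    return effective ** 2 / (effective + retention)


# Hypothetical comparison: the same storm depth on drier vs. wetter soil.
# A higher curve number stands in for a higher antecedent moisture condition.
runoff_dry = scs_runoff_depth(10.0, 70)
runoff_wet = scs_runoff_depth(10.0, 90)
```

Because a wetter antecedent condition corresponds to a higher curve number, the same storm yields a larger runoff depth, which is consistent with the trend reported in the abstract.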
Abstract:
This study examines the business model complexity of Irish credit unions using a latent class approach to measure structural performance over the period 2002 to 2013. The latent class approach allows the endogenous identification of a multi-class framework for business models based on credit union specific characteristics. The analysis finds a three-class system to be appropriate, with the multi-class model dependent on three financial viability characteristics. This finding is consistent with the deliberations of the Irish Commission on Credit Unions (2012), which identified complexity and diversity in the business models of Irish credit unions and recommended that such complexity and diversity could not be accommodated within a one-size-fits-all regulatory framework. The analysis also highlights that two of the classes are subject to diseconomies of scale. This may suggest credit unions would benefit from a reduction in scale or perhaps that there is an imbalance in the present change process. Finally, relative performance differences are identified for each class in terms of technical efficiency. This suggests that there is an opportunity for credit unions to improve their performance by using within-class best practice or alternatively by switching to another class.
Abstract:
This paper presents a multi-class AdaBoost that incorporates an ensemble of binary AdaBoosts organized as a Binary Decision Tree (BDT). Binary AdaBoost has been shown to be extremely successful in producing accurate classification, but it does not perform very well for multi-class problems. To avoid this performance degradation, the multi-class problem is divided into a number of binary problems and binary AdaBoost classifiers are invoked to solve these classification problems. This approach is tested with a dataset consisting of 6500 binary images of traffic signs. Haar-like features of these images are computed and the multi-class AdaBoost classifier is invoked to classify them. Classification rates of 96.7% and 95.7% are achieved for the traffic sign borders and pictograms, respectively. The proposed approach is also evaluated using a number of standard datasets such as Iris, Wine, Yeast, etc. The performance of the proposed BDT classifier is quite high compared with the state of the art, and it converges very fast to a solution, which indicates that it is a reliable classifier.
Abstract:
Authentication plays an important role in how we interact with computers, mobile devices, the web, etc. The idea of authentication is to uniquely identify a user before granting access to system privileges. For example, in recent years more corporate information and applications have been accessible via the Internet and Intranet. Many employees are working from remote locations and need access to secure corporate files. During this time, it is possible for malicious or unauthorized users to gain access to the system. For this reason, it is logical to have some mechanism in place to detect whether the logged-in user is the same user in control of the user's session. Therefore, highly secure authentication methods must be used. We posit that each of us is unique in our use of computer systems. It is this uniqueness that is leveraged to "continuously authenticate users" while they use web software. To monitor user behavior, n-gram models are used to capture user interactions with web-based software. This statistical language model essentially captures sequences and sub-sequences of user actions, their orderings, and temporal relationships that make them unique by providing a model of how each user typically behaves. Users are then continuously monitored during software operations. Large deviations from "normal behavior" may indicate malicious or unintended behavior. This approach is implemented in a system called Intruder Detector (ID) that models user actions as embodied in web logs generated in response to a user's actions. User identification through web logs is cost-effective and non-intrusive. We perform experiments on a large fielded system with web logs of approximately 4000 users. For these experiments, we use two classification techniques: binary and multi-class classification. We evaluate model-specific differences of user behavior based on coarse-grained (i.e., role) and fine-grained (i.e., individual) analysis. A specific set of metrics is used to provide valuable insight into how each model performs. Intruder Detector achieves accurate results when identifying legitimate users and user types. This tool is also able to detect outliers in role-based user behavior with optimal performance. In addition to web applications, this continuous monitoring technique can be used with other user-based systems such as mobile devices and the analysis of network traffic.
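To make the n-gram idea above concrete, here is a minimal sketch of a bigram model over user actions with add-one smoothing, used to decide which user profile best explains a new session. It is an illustration under assumptions, not the Intruder Detector implementation: the class name, the action vocabulary, the smoothing choice, and the sample sessions are all hypothetical.

```python
import math
from collections import Counter


class BigramActionModel:
    """Bigram model of one user's action sequences (add-one smoothing)."""

    def __init__(self, sessions, vocabulary):
        self.vocab_size = len(set(vocabulary))
        self.pair_counts = Counter()
        self.context_counts = Counter()
        for session in sessions:
            for prev, cur in zip(session, session[1:]):
                self.pair_counts[(prev, cur)] += 1
                self.context_counts[prev] += 1

    def avg_log_prob(self, session):
        """Average log-probability per transition; higher means more typical."""
        pairs = list(zip(session, session[1:]))
        if not pairs:
            return 0.0
        total = 0.0
        for prev, cur in pairs:
            prob = (self.pair_counts[(prev, cur)] + 1) / (
                self.context_counts[prev] + self.vocab_size
            )
            total += math.log(prob)
        return total / len(pairs)


def most_likely_user(session, models):
    """Multi-class decision: pick the user whose model scores the session best."""
    return max(models, key=lambda user: models[user].avg_log_prob(session))


# Hypothetical web-log actions for two users.
ACTIONS = ["login", "search", "view", "upload", "edit", "logout"]
models = {
    "alice": BigramActionModel([["login", "search", "view", "logout"]] * 3, ACTIONS),
    "bob": BigramActionModel([["login", "upload", "edit", "logout"]] * 3, ACTIONS),
}
predicted = most_likely_user(["login", "search", "view", "logout"], models)
```

A binary variant of the same idea would compare a session's score under the claimed user's model against a threshold and flag large deviations as suspicious.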
Abstract:
The size of online image datasets is constantly increasing. Considering an image dataset with millions of images, image retrieval becomes a seemingly intractable problem for exhaustive similarity search algorithms. Hashing methods, which encode high-dimensional descriptors into compact binary strings, have become very popular because of their high efficiency in search and storage capacity. In the first part, we propose a multimodal retrieval method based on latent feature models. The procedure consists of a nonparametric Bayesian framework for learning underlying semantically meaningful abstract features in a multimodal dataset, a probabilistic retrieval model that allows cross-modal queries and an extension model for relevance feedback. In the second part, we focus on supervised hashing with kernels. We describe a flexible hashing procedure that treats binary codes and pairwise semantic similarity as latent and observed variables, respectively, in a probabilistic model based on Gaussian processes for binary classification. We present a scalable inference algorithm with the sparse pseudo-input Gaussian process (SPGP) model and distributed computing. In the last part, we define an incremental hashing strategy for dynamic databases where new images are added to the databases frequently. The method is based on a two-stage classification framework using binary and multi-class SVMs. The proposed method also enforces balance in binary codes by an imbalance penalty to obtain higher quality binary codes. We learn hash functions by an efficient algorithm where the NP-hard problem of finding optimal binary codes is solved via cyclic coordinate descent and SVMs are trained in a parallelized incremental manner. For modifications like adding images from an unseen class, we propose an incremental procedure for effective and efficient updates to the previous hash functions. 
Experiments on three large-scale image datasets demonstrate that the incremental strategy is capable of efficiently updating hash functions to the same retrieval performance as hashing from scratch.
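The efficiency argument for binary hash codes comes from the fact that comparing two codes reduces to an XOR and a bit count. The sketch below shows only that retrieval step, ranking stored codes by Hamming distance to a query code; it does not reproduce how the hash functions in this work are learned, and the function names, image names, and codes are hypothetical.

```python
def hamming_distance(code_a, code_b):
    """Number of differing bits between two binary codes stored as ints."""
    return bin(code_a ^ code_b).count("1")


def hamming_search(query_code, database, top_k=2):
    """Return the names of the top_k items closest to the query code."""
    ranked = sorted(
        database.items(),
        key=lambda item: hamming_distance(query_code, item[1]),
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical 8-bit codes for four images.
codes = {
    "img_a": 0b10101100,
    "img_b": 0b10101110,
    "img_c": 0b01010011,
    "img_d": 0b10001100,
}
nearest = hamming_search(0b10101100, codes, top_k=2)
```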
Abstract:
In this paper, the problem of semantic place categorization in mobile robotics is addressed by considering a time-based probabilistic approach called the dynamic Bayesian mixture model (DBMM), which is an improved variation of the dynamic Bayesian network. More specifically, multi-class semantic classification is performed by a DBMM composed of a mixture of heterogeneous base classifiers, using geometrical features computed from 2D laser scanner data, where the sensor is mounted on board a moving robot operating indoors. Besides its capability to combine different probabilistic classifiers, the DBMM approach also incorporates time-based (dynamic) inferences in the form of previous class-conditional probabilities and priors. Extensive experiments were carried out on publicly available benchmark datasets, highlighting the influence of the number of time-slices and the effect of additive smoothing on the classification performance of the proposed approach. Reported results, under different scenarios and conditions, show the effectiveness and competitive performance of the DBMM.
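The general idea described above, a weighted mixture of base-classifier outputs combined with the previous time-slice's belief, can be sketched as a simple recursive update. This is a simplified illustration and not the authors' exact DBMM formulation: the single-step recursion, the classifier weights, the place where the smoothing constant is applied, and all numbers are assumptions.

```python
def mixture_update(prior, classifier_outputs, weights, smoothing=0.0):
    """One time-slice update: posterior is proportional to prior * mixture.

    prior              -- class probabilities from the previous time-slice
    classifier_outputs -- one probability vector per base classifier
    weights            -- mixture weight per base classifier (assumed given)
    smoothing          -- additive constant that keeps classes from reaching 0
    """
    n_classes = len(prior)
    mixture = [
        sum(w * output[c] for w, output in zip(weights, classifier_outputs))
        for c in range(n_classes)
    ]
    unnormalized = [prior[c] * (mixture[c] + smoothing) for c in range(n_classes)]
    total = sum(unnormalized)
    return [value / total for value in unnormalized]


# Hypothetical two-class example (e.g. "corridor" vs. "office") with two
# base classifiers observed over two consecutive time-slices.
belief = [0.5, 0.5]
belief = mixture_update(belief, [[0.8, 0.2], [0.6, 0.4]], weights=[0.5, 0.5])
belief_next = mixture_update(belief, [[0.8, 0.2], [0.6, 0.4]], weights=[0.5, 0.5])
```

Feeding the posterior back in as the next prior is what lets consistent evidence across time-slices sharpen the belief, while the smoothing term limits how strongly any single slice can rule a class out.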
Abstract:
This work is a description of Tajio, a Western Malayo-Polynesian language spoken in Central Sulawesi, Indonesia. It covers the essential aspects of Tajio grammar without being exhaustive. Tajio has a medium-sized phoneme inventory consisting of twenty consonants and five vowels. The language does not have lexical (word) stress; rather, it has a phrasal accent. This phrasal accent regularly occurs on the penultimate syllable of an intonational phrase, rendering this syllable auditorily prominent through a pitch rise. Possible syllable structures in Tajio are (C)V(C). CVN structures are allowed as closed syllables, but CVN syllables in word-medial position are not frequent. As in other languages in the area, the only consonant sequences allowed in native Tajio words are sequences of a nasal followed by a homorganic obstruent. The homorganic nasal-obstruent sequences found in Tajio can occur word-initially and word-medially but never in word-final position. As in many Austronesian languages, word class classification in Tajio is not straightforward. The classification of words in Tajio must be carried out on two levels: the morphosyntactic level and the lexical level. The open word classes in Tajio consist of nouns and verbs. Verbs are further divided into intransitive verbs (dynamic intransitive verbs and statives) and dynamic transitive verbs. Based on their morphological potential, lexical roots in Tajio fall into three classes: single-class roots, dual-class roots and multi-class roots. There are two basic transitive constructions in Tajio: Actor Voice and Undergoer Voice, where the actor or undergoer argument, respectively, serves as subject. Tajio shares many characteristics with symmetrical voice languages, yet it is not fully symmetric, as arguments in AV and UV are not equally marked. Neither subjects nor objects are marked in AV constructions. In UV constructions, however, subjects are unmarked while objects are marked either by prefixation or cliticization. 
Evidence from relativization, control and raising constructions supports the analysis that AV and UV are in fact transitive, with subject arguments and object arguments behaving alike in both voices. Only the subject can be relativized, controlled, raised or function as the implicit subject of subjectless adverbial clauses. In contrast, the objects of AV and UV constructions do not exhibit these features. Tajio is a predominantly head-marking language with basic A-V-O constituent order. V and O form a constituent, and the subject can either precede or follow this complex. Thus, basic word order is S-V-O or V-O-S. Subject, as well as non-subject arguments, may be omitted when contextually specified. Verbs are marked for voice and mood, the latter of which is obligatory. The two values distinguished are realis and non-realis. Depending on the type of predicate involved in clause formation, three clause types can be distinguished: verbal clauses, existential clauses and non-verbal clauses. Tajio has a small number of multi-verbal structures that appear to qualify as serial verb constructions (SVCs). SVCs in Tajio always include a motion verb or a directional.
Abstract:
Establishment of a treatment plan is based on efficacy and easy application by the clinician, and acceptance by the patient. Treatment of adult patients with Class III malocclusion might require orthognathic surgery, especially when the deformity is severe, with a significant impact on facial esthetics. Impacted teeth can remarkably influence treatment planning, which should be precise and concise to allow a reasonably short treatment time with low biologic cost. We report here the case of a 20-year-old man who had a skeletal Class III malocclusion and impaction of the maxillary right canine, leading to remarkable deviation of the maxillary midline; this was his chief complaint. Because of the severely deviated position of the impacted canine, treatment included extraction of the maxillary right canine and left first premolar for midline correction followed by leveling, alignment, correction of compensatory tooth positioning, and orthognathic surgery to correct the skeletal Class III malocclusion because of the severe maxillary deficiency. This treatment approach allowed correction of the maxillary dental midline discrepancy to the midsagittal plane and establishment of good occlusion and optimal esthetics. (Am J Orthod Dentofacial Orthop 2010;137:840-9)
Abstract:
The mortality caused by snakebites is more damaging than many tropical diseases, such as dengue haemorrhagic fever, cholera, leishmaniasis, schistosomiasis and Chagas disease. For this reason, snakebite envenoming adversely affects health services of tropical and subtropical countries and is recognized as a neglected disease by the World Health Organization. One of the main components of snake venoms is the group of Lys49-phospholipases A2, which are catalytically inactive but possess other toxic and pharmacological activities. Preliminary studies with MjTX-I from Bothrops moojeni snake venom revealed intriguing new structural and functional characteristics compared to other bothropic Lys49-PLA2s. We present in this article a comprehensive study with MjTX-I using several techniques, including crystallography, small angle X-ray scattering, analytical size-exclusion chromatography, dynamic light scattering, myographic studies, bioinformatics and molecular phylogenetic analyses. Based on all these experiments, we demonstrated that MjTX-I is probably a unique Lys49-PLA2, which may adopt different oligomeric forms depending on the physical-chemical environment. Furthermore, we showed that its myotoxic activity is dramatically low compared to other Lys49-PLA2s, probably due to the novel oligomeric conformations and important mutations in the C-terminal region of the protein. The phylogenetic analysis also showed that this toxin is clearly distinct from other bothropic Lys49-PLA2s, in conformity with the peculiar oligomeric characteristics of MjTX-I and possible emergence of new functionalities in response to environmental changes and adaptation to new prey. © 2013 Salvador et al.
Abstract:
This paper is concerned with the existence of multi-bump solutions to a class of quasilinear Schrödinger equations in R. The proof relies on variational methods and combines some arguments given by del Pino and Felmer, Ding and Tanaka, and Séré.
Abstract:
Prototype Selection (PS) algorithms allow a faster Nearest Neighbor classification by keeping only the most profitable prototypes of the training set. In turn, these schemes typically lower the classification accuracy. In this work a new strategy for multi-label classification tasks is proposed to solve this accuracy drop without needing the entire training set. To that end, given a new instance, the PS algorithm is used as a fast recommender system which retrieves the most likely classes. Then, the actual classification is performed considering only the prototypes from the initial training set that belong to the suggested classes. Results show that this strategy provides a large set of trade-off solutions which fills the gap between PS-based classification efficiency and conventional kNN accuracy. Furthermore, this scheme is not only able to, at best, reach the performance of conventional kNN with barely a third of the distances computed, but also outperforms the latter in noisy scenarios, proving to be a much more robust approach.
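The two-stage strategy described above can be sketched as follows: a reduced prototype set proposes a few candidate classes, and kNN is then run on the full training set restricted to those classes. This is a minimal illustration under assumptions, not the authors' implementation: the reduced set is taken as given (no particular PS algorithm is applied), and the function name, parameters, and data are hypothetical.

```python
import math
from collections import Counter


def two_stage_knn(query, full_train, prototypes, n_candidates=2, k=3):
    """Classify query using class proposals from a reduced prototype set.

    full_train and prototypes are lists of (feature_tuple, label) pairs.
    """
    # Stage 1: the reduced set acts as a fast recommender of likely classes.
    ranked_protos = sorted(prototypes, key=lambda item: math.dist(query, item[0]))
    candidates = []
    for _, label in ranked_protos:
        if label not in candidates:
            candidates.append(label)
        if len(candidates) == n_candidates:
            break
    # Stage 2: conventional kNN, but only over training points whose class
    # was proposed, so fewer distances are computed than with full kNN.
    subset = [item for item in full_train if item[1] in candidates]
    neighbors = sorted(subset, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D data with three classes and one prototype per class.
train = [
    ((0.0, 0.0), "A"), ((0.2, 0.1), "A"), ((0.1, 0.3), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
    ((9.0, 0.0), "C"), ((9.1, 0.2), "C"), ((8.9, 0.1), "C"),
]
protos = [((0.1, 0.1), "A"), ((5.0, 5.0), "B"), ((9.0, 0.1), "C")]
label = two_stage_knn((0.3, 0.2), train, protos)
```

Raising `n_candidates` moves the behavior toward conventional kNN (more distances, higher accuracy), while lowering it moves toward pure PS-based classification, which mirrors the trade-off the abstract describes.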
Abstract:
"21 November 1980."