992 results for Discrete valued features
Abstract:
Selection of features that will permit accurate pattern classification is a difficult task. However, if a particular data set is represented by discrete valued features, it becomes possible to determine empirically the contribution that each feature makes to the discrimination between classes. This paper extends the discrimination bound method so that both the maximum and average discrimination expected on unseen test data can be estimated. These estimation techniques are the basis of a backwards elimination algorithm that can be used to rank features in order of their discriminative power. Two problems are used to demonstrate this feature selection process: classification of the Mushroom Database, and a real-world, pregnancy-related medical risk prediction task - assessment of risk of perinatal death.
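The backwards elimination ranking described above can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the names `cell_auc` and `backward_elimination_ranking` are hypothetical, and the scorer is a simple resubstitution AUC over cells of identical feature values, standing in for the paper's estimates of discrimination on unseen data.

```python
from collections import Counter

def cell_auc(rows, labels, features):
    """Resubstitution AUC when examples are grouped by their values on `features`.

    Assumption: labels are 0/1 and both classes are present.
    """
    pos, neg = Counter(), Counter()
    for r, y in zip(rows, labels):
        key = tuple(r[f] for f in features)
        (pos if y == 1 else neg)[key] += 1
    # Rank cells by their empirical fraction of positive examples.
    cells = sorted(set(pos) | set(neg), key=lambda c: pos[c] / (pos[c] + neg[c]))
    concordant, neg_below = 0.0, 0
    for c in cells:
        # Positives in this cell beat negatives in lower cells; ties count one half.
        concordant += pos[c] * (neg_below + 0.5 * neg[c])
        neg_below += neg[c]
    return concordant / (sum(pos.values()) * sum(neg.values()))

def backward_elimination_ranking(rows, labels, score=cell_auc):
    """Return feature indices ordered from most to least discriminative."""
    remaining = list(range(len(rows[0])))
    removed = []
    while len(remaining) > 1:
        # Drop the feature whose removal leaves the highest score.
        drop = max(
            remaining,
            key=lambda f: score(rows, labels, [g for g in remaining if g != f]),
        )
        remaining.remove(drop)
        removed.append(drop)
    removed.extend(remaining)
    return removed[::-1]

# Feature 0 equals the label, feature 1 is noise, feature 2 is constant.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)]
labels = [0, 0, 1, 1]
ranking = backward_elimination_ranking(rows, labels)
```

Features eliminated early are the ones the remaining set can best do without, so the last survivor is ranked first. Any other scorer with the same signature can be passed through the `score` parameter.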
Abstract:
2000 Mathematics Subject Classification: 11S31 12E15 12F10 12J20.
Abstract:
Only some of the information contained in a medical record will be useful to the prediction of patient outcome. We describe a novel method for selecting those outcome predictors which allow us to reliably discriminate between adverse and benign end results. Using the area under the receiver operating characteristic as a nonparametric measure of discrimination, we show how to calculate the maximum discrimination attainable with a given set of discrete valued features. This upper limit forms the basis of our feature selection algorithm. We use the algorithm to select features (from maternity records) relevant to the prediction of failure to progress in labour. The results of this analysis motivate investigation of those predictors of failure to progress relevant to parous and nulliparous sub-populations.
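To make the idea of a maximum attainable discrimination concrete, here is a small sketch under stated assumptions. With discrete valued features, every example falls into a cell defined by its combination of feature values, and no scoring rule can separate examples within a cell. Ranking cells by their empirical fraction of adverse outcomes therefore gives the largest area under the ROC curve obtainable on that data set. The function name `max_attainable_auc` is hypothetical, and this resubstitution calculation is an illustration rather than the authors' exact procedure.

```python
from collections import Counter

def max_attainable_auc(rows, labels):
    """Largest AUC any scoring of the discrete cells can reach on this data.

    Assumptions: `rows` are tuples of discrete feature values, `labels` are 0/1.
    """
    pos, neg = Counter(), Counter()
    for r, y in zip(rows, labels):
        (pos if y == 1 else neg)[tuple(r)] += 1
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    # The optimal ordering sorts cells by their fraction of positive examples.
    cells = sorted(set(pos) | set(neg), key=lambda c: pos[c] / (pos[c] + neg[c]))
    concordant, neg_below = 0.0, 0
    for c in cells:
        # Each positive outranks all negatives in lower cells and
        # ties (counted as one half) with negatives in its own cell.
        concordant += pos[c] * (neg_below + 0.5 * neg[c])
        neg_below += neg[c]
    return concordant / (n_pos * n_neg)
```

For example, with one binary feature, `max_attainable_auc([(0,), (0,), (0,), (1,)], [0, 0, 1, 1])` evaluates to 0.75: the positive example sharing a cell with two negatives can only tie with them.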
Abstract:
We propose expected attainable discrimination (EAD) as a measure to select discrete valued features for reliable discrimination between two classes of data. EAD is an average of the area under the ROC curves obtained when a simple histogram probability density model is trained and tested on many random partitions of a data set. EAD can be incorporated into various stepwise search methods to determine promising subsets of features, particularly when misclassification costs are difficult or impossible to specify. Experimental application to the problem of risk prediction in pregnancy is described.
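The EAD definition above (an average of ROC areas over many random train/test partitions of a histogram model) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code. The function names, the 50/50 split, the number of partitions, and the use of a smoothed class prior as the score for feature combinations unseen in training are all assumptions.

```python
import random
from collections import Counter

def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison; ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None  # undefined when a class is missing
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def expected_attainable_discrimination(rows, labels, n_partitions=200,
                                       test_fraction=0.5, seed=0):
    """Average test AUC of a histogram model over random partitions."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    n_test = max(1, int(round(test_fraction * len(idx))))
    aucs = []
    for _ in range(n_partitions):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        # Histogram model: per-cell class counts from the training part.
        pos = Counter(tuple(rows[i]) for i in train if labels[i] == 1)
        neg = Counter(tuple(rows[i]) for i in train if labels[i] == 0)
        prior = (sum(pos.values()) + 1) / (len(train) + 2)

        def score(cell):
            seen = pos[cell] + neg[cell]
            # Assumption: unseen cells fall back to the smoothed class prior.
            return pos[cell] / seen if seen else prior

        a = auc([score(tuple(rows[i])) for i in test], [labels[i] for i in test])
        if a is not None:
            aucs.append(a)
    if not aucs:
        raise ValueError("no partition contained both classes in the test part")
    return sum(aucs) / len(aucs)
```

A candidate feature subset can be scored by passing rows restricted to that subset, which is how a measure of this kind would plug into a stepwise search.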
Abstract:
In this paper, we use optical flow based complex-valued features extracted from video sequences to recognize human actions. The optical flow features between two image planes can be appropriately represented in the complex plane. Therefore, we argue that motion information that is used to model human actions should be represented as complex-valued features, and we propose a fast learning fully complex-valued neural classifier to solve the action recognition task. The classifier, termed the "fast learning fully complex-valued neural (FLFCN) classifier", is a single hidden layer fully complex-valued neural network. The neurons in the hidden layer employ a fully complex-valued activation function of the hyperbolic secant type. The parameters of the hidden layer are chosen randomly and the output weights are estimated as the minimum-norm least-squares solution to a set of linear equations. The results indicate the superior performance of the FLFCN classifier in recognizing the actions compared to real-valued support vector machines and other existing results in the literature. Complex-valued representation of 2D motion and orthogonal decision boundaries boost the classification performance of the FLFCN classifier. (c) 2012 Elsevier B.V. All rights reserved.
Abstract:
Population coding is widely regarded as a key mechanism for achieving reliable behavioral decisions. We previously introduced reinforcement learning for population-based decision making by spiking neurons. Here we generalize population reinforcement learning to spike-based plasticity rules that take account of the postsynaptic neural code. We consider spike/no-spike, spike count and spike latency codes. The multi-valued and continuous-valued features in the postsynaptic code allow for a generalization of binary decision making to multi-valued decision making and continuous-valued action selection. We show that code-specific learning rules speed up learning both for the discrete classification and the continuous regression tasks. The suggested learning rules also speed up with increasing population size as opposed to standard reinforcement learning rules. Continuous action selection is further shown to explain realistic learning speeds in the Morris water maze. Finally, we introduce the concept of action perturbation as opposed to the classical weight- or node-perturbation as an exploration mechanism underlying reinforcement learning. Exploration in the action space greatly increases the speed of learning as compared to exploration in the neuron or weight space.
Abstract:
Background: There is a growing trend for individuals to seek health information from online sources. Alcohol and other drug (AOD) use is a significant health problem worldwide, but access to and use of AOD websites are poorly understood. Objective: To investigate content and functionality preferences for AOD and other health websites. Methods: An anonymous online survey examined general Internet and AOD-specific usage and search behaviors, valued features of AOD and health-related websites (general and interactive website features), indicators of website trustworthiness, valued AOD website tools or functions, and treatment modality preferences. Results: Surveys were obtained from 1214 drug (n = 766) and alcohol website users (n = 448) (mean age 26.2 years, range 16-70). There were no significant differences between alcohol and drug groups on demographic variables, Internet usage, indicators of website trustworthiness, or on preferences for AOD website functionality. A robust website design/navigation, open access, and validated content provision were highly valued by both groups. While attractiveness and pictures or graphics were also valued, high-cost features (videos, animations, games) were minority preferences. Almost half of respondents in both groups were unable to readily access the information they sought. Alcohol website users placed greater importance on several AOD website tools and functions than did those accessing other drug websites: online screening tools (χ²₂ = 15.8, P < .001, n = 985); prevention programs (χ²₂ = 27.5, P < .001, n = 981); tracking functions (χ²₂ = 11.5, P = .003, n = 983); self-help treatment programs (χ²₂ = 8.3, P = .02, n = 984); and downloadable fact sheets for friends (χ²₂ = 11.6, P = .003, n = 981) or family (χ²₂ = 12.7, P = .002, n = 983). The most preferred online treatment option for both user groups was an Internet site with email therapist support. 
Explorations of demographic differences were also performed. While gender did not affect survey responses, younger respondents were more likely to value interactive and social networking features, whereas downloading of credible information was most highly valued by older respondents. Conclusions: Significant deficiencies in the provision of accessible information on AOD websites were identified, an important problem since information seeking was the most common reason for accessing these websites and may therefore be a key avenue for engaging website users in behaviour change. The few differences between AOD website users suggested that both types of websites may have similar features, although alcohol website users may more readily be engaged in screening, prevention and self-help programs, tracking change, and may value fact sheets more highly. While the sociodemographic differences require replication and clarification, these differences support the notion that the design and features of AOD websites should target specific audiences to have maximal impact.
Abstract:
Methicillin-resistant Staphylococcus aureus (MRSA) is a pathogen that continues to be of major concern in hospitals. We develop models and computational schemes based on observed weekly incidence data to estimate MRSA transmission parameters. We extend the deterministic model of McBryde, Pettitt, and McElwain (2007, Journal of Theoretical Biology 245, 470–481) involving an underlying population of MRSA colonized patients and health-care workers that describes, among other processes, transmission between uncolonized patients and colonized health-care workers and vice versa. We develop new bivariate and trivariate Markov models to include incidence so that estimated transmission rates can be based directly on new colonizations rather than indirectly on prevalence. Imperfect sensitivity of pathogen detection is modeled using a hidden Markov process. The advantages of our approach include (i) a discrete valued assumption for the number of colonized health-care workers, (ii) two transmission parameters can be incorporated into the likelihood, (iii) the likelihood depends on the number of new cases to improve precision of inference, (iv) individual patient records are not required, and (v) the possibility of imperfect detection of colonization is incorporated. We compare our approach with that used by McBryde et al. (2007) based on an approximation that eliminates the health-care workers from the model, uses Markov chain Monte Carlo and individual patient data. We apply these models to MRSA colonization data collected in a small intensive care unit at the Princess Alexandra Hospital, Brisbane, Australia.
Abstract:
Robust hashing is an emerging field that can be used to hash certain data types in applications unsuitable for traditional cryptographic hashing methods. Traditional hashing functions have been used extensively for data/message integrity, data/message authentication, efficient file identification and password verification. These applications are possible because the hashing process is compressive, allowing for efficient comparisons in the hash domain, but non-invertible, meaning hashes can be used without revealing the original data. These techniques were developed with deterministic (non-changing) inputs such as files and passwords. For such data types a 1-bit or one-character change can be significant; as a result, the hashing process is sensitive to any change in the input. Unfortunately, there are certain applications where input data are not perfectly deterministic and minor changes cannot be avoided. Digital images and biometric features are two types of data where such changes exist but do not alter the meaning or appearance of the input. For such data types cryptographic hash functions cannot be usefully applied. In light of this, robust hashing has been developed as an alternative to cryptographic hashing and is designed to be robust to minor changes in the input. Although similar in name, robust hashing is fundamentally different from cryptographic hashing. Current robust hashing techniques are not based on cryptographic methods, but instead on pattern recognition techniques. Modern robust hashing algorithms consist of feature extraction followed by a randomization stage that introduces non-invertibility and compression, followed by quantization and binary encoding to produce a binary hash output. In order to preserve robustness of the extracted features, most randomization methods are linear and this is detrimental to the security aspects required of hash functions. 
Furthermore, the quantization and encoding stages used to binarize real-valued features require the learning of appropriate quantization thresholds. How these thresholds are learnt has an important effect on hashing accuracy, and the mere presence of such thresholds is a source of information leakage that can reduce hashing security. This dissertation outlines a systematic investigation of the quantization and encoding stages of robust hash functions. While existing literature has focused on the importance of the quantization scheme, this research is the first to emphasise the importance of quantizer training for both hashing accuracy and hashing security. The quantizer training process is presented in a statistical framework which allows a theoretical analysis of the effects of quantizer training on hashing performance. This is experimentally verified using a number of baseline robust image hashing algorithms over a large database of real world images. This dissertation also proposes a new randomization method for robust image hashing based on Higher Order Spectra (HOS) and Radon projections. The method is non-linear and this is an essential requirement for non-invertibility. The method is also designed to produce features more suited for quantization and encoding. The system can operate without the need for quantizer training, is more easily encoded and displays improved hashing performance when compared to existing robust image hashing algorithms. The dissertation also shows how the HOS method can be adapted to work with biometric features obtained from 2D and 3D face images.
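The quantization and encoding stage described above can be illustrated with a short sketch. It assumes one simple quantizer, a per-feature median threshold learnt from a training set, which is a common choice but is not stated in the abstract. All function names and numbers here are hypothetical.

```python
from statistics import median

def train_thresholds(feature_vectors):
    """Learn one threshold per feature: the median over the training set (assumed scheme)."""
    return [median(column) for column in zip(*feature_vectors)]

def binarize(features, thresholds):
    """Encode a real-valued feature vector as one bit per feature."""
    return [1 if x > t else 0 for x, t in zip(features, thresholds)]

def hamming(a, b):
    """Number of differing bits between two binary hashes."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical training features from three images, two features each.
training = [[0.1, 5.0], [0.3, 7.0], [0.9, 6.0]]
thresholds = train_thresholds(training)         # [0.3, 6.0]

original = binarize([0.80, 5.5], thresholds)    # [1, 0]
perturbed = binarize([0.82, 5.4], thresholds)   # minor change, same bits
different = binarize([0.10, 9.0], thresholds)   # [0, 1]
```

Here `hamming(original, perturbed)` is 0 while `hamming(original, different)` is 2, which is the robustness property the hash is meant to have. The stored `thresholds` list is the kind of side information whose presence the abstract identifies as a possible source of leakage.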
Abstract:
The characteristic features of the absorption and photoluminescence spectra of ZnSe quantum dots (QDs) inside a silica matrix derived from a sol-gel method were studied at room temperature. Compared with the bulk materials, the absorption edges of ZnSe QDs in silica gel glass were shifted to higher energies and the spectra exhibited the discrete excitonic features due to the quantum confinement effects. Besides the band-edge emission, photoluminescence at ultraviolet excitation also showed the emissions related to the higher excitonic states. (C) 2004 Elsevier B.V. All rights reserved.
Abstract:
This paper develops a framework to test whether discrete-valued irregularly-spaced financial transactions data follow a subordinated Markov process. For that purpose, we consider a specific optional sampling in which a continuous-time Markov process is observed only when it crosses some discrete level. This framework is convenient for it accommodates not only the irregular spacing of transactions data, but also price discreteness. Further, it turns out that, under such an observation rule, the current price duration is independent of previous price durations given the current price realization. A simple nonparametric test then follows by examining whether this conditional independence property holds. Finally, we investigate whether or not bid-ask spreads follow Markov processes using transactions data from the New York Stock Exchange. The motivation lies in the fact that asymmetric information models of market microstructures predict that the Markov property does not hold for the bid-ask spread. The results are mixed in the sense that the Markov assumption is rejected for three out of the five stocks we have analyzed.
Abstract:
Graduate Program in Education - FFC
Abstract:
Geological mapping carried out in the Nova Canadá area, in the southern portion of the Carajás Domain, combined with petrographic and geochemical studies, allowed the characterization of at least three new units that were previously included in the geological context of the Xingu Complex. They are: (i) the Nova Canadá Leucogranodiorite, which consists of leucogranodioritic rocks more enriched in Al2O3, CaO, Na2O, Ba, Sr and in the Sr/Y ratio, which show strong geochemical affinities with the Guarantã Suite of the Rio Maria Domain and can also be correlated with the Transitional TTGs of the Yilgarn Craton. These rocks display a slightly fractionated REE pattern, with low (La/Yb)N ratios and absent or slight negative Eu anomalies; (ii) the Velha Canadá Leucogranite, characterized by higher contents of SiO2, Fe2O3, TiO2, K2O, Rb and HFSE (Zr, Y and Nb), and by higher K2O/Na2O, FeOt/(FeOt+MgO), Ba/Sr and Rb/Sr ratios. These rocks show two distinct REE patterns: (a) low to moderate (La/Yb)N ratios with pronounced negative Eu anomalies; and (b) moderate to high (La/Yb)N ratios, with slight negative Eu anomalies and a concave HREE pattern. In several respects, the rocks of the Velha Canadá granite show strong affinities with the Xinguara- and Mata Surrão-type potassic leucogranites of the Rio Maria Domain, as well as with those of the Canaã dos Carajás region and, more subtly, with the low-Ca granites of the Yilgarn Craton. 
For the origin of the rocks of the Nova Canadá Leucogranodiorite, the hypothesis of fractional crystallization from liquids with sanukitoid affinity, followed by mixing between these and liquids of trondhjemitic composition, is accepted, whereas the high-K rocks of the Velha Canadá Leucogranite are believed to derive from partial melting of TTG-type metatonalites at different crustal levels, generating liquids with such characteristics; and (iii) trondhjemitic associations with TTG affinity, with high Al2O3 and Na2O and low K2O, compatible with the Archean granitoids of the low-potassium tonalitic-trondhjemitic calc-alkaline series. Two varieties were distinguished: (a) biotite trondhjemite with a structure marked by the development of features that indicate the action of at least two deformational events in syn- to post-magmatic stages, such as compositional banding, folds and signs of migmatization; and (b) muscovite ± biotite trondhjemite, which is distinguished from the former variety by the presence of muscovite, saussuritization of plagioclase, medium equigranular texture and weak deformation, with the development of a low-angle E-W foliation. The first variety of these lithotypes, which occurs predominantly in the northern portion, has a restricted occurrence. Its intense deformation and probable anatexis features (migmatites) may indicate that these rocks were affected by crustal reworking linked to the generation of the leucogranites predominantly described in the area. The trondhjemites in the south of the area are more enriched in Fe2O3, MgO, TiO2, CaO, Zr, Rb and in the Rb/Sr ratio than the trondhjemites of the northern portion of the area. They also exhibit fractionated REE patterns, with variations in HREE contents, as well as an absence of Eu and Sr anomalies and low Y and Yb contents. 
Such features are typically attributed to magmas generated by partial melting of a mafic source at different depths, with increasing influence of garnet in the residue and the absence of plagioclase in both the residual and the fractionating phases. In a general analysis, the arrangement of the evolutionary geochemical trends of both varieties suggests that these units are not comagmatic. The geochemical affinities between the rocks of the Nova Canadá area and those of the Mesoarchean Rio Maria Domain could lead us to understand the Nova Canadá region as a northward extension of Rio Maria, whereas for the rocks of the Velha Canadá Leucogranite, which are younger and were generated in the Neoarchean, the idea of an association with the same tectono-magmatic events that acted in Rio Maria is ruled out.
Abstract:
This paper proposes a semantic analysis of the French free-choice indefinite 'n’importe qui'. The semantics of the indefinite is organised as a ternary structure. The (1) abstract meaning underlies all uses of the item and acts as a principle of creative interpretation generation and comprehension. This principle is actualised via (2) discrete contextual features through to (3) contextual interpretations. Thus, the “existential” reading of 'n’importe qui' is derived by a veridical reading of the arbitrary selection of a qualitatively-marked occurrence from the set of human animates. The derivation of contextual readings from the enrichment by contextual cues of an underspecified meaning has a claim to an explanatory model of the semantics of grammatical polysemous items, and is certainly relevant to model-theoretic approaches inasmuch as formal semantic notions are intricately linked to the contextual interpretation of items. It is not 'n’importe qui' itself, but its contextual interpretations which may be weak or strong, and a homonymous treatment is not possible given the continuity of the quality and free-choice dimensions from one observed reading of 'n’importe qui' to the next.