979 results for Selection techniques
Abstract:
There are many techniques for electricity market price forecasting. However, most of them are designed for expected price analysis rather than price spike forecasting. An effective method of predicting the occurrence of spikes has not yet been reported in the literature. In this paper, a data-mining-based approach is presented to give a reliable forecast of the occurrence of price spikes. Combined with the spike value prediction techniques developed by the same authors, the proposed approach aims to provide a comprehensive tool for price spike forecasting. Feature selection techniques are first described to identify the attributes relevant to the occurrence of spikes. A brief introduction to the classification techniques is given for completeness. Two algorithms, a support vector machine and a probability classifier, are chosen as the spike occurrence predictors and are discussed in detail. Realistic market data are used to test the proposed model, with promising results.
Abstract:
With the advent of multi-fibre spectrographs such as the 'Two-Degree Field' (2dF) instrument at the Anglo-Australian Telescope, quasar surveys that are free of any preselection of candidates, and of any biases this implies, have become possible for the first time. The first of these is being undertaken as part of the Fornax Spectroscopic Survey, a survey of the area around the Fornax Cluster of galaxies that aims to obtain the spectra of all objects in the magnitude range 16.5 < b_J < 19.7. To date, 3679 objects in the central π deg² area have been successfully identified from their spectral characteristics. Of these, 71 are found to be quasars, 61 with redshifts 0.3 < z < 2.2 and 10 with redshifts z > 2.2. Using this complete quasar sample, a new determination of quasar number counts is made, enabling an independent check of existing quasar surveys. Cumulative counts per square degree at a magnitude limit of b_J < 19.5 are found to be 11.5 +/- 2.2 for 0.3 < z < 2.2, 2.22 +/- 0.93 for z > 2.2 and 13.7 +/- 3.1 for z > 0.3. Given the likely detection of extra quasars in the Fornax survey, we make a more detailed examination of existing quasar selection techniques. First, looking at the use of a stellar criterion, four of the 71 quasars are 'non-stellar' on the basis of the Automated Plate Measuring facility (APM) b_J classification; however, inspection shows that all are consistent with being stellar but were misclassified owing to image confusion. Examining the ultraviolet excess and multicolour selection techniques, for the selection criteria investigated, ultraviolet excess would find 69 +/- 6 per cent of our 0.3 < z < 2.2 quasars and only 50 (+14/-18) per cent of our z > 2.2 quasars, while the completeness level for multicolour selection is found to be 90 (+3/-4) per cent for 0.3 < z < 2.2 quasars and 80 (+14/-12) per cent for z > 2.2 quasars.
The extra quasars detected by our all-object survey thus have unusually red star-like colours, and this appears to be a result of the continuum shape rather than any emission features. An intrinsic dust extinction model may, at least partly, account for the red colours.
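As a rough illustration of how a completeness figure such as "69 +/- 6 per cent" can arise, the sketch below computes a recovered fraction and its simple binomial (Wald) standard error. The count of 42 recovered quasars out of 61 is an assumption chosen to match the quoted percentage; the abstract does not give it, and the asymmetric errors quoted for the small z > 2.2 sample would need a different interval estimate than this one.

```python
import math


def completeness(found, total):
    """Return (fraction, one-sigma binomial error) for `found` out of `total`."""
    if total <= 0 or not 0 <= found <= total:
        raise ValueError("need 0 <= found <= total and total > 0")
    p = found / total
    # Wald approximation: reasonable for large samples, poor for small ones.
    err = math.sqrt(p * (1.0 - p) / total)
    return p, err


# Hypothetical count: 42 of 61 is an assumption, not a number from the abstract.
p, err = completeness(42, 61)
print(f"{100 * p:.0f} +/- {100 * err:.0f} per cent")  # prints "69 +/- 6 per cent"
```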
Abstract:
Project work submitted to obtain the degree of Master in Informatics and Computer Engineering
Abstract:
Master's degree in Human Resources Management
Abstract:
This TFC (final degree project) consists of building a tool that allows molecules to be visualised in this visualisation format. The tool must allow navigation around the molecule and the use of selection techniques to identify atoms and to calculate distances and torsion angles. In addition, it must be possible to define an axis, generate a rotation about that axis, and record it as a movie (in whichever format is simplest).
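The distance and torsion-angle calculations mentioned in this abstract can be sketched as follows. This is a minimal illustration using the standard dihedral-angle formula, not the project's own code; the function names and the representation of atoms as (x, y, z) tuples are assumptions made for the example.

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def distance(a, b):
    """Euclidean distance between two selected atoms (requires Python 3.8+)."""
    return math.dist(a, b)


def torsion_deg(p0, p1, p2, p3):
    """Signed torsion (dihedral) angle in degrees defined by four atoms."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1 = cross(b1, b2)  # normal to the plane through p0, p1, p2
    n2 = cross(b2, b3)  # normal to the plane through p1, p2, p3
    y = dot(b2, cross(n1, n2))
    x = math.sqrt(dot(b2, b2)) * dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates for four selected atoms.
atoms = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(distance(atoms[0], atoms[1]), torsion_deg(*atoms))
```

Using `atan2` instead of `acos` keeps the sign of the angle and avoids numerical trouble near 0 and 180 degrees.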
Abstract:
BACKGROUND: Transcranial magnetic stimulation combined with electroencephalogram (TMS-EEG) can be used to explore the dynamical state of neuronal networks. In patients with epilepsy, TMS can induce epileptiform discharges (EDs) with a stochastic occurrence despite constant stimulation parameters. This observation raises the possibility that the pre-stimulation period contains multiple covert states of brain excitability, some of which are associated with the generation of EDs. OBJECTIVE: To investigate whether the interictal period contains "high excitability" states that upon brain stimulation produce EDs and can be differentiated from "low excitability" states producing normal-appearing TMS-EEG responses. METHODS: In a cohort of 25 patients with Genetic Generalized Epilepsies (GGE) we identified two subjects characterized by the intermittent development of TMS-induced EDs. High excitability in the pre-stimulation period was assessed using multiple measures of univariate time series analysis. Measures providing optimal discrimination were identified by feature selection techniques. The "high excitability" states emerged in multiple loci (indicating diffuse cortical hyperexcitability) and were clearly differentiated on the basis of 14 measures from "low excitability" states (accuracy = 0.7). CONCLUSION: In GGE, the interictal period contains multiple, quasi-stable covert states of excitability, a class of which is associated with the generation of TMS-induced EDs. The relevance of these findings to theoretical models of ictogenesis is discussed.
Abstract:
A novel approach to multiclass tumor classification using Artificial Neural Networks (ANNs) was introduced in a recent paper \cite{Khan2001}. The method successfully classified and diagnosed small, round blue cell tumors (SRBCTs) of childhood into four distinct categories, neuroblastoma (NB), rhabdomyosarcoma (RMS), non-Hodgkin lymphoma (NHL) and the Ewing family of tumors (EWS), using cDNA gene expression profiles of samples that included both tumor biopsy material and cell lines. We report that using an approach similar to the one reported by Yeang et al. \cite{Yeang2001}, i.e. multiclass classification by combining the outputs of binary classifiers, we achieved equal accuracy with far fewer features. We report the performance of 3 binary classifiers (k-nearest neighbors (kNN), weighted voting (WV), and support vector machines (SVM)) with 3 feature selection techniques (Golub's Signal to Noise (SN) ratios \cite{Golub99}, Fisher scores (FSc) and Mukherjee's SVM feature selection (SVMFS) \cite{Sayan98}).
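To make the signal-to-noise feature ranking concrete, the sketch below scores each gene for one class against all the others using the commonly cited form (mean_class - mean_rest) / (std_class + std_rest) and keeps the top-ranked genes. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the use of the sample standard deviation, and the toy expression values are all made up for the example.

```python
from statistics import mean, stdev


def signal_to_noise(class_values, rest_values):
    """Signal-to-noise score of one feature for a class-versus-rest split."""
    # Assumption: sample standard deviation; each group needs >= 2 values.
    denom = stdev(class_values) + stdev(rest_values)
    if denom == 0:
        return 0.0
    return (mean(class_values) - mean(rest_values)) / denom


def top_features(samples, labels, target, k):
    """Rank features by |S2N| for `target` versus the rest; return top k and all scores."""
    in_class = [row for row, lab in zip(samples, labels) if lab == target]
    rest = [row for row, lab in zip(samples, labels) if lab != target]
    n_features = len(samples[0])
    scores = [
        signal_to_noise([r[j] for r in in_class], [r[j] for r in rest])
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: abs(scores[j]), reverse=True)
    return ranked[:k], scores


# Toy expression matrix (rows = samples, columns = genes); values are invented.
X = [[1.0, 5.0, 0.2], [1.2, 5.1, 0.9], [0.9, 4.9, 0.1],
     [3.0, 5.0, 0.8], [3.2, 5.2, 0.3], [2.9, 4.8, 0.7]]
y = ["EWS", "EWS", "EWS", "NB", "NB", "NB"]
selected, scores = top_features(X, y, "EWS", 2)
print(selected)
```

In a one-versus-rest multiclass setup, this ranking would be repeated once per class, so each binary classifier gets its own small feature set.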
Abstract:
The aim of this project is to identify which concepts of health, disease, epidemiology and risk are applicable to companies in the oil and natural gas extraction sector in Colombia. Given the low predictive power of traditional financial analyses and their inadequacy for investment and long-term decision-making, as well as their failure to consider variables such as risk and future expectations, there is a need to explore different perspectives and integrative models. This observation is pertinent to the oil and natural gas extraction sector because of the growing foreign investment it has reported: US$2,862 million in 2010, more than ten times its 2003 value. Multidimensional models could therefore be developed on the basis of financial-health, epidemiological and statistical concepts. The term health and its adoption in the business sector is useful and conceptually coherent, revealing the presence of different interacting and interconnected subsystems or factors. It should also be mentioned that a multidimensional (multi-stage) model must take risk into account, and epidemiological analysis has proved useful in determining risk and integrating it into the system alongside other concepts such as the risk ratio and relative risk. This will be analysed through a theoretical-conceptual study, which complements an earlier study, as a contribution to the corporate finance project of the Management research line.
Abstract:
Interest in the systematic analysis of astronomical time series data, together with developments in astronomical instrumentation and automation over the past two decades, has raised several questions about how to analyze and synthesize the growing amount of data. These data have led to many discoveries in areas of modern astronomy such as asteroseismology, exoplanets and stellar evolution. However, data treatment and analysis methods have failed to keep pace with the development of the instruments themselves, although much effort has been made. In the present thesis, we propose new methods of data analysis and two catalogs of variable stars that allowed the study of rotational modulation and stellar variability. The photometric databases from two distinct missions were analyzed: CoRoT (Convection Rotation and planetary Transits) and WFCAM (Wide Field Camera). Furthermore, the present work describes several methods for the analysis of photometric data, and proposes and refines data selection techniques based on variability indices. Preliminary results show that the proposed variability indices are more efficient than the indices most often used in the literature. An efficient selection of variable stars is essential to improve the efficiency of all subsequent steps. From these analyses two catalogs were obtained. First, from the WFCAM database we compiled a catalog of 319 variable stars observed in the photometric bands YZJHK. These stars show periods ranging from ~0.2 to ~560 days, and their variability signatures include RR Lyrae stars, Cepheids, LPVs and cataclysmic variables, among many others. Second, from the CoRoT database we selected 4,206 stars with typical signatures of rotational modulation, using a supervised process. These stars show periods ranging from ~0.33 to ~92 days, variability amplitudes between ~0.001 and ~0.5 mag, color index (J - H) between ~0.0 and ~1.4 mag, and CoRoT spectral types F, G, K and M.
The WFCAM variable star catalog is being used to build a database of light curves to serve as templates in an automatic classifier for variable stars observed by the VVV project, carried out with VISTA (Visible and Infrared Survey Telescope for Astronomy); moreover, it is a fundamental starting point for studying different science cases, for example a set of 12 young stars located in a star-forming region, and RR Lyrae stars, whose properties are not well established in the infrared. Based on the CoRoT results we were able to show, for the first time, the evolution of rotational modulation for a wide, homogeneous sample of field stars. The results are in agreement with those expected from stellar evolution theory. Furthermore, we identified 4 solar-type stars (with color indices, spectral type, luminosity class and rotation period close to those of the Sun), as well as 400 M-giant stars of special interest for forthcoming studies. From the solar-type stars we can describe the future and past of the Sun, while the properties of M stars are not well known. Our results allow us to conclude that the color-period diagram depends strongly on reddening, which increases the uncertainties of the age-period relations derived in previous works using CoRoT data. This thesis provides a large data set for different scientific studies, such as magnetic activity, cataclysmic variables, brown dwarfs, RR Lyrae stars, solar analogs and giant stars, among others. For instance, these data will allow us to study the relationship between magnetic activity and stellar evolution. Besides these aspects, this thesis presents an improved classification for a significant number of stars in the CoRoT database and introduces a new set of tools that can be used to improve the entire process of photometric database analysis.
Abstract:
OBJECTIVE: To identify the advantages and disadvantages of using segments, compared with drawing from the complete address list, for selecting households in multistage cluster sampling in favelas (slums). METHODS: Qualitative study conducted in four favelas sampled in the Health Survey of the Municipality of São Paulo, SP, Brazil, 2008, in which both techniques were applied. Focus groups were held with field researchers: the survey's listers and interviewers. The content of the discussions was analysed, grouped into categories and organised into thematic cores. ANALYSIS OF RESULTS: The use of household segments was associated with numerous advantages and few disadvantages. The advantages included speed and ease in compiling the address register and in locating and identifying households at the interview stage, greater safety for interviewers and the population, greater access to interviewees, greater stability and coverage of the register produced, and fewer errors in identifying the sampled households. CONCLUSIONS: Building a household register by creating segments is advantageous compared with a complete listing of addresses when carried out in favelas. Because it proved to be an economical and easily applied option, it is an alternative for simplifying the sampling process in areas with similar characteristics of disorganisation and high household density.
Abstract:
In this work, a perceptron neural-network technique is applied to estimate hourly values of the diffuse solar radiation at the surface in São Paulo City, Brazil, using as input the global solar radiation and other meteorological parameters measured from 1998 to 2001. The neural-network verification was performed using the hourly measurements of diffuse solar radiation obtained during the year 2002. The neural network was developed based on both feature determination and pattern selection techniques. It was found that the inclusion of the atmospheric long-wave radiation as input improves the neural-network performance. On the other hand, traditional meteorological parameters, like air temperature and atmospheric pressure, are not as important as long-wave radiation, which acts as a surrogate for cloud-cover information on the regional scale. An objective evaluation has shown that the diffuse solar radiation is better reproduced by neural-network synthetic series than by a correlation model. © 2004 Elsevier Ltd. All rights reserved.
Abstract:
Musical genre classification has become increasingly important in recent years, mainly in large multimedia datasets, to which new songs and genres can be added at any moment by anyone. In this context, we have seen the growth of musical recommendation systems, which can benefit several applications, such as social networks and collective musical libraries. In this work, we introduce a recent machine learning technique named Optimum-Path Forest (OPF) for musical genre classification, which has been shown to be similar to state-of-the-art pattern recognition techniques, but much faster for some applications. Experiments on two public datasets were conducted against Support Vector Machines and a Bayesian classifier to show the validity of our work. In addition, we carried out an experiment using very recent hybrid feature selection techniques based on OPF to speed up the feature extraction process. © 2011 International Society for Music Information Retrieval.
Abstract:
Background: Meat quality involves many traits, such as marbling, tenderness, juiciness, and backfat thickness, all of which require attention from livestock producers. Improving backfat thickness by means of traditional selection techniques in Canchim beef cattle has been challenging because of its low heritability and because it is measured late in an animal's life. Therefore, the implementation of new methodologies for identifying single nucleotide polymorphisms (SNPs) linked to backfat thickness is an important strategy for genetic improvement of carcass and meat quality. Results: The set of SNPs identified by the random forest approach explained as much as 50% of the deregressed estimated breeding value (dEBV) variance associated with backfat thickness, and a small set of 5 SNPs was able to explain 34% of the dEBV for backfat thickness. Several quantitative trait loci (QTL) for fat-related traits were found in the surrounding areas of the SNPs, as well as many genes with roles in lipid metabolism. Conclusions: These results provide a better understanding of the backfat deposition and regulation pathways, and can be considered a starting point for future implementation of a genomic selection program for backfat thickness in Canchim beef cattle. © 2013 Mokry et al.; licensee BioMed Central Ltd.
Abstract:
Speech recognition and synthesis systems are composed of language-dependent modules and, while many public resources exist for some languages (e.g. English and Japanese), resources for Brazilian Portuguese (BP) are still scarce. Another aspect is that, for a large number of tasks, the error rate of current speech recognition systems is still high compared with that achieved by humans. Thus, despite the success of hidden Markov models (HMMs), research into new methods is needed. This work is motivated by these two facts and is divided into two parts. The first describes the development of free resources and tools for speech recognition and synthesis in BP, consisting of audio and text databases, a phonetic dictionary, a grapheme-to-phone converter, a syllabifier, and acoustic and language models. All the resources built are publicly available and, together with a proposed programming interface, have been used to develop several new real-time applications, including a speech recognition module for the OpenOffice.org office suite. Performance tests of the developed systems are presented. The resources produced and made available here facilitate the adoption of speech technology for BP by other research groups, developers and industry. The second part of the work presents a new method for rescoring the output of HMM-based recognition, which is organised in a lattice data structure. More specifically, the system uses discriminative classifiers that seek to reduce the confusion between pairs of phones. For each of these binary problems, automatic feature selection techniques are used to choose the parametric representation best suited to the problem at hand.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)