Digital Library

842 results for data movement problem

Eigenvalue computations in the context of data-sparse approximations of integral operators

Relevance: 30.00%

Abstract:

In this work, we consider the numerical solution of a large eigenvalue problem resulting from a finite rank discretization of an integral operator. We are interested in computing a few eigenpairs, with an iterative method, so a matrix representation that allows for fast matrix-vector products is required. Hierarchical matrices are appropriate for this setting, and also provide cheap LU decompositions required in the spectral transformation technique. We illustrate the use of freely available software tools to address the problem, in particular SLEPc for the eigensolvers and HLib for the construction of H-matrices. The numerical tests are performed using an astrophysics application. Results show the benefits of the data-sparse representation compared to standard storage schemes, in terms of computational cost as well as memory requirements.
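The role of fast matrix-vector products in an iterative eigensolver can be illustrated with a minimal sketch. It does not use SLEPc or HLib, and it swaps in two simpler techniques: a global low-rank factorization U V^T stands in for the hierarchical matrix, and plain power iteration stands in for the spectrally transformed solvers. The function names `low_rank_matvec` and `power_iteration` are hypothetical.

```python
import math


def low_rank_matvec(U, V, x):
    """Compute (U V^T) x without forming the dense n-by-n matrix.

    U and V are lists of k vectors of length n (an assumed storage layout).
    The cost is O(n k) instead of the O(n^2) of a dense product.
    """
    y = [0.0] * len(x)
    for u, v in zip(U, V):
        coeff = sum(vi * xi for vi, xi in zip(v, x))  # scalar v^T x
        for i, ui in enumerate(u):
            y[i] += coeff * ui
    return y


def power_iteration(matvec, n, max_iters=500, tol=1e-12):
    """Estimate the dominant eigenpair using only matrix-vector products."""
    x = [1.0 / math.sqrt(n)] * n  # unit-norm starting vector
    eigenvalue = 0.0
    for _ in range(max_iters):
        y = matvec(x)
        # Rayleigh quotient x^T A x (x has unit norm).
        new_eigenvalue = sum(xi * yi for xi, yi in zip(x, y))
        norm = math.sqrt(sum(yi * yi for yi in y))
        if norm == 0.0:
            return 0.0, x
        x = [yi / norm for yi in y]
        if abs(new_eigenvalue - eigenvalue) <= tol * max(1.0, abs(new_eigenvalue)):
            return new_eigenvalue, x
        eigenvalue = new_eigenvalue
    return eigenvalue, x


# Rank-1 symmetric example: A = u u^T with u = (1, 2, 2).
U = [[1.0, 2.0, 2.0]]
dominant, vector = power_iteration(lambda x: low_rank_matvec(U, U, x), 3)
```

The solver only ever calls `matvec`, so any representation with a cheap product can be plugged in without changing the iteration.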

The problem of estimating the volatility of zero coupon bond interest rate

Relevance: 30.00%

Abstract:

The financial literature and the financial industry often use zero coupon yield curves as input for testing hypotheses, pricing assets, or managing risk, and assume that the data provided are accurate. We analyse the implications of the methodology and of the sample selection criteria used to estimate the zero coupon bond yield term structure for the resulting volatility of spot rates with different maturities. We obtain the volatility term structure using historical volatilities and EGARCH volatilities. As input for these volatilities we consider our own spot rate estimates from GovPX bond data and three popular interest rate data sets: from the Federal Reserve Board, from the US Department of the Treasury (H15), and from Bloomberg. We find strong evidence that the resulting zero coupon bond yield volatility estimates, as well as the correlation coefficients among spot and forward rates, depend significantly on the data set. We observe economically relevant differences when the volatilities are used to price derivatives.
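As a rough illustration of why the input data set matters, the sketch below computes a historical volatility from a series of spot rates. It assumes one common definition (the sample standard deviation of period-to-period rate changes, annualized by the square root of the number of periods per year), which may differ from the estimator used in the paper. The two rate series are invented numbers, not GovPX, Federal Reserve, Treasury, or Bloomberg data.

```python
import math
import statistics


def historical_volatility(rates, periods_per_year=252):
    """Annualized sample standard deviation of successive rate changes.

    `rates` is a sequence of spot rates in percent for one maturity.
    The 252-day annualization factor is an assumption.
    """
    changes = [later - earlier for earlier, later in zip(rates, rates[1:])]
    if len(changes) < 2:
        raise ValueError("need at least three observations")
    return statistics.stdev(changes) * math.sqrt(periods_per_year)


# Two hypothetical estimates of the same spot rate from different sources.
source_a = [5.00, 5.10, 5.00, 5.10, 5.00]
source_b = [5.00, 5.05, 5.00, 5.05, 5.00]

vol_a = historical_volatility(source_a)
vol_b = historical_volatility(source_b)
```

In this toy case the two series describe the same rate at similar levels, yet `vol_a` is twice `vol_b` because the day-to-day changes differ.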

A user-centered interface for scheduling problem definition

Relevance: 30.00%

Abstract:

In this paper we present a user-centered interface for a scheduling system. The purpose of this interface is to provide graphical and interactive ways of defining a scheduling problem. To create this user interface, an evaluation-centered user interaction development method was adopted: the star life cycle. The resulting prototype comprises the Task Module and the Scheduling Problem Module. The first allows users to define a sequence of operations, i.e., a task. The second enables the definition of a scheduling problem, which consists of a set of tasks. Both modules are equipped with a set of real-time validations to ensure that the data input required by the scheduling module of the system is defined correctly. The usability evaluation allowed us to measure the ease of interaction and to observe the different forms of interaction exhibited by each participant, namely the reactions to the real-time validation mechanism.
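The data model described here (a task as a sequence of operations, a scheduling problem as a set of tasks, each with validation) could be sketched as follows. All class names, fields, and validation rules are hypothetical, since the abstract does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    name: str
    machine: str      # hypothetical field: resource that runs the operation
    duration: float   # hypothetical field: processing time


@dataclass
class Task:
    name: str
    operations: list = field(default_factory=list)  # ordered sequence

    def validate(self):
        """Return a list of human-readable problems (empty if valid)."""
        errors = []
        if not self.operations:
            errors.append(f"task {self.name!r} has no operations")
        for op in self.operations:
            if op.duration <= 0:
                errors.append(
                    f"operation {op.name!r} in task {self.name!r} "
                    "needs a positive duration"
                )
        return errors


@dataclass
class SchedulingProblem:
    tasks: list = field(default_factory=list)

    def validate(self):
        """Check task names are unique, then validate each task."""
        errors = []
        seen = set()
        for task in self.tasks:
            if task.name in seen:
                errors.append(f"duplicate task name {task.name!r}")
            seen.add(task.name)
            errors.extend(task.validate())
        return errors


problem = SchedulingProblem(tasks=[
    Task("T1", [Operation("cut", "M1", 3.0), Operation("drill", "M2", 2.0)]),
    Task("T2", []),  # invalid on purpose: no operations yet
])
messages = problem.validate()
```

Returning a list of messages rather than raising on the first error suits an interactive interface, which can show every outstanding problem while the user is still editing.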

Using data mining techniques to support DR programs definition in smart grids

Relevance: 30.00%

Abstract:

In recent years, Power Systems (PS) have experienced many changes in their operation. The introduction of new players managing Distributed Generation (DG) units and the existence of new Demand Response (DR) programs make the control of the system a more complex problem while allowing more flexible management. Intelligent resource management in the context of smart grids is of great importance so that smart grid functions are assured. This paper proposes a new methodology to support system operators and/or Virtual Power Players (VPPs) in determining effective and efficient DR programs that can be put into practice. The method is based on data mining techniques applied to a database obtained for a large set of operation scenarios. The paper includes a case study based on 27,000 scenarios considering a diversity of distributed resources in a 32-bus distribution network.

Ant colony search algorithm for the optimal power flow problem

Relevance: 30.00%

Abstract:

To maintain a power system within its operating limits at the ahead-planning level, it is necessary to apply competitive techniques to solve the optimal power flow (OPF) problem. OPF is a non-linear, large combinatorial problem. The Ant Colony Search (ACS) optimization algorithm is inspired by the organized natural movement of real ants and has been successfully applied to different large combinatorial optimization problems. This paper presents an implementation of Ant Colony optimization to solve the OPF in an economic dispatch context. The proposed methodology has been developed for maintenance and repair planning 48 to 24 hours in advance. The main advantage of this method is its low execution time, which allows the use of OPF when a large set of scenarios has to be analyzed. The paper includes a case study using the IEEE 30-bus network. The results are compared with other well-known methodologies presented in the literature.

Wind forecasting based on Data Mining techniques

Relevance: 30.00%

Abstract:

Master's degree in Electrical Engineering – Electrical Power Systems

Agreement between data obtained from repeated interviews with a six-years interval

Relevance: 30.00%

Abstract:

The objective of the study was to compare information collected through face-to-face interviews at baseline and six years later in a city of Southeastern Brazil. In 1998, 32 mothers (N=32) of children aged 20 to 30 months answered a face-to-face interview with structured questions regarding their children's brushing habits. Six years later, the same interview was repeated with the same mothers. The two interviews were compared for overall agreement, kappa, and weighted kappa. Overall agreement between the interviews varied from 41% to 96%. Kappa values ranged from 0.00 to 0.65 (very bad to good) without any significant differences. The results showed a lack of agreement when the same interview is conducted six years later, indicating that recall bias can be a methodological problem in interviews.
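The two unweighted agreement measures named in the abstract can be computed as in the sketch below, which uses the standard definitions of overall (observed) agreement and Cohen's kappa. The answer lists are invented for illustration and are not the study's data; weighted kappa is not covered.

```python
from collections import Counter


def overall_agreement(first, second):
    """Fraction of respondents who gave the same answer both times."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty answer lists of equal length")
    return sum(a == b for a, b in zip(first, second)) / len(first)


def cohen_kappa(first, second):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(first)
    observed = overall_agreement(first, second)
    counts_first = Counter(first)
    counts_second = Counter(second)
    # Chance agreement from the marginal answer frequencies.
    expected = sum(
        counts_first[c] * counts_second[c] for c in counts_first
    ) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical answers to one yes/no question from ten mothers.
interview_1998 = ["yes"] * 5 + ["no"] * 5
interview_2004 = ["yes"] * 4 + ["no"] * 5 + ["yes"]

agreement = overall_agreement(interview_1998, interview_2004)
kappa = cohen_kappa(interview_1998, interview_2004)
```

In this example eight of ten answers match, but half of that agreement is expected by chance, so kappa is noticeably lower than the raw agreement.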

Visual perception system for autonomous aerial vehicles

Relevance: 30.00%

Abstract:

This dissertation addresses the problem of detecting and avoiding moving obstacles ("SAA - Sense And Avoid") for aerial vehicles. In particular, it presents contributions toward solutions that allow unmanned aircraft to be used in non-segregated airspace and for civil applications. These contributions are: an analysis of the SAA problem in civil "UAVs - Unmanned Aerial Vehicles"; the definition of the concept and methodology for designing this type of system; a "benchmarking" proposal for the SAA system, characterizing a set of "datasets" suitable for validating detection methods; the corresponding experimental validation of the process and acquisition of "datasets"; an analysis of the state of the art in the detection of "dim point features"; and the design of an architecture for an SAA solution that integrates "ego motion" compensation, with its validation on a collected "dataset". To support the comparative analysis of different methods and the validation of solutions, the collection of a set of "datasets" of sensor and navigation information was proposed, and a set of experiments and experimental scenarios was defined for them. An experimental setup for collecting the "datasets" was designed and implemented, and collection experiments were carried out using manned aircraft. The setup incorporates a high-precision inertial system, two synchronized digital cameras (enabling the analysis of stereo information), and a GPS receiver. The target aircraft carry a GPS receiver with a built-in logger, allowing spatial correlation of the detection results. With this system, data were collected for approach scenarios with different trajectories and environmental conditions, including scenarios with movement of the detecting device.
The proposed method was validated on the collected datasets. A preliminary analysis showed detection of the obstacle (an ultralight aircraft) in all frames at distances below 3 km, with success rates of around 95% for distances between 3 and 4 km. The results validate the proposed architecture for solving the SAA problem in autonomous aerial vehicles and open very promising prospects for future development with strong technical-scientific and socio-economic impact. Incorporating "ego motion" information provides a strong increase in performance.

Fast growing fungi: a problem to be solved to achieve characterization of occupational exposure to fungi in cork industry

Relevance: 30.00%

Abstract:

Chrysonilia sitophila is a common mould in the cork industry and has been identified as a cause of IgE sensitization and occupational asthma. This fungal species has a fast growth rate that may inhibit the growth of other species, causing data from the characterization of occupational fungal exposure to be underestimated. Aiming to ascertain occupational exposure to fungi in the cork industry, papers from 2000 were analyzed to identify the best air sampling method for quantifying and identifying all airborne culturable fungi, not only those with fast growth rates. The impaction method does not allow the collection of a representative air volume because, even with media that restrict colony growth, counting colonies is very difficult in environments with a high fungal load, such as the cork industry. The impinger method, in contrast, permits the collection of a representative air volume, since the collected volume can be diluted. Besides culture methods, which allow fungal identification through macro- and micro-morphology, growth features, thermotolerance, and ecological data, molecular biology can be applied with the impinger method to detect non-viable particles and potential mycotoxin-producing strains, and mycotoxins can be detected with ELISA or HPLC. Selecting the best air sampling method for each setting is crucial to characterizing occupational exposure to fungi. Information about the prevalent fungal species in each setting, and about the possible fungal load, is needed for a judicious selection.

Feature selection for clustering categorical data with an embedded modelling approach

Relevance: 30.00%

Abstract:

Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches having focused on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), where a set of latent variables indicates the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data, drawn from official statistics, shows its usefulness.

Determining the number of clusters in categorical data

Relevance: 30.00%

Abstract:

Cluster analysis for categorical data has been an active area of research. A well-known problem in this area is the determination of the number of clusters, which is unknown and must be inferred from the data. To estimate the number of clusters, one often resorts to information criteria, such as BIC (Bayesian information criterion), MML (minimum message length, proposed by Wallace and Boulton, 1968), and ICL (integrated classification likelihood). In this work, we adopt the approach developed by Figueiredo and Jain (2002) for clustering continuous data. They use an MML criterion to select the number of clusters and a variant of the EM algorithm to estimate the model parameters. This EM variant seamlessly integrates model estimation and selection in a single algorithm. For clustering categorical data, we assume a finite mixture of multinomial distributions and implement a new EM algorithm, following a previous version (Silvestre et al., 2008). Results obtained with synthetic datasets are encouraging. The main advantage of the proposed approach, when compared to the above-mentioned criteria, is the speed of execution, which is especially relevant when dealing with large data sets.
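For reference, the BIC baseline mentioned above can be written out for a finite mixture of multinomials. The notation is an assumption made for this sketch: n observations, K clusters, J categorical features with c_j categories each, and maximized likelihood L. The MML criterion used in the paper is a different expression and is not reproduced here.

```latex
% Number of free parameters: K - 1 mixing weights, plus (c_j - 1)
% category probabilities per feature j in each of the K clusters.
d_K = (K - 1) + K \sum_{j=1}^{J} (c_j - 1)

% BIC for a K-cluster model; the selected K minimizes this value.
\mathrm{BIC}(K) = -2 \log L(\hat{\theta}_K) + d_K \log n
```

Using such a criterion requires fitting one model per candidate K, which is the repeated cost that an integrated estimation-and-selection algorithm avoids.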

Disseminating data using broadcast when topology is unknown

Relevance: 30.00%

Abstract:

Consider the problem of disseminating data from an arbitrary source node to all other nodes in a distributed computer system, such as a Wireless Sensor Network (WSN). We assume that wireless broadcast is used and that nodes do not know the topology. We propose new protocols that disseminate data faster and use fewer broadcasts than the simple broadcast protocol.
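To make the cost of a simple baseline concrete, the sketch below simulates naive flooding, in which every node rebroadcasts the message exactly once after first receiving it. Whether this matches the paper's "simple broadcast protocol" is an assumption, and the simulation idealizes the radio (no collisions or losses). The simulator sees the adjacency map, but each node acts only on message receipt, so no node needs to know the topology.

```python
from collections import deque


def flood(adjacency, source):
    """Simulate naive flooding from `source`.

    `adjacency` maps each node to the nodes within its radio range.
    Returns (hops, broadcasts): `hops` gives, per reached node, the
    number of hops at which the message first arrived; `broadcasts`
    is the total number of transmissions.
    """
    hops = {source: 0}
    broadcasts = 0
    pending = deque([source])
    while pending:
        node = pending.popleft()
        broadcasts += 1  # this node transmits once to all its neighbours
        for neighbour in adjacency[node]:
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                pending.append(neighbour)
    return hops, broadcasts


# Hypothetical four-node chain: 0 - 1 - 2 - 3.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
hops, broadcasts = flood(chain, source=0)
```

In the chain, node 3 still transmits even though no neighbour remains uninformed, so flooding spends four broadcasts where three would suffice. That kind of redundancy is what a more careful protocol can try to remove.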

Mining protein structure data

Relevance: 30.00%

Abstract:

The principal topic of this work is the application of data mining techniques, in particular machine learning, to the discovery of knowledge in a protein database. The first chapter presents a general background. Section 1.1 gives an overview of the methodology of a Data Mining project and its main algorithms. Section 1.2 introduces proteins and their supporting file formats. The chapter concludes with section 1.3, which defines the main problem we intend to address in this work: determining whether an amino acid is exposed or buried in a protein, in a discrete (i.e., not continuous) way, for five exposure levels: 2%, 10%, 20%, 25% and 30%. The second chapter, following the CRISP-DM methodology closely, presents the whole process of constructing the database that supported this work. It describes the process of loading data from the Protein Data Bank, DSSP and SCOP. An initial data exploration is then performed, and a simple prediction model (baseline) of the relative solvent accessibility of an amino acid is introduced. The Data Mining Table Creator, a program developed to produce the data mining tables required for this problem, is also introduced. In the third chapter, the results obtained are analyzed with statistical significance tests. First, the classifiers used (Neural Networks, C5.0, CART and CHAID) are compared, and it is concluded that C5.0 is the most suitable for the problem at hand. The influence of parameters such as the amino acid information level, the amino acid window size and the SCOP class type on the accuracy of the predictive models is also compared. The fourth chapter starts with a brief review of the literature on amino acid relative solvent accessibility. We then summarize the main results achieved and finally discuss possible future work. The fifth and last chapter consists of appendices. Appendix A contains the schema of the database that supported this thesis.
Appendix B contains a set of tables with additional information. Appendix C describes the software provided on the DVD accompanying this thesis, which allows the present work to be reconstructed.
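The discrete exposed/buried target described above can be sketched as a thresholding step on relative solvent accessibility. The five thresholds come from the abstract. Everything else is an assumption made for illustration: the function names, the use of "greater than or equal" at the boundary, and the example surface-area values, which are not real reference values for any residue.

```python
EXPOSURE_THRESHOLDS = (2, 10, 20, 25, 30)  # percent, as listed in the abstract


def relative_accessibility(asa, max_asa):
    """Relative solvent accessibility in percent.

    `asa` is the residue's accessible surface area in the structure;
    `max_asa` is a reference maximum for that residue type (supplied by
    the caller, no table of reference values is assumed here).
    """
    if max_asa <= 0:
        raise ValueError("max_asa must be positive")
    return 100.0 * asa / max_asa


def exposure_labels(rsa_percent, thresholds=EXPOSURE_THRESHOLDS):
    """Binary label per threshold: 'exposed' if RSA >= threshold."""
    return {
        t: ("exposed" if rsa_percent >= t else "buried")
        for t in thresholds
    }


# Hypothetical residue: 30 units of exposed area out of a maximum of 200.
rsa = relative_accessibility(30.0, 200.0)
labels = exposure_labels(rsa)
```

The same residue can be labelled exposed at the low thresholds and buried at the high ones, which is why each threshold defines its own binary classification problem.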

Scalable data acquisition for densely instrumented cyber-physical systems

Relevance: 30.00%

Abstract:

Consider the problem of designing an algorithm for acquiring sensor readings. Consider specifically the problem of obtaining an approximate representation of sensor readings where (i) sensor readings originate from different sensor nodes, (ii) the number of sensor nodes is very large, (iii) all sensor nodes are deployed in a small area (dense network) and (iv) all sensor nodes communicate over a communication medium where at most one node can transmit at a time (a single broadcast domain). We present an efficient algorithm for this problem, and our novel algorithm has two desired properties: (i) it obtains an interpolation based on all sensor readings and (ii) it is scalable, that is, its time-complexity is independent of the number of sensor nodes. Achieving these two properties is possible thanks to the close interlinking of the information processing algorithm, the communication system and a model of the physical world.

Energy efficient scheduling for cluster-tree wireless sensor networks with time-bounded data flows: application to IEEE 802.15.4/ZigBee

Relevance: 30.00%

Abstract:

Cluster scheduling and collision avoidance are crucial issues in large-scale cluster-tree Wireless Sensor Networks (WSNs). The paper presents a methodology that provides a Time Division Cluster Scheduling (TDCS) mechanism based on the cyclic extension of the RCPS/TC (Resource Constrained Project Scheduling with Temporal Constraints) problem for a cluster-tree WSN, assuming bounded communication errors. The objective is to meet all end-to-end deadlines of a predefined set of time-bounded data flows while minimizing the energy consumption of the nodes by setting the TDCS period as long as possible. Since each cluster is active only once during the period, the end-to-end delay of a given flow may span several periods when there are flows in opposite directions. The scheduling tool enables system designers to efficiently configure all required parameters of IEEE 802.15.4/ZigBee beacon-enabled cluster-tree WSNs at network design time. The performance evaluation of the scheduling tool shows that problems with dozens of nodes can be solved using optimal solvers.
