852 results for Initial data problem
Abstract:
Graduate Program in Production Engineering - FEG
Abstract:
Master's in Electrical Engineering – Electrical Power Systems
Abstract:
The objective of the study was to compare information collected through face-to-face interviews conducted at baseline and six years later in a city in Southeastern Brazil. In 1998, 32 mothers of children aged 20 to 30 months answered a face-to-face interview with structured questions regarding their children's brushing habits. Six years later, the same interview was repeated with the same mothers. The two interviews were compared for overall agreement, kappa and weighted kappa. Overall agreement between the interviews varied from 41% to 96%. Kappa values ranged from 0.00 to 0.65 (very bad to good) without any significant differences. The results showed a lack of agreement when the same interview is conducted six years later, indicating that recall bias can be a methodological problem in interviews.
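As an illustration of the agreement statistics named in this abstract, the following is a minimal sketch of overall agreement and unweighted Cohen's kappa for two rounds of answers. The function names and the example answers are hypothetical and are not taken from the study; weighted kappa is not covered.

```python
from collections import Counter


def overall_agreement(first, second):
    """Fraction of paired answers that are identical in both rounds."""
    if len(first) != len(second) or not first:
        raise ValueError("both rounds need the same, non-zero number of answers")
    return sum(a == b for a, b in zip(first, second)) / len(first)


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(first)
    p_observed = overall_agreement(first, second)
    counts_first = Counter(first)
    counts_second = Counter(second)
    # Chance agreement: product of the marginal proportions, summed over categories.
    p_expected = sum(counts_first[c] * counts_second[c] for c in counts_first) / (n * n)
    if p_expected == 1.0:
        return float("nan")  # kappa is undefined when both rounds use one identical category
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical answers from four respondents in the two rounds (not study data).
round_1 = ["yes", "yes", "no", "no"]
round_2 = ["yes", "no", "no", "no"]
print(overall_agreement(round_1, round_2))  # 0.75
print(cohen_kappa(round_1, round_2))        # 0.5
```

In this toy case the observed agreement is 3/4 and the chance agreement is 1/2, which gives a kappa of 0.5.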
Abstract:
Dissertation presented to the Escola Superior de Educação de Lisboa to obtain the degree of Master in Arts Education, with a specialization in Visual Arts in Education
Abstract:
Chrysonilia sitophila is a common mould in the cork industry and has been identified as a cause of IgE sensitization and occupational asthma. This fungal species has a fast growth rate that may inhibit the growth of other species, leading to underestimated data when characterizing occupational fungal exposure. Aiming to ascertain occupational exposure to fungi in the cork industry, papers from 2000 were analyzed to identify the best air sampling method for quantifying and identifying all airborne culturable fungi, not only the fast-growing ones. The impaction method does not allow the collection of a representative air volume because, even with media that restrict colony growth, counting the colonies is very difficult in environments with a high fungal load, such as the cork industry. By contrast, the impinger method permits the collection of a representative air volume, since the collected volume can be diluted. Besides culture methods, which allow fungal identification through macro- and micro-morphology, growth features, thermotolerance and ecological data, molecular biology can be applied with the impinger method to detect non-viable particles and potential mycotoxin-producing strains, and mycotoxins can be detected with ELISA or HPLC. Selecting the best air sampling method for each setting is crucial for characterizing occupational exposure to fungi. Information about the prevalent fungal species in each setting and the possible fungal load is needed for a judicious selection.
Abstract:
Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches having focused on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), where a set of latent variables indicates the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data, drawn from official statistics, shows its usefulness.
Abstract:
Cluster analysis for categorical data has been an active area of research. A well-known problem in this area is the determination of the number of clusters, which is unknown and must be inferred from the data. In order to estimate the number of clusters, one often resorts to information criteria, such as BIC (Bayesian information criterion), MML (minimum message length, proposed by Wallace and Boulton, 1968), and ICL (integrated classification likelihood). In this work, we adopt the approach developed by Figueiredo and Jain (2002) for clustering continuous data. They use an MML criterion to select the number of clusters and a variant of the EM algorithm to estimate the model parameters. This EM variant seamlessly integrates model estimation and selection in a single algorithm. For clustering categorical data, we assume a finite mixture of multinomial distributions and implement a new EM algorithm, following a previous version (Silvestre et al., 2008). Results obtained with synthetic datasets are encouraging. The main advantage of the proposed approach, when compared to the criteria mentioned above, is the speed of execution, which is especially relevant when dealing with large data sets.
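To make the underlying model concrete, here is a minimal sketch of plain EM for a finite mixture of categorical (single-trial multinomial) distributions with a fixed number of components. It is not the MML-based variant described in the abstract: the number of clusters `k` is supplied by the caller, and the function name, the smoothing constant and the random initialisation are all assumptions made for illustration.

```python
import math
import random


def em_categorical_mixture(data, n_categories, k, n_iter=50, seed=0):
    """Plain EM for a mixture of independent categorical features.

    data: list of rows, each a tuple of category indices (one per feature).
    n_categories: number of categories of each feature.
    k: fixed number of mixture components (an assumption of this sketch).
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    eps = 1e-6  # smoothing constant that keeps every probability strictly positive
    weights = [1.0 / k] * k
    theta = []  # theta[c][j][v] = P(feature j takes value v | component c)
    for _ in range(k):
        component = []
        for j in range(d):
            raw = [rng.random() + 0.5 for _ in range(n_categories[j])]
            total = sum(raw)
            component.append([r / total for r in raw])
        theta.append(component)

    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each row.
        resp = []
        for row in data:
            log_p = [
                math.log(weights[c]) + sum(math.log(theta[c][j][row[j]]) for j in range(d))
                for c in range(k)
            ]
            top = max(log_p)
            p = [math.exp(v - top) for v in log_p]
            total = sum(p)
            resp.append([v / total for v in p])
        # M-step: re-estimate mixing weights and category probabilities.
        for c in range(k):
            n_c = sum(r[c] for r in resp)
            weights[c] = (n_c + eps) / (n + k * eps)
            for j in range(d):
                counts = [eps] * n_categories[j]
                for row, r in zip(data, resp):
                    counts[row[j]] += r[c]
                total = sum(counts)
                theta[c][j] = [v / total for v in counts]
    return weights, theta, resp


# Hypothetical toy data: two binary features.
rows = [(0, 0), (0, 0), (1, 1), (1, 1), (0, 1)]
weights, theta, resp = em_categorical_mixture(rows, [2, 2], k=2)
print(weights)
```

A criterion-based approach would run a routine like this for several values of `k` and compare the resulting scores; the approach described in the abstract instead folds model selection into the EM iterations.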
Abstract:
Consider the problem of disseminating data from an arbitrary source node to all other nodes in a distributed computer system, like Wireless Sensor Networks (WSNs). We assume that wireless broadcast is used and nodes do not know the topology. We propose new protocols which disseminate data faster and use fewer broadcasts than the simple broadcast protocol.
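For reference, a baseline of the kind this abstract compares against can be sketched as plain flooding, where every node rebroadcasts the data once after first receiving it. The abstract does not define its "simple broadcast protocol", so treating it as flooding is an assumption, and the adjacency map and function name below are hypothetical. The adjacency map is used only by the simulator; the nodes themselves need no topology knowledge.

```python
from collections import deque


def flood(adjacency, source):
    """Simulate flooding: each informed node broadcasts exactly once.

    adjacency: dict mapping a node to the nodes within its radio range.
    Returns the set of informed nodes and the total number of broadcasts.
    """
    informed = {source}
    queue = deque([source])
    broadcasts = 0
    while queue:
        node = queue.popleft()
        broadcasts += 1  # one wireless broadcast reaches every neighbour at once
        for neighbour in adjacency[node]:
            if neighbour not in informed:
                informed.add(neighbour)
                queue.append(neighbour)
    return informed, broadcasts


# Hypothetical four-node line topology: 0 - 1 - 2 - 3.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(flood(line, 0))  # ({0, 1, 2, 3}, 4)
```

Flooding uses one broadcast per reached node, including nodes whose broadcast informs nobody new (node 3 above), which is the kind of overhead a more careful protocol can try to avoid.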
Abstract:
Consider the problem of designing an algorithm for acquiring sensor readings. Consider specifically the problem of obtaining an approximate representation of sensor readings where (i) sensor readings originate from different sensor nodes, (ii) the number of sensor nodes is very large, (iii) all sensor nodes are deployed in a small area (dense network) and (iv) all sensor nodes communicate over a communication medium where at most one node can transmit at a time (a single broadcast domain). We present an efficient algorithm for this problem, and our novel algorithm has two desired properties: (i) it obtains an interpolation based on all sensor readings and (ii) it is scalable, that is, its time-complexity is independent of the number of sensor nodes. Achieving these two properties is possible thanks to the close interlinking of the information processing algorithm, the communication system and a model of the physical world.
Abstract:
Cluster scheduling and collision avoidance are crucial issues in large-scale cluster-tree Wireless Sensor Networks (WSNs). The paper presents a methodology that provides a Time Division Cluster Scheduling (TDCS) mechanism based on the cyclic extension of the RCPS/TC (Resource Constrained Project Scheduling with Temporal Constraints) problem for a cluster-tree WSN, assuming bounded communication errors. The objective is to meet all end-to-end deadlines of a predefined set of time-bounded data flows while minimizing the energy consumption of the nodes by setting the TDCS period as long as possible. Since each cluster is active only once during the period, the end-to-end delay of a given flow may span several periods when there are flows in opposite directions. The scheduling tool enables system designers to efficiently configure all required parameters of IEEE 802.15.4/ZigBee beacon-enabled cluster-tree WSNs at network design time. The performance evaluation of the scheduling tool shows that problems with dozens of nodes can be solved using optimal solvers.
Abstract:
Doctoral Thesis in Information Systems and Technologies, Area of Engineering and Management Information Systems
Abstract:
Dissertation presented at the Faculty of Science and Technology of the New University of Lisbon in fulfillment of the requirements for the Master's degree in Electrical Engineering and Computers
Abstract:
The use of remote labs in undergraduate courses has been reported in the literature several times since the mid-1990s. Nevertheless, very few articles present results about the corresponding learning gains obtained by students, or the conditions under which those systems can be more efficient, which suggests a lack of data concerning their pedagogical effectiveness. This paper addresses this gap by presenting some initial findings concerning the use of a remote lab (VISIR) in a large undergraduate course on Physics, with over 550 students enrolled.
Abstract:
Feature selection is a central problem in machine learning and pattern recognition. On large datasets (in terms of dimension and/or number of instances), using search-based or wrapper techniques can be computationally prohibitive. Moreover, many filter methods based on relevance/redundancy assessment also take a prohibitively long time on high-dimensional datasets. In this paper, we propose efficient unsupervised and supervised feature selection/ranking filters for high-dimensional datasets. These methods use low-complexity relevance and redundancy criteria, applicable to supervised, semi-supervised, and unsupervised learning, and can act as pre-processors that let computationally intensive methods focus their attention on smaller subsets of promising features. The experimental results, with up to 10^5 features, show the time efficiency of our methods, with lower generalization error than state-of-the-art techniques, while being dramatically simpler and faster.
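The general shape of a low-complexity relevance/redundancy filter can be sketched as follows. This is an illustrative unsupervised filter, not the criteria used in the paper: variance as the relevance score, absolute Pearson correlation with the most recently kept feature as the redundancy score, and all names and thresholds are assumptions.

```python
import math


def variance(column):
    """Population variance, used here as an unsupervised relevance score."""
    mean = sum(column) / len(column)
    return sum((x - mean) ** 2 for x in column) / len(column)


def abs_correlation(a, b):
    """Absolute Pearson correlation, used here as a redundancy score."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a constant feature is treated as uncorrelated
    return abs(cov / (norm_a * norm_b))


def select_features(rows, max_features, max_redundancy=0.9):
    """Rank features by relevance, then skip those redundant with the last kept one."""
    columns = list(zip(*rows))
    ranked = sorted(range(len(columns)), key=lambda j: variance(columns[j]), reverse=True)
    kept = []
    for j in ranked:
        if len(kept) == max_features:
            break
        if kept and abs_correlation(columns[j], columns[kept[-1]]) > max_redundancy:
            continue  # too similar to the previously kept feature
        kept.append(j)
    return kept


# Hypothetical data: feature 1 is exactly twice feature 0, feature 2 is unrelated.
rows = [(1, 2, 0.1), (2, 4, 0.0), (3, 6, 0.1), (4, 8, 0.0)]
print(select_features(rows, max_features=2))  # [1, 2]
```

Comparing each candidate only with the last kept feature keeps the cost at one sort plus a single pass over the features, instead of evaluating all pairs, which is what makes a filter of this shape usable as a pre-processor on very wide datasets.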
Abstract:
Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies