848 results for Training data
Abstract:
Automatic generation of classification rules has been an increasingly popular technique in commercial applications such as Big Data analytics, rule based expert systems and decision making systems. However, a principal problem that arises with most methods for generation of classification rules is the overfitting of training data. When Big Data is dealt with, this may result in the generation of a large number of complex rules. This may not only increase computational cost but also lower the accuracy in predicting further unseen instances. This has led to the necessity of developing pruning methods for the simplification of rules. In addition, classification rules are used further to make predictions after the completion of their generation. As far as efficiency is concerned, it is expected to find the first rule that fires as soon as possible by searching through a rule set. Thus a suitable structure is required to represent the rule set effectively. In this chapter, the authors introduce a unified framework for construction of rule based classification systems consisting of three operations on Big Data: rule generation, rule simplification and rule representation. The authors also review some existing methods and techniques used for each of the three operations and highlight their limitations. They introduce some novel methods and techniques developed by them recently. These methods and techniques are also discussed in comparison to existing ones with respect to efficient processing of Big Data.
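The idea of "finding the first rule that fires" can be illustrated with a minimal sketch. It assumes the simplest possible representation, an ordered list scanned linearly, which is only a baseline and not the structure proposed in the chapter. The rule conditions, attribute names and the `first_firing_rule` helper are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Instance = Dict[str, Any]
# A rule is a (condition, class label) pair; this layout is an assumption.
Rule = Tuple[Callable[[Instance], bool], str]


def first_firing_rule(rules: List[Rule], instance: Instance,
                      default: Optional[str] = None) -> Optional[str]:
    """Scan the rules in order and return the label of the first one that fires."""
    for condition, label in rules:
        if condition(instance):
            return label  # stop at the first match; later rules are never tested
    return default  # no rule fired


# Hypothetical rule set; the last rule acts as a catch-all default.
rules: List[Rule] = [
    (lambda x: x["outlook"] == "sunny" and x["humidity"] > 75, "no"),
    (lambda x: x["outlook"] == "rainy" and x["windy"], "no"),
    (lambda x: True, "yes"),
]

prediction = first_firing_rule(
    rules, {"outlook": "sunny", "humidity": 80, "windy": False}
)
```

With a linear list the search cost grows with the number of rules tested before one fires, which is why the choice of rule representation matters for large rule sets.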
Abstract:
Single-carrier (SC) block transmission with frequency-domain equalisation (FDE) offers a viable transmission technology for combating the adverse effects of long dispersive channels encountered in high-rate broadband wireless communication systems. However, for high bandwidth-efficiency and high power-efficiency systems, the channel can generally be modelled by the Hammerstein system that includes the nonlinear distortion effects of the high power amplifier (HPA) at the transmitter. For such nonlinear Hammerstein channels, the standard SC-FDE scheme no longer works. This paper advocates a complex-valued (CV) B-spline neural network based nonlinear SC-FDE scheme for Hammerstein channels. Specifically, we model the nonlinear HPA, which represents the CV static nonlinearity of the Hammerstein channel, by a CV B-spline neural network, and we develop two efficient alternating least squares schemes for estimating the parameters of the Hammerstein channel, including both the channel impulse response coefficients and the parameters of the CV B-spline model. We also use another CV B-spline neural network to model the inversion of the nonlinear HPA, and the parameters of this inverting B-spline model can easily be estimated using the standard least squares algorithm based on the pseudo training data obtained as a natural byproduct of the Hammerstein channel identification. Equalisation of the SC Hammerstein channel can then be accomplished by the usual one-tap linear equalisation in frequency domain as well as the inverse B-spline neural network model obtained in time domain. Extensive simulation results are included to demonstrate the effectiveness of our nonlinear SC-FDE scheme for Hammerstein channels.
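The "usual one-tap linear equalisation in frequency domain" step can be sketched as follows. This is a simplified illustration and not the paper's scheme: it assumes a purely linear, noiseless channel with a known impulse response, a cyclic prefix already removed (so the channel acts as a circular convolution), and zero-forcing taps. The HPA nonlinearity and its B-spline inverse are left out, and all function names are hypothetical.

```python
import cmath
from typing import List, Sequence


def dft(x: Sequence[complex], inverse: bool = False) -> List[complex]:
    """Naive O(N^2) discrete Fourier transform (inverse is scaled by 1/N)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def circular_convolve(x: Sequence[complex], h: Sequence[complex]) -> List[complex]:
    """Channel model for one block after cyclic-prefix removal."""
    n = len(x)
    return [sum(h[l] * x[(t - l) % n] for l in range(len(h))) for t in range(n)]


def one_tap_fde(received: Sequence[complex], cir: Sequence[complex]) -> List[complex]:
    """Zero-forcing FDE: divide each frequency bin by the channel response."""
    n = len(received)
    padded = list(cir) + [0.0] * (n - len(cir))
    channel_freq = dft(padded)
    received_freq = dft(received)
    equalised_freq = [y / h for y, h in zip(received_freq, channel_freq)]
    return dft(equalised_freq, inverse=True)  # back to the time domain


# Hypothetical 8-symbol QPSK block and 3-tap channel impulse response.
block = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j, 1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
cir = [1.0, 0.5, 0.2]
received = circular_convolve(block, cir)
recovered = one_tap_fde(received, cir)
```

Each frequency bin is equalised by a single complex division, which is what makes the scheme "one-tap". With noise present, a minimum mean square error tap would normally replace the plain division.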
Abstract:
A practical orthogonal frequency-division multiplexing (OFDM) system can generally be modelled by the Hammerstein system that includes the nonlinear distortion effects of the high power amplifier (HPA) at transmitter. In this contribution, we advocate a novel nonlinear equalization scheme for OFDM Hammerstein systems. We model the nonlinear HPA, which represents the static nonlinearity of the OFDM Hammerstein channel, by a B-spline neural network, and we develop a highly effective alternating least squares algorithm for estimating the parameters of the OFDM Hammerstein channel, including channel impulse response coefficients and the parameters of the B-spline model. Moreover, we also use another B-spline neural network to model the inversion of the HPA’s nonlinearity, and the parameters of this inverting B-spline model can easily be estimated using the standard least squares algorithm based on the pseudo training data obtained as a byproduct of the Hammerstein channel identification. Equalization of the OFDM Hammerstein channel can then be accomplished by the usual one-tap linear equalization as well as the inverse B-spline neural network model obtained. The effectiveness of our nonlinear equalization scheme for OFDM Hammerstein channels is demonstrated by simulation results.
Abstract:
A practical single-carrier (SC) block transmission with frequency domain equalisation (FDE) system can generally be modelled by the Hammerstein system that includes the nonlinear distortion effects of the high power amplifier (HPA) at the transmitter. For such Hammerstein channels, the standard SC-FDE scheme no longer works. We propose a novel B-spline neural network based nonlinear SC-FDE scheme for Hammerstein channels. In particular, we model the nonlinear HPA, which represents the complex-valued static nonlinearity of the Hammerstein channel, by two real-valued B-spline neural networks, one for modelling the nonlinear amplitude response of the HPA and the other for the nonlinear phase response of the HPA. We then develop an efficient alternating least squares algorithm for estimating the parameters of the Hammerstein channel, including the channel impulse response coefficients and the parameters of the two B-spline models. Moreover, we also use another real-valued B-spline neural network to model the inversion of the HPA’s nonlinear amplitude response, and the parameters of this inverting B-spline model can be estimated using the standard least squares algorithm based on the pseudo training data obtained as a byproduct of the Hammerstein channel identification. Equalisation of the SC Hammerstein channel can then be accomplished by the usual one-tap linear equalisation in frequency domain as well as the inverse B-spline neural network model obtained in time domain. The effectiveness of our nonlinear SC-FDE scheme for Hammerstein channels is demonstrated in a simulation study.
Abstract:
The induction of classification rules from previously unseen examples is one of the most important data mining tasks in science as well as commercial applications. In order to reduce the influence of noise in the data, ensemble learners are often applied. However, most ensemble learners are based on decision tree classifiers which are affected by noise. The Random Prism classifier has recently been proposed as an alternative to the popular Random Forests classifier, which is based on decision trees. Random Prism is based on the Prism family of algorithms, which is more robust to noise. However, like most ensemble classification approaches, Random Prism also does not scale well on large training data. This paper presents a thorough discussion of Random Prism and a recently proposed parallel version of it called Parallel Random Prism. Parallel Random Prism is based on the MapReduce programming paradigm. The paper provides, for the first time, novel theoretical analysis of the proposed technique and in-depth experimental study that show that Parallel Random Prism scales well on a large number of training examples, a large number of data features and a large number of processors. Expressiveness of decision rules that our technique produces makes it a natural choice for Big Data applications where informed decision making increases the user’s trust in the system.
Abstract:
High bandwidth-efficiency quadrature amplitude modulation (QAM) signaling widely adopted in high-rate communication systems suffers from a drawback of high peak-to-average power ratio, which may cause the nonlinear saturation of the high power amplifier (HPA) at the transmitter. Thus, practical high-throughput QAM communication systems exhibit nonlinear and dispersive channel characteristics that must be modeled as a Hammerstein channel. Standard linear equalization becomes inadequate for such Hammerstein communication systems. In this paper, we advocate an adaptive B-spline neural network based nonlinear equalizer. Specifically, during the training phase, an efficient alternating least squares (LS) scheme is employed to estimate the parameters of the Hammerstein channel, including both the channel impulse response (CIR) coefficients and the parameters of the B-spline neural network that models the HPA’s nonlinearity. In addition, another B-spline neural network is used to model the inversion of the nonlinear HPA, and the parameters of this inverting B-spline model can easily be estimated using the standard LS algorithm based on the pseudo training data obtained as a natural byproduct of the Hammerstein channel identification. Nonlinear equalization of the Hammerstein channel is then accomplished by the linear equalization based on the estimated CIR as well as the inverse B-spline neural network model. Furthermore, during the data communication phase, the decision-directed LS channel estimation is adopted to track the time-varying CIR. Extensive simulation results demonstrate the effectiveness of our proposed B-spline neural network based nonlinear equalization scheme.
Abstract:
This paper presents a new framework for generating triangular meshes from textured color images. The proposed framework combines a texture classification technique, called W-operator, with Imesh, a method originally conceived to generate simplicial meshes from gray scale images. An extension of W-operators to handle textured color images is proposed, which employs a combination of RGB and HSV channels and Sequential Floating Forward Search guided by a mean conditional entropy criterion to extract features from the training data. The W-operator is built into the local error estimation used by Imesh to choose the mesh vertices. Furthermore, the W-operator also makes it possible to assign a label to the triangles during mesh construction, so that a segmented mesh is obtained at the end of the process. The presented results show that the combination of W-operators with Imesh gives rise to a texture classification-based triangle mesh generation framework that outperforms pixel based methods. Crown Copyright (C) 2009 Published by Elsevier Inc. All rights reserved.
Abstract:
The objective of this study is to develop a Pollution Early Warning System (PEWS) for efficient management of water quality in oyster harvesting areas. To that end, this paper presents a web-enabled, user-friendly PEWS for managing water quality in oyster harvesting areas along Louisiana Gulf Coast, USA. The PEWS consists of (1) an Integrated Space-Ground Sensing System (ISGSS) gathering data for environmental factors influencing water quality, (2) an Artificial Neural Network (ANN) model for predicting the level of fecal coliform bacteria, and (3) a web-enabled, user-friendly Geographic Information System (GIS) platform for issuing water pollution advisories and managing oyster harvesting waters. The ISGSS (data acquisition system) collects near real-time environmental data from various sources, including NASA MODIS Terra and Aqua satellites and in-situ sensing stations managed by the USGS and the NOAA. The ANN model is developed using the ANN program in MATLAB Toolbox. The ANN model involves a total of 6 independent environmental variables, including rainfall, tide, wind, salinity, temperature, and weather type along with 8 different combinations of the independent variables. The ANN model is constructed and tested using environmental and bacteriological data collected monthly from 2001 – 2011 by Louisiana Molluscan Shellfish Program at seven oyster harvesting areas in Louisiana Coast, USA. The ANN model is capable of explaining about 76% of variation in fecal coliform levels for model training data and 44% for independent data. The web-based GIS platform is developed using ArcView GIS and ArcIMS. The web-based GIS system can be employed for mapping fecal coliform levels, predicted by the ANN model, and potential risks of norovirus outbreaks in oyster harvesting waters. The PEWS is able to inform decision-makers of potential risks of fecal pollution and virus outbreak on a daily basis, greatly reducing the risk of contaminated oysters to human health.
Abstract:
The continuing development of new materials makes systems lighter and stronger, permitting more complex systems that provide more functionality and flexibility and that demand a more effective evaluation of their structural health. Smart material technology has become an area of increasing interest in this field. The combination of smart materials and artificial neural networks can be used as an excellent tool for pattern recognition, making them suitable for monitoring and fault classification of equipment and structures. In order to identify the fault, the neural network must be trained using a set of solutions to its corresponding forward variational problem. After the training process, the net can successfully solve the inverse variational problem in the context of monitoring and fault detection because of its pattern recognition and interpolation capabilities. The structural frequency response function (FRF) is a fundamental part of structural dynamic analysis, and it can be extracted from measured electric impedance through the electromechanical interaction of a piezoceramic and a structure. In this paper we use the FRF obtained by a mathematical model (FEM) in order to generate the training data for the neural networks, and the identification of damage can be done by measuring electric impedance, since suitable data normalization correlates FRF and electrical impedance.
Abstract:
Table tennis involves numerous dynamic actions that can lead to sports injuries, so it is important to understand the factors inherent to trauma in athletes in order to formulate preventive models later. The aim was to explore risk factors for sports injuries in table tennis players. To this end, 111 athletes of both genders participating in the Campeonato Paulista de Tênis de Mesa, with a mean age of 22.39 ± 8.88 years, were interviewed; they were recruited at random and classified into two competitive levels: regional/state and national/international. The Reported Morbidity Survey (Inquérito de Morbidade Referida), adapted to the characteristics of table tennis, was used to gather personal, training and sports injury data. A rate of 0.51 injuries per athlete was observed, and national/international-level athletes had higher injury rates (52.94%) than state/regional-level athletes (48.84%). For the sport-specific movement, the upper limbs (93.62%) and the trunk (87.5%) were the most affected sites. At both levels, training was the most frequently reported moment of injury occurrence. It is concluded that national/international-level athletes have higher injury rates and that the sport-specific movement is the main cause of injuries, affecting mainly the upper limbs and the trunk and occurring most often during training.
Abstract:
Pathogenic variation in Colletotrichum gloeosporioides infecting species of the tropical pasture legume Stylosanthes at its center of diversity was determined from 296 isolates collected from wild host populations and selected germ plasm of S. capitata, S. guianensis, S. scabra, and S. macrocephala in Brazil. A putative host differential set comprising 11 accessions was selected from a bioassay of 18 isolates on 19 host accessions using principal component analysis. A similar analysis of anthracnose severity data for a subset of 195 isolates on the 11 differentials indicated that an adequate summary of pathogenic variation could be obtained using only five of these differentials. Of the five differentials, S. seabrana 'Primar' was resistant and S. scabra 'Fitzroy' was susceptible to most isolates. A cluster analysis was used to determine eight natural race clusters using the 195 isolates. Linear discriminant functions were developed for eight race clusters using the 195 isolates as the training data set, and these were applied to classify a test data set of the remaining 101 isolates. All except 11 isolates of the test data set were classified into one of the eight race clusters. Over 10% of the 296 isolates were weakly pathogenic to all five differentials and another 40% were virulent on just one differential. The unclassified isolates represent six new races with unique virulence combinations, of which one isolate is virulent on all five differentials. The majority of isolates came from six field sites, and Shannon's index of diversity indicated considerable variation between sites. Pathogenic diversity was extensive at three sites where selected germ plasm were under evaluation, and complex race clusters and unclassified isolates representing new races were more prevalent at these sites compared with sites containing wild Stylosanthes populations.
Abstract:
The variable-rate application of fluid fertilizers implemented by classical feedback techniques and conventional equipment can be inefficient or unstable. This paper proposes an open-loop control system based on a multilayer perceptron artificial neural network for the identification and control of the fertilizer flow rate. The network is trained with the Levenberg-Marquardt algorithm using training data obtained from measurements. Preliminary results indicate a fast, stable and low cost control system for precision farming. Copyright (C) 2000 IFAC.
Abstract:
Background: The participation of children and adolescents in sports, including basketball, is becoming increasingly common, and this increased involvement raises concerns about the potential risk of sports injuries. Objective. To analyze the occurrence of sports injuries among young basketball players according to their position on the court and to associate these injuries with risk factors. Method. A retrospective, epidemiological study. A sample consisting of 204 basketball players with a mean age of 14.33 ± 1.19 years participated in the study. The players were interviewed using a reported condition questionnaire containing anthropometric and training data as well as information on injuries during the previous 12 months. Results: The frequency of injury was highest among the shooting guards (47.8%), followed by the centers (34.8%) and point guards (17.4%). Among the 204 participants, 40 players reported a total of 46 injuries, representing 0.22 injuries per participant and 1.15 injuries per injured participant. For the shooting guards and centers, statistically significant differences between injured and non-injured players were found related to age, weight, height, length of time in training and number of weekly practice hours (p < 0.05). For point guards, a statistically significant difference between injured and non-injured players was found based on weight alone (p < 0.05). Conclusion: The occurrence of injuries among basketball players was low. Injuries were associated with both intrinsic and extrinsic factors among shooting guards and centers, whereas injuries were only associated with weight among point guards. © 2013 Vanderlei et al; licensee BioMed Central Ltd.
Abstract:
This paper proposes a rank aggregation framework for video multimodal geocoding. Textual and visual descriptions associated with videos are used to define ranked lists. These ranked lists are later combined, and the resulting ranked list is used to define appropriate locations for videos. An architecture that implements the proposed framework is designed. In this architecture, there are specific modules for each modality (e.g., textual and visual) that can be developed and evolved independently. Another component is a data fusion module responsible for seamlessly combining the ranked lists defined for each modality. We have validated the proposed framework in the context of the MediaEval 2012 Placing Task, whose objective is to automatically assign geographical coordinates to videos. The results obtained show how our multimodal approach improves the geocoding results when compared to methods that rely on a single modality (either textual or visual descriptors). We also show that the proposed multimodal approach yields results comparable to the best submissions to the Placing Task in 2012 using no extra information besides the available development/training data. Another contribution of this work is the proposal of a new effectiveness evaluation measure. The proposed measure is based on distance scores that summarize how effective a designed/tested approach is, considering its overall result for a test dataset. © 2013 Springer Science+Business Media New York.
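Combining per-modality ranked lists into a single list can be sketched with a Borda count. The abstract does not name the fusion rule used by the authors, so Borda counting is an assumption chosen only for illustration; the candidate locations and the `borda_aggregate` helper are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def borda_aggregate(ranked_lists: Sequence[Sequence[str]]) -> List[str]:
    """Fuse ranked lists: an item at position p in a list of length n earns n - p points."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] += n - position
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical candidate locations ranked by two independent modality modules.
textual_ranking = ["Paris", "Lyon", "Rome"]
visual_ranking = ["Rome", "Paris", "Madrid"]
fused = borda_aggregate([textual_ranking, visual_ranking])
```

Because the fusion step only consumes ranked lists, each modality module can be changed without touching the others, which matches the modular architecture described in the abstract.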
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)