42 results for data gathering algorithm
Abstract:
Future digital signal processing (DSP) systems must provide robustness at the algorithm and application levels against the reliability issues that accompany implementations in modern semiconductor process technologies. In this paper, we address this issue by investigating the impact of unreliable memories on general DSP systems. In particular, we propose a novel framework to characterize the effects of unreliable memories, which enables us to devise novel methods to mitigate the associated performance loss. We propose to deploy specifically designed data representations, which can substantially improve system reliability compared to the conventional data representations used in digital integrated circuits, such as 2's-complement or sign-magnitude number formats. To demonstrate the efficacy of the proposed framework, we analyze the impact of unreliable memories on coded communication systems, and we show that deploying optimized data representations substantially improves the error-rate performance of such systems.
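To make the representation effect concrete, the following minimal sketch (not the paper's actual framework) simulates independent bit flips in an unreliable memory and compares the average numeric damage when values are stored in 2's-complement versus sign-magnitude form; the word length, value range and flip probability are illustrative assumptions.

```python
import random

WORD = 8       # assumed word length; not specified in the abstract
P_FLIP = 0.01  # assumed per-bit error rate of the unreliable memory

def flip_bits(bits, p):
    """Independently flip each stored bit with probability p."""
    return [b ^ (random.random() < p) for b in bits]

def to_twos_complement(value, n=WORD):
    return [(value >> i) & 1 for i in range(n)]

def from_twos_complement(bits, n=WORD):
    v = sum(b << i for i, b in enumerate(bits))
    return v - (1 << n) if bits[n - 1] else v

def to_sign_magnitude(value, n=WORD):
    bits = [(abs(value) >> i) & 1 for i in range(n - 1)]
    return bits + [1 if value < 0 else 0]

def from_sign_magnitude(bits, n=WORD):
    mag = sum(b << i for i, b in enumerate(bits[:n - 1]))
    return -mag if bits[n - 1] else mag

# Average absolute error a stored value suffers under each number format.
random.seed(0)
trials = 100_000
err_tc = err_sm = 0.0
for _ in range(trials):
    x = random.randint(-100, 100)
    err_tc += abs(from_twos_complement(flip_bits(to_twos_complement(x), P_FLIP)) - x)
    err_sm += abs(from_sign_magnitude(flip_bits(to_sign_magnitude(x), P_FLIP)) - x)
print(f"two's complement: {err_tc / trials:.2f}, sign-magnitude: {err_sm / trials:.2f}")
```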
Abstract:
This paper presents a surrogate-model-based optimization of a doubly-fed induction generator (DFIG) machine winding design for maximizing power yield. Based on site-specific wind profile data and the machine's previous operational performance, the DFIG's stator and rotor windings are optimized for rewinding purposes, so that maximum efficiency coincides with the actual operating conditions. Particle swarm optimization (PSO)-based surrogate optimization techniques are used in conjunction with the finite element method (FEM) to optimize the machine design using the limited available information on the site-specific wind profile and generator operating conditions. A response surface method is developed within the surrogate model to formulate the design objectives and constraints. In addition, the machine tests and efficiency calculations follow IEEE Standard 112-B. Numerical and experimental results validate the effectiveness of the proposed techniques.
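As an illustration of the optimization loop, here is a generic PSO sketch minimizing a stand-in quadratic objective in place of the fitted response surface; in the paper the surrogate is built from FEM evaluations, and all hyperparameters below (swarm size, inertia, acceleration coefficients) are assumed values, not the authors'.

```python
import numpy as np

rng = np.random.default_rng(0)

def surrogate_loss(x):
    """Stand-in response-surface objective (the paper fits its own from FEM runs)."""
    return np.sum((x - 0.3) ** 2, axis=-1)

def pso(loss, dim=2, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5):
    pos = rng.uniform(-1, 1, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), loss(pos)
    gbest = pbest[np.argmin(pbest_val)]
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # Pull each particle towards its personal best and the swarm best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        val = loss(pos)
        better = val < pbest_val
        pbest[better], pbest_val[better] = pos[better], val[better]
        gbest = pbest[np.argmin(pbest_val)]
    return gbest, loss(gbest)

best, best_val = pso(surrogate_loss)
print(best, best_val)
```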
Abstract:
This paper addresses the estimation of the parameters of a Bayesian network from incomplete data. The task is usually tackled by running the Expectation-Maximization (EM) algorithm several times in order to obtain a high log-likelihood estimate. We argue that choosing the maximum log-likelihood estimate (as well as the maximum penalized log-likelihood and the maximum a posteriori estimate) has severe drawbacks, being affected both by overfitting and by model uncertainty. Two ideas are discussed to overcome these issues: a maximum entropy approach and a Bayesian model averaging approach. Both ideas can be easily applied on top of EM, while the entropy idea can also be implemented in a more sophisticated way, through a dedicated non-linear solver. An extensive set of experiments shows that these ideas produce significantly better estimates and inferences than the traditional and widely used maximum (penalized) log-likelihood and maximum a posteriori estimates. In particular, if EM is adopted as the optimization engine, the model averaging approach is the best performing one; its performance is matched by the entropy approach when implemented using the non-linear solver. The results suggest that these ideas are immediately applicable (they are easy to implement and to integrate into currently available inference engines) and that they constitute a better way to learn Bayesian network parameters.
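To illustrate the mechanics of averaging on top of EM, the toy sketch below estimates a Gaussian mean from data missing at random: EM is restarted from several initializations, and the estimates are combined with likelihood-proportional weights instead of keeping only the maximum-likelihood run. The Gaussian toy problem is an assumption chosen for brevity; the paper's setting is Bayesian network parameters, where the restarts genuinely land in different local optima.

```python
import numpy as np

rng = np.random.default_rng(1)

def run_em(data, init, iters=30):
    """EM for the mean of a unit-variance Gaussian with missing values:
    the E-step imputes missing entries with the current mean, the M-step
    re-estimates the mean from the completed data."""
    theta = init
    missing = np.isnan(data)
    for _ in range(iters):
        filled = np.where(missing, theta, data)
        theta = filled.mean()
    loglik = -0.5 * np.sum((data[~missing] - theta) ** 2)  # up to a constant
    return theta, loglik

data = rng.normal(0.7, 1.0, 200)
data[rng.random(200) < 0.3] = np.nan  # incomplete data

runs = [run_em(data, init) for init in rng.uniform(-2.0, 2.0, 10)]
thetas, logliks = map(np.array, zip(*runs))

theta_ml = thetas[np.argmax(logliks)]      # the usual "best EM run" choice
w = np.exp(logliks - logliks.max())        # likelihood-proportional weights
theta_bma = np.average(thetas, weights=w)  # model-averaged estimate
print(f"max-likelihood: {theta_ml:.3f}, model-averaged: {theta_bma:.3f}")
```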
Abstract:
In this paper, we propose a new learning approach to Web data annotation, in which a support vector machine-based multiclass classifier is trained to assign labels to data items. For data record extraction, a data section re-segmentation algorithm based on visual and content features is introduced to improve the performance of Web data record extraction. We have implemented the proposed approach and tested it with a large set of Web query result pages in different domains. Experimental results show that the proposed approach is highly effective and efficient.
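A minimal sketch of the multiclass labelling step using scikit-learn follows; the character TF-IDF features and the example items and labels are assumptions, since the paper's classifier also uses visual and content features not reproduced here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical extracted data items and their annotation labels.
items = ["$199.99", "Sony WH-1000XM5", "In stock", "4.5 out of 5 stars"]
labels = ["price", "product_name", "availability", "rating"]

# Character n-gram features are a cheap stand-in for the paper's feature set;
# LinearSVC handles multiclass via one-vs-rest by default.
clf = make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
                    LinearSVC())
clf.fit(items, labels)
print(clf.predict(["$24.50", "Out of stock"]))
```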
Abstract:
One of the main purposes of building a battery model is monitoring and control during battery charging/discharging, as well as estimating key battery quantities such as the state of charge for electric vehicles. However, models based on the electrochemical reactions within the batteries are highly complex and difficult to compute using conventional approaches. Radial basis function (RBF) neural networks have been widely used to model complex systems for estimation and control purposes, while the optimization of both the linear and non-linear parameters in the RBF model remains a key issue. A recently proposed meta-heuristic algorithm, Teaching-Learning-Based Optimization (TLBO), requires no preset algorithm-specific parameters and performs well in non-linear optimization. In this paper, a novel self-learning TLBO-based RBF model is proposed for modelling electric vehicle batteries using RBF neural networks. The modelling approach has been applied to two battery testing data sets and compared with other RBF-based battery models; the training and validation results confirm the efficacy of the proposed method.
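For context, the sketch below shows the RBF model's two parameter groups: the non-linear parameters (centres and width), which the paper tunes with TLBO, are fixed guesses here, while the linear output weights are fitted by least squares. The battery data are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(2)

def rbf_design(X, centres, width):
    """Gaussian RBF design matrix."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width ** 2))

# Hypothetical battery data: predict terminal voltage from (SoC, current).
X = rng.uniform(0, 1, (200, 2))
y = 3.2 + 0.8 * X[:, 0] - 0.1 * X[:, 1] + 0.01 * rng.standard_normal(200)

# Non-linear parameters: fixed guesses here; the paper optimizes these with TLBO.
centres = rng.uniform(0, 1, (10, 2))
width = 0.3

# Linear parameters: ordinary least squares on the design matrix.
Phi = rbf_design(X, centres, width)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
pred = Phi @ w
print("RMSE:", np.sqrt(np.mean((pred - y) ** 2)))
```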
Abstract:
BACKGROUND: Urothelial pathogenesis is a complex process driven by an underlying network of interconnected genes. The identification of novel genomic target regions and gene targets that drive urothelial carcinogenesis is crucial to improve our currently limited understanding of urothelial cancer (UC) at the molecular level. The inference of genome-wide gene regulatory networks (GRNs) from large-scale gene expression data provides a promising approach for a detailed investigation of the underlying network structure associated with urothelial carcinogenesis.
METHODS: In our study we inferred and compared three GRNs by applying the BC3Net inference algorithm to large-scale transitional cell carcinoma gene expression data sets from Illumina RNAseq (179 samples), Illumina Bead arrays (165 samples) and Affymetrix Oligo microarrays (188 samples). We investigated the structural and functional properties of the GRNs to identify molecular targets associated with urothelial cancer.
RESULTS: We found that the urothelial cancer (UC) GRNs show a significant enrichment of subnetworks that are associated with known cancer hallmarks, including cell cycle, immune response, signaling, differentiation and translation. Interestingly, the most prominent subnetworks of co-located genes were found on chromosome regions 5q31.3 (RNAseq), 8q24.3 (Oligo) and 1q23.3 (Bead), which all represent genomic regions known to be frequently deregulated or aberrant in urothelial and other cancer types. Furthermore, the identified hub genes of the individual GRNs, e.g., HID1/DMC1 (tumor development), RNF17/TDRD4 (cancer antigen) and CYP4A11 (angiogenesis/metastasis), are known cancer-associated markers. The GRNs were highly dataset-specific at the level of interactions between individual genes, but showed large similarities at the level of biological function represented by subnetworks. Remarkably, the RNAseq UC GRN showed twice the proportion of significant functional subnetworks. Based on our analysis of inferential and experimental networks, the Bead UC GRN showed the lowest performance compared to the RNAseq and Oligo UC GRNs.
CONCLUSION: To our knowledge, this is the first study investigating genome-scale UC GRNs. RNAseq-based gene expression data is the platform of choice for GRN inference. Our study offers new avenues for the identification of novel putative diagnostic targets for subsequent studies in bladder tumors.
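For readers unfamiliar with the inference step, here is a toy sketch in the spirit of C3Net (BC3Net bags many such runs): every gene is connected to the partner with which it shares the strongest association. Absolute correlation is used as a cheap stand-in for BC3Net's mutual-information estimator, and the expression matrix is random placeholder data.

```python
import numpy as np

rng = np.random.default_rng(3)
n_genes, n_samples = 50, 180
expr = rng.standard_normal((n_genes, n_samples))  # hypothetical expression matrix

score = np.abs(np.corrcoef(expr))  # gene-by-gene association scores
np.fill_diagonal(score, 0.0)

edges = set()
for g in range(n_genes):
    best = int(np.argmax(score[g]))      # strongest partner of gene g
    edges.add((min(g, best), max(g, best)))
print(f"{len(edges)} edges in the inferred network")
```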
Abstract:
Cloud data centres are critical business infrastructures and among the fastest growing service providers. Detecting anomalies in Cloud data centre operation is therefore vital. Given the vast complexity of the data centre system software stack, applications and workloads, anomaly detection is a challenging endeavour. Current tools for detecting anomalies often use machine learning techniques, application instance behaviours or system metrics distributions, which are complex to implement in Cloud computing environments as they require training, access to application-level data and complex processing. This paper presents LADT, a lightweight anomaly detection tool for Cloud data centres that uses rigorous correlation of system metrics, implemented by an efficient correlation algorithm without the need for training or a complex infrastructure setup. LADT is based on the hypothesis that, in an anomaly-free system, metrics from data centre host nodes and virtual machines (VMs) are strongly correlated. An anomaly is detected whenever this correlation drops below a threshold value. We demonstrate and evaluate LADT in a Cloud environment, showing that the hosting node's I/O operations per second (IOPS) are strongly correlated with the aggregated VM IOPS, but that this correlation vanishes when an application stresses the disk, indicating a node-level anomaly.
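A minimal sketch of the correlation test described above: Pearson correlation between host IOPS and aggregated VM IOPS over a sliding window, flagging an anomaly when it drops below a threshold. The window length, threshold and synthetic traces are assumptions, not LADT's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(4)
T, WINDOW, THRESHOLD = 600, 60, 0.5

vm_iops = rng.poisson(100, (4, T)).astype(float)  # four VMs on one host
host_iops = vm_iops.sum(axis=0) + rng.normal(0, 5, T)
host_iops[400:] += rng.normal(0, 300, 200)        # disk stress decorrelates the node

agg = vm_iops.sum(axis=0)
for t in range(WINDOW, T, WINDOW):
    r = np.corrcoef(host_iops[t - WINDOW:t], agg[t - WINDOW:t])[0, 1]
    if r < THRESHOLD:
        print(f"anomaly in window ending at t={t}: r={r:.2f}")
```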
Abstract:
This research presents a fast algorithm for projected support vector machines (PSVM): a basis vector set (BVS) is selected for the kernel-induced feature space, and the training points are projected onto the subspace spanned by the selected BVS. A standard linear support vector machine (SVM) is then produced in the subspace with the projected training points. As the dimension of the subspace is determined by the size of the selected basis vector set, the size of the produced SVM expansion can be specified. A two-stage algorithm is derived which selects and refines the basis vector set, achieving a locally optimal model. The model expansion coefficients and bias are updated recursively as the basis set and support vector set grow and shrink. The condition for a point to lie outside the span of the current basis vector set, and hence be selected as a new basis vector, is derived and embedded in the recursive procedure; this guarantees the linear independence of the produced basis set. The proposed algorithm is tested and compared with an existing sparse primal SVM (SpSVM) and a standard SVM (LibSVM) on seven public benchmark classification problems. The new algorithm is designed for human activity recognition on smart devices and embedded sensors, where sometimes limited memory and processing resources must be exploited to the full, and where more robust and accurate classification directly improves the user experience. Experimental results demonstrate the effectiveness and efficiency of the proposed algorithm. This work builds upon a previously published algorithm created specifically for activity recognition within mobile applications for the EU Haptimap project [1]. The algorithms detailed in this paper are more memory- and resource-efficient, making them suitable for bigger data sets and more easily trained SVMs.
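The linear independence condition can be illustrated with a small sketch: a candidate point joins the basis only if the residual of projecting its feature-space image onto the current span exceeds a tolerance. The RBF kernel, tolerance and data below are assumed; the paper's two-stage selection and refinement procedure is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(5)

def rbf(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

X = rng.standard_normal((300, 2))
tol = 1e-2
basis = [0]
for i in range(1, len(X)):
    B = X[basis]
    Kbb = rbf(B, B)
    kbi = rbf(B, X[i : i + 1])
    # Squared residual of phi(x_i) after projection onto span{phi(b) : b in basis}.
    resid = rbf(X[i : i + 1], X[i : i + 1]) - kbi.T @ np.linalg.solve(Kbb, kbi)
    if resid.item() > tol:
        basis.append(i)  # linearly independent enough: admit as a basis vector
print(f"selected {len(basis)} basis vectors out of {len(X)} points")
```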
Abstract:
Conventional practice in regional geochemistry includes, as a final step of any geochemical campaign, the generation of a series of maps showing the spatial distribution of each of the components considered. Such maps, though necessary, do not comply with the compositional, relative nature of the data, which unfortunately makes any conclusion based on them sensitive to spurious correlation problems. This is one of the reasons why these maps are never interpreted in isolation. This contribution aims at gathering a series of statistical methods to produce individual maps of multiplicative combinations of components (logcontrasts), much in the flavour of equilibrium constants, which are designed on purpose to capture certain aspects of the data.
We distinguish between supervised and unsupervised methods, where the former require an external, non-compositional variable (besides the compositional geochemical information) available in an analogous training set. This external variable can be a quantity (soil density, collocated magnetics, collocated ratio of Th/U spectral gamma counts, proportion of clay particle fraction, etc.) or a category (rock type, land-use type, etc.). In the supervised methods, a regression-like model between the external variable and the geochemical composition is derived on the training set, and this model is then mapped over the whole region. This case is illustrated with the Tellus dataset, covering Northern Ireland at a density of one soil sample per 2 square km, where we map the presence of blanket peat and the underlying geology. The unsupervised methods considered include principal components and principal balances (Pawlowsky-Glahn et al., CoDaWork 2013), i.e. logcontrasts of the data that are devised to capture very large variability or else be quasi-constant. Using the Tellus dataset again, it is found that geological features are highlighted by the quasi-constant ratios Hf/Nb and their ratio against SiO2; Rb/K2O and Zr/Na2O and the balance between these two groups of two variables; the balance of Al2O3 and TiO2 vs. MgO; or the balance of Cr, Ni and Co vs. V and Fe2O3. The largest variability appears to be related to the presence/absence of peat.
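As a concrete example of a logcontrast, the sketch below computes the balance between the groups {Al2O3, TiO2} and {MgO} mentioned above, using the usual isometric log-ratio normalising coefficient; the compositions are made-up values for illustration.

```python
import numpy as np

def balance(comp, num, den):
    """ilr-style balance between component groups `num` and `den`."""
    r, s = len(num), len(den)
    coef = np.sqrt(r * s / (r + s))
    g_num = np.exp(np.mean([np.log(comp[c]) for c in num], axis=0))  # geometric means
    g_den = np.exp(np.mean([np.log(comp[c]) for c in den], axis=0))
    return coef * np.log(g_num / g_den)

# Hypothetical compositions (wt%) at three sampling sites.
comp = {
    "Al2O3": np.array([14.2, 9.8, 12.1]),
    "TiO2":  np.array([0.9, 0.4, 0.7]),
    "MgO":   np.array([2.1, 6.3, 3.0]),
}
print(balance(comp, num=["Al2O3", "TiO2"], den=["MgO"]))
```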
Abstract:
Current data-intensive image processing applications push traditional embedded architectures to their limits. FPGA-based hardware acceleration is a potential solution, but the programmability gap and the time-consuming HDL design flow are significant obstacles. The proposed research approach, an FPGA-based programmable hardware acceleration platform built from a large number of Streaming Image processing Processors (SIPPro), potentially addresses these issues. SIPPro is a pipelined, in-order soft-core processor architecture with specific optimisations for image processing applications. Each SIPPro core uses 1 DSP48, 2 Block RAMs and 370 slice registers, making the processor as compact as possible whilst maintaining flexibility and programmability. It is an area-efficient, scalable and high-performance soft-core architecture capable of delivering 530 MIPS per core on a Xilinx Zynq SoC (ZC7Z020-3). To evaluate the feasibility of the proposed architecture, a Traffic Sign Recognition (TSR) algorithm has been prototyped on a Zedboard, with the color and morphology operations accelerated using multiple SIPPros. Simulation and experimental results demonstrate that the processing platform achieves speedups of 15 and 33 times for color filtering and morphology operations respectively, with significantly reduced design effort and time.
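As a software reference for the two accelerated stages, the sketch below runs a crude red-color filter followed by a 3x3 binary erosion on a synthetic frame; the thresholds and structuring element are illustrative assumptions, not the prototype's actual TSR parameters.

```python
import numpy as np

def color_filter(rgb):
    """Keep pixels whose red channel dominates (crude red-sign mask)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return (r > 120) & (r > 1.5 * g) & (r > 1.5 * b)

def erode(mask):
    """3x3 binary erosion: a pixel survives only if its whole neighbourhood
    is set (border wrap-around ignored for brevity)."""
    out = mask.copy()
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out &= np.roll(np.roll(mask, dy, axis=0), dx, axis=1)
    return out

rng = np.random.default_rng(6)
frame = rng.integers(0, 256, (480, 640, 3), dtype=np.uint8)  # hypothetical frame
mask = erode(color_filter(frame))
print(mask.sum(), "candidate sign pixels after filtering and erosion")
```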
Abstract:
The Richardson–Lucy algorithm is one of the most important algorithms in image deconvolution. However, a drawback is its slow convergence. A significant acceleration was obtained using the technique proposed by Biggs and Andrews (BA), which is implemented in the deconvlucy function of the MATLAB Image Processing Toolbox. The BA method was developed heuristically, with no proof of convergence. In this paper, we introduce the heavy-ball (H-B) method for Poisson data optimization and extend it to a scaled H-B method, which includes the BA method as a special case. The method comes with a proof of a convergence rate of O(1/k^2), where k is the number of iterations. We demonstrate the superior convergence performance, with a speedup factor of five, of the scaled H-B method on both synthetic and real 3D images.
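The flavour of the acceleration can be seen in a 1-D sketch: the standard RL multiplicative update is applied at a momentum-extrapolated point. The fixed momentum weight below is an illustrative assumption; the paper's scaled H-B scheme chooses it differently.

```python
import numpy as np

rng = np.random.default_rng(7)

def convolve(x, psf):
    return np.convolve(x, psf, mode="same")

def rl_heavy_ball(y, psf, iters=100, beta=0.4):
    x_prev = x = np.full_like(y, y.mean())
    psf_flip = psf[::-1]
    for _ in range(iters):
        # Extrapolated (momentum) point, kept non-negative.
        v = np.maximum(x + beta * (x - x_prev), 1e-12)
        ratio = y / np.maximum(convolve(v, psf), 1e-12)
        x_prev, x = x, v * convolve(ratio, psf_flip)  # RL multiplicative step
    return x

# Synthetic test: blur a spiky signal with a Gaussian PSF, add Poisson noise.
truth = np.zeros(200); truth[[50, 90, 140]] = [80, 120, 60]
psf = np.exp(-0.5 * (np.arange(-10, 11) / 3.0) ** 2); psf /= psf.sum()
y = rng.poisson(convolve(truth, psf) + 1e-3).astype(float)
x_hat = rl_heavy_ball(y, psf)
print("recovered peak positions:", sorted(np.argsort(x_hat)[-3:]))
```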
Abstract:
Analysing public sentiment about future events, such as demonstrations or parades, may provide valuable information when estimating the level of disruption and disorder during these events. Social media platforms, such as Twitter or Facebook, carry users' views and opinions on a wide range of public topics. Consequently, sentiment analysis of social media content may be of interest to various public sector organisations, especially in the security and law enforcement sector. In this paper we present a lexicon-based approach to sentiment analysis of Twitter content. The algorithm normalises the sentiment in an effort to provide an intensity of sentiment rather than a positive/negative label. Following this, we evaluate an evidence-based combining function that supports the classification process when positive and negative words co-occur in a tweet. Finally, we present a case study examining the relation between the sentiment of Twitter posts related to the English Defence League (EDL) and the level of disorder during EDL-related events.
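A toy version of the scoring step: sum word polarities from a lexicon and normalise by tweet length so the output is an intensity rather than a binary label. The mini-lexicon and the simple additive combination of co-occurring positive and negative evidence are assumptions, not the paper's combining function.

```python
SENTIMENT_LEXICON = {"great": 2, "good": 1, "calm": 1,
                     "bad": -1, "violent": -2, "riot": -2}

def tweet_sentiment(tweet):
    words = tweet.lower().split()
    scores = [SENTIMENT_LEXICON.get(w, 0) for w in words]
    pos = sum(s for s in scores if s > 0)
    neg = sum(s for s in scores if s < 0)
    # Naive combination when positive and negative words co-occur.
    combined = pos + neg
    return combined / max(len(words), 1)  # length-normalised intensity

print(tweet_sentiment("march stayed calm and good natured"))
print(tweet_sentiment("violent riot broke out"))
```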