29 results for text and data mining


Relevance: 100.00%

Abstract:

High-speed optical clock recovery, demultiplexing and data regeneration will be integral parts of any future photonic network based on high bit-rate OTDM. Much research has been conducted on devices that perform these functions; however, to date each process has been demonstrated independently. A very promising method of all-optical switching is the semiconductor optical amplifier-based nonlinear optical loop mirror (SOA-NOLM). It has several advantages over the standard fiber NOLM, most notably low switching power, compact size and stability. We use the SOA-NOLM as an all-optical mixer in a classical phase-locked loop arrangement to achieve optical clock recovery while simultaneously achieving data regeneration in a single compact device.

Relevance: 100.00%

Abstract:

We present the first experimental implementation of a recently designed quasi-lossless fibre span with strongly reduced signal power excursion. The resulting fibre waveguide medium can be advantageously used both in lightwave communications and in all-optical nonlinear data processing.

Relevance: 100.00%

Abstract:

This thesis describes advances in the characterisation, calibration and data processing of optical coherence tomography (OCT) systems. Femtosecond (fs) laser inscription was used to produce OCT-phantoms. Transparent materials are generally inert to infrared radiation, but with fs lasers material modification occurs via non-linear processes when the highly focused light source interacts with the material. This modification is confined to the focal volume and is highly reproducible. To select the best inscription parameters, combinations of different inscription parameters were tested, using three fs laser systems with different operating properties, on a variety of materials. This facilitated the understanding of the key characteristics of the produced structures, with the aim of producing viable OCT-phantoms. Finally, OCT-phantoms were successfully designed and fabricated in fused silica. The use of these phantoms to characterise many properties (resolution, distortion, sensitivity decay, scan linearity) of an OCT system was demonstrated. Quantitative methods were developed to support the characterisation of an OCT system collecting images from phantoms and also to improve the quality of the OCT images. Characterisation methods include the measurement of the spatially variant resolution (point spread function (PSF) and modulation transfer function (MTF)), sensitivity and distortion. Processing of OCT data is computationally intensive. Standard central processing unit (CPU) based processing might take several minutes to a few hours to process acquired data, so data processing is a significant bottleneck. An alternative is to use expensive hardware-based processing such as field programmable gate arrays (FPGAs). Recently, however, graphics processing unit (GPU) based data processing methods have been developed to minimize this data processing and rendering time.
These processing techniques include standard processing methods, which comprise a set of algorithms that process the raw data (interference) obtained by the detector and generate A-scans. The work presented here describes accelerated data processing and post-processing techniques for OCT systems. The GPU-based processing developed during the PhD was later implemented in a custom-built Fourier domain optical coherence tomography (FD-OCT) system. This system currently processes and renders data in real time. Its processing throughput is currently limited by the camera capture rate. OCT-phantoms have been heavily used for the qualitative characterization and adjustment/fine-tuning of the operating conditions of the OCT system. Currently, investigations are under way to characterize OCT systems using our phantoms. The work presented in this thesis demonstrates several novel techniques for fabricating OCT-phantoms and accelerating OCT data processing using GPUs. In the process of developing phantoms and quantitative methods, a thorough understanding and practical knowledge of OCT and fs laser processing systems was developed. This understanding leads to several novel pieces of research that are not only relevant to OCT but have broader importance. For example, extensive understanding of the properties of fs-inscribed structures will be useful in other photonic applications such as the fabrication of phase masks, waveguides and microfluidic channels. Acceleration of data processing with GPUs is also useful in other fields.
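The step of turning raw interference data into an A-scan can be illustrated with a minimal CPU-only sketch. It is not the thesis's GPU implementation: it assumes a spectrum already sampled evenly in wavenumber, uses a naive DFT from the standard library, and omits steps such as resampling and dispersion compensation. The function name `a_scan` and the synthetic single-reflector signal are hypothetical.

```python
import cmath
import math


def a_scan(spectrum):
    """Convert one spectral interferogram into a depth profile (A-scan).

    Assumes the samples are evenly spaced in wavenumber (an assumption of
    this sketch, not something stated in the abstract).
    """
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Remove the DC background and apply a Hann window to limit side lobes.
    windowed = [
        (s - mean) * 0.5 * (1 - math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(spectrum)
    ]
    # Naive O(n^2) DFT; keep only the first half of the bins (positive depths).
    profile = []
    for k in range(n // 2):
        acc = sum(
            w * cmath.exp(-2j * math.pi * k * i / n)
            for i, w in enumerate(windowed)
        )
        profile.append(abs(acc))
    return profile


# Synthetic interferogram: one reflector whose fringe frequency maps to bin 12.
N = 128
depth_bin = 12
spectrum = [1.0 + 0.5 * math.cos(2 * math.pi * depth_bin * i / N) for i in range(N)]
profile = a_scan(spectrum)
peak = max(range(len(profile)), key=profile.__getitem__)
```

Each output bin depends only on the input samples, so every depth bin and every A-scan can be computed independently, which is what makes this kind of processing a natural fit for GPUs.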

Relevance: 100.00%

Abstract:

Renewable energy sources have been widely used in recent decades, highlighting a "green" shift in energy production. A key driver of this shift is the set of EU directives that define the Union's targets for energy production from renewable sources, greenhouse gas emissions and increases in energy efficiency. All member countries are obligated to apply harmonized legislation and practices and to restructure their energy production networks in order to meet EU targets. To fulfil the 20-20-20 EU targets, Greece has adopted a specific strategy that promotes the construction of large-scale Renewable Energy Source plants. In this paper, we present an optimal design of the Greek renewable energy production network by applying a 0-1 Weighted Goal Programming model that considers social, environmental and economic criteria. In the absence of a panel of experts, a Data Envelopment Analysis (DEA) approach is used to filter the best of the possible network structures, seeking the maximum technical efficiency. A Super-Efficiency DEA model is also used to narrow down the solutions and identify the best among them. The results showed that, in order to achieve maximum efficiency, the social and environmental criteria must be weighted more heavily than the economic ones.
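The idea behind a 0-1 weighted goal programming model can be sketched on a toy instance. Everything here is hypothetical: the plant names, attribute values, targets and weights are invented for illustration, and brute-force enumeration stands in for the integer-programming solver a real study would use. Each binary choice says whether a plant is built, and the objective is the weighted sum of normalized unwanted deviations from each goal.

```python
from itertools import product

# Hypothetical candidates: (name, CO2 avoided [kt], jobs created, cost [M EUR]).
plants = [
    ("wind_A", 40, 30, 50),
    ("solar_B", 25, 45, 30),
    ("hydro_C", 60, 20, 90),
    ("biomass_D", 15, 50, 25),
]

# Hypothetical goals: (attribute index, target, weight, sense).
goals = [
    (1, 100, 0.3, "at_least"),  # environmental criterion
    (2, 90, 0.5, "at_least"),   # social criterion
    (3, 120, 0.2, "at_most"),   # economic criterion
]


def weighted_deviation(selection):
    """Weighted sum of normalized unwanted deviations for a 0/1 selection."""
    total = 0.0
    for attr, target, weight, sense in goals:
        achieved = sum(p[attr] for p, x in zip(plants, selection) if x)
        if sense == "at_least":
            deviation = max(0, target - achieved)  # penalize shortfall only
        else:
            deviation = max(0, achieved - target)  # penalize overshoot only
        total += weight * deviation / target
    return total


# Enumerate all 2^n build/skip combinations and keep the best one.
best = min(product((0, 1), repeat=len(plants)), key=weighted_deviation)
chosen = [p[0] for p, x in zip(plants, best) if x]
```

Changing the weights changes which network is selected, which is why the weighting of social, environmental and economic criteria matters to the outcome.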

Relevance: 100.00%

Abstract:

Accurate measurement of intervertebral kinematics of the cervical spine can support the diagnosis of widespread diseases related to neck pain, such as chronic whiplash dysfunction, arthritis, and segmental degeneration. The natural inaccessibility of the spine, its complex anatomy, and the small range of motion only permit concise measurement in vivo. Low-dose X-ray fluoroscopy allows time-continuous screening of the cervical spine during the patient's spontaneous motion. To obtain accurate motion measurements, each vertebra was tracked by means of image processing along a sequence of radiographic images. To obtain a time-continuous representation of motion and to reduce noise in the experimental data, smoothing spline interpolation was used. Intervertebral motion for cervical segments was estimated by processing the patient's fluoroscopic sequence; intervertebral angle and displacement and the instantaneous centre of rotation were computed. The RMS fitting error was about 0.2 degrees for rotation and 0.2 mm for displacement. © 2013 Paolo Bifulco et al.

Relevance: 100.00%

Abstract:

For wireless power transfer (WPT) systems, communication between the primary side and the pickup side is a challenge because of the large air gap and magnetic interference. This paper proposes a novel method that integrates bidirectional data communication into a high-power WPT system. The power and data transfer share the same inductive link between coreless coils. A power/data frequency-division multiplexing technique is applied: power and data are transmitted on different frequency carriers and controlled independently. The circuit model of the multiband system is provided to analyze the transmission gain of the communication channel, as well as the power delivery performance. The crosstalk interference between the two carriers is discussed. In addition, the signal-to-noise ratios of the channels are estimated, which gives a guideline for the design of mod/demod circuits. Finally, a 500-W WPT prototype has been built to demonstrate the effectiveness of the proposed WPT system.

Relevance: 100.00%

Abstract:

Sentiment classification over Twitter is usually affected by the noisy nature (abbreviations, irregular forms) of tweet data. A popular procedure to reduce the noise of textual data is to remove stopwords by using pre-compiled stopword lists or more sophisticated methods for dynamic stopword identification. However, the effectiveness of removing stopwords in the context of Twitter sentiment classification has been debated in the last few years. In this paper we investigate whether removing stopwords helps or hampers the effectiveness of Twitter sentiment classification methods. To this end, we apply six different stopword identification methods to Twitter data from six different datasets and observe how removing stopwords affects two well-known supervised sentiment classification methods. We assess the impact of removing stopwords by observing fluctuations in the level of data sparsity, the size of the classifier's feature space and its classification performance. Our results show that using pre-compiled lists of stopwords negatively impacts the performance of Twitter sentiment classification approaches. On the other hand, the dynamic generation of stopword lists, by removing those infrequent terms appearing only once in the corpus, appears to be the optimal method for maintaining a high classification performance while reducing the data sparsity and substantially shrinking the feature space.
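The dynamic approach mentioned above, treating terms that occur only once in the corpus as stopwords, can be sketched in a few lines. This is a minimal illustration rather than the authors' pipeline: the toy tweets, whitespace tokenization and function names are assumptions.

```python
from collections import Counter


def singleton_stopwords(tokenized_tweets):
    """Return the set of terms that occur exactly once in the whole corpus."""
    counts = Counter(tok for tweet in tokenized_tweets for tok in tweet)
    return {term for term, n in counts.items() if n == 1}


def remove_terms(tokenized_tweets, stopwords):
    """Drop every token that appears in the stopword set."""
    return [[tok for tok in tweet if tok not in stopwords] for tweet in tokenized_tweets]


# Hypothetical toy corpus, tokenized on whitespace.
tweets = [
    "luv this phone so much".split(),
    "this phone is gr8".split(),
    "so much hate for this update".split(),
]

stop = singleton_stopwords(tweets)
filtered = remove_terms(tweets, stop)

# Feature-space size before and after filtering.
vocab_before = {tok for tweet in tweets for tok in tweet}
vocab_after = {tok for tweet in filtered for tok in tweet}
```

On this toy corpus the vocabulary shrinks from 10 terms to 4, which mirrors on a small scale the feature-space reduction the abstract describes.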

Relevance: 100.00%

Abstract:

Product recommender systems are often deployed by e-commerce websites to improve user experience and increase sales. However, recommendation is limited by the product information hosted in those e-commerce sites and is only triggered when users are performing e-commerce activities. In this paper, we develop a novel product recommender system called METIS, a MErchanT Intelligence recommender System, which detects users' purchase intents from their microblogs in near real-time and makes product recommendation based on matching the users' demographic information extracted from their public profiles with product demographics learned from microblogs and online reviews. METIS distinguishes itself from traditional product recommender systems in the following aspects: 1) METIS was developed based on a microblogging service platform. As such, it is not limited by the information available in any specific e-commerce website. In addition, METIS is able to track users' purchase intents in near real-time and make recommendations accordingly. 2) In METIS, product recommendation is framed as a learning to rank problem. Users' characteristics extracted from their public profiles in microblogs and products' demographics learned from both online product reviews and microblogs are fed into learning to rank algorithms for product recommendation. We have evaluated our system in a large dataset crawled from Sina Weibo. The experimental results have verified the feasibility and effectiveness of our system. We have also made a demo version of our system publicly available and have implemented a live system which allows registered users to receive recommendations in real time. © 2014 ACM.

Relevance: 100.00%

Abstract:

Today, the data available to tackle many scientific challenges is vast in quantity and diverse in nature. The exploration of heterogeneous information spaces requires suitable mining algorithms as well as effective visual interfaces. miniDVMS v1.8 provides a flexible visual data mining framework which combines advanced projection algorithms developed in the machine learning domain and visual techniques developed in the information visualisation domain. The advantage of this interface is that the user is directly involved in the data mining process. Principled projection methods, such as generative topographic mapping (GTM) and hierarchical GTM (HGTM), are integrated with powerful visual techniques, such as magnification factors, directional curvatures, parallel coordinates, and user interaction facilities, to provide this integrated visual data mining framework. The software also supports conventional visualisation techniques such as principal component analysis (PCA), Neuroscale, and PhiVis. This user manual gives an overview of the purpose of the software tool, highlights some of the issues to take care of when creating a new model, and provides information about how to install and use the tool. The user manual does not require readers to be familiar with the algorithms the software implements. Basic computing skills are enough to operate the software.

Relevance: 100.00%

Abstract:

When applying multivariate analysis techniques in information systems and social science disciplines, such as management information systems (MIS) and marketing, the assumption that the empirical data originate from a single homogeneous population is often unrealistic. When applying a causal modeling approach, such as partial least squares (PLS) path modeling, segmentation is a key issue in coping with the problem of heterogeneity in estimated cause-and-effect relationships. This chapter presents a new PLS path modeling approach which classifies units on the basis of the heterogeneity of the estimates in the inner model. If unobserved heterogeneity significantly affects the estimated path model relationships on the aggregate data level, the methodology will allow homogeneous groups of observations to be created that exhibit distinctive path model estimates. The approach will thus provide differentiated analytical outcomes that permit more precise interpretations of each segment formed. An application to a large data set from the American customer satisfaction index (ACSI) substantiates the methodology’s effectiveness in evaluating PLS path modeling results.

Relevance: 100.00%

Abstract:

We address the important bioinformatics problem of predicting protein function from a protein's primary sequence. We consider the functional classification of G-Protein-Coupled Receptors (GPCRs), whose functions are specified in a class hierarchy. We tackle this task using a novel top-down hierarchical classification system where, for each node in the class hierarchy, the predictor attributes to be used in that node and the classifier to be applied to the selected attributes are chosen in a data-driven manner. Compared with a previous hierarchical classification system selecting classifiers only, our new system significantly reduced processing time without significantly sacrificing predictive accuracy.
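A top-down hierarchical classifier with a data-driven choice at each node can be sketched as follows. This is a simplified illustration, not the authors' system: it selects only the attribute subset per node (classifier selection is omitted), uses a nearest-centroid classifier as a stand-in, scores candidates on the training data instead of a held-out set, and runs on invented two-attribute samples with hypothetical class labels.

```python
def nearest_centroid(train, attrs):
    """Train a nearest-centroid classifier restricted to the given attributes."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(attrs))
        for j, a in enumerate(attrs):
            acc[j] += features[a]
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

    def predict(features):
        vec = [features[a] for a in attrs]
        return min(
            centroids,
            key=lambda lab: sum((x - c) ** 2 for x, c in zip(vec, centroids[lab])),
        )

    return predict


def fit_node(train, candidate_attr_sets):
    """Pick the attribute subset with the best training accuracy for one node."""
    best = None
    for attrs in candidate_attr_sets:
        clf = nearest_centroid(train, attrs)
        acc = sum(clf(f) == y for f, y in train) / len(train)
        if best is None or acc > best[0]:
            best = (acc, attrs, clf)
    return best[1], best[2]


def train_hierarchy(samples, candidate_attr_sets, depth=0, prefix=()):
    """Recursively fit one local classifier per internal node of the class tree."""
    here = [(f, p) for f, p in samples if p[:depth] == prefix and len(p) > depth]
    if not here:
        return None  # leaf class: nothing further to decide
    local = [(f, p[depth]) for f, p in here]
    attrs, clf = fit_node(local, candidate_attr_sets)
    children = {
        label: train_hierarchy(samples, candidate_attr_sets, depth + 1, prefix + (label,))
        for label in {y for _, y in local}
    }
    return {"attrs": attrs, "clf": clf, "children": children}


def predict_path(node, features):
    """Descend from the root, letting each node's classifier pick the next child."""
    path = []
    while node is not None:
        label = node["clf"](features)
        path.append(label)
        node = node["children"][label]
    return path


# Hypothetical samples: (attribute vector, path in the class hierarchy).
samples = [
    ([0.10, 0.2], ("A", "A.1")),
    ([0.20, 0.9], ("A", "A.2")),
    ([0.15, 0.1], ("A", "A.1")),
    ([0.10, 0.8], ("A", "A.2")),
    ([0.90, 0.5], ("B",)),
    ([0.80, 0.4], ("B",)),
]
tree = train_hierarchy(samples, [(0,), (1,), (0, 1)])
path = predict_path(tree, [0.12, 0.85])
```

In this toy run the root ends up using attribute 0 while the node for class A uses attribute 1, showing how different nodes can settle on different attributes.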

Relevance: 100.00%

Abstract:

Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework called joint sentiment-topic (JST) model based on latent Dirichlet allocation (LDA), which detects sentiment and topic simultaneously from text. A reparameterized version of the JST model called Reverse-JST, obtained by reversing the sequence of sentiment and topic generation in the modeling process, is also studied. Although JST is equivalent to Reverse-JST without a hierarchical prior, extensive experiments show that when sentiment priors are added, JST performs consistently better than Reverse-JST. Besides, unlike supervised approaches to sentiment classification which often fail to produce satisfactory performance when shifting to other domains, the weakly supervised nature of JST makes it highly portable to other domains. This is verified by the experimental results on data sets from five different domains where the JST model even outperforms existing semi-supervised approaches in some of the data sets despite using no labeled documents. Moreover, the topics and topic sentiment detected by JST are indeed coherent and informative. We hypothesize that the JST model can readily meet the demand of large-scale sentiment analysis from the web in an open-ended fashion.

Relevance: 100.00%

Abstract:

Purpose: Technological devices such as smartphones and tablets are widely available and increasingly used as visual aids. This study evaluated the use of a novel app for tablets (MD_evReader) developed as a reading aid for individuals with a central field loss resulting from macular degeneration. The MD_evReader app scrolls text as single lines (similar to a news ticker) and is intended to enhance reading performance using the eccentric viewing technique by both reducing the demands on the eye movement system and minimising the deleterious effects of perceptual crowding. Reading performance with scrolling text was compared with reading static sentences, also presented on a tablet computer. Methods: Twenty-six people with low vision (diagnosis of macular degeneration) read static or dynamic text (scrolled from right to left), presented as a single line at high contrast on a tablet device. Reading error rates and comprehension were recorded for both text formats, and the participants’ subjective experience of reading with the app was assessed using a simple questionnaire. Results: The average reading speed for static and dynamic text was not significantly different and was equal to or greater than 85 words per minute. The comprehension scores for both text formats were also similar, at approximately 95% correct. However, reading error rates were significantly (p=0.02) lower for dynamic text than for static text. The participants’ questionnaire ratings of their reading experience with the MD_evReader were highly positive and indicated a preference for reading with this app compared with their usual method. Conclusions: Our data show that reading performance with scrolling text is at least equal to that achieved with static text and in some respects (reading error rate) is better. Bespoke apps informed by an understanding of the underlying sensorimotor processes involved in a cognitive task such as reading have excellent potential as aids for people with visual impairments.

Relevance: 100.00%

Abstract:

In this article we present an automated framework that extracts product adopter information from online reviews and incorporates the extracted information into feature-based matrix factorization for more effective product recommendation. Specifically, we propose a bootstrapping approach for the extraction of product adopters from review text and categorize them into a number of different demographic categories. The aggregated demographic information of many product adopters can be used to characterize both products and users in the form of distributions over different demographic categories. We further propose a graph-based method to iteratively update user- and product-related distributions more reliably in a heterogeneous user-product graph and incorporate them as features into the matrix factorization approach for product recommendation. Our experimental results on a large dataset crawled from JINGDONG, the largest B2C e-commerce website in China, show that our proposed framework outperforms a number of competitive baselines for product recommendation.