903 results for Probabilistic latent semantic analysis (PLSA)





There are many applications for which reliable and safe robots are desired. For example, assistant robots for disabled or elderly people and surgical robots are required to be safe and reliable to prevent human injury and task failure. However, different levels of safety and reliability are required for different tasks, so understanding the reliability of robots is paramount. Currently, it is possible to guarantee the completion of a task when the robot is fault tolerant and the task remains in the fault-tolerant workspace (FTW). The traditional definition of the FTW does not consider different reliabilities for the robotic manipulator's different joints. The aim of this paper is to extend the concept of an FTW to address the reliability of different joints. Such an extension can offer a wider FTW while maintaining the required level of reliability. This is achieved by associating a probability with every part of the workspace to extend the FTW. As a result, reliable fault-tolerant workspaces (RFTWs) are introduced by using the novel concept of conditional reliability maps. Such an RFTW can be used to improve the performance of assistant robots while providing the confidence that the robot remains reliable for completion of its assigned tasks. © 2012 Copyright Taylor & Francis and The Robotics Society of Japan.





Ordinal data is omnipresent in almost all multiuser-generated feedback, such as questionnaires and preferences. This paper investigates modelling of ordinal data with Gaussian restricted Boltzmann machines (RBMs). In particular, we present the model architecture, learning and inference procedures for both vector-variate and matrix-variate ordinal data. We show that our model is able to capture the latent opinion profiles of citizens around the world, and is competitive against state-of-the-art collaborative filtering techniques on large-scale public datasets. The model thus has the potential to extend the application of RBMs to diverse domains such as recommendation systems, product reviews and expert assessments.





Efficient management of chronic diseases is critical in modern health care. We consider diabetes mellitus, and our ongoing goal is to examine how machine learning can deliver information for clinical efficiency. The challenge is to aggregate highly heterogeneous sources including demographics, diagnoses, pathologies and treatments, and extract similar groups so that care plans can be designed. To this end, we extend our recent model, the mixed-variate restricted Boltzmann machine (MV.RBM), as it seamlessly integrates multiple data types for each patient aggregated over time and outputs a homogeneous representation called "latent profile" that can be used for patient clustering, visualisation, disease correlation analysis and prediction. We demonstrate that the method outperforms all baselines on these tasks: the primary characteristics of patients in the same groups can be identified, and good results are achieved for diagnosis code prediction.





Following the recent success in quantitative analysis of essential fatty acid compositions in a commercial microencapsulated fish oil (μEFO) supplement, we extended the application of the portable attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopic technique and partial least squares regression (PLSR) analysis to the rapid determination of total protein content, the other major component in most commercial μEFO powders. In contrast to the traditional chromatographic methodology used in routine amino acid analysis (AAA), the ATR-FTIR spectra of the μEFO powder can be acquired directly from its original powder form with no sample preparation, making the technique exceptionally fast, noninvasive, environmentally friendly and cost effective, and hence eminently suitable for routine use by industry. By optimizing the spectral region of interest and the number of latent factors through the developed PLSR strategy, a good linear calibration model was produced, as indicated by an excellent coefficient of determination (R² = 0.9975), using standard μEFO powders with total protein contents in the range of 140-450 mg/g. The prediction of the protein contents of an independent validation set through the optimized PLSR model was highly accurate, as evidenced by (1) a good linear fit (R² = 0.9759) in the plot of predicted versus reference values, the latter obtained from a standard AAA method, (2) the lowest root mean square error of prediction (11.64 mg/g), and (3) a high residual predictive deviation (6.83), which ranks as a very good level of predictive quality and indicates high robustness and good predictive performance of the PLSR calibration model. The study therefore demonstrated the potential application of the portable ATR-FTIR technique, used together with PLSR analysis, for rapid online monitoring of the two major components (i.e., oil and protein contents) in finished μEFO powders in the actual manufacturing setting.
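The two validation metrics quoted in this abstract can be illustrated with a minimal sketch. It assumes the common definitions: root mean square error of prediction (RMSEP) as the square root of the mean squared difference between reference and predicted values, and residual predictive deviation (RPD) as the sample standard deviation of the reference values divided by the RMSEP. The abstract does not state which exact formulas the authors used, the function names `rmsep` and `rpd` are hypothetical, and the numbers are made-up illustration values, not data from the study.

```python
import math
import statistics


def rmsep(reference, predicted):
    """Root mean square error of prediction over a validation set."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("reference and predicted must be non-empty and equal in length")
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rpd(reference, predicted):
    """Residual predictive deviation, assumed here to be SD(reference) / RMSEP."""
    return statistics.stdev(reference) / rmsep(reference, predicted)


# Made-up protein contents in mg/g (illustration only, not from the study).
reference_mg_g = [150.0, 200.0, 300.0, 400.0]
predicted_mg_g = [160.0, 190.0, 310.0, 390.0]

print(f"RMSEP = {rmsep(reference_mg_g, predicted_mg_g):.2f} mg/g")
print(f"RPD   = {rpd(reference_mg_g, predicted_mg_g):.2f}")
```

Under these definitions, a larger RPD means the prediction error is small relative to the natural spread of the reference values, which is why a high value is read as good predictive quality.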





The growing number of individuals with autism spectrum disorder (ASD) requires continuous support and care. With the popularity of social media, online communities of people affected by ASD have emerged. This paper presents an analysis of these online communities through understanding the aspects that differentiate them. In this paper, these aspects are not expressed in terms of friendship, exchange of information, social support or recreation, but rather with regard to the topics and linguistic styles that people express in their online writing. Using data collected unobtrusively from LiveJournal, we analyze posts made by ten autism communities in conjunction with those made by a control group of standard communities. Significant differences have been found between autism and control communities when characterized by latent topics of discussion and psycholinguistic features. Latent topics are found to have greater predictive power than linguistic features when classifying blog posts as belonging to either an autism or a control community. This study suggests that data mining of online blogs has the potential to detect clinically meaningful data. It opens the door to possibilities including sentinel risk surveillance and harnessing the power in diverse large datasets.





Social media provides rich sources of personal information and community interaction which can be linked to aspects of mental health. In this paper we investigate manifest properties of textual messages, including latent topics, psycholinguistic features, and authors' mood, of a large corpus of blog posts, to analyze the aspect of social capital in social media communities. Using data collected from LiveJournal, we find that bloggers with lower social capital have fewer positive moods and more negative moods than those with higher social capital. It is also found that people with low social capital have more random mood swings over time than people with high social capital. Significant differences are found between low and high social capital groups when characterized by a set of latent topics and psycholinguistic features derived from blog posts, suggesting that these features are discriminative and useful for classification tasks. Good prediction is achieved when classifying among social capital groups using topic and linguistic features, with linguistic features found to have greater predictive power than latent topics. The significance of our work lies in the importance of online social capital to the potential construction of automatic healthcare monitoring systems. We further establish the link between mood and social capital in online communities, suggesting the foundation of new systems to monitor online mental well-being.





Critical analysis and problem-solving skills are two graduate attributes that are important in ensuring that graduates are well equipped to work across research and practice settings within the discipline of psychology. Despite the importance of these skills, few psychology undergraduate programmes have undertaken any systematic development, implementation, and evaluation of curriculum activities to foster them. The current study reports on the development and implementation of a tutorial programme designed to enhance the critical analysis and problem-solving skills of undergraduate psychology students. Underpinned by collaborative learning and problem-based learning, the tutorial programme was administered to 273 third-year undergraduate students in psychology. Latent Growth Curve Modelling revealed that students demonstrated a significant linear increase in self-reported critical analysis and problem-solving skills across the tutorial programme. The findings suggest that the development of an inquiry-based curriculum offers important opportunities for psychology undergraduates to develop critical analysis and problem-solving skills. © 2013 The Australian Psychological Society.





Streams of short text, such as news titles, enable us to effectively and efficiently learn about real-world events that occur anywhere and anytime. Short text messages, which are accompanied by timestamps and generally describe events in only a few words, differ from longer text documents such as web pages, news stories, blogs, technical papers and books. For example, few words repeat within the same news title, so term frequency (TF) is not as important in a short text corpus as in a longer text corpus. Therefore, analysis of short text faces new challenges. Also, detecting and tracking events through short text analysis requires reliably identifying events from constant topic clusters; however, existing methods, such as Latent Dirichlet Allocation (LDA), generate different topic results for the same corpus on different executions. In this paper, we provide a Finding Topic Clusters using Co-occurring Terms (FTCCT) algorithm to automatically generate topics from a short text corpus, and develop an Event Evolution Mining (EEM) algorithm to discover hot events and their evolutions (i.e., how the popularity of events changes over time). In FTCCT, a term (i.e., a single word or a multi-word phrase) belongs to only one topic in a corpus. Experiments on news titles from 157 countries over 4 months (from July to October 2013) demonstrate that our FTCCT-based method (combining FTCCT and EEM) achieves far higher quality of event content and description words than the LDA-based method (combining LDA and EEM) for analysis of streams of short text. Our method also visualizes the evolutions of the hot events. The discovered world-wide event evolutions reveal some interesting correlations between world-wide events; for example, successive extreme weather phenomena occurred in different locations: a typhoon in Hong Kong and the Philippines followed a hurricane and storm flood in Mexico in September 2013. © 2014 Springer Science+Business Media New York.





Multimedia content understanding research requires a rigorous approach to deal with the complexity of the data. At the crux of this problem is the method for dealing with multilevel data whose structure exists at multiple scales and across data sources. A common example is modeling tags jointly with images to improve retrieval, classification and tag recommendation. Associated contextual observation, such as metadata, is rich and can be exploited for content analysis. A major challenge is the need for a principled approach to systematically incorporate associated media with the primary data source of interest. Taking a factor modeling approach, we propose a framework that can discover low-dimensional structures for a primary data source together with other associated information. We cast this task as a subspace learning problem under the framework of Bayesian nonparametrics, so the subspace dimensionality and the number of clusters are learnt automatically from data instead of being set a priori. Using Beta processes as the building block, we construct random measures in a hierarchical structure to generate multiple data sources and capture their shared statistics at the same time. The model parameters are inferred efficiently using a novel combination of Gibbs and slice sampling. We demonstrate the applicability of the proposed model in three applications: image retrieval, automatic tag recommendation and image classification. Experiments using two real-world datasets show that our approach outperforms various state-of-the-art related methods.





Although organizational context is central to evidence-based practice, underdeveloped measurement hinders its assessment. The Alberta Context Tool, comprised of 59 items that tap 10 modifiable contextual concepts, was developed to address this gap. The purpose of this study was to examine the reliability and validity of scores obtained when the Alberta Context Tool is completed by professional nurses across different healthcare settings. Five separate studies (N = 2361 nurses across different care settings) comprised the study sample. Reliability and validity were assessed. Cronbach's alpha exceeded 0.70 for 9/10 Alberta Context Tool concepts. Item-total correlations exceeded acceptable standards for 56/59 items. Confirmatory Factor Analyses coordinated acceptably with the Alberta Context Tool's proposed latent structure. The mean values for each Alberta Context Tool concept increased from low to high levels of research utilization (as hypothesized), further supporting its validity. This study provides robust evidence for the reliability and validity of scores obtained with the Alberta Context Tool when administered to professional nurses.
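The reliability criterion mentioned above (Cronbach's alpha above 0.70) can be sketched with the standard formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores), where k is the number of items. This is only an illustration of the general statistic: the function name `cronbach_alpha` is hypothetical, the response matrix is made-up toy data, and nothing here reproduces the study's data or software.

```python
import statistics


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every respondent needs the same number (>= 2) of items")
    # Sample variance of each item (column) across respondents.
    item_variances = [statistics.variance(column) for column in zip(*scores)]
    # Sample variance of each respondent's total score.
    total_variance = statistics.variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy responses (5 respondents x 3 items), invented for illustration only.
toy_scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]
alpha = cronbach_alpha(toy_scores)
print(f"alpha = {alpha:.3f}, exceeds 0.70: {alpha > 0.70}")
```

Alpha approaches 1 as the items move together across respondents, which is the sense in which values above 0.70 are read as acceptable internal consistency.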





In this article, the exponential stability of Markovian jumping stochastic bidirectional associative memory (BAM) neural networks with mode-dependent probabilistic time-varying delays and impulsive control is investigated. By introducing a stochastic variable with a Bernoulli distribution, the information of the probabilistic time-varying delay is considered and transformed into one with a deterministic time-varying delay and stochastic parameters. By fully taking the inherent characteristics of this kind of stochastic BAM neural network into account, a novel Lyapunov-Krasovskii functional is constructed with as many positive definite matrices as possible, which depend on the system mode, and a triple-integral term is introduced for deriving the delay-dependent stability conditions. Furthermore, mode-dependent mean square exponential stability criteria are derived by constructing a new Lyapunov-Krasovskii functional with modes in the integral terms and using some stochastic analysis techniques. The criteria are formulated in terms of a set of linear matrix inequalities, which can be checked efficiently using standard numerical packages. Finally, numerical examples and their simulations are given to demonstrate the usefulness and effectiveness of the proposed results.





P2P collusive piracy, where paid P2P clients share content with unpaid clients, has drawn significant concern in recent years. The study of the follow relationship provides an emerging track of research in capturing the followee (e.g., a paid client) in order to block the spread of piracy to all of his followers (e.g., unpaid clients). Unfortunately, existing research efforts on the follow relationship in online social networks have largely overlooked the time constraint and the content feedback in sequential behavior analysis. Hence, how to consider these two characteristics for effective P2P collusive piracy prevention remains an open problem. In this paper, we propose a multi-bloom filter circle to facilitate the time-constrained storage and query of P2P sequential behaviors. Then, a probabilistic follow with content feedback model is developed to quickly discover and quantify the probabilistic follow relationship, and the corresponding approach to piracy prevention is designed. Extensive experimental analysis demonstrates the capability of the proposed approach.
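The abstract does not describe how the multi-bloom filter circle is built, so the following is only one plausible reading, sketched under explicit assumptions: a fixed ring of Bloom filters with one filter per time slot, where advancing the clock clears and reuses the oldest slot so that stored behaviors expire after a bounded time. The class names `BloomFilter` and `BloomFilterRing`, the parameters, and the event strings are all hypothetical and are not taken from the paper.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter: no false negatives, false positives are possible."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

    def clear(self):
        self.bits = bytearray(self.size)


class BloomFilterRing:
    """Hypothetical ring of per-time-slot Bloom filters (assumed design)."""

    def __init__(self, num_slots=4, size_bits=1024, num_hashes=3):
        self.slots = [BloomFilter(size_bits, num_hashes) for _ in range(num_slots)]
        self.current = 0

    def advance(self):
        # Move to the next slot and wipe it, which expires the oldest behaviors.
        self.current = (self.current + 1) % len(self.slots)
        self.slots[self.current].clear()

    def record(self, event):
        self.slots[self.current].add(event)

    def seen_recently(self, event):
        # An event counts as recent if any live slot may contain it.
        return any(event in slot for slot in self.slots)


ring = BloomFilterRing(num_slots=3)
ring.record("peerA->file42")  # hypothetical sharing event
print(ring.seen_recently("peerA->file42"))
for _ in range(3):  # after a full rotation the original slot has been cleared
    ring.advance()
print(ring.seen_recently("peerA->file42"))
```

In this reading, the number of slots multiplied by the slot duration bounds how long a behavior stays queryable, which is one simple way to impose a time constraint on storage and lookup.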





The Electronic Medical Record (EMR) has established itself as a valuable resource for large-scale analysis of health data. A hospital EMR dataset typically consists of medical records of hospitalized patients. A medical record contains diagnostic information (diagnosis codes), procedures performed (procedure codes) and admission details. Traditional topic models, such as latent Dirichlet allocation (LDA) and the hierarchical Dirichlet process (HDP), can be employed to discover disease topics from EMR data by treating patients as documents and diagnosis codes as words. This topic modeling helps to understand the constitution of patient diseases and offers a tool for better planning of treatment. In this paper, we propose a novel and flexible hierarchical Bayesian nonparametric model, the word distance dependent Chinese restaurant franchise (wddCRF), which incorporates word-to-word distances to discover semantically coherent disease topics. We are motivated by the fact that diagnosis codes are connected in the ICD-10 tree structure, which presents semantic relationships between codes. We exploit a decay function to incorporate distances between words at the bottom level of the wddCRF. Efficient inference is derived for the wddCRF using MCMC techniques. Furthermore, since procedure codes are often correlated with diagnosis codes, we develop the correspondence wddCRF (Corr-wddCRF) to explore conditional relationships of procedure codes for a given disease pattern. Efficient collapsed Gibbs sampling is derived for the Corr-wddCRF. We evaluate the proposed models on two real-world medical datasets: PolyVascular disease and Acute Myocardial Infarction disease. We demonstrate that the Corr-wddCRF model discovers more coherent topics than the Corr-HDP. We also use disease topic proportions as new features and show that using features from the Corr-wddCRF outperforms the baselines on 14-day readmission prediction. Besides these, the prediction of procedure codes based on the Corr-wddCRF also shows considerable accuracy.