999 results for Matrix Factorisation


Relevance:

100.00%

Publisher:

Abstract:

This thesis addressed issues that have prevented qualitative researchers from using thematic discovery algorithms. The central hypothesis evaluated whether allowing qualitative researchers to interact with thematic discovery algorithms and incorporate domain knowledge improved their ability to address research questions and trust the derived themes. Non-negative Matrix Factorisation and Latent Dirichlet Allocation find latent themes within document collections but these algorithms are rarely used, because qualitative researchers do not trust and cannot interact with the themes that are automatically generated. The research determined the types of interactivity that qualitative researchers require and then evaluated interactive algorithms that matched these requirements. Theoretical contributions included the articulation of design guidelines for interactive thematic discovery algorithms, the development of an Evaluation Model and a Conceptual Framework for Interactive Content Analysis.

Relevance:

100.00%

Publisher:

Abstract:

In this paper we formulate the nonnegative matrix factorisation (NMF) problem as a maximum likelihood estimation problem for hidden Markov models and propose online expectation-maximisation (EM) algorithms to estimate the NMF and the other unknown static parameters. We also propose a sequential Monte Carlo approximation of our online EM algorithm. We show the performance of the proposed method with two numerical examples. © 2012 IFAC.

Relevance:

70.00%

Publisher:

Abstract:

This article explores two matrix methods to induce the "shades of meaning" (SoM) of a word. A matrix representation of a word is computed from a corpus of traces based on the given word. Non-negative Matrix Factorisation (NMF) and Singular Value Decomposition (SVD) each compute a set of vectors, with each vector corresponding to a potential shade of meaning. The two methods were evaluated based on loss of conditional entropy with respect to two sets of manually tagged data. One set reflects concepts generally appearing in text, and the second set comprises words used for investigations into word sense disambiguation. Results show that NMF consistently outperforms SVD for inducing both the SoM of general concepts and word senses. The problem of inducing the shades of meaning of a word is more subtle than that of word sense induction and is hence relevant to thematic analysis of opinion, where nuances of opinion can arise.
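As a rough illustration of the two factorisations this abstract compares, the sketch below applies both to a tiny word-by-context count matrix. It is not the article's pipeline: the matrix `X` is made-up toy data, the helper names `nmf` and `truncated_svd` are hypothetical, and the NMF solver is the standard Lee-Seung multiplicative update for squared error, chosen only for brevity.

```python
import numpy as np

def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorise a nonnegative matrix X (m x n) as W @ H with W, H >= 0.

    Uses Lee-Seung multiplicative updates for the squared Frobenius error.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so signs never flip.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def truncated_svd(X, rank):
    """Best rank-`rank` approximation of X in the least-squares sense."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :rank] * s[:rank], Vt[:rank]

# Toy co-occurrence counts (hypothetical): rows are traces, columns are context words.
X = np.array([
    [4, 3, 0, 0],
    [5, 4, 0, 1],
    [0, 0, 6, 5],
    [0, 1, 4, 6],
], dtype=float)

W, H = nmf(X, rank=2)            # rows of H: nonnegative candidate "shades"
A, B = truncated_svd(X, rank=2)  # rows of B: orthogonal, mixed-sign directions

nmf_err = np.linalg.norm(X - W @ H)
svd_err = np.linalg.norm(X - A @ B)
print("NMF error:", nmf_err, "SVD error:", svd_err)
```

The SVD reconstruction error can never be worse than the NMF one at the same rank, so any advantage of NMF in a task like this would come from the nonnegative, parts-like factors rather than from a closer fit.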

Relevance:

70.00%

Publisher:

Abstract:

This work applies a variety of multilinear function factorisation techniques to extract appropriate features or attributes from high-dimensional multivariate time series for classification. Recently, a great deal of work has centred around designing time series classifiers using more and more complex feature extraction and machine learning schemes. This paper argues that complex learners and domain-specific feature extraction schemes of this type are not necessarily needed for time series classification, as excellent classification results can be obtained by simply applying a number of existing matrix factorisation or linear projection techniques, which are simple and computationally inexpensive. We highlight this using a geometric separability measure and classification accuracies obtained through experiments on four different high-dimensional multivariate time series datasets. © 2013 IEEE.

Relevance:

60.00%

Publisher:

Abstract:

This overview focuses on the application of chemometrics techniques for the investigation of soils contaminated by polycyclic aromatic hydrocarbons (PAHs) and metals because these two important and very diverse groups of pollutants are ubiquitous in soils. The salient features of various studies carried out in the micro- and recreational environments of humans are highlighted in the context of the various multivariate statistical techniques available across discipline boundaries that have been effectively used in soil studies. Particular attention is paid to techniques employed in the geosciences that may be effectively utilised for environmental soil studies; classical multivariate approaches that may be used in isolation or as complementary methods to these are also discussed. Chemometrics techniques widely applied in atmospheric studies for identifying sources of pollutants, or for determining the importance of contaminant source contributions to a particular site, have seen little use in soil studies but may be effectively employed in such investigations. Suitable programs are also available for suggesting mitigating measures in cases of soil contamination, and these are also considered. Specific techniques reviewed include pattern recognition techniques such as Principal Components Analysis (PCA), Fuzzy Clustering (FC) and Cluster Analysis (CA); geostatistical tools include variograms, Geographical Information Systems (GIS), contour mapping and kriging; source identification and contribution estimation methods reviewed include Positive Matrix Factorisation (PMF) and Principal Component Analysis on Absolute Principal Component Scores (PCA/APCS). Mitigating measures to limit or eliminate pollutant sources may be suggested through the use of ranking analysis and multi-criteria decision-making (MCDM) methods. These methods are mainly represented in this review by studies employing the Preference Ranking Organisation Method for Enrichment Evaluation (PROMETHEE) and its associated graphic output, Geometrical Analysis for Interactive Aid (GAIA).

Relevance:

60.00%

Publisher:

Abstract:

Airborne fine particles were collected at a suburban site in Queensland, Australia between 1995 and 2003. The samples were analysed for 21 elements, and Positive Matrix Factorisation (PMF), Preference Ranking Organisation METHods for Enrichment Evaluation (PROMETHEE) and Graphical Analysis for Interactive Assistance (GAIA) were applied to the data. PROMETHEE provided information on the ranking of pollutant levels from the sampling years, while PMF provided insights into the sources of the pollutants, their chemical composition, most likely locations and relative contribution to the levels of particulate pollution at the site. PROMETHEE and GAIA found that the removal of lead from fuel in the area had a significant impact on the pollution patterns, while PMF identified six pollution sources: Railways (5.5%), Biomass Burning (43.3%), Soil (9.2%), Sea Salt (15.6%), Aged Sea Salt (24.4%) and Motor Vehicles (2.0%). Thus the results gave information that can assist in the formulation of mitigation measures for air pollution.
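For readers unfamiliar with PMF, the sketch below shows the kind of objective it minimises: squared residuals scaled by per-measurement uncertainties, with nonnegative source contributions `G` and source profiles `F`. Everything here is an assumption for illustration: the data are synthetic, the names `pmf_sketch` and `weighted_q` are hypothetical, and the weighted multiplicative update is a swapped-in solver, not the algorithm used by dedicated PMF software or by this study.

```python
import numpy as np

def weighted_q(X, S, G, F):
    """PMF-style objective Q: sum of squared residuals scaled by uncertainties S."""
    return float(np.sum(((X - G @ F) / S) ** 2))

def pmf_sketch(X, S, n_factors, n_iter=500, eps=1e-12, seed=0):
    """Approximate X (samples x species) by G @ F with G, F >= 0, weighting by 1/S**2."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    Wt = 1.0 / S**2                        # weights from measurement uncertainties
    G = rng.random((n, n_factors)) + 0.1   # source contributions per sample
    F = rng.random((n_factors, m)) + 0.1   # source profiles per species
    history = [weighted_q(X, S, G, F)]
    for _ in range(n_iter):
        # Weighted multiplicative updates keep G and F nonnegative.
        G *= ((Wt * X) @ F.T) / ((Wt * (G @ F)) @ F.T + eps)
        F *= (G.T @ (Wt * X)) / (G.T @ (Wt * (G @ F)) + eps)
        history.append(weighted_q(X, S, G, F))
    return G, F, history

# Synthetic example: 30 samples, 4 species, 2 underlying sources (all hypothetical).
rng = np.random.default_rng(1)
G_true = rng.random((30, 2))
F_true = np.array([[5.0, 1.0, 0.2, 0.0],
                   [0.1, 0.5, 3.0, 4.0]])
X = np.clip(G_true @ F_true + 0.05 * rng.standard_normal((30, 4)), 0.0, None)
S = 0.05 * X + 0.05                        # assumed uncertainty model

G, F, history = pmf_sketch(X, S, n_factors=2)

# Share of the reconstructed total mass attributed to each factor, in percent.
mass = G.sum(axis=0) * F.sum(axis=1)
share = 100.0 * mass / mass.sum()
print("Q start:", history[0], "Q end:", history[-1])
print("Factor shares (%):", share)
```

Percentages such as those quoted in the abstract are typically derived from factor contributions in a similar spirit, although the exact normalisation used by the authors is not stated here.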

Relevance:

60.00%

Publisher:

Abstract:

House dust is a heterogeneous matrix, which contains a number of biological materials and particulate matter gathered from several sources. It accumulates a number of semi-volatile and non-volatile contaminants, which are trapped and preserved. Therefore, house dust can be viewed as an archive of both indoor and outdoor air pollution. There is evidence to show that, on average, people tend to stay indoors most of the time, and this increases exposure to house dust. The aims of this investigation were to: (i) assess the levels of Polycyclic Aromatic Hydrocarbons (PAHs), elements and pesticides in the indoor environment of the Brisbane area; (ii) identify and characterise the possible sources of elemental constituents (inorganic elements), PAHs and pesticides by means of Positive Matrix Factorisation (PMF); and (iii) establish the correlations between the levels of indoor air pollutants (PAHs, elements and pesticides) and the external and internal characteristics or attributes of the buildings and indoor activities by means of multivariate data analysis techniques. The dust samples were collected during the period 2005-2007 from homes located in different suburbs of Brisbane, Ipswich and Toowoomba, in South East Queensland, Australia. A vacuum cleaner fitted with a paper bag was used as a sampler for collecting the house dust. A survey questionnaire, completed by the house residents, collected information about the indoor and outdoor characteristics of their residences. House dust samples were analysed for three different pollutants: pesticides, elements and PAHs. The analyses were carried out for samples of particle size less than 250 µm. The chemical analyses for both pesticides and PAHs were performed using Gas Chromatography-Mass Spectrometry (GC-MS), while elemental analysis was carried out using Inductively Coupled Plasma Mass Spectrometry (ICP-MS). 
The data were subjected to multivariate data analysis techniques such as multi-criteria decision-making procedures, namely the Preference Ranking Organisation Method for Enrichment Evaluations (PROMETHEE) coupled with Geometrical Analysis for Interactive Aid (GAIA), in order to rank the samples and to examine data display. This study showed that, compared to the results from previous works carried out in Australia and overseas, the concentrations of pollutants in house dust in Brisbane and the surrounding areas were relatively high. The results of this work also showed significant correlations between some of the physical parameters (types of building material, floor level, distance from industrial areas and major roads, and smoking) and the concentrations of pollutants. Types of building materials and the age of houses were found to be two of the primary factors that affect the concentrations of pesticides and elements in house dust. The concentrations of these two types of pollutant appear to be higher in old houses (timber houses) than in brick ones. In contrast, the concentrations of PAHs were found to be higher in brick houses than in timber ones. Other factors, such as floor level and distance from the main street and industrial area, also affected the concentrations of pollutants in the house dust samples. To apportion the sources and to understand the mechanisms of pollutants, the Positive Matrix Factorisation (PMF) receptor model was applied. The results showed that there were significant correlations between the degree of concentration of contaminants in house dust and the physical characteristics of houses, such as the age and the type of the house, the distance from the main road and industrial areas, and smoking. Sources of pollutants were identified. 
For PAHs, the sources were cooking activities, vehicle emissions, smoking, oil fumes, natural gas combustion and traces of diesel exhaust emissions; for pesticides the sources were application of pesticides for controlling termites in buildings and fences, treating indoor furniture and in gardens for controlling pests attacking horticultural and ornamental plants; for elements the sources were soil, cooking, smoking, paints, pesticides, combustion of motor fuels, residual fuel oil, motor vehicle emissions, wearing down of brake linings and industrial activities.

Relevance:

60.00%

Publisher:

Abstract:

In this paper, we present the outcomes of a project on the exploration of the use of Field Programmable Gate Arrays (FPGAs) as co-processors for scientific computation. We designed a custom circuit for the pipelined solving of multiple tri-diagonal linear systems. The design is well suited for applications that require many independent tri-diagonal system solves, such as finite difference methods for solving PDEs or applications utilising cubic spline interpolation. The selected solver algorithm was the Tri-Diagonal Matrix Algorithm (TDMA or Thomas Algorithm). Our solver supports user-specified precision through the use of a custom floating-point VHDL library supporting addition, subtraction, multiplication and division. The variable-precision TDMA solver was tested for correctness in simulation mode. The TDMA pipeline was tested successfully in hardware using a simplified solver model. The details of implementation, the limitations, and future work are also discussed.
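The Thomas algorithm named in this abstract can be sketched in software as a forward elimination sweep followed by back substitution. This is a plain NumPy reference version, not the authors' VHDL pipeline; the function name `thomas_solve`, the array layout (with `a[0]` and `c[-1]` unused) and the test system are assumptions made for illustration. It uses only addition, subtraction, multiplication and division, the same four operations the abstract lists.

```python
import numpy as np

def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system A x = d.

    a: sub-diagonal   (length n, a[0] is unused)
    b: main diagonal  (length n)
    c: super-diagonal (length n, c[-1] is unused)
    d: right-hand side (length n)
    No pivoting is performed, so this assumes a well-conditioned system,
    for example a diagonally dominant one.
    """
    n = len(b)
    cp = np.empty(n)
    dp = np.empty(n)
    # Forward sweep: eliminate the sub-diagonal.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    # Back substitution.
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# Example: the (-1, 2, -1) system that arises from a 1-D finite difference scheme.
n = 5
a = np.full(n, -1.0)
b = np.full(n, 2.0)
c = np.full(n, -1.0)
d = np.ones(n)

x = thomas_solve(a, b, c, d)

# Dense reference solution for comparison.
A = np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)
x_ref = np.linalg.solve(A, d)
print(x)
```

Each solve costs O(n) operations, and independent systems share no data, which is what makes a batch of them a natural fit for a pipelined hardware design.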

Relevance:

60.00%

Publisher:

Abstract:

Particulate matter is common in our environment and has been linked to human health problems, particularly in the ultrafine size range. A range of chemical species have been associated with particulate matter, and of special concern are the hazardous chemicals that can accentuate health problems. If the sources of such particles can be identified, then strategies can be developed for the reduction of air pollution and, consequently, the improvement of the quality of life. In this investigation, particle number size distribution data and the concentrations of chemical species were obtained at two sites in Brisbane, Australia. Source apportionment was used to determine the sources (or factors) responsible for the particle size distribution data. The apportionment was performed by Positive Matrix Factorisation (PMF) and Principal Component Analysis/Absolute Principal Component Scores (PCA/APCS), and the results were compared with information from the gaseous chemical composition analysis. Although PCA/APCS resolved more sources, the results of the PMF analysis appear to be more reliable. Six common sources identified by both methods include: traffic 1, traffic 2, local traffic, biomass burning, and two unassigned factors. Thus, motor-vehicle-related activities had the most impact on the data, with the average contribution from nearly all sources to the measured concentrations higher during peak traffic hours and weekdays. Further analyses incorporated the meteorological measurements into the PMF results to determine the direction of the sources relative to the measurement sites, and this indicated that traffic on the nearby road and intersection was responsible for most of the factors. The described methodology, which utilised a combination of three types of data related to particulate matter to determine the sources, could assist future development of particle emission control and reduction strategies.