16 results for Isomorphic factorization

in Queensland University of Technology - ePrints Archive


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Narrative text is a useful way of identifying injury circumstances from routine emergency department data collections. Automatically classifying narratives with machine learning is a promising approach that can reduce the tedious manual classification process. Existing work focuses on Naive Bayes, which does not always offer the best performance. This paper proposes Matrix Factorization approaches along with a learning enhancement process for this task. The results are compared with the performance of various other classification approaches. The impact of parameter settings on the classification results for a medical text dataset is discussed. With the right choice of dimension k, the Non Negative Matrix Factorization-model method achieves a 10-fold cross-validation (CV) accuracy of 0.93.
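As an illustration of the factorization step mentioned in this abstract, the following is a minimal sketch of Non Negative Matrix Factorization using the standard multiplicative update rules for the Frobenius-norm objective. It is not the paper's implementation: the function names, the toy term-document matrix, the iteration count and the random initialisation are all assumptions made for the example. The parameter `k` plays the role of the reduced dimension discussed above.

```python
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Frobenius norm of V - WH, used here to monitor the fit."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2
               for row_v, row_wh in zip(V, WH)
               for v, wh in zip(row_v, row_wh)) ** 0.5


def nmf(V, k, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative n x m matrix V into W (n x k) and H (k x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start (an assumption of this sketch).
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H


# Toy document-term matrix with two obvious groups of documents.
V = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
W, H = nmf(V, k=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

Each row of `W` is a k-dimensional representation of one document, which could then be passed to any classifier. Choosing `k` would in practice be done by comparing cross-validated accuracy over several candidate values.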

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Near-infrared (NIR) and Fourier transform infrared (FTIR) spectroscopy have been used to determine the mineralogical character of isomorphic substitutions for Mg2+ by divalent transition metals Fe, Mn, Co and Ni in the natural halotrichite series. The minerals are characterised by d-d transitions in the NIR region 12000-7500 cm-1. The NIR spectrum of halotrichite reveals a broad feature from 12000 to 7500 cm-1 with a splitting into two bands resulting from the ferrous ion transition 5T2g → 5Eg. The presence of overtones of OH- fundamentals near 7000 cm-1 confirms molecular water in the mineral structure of the halotrichite series. The most intense peak, at around 5132 cm-1, is a common feature of the three minerals and is derived from a combination of OH- vibrations of water molecules and ν2 water bending modes. The influence of cations such as Mg2+, Fe2+, Mn2+, Co2+ and Ni2+ is evident in the spectra of the halotrichites, especially in the wupatkiite OH stretching vibrations, whose bands are distorted conspicuously to low wavenumbers at 3270, 2904 and 2454 cm-1. The observation of the high-frequency ν2 mode in the infrared spectrum at 1640 cm-1 indicates that the coordinated water molecules are strongly hydrogen bonded in natural halotrichites. The splitting of bands in the ν3 and ν4 (SO4)2- stretching regions may be attributed to the reduction of symmetry from Td to C2v for the sulphate ion. This work has shown the usefulness of NIR spectroscopy for the rapid identification and classification of the halotrichite minerals.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

NIR and IR spectroscopy have been applied for detection of chemical species and the nature of hydrogen bonding in arsenate complexes. The structure and spectral properties of the copper(II) arsenate minerals chalcophyllite and chenevixite are compared with the copper(II) sulphate minerals devilline, chalcoalumite and caledonite. Split NIR bands in the electronic spectrum in the two ranges 11700-8500 cm-1 and 8500-7200 cm-1 confirm distortion of octahedral symmetry for Cu(II) in the arsenate complexes. The observed bands with maxima at 9860 and 7750 cm-1 are assigned to the Cu(II) transitions 2B1g → 2B2g and 2B1g → 2A1g. Overlapping bands in the NIR region 4500-4000 cm-1 are the effect of the multiple anions OH-, (AsO4)3- and (SO4)2-. The observation of broad and diffuse bands in the range 3700-2900 cm-1 confirms strong hydrogen bonding in chalcophyllite relative to chenevixite. The position of the water bending vibrations indicates the water is strongly hydrogen bonded in the mineral structure. The strong absorption feature centred at 1644 cm-1 in chalcophyllite indicates water is strongly hydrogen bonded in the mineral structure. The H2O-bending vibrations shift to low wavenumbers in chenevixite and an additional band observed at 1390 cm-1 is related to carbonate impurity. The characterisation of the IR spectra by ν3 antisymmetric stretching vibrations of (SO4)2- and (AsO4)3- ions near 1100 and 800 cm-1 respectively is the result of isomorphic substitution for arsenate by sulphate in both chalcophyllite and chenevixite.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Near-infrared (NIR) and infrared (IR) spectroscopy have been applied for characterisation of three complex Cu-Zn sulphate/phosphate minerals, namely ktenasite, orthoserpierite and kipushite. The spectral signatures of the three minerals are quite distinct in relation to their composition and structure. The effect of structural cation substitution (Zn2+ and Cu2+) on band shifts is significant in both the electronic and vibrational spectra of these Cu-Zn minerals. The variable Cu:Zn ratio between Zn-rich and Cu-rich compositions shows a strong effect on Cu(II) bands in the electronic spectra. The Cu(II) spectrum is most significant in kipushite (Cu-rich) with bands displayed at high wavenumbers at 11390 and 7545 cm-1. The isomorphic substitution of Cu2+ for Zn2+ is reflected in the NIR and IR spectroscopic signatures. The multiple bands for ν3 and ν4 (SO4)2- stretching vibrations in ktenasite and orthoserpierite are attributed to the reduction of symmetry of the sulphate ion from Td to C2v. The IR spectrum of kipushite is characterised by strong (PO4)3- vibrational modes at 1090 and 990 cm-1. The range of IR absorption is higher in ktenasite than in kipushite while it is intermediate in orthoserpierite.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The multi-criteria decision making methods, Preference METHods for Enrichment Evaluation (PROMETHEE) and Graphical Analysis for Interactive Assistance (GAIA), and the two-way Positive Matrix Factorization (PMF) receptor model were applied to airborne fine particle compositional data collected at three sites in Hong Kong during two monitoring campaigns held from November 2000 to October 2001 and November 2004 to October 2005. PROMETHEE/GAIA indicated that air quality at the three sites was worse during the later monitoring campaign, and that the order of the air quality at the sites during each campaign was: rural site > urban site > roadside site. The PMF analysis, on the other hand, identified 6 common sources at all of the sites (diesel vehicle, fresh sea salt, secondary sulphate, soil, aged sea salt and oil combustion) which accounted for approximately 68.8 ± 8.7% of the fine particle mass at the sites. In addition, road dust, gasoline vehicle, biomass burning, secondary nitrate, and metal processing were identified at some of the sites. Secondary sulphate was found to be the highest contributor to the fine particle mass at the rural and urban sites, with vehicle emission a high contributor at the roadside site. The PMF results are broadly similar to those obtained in a previous analysis by PCA/APCS. However, the PMF analysis resolved more factors at each site than the PCA/APCS. In addition, the study demonstrated that combined results from multi-criteria decision making analysis and receptor modelling can provide more detailed information that can be used to formulate the scientific basis for mitigating air pollution in the region.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We investigate the behavior of the empirical minimization algorithm using various methods. We first analyze it by comparing the empirical, random, structure and the original one on the class, either in an additive sense, via the uniform law of large numbers, or in a multiplicative sense, using isomorphic coordinate projections. We then show that a direct analysis of the empirical minimization algorithm yields a significantly better bound, and that the estimates we obtain are essentially sharp. The method of proof we use is based on Talagrand’s concentration inequality for empirical processes.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The presence of arsenic in the environment is a hazard. The uptake of arsenate by a range of cations during mineral formation provides a mechanism for the accumulation of arsenate. The tsumcorite minerals are an example of a series of minerals which accumulate arsenate. There are about twelve examples in this mineral group. Raman spectroscopy offers a method for the analysis of these minerals. The structures of selected tsumcorite minerals with arsenate and sulphate anions were analysed by Raman spectroscopy. Isomorphic substitution of sulphate for arsenate is observed for gartrellite and thometzekite. A comparison is made with the sulphate bearing mineral natrochalcite. The positions of the hydroxyl and water stretching vibrations are related to the strength of the hydrogen bond formed between the OH unit and the (AsO4)3- anion. Characteristic Raman spectra of the minerals enable the assignment of the bands to specific vibrational modes.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In topological mapping, perceptual aliasing can cause different places to appear indistinguishable to the robot. In the case of severely corrupted or unavailable odometry information, topological mapping is difficult as the robot is challenged with the loop-closing problem; that is, to determine whether it has visited a particular place before. In this article we propose to use neighbourhood information to disambiguate otherwise indistinguishable places. Using neighbourhood information for place disambiguation is an approach that neither depends on a specific choice of sensors nor requires geometric information such as odometry. Local neighbourhood information is extracted from a sequence of observations of visited places. In experiments using either sonar or visual observations from an indoor environment, the benefits of using neighbourhood clues for the disambiguation of otherwise identical vertices are demonstrated. Over 90% of the maps we obtain are isomorphic with the ground truth. The choice of the robot’s sensors does not impact the results of the experiments much.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Purpose: To investigate the significance of sources around measurement sites, assist the development of control strategies for the important sources and mitigate the adverse effects of air pollution due to particle size. Methods: In this study, sampling was conducted at two sites located in urban/industrial and residential areas situated at roadsides along the Brisbane Urban Corridor. Ultrafine and fine particle measurements obtained at the two sites in June-July 2002 were analysed by Positive Matrix Factorization (PMF). Results: Six sources were present, including local traffic, two traffic sources, biomass burning, and two currently unidentified sources. Secondary particles had a significant impact at Site 1, while nitrates, peak traffic hours and main roads located close to the source also affected the results for both sites. Conclusions: This significant traffic corridor exemplifies the type of sources present in heavily trafficked locations and future attempts to control pollution in this type of environment could focus on the sources that were identified.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The aim of this paper is to provide a comparison of various algorithms and parameters to build reduced semantic spaces. The effect of dimension reduction, the stability of the representation and the effect of word order are examined in the context of the five algorithms bearing on semantic vectors: Random projection (RP), singular value decomposition (SVD), non-negative matrix factorization (NMF), permutations and holographic reduced representations (HRR). The quality of semantic representation was tested by means of a synonym-finding task using the TOEFL test on the TASA corpus. Dimension reduction was found to improve the quality of semantic representation but it is hard to find the optimal parameter settings. Even though dimension reduction by RP was found to be more generally applicable than SVD, the semantic vectors produced by RP are somewhat unstable. The effect of encoding word order into the semantic vector representation via HRR did not lead to any increase in scores over vectors constructed from word co-occurrence in context information. In this regard, very small context windows resulted in better semantic vectors for the TOEFL test.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Topic recommendation can help users deal with the information overload issue in micro-blogging communities. This paper proposes to use the implicit information network formed by the multiple relationships among users, topics and micro-blogs, and the temporal information of micro-blogs to find semantically and temporally relevant topics of each topic, and to profile users' time-drifting topic interests. The Content based, Nearest Neighborhood based and Matrix Factorization models are used to make personalized recommendations. The effectiveness of the proposed approaches is demonstrated in experiments conducted on a real-world dataset collected from Twitter.com.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Robust hashing is an emerging field that can be used to hash certain data types in applications unsuitable for traditional cryptographic hashing methods. Traditional hashing functions have been used extensively for data/message integrity, data/message authentication, efficient file identification and password verification. These applications are possible because the hashing process is compressive, allowing for efficient comparisons in the hash domain, and non-invertible, meaning hashes can be used without revealing the original data. These techniques were developed with deterministic (non-changing) inputs such as files and passwords. For such data types a 1-bit or one-character change can be significant; as a result, the hashing process is sensitive to any change in the input. Unfortunately, there are certain applications where input data are not perfectly deterministic and minor changes cannot be avoided. Digital images and biometric features are two types of data where such changes exist but do not alter the meaning or appearance of the input. For such data types cryptographic hash functions cannot be usefully applied. In light of this, robust hashing has been developed as an alternative to cryptographic hashing and is designed to be robust to minor changes in the input. Although similar in name, robust hashing is fundamentally different from cryptographic hashing. Current robust hashing techniques are not based on cryptographic methods, but instead on pattern recognition techniques. Modern robust hashing algorithms consist of feature extraction followed by a randomization stage that introduces non-invertibility and compression, followed by quantization and binary encoding to produce a binary hash output. In order to preserve robustness of the extracted features, most randomization methods are linear and this is detrimental to the security aspects required of hash functions. 
Furthermore, the quantization and encoding stages used to binarize real-valued features require the learning of appropriate quantization thresholds. How these thresholds are learnt has an important effect on hashing accuracy, and the mere presence of such thresholds is a source of information leakage that can reduce hashing security. This dissertation outlines a systematic investigation of the quantization and encoding stages of robust hash functions. While existing literature has focused on the importance of the quantization scheme, this research is the first to emphasise the importance of quantizer training for both hashing accuracy and hashing security. The quantizer training process is presented in a statistical framework which allows a theoretical analysis of the effects of quantizer training on hashing performance. This is experimentally verified using a number of baseline robust image hashing algorithms over a large database of real world images. This dissertation also proposes a new randomization method for robust image hashing based on Higher Order Spectra (HOS) and Radon projections. The method is non-linear and this is an essential requirement for non-invertibility. The method is also designed to produce features more suited for quantization and encoding. The system can operate without the need for quantizer training, is more easily encoded and displays improved hashing performance when compared to existing robust image hashing algorithms. The dissertation also shows how the HOS method can be adapted to work with biometric features obtained from 2D and 3D face images.
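The generic pipeline described in this abstract (extracted features, a keyed randomization stage, quantization against learnt thresholds, and binary encoding) can be sketched as follows. This is a hedged illustration of the conventional linear baseline only, not the dissertation's HOS/Radon method. The function names, the Gaussian random projection, the per-bit median thresholds and the integer key are all assumptions introduced for the example, and feature extraction is assumed to have already produced a list of real values.

```python
import random
from statistics import median


def random_projection(features, key, n_bits):
    """Linear randomization stage: project features onto n_bits keyed Gaussian directions."""
    rng = random.Random(key)  # the secret key seeds the projection matrix
    return [sum(rng.gauss(0.0, 1.0) * x for x in features) for _ in range(n_bits)]


def train_thresholds(training_features, key, n_bits):
    """Quantizer training: one threshold per bit, here the median over a training set."""
    projected = [random_projection(f, key, n_bits) for f in training_features]
    return [median(column) for column in zip(*projected)]


def robust_hash(features, key, thresholds):
    """Quantize each projected value against its threshold and encode it as one bit."""
    projected = random_projection(features, key, len(thresholds))
    return [1 if value > t else 0 for value, t in zip(projected, thresholds)]


def hamming_distance(a, b):
    """Number of differing bits; a small distance suggests perceptually similar inputs."""
    return sum(x != y for x, y in zip(a, b))


# Toy usage with synthetic 8-dimensional feature vectors.
rng = random.Random(1)
training = [[rng.random() for _ in range(8)] for _ in range(20)]
thresholds = train_thresholds(training, key=42, n_bits=16)

original = training[0]
slightly_changed = [x + 0.001 for x in original]
h1 = robust_hash(original, 42, thresholds)
h2 = robust_hash(slightly_changed, 42, thresholds)
print("Hamming distance:", hamming_distance(h1, h2))
```

Two points from the abstract are visible in the sketch: the projection is linear, and the stored `thresholds` are extra information that must exist alongside the hash, which is the kind of leakage the dissertation investigates.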

Relevance:

10.00% 10.00%

Publisher:

Abstract:

An Aerodyne Aerosol Mass Spectrometer was deployed at five urban schools to examine the spatial and temporal variability of organic aerosols (OA), and positive matrix factorization (PMF) was used for the first time in the Southern Hemisphere to apportion the sources of the OA across an urban area. The sources identified included hydrocarbon-like OA (HOA), biomass burning OA (BBOA) and oxygenated OA (OOA). At all sites, the main source was OOA, which accounted for 62–73% of the total OA mass and was generally more oxidized compared to those reported in the Northern Hemisphere. This suggests that there are differences in aging processes or regional sources in the two hemispheres. Unlike HOA and BBOA, OOA demonstrated instructive temporal variations but not spatial variation across the urban area. Application of cluster analysis to the PMF-derived sources offered a simple and effective method for qualitative comparison of PMF sources that can be used in other studies.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper discusses how fundamentals of number theory, such as unique prime factorization and greatest common divisor can be made accessible to secondary school students through spreadsheets. In addition, the three basic multiplicative functions of number theory are defined and illustrated through a spreadsheet environment. Primes are defined simply as those natural numbers with just two divisors. One focus of the paper is to show the ease with which spreadsheets can be used to introduce students to some basics of elementary number theory. Complete instructions are given to build a spreadsheet to enable the user to input a positive integer, either with a slider or manually, and see the prime decomposition. The spreadsheet environment allows students to observe patterns, gain structural insight, form and test conjectures, and solve problems in elementary number theory.
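The paper works in a spreadsheet, but the same ideas can be sketched in a few lines of code. The example below computes the prime decomposition by trial division and then derives three multiplicative functions from it. The abstract does not name its three functions, so the choice here (number of divisors, sum of divisors, and Euler's totient) is an assumption, as are all function names.

```python
from math import gcd


def prime_factorization(n):
    """Return the prime decomposition of n as a list of (prime, exponent) pairs."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            exponent = 0
            while n % p == 0:
                n //= p
                exponent += 1
            factors.append((p, exponent))
        p += 1
    if n > 1:
        # Whatever remains after trial division is itself prime.
        factors.append((n, 1))
    return factors


def divisor_count(n):
    """Number of positive divisors: product of (e + 1) over prime powers p^e."""
    result = 1
    for _, e in prime_factorization(n):
        result *= e + 1
    return result


def divisor_sum(n):
    """Sum of positive divisors: product of (p^(e+1) - 1) / (p - 1)."""
    result = 1
    for p, e in prime_factorization(n):
        result *= (p ** (e + 1) - 1) // (p - 1)
    return result


def euler_phi(n):
    """Count of integers in 1..n that are coprime to n."""
    result = n
    for p, _ in prime_factorization(n):
        result = result // p * (p - 1)
    return result


def is_prime(n):
    """A prime is a natural number with exactly two divisors."""
    return n >= 2 and divisor_count(n) == 2


def are_coprime(a, b):
    """Two integers are coprime when their greatest common divisor is 1."""
    return gcd(a, b) == 1


print(prime_factorization(360))  # [(2, 3), (3, 2), (5, 1)]
print(divisor_count(360), divisor_sum(12), euler_phi(12))  # 24 28 4
```

The `is_prime` helper mirrors the abstract's definition of primes as natural numbers with just two divisors, and uniqueness of the decomposition is what makes the three product formulas well defined.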

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A description of a patient's injuries is recorded in narrative text form by hospital emergency departments. For statistical reporting, this text data needs to be mapped to pre-defined codes. Existing research in this field uses the Naïve Bayes probabilistic method to build classifiers for mapping. In this paper, we focus on providing guidance on the selection of a classification method. We build a number of classifiers belonging to different classification families such as decision tree, probabilistic, neural networks, and instance-based, ensemble-based and kernel-based linear classifiers. Extensive pre-processing is carried out to ensure the quality of the data and, hence, the quality of the classification outcome. Records with a null entry in the injury description are removed. Misspellings are corrected by finding each misspelt word and replacing it with a soundlike word. Meaningful phrases have been identified and kept, instead of removing part of a phrase as a stop word. Abbreviations appearing in many forms of entry are manually identified and only one form of each abbreviation is used. Clustering is utilised to discriminate between non-frequent and frequent terms. This process reduced the number of text features dramatically from about 28,000 to 5000. The medical narrative text injury dataset under consideration is composed of many short documents. The data can be characterized as high-dimensional and sparse, i.e., few features are irrelevant but features are correlated with one another. Therefore, matrix factorization techniques such as Singular Value Decomposition (SVD) and Non Negative Matrix Factorization (NNMF) have been used to map the processed feature space to a lower-dimensional feature space. Classifiers with these reduced feature spaces have been built. In experiments, a set of tests is conducted to determine which classification method is best for the medical text classification. 
The Non Negative Matrix Factorization with Support Vector Machine method can achieve 93% precision which is higher than all the tested traditional classifiers. We also found that TF/IDF weighting which works well for long text classification is inferior to binary weighting in short document classification. Another finding is that the Top-n terms should be removed in consultation with medical experts, as it affects the classification performance.
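To make the weighting comparison in this abstract concrete, the sketch below builds both a binary and a TF/IDF document-term matrix for a few short, invented injury narratives. It only illustrates the two weighting schemes; it does not reproduce the paper's pipeline or results. The tokenised example texts, the function names and the unsmoothed `tf * log(N / df)` variant of TF/IDF are assumptions.

```python
import math
from collections import Counter


def build_vocabulary(docs):
    """Sorted list of all distinct terms in the tokenised documents."""
    return sorted({term for doc in docs for term in doc})


def binary_weights(docs):
    """1 if the term occurs in the document, 0 otherwise."""
    vocab = build_vocabulary(docs)
    rows = []
    for doc in docs:
        present = set(doc)
        rows.append([1 if term in present else 0 for term in vocab])
    return vocab, rows


def tfidf_weights(docs):
    """Term frequency times log(N / document frequency), without smoothing."""
    vocab = build_vocabulary(docs)
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    rows = []
    for doc in docs:
        term_freq = Counter(doc)
        rows.append([term_freq[term] * math.log(n_docs / doc_freq[term])
                     for term in vocab])
    return vocab, rows


# Hypothetical short narratives, already tokenised.
docs = [
    ["fell", "off", "ladder"],
    ["fell", "from", "bike"],
    ["burn", "hand", "hot", "water"],
]
vocab, binary = binary_weights(docs)
_, tfidf = tfidf_weights(docs)
for term, b, w in zip(vocab, binary[0], tfidf[0]):
    print(f"{term:8s} binary={b} tfidf={w:.3f}")
```

In documents this short, almost every term occurs once, so the term-frequency part of TF/IDF carries little information and the weights are driven mainly by document frequency. That is one plausible reading of why binary weighting can do as well or better on short texts, though the abstract itself does not give the reason.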