14 results for data-projection

in Deakin Research Online - Australia


Relevance:

70.00%

Publisher:

Abstract:

In this paper, a hybrid intelligent system that integrates the SOM (Self-Organizing Map) neural network, kMER (kernel-based Maximum Entropy learning Rule), and Probabilistic Neural Network (PNN) for data visualization and classification is proposed. The rationale of this Probabilistic SOM-kMER model is explained, and its applicability is demonstrated using two benchmark data sets. The results are analyzed and compared with those from a number of existing methods. The implications of the proposed hybrid system as a useful and usable data visualization and classification tool are discussed.
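The classification stage of such a system can be illustrated with a minimal sketch of a Probabilistic Neural Network, which scores each class by the average Gaussian kernel response of its training patterns. This covers only the PNN component, not the SOM or kMER stages, and it is not the authors' implementation; the function name `pnn_classify`, the smoothing parameter `sigma`, and the toy data are assumptions made for illustration.

```python
import math

def pnn_classify(x, train, sigma=0.5):
    """Return the class whose training patterns give the highest
    average Gaussian kernel response at x (a basic PNN decision rule)."""
    sums = {}
    counts = {}
    for pattern, label in train:
        d2 = sum((a - b) ** 2 for a, b in zip(x, pattern))
        sums[label] = sums.get(label, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
        counts[label] = counts.get(label, 0) + 1
    # Average per class so that class size does not bias the score.
    return max(sums, key=lambda c: sums[c] / counts[c])

# Hypothetical two-class toy data: (feature vector, label)
train = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"),
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"),
]
print(pnn_classify((0.2, 0.3), train))
print(pnn_classify((5.1, 5.4), train))
```

In a hybrid of the kind described, the inputs to this step would presumably be derived from the trained map rather than raw patterns, but the abstract does not give those details.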

Relevance:

60.00%

Publisher:

Abstract:

This paper proposes a novel architecture for developing decision support systems. Unlike conventional decision support systems, the proposed architecture endeavors to reveal the decision-making process so that human subjectivity can be incorporated into a computerized system while preserving the system's capability to process information objectively. A number of techniques used in developing the decision support system are elaborated to make the decision-making process transparent. These include procedures for high-dimensional data visualization, pattern classification, prediction, and evolutionary computational search. An artificial data set is first employed to compare the proposed approach with other methods. A simulated handwritten data set and a real data set on liver disease diagnosis are then employed to evaluate the efficacy of the proposed approach. The results are analyzed and discussed. The potential of the proposed architecture as a useful decision support system is demonstrated.

Relevance:

60.00%

Publisher:

Relevance:

40.00%

Publisher:

Abstract:

Protein mass spectrometry (MS) pattern recognition has recently emerged as a new method for cancer diagnosis. Unfortunately, classification performance may degrade owing to the enormously high dimensionality of the data. This paper investigates the use of Random Projection (RP) for dimensionality reduction of protein MS data. The effectiveness of RP is analyzed and compared against Principal Component Analysis (PCA) using three classification algorithms, namely Support Vector Machine, Feed-forward Neural Networks and K-Nearest Neighbour. Three real-world cancer data sets are employed to evaluate the performance of RP and PCA. In these investigations, RP demonstrated classification performance better than, or at least comparable to, that of PCA, provided the dimensionality of the projection matrix is sufficiently large. This paper also explores the use of RP as a pre-processing step prior to PCA. The results show that performing RP prior to PCA significantly reduces computation time without sacrificing classification accuracy.
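A minimal sketch of random projection may help show why it is cheap compared with PCA: the projection matrix is simply drawn at random, with no fitting to the data. The abstract does not say which random distribution the paper used, so the Gaussian entries below are an assumption, as are the helper names and the dimensions `d` and `k`.

```python
import math
import random

def random_projection_matrix(d, k, rng):
    """Return a k x d matrix with i.i.d. N(0, 1/k) entries (assumed Gaussian RP)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]

def project(R, x):
    """Map a d-dimensional vector x to k dimensions by computing R x."""
    return [sum(r * v for r, v in zip(row, x)) for row in R]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

rng = random.Random(0)          # fixed seed so the run is repeatable
d, k = 1000, 200                # hypothetical original and reduced dimensions
x = [rng.random() for _ in range(d)]
y = [rng.random() for _ in range(d)]
R = random_projection_matrix(d, k, rng)

# Ratio of squared distances after and before projection; values near 1
# mean the projection roughly preserved the distance between x and y.
ratio = sq_dist(project(R, x), project(R, y)) / sq_dist(x, y)
print(f"squared-distance ratio after projection: {ratio:.3f}")
```

With this scaling, squared distances are preserved in expectation, and the spread around 1 shrinks as `k` grows, which is consistent with the abstract's remark that RP performs well once the projection dimensionality is sufficiently large.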

Relevance:

30.00%

Publisher:

Abstract:

Efficiency measurement is at the heart of most management accounting functions. Data envelopment analysis (DEA) is a linear programming technique used to measure the relative efficiency of organisational units, referred to in the DEA literature as decision making units (DMUs). Universities are complex organisations involving multiple inputs and outputs (Abbott & Doucouliagos, 2008). There is no agreement on how to identify and measure the inputs and outputs of higher education institutions (Avkiran, 2001). Hence, accurate efficiency measurement in such complex institutions needs rigorous research.

Prior DEA studies have investigated the application of the technique at the university (Avkiran, 2001; Abbott & Doucouliagos, 2003; Abbott & Doucouliagos, 2008) or department/school (Beasley, 1990; Sinuany-Stern, Mehrez & Barboy, 1994) level. The organisational unit that has control of, and hence responsibility for, inputs and outputs is the most appropriate decision making unit (DMU) for DEA to provide useful managerial information. In the current study, DEA has been applied at the faculty level for two reasons. First, in the case university, as with most other universities, inputs and outputs are more accurately identified with faculties than with departments/schools. Second, efficiency results at the university level are highly aggregated and do not provide detailed managerial information.

Prior DEA time series studies have used input and output cost and income data without adjusting for changes in the time value of money. This study examines the effects of adjusting financial data for changes in dollar values without proportional changes in the quantity of the inputs and the outputs. The study is carried out mainly from a management accounting perspective and focuses on the use of DEA efficiency information for managerial decision purposes. It is not intended to contribute to the theoretical development of the linear programming model. It takes the view that one does not need to be a mechanic to be a good car driver.

The results suggest that adjusting financial input and output data in time series analysis changes efficiency values, rankings, reference sets and projection amounts. The findings also suggest that the case university could have saved close to $10 million per year if all faculties had operated efficiently. However, it is also recognised that quantitative performance measures have their own limitations and should be used cautiously.
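For readers unfamiliar with DEA, the linear program behind an efficiency score can be sketched as follows. The abstract does not state which DEA variant the study used, so the input-oriented, constant-returns-to-scale (CCR) envelopment form below is an assumption chosen for illustration, and the symbols are notation introduced here rather than taken from the study.

```latex
\begin{aligned}
\theta^{*} = \min_{\theta,\,\lambda}\quad & \theta \\
\text{subject to}\quad
 & \sum_{j=1}^{n} \lambda_j x_{ij} \le \theta\, x_{i0}, && i = 1,\dots,m \\
 & \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{r0},          && r = 1,\dots,s \\
 & \lambda_j \ge 0,                                      && j = 1,\dots,n
\end{aligned}
```

Here x_{ij} is the amount of input i used by DMU j, y_{rj} is the amount of output r it produces, and DMU 0 is the unit being evaluated. A score of θ* < 1 indicates that the unit's inputs could be scaled down proportionally while still producing its outputs. The DMUs with λ_j > 0 form the reference set, and θ* x_{i0} is the projected (target) level of input i, which is where terms such as "reference set" and "projection amounts" come from.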

Relevance:

30.00%

Publisher:

Abstract:

The human immunodeficiency virus–acquired immune deficiency syndrome (HIV–AIDS) epidemic in Hong Kong has been under surveillance in the form of voluntary reporting since 1984. However, there has been little discussion or research on the reconstruction of the HIV incidence curve. This paper is the first to use a modified back-projection method to estimate the incidence of HIV in Hong Kong on the basis of the number of positive HIV tests only. The model proposed has several advantages over the original back-projection method based on AIDS data only. First, not all HIV-infected individuals will develop AIDS by the time of analysis, but some of them may undertake an HIV test; therefore, the HIV data set contains more information than the AIDS data set. Second, the HIV diagnosis curve usually has a smoother pattern than the AIDS diagnosis curve, as it is not affected by redefinition of AIDS. Third, the time to positive HIV diagnosis is unlikely to be affected by treatment effects, as it is unlikely that an individual receives medication before the diagnosis of HIV. Fourth, the induction period from HIV infection to the first HIV positive test is usually shorter than the incubation period which is from HIV infection to diagnosis of AIDS. With a shorter induction period, more information becomes available for estimating the HIV incidence curve. Finally, this method requires the number of positive HIV diagnoses only, which is readily available from HIV–AIDS surveillance systems in many countries. It is estimated that, in Hong Kong, the cumulative number of HIV infections during the period 1979–2000 is about 2600, whereas an estimate based only on AIDS data seems to give an underestimate.
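The back-projection idea can be written as a convolution. The discrete-time notation below is introduced here for illustration; the abstract does not give the paper's exact formulation or the details of its modification.

```latex
\mathbb{E}[Y_t] \;=\; \sum_{s=1}^{t} \lambda_s \, f_{t-s}, \qquad t = 1, \dots, T
```

In this sketch, Y_t is the number of positive HIV diagnoses in period t, λ_s is the expected number of new infections in period s, and f_d is the probability that the induction period from infection to first positive test lasts d periods. Back-projection works in reverse: given the observed Y_t and an assumed distribution f, it estimates the unobserved incidence curve λ_1, ..., λ_T.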

Relevance:

30.00%

Publisher:

Abstract:

Even when class label information is unknown, side information provides equivalence constraints between pairs of patterns, indicating whether each pair originates from the same class. Exploiting side information, we develop algorithms to preserve both the intra-class and inter-class local structures. This new type of locality preserving projection (LPP), called LPP with side information (LPPSI), preserves the data's local structure in the sense that close, similar training patterns are kept close, whilst close but dissimilar ones are separated. Our algorithms balance these conflicting requirements, and we further improve this technique using kernel methods. Experiments conducted on popular face databases demonstrate that the proposed algorithm significantly outperforms LPP. Further, we show that the performance of our algorithm with partial side information (that is, using only a small amount of pair-wise similarity/dissimilarity information during training) is comparable with that obtained using full side information. We conclude that exploiting side information by preserving both similar and dissimilar local structures of the data significantly improves performance.
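As background, the objective of standard LPP (the baseline that LPPSI extends) can be stated as follows. The LPPSI objective itself is not given in the abstract, so only the baseline is shown, in notation chosen here.

```latex
\min_{w}\ \sum_{i,j} \left( w^{\top} x_i - w^{\top} x_j \right)^{2} S_{ij}
\quad \text{subject to} \quad w^{\top} X D X^{\top} w = 1,
\qquad D_{ii} = \sum_{j} S_{ij}
```

S_{ij} is a similarity weight that is nonzero only for neighbouring patterns x_i and x_j, and X is the matrix whose columns are the training patterns. The minimiser is found from the generalized eigenproblem X L X^T w = λ X D X^T w with L = D − S. According to the abstract, LPPSI adds a competing requirement, separating close but dissimilar pairs, on top of this locality-preserving term.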

Relevance:

30.00%

Publisher:

Abstract:

Two Dimensional Locality Preserving Projection (2D-LPP) is a recent extension of LPP, a popular face recognition algorithm. It has been shown that 2D-LPP performs better than PCA, 2D-PCA and LPP. However, the computational cost of 2D-LPP is high. This paper proposes a novel algorithm called Ridge Regression for Two Dimensional Locality Preserving Projection (RR-2DLPP), which is an extension of 2D-LPP with the use of ridge regression. RR-2DLPP is comparable to 2D-LPP in performance whilst having a lower computational cost. The experimental results on three benchmark face data sets (the ORL, Yale and FERET databases) demonstrate the effectiveness and efficiency of RR-2DLPP compared with other face recognition algorithms such as PCA, LPP, SR, 2D-PCA and 2D-LPP.

Relevance:

30.00%

Publisher:

Abstract:

Understanding neural functions requires knowledge gained from analysing electrophysiological data. The process of assigning the spikes of a multichannel signal to clusters, called spike sorting, is one of the important problems in such analysis. Various automated spike sorting techniques exist, each with advantages and disadvantages regarding accuracy and computational cost. Developing spike sorting methods that are both highly accurate and computationally inexpensive therefore remains a challenge in biomedical engineering practice.

Relevance:

30.00%

Publisher:

Abstract:

In this paper, we tackle the incompleteness of user rating history in the context of collaborative filtering for Top-N recommendations. Previous research ignores the fact that two rating patterns exist in the user × item rating matrix and influence each other. More importantly, their interactive influence characterizes the development of each other, which can consequently be exploited to improve the modelling of rating patterns, especially when the user × item rating matrix is highly incomplete due to the well-known data sparsity issue. This paper proposes a Rating Pattern Subspace to iteratively re-optimize the missing values in each user’s rating history by modelling both the global and the personal rating patterns simultaneously. The basic idea is to project the user × item rating matrix onto a low-rank subspace to capture the global rating patterns. Then, the projection of each individual user on the subspace is further optimized according to his/her own rating history and the captured global rating patterns. Finally, the optimized user projections are used to improve the modelling of the global rating patterns. Based on this subspace, we propose the RapSVD-L algorithm for Top-N recommendations. In the experiments, the performance of the proposed method is compared with state-of-the-art Top-N recommendation methods on two real datasets under various data sparsity levels. The experimental results show that RapSVD-L outperforms the compared algorithms in terms of accuracy, not only on all-item recommendations but also on long-tail item recommendations.
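The general idea of filling missing ratings by repeated projection onto a low-rank subspace can be sketched as below. This is a generic rank-1 iterative imputation, not RapSVD-L: it omits the per-user optimisation step the abstract describes, and the function names, the rank, the iteration counts, and the toy matrix are all assumptions made for illustration.

```python
import math

def rank1_approx(M, iters=50):
    """Best rank-1 approximation of M, found by power iteration."""
    n_rows, n_cols = len(M), len(M[0])
    v = [1.0] * n_cols  # starting guess for the top right singular vector
    for _ in range(iters):
        u = [sum(M[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        w = [sum(M[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    u = [sum(M[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    return [[u[i] * v[j] for j in range(n_cols)] for i in range(n_rows)]

def impute(R, rounds=50):
    """Fill None entries of R by repeated projection onto a rank-1 subspace."""
    filled = []
    for row in R:
        known = [r for r in row if r is not None]
        mean = sum(known) / len(known)  # initial guess: the user's mean rating
        filled.append([mean if r is None else float(r) for r in row])
    for _ in range(rounds):
        low_rank = rank1_approx(filled)
        for i, row in enumerate(R):
            for j, r in enumerate(row):
                if r is None:  # only missing entries are overwritten
                    filled[i][j] = low_rank[i][j]
    return filled

# Hypothetical user x item matrix; None marks a missing rating.
ratings = [
    [5.0, 4.0, None],
    [2.5, 2.0, 1.0],
    [5.0, 4.0, 2.0],
]
completed = impute(ratings)
print(round(completed[0][2], 3))
```

The observed entries in this toy matrix are consistent with a rank-1 pattern, so the filled value moves from the initial row mean toward 2.0, the value that pattern implies. Real rating matrices need a higher rank and some form of regularisation.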

Relevance:

30.00%

Publisher:

Abstract:

We present a novel framework for multimedia data hiding using an over-complete dictionary, which brings compressive sensing to the application of data hiding. Unlike the conventional orthonormal full-space dictionary, the over-complete dictionary produces an underdetermined system with infinitely many transform results. We first discuss the minimum norm formulation (ℓ2-norm), which yields a closed-form solution, and the concept of watermark projection, so that higher embedding capacity and an additional privacy preserving feature can be obtained. Furthermore, we study the sparse formulation (ℓ0-norm) and illustrate that as long as the ℓ0-norm of the sparse representation of the host signal is less than the signal's dimension in the original domain, an informed sparse domain data hiding system can be established by modifying the coefficients of the atoms that have not participated in representing the host signal. A single support modification-based data hiding system is then proposed and analyzed as an example. Several potential research directions are discussed for further studies. More generally, apart from the ℓ2 and ℓ0-norm constraints, other conditions for reliable detection performance are worthy of future investigation.
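The closed-form solution mentioned for the minimum norm formulation is, in the standard setting, the pseudo-inverse solution of an underdetermined linear system. The symbols below are chosen here for illustration and assume a dictionary D with n rows, m > n columns, and full row rank; the paper's own notation and any additional constraints are not given in the abstract.

```latex
\min_{x}\ \lVert x \rVert_2 \quad \text{subject to} \quad D x = y
\qquad \Longrightarrow \qquad
x^{\star} = D^{\top} \left( D D^{\top} \right)^{-1} y
```

Because m > n, infinitely many coefficient vectors x satisfy D x = y, and the ℓ2 criterion picks the one with the smallest Euclidean norm. The sparse (ℓ0) formulation instead seeks the solution with the fewest nonzero coefficients, which has no comparable closed form.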