974 results for Pattern Mining
Abstract:
The development of techniques for scaling up classifiers so that they can be applied to problems with large training datasets is one of the objectives of data mining. Recently, AdaBoost has become popular in the machine learning community thanks to its promising results across a variety of applications. However, training AdaBoost on large datasets is a major problem, especially when the dimensionality of the data is very high. This paper discusses the effect of high dimensionality on the training process of AdaBoost. Two preprocessing options for reducing dimensionality, principal component analysis and random projection, are briefly examined. Random projection subject to a probabilistic length-preserving transformation is explored further as a computationally light preprocessing step. The experimental results demonstrate the effectiveness of the proposed training process for handling high-dimensional large datasets.
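The preprocessing pipeline the abstract describes can be sketched in a few lines. A minimal illustration, assuming scikit-learn is available; the synthetic dataset and all parameter values below are placeholders, not the paper's configuration:

```python
# A minimal sketch of random projection before AdaBoost, assuming
# scikit-learn; estimator settings are illustrative, not the paper's.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.random_projection import GaussianRandomProjection

# High-dimensional synthetic training data.
X, y = make_classification(n_samples=5000, n_features=2000,
                           n_informative=50, random_state=0)

# Johnson-Lindenstrauss-style random projection: a dense Gaussian matrix
# approximately preserves pairwise distances with high probability.
rp = GaussianRandomProjection(n_components=200, random_state=0)
X_low = rp.fit_transform(X)

# Train AdaBoost on the reduced representation.
clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_low, y)
print(clf.score(X_low, y))
```

Unlike PCA, the projection matrix is data-independent, which is what makes this step computationally light for large, high-dimensional training sets.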
Abstract:
The aim of this study is to investigate the blood flow pattern in carotid bifurcation with a high degree of luminal stenosis, combining in vivo magnetic resonance imaging (MRI) and computational fluid dynamics (CFD). A newly developed two-equation transitional model was employed to evaluate wall shear stress (WSS) distribution and pressure drop across the stenosis, which are closely related to plaque vulnerability. A patient with an 80% left carotid stenosis was imaged using high resolution MRI, from which a patient-specific geometry was reconstructed and flow boundary conditions were acquired for CFD simulation. A transitional model was implemented to investigate the flow velocity and WSS distribution in the patient-specific model. The peak time-averaged WSS value of approximately 73 Pa was predicted by the transitional flow model, and the regions of high WSS occurred at the throat of the stenosis. High oscillatory shear index values up to 0.50 were present in a helical flow pattern from the outer wall of the internal carotid artery immediately after the throat. This study shows the potential suitability of a transitional turbulent flow model in capturing the flow phenomena in severely stenosed carotid arteries using patient-specific MRI data and provides the basis for further investigation of the links between haemodynamic variables and plaque vulnerability. It may be useful in the future for risk assessment of patients with carotid disease.
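For reference, the oscillatory shear index (OSI) cited above is conventionally defined from the wall shear stress vector over the cardiac cycle; this is the standard definition, not a formula quoted from the paper:

$$\mathrm{OSI} = \frac{1}{2}\left(1 - \frac{\left|\int_0^T \vec{\tau}_w \, dt\right|}{\int_0^T \left|\vec{\tau}_w\right| dt}\right)$$

where $\vec{\tau}_w$ is the instantaneous wall shear stress and $T$ the cycle period. Purely unidirectional shear gives OSI = 0, while fully reversing shear approaches the maximum of 0.5, consistent with the peak values reported in the study.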
Abstract:
The success of automatic speaker recognition in laboratory environments suggests applications in forensic science for establishing the identity of individuals on the basis of features extracted from speech. A theoretical model for such a verification scheme for continuous, normally distributed features is developed. The three cases of using (a) a single feature, (b) multiple dependent measurements of a single feature, and (c) multiple independent features are explored. The number of independent features needed for a reliable personal identification is computed based on the theoretical model and an exploratory study of some speech features.
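A toy illustration of case (a), a single normally distributed feature, can be phrased as a likelihood-ratio test. The statistics and threshold below are made up for illustration and are not the paper's model:

```python
# A toy sketch of single-feature speaker verification via a likelihood
# ratio; all distribution parameters here are illustrative assumptions.
from scipy.stats import norm

speaker_mean, speaker_std = 120.0, 8.0         # claimed speaker's feature model
population_mean, population_std = 140.0, 25.0  # background population model

def verify(x: float, threshold: float = 1.0) -> bool:
    """Accept the identity claim if the likelihood ratio exceeds threshold."""
    lr = norm.pdf(x, speaker_mean, speaker_std) / \
         norm.pdf(x, population_mean, population_std)
    return lr > threshold

print(verify(118.0))  # a value near the speaker's mean -> likely accepted
```

With multiple independent features (case c), the per-feature likelihood ratios multiply, which is why adding features drives reliability up, the quantity the paper computes a requirement for.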
Abstract:
A simple sequential thinning algorithm for peeling off pixels along contours is described. An adaptive algorithm obtained by incorporating shape adaptivity into this sequential process is also given. The adaptive algorithm minimizes the distortions in the skeleton at right-angle and acute-angle corners. The asymmetry of the skeleton, a characteristic of sequential algorithms caused by the presence of T-corners in some even-thickness patterns, is eliminated. The performance (in terms of time requirements and shape preservation) is compared with that of a modern thinning algorithm.
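The abstract does not reproduce the algorithm itself. As a point of reference for what sequential contour peeling looks like, here is a compact sketch of one classic algorithm of this family (Zhang-Suen thinning), assuming a binary numpy image; it is not the paper's adaptive variant:

```python
# Classic Zhang-Suen sequential thinning: repeatedly peel contour pixels
# in two sub-iterations until the image stops changing.
import numpy as np

def zhang_suen_thin(img: np.ndarray) -> np.ndarray:
    """img: 2-D array of 0/1 pixels; returns a thinned copy."""
    img = img.copy().astype(np.uint8)
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            rows, cols = img.shape
            for r in range(1, rows - 1):
                for c in range(1, cols - 1):
                    if img[r, c] != 1:
                        continue
                    # 8-neighbourhood, clockwise from the pixel above.
                    p = [img[r-1, c], img[r-1, c+1], img[r, c+1],
                         img[r+1, c+1], img[r+1, c], img[r+1, c-1],
                         img[r, c-1], img[r-1, c-1]]
                    b = sum(p)  # number of object neighbours
                    # number of 0 -> 1 transitions around the ring
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                    if step == 0:
                        cond = p[0]*p[2]*p[4] == 0 and p[2]*p[4]*p[6] == 0
                    else:
                        cond = p[0]*p[2]*p[6] == 0 and p[0]*p[4]*p[6] == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_delete.append((r, c))
            for r, c in to_delete:  # delete only after the full scan
                img[r, c] = 0
                changed = True
    return img
```

The T-corner asymmetry the abstract mentions arises precisely because such passes delete pixels in a fixed scan order; the paper's adaptive scheme targets that artifact.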
Abstract:
Five species of commercial prawns, Penaeus plebejus, P. merguiensis, P. semisulcatus, P. esculentus and M. bennettae, were obtained from South-East and North Queensland, chilled soon after capture and then stored either whole or deheaded, on ice or in ice slurry, until spoilage. Total bacterial counts, total volatile nitrogen, K-values and total demerit scores were assessed at regular intervals. Shelf lives ranged from 10-17 days on ice and exceeded 20 days in ice slurry. The initial bacterial flora on prawns from shallower waters (4-15 m) was dominated by Gram-positives, with lag periods of around 7 days, whereas prawns from deeper waters (100 m) were dominated by Pseudomonas spp., with no lag period in bacterial growth. The dominant spoiler on ice was mainly Pseudomonas fragi, whereas the main spoiler in ice slurry was Shewanella putrefaciens. Bacterial interactions appear to play a major role in the patterns of spoilage in relation to capture environment and storage conditions.
Abstract:
An adaptive learning scheme, based on a fuzzy approximation to the gradient descent method for training a pattern classifier using unlabeled samples, is described. The objective function defined for the fuzzy ISODATA clustering procedure is used as the loss function for computing the gradient. Learning is based on simultaneous fuzzy decision-making and estimation. It uses conditional fuzzy measures on unlabeled samples. An exponential membership function is assumed for each class, and the parameters constituting these membership functions are estimated, using the gradient, in a recursive fashion. The induced possibility of occurrence of each class is useful for estimation and is computed using (1) the membership of the new sample in that class and (2) the previously computed average possibility of occurrence of the same class. An inductive entropy measure is defined in terms of the induced possibility distribution to measure the extent of learning. The method is illustrated with relevant examples.
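A schematic reading of the pieces named above can be written down directly. The exponential membership form, the running-average combination, and the entropy normalization below are our assumptions for illustration, not the paper's exact equations:

```python
# Schematic sketch of the described quantities; forms are assumptions.
import numpy as np

def membership(x, center, scale):
    """Exponential membership of sample x in a class with the given prototype."""
    return np.exp(-np.linalg.norm(x - center) ** 2 / scale)

def update_possibility(prev_avg, m_new, n):
    """Induced possibility as a running average: combine the new sample's
    membership with the previously computed average (n samples so far)."""
    return prev_avg + (m_new - prev_avg) / (n + 1)

def inductive_entropy(poss):
    """Entropy-like measure over the induced possibility distribution;
    low entropy indicates learning has concentrated on few classes."""
    p = np.asarray(poss, dtype=float)
    p = p / (p.sum() + 1e-12)
    return -np.sum(p * np.log(p + 1e-12))
```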
Abstract:
The minimum cost classifier, when general cost functions are associated with the tasks of feature measurement and classification, is formulated as a decision graph which does not reject class labels at intermediate stages. Noting its complexities, a heuristic procedure to simplify this scheme to a binary decision tree is presented. The optimization of the binary tree in this context is carried out using dynamic programming. This technique is applied to the voiced-unvoiced-silence classification in speech processing.
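The quantity such an optimization minimizes can be illustrated concretely: each internal node of the tree pays a feature-measurement cost, each leaf pays a classification cost, and the tree's expected total cost combines them along the branching probabilities. The tree encoding below is ours, for illustration only:

```python
# A sketch of the objective: expected total cost of a binary decision
# tree with measurement costs at internal nodes and classification
# costs at leaves. Dynamic programming would minimize this recursively.
def expected_cost(node):
    """node is ('leaf', classification_cost) or
    ('split', measurement_cost, p_left, left_child, right_child)."""
    if node[0] == 'leaf':
        return node[1]
    _, meas_cost, p_left, left, right = node
    return (meas_cost
            + p_left * expected_cost(left)
            + (1 - p_left) * expected_cost(right))

# Example: measure one feature (cost 2), then classify at the leaves.
tree = ('split', 2.0, 0.7, ('leaf', 0.5), ('leaf', 3.0))
print(expected_cost(tree))  # 2.0 + 0.7*0.5 + 0.3*3.0 = 3.25
```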
Abstract:
My thesis examined an alternative approach, referred to as the unitary taxation approach to the allocation of profit, which arises from the notion that, as a multinational group exists as a single economic entity, it should be taxed as one taxable unit. The plausibility of a unitary taxation regime achieving international acceptance and agreement is highly contestable due to its implementation issues and questions of economic and political feasibility. Using a case-study approach focusing on Freeport-McMoRan's and Rio Tinto's mining operations in Indonesia, this thesis compares both tax regimes against the criteria for a good tax system: equity, efficiency, neutrality and simplicity. It evaluates the key issues that arise when implementing a unitary taxation approach with formulary apportionment in the context of mining multinational firms in Indonesia.
Abstract:
In this paper we tackle the problem of efficient video event detection. We argue that linear detection functions should be preferred in this regard due to their scalability and efficiency during estimation and evaluation. A popular approach is to represent a sequence using a bag-of-words (BOW) representation due to: (i) its fixed dimensionality irrespective of the sequence length, and (ii) its ability to compactly model the statistics in the sequence. A drawback of the BOW representation, however, is the intrinsic destruction of temporal ordering information. In this paper we propose a new representation that leverages the uncertainty in relative temporal alignments between pairs of sequences while not destroying temporal ordering. Our representation, like BOW, is of fixed dimensionality, making it easily integrated with a linear detection function. Extensive experiments on the CK+, 6DMG, and UvA-NEMO databases show significant performance improvements across both isolated and continuous event detection tasks.
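The BOW baseline being improved upon is easy to make concrete. A minimal sketch, assuming scikit-learn; the random descriptors and codebook size are placeholders, not the paper's features:

```python
# A minimal BOW-over-frames sketch: quantize per-frame descriptors
# against a learned codebook and histogram the assignments.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
train_descriptors = rng.normal(size=(10000, 64))   # pooled per-frame features
codebook = KMeans(n_clusters=256, n_init=10, random_state=0).fit(train_descriptors)

def bow(sequence_descriptors):
    """Fixed-length histogram of codeword assignments; note that any
    temporal ordering of the frames is discarded here."""
    words = codebook.predict(sequence_descriptors)
    hist = np.bincount(words, minlength=256).astype(float)
    return hist / hist.sum()

# Sequences of different lengths map to vectors of the same dimension,
# which is what makes a linear detector applicable.
print(bow(rng.normal(size=(80, 64))).shape, bow(rng.normal(size=(200, 64))).shape)
```

The histogram's order-invariance is exactly the drawback the paper's representation addresses while keeping the fixed dimensionality.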
Abstract:
Purpose: The presence of neurophysiological abnormalities in dyslexia has been a conflicting issue. This study was performed to evaluate the role of sensory visual deficits in the pathogenesis of dyslexia. Methods: Pattern visual evoked potentials (PVEP) were recorded in 72 children, comprising 36 children with dyslexia and 36 children without dyslexia (controls) who were matched for age, sex and intelligence. Two check sizes of 15 and 60 min of arc were used, with temporal frequencies of 1.5 Hz for the transient method and 6 Hz for the steady-state method. Results: Mean latency and amplitude values for the 15 min arc and 60 min arc check sizes using the steady-state and transient methods showed no significant difference between the two study groups (P values: 0.139/0.481/0.356/0.062). Furthermore, no significant difference was observed between the two PVEP methods in dyslexic and normal children using the 60 min arc with high contrast (P values: 0.116, 0.402, 0.343 and 0.106). Conclusion: PVEP is a sensitive and valid method for detecting visual deficits in children with dyslexia. However, no significant difference was found between dyslexic and normal children using high-contrast stimuli.
Abstract:
The statistical minimum risk pattern recognition problem, when the classification costs are random variables of unknown statistics, is considered. Using medical diagnosis as a possible application, the problem of learning the optimal decision scheme is studied for a two-class, two-action case, as a first step. This reduces to the problem of learning the optimum threshold (for taking appropriate action) on the a posteriori probability of one class. A recursive procedure for updating an estimate of the threshold is proposed. The estimation procedure does not require the knowledge of actual class labels of the sample patterns in the design set. The adaptive scheme of using the present threshold estimate for taking action on the next sample is shown to converge, in probability, to the optimum. The results of a computer simulation study of three learning schemes demonstrate the theoretically predictable salient features of the adaptive scheme.
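To see what is being estimated, recall the standard two-class, two-action minimum-risk rule: take action 0 when the posterior of class 0 exceeds a threshold built from the action costs. The sketch below estimates that threshold from running means of observed costs; it is a simplified illustration of the target quantity, not the paper's recursion (which, notably, works without class labels, unlike this sketch):

```python
# Simplified sketch: estimate mean costs c[action, true_class] from
# noisy cost observations and recompute the Bayes minimum-risk threshold.
import numpy as np

counts = np.zeros((2, 2))
mean_cost = np.zeros((2, 2))  # running means of the random costs

def observe_cost(action, true_class, cost):
    counts[action, true_class] += 1
    mean_cost[action, true_class] += (
        (cost - mean_cost[action, true_class]) / counts[action, true_class])

def threshold():
    """Take action 0 iff P(class 0 | x) exceeds this value."""
    c = mean_cost
    return (c[0, 1] - c[1, 1]) / ((c[0, 1] - c[1, 1]) + (c[1, 0] - c[0, 0]))
```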
Abstract:
Existing process mining techniques provide summary views of overall process performance over a period of time, allowing analysts to identify bottlenecks and associated performance issues. However, these tools are not designed to help analysts understand how bottlenecks form and dissolve over time, nor how the formation and dissolution of bottlenecks, and associated fluctuations in demand and capacity, affect overall process performance. This paper presents an approach to analyzing the evolution of process performance via a notion of Staged Process Flow (SPF). An SPF abstracts a business process as a series of queues corresponding to stages. The paper defines a number of stage characteristics and visualizations that collectively allow process performance evolution to be analyzed from multiple perspectives. The approach has been implemented in the ProM process mining framework. The paper demonstrates the advantages of the SPF approach over state-of-the-art process performance mining tools using two publicly available real-life event logs.
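The queue-per-stage abstraction can be sketched directly from an event log. A minimal illustration, assuming pandas; the column names and toy log below are placeholders, not ProM's or the paper's schema:

```python
# Toy event log with per-stage entry/exit timestamps for each case.
import pandas as pd

log = pd.DataFrame({
    'case': [1, 1, 2, 2, 3],
    'stage': ['triage', 'review', 'triage', 'review', 'triage'],
    'entered': pd.to_datetime(['2024-01-01 09:00', '2024-01-01 11:00',
                               '2024-01-01 09:30', '2024-01-01 13:00',
                               '2024-01-01 10:00']),
    'exited': pd.to_datetime(['2024-01-01 11:00', '2024-01-01 15:00',
                              '2024-01-01 13:00', '2024-01-01 16:00',
                              '2024-01-01 14:00']),
})

def queue_length(stage, t):
    """Number of cases sitting in a stage at time t (the stage's queue)."""
    t = pd.Timestamp(t)
    in_stage = (log['stage'] == stage) & (log['entered'] <= t) & (log['exited'] > t)
    return int(in_stage.sum())

print(queue_length('triage', '2024-01-01 10:30'))  # -> 3
```

Sampling such queue lengths over time is what lets bottleneck formation and dissolution be observed as an evolution rather than a single summary number.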
Abstract:
The Queensland Great Barrier Reef line fishery in Australia is regulated via a range of input and output controls including minimum size limits, daily catch limits and commercial catch quotas. As a result of these measures a substantial proportion of the catch is released or discarded. The fate of these released fish is uncertain, but hook-related mortality can potentially be decreased by using hooks that reduce the rates of injury, bleeding and deep hooking. There is also the potential to reduce the capture of non-target species through gear selectivity. A total of 1053 individual fish representing five target species and three non-target species were caught using six hook types including three hook patterns (non-offset circle, J and offset circle), each in two sizes (small 4/0 or 5/0 and large 8/0). Catch rates for each of the hook patterns and sizes varied between species with no consistent results for target or non-target species. When data for all of the fish species were aggregated there was a trend for larger hooks, J hooks and offset circle hooks to cause a greater number of injuries. Using larger hooks was more likely to result in bleeding, although this trend was not statistically significant. Larger hooks were also more likely to foul-hook fish or hook fish in the eye. There was a reduction in the rates of injuries and bleeding for both target and non-target species when using the smaller hook sizes. For a number of species included in our study the incidence of deep hooking decreased when using non-offset circle hooks; however, these results were not consistent for all species. Our results highlight the variability in hook performance across a range of tropical demersal finfish species. The most obvious conservation benefits for both target and non-target species arise from using smaller sized hooks and non-offset circle hooks. Fishers should be encouraged to use these hook configurations to reduce the potential for post-release mortality of released fish.
Abstract:
We are addressing the problem of jointly using multiple noisy speech patterns for automatic speech recognition (ASR), given that they come from the same class. If the user utters a word K times, the ASR system should try to use the information content in all K patterns of the word simultaneously and improve its speech recognition accuracy compared to that of single-pattern-based speech recognition. To address this problem, we recently proposed a Multi Pattern Dynamic Time Warping (MPDTW) algorithm to align the K patterns by finding the least distortion path between them. A Constrained Multi Pattern Viterbi algorithm was used on this aligned path for isolated word recognition (IWR). In this paper, we explore the possibility of using only the MPDTW algorithm for IWR. We also study the properties of the MPDTW algorithm. We show that using only 2 noisy test patterns (10 percent burst noise at -5 dB SNR) reduces the noisy speech recognition error rate by 37.66 percent when compared to single pattern recognition using the Dynamic Time Warping algorithm.
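For readers unfamiliar with the building block MPDTW generalizes, here is a compact sketch of standard two-pattern dynamic time warping; the Euclidean local distance and symmetric step pattern are the common defaults, not necessarily the paper's exact choices:

```python
# Standard two-pattern DTW: minimum cumulative frame-to-frame distance
# over all monotonic alignments of sequences a and b.
import numpy as np

def dtw(a: np.ndarray, b: np.ndarray) -> float:
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])  # local frame distance
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

MPDTW extends this two-dimensional grid to a K-dimensional one so that all K repetitions of the word are aligned jointly rather than pairwise.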
Abstract:
Overprocessing waste occurs in a business process when effort is spent in a way that does not add value to the customer or to the business. Previous studies have identified a recurrent overprocessing pattern in business processes with so-called "knockout checks", meaning activities that classify a case into "accepted" or "rejected", such that if the case is accepted it proceeds forward, while if rejected, it is cancelled and all work performed in the case is considered unnecessary. Thus, when a knockout check rejects a case, the effort spent in other (previous) checks becomes overprocessing waste. Traditional process redesign methods propose to order knockout checks according to their mean effort and rejection rate. This paper presents a more fine-grained approach where knockout checks are ordered at runtime based on predictive machine learning models. Experiments on two real-life processes show that this predictive approach outperforms traditional methods while incurring minimal runtime overhead.
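The contrast between the two ordering policies can be sketched in a few lines. The check attributes and the sklearn-style model interface below are illustrative assumptions, not the paper's implementation:

```python
# Two ordering policies for knockout checks: cheapest-rejection-first
# using long-run averages vs. per-case re-ranking from predictions.
def static_order(checks):
    """Traditional redesign: run checks with the best rejection-per-effort
    ratio first, using historical averages."""
    return sorted(checks, key=lambda c: c['mean_effort'] / c['rejection_rate'])

def runtime_order(checks, case_features, models):
    """Predictive variant: re-rank per case using each check's predicted
    rejection probability for this specific case (sklearn-like models)."""
    def key(c):
        p_reject = models[c['name']].predict_proba([case_features])[0][1]
        return c['mean_effort'] / max(p_reject, 1e-9)
    return sorted(checks, key=key)
```

Intuitively, running the check most likely to reject this particular case first minimizes the expected effort wasted on checks whose outcome would never matter.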