861 results for Robust Learning Algorithm
Abstract:
In this work, we present an adaptive unequal loss protection (ULP) scheme for H.264/AVC video transmission over lossy networks. This scheme combines erasure coding, H.264/AVC error resilience techniques and importance measures in video coding. The unequal importance of the video packets is identified at the group of pictures (GOP) and H.264/AVC data partitioning levels. The presented method can adaptively assign unequal amounts of forward error correction (FEC) parity across the video packets according to the network conditions, such as the available network bandwidth, packet loss rate and average packet burst loss length. A near-optimal algorithm is developed to solve the FEC assignment problem. The simulation results show that our scheme can effectively utilize network resources such as bandwidth, while improving the quality of the video transmission. In addition, the proposed ULP strategy ensures graceful degradation of the received video quality as the packet loss rate increases. © 2010 IEEE.
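The idea of assigning unequal FEC parity by importance can be illustrated with a minimal sketch. This is not the near-optimal algorithm of the paper, which the abstract does not specify. It is a simple greedy allocation under assumptions made here for illustration: packet losses are independent with a fixed loss rate (burst losses are ignored), the erasure code recovers a block whenever at most f of its k + f packets are lost, and each class carries a hypothetical importance weight. The names `recovery_prob` and `greedy_fec` are illustrative.

```python
from math import comb


def recovery_prob(k, f, loss_rate):
    """Probability that at most f of the k + f packets are lost (independent losses)."""
    n = k + f
    return sum(
        comb(n, i) * loss_rate**i * (1 - loss_rate) ** (n - i)
        for i in range(f + 1)
    )


def greedy_fec(classes, budget, loss_rate):
    """Distribute `budget` parity packets over classes given as (importance, k) pairs.

    Each parity packet goes to the class whose importance-weighted recovery
    probability increases the most. Assumption: a greedy heuristic, not the
    paper's algorithm.
    """
    parity = [0] * len(classes)
    for _ in range(budget):
        gains = [
            weight * (recovery_prob(k, f + 1, loss_rate) - recovery_prob(k, f, loss_rate))
            for (weight, k), f in zip(classes, parity)
        ]
        best = max(range(len(classes)), key=gains.__getitem__)
        parity[best] += 1
    return parity


# Two classes of 4 source packets each; the first is assumed 10x more important.
allocation = greedy_fec([(10.0, 4), (1.0, 4)], budget=4, loss_rate=0.1)
```

In this toy setting the more important class receives most of the parity budget, which is the qualitative behaviour the abstract describes.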
Abstract:
Subspaces and manifolds are two powerful models for high dimensional signals. Subspaces model linear correlation and are a good fit to signals generated by physical systems, such as frontal images of human faces and multiple sources impinging at an antenna array. Manifolds model sources that are not linearly correlated, but where signals are determined by a small number of parameters. Examples are images of human faces under different poses or expressions, and handwritten digits with varying styles. However, there will always be some degree of model mismatch between the subspace or manifold model and the true statistics of the source. This dissertation exploits subspace and manifold models as prior information in various signal processing and machine learning tasks.
A near-low-rank Gaussian mixture model measures proximity to a union of linear or affine subspaces. This simple model can effectively capture the signal distribution when each class is near a subspace. This dissertation studies how the pairwise geometry between these subspaces affects classification performance. When model mismatch is vanishingly small, the probability of misclassification is determined by the product of the sines of the principal angles between subspaces. When the model mismatch is more significant, the probability of misclassification is determined by the sum of the squares of the sines of the principal angles. Reliability of classification is derived in terms of the distribution of signal energy across principal vectors. Larger principal angles lead to smaller classification error, motivating a linear transform that optimizes principal angles. This linear transformation, termed TRAIT, also preserves some specific features in each class, being complementary to a recently developed Low Rank Transform (LRT). Moreover, when the model mismatch is more significant, TRAIT shows superior performance compared to LRT.
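The two quantities named above, the product of the sines of the principal angles and the sum of their squares, can be computed with a short sketch. It uses the standard construction in which the cosines of the principal angles are the singular values of the product of orthonormal bases. It assumes NumPy is available and that each basis matrix has full column rank. The function name and the example subspaces are illustrative and do not come from the dissertation.

```python
import numpy as np


def principal_angles(A, B):
    """Principal angles (radians, ascending) between the column spaces of A and B."""
    Qa, _ = np.linalg.qr(A)  # orthonormal basis for span(A)
    Qb, _ = np.linalg.qr(B)  # orthonormal basis for span(B)
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cosines, 0.0, 1.0))


# Two 2-D subspaces of R^4, tilted by 30 and 60 degrees relative to each other.
a, b = np.pi / 6, np.pi / 3
A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
B = np.array([[np.cos(a), 0.0], [0.0, np.cos(b)], [np.sin(a), 0.0], [0.0, np.sin(b)]])

theta = principal_angles(A, B)
product_of_sines = np.prod(np.sin(theta))
sum_of_squared_sines = np.sum(np.sin(theta) ** 2)
```

For this example the angles are 30 and 60 degrees, so the product of sines is about 0.433 and the sum of squared sines is 1.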
The manifold model enforces a constraint on the freedom of data variation. Learning features that are robust to data variation is very important, especially when the size of the training set is small. A learning machine with a large number of parameters, e.g., a deep neural network, can describe a very complicated data distribution well. However, it is also more likely to be sensitive to small perturbations of the data, and to suffer from degraded performance when generalizing to unseen (test) data.
From the perspective of the complexity of function classes, such a learning machine has a huge capacity (complexity) and therefore tends to overfit. The manifold model provides a way of regularizing the learning machine so as to reduce the generalization error and thereby mitigate overfitting. Two different approaches to preventing overfitting are proposed, one from the perspective of data variation, the other from capacity/complexity control. In the first approach, the learning machine is encouraged to make decisions that vary smoothly for data points in local neighborhoods on the manifold. In the second approach, a graph adjacency matrix is derived for the manifold, and the learned features are encouraged to be aligned with the principal components of this adjacency matrix. Experimental results on benchmark datasets show a clear advantage of the proposed approaches when the training set is small.
Stochastic optimization makes it possible to track a slowly varying subspace underlying streaming data. By approximating local neighborhoods using affine subspaces, a slowly varying manifold can be efficiently tracked as well, even with corrupted and noisy data. The more local neighborhoods there are, the better the approximation, but the higher the computational complexity. A multiscale approximation scheme is proposed, where the local approximating subspaces are organized in a tree structure. Splitting and merging of the tree nodes then allows efficient control of the number of neighborhoods. Deviation (of each datum) from the learned model is estimated, yielding a series of statistics for anomaly detection. This framework extends the classical changepoint detection technique, which only works for one-dimensional signals. Simulations and experiments highlight the robustness and efficacy of the proposed approach in detecting an abrupt change in an otherwise slowly varying low-dimensional manifold.
Abstract:
This work explores the use of statistical methods in describing and estimating camera poses, as well as the information feedback loop between camera pose and object detection. Surging development in robotics and computer vision has pushed the need for algorithms that infer, understand, and utilize information about the position and orientation of the sensor platforms when observing and/or interacting with their environment.
The first contribution of this thesis is the development of a set of statistical tools for representing and estimating the uncertainty in object poses. A distribution for representing the joint uncertainty over multiple object positions and orientations is described, called the mirrored normal-Bingham distribution. This distribution generalizes both the normal distribution in Euclidean space, and the Bingham distribution on the unit hypersphere. It is shown to inherit many of the convenient properties of these special cases: it is the maximum-entropy distribution with fixed second moment, and there is a generalized Laplace approximation whose result is the mirrored normal-Bingham distribution. This distribution and approximation method are demonstrated by deriving the analytical approximation to the wrapped-normal distribution. Further, it is shown how these tools can be used to represent the uncertainty in the result of a bundle adjustment problem.
Another application of these methods is illustrated as part of a novel camera pose estimation algorithm based on object detections. The autocalibration task is formulated as a bundle adjustment problem using prior distributions over the 3D points to enforce the objects' structure and their relationship with the scene geometry. This framework is very flexible and enables the use of off-the-shelf computational tools to solve specialized autocalibration problems. Its performance is evaluated using a pedestrian detector to provide head and foot location observations, and it proves much faster and potentially more accurate than existing methods.
Finally, the information feedback loop between object detection and camera pose estimation is closed by utilizing camera pose information to improve object detection in scenarios with significant perspective warping. Methods are presented that allow the inverse perspective mapping traditionally applied to images to be applied instead to features computed from those images. For the special case of HOG-like features, which are used by many modern object detection systems, these methods are shown to provide substantial performance benefits over unadapted detectors while achieving real-time frame rates, orders of magnitude faster than comparable image warping methods.
The statistical tools and algorithms presented here are especially promising for mobile cameras, providing the ability to autocalibrate and adapt to the camera pose in real time. In addition, these methods have wide-ranging potential applications in diverse areas of computer vision, robotics, and imaging.
Abstract:
Knowledge-based radiation treatment is an emerging concept in radiotherapy. It mainly refers to techniques that can guide or automate treatment planning in the clinic by learning from prior knowledge. Different models have been developed to realize it, one of which was proposed by Yuan et al. at Duke for lung IMRT planning. This model can automatically determine both the beam configuration and the optimization objectives with non-coplanar beams based on patient-specific anatomical information. Although plans automatically generated by this model demonstrate equivalent or better dosimetric quality compared to clinically approved plans, its validity and generality are limited by the empirical assignment of a coefficient called the angle spread constraint, defined in the beam efficiency index used for beam ranking. To eliminate these limitations, a systematic study of this coefficient is needed to acquire evidence for its optimal value.
To this end, eleven lung cancer patients with complex tumor shapes, for whom non-coplanar beams were adopted in the clinically approved plans, were retrospectively studied in the framework of the automatic lung IMRT treatment algorithm. The primary and boost plans used in three patients were treated as different cases because of the different target sizes and shapes. A total of 14 lung cases were thus re-planned using the knowledge-based automatic lung IMRT planning algorithm, varying the angle spread constraint from 0 to 1 in increments of 0.2. A modified beam angle efficiency index used to navigate the beam selection was adopted. Great efforts were made to ensure that the quality of the plans associated with every angle spread constraint was as good as possible. Important dosimetric parameters for the PTV and OARs, quantitatively reflecting the plan quality, were extracted from the DVHs and analyzed as a function of the angle spread constraint for each case. These parameters were compared between clinical plans and model-based plans using two-sample Student's t-tests, and a regression analysis was performed on a composite index, built on the percentage errors between dosimetric parameters in the model-based plans and those in the clinical plans, as a function of the angle spread constraint.
Results show that model-based plans generally have equivalent or better quality than clinically approved plans, qualitatively and quantitatively. All dosimetric parameters except those for lungs in the automatically generated plans are statistically better than or comparable to those in the clinical plans. On average, a reduction of more than 15% in the conformity index and homogeneity index for the PTV and in V40 and V60 for the heart is observed, along with increases of 8% and 3% in V5 and V20 for the lungs, respectively. The intra-plan comparison among model-based plans demonstrates that plan quality does not change much for angle spread constraints larger than 0.4. Further examination of the variation curve of the composite index as a function of the angle spread constraint shows that 0.6 is the optimal value, yielding statistically the best achievable plans.
Abstract:
Uncertainty quantification (UQ) is both an old and a new concept. The current novelty lies in the interaction and synthesis of mathematical models, computer experiments, statistics, field/real experiments, and probability theory, with a particular emphasis on large-scale simulations by computer models. The challenges come not only from the complexity of the scientific questions but also from the size of the information. The focus of this thesis is to provide statistical models that scale to the massive data produced in computer experiments and real experiments, through fast and robust statistical inference.
Chapter 2 provides a practical approach for simultaneously emulating/approximating a massive number of functions, with an application to hazard quantification for the Soufrière Hills volcano on the island of Montserrat. Chapter 3 discusses another problem with massive data, in which the number of observations of a function is large. An exact algorithm that is linear in time is developed for the problem of interpolating methylation levels. Chapters 4 and 5 both concern robust inference for the models. Chapter 4 provides new robustness criteria for parameter estimation, and several ways of inference are shown to satisfy these criteria. Chapter 5 develops a new prior that satisfies some additional criteria and is thus proposed for use in practice.
Abstract:
The effectiveness of an optimization algorithm can be reduced to its ability to navigate an objective function’s topology. Hybrid optimization algorithms combine various optimization algorithms using a single meta-heuristic so that the hybrid algorithm is more robust, computationally efficient, and/or accurate than the individual algorithms it is made of. This thesis proposes a novel meta-heuristic that uses search vectors to select the constituent algorithm that is appropriate for a given objective function. The hybrid is shown to perform competitively against several existing hybrid and non-hybrid optimization algorithms over a set of three hundred test cases. This thesis also proposes a general framework for evaluating the effectiveness of hybrid optimization algorithms. Finally, this thesis presents an improved Method of Characteristics Code with novel boundary conditions, which better characterizes pipelines than previous codes. This code is coupled with the hybrid optimization algorithm in order to optimize the operation of real-world piston pumps.
Abstract:
In this work, we propose a biologically inspired appearance model for robust visual tracking. Motivated in part by the success of the hierarchical organization of the primary visual cortex (area V1), we establish an architecture consisting of five layers: whitening, rectification, normalization, coding and pooling. The first three layers stem from the models developed for object recognition. In this paper, our attention focuses on the coding and pooling layers. In particular, we use a discriminative sparse coding method in the coding layer along with spatial pyramid representation in the pooling layer, which makes it easier to distinguish the target to be tracked from its background in the presence of appearance variations. An extensive experimental study shows that the proposed method has higher tracking accuracy than several state-of-the-art trackers.
Abstract:
Background and aims: Machine learning techniques for the text mining of cancer-related clinical documents have not been sufficiently explored. Here some techniques are presented for the pre-processing of free-text breast cancer pathology reports, with the aim of facilitating the extraction of information relevant to cancer staging.
Materials and methods: The first technique was implemented using the freely available software RapidMiner to classify the reports according to their general layout: ‘semi-structured’ and ‘unstructured’. The second technique was developed using the open source language engineering framework GATE and aimed at the prediction of chunks of the report text containing information pertaining to the cancer morphology, the tumour size, its hormone receptor status and the number of positive nodes. The classifiers were trained and tested respectively on sets of 635 and 163 manually classified or annotated reports, from the Northern Ireland Cancer Registry.
Results: The best result of 99.4% accuracy – which included only one semi-structured report predicted as unstructured – was produced by the layout classifier with the k-nearest neighbour algorithm, using the binary term occurrence word vector type with stopword filter and pruning. For chunk recognition, the best results were found using the PAUM algorithm with the same parameters for all cases, except for the prediction of chunks containing cancer morphology. For semi-structured reports the performance ranged from 0.97 to 0.94 and from 0.92 to 0.83 in precision and recall, while for unstructured reports performance ranged from 0.91 to 0.64 and from 0.68 to 0.41 in precision and recall. Poor results were found when the classifier was trained on semi-structured reports but tested on unstructured ones.
Conclusions: These results show that it is possible and beneficial to predict the layout of reports and that the accuracy of prediction of which segments of a report may contain certain information is sensitive to the report layout and the type of information sought.
Abstract:
The purpose of this paper is to examine the promising contributions of the Concept Maps for Learning (CMfL) website to assessment for learning practices. The CMfL website generates concept maps from the degree of relatedness of concept pairs through the Pathfinder Scaling Algorithm. The website also conforms to the established principles of effective assessment for learning, for it is capable of automatically assessing students' higher order knowledge, simultaneously identifying strengths and weaknesses, immediately providing useful feedback and being user-friendly. According to the default assessment plan, students first create concept maps on a particular subject and are then given individualized visual feedback followed by associated instructional material (e.g., videos, website links, examples, problems, etc.) based on a comparison of their concept map and a subject matter expert's map. After students have studied the feedback and instructional material, teachers can monitor their progress by having them create revised concept maps. Therefore, we claim that the CMfL website may reduce the workload of teachers as well as provide immediate and delayed feedback on the weaknesses of students in different forms, such as graphical and multimedia. In a follow-up study, we will examine whether these promising contributions to assessment for learning hold across a variety of subjects.
Abstract:
In this paper, we describe how the pathfinder algorithm converts relatedness ratings of concept pairs to concept maps; we also present how this algorithm has been used to develop the Concept Maps for Learning website (www.conceptmapsforlearning.com) based on the principles of effective formative assessment. Pathfinder networks, one type of network representation tool, are claimed to be more helpful to students in memorizing and recalling the relations between concepts than spatial representation tools (such as Multi-Dimensional Scaling). Therefore, pathfinder networks have been used in various studies on knowledge structures, including identifying students’ misconceptions. To accomplish this, each student’s knowledge map and the expert knowledge map are compared via the pathfinder software, and the differences between these maps are highlighted. However, once misconceptions are identified, the pathfinder software provides no feedback on them. To overcome this weakness, we have been developing a mobile-based concept mapping tool providing visual, textual and remedial feedback (e.g., videos, website links and applets) on the concept relations. This information is then placed on the expert concept map, but not on the student’s concept map. Additionally, students are asked to note what they understand from the given feedback, and are given the opportunity to revise their knowledge maps after receiving various types of feedback.
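How relatedness ratings become a sparse concept map can be sketched with the most common Pathfinder variant, PFNET(r = ∞, q = n − 1), in which a link is kept only if no alternative path has a smaller largest link. The abstract does not state which parameters the website uses, so this variant is an assumption, as are the conversion from ratings to distances (10 minus the rating), the concept names and the rating values.

```python
def pathfinder_network(dist):
    """Return the edges (i, j), i < j, of PFNET(r=inf, q=n-1) for a symmetric distance matrix."""
    n = len(dist)
    # minimax[i][j]: smallest achievable "largest link" over all paths from i to j
    minimax = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(minimax[i][k], minimax[k][j])
                if via_k < minimax[i][j]:
                    minimax[i][j] = via_k
    # keep a direct link only if no indirect path beats it
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if dist[i][j] <= minimax[i][j]
    }


concepts = ["force", "mass", "acceleration", "energy"]  # hypothetical example
ratings = [  # hypothetical relatedness ratings on a 1-9 scale
    [9, 8, 9, 5],
    [8, 9, 7, 6],
    [9, 7, 9, 3],
    [5, 6, 3, 9],
]
# Assumed conversion: higher relatedness means smaller distance.
dist = [[0 if i == j else 10 - ratings[i][j] for j in range(4)] for i in range(4)]

edges = pathfinder_network(dist)
links = sorted((concepts[i], concepts[j]) for i, j in edges)
```

Of the six rated pairs, only three links survive here (force–mass, force–acceleration, mass–energy); the weaker direct links are dropped because a path through stronger links connects the same concepts.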
Abstract:
Learning Bayesian networks with bounded tree-width has attracted much attention recently, because low tree-width allows exact inference to be performed efficiently. Some existing methods \cite{korhonen2exact, nie2014advances} tackle the problem by using $k$-trees to learn the optimal Bayesian network with tree-width up to $k$. Finding the best $k$-tree, however, is computationally intractable. In this paper, we propose a sampling method to efficiently find representative $k$-trees by introducing an informative score function to characterize the quality of a $k$-tree. To further improve the quality of the $k$-trees, we propose a probabilistic hill climbing approach that locally refines the sampled $k$-trees. The proposed algorithm can efficiently learn a quality Bayesian network with tree-width at most $k$. Experimental results demonstrate that our approach is more computationally efficient than the exact methods with comparable accuracy, and outperforms most existing approximate methods.
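For readers unfamiliar with $k$-trees, the following sketch builds a random $k$-tree from its recursive definition: start with a clique on $k+1$ nodes, then attach each new node to all members of an existing $k$-clique. This is not the sampling method of the paper, which relies on an informative score function not described in the abstract, and the sketch makes no claim of uniform sampling. The function name and parameters are illustrative.

```python
import random
from itertools import combinations


def random_k_tree(n, k, seed=None):
    """Return the edge set of a random k-tree on nodes 0..n-1 (requires n >= k + 1)."""
    if n < k + 1:
        raise ValueError("a k-tree needs at least k + 1 nodes")
    rng = random.Random(seed)
    # Start from a (k+1)-clique on nodes 0..k.
    edges = set(combinations(range(k + 1), 2))
    k_cliques = [frozenset(c) for c in combinations(range(k + 1), k)]
    for v in range(k + 1, n):
        base = rng.choice(k_cliques)  # attach the new node v to an existing k-clique
        for u in base:
            edges.add((u, v))  # u < v because v is the newest node
        # Every (k-1)-subset of the base together with v forms a new k-clique.
        for subset in combinations(sorted(base), k - 1):
            k_cliques.append(frozenset(subset) | {v})
    return edges


tree_edges = random_k_tree(n=8, k=2, seed=0)
```

Any graph built this way has tree-width exactly $k$ and $k(k+1)/2 + (n-k-1)k$ edges, so a Bayesian network whose moralized graph is a subgraph of it has tree-width at most $k$.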
Abstract:
We present a method for learning treewidth-bounded Bayesian networks from data sets containing thousands of variables. Bounding the treewidth of a Bayesian network greatly reduces the complexity of inferences. Yet, being a global property of the graph, it considerably increases the difficulty of the learning process. Our novel algorithm accomplishes this task, scaling both to large domains and to large treewidths. Our novel approach consistently outperforms the state of the art on experiments with up to thousands of variables.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Strategic Management Simulation as a Blended Learning Dimension: Campus Based Students’ Perspectives
Abstract:
Although business simulations are widely used in management education, there is no consensus about how to optimise their application. Our research explores the use of business simulations as a dimension of a blended learning pedagogic approach for undergraduate business education. Accepting that few best-practice prescriptive models for the design and implementation of simulations in this context have been presented, and that there is little empirical evidence for the claims made by proponents of such models, we address the lacuna by considering business student perspectives on the use of simulations. We then intersect available data with espoused positive outcomes made by the authors of a prescriptive model. We find the model to be essentially robust and offer evidence to support this position. In so doing we provide one of the few empirically based studies to support claims made by proponents of simulations in business education. The research should prove valuable for those with an academic interest in the use of simulations, either as a blended learning dimension or as a stand-alone business education activity. Further, the findings contribute to the academic debate surrounding the use and efficacy of simulation-based training [SBT] within business and management education.
Abstract:
Thesis (Master's)--University of Washington, 2016-08