980 resultados para problem complexity
Resumo:
This thesis explores the problem of mobile robot navigation in dense human crowds. We begin by considering a fundamental impediment to classical motion planning algorithms called the freezing robot problem: once the environment surpasses a certain level of complexity, the planner decides that all forward paths are unsafe, and the robot freezes in place (or performs unnecessary maneuvers) to avoid collisions. Since a feasible path typically exists, this behavior is suboptimal. Existing approaches have focused on reducing predictive uncertainty by employing higher fidelity individual dynamics models or heuristically limiting the individual predictive covariance to prevent overcautious navigation. We demonstrate that both the individual prediction and the individual predictive uncertainty have little to do with this undesirable navigation behavior. Additionally, we provide evidence that dynamic agents are able to navigate in dense crowds by engaging in joint collision avoidance, cooperatively making room to create feasible trajectories. We accordingly develop interacting Gaussian processes, a prediction density that captures cooperative collision avoidance, and a "multiple goal" extension that models the goal driven nature of human decision making. Navigation naturally emerges as a statistic of this distribution.
Most importantly, we empirically validate our models in the Chandler dining hall at Caltech during peak hours, and in the process, carry out the first extensive quantitative study of robot navigation in dense human crowds (collecting data on 488 runs). The multiple goal interacting Gaussian processes algorithm performs comparably with human teleoperators in crowd densities nearing 1 person/m2, while a state of the art noncooperative planner exhibits unsafe behavior more than 3 times as often as the multiple goal extension, and twice as often as the basic interacting Gaussian process approach. Furthermore, a reactive planner based on the widely used dynamic window approach proves insufficient for crowd densities above 0.55 people/m2. We also show that our noncooperative planner or our reactive planner capture the salient characteristics of nearly any dynamic navigation algorithm. For inclusive validation purposes, we show that either our non-interacting planner or our reactive planner captures the salient characteristics of nearly any existing dynamic navigation algorithm. Based on these experimental results and theoretical observations, we conclude that a cooperation model is critical for safe and efficient robot navigation in dense human crowds.
Finally, we produce a large database of ground truth pedestrian crowd data. We make this ground truth database publicly available for further scientific study of crowd prediction models, learning from demonstration algorithms, and human robot interaction models in general.
Resumo:
Signal processing techniques play important roles in the design of digital communication systems. These include information manipulation, transmitter signal processing, channel estimation, channel equalization and receiver signal processing. By interacting with communication theory and system implementing technologies, signal processing specialists develop efficient schemes for various communication problems by wisely exploiting various mathematical tools such as analysis, probability theory, matrix theory, optimization theory, and many others. In recent years, researchers realized that multiple-input multiple-output (MIMO) channel models are applicable to a wide range of different physical communications channels. Using the elegant matrix-vector notations, many MIMO transceiver (including the precoder and equalizer) design problems can be solved by matrix and optimization theory. Furthermore, the researchers showed that the majorization theory and matrix decompositions, such as singular value decomposition (SVD), geometric mean decomposition (GMD) and generalized triangular decomposition (GTD), provide unified frameworks for solving many of the point-to-point MIMO transceiver design problems.
In this thesis, we consider the transceiver design problems for linear time invariant (LTI) flat MIMO channels, linear time-varying narrowband MIMO channels, flat MIMO broadcast channels, and doubly selective scalar channels. Additionally, the channel estimation problem is also considered. The main contributions of this dissertation are the development of new matrix decompositions, and the uses of the matrix decompositions and majorization theory toward the practical transmit-receive scheme designs for transceiver optimization problems. Elegant solutions are obtained, novel transceiver structures are developed, ingenious algorithms are proposed, and performance analyses are derived.
The first part of the thesis focuses on transceiver design with LTI flat MIMO channels. We propose a novel matrix decomposition which decomposes a complex matrix as a product of several sets of semi-unitary matrices and upper triangular matrices in an iterative manner. The complexity of the new decomposition, generalized geometric mean decomposition (GGMD), is always less than or equal to that of geometric mean decomposition (GMD). The optimal GGMD parameters which yield the minimal complexity are derived. Based on the channel state information (CSI) at both the transmitter (CSIT) and receiver (CSIR), GGMD is used to design a butterfly structured decision feedback equalizer (DFE) MIMO transceiver which achieves the minimum average mean square error (MSE) under the total transmit power constraint. A novel iterative receiving detection algorithm for the specific receiver is also proposed. For the application to cyclic prefix (CP) systems in which the SVD of the equivalent channel matrix can be easily computed, the proposed GGMD transceiver has K/log_2(K) times complexity advantage over the GMD transceiver, where K is the number of data symbols per data block and is a power of 2. The performance analysis shows that the GGMD DFE transceiver can convert a MIMO channel into a set of parallel subchannels with the same bias and signal to interference plus noise ratios (SINRs). Hence, the average bit rate error (BER) is automatically minimized without the need for bit allocation. Moreover, the proposed transceiver can achieve the channel capacity simply by applying independent scalar Gaussian codes of the same rate at subchannels.
In the second part of the thesis, we focus on MIMO transceiver design for slowly time-varying MIMO channels with zero-forcing or MMSE criterion. Even though the GGMD/GMD DFE transceivers work for slowly time-varying MIMO channels by exploiting the instantaneous CSI at both ends, their performance is by no means optimal since the temporal diversity of the time-varying channels is not exploited. Based on the GTD, we develop space-time GTD (ST-GTD) for the decomposition of linear time-varying flat MIMO channels. Under the assumption that CSIT, CSIR and channel prediction are available, by using the proposed ST-GTD, we develop space-time geometric mean decomposition (ST-GMD) DFE transceivers under the zero-forcing or MMSE criterion. Under perfect channel prediction, the new system minimizes both the average MSE at the detector in each space-time (ST) block (which consists of several coherence blocks), and the average per ST-block BER in the moderate high SNR region. Moreover, the ST-GMD DFE transceiver designed under an MMSE criterion maximizes Gaussian mutual information over the equivalent channel seen by each ST-block. In general, the newly proposed transceivers perform better than the GGMD-based systems since the super-imposed temporal precoder is able to exploit the temporal diversity of time-varying channels. For practical applications, a novel ST-GTD based system which does not require channel prediction but shares the same asymptotic BER performance with the ST-GMD DFE transceiver is also proposed.
The third part of the thesis considers two quality of service (QoS) transceiver design problems for flat MIMO broadcast channels. The first one is the power minimization problem (min-power) with a total bitrate constraint and per-stream BER constraints. The second problem is the rate maximization problem (max-rate) with a total transmit power constraint and per-stream BER constraints. Exploiting a particular class of joint triangularization (JT), we are able to jointly optimize the bit allocation and the broadcast DFE transceiver for the min-power and max-rate problems. The resulting optimal designs are called the minimum power JT broadcast DFE transceiver (MPJT) and maximum rate JT broadcast DFE transceiver (MRJT), respectively. In addition to the optimal designs, two suboptimal designs based on QR decomposition are proposed. They are realizable for arbitrary number of users.
Finally, we investigate the design of a discrete Fourier transform (DFT) modulated filterbank transceiver (DFT-FBT) with LTV scalar channels. For both cases with known LTV channels and unknown wide sense stationary uncorrelated scattering (WSSUS) statistical channels, we show how to optimize the transmitting and receiving prototypes of a DFT-FBT such that the SINR at the receiver is maximized. Also, a novel pilot-aided subspace channel estimation algorithm is proposed for the orthogonal frequency division multiplexing (OFDM) systems with quasi-stationary multi-path Rayleigh fading channels. Using the concept of a difference co-array, the new technique can construct M^2 co-pilots from M physical pilot tones with alternating pilot placement. Subspace methods, such as MUSIC and ESPRIT, can be used to estimate the multipath delays and the number of identifiable paths is up to O(M^2), theoretically. With the delay information, a MMSE estimator for frequency response is derived. It is shown through simulations that the proposed method outperforms the conventional subspace channel estimator when the number of multipaths is greater than or equal to the number of physical pilots minus one.
Resumo:
Storage systems are widely used and have played a crucial rule in both consumer and industrial products, for example, personal computers, data centers, and embedded systems. However, such system suffers from issues of cost, restricted-lifetime, and reliability with the emergence of new systems and devices, such as distributed storage and flash memory, respectively. Information theory, on the other hand, provides fundamental bounds and solutions to fully utilize resources such as data density, information I/O and network bandwidth. This thesis bridges these two topics, and proposes to solve challenges in data storage using a variety of coding techniques, so that storage becomes faster, more affordable, and more reliable.
We consider the system level and study the integration of RAID schemes and distributed storage. Erasure-correcting codes are the basis of the ubiquitous RAID schemes for storage systems, where disks correspond to symbols in the code and are located in a (distributed) network. Specifically, RAID schemes are based on MDS (maximum distance separable) array codes that enable optimal storage and efficient encoding and decoding algorithms. With r redundancy symbols an MDS code can sustain r erasures. For example, consider an MDS code that can correct two erasures. It is clear that when two symbols are erased, one needs to access and transmit all the remaining information to rebuild the erasures. However, an interesting and practical question is: What is the smallest fraction of information that one needs to access and transmit in order to correct a single erasure? In Part I we will show that the lower bound of 1/2 is achievable and that the result can be generalized to codes with arbitrary number of parities and optimal rebuilding.
We consider the device level and study coding and modulation techniques for emerging non-volatile memories such as flash memory. In particular, rank modulation is a novel data representation scheme proposed by Jiang et al. for multi-level flash memory cells, in which a set of n cells stores information in the permutation induced by the different charge levels of the individual cells. It eliminates the need for discrete cell levels, as well as overshoot errors, when programming cells. In order to decrease the decoding complexity, we propose two variations of this scheme in Part II: bounded rank modulation where only small sliding windows of cells are sorted to generated permutations, and partial rank modulation where only part of the n cells are used to represent data. We study limits on the capacity of bounded rank modulation and propose encoding and decoding algorithms. We show that overlaps between windows will increase capacity. We present Gray codes spanning all possible partial-rank states and using only ``push-to-the-top'' operations. These Gray codes turn out to solve an open combinatorial problem called universal cycle, which is a sequence of integers generating all possible partial permutations.
Resumo:
Home to hundreds of millions of souls and land of excessiveness, the Himalaya is also the locus of a unique seismicity whose scope and peculiarities still remain to this day somewhat mysterious. Having claimed the lives of kings, or turned ancient timeworn cities into heaps of rubbles and ruins, earthquakes eerily inhabit Nepalese folk tales with the fatalistic message that nothing lasts forever. From a scientific point of view as much as from a human perspective, solving the mysteries of Himalayan seismicity thus represents a challenge of prime importance. Documenting geodetic strain across the Nepal Himalaya with various GPS and leveling data, we show that unlike other subduction zones that exhibit a heterogeneous and patchy coupling pattern along strike, the last hundred kilometers of the Main Himalayan Thrust fault, or MHT, appear to be uniformly locked, devoid of any of the “creeping barriers” that traditionally ward off the propagation of large events. The approximately 20 mm/yr of reckoned convergence across the Himalaya matching previously established estimates of the secular deformation at the front of the arc, the slip accumulated at depth has to somehow elastically propagate all the way to the surface at some point. And yet, neither large events from the past nor currently recorded microseismicity nearly compensate for the massive moment deficit that quietly builds up under the giant mountains. Along with this large unbalanced moment deficit, the uncommonly homogeneous coupling pattern on the MHT raises the question of whether or not the locked portion of the MHT can rupture all at once in a giant earthquake. Univocally answering this question appears contingent on the still elusive estimate of the magnitude of the largest possible earthquake in the Himalaya, and requires tight constraints on local fault properties. What makes the Himalaya enigmatic also makes it the potential source of an incredible wealth of information, and we exploit some of the oddities of Himalayan seismicity in an effort to improve the understanding of earthquake physics and cipher out the properties of the MHT. Thanks to the Himalaya, the Indo-Gangetic plain is deluged each year under a tremendous amount of water during the annual summer monsoon that collects and bears down on the Indian plate enough to pull it away from the Eurasian plate slightly, temporarily relieving a small portion of the stress mounting on the MHT. As the rainwater evaporates in the dry winter season, the plate rebounds and tension is increased back on the fault. Interestingly, the mild waggle of stress induced by the monsoon rains is about the same size as that from solid-Earth tides which gently tug at the planets solid layers, but whereas changes in earthquake frequency correspond with the annually occurring monsoon, there is no such correlation with Earth tides, which oscillate back-and-forth twice a day. We therefore investigate the general response of the creeping and seismogenic parts of MHT to periodic stresses in order to link these observations to physical parameters. First, the response of the creeping part of the MHT is analyzed with a simple spring-and-slider system bearing rate-strengthening rheology, and we show that at the transition with the locked zone, where the friction becomes near velocity neutral, the response of the slip rate may be amplified at some periods, which values are analytically related to the physical parameters of the problem. Such predictions therefore hold the potential of constraining fault properties on the MHT, but still await observational counterparts to be applied, as nothing indicates that the variations of seismicity rate on the locked part of the MHT are the direct expressions of variations of the slip rate on its creeping part, and no variations of the slip rate have been singled out from the GPS measurements to this day. When shifting to the locked seismogenic part of the MHT, spring-and-slider models with rate-weakening rheology are insufficient to explain the contrasted responses of the seismicity to the periodic loads that tides and monsoon both place on the MHT. Instead, we resort to numerical simulations using the Boundary Integral CYCLes of Earthquakes algorithm and examine the response of a 2D finite fault embedded with a rate-weakening patch to harmonic stress perturbations of various periods. We show that such simulations are able to reproduce results consistent with a gradual amplification of sensitivity as the perturbing period get larger, up to a critical period corresponding to the characteristic time of evolution of the seismicity in response to a step-like perturbation of stress. This increase of sensitivity was not reproduced by simple 1D-spring-slider systems, probably because of the complexity of the nucleation process, reproduced only by 2D-fault models. When the nucleation zone is close to its critical unstable size, its growth becomes highly sensitive to any external perturbations and the timings of produced events may therefore find themselves highly affected. A fully analytical framework has yet to be developed and further work is needed to fully describe the behavior of the fault in terms of physical parameters, which will likely provide the keys to deduce constitutive properties of the MHT from seismological observations.
Resumo:
In a probabilistic assessment of the performance of structures subjected to uncertain environmental loads such as earthquakes, an important problem is to determine the probability that the structural response exceeds some specified limits within a given duration of interest. This problem is known as the first excursion problem, and it has been a challenging problem in the theory of stochastic dynamics and reliability analysis. In spite of the enormous amount of attention the problem has received, there is no procedure available for its general solution, especially for engineering problems of interest where the complexity of the system is large and the failure probability is small.
The application of simulation methods to solving the first excursion problem is investigated in this dissertation, with the objective of assessing the probabilistic performance of structures subjected to uncertain earthquake excitations modeled by stochastic processes. From a simulation perspective, the major difficulty in the first excursion problem comes from the large number of uncertain parameters often encountered in the stochastic description of the excitation. Existing simulation tools are examined, with special regard to their applicability in problems with a large number of uncertain parameters. Two efficient simulation methods are developed to solve the first excursion problem. The first method is developed specifically for linear dynamical systems, and it is found to be extremely efficient compared to existing techniques. The second method is more robust to the type of problem, and it is applicable to general dynamical systems. It is efficient for estimating small failure probabilities because the computational effort grows at a much slower rate with decreasing failure probability than standard Monte Carlo simulation. The simulation methods are applied to assess the probabilistic performance of structures subjected to uncertain earthquake excitation. Failure analysis is also carried out using the samples generated during simulation, which provide insight into the probable scenarios that will occur given that a structure fails.
Resumo:
In the quest for a descriptive theory of decision-making, the rational actor model in economics imposes rather unrealistic expectations and abilities on human decision makers. The further we move from idealized scenarios, such as perfectly competitive markets, and ambitiously extend the reach of the theory to describe everyday decision making situations, the less sense these assumptions make. Behavioural economics has instead proposed models based on assumptions that are more psychologically realistic, with the aim of gaining more precision and descriptive power. Increased psychological realism, however, comes at the cost of a greater number of parameters and model complexity. Now there are a plethora of models, based on different assumptions, applicable in differing contextual settings, and selecting the right model to use tends to be an ad-hoc process. In this thesis, we develop optimal experimental design methods and evaluate different behavioral theories against evidence from lab and field experiments.
We look at evidence from controlled laboratory experiments. Subjects are presented with choices between monetary gambles or lotteries. Different decision-making theories evaluate the choices differently and would make distinct predictions about the subjects' choices. Theories whose predictions are inconsistent with the actual choices can be systematically eliminated. Behavioural theories can have multiple parameters requiring complex experimental designs with a very large number of possible choice tests. This imposes computational and economic constraints on using classical experimental design methods. We develop a methodology of adaptive tests: Bayesian Rapid Optimal Adaptive Designs (BROAD) that sequentially chooses the "most informative" test at each stage, and based on the response updates its posterior beliefs over the theories, which informs the next most informative test to run. BROAD utilizes the Equivalent Class Edge Cutting (EC2) criteria to select tests. We prove that the EC2 criteria is adaptively submodular, which allows us to prove theoretical guarantees against the Bayes-optimal testing sequence even in the presence of noisy responses. In simulated ground-truth experiments, we find that the EC2 criteria recovers the true hypotheses with significantly fewer tests than more widely used criteria such as Information Gain and Generalized Binary Search. We show, theoretically as well as experimentally, that surprisingly these popular criteria can perform poorly in the presence of noise, or subject errors. Furthermore, we use the adaptive submodular property of EC2 to implement an accelerated greedy version of BROAD which leads to orders of magnitude speedup over other methods.
We use BROAD to perform two experiments. First, we compare the main classes of theories for decision-making under risk, namely: expected value, prospect theory, constant relative risk aversion (CRRA) and moments models. Subjects are given an initial endowment, and sequentially presented choices between two lotteries, with the possibility of losses. The lotteries are selected using BROAD, and 57 subjects from Caltech and UCLA are incentivized by randomly realizing one of the lotteries chosen. Aggregate posterior probabilities over the theories show limited evidence in favour of CRRA and moments' models. Classifying the subjects into types showed that most subjects are described by prospect theory, followed by expected value. Adaptive experimental design raises the possibility that subjects could engage in strategic manipulation, i.e. subjects could mask their true preferences and choose differently in order to obtain more favourable tests in later rounds thereby increasing their payoffs. We pay close attention to this problem; strategic manipulation is ruled out since it is infeasible in practice, and also since we do not find any signatures of it in our data.
In the second experiment, we compare the main theories of time preference: exponential discounting, hyperbolic discounting, "present bias" models: quasi-hyperbolic (α, β) discounting and fixed cost discounting, and generalized-hyperbolic discounting. 40 subjects from UCLA were given choices between 2 options: a smaller but more immediate payoff versus a larger but later payoff. We found very limited evidence for present bias models and hyperbolic discounting, and most subjects were classified as generalized hyperbolic discounting types, followed by exponential discounting.
In these models the passage of time is linear. We instead consider a psychological model where the perception of time is subjective. We prove that when the biological (subjective) time is positively dependent, it gives rise to hyperbolic discounting and temporal choice inconsistency.
We also test the predictions of behavioral theories in the "wild". We pay attention to prospect theory, which emerged as the dominant theory in our lab experiments of risky choice. Loss aversion and reference dependence predicts that consumers will behave in a uniquely distinct way than the standard rational model predicts. Specifically, loss aversion predicts that when an item is being offered at a discount, the demand for it will be greater than that explained by its price elasticity. Even more importantly, when the item is no longer discounted, demand for its close substitute would increase excessively. We tested this prediction using a discrete choice model with loss-averse utility function on data from a large eCommerce retailer. Not only did we identify loss aversion, but we also found that the effect decreased with consumers' experience. We outline the policy implications that consumer loss aversion entails, and strategies for competitive pricing.
In future work, BROAD can be widely applicable for testing different behavioural models, e.g. in social preference and game theory, and in different contextual settings. Additional measurements beyond choice data, including biological measurements such as skin conductance, can be used to more rapidly eliminate hypothesis and speed up model comparison. Discrete choice models also provide a framework for testing behavioural models with field data, and encourage combined lab-field experiments.
Resumo:
The incidence of blue-green algal blooms and surface scum-formation are certainly not new phenomena. Many British and European authors have been faithfully describing the unmistakable symptoms of blue-green algal scums for over 800 years. There is no disputing that blue-green algal toxins are extremely harmful. Three quite separate categories of compound have been separated: neurotoxins; hepatotoxins and lipopolysaccharides. There is a popular association between blue-green algae and eutrophication. Certainly the main nuisance species - of Microcystis, Anabaena and Aphanizomenon are rare in oligotrophic lakes and reservoirs. Several approaches have been proposed for the control of blue-green algae. Distinction is made between methods for discharging algae already present (eg algicides; straw bales; viruses; parasitic fungi and herbivorous ciliates), and methods for averting an anticipated abundance in the future (phosphorous control, artificial circulation etc).
Resumo:
Complexity in the earthquake rupture process can result from many factors. This study investigates the origin of such complexity by examining several recent, large earthquakes in detail. In each case the local tectonic environment plays an important role in understanding the source of the complexity.
Several large shallow earthquakes (Ms > 7.0) along the Middle American Trench have similarities and differences between them that may lead to a better understanding of fracture and subduction processes. They are predominantly thrust events consistent with the known subduction of the Cocos plate beneath N. America. Two events occurring along this subduction zone close to triple junctions show considerable complexity. This may be attributable to a more heterogeneous stress environment in these regions and as such has implications for other subduction zone boundaries.
An event which looks complex but is actually rather simple is the 1978 Bermuda earthquake (Ms ~ 6). It is located predominantly in the mantle. Its mechanism is one of pure thrust faulting with a strike N 20°W and dip 42°NE. Its apparent complexity is caused by local crustal structure. This is an important event in terms of understanding and estimating seismic hazard on the eastern seaboard of N. America.
A study of several large strike-slip continental earthquakes identifies characteristics which are common to them and may be useful in determining what to expect from the next great earthquake on the San Andreas fault. The events are the 1976 Guatemala earthquake on the Motagua fault and two events on the Anatolian fault in Turkey (the 1967, Mudurnu Valley and 1976, E. Turkey events). An attempt to model the complex P-waveforms of these events results in good synthetic fits for the Guatemala and Mudurnu Valley events. However, the E. Turkey event proves to be too complex as it may have associated thrust or normal faulting. Several individual sources occurring at intervals of between 5 and 20 seconds characterize the Guatemala and Mudurnu Valley events. The maximum size of an individual source appears to be bounded at about 5 x 1026 dyne-cm. A detailed source study including directivity is performed on the Guatemala event. The source time history of the Mudurnu Valley event illustrates its significance in modeling strong ground motion in the near field. The complex source time series of the 1967 event produces amplitudes greater by a factor of 2.5 than a uniform model scaled to the same size for a station 20 km from the fault.
Three large and important earthquakes demonstrate an important type of complexity --- multiple-fault complexity. The first, the 1976 Philippine earthquake, an oblique thrust event, represents the first seismological evidence for a northeast dipping subduction zone beneath the island of Mindanao. A large event, following the mainshock by 12 hours, occurred outside the aftershock area and apparently resulted from motion on a subsidiary fault since the event had a strike-slip mechanism.
An aftershock of the great 1960 Chilean earthquake on June 6, 1960, proved to be an interesting discovery. It appears to be a large strike-slip event at the main rupture's southern boundary. It most likely occurred on the landward extension of the Chile Rise transform fault, in the subducting plate. The results for this event suggest that a small event triggered a series of slow events; the duration of the whole sequence being longer than 1 hour. This is indeed a "slow earthquake".
Perhaps one of the most complex of events is the recent Tangshan, China event. It began as a large strike-slip event. Within several seconds of the mainshock it may have triggered thrust faulting to the south of the epicenter. There is no doubt, however, that it triggered a large oblique normal event to the northeast, 15 hours after the mainshock. This event certainly contributed to the great loss of life-sustained as a result of the Tangshan earthquake sequence.
What has been learned from these studies has been applied to predict what one might expect from the next great earthquake on the San Andreas. The expectation from this study is that such an event would be a large complex event, not unlike, but perhaps larger than, the Guatemala or Mudurnu Valley events. That is to say, it will most likely consist of a series of individual events in sequence. It is also quite possible that the event could trigger associated faulting on neighboring fault systems such as those occurring in the Transverse Ranges. This has important bearing on the earthquake hazard estimation for the region.
Resumo:
An analytic technique is developed that couples to finite difference calculations to extend the results to arbitrary distance. Finite differences and the analytic result, a boundary integral called two-dimensional Kirchhoff, are applied to simple models and three seismological problems dealing with data. The simple models include a thorough investigation of the seismologic effects of a deep continental basin. The first problem is explosions at Yucca Flat, in the Nevada test site. By modeling both near-field strong-motion records and teleseismic P-waves simultaneously, it is shown that scattered surface waves are responsible for teleseismic complexity. The second problem deals with explosions at Amchitka Island, Alaska. The near-field seismograms are investigated using a variety of complex structures and sources. The third problem involves regional seismograms of Imperial Valley, California earthquakes recorded at Pasadena, California. The data are shown to contain evidence of deterministic structure, but lack of more direct measurements of the structure and possible three-dimensional effects make two-dimensional modeling of these data difficult.
Resumo:
In the first part of the thesis we explore three fundamental questions that arise naturally when we conceive a machine learning scenario where the training and test distributions can differ. Contrary to conventional wisdom, we show that in fact mismatched training and test distribution can yield better out-of-sample performance. This optimal performance can be obtained by training with the dual distribution. This optimal training distribution depends on the test distribution set by the problem, but not on the target function that we want to learn. We show how to obtain this distribution in both discrete and continuous input spaces, as well as how to approximate it in a practical scenario. Benefits of using this distribution are exemplified in both synthetic and real data sets.
In order to apply the dual distribution in the supervised learning scenario where the training data set is fixed, it is necessary to use weights to make the sample appear as if it came from the dual distribution. We explore the negative effect that weighting a sample can have. The theoretical decomposition of the use of weights regarding its effect on the out-of-sample error is easy to understand but not actionable in practice, as the quantities involved cannot be computed. Hence, we propose the Targeted Weighting algorithm that determines if, for a given set of weights, the out-of-sample performance will improve or not in a practical setting. This is necessary as the setting assumes there are no labeled points distributed according to the test distribution, only unlabeled samples.
Finally, we propose a new class of matching algorithms that can be used to match the training set to a desired distribution, such as the dual distribution (or the test distribution). These algorithms can be applied to very large datasets, and we show how they lead to improved performance in a large real dataset such as the Netflix dataset. Their computational complexity is the main reason for their advantage over previous algorithms proposed in the covariate shift literature.
In the second part of the thesis we apply Machine Learning to the problem of behavior recognition. We develop a specific behavior classifier to study fly aggression, and we develop a system that allows analyzing behavior in videos of animals, with minimal supervision. The system, which we call CUBA (Caltech Unsupervised Behavior Analysis), allows detecting movemes, actions, and stories from time series describing the position of animals in videos. The method summarizes the data, as well as it provides biologists with a mathematical tool to test new hypotheses. Other benefits of CUBA include finding classifiers for specific behaviors without the need for annotation, as well as providing means to discriminate groups of animals, for example, according to their genetic line.