7 results for Associative Classifiers
in CaltechTHESIS
Abstract:
Long linear polymers that are end-functionalized with associative groups were studied as additives to hydrocarbon fluids to mitigate the fire hazard associated with the presence of mist in a crash scenario. These polymers were molecularly designed to overcome both the shear-degradation of long polymer chains in turbulent flows, and the chain collapse induced by the random placement of associative groups along polymer backbones. Architectures of associative groups on the polymer chain ends that were tested included clusters of self-associative carboxyl groups and pairs of hetero-complementary associative units.
Linear polymers with clusters of discrete numbers of carboxyl groups on their chain ends were investigated first: an innovative synthetic strategy was devised to achieve unprecedented backbone lengths and precise control of the number of carboxyl groups on chain ends (N). We found that a very narrow range of N allows the co-existence of sufficient end-association strength and polymer solubility in apolar media. A subsequent steady-flow rheological study of the solution behavior of such soluble polymers revealed that the end-association of very long chains in apolar media leads to the formation of flower-like micelles interconnected by bridging chains, which trap a significant fraction of polymer chains in looped structures that contribute little to mist control. The efficacy of very long 1,4-polybutadiene chains end-functionalized with clusters of four carboxyl groups as mist-control additives for jet fuel was further tested. In addition to being shear-resistant, the polymer was found capable of providing fire protection to jet fuel at concentrations as low as 0.3 wt%. We also found that this polymer has excellent solubility in jet fuel over a wide range of temperatures (-30 to +70 °C) and negligible interference with dewatering of jet fuel. It does not cause an adverse increase in viscosity at concentrations where mist-control efficacy exists.
Four pairs of hetero-complementary associative end-groups of varying strengths were subsequently investigated, in the hopes of achieving supramolecular aggregates with both mist-control ability and better utilization of polymer building blocks. Rheological study of solutions of the corresponding complementary associative polymer pairs in apolar media revealed the strength of complementary end-association required to achieve supramolecular aggregates capable of modulating rheological properties of the solution.
Both self-associating and complementary associating polymers have therefore been found to resist shear degradation. The successful strategy of building soluble, end-associative polymers with either self-associative or complementary associative groups will guide the next generation of mist-control technology.
Abstract:
We perform a measurement of direct CP violation in $b \to s\gamma$, $A_{CP}$, and a measurement of the difference between $A_{CP}$ for neutral and charged B mesons, $\Delta A_{X_s\gamma}$, using 429 fb$^{-1}$ of data recorded at the $\Upsilon(4S)$ resonance with the BABAR detector. B mesons are reconstructed from 16 exclusive final states. Particle identification is done using an algorithm based on Error Correcting Output Code with an exhaustive matrix. Background rejection and best candidate selection are done using two decision tree-based classifiers. We found $A_{CP} = (1.73 \pm 1.93 \pm 1.02)\%$ and $\Delta A_{X_s\gamma} = (4.97 \pm 3.90 \pm 1.45)\%$, where the uncertainties are statistical and systematic, respectively. Based on the measured value of $\Delta A_{X_s\gamma}$, we determine a 90% confidence interval for $\mathrm{Im}(C_{8g}/C_{7\gamma})$, where $C_{7\gamma}$ and $C_{8g}$ are Wilson coefficients for New Physics amplitudes, of $-1.64 < \mathrm{Im}(C_{8g}/C_{7\gamma}) < 6.52$.
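The abstract does not spell out how the exhaustive code matrix is built. As a minimal sketch, assuming the standard exhaustive construction for error-correcting output codes (every distinct non-trivial two-way split of the classes becomes one binary problem), the matrix and a nearest-codeword decoder could look like the following. The function names and the four-class example are illustrative assumptions, not the thesis's implementation.

```python
from itertools import product


def exhaustive_code_matrix(n_classes):
    """Return one codeword (tuple of bits) per class.

    Each column is one binary problem. The exhaustive code uses every
    distinct non-trivial split of the classes: 2**(n_classes - 1) - 1 columns.
    """
    columns = []
    for tail in product((0, 1), repeat=n_classes - 1):
        column = (1,) + tail  # fix the first bit so complements are not repeated
        if all(column):
            continue  # all classes on one side: nothing to learn
        columns.append(column)
    # Transpose: rows are classes, columns are binary problems.
    return [tuple(col[i] for col in columns) for i in range(n_classes)]


def decode(bits, matrix):
    """Pick the class whose codeword is nearest in Hamming distance."""
    distances = [sum(b != m for b, m in zip(bits, row)) for row in matrix]
    return distances.index(min(distances))


matrix = exhaustive_code_matrix(4)
noisy = list(matrix[2])
noisy[0] ^= 1  # simulate one binary classifier giving the wrong answer
predicted = decode(noisy, matrix)
```

With four classes the codewords are seven bits long and any two differ in four positions, so a single wrong binary decision still decodes to the intended class.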
Abstract:
Humans are capable of distinguishing more than 5000 visual categories even in complex environments, using a variety of different visual systems all working in tandem. We seem to be capable of distinguishing thousands of different odors as well. In the machine learning community, many commonly used multi-class classifiers do not scale well to such large numbers of categories. This thesis demonstrates a method of automatically creating application-specific taxonomies to aid in scaling classification algorithms to more than 100 categories using both visual and olfactory data. The visual data consists of images collected online and pollen slides scanned under a microscope. The olfactory data was acquired by constructing a small portable sniffing apparatus which draws air over 10 carbon black polymer composite sensors. We investigate performance when classifying 256 visual categories, 8 or more species of pollen and 130 olfactory categories sampled from common household items and a standardized scratch-and-sniff test. Taxonomies are employed in a divide-and-conquer classification framework which improves classification time while allowing the end user to trade performance for specificity as needed. Before classification can even take place, the pollen counter and electronic nose must filter out a high volume of background “clutter” to detect the categories of interest. In the case of pollen this is done with an efficient cascade of classifiers that rule out most non-pollen before invoking slower multi-class classifiers. In the case of the electronic nose, much of the extraneous noise encountered in outdoor environments can be filtered using a sniffing strategy which preferentially samples the sensor response at frequencies that are relatively immune to background contributions from ambient water vapor.
This combination of efficient background rejection with scalable classification algorithms is tested in detail for three separate projects: 1) the Caltech-256 Image Dataset, 2) the Caltech Automated Pollen Identification and Counting System (CAPICS) and 3) a portable electronic nose specially constructed for outdoor use.
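The rejection cascade described above can be sketched as follows. This is only an illustration of the general idea, assuming each stage is a cheap scoring function with a threshold; the feature names (`area`, `roundness`), the thresholds, and the stand-in multi-class function are hypothetical and do not come from the thesis.

```python
def cascade_accepts(candidate, stages):
    """Run cheap tests in order; reject at the first one that fails.

    stages: list of (score_fn, threshold) pairs, cheapest first.
    """
    for score_fn, threshold in stages:
        if score_fn(candidate) < threshold:
            return False  # rejected early; later stages never run
    return True


def classify(candidates, stages, multiclass_fn):
    """Call the slow multi-class classifier only on cascade survivors."""
    return [
        multiclass_fn(c) if cascade_accepts(c, stages) else None
        for c in candidates
    ]


# Hypothetical candidates and stages, for illustration only.
candidates = [
    {"area": 5, "roundness": 0.90},    # too small: rejected by stage 1
    {"area": 120, "roundness": 0.20},  # not round: rejected by stage 2
    {"area": 150, "roundness": 0.95},  # survives both stages
]
stages = [
    (lambda c: c["area"], 50),
    (lambda c: c["roundness"], 0.8),
]
slow_calls = []


def slow_multiclass(c):
    slow_calls.append(c)  # record how often the expensive step runs
    return "large" if c["area"] > 140 else "small"


labels = classify(candidates, stages, slow_multiclass)
```

Here only one of the three candidates reaches the expensive classifier, which is the point of ordering the stages from cheapest to slowest.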
Abstract:
A series of Cs- and C1-symmetric doubly-linked ansa-metallocenes of the general formula {1,1'-SiMe2-2,2'-E-(η5-C5H2-4-R1)-(η5-C5H-3',5'-(CHMe2)2)}ZrCl2 (E = SiMe2 (1), SiPh2 (2), SiMe2-SiMe2 (3); R1 = H, CHMe2, C5H9, C6H11, C6H5) has been prepared. When activated by methylaluminoxane, these are active propylene polymerization catalysts. 1 and 2 produce syndiotactic polypropylenes, and 3 produces isotactic polypropylenes. Site epimerization is the major pathway for stereoerror formation for 1 and 2. In addition, the polymer chain has slightly stronger steric interaction with the diphenylsilylene linker than with the dimethylsilylene linker. This results in more frequent site epimerization and reduced syndiospecificity for 2 compared to 1.
C1-Symmetric ansa-zirconocenes [1,1'-SiMe2-(C5H4)-(3-R-C5H3)]ZrCl2 (4), [1,1'-SiMe2-(C5H4)-(2,4-R2-C5H2)]ZrCl2 (5) and [1,1'-SiMe2-2,2'-(SiMe2-SiMe2)-(C5H3)-(4-R-C5H2)]ZrCl2 (6) have been prepared to probe the origin of isospecificity in 3. While 4 and 3 produce polymers with similar isospecificity, 5 and 6 give mostly hemi-isotactic-like polymers. It is proposed that the facile site epimerization via an associative pathway allows rapid equilibration of the polymer chain between the isospecific and aspecific insertion sites. This results in more frequent insertion from the isospecific site, which has a lower kinetic barrier for chain propagation. On the other hand, site epimerization for 5 and 6 is slow. This leads to mostly alternating insertion from the isospecific and aspecific sites and, consequently, to hemi-isotactic-like polymers. In comparison, site epimerization is even slower for 3, but enchainment from the aspecific site has an extremely high kinetic barrier for monomer coordination. Therefore, enchainment occurs preferentially from the isospecific site to produce isotactic polymers.
A series of cationic complexes [(ArN=CR-CR=NAr)PtMe(L)]+[BF4]- (Ar = aryl; R = H, CH3; L = water, trifluoroethanol) has been prepared. They react smoothly with benzene at approximately room temperature in trifluoroethanol solvent to yield methane and the corresponding phenyl Pt(II) cations, via Pt(IV)-methyl-phenyl-hydride intermediates. The reaction products of methyl-substituted benzenes suggest an inherent reactivity preference for aromatic over benzylic C-H bond activation, which can however be overridden by steric effects. For the reaction of benzene with cationic Pt(II) complexes in which the diimine ligands bear 3,5-disubstituted aryl groups at the nitrogen atoms, the rate-determining step is C-H bond activation. For the more sterically crowded analogs with 2,6-dimethyl-substituted aryl groups, benzene coordination becomes rate-determining. The more electron-rich the ligand, as reflected by the CO stretching frequency in the IR spectrum of the corresponding cationic carbonyl complex, the faster the rate of C-H bond activation. This finding, however, does not reflect the actual C-H bond activation process, but rather reflects only the relative ease of solvent molecules displacing water molecules to initiate the reaction. That is, the change in rates is mostly due to a ground state effect. Several lines of evidence suggest that associative substitution pathways operate to get the hydrocarbon substrate into, and out of, the coordination sphere; i.e., that benzene substitution proceeds by a solvent- (TFE-) assisted associative pathway.
Abstract:
The LIGO and Virgo gravitational-wave observatories are complex and extremely sensitive strain detectors that can be used to search for a wide variety of gravitational waves from astrophysical and cosmological sources. In this thesis, I motivate the search for the gravitational wave signals from coalescing black hole binary systems with total mass between 25 and 100 solar masses. The mechanisms for formation of such systems are not well-understood, and we do not have many observational constraints on the parameters that guide the formation scenarios. Detection of gravitational waves from such systems (or, in the absence of detection, the tightening of upper limits on the rate of such coalescences) will provide valuable information that can inform the astrophysics of the formation of these systems. I review the search for these systems and place upper limits on the rate of black hole binary coalescences with total mass between 25 and 100 solar masses. I then show how the sensitivity of this search can be improved by up to 40% by the application of the multivariate statistical classifier known as a random forest of bagged decision trees to more effectively discriminate between signal and non-Gaussian instrumental noise. I also discuss the use of this classifier in the search for the ringdown signal from the merger of two black holes with total mass between 50 and 450 solar masses and present upper limits. I also apply multivariate statistical classifiers to the problem of quantifying the non-Gaussianity of LIGO data. Despite these improvements, no gravitational-wave signals have been detected in LIGO data so far. However, the use of multivariate statistical classification can significantly improve the sensitivity of the Advanced LIGO detectors to such signals.
Abstract:
These studies explore how, where, and when variables critical to decision-making are represented in the brain. In order to produce a decision, humans must first determine the relevant stimuli, actions, and possible outcomes before applying an algorithm that will select an action from those available. When choosing amongst alternative stimuli, the framework of value-based decision-making proposes that values are assigned to the stimuli and that these values are then compared in an abstract “value space” in order to produce a decision. Despite much progress, in particular regarding the pinpointing of ventromedial prefrontal cortex (vmPFC) as a region that encodes the value, many basic questions remain. In Chapter 2, I show that distributed BOLD signaling in vmPFC represents the value of stimuli under consideration in a manner that is independent of the type of stimulus. Thus the open question of whether value is represented in abstraction, a key tenet of value-based decision-making, is answered in the affirmative. However, I also show that stimulus-dependent value representations are present in the brain during decision-making, and I suggest a potential neural pathway for stimulus-to-value transformations that integrates these two results.
More broadly speaking, there is both neural and behavioral evidence that two distinct control systems are at work during action selection. These two systems are the “goal-directed” system, which selects actions based on an internal model of the environment, and the “habitual” system, which generates responses based on antecedent stimuli only. Computational characterizations of these two systems imply that they have different informational requirements in terms of input stimuli, actions, and possible outcomes. Associative learning theory predicts that the habitual system should utilize stimulus and action information only, while goal-directed behavior requires that outcomes as well as stimuli and actions be processed. In Chapter 3, I test whether areas of the brain hypothesized to be involved in habitual versus goal-directed control represent the corresponding theorized variables.
The question of whether one or both of these neural systems drives Pavlovian conditioning is less well-studied. Chapter 4 describes an experiment in which subjects were scanned while engaged in a Pavlovian task with a simple non-trivial structure. After comparing a variety of model-based and model-free learning algorithms (thought to underpin goal-directed and habitual decision-making, respectively), it was found that subjects’ reaction times were better explained by a model-based system. In addition, neural signaling of precision, a variable based on a representation of a world model, was found in the amygdala. These data indicate that the influence of model-based representations of the environment can extend even to the most basic learning processes.
Knowledge of the state of hidden variables in an environment is required for optimal inference regarding the abstract decision structure of a given environment and therefore can be crucial to decision-making in a wide range of situations. Inferring the state of an abstract variable requires the generation and manipulation of an internal representation of beliefs over the values of the hidden variable. In Chapter 5, I describe behavioral and neural results regarding the learning strategies employed by human subjects in a hierarchical state-estimation task. In particular, a comprehensive model fit and comparison process pointed to the use of "belief thresholding". This implies that subjects tended to eliminate low-probability hypotheses regarding the state of the environment from their internal model and ceased to update the corresponding variables. Thus, in concert with incremental Bayesian learning, humans explicitly manipulate their internal model of the generative process during hierarchical inference consistent with a serial hypothesis testing strategy.
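The abstract does not give the exact form of the "belief thresholding" model. A minimal sketch of the idea, assuming a discrete set of hypotheses, a Bayesian update per observation, and a hypothetical pruning threshold of 0.05, might look like this. Once a hypothesis is pruned its belief stays at zero, which mirrors the statement that subjects ceased to update the corresponding variables.

```python
def update_beliefs(beliefs, likelihoods, threshold=0.05):
    """One Bayesian update followed by pruning of low-probability hypotheses.

    beliefs: dict hypothesis -> current probability (0.0 means already pruned)
    likelihoods: dict hypothesis -> P(observation | hypothesis)
    threshold: hypothetical cutoff below which a hypothesis is eliminated
    """
    # Bayes' rule, applied only to hypotheses that are still active.
    unnormalized = {
        h: b * likelihoods[h] for h, b in beliefs.items() if b > 0.0
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero likelihood under all active hypotheses")
    posterior = {h: p / total for h, p in unnormalized.items()}

    # Thresholding: drop weak hypotheses, but never drop the strongest one.
    best = max(posterior, key=posterior.get)
    kept = {h: p for h, p in posterior.items() if p >= threshold or h == best}
    norm = sum(kept.values())

    result = {h: 0.0 for h in beliefs}  # pruned hypotheses stay at zero
    result.update({h: p / norm for h, p in kept.items()})
    return result


prior = {"A": 0.5, "B": 0.3, "C": 0.2}
step1 = update_beliefs(prior, {"A": 0.9, "B": 0.5, "C": 0.02})
# Later evidence favouring C cannot revive it once it has been pruned.
step2 = update_beliefs(step1, {"A": 0.1, "B": 0.1, "C": 0.9})
```

In this toy run hypothesis C falls below the cutoff after the first observation and is excluded from every later update, unlike in a pure Bayesian learner, where it would recover on the second observation.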
Abstract:
In the first part of the thesis we explore three fundamental questions that arise naturally when we conceive a machine learning scenario where the training and test distributions can differ. Contrary to conventional wisdom, we show that in fact mismatched training and test distributions can yield better out-of-sample performance. This optimal performance can be obtained by training with the dual distribution. This optimal training distribution depends on the test distribution set by the problem, but not on the target function that we want to learn. We show how to obtain this distribution in both discrete and continuous input spaces, as well as how to approximate it in a practical scenario. Benefits of using this distribution are exemplified in both synthetic and real data sets.
In order to apply the dual distribution in the supervised learning scenario where the training data set is fixed, it is necessary to use weights to make the sample appear as if it came from the dual distribution. We explore the negative effect that weighting a sample can have. The theoretical decomposition of the use of weights regarding its effect on the out-of-sample error is easy to understand but not actionable in practice, as the quantities involved cannot be computed. Hence, we propose the Targeted Weighting algorithm that determines if, for a given set of weights, the out-of-sample performance will improve or not in a practical setting. This is necessary as the setting assumes there are no labeled points distributed according to the test distribution, only unlabeled samples.
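As a minimal sketch of the weighting idea, assuming a discrete input space with known training and desired distributions, each sample can be weighted by the ratio of the desired probability to the training probability. The distributions, loss values, and function names below are hypothetical. The effective-sample-size figure is a common heuristic for the cost of weighting and is not the decomposition or the Targeted Weighting algorithm described in the thesis.

```python
def importance_weights(samples, p_train, p_target):
    """Weight each sample by p_target(x) / p_train(x)."""
    return [p_target[x] / p_train[x] for x in samples]


def weighted_mean(values, weights):
    """Self-normalized weighted average."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def effective_sample_size(weights):
    """Kish's heuristic: uneven weights behave like a smaller sample."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Hypothetical distributions and per-point losses, for illustration only.
p_train = {"a": 0.6, "b": 0.3, "c": 0.1}
p_target = {"a": 0.2, "b": 0.3, "c": 0.5}
loss = {"a": 1.0, "b": 2.0, "c": 4.0}
# A sample whose empirical frequencies match p_train exactly.
samples = ["a"] * 6 + ["b"] * 3 + ["c"]

weights = importance_weights(samples, p_train, p_target)
estimate = weighted_mean([loss[x] for x in samples], weights)
target_value = sum(p_target[x] * loss[x] for x in p_target)
ess = effective_sample_size(weights)
```

The weighted estimate matches the expectation under the desired distribution, but the ten samples carry the statistical weight of fewer than four, which illustrates one way weighting can hurt.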
Finally, we propose a new class of matching algorithms that can be used to match the training set to a desired distribution, such as the dual distribution (or the test distribution). These algorithms can be applied to very large datasets, and we show how they lead to improved performance in a large real dataset such as the Netflix dataset. Their computational complexity is the main reason for their advantage over previous algorithms proposed in the covariate shift literature.
In the second part of the thesis we apply Machine Learning to the problem of behavior recognition. We develop a specific behavior classifier to study fly aggression, and we develop a system that allows analyzing behavior in videos of animals, with minimal supervision. The system, which we call CUBA (Caltech Unsupervised Behavior Analysis), allows detecting movemes, actions, and stories from time series describing the position of animals in videos. The method summarizes the data and provides biologists with a mathematical tool to test new hypotheses. Other benefits of CUBA include finding classifiers for specific behaviors without the need for annotation, as well as providing means to discriminate groups of animals, for example, according to their genetic line.