896 results for High-dimensional data visualization
Abstract:
This thesis deals with the synthesis and characterization of polymers with redox-functional phenothiazine side chains. Phenothiazine and its derivatives are small redox units whose reversible redox behavior is coupled with electrochromic properties. A distinctive feature of phenothiazines is the formation of stable radical cations in the oxidized state. Phenothiazines can therefore act as bistable molecules and switch between two stable redox states. This switching process is accompanied by a color change.
In this work, the synthesis of novel phenothiazine polymers by radical polymerization is described. Phenothiazine derivatives were covalently attached to aliphatic and aromatic polymer chains via two different synthetic routes. The first route uses vinyl monomers bearing phenothiazine functionality for direct polymerization. The second route uses amine-modified phenothiazine derivatives to functionalize polymers with active-ester side chains in a polymer-analogous reaction.
Owing to their electron-donor properties, polymers with redox-functional phenothiazine side chains are suitable candidates for use as cathode materials. To test their suitability, phenothiazine polymers were used as electrode materials in lithium battery cells. The polymers showed good capacities of approximately 50-90 Ah/kg as well as short charging times in the battery cell. In particular, charging is 5-10 times faster than in conventional lithium batteries. With regard to the number of charge and discharge cycles, the polymers achieved good values in long-term stability tests. Overall, the polymers withstand 500 charging cycles with only small changes from the initial values of charging time and capacity. Long-term stability is directly related to radical stability.
Stabilization of the radical cations was achieved by extending the side chain at the nitrogen atom of the phenothiazine and the polymer backbone. Such alkyl substitution increases radical stability through enhanced interaction with the aromatic ring and thus improves battery performance in terms of stability over charge and discharge cycles.
Furthermore, the practical application of bistable phenothiazine polymers as a storage medium for high data densities was investigated. For this purpose, thin films of the polymer on conductive substrates were electrochemically oxidized. The electrochemical oxidation was carried out by atomic force microscopy in combination with conductive tips. With this technique, the polymer surface could be oxidized on the nanoscale, thereby changing the local conductivity. Patterns of different sizes could thus be written lithographically and detected via the change in their conductivity. The writing process changed only the local conductivity without affecting the topography of the polymer film. In addition, the patterns proved to be particularly stable, both mechanically and over time.
Finally, new synthetic strategies were developed to produce surfaces that are both mechanically stable and redox-functional. Using surface-initiated atom transfer radical polymerization, grafted polymer brushes with redox-functional phenothiazine side chains were prepared and analyzed by X-ray methods and atomic force microscopy. One of the synthetic strategies starts from grafted active-ester brushes, which can be modified with redox-functional groups in a subsequent step. This approach is particularly promising and allows different functional groups to be anchored to the active-ester brushes. By using crosslinking groups, the mechanical stability of such polymer films can thus be optimized in addition to their redox properties.
Abstract:
In the oil and gas industry, imaging techniques and simulations at the pore scale are becoming routine applications. Their further potential can be applied in the environmental field, e.g. to the transport and fate of contaminants in the subsurface, the storage of carbon dioxide, and the natural attenuation of contaminants in soils. X-ray computed tomography (XCT) provides a non-destructive 3D imaging technique that is also frequently used to investigate the internal structure of geological samples. The first aim of this dissertation was to implement an image-processing technique that removes the beam hardening of X-ray computed tomography and simplifies the segmentation of its data. The second aim of this work was to investigate the combined effects of pore-space characteristics and pore tortuosity, together with flow simulation and transport modeling in pore spaces using the lattice Boltzmann method. In a cylindrical geological sample, the position of each phase could be extracted based on the observation that beam hardening in the reconstructed images is a radial function from the sample edge to the center, and the different phases could be segmented automatically. Furthermore, beam-hardening effects in arbitrarily shaped objects were corrected by a surface-fitting algorithm. The least squares support vector machine (LSSVM) method is characterized by a modular structure and is very well suited to pattern recognition and classification. For this reason, the LSSVM method was implemented as a pixel-based classification method. This algorithm is able to classify complex geological samples correctly, but in that case requires longer computation times, so that multidimensional training data sets have to be used.
The dynamics of the immiscible phases air and water are investigated by a combination of the pore-morphology and lattice Boltzmann methods for drainage and imbibition processes in 3D data sets of soils obtained by synchrotron-based XCT. Although the pore-morphology method is a simple method of fitting spheres into the available pore space, it can nevertheless explain the complex capillary hysteresis as a function of water saturation. Hysteresis was observed for the capillary pressure and the hydraulic conductivity, caused by the predominantly connected pore networks and the available pore-size distribution. The hydraulic conductivity is a function of the water saturation level and is compared with a macroscopic calculation from empirical models. The data agree well, especially for high water saturations. In order to predict the presence of pathogens in groundwater and wastewater, the influence of grain size, pore geometry, and fluid flow velocity was studied in a soil aggregate, using the microorganism Escherichia coli as an example. The asymmetric and long-tailed breakthrough curves, especially at higher water saturations, were caused by dispersive transport due to the connected pore network and by the heterogeneity of the flow field. The biocolloid residence time was observed to be a function of both the pressure gradient and the colloid size. Our modeling results agree very well with previously published data.
Abstract:
Data visualization is the process of representing data as pictures to support reasoning about the underlying data. For the interpretation to be as easy as possible, we need to be as close as possible to the original data. As most visualization tools have an internal meta-model, which is different from the one for the presented data, they usually need to duplicate the original data to conform to their meta-model. This leads to an increase in the resources needed, an increase that is not always justified. In this work we argue for the need for an engine that is as close as possible to the data, and we present our solution of moving the visualization tool to the data instead of moving the data to the visualization tool. Our solution also emphasizes the necessity of reusing basic blocks to express complex visualizations and of allowing programmers to script the visualization using their preferred tools rather than a third-party format. As a validation of the expressiveness of our framework, we show how we express several already published visualizations and describe the pros and cons of the approach.
Abstract:
In recent years, researchers in the health and social sciences have become increasingly interested in mediation analysis. Specifically, upon establishing a non-null total effect of an exposure, investigators routinely wish to make inferences about the direct (indirect) pathway of the effect of the exposure not through (through) a mediator variable that occurs subsequently to the exposure and prior to the outcome. Natural direct and indirect effects are of particular interest as they generally combine to produce the total effect of the exposure and therefore provide insight on the mechanism by which it operates to produce the outcome. A semiparametric theory has recently been proposed to make inferences about marginal mean natural direct and indirect effects in observational studies (Tchetgen Tchetgen and Shpitser, 2011), which delivers multiply robust locally efficient estimators of the marginal direct and indirect effects, and thus generalizes previous results for total effects to the mediation setting. In this paper we extend the new theory to handle a setting in which a parametric model for the natural direct (indirect) effect within levels of pre-exposure variables is specified and the model for the observed data likelihood is otherwise unrestricted. We show that estimation is generally not feasible in this model because of the curse of dimensionality associated with the required estimation of auxiliary conditional densities or expectations, given high-dimensional covariates. We thus consider multiply robust estimation and propose a more general model which assumes a subset but not all of several working models holds.
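The statement that natural direct and indirect effects combine to produce the total effect can be written out explicitly. The notation below is an assumption (the abstract fixes none): a binary exposure, $Y(a, m)$ the counterfactual outcome under exposure $a$ and mediator value $m$, and $M(a)$ the counterfactual mediator under exposure $a$. Under that notation, a sketch of the decomposition on the mean-difference scale is:

```latex
\underbrace{E[Y(1, M(1))] - E[Y(0, M(0))]}_{\text{total effect}}
= \underbrace{E[Y(1, M(0))] - E[Y(0, M(0))]}_{\text{natural direct effect}}
+ \underbrace{E[Y(1, M(1))] - E[Y(1, M(0))]}_{\text{natural indirect effect}}
```

The middle term $E[Y(1, M(0))]$ cancels on the right-hand side, which is why the two effects add up to the total effect.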
Abstract:
Quantifying the health effects associated with simultaneous exposure to many air pollutants is now a research priority of the US EPA. Bayesian hierarchical models (BHM) have been extensively used in multisite time series studies of air pollution and health to estimate health effects of a single pollutant adjusted for potential confounding of other pollutants and other time-varying factors. However, when the scientific goal is to estimate the impacts of many pollutants jointly, a straightforward application of BHM is challenged by the need to specify a random-effect distribution on a high-dimensional vector of nuisance parameters, which often do not have an easy interpretation. In this paper we introduce a new BHM formulation, which we call "reduced BHM", aimed at analyzing clustered data sets in the presence of a large number of random effects that are not of primary scientific interest. At the first stage of the reduced BHM, we calculate the integrated likelihood of the parameter of interest (e.g. excess number of deaths attributed to simultaneous exposure to high levels of many pollutants). At the second stage, we specify a flexible random-effect distribution directly on the parameter of interest. The reduced BHM overcomes many of the challenges in the specification and implementation of full BHM in the context of a large number of nuisance parameters. In simulation studies we show that the reduced BHM performs comparably to the full BHM in many scenarios, and even performs better in some cases. Methods are applied to estimate location-specific and overall relative risks of cardiovascular hospital admissions associated with simultaneous exposure to elevated levels of particulate matter and ozone in 51 US counties during the period 1999-2005.
Abstract:
One of the original ocean-bottom time-lapse seismic studies was performed at the Teal South oil field in the Gulf of Mexico during the late 1990’s. This work reexamines some aspects of previous work using modern analysis techniques to provide improved quantitative interpretations. Using three-dimensional volume visualization of legacy data and the two phases of post-production time-lapse data, I provide additional insight into the fluid migration pathways and the pressure communication between different reservoirs, separated by faults. This work supports a conclusion from previous studies that production from one reservoir caused regional pressure decline that in turn resulted in liberation of gas from multiple surrounding unproduced reservoirs. I also provide an explanation for unusual time-lapse changes in amplitude-versus-offset (AVO) data related to the compaction of the producing reservoir which, in turn, changed an isotropic medium to an anisotropic medium. In the first part of this work, I examine regional changes in seismic response due to the production of oil and gas from one reservoir. The previous studies primarily used two post-production ocean-bottom surveys (Phase I and Phase II), and not the legacy streamer data, due to the unavailability of legacy prestack data and very different acquisition parameters. In order to incorporate the legacy data in the present study, all three poststack data sets were cross-equalized and examined using instantaneous amplitude and energy volumes. This approach appears quite effective and helps to suppress changes unrelated to production while emphasizing those large-amplitude changes that are related to production in this noisy (by current standards) suite of data. I examine the multiple data sets first by using the instantaneous amplitude and energy attributes, and then also examine specific apparent time-lapse changes through direct comparisons of seismic traces. 
In so doing, I identify time-delays that, when corrected for, indicate water encroachment at the base of the producing reservoir. I also identify specific sites of leakage from various unproduced reservoirs, the result of regional pressure blowdown as explained in previous studies; those earlier studies, however, were unable to identify direct evidence of fluid movement. Of particular interest is the identification of one site where oil apparently leaked from one reservoir into a “new” reservoir that did not originally contain oil, but was ideally suited as a trap for fluids leaking from the neighboring spill-point. With continued pressure drop, oil in the new reservoir increased as more oil entered into the reservoir and expanded, liberating gas from solution. Because of the limited volume available for oil and gas in that temporary trap, oil and gas also escaped from it into the surrounding formation. I also note that some of the reservoirs demonstrate time-lapse changes only in the “gas cap” and not in the oil zone, even though gas must be coming out of solution everywhere in the reservoir. This is explained by interplay between pore-fluid modulus reduction by gas saturation decrease and dry-frame modulus increase by frame stiffening. In the second part of this work, I examine various rock-physics models in an attempt to quantitatively account for frame-stiffening that results from reduced pore-fluid pressure in the producing reservoir, searching for a model that would predict the unusual AVO features observed in the time-lapse prestack and stacked data at Teal South. While several rock-physics models are successful at predicting the time-lapse response for initial production, most fail to match the observations for continued production between Phase I and Phase II. 
Because the reservoir was initially overpressured and unconsolidated, reservoir compaction was likely significant, and is probably accomplished largely by uniaxial strain in the vertical direction; this implies that an anisotropic model may be required. Using Walton’s model for anisotropic unconsolidated sand, I successfully model the time-lapse changes for all phases of production. This observation may be of interest for application to other unconsolidated overpressured reservoirs under production.
Abstract:
Simbrain is a visually-oriented framework for building and analyzing neural networks. It emphasizes the analysis of networks which control agents embedded in virtual environments, and visualization of the structures which occur in the high dimensional state spaces of these networks. The program was originally intended to facilitate analysis of representational processes in embodied agents; however, it is also well suited to teaching neural network concepts to a broader audience than is traditional for neural networks courses. Simbrain was used to teach a course at a new university, UC Merced, in its inaugural year. Experiences from the course and sample lessons are provided.
Abstract:
We present an algorithm for estimating dense image correspondences. Our versatile approach lends itself to various tasks typical for video post-processing, including image morphing, optical flow estimation, stereo rectification, disparity/depth reconstruction, and baseline adjustment. We incorporate recent advances in feature matching, energy minimization, stereo vision, and data clustering into our approach. At the core of our correspondence estimation we use Efficient Belief Propagation for energy minimization. While state-of-the-art algorithms only work on thumbnail-sized images, our novel feature downsampling scheme, in combination with a simple yet efficient data term compression, can cope with high-resolution data. The incorporation of SIFT (Scale-Invariant Feature Transform) features into data term computation further resolves matching ambiguities, making long-range correspondence estimation possible. We detect occluded areas by evaluating the correspondence symmetry, and we further apply geodesic matting to automatically determine plausible values in these regions.
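The occlusion test by correspondence symmetry can be illustrated with a forward-backward consistency check. This is a minimal sketch under stated assumptions, not the authors' implementation: it works on a single 1-D scanline with integer displacements, and the function name `asymmetric_pixels` and the tolerance parameter `tol` are hypothetical. A pixel is flagged when following the forward correspondence and then the backward one does not return to the starting position.

```python
def asymmetric_pixels(forward, backward, tol=0):
    """Flag pixels whose forward and backward correspondences disagree.

    forward[x]  : displacement from pixel x in image A to its match in image B
    backward[y] : displacement from pixel y in image B to its match in image A
    A pixel x is consistent when forward[x] + backward[x + forward[x]] is
    within `tol` of zero; otherwise it is a candidate occlusion.
    """
    flags = []
    n = len(backward)
    for x, d in enumerate(forward):
        target = x + d
        if not 0 <= target < n:
            # The match falls outside image B, so symmetry cannot hold.
            flags.append(True)
            continue
        flags.append(abs(d + backward[target]) > tol)
    return flags


if __name__ == "__main__":
    fwd = [1, 1, 1, 0]
    bwd = [0, -1, -1, -1]
    print(asymmetric_pixels(fwd, bwd))
```

Flagged pixels would then be the regions handed to a fill-in step such as the matting mentioned in the abstract.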
Abstract:
High-quality data are essential for veterinary surveillance systems, and their quality can be affected by the source and the method of collection. Data recorded on farms could provide detailed information about the health of a population of animals, but the accuracy of the data recorded by farmers is uncertain. The aims of this study were to evaluate the quality of the data on animal health recorded on 97 Swiss dairy farms, to compare the quality of the data obtained by different recording systems, and to obtain baseline data on the health of the animals on the 97 farms. Data on animal health were collected from the farms for a year. Their quality was evaluated by assessing the completeness and accuracy of the recorded information, and by comparing farmers' and veterinarians' records. The quality of the data provided by the farmers was satisfactory, although electronic recording systems made it easier to trace the animals treated. The farmers tended to record more health-related events than the veterinarians, although this varied with the event considered, and some events were recorded only by the veterinarians. The farmers' attitude towards data collection was positive. Factors such as motivation, feedback, training, and simplicity and standardisation of data collection were important because they influenced the quality of the data.
Abstract:
High-throughput assays, such as the yeast two-hybrid system, have generated a huge amount of protein-protein interaction (PPI) data in the past decade. This tremendously increases the need for developing reliable methods to systematically and automatically suggest protein functions and relationships between them. With the available PPI data, it is now possible to study the functions and relationships in the context of a large-scale network. To date, several network-based schemes have been provided to effectively annotate protein functions on a large scale. However, due to the inherent noise in high-throughput data generation, new methods and algorithms should be developed to increase the reliability of functional annotations. Previous work in a yeast PPI network (Samanta and Liang, 2003) has shown that the local connection topology, particularly for two proteins sharing an unusually large number of neighbors, can predict functional associations between proteins, and hence suggest their functions. One advantage of that work is that its algorithm is not sensitive to noise (false positives) in high-throughput PPI data. In this study, we improved their prediction scheme by developing a new algorithm and new methods, which we applied to a human PPI network to make a genome-wide functional inference. We used the new algorithm to measure and reduce the influence of hub proteins on detecting functionally associated proteins. We used the annotations of the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) as independent and unbiased benchmarks to evaluate our algorithms and methods within the human PPI network. We showed that, compared with the previous work from Samanta and Liang, our algorithm and methods developed in this study improved the overall quality of functional inferences for human proteins. By applying the algorithms to the human PPI network, we obtained 4,233 significant functional associations among 1,754 proteins. 
Further comparisons of their KEGG and GO annotations allowed us to assign 466 KEGG pathway annotations to 274 proteins and 123 GO annotations to 114 proteins with estimated false discovery rates of <21% for KEGG and <30% for GO. We clustered 1,729 proteins by their functional associations and performed pathway analysis to identify several subclusters that are highly enriched in certain signaling pathways. In particular, we performed a detailed analysis of a subcluster enriched in the transforming growth factor β signaling pathway (P < 10^-50), which is important in cell proliferation and tumorigenesis. Analysis of another four subclusters also suggested potential new players in six signaling pathways worthy of further experimental investigations. Our study gives clear insight into the common neighbor-based prediction scheme and provides a reliable method for large-scale functional annotations in this post-genomic era.
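The idea of scoring two proteins by an unusually large number of shared neighbors can be sketched as follows. This is an illustration only: the hypergeometric tail used here is a stand-in chosen for simplicity, not necessarily the statistic of Samanta and Liang or of this study, and the function names and the toy network are hypothetical. The network is assumed to be a dictionary mapping each protein to the set of its interaction partners.

```python
from math import comb


def shared_neighbors(adj, a, b):
    """Return the set of interaction partners common to proteins a and b."""
    return adj[a] & adj[b]


def shared_neighbor_pvalue(adj, a, b):
    """Probability of at least the observed number of shared neighbors.

    Simplifying assumption: the neighbors of b are drawn uniformly at random
    from all proteins in the network (a and b themselves are not excluded).
    """
    n_total = len(adj)
    n_a, n_b = len(adj[a]), len(adj[b])
    m = len(shared_neighbors(adj, a, b))
    # Hypergeometric upper tail: P(X >= m).
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(m, min(n_a, n_b) + 1)
    )
    return tail / comb(n_total, n_b)


if __name__ == "__main__":
    # Toy network: A and B share all three of their partners.
    toy = {
        "A": {"C", "D", "E"}, "B": {"C", "D", "E"},
        "C": {"A", "B"}, "D": {"A", "B"}, "E": {"A", "B"},
        "F": {"G"}, "G": {"F"},
    }
    print(shared_neighbor_pvalue(toy, "A", "B"))
```

A small p-value marks a pair as a candidate functional association. A hub-aware variant, as described in the abstract, would additionally down-weight neighbors with very high degree; that step is not shown here.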
Abstract:
It is a challenge to measure the impact of releasing data to the public since the effects may not be directly linked to particular open data activities or substantial impact may only occur several years after publishing the data. This paper proposes a framework to assess the impact of releasing open data by applying the Social Return on Investment (SROI) approach. SROI was developed for organizations intended to generate social and environmental benefits thus fitting the purpose of most open data initiatives. We link the four steps of SROI (input, output, outcome, impact) with the 14 high-value data categories of the G8 Open Data Charter to create a matrix of open data examples, activities, and impacts in each of the data categories. This Impact Monitoring Framework helps data providers to navigate the impact space of open data laying out the conceptual basis for further research.
Abstract:
High-frequency data collected continuously over a multiyear time frame are required for investigating the various agents that drive ecological and hydrodynamic processes in estuaries. Here, we present water quality and current in-situ observations from a fixed monitoring station operating from 2008 to 2014 in the lower Guadiana Estuary, southern Portugal (37°11.30' N, 7°24.67' W). The data were recorded by a multi-parametric probe providing hourly records (temperature, salinity, chlorophyll, dissolved oxygen, turbidity, and pH) at a water depth of ~1 m, and by a bottom-mounted acoustic Doppler current profiler measuring the pressure, near-bottom temperature, and flow velocity through the water column every 15 min. The time-series data, particularly those from the probe, contain substantial gaps arising from equipment failure and maintenance, which are unavoidable for this type of observation in harsh environments. However, prolonged (months-long) periods of multi-parametric observations under contrasting external forcing conditions are available. The raw data are reported together with flags indicating the quality status of each record. River discharge data from two hydrographic stations located near the estuary head are also provided to support data analysis and interpretation.
Abstract:
Based on discrete samples, we report new high-resolution records of the ~185 kyr Iceland Basin (IB) geomagnetic excursion from Ocean Drilling Program (ODP) Site 1063 on the Bermuda Rise (sedimentation rate 32 cm/kyr) and from ODP Site 983 in the far North Atlantic (sedimentation rate 18 cm/kyr). Two records from Holes 1063A and 1063B are very consistent, and provide the highest resolution of the detailed field behaviour during the IB excursion obtained so far. Inclination records from Holes 983B and 983C in the far North Atlantic are also very consistent, whereas declination anomalies deviate more notably. The pseudo-Thellier (PT) technique was applied along with more conventional palaeointensity proxies (NRM/ARM and NRM/kappa) to recover relative palaeointensity (RPI) estimates from Hole 1063A and Hole 983B. As expected, these proxies indicate that the field intensity generally dropped at both sites during the IB excursion, but also that the history of RPI from the two sites is different. VGPs from Site 1063 indicate that the field at this location experienced some stop-and-go behaviour between patches of intense vertical flux over North America and the tip of South America, areas which coincide fairly well with patches of preferred transitional VGP clustering from reversals and zones of high seismic velocity in the lower mantle. Changes in RPI at this location were generally gradual, possibly due to the proximity of these flux patches, and the first period of VGP-clustering over North America was accompanied by a conspicuous increase in RPI. VGPs from Site 983 track along a different path, and the associated RPI changes are very abrupt and completely synchronous with the onset and termination of the excursion. The differing VGP paths from Sites 1063 and 983 indicate that the global field structure during the IB excursion was not dominated by a single dipole.
Abstract:
The mid-Piacenzian (MP) warm period (3.264-3.025 Ma) has been identified as the most recent time in geologic history during which mean global surface temperatures were considerably warmer than today for a sustained period. This interval has therefore been proposed as a potential (albeit imperfect) analog for future climate change and as such, has received much scientific attention over the past two decades. Central to this research effort is the Pliocene Research, Interpretation, and Synoptic Mapping (PRISM) project, an iterative paleoenvironmental reconstruction of the MP focused on increasing our understanding of warm-period climate forcings, dynamics, and feedbacks by providing three-dimensional data sets for general circulation models. A mainstay of the PRISM project has been the development of a global sea surface temperature (SST) data set based primarily upon quantitative analyses of planktic foraminifer assemblages, supplemented with geochemical SST estimates wherever possible. In order to improve spatial coverage of the PRISM faunal data set in the low and mid-latitude North Atlantic, this study provides a description of the MP planktic foraminifer assemblage from five Ocean Drilling Program sites (951, 958, 1006, 1062, and 1063) in the subtropical gyre, a region critical to Atlantic Ocean circulation and tropical heat advection. Assemblages from each core provide evidence for a temperature- and circulation-driven 5-10° northward displacement of MP faunal provinces, as well as regional shifts in planktic foraminifer populations linked to species ecology and interactions. General biogeographic trends also indicate that, relative to modern conditions, gyre circulation was stronger (particularly the Gulf Stream, North Atlantic Current, and North Equatorial Current) and meridionally broader. 
A comparison of mid-Piacenzian and modern North Atlantic planktic foraminifer assemblages suggests that low latitude western boundary currents were less than 1 °C warmer while eastern boundary currents were ~1-2 °C warmer, supporting the hypothesis of enhanced northward heat advection along western boundary currents and warming of high latitude Northeast Atlantic source regions for the Canary Current. These findings are consistent with a model of reduced meridional SST gradients, with little-to-no low latitude warming, and more vigorous ocean circulation. Results therefore support the theory that enhanced meridional overturning circulation and associated northward heat advection made an important contribution, in conjunction with elevated atmospheric CO2 concentrations, to the 2-3 °C global surface temperature increase (relative to today) and strong polar amplification of SST warmth during the MP warm period.
Abstract:
Pragmatism is the leading motivation of regularization. We can understand regularization as a modification of the maximum-likelihood estimator so that a reasonable answer can be given in an unstable or ill-posed situation. To mention some typical examples, this happens when fitting parametric or non-parametric models with more parameters than data or when estimating large covariance matrices. Regularization is also commonly used to improve the bias-variance tradeoff of an estimation. The definition of regularization is therefore quite general, and, although the introduction of a penalty is probably the most popular type, it is just one out of multiple forms of regularization. In this dissertation, we focus on the applications of regularization for obtaining sparse or parsimonious representations, where only a subset of the inputs is used. A particular form of regularization, L1-regularization, plays a key role in reaching sparsity. Most of the contributions presented here revolve around L1-regularization, although other forms of regularization are explored (also pursuing sparsity in some sense). In addition to presenting a compact review of L1-regularization and its applications in statistics and machine learning, we devise methodology for regression, supervised classification and structure induction of graphical models. Within the regression paradigm, we focus on kernel smoothing learning, proposing techniques for kernel design that are suitable for high dimensional settings and sparse regression functions. We also present an application of regularized regression techniques for modeling the response of biological neurons. Supervised classification advances deal, on the one hand, with the application of regularization for obtaining a naïve Bayes classifier and, on the other hand, with a novel algorithm for brain-computer interface design that uses group regularization in an efficient manner. 
Finally, we present a heuristic for inducing structures of Gaussian Bayesian networks using L1-regularization as a filter.
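How an L1 penalty produces sparsity can be seen in a small coordinate-descent lasso. This is a generic textbook sketch, not the methodology of the dissertation; the function names, the objective scaling (1/(2n) times the squared error plus `lam` times the L1 norm of the weights), and the toy data are assumptions made for illustration. The key step is the soft-thresholding operator, which sets a coefficient exactly to zero whenever its correlation with the residual is no larger than the penalty.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1.

    X is a list of rows, y a list of targets; returns the weight list w.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return w


if __name__ == "__main__":
    # The second input barely influences y, so the penalty removes it.
    X = [[1, 0], [0, 1], [1, 0], [0, 1]]
    y = [2, 0.1, 2, 0.1]
    print(lasso_cd(X, y, lam=0.1))
```

With a penalty of zero the same loop reduces to ordinary least squares by coordinate descent, which makes the effect of the L1 term easy to compare.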