935 results for subset comparisons
Abstract:
Envisat Advanced Synthetic Aperture Radar (ASAR) Wide Swath Mode (WSM) images are used to derive C-band HH-polarization normalized radar cross sections (NRCS). These are compared with ice-core analysis and visual ship-based observations of snow and ice properties recorded according to the Antarctic Sea Ice Processes and Climate (ASPeCt) protocol during two International Polar Year summer cruises (Oden 2008 and Palmer 2009) in West Antarctica. Thick first-year (TFY) and multi-year (MY) ice were the dominant ice types. The NRCS value ranges between -16.3 ± 1.1 and -7.6 ± 1.0 dB for TFY ice, and is -12.6 ± 1.3 dB for MY ice; for TFY ice, NRCS values increase from ~-15 dB to ~-9 dB from December/January to mid-February. The in situ and ASPeCt observations are not, however, detailed enough to interpret the observed NRCS change over time. Co-located Advanced Microwave Scanning Radiometer-Earth Observing System (AMSR-E) vertically polarized 37 GHz brightness temperatures (TB37V), as 7-day and 1-day averages together with the TB37V difference between ascending and descending AMSR-E overpasses, suggest that the low NRCS values (~-15 dB) are associated with snowmelt still in progress, while the change towards higher NRCS values (~-9 dB) is caused by the commencement of melt-refreeze cycles after about mid-January.
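NRCS values like those above are conventionally expressed in decibels. As a minimal sketch of how per-ice-type statistics such as -12.6 ± 1.3 dB could be derived from linear-power backscatter samples (the sample values below are hypothetical, not from the study):

```python
import numpy as np

def nrcs_db(sigma0_linear):
    """Convert linear-power normalized radar cross section to decibels."""
    return 10.0 * np.log10(sigma0_linear)

# Hypothetical per-sample linear backscatter values for two ice classes.
samples = {
    "TFY": np.array([0.023, 0.031, 0.055, 0.120]),  # thick first-year ice
    "MY": np.array([0.048, 0.062, 0.051]),          # multi-year ice
}
for ice_type, sigma0 in samples.items():
    db = nrcs_db(sigma0)
    print(f"{ice_type}: {db.mean():.1f} dB +/- {db.std(ddof=1):.1f} dB")
```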
Abstract:
We examined controls on the carbon isotopic composition of sea ice brines and organic matter during cruises to the Ross Sea, Antarctica, in November/December 1998 and November/December 2006. Brine samples were analyzed for salinity, nutrients, total dissolved inorganic carbon (ΣCO2), and the 13C/12C ratio of ΣCO2 (δ13C(ΣCO2)). Particulate organic matter from sea ice cores was analyzed for percent particulate organic carbon (POC), percent total particulate nitrogen (TPN), and stable carbon isotopic composition (δ13C(POC)). ΣCO2 in sea ice brines ranged from 1368 to 7149 µmol/kg, equivalent to 1483 to 2519 µmol/kg when normalized to a salinity of 34.5 psu (sΣCO2), the average salinity of Ross Sea surface waters. Sea ice primary producers removed up to 34% of the available ΣCO2, an amount much higher than the maximum removal observed in ice-free water. Carbonate precipitation and CO2 degassing may reduce sΣCO2 by a similar amount (e.g., 30%) in the most hypersaline sea ice environments, although brine volumes are low in the very cold ice that supports these brines. Brine δ13C(ΣCO2) ranged from -2.6 to +8.0‰, while δ13C(POC) ranged from -30.5 to -9.2‰. Isotopic enrichment of the ΣCO2 pool via net community production accounts for some, but not all, of the carbon isotopic enrichment of sea ice POC. Comparisons of sΣCO2, δ13C(ΣCO2), and δ13C(POC) within sea ice suggest that εp (the net photosynthetic fractionation factor) for sea ice algae is ~8‰ smaller than the εp observed for phytoplankton in open water regions of the Ross Sea. These results have implications for modeling carbon uptake and transformation in the ice-covered ocean and for reconstructing past sea ice extent from the stable isotopic composition of organic matter in sediment cores.
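The salinity normalization behind sΣCO2 is a linear scaling to the reference salinity. A minimal sketch, using a hypothetical brine salinity of 100 psu (the abstract reports concentrations, not the individual brine salinities assumed here):

```python
def salinity_normalized_dic(dic_umol_kg, brine_salinity, ref_salinity=34.5):
    """Normalize brine ΣCO2 to a reference salinity: sΣCO2 = ΣCO2 * S_ref / S."""
    return dic_umol_kg * ref_salinity / brine_salinity

# Hypothetical hypersaline brine: ΣCO2 = 7149 umol/kg at an assumed salinity of 100 psu.
s_dic = salinity_normalized_dic(7149.0, 100.0)
print(f"sΣCO2 = {s_dic:.0f} umol/kg at 34.5 psu")  # ~2466, within the reported range
```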
Abstract:
The growing importance of noise pollution has led to the creation of many new noise testing laboratories in recent years. For this reason, and because of the legal implications that noise reporting may have, procedures are needed to guarantee the quality of the testing and of its results. For instance, the ISO/IEC standard 17025:2005 specifies general requirements for the competence of testing laboratories. In this standard, interlaboratory comparisons are one of the main measures that must be applied to guarantee the quality of laboratories when applying specific testing methodologies. In the specific case of environmental noise, round robin tests are usually hard to design, as it is difficult to find scenarios that remain available and controlled while the participants carry out their measurements. Monitoring and controlling the factors that can influence the measurements (source emissions, propagation, background noise…) is not usually affordable, so the most widespread solution is to create very simple scenarios, in which most of the factors that can influence the results are excluded (sampling, processing of results, background noise, source detection…). The new approach described in this paper only requires the organizer to make actual measurements (or prepare virtual ones). Applying and interpreting a common reference document (standard, regulation…), the participants must analyze these input data independently to produce their results, which are then compared across participants. The measurement costs are severely reduced for the participants, there is no need to monitor the scenario conditions, and almost any relevant factor can be included in this methodology.
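The abstract leaves the comparison statistic open; one common way to compare participants' independently derived results in interlaboratory exercises is the proficiency-testing z-score. A hypothetical sketch (the laboratory names, levels, and tolerances are illustrative, not from the paper):

```python
def z_scores(results, reference, sigma_pt):
    """Proficiency-testing z-scores: z = (x - x_ref) / sigma_pt.

    Conventionally, |z| <= 2 is satisfactory and |z| >= 3 unsatisfactory.
    """
    return {lab: (x - reference) / sigma_pt for lab, x in results.items()}

# Hypothetical noise levels (dB) reported by participants from the same input data.
print(z_scores({"lab_A": 61.2, "lab_B": 60.8, "lab_C": 63.5},
               reference=61.0, sigma_pt=1.0))
```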
Abstract:
This paper studies feature subset selection in classification using a multiobjective estimation of distribution algorithm. We consider six functions, namely area under the ROC curve, sensitivity, specificity, precision, F1 measure, and Brier score, for the evaluation of feature subsets and as the objectives of the problem. One characteristic of these objective functions is the noise in their values, which should be appropriately handled during optimization. Our proposed algorithm consists of two major techniques specially designed for the feature subset selection problem. The first is a solution ranking method based on interval values to handle the noise in the objectives of this problem. The second is a model estimation method for learning a joint probabilistic model of objectives and variables, which is used to generate new solutions and advance through the search space. To simplify model estimation, l1-regularized regression is used to select a subset of problem variables before model learning. The proposed algorithm is compared with a well-known ranking method for interval-valued objectives and a standard multiobjective genetic algorithm. In particular, the effects of the two new techniques are experimentally investigated. The experimental results show that the proposed algorithm obtains comparable or better performance on the tested datasets.
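As a concrete illustration of the six objectives, a minimal sketch that scores one candidate feature subset with a logistic model (an assumption; the paper does not fix the base classifier, and in practice these values would be estimated by resampling, which is precisely what makes them noisy):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (roc_auc_score, recall_score, precision_score,
                             f1_score, brier_score_loss, confusion_matrix)

def evaluate_subset(X, y, subset):
    """Score one feature subset on the six objectives named in the paper.

    NB: scored on the training data for brevity; cross-validation would be
    used in practice, introducing the noise the paper's ranking method handles.
    """
    model = LogisticRegression(max_iter=1000).fit(X[:, subset], y)
    proba = model.predict_proba(X[:, subset])[:, 1]
    pred = (proba >= 0.5).astype(int)
    tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
    return {
        "auc": roc_auc_score(y, proba),
        "sensitivity": recall_score(y, pred),        # tp / (tp + fn)
        "specificity": tn / (tn + fp),
        "precision": precision_score(y, pred, zero_division=0),
        "f1": f1_score(y, pred),
        "brier": brier_score_loss(y, proba),         # lower is better
    }

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
print(evaluate_subset(X, y, subset=[0, 3, 5, 11]))
```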
Abstract:
Machine learning techniques are used for extracting valuable knowledge from data. Nowadays, these techniques are becoming even more important due to the evolution in data acquisition and storage, which is leading to data with different characteristics that must be exploited. Therefore, advances in data collection must be accompanied by advances in machine learning techniques to solve the new challenges that arise, in both academic and real applications. There are several machine learning techniques, depending on both the data characteristics and the purpose. Unsupervised classification, or clustering, is one of the best-known techniques when data lack supervision (unlabeled data) and the aim is to discover data groups (clusters) according to their similarity. On the other hand, supervised classification needs data with supervision (labeled data), and its aim is to make predictions about the labels of new data. The presence of data labels is a very important characteristic that guides not only the learning task but also related tasks such as validation. When only some of the available data are labeled while the others remain unlabeled (partially labeled data), neither clustering nor supervised classification can be used. This scenario, which is becoming common nowadays because of the cost or difficulty of the labeling process, is tackled with semi-supervised learning techniques. This thesis focuses on the branch of semi-supervised learning closest to clustering, i.e., discovering clusters using the available labels as support to guide and improve the clustering process. Another important data characteristic, different from the presence of data labels, is the relevance of the data features. Data are characterized by features, but it is possible that not all of them are relevant, or equally relevant, for the learning process. A recent clustering tendency, related to data relevance and called subspace clustering, claims that different clusters might be described by different feature subsets. This differs from traditional solutions to the data relevance problem, where a single feature subset (usually the complete set of original features) is found and used to perform the clustering process. The proximity of this work to clustering leads to the first goal of this thesis. As commented above, clustering validation is a difficult task due to the absence of data labels. Although there are many indices that can be used to assess the quality of clustering solutions, these validations depend on the clustering algorithms and the data characteristics. Hence, in the first goal, three well-known clustering algorithms are used to cluster data with outliers and noise, in order to critically study how some of the best-known validation indices behave. The main goal of this work, however, is to combine semi-supervised clustering with subspace clustering to obtain clustering solutions that can be correctly validated using either known indices or expert opinions. Two different algorithms are proposed, from different points of view, to discover clusters characterized by different subspaces. The first algorithm uses the available data labels to search for subspaces first, before searching for clusters. It assigns each instance to only one cluster (hard clustering) and is based on mapping the known labels to subspaces using supervised classification techniques. The subspaces are then used to find clusters using traditional clustering techniques.
The second algorithm uses the available data labels to search for subspaces and clusters at the same time in an iterative process. It assigns each instance to each cluster with a membership probability (soft clustering) and is based on integrating the known labels and the search for subspaces into a model-based clustering approach. The different proposals are tested using several real and synthetic databases, and comparisons with other methods are included when appropriate. Finally, as an example of a real and current application, different machine learning techniques, including one of the proposals of this work (the most sophisticated one), are applied to one of the most challenging biological problems today: human brain modeling. Specifically, expert neuroscientists do not agree on a neuron classification for the cerebral cortex, which not only makes any modeling attempt impossible but also hampers day-to-day work in the absence of a common way to name neurons. Machine learning techniques may therefore help to reach an accepted solution to this problem, which could be an important milestone for future research in neuroscience.
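A deliberately simplified, single-subspace sketch of the first proposal's pipeline (labels are first mapped to a relevant feature subset via a supervised model, and a traditional hard clustering then runs in that subspace); the thesis goes further by allowing a different subspace per cluster, and the model and parameters below are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import KMeans

def label_guided_subspace_clustering(X, y_partial, n_clusters, top_k=5):
    """Use the labeled instances to select a feature subspace, then cluster.

    y_partial holds class labels for some rows and -1 for unlabeled rows.
    """
    labeled = y_partial != -1
    # Rank feature relevance with a supervised model fit on the labeled part.
    rf = RandomForestClassifier(n_estimators=200, random_state=0)
    rf.fit(X[labeled], y_partial[labeled])
    subspace = np.argsort(rf.feature_importances_)[::-1][:top_k]
    # Hard-cluster all instances (labeled and unlabeled) in that subspace.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    return km.fit_predict(X[:, subspace]), subspace
```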
Abstract:
Physical activity guidelines during pregnancy
Abstract:
The objective of this paper is to provide performance metrics for the small-signal stability assessment of a given system architecture. The stability margins are stated using the concept of maximum peak criteria (MPC), derived from the behavior of an impedance-based sensitivity function. For each minor-loop gain defined at every system interface, a single number stating the robustness of stability is provided, based on the computed maximum value of the corresponding sensitivity function. In order to compare various power-architecture solutions in terms of stability, a parameter providing an overall measure of whole-system stability is required. The selected figure of merit is the geometric average of the maximum peak values within the system. It provides a meaningful metric for system comparisons: the best system in terms of robust stability is the one that minimizes this index. In addition, the largest peak value among the system interfaces is reported, thus identifying the weakest point of the system in terms of robustness.
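A minimal numerical sketch of the proposed figure of merit, assuming each minor-loop gain L is available sampled on a common frequency grid and using the standard sensitivity function S = 1/(1 + L); the example transfer functions are hypothetical:

```python
import numpy as np

def sensitivity_peak(minor_loop_gain):
    """Maximum peak of the sensitivity function S = 1/(1 + L) over frequency."""
    return np.max(np.abs(1.0 / (1.0 + minor_loop_gain)))

def system_stability_index(minor_loop_gains):
    """Geometric average of the per-interface maximum peaks (lower = more robust),
    plus the single largest peak, which flags the weakest interface."""
    peaks = np.array([sensitivity_peak(L) for L in minor_loop_gains])
    return peaks.prod() ** (1.0 / len(peaks)), peaks.max()

# Hypothetical minor-loop gains sampled on a common frequency grid (rad/s).
w = np.logspace(1, 5, 500)
L1 = 10.0 / (1j * w / 200.0 + 1.0) ** 2                         # interface 1
L2 = 3.0 / ((1j * w / 50.0 + 1.0) * (1j * w / 2000.0 + 1.0))    # interface 2
gm, worst = system_stability_index([L1, L2])
print(f"geometric mean peak: {gm:.2f}, worst interface peak: {worst:.2f}")
```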
Abstract:
Nowadays, the standardization of test methods and the accreditation of laboratories are common practice, since they allow the evaluation of the procedures carried out by professionals in a technological sector and ensure a minimum quality in the final results. For an acoustics laboratory to achieve and maintain accreditation, it is necessary to participate actively in intercomparison exercises, since these are used to assure the quality of the methods applied by the technicians. Unfortunately, the high cost of these exercises is unaffordable for many laboratories, which are then forced to give up their accreditation.
This Final Degree Project focuses on the development of a Virtual Laboratory, implemented as a software tool, that will be used to carry out non-attendance intercomparison exercises, widening the e-comparison concept and laying the groundwork for this type of remote exercise to eventually replace the ones currently carried out. First, as a short introduction, I present the evolution and the present-day importance of acoustic quality procedures. Second, I discuss the international standards on which the project is based, such as ISO 145-5, as well as the mathematical and statistical methods of uncertainty propagation specified by the JCGM (Joint Committee for Guides in Metrology). Third, I describe the structure of the project, both the programming approach used in its development and the calculation methodology used to ensure that all the functionality required by this type of test is correctly implemented. Later, a statistical validation is carried out, based on comparing data generated by the program and processed using a Monte Carlo simulation against analytical calculations, to verify that the program works as planned in the theoretical study. A test of the program, similar to the one a laboratory technician would perform, is also carried out, in which the measurement uncertainty calculated with the traditional method is compared with the expected results. Finally, the conclusions drawn from the development and testing of the Virtual Laboratory are discussed, and new lines of future research related to the e-comparison concept and to improvements of the Virtual Laboratory are proposed.
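The validation step pairs a JCGM-style Monte Carlo propagation with an analytical calculation. A sketch of that idea on a hypothetical measurement model (a background-corrected level difference; the model and the input uncertainties are illustrative assumptions, not taken from ISO 145-5):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000  # Monte Carlo trials

# Hypothetical measurement model: level difference D = L1 - L2 + K, with each
# input drawn from its assumed uncertainty distribution.
L1 = rng.normal(78.0, 0.5, N)      # source-room level, dB (Gaussian)
L2 = rng.normal(45.0, 0.5, N)      # receiving-room level, dB (Gaussian)
K = rng.uniform(-0.3, 0.3, N)      # correction, dB (rectangular)

D = L1 - L2 + K
mean, u = D.mean(), D.std(ddof=1)            # estimate and standard uncertainty
lo, hi = np.percentile(D, [2.5, 97.5])       # 95 % coverage interval
print(f"D = {mean:.2f} dB, u = {u:.2f} dB, 95% CI [{lo:.2f}, {hi:.2f}]")

# Traditional analytical propagation for comparison: u^2 = u1^2 + u2^2 + a^2/3,
# where a is the half-width of the rectangular distribution.
u_gum = np.sqrt(0.5**2 + 0.5**2 + 0.3**2 / 3.0)
print(f"analytical u = {u_gum:.3f} dB")
```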
Abstract:
Apolipoprotein(a) [apo(a)] is the distinguishing protein component of lipoprotein(a), a major inherited risk factor for atherosclerosis. Human apo(a) is homologous to plasminogen. It contains from 15 to 50 repeated domains closely related to plasminogen kringle four, plus single kringle five-like and inactive protease-like domains. Expression of this gene is confined to a subset of primates. Although most mammals lack apo(a), hedgehogs produce an apo(a)-like protein composed of highly repeated copies of a plasminogen kringle three-like domain, with a complete absence of protease domain sequences. Both the human and hedgehog apo(a)-like proteins form covalently linked lipoprotein particles that can bind to fibrin and other substrates shared with plasminogen. DNA sequence comparisons and phylogenetic analysis indicate that the human type of apo(a) evolved from a duplicated plasminogen gene during recent primate evolution. In contrast, the kringle three-based type of apo(a) evolved from an independent duplication of the plasminogen gene approximately 80 million years ago. In a type of convergent evolution, the plasminogen gene has thus been independently remodeled twice during mammalian evolution to produce similar forms of apo(a) in two widely divergent groups of species.
Abstract:
Self-incompatibility in Brassica is controlled by a single multi-allelic locus (the S locus), which contains at least two highly polymorphic genes expressed in the stigma: an S glycoprotein gene (SLG) and an S receptor kinase gene (SRK). The putative ligand-binding domain of SRK exhibits high homology to the secretory protein SLG, and it is believed that SLG and SRK form an active receptor kinase complex with a self-pollen ligand, which leads to the rejection of self-pollen. Here, we report 31 novel SLG sequences of Brassica oleracea and Brassica campestris. Sequence comparisons of a large number of SLG alleles and SLG-related genes revealed the following points. (i) The striking sequence similarity observed in an inter-specific comparison (95.6% identity between SLG14 of B. oleracea and SLG25 of B. campestris in deduced amino acid sequence) suggests that SLG diversification predates speciation. (ii) A perfect match of the hypervariable-region sequences, which are thought to determine S specificity, in an intra-specific comparison (SLG8 and SLG46 of B. campestris), together with the observation that the hypervariable regions of SLG and SRK of the same S haplotype are not necessarily highly similar, suggests that SLG and SRK bind different sites of the pollen ligand and that they together determine S specificity. (iii) Comparison of the hypervariable regions of SLG alleles suggests that intragenic recombination, together with point mutations, has contributed to the generation of the high level of sequence variation among SLG alleles. Models for the evolution of SLG/SRK are presented.
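Identity figures such as the 95.6% quoted above come from straightforward pairwise comparison of aligned sequences. A toy, gap-free sketch (real comparisons first compute a proper alignment):

```python
def percent_identity(seq_a, seq_b):
    """Percent identity between two aligned, equal-length sequences
    (gap-free toy version)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

print(percent_identity("MKSLVACLLF", "MKSLVACLLY"))  # 90.0
```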
Abstract:
The immunosuppressant rapamycin inhibits Tor1p and Tor2p (target of rapamycin proteins), ultimately resulting in cellular responses characteristic of nutrient deprivation through a mechanism involving translational arrest. We measured the immediate transcriptional response of yeast grown in rich media and treated with rapamycin to investigate the direct effects of Tor proteins on nutrient-sensitive signaling pathways. The results suggest that Tor proteins directly modulate the glucose activation and nitrogen discrimination pathways and the pathways that respond to the diauxic shift (including glycolysis and the citric acid cycle). Tor proteins do not directly modulate the general amino acid control, nitrogen starvation, or sporulation (in diploid cells) pathways. Poor nitrogen quality activates the nitrogen discrimination pathway, which is controlled by the complex of the transcriptional repressor Ure2p and activator Gln3p. Inhibiting Tor proteins with rapamycin increases the electrophoretic mobility of Ure2p. The work presented here illustrates the coordinated use of genome-based and biochemical approaches to delineate a cellular pathway modulated by the protein target of a small molecule.