Abstract:
Locally weighted regression is a technique that predicts the response for new data items from their neighbors in the training data set, where closer data items are assigned higher weights in the prediction. However, the original method may suffer from overfitting and fail to select the relevant variables. In this paper we propose combining a regularization approach with locally weighted regression to achieve sparse models. Specifically, we use the lasso, a shrinkage and selection method for linear regression. We present an algorithm that embeds the lasso in an iterative procedure that alternately computes weights and performs lasso-wise regression. The algorithm is tested on three synthetic scenarios and two real data sets. Results show that the proposed method outperforms linear and local models for several kinds of scenarios.
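As an illustration of the idea of combining kernel weights with an l1 penalty, the following is a minimal sketch of a locally weighted lasso fit at a single query point. It is not the algorithm of the paper: the Gaussian kernel, the coordinate-descent solver, the single (non-alternating) pass, and all names (`gaussian_weights`, `weighted_lasso`, `predict_local_lasso`, `bandwidth`, `lam`) are assumptions made for this example.

```python
import numpy as np

def gaussian_weights(X, x0, bandwidth):
    """Kernel weights: training points closer to the query x0 get larger weights."""
    d2 = np.sum((X - x0) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def weighted_lasso(X, y, w, lam, n_iter=200):
    """Minimise 0.5 * sum_i w_i (y_i - b0 - x_i . beta)^2 + lam * ||beta||_1
    by cyclic coordinate descent (the intercept b0 is not penalised)."""
    n, p = X.shape
    beta = np.zeros(p)
    b0 = 0.0
    for _ in range(n_iter):
        # Intercept update: weighted mean of the current residual.
        b0 = np.sum(w * (y - X @ beta)) / np.sum(w)
        for j in range(p):
            # Partial residual that leaves out the contribution of feature j.
            r = y - b0 - X @ beta + X[:, j] * beta[j]
            rho = np.sum(w * X[:, j] * r)
            denom = np.sum(w * X[:, j] ** 2)
            beta[j] = soft_threshold(rho, lam) / denom if denom > 0 else 0.0
    return b0, beta

def predict_local_lasso(X, y, x0, bandwidth=1.0, lam=0.1):
    """Fit a sparse local linear model around x0 and predict the response there."""
    w = gaussian_weights(X, x0, bandwidth)
    b0, beta = weighted_lasso(X, y, w, lam)
    return b0 + x0 @ beta, beta
```

Calling `predict_local_lasso(X, y, x0)` returns both the prediction and the local coefficient vector, so the zero entries of `beta` indicate which variables were dropped in the neighborhood of `x0`. An iterative variant would recompute the weights between lasso fits, as the abstract describes, but the exact update rule is not given there and is left out of this sketch.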
Abstract:
The emerging use of real-time 3D-based multimedia applications imposes strict quality of service (QoS) requirements on both access and core networks. These requirements, and their impact on the provision of end-to-end 3D videoconferencing services, have been studied within the Spanish-funded VISION project, where different scenarios were implemented showing an agile stereoscopic video call that might be offered to the general public in the near future. In view of the requirements, we designed an integrated access and core converged network architecture which provides the requested QoS to end-to-end IP sessions. Novel functional blocks are proposed to control core optical networks, the functionality of the standard ones is redefined, and the signaling is improved to better meet the requirements of future multimedia services. An experimental test-bed to assess the feasibility of the solution was also deployed. In this test-bed, the set-up and release of end-to-end sessions meeting specific QoS requirements are shown, and the impact of QoS degradation on user-perceived quality is quantified. In addition, scalability results show that the proposed signaling architecture is able to cope with a large number of requests while introducing almost negligible delay.
Abstract:
3D video technologies have been on the rise in recent years, with abundant research advances coupled with widespread adoption by the film industry and growing importance in consumer electronics. Related to this is the concept of multiview video, which encompasses 3D video and can be defined as a video stream composed of two or more views. Multiview video enables advanced video features such as stereoscopic video, free viewpoint video, improved eye contact through virtual views, and shared virtual environments. The purpose of this thesis is to overcome a considerable obstacle to the use of multiview video in communication systems: the lack of support for this technology in existing signaling protocols, which makes it impossible to set up a session with multiview video through standard mechanisms. Our main objective is therefore to extend the Session Initiation Protocol (SIP) to support the negotiation of multimedia sessions with multiview video streams. Our work can be summarized in three main contributions. First, we have defined a signaling extension for setting up SIP sessions with 3D video. This extension modifies the Session Description Protocol (SDP) to introduce a new media-level attribute and a new type of decoding dependency, which together describe the 3D video formats that can be used in a session, as well as the relationship between the video streams that make up a 3D video stream. The second contribution is an extension to SIP for handling the signaling of videoconferences with multiview video streams. Two new SIP event packages are defined to describe the capabilities and topology of conference terminals, on the one hand, and the spatial configuration and stream mapping of a conference, on the other. 
A mechanism is also described for integrating the exchange of this information into the SIP conference setup process. As the third and final contribution, we introduce the concept of the virtual space of a conference: a coordinate system that includes all the relevant objects of the conference (such as capture devices, displays, and users). We explain how the virtual space relates to conference features such as eye contact, video scale, and spatial faithfulness, and we provide rules for determining the features of a conference from the analysis of its virtual space, and for generating virtual spaces during conference setup.
Abstract:
High-gradient, stepped fluvial tufa systems with dammed areas existed in the River Añamaza valley (NW Iberian Ranges, Spain) during Quaternary times. Individual deposits range from a few meters to about 70 m in thickness and contain prograding-aggrading wedges separated by erosional surfaces. Several episodes of tufa formation have been distinguished by means of U-series, amino acid racemization and radiocarbon techniques. These correlate to MIS 8, 7, 5 and 1. The presence of MIS 9 is uncertain, as chronological data may also correspond to older stages. Most tufas in this area formed in MIS 5. Distinct tufa episodes can also be distinguished in the Holocene. These are the first chronological data presented for one of the northernmost Quaternary tufa systems in the Iberian Ranges.
Abstract:
This paper studies feature subset selection in classification using a multiobjective estimation of distribution algorithm. We consider six functions, namely area under ROC curve, sensitivity, specificity, precision, F1 measure and Brier score, for evaluation of feature subsets and as the objectives of the problem. One of the characteristics of these objective functions is the existence of noise in their values that should be appropriately handled during optimization. Our proposed algorithm consists of two major techniques which are specially designed for the feature subset selection problem. The first one is a solution ranking method based on interval values to handle the noise in the objectives of this problem. The second one is a model estimation method for learning a joint probabilistic model of objectives and variables which is used to generate new solutions and advance through the search space. To simplify model estimation, l1 regularized regression is used to select a subset of problem variables before model learning. The proposed algorithm is compared with a well-known ranking method for interval-valued objectives and a standard multiobjective genetic algorithm. Particularly, the effects of the two new techniques are experimentally investigated. The experimental results show that the proposed algorithm is able to obtain comparable or better performance on the tested datasets.
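To make the idea of ranking solutions with noisy, interval-valued objectives more concrete, here is a minimal sketch of one possible dominance check on intervals. It is a generic illustration and not the ranking method proposed in the paper: the "certain dominance" rule, the assumption that every objective is maximised (a minimised objective such as the Brier score would be negated first), and the names `certainly_dominates` and `non_dominated` are all assumptions of this example.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # (lower, upper) bound of a noisy objective value

def certainly_dominates(a: Sequence[Interval], b: Sequence[Interval]) -> bool:
    """True if a is at least as good as b on every objective even in the worst
    case (a's lower bound >= b's upper bound), and strictly better on one.
    All objectives are assumed to be maximised."""
    at_least_as_good = all(a_lo >= b_hi for (a_lo, _), (_, b_hi) in zip(a, b))
    strictly_better = any(a_lo > b_hi for (a_lo, _), (_, b_hi) in zip(a, b))
    return at_least_as_good and strictly_better

def non_dominated(solutions: List[Sequence[Interval]]) -> List[int]:
    """Indices of solutions that no other solution certainly dominates."""
    return [
        i
        for i, s in enumerate(solutions)
        if not any(
            certainly_dominates(t, s) for j, t in enumerate(solutions) if j != i
        )
    ]
```

With intervals taken, for example, from the spread of cross-validation scores of each feature subset, overlapping intervals leave two solutions incomparable, so noise alone cannot push a subset out of the first front.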
Abstract:
This paper proposes a new multi-objective estimation of distribution algorithm (EDA) based on joint modeling of objectives and variables. This EDA uses the multi-dimensional Bayesian network as its probabilistic model. In this way it can capture the dependencies between objectives, variables and objectives, as well as the dependencies learnt between variables in other Bayesian network-based EDAs. This model leads to a problem decomposition that helps the proposed algorithm to find better trade-off solutions to the multi-objective problem. In addition to Pareto set approximation, the algorithm is also able to estimate the structure of the multi-objective problem. To apply the algorithm to many-objective problems, the algorithm includes four different ranking methods proposed in the literature for this purpose. The algorithm is applied to the set of walking fish group (WFG) problems, and its optimization performance is compared with an evolutionary algorithm and another multi-objective EDA. The experimental results show that the proposed algorithm performs significantly better on many of the problems and for different objective space dimensions, and achieves comparable results on some compared with the other algorithms.
Abstract:
The twentieth century brought a new sensibility characterized by the discrediting of Cartesian rationality and the weakening of universal truths associated with aesthetic values such as order, proportion and harmony. In the middle of the century, theorists such as Theodor Adorno, Rudolf Arnheim and Anton Ehrenzweig warned about the transformation under way in the artistic field. Contemporary aesthetics seemed to have a new goal: to deny the idea of art as an organized, finished and coherent structure. Order had lost its privileged position. Disorder, probability, arbitrariness, accidentality, randomness, chaos, fragmentation, indeterminacy... Gradually, new terms were coined by aesthetic criticism to explain what had been happening since the beginning of the century. The first essays on the matter sought to provide new interpretative models based on, among other arguments, the phenomenology of perception, the recent discoveries of quantum mechanics, the deeper layers of the psyche or information theory. Overall, these were worthy attempts to give theoretical content to a situation that was as obvious as it was lacking in any founding charter. Finally, in 1962, Umberto Eco brought together all these efforts by proposing a single theoretical frame in his book Opera Aperta. In his view, all the aesthetic production of the twentieth century had one characteristic in common: its capacity to express multiplicity. For this reason, he considered that the nature of contemporary art was, above all, ambiguous. The aim of this research is to clarify the consequences of the incorporation of ambiguity into architectural theoretical discourse. We should start by making an accurate analysis of this concept. However, this task is quite difficult because ambiguity does not allow itself to be clearly defined. This concept has the disadvantage that its signifier is as imprecise as its signified. 
In addition, the negative connotations that ambiguity still has outside the aesthetic field stigmatize the term and make its use problematic. Another problem of ambiguity is that the contemporary subject is able to locate it in all situations. This means that, in addition to distinguishing ambiguity in contemporary productions, the subject also finds it in works belonging to remote ages and styles. For that reason, it could be said that everything is ambiguous. And that is correct, because ambiguity is somehow present in any creation of the imperfect human being. However, as Eco, Arnheim and Ehrenzweig pointed out, there are two major differences between current and past contexts. One affects the subject and the other the object. First, it is the contemporary subject, and no other, who has acquired the ability to value and assimilate ambiguity. Second, ambiguity was an unexpected aesthetic result in former periods, while in the contemporary object it has been codified and is deliberately present. In any case, as Eco did, we consider the term ambiguity appropriate for referring to the contemporary aesthetic field. Any other term with a more specific meaning would only show partial and limited aspects of a situation that is quite complex and difficult to diagnose. Contrary to what might normally be expected, in this case ambiguity is the term that fits best precisely because of its lack of specificity. In fact, this lack of specificity is what allows a dynamic condition to be assigned to the idea of ambiguity, one that would hardly be operative with other terms. Thus, instead of trying to define the idea of ambiguity, we will analyze how it has evolved and its consequences for the architectural discipline. Instead of trying to define what it is, we will examine what its presence has meant at each moment. We will deal with ambiguity as a constant presence that has always been latent in architectural production but whose nature has been modified over time. 
Eco, in the mid-twentieth century, distinguished between classical ambiguity and contemporary ambiguity. Now, half a century later, the challenge is to discern whether the idea of ambiguity has remained unchanged or has undergone a new transformation. What this research will demonstrate is that it is possible to detect a new transformation that has much to do with the cultural and aesthetic context of recent decades: the transition from modernism to postmodernism. This assumption leads us to establish two different levels of contemporary ambiguity, each related to one of these periods. The first level of ambiguity has been well known for many years. Its main characteristics are a codified multiplicity, interpretative freedom and an active subject who completes an object that is incomplete or indefinite. This level of ambiguity is related to the idea of indeterminacy, a concept successfully introduced into contemporary aesthetic language. The second level of ambiguity has gone almost unnoticed by architectural criticism, although it has been identified and studied in other theoretical disciplines. Much of the work of Fredric Jameson and Jean-François Lyotard offers reasonable evidence that the aesthetic production of postmodernism has transcended modern ambiguity to reach a new level in which, despite the existence of multiplicity, interpretative freedom and the active subject have been questioned and ultimately denied. In this period ambiguity seems to have reached a new level in which it is no longer possible to obtain a conclusive and complete interpretation of the object because it has become an unreadable device. Postmodern production offers a kind of inaccessible multiplicity and its nature is deeply contradictory. This hypothetical transformation of the idea of ambiguity has a striking analogy with that shown in the poetic analysis made by William Empson, published in 1930 in his Seven Types of Ambiguity. 
Empson established different levels of ambiguity and classified them according to their poetic effect. This arrangement followed an ascending logic towards incoherence. At the seventh level, where ambiguity is greatest, he located the contradiction between irreconcilable opposites. It could be said that contradiction, once it undermines the coherence of the object, was the best way that contemporary aesthetics found to confirm the Hegelian judgment according to which art would ultimately renounce its capacity to express truth. Much of the transformation of architecture throughout the last century is related to the active involvement of ambiguity in its theoretical discourse. In modern architecture ambiguity is present a posteriori, in the critical review made by theoreticians such as Colin Rowe, Manfredo Tafuri and Bruno Zevi. The publication of several studies on Mannerism in the forties and fifties rescued certain virtues of a historical style that had been undervalued because of its deviation from the Renaissance canon. Rowe, Tafuri and Zevi, among others, pointed out the similarities between Mannerism and certain qualities of modern architecture, both devoted to breaking previous dogmas. The recovery of Mannerism allowed ambiguity and modernity to be joined for the first time in the same sentence. In postmodernism, on the other hand, ambiguity is present ex professo, playing a prominent role in the theoretical discourse of the period. The distance between its analytical identification and its operational use quickly disappeared because of structuralism, an analytical methodology with the aspiration of becoming a modus operandi. Under its influence, architecture began to be identified and studied as a language. Thus, the postmodern theoretical project distinguished between the components of architectural language and developed them separately. Consequently, there is not one but three projects related to postmodern contradiction: a semantic project, a syntactic project and a pragmatic project. 
Leading these projects are prominent architects whose work showed a special interest in exploring and developing the potential of contradiction in architecture. Thus, Robert Venturi, Peter Eisenman and Rem Koolhaas established the main features through which architecture developed the dialectics of ambiguity, at its last and most extreme level, as a theoretical project in each component of architectural language. Robert Venturi developed a new interpretation of architecture based on its semantic component, Peter Eisenman did the same with its syntactic component, and Rem Koolhaas with its pragmatic component. With this approach, this research aims to establish a new reflection on the architectural transformation from modernity to postmodernity. It can also shed light on certain still-overlooked aspects that have shaped the architectural heritage of past decades, the consequence of a fruitful relationship between architecture and ambiguity and its provocative consummation in a contradictio in terminis. This research focuses primarily on the repercussions of the incorporation of ambiguity, in the form of contradiction, into postmodern architectural discourse through each of its three theoretical projects. It is therefore structured around a main chapter entitled Dialéctica de la ambigüedad como proyecto teórico postmoderno (Dialectics of ambiguity as a postmodern theoretical project), which is divided into three parts entitled Proyecto semántico. Robert Venturi; Proyecto sintáctico. Peter Eisenman; and Proyecto pragmático. Rem Koolhaas. The central chapter is complemented by two others placed at the beginning. The first, entitled Dialéctica de la ambigüedad contemporánea. Una aproximación (Dialectics of contemporary ambiguity. An approach), offers a chronological analysis of the evolution of the idea of ambiguity in twentieth-century aesthetic theory, without yet entering into architectural questions. 
The second, entitled Dialéctica de la ambigüedad como crítica del proyecto moderno (Dialectics of ambiguity as a critique of the modern project), examines the gradual incorporation of ambiguity into the critical revision of modernity, which would be of vital importance in enabling its later operative introduction in postmodernity. A final chapter, placed at the end of the text, proposes a series of Proyecciones (Projections) which, in light of the analysis in the preceding chapters, attempt a rereading of the current architectural context and its possible evolution, considering at all times that reflection on ambiguity still allows new discursive horizons to be glimpsed. Each double page of the thesis synthesizes the tripartite structure of the central chapter and, broadly speaking, the main methodological tool used in the research. Thus, the threefold semantic, syntactic and pragmatic aspect with which the postmodern theoretical project has been identified is reproduced here in a specific arrangement of images, footnotes and main body text. The images accompanying the main text are placed in the left-hand column. Their arrangement follows aesthetic and compositional criteria, emphasizing, as far as possible, their semantic condition. Next, to their right, are the footnotes. They are arranged in a column, and each note is placed at the same height as its corresponding reference in the main text. Their regulated arrangement, their value as notation and their possible equivalence to a deep structure allude to their syntactic condition. Finally, the main body of the text fully occupies the right half of each double page. Conceived as a continuous narrative with hardly any interruptions, its role in meeting the discursive demands of doctoral research corresponds to its pragmatic condition.
Abstract:
The book analyzes the operation known as Barrios en Remodelación, which transformed much of Madrid's periphery in the 1980s, building some 38,000 dwellings in which the residents who had previously lived in the same areas were rehoused. In doing so it effectively recognized their "right to the city", acknowledging that the remodeled space was the result of the inhabiting of those who had lived there before, and was therefore their city. The book covers the main aspects of an operation that delivered 38,000 dwellings in 28 neighborhoods with an investment, at prices of the period, of 220,000 million pesetas over a period of 10 years.
Abstract:
We have investigated OsHKT2;1 natural variation in a collection of 49 cultivars with different levels of salt tolerance and geographical origins. The effect of identified polymorphism on OsHKT2;1 activity was analysed through heterologous expression of variants in Xenopus oocytes. OsHKT2;1 appeared to be a highly conserved protein with only five possible amino acid substitutions that have no substantial effect on functional properties. Our study, however, also identified a new HKT isoform, No-OsHKT2;2/1 in Nona Bokra, a highly salt-tolerant cultivar. No-OsHKT2;2/1 probably originated from a deletion in chromosome 6, producing a chimeric gene. Its 5′ region corresponds to that of OsHKT2;2, whose full-length sequence is not present in Nipponbare but has been identified in Pokkali, a salt-tolerant rice cultivar. Its 3′ region corresponds to that of OsHKT2;1. No-OsHKT2;2/1 is essentially expressed in roots and displays a significant level of expression at high Na+ concentrations, in contrast to OsHKT2;1. Expressed in Xenopus oocytes or in Saccharomyces cerevisiae, No-OsHKT2;2/1 exhibited a strong permeability to Na+ and K+, even at high external Na+ concentrations, like OsHKT2;2, and in contrast to OsHKT2;1. Our results suggest that No-OsHKT2;2/1 can contribute to Nona Bokra salt tolerance by enabling root K+ uptake under saline conditions.
Abstract:
In this paper the Alpine cleavage affecting the Permo-Triassic series of the Espadán Range (Castellón) is studied. Cleavage affects argillites and sandstones in Saxonian and Buntsandstein facies, with varying degrees of penetrativity. At cartographic scale it is linked with the Espadán box anticline, which has a constant WNW-ESE trend. At microscopic scale it constitutes a "spaced cleavage" with a predominance of pressure solution and passive (mechanical) rotation of phyllosilicates. At outcrop scale the cleavage is characterized by a sigmoidal geometry, attributed both to post-cleavage flexural slip in competent layers and to syn-cleavage flexural flow in incompetent layers. The proposed kinematic model to explain its origin includes three main stages: 1) incipient development of cleavage linked to layer-parallel shortening, 2) buckling and an increase in cleavage penetrativity, and 3) fold amplification and layer-parallel simple shear.
Abstract:
Mutations in the TP53 gene are very common in human cancers, and are associated with poor clinical outcome. Transgenic mouse models lacking the Trp53 gene or that express mutant Trp53 transgenes produce tumours with malignant features in many organs. We previously showed the transcriptome of a p53-deficient mouse skin carcinoma model to be similar to those of human cancers with TP53 mutations and associated with poor clinical outcomes. This report shows that much of the 682-gene signature of this murine skin carcinoma transcriptome is also present in breast and lung cancer mouse models in which p53 is inhibited. Further, we report validated gene-expression-based tests for predicting the clinical outcome of human breast and lung adenocarcinoma. It was found that human patients with cancer could be stratified based on the similarity of their transcriptome with the mouse skin carcinoma 682-gene signature. The results also provide new targets for the treatment of p53-defective tumours.
Abstract:
ABSTRACT Nowadays, with the ongoing and rapid evolution of information technology and computing devices, large volumes of data are continuously collected and stored in different domains and through various real-world applications. Extracting useful knowledge from such a huge amount of data usually cannot be performed manually, and requires the use of adequate machine learning and data mining techniques. Classification is one of the most important techniques and has been successfully applied in several areas. Roughly speaking, classification consists of two main steps: first, learning a classification model or classifier from the available training data, and second, classifying new, unseen data instances using the learned classifier. Classification is supervised when all class values are present in the training data (i.e., fully labeled data), semi-supervised when only some class values are known (i.e., partially labeled data), and unsupervised when all class values are missing from the training data (i.e., unlabeled data). Beyond this taxonomy, the classification problem can be categorized as uni-dimensional or multi-dimensional depending on the number of class variables, one or more, respectively; it can also be categorized as stationary or streaming depending on the characteristics of the data and the underlying rate of change. Throughout this thesis, we deal with the classification problem under three different settings, namely, supervised multi-dimensional stationary classification, semi-supervised uni-dimensional streaming classification, and supervised multi-dimensional streaming classification. To accomplish this task, we used Bayesian network classifiers as models. The first contribution, addressing the supervised multi-dimensional stationary classification problem, consists of two new methods for learning multi-dimensional Bayesian network classifiers from stationary data. 
They are proposed from two different points of view. The first method, named CB-MBC, is based on a wrapper greedy forward selection approach, while the second one, named MB-MBC, is a filter constraint-based approach based on Markov blankets. Both methods are applied to two important real-world problems, namely, the prediction of the human immunodeficiency virus type 1 (HIV-1) reverse transcriptase and protease inhibitors, and the prediction of the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson’s Disease Questionnaire (PDQ-39). The experimental study includes comparisons of CB-MBC and MB-MBC against state-of-the-art multi-dimensional classification methods, as well as against commonly used methods for solving the Parkinson’s disease prediction problem, namely, multinomial logistic regression, ordinary least squares, and censored least absolute deviations. For both case studies, results are promising in terms of classification accuracy and in terms of the analysis of the learned MBC graphical structures, which identify known and novel interactions among variables. The second contribution, addressing the semi-supervised uni-dimensional streaming classification problem, consists of a novel method (CPL-DS) for classifying partially labeled data streams. Data streams differ from stationary data sets in their highly rapid generation process and their concept-drifting aspect. That is, the learned concepts and/or the underlying distribution are likely to change and evolve over time, which makes the current classification model out of date and in need of updating. CPL-DS uses the Kullback-Leibler divergence and the bootstrapping method to quantify and detect three possible kinds of drift: feature, conditional or dual. Then, if any occurs, a new classification model is learned using the expectation-maximization algorithm; otherwise, the current classification model is kept unchanged. 
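The drift-detection idea described above (a Kullback-Leibler divergence compared against a bootstrap threshold) can be illustrated with a minimal sketch. This is not the CPL-DS implementation: it assumes a single categorical variable, Laplace smoothing, and a pooled-resampling null distribution, and the names `kl_divergence` and `detect_drift` are hypothetical.

```python
import math
import random
from collections import Counter


def kl_divergence(p_counts, q_counts, categories, alpha=1.0):
    """KL(P || Q) between two categorical samples, with Laplace smoothing (assumed)."""
    k = len(categories)
    p_total = sum(p_counts.get(c, 0) for c in categories) + alpha * k
    q_total = sum(q_counts.get(c, 0) for c in categories) + alpha * k
    kl = 0.0
    for c in categories:
        p = (p_counts.get(c, 0) + alpha) / p_total
        q = (q_counts.get(c, 0) + alpha) / q_total
        kl += p * math.log(p / q)
    return kl


def detect_drift(reference, new_batch, n_boot=500, level=0.05, seed=0):
    """Flag drift when the observed KL exceeds a bootstrap quantile.

    The null distribution is built by resampling both batches, with
    replacement, from the pooled data (i.e. assuming no drift).
    Returns (drift_flag, observed_kl, threshold).
    """
    rng = random.Random(seed)
    categories = sorted(set(reference) | set(new_batch))
    observed = kl_divergence(Counter(reference), Counter(new_batch), categories)

    pooled = list(reference) + list(new_batch)
    null_values = []
    for _ in range(n_boot):
        a = rng.choices(pooled, k=len(reference))
        b = rng.choices(pooled, k=len(new_batch))
        null_values.append(kl_divergence(Counter(a), Counter(b), categories))

    null_values.sort()
    idx = min(n_boot - 1, int((1 - level) * n_boot))
    threshold = null_values[idx]
    return observed > threshold, observed, threshold
```

In a streaming setting, a flag from a check like this would trigger relearning of the classifier, while an unflagged batch would leave the current model in place.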
CPL-DS is general, as it can be applied to several classification models. Using two different models, namely, the naive Bayes classifier and logistic regression, CPL-DS is tested with synthetic data streams and applied to the real-world problem of malware detection, where newly received files must be continuously classified as malware or goodware. Experimental results show that our approach is effective at detecting different kinds of drift from partially labeled data streams and achieves good classification performance. Finally, the third contribution, addressing the supervised multi-dimensional streaming classification problem, consists of two adaptive methods, namely, Locally Adaptive-MB-MBC (LA-MB-MBC) and Globally Adaptive-MB-MBC (GA-MB-MBC). Both methods monitor concept drift over time using the average log-likelihood score and the Page-Hinkley test. Then, if a drift is detected, LA-MB-MBC adapts the current multi-dimensional Bayesian network classifier locally around each changed node, whereas GA-MB-MBC learns a new multi-dimensional Bayesian network classifier from scratch. The experimental study carried out using synthetic multi-dimensional data streams shows the merits of both proposed adaptive methods.
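As an illustration of the monitoring step, the following is a minimal sketch of a Page-Hinkley test applied to a stream of average log-likelihood scores. It is a textbook form of the test for detecting a drop in the monitored value, not the thesis code; the class name, the `first_alarm` helper, and the default values of `delta` and `threshold` are assumptions chosen for the example.

```python
class PageHinkley:
    """Page-Hinkley test for a decrease in the mean of a monitored score."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change (assumed value)
        self.threshold = threshold  # alarm threshold, often written lambda (assumed value)
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0              # cumulative deviation below the running mean
        self.min_cum = 0.0          # smallest cumulative deviation seen so far

    def update(self, x):
        """Feed one score; return True if a drift alarm is raised."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        # Accumulate how far x falls below the running mean, minus the tolerance.
        self.cum += self.mean - x - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


def first_alarm(scores):
    """Return the index of the first alarm in a score stream, or None."""
    detector = PageHinkley()
    for i, score in enumerate(scores):
        if detector.update(score):
            return i
    return None
```

When an alarm is raised, a locally adaptive method would revise only the affected part of the model, whereas a globally adaptive one would relearn the model; either way the detector would typically be reset afterwards.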
Resumo:
The Physical Properties Laboratory (LPF) has been working on the improvement of fruit and vegetable grading lines since 1992. Experience shows that improving grading lines to reduce mechanical damage has to be approached from two viewpoints: 1) machinery aggressiveness, and 2) fruit susceptibility. Machinery aggressiveness can be characterized as the impact probability for different impact intensities, assessed by means of electronic fruits (IS-100) [2,5]. Bruise susceptibility, on the other hand, can be determined using different laboratory tests. A recent study from LPF [4] shows that damage may arise differently in pome and in stone fruits, since: a) pome fruits are mainly stress-susceptible, while stone fruits appear to be more deformation-susceptible, and b) bruise size may be a good predictor of bruise susceptibility in pome fruits, while for stone fruits bruise probability is the most relevant characteristic of bruise susceptibility. This study also indicates the feasibility of predicting bruise probability using several mechanical and load characterization parameters. Despite the efforts to establish damage thresholds in peaches, no simulation models are currently available for predicting bruise occurrence in grading lines.
Resumo:
This study describes SIMLIN 2.0, a software tool for simulating damage in fruit grading lines. It reports the use of the tool to simulate the packing of Sudanell peaches, with an intrinsic susceptibility estimated by a logistic model that was fitted with the same tool from laboratory data. SIMLIN 2.0 requires fruit batches to be characterized by probability distributions, which can be done through an easy-to-use user interface. The software makes it possible to evaluate the expected damage percentages for grading lines with different levels of aggressiveness, established by means of databases generated with IS-100 type electronic fruits. It provides several graphical outputs that help define the improvement strategies best suited to each case.
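The kind of calculation described above can be sketched as a small Monte Carlo simulation that combines a logistic bruise-probability model with impact distributions for each transfer point of a line. This is not SIMLIN 2.0 itself: the function names, the Gaussian impact model, and the coefficients `b0` and `b1` are hypothetical and serve only to show the structure of such a simulation.

```python
import math
import random


def bruise_probability(impact, b0, b1):
    """Logistic model: probability that one impact of the given intensity bruises a fruit."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * impact)))


def gaussian_impacts(mean, sd):
    """Return a sampler of impact intensities (assumed Gaussian, truncated at 0)."""
    def sample(rng):
        return max(0.0, rng.gauss(mean, sd))
    return sample


def simulate_damage_percentage(impact_samplers, b0, b1, n_fruits=10000, seed=0):
    """Estimate the percentage of fruits bruised at least once along the line.

    impact_samplers holds one sampler per transfer point; each fruit receives
    one impact per transfer point, and a fruit counts as damaged at most once.
    """
    rng = random.Random(seed)
    damaged = 0
    for _ in range(n_fruits):
        for sample_impact in impact_samplers:
            if rng.random() < bruise_probability(sample_impact(rng), b0, b1):
                damaged += 1
                break  # stop at the first bruise for this fruit
    return 100.0 * damaged / n_fruits
```

Comparing the estimate for two sets of samplers, for example a line before and after padding a transfer point, gives the kind of side-by-side damage percentages that can guide the choice of an improvement strategy.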
Resumo:
The size and complexity of cloud environments make them prone to failures. The traditional approach to achieving high dependability for these systems relies on constant monitoring. However, this method is purely reactive. A more proactive approach is provided by online failure prediction (OFP) techniques. In this paper, we describe an OFP system for private IaaS platforms, currently under development, that combines different types of data input, including monitoring information, event logs, and failure data. In addition, this system operates at both the physical and virtual planes of the cloud, taking into account the relationships between nodes and failure propagation mechanisms that are unique to cloud environments.