944 results for RIGHT-CENSORED DATA


Relevance: 30.00%

Abstract:

Nowadays, with the ongoing and rapid evolution of information technology and computing devices, large volumes of data are continuously collected and stored in different domains and through various real-world applications. Extracting useful knowledge from such a huge amount of data usually cannot be performed manually, and requires the use of adequate machine learning and data mining techniques. Classification is one of the most important of these techniques and has been successfully applied in several areas. Roughly speaking, classification consists of two main steps: first, learn a classification model or classifier from the available training data, and second, classify new, unseen data instances using the learned classifier. Classification is supervised when all class values are present in the training data (i.e., fully labeled data), semi-supervised when only some class values are known (i.e., partially labeled data), and unsupervised when all class values are missing from the training data (i.e., unlabeled data). Besides this taxonomy, the classification problem can be categorized as uni-dimensional or multi-dimensional depending on the number of class variables (one or more, respectively), and as stationary or streaming depending on the characteristics of the data and the underlying rate of change. In this thesis, we deal with the classification problem under three different settings: supervised multi-dimensional stationary classification, semi-supervised uni-dimensional streaming classification, and supervised multi-dimensional streaming classification. Throughout, we mainly use Bayesian network classifiers as models.

The first contribution, addressing the supervised multi-dimensional stationary classification problem, consists of two new methods for learning multi-dimensional Bayesian network classifiers (MBCs) from stationary data. They are proposed from two different points of view. The first method, named CB-MBC, is based on a wrapper greedy forward selection approach, while the second one, named MB-MBC, is a filter constraint-based approach based on Markov blankets. Both methods are applied to two important real-world problems, namely, the prediction of human immunodeficiency virus type 1 (HIV-1) reverse transcriptase and protease inhibitors, and the prediction of the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson’s Disease Questionnaire (PDQ-39). The experimental study includes comparisons of CB-MBC and MB-MBC against state-of-the-art multi-dimensional classification methods, as well as against methods commonly used for the Parkinson’s disease prediction problem, namely, multinomial logistic regression, ordinary least squares, and censored least absolute deviations. For both case studies, results are promising in terms of classification accuracy and in terms of the learned MBC graphical structures, which identify known and novel interactions among variables.

The second contribution, addressing the semi-supervised uni-dimensional streaming classification problem, consists of a novel method (CPL-DS) for classifying partially labeled data streams. Data streams differ from stationary data sets in their very rapid generation process and their concept-drifting aspect. That is, the learned concepts and/or the underlying distribution are likely to change and evolve over time, which makes the current classification model out of date and in need of updating. CPL-DS uses the Kullback-Leibler divergence and the bootstrapping method to quantify and detect three possible kinds of drift: feature, conditional or dual. If any drift is detected, a new classification model is learned using the expectation-maximization algorithm; otherwise, the current classification model is kept unchanged. CPL-DS is general, as it can be applied to several classification models. Using two different models, namely, the naive Bayes classifier and logistic regression, CPL-DS is tested with synthetic data streams and applied to the real-world problem of malware detection, where newly received files must be continuously classified as malware or goodware. Experimental results show that our approach is effective at detecting different kinds of drift from partially labeled data streams, and that it achieves good classification performance.

Finally, the third contribution, addressing the supervised multi-dimensional streaming classification problem, consists of two adaptive methods, namely, Locally Adaptive-MB-MBC (LA-MB-MBC) and Globally Adaptive-MB-MBC (GA-MB-MBC). Both methods monitor concept drift over time using the average log-likelihood score and the Page-Hinkley test. If a drift is detected, LA-MB-MBC adapts the current multi-dimensional Bayesian network classifier locally around each changed node, whereas GA-MB-MBC learns a new multi-dimensional Bayesian network classifier from scratch. An experimental study carried out using synthetic multi-dimensional data streams shows the merits of both proposed adaptive methods.
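The abstract above names the Page-Hinkley test as the drift monitor for an average log-likelihood score but gives no details. The following is a minimal sketch of a generic one-sided Page-Hinkley test that raises an alarm when the mean of a stream drops, which is the direction one would expect for a log-likelihood score under drift. It is not the thesis's implementation: the class name `PageHinkley`, the parameters `delta` and `threshold`, their default values, and the synthetic stream in `demo_stream` are all assumptions made for illustration.

```python
import random


class PageHinkley:
    """One-sided Page-Hinkley test for a drop in the mean of a stream.

    delta is a tolerance for small fluctuations and threshold is the alarm
    level; both are hypothetical defaults chosen for this sketch.
    """

    def __init__(self, delta=0.05, threshold=20.0):
        self.delta = delta
        self.threshold = threshold
        self.n = 0           # number of observations seen so far
        self.mean = 0.0      # running mean of the stream
        self.cum = 0.0       # cumulative deviation from the running mean
        self.cum_max = 0.0   # largest cumulative deviation seen so far

    def update(self, x):
        """Consume one observation and return True if a drop is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean + self.delta
        self.cum_max = max(self.cum_max, self.cum)
        return self.cum_max - self.cum > self.threshold


def first_alarm(stream, **kwargs):
    """Return the index of the first alarm in the stream, or None."""
    detector = PageHinkley(**kwargs)
    for t, x in enumerate(stream):
        if detector.update(x):
            return t
    return None


def demo_stream(seed=0, n_before=500, n_after=500):
    """Synthetic score stream whose mean drops from -1.0 to -2.0."""
    rng = random.Random(seed)
    before = [rng.gauss(-1.0, 0.3) for _ in range(n_before)]
    after = [rng.gauss(-2.0, 0.3) for _ in range(n_after)]
    return before + after
```

Calling `first_alarm(demo_stream())` returns the position at which the detector first fires; with these settings it should fire shortly after the change point at index 500. A larger `threshold` lowers the false-alarm rate at the cost of a longer detection delay.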

Relevance: 30.00%

Abstract:

We propose a general procedure for solving incomplete data estimation problems. The procedure can be used to find the maximum likelihood estimate or to solve estimating equations in difficult cases such as estimation with the censored or truncated regression model, the nonlinear structural measurement error model, and the random effects model. The procedure is based on the general principle of stochastic approximation and the Markov chain Monte Carlo method. Applying the theory of adaptive algorithms, we derive conditions under which the proposed procedure converges. Simulation studies also indicate that the proposed procedure consistently converges to the maximum likelihood estimate for the structural measurement error logistic regression model.
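To make the idea of stochastic approximation for incomplete data concrete, here is a toy sketch for the simplest censored case: estimating the mean of a unit-variance normal sample that is right-censored at a known point. It is not the procedure from the abstract. In particular, the censored values are imputed by exact inverse-CDF sampling rather than by Markov chain Monte Carlo, the variance is assumed known, and the function names, the gain sequence 1/k, and the simulation settings are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, fmean


def simulate_censored(n, mu, c, seed=0):
    """Draw n values from N(mu, 1) and right-censor them at c.

    Returns (observed value, censored flag) pairs.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(mu, 1.0)
        data.append((min(x, c), x > c))
    return data


def sample_above(dist, c, rng):
    """Draw from dist conditioned on exceeding c, by inverse-CDF sampling."""
    lo = dist.cdf(c)
    u = lo + (1.0 - lo) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    return max(dist.inv_cdf(u), c)       # guard against rounding below c


def sa_estimate_mean(data, n_iter=300, seed=1):
    """Stochastic-approximation estimate of mu from right-censored data."""
    rng = random.Random(seed)
    mu = fmean(y for y, _ in data)  # naive start: treats censored values as exact
    for k in range(1, n_iter + 1):
        dist = NormalDist(mu, 1.0)
        # Impute each censored value from its conditional law under the current mu.
        completed = [sample_above(dist, y, rng) if cens else y for y, cens in data]
        # Normalised complete-data score for mu, evaluated on the imputed sample.
        score = fmean(completed) - mu
        mu += score / k  # decreasing gain 1/k
    return mu
```

For example, `sa_estimate_mean(simulate_censored(1000, 2.0, 2.5))` should land close to the true mean of 2.0, whereas the naive average of the censored observations is biased downward. The fixed point of the update is the value of mu at which the expected complete-data score given the observed data is zero, which is the maximum likelihood estimate in this toy model.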

Relevance: 30.00%

Abstract:

The linear pentadecapeptide antibiotic, gramicidin D, is a naturally occurring product of Bacillus brevis known to form ion channels in synthetic and natural membranes. The x-ray crystal structures of the right-handed double-stranded double-helical dimers (DSDHℛ) reported here agree with 15N-NMR and CD data on the functional gramicidin D channel in lipid bilayers. These structures demonstrate single-file ion transfer through the channels. The results also indicate that previous crystal structure reports of a left-handed double-stranded double-helical dimer in complex with Cs+ and K+ salts may be in error and that our evidence points to the DSDHℛ as the major conformer responsible for ion transport in membranes.

Relevance: 30.00%

Abstract:

Mutations in Tg737 cause a wide spectrum of phenotypes, including random left-right axis specification, polycystic kidney disease, liver and pancreatic defects, hydrocephalus, and skeletal patterning abnormalities. To further assess the biological function of Tg737 and its role in the mutant pathology, we identified the cell population expressing Tg737 and determined the subcellular localization of its protein product called Polaris. Tg737 expression is associated with cells possessing either motile or immotile cilia and sperm. Similarly, Polaris concentrated just below the apical membrane in the region of the basal bodies and within the cilia or flagellar axoneme. The data suggest that Polaris functions in a ciliogenic pathway or in cilia maintenance, a role supported by the loss of cilia on the ependymal cell layer in ventricles of Tg737orpk brains and by the lack of node cilia in Tg737Δ2-3βGal mutants.

Relevance: 30.00%

Abstract:

We have studied the ability of the histone (H3-H4)2 tetramer, the central part of the nucleosome of eukaryotic chromatin, to form particles on DNA minicircles of negative and positive superhelicities, and the effect of relaxing these particles with topoisomerase I. The results show that even modest positive torsional stress from the DNA, and in particular that generated by DNA thermal fluctuations, can trigger a major, reversible change in the conformation of the particle. Neither a large excess of naked DNA, nor a crosslink between the two H3s prevented the transition from one form to the other. This suggested that during the transition, the histones neither dissociated from the DNA nor were even significantly reshuffled. Moreover, the particles reconstituted on negatively and positively supercoiled minicircles look similar under electron microscopy. These data agree best with a transition involving a switch of the wrapped DNA from a left- to a right-handed superhelix. It is further proposed, based on the left-handed overall superhelical conformation of the tetramer within the octamer [Arents, G., Burlingame, R. W., Wang, B. C., Love, W. E. & Moudrianakis, E. N. (1991) Proc. Natl. Acad. Sci. USA 88, 10148-10152], that this change in DNA topology is mediated by a similar change in the topology of the tetramer itself, which may occur through a rotation (or a localized deformation) of the two H3-H4 dimers about their H3-H3 interface. Potential implications of this model for nucleosome dynamics in vivo are discussed.

Relevance: 30.00%

Abstract:

Objective. To synthesise the scientific evidence concerning barriers to health care access faced by migrants. We sought to critically analyse this evidence with a view to guiding policies. Design. A systematic review methodology was used to identify systematic and scoping reviews which quantitatively or qualitatively analysed data from primary studies. The main variables analysed were structural and contextual barriers (health system organisation) as well as individual barriers (patients and providers). The quality of evidence from the systematic reviews was critically appraised. From 2674 reviews, 79 were retained for further scrutiny, and finally 9 met the inclusion criteria. Results. The structural barriers identified were the lack of health insurance and the high cost of drugs (non-universal health system) and organisational aspects of the health system (social insurance system and national health system). The individual barriers were linguistic and cultural. None of the reviews provided a quality appraisal of the studies. Conclusions. Barriers to health care for migrants range from entitlement in non-universal health systems to accessibility in universal ones, and determinants of access to the respective health services should be analysed within the corresponding national context. Generating social and institutional changes that eliminate barriers to access to health services is essential to ensure health for all.

Relevance: 30.00%

Abstract:

Using original data on 1,5000 mandibles, but mainly previously published data, I present an overview of the distribution characteristics of mandibular torus and a hypothesis concerning its cause. Pedigree studies have established that genetic factors influence torus development. Extrinsic factors are strongly implicated by other evidence: prevalence among Arctic peoples, effect of dietary change, age regression, preponderance in males and on the right side, effect of cranial deformation, concurrence with palatine torus and maxillary alveolar exostoses, and clinical evidence. I propose that the primary factor is masticatory stress. According to a mechanism suggested by orthodontic research, the horizontal component of bite force tips the lower canine, premolars and first molar so that their root apices exert pressure on the periodontal membrane, causing formation of new bone on the lingual cortical plate of the alveolar process. Thus formed, the hyperostosis is vulnerable to trauma, and its periosteal covering becomes bruised, causing additional deposition of bone. Genes influence torus indirectly through their effect on occlusion. A pattern of increased expressivity with incidence suggests that a quasicontinuous model may provide a better fit to pedigree data than the single-locus models previously tested.

Relevance: 30.00%

Abstract:

This paper examines the challenges facing the EU regarding data retention, particularly in the aftermath of the Digital Rights Ireland judgment of April 2014, in which the Court of Justice of the European Union (CJEU) found the Data Retention Directive 2006/24/EC to be invalid. It first offers a brief historical account of the Data Retention Directive and then moves to a detailed assessment of what the judgment means for determining the lawfulness of data retention from the perspective of the EU Charter of Fundamental Rights: what is wrong with the Data Retention Directive and how would it need to be changed to comply with the right to respect for privacy? The paper also looks at the responses to the judgment from the European institutions and elsewhere, and presents a set of policy suggestions to the European institutions on the way forward. It is argued here that one of the main issues underlying the Digital Rights Ireland judgment has been the role of fundamental rights in the EU legal order, and in particular the extent to which the retention of metadata for law enforcement purposes is consistent with EU citizens’ right to respect for privacy and to data protection. The paper offers three main recommendations to EU policy-makers: first, to give priority to a full and independent evaluation of the value of the Data Retention Directive; second, to assess the judgment’s implications for other large EU information systems and proposals that provide for the mass collection of metadata from innocent persons in the EU; and third, to adopt without delay the proposal for Directive COM(2012)10 dealing with data protection in the fields of police and judicial cooperation in criminal matters.

Relevance: 30.00%

Abstract:

BACKGROUND: Arrhythmogenic right ventricular cardiomyopathy/dysplasia (ARVC/D) is considered a progressive cardiomyopathy. However, data on the clinical features of disease progression are limited. The aim of this study was to assess 12-lead surface electrocardiographic (ECG) changes during long-term follow-up, and to compare these findings with echocardiographic data in our large cohort of patients with ARVC/D. METHODS: Baseline and follow-up ECGs of 111 patients from three tertiary care centers in Switzerland were systematically analyzed with digital calipers by two blinded observers, and correlated with findings from transthoracic echocardiography. RESULTS: The median follow-up was 4 years (IQR 1.9-9.2 years). ECG progression was significant for epsilon waves (baseline 14% vs. follow-up 31%, p = 0.01) and QRS duration (111 ms vs. 114 ms, p = 0.04). Six patients with repolarization abnormalities according to the 2010 Task Force Criteria at baseline did not display these criteria at follow-up, whereas in all patients with epsilon waves at baseline these depolarization abnormalities also remained at follow-up. T wave inversions in inferior leads were common (36% of patients at baseline), and were significantly associated with major repolarization abnormalities (p = 0.02), extensive echocardiographic right ventricular involvement (p = 0.04), T wave inversions in lateral precordial leads (p = 0.05), and definite ARVC/D (p = 0.05). CONCLUSIONS: Our data support the concept that ARVC/D is generally progressive, and that this progression can be detected by 12-lead surface ECG. Repolarization abnormalities may disappear during the course of the disease. Furthermore, the presence of T wave inversions in inferior leads is common in ARVC/D.

Relevance: 30.00%

Abstract:

Cover title.

Relevance: 30.00%

Abstract:

Mode of access: Internet.

Relevance: 30.00%

Abstract:

"Fiche 1" consists of cataloging data in macroform.