9 results for multiple data

in Archivo Digital para la Docencia y la Investigación - Institutional Repository of the Universidad del País Vasco


Relevance:

60.00%

Abstract:

Ever since the first computer was invented, one of the goals has been for computers to execute more and more work, faster, in order to solve increasingly complex problems. The first solution was to increase the power of the processors, but the physical limits imposed by the speed of electronic components have forced other ways of improving performance to be sought. Since then, many technologies have been used to increase performance, such as multiprocessors and MIMD architectures, but here we analyse the SIMD architecture. This type of processor was widely used in the supercomputers of the 1980s and 1990s, but the progress of microprocessors relegated the technology to the background. Nowadays, practically all processors implement SIMD (Single Instruction, Multiple Data) instructions. In this document we study Intel's SIMD technologies SSE, AVX and AVX2 to determine whether using the vector unit through SIMD instructions really yields a performance improvement. Note that AVX has only been available since 2011 and AVX2 did not become available until 2013, so we are working with recent technologies. Moreover, the future of this kind of technology seems assured, since Intel has announced its next extension, AVX-512, for 2015.
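
The comparison the abstract describes, scalar execution versus SIMD-style vectorized execution, can be illustrated with a minimal sketch. The example below is an assumption of this document, not the benchmark used in the work: it times a plain Python loop against the same element-wise operation in NumPy, whose compiled kernels use SSE/AVX instructions where the CPU supports them.

```python
import time
import numpy as np

N = 10_000_000
a = np.random.rand(N)
b = np.random.rand(N)

# Scalar baseline: one element handled per interpreter-loop iteration.
t0 = time.perf_counter()
c_scalar = [a[i] + b[i] for i in range(N)]
t_scalar = time.perf_counter() - t0

# Vectorized version: NumPy's compiled loop, which uses SSE/AVX where available.
t0 = time.perf_counter()
c_vector = a + b
t_vector = time.perf_counter() - t0

print(f"scalar loop:     {t_scalar:.3f} s")
print(f"vectorized loop: {t_vector:.3f} s")
print(f"speedup:         {t_scalar / t_vector:.1f}x")
```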

Relevance:

30.00%

Abstract:

Due to the recent implementation of the Bologna process, the definition of competences in Higher Education is an important matter that deserves special attention and requires a detailed analysis. For that reason, we study the importance given to several competences for professional activity and the degree to which these competences have been achieved through the education received. The answers also cover competences observed in two periods of time and are given by individuals with multiple characteristics. In this context, and in order to obtain synthesized results, we propose the use of Multiple Table Factor Analysis. Through this analysis, individuals are described by several groups of variables, showing the most important variability factors of the individuals and allowing the analysis of the common structure of the different data tables. The results obtained will allow us to determine the existence or absence of a common structure in the answers of the various data tables, to know which competences have a similar answer structure across the groups of variables, and to characterize those answers through the individuals.
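
As a rough illustration of the Multiple Table Factor Analysis idea only (not the exact procedure or data used in the study), the sketch below implements the core step in Python: each group of variables is standardized and divided by its first singular value, so that no single table dominates the common analysis, and a global PCA is then run on the concatenated, reweighted tables. Group sizes and data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 100 respondents, two groups of variables
# (e.g. importance ratings vs. achieved-competence ratings).
groups = [rng.normal(size=(100, 6)), rng.normal(size=(100, 8))]

weighted = []
for X in groups:
    Xc = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize each variable
    s1 = np.linalg.svd(Xc, compute_uv=False)[0]        # first singular value of the group
    weighted.append(Xc / s1)                           # balance the groups' inertia

Z = np.hstack(weighted)                                # global, reweighted table
U, S, Vt = np.linalg.svd(Z, full_matrices=False)       # global PCA via SVD
scores = U[:, :2] * S[:2]                              # common factor scores of the individuals

explained = S**2 / np.sum(S**2)
print("variance explained by the first two common factors:", explained[:2])
```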

Relevance:

30.00%

Abstract:

DNA microarray, or DNA chip, is a technology that allows us to obtain the expression level of many genes in a single experiment. The fact that numerical expression values can be easily obtained gives us the possibility to apply multiple statistical techniques of data analysis. In this project, microarray data are obtained from Gene Expression Omnibus, the repository of the National Center for Biotechnology Information (NCBI). The noise is then removed and the data are normalized; we also use hypothesis tests to find the most relevant genes that may be involved in a disease, and machine learning methods such as KNN, Random Forest or k-means. The analysis is performed with Bioconductor, a collection of R packages for the analysis of biological data, and we conduct a case study on Alzheimer's disease. The complete code can be found at https://github.com/alberto-poncelas/bioc-alzheimer
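
The workflow outlined above (normalization, per-gene hypothesis tests, then a classifier) can be sketched as follows. The original analysis uses Bioconductor in R; the version below is a hypothetical Python analogue on a made-up expression matrix, using per-gene t-tests with Benjamini-Hochberg correction and a KNN classifier.

```python
import numpy as np
from scipy import stats
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)

# Hypothetical expression matrix: 200 genes x 40 samples (20 controls, 20 patients).
expr = rng.normal(size=(200, 40))
labels = np.array([0] * 20 + [1] * 20)

# Normalization placeholder: log2 transform and per-gene centering.
expr = np.log2(np.abs(expr) + 1)
expr -= expr.mean(axis=1, keepdims=True)

# Per-gene two-sample t-test between the two groups.
t, p = stats.ttest_ind(expr[:, labels == 0], expr[:, labels == 1], axis=1)

# Benjamini-Hochberg adjusted p-values to pick the most relevant genes.
order = np.argsort(p)
ranks = np.arange(1, len(p) + 1)
bh = np.minimum.accumulate((p[order] * len(p) / ranks)[::-1])[::-1]
selected = order[bh < 0.05]

# Classify samples with KNN on the selected genes (fallback: top 10 ranked genes).
features = expr[selected if selected.size else order[:10]].T
knn = KNeighborsClassifier(n_neighbors=3)
print("cross-validated accuracy:", cross_val_score(knn, features, labels, cv=5).mean())
```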

Relevance:

30.00%

Abstract:

In this paper, reanalysis fields from the ECMWF have been statistically downscaled to predict, from large-scale atmospheric fields, surface moisture flux and daily precipitation at two observatories (Zaragoza and Tortosa, Ebro Valley, Spain) during the 1961-2001 period. Three types of downscaling models have been built: (i) analogues, (ii) analogues followed by random forests and (iii) analogues followed by multiple linear regression. The inputs consist of data (predictor fields) taken from the ERA-40 reanalysis. The predicted fields are precipitation and surface moisture flux as measured at the two observatories. With the aim of reducing the dimensionality of the problem, the ERA-40 fields have been decomposed using empirical orthogonal functions. The available daily data have been divided into two parts: a training period (1961-1996), used to find a group of about 300 analogues to build the downscaling model, and a test period (1997-2001), in which the models' performance has been assessed using independent data. In the case of surface moisture flux, the models based on analogues followed by random forests do not clearly outperform those built on analogues plus multiple linear regression, while simple averages calculated from the nearest analogues found in the training period yielded only slightly worse results. In the case of precipitation, the three types of model performed equally well. These results suggest that most of the models' downscaling capabilities can be attributed to the analogue-calculation stage.
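
A compressed sketch of the three-stage scheme described above (EOF reduction of the predictor fields, analogue search, then an optional regression on the analogues) might look as follows in Python. The arrays, the number of analogues and the library calls are illustrative assumptions, not the configuration used in the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)

# Hypothetical daily predictor fields (days x grid points) and local observations.
X_train, y_train = rng.normal(size=(5000, 500)), rng.gamma(2.0, size=5000)
X_test = rng.normal(size=(365, 500))

# (i) EOF decomposition of the large-scale fields to reduce dimensionality.
pca = PCA(n_components=20).fit(X_train)
pcs_train, pcs_test = pca.transform(X_train), pca.transform(X_test)

# (ii) Analogue stage: the ~300 most similar days in the training period.
nn = NearestNeighbors(n_neighbors=300).fit(pcs_train)
_, idx = nn.kneighbors(pcs_test)

# (iii) Per test day: analogue mean, analogues + random forest, analogues + MLR.
for day, analogues in enumerate(idx[:3]):          # first few days as an example
    Xa, ya = pcs_train[analogues], y_train[analogues]
    rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(Xa, ya)
    mlr = LinearRegression().fit(Xa, ya)
    print(day, ya.mean(), rf.predict(pcs_test[[day]])[0], mlr.predict(pcs_test[[day]])[0])
```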

Relevance:

30.00%

Abstract:

This paper deals with the convergence of a remote iterative learning control system subject to data dropouts. The system is composed of a set of discrete-time multiple-input multiple-output linear models, each one with its corresponding actuator device and its sensor. Each actuator applies the input signal vector to its corresponding model at the sampling instants and the sensor measures the output signal vector. The iterative learning law is processed in a controller located far away from the models, so the control signal vector has to be transmitted from the controller to the actuators through transmission channels. Such a law uses the measurements of each model to generate the input vector to be applied to the subsequent model, so the measurements of the models have to be transmitted from the sensors to the controller. All transmissions are subject to failures, which are described as a binary sequence taking the value 1 or 0. A dropout-compensation technique is used to replace the data lost during transmission. Convergence to zero of the errors between the output signal vector and a reference one is achieved as the number of models tends to infinity.
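
The learning law and the dropout compensation are described only verbally; the sketch below is a generic, hypothetical P-type iterative learning scheme in Python with Bernoulli transmission failures and hold-last-received compensation, intended only to make the mechanism concrete. It is not the exact law analyzed in the paper, and the model matrices and gain are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# A small discrete-time MIMO linear model: x(t+1) = A x(t) + B u(t), y(t) = C x(t).
A = np.array([[0.8, 0.1], [0.0, 0.9]])
B, C = np.eye(2), np.eye(2)
T, n_iters, p_drop = 50, 30, 0.2             # horizon, learning iterations, dropout probability
L = 0.5 * np.eye(2)                          # P-type learning gain (||I - L C B|| < 1)

tau = np.linspace(0, np.pi, T + 1)
y_ref = np.stack([np.sin(tau), np.sin(2 * tau)], axis=1)   # reference with y_ref(0) = y(0) = 0

def simulate(u):
    """Apply an input sequence and return the output trajectory y(0..T)."""
    x, ys = np.zeros(2), []
    for t in range(T):
        ys.append(C @ x)
        x = A @ x + B @ u[t]
    ys.append(C @ x)
    return np.array(ys)

u = np.zeros((T, 2))
e_received = np.zeros((T, 2))                # last error values that survived transmission
for k in range(n_iters):
    e = y_ref - simulate(u)                  # tracking error of the current model
    mask = rng.random(T) > p_drop            # True = packet arrives, False = dropout
    e_received = np.where(mask[:, None], e[1:], e_received)  # hold-last-received compensation
    u = u + e_received @ L.T                 # u_{k+1}(t) = u_k(t) + L e_k(t+1)
    print(f"model {k:2d}  max |error| = {np.abs(e[1:]).max():.4f}")
```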

Relevance:

30.00%

Abstract:

Background: Human endogenous retroviruses (HERVs) are repetitive sequences derived from ancestral germ-line infections by exogenous retroviruses, and different HERV families have been integrated in the genome. HERV-Fc1, on chromosome X, has previously been associated with multiple sclerosis (MS) in Northern European populations. Additionally, HERV-Fc1 RNA expression levels have been found to be increased in the plasma of MS patients with active disease. Considering the North-South latitude gradient in MS prevalence, we aimed to evaluate the role of HERV-Fc1 on MS risk in three independent Spanish cohorts. Methods: A single nucleotide polymorphism near HERV-Fc1, rs391745, was genotyped by TaqMan chemistry in a total of 2473 MS patients and 3031 ethnically matched controls, consecutively recruited from Northern (569 patients and 980 controls), Central (883 patients and 692 controls) and Southern (1021 patients and 1359 controls) Spain. Our results were pooled in a meta-analysis with previously published data. Results: Significant associations of the HERV-Fc1 polymorphism with MS were observed in two Spanish cohorts, and the combined meta-analysis with previous data yielded a significant association [rs391745 C-allele carriers: p(M-H) = 0.0005; OR(M-H) (95% CI) = 1.27 (1.11-1.45)]. In agreement with previous findings, when the analysis was restricted to relapsing-remitting and secondary progressive MS samples, a slight enhancement in the strength of the association was observed [p(M-H) = 0.0003, OR(M-H) (95% CI) = 1.32 (1.14-1.53)]. Conclusion: The association of the HERV-Fc1 polymorphism rs391745 with bout-onset MS susceptibility was confirmed in Southern European cohorts.
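
The pooled estimate reported above is a Mantel-Haenszel odds ratio across cohorts. As a worked illustration of that calculation only, with made-up counts rather than the study's genotype data, the following sketch computes OR(M-H) from per-cohort 2x2 tables of C-allele carriers versus non-carriers in patients and controls.

```python
import numpy as np

# Hypothetical per-cohort 2x2 tables:
# [carrier cases, non-carrier cases, carrier controls, non-carrier controls].
cohorts = np.array([
    [310, 259, 480, 500],   # Northern Spain (illustrative counts)
    [470, 413, 330, 362],   # Central Spain (illustrative counts)
    [560, 461, 650, 709],   # Southern Spain (illustrative counts)
])

a, b, c, d = cohorts.T                            # unpack the four cells per cohort
n = a + b + c + d                                 # cohort sizes
or_mh = np.sum(a * d / n) / np.sum(b * c / n)     # Mantel-Haenszel pooled odds ratio
print(f"OR(M-H) = {or_mh:.2f}")
```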

Relevance:

30.00%

Abstract:

In the problem of one-class classification (OCC), one of the classes, the target class, has to be distinguished from all other possible objects, which are considered non-targets. This situation arises in many biomedical problems, for example in diagnosis, image-based tumor recognition or the analysis of electrocardiogram data. In this paper, an approach to OCC based on a typicality test is experimentally compared with reference state-of-the-art OCC techniques (Gaussian, mixture of Gaussians, naive Parzen, Parzen, and support vector data description) using biomedical data sets. We evaluate the ability of the procedures using twelve experimental data sets with not necessarily continuous data. As there are few benchmark data sets for one-class classification, all data sets considered in the evaluation have multiple classes. Each class in turn is considered as the target class and the units in the other classes are considered as new units to be classified. The results of the comparison show the good performance of the typicality approach, which is applicable to high-dimensional data; it is worth mentioning that it can be used with any kind of data (continuous, discrete, or nominal), whereas the application of the state-of-the-art approaches is not straightforward when nominal variables are present.
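
The typicality-based approach is only named in the abstract; as a generic, hypothetical sketch of how such a one-class rule can work (not the authors' exact procedure), the example below scores a new observation by the fraction of target-class training points whose distance to the target mean is at least as large, and compares it with a one-class SVM baseline from scikit-learn, which is closely related to support vector data description.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(4)

# Hypothetical target class (e.g. features of healthy ECGs) and new points to classify.
target = rng.normal(loc=0.0, scale=1.0, size=(200, 5))
new_points = np.vstack([rng.normal(0.0, 1.0, size=(5, 5)),    # target-like points
                        rng.normal(4.0, 1.0, size=(5, 5))])   # non-target-like points

def typicality(train, x):
    """p-value-style score: fraction of training points at least as far from the mean as x."""
    center = train.mean(axis=0)
    d_train = np.linalg.norm(train - center, axis=1)
    d_x = np.linalg.norm(x - center)
    return (np.sum(d_train >= d_x) + 1) / (len(d_train) + 1)

scores = np.array([typicality(target, x) for x in new_points])
accepted = scores > 0.05                                  # accept as target if not atypical

ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(target)
svm_accept = ocsvm.predict(new_points) == 1               # +1 = inlier, -1 = outlier

print("typicality acceptance:", accepted.astype(int))
print("one-class SVM:        ", svm_accept.astype(int))
```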