919 results for genoma, genetica, dna, bioinformatica, mapreduce, snp, gwas, big data, sequenziamento, pipeline


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Open Research Data - A step by step guide through the research data lifecycle, data set creation, big data vs long-tail, metadata, data centres/data repositories, open access for data, data sharing, data citation and publication.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This research evaluates pattern recognition techniques on a subclass of big data where the dimensionality of the input space (p) is much larger than the number of observations (n). Specifically, we evaluate massive gene expression microarray cancer data where the ratio κ = n/p is less than one. We explore the statistical and computational challenges inherent in these high dimensional low sample size (HDLSS) problems and present statistical machine learning methods used to tackle and circumvent these difficulties. Regularization and kernel algorithms were explored in this research using seven datasets where κ < 1. These techniques require special attention to tuning, which made it necessary to investigate several extensions of cross-validation to support better predictive performance. While no single algorithm was universally the best predictor, the regularization technique produced lower test errors in five of the seven datasets studied.
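As a rough illustration of tuning a regularized model by cross-validation when p is much larger than n, the sketch below fits ridge regression in its dual (linear-kernel) form, so only an n-by-n system is solved, and picks the penalty by k-fold cross-validation. It is not the method evaluated in the abstract: the function names, the synthetic data, and the penalty grid are all assumptions made for the example.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_dual_ridge(X, y, lam):
    """Dual ridge coefficients: alpha = (X X^T + lam * I)^-1 y."""
    n = len(X)
    K = [[dot(X[i], X[j]) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return solve(K, y)

def predict(X_train, alpha, x):
    """Prediction for x as a weighted sum of inner products with training rows."""
    return sum(a * dot(xi, x) for a, xi in zip(alpha, X_train))

def cv_error(X, y, lam, k=5):
    """Mean squared error of dual ridge under k-fold cross-validation."""
    n = len(X)
    sq_err = 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))
        train = [i for i in range(n) if i not in held_out]
        Xt, yt = [X[i] for i in train], [y[i] for i in train]
        alpha = fit_dual_ridge(Xt, yt, lam)
        sq_err += sum((predict(Xt, alpha, X[i]) - y[i]) ** 2 for i in held_out)
    return sq_err / n

def select_lambda(X, y, grid, k=5):
    """Return the penalty in grid with the lowest cross-validated error."""
    return min(grid, key=lambda lam: cv_error(X, y, lam, k))

# Hypothetical synthetic HDLSS data: n = 30 observations, p = 200 features.
random.seed(0)
n, p = 30, 200
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [row[0] - 2 * row[1] + random.gauss(0, 0.1) for row in X]
best_lam = select_lambda(X, y, grid=[0.01, 1.0, 100.0])
print("selected lambda:", best_lam)
```

Working in the dual keeps the cost tied to n, not p, which is the usual reason to prefer it in this regime.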

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper studies the Matthew Effect on the Sina Weibo microblogging platform. We take the microblogs in the ranking list of the Hot Microblog app on Sina Weibo as the target of our study. We analyze how the repost counts of microblogs in the ranking list differ before and after they enter the list. We also compare the spread features of the microblogs in the ranking list with those of hot microblogs not in the list and with those of ordinary microblogs from users who previously had a microblog in the ranking list. Our study demonstrates the existence of the Matthew Effect in social networks. © 2013 IEEE.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The miniaturization, sophistication, proliferation, and accessibility of technologies are enabling the capture of more and previously inaccessible phenomena in Parkinson's disease (PD). However, more information has not translated into a greater understanding of disease complexity to satisfy diagnostic and therapeutic needs. Challenges include noncompatible technology platforms, the need for wide-scale and long-term deployment of sensor technology (among vulnerable elderly patients in particular), and the gap between the "big data" acquired with sensitive measurement technologies and their limited clinical application. Major opportunities could be realized if new technologies are developed as part of open-source and/or open-hardware platforms that enable multichannel data capture sensitive to the broad range of motor and nonmotor problems that characterize PD and are adaptable into self-adjusting, individualized treatment delivery systems. The International Parkinson and Movement Disorders Society Task Force on Technology is entrusted to convene engineers, clinicians, researchers, and patients to promote the development of integrated measurement and closed-loop therapeutic systems with high patient adherence that also serve to (1) encourage the adoption of clinico-pathophysiologic phenotyping and early detection of critical disease milestones, (2) enhance the tailoring of symptomatic therapy, (3) improve subgroup targeting of patients for future testing of disease-modifying treatments, and (4) identify objective biomarkers to improve the longitudinal tracking of impairments in clinical care and research. This article summarizes the work carried out by the task force toward identifying challenges and opportunities in the development of technologies with potential for improving the clinical management and the quality of life of individuals with PD. © 2016 International Parkinson and Movement Disorder Society.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Sensing technology is a key enabler of the Internet of Things (IoT) and can produce huge volumes of data that contribute to the Big Data paradigm. Modelling sensing information is an important and challenging topic that substantially influences the quality of smart city systems. In this paper, the author discusses the relevant technologies and information modelling in the context of the smart city and, in particular, reports an investigation of how to model sensing and location information to support smart city development.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Purpose – The purpose of this paper is to examine challenges and potential of big data in heterogeneous business networks and relate these to an implemented logistics solution. Design/methodology/approach – The paper establishes an overview of challenges and opportunities of current significance in the area of big data, specifically in the context of transparency and processes in heterogeneous enterprise networks. Within this context, the paper presents how existing components and purpose-driven research were combined for a solution implemented in a nationwide network for less-than-truckload consignments. Findings – Aside from providing an extended overview of today’s big data situation, the findings have shown that technical means and methods available today can comprise a feasible process transparency solution in a large heterogeneous network where legacy practices, reporting lags and incomplete data exist, yet processes are sensitive to inadequate policy changes. Practical implications – The means introduced in the paper were found to be of utility value in improving process efficiency, transparency and planning in logistics networks. The particular system design choices in the presented solution allow an incremental introduction or evolution of resource handling practices, incorporating existing fragmentary, unstructured or tacit knowledge of experienced personnel into the theoretically founded overall concept. Originality/value – The paper extends the previous high-level view of the potential of big data, and presents new applied research and development results in a logistics application.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The globalization of business life and the new tools offered by technological development present partly new opportunities and partly new tasks and expectations for both academic and practical marketing research. Data collection methods are changing; as consumers' attitudes and behavior change, the emphasis among primary research methods is also shifting, and the role of observational and experimental studies is growing. The boundary between qualitative and quantitative research is also blurring, and new types of methods are appearing and spreading in both research methodologies. Marketing research and data analysis must also integrate the possibilities of large databases, of Big Data. The study also attempts to outline the current changes as well as future scenarios.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Jacquemontia reclinata House (Convolvulaceae) is a federally-listed endangered species endemic to coastal strand habitat of southeastern Florida, from Palm Beach to Miami-Dade counties. Although J. reclinata is currently defined as a species, its taxonomic distinctness has never been analyzed using phylogenetic evidence. In order to assess the evolutionary distinctness of J. reclinata and identify its closest relatives, internal transcribed spacer (ITS) regions within nuclear ribosomal DNA were sequenced, and the sequence data were used to reconstruct a phylogeny of Jacquemontia. The study included the three putative relatives of J. reclinata and all other species within Jacquemontia known to occur in the Greater Antilles and Bahamas, except for three species. Results concur with previous morphological studies, which suggest that J. reclinata is closely related to J. cayensis Britton, J. curtisii Peter, and J. havanensis Urban. These three species and J. reclinata form an unresolved clade. Therefore, it is not certain which of these Caribbean species is sister to J. reclinata. The lack of resolution within the clade that includes J. reclinata implies that the taxa within the clade are evolutionarily similar. Future taxonomic studies of J. reclinata should focus on resolving relationships within the Jacquemontia reclinata clade.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Graph reduction machines are a traditional technique for implementing functional programming languages. They run programs by transforming graphs through the successive application of reduction rules. Web service composition enables the creation of new web services from existing ones. BPEL is a workflow-based language for creating web service compositions. It is also the industrial and academic standard for this kind of language. Because it is designed to compose web services, using BPEL in a scenario where multiple technologies are needed is problematic: when operations other than web services must be performed to implement the business logic of a company, part of the work is done on an ad hoc basis. Allowing heterogeneous operations to be part of the same workflow may help to improve the implementation of business processes in a principled way. This work uses a simple variation of the BPEL language for creating compositions containing not only web service operations but also big data tasks or user-defined operations. We define an extensible graph reduction machine that allows the evaluation of BPEL programs and implement this machine as a proof of concept. We present some experimental results.
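To make the idea of an extensible graph reduction machine concrete, here is a minimal sketch. It does not reproduce the machine or the BPEL variant from the abstract: the `Node` and `Machine` classes and the `fetch`, `count`, and `add` rules are hypothetical stand-ins for a web service call, a big data task, and a user-defined operation. Evaluation is eager for simplicity. Each reduced node is overwritten in place with its result, so a shared subgraph is reduced only once.

```python
class Node:
    """A graph node: a literal value, or an operation applied to child nodes."""

    def __init__(self, op=None, args=(), value=None):
        self.op = op
        self.args = list(args)
        self.value = value

    @property
    def is_value(self):
        return self.op is None


class Machine:
    """Evaluates a graph by applying registered reduction rules."""

    def __init__(self):
        self.rules = {}   # operation name -> Python callable
        self.steps = 0    # number of reductions performed

    def rule(self, op):
        """Decorator that registers a new reduction rule (the extension point)."""
        def register(fn):
            self.rules[op] = fn
            return fn
        return register

    def reduce(self, node):
        if node.is_value:
            return node.value
        args = [self.reduce(child) for child in node.args]
        result = self.rules[node.op](*args)
        # Overwrite the node with its result so shared references see the value.
        node.op, node.args, node.value = None, [], result
        self.steps += 1
        return result


machine = Machine()

@machine.rule("fetch")   # hypothetical stand-in for a web service invocation
def fetch(text):
    return text.split()

@machine.rule("count")   # hypothetical stand-in for a big data task
def count(words):
    return len(words)

@machine.rule("add")     # hypothetical user-defined operation
def add(a, b):
    return a + b

# The same subgraph is referenced twice but reduced only once.
shared = Node("count", [Node("fetch", [Node(value="to be or not to be")])])
workflow = Node("add", [shared, shared])
result = machine.reduce(workflow)
print(result, machine.steps)
```

Supporting a new kind of operation only requires registering another rule; the reduction loop itself stays unchanged.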

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Social media classification problems have drawn increasing attention in the past few years. With the rapid development of the Internet and the popularity of computers, there is an astronomical amount of information on social networks (social media platforms). The datasets are generally large scale and are often corrupted by noise. The presence of noise in the training set has a strong impact on the performance of supervised learning (classification) techniques. This thesis presents a budget-driven One-class SVM approach that is suitable for large-scale social media data classification. Our approach is based on an existing online One-class SVM learning algorithm, referred to as STOCS (Self-Tuning One-Class SVM). To justify our choice, we first analyze the noise resilience of STOCS using synthetic data. The experiments suggest that STOCS is more robust against label noise than several other existing approaches. Next, to handle the big data classification problem for social media data, we introduce several budget-driven features, which allow the algorithm to be trained within limited time and under limited memory. In addition, the resulting algorithm can be easily adapted to changes in dynamic data with minimal computational cost. Compared with two state-of-the-art approaches, Lib-Linear and kNN, our approach is shown to be competitive with lower memory and time requirements.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

With the advent of the Internet, the number of users with effective access to the network and the ability to share information with the whole world has grown continuously over the years. With the introduction of social media, moreover, users are led to transfer a large amount of personal information to the web, making it available to various companies. In addition, the world of the Internet of Things, in which sensors and machines act as agents on the network, gives each user a larger number of devices connected directly to one another and to the global network. In proportion to these factors, the volume of data generated and stored is also growing dramatically, giving rise to a new concept: Big Data. Consequently, there is a need for new tools that can exploit the computing power offered today by more complex architectures that comprise, within a single system, a set of hosts useful for analysis. In this regard, such a vast quantity of data, routine when speaking of Big Data, combined with an equally high speed of transmission and transfer, makes storing the data difficult, all the more so if the storage techniques are traditional DBMSs. A classic relational solution, in fact, would allow data to be processed only on request, producing delays, significant latency, and the inevitable loss of portions of the dataset. It is therefore necessary to turn to new technologies and tools suited to needs different from classic batch analysis. In particular, this thesis addresses Data Stream Processing, designing and prototyping a system based on Apache Storm, with cyber security chosen as the field of application.
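The contrast between processing data as it arrives and querying it on request can be sketched in a few lines of plain Python. This is not the Apache Storm API and not the thesis prototype: the event format `(timestamp, source)`, the function names, the window width, and the alert threshold are all assumptions made for illustration. Events are assumed to arrive in time order and are counted per source in fixed-width (tumbling) windows; a source that reaches the threshold within one window raises an alert.

```python
from collections import Counter

def tumbling_windows(events, width):
    """Group time-ordered (timestamp, source) events into fixed-width windows.

    Yields (window_start, Counter of events per source) as each window closes.
    """
    window_start, counts = None, Counter()
    for ts, source in events:
        if window_start is None:
            window_start = ts - ts % width
        # Close every window that ended before this event arrived.
        while ts >= window_start + width:
            yield window_start, counts
            window_start, counts = window_start + width, Counter()
        counts[source] += 1
    if window_start is not None:
        yield window_start, counts

def alerts(events, width, threshold):
    """Yield (window_start, source, count) for sources at or above threshold."""
    for start, counts in tumbling_windows(events, width):
        for source, n in counts.items():
            if n >= threshold:
                yield start, source, n

# Hypothetical stream of failed-login events: (seconds, source address).
events = [
    (0, "10.0.0.1"), (1, "10.0.0.1"), (2, "10.0.0.2"),
    (3, "10.0.0.1"), (12, "10.0.0.2"), (25, "10.0.0.1"),
]
for alert in alerts(events, width=10, threshold=3):
    print(alert)
```

Because both functions are generators, each event is handled once as it arrives and only the current window is held in memory.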

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Technological evolution in contemporary communication structures digital systems through networks of connected computers and the massive use of technological devices. Digital data captured and distributed via applications installed on smartphones create a dynamic communication environment. Journalism and Communication try to adapt to the new informational ecosystem brought about by constant technological innovations that enable the creation of new environments and systems for access to socially relevant information. New tools are emerging for producing and distributing journalistic content, products based on data and intelligent interactions, algorithms used in various processes, hyperlocal platforms, and digital narrative and production systems. In this context, the aim of the research was to analyze and compare specific media and technology products. It examines whether new technologies add attributes to journalistic productions and narratives, their impacts on the practice of the activity, and whether the processes of producing socially relevant information are modified in relation to traditional, established journalistic processes. It investigates whether the use of information entered by users in real time improves the quality of emerging narratives on mobile devices, and whether gamification changes the perception of journalism's credibility. The goal is to rethink the way information and knowledge are produced and generated for the audiences that demand content

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Collaborative communities, in which large numbers of people collaborate to produce shared resources (e.g. GitHub, Wikipedia, OpenStreetMap, Arduino, StackOverflow), are progressively spreading to many fields. However, it is difficult to understand how they work and evolve. What types of users are most active on Wikia? How has the number of active wikis evolved in recent years? What activity profile do most Wikia contributors show? Are men or women more active on Wikipedia? In GitHub projects, is the programming effort (and commit frequency) distributed evenly over time, or does it tend to be concentrated? These communities, typically online, record their activity in large databases, many of them publicly available. However, the average citizen has neither the tools nor the knowledge needed to draw conclusions from that data. In this final degree project (TFG) we develop a tool for exploratory analysis and visualization of data from the Wikia platform, a collaborative website that allows the creation, editing, and modification of the content and structure of thousands of encyclopedia-style web pages based on wiki technology. Our goal is for this web application to be usable by anyone, without requiring the user to be a Big Data expert in order to view charts of the evolution or distributions of the community's internal behavior, modify some of their parameters, and see how they change. As a result of this work, a first version of the application has been developed and is available on GitHub and at http://chartsup.esy.es/

Relevance:

100.00% 100.00%

Publisher:

Abstract:

During the project I learned about Big Data, Android, and MongoDB while helping to develop a system for predicting bipolar disorder crises through the large-scale analysis of information from diverse sources. Specifically, I wrote a theoretical part on NoSQL databases, Spark Streaming, and neural networks, and then designed and configured a MongoDB database for the bipolar disorder project. I also learned about Android and designed and developed an Android mobile application to collect data for use as input to the crisis prediction system. Once development of the application was finished, I also carried out an evaluation with users.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

ACKNOWLEDGEMENTS This research is based upon work supported in part by the U.S. ARL and U.K. Ministry of Defense under Agreement Number W911NF-06-3-0001, and by the NSF under award CNS-1213140. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views or represent the official policies of the NSF, the U.S. ARL, the U.S. Government, the U.K. Ministry of Defense or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.