919 results for genoma, genetica, dna, bioinformatica, mapreduce, snp, gwas, big data, sequenziamento, pipeline


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Among the main challenges facing university teaching today is the move toward student-centred teaching models, in which students develop and direct their own learning autonomously (with tutoring) in both in-class and out-of-class activities. In this regard, the ability to work with large, freely accessible georeferenced databases offers great potential for research and teaching in Urban Planning. Acting as guides in the process of understanding and using large-scale data is therefore one of the main current challenges for teachers of Urban Planning courses. This article describes the experience developed at the Universidad de Alicante (UA), which aims to introduce students to the intelligent consumption of information so that they can carry out their own analyses and reach their own interpretations. The paper presents the methods and tools used for this purpose, which open the way to new dynamic ways of relating to knowledge, to new active educational practices and, above all, to the creation of a new social awareness that is more conscious of and in keeping with the world we inhabit.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The SmartUA Dashboard (Cuadro de Mando SmartUA) is a software application that makes it easy to locate and view, at any time and from anywhere, all the information collected from the various data sources and sensor networks generated by the Smart University project of the Universidad de Alicante; to represent it as maps and charts; to search and filter that information; and to show the university community in particular, and the public in general, in an objective and intelligible way, the phenomena occurring on campus, interconnecting systems and people for better use of resources, efficient management and continuous innovation.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In order to become better prepared to support Research Data Management (RDM) practices in sciences and engineering, Queen’s University Library, together with the University Research Services, conducted a research study of all ranks of faculty members, as well as postdoctoral fellows and graduate students at the Faculty of Engineering & Applied Science, Departments of Chemistry, Computer Science, Geological Sciences and Geological Engineering, Mathematics and Statistics, Physics, Engineering Physics & Astronomy, School of Environmental Studies, and Geography & Planning in the Faculty of Arts and Science.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Internet traffic classification is a relevant and mature research field, yet one of growing importance and with still-open technical challenges, partly because of the pervasive presence of Internet-connected devices in everyday life. We argue that there is a need for innovative traffic classification solutions that are lightweight, adopt a domain-based approach, and classify Internet traffic by subject rather than concentrating only on application-level protocol categorization. To this end, this paper proposes an original classification solution that leverages domain name information extracted from IPFIX summaries, DNS logs, and DHCP leases, and that can be applied to any kind of traffic. The proposed solution is based on an extension of Word2vec unsupervised learning techniques running on a specialized Apache Spark cluster. In particular, learning techniques are used to generate word embeddings from a mixed dataset composed of domain names and natural language corpora in a lightweight way and with general applicability. The paper also reports lessons learnt from our implementation and deployment experience, which demonstrates that our solution can process 5500 IPFIX summaries per second on an Apache Spark cluster with 1 slave instance in Amazon EC2 at a cost of $3860 per year. Reported experimental results for Precision, Recall, F-Measure, Accuracy, and Cohen's Kappa show the feasibility and effectiveness of the proposal. The experiments show that the words contained in domain names are related to the kind of traffic directed towards them; using specifically trained word embeddings, we are therefore able to classify them into customizable categories. We also show that training word embeddings on larger natural language corpora leads to improvements in precision of up to 180%.
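
The five evaluation measures named in this abstract can all be derived from a confusion matrix. The following is a minimal sketch for the binary case only; the function name and the counts used in the usage note are hypothetical and are not taken from the paper, whose categories are customizable and may be multi-class.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common evaluation measures from binary confusion-matrix counts.

    tp, fp, fn, tn are hypothetical counts of true positives, false
    positives, false negatives and true negatives.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / total

    # Cohen's Kappa compares observed agreement (accuracy) with the
    # agreement expected by chance from the row and column totals.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    p_chance = p_yes + p_no
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
        "kappa": kappa,
    }
```

For example, `classification_metrics(40, 10, 20, 30)` gives a precision of 0.8, an accuracy of 0.7 and a kappa of 0.4. The sketch does not guard against empty classes, which would cause a division by zero.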

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This thesis concerns the general trend toward the digital transformation of business processes. This evolution, which involves the use of modern information technologies including Cloud Computing, Big Data Analytics and Mobile tools, is not free of pitfalls, which must be identified and addressed appropriately as they arise. In particular, it refers to a business case, that of the well-known Bologna-based company FAAC spa, and to its purchasing function. In the area of procurement, the company feels the need to restructure and digitalise the request for quotation (RdO, richiesta di offerta) process with its suppliers, so that the purchasing function can concentrate on implementing company strategy rather than on day-to-day operations. This work therefore carries out a project to implement a dedicated e-procurement platform for managing RdOs. First, some examples of project management found in the literature are analysed, and a model for managing the specific project is then defined. The work then comprises: a phase defining the company's continuity objectives, an As-Is analysis of the processes, the definition of the specific project objectives and of the KPIs for performance evaluation, the design of the software platform and, finally, some assessments of the risks and alternatives of the implementation.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this brief article, I analyse a research workshop on “Smart Cities” (Cidades Inteligentes) that I attended. In this analysis, I show how the concepts of “smart city” and “big data” are constructed differently by the two groups of people present at the event, which I classify as “optimisers” and “regulators”. These different ways of seeing the devices in question lead to a series of controversies. First, I frame the way some of these controversies appear within the theoretical framework of the Social Construction of Technology (SCOT). I then aim to show that the controversies that arose during the event were not resolved, and are unlikely to be in the near future, unless one adopts an analytical model such as Actor-Network Theory, which gives a voice to a group ignored in those discussions: the devices employed in constructing the concept of the “Smart City”.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The sociologist who heads one of the major big data analysis centres in Brazil says that politicians will only regain legitimacy when they learn that a "like" is serious business. Detecting trends and taking the pulse of social aspirations has gained volume and real-time speed in the ocean of information of big data, the name given to the gigantic quantity of data produced daily on the internet. It is from this inexhaustible mine that Rio de Janeiro sociologist Marco Aurelio Ruediger, of the Fundação Getulio Vargas, supplies the Diretoria de Análise de Políticas Públicas, a centre that studies how Brazilians view the state machinery and the branches of government.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In September 2015, the UN Member States are expected to commit to an ambitious new set of global goals for a new era of sustainable development. Achieving them will require an unprecedented joint effort on the part of governments at every level, civil society and the private sector, and millions of individual choices and actions. To be realised, the SDGs will require a monitoring and accountability framework and a plan for implementation. A commitment to realise the opportunities of the data revolution should be firmly embedded into the action plan for the SDGs, to support those countries most in need of resources, and to set the world on track for an unprecedented push towards a new world of data for change.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-06

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Australian ghost bat is a large, opportunistic carnivorous species that has undergone a marked range contraction toward more mesic, tropical sites over the past century. Comparison of mitochondrial DNA (mtDNA) control region sequences and six nuclear microsatellite loci in 217 ghost bats from nine populations across subtropical and tropical Australia revealed strong population subdivision (mtDNA phi(ST) = 0.80; microsatellites URST = 0.337). Low-latitude (tropical) populations had higher heterozygosity and less marked phylogeographic structure and lower subdivision among sites within regions (within Northern Territory [NT] and within North Queensland [NQ]) than did populations at higher latitudes (subtropical sites; central Queensland [CQ]), although sampling of geographically proximal breeding sites is unavoidably restricted for the latter. Gene flow among populations within each of the northern regions appears to be male biased in that the difference in population subdivision for mtDNA and microsatellites (NT phi(ST) = 0.39, URST = 0.02; NQ phi(ST) = 0.60, URST = -0.03) is greater than expected from differences in the effective population size of haploid versus diploid loci. The high level of population subdivision across the range of the ghost bat contrasts with evidence for high gene flow in other chiropteran species and may be due to narrow physiological tolerances and consequent limited availability of roosts for ghost bats, particularly across the subtropical and relatively arid regions. This observation is consistent with the hypothesis that the contraction of the species' range is associated with late Holocene climate change. The extreme isolation among higher-latitude populations may predispose them to additional local extinctions if the processes responsible for the range contraction continue to operate.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We analyze a Big Data set of geo-tagged tweets covering one year (Oct. 2013–Oct. 2014) to understand regional linguistic variation in the U.S. Prior work on regional linguistic variation usually took a long time to collect data and focused on either rural or urban areas. Geo-tagged Twitter data offers an unprecedented database with rich linguistic representation at fine spatiotemporal resolution and continuity. From the one-year Twitter corpus, we extract lexical characteristics for Twitter users by summarizing the frequencies of a set of lexical alternations that each user has used. We spatially aggregate and smooth each lexical characteristic to derive county-based linguistic variables, from which orthogonal dimensions are extracted using principal component analysis (PCA). Finally, a regionalization method is used to discover hierarchical dialect regions from the PCA components. The regionalization results reveal interesting regional linguistic variations in the U.S. The discovered regions not only confirm past research findings in the literature but also provide new insights and a more detailed understanding of very recent linguistic patterns in the U.S.
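
The first two steps described in this abstract (per-user lexical characteristics, then aggregation to counties) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the whitespace tokenisation, the use of a simple share between two variants, and the plain county mean. The spatial smoothing, PCA and regionalization steps of the paper are not shown.

```python
from collections import defaultdict


def user_alternation_share(tweets, variant_a, variant_b):
    """Return, per user, the share of variant_a among uses of either variant.

    tweets is an iterable of (user_id, text) pairs. Users who never use
    either variant are left out of the result.
    """
    counts = defaultdict(lambda: [0, 0])
    for user, text in tweets:
        words = text.lower().split()  # naive tokenisation (assumption)
        counts[user][0] += words.count(variant_a)
        counts[user][1] += words.count(variant_b)
    return {
        user: a / (a + b)
        for user, (a, b) in counts.items()
        if a + b > 0
    }


def county_mean(shares, user_county):
    """Average the per-user shares within each county (no smoothing)."""
    by_county = defaultdict(list)
    for user, share in shares.items():
        by_county[user_county[user]].append(share)
    return {county: sum(v) / len(v) for county, v in by_county.items()}
```

Called on a handful of tweets with a hypothetical alternation such as "soda" versus "pop", `user_alternation_share` yields one value per user between 0 and 1, and `county_mean` reduces those to one value per county, which is the kind of county-based variable that PCA could then be applied to.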

Relevance:

100.00% 100.00%

Publisher:

Abstract:

GraphChi is the first reported disk-based graph engine that can handle billion-scale graphs efficiently on a single PC. It can execute several advanced data mining, graph mining and machine learning algorithms on very large graphs. Using the novel technique of parallel sliding windows (PSW) to load subgraphs from disk into memory for updating vertices and edges, it achieves data processing performance close to, and sometimes better than, that of mainstream distributed graph engines. The GraphChi work mentioned that memory is not effectively utilized with large datasets, which leads to suboptimal computation performance. In this paper, motivated by the concepts of 'pin' from TurboGraph and 'ghost' from GraphLab, we propose a new memory utilization mode for GraphChi, called Part-in-memory mode, to improve the performance of GraphChi algorithms. The main idea is to pin a fixed part of the data in memory during the whole computing process. Part-in-memory mode is implemented with only about 40 additional lines of code in the original GraphChi engine. Extensive experiments are performed with large real datasets (including a Twitter graph with 1.4 billion edges). The preliminary results show that the Part-in-memory memory management approach reduces GraphChi running time by up to 60% for the PageRank algorithm. Interestingly, a larger portion of data pinned in memory does not always lead to better performance when the whole dataset cannot fit in memory. There exists an optimal portion of data that should be kept in memory to achieve the best computational performance.
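
The idea of pinning a fixed part of the data in memory can be illustrated with a toy sketch. This is not GraphChi code and does not implement parallel sliding windows: the function name, the shard layout (a dictionary standing in for on-disk edge files) and the load counter are all assumptions made for illustration. It runs PageRank over edge shards and counts how often a shard has to be "read from disk" when some shards are pinned.

```python
def pagerank_part_in_memory(shards, num_vertices, pinned_ids,
                            iterations=10, damping=0.85):
    """Toy PageRank over edge shards with a pinned in-memory subset.

    shards maps a shard id to a list of (src, dst) edges and stands in
    for on-disk data. Shards listed in pinned_ids are loaded once and
    kept; all others are reloaded on every pass. Assumes every vertex
    has at least one outgoing edge. Returns (ranks, disk_loads).
    """
    disk_loads = 0

    def load(shard_id):
        # Simulated disk read: copy the shard and count the access.
        nonlocal disk_loads
        disk_loads += 1
        return list(shards[shard_id])

    # Pin the chosen shards for the whole computation.
    pinned = {sid: load(sid) for sid in pinned_ids}

    def edges_of(shard_id):
        return pinned[shard_id] if shard_id in pinned else load(shard_id)

    # One pass to compute out-degrees.
    out_degree = [0] * num_vertices
    for sid in shards:
        for src, _ in edges_of(sid):
            out_degree[src] += 1

    rank = [1.0 / num_vertices] * num_vertices
    for _ in range(iterations):
        new_rank = [(1.0 - damping) / num_vertices] * num_vertices
        for sid in shards:
            for src, dst in edges_of(sid):
                new_rank[dst] += damping * rank[src] / out_degree[src]
        rank = new_rank
    return rank, disk_loads
```

Pinning changes only the number of simulated loads, not the computed ranks. The sketch counts loads and nothing else, so it cannot reproduce the abstract's observation that pinning more data is not always faster, which depends on how the remaining memory is used.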

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Multiple transformative forces target marketing, many of which derive from new technologies that allow us to sample thinking in real time (i.e., brain imaging), or to look at large aggregations of decisions (i.e., big data). There has been an inclination to refer to the intersection of these technologies with the general topic of marketing as “neuromarketing”. There has not yet been a serious effort to frame neuromarketing, which is the goal of this paper. Neuromarketing can be compared to neuroeconomics: neuroeconomics is generally focused on how individuals make “choices” and represent distributions of choices. Neuromarketing, in contrast, focuses on how a distribution of choices can be shifted or “influenced”, which can occur at multiple “scales” of behavior (e.g., individual, group, or market/society). Given that influence can affect choice through many cognitive modalities, and not just through the valuation of choice options, a science of influence also implies a need to develop a model of cognitive function that integrates attention, memory, and reward/aversion function. The paper concludes with a brief description of three domains of neuromarketing application for studying influence, and their caveats.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Parkinson's disease is a complex heterogeneous disorder with urgent need for disease-modifying therapies. Progress in successful therapeutic approaches for PD will require an unprecedented level of collaboration. At a workshop hosted by Parkinson's UK and co-organized by Critical Path Institute's (C-Path) Coalition Against Major Diseases (CAMD) Consortiums, investigators from industry, academia, government and regulatory agencies agreed on the need for sharing of data to enable future success. Government agencies included EMA, FDA, NINDS/NIH and IMI (Innovative Medicines Initiative). Emerging discoveries in new biomarkers and genetic endophenotypes are contributing to our understanding of the underlying pathophysiology of PD. In parallel there is growing recognition that early intervention will be key for successful treatments aimed at disease modification. At present, there is a lack of a comprehensive understanding of disease progression and the many factors that contribute to disease progression heterogeneity. Novel therapeutic targets and trial designs that incorporate existing and new biomarkers to evaluate drug effects independently and in combination are required. The integration of robust clinical data sets is viewed as a powerful approach to hasten medical discovery and therapies, as is being realized across diverse disease conditions employing big data analytics for healthcare. The application of lessons learned from parallel efforts is critical to identify barriers and enable a viable path forward. A roadmap is presented for a regulatory, academic, industry and advocacy driven integrated initiative that aims to facilitate and streamline new drug trials and registrations in Parkinson's disease.