7 results for Imbalanced datasets
at Universidade do Minho
Abstract:
Due to advances in information technology (e.g., digital video cameras, ubiquitous sensors), the automatic detection of human behaviors from video is a very recent research topic. In this paper, we perform a systematic review of the recent literature on this topic, from 2000 to 2014, covering a selection of 193 papers retrieved from six major scientific publishers. The selected papers were classified into three main subjects: detection techniques, datasets and applications. The detection techniques were divided into four categories (initialization, tracking, pose estimation and recognition). The list of datasets includes eight examples (e.g., Hollywood action). Finally, several application areas were identified, including human detection, abnormal activity detection, action recognition, player modeling and pedestrian detection. Our analysis provides a road map to guide future research for designing automatic visual human behavior detection systems.
Abstract:
Special issue guest editorial, June 2015.
Abstract:
Data traces, consisting of logs about the use of mobile and wireless networks, have been used to study the statistics of encounters between mobile nodes, in an attempt to predict the performance of opportunistic networks. Understanding the role and potential of mobile devices as relaying nodes in message dissemination and delivery depends on knowledge of the patterns and number of encounters among nodes. Data traces about the use of WiFi networks are widely available and can be used to extract large datasets of encounters between nodes. However, these logs only capture indirect encounters between nodes, and the resulting encounter datasets might not realistically represent the spatial and temporal behaviour of nodes. This paper addresses the impact of overlap between the coverage areas of different Access Points of WiFi networks on the extraction of encounter datasets from the usage logs. Simulation and real-world experimental results show that indirect encounter traces extracted directly from these logs strongly underestimate the opportunities for direct node-to-node message exchange in opportunistic networks.
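As a rough illustration of what extracting indirect encounters from WiFi usage logs can look like, the sketch below treats two nodes as having met whenever they were associated with the same Access Point during overlapping time intervals. This is a minimal sketch under assumptions, not the procedure used in the paper: the record layout `(node, ap, start, end)` and the function name `indirect_encounters` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def indirect_encounters(sessions):
    """Derive indirect encounters from association sessions.

    `sessions` is an iterable of (node, ap, start, end) tuples; this
    layout is an assumption made for illustration only.
    Returns a sorted list of (node_a, node_b, ap, start, end) tuples.
    """
    # Group association sessions by Access Point.
    by_ap = defaultdict(list)
    for node, ap, start, end in sessions:
        by_ap[ap].append((node, start, end))

    encounters = []
    for ap, records in by_ap.items():
        # Compare every pair of sessions seen at the same Access Point.
        for (n1, s1, e1), (n2, s2, e2) in combinations(records, 2):
            if n1 == n2:
                continue  # the same node reconnecting is not an encounter
            start, end = max(s1, s2), min(e1, e2)
            if start < end:  # the two sessions overlap in time
                a, b = sorted((n1, n2))
                encounters.append((a, b, ap, start, end))
    return sorted(encounters)


# Hypothetical sample: A and B share ap1; C is on ap2 the whole time.
sample = [
    ("A", "ap1", 0, 10),
    ("B", "ap1", 5, 15),
    ("C", "ap2", 6, 9),
]
result = indirect_encounters(sample)
```

In this sample only the pair A and B is reported. Node C is never matched with anyone, even if the coverage area of `ap2` overlaps that of `ap1` and C was within direct radio range of A or B, which is the kind of missed opportunity the abstract describes.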
Abstract:
Measurements of the centrality and rapidity dependence of inclusive jet production in $\sqrt{s_{NN}} = 5.02$ TeV proton-lead (p+Pb) collisions and the jet cross-section in $\sqrt{s} = 2.76$ TeV proton-proton collisions are presented. These quantities are measured in datasets corresponding to an integrated luminosity of 27.8 nb$^{-1}$ and 4.0 pb$^{-1}$, respectively, recorded with the ATLAS detector at the Large Hadron Collider in 2013. The p+Pb collision centrality was characterised using the total transverse energy measured in the pseudorapidity interval $-4.9 < \eta < -3.2$ in the direction of the lead beam. Results are presented for the double-differential per-collision yields as a function of jet rapidity and transverse momentum ($p_{\mathrm{T}}$) for minimum-bias and centrality-selected p+Pb collisions, and are compared to the jet rate from the geometric expectation. The total jet yield in minimum-bias events is slightly enhanced above the expectation in a $p_{\mathrm{T}}$-dependent manner but is consistent with the expectation within uncertainties. The ratios of jet spectra from different centrality selections show a strong modification of jet production at all $p_{\mathrm{T}}$ at forward rapidities and for large $p_{\mathrm{T}}$ at mid-rapidity, which manifests as a suppression of the jet yield in central events and an enhancement in peripheral events. These effects imply that the factorisation between hard and soft processes is violated at an unexpected level in proton-nucleus collisions. Furthermore, the modifications at forward rapidities are found to be a function of the total jet energy only, implying that the violations may have a simple dependence on the hard parton-parton kinematics.
Abstract:
A high-resolution mtDNA phylogenetic tree allowed us to look backward in time to investigate purifying selection. Purifying selection was very strong in the last 2,500 years, continuously eliminating pathogenic mutations back until the end of the Younger Dryas (∼11,000 years ago), when a large population expansion likely relaxed selection pressure. This was preceded by a phase of stable selection until another relaxation occurred in the out-of-Africa migration. Demography and selection are closely related: expansions led to relaxation of selection, and higher-pathogenicity mutations significantly decreased the growth of descendants. The only detectable positive selection was the recurrence of highly pathogenic nonsynonymous mutations (m.3394T>C-m.3397A>G-m.3398T>C) at interior branches of the tree, preventing the formation of a dinucleotide STR (TATATA) in the MT-ND1 gene. At the most recent time scale, in 124 mother-children transmissions, purifying selection was detectable through the loss of mtDNA variants with high predicted pathogenicity. A few haplogroup-defining sites were also heteroplasmic, agreeing with a significant propensity in 349 positions in the phylogenetic tree to revert to the ancestral variant. This nonrandom mutation property explains the observation of heteroplasmic mutations at some haplogroup-defining sites in sequencing datasets, which may not indicate poor quality as has been claimed.
Abstract:
Integrated master's dissertation in Biomedical Engineering (area of specialization in Medical Informatics)
Abstract:
Recently, there has been a growing interest in the field of metabolomics, reflected in a remarkable growth in experimental techniques, available data and related biological applications. Indeed, techniques such as Nuclear Magnetic Resonance, Gas or Liquid Chromatography, Mass Spectrometry, and Infrared and UV-visible spectroscopies have provided extensive datasets that can help in tasks such as biological and biomedical discovery, biotechnology and drug development. However, as with other omics data, the analysis of metabolomics datasets poses multiple challenges, both in terms of methodologies and in the development of appropriate computational tools. Indeed, none of the available software tools addresses the multiplicity of existing techniques and data analysis tasks. In this work, we make available a novel R package, named specmine, which provides a set of methods for metabolomics data analysis, including data loading in different formats, pre-processing, metabolite identification, univariate and multivariate data analysis, machine learning, and feature selection. Importantly, the implemented methods provide adequate support for the analysis of data from diverse experimental techniques, integrating a large set of functions from several R packages in a powerful, yet simple to use environment. The package, already available in CRAN, is accompanied by a web site where users can deposit datasets, scripts and analysis reports to be shared with the community, promoting the efficient sharing of metabolomics data analysis pipelines.