152 results for Datasets


Relevance:

10.00% 10.00%

Publisher:

Abstract:

Online forums are becoming a popular way of finding useful
information on the web. Search over forums for existing discussion
threads so far is limited to keyword-based search due
to the minimal effort required on the part of the users. However,
it is often not possible to capture all the relevant context in a
complex query using a small number of keywords. Example-based
search that retrieves similar discussion threads given
one exemplary thread is an alternate approach that can help
the user provide richer context and vastly improve forum
search results. In this paper, we address the problem of
finding similar threads to a given thread. Towards this, we
propose a novel methodology to estimate similarity between
discussion threads. Our method exploits the thread structure
to decompose threads into a set of weighted overlapping
components. It then estimates pairwise thread similarities
by quantifying how well the information in the threads is
mutually contained within each other using lexical similarities
between their underlying components. We compare our
proposed methods on real datasets against state-of-the-art
thread retrieval mechanisms wherein we illustrate that our
techniques outperform others by large margins on popular
retrieval evaluation measures such as NDCG, MAP, Precision@k
and MRR. In particular, consistent improvements of
up to 10% are observed on all evaluation measures.
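
The evaluation measures named in this abstract can be sketched in a few lines. This is a minimal illustration only: the function names and the toy relevance list are assumptions, the average precision here divides by the number of relevant items that appear in the ranked list, and the NDCG is normalised by the ideal reordering of that same list. The abstract does not state the paper's exact evaluation protocol.

```python
import math


def precision_at_k(relevances, k):
    """Fraction of the top-k results that are relevant (relevance > 0)."""
    return sum(1 for r in relevances[:k] if r > 0) / k


def reciprocal_rank(relevances):
    """1 / rank of the first relevant result; 0.0 if nothing is relevant.

    Averaging this value over all queries gives MRR.
    """
    for rank, r in enumerate(relevances, start=1):
        if r > 0:
            return 1.0 / rank
    return 0.0


def average_precision(relevances):
    """Mean of precision@rank taken at each relevant position.

    Averaging this value over all queries gives MAP.
    """
    hits, total = 0, 0.0
    for rank, r in enumerate(relevances, start=1):
        if r > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def dcg_at_k(relevances, k):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(r / math.log2(rank + 1)
               for rank, r in enumerate(relevances[:k], start=1))


def ndcg_at_k(relevances, k):
    """DCG divided by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Toy ranking for one query: 1 = relevant thread, 0 = irrelevant thread.
ranked = [0, 1, 0, 1]
print(precision_at_k(ranked, 2), reciprocal_rank(ranked), ndcg_at_k(ranked, 4))
```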

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Mining seafloor massive sulfides for metals is an emergent industry faced with environmental management challenges. These revolve largely around limits to our current understanding of biological variability in marine systems, a challenge common to all marine environmental management. VentBase was established as a forum where academic, commercial, governmental, and non-governmental stakeholders can develop a consensus regarding the management of exploitative activities in the deep sea. Participants advocate a precautionary approach with the incorporation of lessons learned from coastal studies. This workshop report from VentBase encourages the standardization of sampling methodologies for deep-sea environmental impact assessment. VentBase stresses the need for the collation of spatial data and the importance of datasets amenable to robust statistical analyses. VentBase supports the identification of set-asides to prevent the local extirpation of vent-endemic communities and for the post-extraction recolonization of mine sites. © 2013.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this paper, a novel and effective lip-based biometric identification approach with the Discrete Hidden Markov Model Kernel (DHMMK) is developed. Lips are described by shape features (both geometrical and sequential) on two different grid layouts: rectangular and polar. These features are then specifically modeled by a DHMMK, and learnt by a support vector machine classifier. Our experiments are carried out in a ten-fold cross-validation fashion on three different datasets: the GPDS-ULPGC Face Dataset, the PIE Face Dataset and the RaFD Face Dataset. Results show that our approach achieves average classification accuracies of 99.8%, 97.13%, and 98.10% on these three datasets, respectively, using only two training images per class. Our comparative studies further show that the DHMMK achieved a 53% improvement against the baseline HMM approach. The comparative ROC curves also confirm the efficacy of the proposed lip-contour-based biometrics learned by DHMMK. We also show that the performance of linear and RBF SVMs is comparable under the framework of DHMMK.
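
The ten-fold cross-validation protocol mentioned above can be sketched as follows. This is a generic illustration, not the authors' experimental code: `k_fold_indices`, `cross_validate` and the `fit_and_score` callback are hypothetical names, and the abstract does not say how the two training images per class were drawn within each fold.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validate(n_samples, fit_and_score, k=10, seed=0):
    """Average the score returned by fit_and_score(train_idx, test_idx)."""
    scores = [fit_and_score(train, test)
              for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)
```

In practice `fit_and_score` would train the classifier (here, an SVM over DHMMK features) on the training indices and return its accuracy on the held-out fold.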

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper is concerned with the application of an automated hybrid approach in addressing the university timetabling problem. The approach described is based on the nature-inspired artificial bee colony (ABC) algorithm. An ABC algorithm is a biologically-inspired optimization approach, which has been widely implemented in solving a range of optimization problems in recent years such as job shop scheduling and machine timetabling problems. Although the approach has proven to be robust across a range of problems, it is acknowledged within the literature that there currently exist a number of inefficiencies regarding the exploration and exploitation abilities. These inefficiencies can often lead to a slow convergence speed within the search process. Hence, this paper introduces a variant of the algorithm which utilizes a global best model inspired by particle swarm optimization to enhance the global exploration ability while hybridizing with the great deluge (GD) algorithm in order to improve the local exploitation ability. Using this approach, an effective balance between exploration and exploitation is attained. In addition, a traditional local search approach is incorporated within the GD algorithm with the aim of further enhancing the performance of the overall hybrid method. To evaluate the performance of the proposed approach, two diverse university timetabling datasets are investigated, i.e., Carter's examination timetabling and Socha course timetabling datasets. It should be noted that the two problems differ in complexity and have different solution landscapes. Experimental results demonstrate that the proposed method is capable of producing high quality solutions across both these benchmark problems, showing a good degree of generality in the approach. Moreover, the proposed method produces the best results on some instances as compared with other approaches presented in the literature.
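
As a rough illustration of the great deluge component, the sketch below applies a textbook great deluge loop to a toy exam-clash problem. Everything specific here is an assumption made for illustration: the linear decay schedule, the six-exam conflict list, the three time slots and the single-exam move. It does not reproduce the hybrid method or the benchmark datasets described in the abstract.

```python
import random


def great_deluge(initial, cost, neighbour, iterations=5000, decay=None, seed=0):
    """Minimise `cost` with a basic great deluge search."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    level = current_cost  # the "water level" starts at the initial cost
    if decay is None:
        decay = current_cost / iterations  # assumed linear decay towards zero
    for _ in range(iterations):
        candidate = neighbour(current, rng)
        candidate_cost = cost(candidate)
        # Accept improvements, or any candidate not above the water level.
        if candidate_cost <= current_cost or candidate_cost <= level:
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        level -= decay
    return best, best_cost


# Toy problem (hypothetical data): pairs of exams that share students.
CONFLICTS = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5)]
N_SLOTS = 3


def clash_count(assignment):
    """Number of conflicting exam pairs placed in the same time slot."""
    return sum(1 for a, b in CONFLICTS if assignment[a] == assignment[b])


def move_one_exam(assignment, rng):
    """Move one randomly chosen exam to a randomly chosen slot."""
    new = list(assignment)
    new[rng.randrange(len(new))] = rng.randrange(N_SLOTS)
    return tuple(new)


start = (0,) * 6  # every exam in slot 0
best, best_cost = great_deluge(start, clash_count, move_one_exam)
print(best, best_cost)
```

The acceptance rule is what distinguishes great deluge from plain hill climbing: early on, the high level lets the search wander across worse timetables, and the falling level gradually turns it into a strict descent.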

Relevance:

10.00% 10.00%

Publisher:

Abstract:

HOX genes are master regulators of organ morphogenesis and cell differentiation during embryonic development, and continue to be expressed throughout post-natal life. To test the hypothesis that HOX genes are dysregulated in head and neck squamous cell carcinoma (HNSCC) we defined their expression profile, and investigated the function, transcriptional regulation and clinical relevance of a subset of highly expressed HOXD genes. Two HOXD genes, D10 and D11, showed strikingly high levels in HNSCC cell lines, patient tumor samples and publicly available datasets. Knockdown of HOXD10 in HNSCC cells caused decreased proliferation and invasion, whereas knockdown of HOXD11 reduced only invasion. POU2F1 consensus sequences were identified in the 5' DNA of HOXD10 and D11. Knockdown of POU2F1 significantly reduced expression of HOXD10 and D11 and inhibited HNSCC proliferation. Luciferase reporter constructs of the HOXD10 and D11 promoters confirmed that POU2F1 consensus binding sites are required for optimal promoter activity. In patient tumor samples, a significant association was found between immunohistochemical staining of HOXD10 and both overall and disease-specific survival, adding further support that HOXD10 is dysregulated in head and neck cancer. Additional studies are now warranted to fully evaluate HOXD10 as a prognostic tool in head and neck cancers.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The recent explosion of genetic and clinical data generated from tumor genome analysis presents an unparalleled opportunity to enhance our understanding of cancer, but this opportunity is compromised by the reluctance of many in the scientific community to share datasets and the lack of interoperability between different data platforms. The Global Alliance for Genomics and Health is addressing these barriers and challenges through a cooperative framework that encourages "team science" and responsible data sharing, complemented by the development of a series of application program interfaces that link different data platforms, thus breaking down traditional silos and liberating the data to enable new discoveries and ultimately benefit patients.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Generating timetables for an institution is a challenging and time-consuming task due to different demands on the overall structure of the timetable. In this paper, a new hybrid method which is a combination of a great deluge and artificial bee colony algorithm (INMGD-ABC) is proposed to address the university timetabling problem. The artificial bee colony algorithm (ABC) is a population-based method that has been introduced in recent years and has proven successful in solving various optimization problems effectively. However, as with many search-based approaches, there exist weaknesses in the exploration and exploitation abilities which tend to induce slow convergence of the overall search process. Therefore, hybridization is proposed to compensate for the identified weaknesses of the ABC. Also, inspired by imperialist competitive algorithms, an assimilation policy is implemented in order to improve the global exploration ability of the ABC algorithm. In addition, the Nelder–Mead simplex search method is incorporated within the great deluge algorithm (NMGD) with the aim of enhancing the exploitation ability of the hybrid method in fine-tuning the problem search region. The proposed method is tested on two differing benchmark datasets, i.e., examination and course timetabling datasets. A t-test shows that the proposed approach performs significantly better than the basic ABC algorithm. Finally, the experimental results are compared against state-of-the-art methods in the literature; they are competitive and, in certain cases, achieve some of the best results currently reported in the literature.
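
For orientation, here is a minimal sketch of the basic ABC algorithm that the hybrid builds on, written for continuous minimisation rather than timetabling. The parameter names and defaults (`n_sources`, `limit`, `cycles`) and the sphere test function are assumptions chosen for illustration; the assimilation policy, the Nelder–Mead step and the great deluge hybridisation from the abstract are not included.

```python
import random


def abc_minimise(objective, bounds, n_sources=10, limit=20, cycles=200, seed=0):
    """Basic artificial bee colony search for a minimum of `objective`.

    bounds is a list of (low, high) pairs, one per dimension.
    """
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def try_neighbour(i):
        # Perturb one coordinate of source i relative to another random source.
        k = rng.choice([s for s in range(n_sources) if s != i])
        j = rng.randrange(dim)
        candidate = list(sources[i])
        step = rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        lo, hi = bounds[j]
        candidate[j] = min(hi, max(lo, candidate[j] + step))
        value = objective(candidate)
        if value < values[i]:  # greedy selection keeps the better position
            sources[i], values[i], trials[i] = candidate, value, 0
        else:
            trials[i] += 1

    sources = [random_source() for _ in range(n_sources)]
    values = [objective(s) for s in sources]
    trials = [0] * n_sources
    best_value = min(values)
    best = list(sources[values.index(best_value)])

    for _ in range(cycles):
        # Employed bees: one local move per food source.
        for i in range(n_sources):
            try_neighbour(i)
        # Onlooker bees: pick sources with probability proportional to fitness.
        weights = [1.0 / (1.0 + v) if v >= 0 else 1.0 + abs(v) for v in values]
        for _ in range(n_sources):
            try_neighbour(rng.choices(range(n_sources), weights=weights)[0])
        # Remember the best source seen so far.
        cycle_best = min(values)
        if cycle_best < best_value:
            best_value = cycle_best
            best = list(sources[values.index(cycle_best)])
        # Scout bee: replace the most exhausted source if it exceeds the limit.
        worst = max(range(n_sources), key=lambda s: trials[s])
        if trials[worst] > limit:
            sources[worst] = random_source()
            values[worst] = objective(sources[worst])
            trials[worst] = 0
    return best, best_value


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best, best_value = abc_minimise(sphere, [(-5.0, 5.0)] * 2)
print(best, best_value)
```

The employed and onlooker phases provide exploitation around known sources, while the scout phase supplies exploration; the weaknesses the abstract refers to concern the balance between these phases.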

Relevance:

10.00% 10.00%

Publisher:

Abstract:

PURPOSE: The purpose of this study is to establish the prevalence of potentially inappropriate prescribing (PIP) in middle-aged adults (45-64 years) in two populations with differing socio-economic profiles, and to investigate factors associated with PIP, using the PROMPT (PRescribing Optimally in Middle-aged People's Treatments) criteria. METHODS: A retrospective cross-sectional study was conducted using 2012 data from the Enhanced Prescribing Database (EPD), covering the full population in Northern Ireland, and the Health Services Executive Primary Care Reimbursement Service (HSE-PCRS) database, covering the most socio-economically deprived third of the population in this age group in the Republic of Ireland. The prevalence for each PROMPT criterion and overall prevalence of PIP were calculated. Logistic regression was used to investigate the association between PIP and gender, age group and polypharmacy. RESULTS: This study included 441,925 patients from the EPD and 309,748 patients from the HSE-PCRS database. Polypharmacy was common in both datasets (46.7 % in the HSE-PCRS and 20.3 % in the EPD). The prevalence of PIP was 42.9 % (95% CI 42.7, 43.1) in the HSE-PCRS and 21.1 % (95% CI 21.0, 21.2) in the EPD. Age group, female gender and polypharmacy were significantly associated with PIP in both populations (p < 0.05) and polypharmacy had the strongest association. CONCLUSIONS: PIP is common amongst middle-aged people with the risk of PIP increasing with polypharmacy. Differences in the prevalence of polypharmacy and PIP between the two populations may relate to heterogeneity in healthcare services and different socio-economic profiles, with higher rates of multimorbidity and associated polypharmacy in more deprived groups.
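
The reported confidence intervals can be checked with a normal-approximation (Wald) interval for a proportion. This is a sketch under assumptions: the abstract does not say which interval method the authors used, and the inputs are the rounded prevalences and cohort sizes quoted above. The function name `wald_interval` is hypothetical.

```python
import math


def wald_interval(prevalence, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    half_width = z * math.sqrt(prevalence * (1.0 - prevalence) / n)
    return prevalence - half_width, prevalence + half_width


# Rounded prevalence of PIP and cohort size, as quoted in the abstract.
for name, p, n in [("HSE-PCRS", 0.429, 309_748), ("EPD", 0.211, 441_925)]:
    low, high = wald_interval(p, n)
    print(f"{name}: {100 * low:.1f} % to {100 * high:.1f} %")
# HSE-PCRS: 42.7 % to 43.1 %
# EPD: 21.0 % to 21.2 %
```

With cohorts of this size the intervals are very narrow, which is why the two populations' prevalences are so clearly separated.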

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Time-domain modelling of single-reed woodwind instruments usually involves a lumped model of the excitation mechanism. The parameters of this lumped model have to be estimated for use in numerical simulations. Several attempts have been made to estimate these parameters, including observations of the mechanics of isolated reeds, measurements under artificial or real playing conditions and estimations based on numerical simulations. In this study an optimisation routine is presented that can estimate reed-model parameters, given the pressure and flow signals in the mouthpiece. The method is validated by testing it on a series of numerically synthesised data. In order to incorporate the actions of the player in the parameter estimation process, the optimisation routine has to be applied to signals obtained under real playing conditions. The estimated parameters can then be used to resynthesise the pressure and flow signals in the mouthpiece. In the case of measured data, as opposed to numerically synthesised data, special care needs to be taken while modelling the bore of the instrument. In fact, a careful study of various experimental datasets revealed that for resynthesis to work, the bore termination impedance should be known very precisely from theory. An example is given where the above requirement is satisfied and the resynthesised signals closely match the original signals generated by the player.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Aim: Our primary aim is to understand how assemblages of rare (restricted range) and common (widespread) species are correlated with each other among different taxa. We tested the proposition that marine species richness patterns of rare and common species differ, both within a taxon in their contribution to the richness pattern of the full assemblage and among taxa in the strength of their correlations with each other. Location: The UK intertidal zone. Methods: We used high-resolution marine datasets for UK intertidal macroalgae, molluscs and crustaceans, each with more than 400 species. We estimated the relative contribution of rare and common species, treating rarity and commonness as a continuous spectrum, to spatial patterns in richness using spatial cross-correlations. Correlation strength and significance were estimated both within and between taxa. Results: Common species drove richness patterns within taxa, but rare species contributed more when species were placed on an equal footing via scaling by binomial variance. Between taxa, relatively small sub-assemblages (fewer than 60 species) of common species produced the maximum correlation with each other, regardless of taxon pairing. Cross-correlations between rare species were generally weak, with maximum correlation occurring between small sub-assemblages in only one case. Cross-correlations between common and rare species of different taxa were consistently weak or absent. Main conclusions: Common species in the three marine assemblages were congruent in their richness patterns, but rare species were generally not. The contrast between the stronger correlations among common species and the weak or absent correlations among rare species indicates a decoupling of the processes driving common and rare species richness patterns. The internal structure of richness patterns of these marine taxa is similar to that observed for terrestrial taxa.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this paper we propose a novel recurrent neural network architecture for video-based person re-identification. Given the video sequence of a person, features are extracted from each frame using a convolutional neural network that incorporates a recurrent final layer, which allows information to flow between time-steps. The features from all time-steps are then combined using temporal pooling to give an overall appearance feature for the complete sequence. The convolutional network, recurrent layer, and temporal pooling layer are jointly trained to act as a feature extractor for video-based re-identification using a Siamese network architecture. Our approach makes use of colour and optical flow information in order to capture appearance and motion information which is useful for video re-identification. Experiments are conducted on the iLIDS-VID and PRID-2011 datasets to show that this approach outperforms existing methods of video-based re-identification.
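
The temporal pooling and Siamese comparison steps can be illustrated with plain lists. This sketch assumes mean pooling and a Euclidean distance between pooled features; the abstract does not specify either choice, and the convolutional and recurrent layers that would produce the per-frame features are omitted. All names and numbers below are hypothetical.

```python
import math


def temporal_mean_pool(frame_features):
    """Average per-frame feature vectors into one sequence-level vector."""
    n_frames = len(frame_features)
    dim = len(frame_features[0])
    return [sum(frame[d] for frame in frame_features) / n_frames
            for d in range(dim)]


def euclidean_distance(u, v):
    """Distance between two pooled sequence features."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Two toy sequences of per-frame features (made-up numbers).
sequence_a = [[1.0, 2.0], [3.0, 4.0]]
sequence_b = [[2.0, 2.0], [2.0, 5.0]]
distance = euclidean_distance(temporal_mean_pool(sequence_a),
                              temporal_mean_pool(sequence_b))
print(distance)
```

In a Siamese setup, training would push this distance down for two sequences of the same person and up for sequences of different people.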

https://github.com/niallmcl/Recurrent-Convolutional-Video-ReID
Project Source Code

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This study examines the potential of next-generation sequencing based ‘genotyping-by-sequencing’ (GBS) of microsatellite loci for rapid and cost-effective genotyping in large-scale population genetic studies. The recovery of individual genotypes from large sequence pools was achieved by PCR-incorporated combinatorial barcoding using universal primers. Three experimental conditions were employed to explore the possibility of using this approach with existing and novel multiplex marker panels and a weighted amplicon mixture. The GBS approach was validated against microsatellite data generated by capillary electrophoresis. GBS allows access to the underlying nucleotide sequences that can reveal homoplasy, even in large datasets, and facilitates cross-laboratory transfer. GBS of microsatellites, using individual combinatorial barcoding, is potentially faster and cheaper than current microsatellite approaches and offers better and more data.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We present a new wrapper feature selection algorithm for human detection. This algorithm is a hybrid feature selection approach combining the benefits of filter and wrapper methods. It allows the selection of an optimal feature vector that well represents the shapes of the subjects in the images. In detail, the proposed feature selection algorithm adopts the k-fold subsampling and sequential backward elimination approach, while the standard linear support vector machine (SVM) is used as the classifier for human detection. We apply the proposed algorithm to the publicly accessible INRIA and ETH pedestrian full image datasets with the PASCAL VOC evaluation criteria. Compared to other state-of-the-art algorithms, our feature selection based approach can improve the detection speed of the SVM classifier by over 50% with up to 2% better detection accuracy. Our algorithm also outperforms the equivalent systems introduced in the deformable part model approach with around 9% improvement in the detection accuracy.
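
Sequential backward elimination, the wrapper step named above, can be sketched generically. The `score` callback stands in for whatever evaluation the wrapper uses (in the abstract, a linear SVM evaluated with k-fold subsampling); the toy scoring function and feature names below are assumptions for illustration only, and the filter stage of the hybrid is not shown.

```python
def backward_elimination(features, score, min_features=1):
    """Greedy sequential backward elimination.

    score(subset) returns a float where higher is better,
    e.g. a mean cross-validated detection accuracy.
    """
    selected = list(features)
    best_score = score(selected)
    while len(selected) > min_features:
        # Score every subset obtained by dropping exactly one feature.
        trials = [(score([f for f in selected if f != drop]), drop)
                  for drop in selected]
        trial_score, drop = max(trials, key=lambda t: t[0])
        if trial_score < best_score:
            break  # every possible removal hurts, so stop
        selected.remove(drop)
        best_score = trial_score
    return selected, best_score


# Toy score (hypothetical): reward informative features, penalise vector length.
INFORMATIVE = {"a", "c"}


def toy_score(subset):
    return sum(1 for f in subset if f in INFORMATIVE) - 0.1 * len(subset)


selected, best_score = backward_elimination(["a", "b", "c", "d"], toy_score)
print(selected, best_score)
```

A shorter feature vector is what yields the speed-up reported in the abstract: the classifier evaluates fewer dimensions per detection window.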

Relevance:

10.00% 10.00%

Publisher:

Abstract:

With the rapid development of the internet-of-things (IoT), face scrambling has been proposed for privacy protection during IoT-targeted image/video distribution. Consequently, in these IoT applications, biometric verification needs to be carried out in the scrambled domain, presenting significant challenges in face recognition. Since face models become chaotic signals after scrambling/encryption, a typical solution is to utilize traditional data-driven face recognition algorithms. While chaotic pattern recognition is still a challenging task, in this paper we propose a new ensemble approach, Many-Kernel Random Discriminant Analysis (MK-RDA), to discover discriminative patterns from chaotic signals. We also incorporate a salience-aware strategy into the proposed ensemble method to handle chaotic facial patterns in the scrambled domain, where random selections of features are made on semantic components via salience modelling. In our experiments, the proposed MK-RDA was tested rigorously on three human face datasets: the ORL face dataset, the PIE face dataset and the PUBFIG wild face dataset. The experimental results successfully demonstrate that the proposed scheme can effectively handle chaotic signals and significantly improve the recognition accuracy, making our method a promising candidate for secure biometric verification in emerging IoT applications.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Purpose:
A number of independent gene expression profiling studies have identified transcriptional subtypes in colorectal cancer (CRC) with potential diagnostic utility, culminating in publication of a CRC Consensus Molecular Subtype classification. The worst prognostic subtype has been defined by genes associated with stem-like biology. Recently, it has been shown that the majority of genes associated with this poor prognostic group are stromal-derived. We investigated the potential for tumor misclassification into multiple diagnostic subgroups based on tumoral region sampled.

Experimental Design:
We performed multi-region tissue RNA extraction/transcriptomic analysis using Colorectal Specific Arrays on invasive front, central tumor and lymph node regions selected from tissue samples from 25 CRC patients.

Results:
We identified a consensus 30-gene list which represents the intratumoral heterogeneity within a cohort of primary CRC tumors. Using a series of online datasets, we showed that this gene list displays prognostic potential (HR = 2.914, CI 0.9286-9.162) in stage II/III CRC patients, but in addition we demonstrated that these genes are stromal-derived, challenging the assumption that poor prognosis tumors with stem-like biology have undergone a widespread Epithelial Mesenchymal Transition (EMT). Most importantly, we showed that patients can be simultaneously classified into multiple diagnostically relevant subgroups based purely on the tumoral region analysed.

Conclusions:
Gene expression profiles derived from the non-malignant stromal region can influence assignment of CRC transcriptional subtypes, questioning the current molecular classification dogma and highlighting the need to consider pathology sampling region and degree of stromal infiltration when employing transcription-based classifiers to underpin clinical decision-making in CRC.