364 results for Dataset


Relevance:

10.00%

Publisher:

Abstract:

The performance of visual speech recognition (VSR) systems is significantly influenced by the accuracy of the visual front-end. Current state-of-the-art VSR systems use off-the-shelf face detectors such as Viola-Jones (VJ), which have limited reliability under changes in illumination and head pose. For a VSR system to perform well under these conditions, an accurate visual front-end is required. This is an important problem in many practical implementations of audio-visual speech recognition systems, for example in automotive environments for an efficient human-vehicle computer interface. In this paper, we re-examine the current state-of-the-art in VSR by comparing off-the-shelf face detectors with the recently developed Fourier Lucas-Kanade (FLK) image alignment technique. A variety of image alignment and visual speech recognition experiments are performed on a clean dataset as well as on a challenging automotive audio-visual speech dataset. Our results indicate that the FLK image alignment technique can significantly outperform off-the-shelf face detectors, but requires frequent fine-tuning.
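
The contrast between the two front-end strategies can be sketched in a few lines. The snippet below is only an illustration, not the paper's implementation: OpenCV has no Fourier Lucas-Kanade routine, so `findTransformECC` stands in as a generic gradient-based alignment step, and the file names are hypothetical.

```python
# Sketch: off-the-shelf Viola-Jones detection vs. a Lucas-Kanade-style
# alignment of a frame to a reference template (stand-in for FLK).
import cv2
import numpy as np

frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)        # hypothetical video frame
template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)  # hypothetical reference ROI

# 1) Off-the-shelf Viola-Jones face detection.
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
faces = detector.detectMultiScale(frame, scaleFactor=1.1, minNeighbors=5)

# 2) Template alignment: iteratively estimate an affine warp that
#    registers the frame against the reference template.
warp = np.eye(2, 3, dtype=np.float32)
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 50, 1e-6)
_, warp = cv2.findTransformECC(template, frame, warp, cv2.MOTION_AFFINE, criteria)
aligned = cv2.warpAffine(frame, warp, (template.shape[1], template.shape[0]),
                         flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
```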

Relevance:

10.00%

Publisher:

Abstract:

In this paper, we explore the effectiveness of patch-based gradient feature extraction methods when applied to appearance-based gait recognition. Extending existing popular feature extraction methods such as HOG and LDP, we propose a novel technique which we term the Histogram of Weighted Local Directions (HWLD). These 3 methods are applied to gait recognition using the GEI feature, with classification performed using SRC. Evaluations on the CASIA and OULP datasets show significant improvements using these patch-based methods over existing implementations, with the proposed method achieving the highest recognition rate for the respective datasets. In addition, the HWLD can easily be extended to 3D, which we demonstrate using the GEV feature on the DGD dataset, observing improvements in performance.
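
The exact HWLD formulation is not given in the abstract, but the general shape of a patch-based, magnitude-weighted histogram of gradient directions over a gait energy image (GEI) can be sketched as follows; the patch size and bin count here are illustrative choices, not the paper's parameters.

```python
# Minimal sketch of a patch-based, magnitude-weighted direction histogram
# computed over a gait energy image (GEI).
import numpy as np

def weighted_direction_histogram(gei, patch=16, bins=8):
    gy, gx = np.gradient(gei.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)            # orientations in [0, pi)
    h, w = gei.shape
    feats = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            m = mag[y:y + patch, x:x + patch].ravel()
            a = ang[y:y + patch, x:x + patch].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, np.pi), weights=m)
            feats.append(hist / (hist.sum() + 1e-9))   # per-patch normalisation
    return np.concatenate(feats)                        # descriptor fed to the SRC classifier
```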

Relevance:

10.00%

Publisher:

Abstract:

Recommender systems (RS) are now widely applied in commercial e-commerce sites to help users deal with the information overload problem. Recommender systems provide personalized recommendations to users and thus help them make good decisions about which product to buy from the vast range of product choices. Many current recommender systems are developed for simple and frequently purchased products such as books and videos, using collaborative-filtering and content-based approaches. These approaches are not directly applicable to recommending infrequently purchased products such as cars and houses, as it is difficult to collect a large amount of rating data from users for such products. Many e-commerce sites for infrequently purchased products still use basic search-based techniques, whereby the products that match the attributes given in the target user's query are retrieved and recommended. However, search-based recommenders cannot provide personalized recommendations: for different users, the recommendations will be the same if they provide the same query, regardless of any difference in their interests. In this article, a simple user profiling approach is proposed to generate a user's preferences for product attributes (i.e., user profiles) based on user product click-stream data. The user profiles can be used to find similar-minded users (i.e., neighbours) accurately. Two recommendation approaches are proposed, namely the Round-Robin fusion algorithm (CFRRobin) and the Collaborative Filtering-based Aggregated Query algorithm (CFAgQuery), to generate personalized recommendations based on the user profiles. Instead of using the target user's query to search for products as normal search-based systems do, the CFRRobin technique uses the attributes of the products in which the target user's neighbours have shown interest as queries to retrieve relevant products, and then recommends to the target user a list of products obtained by merging and ranking the returned products using the Round-Robin method. The CFAgQuery technique uses the attributes of the products in which the user's neighbours have shown interest to derive an aggregated query, which is then used to retrieve products to recommend to the target user. Experiments conducted on a real e-commerce dataset show that both proposed techniques, CFRRobin and CFAgQuery, perform better than the standard Collaborative Filtering and Basic Search approaches that are widely applied in current e-commerce applications.
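
The round-robin fusion idea can be illustrated with a short sketch: each neighbour's preferred product attributes act as a query, and the returned ranked lists are interleaved one item at a time. The `retrieve()` call and the query lists are hypothetical; the actual CFRRobin details are not specified in the abstract.

```python
# Sketch of round-robin merging of the ranked lists returned for each
# neighbour's attribute query.
from itertools import zip_longest

def round_robin_merge(ranked_lists, top_n=10):
    merged, seen = [], set()
    for tier in zip_longest(*ranked_lists):        # take one item per list in turn
        for item in tier:
            if item is not None and item not in seen:
                seen.add(item)
                merged.append(item)
            if len(merged) == top_n:
                return merged
    return merged

# Usage (hypothetical): lists retrieved with each neighbour's attribute query.
# recommendations = round_robin_merge([retrieve(q) for q in neighbour_queries])
```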

Relevance:

10.00%

Publisher:

Abstract:

Crashes on motorways contribute to a significant proportion (40-50%) of non-recurrent motorway congestion. Hence, reducing crashes will help address congestion issues (Meyer, 2008). Crash likelihood estimation studies commonly focus on traffic conditions in a short time window around the time of a crash, while longer-term pre-crash traffic flow trends are neglected. In this paper we will show, through data mining techniques, that a relationship between pre-crash traffic flow patterns and crash occurrence on motorways exists, and that this knowledge has the potential to improve the accuracy of existing models and opens the path for new development approaches. The data for the analysis were extracted from records collected between 2007 and 2009 on the Shibuya and Shinjuku lines of the Tokyo Metropolitan Expressway in Japan. The dataset includes a total of 824 rear-end and sideswipe crashes that have been matched with the traffic flow data of the hour prior to each crash using an incident detection algorithm. Traffic flow trends (traffic speed/occupancy time series) revealed that crashes can be clustered with regard to the dominant traffic flow pattern prior to the crash. The k-means clustering method allowed the crashes to be clustered based on their flow trends rather than their distance. Four major trends were found in the clustering results. Based on these findings, crash likelihood estimation algorithms can be fine-tuned to the monitored traffic flow conditions with a sliding window of 60 minutes, to increase the accuracy of the results and minimize false alarms.
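
A minimal sketch of the clustering step described above is shown below, assuming each crash is represented by a fixed-length pre-crash speed profile; the input array and file name are hypothetical, and k=4 mirrors the four dominant trends reported.

```python
# Illustrative sketch: clustering one-hour pre-crash speed profiles with k-means.
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical (n_crashes x n_timesteps) array, e.g. 12 five-minute speed
# readings preceding each crash.
speed_series = np.load("precrash_speeds.npy")

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(speed_series)
labels = km.labels_              # cluster id per crash
trends = km.cluster_centers_     # dominant pre-crash speed trends
```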

Relevance:

10.00%

Publisher:

Abstract:

Crashes that occur on motorways contribute to a significant proportion (40-50%) of non-recurrent motorway congestion. Hence, reducing the frequency of crashes assists in addressing congestion issues (Meyer, 2008). Crash likelihood estimation studies commonly focus on traffic conditions in a short time window around the time of a crash, while longer-term pre-crash traffic flow trends are neglected. In this paper we will show, through data mining techniques, that a relationship between pre-crash traffic flow patterns and crash occurrence on motorways exists. We will compare these patterns with normal traffic trends and show that this knowledge has the potential to improve the accuracy of existing models and opens the path for new development approaches. The data for the analysis were extracted from records collected between 2007 and 2009 on the Shibuya and Shinjuku lines of the Tokyo Metropolitan Expressway in Japan. The dataset includes a total of 824 rear-end and sideswipe crashes that have been matched with the corresponding traffic flow data using an incident detection algorithm. Traffic trends (traffic speed time series) revealed that crashes can be clustered with regard to the dominant traffic patterns prior to the crash. The k-means clustering method with a Euclidean distance function was used to cluster the crashes. Normal-situation data were then extracted based on the time distribution of crashes and clustered for comparison with the "high risk" clusters. Five major trends were found in the clustering results for both high-risk and normal conditions, and the identified traffic regimes differed in their speed trends. Based on these findings, crash likelihood estimation models can be fine-tuned to the monitored traffic conditions with a sliding window of 30 minutes to increase the accuracy of the results and minimize false alarms.
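
The "normal situation" comparison step can be sketched as follows: for each crash, pull a speed window from the same detector and time of day on an earlier, assumed crash-free day, then cluster both sets with the same k-means model. The column names and the `get_speed_window()` helper are hypothetical, not the study's actual matching procedure.

```python
# Sketch of extracting time-matched normal-condition speed windows.
import pandas as pd

def matched_normal_windows(crashes: pd.DataFrame, get_speed_window):
    normals = []
    for _, c in crashes.iterrows():
        baseline_day = c["date"] - pd.Timedelta(days=7)   # same weekday one week earlier (assumed crash-free)
        normals.append(get_speed_window(c["detector_id"], baseline_day, c["time"]))
    return normals
```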

Relevance:

10.00%

Publisher:

Abstract:

Evidence suggests that both nascent and young firms (henceforth: “new firms”), despite typically being small and resource-constrained, are sometimes able to innovate effectively. Such firms are seldom able to invest in lengthy and expensive development processes, which suggests that they may frequently rely instead on other pathways to generate innovativeness within the firm. In this paper, we develop and test arguments that “bricolage,” defined as making do by applying combinations of the resources at hand to new problems and opportunities, provides an important pathway to achieve innovation for new resource-constrained firms. Through bricolage, resource-constrained firms engage in the processes of “recombination” that are core to creating innovative outcomes. Based on a large longitudinal dataset, our results suggest that variations in the degree to which firms engage in bricolage behaviors can provide a broadly applicable explanation of innovativeness under resource constraints by new firms. We find no general support for our competing hypothesis that the positive effects may level off or even turn negative at high levels of bricolage.

Relevance:

10.00%

Publisher:

Abstract:

Based on regional-scale studies, aboveground production and litter decomposition are thought to positively covary, because they are driven by shared biotic and climatic factors. Until now we have been unable to test whether production and decomposition are generally coupled across climatically dissimilar regions, because we lacked replicated data collected within a single vegetation type across multiple regions, obfuscating the drivers and generality of the association between production and decomposition. Furthermore, our understanding of the relationships between production and decomposition rests heavily on separate meta-analyses of each response, because no studies have simultaneously measured production and the accumulation or decomposition of litter using consistent methods at globally relevant scales. Here, we use a multi-country grassland dataset collected using a standardized protocol to show that live plant biomass (an estimate of aboveground net primary production) and litter disappearance (represented by mass loss of aboveground litter) do not strongly covary. Live biomass and litter disappearance varied at different spatial scales. There was substantial variation in live biomass among continents, sites and plots, whereas among-continent differences accounted for most of the variation in litter disappearance rates. Although there were strong associations among aboveground biomass, litter disappearance and climatic factors in some regions (e.g. U.S. Great Plains), these relationships were inconsistent within and among the regions represented by this study. These results highlight the importance of replication among regions and continents when characterizing the correlations between ecosystem processes and interpreting their global-scale implications for carbon flux. We must exercise caution in parameterizing litter decomposition and aboveground production in future regional and global carbon models, as their relationship is complex.

Relevance:

10.00%

Publisher:

Abstract:

Speaker attribution is the task of annotating a spoken audio archive based on speaker identities. This can be achieved using speaker diarization and speaker linking. In our previous work, we proposed an efficient attribution system, using complete-linkage clustering, for conducting attribution of large sets of two-speaker telephone data. In this paper, we build on our proposed approach to achieve a robust system, applicable to multiple recording domains. To do this, we first extend the diarization module of our system to accommodate multi-speaker (>2) recordings. We achieve this through using a robust cross-likelihood ratio (CLR) threshold stopping criterion for clustering, as opposed to the original stopping criterion of two speakers used for telephone data. We evaluate this baseline diarization module across a dataset of Australian broadcast news recordings, showing a significant lack of diarization accuracy without previous knowledge of the true number of speakers within a recording. We thus propose applying an additional pass of complete-linkage clustering to the diarization module, demonstrating an absolute improvement of 20% in diarization error rate (DER). We then evaluate our proposed multi-domain attribution system across the broadcast news data, demonstrating achievable attribution error rates (AER) as low as 17%.
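
The threshold-based stopping criterion can be sketched with standard hierarchical clustering tools; this is only an illustration, assuming a precomputed condensed matrix of (negated) cross-likelihood-ratio distances between speaker segments, and the threshold value is not the one tuned in the paper.

```python
# Sketch of complete-linkage clustering with a distance-threshold stopping
# criterion, as used in place of a fixed two-speaker assumption.
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_segments(clr_dist, threshold):
    Z = linkage(clr_dist, method="complete")          # complete-linkage dendrogram
    # Stop merging once the complete-linkage CLR distance exceeds the threshold,
    # rather than stopping at a fixed number of speakers.
    return fcluster(Z, t=threshold, criterion="distance")
```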

Relevance:

10.00%

Publisher:

Abstract:

A "self-exciting" market is one in which the probability of observing a crash increases in response to the occurrence of a crash. It essentially describes cases where the initial crash serves to weaken the system to some extent, making subsequent crashes more likely. This thesis investigates if equity markets possess this property. A self-exciting extension of the well-known jump-based Bates (1996) model is used as the workhorse model for this thesis, and a particle-filtering algorithm is used to facilitate estimation by means of maximum likelihood. The estimation method is developed so that option prices are easily included in the dataset, leading to higher quality estimates. Equilibrium arguments are used to price the risks associated with the time-varying crash probability, and in turn to motivate a risk-neutral system for use in option pricing. The option pricing function for the model is obtained via the application of widely-used Fourier techniques. An application to S&P500 index returns and a panel of S&P500 index option prices reveals evidence of self excitation.

Relevance:

10.00%

Publisher:

Abstract:

Background: Several lines of evidence suggest that transcription factors are involved in the pathogenesis of Multiple Sclerosis (MS), but a complete mapping of the network has been elusive. One of the reasons is that there are several clinical subtypes of MS, and transcription factors that may be involved in one subtype may not be in others. We investigated the possibility that this network could be mapped using microarray technologies and modern bioinformatics methods on a dataset of whole blood from 99 untreated MS patients (36 Relapsing-Remitting MS, 43 Primary Progressive MS, and 20 Secondary Progressive MS) and 45 age-matched healthy controls. Methodology/Principal Findings: We used two different analytical methodologies, a differential expression analysis and a differential co-expression analysis, which converged on a significant number of regulatory motifs that are statistically overrepresented in genes that are differentially expressed (or differentially co-expressed) between cases and controls (e.g. V$KROX_Q6, p-value < 3.31E-6; V$CREBP1_Q2, p-value < 9.93E-6; V$YY1_02, p-value < 1.65E-5). Conclusions/Significance: Our analysis uncovered a network of transcription factors that potentially dysregulate several genes in MS or in one or more of its disease subtypes. Analysing the published literature, we found that these transcription factors are involved in early T-lymphocyte specification and commitment as well as in oligodendrocyte dedifferentiation and development. The most significant transcription factor motifs were those of the Early Growth Response EGR/KROX family, ATF2, YY1 (Yin Yang 1), the E2F-1/DP-1 and E2F-4/DP-2 heterodimers, SOX5, and the CREB and ATF families.
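
The paper's exact over-representation statistic is not stated in the abstract, but a common way to obtain a motif enrichment p-value of the kind quoted above is a one-sided hypergeometric test; the counts below are purely hypothetical.

```python
# Illustration of a one-sided hypergeometric over-representation test for a motif.
from scipy.stats import hypergeom

M = 20000   # genes on the array (hypothetical)
n = 1200    # genes carrying the motif (e.g. V$KROX_Q6) in their promoter (hypothetical)
N = 300     # differentially (co-)expressed genes (hypothetical)
k = 45      # motif-carrying genes among the differentially expressed set (hypothetical)

p_value = hypergeom.sf(k - 1, M, n, N)   # P(X >= k)
```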

Relevance:

10.00%

Publisher:

Abstract:

An experimental dataset representing a typical flow field in a stormwater gross pollutant trap (GPT) was visualised. A technique was developed to apply the image-based flow visualisation (IBFV) algorithm to the raw dataset. Particle image velocimetry (PIV) software was previously used to capture the flow field data by tracking neutrally buoyant particles with a high-speed camera. The dataset consisted of scattered 2D point velocity vectors, and the IBFV visualisation facilitates flow feature characterisation within the GPT. The flow features played a pivotal role in understanding stormwater pollutant capture and retention behaviour within the GPT. It was found that the IBFV animations revealed otherwise unnoticed flow features and experimental artefacts. For example, a circular tracer marker in the IBFV program visually highlighted streamlines to investigate the possible flow paths of pollutants entering the GPT. The investigated flow paths were compared with the behaviour of pollutants monitored during experiments.
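
The sketch below is not the IBFV algorithm itself, but a simple illustration of the preprocessing such a visualisation needs: interpolating the scattered 2D PIV velocity vectors onto a regular grid, then drawing streamlines as a basic alternative for highlighting possible pollutant flow paths. The input file and arrays are hypothetical.

```python
# Grid scattered PIV vectors and draw streamlines of the reconstructed field.
import numpy as np
from scipy.interpolate import griddata
import matplotlib.pyplot as plt

# Hypothetical file of point positions (x, y) and velocities (u, v).
x, y, u, v = np.loadtxt("piv_vectors.csv", delimiter=",", unpack=True)

xi = np.linspace(x.min(), x.max(), 200)
yi = np.linspace(y.min(), y.max(), 200)
X, Y = np.meshgrid(xi, yi)
U = np.nan_to_num(griddata((x, y), u, (X, Y), method="linear"))
V = np.nan_to_num(griddata((x, y), v, (X, Y), method="linear"))

plt.streamplot(X, Y, U, V, density=1.5)
plt.title("Streamlines reconstructed from scattered PIV vectors")
plt.show()
```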

Relevance:

10.00%

Publisher:

Abstract:

Purpose: Videokeratoscopy images can be used for the non-invasive assessment of the tear film. In this work the applicability of an image processing technique, textural analysis, for the assessment of the tear film in Placido disc images has been investigated. Methods: In the presence of tear film thinning/break-up, the reflected pattern from the videokeratoscope is disturbed in the region of tear film disruption. Thus, the Placido pattern carries information about the stability of the underlying tear film. By characterizing the pattern regularity, the tear film quality can be inferred. In this paper, a textural features approach is used to process the Placido images. This method provides a set of texture features from which an estimate of the tear film quality can be obtained. The method is tested for the detection of dry eye in a retrospective dataset from 34 subjects (22 normal and 12 dry eye), with measurements taken under suppressed blinking conditions. Results: To assess the capability of each texture feature to discriminate dry eye from normal subjects, the receiver operating characteristic (ROC) curve was calculated and the area under the curve (AUC), specificity and sensitivity were extracted. For the different features examined, the AUC value ranged from 0.77 to 0.82, while the sensitivity typically showed values above 0.9 and the specificity showed values around 0.6. Overall, the estimated ROCs indicate that the proposed technique provides good discrimination performance. Conclusions: Texture analysis of videokeratoscopy images is applicable to the study of tear film anomalies in dry eye subjects. The proposed technique appears to have demonstrated its clinical relevance and utility.
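
The paper's exact texture features are not specified in the abstract; the sketch below illustrates one common texture pipeline (grey-level co-occurrence matrix statistics) plus a per-feature ROC AUC check, with the image patches and labels left as hypothetical inputs.

```python
# Sketch of GLCM texture features and a per-feature AUC evaluation.
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.metrics import roc_auc_score

def glcm_features(roi_uint8):
    glcm = graycomatrix(roi_uint8, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    return [graycoprops(glcm, p).mean()
            for p in ("contrast", "homogeneity", "energy", "correlation")]

# rois: list of Placido-ring image patches; labels: 1 = dry eye, 0 = normal
# (both hypothetical here).
# X = np.array([glcm_features(r) for r in rois])
# auc = roc_auc_score(labels, X[:, 0])   # AUC for a single texture feature
```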

Relevance:

10.00%

Publisher:

Abstract:

Crashes that occur on motorways contribute to a significant proportion (40-50%) of non-recurrent motorway congestion. Hence, reducing the frequency of crashes assists in addressing congestion issues (Meyer, 2008). Analysing traffic conditions and discovering risky traffic trends and patterns are essential foundations of crash likelihood estimation studies and still require more attention and investigation. In this paper we will show, through data mining techniques, that there is a relationship between pre-crash traffic flow patterns and crash occurrence on motorways, compare these patterns with normal traffic trends, and argue that this knowledge has the potential to improve the accuracy of existing crash likelihood estimation models and opens the path for new development approaches. The data for the analysis were extracted from records collected between 2007 and 2009 on the Shibuya and Shinjuku lines of the Tokyo Metropolitan Expressway in Japan. The dataset includes a total of 824 rear-end and sideswipe crashes that have been matched with the corresponding traffic flow data using an incident detection algorithm. Traffic trends (traffic speed time series) revealed that crashes can be clustered with regard to the dominant traffic patterns prior to the crash. A k-means clustering algorithm was applied to determine the dominant pre-crash traffic patterns. In the first phase of this research, traffic regimes were identified by analysing crashes and normal traffic situations using half an hour of speed data at locations upstream of the crashes. The second phase then investigated different combinations of speed risk indicators to distinguish crashes from normal traffic situations more precisely. Five major trends were found in the first phase for both high-risk and normal conditions, and the identified traffic regimes differed in their speed trends. Moreover, the second phase shows that the spatiotemporal difference of speed is a better risk indicator than the other combinations of speed-related indicators examined. Based on these findings, crash likelihood estimation models can be fine-tuned to increase the accuracy of estimations and minimize false alarms.
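
The exact definition of the speed risk indicators is not given in the abstract; the sketch below shows one plausible set, including a spatiotemporal speed difference computed from upstream and downstream detector speeds. The inputs are hypothetical 1-D arrays of speeds sampled over the half-hour window.

```python
# Illustrative speed-based risk indicators for a pair of detectors.
import numpy as np

def speed_risk_indicators(upstream, downstream):
    temporal = np.diff(upstream)                     # temporal speed change at the upstream site
    spatial = upstream - downstream                  # spatial speed difference along the road
    spatiotemporal = np.diff(upstream - downstream)  # change of the spatial difference over time
    return temporal, spatial, spatiotemporal
```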

Relevance:

10.00%

Publisher:

Abstract:

Objective: To describe the epidemiology of acute lower respiratory infection (ALRI) and bronchiectasis in Northern Territory Indigenous infants hospitalised in the first year of life. Design: A historical cohort study constructed from the NT Hospital Discharge Dataset and the NT Immunisation Register. Participants and setting: All NT resident Indigenous infants, born 1 January 1999 to 31 December 2004, admitted to NT public hospitals and followed up to 12 months of age. Main outcome measures: Incidence of ALRI and bronchiectasis (ICD-10-AM codes) and radiologically confirmed pneumonia (World Health Organization protocol). Results: Data on 9295 infants, 8498 child-years of observation and 15 948 hospitalised episodes of care were analysed. ALRI incidence was 426.7 episodes per 1000 child-years (95% CI, 416.2-437.2). Incidence rates were two times higher (relative risk, 2.12; 95% CI, 1.98-2.27) for infants in Central Australia compared with those in the Top End. The median age at first admission for an ALRI was 4.6 months (interquartile range, 2.6-7.3). Bronchiolitis accounted for most of the disease burden, with a rate of 227 per 1000 child-years. The incidence of first diagnosis of bronchiectasis was 1.18 per 1000 child-years (95% CI, 0.60-2.16). One or more key comorbidities were present in 1445 of the 3227 (44.8%) episodes of care for ALRI. Conclusions: Rates of ALRI and bronchiectasis in NT Indigenous infants are excessive, with early onset, frequent repeat episodes, and a high prevalence of comorbidities. These high rates of disease demand urgent attention.

Relevance:

10.00%

Publisher:

Abstract:

Within the QUT Business School (QUTBS), researchers across economics, finance and accounting depend on data-driven research. They analyze historic and global financial data across a range of instruments to understand the relationships and effects between them as they respond to news and events in their region. Scholars and Higher Degree Research students in turn seek out universities which offer these particular datasets to further their research. This involves downloading and manipulating large datasets, often with a focus on depth of detail, frequency and long-tail historical data. This is stock exchange data and has potential commercial value; the license for access therefore tends to be very expensive. This poster reports the following findings:
• The library has a part to play in freeing up researchers from the burden of negotiating subscriptions, fundraising and managing the legal requirements around licensing and access.
• The role of the library is to communicate the nature and potential of these complex resources across the university to disciplines as diverse as Mathematics, Health, Information Systems and Creative Industries.
• The initiative has demonstrated clear, concrete support for research by QUT Library and built relationships into faculty. It has made data available to all researchers and attracted new HDRs. The aim is to reach the threshold of research outputs to submit into FOR Code 1502 (Banking, Finance and Investment) for ERA 2015.
• It is difficult to identify which subset of a dataset will be obtained, given the somewhat vague price tiers.
• The integrity of the data is variable, as it is limited by the way it is collected; this occasionally raises issues for researchers (Cook, Campbell, & Kelly, 2012).
• An improved library understanding of the content of our products and the nature of finance-based research is a necessary part of the service.