Biblioteca Digital

209 resultados para Archivos LOG

Analyzing web multimedia query reformulation behavior

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Current multimedia Web search engines still use keywords as the primary means to search. Due to the richness in multimedia contents, general users constantly experience some difficulties in formulating textual queries that are representative enough for their needs. As a result, query reformulation becomes part of an inevitable process in most multimedia searches. Previous Web query formulation studies did not investigate the modification sequences and thus can only report limited findings on the reformulation behavior. In this study, we propose an automatic approach to examine multimedia query reformulation using large-scale transaction logs. The key findings show that search term replacement is the most dominant type of modifications in visual searches but less important in audio searches. Image search users prefer the specified search strategy more than video and audio users. There is also a clear tendency to replace terms with synonyms or associated terms in visual queries. The analysis of the search strategies in different types of multimedia searching provides some insights into user’s searching behavior, which can contribute to the design of future query formulation assistance for keyword-based Web multimedia retrieval systems.

A study and comparison of multimedia Web searching : 1997-2006

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Searching for multimedia is an important activity for users of Web search engines. Studying user's interactions with Web search engine multimedia buttons, including image, audio, and video, is important for the development of multimedia Web search systems. This article provides results from a Weblog analysis study of multimedia Web searching by Dogpile users in 2006. The study analyzes the (a) duration, size, and structure of Web search queries and sessions; (b) user demographics; (c) most popular multimedia Web searching terms; and (d) use of advanced Web search techniques including Boolean and natural language. The current study findings are compared with results from previous multimedia Web searching studies. The key findings are: (a) Since 1997, image search consistently is the dominant media type searched followed by audio and video; (b) multimedia search duration is still short (>50% of searching episodes are <1 min), using few search terms; (c) many multimedia searches are for information about people, especially in audio search; and (d) multimedia search has begun to shift from entertainment to other categories such as medical, sports, and technology (based on the most repeated terms). Implications for design of Web multimedia search engines are discussed.

From Analysis to Estimation of User Behavior

Relevância:

10.00% 10.00%

Publicador:

Machine Learning Approach to Search Query Classification

Relevância:

10.00% 10.00%

Publicador:

Topic Analysis and Identification of Queries

Relevância:

10.00% 10.00%

Publicador:

Determining the informational, navigational and transactional intent of web queries

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper, we define and present a comprehensive classification of user intent for Web searching. The classification consists of three hierarchical levels of informational, navigational, and transactional intent. After deriving attributes of each, we then developed a software application that automatically classified queries using a Web search engine log of over a million and a half queries submitted by several hundred thousand users. Our findings show that more than 80% of Web queries are informational in nature, with about 10% each being navigational and transactional. In order to validate the accuracy of our algorithm, we manually coded 400 queries and compared the results from this manual classification to the results determined by the automated method. This comparison showed that the automatic classification has an accuracy of 74%. Of the remaining 25% of the queries, the user intent is vague or multi-faceted, pointing to the need for probabilistic classification. We discuss how search engines can use knowledge of user intent to provide more targeted and relevant results in Web searching.

Reconstruction of falsified computer logs for digital forensics investigations

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Digital forensics investigations aim to find evidence that helps confirm or disprove a hypothesis about an alleged computer-based crime. However, the ease with which computer-literate criminals can falsify computer event logs makes the prosecutor's job highly challenging. Given a log which is suspected to have been falsified or tampered with, a prosecutor is obliged to provide a convincing explanation for how the log may have been created. Here we focus on showing how a suspect computer event log can be transformed into a hypothesised actual sequence of events, consistent with independent, trusted sources of event orderings. We present two algorithms which allow the effort involved in falsifying logs to be quantified, as a function of the number of `moves' required to transform the suspect log into the hypothesised one, thus allowing a prosecutor to assess the likelihood of a particular falsification scenario. The first algorithm always produces an optimal solution but, for reasons of efficiency, is suitable for short event logs only. To deal with the massive amount of data typically found in computer event logs, we also present a second heuristic algorithm which is considerably more efficient but may not always generate an optimal outcome.

The impact of delayed blood centrifuging, choice of collection tube, and type of assay on 25-hydroxyvitamin D concentrations

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Studies have examined the associations between cancers and circulating 25-hydroxyvitamin D [25(OH)D], but little is known about the impact of different laboratory practices on 25(OH)D concentrations. We examined the potential impact of delayed blood centrifuging, choice of collection tube, and type of assay on 25(OH)D concentrations. Blood samples from 20 healthy volunteers underwent alternative laboratory procedures: four centrifuging times (2, 24, 72, and 96 h after blood draw); three types of collection tubes (red top serum tube, two different plasma anticoagulant tubes containing heparin or EDTA); and two types of assays (DiaSorin radioimmunoassay [RIA] and chemiluminescence immunoassay [CLIA/LIAISON®]). Log-transformed 25(OH)D concentrations were analyzed using the generalized estimating equations (GEE) linear regression models. We found no difference in 25(OH)D concentrations by centrifuging times or type of assay. There was some indication of a difference in 25(OH)D concentrations by tube type in CLIA/LIAISON®-assayed samples, with concentrations in heparinized plasma (geometric mean, 16.1 ng ml−1) higher than those in serum (geometric mean, 15.3 ng ml−1) (p = 0.01), but the difference was significant only after substantial centrifuging delays (96 h). Our study suggests no necessity for requiring immediate processing of blood samples after collection or for the choice of a tube type or assay.

Critical social theory and transformative learning : evidence in pre-service teachers' service-learning reflection logs

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper reports on the opportunities for transformational learning experienced by a group of pre-service teachers who were engaged in service-learning as a pedagogical process with a focus on reflection. Critical social theory informed the design of the reflection process as it enabled a move away from knowledge transmission toward knowledge transformation. The structured reflection log was designed to illustrate the critical social theory expectations of quality learning that teach students to think critically: ideology critique and utopian critique. Butin's lenses and a reflection framework informed by the work of Bain, Ballantyne, Mills and Lester were used in the design of the service-learning reflection log. Reported data provide evidence of transformational learning and highlight how the students critique their world and imagine how they could contribute to a better world in their work as a beginning teacher.

Fraud detection in ERP systems using scenario matching

Relevância:

10.00% 10.00%

Publicador:

Resumo:

ERP systems generally implement controls to prevent certain common kinds of fraud. In addition however, there is an imperative need for detection of more sophisticated patterns of fraudulent activity as evidenced by the legal requirement for company audits and the common incidence of fraud. This paper describes the design and implementation of a framework for detecting patterns of fraudulent activity in ERP systems. We include the description of six fraud scenarios and the process of specifying and detecting the occurrence of those scenarios in ERP user log data using the prototype software which we have developed. The test results for detecting these scenarios in log data have been verified and confirm the success of our approach which can be generalized to ERP systems in general.

Exploring visual features through Gabor representations for facial expression detection

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Gabor representations have been widely used in facial analysis (face recognition, face detection and facial expression detection) due to their biological relevance and computational properties. Two popular Gabor representations used in literature are: 1) Log-Gabor and 2) Gabor energy filters. Even though these representations are somewhat similar, they also have distinct differences as the Log-Gabor filters mimic the simple cells in the visual cortex while the Gabor energy filters emulate the complex cells, which causes subtle differences in the responses. In this paper, we analyze the difference between these two Gabor representations and quantify these differences on the task of facial action unit (AU) detection. In our experiments conducted on the Cohn-Kanade dataset, we report an average area underneath the ROC curve (A`) of 92.60% across 17 AUs for the Gabor energy filters, while the Log-Gabor representation achieved an average A` of 96.11%. This result suggests that small spatial differences that the Log-Gabor filters pick up on are more useful for AU detection than the differences in contours and edges that the Gabor energy filters extract.

Towards developing an integrated model of information behaviour

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper presents the results from a study of information behaviors in the context of people's everyday lives undertaken in order to develop an integrated model of information behavior (IB). 34 participants from across 6 countries maintained a daily information journal or diary – mainly through a secure web log – for two weeks, to an aggregate of 468 participant days over five months. The text-rich diary data was analyzed using a multi-method qualitative-quantitative analysis in the following order: Grounded Theory analysis with manual coding, automated concept analysis using thesaurus-based visualization, and finally a statistical analysis of the coding data. The findings indicate that people engage in several information behaviors simultaneously throughout their everyday lives (including home and work life) and that sense-making is entangled in all aspects of them. Participants engaged in many of the information behaviors in a parallel, distributed, and concurrent fashion: many information behaviors for one information problem, one information behavior across many information problems, and many information behaviors concurrently across many information problems. Findings indicate also that information avoidance – both active and passive avoidance – is a common phenomenon and that information organizing behaviors or the lack thereof caused the most problems for participants. An integrated model of information behaviors is presented based on the findings.

Optimising figure of merit for phonetic spoken term detection

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper introduces a novel technique to directly optimise the Figure of Merit (FOM) for phonetic spoken term detection. The FOM is a popular measure of sTD accuracy, making it an ideal candiate for use as an objective function. A simple linear model is introduced to transform the phone log-posterior probabilities output by a phe classifier to produce enhanced log-posterior features that are more suitable for the STD task. Direct optimisation of the FOM is then performed by training the parameters of this model using a non-linear gradient descent algorithm. Substantial FOM improvements of 11% relative are achieved on held-out evaluation data, demonstrating the generalisability of the approach.

Description and evaluation of a Newton-based electronic appetite rating system for temporal tracking of appetite in human subjects

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This study assessed the reliability and validity of a palm-top-based electronic appetite rating system (EARS) in relation to the traditional paper and pen method. Twenty healthy subjects [10 male (M) and 10 female (F)] — mean age M=31 years (S.D.=8), F=27 years (S.D.=5); mean BMI M=24 (S.D.=2), F=21 (S.D.=5) — participated in a 4-day protocol. Measurements were made on days 1 and 4. Subjects were given paper and an EARS to log hourly subjective motivation to eat during waking hours. Food intake and meal times were fixed. Subjects were given a maintenance diet (comprising 40% fat, 47% carbohydrate and 13% protein by energy) calculated at 1.6×Resting Metabolic Rate (RMR), as three isoenergetic meals. Bland and Altman's test for bias between two measurement techniques found significant differences between EARS and paper and pen for two of eight responses (hunger and fullness). Regression analysis confirmed that there were no day, sex or order effects between ratings obtained using either technique. For 15 subjects, there was no significant difference between results, with a linear relationship between the two methods that explained most of the variance (r2 ranged from 62.6 to 98.6). The slope for all subjects was less than 1, which was partly explained by a tendency for bias at the extreme end of results on the EARS technique. These data suggest that the EARS is a useful and reliable technique for real-time data collection in appetite research but that it should not be used interchangeably with paper and pen techniques.

Hierarchical models for 2D presence/absence data having ambiguous zeroes: With a biogeographical case study on dingo behaviour

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov Random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers is suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy. This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (N Cs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the Integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter Autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extensions allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.

«
1
2
3
4
5
6
7
8
...
13
14
»