936 results for textual similarity
Abstract:
Applications of the axisymmetric Boussinesq equation to groundwater hydrology and reservoir engineering have long been recognised. An archetypal example is invasion by drilling fluid into a permeable bed where there is initially no such fluid present, a circumstance of some importance in the oil industry. It is well known that the governing Boussinesq model can be reduced to a nonlinear ordinary differential equation using a similarity variable, a transformation that is valid for a certain time-dependent flux at the origin. Here, a new analytical approximation is obtained for this case. The new solution, which has a simple form, is demonstrated to be highly accurate. (c) 2005 Elsevier Ltd. All rights reserved.
Abstract:
With the rapid increase in both centralized video archives and distributed WWW video resources, content-based video retrieval is gaining importance. To support such applications efficiently, content-based video indexing must be addressed. Typically, each video is represented by a sequence of frames. Due to the high dimensionality of frame representation and the large number of frames, video indexing introduces an additional degree of complexity. In this paper, we address the problem of content-based video indexing and propose an efficient solution, called the Ordered VA-File (OVA-File), based on the VA-file. OVA-File is a hierarchical structure and has two novel features: 1) partitioning the whole file into slices such that only a small number of slices are accessed and checked during k Nearest Neighbor (kNN) search and 2) efficient handling of insertions of new vectors into the OVA-File, such that the average distance between the new vectors and those approximations near that position is minimized. To facilitate a search, we present an efficient approximate kNN algorithm named Ordered VA-LOW (OVA-LOW) based on the proposed OVA-File. OVA-LOW first chooses possible OVA-Slices by ranking the distances between their corresponding centers and the query vector, and then visits all approximations in the selected OVA-Slices to compute the approximate kNN. The number of possible OVA-Slices is controlled by a user-defined parameter delta. By adjusting delta, OVA-LOW provides a trade-off between the query cost and the result quality. Query by video clip consisting of multiple frames is also discussed. Extensive experimental studies using real video data sets were conducted and the results showed that our methods can yield a significant speed-up over an existing VA-file-based method and iDistance with high query result quality. Furthermore, by incorporating temporal correlation of video content, our methods achieved much more efficient performance.
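The slice-ranking idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' OVA-LOW implementation: it assumes slices are plain lists of raw vectors (the paper works on VA-file approximations), uses each slice's centroid as its center, and uses Euclidean distance. The names `approximate_knn`, `centroid` and `euclidean` and the sample data are hypothetical.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def approximate_knn(slices, query, k, delta):
    """Rank slices by center-to-query distance, then search only the top `delta`."""
    ranked = sorted(slices, key=lambda s: euclidean(centroid(s), query))
    # Only vectors inside the selected slices are candidates (assumption:
    # raw vectors stand in for the paper's VA approximations).
    candidates = [v for s in ranked[:delta] for v in s]
    return heapq.nsmallest(k, candidates, key=lambda v: euclidean(v, query))


# Hypothetical toy data: three slices of 2-D vectors.
slices = [
    [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1)],
    [(5.0, 5.0), (5.1, 4.9)],
    [(0.4, 0.4), (9.0, 9.0)],
]
query = (0.3, 0.3)

cheap = approximate_knn(slices, query, k=2, delta=1)  # visits one slice
exact = approximate_knn(slices, query, k=2, delta=3)  # visits every slice
```

With `delta=1` only the closest slice is scanned, so the true nearest neighbour `(0.4, 0.4)`, which sits in a slice with a distant center, is missed; with `delta=3` every slice is scanned and the result is exact. This is the cost/quality trade-off that delta controls.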
Abstract:
One way to achieve the large sample sizes required for genetic studies of complex traits is to combine samples collected by different groups. It is not often clear, however, whether this practice is reasonable from a genetic perspective. To assess the comparability of samples from the Australian and the Netherlands twin studies, we estimated F_ST (the proportion of total genetic variability attributable to genetic differences between cohorts) based on 359 short tandem repeat polymorphisms in 1068 individuals. F_ST was estimated to be 0.30% between the Australian and the Netherlands cohorts, a smaller value than between many European groups. We conclude that it is reasonable to combine the Australian and the Netherlands samples for joint genetic analyses.
Abstract:
Humans play a role in deciding the fate of species in the current extinction wave. Because of the previous Similarity Principle, physical attractiveness and likeability, it has been argued that public choice favours the survival of species that satisfy these criteria at the expense of other species. This paper empirically tests this argument by considering a hypothetical ‘Ark’ situation. Surveys of 204 members of the Australian public inquired whether they are in favour of the survival of each of 24 native mammal, bird and reptile species (prior to and after information provision about each species). The species were ranked by percentage of ‘yes’ votes received. Species composition by taxon in various fractions of the ranking was determined. If the previous Similarity Principle holds, mammals should rank highly and dominate the top fractions of animals saved in the hierarchical list. We find that although mammals would be over-represented in the ‘Ark’, birds and reptiles are unlikely to be excluded when social choice is based on numbers ‘voting’ for the survival of each species. Support for the previous Similarity Principle is apparent particularly after information provision. Public policy implications of this are noted and recommendations are given.
Abstract:
Because faces and bodies share some abstract perceptual features, we hypothesised that similar recognition processes might be used for both. We investigated whether similar caricature effects to those found in facial identity and expression recognition could be found in the recognition of individual bodies and socially meaningful body positions. Participants were trained to name four body positions (anger, fear, disgust, sadness) and four individuals (in a neutral position). We then tested their recognition of extremely caricatured, moderately caricatured, anti-caricatured, and undistorted images of each stimulus. Consistent with caricature effects found in face recognition, moderately caricatured representations of individuals' bodies were recognised more accurately than undistorted and extremely caricatured representations. No significant difference was found between participants' recognition of extremely caricatured, moderately caricatured, or undistorted body position line-drawings. All anti-caricatured representations were named significantly less accurately than the veridical stimuli. Similar mental representations may be used for both bodies and faces.
Abstract:
Music similarity query based on acoustic content is becoming important with the ever-increasing growth of music information from emerging applications such as digital libraries and the WWW. However, the relevant techniques are still in their infancy and far from satisfactory. In this paper, we present a novel index structure, called the Composite Feature tree (CF-tree), to facilitate efficient content-based music search adopting multiple musical features. Before constructing the tree structure, we use PCA to transform the extracted features into a new space sorted by the importance of acoustic features. The CF-tree is a balanced multi-way tree structure where each level represents the data space at different dimensionalities. The PCA-transformed data and the reduced dimensions in the upper levels alleviate the curse of dimensionality. To accurately mimic human perception, an extension, named CF+-tree, is proposed, which further applies multivariable regression to determine the weight of each individual feature. We conduct extensive experiments to evaluate the proposed structures against state-of-the-art techniques. The experimental results demonstrate the superiority of our technique.
Abstract:
Jaccard has been the choice similarity metric in ecology and forensic psychology for comparison of sites or offences, by species or behaviour. This paper applies a more powerful hierarchical measure - taxonomic similarity (s), recently developed in marine ecology - to the task of behaviourally linking serial crime. Forensic case linkage attempts to identify behaviourally similar offences committed by the same unknown perpetrator (called linked offences). s considers progressively higher-level taxa, such that two sites show some similarity even without shared species. We apply this index by analysing 55 specific offence behaviours classified hierarchically. The behaviours are taken from 16 sexual offences by seven juveniles where each offender committed two or more offences. We demonstrate that both Jaccard and s show linked offences to be significantly more similar than unlinked offences. With up to 20% of the specific behaviours removed in simulations, s is equally or more effective at distinguishing linked offences than where Jaccard uses a full data set. Moreover, s retains significant difference between linked and unlinked pairs, with up to 50% of the specific behaviours removed. As police decision-making often depends upon incomplete data, s has clear advantages and its application may extend to other crime types. Copyright © 2007 John Wiley & Sons, Ltd.
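As a point of reference for the baseline metric named in this abstract, the Jaccard coefficient of two sets is the size of their intersection divided by the size of their union. The sketch below shows only that baseline, not the hierarchical taxonomic similarity measure the paper applies. The function name `jaccard`, the behaviour codes and the convention of returning 0.0 for two empty sets are assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two collections of labels."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty sets
    return len(a & b) / len(union)


# Hypothetical behaviour codes recorded for three offences.
offence_1 = {"A1", "A2", "B1"}
offence_2 = {"A1", "A2", "C1"}
offence_3 = {"D1"}

shared = jaccard(offence_1, offence_2)    # two of four distinct codes shared
disjoint = jaccard(offence_1, offence_3)  # no codes shared
```

Here `shared` is 0.5 and `disjoint` is 0.0. The second case is the limitation the abstract points to: Jaccard assigns zero similarity whenever nothing is shared at the most specific level, whereas a hierarchical measure can still register similarity at higher levels of the classification.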
Abstract:
There is evidence for both advantages and disadvantages in normal recognition of living over nonliving things. This paradox has been attributed to high levels of perceptual similarity within living categories having a different effect on performance in different contexts. However, since living things are intrinsically more similar to each other, previous studies could not determine whether the various category effects were due to perceptual similarity, or to other characteristics of living things. We used novel animal and vehicle stimuli that were matched for similarity to measure the influence of perceptual similarity in different contexts. We found that displaying highly similar objects in blocked sets reduced their perceived similarity, eliminating the detrimental effect on naming performance. Experiment 1 demonstrated a disadvantage for highly similar objects in name learning and name verification using mixed groups of similar and dissimilar animals and vehicles. Experiment 2 demonstrated no disadvantage for the same highly similar objects when they were blocked, e.g., similar animals presented alone. Thus, perceptual similarity, rather than other characteristics particular to living things, is affected by context, and could create apparent category effects under certain testing conditions.
Abstract:
The present work studies the overall structuring of radio news discourse by investigating three metatextual/interactive functions: (1) Discourse Organizing Elements (DOEs), (2) Attribution and (3) Sentential and Nominal Background Information (SBI & NBI). An extended corpus of about 73,000 words from BBC and Radio Damascus news is used to study DOEs and a restricted corpus of 38,000 words for Attribution and SBI & NBI. A situational approach is adopted to assess the influence of factors such as medium and audience on these functions and their frequency. It is found that: (1) DOEs are organizational and their frequency is determined by length of text; (2) Attribution functions in accordance with the editor's strategy and its frequency is audience-sensitive; and (3) BI provides background information and is determined by audience and news topics. Secondly, the salient grammatical elements in DOEs are discourse deictic demonstratives, address pronouns and nouns referring to 'the news'. Attribution is realized in reporting/reported clauses, and BI in a sentence, a clause or a nominal group. Thirdly, DOEs establish a hierarchy of (1) news, (2) summary/expansion and (3) item, including topic introduction and details. While Attribution is generally, and SBI solely, a function of detailing, NBI and proper names are generally a function of summary and topic introduction. Being primarily addressed to audience and referring metatextually, the functions investigated support Sinclair's interactive and autonomous planes of discourse. They also shed light on the part(s) of the linguistic system which realize the metatextual/interactive function. Strictly, 'discourse structure' inevitably involves a rank-scale; but news discourse also shows a convention of item 'listing'. Hence only within the boundary of variety (ultimately interpreted across language and in its situation) can textual functions and discourse structure be studied.
Finally, interlingual variety study provides invaluable insights into a level of translation that goes beyond matching grammatical systems or situational factors, an interpretive level which has to be described in linguistic analysis of translation data.
Abstract:
The aim of this research project is to compare published history textbooks written for upper-secondary/tertiary study in the U.S. and Spain using Halliday's (1994) Theme/Rheme construct. The motivation for using the Theme/Rheme construct to analyze professional texts in the two languages is two-fold. First of all, while there exists a multitude of studies at the grammatical and phonological levels between the two languages, very little analysis has been carried out in comparison at the level of text, beyond that of comparing L1/L2 student writing. Secondly, thematic considerations allow the analyst to highlight areas of textual organization in a systematic way for purposes of comparison. The basic hypothesis tested here rests on the premise that similarity in the social function of the texts results in similar Theme choice and thematic patterning across languages, barring certain linguistic constraints. The corpus for this study consists of 20 texts: 10 from various history textbooks published in the U.S. and 10 from various history textbooks published in Spain. The texts chosen represent a variety of authors, in order to control for author style or preference. Three overall areas of analysis were carried out, representing Halliday's (1994) three metafunctions: the ideational, the interpersonal and the textual. The ideational analysis shows similarities across the two corpora in terms of participant roles and circumstances as Theme, with a slight difference in participants involved in material processes, which is shown to reflect a minor difference in the construal of the field of history in the two cultures. The textual analysis shows overall similarities with respect to text organization, and the interpersonal analysis shows overall similarities as regards the downplay of discrepant interpretations of historical events as well as a low frequency of interactive textual features, manifesting the informational focus of the texts. 
At the same time, differences in results amongst texts within each of the corpora demonstrate a possible effect of subject matter in many cases, and of individual author style in others. Overall, the results confirm that similarity in content, but above all in purpose and audience, results in texts which show similarities in textual features, setting aside certain grammatical constraints.
Abstract:
Modelling class B G-protein-coupled receptors (GPCRs) using class A GPCR structural templates is difficult due to lack of homology. The plant GPCR, GCR1, has homology to both class A and class B GPCRs. We have used this to generate a class A-class B alignment, and by incorporating maximum lagged correlation of entropy and hydrophobicity into a consensus score, we have been able to align receptor transmembrane regions. We have applied this analysis to generate active and inactive homology models of the class B calcitonin gene-related peptide (CGRP) receptor, and have supported it with site-directed mutagenesis data using 122 CGRP receptor residues and 144 published mutagenesis results on other class B GPCRs. The variation of sequence variability with structure, the analysis of polarity violations, the alignment of group-conserved residues and the mutagenesis results at 27 key positions were particularly informative in distinguishing between the proposed and plausible alternative alignments. Furthermore, we have been able to associate the key molecular features of the class B GPCR signalling machinery with their class A counterparts for the first time. These include the [K/R]KLH motif in intracellular loop 1, [I/L]xxxL and KxxK at the intracellular end of TM5 and TM6, the NPXXY/VAVLY motif on TM7 and small group-conserved residues in TM1, TM2, TM3 and TM7. The equivalent of the class A DRY motif is proposed to involve Arg(2.39), His(2.43) and Glu(3.46), which makes a polar lock with T(6.37). These alignments and models provide useful tools for understanding class B GPCR function.