865 results for Representations.
Abstract:
Topic detection and tracking (TDT) is an area of information retrieval research focused on news events. The problems TDT deals with include segmenting news text into cohesive stories, detecting something new and previously unreported, tracking the development of a previously reported event, and grouping together news stories that discuss the same event. The performance of traditional information retrieval techniques based on full-text similarity has remained inadequate for online production systems; in particular, it has been difficult to distinguish between the same and merely similar events. In this work, we explore ways of representing and comparing news documents in order to detect new events and track their development. First, however, we put forward a conceptual analysis of the notions of topic and event. The purpose is to clarify the terminology and align it with the process of news-making and the tradition of story-telling. Second, we present a framework for document similarity that is based on semantic classes, i.e., groups of words with similar meaning. We adopt people, organizations, and locations as semantic classes in addition to general terms. As each semantic class can be assigned its own similarity measure, document similarity can make use of ontologies, e.g., geographical taxonomies. The documents are compared class-wise, and the outcome is a weighted combination of class-wise similarities. Third, we incorporate temporal information into document similarity. We formalize the natural-language temporal expressions occurring in the text and use them to anchor the rest of the terms onto the time-line. When comparing documents for event-based similarity, we look not only at matching terms but also at how near their anchors are on the time-line. Fourth, we experiment with an adaptive variant of the semantic class similarity system.
The news reflects changes in the real world, and in order to keep up, the system has to change its behavior based on the contents of the news stream. We put forward two strategies for rebuilding the topic representations and report experimental results. We ran experiments with three annotated TDT corpora. The use of semantic classes increased the effectiveness of topic tracking by 10–30%, depending on the experimental setup. The gain in spotting new events remained lower, around 3–4%. Anchoring the text to a time-line based on the temporal expressions gave a further 10% increase in the effectiveness of topic tracking. The gains in detecting new events, again, remained smaller. The adaptive systems did not improve the tracking results.
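The class-wise comparison with a weighted combination of per-class similarities can be sketched as follows. The documents, class names, and weights below are hypothetical, and cosine similarity stands in for whatever per-class measure (e.g. an ontology-based one) is actually used:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity of two sparse term-frequency vectors (dicts)."""
    num = sum(w * b[t] for t, w in a.items() if t in b)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def class_similarity(doc1, doc2, weights):
    """Weighted combination of class-wise similarities: each document maps a
    semantic class (general terms, people, locations, ...) to its own term
    vector, and each class could in principle use its own measure."""
    return sum(w * cosine(doc1.get(c, {}), doc2.get(c, {}))
               for c, w in weights.items())

# hypothetical mini-documents, already split into semantic classes
d1 = {"terms": {"fire": 2, "forest": 1}, "locations": {"Athens": 1}}
d2 = {"terms": {"fire": 1, "rescue": 1}, "locations": {"Athens": 1}}
sim = class_similarity(d1, d2, {"terms": 0.7, "locations": 0.3})
```

Because each class is compared separately, a perfect match in the locations class contributes its full weight even when the general-term vectors overlap only partially.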
Abstract:
Segmentation is a data mining technique yielding simplified representations of sequences of ordered points. A sequence is divided into a number of homogeneous segments, and all points within a segment are described by a single value. The focus in this thesis is on piecewise-constant segments, where the most likely description for each segment and the most likely segmentation into a given number of segments can be computed efficiently. Representing sequences as segmentations is useful in, e.g., storage and indexing tasks in sequence databases, and segmentation can be used as a tool in learning about the structure of a given sequence. The discussion in this thesis begins with basic questions related to segmentation analysis, such as choosing the number of segments and evaluating the obtained segmentations. Standard model selection techniques are shown to perform well for the sequence segmentation task. Segmentation evaluation is proposed with respect to a known segmentation structure: applying segmentation to certain features of a sequence is shown to yield segmentations that are significantly close to the known underlying structure. Two extensions to the basic segmentation framework are introduced: unimodal segmentation and basis segmentation. The former is concerned with segmentations where the segment descriptions first increase and then decrease, and the latter with the interplay between different dimensions and segments in the sequence. These problems are formally defined, and algorithms for solving them are provided and analyzed. Practical applications for segmentation techniques include time series and data stream analysis, text analysis, and biological sequence analysis. In this thesis, segmentation applications are demonstrated in analyzing genomic sequences.
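The efficient computation of an optimal piecewise-constant segmentation is classically done with dynamic programming. A minimal sketch for the squared-error case, given here as an illustration rather than the thesis's exact formulation:

```python
import numpy as np

def segment(seq, k):
    """Optimal piecewise-constant segmentation of `seq` into `k` segments,
    minimising total squared error; the classic O(n^2 k) dynamic program."""
    n = len(seq)
    prefix = np.concatenate([[0.0], np.cumsum(seq)])
    prefix2 = np.concatenate([[0.0], np.cumsum(np.square(seq, dtype=float))])

    def cost(i, j):
        # squared error of describing seq[i:j] by its mean
        s, s2, m = prefix[j] - prefix[i], prefix2[j] - prefix2[i], j - i
        return s2 - s * s / m

    # err[p][j]: best error for seq[:j] using p segments; cut[p][j]: last boundary
    err = [[np.inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    err[0][0] = 0.0
    for p in range(1, k + 1):
        for j in range(p, n + 1):
            for i in range(p - 1, j):
                c = err[p - 1][i] + cost(i, j)
                if c < err[p][j]:
                    err[p][j], cut[p][j] = c, i
    # backtrack the segment boundaries
    bounds, j = [], n
    for p in range(k, 0, -1):
        bounds.append(j)
        j = cut[p][j]
    return err[k][n], sorted(bounds)
```

For example, `segment([1, 1, 1, 5, 5, 5], 2)` recovers the obvious split after the third point with zero error.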
Abstract:
This study examines philosophically the main theories and methodological assumptions of the field known as the cognitive science of religion (CSR). The study makes a philosophically informed reconstruction of the methodological principles of the CSR, indicates problems with them, and examines possible solutions to these problems. The study focuses on several different CSR writers, namely, Scott Atran, Justin Barrett, Pascal Boyer and Dan Sperber. CSR theorising is done in the intersection between cognitive sciences, anthropology and evolutionary psychology. This multidisciplinary nature makes CSR a fertile ground for philosophical considerations coming from philosophy of psychology, philosophy of mind and philosophy of science. The study begins by spelling out the methodological assumptions and auxiliary theories of CSR writers by situating these theories and assumptions in the nexus of existing approaches to religion. The distinctive feature of CSR is its emphasis on information processing: CSR writers claim that contemporary cognitive sciences can inform anthropological theorising about the human mind and offer tools for producing causal explanations. Further, they claim to explain the prevalence and persistence of religion by cognitive systems that undergird religious thinking. I also examine the core theoretical contributions of the field, focusing mainly on (1) the “minimal counter-intuitiveness hypothesis” and (2) the different ways in which supernatural agent representations activate our cognitive systems. Generally speaking, CSR writers argue for the naturalness of religion: religious ideas and practices are widespread and pervasive because human cognition operates in such a way that religious ideas are easy to acquire and transmit. The study raises two philosophical problems, namely, the “problem of scope” and the “problem of religious relevance”.
The problem of scope is created by the insistence of several critics of the CSR that CSR explanations are mostly irrelevant for explaining religion. Most CSR writers themselves hold that cognitive explanations can answer most of our questions about religion. I argue that the problem of scope is created by differences in explanation-begging questions: the former group is interested in explaining different things than the latter group. I propose that we should not stick too rigidly to one set of methodological assumptions, but rather acknowledge that different assumptions might help us to answer different questions about religion. Instead of adhering to some robust metaphysics, as some strongly naturalistic writers argue, we should adopt a pragmatic and explanatorily pluralist approach, which would allow different kinds of methodological presuppositions in the study of religion provided that they attempt to answer different kinds of why-questions, since religion appears to be a multi-faceted phenomenon that spans a variety of fields of special sciences. The problem of religious relevance is created by the insistence of some writers that CSR theories show religious beliefs to be false or irrational, whereas others invoke CSR theories to defend certain religious ideas. The problem is interesting because it reveals the more general philosophical assumptions of those who make such interpretations. CSR theories can be (and have been) interpreted in terms of three different philosophical frameworks: strict naturalism, broad naturalism and theism. I argue that CSR theories can be interpreted inside all three frameworks without doing violence to the theories, and that these frameworks give different kinds of results regarding the religious relevance of CSR theories.
Abstract:
Health professionals, academics, social commentators and the media are increasingly sending the same message – Australian men are in crisis. This message has been supported by documented rises in alcoholism, violence, depression, suicide and crime amongst men in Australia. A major cause of this crisis, it can be argued, is an over-reliance on the out-dated and limited model of hegemonic masculinity that all men are encouraged to imitate in their own behaviour. This paper, as part of a larger study, explores representations of masculinity in selected works of contemporary Australian theatre in order to investigate the concept of hegemonic masculinity and any influence it may have on the perceived ‘crisis of masculinity’. Theatre is but one of the artistic modes that can be used to investigate masculinity and issues associated with identity. The Australia Council for the Arts recognises theatre, along with literature, dance, film, television, inter-arts, music and visual arts, as critical to the understanding and expression of Australian culture and identity. Theatre has been chosen in this instance because of the opportunities available to this study for direct access to specific theatre performances and creators and, also, because of the researcher’s experience, as a theatre director, with the dramatic arts. Through interviews with writers, directors and actors, combined with the analysis of scripts, academic writings, reviews, articles, programmes, play rehearsals and workshops, this research utilises theatre as a medium to explore masculinity in Australia.
Abstract:
This review of literacy research explores ways in which literacy has come to be understood as a problem about human populations. I describe connections between literacy education and the biopolitical government of population, especially the relationship between liberal forms of government and the administration of human freedom. The review takes into account ways in which literacy is implicated in the cultivation of civil society by attending to the interests, as well as to the conduct, of human subjects. I draw on research available in English from across the globe, which provides an overview of how literacy has been rethought and conceptualised through ethnographic, historical and classroom-based studies. I discuss claims made for literacy, the way that human populations have been made visible in relation to their literacy practices, and the social contexts of their use. The review informs research on representations of literacy as a tool for securing national interests.
Abstract:
In many instances we find it advantageous to display a quantum optical density matrix as a generalized statistical ensemble of coherent wave fields. The weight functions involved in these constructions turn out to belong to a family of distributions, not always smooth functions. In this paper we investigate this question anew and show how it is related to the problem of expanding an arbitrary state in terms of an overcomplete subfamily of the overcomplete set of coherent states. This provides a relatively transparent derivation of the optical equivalence theorem. An interesting by-product is the discovery of a new class of discrete diagonal representations.
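For reference, the diagonal ensemble in question is usually written in the standard Glauber–Sudarshan form (quoted here as the textbook statement, not as the paper's specific construction), with the optical equivalence theorem relating normally ordered expectation values to classical-looking integrals over the weight $P(\alpha)$, which is in general a distribution rather than a smooth function:

```latex
\rho = \int P(\alpha)\, \lvert\alpha\rangle\langle\alpha\rvert \, d^{2}\alpha,
\qquad
\bigl\langle\, {:}\,g(\hat a^{\dagger},\hat a)\,{:}\,\bigr\rangle
  = \int P(\alpha)\, g(\alpha^{*},\alpha)\, d^{2}\alpha .
```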
Abstract:
Self-tracking, the process of recording one's own behaviours, thoughts and feelings, is a popular approach to enhance one's self-knowledge. While dedicated self-tracking apps and devices support data collection, previous research highlights that the integration of data constitutes a barrier for users. In this study we investigated how members of the Quantified Self movement---early adopters of self-tracking tools---overcome these barriers. We conducted a qualitative analysis of 51 videos of Quantified Self presentations to explore intentions for collecting data, methods for integrating and representing data, and how intentions and methods shaped reflection. The findings highlight two different intentions---striving for self-improvement and curiosity in personal data---which shaped how these users integrated data, i.e. the effort required. Furthermore, we identified three methods for representing data---binary, structured and abstract---which influenced reflection. Binary representations supported reflection-in-action, whereas structured and abstract representations supported iterative processes of data collection, integration and reflection. For people tracking out of curiosity, this iterative engagement with personal data often became an end in itself, rather than a means to achieve a goal. We discuss how these findings contribute to our current understanding of self-tracking amongst Quantified Self members and beyond, and we conclude with directions for future work to support self-trackers with their aspirations.
Abstract:
A parametrization of the elements of the three-dimensional Lorentz group O(2, 1), suited to the use of a noncompact O(1, 1) basis in its unitary representations, is derived and used to set up the representation matrices for the entire group. The Plancherel formula for O(2, 1) is then expressed in this basis.
Abstract:
Methanogens are microbes belonging to the archaea, living in oxygen-free conditions, whose unique metabolism produces methane. In the atmosphere, methane is a potent greenhouse gas. Wetlands are among the largest natural sources of methane. Methane emissions from northern peatlands vary strongly between mires and even within a single mire, depending on, among other factors, season, mire type and vegetation. This dissertation investigated the microbiological background of this variation in methane emissions. The study examined the effects of mire type, season, ash fertilisation and peat depth on methanogen communities and methane production in three Finnish mires. Non-methanogenic archaea and bacteria were also studied, because they produce the substrates of methane production as part of anaerobic decomposition. The microbial communities were analysed with DNA- and RNA-based methods relying on the polymerase chain reaction (PCR). The marker genes were the mcrA gene, which is linked to methane production, and the archaeal and bacterial 16S ribosomal RNA genes. Methanogen communities and methane production differed considerably between an acidic, nutrient-poor Sphagnum bog and more nutrient-rich sedge fens. The bog harboured almost exclusively methanogens of the order Methanomicrobiales, which produce methane from hydrogen and carbon dioxide. The methanogen communities of the sedge fens were more diverse and also included acetate-utilising methanogens. Season had a significant effect on methane production: in winter, an unexpectedly high methane production potential and indications of active methanogens were detected, whereas the composition of the archaeal community varied only little. Ash fertilisation, which is used to promote tree growth on drained peatlands, had no substantial effect on methane production or its producers, although the communities of the drained mire changed with peat depth.

When different PCR methods were compared, three primer pairs targeting the mcrA gene detected largely the same methanogens in the drained peatland, but the relative abundances of the species depended on the primers used. The bacteria detected in the mires belonged to the phyla Deltaproteobacteria, Acidobacteria and Verrucomicrobia. In addition, non-methanogenic archaea belonging to groups 1.1c and 1.3 of the phylum Crenarchaeota were found. These findings on the occurrence of these groups in anoxic peat provide a starting point for investigating their possible interactions with methanogens. The results showed that the composition of the methanogen community reflects the chemical or vegetation gradients that affect methane production, such as mire type. Better knowledge of peatland methanogens and their physiology may help predict how environmental change will affect the methane emissions of peatlands.
Abstract:
Multi-document summarization, addressing the problem of information overload, has been widely utilized in various real-world applications. Most existing approaches adopt term-based document representations, which limit the performance of multi-document summarization systems. In this paper, we propose a novel pattern-based topic model (PBTMSum) for the task of multi-document summarization. By combining pattern mining techniques with LDA topic modelling, PBTMSum generates discriminative and semantically rich representations of topics and documents, so that the most representative and non-redundant sentences can be selected to form a succinct and informative summary. Extensive experiments are conducted on the Document Understanding Conference (DUC) 2007 data. The results demonstrate the effectiveness and efficiency of our proposed approach.
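A toy illustration of the pattern-based idea (not the PBTMSum model itself, which combines pattern mining with LDA topic modelling): score sentences by the frequent word pairs they cover and pick greedily, skipping sentences whose patterns are already covered, which is what keeps the summary non-redundant:

```python
from collections import Counter
from itertools import combinations

def summarise(sentences, k=2, min_support=2):
    """Toy pattern-based extractive summariser: mine word pairs occurring in
    at least `min_support` sentences, then greedily pick up to k sentences
    that cover the most not-yet-covered patterns."""
    tokenised = [set(s.lower().split()) for s in sentences]
    counts = Counter(p for toks in tokenised
                     for p in combinations(sorted(toks), 2))
    patterns = {p for p, c in counts.items() if c >= min_support}

    covered, summary = set(), []
    for _ in range(k):
        best, best_gain = None, 0
        for i, toks in enumerate(tokenised):
            if any(i == j for j, _ in summary):
                continue
            gain = sum(1 for p in patterns
                       if p not in covered and set(p) <= toks)
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:      # nothing adds new patterns: stop early
            break
        covered |= {p for p in patterns if set(p) <= tokenised[best]}
        summary.append((best, sentences[best]))
    return [s for _, s in sorted(summary)]
```

With two near-duplicate market sentences and one weather sentence, the summariser keeps only one market sentence, since the second adds no uncovered patterns.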
Abstract:
The idea of extracting knowledge in process mining is a descendant of data mining. Both mining disciplines emphasise data flow and relations among elements in the data. Unfortunately, challenges have been encountered when working with the data flow and relations. One challenge is that the representation of the data flow between a pair of elements or tasks is oversimplified, as it considers only one-to-one data-flow relations. In this paper, we discuss how the effectiveness of knowledge representation can be extended in both disciplines. To this end, we introduce a new representation of data flow and dependency formulation using a flow graph. The flow graph addresses the inability to represent other relation types, such as many-to-one and one-to-many relations. As an experiment, a new evaluation framework is applied to the Teleclaim process in order to show how this method can provide more precise results when compared with other representations.
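The fan-in and fan-out that a flow graph adds over pairwise relations can be sketched with a toy structure; the task names below are hypothetical, loosely echoing a claim-handling process:

```python
from collections import defaultdict

class FlowGraph:
    """Toy flow graph: edges carry observed flow counts between tasks.
    Unlike a pairwise (one-to-one) dependency matrix, a node can fan out
    to many successors and fan in from many predecessors, so one-to-many
    and many-to-one relations are represented directly."""

    def __init__(self):
        self.flow = defaultdict(int)

    def add(self, src, dst, count=1):
        self.flow[(src, dst)] += count

    def successors(self, task):     # one-to-many: all targets fed by `task`
        return {d for (s, d) in self.flow if s == task}

    def predecessors(self, task):   # many-to-one: all sources feeding `task`
        return {s for (s, d) in self.flow if d == task}

# hypothetical fragment: registration splits, assessment joins
g = FlowGraph()
g.add("register", "check_policy")
g.add("register", "check_damage")
g.add("check_policy", "assess")
g.add("check_damage", "assess")
```

Here `register` has a one-to-many relation to both checks, and `assess` has a many-to-one relation from them, neither of which a single pairwise edge can express.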
Abstract:
State-of-the-art image-set matching techniques typically implicitly model each image-set with a Gaussian distribution. Here, we propose to go beyond these representations and model image-sets as probability distribution functions (PDFs) using kernel density estimators. To compare and match image-sets, we exploit Csiszár f-divergences, which bear strong connections to the geodesic distance defined on the space of PDFs, i.e., the statistical manifold. Furthermore, we introduce valid positive definite kernels on the statistical manifold, which let us make use of more powerful classification schemes to match image-sets. Finally, we introduce a supervised dimensionality reduction technique that learns a latent space where f-divergences reflect the class labels of the data. Our experiments on diverse problems, such as video-based face recognition and dynamic texture classification, evidence the benefits of our approach over the state-of-the-art image-set matching methods.
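A one-dimensional stand-in for the core idea (the paper works with image-set features and general Csiszár f-divergences, not this toy setup): model each set with a kernel density estimator and compare sets via a numerically evaluated KL divergence, one member of the f-divergence family:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
set_a = rng.normal(0.0, 1.0, 200)   # hypothetical feature samples for set A
set_b = rng.normal(3.0, 1.0, 200)   # set B, clearly shifted away from A
set_c = rng.normal(0.1, 1.0, 200)   # set C, almost the same distribution as A

grid = np.linspace(-6.0, 9.0, 1000)

def kl(p_samples, q_samples):
    """Numerical KL divergence between kernel density estimates of two sets."""
    p = gaussian_kde(p_samples)(grid)
    q = gaussian_kde(q_samples)(grid)
    p, q = p / p.sum(), q / q.sum()          # normalise on the grid
    return float(np.sum(p * np.log((p + 1e-12) / (q + 1e-12))))

d_ab = kl(set_a, set_b)   # large: the distributions are far apart
d_ac = kl(set_a, set_c)   # small: the distributions nearly coincide
```

Sets drawn from nearby distributions yield a smaller divergence, which is exactly the property a distribution-level matcher exploits.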
Abstract:
Identifying unusual or anomalous patterns in an underlying dataset is an important but challenging task in many applications. The focus of the unsupervised anomaly detection literature has mostly been on vectorised data. However, many applications are more naturally described using higher-order tensor representations. Approaches that vectorise tensorial data can destroy the structural information encoded in the high-dimensional space and lead to the curse of dimensionality. In this paper we present the first unsupervised tensorial anomaly detection method, along with a randomised version of our method. Our anomaly detection method, the One-class Support Tensor Machine (1STM), is a generalisation of conventional one-class Support Vector Machines to higher-order spaces. 1STM preserves the multiway structure of tensor data, while achieving significant improvement in accuracy and efficiency over conventional vectorised methods. We then leverage the theory of nonlinear random projections to propose the Randomised 1STM (R1STM). Our empirical analysis on several real and synthetic datasets shows that the R1STM algorithm delivers accuracy comparable to or better than a state-of-the-art deep learning method and traditional kernelised approaches for anomaly detection, while being approximately 100 times faster in training and testing.
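For orientation, the conventional vectorised baseline that 1STM generalises can be run with an off-the-shelf one-class SVM. The toy tensors below are hypothetical, and the flattening step is precisely the vectorisation that destroys the multiway structure the paper's method preserves:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# hypothetical tensor data: 100 "normal" 4x4 tensors plus 5 distant anomalies
normal = rng.normal(0.0, 1.0, size=(100, 4, 4))
anomalies = rng.normal(8.0, 1.0, size=(5, 4, 4))

# flattening each tensor into a vector: the lossy step tensor methods avoid
X_train = normal.reshape(len(normal), -1)
X_test = np.concatenate([normal[:5], anomalies]).reshape(10, -1)

clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(X_train)
pred = clf.predict(X_test)   # +1 = inlier, -1 = anomaly
```

On this well-separated toy data the baseline flags the shifted tensors as anomalies; the paper's point is that on real tensor data the flattening itself costs accuracy and efficiency.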
Abstract:
While many measures of viewpoint goodness have been proposed in computer graphics, none have been evaluated for ribbon representations of protein secondary structure. To fill this gap, we conducted a user study on Amazon’s Mechanical Turk platform, collecting human viewpoint preferences from 65 participants for 4 representative superfamilies of protein domains. In particular, we evaluated viewpoint entropy, which was previously shown to be a good predictor for human viewpoint preference of other, mostly non-abstract objects. In a second study, we asked 7 molecular biology experts to find the best viewpoint of the same protein domains and compared their choices with viewpoint entropy. Our results show that viewpoint entropy overall is a significant predictor of human viewpoint preference for ribbon representations of protein secondary structure. However, the accuracy is highly dependent on the complexity of the structure: while most participants agree on good viewpoints for small, non-globular structures with few secondary structure elements, viewpoint preference varies considerably for complex structures. Finally, experts tend to choose viewpoints of both low and high viewpoint entropy to emphasize different aspects of the respective structure.
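Viewpoint entropy, in its commonly used form (written here for reference, not quoted from the paper), measures how evenly the projected areas of the visible faces are distributed in a view:

```latex
H(V) = -\sum_{i=0}^{N_f} \frac{A_i}{S}\,\log_2\frac{A_i}{S},
```

where \(A_i\) is the projected area of face \(i\) in view \(V\) (with \(A_0\) the background area), \(N_f\) the number of faces, and \(S\) the total area, so higher entropy corresponds to views that reveal more, and more evenly distributed, geometry.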