871 results for Pattern mining, Information filtering, User profile, Threshold


Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper deals with the classification of news items in ePaper, a prototype system of a future personalized newspaper service on a mobile reading device. The ePaper system aggregates news items from various news providers and delivers to each subscribed user (reader) a personalized electronic newspaper, utilizing content-based and collaborative filtering methods. The ePaper can also provide users with "standard" (i.e., not personalized) editions of selected newspapers, as well as browsing capabilities in the repository of news items. This paper concentrates on the automatic classification of incoming news using a hierarchical news ontology. Based on this classification on the one hand, and on the users' profiles on the other, the personalization engine of the system is able to provide a personalized paper to each user on her mobile reading device.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Mobile messaging is an integral and vital part of the mobile industry and contributes significantly to worldwide total mobile service revenues. In today’s competitive world, differentiation is a significant factor in the success of business communication. SMS (Short Message Service) provides a powerful vehicle for service differentiation. What is missing, however, is the availability of personalized SMS messages. In particular, exploiting user profile information allows content to be selected and delivered in a way that meets the preferences and interests of the individual. Personalization of mobile messages is important in today’s service-oriented society and has proven to be crucial for the acceptance of services provided by mobile telecommunication networks. In this paper we focus on user profile description and on the mechanism for delivering relevant information to the mobile user in accordance with his or her profile.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The demands on contemporary information systems are constantly increasing. In a dynamic business environment, an organization has to be prepared for sudden growth, shrinkage, or other types of reorganization. Such a change brings the need to adapt the information system that serves the company. The association of access rights to parts of the system with users, groups of users, user roles, etc. is of great importance for defining the different activities in the company and the restrictions on each employee's access rights according to his or her status. The mechanisms for access rights management are taken into account during system design, and in most cases they are built into the system. This paper offers an approach to developing a user rights framework that is applicable to information systems. It presents a reusable, extensible mechanism that can be integrated into information systems.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

With the explosive growth in the volume and complexity of document data (e.g., news, blogs, web pages), it has become necessary to semantically understand documents and deliver meaningful information to users. The areas dealing with these problems span data mining, information retrieval, and machine learning. For example, document clustering and summarization are two fundamental techniques for understanding document data and have attracted much attention in recent years. Given a collection of documents, document clustering aims to partition them into different groups to provide efficient document browsing and navigation mechanisms. One underexplored question in document clustering is how to generate a meaningful interpretation for each document cluster resulting from the clustering process. Document summarization is another effective technique for document understanding, which generates a summary by selecting sentences that deliver the major or topic-relevant information in the original documents. How to improve automatic summarization performance and how to apply it to newly emerging problems are two valuable research directions. To help people capture the semantics of documents effectively and efficiently, the dissertation focuses on developing effective data mining and machine learning algorithms and systems for (1) integrating document clustering and summarization to obtain meaningful document clusters with summarized interpretation, (2) improving document summarization performance and building document understanding systems to solve real-world applications, and (3) summarizing the differences and evolution of multiple document sources.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Brazilian CAPES Journal Portal aims to provide Information in Science and Technology (IST) for academic users. It is therefore considered a relevant instrument for graduate education and for the country's Science and Technology (S&T) development. Despite its importance, few studies focus on the policy analysis and efficiency of these resources. This research aims to fill this gap by analyzing the use of the CAPES Journal Portal by master's and doctoral alumni of the Post Graduate Program in Management (PPGA) at the Federal University of Rio Grande do Norte (UFRN). The main objective was operationalized through the following specific objectives: a) characterize the graduates' profile as CAPES Journal Portal users; b) identify motivations for using the CAPES Journal Portal; c) detect the graduates' degree of satisfaction with information seeking in the CAPES Journal Portal; d) verify the graduates' satisfaction with the use of the CAPES Journal Portal; e) verify how graduates use the information obtained in the development of their academic activities. The research is descriptive and employs a mixed methodological strategy in which the quantitative approach predominates. Data were collected through a web survey questionnaire. Quantitative data were analyzed using a statistical method. For the qualitative analysis, Brenda Dervin's sense-making approach was used, along with content analysis of the open-ended questions. The sample comprised 90 graduates who had defended their dissertation or thesis in the PPGA program at UFRN between 2010 and 2013, representing 88% of this population. As for the user profile, the analysis showed no quantitative differences related to gender. Most male graduates were aged 26 to 30.
As for female graduates, the great majority were 31 to 35 years old. Most graduates held a master's scholarship to support their studies. The great majority also reported using the Portal during their graduate studies. The main reasons for non-use were a preference for other databases and a lack of knowledge about the Portal. The most used information resources were theses and dissertations. The data also indicate a preference for full text. Those who used the Portal also reported using other electronic information sources to fulfill their information needs. The information sources sought outside the Portal were monographs, dissertations, and theses. SciELO was the most used information source. The results reveal that the Portal was accessed and used regularly during graduate studies. On the other hand, graduates also make use of other electronic information sources to meet their information needs. The study also confirmed the important role played by the Portal in Brazilian scientific communication. This held even though users reported the need for improvement in some aspects, such as periodic training to promote, encourage, and teach more effective use of the Portal; investment in expanding the Social Sciences collection in the Portal; and the implementation of a continuous evaluation process of user satisfaction with the services provided.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Postprint

Relevance:

100.00% 100.00%

Publisher:

Abstract:

While news stories are an important traditional medium to broadcast and consume news, microblogging has recently emerged as a place where people can discuss, disseminate, collect or report information about news. However, the massive information in the microblogosphere makes it hard for readers to keep up with these real-time updates. This is especially a problem when it comes to breaking news, where people are more eager to know “what is happening”. Therefore, this dissertation is intended as an exploratory effort to investigate computational methods to augment human effort when monitoring the development of breaking news on a given topic from a microblog stream by extractively summarizing the updates in a timely manner. More specifically, given an interest in a topic, either entered as a query or presented as an initial news report, a microblog temporal summarization system is proposed to filter microblog posts from a stream with three primary concerns: topical relevance, novelty, and salience. Considering the relatively high arrival rate of microblog streams, a cascade framework consisting of three stages is proposed to progressively reduce the quantity of posts. For each step in the cascade, this dissertation studies methods that improve over current baselines. In the relevance filtering stage, query and document expansion techniques are applied to mitigate sparsity and vocabulary mismatch issues. The use of word embedding as a basis for filtering is also explored, using unsupervised and supervised modeling to characterize lexical and semantic similarity. In the novelty filtering stage, several statistical ways of characterizing novelty are investigated and ensemble learning techniques are used to integrate results from these diverse techniques. These results are compared with a baseline clustering approach using both standard and delay-discounted measures. 
In the salience filtering stage, because of the real-time prediction requirement, a method of learning verb phrase usage from past relevant news reports is used in conjunction with some standard measures for characterizing writing quality. Following a Cranfield-like evaluation paradigm, this dissertation includes a series of experiments to evaluate the proposed methods for each step, and for the end-to-end system. New microblog novelty and salience judgments are created, building on existing relevance judgments from the TREC Microblog track. The results point to future research directions at the intersection of social media, computational journalism, information retrieval, automatic summarization, and machine learning.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Sequences of timestamped events are currently being generated across nearly every domain of data analytics, from e-commerce web logging to electronic health records used by doctors and medical researchers. Every day, this data type is reviewed by humans who apply statistical tests, hoping to learn everything they can about how these processes work, why they break, and how they can be improved upon. To further uncover why these processes work the way they do, researchers often compare two groups, or cohorts, of event sequences to find the differences and similarities between outcomes and processes. With temporal event sequence data, this task is complex because of the variety of ways single events and sequences of events can differ between the two cohorts of records: the structure of the event sequences (e.g., event order, co-occurring events, or frequencies of events), the attributes of the events and records (e.g., gender of a patient), or metrics about the timestamps themselves (e.g., duration of an event). Running statistical tests to cover all these cases and determining which results are significant becomes cumbersome. Current tools for comparing groups of event sequences emphasize either a purely statistical or a purely visual approach to comparison. Visual analytics tools leverage humans' ability to easily see patterns and anomalies that they were not expecting, but are limited by uncertainty in findings. Statistical tools emphasize finding significant differences in the data, but often require researchers to have a concrete question and do not facilitate more general exploration of the data. Combining visual analytics tools with statistical methods leverages the benefits of both approaches for quicker and easier insight discovery. 
Integrating statistics into a visualization tool presents many challenges on the frontend (e.g., displaying the results of many different metrics concisely) and in the backend (e.g., scalability challenges with running various metrics on multi-dimensional data at once). I begin by exploring the problem of comparing cohorts of event sequences and understanding the questions that analysts commonly ask in this task. From there, I demonstrate that combining automated statistics with an interactive user interface amplifies the benefits of both types of tools, thereby enabling analysts to conduct quicker and easier data exploration, hypothesis generation, and insight discovery. The direct contributions of this dissertation are: (1) a taxonomy of metrics for comparing cohorts of temporal event sequences, (2) a statistical framework for exploratory data analysis with a method I refer to as high-volume hypothesis testing (HVHT), (3) a family of visualizations and guidelines for interaction techniques that are useful for understanding and parsing the results, and (4) a user study, five long-term case studies, and five short-term case studies which demonstrate the utility and impact of these methods in various domains: four in the medical domain, one in web log analysis, two in education, and one each in social networks, sports analytics, and security. My dissertation contributes an understanding of how cohorts of temporal event sequences are commonly compared and the difficulties associated with applying and parsing the results of these metrics. It also contributes a set of visualizations, algorithms, and design guidelines for balancing automated statistics with user-driven analysis to guide users to significant, distinguishing features between cohorts. 
This work opens avenues for future research in comparing two or more groups of temporal event sequences, opening traditional machine learning and data mining techniques to user interaction, and extending the principles found in this dissertation to data types beyond temporal event sequences.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The objective was to analyze the profile of newborns and mothers and early neonatal mortality according to hospital complexity and affiliation with the Unified Health System (Sistema Único de Saúde, SUS) in the Metropolitan Region of São Paulo, Brazil. The study was based on data on live births, deaths, and the hospital registry. Factor and cluster analyses were used to obtain the complexity typology and the user profile. The SUS serves more at-risk newborns and more mothers with low schooling, insufficient prenatal care, and adolescent mothers. The probability of early neonatal death was 5.6 per 1,000 live births (65% higher in the SUS), with no differences by hospital complexity level, except in hospitals of very high (SUS) and medium (non-SUS) complexity. The early neonatal mortality differential between the two networks is smaller in the group of newborns < 1,500 g (22%); however, the rate is 131% higher in the SUS for newborns > 2,500 g. There is a concentration of high-risk births in the SUS network, yet the difference in early neonatal mortality between the SUS and non-SUS networks is smaller in this group of newborns. Further studies are needed to better understand the high mortality of newborns > 2,500 g in the SUS.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Our aim was to evaluate the interaction between breast cancer cells and nodal fibroblasts by means of their gene expression profiles. Fibroblast primary cultures were established from negative and positive lymph nodes from breast cancer patients, and a similar gene expression pattern was identified following cell culture. Fibroblasts and breast cancer cells (MDA-MB231, MDA-MB435, and MCF7) were cultured alone or co-cultured separated by a porous membrane (which allows passage of soluble factors) for comparison. Each breast cancer lineage exerted a particular effect on fibroblast viability and transcriptional profile. However, fibroblasts from positive and negative nodes showed parallel transcriptional behavior when co-cultured with a specific breast cancer cell line. The effects of nodal fibroblasts on breast cancer cells were also investigated. MDA-MB231 cell viability and migration were enhanced by the presence of fibroblasts, and MDA-MB435 and MCF7 cell viability followed a similar pattern. The MDA-MB231 gene expression profile, as evaluated by cDNA microarray, was influenced by the presence of fibroblasts, and in a new series of RT-PCR assays HNMT, COMT, FN3K, and SOD2 were confirmed to be downregulated in MDA-MB231 cells co-cultured with fibroblasts from both negative and positive nodes. In summary, the transcriptional changes induced in breast cancer cells by fibroblasts from positive as well as negative nodes are very much alike in a specific lineage. However, fibroblast effects are distinct in each of the breast cancer lineages, suggesting that the inter-relationships between stromal and malignant cells depend on the intrinsic subtype of the tumor.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The aim was to test the ability of frequency-doubling technology (FDT) perimetry to detect dysthyroid optic neuropathy (DON). Fifteen eyes of 15 patients with DON and 15 healthy control eyes were studied. Eligible eyes had a diagnosis of DON based on visual field abnormalities on standard automated perimetry and had visual acuity better than 20/30. FDT testing was performed using both the C-20-5 screening test and the C-20 full-threshold test. Normal and DON eyes were compared with regard to FDT mean sensitivity. Sensitivity ranges were 40.0%-86.7% for the screening test, and 53.3%-100.0% (total deviation) and 20.0%-93.3% (pattern deviation) for the C-20 threshold test. The corresponding specificity ranges were 86.7%-100.0%, 33.3%-93.3%, and 26.7%-100.0%, respectively. The best sensitivity/specificity ratios were for one abnormal point depressed < 5% in the screening test (86.7%/86.7%), one point depressed < 1% in the total deviation analysis (80.0%/86.7%), and one point depressed < 2% in the pattern deviation analysis (80.0%/86.7%). DON eyes presented significantly lower than normal average sensitivity in the central, pericentral, and peripheral areas. FDT perimetry is a useful screening tool for DON in eyes with normal or only slightly reduced visual acuity.
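The reported percentages are consistent with counts over 15 eyes per group (for example, 86.7% is 13 of 15 and 80.0% is 12 of 15). As a minimal sketch, assuming sensitivity and specificity were computed per eye over the 15 DON eyes and the 15 control eyes, the arithmetic would look like this. The function names and the counts passed in are illustrative and are not taken from the paper.

```python
def sensitivity(true_positives: int, diseased_total: int) -> float:
    """Fraction of diseased eyes that the test flags as abnormal."""
    return true_positives / diseased_total


def specificity(true_negatives: int, healthy_total: int) -> float:
    """Fraction of healthy eyes that the test classifies as normal."""
    return true_negatives / healthy_total


# Illustrative counts (assumed): 13 of 15 DON eyes flagged,
# 13 of 15 control eyes classified as normal.
sens = sensitivity(13, 15)
spec = specificity(13, 15)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

With only 15 eyes per group, each eye moves the estimate by about 6.7 percentage points, which is why the reported values fall on such a coarse grid.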

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The dynamics of catalytic networks have been widely studied over the last decades because of their implications in several fields such as prebiotic evolution, virology, neural networks, immunology, and ecology. One of the most studied mathematical frameworks for catalytic networks, hypercycle theory, was initially formulated in the context of prebiotic evolution. The hypercycle is a set of self-replicating species able to catalyze other replicator species within a cyclic architecture. Hypercyclic organization might arise from a quasispecies as a way to increase the informational content, surpassing the so-called error threshold. The catalytic coupling between replicators makes all the species behave like a single, coherent evolutionary multimolecular unit. The inherent nonlinearities of catalytic interactions are responsible for the emergence of several types of dynamics, among them chaos. In this article we begin with a brief review of hypercycle theory, focusing on its evolutionary implications as well as on the different dynamics associated with different types of small catalytic networks. Then we use symbolic dynamics theory to study the properties of chaotic hypercycles with error-prone replication, characterizing, by means of the theory of topological Markov chains, the topological entropy and the periods of the orbits of unimodal-like iterated maps obtained from the strange attractor. We focus our study on some key parameters responsible for the structure of the catalytic network: mutation rates and autocatalytic and cross-catalytic interactions.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Excessive demand on Emergency Departments is an issue that raises financial concerns. Contributing to this is the mindset of the population, which believes that this service offers easier access, has more resources, and provides better health care. New measures, such as increased user fees (taxas moderadoras), have been advocated to try to curb this phenomenon. However, despite a decrease of about 10% in emergency episodes in Portugal, studies point to values on the order of 30-35% of non-urgent episodes. It is therefore important not only to emphasize the new measures but also to educate the population on the correct use of these services through awareness campaigns. This makes it necessary to arrive at the profile of the abusive user. To identify an abuse profile, data were requested on emergency episodes that occurred over a 6-month period at Hospital de São João, and a logistic regression model was then estimated. The methodology makes it possible to identify which characteristics influence abusive use of the service and to quantify the impact of each of these characteristics on the probability that a patient exhibits abusive behavior. It was concluded that a woman aged 18-30 who lives in Vila Nova de Gaia, visits the emergency department at night, is assigned a blue wristband, and is covered by the National Health Service (Serviço Nacional de Saúde) has a 91.92% probability of using this service abusively. In contrast, a man over 60 who lives in Maia, visits the service during the day, is exempt from user fees, is covered by ADSE, and is assigned an orange wristband has only a 39.93% probability of abusive behavior. These results are important for designing awareness campaigns that reduce abusive behavior.
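A logistic regression model turns a weighted sum of patient characteristics into a probability through the logistic function. The sketch below illustrates only that step. The intercept, coefficient values, and feature names are hypothetical placeholders, not the fitted values from the study, which the abstract does not report.

```python
import math


def logistic(z: float) -> float:
    """Map a linear predictor (log-odds) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p: float) -> float:
    """Inverse of logistic(): the log-odds corresponding to probability p."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients, assumed for illustration only.
intercept = -0.5
coefficients = {"female": 0.4, "age_18_30": 0.9, "night_visit": 0.6}

# A patient profile encoded as 0/1 indicator variables.
patient = {"female": 1, "age_18_30": 1, "night_visit": 0}

z = intercept + sum(coefficients[name] * patient[name] for name in coefficients)
print(f"log-odds = {z:.2f}, probability = {logistic(z):.4f}")

# Any profile reported at 91.92% must have this log-odds value,
# whatever the individual coefficients are.
print(f"logit(0.9192) = {logit(0.9192):.3f}")
```

Because each coefficient adds to the log-odds, the effect of a single characteristic on the probability depends on the values of the others, which is why the abstract reports probabilities for complete profiles.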

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Extracting the semantic relatedness of terms is an important topic in several areas, including data mining, information retrieval and web recommendation. This paper presents an approach for computing the semantic relatedness of terms using the knowledge base of DBpedia, a community effort to extract structured information from Wikipedia. Several approaches to extracting semantic relatedness from Wikipedia using bag-of-words vector models are already available in the literature. The research presented in this paper explores a novel approach using paths on an ontological graph extracted from DBpedia. It is based on an algorithm for finding and weighting a collection of paths connecting concept nodes. This algorithm was implemented in a tool called Shakti that extracts relevant ontological data for a given domain from DBpedia using its SPARQL endpoint. To validate the proposed approach, Shakti was used to recommend web pages on a Portuguese social site related to alternative music, and the results of that experiment are reported in this paper.
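The abstract does not specify how paths are found or weighted. As a rough sketch of the general idea only, the snippet below enumerates simple paths between two concept nodes in a small hand-made graph and sums a weight that decays with path length. The graph, the `decay` parameter, the `max_edges` cap, and the scoring rule are all assumptions made for illustration; this is not Shakti's actual algorithm.

```python
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]


def simple_paths(graph: Graph, start: str, goal: str, max_edges: int) -> List[List[str]]:
    """Enumerate simple paths from start to goal using at most max_edges edges."""
    found: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        if node == goal:
            found.append(path)
            return
        if len(path) - 1 >= max_edges:
            return
        for neighbor in sorted(graph.get(node, set())):
            if neighbor not in path:  # keep the path simple (no repeated nodes)
                walk(neighbor, path + [neighbor])

    walk(start, [start])
    return found


def relatedness(graph: Graph, a: str, b: str, max_edges: int = 3, decay: float = 0.5) -> float:
    """Sum decay**edges over all simple paths, so shorter paths count more (assumed rule)."""
    return sum(decay ** (len(p) - 1) for p in simple_paths(graph, a, b, max_edges))


# Toy undirected graph with hypothetical concepts (not DBpedia data).
toy: Graph = {
    "Fado": {"Portugal", "Music"},
    "Portugal": {"Fado", "Lisbon"},
    "Music": {"Fado", "Rock"},
    "Rock": {"Music"},
    "Lisbon": {"Portugal"},
}

print(relatedness(toy, "Fado", "Rock"))    # connected by one 2-edge path
print(relatedness(toy, "Rock", "Lisbon"))  # only a 4-edge path exists, beyond the cap
```

Capping the path length keeps the enumeration tractable on a large graph, at the cost of treating distantly connected concepts as unrelated.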