975 results for probabilistic topic models
Abstract:
There are several scoring rules that one can choose from in order to score probabilistic forecasting models or estimate model parameters. Whilst it is generally agreed that proper scoring rules are preferable, there is no clear criterion for preferring one proper scoring rule above another. This manuscript compares and contrasts some commonly used proper scoring rules and provides guidance on scoring rule selection. In particular, it is shown that the logarithmic scoring rule prefers erring with more uncertainty, the spherical scoring rule prefers erring with lower uncertainty, whereas the other scoring rules are indifferent to either option.
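As an illustration of what a proper scoring rule looks like in practice, the following is a minimal sketch rather than the manuscript's own code. It implements the logarithmic and spherical rules named in the abstract, plus the quadratic (Brier-type) rule, which is assumed here to be one of the "other" rules. The function names and the example distributions are hypothetical.

```python
import math


def log_score(probs, outcome):
    """Logarithmic score: log of the probability given to the observed outcome.

    Raises ValueError if that probability is zero.
    """
    return math.log(probs[outcome])


def quadratic_score(probs, outcome):
    """Quadratic (Brier-type) score, positively oriented: 2*p_k - sum(p_i^2)."""
    return 2 * probs[outcome] - sum(p * p for p in probs)


def spherical_score(probs, outcome):
    """Spherical score: p_k divided by the Euclidean norm of the forecast."""
    return probs[outcome] / math.sqrt(sum(p * p for p in probs))


def expected_score(rule, forecast, truth):
    """Expected score of `forecast` when outcomes are drawn from `truth`."""
    return sum(q * rule(forecast, k) for k, q in enumerate(truth) if q > 0)


if __name__ == "__main__":
    truth = [0.7, 0.2, 0.1]      # hypothetical true distribution
    hedged = [0.5, 0.3, 0.2]     # hypothetical forecast that differs from it
    for rule in (log_score, quadratic_score, spherical_score):
        honest = expected_score(rule, truth, truth)
        other = expected_score(rule, hedged, truth)
        # For a proper rule, forecasting the true distribution scores best.
        print(rule.__name__, honest > other)
```

All three rules are positively oriented here (higher is better), so propriety shows up as the honest forecast attaining the largest expected score.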
Abstract:
Point pattern matching in Euclidean Spaces is one of the fundamental problems in Pattern Recognition, having applications ranging from Computer Vision to Computational Chemistry. Whenever two complex patterns are encoded by two sets of points identifying their key features, their comparison can be seen as a point pattern matching problem. This work proposes a single approach to both exact and inexact point set matching in Euclidean Spaces of arbitrary dimension. In the case of exact matching, it is assured to find an optimal solution. For inexact matching (when noise is involved), experimental results confirm the validity of the approach. We start by regarding point pattern matching as a weighted graph matching problem. We then formulate the weighted graph matching problem as one of Bayesian inference in a probabilistic graphical model. By exploiting the existence of fundamental constraints in patterns embedded in Euclidean Spaces, we prove that for exact point set matching a simple graphical model is equivalent to the full model. It is possible to show that exact probabilistic inference in this simple model has polynomial time complexity with respect to the number of elements in the patterns to be matched. This gives rise to a technique that for exact matching provably finds a global optimum in polynomial time for any dimensionality of the underlying Euclidean Space. Computational experiments comparing this technique with well-known probabilistic relaxation labeling show significant performance improvement for inexact matching. The proposed approach is significantly more robust under augmentation of the sizes of the involved patterns. In the absence of noise, the results are always perfect.
Abstract:
This keynote presentation will report some of our research work and experience on the development and applications of relevant methods, models, systems and simulation techniques in support of different types and various levels of decision making for business, management and engineering. In particular, the following topics will be covered:
Modelling, multi-agent-based simulation and analysis of the allocation management of carbon dioxide emission permits in China (Nanfeng Liu & Shuliang Li)
Agent-based simulation of the dynamic evolution of enterprise carbon assets (Yin Zeng & Shuliang Li)
A framework & system for extracting and representing project knowledge contexts using topic models and dynamic knowledge maps: a big data perspective (Jin Xu, Zheng Li, Shuliang Li & Yanyan Zhang)
Open innovation: intelligent model, social media & complex adaptive system simulation (Shuliang Li & Jim Zheng Li)
A framework, model and software prototype for modelling and simulation for deshopping behaviour and how companies respond (Shawkat Rahman & Shuliang Li)
Integrating multiple agents, simulation, knowledge bases and fuzzy logic for international marketing decision making (Shuliang Li & Jim Zheng Li)
A Web-based hybrid intelligent system for combined conventional, digital, mobile, social media and mobile marketing strategy formulation (Shuliang Li & Jim Zheng Li)
A hybrid intelligent model for Web & social media dynamics, and evolutionary and adaptive branding (Shuliang Li)
A hybrid paradigm for modelling, simulation and analysis of brand virality in social media (Shuliang Li & Jim Zheng Li)
Network configuration management: attack paradigms and architectures for computer network survivability (Tero Karvinen & Shuliang Li)
Abstract:
We propose three research problems to explore the relations between trust and security in the setting of distributed computation. In the first problem, we study trust-based adversary detection in distributed consensus computation. The adversaries we consider behave arbitrarily, disobeying the consensus protocol. We propose a trust-based consensus algorithm with local and global trust evaluations. The algorithm can be abstracted using a two-layer structure with the top layer running a trust-based consensus algorithm and the bottom layer as a subroutine executing a global trust update scheme. We utilize a set of pre-trusted nodes, headers, to propagate local trust opinions throughout the network. This two-layer framework is flexible in that it can be easily extended to include more complicated decision rules and global trust schemes. The first problem assumes that normal nodes are homogeneous, i.e., it is guaranteed that a normal node always behaves as it is programmed. In the second and third problems, however, we assume that nodes are heterogeneous, i.e., given a task, the probability that a node generates a correct answer varies from node to node. The adversaries considered in these two problems are workers from the open crowd who are either investing little effort in the tasks assigned to them or intentionally giving wrong answers to questions. In the second part of the thesis, we consider a typical crowdsourcing task that aggregates input from multiple workers as a problem in information fusion. To cope with the issue of noisy and sometimes malicious input from workers, trust is used to model workers' expertise. In a multi-domain knowledge learning task, however, using scalar-valued trust to model a worker's performance is not sufficient to reflect the worker's trustworthiness in each of the domains.
To address this issue, we propose a probabilistic model to jointly infer multi-dimensional trust of workers, multi-domain properties of questions, and true labels of questions. Our model is very flexible and extensible to incorporate metadata associated with questions. To show that, we further propose two extended models, one of which handles input tasks with real-valued features and the other handles tasks with text features by incorporating topic models. Our models can effectively recover trust vectors of workers, which can be very useful in task assignment adaptive to workers' trust in the future. These results can be applied for fusion of information from multiple data sources like sensors, human input, machine learning results, or a hybrid of them. In the second subproblem, we address crowdsourcing with adversaries under logical constraints. We observe that questions are often not independent in real life applications. Instead, there are logical relations between them. Similarly, workers that provide answers are not independent of each other either. Answers given by workers with similar attributes tend to be correlated. Therefore, we propose a novel unified graphical model consisting of two layers. The top layer encodes domain knowledge which allows users to express logical relations using first-order logic rules and the bottom layer encodes a traditional crowdsourcing graphical model. Our model can be seen as a generalized probabilistic soft logic framework that encodes both logical relations and probabilistic dependencies. To solve the collective inference problem efficiently, we have devised a scalable joint inference algorithm based on the alternating direction method of multipliers. The third part of the thesis considers the problem of optimal assignment under budget constraints when workers are unreliable and sometimes malicious. In a real crowdsourcing market, each answer obtained from a worker incurs cost. 
The cost is associated with both the level of trustworthiness of workers and the difficulty of tasks. Typically, access to expert-level (more trustworthy) workers is more expensive than access to the average crowd, and completion of a challenging task is more costly than a click-away question. Here, we address the problem of optimal assignment of heterogeneous tasks to workers of varying trust levels under budget constraints. Specifically, we design a trust-aware task allocation algorithm that takes as inputs the estimated trust of workers and a pre-set budget, and outputs the optimal assignment of tasks to workers. We derive a bound on the total error probability that relates naturally to the budget, the trustworthiness of crowds, and the costs of obtaining labels from crowds. Higher budget, more trustworthy crowds, and less costly jobs result in a lower theoretical bound. Our allocation scheme does not depend on the specific design of the trust evaluation component. Therefore, it can be combined with generic trust evaluation algorithms.
Abstract:
Conventional topic models are ineffective for topic extraction from microblog messages since the lack of structure and context among the posts yields poor message-level word co-occurrence patterns. In this work, we organize microblog posts as conversation trees based on reposting and replying relations, which enrich context information to alleviate data sparseness. Our model generates words according to topic dependencies derived from the conversation structures. Specifically, we differentiate messages as leader messages, which initiate key aspects of previously focused topics or shift the focus to different topics, and follower messages, which do not introduce any new information but simply echo topics from the messages that they repost or reply to. Our model captures the different extents to which leader and follower messages may contain the key topical words, and thus further enhances the quality of the induced topics. The results of thorough experiments demonstrate the effectiveness of our proposed model.
Abstract:
Child protection is a subject of high value to society, but cases of child abuse are difficult to identify. Moving from suspicion to accusation is very difficult and requires very strong evidence. Typically, health care services deal with these cases from the beginning, where there is evidence based on the diagnosis, but it is not enough to support an accusation. In addition, the subject is highly sensitive because there are legal aspects to deal with, such as patient privacy, paternity issues and medical confidentiality, among others. We propose a model of a critical knowledge monitoring system for child abuse that addresses this problem. This decision support system draws on multiple scientific domains: capture of tokens from clinical documents from multiple sources; a topic model approach to identify the topics of the documents; and knowledge management through the use of ontologies to support critical, sensitive knowledge concepts and relations, such as symptoms and behaviors, among other evidence, in order to match them with the topics inferred from the clinical documents and then alert and log when clinical evidence is present. Based on these alerts, clinical personnel could analyze the situation and take the appropriate action.
Abstract:
INTRODUCTION: This study sought to increase understanding of women's thoughts and feelings about decision making and the experience of subsequent pregnancy following stillbirth (intrauterine death after 24 weeks' gestation). METHODS: Eleven women were interviewed, 8 of whom were pregnant at the time of the interview. Modified grounded theory was used to guide the research methodology and to analyze the data. RESULTS: A model was developed to illustrate women's experiences of decision making in relation to subsequent pregnancy and of subsequent pregnancy itself. DISCUSSION: The results of the current study have significant implications for women who have experienced stillbirth and the health professionals who work with them. Based on the model, women may find it helpful to discuss their beliefs in relation to healing, and health professionals may provide support with this in mind. Women and their partners may also benefit from explanations and support about the potentially conflicting emotions they may experience during this time.
Abstract:
OBJECTIVE: To calculate the variable costs involved with the process of delivering erythropoiesis stimulating agents (ESA) in European dialysis practices. METHODS: A conceptual model was developed to classify the processes and sub-processes followed in the pharmacy (ordering from supplier, receiving/storing/delivering ESA to the dialysis unit), dialysis unit (dose determination, ordering, receipt, registration, storage, administration, registration) and waste disposal unit. Time and material costs were recorded. Labour costs were derived from actual local wages while material costs came from the facilities' accounting records. Activities associated with ESA administration were listed and each activity evaluated to determine if dosing frequency affected the amount of resources required. RESULTS: A total of 21 centres in 8 European countries supplied data for 142 patients (mean) per hospital (range 42-648). Patients received various ESA regimens (thrice-weekly, twice-weekly, once-weekly, once every 2 weeks and once-monthly). Administering ESA every 2 weeks, the mean costs per patient per year for each process and the estimates of the percentage reduction in costs obtainable, respectively, were: pharmacy labour (10.1 euro, 39%); dialysis unit labour (66.0 euro, 65%); dialysis unit materials (4.11 euro, 61%) and waste unit materials (0.43 euro, 49%). LIMITATION: Impact on financial costs was not measured. CONCLUSION: ESA administration has quantifiable labour and material costs which are affected by dosing frequency.
Abstract:
BACKGROUND: Three different burnout types have been described: the "frenetic" type describes involved and ambitious subjects who sacrifice their health and personal lives for their jobs; the "underchallenged" type describes indifferent and bored workers who fail to find personal development in their jobs; and the "worn-out" type describes neglectful subjects who feel they have little control over results and whose efforts go unacknowledged. The study aimed to describe the possible associations between burnout types and general sociodemographic and occupational characteristics. METHODS: A cross-sectional study was carried out on a multi-occupational sample of randomly selected university employees (n = 409). The presence of burnout types was assessed by means of the "Burnout Clinical Subtype Questionnaire (BCSQ-36)", and the degree of association between variables was assessed using an adjusted odds ratio (OR) obtained from multivariate logistic regression models. RESULTS: Individuals working more than 40 hours per week presented with the greatest risk for "frenetic" burnout compared to those working fewer than 35 hours (adjusted OR = 5.69; 95% CI = 2.52-12.82; p < 0.001). Administration and service personnel presented the greatest risk of "underchallenged" burnout compared to teaching and research staff (adjusted OR = 2.85; 95% CI = 1.16-7.01; p = 0.023). Employees with more than sixteen years of service in the organisation presented the greatest risk of "worn-out" burnout compared to those with less than four years of service (adjusted OR = 4.56; 95% CI = 1.47-14.16; p = 0.009). CONCLUSIONS: This study is the first to our knowledge that suggests the existence of associations between the different burnout subtypes (classified according to the degree of dedication to work) and the different sociodemographic and occupational characteristics that are congruent with the definition of each of the subtypes.
These results are consistent with the clinical profile definitions of burnout syndrome. In addition, they assist the recognition of distinct profiles and reinforce the idea of differential characterisation of the syndrome for more effective treatment.
Abstract:
Despite medical advances, mortality in infective endocarditis (IE) is still very high. Previous studies on prognosis in IE have observed conflicting results. The aim of this study was to identify predictors of in-hospital mortality in a large multicenter cohort of left-sided IE. Methods: An observational multicenter study was conducted from January 1984 to December 2006 in seven hospitals in Andalusia, Spain. Seven hundred and five left-sided IE patients were included. The main outcome measure was in-hospital mortality. Several prognostic factors were analysed by univariate tests and then by a multilogistic regression model. Results: The overall mortality was 29.5% (25.5% from 1984 to 1995 and 31.9% from 1996 to 2006; Odds Ratio 1.25; 95% Confidence Interval: 0.97-1.60; p = 0.07). In univariate analysis, age, comorbidity, especially chronic liver disease, prosthetic valve, virulent microorganism such as Staphylococcus aureus, Streptococcus agalactiae and fungi, and complications (septic shock, severe heart failure, renal insufficiency, neurologic manifestations and perivalvular extension) were related with higher mortality. Independent factors for mortality in multivariate analysis were: Charlson comorbidity score (OR: 1.2; 95% CI: 1.1-1.3), prosthetic endocarditis (OR: 1.9; CI: 1.2-3.1), Staphylococcus aureus aetiology (OR: 2.1; CI: 1.3-3.5), severe heart failure (OR: 5.4; CI: 3.3-8.8), neurologic manifestations (OR: 1.9; CI: 1.2-2.9), septic shock (OR: 4.2; CI: 2.3-7.7), perivalvular extension (OR: 2.4; CI: 1.3-4.5) and acute renal failure (OR: 1.69; CI: 1.0-2.6). Conversely, Streptococcus viridans group etiology (OR: 0.4; CI: 0.2-0.7) and surgical treatment (OR: 0.5; CI: 0.3-0.8) were protective factors. Conclusions: Several characteristics of left-sided endocarditis enable selection of a patient group at higher risk of mortality.
This group may benefit from more specialised attention in referral centers, and these characteristics should help to identify those patients who might benefit from more aggressive diagnostic and/or therapeutic procedures.
Abstract:
The human leukocyte antigen (HLA) DRB1*1501 has been consistently associated with multiple sclerosis (MS) in nearly all populations tested. This points to a specific antigen presentation as the pathogenic mechanism though this does not fully explain the disease association. The identification of expression quantitative trait loci (eQTL) for genes in the HLA locus poses the question of the role of gene expression in MS susceptibility. We analyzed the eQTLs in the HLA region with respect to MS-associated HLA-variants obtained from genome-wide association studies (GWAS). We found that the Tag of DRB1*1501, rs3135388 A allele, correlated with high expression of DRB1, DRB5 and DQB1 genes in a Caucasian population. In quantitative terms, the MS-risk AA genotype carriers of rs3135388 were associated with 15.7-, 5.2- and 8.3-fold higher expression of DQB1, DRB5 and DRB1, respectively, than the non-risk GG carriers. The haplotype analysis of expression-associated variants in a Spanish MS cohort revealed that high expression of DRB1 and DQB1 alone did not contribute to the disease. However, in Caucasian, Asian and African American populations, the DRB1*1501 allele was always highly expressed. In other immune related diseases such as type 1 diabetes, inflammatory bowel disease, ulcerative colitis, asthma and IgA deficiency, the best GWAS-associated HLA SNPs were also eQTLs for different HLA Class II genes. Our data suggest that the DR/DQ expression levels, together with specific structural properties of alleles, seem to be the causal effect in MS and in other immunopathologies rather than specific antigen presentation alone.
Abstract:
BACKGROUND Available screening tests for dementia are of limited usefulness because they are influenced by the patient's culture and educational level. The Eurotest, an instrument based on the knowledge and handling of money, was designed to overcome these limitations. The objective of this study was to evaluate the diagnostic accuracy of the Eurotest in identifying dementia in customary clinical practice. METHODS A cross-sectional, multi-center, naturalistic phase II study was conducted. The Eurotest was administered to consecutive patients, older than 60 years, in general neurology clinics. The patients' condition was classified as dementia or no dementia according to DSM-IV diagnostic criteria. We calculated sensitivity (Sn), specificity (Sp) and area under the ROC curves (aROC) with 95% confidence intervals. The influence of social and educational factors on scores was evaluated with multiple linear regression analysis, and the influence of these factors on diagnostic accuracy was evaluated with logistic regression. RESULTS Sixteen neurologists recruited a total of 516 participants: 101 with dementia, 380 without dementia, and 35 who were excluded. Of the 481 participants who took the Eurotest, 38.7% were totally or functionally illiterate and 45.5% had received no formal education. Mean time needed to administer the test was 8.2+/-2.0 minutes. The best cut-off point was 20/21, with Sn = 0.91 (0.84-0.96), Sp = 0.82 (0.77-0.85), and aROC = 0.93 (0.91-0.95). Neither the scores on the Eurotest nor its diagnostic accuracy were influenced by social or educational factors. CONCLUSION This naturalistic and pragmatic study shows that the Eurotest is a rapid, simple and useful screening instrument, which is free from educational influences, and has appropriate internal and external validity.
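The sensitivity and specificity reported above are the standard confusion-matrix ratios, and the following minimal sketch shows how they can be computed at a given cut-off. It is not the study's analysis code: the function name is hypothetical, the toy scores are invented for illustration and are not study data, and it assumes that a score at or below the cut-off counts as a positive screen (so "20/21" is read as 20 or less being positive).

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Return (sensitivity, specificity) for a screening test.

    Assumption: scores at or below `cutoff` are treated as test-positive.
    `has_condition` holds the reference-standard diagnosis for each score.
    """
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, has_condition):
        positive = score <= cutoff
        if diseased and positive:
            tp += 1          # true positive
        elif diseased:
            fn += 1          # false negative
        elif positive:
            fp += 1          # false positive
        else:
            tn += 1          # true negative
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Invented toy data, NOT taken from the study.
    toy_scores = [12, 18, 20, 23, 25, 30, 19, 28]
    toy_diagnosis = [True, True, True, True, False, False, False, False]
    sn, sp = sensitivity_specificity(toy_scores, toy_diagnosis, cutoff=20)
    print(f"Sn = {sn:.2f}, Sp = {sp:.2f}")
```

Sweeping `cutoff` over the range of scores and plotting sensitivity against one minus specificity gives the ROC curve whose area the abstract reports.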
Abstract:
BACKGROUND. Autoimmunity appears to be associated with the pathophysiology of Meniere's disease (MD), an inner ear disorder characterized by episodes of vertigo associated with hearing loss and tinnitus. However, the prevalence of autoimmune diseases (AD) in patients with MD has not been studied in individuals with uni- or bilateral sensorineural hearing loss (SNHL). METHODS AND FINDINGS. We estimated the prevalence of AD in 690 outpatients with MD with uni- or bilateral SNHL from otoneurology clinics at six tertiary referral hospitals by using clinical criteria and an immune panel (lymphocyte populations, antinuclear antibodies, C3, C4 and proinflammatory cytokines TNFα, IFNγ). The observed prevalence of rheumatoid arthritis (RA), systemic lupus erythematosus (SLE) and ankylosing spondylitis (AS) was higher than expected for the general population (1.39 for RA, 0.87 for SLE and 0.70 for AS, respectively). Systemic AD were more frequently observed in patients with MD and diagnostic criteria for migraine than in cases with MD and tension-type headache (p = 0.007). There were clinical differences between patients with uni- or bilateral SNHL, but no differences were found in the immune profile. Multiple linear regression showed that changes in lymphocyte subpopulations were associated with hearing loss and persistence of vertigo, suggesting a role for the immune response in MD. CONCLUSIONS. Despite some limitations, MD displays an elevated prevalence of systemic AD such as RA, SLE and AS. This finding, which suggests an autoimmune background in a subset of patients with MD, has important implications for the treatment of MD.
Abstract:
BACKGROUND Evidence associating exposure to water disinfection by-products with reduced birth weight and altered duration of gestation remains inconclusive. OBJECTIVE We assessed exposure to trihalomethanes (THMs) during pregnancy through different water uses and evaluated the association with birth weight, small for gestational age (SGA), low birth weight (LBW), and preterm delivery. METHODS Mother-child cohorts set up in five Spanish areas during the years 2000-2008 contributed data on water ingestion, showering, bathing, and swimming in pools. We ascertained residential THM levels during pregnancy periods through ad hoc sampling campaigns (828 measurements) and regulatory data (264 measurements), which were modeled and combined with personal water use and uptake factors to estimate personal uptake. We defined outcomes following standard definitions and included 2,158 newborns in the analysis. RESULTS Median residential THM ranged from 5.9 μg/L (Valencia) to 114.7 μg/L (Sabadell), and speciation differed across areas. We estimated that 89% of residential chloroform and 96% of brominated THM uptakes were from showering/bathing. The estimated change of birth weight for a 10% increase in residential uptake was -0.45 g (95% confidence interval: -1.36, 0.45 g) for chloroform and 0.16 g (-1.38, 1.70 g) for brominated THMs. Overall, THMs were not associated with SGA, LBW, or preterm delivery. CONCLUSIONS Despite the high THM levels in some areas and the extensive exposure assessment, results suggest that residential THM exposure during pregnancy driven by inhalation and dermal contact routes is not associated with birth weight, SGA, LBW, or preterm delivery in Spain.
Abstract:
Ambulatory blood pressure (BP) monitoring has become useful in the diagnosis and management of hypertensive individuals. In addition to 24-hour values, the circadian variation of BP adds prognostic significance in predicting cardiovascular outcome. However, the magnitude of circadian BP patterns in large studies has hardly been noticed. Our aims were to determine the prevalence of circadian BP patterns and to assess clinical conditions associated with the nondipping status in groups of both treated and untreated hypertensive subjects, studied separately. Clinical data and 24-hour ambulatory BP monitoring were obtained from 42,947 hypertensive patients included in the Spanish Society of Hypertension Ambulatory Blood Pressure Monitoring Registry. They were 8384 previously untreated and 34,563 treated hypertensives. Twenty-four-hour ambulatory BP monitoring was performed with an oscillometric device (SpaceLabs 90207). A nondipping pattern was defined when nocturnal systolic BP dip was <10% of daytime systolic BP. The prevalence of nondipping was 41% in the untreated group and 53% in treated patients. In both groups, advanced age, obesity, diabetes mellitus, and overt cardiovascular or renal disease were associated with a blunted nocturnal BP decline (P<0.001). In treated patients, nondipping was associated with the use of a higher number of antihypertensive drugs but not with the time of the day at which antihypertensive drugs were administered. In conclusion, a blunted nocturnal BP dip (the nondipping pattern) is common in hypertensive patients. A clinical pattern of high cardiovascular risk is associated with nondipping, suggesting that the blunted nocturnal BP dip may be merely a marker of high cardiovascular risk.
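The nondipping definition above (nocturnal systolic dip below 10% of daytime systolic BP) can be written as a short calculation. The following is a minimal sketch, not the registry's code: the function names are hypothetical, and it assumes the dip is computed from mean daytime and mean night-time systolic values in mmHg, which the abstract does not spell out.

```python
def nocturnal_dip_percent(day_sbp, night_sbp):
    """Nocturnal systolic dip as a percentage of daytime systolic BP.

    Assumption: both arguments are period means in mmHg.
    """
    return 100.0 * (day_sbp - night_sbp) / day_sbp


def is_nondipper(day_sbp, night_sbp, threshold=10.0):
    """True when the nocturnal dip is below `threshold` percent."""
    return nocturnal_dip_percent(day_sbp, night_sbp) < threshold


if __name__ == "__main__":
    # Illustrative values only, not registry data.
    print(is_nondipper(140, 130))  # dip of about 7.1%
    print(is_nondipper(140, 120))  # dip of about 14.3%
```

A dip of exactly 10% is classed as dipping here, because the abstract defines nondipping with a strict "<10%" inequality.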