16 results for latent semantic analysis
in University of Queensland eSpace - Australia
Abstract:
Web transaction data between Web visitors and Web functionalities usually convey users' task-oriented behavior patterns. Mining this type of click-stream data makes it possible to capture usage pattern information. Web usage mining has become one of the most widely used methods for Web recommendation, which customizes Web content to a user-preferred style. Traditional Web usage mining techniques, such as Web user session or Web page clustering, association rule mining, and frequent navigational path mining, can only discover usage patterns explicitly. They cannot, however, reveal the underlying navigational activities or identify the latent relationships associated with the patterns among Web users and Web pages. In this work, we propose a Web recommendation framework incorporating a Web usage mining technique based on the Probabilistic Latent Semantic Analysis (PLSA) model. The main advantage of this method is that it not only discovers usage-based access patterns but also reveals the underlying latent factors. With the discovered user access patterns, we then present users with content of greater interest via collaborative recommendation. To validate the effectiveness of the proposed approach, we conduct experiments on real-world datasets and make comparisons with some existing traditional techniques. The preliminary experimental results demonstrate the usability of the proposed approach.
Abstract:
In this paper, we compare a well-known semantic space model, Latent Semantic Analysis (LSA), with another model, Hyperspace Analogue to Language (HAL), which is widely used in different areas, especially in automatic query refinement. We conduct this comparative analysis to prove our hypothesis that, with respect to the ability to extract lexical information from a corpus of text, LSA is quite similar to HAL. We regard HAL and LSA as black boxes. Through a Pearson's correlation analysis of the outputs of these two black boxes, we conclude that LSA correlates highly with HAL, and thus there is a justification that LSA and HAL can potentially play a similar role in facilitating automatic query refinement. This paper evaluates LSA in a new application area and contributes an effective way to compare different semantic space models.
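The black-box comparison described in this abstract can be illustrated with a small sketch. It assumes each model has already produced a similarity score for the same list of word pairs; the score lists below are made-up illustrative values, not data from the paper, and `pearson_r` is a helper name chosen here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical similarity scores for the same five word pairs from each model.
lsa_scores = [0.82, 0.10, 0.55, 0.31, 0.67]
hal_scores = [0.78, 0.15, 0.49, 0.40, 0.70]

r = pearson_r(lsa_scores, hal_scores)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 would indicate that the two models rank and scale the word pairs similarly, which is the kind of evidence the abstract refers to.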
Abstract:
This article applies methods of latent class analysis (LCA) to data on lifetime illicit drug use in order to determine whether qualitatively distinct classes of illicit drug users can be identified. Self-report data on lifetime illicit drug use (cannabis, stimulants, hallucinogens, sedatives, inhalants, cocaine, opioids and solvents) collected from a sample of 6265 Australian twins (average age 30 years) were analyzed using LCA. Rates of childhood sexual and physical abuse, lifetime alcohol and tobacco dependence, symptoms of illicit drug abuse/dependence and psychiatric comorbidity were compared across classes using multinomial logistic regression. LCA identified a 5-class model: Class 1 (68.5%) had low risks of the use of all drugs except cannabis; Class 2 (17.8%) had moderate risks of the use of all drugs; Class 3 (6.6%) had high rates of cocaine, other stimulant and hallucinogen use but lower risks for the use of sedatives or opioids. Conversely, Class 4 (3.0%) had relatively low risks of cocaine, other stimulant or hallucinogen use but high rates of sedative and opioid use. Finally, Class 5 (4.2%) had uniformly high probabilities for the use of all drugs. Rates of psychiatric comorbidity were highest in the polydrug class although the sedative/opioid class had elevated rates of depression/suicidal behaviors and exposure to childhood abuse. Aggregation of population-level data may obscure important subgroup differences in patterns of illicit drug use and psychiatric comorbidity. Further exploration of a 'self-medicating' subgroup is needed.
Abstract:
Context: The relationships among the different eating disorders that exist in the community are poorly understood, especially for residual disorders in which bingeing or purging occurs in the absence of other behaviors. Objective: To examine a community sample for the number of mutually exclusive weight and eating profiles. Design: Data regarding lifetime eating disorder symptoms and weight range were submitted to a latent profile analysis. Profiles were compared regarding personality, current eating and weight, retrospectively reported life events, and lifetime depressive psychopathology. Setting: Longitudinal study among female twins from the Australian Twin Registry in whom eating was assessed by a telephone interview. Participants: A community sample of 1002 twins (individuals) who had participated in earlier waves of data collection. Main Outcome Measures: Number and clinical character of latent profiles. Results: The best fit was a 5-profile solution with women who were (1) of normal weight with few lifetime eating disorders (4.3%), (2) overweight (10.6% had a lifetime eating disorder), (3) underweight and generally had no eating disorders except for 5.3% who had restricting anorexia nervosa, (4) of low to normal weight (89.0% had a lifetime eating disorder), and (5) obese (37.0% had a lifetime eating disorder). Each profile contained more than 1 type of lifetime eating disorder except for the third profile. Women in the first and third profiles had the best functioning, with women in the fourth and fifth profiles having similarly poorer functioning. The women in the fourth group had a symptom profile distinctive from the other 4 groups in terms of severity; they were also more likely to have had lifetime major depression and suicidality. Conclusion: Lifetime weight ranges and the severity of eating disorder symptoms affected clustering more than the type of eating disorder symptom.
Abstract:
There has been an increased demand for characterizing user access patterns using web mining techniques, since the knowledge extracted from web server log files can benefit not only web site structure improvement but also the understanding of user navigational behavior. In this paper, we present a web usage mining method that utilizes web user usage and page linkage information to capture user access patterns based on the Probabilistic Latent Semantic Analysis (PLSA) model. A specific probabilistic model analysis algorithm, the EM algorithm, is applied to the integrated usage data to infer the latent semantic factors as well as to generate user session clusters for revealing user access patterns. Experiments have been conducted on a real-world data set to validate the effectiveness of the proposed approach. The results show that the presented method is capable of characterizing the latent semantic factors and generating user profiles in terms of weighted page vectors, which may reflect the common access interest exhibited by users within the same session cluster.
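To make the EM step in this abstract concrete, here is a minimal sketch of PLSA fitted to a session-by-page count matrix. It assumes the standard symmetric formulation P(s, p) = sum over z of P(z) P(s|z) P(p|z); the toy matrix, the function names, and the rule of assigning each session to its most probable factor are illustrative assumptions, and the paper's integration of page linkage information is not modeled.

```python
import math
import random

def normalize(values):
    total = sum(values)
    return [v / total for v in values]

def plsa(counts, n_factors, n_iter=100, seed=0):
    """Fit P(z), P(s|z), P(p|z) to a session-by-page count matrix with EM."""
    rng = random.Random(seed)
    n_s, n_p = len(counts), len(counts[0])
    p_z = [1.0 / n_factors] * n_factors
    # Random positive initialization breaks the symmetry between factors.
    p_s_z = [normalize([rng.random() + 0.1 for _ in range(n_s)]) for _ in range(n_factors)]
    p_p_z = [normalize([rng.random() + 0.1 for _ in range(n_p)]) for _ in range(n_factors)]
    for _ in range(n_iter):
        acc_z = [0.0] * n_factors
        acc_s = [[0.0] * n_s for _ in range(n_factors)]
        acc_p = [[0.0] * n_p for _ in range(n_factors)]
        for s in range(n_s):
            for p in range(n_p):
                n = counts[s][p]
                if n == 0:
                    continue
                # E-step: posterior P(z | s, p) for this observed cell.
                joint = [p_z[z] * p_s_z[z][s] * p_p_z[z][p] for z in range(n_factors)]
                total = sum(joint)
                if total == 0:
                    continue
                for z in range(n_factors):
                    w = n * joint[z] / total
                    acc_z[z] += w
                    acc_s[z][s] += w
                    acc_p[z][p] += w
        grand = sum(acc_z)
        if grand == 0:
            raise ValueError("count matrix has no observations")
        # M-step: re-estimate the parameters from the expected counts.
        for z in range(n_factors):
            if acc_z[z] > 0:
                p_s_z[z] = [v / acc_z[z] for v in acc_s[z]]
                p_p_z[z] = [v / acc_z[z] for v in acc_p[z]]
            p_z[z] = acc_z[z] / grand
    return p_z, p_s_z, p_p_z

def log_likelihood(counts, p_z, p_s_z, p_p_z):
    ll = 0.0
    for s, row in enumerate(counts):
        for p, n in enumerate(row):
            if n:
                prob = sum(p_z[z] * p_s_z[z][s] * p_p_z[z][p] for z in range(len(p_z)))
                ll += n * math.log(prob)
    return ll

# Toy data (hypothetical): rows are sessions, columns are pages, entries are hit counts.
counts = [[5, 4, 0, 0],
          [4, 5, 0, 0],
          [0, 0, 3, 6],
          [0, 1, 4, 5]]

p_z, p_s_z, p_p_z = plsa(counts, n_factors=2)
# Assign each session to the factor with the largest P(z | s), proportional to P(z) P(s|z).
clusters = [max(range(2), key=lambda z: p_z[z] * p_s_z[z][s]) for s in range(len(counts))]
print("session clusters:", clusters)
print("page weights per factor:", [[round(v, 3) for v in row] for row in p_p_z])
```

Each row of `p_p_z` can be read as a weighted page vector for one latent factor, which is one plausible way to interpret the user profiles mentioned in the abstract.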
Abstract:
Collaborative recommendation is one of the widely used recommendation systems; it recommends items to a visitor by referring to the preferences of other users who are similar to the current user. User profiling based on Web transaction data is able to capture such knowledge of user tasks or interests. With the discovered usage pattern information, it becomes possible to recommend more preferred content to Web users or to customize the Web presentation for visitors via collaborative recommendation. In addition, it is helpful to identify the underlying relationships among Web users, items, and latent tasks during Web mining. In this paper, we propose a Web recommendation framework based on a user profiling technique. In this approach, we employ Probabilistic Latent Semantic Analysis (PLSA) to model the co-occurrence activities and develop a modified k-means clustering algorithm to build user profiles as representatives of usage patterns. Moreover, the hidden task model is derived by characterizing the meaningful latent factor space. With the discovered user profiles, we then choose the best-matching profile, which has the preference most similar to that of the current user, and make collaborative recommendations based on the corresponding page weights in the selected user profile. The preliminary experimental results on real-world data sets show that the proposed approach is capable of making recommendations accurately and efficiently.
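The final recommendation step described in this abstract (pick the best-matching profile, then rank pages by their weights in it) might look like the sketch below. The abstract does not state the matching measure, so cosine similarity is an assumption here, as are the profile names, the weights, and the rule of skipping pages the user has already visited.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend(session, profiles, top_n=2):
    """Return the best-matching profile name and the top unvisited pages in it."""
    best = max(profiles, key=lambda name: cosine(session, profiles[name]))
    weights = profiles[best]
    # Keep only pages the current user has not visited and that carry some weight.
    candidates = [(w, page) for page, (w, seen) in enumerate(zip(weights, session))
                  if seen == 0 and w > 0]
    # Highest weight first; ties broken by page index.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return best, [page for _, page in candidates[:top_n]]

# Hypothetical user profiles: one weight per page (pages 0..4).
profiles = {
    "news": [0.9, 0.8, 0.1, 0.0, 0.6],
    "shop": [0.0, 0.1, 0.9, 0.8, 0.2],
}
# Current user's session: 1 means the page was visited.
session = [1, 0, 0, 0, 0]

best, pages = recommend(session, profiles)
print(best, pages)
```

In this toy case the session overlaps only with the "news" profile, so its highest-weighted unvisited pages are the ones suggested.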
Abstract:
Latent class and genetic analyses were used to identify subgroups of migraine sufferers in a community sample of 6,265 Australian twins (55% female) aged 25-36 who had completed an interview based on International Headache Society (IHS) criteria. Consistent with prevalence rates from other population-based studies, 703 (20%) female and 250 (9%) male twins satisfied the IHS criteria for migraine without aura (MO), and of these, 432 (13%) female and 166 (6%) male twins satisfied the criteria for migraine with aura (MA) as indicated by visual symptoms. Latent class analysis (LCA) of IHS symptoms identified three major symptomatic classes, representing 1) a mild form of recurrent nonmigrainous headache, 2) a moderately severe form of migraine, typically without visual aura symptoms (although 40% of individuals in this class were positive for aura), and 3) a severe form of migraine typically with visual aura symptoms (although 24% of individuals were negative for aura). Using the LCA classification, many more individuals were considered affected to some degree than when using IHS criteria (35% vs. 13%). Furthermore, genetic model fitting indicated a greater genetic contribution to migraine using the LCA classification (heritability, h(2) = 0.40; 95% CI, 0.29-0.46) compared with the IHS classification (h(2) = 0.36; 95% CI, 0.22-0.42). Exploratory latent class modeling, fitting up to 10 classes, did not identify classes corresponding to either the IHS MO or MA classification. Our data indicate the existence of a continuum of severity, with MA more severe but not etiologically distinct from MO. In searching for predisposing genes, we should therefore expect to find some genes that may underlie all major recurrent headache subtypes, with modifying genetic or environmental factors that may lead to differential expression of the liability for migraine. (C) 2004 Wiley-Liss, Inc.
Abstract:
Familial typical migraine is a common, complex disorder that shows strong familial aggregation. Using latent-class analysis (LCA), we identified subgroups of people with migraine/severe headache in a community sample of 12,245 Australian twins (60% female), drawn from two cohorts of individuals aged 23-90 years who completed an interview based on International Headache Society criteria. We report results from genomewide linkage analyses involving 756 twin families containing a total of 790 independent sib pairs (130 affected concordant, 324 discordant, and 336 unaffected concordant for LCA-derived migraine). Quantitative-trait linkage analysis produced evidence of significant linkage on chromosome 5q21 and suggestive linkage on chromosomes 8, 10, and 13. In addition, we replicated previously reported typical-migraine susceptibility loci on chromosomes 6p12.2-p21.1 and 1q21-q23, the latter being within 3 cM of the rare autosomal dominant familial hemiplegic migraine gene (ATP1A2), a finding which potentially implicates ATP1A2 in familial typical migraine for the first time. Linkage analyses of individual migraine symptoms for our six most interesting chromosomes provide tantalizing hints of the phenotypic and genetic complexity of migraine. Specifically, the chromosome 1 locus is most associated with phonophobia; the chromosome 5 peak is predominantly associated with pulsating headache; the chromosome 6 locus is associated with activity-prohibiting headache and photophobia; the chromosome 8 locus is associated with nausea/vomiting and moderate/severe headache; the chromosome 10 peak is most associated with phonophobia and photophobia; and the chromosome 13 peak is completely due to association with photophobia. These results will prove to be invaluable in the design and analysis of future linkage and linkage disequilibrium studies of migraine.
Abstract:
It is often debated whether migraine with aura (MA) and migraine without aura (MO) are etiologically distinct disorders. A previous study using latent class analysis (LCA) in Australian twins showed no evidence for separate subtypes of MO and MA. The aim of the present study was to replicate these results in a population of Dutch twins and their parents, siblings and partners (N = 10,144). Latent class analysis of International Headache Society (IHS)-based migraine symptoms resulted in the identification of 4 classes: a class of unaffected subjects (class 0), a mild form of nonmigrainous headache (class 1), a moderately severe type of migraine (class 2), typically without neurological symptoms or aura (8% reporting aura symptoms), and a severe type of migraine (class 3), typically with neurological symptoms, and aura symptoms in approximately half of the cases. Given the overlap of neurological symptoms and nonmutual exclusivity of aura symptoms, these results do not support the MO and MA subtypes as being etiologically distinct. The heritability in female twins of migraine based on LCA classification was estimated at .50 (95% confidence interval [CI] .27-.59), similar to IHS-based migraine diagnosis (h(2) = .49, 95% CI .19-.57). However, using a dichotomous classification (affected-unaffected) decreased heritability for the IHS-based classification (h(2) = .33, 95% CI .00-.60), but not the LCA-based classification (h(2) = .51, 95% CI .23-.61). Importantly, use of the LCA-based classification increased the number of subjects classified as affected. The heritability of the screening question was similar to more detailed LCA and IHS classifications, suggesting that the screening procedure is an important determining factor in genetic studies of migraine.
Abstract:
Conflicting findings regarding the ability of people with schizophrenia to maintain and update semantic contexts have been due, arguably, to vagaries within the experimental design employed (e.g. whether strongly or remotely associated prime-target pairs have been used, what delay between the prime and the target was employed, and what proportion of related prime-target pairs appeared) or to characteristics of the participant cohort (e.g. medication status, chronicity of illness). The aim of the present study was to examine how people with schizophrenia maintain and update contextual information over an extended temporal window by using multiple primes that were either remotely associated or unrelated to the target. Fourteen participants with schizophrenia and 12 healthy matched controls were compared across two stimulus onset asynchronies (SOAs) (short and long) and two relatedness proportions (RP) (high and low) in a crossed design. Analysis of variance statistics revealed significant two- and three-way interactions between Group and SOA, Group and Condition, SOA and RP, and Group, SOA and RP. The participants with schizophrenia showed evidence of enhanced remote priming at the short SOA and low RP, combined with a reduction in the time course over which context could be maintained. There was some sensitivity to biasing contextual information at the short SOA, although the mechanism over which context served to update information appeared to be different from that in the controls. The participants with schizophrenia showed marked performance decrements at the long SOA (both low and high RP). Indices of remote priming at the short (but not the long) SOA correlated with both clinical ratings of thought disorder and with increasing length of illness. The results support and extend the hypothesis that schizophrenia is associated with concurrent increases in tonic dopamine activity and decreases in phasic dopamine activity. (C) 2004 Elsevier Ireland Ltd. 
All rights reserved.
Abstract:
The ability of two-dimensional gel electrophoresis (2-DE) to separate glycoproteins was exploited to separate distinct glycoforms of kappa-casein that differed only in the number of O-glycans that were attached. To determine where the glycans were attached, the individual glycoforms were digested in-gel with pepsin and the released glycopeptides were identified from characteristic sugar ions in the tandem mass spectrometry (MS) spectra. The O-glycosylation sites were identified by tandem MS after replacement of the glycans with ammonia/aminoethanethiol. The results showed that glycans were not randomly distributed among the five potential glycosylation sites in kappa-casein. Rather, glycosylation of the monoglycoform could only be detected at a single site, T-152. Similarly, the diglycoform appeared to be modified exclusively at T-152 and T-163, while the triglycoform was modified at T-152, T-163 and T-154. While low levels of glycosylation at other sites cannot be excluded, the hierarchy of site occupation between glycoforms was clearly evident and argues for an ordered addition of glycans to the protein. Since all five potential O-glycosylation sites can be glycosylated in vivo, it would appear that certain sites remain latent until other sites are occupied. The determination of glycosylation site occupancy in individual glycoforms separated by 2-DE revealed a distinct pattern of in vivo glycosylation that has not been recognized previously.