827 results for clustering accuracy
Abstract:
Existing empirical evidence has frequently observed that professional forecasters are conservative and display herding behaviour. Whilst a large number of papers have considered equities as well as macroeconomic series, few have considered the accuracy of forecasts in alternative asset classes such as real estate. We consider the accuracy of forecasts for the UK commercial real estate market over the period 1999-2011. The results illustrate that forecasters display a tendency to under-estimate growth rates during strong market conditions and over-estimate when the market is performing poorly. This conservatism not only results in smoothed estimates but also implies that forecasters display herding behaviour. There is also a marked difference in the relative accuracy of capital and total returns versus rental figures. Whilst rental growth forecasts are relatively accurate, considerable inaccuracy is observed with respect to capital value and total returns.
Abstract:
Objective: Psychological problems should be identified in breast cancer patients proactively if doctors and nurses are to help them cope with the challenges imposed by their illness. Screening is one possible way to identify emotional problems proactively. Self-report questionnaires can be useful alternatives to carrying out psychiatric interviews during screening, because interviewing a large number of patients can be impractical due to limited resources. Two such measures are the Hospital Anxiety and Depression Scale (HADS) and the General Health Questionnaire-12 (GHQ-12). Method: The present study aimed to compare the performance of the GHQ-12, and the HADS Unitary Scale and its subscales to that of the Schedule for Affective Disorders and Schizophrenia (SADS) in identifying patients with affective disorders, including DSM major depression and generalized anxiety disorder. The sample consisted of 296 female breast cancer patients who underwent surgery for breast cancer a year previously. Results: A small number of patients (11%) were identified as having DSM major depression or generalized anxiety disorder based on SADS score. The findings indicate that the optimal thresholds in detecting generalized anxiety disorder and DSM major depression with the HADS anxiety and depression subscales were ≥ 8 and ≥ 7, with 93.3% and 77.3% sensitivity, respectively, and 77.9% and 87.1% specificity, respectively. They also had a 21% and 36% positive predictive value, respectively. Using the HADS Unitary Scale the optimal threshold for detecting affective disorders was ≥ 12, with 88.9% sensitivity, 80.7% specificity, and a 35% positive predictive value. In detecting affective disorders, the optimal threshold on the GHQ-12 was ≥ 2, with 77.8% sensitivity and 70.2% specificity. This scale also had a 24% positive predictive value. 
In detecting generalized anxiety disorder and DSM major depression, the optimal thresholds on the GHQ-12 were ≥ 2 and ≥ 4 with 73.3% and 77.3% sensitivity, respectively, and 67.5% and 82% specificity, respectively. The scale also had 12% and 29% positive predictive values, respectively. Conclusion: The HADS Unitary Scale and its subscales were effective in identifying affective disorders. They can be used as screening measures in breast cancer patients. The GHQ-12 was less accurate in detecting affective disorders than the HADS, but it can also be used as a screening instrument to detect affective disorders, generalized anxiety disorder, and DSM major depression.
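The threshold-based figures reported above can be illustrated with a minimal sketch. It assumes a questionnaire score counts as a positive screen when it is greater than or equal to the threshold, and it uses invented toy data, not data from the study. The function name `metrics_at_threshold` is hypothetical.

```python
from typing import Dict, Sequence


def metrics_at_threshold(
    scores: Sequence[int], has_disorder: Sequence[bool], threshold: int
) -> Dict[str, float]:
    """Sensitivity, specificity, and PPV when `score >= threshold` flags a patient."""
    tp = fp = fn = tn = 0
    for score, positive in zip(scores, has_disorder):
        flagged = score >= threshold
        if flagged and positive:
            tp += 1  # correctly flagged case
        elif flagged:
            fp += 1  # flagged, but no disorder on the reference interview
        elif positive:
            fn += 1  # missed case
        else:
            tn += 1  # correctly not flagged
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Toy data for illustration only (not from the study).
scores = [3, 9, 12, 5, 8, 14, 2, 7, 10, 6]
has_disorder = [False, True, True, False, False, True, False, False, True, False]
result = metrics_at_threshold(scores, has_disorder, threshold=8)
```

Sweeping `threshold` over the score range and comparing the resulting sensitivity and specificity pairs is one common way to pick a cut-off. The abstract does not state which criterion the authors used.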
Abstract:
ICT clusters have attracted much attention because of their rapid growth and their value for other economic activities. Using a nested multi-level model, we examine how conditions at the country level and at the city level affect ICT clustering activity in 227 cities across 22 European countries. We test for the influence of three country regulations (starting a business, registering property, enforcing contracts) and two city conditions (proximity to university, network density) on ICT clustering. We consider heterogeneity within the sector and study two types of ICT activities: ICT product firms and ICT content firms. Our results indicate that country conditions and city conditions each have idiosyncratic implications for ICT clustering, and further, that these can vary by activities in ICT products or ICT content manufacturing.
Abstract:
In this article, along with others, we take the position that the Null-Subject Parameter (NSP) (Chomsky 1981; Rizzi 1982) cluster of properties is narrower in scope than some originally contended. We test for the resetting of the NSP by English L2 learners of Spanish at the intermediate level, including poverty-of-the-stimulus knowledge of the Overt Pronoun Constraint (Montalbetti 1984). Our participants are tested before and after five months' residency in Spain in an effort to see if increased amounts of native exposure are particularly beneficial for parameter resetting. Although we demonstrate NSP resetting for some of the L2 learners, our data essentially demonstrate that even with the advent of time/exposure to native input, there is no immediate gainful effect for NSP resetting.
Abstract:
The interaction of C-type lectin receptor 2 (CLEC-2) on platelets with Podoplanin on lymphatic endothelial cells initiates platelet signaling events that are necessary for prevention of blood-lymph mixing during development. In the present study, we show that CLEC-2 signaling via Src family and Syk tyrosine kinases promotes platelet adhesion to primary mouse lymphatic endothelial cells at low shear. Using supported lipid bilayers containing mobile Podoplanin, we further show that activation of Src and Syk in platelets promotes clustering of CLEC-2 and Podoplanin. Clusters of CLEC-2-bound Podoplanin migrate rapidly to the center of the platelet to form a single structure. Fluorescence lifetime imaging demonstrates that molecules within these clusters are within 10 nm of one another and that the clusters are disrupted by inhibition of Src and Syk family kinases. CLEC-2 clusters are also seen in platelets adhered to immobilized Podoplanin using direct stochastic optical reconstruction microscopy. These findings provide mechanistic insight by which CLEC-2 signaling promotes adhesion to Podoplanin and regulation of Podoplanin signaling, thereby contributing to lymphatic vasculature development.
Abstract:
Clustering methods are increasingly being applied to residential smart meter data, providing a number of important opportunities for distribution network operators (DNOs) to manage and plan the low voltage networks. Clustering has a number of potential advantages for DNOs, including identifying suitable candidates for demand response and improving energy profile modelling. However, due to the high stochasticity and irregularity of household-level demand, detailed analytics are required to define appropriate attributes to cluster. In this paper we present an in-depth analysis of customer smart meter data to better understand peak demand and major sources of variability in customer behaviour. We find four key time periods in which the data should be analysed and use these to form relevant attributes for our clustering. We present a finite mixture model based clustering in which we discover 10 distinct behaviour groups describing customers based on their demand and their variability. Finally, using an existing bootstrapping technique, we show that the clustering is reliable. To the authors' knowledge, this is the first time in the power systems literature that the sample robustness of the clustering has been tested.
Abstract:
Genome-wide association studies (GWAS) have been widely used in genetic dissection of complex traits. However, common methods are all based on a fixed-SNP-effect mixed linear model (MLM) and single marker analysis, such as efficient mixed model analysis (EMMA). These methods require Bonferroni correction for multiple tests, which is often too conservative when the number of markers is extremely large. To address this concern, we proposed a random-SNP-effect MLM (RMLM) and a multi-locus RMLM (MRMLM) for GWAS. The RMLM simply treats the SNP-effect as random, but it allows a modified Bonferroni correction to be used to calculate the threshold p value for significance tests. The MRMLM is a multi-locus model including markers selected from the RMLM method with a less stringent selection criterion. Due to the multi-locus nature, no multiple test correction is needed. Simulation studies show that the MRMLM is more powerful in QTN detection and more accurate in QTN effect estimation than the RMLM, which in turn is more powerful and accurate than the EMMA. To demonstrate the new methods, we analyzed six flowering time related traits in Arabidopsis thaliana and detected more genes than previously reported using the EMMA. Therefore, the MRMLM provides an alternative for multi-locus GWAS.
Abstract:
Human Body Thermoregulation Models have been widely used in human physiology and thermal comfort studies. However, there are few studies on evaluation methods for these models. This paper summarises the existing evaluation methods and critically analyses their flaws. Based on that, a method for evaluating the accuracy of Human Body Thermoregulation Models is proposed. The new evaluation method contributes to the development of Human Body Thermoregulation Models and validates their accuracy both statistically and empirically. The accuracy of different models can be compared using the new method. Furthermore, the new method is not only suitable for the evaluation of Human Body Thermoregulation Models, but can in theory also be applied to evaluating the accuracy of population-based models in other research fields.
Abstract:
Bloom filters are a data structure for storing data in a compressed form. They offer excellent space and time efficiency at the cost of some loss of accuracy (so-called lossy compression). This work presents a yes-no Bloom filter, which is a data structure consisting of two parts: the yes-filter, which is a standard Bloom filter, and the no-filter, which is another Bloom filter whose purpose is to represent those objects that were recognised incorrectly by the yes-filter (that is, to recognise the false positives of the yes-filter). By querying the no-filter after an object has been recognised by the yes-filter, we get a chance of rejecting it, which improves the accuracy of data recognition in comparison with the standard Bloom filter of the same total length. A further increase in accuracy is possible if one chooses objects to include in the no-filter so that the no-filter recognises as many false positives as possible but no true positives, thus producing the most accurate yes-no Bloom filter among all yes-no Bloom filters. This paper studies how optimization techniques can be used to maximize the number of false positives recognised by the no-filter, with the constraint being that it should recognise no true positives. To achieve this aim, an Integer Linear Program (ILP) is proposed for the optimal selection of false positives. In practice, the problem size is normally large, which makes finding the optimal solution intractable. Considering the similarity of the ILP with the Multidimensional Knapsack Problem, an Approximate Dynamic Programming (ADP) model is developed, making use of a reduced ILP for the value function approximation. Numerical results show that the ADP model performs best compared with a number of heuristics as well as the CPLEX built-in solver (B&B), and it is therefore recommended for use in yes-no Bloom filters.
In a wider context of the study of lossy compression algorithms, our research is an example showing how the arsenal of optimization methods can be applied to improving the accuracy of compressed data.
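The yes-no structure described in this abstract can be sketched as follows. This is an illustrative sketch only: the class names, the SHA-256-based hashing, and the filter sizes are assumptions, and the selection of false positives uses a simple greedy rule rather than the ILP or ADP models the paper proposes. The greedy rule stores a known false positive in the no-filter only if doing so does not cause any true member to be rejected.

```python
import hashlib
from typing import Iterable, Iterator, List


class BloomFilter:
    """Standard Bloom filter over strings (hashing scheme is an assumption)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def positions(self, item: str) -> Iterator[int]:
        # Derive one bit position per seeded hash.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self.positions(item):
            self.bits[pos] = True

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos] for pos in self.positions(item))


class YesNoBloomFilter:
    """Accept an item if the yes-filter recognises it and the no-filter does not."""

    def __init__(self, yes_bits: int, no_bits: int, num_hashes: int) -> None:
        self.yes = BloomFilter(yes_bits, num_hashes)
        self.no = BloomFilter(no_bits, num_hashes)

    def add(self, item: str) -> None:
        self.yes.add(item)

    def add_false_positive(self, item: str, members: Iterable[str]) -> bool:
        """Greedy stand-in for the paper's optimisation: keep the item in the
        no-filter only if no true member becomes recognised by the no-filter."""
        positions = list(self.no.positions(item))
        previous = [self.no.bits[pos] for pos in positions]
        for pos in positions:
            self.no.bits[pos] = True
        if any(member in self.no for member in members):
            for pos, old in zip(positions, previous):
                self.no.bits[pos] = old  # undo: a true positive would be rejected
            return False
        return True

    def __contains__(self, item: str) -> bool:
        return item in self.yes and item not in self.no


# Illustrative usage with made-up keys and sizes.
members = [f"member-{i}" for i in range(20)]
f = YesNoBloomFilter(yes_bits=64, no_bits=64, num_hashes=3)
for m in members:
    f.add(m)

probes = [f"probe-{i}" for i in range(1000)]
false_positives: List[str] = [p for p in probes if p in f.yes]
accepted = [p for p in false_positives if f.add_false_positive(p, members)]
```

After this runs, every entry in `accepted` is rejected by the combined filter while every true member is still recognised. How many false positives the greedy rule manages to store depends on insertion order, which is the kind of choice the paper's optimisation models address.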
Abstract:
With the fast development of wireless communications, ZigBee and semiconductor devices, home automation networks have recently become very popular. Since typical consumer products deployed in home automation networks are often powered by tiny and limited batteries, one of the most challenging research issues is reducing energy use and balancing energy consumption across the network in order to prolong the home network lifetime for consumer devices. The introduction of clustering and sink mobility techniques into home automation networks has been shown to be an efficient way to improve network performance and has received significant research attention. Taking inspiration from nature, this paper proposes an Ant Colony Optimization (ACO) based clustering algorithm with mobile sink support, designed specifically for home automation networks. In this work, the network is divided into several clusters and a cluster head is selected within each cluster. Then, a mobile sink communicates with each cluster head to collect data directly through short range communications. The ACO algorithm is used to find the optimal mobility trajectory for the mobile sink. Extensive simulation results show that, when mobile sinks are used, the proposed algorithm significantly improves home network performance in terms of energy consumption and network lifetime compared to other routing algorithms currently deployed for home automation networks.
Abstract:
Subspace clustering groups a set of samples from a union of several linear subspaces into clusters, so that the samples in the same cluster are drawn from the same linear subspace. In the majority of the existing work on subspace clustering, clusters are built based on feature information, while sample correlations in their original spatial structure are simply ignored. In addition, the original high-dimensional feature vector contains noisy or redundant information, and the time complexity grows exponentially with the number of dimensions. To address these issues, we propose a tensor low-rank representation (TLRR) and sparse coding-based (TLRRSC) subspace clustering method by simultaneously considering feature information and spatial structures. TLRR seeks the lowest rank representation over original spatial structures along all spatial directions. Sparse coding learns a dictionary along feature spaces, so that each sample can be represented by a few atoms of the learned dictionary. The affinity matrix used for spectral clustering is built from the joint similarities in both spatial and feature spaces. TLRRSC captures the global structure and inherent feature information of the data well, and provides a robust subspace segmentation from corrupted data. Experimental results on both synthetic and real-world data sets show that TLRRSC outperforms several established state-of-the-art methods.
Abstract:
Tensor clustering is an important tool that exploits intrinsically rich structures in real-world multiarray or tensor datasets. Often in dealing with those datasets, standard practice is to use subspace clustering that is based on vectorizing multiarray data. However, vectorization of tensorial data does not exploit complete structure information. In this paper, we propose a subspace clustering algorithm without adopting any vectorization process. Our approach is based on a novel heterogeneous Tucker decomposition model taking into account cluster membership information. We propose a new clustering algorithm that alternates between different modes of the proposed heterogeneous tensor model. All but the last mode have closed-form updates. Updating the last mode reduces to optimizing over the multinomial manifold, for which we investigate second order Riemannian geometry and propose a trust-region algorithm. Numerical experiments show that our proposed algorithm competes effectively with state-of-the-art clustering algorithms that are based on tensor factorization.
Abstract:
Introduction: The aim of this study was to evaluate the accuracy of two imaging methods in diagnosing apical periodontitis (AP) using histopathological findings as a gold standard. Methods: The periapex of 83 treated or untreated roots of dogs' teeth was examined using periapical radiography (PR), cone-beam computed tomography (CBCT) scans, and histology. Sensitivity, specificity, predictive values, and accuracy of PR and CBCT diagnosis were calculated. Results: PR detected AP in 71% of roots, a CBCT scan detected AP in 84%, and AP was histologically diagnosed in 93% (p = 0.001). Overall, sensitivity was 0.77 and 0.91 for PR and CBCT, respectively. Specificity was 1 for both. Negative predictive value was 0.25 and 0.46 for PR and CBCT, respectively. Positive predictive value was 1 for both. Diagnostic accuracy (true positives + true negatives) was 0.78 and 0.92 for PR and CBCT (p = 0.028), respectively. Conclusion: A CBCT scan was more sensitive in detecting AP compared with PR, which was more likely to miss AP when it was still present. (J Endod 2009;35:1009-1012)
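The relationship between the reported figures can be checked with a short sketch. It assumes the histological rate of 93% is the prevalence and works from the rounded sensitivity and specificity values in the abstract, so agreement with the reported predictive values is only approximate. The function name `predictive_values` is hypothetical.

```python
from typing import Dict


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Dict[str, float]:
    """Derive PPV, NPV, and accuracy as population fractions via Bayes' rule."""
    tp = sensitivity * prevalence              # diseased and detected
    fn = (1 - sensitivity) * prevalence        # diseased but missed
    tn = specificity * (1 - prevalence)        # healthy and cleared
    fp = (1 - specificity) * (1 - prevalence)  # healthy but flagged
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": tp + tn,
    }


# Rounded values from the abstract; prevalence taken as the histological rate.
pr = predictive_values(sensitivity=0.77, specificity=1.0, prevalence=0.93)
cbct = predictive_values(sensitivity=0.91, specificity=1.0, prevalence=0.93)
```

With so few healthy roots in the sample, even a modest share of missed cases outnumbers the true negatives, which is why the negative predictive values are low despite perfect specificity.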