773 results for Traditional clustering
Abstract:
The explosive growth in the development of Traditional Chinese Medicine (TCM) has resulted in a continued increase in clinical and research data. The lack of standardised terminology, flaws in data quality planning and management of TCM informatics are preventing clinical decision-making, drug discovery and education. This paper argues that the introduction of data warehousing technologies to enhance the effectiveness and durability of TCM is paramount. To showcase the role of data warehousing in the improvement of TCM, this paper presents a practical model for data warehousing, with a detailed explanation, which is based on structured electronic records, for TCM clinical research and medical knowledge discovery.
Abstract:
Document clustering is one of the prominent methods for mining important information from the vast amount of data available on the web. However, document clustering generally suffers from the curse of dimensionality. Fortunately, in high-dimensional space, data points tend to be more concentrated in some areas of clusters. We take advantage of this phenomenon by introducing a novel concept of dynamic cluster representation called loci. Clusters’ loci are efficiently calculated using documents’ ranking scores generated from a search engine. We propose a fast loci-based semi-supervised document clustering algorithm that uses clusters’ loci instead of conventional centroids for assigning documents to clusters. Empirical analysis on real-world datasets shows that the proposed method produces cluster solutions with promising quality and is substantially faster than several benchmarked centroid-based semi-supervised document clustering methods.
Abstract:
This study investigated whether bystanders of traditional bullying and cyberbullying used face-to-face methods, online methods or both methods when reporting, discouraging and providing support to the victims of traditional bullying and cyberbullying. A questionnaire was completed by 348 high school students (Years 7 – 12) from seven independent schools in Australia. Overall, students predominantly utilized face-to-face methods when reporting to others for both types of bullying. Older students were more likely to use online methods to discourage the traditional bully (i.e., asking the bully to stop). Males and older students were more likely to use online methods to support victims of traditional bullying. Females were more likely to use face-to-face methods to support victims of cyberbullying. Implications for practice and future research are discussed.
Abstract:
In this paper, a multistage evolutionary scheme is proposed for clustering in a large database, such as speech data. This is achieved by clustering a small subset of the entire sample set in each stage and treating the cluster centroids so obtained as samples, together with another subset of samples not considered previously, as input data to the next stage. This is continued until the whole sample set is exhausted. The clustering is accomplished by constructing a fuzzy similarity matrix and using the fuzzy techniques proposed here. The technique is illustrated by an efficient scheme for voiced-unvoiced-silence classification of speech.
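The stage-by-stage flow described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the fuzzy similarity matrix of the paper is not reproduced, and a simple gap-based grouping of 1-D values stands in for it. The names `gap_cluster_centroids`, `multistage_centroids`, `stage_size` and `max_gap` are hypothetical.

```python
def gap_cluster_centroids(values, max_gap):
    """Stand-in clusterer (NOT the paper's fuzzy method): sort 1-D values
    and start a new group wherever the gap to the previous value exceeds
    max_gap. Returns the mean of each group."""
    ordered = sorted(values)
    groups = [[ordered[0]]]
    for v in ordered[1:]:
        if v - groups[-1][-1] <= max_gap:
            groups[-1].append(v)
        else:
            groups.append([v])
    return [sum(g) / len(g) for g in groups]


def multistage_centroids(samples, stage_size, max_gap):
    """Cluster one small subset per stage; the centroids found so far are
    fed back in as ordinary samples alongside the next unseen subset."""
    centroids = []
    for start in range(0, len(samples), stage_size):
        batch = samples[start:start + stage_size]
        centroids = gap_cluster_centroids(centroids + batch, max_gap)
    return centroids


samples = [1.0, 1.2, 9.0, 0.9, 9.3, 5.0, 5.1, 1.1, 8.8, 4.9]
centroids = multistage_centroids(samples, stage_size=3, max_gap=1.0)
```

Only one stage's subset plus the current centroids is held by the clusterer at any time, which is what makes this kind of scheme attractive for large sample sets.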
Abstract:
For clustered survival data, the traditional Gehan-type estimator is asymptotically equivalent to using only the between-cluster ranks, and the within-cluster ranks are ignored. The contribution of this paper is twofold: (i) incorporating within-cluster ranks in censored data analysis, and (ii) applying the induced smoothing of Brown and Wang (2005, Biometrika) for computational convenience. Asymptotic properties of the resulting estimating functions are given. We also carry out numerical studies to assess the performance of the proposed approach and conclude that it can lead to much improved estimators when strong clustering effects exist. A dataset from a litter-matched tumorigenesis experiment is used for illustration.
Abstract:
This thesis aimed to compare the effects of constraints-led and traditional coaching approaches on young cricket spin bowlers, with a specific research focus on increasing spin rates (i.e., revolutions per minute). Participants were 22 spin bowlers from either an Australian state youth squad or an academy in England. Results indicate that adopting a constraints-led approach can benefit younger, inexperienced bowlers, whilst a traditional approach may assist more skilled, older bowlers. The findings are discussed with regard to how they may inform the learning design of training programs by cricket coaches.
Abstract:
Reproductive rate is a major contributing factor to the profitability of a sheep meat enterprise. Low reproduction rate is a feature of sheep husbandry in semi-arid Queensland. High ambient temperatures are implicated in poor fertility (Moule 1970), where variation in response can be due to breed and to animals within a breed (Hopkins and Stephenson 1978). Breeds recently imported from South Africa were selected in arid environments and may be better adapted to pastoral conditions of northern Australia than traditional breeds. This study will investigate (a) the thermoregulatory ability of Damara, Dorper, Poll Dorset, Rambouillet, South African Meat Merino and Queensland medium wool Merino rams prior to joinings in the autumn and spring of 1999, 2000 and 2001 and (b) the association between thermoregulatory parameters (rectal temperature and respiration rate) and ewe fertility. Results for the initial autumn joining are reported in this paper. Animal production for a consuming world: proceedings of 9th Congress of the Asian-Australasian Association of Animal Production Societies [AAAP] and 23rd Biennial Conference of the Australian Society of Animal Production [ASAP] and 17th Annual Symposium of the University of Sydney, Dairy Research Foundation [DRF], 2-7 July 2000, Sydney, Australia.
Abstract:
This paper addresses the following predictive business process monitoring problem: given the execution trace of an ongoing case, and given a set of traces of historical (completed) cases, predict the most likely outcome of the ongoing case. In this context, a trace refers to a sequence of events with corresponding payloads, where a payload consists of a set of attribute-value pairs. An outcome refers to a label associated with completed cases, for example a label indicating that a given case completed “on time” (with respect to a given desired duration) or “late”, or a label indicating whether a given case led to a customer complaint. The paper tackles this problem via a two-phased approach. In the first phase, prefixes of historical cases are encoded using complex symbolic sequences and clustered. In the second phase, a classifier is built for each of the clusters. To predict the outcome of an ongoing case at runtime given its (uncompleted) trace, we select the closest cluster(s) to the trace in question and apply the respective classifier(s), taking into account the Euclidean distance of the trace from the center of the clusters. We consider two families of clustering algorithms (hierarchical clustering and k-medoids) and use random forests for classification. The approach was evaluated on four real-life datasets.
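The two-phase routing described in this abstract (cluster the historical prefixes, then apply the classifier of the closest cluster) might be sketched as below. Several parts are assumptions made for illustration and are not the paper's method: prefixes are assumed to be already encoded as fixed-length numeric tuples, the medoids are supplied by the caller instead of being found by k-medoids or hierarchical clustering, and a majority vote replaces the random forest. All function and variable names are hypothetical.

```python
from collections import Counter
from math import dist


def assign(point, medoids):
    """Index of the medoid closest to point (Euclidean distance)."""
    return min(range(len(medoids)), key=lambda i: dist(point, medoids[i]))


def train(encoded_prefixes, outcomes, medoids):
    """Phase 1: group historical prefixes by their closest medoid.
    Phase 2: build one 'classifier' per cluster; here it is simply the
    majority outcome, a stand-in for a random forest."""
    votes = [Counter() for _ in medoids]
    for x, y in zip(encoded_prefixes, outcomes):
        votes[assign(x, medoids)][y] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]


def predict(encoded_trace, medoids, classifiers):
    """Route an ongoing trace to the classifier of its closest cluster."""
    return classifiers[assign(encoded_trace, medoids)]


# Toy data: made-up 2-D encodings of historical prefixes and their outcomes.
prefixes = [(1, 0), (1, 1), (0, 1), (8, 9), (9, 9), (9, 8)]
outcomes = ["on time", "on time", "late", "late", "late", "on time"]
medoids = [(1, 1), (9, 9)]
classifiers = train(prefixes, outcomes, medoids)
```

Calling `predict((2, 1), medoids, classifiers)` routes the trace to the first cluster and returns that cluster's majority label.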
Abstract:
In this paper, the notion of conceptual cohesiveness is made precise and used to group objects semantically, based on a knowledge structure called a ‘cohesion forest’. A set of axioms is proposed which should be satisfied to make the generated clusters meaningful.
Abstract:
Sago starch is an important dietary carbohydrate in lowland Papua New Guinea (PNG). An investigation was conducted to determine whether microbes play a role in its preservation using traditional methods. In 12 stored sago samples collected from PNG villages, lactic acid bacteria (LAB) were present (≥ 3.6 × 10^4 cfu/g) and pH ranged from 6.8 to 4.2. Acetic and propionic acids were detected in all samples, while butyric, lactic and valeric acids were present in six or more. In freshly prepared sago, held in sealed containers in the laboratory at 30 °C, spontaneous fermentation by endogenous microflora of sago starch was observed. This was evidenced by increasing concentrations of acetic, butyric and lactic acids over 4 weeks, and by a fall in pH from 4.9 to 3.1; both LAB and yeasts were involved. Survival of potential bacterial pathogens was monitored by seeding sago starch with approximately 10^4/g of selected organisms. Numbers of Bacillus cereus, Listeria monocytogenes and Staphylococcus aureus fell to <30/g within 7 days. Salmonella sp. was present only in low numbers after 7 days (<36/g), but Escherichia coli was still detectable after three weeks (>10^2/g). Fermentation appeared to increase the storability and safety of the product.
Abstract:
A computationally efficient agglomerative clustering algorithm based on multilevel theory is presented. Here, the data set is divided randomly into a number of partitions. The samples of each such partition are clustered separately using a hierarchical agglomerative clustering algorithm to form sub-clusters. These are merged at higher levels to get the final classification. This algorithm leads to the same classification as that of the hierarchical agglomerative clustering algorithm when the clusters are well separated. The advantages of this algorithm are short run time and small storage requirement. It is observed that the savings in storage space and computation time increase nonlinearly with the sample size.
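The partition, cluster and merge steps described in this abstract can be sketched as follows. This is a minimal two-level illustration, assuming centroid linkage and a fixed target of `k` clusters at each level; the abstract does not state the linkage or stopping rule, so these choices and all names are hypothetical.

```python
import random
from math import dist


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def agglomerate(clusters, k):
    """Repeatedly merge the two clusters with the closest centroids
    until only k clusters remain (centroid linkage, an assumption)."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


def multilevel_cluster(points, k, n_partitions, seed=0):
    """Level 1: cluster each random partition separately into sub-clusters.
    Level 2: merge all sub-clusters into the final k clusters."""
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    partitions = [shuffled[i::n_partitions] for i in range(n_partitions)]
    sub_clusters = []
    for part in partitions:
        if part:
            sub_clusters.extend(
                agglomerate([[p] for p in part], min(k, len(part))))
    return agglomerate(sub_clusters, k)


blob_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (0.2, 0.8)]
blob_b = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0), (10.5, 10.5), (10.2, 10.8)]
result = multilevel_cluster(blob_a + blob_b, k=2, n_partitions=3)
```

Each level-1 call compares only the points of one partition, so the pairwise distance work per call is much smaller than clustering the full data set at once; for well-separated groups such as the two blobs above, the final grouping matches what a single agglomerative pass would give.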
Abstract:
Mathematical models describing the movement of multiple interacting subpopulations are relevant to many biological and ecological processes. Standard mean-field partial differential equation descriptions of these processes suffer from the limitation that they implicitly neglect to incorporate the impact of spatial correlations and clustering. To overcome this, we derive a moment dynamics description of a discrete stochastic process which describes the spreading of distinct interacting subpopulations. In particular, we motivate our model by mimicking the geometry of two typical cell biology experiments. Comparing the performance of the moment dynamics model with a traditional mean-field model confirms that the moment dynamics approach always outperforms the traditional mean-field approach. To provide more general insight we summarise the performance of the moment dynamics model and the traditional mean-field model over a wide range of parameter regimes. These results help distinguish between those situations where spatial correlation effects are sufficiently strong, such that a moment dynamics model is required, from other situations where spatial correlation effects are sufficiently weak, such that a traditional mean-field model is adequate.