919 results for label hierarchical clustering
Abstract:
Long-term measurements of particle number size distribution (PNSD) produce a very large number of observations, and their analysis requires an efficient approach in order to produce results in the least possible time and with maximum accuracy. Clustering techniques are a family of sophisticated methods which have recently been employed to analyse PNSD data; however, very little information is available comparing the performance of different clustering techniques on PNSD data. This study aims to apply several clustering techniques (i.e. K-means, PAM, CLARA and SOM) to PNSD data, in order to identify and apply the optimum technique to PNSD data measured at 25 sites across Brisbane, Australia. A new method, based on the Generalised Additive Model (GAM) with a basis of penalised B-splines, was proposed to parameterise the PNSD data, and the temporal weight of each cluster was also estimated using the GAM. In addition, each cluster was associated with its possible source based on the results of this parameterisation, together with the characteristics of each cluster. The performances of the four clustering techniques were compared using the Dunn index and Silhouette width validation values, and the K-means technique was found to have the highest performance, with five clusters being the optimum. Therefore, five clusters were found within the data using the K-means technique. The diurnal occurrence of each cluster was used together with other air quality parameters, temporal trends and the physical properties of each cluster, in order to attribute each cluster to its source and origin. The five clusters were attributed to three major sources and origins, including regional background particles, photochemically induced nucleated particles and vehicle generated particles. Overall, clustering was found to be an effective technique for attributing each particle size spectrum to its source, and the GAM was suitable to parameterise the PNSD data. These two techniques can help researchers immensely in analysing PNSD data for characterisation and source apportionment purposes.
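The two validation measures named in this abstract, the Dunn index and the silhouette width, can be illustrated with a minimal pure-Python sketch. It assumes Euclidean distance and small in-memory data. The function names (`group_by_label`, `dunn_index`, `mean_silhouette_width`) and the toy points are hypothetical and are not taken from the study, which does not state its implementation.

```python
import math


def group_by_label(points, labels):
    """Collect points into a dict keyed by cluster label."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest cluster diameter.

    Higher is better. Assumes at least two clusters and at least one
    cluster containing two distinct points.
    """
    clusters = group_by_label(points, labels)
    keys = list(clusters)
    min_between = min(
        math.dist(p, q)
        for i, ki in enumerate(keys)
        for kj in keys[i + 1:]
        for p in clusters[ki]
        for q in clusters[kj]
    )
    max_diameter = max(
        math.dist(p, q)
        for members in clusters.values()
        for p in members
        for q in members
    )
    return min_between / max_diameter


def mean_silhouette_width(points, labels):
    """Average of (b - a) / max(a, b) over all points.

    a = mean distance to the other members of the point's own cluster,
    b = smallest mean distance to the members of any other cluster.
    Points in singleton clusters contribute 0.
    """
    clusters = group_by_label(points, labels)
    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Toy data: two well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good_labels = [0, 0, 1, 1]   # groups nearby points together
bad_labels = [0, 1, 0, 1]    # splits nearby points apart
```

Comparing candidate partitions (for example from different algorithms or different numbers of clusters) by these two scores and keeping the best-scoring one is the general selection idea the abstract describes.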
Abstract:
A Bitcoin wallet is a set of private keys that are known to a user and that allow that user to spend any Bitcoin associated with those keys. In a hierarchical deterministic (HD) wallet, child private keys are generated pseudorandomly from a master private key, and the corresponding child public keys can be generated by anyone with knowledge of the master public key. These wallets have several interesting applications including Internet retail, trustless audit, and a treasurer allocating funds among departments. A specification of HD wallets has even been accepted as Bitcoin standard BIP32. Unfortunately, in all existing HD wallets, including BIP32 wallets, an attacker can easily recover the master private key given the master public key and any child private key. This vulnerability precludes use cases such as a combined treasurer-auditor, and some in the Bitcoin community have suspected that this vulnerability cannot be avoided. We propose a new HD wallet that is not subject to this vulnerability. Our HD wallet can tolerate the leakage of up to m private keys with a master public key size of O(m). We prove that breaking our HD wallet is at least as hard as the so-called "one more" discrete logarithm problem.
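The key-recovery weakness described in this abstract follows from the additive structure of non-hardened derivation: the child private key is the parent private key plus an offset that can be computed from public information. The sketch below is a toy model only. It assumes a small multiplicative group modulo a Mersenne prime instead of the secp256k1 curve used by BIP32, it assumes the chain code is part of the master public key material, and all names (`tweak`, `child_private`, `child_public`, `recover_master_private`) and parameter values are hypothetical.

```python
import hashlib
import hmac

# Toy group parameters (assumption: NOT secp256k1, chosen only for brevity).
P = 2**127 - 1   # a Mersenne prime; "public keys" live in the integers mod P
G = 3            # base element
N = P - 1        # exponents can be reduced mod P - 1 (Fermat's little theorem)


def public_from_private(private_key):
    """Toy stand-in for 'public key = private key times generator'."""
    return pow(G, private_key, P)


def tweak(chain_code, parent_pub, index):
    """Offset derived only from public data: chain code, parent public key, index."""
    data = parent_pub.to_bytes(16, "big") + index.to_bytes(4, "big")
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    return int.from_bytes(digest[:32], "big") % N


def child_private(parent_priv, chain_code, index):
    """Wallet owner: child private key = parent private key + offset."""
    parent_pub = public_from_private(parent_priv)
    return (parent_priv + tweak(chain_code, parent_pub, index)) % N


def child_public(parent_pub, chain_code, index):
    """Anyone holding the master public key can derive child public keys."""
    return (parent_pub * pow(G, tweak(chain_code, parent_pub, index), P)) % P


def recover_master_private(parent_pub, chain_code, index, leaked_child_priv):
    """Attacker: subtract the publicly computable offset from a leaked child key."""
    return (leaked_child_priv - tweak(chain_code, parent_pub, index)) % N


# Hypothetical example values.
master_priv = 0x1234567890ABCDEF1234567890ABCDEF % N
chain_code = b"example chain code"
master_pub = public_from_private(master_priv)
leaked = child_private(master_priv, chain_code, 7)
recovered = recover_master_private(master_pub, chain_code, 7, leaked)
```

In this toy model `recovered` equals `master_priv`, which is the failure mode the abstract attributes to existing HD wallets. The sketch does not implement the leakage-tolerant construction that the paper proposes.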
Abstract:
Existing techniques for automated discovery of process models from event logs generally produce flat process models. Thus, they fail to exploit the notion of subprocess as well as error handling and repetition constructs provided by contemporary process modeling notations, such as the Business Process Model and Notation (BPMN). This paper presents a technique for automated discovery of hierarchical BPMN models containing interrupting and non-interrupting boundary events and activity markers. The technique employs functional and inclusion dependency discovery techniques in order to elicit a process-subprocess hierarchy from the event log. Given this hierarchy and the projected logs associated with each node in the hierarchy, parent process and subprocess models are then discovered using existing techniques for flat process model discovery. Finally, the resulting models and logs are heuristically analyzed in order to identify boundary events and markers. By employing approximate dependency discovery techniques, it is possible to filter out noise in the event log arising, for example, from data entry errors or missing events. A validation with one synthetic and two real-life logs shows that process models derived by the proposed technique are more accurate and less complex than those derived with flat process discovery techniques. Meanwhile, a validation on a family of synthetically generated logs shows that the technique is resilient to varying levels of noise.
Abstract:
We report a rapid and ultra-sensitive detection system for 2,4,6-trinitrotoluene (TNT) using unmodified gold nanoparticles and surface-enhanced Raman spectroscopy (SERS). First, a Meisenheimer complex is formed in aqueous solution between TNT and cysteamine in less than 15 min of mixing. The complex formation is confirmed by the development of a pink colour and a new UV–vis absorption band around 520 nm. Second, the Meisenheimer complex spontaneously self-assembles onto unmodified gold nanoparticles through a stable Au–S bond between the cysteamine moiety and the gold surface. The resulting monolayer of cysteamine-TNT is then screened by SERS to detect and quantify TNT. Our experimental results demonstrate that the SERS-based assay provides an ultra-sensitive approach for the detection of TNT down to 22.7 ng/L. The unambiguous fingerprint identification of TNT by SERS represents a key advantage of our proposed method. The new method provides high selectivity towards TNT over 2,4-DNT and picric acid. Therefore, it satisfies the practical requirements for the rapid screening of TNT in real-life samples, where the interim 24-h average allowable concentration of TNT in waste water is 0.04 mg/L.
Abstract:
Purpose: This paper aims to set out a new hierarchical and differentiated model of social marketing principles, concepts and techniques that builds on, but supersedes, the existing lists of non-equivalent and undifferentiated benchmark criteria. Design/methodology/approach: This is a conceptual paper that proposes a hierarchical model of social marketing principles, concepts and techniques. Findings: This new delineation of the social marketing principle, its four core concepts and five techniques, represents a new way to conceptualize and recognize the different elements that constitute social marketing. This new model will help add to and further the development of the theoretical basis of social marketing, building on the definitional work led by the International Social Marketing Association (iSMA), Australian Association of Social Marketing (AASM) and European Social Marketing Association (ESMA). Research limitations/implications: This proposed model offers a foundation for future research to expand upon. Further research is recommended to empirically test the proposed model. Originality/value: This paper seeks to advance the theoretical base of social marketing by making a reasoned case for the need to differentiate between principles, concepts and techniques when seeking to describe social marketing.
Abstract:
The proliferation of the web presents an unsolved problem of automatically analyzing billions of pages of natural language. We introduce a scalable algorithm that clusters hundreds of millions of web pages into hundreds of thousands of clusters. It does this on a single mid-range machine using efficient algorithms and compressed document representations. It is applied to two web-scale crawls covering tens of terabytes. ClueWeb09 and ClueWeb12 contain 500 and 733 million web pages and were clustered into 500,000 to 700,000 clusters. To the best of our knowledge, such fine-grained clustering has not been previously demonstrated. Previous approaches clustered a sample, which limits the maximum number of discoverable clusters. The proposed EM-tree algorithm uses the entire collection in clustering and produces several orders of magnitude more clusters than the existing algorithms. Fine-grained clustering is necessary for meaningful clustering in massive collections where the number of distinct topics grows linearly with collection size. These fine-grained clusters show an improved cluster quality when assessed with two novel evaluations using ad hoc search relevance judgments and spam classifications for external validation. These evaluations solve the problem of assessing the quality of clusters where categorical labeling is unavailable and infeasible.
Abstract:
Erythropoietin (EPO), a glycoprotein hormone of ∼34 kDa, is an important hematopoietic growth factor that is mainly produced in the kidney and controls the number of red blood cells circulating in the blood stream. Sensitive and rapid recombinant human EPO (rHuEPO) detection tools that improve on the current laborious EPO detection techniques are in high demand in both the clinical and sports industries. A sensitive aptamer-functionalized biosensor (aptasensor) has been developed by controlled growth of gold nanostructures (AuNS) over a gold substrate (pAu/AuNS). The aptasensor selectively binds to rHuEPO and, therefore, was used to extract and detect the drug from horse plasma by surface enhanced Raman spectroscopy (SERS). Owing to the nanogap separation between the nanostructures and the high population and distribution of hot spots on the pAu/AuNS substrate surface, strong signal enhancement was acquired. By using a wide area illumination (WAI) setting for the Raman detection, a low RSD of 4.92% over 150 SERS measurements was achieved. The significant reproducibility of the new biosensor addresses the serious problem of SERS signal inconsistency that hampers the use of the technique in the field. The WAI setting is compatible with handheld Raman devices. Therefore, the new aptasensor can be used for the selective extraction of rHuEPO from biological fluids, with the extract subsequently screened with a handheld Raman spectrometer for SERS-based in-field protein detection.
Abstract:
We propose a novel technique for conducting robust voice activity detection (VAD) in high-noise recordings. We use Gaussian mixture modeling (GMM) to train two generic models: speech and non-speech. We then score smaller segments of a given (unseen) recording against each of these GMMs to obtain two respective likelihood scores for each segment. These scores are used to compute a dissimilarity measure between pairs of segments and to carry out complete-linkage clustering of the segments into speech and non-speech clusters. We compare the accuracy of our method against state-of-the-art and standardised VAD techniques to demonstrate an absolute improvement of 15% in half-total error rate (HTER) over the best-performing baseline system and across the QUT-NOISE-TIMIT database. We then apply our approach to the Audio-Visual Database of American English (AVDBAE) to demonstrate the performance of our algorithm in using visual, audio-visual or a proposed fusion of these features.
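The clustering step in this abstract can be sketched as follows, assuming each segment has already been scored against the two GMMs. The abstract does not specify the dissimilarity measure, so this sketch assumes plain Euclidean distance between the (speech, non-speech) log-likelihood pairs. The rule for naming the speech cluster, the function names and the example scores are likewise hypothetical.

```python
import math


def complete_linkage(points, n_clusters=2):
    """Agglomerative clustering with complete linkage.

    Starts with one cluster per point and repeatedly merges the two
    clusters whose farthest members are closest, until n_clusters remain.
    Returns a list of clusters, each a list of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def label_speech(scores, clusters):
    """Mark as speech the cluster with the larger mean score margin.

    Assumption: margin = speech log-likelihood minus non-speech
    log-likelihood, averaged over the cluster's segments.
    """
    def mean_margin(cluster):
        return sum(scores[i][0] - scores[i][1] for i in cluster) / len(cluster)

    speech_cluster = set(max(clusters, key=mean_margin))
    return [i in speech_cluster for i in range(len(scores))]


# Hypothetical (speech, non-speech) log-likelihood scores for five segments.
scores = [(-10, -30), (-12, -28), (-31, -11), (-29, -9), (-11, -29)]
clusters = complete_linkage(scores, n_clusters=2)
is_speech = label_speech(scores, clusters)
```

This naive version recomputes linkages at every merge, which is adequate for a short illustration but would be slow on long recordings.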
Abstract:
In the United States, there has been fierce debate over state, federal and international efforts to engage in genetically modified food labelling (GM food labelling). A grassroots coalition of consumers, environmentalists, organic farmers, and the food movement has pushed for law reform in respect of GM food labelling. The Just Label It campaign has encouraged United States consumers to send comments to the United States Food and Drug Administration to label genetically modified foods. This Chapter explores the various justifications made in respect of genetically modified food labelling. There has been a considerable effort to portray the issue of GM food labelling as one of consumer rights as part of ‘the right to know’. There has been a significant battle amongst farmers over GM food labelling – with organic farmers and biotechnology companies fighting for precedence. There has also been a significant discussion about the use of GM food labelling as a form of environmental legislation. The prescriptions in GM food labelling regulations may serve to promote eco-labelling, and deter greenwashing. There has been a significant debate over whether GM food labelling may serve to regulate corporations – particularly from the food, agriculture, and biotechnology industries. There are significant issues about the interaction between intellectual property laws – particularly in respect of trade mark law and consumer protection – and regulatory proposals focused upon biotechnology. There has been a lack of international harmonization in respect of GM food labelling. As such, there has been a major use of comparative arguments about regulatory models in respect of food labelling. There has also been a discussion about international law, particularly with the emergence of sweeping regional trade proposals, such as the Trans-Pacific Partnership, and the Trans-Atlantic Trade and Investment Partnership.
This Chapter considers the United States debates over genetically modified food labelling – at state, federal, and international levels. The battles often involved the use of citizen-initiated referenda. The policy conflicts have been policy-centric disputes – pitting organic farmers, consumers, and environmentalists against the food industry and biotechnology industry. Such battles have raised questions about consumer rights, public health, freedom of speech, and corporate rights. The disputes highlighted larger issues about lobbying, fund-raising, and political influence. The role of money in United States politics has been a prominent concern of Lawrence Lessig in his recent academic and policy work with the group, Rootstrikers. Part 1 considers the debate in California over Proposition 37. Part 2 explores other key state initiatives in respect of GM food labelling. Part 3 examines the Federal debate in the United States over GM food labelling. Part 4 explores whether regional trade agreements – such as the Trans-Pacific Partnership (TPP) and the Trans-Atlantic Trade and Investment Partnership (TTIP) – will impact upon
Abstract:
Imaging genetics aims to discover how variants in the human genome influence brain measures derived from images. Genome-wide association scans (GWAS) can screen the genome for common differences in our DNA that relate to brain measures. In small samples, GWAS has low power as individual gene effects are weak and one must also correct for multiple comparisons across the genome and the image. Here we extend recent work on genetic clustering of images, to analyze surface-based models of anatomy using GWAS. We performed spherical harmonic analysis of hippocampal surfaces, automatically extracted from brain MRI scans of 1254 subjects. We clustered hippocampal surface regions with common genetic influences by examining genetic correlations (r(g)) between the normalized deformation values at all pairs of surface points. Using genetic correlations to cluster surface measures, we were able to boost effect sizes for genetic associations, compared to clustering with traditional phenotypic correlations using Pearson's r.
Labeling white matter tracts in HARDI by fusing multiple tract atlases with applications to genetics
Abstract:
Accurate identification of white matter structures and segmentation of fibers into tracts is important in neuroimaging and has many potential applications. Even so, it is not trivial because whole brain tractography generates hundreds of thousands of streamlines that include many false positive fibers. We developed and tested an automatic tract labeling algorithm to segment anatomically meaningful tracts from diffusion weighted images. Our multi-atlas method incorporates information from multiple hand-labeled fiber tract atlases. In validations, we showed that the method outperformed the standard ROI-based labeling using a deformable, parcellated atlas. Finally, we show a high-throughput application of the method to genetic population studies. We use the sub-voxel diffusion information from fibers in the clustered tracts based on 105-gradient HARDI scans of 86 young normal twins. The whole workflow shows promise for larger population studies in the future.
Abstract:
We introduce a framework for population analysis of white matter tracts based on diffusion-weighted images of the brain. The framework enables extraction of fibers from high angular resolution diffusion images (HARDI); clustering of the fibers based partly on prior knowledge from an atlas; representation of the fiber bundles compactly using a path following points of highest density (maximum density path; MDP); and registration of these paths together using geodesic curve matching to find local correspondences across a population. We demonstrate our method on 4-Tesla HARDI scans from 565 young adults to compute localized statistics across 50 white matter tracts based on fractional anisotropy (FA). Experimental results show increased sensitivity in the determination of genetic influences on principal fiber tracts compared to the tract-based spatial statistics (TBSS) method. Our results show that the MDP representation reveals important parts of the white matter structure and considerably reduces the dimensionality over comparable fiber matching approaches.
Abstract:
Developing nano/micro-structures which can effectively upgrade the intriguing properties of electrode materials for energy storage devices is always a key research topic. Ultrathin nanosheets have proved to be one of the most promising nanostructures due to their high specific surface area, good active contact areas and porous channels. Herein, we report a unique hierarchical micro-spherical morphology of well-stacked and completely miscible molybdenum disulfide (MoS2) nanosheets and graphene sheets, successfully synthesized via a simple, industrial-scale spray-drying technique to take advantage of both MoS2 and graphene in terms of their high practical capacity values and high electronic conductivity, respectively. Computational studies were performed to understand the interfacial behaviour of MoS2 and graphene, and they indicate high stability of the composite, with a high interfacial binding energy (−2.02 eV) between the two. Further, the lithium and sodium storage properties have been tested and reveal excellent cyclic stability over 250 and 500 cycles, respectively, with the highest initial capacity values of 1300 mAh g−1 and 640 mAh g−1 at 0.1 A g−1.
Abstract:
This article considers the ongoing debate over the appropriation of well-known and famous trade marks by the No Logo Movement for the purposes of political and social critique. It focuses upon one sensational piece of litigation in South Africa, Laugh It Off Promotions v. South African Breweries International (Finance) B.V. t/a Sabmark International. In this case, a group called Laugh It Off Promotions subjected the trade marks of the manufacturers of Carling Beer to parody, social satire, and culture jamming. The beer slogan “Black Label” was turned into a T-Shirt entitled “Black Labour/ White Guilt”. In the ensuing litigation, the High Court of South Africa and the Supreme Court of Appeal were of the opinion that the appropriation of the mark was a case of hate speech. However, the Constitutional Court of South Africa disagreed, finding that the parodies of a well-known, famous trade mark did not constitute trade mark dilution. Moseneke J observed that there was a lack of evidence of economic or material harm; and Sachs J held that there is a need to provide latitude for parody, laughter, and freedom of expression. The decision of the Constitutional Court of South Africa provides some important insights into the nature of trade mark dilution, the role of parody and satire, and the relevance of constitutional protections of freedom of speech and freedom of expression. Arguably, the ruling will be of help in the reformation of trade mark dilution law in other jurisdictions – such as the United States. The decision in Laugh It Off Promotions v. South African Breweries International demonstrates that trade mark law should not be immune from careful constitutional scrutiny.