933 results for Hierarchical clustering


Relevance:

10.00%

Abstract:

The commercialization of Chinese media has taken place over the past two decades; it has become a significant force since 2001 when China joined the World Trade Organisation. With demand for original content increasing and China contemplating a cultural trade deficit in media content, there is much discussion of agglomeration and clustering. Beijing, as the national media centre of China, witnesses a process of media agglomeration while bearing the problem of cultural export during the media commercialization. Michael Curtin's idea of media capital, which absorbs media resources and personnel and exports media products transnationally, provides a dynamic perspective for understanding media agglomeration and dispersion under different political, social and cultural circumstances. Hence the question of whether Beijing is going to transform into a transnational media capital is worth studying, in order to observe and comprehend China's media industry in transition. Drawing on Michael Curtin's three media capital trajectories, the paper interprets tensions and challenges generated in the process of media industry agglomeration and growth in Beijing. Emphasis is placed on the third trajectory, socio-cultural variation.

Relevance:

10.00%

Abstract:

This technical report describes the methods used to obtain a list of acoustic indices that are used to characterise the structure and distribution of acoustic energy in recordings of the natural environment. In particular it describes methods for noise reduction from recordings of the environment and a fast clustering algorithm used to estimate the spectral richness of long recordings.

Relevance:

10.00%

Abstract:

Approximate clone detection is the process of identifying similar process fragments in business process model collections. The tool presented in this paper can efficiently cluster approximate clones in large process model repositories. Once a repository is clustered, users can filter and browse the clusters using different filtering parameters. Our tool can also visualize clusters in the 2D space, allowing a better understanding of clusters and their member fragments. This demonstration will be useful for researchers and practitioners working on large process model repositories, where process standardization is a critical task for increasing the consistency and reducing the complexity of the repository.

Relevance:

10.00%

Abstract:

This paper analyses the pairwise distances of signatures produced by the TopSig retrieval model on two document collections. The distribution of the distances is compared to that of purely random signatures. It explains why TopSig is only competitive with state-of-the-art retrieval models at early precision. Only the local neighbourhood of the signatures is interpretable. We suggest this is a common property of vector space models.
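The random baseline mentioned in this abstract can be illustrated with a minimal sketch. It assumes signatures are fixed-width binary strings compared by Hamming distance; the width `BITS`, the collection size `N_SIGNATURES`, and all function names are hypothetical and are not taken from TopSig itself. For purely random signatures the pairwise distances concentrate around half the signature width.

```python
import itertools
import random


def random_signature(bits, rng):
    """Return a random binary signature as a Python int with `bits` bits."""
    return rng.getrandbits(bits)


def hamming(a, b):
    """Hamming distance between two signatures stored as ints."""
    return bin(a ^ b).count("1")


def pairwise_distances(signatures):
    """All pairwise Hamming distances in a collection of signatures."""
    return [hamming(a, b) for a, b in itertools.combinations(signatures, 2)]


rng = random.Random(0)   # fixed seed so the sketch is repeatable
BITS = 1024              # hypothetical signature width
N_SIGNATURES = 200       # hypothetical collection size

# Purely random signatures serve as the baseline distribution.
baseline = [random_signature(BITS, rng) for _ in range(N_SIGNATURES)]
distances = pairwise_distances(baseline)
mean_distance = sum(distances) / len(distances)
```

Distances from a real collection could be passed through the same `pairwise_distances` function and compared with this baseline, for example by plotting both histograms.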

Relevance:

10.00%

Abstract:

Barmah Forest virus (BFV) disease is one of the most widespread mosquito-borne diseases in Australia. The number of outbreaks and the incidence rate of BFV in Australia have attracted growing concerns about the spatio-temporal complexity and underlying risk factors of BFV disease. A large number of notifications has been recorded continuously in Queensland since 1992. Yet, little is known about the spatial and temporal characteristics of the disease. I aim to use notification data to better understand the effects of climatic, demographic, socio-economic and ecological risk factors on the spatial epidemiology of BFV disease transmission, develop predictive risk models and forecast future disease risks under climate change scenarios. Computerised data files of daily notifications of BFV disease and climatic variables in Queensland during 1992-2008 were obtained from Queensland Health and the Australian Bureau of Meteorology, respectively. Projections on climate data for years 2025, 2050 and 2100 were obtained from the Commonwealth Scientific and Industrial Research Organisation (CSIRO). Data on socio-economic, demographic and ecological factors were also obtained from relevant government departments as follows: 1) socio-economic and demographic data from the Australian Bureau of Statistics; 2) wetlands data from the Department of Environment and Resource Management; and 3) tidal readings from the Queensland Department of Transport and Main Roads. Disease notifications were geocoded and spatial and temporal patterns of disease were investigated using geostatistics. Visualisation of BFV disease incidence rates through mapping reveals the presence of substantial spatio-temporal variation at statistical local areas (SLA) over time. Results reveal high incidence rates of BFV disease along coastal areas compared to the whole area of Queensland. 
A Mantel-Haenszel Chi-square analysis for trend reveals a statistically significant relationship between BFV disease incidence rates and age groups (χ² = 7587, p<0.01). Semi-variogram analysis and smoothed maps created from interpolation techniques indicate that the pattern of spatial autocorrelation was not homogeneous across the state. A cluster analysis was used to detect the hot spots/clusters of BFV disease at a SLA level. Most likely spatial and space-time clusters are detected at the same locations across coastal Queensland (p<0.05). The study demonstrates heterogeneity of disease risk at a SLA level and reveals the spatial and temporal clustering of BFV disease in Queensland. Discriminant analysis was employed to establish a link between wetland classes, climate zones and BFV disease. This is because the importance of wetlands in the transmission of BFV disease remains unclear. The multivariable discriminant modelling analyses demonstrate that wetland types of saline 1, riverine and saline tidal influence were the most significant risk factors for BFV disease in all climate and buffer zones, while lacustrine, palustrine, estuarine and saline 2 and saline 3 wetlands were less important. The model accuracies were 76%, 98% and 100% for BFV risk in subtropical, tropical and temperate climate zones, respectively. This study demonstrates that BFV disease risk varied with wetland class and climate zone. The study suggests that wetlands may act as potential breeding habitats for BFV vectors. Multivariable spatial regression models were applied to assess the impact of spatial climatic, socio-economic and tidal factors on the BFV disease in Queensland. Spatial regression models were developed to account for spatial effects. Spatial regression models generated superior estimates over a traditional regression model. 
In the spatial regression models, BFV disease incidence shows an inverse relationship with minimum temperature, low tide and distance to coast, and positive relationship with rainfall in coastal areas whereas in whole Queensland the disease shows an inverse relationship with minimum temperature and high tide and positive relationship with rainfall. This study determines the most significant spatial risk factors for BFV disease across Queensland. Empirical models were developed to forecast the future risk of BFV disease outbreaks in coastal Queensland using existing climatic, socio-economic and tidal conditions under climate change scenarios. Logistic regression models were developed using BFV disease outbreak data for the existing period (2000-2008). The most parsimonious model had high sensitivity, specificity and accuracy and this model was used to estimate and forecast BFV disease outbreaks for years 2025, 2050 and 2100 under climate change scenarios for Australia. Important contributions arising from this research are that: (i) it is innovative to identify high-risk coastal areas by creating buffers based on grid-centroid and the use of fine-grained spatial units, i.e., mesh blocks; (ii) a spatial regression method was used to account for spatial dependence and heterogeneity of data in the study area; (iii) it determined a range of potential spatial risk factors for BFV disease; and (iv) it predicted the future risk of BFV disease outbreaks under climate change scenarios in Queensland, Australia. In conclusion, the thesis demonstrates that the distribution of BFV disease exhibits a distinct spatial and temporal variation. Such variation is influenced by a range of spatial risk factors including climatic, demographic, socio-economic, ecological and tidal variables. The thesis demonstrates that spatial regression method can be applied to better understand the transmission dynamics of BFV disease and its risk factors. 
The research findings show that disease notification data can be integrated with multi-factorial risk factor data to develop build-up models and forecast future potential disease risks under climate change scenarios. This thesis may have implications in BFV disease control and prevention programs in Queensland.
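The abstract reports that the chosen logistic regression model had high sensitivity, specificity and accuracy. As a minimal sketch of how those three measures are computed from binary outbreak predictions, the example below uses invented labels (1 = outbreak, 0 = no outbreak); the data and function names are hypothetical and do not come from the thesis.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1/0)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(actual, predicted):
    """Return sensitivity, specificity and accuracy as a dict."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    return {
        "sensitivity": tp / (tp + fn),   # share of real outbreaks detected
        "specificity": tn / (tn + fp),   # share of non-outbreaks correctly cleared
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical observed outbreaks and model predictions for eight periods.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(actual, predicted)
```

With these invented labels there are three true positives, three true negatives, one false positive and one false negative, so all three measures equal 0.75.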

Relevance:

10.00%

Abstract:

Background: Cancer outlier profile analysis (COPA) has proven to be an effective approach to analyzing cancer expression data, leading to the discovery of the TMPRSS2 and ETS family gene fusion events in prostate cancer. However, the original COPA algorithm did not identify down-regulated outliers, and the currently available R package implementing the method is similarly restricted to the analysis of over-expressed outliers. Here we present a modified outlier detection method, mCOPA, which contains refinements to the outlier-detection algorithm, identifies both over- and under-expressed outliers, is freely available, and can be applied to any expression dataset. Results: We compare our method to other feature-selection approaches, and demonstrate that mCOPA frequently selects more-informative features than do differential expression or variance-based feature selection approaches, and is able to recover observed clinical subtypes more consistently. We demonstrate the application of mCOPA to prostate cancer expression data, and explore the use of outliers in clustering, pathway analysis, and the identification of tumour suppressors. We analyse the under-expressed outliers to identify known and novel prostate cancer tumour suppressor genes, validating these against data in Oncomine and the Cancer Gene Index. We also demonstrate how a combination of outlier analysis and pathway analysis can identify molecular mechanisms disrupted in individual tumours. Conclusions: We demonstrate that mCOPA offers advantages, compared to differential expression or variance, in selecting outlier features, and that the features so selected are better able to assign samples to clinically annotated subtypes. Further, we show that the biology explored by outlier analysis differs from that uncovered in differential expression or variance analysis. 
mCOPA is an important new tool for the exploration of cancer datasets and the discovery of new cancer subtypes, and can be combined with pathway and functional analysis approaches to discover mechanisms underpinning heterogeneity in cancers.
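A rough sketch of the outlier idea follows. It uses the median-centring and median absolute deviation (MAD) scaling commonly associated with COPA, then flags samples in both directions. The fixed `threshold`, the function names and the expression values are assumptions for illustration; this is not the mCOPA algorithm, whose refinements the abstract does not describe.

```python
from statistics import median


def copa_transform(values):
    """Median-centre one gene's expression values and scale by the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; values cannot be scaled")
    return [(v - med) / mad for v in values]


def outlier_samples(values, threshold=3.0):
    """Return sample indices that are over- and under-expressed outliers.

    The fixed threshold is a hypothetical choice made for this sketch.
    """
    scores = copa_transform(values)
    over = [i for i, s in enumerate(scores) if s > threshold]
    under = [i for i, s in enumerate(scores) if s < -threshold]
    return over, under


# Hypothetical expression values for one gene across ten samples.
expression = [5.0, 5.2, 4.9, 5.1, 5.0, 9.8, 5.1, 0.4, 4.8, 5.0]
over, under = outlier_samples(expression)
```

Here sample 5 stands out as over-expressed and sample 7 as under-expressed, while the remaining samples stay close to the median.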

Relevance:

10.00%

Abstract:

Due to the explosive growth of the Web, the domain of Web personalization has gained great momentum in both the research and commercial areas. Recommender systems are among the most popular Web personalization systems. In recommender systems, choosing which user information to use is crucial for user profiling. In Web 2.0, one facility that can help users organize Web resources of interest is the user tagging system. Exploring user tagging behavior provides a promising way of understanding users’ information needs since tags are given directly by users. However, the free and relatively uncontrolled vocabulary means that user-defined tags lack standardization and suffer from semantic ambiguity. Also, the relationships among tags need to be explored, since these rich relationships could provide valuable information for better understanding users. In this paper, we propose a novel approach for learning a tag ontology based on the widely used lexical database WordNet for capturing the semantics and the structural relationships of tags. We present personalization strategies to disambiguate the semantics of tags by combining the opinion of WordNet lexicographers with users’ tagging behavior. To personalize further, clustering of users is performed to generate a more accurate ontology for a particular group of users. In order to evaluate the usefulness of the tag ontology, we use it in a pilot tag recommendation experiment to improve recommendation performance by exploiting the semantic information in the tag ontology. The initial result shows that the personalized information has improved the accuracy of the tag recommendation.

Relevance:

10.00%

Abstract:

A multicausal model of adolescent homelessness is proposed, based upon the notion that homeless youth suffer from emotional, social, and cultural deprivation. The model was tested in a sample of homeless adolescents (n = 54) and a similar, but not homeless, control group (n = 58). Emotional deprivation was assessed on the Parental Bonding Inventory (Parker, Tupling, & Brown, 1979), whereas social and cultural deprivation were assessed on the Family Environment Scale (Moos & Moos, 1981). The homeless adolescents were found to be significantly more deprived emotionally, socially, and culturally than the controls. The results indicate support for a deprivation model of adolescent homelessness with implications for public policy and intervention planning.

Relevance:

10.00%

Abstract:

The majority of distribution utilities do not have accurate information on the constituents of their loads. This information is very useful in managing and planning the network adequately and economically. Customer loads are normally categorized in three main sectors: 1) residential; 2) industrial; and 3) commercial. In this paper, penalized least-squares regression and Euclidean distance methods are developed for this application to identify and quantify the makeup of a feeder load with unknown sectors/subsectors. This process is done on a monthly basis to account for seasonal and other load changes. The error between the actual and estimated load profiles is used as a benchmark of accuracy. This approach has been shown to be accurate in identifying customer types in unknown load profiles, and is used in cross-validation of the results and initial assumptions.
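The Euclidean distance idea can be sketched as a nearest-profile lookup: an unknown load profile is normalised and matched to the closest reference sector profile. The reference shapes, the peak normalisation and the function names below are all hypothetical, and the sketch identifies only a single closest sector; it does not quantify a mixed feeder load or implement the paper's penalized least-squares step.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length load profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def normalise(profile):
    """Scale a profile by its peak so that only its shape is compared."""
    peak = max(profile)
    return [v / peak for v in profile]


def closest_sector(unknown, references):
    """Return the name of the reference profile nearest to `unknown`."""
    unknown_n = normalise(unknown)
    return min(
        references,
        key=lambda name: euclidean(unknown_n, normalise(references[name])),
    )


# Hypothetical six-point daily load shapes for the three main sectors.
references = {
    "residential": [0.3, 0.2, 0.4, 0.5, 1.0, 0.7],
    "commercial": [0.2, 0.2, 0.9, 1.0, 0.6, 0.3],
    "industrial": [0.8, 0.9, 1.0, 1.0, 0.9, 0.8],
}

# Hypothetical measured feeder load (arbitrary units) with an evening peak.
unknown_load = [30, 25, 45, 50, 95, 60]
sector = closest_sector(unknown_load, references)
```

In this invented example the evening peak makes the residential shape the nearest match.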

Relevance:

10.00%

Abstract:

The concept of older adults contributing to society in a meaningful way has been termed ‘active ageing’. Active ageing reflects changes in prevailing theories of social and psychological aspects of ageing, with a focus on individuals' strengths as opposed to their deficits or pathology. In order to explore predictors of active ageing, the Australian Active Ageing (Triple A) project group undertook a national postal survey of participants over the age of 50 years recruited randomly through their 2004 membership of a large Australia-wide seniors' organisation. The survey comprised 178 items covering paid and voluntary work, learning, social, spiritual, emotional, health and home, life events and demographic items. A 45% response rate (2655 returned surveys) reflected an expected balance of gender, age and geographic representation of participants. The data were analysed using data mining techniques to represent generalizations on individual situations. Data mining identifies the valid, novel, potentially useful and understandable patterns and trends in data. The results based on the clustering mining technique indicate that physical and emotional health combined with the desire to learn were the most significant factors when considering active ageing. The findings suggest that remaining active in later life is not only directly related to the maintenance of emotional and physical health, but may be significantly intertwined with the opportunity to engage in on-going learning activities that are relevant to the individual. The findings of this study suggest that practitioners and policy makers need to incorporate older people's learning needs within service and policy framework developments.

Relevance:

10.00%

Abstract:

In this paper, we propose an approach which attempts to solve the problem of surveillance event detection, assuming that we know the definition of the events. To facilitate the discussion, we first define two concepts. The event of interest refers to the event that the user requests the system to detect; and the background activities are any other events in the video corpus. This is an unsolved problem due to many factors as listed below: 1) Occlusions and clustering: The surveillance scenes which are of significant interest, at locations such as airports, railway stations and shopping centers, are often crowded, and occlusions and clustering of people are frequently encountered. This significantly affects the feature extraction step; for instance, trajectories generated by object tracking algorithms are usually not robust in such situations. 2) The requirement for real time detection: The system should process the video fast enough in both the feature extraction and the detection steps to facilitate real time operation. 3) Massive size of the training data set: If an event lasts for 1 minute in a video with a frame rate of 25 fps, the number of frames for this event is 60 × 25 = 1500. If we want to have a training data set with many positive instances of the event, the video is likely to be very large in size (i.e. hundreds of thousands of frames or more). How to handle such a large data set is a problem frequently encountered in this application. 4) Difficulty in separating the event of interest from background activities: The events of interest often co-exist with a set of background activities. Temporal ground truth is typically very ambiguous, as it does not distinguish the event of interest from a wide range of co-existing background activities. However, it is not practical to annotate the locations of the events in large amounts of video data. 
This problem becomes more serious in the detection of multi-agent interactions, since the location of these events can often not be constrained to within a bounding box. 5) Challenges in determining the temporal boundaries of the events: An event can occur at any arbitrary time with an arbitrary duration. The temporal segmentation of events is difficult and ambiguous, and also affected by other factors such as occlusions.

Relevance:

10.00%

Abstract:

The development of creative industries has been connected to urban development since the end of the 20th century. However, the causality of why creative industries always cluster and develop in certain cities hasn't been adequately demonstrated, especially as to how various resources grow, interact and nurture the creative capacity of the locality. Therefore it is vital to observe how the local institutional environment nurtures creative industries and how creative industries consequently change the environment in order to better address the connection between creative industries and localities. In Beijing, the relocation of CCTV, BTV and Phoenix to Chaoyang District raises the possibility of a new era for Chinese media, one in which the stodginess of propaganda content will give way to exciting new forms and genres. The mixing of media companies in an open commercial environment (away from the political power district of Xicheng) holds the promise of more freedom of expression and, ultimately, of a 'media capital' (Curtin, 2003). These are the dreams of many media practitioners in Beijing. But just how realistic are their expectations? This study adopts the concept of 'media capital' to demonstrate how participants, including state-media organisations, private media companies and international media conglomerates, are seeking out space and networks to survive in Beijing. Drawing on policy analysis, interviews and case studies, this study illustrates how different agents meet, confront and adapt in Beijing. This study identifies factors responsible for the media industries clustering in China, and argues that Beijing is very likely to be the next Chinese media capital, after enough accumulation and development, although as a lower tier version compared to other media capitals in the world. 
This study contributes to Curtin's 'media capital' concept, develops his interpretation on the relationship of media industries and the government, and suggests that the influence over the government of media companies and professionals should be acknowledged. Therefore, empirically, this study assists media practitioners in understanding how the Chinese government perceives media industries and, consequently, how media industries are operated in China. The study also reveals that despite the government's aspirations, China's media industries are still greatly constrained by institutional obstacles. Hence Beijing really needs to speed up its pace on the path of media reform, abandon the old mindset and create more room for creativity. Policy-makers in China should keep in mind that the only choice left to them is to further the reform.

Relevance:

10.00%

Abstract:

Smartphones have become a critical part of our lives as they offer advanced capabilities with PC-like functionality. They are being widely deployed and are no longer used only for classical voice-centric communication. New smartphone malware keeps emerging, most of which still targets Symbian OS. In the case of Symbian OS, application signing seemed to be an appropriate measure for slowing down malware appearance. Unfortunately, recent examples showed that signing can be bypassed, resulting in new malware outbreaks. In this paper, we present a novel approach to static malware detection in resource-limited mobile environments. This approach can be used to extend currently used third-party application signing mechanisms for increasing malware detection capabilities. In our work, we extract function calls from binaries in order to apply our clustering mechanism, called centroid. This method is capable of detecting unknown malware. Our results are promising, and the employed mechanism might find application at distribution channels, like online application stores. Additionally, it seems suitable for direct use on smartphones for (pre-)checking installed applications.

Relevance:

10.00%

Abstract:

Large Igneous Provinces are exceptional intraplate igneous events throughout Earth’s history. Their significance and potential global impact are related to the total volume of magma intruded and released during these geologically brief events (peak eruptions are often within 1-5 Myrs duration) where millions to tens of millions of cubic kilometers of magma are produced. In some cases, at least 1% of the Earth’s surface has been directly covered in volcanic rock, being equivalent to the size of small continents with comparable crustal thicknesses. Large Igneous Provinces are thus important, albeit episodic, periods of new crust addition. However, most magmatism is basaltic so that contributions to crustal growth will not always be picked up in zircon geochronology studies that better trace major episodes of extension-related silicic magmatism and the silicic Large Igneous Provinces. Much headway has been made on our understanding of these anomalous igneous events over the last 25 years, driving many new ideas and models. 
These include their: 1) global spatial and temporal distribution, with a long-term average of one event approximately every 20 Myrs, but a clear clustering of events at times of supercontinent break-up – Large Igneous Provinces are thus an integral part of the Wilson cycle and are becoming an increasingly important tool in reconnecting dispersed continental fragments; 2) compositional diversity that in part reflects their crustal setting of ocean basins, and continental interiors and margins where in the latter setting, LIP magmatism can be silicic-dominant; 3) mineral and energy resources with major PGE and precious metal resources being hosted in these provinces, as well as magmatism impacting on the hydrocarbon potential of volcanic basins and rifted margins through enhancing source rock maturation, providing fluid migration pathways, and trap formation; 4) biospheric, hydrospheric and atmospheric impacts, with Large Igneous Provinces now widely regarded as a key trigger mechanism for mass extinctions, although the exact kill mechanism(s) are still being resolved; 5) role in mantle geodynamics and thermal evolution of the Earth, by potentially recording the transport of material from the lower mantle or core-mantle boundary to the Earth's surface and being a fundamental component in whole mantle convection models; and 6) recognition on the inner planets where the lack of plate tectonics and erosional processes and planetary antiquity means that the very earliest record of LIP events during planetary evolution may be better preserved than on Earth.

Relevance:

10.00%

Abstract:

LIP emplacement is linked to the timing and evolution of supercontinental break-up. LIP-related break-up produces volcanic rifted margins, new and large (up to 10⁸ km²) ocean basins, and new, smaller continents that undergo dispersal and potentially reassembly (e.g., India). However, not all continental LIPs lead to continental rupture. We analysed the <330 Ma continental LIP record (following final assembly of Pangea) to find relationships between LIP event attributes (e.g., igneous volume, extent, distance from pre-existing continental margin) and ocean basin attributes (e.g., length of new ocean basin/rifted margin) and how these varied during the progressive break-up of Pangea. No correlation exists between LIP magnitude and size of the subsequent ocean basin or rifted margin. Our review suggests a three-phased break-up history of Pangea: 1) “Preconditioning” phase (∼330–200 Ma): LIP events (n=7) occurred largely around the supercontinental margin clustering today in Asia, with a low (<20%) rifting success rate. The Panjal Traps at ∼280 Ma may represent the first continental rupturing event of Pangea, resulting in continental ribboning along the Tethyan margin; 2) “Main Break-up” phase (∼200–100 Ma): numerous large LIP events (n=10) in the supercontinent interior, resulting in highly successful fragmentation (90%) and large, new ocean basins (e.g., Central/South Atlantic, Indian, >3000 km long); 3) “Waning” phase (∼100–0 Ma): Declining LIP magnitudes (n=6), greater proximity to continental margins (e.g., Madagascar, North Atlantic, Afro-Arabia, Sierra Madre) producing smaller ocean basins (<2600 km long). How Pangea broke up may thus have implications for earlier supercontinent reconstructions and LIP record.