299 results for clustering


Relevance: 10.00%

Abstract:

Ongoing financial, environmental and political adjustments have shifted the role of large international airports. Many airports are expanding from a narrow concentration on operating as transportation centres to becoming economic hubs. By working together, airports and other industry sectors can not only contribute to and facilitate economic prosperity, but also create social advantage for local and regional areas in new ways. This transformation of the function and orientation of airports has been termed the aerotropolis or airport metropolis, where the airport is recognised as an economic centre with land uses that link local and global markets. This chapter contends that the conversion of an airport into a sustainable airport metropolis requires more than just industry clustering and the existence of hard physical infrastructure. Attention must also be directed to the creation and ongoing development of social infrastructure within proximate areas and the maximisation of connectivity flows within and between infrastructure elements. The chapter concludes that establishing an interactive and interdependent trilogy of hard, soft and social infrastructures provides the balance necessary for the airport metropolis to ensure sustainable development. It provides the start of an operating framework to integrate and harness this infrastructure trilogy, enabling airport cities to achieve optimal and sustainable social and economic advantage.

Relevance: 10.00%

Abstract:

This paper introduces a novel strategy for the specification of airworthiness certification categories for civil unmanned aircraft systems (UAS). The risk-based approach acknowledges the fundamental differences between the risk paradigms of manned and unmanned aviation. The proposed airworthiness certification matrix provides a systematic and objective structure for regulating the airworthiness of a diverse range of UAS types and operations. An approach for specifying UAS type categories is then discussed. An example of the approach, which includes the novel application of data-clustering algorithms, is presented to illustrate the discussion.
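The abstract does not specify which data-clustering algorithm was applied; as an illustrative sketch only, a basic k-means grouping of hypothetical UAS feature vectors (mass and operating altitude are assumed features, not taken from the paper) shows how type categories might emerge from the data:

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Basic k-means: group feature vectors into k clusters."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # assignment step: each point joins its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[i].append(p)
        # update step: centroids move to the mean of their cluster
        centroids = [
            tuple(sum(x) / len(c) for x in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical UAS feature vectors: (mass in kg, operating altitude in km)
uas = [(2, 0.1), (3, 0.15), (25, 1.2), (30, 1.5), (1100, 12), (1500, 14)]
centroids, clusters = kmeans(uas, k=3)
```

In practice the features would be drawn from the risk paradigm discussed in the paper (e.g. kinetic energy, operational environment), and the number of categories would itself be a regulatory design choice.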

Relevance: 10.00%

Abstract:

Ad hoc networks are vulnerable to attacks due to their distributed nature and lack of infrastructure. Intrusion detection systems (IDS) provide audit and monitoring capabilities that offer local security to a node and help to perceive the specific trust level of other nodes. Clustering protocols can be exploited in these processing-constrained networks to collaboratively detect intrusions with less power usage and minimal overhead. Existing clustering protocols are not suitable for intrusion detection because they are tied to routes: route establishment and route renewal affect the clusters and, as a consequence, the processing and traffic overhead increases due to the instability of clusters. Ad hoc networks are battery- and power-constrained, so a trusted monitoring node should be available to detect and respond to intrusions in time. This can be achieved only if the clusters are stable for a long period of time; if the clusters change regularly with the routes, intrusion detection will not be effective. Therefore, a generalized clustering algorithm is proposed that can run on top of any routing protocol and can monitor intrusions constantly, irrespective of the routes. The proposed simplified clustering scheme has been used to detect intrusions, resulting in high detection rates and low processing and memory overhead irrespective of the routes, connections, traffic types and mobility of nodes in the network. Clustering is also useful for detecting intrusions collaboratively, since an individual node can neither detect a malicious node alone nor take action against it on its own.
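The generalized clustering algorithm itself is not reproduced in the abstract; as a hypothetical simplification of route-independent clustering, each node could elect a monitoring cluster head from its one-hop neighbourhood using only node degree, so that cluster membership does not change when routes do:

```python
def elect_cluster_heads(adjacency):
    """Each node joins the neighbour (or itself) with the highest
    degree; ties are broken by the lowest node id. Route-independent:
    only one-hop neighbourhood information is used."""
    degree = {n: len(nbrs) for n, nbrs in adjacency.items()}
    heads = {}
    for node, nbrs in adjacency.items():
        candidates = [node] + list(nbrs)
        # prefer higher degree, then lower id for determinism
        heads[node] = max(candidates, key=lambda c: (degree[c], -c))
    return heads

# Hypothetical 5-node ad hoc network as an adjacency list
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1, 4], 4: [3]}
heads = elect_cluster_heads(adj)
```

Here node 1, the best-connected node, ends up monitoring nodes 0-3, while node 4 attaches to node 3; real schemes would also weigh residual battery power and trust level when choosing the head.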

Relevance: 10.00%

Abstract:

The high morbidity and mortality associated with atherosclerotic coronary vascular disease (CVD) and its complications are being lessened by the increased knowledge of risk factors, effective preventative measures and proven therapeutic interventions. However, significant CVD morbidity remains and sudden cardiac death continues to be a presenting feature for some subsequently diagnosed with CVD. Coronary vascular disease is also the leading cause of anaesthesia related complications. Stress electrocardiography/exercise testing is predictive of 10 year risk of CVD events and the cardiovascular variables used to score this test are monitored peri-operatively. Similar physiological time-series datasets are being subjected to data mining methods for the prediction of medical diagnoses and outcomes. This study aims to find predictors of CVD using anaesthesia time-series data and patient risk factor data. Several pre-processing and predictive data mining methods are applied to this data. Physiological time-series data related to anaesthetic procedures are subjected to pre-processing methods for removal of outliers, calculation of moving averages as well as data summarisation and data abstraction methods. Feature selection methods of both wrapper and filter types are applied to derived physiological time-series variable sets alone and to the same variables combined with risk factor variables. The ability of these methods to identify subsets of highly correlated but non-redundant variables is assessed. The major dataset is derived from the entire anaesthesia population and subsets of this population are considered to be at increased anaesthesia risk based on their need for more intensive monitoring (invasive haemodynamic monitoring and additional ECG leads). 
Because of the unbalanced class distribution in the data, majority-class under-sampling and the Kappa statistic, together with misclassification rate and area under the ROC curve (AUC), are used to evaluate models generated using different prediction algorithms. The performance of models derived from feature-reduced datasets reveals the filter method, Cfs subset evaluation, to be the most consistently effective, although Consistency-derived subsets tended to slightly increase accuracy at the cost of markedly increased complexity. The use of misclassification rate (MR) for model performance evaluation is influenced by class distribution. This could be eliminated by consideration of the AUC or Kappa statistic, as well as by evaluation of subsets with an under-sampled majority class. The noise and outlier removal pre-processing methods produced models with MR ranging from 10.69 to 12.62, with the lowest value being for data from which both outliers and noise were removed (MR 10.69). For the raw time-series dataset, MR is 12.34. Feature selection reduces MR to between 9.8 and 10.16, with time-segmented summary data (dataset F) at MR 9.8 and raw time-series summary data (dataset A) at 9.92. However, for all datasets based on time-series data alone, the complexity is high. For most pre-processing methods, Cfs could identify a subset of correlated and non-redundant variables from the time-series-alone datasets, but models derived from these subsets are of one leaf only. MR values are consistent with the class distribution in the subset folds evaluated in the n-fold cross-validation method. For models based on Cfs-selected time-series-derived and risk factor (RF) variables, the MR ranges from 8.83 to 10.36, with dataset RF_A (raw time-series data and RF) at 8.85 and dataset RF_F (time-segmented time-series variables and RF) at 9.09. 
The models based on counts of outliers and counts of data points outside the normal range (dataset RF_E), and on derived variables based on time series transformed using Symbolic Aggregate Approximation (SAX) with associated time-series pattern cluster membership (dataset RF_G), perform the least well, with MR of 10.25 and 10.36 respectively. For coronary vascular disease prediction, nearest neighbour (NNge) and the support vector machine based method, SMO, have the highest MR of 10.1 and 10.28, while logistic regression (LR) and the decision tree (DT) method, J48, have MR of 8.85 and 9.0 respectively. DT rules are the most comprehensible and clinically relevant. The increase in predictive accuracy achieved by adding risk factor variables to models based on time-series variables is significant. The addition of time-series-derived variables to models based on risk factor variables alone is associated with a trend to improved performance. Data mining of feature-reduced anaesthesia time-series variables together with risk factor variables can produce compact and moderately accurate models able to predict coronary vascular disease. Decision tree analysis of time-series data combined with risk factor variables yields rules which are more accurate than models based on time-series data alone. The limited additional value provided by electrocardiographic variables compared with risk factors alone echoes recent suggestions that exercise electrocardiography (exECG) under standardised conditions has limited additional diagnostic value over risk factor analysis and symptom pattern. The pre-processing used in this study had limited effect when time-series variables and risk factor variables are used together as model input. 
In the absence of risk factor input, the use of time-series variables after outlier removal, and of time-series variables based on physiological values falling outside the accepted normal range, is associated with some improvement in model performance.
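The Kappa statistic and misclassification rate used for evaluation can both be computed from a 2x2 confusion matrix; the counts below are illustrative and not taken from the study:

```python
def evaluate(tp, fn, fp, tn):
    """Misclassification rate and Cohen's kappa from a 2x2
    confusion matrix. Kappa corrects observed agreement for the
    agreement expected by chance, which matters for the unbalanced
    class distributions described above."""
    n = tp + fn + fp + tn
    mr = (fp + fn) / n
    p_obs = (tp + tn) / n
    # chance agreement from the marginal class frequencies
    p_yes = ((tp + fn) / n) * ((tp + fp) / n)
    p_no = ((fp + tn) / n) * ((fn + tn) / n)
    p_exp = p_yes + p_no
    kappa = (p_obs - p_exp) / (1 - p_exp)
    return mr, kappa

# Illustrative counts: 50 positive cases, 50 negative cases
mr, kappa = evaluate(tp=40, fn=10, fp=5, tn=45)
```

With a heavily skewed class distribution, a trivial majority-class model can achieve a low MR but a kappa near zero, which is why the study reports both.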

Relevance: 10.00%

Abstract:

While close talking microphones give the best signal quality and produce the highest accuracy from current Automatic Speech Recognition (ASR) systems, the speech signal enhanced by microphone array has been shown to be an effective alternative in a noisy environment. The use of microphone arrays in contrast to close talking microphones alleviates the feeling of discomfort and distraction to the user. For this reason, microphone arrays are popular and have been used in a wide range of applications such as teleconferencing, hearing aids, speaker tracking, and as the front-end to speech recognition systems. With advances in sensor and sensor network technology, there is considerable potential for applications that employ ad-hoc networks of microphone-equipped devices collaboratively as a virtual microphone array. By allowing such devices to be distributed throughout the users’ environment, the microphone positions are no longer constrained to traditional fixed geometrical arrangements. This flexibility in the means of data acquisition allows different audio scenes to be captured to give a complete picture of the working environment. In such ad-hoc deployment of microphone sensors, however, the lack of information about the location of devices and active speakers poses technical challenges for array signal processing algorithms which must be addressed to allow deployment in real-world applications. While not an ad-hoc sensor network, conditions approaching this have in effect been imposed in recent National Institute of Standards and Technology (NIST) ASR evaluations on distant microphone recordings of meetings. The NIST evaluation data comes from multiple sites, each with different and often loosely specified distant microphone configurations. This research investigates how microphone array methods can be applied for ad-hoc microphone arrays. 
A particular focus is on devising methods that are robust to unknown microphone placements, in order to improve the overall speech quality and recognition performance provided by the beamforming algorithms. In ad-hoc situations, microphone positions and likely source locations are not known, and beamforming must be achieved blindly. There are two general approaches to blindly estimating the steering vector for beamforming. The first is direct estimation without regard to the microphone and source locations. The alternative is to first determine the unknown microphone positions through array calibration methods and then to use the traditional geometrical formulation for the steering vector. Following these two major approaches investigated in this thesis, a novel clustered approach is proposed, which clusters the microphones and selects clusters based on their proximity to the speaker. Novel experiments demonstrate that the proposed method of automatically selecting a cluster of microphones (i.e., a subarray) closely located both to each other and to the desired speech source may in fact provide more robust speech enhancement and recognition than the full array.
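A minimal sketch of the subarray-selection idea, assuming (purely for illustration) that microphone and speaker positions have already been estimated, e.g. by array calibration; the layout and radius are invented:

```python
import math

def select_subarray(mics, speaker, radius):
    """Pick the cluster of microphones closest to the speaker:
    seed with the nearest microphone, then keep every microphone
    within `radius` of that seed, so the chosen subarray is close
    both to the source and to itself."""
    seed = min(mics, key=lambda m: math.dist(m, speaker))
    return [m for m in mics if math.dist(m, seed) <= radius]

# Hypothetical 2-D room layout (coordinates in metres):
# three devices near the speaker, two on the far side of the room
mics = [(0, 0), (0.3, 0), (0.2, 0.2), (4, 4), (4.2, 4.1)]
speaker = (0.5, 0.5)
subarray = select_subarray(mics, speaker, radius=0.5)
```

Only the three nearby devices survive the selection; beamforming would then be applied over this subarray rather than the full ad-hoc array.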

Relevance: 10.00%

Abstract:

Speaker verification is the process of verifying the identity of a person by analysing their speech. There are several important applications for automatic speaker verification (ASV) technology, including suspect identification, tracking terrorists and detecting a person’s presence at a remote location in the surveillance domain, as well as person authentication for phone banking and credit card transactions in the private sector. Telephones and telephony networks provide a natural medium for these applications. The aim of this work is to improve the usefulness of ASV technology for practical applications in the presence of adverse conditions. In a telephony environment, background noise, handset mismatch, channel distortions, room acoustics and restrictions on the available testing and training data are common sources of error for ASV systems. Two research themes were pursued to overcome these adverse conditions: modelling mismatch and modelling uncertainty. To address the performance degradation incurred through mismatched conditions, it was proposed to model this mismatch directly. Feature mapping was evaluated for combating handset mismatch and was extended through the use of a blind clustering algorithm to remove the need for accurate handset labels in the training data. Mismatch modelling was then generalised by explicitly modelling the session conditions as a constrained offset of the speaker model means. This session variability modelling approach enabled the modelling of arbitrary sources of mismatch, including handset type, and halved the error rates in many cases. Methods to model the uncertainty in speaker model estimates and verification scores were developed to address the difficulties of limited training and testing data. The Bayes factor was introduced to account for the uncertainty of the speaker model estimates in testing by applying Bayesian theory to the verification criterion, with improved performance in matched conditions.
Modelling the uncertainty in the verification score itself met with significant success. Estimating a confidence interval for the "true" verification score enabled an order of magnitude reduction in the average quantity of speech required to make a confident verification decision based on a threshold. The confidence measures developed in this work may also have significant applications for forensic speaker verification tasks.
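The confidence-interval approach to early verification decisions can be sketched as follows; the per-frame scores, threshold, and normal-approximation interval are illustrative assumptions, not the exact formulation used in the work:

```python
import math

def decide(scores, threshold, z=1.96):
    """Return 'accept', 'reject', or None (undecided) once the
    confidence interval for the mean verification score no longer
    straddles the decision threshold; None means more speech is
    needed before a confident decision can be made."""
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)
    half = z * math.sqrt(var / n)  # 95% normal-approx half-width
    if mean - half > threshold:
        return "accept"
    if mean + half < threshold:
        return "reject"
    return None

# Hypothetical per-frame verification scores for one trial
decision = decide([1.2, 0.9, 1.4, 1.1, 1.3], threshold=0.5)
```

With these scores the interval lies entirely above the threshold after only five frames, so the trial can be accepted early; a borderline threshold (e.g. 1.2 here) would leave the decision undecided, which is exactly the mechanism that reduces the average quantity of speech required.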

Relevance: 10.00%

Abstract:

Artificial neural network (ANN) learning methods provide a robust and non-linear approach to approximating the target function for many classification, regression and clustering problems. ANNs have demonstrated good predictive performance in a wide variety of practical problems. However, there are strong arguments as to why ANNs are not sufficient for the general representation of knowledge: the poor comprehensibility of the learned ANN, and its inability to represent explanation structures. The overall objective of this thesis is to address these issues by: (1) explaining the decision process of ANNs in the form of symbolic rules (predicate rules with variables); and (2) providing explanatory capability by mapping the general conceptual knowledge learned by the neural networks into a knowledge base for use in a rule-based reasoning system. A multi-stage methodology, GYAN, is developed and evaluated for the task of extracting knowledge from trained ANNs. The extracted knowledge is represented in the form of restricted first-order logic rules and subsequently allows user interaction by interfacing with a knowledge-based reasoner. The performance of GYAN is demonstrated using a number of real-world and artificial data sets. The empirical results demonstrate that: (1) an equivalent symbolic interpretation is derived describing the overall behaviour of the ANN with high accuracy and fidelity; and (2) a concise explanation is given (in terms of the rules, facts and predicates activated in a reasoning episode) as to why a particular instance is classified into a certain category.

Relevance: 10.00%

Abstract:

We propose a model-based approach to unify clustering and network modeling using time-course gene expression data. Specifically, our approach uses a mixture model to cluster genes. Genes within the same cluster share a similar expression profile. The network is built over cluster-specific expression profiles using state-space models. We discuss the application of our model to simulated data as well as to time-course gene expression data arising from animal models on prostate cancer progression. The latter application shows that with a combined statistical/bioinformatics analyses, we are able to extract gene-to-gene relationships supported by the literature as well as new plausible relationships.
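The mixture-model clustering step can be illustrated with a minimal 1-D two-component Gaussian mixture fitted by EM; the actual model clusters whole time-course expression profiles and then links clusters with state-space models, so this sketch and its data are hypothetical:

```python
import math

def em_two_gaussians(xs, iters=50):
    """EM for a two-component 1-D Gaussian mixture; observations
    with similar values end up assigned to the same component."""
    mu = [min(xs), max(xs)]      # crude but deterministic init
    sigma = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior responsibility of each component
        resp = []
        for x in xs:
            d = [pi[k] / (sigma[k] * math.sqrt(2 * math.pi))
                 * math.exp(-((x - mu[k]) ** 2) / (2 * sigma[k] ** 2))
                 for k in range(2)]
            s = sum(d)
            resp.append([d[k] / s for k in range(2)])
        # M-step: re-estimate weights, means and variances
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2
                      for r, x in zip(resp, xs)) / nk
            sigma[k] = max(math.sqrt(var), 1e-3)  # avoid collapse
    labels = [0 if r[0] > r[1] else 1 for r in resp]
    return mu, labels

# Hypothetical expression values: a low- and a high-expression group
mu, labels = em_two_gaussians([1.0, 1.2, 0.9, 5.0, 5.3, 4.8])
```

In the paper's setting each "observation" is an entire expression profile and the cluster-specific profiles, not individual genes, become the nodes of the state-space network model.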

Relevance: 10.00%

Abstract:

During the last decade many cities have sought to promote creativity by encouraging creative industries as drivers of economic and spatial growth. Among the creative industries, the film industry plays an important role in the economic and spatial development of cities by fostering endogenous creativeness, attracting exogenous talent, and contributing to the formation of the places that creative cities require. The paper aims to scrutinize the role of creative industries in general, and the film industry in particular, in place making, spatial development, tourism, and the formation of creative cities, as well as their clustering and locational decisions. It investigates the positive effects of the film industry on tourism, such as incubating creative potential, increasing place recognition through film locations and hosted film festivals, attracting visitors, and establishing interaction among visitors, places and their cultures. The paper presents the preliminary findings of two case studies, from Beyoglu, Istanbul and Soho, London, examines the relation between creativity, tourism, culture and the film industry, and discusses their effects on place-making and tourism.

Relevance: 10.00%

Abstract:

This overview focuses on the application of chemometrics techniques to the investigation of soils contaminated by polycyclic aromatic hydrocarbons (PAHs) and metals, because these two important and very diverse groups of pollutants are ubiquitous in soils. The salient features of various studies carried out in the micro- and recreational environments of humans are highlighted in the context of the various multivariate statistical techniques available across discipline boundaries that have been effectively used in soil studies. Particular attention is paid to techniques employed in the geosciences that may be effectively utilized for environmental soil studies; classical multivariate approaches that may be used in isolation or as complementary methods to these are also discussed. Chemometrics techniques widely applied in atmospheric studies, for identifying sources of pollutants or for determining the importance of contaminant source contributions to a particular site, have seen little use in soil studies but may be effectively employed in such investigations. Suitable programs are also available for suggesting mitigating measures in cases of soil contamination, and these are also considered. Specific techniques reviewed include pattern recognition techniques such as Principal Components Analysis (PCA), Fuzzy Clustering (FC) and Cluster Analysis (CA); geostatistical tools include variograms, Geographical Information Systems (GIS), contour mapping and kriging; source identification and contribution estimation methods reviewed include Positive Matrix Factorisation (PMF) and Principal Component Analysis on Absolute Principal Component Scores (PCA/APCS). Mitigating measures to limit or eliminate pollutant sources may be suggested through the use of ranking analysis and multi-criteria decision-making (MCDM) methods. 
These methods are mainly represented in this review by studies employing the Preference Ranking Organisation Method for Enrichment Evaluation (PROMETHEE) and its associated graphic output, Geometrical Analysis for Interactive Aid (GAIA).

Relevance: 10.00%

Abstract:

An investigation into the effects of changes in urban traffic characteristics due to rapid urbanisation, and of predicted changes in rainfall characteristics due to climate change, on the build-up and wash-off of heavy metals was carried out in Gold Coast, Australia. The study sites encompassed three different urban land uses. Nine heavy metals commonly associated with traffic emissions were selected. The results were interpreted using multivariate data analysis and decision-making tools such as principal component analysis (PCA), fuzzy clustering (FC), PROMETHEE and GAIA. Initial analyses established high, low and moderate traffic scenarios, as well as low, low-to-moderate, moderate, high and extreme rainfall scenarios, for the build-up and wash-off investigations. GAIA analyses established that moderate to high traffic scenarios could affect build-up, while moderate to high rainfall scenarios could affect the wash-off of heavy metals under changed conditions. However, in wash-off, metal concentrations in the 1-75 µm fraction were found to be independent of the changes to rainfall characteristics. In build-up, high traffic activities in commercial and industrial areas influenced the accumulation of heavy metal concentrations in the particulate size range from 75 to >300 µm, whereas metal concentrations in the finer size range of <1-75 µm were not affected. As practical implications, solids <1 µm and organic matter from 1 to >300 µm can be targeted for removal of Ni, Cu, Pb, Cd, Cr and Zn from build-up, whilst organic matter from <1 to >300 µm can be targeted for removal of Cd, Cr, Pb and Ni from wash-off. Cu and Zn need to be removed as free ions from most fractions in wash-off.
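PROMETHEE ranks alternatives by pairwise preference flows; a minimal sketch with the "usual" preference function follows, where the site names, scores and criterion weights are assumptions for illustration only:

```python
def promethee_net_flows(scores, weights):
    """PROMETHEE net outranking flows with the 'usual' preference
    function: on each criterion, P(a, b) = 1 when a beats b, else 0.
    A higher net flow means a better-ranked alternative."""
    names = list(scores)
    n = len(names)
    flows = {}
    for a in names:
        phi = 0.0
        for b in names:
            if a == b:
                continue
            pref_ab = sum(w for sa, sb, w in
                          zip(scores[a], scores[b], weights) if sa > sb)
            pref_ba = sum(w for sa, sb, w in
                          zip(scores[a], scores[b], weights) if sb > sa)
            phi += pref_ab - pref_ba
        flows[a] = phi / (n - 1)
    return flows

# Hypothetical pollutant scores per site (higher = more contaminated)
scores = {"residential": [0.2, 0.1, 0.3],
          "commercial":  [0.6, 0.5, 0.4],
          "industrial":  [0.9, 0.8, 0.7]}
flows = promethee_net_flows(scores, weights=[0.5, 0.3, 0.2])
```

GAIA is then essentially a principal-component projection of these per-criterion flows, giving the graphical output used in the study to compare traffic and rainfall scenarios.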

Relevance: 10.00%

Abstract:

This chapter evaluates the rise of creative industries from four standpoints: the growing interest in creativity in the early 21st century; the 'culturalisation' of economic life with the rise of service industries; clustering and uneven development in the cultural economic geography of the creative industries; and the future of arts and cultural policy.

Relevance: 10.00%

Abstract:

After a brief personal orientation, this presentation offers an opening section on 'clash, cluster, complexity, cities' – making the case that innovation (both creative and economic) proceeds not only from incremental improvements within an expert-pipeline process, but also from the clash of different systems, generations, and cultures. The argument is that cultural complexity arises from such clashes, and that clustering is the solution to problems of complexity. The classic, 10,000-year-old, institutional form taken by such clusters is … cities. Hence, a creative city is one where clashing and competitive complexity is clustered… and, latterly, networked.

Relevance: 10.00%

Abstract:

Background: There has been a lack of investigation into the spatial distribution and clustering of suicide in Australia, where the population density is lower than in many countries and varies dramatically among urban, rural and remote areas. This study aims to examine the spatial distribution of suicide at the Local Government Area (LGA) level and to identify the LGAs with a high relative risk of suicide in Queensland, Australia, using geographical information system (GIS) techniques.

Methods: Data on suicide and demographic variables in each LGA between 1999 and 2003 were acquired from the Australian Bureau of Statistics. An age-standardised mortality (ASM) rate for suicide was calculated at the LGA level. GIS techniques were used to examine the geographical difference in suicide across areas.

Results: Far north and north-eastern Queensland (i.e., Cook and Mornington Shires) had the highest suicide incidence in both genders, while the south-western areas (i.e., Barcoo and Bauhinia Shires) had the lowest incidence in both genders. In different age groups (≤24 years, 25 to 44 years, 45 to 64 years, and ≥65 years), ASM rates of suicide varied with gender at the LGA level. Mornington and six other LGAs with low socioeconomic status in the upper Southeast had significant spatial clusters of high suicide risk.

Conclusions: There was a notable difference in ASM rates of suicide at the LGA level in Queensland. Some LGAs had significant spatial clusters of high suicide risk. The determinants of the geographical difference in suicide should be addressed in future research.
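The age-standardised mortality (ASM) rate used in the study is a weighted average of age-specific crude rates over a standard population (direct standardisation); the counts below are hypothetical, not taken from the study:

```python
def asm_rate(deaths, population, standard_pop):
    """Direct age standardisation: weight each age group's crude
    death rate by a standard population, expressed per 100,000
    persons, so areas with different age structures are comparable."""
    total_std = sum(standard_pop)
    rate = sum((d / p) * s for d, p, s in
               zip(deaths, population, standard_pop)) / total_std
    return rate * 100_000

# Hypothetical LGA, four age bands (<=24, 25-44, 45-64, >=65)
deaths = [2, 8, 6, 3]
population = [12000, 15000, 10000, 5000]
standard = [30000, 32000, 22000, 16000]
rate = asm_rate(deaths, population, standard)
```

Standardisation matters here because remote LGAs often have much younger populations than urban ones, so crude rates alone would confound age structure with geographic risk.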