890 results for Clustering analysis
Abstract:
Today’s evolving networks experience a large number of different attacks, ranging from system break-ins and infection by automated attack tools such as worms, viruses and trojan horses, to denial of service (DoS). One important aspect of such attacks is that they are often indiscriminate and target Internet addresses without regard to whether they are bona fide allocated or not. Due to the absence of any advertised host services, the traffic observed on unused IP addresses is by definition unsolicited and likely to be either opportunistic or malicious. The analysis of large repositories of such traffic can be used to extract useful information about both ongoing and new attack patterns and to unearth unusual attack behaviors. However, such an analysis is difficult due to the size and nature of the traffic collected on unused address spaces. In this dissertation, we present a network traffic analysis technique which uses traffic collected from unused address spaces and relies on the statistical properties of the collected traffic to accurately and quickly detect new and ongoing network anomalies. Detection of network anomalies is based on the concept that an anomalous activity usually transforms the network parameters in such a way that their statistical properties no longer remain constant, resulting in abrupt changes. In this dissertation, we use sequential analysis techniques to identify changes in the behavior of network traffic targeting unused address spaces in order to unveil both ongoing and new attack patterns. Specifically, we have developed a dynamic sliding-window based non-parametric cumulative sum (CUSUM) change detection technique for identifying changes in network traffic. Furthermore, we have introduced dynamic thresholds to detect changes in network traffic behavior and also to detect when a particular change has ended.
Experimental results are presented that demonstrate the operational effectiveness and efficiency of the proposed approach, using both synthetically generated datasets and real network traces collected from a dedicated block of unused IP addresses.
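The sliding-window non-parametric CUSUM idea described above can be sketched in a few lines. This is a minimal illustration under assumed parameters (a running-mean baseline, a fixed drift allowance and a fixed alarm threshold), not the dissertation's actual detector with dynamic thresholds:

```python
def cusum_detect(samples, threshold, drift=0.0):
    """Return the index at which an upward change is first flagged, or -1.

    Non-parametric CUSUM: S_n = max(0, S_{n-1} + x_n - baseline - drift),
    alarm when S_n exceeds the threshold.
    """
    if not samples:
        return -1
    baseline = samples[0]
    s = 0.0
    for i, x in enumerate(samples):
        # running mean serves as a simple distribution-free baseline estimate
        baseline += (x - baseline) / (i + 1)
        s = max(0.0, s + x - baseline - drift)
        if s > threshold:
            return i
    return -1
```

On a toy series with an abrupt level shift, the statistic accumulates past the threshold shortly after the change point, while a stationary series never raises an alarm.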
Abstract:
This report explains the objectives, datasets and evaluation criteria of both the clustering and classification tasks set in the INEX 2009 XML Mining track. The report also describes the approaches and results obtained by the different participants.
Abstract:
Confirmatory factor analyses were conducted to evaluate the factorial validity of the Toronto Alexithymia Scale (TAS-20) in an alcohol-dependent sample. Several factor models were examined, but all were rejected given their poor fit. A revision of the TAS-20 for alcohol-dependent populations may be needed.
Abstract:
Since the formal recognition of practice-led research in the 1990s, many higher research degree candidates in art, design and media have submitted creative works along with an accompanying written document or ‘exegesis’ for examination. Various models for the exegesis have been proposed in university guidelines and academic texts during the past decade, and students and supervisors have experimented with its contents and structure. With a substantial number of exegeses submitted and archived, it has now become possible to move beyond proposition to empirical analysis. In this article we present the findings of a content analysis of a large, local sample of submitted exegeses. We identify the emergence of a persistent pattern in the types of content included as well as overall structure. Besides an introduction and conclusion, this pattern includes three main parts, which can be summarized as situating concepts (conceptual definitions and theories); precedents of practice (traditions and exemplars in the field); and researcher’s creative practice (the creative process, the artifacts produced and their value as research). We argue that this model combines earlier approaches to the exegesis, which oscillated between academic objectivity, by providing a contextual framework for the practice, and personal reflexivity, by providing commentary on the creative practice. But this model is more than simply a hybrid: it provides a dual orientation, which allows the researcher to both situate their creative practice within a trajectory of research and do justice to its personally invested poetics. By performing the important function of connecting the practice and creative work to a wider emergent field, the model helps to support claims for a research contribution to the field. We call it a connective model of exegesis.
Abstract:
This technical report is concerned with one aspect of environmental monitoring—the detection and analysis of acoustic events in sound recordings of the environment. Sound recordings offer ecologists the potential advantages of cheaper and increased sampling. An acoustic event detection algorithm is introduced that outputs a compact rectangular marquee description of each event. It can disentangle superimposed events, which are a common occurrence during morning and evening choruses. Next, three uses to which acoustic event detection can be put are illustrated. These tasks have been selected because they illustrate quite different modes of analysis: (1) the detection of diffuse events caused by wind and rain, which are a frequent contaminant of recordings of the terrestrial environment; (2) the detection of bird calls using the spatial distribution of their component events; and (3) the preparation of acoustic maps for whole ecosystem analysis. This last task utilises the temporal distribution of events over a daily, monthly or yearly cycle.
Abstract:
Large deformation analysis is one of the major challenges in the numerical modelling and simulation of metal forming. Because no mesh is used, meshfree methods show good potential for large deformation analysis. In this paper, a local meshfree formulation, based on local weak-forms and the updated Lagrangian (UL) approach, is developed for large deformation analysis. To fully exploit the advantages of meshfree methods, a simple and effective adaptive technique is proposed; this procedure is much easier than re-meshing in FEM. Numerical examples of large deformation analysis are presented to demonstrate the effectiveness of the newly developed nonlinear meshfree approach. The developed meshfree technique has been found to provide superior performance to conventional FEM in dealing with large deformation problems in metal forming.
Abstract:
This report applies CCI’s creative trident methodology with the definition of the arts as established by the Australia Council for the Arts to data sourced from Australia’s national census data (from 1996, 2001 and the most recent one in 2006). Analysis has been conducted on employment, income, gender, age and the nature of employment for artists and arts related workers within and beyond the arts industries, as well as other support workers in the arts industries.
Abstract:
This paper examines the algebraic cryptanalysis of small scale variants of LEX-BES. LEX-BES is a stream cipher based on the Advanced Encryption Standard (AES) block cipher. LEX is a generic method for constructing a stream cipher from a block cipher, initially introduced by Biryukov at eSTREAM, the ECRYPT Stream Cipher project, in 2005. The Big Encryption System (BES) is a block cipher introduced at CRYPTO 2002 which facilitates the algebraic analysis of the AES block cipher. In this paper, experiments were conducted to find solutions of the equation systems describing small scale LEX-BES using Gröbner Basis computations. This follows a similar approach to the work by Cid, Murphy and Robshaw at FSE 2005, which investigated algebraic cryptanalysis of small scale variants of the BES. The difference between LEX-BES and BES is that, due to the way the keystream is extracted, the number of unknowns in the LEX-BES equations is smaller than in those of BES. To the best of the author's knowledge, this is the first attempt at creating solvable equation systems for stream ciphers based on the LEX method using Gröbner Basis computations.
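For intuition about the kind of polynomial system over GF(2) that Gröbner Basis methods solve algebraically, a brute-force toy solver illustrates the same problem. Exhaustive search is only feasible at toy sizes, and the equations below are hypothetical examples, not those of LEX-BES:

```python
from itertools import product

def solve_gf2_system(equations, n_vars):
    """Enumerate all GF(2) assignments that satisfy every equation.

    Each equation is a callable returning an integer; an assignment is a
    solution when every equation evaluates to 0 modulo 2. Gröbner Basis
    computations solve such systems algebraically instead of by search.
    """
    solutions = []
    for bits in product((0, 1), repeat=n_vars):
        if all(eq(bits) % 2 == 0 for eq in equations):
            solutions.append(bits)
    return solutions
```

For example, the toy system x0 + x1 = 0 and x0·x1 + x0 = 0 over GF(2) has exactly the solutions (0, 0) and (1, 1).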
Abstract:
In a resource constrained business world, strategic choices must be made on process improvement and service delivery. There are calls for more agile forms of enterprise, and much effort is being directed at moving organizations from a complex landscape of disparate application systems to an integrated and flexible enterprise accessing complex systems landscapes through service oriented architecture (SOA). This paper describes the deconstruction of an enterprise into business services using value chain analysis, since each element in the value chain can be rendered as a business service in the SOA. These business services are explicitly linked to the attainment of specific organizational strategies, and their contribution to the attainment of strategy is assessed and recorded. This contribution is then used to produce a rank ordering of business services by their contribution to strategy. This information facilitates executive decision making on which business service to develop into the SOA. The paper describes an application of this Critical Service Identification Methodology (CSIM) to a case study.
Abstract:
Insight into the unique structure of layered double hydroxides has been obtained using a combination of X-ray diffraction and thermal analysis. Indium containing hydrotalcites of formula Mg4In2(CO3)(OH)12•4H2O (2:1 In-LDH) through to Mg8In2(CO3)(OH)18•4H2O (4:1 In-LDH) with variation in the Mg:In ratio have been successfully synthesised. The d(003) spacing varied from 7.83 Å for the 2:1 LDH to 8.15 Å for the 3:1 indium containing layered double hydroxide. Distinct mass loss steps attributed to dehydration, dehydroxylation and decarbonation are observed for the indium containing hydrotalcite. Dehydration occurs over the temperature range ambient to 205 °C. Dehydroxylation takes place in a series of steps over the 238 to 277 °C temperature range. Decarbonation occurs between 763 and 795 °C. The dehydroxylation and decarbonation steps depend upon the Mg:In ratio. The formation of indium containing hydrotalcites and their thermal activation provides a method for the synthesis of indium oxide based catalysts.
Abstract:
To date, most applications of algebraic analysis and attacks on stream ciphers are on those based on linear feedback shift registers (LFSRs). In this paper, we extend algebraic analysis to non-LFSR based stream ciphers. Specifically, we perform an algebraic analysis of the RC4 family of stream ciphers, an example of stream ciphers based on dynamic tables, and investigate its implications for potential algebraic attacks on the cipher. This is, to our knowledge, the first paper that evaluates the security of RC4 against algebraic attacks by providing a full set of equations that describe the complex word manipulations in the system. For an arbitrary word size, we derive algebraic representations for the three main operations used in RC4, namely state extraction, word addition and state permutation. Equations relating the internal states and keystream of RC4 are then obtained from each component of the cipher based on these algebraic representations, and analysed in terms of their contributions to the security of RC4 against algebraic attacks. Interestingly, it is shown that each of the three main operations contained in the components has its own unique algebraic properties, and when their respective equations are combined, the resulting system becomes infeasible to solve. This results in a high level of security being achieved by RC4 against algebraic attacks. On the other hand, the removal of an operation from the cipher could compromise this security. Experiments on reduced versions of RC4 have been performed, which confirm the validity of our algebraic analysis and the conclusion that the full RC4 stream cipher seems to be immune to algebraic attacks at present.
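The three RC4 operations named above (state extraction, word addition and state permutation) appear directly in the standard RC4 algorithm, sketched here for the usual 8-bit word size. This is the textbook cipher itself, not the paper's equation system over it:

```python
def rc4_keystream(key, n_words, word_size=8):
    """Generate n_words of RC4 keystream from a key given as a list of ints.

    The table size is 2**word_size; word_size=8 is standard RC4.
    """
    n = 1 << word_size
    # Key-scheduling algorithm (KSA): initialise and key-dependently shuffle S
    S = list(range(n))
    j = 0
    for i in range(n):
        j = (j + S[i] + key[i % len(key)]) % n
        S[i], S[j] = S[j], S[i]
    # Pseudo-random generation algorithm (PRGA)
    i = j = 0
    out = []
    for _ in range(n_words):
        i = (i + 1) % n
        j = (j + S[i]) % n                 # word addition mod 2**word_size
        S[i], S[j] = S[j], S[i]            # state permutation (swap)
        out.append(S[(S[i] + S[j]) % n])   # state extraction
    return out
```

For the well-known test key "Key" (bytes 0x4B, 0x65, 0x79), the keystream begins 0xEB, 0x9F, 0x77.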
Abstract:
Spectrum sensing is considered one of the most important tasks in cognitive radio. Many sensing detectors have been proposed in the literature, with the common assumption that the primary user is either fully present or completely absent within the window of observation. In reality, there are scenarios where the primary user signal occupies only a fraction of the observed window. This paper analyses the effect of the primary user duty cycle on spectrum sensing performance through the analysis of a few common detectors. Simulations show that the probability of detection degrades severely with reduced duty cycle, regardless of the detection method. Furthermore, we show that reducing the duty cycle degrades performance more than lowering the signal strength does.
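The degradation with duty cycle can be reproduced with a toy Monte-Carlo energy detector. All parameters below (window length, threshold, signal amplitude, trial count) are illustrative assumptions, not those used in the paper:

```python
import random

def energy_detect(window, threshold):
    """Energy detector: alarm when average power over the window exceeds threshold."""
    return sum(x * x for x in window) / len(window) > threshold

def detection_probability(duty_cycle, amp, n_trials=500, n=200, threshold=1.2):
    """Fraction of windows flagged when the primary signal of amplitude `amp`
    occupies only `duty_cycle` of each observation window (unit-variance noise)."""
    rng = random.Random(0)  # fixed seed for reproducibility
    occupied = int(duty_cycle * n)
    hits = 0
    for _ in range(n_trials):
        window = [rng.gauss(0.0, 1.0) + (amp if k < occupied else 0.0)
                  for k in range(n)]
        if energy_detect(window, threshold):
            hits += 1
    return hits / n_trials
```

With these assumed settings, a signal present for the whole window is detected almost always, while the same signal at 20% duty cycle is missed in a large fraction of trials.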
Abstract:
In this thesis, the relationship between air pollution and human health has been investigated utilising a Geographic Information System (GIS) as an analysis tool. The research focused on how vehicular air pollution affects human health. The main objective of this study was to analyse the spatial variability of pollutants, taking Brisbane City in Australia as a case study, by identifying areas of high concentration of air pollutants and their relationship with the number of deaths caused by air pollutants. A correlation test was performed to establish the relationship between air pollution, the number of deaths from respiratory disease, and the total distance travelled by road vehicles in Brisbane. GIS was utilised to investigate the spatial distribution of the air pollutants. The main finding of this research is the comparison between spatial and non-spatial analysis approaches, which indicated that correlation analysis and simple GIS buffer analysis using the average levels of air pollutants from a single monitoring station, or from a group of monitoring stations, is a relatively simple method for assessing the health effects of air pollution. There was a significant positive correlation between the variables under consideration, and the research shows a decreasing trend in the concentration of nitrogen dioxide at the Eagle Farm and Springwood sites and an increasing trend at the CBD site. Statistical analysis shows a positive relationship between the level of emissions and the number of deaths, though the impact is not uniform, as certain sections of the population are more vulnerable to exposure. Further statistical tests found that elderly people over 75 years of age and children between 0 and 15 years of age are the most vulnerable to air pollution. A non-spatial approach alone may be insufficient for an appropriate evaluation of the impact of air pollutant variables and their inter-relationships.
It is important to evaluate the spatial features of air pollutants before modelling the air pollution-health relationships.
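The correlation test mentioned above is typically a Pearson product-moment correlation; the thesis does not specify its exact implementation, but a minimal self-contained version is:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series.

    Returns a value in [-1, 1]: +1 for a perfect increasing linear
    relationship, -1 for a perfect decreasing one.
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5
```

Applied to, say, annual vehicle-kilometres and respiratory deaths per region, a value near +1 would indicate the strong positive association the thesis reports.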
Abstract:
The high morbidity and mortality associated with atherosclerotic coronary vascular disease (CVD) and its complications are being lessened by the increased knowledge of risk factors, effective preventative measures and proven therapeutic interventions. However, significant CVD morbidity remains and sudden cardiac death continues to be a presenting feature for some subsequently diagnosed with CVD. Coronary vascular disease is also the leading cause of anaesthesia related complications. Stress electrocardiography/exercise testing is predictive of 10 year risk of CVD events and the cardiovascular variables used to score this test are monitored peri-operatively. Similar physiological time-series datasets are being subjected to data mining methods for the prediction of medical diagnoses and outcomes. This study aims to find predictors of CVD using anaesthesia time-series data and patient risk factor data. Several pre-processing and predictive data mining methods are applied to this data. Physiological time-series data related to anaesthetic procedures are subjected to pre-processing methods for removal of outliers, calculation of moving averages as well as data summarisation and data abstraction methods. Feature selection methods of both wrapper and filter types are applied to derived physiological time-series variable sets alone and to the same variables combined with risk factor variables. The ability of these methods to identify subsets of highly correlated but non-redundant variables is assessed. The major dataset is derived from the entire anaesthesia population and subsets of this population are considered to be at increased anaesthesia risk based on their need for more intensive monitoring (invasive haemodynamic monitoring and additional ECG leads). 
Because of the unbalanced class distribution in the data, majority class under-sampling and the Kappa statistic, together with misclassification rate and area under the ROC curve (AUC), are used for evaluation of models generated using different prediction algorithms. The performance of models derived from feature-reduced datasets reveals the filter method, Cfs subset evaluation, to be most consistently effective, although Consistency-derived subsets tended to slightly increase accuracy but markedly increase complexity. The use of misclassification rate (MR) for model performance evaluation is influenced by class distribution. This could be eliminated by consideration of the AUC or Kappa statistic, as well as by evaluation of subsets with an under-sampled majority class. The noise and outlier removal pre-processing methods produced models with MR ranging from 10.69 to 12.62, with the lowest value being for data from which both outliers and noise were removed (MR 10.69). For the raw time-series dataset, MR is 12.34. Feature selection reduces MR to between 9.8 and 10.16, with time segmented summary data (dataset F) having MR 9.8 and raw time-series summary data (dataset A) 9.92. However, for all time-series only based datasets, the complexity is high. For most pre-processing methods, Cfs could identify a subset of correlated and non-redundant variables from the time-series alone datasets, but models derived from these subsets are of one leaf only. MR values are consistent with the class distribution in the subset folds evaluated in the n-fold cross validation method. For models based on Cfs-selected time-series derived and risk factor (RF) variables, the MR ranges from 8.83 to 10.36, with dataset RF_A (raw time-series data and RF) being 8.85 and dataset RF_F (time segmented time-series variables and RF) being 9.09.
The models based on counts of outliers and counts of data points outside the normal range (Dataset RF_E), and on derived variables based on time series transformed using Symbolic Aggregate Approximation (SAX) with associated time-series pattern cluster membership (Dataset RF_G), perform the least well, with MR of 10.25 and 10.36 respectively. For coronary vascular disease prediction, nearest neighbour (NNge) and the support vector machine based method, SMO, have the highest MR of 10.1 and 10.28, while logistic regression (LR) and the decision tree (DT) method, J48, have MR of 8.85 and 9.0 respectively. DT rules are the most comprehensible and clinically relevant. The predictive accuracy increase achieved by adding risk factor variables to time-series variable based models is significant. The addition of time-series derived variables to models based on risk factor variables alone is associated with a trend towards improved performance. Data mining of feature-reduced anaesthesia time-series variables together with risk factor variables can produce compact and moderately accurate models able to predict coronary vascular disease. Decision tree analysis of time-series data combined with risk factor variables yields rules which are more accurate than models based on time-series data alone. The limited additional value provided by electrocardiographic variables compared to risk factors alone is similar to recent suggestions that exercise electrocardiography (exECG) under standardised conditions has limited additional diagnostic value over risk factor analysis and symptom pattern. The pre-processing used in this study had limited effect when time-series variables and risk factor variables are used together as model input.
In the absence of risk factor input, the use of time-series variables after outlier removal, and of time-series variables based on physiological values falling outside the accepted normal range, is associated with some improvement in model performance.
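The Kappa statistic used above alongside misclassification rate and AUC is Cohen's kappa, the agreement between predicted and true labels beyond what chance would produce. A minimal sketch (not the thesis's actual tooling) is:

```python
def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for two equal-length label sequences.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
    p_e is the agreement expected by chance from the marginal label rates.
    """
    n = len(y_true)
    labels = sorted(set(y_true) | set(y_pred))
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    p_e = sum((y_true.count(l) / n) * (y_pred.count(l) / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)
```

Unlike raw accuracy, kappa corrects for the chance agreement that inflates accuracy on unbalanced classes, which is why it is paired with majority-class under-sampling here: perfect agreement gives 1.0, chance-level agreement gives 0.0.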
Abstract:
Road agencies require comprehensive, relevant and high-quality data describing their road assets to support their investment decisions. An investment decision support system for road maintenance and rehabilitation mainly comprises three important supporting elements, namely: road asset data, decision support tools and criteria for decision-making. Probability-based methods have played a crucial role in helping decision makers understand the relationships among road related data, asset performance and the uncertainties in estimating budgets/costs for road management investment. This paper presents applications of the probability-based method for road asset management.