994 results for 019900 OTHER MATHEMATICAL SCIENCES
Abstract:
With the increasing number of XML documents in varied domains, it has become essential to identify ways of finding interesting information from these documents. Data mining techniques were used to derive this interesting information. Mining on XML documents is impacted by its model due to the semi-structured nature of these documents. Hence, in this chapter we present an overview of the various models of XML documents, how these models were used for mining and some of the issues and challenges in these models. In addition, this chapter also provides some insights into the future models of XML documents for effectively capturing the two important features namely structure and content of XML documents for mining.
Abstract:
The problem of steady subcritical free surface flow past a submerged inclined step is considered. The asymptotic limit of small Froude number is treated, with particular emphasis on the effect that changing the angle of the step face has on the surface waves. As demonstrated by Chapman & Vanden-Broeck (2006), the divergence of a power series expansion in powers of the square of the Froude number is caused by singularities in the analytic continuation of the free surface; for an inclined step, these singularities may correspond to either the corners or stagnation points of the step, or both, depending on the angle of incline. Stokes lines emanate from these singularities, and exponentially small waves are switched on at the point the Stokes lines intersect with the free surface. Our results suggest that for a certain range of step angles, two wavetrains are switched on, but the exponentially subdominant one is switched on first, leading to an intermediate wavetrain not previously noted. We extend these ideas to the problem of flow over a submerged bump or trench, again with inclined sides. This time there may be two, three or four active Stokes lines, depending on the inclination angles. We demonstrate how to construct a base topography such that wave contributions from separate Stokes lines are of equal magnitude but opposite phase, thus cancelling out. Our asymptotic results are complemented by numerical solutions to the fully nonlinear equations.
Abstract:
The quality assurance of stereotactic radiotherapy and radiosurgery treatments requires the use of small-field dose measurements that can be experimentally challenging. This study used Monte Carlo simulations to establish that PAGAT dosimetry gel can be used to provide accurate, high resolution, three-dimensional dose measurements of stereotactic radiotherapy fields. A small cylindrical container (4 cm height, 4.2 cm diameter) was filled with PAGAT gel, placed in the parietal region inside a CIRS head phantom, and irradiated with a 12 field stereotactic radiotherapy plan. The resulting three-dimensional dose measurement was read out using an optical CT scanner and compared with the treatment planning prediction of the dose delivered to the gel during the treatment. A BEAMnrc DOSXYZnrc simulation of this treatment was completed, to provide a standard against which the accuracy of the gel measurement could be gauged. The three dimensional dose distributions obtained from Monte Carlo and from the gel measurement were found to be in better agreement with each other than with the dose distribution provided by the treatment planning system's pencil beam calculation. Both sets of data showed close agreement with the treatment planning system's dose distribution through the centre of the irradiated volume and substantial disagreement with the treatment planning system at the penumbrae. The Monte Carlo calculations and gel measurements both indicated that the treated volume was up to 3 mm narrower, with steeper penumbrae and more variable out-of-field dose, than predicted by the treatment planning system. The Monte Carlo simulations allowed the accuracy of the PAGAT gel dosimeter to be verified in this case, allowing PAGAT gel to be utilised in the measurement of dose from stereotactic and other radiotherapy treatments, with greater confidence in the future.
Abstract:
The REMNANT/EMERGENCY Artlab was funded by the Australia Council InterArts ArtLab Program in 2010 and involves 22 months of rigorous research and experimentation in several countries. The process will be developed between a core transdisciplinary team of practicing media artists, designers and engineers where possible working in consultation and collaboration with local creatives at each venue. Our team asserts that today’s environmental crisis is underpinned by a deep cultural crisis - and so to get our ‘house in order’ we urgently need to create better and more powerful ‘images’ of what a ‘citizen-led’, sustainable world might be. This ArtLab’s core aim is therefore to begin to understand how to develop and create such ‘powerful images’.
Abstract:
Bioinformatics involves analyses of biological data such as DNA sequences, microarrays and protein-protein interaction (PPI) networks. Its two main objectives are the identification of genes or proteins and the prediction of their functions. Biological data often contain uncertain and imprecise information. Fuzzy theory provides useful tools to deal with this type of information, hence has played an important role in analyses of biological data. In this thesis, we aim to develop some new fuzzy techniques and apply them on DNA microarrays and PPI networks. We will focus on three problems: (1) clustering of microarrays; (2) identification of disease-associated genes in microarrays; and (3) identification of protein complexes in PPI networks. The first part of the thesis aims to detect, by the fuzzy C-means (FCM) method, clustering structures in DNA microarrays corrupted by noise. Because of the presence of noise, some clustering structures found in random data may not have any biological significance. In this part, we propose to combine the FCM with the empirical mode decomposition (EMD) for clustering microarray data. The purpose of EMD is to reduce, preferably to remove, the effect of noise, resulting in what is known as denoised data. We call this method the fuzzy C-means method with empirical mode decomposition (FCM-EMD). We applied this method on yeast and serum microarrays, and the silhouette values are used for assessment of the quality of clustering. The results indicate that the clustering structures of denoised data are more reasonable, implying that genes have tighter association with their clusters. Furthermore we found that the estimation of the fuzzy parameter m, which is a difficult step, can be avoided to some extent by analysing denoised microarray data. The second part aims to identify disease-associated genes from DNA microarray data which are generated under different conditions, e.g., patients and normal people. 
We developed a type-2 fuzzy membership (FM) function for identification of disease-associated genes. This approach is applied to diabetes and lung cancer data, and a comparison with the original FM test was carried out. Among the ten best-ranked genes of diabetes identified by the type-2 FM test, seven genes have been confirmed as diabetes-associated genes according to gene description information in Gene Bank and the published literature. An additional gene is further identified. Among the ten best-ranked genes identified in lung cancer data, seven are confirmed to be associated with lung cancer or its treatment. The type-2 FM-d values are significantly different, which makes the identifications more convincing than those of the original FM test. The third part of the thesis aims to identify protein complexes in large interaction networks. Identification of protein complexes is crucial to understand the principles of cellular organisation and to predict protein functions. In this part, we proposed a novel method which combines the fuzzy clustering method and interaction probability to identify the overlapping and non-overlapping community structures in PPI networks, then to detect protein complexes in these sub-networks. Our method is based on both the fuzzy relation model and the graph model. We applied the method on several PPI networks and compared it with a popular protein complex identification method, the clique percolation method. For the same data, we detected more protein complexes. We also applied our method on two social networks. The results showed that our method works well for detecting sub-networks and gives a reasonable understanding of these communities.
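The fuzzy C-means (FCM) step named in the abstract above can be sketched as follows. This is a minimal illustration of the standard FCM update rules only; it does not include the empirical mode decomposition (EMD) denoising or the type-2 membership test from the thesis. The function name, the `init_centers` parameter and the toy data in the usage note are assumptions made for the example.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, init_centers=None, seed=0):
    """Standard fuzzy C-means on a list of equal-length numeric tuples.

    m is the fuzzy parameter (m > 1). Returns (centers, memberships), where
    memberships[k][i] is the degree to which point k belongs to cluster i,
    computed against the centers of the final iteration before their last update.
    """
    rng = random.Random(seed)
    # Hypothetical convenience: start from given centers, else sample data points.
    centers = list(init_centers) if init_centers else rng.sample(points, n_clusters)
    dim = len(points[0])
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        memberships = []
        for p in points:
            dists = [
                max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
                for c in centers
            ]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for i in range(n_clusters):
            weights = [u[i] ** m for u in memberships]
            wsum = sum(weights)
            new_centers.append(
                tuple(
                    sum(w * p[d] for w, p in zip(weights, points)) / wsum
                    for d in range(dim)
                )
            )
        centers = new_centers
    return centers, memberships
```

For example, `fuzzy_c_means([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], 2, init_centers=[(0, 0), (10, 10)])` separates the two well-spaced groups; each row of the returned memberships sums to one, which is what distinguishes fuzzy from hard clustering.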
Abstract:
Complex networks have been studied extensively due to their relevance to many real-world systems such as the world-wide web, the internet, biological and social systems. During the past two decades, studies of such networks in different fields have produced many significant results concerning their structures, topological properties, and dynamics. Three well-known properties of complex networks are scale-free degree distribution, small-world effect and self-similarity. The search for additional meaningful properties and the relationships among these properties is an active area of current research. This thesis investigates a newer aspect of complex networks, namely their multifractality, which is an extension of the concept of self-similarity. The first part of the thesis aims to confirm that the study of properties of complex networks can be expanded to a wider field including more complex weighted networks. Those real networks that have been shown to possess the self-similarity property in the existing literature are all unweighted networks. We use the protein-protein interaction (PPI) networks as a key example to show that their weighted networks inherit the self-similarity from the original unweighted networks. Firstly, we confirm that the random sequential box-covering algorithm is an effective tool to compute the fractal dimension of complex networks. This is demonstrated on the Homo sapiens and E. coli PPI networks as well as their skeletons. Our results verify that the fractal dimension of the skeleton is smaller than that of the original network because the shortest distance between nodes is larger in the skeleton, hence for a fixed box-size more boxes will be needed to cover the skeleton. Then we adopt the iterative scoring method to generate weighted PPI networks of five species, namely Homo sapiens, E. coli, yeast, C. elegans and Arabidopsis Thaliana.
By using the random sequential box-covering algorithm, we calculate the fractal dimensions for both the original unweighted PPI networks and the generated weighted networks. The results show that self-similarity is still present in generated weighted PPI networks. This implication will be useful for our treatment of the networks in the third part of the thesis. The second part of the thesis aims to explore the multifractal behavior of different complex networks. Fractals such as the Cantor set, the Koch curve and the Sierpinski gasket are homogeneous since these fractals consist of a geometrical figure which repeats on an ever-reduced scale. Fractal analysis is a useful method for their study. However, real-world fractals are not homogeneous; there is rarely an identical motif repeated on all scales. Their singularity may vary on different subsets, implying that these objects are multifractal. Multifractal analysis is a useful way to systematically characterize the spatial heterogeneity of both theoretical and experimental fractal patterns. However, the tools for multifractal analysis of objects in Euclidean space are not suitable for complex networks. In this thesis, we propose a new box covering algorithm for multifractal analysis of complex networks. This algorithm is demonstrated in the computation of the generalized fractal dimensions of some theoretical networks, namely scale-free networks, small-world networks and random networks, and of one kind of real network, namely PPI networks of different species. Our main finding is the existence of multifractality in scale-free networks and PPI networks, while the multifractal behaviour is not confirmed for small-world networks and random networks. As another application, we generate gene interaction networks for patients and healthy people using the correlation coefficients between microarrays of different genes. Our results confirm the existence of multifractality in gene interaction networks.
This multifractal analysis then provides a potentially useful tool for gene clustering and identification. The third part of the thesis aims to investigate the topological properties of networks constructed from time series. Characterizing complicated dynamics from time series is a fundamental problem of continuing interest in a wide variety of fields. Recent works indicate that complex network theory can be a powerful tool to analyse time series. Many existing methods for transforming time series into complex networks share a common feature: they define the connectivity of a complex network by the mutual proximity of different parts (e.g., individual states, state vectors, or cycles) of a single trajectory. In this thesis, we propose a new method to construct networks of time series: we define nodes by vectors of a certain length in the time series, and weight of edges between any two nodes by the Euclidean distance between the corresponding two vectors. We apply this method to build networks for fractional Brownian motions, whose long-range dependence is characterised by their Hurst exponent. We verify the validity of this method by showing that time series with stronger correlation, hence larger Hurst exponent, tend to have smaller fractal dimension, hence smoother sample paths. We then construct networks via the technique of horizontal visibility graph (HVG), which has been widely used recently. We confirm a known linear relationship between the Hurst exponent of fractional Brownian motion and the fractal dimension of the corresponding HVG network. In the first application, we apply our newly developed box-covering algorithm to calculate the generalized fractal dimensions of the HVG networks of fractional Brownian motions as well as those for binomial cascades and five bacterial genomes. The results confirm the monoscaling of fractional Brownian motion and the multifractality of the rest. 
As an additional application, we discuss the resilience of networks constructed from time series via two different approaches: visibility graph and horizontal visibility graph. Our finding is that the degree distribution of VG networks of fractional Brownian motions is scale-free (i.e., having a power law) meaning that one needs to destroy a large percentage of nodes before the network collapses into isolated parts; while for HVG networks of fractional Brownian motions, the degree distribution has exponential tails, implying that HVG networks would not survive the same kind of attack.
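The horizontal visibility graph (HVG) construction mentioned in the abstract above can be sketched as follows. This uses the usual HVG criterion (two time points are linked when every value strictly between them is lower than both); it is an illustration only, not the thesis code, and the function name is an assumption.

```python
def horizontal_visibility_graph(series):
    """Return the HVG of a time series as a set of index pairs (i, j), i < j.

    Points i and j are connected when every value strictly between them
    is smaller than both series[i] and series[j].
    """
    edges = set()
    n = len(series)
    for i in range(n - 1):
        highest_between = float("-inf")  # tallest value seen strictly between i and j
        for j in range(i + 1, n):
            if highest_between < min(series[i], series[j]):
                edges.add((i, j))
            highest_between = max(highest_between, series[j])
            # Once a value at least as high as series[i] appears,
            # nothing further to the right is visible from i.
            if highest_between >= series[i]:
                break
    return edges
```

For the series `[3, 1, 2, 4]` this yields the edges (0, 1), (0, 2), (0, 3), (1, 2) and (2, 3). The degree sequence of such a graph is what the abstract's degree-distribution and resilience statements refer to.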
Abstract:
Mixture models are a flexible tool for unsupervised clustering that have found popularity in a vast array of research areas. In studies of medicine, the use of mixtures holds the potential to greatly enhance our understanding of patient responses through the identification of clinically meaningful clusters that, given the complexity of many data sources, may otherwise be intangible. Furthermore, when developed in the Bayesian framework, mixture models provide a natural means for capturing and propagating uncertainty in different aspects of a clustering solution, arguably resulting in richer analyses of the population under study. This thesis aims to investigate the use of Bayesian mixture models in analysing varied and detailed sources of patient information collected in the study of complex disease. The first aim of this thesis is to showcase the flexibility of mixture models in modelling markedly different types of data. In particular, we examine three common variants on the mixture model, namely, finite mixtures, Dirichlet Process mixtures and hidden Markov models. Beyond the development and application of these models to different sources of data, this thesis also focuses on modelling different aspects relating to uncertainty in clustering. Examples of clustering uncertainty considered are uncertainty in a patient’s true cluster membership and accounting for uncertainty in the true number of clusters present. Finally, this thesis aims to address and propose solutions to the task of comparing clustering solutions, whether this be comparing patients or observations assigned to different subgroups or comparing clustering solutions over multiple datasets. To address these aims, we consider a case study in Parkinson’s disease (PD), a complex and commonly diagnosed neurodegenerative disorder. In particular, two commonly collected sources of patient information are considered.
The first source of data concerns symptoms associated with PD, recorded using the Unified Parkinson’s Disease Rating Scale (UPDRS), and its analysis constitutes the first half of this thesis. The second half of this thesis is dedicated to the analysis of microelectrode recordings collected during Deep Brain Stimulation (DBS), a popular palliative treatment for advanced PD. Analysis of this second source of data centers on the problems of unsupervised detection and sorting of action potentials or "spikes" in recordings of multiple cell activity, providing valuable information on real time neural activity in the brain.
Abstract:
A novel m-ary tree based approach is presented to solve asset management decisions which are combinatorial in nature. The approach introduces a new dynamic constraint based control mechanism which is capable of excluding infeasible solutions from the solution space. The approach also provides a solution to the challenges with ordering of assets decisions.
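The general idea of searching an m-ary decision tree while a constraint excludes infeasible branches can be sketched as follows. The abstract does not describe the actual control mechanism, so everything here is an assumption for illustration: each asset has a list of option costs, the only constraint is a total budget, and the names `enumerate_feasible`, `option_costs` and `budget` are hypothetical.

```python
def enumerate_feasible(option_costs, budget):
    """Yield every tuple of option indices (one per asset) within the budget.

    option_costs[i] holds the cost of each option for asset i, so level i of
    the tree has len(option_costs[i]) children per node. A branch is cut as
    soon as even its cheapest completion would exceed the budget.
    """
    n = len(option_costs)
    # min_rest[i]: cheapest possible total cost of assets i..n-1.
    min_rest = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        min_rest[i] = min_rest[i + 1] + min(option_costs[i])

    def visit(level, chosen, spent):
        if level == n:
            yield tuple(chosen)
            return
        for k, cost in enumerate(option_costs[level]):
            # Assumed constraint check: prune infeasible subtrees early.
            if spent + cost + min_rest[level + 1] > budget:
                continue
            chosen.append(k)
            yield from visit(level + 1, chosen, spent + cost)
            chosen.pop()

    yield from visit(0, [], 0)
```

With `option_costs = [[0, 5], [0, 4], [0, 3]]` and `budget = 7`, five of the eight leaves are visited; the three over-budget combinations are never expanded to full depth.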
Abstract:
Sfinks is a shift register based stream cipher designed for hardware implementation. The initialisation state update function is different from the state update function used for keystream generation. We demonstrate state convergence during the initialisation process, even though the individual components used in the initialisation are one-to-one. However, the combination of these components is not one-to-one.
Abstract:
The paper presents the results of a study conducted to investigate indoor air quality within residential dwellings in Lao PDR. Results from PM10, CO, and NO2 measurements inside 167 dwellings in Lao PDR over a five month period (December 2005-April 2006) are discussed as a function of household characteristics and occupant activities. Extremely high PM10 and NO2 concentrations (12 h mean PM10 concentrations 1275 ± 98 μg m-3 and 1183 ± 99 μg m-3 in Vientiane and Bolikhamxay provinces, respectively; 12 h mean NO2 concentrations 1210 ± 94 μg m-3 and 561 ± 45 μg m-3 in Vientiane and Bolikhamxay, respectively) were measured within the dwellings. Correlations, ANOVA analysis (univariate and multivariate), and linear regression results suggest a substantial contribution from cooking and smoking. The PM10 concentrations were significantly higher in houses without a chimney compared to houses in which cooking occurred on a stove with a chimney. However, no significant differences in pollutant concentrations were observed as a function of cooking location. Furthermore, PM10 and NO2 concentrations were higher in houses in which smoking occurred, suggestive of a relationship between increased indoor concentrations and smoking (0.05 < p < 0.10). Resuspension of dust from soil floors was another significant source of PM10 inside the house (634 μg m-3, p < 0.05).
Abstract:
In this paper, a generic and flexible optimisation methodology is developed to represent, model, solve and analyse the iron ore supply chain system by integrating iron ore shipment, stockpiles and railing within a whole system. As a result, an integrated train-stockpile-ship timetable is created and optimised for improving the efficiency of the overall supply chain system. The proposed methodology provides better decision making on how to significantly improve rolling stock utilisation with the best cost-effectiveness ratio. Based on extensive computational experiments and analysis, insightful and quantitative advice is offered to iron ore mining industry practitioners. The proposed methodology contributes to the sustainability of the environment by reducing pollution due to better utilisation of transportation resources and fuel.
Abstract:
In this paper, we describe the main processes and operations in mining industries and present a comprehensive survey of operations research methodologies that have been applied over the last several decades. The literature review is classified into four main categories: mine design; mine production; mine transportation; and mine evaluation. Mine design models are further separated according to two main mining methods: open-pit and underground. Moreover, mine production models are subcategorised into two groups: ore mining and coal mining. Mine transportation models are further partitioned in accordance with fleet management, truck haulage and train scheduling. Mine evaluation models are further subdivided into four clusters in terms of mining method selection, quality control, financial risks and environmental protection. The main characteristics of four Australian commercial mining software packages are addressed and compared. This paper bridges the gaps in the literature and motivates researchers to develop more applicable, realistic and comprehensive operations research models and solution techniques that are directly linked with mining industries.
Abstract:
The paper investigates train scheduling problems in which prioritised trains and non-prioritised trains simultaneously traverse a single-line rail network. In this case, no-wait conditions arise because the prioritised trains such as express passenger trains should traverse continuously without any interruption. In comparison, non-prioritised trains such as freight trains are allowed to enter the next section immediately if possible or to remain in a section until the next section on the routing becomes available, which is thought of as a relaxation of no-wait conditions. With thorough analysis of the structural properties of the No-Wait Blocking Parallel-Machine Job-Shop-Scheduling (NWBPMJSS) problem that is originated in this research, an innovative generic constructive algorithm (called NWBPMJSS_Liu-Kozan) is proposed to construct the feasible train timetable in terms of a given order of trains. In particular, the proposed NWBPMJSS_Liu-Kozan constructive algorithm comprises several recursively-used sub-algorithms (i.e. Best-Starting-Time-Determination Procedure, Blocking-Time-Determination Procedure, Conflict-Checking Procedure, Conflict-Eliminating Procedure, Tune-up Procedure and Fine-tune Procedure) to guarantee feasibility by satisfying the blocking, no-wait, deadlock-free and conflict-free constraints. A two-stage hybrid heuristic algorithm (NWBPMJSS_Liu-Kozan-BIH) is developed by combining the NWBPMJSS_Liu-Kozan constructive algorithm and the Best-Insertion-Heuristic (BIH) algorithm to find the preferable train schedule in an efficient and economical way. Extensive computational experiments show that the proposed methodology is promising because it can be applied as a standard and fundamental toolbox for identifying, analysing, modelling and solving real-world scheduling problems.
Abstract:
In practice, parallel-machine job-shop scheduling (PMJSS) is very useful in the development of standard modelling approaches and generic solution techniques for many real-world scheduling problems. In this paper, based on the analysis of structural properties in an extended disjunctive graph model, a hybrid shifting bottleneck procedure (HSBP) algorithm combined with a Tabu Search metaheuristic algorithm is developed to deal with the PMJSS problem. The original-version SBP algorithm for job-shop scheduling (JSS) has been significantly improved to solve the PMJSS problem with four novelties: i) a topological-sequence algorithm is proposed to decompose the PMJSS problem into a set of single-machine scheduling (SMS) and/or parallel-machine scheduling (PMS) subproblems; ii) a modified Carlier algorithm based on the proposed lemmas and the proofs is developed to solve the SMS subproblem; iii) the Jackson rule is extended to solve the PMS subproblem; iv) a Tabu Search metaheuristic algorithm is embedded under the framework of SBP to optimise the JSS and PMJSS cases. The computational experiments show that the proposed HSBP is very efficient in solving the JSS and PMJSS problems.
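For orientation, the classical single-machine form of the Jackson rule referred to above (sequence jobs by earliest due date, which minimises the maximum lateness when all jobs are available at time zero) can be sketched as follows. This is the textbook rule only, not the paper's extension to parallel machines; the function name and the job tuple layout are assumptions.

```python
def jackson_edd(jobs):
    """Sequence jobs on one machine by earliest due date (Jackson's rule).

    jobs is a list of (name, processing_time, due_date) tuples, all available
    at time zero. Returns (sequence_of_names, maximum_lateness).
    """
    order = sorted(jobs, key=lambda job: job[2])  # earliest due date first
    clock = 0
    max_lateness = float("-inf")
    for _name, processing_time, due_date in order:
        clock += processing_time  # completion time of this job
        max_lateness = max(max_lateness, clock - due_date)
    return [job[0] for job in order], max_lateness
```

For the jobs `[("A", 3, 10), ("B", 2, 4), ("C", 4, 7)]` the rule gives the sequence B, C, A with a maximum lateness of -1, meaning every job finishes at least one time unit early.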
Abstract:
This research deals with an innovative methodology for optimising the coal train scheduling problem. Based on our previously published work, generic solution techniques are developed by utilising a “toolbox” of well-solved standard scheduling problems. According to our analysis, the coal train scheduling problem can be basically modelled as a Blocking Parallel-Machine Job-Shop Scheduling (BPMJSS) problem with some minor constraints. To construct the feasible train schedules, an innovative constructive algorithm called the SLEK algorithm is proposed. To optimise the train schedule, a three-stage hybrid algorithm called the SLEK-BIH-TS algorithm is developed based on the definition of a sophisticated neighbourhood structure under the mechanism of the Best-Insertion-Heuristic (BIH) algorithm and Tabu Search (TS) metaheuristic algorithm. A case study is performed for optimising a complex real-world coal rail system in Australia. A method to calculate the lower bound of the makespan is proposed to evaluate results. The results indicate that the proposed methodology is promising for finding the optimal or near-optimal feasible train timetables of a coal rail system under network and terminal capacity constraints.