169 results for Labelled graphs


Relevance: 10.00%

Abstract:

Kernel-based learning algorithms work by embedding the data into a Euclidean space, and then searching for linear relations among the embedded data points. The embedding is performed implicitly, by specifying the inner products between each pair of points in the embedding space. This information is contained in the so-called kernel matrix, a symmetric and positive semidefinite matrix that encodes the relative positions of all points. Specifying this matrix amounts to specifying the geometry of the embedding space and inducing a notion of similarity in the input space -- classical model selection problems in machine learning. In this paper we show how the kernel matrix can be learned from data via semidefinite programming (SDP) techniques. When applied to a kernel matrix associated with both training and test data this gives a powerful transductive algorithm -- using the labelled part of the data one can learn an embedding also for the unlabelled part. The similarity between test points is inferred from training points and their labels. Importantly, these learning problems are convex, so we obtain a method for learning both the model class and the function without local minima. Furthermore, this approach leads directly to a convex method to learn the 2-norm soft margin parameter in support vector machines, solving another important open problem. Finally, the novel approach presented in the paper is supported by positive empirical results.
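As a rough illustration of the kind of optimisation the abstract describes, the sketch below learns a linear combination of fixed base kernels by maximising alignment with the training labels, subject to the combined matrix being positive semidefinite with a fixed trace, using cvxpy. The toy data, the choice of RBF base kernels, and the alignment objective are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal sketch (not the paper's full algorithm): learn weights for a
# linear combination of base kernels via an SDP that maximises alignment
# <K, yy^T>, with K constrained PSD and trace-normalised.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))                       # toy inputs
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=20))   # toy labels

def rbf(X, gamma):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

kernels = [rbf(X, g) for g in (0.1, 1.0, 10.0)]    # fixed base kernels
Y = np.outer(y, y)                                 # label "kernel" yy^T

mu = cp.Variable(len(kernels))                     # combination weights
K = sum(mu[i] * kernels[i] for i in range(len(kernels)))
prob = cp.Problem(cp.Maximize(cp.trace(Y @ K)),
                  [K >> 0,                          # PSD: valid kernel
                   cp.trace(K) == len(y)])          # trace budget
prob.solve()
print("learned kernel weights:", mu.value)
```

In the transductive setting the paper targets, the base kernels would cover both training and test points, so the learned K also embeds the unlabelled part.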

Relevance: 10.00%

Abstract:

Dashboards are expected to improve decision making by amplifying cognition and capitalizing on human perceptual capabilities. Hence, interest in dashboards has increased recently, which is also evident from the proliferation of dashboard solution providers in the market. Despite dashboards' popularity, little is known about the extent of their effectiveness, i.e. what types of dashboards work best for different users or tasks. In this paper, we conduct a comprehensive multidisciplinary literature review with the aim of identifying the critical issues organizations might need to consider when implementing dashboards. Dashboards are likely to succeed and solve the problems of presentation format and information load when certain visualization principles and features are present (e.g. a high data-ink ratio and drill-down features). We recommend that dashboards come with some level of flexibility, i.e. allowing users to switch between alternative presentation formats. Also, some theory-driven guidance through pop-ups and warnings can help users to select an appropriate presentation format. Given the dearth of research on dashboards, we conclude the paper with a research agenda that could guide future studies in this area.

Relevance: 10.00%

Abstract:

Continuous user authentication with keystroke dynamics uses character sequences as features. Since users can type characters in any order, it is imperative to find character sequences (n-graphs) that are representative of user typing behavior. Contemporary feature selection approaches do not guarantee selecting frequently typed features, which may result in a less accurate statistical representation of the user. Furthermore, the selected features do not inherently reflect user typing behavior. We propose four statistics-based feature selection techniques that mitigate the limitations of existing approaches. The first technique selects the most frequently occurring features. The other three consider different user typing behaviors by selecting: n-graphs that are typed quickly; n-graphs that are typed with consistent time; and n-graphs that have large time variance among users. We use Gunetti's keystroke dataset and the k-means clustering algorithm for our experiments. The results show that among the proposed techniques, the most-frequent feature selection technique can effectively find user-representative features. We further substantiate our results by comparing the most-frequent feature selection technique with three existing approaches (popular Italian words, common n-graphs, and least frequent n-graphs). We find that it performs better than the existing approaches after selecting a certain number of most-frequent n-graphs.
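A minimal sketch of the "most-frequent" selection idea follows: extract digraph (2-graph) latencies from a keystroke stream and keep only the n-graphs with the most observations as features. The toy events and function names are illustrative assumptions, not the paper's implementation.

```python
# Sketch: select the most frequently typed digraphs as features and
# summarise each by its mean latency. Toy data only.
from collections import defaultdict

def digraph_latencies(events):
    """events: list of (key, press_time_ms) in typing order."""
    timings = defaultdict(list)
    for (k1, t1), (k2, t2) in zip(events, events[1:]):
        timings[k1 + k2].append(t2 - t1)   # latency of the digraph k1k2
    return timings

def most_frequent_features(timings, n_features):
    """Keep the n-graphs with the most observations; return mean latencies."""
    top = sorted(timings, key=lambda g: len(timings[g]), reverse=True)[:n_features]
    return {g: sum(timings[g]) / len(timings[g]) for g in top}

events = [("t", 0), ("h", 95), ("e", 180), ("t", 400), ("h", 490),
          ("e", 570), ("r", 660), ("t", 900), ("h", 1010)]
print(most_frequent_features(digraph_latencies(events), 2))
# {'th': 98.33..., 'he': 82.5} -- the user's most representative digraphs
```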

Relevance: 10.00%

Abstract:

Expected satiety has been shown to play a key role in decisions around meal size. Recently it has become clear that these expectations can also influence the satiety that is experienced after a food has been consumed. As such, increasing the expected and actual satiety a food product confers without increasing its caloric content is of importance. In this study we sought to determine whether this could be achieved via product labelling. Female participants (N=75) were given a 223-kcal yoghurt smoothie for lunch. In separate conditions the smoothie was labelled as a diet brand, a highly-satiating brand, or an ‘own brand’ control. Expected satiety was assessed using rating scales and a computer-based ‘method of adjustment’, both prior to consuming the smoothie and 24 hours later. Hunger and fullness were assessed at baseline, immediately after consuming the smoothie, and for a further three hours. Despite the fact that all participants consumed the same food, the smoothie branded as highly-satiating was consistently expected to deliver more satiety than the other ‘brands’; this difference was sustained 24 hours after consumption. Furthermore, post-consumption and over three hours, participants consuming this smoothie reported significantly less hunger and significantly greater fullness. These findings demonstrate that the satiety that a product confers depends in part on information that is present around the time of consumption. We suspect that this process is mediated by changes to expected satiety. These effects may potentially be utilised in the development of successful weight-management products.

Relevance: 10.00%

Abstract:

The management of models over time in many domains requires different constraints to apply to some parts of the model as it evolves. Using EMF and its meta-language Ecore, the development of model management code and tools usually relies on the metamodel having some constraints, such as attribute and reference cardinalities and changeability, set in the least constrained way that any model user will require. Stronger versions of these constraints can then be enforced in code, or by attaching additional constraint expressions, and their evaluation engines, to the generated model code. We propose a mechanism that allows for variations to the constraining meta-attributes of metamodels, to allow enforcement of different constraints at different lifecycle stages of a model. We then discuss the implementation choices within EMF to support the validation of a state-specific metamodel on model graphs when changing states, as well as the enforcement of state-specific constraints when executing model change operations.
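EMF and Ecore are Java-based; purely as a language-agnostic sketch of the core idea, the snippet below attaches different cardinality and changeability meta-attributes to the same feature in different lifecycle states, validating both change operations and state transitions against the current state's constraints. All names and the two-state lifecycle are hypothetical, not EMF's API.

```python
# Sketch: state-specific constraining meta-attributes for one feature.
CONSTRAINTS = {
    # state -> feature -> (lower bound, upper bound, changeable)
    "draft":     {"author": (0, 1, True)},   # least constrained form
    "published": {"author": (1, 1, False)},  # stronger: required, frozen
}

class Model:
    def __init__(self, state="draft"):
        self.state, self.features = state, {"author": []}

    def set_feature(self, name, values):
        lo, hi, changeable = CONSTRAINTS[self.state][name]
        if not changeable:
            raise ValueError(f"{name} is not changeable in state {self.state!r}")
        if not lo <= len(values) <= hi:
            raise ValueError(f"{name} cardinality must be in [{lo}, {hi}]")
        self.features[name] = values

    def transition(self, new_state):
        # Validate the whole model against the target state's metamodel.
        for name, (lo, hi, _) in CONSTRAINTS[new_state].items():
            if not lo <= len(self.features[name]) <= hi:
                raise ValueError(f"model violates {new_state!r} constraints")
        self.state = new_state

m = Model()
m.set_feature("author", ["alice"])
m.transition("published")   # ok; further set_feature calls now fail
```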

Relevance: 10.00%

Abstract:

The research objectives of this thesis were to contribute to Bayesian statistical methodology, specifically to risk assessment methodology and to spatial and spatio-temporal methodology, by modelling error structures using complex hierarchical models. Specifically, I hoped to consider two applied areas, and use these applications as a springboard for developing new statistical methods as well as undertaking analyses which might give answers to particular applied questions. Thus, this thesis considers a series of models, firstly in the context of risk assessments for recycled water, and secondly in the context of water usage by crops. The research objective was to model error structures using hierarchical models in two problems: risk assessment analyses for wastewater, and a four-dimensional dataset assessing differences between cropping systems over time and over three spatial dimensions. The aim was to use the simplicity and insight afforded by Bayesian networks to develop appropriate models for risk scenarios, and again to use Bayesian hierarchical models to explore the necessarily complex modelling of four-dimensional agricultural data. The specific objectives of the research were: to develop a method for the calculation of credible intervals for the point estimates of Bayesian networks; to develop a model structure to incorporate all the experimental uncertainty associated with various constants, thereby allowing the calculation of more credible credible intervals for a risk assessment; to model a single day's data from the agricultural dataset in a way that satisfactorily captured the complexities of the data; to build a model for several days' data, in order to consider how the full data might be modelled; and finally to build a model for the full four-dimensional dataset and to consider the time-varying nature of the contrast of interest, having satisfactorily accounted for possible spatial and temporal autocorrelations. This work forms five papers, two of which have been published, with two submitted, and the final paper still in draft.

The first two objectives were met by recasting the risk assessments as directed acyclic graphs (DAGs). In the first case, we elicited uncertainty for the conditional probabilities needed by the Bayesian net, incorporated these into a corresponding DAG, and used Markov chain Monte Carlo (MCMC) to find credible intervals for all the scenarios and outcomes of interest. In the second case, we incorporated the experimental data underlying the risk assessment constants into the DAG, and also treated some of that data as needing to be modelled as an 'errors-in-variables' problem [Fuller, 1987]. This illustrated a simple method for the incorporation of experimental error into risk assessments.

In considering one day of the three-dimensional agricultural data, it became clear that geostatistical models or conditional autoregressive (CAR) models over the three dimensions were not the best way to approach the data. Instead, CAR models were used with neighbours only in the same depth layer. This gave flexibility to the model, allowing both the spatially structured and non-structured variances to differ at all depths. We call this model the CAR layered model. Given the experimental design, the fixed part of the model could have been modelled as a set of means by treatment and by depth, but doing so allows little insight into how the treatment effects vary with depth. Hence, a number of essentially non-parametric approaches were taken to see the effects of depth on treatment, with the model of choice incorporating an errors-in-variables approach for depth in addition to a non-parametric smooth. The statistical contribution here was the introduction of the CAR layered model; the applied contribution was the analysis of moisture over depth and the estimation of the contrast of interest, together with its credible intervals. These models were fitted using WinBUGS [Lunn et al., 2000].

The work in the fifth paper deals with the fact that with large datasets, the use of WinBUGS becomes more problematic because of its highly correlated term-by-term updating. In this work, we introduce a Gibbs sampler with block updating for the CAR layered model. The Gibbs sampler was implemented by Chris Strickland using pyMCMC [Strickland, 2010]. This framework is then used to consider five days' data, and we show that moisture in the soil for all the various treatments reaches levels particular to each treatment at a depth of 200 cm and thereafter stays constant, albeit with increasing variances with depth.

In an analysis across three spatial dimensions and across time, there are many interactions of time and the spatial dimensions to be considered. Hence, we chose to use a daily model and to repeat the analysis at all time points, effectively creating an interaction model of time by the daily model. Such an approach allows great flexibility. However, this approach does not allow insight into the way in which the parameter of interest varies over time. Hence, a two-stage approach was also used, with estimates from the first stage being analysed as a set of time series. We see this spatio-temporal interaction model as being a useful approach to data measured across three spatial dimensions and time, since it does not assume additivity of the random spatial or temporal effects.
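As a conceptual sketch of the first objective only (credible intervals for the point estimates of a Bayesian network), the snippet below replaces fixed conditional probabilities with elicited Beta distributions and propagates samples through a toy two-node net. The thesis itself used MCMC in WinBUGS; this simple forward Monte Carlo, and the contamination-illness net and priors, are illustrative assumptions.

```python
# Sketch: uncertainty on a Bayesian net's conditional probabilities
# induces a distribution on the net's point estimate, from which a
# credible interval can be read off. Toy net and priors only.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
p_contaminated = rng.beta(2, 18, n)     # elicited uncertainty, mean ~0.10
p_ill_given_c  = rng.beta(6, 14, n)     # mean ~0.30
p_ill_given_nc = rng.beta(1, 99, n)     # mean ~0.01

# Marginal risk of illness for each sampled parameter set.
p_ill = p_contaminated * p_ill_given_c + (1 - p_contaminated) * p_ill_given_nc

lo, hi = np.percentile(p_ill, [2.5, 97.5])
print(f"risk estimate {p_ill.mean():.4f}, "
      f"95% credible interval ({lo:.4f}, {hi:.4f})")
```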

Relevance: 10.00%

Abstract:

Defence organisations perform information security evaluations to confirm that electronic communications devices are safe to use in security-critical situations. Such evaluations include tracing all possible dataflow paths through the device, but this process is tedious and error-prone, so automated reachability analysis tools are needed to make security evaluations faster and more accurate. Previous research has produced a tool, SIFA, for dataflow analysis of basic digital circuitry, but it cannot analyse dataflow through microprocessors embedded within the circuit, since this depends on the software they run. We have developed a static analysis tool that produces SIFA-compatible dataflow graphs from embedded microcontroller programs written in C. In this paper we present a case study which shows how this new capability supports combined hardware and software dataflow analyses of a security-critical communications device.
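A minimal sketch of the reachability question such tools automate: model the device as a directed dataflow graph and ask which components data from a given source can reach. The graph and node names below are illustrative, not SIFA's actual input format.

```python
# Sketch: breadth-first reachability over a dataflow graph.
from collections import deque

def reachable(graph, source):
    """graph: dict mapping node -> iterable of successor nodes."""
    seen, queue = {source}, deque([source])
    while queue:
        for nxt in graph.get(queue.popleft(), ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

dataflow = {
    "keypad": ["mcu"],
    "mcu": ["crypto", "status_led"],   # edges from circuit + firmware analysis
    "crypto": ["radio"],
}
print(reachable(dataflow, "keypad"))   # every place keypad data can flow
```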

Relevance: 10.00%

Abstract:

Data flow analysis techniques can be used to help assess threats to data confidentiality and integrity in security-critical program code. However, a fundamental weakness of static analysis techniques is that they overestimate the ways in which data may propagate at run time. Discounting large numbers of these false-positive data flow paths wastes an information security evaluator's time and effort. Here we show how to automatically eliminate some false-positive data flow paths by precisely modelling how classified data is blocked by certain expressions in embedded C code. We present a library of detailed data flow models of individual expression elements and an algorithm for introducing these components into conventional data flow graphs. The resulting models can be used to accurately trace byte-level or even bit-level data flow through expressions that are normally treated as atomic. This allows us to identify expressions that safely downgrade their classified inputs and thereby eliminate false-positive data flow paths from the security evaluation process. To validate the approach we have implemented and tested it in an existing data flow analysis toolkit.
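As a toy illustration of the bit-level blocking idea (a simplification, not the paper's component library), the sketch below propagates a taint mask through two C expression elements, showing how `x & 0x0F` provably clears the classified high bits, so a path that only ever exposes those high bits is a false positive.

```python
# Sketch: track which bits of a value may still carry classified data
# as it passes through individual expression elements.

def taint_and(taint, const):
    return taint & const    # AND with a constant clears the masked-out bits

def taint_shr(taint, k):
    return taint >> k       # right shift discards the low k bits

classified = 0xFF           # all eight bits of the input are classified
after_mask = taint_and(classified, 0x0F)
print(f"after x & 0x0F: tainted bits {after_mask:#04x}")            # 0x0f
after_shift = taint_shr(after_mask, 4)
print(f"after (x & 0x0F) >> 4: tainted bits {after_shift:#04x}")    # 0x00
# The second result carries no classified bits: a safe downgrade.
```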

Relevance: 10.00%

Abstract:

This paper presents a method for automatic terrain classification, using a cheap monocular camera in conjunction with a robot’s stall sensor. A first step is to have the robot generate a training set of labelled images. Several techniques are then evaluated for preprocessing the images, reducing their dimensionality, and building a classifier. Finally, the classifier is implemented and used online by an indoor robot. Results are presented, demonstrating an increased level of autonomy.
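A compact, hedged sketch of such a pipeline appears below: training "images" labelled via the stall sensor (stalled = obstacle, moving = traversable), PCA for dimensionality reduction, and a nearest-neighbour classifier. The random arrays stand in for real camera frames, and all parameters are illustrative; the paper evaluates several alternatives for each stage.

```python
# Sketch: PCA (via SVD) + 1-nearest-neighbour terrain classification.
import numpy as np

rng = np.random.default_rng(2)
# 40 flattened training "images", two terrain classes shifted apart.
X = np.vstack([rng.normal(0, 1, (20, 64)), rng.normal(2, 1, (20, 64))])
y = np.array([0] * 20 + [1] * 20)      # 0 = traversable, 1 = obstacle

mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
project = lambda A: (A - mean) @ Vt[:5].T   # top-5 principal components

def classify(image):
    z = project(image[None, :])
    return y[np.linalg.norm(project(X) - z, axis=1).argmin()]   # 1-NN

print(classify(rng.normal(2, 1, 64)))   # likely 1 (obstacle-like terrain)
```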

Relevance: 10.00%

Abstract:

With the growing number of XML documents on the Web it becomes essential to effectively organise these XML documents in order to retrieve useful information from them. A possible solution is to apply clustering to the XML documents to discover knowledge that promotes effective data management, information retrieval and query processing. However, many issues arise in discovering knowledge from these types of semi-structured documents due to their heterogeneity and structural irregularity. Most of the existing research on clustering techniques focuses on only one feature of the XML documents, this being either their structure or their content, due to scalability and complexity problems. The knowledge gained in the form of clusters based on the structure or the content alone is not suitable for real-life datasets. It therefore becomes essential to include both the structure and content of XML documents in order to improve the accuracy and meaning of the clustering solution. However, the inclusion of both these kinds of information in the clustering process results in a huge overhead for the underlying clustering algorithm because of the high dimensionality of the data. The overall objective of this thesis is to address these issues by: (1) proposing methods to utilise frequent pattern mining techniques to reduce the dimension; (2) developing models to effectively combine the structure and content of XML documents; and (3) utilising the proposed models in clustering.

This research first determines the structural similarity in the form of frequent subtrees and then uses these frequent subtrees to represent the constrained content of the XML documents in order to determine the content similarity. A clustering framework with two types of models, implicit and explicit, is developed. The implicit model uses a Vector Space Model (VSM) to combine the structure and the content information. The explicit model uses a higher-order model, namely a 3-order Tensor Space Model (TSM), to explicitly combine the structure and the content information. This thesis also proposes a novel incremental technique to decompose large-sized tensor models and utilise the decomposed solution for clustering the XML documents.

The proposed framework and its components were extensively evaluated on several real-life datasets exhibiting extreme characteristics to understand the usefulness of the proposed framework in real-life situations. Additionally, this research evaluates the outcome of the clustering process on the collection selection problem in information retrieval on the Wikipedia dataset. The experimental results demonstrate that the proposed frequent pattern mining and clustering methods outperform the related state-of-the-art approaches. In particular, the proposed framework of utilising frequent structures for constraining the content shows an improvement in accuracy over content-only and structure-only clustering results. The scalability evaluation experiments conducted on large-scale datasets clearly show the strengths of the proposed methods over state-of-the-art methods. In particular, this thesis contributes to effectively combining the structure and the content of XML documents for clustering, in order to improve the accuracy of the clustering solution. In addition, it contributes by addressing the research gaps in frequent pattern mining to generate efficient and concise frequent subtrees with various node relationships that could be used in clustering.
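As a simplified sketch of the implicit (VSM) model only, the snippet below represents each document by a single vector that concatenates structure features (presence of frequent subtrees) with content terms drawn from those subtrees; the documents, subtree paths and vocabulary are illustrative stand-ins, and the real framework mines the frequent subtrees rather than listing them.

```python
# Sketch: one combined structure+content vector per XML document,
# ready for k-means or any vector-space clustering.
import numpy as np

FREQUENT_SUBTREES = ["book/title", "book/author", "article/abstract"]
VOCABULARY = ["xml", "mining", "cluster"]

docs = [
    {"subtrees": {"book/title", "book/author"}, "terms": {"xml": 2}},
    {"subtrees": {"article/abstract"}, "terms": {"mining": 1, "cluster": 3}},
]

def vectorise(doc):
    structure = [1.0 if s in doc["subtrees"] else 0.0 for s in FREQUENT_SUBTREES]
    content = [float(doc["terms"].get(t, 0)) for t in VOCABULARY]
    return np.array(structure + content)   # concatenated VSM vector

vectors = np.stack([vectorise(d) for d in docs])
print(vectors)
```

The explicit model instead keeps structure and content in separate modes of a 3-order tensor rather than concatenating them into one vector.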

Relevance: 10.00%

Abstract:

The Link the Wiki track at INEX 2008 offered two tasks: file-to-file link discovery and anchor-to-BEP link discovery. In the former, 6600 topics were used; in the latter, 50. Manual assessment of the anchor-to-BEP runs was performed using a tool developed for the purpose. Runs were evaluated using standard precision and recall measures such as MAP and precision/recall graphs. Ten groups participated, and the approaches they took are discussed. Final evaluation results for all runs are presented.
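For concreteness, a small sketch of the MAP measure used in such evaluations follows: average precision per topic over a run's ranked links against the relevance judgements, then the mean across topics. The topic identifiers, runs and judgements are illustrative.

```python
# Sketch: mean average precision (MAP) over a set of topics.
def average_precision(ranked, relevant):
    hits, total = 0, 0.0
    for i, link in enumerate(ranked, start=1):
        if link in relevant:
            hits += 1
            total += hits / i          # precision at each relevant rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs, qrels):
    return sum(average_precision(runs[t], qrels[t]) for t in qrels) / len(qrels)

qrels = {"topic1": {"A", "C"}, "topic2": {"B"}}            # judgements
runs = {"topic1": ["A", "B", "C"], "topic2": ["B", "A"]}   # ranked links
print(f"MAP = {mean_average_precision(runs, qrels):.3f}")  # MAP = 0.917
```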

Relevance: 10.00%

Abstract:

Philanthropic foundations in Australia have traditionally been labelled ‘icebergs’. Much of what they do and who they are is not apparent on the surface. Many are unknown and apart from an occasional biography, almost all are sparsely documented in terms of the very personal decisions behind establishing them. Practically and academically, scant data exist on the decision journeys people make into formalised philanthropy. This study seeks to fill that gap. It is believed to be the largest such study of foundation decision-making ever undertaken in this country. It is the latest in a series of ACPNS research into types of considered (versus spontaneous) giving in Australia. This research has been supported by the Perpetual Foundation, the EF and SL Gluyas Trust and the Edward Corbould Charitable Trust under the management of Perpetual Trustee Company Ltd.

Relevance: 10.00%

Abstract:

Prostate cancer (CaP) is the most commonly diagnosed cancer in males in Australia, North America, and Europe. If found early and locally confined, CaP can be treated with radical prostatectomy or radiation therapy; however, 25-40% of patients will relapse and go on to advanced disease. The most common therapy in these cases is androgen deprivation therapy (ADT), which suppresses androgen production from the testis. Lack of the testicular androgen supply causes cells of the prostate to undergo apoptosis. However, in some cases the regression initially seen with ADT eventually gives way to the growth of a population of cancerous cells that no longer require testicular androgens. This phenotype is essentially fatal and is termed castrate-resistant prostate cancer (CRPC). In addition to eventual regression, many undesirable side effects accompany ADT, including development of a metabolic syndrome, which is defined by the U.S. National Library of Medicine as "a combination of medical disorders that increase the risk of developing cardiovascular disease and diabetes." This project focuses on the effect of ADT-induced hyperinsulinemia, as mimicked by treating androgen receptor positive CaP cells with insulin in a serum (hormone) deprived environment. While this side effect is not widely explored, in this thesis it is demonstrated for the first time that insulin upregulates pathways important to CaP progression. Our group has previously shown that during CaP progression, the enzymes necessary for de novo steroidogenesis are upregulated in the LNCaP xenograft model, total steroid levels are increased in tumours compared to pre-castrate levels, and de novo steroidogenesis from radio-labelled acetate has been demonstrated. Because of CaP's dependence on the AR for survival, we and other groups believe that CaP cells carry out de novo steroidogenesis to survive in androgen deprived conditions. Because (a) men on ADT often develop metabolic syndrome, (b) men with lifestyle-induced obesity and hyperinsulinemia have worse prognosis and faster disease progression, and (c) insulin causes steroidogenesis in other cell lines, the hypothesis that insulin may contribute to CaP progression through upregulation of steroidogenesis was explored. Insulin upregulates steroidogenesis enzymes at the mRNA level in three AR-positive cell lines, and upregulates these enzymes at the protein level in two cell lines. It has also been demonstrated that insulin increases mitochondrial (functional) levels of steroidogenic acute regulatory protein (StAR). Furthermore, insulin increases total steroid levels, and insulin-induced de novo steroid synthesis has been demonstrated at levels sufficient to activate the AR. The effect of insulin analogs on CaP steroidogenesis in LNCaP and VCaP cells was also investigated, because epidemiological studies suggest that some of the analogs developed may have more cancer-stimulatory effects than normal insulin. In this project, despite the signalling differences between glargine, X10, and insulin, these analogs did not appear to induce steroidogenesis any more potently than normal insulin. The effect of insulin on MCF7 breast cancer cells was also investigated, with results suggesting that breast cancer cells may be capable of de novo steroidogenesis, and that the increase in estradiol production may be exacerbated by insulin.
Insulin has long been known to stimulate lipogenesis in the liver and adipocytes, and has been demonstrated to increase lipogenesis in breast cancer cells; therefore, the effect of insulin on lipogenesis, a hallmark of aggressive cancers, was also investigated. In CaP progression, sterol regulatory element binding protein (SREBP) is dysregulated and upregulates fatty acid synthase (FASN), acetyl-CoA carboxylase (ACC), and other lipogenesis genes. SREBP is important for steroidogenesis and in this project has been shown to be upregulated by insulin in CaP cells. Fatty acid synthesis provides the building blocks of membrane growth, provides substrates for fatty acid oxidation, the main energy source for CaP cells, provides building blocks for anti-apoptotic and pro-inflammatory molecules, and provides molecules that stimulate steroidogenesis. In this project it has been shown that insulin upregulates FASN and ACC, which synthesize fatty acids, as well as upregulating hormone-sensitive lipase (HSL), diazepam-binding inhibitor (DBI), and long-chain acyl-CoA synthetase 3 (ACSL3), which contribute to lipid activation of steroidogenesis. Insulin also upregulates total lipid levels and de novo lipogenesis, which can be suppressed by inhibition of the insulin receptor (INSR). The fatty acids synthesized after insulin treatment are those that have been associated with CaP; furthermore, microarray data suggest insulin may upregulate fatty acid biosynthesis, fatty acid metabolism and arachidonic acid metabolism pathways, which have been implicated in CaP growth and survival. Pharmacological agents used to treat patients with hyperinsulinemia/hyperlipidemia have gained much interest in regard to CaP risk and treatment; however, the scientific rationale behind these clinical applications has not been examined. This thesis explores whether the use of metformin or simvastatin would decrease lipogenesis, steroidogenesis, or both in CaP cells. Simvastatin is a 3-hydroxy-3-methylglutaryl-CoA reductase (HMGR) inhibitor, which blocks synthesis of cholesterol, the building block of steroids/androgens. It has also been postulated to downregulate SREBP in other metabolic disorders. It has been shown in this thesis that, in LNCaP cells, simvastatin inhibited insulin-induced steroidogenesis and decreased insulin-induced lipogenesis, but increased these pathways in the absence of insulin. Conversely, metformin, which activates AMP-activated protein kinase (AMPK) to shut down lipogenesis, cholesterol synthesis, and protein synthesis, highly suppresses both steroidogenesis and lipogenesis in the presence and absence of insulin. Lastly, because IGF2 has been demonstrated to increase steroidogenesis in other cell lines, and because the elucidation of any factors affecting steroidogenesis is important to understanding CaP, the effect of IGF2 on steroidogenesis in CaP cells was investigated. In patient samples, as men progress to CRPC, IGF2 mRNA and the protein levels of the receptors it may signal through are upregulated. It has also been demonstrated that IGF2 upregulates steroidogenic enzymes at both the mRNA and protein levels in LNCaP cells, increases intracellular and secreted steroid/androgen levels in LNCaPs to levels sufficient to stimulate the AR, and upregulates de novo steroidogenesis in LNCaPs and VCaPs. As well, inhibition of INSR and insulin-like growth factor 1 receptor (IGF1R), through which IGF2 signals, suggests that induction of steroidogenesis may occur predominantly through IGF1R.
In summary, this project has illuminated for the first time that insulin is likely to play a large role in cancer progression, through upregulation of the steroidogenesis and lipogenesis pathways at the mRNA and protein levels and in the levels of their products, and demonstrates a novel role for IGF2 in CaP progression through stimulation of steroidogenesis. It has also been demonstrated that metformin and simvastatin may be useful in suppressing the insulin induction of these pathways. This project affirms the pathways by which ADT-induced metabolic syndrome may exacerbate CaP progression and strongly suggests that monitoring and modulating the metabolic state of CaP patients could have a strong impact on their therapeutic outcomes.

Relevance: 10.00%

Abstract:

The purpose of this paper is to identify and empirically examine the key features, purposes, uses, and benefits of performance dashboards. We find that only about a quarter of the sales managers surveyed in Finland used a dashboard, which was lower than previously reported. Dashboards were used for four distinct purposes: (i) monitoring, (ii) problem solving, (iii) rationalizing, and (iv) communication and consistency. There was a high correlation between the different uses of dashboards and user productivity, indicating that dashboards were perceived as effective tools in performance management, not just for monitoring one's own performance but for other purposes including communication. The quality of the data in dashboards did not seem to be a concern (except for completeness), but it was a critical driver of dashboard use. This is the first empirical study on performance dashboards in terms of adoption rates, key features, and benefits. The study highlights the research potential and benefits of dashboards, which could be valuable for future researchers and practitioners.

Relevance: 10.00%

Abstract:

Individual science teachers who have inspired colleagues to transform their classroom praxis have been labelled transformational leaders. As the notion of distributed leadership became more accepted in the educational literature, the focus on the individual teacher-leader shifted to the study of leadership praxis both by individuals (whoever they might be) and by collectives within schools and science classrooms. This review traces the trajectory of leadership research, in the context of learning and teaching science, from an individual focus to a dialectical relationship between individual and collective praxis. The implications of applying an individual-collective perspective to praxis for teachers, students and their designated leaders are discussed.