868 results for data-types
Abstract:
The idea of extracting knowledge in process mining is a descendant of data mining. Both mining disciplines emphasise data flow and relations among elements in the data. Unfortunately, challenges have been encountered when working with the data flow and relations. One of the challenges is that the representation of the data flow between a pair of elements or tasks is insufficiently simplified and formulated, as it considers only a one-to-one data flow relation. In this paper, we discuss how the effectiveness of knowledge representation can be extended in both disciplines. To this end, we introduce a new representation of the data flow and dependency formulation using a flow graph. The flow graph solves the issue of the insufficiency of presenting other relation types, such as many-to-one and one-to-many relations. As an experiment, a new evaluation framework is applied to the Teleclaim process in order to show how this method can provide us with more precise results when compared with other representations.
Abstract:
Understanding the functioning of a neural system in terms of its underlying circuitry is an important problem in neuroscience. Recent developments in electrophysiology and imaging allow one to simultaneously record the activities of hundreds of neurons. Inferring the underlying neuronal connectivity patterns from such multi-neuronal spike train data streams is a challenging statistical and computational problem. This task involves finding significant temporal patterns in vast amounts of symbolic time series data. In this paper we show that the frequent episode mining methods from the field of temporal data mining can be very useful in this context. In the frequent episode discovery framework, the data is viewed as a sequence of events, each of which is characterized by an event type and its time of occurrence, and episodes are certain types of temporal patterns in such data. Here we show that, using the set of discovered frequent episodes from multi-neuronal data, one can infer different types of connectivity patterns in the neural system that generated it. For this purpose, we introduce the notion of mining for frequent episodes under certain temporal constraints; the structure of these temporal constraints is motivated by the application. We present algorithms for discovering serial and parallel episodes under these temporal constraints. Through extensive simulation studies we demonstrate that these methods are useful for unearthing patterns of neuronal network connectivity.
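To make the event-sequence view above concrete, here is a minimal sketch of counting occurrences of a serial episode under an inter-event delay constraint. It is not the authors' algorithm: the function name `count_serial_episode`, the millisecond time unit, and the greedy non-overlapping counting rule are all assumptions made for illustration. The sketch keeps only the most recent partial match for each prefix of the episode, which is a simplification.

```python
from typing import List, Optional, Sequence, Tuple

# An event is (event type, time of occurrence); times are integers in ms (assumed unit).
Event = Tuple[str, int]


def count_serial_episode(
    events: Sequence[Event],
    episode: Sequence[str],
    min_gap: int,
    max_gap: int,
) -> int:
    """Greedily count non-overlapping occurrences of a serial episode.

    Consecutive event types of the episode must be separated by a delay
    in [min_gap, max_gap]. Only the most recent partial match per prefix
    is remembered (illustrative simplification).
    """
    if not episode:
        return 0
    # latest[i] is the time of the last event that completed episode[: i + 1].
    latest: List[Optional[int]] = [None] * len(episode)
    count = 0
    for event_type, t in sorted(events, key=lambda e: e[1]):
        # Walk positions from last to first so one event cannot fill two slots.
        for i in range(len(episode) - 1, -1, -1):
            if event_type != episode[i]:
                continue
            if i == 0:
                latest[0] = t
            else:
                prev = latest[i - 1]
                if prev is not None and min_gap <= t - prev <= max_gap:
                    latest[i] = t
            if i == len(episode) - 1 and latest[i] == t:
                count += 1
                # Start over so that counted occurrences do not overlap.
                latest = [None] * len(episode)
                break
    return count


spikes = [("A", 0), ("B", 3), ("A", 10), ("C", 11), ("B", 30), ("A", 40), ("B", 44)]
occurrences = count_serial_episode(spikes, ("A", "B"), min_gap=1, max_gap=5)
```

In this toy spike train, neuron B fires 3 ms and 4 ms after neuron A on two occasions, so the episode A then B is counted twice; the B spike at 30 ms comes too long after the preceding A spike to qualify.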
Abstract:
Family mediation is mandated in Australia for couples in dispute over separation and parenting as a first step in dispute resolution, except where there is a history of intimate partner violence. However, validation of effective well-differentiated partner violence screening instruments suitable for mediation settings is at an early phase of development. This study contributes to calls for better violence screening instruments in the mediation context to detect a differentiated range of abusive behaviors by examining the reliability and validity of both established scales, and newly developed scales that measured intimate partner violence by partner and by self. The study also aimed to examine relationships between types of abuse, and between gender and types of abuse. A third aim was to examine associations between types of abuse and other relationship indicators such as acrimony and parenting alliance. The data reported here are part of a larger mixed method, naturalistic longitudinal study of clients attending nine family mediation centers in Victoria, Australia. The current analyses on baseline cross-sectional screening data confirmed the reliability of three subscales of the Conflict Tactics Scale (CTS2), and the reliability and validity of three new scales measuring intimidation, controlling and jealous behavior, and financial control. Most clients disclosed a history of at least one type of violence by partner: 95% reported psychological aggression, 72% controlling and jealous behavior, 50% financial control, and 35% physical assault. Higher rates of abuse perpetration were reported by partner versus by self, and gender differences were identified. There were strong associations between certain patterns of psychologically abusive behavior and both acrimony and parenting alliance. The implications for family mediation services and future research are discussed.
Abstract:
The synthesis, structure and magnetic properties of mixed-metal oxides of ABO3 composition in the La-B-V-O (B = Ni, Cu) systems are described in the present paper. While the B = Ni oxides adopt a GdFeO3-like perovskite structure containing disordered nickel and vanadium at the octahedral B site, La3Cu2VO9 crystallizes in a YAlO3-type structure. A detailed investigation of the superstructure of nominal La3Cu2VO9 by WDS analysis and Rietveld refinement of powder XRD data reveals that the likely composition of the phase is La13Cu9V4O38.5, where the Cu and V atoms are ordered in a √13 a_h (a_h = hexagonal a parameter of the YAlO3-like subcell) superstructure. Magnetic susceptibility data support the proposed superstructure consisting of triangular Cu3 clusters. At low temperatures, the magnetic moment corresponds to S = 1/2 per Cu3 cluster, while at high temperatures the behavior is Curie-Weiss-like, showing S = 1/2 per copper. The present work reveals the contrasting behavior of the La-Cu-V-O and La-Ni-V-O systems: while a unique line phase related to the YAlO3 structure is formed around the La3Cu2VO9 composition in the copper system, a continuous series of perovskite-GdFeO3 solid solutions, LaNi1-xVxO3 for 0 ≤ x ≤ 1/3, seems to be obtained in the nickel system, where the oxidation state of nickel varies from 3+ to 2+.
Abstract:
Over the last few decades, there has been a significant land cover (LC) change across the globe due to the increasing demand of the burgeoning population and urban sprawl. In order to take account of the change, there is a need for accurate and up-to-date LC maps. Mapping and monitoring of LC in India is being carried out at national level using multi-temporal IRS AWiFS data. Multispectral data such as IKONOS, Landsat-TM/ETM+, IRS-1C/D LISS-III/IV, AWiFS and SPOT-5, etc. have adequate spatial resolution (~1 m to 56 m) for LC mapping to generate 1:50,000 maps. However, for developing countries and those with large geographical extent, seasonal LC mapping is prohibitive with data from commercial sensors of limited spatial coverage. Superspectral data from the MODIS sensor are freely available and have better temporal (8-day composites) and spectral information. MODIS pixels typically contain a mixture of various LC types (due to the coarse spatial resolution of 250, 500 and 1000 m), especially in more fragmented landscapes. In this context, linear spectral unmixing would be useful for mapping patchy land covers, such as those that characterise much of the Indian subcontinent. This work evaluates the existing unmixing technique for LC mapping using MODIS data, using end-members that are extracted through Pixel Purity Index (PPI), scatter plot and N-dimensional visualisation. The abundance maps were generated for agriculture, built up, forest, plantations, waste land/others and water bodies. The assessment of the results using ground truth and a LISS-III classified map shows 86% overall accuracy, suggesting the potential for broad-scale applicability of the technique with superspectral data for natural resource planning and inventory applications.
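The idea behind linear spectral unmixing can be illustrated with a deliberately small sketch. It assumes only two end-members and a sum-to-one constraint on the abundances, so the least-squares solution has a closed form; the study itself maps six land cover classes, and the function name `unmix_two_endmembers` and the reflectance values below are hypothetical.

```python
from typing import Sequence, Tuple


def unmix_two_endmembers(
    pixel: Sequence[float],
    endmember_a: Sequence[float],
    endmember_b: Sequence[float],
) -> Tuple[float, float]:
    """Estimate abundances (frac_a, frac_b) with frac_a + frac_b = 1.

    Model: pixel ~= frac_a * endmember_a + (1 - frac_a) * endmember_b.
    The least-squares frac_a is the projection of (pixel - endmember_b)
    onto (endmember_a - endmember_b), clipped to the range [0, 1].
    """
    diff = [a - b for a, b in zip(endmember_a, endmember_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("end-members must differ in at least one band")
    numer = sum((p - b) * d for p, b, d in zip(pixel, endmember_b, diff))
    frac_a = min(1.0, max(0.0, numer / denom))
    return frac_a, 1.0 - frac_a


# Hypothetical three-band reflectances for two pure land cover types.
forest = [0.1, 0.4, 0.3]
built_up = [0.5, 0.2, 0.1]
# A mixed pixel made of 25% forest and 75% built-up area.
mixed = [0.4, 0.25, 0.15]
forest_fraction, built_up_fraction = unmix_two_endmembers(mixed, forest, built_up)
```

With more than two end-members the same model is usually solved as a constrained least-squares problem over all bands, which requires a numerical solver rather than this closed form.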
Abstract:
The memory subsystem is a major contributor to the performance, power, and area of complex SoCs used in feature-rich multimedia products. Hence, the memory architecture of the embedded DSP is complex and usually custom designed, with multiple banks of single-ported or dual-ported on-chip scratch pad memory and multiple banks of off-chip memory. Building software for such large, complex memories, with many of the software components being individually optimized software IPs, is a big challenge. In order to obtain good performance and a reduction in memory stalls, the data buffers of the application need to be placed carefully in different types of memory. In this paper we present a unified framework (MODLEX) that combines different data layout optimizations to address complex DSP memory architectures. Our method models the data layout problem as a multi-objective genetic algorithm (GA) with performance and power being the objectives, and presents a set of solution points, which is attractive from a platform design viewpoint. While most of the work in the literature assumes that performance and power are non-conflicting objectives, our work demonstrates that a significant trade-off (up to 70%) is possible between power and performance.
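A "set of solution points" for two conflicting objectives is commonly read as a Pareto front. The sketch below shows only that filtering step, under the assumption that both objectives (here hypothetical memory-stall cycles and power figures) are to be minimized. It is not the MODLEX framework or its genetic algorithm, and the name `pareto_front` and the numbers are illustrative.

```python
from typing import List, Sequence, Tuple

# A candidate data layout scored as (stall_cycles, power); both are minimized.
Point = Tuple[float, float]


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Return the points that no other point dominates.

    A point q dominates p when q is no worse than p in both objectives
    and differs from p, which implies it is strictly better in one.
    """
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] <= p[0] and q[1] <= p[1] for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical scores for five candidate layouts.
layouts = [(10, 5), (8, 7), (12, 4), (9, 9), (8, 6)]
trade_off_points = pareto_front(layouts)
```

Here three layouts survive: each one is better than the others in either stall cycles or power, so choosing among them is a design decision rather than an optimization result.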
Abstract:
A new breed of processors like the Cell Broadband Engine, the Imagine stream processor and the various GPU processors emphasize data-level parallelism (DLP) and thread-level parallelism (TLP) as opposed to traditional instruction-level parallelism (ILP). This allows them to achieve order-of-magnitude improvements over conventional superscalar processors for many workloads. However, it is unclear how much parallelism of these types exists in current programs. Most earlier studies have largely concentrated on the amount of ILP in a program, without differentiating DLP or TLP. In this study, we investigate the extent of data-level parallelism available in programs in the MediaBench suite. By packing instructions in a SIMD fashion, we observe reductions of up to 91% (84% on average) in the number of dynamic instructions, indicating a very high degree of DLP in several applications.
Abstract:
This paper reports the results of employing an artificial bee colony search algorithm for synthesizing a mutually coupled lumped-parameter ladder-network representation of a transformer winding, starting from its measured magnitude frequency response. The existing bee colony algorithm is suitably adapted by appropriately defining constraints, inequalities, and bounds to restrict the search space and thereby ensure synthesis of a nearly unique ladder network corresponding to each frequency response. Ensuring near-uniqueness while constructing the reference circuit (i.e., representation of healthy winding) is the objective. Furthermore, the synthesized circuits must exhibit physical realizability. The proposed method is easy to implement and time efficient, and it circumvents the problems associated with supplying an initial guess in existing methods. Experimental results are reported on two types of actual, single, and isolated transformer windings (continuous disc and interleaved disc).
Abstract:
Image-guided diffuse optical tomography has the advantage of reducing the total number of optical parameters being reconstructed to the number of distinct tissue types identified by the traditional imaging modality, converting the optical image-reconstruction problem from underdetermined in nature to overdetermined. In such cases, the minimum required measurements might be far less compared to those of the traditional diffuse optical imaging. An approach to choose these optimally based on a data-resolution matrix is proposed, and it is shown that such a choice does not compromise the reconstruction performance. (C) 2013 Optical Society of America
Abstract:
Effective conservation and management of natural resources requires up-to-date information of the land cover (LC) types and their dynamics. The LC dynamics are being captured using multi-resolution remote sensing (RS) data with appropriate classification strategies. RS data with important environmental layers (either remotely acquired or derived from ground measurements) would however be more effective in addressing LC dynamics and associated changes. These ancillary layers provide additional information for delineating LC classes' decision boundaries compared to the conventional classification techniques. This communication ascertains the possibility of improved classification accuracy of RS data with ancillary and derived geographical layers such as vegetation index, temperature, digital elevation model (DEM), aspect, slope and texture. This has been implemented in three terrains of varying topography. The study would help in the selection of appropriate ancillary data depending on the terrain for better classified information.
Abstract:
The rapid growth in the field of data mining has led to the development of various methods for outlier detection. Though detection of outliers has been well explored in the context of numerical data, dealing with categorical data is still evolving. In this paper, we propose a two-phase algorithm for detecting outliers in categorical data based on a novel definition of outliers. In the first phase, this algorithm explores a clustering of the given data, followed by the ranking phase for determining the set of most likely outliers. The proposed algorithm is expected to perform better as it can identify different types of outliers, employing two independent ranking schemes based on the attribute value frequencies and the inherent clustering structure in the given data. Unlike some existing methods, the computational complexity of this algorithm is not affected by the number of outliers to be detected. The efficacy of this algorithm is demonstrated through experiments on various public domain categorical data sets.
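Ranking by attribute value frequencies can be sketched with a generic frequency score: a record whose values are rare in every attribute receives a low score and is ranked as a likely outlier. This is an illustrative stand-in, not the paper's two-phase algorithm or its definition of outliers, and it omits the clustering phase entirely; the names `frequency_scores` and `rank_outliers` are hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Sequence

Record = Sequence[Hashable]


def frequency_scores(records: Sequence[Record]) -> List[float]:
    """Score each record by the mean frequency of its attribute values."""
    if not records:
        return []
    n_attrs = len(records[0])
    # One value counter per attribute (column).
    counts = [Counter(r[j] for r in records) for j in range(n_attrs)]
    return [
        sum(counts[j][r[j]] for j in range(n_attrs)) / n_attrs for r in records
    ]


def rank_outliers(records: Sequence[Record], k: int) -> List[int]:
    """Return the indices of the k lowest-scoring records."""
    scores = frequency_scores(records)
    return sorted(range(len(records)), key=lambda i: scores[i])[:k]


data = [
    ("red", "small"),
    ("red", "small"),
    ("red", "large"),
    ("blue", "small"),
    ("green", "huge"),
]
scores = frequency_scores(data)
most_likely_outlier = rank_outliers(data, 1)
```

The last record is the only one whose colour and size both occur once, so it gets the lowest score. Note that, unlike the algorithm described in the abstract, this sketch scores every record in a single pass regardless of how many outliers are requested only because the score itself does not depend on `k`.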
Abstract:
Outlier detection in high dimensional categorical data has been a problem of much interest due to the extensive use of qualitative features for describing the data across various application areas. Though there exist various established methods for dealing with the dimensionality aspect through feature selection on numerical data, the categorical domain is actively being explored. As outlier detection is generally considered as an unsupervised learning problem due to lack of knowledge about the nature of various types of outliers, the related feature selection task also needs to be handled in a similar manner. This motivates the need to develop an unsupervised feature selection algorithm for efficient detection of outliers in categorical data. Addressing this aspect, we propose a novel feature selection algorithm based on the mutual information measure and the entropy computation. The redundancy among the features is characterized using the mutual information measure for identifying a suitable feature subset with less redundancy. The performance of the proposed algorithm in comparison with the information gain based feature selection shows its effectiveness for outlier detection. The efficacy of the proposed algorithm is demonstrated on various high-dimensional benchmark data sets employing two existing outlier detection methods.
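The two quantities named above, entropy and mutual information, can be computed for categorical features directly from value counts. The sketch below shows only these measures and how they expose redundancy between two features; the selection procedure itself is not reproduced, and the function names and the toy columns are assumptions.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a categorical feature."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """Mutual information (in bits) between two categorical features."""
    if len(x) != len(y):
        raise ValueError("features must have the same number of records")
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    total = 0.0
    for (vx, vy), c in count_xy.items():
        p_xy = c / n
        # p_xy / (p_x * p_y) simplifies to c * n / (count_x * count_y).
        total += p_xy * math.log2(c * n / (count_x[vx] * count_y[vy]))
    return total


colour = ["a", "a", "b", "b"]
colour_code = [0, 0, 1, 1]   # fully determined by colour, hence redundant
parity = [0, 1, 0, 1]        # independent of colour in this sample

redundant_bits = mutual_information(colour, colour_code)
independent_bits = mutual_information(colour, parity)
```

A feature pair with high mutual information carries overlapping information, so a redundancy-aware selector would tend to keep only one of the two; a pair with mutual information near zero contributes complementary information.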
Abstract:
A study was conducted in October 2006 in the Charleston, South Carolina area to test the movements of three different buoy line types to determine which produced a preferred profile that could reduce the risk of dolphin entanglement. Tests on diamond-braided nylon commonly used in the crab pot fishery were compared with stiffened line of Esterpro and calf types in both shallow and deep water environments using DSTmilli data loggers. Loggers were placed at intervals along the lines to record depth, and thus movements, over a 24-hour period. Three observers viewed video animations and charts created for each of the six trial days from the collected logger data and provided their opinions on the most desirable line type that fit set criteria. A quantitative analysis (ANCOVA) of the data was conducted taking into consideration daily tidal fluctuations and logger movements. Loggers tracking the tides had an r² value approaching 1.00 and produced little movement other than with the tides. Conversely, loggers with r² values approaching 0.00 were less affected by tidal movement and were influenced by currents that cause more erratic movement. Results from this study showed that stiffened line, in particular the medium lay Esterpro type, produced the more desirable profiles that could reduce the risk of dolphin entanglement. Combining the observers' results with the ANCOVA results, Esterpro was chosen nearly 60% of the time, as opposed to the nylon line, which was chosen only 10% of the time. ANCOVA results showed that the stiffened lines performed better in both the shallow and deep water environments, while the nylon line only performed better during one trial in a deep water set, most probably due to the increased current velocities experienced that day. (58pp.)(PDF contains 68 pages)
Abstract:
The seasonal stability tests of Canova & Hansen (1995) (CH) provide a method complementary to that of Hylleberg et al. (1990) for testing for seasonal unit roots. But the distribution of the CH tests is unknown in small samples. We present a method to numerically compute critical values and P-values for the CH tests for any sample size and any seasonal periodicity. In fact, this method is applicable not only to the types of seasonality that are commonly in use, but also to any other.