909 results for Multidimensional Scaling


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Top Down Induction of Decision Trees (TDIDT) is the most commonly used method of constructing a model from a dataset in the form of classification rules to classify previously unseen data. Alternative algorithms have been developed, such as the Prism algorithm. Prism constructs modular rules that are qualitatively better than the rules induced by TDIDT. However, along with the increasing size of databases, many existing rule learning algorithms have proved to be computationally expensive on large datasets. To tackle the problem of scalability, parallel classification rule induction algorithms have been introduced. As TDIDT is the most popular classifier, even though there are strongly competitive alternative algorithms, most parallel approaches to inducing classification rules are based on TDIDT. In this paper we describe work on a distributed classifier that induces classification rules in a parallel manner based on Prism.
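
As a rough illustration of what "modular rules" means here, the following is a minimal, serial sketch of Prism-style rule induction in its commonly described separate-and-conquer form. It is not the distributed classifier the abstract refers to. The function names `prism`, `matches` and `classify`, the tie-breaking by coverage, and the toy dataset are all assumptions made for this example.

```python
def matches(row, conditions):
    """True if the row satisfies every attribute=value term of a rule."""
    return all(row[attr] == value for attr, value in conditions.items())


def prism(rows, target):
    """Induce (conditions, label) rules from a list of dict rows (illustrative sketch)."""
    rules = []
    attributes = [a for a in rows[0] if a != target]
    for label in sorted({r[target] for r in rows}):
        remaining = list(rows)  # each class starts again from the full training set
        while any(r[target] == label for r in remaining):
            covered, conditions = remaining, {}
            # Specialise the rule until it covers only `label` or attributes run out.
            while (any(r[target] != label for r in covered)
                   and len(conditions) < len(attributes)):
                best = None
                for attr in attributes:
                    if attr in conditions:
                        continue
                    for value in sorted({r[attr] for r in covered}, key=str):
                        subset = [r for r in covered if r[attr] == value]
                        hits = sum(r[target] == label for r in subset)
                        # Assumed scoring: precision first, then coverage.
                        score = (hits / len(subset), hits)
                        if hits and (best is None or score > best[0]):
                            best = (score, attr, value)
                _, attr, value = best
                conditions[attr] = value
                covered = [r for r in covered if r[attr] == value]
            rules.append((dict(conditions), label))
            # Separate: drop the instances covered by the new rule.
            remaining = [r for r in remaining if not matches(r, conditions)]
    return rules


def classify(rules, row, default=None):
    """Return the label of the first rule that fires, else `default`."""
    for conditions, label in rules:
        if matches(row, conditions):
            return label
    return default


# Hypothetical toy data, used only to exercise the sketch.
weather = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
weather_rules = prism(weather, "play")
```

Each induced rule stands on its own as a conjunction of attribute-value terms, so rules for different classes need not share a common root attribute as the branches of a decision tree do.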

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The fast increase in the size and number of databases demands data mining approaches that are scalable to large amounts of data. This has led to the exploration of parallel computing technologies in order to perform data mining tasks concurrently using several processors. Parallelization seems to be a natural and cost-effective way to scale up data mining technologies. One of the most important of these data mining technologies is the classification of newly recorded data. This paper surveys advances in parallelization in the field of classification rule induction.

Relevance:

20.00% 20.00%

Publisher:

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Advances in hardware and software technology enable us to collect, store and distribute large quantities of data on a very large scale. Automatically discovering and extracting hidden knowledge in the form of patterns from these large data volumes is known as data mining. Data mining technology is not only a part of business intelligence, but is also used in many other application areas such as research, marketing and financial analytics. For example, medical scientists can use patterns extracted from historic patient data to determine whether a new patient is likely to respond positively to a particular treatment; marketing analysts can use patterns extracted from customer data for future advertisement campaigns; finance experts have an interest in patterns that forecast the development of certain stock market shares for investment recommendations. However, extracting knowledge in the form of patterns from massive data volumes imposes a number of computational challenges in terms of processing time, memory, bandwidth and power consumption. These challenges have led to the development of parallel and distributed data analysis approaches and the utilisation of Grid and Cloud computing. This chapter gives an overview of parallel and distributed computing approaches and how they can be used to scale up data mining to large datasets.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We describe ncWMS, an implementation of the Open Geospatial Consortium’s Web Map Service (WMS) specification for multidimensional gridded environmental data. ncWMS can read data in a large number of common scientific data formats – notably the NetCDF format with the Climate and Forecast conventions – then efficiently generate map imagery in thousands of different coordinate reference systems. It is designed to require minimal configuration from the system administrator and, when used in conjunction with a suitable client tool, provides end users with an interactive means for visualizing data without the need to download large files or interpret complex metadata. It is also used as a “bridging” tool providing interoperability between the environmental science community and users of geographic information systems. ncWMS implements a number of extensions to the WMS standard in order to fulfil some common scientific requirements, including the ability to generate plots representing timeseries and vertical sections. We discuss these extensions and their impact upon present and future interoperability. We discuss the conceptual mapping between the WMS data model and the data models used by gridded data formats, highlighting areas in which the mapping is incomplete or ambiguous. We discuss the architecture of the system and particular technical innovations of note, including the algorithms used for fast data reading and image generation. ncWMS has been widely adopted within the environmental data community and we discuss some of the ways in which the software is integrated within data infrastructures and portals.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We consider second kind integral equations of the form $x(s) - \int_\Omega k(s,t)\,x(t)\,dt = y(s)$ (abbreviated $x - Kx = y$), in which $\Omega$ is some unbounded subset of $\mathbb{R}^n$. Let $X_p$ denote the weighted space of functions $x$ continuous on $\Omega$ and satisfying $x(s) = O(|s|^{-p})$, $s \to \infty$. We show that if the kernel $k(s,t)$ decays like $|s-t|^{-q}$ as $|s-t| \to \infty$ for some sufficiently large $q$ (and some other mild conditions on $k$ are satisfied), then $K \in B(X_p)$ (the set of bounded linear operators on $X_p$), for $0 \le p \le q$. If also $(I-K)^{-1} \in B(X_0)$ then $(I-K)^{-1} \in B(X_p)$ for $0 < p < q$, and $(I-K)^{-1} \in B(X_q)$ if further conditions on $k$ hold. Thus, if $k(s,t) = O(|s-t|^{-q})$, $|s-t| \to \infty$, and $y(s) = O(|s|^{-p})$, $s \to \infty$, the asymptotic behaviour of the solution $x$ may be estimated as $x(s) = O(|s|^{-r})$, $|s| \to \infty$, $r := \min(p,q)$. The case when $k(s,t) = \kappa(s-t)$, so that the equation is of Wiener-Hopf type, receives especial attention. Conditions, in terms of the symbol of $I-K$, for $I-K$ to be invertible or Fredholm on $X_p$ are established for certain cases ($\Omega$ a half-space or cone). A boundary integral equation, which models three-dimensional acoustic propagation above flat ground, absorbing apart from an infinite rigid strip, illustrates the practical application and sharpness of the above results. This integral equation models, in particular, road traffic noise propagation along an infinite road surface surrounded by absorbing ground. We prove that the sound propagating along the rigid road surface eventually decays with distance at the same rate as sound propagating above the absorbing ground.

Relevance:

20.00% 20.00%

Publisher:

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We obtain sharp estimates for multidimensional generalisations of Vinogradov’s mean value theorem for arbitrary translation-dilation invariant systems, achieving constraints on the number of variables approaching those conjectured to be the best possible. Several applications of our bounds are discussed.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We investigate the scaling between precipitation and temperature changes in warm and cold climates using six models that have simulated the response to both increased CO2 and Last Glacial Maximum (LGM) boundary conditions. Globally, precipitation increases in warm climates and decreases in cold climates by between 1.5%/°C and 3%/°C. Precipitation sensitivity to temperature changes is lower over the land than over the ocean and lower over the tropical land than over the extratropical land, reflecting the constraint of water availability. The wet tropics get wetter in warm climates and drier in cold climates, but the changes in dry areas differ among models. Seasonal changes of tropical precipitation in a warmer world also reflect this “rich get richer” syndrome. Precipitation seasonality is decreased in the cold-climate state. The simulated changes in precipitation per degree temperature change are comparable to the observed changes in both the historical period and the LGM.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The leaf carbon isotope ratio (δ13C) of C3 plants is inversely related to the drawdown of CO2 concentration during photosynthesis, which increases towards drier environments. We aimed to discriminate between the hypothesis of universal scaling, which predicts between-species responses of δ13C to aridity similar to within-species responses, and biotic homoeostasis, which predicts offsets in the δ13C of species occupying adjacent ranges. The Northeast China Transect spans 130–900 mm annual precipitation within a narrow latitude and temperature range. Leaves of 171 species were sampled at 33 sites along the transect (18 at ≥ 5 sites) for dry matter, carbon (C) and nitrogen (N) content, specific leaf area (SLA) and δ13C. The δ13C of species generally followed a common relationship with the climatic moisture index (MI). Offsets between adjacent species were not observed. Trees and forbs diverged slightly at high MI. In C3 plants, δ13C predicted N per unit leaf area (Narea) better than MI. The δ13C of C4 plants was invariant with MI. SLA declined and Narea increased towards low MI in both C3 and C4 plants. The data are consistent with optimal stomatal regulation with respect to atmospheric dryness. They provide evidence for universal scaling of CO2 drawdown with aridity in C3 plants.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Three methodological limitations in English-Chinese contrastive rhetoric research have been identified in previous research, namely: the failure to control for the quality of L1 data; an inference approach to interpreting the relationship between L1 and L2 writing; and a focus on national cultural factors in interpreting rhetorical differences. Addressing these limitations, the current study examined the presence or absence and placement of thesis statement and topic sentences in four sets of argumentative texts produced by three groups of university students. We found that Chinese students tended to favour a direct/deductive approach in their English and Chinese writing, while native English writers typically adopted an indirect/inductive approach. This study argues for a dynamic and ecological interpretation of rhetorical practices in different languages and cultures.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Purpose – This paper aims to address the gaps in service recovery strategy assessment. An effective service recovery strategy that prevents customer defection after a service failure is a powerful managerial instrument. The literature to date does not present a comprehensive assessment of service recovery strategy. It also lacks a clear picture of the service recovery actions at managers’ disposal in case of failure and the effectiveness of individual strategies on customer outcomes. Design/methodology/approach – Based on service recovery theory, this paper proposes a formative index of service recovery strategy and empirically validates this measure using partial least-squares path modelling with survey data from 437 complainants in the telecommunications industry in Egypt. Findings – The CURE scale (CUstomer REcovery scale) presents evidence of reliability as well as convergent, discriminant and nomological validity. Findings also reveal that problem-solving, speed of response, effort, facilitation and apology are the actions that have an impact on the customer’s satisfaction with service recovery. Practical implications – This new formative index is of potential value in investigating links between strategy and customer evaluations of service by helping managers identify which actions contribute most to changes in the overall service recovery strategy as well as satisfaction with service recovery. Ultimately, the CURE scale facilitates the long-term planning of effective complaint management. Originality/value – This is the first study in the service marketing literature to propose a comprehensive assessment of service recovery strategy and clearly identify the service recovery actions that contribute most to changes in the overall service recovery strategy.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Advances in hardware technologies allow data to be captured and processed in real time, and the resulting high-throughput data streams require novel data mining approaches. The research area of Data Stream Mining (DSM) is developing data mining algorithms that allow us to analyse these continuous streams of data in real time. The creation and real-time adaptation of classification models from data streams is one of the most challenging DSM tasks. Current classifiers for streaming data address this problem by using incremental learning algorithms. However, even though these algorithms are fast, they are challenged by high-velocity data streams, in which data instances arrive at a fast rate. This is problematic if an application requires little or no delay between changes in the patterns of the stream and the absorption of these patterns by the classifier. Problems of scalability to Big Data of traditional data mining algorithms for static (non-streaming) datasets have been addressed through the development of parallel classifiers. However, there is very little work on the parallelisation of data stream classification techniques. In this paper we investigate K-Nearest Neighbours (KNN) as the basis for a real-time adaptive and parallel methodology for scalable data stream classification tasks.
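
To make the idea of an adaptive KNN classifier for streams more concrete, here is a minimal, serial sketch that keeps only the most recent labelled instances in a fixed-size sliding window. The sliding window, the class name `SlidingWindowKNN` and its parameters are assumptions chosen for illustration; the abstract does not state how the authors' methodology adapts to change or how it is parallelised.

```python
import math
from collections import Counter, deque


class SlidingWindowKNN:
    """KNN over the most recent `window_size` labelled instances (illustrative sketch)."""

    def __init__(self, k=3, window_size=100):
        self.k = k
        # A bounded deque silently discards the oldest instance once it is full,
        # which is what lets the model forget outdated patterns.
        self.window = deque(maxlen=window_size)

    def learn(self, features, label):
        """Absorb one labelled instance from the stream."""
        self.window.append((tuple(features), label))

    def predict(self, features):
        """Majority vote among the k nearest instances currently in the window."""
        if not self.window:
            return None
        nearest = sorted(
            self.window, key=lambda item: math.dist(item[0], features)
        )[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]
```

Because the only state is the window itself, a change in the stream is reflected in predictions as soon as the new instances displace the old ones. The distance computation over the window is the dominant cost per prediction, which is the part a parallel variant would plausibly need to distribute.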

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The turbulent structure of a stratocumulus-topped marine boundary layer over a 2-day period is observed with a Doppler lidar at Mace Head in Ireland. Using profiles of vertical velocity statistics, the bulk of the mixing is identified as cloud driven. This is supported by the pertinent feature of negative vertical velocity skewness in the sub-cloud layer which extends, on occasion, almost to the surface. Both coupled and decoupled turbulence characteristics are observed. The length and timescales related to the cloud-driven mixing are investigated and shown to provide additional information about the structure and the source of the mixing inside the boundary layer. They are also shown to place constraints on the length of the sampling periods used to derive products, such as the turbulent dissipation rate, from lidar measurements. For this, the maximum wavelengths that belong to the inertial subrange are studied through spectral analysis of the vertical velocity. The maximum wavelength of the inertial subrange in the cloud-driven layer scales relatively well with the corresponding layer depth during periods of pronounced decoupled structure identified from the vertical velocity skewness. However, on many occasions, combining the analysis of the inertial subrange and vertical velocity statistics suggests a higher decoupling height than expected from the skewness profiles. Our results show that investigation of the length scales related to the inertial subrange significantly complements the analysis of the vertical velocity statistics and enables a more confident interpretation of complex boundary layer structures using measurements from a Doppler lidar.