84 results for mash-up


Relevance:

20.00%

Publisher:

Abstract:

Top Down Induction of Decision Trees (TDIDT) is the most commonly used method of constructing a model from a dataset in the form of classification rules to classify previously unseen data. Alternative algorithms have been developed, such as the Prism algorithm. Prism constructs modular rules that are qualitatively better than the rules induced by TDIDT. However, along with the increasing size of databases, many existing rule learning algorithms have proved to be computationally expensive on large datasets. To tackle the problem of scalability, parallel classification rule induction algorithms have been introduced. As TDIDT is the most popular classifier, even though there are strongly competitive alternative algorithms, most parallel approaches to inducing classification rules are based on TDIDT. In this paper we describe work on a distributed classifier that induces classification rules in a parallel manner based on Prism.
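As an illustration of the modular rules mentioned in this abstract, the following is a minimal sketch of the basic serial Prism covering strategy, not the distributed classifier the paper describes. The function name `prism`, the list-of-dicts row format and the tie-breaking rule are assumptions made for this example.

```python
def prism(rows, target):
    """Induce modular rules from rows (a list of dicts); target is the class key.

    Returns a list of (conditions, class_label) pairs, where conditions maps
    attribute names to the values a row must have for the rule to fire.
    """
    rules = []
    classes = sorted({r[target] for r in rows})
    attrs = [a for a in rows[0] if a != target]
    for cls in classes:
        remaining = list(rows)  # each class starts again from the full training set
        # Keep inducing rules until every instance of cls is covered.
        while any(r[target] == cls for r in remaining):
            covered, conditions = remaining, {}
            # Specialise the rule until it covers only cls, or attributes run out.
            while (any(r[target] != cls for r in covered)
                   and len(conditions) < len(attrs)):
                best, best_key = None, None
                for a in attrs:
                    if a in conditions:
                        continue
                    for v in {r[a] for r in covered}:
                        subset = [r for r in covered if r[a] == v]
                        hits = sum(r[target] == cls for r in subset)
                        # Maximise p(cls | a = v); break ties by coverage (assumed).
                        key = (hits / len(subset), hits)
                        if best_key is None or key > best_key:
                            best_key, best = key, (a, v)
                a, v = best
                conditions[a] = v
                covered = [r for r in covered if r[a] == v]
            rules.append((conditions, cls))
            # Remove the instances the new rule covers before inducing the next one.
            remaining = [r for r in remaining if r not in covered]
    return rules
```

Calling `prism(rows, "play")` on a small table of categorical attributes returns one independent rule set per class, which is what distinguishes modular rules from the branches of a single decision tree.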

Relevance:

20.00%

Publisher:

Abstract:

The fast increase in the size and number of databases demands data mining approaches that are scalable to large amounts of data. This has led to the exploration of parallel computing technologies in order to perform data mining tasks concurrently using several processors. Parallelization seems to be a natural and cost-effective way to scale up data mining technologies. One of the most important of these data mining technologies is the classification of newly recorded data. This paper surveys advances in parallelization in the field of classification rule induction.

Relevance:

20.00%

Publisher:

Abstract:

Advances in hardware and software technology enable us to collect, store and distribute large quantities of data on a very large scale. Automatically discovering and extracting hidden knowledge in the form of patterns from these large data volumes is known as data mining. Data mining technology is not only a part of business intelligence, but is also used in many other application areas such as research, marketing and financial analytics. For example, medical scientists can use patterns extracted from historic patient data to determine whether a new patient is likely to respond positively to a particular treatment; marketing analysts can use patterns extracted from customer data for future advertisement campaigns; finance experts have an interest in patterns that forecast the development of certain stock market shares for investment recommendations. However, extracting knowledge in the form of patterns from massive data volumes imposes a number of computational challenges in terms of processing time, memory, bandwidth and power consumption. These challenges have led to the development of parallel and distributed data analysis approaches and the utilisation of Grid and Cloud computing. This chapter gives an overview of parallel and distributed computing approaches and how they can be used to scale up data mining to large datasets.

Relevance:

20.00%

Publisher:

Relevance:

20.00%

Publisher:

Abstract:

The UK construction industry labour market is characterised by high levels of self-employment, sub-contracting, informality and flexibility. A corollary of this, and a sign of the increasing globalisation of construction, has been an increasing reliance on migrant labour, particularly that from the Eastern European Accession states. Yet little is known about how migrant workers' experiences within and outside of work shape their work in the construction sector. In this context, better qualitative understandings are needed of the social and communication networks through which migrant workers gain employment, create routes through the sector and develop their role/career. We draw on two examples from a short-term ethnographic study of migrant construction worker employment experiences and practices in the town of Crewe in Cheshire, UK, to demonstrate how informal networks intersect with formal elements of the sector to facilitate both recruitment and up-skilling. Such research knowledge, we argue, offers new evidence of the importance of attending to migrant workers' own experiences in the development of more transparent recruitment processes.

Relevance:

20.00%

Publisher:

Abstract:

Objective: Behavioural inhibition (BI) in early childhood is associated with increased risk for anxiety. The present research examines BI alongside family environment factors, specifically maternal negativity and overinvolvement, maternal anxiety and mother-child attachment, with a view to providing a broader understanding of the development of child anxiety. Method: Participants were 202 children classified at age 4 as either behaviourally inhibited (N=102) or uninhibited (N=100). Family environment, BI and child anxiety were assessed at baseline, and child anxiety and BI were assessed again two years later when participants were aged 6 years. Results: After controlling for baseline anxiety, inhibited participants were significantly more likely to meet criteria for a diagnosis of social phobia and generalized anxiety disorder at follow-up. Path analysis suggested that maternal anxiety significantly affected child anxiety over time, even after controlling for the effects of BI and baseline anxiety. No significant paths from parenting or attachment to child anxiety were found. Maternal overinvolvement was significantly associated with BI at follow-up. Conclusions: At age 4, BI, maternal anxiety and child anxiety represent risk factors for anxiety at age 6. Furthermore, overinvolved parenting increases risk for BI at age 6, which may then lead to the development of anxiety in later childhood.

Relevance:

20.00%

Publisher:

Abstract:

Traditionally, the formal scientific output in most fields of natural science has been limited to peer-reviewed academic journal publications, with less attention paid to the chain of intermediate data results and their associated metadata, including provenance. In effect, this has constrained the representation and verification of the data provenance to the confines of the related publications. Detailed knowledge of a dataset's provenance is essential to establish the pedigree of the data for its effective re-use, and to avoid redundant re-enactment of the experiment or computation involved. For open-access data it is increasingly important to determine authenticity and quality, especially considering the growing volumes of datasets appearing in the public domain. To address these issues, we present an approach that combines the Digital Object Identifier (DOI) – a widely adopted citation technique – with existing, widely adopted climate science data standards to formally publish detailed provenance of a climate research dataset as an associated scientific workflow. This is integrated with linked-data compliant data re-use standards (e.g. OAI-ORE) to enable a seamless link between a publication and the complete trail of lineage of the corresponding dataset, including the dataset itself.

Relevance:

20.00%

Publisher:

Abstract:

Market liberalization in emerging-market economies and the entry of multinational firms spur significant changes to the industry/institutional environment faced by domestic firms. Prior studies have described how such changes tend to be disruptive to the relatively backward domestic firms, and negatively affect their performance and survival prospects. In this paper, we study how domestic supplier firms may adapt and continue to perform, as market liberalization progresses, through catch-up strategies aimed at integrating with the industry's global value chain. Drawing on internalization theory and the literatures on upgrading and catch-up processes, learning and relational networks, we hypothesize that, for continued performance, domestic supplier firms need to adapt their strategies from catching up initially through technology licensing/collaborations and joint ventures with multinational enterprises (MNEs) to also developing strong customer relationships with downstream firms (especially MNEs). Further, we propose that successful catch-up through these two strategies lays the foundation for a strategy of knowledge creation during the integration of domestic industry with the global value chain. Our analysis of data from the auto components industry in India during the period 1992–2002, that is, the decade since liberalization began in 1991, offers support for our hypotheses.

Relevance:

20.00%

Publisher:

Abstract:

In this article, we make two important contributions to the literature on clusters. First, we broaden the theory of cluster connectivity, which has hitherto focused on organization-based pipelines and MNE subsidiaries, by including linkages in the form of personal relationships. Second, we use the lens of social network theory to derive a number of testable propositions. We propose that global linkages with decentralized network structures have the highest potential for local spillovers. In the emerging economy context, our theory implies that clusters linked to the global economy by decentralized pipelines have potential for in-depth catch-up, focused in industry and technology scope. In contrast, clusters linked through decentralized personal relationships have potential for in-breadth catch-up over a range of related industries and technologies. We illustrate our theoretical propositions by contrasting two emerging economy case studies: Bollywood, the Indian filmed entertainment cluster in Mumbai, and the Indian software cluster in Bangalore.

Relevance:

20.00%

Publisher:

Abstract:

A number of studies have found an asymmetric response of consumer price index inflation to the output gap in the US in simple Phillips curve models. We consider whether there are similar asymmetries in mark-up pricing models, that is, whether the mark-up over producers' costs also depends upon the sign of the (adjusted) output gap. We also assess the robustness of our findings to the choice of price series, and whether price-output responses in the UK are asymmetric.

Relevance:

20.00%

Publisher:

Abstract:

The role of eddy fluxes in the general circulation is often approached by treating eddies as (macro)turbulence. In this approach, the eddies act to diffuse certain quasiconservative quantities, such as potential vorticity (PV), along isentropic surfaces in the free atmosphere. The eddy fluxes are determined primarily by the eddy diffusivities and are necessarily down-gradient of the basic state PV field. Support for the (macro)turbulence approach stems from the fact that the eddy fluxes of PV in the free atmosphere are generally down-gradient in the long-term mean. Here we call attention to a pronounced and significant region of up-gradient eddy PV fluxes on the poleward flank of the jet core in both hemispheres. The region of up-gradient (i.e., notionally “antidiffusive”) eddy PV fluxes is most pronounced during the winter and spring seasons and partially contradicts the turbulence approach described above. Analyses of the PV variance (potential enstrophy) budget suggest that the up-gradient PV fluxes represent local wave decay and are maintained by poleward fluxes of PV variance. Finite-amplitude effects thus represent leading order contributions to the PV variance budget, whereas dissipation is only of secondary importance locally. The appearance of up-gradient PV fluxes in the long-term mean is associated with the poleward shift of the jet (and thus of the region of wave decay relative to wave growth) following wave-breaking events.
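The distinction between down-gradient and up-gradient fluxes in this abstract can be written out in a schematic form. The notation below is an assumption chosen for illustration and is not taken from the paper: $q$ is PV, overbars denote zonal means, primes denote eddy departures, $y$ is the meridional coordinate on an isentropic surface, $D$ is an eddy diffusivity, $T$ collects the transport of PV variance by the eddies, and $\varepsilon$ is dissipation.

```latex
% Diffusive (macroturbulence) closure: the eddy PV flux opposes the mean gradient.
\overline{v'q'} = -D\,\frac{\partial \bar{q}}{\partial y}, \qquad D > 0
\quad\Longrightarrow\quad
\overline{v'q'}\,\frac{\partial \bar{q}}{\partial y} < 0 \quad \text{(down-gradient)}.

% Schematic PV variance (potential enstrophy) budget:
\frac{\partial}{\partial t}\,\frac{\overline{q'^2}}{2}
  = -\,\overline{v'q'}\,\frac{\partial \bar{q}}{\partial y} \;+\; T \;-\; \varepsilon .
```

Under this sketch, an up-gradient flux has $\overline{v'q'}\,\partial\bar{q}/\partial y > 0$, so the first term on the right removes PV variance. A long-term mean balance then requires either local wave decay or a convergence of variance transport $T$, which is consistent with the roles the abstract assigns to wave decay and poleward fluxes of PV variance.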