980 results for Link prediction


Relevance:

100.00%

Publisher:

Abstract:

Due to the growing interest in social networks, link prediction has received significant attention. Link prediction is mostly based on graph-based features, with some recent approaches focusing on domain semantics. We propose algorithms for link prediction that use a probabilistic ontology to enhance the analysis of both the domain and the unavoidable uncertainty in the task (the ontology is specified in the probabilistic description logic crALC). The scalability of the approach is investigated through a combination of semantic assumptions and graph-based features. We empirically evaluate our proposal and compare it with standard solutions in the literature.
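
As a rough illustration of what a "graph-based feature" for link prediction can look like, the sketch below scores unlinked node pairs by the Jaccard overlap of their neighbour sets. This is a standard baseline, not the ontology-based method the abstract proposes; the function name `jaccard_scores` and the toy graph are assumptions made for this example only.

```python
from itertools import combinations


def jaccard_scores(adjacency):
    """Score each unlinked node pair by the Jaccard overlap of their neighbour sets."""
    scores = {}
    for u, v in combinations(sorted(adjacency), 2):
        if v in adjacency[u]:
            continue  # already linked, nothing to predict
        common = adjacency[u] & adjacency[v]
        union = adjacency[u] | adjacency[v]
        scores[(u, v)] = len(common) / len(union) if union else 0.0
    return scores


# Hypothetical toy friendship graph (undirected, stored as neighbour sets).
toy_graph = {
    "ana": {"bob", "cat"},
    "bob": {"ana", "cat", "dan"},
    "cat": {"ana", "bob"},
    "dan": {"bob"},
}

# Candidate links, highest score first.
ranked = sorted(jaccard_scores(toy_graph).items(), key=lambda kv: -kv[1])
```

Pairs that share a larger fraction of their neighbours rank higher as candidate links; a semantic approach such as the one described would presumably combine scores of this kind with domain knowledge.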

Relevance:

60.00%

Publisher:

Abstract:

Latent variable models for network data extract a summary of the relational structure underlying an observed network. The simplest possible models subdivide nodes of the network into clusters; the probability of a link between any two nodes then depends only on their cluster assignment. Currently available models can be classified by whether clusters are disjoint or are allowed to overlap. These models can explain a "flat" clustering structure. Hierarchical Bayesian models provide a natural approach to capture more complex dependencies. We propose a model in which objects are characterised by a latent feature vector. Each feature is itself partitioned into disjoint groups (subclusters), corresponding to a second layer of hierarchy. In experimental comparisons, the model achieves significantly improved predictive performance on social and biological link prediction tasks. The results indicate that models with a single layer hierarchy over-simplify real networks.
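
The "simplest possible" cluster model mentioned in this abstract, where the link probability depends only on the cluster assignments of the two nodes, can be sketched as follows. This covers only the flat, disjoint-cluster baseline, not the proposed hierarchical model; the names `link_probability`, `cluster_of`, and `block_prob`, and all numeric values, are hypothetical.

```python
def link_probability(u, v, cluster_of, block_prob):
    """Probability of a link between u and v, given only their cluster labels."""
    a, b = cluster_of[u], cluster_of[v]
    # The graph is undirected, so the pair of clusters is stored in sorted order.
    return block_prob[(min(a, b), max(a, b))]


# Hypothetical assignment of four nodes to two disjoint clusters.
cluster_of = {"n1": 0, "n2": 0, "n3": 1, "n4": 1}

# Hypothetical link probabilities within and between clusters.
block_prob = {(0, 0): 0.8, (0, 1): 0.1, (1, 1): 0.6}

p_same = link_probability("n1", "n2", cluster_of, block_prob)
p_cross = link_probability("n3", "n1", cluster_of, block_prob)
```

Every pair of nodes with the same pair of cluster labels receives the same probability, which is the "flat" structure the abstract argues is too simple for real networks.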

Relevance:

60.00%

Publisher:

Abstract:

Understanding a complex network's structure holds the key to understanding its function. The physics community has contributed a multitude of methods and analyses to this cross-disciplinary endeavor. Structural features exist on both the microscopic level, resulting from differences between single node properties, and the mesoscopic level, resulting from properties shared by groups of nodes. Disentangling the determinants of network structure on these different scales has remained a major, and so far unsolved, challenge. Here we show how multiscale generative probabilistic exponential random graph models combined with efficient, distributive message-passing inference techniques can be used to achieve this separation of scales, leading to improved detection accuracy of latent classes as demonstrated on benchmark problems. The approach also sheds new light on the statistical significance of motif-distributions in neural networks and improves the link-prediction accuracy as exemplified for gene-disease associations in the highly consequential Online Mendelian Inheritance in Man database. © 2011 Reichardt et al.

Relevance:

60.00%

Publisher:

Abstract:

In today’s big data world, data is being produced in massive volumes, at great velocity and from a variety of different sources such as mobile devices, sensors, a plethora of small devices hooked to the internet (Internet of Things), social networks, communication networks and many others. Interactive querying and large-scale analytics are being increasingly used to derive value out of this big data. A large portion of this data is being stored and processed in the Cloud due to the several advantages provided by the Cloud such as scalability, elasticity, availability, low cost of ownership and the overall economies of scale. There is thus a growing need for large-scale cloud-based data management systems that can support real-time ingest, storage and processing of large volumes of heterogeneous data. However, in the pay-as-you-go Cloud environment, the cost of analytics can grow linearly with the time and resources required. Reducing the cost of data analytics in the Cloud thus remains a primary challenge. In my dissertation research, I have focused on building efficient and cost-effective cloud-based data management systems for different application domains that are predominant in cloud computing environments. In the first part of my dissertation, I address the problem of reducing the cost of transactional workloads on relational databases to support database-as-a-service in the Cloud. The primary challenges in supporting such workloads include choosing how to partition the data across a large number of machines, minimizing the number of distributed transactions, providing high data availability, and tolerating failures gracefully.
I have designed, built and evaluated SWORD, an end-to-end scalable online transaction processing system that utilizes workload-aware data placement and replication to minimize the number of distributed transactions, and that incorporates a suite of novel techniques to significantly reduce the overheads incurred both during the initial placement of data and during query execution at runtime. In the second part of my dissertation, I focus on sampling-based progressive analytics as a means to reduce the cost of data analytics in the relational domain. Sampling has been traditionally used by data scientists to get progressive answers to complex analytical tasks over large volumes of data. Typically, this involves manually extracting samples of increasing data size (progressive samples) for exploratory querying. This provides the data scientists with user control, repeatable semantics, and result provenance. However, such solutions result in tedious workflows that preclude the reuse of work across samples. On the other hand, existing approximate query processing systems report early results, but do not offer the above benefits for complex ad-hoc queries. I propose a new progressive data-parallel computation framework, NOW!, that provides support for progressive analytics over big data. In particular, NOW! enables progressive relational (SQL) query support in the Cloud using unique progress semantics that allow efficient and deterministic query processing over samples, providing meaningful early results and provenance to data scientists. NOW! enables the provision of early results using significantly fewer resources, thereby enabling a substantial reduction in the cost incurred during such analytics. Finally, I propose NSCALE, a system for efficient and cost-effective complex analytics on large-scale graph-structured data in the Cloud.
The system is based on the key observation that a wide range of complex analysis tasks over graph data require processing and reasoning about a large number of multi-hop neighborhoods or subgraphs in the graph; examples include ego network analysis, motif counting in biological networks, finding social circles in social networks, personalized recommendations, link prediction, etc. These tasks are not well served by existing vertex-centric graph processing frameworks, whose computation and execution models limit the user program to directly accessing the state of a single vertex, resulting in high execution overheads. Further, the lack of support for extracting the relevant portions of the graph that are of interest to an analysis task and loading them onto distributed memory leads to poor scalability. NSCALE allows users to write programs at the level of neighborhoods or subgraphs rather than at the level of vertices, and to declaratively specify the subgraphs of interest. It enables the efficient distributed execution of these neighborhood-centric complex analysis tasks over large-scale graphs, while minimizing resource consumption and communication cost, thereby substantially reducing the overall cost of graph data analytics in the Cloud. The results of our extensive experimental evaluation of these prototypes with several real-world data sets and applications validate the effectiveness of our techniques, which provide orders-of-magnitude reductions in the overheads of distributed data querying and analysis in the Cloud.
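
To make the idea of a "multi-hop neighborhood" concrete, the sketch below extracts the k-hop ego network around one vertex with a breadth-first search and returns the subgraph it induces. It is a single-machine illustration only and is not NSCALE's API or execution model; the helper names `k_hop_neighborhood` and `induced_subgraph` and the toy path graph are assumptions for this example.

```python
from collections import deque


def k_hop_neighborhood(adjacency, source, k):
    """Return the set of nodes within k hops of source (source included)."""
    depth = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue  # do not expand beyond the k-hop frontier
        for nbr in adjacency[node]:
            if nbr not in depth:
                depth[nbr] = depth[node] + 1
                queue.append(nbr)
    return set(depth)


def induced_subgraph(adjacency, nodes):
    """Keep only the given nodes and the edges that run between them."""
    return {n: adjacency[n] & nodes for n in nodes}


# Hypothetical path graph a - b - c - d, stored as neighbour sets.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

ego_nodes = k_hop_neighborhood(path, "a", 2)
ego_graph = induced_subgraph(path, ego_nodes)
```

A neighborhood-centric program would then run its analysis (for example, motif counting or link scoring) on each extracted subgraph as a whole rather than on one vertex at a time.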

Relevance:

30.00%

Publisher:

Abstract:

Event-specific scales commonly have greater power than generalized scales in prediction of specific disorders and in testing mediator models for predicting such disorders. Therefore, in a preliminary study, a 6-item Alcohol Helplessness Scale was constructed and found to be reliable for a sample of 98 problem drinkers. Hierarchical multiple regression and its derivative path analysis were used to test whether helplessness and self-efficacy moderate or mediate the link between alcohol dependence and depression. The moderation model was not supported, whereas the mediation model was supported. Helplessness and self-efficacy both significantly and independently mediated between alcohol dependence and depression. Nevertheless, a significant direct effect of alcohol dependence on depression also remained.

Relevance:

30.00%

Publisher:

Abstract:

Research on workforce diversity gained momentum in the 1990s. However, empirical findings to date on the link between gender diversity and performance have been inconsistent. Based on contrasting theories, this paper proposes a positive linear and a negative linear prediction of the gender diversity-performance relationship. The paper also proposes that industry type (services vs. manufacturing) moderates the gender diversity-performance relationship such that the relationship will be positive in service organisations and negative in manufacturing organisations. The results show partial support for the positive linear gender diversity-performance relationship and for the moderating effect of industry type. The study contributes to the field of diversity by showing that workforce gender diversity can have a different impact on organisational performance in different industries.

Relevance:

30.00%

Publisher:

Abstract:

This paper presents the benefits and issues related to travel time prediction on urban networks. Travel time information quantifies congestion and is perhaps the most important network performance measure. Travel time prediction has been an active area of research for the last five decades. The activities related to ITS have increased the attention of researchers to better and more accurate real-time prediction of travel time. The majority of the literature on travel time prediction is applicable to freeways where, under non-incident conditions, traffic flow is not affected by external factors such as traffic control signals and opposing traffic flows. In urban environments the problem is more complicated due to conflicting areas (intersections), mid-link sources and sinks, etc., and needs to be addressed.

Relevance:

30.00%

Publisher:

Abstract:

Motion planning for planetary rovers must consider control uncertainty in order to maintain the safety of the platform during navigation. Modelling such control uncertainty is difficult due to the complex interaction between the platform and its environment. In this paper, we propose a motion planning approach whereby the outcome of control actions is learned from experience and represented statistically using a Gaussian process regression model. This mobility prediction model is trained using sample executions of motion primitives on representative terrain, and predicts the future outcome of control actions on similar terrain. Using Gaussian process regression allows us to exploit its inherent measure of prediction uncertainty in planning. We integrate mobility prediction into a Markov decision process framework and use dynamic programming to construct a control policy for navigation to a goal region in a terrain map built using an on-board depth sensor. We consider both rigid terrain, consisting of uneven ground, small rocks, and non-traversable rocks, and also deformable terrain. We introduce two methods for training the mobility prediction model from either proprioceptive or exteroceptive observations, and report results from nearly 300 experimental trials using a planetary rover platform in a Mars-analogue environment. Our results validate the approach and demonstrate the value of planning under uncertainty for safe and reliable navigation.

Relevance:

30.00%

Publisher:

Abstract:

Under the project `Seasonal Prediction of the Indian Monsoon' (SPIM), the prediction of Indian summer monsoon rainfall by five atmospheric general circulation models (AGCMs) during 1985-2004 was assessed. The project was a collaborative effort of the coordinators and scientists from the different modelling groups across the country. All the runs were made at the Centre for Development of Advanced Computing (CDAC) at Bangalore on the PARAM Padma supercomputing system. Two sets of simulations were made for this purpose. In the first set, the AGCMs were forced by the observed sea surface temperature (SST) for May-September during 1985-2004. In the second set, runs were made for 1987, 1988, 1994, 1997 and 2002 forced by SST which was obtained by assuming that the April anomalies persist during May-September. The results of the first set of runs show, as expected from earlier studies, that none of the models were able to simulate the correct sign of the anomaly of the Indian summer monsoon rainfall for all the years. However, among the five models, one simulated the correct sign in the largest number of years and the second model showed maximum skill in the simulation of the extremes (i.e. droughts or excess rainfall years). The first set of runs showed some common bias which could arise either from an excessive sensitivity of the models to El Nino Southern Oscillation (ENSO) or an inability of the models to simulate the link of the Indian monsoon rainfall to Equatorial Indian Ocean Oscillation (EQUINOO), or both. Analysis of the second set of runs showed that with a weaker ENSO forcing, some models could simulate the link with EQUINOO, suggesting that the errors in the monsoon simulations with observed SST by these models could be attributed to unrealistically high sensitivity to ENSO.

Relevance:

30.00%

Publisher:

Abstract:

We report here an experimental investigation for establishing and quantifying a link between the growth and decay characteristics of fiber Bragg gratings. One of the key aspects of our work is the determination of the defect energy distribution from the grating characteristics measured during their fabrication. We observe a strong correlation between the growth-based defect energy distribution and that obtained through accelerated aging experiments, paving the way for predicting the decay characteristics of fiber Bragg gratings from their growth data. Such a prediction is significant in simplifying the postfabrication steps required to enhance the thermal stability of fiber Bragg gratings. © 2011 Optical Society of America

Relevance:

30.00%

Publisher:

Abstract:

Interannual variation of Indian summer monsoon rainfall (ISMR) is linked to the El Nino-Southern Oscillation (ENSO) as well as the Equatorial Indian Ocean Oscillation (EQUINOO), with the link to the seasonal value of the ENSO index being stronger than that to the EQUINOO index. We show that the variation of a composite index determined through bivariate analysis explains 54% of ISMR variance, suggesting a strong dependence of the skill of monsoon prediction on the skill of prediction of ENSO and EQUINOO. We explored the possibility of predicting Indian rainfall during the summer monsoon season on the basis of prior values of the indices. We find that such predictions are possible for July-September rainfall on the basis of June indices and for August-September rainfall on the basis of July indices. This will be a useful input for second and later stage forecasts made after the commencement of the monsoon season.

Relevance:

30.00%

Publisher:

Abstract:

Warming of the global climate is now unequivocal and its impact on Earth’s functional units has become more apparent. Here, we show that marine ecosystems are not equally sensitive to climate change and reveal a critical thermal boundary where a small increase in temperature triggers abrupt ecosystem shifts seen across multiple trophic levels. This large-scale boundary is located in regions where abrupt ecosystem shifts have been reported in the North Atlantic sector and thereby allows us to link these shifts through a common global phenomenon. We show that these changes alter the biodiversity and carrying capacity of ecosystems and may, combined with fishing, precipitate the reduction of some stocks of Atlantic cod already severely impacted by exploitation. These findings offer a way to anticipate major ecosystem changes and to propose adaptive strategies for exploited marine resources such as cod in order to minimize social and economic consequences.