942 results for networks text analysis text network graph Gephi network measures shuffed text Zipf Heap Python


Relevance:

80.00%

Publisher:

Abstract:

This dissertation establishes a novel data-driven method to identify language network activation patterns in pediatric epilepsy through the use of Principal Component Analysis (PCA) on functional magnetic resonance imaging (fMRI). A total of 122 subjects’ data sets from five different hospitals were included in the study through a web-based repository site designed here at FIU. Research was conducted to evaluate different classification and clustering techniques in identifying hidden activation patterns and their associations with meaningful clinical variables. The results were assessed through agreement analysis with the conventional methods of lateralization index (LI) and visual rating. What is unique in this approach is the new mechanism designed for projecting language network patterns in the PCA-based decisional space. Synthetic activation maps were randomly generated from real data sets to uniquely establish nonlinear decision functions (NDF), which are then used to classify any new fMRI activation map as typical or atypical. The best nonlinear classifier was obtained on a 4D space with a complexity (nonlinearity) degree of 7. Based on the significant association of language dominance and intensities with the top eigenvectors of the PCA decisional space, a new algorithm was deployed to delineate primary cluster members without intensity normalization. In this case, three distinct activation patterns (groups) were identified (averaged kappa with rating 0.65, with LI 0.76) and were characterized by the regions of: 1) the left inferior frontal gyrus (IFG) and left superior temporal gyrus (STG), considered typical for the language task; 2) the IFG, left mesial frontal lobe, and right cerebellum regions, representing a variant left dominant pattern by higher activation; and 3) the right homologues of the first pattern in Broca's and Wernicke's language areas.
Interestingly, group 2 was found to reflect a different language compensation mechanism than reorganization. Its high intensity activation suggests a possible remote effect on the right hemisphere focus on traditionally left-lateralized functions. In retrospect, this data-driven method provides new insights into mechanisms for brain compensation/reorganization and neural plasticity in pediatric epilepsy.
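The agreement figures quoted above (kappa 0.65 with visual rating, 0.76 with LI) can be illustrated with a small sketch. It assumes the statistic is Cohen's kappa between two label assignments, which the abstract does not state, and the labels below are hypothetical toy data, not results from the dissertation.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa: chance-corrected agreement between two label sequences."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same, non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected agreement if both raters labelled independently at their own rates.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical labels: a data-driven grouping versus a visual rating.
cluster_labels = ["typical", "typical", "atypical", "atypical"]
visual_rating = ["typical", "typical", "atypical", "typical"]
kappa = cohen_kappa(cluster_labels, visual_rating)
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, which is how the 0.65 and 0.76 figures would be read under this assumption.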

Relevance:

80.00%

Publisher:

Abstract:

Two concepts in rural economic development policy have been the focus of much research and policy action: the identification and support of clusters or networks of firms and the availability and adoption by rural businesses of Information and Communication Technologies (ICT). From a theoretical viewpoint these policies are based on two contrasting models, with clustering seen as a process of economic agglomeration, and ICT-mediated communication as a means of facilitating economic dispersion. The study’s conceptual framework is based on four interrelated elements: location, interaction, knowledge, and advantage, together with the concept of networks which is employed as an operationally and theoretically unifying concept. The research questions are developed in four successive categories: Policy, Theory, Networks, and Method. The questions are approached using a study of two contrasting groups of rural small businesses in West Cork, Ireland: (a) Speciality Foods, and (b) firms in Digital Products and Services. The study combines Social Network Analysis (SNA) with Qualitative Thematic Analysis, using data collected from semi-structured interviews with 58 owners or managers of these businesses. Data comprise relational network data on the firms’ connections to suppliers, customers, allies and competitors, together with linked qualitative data on how the firms established connections, and how tacit and codified knowledge was sourced and utilised. The research finds that the key characteristics identified in the cluster literature are evident in the sample of Speciality Food businesses, in relation to flows of tacit knowledge, social embedding, and the development of forms of social capital. In particular the research identified the presence of two distinct forms of collective social capital in this network, termed “community” and “reputation”. 
By contrast the sample of Digital Products and Services businesses does not have the form of a cluster, but matches more closely to dispersive models, or “chain” structures. Much of the economic and social structure of this set of firms is best explained in terms of “project organisation”, and by the operation of an individual rather than collective form of “reputation”. The rural setting in which these firms are located has resulted in their being service-centric, and consequently they rely on ICT-mediated communication in order to exchange tacit knowledge “at a distance”. It is this factor, rather than inputs of codified knowledge, that most strongly influences their operation and their need for availability and adoption of high quality communication technologies. Thus the findings have applicability in relation to theory in Economic Geography and to policy and practice in Rural Development. In addition the research contributes to methodological questions in SNA, and to methodological questions about the combination or mixing of quantitative and qualitative methods.

Relevance:

80.00%

Publisher:

Abstract:

Marine protected areas (MPAs) are commonly employed to protect ecosystems from threats like overfishing. Ideally, MPA design should incorporate movement data from multiple target species to ensure sufficient habitat is protected. We used long-term acoustic telemetry and network analysis to determine the fine-scale space use of five shark and one turtle species at a remote atoll in the Seychelles, Indian Ocean, and evaluate the efficacy of a proposed MPA. Results revealed strong, species-specific habitat use in both sharks and turtles, with corresponding variation in MPA use. Defining the MPA's boundary from the edge of the reef flat at low tide instead of the beach at high tide (the current best in Seychelles) significantly increased the MPA's coverage of predator movements by an average of 34%. Informed by these results, the larger MPA was adopted by the Seychelles government, demonstrating how telemetry data can improve shark spatial conservation by affecting policy directly.

Relevance:

80.00%

Publisher:

Abstract:

With the development of information technology, the theory and methodology of complex networks have been introduced into language research, representing the language system as a complex network of nodes and edges so that language structure can be analyzed quantitatively. The development of dependency grammar provides theoretical support for the construction of treebank corpora, making statistical analysis of complex networks possible. This paper introduces the theory and methodology of complex networks and builds dependency syntactic networks based on a treebank of speeches from the EEE-4 oral test. By analyzing the overall characteristics of the networks, including the number of edges, the number of nodes, the average degree, the average path length, the network centrality and the degree distribution, it aims to find potential differences and similarities between the networks for various grades of speaking performance. Through clustering analysis, this research intends to demonstrate the discriminating power of the network parameters and to provide a potential reference for scoring speaking performance.
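The network parameters listed in this abstract can be computed with a few lines of standard-library Python. The sketch below is only an illustration under stated assumptions: the dependency pairs are hypothetical toy data, edges are treated as undirected, and the function names are invented here rather than taken from the paper.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency map from (head, dependent) word pairs."""
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, set()).add(dependent)
        graph.setdefault(dependent, set()).add(head)
    return graph

def average_degree(graph):
    """Mean number of neighbours per node (2 * edges / nodes)."""
    return sum(len(nbrs) for nbrs in graph.values()) / len(graph)

def degree_distribution(graph):
    """Map each degree value to the number of nodes that have it."""
    dist = {}
    for nbrs in graph.values():
        dist[len(nbrs)] = dist.get(len(nbrs), 0) + 1
    return dist

def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

# Hypothetical dependency pairs for "the student speaks English well".
toy_edges = [("speaks", "student"), ("student", "the"),
             ("speaks", "English"), ("speaks", "well")]
toy_graph = build_graph(toy_edges)
n_nodes = len(toy_graph)
n_edges = sum(len(nbrs) for nbrs in toy_graph.values()) // 2
```

On a real treebank, each speech sample would yield one such graph, and the resulting parameter vectors could then be passed to a clustering step.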

Relevance:

80.00%

Publisher:

Abstract:

Data mining can be defined as the extraction of implicit, previously unknown, and potentially useful information from data. Numerous researchers have been developing security technology and exploring new methods to detect cyber-attacks with the DARPA 1998 dataset for Intrusion Detection and the modified versions of this dataset, KDDCup99 and NSL-KDD, but until now no one has examined the performance of the Top 10 data mining algorithms selected by experts in data mining. The classification learning algorithms compared in this thesis are C4.5, CART, k-NN and Naïve Bayes. The performance of these algorithms is compared in terms of accuracy, error rate and average cost on modified versions of the NSL-KDD train and test datasets, where the instances are classified into normal and four cyber-attack categories: DoS, Probing, R2L and U2R. Additionally, the most important features for detecting cyber-attacks, in all categories and in each category, are evaluated with Weka’s Attribute Evaluator and ranked according to Information Gain. The results show that the classification algorithm with the best performance on the dataset is the k-NN algorithm. The most important features for detecting cyber-attacks are basic features such as the number of seconds of a network connection, the protocol used for the connection, the network service used, the normal or error status of the connection and the number of data bytes sent. The most important features for detecting DoS, Probing and R2L attacks are basic features, and the least important are content features. For U2R attacks, by contrast, the content features are the most important.
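Ranking features by Information Gain, as this abstract describes, amounts to measuring how much a feature reduces the entropy of the class label. The following is a minimal sketch of that calculation in plain Python, not Weka's implementation. The four connection records are hypothetical toy rows, and the helper names are assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def accuracy(predicted, actual):
    """Fraction of correct predictions; the error rate is 1 - accuracy."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical toy records: two categorical features and a class label.
protocol = ["tcp", "tcp", "udp", "udp"]
status = ["ok", "err", "ok", "err"]
labels = ["normal", "normal", "dos", "dos"]

gains = {
    "protocol": information_gain(protocol, labels),
    "status": information_gain(status, labels),
}
ranking = sorted(gains, key=gains.get, reverse=True)
```

In this toy data the protocol column separates the classes perfectly while the status column carries no information, so the ranking places protocol first.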

Relevance:

80.00%

Publisher:

Abstract:

In 2013 the European Commission launched its new green infrastructure strategy to make another attempt to stop and possibly reverse the loss of biodiversity by 2020, by connecting habitats in the wider landscape. This means that conservation would go beyond current practices to include landscapes that are dominated by conventional agriculture, where biodiversity conservation plays a minor role at best. The green infrastructure strategy aims at bottom-up rather than top-down implementation, and suggests including local and regional stakeholders. Therefore, it is important to know which stakeholders influence land-use decisions concerning green infrastructure at the local and regional level. The research presented in this paper served to select stakeholders in preparation for a participatory scenario development process to analyze consequences of different implementation options of the European green infrastructure strategy. We used a mix of qualitative and quantitative social network analysis (SNA) methods to combine actors’ attributes, especially concerning their perceived influence, with structural and relational measures. Further, our analysis provides information on institutional backgrounds and governance settings for green infrastructure and agricultural policy. The investigation started with key informant interviews at the regional level in administrative units responsible for relevant policies and procedures, such as regional planners and representatives of federal ministries, and continued at the local level with farmers and other members of the community. The analysis revealed the importance of information flows and regulations, but also of social pressure, all of which considerably influence biodiversity governance with respect to green infrastructure.

Relevance:

80.00%

Publisher:

Abstract:

With the dramatic growth of text information, there is an increasing need for powerful text mining systems that can automatically discover useful knowledge from text. Text is generally associated with all kinds of contextual information. Those contexts can be explicit, such as the time and the location where a blog article is written, and the author(s) of a biomedical publication, or implicit, such as the positive or negative sentiment that an author had when she wrote a product review; there may also be complex context such as the social network of the authors. Many applications require analysis of topic patterns over different contexts. For instance, analysis of search logs in the context of the user can reveal how we can improve the quality of a search engine by optimizing the search results according to particular users; analysis of customer reviews in the context of positive and negative sentiments can help the user summarize public opinions about a product; analysis of blogs or scientific publications in the context of a social network can facilitate discovery of more meaningful topical communities. Since context information significantly affects the choices of topics and language made by authors, it is very important to incorporate it into analyzing and mining text data. Modeling the context in text and discovering contextual patterns of language units and topics from text, a general task which we refer to as Contextual Text Mining, has widespread applications in text mining. In this thesis, we provide a novel and systematic study of contextual text mining, which is a new paradigm of text mining treating context information as the "first-class citizen." We formally define the problem of contextual text mining and its basic tasks, and propose a general framework for contextual text mining based on generative modeling of text.
This conceptual framework provides general guidance on text mining problems with context information and can be instantiated into many real tasks, including the general problem of contextual topic analysis. We formally present a functional framework for contextual topic analysis, with a general contextual topic model and its various versions, which can effectively solve the text mining problems in many real-world applications. We further introduce general components of contextual topic analysis, by adding priors to contextual topic models to incorporate prior knowledge, regularizing contextual topic models with dependency structure of context, and postprocessing contextual patterns to extract refined patterns. The refinements on the general contextual topic model naturally lead to a variety of probabilistic models which incorporate different types of context and various assumptions and constraints. These special versions of the contextual topic model are shown to be effective in a variety of real applications involving topics and explicit contexts, implicit contexts, and complex contexts. We then introduce a postprocessing procedure for contextual patterns, by generating meaningful labels for multinomial context models. This method provides a general way to interpret text mining results for real users. By applying contextual text mining in the "context" of other text information management tasks, including ad hoc text retrieval and web search, we further demonstrate the effectiveness of contextual text mining techniques in a quantitative way with large-scale datasets. The framework of contextual text mining not only unifies many explorations of text analysis with context information, but also opens up many new possibilities for future research directions in text mining.

Relevance:

80.00%

Publisher:

Abstract:

Ecological network analysis was applied in the Seine estuary ecosystem, northern France, integrating ecological data from the years 1996 to 2002. The Ecopath with Ecosim (EwE) approach was used to model the trophic flows in 6 spatial compartments, leading to 6 distinct EwE models: the navigation channel and the two channel flanks in the estuary proper, and 3 marine habitats in the eastern Seine Bay. Each model included 12 consumer groups, 2 primary producers, and one detritus group. Ecological network analysis was performed, including a set of indices, keystoneness, and trophic spectrum analysis, to describe the contribution of the 6 habitats to the functioning of the Seine estuary ecosystem. Results showed that the two habitats whose functioning was most indicative of a stressed state were the northern and central navigation channels, where building works and constant maritime traffic are considered major anthropogenic stressors. The strong top-down control highlighted in the other 4 habitats was not present in the central channel, which showed instead (i) a change in keystone roles in the ecosystem towards sediment-based, lower trophic levels, and (ii) a higher system omnivory. The southern channel showed the highest system activity (total system throughput), the highest trophic specialisation (low system omnivory), and the lowest indication of stress (low cycling and relative redundancy). Marine habitats showed higher fish biomass proportions and higher transfer efficiencies per trophic level than the estuarine habitats, with a transition area between the two that presented an intermediate ecosystem structure. Modelling the habitats separately made it possible to reveal each one's response to the different pressures, based on a priori knowledge of these pressures. Network indices, although they responded non-monotonically to these differences, seem a promising operational tool to define the ecological status of transitional water ecosystems.

Relevance:

80.00%

Publisher:

Abstract:

The focus of this research is to explore the applications of the finite difference formulation based on the latency insertion method (LIM) to the analysis of circuit interconnects. Special attention is devoted to addressing the issues that arise in very large networks such as on-chip signal and power distribution networks. We demonstrate that the LIM has the power and flexibility to handle various types of analysis required at different stages of circuit design. The LIM is particularly suitable for simulations of very large scale linear networks and can significantly outperform conventional circuit solvers (such as SPICE).

Relevance:

80.00%

Publisher:

Abstract:

By providing vehicle-to-vehicle and vehicle-to-infrastructure wireless communications, vehicular ad hoc networks (VANETs), also known as the “networks on wheels”, can greatly enhance traffic safety, traffic efficiency and driving experience for intelligent transportation systems (ITS). However, the unique features of VANETs, such as high mobility and uneven distribution of vehicular nodes, impose critical challenges of high efficiency and reliability for the implementation of VANETs. This dissertation is motivated by the great application potential of VANETs in the design of efficient in-network data processing and dissemination. Considering the significance of message aggregation, data dissemination and data collection, this dissertation research aims at enhancing traffic safety and traffic efficiency, as well as developing novel commercial applications, based on VANETs, along four lines: 1) accurate and efficient message aggregation to detect on-road safety-relevant events, 2) reliable data dissemination to notify remote vehicles, 3) efficient and reliable spatial data collection from vehicular sensors, and 4) novel promising applications to exploit the commercial potential of VANETs. Specifically, to enable cooperative detection of safety-relevant events on the roads, the structure-less message aggregation (SLMA) scheme is proposed to improve communication efficiency and message accuracy. The scheme of relative position based message dissemination (RPB-MD) is proposed to reliably and efficiently disseminate messages to all intended vehicles in the zone-of-relevance under varying traffic density. Because of the large volume of vehicular sensor data available through VANETs, the scheme of compressive sampling based data collection (CS-DC) is proposed to efficiently collect spatially relevant data on a large scale, especially in dense traffic.
In addition, with novel and efficient solutions proposed for the application specific issues of data dissemination and data collection, several appealing value-added applications for VANETs are developed to exploit the commercial potentials of VANETs, namely general purpose automatic survey (GPAS), VANET-based ambient ad dissemination (VAAD) and VANET based vehicle performance monitoring and analysis (VehicleView). Thus, by improving the efficiency and reliability in in-network data processing and dissemination, including message aggregation, data dissemination and data collection, together with the development of novel promising applications, this dissertation will help push VANETs further to the stage of massive deployment.

Relevance:

80.00%

Publisher:

Abstract:

Sporulation is a process in which some bacteria divide asymmetrically to form tough protective endospores, which help them to survive in a hazardous environment for quite a long time. The factors which can trigger this process are diverse. Heat, radiation, chemicals and lack of nutrients can all lead to the formation of endospores. This phenomenon leads to low productivity during industrial production. However, the sporulation mechanism in a spore-forming bacterium, Clostridium thermocellum, is still unclear. Therefore, if a regulation network of sporulation can be built, we may figure out ways to inhibit this process. In this study, a computational method is applied to predict the sporulation network in Clostridium thermocellum. A working sporulation network model with 40 newly predicted genes and 4 function groups is built by using a network construction program, CINPER. Five sets of microarray expression data from Clostridium thermocellum under different conditions have been collected. The analysis shows that the predicted result is reasonable.

Relevance:

80.00%

Publisher:

Abstract:

The spread of social networks has created the need for techniques to enforce copyright on, and authenticate, the files shared on them. A text watermarking method based on the substitution of homoglyph characters is presented and studied in the social network environment. Particular attention was paid to the possibility that these platforms already adopt text watermarking techniques; the potential of the proposed algorithm was then studied on the various platforms, evaluating its success rate, robustness and invisibility.
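The idea of homoglyph-based text watermarking can be sketched as follows. This is only an illustration of the general technique, not the algorithm proposed in the work above: the Latin-to-Cyrillic mapping, the one-bit-per-carrier encoding and the function names are all assumptions, and the sketch presumes the cover text contains no Cyrillic characters to begin with.

```python
# Latin letters and visually similar Cyrillic code points (assumed mapping).
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
}
REVERSE = {glyph: latin for latin, glyph in HOMOGLYPHS.items()}

def embed(text, bits):
    """Hide bits in text: a carrier letter is swapped for its homoglyph when the bit is 1."""
    remaining = list(bits)
    out = []
    for ch in text:
        if ch in HOMOGLYPHS and remaining:
            bit = remaining.pop(0)
            out.append(HOMOGLYPHS[ch] if bit else ch)
        else:
            out.append(ch)
    if remaining:
        raise ValueError("text has too few carrier characters for the payload")
    return "".join(out)

def extract(text, n_bits):
    """Read back n_bits: an original carrier letter is 0, a homoglyph is 1."""
    bits = []
    for ch in text:
        if len(bits) == n_bits:
            break
        if ch in HOMOGLYPHS:
            bits.append(0)
        elif ch in REVERSE:
            bits.append(1)
    return bits

def strip_watermark(text):
    """Map every homoglyph back to its Latin letter."""
    return "".join(REVERSE.get(ch, ch) for ch in text)

cover = "a simple cover sentence for people"
payload = [1, 0, 1, 1, 0]
marked = embed(cover, payload)
recovered = extract(marked, len(payload))
```

The marked text has the same length as the cover and should look identical in most fonts, while differing at the code-point level. Whether the mark survives posting depends on whether a platform normalizes or filters such characters, which is the robustness question the abstract raises.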

Relevance:

80.00%

Publisher:

Abstract:

Frame. Assessing the difficulty of source texts and parts thereof is important in CTIS, whether for research comparability, for didactic purposes or for setting price differences in the market. In order to measure it empirically, Campbell & Hale (1999) and Campbell (2000) developed the Choice Network Analysis (CNA) framework. Basically, the CNA’s main hypothesis is that the more translation options (a group of) translators have to render a given source text stretch, the higher the difficulty of that text stretch will be. We will call this the CNA hypothesis. In a nutshell, this research project puts the CNA hypothesis to the test and studies whether it does actually measure difficulty. Data collection. Two groups of participants (n=29) of different profiles and from two universities in different countries had three translation tasks keylogged with Inputlog, and filled in pre- and post-translation questionnaires. Participants translated from English (L2) into their L1s (Spanish or Italian), and worked, first in class and then at home, using their own computers, on texts ca. 800–1000 words long. Each text was translated in approximately equal halves in two 1-hour sessions, in three consecutive weeks. Only the parts translated at home were considered in the study. Results. A very different picture emerged from the data than that which the CNA hypothesis might predict: there was no prevalence of disfluent task segments when there were many translation options, nor was a prevalence of fluent task segments associated with fewer translation options. Indeed, there was no correlation between the number of translation options (many and few) and behavioral fluency. Additionally, there was no correlation between pauses and either behavioral fluency or typing speed. The discussed theoretical flaws and the empirical evidence lead to the conclusion that the CNA framework does not and cannot measure text and translation difficulty.
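The quantity at the heart of the CNA hypothesis, the number of distinct translation options produced for a source text stretch, is simple to count. The sketch below is a hypothetical illustration: the segments, the renderings and the normalization (trimming and lowercasing) are assumptions, not the study's data or procedure.

```python
def option_counts(renderings):
    """Count distinct renderings per source segment, ignoring case and outer spaces."""
    return {
        segment: len({version.strip().lower() for version in versions})
        for segment, versions in renderings.items()
    }

def rank_by_options(renderings):
    """Order segments from most to fewest distinct translation options."""
    counts = option_counts(renderings)
    return sorted(counts, key=counts.get, reverse=True)

# Hypothetical renderings of two source segments by three translators.
toy_renderings = {
    "the white house": ["la casa blanca", "La casa blanca ", "la casa blanca"],
    "however": ["sin embargo", "no obstante", "aun así"],
}
counts = option_counts(toy_renderings)
ranking = rank_by_options(toy_renderings)
```

Under the CNA hypothesis, the segment with three options would be the harder one. The study summarized above reports that such counts did not correlate with behavioral fluency.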

Relevance:

80.00%

Publisher:

Abstract:

In this thesis we discuss the expansion of an existing project, called CHIMeRA, which is a comprehensive biomedical network, and the analysis of its sub-components using graph theory. We describe how it is structured internally, which existing databases it retrieves information from, and which machine learning techniques are used to produce new knowledge. We also introduce a new technique for graph exploration that aims to speed up the network cover time under the condition that the analyzed graph is stellar; if this condition is satisfied, the improvement in performance over the conventional exploration technique is extremely appealing. We show that the stellar structure is highly recurrent in sub-networks of CHIMeRA generated by queries, which makes this technique even more interesting. Finally, we describe the convenience of using the CHIMeRA network for research purposes and what it could become in the very near future.