803 results for Sensor Networks and Data Streaming
Abstract:
Big data comes in various ways, types, shapes, forms and sizes. Indeed, almost all areas of science, technology, medicine, public health, economics, business, linguistics and social science are bombarded by ever increasing flows of data begging to be analyzed efficiently and effectively. In this paper, we propose a rough idea of a possible taxonomy of big data, along with some of the most commonly used tools for handling each particular category of bigness. The dimensionality p of the input space and the sample size n are usually the main ingredients in the characterization of data bigness. The specific statistical machine learning technique used to handle a particular big data set will depend on which category it falls in within the bigness taxonomy. Large p small n data sets for instance require a different set of tools from the large n small p variety. Among other tools, we discuss Preprocessing, Standardization, Imputation, Projection, Regularization, Penalization, Compression, Reduction, Selection, Kernelization, Hybridization, Parallelization, Aggregation, Randomization, Replication, Sequentialization. Indeed, it is important to emphasize right away that the so-called no free lunch theorem applies here, in the sense that there is no universally superior method that outperforms all other methods on all categories of bigness. It is also important to stress the fact that simplicity in the sense of Ockham’s razor non-plurality principle of parsimony tends to reign supreme when it comes to massive data. We conclude with a comparison of the predictive performance of some of the most commonly used methods on a few data sets.
Abstract:
For wireless power transfer (WPT) systems, communication between the primary side and the pickup side is a challenge because of the large air gap and magnetic interference. A novel method, which integrates bidirectional data communication into a high-power WPT system, is proposed in this paper. The power and data transfer share the same inductive link between coreless coils. A power/data frequency division multiplexing technique is applied: the power and data are transmitted on different frequency carriers and controlled independently. The circuit model of the multiband system is provided to analyze the transmission gain of the communication channel, as well as the power delivery performance. The crosstalk interference between the two carriers is discussed. In addition, the signal-to-noise ratios of the channels are estimated, which gives a guideline for the design of the mod/demod circuits. Finally, a 500-W WPT prototype has been built to demonstrate the effectiveness of the proposed WPT system.
Abstract:
Sentiment classification over Twitter is usually affected by the noisy nature (abbreviations, irregular forms) of tweet data. A popular procedure to reduce the noise of textual data is to remove stopwords by using pre-compiled stopword lists or more sophisticated methods for dynamic stopword identification. However, the effectiveness of removing stopwords in the context of Twitter sentiment classification has been debated in the last few years. In this paper we investigate whether removing stopwords helps or hampers the effectiveness of Twitter sentiment classification methods. To this end, we apply six different stopword identification methods to Twitter data from six different datasets and observe how removing stopwords affects two well-known supervised sentiment classification methods. We assess the impact of removing stopwords by observing fluctuations in the level of data sparsity, the size of the classifier's feature space and its classification performance. Our results show that using pre-compiled lists of stopwords negatively impacts the performance of Twitter sentiment classification approaches. On the other hand, the dynamic generation of stopword lists, by removing those infrequent terms appearing only once in the corpus, appears to be the optimal method for maintaining a high classification performance while reducing the data sparsity and substantially shrinking the feature space.
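The dynamic stopword idea described in this abstract (dropping terms that occur only once in the corpus) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the whitespace tokenization, the toy corpus, and the definition of sparsity as the share of zero entries in a binary tweet-by-term matrix are all assumptions made for illustration.

```python
from collections import Counter


def singleton_stopwords(tokenized_tweets):
    """Return the terms that occur exactly once in the whole corpus."""
    counts = Counter(tok for tweet in tokenized_tweets for tok in tweet)
    return {term for term, count in counts.items() if count == 1}


def remove_terms(tokenized_tweets, stopwords):
    """Drop every token that is in the given stopword set."""
    return [[tok for tok in tweet if tok not in stopwords]
            for tweet in tokenized_tweets]


def vocabulary_size(tokenized_tweets):
    """Number of distinct terms, i.e. the size of the feature space."""
    return len({tok for tweet in tokenized_tweets for tok in tweet})


def sparsity(tokenized_tweets):
    """Share of zero entries in a binary tweet-by-term matrix (assumed definition)."""
    vocab = vocabulary_size(tokenized_tweets)
    if vocab == 0 or not tokenized_tweets:
        return 0.0
    nonzero = sum(len(set(tweet)) for tweet in tokenized_tweets)
    return 1.0 - nonzero / (len(tokenized_tweets) * vocab)


# Hypothetical toy corpus, tokenized on whitespace.
corpus = [
    "love this phone".split(),
    "love this phone lol".split(),
    "hate this phone smh".split(),
    "hate this battery".split(),
]

stopwords = singleton_stopwords(corpus)
filtered = remove_terms(corpus, stopwords)

print(sorted(stopwords))
print(vocabulary_size(corpus), vocabulary_size(filtered))
print(sparsity(corpus), sparsity(filtered))
```

On this toy corpus the singleton filter shrinks the vocabulary and lowers the sparsity of the matrix, the two quantities the abstract reports tracking alongside classification performance. Whether classification accuracy is preserved is an empirical question that this sketch does not address.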
Abstract:
Innovation is one of the key drivers of competitive advantage for any firm. Understanding knowledge transfer through inter-firm networks and its effects on types of innovation in SMEs is very important for improving SME innovation. This study examines relationships between characteristics of inter-firm knowledge transfer networks and types of innovation in SMEs. To achieve this, a social network perspective is adopted to understand inter-firm knowledge transfer networks and their impact on innovation by investigating how and to what extent ego network characteristics affect types of innovation. On this basis, managers can develop their firms' networks according to their strategies and requirements. First, a conceptual model and research hypotheses are proposed to establish the possible relationship between network properties and types of innovation. Three aspects of the ego network are identified and adopted for hypothesis development: 1) structural properties, which address the potential for resources and the context for the flow of resources, 2) relational properties, which reflect the quality of resource flows, and 3) nodal properties, which concern the quality and variety of the resources and capabilities of the ego's partners. A questionnaire has been designed based on the hypotheses. Second, semi-structured interviews with managers of five SMEs have been carried out, and a thematic qualitative analysis of these interviews has been performed. The interviews helped to revise the questionnaire and provided preliminary evidence to support the hypotheses. Insights from the preliminary investigation also helped to develop the research plan for the next stage of this research.
Abstract:
Business angels are natural persons who provide equity financing for young enterprises and gain ownership in them. They are usually anonymous investors who operate in the background of the companies. An important feature is that, beyond funding the enterprises, they can draw on their business experience to contribute to the success of the companies with their special expertise and strategic support. As a result of the asymmetric information between the angels and the companies, their matching is difficult (Becsky-Nagy – Fazekas 2015), and the fact that angel investors prefer anonymity makes it harder for entrepreneurs to obtain informal venture capital. The primary aim of the different types of business angel organizations and networks is to ease this matching process by intermediating between the two parties. The role of these organizations is increasing in the informal venture capital market compared to that of individually operating angels. The recognition of their economic importance has led many governments to support them. There have also been public initiatives aimed at establishing these intermediary organizations, which led to the institutionalization of business angels. Through a characterization of business angels, this study focuses on the progress of these informational intermediaries and the ways they have developed, with regard to international trends and the current situation of Hungarian business angels and angel networks.
Abstract:
Groundwater systems of different densities are often mathematically modeled to understand and predict environmental behavior such as seawater intrusion or submarine groundwater discharge. Additional data collection may be justified if it will cost-effectively aid in reducing the uncertainty of a model's prediction. The collection of salinity data, as well as temperature data, could aid in reducing predictive uncertainty in a variable-density model. However, before numerical models can be created, rigorous testing of the modeling code needs to be completed. This research documents the benchmark testing of a new modeling code, SEAWAT Version 4. The benchmark problems include various combinations of density-dependent flow resulting from variations in concentration and temperature. The verified code, SEAWAT, was then applied to two different hydrological analyses to explore the capacity of a variable-density model to guide data collection. The first analysis tested a linear method to guide data collection by quantifying the contribution of different data types and locations toward reducing predictive uncertainty in a nonlinear variable-density flow and transport model. The relative contributions of temperature and concentration measurements, at different locations within a simulated carbonate platform, for predicting movement of the saltwater interface were assessed. Results from the method showed that concentration data had greater worth than temperature data in reducing predictive uncertainty in this case. Results also indicated that a linear method could be used to quantify data worth in a nonlinear model. The second hydrological analysis utilized a model to identify the transient response of the salinity, temperature, age, and amount of submarine groundwater discharge to changes in tidal ocean stage, seasonal temperature variations, and different types of geology. The model was compared to multiple kinds of data to (1) calibrate and verify the model, and (2) explore the potential for the model to be used to guide the collection of data using techniques such as electromagnetic resistivity, thermal imagery, and seepage meters. Results indicated that the model can be used to give insight into submarine groundwater discharge and to guide data collection.
Abstract:
Due to the rapid advances in computing and sensing technologies, enormous amounts of data are being generated every day in various applications. The integration of data mining and data visualization has been widely used to analyze these massive and complex data sets to discover hidden patterns. For both data mining and visualization to be effective, it is important to include the visualization techniques in the mining process and to present the discovered patterns in a more comprehensive visual view. In this dissertation, four related problems (dimensionality reduction for visualizing high dimensional datasets, visualization-based clustering evaluation, interactive document mining, and multiple clusterings exploration) are studied to explore the integration of data mining and data visualization. In particular, we 1) propose an efficient feature selection method (reliefF + mRMR) for preprocessing high dimensional datasets; 2) present DClusterE to integrate cluster validation with user interaction and provide rich visualization tools for users to examine document clustering results from multiple perspectives; 3) design two interactive document summarization systems to involve users' efforts and generate customized summaries from 2D sentence layouts; and 4) propose a new framework which organizes the different input clusterings into a hierarchical tree structure and allows for interactive exploration of multiple clustering solutions.
Abstract:
Solving microkinetics of catalytic systems, which bridges microscopic processes and macroscopic reaction rates, is currently vital for understanding catalysis in silico. However, traditional microkinetic solvers possess several drawbacks that make the process slow and unreliable for complicated catalytic systems. In this paper, a new approach, the so-called reversibility iteration method (RIM), is developed to solve microkinetics for catalytic systems. Using the chemical potential notation we previously proposed to simplify the kinetic framework, the catalytic systems can be analytically illustrated to be logically equivalent to the electric circuit, and the reaction rate and coverage can be calculated by updating the values of reversibilities. Compared to the traditional modified Newton iteration method (NIM), our method is not sensitive to the initial guess of the solution and typically requires fewer iteration steps. Moreover, the method does not require arbitrary-precision arithmetic and has a higher probability of successfully solving the system. These features make it ∼1000 times faster than the modified Newton iteration method for the systems we tested. Moreover, the derived concept and the mathematical framework presented in this work may provide new insight into catalytic reaction networks.
Abstract:
Background
It is generally acknowledged that a functional understanding of a biological system can only be obtained by understanding the collective of molecular interactions in the form of biological networks. Protein networks are one network type of special importance, because proteins form the functional base units of every biological cell. On a mesoscopic level of protein networks, modules are of significant importance because these building blocks may be the next elementary functional level above individual proteins, allowing insight into fundamental organizational principles of biological cells.
Results
In this paper, we provide a comparative analysis of five popular and four novel module detection algorithms. We study these module prediction methods for simulated benchmark networks as well as 10 biological protein interaction networks (PINs). A particular focus of our analysis is placed on the biological meaning of the predicted modules by utilizing the Gene Ontology (GO) database as a gold standard for the definition of biological processes. Furthermore, we investigate the robustness of the results by perturbing the PINs, simulating in this way our incomplete knowledge of protein networks.
Conclusions
Overall, our study reveals that there is a large heterogeneity among the different module prediction algorithms if one zooms in on the level of biological processes in the form of GO terms, and that all methods are severely affected by a slight perturbation of the networks. However, we also find pathways that are enriched in multiple modules, which could provide important information about the hierarchical organization of the system.
Abstract:
Pavements tend to deteriorate with time under repeated traffic and/or environmental loading. By detecting pavement distresses and damage early enough, it is possible for transportation agencies to develop more effective pavement maintenance and rehabilitation programs and thereby achieve significant cost and time savings. The structural health monitoring (SHM) concept can be considered as a systematic method for assessing the structural state of pavement infrastructure systems and documenting their condition. Over the past several years, this process has traditionally been accomplished through the use of wired sensors embedded in bridge and highway pavement. However, the use of wired sensors has limitations for long-term SHM and presents other associated cost and safety concerns. Recently, micro-electromechanical sensors and systems (MEMS) and nano-electromechanical systems (NEMS) have emerged as advanced/smart-sensing technologies with potential for cost-effective and long-term SHM. This two-pronged study evaluated the performance of commercial off-the-shelf (COTS) MEMS sensors embedded in concrete pavement (Final Report Volume I) and developed a wireless MEMS multifunctional sensor system for health monitoring of concrete pavement (Final Report Volume II).
Abstract:
This chapter examines community media projects in Scotland as social processes that nurture knowledge through participation in production. A visual and media anthropology framework (Ginsburg, 2005) with an emphasis on the social context of media production informs the analysis of community media. Drawing on community media projects in the Govan area of Glasgow and the Isle of Bute, the techniques of production foreground “the relational aspects of filmmaking” (Grimshaw and Ravetz, 2005: 7) and act as a catalyst for knowledge and networks of relations embedded in time and place. Community media is defined here as a creative social process, characterised by an approach to production that is multi-authored, collaborative and informed by the lives of participants, and which recognises the relevance of networks of relations to that practice (Caines, 2007: 2). As a networked process, community media production is recognised as existing in collaboration between a director or producer, such as myself, and organisations, institutions and participants, who are connected through a range of identities, practices and place. These relations born of the production process reflect a complex area of practice and participation that brings together “parallel and overlapping public spheres” (Meadows et al., 2002: 3). This relates to broader concerns with networks (Carpentier, Servaes and Lie, 2003; Rodríguez, 2001), both revealed during the process of production and enhanced by it, and how they can be described with reference to the knowledge practice of community media.