796 results for Data-Mining Techniques
Abstract:
Streamflow forecasts at a daily time scale are necessary for effective management of water resources systems. Typical applications include flood control, water quality management, water supply to multiple stakeholders, hydropower, and irrigation systems. Conventionally, physically based conceptual models and data-driven models are used for forecasting streamflows. Conceptual models require a detailed understanding of the physical processes governing the system being modeled. The major constraints in developing effective conceptual models are sparse hydrometric gauge networks and short historical records, which limit our understanding of the physical processes. Data-driven models, on the other hand, rely solely on previous hydrological and meteorological data without directly taking into account the underlying physical processes. Among the various data-driven models, Auto Regressive Integrated Moving Average (ARIMA) models and Artificial Neural Networks (ANNs) are the most widely used techniques. The present study assesses the performance of ARIMA and ANN methods in producing one- to seven-day-ahead forecasts of daily streamflows at the Basantpur streamgauge site, situated upstream of Hirakud Dam in the Mahanadi river basin, India. The ANNs considered include the Feed-Forward back propagation Neural Network (FFNN) and the Radial Basis Neural Network (RBNN). Daily streamflow forecasts at the Basantpur site are used in managing water from the Hirakud reservoir. (C) 2015 The Authors. Published by Elsevier B.V.
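To make the idea of a one- to seven-day-ahead forecast concrete, the sketch below fits an AR(1) model, which is the simplest special case of ARIMA (order (1,0,0)), by ordinary least squares and then forecasts recursively. It is an illustration only and not the model used in the study; the function names and the flow values are invented for the example.

```python
def fit_ar1(series):
    """Fit y[t] = c + phi * y[t-1] by ordinary least squares."""
    x = series[:-1]          # lagged values
    y = series[1:]           # values one step later
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast_ar1(series, horizon=7):
    """Recursive multi-step forecast: each prediction feeds the next step."""
    c, phi = fit_ar1(series)
    forecasts = []
    last = series[-1]
    for _ in range(horizon):
        last = c + phi * last
        forecasts.append(last)
    return forecasts


# Hypothetical daily flows, constructed to follow y[t] = 10 + 0.5 * y[t-1].
flows = [100.0, 60.0, 40.0, 30.0, 25.0, 22.5]
week_ahead = forecast_ar1(flows, horizon=7)
```

Because each step reuses the previous prediction, errors compound with the lead time, which is one reason forecast skill is usually reported separately for each of the seven horizons.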
Abstract:
Online Social Networks (OSNs) make it easy to create and spread information rapidly, influencing others to participate and propagandize. This work proposes a novel method of profiling an Influential Blogger (IB), a blogger who influences various other bloggers in a Social Blog Network (SBN), based on the activities performed on that blogger's blog documents. After constructing a social blogging site, an SBN is analyzed with appropriate parameters to obtain the Influential Blog Power (IBP) of each blogger in the network and to demonstrate that profiling IBs is adequate and accurate. With the proposed Profiling Influential Blogger (PIB) algorithm, the survival rate of IBs is high and stable. (C) 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Abstract:
A DNA microarray, or DNA chip, is a technology that yields the expression levels of many genes in a single experiment. Because numerical expression values are easy to obtain, multiple statistical data analysis techniques can be applied. In this project, microarray data are obtained from the Gene Expression Omnibus, the repository of the National Center for Biotechnology Information (NCBI). The noise is then removed and the data are normalized; hypothesis tests are used to find the most relevant genes that may be involved in a disease, and machine learning methods such as KNN, Random Forest, or K-means are applied. The analysis uses Bioconductor, a set of R packages for the analysis of biological data, and includes a case study on Alzheimer's disease. The complete code can be found at https://github.com/alberto-poncelas/bioc-alzheimer
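The hypothesis-testing step can be sketched as follows. The project itself works in R with Bioconductor; this is a Python illustration under simplifying assumptions, ranking genes by the absolute Welch t statistic between two groups and omitting the multiple-testing correction that a real analysis would need. The gene names, labels, and expression values are invented.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples with unequal variances."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def rank_genes(expression, labels):
    """Rank genes by |t| between the two classes, largest first.

    expression: dict mapping gene name -> list of normalized values, one per sample.
    labels: list of "case" / "control" labels aligned with the samples.
    """
    scores = {}
    for gene, values in expression.items():
        case = [v for v, lab in zip(values, labels) if lab == "case"]
        control = [v for v, lab in zip(values, labels) if lab == "control"]
        scores[gene] = abs(welch_t(case, control))
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical normalized expression values for six samples.
labels = ["case", "case", "case", "control", "control", "control"]
expression = {
    "GENE_A": [8.1, 7.9, 8.3, 5.0, 5.2, 4.9],  # clearly shifted between groups
    "GENE_B": [6.0, 6.4, 5.8, 6.1, 5.9, 6.3],  # no obvious difference
}
ranking = rank_genes(expression, labels)
```

The top-ranked genes would then serve as features for a classifier such as KNN or Random Forest.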
Abstract:
A data mining model that predicts whether a flight will leave late because of a weather delay. It can be used to book a later connection when the traveler has a connecting flight.
Abstract:
188 p.
Abstract:
194 p.
Abstract:
Infrastructure spatial data, such as the orientation and location of in-place structures and these structures' boundaries and areas, play a very important role in many civil infrastructure development and rehabilitation applications, such as defect detection, site planning, and on-site safety assistance. A number of modern optical-based spatial data acquisition techniques can be used to acquire these data. These techniques are based on stereo vision, optics, time of flight, etc., and have distinct characteristics, benefits, and limitations. The main purpose of this paper is to compare these optical-based infrastructure spatial data acquisition techniques against civil infrastructure application requirements. To achieve this goal, the benefits and limitations of the techniques were identified. The techniques were then compared according to the applications' requirements, such as spatial accuracy, automation of acquisition, and portability of devices. This comparison identified the unique characteristics of each technique, so that practitioners can select an appropriate technique for their own applications.
Abstract:
Expressed sequence tags (ESTs) are a source for microsatellite development. In the present study, EST-derived microsatellites (EST-SSRs) were generated and characterized in the common carp (Cyprinus carpio) by data mining from updated public EST databases and by subsequent testing for polymorphism. About 5.5% (555) of 10,088 ESTs contain repeat motifs of various types and lengths, with CA being the most abundant dinucleotide motif. Of the 60 EST-SSRs for which PCR primers were designed, 25 loci showed polymorphism in a common carp population, with the number of alleles per locus ranging from 3 to 17 (mean 7). The observed (HO) and expected (HE) heterozygosities of these EST-SSRs were 0.13-1.00 and 0.12-0.91, respectively. Six EST-SSR loci deviated significantly from the Hardy-Weinberg equilibrium (HWE) expectation, and the remaining 19 loci were in HWE. Of the 60 primer sets, the rates of polymorphic EST-SSRs were 42% in common carp, 17% in crucian carp (Carassius auratus), and 5% in silver carp (Hypophthalmichthys molitrix). These new EST-SSR markers should provide sufficient polymorphism for population genetic studies and genome mapping of the common carp and its closely related fishes. (c) 2007 Published by Elsevier B.V.
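For readers unfamiliar with the two heterozygosity measures, the sketch below computes both for a single locus. It uses the simple form of expected heterozygosity, one minus the sum of squared allele frequencies; the abstract does not say which estimator the study used, so this is an assumption. The genotypes are invented for illustration.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    n = len(genotypes)
    # Observed: fraction of individuals carrying two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies are taken over the 2n allele copies in the sample.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    # Expected under Hardy-Weinberg: 1 - sum of squared allele frequencies.
    h_exp = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return h_obs, h_exp


# Hypothetical genotypes at one locus (allele sizes in base pairs).
sample = [(120, 124), (120, 120), (124, 128), (120, 124)]
h_obs, h_exp = heterozygosity(sample)
```

A large gap between the two values at a locus is the kind of signal that a formal test for deviation from Hardy-Weinberg equilibrium evaluates.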
Abstract:
King, R. D. and Wise, P. H. and Clare, A. (2004) Confirmation of Data Mining Based Predictions of Protein Function. Bioinformatics 20(7), 1110-1118
Abstract:
Clare, A. and King, R. D. (2003) Data mining the yeast genome in a lazy functional language. In Practical Aspects of Declarative Languages (PADL'03) (won Best/Most Practical Paper award).
Abstract:
Chapter 15