914 results for predictive coding


Relevance: 20.00%

Abstract:

Big data comes in various ways, types, shapes, forms and sizes. Indeed, almost all areas of science, technology, medicine, public health, economics, business, linguistics and social science are bombarded by ever-increasing flows of data begging to be analyzed efficiently and effectively. In this paper, we propose a rough idea of a possible taxonomy of big data, along with some of the most commonly used tools for handling each particular category of bigness. The dimensionality p of the input space and the sample size n are usually the main ingredients in the characterization of data bigness. The specific statistical machine learning technique used to handle a particular big data set will depend on which category it falls into within the bigness taxonomy. Large-p, small-n data sets, for instance, require a different set of tools from the large-n, small-p variety. Among other tools, we discuss Preprocessing, Standardization, Imputation, Projection, Regularization, Penalization, Compression, Reduction, Selection, Kernelization, Hybridization, Parallelization, Aggregation, Randomization, Replication, Sequentialization. Indeed, it is important to emphasize right away that the so-called no free lunch theorem applies here, in the sense that there is no universally superior method that outperforms all other methods on all categories of bigness. It is also important to stress that simplicity, in the sense of Ockham's razor (the non-plurality principle of parsimony), tends to reign supreme when it comes to massive data. We conclude with a comparison of the predictive performance of some of the most commonly used methods on a few data sets.
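The large-p, small-n versus large-n, small-p distinction can be made concrete with a small sketch (not from the paper; the dataset shape and library choices are illustrative assumptions). It uses scikit-learn to fit a cross-validated lasso on a p >> n regression problem, where penalization performs the variable selection that an unregularized least-squares fit cannot.

```python
# Illustrative sketch: regularization for a large-p, small-n regression problem.
# Assumes scikit-learn and NumPy are available; shapes are arbitrary.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p = 50, 2000                            # far more features than observations
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = [3.0, -2.0, 1.5, 1.0, -1.0]     # only 5 features truly matter
y = X @ beta + 0.5 * rng.standard_normal(n)

# Penalization: the lasso regularization strength is tuned by cross-validation.
lasso = LassoCV(cv=5).fit(X, y)
selected = np.flatnonzero(lasso.coef_)
print(f"alpha = {lasso.alpha_:.4f}, non-zero coefficients: {selected[:10]}")
```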

Relevance: 20.00%

Abstract:

This research evaluates pattern recognition techniques on a subclass of big data where the dimensionality of the input space (p) is much larger than the number of observations (n). Specifically, we evaluate massive gene expression microarray cancer data where the ratio κ is less than one. We explore the statistical and computational challenges inherent in these high-dimensional, low-sample-size (HDLSS) problems and present statistical machine learning methods used to tackle and circumvent these difficulties. Regularization and kernel algorithms were explored in this research using seven datasets where κ < 1. These techniques require special attention to tuning, necessitating the investigation of several extensions of cross-validation to support better predictive performance. While no single algorithm was universally the best predictor, the regularization technique produced lower test errors in five of the seven datasets studied.
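A minimal sketch of the kind of tuning the abstract refers to, assuming scikit-learn: a regularized linear classifier and a kernel classifier are each tuned by cross-validation on a synthetic p >> n problem. The data, parameter grids, and fold counts are invented for illustration and do not reproduce the seven microarray datasets.

```python
# Illustrative sketch: cross-validated tuning of a regularized and a kernel
# classifier on a high-dimensional, low-sample-size (p >> n) problem.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

X, y = make_classification(n_samples=60, n_features=3000, n_informative=20,
                           random_state=0)          # n far smaller than p

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Regularization: L2-penalized logistic regression, penalty strength tuned by CV.
ridge = GridSearchCV(LogisticRegression(penalty="l2", max_iter=5000),
                     {"C": [0.01, 0.1, 1.0, 10.0]}, cv=cv).fit(X, y)

# Kernelization: RBF support vector machine, C and gamma tuned by CV.
svm = GridSearchCV(SVC(kernel="rbf"),
                   {"C": [1.0, 10.0], "gamma": ["scale", 1e-4]}, cv=cv).fit(X, y)

print("regularized:", ridge.best_score_, ridge.best_params_)
print("kernel:     ", svm.best_score_, svm.best_params_)
```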

Relevance: 20.00%

Abstract:

A novel simulation model for pyrolysis processes of lignocellulosic biomass in Aspen Plus® was presented at the BC&E 2013. Based on kinetic reaction mechanisms, the simulation calculates product compositions and yields depending on reactor conditions (temperature, residence time, flue gas flow rate) and feedstock composition (biochemical composition, atomic composition, ash and alkali metal content). The simulation model was found to show good agreement with existing publications. In order to further verify the model, our own pyrolysis experiments in a 1 kg/h continuously fed fluidized bed fast pyrolysis reactor are performed. Two types of biomass with different characteristics are processed in order to evaluate the influence of the feedstock composition on the yields of the pyrolysis products and their composition: one woody and one straw-like feedstock are used because of their differing characteristics. Furthermore, the temperature response of yields and product compositions is evaluated by varying the reactor temperature between 450 and 550 °C for one of the feedstocks. The yields of the pyrolysis products (gas, oil, char) are determined and their detailed composition is analysed. The experimental runs are reproduced with the corresponding reactor conditions in the Aspen Plus model and the results are compared with the experimental findings.

Relevance: 20.00%

Abstract:

Purpose – The purpose of this paper is to examine the challenges and potential of big data in heterogeneous business networks and relate these to an implemented logistics solution.

Design/methodology/approach – The paper establishes an overview of challenges and opportunities of current significance in the area of big data, specifically in the context of transparency and processes in heterogeneous enterprise networks. Within this context, the paper presents how existing components and purpose-driven research were combined for a solution implemented in a nationwide network for less-than-truckload consignments.

Findings – Aside from providing an extended overview of today's big data situation, the findings show that technical means and methods available today can comprise a feasible process-transparency solution in a large heterogeneous network where legacy practices, reporting lags and incomplete data exist, yet processes are sensitive to inadequate policy changes.

Practical implications – The means introduced in the paper were found to be of utility value in improving process efficiency, transparency and planning in logistics networks. The particular system design choices in the presented solution allow an incremental introduction or evolution of resource-handling practices, incorporating existing fragmentary, unstructured or tacit knowledge of experienced personnel into the theoretically founded overall concept.

Originality/value – The paper extends previous high-level views on the potential of big data and presents new applied research and development results in a logistics application.

Relevance: 20.00%

Abstract:

Banks and lenders in Hungary also began, after the introduction of the Basel 2 capital agreement, to build up their internal rating systems, whose maintenance and development are a continuing task. The author explores whether it is possible to increase the predictive capacity of business-failure forecasting models by traditional mathematical and statistical means in such a way that they incorporate the measure of change in the financial indicators over time. Empirical findings suggest that the temporal development of the financial indicators of firms in Hungary carries important information about their future ability to pay, since the predictive capacity of bankruptcy forecasting models is greatly increased by using such indicators. The author also examines whether the classification performance of the models can be improved by correcting for extremely high or low values before modelling.
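A minimal sketch of the general idea, not the author's actual models or data: a bankruptcy classifier is fitted once on level ratios only and once with their changes over time added, with extreme values winsorized beforehand. All variable names, thresholds, and the synthetic data are illustrative assumptions.

```python
# Illustrative sketch: does adding the change over time of financial ratios,
# after winsorizing extreme values, improve a bankruptcy model's AUC?
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 500
ratios_t0 = rng.standard_normal((n, 4))              # e.g. liquidity, leverage, ...
ratios_t1 = ratios_t0 + rng.standard_normal((n, 4))
delta = ratios_t1 - ratios_t0                         # change over time
# Synthetic default indicator that depends partly on the dynamics.
y = (ratios_t1[:, 1] + delta[:, 0] + rng.standard_normal(n) > 1.0).astype(int)

def winsorize(X, lo=1, hi=99):
    """Clip each column to its lo-th/hi-th percentile (outlier correction)."""
    low, high = np.percentile(X, [lo, hi], axis=0)
    return np.clip(X, low, high)

X_levels = winsorize(ratios_t1)
X_dynamic = winsorize(np.hstack([ratios_t1, delta]))

model = LogisticRegression(max_iter=1000)
auc_levels = cross_val_score(model, X_levels, y, cv=5, scoring="roc_auc").mean()
auc_dynamic = cross_val_score(model, X_dynamic, y, cv=5, scoring="roc_auc").mean()
print(f"AUC, levels only: {auc_levels:.3f}  with changes: {auc_dynamic:.3f}")
```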

Relevance: 20.00%

Abstract:

Homework has been a controversial issue in education for the past century. Research has been scarce and has yielded results at both ends of the spectrum. This study examined the relationship between homework performance (percent of homework completed and percent of homework correct), student characteristics (SAT-9 score, gender, ethnicity, and socio-economic status), perceptions, and challenges, and academic achievement as determined by the students' average score on weekly tests and their score on the FCAT NRT mathematics assessment.

The subjects for this study consisted of 143 students enrolled in Grade 3 at a suburban elementary school in Miami, Florida. Pearson's correlations were used to examine the associations of the predictor variables with average test scores and FCAT NRT scores. Additionally, simultaneous regression analyses were carried out to examine the influence of the predictor variables on each of the criterion variables. Hierarchical regression analyses were then performed to predict each criterion variable from the predictor variables.

Homework performance was significantly correlated with average test score. Controlling for the other variables, homework performance was highly related to both average test score and FCAT NRT score.

This study lends support to the view that homework completion is highly related to student academic achievement at the lower elementary level. It is suggested that, at the elementary level, more consideration be given to the amount of homework completed by students and that this information be used in formulating intervention strategies for students who may not be achieving at the appropriate levels.
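The hierarchical regression step can be illustrated with a short sketch (hypothetical data and variable names, using statsmodels rather than whatever software the study used): student characteristics are entered first, homework performance is added second, and the change in R² indicates its incremental contribution.

```python
# Illustrative sketch: hierarchical (blockwise) OLS regression.
# Block 1: student characteristics; Block 2: + homework performance.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 143
df = pd.DataFrame({
    "sat9": rng.normal(600, 50, n),
    "ses": rng.integers(0, 2, n),
    "pct_completed": rng.uniform(40, 100, n),
    "pct_correct": rng.uniform(40, 100, n),
})
df["avg_test"] = 0.05 * df["sat9"] + 0.3 * df["pct_completed"] + rng.normal(0, 5, n)

block1 = sm.add_constant(df[["sat9", "ses"]])
block2 = sm.add_constant(df[["sat9", "ses", "pct_completed", "pct_correct"]])

m1 = sm.OLS(df["avg_test"], block1).fit()
m2 = sm.OLS(df["avg_test"], block2).fit()
print(f"R2 block 1: {m1.rsquared:.3f}, block 2: {m2.rsquared:.3f}, "
      f"delta R2: {m2.rsquared - m1.rsquared:.3f}")
```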

Relevance: 20.00%

Abstract:

Groundwater systems of different densities are often mathematically modeled to understand and predict environmental behavior such as seawater intrusion or submarine groundwater discharge. Additional data collection may be justified if it will cost-effectively aid in reducing the uncertainty of a model's prediction. The collection of salinity as well as temperature data could aid in reducing predictive uncertainty in a variable-density model. However, before numerical models can be created, rigorous testing of the modeling code needs to be completed. This research documents the benchmark testing of a new modeling code, SEAWAT Version 4. The benchmark problems include various combinations of density-dependent flow resulting from variations in concentration and temperature. The verified code, SEAWAT, was then applied to two different hydrological analyses to explore the capacity of a variable-density model to guide data collection.

The first analysis tested a linear method to guide data collection by quantifying the contribution of different data types and locations toward reducing predictive uncertainty in a nonlinear variable-density flow and transport model. The relative contributions of temperature and concentration measurements, at different locations within a simulated carbonate platform, to predicting movement of the saltwater interface were assessed. Results from the method showed that concentration data had greater worth than temperature data in reducing predictive uncertainty in this case. Results also indicated that a linear method could be used to quantify data worth in a nonlinear model.

The second hydrological analysis utilized a model to identify the transient response of the salinity, temperature, age, and amount of submarine groundwater discharge to changes in tidal ocean stage, seasonal temperature variations, and different types of geology. The model was compared to multiple kinds of data to (1) calibrate and verify the model, and (2) explore the potential for the model to be used to guide the collection of data using techniques such as electromagnetic resistivity, thermal imagery, and seepage meters. Results indicated that the model can be used to give insight into submarine groundwater discharge and to guide data collection.
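The linear data-worth idea can be sketched generically: under a linearized model with sensitivity matrix J, prior parameter covariance C_p, and measurement-noise covariance C_e, the posterior variance of a prediction with sensitivity vector y shrinks as observations are added, so candidate measurements can be ranked by that reduction. The sketch below is a generic first-order calculation in this spirit, not the SEAWAT application itself; all matrices and sizes are invented.

```python
# Illustrative sketch: first-order (linear) data-worth analysis.
# Rank candidate observations by how much they reduce predictive uncertainty.
import numpy as np

rng = np.random.default_rng(3)
n_par, n_obs = 8, 5
J = rng.standard_normal((n_obs, n_par))      # observation sensitivities
Cp = np.eye(n_par)                           # prior parameter covariance
Ce = 0.1 * np.eye(n_obs)                     # measurement-noise covariance
y = rng.standard_normal(n_par)               # prediction sensitivity vector

def predictive_variance(rows):
    """Posterior variance of the prediction when only `rows` are observed."""
    if not rows:
        return y @ Cp @ y
    Jr, Cer = J[rows], Ce[np.ix_(rows, rows)]
    Cpost = Cp - Cp @ Jr.T @ np.linalg.solve(Jr @ Cp @ Jr.T + Cer, Jr @ Cp)
    return y @ Cpost @ y

prior = predictive_variance([])
for i in range(n_obs):
    worth = prior - predictive_variance([i])
    print(f"observation {i}: uncertainty reduction {worth:.3f}")
```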

Relevance: 20.00%

Abstract:

Polynomial phase modulated (PPM) signals have been shown to provide improved error rate performance with respect to conventional modulation formats under additive white Gaussian noise and fading channels in single-input single-output (SISO) communication systems. In this dissertation, systems with two and four transmit antennas using PPM signals were presented. In both cases we employed full-rate space-time block codes in order to take advantage of the multipath channel. For two transmit antennas, we used the orthogonal space-time block code (OSTBC) proposed by Alamouti and performed symbol-wise decoding by estimating the phase coefficients of the PPM signal using three different methods: maximum-likelihood (ML), sub-optimal ML (S-ML) and the high-order ambiguity function (HAF). In the case of four transmit antennas, we used the full-rate quasi-OSTBC (QOSTBC) proposed by Jafarkhani. However, in order to ensure the best error rate performance, PPM signals were selected so as to maximize the QOSTBC's minimum coding gain distance (CGD). Since this method does not always provide a unique solution, an additional criterion known as the maximum channel interference coefficient (CIC) was proposed. Through Monte Carlo simulations it was shown that by using QOSTBCs along with properly selected PPM constellations based on the CGD and CIC criteria, full diversity in flat fading channels, and thus low BER at high signal-to-noise ratios (SNR), can be ensured. Lastly, the performance of symbol-wise decoding for QOSTBCs was evaluated. In this case, a quasi-zero-forcing method was used to decouple the received signal, and it was shown that although this technique reduces the decoding complexity of the system, there is a penalty to be paid in terms of error rate performance at high SNRs.
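For the two-antenna case, the Alamouti orthogonal space-time block code the dissertation builds on can be sketched in a few lines. The sketch uses generic unit-energy complex symbols rather than actual polynomial phase waveforms, and invented channel and noise values, so it only illustrates the encoding and symbol-wise combining step.

```python
# Illustrative sketch: Alamouti OSTBC (2 Tx, 1 Rx) over a flat-fading channel.
import numpy as np

rng = np.random.default_rng(4)
s1, s2 = np.exp(1j * np.pi / 4), np.exp(-1j * 3 * np.pi / 4)   # two symbols
h1, h2 = (rng.standard_normal(2) + 1j * rng.standard_normal(2)) / np.sqrt(2)
noise = 0.01 * (rng.standard_normal(2) + 1j * rng.standard_normal(2))

# Slot 1: antennas send (s1, s2); slot 2: antennas send (-s2*, s1*).
r1 = h1 * s1 + h2 * s2 + noise[0]
r2 = -h1 * np.conj(s2) + h2 * np.conj(s1) + noise[1]

# Symbol-wise linear combining recovers decoupled estimates of s1 and s2.
gain = abs(h1) ** 2 + abs(h2) ** 2
s1_hat = (np.conj(h1) * r1 + h2 * np.conj(r2)) / gain
s2_hat = (np.conj(h2) * r1 - h1 * np.conj(r2)) / gain
print(np.round(s1_hat, 3), np.round(s2_hat, 3))
```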

Relevance: 20.00%

Abstract:

This study examined the relationship between homework performance (percent of homework completed and percent of homework correct), student characteristics (Stanford Achievement Test score, gender, ethnicity, and socio-economic status), perceptions, and challenges, and academic achievement as determined by the students' average score on weekly tests and their score on the Florida Comprehensive Assessment Test (FCAT) Norm-Referenced Test (NRT) mathematics assessment.

Relevance: 20.00%

Abstract:

Recently, polynomial phase modulation (PPM) was shown to be a power- and bandwidth-efficient modulation format. These two characteristics are in high demand nowadays, especially in mobile applications, where devices with size, weight, and power (SWaP) constraints are common. In this paper, we propose implementing a full-diversity quasi-orthogonal space-time block code (QOSTBC) using polynomial phase signals as the modulation format. QOSTBCs along with PPM are used in order to improve the power efficiency of communication systems with four transmit antennas. We obtain the optimal PPM constellations that ensure full diversity and maximize the QOSTBC's minimum coding gain distance. Simulation results show that by using QOSTBCs along with a properly selected PPM constellation, full diversity in flat fading channels, and thus low BER at high signal-to-noise ratios (SNR), can be ensured. More importantly, it is also shown that QOSTBCs using PPM achieve better error performance than those using conventional modulation formats.
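A polynomial phase signal itself is easy to sketch: each "constellation point" is a vector of phase-polynomial coefficients, and the transmitted waveform is a complex exponential of that polynomial. The sketch below (the coefficient sets and signal length are arbitrary assumptions, not the optimized constellations of the paper) generates a few PPM waveforms and reports their minimum pairwise distance, the kind of quantity a CGD-style selection criterion tries to maximize.

```python
# Illustrative sketch: second-order polynomial phase signals and their
# minimum pairwise distance (a proxy for the selection criterion).
import numpy as np
from itertools import combinations

N = 64
t = np.arange(N) / N

def ppm_waveform(coeffs):
    """s[n] = exp(j*2*pi*(a1*t + a2*t^2)) for phase coefficients (a1, a2)."""
    a1, a2 = coeffs
    return np.exp(1j * 2 * np.pi * (a1 * t + a2 * t ** 2))

# A small, arbitrary candidate constellation of phase-coefficient pairs.
constellation = [(1, 0), (2, 1), (3, -1), (1, 2)]
waveforms = [ppm_waveform(c) for c in constellation]

d_min = min(np.linalg.norm(x - y) for x, y in combinations(waveforms, 2))
print(f"minimum pairwise waveform distance: {d_min:.3f}")
```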

Relevance: 20.00%

Abstract:

I thank George Pandarakalam for research assistance; Hans-Jörg Rheinberger for hosting my stay at the Max Planck Institute for History of Science, Berlin; and Sahotra Sarkar and referees of this journal for offering detailed comments. Funded by the Wellcome Trust (WT098764MA).

Relevance: 20.00%

Abstract:

I thank George Pandarakalam for research assistance; Hans-Jörg Rheinberger for hosting my stay at the Max Planck Institute for History of Science, Berlin; and Sahotra Sarkar and referees of this journal for offering detailed comments. Funded by the Wellcome Trust (WT098764MA).

Relevance: 20.00%

Abstract:

Corrigendum: European Journal of Human Genetics (2016) 24, 1515; doi:10.1038/ejhg.2016.81

22 Years of predictive testing for Huntington's disease: the experience of the UK Huntington's Prediction Consortium

Sheharyar S Baig, Mark Strong, Elisabeth Rosser, Nicola V Taverner, Ruth Glew, Zosia Miedzybrodzka, Angus Clarke, David Craufurd, UK Huntington's Disease Prediction Consortium and Oliver W Quarrell

Correction to: European Journal of Human Genetics advance online publication, 11 May 2016; doi: 10.1038/ejhg.2016.36

Post online publication, the authors realised that they had made an error. The sentence on page 2: 'In the first 5-year period........but this changed significantly in the last 5-year period with 51% positive and 49% negative (χ2=20.6, P<0.0001)' should read: 'In the first 5-year period........but this changed significantly in the last 5-year period with 49% positive and 51% negative (χ2=20.6, P<0.0001)'.