2 results for Big data, Spark, Hadoop

in Cambridge University Engineering Department Publications Database


Relevance:

100.00%

Publisher:

Abstract:

We present Random Partition Kernels, a new class of kernels derived by demonstrating a natural connection between random partitions of objects and kernels between those objects. We show how the construction can be used to create kernels from methods that would not normally be viewed as random partitions, such as Random Forest. To demonstrate the potential of this method, we propose two new kernels, the Random Forest Kernel and the Fast Cluster Kernel, and show that these kernels consistently outperform standard kernels on problems involving real-world datasets. Finally, we show how the form of these kernels lends itself to a natural approximation that is appropriate for certain big data problems, allowing $O(N)$ inference in methods such as Gaussian Processes, Support Vector Machines and Kernel PCA.
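
The link between random partitions and kernels mentioned in this abstract can be illustrated with a minimal sketch: estimate a kernel value for two objects as the fraction of sampled random partitions in which they land in the same cluster. This is not the authors' implementation. The partitioning scheme (assigning each point to the nearest of a few randomly chosen points) and the names `random_partition` and `partition_kernel` are assumptions made purely for illustration.

```python
import random


def random_partition(points, n_centers, rng):
    """Hypothetical partitioner: pick n_centers points at random and
    label every point with the index of its nearest chosen center."""
    centers = rng.sample(points, n_centers)

    def nearest(p):
        # Squared Euclidean distance to each center; ties go to the first.
        return min(
            range(n_centers),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
        )

    return [nearest(p) for p in points]


def partition_kernel(points, n_partitions=200, n_centers=2, seed=0):
    """Estimate K[i][j] as the fraction of sampled partitions in which
    points i and j share a cluster."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_partitions):
        labels = random_partition(points, n_centers, rng)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / n_partitions for c in row] for row in counts]


# Two tight pairs of points, far apart from each other.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
K = partition_kernel(points)
```

Each sampled partition contributes a block-diagonal matrix of ones, which is positive semidefinite, so their average is a valid kernel matrix. In this toy data, nearby points share a cluster in most partitions and therefore receive a larger kernel value than distant points.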

Relevance:

30.00%

Publisher:

Abstract:

Measurements of particulate matter (PM) from spark ignition (SI) engine exhaust using dilution tunnels will become more prevalent as emission standards are tightened. Hence, a study of the dilution process was undertaken in order to understand how various dilution-related parameters affect the accuracy with which PM sizes and concentrations can be determined. An SI and a compression ignition (CI) engine were separately used to examine parameters of the dilution process; the present work discusses the results in the context of SI exhaust dilution. A Scanning Mobility Particle Sizer (SMPS) was used to measure the size distribution, number density, and volume fraction of PM. Temperature measurements in the exhaust pipe and dilution tunnel reveal the degree of mixing between exhaust and dilution air, the effect of flowrate on heat transfer from undiluted and diluted exhaust to the environment, and the minimum permissible dilution ratio for a maximum sample temperature of 52°C. Measurements of PM concentrations as a function of dilution ratio show the competing effects of temperature and particle/vapor concentrations on particle growth dynamics, which result in a range of dilution ratios, from 13 to 18, where the effect of dilution ratio, independent of flowrate, is kept to a minimum. This range of dilution ratios is therefore optimal in order to achieve repeatable PM concentration measurements. Particle dynamics during transit through the tunnel operating at the optimal dilution ratio were found to be statistically insignificant compared to data scatter. Such small differences in number concentration may be qualitatively representative of particle losses for SI exhaust, but small increases in PM volume fraction during transit through the tunnel may significantly underestimate accretion of mass due to unburned hydrocarbons (HCs) emitted by SI engines. The fraction of SI-derived PM mass due to adsorbed/absorbed vapor, estimated from these data, is consistent with previous chemical analyses of PM. © 1998 Society of Automotive Engineers, Inc.