964 results for data fitting


Relevance:

30.00%

Publisher:

Abstract:

Ensemble stream modeling and data cleaning are sensor information processing systems that have different training and testing methods by which their goals are cross-validated. This research examines a mechanism that seeks to extract novel patterns by generating ensembles from data. The main goal of label-less stream processing is to process sensed events so as to eliminate uncorrelated noise and to choose the most likely model without overfitting, thus obtaining higher model confidence. Higher-quality streams can be realized by combining many short streams into an ensemble that has the desired quality. The framework for the investigation is an existing data mining tool. First, to accommodate feature extraction for events such as a bush or natural forest fire, we take the burnt area (BA*), a sensed ground truth obtained from logs, as our target variable. Even though this is an obvious model choice, the results are disappointing, for two reasons: first, the histogram of fire activity is highly skewed; second, the measured sensor parameters are highly correlated. Since non-descriptive features do not yield good results, we resort to temporal features. By doing so we carefully eliminate averaging effects; the resulting histogram is more satisfactory, and conceptual knowledge is learned from the sensor streams. The second step is feature induction: cross-validating attributes against single- or multi-target variables to minimize training error. We use the F-measure score, which combines precision and recall, to determine the false-alarm rate of fire events. The multi-target data-cleaning trees use the information purity of the target leaf nodes to learn higher-order features. A sensitive variance measure such as the F-test is performed at each node's split to select the best attribute. The ensemble stream model approach proved to improve when complicated features were combined with a simpler tree classifier. The ensemble framework for data cleaning, and the enhancements to quantify quality of fitness (30% spatial, 10% temporal, and 90% mobility reduction) of sensors, led to the formation of streams for sensor-enabled applications, which further motivates the novelty of stream-quality labeling and its importance in handling the vast number of real-time mobile streams generated today.
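
As a rough, hypothetical illustration of two ingredients named in this abstract (not the thesis code): the F-measure combines precision and recall, and a one-way F-test can score candidate attribute splits at a tree node. All data and thresholds below are invented.

```python
import numpy as np
from scipy import stats

def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-measure: the weighted harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def split_score(values: np.ndarray, threshold: float) -> float:
    """Score a candidate split with a one-way F-test on the two groups."""
    left, right = values[values <= threshold], values[values > threshold]
    f_stat, _p_value = stats.f_oneway(left, right)
    return f_stat

# Invented sensor readings: choose the threshold with the largest F statistic.
rng = np.random.default_rng(0)
readings = np.concatenate([rng.normal(20, 2, 50), rng.normal(35, 2, 50)])
candidates = np.quantile(readings, [0.25, 0.5, 0.75])
best = max(candidates, key=lambda t: split_score(readings, t))
print(f"F-measure example: {f_measure(tp=40, fp=10, fn=5):.3f}")
print(f"best split threshold: {best:.1f}")
```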

Relevance:

30.00%

Publisher:

Abstract:

Lapid, Ulrich, and Rammsayer (2008) reported that estimates of the difference limen (DL) from a two-alternative forced-choice (2AFC) task are higher than those obtained from a reminder task. This article reanalyzes their data to correct an error in their estimates of the DL from 2AFC data. We also extend the psychometric functions fitted to data from both tasks to incorporate an extra parameter that has been shown to yield accurate estimates of the DL that are unaffected by lapses. Contrary to Lapid et al.'s conclusion, our reanalysis shows that DLs estimated with the 2AFC task are only minimally (and not always significantly) larger than those estimated with the reminder task. We also show that their data are contaminated by response bias, and that the small remaining difference between DLs estimated with the 2AFC and reminder tasks can reasonably be attributed to the differential effects that response bias has in each task as defined in Lapid et al.'s experiments. Finally, we discuss a novel approach presented by Ulrich and Vorberg (2009) for fitting psychometric functions to 2AFC discrimination data.
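
A minimal sketch of the lapse-rate extension described here, assuming a cumulative-Gaussian psychometric function; the functional form, the invented data, and the 25%-75% definition of the DL are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

def psychometric(x, mu, sigma, lam):
    """Cumulative-Gaussian psychometric function with lapse rate lam."""
    return lam + (1.0 - 2.0 * lam) * norm.cdf(x, loc=mu, scale=sigma)

# Invented reminder-task data: stimulus differences (ms) and the
# proportion of "comparison longer" responses at each difference.
x = np.array([-60, -40, -20, 0, 20, 40, 60], dtype=float)
p = np.array([0.05, 0.12, 0.30, 0.52, 0.71, 0.89, 0.97])

popt, _ = curve_fit(psychometric, x, p, p0=[0.0, 20.0, 0.01],
                    bounds=([-50.0, 1e-3, 0.0], [50.0, 100.0, 0.06]))
mu, sigma, lam = popt
dl = sigma * norm.ppf(0.75)  # half the 25%-75% spread of the underlying Gaussian
print(f"PSE = {mu:.1f} ms, DL = {dl:.1f} ms, lapse = {lam:.3f}")
```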

Relevance:

30.00%

Publisher:

Abstract:

A class of multi-process models is developed for collections of time-indexed count data. Autocorrelation in counts is achieved with dynamic models for the natural parameter of the binomial distribution. In addition to modeling binomial time series, the framework includes dynamic models for multinomial and Poisson time series. Markov chain Monte Carlo (MCMC) and Pólya-Gamma data augmentation (Polson et al., 2013) are critical for fitting multi-process models of counts. To facilitate computation when the counts are high, a Gaussian approximation to the Pólya-Gamma random variable is developed.
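
A sketch of the kind of Gaussian approximation mentioned above, using what we understand to be the standard PG(b, c) moment formulas; this is an illustration under those formulas, not the dissertation's implementation.

```python
import numpy as np

def pg_moments(b: float, c: float):
    """Mean and variance of a Polya-Gamma PG(b, c) random variable.

    Standard moment formulas from the Polya-Gamma literature; valid for c != 0.
    """
    mean = b / (2.0 * c) * np.tanh(c / 2.0)
    var = b / (4.0 * c**3) * (np.sinh(c) - c) / np.cosh(c / 2.0) ** 2
    return mean, var

def approx_pg_draw(b, c, rng):
    """Gaussian approximation to a PG(b, c) draw. For integer b, PG(b, c) is a
    sum of b independent PG(1, c) variables, so the approximation is most
    reasonable when b (e.g., the count total) is large."""
    mean, var = pg_moments(b, c)
    return rng.normal(mean, np.sqrt(var))

rng = np.random.default_rng(1)
print(approx_pg_draw(b=500.0, c=0.8, rng=rng))  # hypothetical high-count case
```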

Three applied analyses are presented to explore the utility and versatility of the framework. The first analysis develops a model for complex dynamic behavior of themes in collections of text documents. Documents are modeled as a “bag of words”, and the multinomial distribution is used to characterize uncertainty in the vocabulary terms appearing in each document. State-space models for the natural parameters of the multinomial distribution induce autocorrelation in themes and their proportional representation in the corpus over time.

The second analysis develops a dynamic mixed membership model for Poisson counts. The model is applied to a collection of time series which record neuron level firing patterns in rhesus monkeys. The monkey is exposed to two sounds simultaneously, and Gaussian processes are used to smoothly model the time-varying rate at which the neuron’s firing pattern fluctuates between features associated with each sound in isolation.

The third analysis presents a switching dynamic generalized linear model for the time-varying home run totals of professional baseball players. The model endows each player with an age specific latent natural ability class and a performance enhancing drug (PED) use indicator. As players age, they randomly transition through a sequence of ability classes in a manner consistent with traditional aging patterns. When the performance of the player significantly deviates from the expected aging pattern, he is identified as a player whose performance is consistent with PED use.

All three models provide a mechanism for sharing information across related series locally in time. The models are fit with variations on the Pólya-Gamma Gibbs sampler, MCMC convergence diagnostics are developed, and reproducible inference is emphasized throughout the dissertation.

Relevance:

30.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-08

Relevance:

30.00%

Publisher:

Abstract:

We analyze a real data set of reindeer fecal pellet-group counts obtained from a survey conducted in a forest area in northern Sweden. In the data set, over 70% of the counts are zeros, and there is high spatial correlation. We use conditionally autoregressive random effects to model spatial correlation in a Poisson generalized linear mixed model (GLMM), a quasi-Poisson hierarchical generalized linear model (HGLM), a zero-inflated Poisson (ZIP) model, and hurdle models. The quasi-Poisson HGLM allows for both under- and overdispersion with excessive zeros, while the ZIP and hurdle models allow only for overdispersion. In analyzing the real data set, we see that the quasi-Poisson HGLMs can perform better than the other commonly used models, such as ordinary Poisson HGLMs, spatial ZIP, and spatial hurdle models, and that the underdispersed Poisson HGLMs with spatial correlation fit the reindeer data best. We develop R code for fitting these models using a unified algorithm for the HGLMs. Spatial count responses with an extremely high proportion of zeros and underdispersion can be successfully modeled using the quasi-Poisson HGLM with spatial random effects.
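
The authors' unified algorithm is implemented in R; as a loose, non-spatial Python illustration of fitting zero-heavy counts, statsmodels provides a zero-inflated Poisson model. The CAR spatial effects and the quasi-Poisson HGLM itself are beyond this sketch, and the data below are simulated.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedPoisson

# Simulated pellet-group counts: ~70% structural zeros plus a Poisson part.
rng = np.random.default_rng(2)
n = 300
x = rng.uniform(0, 1, n)                       # e.g., a habitat covariate
inflate = rng.uniform(size=n) < 0.7            # structural zeros
counts = np.where(inflate, 0, rng.poisson(np.exp(0.5 + 1.2 * x)))

exog = sm.add_constant(x)                      # intercept + covariate
model = ZeroInflatedPoisson(counts, exog, exog_infl=np.ones((n, 1)))
result = model.fit(maxiter=500, disp=False)
print(result.params)                           # inflation logit + count coefficients
```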

Relevance:

30.00%

Publisher:

Abstract:

We present new methodologies to generate rational function approximations of broadband electromagnetic responses of linear and passive networks of high-speed interconnects, and to construct SPICE-compatible, equivalent circuit representations of the generated rational functions. These new methodologies are driven by the desire to improve the computational efficiency of the rational function fitting process, and to ensure enhanced accuracy of the generated rational function interpolation and its equivalent circuit representation. Toward this goal, we propose two new methodologies for rational function approximation of high-speed interconnect network responses. The first one relies on the use of both time-domain and frequency-domain data, obtained either through measurement or numerical simulation, to generate a rational function representation that extrapolates the input, early-time transient response data to late-time response while at the same time providing a means to both interpolate and extrapolate the used frequency-domain data. The aforementioned hybrid methodology can be considered as a generalization of the frequency-domain rational function fitting utilizing frequency-domain response data only, and the time-domain rational function fitting utilizing transient response data only. In this context, a guideline is proposed for estimating the order of the rational function approximation from transient data. The availability of such an estimate expedites the time-domain rational function fitting process. The second approach relies on the extraction of the delay associated with causal electromagnetic responses of interconnect systems to provide for a more stable rational function process utilizing a lower-order rational function interpolation. A distinctive feature of the proposed methodology is its utilization of scattering parameters. For both methodologies, the approach of fitting the electromagnetic network matrix one element at a time is applied. It is shown that, with regard to the computational cost of the rational function fitting process, such an element-by-element rational function fitting is more advantageous than full matrix fitting for systems with a large number of ports. Despite the disadvantage that different sets of poles are used in the rational function of different elements in the network matrix, such an approach provides for improved accuracy in the fitting of network matrices of systems characterized by both strongly coupled and weakly coupled ports. Finally, in order to provide a means for enforcing passivity in the adopted element-by-element rational function fitting approach, the methodology for passivity enforcement via quadratic programming is modified appropriately for this purpose and demonstrated in the context of element-by-element rational function fitting of the admittance matrix of an electromagnetic multiport.
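
As a toy illustration of frequency-domain rational fitting (a simple linearized least-squares formulation in the spirit of Levy's method, not the pole-based fitting machinery this abstract develops), one can solve for the coefficients of H(s) ≈ N(s)/D(s) from sampled frequency data:

```python
import numpy as np

def fit_rational(s, H, n_num, n_den):
    """Linearized LS fit of H(s) ~ N(s)/D(s): minimize ||N(s_k) - H_k D(s_k)||
    with the leading denominator coefficient fixed to 1."""
    rows, rhs = [], []
    for sk, Hk in zip(s, H):
        num_part = [sk**j for j in range(n_num + 1)]
        den_part = [-Hk * sk**j for j in range(n_den)]  # coeff of s**n_den is 1
        rows.append(num_part + den_part)
        rhs.append(Hk * sk**n_den)
    A, b = np.array(rows), np.array(rhs)
    # Stack real and imaginary parts so the fitted coefficients come out real.
    A = np.vstack([A.real, A.imag])
    b = np.concatenate([b.real, b.imag])
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs[: n_num + 1], np.append(coeffs[n_num + 1:], 1.0)

# Hypothetical one-port response sampled on the imaginary axis.
omega = np.linspace(1e6, 1e9, 200)
s = 1j * omega
H = 1.0 / (1.0 + s * 1e-9)          # a single-pole, RC-like response
num, den = fit_rational(s, H, n_num=0, n_den=1)
print("numerator:", num, "denominator (ascending powers):", den)
```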

Relevance:

30.00%

Publisher:

Abstract:

Time perception is studied with subjective or semi-objective psychophysical methods. With subjective methods, observers provide quantitative estimates of duration and data depict the psychophysical function relating subjective duration to objective duration. With semi-objective methods, observers provide categorical or comparative judgments of duration and data depict the psychometric function relating the probability of a certain judgment to objective duration. Both approaches are used to study whether subjective and objective time run at the same pace or whether time flies or slows down under certain conditions. We analyze theoretical aspects affecting the interpretation of data gathered with the most widely used semi-objective methods, including single-presentation and paired-comparison methods. For this purpose, a formal model of psychophysical performance is used in which subjective duration is represented via a psychophysical function and the scalar property. This provides the timing component of the model, which is invariant across methods. A decisional component that varies across methods reflects how observers use subjective durations to make judgments and give the responses requested under each method. Application of the model shows that psychometric functions in single-presentation methods are uninterpretable because the various influences on observed performance are inextricably confounded in the data. In contrast, data gathered with paired-comparison methods permit separating out those influences. Prevalent approaches to fitting psychometric functions to data are also discussed and shown to be inconsistent with widely accepted principles of time perception, implicitly assuming instead that subjective time equals objective time and that observed differences across conditions do not reflect differences in perceived duration but criterion shifts. These analyses prompt evidence-based recommendations for best methodological practice in studies on time perception.
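
A minimal simulation of the timing-plus-decision structure described above; the Weber fraction, the criterion, and the difference-rule decision model are assumed values for illustration, not the authors' fitted model. Subjective durations carry scalar noise (SD proportional to duration), and a comparative judgment results from comparing the two noisy representations against a criterion.

```python
import numpy as np

rng = np.random.default_rng(3)
WEBER = 0.15       # assumed Weber fraction (scalar property)
CRITERION = 0.0    # assumed decision criterion on the subjective difference (ms)

def p_comparison_longer(standard, comparison, n_trials=10_000):
    """Probability of judging the comparison longer than the standard."""
    s = rng.normal(standard, WEBER * standard, n_trials)      # scalar noise
    c = rng.normal(comparison, WEBER * comparison, n_trials)
    return np.mean(c - s > CRITERION)

standard = 500.0  # ms
for comparison in (400, 450, 500, 550, 600):
    print(comparison, p_comparison_longer(standard, comparison))
```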

Relevance:

30.00%

Publisher:

Abstract:

Rural electrification is characterized by geographic dispersion of the population, low consumption, high investment per consumer, and high cost. Moreover, solar radiation constitutes an inexhaustible source of energy, and photovoltaic panels are used to convert it into electricity. In this study, the manufacturer's equations for the current and power of small photovoltaic systems were adjusted to field conditions. The mathematical analysis was performed on the I-100 rural photovoltaic system from ISOFOTON, with a power of 300 Wp, located at the Lageado Experimental Farm of FCA/UNESP. To develop these equations, the circuitry of the photovoltaic cells was studied, applying iterative numerical methods to determine the electrical parameters and the possible errors involved in adapting the equations in the literature to reality. A simulation of a photovoltaic panel was then proposed through mathematical equations adjusted to local radiation data. The results yielded equations that provide realistic answers to the user and may assist in the design of these systems, since the calculated maximum power limit ensures the supply of the energy generated. This realistic sizing helps establish the possible applications of solar energy for rural producers and informs them of the real possibilities of generating electricity from the sun.
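
The iterative solution referred to above can be illustrated with the standard single-diode model, I = Iph − I0·(exp((V + I·Rs)/(n·Vt)) − 1) − (V + I·Rs)/Rsh, solved here by damped fixed-point iteration; the parameter values are illustrative, not those of the ISOFOTON I-100.

```python
import numpy as np

# Illustrative single-diode parameters (not ISOFOTON I-100 datasheet values).
I_PH, I_0 = 5.0, 1e-6      # photocurrent and diode saturation current (A)
R_S, R_SH = 0.02, 50.0     # series and shunt resistances (ohm)
N_VT = 1.3 * 0.0258 * 36   # ideality factor x thermal voltage x 36 cells (V)

def current_at(v, tol=1e-9, max_iter=200):
    """Solve the implicit single-diode equation by damped fixed-point iteration."""
    i = I_PH  # initial guess near the short-circuit current
    for _ in range(max_iter):
        i_new = (I_PH
                 - I_0 * (np.exp((v + i * R_S) / N_VT) - 1.0)
                 - (v + i * R_S) / R_SH)
        if abs(i_new - i) < tol:
            break
        i = 0.5 * (i + i_new)  # damping keeps the iteration stable near Voc
    return i

voltages = np.linspace(0.0, 20.0, 81)
powers = [v * current_at(v) for v in voltages]
v_mpp = voltages[int(np.argmax(powers))]
print(f"maximum power ~ {max(powers):.1f} W at {v_mpp:.2f} V")
```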

Relevance:

30.00%

Publisher:

Abstract:

The representation of alkene degradation in version 3 of the Master Chemical Mechanism (MCM v3) has been evaluated, using environmental chamber data on the photo-oxidation of ethene, propene, 1-butene and 1-hexene in the presence of NOx, from up to five chambers at the Statewide Air Pollution Research Center (SAPRC) at the University of California. As part of this evaluation, it was necessary to include a representation of the reactions of the alkenes with O(3P), which are significant under chamber conditions but generally insignificant under atmospheric conditions. The simulations for the ethene and propene systems, in particular, were found to be sensitive to the branching ratios assigned to molecular and free radical forming pathways of the O(3P) reactions, with the extent of radical formation required for proper fitting of the model to the chamber data being substantially lower than the reported consensus. With this constraint, the MCM v3 mechanisms for ethene and propene generally performed well. The sensitivity of the simulations to the parameters applied to a series of other radical sources and sink reactions (radical formation from the alkene ozonolysis reactions and product carbonyl photolysis; radical removal from the reaction of OH with NO2 and β-hydroxynitrate formation) were also considered, and the implications of these results are discussed. Evaluation of the MCM v3 1-butene and 1-hexene degradation mechanisms, using a more limited dataset from only one chamber, was found to be inconclusive. The results of sensitivity studies demonstrate that it is impossible to reconcile the simulated and observed formation of ozone in these systems for ranges of parameter values which can currently be justified on the basis of the literature. As a result of this work, gaps and uncertainties in the kinetic, mechanistic and chamber database are identified and discussed, in relation to both tropospheric chemistry and chemistry important under chamber conditions which may compromise the evaluation procedure, and recommendations are made for future experimental studies. Throughout the study, the performance of the MCM v3 chemistry was also simultaneously compared with that of the corresponding chemistry in the SAPRC-99 mechanism, which was developed and optimized in conjunction with the chamber datasets.
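
A toy example of the kind of sensitivity test described here (an invented two-pathway scheme with assumed numbers, not MCM v3 chemistry): vary the radical-forming fraction of an O(3P) + alkene reaction in a minimal ODE system and observe the effect on radical production.

```python
import numpy as np
from scipy.integrate import solve_ivp

K_O3P = 4.0e-12  # assumed rate coefficient, cm^3 molecule^-1 s^-1

def rhs(t, y, radical_fraction):
    """Toy scheme: O(3P) + alkene -> radicals (fraction f) or a stable adduct."""
    alkene, o3p, radicals, adduct = y
    rate = K_O3P * alkene * o3p
    return [-rate, -rate,
            radical_fraction * rate,
            (1.0 - radical_fraction) * rate]

y0 = [2.5e12, 1.0e8, 0.0, 0.0]  # molecule cm^-3, illustrative
for f in (0.2, 0.5, 0.8):
    sol = solve_ivp(rhs, (0.0, 600.0), y0, args=(f,), rtol=1e-8)
    print(f"f = {f}: radicals formed = {sol.y[2, -1]:.3e}")
```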

Relevance:

30.00%

Publisher:

Abstract:

A non-linear least-squares methodology was developed for simultaneously estimating the parameters of selectivity curves with a pre-defined functional form, across size classes and mesh sizes, from catch size-frequency distributions, based on the models of Kirkwood and Walker [Kirkwood, G.P., Walker, T.I., 1986. Gill net selectivities for gummy shark, Mustelus antarcticus Gunther, taken in south-eastern Australian waters. Aust. J. Mar. Freshw. Res. 37, 689-697] and Wulff [Wulff, A., 1986. Mathematical model for selectivity of gill nets. Arch. Fish Wiss. 37, 101-106]. Observed catches of fish of size class i in mesh m are modeled as a function of the estimated numbers of fish of that size class in the population and the corresponding selectivities. A comparison was made with the maximum likelihood methodology of Kirkwood and Walker (1986) and Wulff (1986), using simulated catch data with known selectivity-curve parameters and two published data sets. The estimated parameters and selectivity curves were generally consistent for both methods, with smaller standard errors for the parameters estimated by non-linear least squares. The proposed methodology is a useful and accessible alternative that can be used to model selectivity in situations where the parameters of a pre-defined model can be assumed to be functions of gear size, facilitating statistical evaluation of different models and of goodness of fit. (C) 1998 Elsevier Science B.V.
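
A compact sketch of the estimation idea: catches C[i, m] are modeled as N[i] times the selectivity of mesh m at length l[i], with everything estimated jointly by non-linear least squares. The normal-shaped curve with mode and spread proportional to mesh size is assumed here for illustration and is not necessarily the authors' exact parameterization.

```python
import numpy as np
from scipy.optimize import least_squares

lengths = np.arange(40.0, 81.0, 5.0)        # size-class midpoints (cm)
meshes = np.array([10.0, 12.0, 14.0])       # mesh sizes (cm)

def selectivity(l, mesh, k1, k2):
    """Normal selectivity curve; mode and spread scale with mesh size."""
    return np.exp(-((l - k1 * mesh) ** 2) / (2.0 * (k2 * mesh) ** 2))

def residuals(params, catches):
    k1, k2 = params[:2]
    n = np.exp(params[2:])                   # log-parameterized abundances
    fitted = n[:, None] * selectivity(lengths[:, None], meshes[None, :], k1, k2)
    return (fitted - catches).ravel()

# Simulated catches with known parameters, as in the paper's evaluation.
rng = np.random.default_rng(4)
true_n = rng.uniform(50, 200, lengths.size)
catches = true_n[:, None] * selectivity(lengths[:, None], meshes[None, :], 5.0, 0.4)
catches = rng.poisson(catches).astype(float)

x0 = np.concatenate([[4.0, 0.5], np.log(np.full(lengths.size, 100.0))])
fit = least_squares(residuals, x0, args=(catches,))
print("estimated k1, k2 =", fit.x[:2])       # should be near 5.0 and 0.4
```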

Relevance:

20.00%

Publisher: