93 results for grouping estimators
Abstract:
Illustrations are an integral part of many dictionaries, but the selection, placing, and sizing of illustrations is often highly conservative, and can appear to reflect the editorial concerns and technological constraints of previous eras. We might start with the question ‘why not illustrate?’, especially when we consider the ability of an illustration to simplify the definition of technical terms. How do illustrations affect the reader’s view of a dictionary as objective, and how do they reinforce the pedagogic aims of the dictionary? By their graphic nature, illustrations stand out from the field of text that surrounds them, and they can immediately indicate to the reader the level of seriousness or popularity of the book’s approach, or the age range that it is intended for. Illustrations are also expensive to create and can add to printing costs, so it is not surprising that there is much direct and indirect copying from dictionary to dictionary, and simple re-use. This article surveys developments in illustrating dictionaries, considering the difference between distributing individual illustrations through the text of the dictionary and grouping illustrations into larger synoptic illustrations; the graphic style of illustrations; and the role of illustrations in ‘feature-led’ dictionary marketing.
Abstract:
We develop a new sparse kernel density estimator using a forward constrained regression framework, within which the nonnegativity and summing-to-unity constraints on the mixing weights can easily be satisfied. Our main contribution is to derive a recursive algorithm that selects significant kernels one at a time based on the minimum integrated square error (MISE) criterion, used for both the selection of kernels and the estimation of mixing weights. The proposed approach is simple to implement and the associated computational cost is very low. Specifically, the complexity of our algorithm is on the order of the number of training data points N, which is much lower than the order-N² complexity of the best existing sparse kernel density estimators. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with accuracy comparable to those of the classical Parzen window estimate and other existing sparse kernel density estimators.
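The greedy, one-kernel-at-a-time selection described above can be sketched in a few lines. This is a minimal sketch, assuming Gaussian kernels, candidate centres restricted to the data points, and uniform weights over the selected kernels (the paper instead solves for the mixing weights inside the forward constrained regression); `greedy_sparse_kde` is a hypothetical name:

```python
import numpy as np

def gauss(d, s):
    """Gaussian kernel with scale s evaluated at distance d."""
    return np.exp(-0.5 * (d / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def greedy_sparse_kde(x, h, n_kernels):
    """Pick kernel centres from the data one at a time, each step
    minimising an empirical integrated-square-error surrogate
    ISE(w) = w'Qw - 2 w'b with uniform weights over chosen kernels."""
    x = np.asarray(x, float)
    N = len(x)
    D = x[:, None] - x[None, :]
    Q = gauss(D, np.sqrt(2.0) * h)   # closed-form overlap of Gaussian kernels
    b = gauss(D, h).mean(axis=0)     # (1/N) sum_i K_h(x_i - c_j)
    chosen = []
    for _ in range(n_kernels):
        best, best_ise = None, np.inf
        for j in range(N):
            if j in chosen:
                continue
            idx = chosen + [j]
            w = np.full(len(idx), 1.0 / len(idx))  # nonneg, sums to one
            ise = w @ Q[np.ix_(idx, idx)] @ w - 2.0 * w @ b[idx]
            if ise < best_ise:
                best, best_ise = j, ise
        chosen.append(best)
    return x[chosen]

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(2, 0.5, 200)])
centres = greedy_sparse_kde(data, h=0.5, n_kernels=4)
```

With uniform weights the nonnegativity and sum-to-one constraints hold by construction, which is what keeps the sketch short; the surrogate w'Qw − 2w'b uses the closed-form Gaussian overlap integrals for Q.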
Abstract:
Optimal estimation (OE) improves sea surface temperature (SST) estimated from satellite infrared imagery in the “split-window”, in comparison to SST retrieved using the usual multi-channel (MCSST) or non-linear (NLSST) estimators. This is demonstrated using three months of observations of the Advanced Very High Resolution Radiometer (AVHRR) on the first Meteorological Operational satellite (Metop-A), matched in time and space to drifter SSTs collected on the global telecommunications system. There are 32,175 matches. The prior for the OE is forecast atmospheric fields from the Météo-France global numerical weather prediction system (ARPEGE), the forward model is RTTOV8.7, and a reduced state vector comprising SST and total column water vapour (TCWV) is used. Operational NLSST coefficients give mean and standard deviation (SD) of the difference between satellite and drifter SSTs of 0.00 and 0.72 K. The “best possible” NLSST and MCSST coefficients, empirically regressed on the data themselves, give zero mean difference and SDs of 0.66 K and 0.73 K respectively. Significant contributions to the global SD arise from regional systematic errors (biases) of several tenths of kelvin in the NLSST. With no bias corrections to either prior fields or forward model, the SSTs retrieved by OE minus drifter SSTs have mean and SD of − 0.16 and 0.49 K respectively. The reduction in SD below the “best possible” regression results shows that OE deals with structural limitations of the NLSST and MCSST algorithms. Using simple empirical bias corrections to improve the OE, retrieved minus drifter SSTs are obtained with mean and SD of − 0.06 and 0.44 K respectively. Regional biases are greatly reduced, such that the absolute bias is less than 0.1 K in 61% of 10°-latitude by 30°-longitude cells. OE also allows a statistic of the agreement between modelled and measured brightness temperatures to be calculated. 
We show that this measure is more efficient than the current system of confidence levels at identifying reliable retrievals, and that the best 75% of satellite SSTs by this measure have negligible bias and retrieval error of order 0.25 K.
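A single linear optimal-estimation update of the kind used here can be written down directly. All numbers below (prior state, covariances, Jacobian, brightness temperatures) are invented placeholders for illustration, not ARPEGE statistics or RTTOV output:

```python
import numpy as np

# One linear optimal-estimation update for a reduced state
# z = [SST (K), TCWV (kg m^-2)].  All values are illustrative.
z_a = np.array([288.0, 30.0])           # prior state (e.g. NWP fields)
S_a = np.diag([1.0**2, 5.0**2])         # prior error covariance
K   = np.array([[0.9, -0.02],           # Jacobian d(BT)/dz for two
                [0.8, -0.06]])          # split-window channels
S_e = np.diag([0.15**2, 0.15**2])       # observation + model error covariance
y   = np.array([287.2, 286.1])          # measured brightness temperatures
y_a = np.array([287.5, 286.6])          # forward-modelled BTs at the prior

# Gauss-Newton / linear OE solution:
# z_hat = z_a + (K' Se^-1 K + Sa^-1)^-1 K' Se^-1 (y - F(z_a))
A = K.T @ np.linalg.inv(S_e) @ K + np.linalg.inv(S_a)
z_hat = z_a + np.linalg.solve(A, K.T @ np.linalg.inv(S_e) @ (y - y_a))
S_hat = np.linalg.inv(A)                # posterior error covariance
```

The posterior covariance `S_hat` shrinks relative to the prior, and the brightness-temperature residuals tested against the combined prior and observation covariances give the kind of fit statistic used above to screen reliable retrievals.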
Abstract:
We describe the approach to be adopted for a major new initiative to derive a homogeneous record of sea surface temperature for 1991–2007 from the observations of the series of three along-track scanning radiometers (ATSRs). This initiative is called (A)RC: (Advanced) ATSR Re-analysis for Climate. The main objectives are to reduce regional biases in retrieved sea surface temperature (SST) to less than 0.1 K for all global oceans, while creating a very homogeneous record that is stable in time to within 0.05 K decade⁻¹, with maximum independence of the record from existing analyses of SST used in climate change research. If these stringent targets are achieved, this record will enable significantly improved estimates of surface temperature trends and variability of sufficient quality to advance questions of climate change attribution, climate sensitivity and historical reconstruction of surface temperature changes. The approach includes development of new, consistent estimators for SST for each of the ATSRs, and detailed analysis of overlap periods. Novel aspects of the approach include generation of multiple versions of the record using alternative channel sets and cloud detection techniques, to assess for the first time the effect of such choices. There will be extensive effort in quality control, validation and analysis of the impact on climate SST data sets. Evidence for the plausibility of the 0.1 K target for systematic error is reviewed, as is the need for alternative cloud screening methods in this context.
Abstract:
We derive energy-norm a posteriori error bounds, using gradient recovery (ZZ) estimators to control the spatial error, for fully discrete schemes for the linear heat equation. This appears to be the first completely rigorous derivation of ZZ estimators for fully discrete schemes for evolution problems, without any restrictive assumption on the timestep size. An essential tool for the analysis is the elliptic reconstruction technique. Our theoretical results are backed by extensive numerical experimentation aimed at (a) testing the practical sharpness and asymptotic behaviour of the error estimator against the error, and (b) deriving an adaptive method based on our estimators. An extra novelty provided is an implementation of a coarsening error "preindicator", with a complete implementation guide in ALBERTA in the appendix.
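The gradient-recovery idea behind a ZZ estimator can be shown in one dimension: recover a continuous gradient by nodal averaging and measure its elementwise distance to the broken gradient. The mesh, the target function and the averaging rule below are illustrative choices, not the paper's fully discrete heat-equation setting:

```python
import numpy as np

# 1-D ZZ-style indicator for a piecewise-linear interpolant u_h.
x = np.linspace(0.0, np.pi, 17)       # uniform mesh, 16 elements
u = np.sin(x)                         # P1 nodal values of u_h
h = np.diff(x)
g = np.diff(u) / h                    # constant gradient on each element

# Recovered gradient G: average neighbouring element gradients at
# interior nodes, one-sided values at the boundary.
G = np.empty_like(x)
G[1:-1] = 0.5 * (g[:-1] + g[1:])
G[0], G[-1] = g[0], g[-1]

# eta_T^2 = ||G - u_h'||^2_{L2(T)}; with G linear and u_h' constant on T,
# the integral is exactly h/3 * (a^2 + a*b + b^2) for endpoint deviations
# a = G(left) - g and b = G(right) - g.
a, b = G[:-1] - g, G[1:] - g
eta = np.sqrt(h / 3.0 * (a * a + a * b + b * b))
total = np.sqrt((eta ** 2).sum())     # global spatial estimator
```

The elementwise `eta` is what an adaptive loop would use to mark elements for refinement (or, with a coarsening preindicator, for derefinement).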
Abstract:
A new sparse kernel density estimator is introduced. Our main contribution is to develop a recursive algorithm that selects significant kernels one at a time using the minimum integrated square error (MISE) criterion for both kernel selection and the estimation of the mixing weights. The proposed approach is simple to implement and the associated computational cost is very low. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with accuracy competitive with existing kernel density estimators.
Abstract:
The extensive shoreline deposits of Lake Chilwa, southern Malawi, a shallow water body today covering 600 km² of a basin of 7500 km², are investigated for their record of late Quaternary highstands. OSL dating, applied to 36 samples from five sediment cores from the northern and western marginal sand ridges, reveals a highstand record spanning 44 ka. Using two different grouping methods, highstand phases are identified at 43.7–33.3 ka, 26.2–21.0 ka and 17.9–12.0 ka (total error method) or 38.4–35.5 ka, 24.3–22.3 ka, 16.2–15.1 ka and 13.5–12.7 ka (Finite Mixture Model age components), with two further discrete events recorded at 11.01 ± 0.76 ka and 8.52 ± 0.56 ka. Highstands are comparable in timing to wet phases from other basins in East and southern Africa, demonstrating wet conditions in the region before the LGM, which was dry, and a wet Lateglacial, which commenced earlier in the southern than in the northern hemisphere in East Africa. We find no evidence that the wet phases are insolation driven, but analysis of the dataset and GCM modelling experiments suggests that Heinrich events may be associated with enhanced monsoon activity in East Africa, both in timing and as a possible causal mechanism.
Abstract:
Purpose – The creation of a target market strategy is integral to developing an effective business strategy. The concept of market segmentation is often cited as pivotal to establishing a target market strategy, yet all too often business-to-business marketers utilise little more than trade sectors or product groups as the basis for their groupings of customers, rather than customers' characteristics and buying behaviour. The purpose of this paper is to offer a solution for managers, focusing on customer purchasing behaviour, which evolves from the organisation's existing criteria used for grouping its customers. Design/methodology/approach – One of the underlying reasons managers fail to embrace best practice market segmentation is their inability to manage the transition from how target markets in an organisation are currently described to how they might look when based on customer characteristics, needs, purchasing behaviour and decision-making. Any attempt to develop market segments should reflect the inability of organisations to ignore their existing customer group classification schemes and associated customer-facing operational practices, such as distribution channels and sales force allocations. Findings – A straightforward process has been derived and applied, enabling organisations to practice market segmentation in an evolutionary manner, facilitating the transition to customer-led target market segments. This process also ensures commitment from the managers responsible for implementing the eventual segmentation scheme. This paper outlines the six stages of this process and presents an illustrative example from the agrichemicals sector, supported by other cases. 
Research implications – The process presented in this paper for embarking on market segmentation focuses on customer purchasing behaviour rather than business sectors or product group classifications - which is true to the concept of market segmentation - but in a manner that participating managers find non-threatening. The resulting market segments have their basis in the organisation's existing customer classification schemes and are an iteration that most managers readily buy into. Originality/value – Despite the size of the market segmentation literature, very few papers offer step-by-step guidance for developing customer-focused market segments in business-to-business marketing. The analytical tool for assessing customer purchasing deployed in this paper was originally created to assist in marketing planning programmes, but has since proved its worth as the foundation for creating segmentation schemes in business marketing, as described in this paper.
Abstract:
What are the main causes of international terrorism? Despite the meticulous examination of various candidate explanations, existing estimates still diverge in sign, size, and significance. This article puts forward a novel explanation and supporting evidence. We argue that domestic political instability provides the learning environment needed to successfully execute international terror attacks. Using a yearly panel of 123 countries over 1973–2003, we find that the occurrence of civil wars increases fatalities and the number of international terrorist acts by 45%. These results hold for alternative indicators of political instability, estimators, subsamples, subperiods, and accounting for competing explanations.
Abstract:
Much UK research and market practice on portfolio strategy and performance benchmarking relies on a sector‐geography subdivision of properties. Prior tests of the appropriateness of such divisions have generally relied on aggregated or hypothetical return data. However, the results found in aggregate may not hold when individual buildings are considered. This paper makes use of a dataset of individual UK property returns. A series of multivariate exploratory statistical techniques are utilised to test whether the return behaviour of individual properties conforms to their a priori grouping. The results suggest strongly that neither standard sector nor regional classifications provide a clear demarcation of individual building performance. This has important implications for both portfolio strategy and performance measurement and benchmarking. However, there do appear to be size and yield effects that help explain return behaviour at the property level.
Abstract:
Salmonella enterica serovar Typhimurium is an established model organism for Gram-negative, intracellular pathogens. Owing to the rapid spread of resistance to antibiotics among this group of pathogens, new approaches to identify suitable target proteins are required. Based on the genome sequence of Salmonella Typhimurium and associated databases, a genome-scale metabolic model was constructed. Output was based on an experimental determination of the biomass of Salmonella when growing in glucose minimal medium. Linear programming was used to simulate variations in energy demand while growing in glucose minimal medium. By grouping reactions with similar flux responses, a sub-network of 34 reactions responding to this variation was identified (the catabolic core). This network was used to identify sets of one or two reactions that, when removed from the genome-scale model, interfered with energy and biomass generation. Eleven such sets were found to be essential for the production of biomass precursors. Experimental investigation of seven of these showed that knock-outs of the associated genes resulted in attenuated growth for four pairs of reactions, while three single reactions were shown to be essential for growth.
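The linear-programming step can be illustrated on a toy stoichiometric network: maximise a "biomass" flux subject to steady state S v = 0 and flux bounds. The three-reaction system and its bounds are invented for illustration, and `scipy.optimize.linprog` merely stands in for whatever solver the study used:

```python
import numpy as np
from scipy.optimize import linprog

# Toy flux-balance LP.  Reactions (columns of S):
#   R1: -> A        (uptake, bounded by 10)
#   R2: A -> B      (conversion)
#   R3: B ->        (biomass drain, the objective)
S = np.array([[1.0, -1.0,  0.0],   # metabolite A balance
              [0.0,  1.0, -1.0]])  # metabolite B balance
c = np.array([0.0, 0.0, -1.0])     # linprog minimises, so negate biomass
bounds = [(0, 10), (0, None), (0, None)]
res = linprog(c, A_eq=S, b_eq=np.zeros(2), bounds=bounds)
flux = res.x                       # optimal flux vector
```

At the optimum the steady-state constraint forces all three fluxes to equal the uptake bound, the toy analogue of biomass production being limited by substrate uptake; deleting a reaction corresponds to fixing its flux bounds to zero and re-solving.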
Abstract:
The Lincoln–Petersen estimator is one of the most popular estimators used in capture–recapture studies. It was developed for a sampling situation in which two sources independently identify members of a target population. For each of the two sources, it is determined if a unit of the target population is identified or not. This leads to a 2 × 2 table with frequencies f11, f10, f01, f00 indicating the number of units identified by both sources, by the first but not the second source, by the second but not the first source and not identified by any of the two sources, respectively. However, f00 is unobserved so that the 2 × 2 table is incomplete and the Lincoln–Petersen estimator provides an estimate for f00. In this paper, we consider a generalization of this situation for which one source provides not only a binary identification outcome but also a count outcome of how many times a unit has been identified. Using a truncated Poisson count model, truncating multiple identifications larger than two, we propose a maximum likelihood estimator of the Poisson parameter and, ultimately, of the population size. This estimator shows benefits, in comparison with Lincoln–Petersen’s, in terms of bias and efficiency. It is possible to test the homogeneity assumption that is not testable in the Lincoln–Petersen framework. The approach is applied to surveillance data on syphilis from Izmir, Turkey.
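The incomplete-table arithmetic is easy to make concrete. The counts below are invented, and the count-based extension uses a simple moment estimator of the Poisson rate rather than the paper's truncated maximum likelihood estimator:

```python
import numpy as np

# Lincoln-Petersen from the incomplete 2x2 table (f00 unobserved).
# All counts are invented for illustration.
f11, f10, f01 = 60, 140, 90
n1 = f11 + f10                 # units identified by source 1
n2 = f11 + f01                 # units identified by source 2
f00_hat = f10 * f01 / f11      # estimated unobserved cell
N_lp = n1 * n2 / f11           # Lincoln-Petersen population size estimate
# algebraically identical to f11 + f10 + f01 + f00_hat

# Count extension: if identification counts are Poisson(lam), then
# E[f2]/E[f1] = lam/2, so lam can be estimated from units seen once (f1)
# and twice (f2); N follows from P(identified) = 1 - exp(-lam).
# (The paper instead fits a Poisson model truncated above two by ML.)
f1, f2 = 150, 45
lam = 2.0 * f2 / f1
N_count = (f1 + f2) / (1.0 - np.exp(-lam))
```

The count-based estimate uses strictly more information than the binary 2 × 2 table, which is the source of the bias and efficiency gains reported in the abstract.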
Abstract:
In this paper, we study the role of the volatility risk premium for the forecasting performance of implied volatility. We introduce a non-parametric and parsimonious approach to adjust the model-free implied volatility for the volatility risk premium and implement this methodology using more than 20 years of options and futures data on three major energy markets. Using regression models and statistical loss functions, we find compelling evidence to suggest that the risk premium adjusted implied volatility significantly outperforms other models, including its unadjusted counterpart. Our main finding holds for different choices of volatility estimators and competing time-series models, underlining the robustness of our results.
Abstract:
A new sparse kernel density estimator is introduced based on the minimum integrated square error criterion for the finite mixture model. Since the constraint on the mixing coefficients of the finite mixture model places them on the multinomial manifold, we use the well-known Riemannian trust-region (RTR) algorithm for solving this problem. The first- and second-order Riemannian geometry of the multinomial manifold are derived and utilized in the RTR algorithm. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with accuracy competitive with that of existing kernel density estimators.
Abstract:
During the development of new therapies, it is not uncommon to test whether a new treatment works better than the existing treatment for all patients who suffer from a condition (full population) or for a subset of the full population (subpopulation). One approach that may be used for this objective is to have two separate trials, where in the first trial, data are collected to determine if the new treatment benefits the full population or the subpopulation. The second trial is a confirmatory trial to test the new treatment in the population selected in the first trial. In this paper, we consider the more efficient two-stage adaptive seamless designs (ASDs), where in stage 1, data are collected to select the population to test in stage 2. In stage 2, additional data are collected to perform confirmatory analysis for the selected population. Unlike the approach that uses two separate trials, for ASDs, stage 1 data are also used in the confirmatory analysis. Although ASDs are efficient, using stage 1 data both for selection and confirmatory analysis introduces selection bias and consequently statistical challenges in making inference. We will focus on point estimation for such trials. In this paper, we describe the extent of bias for estimators that ignore both the multiple hypotheses and the selection, based on observed stage 1 data, of the population most likely to give positive trial results. We then derive conditionally unbiased estimators and examine their mean squared errors for different scenarios.
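The selection bias described above can be reproduced with a short Monte Carlo experiment. The two-population setup, sample sizes and noise level are invented, and the naive pooled estimator stands in for a confirmatory analysis that ignores selection:

```python
import numpy as np

# Stage 1 picks the population with the larger observed effect; the
# naive estimate then pools stage 1 and stage 2 data for that population.
rng = np.random.default_rng(1)
theta = np.array([0.0, 0.0])   # true effects: no benefit in either population
n1 = n2 = 50                   # per-stage sample sizes
reps = 20000
naive = np.empty(reps)
for r in range(reps):
    stage1 = theta + rng.normal(0.0, 1.0, (n1, 2)).mean(axis=0)
    k = np.argmax(stage1)                       # population selection
    stage2 = theta[k] + rng.normal(0.0, 1.0, n2).mean()
    naive[r] = (n1 * stage1[k] + n2 * stage2) / (n1 + n2)
bias = naive.mean()            # positive although both true effects are zero
```

Even with both true effects at zero, the naive pooled estimate is systematically positive (here on the order of a few hundredths), because the selected stage 1 sample mean is a maximum; this is exactly the bias the conditionally unbiased estimators are designed to remove.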