120 results for Resampling
Abstract:
Significant progress has been made with regard to the quantitative integration of geophysical and hydrological data at the local scale. However, extending corresponding approaches beyond the local scale still represents a major challenge, yet is critically important for the development of reliable groundwater flow and contaminant transport models. To address this issue, I have developed a hydrogeophysical data integration technique based on a two-step Bayesian sequential simulation procedure that is specifically targeted towards larger-scale problems. The objective is to simulate the distribution of a target hydraulic parameter based on spatially exhaustive, but poorly resolved, measurements of a pertinent geophysical parameter and locally highly resolved, but spatially sparse, measurements of the considered geophysical and hydraulic parameters. To this end, my algorithm links the low- and high-resolution geophysical data via a downscaling procedure before relating the downscaled regional-scale geophysical data to the high-resolution hydraulic parameter field. I first illustrate the application of this novel data integration approach to a realistic synthetic database consisting of collocated high-resolution borehole measurements of the hydraulic and electrical conductivities and spatially exhaustive, low-resolution electrical conductivity estimates obtained from electrical resistivity tomography (ERT). The overall viability of this method is tested and verified by performing and comparing flow and transport simulations through the original and simulated hydraulic conductivity fields. The corresponding results indicate that the proposed data integration procedure does indeed allow for obtaining faithful estimates of the larger-scale hydraulic conductivity structure and reliable predictions of the transport characteristics over medium- to regional-scale distances. The approach is then applied to a corresponding field scenario consisting of collocated high-resolution measurements of the electrical conductivity, as measured using a cone penetrometer testing (CPT) system, and the hydraulic conductivity, as estimated from electromagnetic flowmeter and slug test measurements, in combination with spatially exhaustive low-resolution electrical conductivity estimates obtained from surface-based electrical resistivity tomography (ERT). The corresponding results indicate that the newly developed data integration approach is indeed capable of adequately capturing both the small-scale heterogeneity as well as the larger-scale trend of the prevailing hydraulic conductivity field.
The results also indicate that this novel data integration approach is remarkably flexible and robust and hence can be expected to be applicable to a wide range of geophysical and hydrological data at all scale ranges. In the second part of my thesis, I evaluate in detail the viability of sequential geostatistical resampling as a proposal mechanism for Markov chain Monte Carlo (MCMC) methods applied to high-dimensional geophysical and hydrological inverse problems, in order to allow for a more accurate and realistic quantification of the uncertainty associated with the inferred models. Focusing on a series of pertinent crosshole georadar tomographic examples, I investigate two classes of geostatistical resampling strategies with regard to their ability to efficiently and accurately generate independent realizations from the Bayesian posterior distribution. The corresponding results indicate that, despite its popularity, sequential resampling is rather inefficient at drawing independent posterior samples for realistic synthetic case studies, notably for the practically common and important scenario of pronounced spatial correlation between model parameters. To address this issue, I have developed a new gradual-deformation-based perturbation approach, which is flexible with regard to the number of model parameters as well as the perturbation strength. Compared to sequential resampling, this newly proposed approach proves to be highly effective in decreasing the number of iterations required for drawing independent samples from the Bayesian posterior distribution.
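As a concrete illustration of the gradual-deformation idea mentioned above, the sketch below combines two Gaussian random fields with cosine/sine weights so that the result keeps the prior statistics for any perturbation strength. This is a minimal sketch of the general technique, not the thesis implementation: the white-noise arrays stand in for geostatistical realizations sharing a common covariance, and `theta` is an illustrative perturbation strength.

```python
import numpy as np

def gradual_deformation(m_current, m_proposal, theta):
    """Combine two standard-Gaussian fields so that the result is again
    standard Gaussian for any theta (cos^2 + sin^2 = 1); theta controls
    the perturbation strength: theta = 0 keeps the current field,
    theta = pi/2 replaces it entirely."""
    return np.cos(theta) * m_current + np.sin(theta) * m_proposal

rng = np.random.default_rng(42)
m_old = rng.standard_normal((64, 64))   # current model realization
m_new = rng.standard_normal((64, 64))   # independent unconditional realization

# A small theta gives a small, spatially coherent perturbation of the whole
# field, which is what an MCMC proposal needs to keep acceptance rates usable.
m_perturbed = gradual_deformation(m_old, m_new, theta=0.1)
print(round(float(m_perturbed.std()), 3))   # stays close to 1 by construction
```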
Abstract:
In this paper we propose a subsampling estimator for the distribution of statistics diverging at either known or unknown rates when the underlying time series is strictly stationary and strong mixing. Based on our results we provide a detailed discussion of how to estimate extreme order statistics with dependent data and present two applications to assessing financial market risk. Our method performs well in estimating Value at Risk and provides a superior alternative to Hill's estimator in operationalizing Safety First portfolio selection.
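A minimal sketch of the block mechanics behind such a subsampling estimator, assuming an illustrative heavy-tailed return series and the 1% empirical quantile as a simple Value-at-Risk proxy. A full implementation would also recentre and rescale the subsample statistics by the (possibly estimated) divergence rate, which is omitted here.

```python
import numpy as np

def subsampling_distribution(x, statistic, block_size):
    """Evaluate `statistic` on every overlapping block of length `block_size`,
    preserving the serial dependence inside each block."""
    n = len(x)
    return np.array([statistic(x[i:i + block_size])
                     for i in range(n - block_size + 1)])

rng = np.random.default_rng(0)
returns = 0.01 * rng.standard_t(df=3, size=2000)     # heavy-tailed toy return series

# Statistic of interest: an extreme lower order statistic (the 1% empirical
# quantile), used here as a simple Value-at-Risk proxy.
var_1pct = lambda r: np.quantile(r, 0.01)

sub_vals = subsampling_distribution(returns, var_1pct, block_size=200)
print("full-sample VaR(1%):", round(var_1pct(returns), 4))
print("spread across subsamples:", np.round(np.quantile(sub_vals, [0.05, 0.95]), 4))
```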
Abstract:
Consider the problem of testing k hypotheses simultaneously. In this paper, we discuss finite and large sample theory of stepdown methods that provide control of the familywise error rate (FWE). In order to improve upon the Bonferroni method or Holm's (1979) stepdown method, Westfall and Young (1993) make effective use of resampling to construct stepdown methods that implicitly estimate the dependence structure of the test statistics. However, their methods depend on an assumption called subset pivotality. The goal of this paper is to construct general stepdown methods that do not require such an assumption. In order to accomplish this, we take a close look at what makes stepdown procedures work, and a key component is a monotonicity requirement of critical values. By imposing such monotonicity on estimated critical values (which is not an assumption on the model but an assumption on the method), it is demonstrated that the problem of constructing a valid multiple test procedure which controls the FWE can be reduced to the problem of constructing a single test which controls the usual probability of a Type 1 error. This reduction allows us to draw upon an enormous resampling literature as a general means of test construction.
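The stepdown construction can be sketched with a generic resampling-based (maxT-style) procedure in which critical values are quantiles of the maximum statistic over the hypotheses still under consideration, which is one way to obtain the monotonicity discussed above. The observed statistics and the null resampling below are toy placeholders, not the paper's construction.

```python
import numpy as np

def stepdown_maxT(t_obs, t_boot, alpha=0.05):
    """Resampling-based stepdown procedure (maxT style).
    t_obs : (k,) observed test statistics (large values = evidence against H0)
    t_boot: (B, k) statistics recomputed on data resampled under the null
    Critical values are quantiles of the maximum statistic over the hypotheses
    still in play, so they can only shrink as hypotheses are rejected
    (the monotonicity that makes stepdown control of the FWE work)."""
    reject = np.zeros(len(t_obs), dtype=bool)
    active = np.argsort(-t_obs)                 # most significant first
    while active.size:
        crit = np.quantile(t_boot[:, active].max(axis=1), 1 - alpha)
        if t_obs[active[0]] > crit:
            reject[active[0]] = True
            active = active[1:]                 # step down and re-test the rest
        else:
            break                               # stop: no further rejections
    return reject

# Toy example: 5 dependent test statistics, the first three carrying real effects
rng = np.random.default_rng(1)
t_obs = np.array([4.1, 3.5, 2.8, 1.0, 0.3])
t_boot = np.abs(rng.multivariate_normal(np.zeros(5), 0.5 * np.eye(5) + 0.5, size=2000))
print(stepdown_maxT(t_obs, t_boot))             # expected: first three rejected
```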
Abstract:
The objective of this paper is to compare the performance of two predictive radiological models, logistic regression (LR) and neural network (NN), with five different resampling methods. One hundred and sixty-seven patients with proven calvarial lesions as the only known disease were enrolled. Clinical and CT data were used for the LR and NN models. Both models were developed with cross-validation, leave-one-out and three different bootstrap algorithms. The final results of each model were compared in terms of error rate and the area under the receiver operating characteristic curve (Az). The neural network obtained a statistically higher Az than LR with cross-validation. The remaining resampling validation methods did not reveal statistically significant differences between the LR and NN rules. The neural network classifier performs better than the one based on logistic regression. This advantage is well detected by three-fold cross-validation, but remains unnoticed when leave-one-out or bootstrap algorithms are used.
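A rough sketch of how such a resampling-based comparison can be set up with scikit-learn. The synthetic data, feature count, and model settings are placeholders rather than the clinical/CT variables and architectures used in the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.utils import resample

# Synthetic stand-in for the clinical/CT feature set (167 patients in the study)
X, y = make_classification(n_samples=167, n_features=10, random_state=0)

models = {"LR": LogisticRegression(max_iter=1000),
          "NN": MLPClassifier(hidden_layer_sizes=(8,), max_iter=1000, random_state=0)}

for name, model in models.items():
    cv3 = cross_val_score(model, X, y, scoring="roc_auc",
                          cv=KFold(3, shuffle=True, random_state=0)).mean()
    loo = cross_val_score(model, X, y, cv=LeaveOneOut()).mean()   # accuracy per left-out case

    # Plain bootstrap: train on a resample, score the out-of-bag patients
    aucs = []
    for b in range(50):
        idx = resample(np.arange(len(y)), random_state=b)
        oob = np.setdiff1d(np.arange(len(y)), idx)
        if len(np.unique(y[oob])) == 2:                           # both classes needed for an AUC
            prob = model.fit(X[idx], y[idx]).predict_proba(X[oob])[:, 1]
            aucs.append(roc_auc_score(y[oob], prob))

    print(f"{name}: 3-fold AUC={cv3:.2f}  LOO accuracy={loo:.2f}  bootstrap AUC={np.mean(aucs):.2f}")
```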
Abstract:
This project introduces GNSS-SDR, an open source Global Navigation Satellite System software-defined receiver. The lack of reconfigurability of current commercial off-the-shelf receivers and the advent of new radionavigation signals and systems make software receivers an appealing approach to designing new architectures and signal processing algorithms. With the aim of exploring the full potential of this forthcoming scenario, with a plurality of new signal structures and frequency bands available for positioning, this work describes the software architecture design and provides details about its implementation, targeting a multiband, multisystem GNSS receiver. The result is a testbed for GNSS signal processing that allows any kind of customization, including interchangeability of signal sources, signal processing algorithms, interoperability with other systems, output formats, and the offering of interfaces to all the intermediate signals, parameters and variables. The source code release under the GNU General Public License (GPL) secures practical usability, inspection, and continuous improvement by the research community, allowing discussion based on tangible code and the analysis of results obtained with real signals. The source code is complemented by a development ecosystem, consisting of a website (http://gnss-sdr.org), as well as a revision control system, instructions for users and developers, and communication tools. The project shows in detail the design of the initial blocks of the Signal Processing Plane of the receiver: the signal conditioner, the acquisition block and the receiver channel. The project also extends the functionality of the acquisition and tracking modules of the GNSS-SDR receiver to track the new Galileo E1 signals. Each section provides a theoretical analysis, implementation details of each block and subsequent testing to confirm the calculations, with both synthetically generated signals and real signals from satellites in space.
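As a toy illustration of what an acquisition block does, the sketch below performs an FFT-based parallel code-phase search on a synthetic signal. The code length, Doppler grid, and noise level are arbitrary stand-ins and do not correspond to GNSS-SDR's actual Galileo E1 processing.

```python
import numpy as np

# Toy FFT-based parallel code-phase-search acquisition. All parameters are
# illustrative stand-ins, not GNSS-SDR's Galileo E1 configuration.
rng = np.random.default_rng(7)
n = 4096
code = rng.choice([-1.0, 1.0], size=n)        # stand-in spreading code, 1 sample/chip
true_shift, true_doppler_bin = 1234, 3        # unknowns the search should recover

t = np.arange(n)
carrier = lambda k: np.exp(2j * np.pi * k * t / n)   # k carrier cycles over the block

# Received block: delayed code, residual carrier, complex noise
rx = np.roll(code, true_shift) * carrier(true_doppler_bin)
rx = rx + 0.5 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

code_fft_conj = np.conj(np.fft.fft(code))
best_peak, best_bin, best_shift = -1.0, None, None
for k in range(-5, 6):                         # coarse Doppler grid
    # Wipe off the trial carrier, then correlate with the code for all code
    # phases at once via the FFT (circular cross-correlation).
    corr = np.abs(np.fft.ifft(np.fft.fft(rx * np.conj(carrier(k))) * code_fft_conj))
    if corr.max() > best_peak:
        best_peak, best_bin, best_shift = corr.max(), k, int(corr.argmax())

print("estimated code phase:", best_shift, "Doppler bin:", best_bin)
# expected to recover 1234 and 3 at this signal-to-noise ratio
```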
Abstract:
Volumetric soil water content (theta) can be evaluated in the field by direct or indirect methods. Among the direct methods, the gravimetric method is regarded as highly reliable and is thus often preferred. Its main disadvantages are that sampling and laboratory procedures are labor intensive, and that the method is destructive, which makes resampling of the same point impossible. Recently, the time domain reflectometry (TDR) technique has become a widely used indirect, non-destructive method to evaluate theta. In this study, evaluations of the apparent dielectric number of soils (epsilon) and samplings for the gravimetric determination of the volumetric soil water content (thetaGrav) were carried out at four sites of a Xanthic Ferralsol in Manaus, Brazil. With the obtained epsilon values, theta was estimated using empirical equations (thetaTDR) and compared with thetaGrav derived from disturbed and undisturbed samples. The main objective of this study was the comparison of thetaTDR estimates from horizontally as well as vertically inserted probes with the thetaGrav values determined from disturbed and undisturbed samples. Results showed that the thetaTDR estimates of vertically inserted probes and the average of the horizontally measured layers differed only slightly and insignificantly. However, significant differences were found between the thetaTDR estimates of different equations and between disturbed and undisturbed samples in the thetaGrav determinations. The use of the theoretical Knight et al. model, which permits an evaluation of the soil volume assessed by TDR probes, is also discussed. It was concluded that the TDR technique, when properly calibrated, permits in situ, non-destructive measurements of theta in Xanthic Ferralsols with an accuracy similar to that of the gravimetric method.
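For reference, converting the apparent dielectric number epsilon into thetaTDR is typically done with an empirical polynomial; the sketch below uses the widely cited Topp et al. (1980) equation as an assumed example, which is not necessarily the calibration adopted for this Xanthic Ferralsol.

```python
def theta_topp(epsilon):
    """Volumetric water content (m3/m3) from the apparent dielectric number,
    using the empirical Topp et al. (1980) third-order polynomial."""
    return (-5.3e-2 + 2.92e-2 * epsilon
            - 5.5e-4 * epsilon ** 2 + 4.3e-6 * epsilon ** 3)

# Example: a TDR reading of epsilon = 25 corresponds to roughly 0.40 m3/m3
print(round(theta_topp(25.0), 3))
```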
Abstract:
BACKGROUND: The objective is to develop a cost-effective, reliable and non-invasive screening test able to detect early CRCs and adenomas, based on a nucleic acid multigene assay performed on peripheral blood mononuclear cells (PBMCs). METHODS: A colonoscopy-controlled study was conducted on 179 subjects. 92 subjects (21 CRC, 30 adenoma >1 cm and 41 controls) were used as a training set to generate a signature. Another 48 subjects, kept blinded (controls, CRC and polyps), were used as a test set. To determine organ and disease specificity, 38 further subjects were included: 24 with inflammatory bowel disease (IBD) and 14 with other cancers (OC). Blood samples were taken and PBMCs were purified. After RNA extraction, multiplex RT-qPCR was applied to 92 different candidate biomarkers. After different univariate and multivariate analyses, 60 biomarkers with significant p-values (<0.01) were selected. Two distinct biomarker signatures, named COLOX CRC and COLOX POL, are used to separate patients without lesions from those with CRC or with adenoma. COLOX performance was validated using a random resampling method, the bootstrap. RESULTS: The COLOX CRC and POL tests successfully separate patients without lesions from those with CRC (Se 67%, Sp 93%, AUC 0.87) and from those with adenoma >1 cm (Se 63%, Sp 83%, AUC 0.77). 6/24 patients in the IBD group and 1/14 patients in the OC group had a positive COLOX CRC test. CONCLUSION: The two COLOX tests demonstrated high Se and Sp in detecting the presence of CRCs and adenomas >1 cm. A prospective, multicenter, pivotal study is underway to confirm these promising results in a larger cohort.
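The bootstrap validation of performance figures such as those above can be sketched as follows: patients are resampled with replacement and the AUC of a classifier score is recomputed each time to obtain a confidence band. The class sizes mirror the training set, but the scores are synthetic placeholders, not the COLOX biomarker signature.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
# Synthetic stand-in: 41 controls and 21 CRC patients with a continuous test score
y = np.array([0] * 41 + [1] * 21)
score = np.concatenate([rng.normal(0.0, 1.0, 41), rng.normal(1.5, 1.0, 21)])

aucs = []
for b in range(2000):
    idx = rng.integers(0, len(y), len(y))     # resample patients with replacement
    if len(np.unique(y[idx])) == 2:           # both classes needed to compute an AUC
        aucs.append(roc_auc_score(y[idx], score[idx]))

print("point estimate AUC:", round(roc_auc_score(y, score), 2))
print("bootstrap 95% CI:", np.round(np.percentile(aucs, [2.5, 97.5]), 2))
```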
Abstract:
The objective of this study was to determine the minimum number of plants per plot that must be sampled in experiments with sugarcane (Saccharum officinarum) full-sib families in order to provide an effective estimation of genetic and phenotypic parameters of yield-related traits. The data were collected in a randomized complete block design with 18 sugarcane full-sib families and 6 replicates, with 20 plants per plot. The sample size was determined using resampling techniques with replacement, followed by an estimation of genetic and phenotypic parameters. Sample-size estimates varied according to the evaluated parameter and trait. The resampling method permits an efficient comparison of the sample-size effects on the estimation of genetic and phenotypic parameters. A sample of 16 plants per plot, or 96 individuals per family, was sufficient to obtain good estimates for all the traits evaluated. However, if samples could be taken separately for each trait, ten plants per plot would provide an efficient estimate for Brix.
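A minimal sketch of the resampling-with-replacement idea, assuming made-up trait values for the 20 plants of one plot and using the stability of the estimated plot mean as the criterion; the actual study estimated genetic and phenotypic parameters across families, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(10)
plot = rng.normal(100.0, 15.0, size=20)     # made-up trait values for the 20 plants of one plot

for n in (4, 8, 12, 16, 20):
    # Draw many resamples of n plants (with replacement) and check how stable
    # the estimated plot mean is for that sample size.
    means = np.array([rng.choice(plot, size=n, replace=True).mean()
                      for _ in range(5000)])
    cv = 100 * means.std() / means.mean()
    print(f"n = {n:2d}  mean = {means.mean():6.1f}  CV of the estimate = {cv:4.1f}%")
```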
Abstract:
The objective of this work was to identify polymorphic simple sequence repeat (SSR) markers for varietal identification of cotton and evaluation of the genetic distance among the varieties. Initially, 92 SSR markers were genotyped in 20 Brazilian cotton cultivars. Of this total, 38 loci were polymorphic, two of which were amplified by one primer pair; the mean number of alleles per locus was 2.2. The values of polymorphic information content (PIC) and discrimination power (DP) were, on average, 0.374 and 0.433, respectively. The mean genetic distance was 0.397 (minimum of 0.092 and maximum of 0.641). A panel of 96 varieties originating from different regions of the world was assessed with 21 polymorphic loci derived from 17 selected primer pairs. Among these varieties, the mean genetic distance was 0.387 (minimum of 0 and maximum of 0.786). The dendrograms generated by the unweighted pair group method with arithmetic average (UPGMA) did not reflect the regions of Brazil (20 genotypes) or of the world (96 genotypes) where the varieties or lines were selected. Bootstrap resampling shows that genotype identification is viable with 19 loci. The polymorphic markers evaluated are useful for varietal identification in a large panel of cotton varieties and may be applied in studies of diversity in the species.
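For reference, PIC values such as those quoted above are computed from allele frequencies; a sketch using the Botstein et al. (1980) formula is given below, with an illustrative two-allele locus (the actual frequencies and loci of the study are not reproduced).

```python
import numpy as np

def pic(freqs):
    """Polymorphic information content of a locus from its allele frequencies
    (Botstein et al., 1980)."""
    p = np.asarray(freqs, dtype=float)
    heterozygosity = 1.0 - np.sum(p ** 2)
    # correction term that distinguishes PIC from simple expected heterozygosity
    correction = sum(2 * p[i] ** 2 * p[j] ** 2
                     for i in range(len(p)) for j in range(i + 1, len(p)))
    return heterozygosity - correction

# Illustrative two-allele SSR locus with frequencies 0.6 and 0.4
print(round(pic([0.6, 0.4]), 3))
```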
Abstract:
The pharmacokinetics of scorpion venom and its toxins has been investigated in experimental models using adult animals, although severe scorpion accidents are more frequently associated with children. We compared the effect of age on the pharmacokinetics of tityustoxin, one of the most active principles of Tityus serrulatus venom, in young male/female rats (21-22 days old, N = 5-8) and in adult male rats (150-160 days old, N = 5-8). Tityustoxin (6 µg) labeled with technetium-99m (99mTc) was administered subcutaneously to young and adult rats. The plasma concentration vs time data were subjected to non-compartmental pharmacokinetic analysis to obtain estimates of various pharmacokinetic parameters such as total body clearance (CL/F), distribution volume (Vd/F), area under the curve (AUC), and mean residence time. The data were analyzed with and without considering body weight. The data without correction for body weight showed a higher Cmax (62.30 ± 7.07 vs 12.71 ± 2.11 ng/ml, P < 0.05) and AUC (296.49 ± 21.09 vs 55.96 ± 5.41 ng h-1 ml-1, P < 0.05) and a lower Tmax (0.64 ± 0.19 vs 2.44 ± 0.49 h, P < 0.05) in young rats. Furthermore, Vd/F (0.15 vs 0.42 l/kg) and CL/F (0.02 ± 0.001 vs 0.11 ± 0.01 l h-1 kg-1, P < 0.05) were lower in young rats. However, when the data were reanalyzed taking body weight into consideration, the Cmax (40.43 ± 3.25 vs 78.21 ± 11.23 ng kg-1 ml-1, P < 0.05) and AUC (182.27 ± 11.74 vs 344.62 ± 32.11 ng h-1 ml-1, P < 0.05) were lower in young rats. The clearance (0.03 ± 0.002 vs 0.02 ± 0.002 l h-1 kg-1, P < 0.05) and Vd/F (0.210 vs 0.067 l/kg) were higher in young rats. The raw data (not adjusted for body weight) strongly suggest that age plays a pivotal role in the disposition of tityustoxin. Furthermore, our results also indicate that the differences in the severity of symptoms observed in children and adults after scorpion envenomation can be explained in part by differences in the pharmacokinetics of the toxin.
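The non-compartmental parameters listed above (Cmax, Tmax, AUC, CL/F, Vd/F) follow directly from the plasma concentration-time curve. The sketch below uses made-up concentrations, the linear trapezoidal rule, and a terminal-slope extrapolation; it is not the analysis applied to the tityustoxin data.

```python
import numpy as np

# Made-up plasma concentration-time profile after a 6 ug subcutaneous dose
t = np.array([0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 12.0])      # h
c = np.array([20.0, 45.0, 60.0, 48.0, 25.0, 8.0, 2.5])   # ng/ml

dose_ng = 6000.0
auc_last = np.sum(np.diff(t) * (c[:-1] + c[1:]) / 2)     # linear trapezoidal rule, ng*h/ml
cmax, tmax = c.max(), t[c.argmax()]

# Terminal slope (lambda_z) from the last three points gives CL/F and Vd/F
lambda_z = -np.polyfit(t[-3:], np.log(c[-3:]), 1)[0]
auc_inf = auc_last + c[-1] / lambda_z                    # extrapolate the tail to infinity
cl_f = dose_ng / auc_inf                                 # apparent clearance, ml/h
vd_f = cl_f / lambda_z                                   # apparent volume of distribution, ml

print(f"Cmax = {cmax} ng/ml at Tmax = {tmax} h, AUC(0-inf) = {auc_inf:.0f} ng*h/ml")
print(f"CL/F = {cl_f:.1f} ml/h, Vd/F = {vd_f:.0f} ml")
```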
Abstract:
In the present study, we compared the performance of a ThinPrep cytological method with the conventional Papanicolaou test for diagnosis of cytopathological changes, with regard to unsatisfactory results achieved at the Central Public Health Laboratory of the State of Pernambuco. A population-based, cross-sectional study was performed with women aged 18 to 65 years, who spontaneously sought gynecological services in Public Health Units in the State of Pernambuco, Northeast Brazil, between April and November 2011. All patients in the study were given a standardized questionnaire on sociodemographics, sexual characteristics, reproductive practices, and habits. A total of 525 patients were assessed by the two methods (11.05% were under the age of 25 years, 30.86% were single, 4.4% had had more than 5 sexual partners, 44% were not using contraception, 38.85% were users of alcohol, 24.38% were smokers, 3.24% had consumed drugs previously, 42.01% had gynecological complaints, and 12.19% had an early history of sexually transmitted diseases). The two methods showed poor agreement (kappa=0.19; 95%CI=0.11–0.26; P<0.001). The ThinPrep method reduced the rate of unsatisfactory results from 4.38% to 1.71% (χ2=5.28; P=0.02), and the number of cytopathological changes diagnosed increased from 2.47% to 3.04%. This study confirmed that adopting the ThinPrep method for diagnosis of cervical cytological samples was an improvement over the conventional method. Furthermore, this method may reduce possible losses from cytological resampling and reduce obstacles to patient follow-up, improving the quality of the public health system in the State of Pernambuco, Northeast Brazil.
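The agreement statistic quoted above (kappa = 0.19) is presumably Cohen's kappa. The sketch below shows how it is computed from a 2x2 cross-classification of the two methods, using made-up counts rather than the study's data.

```python
import numpy as np

def cohen_kappa(table):
    """Cohen's kappa for a square agreement table (rows: method A, columns: method B)."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    p_observed = np.trace(table) / n
    p_expected = table.sum(axis=1) @ table.sum(axis=0) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# Made-up 2x2 cross-classification (abnormal/normal calls by the two methods)
print(round(cohen_kappa([[10, 12], [9, 494]]), 2))
```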
Abstract:
This thesis manuscript studies the suitability of a recent data assimilation method, the Variational Ensemble Kalman Filter (VEnKF), for real-life fluid dynamic problems in hydrology. VEnKF combines a variational formulation of the data assimilation problem, based on minimizing an energy functional, with an Ensemble Kalman filter approximation to the Hessian matrix that also serves as an approximation to the inverse of the error covariance matrix. One of the significant features of VEnKF is the very frequent re-sampling of the ensemble: resampling is done at every observation step. This unusual feature is taken even further by observation interpolation, which is found to be beneficial for numerical stability; in that case the ensemble is resampled at every time step of the numerical model. VEnKF is applied in several configurations to data from a real laboratory-scale dam-break problem modelled with the shallow water equations. It is also tried on a two-layer quasi-geostrophic atmospheric flow problem. In both cases VEnKF proves to be an efficient and accurate data assimilation method that renders the analysis more realistic than the numerical model alone. It also proves to be robust against filter instability, thanks to its adaptive nature.
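A minimal sketch of the stochastic ensemble Kalman analysis step that VEnKF-type filters build on, and whose output ensemble is what gets resampled at every observation (or, with observation interpolation, every model) step. The state size, observation operator, and noise levels are placeholders, not the dam-break or quasi-geostrophic configurations.

```python
import numpy as np

def enkf_analysis(ensemble, y_obs, H, obs_var, rng):
    """One stochastic ensemble Kalman analysis step.
    ensemble: (n_state, n_members) forecast ensemble
    y_obs   : (n_obs,) observations
    H       : (n_obs, n_state) linear observation operator"""
    n_obs, n_members = len(y_obs), ensemble.shape[1]
    X = ensemble - ensemble.mean(axis=1, keepdims=True)
    P = X @ X.T / (n_members - 1)                       # ensemble covariance
    R = obs_var * np.eye(n_obs)
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)        # Kalman gain
    # Perturbed observations: each member is pulled toward its own noisy copy
    Y = y_obs[:, None] + rng.normal(0.0, np.sqrt(obs_var), (n_obs, n_members))
    return ensemble + K @ (Y - H @ ensemble)

rng = np.random.default_rng(5)
truth = np.sin(np.linspace(0, 2 * np.pi, 40))           # toy state, e.g. a water-level profile
ens = rng.normal(0.0, 1.0, (40, 30))                    # 30-member forecast ensemble
H = np.eye(40)[::4]                                     # observe every fourth grid point
y = H @ truth + rng.normal(0.0, 0.1, H.shape[0])

ens_a = enkf_analysis(ens, y, H, obs_var=0.01, rng=rng)
print("forecast RMSE:", round(float(np.sqrt(np.mean((ens.mean(1) - truth) ** 2))), 3))
print("analysis RMSE:", round(float(np.sqrt(np.mean((ens_a.mean(1) - truth) ** 2))), 3))
```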
Abstract:
The relationships between vine water status, soil texture, and vine size were observed in four Niagara, Ontario Pinot noir vineyards in 2008 and 2009. The vineyards were divided into water status zones using geographic information systems (GIS) software to map the seasonal mean midday leaf water potential (Ψ) and the dormant pruning shoot weights following the 2008 season. Fruit was harvested from all sentinel vines, bulked by water status zone, and made into wine. Sensory analysis included a multidimensional sorting (MDS) task and descriptive analysis (DA) of the 2008 wines. Airborne multispectral images, with a spatial resolution of 38 cm, were captured four times in 2008 and three times in 2009, with the final flights around veraison. A semi-automatic process was developed to extract the normalized difference vegetation index (NDVI) from the images, and a masking procedure was identified to create a vine-only NDVI image. 2008 and 2009 were cooler and wetter than average years, and the range of water status zones was narrow. Yield per vine, vine size, anthocyanins and phenols were the least consistent variables. Whether divided by water status or by vine size, no variable differed between zones in all four vineyards in either year. Wines were not different between water status zones in any chemical analysis, and HPLC revealed no differences in individual anthocyanins or phenolic compounds between water status zones within the vineyard sites. There were some notable correlations between vineyard and grape composition variables, and spatial trends were observed to be qualitatively related for many of the variables. The MDS task revealed that wines from each vineyard were more affected by random fermentation effects than by water status effects. This was confirmed by the DA; there were no differences between wines from the water status zones within vineyard sites for any attribute. Remotely sensed NDVI correlated reasonably well with a number of grape composition variables, as well as with soil type. Resampling to a lower spatial resolution did not appreciably affect the strength of the correlations and corresponded to the information contained in the masked images, while maintaining the range of NDVI values. This study showed that in cool climates there is potential for using precision viticulture techniques to understand the variability in vineyards, but the variable weather presents a challenge for understanding the driving forces of that variability.
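The NDVI extraction and the resampling to a coarser spatial resolution can be sketched in a few lines. The arrays below stand in for the red and near-infrared bands of the airborne imagery, and simple block averaging stands in for whatever resampling scheme was actually used.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized difference vegetation index, computed per pixel."""
    return (nir - red) / (nir + red + 1e-9)

def block_average(img, factor):
    """Resample to a coarser grid by averaging factor x factor pixel blocks."""
    h = (img.shape[0] // factor) * factor
    w = (img.shape[1] // factor) * factor
    return img[:h, :w].reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

rng = np.random.default_rng(8)
red = rng.uniform(0.02, 0.10, (400, 400))      # toy reflectance bands at 38 cm pixels
nir = rng.uniform(0.30, 0.60, (400, 400))

v = ndvi(nir, red)
v_coarse = block_average(v, factor=8)          # roughly 3 m pixels from 38 cm pixels
print(round(float(v.mean()), 3), round(float(v_coarse.mean()), 3))   # means are preserved
```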
Abstract:
Cognitive control involves the ability to flexibly adjust cognitive processing in order to resist interference and promote goal-directed behaviour. Although the frontal cortex is considered to be broadly involved in cognitive control, the mechanisms by which frontal brain areas implement control functions are unclear. Furthermore, aging is associated with reductions in the ability to implement control functions, and questions remain as to whether unique cortical responses serve a compensatory role in maintaining maximal performance in later years. Described here are three studies in which electrophysiological data were recorded while participants performed modified versions of the standard Sternberg task. The goal was to determine how top-down control is implemented in younger adults and altered in aging. In Study 1, the effects of frequent stimulus repetition on the interference-related N450 were investigated in a Sternberg task with a small stimulus set (requiring extensive stimulus resampling) and a task with a large stimulus set (requiring no stimulus resampling). The data indicated that the constant stimulus resampling required when employing small stimulus sets can undercut the effect of proactive interference on the N450. In Study 2, younger and older adults were tested in a standard version of the Sternberg task to determine whether the unique frontal positivity, previously shown to predict memory impairment in older adults during a proactive interference task, would be associated with improved performance when memory recognition could be aided by unambiguous stimulus familiarity. Here, results indicated that the frontal positivity was associated with poorer memory performance, replicating the effect observed in a more cognitively demanding task and showing that stimulus familiarity does not mediate compensatory cortical activations in older adults. Although the frontal positivity could be interpreted to reflect maladaptive cortical activation, it may also reflect attempts at compensation that fail to fully ameliorate age-related decline. Furthermore, the frontal positivity may be the result of older adults' reliance on late-occurring, controlled processing, in contrast to younger adults' ability to identify stimuli at very early stages of processing. In the final study, working memory load was manipulated in the proactive interference Sternberg task in order to investigate whether the N450 reflects simple interference detection, with little need for cognitive resources, or an active conflict resolution mechanism that requires executive resources to implement. Independent component analysis was used to isolate the effect of interference, revealing that the canonical N450 was based on two dissociable cognitive control mechanisms: a left frontal negativity that reflects active interference resolution but requires executive resources to implement, and a right frontal negativity that reflects global response inhibition that can be relied on when executive resources are minimal, but at the cost of a slowed response. Collectively, these studies advance understanding of the factors that influence younger and older adults' ability to satisfy goal-directed behavioural requirements in the face of interference, and of the effects of age-related cognitive decline.
Abstract:
Emerging markets have received wide attention from investors around the globe because of their return potential and risk diversification. This research examines the selection and timing performance of Canadian mutual funds which invest in fixed-income and equity securities in emerging markets. We use (un)conditional two- and five-factor benchmark models that accommodate the dynamics of returns in emerging markets. We also adopt the cross-sectional bootstrap methodology to distinguish between ‘skill’ and ‘luck’ for individual funds. All the tests are conducted using a comprehensive data set of emerging-market bond and equity funds over the period 1989-2011. The risk-adjusted measures of performance are estimated using the least squares method with the Newey-West adjustment for standard errors, which is robust to conditional heteroskedasticity and autocorrelation. The performance statistics of the emerging funds before (after) management-related costs are insignificantly positive (significantly negative). They are sensitive to the chosen benchmark model, and conditional information improves selection performance. The timing statistics are largely insignificant throughout the sample period and are not sensitive to the benchmark model. Evidence of timing and selection ability is obtained in a small number of funds, and this evidence is not sensitive to the fee structure. We also find that a majority of individual funds provide zero (and very few provide positive) abnormal returns before fees and significantly negative returns after fees. At the negative end of the tail of the performance distribution, our resampling tests fail to reject the role of bad luck in the poor performance of funds, and we conclude that most of them are merely ‘unlucky’.
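A minimal sketch of a cross-sectional bootstrap of fund alphas under the null of zero abnormal performance, in the spirit of the methodology mentioned above. The single-factor benchmark, fund count, and sample length are placeholders, not the paper's (un)conditional multi-factor models, and plain OLS is used instead of Newey-West standard errors.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n_funds = 240, 50                                # 20 years of monthly data, 50 funds
factor = rng.normal(0.005, 0.04, T)                 # one benchmark excess-return series
betas = rng.uniform(0.5, 1.2, n_funds)
returns = factor[:, None] * betas + rng.normal(0.0, 0.02, (T, n_funds))   # true alphas are zero

def alphas(r, f):
    """OLS intercept (alpha) of each fund's returns on the benchmark factor."""
    X = np.column_stack([np.ones_like(f), f])
    coef, *_ = np.linalg.lstsq(X, r, rcond=None)
    return coef[0]

a_obs = alphas(returns, factor)

# Cross-sectional bootstrap under the null: resample months, strip each fund's
# estimated alpha so that any bootstrapped alpha reflects luck alone.
a_null = np.empty((1000, n_funds))
for b in range(1000):
    idx = rng.integers(0, T, T)
    a_null[b] = alphas(returns[idx] - a_obs, factor[idx])

best = a_obs.max()
p_value = np.mean(a_null.max(axis=1) >= best)
print(f"best observed alpha {best:.4f}, bootstrap p-value {p_value:.2f}")
```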