986 results for statistical detection


Relevance: 100.00%

Abstract:

In instrumental records of daily precipitation, we often encounter one or more periods in which values below some threshold were not registered. Such periods, besides lacking small values, also have a large number of dry days. Their cumulative distribution function is shifted to the right relative to that of other portions of the record with more reliable observations. Such problems are examined in this work, based mostly on the two-sample Kolmogorov–Smirnov (KS) test, in which the portion of the series with a greater number of dry days is compared with the portion with fewer dry days. Another relatively common problem in daily rainfall data is the prevalence of integers, either throughout the period of record or in some part of it, likely resulting from truncation during data compilation prior to archiving or from coarse rounding of daily readings by observers. This problem is identified by a simple calculation of the proportion of integers in the series, taking the expected proportion as 10%. The above two procedures were applied to the daily rainfall data sets of the European Climate Assessment (ECA), Southeast Asian Climate Assessment (SACA), and Brazilian Water Resources Agency (BRA). Taking a KS statistic D > 0.15 with a corresponding p-value < 0.001 as the condition for classifying a given series as suspicious, the proportions of the ECA, SACA, and BRA series falling into this category are, respectively, 34.5%, 54.3%, and 62.5%. With respect to the coarse-rounding problem, the proportions of series exceeding twice the 10% reference level are 3%, 60%, and 43% for the ECA, SACA, and BRA data sets, respectively. A simple way to visualize the two problems addressed here is to plot the time series of daily rainfall over a limited range, for instance 0–10 mm day⁻¹.
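A minimal sketch of the two screening checks described above, assuming the record is a 1-D numpy array of daily totals in mm; the half-and-half split heuristic and the flagging thresholds follow the abstract, but the function names are hypothetical:

```python
import numpy as np
from scipy import stats

def ks_dry_day_check(rain, d_crit=0.15, p_crit=0.001):
    """Compare the half of the record with more dry days against the other half."""
    half = len(rain) // 2
    a, b = rain[:half], rain[half:]
    # The portion with the larger share of dry days (zeros) is the suspect one.
    suspect, reference = (a, b) if (a == 0).mean() > (b == 0).mean() else (b, a)
    # Two-sample KS test on wet-day amounts only.
    d, p = stats.ks_2samp(suspect[suspect > 0], reference[reference > 0])
    return (d > d_crit and p < p_crit), d, p

def coarse_rounding_check(rain, reference=0.10, factor=2.0):
    """Flag a series whose share of integer wet-day values exceeds 2x the 10% level."""
    wet = rain[rain > 0]
    prop_int = (wet == np.round(wet)).mean()
    return prop_int > factor * reference, prop_int
```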

Relevance: 100.00%

Abstract:

Spam has become a critical problem in online social networks. This paper focuses on Twitter spam detection. Recent work applies machine learning techniques to Twitter spam detection, making use of the statistical features of tweets. We observe that existing machine-learning-based detection methods suffer from the problem of Twitter spam drift, i.e., the statistical properties of spam tweets vary over time. An effective way to avoid this problem is to train a new Twitter spam classifier every day. However, this faces the challenge of small and imbalanced training sets, because labelling spam samples is time-consuming. This paper proposes a new method to address this challenge. The method employs two new techniques: fuzzy-based redistribution and asymmetric sampling. We develop a fuzzy-based information decomposition technique to redistribute the spam class and generate more spam samples. Moreover, an asymmetric sampling technique is proposed to rebalance the sizes of the spam and non-spam samples in the training data. Finally, we apply an ensemble technique to combine the spam classifiers trained over the two different training sets. A number of experiments are performed on a real-world 10-day ground-truth dataset to evaluate the new method. Experimental results show that the new method can significantly improve detection performance for drifting Twitter spam.
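The fuzzy-based information decomposition technique is specific to the paper, so the sketch below substitutes plain random oversampling to illustrate the general rebalance-then-ensemble pattern; the helper name, the toy data, and the choice of base classifiers are all assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.naive_bayes import GaussianNB

def oversample_minority(X, y, seed=0):
    """Duplicate minority-class rows at random until the classes are balanced."""
    rng = np.random.default_rng(seed)
    minority = min(set(y.tolist()), key=y.tolist().count)
    idx = np.flatnonzero(y == minority)
    extra = rng.choice(idx, size=(y != minority).sum() - len(idx), replace=True)
    keep = np.concatenate([np.arange(len(y)), extra])
    return X[keep], y[keep]

# Toy imbalanced day: 180 non-spam vs 20 spam feature vectors.
X = np.random.rand(200, 6)
y = np.array([0] * 180 + [1] * 20)
Xb, yb = oversample_minority(X, y)

# Ensemble of two classifiers trained on the rebalanced set.
ensemble = VotingClassifier([("rf", RandomForestClassifier(random_state=0)),
                             ("nb", GaussianNB())], voting="soft")
ensemble.fit(Xb, yb)
```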

Relevance: 70.00%

Abstract:

In this paper we report on the outcomes of a research and demonstration project on human intrusion detection in a large secure space using an ad hoc wireless sensor network. This project has been a unique experience in collaborative research, involving ten investigators (with expertise in areas such as sensors, circuits, computer systems, communication and networking, signal processing, and security) in executing a large funded project that spanned three to four years. We describe the specific engineering solution that was developed: the various architectural choices and the associated specific designs. In addition to producing a demonstrable system, the problems that arose gave rise to a large amount of basic research in areas such as geographical packet routing, distributed statistical detection, sensors and associated circuits, a low-power adaptive micro-radio, and power-optimising embedded systems software. We provide an overview of the research results obtained.

Relevance: 60.00%

Abstract:

There are two statistical decision-making questions in statistically detecting signs of denial-of-service flooding attacks. One is how to represent the distributions of the detection probability, false-alarm probability, and miss probability. The other is how to quantitatively express a decision region within which one may make a decision that has a high detection probability, a low false-alarm probability, and a low miss probability. This paper answers both questions, and a case study is demonstrated.
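A generic worked example of the three probabilities and a resulting decision region over the detection threshold, assuming a Gaussian test statistic that shifts from mean 0 under normal traffic to mean mu1 under attack; this is an illustration, not the paper's specific model:

```python
import numpy as np
from scipy.stats import norm

mu1, sigma = 4.0, 1.0                # assumed attack shift and noise level
t = np.linspace(0.0, 6.0, 601)       # candidate detection thresholds

p_fa   = norm.sf(t, loc=0.0, scale=sigma)   # false alarm: statistic exceeds t under H0
p_d    = norm.sf(t, loc=mu1, scale=sigma)   # detection: statistic exceeds t under H1
p_miss = 1.0 - p_d                          # miss probability

# Decision region: thresholds giving high detection, low false alarm, low miss.
ok = (p_d >= 0.9) & (p_fa <= 0.01) & (p_miss <= 0.1)
print(f"acceptable thresholds: {t[ok].min():.2f} to {t[ok].max():.2f}")
```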

Relevance: 60.00%

Abstract:

Mineralogic, petrographic, and geochemical analyses of sediments recovered from two Leg 166 Ocean Drilling Program cores on the western slope of Great Bahama Bank (308 m and 437 m water depth) are used to characterize early marine diagenesis of these shallow-water, periplatform carbonates. The most pronounced diagenetic products are well-lithified intervals found almost exclusively in glacial lowstand deposits and interpreted to have formed at or near the seafloor (i.e., hardgrounds). Hardground cements are composed of high-Mg calcite (~14 mol% MgCO3) and exhibit textures typically associated with seafloor cementation. Geochemically, hardgrounds are characterized by increased δ¹⁸O and Mg contents and decreased δ¹³C, Sr, and Na contents relative to their less lithified counterparts. Despite being deposited in shallow waters that are supersaturated with respect to the common carbonate minerals, these sediments are clearly also undergoing shallow subsurface diagenesis. Calculation of saturation states shows that pore waters become undersaturated with respect to aragonite within the upper 10 m at both sites. Dissolution, and likely recrystallization, of metastable carbonates is manifested by increases in the interstitial-water Sr and Sr/Ca profiles with depth. We infer that the reduction in mineral saturation states and the subsequent dissolution are driven by the oxidation of organic matter in this Fe-poor carbonate system. Precipitation of burial diagenetic phases is indicated by the down-core appearance of dolomite and a corresponding decrease in interstitial-water Mg, and by the presence of low-Mg calcite cements observed in scanning electron microscope photomicrographs.
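The saturation-state calculation mentioned above reduces to an ion concentration product over a stoichiometric solubility constant; a schematic version, with placeholder seawater values and an order-of-magnitude constant rather than numbers from the study:

```python
# Omega = [Ca2+][CO3 2-] / Ksp'; Omega < 1 means undersaturated (dissolution favoured).
ca  = 0.0103            # mol/kg, typical seawater calcium (assumed)
co3 = 8.0e-5            # mol/kg, pore-water carbonate ion (assumed; falls with depth)
ksp_aragonite = 6.7e-7  # mol^2/kg^2, order-of-magnitude stoichiometric constant

omega = ca * co3 / ksp_aragonite
print(f"Omega_aragonite = {omega:.2f}")   # ~1.2 here: mildly supersaturated
```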

Relevance: 40.00%

Abstract:

Background: Developing sampling strategies to target biological pests such as insects in stored grain is inherently difficult owing to species biology and behavioural characteristics. The design of robust sampling programmes should be based on an underlying statistical distribution that is sufficiently flexible to capture variations in the spatial distribution of the target species. Results: Comparisons are made of the accuracy of four probability-of-detection sampling models - the negative binomial model [1], the Poisson model [1], the double logarithmic model [2], and the compound model [3] - for the detection of insects over a broad range of insect densities. Although the double logarithmic and negative binomial models performed well under specific conditions, it is shown that, of the four models examined, the compound model performed best over a broad range of insect spatial distributions and densities. In particular, this model predicted well the number of samples required when insect densities were high and clumped within experimental storages. Conclusions: This paper reinforces the need for effective sampling programmes designed to detect insects over a broad range of spatial distributions. The compound model is robust over a broad range of insect densities and leads to a substantial improvement in detection probabilities within highly variable systems such as grain storage.
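A sketch contrasting two of the four models: the probability of detecting at least one insect in n independent samples at a common mean density m, where the Poisson model ignores aggregation and the negative binomial clumping parameter k captures it (all parameter values are illustrative):

```python
import numpy as np

def p_detect_poisson(m, n):
    return 1.0 - np.exp(-m * n)              # per-sample P(zero) = e^{-m}

def p_detect_negbin(m, k, n):
    return 1.0 - (1.0 + m / k) ** (-k * n)   # per-sample P(zero) = (1 + m/k)^{-k}

m, k = 0.1, 0.2   # low mean density, strong clumping
for n in (10, 30, 100):
    print(n, round(p_detect_poisson(m, n), 3), round(p_detect_negbin(m, k, n), 3))
# Clumping lowers the detection probability at the same mean density, so more
# samples are needed than the Poisson model would suggest.
```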

Relevance: 40.00%

Abstract:

Power calculation and sample size determination are critical in designing environmental monitoring programs. The traditional approach, based on comparing mean values, may become statistically inappropriate and even invalid when a substantial proportion of the response values are below the detection limits or otherwise censored, because strong distributional assumptions then have to be made about the censored observations when implementing the traditional procedures. In this paper, we propose a quantile methodology that is robust to outliers and can also handle data with a substantial proportion of below-detection-limit observations without the need to impute the censored values. As a demonstration, we applied the methods to a nutrient monitoring project that is part of the Perth Long-Term Ocean Outlet Monitoring Program. In this example, the sample size required by our quantile methodology is, in fact, smaller than that required by the traditional t-test, illustrating the merit of our method.
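The censoring-robust core of a quantile approach fits in a few lines: testing whether a given quantile exceeds a guideline needs only the count of observations above the guideline, so values censored below the detection limit never have to be imputed (provided the detection limit is below the guideline). The numbers here are hypothetical, not from the Perth program:

```python
from scipy.stats import binomtest

guideline, q = 0.30, 0.80   # guideline concentration (mg/L) and quantile of interest
n_total, n_above = 60, 18   # 18 of 60 observations exceed the guideline

# H0: the q-quantile is at or below the guideline, i.e. P(X > guideline) <= 1 - q.
result = binomtest(n_above, n_total, p=1 - q, alternative="greater")
print(f"p-value = {result.pvalue:.3f}")
```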

Relevance: 40.00%

Abstract:

Statistical learning can be used to extract the words from continuous speech. Gómez, Bion, and Mehler (Language and Cognitive Processes, 26, 212–223, 2011) proposed an online measure of statistical learning: They superimposed auditory clicks on a continuous artificial speech stream made up of a random succession of trisyllabic nonwords. Participants were instructed to detect these clicks, which could be located either within or between words. The results showed that, over the length of exposure, reaction times (RTs) increased more for within-word than for between-word clicks. This result has been accounted for by means of statistical learning of the between-word boundaries. However, even though statistical learning occurs without an intention to learn, it nevertheless requires attentional resources. Therefore, this process could be affected by a concurrent task such as click detection. In the present study, we evaluated the extent to which the click detection task indeed reflects successful statistical learning. Our results suggest that the emergence of RT differences between within- and between-word click detection is neither systematic nor related to the successful segmentation of the artificial language. Therefore, instead of being an online measure of learning, the click detection task seems to interfere with the extraction of statistical regularities.

Relevance: 40.00%

Abstract:

Motivation: Microarray experiments generate a high volume of data. However, often owing to financial or experimental considerations, e.g. lack of sample, there is little or no replication of the experiments or hybridizations. These factors, combined with the intrinsic variability associated with measuring gene expression, can result in an unsatisfactory detection rate of differential gene expression (DGE). Our motivation was to provide an easy-to-use measure of the success rate of DGE detection that could find routine use in the design of microarray experiments or in post-experiment assessment.
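As a generic illustration of the underlying issue (not the paper's own measure), a standard two-sample power calculation shows how the chance of detecting a fixed log2 fold change falls with few replicates and high per-gene variability; the effect size, variability, and alpha below are assumptions:

```python
from statsmodels.stats.power import TTestIndPower

effect = 1.0 / 0.8           # log2 fold change of 1.0 over an assumed per-gene SD of 0.8
for n in (2, 3, 5, 8):       # replicates per condition
    power = TTestIndPower().power(effect_size=effect, nobs1=n, alpha=0.01)
    print(f"{n} replicates -> power {power:.2f}")
```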

Relevance: 40.00%

Abstract:

Anti-islanding protection is becoming increasingly important owing to the rapid installation of distributed generation from renewable resources such as wind, tidal and wave, solar PV, and biofuels, as well as from other resources such as diesel. Unintentional islanding presents a potential risk of damage to utility plant and to equipment connected on the demand side, as well as a risk to the public and to personnel in utility plants. This paper investigates automatic islanding detection. This is achieved by deploying a statistical process control approach to fault detection on real-time data acquired through a wide-area measurement system based on Phasor Measurement Unit (PMU) technology. In particular, principal component analysis (PCA) is used to project the data into the principal component subspace and the residual space, and two statistics are used to detect the occurrence of a fault. A fault reconstruction method is then used to identify the fault and its development over time. The proposed scheme has been applied to a real system, and the results confirm that the proposed method can correctly identify the fault and the islanding site.
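A minimal sketch of the PCA monitoring pattern described above, assuming a matrix of normal-operation PMU measurements: Hotelling's T² watches the principal subspace and the squared prediction error (SPE/Q) watches the residual space. Empirical percentile control limits stand in for the usual chi-square/F approximations, and the fault-reconstruction step is omitted:

```python
import numpy as np

def fit_pca_monitor(X_normal, n_pc=3, pct=99.0):
    """Fit scaling, loadings, and T^2/SPE control limits on normal data."""
    mu, sd = X_normal.mean(0), X_normal.std(0)
    Z = (X_normal - mu) / sd
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_pc].T                          # retained loadings
    lam = s[:n_pc] ** 2 / (len(Z) - 1)       # retained eigenvalues
    t2 = ((Z @ P) ** 2 / lam).sum(1)         # Hotelling T^2, training data
    spe = ((Z - Z @ P @ P.T) ** 2).sum(1)    # squared prediction error
    return mu, sd, P, lam, (np.percentile(t2, pct), np.percentile(spe, pct))

def is_fault(x, model):
    """Flag a new measurement vector if either statistic exceeds its limit."""
    mu, sd, P, lam, (t2_lim, spe_lim) = model
    z = (x - mu) / sd
    t2 = ((z @ P) ** 2 / lam).sum()
    spe = ((z - z @ P @ P.T) ** 2).sum()
    return t2 > t2_lim or spe > spe_lim
```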

Relevance: 40.00%

Abstract:

Thesis digitized by the Division de la gestion de documents et des archives, Université de Montréal.

Relevance: 40.00%

Abstract:

The characterization and grading of glioma tumors via image-derived features, for diagnosis, prognosis, and treatment response, has been an active research area in medical image computing. This paper presents a novel method for the automatic detection and classification of glioma from conventional T2-weighted MR images. Automatic detection of the tumor was established using a newly developed method called the Adaptive Gray-level Algebraic set Segmentation Algorithm (AGASA). Statistical features were extracted from the detected tumor texture using first-order statistics and gray-level co-occurrence matrix (GLCM) based second-order statistical methods. The statistical significance of the features was determined by a t-test and its corresponding p-value. A decision system was developed for glioma grade detection using the selected features and their p-values. The detection performance of the decision system was validated using the receiver operating characteristic (ROC) curve. The diagnosis and grading of glioma using this non-invasive method can provide promising results in medical image computing.
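A sketch of the second-order feature step only, using scikit-image's GLCM utilities on hypothetical 8-bit image patches and a t-test between two groups; the AGASA segmentation is the authors' own method and is not reproduced here:

```python
import numpy as np
from scipy.stats import ttest_ind
from skimage.feature import graycomatrix, graycoprops

def glcm_features(patch_u8):
    """Contrast, homogeneity, energy, and correlation from an 8-bit patch."""
    glcm = graycomatrix(patch_u8, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    return [graycoprops(glcm, p).mean()
            for p in ("contrast", "homogeneity", "energy", "correlation")]

# Toy usage: compare the contrast feature between two groups of random patches.
rng = np.random.default_rng(0)
grp1 = [glcm_features(rng.integers(0, 256, (32, 32), dtype=np.uint8))[0] for _ in range(10)]
grp2 = [glcm_features(rng.integers(0, 256, (32, 32), dtype=np.uint8))[0] for _ in range(10)]
t, p = ttest_ind(grp1, grp2)
print(f"contrast: t = {t:.2f}, p = {p:.3f}")
```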

Relevance: 40.00%

Abstract:

Two types of ecological thresholds are now being widely used to develop conservation targets: breakpoint-based thresholds represent tipping points where system properties change dramatically, whereas classification thresholds identify groups of data points with contrasting properties. Both breakpoint-based and classification thresholds are useful tools in evidence-based conservation. However, it is critical that the type of threshold to be estimated corresponds with the question of interest and that appropriate statistical procedures are used to determine its location. On the basis of their statistical properties, we recommend using piecewise regression methods to identify breakpoint-based thresholds and discriminant analysis or classification and regression trees to identify classification thresholds.
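Both threshold types have simple baseline estimators; a sketch assuming a single predictor x: a grid-searched two-segment (piecewise) linear fit for a breakpoint-based threshold, and the root split of a depth-1 decision tree (a stump) as a classification threshold:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def breakpoint_threshold(x, y):
    """Grid-search the breakpoint minimising the SSE of a two-segment linear fit."""
    best_sse, best_bp = np.inf, None
    for bp in np.quantile(x, np.linspace(0.1, 0.9, 81)):
        X = np.column_stack([np.ones_like(x), x, np.maximum(x - bp, 0.0)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = ((y - X @ beta) ** 2).sum()
        if sse < best_sse:
            best_sse, best_bp = sse, bp
    return best_bp

def classification_threshold(x, labels):
    """Split point of a depth-1 decision tree fitted to one variable."""
    stump = DecisionTreeClassifier(max_depth=1).fit(x.reshape(-1, 1), labels)
    return stump.tree_.threshold[0]
```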