930 results for Data Interpretation, Statistical


Relevance:

30.00%

Publisher:

Abstract:

In this paper, we consider applying a derived knowledge base regarding the sensitivity and specificity of the damage(s) to be detected by an SHM system being designed and qualified. These efforts are necessary for developing the capability of an SHM system to reliably classify various probable damage types through a sequence of monitoring steps, i.e., damage precursor identification, damage detection, and monitoring of damage progression. We consider the particular problem of visual and ultrasonic NDE-based SHM system design requirements, where the damage detection sensitivity and specificity data definitions for a class of structural components are established. Methodologies for creating SHM system specifications are discussed in detail. Examples illustrate how the physics of a damage detection scheme limits the detection sensitivity and specificity for particular damage types, and how this information can be used in algorithms that combine different NDE schemes in an SHM system to enhance efficiency and effectiveness. Statistical and data-driven models for determining the sensitivity and probability of damage detection (POD) are demonstrated for a plate with a one-sided line crack of varying length, using optical and ultrasonic inspection techniques.
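As a rough illustration of the POD modeling mentioned above, a probability-of-detection curve can be fitted to binary hit/miss inspection outcomes with a logistic model. This is a minimal sketch under assumed synthetic data; the crack lengths, learning rate, and the a90 (90% detection) threshold below are illustrative assumptions, not the authors' actual model or data.

```python
import numpy as np

def fit_pod_logistic(lengths, detected, lr=0.1, steps=5000):
    """Fit POD(a) = 1 / (1 + exp(-(b0 + b1*a))) by gradient ascent
    on the Bernoulli log-likelihood of the hit/miss outcomes."""
    b0, b1 = 0.0, 0.0
    n = len(lengths)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(b0 + b1 * lengths)))
        b0 += lr * np.sum(detected - p) / n
        b1 += lr * np.sum((detected - p) * lengths) / n
    return b0, b1

# synthetic hit/miss inspection data: detection gets easier as cracks grow
rng = np.random.default_rng(0)
lengths = rng.uniform(0.5, 5.0, 200)                    # crack lengths (mm)
true_p = 1.0 / (1.0 + np.exp(-(-4.0 + 2.0 * lengths)))
detected = (rng.uniform(size=200) < true_p).astype(float)

b0, b1 = fit_pod_logistic(lengths, detected)
a90 = (np.log(0.9 / 0.1) - b0) / b1   # crack length with 90% detection probability
```

The fitted slope b1 encodes the sensitivity of the inspection technique: a steeper curve means a sharper transition from "undetectable" to "reliably detected" crack lengths.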

Relevance:

30.00%

Publisher:

Abstract:

Gene microarray technology is highly effective in screening for differential gene expression and has hence become a popular tool in the molecular investigation of cancer. When applied to tumours, molecular characteristics may be correlated with clinical features such as response to chemotherapy. Exploitation of the huge amount of data generated by microarrays is difficult, however, and constitutes a major challenge in the advancement of this methodology. Independent component analysis (ICA), a modern statistical method, allows us to better understand data in such complex and noisy measurement environments. The technique has the potential to significantly increase the quality of the resulting data and improve the biological validity of subsequent analysis. We performed microarray experiments on 31 postmenopausal endometrial biopsies, comprising 11 benign and 20 malignant samples. We compared ICA to the established methods of principal component analysis (PCA), Cyber-T, and SAM. We show that ICA generated patterns that clearly characterized the malignant samples studied, in contrast to PCA. Moreover, ICA improved the biological validity of the genes identified as differentially expressed in endometrial carcinoma, compared to those found by Cyber-T and SAM. In particular, several genes involved in lipid metabolism that are differentially expressed in endometrial carcinoma were only found using this method. This report highlights the potential of ICA in the analysis of microarray data.
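As a rough illustration of the ICA idea described above, a minimal FastICA-style unmixing (whitening, tanh nonlinearity, symmetric orthogonalization) can recover independent non-Gaussian sources from their linear mixtures. The signals below are synthetic, not microarray data, and the implementation is a sketch rather than the method used in the paper.

```python
import numpy as np

def whiten(X):
    """Center the rows of X and rotate/scale them to unit covariance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / Xc.shape[1]
    d, E = np.linalg.eigh(cov)
    return (E @ np.diag(1.0 / np.sqrt(d)) @ E.T) @ Xc

def fastica(X, n_iter=200):
    """Symmetric FastICA with a tanh contrast function."""
    Z = whiten(X)
    n = Z.shape[0]
    W = np.linalg.qr(np.random.default_rng(1).normal(size=(n, n)))[0]
    for _ in range(n_iter):
        G = np.tanh(W @ Z)
        W_new = G @ Z.T / Z.shape[1] - np.diag((1 - G**2).mean(axis=1)) @ W
        U, _, Vt = np.linalg.svd(W_new)    # symmetric decorrelation
        W = U @ Vt
    return W @ Z

# two independent non-Gaussian sources, linearly mixed
rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
S = np.vstack([np.sign(np.sin(3 * t)), rng.laplace(size=t.size)])
A = np.array([[1.0, 0.6], [0.5, 1.0]])
X = A @ S
S_hat = fastica(X)
```

Unlike PCA, which only decorrelates, the tanh contrast drives the estimates toward statistically independent, non-Gaussian components, which is what makes ICA useful for isolating biologically meaningful expression patterns.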

Relevance:

30.00%

Publisher:

Abstract:

Rarefied gas flows through micro-channels are simulated using two particle approaches: the information preservation (IP) method and the direct simulation Monte Carlo (DSMC) method. In simulating low-speed flows in long micro-channels, the DSMC method encounters the problem of a large required sample size and the difficulty of regulating boundary conditions at the inlet and outlet. Some important computational issues in the calculation of long micro-channel flows with the IP method are addressed, such as the use of the conservative form of the mass conservation equation to guarantee the adjustment of the inlet and outlet boundary conditions, and a super-relaxation scheme to accelerate convergence. Stream-wise pressure distributions and mass fluxes through micro-channels given by the IP method agree well with experimental data measured in long micro-channels by Pong et al. (with a height-to-length ratio of 1.2:3000), Shih et al. (1.2:4800), and Arkilic et al. and Arkilic (1.3:7500), respectively. The well-known Knudsen minimum of the normalized mass flux is observed in IP and DSMC calculations of a short micro-channel over the entire flow regime from continuum to free molecular, whereas the slip Navier-Stokes solution fails to predict it.

Relevance:

30.00%

Publisher:

Abstract:

The stress release model, a stochastic version of the elastic rebound theory, is applied to the large events from four synthetic earthquake catalogs generated by models with various levels of disorder in the distribution of fault zone strength (Ben-Zion, 1996). These include models with uniform properties (U), a Parkfield-type asperity (A), fractal brittle properties (F), and multi-size-scale heterogeneities (M). The results show that the degree of regularity or predictability in the assumed fault properties, based on both the Akaike information criterion and simulations, follows the order U, F, A, M, in good agreement with the order obtained by pattern recognition techniques applied to the full set of synthetic data. Data simulated from the best-fitting stress release models reproduce, both visually and in distributional terms, the main features of the original catalogs. The differences in character and in quality of prediction between the four cases are shown to depend on two main aspects: the parameter controlling the sensitivity to departures from the mean stress level, and the frequency-magnitude distribution, which differs substantially between the four cases. In particular, it is shown that the predictability of the data is strongly affected by the form of the frequency-magnitude distribution, being greatly reduced if a pure Gutenberg-Richter form is assumed to hold out to high magnitudes.
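The Akaike information criterion used above trades goodness of fit against model complexity: AIC = 2k - 2 log L, with the lower value preferred. A minimal sketch of such a comparison, on synthetic interevent times rather than the stress release model itself, follows; the exponential-vs-lognormal choice below is purely illustrative.

```python
import numpy as np

def aic(loglik, k):
    """Akaike information criterion: 2*(number of parameters) - 2*log-likelihood."""
    return 2 * k - 2 * loglik

rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=500)   # synthetic interevent times

# Model 1: exponential (Poisson process), MLE rate = 1/mean, k = 1
lam = 1.0 / x.mean()
ll_exp = np.sum(np.log(lam) - lam * x)

# Model 2: lognormal, MLE on the log-data, k = 2
mu, sig = np.log(x).mean(), np.log(x).std()
ll_ln = np.sum(-np.log(x * sig * np.sqrt(2 * np.pi))
               - (np.log(x) - mu) ** 2 / (2 * sig ** 2))

aic_exp, aic_ln = aic(ll_exp, 1), aic(ll_ln, 2)
best = "lognormal" if aic_ln < aic_exp else "exponential"
```

Because the data were generated from a lognormal, the lognormal model attains the lower AIC despite its extra parameter; the same bookkeeping ranks competing stress release fits.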

Relevance:

30.00%

Publisher:

Abstract:

A relationship between the cumulative length of microcracks and the amplitude and duration of the tensile impulse in spallation was established based on statistical microdamage mechanics, which includes a statistical formulation and dynamic laws of microdamage evolution under loading. Since the degrees of spallation, termed incipient, intermediate and complete spallation, can be characterized by the cumulative length of microcracks, a physical interpretation of an empirical spallation criterion was presented.

Relevance:

30.00%

Publisher:

Abstract:

A DNA microarray, or DNA chip, is a technology that allows us to obtain the expression levels of many genes in a single experiment. Because numerical expression values are easily obtained, multiple statistical techniques can be applied to the data. In this project, microarray data are obtained from Gene Expression Omnibus, the repository of the National Center for Biotechnology Information (NCBI). The noise is then removed and the data are normalized; hypothesis tests are used to find the most relevant genes that may be involved in a disease, and machine learning methods such as KNN, Random Forest, and K-means are applied. The analysis is performed with Bioconductor, a collection of R packages for the analysis of biological data, and a case study on Alzheimer's disease is conducted. The complete code can be found at https://github.com/alberto-poncelas/bioc-alzheimer
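The hypothesis-testing step described above typically amounts to a per-gene two-sample test between conditions. The project itself uses R/Bioconductor; the following is a numpy-only sketch of that step on a synthetic expression matrix, with the gene counts, sample sizes, and effect size all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_ctrl, n_case = 1000, 10, 10
expr = rng.normal(0.0, 1.0, size=(n_genes, n_ctrl + n_case))
expr[:20, n_ctrl:] += 2.0          # make the first 20 genes up-regulated in cases

# crude per-array normalization: center each column (sample)
expr -= expr.mean(axis=0, keepdims=True)

def welch_t(a, b):
    """Per-gene Welch t statistic between two groups of arrays."""
    va, vb = a.var(axis=1, ddof=1), b.var(axis=1, ddof=1)
    return (a.mean(axis=1) - b.mean(axis=1)) / np.sqrt(va / a.shape[1] + vb / b.shape[1])

t = welch_t(expr[:, :n_ctrl], expr[:, n_ctrl:])
top = np.argsort(np.abs(t))[::-1][:20]   # the 20 most differential genes
```

In practice the t statistics would be converted to p-values with a multiple-testing correction before the gene list is passed on to classifiers such as KNN or Random Forest.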

Relevance:

30.00%

Publisher:

Abstract:

Hyper-spectral data allow the construction of more robust statistical models of material properties than the standard tri-chromatic color representation. However, because of the large dimensionality and complexity of hyper-spectral data, the extraction of robust features (image descriptors) is not a trivial issue. Thus, to facilitate efficient feature extraction, decorrelation techniques are commonly applied to reduce the dimensionality of the hyper-spectral data, with the aim of generating compact and highly discriminative image descriptors. Current methodologies for data decorrelation, such as principal component analysis (PCA), linear discriminant analysis (LDA), wavelet decomposition (WD), or band selection methods, require complex and subjective training procedures, and in addition the compressed spectral information is not directly related to the physical (spectral) characteristics of the analyzed materials. The major objective of this article is to introduce and evaluate a new data decorrelation methodology using an approach that closely emulates human vision. The proposed data decorrelation scheme has been employed to optimally minimize the amount of redundant information contained in the highly correlated hyper-spectral bands and has been comprehensively evaluated in the context of non-ferrous material classification.
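As a baseline for the decorrelation techniques listed above, PCA projects the highly correlated spectral bands onto a few uncorrelated components. The following sketch applies PCA to a synthetic hyper-spectral cube; the cube dimensions and the three-latent-spectra construction are illustrative assumptions, not the article's data or its human-vision-based method.

```python
import numpy as np

def pca_decorrelate(cube, n_keep):
    """Project an (H, W, B) hyper-spectral cube onto its n_keep leading
    principal components, yielding decorrelated band images."""
    H, W, B = cube.shape
    X = cube.reshape(-1, B).astype(float)
    X -= X.mean(axis=0)                        # center each band
    cov = X.T @ X / X.shape[0]
    vals, vecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    comps = vecs[:, ::-1][:, :n_keep]          # leading eigenvectors
    return (X @ comps).reshape(H, W, n_keep)

# synthetic cube: 31 highly correlated bands built from 3 latent spectra
rng = np.random.default_rng(0)
latent = rng.normal(size=(32 * 32, 3))
mixing = rng.normal(size=(3, 31))
cube = (latent @ mixing + 0.05 * rng.normal(size=(32 * 32, 31))).reshape(32, 32, 31)

reduced = pca_decorrelate(cube, 3)
```

The resulting components are exactly uncorrelated, which illustrates the article's criticism as well: the compressed axes are statistical constructs with no direct tie to the physical spectra of the materials.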

Relevance:

30.00%

Publisher:

Abstract:

The main objective of this study is to describe and characterize the behaviour of fish prices in Nigeria. Drawing upon aspects of the data from a nationwide fish survey in 1980/81 and on various secondary data, the study analyses the pattern of fish price movements and makes projections of fish prices in Nigeria up to 2002 A.D. It is concluded that unless efforts are directed at stemming inflation in fish prices, the prices paid by fish consumers in Nigeria will more than double within the next two decades.

Relevance:

30.00%

Publisher:

Abstract:

Transcription factor binding sites (TFBS) play key roles in gene expression and regulation. They are short sequence segments with definite structure and can be correctly recognized by the corresponding transcription factors. From the viewpoint of statistics, candidate TFBS should be quite different from segments randomly assembled from nucleotides. This paper proposes a combined statistical model for finding over-represented short sequence segments in different kinds of data sets. While the over-represented short sequence segment is described by a position weight matrix, the nucleotide distribution at most sites of the segment should be far from the background nucleotide distribution. The central idea of this approach is to search for such signals. The algorithm is tested on three data sets: a binding-site data set of the cyclic AMP receptor protein in E. coli; PlantProm DB, a non-redundant collection of proximal promoter sequences from different species; and a collection of the intergenic sequences of the whole genome of E. coli. Even though the complexity of these three data sets is quite different, the results show that the model is rather general and sensitive.
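The position weight matrix mentioned above can be built from a set of aligned binding sites and used to score candidate segments against a background distribution. A minimal sketch follows; the toy sites, the pseudocount value, and the uniform background are illustrative assumptions, not the paper's model.

```python
import numpy as np

IDX = {"A": 0, "C": 1, "G": 2, "T": 3}

def pwm_from_sites(sites, pseudo=0.5):
    """Log-odds position weight matrix (vs a uniform background)
    from a list of aligned, equal-length binding sites."""
    L = len(sites[0])
    counts = np.full((4, L), pseudo)       # pseudocounts avoid log(0)
    for s in sites:
        for j, base in enumerate(s):
            counts[IDX[base], j] += 1
    freqs = counts / counts.sum(axis=0, keepdims=True)
    return np.log2(freqs / 0.25)

def score(pwm, seq):
    """Sum of per-position log-odds: high scores mark TFBS-like segments."""
    return sum(pwm[IDX[b], j] for j, b in enumerate(seq))

sites = ["TGTGA", "TGTGA", "TTTGA", "TGTGC", "TGAGA"]   # toy aligned sites
pwm = pwm_from_sites(sites)
```

A motif-like segment then scores well above a random one, which is exactly the "far from background" signal the model searches for.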

Relevance:

30.00%

Publisher:

Abstract:

After several years of surveys of Kainji Lake fisheries activities by the Nigerian-German Kainji Lake Fisheries Promotion Project (KLFPP), trends regarding catches, yield and other parameters have begun to emerge. However, it became obvious that some of the data were not quite as accurate as they were believed to be. Comparing the different editions of the statistical bulletin of Kainji Lake for a given fisheries parameter sometimes reveals inconsistencies and unexplained trends. In contrast to the survey method, PRA is primarily suited to analysing differences in local phenomena and processes. PRA was therefore used as a complementary tool to improve knowledge on issues such as fisherwomen, entrepreneurs, gear ownership structure, the mode of operation of owners of large numbers of gear, preferences in the use of twine and nylon gill nets, and reasons for misinformation about the number of fishing equipment owned by entrepreneurs, none of which can be addressed with a frame survey. PRA techniques such as timelines, mapping, seasonal calendars, transect walks and key informant interviews were utilized in the study.

Relevance:

30.00%

Publisher:

Abstract:

The bulletin presents summary tables and charts on levels of fishing activity, fishing effort, yields and economic values of yields for the fisheries of Kainji Lake, Nigeria for the year 1997. Frame survey data and fishing gear measurements are also included. (PDF contains 34 pages)

Relevance:

30.00%

Publisher:

Abstract:

A tabulated summary is presented of the main fisheries data collected to date (1998) by the Nigerian-German Kainji Lake Fisheries Promotion Project, together with a current overview of the fishery. The data are given under the following sections: 1) Fishing localities and types; 2) Frame survey data; 3) Number of licensed fishermen by state; 4) Mesh size distribution; 5) Fishing net characteristics; 6) Fish yield; 7) Total annual fishing effort by gear type; 8) Total annual value of fish landed by gear type; 9) Graphs of effort and CPUE by gear type. (PDF contains 36 pages)

Relevance:

30.00%

Publisher:

Abstract:

A tabulated summary is presented of the main Lake Kainji fisheries data collected to date (1999) by the Nigerian-German Kainji Lake Fisheries Promotion Project, together with a current overview of the fishery. The data are given under the following sections: 1) Fishing localities and types; 2) Frame survey data; 3) Number of licensed fishermen by state; 4) Mesh size distribution; 5) Fishing net characteristics; 6) Fish yield; 7) Average monthly CPUE by gear type; 8) Average monthly fishing activity by gear type; 9) Total annual fishing effort by gear type; 10) Total annual value of fish landed by gear type; 11) Trends of the total yield by gear type. (PDF contains 34 pages)

Relevance:

30.00%

Publisher:

Abstract:

The brain is perhaps the most complex system to have ever been subjected to rigorous scientific investigation. The scale is staggering: over 10^11 neurons, each making an average of 10^3 synapses, with computation occurring on scales ranging from a single dendritic spine, to an entire cortical area. Slowly, we are beginning to acquire experimental tools that can gather the massive amounts of data needed to characterize this system. However, to understand and interpret these data will also require substantial strides in inferential and statistical techniques. This dissertation attempts to meet this need, extending and applying the modern tools of latent variable modeling to problems in neural data analysis.

It is divided into two parts. The first begins with an exposition of the general techniques of latent variable modeling. A new, extremely general optimization algorithm is proposed, called Relaxation Expectation Maximization (REM), which may be used to learn the optimal parameter values of arbitrary latent variable models. This algorithm appears to alleviate the common problem of convergence to local, sub-optimal likelihood maxima. REM leads to a natural framework for model size selection; in combination with standard model selection techniques, the quality of fits may be further improved, while the appropriate model size is automatically and efficiently determined. Next, a new latent variable model, the mixture of sparse hidden Markov models, is introduced, and approximate inference and learning algorithms are derived for it. This model is applied in the second part of the thesis.
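The expectation-maximization machinery that REM builds on alternates between computing posterior responsibilities for the latent variables (E-step) and re-estimating parameters from weighted data (M-step). A minimal sketch on a two-component one-dimensional Gaussian mixture follows; this is plain EM on synthetic data, not the REM algorithm or the spike models of the thesis.

```python
import numpy as np

def em_gmm_1d(x, n_iter=100):
    """Minimal EM for a two-component 1-D Gaussian mixture."""
    mu = np.array([x.min(), x.max()])          # spread-out initialization
    var = np.array([x.var(), x.var()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point
        dens = pi / np.sqrt(2 * np.pi * var) * \
            np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted parameter updates
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
    return pi, mu, var

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(2, 0.5, 700)])
pi, mu, var = em_gmm_1d(x)
```

Each iteration provably does not decrease the data likelihood; REM's contribution, as described above, is a relaxation scheme that helps such updates escape poor local maxima and choose the model size.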

The second part brings the technology of part I to bear on two important problems in experimental neuroscience. The first is known as spike sorting; this is the problem of separating the spikes from different neurons embedded within an extracellular recording. The dissertation offers the first thorough statistical analysis of this problem, which then yields the first powerful probabilistic solution. The second problem addressed is that of characterizing the distribution of spike trains recorded from the same neuron under identical experimental conditions. A latent variable model is proposed. Inference and learning in this model leads to new principled algorithms for smoothing and clustering of spike data.