936 results for Pattern recognition, cluster finding, calibration and fitting methods


Relevance: 100.00%

Abstract:

New arguments are presented proving that successive (repeated) measurements have a memory and actually remember each other. Recognizing this peculiarity can substantially change the existing paradigm of conventional observation of the behavior of different complex systems and lead to the application of an intermediate model (IM). This IM can provide a very accurate fit of the measured data in terms of the Prony decomposition. This decomposition, in turn, contains a small set of fitting parameters relative to the number of initial data points and allows measured data to be compared in cases where a "best fit" model based on specific physical principles is absent. As an example, we consider two X-ray diffractometers (denoted in the paper as A, "cheap", and B, "expensive") that are used, after proper calibration, to measure the same substance (corundum, α-Al2O3). The amplitude-frequency response (AFR) obtained within the Prony decomposition can be used to compare the spectra recorded by the A and B X-ray diffractometers (XRDs) for calibration and other practical purposes. We also show that the Fourier decomposition corresponds to an "ideal" experiment without memory, while the Prony decomposition corresponds to a real measurement and can be fitted within the IM in this case. New statistical parameters describing the properties of experimental equipment (irrespective of its internal "filling") are found. The suggested approach is rather general and can be used for the calibration and comparison of different complex dynamical systems for practical purposes.
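
The abstract gives no implementation, but the classical Prony fit it builds on can be sketched briefly. Below is a minimal Python sketch, assuming a real-valued signal and a user-chosen model order p; the function name prony_fit and the toy signal are illustrative, not the authors' code:

```python
import numpy as np

def prony_fit(x, p):
    """Fit x[n] ~ sum_k h_k * z_k**n with p damped exponential terms."""
    N = len(x)
    # Step 1: linear-prediction coefficients by least squares.
    A = np.column_stack([x[p - 1 - k: N - 1 - k] for k in range(p)])
    a = np.linalg.lstsq(A, -x[p:], rcond=None)[0]
    # Step 2: the poles z_k are the roots of the characteristic polynomial.
    z = np.roots(np.concatenate(([1.0], a)))
    # Step 3: complex amplitudes from a Vandermonde least-squares solve.
    V = np.vander(z, N, increasing=True).T      # V[n, k] = z_k**n
    h = np.linalg.lstsq(V, x.astype(complex), rcond=None)[0]
    return z, h

# Toy check on a damped oscillation: two terms fit 200 points, illustrating
# the small parameter set relative to the number of data points.
n = np.arange(200, dtype=float)
x = 2.0 * 0.97**n * np.cos(0.3 * n)
z, h = prony_fit(x, p=2)
x_hat = (np.vander(z, len(x), increasing=True).T @ h).real
print(np.max(np.abs(x - x_hat)))                # small residual expected
```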

Relevance: 100.00%

Abstract:

Usually, data warehouse populating processes are data-oriented workflows composed of dozens of granular tasks responsible for integrating data coming from different data sources. Specific subsets of these tasks can be grouped into a collection, together with their relationships, to form higher-level constructs. Increasing task granularity allows processes to be generalized, simplifying their views and providing ways to carry expertise over to new applications. Well-proven practices can be used to describe general solutions based on basic skeletons that are configured and instantiated according to a set of specific integration requirements. Patterns can be applied to ETL processes not only to simplify a possible conceptual representation but also to reduce the gap that often exists between the two design perspectives. In this paper, we demonstrate the feasibility and effectiveness of an ETL pattern-based approach using task clustering, analyzing a real-world ETL scenario through the definition of two commonly used clusters of tasks: a data lookup cluster and a data conciliation and integration cluster.
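
As a concrete illustration of grouping granular tasks into a reusable higher-level construct, here is a minimal Python sketch of a data lookup cluster, assuming a simple surrogate-key lookup; the names (lookup_cluster, unknown_key) and the toy dimension table are illustrative assumptions, not taken from the paper:

```python
from typing import Dict, Iterable, List

def lookup_cluster(rows: Iterable[dict], key_field: str,
                   dimension: Dict[str, int], unknown_key: int = -1) -> List[dict]:
    """Attach a surrogate key to each row; route misses to a default member."""
    out = []
    for row in rows:
        enriched = dict(row)
        enriched["surrogate_key"] = dimension.get(row[key_field], unknown_key)
        out.append(enriched)
    return out

# Instantiating the same skeleton for a hypothetical customer dimension:
dim_customer = {"C001": 1, "C002": 2}
rows = [{"customer_id": "C001", "amount": 10.0},
        {"customer_id": "C999", "amount": 5.0}]   # miss -> unknown member
print(lookup_cluster(rows, "customer_id", dim_customer))
```

The point of the pattern is that the same skeleton can be configured for any dimension, so dozens of near-identical granular lookup tasks collapse into one parameterized construct.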

Relevance: 100.00%

Abstract:

In recent decades, increased interest has been evident in research on multi-scale hierarchical modelling in the field of mechanics, and also in the field of wood products and timber engineering. One of the main motivations for hierarchical modelling is to understand how properties, composition and structure at lower scale levels may influence, and be used to predict, the material properties at the macroscopic and structural engineering scales. This chapter presents the applicability of statistical and probabilistic methods, such as the maximum likelihood method and Bayesian methods, to the representation of timber's mechanical properties and their inference, accounting for prior information obtained at different scales of importance. These methods allow distinct timber reference properties, such as density, bending stiffness and strength, to be analysed, and hierarchically consider information obtained through different non-destructive, semi-destructive or destructive tests. The basis and fundamentals of the methods are described, and recommendations and limitations are also discussed. The methods may be used in several contexts; however, they require expert knowledge to assess the correct statistical fit and to define the correlation arrangement between properties.
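
As a small illustration of the kind of Bayesian updating described, here is a sketch assuming a conjugate normal model for mean timber density with known measurement noise; the prior values and data are invented for the example, not taken from the chapter:

```python
import numpy as np

def update_normal_mean(prior_mu, prior_var, data, sigma):
    """Conjugate normal update for the mean, with known noise std sigma."""
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n / sigma**2)
    post_mu = post_var * (prior_mu / prior_var + np.sum(data) / sigma**2)
    return post_mu, post_var

# Prior from lower-scale / literature information; data from (hypothetical) tests.
mu, var = update_normal_mean(prior_mu=450.0, prior_var=40.0**2,
                             data=np.array([430.0, 465.0, 455.0]), sigma=30.0)
print(f"posterior mean density ~ {mu:.1f} kg/m3, sd {var**0.5:.1f}")
```

The same update can be chained hierarchically: the posterior from one scale or test type becomes the prior for the next.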

Relevance: 100.00%

Abstract:

Specific properties emerge from the structure of large networks, such as that of worldwide air traffic, including a highly hierarchical node structure and multi-level small-world sub-groups that strongly influence future dynamics. We have developed clustering methods to understand the form of these structures, to identify structural properties, and to evaluate their effects. Graph clustering methods are often constructed from different components: a metric, a clustering index, and a modularity measure to assess the quality of a clustering method. To understand the impact of each of these components on the clustering method, we explore and compare different combinations. These combinations are used to compare multilevel clustering methods and to delineate the effects of geographical distance, hubs, network densities, and bridges on worldwide air passenger traffic. The ultimate goal of this methodological research is to demonstrate evidence of combined effects in the development of an air traffic network. In fact, the network can be divided into different levels of "cohesion", which can be qualified and measured by comparative studies (Newman, 2002; Guimera et al., 2005; Sales-Pardo et al., 2007).
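
For a sense of what such a modularity-based component looks like in practice, here is a minimal sketch using networkx's greedy modularity communities; the toy edge list is illustrative and is not the worldwide air-traffic data:

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

G = nx.Graph()
G.add_weighted_edges_from([
    ("CDG", "JFK", 90), ("CDG", "LHR", 120), ("LHR", "JFK", 100),  # hub triangle
    ("CDG", "NCE", 40), ("NCE", "MRS", 15),                        # regional cluster
    ("JFK", "BOS", 35), ("BOS", "PHL", 10),                        # regional cluster
])

# Greedy modularity maximization; the modularity score then grades the partition.
communities = greedy_modularity_communities(G, weight="weight")
print([sorted(c) for c in communities])
print("modularity:", modularity(G, communities, weight="weight"))
```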

Relevance: 100.00%

Abstract:

OBJECTIVES: Mannan-binding lectin (MBL) acts as a pattern-recognition molecule directed against oligomannan, which is part of the cell wall of yeasts and various bacteria. We have previously shown an association between MBL deficiency and anti-Saccharomyces cerevisiae mannan antibody (ASCA) positivity. This study aimed to evaluate whether MBL deficiency is associated with distinct Crohn's disease (CD) phenotypes. METHODS: Serum concentrations of MBL and ASCA were measured using ELISA (enzyme-linked immunosorbent assay) in 427 patients with CD, 70 with ulcerative colitis, and 76 healthy controls. CD phenotypes were grouped according to the Montreal Classification as follows: non-stricturing, non-penetrating (B1, n=182), stricturing (B2, n=113), penetrating (B3, n=67), and perianal disease (p, n=65). MBL was classified as deficient (<100 ng/ml), low (100-500 ng/ml), and normal (>500 ng/ml). RESULTS: Mean MBL was lower in B2 and B3 CD patients (1,503+/-1,358 ng/ml) than in B1 phenotypes (1,909+/-1,392 ng/ml, P=0.013). B2 and B3 patients more frequently had low or deficient MBL and ASCA positivity compared with B1 patients (P=0.004 and P<0.001). Mean MBL was lower in ASCA-positive CD patients (1,562+/-1,319 ng/ml) than in ASCA-negative CD patients (1,871+/-1,320 ng/ml, P=0.038). In multivariate logistic regression modeling, low or deficient MBL was significantly associated with B1 (negative association), complicated disease (B2+B3), and ASCA. MBL levels did not correlate with disease duration. CONCLUSIONS: Low or deficient MBL serum levels are significantly associated with complicated (stricturing and penetrating) CD phenotypes but are negatively associated with the non-stricturing, non-penetrating group. Furthermore, CD patients with low or deficient MBL are significantly more often ASCA positive, possibly reflecting delayed clearance of oligomannan-containing microorganisms by the innate immune system in the absence of MBL.

Relevance: 100.00%

Abstract:

INTRODUCTION: The objective was to investigate the potential implication of IL18 gene promoter polymorphisms in susceptibility to giant-cell arteritis (GCA). METHODS: In total, 212 patients diagnosed with biopsy-proven GCA were included in this study. DNA from patients and matched controls was obtained from peripheral blood. Samples were genotyped for the IL18-137 G>C (rs187238), IL18-607 C>A (rs1946518), and IL18-1297 T>C (rs360719) gene polymorphisms by polymerase chain reaction, using a predesigned TaqMan allele-discrimination assay. RESULTS: No significant association between the IL18-137 G>C polymorphism and GCA was found. However, the IL18-607 allele A was significantly increased in GCA patients compared with controls (47.8% versus 40.9% in patients and controls, respectively; P = 0.02; OR, 1.32; 95% CI, 1.04 to 1.69). This was due to an increased frequency of homozygosity for the IL18-607 A/A genotype in patients with GCA (20.4%) compared with controls (13.4%) (IL18-607 A/A versus IL18-607 A/C plus IL18-607 C/C genotypes: P = 0.04; OR, 1.59; 95% CI, 1.02 to 2.46). Also, the IL18-1297 allele C was significantly increased in GCA patients (30.7%) compared with controls (23.0%) (P = 0.003; OR, 1.48; 95% CI, 1.13 to 1.95). In this regard, an increased susceptibility to GCA was observed in individuals carrying the IL18-1297 C/C or IL18-1297 C/T genotypes compared with those carrying the IL18-1297 T/T genotype (IL18-1297 C/C plus IL18-1297 T/C versus IL18-1297 T/T genotype in GCA patients compared with controls: P = 0.005; OR, 1.61; 95% CI, 1.15 to 2.25). We also found an additive effect of the IL18 -1297 and -607 polymorphisms with the TLR4 Asp299Gly polymorphism. The OR for GCA was 1.95 for combinations of genotypes with one or two risk alleles, whereas carriers of three or more risk alleles had an OR of 3.7. CONCLUSIONS: Our results show for the first time an implication of IL18 gene-promoter polymorphisms in susceptibility to biopsy-proven GCA. In addition, an additive effect between the associated IL18 and TLR4 genetic variants was observed.

Relevance: 100.00%

Abstract:

Photo-mosaicing techniques have become popular for seafloor mapping in various marine science applications. However, the common methods cannot accurately map regions with high relief and topographical variations. Ortho-mosaicing, borrowed from photogrammetry, is an alternative technique that makes it possible to take the 3-D shape of the terrain into account. A serious bottleneck is the volume of elevation information that needs to be estimated from the video data, fused, and processed to generate a composite ortho-photo covering a relatively large seafloor area. We present a framework that combines the advantages of dense depth-map and 3-D feature estimation techniques based on visual motion cues. The main goal is to identify and reconstruct certain key terrain feature points that adequately represent the surface with minimal complexity in the form of piecewise planar patches. The proposed implementation utilizes local depth maps for feature selection, while tracking over several views enables 3-D reconstruction by bundle adjustment. Experimental results with synthetic and real data validate the effectiveness of the proposed approach.
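
A minimal sketch of the feature-selection and tracking step that such a framework builds on, assuming OpenCV's corner detector and pyramidal Lucas-Kanade tracker as stand-ins; the file name seafloor.mp4 and all parameters are hypothetical, not the authors' implementation:

```python
import cv2

cap = cv2.VideoCapture("seafloor.mp4")          # hypothetical survey video
ok, prev = cap.read()
assert ok, "could not read video"
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

# Select strong corner features as candidate terrain key points.
pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                              qualityLevel=0.01, minDistance=7)

tracks = []
while True:
    ok, frame = cap.read()
    if not ok or pts is None or len(pts) == 0:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Pyramidal Lucas-Kanade optical flow tracks points into the new frame.
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    good = status.ravel() == 1
    tracks.append((pts[good], nxt[good]))        # correspondences for bundle adjustment
    pts, prev_gray = nxt[good].reshape(-1, 1, 2), gray
```

The multi-view correspondences collected in tracks are the kind of input a bundle-adjustment stage would consume to recover the 3-D positions of the key terrain points.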

Relevance: 100.00%

Abstract:

This study is part of an ongoing collaborative effort between the medical and signal processing communities to promote research on applying standard Automatic Speech Recognition (ASR) techniques to the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patients can receive early treatment. Effective ASR-based detection could dramatically cut medical testing time. Working with a carefully designed speech database of healthy and apnoea subjects, we describe an acoustic search for distinctive apnoea voice characteristics. We also study abnormal nasalization in OSA patients by modelling vowels in nasal and non-nasal phonetic contexts using Gaussian Mixture Model (GMM) pattern recognition on speech spectra. Finally, we present experimental findings regarding the discriminative power of GMMs applied to severe apnoea detection. We have achieved an 81% correct classification rate, which is very promising and reinforces interest in this line of inquiry.
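
To make the classification scheme concrete, here is a minimal sketch of GMM-based detection, assuming one GMM per class scored by average frame log-likelihood; the random placeholder features stand in for real spectral features (e.g., MFCCs) and are not the study's data or pipeline:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X_healthy = rng.normal(0.0, 1.0, size=(200, 13))   # placeholder feature vectors
X_apnoea = rng.normal(0.5, 1.2, size=(200, 13))

# One GMM per class; classify by the higher average log-likelihood.
gmm_h = GaussianMixture(n_components=8, covariance_type="diag").fit(X_healthy)
gmm_a = GaussianMixture(n_components=8, covariance_type="diag").fit(X_apnoea)

def classify(utterance_frames):
    ll_h = gmm_h.score(utterance_frames)   # mean log-likelihood per frame
    ll_a = gmm_a.score(utterance_frames)
    return "apnoea" if ll_a > ll_h else "healthy"

print(classify(rng.normal(0.5, 1.2, size=(50, 13))))
```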

Relevance: 100.00%

Abstract:

In the first part of this research, three stages were defined for a program to increase the information extracted from ink evidence and maximise its usefulness to the criminal and civil justice system. These stages are (a) to develop a standard methodology for analysing ink samples by high-performance thin-layer chromatography (HPTLC) in a reproducible way when samples are analysed at different times, in different locations and by different examiners; (b) to compare ink samples automatically and objectively; and (c) to define and evaluate a theoretical framework for the use of ink evidence in a forensic context. This report focuses on the second of the three stages. Using the calibration and acquisition process described in the previous report, mathematical algorithms are proposed to compare ink samples automatically and objectively. The performance of these algorithms is systematically studied under various chemical and forensic conditions using standard performance tests commonly employed in biometric studies. The results show that different algorithms are best suited for different tasks. Finally, this report demonstrates how modern analytical and computer technology can be used in the field of ink examination, and how tools developed and successfully applied in other fields of forensic science can help maximise its impact within the field of questioned documents.
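
The report does not specify its algorithms here, but one plausible comparison score of the sort evaluated with biometric-style performance tests is a simple correlation between aligned HPTLC intensity profiles; the following sketch, including the toy densitograms, is an assumption for illustration only:

```python
import numpy as np

def similarity(profile_a: np.ndarray, profile_b: np.ndarray) -> float:
    """Pearson correlation between two aligned, same-length intensity profiles."""
    a = (profile_a - profile_a.mean()) / profile_a.std()
    b = (profile_b - profile_b.mean()) / profile_b.std()
    return float(np.mean(a * b))

# Two synthetic densitograms: the same two-band ink, measured twice with noise.
rf = np.linspace(0, 1, 500)
ink1 = np.exp(-((rf - 0.3) / 0.02) ** 2) + 0.6 * np.exp(-((rf - 0.7) / 0.03) ** 2)
ink2 = ink1 + np.random.default_rng(1).normal(0, 0.02, rf.size)
print(f"score: {similarity(ink1, ink2):.3f}")   # near 1 for same-source samples
```

In a biometric-style evaluation, such scores would be computed for many genuine (same-ink) and impostor (different-ink) pairs, and the two score distributions compared.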

Relevance: 100.00%

Abstract:

BACKGROUND: Solexa/Illumina short-read ultra-high-throughput DNA sequencing technology produces millions of short tags (up to 36 bases) by parallel sequencing-by-synthesis of DNA colonies. The processing and statistical analysis of such high-throughput data pose new challenges; currently, a fair proportion of the tags are routinely discarded because they cannot be matched to a reference sequence, thereby reducing the effective throughput of the technology. RESULTS: We propose a novel base-calling algorithm that uses model-based clustering and probability theory to identify ambiguous bases and code them with IUPAC symbols. We also select optimal sub-tags using a score based on information content to remove uncertain bases towards the ends of the reads. CONCLUSION: We show that the method improves genome coverage and the number of usable tags compared with Solexa's data-processing pipeline by an average of 15%. An R package is provided that allows fast and accurate base calling of Solexa's fluorescence intensity files and the production of informative diagnostic plots.
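
The paper's method (and its R package) rests on model-based clustering of fluorescence intensities; as a language-agnostic illustration of only the IUPAC coding idea, here is a Python sketch, assuming per-base probabilities are already available and using an invented probability-mass threshold:

```python
# IUPAC codes for the ambiguity sets used below; 3-base sets fall back to "N" here.
IUPAC = {frozenset("A"): "A", frozenset("C"): "C",
         frozenset("G"): "G", frozenset("T"): "T",
         frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("GC"): "S",
         frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
         frozenset("ACGT"): "N"}

def call_base(probs: dict, keep: float = 0.9) -> str:
    """Keep the smallest set of bases whose total probability reaches `keep`."""
    chosen, total = set(), 0.0
    for base, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        chosen.add(base)
        total += p
        if total >= keep:
            break
    return IUPAC.get(frozenset(chosen), "N")

print(call_base({"A": 0.96, "C": 0.02, "G": 0.01, "T": 0.01}))  # -> "A"
print(call_base({"A": 0.55, "G": 0.40, "C": 0.03, "T": 0.02}))  # -> "R" (A or G)
```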

Relevance: 100.00%

Abstract:

Dendritic cell (DC) populations consist of multiple subsets that are essential orchestrators of the immune system. Technological limitations have so far prevented accurate systems-wide proteome comparison of rare cell populations in vivo. Here, we used high-resolution mass spectrometry-based proteomics, combined with label-free quantitation algorithms, to determine the proteome of mouse splenic conventional and plasmacytoid DC subsets to a depth of 5,780 and 6,664 proteins, respectively. We found mutually exclusive expression of pattern recognition pathways not previously known to differ among conventional DC subsets. Our experiments showed key viral recognition functions to be exclusively expressed in CD4(+) and double-negative DCs. CD8alpha(+) DCs largely lack the receptors required to sense certain viruses in the cytoplasm. By avoiding activation via cytoplasmic receptors, including retinoic acid-inducible gene I, CD8alpha(+) DCs likely gain a window of opportunity to process and present viral antigens before activation-induced shutdown of antigen presentation pathways occurs.

Relevance: 100.00%

Abstract:

Childhood obesity and physical inactivity are increasing dramatically worldwide. Children of low socioeconomic status and/or of migrant background are especially at risk. In general, the overall effectiveness of school-based programs on health-related outcomes has been disappointing. A particular gap exists for younger children and for high-risk groups. This paper describes the rationale, design, curriculum, and evaluation of a multicenter preschool randomized intervention study conducted in areas with a large migrant population in two of the 26 Swiss cantons. Twenty preschool classes in the German-speaking (canton St. Gallen) and another 20 in the French-speaking (canton Vaud) part of Switzerland were separately selected and randomized to an intervention or a control arm using opaque envelopes. The multidisciplinary lifestyle intervention aimed to increase physical activity and sleep duration, to reinforce healthy nutrition and eating behaviour, and to reduce media use. In line with the ecological model, it included the children, their parents and the teachers. The regular teachers delivered the majority of the intervention and were supported by a local health promoter. The intervention comprised physical activity lessons; adaptation of the built infrastructure; promotion of regional extracurricular physical activity; playful lessons about nutrition, media use and sleep; fun homework cards; and information materials for teachers and parents. It lasted one school year. Baseline and post-intervention evaluations were performed in both arms. Primary outcome measures were BMI and aerobic fitness (20 m shuttle run test). Secondary outcomes included total (skinfolds, bioelectrical impedance) and central (waist circumference) body fat, motor abilities (obstacle course, static and dynamic balance), physical activity and sleep duration (accelerometry and questionnaires), nutritional behaviour and food intake, media use, quality of life and signs of hyperactivity (questionnaires), and attention and spatial working memory (two validated tests). Researchers were blinded to group allocation. The purpose of this paper is to outline the design of a school-based multicenter cluster-randomized controlled trial aiming to reduce body mass index and increase aerobic fitness in preschool children in culturally different parts of Switzerland with a large migrant population. Trial registration: (clinicaltrials.gov) NCT00674544.

Relevance: 100.00%

Abstract:

Recently, kernel-based machine learning methods have gained great popularity in many data analysis and data mining fields: pattern recognition, biocomputing, speech and vision, engineering, remote sensing, etc. This paper describes the use of kernel methods for processing large datasets from environmental monitoring networks. Several typical problems of the environmental sciences, and the solutions provided by kernel-based methods, are considered: classification of categorical data (soil type classification), mapping of continuous environmental and pollution information (soil pollution by radionuclides), and mapping with auxiliary information (climatic data from the Aral Sea region). Promising developments, such as automatic emergency hot-spot detection and monitoring network optimization, are discussed as well.
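
As a minimal illustration of kernel-based classification of categorical environmental data (the soil-type problem mentioned above), here is a sketch using an RBF-kernel support vector classifier on synthetic map coordinates; the data and parameters are placeholders, not the paper's monitoring-network data:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two synthetic "soil type" regions in (x, y) map coordinates.
X0 = rng.normal([2.0, 2.0], 0.8, size=(100, 2))
X1 = rng.normal([5.0, 4.0], 0.8, size=(100, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

# RBF-kernel support vector classifier; gamma controls spatial smoothness.
clf = SVC(kernel="rbf", gamma=0.5, C=10.0).fit(X, y)
print(clf.predict([[2.5, 2.1], [5.2, 3.8]]))   # -> [0 1]
```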

Relevance: 100.00%

Abstract:

In this paper we propose a simple and general model for computing the Ramsey optimal inflation tax, which includes several models from the previous literature as special cases. We show that it cannot be claimed on theoretical grounds that the Friedman rule is always optimal (or always non-optimal). Whether the Friedman rule is optimal depends on conditions related to the shape of various relevant functions. One contribution of this paper is to relate these conditions to measurable variables such as the interest rate or the consumption elasticity of money demand. We find that it tends to be optimal to tax money when there are economies of scale in the demand for money (the scale elasticity is smaller than one) and/or when money is required for the payment of consumption or wage taxes. We also find that it tends to be optimal to tax money more heavily when the interest elasticity of money demand is small. We present empirical evidence on the parameters that determine the optimal inflation tax. Calibrating the model to a variety of empirical studies yields an optimal nominal interest rate of less than 1% per year, although that finding is sensitive to the calibration.
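
For reference, the Friedman rule discussed above can be stated in its standard textbook form (general background, not this paper's notation or model):

```latex
i = r + \pi \quad \text{(Fisher relation)}, \qquad
\text{Friedman rule:}\; i = 0 \;\Longleftrightarrow\; \pi = -r ,
```

i.e., the nominal interest rate $i$ is driven to zero by deflating at the real rate of interest $r$, so the opportunity cost of holding money vanishes; the paper characterizes the conditions under which the Ramsey-optimal $i$ is instead positive (a positive inflation tax).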

Relevance: 100.00%

Abstract:

Given $n$ independent replicates of a jointly distributed pair $(X,Y)\in {\cal R}^d \times {\cal R}$, we wish to select from a fixed sequence of model classes ${\cal F}_1, {\cal F}_2, \ldots$ a deterministic prediction rule $f: {\cal R}^d \to {\cal R}$ whose risk is small. We investigate the possibility of empirically assessing the complexity of each model class, that is, the actual difficulty of the estimation problem within each class. The estimated complexities are in turn used to define an adaptive model selection procedure, which is based on complexity-penalized empirical risk. The available data are divided into two parts. The first is used to form an empirical cover of each model class, and the second is used to select a candidate rule from each cover based on empirical risk. The covering radii are determined empirically to optimize a tight upper bound on the estimation error. An estimate is chosen from the list of candidates in order to minimize the sum of class complexity and empirical risk. A distinguishing feature of the approach is that the complexity of each model class is assessed empirically, based on the size of its empirical cover. Finite-sample performance bounds are established for the estimates, and these bounds are applied to several non-parametric estimation problems. The estimates are shown to achieve a favorable tradeoff between approximation and estimation error, and to perform as well as if the distribution-dependent complexities of the model classes were known beforehand. In addition, it is shown that the estimate can be consistent, and even possess near-optimal rates of convergence, when each model class has infinite VC or pseudo-dimension. For regression estimation with squared loss we modify our estimate to achieve a faster rate of convergence.
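
As a toy illustration of complexity-penalized selection over a sequence of model classes (not the paper's empirical-cover construction; the polynomial classes, penalty form, and data split are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = np.sin(2 * x) + rng.normal(0, 0.2, x.size)

# Split: the first half fits one candidate per class, the second half scores them.
x_fit, y_fit, x_val, y_val = x[:100], y[:100], x[100:], y[100:]

best, best_score = None, np.inf
for degree in range(1, 10):                    # model classes F_1, F_2, ...
    coef = np.polyfit(x_fit, y_fit, degree)    # candidate rule from this class
    risk = np.mean((np.polyval(coef, x_val) - y_val) ** 2)
    penalty = degree / len(x_val)              # crude stand-in for class complexity
    score = risk + penalty                     # complexity-penalized empirical risk
    if score < best_score:
        best, best_score = (degree, coef), score

print("selected degree:", best[0])
```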