1000 results for Data Mining


Relevance: 20.00%

Abstract:

Data mining can be defined as the extraction of previously unknown and potentially useful information from large datasets. The main principle is to devise computer programs that run through databases and automatically seek deterministic patterns. It is applied in many fields, e.g., remote sensing, biometry, and speech recognition, but has seldom been applied to forensic case data. The intrinsic difficulty in using such data lies in its heterogeneity, which comes from the many different sources of information. The aim of this study is to highlight potential uses of pattern recognition that would provide relevant results from a criminal intelligence point of view. The role of data mining within a global crime analysis methodology is to detect all types of structures in a dataset. Once filtered and interpreted, those structures can point to previously unseen criminal activities. The interpretation of patterns for intelligence purposes is the final stage of the process. It allows the researcher to validate the whole methodology and to refine each step if necessary. An application to cutting agents found in illicit drug seizures was performed. A combinatorial approach was taken, using the presence and absence of products. Methods from graph theory were used to extract patterns in data constituted by links between products and the place and date of seizure. A data mining process completed using graphing techniques is called "graph mining". Patterns were detected that had to be interpreted and compared with preliminary knowledge to establish their relevance. The illicit drug profiling process is actually an intelligence process that uses preliminary illicit drug classes to classify new samples. Methods proposed in this study could be used a priori to compare structures from preliminary and post-detection patterns. This new knowledge of a repeated structure may provide valuable complementary information to profiling and become a source of intelligence.
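
As a rough illustration of the presence/absence idea in the abstract above, the sketch below builds a co-occurrence graph of cutting agents and keeps the pairs that recur across seizures. It is not the authors' method: the records, the agent and place names, the helper names `cooccurrence_graph` and `recurring_pairs`, and the `min_count` threshold are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical seizure records: (place, date, set of cutting agents present).
# All names and values are placeholders, not data from the study.
seizures = [
    ("place_1", "date_1", {"agent_a", "agent_b"}),
    ("place_2", "date_2", {"agent_a", "agent_b"}),
    ("place_1", "date_3", {"agent_c"}),
    ("place_3", "date_3", {"agent_a", "agent_b", "agent_c"}),
]


def cooccurrence_graph(records):
    """Return a weighted graph as {(agent_x, agent_y): count}.

    Nodes are cutting agents; an edge's weight is the number of
    seizures in which both agents are present.
    """
    edges = defaultdict(int)
    for _place, _date, agents in records:
        # Sort so each unordered pair always maps to the same key.
        for x, y in combinations(sorted(agents), 2):
            edges[(x, y)] += 1
    return dict(edges)


def recurring_pairs(edges, min_count=2):
    """Keep only edges seen in at least `min_count` seizures (assumed cutoff)."""
    return {pair: n for pair, n in edges.items() if n >= min_count}


graph = cooccurrence_graph(seizures)
patterns = recurring_pairs(graph)
print(patterns)
```

Any pair that survives the cutoff is only a candidate pattern; as the abstract notes, it still has to be interpreted against prior knowledge before it counts as intelligence.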

Relevance: 20.00%

Abstract:

Since 1999, data from pulmonary hypertension (PH) patients from all PH centres in Switzerland have been prospectively collected. We analyse the epidemiological aspects of these data. PH was defined as a mean pulmonary artery pressure of >25 mm Hg at rest or >30 mm Hg during exercise. Patients with pulmonary arterial hypertension (PAH), PH associated with lung diseases, PH due to chronic thrombotic and/or embolic disease (CTEPH), or PH due to miscellaneous disorders were registered. Data from adult patients included between January 1999 and December 2004 were analysed. 250 patients were registered (age 58 ± 16 years, 104 (41%) males). 152 patients (61%) had PAH, 73 (29%) had CTEPH and 18 (7%) had PH associated with lung disease. Patients <50 years (32%) were more likely to have PAH than patients >50 years (76% vs. 53%, p <0.005). Twenty-four patients (10%) were lost to follow-up, 58 patients (26%) died and 150 (66%) survived without transplantation or thrombendarterectomy. Survivors differed from patients who died in the baseline six-minute walking distance (400 m [300-459] vs. 273 m [174-415]), the functional impairment (NYHA class III/IV 86% vs. 98%), mixed venous saturation (63% [57-68] vs. 56% [50-61]) and right atrial pressure (7 mm Hg [4-11] vs. 11 mm Hg [4-18]). PH is a disease affecting adults of all ages. The management of these patients in specialised centres guarantees a high quality of care. Analysis of the registry data could be an instrument for quality control and might help identify weak points in assessment and treatment of these patients.

Relevance: 20.00%

Abstract:

We present a comparative analysis of satellite-derived climatologies in the Cape Verde region (CV). To establish chlorophyll a variability in relation to other oceanographic phenomena, a set of relatively long (five to eight years) time series of chlorophyll a, sea surface temperature, wind and geostrophic currents was assembled for the Eastern Central Atlantic (ECA). We studied the seasonal and inter-annual variability of phytoplankton concentration in relation to the other variables, with a special focus on CV. We compared the situation within the archipelago with that of the surrounding marine environments, such as the North West African Upwelling (NWAU), North Atlantic Subtropical Gyre (NASTG), North Equatorial Counter Current (NECC) and Guinea Dome (GD). At the seasonal scale, the CV region behaves partly like the surrounding areas; nevertheless, some autochthonous features were also found. The main pigment peak, which correlates positively with temperature, occurs at the end of the year at all points in the archipelago; a less pronounced rise, with negative correlation, is also detected in February at points CV2 and CV4. None of the surrounding environments showed this behavior. This enrichment was found to be preceded by a drastic drop in wind intensity (SW Monsoon) during the summer months. The inter-annual analysis shows a decreasing trend in chlorophyll a concentration.

Relevance: 20.00%

Abstract:

One of the disadvantages of old age is that there is more past than future: this, however, may be turned into an advantage if the wealth of experience and, hopefully, wisdom gained in the past can be reflected upon and throw some light on possible future trends. To an extent, then, this talk is necessarily personal, certainly nostalgic, but also self-critical and inquisitive about our understanding of the discipline of statistics. A number of almost philosophical themes will run through the talk: search for appropriate modelling in relation to the real problem envisaged, emphasis on sensible balances between simplicity and complexity, the relative roles of theory and practice, the nature of communication of inferential ideas to the statistical layman, the inter-related roles of teaching, consultation and research. A list of keywords might be: identification of sample space and its mathematical structure, choices between transform and stay, the role of parametric modelling, the role of a sample space metric, the underused hypothesis lattice, the nature of compositional change, particularly in relation to the modelling of processes. While the main theme will be relevance to compositional data analysis we shall point to substantial implications for general multivariate analysis arising from experience of the development of compositional data analysis…

Relevance: 20.00%

Abstract:

Modern methods of compositional data analysis are not well known in biomedical research. Moreover, there appear to be few mathematical and statistical researchers working on compositional biomedical problems. Like the earth and environmental sciences, biomedicine has many problems in which the relevant scientific information is encoded in the relative abundance of key species or categories. I introduce three problems in cancer research in which analysis of compositions plays an important role. The problems involve 1) the classification of serum proteomic profiles for early detection of lung cancer, 2) inference of the relative amounts of different tissue types in a diagnostic tumor biopsy, and 3) the subcellular localization of the BRCA1 protein, and its role in breast cancer patient prognosis. For each of these problems I outline a partial solution. However, none of these problems is "solved". I attempt to identify areas in which additional statistical development is needed with the hope of encouraging more compositional data analysts to become involved in biomedical research.

Relevance: 20.00%

Abstract:

The aim of this talk is to convince the reader that there are a lot of interesting statistical problems in present-day life science data analysis which seem ultimately connected with compositional statistics.

Key words: SAGE, cDNA microarrays, (1D-)NMR, virus quasispecies

Relevance: 20.00%

Abstract:

OBJECT: To study a scan protocol for coronary magnetic resonance angiography based on multiple breath-holds featuring 1D motion compensation and to compare the resulting image quality to a navigator-gated free-breathing acquisition. Image reconstruction was performed using L1 regularized iterative SENSE. MATERIALS AND METHODS: The effects of respiratory motion on the Cartesian sampling scheme were minimized by performing data acquisition in multiple breath-holds. During the scan, repetitive readouts through a k-space center were used to detect and correct the respiratory displacement of the heart by exploiting the self-navigation principle in image reconstruction. In vivo experiments were performed in nine healthy volunteers and the resulting image quality was compared to a navigator-gated reference in terms of vessel length and sharpness. RESULTS: Acquisition in breath-hold is an effective method to reduce the scan time by more than 30 % compared to the navigator-gated reference. Although an equivalent mean image quality with respect to the reference was achieved with the proposed method, the 1D motion compensation did not work equally well in all cases. CONCLUSION: In general, the image quality scaled with the robustness of the motion compensation. Nevertheless, the featured setup provides a positive basis for future extension with more advanced motion compensation methods.

Relevance: 20.00%

Abstract:

Secondary accident statistics can be useful for studying the impact of traffic incident management strategies. An easy-to-implement methodology is presented for classifying secondary accidents using data fusion of a police accident database with intranet incident reports. A current method for classifying secondary accidents uses a static threshold that represents the spatial and temporal region of influence of the primary accident, such as two miles and one hour. An accident is considered secondary if it occurs upstream from the primary accident and is within the duration and queue of the primary accident. However, using the static threshold may result in both false positives and negatives because accident queues are constantly varying. The methodology presented in this report seeks to improve upon this existing method by making the threshold dynamic. An incident progression curve is used to mark the end of the queue throughout the entire incident. Four steps in the development of incident progression curves are described. Step one is the processing of intranet incident reports. Step two is the filling in of incomplete incident reports. Step three is the nonlinear regression of incident progression curves. Step four is the merging of individual incident progression curves into one master curve. To illustrate this methodology, 5,514 accidents from Missouri freeways were analyzed. The results show that secondary accidents identified by dynamic versus static thresholds can differ by more than 30%.
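
The difference between the static and dynamic thresholds described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the report's implementation: the `Accident` type, both function names, the milepost convention (mileposts increase in the direction of travel, so upstream means a lower milepost on the same roadway and direction), and the piecewise-linear `example_curve` are all hypothetical. The report fits its incident progression curves by nonlinear regression, which is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Accident:
    time_h: float       # hours since an arbitrary reference time
    position_mi: float  # milepost; assumed to increase in the direction of travel


def is_secondary_static(primary, candidate, max_miles=2.0, max_hours=1.0):
    """Static threshold: fixed region of influence (e.g. two miles, one hour)."""
    dt = candidate.time_h - primary.time_h
    upstream = primary.position_mi - candidate.position_mi
    return 0 < dt <= max_hours and 0 <= upstream <= max_miles


def is_secondary_dynamic(primary, candidate,
                         queue_length_mi: Callable[[float], float],
                         duration_h: float):
    """Dynamic threshold: the queue end follows an incident progression curve.

    `queue_length_mi(t)` gives the queue length t hours after the primary
    accident; here it is a caller-supplied stand-in for a fitted curve.
    """
    dt = candidate.time_h - primary.time_h
    upstream = primary.position_mi - candidate.position_mi
    return 0 < dt <= duration_h and 0 <= upstream <= queue_length_mi(dt)


def example_curve(t):
    """Hypothetical curve: queue grows to 3 mi at 0.5 h, clears by 1.5 h."""
    if t <= 0.5:
        return 6.0 * t
    return max(0.0, 3.0 - 3.0 * (t - 0.5))


primary = Accident(time_h=0.0, position_mi=10.0)
inside_long_queue = Accident(time_h=0.5, position_mi=7.5)   # 2.5 mi upstream
behind_short_queue = Accident(time_h=1.0, position_mi=8.0)  # 2.0 mi upstream

for cand in (inside_long_queue, behind_short_queue):
    print(is_secondary_static(primary, cand),
          is_secondary_dynamic(primary, cand, example_curve, duration_h=1.5))
```

With these made-up values, the first candidate is missed by the static rule but falls inside the queue under the dynamic rule, while the second is flagged by the static rule although the assumed queue has already receded past it. These are the false negatives and false positives the abstract attributes to a fixed threshold.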

Relevance: 20.00%

Abstract:

A report produced by the Department of Natural Resources on the historical pattern the rivers take.