977 results for statistical techniques
Abstract:
In this work, 50 ceramic fragments from the Lago Grande archaeological site and 30 from the Osvaldo archaeological site were compared to assess elemental similarities. The aim is to perform a preliminary comparison between the sites, which are located in the central Amazon, Brazil. The analytical technique employed to obtain the elemental composition of the ceramics was instrumental neutron activation analysis (INAA). The data set obtained was explored with the multivariate statistical techniques of cluster, principal component and discriminant analysis. The analyzed elements were: Na, Lu, U, Yb, La, Th, Cr, Cs, Sc, Fe, Eu, Ce and Hf. The results showed the existence of at least two compositional groups for Lago Grande and Osvaldo. Each compositional group of the Osvaldo archaeological site matches one group of Lago Grande. Correlated with the archaeological background, the results suggest commercial or cultural exchange in the region, which is indicative of socio-cultural interactions between those sites.
Abstract:
Fraud is a global problem that has required more attention due to the accelerated expansion of modern technology and communication. When statistical techniques are used to detect fraud, a critical factor is whether the fraud detection model is accurate enough to classify each case correctly as fraudulent or legitimate. In this context, the concept of bootstrap aggregating (bagging) arises. The basic idea is to generate multiple classifiers by fitting the model to several replicated datasets, obtaining the predicted values from each fitted model, and then combining them into a single predictive classification in order to improve classification accuracy. In this paper we present, for the first time, a study of the performance of discrete and continuous k-dependence probabilistic networks within the context of bagging predictors classification. Via a large simulation study and various real datasets, we found that the probabilistic networks are a strong modeling option with high predictive capacity, and that they gain considerably from the bagging procedure when compared to traditional techniques. (C) 2012 Elsevier Ltd. All rights reserved.
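The bagging idea described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the base classifier is a one-feature decision stump swapped in for the k-dependence probabilistic networks studied in the paper, and the function names (`fit_stump`, `bagging_fit`, `bagging_predict`) and the toy data are hypothetical.

```python
import random
from collections import Counter

def fit_stump(rows):
    """Fit a one-feature threshold classifier (decision stump) on (x, label) rows."""
    best = None
    for threshold in sorted({x for x, _ in rows}):
        for above in (0, 1):  # label predicted when x > threshold
            errors = sum((above if x > threshold else 1 - above) != y
                         for x, y in rows)
            if best is None or errors < best[0]:
                best = (errors, threshold, above)
    _, threshold, above = best
    return lambda x: above if x > threshold else 1 - above

def bagging_fit(rows, n_models=25, seed=0):
    """Fit one stump per bootstrap replicate (sampling rows with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        replicate = [rng.choice(rows) for _ in rows]
        models.append(fit_stump(replicate))
    return models

def bagging_predict(models, x):
    """Combine the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (transaction amount, 1 = fraudulent, 0 = legitimate).
toy = [(12, 0), (18, 0), (25, 0), (31, 0), (35, 0), (40, 0),
       (220, 1), (260, 1), (310, 1), (400, 1)]
models = bagging_fit(toy)
print(bagging_predict(models, 15), bagging_predict(models, 350))
```

Each replicate has the same size as the original dataset, so individual stumps differ only through the resampling; the vote smooths out those differences.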
Abstract:
This article is part of a broad study evaluating the appropriateness of the use of multivariate statistical techniques in theses and dissertations from two higher education institutions in the field of marketing, on the topic of consumer behavior, between 1997 and 2006. The article focuses on eleven multivariate techniques (regression analysis, discriminant analysis, logistic regression analysis, canonical correlation, multivariate analysis of variance, conjoint analysis, structural equation modeling, factor analysis, cluster analysis, correspondence analysis, multidimensional scaling), which have shown great potential for use in marketing studies. The objective of the work reported was to analyze how well the use of these techniques fits the needs of the research problems presented in the theses and dissertations and also to gauge how adequately their assumptions were met. Overall, the results suggest that researchers need to be more committed to verifying all the theoretical precepts for applying multivariate techniques.
Abstract:
This study aimed to verify the impact of inhalable particulate matter (PM10) on cancer incidence and mortality in the city of São Paulo, Brazil. Statistical techniques were used to investigate the relationship between PM10 and cancer incidence and mortality in selected districts. For some types of cancer (skin, lung, thyroid, larynx, and bladder) and some periods, the correlation coefficients ranged from 0.60 to 0.80 for incidence. Lung cancer mortality showed more correlations during the overall period. Spatial analysis showed that districts distant from the city center had a higher than expected relative risk, depending on the type of cancer. According to the study, urban PM10 can contribute to increased incidence of some cancers and may also contribute to increased cancer mortality. The results highlight the need to adopt measures to reduce atmospheric PM10 levels and the importance of their continuous monitoring.
Abstract:
Doctorate in Geography and Spatial Planning
Abstract:
This thesis is focused on the metabolomic study of human cancer tissues by ex vivo High Resolution-Magic Angle Spinning (HR-MAS) nuclear magnetic resonance (NMR) spectroscopy. This new technique allows spectra to be acquired directly on intact tissues (from biopsy or surgery), and it has become very important for integrated metabonomics studies. The objective is to identify metabolites that can be used as markers for discriminating between the different types of cancer, for grading, and for assessing the evolution of the tumour. Furthermore, an attempt was made to recognize metabolites that, although involved in the metabolism of tumoral tissues at low concentration, can be important modulators of neoplastic proliferation. In addition, NMR data were integrated with statistical techniques in order to obtain semi-quantitative information about the metabolite markers. In the case of gliomas, the NMR study was correlated with gene expression of neoplastic tissues. Chapter 1 begins with a general description of a new “omics” discipline, metabolomics. The study of metabolism can contribute significantly to biomedical research and, ultimately, to clinical medical practice. This rapidly developing discipline involves the study of the metabolome: the total repertoire of small molecules present in cells, tissues, organs, and biological fluids. Metabolomic approaches are becoming increasingly popular in disease diagnosis and will play an important role in improving our understanding of cancer mechanisms. Chapter 2 addresses in more detail the basis of NMR spectroscopy, presenting the new HR-MAS NMR tool, which is gaining importance in the examination of tumour tissues and in the assessment of tumour grade. Some advanced chemometric methods were used in an attempt to enhance the interpretation of, and the quantitative information obtained from, the HR-MAS NMR data; they are presented in Chapter 3.
Chemometric methods seem to have a high potential in the study of human diseases, as they permit the extraction of new and relevant information from spectroscopic data, allowing a better interpretation of the results. Chapter 4 reports results obtained from HR-MAS NMR analyses performed on different brain tumours: medulloblastoma, meningiomas and gliomas. The medulloblastoma study is a case report of a primitive neuroectodermal tumour (PNET) localised in the cerebellar region by Magnetic Resonance Imaging (MRI) in a 3-year-old child. In vivo single-voxel 1H MRS shows high specificity in detecting the main metabolic alterations in the primitive cerebellar lesion, which consist of very high amounts of choline-containing compounds and very low levels of creatine derivatives and N-acetylaspartate. Ex vivo HR-MAS NMR, performed at 9.4 Tesla on the neoplastic specimen collected during surgery, allows the unambiguous identification of several metabolites, giving a more in-depth evaluation of the metabolic pattern of the lesion. The ex vivo HR-MAS NMR spectra show higher detail than those obtained in vivo. In addition, the spectroscopic data appear to correlate with some morphological features of the medulloblastoma. The present study shows that ex vivo HR-MAS 1H NMR can strongly improve the clinical potential of in vivo MRS and can be used in conjunction with in vivo spectroscopy for clinical purposes. Three histological subtypes of meningiomas (meningothelial, fibrous and oncocytic) were analysed by both in vivo and ex vivo MRS experiments. The ex vivo HR-MAS investigations are very helpful for assigning the in vivo resonances of human meningiomas and for validating the quantification procedure of in vivo MR spectra. By using one- and two-dimensional experiments, several metabolites were identified in different histological subtypes of meningiomas.
The spectroscopic data confirmed the presence of the typical metabolites of these benign neoplasms and, at the same time, showed that meningiomas with different morphological characteristics have different metabolic profiles, particularly regarding macromolecules and lipids. The profile of total choline metabolites (tCho) and the expression of the Kennedy pathway genes in biopsies of human gliomas were also investigated using HR-MAS NMR and microfluidic genomic cards. 1H HR-MAS spectra allowed the resolution and relative quantification by LCModel of the resonances from choline (Cho), phosphorylcholine (PC) and glycerophosphorylcholine (GPC), the three main components of the combined tCho peak observed in gliomas by in vivo 1H MRS. All glioma biopsies showed an increase in tCho, calculated as the sum of the Cho, PC and GPC HR-MAS resonances. However, the increase consistently derived from augmented GPC in low-grade gliomas and from increased PC content in high-grade gliomas. This circumstance allowed the unambiguous discrimination of high- and low-grade gliomas by 1H HR-MAS, which could not be achieved by calculating the tCho/Cr ratio commonly used in in vivo 1H MR spectroscopy. The expression of the genes involved in choline metabolism was investigated in the same biopsies. The present findings offer a convenient procedure for accurately classifying glioma grade using 1H HR-MAS, and in addition provide the genetic background for the alterations of choline metabolism observed in high- and low-grade gliomas. Chapter 5 reports the study on human gastrointestinal tract (stomach and colon) neoplasms. Healthy human gastric mucosa, and the characteristics of the biochemical profile of human gastric adenocarcinoma in comparison with that of healthy gastric mucosa, were analyzed using ex vivo HR-MAS NMR. Healthy human mucosa is mainly characterized by the presence of small metabolites (more than 50 identified) and macromolecules.
The adenocarcinoma spectra were dominated by signals due to triglycerides, which are usually very low in healthy gastric mucosa. The use of spin-echo experiments enabled us to detect some metabolites in the unhealthy tissues and to determine their variation with respect to the healthy ones. The ex vivo HR-MAS NMR analysis was then applied to human gastric tissue to obtain information on the molecular steps involved in gastric carcinogenesis. A microscopic investigation was also carried out in order to identify and locate the lipids in the cellular and extracellular environments. Correlations were obtained between the morphological changes detected by transmission (TEM) and scanning (SEM) electron microscopy and the metabolic profile of gastric mucosa in healthy subjects and in subjects with autoimmune atrophic gastritis (AAG), Helicobacter pylori-related gastritis and adenocarcinoma. These ultrastructural studies of AAG and gastric adenocarcinoma revealed intra- and extracellular lipid accumulation associated with severe prenecrotic hypoxia and mitochondrial degeneration. A deep insight into the metabolic profile of healthy and neoplastic human colon tissues was gained using ex vivo HR-MAS NMR spectroscopy in combination with multivariate methods: Principal Component Analysis (PCA) and Partial Least Squares Discriminant Analysis (PLS-DA). The NMR spectra of healthy tissues highlight different metabolic profiles with respect to those of neoplastic and microscopically normal colon specimens (the latter obtained at least 15 cm from the adenocarcinoma). Furthermore, metabolic variations are detected not only among neoplastic tissues with different histological diagnoses, but also among those classified as identical by histological analysis. These findings suggest that the same subclass of colon carcinoma is characterized, to a certain degree, by metabolic heterogeneity.
The multivariate statistical approach applied to the NMR data is crucial for finding metabolic markers of the neoplastic state of colon tissues and for correctly classifying the samples. Significantly different levels of choline-containing compounds, taurine and myoinositol were observed. Chapter 6 deals with the metabolic profile of normal and tumoral human renal tissues obtained by ex vivo HR-MAS NMR. The spectra of normal human cortex and medulla show the presence of differently distributed osmolytes as markers of physiological renal condition. The marked decrease or disappearance of these metabolites and the high lipid content (triglycerides and cholesteryl esters) are typical of clear cell renal carcinoma (RCC), while papillary RCC is characterized by the absence of lipids and very high amounts of taurine. This research is a contribution to the biochemical classification of renal neoplastic pathologies, especially RCCs, which can be evaluated by in vivo MRS for clinical purposes. Moreover, these data help to gain a better knowledge of the molecular processes involved in the onset of renal carcinogenesis.
Abstract:
The aim of this work was to develop computer-aided methods for producing a hazard indication map for the Rheinhessen region, in order to minimize the landslide hazard. To this end, two statistical methods (discriminant analysis, logistic regression) and one method from the field of artificial intelligence (fuzzy logic) were used in an attempt to classify the potential hazard of slopes, including those that have not so far shown signs of mass movements. Because engineering-geological and geotechnical slope investigations are not feasible at the regional scale for reasons of time and cost, the work drew on point data available for individual landslides of the winter of 1981/82, which are compiled in a landslide database; the insights gained from them about process mechanisms and triggering factors were used and integrated into the respective model. Areal data (lithology, slope angle, land use, etc.) needed to compute slope stability were obtained by remote sensing methods, the digitization of maps and the evaluation of digital terrain models (relief analysis). For a more detailed investigation of individual areas of the hazard indication map classified as landslide-prone, a method building on the infinite-slope-stability model was examined using a test area as an example; at the scale of base maps (1:5000) it also takes geotechnical and hydrogeological parameters into account and thus permits a more precise hazard assessment adapted to the prevailing climatic situation.
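For readers unfamiliar with the infinite-slope-stability model mentioned in this abstract, the following Python sketch computes the textbook factor of safety for a planar slope with seepage parallel to the surface. It is an assumption that the thesis uses this standard form; the function name, parameter names and default unit weights are hypothetical and chosen only for illustration.

```python
import math

def infinite_slope_fs(cohesion_kpa, friction_angle_deg, slope_deg, depth_m,
                      unit_weight_kn_m3=19.0, saturation_ratio=0.0,
                      water_unit_weight_kn_m3=9.81):
    """Factor of safety of an infinite slope (values below 1 indicate instability).

    saturation_ratio is the fraction of the soil column (0 to 1) below the
    water table; slope_deg must be greater than zero.
    """
    beta = math.radians(slope_deg)
    phi = math.radians(friction_angle_deg)
    # Normal stress and pore water pressure on the slip plane (kPa).
    normal_stress = unit_weight_kn_m3 * depth_m * math.cos(beta) ** 2
    pore_pressure = (water_unit_weight_kn_m3 * saturation_ratio
                     * depth_m * math.cos(beta) ** 2)
    # Driving shear stress along the slip plane (kPa).
    shear_stress = unit_weight_kn_m3 * depth_m * math.sin(beta) * math.cos(beta)
    resisting = cohesion_kpa + (normal_stress - pore_pressure) * math.tan(phi)
    return resisting / shear_stress

# Hypothetical soil: the same slope evaluated dry and fully saturated.
dry = infinite_slope_fs(5.0, 28.0, 20.0, 2.0, saturation_ratio=0.0)
wet = infinite_slope_fs(5.0, 28.0, 20.0, 2.0, saturation_ratio=1.0)
print(round(dry, 2), round(wet, 2))
```

Varying `saturation_ratio` is one simple way to reflect the changing climatic situation that the abstract refers to: a wetter soil column raises the pore pressure and lowers the factor of safety.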
Abstract:
This Thesis focuses on the X-ray study of the inner regions of Active Galactic Nuclei, in particular on the formation of high-velocity winds by the accretion disk itself. Constraining the physical parameters of AGN winds is of paramount importance both for understanding the physics of the accretion/ejection flow onto supermassive black holes, and for quantifying the amount of feedback between the SMBH and its environment across cosmic time. The sources selected for the present study are BAL, mini-BAL, and NAL QSOs, known to host high-velocity winds associated with the AGN nuclear regions. Observationally, a three-fold strategy has been adopted: - substantial samples of distant sources have been analyzed through spectral, photometric, and statistical techniques, to gain insights into their mean properties as a population; - a moderately sized sample of bright sources has been studied through detailed X-ray spectral analysis, to give a first flavor of the general spectral properties of these sources, also from a temporally resolved point of view; - the best nearby candidate has been thoroughly studied using the most sophisticated spectral analysis techniques applied to a large dataset with a high S/N ratio, to understand the details of the physics of its accretion/ejection flow. There are three main channels through which this Thesis has been developed: - [Archival Studies]: the XMM-Newton public archival data have been extensively used to analyze both a large sample of distant BAL QSOs, and several individual bright sources, either BAL, mini-BAL, or NAL QSOs. - [New Observational Campaign]: I proposed and was awarded new X-ray pointings of the mini-BAL QSOs PG 1126-041 and PG 1351+640 during the XMM-Newton AO-7 and AO-8. These produced the biggest X-ray observational campaign ever made on a mini-BAL QSO (PG 1126-041), including the longest exposure so far.
Thanks to this exceptional dataset, a wealth of information has been obtained on both the intrinsic continuum and the complex reprocessing media in the inner regions of this AGN. Furthermore, the field of temporally resolved X-ray spectral analysis has finally been opened for mini-BAL QSOs. - [Theoretical Studies]: some issues about the connection between theories and observations of AGN accretion disk winds have been investigated, through theoretical arguments and studies of synthetic absorption line profiles.
Abstract:
This thesis describes several studies on the development of physical analysis methods coupled with multivariate statistical techniques to assess the quality and authenticity of vegetable oils and dairy products. The use of physical instruments reduces the cost and time required by classical analyses and at the same time can provide a different set of information, which may concern both the quality and the authenticity of products. For such methods to work well, robust statistical models must be built using data sets that are correctly collected and representative of the field of application. In this thesis work, vegetable oils and some types of cheese were analyzed (in particular pecorino cheeses for two research studies and Parmigiano-Reggiano for another). Several analytical instruments (physical methods) were used, in particular spectroscopy, differential thermal analysis and the electronic nose, in addition to traditional separation methods. The data obtained from the analyses were processed with various statistical techniques, chiefly partial least squares, multiple linear regression and linear discriminant analysis.
Abstract:
Information is nowadays a key resource: machine learning and data mining techniques have been developed to extract high-level information from great amounts of data. As most data comes in the form of unstructured text in natural languages, research on text mining is currently very active and deals with practical problems. Among these, text categorization deals with the automatic organization of large quantities of documents into predefined taxonomies of topic categories, possibly arranged in large hierarchies. In commonly proposed machine learning approaches, classifiers are automatically trained from pre-labeled documents: they can perform very accurate classification, but often require a substantial training set and notable computational effort. Methods for cross-domain text categorization have been proposed, which make it possible to leverage a set of labeled documents from one domain to classify those of another. Most methods use advanced statistical techniques, usually involving parameter tuning. A first contribution presented here is a method based on nearest centroid classification, where profiles of categories are generated from the known domain and then iteratively adapted to the unknown one. Despite being conceptually simple and having easily tuned parameters, this method achieves state-of-the-art accuracy on most benchmark datasets with fast running times. A second, deeper contribution involves the design of a domain-independent model to distinguish the degree and type of relatedness between arbitrary documents and topics, inferred from the different types of semantic relationships between their respective representative words, identified by specific search algorithms. The application of this model is tested on both flat and hierarchical text categorization, where it potentially allows the efficient addition of new categories during classification.
Results show that classification accuracy still requires improvement, but that models generated from one domain can be effectively reused in a different one.
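The first contribution of this abstract, nearest centroid classification with category profiles that are iteratively adapted to the unknown domain, can be illustrated with a minimal Python sketch. The abstract does not specify the weighting scheme, similarity measure or stopping rule, so raw term counts, cosine similarity and "stop when the labels no longer change" are assumptions; all function names and the toy documents are hypothetical.

```python
import math
from collections import Counter, defaultdict

def vectorize(text):
    """Bag-of-words term counts for a whitespace-tokenized document."""
    return Counter(text.lower().split())

def centroid(vectors):
    """Summed term counts; cosine similarity ignores the overall scale."""
    total = Counter()
    for vector in vectors:
        total.update(vector)
    return total

def cosine(a, b):
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(doc, centroids):
    # Ties are broken by alphabetical order of the category label.
    return max(sorted(centroids), key=lambda label: cosine(doc, centroids[label]))

def adapt(source_docs, target_docs, rounds=5):
    """Build profiles on the labeled source domain, then refit them on the target."""
    by_label = defaultdict(list)
    for text, label in source_docs:
        by_label[label].append(vectorize(text))
    centroids = {label: centroid(vs) for label, vs in by_label.items()}
    targets = [vectorize(text) for text in target_docs]
    labels = None
    for _ in range(rounds):
        new_labels = [classify(vector, centroids) for vector in targets]
        if new_labels == labels:  # profiles have stabilized
            break
        labels = new_labels
        regrouped = defaultdict(list)
        for vector, label in zip(targets, labels):
            regrouped[label].append(vector)
        # Keep the old profile for a category that received no target documents.
        centroids = {label: centroid(regrouped[label]) if regrouped[label]
                     else centroids[label] for label in centroids}
    return labels

source = [("football match goal team", "sport"),
          ("election vote parliament minister", "politics")]
target = ["team wins cup final", "minister announces budget cuts",
          "cup final replay", "budget cuts debate"]
print(adapt(source, target))  # ['sport', 'politics', 'sport', 'politics']
```

In this toy run the third and fourth documents share no words with the source domain, so they can only be assigned sensibly once the profiles have absorbed target-domain vocabulary such as "cup" and "budget".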
Abstract:
There is a consensus in China that industrialization, urbanization, globalization and information technology will enhance China's urban competitiveness. We have developed a methodology for the analysis of urban competitiveness that we have applied to China's 25 principal cities during three periods from 1990 through 2009. Our model uses data for 12 variables, to which we apply appropriate statistical techniques. We are able to examine the competitiveness of inland cities and those on the coast, how this has changed during the two decades of the study, the competitiveness of Mega Cities and of administrative centres, and the importance of each variable in explaining urban competitiveness and its development over time. This analysis will be of benefit to Chinese planners as they seek to enhance the competitiveness of China and its major cities in the future.
Abstract:
Regional flood frequency techniques are commonly used to estimate flood quantiles when flood data is unavailable or the record length at an individual gauging station is insufficient for reliable analyses. These methods compensate for limited or unavailable data by pooling data from nearby gauged sites. This requires the delineation of hydrologically homogeneous regions in which the flood regime is sufficiently similar to allow the spatial transfer of information. It is generally accepted that hydrologic similarity results from similar physiographic characteristics, and thus these characteristics can be used to delineate regions and classify ungauged sites. However, as currently practiced, the delineation is highly subjective and dependent on the similarity measures and classification techniques employed. A standardized procedure for delineation of hydrologically homogeneous regions is presented herein. Key aspects are a new statistical metric to identify physically discordant sites, and the identification of an appropriate set of physically based measures of extreme hydrological similarity. A combination of multivariate statistical techniques applied to multiple flood statistics and basin characteristics for gauging stations in the Southeastern U.S. revealed that basin slope, elevation, and soil drainage largely determine the extreme hydrological behavior of a watershed. Use of these characteristics as similarity measures in the standardized approach for region delineation yields regions which are more homogeneous and more efficient for quantile estimation at ungauged sites than those delineated using alternative physically-based procedures typically employed in practice. The proposed methods and key physical characteristics are also shown to be efficient for region delineation and quantile development in alternative areas composed of watersheds with statistically different physical composition. 
In addition, the use of aggregated values of key watershed characteristics was found to be sufficient for the regionalization of flood data; the added time and computational effort required to derive spatially distributed watershed variables does not increase the accuracy of quantile estimators for ungauged sites. This dissertation also presents a methodology by which flood quantile estimates in Haiti can be derived using relationships developed for data-rich regions of the U.S. As currently practiced, regional flood frequency techniques can only be applied within the predefined area used for model development. However, results presented herein demonstrate that the regional flood distribution can successfully be extrapolated to areas of similar physical composition located beyond the extent of that used for model development, provided that differences in precipitation are accounted for and the site in question can be appropriately classified within a delineated region.
Abstract:
Transparent and translucent objects involve both light reflection and transmission at surfaces. This paper presents a physically based transmission model for rough surfaces. The surface is assumed to be locally smooth, and statistical techniques are applied to calculate light transmission through a local illumination area. We have obtained an analytical expression for single scattering. The analytical model has been compared to our Monte Carlo simulations as well as to previous simulations, and good agreement has been achieved. The presented model has potential applications for realistic rendering of transparent and translucent objects.
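As a rough illustration of how a locally smooth rough surface can be treated statistically, the Python sketch below estimates the mean transmittance of vertically incident light by Monte Carlo sampling of facet tilts. It is not the paper's model: it assumes a two-dimensional geometry, Gaussian-distributed tilts, unpolarized light, Fresnel transmission at each facet, and no shadowing or multiple scattering. The function names and the refractive index of 1.5 are hypothetical.

```python
import math
import random

def fresnel_transmittance(incidence_rad, n=1.5):
    """Unpolarized Fresnel transmittance from air into a medium of index n."""
    cos_i = math.cos(incidence_rad)
    sin_t = math.sin(incidence_rad) / n  # Snell's law
    cos_t = math.sqrt(1.0 - sin_t * sin_t)
    r_s = ((cos_i - n * cos_t) / (cos_i + n * cos_t)) ** 2
    r_p = ((n * cos_i - cos_t) / (n * cos_i + cos_t)) ** 2
    return 1.0 - 0.5 * (r_s + r_p)

def mean_transmittance(roughness_rad, n=1.5, samples=20000, seed=0):
    """Average the facet transmittance over random tilts (standard deviation in radians)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        tilt = rng.gauss(0.0, roughness_rad)
        tilt = max(-1.5, min(1.5, tilt))  # keep the facet below grazing incidence
        # For vertical illumination the incidence angle equals the facet tilt.
        total += fresnel_transmittance(abs(tilt), n)
    return total / samples

print(round(mean_transmittance(0.0), 4))  # smooth surface: 0.96 for n = 1.5
print(mean_transmittance(0.3))            # rough surface estimate
```

The smooth-surface limit recovers the familiar normal-incidence value 1 - ((n - 1)/(n + 1))^2, which gives a quick sanity check on the sampling code.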
Climate refugia: joint inference from fossil records, species distribution models and phylogeography
Abstract:
Climate refugia, locations where taxa survive periods of regionally adverse climate, are thought to be critical for maintaining biodiversity through the glacial–interglacial climate changes of the Quaternary. A critical research need is to better integrate and reconcile the three major lines of evidence used to infer the existence of past refugia – fossil records, species distribution models and phylogeographic surveys – in order to characterize the complex spatiotemporal trajectories of species and populations in and out of refugia. Here we review the complementary strengths, limitations and new advances for these three approaches. We provide case studies to illustrate their combined application, and point the way towards new opportunities for synthesizing these disparate lines of evidence. Case studies with European beech, Qinghai spruce and Douglas-fir illustrate how the combination of these three approaches successfully resolves complex species histories not attainable from any one approach. Promising new statistical techniques can capitalize on the strengths of each method and provide a robust quantitative reconstruction of species history. Studying past refugia can help identify contemporary refugia and clarify their conservation significance, in particular by elucidating the fine-scale processes and the particular geographic locations that buffer species against rapidly changing climate.
Abstract:
Rapid industrialization and urbanization in developing countries have led to an increase in air pollution, along a similar trajectory to that previously experienced by the developed nations. In China, particulate pollution is a serious environmental problem that is influencing air quality, regional and global climates, and human health. In response to the extremely severe and persistent haze pollution experienced by about 800 million people during the first quarter of 2013 (refs 4, 5), the Chinese State Council announced its aim to reduce concentrations of PM2.5 (particulate matter with an aerodynamic diameter less than 2.5 micrometres) by up to 25 per cent relative to 2012 levels by 2017 (ref. 6). Such efforts, however, require elucidation of the factors governing the abundance and composition of PM2.5, which remain poorly constrained in China. Here we combine a comprehensive set of novel and state-of-the-art offline analytical approaches and statistical techniques to investigate the chemical nature and sources of particulate matter at urban locations in Beijing, Shanghai, Guangzhou and Xi'an during January 2013. We find that the severe haze pollution event was driven to a large extent by secondary aerosol formation, which contributed 30-77 per cent and 44-71 per cent (average for all four cities) of PM2.5 and of organic aerosol, respectively. On average, the contributions of secondary organic aerosol (SOA) and secondary inorganic aerosol (SIA) are found to be of similar importance (SOA/SIA ratios range from 0.6 to 1.4). Our results suggest that, in addition to mitigating primary particulate emissions, reducing the emissions of secondary aerosol precursors from, for example, fossil fuel combustion and biomass burning is likely to be important for controlling China's PM2.5 levels and for reducing the environmental, economic and health impacts resulting from particulate pollution.