920 results for multivariate data analysis
Abstract:
The increasing availability of mobility data, and growing awareness of its importance and value, have motivated many researchers to develop models and tools for analyzing movement data. This paper presents a brief survey of significant research on the modeling, processing and visualization of data about moving objects. We identify some key research fields that will provide better features for online analysis of movement data. As a result of the literature review, we suggest a generic multi-layer architecture for the development of an online analytical processing software tool, which will be used to define the future work of our team.
Abstract:
3rd SMTDA Conference Proceedings, 11-14 June 2014, Lisbon, Portugal.
Abstract:
In the current context of serious climate change, where an increase in the frequency of some extreme events can increase the proportion of periods prone to high-intensity forest fires, the National Forest Authority regularly implements, in several Portuguese forest areas, a set of measures to control the amount of available fuel mass (PNDFCI, 2008). In the present work we present a preliminary assessment of the consequences for soil recovery of prescribed fire measures implemented to control the amount of fuel mass, in particular in terms of water retention capacity, organic matter content, pH and iron content. This work is part of a larger study (Meira-Castro, 2009(a); Meira-Castro, 2009(b)). Following established practice for data collection, in which the data are arranged in multidimensional matrices of n columns (variables under analysis) by p lines (sampled areas at different depths), and considering the quantitative nature of the data in this study, we chose a methodological approach based on multivariate statistical analysis, in particular Principal Component Analysis (PCA) (Góis, 2004). The experiments were carried out on a soil cover over a natural site of andalusitic schist in Gramelas, Caminha, NW Portugal, which had remained free of prescribed burning for four years and was subjected to prescribed fire in March 2008. Soil samples were collected from five different plots at six different time periods. The methodological option adopted allowed us to identify the most relevant relational structures within the n variables, within the p samples, and in both sets simultaneously (Garcia-Pereira, 1990). Consequently, in addition to the traditional outputs of PCA, we analyzed the influence of both sampling depth and geomorphological environment on the behavior of all the variables involved.
Abstract:
Complex industrial plants exhibit multiple interactions among smaller parts and with human operators. Failure in one part can propagate across subsystem boundaries, causing a serious disaster. This paper analyzes industrial accident data series from the perspective of dynamical systems. First, we process real-world data and show that the statistics of the number of fatalities reveal features that are well described by power law (PL) distributions. For early years, the data reveal double PL behavior, while for more recent time periods a single PL fits the experimental data better. Second, we analyze the entropy of the data series statistics over time. Third, we use the Kullback–Leibler divergence to compare the empirical data, and multidimensional scaling (MDS) techniques for data analysis and visualization. Entropy-based analysis is adopted to assess complexity, with the advantage of yielding a single parameter to express relationships between the data. The classical and the generalized (fractional) entropy and Kullback–Leibler divergence are used. The generalized measures allow a clear identification of patterns embedded in the data.
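As an illustration of the classical measures named in this abstract, the following minimal sketch computes the Shannon entropy of a binned fatality distribution and the Kullback–Leibler divergence between two such distributions. The bin counts and the names `early` and `recent` are hypothetical and are not data from the paper; the generalized (fractional) variants are not covered here.

```python
import math


def normalize(counts):
    """Turn raw bin counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def shannon_entropy(p):
    """Classical Shannon entropy in nats; empty bins contribute zero."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def kl_divergence(p, q):
    """Classical Kullback-Leibler divergence D(p || q) in nats.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical counts of accidents per fatality bin for two periods.
early = normalize([50, 30, 15, 5])
recent = normalize([70, 20, 8, 2])

h_early = shannon_entropy(early)
h_recent = shannon_entropy(recent)
d_early_recent = kl_divergence(early, recent)
```

Computing `kl_divergence` for every pair of periods would give a dissimilarity table of the kind that MDS takes as input; because the divergence is not symmetric, it would normally be symmetrized first.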
Abstract:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Abstract:
BACKGROUND: Recommended oral voriconazole (VRC) doses are lower than intravenous doses. Because plasma concentrations impact efficacy and safety of therapy, optimizing individual drug exposure may improve these outcomes. METHODS: A population pharmacokinetic analysis (NONMEM) was performed on 505 plasma concentration measurements involving 55 patients with invasive mycoses who received recommended VRC doses. RESULTS: A 1-compartment model with first-order absorption and elimination best fitted the data. VRC clearance was 5.2 L/h, the volume of distribution was 92 L, the absorption rate constant was 1.1 hour(-1), and oral bioavailability was 0.63. Severe cholestasis decreased VRC elimination by 52%. A large interpatient variability was observed on clearance (coefficient of variation [CV], 40%) and bioavailability (CV 84%), and an interoccasion variability was observed on bioavailability (CV, 93%). Lack of response to therapy occurred in 12 of 55 patients (22%), and grade 3 neurotoxicity occurred in 5 of 55 patients (9%). A logistic multivariate regression analysis revealed an independent association between VRC trough concentrations and probability of response or neurotoxicity by identifying a therapeutic range of 1.5 mg/L (>85% probability of response) to 4.5 mg/L (<15% probability of neurotoxicity). Population-based simulations with the recommended 200 mg oral or 300 mg intravenous twice-daily regimens predicted probabilities of 49% and 87%, respectively, for achievement of 1.5 mg/L and of 8% and 37%, respectively, for achievement of 4.5 mg/L. With 300-400 mg twice-daily oral doses and 200-300 mg twice-daily intravenous doses, the predicted probabilities of achieving the lower target concentration were 68%-78% for the oral regimen and 70%-87% for the intravenous regimen, and the predicted probabilities of achieving the upper target concentration were 19%-29% for the oral regimen and 18%-37% for the intravenous regimen. 
CONCLUSIONS: Higher oral than intravenous VRC doses, followed by individualized adjustments based on measured plasma concentrations, improve achievement of the therapeutic target that maximizes the probability of therapeutic response and minimizes the probability of neurotoxicity. These findings challenge dose recommendations for VRC.
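The structural model reported above (one compartment, first-order absorption and elimination) can be sketched for a typical patient using the population estimates quoted in the abstract. This is only an illustration: it assumes twice-daily oral dosing at steady state, ignores the interpatient and interoccasion variability that drives the reported probabilities, and does not model intravenous dosing or cholestasis. The function name and the 12 h interval are assumptions.

```python
import math

# Population estimates quoted in the abstract.
CL = 5.2       # clearance, L/h
V = 92.0       # volume of distribution, L
KA = 1.1       # absorption rate constant, 1/h
F_ORAL = 0.63  # oral bioavailability


def steady_state_conc(dose_mg, t_h, tau_h=12.0, f=F_ORAL, cl=CL, v=V, ka=KA):
    """Concentration (mg/L) t_h hours after an oral dose at steady state.

    One-compartment model with first-order absorption, obtained by
    superposing identical doses given every tau_h hours.
    """
    ke = cl / v  # elimination rate constant, 1/h
    scale = f * dose_mg * ka / (v * (ka - ke))
    elim = math.exp(-ke * t_h) / (1.0 - math.exp(-ke * tau_h))
    absorb = math.exp(-ka * t_h) / (1.0 - math.exp(-ka * tau_h))
    return scale * (elim - absorb)


# Typical-patient troughs just before the next dose.
trough_200 = steady_state_conc(200, 12.0)
trough_400 = steady_state_conc(400, 12.0)
```

Because the model is linear in dose, doubling the dose doubles the typical trough. The probabilities of target attainment in the abstract come from population simulations and cannot be recovered from this deterministic curve.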
Abstract:
Planners in public and private institutions would like coherent forecasts of the components of age-specific mortality, such as causes of death. This has been difficult to achieve because the relative values of the forecast components often fail to behave in a way that is coherent with historical experience. In addition, when the group forecasts are combined the result is often incompatible with an all-groups forecast. It has been shown that cause-specific mortality forecasts are pessimistic when compared with all-cause forecasts (Wilmoth, 1995). This paper abandons the conventional approach of using log mortality rates and forecasts the density of deaths in the life table. Since these values obey a unit sum constraint for both conventional single-decrement life tables (only one absorbing state) and multiple-decrement tables (more than one absorbing state), they are intrinsically relative rather than absolute values across decrements as well as ages. Using the methods of Compositional Data Analysis pioneered by Aitchison (1986), death densities are transformed into the real space so that the full range of multivariate statistics can be applied, then back-transformed to positive values so that the unit sum constraint is honoured. The structure of the best-known, single-decrement mortality-rate forecasting model, devised by Lee and Carter (1992), is expressed in compositional form and the results from the two models are compared. The compositional model is extended to a multiple-decrement form and used to forecast mortality by cause of death for Japan.
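The transform and back-transform step described in this abstract can be illustrated with the centred log-ratio (clr) transformation. The abstract does not say which log-ratio transformation the authors use, so clr is an assumption here, and the death densities below are hypothetical.

```python
import math


def closure(values):
    """Rescale positive values so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]


def clr(composition):
    """Centred log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(v) for v in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def clr_inverse(coords):
    """Back-transform real coordinates to a positive, unit-sum composition."""
    return closure([math.exp(c) for c in coords])


# Hypothetical life-table death densities over five age groups.
deaths = closure([2.0, 1.0, 3.0, 14.0, 80.0])
coords = clr(deaths)            # unconstrained real values
restored = clr_inverse(coords)  # same composition as `deaths`

# Any adjustment made in real space still back-transforms to a valid
# composition, which is why the unit sum constraint is honoured.
shifted = clr_inverse([c + 0.1 * i for i, c in enumerate(coords)])
```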
Abstract:
In a seminal paper, Aitchison and Lauder (1985) introduced classical kernel density estimation techniques in the context of compositional data analysis. Indeed, they gave two options for the choice of the kernel to be used in the kernel estimator. One of these kernels is based on the use of the alr transformation on the simplex S^D jointly with the normal distribution on R^(D-1). However, these authors themselves recognized that this method has some deficiencies. A method for overcoming these difficulties based on recent developments for compositional data analysis and multivariate kernel estimation theory, combining the ilr transformation with the use of the normal density with a full bandwidth matrix, was recently proposed in Martín-Fernández, Chacón and Mateu-Figueras (2006). Here we present an extensive simulation study that compares both methods in practice, thus exploring the finite-sample behaviour of both estimators.
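The second approach mentioned above (ilr coordinates combined with a normal kernel and a full bandwidth matrix) can be sketched for three-part compositions, where the ilr coordinates are two-dimensional. The pivot-coordinate basis, the sample and the bandwidth matrix `H` are illustrative assumptions, not the choices made in the cited papers, and the estimate is a density in ilr coordinates.

```python
import math


def ilr(composition):
    """Isometric log-ratio transform using pivot coordinates."""
    logs = [math.log(v) for v in composition]
    coords = []
    for i in range(len(logs) - 1):
        rest = logs[i + 1:]
        r = len(rest)
        coords.append(math.sqrt(r / (r + 1)) * (logs[i] - sum(rest) / r))
    return coords


def normal_kernel_2d(u, h):
    """Bivariate normal density at u with mean zero and covariance matrix h."""
    (a, b), (c, d) = h
    det = a * d - b * c
    # Quadratic form u^T h^{-1} u written out for a 2x2 matrix.
    quad = (d * u[0] ** 2 - (b + c) * u[0] * u[1] + a * u[1] ** 2) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def kde_ilr(point, sample, h):
    """Kernel density estimate at a composition, evaluated in ilr space."""
    z = ilr(point)
    total = 0.0
    for comp in sample:
        zi = ilr(comp)
        total += normal_kernel_2d([z[0] - zi[0], z[1] - zi[1]], h)
    return total / len(sample)


# Hypothetical three-part compositions and a full (non-diagonal) bandwidth.
sample = [[0.2, 0.3, 0.5], [0.25, 0.25, 0.5], [0.1, 0.4, 0.5], [0.3, 0.3, 0.4]]
H = [[0.05, 0.01], [0.01, 0.08]]
density = kde_ilr([0.2, 0.3, 0.5], sample, H)
```

In practice the bandwidth matrix would be chosen by a data-driven selector rather than fixed by hand.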
Abstract:
Several eco-toxicological studies have shown that insectivorous mammals, due to their feeding habits, easily accumulate high amounts of pollutants in relation to other mammal species. To assess the bio-accumulation levels of toxic metals and their influence on essential metals, we quantified the concentration of 19 elements (Ca, K, Fe, B, P, S, Na, Al, Zn, Ba, Rb, Sr, Cu, Mn, Hg, Cd, Mo, Cr and Pb) in bones of 105 greater white-toothed shrews (Crocidura russula) from a polluted (Ebro Delta) and a control (Medas Islands) area. Since chemical contents of a bio-indicator are mainly compositional data, conventional statistical analyses currently used in eco-toxicology can give misleading results. Therefore, to improve the interpretation of the data obtained, we used statistical techniques for compositional data analysis to define groups of metals and to evaluate the relationships between them, from an inter-population viewpoint. Hypothesis testing on the adequate balance-coordinates allows us to confirm intuition-based hypotheses and some previous results. The main statistical goal was to test equal means of balance-coordinates for the two defined populations. After checking normality, one-way ANOVA or Mann-Whitney tests were carried out for the inter-group balances.
Abstract:
Pounamu (NZ jade), or nephrite, is a protected mineral in its natural form following the transfer of ownership back to Ngai Tahu under the Ngai Tahu (Pounamu Vesting) Act 1997. Any theft of nephrite is prosecutable under the Crimes Act 1961. Scientific evidence is essential in cases where origin is disputed. A robust method for discrimination of this material through the use of elemental analysis and compositional data analysis is required. Initial studies have characterised the variability within a given nephrite source. This has included investigation of both in situ outcrops and alluvial material. Methods for the discrimination of two geographically close nephrite sources are being developed.
Key Words: forensic, jade, nephrite, laser ablation, inductively coupled plasma mass spectrometry, multivariate analysis, elemental analysis, compositional data analysis
Abstract:
A biplot, which is the multivariate generalization of the two-variable scatterplot, can be used to visualize the results of many multivariate techniques, especially those that are based on the singular value decomposition. We consider data sets consisting of continuous-scale measurements, their fuzzy coding and the biplots that visualize them, using a fuzzy version of multiple correspondence analysis. Of special interest is the way quality of fit of the biplot is measured, since it is well-known that regular (i.e., crisp) multiple correspondence analysis seriously under-estimates this measure. We show how the results of fuzzy multiple correspondence analysis can be defuzzified to obtain estimated values of the original data, and prove that this implies an orthogonal decomposition of variance. This permits a measure of fit to be calculated in the familiar form of a percentage of explained variance, which is directly comparable to the corresponding fit measure used in principal component analysis of the original data. The approach is motivated initially by its application to a simulated data set, showing how the fuzzy approach can lead to diagnosing nonlinear relationships, and finally it is applied to a real set of meteorological data.
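The fuzzy coding and defuzzification steps described in this abstract can be illustrated with a minimal sketch. It assumes triangular membership functions defined by three hinge points, which is one common choice; the abstract does not state the coding scheme used, and the hinge values and function names below are hypothetical.

```python
def fuzzy_code(x, hinges):
    """Code a continuous value as memberships over ordered hinge points.

    Uses triangular membership functions: a value between two hinges is
    shared linearly between them, so the memberships sum to one. Values
    outside the hinge range are clamped to the nearest end category.
    """
    weights = [0.0] * len(hinges)
    if x <= hinges[0]:
        weights[0] = 1.0
        return weights
    if x >= hinges[-1]:
        weights[-1] = 1.0
        return weights
    for k in range(len(hinges) - 1):
        lo, hi = hinges[k], hinges[k + 1]
        if lo <= x <= hi:
            share = (x - lo) / (hi - lo)
            weights[k] = 1.0 - share
            weights[k + 1] = share
            return weights
    return weights


def defuzzify(weights, hinges):
    """Recover a value as the membership-weighted average of the hinges."""
    return sum(w * h for w, h in zip(weights, hinges))


# Hypothetical hinge points (e.g. minimum, median, maximum of a variable).
hinges = [0.0, 10.0, 30.0]
coded = fuzzy_code(12.5, hinges)      # memberships in low / medium / high
recovered = defuzzify(coded, hinges)  # returns the original value
```

With exact fuzzy codes the original value is recovered without loss. In the setting of the abstract, defuzzification is applied to the values fitted by the correspondence analysis, which is what yields the measure of explained variance.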
Abstract:
We consider the joint visualization of two matrices which have common rows and columns, for example multivariate data observed at two time points or split according to a dichotomous variable. Methods of interest include principal components analysis for interval-scaled data, or correspondence analysis for frequency data or ratio-scaled variables on commensurate scales. A simple result in matrix algebra shows that by setting up the matrices in a particular block format, matrix sum and difference components can be visualized. The case when we have more than two matrices is also discussed and the methodology is applied to data from the International Social Survey Program.
Abstract:
The singular value decomposition and its interpretation as a linear biplot has proved to be a powerful tool for analysing many forms of multivariate data. Here we adapt biplot methodology to the specific case of compositional data consisting of positive vectors each of which is constrained to have unit sum. These relative variation biplots have properties relating to special features of compositional data: the study of ratios, subcompositions and models of compositional relationships. The methodology is demonstrated on a data set consisting of six-part colour compositions in 22 abstract paintings, showing how the singular value decomposition can achieve an accurate biplot of the colour ratios and how possible models interrelating the colours can be diagnosed.
Adenocarcinoma of the pancreas: Comparative single centre analysis between ductal and mucinous type.
Abstract:
1. Background
Adenocarcinomas of the pancreas are exocrine tumors that originate from the ductal system and comprise two morphologically distinct entities: ductal adenocarcinoma and mucinous adenocarcinoma. Ductal adenocarcinoma is by far the most frequent malignant tumor of the pancreas, representing at least 90% of all pancreatic cancers. It is associated with a very poor prognosis because there are currently no biological markers or diagnostic tools for identifying the disease at an early stage. At the time of diagnosis the disease is usually extensive, with vascular and nerve involvement or metastatic spread (1). Survival is less than 5% at 5 years, making it the fifth leading cause of cancer death in the world (2). The mucinous form of pancreatic adenocarcinoma is less frequent and seems to have a better prognosis, with about 57% survival at 5 years (1)(3)(4).
Each morphologic type of pancreatic adenocarcinoma is associated with particular preneoplastic lesions. Two types of preneoplastic lesions are described: pancreatic intra-epithelial neoplasia (PanIN), which affects the small and peripheral pancreatic ducts, and intraductal papillary-mucinous neoplasm (IPMN), which involves the main pancreatic duct and its principal branches. Both preneoplastic lesions lead, by different mechanisms, to pancreatic adenocarcinoma (1)(2)(3)(4)(5)(6)(7)(8)(9)(10).
The purpose of our study is a retrospective analysis of various clinical and histo-morphological parameters in order to assess a difference in survival between these two morphological types of pancreatic adenocarcinoma.
1.2 Material and methods
We conducted a retrospective analysis of 35 patients (20 men and 15 women) who underwent surgical treatment for pancreatic adenocarcinoma at the Surgical Department of the University Hospital in Lausanne. The patients in our study were treated between 2003 and 2008, permitting a mean follow-up of at least 5 years. For each patient the following parameters were analysed: age, gender, type of operation, type of preneoplastic lesion, TNM stage, histological grade of the tumor, vascular invasion, lymphatic and perineural invasion, resection margins, and adjuvant treatment.
These observations were entered into univariate and multivariate statistical analyses and compared with overall survival, as well as with specific survival for each morphologic subtype of adenocarcinoma.
As the low number of mucinous adenocarcinomas (n=5) was insufficient for a pertinent statistical analysis, we compared the data obtained from adenocarcinomas developed on PanIN with those from adenocarcinomas developed on IPMN, including both ductal and mucinous types.
1.3 Results
Our results show that adenocarcinomas developed on pre-existing IPMN, including both morphologic types (ductal and mucinous), are associated with better survival and prognosis than adenocarcinomas developed on PanIN.
1.4 Conclusion
This study suggests that the most relevant parameter for survival in pancreatic adenocarcinoma is the type of preneoplastic lesion. A significant difference in survival was noted between adenocarcinomas developed on PanIN and adenocarcinomas developed on IPMN precursor lesions. Ductal adenocarcinomas developed on IPMN show significantly longer survival than those developed on PanIN lesions (P = 0.01). We therefore suggest that the histological type of the preneoplastic lesion, rather than the histological type of the adenocarcinoma, should be regarded as the determinant prognostic factor for survival in pancreatic adenocarcinoma.
Abstract:
Whether for investigative or intelligence purposes, crime analysts often need to analyse the spatiotemporal distribution of crimes or of traces left by suspects. This article presents a visualisation methodology supporting recurrent practical analytical tasks such as the detection of crime series or the analysis of traces left by digital devices like mobile phones or GPS devices. The proposed approach has led to the development of a dedicated tool that has proven its effectiveness in real inquiries and intelligence practices. It supports a more fluent visual analysis of the collected data and may provide critical clues to support police operations, as exemplified by the presented case studies.