958 results for multivariate binary data
Abstract:
Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts, such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point process methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are second-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods. © 2010 Elsevier Ltd.
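To make the idea of an entropy built from co-occurrences concrete, the following is a minimal sketch and not the authors' measure. It assumes pairwise (order-2) co-occurrences only, defines two marked points as co-occurring when they lie within a fixed radius of each other, and computes the Shannon entropy of the resulting distribution of unordered mark pairs. The function name, the radius rule and the sample points are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_entropy(points, radius):
    """Shannon entropy (in nats) of unordered mark pairs among points
    lying within `radius` of each other.

    `points` is a list of (x, y, mark) tuples. This is an order-2
    illustration only; the abstract refers to co-occurrences at a
    given, possibly higher, order.
    """
    counts = Counter()
    for (x1, y1, m1), (x2, y2, m2) in combinations(points, 2):
        if math.hypot(x1 - x2, y1 - y2) <= radius:
            # Sort the two marks so that (a, b) and (b, a) are one category.
            counts[tuple(sorted((m1, m2)))] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no co-occurrences at this scale
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical marked point pattern: (x, y, mark)
points = [(0.0, 0.0, "a"), (1.0, 0.0, "b"), (0.0, 1.0, "a"), (5.0, 5.0, "b")]
entropy_at_scale = cooccurrence_entropy(points, radius=1.5)
```

Evaluating the function over a range of radii would give an entropy-versus-scale profile, which is one plausible way to explore structure at several scales under these assumptions.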
Abstract:
The Semantic Binary Data Model (SBM) is a viable alternative to the now-dominant relational data model. SBM would be especially advantageous for applications dealing with complex interrelated networks of objects, provided that a robust, efficient implementation can be achieved. This dissertation presents an implementation design method for SBM, algorithms, and their analytical and empirical evaluation. Our method allows building a robust and flexible database engine with a wider applicability range and improved performance. Extensions to SBM are introduced and an implementation of these extensions is proposed that allows the database engine to efficiently support applications with a predefined set of queries. A new Record data structure is proposed. Trade-offs of employing Fact, Record and Bitmap data structures for storing information in a semantic database are analyzed. A clustering ID distribution algorithm and an efficient algorithm for object ID encoding are proposed. Mapping to an XML data model is analyzed and a new XML-based XSDL language facilitating interoperability of the system is defined. Solutions to issues associated with making the database engine multi-platform are presented. An improvement to the atomic update algorithm suitable for certain scenarios of database recovery is proposed. Specific guidelines are devised for implementing a robust and well-performing database engine based on the extended Semantic Data Model.
Abstract:
This collection contains measurements of abundance and diversity of different groups of aboveground invertebrates sampled on the plots of the different sub-experiments at the field site of a large grassland biodiversity experiment (the Jena Experiment; see further details below). In the main experiment, 82 grassland plots of 20 x 20 m were established from a pool of 60 species belonging to four functional groups (grasses, legumes, tall and small herbs). In May 2002, varying numbers of plant species from this species pool were sown into the plots to create a gradient of plant species richness (1, 2, 4, 8, 16 and 60 species) and functional richness (1, 2, 3, 4 functional groups). Plots were maintained by bi-annual weeding and mowing. The following series of datasets are contained in this collection: 1. Measurements of ant abundance (number of individuals attracted to baits) and ant occurrence (binary data) in the Main Experiment in 2006 and 2013. Ants were sampled using two types of baited traps receiving ~10 g of tuna or ~10 g of honey/sucrose. After 30 min, the occurrence (presence = 1 / absence = 0) and abundance (number) of ants at the two types of baits were recorded and pooled per plot.
Abstract:
The viscosity of ionic liquids (ILs) has been modeled as a function of temperature and at atmospheric pressure using a new method based on the UNIFAC–VISCO method. This model extends the calculations previously reported by our group (see Zhao et al. J. Chem. Eng. Data 2016, 61, 2160–2169), which used 154 experimental viscosity data points of 25 ionic liquids for regression of a set of binary interaction parameters and ion Vogel–Fulcher–Tammann (VFT) parameters. Discrepancies in the experimental data of the same IL affect the quality of the correlation and thus the development of the predictive method. In this work, mathematical gnostics was used to analyze the experimental data from different sources and recommend one set of reliable data for each IL. These recommended data (819 data points in total) for 70 ILs were correlated using this model to obtain an extended set of binary interaction parameters and ion VFT parameters, with a regression accuracy of 1.4%. In addition, 966 experimental viscosity data points for 11 binary mixtures of ILs were collected from the literature to establish this model. The binary data consist of 128 training data points used for the optimization of binary interaction parameters and 838 test data points used for the comparison of the pure evaluated values. The relative average absolute deviation (RAAD) for training and test is 2.9% and 3.9%, respectively.
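As a rough illustration of two quantities named in this abstract, the sketch below evaluates a Vogel–Fulcher–Tammann temperature dependence and a relative average absolute deviation. It assumes the common VFT form η = A·exp(B / (T − T0)) and the usual definition RAAD = (1/N) Σ |calc − exp| / exp. The abstract gives neither formula explicitly, the parameter values are hypothetical, and the UNIFAC–VISCO group-contribution part of the model is not reproduced here.

```python
import math


def vft_viscosity(temperature_k, a, b, t0):
    """Viscosity from the assumed VFT form eta = a * exp(b / (T - t0)).

    `a` carries the viscosity unit; `b` and `t0` are in kelvin.
    """
    if temperature_k <= t0:
        raise ValueError("temperature must exceed t0 for the VFT form")
    return a * math.exp(b / (temperature_k - t0))


def raad(calculated, experimental):
    """Relative average absolute deviation, returned as a fraction."""
    if len(calculated) != len(experimental) or not experimental:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(
        abs(c - e) / e for c, e in zip(calculated, experimental)
    ) / len(experimental)


# Hypothetical parameters and "experimental" values, for illustration only.
temperatures = [298.15, 323.15, 348.15]
calculated = [vft_viscosity(t, a=0.2, b=800.0, t0=165.0) for t in temperatures]
experimental = [c * 1.03 for c in calculated]  # pretend the data sit 3% above the model
deviation = raad(calculated, experimental)
```

Multiplying the returned fraction by 100 gives a percentage comparable in form to the 2.9% and 3.9% figures quoted above, under the stated definition.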
Abstract:
A secure communication system based on the error-feedback synchronization of the electronic model of the particle-in-a-box system is proposed. This circuit allows a robust and simple electronic emulation of the mechanical behavior of the collisions of a particle inside a box, exhibiting rich chaotic behavior. The required nonlinearity to emulate the box walls is implemented in a simple way when compared with other analog electronic chaotic circuits. A master/slave synchronization of two circuits exhibiting rich chaotic behavior demonstrates the potential of this system for secure communication. In this system, binary data stream information modulates the bifurcation parameter of the particle-in-a-box electronic circuit in the transmitter. In the receiver circuit, this parameter is estimated using Pecora-Carroll synchronization and error-feedback synchronization. The performance of the demodulation process is verified through the eye pattern technique applied to the recovered bit stream. During the demodulation process, the error-feedback synchronization performed better than the Pecora-Carroll synchronization. The application of the particle-in-a-box electronic circuit in a secure communication system is demonstrated.
Abstract:
The stock market suffers uncertain relations throughout the entire negotiation process, with different variables exerting direct and indirect influence on stock prices. This study focuses on the analysis of certain aspects that may influence these values offered by the capital market, based on the Brazil Index of the Sao Paulo Stock Exchange (Bovespa), which selects 100 stocks among the most traded on Bovespa in terms of number of trades and financial volume. The selected variables are characterized by the companies' activity area and the business volume in the month of data collection, i.e. April 2007. This article proposes an analysis that joins the accounting view of the stock price variables that can be influenced with the use of multivariate qualitative data analysis. Data were explored through Correspondence Analysis (Anacor) and Homogeneity Analysis (Homals). According to the research, the selected variables are associated with the values presented by the stocks, which become an internal control instrument and a decision-making tool when it comes to choosing investments.
Abstract:
In recent years, the teaching and learning process has changed significantly thanks to the emergence of the Internet. New teaching support tools have appeared, among which remote laboratories stand out. Today, many educational institutions offer remote laboratories in their courses, allowing teachers and students to carry out real experiments over the Internet. These laboratories are implemented with different architectures and infrastructures and are supported by various remotely accessible laboratory modules (e.g. measurement instruments). However, their adoption in education is still limited, owing to: i) the lack of resources and technical skills in educational institutions to develop them, ii) the difficulty of sharing laboratory modules across different infrastructures, and iii) the limited ability to reconfigure them with those modules. To overcome these limitations, a prototype whose architecture is based on the IEEE 1451.0 standard and on FPGA technology was conceived and developed as part of a doctoral project [1]. Besides ensuring standardized development of and access to a remote laboratory, this prototype also promotes the sharing of laboratory modules across different infrastructures. That work exploited the reconfiguration capability of FPGAs to embed several modules in the laboratory infrastructure, all described in files using hardware description languages and structured according to the IEEE 1451.0 standard. Defining these modules requires creating binary data structures (Transducer Electronic Data Sheets, TEDSs), as well as other files that enable their interconnection with the laboratory infrastructure. However, creating these files is quite complex, since it requires several calculations and conversions.
Given this complexity, this dissertation describes the development of a Web application for reading and writing TEDSs. In addition to a study of remote laboratories, it describes the IEEE 1451.0 standard, with particular attention to its architecture and to the structure of the different TEDSs. To put the developed application in context, it also briefly presents a prototype of a reconfigurable remote laboratory whose reconfiguration is supported by this application. Finally, the verification of the Web application is described, in order to draw conclusions about its contribution to simplifying that reconfiguration.
Abstract:
Introduction: This study evaluated the efficacy of retreatment of pulmonary tuberculosis (TB) with regard to treatment outcomes and antimicrobial susceptibility testing (ST) profiles. Methods: This retrospective cohort study analyzed 144 patients treated at a referral hospital in Brazil. All of them had undergone prior treatment, were smear-positive for TB and received a standardized retreatment regimen. Fisher's 2-tailed exact test and the χ2 test were used; RRs and 95% CIs were calculated using univariate and multivariate binary logistic regression. Results: The patients were cured in 84 (58.3%) cases. Failure was associated with relapsed treatment and abandonment (n=34). Culture tests were obtained for 103 (71.5%) cases; 70 (48.6%) had positive results. ST results were available for 67 (46.5%) cases; the prevalence of acquired resistance was 53.7%. There were no significant differences between those who did and did not achieve therapeutic success (p=0.988), regardless of sensitivity or resistance to 1 or more drugs. Rifampicin resistance was independently associated with therapeutic failure (OR: 4.4, 95% CI: 1.12-17.37, p=0.034). For those cases in which cultures were unavailable, a 2nd model without this information was built. In this model, return after abandonment was significantly associated with retreatment failure (OR: 3.59, 95% CI: 1.17-11.06, p=0.026). Conclusions: In this cohort, the general resistance profile appeared to have no influence on treatment outcome, except in cases of rifampicin resistance. The form of reentry was another independent predictor of failure. The use of bacterial culture identification and ST in TB management must be re-evaluated. The recommendations for different susceptibility profiles must also be improved.
Abstract:
This paper extends previous research and discussion on the use of multivariate continuous data, which are about to become more prevalent in forensic science. As an illustrative example, attention is drawn here to the area of comparative handwriting examinations. Multivariate continuous data can be obtained in this field by analysing the contour shape of loop characters through Fourier analysis. This methodology, based on existing research in this area, allows one to describe in detail the morphology of character contours through a set of variables. This paper uses data collected from female and male writers to conduct a comparative analysis of likelihood ratio based evidence assessment procedures in both evaluative and investigative proceedings. While the use of likelihood ratios in the former situation is now rather well established (typically, in order to discriminate between propositions of authorship of a given individual versus another, unknown individual), the investigative setting still receives little consideration in practice. This paper seeks to highlight that investigative settings, too, can represent an area of application for which the likelihood ratio can offer logical support. As an example, the inference of gender of the writer of an incriminated handwritten text is put forward, analysed and discussed in this paper. The more general viewpoint according to which likelihood ratio analyses can be helpful for investigative proceedings is supported here through various simulations. These offer a characterisation of the robustness of the proposed likelihood ratio methodology.
Abstract:
BACKGROUND: Social support has been found to be protective from adverse health effects of psychological stress. We hypothesized that higher social support would predict a more favorable course of Crohn's disease (CD) directly (main effect hypothesis) and via moderating other prognostic factors (buffer hypothesis). METHODS: Within a multicenter cohort study we observed 597 adults with CD for 18 months. We assessed social support using the ENRICHD Social Support Inventory. Flares, nonresponse to therapy, complications, and extraintestinal manifestations were recorded as a combined endpoint indicating disease deterioration. We controlled for several demographic, psychosocial, and clinical variables of potential prognostic importance. We used multivariate binary logistic regression to estimate the overall effect of social support on the odds of disease deterioration and to explore main and moderator effects of social support by probing interactions with other predictors. RESULTS: The odds of disease deterioration decreased by 1.5 times (95% confidence interval [CI]: 1.2-1.9) for an increase of one standard deviation (SD) of social support. In case of low body mass index (BMI) (i.e., 1 SD below the mean or <19 kg/m²), the odds decreased by 1.8 times for an increase of 1 SD of social support. In case of low social support, the odds increased by 2.1 times for a decrease of 1 SD of BMI. Low BMI was not predictive under high social support. CONCLUSIONS: The findings suggest that elevated social support may favorably affect the clinical course of CD, particularly in patients with low BMI. (Inflamm Bowel Dis 2010;).
Abstract:
The use of simple and multiple correspondence analysis is well-established in social science research for understanding relationships between two or more categorical variables. By contrast, canonical correspondence analysis, which is a correspondence analysis with linear restrictions on the solution, has become one of the most popular multivariate techniques in ecological research. Multivariate ecological data typically consist of frequencies of observed species across a set of sampling locations, as well as a set of observed environmental variables at the same locations. In this context the principal dimensions of the biological variables are sought in a space that is constrained to be related to the environmental variables. This restricted form of correspondence analysis has many uses in social science research as well, as is demonstrated in this paper. We first illustrate the result that canonical correspondence analysis of an indicator matrix, restricted to be related to an external categorical variable, reduces to a simple correspondence analysis of a set of concatenated (or stacked) tables. Then we show how canonical correspondence analysis can be used to focus on, or partial out, a particular set of response categories in sample survey data. For example, the method can be used to partial out the influence of missing responses, which usually dominate the results of a multiple correspondence analysis.
Abstract:
In the analysis of multivariate categorical data, typically the analysis of questionnaire data, it is often advantageous, for substantive and technical reasons, to analyse a subset of response categories. In multiple correspondence analysis, where each category is coded as a column of an indicator matrix or a row and column of the Burt matrix, it is not correct to simply analyse the corresponding submatrix of data, since the whole geometric structure is different for the submatrix. A simple modification of the correspondence analysis algorithm allows the overall geometric structure of the complete data set to be retained while calculating the solution for the selected subset of points. This strategy is useful for analysing patterns of response amongst any subset of categories and relating these patterns to demographic factors, especially for studying patterns of particular responses such as missing and neutral responses. The methodology is illustrated using data from the International Social Survey Program on Family and Changing Gender Roles in 1994.
Abstract:
The generalization of simple (two-variable) correspondence analysis to more than two categorical variables, commonly referred to as multiple correspondence analysis, is neither obvious nor well-defined. We present two alternative ways of generalizing correspondence analysis, one based on the quantification of the variables and intercorrelation relationships, and the other based on the geometric ideas of simple correspondence analysis. We propose a version of multiple correspondence analysis, with adjusted principal inertias, as the method of choice for the geometric definition, since it contains simple correspondence analysis as an exact special case, which is not the situation of the standard generalizations. We also clarify the issue of supplementary point representation and the properties of joint correspondence analysis, a method that visualizes all two-way relationships between the variables. The methodology is illustrated using data on attitudes to science from the International Social Survey Program on Environment in 1993.
Abstract:
This paper extends previous research [1] on the use of multivariate continuous data in comparative handwriting examinations, notably for gender classification. A database has been constructed by analyzing the contour shape of loop characters of type a and d by means of Fourier analysis, which allows characters to be described in a global way by a set of variables (e.g., Fourier descriptors). Sample handwritings were collected from right- and left-handed female and male writers. The results reported in this paper provide further arguments in support of the view that investigative settings in forensic science represent an area of application for which the Bayesian approach offers a logical framework. In particular, the Bayes factor is computed for settings that focus on inference of gender and handedness of the author of an incriminated handwritten text. An emphasis is placed on comparing the efficiency for investigative purposes of characters a and d.
Abstract:
The objective of this work was to characterize morphologically and molecularly the genetic diversity of cassava accessions, collected from different regions in Brazil. A descriptive analysis was made for 12 morphological traits in 419 accessions. Data was transformed into binary data for cluster analysis and analysis of molecular variance. A higher proportion of white or cream (71%) root cortex color was found, while flesh colors were predominantly white (49%) and cream (42%). Four accession groups were classified by the cluster analysis, but they were not grouped according to their origin, which indicates that diversity is not structured in space. The variation was greater within regions (95.6%). Sixty genotypes were also evaluated using 14 polymorphic microsatellite markers. Molecular results corroborated the morphological ones, showing the same random distribution of genotypes, with no grouping according to origin. Diversity indices were high for each region, and a greater diversity was found within regions, with: a mean number of alleles per locus of 3.530; observed and expected heterozygosity of 0.499 and 0.642, respectively; and Shannon index of 1.03. The absence of spatial structure among cassava genotypes according to their origins shows the anthropic influence in the distribution and movement of germplasm, both within and among regions.
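For readers unfamiliar with the diversity indices reported above, here is a minimal per-locus sketch using their textbook definitions: expected heterozygosity He = 1 − Σ p_i² and Shannon index I = −Σ p_i ln p_i, where p_i are allele frequencies. The abstract does not say which estimators or software the authors used, so these formulas are an assumption, and the allele counts below are hypothetical.

```python
import math


def allele_frequencies(allele_counts):
    """Convert a mapping of allele -> count at one locus into frequencies."""
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no alleles counted at this locus")
    return [count / total for count in allele_counts.values()]


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), without small-sample correction."""
    return 1.0 - sum(p * p for p in freqs)


def shannon_index(freqs):
    """I = -sum(p_i * ln p_i); alleles with zero frequency contribute nothing."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


# Hypothetical allele counts at a single microsatellite locus.
locus_counts = {"152bp": 40, "156bp": 50, "160bp": 30}
freqs = allele_frequencies(locus_counts)
he = expected_heterozygosity(freqs)
shannon = shannon_index(freqs)
```

Averaging such per-locus values over all markers would give summary figures of the same kind as those quoted in the abstract, under the stated assumptions.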