913 results for acoustic sensor data analysis
Abstract:
A four-parameter extension of the generalized gamma distribution capable of modelling a bathtub-shaped hazard rate function is defined and studied. The beauty and importance of this distribution lie in its ability to model monotone and non-monotone failure rate functions, which are quite common in lifetime data analysis and reliability. The new distribution has a number of well-known lifetime special sub-models, such as the exponentiated Weibull, exponentiated generalized half-normal, exponentiated gamma and generalized Rayleigh, among others. We derive two infinite sum representations for its moments. We calculate the density of the order statistics and two expansions for their moments. The method of maximum likelihood is used for estimating the model parameters and the observed information matrix is obtained. Finally, a real data set from the medical area is analysed.
Abstract:
Joint generalized linear models and double generalized linear models (DGLMs) were designed to model outcomes for which the variability can be explained using factors and/or covariates. When such factors operate, the usual normal regression models, which inherently exhibit constant variance, will under-represent variation in the data and hence may lead to erroneous inferences. For count and proportion data, such noise factors can generate a so-called overdispersion effect, and the use of binomial and Poisson models underestimates the variability and, consequently, incorrectly indicates significant effects. In this manuscript, we propose a DGLM from a Bayesian perspective, focusing on the case of proportion data, where the overdispersion can be modeled using a random effect that depends on some noise factors. The posterior joint density function was sampled using Markov chain Monte Carlo algorithms, allowing inferences over the model parameters. An application to a data set on apple tissue culture is presented, for which it is shown that the Bayesian approach is quite feasible, even when limited prior information is available, thereby generating valuable insight for researchers about their experimental results.
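The overdispersion effect described in this abstract can be illustrated with a small simulation. This is only a sketch under assumed values, not the authors' DGLM or their Bayesian sampler: the function names `simulate` and `dispersion`, the logit-scale predictor `eta`, the number of trials `m` and the random-effect standard deviation are all hypothetical. It compares the Pearson dispersion statistic of pure binomial counts with that of counts whose success probability carries a normal random effect on the logit scale.

```python
import math
import random


def simulate(sd, n_units=2000, m=20, eta=-0.5, seed=0):
    """Simulate binomial counts with a normal random effect on the logit scale.

    sd = 0 gives plain binomial data; sd > 0 adds unit-level noise.
    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_units):
        u = rng.gauss(0.0, sd)                       # unit-level random effect
        p = 1.0 / (1.0 + math.exp(-(eta + u)))       # inverse logit
        counts.append(sum(rng.random() < p for _ in range(m)))
    return counts


def dispersion(counts, m=20):
    """Pearson chi-square divided by its degrees of freedom.

    Values near 1 are consistent with binomial variance;
    values well above 1 indicate overdispersion.
    """
    n = len(counts)
    p_hat = sum(counts) / (m * n)
    var_binom = m * p_hat * (1.0 - p_hat)
    pearson = sum((y - m * p_hat) ** 2 for y in counts) / var_binom
    return pearson / (n - 1)


phi_plain = dispersion(simulate(sd=0.0))   # no random effect
phi_over = dispersion(simulate(sd=1.0))    # with random effect
print(f"binomial only: {phi_plain:.2f}, with random effect: {phi_over:.2f}")
```

A dispersion statistic well above 1 in the second case is the situation in which a plain binomial model would understate the variability and overstate the significance of effects.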
Abstract:
In an investigation intended to determine training needs of night crews, Bowers et al. (1998, this issue) report two studies showing that the patterning of communication is a better discriminator of good and poor crews than is the content of communication. Bowers et al. characterize their studies as intended to generate hypotheses for training needs and draw connections with Exploratory Sequential Data Analysis (ESDA). Although applauding the intentions of Bowers et al., we point out some concerns with their characterization and implementation of ESDA. Our principal concern is that the Bowers et al. exploration of the data does not convincingly lead them back to a better fundamental understanding of the original phenomena they are investigating.
Abstract:
This paper develops an interactive approach for exploratory spatial data analysis. Measures of attribute similarity and spatial proximity are combined in a clustering model to support the identification of patterns in spatial information. Relationships between the developed clustering approach, spatial data mining and choropleth display are discussed. Analysis of property crime rates in Brisbane, Australia is presented. A surprising finding in this research is that there are substantial inconsistencies in standard choropleth display options found in two widely used commercial geographical information systems, both in terms of definition and performance. The comparative results demonstrate the usefulness and appeal of the developed approach in a geographical information system environment for exploratory spatial data analysis.
Abstract:
Disease resistance is associated with a plant defense response that involves an integrated set of signal transduction pathways. Changes in the expression patterns of 2,375 selected genes were examined simultaneously by cDNA microarray analysis in Arabidopsis thaliana after inoculation with an incompatible fungal pathogen Alternaria brassicicola or treatment with the defense-related signaling molecules salicylic acid (SA), methyl jasmonate (MJ), or ethylene. Substantial changes (up- and down-regulation) in the steady-state abundance of 705 mRNAs were observed in response to one or more of the treatments, including known and putative defense-related genes and 106 genes with no previously described function or homology. In leaf tissue inoculated with A. brassicicola, the abundance of 168 mRNAs was increased more than 2.5-fold, whereas that of 39 mRNAs was reduced. Similarly, the abundance of 192, 221, and 55 mRNAs was highly (>2.5-fold) increased after treatment with SA, MJ, and ethylene, respectively. Data analysis revealed a surprising level of coordinated defense responses, including 169 mRNAs regulated by multiple treatments/defense pathways. The largest number of genes coinduced (one of four induced genes) and corepressed was found after treatments with SA and MJ. In addition, 50% of the genes induced by ethylene treatment were also induced by MJ treatment. These results indicated the existence of a substantial network of regulatory interactions and coordination occurring during plant defense among the different defense signaling pathways, notably between the salicylate and jasmonate pathways that were previously thought to act in an antagonistic fashion.
Abstract:
The stock market is subject to uncertain relations throughout the entire negotiation process, with different variables exerting direct and indirect influence on stock prices. This study focuses on the analysis of certain aspects that may influence these values offered by the capital market, based on the Brazil Index of the Sao Paulo Stock Exchange (Bovespa), which selects 100 stocks among the most traded on Bovespa in terms of number of trades and financial volume. The selected variables are characterized by the companies' activity area and the business volume in the month of data collection, i.e. April 2007. This article proposes an analysis that joins the accounting view of the stock price variables that can be influenced with the use of multivariate qualitative data analysis. Data were explored through Correspondence Analysis (Anacor) and Homogeneity Analysis (Homals). According to the research, the selected variables are associated with the values presented by the stocks, which become an internal control instrument and a decision-making tool when it comes to choosing investments.
Abstract:
Objective: To compare rates of self-reported use of health services between rural, remote and urban South Australians. Methods: Secondary data analysis from a population-based survey to assess health and well-being, conducted in South Australia in 2000. In all, 2,454 adults were randomly selected and interviewed using the computer-assisted telephone interview (CATI) system. We analysed health service use by Accessibility and Remoteness Index of Australia (ARIA) category. Results: There was no statistically significant difference in the median number of uses of the four types of health services studied across ARIA categories. Significantly fewer residents of highly accessible areas reported never using primary care services (14.4% vs. 22.2% in very remote areas), and significantly more reported high use (≥6 visits, 29.3% vs. 21.5%). Fewer residents of remote areas reported never attending hospital (65.6% vs. 73.8% in highly accessible areas). Frequency of use of mental health services was not statistically significantly different across ARIA categories. Very remote residents were more likely to spend at least one night in a public hospital (15.8%) than were residents of other areas (e.g. 5.9% for highly accessible areas). Conclusion: The self-reported frequency of use of a range of health services in South Australia was broadly similar across ARIA categories. However, use of primary care services was higher among residents of highly accessible areas and public hospital use increased with increasing remoteness. There is no evidence for systematic rural disadvantage in terms of self-reported health service utilisation in this State.
Abstract:
Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44: 2, 571-578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure. Naive implementation of the procedure can lead to computationally inefficient results. To reduce the computational cost a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that with a sufficient number of bins and data points it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.
Abstract:
Objectives: The purpose of this article is to identify differences between surveys using paper and online questionnaires. The author has extensive experience with opinion questions in survey-based research, e.g. the limits of postal and online questionnaires. Methods: Paper and online questionnaires were used in the physician studies carried out in 1995 (doctors graduated in 1982-1991), 2000 (doctors graduated in 1982-1996), 2005 (doctors graduated in 1982-2001) and 2011 (doctors graduated in 1977-2006), and in a study of 457 family doctors in 2000. The response rates were 64%, 68%, 64%, 49% and 73%, respectively. Results: The physician studies showed differences between the methods. These differences were connected with the use of paper-based versus online questionnaires and with the response rate. The online survey gave a lower response rate than the postal survey. The major advantages of the online survey were a short response time, very low financial resource needs, and direct loading of the data into the data analysis software, which saved the time and resources associated with the data entry process. Conclusions: The article helps researchers plan the study design and choose the right data collection method.
Abstract:
This work introduces the concepts of fuzzy logic in system control, here in the field of autonomous robotics, and frames the use of fuzzy controllers in that field. An AGV (Autonomous Guided Vehicle) was developed from scratch in order to implement the fuzzy controller and test its performance. Since improvements and/or further developments are planned, a modular system was chosen in which each module is responsible for a specific task. In this work there are three modules, responsible for speed control, for sensor data acquisition and, finally, for the system's fuzzy controller. After the fuzzy controller was implemented, tests were carried out to validate the system, during which the sensor data were collected and logged while the robot operated normally. These data allowed a better analysis of the robot's performance. The results show that fuzzy logic yields smoother transitions between decisions, and that increasing the number of rules makes the system smoother still. Fuzzy logic thus proves to be a useful and functional tool for control applications. A disadvantage is the amount of data associated with the implementation, such as the universes of discourse, the membership functions and the rules. Increasing the number of control rules also increases the number of membership functions considered for each linguistic variable; this raises the memory required and the implementation complexity, because of the amount of data that must be handled.
The greatest difficulty in designing a fuzzy controller lies in defining the linguistic variables through their universes of discourse and membership functions, since the initial definition may not be the most suitable for the control context; tests and subsequent changes to the membership function definitions then become necessary to improve system performance. All of these aspects are addressed in the development of the AGV, and the corresponding results are presented and analysed.
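The ingredients this abstract names (a universe of discourse, membership functions and rules) can be sketched in a few lines. This is a minimal illustration, not the controller built for the AGV: the single input (obstacle distance in cm), the three linguistic terms, their breakpoints and the speed values in `RULES` are all hypothetical, and defuzzification uses a simple weighted average of singleton outputs.

```python
def triangle(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def shoulder_left(x, b, c):
    """Full membership up to b, falling linearly to 0 at c."""
    if x <= b:
        return 1.0
    if x >= c:
        return 0.0
    return (c - x) / (c - b)


def shoulder_right(x, a, b):
    """Zero membership up to a, rising linearly to 1 at b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def fuzzify(distance_cm):
    """Map a distance to degrees of membership (hypothetical breakpoints)."""
    return {
        "near": shoulder_left(distance_cm, 10.0, 40.0),
        "medium": triangle(distance_cm, 10.0, 40.0, 70.0),
        "far": shoulder_right(distance_cm, 40.0, 70.0),
    }


# Hypothetical rule base: IF distance is <term> THEN speed is <value in %>.
RULES = {"near": 0.0, "medium": 50.0, "far": 100.0}


def speed_command(distance_cm):
    """Weighted average of the rule outputs, weighted by rule activation."""
    degrees = fuzzify(distance_cm)
    total = sum(degrees.values())  # never 0: the three terms cover the whole axis
    return sum(degrees[term] * RULES[term] for term in degrees) / total


for d in (5, 25, 40, 55, 100):
    print(d, speed_command(d))
```

Because adjacent membership functions overlap, the commanded speed changes gradually with distance instead of jumping between fixed levels, which is the smoothness the abstract refers to. Each additional term adds breakpoints and a rule to store, in line with the memory cost it mentions.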
Abstract:
Research on cluster analysis for categorical data continues to develop, with new clustering algorithms being proposed. However, in this context, the determination of the number of clusters is rarely addressed. We propose a new approach in which clustering and the estimation of the number of clusters are done simultaneously for categorical data. We assume that the data originate from a finite mixture of multinomial distributions and use a minimum message length (MML) criterion to select the number of clusters (Wallace and Bolton, 1986). For this purpose, we implement an EM-type algorithm (Silvestre et al., 2008) based on the approach of Figueiredo and Jain (2002). The novelty of the approach rests on the integration of the model estimation and selection of the number of clusters in a single algorithm, rather than selecting this number based on a set of pre-estimated candidate models. The performance of our approach is compared with the use of the Bayesian Information Criterion (BIC) (Schwarz, 1978) and the Integrated Completed Likelihood (ICL) (Biernacki et al., 2000) using synthetic data. The obtained results illustrate the capacity of the proposed algorithm to attain the true number of clusters while outperforming BIC and ICL since it is faster, which is especially relevant when dealing with large data sets.
Abstract:
Catastrophic events, such as wars and terrorist attacks, tornadoes and hurricanes, earthquakes, tsunamis, floods and landslides, are always accompanied by a large number of casualties. The size distribution of these casualties has separately been shown to follow approximate power law (PL) distributions. In this paper, we analyze the statistical distributions of the number of victims of catastrophic phenomena, in particular, terrorism, and find double PL behavior. This means that the data sets are better approximated by two PLs instead of a single one. We plot the PL parameters, corresponding to several events, and observe an interesting pattern in the charts, where the lines that connect each pair of points defining the double PLs are almost parallel to each other. A complementary data analysis is performed by means of the computation of the entropy. The results reveal relationships hidden in the data that may trigger a future comprehensive explanation of this type of phenomena.
Abstract:
Final Master's project report submitted for the degree of Master in Electronics and Telecommunications Engineering