822 results for discriminant analysis and cluster analysis
Abstract:
Using NCANDS data of US child maltreatment reports for 2009, logistic regression, probit analysis, discriminant analysis and an artificial neural network are used to determine the factors which explain the decision to place a child in out-of-home care. As well as developing a new model for 2009, a previous study using 2005 data is replicated. While there are many small differences, the four estimation techniques give broadly the same results, demonstrating the robustness of the results. Similarly, apart from age and sexual abuse, the 2005 and 2009 results are roughly similar. For 2009, child characteristics (particularly child emotional problems) are more important than the nature of the abuse and the situation of the household; while caregiver characteristics are the least important. All these models have low explanatory power.
Abstract:
In recent years, the area of data mining has been experiencing considerable demand for technologies that extract knowledge from large and complex data sources. There has been substantial commercial interest, as well as active research in the area, that aims to develop new and improved approaches for extracting information, relationships, and patterns from large datasets. Artificial neural networks (NNs) are popular biologically-inspired intelligent methodologies, whose classification, prediction, and pattern recognition capabilities have been utilized successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights from a data mining perspective the implementation of NN, using supervised and unsupervised learning, for pattern recognition, classification, prediction, and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks. © 2012 Wiley Periodicals, Inc.
Abstract:
Modeling aging and age-related pathologies presents a substantial analytical challenge given the complexity of gene-environment influences and interactions operating on an individual. A top-down systems approach is used to model the effects of lifelong caloric restriction, which is known to extend life span in several animal models. The metabolic phenotypes of caloric-restricted (CR; n = 24) and pair-housed control-fed (CF; n = 24) Labrador Retriever dogs were investigated by use of orthogonal projection to latent structures discriminant analysis (OPLS-DA) to model both generic and age-specific responses to caloric restriction from the 1H NMR blood serum profiles of young and older dogs. Three aging metabolic phenotypes were resolved: (i) an aging metabolic phenotype independent of diet, characterized by high levels of glutamine, creatinine, methylamine, dimethylamine, trimethylamine N-oxide, and glycerophosphocholine and decreasing levels of glycine, aspartate, creatine and citrate indicative of metabolic changes associated largely with muscle mass; (ii) an aging metabolic phenotype specific to CR dogs that consisted of relatively lower levels of glucose, acetate, choline, and tyrosine and relatively higher serum levels of phosphocholine with increased age in the CR population; (iii) an aging metabolic phenotype specific to CF dogs including lower levels of lipoprotein fatty acyl groups and allantoin and relatively higher levels of formate with increased age in the CF population. There was no diet metabotype that consistently differentiated the CF and CR dogs irrespective of age. Glucose consistently discriminated between feeding regimes in dogs (≥312 weeks), being relatively lower in the CR group. However, it was observed that creatine and amino acids (valine, leucine, isoleucine, lysine, and phenylalanine) were lower in the CR dogs (<312 weeks), suggestive of differences in energy source utilization.
1H NMR spectroscopic analysis of longitudinal serum profiles enabled an unbiased evaluation of the metabolic markers modulated by a lifetime of caloric restriction and showed differences in the metabolic phenotype of aging due to caloric restriction, which contributes to longevity studies in caloric-restricted animals. Furthermore, OPLS-DA provided a framework such that significant metabolites relating to life extension could be differentiated and integrated with aging processes.
Abstract:
Various complex oscillatory processes are involved in the generation of the motor command. The temporal dynamics of these processes were studied for movement detection from single-trial electroencephalogram (EEG). Autocorrelation analysis was performed on the EEG signals to find robust markers of movement detection. The evolution of the autocorrelation function was characterised via the relaxation time of the autocorrelation by exponential curve fitting. It was observed that the decay constant of the exponential curve increased during movement, indicating that the autocorrelation function decays slowly during motor execution. Significant differences were observed between movement and no-movement tasks. Additionally, a linear discriminant analysis (LDA) classifier was used to identify movement trials with a peak accuracy of 74%.
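As a rough illustration of the relaxation-time feature described in this abstract, the Python sketch below estimates a normalised autocorrelation function and fits an exponential decay to it. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the log-linear least-squares fit through the origin, and the simulated AR(1) signals standing in for EEG are all hypothetical choices made for this example.

```python
import math
import random


def autocorrelation(signal, max_lag):
    """Normalised sample autocorrelation for lags 0..max_lag."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]
    var = sum(c * c for c in centred)
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag)) / var
        for lag in range(max_lag + 1)
    ]


def relaxation_time(acf):
    """Fit acf[lag] ~ exp(-lag / tau) and return tau (in samples).

    The fit is a least-squares line through the origin on log(acf),
    using only the initial run of strictly positive values, because
    the logarithm is undefined once the autocorrelation crosses zero.
    """
    lags, logs = [], []
    for lag, value in enumerate(acf):
        if value <= 0:
            break
        lags.append(lag)
        logs.append(math.log(value))
    if len(lags) < 2:
        raise ValueError("need at least two positive autocorrelation values")
    slope = sum(l * y for l, y in zip(lags, logs)) / sum(l * l for l in lags)
    return -1.0 / slope


def simulate_ar1(phi, n, rng):
    """Hypothetical stand-in for an EEG trace: a first-order autoregressive signal."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    for phi in (0.5, 0.9):
        acf = autocorrelation(simulate_ar1(phi, 2000, rng), max_lag=10)
        tau = relaxation_time(acf)
        print(f"phi={phi}: estimated relaxation time = {tau:.2f} samples")
```

A larger relaxation time means a more slowly decaying autocorrelation, which is the direction of change the abstract reports during movement. A per-trial value of this kind could then serve as an input feature to a classifier such as LDA.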
Abstract:
The launch of the Double Star mission has provided the opportunity to monitor events at distinct locations on the dayside magnetopause, in coordination with the quartet of Cluster spacecraft. We present results of two such coordinated studies. In the first, on 6 April 2004, both Cluster and the Double Star TC-1 spacecraft were on outbound transits through the dawn-side magnetosphere. Cluster observed northward moving FTEs with +/- polarity, whereas TC-1 saw -/+ polarity FTEs. The strength, motion and occurrence of the FTE signatures change somewhat according to changes in IMF clock angle. These observations are consistent with ongoing reconnection on the dayside magnetopause, resulting in a series of flux transfer events (FTEs) seen both at Cluster and TC-1. The observed polarity and motion of each FTE signature support the existence of an active reconnection region consistently located between the positions of Cluster and TC-1, lying north and south of the reconnection line, respectively. This scenario is supported by the application of a model, designed to track flux tube motion, to the prevailing interplanetary conditions. The results from the model confirm the observational evidence that the low-latitude FTE dynamics are sensitive to changes in convected upstream conditions. In particular, changing the interplanetary magnetic field (IMF) clock angle in the model predicts that TC-1 should miss the resulting FTEs more often than Cluster, as is observed. For the second conjunction, on 4 January 2005, the Cluster and TC-1 spacecraft all exited the dusk-side magnetosphere almost simultaneously, with TC-1 lying almost equatorial and Cluster at northern latitudes at about 4 RE from TC-1. The spacecraft traverse the magnetopause during a strong reversal in the IMF from northward to southward and a number of magnetosheath FTE signatures are subsequently observed.
One coordinated FTE, studied in detail by Pu et al. [this issue], carries an inflowing energetic electron population and shows a motion and orientation which is similar at all spacecraft and consistent with the predictions of the model for the flux tube dynamics, given a near sub-solar reconnection line. This event can be interpreted either as the passage of two parallel flux tubes arising from adjacent x-line positions, or as a crossing of a single flux tube at different positions.
Abstract:
ESA’s first multi-satellite mission Cluster is unique in its concept of 4 satellites orbiting in controlled formations. This will give an unprecedented opportunity to study the structure and dynamics of the magnetosphere. In this paper we discuss ways in which ground-based remote-sensing observations of the ionosphere can be used to support the multipoint in-situ satellite measurements. There are a very large number of potentially useful configurations between the satellites and any one ground-based observatory; however, the number of ideal occurrences for any one configuration is low. Many of the ground-based instruments cannot operate continuously and Cluster will take data only for a part of each orbit, depending on how much high-resolution (‘burst-mode’) data are acquired. In addition, there are a great many instrument modes, as well as the formation, size and shape of the cluster of four satellites, to consider. These circumstances create a clear and pressing need for careful planning to ensure that the scientific return from Cluster is maximised by additional coordinated ground-based observations. For this reason, ESA established a working group to coordinate the observations on the ground with Cluster. We give a number of examples of how the combined spacecraft and ground-based observations can address outstanding questions in magnetospheric physics. An online computer tool has been prepared to allow for the planning of conjunctions and advantageous constellations between the Cluster spacecraft and individual or combined ground-based systems. During the mission a ground-based database containing index and summary data will help to identify interesting datasets and allow intervals to be selected for coordinated studies. We illustrate the philosophy of our approach, using a few important examples of the many possible configurations between the satellites and the ground-based instruments.
Abstract:
The authors examine the housing pathways of young people in the UK in the years 1999 to 2008, and consider the changing nature of these pathways in the run-up to 2020. They employ a highly innovative methodology, which begins with the identification and description of key drivers likely to affect young people’s housing circumstances in the future. The empirical identification and analysis of housing pathways is then achieved using multiple-sequence analysis and cluster analysis of the British Household Panel Survey, contextualised by qualitative interviews with a large sample of young people. The authors describe how the interactions between the meanings, perceptions, and aspirations of young people, and the opportunities and constraints imposed by the drivers, are having a major impact on young people’s housing pathways, resulting in considerable housing policy challenges, particularly in relation to the private rented sector.
Abstract:
In this paper, a novel statistical test is introduced to compare two locally stationary time series. The proposed approach is a Wald test considering time-varying autoregressive modeling and function projections in adequate spaces. The covariance structure of the innovations may also be time-varying. In order to obtain function estimators for the time-varying autoregressive parameters, we consider function expansions in spline and wavelet bases. Simulation studies provide evidence that the proposed test performs well. We also assess its usefulness when applied to a financial time series.
Abstract:
Purpose. - This study investigates the influence of age at onset of OCS on psychiatric comorbidities, and tries to establish a cut-off point for age at onset. Methods. - Three hundred and thirty OCD patients were consecutively recruited and interviewed using the following structured interviews: Yale-Brown Obsessive Compulsive Scale; Yale Global Tic Severity Scale and the Structured Clinical Interview for DSM-IV. Data were analyzed with regression and cluster analysis. Results. - Lower age at onset was associated with a higher probability of having comorbidity with tic, anxiety, somatoform, eating and impulse-control disorders. Longer illness duration was associated with a lower chance of having tics. Female gender was associated with anxiety, eating and impulse-control disorders. Tic disorders were associated with anxiety disorders and attention-deficit/hyperactivity disorder. No cut-off age at onset was found to clearly divide the sample into homogeneous subgroups. However, cluster analyses revealed that differences started to emerge at the age of 10 and were more pronounced at the age of 17, suggesting that these were the best cut-off points for this sample. Conclusions. - Age at onset is associated with specific comorbidity patterns in OCD patients. More prominent differences are obtained when analyzing age at onset as an absolute value. © 2008 Elsevier Masson SAS. All rights reserved.
Abstract:
In this paper we show the results of a comparative simulation study of three classification techniques: Multinomial Logistic Regression (MLR), Nonmetric Discriminant Analysis (NDA) and Linear Discriminant Analysis (LDA). The measure used to compare the performance of the three techniques was the Error Classification Rate (ECR). We found that the MLR and LDA techniques have similar performance and that they are better than NDA when the population multivariate distribution is Normal or Logit-Normal. For the case of log-normal and Sinh(-1)-normal multivariate distributions we found that MLR had the best performance.
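To make the comparison criterion concrete, the sketch below computes an error classification rate for a deliberately simplified classifier: a one-dimensional LDA with pooled variance and empirical priors. It is only an illustration under stated assumptions. The abstract's study is multivariate and also covers MLR and NDA, which are not implemented here, and all function names and the toy data are hypothetical.

```python
import math


def fit_lda_1d(xs, ys):
    """Estimate class means, priors and the pooled variance for 1-D LDA."""
    classes = sorted(set(ys))
    n = len(xs)
    means, priors = {}, {}
    within_ss = 0.0
    for c in classes:
        members = [x for x, y in zip(xs, ys) if y == c]
        means[c] = sum(members) / len(members)
        priors[c] = len(members) / n
        within_ss += sum((x - means[c]) ** 2 for x in members)
    pooled_var = within_ss / (n - len(classes))
    return means, priors, pooled_var


def predict_lda_1d(model, x):
    """Assign x to the class with the largest linear discriminant score."""
    means, priors, var = model

    def score(c):
        return x * means[c] / var - means[c] ** 2 / (2 * var) + math.log(priors[c])

    return max(means, key=score)


def error_classification_rate(y_true, y_pred):
    """Fraction of observations assigned to the wrong class."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups on one variable.
    xs = [0.0, 1.0, 2.0, 8.0, 9.0, 10.0]
    ys = ["a", "a", "a", "b", "b", "b"]
    model = fit_lda_1d(xs, ys)
    preds = [predict_lda_1d(model, x) for x in xs]
    print("ECR on the training data:", error_classification_rate(ys, preds))
```

In a simulation study of the kind described, the rate would normally be computed on data not used for fitting, and averaged over many replications per distribution.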
Abstract:
Chagas disease is nowadays the most serious parasitic health problem. This disease is caused by Trypanosoma cruzi. The great number of deaths and the insufficient effectiveness of drugs against this parasite have alarmed the scientific community worldwide. In an attempt to overcome this problem, a model for the design and prediction of new antitrypanosomal agents was obtained. This used a mixed approach, containing simple descriptors based on fragments and topological substructural molecular design descriptors. A data set was made up of 188 compounds, 99 of them with characterized antitrypanosomal activity and 88 compounds that belong to other pharmaceutical categories. The model showed sensitivity, specificity and accuracy values above 85%. Quantitative fragmental contributions were also calculated. Then, to confirm the quality of the model, 15 structures of molecules tested as antitrypanosomal compounds (that we did not include in this study) were predicted, taking into account the information on the abovementioned calculated fragmental contributions. The model showed an accuracy of 100%, which means that the "in silico" methodology developed by our team is promising for the rational design of new antitrypanosomal drugs. © 2009 Wiley Periodicals, Inc. J Comput Chem 31: 882-894, 2010
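For readers unfamiliar with the three quality measures quoted in this abstract, the following Python sketch shows how sensitivity, specificity and accuracy are derived from a binary confusion matrix. The function name and the example labels are hypothetical, and the numbers are not taken from the study.

```python
def classification_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels.

    Labels are truthy for the positive class (for example, active
    compounds) and falsy for the negative class.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    sensitivity = tp / (tp + fn)   # share of true positives recovered
    specificity = tn / (tn + fp)   # share of true negatives recovered
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    # Hypothetical labels: 1 = active, 0 = inactive.
    observed = [1, 1, 1, 0, 0]
    predicted = [1, 1, 0, 0, 1]
    print(classification_metrics(observed, predicted))
```

Reporting all three together matters when the classes are of unequal size, because accuracy alone can hide poor performance on the smaller class.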
Abstract:
Hazard models, also known as time-to-failure or duration models, are used to determine which independent variables have the greatest explanatory power in predicting corporate bankruptcy. They are an alternative approach to the binary logit and probit models and to discriminant analysis. Duration models should be more efficient than discrete-choice models because they take survival time into account when estimating the instantaneous probability of bankruptcy of a set of observations on an independent variable. Discrete-choice models typically ignore the time-to-failure information and provide only an estimate of failing within a given time interval. The question discussed in this work is how to use hazard models to project default rates and build migration matrices conditioned on the state of the economy. Conceptually, the model is closely analogous to the historical default and mortality rates used in the credit literature. The Cox semiparametric proportional hazards model is tested on Brazilian companies outside the financial sector, and the probability of default is observed to fall markedly after the third year from loan issuance. The mean and standard deviation of the default probabilities are also observed to be affected by economic cycles. The work discusses how the Cox proportional hazards model can be incorporated into the four best-known credit risk management models in use today (CreditRisk+, KMV, CreditPortfolio View and CreditMetrics), and the improvements resulting from this incorporation.
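The abstract does not give estimation details, so the sketch below only illustrates the general idea behind a Cox proportional hazards model: the hazard is a baseline hazard multiplied by exp(beta * x), and beta is chosen to maximise a partial likelihood that depends on who is still at risk at each event time. It assumes a single covariate and handles tied event times in the simple Breslow style; the function name and the toy data are hypothetical.

```python
import math


def cox_partial_log_likelihood(beta, times, events, covariates):
    """Cox partial log-likelihood for one covariate.

    times      : observed time for each subject (event or censoring)
    events     : 1 if the event (e.g. default) was observed, 0 if censored
    covariates : one covariate value per subject
    """
    log_lik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        # Risk set: every subject still under observation at time t_i.
        risk_sum = sum(
            math.exp(beta * covariates[j])
            for j, t_j in enumerate(times)
            if t_j >= t_i
        )
        log_lik += beta * covariates[i] - math.log(risk_sum)
    return log_lik


if __name__ == "__main__":
    # Hypothetical data: three firms, the third censored at t = 3.
    times = [1.0, 2.0, 3.0]
    events = [1, 1, 0]
    leverage = [0.5, -1.0, 2.0]
    for beta in (-1.0, 0.0, 1.0):
        value = cox_partial_log_likelihood(beta, times, events, leverage)
        print(f"beta={beta:+.1f}: partial log-likelihood = {value:.4f}")
```

Unlike a logit or probit fit, this objective uses the ordering of failure times, which is the extra information the abstract argues discrete-choice models discard.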
Abstract:
This work primarily investigates the validity of the Illinois Test of Psycholinguistic Abilities (ITPA), an instrument for assessing children's language development. Its authors, S. Kirk and J.J. McCarthy (1961), use the theoretical framework proposed by C. Osgood (1957), incorporating into it the model derived from Information Theory, which allows the ITPA to be included in the psychodiagnostic process in clinical practice as an instrument for assessing communication problems in children between three and ten years of age. The objectives that guide the work can be defined at three levels: 1) the constructs and their interrelations, that is, a critical analysis of the theoretical validity of the ITPA; 2) its status as a diagnostic instrument of school performance, that is, its discriminant sensitivity to academic achievement; 3) the effectiveness of the psychopedagogical practice proposed by the instrument. The study of theoretical validity was carried out with 931 children between three and ten years of age who were in schooling, attending day-care centres, kindergartens or regular classes of the primary education network (Rede de Ensino do Primeiro Grau) in the Municipality of Rio de Janeiro. Factor analysis was used, complemented by a logical approach, and together these confirmed some of the dimensions proposed by the theoretical framework of Kirk and McCarthy. For diagnostic validity, 71 children with difficulties in academic performance, expressed as grades of insufficient or deficient achievement, were assessed, and their results were compared with those of a randomly constituted subgroup of children who had taken part in the previous study. Discriminant analysis was used, leading to the following conclusion: although the construct validity of the ITPA was not completely confirmed, at the diagnostic level the results make it possible to identify, with a low margin of error, the children belonging to one or the other of the contrast groups.
As for the third level, a broad literature review was carried out on investigations conducted with this instrument in Brazil and abroad. The aim was to assess the effectiveness of the psychopedagogical practice when it is developed in light of the intervention resources that the ITPA proposes. It was concluded that the research carried out to date is not sufficient to form a firmer judgment about the educational praxis aimed at rehabilitating children with communication problems, which are an impediment to their academic performance, given the controversies that such research presents.
Abstract:
User-generated content (UGC) in the travel industry is the phenomenon studied in this research, which aims to fill the literature gap on the drivers for writing reviews on TripAdvisor. The object of study is relevant from a managerial standpoint, since the motivators that drive users to co-create can shape strategies and be turned into external levers that generate value for brands through content production. From an academic perspective, the goal is to enhance the literature in the field and fill a gap on the adherence of local culture to UGC given the specificities of the industry structure. The business impact of UGC is supported by the fact that it increases e-commerce conversion rates: research undertaken by Ye, Law, Gu and Chen (2009) states that each 10% increase in traveler review ratings boosts online bookings by more than 5%. The literature review builds a theoretical framework of the concepts required to support the TripAdvisor case study methodology. The methodological approach combines quantitative and qualitative data from a literature review, desk research, an executive interview, and a user survey, which are analyzed using factor and cluster analysis to group users with similar drivers towards UGC. Additionally, cultural and country-specific aspects impact user behavior. The hospitality industry in Brazil is concentrated in the long tail (92% of hotels in Brazil are independent; Jones Lang LaSalle, 2015, p. 7), and lesser-known establishments take better advantage of reviews (according to Luca (2011), a one-star increase in Yelp rating raises independent restaurant revenue by 9%, whereas in chain restaurants the reviews have no effect). This dissertation therefore sought to understand UGC in the context of travelers from São Paulo (Brazil) and adopted the case of TripAdvisor to describe the incentives that drive users' co-creation among the targeted travelers.
The outcome is four clusters with different drivers for UGC, which enables the design of marketing strategies. The study also concludes that there is great potential to convert current content consumers into producers, and it highlights the continuing importance of friends and family referrals and the role played by incentives. Among the conclusions, this study led us to an exploration of the positive feedback and network effect concepts, a reinforcement of the relevance of UGC for long tail hotels, the interdependence across content production, consumption and participation, and the role played by technology allied with behavioral analysis in making effective decisions. The adherence of UGC to the hospitality industry also outlines the formulation of the concept present in the dissertation title, “Traveler-Generated Content”.