936 results for leave one out cross validation
Abstract:
This work studies natural ventilation and its relation to urban legislation and building types in an urban fraction of the coastal area of Praia do Meio, in the city of Natal/RN, and considers the type or types of land use most appropriate to this limited urban fraction. The objective is to analyse the effects of the present legislation, and of the types of buildings in this area, on natural ventilation. This urban fraction was selected because it is one of the sites from which the wind flows into the city of Natal. The research is based on the hypothesis that a reduction in the porosity of the urban soil (a decrease in the setback/boundary clearance) and an increase in building form (the height of the buildings) raise the level of the ventilation gradient, consequently reducing the wind speed at the lowest part of the buildings. Three-dimensional computational models were used to produce the modes of occupation allowed in the urban fraction under study, and Computational Fluid Dynamics (CFD) software was used to analyse these modes of land occupation. Following simulation, a statistical assessment was carried out to validate the hypothesis. It was concluded that the reduction in soil porosity, a consequence of the rates that define the minimum clearance between the building and the boundary of the plot (and consequently the setback), together with the increase in building form (the height of the buildings), caused a reduction in wind speed, thus creating heat islands.
Abstract:
The K-means algorithm is one of the most popular clustering algorithms in current use, as it is relatively fast yet simple to understand and deploy in practice. Nevertheless, its use entails certain restrictive assumptions about the data, the negative consequences of which are not always immediately apparent, as we demonstrate. While more flexible algorithms have been developed, their widespread use has been hindered by their computational and technical complexity. Motivated by these considerations, we present a flexible alternative to K-means that relaxes most of the assumptions, whilst remaining almost as fast and simple. This novel algorithm, which we call MAP-DP (maximum a-posteriori Dirichlet process mixtures), is statistically rigorous as it is based on nonparametric Bayesian Dirichlet process mixture modeling. This approach allows us to overcome most of the limitations imposed by K-means. The number of clusters K is estimated from the data instead of being fixed a-priori as in K-means. In addition, while K-means is restricted to continuous data, the MAP-DP framework can be applied to many kinds of data, for example, binary, count or ordinal data. Also, it can efficiently separate outliers from the data. This additional flexibility does not incur a significant computational overhead compared to K-means, with MAP-DP convergence typically achieved in the order of seconds for many practical problems. Finally, in contrast to K-means, since the algorithm is based on an underlying statistical model, the MAP-DP framework can deal with missing data and enables model testing such as cross validation in a principled way. We demonstrate the simplicity and effectiveness of this algorithm on the health informatics problem of clinical sub-typing in a cluster of diseases known as parkinsonism.
Abstract:
The Dirichlet process mixture model (DPMM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as Gibbs sampling are required. As a result, DPMM-based methods, which have considerable potential, are restricted to applications in which computational resources and time for inference are plentiful. For example, they would not be practical for digital signal processing on embedded hardware, where computational resources are at a serious premium. Here, we develop a simplified yet statistically rigorous approximate maximum a-posteriori (MAP) inference algorithm for DPMMs. This algorithm is as simple as DP-means clustering and solves the MAP problem as well as Gibbs sampling, while requiring only a fraction of the computational effort. (For freely available code that implements the MAP-DP algorithm for Gaussian mixtures see http://www.maxlittle.net/.) Unlike related small variance asymptotics (SVA), our method is non-degenerate and so inherits the “rich get richer” property of the Dirichlet process. It also retains a non-degenerate closed-form likelihood which enables out-of-sample calculations and the use of standard tools such as cross-validation. We illustrate the benefits of our algorithm on a range of examples and contrast it to variational, SVA and sampling approaches from both a computational complexity perspective as well as in terms of clustering performance. We demonstrate the wide applicability of our approach by presenting an approximate MAP inference method for the infinite hidden Markov model whose performance contrasts favorably with a recently proposed hybrid SVA approach. Similarly, we show how our algorithm can be applied to a semiparametric mixed-effects regression model where the random effects distribution is modelled using an infinite mixture model, as used in longitudinal progression modelling in population health science. Finally, we propose directions for future research on approximate MAP inference in Bayesian nonparametrics.
Abstract:
Our aim was to determine the normative reference values of cardiorespiratory fitness (CRF) and to establish the proportion of subjects with low CRF suggestive of future cardio-metabolic risk.
Abstract:
In this paper, we describe one of the approaches used in the participation of Universidade de Évora. Our approach is similar to the usual methods, in which text is preprocessed and features are extracted and then used in SVMs with cross validation. The main difference is that the features are averages of word embeddings, specifically word2vec vectors. Using the PAN 2016 dataset, we achieved 44.8% and 68.2% accuracy for English age and gender classification, respectively. We also achieved 51.3% and 67.1% accuracy for Spanish age and gender classification. Finally, we report 71.9% accuracy for Dutch age classification.
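The pipeline described above (average the word vectors of each document, then cross-validate a classifier) can be sketched roughly as follows. This is only an illustration under stated assumptions: the two-dimensional embeddings, the documents and the labels are made up, a nearest-centroid rule stands in for the SVM, and leave-one-out is used as the cross-validation scheme only because the abstract does not say which scheme the authors applied.

```python
from statistics import fmean

# Hypothetical 2-d "embeddings"; real word2vec vectors have hundreds of dimensions.
EMBEDDINGS = {
    "goal": [1.0, 0.0], "match": [0.9, 0.1], "team": [0.8, 0.2],
    "recipe": [0.0, 1.0], "oven": [0.1, 0.9], "flour": [0.2, 0.8],
}

def doc_vector(tokens):
    """Average the embeddings of the known tokens in a document."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    return [fmean(dim) for dim in zip(*vectors)]

def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (not an SVM): pick the class whose mean vector is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [v for v, y in zip(train_x, train_y) if y == label]
        centroid = [fmean(dim) for dim in zip(*members)]
        dist = sum((a - b) ** 2 for a, b in zip(centroid, x))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and score the held-out one."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)

# Made-up documents and labels, for illustration only.
docs = [["goal", "match"], ["team", "goal"], ["match", "team"],
        ["recipe", "oven"], ["flour", "recipe"], ["oven", "flour"]]
labels = ["sport", "sport", "sport", "food", "food", "food"]
features = [doc_vector(d) for d in docs]
accuracy = leave_one_out_accuracy(features, labels)
```

Leave-one-out trains one model per sample, so it is usually affordable only for small data sets; a k-fold split follows the same pattern with larger held-out groups.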
Abstract:
Forage species adapted to semi-arid conditions are an alternative for reducing the negative impacts of seasonal forage availability on the ruminant production chain in the Brazilian Northeast, as well as for reducing the cost of supplying concentrate feeds. Among these species, mesquite pods (Prosopis juliflora SW D.C.) and forage cactus (Opuntia and Nopalea) stand out because they tolerate water deficit and produce during periods when forage is scarce, have good nutritional value, and are well accepted by the animals. However, because their composition varies, using them in animal feeding requires detailed knowledge of that composition in order to formulate balanced diets. Owing to the cost and time required, farmers do not usually have the chemical composition of feeds analysed. Near-infrared reflectance spectroscopy (NIRS) is therefore an important alternative to traditional methods. The objective of this study was to develop and validate NIRS-based models for predicting the chemical composition of mesquite pods and forage cactus, scanned on two models of instrument and with different sample processing. Mesquite pod samples were collected in the states of Ceará, Bahia, Paraíba and Pernambuco, and forage cactus samples in the states of Ceará, Paraíba and Pernambuco, either fresh (as received) or pre-dried and ground. Spectra were obtained with two NIR instruments, a Perten DA 7250 and a FOSS 5000. The feeds were first scanned fresh on the Perten instrument, and The Unscrambler 10.2 software was used to select a group of samples for the calibration set, leaving out redundant ones. The selected samples were dried and ground, and scanned again on both the Perten and FOSS instruments. Reference values were obtained by methods traditionally applied in animal nutrition laboratories for dry matter (DM), mineral matter (MM), organic matter (OM), crude protein (CP), ether extract (EE), neutral detergent fiber (NDF), acid detergent fiber (ADF), hemicellulose (HEM) and in vitro digestibility of dry matter (DIVDM).
Model performance was evaluated using the root mean square error of calibration (RMSEC) and of cross-validation (RMSECV), the coefficient of determination (R2) and the ratio of performance to deviation (RPD). Exploratory data analysis, using spectral treatments and principal component analysis (PCA), showed that the data sets were similar to each other, which supported developing a single model per feed (mesquite pods and forage cactus) with all of the selected samples. The variation in the reference results for each parameter was consistent with that reported in the literature. Models developed with sample preprocessing (pre-drying and grinding) were more robust than those built with fresh samples. The Perten NIRS instrument performed similarly to the FOSS instrument, although the latter covers a wider spectral range with smaller reading intervals. NIR spectroscopy combined with multivariate calibration by partial least squares (PLS) regression proved reliable for predicting the chemical composition of mesquite pods and forage cactus.
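The performance figures named in this abstract can be computed along the following lines. This is a minimal sketch, not the authors' procedure: the numbers are invented, and RPD is assumed here to be the sample standard deviation of the reference values divided by the cross-validation error, which is a common convention but not the only one. RMSEC and RMSECV share the same formula; they differ only in whether the predictions come from the fitted calibration samples or from held-out cross-validation samples.

```python
from math import sqrt
from statistics import fmean, stdev

def rmse(reference, predicted):
    """Root mean square error between reference values and predictions."""
    return sqrt(fmean([(r - p) ** 2 for r, p in zip(reference, predicted)]))

def r_squared(reference, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_ref = fmean(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot

def rpd(reference, predicted):
    """Assumed definition: sample SD of the reference values over the RMSE."""
    return stdev(reference) / rmse(reference, predicted)

# Invented crude-protein values (% of DM) and cross-validated predictions.
reference = [10.0, 12.0, 14.0, 16.0]
cv_predicted = [11.0, 11.0, 15.0, 15.0]

rmsecv = rmse(reference, cv_predicted)       # every error is 1 unit in size
r2 = r_squared(reference, cv_predicted)
rpd_value = rpd(reference, cv_predicted)
```

Passing the calibration-set fits instead of `cv_predicted` to the same `rmse` function would give RMSEC.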
Abstract:
Sound radiators based on forced vibrations of plates are becoming widely employed, mainly for active sound enhancement and noise cancelling systems, in both music and automotive environments. Active sound enhancement solutions based on electromagnetic shakers are therefore attracting increasing interest. The most widespread applications involve active noise control (ANC) and active vibration control systems for improving the acoustic experience inside or outside the vehicle. This requires investigating the vibrational and, consequently, vibro-acoustic characteristics of vehicles. Simulation and processing methods capable of reducing calculation time while providing high-accuracy results are therefore in strong demand. In this work, an ideal case study on rectangular plates in fully clamped conditions preceded a real case analysis on vehicle panels. The sound radiation generated by a vibrating flat or shallow surface can be calculated by means of Rayleigh’s integral. The analytical solution of the problem is calculated here by implementing the equations in MATLAB. The results are then compared with a numerical model developed in COMSOL Multiphysics, employing the Finite Element Method (FEM). Very good agreement between the analytical and numerical solutions is shown, so the two methods cross-validate each other. The shift to the real case study, on a McLaren super car, led to the development of a mixed analytical-numerical method. Optimum results were obtained with mini-shaker excitation, showing good agreement between the recorded SPL and the calculated one over the whole selected frequency band. In addition, a set of directivity measurements of the hood was carried out, to start studying the spatiality of sound, which is fundamental to active noise control systems.
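For reference, Rayleigh's integral for a planar radiator set in an infinite rigid baffle is commonly written as below. The notation is an assumption, since the abstract gives none: $p$ is the complex sound pressure at the field point $\mathbf{r}$, $v_n$ the normal surface velocity at the source point $\mathbf{r}_s$, $\rho_0$ the air density, $c$ the speed of sound, and an $e^{j\omega t}$ time convention is used.

```latex
p(\mathbf{r}, \omega) = \frac{j \omega \rho_0}{2\pi}
  \int_S v_n(\mathbf{r}_s, \omega)\, \frac{e^{-jkR}}{R}\, \mathrm{d}S,
\qquad R = \lvert \mathbf{r} - \mathbf{r}_s \rvert,
\qquad k = \frac{\omega}{c}
```

In a numerical evaluation the surface $S$ is typically split into small elements and the integral replaced by a sum over their velocities and areas.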
Abstract:
The Crescent Shaped Brace (CSB) is a new, simple steel hysteretic device proposed for use as an enhanced diagonal brace in framed structures. Owing to its peculiar ad-hoc shape, the CSB allows the practical designer to choose the lateral stiffness independently of the yield strength of the device. This thesis presents a complete study of different CSB configurations. After validation of the hysteretic capacities of the Crescent Shaped Braces, the seismic concept of the "enhanced first story isolation" system is proposed within the PBSD framework. It relies on the total separation between the Vertical Resisting System (VRS) and the Horizontal Resisting System (HRS) in order to attain a certain objective curve of the structure. An applicative example was studied following this concept, exploiting the advantages of CSBs as seismic dissipative devices used for the HRS. Several geometrical configurations were then introduced and modelled with SAP2000, and the results compared: Single CSB system, Single 2 CSB system, Double CSB system, Coupled CSB system, Coupled with high length CSB system and, finally, Cross bracing system.
Abstract:
OBJECTIVE: To estimate the prevalence of overweight and obesity and associated factors. METHODS: Data were analysed for individuals aged >18 years interviewed by the Surveillance System of Risk and Protective Factors for Chronic Diseases by Telephone Survey (VIGITEL), conducted in the Brazilian state capitals and the Federal District in 2006. For 49,395 individuals, body mass index (BMI) was used to identify overweight (BMI 25-30 kg/m²) and obesity (BMI >30 kg/m²). Prevalence and prevalence ratios were presented according to sociodemographic variables, schooling, health status/comorbidities and self-rated health, stratified by sex. Poisson regression was used for crude and age-adjusted analyses. RESULTS: The prevalence of overweight was 47% among men and 39% among women, and that of obesity was 11% for both sexes. A direct association between overweight and schooling was observed among men, and an inverse association among women. Obesity was more frequent among men living with a partner and was not associated with schooling or skin colour. The prevalences of overweight and obesity were higher among black women and among women living with a partner. Diabetes, systemic arterial hypertension and dyslipidaemias, as well as rating one's own health as fair or poor, were also reported by interviewees with overweight or obesity. CONCLUSIONS: While about one in two interviewees was classified as overweight, obesity was reported by one in ten. Socioeconomic and demographic variables, as well as self-reported morbidities, were associated with overweight and obesity. These results were similar to those found in other Brazilian studies.
Abstract:
Our objective was to develop a methodology to predict soil fertility using visible near-infrared (vis-NIR) diffuse reflectance spectra and terrain attributes derived from a digital elevation model (DEM). Specifically, our aims were to: (i) assemble a minimum data set to develop a soil fertility index for sugarcane (Saccharum officinarum L.) (SFI-SC) for biofuel production in tropical soils; (ii) construct a model to predict the SFI-SC using soil vis-NIR spectra and terrain attributes; and (iii) produce a soil fertility map for our study area and assess it by comparing it with a green vegetation index (GVI). The study area was 185 ha located in São Paulo State, Brazil. In total, 184 soil samples were collected and analyzed for a range of soil chemical and physical properties. Their vis-NIR spectra were collected from 400 to 2500 nm. The Shuttle Radar Topographic Mission 3-arcsec (90-m resolution) DEM of the area was used to derive 17 terrain attributes. A minimum data set of soil properties was selected to develop the SFI-SC. The SFI-SC consisted of three classes: Class 1, the highly fertile soils; Class 2, the fertile soils; and Class 3, the least fertile soils. It was derived heuristically with conditionals and using expert knowledge. The index was modeled with the spectra and terrain data using cross-validated decision trees. The cross-validation of the model correctly predicted Class 1 in 75% of cases, Class 2 in 61%, and Class 3 in 65%. A fertility map was derived for the study area and compared with a map of the GVI. Our approach offers a methodology that incorporates expert knowledge to derive the SFI-SC and uses a versatile spectro-spatial methodology that may be implemented for rapid and accurate determination of soil fertility and better exploration of areas suitable for production.
Abstract:
Herein we report an approach to the formation of 5-alkynyl-1,3-dioxin-4-ones using Suzuki-Miyaura cross-coupling reaction of potassium alkynyltrifluoroborate salts with 2,2,6-trimethyl-5-iodo-1,3-dioxin-4-one. The resulting 5-ethynyltrimethylsilyl-1,3-dioxin-4-ones obtained through the Sonogashira reaction were further reacted in a Cu(I)-catalyzed Huisgen azide-alkyne 1,3-dipolar cycloaddition to form functionalized 1,4-disubstituted-1,2,3-triazoles in good yields, using mild conditions and ultrasonic radiation to expedite the reaction. (C) 2011 Elsevier Ltd. All rights reserved.
Abstract:
The Flow State Scale-2 (FSS-2) and Dispositional Flow Scale-2 (DFS-2) are presented as two self-report instruments designed to assess flow experiences in physical activity. Item modifications were made to the original versions of these scales in order to improve the measurement of some of the flow dimensions. Confirmatory factor analyses of an item identification and a cross-validation sample demonstrated a good fit of the new scales. There was support for both a 9-first-order factor model and a higher order model with a global flow factor. The item identification sample yielded mean item loadings on the first-order factor of .78 for the FSS-2 and .77 for the DFS-2. Reliability estimates ranged from .80 to .90 for the FSS-2, and .81 to .90 for the DFS-2. In the cross-validation sample, mean item loadings on the first-order factor were .80 for the FSS-2, and .73 for the DFS-2. Reliability estimates ranged from .80 to .92 for the FSS-2 and .78 to .86 for the DFS-2. The scales are presented as ways of assessing flow experienced within a particular event (FSS-2) or the frequency of flow experiences in chosen physical activity in general (DFS-2).
Abstract:
Motivation: Prediction methods for identifying binding peptides could minimize the number of peptides required to be synthesized and assayed, and thereby facilitate the identification of potential T-cell epitopes. We developed a bioinformatic method for the prediction of peptide binding to MHC class II molecules. Results: Experimental binding data and expert knowledge of anchor positions and binding motifs were combined with an evolutionary algorithm (EA) and an artificial neural network (ANN): binding data extraction --> peptide alignment --> ANN training and classification. This method, termed PERUN, was implemented for the prediction of peptides that bind to HLA-DR4(B1*0401). The respective positive predictive values of PERUN predictions of high-, moderate-, low- and zero-affinity binders were assessed as 0.8, 0.7, 0.5 and 0.8 by cross-validation, and 1.0, 0.8, 0.3 and 0.7 by experimental binding. This illustrates the synergy between experimentation and computer modeling, and its application to the identification of potential immunotherapeutic peptides.
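The positive predictive value quoted for each affinity class is the fraction of peptides predicted to be in that class that actually belong to it. A minimal sketch of that calculation follows; the function name and the example labels are hypothetical and are not data from the study.

```python
def positive_predictive_value(true_labels, predicted_labels, target):
    """Fraction of samples predicted as `target` whose true label is `target`."""
    predicted_as_target = [
        t for t, p in zip(true_labels, predicted_labels) if p == target
    ]
    if not predicted_as_target:
        raise ValueError(f"no samples were predicted as {target!r}")
    return sum(t == target for t in predicted_as_target) / len(predicted_as_target)

# Hypothetical affinity classes for eight peptides (not data from the study).
measured = ["high", "high", "moderate", "low", "none", "none", "high", "low"]
predicted = ["high", "high", "high", "low", "none", "low", "moderate", "low"]

ppv_high = positive_predictive_value(measured, predicted, "high")
ppv_low = positive_predictive_value(measured, predicted, "low")
ppv_none = positive_predictive_value(measured, predicted, "none")
```

Under cross-validation, `predicted` would hold the label each peptide received while it was held out of training.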
Abstract:
The principle of using induction rules based on spatial environmental data to model a soil map has previously been demonstrated. Whilst the general pattern of classes of large spatial extent and those with close association with geology were delineated, small classes and the detailed spatial pattern of the map were less well rendered. Here we examine several strategies to improve the quality of the soil map models generated by rule induction. Terrain attributes that are better suited to landscape description at a resolution of 250 m are introduced as predictors of soil type. A map sampling strategy is developed. Classification error is reduced by using boosting rather than cross validation to improve the model. Further, the benefit of incorporating the local spatial context for each environmental variable into the rule induction is examined. The best model was achieved by sampling in proportion to the spatial extent of the mapped classes, boosting the decision trees, and using spatial contextual information extracted from the environmental variables.
Abstract:
After 12 weeks of selective serotonin reuptake inhibitor (SSRI) monotherapy with inadequate response, 10 patients received clomipramine and 11 received quetiapine as augmentation agents of the SSRI. The primary outcome measure was the difference between initial and final scores of the Yale-Brown Obsessive-Compulsive Scale (Y-BOCS), rated in a blinded fashion, and the score of clinical global improvement (CGI-I). Statistical analyses were performed using nonparametric tests to evaluate treatment efficacy and the difference between treatment groups. Percentile plots were constructed with Y-BOCS scores from the clomipramine and quetiapine groups. Considering response a >= 35% reduction in the initial Y-BOCS score plus a rating of 'much improved' or 'very much improved' on CGI-I, four of eleven quetiapine patients and one out of ten clomipramine patients were classified as responders. The mean final Y-BOCS score was significantly lower than baseline in the quetiapine augmentation group (P = 0.023), but not in the clomipramine augmentation group (P = 0.503). The difference between groups showed a trend towards significance only at week 4, the mean Y-BOCS score being lower for those receiving quetiapine (P = 0.052). A difference between groups was also observed at week 4 according to percentile plots. These results corroborate previous findings of quetiapine augmentation efficacy in obsessive-compulsive disorder (OCD). Clomipramine augmentation did not produce a significant reduction in Y-BOCS scores. Higher target maximum dosages might have yielded different results.
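The responder criterion stated in this abstract combines two conditions, and can be expressed as a small rule. This is a sketch for illustration only: the function and parameter names are hypothetical, and the scores in the examples are invented rather than taken from the trial.

```python
# CGI-I ratings that count towards response, as quoted in the abstract.
RESPONDER_CGI_RATINGS = {"much improved", "very much improved"}

def is_responder(initial_ybocs, final_ybocs, cgi_rating, min_reduction=0.35):
    """True if the Y-BOCS score fell by at least `min_reduction` (as a fraction
    of the initial score) AND the CGI-I rating is one of the responder ratings."""
    if initial_ybocs <= 0:
        raise ValueError("initial Y-BOCS score must be positive")
    reduction = (initial_ybocs - final_ybocs) / initial_ybocs
    return reduction >= min_reduction and cgi_rating in RESPONDER_CGI_RATINGS

# Invented examples: a 40% reduction with a qualifying rating, a 30% reduction,
# and a 50% reduction whose CGI-I rating does not qualify.
example_a = is_responder(30, 18, "much improved")
example_b = is_responder(30, 21, "much improved")
example_c = is_responder(30, 15, "minimally improved")
```

Both conditions must hold, so a large score reduction alone does not classify a patient as a responder.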