997 results for pre-filtering
Abstract:
Recommender systems are one solution to the information overload suffered by users of websites on which they can rate items. Collaborative filtering is considered the most successful approach because it bases its recommendations on the votes of users similar to the active user. Nevertheless, the traditional collaborative filtering method selects insufficiently representative users as neighbors of each active user, so the resulting recommendations are not precise enough. The method proposed in this thesis adds a pre-filtering step based on Pareto dominance, which eliminates the less representative users from the k-neighbor selection process and keeps the most promising ones. Experiments on MovieLens and Netflix show a significant improvement in all the quality measures studied when the proposed method is applied.
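The Pareto pre-filtering idea can be illustrated with a minimal sketch. The abstract does not define dominance, so the sketch assumes one plausible reading: candidate x dominates candidate y if x's ratings are at least as close to the active user's ratings on every item the active user rated, and strictly closer on at least one. Treating an unrated item as an infinite distance, and all names and ratings below, are hypothetical choices made for the example, not the thesis's implementation.

```python
import math

def distances(active, other):
    """Absolute rating differences over the items the active user rated.
    An item the other user has not rated gets an infinite distance
    (an assumption of this sketch)."""
    return [abs(r - other[i]) if i in other else math.inf
            for i, r in active.items()]

def dominates(dx, dy):
    """True if dx is no worse than dy on every item and strictly better on one."""
    return (all(a <= b for a, b in zip(dx, dy))
            and any(a < b for a, b in zip(dx, dy)))

def pareto_prefilter(active, candidates):
    """Return the sorted names of candidates not dominated by any other candidate."""
    dist = {name: distances(active, ratings)
            for name, ratings in candidates.items()}
    return sorted(
        x for x in dist
        if not any(dominates(dist[y], dist[x]) for y in dist if y != x)
    )

# Made-up ratings on a 1-5 scale.
active = {"i1": 5, "i2": 3, "i3": 4}
candidates = {
    "ana": {"i1": 5, "i2": 2, "i3": 4},  # distances [0, 1, 0]
    "bob": {"i1": 4, "i2": 1, "i3": 2},  # distances [1, 2, 2], dominated by ana
    "eve": {"i1": 1, "i2": 3},           # distances [4, 0, inf], closest on i2
}
survivors = pareto_prefilter(active, candidates)
print(survivors)
```

Only the surviving users would then be passed to the usual k-nearest-neighbor selection, so the similarity measure itself is left unchanged.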
Abstract:
The term proteome defines the complete set of proteins expressed in the cells or tissues of an organism at a certain timepoint. Correspondingly, proteomics describes the methods used to study such proteomes. These methods include chromatographic and electrophoretic techniques for protein or peptide fractionation, mass spectrometry for their identification, and computational methods to assist the complicated data analysis. A primary aim of this Ph.D. thesis was to set up, optimize, and develop proteomics methods for analysing proteins extracted from T-helper (Th) lymphocytes. First, high-throughput LC-MS/MS and ICAT labeling methods were set up and optimized for analysing the microsomal fraction proteins extracted from Th lymphocytes. Later, the iTRAQ method was optimized to study cytokine-regulated protein expression in the nuclei of Th lymphocytes. High-throughput LC-MS/MS analyses, like ICAT and iTRAQ, produce large quantities of data, so robust software and data analysis pipelines are needed. Therefore, different software programs used for analysing such data were evaluated. Moreover, a pre-filtering algorithm was developed to classify good-quality and bad-quality spectra prior to the database searches. Th lymphocytes can differentiate into Th1 or Th2 cells depending on the surrounding antigens, co-stimulatory molecules, and cytokines. Both subsets have individual cytokine secretion profiles and specific functions. Th1 cells participate in cellular immunity against intracellular pathogens, while Th2 cells have an important role in humoral immunity against extracellular parasites. An abnormal response of Th1 and Th2 cells and an imbalance between the subsets are characteristic of several diseases. Th1-specific reactions and cytokines have been detected in autoimmune diseases, while a Th2-specific response and cytokine profile is common in allergy and asthma. In this Ph.D. thesis, mass spectrometry-based proteomics was used to study the effects of the Th1- and Th2-promoting cytokines IL-12 and IL-4 on the proteome of Th lymphocytes. Characterization of the microsomal fraction proteome extracted from IL-12 treated lymphoblasts and IL-4 stimulated cord blood CD4+ cells led to the identification of cytokine-regulated proteins. Galectin-1 and CD7 were down-regulated in IL-12 treated cells, while IL-4 stimulation decreased the expression of STAT1, MXA, GIMAP1, and GIMAP4. Interestingly, the transcription of both GIMAP genes was up-regulated in Th1 polarized cells and down-regulated in Th2 promoting conditions.
Abstract:
Graduate Program in Electrical Engineering - FEIS
Abstract:
In recent decades, a striking amount of hydrographic data covering most of the Mediterranean basin has been generated by efforts to characterize the oceanography and ecology of the basin. At the same time, improvements in technology, and the consequent refinement of sampling and analytical techniques, have provided data more reliable than in the past. Nutrient data are fully part of this picture, but they suffer from having been produced by a large number of uncoordinated research programs and from often deficient quality control, with databases lacking intercalibration. In this study we present a computational procedure, based on robust statistical parameters and on the physical dynamic properties and morphological characteristics of the Mediterranean Sea, to partially overcome these limits in the existing data sets. Through data pre-filtering based on outlier analysis, followed by shape analysis, the procedure identifies inconsistent data and, for each basin area, a characteristic set of shapes (vertical profiles). By rejecting all profiles that do not follow any of the identified shapes, the procedure retains the reliable profiles and yields a data set that can be considered more internally consistent than the existing ones.
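As an illustration of outlier-based pre-filtering with robust statistics, here is a minimal sketch. The abstract does not name the robust parameters it uses; the median, the median absolute deviation (MAD), and the modified z-score with a cutoff of 3.5 are assumptions of this example, and the concentration values are invented for the demonstration.

```python
import statistics

def mad_prefilter(values, threshold=3.5):
    """Split values into (kept, rejected) using the modified z-score
    0.6745 * |x - median| / MAD. Median and MAD are barely affected by
    the outliers they are meant to detect."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: nothing can be scored, keep everything.
        return list(values), []
    kept, rejected = [], []
    for v in values:
        score = 0.6745 * abs(v - med) / mad
        (kept if score <= threshold else rejected).append(v)
    return kept, rejected

# Invented nutrient concentrations at one depth level; 9.00 is a gross error.
nitrate = [0.50, 0.60, 0.55, 0.58, 0.52, 9.00]
kept, rejected = mad_prefilter(nitrate)
print(kept, rejected)
```

A mean and standard deviation computed on the same values would be pulled toward the erroneous point, which is the usual argument for robust estimators at this stage.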
Abstract:
We report an empirical analysis of long-range dependence in the returns of eight stock market indices, using Rescaled Range Analysis (RRA) to estimate the Hurst exponent. Monte Carlo and bootstrap simulations are used to construct critical values for the null hypothesis of no long-range dependence. The issue of disentangling short-range and long-range dependence is examined. Pre-filtering by fitting a (short-range) autoregressive model eliminates part of the long-range dependence when the latter is present, while failure to pre-filter leaves open the possibility of conflating short-range and long-range dependence. There is strong evidence of long-range dependence for the small central European Czech stock market index PX-glob, and weaker evidence for two smaller western European stock market indices, MSE (Spain) and SWX (Switzerland). There is little or no evidence of long-range dependence for the other five indices, including those with the largest capitalizations among those considered, DJIA (US) and FTSE350 (UK). These results are generally consistent with prior expectations concerning the relative efficiency of the stock markets examined. © 2011 Elsevier Inc.
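The interplay between pre-filtering and the rescaled range estimate can be sketched as follows. This is a simplified illustration, not the paper's procedure: the AR(1) order, the window sizes, and the simulated series are assumptions of the example, and no critical values are constructed.

```python
import math
import random

def rescaled_range(window):
    """Classical R/S statistic: range of cumulative deviations from the
    mean, divided by the standard deviation of the window."""
    mean = sum(window) / len(window)
    cum = lo = hi = ss = 0.0
    for v in window:
        cum += v - mean
        lo, hi = min(lo, cum), max(hi, cum)
        ss += (v - mean) ** 2
    return (hi - lo) / math.sqrt(ss / len(window))

def hurst_rs(series, sizes=(16, 32, 64, 128, 256)):
    """Estimate the Hurst exponent as the least-squares slope of
    log(mean R/S) against log(window size)."""
    pts = []
    for n in sizes:
        chunks = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
        rs = [rescaled_range(c) for c in chunks]
        pts.append((math.log(n), math.log(sum(rs) / len(rs))))
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    num = sum((p[0] - mx) * (p[1] - my) for p in pts)
    den = sum((p[0] - mx) ** 2 for p in pts)
    return num / den

def ar1_prefilter(series):
    """Fit x[t] = phi * x[t-1] + e[t] by least squares on the demeaned
    series and return (residuals, phi)."""
    mean = sum(series) / len(series)
    d = [v - mean for v in series]
    phi = sum(a * b for a, b in zip(d[1:], d[:-1])) / sum(a * a for a in d[:-1])
    return [d[t] - phi * d[t - 1] for t in range(1, len(d))], phi

# Simulated short-memory series (AR(1) with coefficient 0.5), no long-range dependence.
random.seed(0)
x = [0.0]
for _ in range(2047):
    x.append(0.5 * x[-1] + random.gauss(0.0, 1.0))

residuals, phi = ar1_prefilter(x)
h_raw = hurst_rs(x)
h_filtered = hurst_rs(residuals)
print(f"phi={phi:.2f}  H(raw)={h_raw:.2f}  H(pre-filtered)={h_filtered:.2f}")
```

Comparing the two estimates on a series with known short memory is one way to see how much of an apparent Hurst exponent above 0.5 comes from autoregressive structure. The abstract's caveat still applies: on genuinely long-memory data the same filter would also remove part of the effect being measured.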
Abstract:
The thesis proposes a middleware solution for scenarios in which sensors produce a large volume of data that must be managed and processed through preprocessing, filtering, and buffering operations in order to improve communication efficiency and bandwidth consumption while respecting energy and computational constraints. These components can be optimized through remote tuning operations.
Abstract:
Estuarine hydrodynamics is a key factor in the definition of the filtering capacity of an estuary and results from the interaction of the processes that control the inlet morphodynamics and those that act in the mixing of the water in the estuary. The hydrodynamics and suspended sediment transport in the Camboriú estuary were assessed by two field campaigns conducted in 1998 that covered both neap and spring tide conditions. The period measured represents the estuarine hydrodynamics and sediment transport prior to the construction of the jetty in 2003 and provides important background information for the Camboriú estuary. Each field campaign covered two complete tidal cycles with hourly measurements of currents, salinity, suspended sediment concentration and water level. Results show that the Camboriú estuary is partially mixed, with the vertical structure varying as a function of the tidal range and tidal phase. The dynamic estuarine structure can be balanced between the stabilizing effects generated by the vertical density gradient, which produces buoyancy and stratification flows, and the turbulent effects generated by the vertical velocity gradient, which generates vertical mixing. The main sediment source for the water column is the bottom sediment, periodically resuspended by the tidal currents. The advective salt and suspended sediment transport differed between neap and spring tides, being more complex at spring tide. The river discharge term was important under both tidal conditions. The tidal correlation term was also important, being dominant in the suspended sediment transport during the spring tide. The gravitational circulation and Stokes drift played a secondary role in the estuarine transport processes.
Abstract:
This work characterizes the effects of ambient levels of urban particulate matter (PM2.5) from the city of São Paulo on spermatogenesis using mice exposed during the embryo-fetal and/or postnatal phases of development. Parental generations (BALB/c mice) were exposed to air pollution in chambers with or without PM2.5 filtering for 4 months. Animals were mated, and half of the 1-day-old offspring were moved between chambers, which yielded prenatal and postnatal groups. The remaining offspring comprised the non-exposed and pre+postnatal exposed groups. After 90 days, the animals were sacrificed for testis collection and weighing. Optical microscopy was used for the morphometric analyses of the cell counts, spermatogenic cycle, proliferation, and apoptosis. Prenatally exposed animals presented reduced body and testicular weight with an increased gonadosomatic index (GSI). Testicular volume also decreased, as did the tubular diameter in the testes of the same animals. Proliferation, apoptosis, and spermatogenic cycle analyses showed no significant differences among groups. However, the tubules at stage VII of pre- and postnatal animals presented a reduced number of elongated spermatids. The pre+postnatal group presented higher spermatid head retention at stages VIII-XII. These results show that ambient levels of PM2.5 from São Paulo city affect spermatogenesis by damaging sperm production.
Abstract:
Constructing biodiversity richness maps from Environmental Niche Models (ENMs) of thousands of species is time consuming. A separate species occurrence data pre-processing phase enables the experimenter to control test AUC score variance due to species dataset size. Besides removing duplicate occurrences and points with missing environmental data, we discuss the need for coordinate precision, wide dispersion, temporal and synonymity filters. After species data filtering, the final task of a pre-processing phase should be the automatic generation of species occurrence datasets which can then be directly 'plugged in' to the ENM. A software application capable of carrying out all these tasks will be a valuable time-saver, particularly for large-scale biodiversity studies.
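A few of the pre-processing filters described above can be sketched in a short example. The record layout (`species`, `lat`, `lon`, `env`), the two-decimal precision rule, and the sample records are all hypothetical. The dispersion, temporal and synonymity filters are not covered.

```python
def decimal_places(text):
    """Number of digits after the decimal point in a coordinate string."""
    return len(text.split(".")[1]) if "." in text else 0

def preprocess(records, min_decimals=2):
    """Drop records with missing environmental data, imprecise coordinates,
    or duplicate (species, lat, lon); group the rest by species."""
    seen = set()
    by_species = {}
    for rec in records:
        if rec["env"] is None:
            continue  # no environmental value at this point
        precision = min(decimal_places(rec["lat"]), decimal_places(rec["lon"]))
        if precision < min_decimals:
            continue  # coordinate too coarse to place reliably
        key = (rec["species"], rec["lat"], rec["lon"])
        if key in seen:
            continue  # duplicate occurrence
        seen.add(key)
        point = (float(rec["lat"]), float(rec["lon"]))
        by_species.setdefault(rec["species"], []).append(point)
    return by_species

# Invented occurrence records; coordinates are kept as strings so that the
# reported precision is not lost in float conversion.
records = [
    {"species": "Puma concolor", "lat": "-23.55", "lon": "-46.63", "env": 21.5},
    {"species": "Puma concolor", "lat": "-23.55", "lon": "-46.63", "env": 21.5},
    {"species": "Puma concolor", "lat": "-23.5", "lon": "-46.6", "env": 20.0},
    {"species": "Puma concolor", "lat": "-22.91", "lon": "-43.17", "env": None},
    {"species": "Panthera onca", "lat": "-3.47", "lon": "-62.21", "env": 26.1},
]
datasets = preprocess(records)
print(datasets)
```

The per-species dictionary stands in for the automatically generated occurrence datasets. In practice each entry would be written to whatever input format the chosen ENM tool expects.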
Abstract:
In autumn 2012, the new release 05 (RL05) of monthly geopotential spherical harmonic Stokes coefficients (SC) from the GRACE (Gravity Recovery and Climate Experiment) mission was published. This release reduces the noise in high degree and order SC, but they still need to be filtered. One of the most common filtering procedures is the combination of decorrelation and Gaussian filters. Both are parameter-dependent and must be tuned by the user. Previous studies have analyzed the choice of parameters for RL05 GRACE data for oceanic applications, and for RL04 data for global applications. This study updates the latter for RL05 data, extending the statistical analysis. The choice of the parameters of the decorrelation filter has been optimized to: (1) balance the noise reduction and the geophysical signal attenuation produced by the filtering process; (2) minimize the differences between GRACE and model-based data; (3) maximize the ratio of variability between continents and oceans. The Gaussian filter has been optimized following the latter criteria. In addition, an anisotropic filter, the fan filter, has been analyzed as an alternative to the Gaussian filter, producing better statistics.
Abstract:
A MATLAB-based computer code has been developed for the simultaneous wavelet analysis and filtering of several environmental time series, with a particular focus on the analysis of cave monitoring data. The continuous wavelet transform, the discrete wavelet transform and the discrete wavelet packet transform have been implemented to provide a fast and precise time–period examination of the time series at different period bands. Moreover, statistical methods to examine the relation between two signals have been included. Finally, entropy-of-curves and spline-based methods have also been developed for segmenting and modeling the analyzed time series. Together, these methods provide a user-friendly and fast program for environmental signal analysis, with useful, practical and understandable results.
Abstract:
Sentiment classification over Twitter is usually affected by the noisy nature (abbreviations, irregular forms) of tweet data. A popular procedure to reduce the noise of textual data is to remove stopwords by using pre-compiled stopword lists or more sophisticated methods for dynamic stopword identification. However, the effectiveness of removing stopwords in the context of Twitter sentiment classification has been debated in the last few years. In this paper we investigate whether removing stopwords helps or hampers the effectiveness of Twitter sentiment classification methods. To this end, we apply six different stopword identification methods to Twitter data from six different datasets and observe how removing stopwords affects two well-known supervised sentiment classification methods. We assess the impact of removing stopwords by observing fluctuations in the level of data sparsity, the size of the classifier's feature space and its classification performance. Our results show that using pre-compiled lists of stopwords negatively impacts the performance of Twitter sentiment classification approaches. On the other hand, the dynamic generation of stopword lists, by removing those infrequent terms appearing only once in the corpus, appears to be the optimal method for maintaining a high classification performance while reducing the data sparsity and substantially shrinking the feature space.
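The dynamic approach singled out in the results, treating terms that occur only once in the corpus as stopwords, is simple enough to sketch. Whitespace tokenization and the four sample tweets are assumptions made for the illustration; the paper's own tokenization and datasets are not reproduced here.

```python
from collections import Counter

def singleton_stopwords(tokenized):
    """Terms that appear exactly once in the whole corpus."""
    counts = Counter(tok for doc in tokenized for tok in doc)
    return {tok for tok, n in counts.items() if n == 1}

def remove_terms(tokenized, stop):
    """Drop every token that is in the stopword set."""
    return [[tok for tok in doc if tok not in stop] for doc in tokenized]

def vocabulary(tokenized):
    """Distinct terms in the corpus, i.e. the classifier's feature space."""
    return {tok for doc in tokenized for tok in doc}

# Invented tweets, tokenized on whitespace.
tweets = [
    "love this phone",
    "love the new update",
    "hate this update lol",
    "hate waiting omg",
]
tokenized = [t.split() for t in tweets]
stop = singleton_stopwords(tokenized)
filtered = remove_terms(tokenized, stop)
print(len(vocabulary(tokenized)), "->", len(vocabulary(filtered)))
print(filtered)
```

In this toy corpus the feature space shrinks from ten terms to four while the repeated sentiment-bearing words survive, which is the trade-off the abstract describes.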
Abstract:
To identify risk factors associated with post-operative temporomandibular joint dysfunction after craniotomy. The study sample included 24 patients (mean age 37.3 ± 10 years) eligible for surgery for refractory epilepsy, evaluated according to RDC/TMD before and after surgery. The primary predictor was the time after the surgery. The primary outcome variable was maximal mouth opening. Other outcome variables were: disc displacement, bruxism, TMJ sound, TMJ pain, and pain associated with mandibular movements. Data analyses were performed using bivariate and multiple regression methods. The maximal mouth opening was significantly reduced after surgery in all patients (p = 0.03). In the multiple regression model, time of evaluation and pre-operative bruxism were significantly (p < .05) associated with an increased risk for TMD post-surgery. A significant correlation between surgery follow-up time and maximal mouth opening was found. Pre-operative bruxism was associated with increased risk for temporomandibular joint dysfunction after craniotomy.
Abstract:
To analyze the relationship between parity, pre-pregnancy body mass index (BMI), and gestational weight gain (GWG). This observational controlled study was conducted from November 2013 to April 2014, with postpartum women who started antenatal care up to 14 weeks and had full-term births. Data were collected from medical records and antenatal cards. Descriptive and bivariate analyses were performed. The significance level was 5%. Data were collected from 130 primiparous and 160 multiparous women. At the beginning of prenatal care, 54.62% of the primiparous women were eutrophic, while the majority of the multiparous women were overweight or obese (62.51%). Multiparas were two times more likely than primiparas to be obese at the beginning of their pregnancies. The average pre-pregnancy weight and final pregnancy weight were significantly higher in multiparous women; however, the mean GWG was higher among primiparous women. We found an inverse correlation between parity and the total GWG, but initial BMI was significantly higher in multiparas. Nevertheless, monitoring of GWG through actions that promote a healthier lifestyle is needed, regardless of parity and nutritional status, in order to prevent excessive GWG and postpartum weight retention and, consequently, inadequate pre-pregnancy nutritional status in future pregnancies.