Digital Library

783 results for grid, clustering, statistical, clustering

Discovery of a New Retrograde Trans-Neptunian Object: Hint of a Common Orbital Plane for Low Semimajor Axis, High-inclination TNOs and Centaurs

Relevance: 30.00%

Abstract:

Although the majority of Centaurs are thought to have originated in the scattered disk, with the high-inclination members coming from the Oort cloud, the origin of the high-inclination component of trans-Neptunian objects (TNOs) remains uncertain. We report the discovery of a retrograde TNO, which we nickname “Niku,” detected by the Pan-STARRS 1 Outer Solar System Survey. Our numerical integrations show that the orbital dynamics of Niku are very similar to those of 2008 KV42 (Drac), with a half-life of ~500 Myr. Comparing similar high-inclination TNOs and Centaurs (q > 10 au, a < 100 au, and i > 60°), we find that these objects exhibit a surprising clustering of ascending node, and occupy a common orbital plane. This orbital configuration has high statistical significance: 3.8σ. An unknown mechanism is required to explain the observed clustering. This discovery may provide a pathway to investigating a possible reservoir of high-inclination objects.
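
To put the quoted 3.8σ significance in probability terms, the following minimal sketch converts a Gaussian-equivalent sigma level into a tail probability. It assumes the significance is expressed as a standard-normal equivalent; the abstract does not say whether a one-sided or two-sided convention was used, so both are shown. The function name `sigma_to_p_value` is hypothetical.

```python
import math


def sigma_to_p_value(n_sigma: float, two_sided: bool = True) -> float:
    """Probability that a standard normal variable lies beyond n_sigma.

    Assumption: the significance is a Gaussian-equivalent sigma level.
    """
    # One-sided upper-tail probability of the standard normal distribution.
    one_sided = 0.5 * math.erfc(n_sigma / math.sqrt(2.0))
    return 2.0 * one_sided if two_sided else one_sided


if __name__ == "__main__":
    print(f"two-sided p for 3.8 sigma: {sigma_to_p_value(3.8):.2e}")
    print(f"one-sided p for 3.8 sigma: {sigma_to_p_value(3.8, two_sided=False):.2e}")
```

Under either convention the chance of such a configuration arising from noise alone is on the order of one in ten thousand.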

The High-Redshift Quasar Luminosity Function from Multi-Epoch Imaging Surveys

Relevance: 30.00%

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-08

Methods for Confounding Adjustment and High-Dimensional Environmental Exposures

Relevance: 30.00%

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-08

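L’ichtyofaune dans l’organisation biologique d’un système paralique de type lagunaire, la Ria d’Aveiro (Portugal), en 1987-1988 et 1999-2000 [The fish fauna in the biological organization of a lagoon-type paralic system, the Ria de Aveiro (Portugal), in 1987-1988 and 1999-2000]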

Relevance: 30.00%

Abstract:

This study concerns a paralic ecosystem, the Aveiro lagoon (Portugal). It aims to determine how fish assemblages are organized in relation to the characteristics and functioning of this ecosystem. The fish fauna was sampled monthly at 10 stations from August 1987 to July 1988 and from January 1999 to December 2000, using a traditional beach seine. The distribution of fish assemblages is studied by means of population descriptors (species and family richness, density, biomass and diversity index) and statistical analyses (clustering and ordination). The Aveiro lagoon shows strong spatial and temporal variation in its physico-chemical parameters, reflecting annual climatic variation. Given the mobility of fish and the geomorphology and hydrology of the system studied, a highly homogeneous fish distribution could have been expected. On the contrary, a decrease in marine influence results in a decrease in species and family richness, density and biomass. We also observed a change in the composition of the fish assemblage and the presence of dominant species characteristic of the different levels of confinement (the renewal rate of marine water at a given point of the system). The fish assemblage shows an organization similar to the biological zonation described by the benthic macrofauna and induced by confinement, independently of physico-chemical parameters such as salinity. Comparison of the results with data obtained twelve years earlier shows that the general organization of the lagoon has remained unchanged, illustrating the stability of paralic ecosystems. In addition, changes in the level of confinement at the northern and southern margins, induced mainly by local changes in hydrodynamics, were observed. The deconfinement of the northern zone is a consequence of the maintenance of navigation channels by dredging. Conversely, the confinement of the southern zone is the natural evolution of paralic basins, which are often subject to high and rapid sedimentation. This study shows that the organization of the fish assemblage validates the concept of confinement for the biological organization of paralic environments, and that this concept can be used to explain changes in these ecosystems.
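
The population descriptors named in this abstract (species richness and a diversity index) can be illustrated with a short sketch. The abstract does not say which diversity index was used; the Shannon index is assumed here purely for illustration, and the catch counts are made up.

```python
import math


def species_richness(counts):
    """Number of species with at least one individual in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over species present.

    Assumption: the Shannon index stands in for the unspecified
    'diversity index' of the abstract.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


if __name__ == "__main__":
    # Hypothetical catch counts per species at two stations.
    marine_station = [40, 35, 30, 25, 20]
    confined_station = [120, 5, 0, 0, 0]
    for name, counts in [("marine", marine_station), ("confined", confined_station)]:
        print(name, species_richness(counts), round(shannon_index(counts), 3))
```

A station dominated by one species yields a low index even when total abundance is high, which is the kind of contrast the confinement gradient is expected to produce.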

Microblogging Temporal Summarization: Filtering Important Twitter Updates for Breaking News

Relevance: 30.00%

Abstract:

While news stories are an important traditional medium to broadcast and consume news, microblogging has recently emerged as a place where people can discuss, disseminate, collect or report information about news. However, the massive information in the microblogosphere makes it hard for readers to keep up with these real-time updates. This is especially a problem when it comes to breaking news, where people are more eager to know “what is happening”. Therefore, this dissertation is intended as an exploratory effort to investigate computational methods to augment human effort when monitoring the development of breaking news on a given topic from a microblog stream by extractively summarizing the updates in a timely manner. More specifically, given an interest in a topic, either entered as a query or presented as an initial news report, a microblog temporal summarization system is proposed to filter microblog posts from a stream with three primary concerns: topical relevance, novelty, and salience. Considering the relatively high arrival rate of microblog streams, a cascade framework consisting of three stages is proposed to progressively reduce the quantity of posts. For each step in the cascade, this dissertation studies methods that improve over current baselines. In the relevance filtering stage, query and document expansion techniques are applied to mitigate sparsity and vocabulary mismatch issues. The use of word embedding as a basis for filtering is also explored, using unsupervised and supervised modeling to characterize lexical and semantic similarity. In the novelty filtering stage, several statistical ways of characterizing novelty are investigated and ensemble learning techniques are used to integrate results from these diverse techniques. These results are compared with a baseline clustering approach using both standard and delay-discounted measures. In the salience filtering stage, because of the real-time prediction requirement, a method of learning verb phrase usage from past relevant news reports is used in conjunction with some standard measures for characterizing writing quality. Following a Cranfield-like evaluation paradigm, this dissertation includes a series of experiments to evaluate the proposed methods for each step, and for the end-to-end system. New microblog novelty and salience judgments are created, building on existing relevance judgments from the TREC Microblog track. The results point to future research directions at the intersection of social media, computational journalism, information retrieval, automatic summarization, and machine learning.
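
The three-stage cascade described above (relevance, then novelty, then salience) can be sketched as a chain of filters over a stream. This is only an illustration of the cascade structure: the scoring rules below (query-term overlap, Jaccard similarity against already selected posts, a minimum length) are simple stand-ins chosen for this sketch, not the dissertation's methods, and all function names and thresholds are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List

Stage = Callable[[str, List[str]], bool]


def tokens(text: str) -> set:
    return set(text.lower().split())


def make_relevance_filter(query: str, min_overlap: int = 1) -> Stage:
    query_terms = tokens(query)

    def stage(post: str, selected: List[str]) -> bool:
        # Keep posts sharing at least min_overlap terms with the query.
        return len(query_terms & tokens(post)) >= min_overlap

    return stage


def make_novelty_filter(max_jaccard: float = 0.6) -> Stage:
    def stage(post: str, selected: List[str]) -> bool:
        # Reject posts too similar to anything already in the summary.
        new = tokens(post)
        for old_post in selected:
            old = tokens(old_post)
            union = new | old
            if union and len(new & old) / len(union) > max_jaccard:
                return False
        return True

    return stage


def make_salience_filter(min_tokens: int = 4) -> Stage:
    def stage(post: str, selected: List[str]) -> bool:
        # Placeholder salience rule: very short posts carry little content.
        return len(post.split()) >= min_tokens

    return stage


def cascade(stream: Iterable[str], stages: List[Stage]) -> Iterator[str]:
    """Yield posts that pass every stage; later stages only see survivors."""
    selected: List[str] = []
    for post in stream:
        # all() short-circuits, so a post rejected early never reaches
        # the later (typically more expensive) stages.
        if all(stage(post, selected) for stage in stages):
            selected.append(post)
            yield post


if __name__ == "__main__":
    stream = [
        "earthquake hits city center tonight",
        "earthquake hits city center tonight !!",
        "lunch was great today",
        "quake",
        "earthquake death toll rises to ten",
    ]
    stages = [
        make_relevance_filter("earthquake"),
        make_novelty_filter(),
        make_salience_filter(),
    ]
    for update in cascade(stream, stages):
        print(update)
```

Ordering the stages from cheapest to most expensive is what lets a cascade keep up with a high arrival rate: most posts are discarded before the costly checks run.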

Whole Genome Analysis of Gene Expression Reveals Coordinated Activation of Signaling and Metabolic Pathways during Pollen-Pistil Interactions in Arabidopsis

Relevance: 30.00%

Abstract:

Plant reproduction depends on the concerted activation of many genes to ensure correct communication between pollen and pistil. Here, we queried the whole transcriptome of Arabidopsis (Arabidopsis thaliana) in order to identify genes with specific reproductive functions. We used the Affymetrix ATH1 whole genome array to profile wild-type unpollinated pistils and unfertilized ovules. By comparing the expression profile of pistils at 0.5, 3.5, and 8.0 h after pollination and applying a number of statistical and bioinformatics criteria, we found 1,373 genes differentially regulated during pollen-pistil interactions. Robust clustering analysis grouped these genes in 16 time-course clusters representing distinct patterns of regulation. Coregulation within each cluster suggests the presence of distinct genetic pathways, which might be under the control of specific transcriptional regulators. A total of 78% of the regulated genes were expressed initially in unpollinated pistil and/or ovules, 15% were initially detected in the pollen data sets as enriched or preferentially expressed, and 7% were induced upon pollination. Among those, we found a particular enrichment for unknown transcripts predicted to encode secreted proteins or representing signaling and cell wall-related proteins, which may function by remodeling the extracellular matrix or as extracellular signaling molecules. A strict regulatory control in various metabolic pathways suggests that fine-tuning of the biochemical and physiological cellular environment is crucial for reproductive success. Our study provides a unique and detailed temporal and spatial gene expression profile of in vivo pollen-pistil interactions, providing a framework to better understand the basis of the molecular mechanisms operating during the reproductive process in higher plants.

Construction and analysis of hydrogeological landscape units using self-organising maps

Relevance: 30.00%

A new age of fuel performance code criteria studied through advanced atomistic simulation techniques

Relevance: 30.00%

Abstract:

A fundamental step in understanding the effects of irradiation on metallic uranium and uranium dioxide ceramic fuels, or any material, must start with the nature of radiation damage on the atomic level. The atomic damage displacement results in a multitude of defects that influence the fuel performance. Nuclear reactions are coupled, in that changing one variable will alter others through feedback. In the field of fuel performance modeling, these difficulties are addressed through the use of empirical models rather than models based on first principles. Empirical models can be used as a predictive code through the careful manipulation of input variables for the limited circumstances that are closely tied to the data used to create the model. While empirical models are efficient and give acceptable results, these results are only applicable within the range of the existing data. This narrow window prevents modeling changes in operating conditions that would invalidate the model as the new operating conditions would not be within the calibration data set. This work is part of a larger effort to correct for this modeling deficiency. Uranium dioxide and metallic uranium fuels are analyzed through a kinetic Monte Carlo code (kMC) as part of an overall effort to generate a stochastic and predictive fuel code. The kMC investigations include sensitivity analysis of point defect concentrations, thermal gradients implemented through a temperature variation mesh-grid, and migration energy values. In this work, fission damage is primarily represented through defects on the oxygen anion sublattice. Results were also compared between the various models. Past studies of kMC point defect migration have not adequately addressed non-standard migration events such as clustering and dissociation of vacancies. 
As such, the General Utility Lattice Program (GULP) code was utilized to generate new migration energies so that additional non-migration events could be included into kMC code in the future for more comprehensive studies. Defect energies were calculated to generate barrier heights for single vacancy migration, clustering and dissociation of two vacancies, and vacancy migration while under the influence of both an additional oxygen and uranium vacancy.
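
As a rough illustration of how a kinetic Monte Carlo (kMC) code turns migration barriers into a stochastic event sequence, the sketch below implements one step of the standard residence-time algorithm with Arrhenius rates. It is not the thesis code: the barrier values, the attempt frequency and the event names are hypothetical placeholders, and the two temperatures merely stand in for cells of a temperature mesh-grid.

```python
import math
import random

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_rate(barrier_ev, temperature_k, attempt_freq_hz=1e13):
    """Rate of a thermally activated event: nu * exp(-E / (k_B * T)).

    Assumption: a single attempt frequency of 1e13 Hz for every event.
    """
    return attempt_freq_hz * math.exp(-barrier_ev / (K_B_EV_PER_K * temperature_k))


def kmc_step(rates, rng):
    """Choose one event with probability proportional to its rate.

    Returns (event_index, time_increment). The time increment is drawn
    from an exponential distribution with mean 1 / sum(rates).
    """
    total = sum(rates)
    target = rng.random() * total
    cumulative = 0.0
    chosen = len(rates) - 1  # fallback guards against rounding at the end
    for index, rate in enumerate(rates):
        cumulative += rate
        if target < cumulative:
            chosen = index
            break
    dt = -math.log(1.0 - rng.random()) / total
    return chosen, dt


if __name__ == "__main__":
    # Hypothetical barriers (eV); real values would come from a code such as GULP.
    barriers = {"vacancy_hop": 0.5, "clustering": 0.8, "dissociation": 1.2}
    rng = random.Random(0)
    for temperature in (800.0, 1200.0):
        rates = [arrhenius_rate(e, temperature) for e in barriers.values()]
        event, dt = kmc_step(rates, rng)
        print(temperature, list(barriers)[event], dt)
```

Because the rates depend exponentially on the barrier height, adding events such as clustering or dissociation mainly requires reliable barrier energies for them, which is the role the abstract assigns to the GULP calculations.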

The experience of teaching statistics to non-specialist students in Saudi universities

Relevance: 30.00%

Abstract:

Undoubtedly, statistics has become one of the most important subjects in the modern world, where its applications are ubiquitous. The importance of statistics is not limited to statisticians, but also impacts upon non-statisticians who have to use statistics within their own disciplines. Several studies have indicated that most of the academic departments around the world have realized the importance of statistics to non-specialist students. Therefore, the number of students enrolled in statistics courses has vastly increased, coming from a variety of disciplines. Consequently, research within the scope of statistics education has been able to develop throughout the last few years. One important issue is how statistics is best taught to, and learned by, non-specialist students. This issue is controlled by several factors that affect the learning and teaching of statistics to non-specialist students, such as the use of technology, the role of the English language (especially for those whose first language is not English), the effectiveness of statistics teachers and their approach towards teaching statistics courses, students’ motivation to learn statistics and the relevance of statistics courses to the main subjects of non-specialist students. Several studies, focused on aspects of learning and teaching statistics, have been conducted in different countries around the world, particularly in Western countries. Conversely, the situation in Arab countries, especially in Saudi Arabia, is different; here, there is very little research in this scope, and what there is does not meet the needs of those countries towards the development of learning and teaching statistics to non-specialist students. This research was instituted in order to develop the field of statistics education. 
The purpose of this mixed methods study was to generate new insights into this subject by investigating how statistics courses are currently taught to non-specialist students in Saudi universities. Hence, this study will contribute towards filling the knowledge gap that exists in Saudi Arabia. This study used multiple data collection approaches, including questionnaire surveys from 1053 non-specialist students who had completed at least one statistics course in different colleges of the universities in Saudi Arabia. These surveys were followed up with qualitative data collected via semi-structured interviews with 16 teachers of statistics from colleges within all six universities where statistics is taught to non-specialist students in Saudi Arabia’s Eastern Region. The data from questionnaires included several types, so different techniques were used in analysis. Descriptive statistics were used to identify the demographic characteristics of the participants. The chi-square test was used to determine associations between variables. Based on the main issues that are raised from literature review, the questions (items scales) were grouped and five key groups of questions were obtained which are: 1) Effectiveness of Teachers; 2) English Language; 3) Relevance of Course; 4) Student Engagement; 5) Using Technology. Exploratory data analysis was used to explore these issues in more detail. Furthermore, with the existence of clustering in the data (students within departments within colleges, within universities), multilevel generalized linear models for dichotomous analysis have been used to clarify the effects of clustering at those levels. Factor analysis was conducted confirming the dimension reduction of variables (items scales). The data from teachers’ interviews were analysed on an individual basis. 
The responses were assigned to one of the eight themes that emerged from within the data: 1) the lack of students’ motivation to learn statistics; 2) students’ participation; 3) students’ assessment; 4) the effective use of technology; 5) the level of previous mathematical and statistical skills of non-specialist students; 6) the English language ability of non-specialist students; 7) the need for extra time for teaching and learning statistics; and 8) the role of administrators. All the data from students and teachers indicated that the situation of learning and teaching statistics to non-specialist students in Saudi universities needs to be improved in order to meet the needs of those students. The findings showed a lack of application of technology, such as statistical software programs, in these courses; such tools would allow non-specialist students to consolidate their knowledge. The results also indicated that the English language is considered one of the main challenges in learning and teaching statistics, particularly in institutions where English is not used as the main language. Moreover, the weakness of students’ mathematical skills is considered another major challenge. Additionally, the results indicated a need to tailor statistics courses to the needs of non-specialist students based on their main subjects. The findings indicate that statistics teachers need to choose appropriate methods when teaching statistics courses.
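
The chi-square test of association mentioned in this abstract can be sketched in a few lines. The contingency table below is invented for illustration (it is not data from the study), and the sketch assumes every row and column total is nonzero. The 5% critical value for one degree of freedom, 3.841, is a standard tabulated figure.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Assumption: every row and column total is nonzero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


if __name__ == "__main__":
    # Hypothetical counts: rows = used statistical software (yes / no),
    # columns = found the course relevant (yes / no).
    table = [[30, 20], [20, 30]]
    statistic, dof = chi_square_statistic(table)
    critical_5pct_dof1 = 3.841
    print(statistic, dof, statistic > critical_5pct_dof1)
```

Note that this test treats respondents as independent; with students nested in departments, colleges and universities, that is the limitation the multilevel models in the abstract are meant to address.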

Redes elétricas inteligentes: desenvolvimento de modelos computacionais e aplicações para a gestão do lado da procura

Relevance: 30.00%

Abstract:

English title: Smart Grids: Computational Models Development and Demand Side Management Applications. This thesis focuses on the development of computational models and applications for demand-side management within the scope of smart grids. The performance of the smart grid players is studied and a domestic prosumer model is presented. The economic dispatch problem, considering production and energy consumption forecasts obtained from artificial neural networks, is also presented. The existing demand response models are studied and a computational tool based on the fuzzy subtractive clustering algorithm is developed. Energy consumption profiles and operational modes are analyzed, including a brief analysis of the introduction of the electric vehicle and of contingencies on the electrical network. Consumer energy management applications within the scope of the InovGrid pilot project are presented. Automation systems are developed for the acquisition, monitoring, control and supervision of consumption based on data provided by smart meters, allowing consumer actions to be incorporated into the management of electrical energy consumption.
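
The subtractive clustering step named in this abstract can be illustrated with a compact sketch of the classic potential-based procedure: each point receives a potential from its neighbours, the highest-potential point becomes a centre, and the potential around it is subtracted before the next centre is sought. The thesis tool itself is not reproduced here; the radius `ra`, the factor 1.5 for the subtraction radius, the simplified stopping ratio and the sample points are assumptions made for this illustration, and the data are assumed to be scaled to the unit hypercube.

```python
import math


def subtractive_clustering(points, ra=0.5, stop_ratio=0.15, max_centers=10):
    """Return cluster centres chosen among the data points.

    Assumptions: points are scaled to [0, 1] in every dimension; the
    search stops when the best remaining potential drops below
    stop_ratio times the first peak (a simplified stopping rule).
    """
    if not points:
        return []
    rb = 1.5 * ra  # wider radius used when subtracting potential
    alpha = 4.0 / ra ** 2
    beta = 4.0 / rb ** 2

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Initial potential: points with many close neighbours score highest.
    potential = [
        sum(math.exp(-alpha * sq_dist(p, q)) for q in points) for p in points
    ]
    centers = []
    first_peak = None
    while len(centers) < max_centers:
        best = max(range(len(points)), key=lambda i: potential[i])
        peak = potential[best]
        if first_peak is None:
            first_peak = peak
        elif peak < stop_ratio * first_peak:
            break
        center = points[best]
        centers.append(center)
        # Remove the influence of the new centre from every point.
        potential = [
            pot - peak * math.exp(-beta * sq_dist(p, center))
            for p, pot in zip(points, potential)
        ]
    return centers


if __name__ == "__main__":
    # Hypothetical normalized features of six consumption profiles.
    profiles = [
        (0.10, 0.10), (0.12, 0.10), (0.10, 0.13),
        (0.90, 0.90), (0.88, 0.90), (0.90, 0.87),
    ]
    print(subtractive_clustering(profiles))
```

Unlike k-means, the number of clusters is not fixed in advance; it follows from the radius and the stopping ratio, which suits exploratory grouping of consumption profiles.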

Modelling of an efficient dynamic smart solar photovoltaic power grid system

Relevance: 30.00%

Abstract:

A smart solar photovoltaic grid system brings together information and communications technology (ICT) and power systems control engineering via the internet [1]. This thesis designs and demonstrates a smart solar photovoltaic grid system that is self-healing and environmentally and consumer friendly, and that can also accommodate other renewable sources of energy generation seamlessly, creating a healthy competitive energy industry and optimising the efficiency of energy assets. This thesis also presents the modelling of an efficient dynamic smart solar photovoltaic power grid system by exploring the maximum power point tracking efficiency and the optimisation of the smart solar photovoltaic array through modelling and simulation, to improve the quality of design for the solar photovoltaic module. In contrast, over the past decade quite promising results have been published in literature, most of which have not addressed the basis of the research questions in this thesis. The Levenberg-Marquardt and sparse based algorithms have proven to be very effective tools in helping to improve the quality of design for solar photovoltaic modules, minimising the possible relative errors in this thesis. Guided by theoretical and analytical reviews in literature, this research has carefully chosen the MATLAB/Simulink software toolbox for modelling and simulation experiments performed on the static smart solar grid system. The auto-correlation coefficient results obtained from the modelling experiments give an accuracy of 99% with negligible mean square error (MSE), root mean square error (RMSE) and standard deviation. This thesis further explores the design and implementation of a robust real-time online solar photovoltaic monitoring system, establishing a comparative study of two solar photovoltaic tracking systems which provide remote access to the harvested energy data.
This research made a landmark innovation in designing and implementing a unique approach for online remote access solar photovoltaic monitoring systems providing updated information of the energy produced by the solar photovoltaic module at the site location. In addressing the challenge of online solar photovoltaic monitoring systems, a Darfon online data logger device has been systematically integrated into the design for a comparative study of the two solar photovoltaic tracking systems examined in this thesis. The site location for the comparative study of the solar photovoltaic tracking systems is at the National Kaohsiung University of Applied Sciences, Taiwan, R.O.C. The overall comparative energy output efficiency of the azimuthal-altitude dual-axis over the 45° stationary solar photovoltaic monitoring system as observed at the research location site is about 72% based on the total energy produced, estimated money saved and the amount of CO2 reduction achieved. Similarly, in comparing the total amount of energy produced by the two solar photovoltaic tracking systems, the overall daily generated energy for the month of July shows the effectiveness of the azimuthal-altitude tracking systems over the 45° stationary solar photovoltaic system. It was found that the azimuthal-altitude dual-axis tracking systems were about 68.43% efficient compared to the 45° stationary solar photovoltaic systems. Lastly, the overall comparative hourly energy efficiency of the azimuthal-altitude dual-axis over the 45° stationary solar photovoltaic energy system was found to be 74.2% efficient. Results from this research are quite promising and significant in satisfying the purpose of the research objectives and questions posed in the thesis. The new algorithms introduced in this research and the statistical measures applied to the modelling and simulation of a smart static solar photovoltaic grid system performance outperformed other previous works in reviewed literature.
Based on this new implementation design of the online data logging systems for solar photovoltaic monitoring, it is possible for the first time to have online on-site information of the energy produced remotely, fault identification and rectification, maintenance and recovery time deployed as fast as possible. The results presented in this research as Internet of things (IoT) on smart solar grid systems are likely to offer real-life experiences especially both to the existing body of knowledge and the future solar photovoltaic energy industry irrespective of the study site location for the comparative solar photovoltaic tracking systems. While the thesis has contributed to the smart solar photovoltaic grid system, it has also highlighted areas of further research and the need to investigate more on improving the choice and quality design for solar photovoltaic modules. Finally, it has also made recommendations for further research in the minimization of the absolute or relative errors in the quality and design of the smart static solar photovoltaic module.

ArrayMining: a modular web-application for microarray analysis combining ensemble and consensus methods with cross-study normalization

Relevance: 30.00%

Abstract:

Background: Statistical analysis of DNA microarray data provides a valuable diagnostic tool for the investigation of genetic components of diseases. To take advantage of the multitude of available data sets and analysis methods, it is desirable to combine both different algorithms and data from different studies. Applying ensemble learning, consensus clustering and cross-study normalization methods for this purpose in an almost fully automated process and linking different analysis modules together under a single interface would simplify many microarray analysis tasks. Results: We present ArrayMining.net, a web-application for microarray analysis that provides easy access to a wide choice of feature selection, clustering, prediction, gene set analysis and cross-study normalization methods. In contrast to other microarray-related web-tools, multiple algorithms and data sets for an analysis task can be combined using ensemble feature selection, ensemble prediction, consensus clustering and cross-platform data integration. By interlinking different analysis tools in a modular fashion, new exploratory routes become available, e.g. ensemble sample classification using features obtained from a gene set analysis and data from multiple studies. The analysis is further simplified by automatic parameter selection mechanisms and linkage to web tools and databases for functional annotation and literature mining. Conclusion: ArrayMining.net is a free web-application for microarray analysis combining a broad choice of algorithms based on ensemble and consensus methods, using automatic parameter selection and integration with annotation databases.
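
One common way to build a consensus clustering from several base clusterings is a co-association matrix: count how often each pair of samples is placed in the same cluster, then group samples that co-occur in a majority of runs. The sketch below illustrates that idea only; the abstract does not state which consensus method ArrayMining.net uses, so the approach, function names, threshold and toy labelings are all assumptions.

```python
def co_association(labelings):
    """Fraction of base clusterings in which each pair of samples co-clusters."""
    n = len(labelings[0])
    weight = 1.0 / len(labelings)
    matrix = [[0.0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += weight
    return matrix


def consensus_clusters(labelings, threshold=0.5):
    """Group samples linked by co-association above the threshold.

    Clusters are the connected components of the thresholded matrix,
    numbered in order of their first sample.
    """
    matrix = co_association(labelings)
    n = len(matrix)
    assignment = [-1] * n
    cluster_id = 0
    for start in range(n):
        if assignment[start] != -1:
            continue
        assignment[start] = cluster_id
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if assignment[j] == -1 and matrix[i][j] > threshold:
                    assignment[j] = cluster_id
                    stack.append(j)
        cluster_id += 1
    return assignment


if __name__ == "__main__":
    # Three hypothetical base clusterings of five samples; label values
    # are arbitrary and need not match across runs.
    runs = [
        [0, 0, 0, 1, 1],
        [1, 1, 0, 0, 0],
        [0, 0, 1, 1, 1],
    ]
    print(consensus_clusters(runs))
```

Working on pairwise co-occurrence sidesteps the problem that cluster labels from different algorithms are not comparable directly.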

Modelling global pyrogeography using data derived from satellite imagery

Relevance: 30.00%

Abstract:

Doctorate in Forest Engineering and Natural Resources - Instituto Superior de Agronomia - UL

SVR-GARCH com misturas de kernels gaussianos [SVR-GARCH with mixtures of Gaussian kernels]

Relevance: 30.00%

Abstract:

Dissertation (master's)--Universidade de Brasília, Departamento de Administração, Programa de Pós-graduação em Administração, 2016.

Simple approximate MAP inference for Dirichlet processes mixtures

Relevance: 30.00%

Abstract:

The Dirichlet process mixture model (DPMM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as Gibbs sampling are required. As a result, DPMM-based methods, which have considerable potential, are restricted to applications in which computational resources and time for inference are plentiful. For example, they would not be practical for digital signal processing on embedded hardware, where computational resources are at a serious premium. Here, we develop a simplified yet statistically rigorous approximate maximum a-posteriori (MAP) inference algorithm for DPMMs. This algorithm is as simple as DP-means clustering and solves the MAP problem as well as Gibbs sampling, while requiring only a fraction of the computational effort. (For freely available code that implements the MAP-DP algorithm for Gaussian mixtures see http://www.maxlittle.net/.) Unlike related small variance asymptotics (SVA), our method is non-degenerate and so inherits the “rich get richer” property of the Dirichlet process. It also retains a non-degenerate closed-form likelihood which enables out-of-sample calculations and the use of standard tools such as cross-validation. We illustrate the benefits of our algorithm on a range of examples and contrast it to variational, SVA and sampling approaches from both a computational complexity perspective as well as in terms of clustering performance. We demonstrate the wide applicability of our approach by presenting an approximate MAP inference method for the infinite hidden Markov model whose performance contrasts favorably with a recently proposed hybrid SVA approach. Similarly, we show how our algorithm can be applied to a semiparametric mixed-effects regression model where the random effects distribution is modelled using an infinite mixture model, as used in longitudinal progression modelling in population health science. Finally, we propose directions for future research on approximate MAP inference in Bayesian nonparametrics.
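
For readers unfamiliar with the DP-means baseline that the abstract uses as its point of comparison for simplicity, here is a minimal sketch of that algorithm. It is DP-means, not the authors' MAP-DP method (their code is at the URL given in the abstract): points are assigned to the nearest centre unless the squared distance exceeds a penalty `lam`, in which case the point seeds a new cluster. The parameter name `lam`, the iteration cap and the sample data are assumptions of this sketch.

```python
def dp_means(points, lam, max_iter=100):
    """DP-means clustering; returns (centers, labels).

    A new cluster is opened whenever a point lies farther than lam
    (in squared Euclidean distance) from every existing centre.
    """
    dim = len(points[0])

    def mean(group):
        return tuple(sum(p[d] for p in group) / len(group) for d in range(dim))

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centers = [mean(points)]  # start with one cluster at the global mean
    labels = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            dists = [sq_dist(p, c) for c in centers]
            best = min(range(len(centers)), key=dists.__getitem__)
            if dists[best] > lam:
                centers.append(tuple(p))
                best = len(centers) - 1
            if labels[i] != best:
                labels[i] = best
                changed = True
        # Recompute centres and drop clusters that ended up empty.
        groups = {}
        for p, label in zip(points, labels):
            groups.setdefault(label, []).append(p)
        kept = sorted(groups)
        remap = {old: new for new, old in enumerate(kept)}
        centers = [mean(groups[old]) for old in kept]
        labels = [remap[label] for label in labels]
        if not changed:
            break
    return centers, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    centers, labels = dp_means(data, lam=1.0)
    print(centers, labels)
```

The hard distance threshold is what the abstract refers to as degenerate: cluster sizes play no role in assignments, whereas the "rich get richer" property retained by MAP-DP makes larger clusters more attractive.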
