797 results for Statistical Learning Theory.
Abstract:
We have investigated the structure of vertically coupled double quantum dots at zero magnetic field within local-spin-density functional theory. The dots are identical and have a finite width, and the whole system is axially symmetric. We first discuss the effect of thickness on the addition spectrum of a single dot. Next we describe the structure of coupled dots as a function of the interdot distance for different electron numbers. Addition spectra, Hund's rule, and molecular-type configurations are discussed. It is shown that self-interaction corrections to the density-functional results do not play a very important role in the calculated addition spectra.
Abstract:
This paper highlights the prediction of learning disabilities (LD) in school-age children using rough set theory (RST), with an emphasis on the application of data mining. In rough sets, data analysis starts from a data table called an information system, which contains data about objects of interest characterized in terms of attributes. Here these attributes consist of the properties of learning disabilities. By finding the relationships between these attributes, the redundant attributes can be eliminated and the core attributes determined. Rule mining is also performed in rough sets using the LEM1 algorithm. The prediction of LD is carried out accurately using Rosetta, the rough set toolkit for data analysis. The results of this study are compared with the output of a similar study we conducted using a Support Vector Machine (SVM) with the Sequential Minimal Optimisation (SMO) algorithm. We find that, using the concepts of reduct and global covering, learning disabilities in children can be predicted easily.
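The rough-set notions mentioned in this abstract (information system, redundant versus core attributes) can be illustrated with a minimal sketch. It is not the LEM1 algorithm and it does not use Rosetta; the table, the attribute names, and the function names below are hypothetical and chosen only for illustration. An attribute is treated as redundant when dropping it leaves the positive region of the decision unchanged.

```python
from collections import defaultdict

# Hypothetical information system: six children, three condition
# attributes, and one decision attribute ("ld"). Not data from the study.
TABLE = {
    "c1": {"reading": "poor", "attention": "low", "spelling": "poor", "ld": "yes"},
    "c2": {"reading": "poor", "attention": "low", "spelling": "poor", "ld": "yes"},
    "c3": {"reading": "good", "attention": "low", "spelling": "good", "ld": "no"},
    "c4": {"reading": "good", "attention": "high", "spelling": "good", "ld": "no"},
    "c5": {"reading": "poor", "attention": "high", "spelling": "poor", "ld": "yes"},
    "c6": {"reading": "poor", "attention": "high", "spelling": "poor", "ld": "no"},
}
ATTRS = ["reading", "attention", "spelling"]


def indiscernibility_classes(table, attributes):
    """Group objects that have identical values on the given attributes."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attributes)].add(obj)
    return list(classes.values())


def approximations(table, attributes, target):
    """Return the (lower, upper) approximation of the object set `target`."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(table, attributes):
        if block <= target:   # block lies entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return lower, upper


def positive_region(table, attributes, decision):
    """Objects that the attributes classify unambiguously for the decision."""
    region = set()
    for decision_class in indiscernibility_classes(table, [decision]):
        lower, _ = approximations(table, attributes, decision_class)
        region |= lower
    return region


def dispensable(table, attributes, candidate, decision):
    """True if dropping `candidate` leaves the positive region unchanged."""
    reduced = [a for a in attributes if a != candidate]
    full = positive_region(table, attributes, decision)
    return positive_region(table, reduced, decision) == full
```

In this toy table, "spelling" duplicates "reading", so either one can be dropped on its own, while "attention" cannot be dropped without shrinking the positive region.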
Abstract:
Econometrics is a young science. It developed during the twentieth century, beginning in the mid-1930s and primarily after World War II. Econometrics is the unification of statistical analysis, economic theory, and mathematics. Its history can be traced to the use of statistical and mathematical analysis in economics. The most prominent contributions during the initial period are the works of Tinbergen and Frisch, and that of Haavelmo from the 1940s through the mid-1950s. Starting from the rudimentary application of statistics to economic data, such as the use of the laws of error through the development of least squares by Legendre, Laplace, and Gauss, the discipline later witnessed the applied work of Edgeworth and Mitchell. A very significant milestone in its evolution was the work of Tinbergen, Frisch, and Haavelmo in developing multiple regression and correlation analysis, which they used to test different economic theories on time series data. Although some predictions based on econometric methodology may have gone wrong, the sound scientific nature of the discipline cannot be ignored. This is reflected in the economic rationale underlying any econometric model and in the statistical and mathematical reasoning behind the inferences drawn. In this context, the relevance of econometrics as an academic discipline is high. Because of its interdisciplinary nature (a unification of economics, statistics, and mathematics), the subject can be taught in all of these broad areas, even though it is most often offered to economics students alone, since students of other disciplines may lack the economics background needed to understand it. In fact, econometrics is quite relevant even for technical courses (such as engineering), business management courses (such as the MBA), and professional accountancy courses. It is more relevant still for research students in the various social sciences, commerce, and management. In the ongoing scenario of globalization and economic deregulation, econometrics needs added emphasis in higher education across the social science streams, commerce, management, and professional accountancy. This would sharpen students' analytical ability, improve their capacity to examine socio-economic problems with a mathematical approach, and enable them to derive scientific inferences and solutions to such problems. The importance of hands-on practical training in computer-based econometric packages, especially at the postgraduate and research levels, needs to be pointed out here. Merely learning the econometric methodology or the underlying theories would have little practical utility for students in their future careers, whether in academia, industry, or practice. This paper seeks to trace the historical development of econometrics and to study its current status as an academic discipline in higher education. The paper also looks into the problems teachers face in teaching econometrics and those students face in learning the subject, including the effective application of the methodology in real-life situations. Accordingly, the paper offers some suggestions for the effective teaching of econometrics in higher education.
Abstract:
This study investigated the relationship between higher education and the requirements of the world of work, with an emphasis on the effect of problem-based learning (PBL) on graduates' competencies. Implementing the full PBL method is costly (Albanese & Mitchell, 1993; Berkson, 1993; Finucane, Shannon, & McGrath, 2009). However, implementing PBL in a less than curriculum-wide mode is more achievable in a broader context (Albanese, 2000). This means that a higher education institution implements only a few PBL components in the curriculum, or that a teacher implements a few PBL components at the course level. This kind of implementation requires identifying the PBL components and their effects on particular educational outputs (Hmelo-Silver, 2004; Newman, 2003). So far, however, there has been little research on this topic. The main aims of this study were (1) to identify each PBL component, as manifested in the development of a valid and reliable PBL implementation questionnaire, and (2) to determine the effect of each identified PBL component on specific graduate competencies. The analysis was based on quantitative data collected in a survey of medicine graduates of Gadjah Mada University, Indonesia. A total of 225 graduates responded to the survey. Confirmatory factor analysis (CFA) showed that all individual constructs of PBL and graduates' competencies had acceptable goodness-of-fit (GOF) values. Additionally, the factor loadings (standardized loading estimates), the average variance extracted (AVE), the construct reliability (CR), and the average shared squared variance (ASV) provided evidence of convergent and discriminant validity. All values indicated valid and reliable measurements. The investigation of the effects of PBL showed that each PBL component had specific effects on graduates' competencies. Interpersonal competencies were affected by Student-centred learning (β = .137; p < .05) and Small group components (β = .078; p < .05). Problem as stimulus affected Leadership (β = .182; p < .01). Real-world problems affected Personal and organisational competencies (β = .140; p < .01) and Interpersonal competencies (β = .114; p < .05). Teacher as facilitator affected Leadership (β = .142; p < .05). Self-directed learning affected Field-related competencies (β = .080; p < .05). These results can help higher education institutions and educators make informed choices about implementing PBL components. With this information, they could fulfil their educational goals while staying within their limited resources. This study seeks to improve on the research methods of prior studies in four major ways: (1) by identifying PBL components based on theory and empirical data; (2) by using latent variables in the structural equation modelling instead of using a single variable as a proxy for a construct; (3) by using CFA to validate the latent structure of the measurement, thus providing better evidence of validity; and (4) by using graduate survey data, which is suitable for analysing PBL effects in the framework of the relationship between higher education and the world of work.
Abstract:
The increasing interconnection of information and communication systems leads to a further rise in complexity and thus to a further increase in security vulnerabilities. Classical protection mechanisms such as firewall systems and anti-malware solutions have long since ceased to offer protection against intrusion attempts into IT infrastructures. Intrusion detection systems (IDS) have established themselves as a very effective instrument for protection against cyber attacks. Such systems collect and analyze information from network components and hosts in order to detect unusual behavior and security violations automatically. While signature-based approaches can detect only already known attack patterns, anomaly-based IDS are also able to detect new, previously unknown attacks (zero-day attacks) at an early stage. The core problem of intrusion detection systems, however, lies in the optimal processing of the massive volume of network data and in the development of an adaptive detection model that works in real time. To address these challenges, this dissertation provides a framework consisting of two main parts. The first part, called OptiFilter, uses a dynamic queuing concept to process the large volume of incoming network data, continuously constructs network connections, and exports structured input data for the IDS. The second part is an adaptive classifier comprising a classifier model based on the Enhanced Growing Hierarchical Self-Organizing Map (EGHSOM), a model of normal network behavior (NNB), and an update model. In OptiFilter, Tcpdump and SNMP traps are used to continuously aggregate network packets and host events. These aggregated network packets and host events are analyzed further and converted into connection vectors. To improve the detection rate of the adaptive classifier, the artificial neural network GHSOM is investigated intensively and substantially extended. This dissertation proposes and discusses several approaches. A classification-confidence margin threshold is defined to uncover unknown malicious connections; the stability of the growth topology is increased through novel approaches to initializing the weight vectors and through strengthening the winner neurons; and a self-adaptive procedure is introduced so that the model can be updated continuously. In addition, the main task of the NNB model is to examine further the unknown connections detected by the EGHSOM and to check whether they are normal. However, network traffic data change constantly because of the concept drift phenomenon, which produces non-stationary network data in real time. The update model keeps this phenomenon under better control. The EGHSOM model can detect new anomalies effectively, and the NNB model adapts optimally to changes in the network data. In the experimental evaluation, the framework showed promising results. In the first experiment, the framework was evaluated in offline mode. OptiFilter was evaluated with offline, synthetic, and realistic data. The adaptive classifier was evaluated with 10-fold cross-validation to estimate its accuracy. In the second experiment, the framework was installed on a 1 to 10 GB network link and evaluated online in real time. OptiFilter successfully converted the massive volume of network data into structured connection vectors, and the adaptive classifier classified them precisely. The comparative study between the developed framework and other well-known IDS approaches shows that the proposed IDS framework outperforms all other approaches. This can be attributed to the following key points: processing of the collected network data, achieving the best performance (such as overall accuracy), detecting unknown connections, and developing a real-time detection model for intrusion attempts.
Abstract:
We consider an online learning scenario in which the learner can make predictions on the basis of a fixed set of experts. The performance of each expert may change over time in a manner unknown to the learner. We formulate a class of universal learning algorithms for this problem by expressing them as simple Bayesian algorithms operating on models analogous to Hidden Markov Models (HMMs). We derive a new performance bound for such algorithms which is considerably simpler than existing bounds. The bound provides the basis for learning the rate at which the identity of the optimal expert switches over time. We find an analytic expression for the a priori resolution at which we need to learn the rate parameter. We extend our scalar switching-rate result to models of the switching-rate that are governed by a matrix of parameters, i.e. arbitrary homogeneous HMMs. We apply and examine our algorithm in the context of the problem of energy management in wireless networks. We analyze the new results in the framework of Information Theory.
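The idea of treating expert tracking as Bayesian updating in an HMM-like model can be sketched as follows. This is a minimal illustration of a fixed-share style update with a known, fixed switching rate; it is not the paper's algorithm for learning that rate. It assumes that each expert's loss plays the role of a negative log-likelihood and that a switch moves uniformly to one of the other experts. The function names and the loss sequence are hypothetical.

```python
import math


def bayes_switching_update(weights, losses, alpha):
    """One round of Bayesian updating over which expert is currently best.

    weights: current distribution over experts (sums to 1)
    losses:  loss of each expert this round, used as a negative log-likelihood
    alpha:   assumed probability that the best expert switches each round
    """
    n = len(weights)
    # Observation step: posterior is proportional to prior * exp(-loss).
    posterior = [w * math.exp(-loss) for w, loss in zip(weights, losses)]
    total = sum(posterior)
    posterior = [p / total for p in posterior]
    if n == 1:
        return posterior
    # Transition step: stay with probability 1 - alpha, otherwise move
    # uniformly to one of the other n - 1 experts.
    return [(1.0 - alpha) * p + alpha * (1.0 - p) / (n - 1) for p in posterior]


def run(loss_sequence, alpha):
    """Apply the update over a whole sequence, starting from uniform weights."""
    n = len(loss_sequence[0])
    weights = [1.0 / n] * n
    for losses in loss_sequence:
        weights = bayes_switching_update(weights, losses, alpha)
    return weights


# Hypothetical losses: expert 0 is best for 20 rounds, then expert 1.
LOSSES = [[0.0, 1.0, 1.0]] * 20 + [[1.0, 0.0, 1.0]] * 20
```

With `alpha = 0` the update reduces to static Bayesian averaging, which is slow to abandon an expert that was good early on; a positive `alpha` lets the weights follow the switch.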
Abstract:
In previous work (Olshausen & Field 1996), an algorithm was described for learning linear sparse codes which, when trained on natural images, produces a set of basis functions that are spatially localized, oriented, and bandpass (i.e., wavelet-like). This note shows how the algorithm may be interpreted within a maximum-likelihood framework. Several useful insights emerge from this connection: it makes explicit the relation to statistical independence (i.e., factorial coding), it shows a formal relationship to the algorithm of Bell and Sejnowski (1995), and it suggests how to adapt parameters that were previously fixed.
Abstract:
Modeling and predicting co-occurrences of events is a fundamental problem of unsupervised learning. In this contribution we develop a statistical framework for analyzing co-occurrence data in a general setting where elementary observations are joint occurrences of pairs of abstract objects from two finite sets. The main challenge for statistical models in this context is to overcome the inherent data sparseness and to estimate the probabilities for pairs which were rarely observed or even unobserved in a given sample set. Moreover, it is often of considerable interest to extract grouping structure or to find a hierarchical data organization. A novel family of mixture models is proposed which explain the observed data by a finite number of shared aspects or clusters. This provides a common framework for statistical inference and structure discovery and also includes several recently proposed models as special cases. Adopting the maximum likelihood principle, EM algorithms are derived to fit the model parameters. We develop improved versions of EM which largely avoid overfitting problems and overcome the inherent locality of EM-based optimization. Among the broad variety of possible applications, e.g., in information retrieval, natural language processing, data mining, and computer vision, we have chosen document retrieval, the statistical analysis of noun/adjective co-occurrence and the unsupervised segmentation of textured images to test and evaluate the proposed algorithms.
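To make the mixture idea concrete, here is a minimal sketch of plain EM for one simple aspect-style mixture over pairs, assuming the joint probability factorizes as P(x, y) = Σ_k P(k) P(x|k) P(y|k). This is an assumed parameterization for illustration, not necessarily any of the abstract's specific models, and it is ordinary EM without the improved variants the abstract mentions. The count matrix and all names are hypothetical.

```python
import math
import random


def em_aspect_model(counts, n_aspects, n_iter=30, seed=0):
    """Fit P(x, y) = sum_k P(k) P(x|k) P(y|k) to a co-occurrence count matrix.

    Returns (p_k, p_x_k, p_y_k, history), where history[t] is the
    log-likelihood of the parameters used in iteration t.
    """
    rng = random.Random(seed)
    n_x, n_y = len(counts), len(counts[0])

    def random_dist(size):
        raw = [rng.random() + 0.1 for _ in range(size)]
        total = sum(raw)
        return [r / total for r in raw]

    p_k = [1.0 / n_aspects] * n_aspects
    p_x_k = [random_dist(n_x) for _ in range(n_aspects)]
    p_y_k = [random_dist(n_y) for _ in range(n_aspects)]
    history = []

    for _ in range(n_iter):
        acc_k = [0.0] * n_aspects
        acc_x = [[0.0] * n_x for _ in range(n_aspects)]
        acc_y = [[0.0] * n_y for _ in range(n_aspects)]
        loglik = 0.0
        for x in range(n_x):
            for y in range(n_y):
                n_xy = counts[x][y]
                if n_xy == 0:
                    continue  # unobserved pairs contribute nothing
                joint = [p_k[k] * p_x_k[k][x] * p_y_k[k][y]
                         for k in range(n_aspects)]
                total = sum(joint)
                loglik += n_xy * math.log(total)
                # E-step: responsibility of each aspect for this pair,
                # accumulated as expected counts.
                for k in range(n_aspects):
                    r = n_xy * joint[k] / total
                    acc_k[k] += r
                    acc_x[k][x] += r
                    acc_y[k][y] += r
        history.append(loglik)
        # M-step: renormalize the expected counts.
        grand_total = sum(acc_k)
        p_k = [v / grand_total for v in acc_k]
        p_x_k = [[v / acc_k[k] for v in acc_x[k]] for k in range(n_aspects)]
        p_y_k = [[v / acc_k[k] for v in acc_y[k]] for k in range(n_aspects)]
    return p_k, p_x_k, p_y_k, history


# Hypothetical counts with two visible blocks of co-occurring objects.
COUNTS = [
    [5, 4, 0, 0],
    [4, 6, 0, 0],
    [0, 0, 7, 3],
    [0, 0, 2, 8],
]
```

Because the result depends on the random initialization, plain EM of this kind only finds a local optimum, which is the locality problem the abstract refers to.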
Abstract:
The first discussion of compositional data analysis is attributable to Karl Pearson, in 1897. However, notwithstanding the recent developments on the algebraic structure of the simplex, more than twenty years after Aitchison's idea of log-transformations of closed data, the scientific literature is again full of statistical treatments of this type of data using traditional methodologies. This is particularly true in environmental geochemistry, where, besides the problem of closure, the spatial structure (dependence) of the data has to be considered. In this work we propose the use of log-contrast values, obtained by a simplicial principal component analysis, as indicators of given environmental conditions. Investigating the log-contrast frequency distributions makes it possible to identify the statistical laws able to generate the values and to govern their variability. The changes, if compared, for example, with the mean values of the random variables assumed as models, or with other reference parameters, allow monitors to be defined for assessing the extent of possible environmental contamination. A case study on running and ground waters from the Chiavenna Valley (Northern Italy), using Na+, K+, Ca2+, Mg2+, HCO3-, SO4 2- and Cl- concentrations, will be illustrated.
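As a small illustration of the quantities involved, the sketch below computes the closure, the centered log-ratio (clr) transform, and a log-contrast, that is, a linear combination of log-parts whose coefficients sum to zero. In a simplicial principal component analysis the coefficients would come from the eigenvectors of the clr covariance matrix; here they are hand-picked, and the sample values are hypothetical rather than taken from the Chiavenna Valley data.

```python
import math


def closure(parts, total=1.0):
    """Rescale positive parts so that they sum to `total`."""
    s = sum(parts)
    return [total * p / s for p in parts]


def clr(parts):
    """Centered log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def log_contrast(parts, coefficients):
    """Sum of a_i * log(x_i), with coefficients required to sum to zero."""
    if abs(sum(coefficients)) > 1e-12:
        raise ValueError("log-contrast coefficients must sum to zero")
    return sum(a * math.log(p) for a, p in zip(coefficients, parts))


# Hypothetical concentrations of four components in one water sample.
SAMPLE = [12.0, 3.0, 40.0, 9.0]
```

Because the coefficients sum to zero, a log-contrast gives the same value for the raw concentrations and for their closed (rescaled) version, which is why it sidesteps the closure problem.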
Abstract:
The preceding two editions of CoDaWork included talks on the possible consideration of densities as infinite compositions: Egozcue and Díaz-Barrero (2003) extended the Euclidean structure of the simplex to a Hilbert space structure of the set of densities within a bounded interval, and van den Boogaart (2005) generalized this to the set of densities bounded by an arbitrary reference density. From the many variations of the Hilbert structures available, we work with three cases. For bounded variables, a basis derived from Legendre polynomials is used. For variables with a lower bound, we standardize them with respect to an exponential distribution and express their densities as coordinates in a basis derived from Laguerre polynomials. Finally, for unbounded variables, a normal distribution is used as reference, and coordinates are obtained with respect to a basis derived from Hermite polynomials. To get the coordinates, several approaches can be considered. A numerical accuracy problem occurs if one estimates the coordinates directly by using discretized scalar products. Thus we propose to use a weighted linear regression approach, where all k-order polynomials are used as predictand variables and weights are proportional to the reference density. Finally, for the case of 2-order Hermite polynomials (normal reference) and 1-order Laguerre polynomials (exponential), one can also derive the coordinates from their relationships to the classical mean and variance. Apart from these theoretical issues, this contribution focuses on the application of this theory to two main problems in sedimentary geology: the comparison of several grain size distributions, and the comparison among different rocks of the empirical distribution of a property measured on a batch of individual grains from the same rock or sediment, such as their composition.
Abstract:
A brief skim through educational theory intended for students registered on a single module in Technology Enhanced Learning. Starts with Bloom's taxonomy, travels through instructivism and constructivism, and moves on to theories of motivation.
Abstract:
This project aims to identify which concepts of health, disease, epidemiology, and risk are applicable to companies in Colombia's oil and natural gas extraction sector. Given the low predictive power of traditional financial analyses and their inadequacy for investment and long-term decision-making, as well as their failure to consider variables such as risk and future expectations, there is a need to address different perspectives and integrative models. This observation is pertinent to the oil and natural gas extraction sector because of the growing foreign investment it has reported: US$2,862 million in 2010, more than ten times its 2003 value. Multidimensional models could therefore be developed on the basis of financial-health, epidemiological, and statistical concepts. The term health, and its adoption in the business sector, is useful and conceptually coherent, revealing the presence of different interacting and interconnected subsystems or factors. It should also be mentioned that a multidimensional (multi-stage) model must take risk into account, and epidemiological analysis has proven useful for determining risk and integrating it into the system alongside other concepts such as the risk ratio and relative risk. This will be analyzed through a theoretical-conceptual study, which complements a previous study, in order to contribute to the corporate finance project of the Management research line.
Abstract:
This document presents research on managers' attitudes toward adopting e-learning as a working tool in organizations in Bogotá. A survey of 101 managers was conducted using convenience sampling, with the aim of identifying their attitudes toward the use of e-learning and its influence within the organization. The results show that managers' attitudes influence the use of e-learning tools, the actions that promote their use, and employees' attitudes; in turn, beliefs related to the appropriation of e-learning tools and the factors that facilitate their use influence managers' attitudes. These findings derive from analyses that contrast the results with the empirical studies found and the theoretical framework developed.
Abstract:
The financial results of organizations are a subject of ongoing study and analysis; predicting their behavior is a constant task for business owners, investors, analysts, and academics. This work explores the impact of asset size (total asset value) on operating and net income. It first analyzes the relationship between these variables using traditional financial-analysis indicators, such as operating and net profitability, together with descriptive statistics that allow the data to be classified as linear or nonlinear. Having found that the 2012 financial results of the companies supervised by the Superintendencia de Sociedades behave nonlinearly, the work then analyzes the relationship between assets and results using phase spaces and recurrence analysis, tools suited to chaotic and complex systems. The source of information for the research and for reviewing the relationship between assets and financial results was the year-end 2012 financial reports of the Superintendencia de Sociedades (Superintendencia de Sociedades, 2012).
Abstract:
A cross-sectional study was conducted in which three non-cardiologist residents received basic training in echocardiography (22 hours of theory, 65 hours of practice), following the recommendations of the American Society of Echocardiography and drawing on problem-based learning, to develop the necessary technical and diagnostic competencies. Agreement between the residents and expert echocardiographers was then analyzed. A total of 122 hospitalized patients who met the inclusion and exclusion criteria were enrolled; each underwent a conventional echocardiogram performed by the expert and an echocardiographic assessment by the resident. The acoustic window, contractility, left ventricular function, and pericardial effusion were evaluated. The hypothesis was that moderate agreement would be obtained. Results: Interobserver agreement was moderate, falling between 0.40 and 0.60, for myocardial contractility (kappa = 0.57, p = 0.000) and left ventricular systolic function (kappa = 0.54, p = 0.000), with high statistical significance. Agreement was low, falling between 0.20 and 0.40, for acoustic window quality (kappa = 0.22, p = 0.000) and the presence of pericardial effusion (kappa = 0.26, p = 0.000). For the diagnosis of left ventricular systolic dysfunction by the residents, sensitivity was 90%, specificity 67%, positive predictive value 80%, and negative predictive value 85%.
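For readers unfamiliar with the agreement statistics reported in this abstract, the sketch below shows how Cohen's kappa and the four diagnostic measures are computed from a cross-tabulation of two raters. The counts are hypothetical and are not the study's data, and the function names are illustrative only.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] is the number of cases placed in category i by rater A
    and in category j by rater B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_share = [sum(table[i]) / n for i in range(k)]
    col_share = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected by chance if the two raters were independent.
    expected = sum(r * c for r, c in zip(row_share, col_share))
    return (observed - expected) / (1.0 - expected)


def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV against a reference reading."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical 2x2 table: rows are the resident (positive, negative),
# columns are the expert (positive, negative).
TABLE = [
    [20, 5],
    [10, 15],
]
```

In this toy table the raters agree on 35 of 50 cases while chance alone would give 25, so kappa is (0.7 - 0.5) / (1 - 0.5) = 0.4.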