This work focuses on recommender systems and their characteristics. These mechanisms are used ever more widely on the web, and increasingly accurate and efficient solutions are being developed in parallel. Among the existing approaches, we chose to examine the one adopted in Apache Mahout. This open-source library implements collaborative filtering, basing the recommendation process on the preferences that users express about different items. Thanks to Apache Mahout and the basic principles of the various types of recommendation, it was possible to build a web application that produces recommendations in the domain of scientific publications, selecting the articles most similar to those published by the current user. Carrying out this project led to the definition of a hybrid system. Apache Mahout's approach to recommendation cannot be fully adapted to this setting, so its components were extended and tailored to the case study. We therefore sought to combine collaborative filtering and the content-based approach in a single method. From Apache Mahout we kept the algorithm used to examine the data in the data set, entirely leaving aside the aspect tied to user preferences, since users do not rate the articles. From the content-based approach we took the idea of comparing publication titles. Evaluating the application brought to light several limitations, but also possible future developments that could improve the quality of the recommendations and, above all, performance. For example, Apache Hadoop would enable distributed computation, making it possible to process thousands of data items with more than fair results.
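The title-comparison idea can be illustrated with a minimal sketch. It is not the Apache Mahout API and not the application's actual code: the function names (`tokens`, `jaccard`, `recommend`), the Jaccard measure, and the rule of ignoring words of two letters or fewer are all assumptions made for illustration. Each candidate article is scored by its best title overlap with any paper the current user has published.

```python
import re

def tokens(title):
    """Lowercased word set of a title, ignoring words of two letters or fewer."""
    return {w for w in re.findall(r"[a-z0-9]+", title.lower()) if len(w) > 2}

def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(user_titles, candidate_titles, top_n=3):
    """Rank candidates by their best title similarity to any of the user's papers."""
    user_sets = [tokens(t) for t in user_titles]
    scored = []
    for title in candidate_titles:
        cand = tokens(title)
        score = max((jaccard(cand, u) for u in user_sets), default=0.0)
        scored.append((score, title))
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(title, round(score, 3)) for score, title in scored[:top_n] if score > 0]

if __name__ == "__main__":
    mine = ["Collaborative filtering for recommender systems"]
    pool = [
        "Recommender systems based on collaborative filtering",
        "Acid mine drainage in pit lakes",
        "Content-based video indexing",
    ]
    print(recommend(mine, pool))
```

Candidates with no word in common with the user's titles are dropped rather than ranked last, so an empty result means that nothing comparable was found.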


There is now broad consensus that higher education must extend beyond content-based knowledge to encompass intellectual and practical skills, personal and social responsibility, and integrative learning. The college learning outcomes needed for success in 21st-century life include critical thinking, a coherent sense of self, intercultural maturity, civic engagement, and the capacity for mutual relationships. Yet research suggests that college students are struggling to achieve these outcomes, in part because the skills needed to succeed in college are not those needed to succeed after graduation. One reason for this gap is that these college learning outcomes require complex developmental capacities, or “self-authorship”, that higher education is not currently designed to promote.


The Continental porphyry Cu‐Mo mine, located 2 km east of the famous Berkeley Pit lake of Butte, Montana, contains two small lakes that vary in size depending on mining activity. In contrast to the acidic Berkeley Pit lake, the Continental Pit waters have near-neutral pH and relatively low metal concentrations. The main reason is geological: whereas the Berkeley Pit mined highly‐altered granite rich in pyrite with no neutralizing potential, the Continental Pit is mining weakly‐altered granite with lower pyrite concentrations and up to 1‐2% hydrothermal calcite. The purpose of this study was to gather and interpret information that bears on the chemistry of surface water and groundwater in the active Continental Pit. Pre‐existing chemistry data from sampling of the Continental Pit were compiled from the Montana Bureau of Mines and Geology and Montana Department of Environmental Quality records. In addition, in March of 2013, new water samples were collected from the mine’s main dewatering well, the Sarsfield well, and a nearby acidic seep (Pavilion Seep) and analyzed for trace metals and several stable isotopes, including δD and δ18O of water, δ13C of dissolved inorganic carbon, and δ34S of dissolved sulfate. In December 2013, several soil samples were collected from the shore of the frozen pit lake and surrounding area. The soil samples were analyzed using X‐ray diffraction to determine mineral content. Based on Visual MINTEQ modeling, water in the Continental Pit lake is near equilibrium with a number of carbonate, sulfate, and molybdate minerals, including calcite, dolomite, rhodochrosite (MnCO3), brochantite (CuSO4·3Cu(OH)2), malachite (Cu2CO3(OH)2), hydrozincite (Zn5(CO3)2(OH)6), gypsum, and powellite (CaMoO4). The fact that these minerals are close to equilibrium suggests that they are present on the weathered mine walls and/or in the sediment of the surface water ponds.
X‐ray diffraction (XRD) analysis of the pond “beach” sample failed to show any discrete metal‐bearing phases. One of the soil samples collected higher in the mine, near an area of active weathering of chalcocite‐rich ore, contained over 50% chalcanthite (CuSO4·5H2O). This copper salt dissolves readily in water and is probably a major source of copper to the pond and underlying groundwater system. However, concentrations of copper in the latter are probably controlled by other, less‐soluble minerals, such as brochantite or malachite. Although the acidity of the Pavilion Seep is high (~11 meq/L), its flow is currently much lower than that of the Sarsfield well. Thus, the pH and the major and minor element chemistry of the Continental Pit lakes are buffered by calcite and other carbonate minerals. For the Continental Pit waters to become acidic, the influx of acidic seepage (e.g., Pavilion Seep) would need to increase substantially over its present volume.


The purpose of this paper is to describe a teaching experience carried out at the Colegio Nacional 'Rafael Hernández' (UNLP) with fourth-year students, which integrates the use of ICT (Information and Communication Technologies) into second-language development. The experience falls within the Teaching for Understanding framework used at the school. The main aim of this integration is to develop in students not only communicative competence in the second language but also cultural and technological competence that leads them to become digitally literate. The theoretical framework for the experience is Content-Based Learning, which involves learning content through language. To implement the project, the UNLP virtual environment (WebUNLP) was used, and tasks were designed for students to complete around a topic that served as the common thread for both the face-to-face classes and the virtual tasks. The second part of the paper describes how the project was structured and presents a corpus of work produced by students who took part in the experience.


Today's digital libraries (DLs) archive vast amounts of information in the form of text, videos, images, data measurements, etc. User access to DL content can rely on similarity between metadata elements, or on similarity between the data itself (content-based similarity). We consider the problem of exploratory search in large DLs of time-oriented data. We propose a novel approach for overview-first exploration of data collections based on user-selected metadata properties. In a 2D layout, the entities of the selected property are arranged according to their similarity with respect to the underlying data content. The display is enhanced by compact summarizations of the underlying data elements, and forms the basis for exploratory navigation of the data space. The approach is proposed as an interface for visual exploration, leading the user to discover interesting relationships between data items by relying on content-based similarity between data items and their respective metadata labels. We apply the method to real data sets from the earth observation community, showing its applicability and usefulness.


Increasing amounts of data are collected in most areas of research and application. The degree to which this data can be accessed, analyzed, and retrieved is decisive for progress in fields such as scientific research or industrial production. We present a novel methodology supporting content-based retrieval and exploratory search in repositories of multivariate research data. In particular, our methods are able to describe two-dimensional functional dependencies in research data, e.g. the relationship between inflation and unemployment in economics. Our basic idea is to use feature vectors based on the goodness-of-fit of a set of regression models to describe the data mathematically. We call this approach Regressional Features and use it for content-based search and, since our approach motivates an intuitive definition of interestingness, for exploring the most interesting data. We apply our method to sizable real-world research datasets, showing the usefulness of our approach for user-centered access to research data in a Digital Library system.
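The idea of a feature vector built from goodness-of-fit values can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the three model families in `MODELS`, the use of R² as the goodness-of-fit measure, and the function names are all hypothetical choices. Each model has the form y = a + b·g(x) and is fitted by ordinary least squares, so its R² equals the squared correlation between g(x) and y.

```python
import math

def r_squared(xs, ys):
    """R^2 of the least-squares line y = a + b*x (0.0 if either variable is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)

# Hypothetical model set: each entry transforms x before the linear fit.
# The logarithmic model assumes strictly positive x values.
MODELS = {
    "linear": lambda x: x,
    "pure_quadratic": lambda x: x * x,
    "logarithmic": lambda x: math.log(x),
}

def regressional_features(xs, ys):
    """Feature vector holding one goodness-of-fit value per regression model."""
    return [r_squared([g(x) for x in xs], ys) for g in MODELS.values()]

def most_similar(query_features, repository):
    """Name of the stored data set whose feature vector is closest to the query."""
    return min(repository, key=lambda name: math.dist(repository[name], query_features))

if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5]
    repo = {
        "linear_data": regressional_features(xs, [2 * x + 1 for x in xs]),
        "quadratic_data": regressional_features(xs, [x * x for x in xs]),
    }
    query = regressional_features(xs, [3 * x * x + 2 for x in xs])
    print(most_similar(query, repo))
```

Because R² is unchanged by shifting or rescaling y, two data sets with the same functional shape but different units map to the same feature vector, which is what makes the vector usable as a content-based search key in this sketch.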


Although context could be exploited to improve performance, elasticity and adaptation in most distributed systems that adopt the publish/subscribe (P/S) model of communication, very few works have explored domains with highly dynamic context, and most adopted models are context-agnostic. In this paper, we present the key design principles underlying a novel context-aware content-based P/S (CA-CBPS) model of communication, in which context is explicitly managed, focusing on minimizing network overhead in domains with recurrent context changes by means of contextual scoping. We highlight how we dealt with the main shortcomings of most current approaches. Our research is among the first to study the problem of explicitly introducing context-awareness into the P/S model to capitalize on contextual information. The envisioned CA-CBPS middleware enables the cloud ecosystem of services to communicate very efficiently, in a decoupled but contextually scoped fashion.
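A toy broker can make the combination of content-based matching and contextual scoping concrete. It is a sketch under assumptions, not the CA-CBPS middleware: the `Broker` class, its method names, and the representation of filters and scopes as dictionaries of required key/value pairs are all hypothetical. An event reaches a subscriber only if the publisher's current context lies inside the subscription's scope and the event content satisfies the subscription's filter.

```python
class Broker:
    """In-memory broker that scopes content-based subscriptions by context."""

    def __init__(self):
        # Each entry: (subscriber name, content filter, context scope).
        self.subscriptions = []

    def subscribe(self, subscriber, content_filter, context_scope):
        """Register interest in events matching content_filter within context_scope."""
        self.subscriptions.append((subscriber, content_filter, context_scope))

    def publish(self, event, context):
        """Return the subscribers that should receive the event."""
        delivered = []
        for subscriber, content_filter, scope in self.subscriptions:
            # Contextual scoping: skip subscriptions whose scope excludes the publisher.
            if any(context.get(key) != value for key, value in scope.items()):
                continue
            # Content-based matching: every constraint of the filter must hold.
            if all(event.get(key) == value for key, value in content_filter.items()):
                delivered.append(subscriber)
        return delivered

if __name__ == "__main__":
    broker = Broker()
    broker.subscribe("alice", {"type": "alert"}, {"region": "eu"})
    broker.subscribe("bob", {"type": "alert"}, {"region": "us"})
    broker.subscribe("carol", {"type": "report"}, {})
    print(broker.publish({"type": "alert", "level": 3}, {"region": "eu"}))
```

Checking the scope before the content filter means that out-of-scope subscriptions are rejected without evaluating their content constraints; when the publisher's context changes, only the `context` argument changes and the subscriptions stay as they are.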


This doctoral thesis focuses on the modeling of multimedia systems to create personalized recommendation services based on the analysis of users’ audiovisual consumption. Research is focused on the characterization of both users’ audiovisual consumption and content, specifically images and video. This double characterization converges into a hybrid recommendation algorithm, adapted to different application scenarios covering different specificities and constraints. Hybrid recommendation systems use both content and user information as input data, applying the knowledge from the analysis of these data as the initial step to feed the algorithms in order to generate personalized recommendations. Regarding the user information, this doctoral thesis focuses on the analysis of audiovisual consumption to infer implicitly acquired preferences. The inference process is based on a new probabilistic model proposed in the text. This model takes into account qualitative and quantitative consumption factors on the one hand, and external factors such as zapping factor or company factor on the other. As for content information, this research focuses on the modeling of descriptors and aesthetic characteristics, which influence the user and are thus useful for the recommendation system. Similarly, the automatic extraction of these descriptors from the audiovisual piece without excessive computational cost has been considered a priority, in order to ensure applicability to different real scenarios. Finally, a new content-based recommendation algorithm has been created from the previously acquired information, i.e. user preferences and content descriptors. This algorithm has been hybridized with a collaborative filtering algorithm obtained from the current state of the art, so as to compare the efficiency of this hybrid recommender with the individual techniques of recommendation (different hybridization techniques of the state of the art have been studied for suitability). 
The content-based recommendation focuses on the influence of the aesthetic characteristics on the users. The heterogeneity of the possible users of these kinds of systems calls for the use of different criteria and attributes to create effective recommendations. Therefore, the proposed algorithm is adaptable to different perceptions producing a dynamic representation of preferences to obtain personalized recommendations for each user of the system. The hypotheses of this doctoral thesis have been validated by conducting a set of tests with real users, or by querying a database containing user preferences - available to the scientific community. This thesis is structured based on the different research and validation methodologies of the techniques involved. In the three central chapters the state of the art is studied and the developed algorithms and models are validated via self-designed tests. It should be noted that some of these tests are incremental and confirm the validation of previously discussed techniques. Summary This doctoral thesis focuses on modeling multimedia systems to create personalized recommendation services from the analysis of users' audiovisual consumption activity. The research concentrates on characterizing both the user's audiovisual consumption and the nature of the content, specifically images and videos. This dual characterization of users and content converges in a hybrid recommendation algorithm that adapts to different application scenarios, each with its own particularities and constraints. Every hybrid recommendation system takes both user and content information as its starting data and uses this knowledge as input to algorithms that generate personalized recommendations.
Regarding user information, the thesis focuses on analyzing audiovisual consumption to infer preferences, which are therefore acquired implicitly. To this end, a new probabilistic model is proposed that takes into account both quantitative and qualitative consumption factors, as well as other contextual factors, such as the zapping factor or the company factor, which condition the uncertainty of the inference. As for content information, the research has focused on defining aesthetic and morphological descriptors that influence the user and are therefore useful for recommendation. Likewise, it has been considered a priority that these descriptors can be extracted automatically from a piece of content without heavy computational requirements, so as to guarantee applicability to real scenarios of various kinds. Finally, exploiting the user preference information and content descriptions already obtained, a new content-based recommendation algorithm has been created. This algorithm is combined with a reference collaborative filtering algorithm from the state of the art, so that the efficiency of this hybrid recommender (for which the suitability of the different state-of-the-art hybridization techniques has been investigated) is compared with each of the individual recommendation techniques. The content-based recommendation algorithm focuses on the potential influence of aesthetic factors on users, bearing in mind that the heterogeneity of the user population means that the criteria and attributes shaping each individual's preferences differ.
Therefore, the algorithm adapts to different perceptions and sets out a dynamic methodology for representing preferences that yields personalized recommendations, unique to each user of the system. All the hypotheses of the thesis have been duly validated through tests with real users or with user-preference databases available to the scientific community. The differing research and validation methodology of each technique addressed shapes the structure of the thesis: each of the three central chapters is built on its own study of the state of the art, and the algorithms and models developed are validated through independent tests, although in some cases the tests are incremental and confirm the validation of techniques presented earlier.


This study suggests a theoretical framework for improving the teaching/learning process of the English used in aeronautical discourse, bringing together cognitive learning strategies, Genre Analysis and the Contemporary Theory of Metaphor (Lakoff and Johnson 1980; Lakoff 1993). It maintains that cognitive strategies such as imagery, deduction, inference and grouping can be enhanced by means of metaphor and genre awareness in the context of a content-based approach to language learning. A list of image metaphors and conceptual metaphors drawn from the terminological database METACITEC is provided. The metaphorical terms from the area of aeronautics have been taken from specialised dictionaries and categorised according to the conceptual metaphors they respond to, by establishing the source domains and the target domains, as well as the semantic networks found. This information refers to the internal mappings underlying the discourse of aeronautics as reflected in five aviation accident case studies, which are related to accident reports from the National Transportation Safety Board (NTSB), and it provides an important source for designing language teaching tasks. Cognitive Linguistics and Genre Analysis have contributed to improving second-language teaching and, in particular, to developing the linguistic competence of students of English for specific purposes. This work aims to refine the teaching and learning of the language used in aeronautical discourse through the practice of cognitive strategies, drawing on Genre Analysis theory and the Contemporary Theory of Metaphor (Lakoff and Johnson 1980; Lakoff 1993). In order to create teaching resources that apply metaphorical strategies, a list of image metaphors and conceptual metaphors has been compiled from the terminological database META-CITEC.
These terms have been classified according to the conceptual and image metaphors found in this field of knowledge. For teaching this specialised language, we propose the correspondences and mappings between the source domain and the target domain found in the air accident reports taken from the National Transportation Safety Board (NTSB).


Since the first decade of the 21st century, the Valley of the Fallen has become an object of controversy related to the new policies of memory. In recent years, "Historical Memory" has been a recurring concept in the mass media. While it is true that since 2011 the issue has been overshadowed on the political agenda, even today we continue to encounter information about our recent past from perspectives that demand ethical, symbolic, political or economic reparation. Many of these reports could be framed within a broader discourse, akin to a concept of "historical memory". These media texts are part of a larger problem that is troubling modern Western societies and that has seen a remarkable resurgence since the late nineties: debates or polemics on memory. In this paper we propose to study the nature of these media texts. We assume that the mass media construct their texts from pre-existing frames. For this research, we propose a content analysis based on framing theory to identify the typical journalistic discourse and the general interpretive frames applied in a number of texts and broadcasts about the Valley of the Fallen...


With the rapid increase in both centralized video archives and distributed WWW video resources, content-based video retrieval is gaining importance. To support such applications efficiently, content-based video indexing must be addressed. Typically, each video is represented by a sequence of frames. Due to the high dimensionality of frame representation and the large number of frames, video indexing introduces an additional degree of complexity. In this paper, we address the problem of content-based video indexing and propose an efficient solution, called the Ordered VA-File (OVA-File), based on the VA-file. OVA-File is a hierarchical structure and has two novel features: 1) partitioning the whole file into slices such that only a small number of slices are accessed and checked during k-nearest-neighbor (kNN) search and 2) efficient handling of insertions of new vectors into the OVA-File, such that the average distance between the new vectors and those approximations near that position is minimized. To facilitate a search, we present an efficient approximate kNN algorithm named Ordered VA-LOW (OVA-LOW) based on the proposed OVA-File. OVA-LOW first chooses possible OVA-Slices by ranking the distances between their corresponding centers and the query vector, and then visits all approximations in the selected OVA-Slices to compute the approximate kNN. The number of possible OVA-Slices is controlled by a user-defined parameter delta. By adjusting delta, OVA-LOW provides a trade-off between the query cost and the result quality. Querying by a video clip consisting of multiple frames is also discussed. Extensive experimental studies using real video data sets were conducted, and the results showed that our methods can yield a significant speed-up over an existing VA-file-based method and iDistance with high query result quality. Furthermore, by incorporating temporal correlation of video content, our methods achieved much more efficient performance.
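The slice-ranking step of the approximate search can be sketched in a few lines. This is a simplified illustration, not the OVA-File or OVA-LOW itself: it stores exact vectors instead of VA-file approximations, and the slicing rule (sort the vectors, cut them into fixed-size runs, use each run's centroid as its center) and the function names are assumptions. Only the `delta` slices whose centers lie closest to the query are visited, which is where the trade-off between query cost and result quality arises.

```python
import math

def build_slices(vectors, slice_size):
    """Sort the vectors and cut them into fixed-size slices, each with its centroid."""
    ordered = sorted(vectors)
    slices = []
    for start in range(0, len(ordered), slice_size):
        members = ordered[start:start + slice_size]
        dim = len(members[0])
        center = tuple(sum(v[d] for v in members) / len(members) for d in range(dim))
        slices.append((center, members))
    return slices

def approx_knn(slices, query, k, delta):
    """Approximate kNN: scan only the delta slices whose centers are nearest the query."""
    ranked = sorted(slices, key=lambda s: math.dist(s[0], query))
    candidates = [v for _, members in ranked[:delta] for v in members]
    candidates.sort(key=lambda v: math.dist(v, query))
    return candidates[:k]

if __name__ == "__main__":
    frames = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
    slices = build_slices(frames, slice_size=3)
    print(approx_knn(slices, query=(0.2, 0.1), k=2, delta=1))
```

Raising `delta` to the total number of slices turns the search into an exact linear scan; smaller values visit fewer vectors at the risk of missing a true neighbor that sits in an unvisited slice.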


Music similarity querying based on acoustic content is becoming important with the ever-increasing growth of music information from emerging applications such as digital libraries and the WWW. However, related techniques are still in their infancy and far from satisfactory. In this paper, we present a novel index structure, called the Composite Feature tree (CF-tree), to facilitate efficient content-based music search using multiple musical features. Before constructing the tree structure, we use PCA to transform the extracted features into a new space sorted by the importance of the acoustic features. The CF-tree is a balanced multi-way tree structure in which each level represents the data space at a different dimensionality. The PCA-transformed data and the reduced dimensions in the upper levels alleviate the curse of dimensionality. To mimic human perception more accurately, an extension named CF+-tree is proposed, which further applies multivariable regression to determine the weight of each individual feature. We conduct extensive experiments to evaluate the proposed structures against state-of-the-art techniques. The experimental results demonstrate the superiority of our technique.


Content-based instruction (CBI) is increasingly important in curriculum development for second-language acquisition (SLA), as language and non-language departments in universities are finding it beneficial to integrate core content into the second-language curriculum. With this in mind, this paper describes the English program at Nanzan University’s Faculty of Policy Studies and examines the synergy presently being developed between core content and English language instruction there. Specifically, this paper seeks to shed light on how instructors can reflect on the meaning of language instruction in higher education through an illustration of our activities.