930 results for Data-driven
Abstract:
Social media classification problems have drawn increasing attention in the past few years. With the rapid development of the Internet and the popularity of computers, there is an astronomical amount of information in social networks (social media platforms). The datasets are generally large-scale and are often corrupted by noise. The presence of noise in the training set has a strong impact on the performance of supervised learning (classification) techniques. This thesis presents a budget-driven One-class SVM approach that is suitable for large-scale social media data classification. Our approach is based on an existing online One-class SVM learning algorithm, referred to as the STOCS (Self-Tuning One-Class SVM) algorithm. To justify our choice, we first analyze the noise resilience of STOCS using synthetic data. The experiments suggest that STOCS is more robust against label noise than several other existing approaches. Next, to handle the big data classification problem for social media data, we introduce several budget-driven features, which allow the algorithm to be trained within limited time and under limited memory. In addition, the resulting algorithm can be easily adapted to changes in dynamic data with minimal computational cost. Compared with two state-of-the-art approaches, Lib-Linear and kNN, our approach is shown to be competitive with lower memory and time requirements.
Abstract:
Following the workshop on new developments in daily licensing practice in November 2011, fourteen representatives from national consortia (from Denmark, Germany, the Netherlands and the UK) and publishers (Elsevier, SAGE and Springer) met in Copenhagen on 9 March 2012 to discuss provisions in licences to accommodate new developments. The one-day workshop aimed to: present background and ideas regarding the provisions the KE Licensing Expert Group developed; introduce and explain the provisions the invited publishers currently use; ascertain agreement on the wording for long-term preservation, continuous access and course packs; give insight and more clarity about the use of open access provisions in licences; discuss a roadmap for inclusion of the provisions in the publishers' licences; and result in a report to disseminate the outcome of the meeting. Participants of the workshop were: United Kingdom: Lorraine Estelle (Jisc Collections). Denmark: Lotte Eivor Jørgensen (DEFF), Lone Madsen (Southern University of Denmark), Anne Sandfær (DEFF/Knowledge Exchange). Germany: Hildegard Schaeffler (Bavarian State Library), Markus Brammer (TIB). The Netherlands: Wilma Mossink (SURF), Nol Verhagen (University of Amsterdam), Marc Dupuis (SURF/Knowledge Exchange). Publishers: Alicia Wise (Elsevier), Yvonne Campfens (Springer), Bettina Goerner (Springer), Leo Walford (SAGE). Knowledge Exchange: Keith Russell. The main outcome of the workshop was that it would be valuable to have a standard set of clauses which could be used in negotiations; this would make concluding licences a lot easier and more efficient. The comments on the model provisions the Licensing Expert Group had drafted will be taken into account and the provisions will be reformulated. Data and text mining is a new development, and demand for access to allow for this is growing. It would be easier if there were a simpler way to access materials so they could be more easily mined. However, there are still outstanding questions on how authors of articles that have been mined can be properly attributed.
Abstract:
Discovery Driven Analysis (DDA) is a common feature of OLAP technology for analyzing structured data. In essence, DDA helps analysts discover anomalous data by highlighting 'unexpected' values in the OLAP cube. By indicating to the analyst which dimensions to explore, DDA speeds up the process of discovering anomalies and their causes. However, Discovery Driven Analysis (and OLAP in general) is only applicable to structured data, such as records in databases. We propose a system to extend DDA technology to semi-structured text documents, that is, text documents with some structured data. Our system pipeline consists of two stages: first, the text part of each document is structured around user-specified dimensions, using the semi-PLSA algorithm; then, we adapt DDA to these fully structured documents, thus enabling DDA on text documents. We present some applications of this system in OLAP analysis and show how scalability issues are solved. Results show that our system can handle reasonably sized document datasets in real time, without any need for pre-computation.
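The idea of highlighting 'unexpected' cube values can be illustrated with a minimal sketch. It assumes a complete two-dimensional cube and a simple additive expectation (row mean + column mean - grand mean); this is a stand-in chosen for illustration, not the model used by the authors' system, and the function name `flag_unexpected_cells`, the `threshold` parameter and the sample data are all hypothetical.

```python
from statistics import mean, pstdev


def flag_unexpected_cells(cube, threshold=2.0):
    """Return (row, col, z) for cells that deviate from an additive expectation.

    `cube` maps (row, col) -> numeric value and is assumed to be complete,
    i.e. every row/column combination is present (an assumption of this sketch).
    """
    rows = sorted({r for r, _ in cube})
    cols = sorted({c for _, c in cube})
    grand = mean(cube.values())
    row_mean = {r: mean(v for (rr, _), v in cube.items() if rr == r) for r in rows}
    col_mean = {c: mean(v for (_, cc), v in cube.items() if cc == c) for c in cols}

    # Residual = observed value minus the value expected from row and column effects.
    residuals = {
        (r, c): v - (row_mean[r] + col_mean[c] - grand)
        for (r, c), v in cube.items()
    }
    spread = pstdev(residuals.values())
    if spread == 0:
        return []  # the cube is perfectly additive; nothing is unexpected
    return [
        (r, c, res / spread)
        for (r, c), res in residuals.items()
        if abs(res / spread) >= threshold
    ]


# Hypothetical sales cube: purely additive except for one inflated cell.
row_effect = {"North": 10, "South": 20, "East": 30, "West": 40}
col_effect = {"Q1": 1, "Q2": 2, "Q3": 3, "Q4": 4}
cube = {(r, c): row_effect[r] + col_effect[c] for r in row_effect for c in col_effect}
cube[("East", "Q2")] += 50
flagged = flag_unexpected_cells(cube)
```

Here only the inflated cell exceeds the threshold, which is the kind of cue that would point an analyst towards the dimensions worth drilling into.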
Abstract:
The interplay between the biocolloidal characteristics (especially size and charge), pH, salt concentration and the thermal energy results in a unique collection of mesoscopic forces of importance to the molecular organization and function in biological systems. By means of Monte Carlo simulations and semi-quantitative analysis in terms of perturbation theory, we describe a general electrostatic mechanism that gives attraction at low electrolyte concentrations. This charge regulation mechanism due to titrating amino acid residues is discussed in a purely electrostatic framework. The complexation data reported here for interaction between a polyelectrolyte chain and the proteins albumin, goat and bovine alpha-lactalbumin, beta-lactoglobulin, insulin, k-casein, lysozyme and pectin methylesterase illustrate the importance of the charge regulation mechanism. Special attention is given to pH ≈ pI, where ion-dipole and charge regulation interactions could overcome the repulsive ion-ion interaction. By means of protein mutations, we confirm the importance of the charge regulation mechanism, and quantify when the complexation is dominated either by charge regulation or by the ion-dipole term.
Abstract:
We consider a nontrivial one-species population dynamics model with finite and infinite carrying capacities. Time-dependent intrinsic and extrinsic growth rates are considered in these models. Through the model per capita growth rate we obtain a heuristic general procedure to generate scaling functions to collapse data into a simple linear behavior even if an extrinsic growth rate is included. With this data collapse, all the models studied become independent from the parameters and initial condition. Analytical solutions are found when time-dependent coefficients are considered. These solutions allow us to perceive nontrivial transitions between species extinction and survival and to calculate the transition's critical exponents. Considering an extrinsic growth rate as a cancer treatment, we show that the relevant quantity depends not only on the intensity of the treatment, but also on when the cancerous cell growth is maximum.
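As a hedged illustration of what a data collapse looks like, the sketch below uses the textbook Verhulst (logistic) model with constant coefficients, N(t) = K / (1 + (K/N0 - 1) e^(-rt)), which is only the simplest finite-carrying-capacity case and not the generalized time-dependent model of the paper. For this model, ln(K/N - 1) - ln(K/N0 - 1) = -rt, so plotting that quantity against tau = rt puts every curve on the same line of slope -1. The function names and parameter values are illustrative assumptions.

```python
import math


def verhulst(t, r, K, n0):
    """Closed-form logistic growth with constant rate r and carrying capacity K."""
    return K / (1.0 + (K / n0 - 1.0) * math.exp(-r * t))


def collapse(t, n, r, K, n0):
    """Rescale a point (t, N) so that every Verhulst curve satisfies y = -tau.

    Assumes 0 < n0 < K, so that K/N - 1 stays positive along the trajectory.
    """
    tau = r * t
    y = math.log(K / n - 1.0) - math.log(K / n0 - 1.0)
    return tau, y


# Three hypothetical parameter sets (r, K, N0) that differ in every parameter.
parameter_sets = [(0.5, 100.0, 5.0), (1.0, 2000.0, 10.0), (1.5, 50.0, 20.0)]
collapsed = [
    collapse(t, verhulst(t, r, K, n0), r, K, n0)
    for (r, K, n0) in parameter_sets
    for t in (0.0, 1.0, 2.0, 3.0, 4.0)
]
```

After rescaling, the points in `collapsed` no longer carry any trace of r, K or N0, which is the sense in which the collapsed description is independent of the parameters and the initial condition.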
Abstract:
Background: A relative inability to capture a sufficiently large patient population in any one geographic location has traditionally limited research into rare diseases. Methods and Results: Clinicians interested in the rare disease lymphangioleiomyomatosis (LAM) have worked with the LAM Treatment Alliance, the MIT Media Lab, and Clozure Associates to cooperate in the design of a state-of-the-art data coordination platform that can be used for clinical trials and other research focused on the global LAM patient population. This platform is a component of a set of web-based resources, including a patient self-report data portal, aimed at accelerating research in rare diseases in a rigorous fashion. Conclusions: Collaboration between clinicians, researchers, advocacy groups, and patients can create essential community resource infrastructure to accelerate rare disease research. The International LAM Registry is an example of such an effort.
Abstract:
Context. B[e] supergiants are luminous, massive post-main sequence stars exhibiting non-spherical winds, forbidden lines, and hot dust in a disc-like structure. The physical properties of their rich and complex circumstellar environment (CSE) are not well understood, partly because these CSE cannot be easily resolved at the large distances found for B[e] supergiants (typically ≳ 1 kpc). Aims. From mid-IR spectro-interferometric observations obtained with VLTI/MIDI we seek to resolve and study the CSE of the Galactic B[e] supergiant CPD-57° 2874. Methods. For a physical interpretation of the observables (visibilities and spectrum) we use our ray-tracing radiative transfer code (FRACS), which is optimised for thermal spectro-interferometric observations. Results. Thanks to the short computing time required by FRACS (<10 s per monochromatic model), best-fit parameters and uncertainties for several physical quantities of CPD-57° 2874 were obtained, such as inner dust radius, relative flux contribution of the central source and of the dusty CSE, dust temperature profile, and disc inclination. Conclusions. The analysis of VLTI/MIDI data with FRACS allowed one of the first direct determinations of physical parameters of the dusty CSE of a B[e] supergiant based on interferometric data and using a full model-fitting approach. In a larger context, the study of B[e] supergiants is important for a deeper understanding of the complex structure and evolution of hot, massive stars.
Abstract:
During the past 40 years colluvial and alluvial deposits have been used in Brazil as good indicators of regional landscape sensitivity to Quaternary environmental changes. In spite of the low resolution of most of the continental sedimentary record, geomorphology and sedimentology may favor palaeoenvironmental interpretation when supported by independent proxy data. This paper presents results obtained from pedostratigraphic sequences, in near-valley head sites of southern Brazilian highlands, based on geomorphologic, sedimentologic, micromorphologic, isotopic and palynologic data. Results point to environmental changes, with ages that coincide with Marine Isotopic Stages (MIS) 5b, 3, 2 and 1. During the late Pleistocene, although under temperatures and precipitation lower than today, the local record points to relatively wet local environments, where shallow soil-water saturated zones contributed to erosion and sedimentation during periods of climatic change, as during the transition between MIS 2 and MIS 1. Late Pleistocene events with ages that coincide with the Northern Hemisphere Younger Dryas are also depicted. During the mid Holocene, slope-wash deposits suggest a climate drier than today, probably under the influence of seasonally contrasted precipitation regimes. The predominance of overland flow-related sedimentary deposits suggests an excess of precipitation over evaporation that influenced local palaeohydrology. This environmental condition seems to be recurrent and explains how slope morphology has influenced pedogenesis and sedimentation in the study area. Due to their relative sensitivity, resilience and short source-to-sink sedimentary pathways, near-valley head sites deserve further attention in Quaternary studies in the humid tropics. (c) 2008 Elsevier B.V. All rights reserved.
Abstract:
Background: There are few studies on HIV subtypes and primary and secondary antiretroviral drug resistance (ADR) in community-recruited samples in Brazil. We analyzed HIV clade diversity and prevalence of mutations associated with ADR in men who have sex with men in all five regions of Brazil. Methods: Using respondent-driven sampling, we recruited 3515 men who have sex with men in nine cities: 299 (9.5%) were HIV-positive; 143 subjects had adequate genotyping and epidemiologic data. Forty-four (30.8%) subjects were antiretroviral therapy-experienced (AE) and 99 (69.2%) antiretroviral therapy-naive (AN). We sequenced the reverse transcriptase and protease regions of the virus and analyzed them for drug resistance mutations using World Health Organization guidelines. Results: The most common subtypes were B (81.8%), C (7.7%), and recombinant forms (6.9%). The overall prevalence of primary ADR (i.e., among the AN) was 21.4% and that of secondary ADR (i.e., among the AE) was 35.8%. The prevalence of resistance to protease inhibitors was 3.9% (AN) and 4.4% (AE); to nucleoside reverse transcriptase inhibitors, 15.0% (AN) and 31.0% (AE); and to nonnucleoside reverse transcriptase inhibitors, 5.5% (AN) and 13.2% (AE). The most common resistance mutation for nucleoside reverse transcriptase inhibitors was 184V (17 cases) and for nonnucleoside reverse transcriptase inhibitors 103N (16 cases). Conclusions: Our data suggest a high level of both primary and secondary ADR in men who have sex with men in Brazil. Additional studies are needed to identify the correlates and causes of antiretroviral therapy resistance to limit the development of resistance among those in care and the transmission of resistant strains in the wider epidemic.
Abstract:
The exponential increase in home-bound persons who live alone and need continuous monitoring requires new solutions to current problems. Most of these persons have conditions such as motor or psychological disabilities that deprive them of a normal life. Events such as forgetfulness or falls are quite common and have to be prevented or dealt with. This paper introduces a platform to guide and assist these persons (mostly elderly people) by providing multisensory monitoring and intelligent assistance. The platform operates at three levels. The lower level, called "Data acquisition and processing", performs the usual tasks of a monitoring system, collecting and processing data from the sensors for the purpose of detecting and tracking humans. The aim is to identify their activities in an intermediate level called "Activity detection". The upper level, "Scheduling and decision-making", consists of a scheduler which provides warnings, schedules events in an intelligent manner and serves as an interface to the rest of the platform. The idea is to use mobile and static sensors to monitor the user and his/her environment constantly, providing a safe environment and an immediate response to severe problems. A case study on elderly fall detection in a nursing home bedroom demonstrates the usefulness of the proposal.
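The three-level organization described above can be sketched as a small pipeline. Everything in this sketch is a hypothetical illustration rather than the platform's actual design: the `Reading` type, the height-based fall rule, the thresholds and the `alert_caregiver` action are assumptions made only to show how data could flow from acquisition to activity detection to decision-making.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor_id: str
    timestamp: float  # seconds
    height_m: float   # assumed estimate of the tracked person's centroid height


def acquire_and_process(raw_readings):
    """Lower level: discard implausible readings and order the rest by time."""
    valid = [r for r in raw_readings if 0.0 <= r.height_m <= 2.5]
    return sorted(valid, key=lambda r: r.timestamp)


def detect_activity(readings, lying_height=0.4, min_duration=5.0):
    """Intermediate level: label the sequence with a simple (assumed) fall rule.

    A possible fall is reported when the height stays below `lying_height`
    for at least `min_duration` seconds without interruption.
    """
    low_since = None
    for r in readings:
        if r.height_m < lying_height:
            if low_since is None:
                low_since = r.timestamp
            if r.timestamp - low_since >= min_duration:
                return "possible_fall"
        else:
            low_since = None
    return "normal"


def schedule(activity):
    """Upper level: turn the detected activity into a list of actions."""
    if activity == "possible_fall":
        return ["alert_caregiver"]
    return []


# Hypothetical bedroom sequence: standing for two seconds, then low for seven.
raw = [Reading("cam1", float(t), 1.0 if t < 2 else 0.2) for t in range(9)]
actions = schedule(detect_activity(acquire_and_process(raw)))
```

Keeping each level as a separate function mirrors the layering in the abstract: the upper level only sees activity labels, so the sensing details can change without touching the scheduling logic.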
Abstract:
Work presented as part of the Master's programme in Informatics Engineering, as a partial requirement for obtaining the degree of Master in Informatics Engineering
Abstract:
In this paper we introduce a formation control loop that maximizes the performance of the cooperative perception of a tracked target by a team of mobile robots, while maintaining the team in formation, with a dynamically adjustable geometry which is a function of the quality of the target perception by the team. In the formation control loop, the controller module is a distributed non-linear model predictive controller and the estimator module fuses local estimates of the target state, obtained by a particle filter at each robot. The two modules and their integration are described in detail, including a real-time database associated with a wireless communication protocol that facilitates the exchange of state data while reducing collisions among team members. Simulation and real robot results for indoor and outdoor teams of different robots are presented. The results highlight how our method successfully enables a team of homogeneous robots to minimize the total uncertainty of the tracked target cooperative estimate while complying with performance criteria such as keeping a pre-set distance between the teammates and the target, and avoiding collisions with teammates and/or surrounding obstacles.
Abstract:
Modern society regards the superfluous consumption of energy as a present-day socio-economic and environmental problem. The situation is worsening, since it is becoming clear that energy prices tend to increase every year. It is also noticeable that people who are not necessarily proficient in technology cannot tell where savings can be achieved, due to the absence of accessible awareness mechanisms. One of the home user's concerns is to reduce energy consumption while carrying out the same activities with the same comfort and work efficiency. The common techniques to reduce consumption are to use less wasteful equipment, to switch the equipment to a more economical program, or to disconnect appliances that are not needed at the moment. However, there is no direct feedback from these actions, so the user is not aware of the influence that these techniques have on the electricity bill. With the intention of giving users some control over home consumption, Energy Management Systems (EMS) were developed. These systems give access to consumption information and help users understand energy waste. However, some studies have shown that these systems exhibit a clear mismatch between the information that is presented and the information the user finds useful in daily life, leading to demotivation of use. In order to create a solution more oriented towards the user's demands, a specially tailored domain-specific language (DSL) was implemented. This solution allows the user to acquire the information he or she considers useful by constructing questions about his or her energy consumption. The development of this language, following the Model Driven Development (MDD) approach, took into consideration the ideas of facility managers and home users in the design and validation phases. These opinions were gathered through meetings with experts and a survey, which was conducted to collect statistics about what home users want to know.
Abstract:
The present paper focuses on a damage identification method based on the use of the second-order spectral properties of the nodal response processes. The explicit dependence on the frequency content of the outputs' power spectral densities makes them suitable for damage detection and localization. The well-known case study of the Z24 Bridge in Switzerland is chosen to apply and further investigate this technique with the aim of validating its reliability. Numerical simulations of the dynamic response of the structure subjected to different types of excitation are carried out to assess the variability of the spectrum-driven method with respect to both type and position of the excitation sources. The simulated data obtained from random vibrations, impulse, ramp and shaking forces made it possible to build the power spectrum matrix from which the main eigenparameters of the reference and damage scenarios are extracted. Afterwards, complex eigenvectors and real eigenvalues are properly weighted and combined, and a damage index based on the difference between spectral modes is computed to pinpoint the damage. Finally, a group of vibration-based damage identification methods is selected from the literature to compare the results obtained and to evaluate the performance of the spectral index.
Abstract:
Integrated master's dissertation in Biomedical Engineering (area of specialization in Medical Informatics)