903 results for Data-driven Methods


Relevance:

90.00%

Abstract:

Many systems and applications continuously produce events. These events record the status of a system and trace its behavior. By examining these events, system administrators can check for potential problems. If the temporal dynamics of the systems are investigated further, the underlying patterns can be discovered. The uncovered knowledge can be leveraged to predict future system behavior or to mitigate potential risks. Moreover, system administrators can use the temporal patterns to set up event management rules that make the system more intelligent. With the popularity of data mining techniques in recent years, these events have gradually become more and more useful. Despite recent advances in data mining techniques, their application to system event mining is still at a rudimentary stage. Most work still focuses on episode mining or frequent pattern discovery. These methods cannot provide a brief yet comprehensible summary that reveals the valuable information from a high-level perspective. Moreover, they provide little actionable knowledge to help system administrators better manage the systems. To make better use of the recorded events, more practical techniques are required. From the perspective of data mining, three correlated directions are considered helpful for system management: (1) provide concise yet comprehensive summaries of the running status of the systems; (2) make the systems more intelligent and autonomous; (3) effectively detect abnormal system behavior. Owing to the richness of the event logs, all these directions can be addressed in a data-driven manner. In this way, the robustness of the systems can be enhanced and the goal of autonomous management can be approached.
This dissertation focuses on the foregoing directions, which leverage temporal mining techniques to facilitate system management. More specifically, three concrete topics are discussed: event summarization, resource demand prediction, and streaming anomaly detection. Besides the theoretical contributions, an experimental evaluation is presented to demonstrate the effectiveness and efficacy of the corresponding solutions.

Relevance:

90.00%

Abstract:

This dissertation establishes a novel data-driven method to identify language network activation patterns in pediatric epilepsy through the use of Principal Component Analysis (PCA) on functional magnetic resonance imaging (fMRI). A total of 122 subjects’ data sets from five different hospitals were included in the study through a web-based repository site designed here at FIU. Research was conducted to evaluate different classification and clustering techniques in identifying hidden activation patterns and their associations with meaningful clinical variables. The results were assessed through agreement analysis with the conventional methods of lateralization index (LI) and visual rating. What is unique in this approach is the new mechanism designed for projecting language network patterns in the PCA-based decisional space. Synthetic activation maps were randomly generated from real data sets to uniquely establish nonlinear decision functions (NDF), which are then used to classify any new fMRI activation map as typical or atypical. The best nonlinear classifier was obtained on a 4D space with a complexity (nonlinearity) degree of 7. Based on the significant association of language dominance and intensities with the top eigenvectors of the PCA decisional space, a new algorithm was deployed to delineate primary cluster members without intensity normalization. In this case, three distinct activation patterns (groups) were identified (averaged kappa with rating 0.65, with LI 0.76) and were characterized by the regions of: 1) the left inferior frontal gyrus (IFG) and left superior temporal gyrus (STG), considered typical for the language task; 2) the IFG, left mesial frontal lobe, and right cerebellum regions, representing a variant left-dominant pattern with higher activation; and 3) the right homologues of the first pattern in Broca's and Wernicke's language areas.
Interestingly, group 2 was found to reflect a different language compensation mechanism than reorganization. Its high intensity activation suggests a possible remote effect on the right hemisphere focus on traditionally left-lateralized functions. In retrospect, this data-driven method provides new insights into mechanisms for brain compensation/reorganization and neural plasticity in pediatric epilepsy.

Relevance:

90.00%

Abstract:

The Highway Safety Manual (HSM) estimates roadway safety performance based on predictive models that were calibrated using national data. Calibration factors are then used to adjust these predictive models to local conditions for local applications. The HSM recommends that local calibration factors be estimated using 30 to 50 randomly selected sites that experienced at least a total of 100 crashes per year. It also recommends that the factors be updated every two to three years, preferably on an annual basis. However, these recommendations are primarily based on expert opinions rather than data-driven research findings. Furthermore, most agencies do not have data for many of the input variables recommended in the HSM. This dissertation is aimed at determining the best way to meet three major data needs affecting the estimation of calibration factors: (1) the required minimum sample sizes for different roadway facilities, (2) the required frequency for calibration factor updates, and (3) the influential variables affecting calibration factors. In this dissertation, statewide segment and intersection data were first collected for most of the HSM recommended calibration variables using a Google Maps application. In addition, eight years (2005-2012) of traffic and crash data were retrieved from existing databases from the Florida Department of Transportation. With these data, the effect of sample size criterion on calibration factor estimates was first studied using a sensitivity analysis. The results showed that the minimum sample sizes not only vary across different roadway facilities, but they are also significantly higher than those recommended in the HSM. In addition, results from paired sample t-tests showed that calibration factors in Florida need to be updated annually. 
To identify influential variables affecting the calibration factors for roadway segments, the variables were prioritized by combining the results from three different methods: negative binomial regression, random forests, and boosted regression trees. Only a few variables were found to explain most of the variation in the crash data. Traffic volume was consistently found to be the most influential. In addition, roadside object density, major and minor commercial driveway densities, and minor residential driveway density were also identified as influential variables.
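As a rough illustration of the quantity discussed in this abstract, a local calibration factor is commonly computed as the ratio of total observed crashes to total crashes predicted by the uncalibrated model across the sampled sites. The sketch below assumes that definition; the function name and the crash counts are hypothetical and are not taken from the dissertation.

```python
def calibration_factor(observed, predicted):
    """Ratio of total observed crashes to total predicted crashes.

    `observed` and `predicted` hold one value per sampled site.
    Assumed definition: C = sum(observed) / sum(predicted).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must cover the same sites")
    total_predicted = sum(predicted)
    if total_predicted <= 0:
        raise ValueError("predicted crashes must sum to a positive value")
    return sum(observed) / total_predicted


# Made-up yearly crash counts for five hypothetical sites.
observed = [12, 7, 30, 4, 9]
predicted = [10.0, 8.0, 25.0, 5.0, 7.0]
c = calibration_factor(observed, predicted)
print(f"calibration factor: {c:.3f}")
```

A factor above 1 would indicate that the sampled sites experienced more crashes than the uncalibrated model predicts. Recomputing the factor on random subsets of sites of different sizes is one simple way to explore the sample-size sensitivity the abstract describes.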

Relevance:

90.00%

Abstract:

The thesis presents a study of D3, a web graphics library developed in JavaScript, and catalogs the charts implemented with it and available on the web. The aim is to evaluate the library and study its strengths and weaknesses in order to determine whether it is suitable for use within a European project. To this end, the chart classification methods found in the literature are studied, and the state of the art in data visualization is presented and described. The classification method proposed by the design team is then described, and the chart gallery on the D3 library's website is cataloged. Finally, an algorithm for selecting a chart according to the user's needs is presented and studied formally.

Relevance:

90.00%

Abstract:

Research endeavors on spoken dialogue systems in the 1990s and 2000s have led to the deployment of commercial spoken dialogue systems (SDS) in microdomains such as customer service automation, reservation/booking and question answering systems. Recent research in SDS has focused on the development of applications in different domains (e.g. virtual counseling, personal coaches, social companions), which require more sophistication than the previous generation of commercial SDS. The focus of this research project is the delivery of behavior change interventions based on the brief intervention counseling style via spoken dialogue systems. Brief interventions (BI) are evidence-based, short, well-structured, one-on-one counseling sessions. Many challenges are involved in delivering BIs to people in need, such as finding the time to administer them in busy doctors' offices, obtaining the extra training that helps staff become comfortable providing these interventions, and managing the cost of delivering the interventions. Fortunately, recent developments in spoken dialogue systems make it possible to develop systems that can deliver brief interventions. The overall objective of this research is to develop a data-driven, adaptable dialogue system for brief interventions for problematic drinking behavior, based on reinforcement learning methods. The implications of this research project include, but are not limited to, assessing the feasibility of delivering structured brief health interventions with a data-driven spoken dialogue system. Furthermore, while the experimental system focuses on harmful alcohol drinking as the target behavior in this project, the knowledge and experience produced may also lead to the implementation of similarly structured health interventions and assessments in domains other than alcohol (e.g. obesity, drug use, lack of exercise), using statistical machine learning approaches.
Beyond the design of the dialogue system itself, the semantic and emotional meanings of user utterances have a high impact on the interaction. To perform domain-specific reasoning and recognize concepts in user utterances, a named-entity recognizer and an ontology are designed and evaluated. To understand affective information conveyed through text, lexicons and a sentiment analysis module are developed and tested.

Relevance:

90.00%

Abstract:

Quantitative methods can help us understand how underlying attributes contribute to movement patterns. Applying principal components analysis (PCA) to whole-body motion data may provide an objective data-driven method to identify unique and statistically important movement patterns. Therefore, the primary purpose of this study was to determine if athletes’ movement patterns can be differentiated based on skill level or sport played using PCA. Motion capture data from 542 athletes performing three sport-screening movements (i.e. bird-dog, drop jump, T-balance) were analyzed. A PCA-based pattern recognition technique was used to analyze the data. Prior to analyzing the effects of skill level or sport on movement patterns, methodological considerations related to the motion analysis reference coordinate system were assessed. All analyses were conducted as case studies. In the first case study, referencing motion data to a global (lab-based) coordinate system rather than a local (segment-based) coordinate system affected the ability to interpret important movement features. In the second case study, which assessed the interpretability of PCs when data were referenced to a stationary versus a moving segment-based coordinate system, PCs were more interpretable when data were referenced to a stationary coordinate system for both the bird-dog and T-balance tasks. As a result of the findings from case studies 1 and 2, only stationary segment-based coordinate systems were used in case studies 3 and 4. During the bird-dog task, elite athletes had significantly lower scores than recreational athletes for principal component (PC) 1. For the T-balance movement, elite athletes had significantly lower scores than recreational athletes for PC 2. In both analyses the lower scores in elite athletes represented a greater range of motion.
Finally, case study 4 reported differences in movement patterns between athletes who competed in different sports, and significant differences in technique were detected during the bird-dog task. Through these case studies, this thesis highlights the feasibility of applying PCA as a movement pattern recognition technique in athletes. Future research can build on this proof-of-principle work to develop robust quantitative methods to help us better understand how underlying attributes (e.g. height, sex, ability, injury history, training type) contribute to performance.
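The PC scores compared between athlete groups in this abstract can be illustrated with a minimal sketch: center the data, estimate the covariance matrix, find the leading principal component, and project each observation onto it. This is not the thesis's pipeline. Power iteration is swapped in here only to keep the example dependency-free, and the function names and the toy two-feature data are hypothetical.

```python
import math


def center(rows):
    """Subtract the per-feature mean from every observation."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n = len(centered)
    d = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=200):
    """Leading eigenvector of `cov` via power iteration (unit length).

    Assumes the data have non-zero variance and that the starting
    vector is not orthogonal to the leading eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def pc1_scores(rows):
    """Score of each observation on the first principal component."""
    centered = center(rows)
    v = first_component(covariance(centered))
    return [sum(x * c for x, c in zip(row, v)) for row in centered]


# Toy data: three observations of two perfectly correlated features.
scores = pc1_scores([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
print(scores)
```

In practice a library routine based on singular value decomposition would normally replace the power iteration, and each row would hold many time-normalized marker coordinates rather than two features. Group comparisons like those in the abstract would then be run on the resulting scores.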

Relevance:

90.00%

Abstract:

This report evaluates the use of remotely sensed images in implementing the Iowa DOT LRS, which is currently in the system architecture stage. The Iowa Department of Transportation is investing a significant amount of time and resources in the creation of a linear referencing system (LRS). A significant portion of the effort in implementing the system will be the creation of a datum, which includes geographically locating anchor points and then measuring anchor section distances between those anchor points. Currently, system architecture and the evaluation of different data collection methods to establish the LRS datum are being performed for the DOT by an outside consulting team.

Relevance:

90.00%

Abstract:

The overall aim of the thesis is to explore the possibilities for an integrated research perspective on men's violence and to exemplify how such research can be conducted. The concrete aim is to increase knowledge of how violent men's childhood experiences, socialization, construction of masculinity, and emotions can be related to their violence against other men, against themselves, and against women, and of how therapeutic interventions against violence can be analyzed and developed in correspondence with this knowledge. With philosophy-of-science starting points drawn from critical realism and ecological methods, the study relates research from different schools of thought to one another: psychological research on childhood experiences and socialization, social-psychological research on emotions and interaction, and sociological research on social class, gender power structures, and hegemonic masculinity. This is done in order to gain access to knowledge about how different factors interact in men's violence. Studies I and II examined the possibilities of investigating the social bonds between therapist/therapy and client within therapeutic treatments against violence. In study I, indicators of the emotions pride and shame were operationalized, and in study II these were tested on therapists within a CBT-oriented therapy. Study III examined men in different masculinity positions: the sample for one group was drawn from the population of men sentenced to therapy for violence and substance abuse, and the other from the population of men who worked in an organized way for equality and against violence against women. The study compared the two groups' attitudes toward factors that previous research has related to violence and violence against women. Study IV examined the careers of men convicted of violence up to their current position as violence-affirming criminals, with the aim of increasing knowledge of the interplay between factors that, in different situations, leads to their violence against other men, themselves, and women.
All empirical studies used qualitative methods for data collection and analysis. Study IV used individual interviews and biographical analysis; studies II and III used group interviews and deductive content analysis. In study I, the theoretical review article, sociological, social-psychological, and psychological theory constituted the empirical material. The thesis shows that a cross-level perspective has more advantages than disadvantages. Level-integrating studies are made more difficult by the complex methodology they require to handle the interaction between factors behind violence at different levels, but on the other hand they provide a more holistic understanding of the phenomenon in question. The results show that integrating perspectives can reduce the risk of ecological fallacies and increase understanding of the complex interaction between factors behind men's violence, which may contribute to the development of knowledge in the field of violence therapy. The theoretical review article (study I) exemplified how theoretically and methodologically driven research on social bonds can be made pragmatically applicable by therapists in violence treatment. The applied study of a CBT therapy (study II) gave examples of how operationalized indicators of pride and shame can be used in practice to determine the quality of the social bond between therapist and client. As expected, the CBT therapy studied contained both shame-inducing and pride-inducing elements, which provide valuable starting points for further research. The comparison between men in ideal-typically opposite masculinity positions (study III) showed that both the group of men working against violence against women and the men sentenced to treatment for violence carry ambivalent attitudes toward violence and violence against women. The comparison further showed that the groups' constructions of masculinity and attitudes toward violence correspond to their differing access to economic, social, and cultural resources.
The biographically focused qualitative study of men in violence treatment (study IV) explored what the career leading to violent crime can look like and how childhood experiences, socialization, masculinity, and emotions in individual violent men may have interacted when violence takes place. The results showed that the men who report exposure to serious violence in childhood are more shame-prone and, when insulted by others, tend to react directly with aggression and violence against both sexes, unconsciously and without preceding feelings of shame. The other men were also shame-prone but described a more controlled violent reaction. Two men who had been brutally physically bullied in primary school described a more controlled form of violence. A preliminary hypothesis is that these men may have learned, in order to escape continued bullying, to take cognitive control of the process in which feelings of shame are replaced by aggression. The parents' personal problems, together with their lack of social control and care, were assumed to be related to several of the men's school problems, their association with deviant youths, their later difficulties in supporting themselves by conventional means, and their careers of violence.

Relevance:

90.00%

Abstract:

This paper reviews objective assessments of Parkinson’s disease (PD) motor symptoms, both cardinal symptoms and dyskinesia, using sensor systems. It surveys the manifestation of PD symptoms, the sensors used for their detection, the types of signals (measures), and their signal processing (data analysis) methods. The findings of this review are summarized in a table listing the devices (sensors), measures, and methods used in each reviewed motor symptom assessment study. In the gathered studies, accelerometers and touch screen devices are the most widely used sensors for detecting PD symptoms, and bradykinesia and tremor are the most frequently evaluated symptoms. In general, machine learning methods appear promising for this purpose. PD is a complex disease that requires continuous monitoring and multidimensional symptom analysis. Combining existing technologies to develop new sensor platforms may help assess the overall symptom profile more accurately and yield useful tools for supporting a better treatment process.

Relevance:

90.00%

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-08

Relevance:

90.00%

Abstract:

Americans are accustomed to a wide range of data collection in their lives: census, polls, surveys, user registrations, and disclosure forms. When users log onto the Internet, their actions are tracked everywhere: clicking, typing, tapping, swiping, searching, and placing orders. All of this data is stored to create data-driven profiles of each user. Social network sites, furthermore, set the voluntary sharing of personal data as the default mode of engagement. But the time and energy people devote to creating this massive amount of data, on paper and online, are taken for granted. Few people would consider their time and energy spent on data production as labor. Even if some people do acknowledge their labor for data, they believe it is accessory to the activities at hand. In the face of pervasive data collection and the rising time spent on screens, why do people keep ignoring their labor for data? How has labor for data become invisible, something that is disregarded by many users? What does invisible labor for data imply for everyday cultural practices in the United States? Invisible Labor for Data addresses these questions. I argue that three intertwined forces contribute to framing data production as being void of labor: data production institutions throughout history, the Internet’s technological infrastructure (especially with the implementation of algorithms), and the multiplication of virtual spaces. There is a common tendency in the framework of human interactions with computers to deprive data and bodies of their materiality. My Introduction and Chapter 1 offer theoretical interventions by reinstating embodied materiality and redefining labor for data as an ongoing process. The middle chapters present case studies explaining how labor for data is pushed to the margin of the narratives about data production. I focus on a nationwide debate in the 1960s on whether the U.S.
should build a databank, contemporary Big Data practices in the data broker and Internet industries, and the group of people who are hired to produce data for other people’s avatars in virtual games. I conclude with a discussion of how the new development of crowdsourcing projects may usher in a new chapter in exploiting invisible and discounted labor for data.

Relevance:

90.00%

Abstract:

Microsecond-long Molecular Dynamics (MD) trajectories of biomolecular processes are now possible due to advances in computer technology. Soon, trajectories long enough to probe dynamics over many milliseconds will become available. Since these timescales match the physiological timescales over which many small proteins fold, all-atom MD simulations of protein folding are now becoming popular. To distill features of such large folding trajectories, we must develop methods that both compress trajectory data to enable visualization and lend themselves to further analysis, such as the finding of collective coordinates and reduction of the dynamics. Conventionally, clustering has been the most popular MD trajectory analysis technique, followed by principal component analysis (PCA). Simple clustering used in MD trajectory analysis suffers from various serious drawbacks, namely, (i) it is not data driven, (ii) it is unstable to noise and changes in cutoff parameters, and (iii) since it does not take into account interrelationships amongst data points, the separation of data into clusters can often be artificial. Usually, partitions generated by clustering techniques are validated visually, but such validation is not possible for MD trajectories of protein folding, as the underlying structural transitions are not well understood. Rigorous cluster validation techniques may be adapted, but it is more crucial to reduce the dimensions in which MD trajectories reside, while still preserving their salient features. PCA has often been used for dimension reduction and, while it is computationally inexpensive, being a linear method, it does not achieve good data compression. In this thesis, I propose a different method, a nonmetric multidimensional scaling (nMDS) technique, which achieves superior data compression by virtue of being nonlinear, and also provides a clear insight into the structural processes underlying MD trajectories.
I illustrate the capabilities of nMDS by analyzing three complete villin headpiece folding and six norleucine mutant (NLE) folding trajectories simulated by Freddolino and Schulten [1]. Using these trajectories, I make comparisons between nMDS, PCA and clustering to demonstrate the superiority of nMDS. The three villin headpiece trajectories showed great structural heterogeneity. Apart from a few trivial features like early formation of secondary structure, no commonalities between trajectories were found. There were no units of residues or atoms found moving in concert across the trajectories. A flipping transition, corresponding to the flipping of helix 1 relative to the plane formed by helices 2 and 3, was observed towards the end of the folding process in all trajectories, when nearly all native contacts had been formed. However, the transition occurred through a different series of steps in each trajectory, indicating that it may not be a common transition in villin folding. All trajectories showed competition between local structure formation/hydrophobic collapse and global structure formation. Our analysis of the NLE trajectories confirms the notion that a tight hydrophobic core inhibits correct 3-D rearrangement. Only one of the six NLE trajectories folded, and it showed no flipping transition. All the other trajectories became trapped in hydrophobically collapsed states. The NLE residues were found to be buried deeply in the core, compared to the corresponding lysines in the villin headpiece, thereby making the core tighter and harder to undo for 3-D rearrangement. Our results suggest that the NLE may not be a fast folder as experiments suggest. The tightness of the hydrophobic core may be a very important factor in the folding of larger proteins. It is likely that chaperones like GroEL act to undo the tight hydrophobic core of proteins, after most secondary structure elements have been formed, so that global rearrangement is easier.
I conclude by presenting facts about chaperone-protein complexes and propose further directions for the study of protein folding.