945 results for Structured data


Relevance: 60.00%

Publisher:

Abstract:

Genomic sequences are fundamentally text documents, admitting various representations according to need and tokenization. Gene expression depends crucially on binding of enzymes to the DNA sequence at small, poorly conserved binding sites, limiting the utility of standard pattern search. However, one may exploit the regular syntactic structure of the enzyme's component proteins and the corresponding binding sites, framing the problem as one of detecting grammatically correct genomic phrases. In this paper we propose new kernels based on weighted tree structures, traversing the paths within them to capture the features which underpin the task. Experimentally, we find that these kernels provide performance comparable with state-of-the-art approaches for this problem, while offering significant computational advantages over earlier methods. The methods proposed may be applied to a broad range of sequence or tree-structured data in molecular biology and other domains.
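The path-traversal idea behind such kernels can be sketched as follows. This is an illustrative analogue, not the authors' weighted kernels: it compares two trees by counting the root-to-node label paths they share.

```python
# Minimal sketch of a path-based tree kernel: enumerate the
# root-to-node label paths of each tree and count shared paths.
# (Hypothetical tree encoding: (label, [children]).)
from collections import Counter

def label_paths(tree, prefix=()):
    """Yield every root-to-node label path of a (label, [children]) tree."""
    label, children = tree
    path = prefix + (label,)
    yield path
    for child in children:
        yield from label_paths(child, path)

def path_kernel(t1, t2):
    """Count shared root-to-node label paths between two trees."""
    c1, c2 = Counter(label_paths(t1)), Counter(label_paths(t2))
    return sum(c1[p] * c2[p] for p in c1.keys() & c2.keys())

t_a = ("S", [("NP", []), ("VP", [("V", [])])])
t_b = ("S", [("NP", []), ("VP", [("N", [])])])
print(path_kernel(t_a, t_b))  # shared paths: S, S-NP, S-VP → 3
```

Real variants typically weight paths by length or by edge weights rather than counting them uniformly.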

Relevance: 60.00%

Publisher:

Abstract:

Extracting frequent subtrees from tree-structured data has important applications in Web mining. In this paper, we introduce a novel canonical form for rooted labelled unordered trees called the balanced-optimal-search canonical form (BOCF) that handles the isomorphism problem efficiently. Using BOCF, we define an enumeration approach, guided by the tree structure, that systematically enumerates only the valid subtrees. Finally, we present the balanced optimal search tree miner (BOSTER) algorithm, based on BOCF and the proposed enumeration approach, for finding frequent induced subtrees in a database of labelled rooted unordered trees. Experiments on real datasets compare the efficiency of BOSTER with two state-of-the-art algorithms for mining induced unordered subtrees, HybridTreeMiner and UNI3. The results are encouraging.
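The role of a canonical form here is to give isomorphic unordered trees a single representative encoding. A minimal sketch of that idea (a standard sorted depth-first encoding, not BOCF itself, which is a different, search-optimised form):

```python
def canonical(tree):
    """Canonical encoding of a rooted labelled unordered tree given as
    (label, [children]): encode children recursively, then sort their
    encodings, so isomorphic trees map to the same string."""
    label, children = tree
    return label + "(" + ",".join(sorted(canonical(c) for c in children)) + ")"

t1 = ("A", [("B", []), ("C", [("D", [])])])
t2 = ("A", [("C", [("D", [])]), ("B", [])])  # same tree, children reordered
print(canonical(t1))                  # A(B(),C(D()))
print(canonical(t1) == canonical(t2)) # True: isomorphism detected
```

With such an encoding, subtree candidates generated in different child orders collapse to one key, which is what makes duplicate-free enumeration possible.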

Relevance: 60.00%

Publisher:

Abstract:

Given the marked changes in length of hospital stay and the number of CAB procedures being performed, it is essential that health professionals are aware of the potential impact these changes could have on the spouses of patients who have undergone CAB surgery. Results from numerous quantitative studies suggest that spouses of patients undergoing CAB surgery experience both physical and emotional stress before and after their partner's surgery. While such studies have contributed to our understanding, they fail to capture the qualitative experience of being the spouse of a partner who has undergone CAB surgery, specifically in the context of changes in the length of hospital stay. The objective of this study was to describe the experience of spouses of patients who had recently undergone CAB surgery. The study used a qualitative methodology guided by Husserl's phenomenological approach. Data were obtained from four participants through in-depth, open-ended interviews. This study has implications for all health professionals involved in the care of patients and their families undergoing CAB surgery. If health professionals are to provide holistic care, they need to understand more fully the qualitative experience of spouses of critically ill patients. The purpose of this study was to describe the experience of spouses whose partners had suffered an acute myocardial infarction (MI). The study was guided by a phenomenological approach. This qualitative type of study is new to nursing inquiry; this investigation therefore links the notion of psychosocial nursing processes with the leading cause of death in Australia. Literature concerning the spouses of myocardial infarction patients has predominantly employed quantitative methods, so results have centred on structured data collection and categorised outcomes. Such methods have failed to capture what it is like to be the spouse of a patient who has had an MI. In-depth interviews were conducted with three participants (2 females and 1 male) about their experiences. The major findings of the study were categorised under the headings of uncertainty, emotional turmoil, support and information, and lifestyle change. The conclusions suggest that spouses are neglected by health professionals and require as much psychosocial support as their partners in terms of cardiac discharge planning. Spouses need special consideration as they progress through a grieving and readjustment process in coming to terms with: (1) the need to support and care for their partner, (2) changes in their roles, and (3) adjustments to their current lifestyles.

Relevance: 60.00%

Publisher:

Abstract:

In recent years, XML has been widely adopted as a universal format for structured data. A variety of XML-based systems have emerged, most prominently SOAP for Web services, XMPP for instant messaging, and RSS and Atom for content syndication. This popularity is helped by the excellent support for XML processing in many programming languages and by the variety of XML-based technologies for more complex needs of applications. Concurrently with this rise of XML, there has also been a qualitative expansion of the Internet's scope. Namely, mobile devices are becoming capable enough to be full-fledged members of various distributed systems. Such devices are battery-powered, their network connections are based on wireless technologies, and their processing capabilities are typically much lower than those of stationary computers. This dissertation presents work performed to try to reconcile these two developments. XML as a highly redundant text-based format is not obviously suitable for mobile devices that need to avoid extraneous processing and communication. Furthermore, the protocols and systems commonly used in XML messaging are often designed for fixed networks and may make assumptions that do not hold in wireless environments. This work identifies four areas of improvement in XML messaging systems: the programming interfaces to the system itself and to XML processing, the serialization format used for the messages, and the protocol used to transmit the messages. We show a complete system that improves the overall performance of XML messaging through consideration of these areas. The work is centered on actually implementing the proposals in a form usable on real mobile devices. The experimentation is performed on actual devices and real networks using the messaging system implemented as a part of this work. 
The experimentation is extensive and, due to using several different devices, also provides a glimpse of what the performance of these systems may look like in the future.
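The core tension the dissertation identifies, that XML's textual redundancy is expensive on constrained devices, can be illustrated with a toy comparison. This is a hedged sketch with made-up field names, not the dissertation's serialization format: it contrasts a small XML message with a naive length-prefixed binary encoding of the same values.

```python
# Compare the size of an XML message against a naive binary encoding
# of the same fields.  Field names and order are assumed to be fixed
# by a schema shared between sender and receiver, so the binary form
# need not transmit tag names at all.
import struct
import xml.etree.ElementTree as ET

fields = {"user": "alice", "seq": "42", "body": "hello"}

msg = ET.Element("message")
for k, v in fields.items():
    ET.SubElement(msg, k).text = v
xml_bytes = ET.tostring(msg)

binary = b"".join(
    struct.pack("B", len(v)) + v.encode()  # 1-byte length prefix + value
    for v in fields.values()
)

print(len(xml_bytes), len(binary))  # the XML copy carries every tag twice
```

The dissertation's point is broader than bytes on the wire: parsing the textual form also costs CPU cycles, and hence battery, on the device.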

Relevance: 60.00%

Publisher:

Abstract:

In recent years, XML has been accepted as the format of messages for several applications. Prominent examples include SOAP for Web services, XMPP for instant messaging, and RSS and Atom for content syndication. This XML usage is understandable, as the format itself is a well-accepted standard for structured data, and it has excellent support for many popular programming languages, so inventing an application-specific format no longer seems worth the effort. Simultaneously with this XML's rise to prominence there has been an upsurge in the number and capabilities of various mobile devices. These devices are connected through various wireless technologies to larger networks, and a goal of current research is to integrate them seamlessly into these networks. These two developments seem to be at odds with each other. XML as a fully text-based format takes up more processing power and network bandwidth than binary formats would, whereas the battery-powered nature of mobile devices dictates that energy, both in processing and transmitting, be utilized efficiently. This thesis presents the work we have performed to reconcile these two worlds. We present a message transfer service that we have developed to address what we have identified as the three key issues: XML processing at the application level, a more efficient XML serialization format, and the protocol used to transfer messages. Our presentation includes both a high-level architectural view of the whole message transfer service, as well as detailed descriptions of the three new components. These components consist of an API, and an associated data model, for XML processing designed for messaging applications, a binary serialization format for the data model of the API, and a message transfer protocol providing two-way messaging capability with support for client mobility. We also present relevant performance measurements for the service and its components. 
As a result of this work, we do not consider XML to be inherently incompatible with mobile devices. As the fixed networking world moves toward XML for interoperable data representation, the wireless world should do the same to provide a better-integrated networking infrastructure. However, the problems raised by XML adoption touch all of the higher layers of application programming, so instead of concentrating simply on the serialization format, we conclude that improvements need to be made in an integrated fashion across all of these layers.
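One recurring trick in binary XML serialization formats is a string table: an element name is spelled out only on first use and replaced by a small integer code afterwards. A minimal sketch of that idea, much simpler than the thesis's actual format:

```python
# Encode a stream of (kind, name) XML-ish events with a string table:
# the first occurrence of a name carries its text, later occurrences
# carry only the integer code assigned on first use.
def encode(events):
    table, out = {}, []
    for kind, name in events:
        if name in table:
            out.append((kind, table[name], None))        # back-reference
        else:
            table[name] = len(table)
            out.append((kind, table[name], name))        # first use: full text
    return out

stream = [("start", "item"), ("end", "item")] * 3
encoded = encode(stream)
print(encoded[0])  # only this first event spells out "item"
```

Repetitive messaging traffic, where the same elements recur in every message, is exactly where this saves the most.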

Relevance: 60.00%

Publisher:

Abstract:

Vertigo in children is more common than previously thought. However, only a small fraction of affected children see a physician. The reason for this may be the benign course of vertigo in children. Most childhood vertigo is self-limiting, and the provoking factor can often be identified. The differential diagnostic process in children with vertigo is extensive and quite challenging even for otologists and child neurologists, who are the key persons involved in treating vertiginous children. The cause of vertigo can vary from orthostatic hypotension to a brain tumor, and thus, a structured approach is essential in avoiding unnecessary examinations and achieving a diagnosis. Common forms of vertigo in children are otitis media-related dizziness, benign paroxysmal vertigo of childhood, migraine-associated dizziness, and vestibular neuronitis. Orthostatic hypotension, which is not a true vertigo, is the predominant type of dizziness in children. Vertigo is often divided according to origin into peripheral and central types. An otologist is familiar with peripheral causes, while a neurologist treats central causes. Close cooperation between different specialists is essential. Sometimes consultation with a psychiatrist or an ophthalmologist can lead to the correct diagnosis. The purpose of this study was to evaluate the prevalence and clinical characteristics of vertigo in children. We prospectively collected general population-based data from three schools and one child welfare clinic located close to Helsinki University Central Hospital (HUCH). A simple questionnaire with mostly closed questions was given to 300 consecutive children visiting the welfare clinic. At the schools, entire classes that fit the desired age groups received the questionnaire. Of the 1050 children who received the questionnaire, 938 (473 girls, 465 boys) returned it, the response rate thus being 89% (I).
In Study II, we evaluated the 24 vertiginous children (15 girls, 9 boys) with true vertigo and 12 healthy age- and gender-matched controls. A detailed medical history was obtained using a structured approach, and an otoneurologic examination, including audiogram, electronystagmography, and tympanometry, was performed at the HUCH ear, nose, and throat clinic for cooperative subjects. In Study III, we reviewed and evaluated the medical records of 119 children (63 girls, 56 boys) aged 0-17 years who had visited the ear, nose, and throat clinic with a primary complaint of vertigo in 2000-2004. We also wanted information about indications for imaging of the head in vertiginous children. To this end, we reviewed the medical records of 978 children who had undergone imaging of the head for various indications. Of these, 87 children aged 0-16 years were imaged because of vertigo. Subjects of interest were the 23 vertiginous children with an acute deviant finding in magnetic resonance images or computerized tomography (IV). Our results indicate that vertigo and other balance problems in children are quite common. Of the HUCH area population, 8% of the children had sometimes experienced vertigo, dizziness, or balance problems. Of these, 23% had vertigo sufficiently severe to stop their activity (I). The structured data collection approach eased the evaluation of vertiginous children. More headaches and head traumas were observed in vertiginous children than in healthy controls (II). The most common diagnoses of ear, nose, and throat clinic patients within the five-year period were benign paroxysmal vertigo of childhood, migraine-associated dizziness, vestibular neuronitis, and otitis media-related vertigo. Valuable diagnostic tools in the diagnostic process were patient history and otoneurologic examinations, including audiogram, electronystagmography, and tympanometry (III).
If the vertiginous child had neurological deficits, persistent headache, or preceding head trauma, imaging of the head was indicated (IV).

Relevance: 60.00%

Publisher:

Abstract:

In structured output learning, obtaining labeled data for real-world applications is usually costly, while unlabeled examples are available in abundance. Semisupervised structured classification deals with a small number of labeled examples and a large amount of unlabeled structured data. In this work, we consider semisupervised structural support vector machines with domain constraints. The optimization problem, which in general is not convex, contains the loss terms associated with the labeled and unlabeled examples, along with the domain constraints. We propose a simple optimization approach that alternates between solving a supervised learning problem and a constraint matching problem. Solving the constraint matching problem is difficult for structured prediction, and we propose an efficient and effective label switching method to solve it. The alternating optimization is carried out within a deterministic annealing framework, which helps in effective constraint matching and in avoiding poor local minima. The algorithm is simple and easy to implement. Further, it is suitable for any structured output learning problem where exact inference is available. Experiments on benchmark sequence labeling data sets and a natural language parsing data set show that the proposed approach, though simple, achieves comparable generalization performance.
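The constraint-matching step can be illustrated on a flat (non-structured) analogue. This hedged sketch is not the paper's label-switching method for structured outputs: given model scores for unlabeled items and a hypothetical domain constraint that exactly k items must be positive, it finds the constraint-satisfying labelling with the best total score.

```python
# Flat analogue of constraint matching: choose 0/1 labels with exactly
# k ones so that the items with the highest model scores are labelled
# positive.  Sorting by score makes this the optimal labelling under
# the count constraint.
def match_count_constraint(scores, k):
    """Return 0/1 labels with exactly k ones, favouring high scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    labels = [0] * len(scores)
    for i in order[:k]:
        labels[i] = 1
    return labels

print(match_count_constraint([0.9, -0.2, 0.4, -0.8], 2))  # → [1, 0, 1, 0]
```

In the paper's setting the labels are whole structures and exact matching is hard, which is why an iterative label-switching heuristic inside an annealing loop is used instead of a single sort.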

Relevance: 60.00%

Publisher:

Abstract:

ADMB2R is a collection of AD Model Builder routines for saving complex data structures into a file that can be read in the R statistics environment with a single command. ADMB2R provides both the means to transfer data structures significantly more complex than simple tables, and an archive mechanism to store data for future reference. We developed this software because we write and run computationally intensive numerical models in Fortran, C++, and AD Model Builder. We then analyse results with R. We desired to automate data transfer to speed diagnostics during working-group meetings. We thus developed the ADMB2R interface to write an R data object (of type list) to a plain-text file. The master list can contain any number of matrices, values, dataframes, vectors or lists, all of which can be read into R with a single call to the dget function. This allows easy transfer of structured data from compiled models to R. Having the capacity to transfer model data, metadata, and results has sharply reduced the time spent on diagnostics, and at the same time, our diagnostic capabilities have improved tremendously. The simplicity of this interface and the capabilities of R have enabled us to automate graph and table creation for formal reports. Finally, the persistent storage in files makes it easier to treat model results in analyses or meta-analyses devised months, or even years, later. We offer ADMB2R to others in the hope that they will find it useful. (PDF contains 30 pages)
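The mechanism is simple: deparse a nested data structure into R's `list(...)` source syntax, which R reads back with one `dget()` call. A hedged sketch of the same idea in Python (ADMB2R itself is an AD Model Builder library; the field names below are invented):

```python
# Deparse a nested dict of scalars, vectors, and sub-dicts into R
# list(...) syntax, so R can recover the whole structure with
# results <- dget("model_output.txt").
def to_r(obj):
    if isinstance(obj, dict):
        items = ", ".join(f"{k} = {to_r(v)}" for k, v in obj.items())
        return f"list({items})"
    if isinstance(obj, (list, tuple)):
        return "c(" + ", ".join(to_r(v) for v in obj) + ")"
    if isinstance(obj, str):
        return '"' + obj + '"'
    return repr(obj)  # numbers print in a form R also accepts

model_output = {"run": "base", "ssb": [1200.5, 1175.2], "converged": 1}
with open("model_output.txt", "w") as f:
    f.write(to_r(model_output))
# In R:  results <- dget("model_output.txt"); results$ssb
```

Because the file is plain text, it doubles as the archive mechanism the abstract describes: the same file remains readable years later without the model that produced it.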

Relevance: 60.00%

Publisher:

Abstract:

C2R is a collection of C routines for saving complex data structures into a file that can be read in the R statistics environment with a single command. C2R provides both the means to transfer data structures significantly more complex than simple tables, and an archive mechanism to store data for future reference. We developed this software because we write and run computationally intensive numerical models in Fortran, C++, and AD Model Builder. We then analyse results with R. We desired to automate data transfer to speed diagnostics during working-group meetings. We thus developed the C2R interface to write an R data object (of type list) to a plain-text file. The master list can contain any number of matrices, values, dataframes, vectors or lists, all of which can be read into R with a single call to the dget function. This allows easy transfer of structured data from compiled models to R. Having the capacity to transfer model data, metadata, and results has sharply reduced the time spent on diagnostics, and at the same time, our diagnostic capabilities have improved tremendously. The simplicity of this interface and the capabilities of R have enabled us to automate graph and table creation for formal reports. Finally, the persistent storage in files makes it easier to treat model results in analyses or meta-analyses devised months, or even years, later. We offer C2R to others in the hope that they will find it useful. (PDF contains 27 pages)

Relevance: 60.00%

Publisher:

Abstract:

For2R is a collection of Fortran routines for saving complex data structures into a file that can be read in the R statistics environment with a single command. For2R provides both the means to transfer data structures significantly more complex than simple tables, and an archive mechanism to store data for future reference. We developed this software because we write and run computationally intensive numerical models in Fortran, C++, and AD Model Builder. We then analyse results with R. We desired to automate data transfer to speed diagnostics during working-group meetings. We thus developed the For2R interface to write an R data object (of type list) to a plain-text file. The master list can contain any number of matrices, values, dataframes, vectors or lists, all of which can be read into R with a single call to the dget function. This allows easy transfer of structured data from compiled models to R. Having the capacity to transfer model data, metadata, and results has sharply reduced the time spent on diagnostics, and at the same time, our diagnostic capabilities have improved tremendously. The simplicity of this interface and the capabilities of R have enabled us to automate graph and table creation for formal reports. Finally, the persistent storage in files makes it easier to treat model results in analyses or meta-analyses devised months, or even years, later. We offer For2R to others in the hope that they will find it useful. (PDF contains 31 pages)

Relevance: 60.00%

Publisher:

Abstract:

This study aims to verify the contribution of information technology to forecasting performance indicators at Company Alfa. A single case study was carried out in order to explore the question in an exploratory and descriptive way. The techniques used were document analysis and interviews, which supported the quantitative research in line with the study's objective and theoretical framework. The research was based primarily on the results of the performance indicators for 2012 and 2013 described in the strategic plan for 2013. From these results it was possible to forecast the indicators for 2014 using the Weka software and to carry out the necessary analyses. The main findings show that Company Alfa will need to act in advance to maximise its results and avoid negative impacts on profitability, and that it must maintain a solid, structured database capable of supporting reliable forecasts and feeding the program for future predictions. The results indicate that the Weka information system contributes to forecasting outcomes, allowing the organisation to anticipate actions and make its decision-making more efficient and effective.
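The forecasting step described above can be sketched in miniature. This is a hedged, minimal linear-trend analogue on invented numbers, not the Weka models the study actually used: fit a line through two consecutive yearly values of an indicator and extrapolate one year ahead.

```python
# Linear extrapolation: forecast next year's indicator from the trend
# between the two previous years.  Indicator values are made up.
def linear_forecast(y_prev, y_curr):
    """One-step-ahead forecast assuming a constant yearly change."""
    return y_curr + (y_curr - y_prev)

indicator_2012, indicator_2013 = 82.0, 86.5
print(linear_forecast(indicator_2012, indicator_2013))  # → 91.0
```

The study's conclusion that forecasts are only as good as the underlying database applies equally here: the extrapolation is meaningless without reliable, consistently recorded yearly values.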

Relevance: 60.00%

Publisher:

Abstract:

The Architecture, Engineering, Construction and Facilities Management (AEC/FM) industry is rapidly becoming a multidisciplinary, multinational and multi-billion dollar economy, involving large numbers of actors working concurrently at different locations and using heterogeneous software and hardware technologies. Since the beginning of the last decade, a great deal of effort has been spent within the field of construction IT in order to integrate data and information from most computer tools used to carry out engineering projects. For this purpose, a number of integration models have been developed, like web-centric systems and construction project modeling, a useful approach in representing construction projects and integrating data from various civil engineering applications. In the modern, distributed and dynamic construction environment it is important to retrieve and exchange information from different sources and in different data formats in order to improve the processes supported by these systems. Previous research demonstrated that a major hurdle in AEC/FM data integration in such systems is caused by its variety of data types and that a significant part of the data is stored in semi-structured or unstructured formats. Therefore, new integrative approaches are needed to handle non-structured data types like images and text files. This research is focused on the integration of construction site images. These images are a significant part of the construction documentation, with thousands of images stored in the site photograph logs of large-scale projects. However, locating and identifying the data needed for important decision-making processes is a hard and time-consuming task, and so far there are no automated methods for associating these images with other related objects. Therefore, automated methods for the integration of construction images are important for construction information management. During this research, processes for retrieval, classification, and integration of construction images in AEC/FM model based systems have been explored. Specifically, a combination of techniques from the areas of image and video processing, computer vision, information retrieval, statistics and content-based image and video retrieval have been deployed in order to develop a methodology for the retrieval of related construction site image data from components of a project model. This method has been tested on available construction site images from a variety of sources, including past and current building construction and transportation projects, and is able to automatically classify, store, integrate and retrieve image data files in inter-organizational systems so as to allow their usage in project management related tasks.
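One standard building block of the content-based image retrieval techniques mentioned above is ranking stored images by feature similarity to a query. A hedged sketch (the image names and precomputed colour histograms below are invented; a real system would extract features from pixels):

```python
# Rank stored site images against a query image by histogram
# intersection, a classic similarity measure for normalized
# colour histograms.
def intersection(h1, h2):
    """Histogram intersection similarity of two normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def retrieve(query, database):
    """Return image ids sorted by decreasing similarity to the query."""
    return sorted(database,
                  key=lambda k: intersection(query, database[k]),
                  reverse=True)

db = {"crane.jpg": [0.7, 0.2, 0.1], "rebar.jpg": [0.1, 0.3, 0.6]}
print(retrieve([0.6, 0.3, 0.1], db))  # crane.jpg ranks first
```

In the research described here, such feature matching is combined with project-model metadata, so that retrieved images can be linked back to the model components they depict.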

Relevance: 60.00%

Publisher:

Abstract:

Our media is saturated with claims of "facts" made from data. Database research has in the past focused on how to answer queries, but has not devoted much attention to discerning more subtle qualities of the resulting claims, e.g., is a claim "cherry-picking"? This paper proposes a Query Response Surface (QRS) based framework that models claims based on structured data as parameterized queries. A key insight is that we can learn a lot about a claim by perturbing its parameters and seeing how its conclusion changes. This framework lets us formulate and tackle practical fact-checking tasks, reverse-engineering vague claims and countering questionable claims, as computational problems. Within the QRS based framework, we take one step further and propose a problem, along with efficient algorithms, for finding high-quality claims of a given form from data, i.e., raising good questions in the first place. This is achieved by using a limited number of high-valued claims to represent high-valued regions of the QRS. Besides the general-purpose high-quality claim finding problem, lead-finding can be tailored towards specific claim quality measures, also defined within the QRS framework. An example of uniqueness-based lead-finding is presented for "one-of-the-few" claims, yielding interpretable high-quality claims and an adjustable mechanism for ranking objects, e.g., NBA players, based on what claims can be made for them. Finally, we study the use of visualization as a powerful way of conveying the results of a large number of claims. An efficient two-stage sampling algorithm is proposed for generating the input of a 2D scatter plot with heatmap, evaluating a limited amount of data while preserving the two essential visual features, namely outliers and clusters. For all the problems, we present real-world examples and experiments that demonstrate the power of our model, the efficiency of our algorithms, and the usefulness of their results.
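The key insight, that perturbing a claim's parameters reveals cherry-picking, can be shown on a toy "query". This hedged sketch is not the paper's QRS machinery: the claim fixes the start of an averaging window, and we measure how much the claimed value moves when that parameter is nudged.

```python
# A parameterized "query": the average of a window of a series.
def window_avg(series, start, length):
    return sum(series[start:start + length]) / length

def sensitivity(series, start, length):
    """Spread of the claimed value over small perturbations of the
    window start.  A large spread suggests a cherry-picked parameter."""
    lo = max(0, start - 2)
    hi = min(len(series) - length, start + 2)
    vals = [window_avg(series, s, length) for s in range(lo, hi + 1)]
    return max(vals) - min(vals)

data = [1, 1, 9, 1, 1, 1, 1]
print(sensitivity(data, 2, 1))  # window picked exactly on the spike → 8.0
```

A claim whose conclusion survives such perturbation sits on a flat region of the query response surface; one that collapses was engineered around a single parameter setting.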

Relevance: 60.00%

Publisher:

Abstract:

New representations of tree-structured data objects, using ideas from topological data analysis, enable improved statistical analyses of a population of brain artery trees. A number of representations of each data tree arise from persistence diagrams that quantify branching and looping of vessels at multiple scales. Novel approaches to the statistical analysis, through various summaries of the persistence diagrams, lead to heightened correlations with covariates such as age and sex, relative to earlier analyses of this data set. The correlation with age continues to be significant even after controlling for correlations from earlier significant summaries.
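The analysis pipeline sketched in this abstract, summarise each subject's persistence diagram by a scalar and correlate the summaries with covariates, can be illustrated with made-up data. This is a hedged sketch using one simple summary (total persistence); the paper uses richer summaries and real brain-artery trees.

```python
# Summarise each subject's persistence diagram (a list of
# (birth, death) pairs) by its total persistence, then correlate
# the summaries with age via Pearson correlation.
def total_persistence(diagram):
    return sum(death - birth for birth, death in diagram)

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

ages = [25, 40, 55, 70]                       # invented covariate values
diagrams = [[(0, 3), (1, 2)], [(0, 4)],        # invented diagrams
            [(0, 5), (2, 4)], [(0, 6), (1, 5)]]
summaries = [total_persistence(d) for d in diagrams]
print(round(pearson(ages, summaries), 2))
```

The statistical care in the paper comes after this point: checking that a new summary still correlates with age once earlier significant summaries are controlled for.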