958 results for Data Generation


Abstract:

Variable Data Printing (VDP) has brought new flexibility and dynamism to the printed page. Each printed instance of a specific class of document can now have different degrees of customized content within the document template. This flexibility comes at a cost. If every printed page is potentially different from all others, it must be rasterized separately, which is a time-consuming process. Technologies such as PPML (Personalized Print Markup Language) attempt to address this problem by dividing the bitmapped page into components that can be cached at the raster level, thereby speeding up the generation of page instances. A large number of documents are stored in Page Description Languages at a higher level of abstraction than the bitmapped page. Much of this content could be reused within a VDP environment provided that separable document components can be identified and extracted. These components then need to be individually rasterisable so that each high-level component can be related to its low-level (bitmap) equivalent. Unfortunately, the unstructured nature of most Page Description Languages makes it difficult to extract content easily. This paper outlines the problems encountered in extracting component-based content from existing page description formats, such as PostScript, PDF and SVG, and how the differences between the formats affect the ease with which content can be extracted. The techniques are illustrated with reference to a tool called COG Extractor, which extracts content from PDF and SVG and prepares it for reuse.
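
The raster-level caching idea attributed to PPML above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: `rasterize_component` stands in for a real raster image processor, and the component names are invented. It only shows that shared components are rasterized once while variable components are rasterized per page instance.

```python
from functools import lru_cache

raster_calls = 0  # counts how often the (expensive) rasterizer actually runs


@lru_cache(maxsize=None)
def rasterize_component(component: str) -> bytes:
    """Hypothetical stand-in for a raster image processor."""
    global raster_calls
    raster_calls += 1
    return component.encode("utf-8")  # placeholder for a real bitmap


def render_page(components):
    """Assemble one page instance from (possibly cached) component bitmaps."""
    return b"|".join(rasterize_component(c) for c in components)


# Three components are shared by every instance; one is personalised.
template = ["logo", "header", "footer"]
pages = [render_page(template + [f"name:{n}"]) for n in ("Ana", "Ben", "Cy")]
```

With three page instances the rasterizer runs six times (three shared components once each, plus three variable ones) instead of twelve.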

Abstract:

Humans use their grammatical knowledge in more than one way. On one hand, they use it to understand what others say. On the other hand, they use it to say what they want to convey to others (or to themselves). In either case, they need to assemble the structure of sentences in a systematic fashion, in accordance with the grammar of their language. Despite the fact that the structures that comprehenders and speakers assemble are systematic in an identical fashion (i.e., obey the same grammatical constraints), the two ‘modes’ of assembling sentence structures might or might not be performed by the same cognitive mechanisms. Currently, the field of psycholinguistics implicitly adopts the position that they are supported by different cognitive mechanisms, as is evident from the fact that most psycholinguistic models seek to explain either comprehension or production phenomena. The potential existence of two independent cognitive systems underlying linguistic performance doubles the problem of linking the theory of linguistic knowledge and the theory of linguistic performance, making the integration of linguistics and psycholinguistics harder. This thesis thus aims to unify the structure building system in comprehension, i.e., the parser, and the structure building system in production, i.e., the generator, into one, so that the linking theory between knowledge and performance can also be unified into one. I will discuss and unify both existing and new data pertaining to how structures are assembled in understanding and speaking, and attempt to show that the unification between parsing and generation is at least a plausible research enterprise. In Chapter 1, I will discuss the previous and current views on how parsing and generation are related to each other. I will outline the challenges for the current view that the parser and the generator are the same cognitive mechanism. This single system view is discussed and evaluated in the rest of the chapters.
In Chapter 2, I will present new experimental evidence suggesting that the grain size of the pre-compiled structural units (henceforth simply structural units) is rather small, contrary to some models of sentence production. In particular, I will show that the internal structure of the verb phrase in a ditransitive sentence (e.g., The chef is donating the book to the monk) is not specified at the onset of speech, but is specified before the first internal argument (the book) needs to be uttered. I will also show that this timing of structural processes with respect to the verb phrase structure is earlier than the lexical processes of verb internal arguments. These two results in concert show that the size of structure building units in sentence production is rather small, contrary to some models of sentence production, yet structural processes still precede lexical processes. I argue that this view of generation resembles the widely accepted model of parsing that utilizes both top-down and bottom-up structure building procedures. In Chapter 3, I will present new experimental evidence suggesting that the structural representation strongly constrains the subsequent lexical processes. In particular, I will show that conceptually similar lexical items interfere with each other only when they share the same syntactic category in sentence production. A mechanism that I call syntactic gating will be proposed, and this mechanism characterizes how the structural and lexical processes interact in generation. I will present two Event-Related Potential (ERP) experiments that show that lexical retrieval in (predictive) comprehension is also constrained by syntactic categories. I will argue that the syntactic gating mechanism is operative both in parsing and generation, and that the interaction between structural and lexical processes in both parsing and generation can be characterized in the same fashion.
In Chapter 4, I will present a series of experiments examining the timing at which verbs’ lexical representations are planned in sentence production. It will be shown that verbs are planned before the articulation of their internal arguments, regardless of the target language (Japanese or English) and regardless of the sentence type (active object-initial sentences in Japanese, passive sentences in English, and unaccusative sentences in English). I will discuss how this result sheds light on the notion of incrementality in generation. In Chapter 5, I will synthesize the experimental findings presented in this thesis and in previous research to address the challenges to the single system view I outlined in Chapter 1. I will then conclude by presenting a preliminary single system model that can potentially capture both the key sentence comprehension and sentence production data without assuming distinct mechanisms for each.

Abstract:

Nowadays, the new generation of computers provides performance high enough to build computationally expensive computer vision applications for mobile robotics. Building a map of the environment is a common task for a robot and is essential for allowing robots to move through these environments. Traditionally, mobile robots have used a combination of several sensors based on different technologies. Lasers, sonars and contact sensors have typically been used in mobile robotic architectures. However, color cameras are an important sensor because we want robots to use the same information that humans use to sense and move through different environments. Color cameras are cheap and flexible, but a lot of work needs to be done to give robots enough visual understanding of the scenes. Computer vision algorithms are computationally complex, but nowadays robots have access to different and powerful architectures that can be used for mobile robotics purposes. The advent of low-cost RGB-D sensors such as the Microsoft Kinect, which provide 3D colored point clouds at high frame rates, has made computer vision even more relevant in the mobile robotics field. The combination of visual and 3D data allows systems to use both computer vision and 3D processing and therefore to be aware of more details of the surrounding environment. The research described in this thesis was motivated by the need for scene mapping. Being aware of the surrounding environment is a key feature in many mobile robotics applications, from simple robotic navigation to complex surveillance applications. In addition, the acquisition of a 3D model of a scene is useful in many areas, such as video game scene modeling, where well-known places are reconstructed and added to game systems, or advertising, where once the 3D model of a room is obtained the system can add pieces of furniture using augmented reality techniques.
In this thesis we perform an experimental study of state-of-the-art registration methods to find which one best fits our scene mapping purposes. Different methods are tested and analyzed on scenes with different distributions of visual and geometric appearance. In addition, this thesis proposes two methods for 3D data compression and representation of 3D maps. Our 3D representation proposal is based on the Growing Neural Gas (GNG) method. This type of Self-Organizing Map (SOM) has been successfully used for clustering, pattern recognition and topology representation of various kinds of data. Until now, Self-Organizing Maps have been primarily computed offline, and their application to 3D data has mainly focused on noise-free models without considering time constraints. Self-organizing neural models have the ability to provide a good representation of the input space. In particular, the Growing Neural Gas (GNG) is a suitable model because of its flexibility, rapid adaptation and excellent quality of representation. However, this type of learning is time-consuming, especially for high-dimensional input data. Since real applications often work under time constraints, it is necessary to adapt the learning process so that it completes in a predefined time. This thesis proposes a hardware implementation leveraging the computing power of modern GPUs, which takes advantage of a new paradigm known as General-Purpose Computing on Graphics Processing Units (GPGPU). Our proposed geometrical 3D compression method seeks to reduce the 3D information using plane detection as the basic structure to compress the data. This is because our target environments are man-made, and therefore many points belong to planar surfaces. Our proposed method achieves good compression results in those man-made scenarios. The detected and compressed planes can also be used in other applications, such as surface reconstruction or plane-based registration algorithms.
Finally, we have also demonstrated the benefits of GPU technologies by obtaining a high-performance implementation of a common CAD/CAM technique called Virtual Digitizing.

Abstract:

Forecasting abrupt variations in wind power generation (the so-called ramps) helps achieve large-scale wind power integration. One of the main issues to be confronted when addressing wind power ramp forecasting is the way in which relevant information is identified from large datasets to optimally feed forecasting models. To this end, an innovative methodology for systematically relating multivariate datasets to ramp events is presented. The methodology comprises two stages: the identification of relevant features in the data and the assessment of the dependence between these features and ramp occurrence. As a test case, the proposed methodology was employed to explore the relationships between atmospheric dynamics at the global/synoptic scales and ramp events experienced in two wind farms located in Spain. The results suggested different degrees of connection between these atmospheric scales and ramp occurrence. For one of the wind farms, it was found that ramp events could be partly explained by regional circulations and zonal pressure gradients. To perform a comprehensive analysis of the underlying causes of ramps, the proposed methodology could be applied to datasets related to other stages of the wind-to-power conversion chain.
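
To make the notion of a ramp event concrete, here is a minimal Python sketch. The abstract does not give a formal ramp definition, so the one used here is an assumption: a ramp starts at sample `i` when normalised power changes by at least `threshold` over `window` samples. The function name and the data are illustrative only.

```python
def detect_ramps(power, window, threshold):
    """Return (start_index, change) pairs where the power change over
    `window` samples is at least `threshold` in absolute value.
    This ramp definition is an assumption, not taken from the abstract."""
    events = []
    for i in range(len(power) - window):
        change = power[i + window] - power[i]
        if abs(change) >= threshold:
            events.append((i, change))
    return events


# Made-up hourly power values, expressed as a fraction of rated capacity.
series = [0.10, 0.12, 0.15, 0.60, 0.80, 0.78, 0.20, 0.18]
ramps = detect_ramps(series, window=2, threshold=0.5)
```

Positive changes correspond to ramp-up events and negative changes to ramp-down events; the resulting event list is the kind of target variable that features extracted from atmospheric data could then be related to.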

Abstract:

Analyzing large-scale gene expression data is a labor-intensive and time-consuming process. To make data analysis easier, we developed a set of pipelines for rapid processing and analysis of poplar gene expression data for knowledge discovery. Of all the pipelines developed, the differentially expressed genes (DEGs) pipeline is the one designed to identify biologically important genes that are differentially expressed in one of multiple time points for conditions. The pathway analysis pipeline was designed to identify differentially expressed metabolic pathways. The protein domain enrichment pipeline can identify the enriched protein domains present in the DEGs. Finally, the Gene Ontology (GO) enrichment analysis pipeline was developed to identify the enriched GO terms in the DEGs. Our pipeline tools can analyze both microarray gene data and high-throughput gene data. These two types of data are obtained by two different technologies. Microarray technology measures gene expression levels via microarray chips, a collection of microscopic DNA spots attached to a solid (glass) surface, whereas high-throughput sequencing, also called next-generation sequencing, is a newer technology that measures gene expression levels by directly sequencing mRNAs and obtaining each mRNA’s copy number in cells or tissues. We also developed a web portal (http://sys.bio.mtu.edu/) to make all pipelines available to the public so that users can analyze their own gene expression data. In addition to the analyses mentioned above, it can also perform GO hierarchy analysis, i.e. construct GO trees using a list of GO terms as input.
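
The abstract does not say which statistical test its enrichment pipelines use. As an illustration only, the Python sketch below uses a one-sided hypergeometric test, which is a common choice for GO term or protein domain enrichment; the function name and the numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric tail P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes carrying the term
    sample:     number of differentially expressed genes (DEGs)
    hits:       DEGs carrying the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers: 20 genes, 5 annotated with the term, 4 DEGs, 3 of them annotated.
p = enrichment_p_value(population=20, annotated=5, sample=4, hits=3)
```

A small p-value suggests the term appears among the DEGs more often than expected by chance; a real pipeline would also correct for testing many terms at once.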

Abstract:

Analyzing the characteristics and predicting the dynamic behavior of complex systems over time requires comprehensive research into systems that can intelligently adapt to evolving conditions and infer new knowledge with algorithms that are not predesigned. This dissertation research studies the integration of techniques and methodologies from the fields of pattern recognition, intelligent agents, artificial immune systems, and distributed computing platforms, to create technologies that can more accurately describe and control the dynamics of real-world complex systems. The need for such technologies is emerging in manufacturing, transportation, hazard mitigation, weather and climate prediction, homeland security, and emergency response. Motivated by the ability of mobile agents to dynamically incorporate additional computational and control algorithms into executing applications, mobile agent technology is employed in this research for adaptive sensing and monitoring in a wireless sensor network. Mobile agents are software components that can travel from one computing platform to another in a network and carry the programs and data states needed for performing their assigned tasks. To support the generation, migration, communication, and management of mobile monitoring agents, an embeddable mobile agent system (Mobile-C) is integrated with sensor nodes. Mobile monitoring agents visit distributed sensor nodes, read real-time sensor data, and perform anomaly detection using the pattern recognition algorithms they carry. The optimal control of agents is achieved by mimicking the adaptive immune response and applying multi-objective optimization algorithms. The mobile agent approach has the potential to reduce the communication load and energy consumption in monitoring networks.
The major research work of this dissertation project includes: (1) studying effective feature extraction methods for time series measurement data; (2) investigating the impact of the feature extraction methods and dissimilarity measures on the performance of pattern recognition; (3) researching the effects of environmental factors on the performance of pattern recognition; (4) integrating an embeddable mobile agent system with wireless sensor nodes; (5) optimizing agent generation and distribution using artificial immune system concepts and multi-objective algorithms; (6) applying mobile agent technology and pattern recognition algorithms to adaptive structural health monitoring and driving cycle pattern recognition; (7) developing a web-based monitoring network to enable remote visualization and analysis of real-time sensor data. The techniques and algorithms developed in this dissertation project will contribute to research advances in networked distributed systems operating under changing environments.

Abstract:

Semantics, knowledge and Grids represent three spaces where people interact, understand, learn and create. Grids represent the advanced cyber-infrastructures and evolution. Big data influence the evolution of semantics, knowledge and Grids. Exploring semantics, knowledge and Grids on big data helps accelerate the shift of scientific paradigm, the fourth industrial revolution, and the transformational innovation of technologies.

Abstract:

This paper presents an application that composes formal poetry in Spanish in a semiautomatic, interactive fashion. JASPER is a forward-reasoning rule-based system that obtains from the user an intended message, the desired metric, a choice of vocabulary, and a corpus of verses; and, by intelligent adaptation of selected examples from this corpus using the given words, carries out a prose-to-poetry translation of the given message. In the composition process, JASPER combines natural language generation and a set of construction heuristics obtained from formal literature on Spanish poetry.

Abstract:

Undergraduate psychology students rated expectations of a bogus professor (randomly designated a man or woman and hot versus not hot) based on an online rating and sample comments as found on RateMyProfessors.com (RMP). Five professor qualities were derived using principal components analysis (PCA): dedication, attractiveness, enhancement, fairness, and clarity. Participants rated current psychology professors on the same qualities. Current professors were divided based on gender (man or woman), age (under 35 or 35 and older), and attractiveness (at or below the median or above the median). Using multivariate analysis of covariance (MANCOVA), students expected hot professors to be more attractive but lower in clarity. They rated current professors as lowest in clarity when a man and 35 or older. Current professors were rated significantly lower in dedication, enhancement, fairness, and clarity when rated at or below the median on attractiveness. Results, with previous research, suggest numerous factors, largely out of professors’ control, influencing how students interpret and create professor ratings. Caution is therefore warranted in using online ratings to select courses or make hiring and promotion decisions.

Abstract:

This paper focuses on the development of an algorithm, using Matlab, to generate Typical Meteorological Years from weather data for eight locations on Madeira Island and to predict the energy generation of photovoltaic systems based on solar cell modelling. The solar cell model includes the effect of ambient temperature and wind speed. The analysis of the PV system performance is carried out through the Weather Corrected Performance Ratio, and the PV system yield for the entire island is estimated using spatial interpolation tools.

Abstract:

The research activities involved the application of Geomatics techniques in the Cultural Heritage field, following the development of two themes. The first is the application of high-precision surveying techniques for the restoration and interpretation of relevant monuments and archaeological finds. The main case concerns the generation of a high-fidelity 3D model of the Fountain of Neptune in Bologna. In this work, aimed at the restoration of the artefact, both the geometric and radiometric aspects were crucial. The final product was the basis of a 3D information system, a shared tool through which the different professionals involved in the restoration activities contributed in a multidisciplinary approach. The second is the arrangement of 3D databases for a Building Information Modeling (BIM) approach, in a process which involves the generation and management of digital representations of the physical and functional characteristics of historical buildings, towards a so-called Historical Building Information Model (HBIM). A first application was conducted for the church of San Michele in Acerboli in Santarcangelo di Romagna. The survey was performed by integrating classical and modern Geomatics techniques, and the point cloud representing the church was used to develop an HBIM model in which the relevant information connected to the building could be stored and georeferenced. A second application concerns the domus of Obellio Firmo in Pompeii, also surveyed by integrating classical and modern Geomatics techniques. A historical analysis made it possible to define the phases and to organize a database of materials and constructive elements. The goal is to obtain a federated model able to manage the different aspects: documentary, analytical and reconstructive.

Abstract:

Nowadays, information security is a very important topic. In particular, wireless networks are becoming ever more widespread, thanks also to the increasing number of Internet of Things devices, which generate and transmit a lot of data. Protecting wireless communications is therefore of fundamental importance, possibly through an easy but secure method. Physical Layer Security is an umbrella of techniques that leverage the characteristics of the wireless channel to secure the transmission. In particular, Physical Layer-based Key generation aims at allowing two users to generate random symmetric keys autonomously, hence without the aid of a trusted third entity. Physical Layer-based Key generation relies on observations of the wireless channel, from which entropy is harvested. However, an attacker might possess a channel simulator, for example a Ray Tracing simulator, to replicate the channel between the legitimate users in order to guess the secret key and break the security of the communication. This thesis focuses on the feasibility of carrying out such a so-called Ray Tracing attack. The method used for the assessment consists of a set of channel measurements, in different channel conditions, that are then compared with the channel simulated by ray tracing, in order to compute the mutual information between the measurements and the simulations. Furthermore, the possibility of using Ray Tracing as a tool to evaluate the impact of channel parameters (e.g. the bandwidth or the directivity of the antenna) on Physical Layer-based Key generation is also presented. The measurements were carried out at the Barkhausen Institut gGmbH in Dresden, Germany, in the framework of the existing cooperation agreement between BI and the Dept. of Electrical, Electronics and Information Engineering "G. Marconi" (DEI) at the University of Bologna.
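
The comparison between measured and simulated channels can be sketched as follows. This is a hedged illustration, not the thesis's procedure: it assumes each channel observation is quantised to one bit around the median, and it uses a simple plug-in estimator of mutual information. The function names and the sample values are hypothetical.

```python
from collections import Counter
from math import log2
from statistics import median


def quantize(samples):
    """Map each sample to 1 if it lies above the median, else 0 (assumed scheme)."""
    m = median(samples)
    return [1 if s > m else 0 for s in samples]


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


# Made-up received-power values in dBm: measured vs. ray-traced.
measured = [-71.2, -64.8, -69.5, -60.1]
simulated = [-70.0, -66.0, -68.0, -61.0]
leak = mutual_information(quantize(measured), quantize(simulated))
```

A value close to the key entropy (here 1 bit per sample) would mean the simulation reveals most of the key material, while a value near zero would mean the simulated channel tells the attacker little. With so few samples the estimate is only illustrative.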

Abstract:

Gastrointestinal stromal tumors (GIST) are the most common mesenchymal tumors of the gastrointestinal tract, arising from the interstitial cells of Cajal (ICCs) or their precursors. The vast majority of GISTs (75–85%) harbor KIT or PDGFRA mutations. A small percentage of GISTs (about 10–15%) do not harbor any of these driver mutations and have historically been called wild-type (WT). Among them, 20% to 40% show loss of function of the succinate dehydrogenase (SDH) complex and are therefore defined as SDH-deficient GIST. SDH-deficient GISTs display distinctive clinical and pathological features, and can be sporadic or associated with Carney triad or Carney-Stratakis syndrome. These tumors arise most frequently in the stomach, with a predilection for the distal stomach and antrum, show multi-nodular growth, display a histological epithelioid phenotype, and present frequent lympho-vascular invasion. Occurrence of lymph node metastases and an indolent course are representative features of SDH-deficient GISTs. This subset of GIST is known for the immunohistochemical loss of succinate dehydrogenase subunit B (SDHB), which signals the loss of function of the entire SDH complex. The overall aim of my PhD project is the comprehensive characterization of SDH-deficient GIST. Throughout the project, clinical, molecular and cellular characterizations were performed using next-generation sequencing (NGS) technologies, which have the potential to allow the identification of molecular patterns useful for diagnosis and for the development of novel treatments. Moreover, while there are many different cell lines and preclinical models of KIT/PDGFRA-mutant GIST, no reliable cell model of SDH-deficient GIST has yet been developed that could be used for studies on tumor evolution and in vitro assessments of drug response. Therefore, another aim of this project was to develop a pre-clinical model of SDH-deficient GIST using the novel technology of induced pluripotent stem cells (iPSC).

Abstract:

In the near future, the LHC experiments will continue to be upgraded as the LHC luminosity increases from the design value of 10^34 cm^-2 s^-1 to 7.5 × 10^34 cm^-2 s^-1 with the HL-LHC project, to reach 3000 fb^-1 of accumulated statistics. After the end of a period of data collection, CERN will face a long shutdown to improve overall performance by upgrading the experiments and implementing more advanced technologies and infrastructures. In particular, ATLAS will upgrade parts of the detector, the trigger, and the data acquisition system. It will also implement new strategies and algorithms for processing and transferring the data to the final storage. This PhD thesis presents a study of a new pattern recognition algorithm to be used in the trigger system, which is software designed to provide the information necessary to select physical events from background data. The idea is to use the well-known Hough Transform as an algorithm for detecting particle trajectories. The effectiveness of the algorithm has already been validated in the past, independently of particle physics applications, for detecting generic shapes in images. Here, a software emulation tool is proposed for the hardware implementation of the Hough Transform, to reconstruct the tracks in the ATLAS Trigger and Data Acquisition system. Until now, it has never been implemented on electronics in particle physics experiments, and as a hardware implementation it would provide overall latency benefits. A comparison between the simulated data and the physical system was performed on a Xilinx UltraScale+ FPGA device.
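
As a reminder of how the Hough Transform finds aligned points, here is a minimal Python sketch of the textbook straight-line version, in which each point votes for every (theta, rho) pair satisfying rho = x cos(theta) + y sin(theta). It is not the ATLAS track parametrisation or the FPGA design discussed in the thesis; the function name, binning and hit coordinates are assumptions made for illustration.

```python
from collections import Counter
from math import cos, pi, sin


def hough_peak(points, n_theta=180, rho_step=1.0):
    """Vote in (theta, rho) space and return (theta_index, rho_bin, votes)
    for the most-voted accumulator cell."""
    acc = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = t * pi / n_theta
            rho = x * cos(theta) + y * sin(theta)
            acc[(t, round(rho / rho_step))] += 1
    (t, r), votes = acc.most_common(1)[0]
    return t, r, votes


# Ten made-up hits on the vertical line x = 5, plus two noise hits.
hits = [(5, y) for y in range(0, 100, 10)] + [(40, 3), (12, 77)]
peak = hough_peak(hits)
```

The peak cell collects one vote from every hit on the line, while the noise hits spread their votes over unrelated cells. Because each hit votes independently of the others, the accumulation step parallelises naturally.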

Abstract:

Correctness of the information gathered in production environments is an essential part of quality assurance processes in many industries; this task is often performed by human operators who visually take annotations at various steps of the production flow. Depending on the task performed, the correlation between where exactly the information is gathered and what it represents is more often than not lost in the process. The lack of labeled data severely limits the application of deep neural networks to object detection tasks; moreover, supervised training of deep models requires a great amount of data to be available. Building an adequately large collection of labeled images through classic data annotation techniques is an exhausting and costly task, not always suitable for every scenario. A possible solution is to generate synthetic data that replicates the real data and use it to fine-tune a deep neural network, trained on one or more source domains, to a different target domain. The purpose of this thesis is to show a real case scenario where the provided data were both very scarce and missing the required annotations. Subsequently, a possible approach is presented in which synthetic data are generated to address those issues while serving as a training base for deep neural networks for object detection, capable of working on images taken in production-like environments. Lastly, the thesis compares performance across different types of synthetic data and across the convolutional neural networks used as backbones for the model.
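
One reason synthetic data helps with missing annotations is that the generator knows where it placed each object, so the label comes for free. The Python sketch below illustrates this under strong simplifying assumptions: images are plain 2D lists, the background is blank, and `make_sample` is a hypothetical helper rather than the thesis's generator.

```python
import random


def make_sample(bg_w, bg_h, patch, rng):
    """Paste `patch` (a 2D list of pixel values) onto a blank background
    at a random position. Returns (image, (x_min, y_min, x_max, y_max))."""
    ph, pw = len(patch), len(patch[0])
    x0 = rng.randint(0, bg_w - pw)
    y0 = rng.randint(0, bg_h - ph)
    image = [[0] * bg_w for _ in range(bg_h)]
    for dy, row in enumerate(patch):
        for dx, value in enumerate(row):
            image[y0 + dy][x0 + dx] = value
    # The bounding box is known exactly because we chose the position.
    return image, (x0, y0, x0 + pw - 1, y0 + ph - 1)


rng = random.Random(0)  # fixed seed so the dataset is reproducible
label_patch = [[255, 255, 255], [255, 0, 255]]
dataset = [make_sample(16, 12, label_patch, rng) for _ in range(5)]
```

A realistic generator would vary backgrounds, lighting, scale and occlusion so that a detector fine-tuned on the synthetic images transfers to production-like photographs.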