997 results for Automatic writing
Abstract:
An energy-momentum conserving time integrator coupled with an automatic finite element algorithm is developed to study longitudinal wave propagation in hyperelastic layers. The Murnaghan strain energy function is used to model material nonlinearity, and full geometric nonlinearity is considered. An automatic assembly algorithm using algorithmic differentiation is developed within a discrete Hamiltonian framework to directly formulate the finite element matrices without recourse to an explicit derivation of their algebraic form or the governing equations. The algorithm is illustrated with applications to longitudinal wave propagation in a thin hyperelastic layer modeled with a two-mode kinematic model. A solution obtained using a standard nonlinear finite element model with Newmark time stepping is provided for comparison. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
There are many wireless sensor network (WSN) applications that require reliable data transfer between nodes. Several techniques, including link-level retransmission, error correction methods, and hybrid Automatic Repeat reQuest (ARQ), have been introduced into wireless sensor networks to ensure reliability. In this paper, we use the Automatic reSend request (ASQ) technique with regular acknowledgement to design a reliable end-to-end communication protocol for WSNs, called the Adaptive Reliable Transport (ARTP) protocol. Besides ensuring reliability, the objective of the ARTP protocol is to ensure message stream FIFO at the receiver side instead of the byte stream FIFO used in the TCP/IP protocol suite. To realize this objective, a new protocol stack has been used in the ARTP protocol. The ARTP protocol saves energy without affecting throughput by sending three different types of acknowledgements, viz. ACK, NACK and FNACK, with semantics different from those currently existing in the literature, and by adapting to the network conditions. Additionally, the protocol controls flow based on the receiver's feedback and controls congestion by holding ACK messages. To the best of our knowledge, there has been little or no attempt to build a receiver-controlled, regularly acknowledged reliable communication protocol. We have carried out extensive simulation studies of our protocol using the Castalia simulator, and the study shows that our protocol performs better than related protocols in wireless/wireline networks in terms of throughput and energy efficiency.
Abstract:
MATLAB is an array language, initially popular for rapid prototyping, that is now increasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism. These programs also have control-flow-dominated scalar regions that have an impact on the program's execution time. Today's computer systems have tremendous computing power in the form of traditional CPU cores and throughput-oriented accelerators such as graphics processing units (GPUs). Thus, an approach that maps the control-flow-dominated regions to the CPU and the data parallel regions to the GPU can significantly improve program performance. In this paper, we present the design and implementation of MEGHA, a compiler that automatically compiles MATLAB programs to enable synergistic execution on heterogeneous processors. Our solution is fully automated and does not require programmer input for identifying data parallel regions. We propose a set of compiler optimizations tailored for MATLAB. Our compiler identifies data parallel regions of the program and composes them into kernels. The problem of combining statements into kernels is formulated as a constrained graph clustering problem. Heuristics are presented to map identified kernels to either the CPU or the GPU so that kernel execution on the CPU and the GPU happens synergistically and the amount of data transfer needed is minimized. To ensure the required data movement for dependencies across basic blocks, we propose a data flow analysis and edge splitting strategy. Thus, our compiler automatically handles composition of kernels, mapping of kernels to CPU and GPU, scheduling, and insertion of required data transfers. The proposed compiler was implemented, and experimental evaluation using a set of MATLAB benchmarks shows that our approach achieves a geometric mean speedup of 19.8X for data parallel benchmarks over native execution of MATLAB.
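For readers unfamiliar with how a geometric mean speedup is aggregated over a benchmark set, the following is a minimal sketch. The function name and the timings are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

def geometric_mean_speedup(baseline_times, optimized_times):
    """Geometric mean of per-benchmark speedups (baseline time / optimized time)."""
    speedups = [b / o for b, o in zip(baseline_times, optimized_times)]
    # Averaging in log space is equivalent to taking the n-th root of the product.
    return math.exp(sum(math.log(s) for s in speedups) / len(speedups))

# Hypothetical timings in seconds, for illustration only.
baseline = [8.0, 2.0]
optimized = [1.0, 1.0]
print(geometric_mean_speedup(baseline, optimized))
```

The geometric mean is the usual choice for ratios because a single very large speedup influences it less than it would an arithmetic mean.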
Abstract:
Analysis of high-resolution satellite images has been an important research topic for urban analysis. Automatic road network extraction is one of the important tasks in urban analysis. Two approaches for road extraction, based on the Level Set and Mean Shift methods, are proposed. Extracting roads from an original image is difficult and computationally expensive due to the presence of other road-like features with straight edges. The image is preprocessed to improve the tolerance by reducing the noise (buildings, parking lots, vegetation regions and other open spaces), and roads are first extracted as elongated regions. Nonlinear noise segments are then removed using a median filter (based on the fact that road networks constitute a large number of small linear structures). Road extraction is then performed using the Level Set and Mean Shift methods. Finally, the accuracy of the extracted road images is evaluated based on quality measures. IKONOS data with 1 m resolution has been used for the experiment.
Abstract:
Latent variable methods, such as PLCA (Probabilistic Latent Component Analysis), have been successfully used for analysis of non-negative signal representations. In this paper, we formulate PLCS (Probabilistic Latent Component Segmentation), which models each time frame of a spectrogram as a spectral distribution. Given the signal spectrogram, the segmentation boundaries are estimated using a maximum-likelihood (ML) approach. For an efficient solution, the algorithm imposes a hard constraint that each segment is modelled by a single latent component. The hard constraint facilitates the solution of ML boundary estimation using dynamic programming. Unlike earlier ML segmentation techniques, the PLCS framework does not impose a parametric assumption. PLCS can be naturally extended to model coarticulation between successive phones. Experiments on the TIMIT corpus show that the proposed technique is promising compared to most state-of-the-art speech segmentation algorithms.
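The idea of ML boundary estimation by dynamic programming under a one-component-per-segment constraint can be sketched as follows. This is an illustrative simplification, not the PLCS algorithm itself: it assumes the number of segments is known, treats each segment's component as the normalized sum of its frames, and uses hypothetical function names.

```python
import numpy as np

def segment_loglik(V, i, j, eps=1e-12):
    """Log-likelihood of frames i..j-1 under one shared spectral distribution."""
    counts = V[:, i:j].sum(axis=1)
    p = counts / max(counts.sum(), eps)  # ML estimate of the segment's distribution
    return float(np.sum(counts * np.log(p + eps)))

def ml_segment(V, n_segments):
    """Return the interior boundaries (frame indices) maximizing total log-likelihood."""
    T = V.shape[1]
    best = np.full((n_segments + 1, T + 1), -np.inf)
    back = np.zeros((n_segments + 1, T + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, T + 1):           # segment k ends just before frame j
            for i in range(k - 1, j):       # segment k starts at frame i
                score = best[k - 1, i] + segment_loglik(V, i, j)
                if score > best[k, j]:
                    best[k, j] = score
                    back[k, j] = i
    # Backtrack the start index of each segment, last segment first.
    starts, j = [], T
    for k in range(n_segments, 0, -1):
        j = int(back[k, j])
        starts.append(j)
    return sorted(starts)[1:]  # drop the leading 0, keep interior boundaries

# Toy spectrogram: 2 frequency bins, 6 frames, spectrum changes at frame 3.
V = np.array([[1, 1, 1, 0, 0, 0],
              [0, 0, 0, 1, 1, 1]], dtype=float)
print(ml_segment(V, 2))
```

The triple loop costs O(K T^2) likelihood evaluations; a practical implementation would cache cumulative sums and limit the maximum segment length.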
Abstract:
This paper presents classification, representation and extraction of deformation features in sheet-metal parts. The thickness is constant for these shape features and hence these are also referred to as constant thickness features. The deformation feature is represented as a set of faces with a characteristic arrangement among the faces. Deformation of the base-sheet or forming of material creates Bends and Walls with respect to a base-sheet or a reference plane. These are referred to as Basic Deformation Features (BDFs). Compound deformation features having two or more BDFs are defined as characteristic combinations of Bends and Walls and represented as a graph called Basic Deformation Features Graph (BDFG). The graph, therefore, represents a compound deformation feature uniquely. The characteristic arrangement of the faces and type of bends belonging to the feature decide the type and nature of the deformation feature. Algorithms have been developed to extract and identify deformation features from a CAD model of sheet-metal parts. The proposed algorithm does not require folding and unfolding of the part as intermediate steps to recognize deformation features. Representations of typical features are illustrated and results of extracting these deformation features from typical sheet metal parts are presented and discussed. (C) 2013 Elsevier Ltd. All rights reserved.
Abstract:
Exploiting the performance potential of GPUs requires managing the data transfers to and from them efficiently, which is an error-prone and tedious task. In this paper, we develop a software coherence mechanism to fully automate all data transfers between the CPU and GPU without any assistance from the programmer. Our mechanism uses compiler analysis to identify potential stale accesses and uses a runtime to initiate transfers as necessary. This allows us to avoid redundant transfers that are exhibited by all other existing automatic memory management proposals. We integrate our automatic memory manager into the X10 compiler and runtime, and find that it not only results in smaller and simpler programs, but also eliminates redundant memory transfers. Tested on eight programs ported from the Rodinia benchmark suite, it achieves (i) a 1.06x speedup over hand-tuned manual memory management, and (ii) a 1.29x speedup over another recently proposed compiler-runtime automatic memory management system. Compared to other existing runtime-only and compiler-only proposals, it also transfers 2.2x to 13.3x less data on average.
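The runtime half of such a scheme can be illustrated with a small simulation in which each buffer tracks which copy is valid and a transfer is issued only when a stale copy is about to be accessed. This is a sketch under stated assumptions, not the mechanism from the paper: the `CoherentBuffer` class and its methods are hypothetical, both "devices" are plain Python lists, and the compiler analysis that would place the checks is not modelled.

```python
class CoherentBuffer:
    """Simulated buffer with one copy per device and a validity flag for each."""

    def __init__(self, data):
        self.copies = {"cpu": list(data), "gpu": None}
        self.valid = {"cpu": True, "gpu": False}
        self.transfers = 0  # number of simulated CPU<->GPU copies

    def _ensure_valid(self, device):
        # Transfer only if the copy on this device is stale.
        if not self.valid[device]:
            source = "cpu" if device == "gpu" else "gpu"
            self.copies[device] = list(self.copies[source])
            self.valid[device] = True
            self.transfers += 1

    def read(self, device):
        self._ensure_valid(device)
        return list(self.copies[device])

    def write(self, device, index, value):
        self._ensure_valid(device)
        self.copies[device][index] = value
        # A write makes the other device's copy stale.
        other = "cpu" if device == "gpu" else "gpu"
        self.valid[other] = False

buf = CoherentBuffer([1, 2, 3])
buf.read("gpu")          # first GPU access triggers one transfer
buf.read("gpu")          # copy is still valid, no transfer
buf.write("gpu", 0, 9)   # CPU copy becomes stale
print(buf.read("cpu"), buf.transfers)
```

Repeated accesses on the same device cost nothing after the first, which is the redundancy a naive "copy in before, copy out after every kernel" policy would not avoid.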
Abstract:
We report high aspect-ratio micromechanical structures made of SU-8 polymer, which is a negative photoresist. Mask-less direct writing with a 405 nm laser is used to pattern spin-cast SU-8 films more than 600 µm thick. Compared with X-ray lithography, which helps pattern material to give aspect ratios of 1:50 or higher, laser writing is a less expensive and more accessible alternative. In this work, aspect ratios up to 1:30 were obtained on narrow pillars and cantilever structures. Deep vertical patterning was achieved in multiple exposures of the surface with varying dosages given at periodic intervals of sufficient duration. It was found that a time lag between successive exposures at the same location helps the material recover from the transient changes that occur during exposure to the laser. This gives vertical sidewalls to the resulting structures. The time lags and dosages were determined by conducting several trials. The micromechanical structures obtained with laser writing are compared with those obtained with traditional UV lithography as well as e-beam lithography. Laser writing gives not only high aspect ratios but also narrow gaps, whereas e-beam lithography can only give narrow gaps over very small depths. Unlike traditional UV lithography, laser writing does not need a mask. Furthermore, there is no adjustment for varying the dosage in traditional UV lithography. A drawback of this method compared to UV lithography is that the writing time increases. Some test structures as well as a compliant microgripper are fabricated.
Abstract:
Multi-GPU machines are being increasingly used in high-performance computing. Each GPU in such a machine has its own memory and does not share the address space either with the host CPU or with other GPUs. Hence, applications utilizing multiple GPUs have to manually allocate and manage data on each GPU. Existing works that propose to automate data allocations for GPUs have limitations and inefficiencies in terms of allocation sizes, exploiting reuse, transfer costs, and scalability. We propose a scalable and fully automatic data allocation and buffer management scheme for affine loop nests on multi-GPU machines. We call it the Bounding-Box-based Memory Manager (BBMM). At runtime, BBMM can perform standard set operations such as union, intersection, and difference, and can find subset and superset relations, on hyperrectangular regions of array data (bounding boxes). It uses these operations along with some compiler assistance to identify, allocate, and manage data required by applications in terms of disjoint bounding boxes. This allows it to (1) allocate exactly or nearly as much data as is required by computations running on each GPU, (2) efficiently track buffer allocations and hence maximize data reuse across tiles and minimize data transfer overhead, and (3) as a result, maximize utilization of the combined memory on multi-GPU machines. BBMM can work with any choice of parallelizing transformations, computation placement, and scheduling schemes, whether static or dynamic. Experiments run on a four-GPU machine with various scientific programs showed that BBMM reduces data allocations on each GPU by up to 75% compared to current allocation schemes, yields performance of at least 88% of manually written code, and allows excellent weak scaling.
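Set operations on hyperrectangular regions are simple to express per dimension. The following is a minimal sketch, not BBMM's implementation: the `Box` class and its method names are hypothetical, bounds are half-open integer intervals, `hull` returns the smallest single box covering a union, and `difference` splits the remainder into disjoint boxes.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class Box:
    lo: Tuple[int, ...]  # inclusive lower corner
    hi: Tuple[int, ...]  # exclusive upper corner

    def is_empty(self) -> bool:
        return any(l >= h for l, h in zip(self.lo, self.hi))

    def volume(self) -> int:
        if self.is_empty():
            return 0
        size = 1
        for l, h in zip(self.lo, self.hi):
            size *= h - l
        return size

    def intersect(self, other: "Box") -> "Box":
        return Box(tuple(max(a, b) for a, b in zip(self.lo, other.lo)),
                   tuple(min(a, b) for a, b in zip(self.hi, other.hi)))

    def hull(self, other: "Box") -> "Box":
        # Smallest box containing both; over-approximates the set union.
        return Box(tuple(min(a, b) for a, b in zip(self.lo, other.lo)),
                   tuple(max(a, b) for a, b in zip(self.hi, other.hi)))

    def contains(self, other: "Box") -> bool:
        return other.is_empty() or all(
            sl <= ol and oh <= sh
            for sl, sh, ol, oh in zip(self.lo, self.hi, other.lo, other.hi))

    def difference(self, other: "Box") -> List["Box"]:
        """Disjoint boxes covering the part of self that lies outside other."""
        inter = self.intersect(other)
        if inter.is_empty():
            return [self]
        pieces, lo, hi = [], list(self.lo), list(self.hi)
        for d in range(len(lo)):
            if lo[d] < inter.lo[d]:       # slab below the overlap in dimension d
                piece_hi = hi.copy()
                piece_hi[d] = inter.lo[d]
                pieces.append(Box(tuple(lo), tuple(piece_hi)))
                lo[d] = inter.lo[d]
            if inter.hi[d] < hi[d]:       # slab above the overlap in dimension d
                piece_lo = lo.copy()
                piece_lo[d] = inter.hi[d]
                pieces.append(Box(tuple(piece_lo), tuple(hi)))
                hi[d] = inter.hi[d]
        return pieces

a = Box((0, 0), (4, 4))
b = Box((2, 2), (6, 6))
print(a.intersect(b), a.hull(b), a.difference(b))
```

Keeping the pieces disjoint means the sum of their volumes equals the exact amount of data outside the overlap, which is the kind of bookkeeping needed to allocate no more than a computation requires.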
Abstract:
The tonic is a fundamental concept in Indian art music. It is the base pitch, which an artist chooses in order to construct the melodies during a rāg(a) rendition, and all accompanying instruments are tuned using the tonic pitch. Consequently, tonic identification is a fundamental task for most computational analyses of Indian art music, such as intonation analysis, melodic motif analysis and rāg recognition. In this paper we review existing approaches for tonic identification in Indian art music and evaluate them on six diverse datasets for a thorough comparison and analysis. We study the performance of each method in different contexts such as the presence/absence of additional metadata, the quality of audio data, the duration of audio data, music tradition (Hindustani/Carnatic) and the gender of the singer (male/female). We show that the approaches that combine multi-pitch analysis with machine learning provide the best performance in most cases (90% identification accuracy on average), and are robust across the aforementioned contexts compared to the approaches based on expert knowledge. In addition, we also show that the performance of the latter can be improved when additional metadata is available to further constrain the problem. Finally, we present a detailed error analysis of each method, providing further insights into the advantages and limitations of the methods.
Abstract:
The formulation of higher order structural models and their discretization using the finite element method is difficult owing to their complexity, especially in the presence of non-linearities. In this work a new algorithm for automating the formulation and assembly of hyperelastic higher-order structural finite elements is developed. A hierarchic series of kinematic models is proposed for modeling structures with special geometries, and the algorithm is formulated to automate the study of this class of higher order structural models. The algorithm developed in this work sidesteps the need for an explicit derivation of the governing equations for the individual kinematic modes. Using a novel procedure involving a nodal degree-of-freedom based automatic assembly algorithm, automatic differentiation and higher dimensional quadrature, the relevant finite element matrices are directly computed from the variational statement of elasticity and the higher order kinematic model. Another significant feature of the proposed algorithm is that natural boundary conditions are implicitly handled for arbitrary higher order kinematic models. The validity of the algorithm is illustrated with examples involving linear elasticity and hyperelasticity. (C) 2013 Elsevier Inc. All rights reserved.
Abstract:
Background: The function of a protein can be deciphered with higher accuracy from its structure than from its amino acid sequence. Due to the huge gap in the available protein sequence and structural space, tools that can generate functionally homogeneous clusters using only the sequence information hold great importance. For this, traditional alignment-based tools work well in most cases, and clustering is performed on the basis of sequence similarity. However, in the case of multi-domain proteins, the alignment quality might be poor due to varied lengths of the proteins, domain shuffling or circular permutations. Multi-domain proteins are ubiquitous in nature; hence alignment-free tools, which overcome the shortcomings of alignment-based protein comparison methods, are required. Further, existing tools classify proteins using only domain-level information and hence miss out on the information encoded in the tethered regions or accessory domains. Our method, on the other hand, takes into account the full-length sequence of a protein, consolidating the complete sequence information to understand a given protein better. Results: Our web server, CLAP (Classification of Proteins), is one such alignment-free software for automatic classification of protein sequences. It utilizes a pattern-matching algorithm that assigns local matching scores (LMS) to residues that are part of the matched patterns between two sequences being compared. CLAP works on full-length sequences and does not require prior domain definitions. Pilot studies undertaken previously on protein kinases and immunoglobulins have shown that CLAP yields clusters that have high functional and domain architectural similarity. Moreover, parsing at a statistically determined cut-off resulted in clusters that corroborated the sub-family level classification of that particular domain family.
Conclusions: CLAP is a useful protein-clustering tool, independent of domain assignment, domain order, sequence length and domain diversity. Our method can be used for any set of protein sequences, yielding functionally relevant clusters with high domain architectural homogeneity. The CLAP web server is freely available for academic use at http://nslab.mbu.iisc.ernet.in/clap/.
Abstract:
A new automatic algorithm for the assessment of mixed mode crack growth rate characteristics is presented based on the concept of an equivalent crack. The residual ligament size approach is introduced to implement this algorithm for identifying the crack tip position on a curved path with respect to the drop potential signal. The automatic algorithm, accounting for the curvilinear crack trajectory and employing an electrical potential difference, was calibrated with respect to optical measurements for the growing crack under cyclic mixed mode loading conditions. The effectiveness of the proposed algorithm is confirmed by fatigue tests performed on ST3 steel compact tension-shear specimens in the full range of mode mixities from pure mode I to pure mode II. (C) 2015 Elsevier Ltd. All rights reserved.
Abstract:
This paper presents the design and implementation of PolyMage, a domain-specific language and compiler for image processing pipelines. An image processing pipeline can be viewed as a graph of interconnected stages which process images successively. Each stage typically performs one of point-wise, stencil, reduction or data-dependent operations on image pixels. Individual stages in a pipeline typically exhibit abundant data parallelism that can be exploited with relative ease. However, the stages also require high memory bandwidth preventing effective utilization of parallelism available on modern architectures. For applications that demand high performance, the traditional options are to use optimized libraries like OpenCV or to optimize manually. While using libraries precludes optimization across library routines, manual optimization accounting for both parallelism and locality is very tedious. The focus of our system, PolyMage, is on automatically generating high-performance implementations of image processing pipelines expressed in a high-level declarative language. Our optimization approach primarily relies on the transformation and code generation capabilities of the polyhedral compiler framework. To the best of our knowledge, this is the first model-driven compiler for image processing pipelines that performs complex fusion, tiling, and storage optimization automatically. Experimental results on a modern multicore system show that the performance achieved by our automatic approach is up to 1.81x better than that achieved through manual tuning in Halide, a state-of-the-art language and compiler for image processing pipelines. For a camera raw image processing pipeline, our performance is comparable to that of a hand-tuned implementation.