971 results for Language Model


Relevance:

30.00%

Publisher:

Abstract:

The purpose of this PhD thesis is to investigate a semantic relation present in the connection of sentences (more specifically: propositional units). This relation, which we refer to as contrast, includes the traditional categories of adversatives - predominantly represented by the connector "but" in English and "pero" in Modern Spanish - and concessives, prototypically verbalised through "although" / "aunque". The aim is to describe, analyse and - as far as possible - explain the emergence and evolution of different syntactic schemes marking contrast during the first three centuries of Spanish (also referred to as Castilian) as a literary language, i.e., from the 13th to the 15th century. The starting point of this question is a commonplace in syntax, whereby the semantic and syntactic complexity of clause linkage correlates with the degree of textual elaboration. In historical linguistics, i.e., applied to the phylogeny of a language, it is commonly referred to as the parataxis hypothesis. A crucial part of the thesis is dedicated to the definition of contrast as a semantic relation. Although the label "contrast" has been used in this sense, mainly in functional grammar and text linguistics, mainstream grammaticography and linguistics remain attached to the traditional categories of adversatives and concessives. In opposition to this traditional view, we present our own model of contrast, based on a pragma-semantic description proposed for the analysis of adversatives by Oswald Ducrot and subsequently adopted by Ekkehard König for the analysis of concessives.
We refine and further develop this model so that it accommodates all instances of contrast in Spanish, not just the prototypical ones. We argue that the relationship between adversatives and concessives is a marked opposition, i.e., that the higher degree of semantic and syntactic integration of concessives restricts some possible readings that the adversatives may have, but that this difference is almost systematically neutralised by contextual factors, thus justifying the assumption of contrast as a comprehensive onomasiological category. This theoretical focus is completed by a state-of-the-question overview attempting to account for all relevant forms in which contrast is expressed in Medieval Spanish, with the aid of lexicographic and grammaticographical sources, and by an empirical corpus study of the textual functions of contrast in nine Medieval Spanish texts: Cantar de Mio Cid, Libro de Alexandre, Milagros de Nuestra Señora, Estoria de España, Primera Partida, Lapidario, Libro de buen amor, Conde Lucanor, and Corbacho. This corpus is analysed using quantitative and qualitative tools, and the study is accompanied by a series of methodological remarks on how to investigate a pragma-semantic category in historical linguistics. The corpus study shows that the parataxis hypothesis cannot be confirmed from a statistical viewpoint, although a qualitative analysis shows that the use of subordination does increase over time in some particular contexts.

Relevance:

30.00%

Publisher:

Abstract:

In this thesis a model for managing the product data in a product transfer project was created for ABB Machines. This model was then applied to an ongoing product transfer project during its planning phase. Detailed information about the demands and challenges in product transfer projects was acquired by analyzing previous product transfer projects in participating organizations. This analysis and the ABB Gate Model were then used as a base for the creation of the model for managing the product data in a product transfer project. The created model shows the main tasks during each phase of the project, their sub-tasks, and how they relate to each other at a general level. Furthermore, the model emphasizes the need for detailed analysis of the situation during the project planning phase. The created model was applied to two main areas of the ongoing project: manufacturing instructions and production item data. The results showed that the greatest challenge for the product transfer project in these areas is the current state of the product data. Based on the findings, process and resource proposals for both the ongoing product transfer project and the BU Machines were given. For manufacturing instructions, it is necessary to create detailed process instructions in the receiving organization's own language for each department so that the manufacturing instructions can be used as training material during the training in the sending organization. For production item data, the English version of the bill of materials needs to be fully in English. In addition, it needs to be ensured that the bill of materials is updated and that these changes are implemented before the training in the sending organization begins.

Relevance:

30.00%

Publisher:

Abstract:

Language diversity has become greatly endangered in the past centuries owing to processes of language shift from indigenous languages to other languages that are seen as socially and economically more advantageous, resulting in the death or doom of minority languages. In this paper, we define a new language competition model that can describe the historical decline of minority languages in competition with more advantageous languages. We then implement this non-spatial model as an interaction term in a reaction-diffusion system to model the evolution of the two competing languages. We use the results to estimate the speed at which the more advantageous language spreads geographically, resulting in the shrinkage of the area of dominance of the minority language. We compare the results from our model with the observed retreat in the area of influence of the Welsh language in the UK, obtaining good agreement between the model and the observed data.

Relevance:

30.00%

Publisher:

Abstract:

In implementing CLIL in higher education, apart from studies on the students' level and the availability of teaching staff, and the development of interdisciplinary teaching material, the current challenge is to get content teachers from a wide range of disciplines actively involved in CLIL. This paper presents the foundations of a model for a CLIL system, using Newtonian dynamics. It may be an interesting and plausible model in a scientific and technological university context, where CLIL has so far been implemented only lightly.

Relevance:

30.00%

Publisher:

Abstract:

We use two coupled equations to analyze the space-time dynamics of two interacting languages. Firstly, we introduce a cohabitation model, which is more appropriate for human populations than classical (non-cohabitation) models. Secondly, using numerical simulations we find the front speed of a new language spreading into a region where another language was previously used. Thirdly, for a special case we derive an analytical formula that makes it possible to check the validity of our numerical simulations. Finally, as an example, we find that the observed front speed for the spread of the English language into Wales in the period 1961-1981 is consistent with the model predictions. We also find that the effects of linguistic parameters are much more important than those of parameters related to population dispersal and reproduction. If the initial population densities of both languages are similar, they have no effect on the front speed. We outline the potential of the new model to analyze relationships between language replacement and genetic replacement.

Relevance:

30.00%

Publisher:

Abstract:

The article describes some concrete problems that were encountered when writing a two-level model of Mari morphology. Mari is an agglutinative Finno-Ugric language spoken in Russia by about 600 000 people. The work was begun in the 1980s on the basis of K. Koskenniemi’s Two-Level Morphology (1983), but in the latest stage R. Beesley’s and L. Karttunen’s Finite State Morphology (2003) was used. Many of the problems described in the article concern the inexplicitness of the rules in Mari grammars and the lack of information about the exact distribution of some suffixes, e.g. enclitics. The Mari grammars usually give complete paradigms for a few unproblematic verb stems, whereas the difficult or unclear forms of certain verbs are only superficially discussed. Another example of phenomena that are poorly described in grammars is the way suffixes with an initial sibilant combine with stems ending in a sibilant. The help of informants and searches of electronic corpora were used to overcome such difficulties in the development of the two-level model of Mari. Variation in the order of plural markers, case suffixes and possessive suffixes is a typical feature of Mari. The morphotactic rules constructed for Mari declensional forms tend to be recursive, and their productivity must be limited by some technical device, such as filters. In the present model, certain plural markers were treated like nouns. The positional and functional versatility of the possessive suffixes can be regarded as the most challenging phenomenon in attempts to formalize Mari morphology. Cyrillic orthography, which was used in the model, also caused problems. For instance, a Cyrillic letter may represent a sequence of two sounds, the first being part of the word stem while the other belongs to a suffix. In some cases, letters for voiced consonants are also generalized to represent voiceless consonants.
Such orthographical conventions distance a morphological model based on orthography from the actual (morpho)phonological processes in the language.

Relevance:

30.00%

Publisher:

Abstract:

Biotechnology has been recognized as the key strategic technology for industrial growth. The industry is heavily dependent on basic research. Finland continues to rank in the top 10 of Europe's most innovative countries in terms of tax policy, education system, infrastructure and the number of patents issued. Despite these excellent statistical results, the output of this innovativeness is below an acceptable level. Research on the issues hindering output creation has already been done, and the identifiable weaknesses in Finland's National Innovation System are the non-existent growth of entrepreneurship and the missing internationalization. Finland is proven to have all the enablers of the innovation policy tools, but is lacking the incentives and rewards to push the enablers, such as knowledge and human capital, forward. Science Parks are the biggest operator among research institutes in the Finnish Science and Technology system. They exist with the purpose of speeding up the commercialization process of biotechnology innovations, which usually involve technological uncertainty, technical inexperience, business inexperience and high technology cost. Purely internal innovation management is a rather dated approach; the current trend is towards an open innovation model with strong triple helix linkages. The evident problems in innovation management within the biotechnology industry are examined through a case study approach, including analysis of semi-structured interviews that drew on biotechnology and business expertise from Turku School of Economics. The results from the interviews supported the theoretical implications as well as the conclusions derived from the pilot survey, which focused on the companies inside the Turku Science Park network. One major issue that Finland's National Innovation System is struggling with is the fact that it is technology driven, not business pulled.
Another problem is the university evaluation scale, which focuses more on the number of graduates and short-term factors, when it should put more emphasis on long-term cooperation success, such as triple helix connections with interaction and knowledge distribution. The results of this thesis indicate that there is indeed a need for some structural changes in Finland's National Innovation System and innovation policy in order to generate successful biotechnology companies and innovation output. There is a lack of joint output and scales of success, a lack of people with experience, a lack of language skills, a lack of business knowledge and a lack of growth companies.

Relevance:

30.00%

Publisher:

Abstract:

This thesis takes as its starting point a questioning of the effectiveness of the EU's conditionality policy regarding minority rights. Based on the rationalist theoretical model, the External Incentives Model of Governance, this hypothesis-testing thesis aims to explain whether the temporal distance to potential EU membership affects the level of legislation on minority language rights. The measurement of the level of legislation on minority language rights is limited to non-discrimination, the use of minority languages in official contexts, and the linguistic rights of minorities in education. Methodologically, a comparative approach is used both with regard to the time frame of the study, which extends from 2003 to 2010, and with regard to the selection of states. On the basis of the "most similar system" design, the states are categorised into three groups according to their different temporal distances from potential EU membership. The hypothesis tested is the following: the shorter the temporal distance to potential EU membership, the greater the likelihood that the states' level of legislation in the three areas studied has developed to a high level. The study shows that the hypothesis is only partly confirmed. The results for non-discrimination show that the relationship between temporal distance and the level of legislation strengthened markedly during the period studied. This relationship strengthened only between the category of states furthest in time from potential EU membership and the two categories that are closer and closest, respectively, to potential EU membership. The results for the use of minority languages in official contexts and for the linguistic rights of minorities in education show no relationship and almost no relationship, respectively, between temporal distance and the development of legislation between 2003 and 2010.

Relevance:

30.00%

Publisher:

Abstract:

The capabilities, and thus the design complexity, of VLSI-based embedded systems have increased tremendously in recent years, riding the wave of Moore’s law. Time-to-market requirements are also shrinking, imposing challenges on designers, who in turn seek to adopt new design methods to increase their productivity. As an answer to these new pressures, modern-day systems have moved towards on-chip multiprocessing technologies. New architectures have emerged in on-chip multiprocessing in order to utilize the tremendous advances in fabrication technology. Platform-based design is a possible solution for addressing these challenges. The principle behind the approach is to separate the functionality of an application from the organization and communication architecture of the hardware platform at several levels of abstraction. The existing design methodologies for the platform-based design approach do not provide full automation at every level of the design process, and sometimes the co-design of platform-based systems leads to sub-optimal systems. In addition, the design productivity gap in multiprocessor systems remains a key challenge due to existing design methodologies. This thesis addresses the aforementioned challenges and discusses the creation of a development framework for platform-based system design, in the context of the SegBus platform - a distributed communication architecture. This research aims to provide automated procedures for platform design and application mapping. Structural verification support is also featured, thus ensuring correct-by-design platforms. The solution is based on a model-based process. Both the platform and the application are modeled using the Unified Modeling Language. This thesis develops a Domain Specific Language to support platform modeling based on a corresponding UML profile. Object Constraint Language constraints are used to support structurally correct platform construction.
An emulator is introduced to allow performance estimation of the solution that is as accurate as possible at high abstraction levels. VHDL code is automatically generated, in the form of “snippets” to be employed in the arbiter modules of the platform, as required by the application. The resulting framework is applied in building an actual design solution for an MP3 stereo audio decoder application.

Relevance:

30.00%

Publisher:

Abstract:

With the shift towards many-core computer architectures, dataflow programming has been proposed as one potential solution for producing software that scales to a varying number of processor cores. Programming for parallel architectures is considered difficult, as the current popular programming languages are inherently sequential and introducing parallelism is typically up to the programmer. Dataflow, however, is inherently parallel, describing an application as a directed graph, where nodes represent calculations and edges represent a data dependency in the form of a queue. These queues are the only allowed communication between the nodes, making the dependencies between the nodes explicit and thereby also the parallelism. Once a node has sufficient inputs available, the node can, independently of any other node, perform calculations, consume inputs, and produce outputs. Dataflow models have existed for several decades and have become popular for describing signal processing applications, as the graph representation is a very natural representation within this field. Digital filters are typically described with boxes and arrows, also in textbooks. Dataflow is also becoming more interesting in other domains, and in principle, any application working on an information stream fits the dataflow paradigm. Such applications are, among others, network protocols, cryptography, and multimedia applications. As an example, the MPEG group standardized a dataflow language called RVC-CAL to be used within reconfigurable video coding. Describing a video coder as a dataflow network, instead of with conventional programming languages, makes the coder more readable, as it describes how the video data flows through the different coding tools. While dataflow provides an intuitive representation for many applications, it also introduces some new problems that need to be solved in order for dataflow to be more widely used.
The explicit parallelism of a dataflow program is descriptive and enables improved utilization of available processing units; however, the independent nodes also imply that some kind of scheduling is required. The need for efficient scheduling becomes even more evident when the number of nodes is larger than the number of processing units and several nodes are running concurrently on one processor core. There exist several dataflow models of computation, with different trade-offs between expressiveness and analyzability. These vary from rather restricted but statically schedulable, with minimal scheduling overhead, to dynamic, where each firing requires a firing rule to be evaluated. The model used in this work, namely RVC-CAL, is a very expressive language, and in the general case it requires dynamic scheduling; however, the strong encapsulation of dataflow nodes enables analysis, and the scheduling overhead can be reduced by using quasi-static, or piecewise static, scheduling techniques. The scheduling problem is concerned with finding the few scheduling decisions that must be made at run time, while most decisions are pre-calculated. The result is then a set of static schedules, as small as possible, that are dynamically scheduled. To identify these dynamic decisions and to find the concrete schedules, this thesis shows how quasi-static scheduling can be represented as a model checking problem. This involves identifying the relevant information to generate a minimal but complete model to be used for model checking. The model must describe everything that may affect scheduling of the application while omitting everything else in order to avoid state space explosion. This kind of simplification is necessary to make the state space analysis feasible. For the model checker to find the actual schedules, a set of scheduling strategies is defined which are able to produce quasi-static schedulers for a wide range of applications.
The results of this work show that actor composition with quasi-static scheduling can be used to transform dataflow programs to fit many different computer architectures with different types and numbers of cores. This, in turn, enables dataflow to provide a more platform-independent representation, as one application can be fitted to a specific processor architecture without changing the actual program representation. Instead, the program representation is optimized by the development tools, in the context of design space exploration, to fit the target platform. This work focuses on representing the dataflow scheduling problem as a model checking problem and is implemented as part of a compiler infrastructure. The thesis also presents experimental results as evidence of the usefulness of the approach.
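
The firing behaviour described in this abstract (a node runs once its input queues hold sufficient tokens, consumes them, and produces outputs) can be illustrated with a small sketch. This is a hypothetical Python illustration, not RVC-CAL and not the quasi-static scheduler developed in the thesis: the `Node` class, the `run` loop, and the example graph are all assumptions made for illustration, and the scheduler shown is the naive fully dynamic kind that re-evaluates every firing rule on each pass.

```python
from collections import deque


class Node:
    """Hypothetical dataflow node: fires only when each input queue has enough tokens."""

    def __init__(self, name, func, inputs, outputs, needed=1):
        self.name = name
        self.func = func        # calculation applied to the consumed tokens
        self.inputs = inputs    # list of deques (incoming edges)
        self.outputs = outputs  # list of deques (outgoing edges)
        self.needed = needed    # tokens required on each input for one firing

    def can_fire(self):
        # Firing rule: sufficient tokens available on every incoming queue.
        return all(len(q) >= self.needed for q in self.inputs)

    def fire(self):
        # Consume inputs, perform the calculation, produce outputs.
        args = [[q.popleft() for _ in range(self.needed)] for q in self.inputs]
        result = self.func(*args)
        for q in self.outputs:
            q.append(result)


def run(nodes):
    """Naive dynamic scheduler: keep evaluating firing rules until none holds."""
    fired = True
    while fired:
        fired = False
        for node in nodes:
            if node.can_fire():
                node.fire()
                fired = True


# Example graph: source queue -> scale -> pair_sum -> sink queue.
source, middle, sink = deque([1, 2, 3, 4]), deque(), deque()
scale = Node("scale", lambda xs: xs[0] * 10, [source], [middle])
pair_sum = Node("pair_sum", lambda xs: sum(xs), [middle], [sink], needed=2)
run([scale, pair_sum])
print(list(sink))
```

Because the queues are the only communication between nodes, `pair_sum` never needs to know about `scale`; it simply waits until two tokens are available. A quasi-static approach, as the abstract describes it, would pre-compute most of these firing decisions instead of testing `can_fire` repeatedly at run time.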

Relevance:

30.00%

Publisher:

Abstract:

Software is a key component in many of the devices and products that we use every day. Most customers demand not only that their devices should function as expected but also that the software should be of high quality, reliable, fault tolerant, efficient, etc. In short, it is not enough that a calculator gives the correct result of a calculation; we want the result instantly, in the right form, with minimal use of battery, etc. One of the key aspects of succeeding in today's industry is delivering high quality. In most software development projects, high-quality software is achieved by rigorous testing and good quality assurance practices. However, today, customers are asking for these high-quality software products at an ever-increasing pace. This leaves the companies with less time for development. Software testing is an expensive activity, because it requires much manual work. Testing, debugging, and verification are estimated to consume 50 to 75 per cent of the total development cost of complex software projects. Further, the most expensive software defects are those which have to be fixed after the product is released. One of the main challenges in software development is reducing the associated cost and time of software testing without sacrificing the quality of the developed software. It is often not enough to only demonstrate that a piece of software is functioning correctly. Usually, many other aspects of the software, such as performance, security, scalability, usability, etc., also need to be verified. Testing these aspects of the software is traditionally referred to as non-functional testing. One of the major challenges with non-functional testing is that it is usually carried out at the end of the software development process, when most of the functionality is implemented. This is due to the fact that non-functional aspects, such as performance or security, apply to the software as a whole. In this thesis, we study the use of model-based testing.
We present approaches to automatically generate tests from behavioral models for solving some of these challenges. We show that model-based testing is not only applicable to functional testing but also to non-functional testing. In its simplest form, performance testing is performed by executing multiple test sequences at once while observing the software in terms of responsiveness and stability, rather than the output. The main contribution of the thesis is a coherent model-based testing approach for testing functional and performance-related issues in software systems. We show how we go from system models, expressed in the Unified Modeling Language, to test cases and back to models again. The system requirements are traced throughout the entire testing process. Requirements traceability facilitates finding faults in the design and implementation of the software. In the research field of model-based testing, many newly proposed approaches suffer from poor or missing tool support. Therefore, the second contribution of this thesis is proper tool support for the proposed approach that is integrated with leading industry tools. We offer independent tools, tools that are integrated with other industry-leading tools, and complete tool-chains when necessary. Many model-based testing approaches proposed by the research community suffer from poor empirical validation in an industrial context. In order to demonstrate the applicability of our proposed approach, we apply our research to several systems, including industrial ones.

Relevance:

30.00%

Publisher:

Abstract:

The academic study of place has been generally defined by two distinct and highly refined discourses within outdoor recreation research: place attachment and sense of place. Place attachment generally describes the intensity of the place relationship, whereas sense of place approaches place from a more holistic and intimate orientation. This study bridges these two methodologically and theoretically separate areas of place research by re-conceptualizing the way in which place relationships are viewed within outdoor recreation research. The Psychological Continuum Model is used to extend the language of place attachment to incorporate more of the philosophy of sense of place while attending to the empirical strength and utility of place attachment. This extension results in the term place allegiance being coined to depict the strong and profound relationships outdoor recreationists build with their places of outdoor recreation. Using a concurrent mixed methods research design, this study explored place allegiance via an online survey (n = 437) and thirteen in-depth qualitative interviews with outdoor recreationists. Results indicate that place allegiance can be measured through a multi-dimensional model that incorporates behaviours, importance, resistance, knowledge and symbolic value. In addition, place allegiance was found to be related to an individual's influence on life course and his/her willingness to exhibit preservation and protection tendencies. Place allegiance plays an important role in acknowledging the importance of authentic place relationships in an effort to confront placelessness. Wilderness recreation is an important avenue for outdoor recreationists to build strong place relationships.

Relevance:

30.00%

Publisher:

Abstract:

The aim of this thesis is to study the behavioural and neural correlates of cross-linguistic transfer (CLT) in second language (L2) learning. Given what is known about the influence of linguistic distance on CLT (Paradis, 1987, 2004; Odlin, 1989, 2004, 2005; Gollan, 2005; Ringbom, 2007), we used functional magnetic resonance imaging to examine the facilitating effect of phonological similarity between linguistically close languages (Spanish-French) and linguistically distant languages (Persian-French). Study I reports the results obtained for linguistically close languages (Spanish-French), while Study II concerns linguistically distant languages (Persian-French). Changes in functional connectivity in the language network (Price, 2010) and in the additional control network involved in second-language processing (Abutalebi & Green, 2007) during the learning of a linguistically distant language (Persian-French) are then reported in Study III. The results of the fMRI analyses based on the general linear model in bilinguals of linguistically close languages (French-Spanish) show that the processing of words that are phonologically similar in the two languages (cognates and clangs) relies on a neural network shared by the mother tongue (L1) and the L2, whereas the processing of phonologically distant words (non-clang-non-cognates) activates structures involved in working memory and attention processing. However, in bilinguals whose L1 and L2 are linguistically distant (French-Persian), even words that are phonologically similar across the languages (cognates and clangs) activate regions known to be involved in attention and cognitive control.
Moreover, phonologically distant words (non-clang-non-cognates) activate regions usually associated with working memory and executive functions. Thus, the cross-linguistic distance between L1 and L2 modulates cognitive load according to the degree of phonological similarity between items in L1 and L2. Structures supporting the processes involved in executive processing are recruited to compensate for cognitive demands. As linguistic proficiency in L2 increases and linguistic tasks thus require less effort, the demand for cognitive resources decreases. As previously reported (Majerus et al., 2008; Prat et al., 2007; Veroude et al., 2010; Dodel et al., 2005; Coynel et al., 2009), the results of the functional connectivity analyses show that after training the integration value (functional connectivity) decreases, since there is less circulation of information flow. The results of this research contribute to a better understanding of the neurocognitive and brain plasticity aspects of CLT, as well as of the impact of linguistic distance in language learning. These results have implications for L2 learning strategies, L2 teaching methods, and the development of therapeutic approaches for bilingual patients who suffer from language disorders.

Relevance:

30.00%

Publisher:

Abstract:

This thesis aims to improve automation in model-driven engineering (MDE). MDE is a paradigm that promises to reduce software complexity through the intensive use of models and of automatic model transformations (MT). Put simply, in the MDE vision, specialists use several models to represent a piece of software, and they produce the source code by automatically transforming these models. Consequently, automation is a key factor and a founding principle of MDE. In addition to MT, other activities need automation, e.g. the definition of modelling languages and software migration. In this context, the main contribution of this thesis is to propose a general approach for improving automation in MDE. Our approach is based on meta-heuristic search guided by examples. We apply this approach to two important MDE problems: (1) model transformation and (2) the precise definition of modelling languages. For the first problem, we distinguish between transformation in the context of migration and general transformations between models. In the case of migration, we propose a software clustering method based on a meta-heuristic guided by clustering examples. Similarly, for general transformations, we learn model transformations using a genetic programming algorithm that draws on examples of past transformations. For the precise definition of modelling languages, we propose a method based on meta-heuristic search that derives well-formedness rules for metamodels, with the objective of discriminating well between valid and invalid models.
The empirical studies we conducted show that the proposed approaches achieve good results, both quantitative and qualitative. These allow us to conclude that improving MDE automation using meta-heuristic search methods and examples can contribute to a wider adoption of MDE in industry in the future.

Relevance:

30.00%

Publisher:

Abstract:

In Statistical Machine Translation (SMT) from English to Malayalam, an unseen English sentence is translated into its equivalent Malayalam sentence using statistical models. A parallel English-Malayalam corpus is used in the training phase. Word-to-word alignments have to be set among the sentence pairs of the source and target language before subjecting them to training. This paper deals with certain techniques which can be adopted for improving the alignment model of SMT. Methods to incorporate part-of-speech information into the bilingual corpus have resulted in eliminating many of the insignificant alignments. Identifying the named entities and cognates present in the sentence pairs has also proved to be advantageous while setting up the alignments. The presence of Malayalam words with predictable translations has also contributed to reducing the insignificant alignments. Moreover, the reduction of unwanted alignments has brought better training results. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations, and the results are verified with the F-measure, BLEU and WER evaluation metrics.
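
Of the evaluation metrics named in this abstract, word error rate (WER) is the simplest to illustrate. The sketch below uses the common definition, word-level edit distance divided by the number of reference words; the abstract does not state which WER variant or tokenisation the authors used, so the function name `word_error_rate`, the whitespace tokenisation, and the English placeholder sentences are all assumptions for illustration and are not taken from the paper's corpus.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Placeholder sentences: one word of a six-word reference is missing.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(round(wer, 3))
```

A lower WER indicates a translation closer to the reference; unlike BLEU, it is an error measure, so 0 is the best possible score and values above 1 are possible when the hypothesis is much longer than the reference.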