Romanization of Bengali

Article Id: WHEBN0008029370
Reproduction Date:

Title: Romanization of Bengali  
Author: World Heritage Encyclopedia
Language: English
Subject: Bengali language
Publisher: World Heritage Encyclopedia

The Romanization of Bengali is the representation of the Bengali language in the Latin script. Various romanization systems have been created for Bengali in recent years, but they do not faithfully represent Bengali pronunciation. While different standards for romanization have been proposed for Bengali, these have not been adopted with the degree of uniformity seen in languages such as Japanese or Sanskrit.[note 1] The Bengali script has often been grouped with other Indic scripts for romanization, in schemes that do not represent the phonetic values of Bengali. These include the "International Alphabet of Sanskrit Transliteration" or IAST (based on diacritics),[1] "Indian languages Transliteration" or ITRANS (which uses upper-case letters suited to ASCII keyboards),[2] and the National Library at Calcutta romanization.[3]

In the context of Bengali Romanization, it is important to distinguish transliteration from transcription. Transliteration is orthographically accurate (i.e. the original spelling can be recovered), whereas transcription is phonetically accurate (the pronunciation can be reproduced). Since English does not have the sounds of Bengali, and since Bengali pronunciation does not completely reflect the spelling, a romanization cannot be faithful to both.

Although it might be desirable to use a transliteration scheme where the original Bengali orthography is recoverable from the Latin text, Bengali words are currently Romanized on WorldHeritage using a phonemic transcription, where the true phonetic pronunciation of Bengali is represented with no reference to how it is written. The WorldHeritage Romanization scheme is given in the table below, with the transcriptions as used above.


The Portuguese missionaries stationed in Bengal in the 16th century were the first people to employ the Latin alphabet in writing Bengali books, the most famous of which are the Crepar Xaxtrer Orth, Bhed and the Vocabolario em idioma Bengalla, e Portuguez dividido em duas partes, both written by Manuel da Assumpção. But the Portuguese-based romanization did not take root. In the late 18th century Augustin Aussant used a romanization scheme based on the French alphabet. At the same time, Nathaniel Brassey Halhed used a romanization scheme based on English for his Bengali grammar book. After Halhed, the renowned English philologist and oriental scholar Sir William Jones devised a romanization scheme for Bengali and for Indian languages in general, and published it in the Asiatick Researches journal in 1801.[4] This scheme came to be known as the "Jonesian System" of romanization, and served as a model for the next century and a half.

Transliteration vs transcription

The Romanization of a language written in a non-Roman script can be based on transliteration (orthographically accurate, i.e. the original spelling can be recovered) or transcription (phonetically accurate, i.e. the pronunciation can be reproduced). This distinction is important in Bengali because its orthography was adopted from Sanskrit and ignores several millennia of sound change. To some degree, all writing systems differ from the way the language is pronounced, but the difference may be more extreme for languages like Bengali. For example, the three letters শ, ষ, and স had distinct pronunciations in Sanskrit, but over several centuries the standard pronunciation of Bengali (usually modeled on the Nadia dialect) has lost these phonetic distinctions (all three are usually pronounced as IPA [ʃɔ]), while the spelling distinction persists in the orthography.

In written texts, it is easy to distinguish between homophones such as শাপ shap "curse" and সাপ shap "snake". Such a distinction could be particularly relevant in searching for the term in an encyclopedia, for example. However, the fact that the words sound identical means that they would be transcribed identically; thus, some important meaning distinctions cannot be rendered in a transcription model. Another issue with transcription systems is that cross-dialectal and cross-register differences are widespread, and thus the same word or lexeme may have many different transcriptions. Even simple words like মন "mind" may be pronounced "mon", "môn", or (in poetry) "mônô" (e.g. the Indian national anthem, Jana Gana Mana).

Often, different phonemes (meaningfully different sounds) are represented by the same symbol or grapheme. Thus, the vowel এ can represent either [e] (এল elo [elɔ] "came") or [æ] (এক êk [æk] "one"). Occasionally, words written in the same way (homographs) may have different pronunciations for differing meanings: মত can mean "opinion" (pronounced môt) or "similar to" (môtô). Thus, some important phonemic distinctions cannot be rendered in a transliteration model. In addition, when representing a Bengali word so that speakers of other languages can pronounce it easily, it may be better to use a transcription, which does not include the silent letters and other idiosyncrasies (e.g. স্বাস্থ্য sbasthyô, spelled , or অজ্ঞান ôggên, spelled ) that make Bengali romanization so complicated. Such letter-for-letter spellings misrepresent Bengali pronunciation. They result from the Bengali script often being grouped with other Indic scripts for romanization, even though those scripts do not carry the inherent vowel ô, which further complicates Bengali romanization.
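The two kinds of information loss described above can be illustrated with a small sketch. It uses four rows copied from the example-word table in this article (NLK transliteration and IPA transcription); the names WORDS and group_by are hypothetical and chosen only for this illustration. Grouping the words by transcription merges the homophones, while grouping them by transliteration merges the homographs.

```python
from collections import defaultdict

# (Bengali spelling, meaning, NLK transliteration, IPA transcription),
# copied from the example-word table in this article.
WORDS = [
    ("সাপ", "snake", "sāpa", "ʃap"),
    ("শাপ", "curse", "śāpa", "ʃap"),
    ("মত", "opinion", "mata", "mɔt̪"),
    ("মত", "like", "mata", "mɔt̪ɔ"),
]

def group_by(column):
    """Map each romanized form in the given column to the meanings it covers."""
    groups = defaultdict(list)
    for row in WORDS:
        groups[row[column]].append(row[1])
    return dict(groups)

by_transliteration = group_by(2)  # spelling-based romanization
by_transcription = group_by(3)    # sound-based romanization

# Transcription cannot tell the homophones apart:
#   by_transcription["ʃap"] is ["snake", "curse"]
# Transliteration cannot tell the homographs apart:
#   by_transliteration["mata"] is ["opinion", "like"]
```

Neither grouping keeps all four meanings separate, which is the trade-off between the two models in miniature.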

Comparison of romanizations

Comparisons of standard romanization schemes for Bengali are given in the table below. Two standards are commonly used for transliteration of Indic languages including Bengali. Many standards (e.g. NLK / ISO) use diacritic marks and permit case markings for proper nouns. Newer forms (e.g. Harvard-Kyoto) are more suited for ASCII-derivative keyboards; they use upper- and lower-case letters contrastively and forgo the normal conventions of English capitalization.

  • "NLK" stands for the diacritic-based letter-to-letter transliteration schemes, best represented by the National Library at Kolkata romanization, ISO 15919, or IAST. ISO 15919 is the ISO standard, and it uses diacritic marks (e.g. ā) to reflect the additional characters and sounds of Bengali letters.
  • ITRANS is an ASCII representation for Sanskrit; it is one-to-many, i.e. there may be more than one way of transliterating characters, which can make internet searching more complicated. ITRANS representations forgo capitalization norms of English so as to be able to represent the characters using a normal ASCII keyboard.
  • "HK" stands for two other case-sensitive letter-to-letter transliteration schemes: Harvard-Kyoto and the XIAST scheme. These are similar to the ITRANS scheme, but use only one form for each character.
  • "XHK" stands for the case-sensitive letter-to-letter Extended Harvard-Kyoto transliteration. This adds some specific characters for handling Bengali text to IAST.
  • "Wiki" stands for a phonemic transcription-based romanization. It is a sound-preserving transcription based on what is perceived to be the standard pronunciation of the Bengali words, with no reference to how it is written in Bengali script. It uses diacritics often used by linguists specializing in Bengali (other than IPA), and is the transcription system used to represent Bengali sounds in WorldHeritage articles.
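The search problem caused by a one-to-many scheme such as ITRANS can be sketched as follows. The variant pairs are the ones listed in the ITRANS column of the reference table in this article (for example "A~aa"); the function name canonical_itrans and the choice of which variant counts as canonical are assumptions made for illustration and are not part of ITRANS itself. The sketch uses plain string replacement and does not attempt to handle every ITRANS sequence.

```python
# Variant spellings that ITRANS allows for the same letter, as listed in
# the reference table of this article. Which member of each pair is
# treated as canonical is an arbitrary choice made for this sketch.
ITRANS_VARIANTS = [
    ("aa", "A"),
    ("ii", "I"),
    ("uu", "U"),
    ("R^i", "RRi"),
]

def canonical_itrans(text: str) -> str:
    """Collapse the known variant spellings to one form before comparing."""
    for variant, canonical in ITRANS_VARIANTS:
        text = text.replace(variant, canonical)
    return text

# Two ITRANS spellings of the same word compare equal after normalization.
same_word = canonical_itrans("saapa") == canonical_itrans("sApa")
```

Normalizing both the indexed text and the query in this way is one possible approach to making the two spellings find each other in a search.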


The following table includes examples of Bengali words Romanized using the various systems mentioned above.

Example words
Orthography | Meaning | NLK | XHK | ITRANS | HK | Wiki | IPA
মন | mind | mana | mana | mana | mana | môn | [mɔn]
সাপ | snake | sāpa | sApa | saapa | sApa | sap | [ʃap]
শাপ | curse | śāpa | zApa | shaapa | zApa | shap | [ʃap]
মত | opinion | mata | mata | mata | mata | môt | [mɔt̪]
মত | like | mata | mata | mata | mata | môtô | [mɔt̪ɔ]
তেল | oil | tēla | tela | tela | tela | tel | [t̪el]
গেল | went | gēla | gela | gela | gela | gêlô | [ɡɛlɔ]
জ্বর | fever | jvara | jvara | jvara | jvara | jôr | [dʒɔr]
স্বাস্থ্য | health | svāsthya | svAsthya | svaasthya | svAsthya | sasthyô | [ʃast̪ʰːɔ]
বাংলাদেশ | Bangladesh | bāṃlādēśa | bAMlAdeza | baa.mlaadesha | bAMlAdeza | Bangladesh | [baŋlad̪eʃ]
ব্যঞ্জনধ্বনি | consonant | byañjanadhvani | byaJjanadhvani | bya~njanadhvani | byaJjanadhvani | bênjôndhôni | [bɛndʒɔnd̪ʱɔni]
আত্মহত্যা | suicide | ātmahatyā | AtmahatyA | aatmahatyaa | AtmahatyA | atmôhôtya | [at̪ːɔhɔt̪ːa]
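The relationship between the diacritic-based and the case-sensitive columns in the example words can be sketched as a character-for-character substitution. The mapping below contains only the correspondences that can be read off the example words above (ā to A, ē to e, ś to z, ṃ to M, ñ to J); it is not a complete Harvard-Kyoto table, and the names NLK_TO_HK and nlk_to_hk are hypothetical. The input is normalized to Unicode NFC first, on the assumption that the diacritic letters may arrive in decomposed form.

```python
import unicodedata

# Correspondences observed in the example words of this article only;
# a complete scheme would need many more entries.
_PAIRS = {"ā": "A", "ē": "e", "ś": "z", "ṃ": "M", "ñ": "J"}

# Normalize the keys so each one is a single precomposed code point,
# which is what str.maketrans requires for dictionary keys.
NLK_TO_HK = str.maketrans(
    {unicodedata.normalize("NFC", key): value for key, value in _PAIRS.items()}
)

def nlk_to_hk(text: str) -> str:
    """Rewrite an NLK-style form using the case-sensitive ASCII letters."""
    return unicodedata.normalize("NFC", text).translate(NLK_TO_HK)

# nlk_to_hk("bāṃlādēśa") gives "bAMlAdeza", matching the HK column.
```

Because each diacritic letter maps to exactly one ASCII letter, the substitution could in principle be inverted for most of these pairs, although ē and plain e share an output here, which is one reason this sketch should not be taken as a reversible scheme.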

Romanization reference

The IPA (International Phonetic Alphabet) transcription is provided in the rightmost column, representing the most common pronunciation of the glyph in Standard Colloquial Bengali, alongside the various romanizations described above.

a a a a a ô/o [ɔ]/[o]
ā ā ā A~aa A a [a]/[a:]
i i i i i i [i]
ī ī ī I~ii I i [i:]
u u u u u u [u]
ū ū ū U~uu U u [u:]
r RRi~R^i R ri [ri]
e ē e e e ê/e [æ]/[e]
ai ai ai ai ai ôi/oi [oi]
o ō o o o o [o]
au au au au au ou [ou]
k k k k k [kɔ]
kh kh kh kh kh khô [kʰɔ]
g g g g g [ɡɔ]
gh gh gh gh gh ghô [ɡʱɔ]
ng ~N G ngô [ŋɔ]/[uõ]
c c c ch c chô [tʃɔ]
ch ch ch Ch ch chhô [tʃʰɔ]
j j j j j [dʒɔ]
jh jh jh jh jh jhô [dʒʱɔ]
ñ ñ ñ ~n J niô [nɔ]
T T ţô [ʈɔ]
ṭh ṭh ṭh Th Th ţhô [ʈʰɔ]
D D đô [ɖɔ]
ড় .D P ŗô [ɽɔ]
ḍh ḍh ḍh Dh Dh đhô [ɖʱɔ]
ঢ় ṛh ḍh ḏh .Dh Ph ŗhô [ɽɔ]
N N [nɔ]
t t t t t [t̪ɔ]
th th th th th thô [t̪ʰɔ]
d d d d d [d̪ɔ]
dh dh dh dh dh dhô [d̪ʱɔ]
n n n n n [nɔ]
p p p p p [pɔ]
ph ph ph ph ph fô/phô [ɸɔ~pʰɔ]
b b b b b [bɔ]
bh bh bh bh bh bhô [bʱɔ]
m m m m m [mɔ]
y/j y y y jô/zô [dʒɔ]
য় y Y Y yô/e [e̯ɔ]/–
r r r r r [rɔ]
l l l l l [lɔ]
ś/sh ś ś sh z shô [ʃɔ]
ṣ/sh Sh S shô [ʃɔ]
s s s s s sô/shô [sɔ]
h h h h h [ɦɔ]
H H varies varies
ng .m M ng [ŋ]
◌̃ ɱ .N ~ ~ [~] (nasalization)
্য y y y y y varies varies
্ব w/v v v v v varies varies
ক্ষ kṣ kṣ kṣ x kS kkhô [kʰːɔ]
জ্ঞ GY jJ ggô [ɡːɔ]
শ্র śr śr śr shr zr shrô [ʃɾɔ]


Notes

  1. ^ In Japanese there exists some debate as to whether to accent certain distinctions, such as Tōhoku vs Tohoku. Sanskrit is well standardized, because the speaking community is relatively small and sound change is not a large concern.


References

  1. ^ "Learning International Alphabet of Sanskrit Transliteration". Sanskrit 3 - Learning transliteration. Gabriel Pradiipaka & Andrés Muni. Archived from the original on 12 February 2007. Retrieved 2006-11-20.
  2. ^ "ITRANS — Indian Language Transliteration Package". Avinash Chopde. Retrieved 2006-11-20. 
  3. ^ "Annex-F: Roman Script Transliteration" (PDF). Indian Standard: Indian Script Code for Information Interchange — ISCII.  
  4. ^ Jones 1801
  5. ^ a b c বাংলা একাডেমী ব্যবহারিক বাংলা অভিধান Bangla Academy Byaboharik Bangla Abhidhan (Bangla Academy Functional Bengali Dictionary) (16th reprint ed.). Dhaka 1000, Bangladesh: Bangla Academy. Nov 2012. p. আট্রিশ (তালিকা -৪).
This article was sourced from Creative Commons Attribution-ShareAlike License; additional terms may apply.