Language code

Article Id: WHEBN0000221917

Title: Language code  
Author: World Heritage Encyclopedia
Language: English
Subject: Character encodings in HTML, Bi icon, Frp icon, Hif icon, Safwa people
Collection: Identifiers, Internationalization and Localization, Languages
Publisher: World Heritage Encyclopedia

A language code is a code that assigns letters, numbers, or both as identifiers or classifiers for languages. These codes may be used to organize library collections or presentations of data, to choose the correct localizations and translations in computing, and as shorthand for longer language names.


  • Difficulties of classification
  • Common schemes
  • See also
  • References
  • External links

Difficulties of classification

Language code schemes attempt to classify the complex world of human languages, dialects, and variants. Most schemes compromise between being general and being complete enough to support specific dialects.

For example, Spanish is spoken across much of Central America and South America, but the Spanish spoken in Mexico differs slightly from the Spanish spoken in Peru, and different regions of Mexico have slightly different dialects and accents. A language code scheme might group all of these as "Spanish" for choosing a keyboard layout, group most of them as "Spanish" for general usage, or separate each dialect to allow region-specific idioms.
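The trade-off between general and dialect-specific codes can be sketched as a fallback lookup. This is a minimal illustration, not part of any scheme's specification: it assumes hyphen-separated identifiers in the style of the IETF language tags listed under Common schemes, the region subtags MX and PE (ISO 3166‑1 codes for Mexico and Peru) are chosen for the example, and the function names and resource tables are hypothetical.

```python
from typing import Dict, List, Optional


def fallback_chain(tag: str) -> List[str]:
    """Return the tag followed by progressively more general forms."""
    parts = tag.split("-")
    return ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]


def pick_resource(tag: str, available: Dict[str, str]) -> Optional[str]:
    """Return the most specific resource available for the tag, if any."""
    for candidate in fallback_chain(tag):
        if candidate in available:
            return available[candidate]
    return None


# Hypothetical resources: one keyboard layout shared by all Spanish,
# but region-specific idiom lists where they exist.
keyboards = {"es": "Spanish keyboard layout"}
idioms = {"es": "general Spanish idioms", "es-MX": "Mexican Spanish idioms"}

print(fallback_chain("es-MX"))            # ['es-MX', 'es']
print(pick_resource("es-MX", keyboards))  # Spanish keyboard layout
print(pick_resource("es-MX", idioms))     # Mexican Spanish idioms
print(pick_resource("es-PE", idioms))     # general Spanish idioms
```

A request for Mexican Spanish gets the shared keyboard layout but its own idiom list, while a request for Peruvian Spanish falls back to the general entry because no region-specific one is defined in this example.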

Common schemes

Some common language code schemes are listed below. Each entry gives the scheme, notes on it, and example codes for English followed by example codes for Spanish.
Glottolog codes
Created for minority languages as a scientific alternative to the industrial ISO 639‑3 standard. The codes intentionally do not resemble abbreviations.

Codes for English:
  • stan1293 – standard English
  • macr1271 – macro-English (Modern English, incl. creoles)
  • midd1317 – Middle English
  • merc1242 – Mercian (Middle – Modern English)
  • olde1238 – Old English
  • angl1265 – Anglian (Old – Modern English, incl. Scots)

Codes for Spanish:
  • stan1288 – standard Spanish
  • olds1249 – Old Spanish
  • cast1243 – Castilic (Old – Modern Spanish, incl. Extremaduran & creoles)
IETF language tag
An IETF best practice, currently specified by RFC 5646 and RFC 4647, for language tags that are easy for computers to parse. The tag system is extensible to region, dialect, and private designations.
  • en – English, as shortest ISO 639 code.
  • en-US – English as used in the United States (US is the ISO 3166‑1 country code for the United States)

(source: IETF memo[1])

  • es – Spanish, as shortest ISO 639 code.
  • es-419 – Spanish appropriate for the Latin America and Caribbean region, using the UN M.49 region code
ISO 639
The original ISO standard, in use from 1967 to 2002. Now obsolete, it was replaced by ISO 639‑1, ISO 639‑2, and ISO 639‑3. The name is sometimes used as shorthand for the union of all ISO 639 standard codes.
  • eng – three-letter code
  • enm – Middle English, c. 1100–1500
  • ang – Old English, c. 450–1100
  • cpe – other English-based creoles and pidgins
  • EN – English or American two-letter capital code

(source: Library of Congress[2])

  • esl – three-letter code
  • spa – alternative three-letter code
  • ES – Spanish two-letter capital code
ISO 639‑1
Two-letter code system made official in 2002, containing 136 codes. Many systems use two-letter ISO 639‑1 codes, supplemented by three-letter ISO 639‑2 codes when no two-letter code is applicable.
  • en – English

(from List of ISO 639‑1 codes)

  • es – Spanish
ISO 639‑2
Three-letter system of 464 codes.
  • eng – three-letter code
  • enm – Middle English, c. 1100–1500
  • ang – Old English, c. 450–1100
  • cpe – other English-based creoles and pidgins

(from List of ISO 639‑2 codes)

  • spa – Spanish
ISO 639‑3
An extension of ISO 639‑2 intended to cover all known languages, living or dead, spoken or written, in 7,589 entries.
  • eng – three-letter code
  • enm – Middle English, c. 1100–1500
  • aig – Antigua and Barbuda Creole English
  • ang – Old English, c. 450–1100
  • svc – Vincentian Creole English
  • others

(from List of ISO 639‑3 codes)

  • spa – Spanish
  • spq – Spanish, Loreto-Ucayali
  • ssp – Spanish sign language
  • others
LS‑2010
Linguasphere code system, with codes of two digits plus one to six letters, published in 2000 and updated in 2010, containing over 32,000 codes.

(within hierarchy of Linguasphere-2010 codes, as follows:)

  • 5= Indo-European phylosector
  • 52= Germanic phylozone
  • 52-A Germanic set
  • 52-AB English + Anglo-Creole chain
  • 52-ABA English
  • 52-ABA-c Global English outer unit: 52-ABA-ca to (186 varieties)

compare: 52-ABA-a Scots + Northumbrian outer unit & 52-ABA-b "Anglo-English" outer unit (= South Great Britain traditional varieties + Old Anglo-Irish)

(within hierarchy of Linguasphere-2010 codes, as follows:)

  • 5= Indo-European phylosector
  • 51= Romanic phylozone
  • 51-A Romance set
  • 51-AA Romance chain
  • 51-AAA West Romance net
  • 51-AAA-b Español/Castellano outer unit: 51-AAA-ba to (58 varieties)

compare: 51-AAA-a Português + Galego outer unit & 51-AAA-c Astur + Leonés outer unit, etc.

SIL codes (10th–14th editions)
Codes created for use in the Ethnologue, a publication of SIL International that lists language statistics. The publication now uses ISO 639‑3 codes.
  • ENG – English
  • SPN – Spanish
Verbix Language Codes
Constructed codes starting with old SIL codes and adding more information.[3]
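The practice noted under ISO 639‑1 above, using a two-letter code where one exists and a three-letter ISO 639‑2 code otherwise, can be sketched as follows. This is an illustrative sketch only: the lookup table is limited to codes that appear in the entries above, and the names `TWO_LETTER`, `shortest_code`, and `make_tag` are hypothetical rather than part of any standard or library. Real IETF language tags allow more subtags than the language and region handled here.

```python
from typing import Dict, Optional

# Three-letter ISO 639-2 codes mapped to two-letter ISO 639-1 codes.
# Only codes from the entries above are included; None marks a language
# for which no two-letter code is applicable.
TWO_LETTER: Dict[str, Optional[str]] = {
    "eng": "en",
    "spa": "es",
    "enm": None,  # Middle English
    "ang": None,  # Old English
    "cpe": None,  # other English-based creoles and pidgins
}


def shortest_code(iso639_2: str) -> str:
    """Prefer the two-letter code; otherwise keep the three-letter code."""
    two_letter = TWO_LETTER.get(iso639_2)
    return two_letter if two_letter is not None else iso639_2


def make_tag(iso639_2: str, region: Optional[str] = None) -> str:
    """Build a simple language or language-region tag."""
    language = shortest_code(iso639_2)
    return f"{language}-{region}" if region else language


print(make_tag("eng", "US"))   # en-US
print(make_tag("spa", "419"))  # es-419
print(make_tag("enm"))         # enm
```

The first two results match the IETF examples above, en-US and es-419, which use the shortest ISO 639 code for the language subtag.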

See also


References

  1. Best Current Practice 47 – Tags for Identifying Languages, IETF
  2. ISO 639 Language Codes, Library of Congress
  3. Verbix language codes, Verbix

External links

  • Language Tags in HTML and XML
  • Language Identifiers in the Markup Context

This article was sourced from Creative Commons Attribution-ShareAlike License; additional terms may apply.