Languages (ISO 639-3)

Version 1.0.0 • Published Jun 03, 2025 00:41 UTC • by SIL International, International Organization for Standardization (ISO), RefPack

A comprehensive list of world languages with their standard ISO 639-3 codes. Data is sourced from SIL International, the official ISO 639-3 Registration Authority.

📘 Languages Dataset (languages)

🧾 Overview

This dataset provides a comprehensive list of world languages, primarily identified by their ISO 639-3 standard codes. It includes essential identifiers such as alpha3-b (the primary ISO 639-3 code), optional alpha3-t (terminological) and alpha2 (ISO 639-1) codes, and language names in English and French. This dataset is intended to serve as a reliable reference for applications involving multilingual capabilities, localization, internationalization (i18n), linguistic data processing, or any system requiring standardized language identifiers. The core language data is sourced from SIL International, the official ISO 639-3 Registration Authority.

🗂️ Dataset Structure

The dataset is an array of language objects, each with the following fields:

| Field Name | Data Type | Description | Required |
|---|---|---|---|
| alpha3-b | Text | The ISO 639-3 code for the language (e.g., "eng"). ID Field. Primary key. | Yes |
| English | Text | The name of the language in English (e.g., "English"). Name Field. | Yes |
| alpha3-t | Text | The ISO 639-3 code for the language in its traditional or terminological form, if applicable (e.g., "ger" for German, while alpha3-b is "deu"). Optional. | No |
| alpha2 | Text | The ISO 639-1 two-letter code for the language, if available (e.g., "en"). Optional. | No |
| French | Text | The name of the language in French, if available (e.g., "Anglais"). Optional. | No |
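As an illustration of the record shape, here is a minimal Python sketch of a typed wrapper around one language object. The class name `Language`, its attribute names, and the `from_record` helper are assumptions made for this example and are not part of the dataset; only the dictionary keys (`alpha3-b`, `English`, `alpha3-t`, `alpha2`, `French`) come from the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Language:
    """One language record (hypothetical wrapper, not shipped with the dataset)."""

    alpha3_b: str                    # "alpha3-b": required, primary key
    english: str                     # "English": required
    alpha3_t: Optional[str] = None   # "alpha3-t": optional
    alpha2: Optional[str] = None     # "alpha2": optional
    french: Optional[str] = None     # "French": optional

    @classmethod
    def from_record(cls, record: dict) -> "Language":
        # Required keys raise KeyError when absent; optional keys default to None.
        return cls(
            alpha3_b=record["alpha3-b"],
            english=record["English"],
            alpha3_t=record.get("alpha3-t"),
            alpha2=record.get("alpha2"),
            french=record.get("French"),
        )


# Record copied from the Sample Entries section of this page.
german = Language.from_record(
    {"alpha3-b": "deu", "English": "German", "alpha3-t": "ger",
     "alpha2": "de", "French": "Allemand"}
)
```

Accessing the required keys with `record[...]` makes a malformed record fail early, while `dict.get` keeps the optional fields optional.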

Standardization Info

This dataset adheres to the following international standards for language coding and naming:

  • ISO 639-3:
    • alpha3-b: Three-letter codes providing comprehensive coverage of world languages. Maintained by SIL International, the ISO 639-3 Registration Authority. This is the primary identifier.
      • Example: English = eng, French = fra
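    • alpha3-t: Optional three-letter alternate code for languages that have two three-letter codes. Note that the field names do not follow ISO 639-2's own B/T labels: in ISO 639-2, German's bibliographic (B) code is ger and its terminological (T) code is deu, whereas this dataset stores deu in alpha3-b and ger in alpha3-t.
      • Example: for German, alpha3-b is deu and alpha3-t is ger.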
  • ISO 639-1:
    • alpha2: Two-letter codes for many major languages. Provided for broader compatibility.
      • Example: English = en, French = fr
  • Language Names:
    • English: Common English names for languages.
    • French: Common French names for languages, where available.

🧩 Usage Scenarios

This dataset can be effectively used for:

  • Populating language selection dropdowns or user preference settings in applications.
  • Powering multilingual content management systems.
  • Building robust internationalization (i18n) and localization (l10n) frameworks.
  • Standardizing language references in databases, metadata, and APIs.
  • Conducting language data analysis or supporting natural language processing (NLP) tasks.
  • Educational platforms and linguistic research requiring accurate language identifiers.
  • Mapping between different ISO language code standards.
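The first scenario, a language selection dropdown, might be sketched in Python as follows. The function name `dropdown_options`, the `locale` parameter, and the `SAMPLE` list are assumptions for this example; the three records are copied from the Sample Entries section of this page, and a real application would load the full dataset instead.

```python
# Three records copied from the Sample Entries section of this page.
SAMPLE = [
    {"English": "English", "alpha3-b": "eng", "alpha2": "en", "French": "Anglais", "alpha3-t": "eng"},
    {"English": "French", "alpha3-b": "fra", "alpha2": "fr", "French": "Français", "alpha3-t": "fre"},
    {"English": "German", "alpha3-b": "deu", "alpha2": "de", "French": "Allemand", "alpha3-t": "ger"},
]


def dropdown_options(records, locale="en"):
    """Return (value, label) pairs sorted by label.

    The value is the alpha3-b primary key. For locale "fr" the French name is
    used when present; otherwise the label falls back to the English name.
    """
    options = []
    for record in records:
        label = record.get("French") if locale == "fr" else None
        options.append((record["alpha3-b"], label or record["English"]))
    # Plain code-point ordering; a production UI would likely want
    # locale-aware collation instead.
    return sorted(options, key=lambda pair: pair[1])


print(dropdown_options(SAMPLE, locale="fr"))
# [('deu', 'Allemand'), ('eng', 'Anglais'), ('fra', 'Français')]
```

Using alpha3-b as the option value keeps the stored preference tied to the dataset's unique identifier rather than to a display name that varies by locale.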

🔍 Sample Entries

| English | alpha3-b | alpha2 | French | alpha3-t |
|---|---|---|---|---|
| English | eng | en | Anglais | eng |
| French | fra | fr | Français | fre |
| German | deu | de | Allemand | ger |
| Japanese | jpn | ja | Japonais | jpn |
| Spanish | spa | es | Espagnol | spa |

(Note: alpha3-t is often identical to alpha3-b, as in the English, Japanese, and Spanish rows above, because no distinct terminological code exists for those languages.)
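Because one language can appear under up to three codes, a common task is resolving any of them to the alpha3-b primary key. The sketch below shows one way to do this in Python. The helper names `build_code_index` and `to_primary` are hypothetical, and the records are copied from the sample entries above.

```python
# Records copied from the sample entries above.
SAMPLE_ENTRIES = [
    {"English": "English", "alpha3-b": "eng", "alpha2": "en", "French": "Anglais", "alpha3-t": "eng"},
    {"English": "French", "alpha3-b": "fra", "alpha2": "fr", "French": "Français", "alpha3-t": "fre"},
    {"English": "German", "alpha3-b": "deu", "alpha2": "de", "French": "Allemand", "alpha3-t": "ger"},
    {"English": "Japanese", "alpha3-b": "jpn", "alpha2": "ja", "French": "Japonais", "alpha3-t": "jpn"},
    {"English": "Spanish", "alpha3-b": "spa", "alpha2": "es", "French": "Espagnol", "alpha3-t": "spa"},
]


def build_code_index(records):
    """Map every known code (alpha3-b, alpha3-t, alpha2) to its alpha3-b key."""
    index = {}
    for record in records:
        for field in ("alpha3-b", "alpha3-t", "alpha2"):
            code = record.get(field)
            if code:
                # setdefault keeps the first mapping if two records ever share a code.
                index.setdefault(code.lower(), record["alpha3-b"])
    return index


def to_primary(code, index):
    """Return the alpha3-b code for any supported code, or None if unknown."""
    return index.get(code.strip().lower())


index = build_code_index(SAMPLE_ENTRIES)
print(to_primary("ger", index))  # deu
print(to_primary("FR", index))   # fra
```

Lowercasing on both insert and lookup makes the resolver tolerant of input such as "FR", which is a design choice of this sketch rather than a rule of the dataset.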

🔒 Data Integrity

  • The alpha3-b field serves as the unique identifier (ID Field) for each language record and must be unique across the dataset.
  • The alpha3-b (ISO 639-3 code) and English (English name) fields are mandatory for all records.
  • Codes (alpha3-b, alpha3-t, alpha2) conform to the patterns specified by their respective ISO standards (e.g., three lowercase letters for ISO 639-3, two for ISO 639-1).
  • While the schema allows for additional custom properties, the core defined fields ensure consistency with international language standards.
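The rules above can be checked mechanically. The following Python sketch is one possible validator, assuming the records are dictionaries with string values. The function name `validate_records` and the error message wording are hypothetical and do not describe how the dataset packager actually validates its data.

```python
import re

ALPHA3 = re.compile(r"[a-z]{3}")  # three lowercase letters
ALPHA2 = re.compile(r"[a-z]{2}")  # two lowercase letters


def validate_records(records):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    errors = []
    seen = set()
    for i, record in enumerate(records):
        key = record.get("alpha3-b")
        if not key:
            errors.append(f"record {i}: missing alpha3-b")
        elif not ALPHA3.fullmatch(key):
            errors.append(f"record {i}: malformed alpha3-b {key!r}")
        elif key in seen:
            errors.append(f"record {i}: duplicate alpha3-b {key!r}")
        else:
            seen.add(key)

        if not record.get("English"):
            errors.append(f"record {i}: missing English name")

        # Optional codes are only checked when present.
        alpha3_t = record.get("alpha3-t")
        if alpha3_t and not ALPHA3.fullmatch(alpha3_t):
            errors.append(f"record {i}: malformed alpha3-t {alpha3_t!r}")
        alpha2 = record.get("alpha2")
        if alpha2 and not ALPHA2.fullmatch(alpha2):
            errors.append(f"record {i}: malformed alpha2 {alpha2!r}")
    return errors
```

Additional custom properties are ignored by this sketch, which matches the statement that the schema allows them.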

📊 Metadata & Versioning

  • Dataset ID: languages
  • Packager Version (RefPack): 1.0.0
  • Packaged Date (UTC): 2025-06-03T00:41:47.501Z
  • Title: Languages (ISO 639-3)
  • Description: A comprehensive list of world languages with their standard ISO 639-3 codes. Data is sourced from SIL International, the official ISO 639-3 Registration Authority.
  • Authors & Contributors:
    • Primary Data Source & ISO 639-3 Registration Authority: SIL International
    • Standard Publisher: International Organization for Standardization (ISO)
    • Dataset Packager: RefPack
  • License:
    • Type: Custom
    • Details: The language data compiled in this dataset is sourced from the ISO 639-3 code tables provided by SIL International. Users of this dataset should refer to the official SIL International website for the most current and specific terms of use regarding the ISO 639-3 data.
    • Terms URL: https://iso639-3.sil.org/code_tables/download_tables
  • Data Source URL: https://iso639-3.sil.org/code_tables/download_tables
  • Tags: languages, iso639-3, iso 639, language codes, internationalization, i18n, localization, l10n, SIL, linguistics, general reference