Data Models and Encodings

Understanding how data can be structured and encoded is very important in programming in general and network automation in particular.

YANG & OpenConfig

YANG (Yet Another Next Generation) is a data modeling language originally developed for NETCONF, defined in RFC 6020 and later updated in RFC 7950. YANG and NETCONF can be considered successors to SMIng and SNMP respectively.

YANG provides a format-independent way to describe a data model that can be represented in XML or JSON.

Jason Edelman, Scott S. Lowe, Matt Oswalt. Network Programmability and Automation, p. 183

There are hundreds of YANG data models available, both vendor-neutral and vendor-specific. The YANG Catalog website can help you find data models relevant to your tasks.

Because of this abundance of data models and the lack of coordination between standards-developing organizations and vendors, YANG and NETCONF seem to be going down the same path as SNMP (i.e. used only for data retrieval, not for configuration). The OpenConfig working group tries to solve this by providing vendor-neutral data models, but I think Ivan Pepelnjak's point from 2018 that "seamless multi-vendor network device configuration is still a pipe dream" still holds in 2020.


XML

XML (eXtensible Markup Language), although a bit old, is still widely used in various APIs. It uses tags to encode data, which makes it a bit hard for humans to read. It was initially designed for documents but is suitable for representing arbitrary data structures.

You can refer to this tutorial to learn more about XML.

Let's see how this sample CLI output of the Cisco IOS show vlan command can be encoded in XML:

VLAN Name                             Status    Ports
---- -------------------------------- ---------   -------------------------------
1    default                          active    Gi3/4, Gi3/5, Gi4/11


VLAN Type  SAID       MTU   Parent RingNo BridgeNo Stp  BrdgMode Trans1 Trans2
---- ----- ---------- ----- ------ ------ -------- ---- -------- ------ ------
1    enet  100001     1500  -      -      -        -    -        0      0

<?xml version="1.0" encoding="UTF-8" ?>
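As a rough illustration, the VLAN data could be serialized to XML with Python's standard xml.etree.ElementTree module. This is only a sketch: the element names (vlans, vlan, interfaces, interface), the id attribute and the VLAN dictionary are assumptions chosen for this example, not a schema taken from a device or a YANG model, and only a subset of the fields is included.

```python
import xml.etree.ElementTree as ET

# Hypothetical parsed form of the `show vlan` output (subset of fields).
VLAN = {
    "vlan_id": "1",
    "name": "default",
    "state": "active",
    "interfaces": [
        "GigabitEthernet3/4",
        "GigabitEthernet3/5",
        "GigabitEthernet4/11",
    ],
    "type": "enet",
    "said": 100001,
    "mtu": 1500,
}


def vlan_to_xml(vlan: dict) -> ET.Element:
    """Build a <vlans> tree containing a single <vlan> element."""
    root = ET.Element("vlans")
    vlan_el = ET.SubElement(root, "vlan", attrib={"id": vlan["vlan_id"]})
    for key, value in vlan.items():
        if key == "vlan_id":
            continue  # already stored as the id attribute
        if isinstance(value, list):
            # A list becomes a container element with one child per item.
            container = ET.SubElement(vlan_el, key)
            for item in value:
                ET.SubElement(container, "interface").text = item
        else:
            # XML text is untyped, so numbers are written as strings.
            ET.SubElement(vlan_el, key).text = str(value)
    return root


xml_text = ET.tostring(vlan_to_xml(VLAN), encoding="unicode")
parsed = ET.fromstring(xml_text)  # parse it back to check it is well-formed
print(xml_text)
```

Note that after parsing, every value comes back as a string (the MTU is "1500", not 1500), so any typing has to come from a schema or from the consuming code.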


YAML

YAML (YAML Ain’t Markup Language) is a human-friendly data serialization format. Because YAML is easy to read and write, it is widely used in modern automation tools for configuration files and even for defining the logic of automation tasks (see Ansible).

You can refer to this tutorial to learn more about YAML.

Here is the show vlan output from the previous subsection encoded in YAML:

vlans:
  '1':
    interfaces:
    - GigabitEthernet3/4
    - GigabitEthernet3/5
    - GigabitEthernet4/11
    mtu: 1500
    name: default
    said: 100001
    shutdown: false
    state: active
    trans1: 0
    trans2: 0
    type: enet
    vlan_id: '1'
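To show how such a document is consumed, here is a minimal sketch that loads YAML with PyYAML (a third-party package, installed with pip install pyyaml). The YAML text embedded in the code is a shortened variant of the sample data, and the variable names are assumptions made for this example.

```python
import yaml  # third-party package: PyYAML

# Shortened variant of the sample VLAN data (assumed layout).
YAML_TEXT = """
vlans:
  '1':
    interfaces:
    - GigabitEthernet3/4
    - GigabitEthernet3/5
    - GigabitEthernet4/11
    mtu: 1500
    name: default
    shutdown: false
    vlan_id: '1'
"""

# safe_load only builds plain Python types (dict, list, str, int, bool, ...).
data = yaml.safe_load(YAML_TEXT)
vlan = data["vlans"]["1"]

print(type(vlan["mtu"]).__name__)       # unquoted 1500 is loaded as an int
print(type(vlan["vlan_id"]).__name__)   # quoted '1' stays a str
print(type(vlan["shutdown"]).__name__)  # false is loaded as a bool

# Dump the structure back to YAML text.
round_trip = yaml.safe_dump(data)
```

Unlike XML, scalar types are inferred from how a value is written, which is why the VLAN ID has to be quoted to remain a string.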

Bonus: a collection of YAML shortcomings.


JSON

JSON (JavaScript Object Notation) is a modern data encoding format defined in RFC 8259 (which obsoleted RFC 7159) and widely used in web APIs. It's lightweight, human-readable, and maps more naturally onto the data structures of modern programming languages than XML does.

You can refer to this tutorial to learn more about JSON.

Here is the sample data from the previous sections encoded in JSON:

{
  "vlans": {
    "1": {
      "interfaces": [
        "GigabitEthernet3/4",
        "GigabitEthernet3/5",
        "GigabitEthernet4/11"
      ],
      "mtu": 1500,
      "name": "default",
      "said": 100001,
      "shutdown": false,
      "state": "active",
      "trans1": 0,
      "trans2": 0,
      "type": "enet",
      "vlan_id": "1"
    }
  }
}

As you can see, it's almost as easy to read as YAML. However, native JSON doesn't support comments, which makes it less suitable for configuration files.
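The lack of comments is easy to demonstrate with Python's standard json module. In this sketch, the JSON text is a shortened variant of the sample data and the variable names are assumptions made for the example.

```python
import json

# Shortened variant of the sample VLAN data.
JSON_TEXT = """
{
  "vlans": {
    "1": {
      "interfaces": ["GigabitEthernet3/4", "GigabitEthernet3/5"],
      "mtu": 1500,
      "name": "default",
      "shutdown": false,
      "vlan_id": "1"
    }
  }
}
"""

# JSON objects map to dicts, arrays to lists, false to False.
data = json.loads(JSON_TEXT)
vlan = data["vlans"]["1"]

# The same kind of document with a //-style comment is rejected by the parser.
commented = '{"mtu": 1500 // default MTU\n}'
try:
    json.loads(commented)
    comment_accepted = True
except json.JSONDecodeError:
    comment_accepted = False

print("comment accepted:", comment_accepted)

# Serialize back to indented JSON text.
pretty = json.dumps(data, indent=2)
```

Some tools accept JSON dialects with comments, but those are extensions and a standard parser will not read them.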


Here is a summary table of the key properties of the described data formats.

                XML              YAML                 JSON
Human readable  not really       yes                  yes
Purpose         documents, APIs  configuration files  APIs
Python libs     xml, lxml        PyYAML               json

There are online tools like this one to convert data between all three formats.