How to Convert JSON to YAML in Python

JavaScript Object Notation (JSON) and YAML Ain’t Markup Language (YAML) are two popular data serialization formats used for configuration files, data storage, and data exchange between programs. Both are human-readable text formats that represent hierarchical data using basic structures such as objects, arrays, strings, numbers, and booleans.

While JSON and YAML share many use cases, they differ in syntax. JSON is stricter: keys and string values must be wrapped in double quotes, and structure is marked with braces, brackets, and commas. YAML expresses structure through newlines and indentation and lets most quotes be omitted.

Sometimes you may need to convert data between these two formats. For example, you might have a JSON config file but need to port the data to a YAML file that another application expects. Or you may want to leverage YAML’s more concise syntax to simplify some verbose JSON.

Luckily, Python makes converting between JSON and YAML pretty painless. In this comprehensive guide, we’ll explore multiple methods to:

  • Convert JSON to YAML using the PyYAML library
  • Convert YAML to JSON using PyYAML together with Python’s built-in json module
  • Leverage utilities like json2yaml to simplify the process

We’ll also provide code examples of translating sample JSON and YAML data files in both directions. Let’s get started!

Overview: JSON vs YAML

Before diving into the conversion process, let’s briefly compare some of the syntax rules and typical use cases of JSON and YAML:

JSON

  • Strict formatting rules
  • Keys and string values require double quotes
  • Arrays denoted with square brackets
  • Popular for web APIs, config files, data exchange

YAML

  • Flexible whitespace formatting
  • Keys and values may omit quotes
  • Uses indentation to denote hierarchy
  • Great for configuration files and data descriptions

So while their core functionality is similar, YAML allows for more human-readable formatting. This can make it a great choice for configuration files, while JSON is ubiquitous for web services.

Converting between the two in Python opens up flexibility in terms of the formats you interface with.
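To make the syntax comparison above concrete, here is a minimal sketch that parses the same data written in both formats. The sample keys (service, port, debug, tags) are invented for illustration, and the sketch assumes PyYAML is installed.

```python
import json

import yaml  # PyYAML

# The same (made-up) configuration written in JSON...
json_text = '{"service": "api", "port": 8080, "debug": false, "tags": ["web", "prod"]}'

# ...and in YAML: no braces, no required quotes, hierarchy via indentation.
yaml_text = """
service: api
port: 8080
debug: false
tags:
  - web
  - prod
"""

from_json = json.loads(json_text)
from_yaml = yaml.safe_load(yaml_text)

# Both parsers produce the same plain Python dict.
print(from_json == from_yaml)  # True
```

Because both formats land on the same Python structures, conversion is mostly a matter of parsing with one library and serializing with the other.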

Installing the Necessary Libraries

To convert JSON and YAML in Python, we need to install a few libraries:

pip install pyyaml json2yaml

Together with the standard library, this gives us:

  • PyYAML: For parsing/generating YAML
  • json: Python’s built-in JSON module (part of the standard library, so no install is needed)
  • json2yaml: A utility library to simplify JSON/YAML conversion

Let’s look at how these libraries can translate between formats.
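As a quick sanity check after installing, something like the following sketch can confirm that PyYAML imports and works alongside the built-in json module. It only exercises PyYAML and json; the tiny sample document is made up.

```python
import json

import yaml  # installed by "pip install pyyaml"

# Report which PyYAML version was picked up.
pyyaml_version = yaml.__version__
print("PyYAML version:", pyyaml_version)

# Smoke test: JSON text -> Python -> YAML text -> Python again.
sample = json.loads('{"ok": true}')
smoke_result = yaml.safe_load(yaml.safe_dump(sample))
print(smoke_result)  # {'ok': True}
```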

Parsing JSON in Python

Before we can convert our JSON to anything else, we first need to deserialize it from a string into Python data structures.

Python’s json library handles parsing JSON easily. We just use the loads() function:

import json

json_string = '{"name": "John Smith", "age": 35}'

# Parse string to Python dict
data = json.loads(json_string)

print(data["name"])  # John Smith

loads() takes a JSON string and converts it into native Python types:

  • Objects → dicts
  • Arrays → lists
  • Strings → str
  • Numbers → int or float
  • Booleans → bool
  • null → None

This gives us easy access in Python for further processing.
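The type mapping above can be checked directly. This short sketch parses a made-up document containing one value of each JSON type and prints the Python type that json.loads() produced for it.

```python
import json

# One value of each JSON type (keys are arbitrary illustrative names).
doc = '{"obj": {"k": 1}, "arr": [1, 2], "s": "hi", "i": 7, "f": 1.5, "t": true, "n": null}'

parsed = json.loads(doc)

# Map each key to the name of the Python type it was parsed into.
type_names = {key: type(value).__name__ for key, value in parsed.items()}

for key, name in type_names.items():
    print(f"{key}: {name}")
# obj: dict
# arr: list
# s: str
# i: int
# f: float
# t: bool
# n: NoneType
```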

You can also load JSON from a file using load() instead of loads():

with open("data.json") as f:
    data = json.load(f)

Either way, you end up with parsed Python data structures matching the JSON.
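Real input files are sometimes malformed, so it can help to report where parsing failed. The sketch below wraps file loading in a helper; load_json_file is a hypothetical name, not part of the json module, and the temporary files exist only to make the example self-contained.

```python
import json
import tempfile
from pathlib import Path


def load_json_file(path):
    """Hypothetical helper: parse a JSON file, reporting the error location on bad input."""
    text = Path(path).read_text(encoding="utf-8")
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(
            f"{path}: invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
        ) from exc


with tempfile.TemporaryDirectory() as tmp:
    # A well-formed file parses into a dict.
    good = Path(tmp) / "data.json"
    good.write_text('{"name": "John Smith", "age": 35}', encoding="utf-8")
    data = load_json_file(good)

    # A trailing comma is not valid JSON and triggers the error path.
    bad = Path(tmp) / "broken.json"
    bad.write_text('{"name": "John Smith",}', encoding="utf-8")
    error_message = None
    try:
        load_json_file(bad)
    except ValueError as err:
        error_message = str(err)

print(data)
print(error_message)
```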

Converting JSON to YAML with PyYAML

Once our JSON is parsed into Python types, we can convert it into YAML using PyYAML’s dump() method:

import json 
import yaml

json_string = '{"name": "John Smith", "age": 35}'
json_data = json.loads(json_string)

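# sort_keys=False keeps the original key order;
# by default PyYAML sorts keys alphabetically (age would come first).
yaml_string = yaml.dump(json_data, sort_keys=False)
print(yaml_string)

# Output:
# name: John Smith
# age: 35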

Here’s what’s happening above:

  1. Load our JSON string into a Python dict
  2. Pass that Python dict into PyYAML’s dump() method
  3. It outputs our YAML string representation

As you can see, the YAML output is more concise because it omits the quotes and braces, but the data is still the same set of key/value pairs.
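A few dump options affect what the YAML looks like. The sketch below, which assumes PyYAML 5.1 or later (where sort_keys is available), compares the default alphabetical key sorting with sort_keys=False and shows allow_unicode=True for non-ASCII text. The "city" value is an invented sample.

```python
import json

import yaml

data = json.loads('{"name": "John Smith", "age": 35}')

# Default behaviour: keys are sorted alphabetically.
sorted_yaml = yaml.safe_dump(data)

# sort_keys=False preserves the order from the JSON document.
ordered_yaml = yaml.safe_dump(data, sort_keys=False)

# Non-ASCII characters are escaped unless allow_unicode=True is passed.
escaped_yaml = yaml.safe_dump({"city": "Zürich"})
unicode_yaml = yaml.safe_dump({"city": "Zürich"}, allow_unicode=True)

print(sorted_yaml)
print(ordered_yaml)
print(escaped_yaml)
print(unicode_yaml)
```

Preserving key order tends to matter for configuration files that people read and diff, which is why sort_keys=False is often a sensible choice for conversions.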

If we want to write to a file instead of getting a string back, we pass an open file object as the second argument to dump():

with open("data.yaml", "w") as f:
    yaml.dump(json_data, f)

And that’s really all there is to the basic JSON → YAML conversion in Python!
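Putting the parse and dump steps together, a file-to-file conversion might look like the following sketch. json_file_to_yaml_file is a hypothetical helper name, and the temporary directory is there only so the example runs without touching your working directory.

```python
import json
import tempfile
from pathlib import Path

import yaml


def json_file_to_yaml_file(src, dst):
    """Hypothetical helper: read JSON from src and write the equivalent YAML to dst."""
    with open(src, encoding="utf-8") as f:
        data = json.load(f)
    with open(dst, "w", encoding="utf-8") as f:
        # Keep the original key order and write non-ASCII text unescaped.
        yaml.safe_dump(data, f, sort_keys=False, allow_unicode=True)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "data.json"
    dst = Path(tmp) / "data.yaml"
    src.write_text('{"name": "John Smith", "age": 35}', encoding="utf-8")

    json_file_to_yaml_file(src, dst)
    written = dst.read_text(encoding="utf-8")

print(written)
```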

Converting YAML to JSON in Python

We can also go the other direction – parsing YAML into Python data structures using PyYAML, then converting into JSON:

import json
import yaml

yaml_string = """
name: John Smith 
age: 35
"""

# Parse YAML into Python dict
# (safe_load only builds plain Python types, which is all JSON can represent)
yaml_data = yaml.safe_load(yaml_string)

# Convert Python dict to JSON
json_string = json.dumps(yaml_data, indent=2)

print(json_string)

# Output: 
# {
#   "name": "John Smith",
#   "age": 35  
# }

Here PyYAML parses our YAML string into a Python dict, then json.dumps() serializes that dict as a JSON string.

So combined, PyYAML and Python’s JSON library make converting between YAML and JSON in either direction straightforward.
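One caveat when going from YAML to JSON: YAML has types that JSON lacks. For example, PyYAML parses an unquoted date into a datetime.date object, which json.dumps() cannot serialize on its own. The sketch below shows one possible workaround, passing default=str so such values become strings; the "joined" field is an invented sample.

```python
import json

import yaml

yaml_text = """
name: John Smith
joined: 2021-03-14
"""

data = yaml.safe_load(yaml_text)
joined_type = type(data["joined"]).__name__  # PyYAML produced a date object

# Plain json.dumps() rejects the date value.
try:
    json.dumps(data)
    plain_dump_failed = False
except TypeError:
    plain_dump_failed = True

# default=str is called for anything json cannot handle natively.
json_string = json.dumps(data, default=str)

print(joined_type)
print(plain_dump_failed)
print(json_string)
```

Whether converting such values to strings is acceptable depends on what the consumer of the JSON expects, so treat default=str as one option rather than a general rule.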

Using json2yaml for Simplified JSON/YAML Conversion

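While PyYAML + Python JSON works well, for convenience we can also use the json2yaml library, which offers conversion helpers designed explicitly for translating between these formats. The function names below are the ones this guide assumes; confirm them against the documentation for the json2yaml version you install: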

from json2yaml import json2yaml, yaml2json

json_string = '{"name": "John Smith", "age": 35}'

yaml_string = json2yaml(json_string) 
print(yaml_string)

# Output:  
# name: John Smith
# age: 35

Then going from YAML back to JSON:

yaml_string = """
name: John Smith
age: 35  
""" 

json_string = yaml2json(yaml_string)
print(json_string)  

# Output:
# {"name": "John Smith", "age": 35}

With json2yaml we no longer handle the intermediate Python data structures ourselves. Instead we go directly from one text serialization format to the other.

For most use cases, json2yaml provides all the convenience we need for converting JSON and YAML in Python.
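If you would rather not add another dependency, the same string-to-string convenience can be sketched with just json and PyYAML. The names json_to_yaml and yaml_to_json below are hypothetical wrappers written for this example, not functions from any library.

```python
import json

import yaml


def json_to_yaml(json_text):
    """Hypothetical wrapper: JSON text in, YAML text out (key order preserved)."""
    return yaml.safe_dump(json.loads(json_text), sort_keys=False)


def yaml_to_json(yaml_text, indent=None):
    """Hypothetical wrapper: YAML text in, JSON text out."""
    return json.dumps(yaml.safe_load(yaml_text), indent=indent)


yaml_out = json_to_yaml('{"name": "John Smith", "age": 35}')
json_out = yaml_to_json(yaml_out)

print(yaml_out)
print(json_out)
```

Keeping the wrappers this thin means any PyYAML or json option (indentation, Unicode handling, key sorting) can be threaded through later as a keyword argument.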

Handling Files for JSON <> YAML Conversion

In addition to strings, json2yaml also handles seamless file conversion:

JSON File → YAML File

import json2yaml

with open("data.json") as f:
  json2yaml.convert_json2yaml(f, "data.yaml")

YAML File → JSON File

import json2yaml

with open("data.yaml") as f:
  json2yaml.convert_yaml2json(f, "data.json")

This simplifies dumping between formats when working with file data.
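For comparison, file conversion in either direction can also be sketched with only json and PyYAML. convert_file is a hypothetical helper that picks the direction from the source file's extension; the temporary files are only there to make the example runnable.

```python
import json
import tempfile
from pathlib import Path

import yaml


def convert_file(src, dst):
    """Hypothetical helper: convert JSON <-> YAML based on the source file suffix."""
    src, dst = Path(src), Path(dst)
    text = src.read_text(encoding="utf-8")
    suffix = src.suffix.lower()
    if suffix == ".json":
        output = yaml.safe_dump(json.loads(text), sort_keys=False)
    elif suffix in {".yaml", ".yml"}:
        output = json.dumps(yaml.safe_load(text), indent=2) + "\n"
    else:
        raise ValueError(f"unsupported source format: {src.suffix!r}")
    dst.write_text(output, encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    yaml_path = Path(tmp) / "data.yaml"
    json_path = Path(tmp) / "data.json"
    yaml_path.write_text("name: John Smith\nage: 35\n", encoding="utf-8")

    # YAML file -> JSON file
    convert_file(yaml_path, json_path)
    converted = json.loads(json_path.read_text(encoding="utf-8"))

print(converted)
```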

Example: Converting a Complex JSON Document

To better understand how more complex documents translate, let’s walk through converting this sample JSON payload:

{
  "name": "John Smith",
  "age": 35,
  "children": [
    {
      "name": "Jane",
      "age": 6 
    },
    {
      "name": "Peter",
      "age": 8
    }
  ],
  "fav_colors": [
    "blue", 
    "green"
  ],    
  "active": true
}

Running this through json2yaml.json2yaml(), here is the resulting YAML format:

name: John Smith
age: 35
children:
  - name: Jane
    age: 6
  - name: Peter  
    age: 8
fav_colors: 
  - blue
  - green 
active: true

The YAML output is more concise: list items are marked with hyphens instead of JSON’s brackets and commas, and nested objects are expressed through indentation rather than braces.

This allows the same rich data structure with hierarchy and multiple types, while improving readability.
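A useful check for nested documents like this one is a round trip: convert to YAML, parse the YAML back, and compare with the original data. The sketch below does this with json and PyYAML. Note that PyYAML's exact layout (for example, how far list items are indented) may differ from the listing above even though the data is identical.

```python
import json

import yaml

json_text = """
{
  "name": "John Smith",
  "age": 35,
  "children": [
    {"name": "Jane", "age": 6},
    {"name": "Peter", "age": 8}
  ],
  "fav_colors": ["blue", "green"],
  "active": true
}
"""

data = json.loads(json_text)

# Serialize to YAML, keeping the key order from the JSON document.
yaml_text = yaml.safe_dump(data, sort_keys=False)
print(yaml_text)

# Parse the YAML back and confirm nothing was lost or changed.
round_tripped = yaml.safe_load(yaml_text)
print(round_tripped == data)  # True
```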

We could then go from YAML back to JSON just as easily using json2yaml.yaml2json() if needed.

Tips for Seamless JSON <-> YAML Conversion in Python

A few practices keep translation between JSON and YAML smooth in Python: follow Pythonic style guidelines, lean on libraries like json2yaml, and automate as much of the conversion pipeline as possible. With those in place, you can translate JSON to YAML or vice versa with ease.
