Writing and Reading Metadata Serialized with JSON

Author
Dr. Nicholas Del Grosso

JSON (JavaScript Object Notation) is a widely used format for data exchange, valued for its simplicity and readability. In neuroscience, its structured format is well suited to organizing complex experimental metadata in a form that is easy to share and analyze. Because most programming languages can read and write JSON, it also simplifies data management when labs that use different tools collaborate.

This table covers the basic types of values that can be represented in JSON, providing a quick reference for understanding and using JSON data types in various applications:

JSON Type | Description                              | Example
String    | Textual data enclosed in double quotes   | "exampleString"
Number    | Integer or floating-point number         | 42, 3.14
Object    | Collection of key-value pairs            | {"key": "value"}
Array     | Ordered list of values                   | [1, "two", 3.0]
Boolean   | True or false value                      | true, false
Null      | Represents a null or non-existent value  | null
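
As a small sketch of how these types come out after parsing with Python's built-in json module, the snippet below loads one record that uses every type in the table and reports the Python type of each value. The field names are made up for illustration.

```python
import json

# Hypothetical record that uses every JSON type from the table
text = (
    '{"name": "exampleString", "count": 42, "ratio": 3.14, '
    '"nested": {"key": "value"}, "items": [1, "two", 3.0], '
    '"flag": true, "missing": null}'
)

record = json.loads(text)

# Map each key to the name of the Python type the parser produced
python_types = {key: type(value).__name__ for key, value in record.items()}
print(python_types)
```

Note that JSON's true, false, and null become Python's True, False, and None.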

Setup

Import Libraries

import json
from pathlib import Path
import pandas as pd

Data

import json, random
from pathlib import Path

random.seed(42)  # Ensures all the randomly-generated data is consistent across runs and computers

for _ in range(10):

    # Generate random parameters
    params = {
        "exposure_time": random.choice([100, 200, 300]),  # milliseconds
        "laser_power": random.choice([5, 10, 15]),  # milliwatts
        "num_frames": random.randint(200, 400),
        "frame_rate": random.choice([10, 20, 30]),  # Hz
        "region_of_interest": random.choice(["ROI1", "ROI2", "ROI3"]),
    }
    if random.random() > 0.5:
        params['start_time'] = random.randint(1, 5000)  # seconds

    # Write the data to a json file
    session_num = random.randint(1, 300)
    experimenter = random.choice(["Sophie", "Florian"])
    path = Path(f"image_data/{experimenter}_{session_num}/session.json")
    path.parent.mkdir(parents=True, exist_ok=True)
    json_text = json.dumps(params, indent=3)
    path.write_text(json_text)

Section 1: The Built-In json Library

Code                                             | Description
Reading JSON                                     |
text = Path('myfile.json').read_text()           | Read a text file into a string.
data = json.loads(text)                          | Convert JSON-formatted text to a Python data structure.
Writing JSON                                     |
text = json.dumps(data, indent=3)                | Convert a Python data structure to a JSON-formatted text string.
Path("myfile.json").write_text(text)             | Write the text to a file.
Path("data/myfile.json").parent                  | Get the parent directory of "myfile.json" (in this case, "data").
Path("data").mkdir(exist_ok=True, parents=True)  | Create a folder at the path, and all of its parent folders, if necessary.
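
The calls in this table can be chained into a full write-then-read round trip. The sketch below is one way to do it, using a temporary folder so it leaves nothing behind; the values in data are hypothetical.

```python
import json
import tempfile
from pathlib import Path

data = {"subject": "XTR2", "session": 3}  # hypothetical example values

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data" / "myfile.json"
    path.parent.mkdir(exist_ok=True, parents=True)  # create the "data" folder first
    path.write_text(json.dumps(data, indent=3))     # Python -> JSON text -> file
    restored = json.loads(path.read_text())         # file -> JSON text -> Python

print(restored == data)
```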

Exercises

Example: Translate the following sentence to JSON-formatted text, and use the JSON parser to validate it (i.e. check that it is formatted correctly): The researcher, Sam Vimes, ran Session Number 3 with Subject XTR2 on February 4th, 2022.

text = '{"Researcher": "Sam Vimes", "Session": 3, "Subject": "XTR2", "Date": "2022-02-04"}'
json.loads(text)
{'Researcher': 'Sam Vimes',
 'Session': 3,
 'Subject': 'XTR2',
 'Date': '2022-02-04'}
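
Validation works because json.loads raises json.JSONDecodeError when the text is not well-formed. A minimal sketch of a checker built on that behavior follows; the helper name is_valid_json is invented for this illustration.

```python
import json

def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise (hypothetical helper)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

good = '{"Researcher": "Sam Vimes", "Session": 3}'
bad = "{'Researcher': 'Sam Vimes', 'Session': 3,}"  # single quotes and a trailing comma

print(is_valid_json(good))
print(is_valid_json(bad))
```

Single-quoted strings and trailing commas are valid in Python dictionaries but not in JSON, which makes them common mistakes when writing JSON by hand.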

Example: Save this data to an appropriately-named file.

path = Path("data/session.json")
path.parent.mkdir(exist_ok=True, parents=True)
path.write_text(text)
82

Example: Read the data from the file back into a Python data structure.

data = json.loads(path.read_text())
data
{'Researcher': 'Sam Vimes',
 'Session': 3,
 'Subject': 'XTR2',
 'Date': '2022-02-04'}

Exercise: Translate the following sentence to JSON-formatted text, and use the JSON parser to validate it (i.e. check that it is formatted correctly): The EEG amplifier’s low-pass filter was set to 200 Hz, its high-pass filter to 0.2 Hz, and its notch filter (which was set to 50 Hz) was turned on.

Solution
text = '{"low_pass_hz": 200, "high_pass_hz": 0.2, "notch_hz": 50, "notch_on": true}'
json.loads(text)
{'low_pass_hz': 200, 'high_pass_hz': 0.2, 'notch_hz': 50, 'notch_on': True}

Exercise: Save this data to an appropriately-named file.

Solution
path = Path("data/eeg.json")
path.parent.mkdir(exist_ok=True, parents=True)
path.write_text(text)
75

Exercise: Read the data from the file back into a Python data structure.

Solution
data = json.loads(path.read_text())
data
{'low_pass_hz': 200, 'high_pass_hz': 0.2, 'notch_hz': 50, 'notch_on': True}

Exercise: Translate the following sentence to a Python data structure, then use the json library to convert it to JSON-formatted text: Three electrodes were implanted into subject “Pinky”, a Sprague-Dawley rat: one in the hippocampus (channel 3), one in the visual cortex (channel 4), and one in the motor cortex (channel 6).

Solution
text = """
{
    "subject": "Pinky",
    "strain": "Sprague-Dawley",
    "electrodes": 
        [
            {
                "channel": 3,
                "location": "hippocampus"
            },
            {
                "channel": 4,
                "location": "visual cortex"
            },
            {
                "channel": 6,
                "location": "motor cortex"
            }
        ]
}
 
"""

data = json.loads(text)
data
{'subject': 'Pinky',
 'strain': 'Sprague-Dawley',
 'electrodes': [{'channel': 3, 'location': 'hippocampus'},
  {'channel': 4, 'location': 'visual cortex'},
  {'channel': 6, 'location': 'motor cortex'}]}

Exercise: Save the JSON data to an appropriately-named file.

Solution
path = Path("data/pinky.json")
path.parent.mkdir(exist_ok=True, parents=True)
path.write_text(text)
406

Exercise: Read the file back into a Python data structure.

Solution
data = json.loads(path.read_text())
data
{'subject': 'Pinky',
 'strain': 'Sprague-Dawley',
 'electrodes': [{'channel': 3, 'location': 'hippocampus'},
  {'channel': 4, 'location': 'visual cortex'},
  {'channel': 6, 'location': 'motor cortex'}]}
  {'channel': 6, 'location': 'motor cortex'}]}
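
The electrode exercise can also be approached in the direction its wording suggests: build the Python data structure first, then let json.dumps produce the text. A sketch of that route is below; the key names are one possible choice, not the only valid one.

```python
import json

# The sentence as a Python data structure (key names are one possible choice)
data = {
    "subject": "Pinky",
    "strain": "Sprague-Dawley",
    "electrodes": [
        {"channel": 3, "location": "hippocampus"},
        {"channel": 4, "location": "visual cortex"},
        {"channel": 6, "location": "motor cortex"},
    ],
}

# Convert to JSON-formatted text; indent=3 makes the nesting easy to read
text = json.dumps(data, indent=3)
print(text)
```

Writing the dictionary first means the json library handles the quoting, commas, and brackets, so the resulting text is always valid JSON.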

Exercise: Translate the following sentence to a Python data structure and save it to a JSON file: “The image has a width of 1080 pixels and a height of 720 pixels, and is saved in RGB format. The camera settings had an exposure time of 8 milliseconds, an aperture of 2.8 stops, and an ISO setting of 100.”

Solution
text = """
{
    "width": 1080,
    "height": 720,
    "color": "rgb",
    "exp_time": 8,
    "aperture": 2.8,
    "iso": 100
}
"""

data = json.loads(text)
path = Path("data/image.json")
path.parent.mkdir(exist_ok=True, parents=True)
path.write_text(text)
118

Exercise: Read the file back to check that it was saved correctly.

Solution
data = json.loads(path.read_text())
data
{'width': 1080,
 'height': 720,
 'color': 'rgb',
 'exp_time': 8,
 'aperture': 2.8,
 'iso': 100}

Example: In image_data, read and parse the JSON-formatted data in session 72 to get the exposure time.

json.loads(Path("image_data/Sophie_72/session.json").read_text())["exposure_time"]
300

Exercise: Read and parse the JSON-formatted data in session 117 to get the frame rate.

Solution
json.loads(Path("image_data/Florian_117/session.json").read_text())["frame_rate"]
30

Example: Use list(Path().glob(pattern)) to list all the JSON session files in the image_data folder (tip: use the wildcard “*” wherever there are variable parts in the filename)

list(Path("image_data").glob("*/session.json"))
[PosixPath('image_data/Sophie_187/session.json'),
 PosixPath('image_data/Florian_41/session.json'),
 PosixPath('image_data/Florian_177/session.json'),
 PosixPath('image_data/Sophie_143/session.json'),
 PosixPath('image_data/Sophie_215/session.json'),
 PosixPath('image_data/Sophie_88/session.json'),
 PosixPath('image_data/Sophie_16/session.json'),
 PosixPath('image_data/Sophie_72/session.json'),
 PosixPath('image_data/Sophie_167/session.json'),
 PosixPath('image_data/Florian_117/session.json')]
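
The order in which glob returns files depends on the file system, as the listing above suggests. If a reproducible order matters, one option is to wrap the call in sorted(). The sketch below demonstrates this on a temporary folder with made-up session names.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "image_data"
    # Create three hypothetical session folders, plus one file that should not match
    for name in ["Sophie_72", "Florian_117", "Sophie_16"]:
        folder = root / name
        folder.mkdir(parents=True)
        (folder / "session.json").write_text("{}")
    (root / "Sophie_16" / "notes.txt").write_text("not a session file")

    # "*" matches any folder name; sorted() gives a reproducible order
    paths = sorted(root.glob("*/session.json"))
    folder_names = [path.parent.name for path in paths]

print(folder_names)
```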

Example: Read and parse all the session.json files and put them into a Pandas DataFrame. Here is a code template to help you get started:

sessions = []
for path in Path().glob("image_data/Sophie_16/session.json"):
    text = path.read_text()
    session = {"A": 3}
    sessions.append(session)

df = pd.DataFrame(sessions)
df

Exercise: Read and parse all the session.json files and put them into a Pandas DataFrame.

Solution
df = pd.DataFrame([
    json.loads(Path(path).read_text())
    for path in Path().glob("image_data/*/session.json")
])
df

   exposure_time  laser_power  num_frames  frame_rate  region_of_interest  start_time
0            100            5         225          20                ROI2         NaN
1            100           15         317          30                ROI1      3101.0
2            100           10         226          10                ROI2         NaN
3            200           15         271          10                ROI1      2788.0
4            100            5         329          30                ROI1      4465.0
5            200           10         253          30                ROI2       585.0
6            300           15         339          10                ROI3         NaN
7            300            5         206          30                ROI2         NaN
8            100           10         297          20                ROI3      1800.0
9            300           15         292          30                ROI1       376.0
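
The NaN entries in start_time appear because the data-generation code only writes that key for some sessions, and Pandas fills the gaps with NaN, which also turns the column into floating-point numbers. When working with the parsed dictionaries directly, a missing key can be handled with dict.get(), as in this sketch with two made-up session texts.

```python
import json

# Two hypothetical session files as text: only the first has "start_time"
texts = [
    '{"exposure_time": 100, "start_time": 3101}',
    '{"exposure_time": 300}',
]

sessions = [json.loads(text) for text in texts]

# dict.get() returns None instead of raising KeyError when the key is absent
start_times = [session.get("start_time") for session in sessions]
print(start_times)
```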

Exercise: Read and parse all the session.json files and put them into a Pandas DataFrame, this time including the experimenter name and the session ID, both taken from the parent folder’s name (tip: Path().parent.name), and the path to the parent folder for later analysis (e.g. to load other data files from that session).

Solution
import json, pandas as pd
from pathlib import Path

df = pd.DataFrame([
    {**json.loads(path.read_text()),
     "experimenter": path.parent.name.split("_")[0],
     "session": int(path.parent.name.split("_")[1])}
    for path in Path("image_data").glob("*/session.json")
])
df

   exposure_time  laser_power  num_frames  frame_rate  region_of_interest  experimenter  session  start_time
0            100            5         225          20                ROI2        Sophie      187         NaN
1            100           15         317          30                ROI1       Florian       41      3101.0
2            100           10         226          10                ROI2       Florian      177         NaN
3            200           15         271          10                ROI1        Sophie      143      2788.0
4            100            5         329          30                ROI1        Sophie      215      4465.0
5            200           10         253          30                ROI2        Sophie       88       585.0
6            300           15         339          10                ROI3        Sophie       16         NaN
7            300            5         206          30                ROI2        Sophie       72         NaN
8            100           10         297          20                ROI3        Sophie      167      1800.0
9            300           15         292          30                ROI1       Florian      117       376.0
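
The table above does not contain the path to each session folder. One possible way to keep it is sketched below, using a temporary folder with a single made-up session. The "folder" key is a name chosen for this illustration. The resulting list of records can be passed to pd.DataFrame in the same way as in the solution.

```python
import json
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "image_data"
    # Hypothetical stand-in for one generated session folder
    folder = root / "Sophie_72"
    folder.mkdir(parents=True)
    (folder / "session.json").write_text('{"exposure_time": 300}')

    records = []
    for path in sorted(root.glob("*/session.json")):
        # Folder names follow the pattern "<experimenter>_<session number>"
        experimenter, session = path.parent.name.split("_")
        records.append({
            **json.loads(path.read_text()),
            "experimenter": experimenter,
            "session": int(session),
            # Path of the session folder relative to the temporary root,
            # kept so other files from the session can be loaded later
            "folder": path.parent.relative_to(tmp).as_posix(),
        })

print(records)
```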