Writing and Reading Metadata Serialized with JSON

Courses

Analyzing JSON, CSV, and Parquet data using SQL in DuckDB

Author

Dr. Nicholas Del Grosso

JSON (JavaScript Object Notation) is a widely used text format for data exchange, valued for its simplicity and readability. In neuroscience, its nested key-value structure is well suited to organizing experimental metadata, such as session parameters and recording settings, in a form that both people and programs can read. Because nearly every programming language can read and write JSON, the format also makes it easier to share metadata between labs and analysis tools.

This table covers the basic types of values that can be represented in JSON, providing a quick reference for understanding and using JSON data types in various applications:

JSON Type	Description	Example
String	Textual data enclosed in quotes	`"exampleString"`
Number	Integer or floating-point number	`42`, `3.14`
Object	Collection of key-value pairs	`{"key": "value"}`
Array	Ordered list of values	`[1, "two", 3.0]`
Boolean	True or false value	`true`, `false`
Null	Represents a null or non-existent value	`null`
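As a quick illustration of the table, the sketch below parses one value of each JSON type with the built-in `json` library and prints the Python type it becomes. The single-letter key names are placeholders chosen for this example only.

```python
import json

# One value of every JSON type from the table, written as JSON text.
# The key names ("s", "n", ...) are placeholders for this illustration.
text = (
    '{"s": "exampleString", "n": 42, "f": 3.14, '
    '"o": {"key": "value"}, "a": [1, "two", 3.0], '
    '"b": true, "z": null}'
)

parsed = json.loads(text)

# Report the Python type that each JSON value was converted to
for key, value in parsed.items():
    print(key, type(value).__name__, repr(value))
```

JSON objects become Python dictionaries, arrays become lists, `true`/`false` become `True`/`False`, and `null` becomes `None`.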

Setup

Import Libraries

import json
from pathlib import Path
import pandas as pd

Data

import json, random
from pathlib import Path

random.seed(42)  # Ensures the randomly generated data is consistent across runs and computers

for _ in range(10):

    # Generate random parameters
    params = {
        "exposure_time": random.choice([100, 200, 300]),  # milliseconds
        "laser_power": random.choice([5, 10, 15]),  # milliwatts
        "num_frames": random.randint(200, 400),
        "frame_rate": random.choice([10, 20, 30]),  # Hz
        "region_of_interest": random.choice(["ROI1", "ROI2", "ROI3"]),
    }
    if random.random() > 0.5:
        params['start_time'] = random.randint(1, 5000)  # seconds

    # Write the data to a json file
    session_num = random.randint(1, 300)
    experimenter = random.choice(["Sophie", "Florian"])
    path = Path(f"image_data/{experimenter}_{session_num}/session.json")
    path.parent.mkdir(parents=True, exist_ok=True)
    json_text = json.dumps(params, indent=3)
    path.write_text(json_text)

Section 1: The Built-In `json` Library

Code	Description
Reading JSON	-
`text = pathlib.Path('myfile.json').read_text()`	Reads a text file to a string.
`data = json.loads(text)`	Converts JSON-formatted text to a Python data structure
Writing JSON	-
`text = json.dumps(data, indent=3)`	Converts a Python data structure to a JSON-formatted text string
`pathlib.Path("myfile.json").write_text(text)`	Write the text to a file
`pathlib.Path("data/myfile.json").parent`	Get the parent directory of “myfile.json” (in this case, “data”)
`pathlib.Path("data").mkdir(exist_ok=True, parents=True)`	Create a folder at the path, and all of its parent folders, if necessary.
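The calls in the table can be combined into a full write-then-read round trip. The following is a minimal sketch, assuming made-up metadata and a temporary folder so that it does not touch any of the course files.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical metadata, used only for illustration
data = {"subject": "XTR2", "session": 3}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data" / "myfile.json"
    path.parent.mkdir(parents=True, exist_ok=True)  # create "data" if it is missing
    path.write_text(json.dumps(data, indent=3))     # Python -> JSON text -> file
    restored = json.loads(path.read_text())         # file -> JSON text -> Python

# The data read back should equal the data that was written
print(restored == data)
```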

Exercises

Example: Translate the following sentence to JSON-formatted text, and use the JSON parser to validate it (i.e. check that it is formatted correctly): The researcher, Sam Vimes, ran Session Number 3 with Subject XTR2 on February 4th, 2022.

text = '{"Researcher": "Sam Vimes", "Session": 3, "Subject": "XTR2", "Date": "2022-02-04"}'
json.loads(text)

{'Researcher': 'Sam Vimes',
 'Session': 3,
 'Subject': 'XTR2',
 'Date': '2022-02-04'}
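When the text is not valid JSON, `json.loads` raises a `json.JSONDecodeError` instead of returning a value. The sketch below wraps that behavior in a small helper; `is_valid_json` is a hypothetical name introduced here and is not part of the `json` library.

```python
import json

def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise (hypothetical helper)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

good = '{"Researcher": "Sam Vimes", "Session": 3}'
bad = "{'Researcher': 'Sam Vimes', 'Session': 3}"  # single quotes are not valid JSON

print(is_valid_json(good))
print(is_valid_json(bad))
```

A common cause of such errors is copying a Python dictionary literal, which allows single quotes, into JSON text, which requires double quotes around strings.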

Example: Save this data to an appropriately-named file.

path = Path("data/session.json")
path.parent.mkdir(exist_ok=True, parents=True)
path.write_text(text)

Example: Read the data from the file back into a Python data structure.

data = json.loads(path.read_text())
data

{'Researcher': 'Sam Vimes',
 'Session': 3,
 'Subject': 'XTR2',
 'Date': '2022-02-04'}

Exercise: Translate the following sentence to JSON-formatted text, and use the JSON parser to validate it (i.e. check that it is formatted correctly): The EEG amplifier’s low-pass filter was set to 200 Hz, its high-pass filter to 0.2 Hz, and its notch filter (which was set to 50 Hz) was turned on.

Solution

text = '{"low_pass_hz": 200, "high_pass_hz": 0.2, "notch_hz": 50, "notch_on": true}'
json.loads(text)

{'low_pass_hz': 200, 'high_pass_hz': 0.2, 'notch_hz': 50, 'notch_on': True}

Exercise: Save this data to an appropriately-named file.

Solution

path = Path("data/eeg.json")
path.parent.mkdir(exist_ok=True, parents=True)
path.write_text(text)

Exercise: Read the data from the file back into a Python data structure.

Solution

data = json.loads(path.read_text())
data

{'low_pass_hz': 200, 'high_pass_hz': 0.2, 'notch_hz': 50, 'notch_on': True}

Exercise: Translate the following sentence to a Python data structure, then use the json library to convert it to JSON-formatted text: Three electrodes were implanted into subject “Pinky”, a Sprague-Dawley rat: one in the hippocampus (channel 4), one in the visual cortex (channel 5), and one in the motor cortex (channel 6).

Solution

text = """
{
    "subject": "Pinky",
    "strain": "Sprague-Dawley",
    "electrodes": 
        [
            {
                "channel": 4,
                "location": "hippocampus"
            },
            {
                "channel": 5,
                "location": "visual cortex"
            },
            {
                "channel": 6,
                "location": "motor cortex"
            }
        ]
}
 
"""

data = json.loads(text)
data

{'subject': 'Pinky',
 'strain': 'Sprague-Dawley',
 'electrodes': [{'channel': 4, 'location': 'hippocampus'},
  {'channel': 5, 'location': 'visual cortex'},
  {'channel': 6, 'location': 'motor cortex'}]}
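The solution above starts from JSON text. An alternative sketch that follows the exercise wording more literally builds the Python data structure first and lets `json.dumps` produce the text; the values mirror those in the solution.

```python
import json

# The same metadata as in the solution, written directly as a Python dictionary
data = {
    "subject": "Pinky",
    "strain": "Sprague-Dawley",
    "electrodes": [
        {"channel": 4, "location": "hippocampus"},
        {"channel": 5, "location": "visual cortex"},
        {"channel": 6, "location": "motor cortex"},
    ],
}

# Convert the Python data structure to indented JSON-formatted text
text = json.dumps(data, indent=3)
print(text)
```

Starting from a dictionary means the `json` library handles quoting and commas, so the resulting text is valid JSON by construction.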

Exercise: Save the json data to an appropriately-named file.

Solution

path = Path("data/pinky.json")
path.parent.mkdir(exist_ok=True, parents=True)
path.write_text(text)

Exercise: Read the file back into a Python data structure.

Solution

data = json.loads(path.read_text())
data

{'subject': 'Pinky',
 'strain': 'Sprague-Dawley',
 'electrodes': [{'channel': 4, 'location': 'hippocampus'},
  {'channel': 5, 'location': 'visual cortex'},
  {'channel': 6, 'location': 'motor cortex'}]}

Exercise: Translate the following sentence to a Python data structure and save it to a JSON file: “The image has a width of 1080 pixels and a height of 720 pixels, and is saved in RGB format. The camera settings had an exposure time of 8 milliseconds, an aperture of 2.8 stops, and an ISO setting of 100.”

Solution

text = """
{
    "width": 1080,
    "height": 720,
    "color": "rgb",
    "exp_time": 8,
    "aperture": 2.8,
    "iso": 100
}
"""

data = json.loads(text)
path = Path("data/image.json")
path.parent.mkdir(exist_ok=True, parents=True)
path.write_text(text)

Exercise: Read the file back to check that it was saved correctly.

Solution

data = json.loads(path.read_text())
data

{'width': 1080,
 'height': 720,
 'color': 'rgb',
 'exp_time': 8,
 'aperture': 2.8,
 'iso': 100}

Example: In the image_data folder, read and parse the JSON-formatted data in session 72 to get the exposure time.

json.loads(Path("image_data/Sophie_72/session.json").read_text())["exposure_time"]

Exercise: Read and parse the JSON-formatted data in session 177 to get the frame rate.

Solution

json.loads(Path("image_data/Florian_177/session.json").read_text())["frame_rate"]

Example: Use list(Path().glob(pattern)) to list all the JSON session files in the image_data folder (tip: use the wildcard “*” wherever there are variable parts in the filename)

list(Path("image_data").glob("*/session.json"))

[PosixPath('image_data/Sophie_187/session.json'),
 PosixPath('image_data/Florian_41/session.json'),
 PosixPath('image_data/Florian_177/session.json'),
 PosixPath('image_data/Sophie_143/session.json'),
 PosixPath('image_data/Sophie_215/session.json'),
 PosixPath('image_data/Sophie_88/session.json'),
 PosixPath('image_data/Sophie_16/session.json'),
 PosixPath('image_data/Sophie_72/session.json'),
 PosixPath('image_data/Sophie_167/session.json'),
 PosixPath('image_data/Florian_117/session.json')]
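The listing above is not in alphabetical order, because `glob` returns paths in whatever order the file system provides them. The sketch below, which assumes a throwaway folder with made-up session names, shows how the `*` wildcard selects the session folders and how `sorted()` makes the order reproducible.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "image_data"

    # Hypothetical session folders, created only for this demonstration
    for name in ["Sophie_72", "Florian_117", "Florian_177"]:
        folder = root / name
        folder.mkdir(parents=True)
        (folder / "session.json").write_text("{}")

    # A file that should not match the pattern
    (root / "notes.txt").write_text("not a session file")

    # "*" matches any single folder name; sorted() fixes the order
    paths = sorted(root.glob("*/session.json"))
    names = [p.parent.name for p in paths]

print(names)
```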

Example: Read and parse all the session.json files and put them into a Pandas DataFrame. Here is a code template to help you get started:

sessions = []
for path in Path().glob("image_data/Sophie_16/session.json"):
    text = path.read_text()
    session = {"A": 3}
    sessions.append(session)

df = pd.DataFrame(sessions)
df

Exercise: Read and parse all the session.json files and put them into a Pandas DataFrame.

Solution

df = pd.DataFrame([
    json.loads(Path(path).read_text())
    for path in Path().glob("image_data/*/session.json")
])
df

	exposure_time	laser_power	num_frames	frame_rate	region_of_interest	start_time
0	100	5	225	20	ROI2	NaN
1	100	15	317	30	ROI1	3101.0
2	100	10	226	10	ROI2	NaN
3	200	15	271	10	ROI1	2788.0
4	100	5	329	30	ROI1	4465.0
5	200	10	253	30	ROI2	585.0
6	300	15	339	10	ROI3	NaN
7	300	5	206	30	ROI2	NaN
8	100	10	297	20	ROI3	1800.0
9	300	15	292	30	ROI1	376.0
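The `NaN` entries in the `start_time` column appear because the data-generation code writes that key only when `random.random() > 0.5`, and pandas fills the missing values with `NaN`, which also turns the column into floating-point numbers. When working with a single parsed file, `dict.get` is one way to handle such an optional key; the two JSON strings below are made up for illustration.

```python
import json

# Two hypothetical session records: one with the optional key, one without
with_start = json.loads('{"exposure_time": 100, "start_time": 3101}')
without_start = json.loads('{"exposure_time": 100}')

# .get() returns None instead of raising KeyError when the key is absent
print(with_start.get("start_time"))
print(without_start.get("start_time"))
```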

Exercise: Read and parse all the session.json files and put them into a Pandas DataFrame, this time including the experimenter name and the session ID taken from the parent folder’s name (tip: Path().parent.name), as well as the path to the parent folder for later analysis (e.g. to load other data files from that session).

Solution

import json, pandas as pd
from pathlib import Path

df = pd.DataFrame([
    {**json.loads(path.read_text()),
     "experimenter": path.parent.name.split("_")[0],
     "session": int(path.parent.name.split("_")[1])}
    for path in Path("image_data").glob("*/session.json")
])
df

	exposure_time	laser_power	num_frames	frame_rate	region_of_interest	experimenter	session	start_time
0	100	5	225	20	ROI2	Sophie	187	NaN
1	100	15	317	30	ROI1	Florian	41	3101.0
2	100	10	226	10	ROI2	Florian	177	NaN
3	200	15	271	10	ROI1	Sophie	143	2788.0
4	100	5	329	30	ROI1	Sophie	215	4465.0
5	200	10	253	30	ROI2	Sophie	88	585.0
6	300	15	339	10	ROI3	Sophie	16	NaN
7	300	5	206	30	ROI2	Sophie	72	NaN
8	100	10	297	20	ROI3	Sophie	167	1800.0
9	300	15	292	30	ROI1	Florian	117	376.0
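The exercise also asks for the path to each session folder, which the table above does not include. One possible way to add it is sketched below; `load_sessions` and the column name `folder` are hypothetical choices made for this example, and the demonstration runs on a temporary folder with one made-up session.

```python
import json
import tempfile
from pathlib import Path

def load_sessions(root):
    """Collect one record per session.json file found under `root` (hypothetical helper)."""
    records = []
    for path in sorted(Path(root).glob("*/session.json")):
        # Folder names follow the pattern "<experimenter>_<session number>"
        experimenter, session = path.parent.name.split("_")
        records.append({
            **json.loads(path.read_text()),
            "experimenter": experimenter,
            "session": int(session),
            "folder": str(path.parent),  # hypothetical column name
        })
    return records

# Demonstration on a throwaway folder with one made-up session
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "image_data" / "Sophie_72"
    folder.mkdir(parents=True)
    (folder / "session.json").write_text('{"exposure_time": 300}')
    records = load_sessions(Path(tmp) / "image_data")

print(records[0]["experimenter"], records[0]["session"])
```

Passing the resulting list to `pd.DataFrame(records)` should give a table like the one above, with an additional `folder` column.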
