The Ising Model of Neural Populations

Authors
Dr. Michael Denker | Prof. Dr. Hideaki Shimazaki

Setup

Import Libraries

Import the modules required for this session. Run all cells in this section before starting the exercises.

Note: This notebook uses a local libary that sits in the /ssll folder. Before we can import from it we must call sys.path.insert(0, 'ssll') to add that folder to the path where Python can find it.

import numpy as np
import matplotlib.pyplot as plt

# Import modules from local library /ssll
import sys
sys.path.insert(0, 'ssll')
import ssll
import synthesis
import transforms
import energies

Utility Functions

The utils class below provides two plotting helpers used throughout the exercises. Run this cell once as part of Setup — you do not need to study the implementation.

class utils:
    @staticmethod
    def plot_pattern_probs(p_t, N):
        """Bar chart of 2^N pattern probabilities."""
        patterns = transforms.enumerate_patterns(N)
        xlabels = [''.join(str(b) for b in pat) for pat in patterns]
        fig, ax = plt.subplots(figsize=(max(5, 2**N * 0.5), 3))
        ax.bar(range(2**N), p_t, color='steelblue')
        ax.set_xticks(range(2**N))
        ax.set_xticklabels(xlabels)
        ax.set_xlabel('Pattern')
        ax.set_ylabel('Probability')
        plt.tight_layout()

    @staticmethod
    def plot_pattern_comparison(p_list, labels, N):
        """Side-by-side bar chart comparing multiple pattern distributions."""
        patterns = transforms.enumerate_patterns(N)
        xlabels = [''.join(str(b) for b in pat) for pat in patterns]
        n = 2**N
        x = np.arange(n)
        w = 0.8 / len(p_list)
        offsets = (np.arange(len(p_list)) - (len(p_list) - 1) / 2) * w
        fig, ax = plt.subplots(figsize=(max(5, n * 0.7), 3))
        for p, label, offset in zip(p_list, labels, offsets):
            ax.bar(x + offset, p, w, label=label)
        ax.set_xticks(x)
        ax.set_xticklabels(xlabels)
        ax.set_xlabel('Pattern')
        ax.set_ylabel('Probability')
        ax.legend()
        plt.tight_layout()

When working with custom libraries, it is a good practice to test that it is working as expected. The cell below runs the code with some example paramerters and uses assert to verify the output is as expected. The cell should print: "Setup OK: shapes (5, 6) (5, 8) (5, 10, 3)"

N_test, O_test, T_test, R_test = 3, 2, 5, 10

transforms.initialise(N_test, O_test)
theta_test = synthesis.generate_thetas(N_test, O_test, T_test)
p_test = np.array([transforms.compute_p(theta_test[t]) for t in range(T_test)])
spikes_test = synthesis.generate_spikes(p_test, R_test)
assert spikes_test.shape == (T_test, R_test, N_test), f"Expected (5,10,3), got {spikes_test.shape}"

print(f"Setup OK: shapes {theta_test.shape} {p_test.shape} {spikes_test.shape}")
Setup OK: shapes (5, 6) (5, 8) (5, 10, 3)

Section 1: Population Activity as Binary Words

In each time bin, every neuron is either active (1) or silent (0). Recording NN neurons simultaneously produces a binary vector of length NN at each time step. This is sometimes also called a spike word or population pattern. With NN neurons there are 2N2^N possible distinct patterns, ranging from the all-silent pattern (0000…) to the pattern in which every neuron fires (1111…).

In this exercise we will be working with the State-Space Analysis of Spike Correlations (SSLL) library developed by Hideaki Shimazaki and his team (https://github.com/shimazaki/ssll) . The library generates and fits data using four key quantities — NN, OO, TT, and RR — that together fully specify the experimental setup:

  • NN: number of neurons recorded simultaneously.
  • OO: model order. O=2O=2 is the pairwise (Ising) model, which captures individual firing rates and all pairwise co-activation rates. O=1O=1 is the independent model (firing rates only).
  • TT: number of time bins. In this exercise T=1T=1, meaning we treat the population as stationary (one fixed distribution). (The next exercise notebook uses T>1T>1 to track how the distribution changes over time.)
  • RR: number of trials. At each time bin the same stimulus or condition is repeated RR times, giving RR independent draws from the pattern distribution. More trials mean better-estimated statistics.

The function synthesis.generate_thetas draws a set of DD natural parameters θ\theta from a Gaussian process. The latter is a model for smooth random trajectories in which nearby time bins tend to have similar parameter values, with the degree of smoothness controlled by a covariance function. For T=1T=1 this reduces to a single draw from a Gaussian distribution. The function returns parameter trajectories of shape (T,D)(T, D). The function transforms.compute_p converts a single parameter vector θt\theta_t into a probability distribution pp over all 2N2^N patterns — a vector of length 2N2^N where each entry is P(x)P(\mathbf{x}) for one binary word xx. Given pp, the function transforms.compute_eta computes the expectation parameters η\eta: a vector of length DD containing the model-predicted marginal firing rates ηi=E[xi]\eta_i = E[x_i] and pairwise co-activation rates ηij=E[xixj]\eta_{ij} = E[x_i x_j]. Finally, synthesis.generate_spikes samples RR binary patterns from pp at each time bin, returning an array of shape (T,R,N)(T, R, N).

The natural parameters are organized as a vector of length DD, where

D=N+N(N1)2D = N + \frac{N(N-1)}{2}

for a second-order model. The first NN entries are biases (one per neuron) and the remaining N(N1)/2N(N-1)/2 entries are pairwise coupling parameters (one per neuron pair). Together these parameters fully specify the Ising model probability distribution over spike patterns.

Code Reference

Code Description
synthesis.generate_thetas(N, O, T) Ground-truth natural parameters as Gaussian process, shape (T, D)
synthesis.generate_thetas(N, O, T, mu1=...) mu1: prior mean of the Gaussian process for bias parameters (default -2.0); more negative shifts all firing rates lower
transforms.initialise(N, O) Set up internal mapping matrices for N neurons, order O
transforms.compute_D(N, O) Total number of natural parameters D
transforms.compute_p(theta_t) Probability distribution over all 2^N patterns for one time bin’s parameter vector; shape (2^N,)
p = np.array([transforms.compute_p(theta[t]) for t in range(T)]) Stack per-time-bin distributions; p has shape (T, 2^N) — use p[t] (e.g. p[0]) to select one time bin
transforms.compute_eta(p[t]) Expectation parameters η\eta from one time bin’s p[t]: marginal rates [:N] and pairwise rates [N:]; shape (D,)
synthesis.generate_spikes(p, R) Sample R spike patterns per time bin from p; returns shape (T, R, N)

Exercises

Example: Generate parameters and example spike trains for N=3, O=2, T=1, R=200. Print shapes of theta, p, and sampled spikes spikes.

N, O, T, R = 3, 2, 1, 200

transforms.initialise(N, O)
theta = synthesis.generate_thetas(N, O, T)
p = np.array([transforms.compute_p(theta[t]) for t in range(T)])
spikes = synthesis.generate_spikes(p, R)

print('theta shape:', theta.shape)
print('p shape:    ', p.shape)
print('spikes shape:', spikes.shape)
theta shape: (1, 6)
p shape:     (1, 8)
spikes shape: (1, 200, 3)

Exercise: Change N to 4. Print transforms.compute_D(4, O), which outputs the number of natural parameters DD. Verify it equals N+N(N1)/2N + N(N-1)/2.

Solution
N4 = 4

transforms.initialise(N4, O)
D4 = transforms.compute_D(N4, O)

print('D =', D4, '  expected:', N4 + N4*(N4-1)//2)
D = 10   expected: 10

Exercise: Using N=3, set R=500. Compute the model’s expected firing rates with transforms.compute_eta(p[0])[:N] and compare them to the empirical rates obtained from the sampled spike trains (Hint: spikes.mean(axis=(0, 1))). They should be close.

Solution
N, O, T, R = 3, 2, 1, 500

transforms.initialise(N, O)
theta = synthesis.generate_thetas(N, O, T)
p = np.array([transforms.compute_p(theta[t]) for t in range(T)])
spikes = synthesis.generate_spikes(p, R)
expected_rates = np.array(transforms.compute_eta(p[0]))[:N]

print('Expected rates (model): ', expected_rates.round(3))
print('Empirical firing rates:  ', spikes.mean(axis=(0, 1)).round(3))
Expected rates (model):  [0.087 0.106 0.132]
Empirical firing rates:   [0.094 0.116 0.102]

Exercise: Print the 2N2^N pattern probabilities p[0] and verify they sum to 1. Then call utils.plot_pattern_probs(p[0], N) to visualise the distribution.

Solution
print('Pattern probabilities:', p[0].round(4))
print('Sum:', p[0].sum())

utils.plot_pattern_probs(p[0], N)
Pattern probabilities: [0.7087 0.0667 0.0872 0.1046 0.0058 0.0136 0.0123 0.0011]
Sum: 1.0

Exercise: Plot spikes[0, :30, :].T of the first 30 trials as a 2D binary image to visualize 30 spike patterns. Set xlabel=‘Trial’, ylabel=‘Neuron’. You can use imshow(...,aspect='auto', cmap='binary', interpolation='none') of matplotlib to do this.

Solution
fig, ax = plt.subplots(figsize=(8, 3))
ax.imshow(spikes[0, :30, :].T, aspect='auto', cmap='binary', interpolation='none')
ax.set_xlabel('Trial')
ax.set_ylabel('Neuron')
ax.set_yticks([0, 1, 2])
ax.set_yticklabels(['Neuron 0', 'Neuron 1', 'Neuron 2']);

Exercise: The parameter mu1 is the prior mean of the Gaussian process from which bias parameters are drawn (default -2.0). Change mu1 to -4.0 in synthesis.generate_thetas and compare the empirical firing rates to the default case. Why do all three neurons fire less often?

Solution
theta_low = synthesis.generate_thetas(N, O, T, mu1=-4.0)
p_low = np.array([transforms.compute_p(theta_low[t]) for t in range(T)])
spikes_low = synthesis.generate_spikes(p_low, R)
print('Empirical rates (mu1=-4.0):', spikes_low.mean(axis=(0, 1)).round(3))
Empirical rates (mu1=-4.0): [0.02  0.016 0.022]

Section 2: The Ising Model and Natural Parameters

The Ising model assigns a probability to each of the 2N2^N binary patterns xx based on how much each neuron fires and how often pairs of neurons fire together. Each bias θi\theta_i contributes θixi\theta_i \cdot x_i to a score, thus adding weight when neuron ii fires (xi=1x_i = 1) and zero when it is silent. Each coupling θij\theta_{ij} contributes θijxixj\theta_{ij} \cdot x_i x_j, likewise adding weight only when both neurons ii and jj fire simultaneously. The probability of a pattern is proportional to the exponential of this score:

P(x)exp ⁣(iθixi+i<jθijxixj)P(x) \propto \exp\!\Bigl(\sum_i \theta_i x_i + \sum_{i<j} \theta_{ij} x_i x_j\Bigr)

The exponential is the natural choice: probabilities must be positive (the exponential is always positive), and independent contributions should multiply. If all couplings are zero, exp ⁣(iθixi)\exp\!\bigl(\sum_i \theta_i x_i\bigr) factorizes into a product of per-neuron terms iexp(θixi)\prod_i \exp(\theta_i x_i), exactly as expected for independent neurons. Dividing by the sum over all 2N2^N patterns normalizes the distribution.

First-order parameters θi\theta_i are biases: a large negative value makes neuron ii unlikely to fire in isolation, while a positive value increases its marginal firing probability. Second-order parameters θij\theta_{ij} are couplings: a positive value means neurons ii and jj tend to fire together more than expected from their individual rates, while a negative value introduces anticorrelation.

The DD natural parameters are arranged as a vector with NN biases first (θ1,,θN\theta_1, \ldots, \theta_N), followed by N(N1)/2N(N-1)/2 pairwise couplings in lexicographic order (θ12,θ13,,θ(N1)N\theta_{12}, \theta_{13}, \ldots, \theta_{(N-1)N}). The function transforms.compute_p evaluates this for a given theta and returns a vector of 2N2^N probabilities, one per pattern, ordered to match the rows of transforms.enumerate_patterns(N).

Code Reference

Code Description
theta[:N] First-order parameters (biases), one per neuron
theta[N:] Second-order parameters (couplings), one per neuron pair
transforms.compute_p(theta) Returns 2N2^N pattern probabilities for theta; order matches enumerate_patterns(N)
transforms.enumerate_patterns(N) Matrix of all 2^N binary patterns, shape (2^N, N)

Exercises

Example: For N=2, define the vector θ\theta manually and print all 4 patterns with their probabilities. Check how the choice of θ\theta affects pp.

N, O = 2, 2

transforms.initialise(N, O)
theta_ex = np.array([-1.0, -1.0, 0.0])
p_ex = transforms.compute_p(theta_ex)
patterns = transforms.enumerate_patterns(N)

for i in range(len(p_ex)):
    print(f'Pattern {patterns[i]}: p = {p_ex[i]:.4f}')
Pattern [0 0]: p = 0.5344
Pattern [1 0]: p = 0.1966
Pattern [0 1]: p = 0.1966
Pattern [1 1]: p = 0.0723

Exercise: Set theta_ex = np.array([-3.0, 0.0, 0.0]). Which neuron is less likely to fire? Confirm by calculating the probabilities of all patterns and then summing the probability of all patterns where neuron 0 fires and separately where neuron 1 fires.

Solution
theta_ex = np.array([-3.0, 0.0, 0.0])

p_ex = transforms.compute_p(theta_ex)
patterns = transforms.enumerate_patterns(N)
p_n0 = p_ex[patterns[:, 0] == 1].sum()
p_n1 = p_ex[patterns[:, 1] == 1].sum()

print('P(neuron 0 fires):', p_n0.round(4))
print('P(neuron 1 fires):', p_n1.round(4))
P(neuron 0 fires): 0.0474
P(neuron 1 fires): 0.5

Exercise: Set a strong positive coupling: theta_coupled = np.array([-1.0, -1.0, 2.0]). Compare P(both neurons fire)P(\textrm{both neurons fire}) to the uncoupled case theta_uncoupled = np.array([-1.0, -1.0, 0.0]). Store the two probability distributions as p_uncoupled and p_coupled — the next exercise reuses these names. Then call utils.plot_pattern_comparison([p_uncoupled, p_coupled], ['zero coupling', 'positive coupling'], N) to compare the full distributions side by side.

Solution
theta_coupled = np.array([-1.0, -1.0, 2.0])
theta_uncoupled = np.array([-1.0, -1.0, 0.0])

p_coupled = transforms.compute_p(theta_coupled)
p_uncoupled = transforms.compute_p(theta_uncoupled)
patterns = transforms.enumerate_patterns(N)
both_idx = (patterns[:, 0] == 1) & (patterns[:, 1] == 1)

print('P(both fire) with coupling:   ', p_coupled[both_idx][0].round(4))
print('P(both fire) without coupling:', p_uncoupled[both_idx][0].round(4))
utils.plot_pattern_comparison([p_uncoupled, p_coupled], ['zero coupling', 'positive coupling'], N)
P(both fire) with coupling:    0.3655
P(both fire) without coupling: 0.0723

Exercise: Set theta_neg = np.array([-1.0, -1.0, -2.0]) (negative coupling). Print p(1,1)p(1,1) and p(0,0)p(0,0). Then call utils.plot_pattern_comparison([p_neg, p_uncoupled, p_coupled], ['negative', 'zero', 'positive'], N) to compare all three coupling scenarios side by side. How does anticorrelation change the pattern distribution?

Solution
theta_neg = np.array([-1.0, -1.0, -2.0])

p_neg = transforms.compute_p(theta_neg)
patterns = transforms.enumerate_patterns(N)
neither_fire_idx = (patterns[:, 0] == 0) & (patterns[:, 1] == 0)
both_fire_idx = (patterns[:, 0] == 1) & (patterns[:, 1] == 1)
p_silent = p_neg[neither_fire_idx][0]
p_both = p_neg[both_fire_idx][0]

print('P(both fire):    ', p_both.round(4))
print('P(both silent):  ', p_silent.round(4))
utils.plot_pattern_comparison([p_neg, p_uncoupled, p_coupled], ['negative', 'zero', 'positive'], N)
P(both fire):     0.0104
P(both silent):   0.5701

Exercise: Verify that transforms.compute_p(theta) always produces a valid probability distribution by checking that the probabilities sum to 1.0 for three different theta vectors.

Solution
for theta_test in [
    np.array([0.0, 0.0, 0.0]),
    np.array([-2.0, 1.0, 3.0]),
    np.array([0.5, -0.5, -1.5]),
]:
    p_test = transforms.compute_p(theta_test)
    print(f'sum(p) = {p_test.sum():.8f}')
sum(p) = 1.00000000
sum(p) = 1.00000000
sum(p) = 1.00000000

Section 3: Fitting the Model with the EM Algorithm

Given observed spike patterns, SSLL finds the natural parameters θ\theta that best explain the data using an expectation-maximization (EM) algorithm. The algorithm alternates two steps until convergence:

  • E-step (Expectation): Estimate where θ\theta most likely was at each time bin, given the spike data and the current assumption about how θ\theta evolves over time. The result is not a single value but a posterior, i.e., a probability distribution over θ\theta values that are consistent with the observed spikes. A broad posterior means the data leave many θ\theta values plausible; a narrow one means the spikes pin down θ\theta tightly. Technically, this uses a Kalman filter (forward pass) and a backward smoother to propagate information across time bins.

  • M-step (Maximization): Update the assumption about how much θ\theta is allowed to change from one time bin to the next (the state noise covariance QQ), choosing the value that makes the observed data most probable under the current posterior.

The function ssll.run runs this loop until convergence and returns an EMData container.

The container’s primary result is emd.theta_s, the smoothed posterior mean, shape (T, D). The posterior mean is the average of the posterior distribution — the single θ\theta value that best summarizes what the spike data support at each time bin. The attribute emd.eta_s holds the corresponding expectation parameters: model-predicted firing rates and pairwise co-firing rates.

The log marginal likelihood emd.mll summarizes how well the model fits the data as a single number. The list emd.mll_list records this value at each EM iteration; convergence is visible as a plateau.

Code Reference

Code Description
ssll.run(spikes, order) Fit the model; returns EMData container
ssll.run(spikes, order, EM_Info=False) EM_Info=False suppresses the per-iteration progress bar and final likelihood printout
emd.theta_s Smoothed natural parameters, shape (T, D)
emd.eta_s Smoothed expectation parameters, shape (T, D)
emd.sigma_s Smoothed posterior covariances, shape (T, D, D)
emd.mll Final log marginal likelihood
emd.mll_list Log marginal likelihood at each EM iteration
emd.N, emd.D, emd.T Number of neurons, parameters, time bins

Exercises

Example: Generate ground truth data (spikes) following a Gaussian process and fit a model with three neurons to it (N=3, O=2, T=1, R=300). Print key container attributes and compare inferred theta to ground truth.

N, O, T, R = 3, 2, 1, 300

transforms.initialise(N, O)
theta = synthesis.generate_thetas(N, O, T)
p = np.array([transforms.compute_p(theta[t]) for t in range(T)])
spikes = synthesis.generate_spikes(p, R)
emd = ssll.run(spikes, order=O, EM_Info=False)

print('N =', emd.N, '  D =', emd.D, '  T =', emd.T)
print('Inferred \u03b8:', emd.theta_s[0].round(3))
print('True \u03b8:    ', theta[0].round(3))
N = 3   D = 6   T = 1
Inferred θ: [-2.157 -1.822 -1.747  0.11  -0.435 -0.232]
True θ:     [-2.228 -1.727 -1.816  0.087 -0.298 -0.03 ]

Exercise: Print the mean squared error between emd.theta_s[0] and theta[0].

Solution
mse = np.mean((emd.theta_s[0] - theta[0])**2)

print(f'MSE = {mse:.4f}')
MSE = 0.0131

Exercise: Increase R to 1000 and refit. How does the MSE change?

Solution
spikes_large = synthesis.generate_spikes(p, 1000)
emd_large = ssll.run(spikes_large, order=O, EM_Info=False)
mse_large = np.mean((emd_large.theta_s[0] - theta[0])**2)

print(f'MSE with R=1000: {mse_large:.4f}')
MSE with R=1000: 0.0775

Exercise: Plot emd.mll_list to visualize EM convergence. The values should increase monotonically and plateau.

Solution
fig, ax = plt.subplots(figsize=(7, 3))
ax.plot(emd.mll_list)

ax.set_xlabel('EM iteration')
ax.set_ylabel('Log marginal likelihood')

Exercise: Decrease R to 30 and refit. Inspect how the inferred theta compares to ground truth.

Solution
spikes_tiny = synthesis.generate_spikes(p, 30)
emd_tiny = ssll.run(spikes_tiny, order=O, EM_Info=False)

print(f'MSE with R=30:  {np.mean((emd_tiny.theta_s[0] - theta[0])**2):.4f}')
print('Inferred:', emd_tiny.theta_s[0].round(3))
print('True:    ', theta[0].round(3))
MSE with R=30:  1.3121
Inferred: [-2.861 -1.342 -1.912  1.898 -1.15  -1.849]
True:     [-2.228 -1.727 -1.816  0.087 -0.298 -0.03 ]

Exercise: Print emd.sigma_s.shape. Then print np.sqrt(emd.sigma_s[0, 0, 0]): the posterior standard deviation of the first parameter. How does this compare to the estimation error?

Solution
print('sigma_s shape:', emd.sigma_s.shape)
std_first = np.sqrt(emd.sigma_s[0, 0, 0])
err_first = abs(emd.theta_s[0, 0] - theta[0, 0])
print(f'Posterior std (\u03b8\u2081): {std_first:.4f}')
print(f'Actual error:       {err_first:.4f}')
sigma_s shape: (1, 6, 6)
Posterior std (θ₁): 0.1682
Actual error:       0.0708

Section 4: Expectation Parameters and Population Statistics (Optional)

The expectation parameters η\eta are the expected values of the sufficient statistics under the model: ηi1=E[xi]\eta^1_{i} = E[x_i] is the marginal firing probability of neuron ii, and ηij2=E[xixj]\eta^2_{ij} = E[x_i \cdot x_j] is the joint firing probability of neurons i and j. The EM algorithm ensures that the model’s expectation parameters match the empirical statistics at convergence. Thus emd.eta_s[t, :N] equals the trial-averaged firing rates, and emd.eta_s[t, N:] equals the empirical co-firing rates.

The distinction between natural parameters theta and expectation parameters eta is fundamental to exponential family models. The natural parameters determine the shape of the distribution in an abstract algebraic sense, while the expectation parameters directly correspond to observable statistics of the data. Both representations are useful: θ\theta is informative of the coupling structure, and η\eta directly relates to firing rates and correlations the model predicts from a given realization of that model.

Code Reference

Code Description
emd.eta_s[:, :N] Smoothed firing rates, shape (T, N)
emd.eta_s[:, N:] Smoothed pairwise co-firing rates, shape (T, N*(N-1)//2)
spikes.mean(axis=(0, 1)) Empirical firing rates
(spikes[:, :, i] * spikes[:, :, j]).mean() Empirical co-firing rate of neurons i and j

Exercises

Run the cell below only if you are starting here without having completed Section 3; it regenerates spikes, theta, p, and the fitted emd.

# Realign: run this cell if you are starting Section 4 without having run Section 3
# Sets up: N, O, T, R, theta, p, spikes, emd
N, O, T, R = 3, 2, 1, 300

transforms.initialise(N, O)
theta = synthesis.generate_thetas(N, O, T)
p = np.array([transforms.compute_p(theta[t]) for t in range(T)])
spikes = synthesis.generate_spikes(p, R)
emd = ssll.run(spikes, order=O, EM_Info=False)

print(f'Realign OK — N={N}, T={T}, R={R}, emd fitted')
Realign OK — N=3, T=1, R=300, emd fitted

Example: Compare inferred firing rates emd.eta_s[0, :N] with the empirical rates.

emp_rates = spikes.mean(axis=(0, 1))

print('Empirical rates:  ', emp_rates.round(3))
print('Inferred \u03b7\u2081:      ', emd.eta_s[0, :N].round(3))
Empirical rates:   [0.113 0.133 0.103]
Inferred η₁:       [0.114 0.133 0.104]

Exercise: Print emd.eta_s[0, N:] (pairwise co-firing rates). For pairs (0,1), (0,2), (1,2), compute the empirical p(xi=1,xj=1)p(x_i=1, x_j=1) from the spike data and compare.

Solution
print('Inferred \u03b7\u2082:', emd.eta_s[0, N:].round(3))

for i, j in [(0, 1), (0, 2), (1, 2)]:
    emp = (spikes[:, :, i] * spikes[:, :, j]).mean()
    print(f'Empirical P(x{i}=1, x{j}=1): {emp:.3f}')
Inferred η₂: [0.013 0.012 0.01 ]
Empirical P(x0=1, x1=1): 0.013
Empirical P(x0=1, x2=1): 0.013
Empirical P(x1=1, x2=1): 0.010

Exercise: Refit with order=1 (independent model) and store the result as emd_ind. Do the inferred firing rates emd_ind.eta_s[0, :N] match those of the pairwise model?

Solution
emd_ind = ssll.run(spikes, order=1, EM_Info=False)

print('Independent model \u03b7\u2081:', emd_ind.eta_s[0, :N].round(3))
print('Pairwise model \u03b7\u2081:   ', emd.eta_s[0, :N].round(3))
Independent model η₁: [0.113 0.133 0.104]
Pairwise model η₁:    [0.114 0.133 0.104]

Exercise: Compare the bias parameters emd.theta_s[0, :N] (pairwise) and emd_ind.theta_s[0, :N] (independent). Why might they differ even though both models match the firing rates?

Hint: in the pairwise model each bias must compensate for the coupling’s contribution to the marginal firing rate. A positive coupling between neurons i and j raises their joint probability, which in turn raises neuron i’s marginal rate; the bias θi\theta_i is therefore shifted more negative to keep the marginal rate at the empirical level. The independent model has no such compensation.

Solution
print('Pairwise \u03b8\u2081:    ', emd.theta_s[0, :N].round(3))
print('Independent \u03b8\u2081: ', emd_ind.theta_s[0, :N].round(3))
Pairwise θ₁:     [-2.027 -1.809 -2.111]
Independent θ₁:  [-2.056 -1.871 -2.158]

Exercise: Create a dataset with a known strong positive coupling (theta = [-1.5, -1.5, -1.5, 2.0, 0.0, 0.0]) and fit the model; store the result as emd_strong. Verify that emd_strong.theta_s[0, N] is positive and that emd_strong.eta_s[0, N] is higher than for uncoupled neurons.

Hint: For eta_s, the first index [0] selects the first (and only) time bin (T=1). [N] selects index N=3, which is the first pairwise parameter — the coupling between neurons 0 and 1. This is because [:N] holds biases (individual neurons) and [N:] holds pairwise parameters.

Solution
N, O, T, R = 3, 2, 1, 300

transforms.initialise(N, O)
theta_strong_gt = np.array([-1.5, -1.5, -1.5, 2.0, 0.0, 0.0]).reshape(1, -1)
p_strong = np.array([transforms.compute_p(theta_strong_gt[t]) for t in range(T)])
spikes_strong = synthesis.generate_spikes(p_strong, R)
emd_strong = ssll.run(spikes_strong, order=O, EM_Info=False)

print('Coupling \u03b8\u2081\u2082:       ', emd_strong.theta_s[0, N].round(3))
print('Co-firing rate \u03b7\u2081\u2082: ', emd_strong.eta_s[0, N].round(3))
print('Baseline \u03b7\u2081\u2082:       ', emd.eta_s[0, N].round(3))
Coupling θ₁₂:        1.926
Co-firing rate η₁₂:  0.209
Baseline η₁₂:        0.013

Section 5: Model Comparison: Independent vs Pairwise Interactions (Optional)

The energies module computes thermodynamic quantities that measure how much of the population’s statistical structure is captured by pairwise interactions. The entropy SS of a probability distribution measures its variability: a uniform distribution over all 2N2^N patterns has maximum entropy, while a distribution concentrated on one pattern has zero entropy. The entropy S1S_1 of the independent model is always at least as large as S2S_2 of the pairwise model, because pairwise interactions constrain which patterns are more likely.

The entropy ratio Sratio=(S1S2)/S1S_{\textrm{ratio}} = (S1 - S2) / S1 measures what fraction of the independent model’s uncertainty is explained by pairwise coupling. When SratioS_{\textrm{ratio}} is near zero, pairwise interactions add little beyond the marginal firing rates. When SratioS_{\textrm{ratio}} is large, pairwise structure accounts for a substantial fraction of the population variability. This quantity has been used empirically to assess how close neural populations are to a critical state.

The KL divergence quantifies the information cost of ignoring pairwise interactions: it measures how much the independent model distribution differs from the pairwise model distribution. Large KL divergence means the two models make very different predictions about pattern probabilities.

Code Reference

Code Description
energies.get_energies(emd) Compute and attach thermodynamic quantities to the container
emd.S1 Entropy of independent model, shape (T,)
emd.S2 Entropy of pairwise model, shape (T,)
emd.S_ratio Entropy ratio (S1-S2)/S1, shape (T,)
emd.dkl KL divergence from independent to pairwise model, shape (T,)
emd.llk1, emd.llk2 Log-likelihoods of independent and pairwise models

Exercises

Run the cell below only if you are starting here without having completed Sections 3 and 4; it regenerates spikes, theta, p, emd, emd_ind, and emd_strong.

# Realign: run this cell if you are starting Section 5 without having run Sections 3 and 4
# Sets up: N, O, T, R, theta, p, spikes, emd, emd_ind, emd_strong
N, O, T, R = 3, 2, 1, 300

transforms.initialise(N, O)
theta = synthesis.generate_thetas(N, O, T)
p = np.array([transforms.compute_p(theta[t]) for t in range(T)])
spikes = synthesis.generate_spikes(p, R)
emd = ssll.run(spikes, order=O, EM_Info=False)
emd_ind = ssll.run(spikes, order=1, EM_Info=False)
transforms.initialise(N, O)  # ssll.run(order=1) leaves transforms in O=1 state; re-initialise for O=2
theta_strong_gt = np.array([-1.5, -1.5, -1.5, 2.0, 0.0, 0.0]).reshape(1, -1)
p_strong = np.array([transforms.compute_p(theta_strong_gt[t]) for t in range(T)])
spikes_strong = synthesis.generate_spikes(p_strong, R)
emd_strong = ssll.run(spikes_strong, order=O, EM_Info=False)

print(f'Realign OK — emd, emd_ind, emd_strong ready')
Realign OK — emd, emd_ind, emd_strong ready

Example: Compute thermodynamic quantities for the fitted model and print S1, S2, and the entropy ratio.

energies.get_energies(emd)
print(f'S\u2081 (independent) = {emd.S1[0]:.4f}')
print(f'S\u2082 (pairwise)    = {emd.S2[0]:.4f}')
print(f'S_ratio          = {emd.S_ratio[0]:.4f}')
S₁ (independent) = 1.1515
S₂ (pairwise)    = 1.1485
S_ratio          = 0.0026

Exercise: Print emd.dkl[0]. Does a higher S_ratio correspond to a higher KL divergence?

Solution
print(f'KL divergence: {emd.dkl[0]:.4f}')
KL divergence: 0.0030

Exercise: Compute energies for emd_strong (the strongly coupled model). Compare S_ratio to the weakly coupled case.

Solution
energies.get_energies(emd_strong)
print(f'Weak coupling   S_ratio: {emd.S_ratio[0]:.4f}')
print(f'Strong coupling S_ratio: {emd_strong.S_ratio[0]:.4f}')
Weak coupling   S_ratio: 0.0026
Strong coupling S_ratio: 0.0646

Exercise: Compute energies for emd_ind (the order=1 model fitted in the Expectation Parameters section above). What is S_ratio for a model fitted without pairwise terms?

Solution
energies.get_energies(emd_ind)
print(f'Independent model S_ratio: {emd_ind.S_ratio[0]:.4f}')
Independent model S_ratio: -0.0000

Exercise: Print emd.llk2 - emd.llk1. Does the pairwise model always fit the data better?

Solution
print(f'LL gain (pairwise vs independent): {emd.llk2 - emd.llk1}')
print(f'LL gain (strong coupling):         {emd_strong.llk2 - emd_strong.llk1}')
LL gain (pairwise vs independent): [1.03509562]
LL gain (strong coupling):         [35.61921265]

Exercise: Generate a dataset where all couplings are exactly zero (theta = [-1.5, -1.5, -1.5, 0.0, 0.0, 0.0]). Fit the pairwise model and verify that S_ratio is close to zero.

Solution
theta_ind_gt = np.array([-1.5, -1.5, -1.5, 0.0, 0.0, 0.0]).reshape(1, -1)
p_ind = np.array([transforms.compute_p(theta_ind_gt[0])])
spikes_ind = synthesis.generate_spikes(p_ind, R)
emd_ind_data = ssll.run(spikes_ind, order=O, EM_Info=False)
energies.get_energies(emd_ind_data)

print(f'True independent data S_ratio: {emd_ind_data.S_ratio[0]:.4f}')
True independent data S_ratio: 0.0004

Session 2 extends this framework to time-varying populations: the natural parameters θt\theta_t evolve over time, and state-space smoothing is used to track them dynamically.