
Hopfield Models with Excitatory and Inhibitory Neurons

DH and IN

1. Introduction

Let us for simplicity consider a model with binary variables $\vec{\sigma} = (\sigma_1, \sigma_2, \ldots, \sigma_N)$ with $\sigma_i \in \{-1, 1\}$. Here, $\sigma_i = 1$ represents a neuron with high firing activity, and $\sigma_i = -1$ is a neuron with low activity.

The dynamics is of the form

$$\sigma_i(t) = \operatorname{sign}\!\left(\sum_{j=1}^{N} J_{ij}\,\sigma_j(t-1)\right), \tag{1}$$

where $t \in \mathbb{N}$ is a time index. The fixed points of the dynamics satisfy the equations

$$\sigma^*_i = \operatorname{sign}\!\left(\sum_{j=1}^{N} J_{ij}\,\sigma^*_j\right), \tag{2}$$

so that $\vec{\sigma}(t) = \vec{\sigma}^*$ for all $t \in \mathbb{N}$ if $\vec{\sigma}(0) = \vec{\sigma}^*$.
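The update rule of Eq. (1) and the fixed-point condition of Eq. (2) can be sketched in a few lines of NumPy (a minimal illustration, not from the source; note that `np.sign` maps a zero field to $0$, so we assume all fields are nonzero):

```python
import numpy as np

def step(J, sigma):
    """One synchronous update of Eq. (1): sigma_i(t) = sign(sum_j J_ij sigma_j(t-1))."""
    return np.sign(J @ sigma).astype(int)

def is_fixed_point(J, sigma):
    """Fixed-point condition of Eq. (2): the update leaves sigma unchanged."""
    return np.array_equal(step(J, sigma), sigma)

# Toy example: N = 2 neurons with symmetric positive coupling.
J = np.array([[0.0, 1.0],
              [1.0, 0.0]])
print(is_fixed_point(J, np.array([1, 1])))    # the aligned state is a fixed point
print(is_fixed_point(J, np.array([1, -1])))   # the misaligned state is not
```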

If neuron $j$ is inhibitory, then $J_{ij} < 0$ for all $i \in [N] = \{1, 2, \ldots, N\}$, and if $j$ is excitatory then $J_{ij} > 0$ for all $i \in [N]$.

2. Associative Memory and Hebbian Interactions

In associative memories, we aim to construct a weight matrix $\mathbf{J}$ for which a prescribed number $p$ of target states $\vec{\xi}^{(\alpha)} = (\xi^{(\alpha)}_1, \xi^{(\alpha)}_2, \ldots, \xi^{(\alpha)}_N)$ (with $\xi^{(\alpha)}_i \in \{-1, 1\}$) are fixed points. Hence, we are looking for a matrix $\mathbf{J}$ with entries $J_{ij}$ such that

$$\xi^{(\alpha)}_i = \operatorname{sign}\!\left(\sum_{j=1}^{N} J_{ij}\,\xi^{(\alpha)}_j\right), \tag{3}$$

holds for all $\alpha \in [p] = \{1, 2, \ldots, p\}$.

These $pN$ equations are satisfied if the $\vec{\xi}^{(\alpha)}$ are eigenvectors of the matrix $\mathbf{J}$ with positive eigenvalues, i.e.,

$$\lambda^{(\alpha)}\,\vec{\xi}^{(\alpha)} = \mathbf{J}\,\vec{\xi}^{(\alpha)}, \tag{4}$$

with $\lambda^{(\alpha)} > 0$. Indeed, in this case

$$\operatorname{sign}\!\left(\sum_{j=1}^{N} J_{ij}\,\xi^{(\alpha)}_j\right) = \operatorname{sign}\!\left(\lambda^{(\alpha)}\,\xi^{(\alpha)}_i\right) = \xi^{(\alpha)}_i, \tag{5}$$

holds for all $\alpha \in [p]$, and thus the target states $\vec{\xi}^{(\alpha)}$ are fixed points of the dynamics given by Eq. (1).

If the $\vec{\xi}^{(\alpha)}$ are mutually orthogonal, i.e.,

$$\vec{\xi}^{(\alpha)} \cdot \vec{\xi}^{(\beta)} = N\delta_{\alpha,\beta}, \tag{6}$$

for all $\alpha, \beta \in \{1, 2, \ldots, p\}$, then the following Hebbian matrix

$$J^{\mathrm{Hebb}}_{ij} = \frac{1}{N}\sum_{\alpha=1}^{p}\xi^{(\alpha)}_i\,\xi^{(\alpha)}_j, \tag{7}$$

has the property that all $\vec{\xi}^{(\alpha)}$ are eigenvectors of $\mathbf{J}^{\mathrm{Hebb}}$ with eigenvalue $1$.

Indeed, we can readily verify this:

$$\sum_{j=1}^{N} J^{\mathrm{Hebb}}_{ij}\,\xi^{(\beta)}_j = \frac{1}{N}\sum_{\alpha=1}^{p}\xi^{(\alpha)}_i \sum_{j=1}^{N}\xi^{(\alpha)}_j\,\xi^{(\beta)}_j = \sum_{\alpha=1}^{p}\xi^{(\alpha)}_i\,\delta_{\alpha,\beta} = \xi^{(\beta)}_i. \tag{8}$$

The $\mathbf{J}^{\mathrm{Hebb}}$ in Eq. (7) is the Hebbian interaction matrix of the Hopfield model. The Hebbian structure plus orthogonality is sufficient to guarantee the stability of the target states, which is the main reason why the Hebbian interaction matrix works. In what follows we generalise this to account for the excitatory and inhibitory structure of neurons.
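As a concrete check of Eqs. (6)–(8), here is a minimal NumPy sketch with $N = 8$ and $p = 2$ mutually orthogonal patterns (the specific patterns are an illustrative choice, not from the source):

```python
import numpy as np

# Two mutually orthogonal N = 8 patterns: xi[0] . xi[1] = 0, as in Eq. (6).
xi = np.array([[1, 1, 1, 1, -1, -1, -1, -1],
               [1, 1, -1, -1, 1, 1, -1, -1]])
p, N = xi.shape

# Hebbian matrix of Eq. (7): J_ij = (1/N) sum_alpha xi_i^(alpha) xi_j^(alpha).
J_hebb = (xi.T @ xi) / N

for a in range(p):
    # Each pattern is an eigenvector with eigenvalue 1, as in Eq. (8) ...
    assert np.allclose(J_hebb @ xi[a], xi[a])
    # ... and hence a fixed point of the sign dynamics, as in Eq. (5).
    assert np.array_equal(np.sign(J_hebb @ xi[a]).astype(int), xi[a])
```

Hitting $\mathbf{J}^{\mathrm{Hebb}}$ with $\vec{\xi}^{(\beta)}$ makes the orthogonality kill every cross term, exactly as in Eq. (8).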

3. Associative Neural Networks with Only Excitatory Weights $J_{ij}$

We would like to implement an associative memory that retrieves $p$ prescribed targets $\vec{\xi}^{(\alpha)}$ with an interaction matrix $J^{\mathrm{exc}}_{ij}$, with the additional constraint that $J^{\mathrm{exc}}_{ij} > 0$ for all entries. Note that the Hebbian matrix $J^{\mathrm{Hebb}}_{ij}$ has both positive and negative entries.

The simplest way to implement an associative memory with excitatory couplings is through a rank-1 perturbation of $\mathbf{J}^{\mathrm{Hebb}}$, namely,

$$\mathbf{J}^{\mathrm{exc}} = \mathbf{J}^{\mathrm{Hebb}} + c\,\mathbf{1}, \tag{9}$$

where $c > p/N$ is a positive constant, and $\mathbf{1}$ is the matrix whose entries are all equal to one (since $|J^{\mathrm{Hebb}}_{ij}| \leq p/N$, this choice of $c$ makes every entry of $\mathbf{J}^{\mathrm{exc}}$ strictly positive). Since this is a rank-one perturbation, it only minimally affects the eigenvectors of $\mathbf{J}^{\mathrm{Hebb}}$ if $N$ is large enough. Moreover, if the target states $\vec{\xi}^{(\beta)}$ are orthogonal to $\vec{1}$ (as is the case when half of the entries $\xi^{(\beta)}_i$ equal $1$ and the other half equal $-1$), then they are not affected by the perturbation at all. Indeed, in that case we find that

$$\sum_{j=1}^{N} J^{\mathrm{exc}}_{ij}\,\xi^{(\beta)}_j = \frac{1}{N}\sum_{\alpha=1}^{p}\xi^{(\alpha)}_i\sum_{j=1}^{N}\xi^{(\alpha)}_j\,\xi^{(\beta)}_j + c\sum_{j=1}^{N}\xi^{(\beta)}_j = \sum_{\alpha=1}^{p}\xi^{(\alpha)}_i\,\delta_{\alpha,\beta} = \xi^{(\beta)}_i, \tag{10}$$

as $\sum_{j=1}^{N}\xi^{(\beta)}_j = 0$.
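A quick numerical check of Eqs. (9)–(10), with illustrative balanced patterns (each summing to zero) and the choice $c = 2p/N$, where any $c > p/N$ would do:

```python
import numpy as np

# Balanced orthogonal patterns: each sums to zero, so both are orthogonal to 1.
xi = np.array([[1, 1, 1, 1, -1, -1, -1, -1],
               [1, 1, -1, -1, 1, 1, -1, -1]])
p, N = xi.shape
J_hebb = (xi.T @ xi) / N

c = 2 * p / N                          # any c > p/N works
J_exc = J_hebb + c * np.ones((N, N))   # Eq. (9)

assert np.all(J_exc > 0)               # all couplings are excitatory
for a in range(p):                     # Eq. (10): targets remain fixed points
    assert np.array_equal(np.sign(J_exc @ xi[a]).astype(int), xi[a])
```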

4. Associative Neural Networks with Excitatory Weights and Inhibitory Weights

In this case, we require a specific sign pattern, namely that $J_{ij} > 0$ if $j$ is excitatory, and $J_{ij} < 0$ if $j$ is inhibitory.

Such a pattern can be achieved with another rank-one perturbation of $\mathbf{J}^{\mathrm{Hebb}}$ (the matrix $\mathbf{1}_{+-}$ below has identical rows, so it has rank one),

$$\mathbf{J}^{\mathrm{EI}} = \mathbf{J}^{\mathrm{Hebb}} + c\,\mathbf{1}_{+-}, \tag{11}$$

where $c > p/N$ as before, and $\mathbf{1}_{+-}$ has matrix entries

$$\bigl(\mathbf{1}_{+-}\bigr)_{ij} = \begin{cases} 1 & \text{if } j \leq N/2, \\ -1 & \text{if } j > N/2. \end{cases} \tag{12}$$
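A numerical sketch of Eqs. (11)–(12), analogous to the check in Section 3. Here the patterns are chosen orthogonal to each other and also to the column-sign vector of Eq. (12); the latter is an extra assumption of this illustration (the excerpt ends before stating the condition), playing the role that orthogonality to $\vec{1}$ played in Section 3:

```python
import numpy as np

# Patterns orthogonal to each other AND to the sign vector of Eq. (12)
# (the second condition is an assumption of this sketch, not from the source).
xi = np.array([[1, -1, 1, -1, 1, -1, 1, -1],
               [1, 1, -1, -1, 1, 1, -1, -1]])
p, N = xi.shape
J_hebb = (xi.T @ xi) / N

col_sign = np.where(np.arange(N) < N // 2, 1, -1)  # +1: excitatory, -1: inhibitory
ones_pm = np.tile(col_sign, (N, 1))                # the matrix 1_{+-} of Eq. (12)

c = 2 * p / N                  # c > p/N, as in Section 3
J_ei = J_hebb + c * ones_pm    # Eq. (11)

# Column j of J_ei carries the sign of neuron j, as required ...
assert np.all(J_ei * col_sign > 0)
# ... and the targets remain fixed points, because xi . col_sign = 0
# kills the perturbation term, exactly as 1 . xi = 0 did in Section 3.
for a in range(p):
    assert np.array_equal(np.sign(J_ei @ xi[a]).astype(int), xi[a])
```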

Appendix: Unpacking Sections 1–2

Section 1: what it's actually saying

The first paragraph sets up notation. A network has $N$ neurons, each in state $+1$ or $-1$. This is identical to Amari's setup — his state vector $x = (x_i)$ with $x_i = \pm 1$.

The dynamics in Eq. (1) is exactly Amari's Eq. (3): each neuron looks at the weighted sum of all other neurons' states and fires $+1$ or $-1$ accordingly. The difference is notation — we write $\sigma$ and $J$ where Amari writes $x$ and $W$. The threshold $h_i$ from Amari is absent here because we set all thresholds to zero for simplicity.

Eq. (2) is the fixed point condition: a state $\vec{\sigma}^*$ where applying the dynamics doesn't change anything. This is Amari's Theorem 1 restated — but where Amari derives it from the $u_i$ functions and the $\min(k)$ machinery, here it is written down directly. The fixed point condition is: for every neuron $i$, the sign of the field it sees already matches its current state.

The last paragraph of §1 defines what it means for a neuron to be excitatory or inhibitory. This is the ingredient that Amari doesn't have. The constraint is on the columns of $J$: if neuron $j$ is excitatory, then $J_{ij} > 0$ for every $i$ (everyone receives a positive input from $j$). If $j$ is inhibitory, $J_{ij} < 0$ for every $i$. This is a structural constraint on the weight matrix — it restricts which matrices are allowed.
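The column constraint can be expressed as a small helper (a hypothetical checker for illustration; `satisfies_dale` and the example matrix are not from the source):

```python
import numpy as np

def satisfies_dale(J, is_excitatory):
    """Column-sign constraint of Section 1: column j of J is all-positive
    if presynaptic neuron j is excitatory, all-negative if inhibitory."""
    sign = np.where(is_excitatory, 1, -1)   # one sign per column
    return bool(np.all(J * sign > 0))

# A 3-neuron example: neurons 0 and 1 excitatory, neuron 2 inhibitory.
J = np.array([[0.2, 0.5, -0.3],
              [0.4, 0.1, -0.6],
              [0.3, 0.2, -0.1]])
print(satisfies_dale(J, np.array([True, True, False])))   # True
```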

Section 2: the logic chain

Section 2 has one argument with four steps, interleaved in the text.

Step 1: the goal. Find a weight matrix $J$ such that $p$ chosen patterns $\vec{\xi}^{(\alpha)}$ are all fixed points. Plugging any $\vec{\xi}^{(\alpha)}$ into the dynamics gives back $\vec{\xi}^{(\alpha)}$. This is Eq. (3) — identical to Amari's starting point, just stated for multiple patterns at once.

Step 2: a sufficient condition. If the patterns happen to be eigenvectors of $J$ with positive eigenvalues, then they are automatically fixed points. Why? Because if $\mathbf{J}\,\vec{\xi}^{(\alpha)} = \lambda^{(\alpha)}\vec{\xi}^{(\alpha)}$ with $\lambda^{(\alpha)} > 0$, then the field that neuron $i$ sees is $\lambda^{(\alpha)}\xi_i^{(\alpha)}$, which has the same sign as $\xi_i^{(\alpha)}$, so $\operatorname{sign}(\text{field}) = \xi_i^{(\alpha)}$. That's Eq. (5). This step is pure linear algebra: eigenvectors of a matrix come back scaled, and if the scale factor is positive, the sign is preserved.

Step 3: constructing such a matrix. If the patterns are mutually orthogonal (Eq. (6)), then the Hebbian matrix (Eq. (7)) — the average outer product of the patterns — has exactly this eigenvector property. The verification in Eq. (8) is just multiplying out: hit $\mathbf{J}^{\mathrm{Hebb}}$ with $\vec{\xi}^{(\beta)}$, the orthogonality kills all the cross terms, and you get $\vec{\xi}^{(\beta)}$ back with eigenvalue 1.

Step 4: the bridge to what comes next. The Hebbian matrix works, but it has both positive and negative entries — so it doesn't satisfy Dale's law (the E/I sign constraint from §1). Sections 3–4 fix this by adding low-rank perturbations that enforce the sign constraint without destroying the eigenvector structure.

Why Amari's approach feels different

Amari builds up from the dynamics: he defines the $u_i$ functions (Eq. (4)), which directly measure whether each neuron "agrees" with a target, then defines stability numbers that quantify how robust that agreement is. Everything flows from the update rule through concrete measurable quantities.

Here we take a spectral approach: treat the weight matrix as a linear operator and ask when patterns are eigenvectors. This is more abstract but more powerful for the perturbation theory in §3–4, because adding $c\,\mathbf{1}$ or $c\,\mathbf{1}_{+-}$ is in each case a rank-1 perturbation of the operator, and you can reason about how it affects the eigenvalues and eigenvectors without running the dynamics at all.
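This can be seen numerically: with balanced patterns, the rank-1 shift $c\,\mathbf{1}$ leaves the pattern eigenpairs untouched and only adds one new eigenpair, along the all-ones direction (an illustrative sketch, assuming the setup of Section 3):

```python
import numpy as np

xi = np.array([[1, 1, 1, 1, -1, -1, -1, -1],
               [1, 1, -1, -1, 1, 1, -1, -1]])
p, N = xi.shape
J_hebb = (xi.T @ xi) / N
c = 2 * p / N
J_exc = J_hebb + c * np.ones((N, N))

# The pattern eigenpairs (eigenvalue 1) survive the rank-1 shift,
# because the balanced patterns are orthogonal to the all-ones vector ...
for a in range(p):
    assert np.allclose(J_exc @ xi[a], 1.0 * xi[a])

# ... while the perturbation contributes exactly one new eigenpair,
# along the all-ones direction, with eigenvalue c * N.
ones = np.ones(N)
assert np.allclose(J_exc @ ones, c * N * ones)
```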

The two approaches are complementary. Amari tells you how well the network remembers (stability numbers, basin sizes). The spectral approach tells you why it remembers (eigenvector structure) and gives you a way to modify the matrix while preserving the memory (low-rank perturbations that don't move the eigenvectors).

The connection point

The eigenvalue $\lambda^{(\alpha)}$ from Eq. (4) is directly related to Amari's $u_i$ values. When all eigenvalues are large and positive, the $u_i$'s are all large and positive, which means large stability numbers and wide basins. When you add the E/I perturbation, you may shift some eigenvalues, which changes the $u_i$'s, which changes the stability numbers — and that is exactly what the E/I sweep will measure.