The mathematics behind HeiDI

The HeiDI model has four major components: 1) the acquisition of reciprocal associations between stimuli, 2) the pooling of those associations into stimulus activations, 3) the distribution of those activations into stimulus-specific response units, and 4) the generation of responses.

1 - Acquiring reciprocal associations

Whenever a trial is given, HeiDI learns associations among stimuli. The association between two stimuli, i and j, is denoted v_{i,j}. It represents a directional expectation: the expectation of j after being presented with i. Its sign represents the nature of the effect that i has on the representation of j. If positive, the presentation of i “excites” the representation of j. If negative, the presentation of i “inhibits” the representation of j.

HeiDI not only learns “forward” associations between stimuli, but also their reciprocal, or “backward”, associations. Thus, when organisms are presented with i → j, they learn not only about v_{i,j} but also about v_{j,i}, the expectation of receiving i after being presented with j. Note that, for the sake of brevity, the learning equations below are only specified for forward associations.

1.1 - The stimulus expectation rule

HeiDI generates expectations about stimuli. The expectation of stimulus j (e_j) is expressed as

$$ \tag{Eq. 1} e_j = \sum_{k}^{K}x_kv_{k,j} $$

where K is the set containing all stimuli in the experiment, and x_k is a quantity denoting the presence or absence of stimulus k (1 or 0, respectively)¹.
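As an illustration of Eq. 1, the following is a minimal sketch in plain Python. The nested-dictionary layout `v[i][j]`, the function name `expectation`, and all numeric values are assumptions made for this example; they are not part of the model specification.

```python
def expectation(v, present, j):
    """Eq. 1: e_j = sum over all stimuli k of x_k * v[k][j]."""
    # x_k is 1 when stimulus k is present on the trial and 0 otherwise.
    return sum((1 if k in present else 0) * v[k][j] for k in v)


# Hypothetical association strengths; v[i][j] is the association from i to j.
v = {
    "A": {"A": 0.0, "B": 0.0, "US": 0.5},
    "B": {"A": 0.0, "B": 0.0, "US": 0.25},
    "US": {"A": 0.0, "B": 0.0, "US": 0.0},
}

e_compound = expectation(v, present={"A", "B"}, j="US")  # 0.5 + 0.25 = 0.75
e_single = expectation(v, present={"A"}, j="US")         # 0.5
```

Because the expectation pools over every present stimulus, presenting A and B together yields a larger expectation of the US than presenting A alone.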

1.2 - Learning rule

HeiDI learns the appropriate expectations via error-correction mechanisms. After trial t, the association between stimuli i and j is expressed as

$$ \tag{Eq. 2} v_{i,j,t} = v_{i,j,t-1} + \Delta v_{i,j,t} $$

where v_{i,j,t−1} is the forward association between i and j on trial t − 1, and Δv_{i,j,t} is the change in that association as a result of trial t. That delta term uses a pooled error term and is expressed as

$$ \tag{Eq. 3} \Delta v_{i,j} = x_i\alpha_i(x_jc\alpha_j - e_j) $$

where α_i and α_j are parameters representing the salience of stimuli i and j, respectively (0 ≤ α ≤ 1), and c is a scaling constant (c = 1). Note that the term denoting the trial, t, has been omitted here for simplicity.
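The learning rule above can be sketched as a single-trial update in plain Python. This is only an illustration under stated assumptions: the function name `train_trial`, the dictionary layout `v[i][j]`, and the salience values are hypothetical, and self-associations are simply skipped.

```python
def train_trial(v, alphas, present, c=1.0):
    """Apply the delta rule once to every ordered pair of distinct stimuli, in place."""
    stimuli = list(v)
    x = {s: 1.0 if s in present else 0.0 for s in stimuli}
    # Expectations (Eq. 1) use the associations held before this trial.
    e = {j: sum(x[k] * v[k][j] for k in stimuli) for j in stimuli}
    for i in stimuli:
        for j in stimuli:
            if i == j:
                continue  # self-associations are never updated
            v[i][j] += x[i] * alphas[i] * (x[j] * c * alphas[j] - e[j])
    return v


alphas = {"A": 0.5, "US": 1.0}  # hypothetical saliences
v = {"A": {"A": 0.0, "US": 0.0}, "US": {"A": 0.0, "US": 0.0}}

train_trial(v, alphas, present={"A", "US"})
after_one = (v["A"]["US"], v["US"]["A"])  # (0.5, 0.5)
train_trial(v, alphas, present={"A", "US"})
after_two = (v["A"]["US"], v["US"]["A"])  # (0.75, 0.5)
```

In this example the forward association keeps growing towards cα_US = 1, whereas the backward association stops changing once it equals cα_A = 0.5, because its error term is then zero.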

2 - Pooling the strength of associations

HeiDI pools its stimulus associations to activate stimulus-specific representations. The activation of the representation for stimulus j in the presence of stimuli M, a_{j,M}, is defined as:

$$ \tag{Eq. 4} a_{j,M} = o_{j,M} + h_{j,M} $$

where o_{j,M} denotes the combined associative strength towards stimulus j in the presence of stimuli M, and h_{j,M} denotes the chained associative strength towards stimulus j in the presence of stimuli M.

2.1 - Combined associative strength

The quantity o_{j,M} is the result of combining the associative strength of forward and backward associations to and from stimulus j as

$$ \tag{Eq. 5} o_{j,M} = \sum_{m \neq j}^{M}v_{m,j} + \left(\frac{\sum_{m \neq j}^{M}v_{m,j} \sum_{m \neq j}^{M}v_{j,m}}{c}\right) $$

where each of the sums above runs over all stimuli M presented in the trial, other than stimulus j.² The left-hand term describes how the forward associations from stimuli M to j affect the representation of j, whereas the right-hand term describes how the backward associations that j has with stimuli M affect its representation (although these are modulated by the forward associations themselves).
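A short worked example of Eq. 5 in plain Python follows. The function name `combined_strength`, the dictionary layout `v[i][j]`, and the association values are assumptions chosen for illustration.

```python
def combined_strength(v, present, j, c=1.0):
    """Eq. 5: forward strength plus the forward-by-backward product scaled by c."""
    others = [m for m in present if m != j]
    forward = sum(v[m][j] for m in others)   # from present stimuli to j
    backward = sum(v[j][m] for m in others)  # from j to present stimuli
    return forward + (forward * backward) / c


# Hypothetical associations: A -> US is 0.75 and US -> A is 0.5.
v = {"A": {"A": 0.0, "US": 0.75}, "US": {"A": 0.5, "US": 0.0}}
o_us = combined_strength(v, present={"A"}, j="US")  # 0.75 + 0.75 * 0.5 = 1.125
```

Note that when the forward sum is zero the whole quantity is zero, whatever the backward associations are, which is the modulation described above.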

2.2 - Chained associative strength

The quantity h_{j,M} captures the indirect associative strength that the stimuli M have with j, via absent stimuli. As such, h_{j,M} is defined as

$$ \tag{Eq. 6a} h_{j,M} = \sum_{m \neq j}^{M} \sum_{n}^{N}\frac{v_{m,n}o_{j,n}}{c} $$

where N is the set of stimuli not presented on the trial (i.e., K − M). Note the re-use of o, the quantity defined in Eq. 5. This equation allows absent stimuli N to influence the representation of stimulus j, as long as they have an association with present stimuli M.
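To make Eq. 6a concrete, here is a hedged sketch in plain Python. It assumes that o_{j,n} is Eq. 5 evaluated with the absent stimulus n as the only source; that reading, the function names, and the association values are assumptions of this example rather than statements from the source.

```python
def combined_strength(v, sources, j, c=1.0):
    """Eq. 5, with `sources` standing in for the set the sums run over."""
    others = [m for m in sources if m != j]
    forward = sum(v[m][j] for m in others)
    backward = sum(v[j][m] for m in others)
    return forward + (forward * backward) / c


def chained_strength(v, present, j, c=1.0):
    """Eq. 6a: strength reaching j from present stimuli through absent ones."""
    absent = [n for n in v if n not in present]  # N = K - M
    total = 0.0
    for m in present:
        if m == j:
            continue
        for n in absent:
            # Assumption: o_{j,n} is Eq. 5 evaluated with n as the only source.
            total += v[m][n] * combined_strength(v, [n], j, c) / c
    return total


# Hypothetical associations: A predicts B, and B and US predict each other.
v = {
    "A": {"A": 0.0, "B": 0.5, "US": 0.0},
    "B": {"A": 0.0, "B": 0.0, "US": 0.5},
    "US": {"A": 0.0, "B": 0.5, "US": 0.0},
}
h_us = chained_strength(v, present={"A"}, j="US")  # 0.5 * (0.5 + 0.25) = 0.375
```

Here A has no direct association with the US, yet it still contributes to the activation of the US through the absent stimulus B.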

In Honey and Dwyer (2022), the authors specify a similarity-based mechanism that modulates the effect of associative chains according to the similarity of the salience of nominal and retrieved stimuli³. As such, Eq. 6a is expanded as:

$$ \tag{Eq. 6b} h_{j,M} = \sum_{m \neq j}^{M} \sum_{n}^{N}S(\alpha_{n}, \alpha'_n)\frac{v_{m,n}o_{j,n}}{c} $$

where S is a similarity function that takes the nominal salience of stimulus n, α_n (as perceived when n is presented on a trial), and its retrieved salience, α′_n (as perceived when n is retrieved via other stimuli M, see below). This function is defined as:

$$ \tag{Eq. 7} S(\alpha_n, \alpha'_n) = \frac{\alpha_n}{\alpha_n + |\alpha_n-\alpha'_n|} \times \frac{\alpha'_n}{\alpha'_n+ |\alpha_n-\alpha'_n|} $$

Notably, whenever there is more than one nominal salience for a given stimulus, α_n is the arithmetic mean of all nominal values (see “heidi_similarity” vignette).
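The similarity function in Eq. 7 can be sketched directly in plain Python. The function name `similarity` and the example saliences are hypothetical; the sketch also assumes at least one of the two saliences is positive, since the expression is undefined when both are zero.

```python
def similarity(alpha_nominal, alpha_retrieved):
    """Eq. 7: product of two ratios, each penalised by the salience difference."""
    diff = abs(alpha_nominal - alpha_retrieved)
    left = alpha_nominal / (alpha_nominal + diff)
    right = alpha_retrieved / (alpha_retrieved + diff)
    return left * right


s_equal = similarity(0.5, 0.5)    # identical saliences give 1.0
s_lower = similarity(0.5, 0.25)   # (0.5 / 0.75) * (0.25 / 0.5), about 0.333
```

The function equals 1 when the nominal and retrieved saliences match and shrinks as they diverge, so it scales down the contribution of associative chains whose retrieved stimuli are perceived differently from their nominal counterparts.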

3 - Distributing strength into stimulus-specific response units

HeiDI then distributes the pooled stimulus-specific strength among all K stimuli, according to their relative salience. The activation of response unit j, R_j, is expressed as

$$ \tag{Eq. 8} R_{j,k} = \frac{\theta(j)}{\sum_{k}^{K}\theta(k)}a_{k,M} $$

where j ∈ K. As K can include both present and absent stimuli, the θ function above depends on whether the stimulus k is absent (i.e., k ∈ N) or not (i.e., k ∈ M), as:

$$ \tag{Eq. 9} \theta(k) = \begin{cases} \left |\sum_{m}^{M}\left( v_{m,k}+\sum_{n \neq k}^{N}\frac{v_{m,n}v_{n,k}}{c}\right) \right|,& \text{if } k \in N\\ \alpha_k, & \text{otherwise} \end{cases} $$

Note that the quantity for absent stimuli is an absolute value, which prevents negative θ values due to inhibitory associations⁴. Also note that the expression for an absent stimulus sums over m, which implies that all the present stimuli M contribute to the salience of stimulus k. Finally, note that, within that sum, the present stimuli contribute not only via the direct association each of them has with k, v_{m,k}, but also through associative chains with other absent stimuli (cf. Eq. 6a).
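The θ function and the proportional split in Eq. 8 can be sketched as follows in plain Python. The sketch treats the activation being distributed as a single number supplied by the caller, which is one reading of Eq. 8; the function names, that reading, and all values are assumptions of this example. It also assumes the θ values do not sum to zero.

```python
def theta(v, alphas, present, k, c=1.0):
    """Eq. 9: salience for present stimuli, retrieved strength for absent ones."""
    if k in present:
        return alphas[k]
    absent = [n for n in v if n not in present and n != k]
    total = 0.0
    for m in present:
        chains = sum(v[m][n] * v[n][k] for n in absent) / c
        total += v[m][k] + chains
    return abs(total)  # absolute value keeps theta non-negative


def distribute(v, alphas, present, activation, c=1.0):
    """Eq. 8: split one activation value across all response units."""
    thetas = {k: theta(v, alphas, present, k, c) for k in v}
    denom = sum(thetas.values())
    return {j: thetas[j] / denom * activation for j in v}


alphas = {"A": 0.5, "US": 1.0}  # hypothetical saliences
v = {"A": {"A": 0.0, "US": 0.75}, "US": {"A": 0.5, "US": 0.0}}
R = distribute(v, alphas, present={"A"}, activation=1.125)
# theta(A) = 0.5 and theta(US) = |0.75| = 0.75, so the shares are 0.4 and 0.6.
```

Because the shares sum to one, the response units together receive exactly the activation that was distributed; an inhibitory association of −0.75 from A to the US would yield the same θ for the US as the excitatory one used here.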

4 - Generating responses

Finally, HeiDI responds. The response-generating mechanisms in HeiDI are currently underspecified. In its current version, HeiDI’s responses are the product of the activation of stimulus-specific response units and the connection that those units have with specific motor units. As such, the activation of motor unit q, r_q, is given by

$$ \tag{Eq. 10} r_q = R_jw_{j,q} $$

where w_{j,q} is a weight representing the association between stimulus-specific unit j and motor unit q.

¹ We go the extra length of specifying x quantities because the stimulus expectation and learning rules can be vectorized, as e = xV and ΔV = (x ⊙ a)′(c(x ⊙ a) − e), respectively. Here, the matrix V contains all associations between each pair of stimuli, the row vectors $\textbf x$ and $\textbf a$ denote the presence and salience of all stimuli K, the ⊙ symbol specifies element-wise multiplication, and the ′ symbol denotes transposition. Note further that the ΔV matrix must be made hollow before adding it to V.
² An alternative formulation of this equation could be $\sum_{m \neq j}^{M} v_{m,j} + (v_{m,j} v_{j,m})$ but, although this alternative formulation is positively related to Eq. 5, we have not compared their behavior exhaustively.
³ This mechanism is in model HD2022 but not in model HDI2020.
⁴ An alternative and perhaps more naturalistic parametrization of this rule would be to use max[0, θ(n)], where max is the maximum function and n is an absent stimulus; this is the rectified linear unit (ReLU) used extensively in neural networks. Another alternative that avoids the use of absolute values or a rectifying mechanism would be to use quantities of e^{θ(k)} instead of θ(k).
