
Commit c9863fb

Fixed typo
1 parent 3ea2cc6 commit c9863fb
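For context: in LaTeX math mode a bare ";" typesets as a literal semicolon (unlike the \; spacing command), so the original source rendered as "where ; w_ij = …". A minimal before/after of the fragment each file below fixes:

% Before: the stray ";" renders as a literal semicolon after "where"
\text{where} ;\ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right)

% After: "\ " leaves only an explicit space, rendering as "where w_ij = …"
\text{where} \ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right)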

8 files changed: +1 −1 in each file (+8 −8 lines total)


authors/chaitanya-joshi/index.xml (+1 −1)

@@ -66,7 +66,7 @@ $$</p>
 i.e.,\ h_{i}^{\ell+1} = \sum_{j \in \mathcal{S}} w_{ij} \left( V^{\ell} h_{j}^{\ell} \right),
 $$</p>
 <p>$$
-\text{where} ;\ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right),
+\text{where} \ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right),
 $$</p>
 <p>where $j \in \mathcal{S}$ denotes the set of words in the sentence and $Q^{\ell}, K^{\ell}, V^{\ell}$ are learnable linear weights (denoting the <strong>Q</strong>uery, <strong>K</strong>ey and <strong>V</strong>alue for the attention computation, respectively).
 The attention mechanism is performed parallelly for each word in the sentence to obtain their updated features in <em>one shot</em>–another plus point for Transformers over RNNs, which update features word-by-word.</p>
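For reference, a minimal NumPy sketch (not part of the commit) of the attention update described in the hunk above, assuming each row of H holds one word's features h_i and Q, K, V are the layer's learnable weight matrices:

import numpy as np

def attention_update(H, Q, K, V):
    # Attention scores: scores[i, j] = (Q h_i) . (K h_j)
    scores = (H @ Q.T) @ (H @ K.T).T
    # Softmax over j (each row), with max-subtraction for numerical stability
    scores -= scores.max(axis=1, keepdims=True)
    w = np.exp(scores)
    w /= w.sum(axis=1, keepdims=True)
    # h_i^{l+1} = sum_j w_ij (V h_j), computed for all words in one shot
    return w @ (H @ V.T)

# Hypothetical example: a 5-word sentence with 8-dimensional features
rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))
Q, K, V = (rng.normal(size=(8, 8)) for _ in range(3))
H_next = attention_update(H, Q, K, V)  # shape (5, 8)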

index.json (+1 −1)

Large diffs are not rendered by default.

post/index.xml (+1 −1)

@@ -66,7 +66,7 @@ $$</p>
 i.e.,\ h_{i}^{\ell+1} = \sum_{j \in \mathcal{S}} w_{ij} \left( V^{\ell} h_{j}^{\ell} \right),
 $$</p>
 <p>$$
-\text{where} ;\ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right),
+\text{where} \ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right),
 $$</p>
 <p>where $j \in \mathcal{S}$ denotes the set of words in the sentence and $Q^{\ell}, K^{\ell}, V^{\ell}$ are learnable linear weights (denoting the <strong>Q</strong>uery, <strong>K</strong>ey and <strong>V</strong>alue for the attention computation, respectively).
 The attention mechanism is performed parallelly for each word in the sentence to obtain their updated features in <em>one shot</em>–another plus point for Transformers over RNNs, which update features word-by-word.</p>

post/transformers-are-gnns/index.html (+1 −1)

@@ -628,7 +628,7 @@ <h3 id="breaking-down-the-transformer">Breaking down the Transformer</h3>
 i.e.,\ h_{i}^{\ell+1} = \sum_{j \in \mathcal{S}} w_{ij} \left( V^{\ell} h_{j}^{\ell} \right),
 $$</p>
 <p>$$
-\text{where} ;\ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right),
+\text{where} \ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right),
 $$</p>
 <p>where $j \in \mathcal{S}$ denotes the set of words in the sentence and $Q^{\ell}, K^{\ell}, V^{\ell}$ are learnable linear weights (denoting the <strong>Q</strong>uery, <strong>K</strong>ey and <strong>V</strong>alue for the attention computation, respectively).
 The attention mechanism is performed parallelly for each word in the sentence to obtain their updated features in <em>one shot</em>&ndash;another plus point for Transformers over RNNs, which update features word-by-word.</p>

tags/deep-learning/index.xml (+1 −1)

@@ -66,7 +66,7 @@ $$&lt;/p&gt;
 i.e.,\ h_{i}^{\ell+1} = \sum_{j \in \mathcal{S}} w_{ij} \left( V^{\ell} h_{j}^{\ell} \right),
 $$&lt;/p&gt;
 &lt;p&gt;$$
-\text{where} ;\ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right),
+\text{where} \ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right),
 $$&lt;/p&gt;
 &lt;p&gt;where $j \in \mathcal{S}$ denotes the set of words in the sentence and $Q^{\ell}, K^{\ell}, V^{\ell}$ are learnable linear weights (denoting the &lt;strong&gt;Q&lt;/strong&gt;uery, &lt;strong&gt;K&lt;/strong&gt;ey and &lt;strong&gt;V&lt;/strong&gt;alue for the attention computation, respectively).
 The attention mechanism is performed parallelly for each word in the sentence to obtain their updated features in &lt;em&gt;one shot&lt;/em&gt;&amp;ndash;another plus point for Transformers over RNNs, which update features word-by-word.&lt;/p&gt;

tags/graph-neural-networks/index.xml (+1 −1)

@@ -66,7 +66,7 @@ $$&lt;/p&gt;
 i.e.,\ h_{i}^{\ell+1} = \sum_{j \in \mathcal{S}} w_{ij} \left( V^{\ell} h_{j}^{\ell} \right),
 $$&lt;/p&gt;
 &lt;p&gt;$$
-\text{where} ;\ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right),
+\text{where} \ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right),
 $$&lt;/p&gt;
 &lt;p&gt;where $j \in \mathcal{S}$ denotes the set of words in the sentence and $Q^{\ell}, K^{\ell}, V^{\ell}$ are learnable linear weights (denoting the &lt;strong&gt;Q&lt;/strong&gt;uery, &lt;strong&gt;K&lt;/strong&gt;ey and &lt;strong&gt;V&lt;/strong&gt;alue for the attention computation, respectively).
 The attention mechanism is performed parallelly for each word in the sentence to obtain their updated features in &lt;em&gt;one shot&lt;/em&gt;&amp;ndash;another plus point for Transformers over RNNs, which update features word-by-word.&lt;/p&gt;

tags/natural-language-processing/index.xml (+1 −1)

@@ -66,7 +66,7 @@ $$&lt;/p&gt;
 i.e.,\ h_{i}^{\ell+1} = \sum_{j \in \mathcal{S}} w_{ij} \left( V^{\ell} h_{j}^{\ell} \right),
 $$&lt;/p&gt;
 &lt;p&gt;$$
-\text{where} ;\ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right),
+\text{where} \ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right),
 $$&lt;/p&gt;
 &lt;p&gt;where $j \in \mathcal{S}$ denotes the set of words in the sentence and $Q^{\ell}, K^{\ell}, V^{\ell}$ are learnable linear weights (denoting the &lt;strong&gt;Q&lt;/strong&gt;uery, &lt;strong&gt;K&lt;/strong&gt;ey and &lt;strong&gt;V&lt;/strong&gt;alue for the attention computation, respectively).
 The attention mechanism is performed parallelly for each word in the sentence to obtain their updated features in &lt;em&gt;one shot&lt;/em&gt;&amp;ndash;another plus point for Transformers over RNNs, which update features word-by-word.&lt;/p&gt;

tags/transformer/index.xml (+1 −1)

@@ -66,7 +66,7 @@ $$&lt;/p&gt;
 i.e.,\ h_{i}^{\ell+1} = \sum_{j \in \mathcal{S}} w_{ij} \left( V^{\ell} h_{j}^{\ell} \right),
 $$&lt;/p&gt;
 &lt;p&gt;$$
-\text{where} ;\ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right),
+\text{where} \ w_{ij} = \text{softmax}_j \left( Q^{\ell} h_{i}^{\ell} \cdot K^{\ell} h_{j}^{\ell} \right),
 $$&lt;/p&gt;
 &lt;p&gt;where $j \in \mathcal{S}$ denotes the set of words in the sentence and $Q^{\ell}, K^{\ell}, V^{\ell}$ are learnable linear weights (denoting the &lt;strong&gt;Q&lt;/strong&gt;uery, &lt;strong&gt;K&lt;/strong&gt;ey and &lt;strong&gt;V&lt;/strong&gt;alue for the attention computation, respectively).
 The attention mechanism is performed parallelly for each word in the sentence to obtain their updated features in &lt;em&gt;one shot&lt;/em&gt;&amp;ndash;another plus point for Transformers over RNNs, which update features word-by-word.&lt;/p&gt;
