[インデックス 16894] ファイルの概要

このコミットは、Go言語の標準ライブラリ compress/flate パッケージにおけるエンコーダのメモリ割り当て（アロケーション）を削減することを目的としています。特に、ハフマン符号化の過程で発生する小さなオブジェクトの頻繁な割り当てを減らすことで、パフォーマンスの向上とメモリ使用量の削減を図っています。

コミット

commit 05026c4ebd67ccf82d5cb2238bdebce8f0fde363
Author: Rémy Oudompheng <oudomphe@phare.normalesup.org>
Date:   Sun Jul 28 09:42:40 2013 +0200

    compress/flate: reduce tiny allocs done by encoder.
    
    benchmark                          old allocs   new allocs    delta
    BenchmarkEncodeDigitsSpeed1e4             942           91  -90.34%
    BenchmarkEncodeDigitsSpeed1e5            1919          178  -90.72%
    BenchmarkEncodeDigitsDefault1e4           734          100  -86.38%
    BenchmarkEncodeDigitsDefault1e5          1958          193  -90.14%
    BenchmarkEncodeDigitsCompress1e4          734          100  -86.38%
    BenchmarkEncodeDigitsCompress1e5         1958          193  -90.14%
    BenchmarkEncodeTwainSpeed1e4             1865          109  -94.16%
    BenchmarkEncodeTwainSpeed1e5             3943          211  -94.65%
    BenchmarkEncodeTwainDefault1e4           1811          103  -94.31%
    BenchmarkEncodeTwainDefault1e5           3708          199  -94.63%
    BenchmarkEncodeTwainCompress1e4          1811          103  -94.31%
    BenchmarkEncodeTwainCompress1e5          3693          190  -94.86%
    
    benchmark                           old bytes    new bytes    delta
    BenchmarkEncodeDigitsSpeed1e4         1469438      1453920   -1.06%
    BenchmarkEncodeDigitsSpeed1e5         1490898      1458961   -2.14%
    BenchmarkEncodeDigitsSpeed1e6         1858819      1542407  -17.02%
    BenchmarkEncodeDigitsDefault1e4       1465903      1454160   -0.80%
    BenchmarkEncodeDigitsDefault1e5       1491841      1459361   -2.18%
    BenchmarkEncodeDigitsDefault1e6       1825424      1531545  -16.10%
    BenchmarkEncodeDigitsCompress1e4      1465903      1454160   -0.80%
    BenchmarkEncodeDigitsCompress1e5      1491681      1459361   -2.17%
    BenchmarkEncodeDigitsCompress1e6      1825424      1531545  -16.10%
    BenchmarkEncodeTwainSpeed1e4          1485308      1454400   -2.08%
    BenchmarkEncodeTwainSpeed1e5          1526065      1459878   -4.34%
    BenchmarkEncodeTwainSpeed1e6          2066627      1536296  -25.66%
    BenchmarkEncodeTwainDefault1e4        1484380      1454240   -2.03%
    BenchmarkEncodeTwainDefault1e5        1521793      1459558   -4.09%
    BenchmarkEncodeTwainDefault1e6        1977504      1523388  -22.96%
    BenchmarkEncodeTwainCompress1e4       1484380      1454240   -2.03%
    BenchmarkEncodeTwainCompress1e5       1521457      1459318   -4.08%
    BenchmarkEncodeTwainCompress1e6       1980000      1523609  -23.05%
    
    benchmark                           old ns/op    new ns/op    delta
    BenchmarkEncodeDigitsSpeed1e4         1472128      1384343   -5.96%
    BenchmarkEncodeDigitsSpeed1e5         8283663      8112304   -2.07%
    BenchmarkEncodeDigitsSpeed1e6        77459311     76364216   -1.41%
    BenchmarkEncodeDigitsDefault1e4       1813090      1746552   -3.67%
    BenchmarkEncodeDigitsDefault1e5      26221292     26052516   -0.64%
    BenchmarkEncodeDigitsDefault1e6     286512472    286099039   -0.14%
    BenchmarkEncodeDigitsCompress1e4      1809373      1747230   -3.43%
    BenchmarkEncodeDigitsCompress1e5     26231580     26038456   -0.74%
    BenchmarkEncodeDigitsCompress1e6    286140002    286025372   -0.04%
    BenchmarkEncodeTwainSpeed1e4          1594094      1438600   -9.75%
    BenchmarkEncodeTwainSpeed1e5          7669724      7316288   -4.61%
    BenchmarkEncodeTwainSpeed1e6         68731353     65938994   -4.06%
    BenchmarkEncodeTwainDefault1e4        2063497      1866488   -9.55%
    BenchmarkEncodeTwainDefault1e5       22602689     22221377   -1.69%
    BenchmarkEncodeTwainDefault1e6      233376842    232114297   -0.54%
    BenchmarkEncodeTwainCompress1e4       2062441      1949676   -5.47%
    BenchmarkEncodeTwainCompress1e5      28264344     27930627   -1.18%
    BenchmarkEncodeTwainCompress1e6     304369641    303704330   -0.22%
    
    benchmark                            old MB/s     new MB/s  speedup
    BenchmarkEncodeDigitsSpeed1e4            6.79         7.22    1.06x
    BenchmarkEncodeDigitsSpeed1e5           12.07        12.33    1.02x
    BenchmarkEncodeDigitsSpeed1e6           12.91        13.10    1.01x
    BenchmarkEncodeDigitsDefault1e4          5.52         5.73    1.04x
    BenchmarkEncodeDigitsDefault1e5          3.81         3.84    1.01x
    BenchmarkEncodeDigitsDefault1e6          3.49         3.50    1.00x
    BenchmarkEncodeDigitsCompress1e4         5.53         5.72    1.03x
    BenchmarkEncodeDigitsCompress1e5         3.81         3.84    1.01x
    BenchmarkEncodeDigitsCompress1e6         3.49         3.50    1.00x
    BenchmarkEncodeTwainSpeed1e4             6.27         6.95    1.11x
    BenchmarkEncodeTwainSpeed1e5            13.04        13.67    1.05x
    BenchmarkEncodeTwainSpeed1e6            14.55        15.17    1.04x
    BenchmarkEncodeTwainDefault1e4           4.85         5.36    1.11x
    BenchmarkEncodeTwainDefault1e5           4.42         4.50    1.02x
    BenchmarkEncodeTwainDefault1e6           4.28         4.31    1.01x
    BenchmarkEncodeTwainCompress1e4          4.85         5.13    1.06x
    BenchmarkEncodeTwainCompress1e5          3.54         3.58    1.01x
    BenchmarkEncodeTwainCompress1e6          3.29         3.29    1.00x
    
    R=imkrasin, golang-dev, bradfitz, r
    CC=golang-dev
    https://golang.org/cl/10006043

GitHub上でのコミットページへのリンク

https://github.com/golang/go/commit/05026c4ebd67ccf82d5cb2238bdebce8f0fde363

このコミットの目的は、compress/flate パッケージのエンコーダが実行する小さなメモリ割り当ての数を削減することです。コミットメッセージには、様々なベンチマーク結果が示されており、allocs (割り当て回数) が大幅に削減されていることが明確に示されています。具体的には、ベンチマークによっては90%以上の割り当て削減が達成されており、これに伴い、bytes (割り当てられたバイト数) と ns/op (操作あたりのナノ秒、つまり実行時間) も改善されています。これは、Goのガベージコレクションの負荷を軽減し、全体的なパフォーマンスを向上させる効果が期待されます。

変更の背景

Go言語のランタイムは、ガベージコレクション（GC）によってメモリ管理を行っています。プログラムが頻繁に小さなオブジェクトを割り当てると、GCがより頻繁に実行され、その結果、プログラムの実行が一時的に停止する「GCストップ・ザ・ワールド」の時間が長くなる可能性があります。これは、特に低レイテンシが求められるアプリケーションや、大量のデータを処理するシステムにおいて、パフォーマンスのボトルネックとなることがあります。

compress/flate パッケージは、DEFLATEアルゴリズム（ZIP、gzipなどで使用される圧縮アルゴリズム）の実装を提供しており、データ圧縮において重要な役割を担っています。このパッケージのエンコーダが、ハフマン符号化の過程で多数の小さなオブジェクトを生成していることが判明しました。これらの小さなオブジェクトは、ハフマンツリーの構築や管理に使用されるデータ構造に関連していると考えられます。

このコミットは、これらの頻繁な小さな割り当てを特定し、それらを削減することで、エンコーダの効率を向上させ、Goプログラム全体のパフォーマンスに貢献することを目的としています。具体的には、ヒープ割り当てを減らし、スタック割り当てや既存のメモリの再利用を促進することで、GCのオーバーヘッドを低減しようとしています。

前提知識の解説

1. DEFLATEアルゴリズムとFlate圧縮

DEFLATEは、LZ77アルゴリズムとハフマン符号化を組み合わせた可逆データ圧縮アルゴリズムです。

LZ77: 繰り返し出現するバイト列を、以前に出現した同じバイト列への参照（オフセットと長さ）で置き換えることでデータを圧縮します。
ハフマン符号化: 各シンボル（バイトやLZ77の参照）の出現頻度に基づいて、可変長のビット列を割り当てることでデータを圧縮します。出現頻度の高いシンボルには短いビット列を、低いシンボルには長いビット列を割り当てます。これにより、平均ビット長を最小化し、高い圧縮率を実現します。

Goの compress/flate パッケージは、このDEFLATEアルゴリズムの実装を提供しており、データの圧縮・伸長に利用されます。

2. ハフマン符号化のツリー構築

ハフマン符号化では、各シンボルの頻度に基づいて二分木（ハフマンツリー）を構築します。このツリーは、最も頻度の低い2つのシンボルを結合して新しいノードを作成し、それを繰り返すことで構築されます。このプロセスは、すべてのシンボルが1つのツリーに結合されるまで続きます。

ツリー構築のアルゴリズムは、通常、優先度キュー（またはソートされたリスト）を使用して、最も頻度の低いノードを効率的に選択します。各ステップで2つのノードが結合されるたびに、新しいノードが生成され、ツリーの構造が変化します。このノードの生成と管理が、メモリ割り当ての発生源となります。

3. Go言語のメモリ割り当てとガベージコレクション

Go言語は、自動メモリ管理（ガベージコレクション）を採用しています。

ヒープ割り当て: new や make を使用して作成されたオブジェクトは、ヒープ領域に割り当てられます。ヒープに割り当てられたオブジェクトは、GCによって管理され、不要になった時点で回収されます。
スタック割り当て: 関数内で宣言されたローカル変数など、生存期間が短いオブジェクトは、コンパイラによってスタックに割り当てられることがあります（エスケープ解析の結果による）。スタックに割り当てられたオブジェクトは、関数が終了すると自動的に解放されるため、GCの対象になりません。
小さな割り当てのオーバーヘッド: 非常に小さなオブジェクトを頻繁にヒープに割り当てると、GCのオーバーヘッドが増大します。これは、GCがオブジェクトの追跡、マーク、スイープといった処理を行う必要があるためです。小さなオブジェクトの割り当てが多数発生すると、GCの実行頻度が高まり、アプリケーションのパフォーマンスに悪影響を与える可能性があります。

このコミットは、ハフマンツリー構築の過程で発生する小さなオブジェクトのヒープ割り当てを、より効率的な方法（例えば、スタック割り当てや既存の配列の再利用）に置き換えることで、GCの負荷を軽減し、パフォーマンスを向上させることを目指しています。

技術的詳細

このコミットの技術的な核心は、ハフマン符号化のツリー構築アルゴリズムにおけるデータ構造の変更と、それに伴うメモリ割り当ての最適化です。

変更前は、ハフマンツリーの構築過程で chain という構造体が頻繁に割り当てられていました。chain 構造体は、ツリーのノード間の関係や、特定のレベルでの葉の数を追跡するために使用されていました。また、levelInfo 構造体も chain へのポインタや、他の levelInfo へのポインタ（up, down）を持っており、これらも間接的に割り当てを誘発していました。

コミットの変更点を見ると、以下の点が挙げられます。

chain 構造体の削除: type chain struct が完全に削除されました。これは、ハフマンツリーの構築ロジックにおいて、chain オブジェクトの動的な割り当てが不要になったことを意味します。
levelInfo 構造体の変更:
- lastChain *chain フィールドが lastFreq int32 に変更されました。これにより、chain オブジェクトへのポインタが不要になり、chain の割り当てがなくなりました。
- up *levelInfo および down *levelInfo フィールドが削除されました。これにより、levelInfo オ構造体間のポインタによるリンク構造が削除され、levelInfo オブジェクトの動的なリンク構築に伴う割り当てが削減されました。
固定サイズ配列の導入:
- var levels [maxBitsLimit]levelInfo が導入されました。maxBitsLimit は 16 に設定されており、これはハフマン符号の最大ビット長が15（RFC 1951で定義されているDEFLATEの最大ビット長）であることを考慮したものです。これにより、levelInfo オブジェクトがヒープに動的に割り当てられる代わりに、スタック上に固定サイズの配列として確保されるようになりました。
- var leafCounts [maxBitsLimit][maxBitsLimit]int32 が導入されました。これは、各レベルにおける葉の数を追跡するための二次元配列で、これもスタック上に固定サイズで確保されます。

これらの変更により、ハフマンツリー構築のアルゴリズムが、動的なリンクリストやツリー構造の代わりに、固定サイズの配列ベースの処理に切り替わったことがわかります。これにより、ツリー構築中に発生する多数の小さなオブジェクトのヒープ割り当てが劇的に削減され、その結果、ガベージコレクションの頻度とオーバーヘッドが低減されます。

ベンチマーク結果が示すように、allocs の削減が最も顕著であり、これはこの最適化の直接的な効果です。bytes と ns/op の改善は、GCの負荷軽減と、よりキャッシュフレンドリーなデータアクセスパターンによる副次的な効果と考えられます。

コアとなるコードの変更箇所

変更は src/pkg/compress/flate/huffman_code.go ファイルに集中しています。

--- a/src/pkg/compress/flate/huffman_code.go
+++ b/src/pkg/compress/flate/huffman_code.go
@@ -19,23 +19,13 @@ type literalNode struct {
 	tfreq    int32
  }
  
-type chain struct {
-	// The sum of the leaves in this tree
-	tfreq int32
-
-	// The number of literals to the left of this item at this level
-	tleafCount int32
-
-	// The right child of this chain in the previous level.
-	tup *chain
-}
-
+// A levelInfo describes the state of the constructed tree for a given depth.
 type levelInfo struct {
 	// Our level.  for better printing
 	tlevel int32
  
-	// The most recent chain generated for this level
-	tlastChain *chain
+	// The frequency of the last node at this level
+	tlastFreq int32
  
 	// The frequency of the next character to add to this level
 	tnextCharFreq int32
@@ -47,12 +37,6 @@ type levelInfo struct {
 	// The number of chains remaining to generate for this level before moving
 	// up to the next level
 	tneeded int32
-
-	// The levelInfo for level+1
-	tup *levelInfo
-
-	// The levelInfo for level-1
-	tdown *levelInfo
  }
  
  func maxNode() literalNode { return literalNode{math.MaxUint16, math.MaxInt32} }
@@ -121,6 +105,8 @@ func (h *huffmanEncoder) bitLength(freq []int32) int64 {
  	return total
  }
  
+const maxBitsLimit = 16
+
  // Return the number of literals assigned to each bit size in the Huffman encoding
  //
  // This method is only called when list.length >= 3
@@ -131,9 +117,13 @@ func (h *huffmanEncoder) bitLength(freq []int32) int64 {
  //             frequency, and has as its last element a special element with frequency
  //             MaxInt32
  // maxBits     The maximum number of bits that should be used to encode any literal.
+//             Must be less than 16.
  // return      An integer array in which array[i] indicates the number of literals
  //             that should be encoded in i bits.
  func (h *huffmanEncoder) bitCounts(list []literalNode, maxBits int32) []int32 {
+\tif maxBits >= maxBitsLimit {
+\t\tpanic("flate: maxBits too large")
+\t}\n \tn := int32(len(list))\n  	list = list[0 : n+1]\n  	list[n] = maxNode()\n@@ -148,53 +138,61 @@ func (h *huffmanEncoder) bitCounts(list []literalNode, maxBits int32) []int32 {
  	// A bogus "Level 0" whose sole purpose is so that
  	// level1.prev.needed==0.  This makes level1.nextPairFreq
  	// be a legitimate value that never gets chosen.\n-\ttop := &levelInfo{needed: 0}\n-\tchain2 := &chain{list[1].freq, 2, new(chain)}\n+\tvar levels [maxBitsLimit]levelInfo\n+\t// leafCounts[i] counts the number of literals at the left\n+\t// of ancestors of the rightmost node at level i.\n+\t// leafCounts[i][j] is the number of literals at the left\n+\t// of the level j ancestor.\n+\tvar leafCounts [maxBitsLimit][maxBitsLimit]int32\n+\n  	for level := int32(1); level <= maxBits; level++ {\n  		// For every level, the first two items are the first two characters.\n  		// We initialize the levels as if we had already figured this out.\n-\t\ttop = &levelInfo{\n+\t\tlevels[level] = levelInfo{\n  		\tlevel:        level,\n-\t\t\tlastChain:    chain2,\n+\t\t\tlastFreq:     list[1].freq,\n  		\tnextCharFreq: list[2].freq,\n  		\tnextPairFreq: list[0].freq + list[1].freq,\n-\t\t\tdown:         top,\n \t\t}\n-\t\ttop.down.up = top\n+\t\tleafCounts[level][level] = 2\n  		\tif level == 1 {\n-\t\t\t\ttop.nextPairFreq = math.MaxInt32\n+\t\t\t\tlevels[level].nextPairFreq = math.MaxInt32\n  		\t}\n  	}\n  
  	// We need a total of 2*n - 2 items at top level and have already generated 2.\n-\ttop.needed = 2*n - 4\n+\tlevels[maxBits].needed = 2*n - 4\n  
-\tl := top\n+\tlevel := maxBits\n  	for {\n+\t\tl := &levels[level]\n  		\tif l.nextPairFreq == math.MaxInt32 && l.nextCharFreq == math.MaxInt32 {\n  		\t\t// We\'ve run out of both leafs and pairs.\n  		\t\t// End all calculations for this level.\n-\t\t\t// To m sure we never come back to this level or any lower level,\n+\t\t\t// To make sure we never come back to this level or any lower level,\n  		\t\t// set nextPairFreq impossibly large.\n-\t\t\t\tl.lastChain = nil\n  		\t\tl.needed = 0\n-\t\t\t\tl = l.up\n-\t\t\t\tl.nextPairFreq = math.MaxInt32\n+\t\t\t\tlevels[level+1].nextPairFreq = math.MaxInt32\n+\t\t\t\tlevel++\n  		\t\tcontinue\n  		\t}\n  
-\t\tprevFreq := l.lastChain.freq\n+\t\tprevFreq := l.lastFreq\n  		\tif l.nextCharFreq < l.nextPairFreq {\n  		\t\t// The next item on this row is a leaf node.\n-\t\t\t\tn := l.lastChain.leafCount + 1\n-\t\t\t\tl.lastChain = &chain{l.nextCharFreq, n, l.lastChain.up}\n+\t\t\t\tn := leafCounts[level][level] + 1\n+\t\t\t\tl.lastFreq = l.nextCharFreq\n+\t\t\t\t// Lower leafCounts are the same of the previous node.\n+\t\t\t\tcopy(leafCounts[level][:level], leafCounts[level-1][:level])\n+\t\t\t\tleafCounts[level][level] = n\n  		\t\tl.nextCharFreq = list[n].freq\n  		\t} else {\n  		\t\t// The next item on this row is a pair from the previous row.\n  		\t\t// nextPairFreq isn\'t valid until we generate two\n  		\t\t// more values in the level below\n-\t\t\t\tl.lastChain = &chain{l.nextPairFreq, l.lastChain.leafCount, l.down.lastChain}\n-\t\t\t\tl.down.needed = 2\n+\t\t\t\tl.lastFreq = l.nextPairFreq\n+\t\t\t\t// Take leaf counts from the lower level, except counts[level] remains the same.\n+\t\t\t\tcopy(leafCounts[level][:level], leafCounts[level-1][:level])\n+\t\t\t\tlevels[l.level-1].needed = 2\n  		\t}\n  
  		\tif l.needed--; l.needed == 0 {\n  		\t\t// Continue calculating one level up.  Fill in nextPairFreq\n  		\t\t// of that level with the sum of the two nodes we\'ve just calculated on\n  		\t\t// this level.\n-\t\t\t\tup := l.up\n-\t\t\t\tif up == nil {\n+\t\t\t\tif l.level == maxBits {\n  		\t\t\t// All done!\n  		\t\t\tbreak\n  		\t\t}\n-\t\t\t\tup.nextPairFreq = prevFreq + l.lastChain.freq\n-\t\t\t\tl = up\n+\t\t\t\tlevels[l.level+1].nextPairFreq = prevFreq + l.lastFreq\n+\t\t\t\tlevel++\n  		\t} else {\n  		\t\t// If we stole from below, move down temporarily to replenish it.\n-\t\t\t\tfor l.down.needed > 0 {\n-\t\t\t\t\tl = l.down\n+\t\t\t\tfor levels[level-1].needed > 0 {\n+\t\t\t\t\tlevel--\n  		\t\t}\n  		\t}\n  		}\n  
  	// Somethings is wrong if at the end, the top level is null or hasn\'t used\n  	// all of the leaves.\n-\tif top.lastChain.leafCount != n {\n-\t\tpanic("top.lastChain.leafCount != n")\n+\tif leafCounts[maxBits][maxBits] != n {\n+\t\tpanic("leafCounts[maxBits][maxBits] != n")\n  	}\n  
  	bitCount := make([]int32, maxBits+1)\n  	bits := 1\n-\tfor chain := top.lastChain; chain.up != nil; chain = chain.up {\n+\tcounts := &leafCounts[maxBits]\n+\tfor level := maxBits; level > 0; level-- {\n  		// chain.leafCount gives the number of literals requiring at least "bits"\n  		// bits to encode.\n-\t\tbitCount[bits] = chain.leafCount - chain.up.leafCount\n+\t\tbitCount[bits] = counts[level] - counts[level-1]\n  		\tbits++\n  	}\n  	return bitCount

コアとなるコードの解説

このコミットの主要な変更は、ハフマン符号化のビット長を計算する bitCounts 関数内のデータ構造とアルゴリズムの変更です。

chain 構造体の廃止:
- 変更前は、chain 構造体がハフマンツリーの構築過程でノード間の関係を表現するために使用されていました。この構造体は、freq (頻度)、leafCount (葉の数)、up (親ノードへのポインタ) を持っていました。
- この chain 構造体のインスタンスがツリー構築中に動的に生成され、ヒープ割り当ての原因となっていました。
- 変更後、chain 構造体は完全に削除され、その役割は固定サイズの配列 leafCounts と levels 内の lastFreq フィールドによって代替されるようになりました。
levelInfo 構造体の変更:
- 変更前は、levelInfo 構造体が lastChain *chain、up *levelInfo、down *levelInfo といったポインタを持っていました。これらのポインタは、ツリーのレベル間の関係を動的に構築するために使用され、間接的にヒープ割り当てを誘発していました。
- 変更後、これらのポインタフィールドは削除され、lastFreq int32 が追加されました。これにより、levelInfo オブジェクト自体がポインタを介して他のオブジェクトを参照する必要がなくなり、よりコンパクトでキャッシュフレンドリーな構造になりました。
固定サイズ配列 levels と leafCounts の導入:
- bitCounts 関数内で、var levels [maxBitsLimit]levelInfo と var leafCounts [maxBitsLimit][maxBitsLimit]int32 という固定サイズの配列が導入されました。
- maxBitsLimit は 16 に定義されており、これはハフマン符号の最大ビット長が15であることを考慮したものです。
- levels 配列は、各ビット長レベルの levelInfo を格納します。これにより、levelInfo オブジェクトがヒープに動的に割り当てられる代わりに、関数のスタックフレーム上に静的に確保されるようになります。
- leafCounts 配列は、各レベルにおける葉の数を追跡するために使用されます。これもスタック上に確保されるため、動的な割り当てが不要になります。
bitCounts 関数のロジック変更:
- ツリー構築のループ内で、l.lastChain へのアクセスが l.lastFreq と leafCounts 配列へのアクセスに置き換えられました。
- l.down.needed = 2 のようなレベル間のポインタ操作が、levels[l.level-1].needed = 2 のような配列インデックスによるアクセスに置き換えられました。
- 最終的な bitCount の計算も、chain のポインタを辿る代わりに、leafCounts 配列から直接値を取得するように変更されました。

これらの変更により、ハフマンツリーの構築とビット長計算のアルゴリズムが、動的なリンクリストやツリーノードの割り当てに依存する代わりに、固定サイズの配列を効率的に利用するようになりました。これにより、多数の小さなオブジェクトのヒープ割り当てが削減され、ガベージコレクションの頻度が低下し、結果としてエンコーダのパフォーマンスが向上しました。

参考にした情報源リンク

Go言語のコミット履歴: https://github.com/golang/go/commits/master
Go言語のコードレビューシステム (Gerrit): https://go.dev/cl/10006043 (コミットメッセージに記載されているCLリンク)
Go言語のベンチマーク結果の解釈に関する情報 (例: go test -bench の出力): https://pkg.go.dev/testing (Goのtestingパッケージのドキュメント)
RFC 1951 (DEFLATE Compressed Data Format Specification): https://datatracker.ietf.org/doc/html/rfc1951 (ハフマン符号の最大ビット長に関する情報源)

comemo