[インデックス 11135] ファイルの概要

コミット

strconv: implement fast path for rounding already short numbers.

benchmark old ns/op new ns/op delta BenchmarkFormatFloatDecimal 3765 1386 -63%

R=rsc CC=golang-dev, remy https://golang.org/cl/5494060

GitHub上でのコミットページへのリンク

https://github.com/golang/go/commit/37cd1658386bcc1d4f4ffddb30dd863df2e2ce7b

元コミット内容

commit 37cd1658386bcc1d4f4ffddb30dd863df2e2ce7b
Author: Rémy Oudompheng <oudomphe@phare.normalesup.org>
Date:   Thu Jan 12 11:34:06 2012 -0800

    strconv: implement fast path for rounding already short numbers.
    
    benchmark                   old ns/op   new ns/op   delta
    BenchmarkFormatFloatDecimal      3765        1386    -63%
    
    R=rsc
    CC=golang-dev, remy
    https://golang.org/cl/5494060
---
 src/pkg/strconv/ftoa.go | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)\n
diff --git a/src/pkg/strconv/ftoa.go b/src/pkg/strconv/ftoa.go
index b1d4b32f03..ab8dd2bf95 100644
--- a/src/pkg/strconv/ftoa.go
+++ b/src/pkg/strconv/ftoa.go
@@ -178,15 +178,26 @@ func roundShortest(d *decimal, mant uint64, exp int, flt *floatInfo) {\n 		return\n \t}\n \n-\t// TODO(rsc): Unless exp == minexp, if the number of digits in d\n-\t// is less than 17, it seems likely that it would be\n-\t// the shortest possible number already.  So maybe we can\n-\t// bail out without doing the extra multiprecision math here.\n-\n \t// Compute upper and lower such that any decimal number\n \t// between upper and lower (possibly inclusive)\n \t// will round to the original floating point number.\n \n+\t// We may see at once that the number is already shortest.\n+\t//\n+\t// Suppose d is not denormal, so that 2^exp <= d < 10^dp.\n+\t// The closest shorter number is at least 10^(dp-nd) away.\n+\t// The lower/upper bounds computed below are at distance\n+\t// at most 2^(exp-mantbits).\n+\t//\n+\t// So the number is already shortest if 10^(dp-nd) > 2^(exp-mantbits),\n+\t// or equivalently log2(10)*(dp-nd) > exp-mantbits.\n+\t// It is true if 332/100*(dp-nd) >= exp-mantbits (log2(10) > 3.32).\n+\tminexp := flt.bias + 1 // minimum possible exponent\n+\tif exp > minexp && 332*(d.dp-d.nd) >= 100*(exp-int(flt.mantbits)) {\n+\t\t// The number is already shortest.\n+\t\treturn\n+\t}\n+\n \t// d = mant << (exp - mantbits)\n \t// Next highest floating point number is mant+1 << exp-mantbits.\n \t// Our upper bound is halfway inbetween, mant*2+1 << exp-mantbits-1.\n@@ -200,7 +211,6 @@ func roundShortest(d *decimal, mant uint64, exp int, flt *floatInfo) {\n \t// in which case the next lowest is mant*2-1 << exp-mantbits-1.\n \t// Either way, call it mantlo << explo-mantbits.\n \t// Our lower bound is halfway inbetween, mantlo*2+1 << explo-mantbits-1.\n-\tminexp := flt.bias + 1 // minimum possible exponent\n \tvar mantlo uint64\n \tvar explo int\n \tif mant > 1<<flt.mantbits || exp == minexp {\n```

## 変更の背景

このコミットは、Go言語の標準ライブラリ`strconv`パッケージにおける浮動小数点数から文字列への変換処理のパフォーマンス改善を目的としています。特に、`FormatFloat`関数などで使用される`roundShortest`関数において、既に最短表現であると判断できる浮動小数点数に対して、不要な高精度演算をスキップするための高速パス（fast path）を導入しています。

元のコードには、以下のような`TODO`コメントが存在していました。

```go
// TODO(rsc): Unless exp == minexp, if the number of digits in d
// is less than 17, it seems likely that it would be
// the shortest possible number already. So maybe we can
// bail out without doing the extra multiprecision math here.

このコメントは、特定の条件下で、浮動小数点数の最短表現を決定するための複雑な多倍長演算を回避できる可能性を示唆していました。このコミットは、このTODOコメントで示唆された最適化を具体的に実装し、特定のケースで大幅な性能向上を実現しています。ベンチマーク結果が示すように、BenchmarkFormatFloatDecimalの実行時間が63%削減されており、この最適化が非常に効果的であったことがわかります。

前提知識の解説

浮動小数点数の表現と丸め

コンピュータにおける浮動小数点数（例: float64やfloat32）は、IEEE 754標準に基づいてバイナリ形式で表現されます。しかし、多くの10進数の小数は、バイナリ形式では正確に表現できません。このため、浮動小数点数を文字列に変換する際には、元の浮動小数点数を一意に識別できる最短の10進数表現を見つける「最短丸め（shortest rounding）」というプロセスが重要になります。

例えば、0.1という10進数は、バイナリ浮動小数点数では正確に表現できず、わずかに異なる値として格納されます。この格納されたバイナリ値を文字列に戻す際、0.1000000000000000055511151231257827021181583404541015625のような長い文字列ではなく、元の0.1という最短の表現に丸める必要があります。

`strconv`パッケージと`ftoa.go`

Go言語のstrconvパッケージは、基本的なデータ型（数値、真偽値など）と文字列との間の変換を提供します。ftoa.goファイルは、このパッケージ内で浮動小数点数（float to ASCII）を文字列に変換するロジックを実装しています。特に、FormatFloat関数がこのファイル内の関数を利用して、浮動小数点数を指定されたフォーマットで文字列に変換します。

`roundShortest`関数

roundShortest関数は、与えられた浮動小数点数の10進数表現が、元の浮動小数点数を一意に識別できる最短の文字列であるかどうかを判断し、必要に応じて丸め処理を行うための中心的な役割を担っています。この関数は、浮動小数点数のバイナリ表現と10進数表現の間の複雑な関係を考慮し、多倍長演算を用いて正確な丸めを行います。

多倍長演算

浮動小数点数の正確な10進数表現を決定するためには、通常の浮動小数点演算では精度が不足することがあります。そのため、非常に大きな整数や高精度な小数を扱うための「多倍長演算（multi-precision arithmetic）」が用いられます。これは、複数のワード（コンピュータの基本データ単位）を使って数値を表現し、通常のCPU命令では扱えない桁数の計算を可能にするものです。しかし、多倍長演算は通常の演算に比べて計算コストが非常に高いため、可能な限り回避することが性能向上の鍵となります。

技術的詳細

このコミットの核心は、roundShortest関数内で、多倍長演算による高コストな丸め処理を実行する前に、与えられた数値が既に最短表現であるかどうかを数学的に判定する高速パスを導入した点にあります。

導入された条件式は以下の通りです。

if exp > minexp && 332*(d.dp-d.nd) >= 100*(exp-int(flt.mantbits)) {
    // The number is already shortest.
    return
}

この条件式の背後にある数学的根拠は、浮動小数点数のバイナリ表現と10進数表現の間の「ギャップ」の比較に基づいています。

10進数表現の「ギャップ」: dは浮動小数点数の10進数表現を表す構造体です。d.dpは総桁数、d.ndは小数点以下の桁数を表します。したがって、d.dp - d.ndは小数点より上の桁数（整数部の桁数）に相当します。「最も近い短い数」とは、現在の10進数表現よりも桁数が少ない、しかし元の浮動小数点数とは異なる値に丸められてしまうような10進数のことです。このような「最も近い短い数」との距離は、少なくとも 10^(dp-nd) のオーダーになります。これは、10進数の桁数が1つ減ると、その値が10分の1になることを意味します。
浮動小数点数表現の「ギャップ」: 浮動小数点数（flt）は、mant（仮数部）とexp（指数部）で構成されます。flt.mantbitsは仮数部のビット数です。浮動小数点数における隣接する値との距離は、2^(exp - mantbits) のオーダーになります。これは、バイナリ表現の最下位ビット（LSB）が表す値に相当します。
最短表現の判定条件: コメントに記載されているように、数値が既に最短であると判断できるのは、以下の条件が満たされる場合です。 10^(dp-nd) > 2^(exp-mantbits)

これは、10進数表現における「最も近い短い数」との距離が、浮動小数点数表現における隣接する値との距離よりも大きいことを意味します。もし10進数表現のギャップが十分に大きければ、現在の10進数表現が元の浮動小数点数を一意に識別するのに十分であり、これ以上短くすると別の浮動小数点数に丸められてしまう可能性がない、つまり既に最短であると判断できます。
対数による近似: 上記の不等式を扱いやすくするために、両辺のlog2を取ります。 log2(10^(dp-nd)) > log2(2^(exp-mantbits)) (dp-nd) * log2(10) > exp - mantbits

ここで、log2(10)は約3.3219です。この値を332/100で近似しています。 332/100 * (dp-nd) >= exp - mantbits 両辺に100を掛けると、整数演算で処理できるようになります。 332 * (d.dp - d.nd) >= 100 * (exp - int(flt.mantbits))

この条件が真であれば、現在の10進数表現は既に最短であると判断し、高コストなroundShortest関数の残りの処理（多倍長演算を含む）をスキップして早期にreturnします。

また、exp > minexpという条件も追加されています。minexpは浮動小数点数の最小指数を表します。これは、非正規化数（denormalized number）のような特殊なケースを除外するための条件と考えられます。非正規化数は、通常の浮動小数点数とは異なる精度特性を持つため、この高速パスの適用から除外することで、正確性を保証しています。

この最適化により、多くの一般的な浮動小数点数変換において、不要な計算が削減され、strconvパッケージの性能が大幅に向上しました。

コアとなるコードの変更箇所

変更はsrc/pkg/strconv/ftoa.goファイルのroundShortest関数内で行われています。

--- a/src/pkg/strconv/ftoa.go
+++ b/src/pkg/strconv/ftoa.go
@@ -178,15 +178,26 @@ func roundShortest(d *decimal, mant uint64, exp int, flt *floatInfo) {\n 		return\n \t}\n \n-\t// TODO(rsc): Unless exp == minexp, if the number of digits in d\n-\t// is less than 17, it seems likely that it would be\n-\t// the shortest possible number already.  So maybe we can\n-\t// bail out without doing the extra multiprecision math here.\n-\n \t// Compute upper and lower such that any decimal number\n \t// between upper and lower (possibly inclusive)\n \t// will round to the original floating point number.\n \n+\t// We may see at once that the number is already shortest.\n+\t//\n+\t// Suppose d is not denormal, so that 2^exp <= d < 10^dp.\n+\t// The closest shorter number is at least 10^(dp-nd) away.\n+\t// The lower/upper bounds computed below are at distance\n+\t// at most 2^(exp-mantbits).\n+\t//\n+\t// So the number is already shortest if 10^(dp-nd) > 2^(exp-mantbits),\n+\t// or equivalently log2(10)*(dp-nd) > exp-mantbits.\n+\t// It is true if 332/100*(dp-nd) >= exp-mantbits (log2(10) > 3.32).\n+\tminexp := flt.bias + 1 // minimum possible exponent\n+\tif exp > minexp && 332*(d.dp-d.nd) >= 100*(exp-int(flt.mantbits)) {\n+\t\t// The number is already shortest.\n+\t\treturn\n+\t}\n+\n \t// d = mant << (exp - mantbits)\n \t// Next highest floating point number is mant+1 << exp-mantbits.\n \t// Our upper bound is halfway inbetween, mant*2+1 << exp-mantbits-1.\n@@ -200,7 +211,6 @@ func roundShortest(d *decimal, mant uint64, exp int, flt *floatInfo) {\n \t// in which case the next lowest is mant*2-1 << exp-mantbits-1.\n \t// Either way, call it mantlo << explo-mantbits.\n \t// Our lower bound is halfway inbetween, mantlo*2+1 << explo-mantbits-1.\n-\tminexp := flt.bias + 1 // minimum possible exponent\n \tvar mantlo uint64\n \tvar explo int\n \tif mant > 1<<flt.mantbits || exp == minexp {\n```

具体的には、以下の変更が行われました。

1.  既存の`TODO`コメントが削除されました。
2.  新しい条件分岐が追加されました。この条件分岐は、数値が既に最短表現であると判断できる場合に、関数の残りの部分（高コストな多倍長演算を含む）をスキップして早期に`return`します。
3.  `minexp := flt.bias + 1`の定義が、新しい条件分岐の直前に移動されました。これは、この変数が新しい条件分岐で必要となるためです。

## コアとなるコードの解説

追加されたコードブロックは、`roundShortest`関数の冒頭近くに配置され、高コストな丸め処理の前に実行されます。

```go
	// We may see at once that the number is already shortest.
	//
	// Suppose d is not denormal, so that 2^exp <= d < 10^dp.
	// The closest shorter number is at least 10^(dp-nd) away.
	// The lower/upper bounds computed below are at distance
	// at most 2^(exp-mantbits).
	//
	// So the number is already shortest if 10^(dp-nd) > 2^(exp-mantbits),
	// or equivalently log2(10)*(dp-nd) > exp-mantbits.
	// It is true if 332/100*(dp-nd) >= exp-mantbits (log2(10) > 3.32).
	minexp := flt.bias + 1 // minimum possible exponent
	if exp > minexp && 332*(d.dp-d.nd) >= 100*(exp-int(flt.mantbits)) {
		// The number is already shortest.
		return
	}

コメント: 新しいコードブロックの前に詳細なコメントが追加されています。このコメントは、この高速パスの数学的根拠と、なぜこの条件式が機能するのかを説明しています。特に、10進数表現の「最も近い短い数」との距離と、浮動小数点数表現の隣接する値との距離を比較するという考え方が示されています。
minexpの定義: minexpは、浮動小数点数の最小指数（バイアスを考慮したもの）を計算します。これは、非正規化数（denormalized number）を適切に扱うために重要です。
条件式:
- exp > minexp: この条件は、現在の浮動小数点数が非正規化数ではないことを確認します。非正規化数は、通常の浮動小数点数とは異なる精度特性を持つため、この高速パスの適用から除外することで、正確性を保証しています。
- 332*(d.dp-d.nd) >= 100*(exp-int(flt.mantbits)): この部分が、前述の技術的詳細で説明した、10進数表現のギャップと浮動小数点数表現のギャップを比較する主要な条件です。log2(10)の近似値3.32（332/100）を用いて、10進数の桁数とバイナリの指数・仮数ビット数から、数値が既に最短表現であるかを効率的に判定します。

この条件が真である場合、roundShortest関数は即座にreturnし、その後の多倍長演算を含む複雑な処理をスキップします。これにより、特に既に最短表現である多くの数値に対して、大幅な性能向上が実現されます。

参考にした情報源リンク

コミットメッセージと差分: /home/orange/Project/comemo/commit_data/11135.txt
GitHubコミットページ: https://github.com/golang/go/commit/37cd1658386bcc1d4f4ffddb30dd863df2e2ce7b
Go言語のソースコード (src/pkg/strconv/ftoa.go): https://github.com/golang/go/blob/master/src/strconv/ftoa.go (コミット当時のバージョンとは異なる可能性がありますが、一般的な構造を理解するために参照)
浮動小数点数の最短表現に関する一般的な知識 (Web検索)
対数と指数に関する数学的知識 (Web検索)

Keyboard shortcuts

comemo