[インデックス 13513] ファイルの概要

このコミットは、Go言語の実験的なロケールパッケージ exp/locale/collate におけるトライ（Trie）データ構造の改善に関するものです。exp/locale/collate パッケージは、Unicode Collation Algorithm (UCA) に基づいて、異なる言語や地域（ロケール）に応じた文字列のソート順序を決定するための機能を提供します。このコミットの主な目的は、複数ロケールのサポートに向けたトライ構造の変更、メモリフットプリントの削減、および生成速度の向上です。

コミット

このコミットは、exp/locale/collate パッケージ内のトライデータ構造を変更し、複数ロケールのサポートを可能にするための第一歩を踏み出しました。具体的には、異なるロケールに対応するハンドルをトライに導入し、複数のテーブルが同じトライを共有することでブロックの再利用を促進しています。これにより、メモリフットプリントが大幅に改善され、trieNode のアロケーションが削減されました。結果として、トライの生成速度が約30%向上し、生成中に複数ロケールに対応する trieNode を保持できるようになりました。また、print メソッドが fprint にリネームされています。

GitHub上でのコミットページへのリンク

https://github.com/golang/go/commit/601045e87a27178048c82c01e00c64bcb5bc8810

元コミット内容

commit 601045e87a27178048c82c01e00c64bcb5bc8810
Author: Marcel van Lohuizen <mpvl@golang.org>
Date:   Sat Jul 28 18:44:14 2012 +0200

    exp/locale/collate: changed trie in first step towards support for multiple locales.
    - Allow handles into the trie for different locales.  Multiple tables share the same
      try to allow for reuse of blocks.
    - Significantly improved memory footprint and reduced allocations of trieNodes.
      This speeds up generation by about 30% and allows keeping trieNodes around
      for multiple locales during generation.
    - Renamed print method to fprint.
    
    R=r
    CC=golang-dev
    https://golang.org/cl/6408052

変更の背景

このコミットの背景には、Go言語の exp/locale/collate パッケージが、国際化（i18n）の要件を満たすために、より効率的かつ柔軟に複数のロケールをサポートする必要があったという点があります。

従来のトライ実装では、各ロケールが独自のトライを持つ必要があり、メモリ使用量とトライ生成のパフォーマンスに課題がありました。特に、Unicode Collation Algorithm (UCA) は非常に複雑であり、そのデータ構造（トライ）は大量のメモリを消費する可能性があります。異なるロケール間で共通する部分が多いにもかかわらず、それぞれが独立したトライを持つことは非効率的です。

このコミットは、以下の課題を解決するために行われました。

複数ロケールサポートの効率化: 異なるロケール間で共通するトライのブロックを再利用することで、メモリ使用量を削減し、複数ロケールを同時に扱う際のオーバーヘッドを低減する。
メモリフットプリントの削減: trieNode のアロケーションを最適化し、全体的なメモリ消費量を減らす。これは、特にリソースが限られた環境や、多数のロケールをサポートする必要があるアプリケーションにおいて重要です。
トライ生成速度の向上: メモリ効率の改善は、トライの構築プロセスを高速化することにも繋がります。これにより、アプリケーションの起動時間や、新しいロケールデータをロードする際のパフォーマンスが向上します。

これらの改善は、exp/locale/collate パッケージが、より堅牢でスケーラブルな国際化ソリューションを提供するための基盤を築くことを目的としています。

前提知識の解説

このコミットを理解するためには、以下の概念について理解しておく必要があります。

1. Unicode Collation Algorithm (UCA)

Unicode Collation Algorithm (UCA) は、Unicode文字列を言語的に正しい順序でソートするための国際標準です。単なるコードポイント順のソートとは異なり、UCAは以下のような複雑なルールを考慮します。

言語固有の順序: 例えば、ドイツ語では 'ä' は 'a' の後に来ますが、スウェーデン語では 'z' の後に来ます。
アクセント記号やダイアクリティカルマークの扱い: 'e', 'é', 'è', 'ê' などがどのようにソートされるか。
大文字・小文字の扱い: 大文字と小文字が同じとみなされるか、異なるものとして扱われるか。
結合文字: 複数の文字が結合して一つの文字として扱われる場合（例: a + ¨ = ä）。
省略形や特殊な文字: ch が一つの文字として扱われる言語（スペイン語の旧正書法など）。
レベルベースの比較: UCAは通常、複数のレベル（Primary, Secondary, Tertiary, Quaternary）で比較を行います。
- Primary Level: 基本的な文字の順序（例: a < b < c）。
- Secondary Level: アクセント記号やダイアクリティカルマークの違い（例: a < à < á）。
- Tertiary Level: 大文字・小文字の違い（例: a < A）。
- Quaternary Level: 句読点や記号の扱い。

UCAは、これらの複雑なルールを効率的に処理するために、通常、トライ（Trie）などのデータ構造を用いて実装されます。

2. トライ（Trie / Prefix Tree）

トライ（Trie、またはPrefix Tree）は、文字列の集合を格納するために使用される木構造のデータ構造です。各ノードは文字列のプレフィックスを表し、ルートからノードへのパスが文字列を形成します。トライの主な特徴は以下の通りです。

効率的なプレフィックス検索: 特定のプレフィックスを持つすべての文字列を効率的に検索できます。
空間効率: 共通のプレフィックスを持つ文字列は、トライ内でノードを共有するため、空間効率が良い場合があります。
辞書順ソート: トライを深さ優先探索（DFS）で走査すると、格納されている文字列が辞書順に取得できます。

UCAの実装では、文字のシーケンス（コードポイント）をキーとして、その文字のソート順序に関する情報（コレーション要素）を値として格納するためにトライが利用されます。

3. ロケール（Locale）

ロケールは、ユーザーの言語、地域、および文化的な設定を定義する一連のパラメータです。これには、日付と時刻の形式、通貨記号、数値の区切り文字、そして文字列のソート順序などが含まれます。exp/locale/collate パッケージにおける「複数ロケールのサポート」とは、異なるロケール設定に基づいて、文字列のコレーション（ソート）ルールを適用できることを意味します。

4. Go言語の `exp/locale/collate` パッケージ

exp/locale/collate は、Go言語の実験的なパッケージであり、Unicode Collation Algorithm (UCA) に基づくロケール依存の文字列コレーション機能を提供します。このパッケージは、golang.org/x/text/collate パッケージの前身にあたります。主な機能は、特定のロケールに対応する Collator オブジェクトを生成し、それを用いて文字列を比較したり、コレーションキーを生成したりすることです。コレーションキーは、バイト列として表現された文字列のソート可能な形式であり、これを比較することで言語的に正しいソート順序を得ることができます。

このコミットは、このパッケージの内部実装、特にコレーションルールを格納するトライの効率性を向上させることを目的としています。

技術的詳細

このコミットにおける技術的な変更は、主に exp/locale/collate パッケージ内のトライデータ構造の再設計と最適化に焦点を当てています。

1. `trieNode` 構造体の変更とメモリフットプリントの削減

以前の trieNode 構造体は、[256]*trieNode の配列 table を持っていました。これは、各ノードが256個の子ノードへのポインタを保持できることを意味し、メモリ効率が悪い可能性がありました。特に、多くのエントリが nil である場合、無駄なメモリ消費が発生します。

変更前:

type trieNode struct {
	table [256]*trieNode
	value int64
	b     byte
	leaf  bool
}

変更後:

type trieNode struct {
	index []*trieNode // 以前の table に相当
	value []uint32    // 以前の value に相当
	b     byte
	ref   uint16      // 新たに追加された参照オフセット
}

変更後の trieNode は、index []*trieNode と value []uint32 を持ちます。

index は、以前の固定長配列 table からスライスに変更されました。これにより、必要な子ノードへのポインタのみを格納できるようになり、メモリ使用量が削減されます。特に、make([]*trieNode, 64) のように、より小さなブロックサイズで初期化されることで、疎なトライにおけるメモリの無駄が減ります。
value もスライスになり、ノードがリーフであるか内部ノードであるかを示す leaf フィールドが削除されました。isInternal() メソッドのロジックも n.value != nil に変更され、value スライスの存在でノードのタイプを判断するようになりました。
ref uint16 フィールドが追加され、このノードがトライの最終的な配列内でどこに配置されるかを示すオフセットを保持するようになりました。これは、トライのブロック再利用と効率的な参照に不可欠です。

これらの変更により、trieNode のアロケーションが大幅に削減され、メモリフットプリントが改善されました。これは、トライの生成速度向上（約30%）にも寄与しています。

2. 複数ロケールサポートのためのトライハンドルとブロック再利用

このコミットの重要な目的の一つは、複数ロケールを効率的にサポートすることです。そのために、trieHandle 構造体が導入されました。

type trieHandle struct {
	lookupStart uint16 // offset in table for first byte
	valueStart  uint16 // offset in table for first byte
}

trieHandle は、特定のロケールに対応するトライの「開始点」を示すオフセットを保持します。これにより、複数のロケールが同じ基盤となるトライデータ（trie.index と trie.values）を共有できるようになります。

trieBuilder は、複数のトライノードを構築し、それらを共通のルックアップブロックと値ブロックにマッピングする役割を担います。computeOffsets メソッドは、各 trieNode が最終的なトライ配列内でどこに配置されるべきかを計算し、共通のブロックを識別して再利用します。

trieBuilder.lookupBlockIdx と trieBuilder.valueBlockIdx は、ハッシュ値に基づいて既存のブロックを検索し、重複するブロックがあれば再利用します。これにより、異なるロケールが同じコレーションルールの一部を共有する場合に、メモリを節約できます。
Builder 構造体に index *trieBuilder が追加され、コレーションテーブルの構築プロセス全体でこの共有トライビルダーが使用されるようになりました。

このアプローチにより、各ロケールが独自のトライを持つのではなく、共通のトライデータ構造に対して異なる「ビュー」を持つことができるようになり、メモリ効率が大幅に向上します。

3. `print` メソッドから `fprint` メソッドへのリネーム

src/pkg/exp/locale/collate/build/builder.go と src/pkg/exp/locale/collate/build/table.go において、print メソッドが fprint にリネームされました。これは、Go言語の慣習に従い、io.Writer に出力する関数には f プレフィックスを付けることが推奨されるためです（例: fmt.Print vs fmt.Fprint）。機能的な変更はありませんが、コードの可読性と一貫性が向上します。

4. `trie` 構造体の変更とルックアップの最適化

src/pkg/exp/locale/collate/trie.go では、trie 構造体自体も変更されました。

変更前:

type trie struct {
	index  []uint16
	values []uint32
}

変更後:

type trie struct {
	index0  []uint16 // index for first byte (0xC0-0xFF)
	values0 []uint32 // index for first byte (0x00-0x7F)
	index   []uint16
	values  []uint32
}

index0 と values0 が追加され、最初のバイト（特にUTF-8のマルチバイト文字の開始バイト）のルックアップを最適化するための専用のインデックスが提供されるようになりました。これにより、lookup メソッド内の条件分岐がより効率的になり、特定のバイト範囲に対するアクセスが高速化されます。

また、table.go の trie.printStruct メソッドが trieHandle を引数として受け取るようになり、生成されるGoコードでトライの開始オフセットが指定されるようになりました。

func (t *trie) printStruct(w io.Writer, handle *trieHandle, name string) (n, sz int, err error) {
	const msg = "trie{ %sLookup[%d:], %sValues[%d:], %sLookup[:], %sValues[:]}"
	n, err = fmt.Fprintf(w, msg, name, handle.lookupStart*blockSize, name, handle.valueStart*blockSize, name, name)
	sz += int(reflect.TypeOf(trie{}).Size())
	return
}

これにより、生成される trie インスタンスは、共有された rootLookup と rootValues 配列から、そのロケールに対応する適切なオフセットからデータを参照するようになります。

これらの技術的変更は、exp/locale/collate パッケージのパフォーマンスとスケーラビリティを大幅に向上させ、将来的な複数ロケールサポートの基盤を強化するものです。

コアとなるコードの変更箇所

このコミットのコアとなるコード変更は、主に以下のファイルに集中しています。

src/pkg/exp/locale/collate/build/trie.go:
- trieNode 構造体の定義が変更され、メモリ効率が向上しました。
- trieBuilder 構造体が新しく導入され、トライの構築とブロックの再利用ロジックが実装されました。
- computeOffsets 関数が trieBuilder のメソッドとして再実装され、トライノードのオフセット計算とブロックの共有を管理します。
- generate 関数が trieBuilder のメソッドとして再実装され、最終的なトライデータ構造を生成します。
- print メソッドが fprint にリネームされました。
src/pkg/exp/locale/collate/build/builder.go:
- Builder 構造体に index *trieBuilder フィールドが追加され、トライ構築プロセスで共有の trieBuilder を使用するようになりました。
- Print メソッド内で t.print が t.fprint に変更されました。
- buildTrie メソッド内で、トライの生成に b.index.addTrie(t) と b.index.generate() が使用されるようになりました。
src/pkg/exp/locale/collate/build/table.go:
- table 構造体に root *trieHandle フィールドが追加され、トライの開始点を保持するようになりました。
- FirstBlockOffsets() メソッドが追加され、トライのルックアップと値の開始オフセットを返します。
- print メソッドが fprint にリネームされました。
- t.index.printStruct の呼び出しが t.index.printStruct(w, t.root, name) に変更され、trieHandle が渡されるようになりました。
src/pkg/exp/locale/collate/trie.go:
- trie 構造体に index0 と values0 フィールドが追加され、ルックアップの最適化が図られました。
- lookup メソッド内で、index0 と values0 を使用するように変更されました。
src/pkg/exp/locale/collate/tables.go:
- rootTable の初期化で、trie 構造体の初期化方法が変更され、rootLookup と rootValues のオフセットが指定されるようになりました。
src/pkg/exp/locale/collate/export.go:
- tableInitializer インターフェースに FirstBlockOffsets() メソッドが追加されました。
- Init 関数内で、table の index.index0 と index.values0 が FirstBlockOffsets を使用して初期化されるようになりました。
src/pkg/exp/locale/collate/build/trie_test.go および src/pkg/exp/locale/collate/trie_test.go:
- テストコードが変更され、新しいトライ構築ロジックと trieHandle の使用に対応しました。

これらの変更は、トライの内部表現、構築プロセス、および複数ロケールでの利用方法に根本的な影響を与えています。

コアとなるコードの解説

ここでは、上記の「コアとなるコードの変更箇所」で挙げた主要な変更について、具体的なコードの差分を基に詳細に解説します。

1. `src/pkg/exp/locale/collate/build/trie.go` の変更

`trieNode` 構造体の変更

--- a/src/pkg/exp/locale/collate/build/trie.go
+++ b/src/pkg/exp/locale/collate/build/trie.go
@@ -31,181 +35,189 @@ type trie struct {
 
 // trieNode is the intermediate trie structure used for generating a trie.
 type trieNode struct {
-	table [256]*trieNode
-	value int64
+	index []*trieNode
+	value []uint32
 	b     byte
-	leaf  bool
+	ref   uint16
 }
 
 func newNode() *trieNode {
-	return new(trieNode)
+	return &trieNode{
+		index: make([]*trieNode, 64),
+		value: make([]uint32, 128), // root node size is 128 instead of 64
+	}
 }
 
 func (n *trieNode) isInternal() bool {
-	internal := true
-	for i := 0; i < 256; i++ {
-		if nn := n.table[i]; nn != nil {
-			if !internal && !nn.leaf {
-				log.Fatalf("trie:isInternal: node contains both leaf and non-leaf children (%v)", n)
-			}
-			internal = internal && !nn.leaf
-		}
-	}
-	return internal
+	return n.value != nil
 }

table [256]*trieNode から index []*trieNode へ: 以前は固定サイズの配列で子ノードへのポインタを保持していましたが、スライスに変更されました。これにより、実際に使用される子ノードの数に応じてメモリを動的に割り当てることが可能になり、メモリ効率が向上します。newNode() で make([]*trieNode, 64) と初期化されることで、デフォルトのブロックサイズが小さくなり、疎なトライにおける無駄なメモリ消費が削減されます。
value int64 から value []uint32 へ: value もスライスになりました。これにより、単一の値だけでなく、複数のコレーション要素を格納できるようになります。
leaf bool の削除: ノードがリーフであるかどうかの判定は、isInternal() メソッドが n.value != nil をチェックすることで行われるようになりました。value スライスが存在すれば内部ノード、存在しなければリーフノードと判断されます。
ref uint16 の追加: このフィールドは、トライの最終的な配列内でのこのノードの参照オフセットを格納するために使用されます。これは、ブロックの再利用と効率的なルックアップに不可欠です。

`insert` メソッドの変更

--- a/src/pkg/exp/locale/collate/build/trie.go
+++ b/src/pkg/exp/locale/collate/build/trie.go
@@ -40,14 +37,14 @@ func (n *trieNode) isInternal() bool {
 }
 
 func (n *trieNode) insert(r rune, value uint32) {
-	for _, b := range []byte(string(r)) {
-		if n.leaf {
-			log.Fatalf("trie:insert: node (%#v) should not be a leaf", n)
-		}
-		nn := n.table[b]
-		if nn == nil {
-			nn = newNode()
-			nn.b = b
-			n.table[b] = nn
-		}
-		n = nn
-	}
-	n.value = int64(value)
-	n.leaf = true
+	const maskx = 0x3F // mask out two most-significant bits
+	str := string(r)
+	if len(str) == 1 {
+		n.value[str[0]] = value
+		return
+	}
+	for i := 0; i < len(str)-1; i++ {
+		b := str[i] & maskx
+		if n.index == nil {
+			n.index = make([]*trieNode, blockSize)
+		}
+		nn := n.index[b]
+		if nn == nil {
+			nn = &trieNode{}
+			nn.b = b
+			n.index[b] = nn
+		}
+		n = nn
+	}
+	if n.value == nil {
+		n.value = make([]uint32, blockSize)
+	}
+	b := str[len(str)-1] & maskx
+	n.value[b] = value
 }

insert メソッドは、UTF-8エンコーディングのバイト列を考慮し、より効率的なトライの挿入ロジックに変わりました。

maskx = 0x3F を使用して、バイトの最上位2ビットをマスクアウトしています。これは、UTF-8の継続バイトの処理に関連している可能性があります。
単一バイトの文字とマルチバイトの文字で処理を分けています。
n.index が nil の場合に make([]*trieNode, blockSize) で初期化することで、必要な場合にのみ子ノードのスライスが作成されます。
n.value も同様に、必要な場合にのみ make([]uint32, blockSize) で初期化されます。

`trieBuilder` 構造体の導入と `computeOffsets` の再実装

--- a/src/pkg/exp/locale/collate/build/trie.go
+++ b/src/pkg/exp/locale/collate/build/trie.go
@@ -57,18 +65,20 @@ func (n *trieNode) insert(r rune, value uint32) {
 	n.value[b] = value
 }
 
-type nodeIndex struct {
+type trieBuilder struct {
+	t *trie
+
+	roots []*trieHandle
+
 	lookupBlocks []*trieNode
 	valueBlocks  []*trieNode
 
-	lookupBlockIdx map[uint32]int64
-	valueBlockIdx  map[uint32]int64
+	lookupBlockIdx map[uint32]*trieNode
+	valueBlockIdx  map[uint32]*trieNode
 }
 
-func newIndex() *nodeIndex {
-	index := &nodeIndex{}
+func newTrieBuilder() *trieBuilder {
+	index := &trieBuilder{}
 	index.lookupBlocks = make([]*trieNode, 0)
 	index.valueBlocks = make([]*trieNode, 0)
-	index.lookupBlockIdx = make(map[uint32]int64)
-	index.valueBlockIdx = make(map[uint32]int64)
+	index.lookupBlockIdx = make(map[uint32]*trieNode)
+	index.valueBlockIdx = make(map[uint32]*trieNode)
+	// The third nil is the default null block.  The other two blocks
+	// are used to guarantee an offset of at least 3 for each block.
+	index.lookupBlocks = append(index.lookupBlocks, nil, nil, nil)
+	index.t = &trie{}
 	return index
 }
 
-func computeOffsets(index *nodeIndex, n *trieNode) int64 {
-	if n.leaf {
-		return n.value
-	}
+func (b *trieBuilder) computeOffsets(n *trieNode) *trieNode {
 	hasher := fnv.New32()
-	// We only index continuation bytes.
-	for i := 0; i < blockSize; i++ {
-		v := int64(0)
-		if nn := n.table[0x80+i]; nn != nil {
-			v = computeOffsets(index, nn)
-		}
-		hasher.Write([]byte{byte(v >> 24), byte(v >> 16), byte(v >> 8), byte(v)})
-	}
-	h := hasher.Sum32()
-	if n.isInternal() {
-		v, ok := index.lookupBlockIdx[h]
-		if !ok {
-			v = int64(len(index.lookupBlocks)) - blockOffset
-			index.lookupBlocks = append(index.lookupBlocks, n)
-			index.lookupBlockIdx[h] = v
-		}
-		n.value = v
-	} else {
-		v, ok := index.valueBlockIdx[h]
-		if !ok {
-			v = int64(len(index.valueBlocks)) - blockOffset
-			index.valueBlocks = append(index.valueBlocks, n)
-			index.valueBlockIdx[h] = v
-		}
-		n.value = v
-	}
-	return n.value
+	if n.index != nil {
+		for i, nn := range n.index {
+			v := uint16(0)
+			if nn != nil {
+				nn = b.computeOffsets(nn)
+				n.index[i] = nn
+				v = nn.ref
+			}
+			hasher.Write([]byte{byte(v >> 8), byte(v)})
+		}
+		h := hasher.Sum32()
+		nn, ok := b.lookupBlockIdx[h]
+		if !ok {
+			n.ref = uint16(len(b.lookupBlocks)) - blockOffset
+			b.lookupBlocks = append(b.lookupBlocks, n)
+			b.lookupBlockIdx[h] = n
+		} else {
+			n = nn
+		}
+	} else {
+		for _, v := range n.value {
+			hasher.Write([]byte{byte(v >> 24), byte(v >> 16), byte(v >> 8), byte(v)})
+		}
+		h := hasher.Sum32()
+		nn, ok := b.valueBlockIdx[h]
+		if !ok {
+			n.ref = uint16(len(b.valueBlocks)) - blockOffset
+			b.valueBlocks = append(b.valueBlocks, n)
+			b.valueBlockIdx[h] = n
+		} else {
+			n = nn
+		}
+	}
+	return n
 }

nodeIndex から trieBuilder へ: トライの構築とブロックのインデックス付けを担当する構造体が nodeIndex から trieBuilder に変更されました。trieBuilder は、最終的なトライ (t *trie) と、複数のロケールに対応するトライの開始点 (roots []*trieHandle) を保持するようになりました。
lookupBlockIdx と valueBlockIdx の型変更: マップの値が int64 から *trieNode に変更されました。これにより、ハッシュ値に対応する trieNode 自体を直接参照できるようになり、より柔軟なブロック管理が可能になります。
newTrieBuilder() の初期化: lookupBlocks に nil を3つ追加することで、オフセットの計算に影響を与える特定のブロックオフセット（blockOffset = 2）を考慮しています。
computeOffsets のロジック変更:
- trieBuilder のメソッドになりました。
- n.index が nil でない場合（内部ノードの場合）と nil の場合（リーフノードの場合）で処理を分けています。
- ハッシュ値に基づいて lookupBlockIdx または valueBlockIdx をチェックし、既存のブロックがあればその trieNode を再利用します。なければ新しいブロックとして追加し、n.ref にそのオフセットを格納します。
- この関数は、trieNode の ref フィールドに最終的なオフセットを書き込み、その trieNode を返します。これにより、トライの各ノードが最終的な配列内のどこにマッピングされるかが決定されます。

`addTrie` と `generate` メソッドの変更

--- a/src/pkg/exp/locale/collate/build/trie.go
+++ b/src/pkg/exp/locale/collate/build/trie.go
@@ -149,39 +157,47 @@ func genLookupBlock(t *trie, n *trieNode, offset int) error {
 	return nil
 }
 
-// generate generates and returns the trie for n.\n-func (n *trieNode) generate() (t *trie, err error) {\n-\tseterr := func(e error) {\n-\t\tif err == nil {\n-\t\t\terr = e\n-		}\n-\t}\n-\tindex := newIndex()\n-\t// Values for 7-bit ASCII are stored in the first of two blocks, followed by a nil block.\n-\tindex.valueBlocks = append(index.valueBlocks, nil, nil, nil)\n-\t// First byte of multi-byte UTF-8 codepoints are indexed in 4th block.\n-\tindex.lookupBlocks = append(index.lookupBlocks, nil, nil, nil, nil)\n-\t// Index starter bytes of multi-byte UTF-8.\n-\tfor i := 0xC0; i < 0x100; i++ {\n-\t\tif n.table[i] != nil {\n-\t\t\tcomputeOffsets(index, n.table[i])
-\t\t}\n-	}\n-\tt = &trie{}\n-\tseterr(genValueBlock(t, n, 0))\n-\tseterr(genValueBlock(t, n, 64))\n-\tseterr(genValueBlock(t, newNode(), 0))\n-\tfor i := 3; i < len(index.valueBlocks); i++ {\n-\t\tseterr(genValueBlock(t, index.valueBlocks[i], 0x80))\n-	}\n-\tif len(index.valueBlocks) >= 1<<16 {\n-\t\tseterr(fmt.Errorf("maximum number of value blocks exceeded (%d > %d)", len(index.valueBlocks), 1<<16))\n-\t\treturn\n-	}\n-\tseterr(genLookupBlock(t, newNode(), 0))\n-\tseterr(genLookupBlock(t, newNode(), 0))\n-\tseterr(genLookupBlock(t, newNode(), 0))\n-\tseterr(genLookupBlock(t, n, 0xC0))\n-\tfor i := 4; i < len(index.lookupBlocks); i++ {\n-\t\tseterr(genLookupBlock(t, index.lookupBlocks[i], 0x80))\n-	}\n-\treturn
-}
-
-// print writes a compilable trie to w.  It returns the number of characters\n-// printed and the size of the generated structure in bytes.\n-func (t *trie) print(w io.Writer, name string) (n, size int, err error) {\n-\tupdate3 := func(nn, sz int, e error) {\n-\t\tn += nn\n-\t\tif err == nil {\n-\t\t\terr = e\n-\t\t}\n-\t\tsize += sz\n-	}\n-\tupdate2 := func(nn int, e error) { update3(nn, 0, e) }\n-\n-\tupdate3(t.printArrays(w, name))\n-\tupdate2(fmt.Fprintf(w, "var %sTrie = ", name))\n-\tupdate3(t.printStruct(w, name))\n-\tupdate2(fmt.Fprintln(w))\n-\treturn
-}
-
+func (b *trieBuilder) addStartValueBlock(n *trieNode) uint16 {
+	hasher := fnv.New32()
+	for _, v := range n.value[:2*blockSize] {
+		hasher.Write([]byte{byte(v >> 24), byte(v >> 16), byte(v >> 8), byte(v)})
+	}
+	h := hasher.Sum32()
+	nn, ok := b.valueBlockIdx[h]
+	if !ok {
+		n.ref = uint16(len(b.valueBlocks))
+		b.valueBlocks = append(b.valueBlocks, n)
+		// Add a dummy block to accommodate the double block size.
+		b.valueBlocks = append(b.valueBlocks, nil)
+		b.valueBlockIdx[h] = n
+	} else {
+		n = nn
+	}
+	return n.ref
+}
+
+func genValueBlock(t *trie, n *trieNode) {
+	if n != nil {
+		for _, v := range n.value {
+			t.values = append(t.values, v)
+		}
+	}
+}
+
+func genLookupBlock(t *trie, n *trieNode) {
+	for _, nn := range n.index {
+		v := uint16(0)
+		if nn != nil {
+			v = nn.ref
+		}
+		t.index = append(t.index, v)
+	}
+}
+
+func (b *trieBuilder) addTrie(n *trieNode) *trieHandle {
+	h := &trieHandle{}
+	b.roots = append(b.roots, h)
+	h.valueStart = b.addStartValueBlock(n)
+	if len(b.roots) == 1 {
+		// We insert a null block after the first start value block.
+		// This ensures that continuation bytes UTF-8 sequences of length
+		// greater than 2 will automatically hit a null block if there
+		// was an undefined entry.
+		b.valueBlocks = append(b.valueBlocks, nil)
+	}
+	n = b.computeOffsets(n)
+	// Offset by one extra block as the first byte starts at 0xC0 instead of 0x80.
+	h.lookupStart = n.ref - 1
+	return h
+}
+
+// generate generates and returns the trie for n.
+func (b *trieBuilder) generate() (t *trie, err error) {
+	t = b.t
+	if len(b.valueBlocks) >= 1<<16 {
+		return nil, fmt.Errorf("maximum number of value blocks exceeded (%d > %d)", len(b.valueBlocks), 1<<16)
+	}
+	if len(b.lookupBlocks) >= 1<<16 {
+		return nil, fmt.Errorf("maximum number of lookup blocks exceeded (%d > %d)", len(b.lookupBlocks), 1<<16)
+	}
+	genValueBlock(t, b.valueBlocks[0])
+	genValueBlock(t, &trieNode{value: make([]uint32, 64)})
+	for i := 2; i < len(b.valueBlocks); i++ {
+		genValueBlock(t, b.valueBlocks[i])
+	}
+	n := &trieNode{index: make([]*trieNode, 64)}
+	genLookupBlock(t, n)
+	genLookupBlock(t, n)
+	genLookupBlock(t, n)
+	for i := 3; i < len(b.lookupBlocks); i++ {
+		genLookupBlock(t, b.lookupBlocks[i])
+	}
+	return b.t, nil
+}

addStartValueBlock: これは、トライの開始値ブロックを追加し、そのオフセットを返す新しいメソッドです。ハッシュに基づいて既存のブロックを再利用するロジックが含まれています。
genValueBlock と genLookupBlock: これらの関数は、trieNode から最終的な trie の values および index 配列にデータをコピーする役割を担います。
addTrie: この重要なメソッドは、新しいトライノード（ロケールのルートノード）を受け取り、それに対応する trieHandle を生成します。computeOffsets を呼び出してノードのオフセットを計算し、lookupStart と valueStart を設定します。これにより、異なるロケールが共有トライ内の異なる開始点を持つことができます。
generate メソッドの再実装:
- trieBuilder のメソッドになりました。
- b.t (最終的な trie オブジェクト) を直接操作します。
- valueBlocks と lookupBlocks のサイズが 1<<16 (65536) を超えないことを確認します。これは uint16 の最大値に対応し、オフセットが収まるようにするためです。
- genValueBlock と genLookupBlock を呼び出して、trieBuilder が構築したブロックから最終的な trie の values と index 配列を生成します。

2. `src/pkg/exp/locale/collate/build/builder.go` の変更

--- a/src/pkg/exp/locale/collate/build/builder.go
+++ b/src/pkg/exp/locale/collate/build/builder.go
@@ -66,6 +66,7 @@ func (e *entry) contractionStarter() bool {
 // tables using Add and AddTailoring before making any call to Build.  This allows
 // Builder to ensure that a root table can support tailorings for each locale.
 type Builder struct {
+\tindex    *trieBuilder
 	entryMap map[string]*entry
 	entry    []*entry
 	t        *table
@@ -76,6 +77,7 @@ type Builder struct {
 // NewBuilder returns a new Builder.\n func NewBuilder() *Builder {\n \tb := &Builder{\n+\t\tindex:    newTrieBuilder(),\n \t\tentryMap: make(map[string]*entry),\n \t}\n \treturn b\
@@ -218,7 +220,7 @@ func (b *Builder) Print(w io.Writer) (int, error) {\n \t\treturn 0, err\n \t}\n \t// TODO: support multiple locales\n-\tn, _, err := t.print(w, "root")\n+\tn, _, err := t.fprint(w, "root")\n \treturn n, err\n }\n \
@@ -510,7 +512,8 @@ func (b *Builder) buildTrie() {\n \t\t\tt.insert(e.runes[0], ce)\n \t\t}\n \t}\n-\ti, err := t.generate()\n+\tb.t.root = b.index.addTrie(t)\n+\ti, err := b.index.generate()\n \tb.t.index = *i\n \tb.error(err)\
 }\

Builder 構造体への index *trieBuilder の追加: Builder がトライ構築の共有ロジックをカプセル化する trieBuilder のインスタンスを持つようになりました。NewBuilder() で初期化されます。
Print メソッドの t.print から t.fprint への変更: メソッド名のリネームです。
buildTrie メソッドの変更:
- b.t.root = b.index.addTrie(t): ここで、構築中のテーブル t のルートノードを trieBuilder に追加し、その結果として得られる trieHandle を b.t.root に設定しています。これにより、このテーブルが共有トライ内のどこから始まるかが定義されます。
- i, err := b.index.generate(): 最終的なトライデータ構造の生成は、trieBuilder の generate メソッドに委譲されます。

3. `src/pkg/exp/locale/collate/build/table.go` の変更

--- a/src/pkg/exp/locale/collate/build/table.go
+++ b/src/pkg/exp/locale/collate/build/table.go
@@ -14,6 +14,7 @@ import (\n // It implements the non-exported interface collate.tableInitializer\n type table struct {\n \tindex trie // main trie\n+\troot  *trieHandle\n \n \t// expansion info\n \texpandElem []uint32\
@@ -32,6 +33,10 @@ func (t *table) TrieValues() []uint32 {\n \treturn t.index.values\n }\n \
+func (t *table) FirstBlockOffsets() (i, v uint16) {\n+\treturn t.root.lookupStart, t.root.valueStart\n+}\n+\
 func (t *table) ExpandElems() []uint32 {\n \treturn t.expandElem\n }\
@@ -51,7 +56,7 @@ func (t *table) MaxContractLen() int {\n // print writes the table as Go compilable code to w. It prefixes the\n // variable names with name. It returns the number of bytes written\n // and the size of the resulting table.\n-func (t *table) print(w io.Writer, name string) (n, size int, err error) {\n+func (t *table) fprint(w io.Writer, name string) (n, size int, err error) {\n \tupdate := func(nn, sz int, e error) {\n \t\tn += nn\n \t\tif err == nil {\n@@ -66,7 +71,7 @@ func (t *table) print(w io.Writer, name string) (n, size int, err error) {\n \t// Write main table.\n \tsize += int(reflect.TypeOf(*t).Size())\n \tp("var %sTable = table{\\n", name)\n-\tupdate(t.index.printStruct(w, name))\n+\tupdate(t.index.printStruct(w, t.root, name))\n \tp(",\\n")\n \tp("%sExpandElem[:],\\n", name)\n \tupdate(t.contractTries.printStruct(w, name))\

table 構造体への root *trieHandle の追加: 各コレーションテーブルが、共有トライ内の自身の開始点を指す trieHandle を持つようになりました。
FirstBlockOffsets() メソッドの追加: このメソッドは、テーブルのルートが指すルックアップと値の開始オフセットを返します。これは、export.go で trie を初期化する際に使用されます。
print メソッドの fprint へのリネーム: メソッド名のリネームです。
t.index.printStruct の呼び出し変更: t.root が引数として渡されるようになり、生成されるGoコードでこのテーブルのトライが共有トライのどの部分を参照するかが明示されます。

4. `src/pkg/exp/locale/collate/trie.go` の変更

--- a/src/pkg/exp/locale/collate/trie.go
+++ b/src/pkg/exp/locale/collate/trie.go
@@ -14,8 +14,10 @@ package collate\n const blockSize = 64\n \n type trie struct {\n-\tindex  []uint16\n-\tvalues []uint32\n+\tindex0  []uint16 // index for first byte (0xC0-0xFF)\n+\tvalues0 []uint32 // index for first byte (0x00-0x7F)\n+\tindex   []uint16\n+\tvalues  []uint32\n }\n \n const (\
@@ -40,14 +42,14 @@ func (t *trie) lookup(s []byte) (v colElem, sz int) {\n \tc0 := s[0]\n \tswitch {\n \tcase c0 < tx:\n-\t\treturn colElem(t.values[c0]), 1\n+\t\treturn colElem(t.values0[c0]), 1\n \tcase c0 < t2:\n \t\treturn 0, 1\n \tcase c0 < t3:\n \t\tif len(s) < 2 {\n \t\t\treturn 0, 0\n \t\t}\n-\t\ti := t.index[c0]\n+\t\ti := t.index0[c0]\n \t\tc1 := s[1]\n \t\tif c1 < tx || t2 <= c1 {\n \t\t\treturn 0, 1\
@@ -57,7 +59,7 @@ func (t *trie) lookup(s []byte) (v colElem, sz int) {\n \t\tif len(s) < 3 {\n \t\t\treturn 0, 0\n \t\t}\n-\t\ti := t.index[c0]\n+\t\ti := t.index0[c0]\n \t\tc1 := s[1]\n \t\tif c1 < tx || t2 <= c1 {\n \t\t\treturn 0, 1\
@@ -73,7 +75,7 @@ func (t *trie) lookup(s []byte) (v colElem, sz int) {\n \t\tif len(s) < 4 {\n \t\t\treturn 0, 0\n \t\t}\n-\t\ti := t.index[c0]\n+\t\ti := t.index0[c0]\n \t\tc1 := s[1]\n \t\tif c1 < tx || t2 <= c1 {\n \t\t\treturn 0, 1\

trie 構造体への index0 と values0 の追加: これらは、特定のバイト範囲（特にUTF-8の最初のバイト）に対するルックアップを高速化するための専用のインデックスと値の配列です。
lookup メソッドの変更: c0 < tx (ASCII文字) の場合は t.values0 を、それ以外の場合は t.index0 を使用するように変更されました。これにより、ルックアップパスが最適化され、パフォーマンスが向上します。

これらの変更は、exp/locale/collate パッケージが、より効率的にメモリを使用し、複数ロケールをサポートするための基盤を強化するものです。トライの構築プロセスがより洗練され、共有可能なブロックを最大限に活用することで、全体的なパフォーマンスとスケーラビリティが向上しています。

comemo