[インデックス 1263] ファイルの概要

コミット

commit cb67a8324bbdffcb5e8a8a6caa8dbb400d0dc7a8
Author: Robert Griesemer <gri@golang.org>
Date:   Tue Dec 2 16:49:44 2008 -0800

    - fine-tuning of white space
    - by default consider extra newlines in src for better formatting
    - additional flags for control (-newlines, -maxnewlines, -optsemicolons)
    - don't print ()'s around single anonymous result types
    
    Status: Comparing the output of pretty with the input for larger files
    shows mostly whitespace/formatting differences, which is what is desired.
    
    TODO:
    - Handling of overlong lines
    - some esoteric cases which look funny
    
    R=r
    OCL=20293
    CL=20293

GitHub上でのコミットページへのリンク

https://github.com/golang/go/commit/cb67a8324bbdffcb5e8a8a6caa8dbb400d0dc7a8

元コミット内容

このコミットは、Go言語のコードフォーマッタ（prettyパッケージ、後のgo fmtの原型）における以下の改善を目的としています。

ホワイトスペースの微調整。
ソースコード内の余分な改行をデフォルトで考慮し、より良いフォーマットを実現。
フォーマット制御のための追加フラグ（-newlines, -maxnewlines, -optsemicolons）の導入。
単一の匿名結果型に対する括弧()の出力抑制。

コミットメッセージには、prettyツールの出力と入力の比較において、ほとんどがホワイトスペース/フォーマットの違いであることが示されており、これは望ましい結果であると述べられています。また、今後の課題として、長すぎる行の処理や、一部の奇妙なケースの対応が挙げられています。

変更の背景

このコミットは、Go言語がまだ初期開発段階にあった2008年に行われたものです。Go言語は、その設計思想の一つとして「強制的なコードフォーマット」を掲げており、go fmtというツールによって、Goコードのスタイルを統一することを推奨しています。これにより、コードレビューの際にスタイルに関する議論を減らし、開発者が本質的なロジックに集中できるようにすることを目指していました。

このコミットは、そのgo fmtの原型となるprettyパッケージの改善の一環であり、より洗練された、かつ柔軟なコードフォーマットを実現するための調整が行われています。特に、ソースコード中の改行の扱い、連続する改行の最大数、そしてオプションのセミコロンの出力制御といった、フォーマットの細部にわたる調整が加えられています。また、関数の戻り値の型宣言における冗長な括弧の削除は、Go言語の簡潔さを追求する設計思想に合致する変更と言えます。

前提知識の解説

Go言語のコードフォーマット (go fmt): Go言語には、公式のコードフォーマッタであるgo fmtが提供されています。これは、Goのソースコードを標準的なスタイルに自動的に整形するツールです。開発者はgo fmtを使用することで、コードの見た目を統一し、可読性を向上させることができます。このコミットで変更されているprettyパッケージは、go fmtの基盤となる部分です。
抽象構文木 (AST): コンパイラやリンタ、フォーマッタなどのツールは、ソースコードを直接操作するのではなく、まずソースコードを解析して抽象構文木（AST）と呼ばれるツリー構造に変換します。ASTはプログラムの構造を抽象的に表現したもので、このASTを操作することで、コードの解析、変換、最適化、そしてフォーマットなどが行われます。printer.goはASTを走査し、整形されたコードを出力する役割を担っています。
tabwriterパッケージ: Go言語の標準ライブラリに含まれるtext/tabwriterパッケージは、テキストをタブ区切りで整形し、列を揃えて出力するための機能を提供します。printer.goはこのtabwriterを利用して、整形されたコードの出力を行っています。
匿名結果型 (Anonymous Result Types): Go言語の関数は複数の戻り値を返すことができます。戻り値には名前を付けることもできますが、名前を付けない「匿名」の戻り値型も存在します。例えば、func foo() (int, error)のように複数の匿名結果型を返す場合、これらは括弧で囲まれます。しかし、func bar() intのように単一の匿名結果型の場合、初期のGoではfunc bar() (int)のように括弧が付けられることがありましたが、このコミットでその冗長な括弧が削除されるようになりました。
セミコロンの自動挿入 (Automatic Semicolon Insertion): Go言語では、行末にセミコロンを明示的に記述する必要はほとんどありません。コンパイラが特定のルールに基づいて自動的にセミコロンを挿入します。しかし、一部のケースではセミコロンが必要となる場合があります。このコミットで導入された-optsemicolonsフラグは、この自動挿入の挙動に関連するフォーマットの制御を可能にするものです。

技術的詳細

このコミットの主要な変更は、usr/gri/pretty/printer.goファイルに集中しており、Go言語のコードフォーマッタの内部ロジックが改善されています。

新しいフラグの導入:
- newlines (bool): ソースコード中の改行を尊重するかどうかを制御します。デフォルトはtrue。
- maxnewlines (int): 連続する改行の最大数を制御します。デフォルトは3。これにより、過剰な空行が挿入されるのを防ぎます。
- optsemicolons (bool): オプションのセミコロンを出力するかどうかを制御します。デフォルトはfalse。これは、Goの自動セミコロン挿入の挙動と関連し、フォーマットの柔軟性を高めます。
Printer構造体の変更:
- actionフィールドがstateフィールドに変更されました。これは、フォーマッタがコードのセマンティックな状態（例: スコープの開始、スコープの終了、リスト内）をより適切に管理するための変更です。
- no_action, open_scope, close_scopeといった定数が、normal, opening_scope, closing_scope, inside_listといったよりセマンティックな意味を持つ定数に置き換えられました。これにより、フォーマッタの内部状態管理がより明確になります。
改行処理の改善:
- Newlineメソッドにおいて、maxnewlinesフラグの値に基づいて連続する改行の最大数が適用されるようになりました。
- Stringメソッド内で、コメントと改行の処理がより洗練されました。特に、newlines.BVal()フラグがfalseの場合（ソース中の改行を尊重しない場合）の改行カウントの調整や、inside_list状態での改行の扱いが追加されています。これにより、フォーマッタがソースコードの元の改行をより賢く解釈し、整形に反映できるようになりました。
単一匿名結果型の括弧削除:
- Typeメソッド内で、関数の戻り値の型を処理する際に、単一の匿名結果型の場合にのみ括弧()を出力しないロジックが追加されました。これにより、func foo() (int)のような冗長な記述がfunc foo() intと整形されるようになります。
セミコロン出力の制御:
- Blockメソッド内で、optsemicolons.BVal()がfalseの場合にP.separator = noneを設定するロジックが追加されました。これは、オプションのセミコロンを出力しない場合に、ブロックの終わりに余分なセミコロンが挿入されないようにするためのものです。
selftest2.goの更新:
- 新しいimportブロックの追加（複数行インポートのテスト）。
- constブロックでのiotaを使用した列挙型の定義。
- switch文での複数ケースの記述。
- これらの変更は、printer.goの新しいフォーマットルールが正しく適用されることを確認するためのテストケースの追加です。

これらの変更は、Go言語のコードフォーマッタが、より柔軟で、かつGoのイディオムに沿ったコードを生成できるようにするための重要なステップでした。特に、ホワイトスペースの扱いに関する細かな調整は、コードの可読性と一貫性を高める上で不可欠です。

コアとなるコードの変更箇所

usr/gri/pretty/printer.go

--- a/usr/gri/pretty/printer.go
+++ b/usr/gri/pretty/printer.go
@@ -16,16 +16,23 @@ import (
 
 var (
 	debug = flag.Bool("debug", false, nil, "print debugging information");
+	
+	// layout control
 	tabwidth = flag.Int("tabwidth", 8, nil, "tab width");
 	usetabs = flag.Bool("usetabs", true, nil, "align with tabs instead of blanks");
-	comments = flag.Bool("comments", true, nil, "enable printing of comments");
+	newlines = flag.Bool("newlines", true, nil, "respect newlines in source");
+	maxnewlines = flag.Int("maxnewlines", 3, nil, "max. number of consecutive newlines");
+
+	// formatting control
+	comments = flag.Bool("comments", true, nil, "print comments");
+	optsemicolons = flag.Bool("optsemicolons", false, nil, "print optional semicolons");
 )
 
 
 // ----------------------------------------------------------------------------
 // Printer
 
-// Separators are printed in a delayed fashion, depending on the next token.\n+// Separators - printed in a delayed fashion, depending on context.
 const (
 	none = iota;
 	blank;
@@ -35,11 +42,12 @@ const (
 )
 
 
-// Formatting actions control formatting parameters during printing.\n+// Semantic states - control formatting.
 const (
-	no_action = iota;
-	open_scope;
-	close_scope;
+	normal = iota;
+	opening_scope;  // controls indentation, scope level
+	closing_scope;  // controls indentation, scope level
+	inside_list;  // controls extra line breaks
 )
 
 
@@ -61,9 +69,14 @@ type Printer struct {
 	separator int;  // pending separator
 	newlines int;  // pending newlines
 	
-	// formatting action
-	action int;  // action executed on formatting parameters
-	lastaction int;  // action for last string
+	// semantic state
+	state int;  // current semantic state
+	laststate int;  // state for last string
+}
+
+
+func (P *Printer) HasComment(pos int) bool {
+	return comments.BVal() && P.cpos < pos;
 }
 
 
@@ -90,7 +103,7 @@ func (P *Printer) Init(writer *tabwriter.Writer, comments *array.Array) {
 	P.cindex = -1;
 	P.NextComment();
 	
-	// formatting parameters & action initialized correctly by default\n+	// formatting parameters & semantic state initialized correctly by default
 }
 
 
@@ -106,10 +119,10 @@ func (P *Printer) Printf(format string, s ...) {
 
 
 func (P *Printer) Newline(n int) {
-	const maxnl = 2;\n 	if n > 0 {
-	\tif n > maxnl {\n-	\t\tn = maxnl;\n+	\tm := int(maxnewlines.IVal());\n+	\tif n > m {\n+	\t\tn = m;\n 	\t}\n 	\tfor ; n > 0; n-- {
 	\t\tP.Printf("\n");
@@ -122,14 +135,16 @@ func (P *Printer) Newline(n int) {
 
 
 func (P *Printer) String(pos int, s string) {
-	// correct pos if necessary\n+	// use estimate for pos if we don't have one
 	if pos == 0 {
-		pos = P.lastpos;  // estimate\n+		pos = P.lastpos;
 	}
 
 	// --------------------------------
 	// print pending separator, if any
 	// - keep track of white space printed for better comment formatting
-	// TODO print white space separators after potential comments and newlines\n+	// TODO print white space separators after potential comments and newlines
+	// (currently, we may get trailing white space before a newline)
 	trailing_char := 0;
 	switch P.separator {
 	case none:	// nothing to do
@@ -160,7 +175,7 @@ func (P *Printer) String(pos int, s string) {
 	// --------------------------------
 	// interleave comments, if any
 	nlcount := 0;
-	for comments.BVal() && P.cpos < pos {\n+	for ; P.HasComment(pos); P.NextComment() {
 	\t\t// we have a comment/newline that comes before the string
 	\t\tcomment := P.comments.At(P.cindex).(*AST.Comment);
 	\t\tctext := comment.text;
@@ -176,7 +191,12 @@ func (P *Printer) String(pos int, s string) {
 	\t\t\t\t// only white space before comment on this line
 	\t\t\t\t// or file starts with comment
 	\t\t\t\t// - indent
+	\t\t\t\tif !newlines.BVal() && P.cpos != 0 {
+	\t\t\t\t\tnlcount = 1;
+	\t\t\t\t}
 	\t\t\t\tP.Newline(nlcount);
+	\t\t\t\tnlcount = 0;
+
 	\t\t\t} else {
 	\t\t\t\t// black space before comment on this line
 	\t\t\t\tif ctext[1] == '/' {
@@ -184,7 +204,7 @@ func (P *Printer) String(pos int, s string) {
 	\t\t\t\t\t// - put in next cell unless a scope was just opened
 	\t\t\t\t\t//   in which case we print 2 blanks (otherwise the
 	\t\t\t\t\t//   entire scope gets indented like the next cell)
-	\t\t\t\t\tif P.lastaction == open_scope {\n+	\t\t\t\t\tif P.laststate == opening_scope {
 	\t\t\t\t\t\tswitch trailing_char {
 	\t\t\t\t\t\tcase ' ': P.Printf(" ");  // one space already printed
 	\t\t\t\t\t\tcase '\t': // do nothing
@@ -205,6 +225,7 @@ func (P *Printer) String(pos int, s string) {
 	\t\t\t\t}
 	\t\t\t}
 	\t\t\t
+\t\t\t// print comment
 	\t\tif debug.BVal() {
 	\t\t\tP.Printf("[%d]", P.cpos);
 	\t\t}
@@ -216,33 +237,36 @@ func (P *Printer) String(pos int, s string) {
 	\t\t\t\t\tP.newlines = 1;
 	\t\t\t\t}
 	\t\t\t}
-	\t\t\t\n-	\t\tnlcount = 0;\n \t\t}
-	\n-	\tP.NextComment();\n \t}
+\t// At this point we may have nlcount > 0: In this case we found newlines
+\t// that were not followed by a comment. They are recognized (or not) when
+\t// printing newlines below.
 	\t
 	// --------------------------------
-	// handle extra newlines\n-	if nlcount > 0 {\n-	\tP.newlines += nlcount - 1;\n-	}\n-\n-	// --------------------------------\n-	// interpret control\n+	// interpret state
 	// (any pending separator or comment must be printed in previous state)
-	switch P.action {\n-	case none:\n-	case open_scope:\n-	case close_scope:\n+	switch P.state {
+	case normal:
+	case opening_scope:
+	case closing_scope:
 	\tP.indentation--;
+	case inside_list:
 	default:
 	\tpanic("UNREACHABLE");
 	}\n 
 	// --------------------------------
-	// adjust formatting depending on state\n+	// print pending newlines
+	if newlines.BVal() && (P.newlines > 0 || P.state == inside_list) && nlcount > P.newlines {
+	\t// Respect additional newlines in the source, but only if we
+	\t// enabled this feature (newlines.BVal()) and we are expecting
+	\t// newlines (P.newlines > 0 || P.state == inside_list).
+	\t// Otherwise - because we don't have all token positions - we
+	\t// get funny formatting.
+	\tP.newlines = nlcount;
+	}
+	nlcount = 0;
 	P.Newline(P.newlines);
 	P.newlines = 0;
 
 	// --------------------------------
 	// print string
 	P.Printf("%s", s);
 
 	// --------------------------------
-	// interpret control\n-	switch P.action {\n-	case none:\n-	case open_scope:\n+	// interpret state
+	switch P.state {
+	case normal:
+	case opening_scope:
 	\tP.level++;
 	\tP.indentation++;
-	\t//P.newlines = 1;\n-	case close_scope:\n+	case closing_scope:
 	\tP.level--;
+	case inside_list:
 	default:
 	\tpanic("UNREACHABLE");
 	}\n-	P.lastaction = P.action;\n-	P.action = none;\n+	P.laststate = P.state;
+	P.state = normal;
 
 	// --------------------------------
 	// done
@@ -321,7 +345,7 @@ func (P *Printer) Parameters(pos int, list *array.Array) {
 
 
 func (P *Printer) Fields(list *array.Array, end int) {
-	P.action = open_scope;\n+	P.state = opening_scope;
 	P.String(0, "{");
 
 	if list != nil {
@@ -345,7 +369,7 @@ func (P *Printer) Fields(list *array.Array, end int) {
 		P.newlines = 1;
 	}
 
-	P.action = close_scope;\n+	P.state = closing_scope;
 	P.String(end, "}");
 }
 
@@ -394,7 +418,13 @@ func (P *Printer) Type(t *AST.Type) {
 		P.Parameters(t.pos, t.list);
 		if t.elt != nil {
 			P.separator = blank;
-			P.Parameters(0, t.elt.list);\n+			list := t.elt.list;
+			if list.Len() > 1 {
+				P.Parameters(0, list);
+			} else {
+				// single, anonymous result type
+				P.Expr(list.At(0).(*AST.Expr));
+			}
 		}
 
 	case Scanner.ELLIPSIS:
@@ -438,6 +468,7 @@ func (P *Printer) Expr1(x *AST.Expr, prec1 int) {
 		P.Expr(x.x);
 		P.String(x.pos, ",");
 		P.separator = blank;
+		P.state = inside_list;
 		P.Expr(x.y);
 
 	case Scanner.PERIOD:
@@ -522,7 +553,7 @@ func (P *Printer) StatementList(list *array.Array) {
 
 
 func (P *Printer) Block(pos int, list *array.Array, end int, indent bool) {
-	P.action = open_scope;\n+	P.state = opening_scope;
 	P.String(pos, "{");
 	if !indent {
 		P.indentation--;
@@ -531,8 +562,10 @@ func (P *Printer) Block(pos int, list *array.Array, end int, indent bool) {
 	if !indent {
 		P.indentation++;
 	}\n-	P.separator = none;\n-	P.action = close_scope;\n+	if !optsemicolons.BVal() {
+		P.separator = none;
+	}
+	P.state = closing_scope;
 	P.String(end, "}");
 }
 
@@ -651,6 +684,8 @@ func (P *Printer) Stat(s *AST.Stat) {
 // ----------------------------------------------------------------------------
 // Declarations
 
+// TODO This code is unreadable! Clean up AST and rewrite this.
+
 func (P *Printer) Declaration(d *AST.Decl, parenthesized bool) {
 	if !parenthesized {
 		if d.exported {
@@ -662,7 +697,7 @@ func (P *Printer) Declaration(d *AST.Decl, parenthesized bool) {
 	}\n 
 	if d.tok != Scanner.FUNC && d.list != nil {
-		P.action = open_scope;\n+		P.state = opening_scope;
 		P.String(0, "(");
 		if d.list.Len() > 0 {
 			P.newlines = 1;
@@ -672,7 +707,7 @@ func (P *Printer) Declaration(d *AST.Decl, parenthesized bool) {
 				P.newlines = 1;
 			}
 		}\n-		P.action = close_scope;\n+		P.state = closing_scope;
 		P.String(d.end, ")");
 
 	} else {
@@ -691,11 +726,12 @@ func (P *Printer) Declaration(d *AST.Decl, parenthesized bool) {
 				P.separator = blank;
 			}
 			P.Type(d.typ);
+			P.separator = tab;
 		}
 
 	if d.val != nil {
-		P.String(0, "\t");\n 		if d.tok != Scanner.IMPORT {
+		if d.tok != Scanner.IMPORT {
+			P.separator = tab;
 			P.String(0, "=");
 			P.separator = blank;
 		}

usr/gri/pretty/selftest2.go

--- a/usr/gri/pretty/selftest2.go
+++ b/usr/gri/pretty/selftest2.go
@@ -4,7 +4,25 @@
 
 package main
 
-import Fmt "fmt"\n+import (
++	"array";  // not needed
++	"utf8";  // not needed
++	Fmt "fmt"
++)
+
+
+const /* enum */ (
++	EnumTag0 = iota;
++	EnumTag1;
++	EnumTag2;
++	EnumTag3;
++	EnumTag4;
++	EnumTag5;
++	EnumTag6;
++	EnumTag7;
++	EnumTag8;
++	EnumTag9;
+)
 
 
 type T struct {
@@ -29,6 +47,16 @@ func f0(a, b int) int {
 }
 
 
+func f1(tag int) {
++	switch tag {
++	case
++		EnumTag0, EnumTag1, EnumTag2, EnumTag3, EnumTag4,
++		EnumTag5, EnumTag6, EnumTag7, EnumTag8, EnumTag9: break;
++	default:
++	}
+}
++
+
 func main() {
 // the prologue
 	for i := 0; i <= 10 /* limit */; i++ {

コアとなるコードの解説

`usr/gri/pretty/printer.go`

このファイルは、Go言語のAST（抽象構文木）を受け取り、整形されたGoコードを出力する主要なロジックを含んでいます。

フラグの追加:
- newlines, maxnewlines, optsemicolonsという新しいコマンドラインフラグが追加されました。これらは、フォーマッタの挙動をより細かく制御するためのものです。
  - newlines: ソースコード中の改行をフォーマット時に考慮するかどうか。
  - maxnewlines: 連続する改行の最大数。これにより、過剰な空行の挿入を防ぎます。
  - optsemicolons: オプションのセミコロン（Goでは通常不要だが、特定の状況で必要となる場合がある）を出力するかどうか。
セマンティック状態の導入:
- 以前のaction（アクション）という概念がstate（セマンティック状態）に置き換えられました。これは、フォーマッタがコードの構造的なコンテキスト（例: スコープの開始、スコープの終了、リスト内）をより正確に把握し、それに基づいてフォーマットを適用できるようにするためです。
- normal, opening_scope, closing_scope, inside_listといった新しい定数が定義され、それぞれが異なるフォーマットルールをトリガーします。
Newlineメソッドの改善:
- maxnewlinesフラグの値が考慮されるようになり、出力される連続改行の数が制限されます。これにより、整形後のコードが過度に縦長になるのを防ぎます。
Stringメソッドにおけるコメントと改行の処理:
- P.HasComment(pos)というヘルパーメソッドが追加され、指定された位置にコメントが存在するかどうかを効率的にチェックできるようになりました。
- コメントの出力ロジックが改善され、特にnewlines.BVal()がfalseの場合（ソース中の改行を尊重しない設定）の改行の扱いが調整されました。
- nlcount（ソース中の改行数）とP.newlines（フォーマッタが挿入しようとしている改行数）を比較し、newlines.BVal()がtrueの場合にのみソース中の追加の改行を尊重するロジックが追加されました。これにより、フォーマッタがソースコードの意図をより適切に反映できるようになります。
単一匿名結果型の括弧削除:
- Typeメソッド内で、関数の戻り値の型を処理する際に、t.elt.list.Len() > 1の場合にのみP.Parameters(0, list)を呼び出し、それ以外（単一の匿名結果型）の場合はP.Expr(list.At(0).(*AST.Expr))を直接呼び出すように変更されました。これにより、func foo() (int)のような記述がfunc foo() intと整形されるようになります。これはGo言語の簡潔な記述スタイルに合致します。
Blockメソッドにおけるセミコロン制御:
- optsemicolons.BVal()がfalseの場合にP.separator = noneを設定する条件が追加されました。これは、オプションのセミコロンを出力しない設定の場合に、ブロックの終わりに不要なセミコロンが挿入されるのを防ぐためのものです。

`usr/gri/pretty/selftest2.go`

このファイルは、printer.goの変更によって導入された新しいフォーマットルールが正しく機能するかどうかを検証するためのテストケースを含んでいます。

複数行インポートのテスト: import文が複数行で記述された場合のフォーマットが正しく行われるかを確認します。
iotaを使用したconstブロックのテスト: iotaキーワードを用いた連続する定数宣言が正しく整形されるかを確認します。
switch文の複数ケース記述のテスト: switch文のcase節で複数の値をカンマ区切りで指定した場合のフォーマットが正しく行われるかを確認します。

これらのテストケースは、printer.goの変更が意図した通りに動作し、Goコードのフォーマットが改善されたことを確認するために重要です。

comemo