Using text/scanner (Mode) in Go
Package scanner provides a scanner and tokenizer for UTF-8-encoded text. It takes an io.Reader providing the source, which then can be tokenized through repeated calls to the Scan function. For compatibility with existing tools, the NUL character is not allowed. If the first character in the source is a UTF-8 encoded byte order mark (BOM), it is discarded.
package main
import (
"fmt"
"strings"
"text/scanner"
)
func main() {
const src = `
// Comment begins at column 5.
This line should not be included in the output.
/*
This multiline comment
should be extracted in
its entirety.
*/
`
var s scanner.Scanner
s.Init(strings.NewReader(src))
s.Filename = "comments"
s.Mode ^= scanner.SkipComments // don't skip comments
for tok := s.Scan(); tok != scanner.EOF; tok = s.Scan() {
txt := s.TokenText()
if strings.HasPrefix(txt, "//") || strings.HasPrefix(txt, "/*") {
fmt.Printf("%s: %s\n", s.Position, txt)
}
}
}