How to read repeatedly from os.Stdin without sharing a bufio.Scanner
In Go, can a single line of input be read from stdin in a simple way, which also meets the following requirements?
- can be called by disparate parts of a larger interactive application without having to create coupling between these different parts of the application (e.g. by passing a global bufio.Scanner between them)
- works whether users are running an interactive terminal or using pre-scripted input
I'd like to modify an existing large Go application which currently creates a bufio.Scanner instance every time it asks users for a line of input. Multiple instances work fine when standard input is from a terminal, but when standard input is piped from another process, calls to Scan only succeed on the first instance of bufio.Scanner. Calls from all other instances fail.
Here's some toy code that demonstrates the problem:
package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	// read with 1st scanner -> works for both piped stdin and terminal
	scanner1 := readStdinLine(1)
	// read with 2nd scanner -> fails for piped stdin, works for terminal
	readStdinLine(2)
	// read with 1st scanner -> prints line 2 for piped stdin, line 3 for terminal
	readLine(scanner1, 3)
}

func readStdinLine(lineNum int64) (scanner *bufio.Scanner) {
	scanner = readLine(bufio.NewScanner(os.Stdin), lineNum)
	return
}

func readLine(scannerIn *bufio.Scanner, lineNum int64) (scanner *bufio.Scanner) {
	scanner = scannerIn
	scanned := scanner.Scan()
	fmt.Printf("%d: ", lineNum)
	if scanned {
		fmt.Printf("Text=%s\n", scanner.Text())
		return
	}
	if scanErr := scanner.Err(); scanErr != nil {
		fmt.Printf("Error=%s\n", scanErr)
		return
	}
	fmt.Println("EOF")
	return
}
I build this as print_stdin and run interactively from a bash shell:
~$ ./print_stdin
ab
1: Text=ab
cd
2: Text=cd
ef
3: Text=ef
But if I pipe in the text, the second bufio.Scanner fails:
~$ echo "ab
> cd
> ef" | ./print_stdin
1: Text=ab
2: EOF
3: Text=cd
The suggestion in the comment by ThunderCat works.
The alternative to a buffered read is reading a byte at a time. Read single bytes until \n or some other terminator is found and return the data up to that point.
Here's my implementation, heavily inspired by Scanner.Scan:
package lineio

import (
	"errors"
	"io"
)

const startBufSize = 4 * 1024
const maxBufSize = 64 * 1024
const maxConsecutiveEmptyReads = 100

// ErrTooLong is returned when a line does not fit in maxBufSize bytes.
var ErrTooLong = errors.New("lineio: line too long")

// ReadLine reads r one byte at a time, so it never consumes input past the
// end of the current line. The trailing LF or CRLF is stripped.
func ReadLine(r io.Reader) (string, error) {
	lb := &lineBuf{r: r, buf: make([]byte, startBufSize)}
	for {
		lb.ReadByte()
		if lb.err != nil || lb.TrimCrlf() {
			return lb.GetResult()
		}
	}
}

type lineBuf struct {
	r   io.Reader
	buf []byte
	end int // number of valid bytes in buf
	err error
}

// ReadByte appends exactly one byte from r to buf, or sets err.
func (lb *lineBuf) ReadByte() {
	if lb.EnsureBufSpace(); lb.err != nil {
		return
	}
	for empties := 0; ; {
		n := 0
		if n, lb.err = lb.r.Read(lb.buf[lb.end : lb.end+1]); lb.err != nil {
			return
		}
		if n > 0 {
			lb.end++
			return
		}
		// Guard against readers that keep returning (0, nil).
		empties++
		if empties > maxConsecutiveEmptyReads {
			lb.err = io.ErrNoProgress
			return
		}
	}
}

// TrimCrlf reports whether buf ends a line; if so, it drops the LF and any
// CR immediately before it.
func (lb *lineBuf) TrimCrlf() bool {
	if !lb.EndsLf() {
		return false
	}
	lb.end--
	if lb.end > 0 && lb.buf[lb.end-1] == '\r' {
		lb.end--
	}
	return true
}

// GetResult returns the collected line. io.EOF is not reported as an error:
// whatever was read before EOF is returned instead.
func (lb *lineBuf) GetResult() (string, error) {
	if lb.err != nil && lb.err != io.EOF {
		return "", lb.err
	}
	return string(lb.buf[0:lb.end]), nil
}

func (lb *lineBuf) EndsLf() bool {
	return lb.err == nil && lb.end > 0 && (lb.buf[lb.end-1] == '\n')
}

// EnsureBufSpace doubles buf when it is full, up to maxBufSize.
func (lb *lineBuf) EnsureBufSpace() {
	if lb.end < len(lb.buf) {
		return
	}
	newSize := len(lb.buf) * 2
	if newSize > maxBufSize {
		lb.err = ErrTooLong
		return
	}
	newBuf := make([]byte, newSize)
	copy(newBuf, lb.buf[0:lb.end])
	lb.buf = newBuf
}
TESTING
Compiled lineio with go install and main (see below) with go build -o read_each_byte.
Tested scripted input:
$ seq 12 22 78 | ./read_each_byte
1: Text: "12"
2: Text: "34"
3: Text: "56"
Tested input from an interactive terminal:
$ ./read_each_byte
abc
1: Text: "abc"
123
2: Text: "123"
x\y"z
3: Text: "x\\y\"z"
Here's main:
package main

import (
	"fmt"
	"lineio"
	"os"
)

func main() {
	for i := 1; i <= 3; i++ {
		text, _ := lineio.ReadLine(os.Stdin)
		fmt.Printf("%d: Text: %q\n", i, text)
	}
}
Your sequence is:
1. create a scanner
2. wait for a read from the terminal
3. print the result
4. repeat steps 1 to 3 (creating a new scanner on stdin)
5. repeat steps 2 to 3
6. exit the program
When you run echo in a pipeline, there is only one stdin/stdout stream being written and read, but you are trying to read it with two scanners.
UPDATE: The flow of execution for echo is:
- read the arguments
- process the arguments
- write the arguments to stdout
- the terminal reads stdout and prints it
Note that this happens when you press the ENTER key: the whole argument is sent to the echo program at once, not line by line.
The echo utility writes its arguments to standard output, followed by a <newline>. If there are no arguments, only the <newline> is written.
More here: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/echo.html.
See in the source code how echo works:
while (argc > 0)
  {
    fputs (argv[0], stdout); //<-- sends every argument to the same stdout
    argc--;
    argv++;
    if (argc > 0)
      putchar (' ');
  }
So your code will work fine with this:
$ (n=1; while sleep 1; do echo a$n; n=$((n+1)); done) | ./print_stdin
1: Text=a1
2: Text=a2
3: Text=a3
If you need the arguments repeated on stdout, use the "yes" program or an alternative. The yes program writes its arguments to stdout over and over. More in: https://git.savannah.gnu.org/cgit/coreutils.git/tree/src/yes.c
Example:
$ yes a | ./print_stdin
1: Text=a
2: Text=a
3: Text=a