按字节比较varint编码的int64

按字节比较varint编码的int64

问题描述:

I'm using levigo, the leveldb bindings for Go. My keys are int64's and need to be kept sorted. By default, leveldb uses a bytewise comparator so I'm trying to use varint encoding.

func i2b(x int64) []byte {
    b := make([]byte, binary.MaxVarintLen64)
    n := binary.PutVarint(b, x)
    return key[:n]
}

My keys are not being sorted correctly. I wrote the following as a test.

var prev int64 = 0
for i := int64(1); i < 1e5; i++ {
    if bytes.Compare(i2b(i), i2b(prev)) <= 0 {
        log.Fatalf("bytewise: %d > %d", b2i(prev), i)
    }
    prev = i
}

output: bytewise: 127 > 128

playground

I'm not sure where the problem is. Am I doing the encoding wrong? Is varint not the right encoding to use?

EDIT:

BigEndian fixed width encoding is bytewise comparable

func i2b(x int64) []byte {
  b := make([]byte, 8)
  binary.BigEndian.PutUint64(b, uint64(x))
  return b
}

The varint encoding is not bytewise comparable* wrt to the order of the values it caries. One option how to write the ordering/collating function (cmp bellow) is for example:

package main

import (
        "encoding/binary"
        "log"
)

func i2b(x int64) []byte {
        var b [binary.MaxVarintLen64]byte
        return b[:binary.PutVarint(b[:], x)]
}

func cmp(a, b []byte) int64 {
        x, n := binary.Varint(a)
        if n < 0 {
                log.Fatal(n)
        }

        y, n := binary.Varint(b)
        if n < 0 {
                log.Fatal(n)
        }

        return x - y
}

func main() {
        var prev int64 = 0
        for i := int64(1); i < 1e5; i++ {
                if cmp(i2b(i), i2b(prev)) <= 0 {
                        log.Fatal("fail")
                }
                prev = i
        }
}

Playground

(*) The reason is (also) the bit fiddling performed.