Is there an equivalent to Java's String intern function in Go?

Question:

Is there an equivalent to Java's String intern function in Go?

I am parsing a lot of text input that has repeating patterns (tags). I would like to be memory efficient about it and store pointers to a single string for each tag, instead of multiple strings for each occurrence of a tag.

I think that, for example, Pool and GoPool may fulfill your needs. That code handles one thing that Stephen's solution ignores: in Go, a string value may be a slice of a bigger string, so keeping the small string alive keeps the whole bigger string in memory. There are scenarios where that doesn't matter and scenarios where it is a show stopper. The linked functions attempt to be on the safe side.
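To make the substring point concrete, here is a minimal sketch. It is not taken from Pool or GoPool. The helper names `dataPtr`, `sharesMemory` and `demo` are hypothetical, and reading the first word of the string header through `unsafe` assumes the gc compiler's layout (data pointer first, then length).

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// dataPtr returns the address of the first byte of s by reading the first
// word of the string header. Assumption: gc layout (pointer, then length).
func dataPtr(s string) uintptr {
	return *(*uintptr)(unsafe.Pointer(&s))
}

// sharesMemory reports whether sub's bytes lie inside big's backing memory.
func sharesMemory(sub, big string) bool {
	if len(sub) == 0 || len(big) == 0 {
		return false
	}
	start := dataPtr(big)
	p := dataPtr(sub)
	return p >= start && p < start+uintptr(len(big))
}

// demo builds a large string at run time, slices a tag out of it, and
// compares the slice with a freshly copied version of the same tag.
func demo() (subShares, copyShares bool) {
	big := strings.Repeat("tag,", 1000) // 4000 bytes
	sub := big[4:7]                     // "tag", still backed by big
	cp := string([]byte(sub))           // separate 3-byte copy
	return sharesMemory(sub, big), sharesMemory(cp, big)
}

func main() {
	subShares, copyShares := demo()
	fmt.Println("substring shares memory with big:", subShares)
	fmt.Println("copy shares memory with big:", copyShares)
}
```

If the substring is what ends up stored in a long-lived map, the whole 4000-byte string stays reachable; storing the copy avoids that at the cost of an allocation.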

No such function exists that I know of. However, you can easily make your own using a map. A string value is itself just a pointer to the bytes plus a length, so a string assigned from another string takes up only two words. Therefore, all you need to do is ensure that no two stored strings have the same content.

Here is an example of what I mean.

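// Interner maps each distinct string content to one canonical string value.
type Interner map[string]string

// NewInterner returns an empty Interner.
func NewInterner() Interner {
    return Interner(make(map[string]string))
}

// Intern returns the canonical string equal to s, storing s if it is new.
func (m Interner) Intern(s string) string {
    if ret, ok := m[s]; ok {
        return ret
    }

    m[s] = s
    return s
}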

This code will deduplicate redundant strings whenever you do the following:

str = interner.Intern(str)
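As a rough usage sketch for the tag-parsing case in the question: the comma-separated input format and the helper `internTags` are assumptions made for illustration, and the `Interner` type is repeated so the program is self-contained.

```go
package main

import (
	"fmt"
	"strings"
)

// Interner is the map-based interner from the answer above.
type Interner map[string]string

// Intern returns the canonical string equal to s, storing s if it is new.
func (m Interner) Intern(s string) string {
	if ret, ok := m[s]; ok {
		return ret
	}
	m[s] = s
	return s
}

// internTags splits a comma-separated line (hypothetical input format)
// and replaces every field with its interned value.
func internTags(in Interner, line string) []string {
	fields := strings.Split(line, ",")
	for i, f := range fields {
		fields[i] = in.Intern(f)
	}
	return fields
}

func main() {
	in := make(Interner)
	tags := internTags(in, "go,java,go,go,java,rust")
	fmt.Println(len(tags), "tags,", len(in), "distinct strings")
	// prints "6 tags, 3 distinct strings"
}
```

Every occurrence of a tag now refers to the first string stored for that content, so only one string per distinct tag has to be retained by the interner.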

EDIT: As jnml mentioned, my answer could pin memory depending on the string it is given. There are two ways to solve this problem. Either one should be inserted before m[s] = s in my previous example. The first copies the string twice, the second uses unsafe. Neither is ideal.

Double copy:

b := []byte(s)
s = string(b)

Unsafe (use at your own risk; works with the current version of the gc compiler and requires importing "unsafe"):

b := []byte(s)
s = *(*string)(unsafe.Pointer(&b))
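
Putting the pieces together, a sketch of the interner with the double-copy step in place might look like the following. The type name `CopyingInterner` and the sample input are hypothetical; the copy is only made the first time a given content is seen.

```go
package main

import "fmt"

// CopyingInterner is a hypothetical variant of Interner that stores a
// private copy of each new string, so it never pins a larger string.
type CopyingInterner map[string]string

// Intern returns the canonical string equal to s. New content is copied
// before it is stored; lookups of known content do not allocate a copy.
func (m CopyingInterner) Intern(s string) string {
	if ret, ok := m[s]; ok {
		return ret
	}
	s = string([]byte(s)) // the "double copy" step from the answer
	m[s] = s
	return s
}

func main() {
	in := make(CopyingInterner)
	line := "tag-a tag-b tag-a" // stands in for a large input buffer
	a1 := in.Intern(line[0:5])
	b := in.Intern(line[6:11])
	a2 := in.Intern(line[12:17])
	fmt.Println(a1, b, a2, len(in))
	// prints "tag-a tag-b tag-a 2"
}
```

Like any plain Go map, this is not safe for concurrent use without external locking.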