SoFunction
Updated on 2025-03-03

Understanding strings in Go

The nature of a string

In programming languages, strings play an important role. There are generally two kinds of data structure behind strings:

  • A fixed length that is set at compile time and cannot be modified
  • A dynamic length that can be modified

Like strings in Python, strings in Go cannot be modified; they can only be read.
In Python, trying to change a character of a string gives the following result:

>>> hi = "Hello"
>>> hi
'Hello'
>>> hi[0] = 'h'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
>>>

Similarly, in Go:

package main

import "fmt"

func main() {
    var hello = "Hello"
    hello[1] = 'h' // does not compile: string bytes cannot be assigned to
    fmt.Println(hello)
}
// # command-line-arguments
// string_in_go/:8:11: cannot assign to hello[1] (strings are immutable)
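Because the bytes of a string cannot be changed in place, the usual workaround is to copy them into a byte slice, change the copy, and build a new string from it. The following is a minimal sketch of that idea; the helper name replaceByte is hypothetical and not part of the original article.

```go
package main

import "fmt"

// replaceByte (hypothetical helper) returns a new string equal to s
// except that the byte at index i is replaced by b.
// The original string is left untouched.
func replaceByte(s string, i int, b byte) string {
	buf := []byte(s) // copies the string's bytes into a mutable slice
	buf[i] = b
	return string(buf) // copies the bytes into a new immutable string
}

func main() {
	hello := "Hello"
	changed := replaceByte(hello, 0, 'h')
	fmt.Println(hello, changed) // Hello hello
}
```

Both conversions copy the data, so the original value of hello stays valid and unchanged.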

There are two ways to mark where a string ends:

  • Implicitly, as in C, where the character "\0" acts as the terminator
  • Explicitly, as in Go, where the length is stored together with the data
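One consequence of storing the length explicitly is that a zero byte has no special meaning inside a Go string. The sketch below illustrates this; the helper containsNUL is a hypothetical name introduced only for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// containsNUL (hypothetical helper) reports whether s contains a zero byte.
func containsNUL(s string) bool {
	return strings.IndexByte(s, 0) >= 0
}

func main() {
	s := "a\x00b" // a zero byte in the middle of the string

	// len counts every byte, including the zero byte,
	// because the length is stored and not found by searching for "\0".
	fmt.Println(len(s))         // 3
	fmt.Println(containsNUL(s)) // true
}
```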

At run time, a Go string is represented by the following structure (reflect.StringHeader):

type StringHeader struct {
    Data uintptr // Data points to the underlying character array
    Len  int     // Len is used to represent the length of the string
}
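The two fields can be inspected from a running program. The following is only a sketch for exploration: it reinterprets a string through reflect.StringHeader with the unsafe package, which is not something to rely on in production code (reflect.StringHeader is deprecated since Go 1.20 in favor of unsafe.String and unsafe.StringData). The helper name header is hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// header (hypothetical helper) returns a copy of the header of s.
// For inspection only.
func header(s string) reflect.StringHeader {
	return *(*reflect.StringHeader)(unsafe.Pointer(&s))
}

func main() {
	s := "Hello, World"
	sub := s[7:] // "World"

	hs, hsub := header(s), header(sub)

	// Each header carries its own length.
	fmt.Println(hs.Len, hsub.Len) // 12 5

	// Slicing does not copy: the substring's Data pointer
	// points 7 bytes into the same underlying array.
	fmt.Println(hsub.Data - hs.Data) // 7
}
```

Slicing a string therefore only creates a new header; the bytes themselves are shared, which is safe because they can never be modified.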

By nature, a string is an array of characters, and each character corresponds to one or more integers when stored. These integers represent the characters. For example, printing the bytes of hello gives the following:

package main

import "fmt"

func main() {
    var hello = "Hello"
    // Indexing a string yields its bytes, printed here in hexadecimal.
    for i := 0; i < len(hello); i++ {
        fmt.Printf("%x ", hello[i])
    }
}
// Output: 48 65 6c 6c 6f

The underlying principle of strings

String literals have special delimiters, and there are two ways to declare them:

var s1 string = `hello world`
var s2 string = "hello world"
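For a value like hello world the two forms produce the same string. They differ once the literal contains a backslash: inside double quotes, escape sequences are interpreted, while inside backquotes every character is taken literally. A small sketch, with the hypothetical helper literalLengths:

```go
package main

import "fmt"

// literalLengths (hypothetical helper) returns the byte lengths of the
// same source text written as a raw literal and as an interpreted literal.
func literalLengths() (raw, interpreted int) {
	return len(`a\nb`), len("a\nb")
}

func main() {
	raw, interpreted := literalLengths()

	// Raw: 'a', '\', 'n', 'b'. Interpreted: 'a', newline, 'b'.
	fmt.Println(raw, interpreted) // 4 3

	// Without escapes, both forms denote the same value.
	fmt.Println(`hello world` == "hello world") // true
}
```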

During the lexical analysis phase, string literals are marked as tokens of type StringLit and passed on to the next stage of compilation.
The scanner reads the source one UTF-8 character at a time, and a backquote or a double quote marks the start of a string literal.

The logic of the analysis is located in the syntax/ file:

func (s *scanner) stdString() {
    ok := true
    s.nextch()

    for {
        if s.ch == '"' {
            s.nextch()
            break
        }
        if s.ch == '\\' {
            s.nextch()
            if !s.escape('"') {
                ok = false
            }
            continue
        }
        if s.ch == '\n' {
            s.errorf("newline in string")
            ok = false
            break
        }
        if s.ch < 0 {
            s.errorAtf(0, "string not terminated")
            ok = false
            break
        }
        s.nextch()
    }

    s.setLit(StringLit, ok)
}

func (s *scanner) rawString() {
    ok := true
    s.nextch()

    for {
        if s.ch == '`' {
            s.nextch()
            break
        }
        if s.ch < 0 {
            s.errorAtf(0, "string not terminated")
            ok = false
            break
        }
        s.nextch()
    }
    // We leave CRs in the string since they are part of the
    // literal (even though they are not part of the literal
    // value).

    s.setLit(StringLit, ok)
}

From the code above, we can see that Go scans two kinds of string literal. One is the standard string, delimited by double quotes, such as "Hello,World". The other is the raw string, delimited by backquotes (`). Accordingly, there is one scanning function for each kind:

  • If the literal starts with a backquote, the rawString function is called
  • If the literal starts with a double quote, the stdString function is called
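The compiler's scanner is an internal package and cannot be imported, but the standard library ships a separate scanner, go/scanner, that can be used to observe the same behavior from the outside. Note that this is a different implementation from the code quoted above, with its own token names (token.STRING instead of StringLit). The sketch below, with the hypothetical helper stringLiterals, shows that both literal forms end up as the same kind of token.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// stringLiterals (hypothetical helper) returns the source text of every
// string literal token found in src, using the standard go/scanner package.
func stringLiterals(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // nil error handler, default mode

	var lits []string
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		if tok == token.STRING {
			lits = append(lits, lit) // lit still includes the delimiters
		}
	}
	return lits
}

func main() {
	src := "x := \"hello\"\ny := `world`"
	for _, lit := range stringLiterals(src) {
		fmt.Println(lit)
	}
}
```

Both the double-quoted and the backquoted literal are reported with the same token type; only the delimiter in the literal text tells them apart.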
