Preface:
When traversing large arrays, a classic for loop can be more efficient and more consistent than a for range loop, and the gap is most visible when the array elements are struct types.
Go's syntax is deliberately concise. It has no while or do...while statements as C does; the only loop statement it keeps is for.
for i := 0; i < n; i++ { ... ... }
However, the classic three-clause loop needs the length n of the object being iterated. To make iterating over compound data types such as arrays, slices, channels, and maps more convenient, Go provides a variant of the for loop: the for range loop.
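As a quick illustration of the for range forms mentioned above, here is a minimal sketch that sums the elements of a slice, a map, and a channel. The helper names (sumSlice, sumMapValues, sumChannel) are invented for this example and do not come from the article.

```go
package main

import "fmt"

// sumSlice adds up the elements of a slice; the index is ignored.
func sumSlice(nums []int) int {
	total := 0
	for _, v := range nums {
		total += v
	}
	return total
}

// sumMapValues adds up the values of a map. Map iteration order is
// unspecified, but the sum does not depend on the order.
func sumMapValues(m map[string]int) int {
	total := 0
	for _, v := range m {
		total += v
	}
	return total
}

// sumChannel receives values until the channel is closed.
func sumChannel(ch <-chan int) int {
	total := 0
	for v := range ch {
		total += v
	}
	return total
}

func main() {
	fmt.Println(sumSlice([]int{1, 2, 3}))

	fmt.Println(sumMapValues(map[string]int{"a": 1, "b": 2}))

	ch := make(chan int, 3)
	ch <- 4
	ch <- 5
	ch <- 6
	close(ch) // without close, ranging over the channel would block forever
	fmt.Println(sumChannel(ch))
}
```

None of these loops needs an explicit length or termination condition, which is the convenience that for range adds over the three-clause form.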
The copy problem
This convenience also causes some trouble for Go beginners, because they need to understand one thing: in a for range loop, what participates in the loop is a copy of the range expression, not the original object.
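package main

import "fmt"

func main() {
	var a = [5]int{1, 2, 3, 4, 5}
	var r [5]int

	fmt.Println("original a =", a)

	for i, v := range a {
		if i == 0 {
			// Modify a while the loop is running.
			a[1] = 12
			a[2] = 13
		}
		r[i] = v // record the value the loop actually sees
	}

	fmt.Println("after for range loop, r =", r)
	fmt.Println("after for range loop, a =", a)
}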
Do you think this code will output the following results?
original a = [1 2 3 4 5]
after for range loop, r = [1 12 13 4 5]
after for range loop, a = [1 12 13 4 5]
However, the actual output is:
original a = [1 2 3 4 5]
after for range loop, r = [1 2 3 4 5]
after for range loop, a = [1 12 13 4 5]
Why does this happen? Because what participates in the for range loop is a copy of the range expression. In the example above, the loop iterates over a copy of a, not over a itself.
To make this easier to see, we can rewrite the for range loop from the example above as equivalent pseudocode.
for i, v := range ac { // ac is a value copy of a
	if i == 0 {
		a[1] = 12
		a[2] = 13
	}
	r[i] = v
}
ac is a sequence of contiguous bytes temporarily allocated by Go, and it does not share any memory with a. Therefore, no matter how a is modified, the copy that participates in the loop, ac, keeps the original values, so the v taken from ac is still the original value of a rather than the modified one.
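For contrast, here is a minimal sketch of what changes when the range expression is a pointer to the array instead of the array itself. Only the pointer is copied, so the loop reads from the original a and sees the writes made in the first iteration. The function name rangeOverPointer is made up for this illustration.

```go
package main

import "fmt"

// rangeOverPointer repeats the earlier experiment, but ranges over &a.
// The range expression is still copied, yet the copy is just a pointer,
// so v is read from the real array on every iteration.
func rangeOverPointer() (a, r [5]int) {
	a = [5]int{1, 2, 3, 4, 5}
	for i, v := range &a {
		if i == 0 {
			a[1] = 12
			a[2] = 13
		}
		r[i] = v
	}
	return a, r
}

func main() {
	a, r := rangeOverPointer()
	fmt.Println("after for range loop, r =", r) // r = [1 12 13 4 5]
	fmt.Println("after for range loop, a =", a) // a = [1 12 13 4 5]
}
```

Here r matches the modified a, which is the result many readers expected from the first example.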
So the question is: since for range works on a copy of the data, does it consume more resources and perform worse than the classic for loop?
Performance comparison
Given this copying behavior, let's use a benchmark to check the obvious suspicion: does for range really run slower than the classic for loop on large arrays?
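package main

import "testing"

// Classic three-clause loop over a 100,000-element int array.
func BenchmarkClassicForLoopIntArray(b *testing.B) {
	b.ReportAllocs()
	var arr [100000]int
	for i := 0; i < b.N; i++ {
		for j := 0; j < len(arr); j++ {
			arr[j] = j
		}
	}
}

// for range over the same array, binding both the index and the value.
func BenchmarkForRangeIntArray(b *testing.B) {
	b.ReportAllocs()
	var arr [100000]int
	for i := 0; i < b.N; i++ {
		for j, v := range arr {
			arr[j] = j
			_ = v
		}
	}
}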
In this example, we use a for loop and for range to traverse an array of 100,000 int-type elements respectively. Let's look at the results of the benchmark.
$ go test -bench . forRange1_test.go
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i5-8279U CPU @ 2.40GHz
BenchmarkClassicForLoopIntArray-8 47404 25486 ns/op 0 B/op 0 allocs/op
BenchmarkForRangeIntArray-8 37142 31691 ns/op 0 B/op 0 allocs/op
PASS
ok command-line-arguments 2.978s
From the output, for range is indeed slightly slower than the classic for loop (31691 ns/op versus 25486 ns/op). Note that these numbers include compiler optimizations, which Go typically performs on its static single assignment (SSA) form.
Let's turn off the optimizations and run the benchmark again.
$ go test -c -gcflags '-N -l' . -o
$ ./ - .
goos: darwin
goarch: amd64
pkg: workspace/example/forRange
cpu: Intel(R) Core(TM) i5-8279U CPU @ 2.40GHz
BenchmarkClassicForLoopIntArray-8 6734 175319 ns/op 0 B/op 0 allocs/op
BenchmarkForRangeIntArray-8 5178 242977 ns/op 0 B/op 0 allocs/op
PASS
With compiler optimizations disabled, both loops slow down considerably. for range slows down more (from 31691 to 242977 ns/op, versus 25486 to 175319 ns/op for the classic loop), so the gap between the two widens.
Iterating over struct arrays
In the benchmark above, the array elements are plain int values. What happens if we change the element type to a struct? How do the classic for loop and the for range loop each perform then?
package main

import "testing"

// U1 to U5 differ only in how many int fields they contain.
type U5 struct {
	a, b, c, d, e int
}
type U4 struct {
	a, b, c, d int
}
type U3 struct {
	b, c, d int
}
type U2 struct {
	c, d int
}
type U1 struct {
	d int
}

// Classic for loops. As in the measured code, the bound is len(arr)-1,
// so the last element is skipped; the effect on timing is negligible.
func BenchmarkClassicForLoopLargeStructArrayU5(b *testing.B) {
	b.ReportAllocs()
	var arr [100000]U5
	for i := 0; i < b.N; i++ {
		for j := 0; j < len(arr)-1; j++ {
			arr[j].d = j
		}
	}
}

func BenchmarkClassicForLoopLargeStructArrayU4(b *testing.B) {
	b.ReportAllocs()
	var arr [100000]U4
	for i := 0; i < b.N; i++ {
		for j := 0; j < len(arr)-1; j++ {
			arr[j].d = j
		}
	}
}

func BenchmarkClassicForLoopLargeStructArrayU3(b *testing.B) {
	b.ReportAllocs()
	var arr [100000]U3
	for i := 0; i < b.N; i++ {
		for j := 0; j < len(arr)-1; j++ {
			arr[j].d = j
		}
	}
}

func BenchmarkClassicForLoopLargeStructArrayU2(b *testing.B) {
	b.ReportAllocs()
	var arr [100000]U2
	for i := 0; i < b.N; i++ {
		for j := 0; j < len(arr)-1; j++ {
			arr[j].d = j
		}
	}
}

func BenchmarkClassicForLoopLargeStructArrayU1(b *testing.B) {
	b.ReportAllocs()
	var arr [100000]U1
	for i := 0; i < b.N; i++ {
		for j := 0; j < len(arr)-1; j++ {
			arr[j].d = j
		}
	}
}

// for range loops, binding both the index and the element value.
func BenchmarkForRangeLargeStructArrayU5(b *testing.B) {
	b.ReportAllocs()
	var arr [100000]U5
	for i := 0; i < b.N; i++ {
		for j, v := range arr {
			arr[j].d = j
			_ = v
		}
	}
}

func BenchmarkForRangeLargeStructArrayU4(b *testing.B) {
	b.ReportAllocs()
	var arr [100000]U4
	for i := 0; i < b.N; i++ {
		for j, v := range arr {
			arr[j].d = j
			_ = v
		}
	}
}

func BenchmarkForRangeLargeStructArrayU3(b *testing.B) {
	b.ReportAllocs()
	var arr [100000]U3
	for i := 0; i < b.N; i++ {
		for j, v := range arr {
			arr[j].d = j
			_ = v
		}
	}
}

func BenchmarkForRangeLargeStructArrayU2(b *testing.B) {
	b.ReportAllocs()
	var arr [100000]U2
	for i := 0; i < b.N; i++ {
		for j, v := range arr {
			arr[j].d = j
			_ = v
		}
	}
}

func BenchmarkForRangeLargeStructArrayU1(b *testing.B) {
	b.ReportAllocs()
	var arr [100000]U1
	for i := 0; i < b.N; i++ {
		for j, v := range arr {
			arr[j].d = j
			_ = v
		}
	}
}
In this example, we define five struct types, U1 through U5, which differ only in the number of int fields they contain.
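To make the difference between the five types concrete, the following sketch prints how many bytes one element of each type occupies. The helpers elemSizes and wordSize are hypothetical names introduced only for this illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

type U5 struct{ a, b, c, d, e int }
type U4 struct{ a, b, c, d int }
type U3 struct{ b, c, d int }
type U2 struct{ c, d int }
type U1 struct{ d int }

// wordSize returns the size of an int on the current platform.
func wordSize() uintptr {
	return unsafe.Sizeof(int(0))
}

// elemSizes returns the size in bytes of one U1..U5 element, in that order.
func elemSizes() [5]uintptr {
	return [5]uintptr{
		unsafe.Sizeof(U1{}),
		unsafe.Sizeof(U2{}),
		unsafe.Sizeof(U3{}),
		unsafe.Sizeof(U4{}),
		unsafe.Sizeof(U5{}),
	}
}

func main() {
	for i, s := range elemSizes() {
		fmt.Printf("U%d: %d bytes per element\n", i+1, s)
	}
}
```

On a 64-bit platform, where int is 8 bytes, the sizes are 8, 16, 24, 32, and 40 bytes, so a [100000]U5 array holds five times as much data as a [100000]U1 array.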
The performance test results are as follows:
$ go test -bench . forRange2_test.go
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i5-8279U CPU @ 2.40GHz
BenchmarkClassicForLoopLargeStructArrayU5-8 44540 26227 ns/op 0 B/op 0 allocs/op
BenchmarkClassicForLoopLargeStructArrayU4-8 45906 26312 ns/op 0 B/op 0 allocs/op
BenchmarkClassicForLoopLargeStructArrayU3-8 43315 27400 ns/op 0 B/op 0 allocs/op
BenchmarkClassicForLoopLargeStructArrayU2-8 44605 26313 ns/op 0 B/op 0 allocs/op
BenchmarkClassicForLoopLargeStructArrayU1-8 45752 26110 ns/op 0 B/op 0 allocs/op
BenchmarkForRangeLargeStructArrayU5-8 3072 388651 ns/op 0 B/op 0 allocs/op
BenchmarkForRangeLargeStructArrayU4-8 4605 261329 ns/op 0 B/op 0 allocs/op
BenchmarkForRangeLargeStructArrayU3-8 5857 182565 ns/op 0 B/op 0 allocs/op
BenchmarkForRangeLargeStructArrayU2-8 10000 108391 ns/op 0 B/op 0 allocs/op
BenchmarkForRangeLargeStructArrayU1-8 36333 32346 ns/op 0 B/op 0 allocs/op
PASS
ok command-line-arguments 16.160s
One pattern stands out: whatever the element struct type, the classic for loop performs about the same (roughly 26000 to 27000 ns/op), while for range gets slower as the number of struct fields grows, from 32346 ns/op for U1 to 388651 ns/op for U5.
Conclusion
When traversing large arrays, a classic for loop can be more efficient and more consistent than a for range loop, and the gap is most visible when the array elements are struct types.
In addition, a Go slice stores its data in an underlying array. for range still copies the range expression, but for a slice that copy is only the slice value itself, which points to the same underlying array as the original slice. This means that if we replace the array with a slice, we can get consistent and stable traversal performance from both for range and the classic for loop.
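If the convenience of for range is still wanted on a struct array, one option is the index-only form applied to a pointer to the array, which asks for neither an array copy nor a per-element value. The sketch below is an illustration only: fillByIndex is a hypothetical helper, the array is shortened to 1000 elements, and the variant has not been benchmarked here, so how it compares with the numbers above would need to be measured on your own Go version.

```go
package main

import "fmt"

type U5 struct{ a, b, c, d, e int }

// fillByIndex uses the index-only form of for range on a pointer to the
// array. No element value v is declared, and the range expression that
// gets copied is just the pointer.
func fillByIndex(arr *[1000]U5) {
	for j := range arr {
		arr[j].d = j
	}
}

func main() {
	var arr [1000]U5
	fillByIndex(&arr)
	fmt.Println(arr[0].d, arr[500].d, arr[999].d) // 0 500 999
}
```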
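The slice behavior described here can be checked with a small sketch. It assumes the standard gc toolchain, where a slice value is three machine words (pointer, length, capacity); the function names sliceHeaderWords and rangeOverSlice are invented for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

type U5 struct{ a, b, c, d, e int }

// sliceHeaderWords reports the size of a slice value in machine words.
// This is all that for range copies when the range expression is a slice,
// regardless of how many elements the slice refers to.
func sliceHeaderWords() uintptr {
	var s []U5
	return unsafe.Sizeof(s) / unsafe.Sizeof(uintptr(0))
}

// rangeOverSlice repeats the earlier experiment with a slice. The copied
// slice value points at the same underlying array, so writes made during
// the first iteration are visible in later iterations.
func rangeOverSlice() (r [5]int) {
	s := []int{1, 2, 3, 4, 5}
	for i, v := range s {
		if i == 0 {
			s[1] = 12
			s[2] = 13
		}
		r[i] = v
	}
	return r
}

func main() {
	fmt.Println("slice value size in words:", sliceHeaderWords())
	fmt.Println("r =", rangeOverSlice()) // r = [1 12 13 4 5]
}
```

An existing array can be traversed this way without changing its declaration by ranging over arr[:].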