SoFunction
Updated on 2025-03-04

The difference between for range and classic for loops in Go when dealing with large arrays

Preface:

When traversing large arrays, a classic for loop can be more efficient and more stable than a for range loop, and the gap is most visible when the array elements are structs.

We know that Go's syntax is relatively concise. It does not provide the while or do...while loop statements that C supports; it keeps only one loop statement, the for loop.

for i := 0; i < n; i++ {
    ... ...
}

However, the classic three-clause loop needs to know the length n of the object being iterated. To make it more convenient for Go developers to iterate over compound data types such as arrays, slices, channels, and maps, Go provides a variant of the for loop: the for range loop.

The copy problem

While it brings convenience, range also causes some trouble for Go beginners, because the user needs to understand one thing: in a for range loop, what takes part in the iteration is a copy of the range expression, not the original object.
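As a quick illustration of the compound types mentioned above, here is a minimal sketch that ranges over an array, a slice, a map, and a channel. The function name sumAll and the sample values are made up for this example.

```go
package main

import "fmt"

// sumAll (a hypothetical helper) iterates over four kinds of compound
// values with for range and returns the sum of everything it visited.
func sumAll() int {
	total := 0

	arr := [3]int{1, 2, 3}
	for _, v := range arr { // array: yields index and element
		total += v
	}

	sl := []int{4, 5}
	for _, v := range sl { // slice: yields index and element
		total += v
	}

	m := map[string]int{"x": 6, "y": 7}
	for _, v := range m { // map: yields key and value, in unspecified order
		total += v
	}

	ch := make(chan int, 2)
	ch <- 8
	ch <- 9
	close(ch) // ranging over a channel stops once it is closed and drained
	for v := range ch {
		total += v
	}

	return total
}

func main() {
	fmt.Println(sumAll()) // 1+2+...+9 = 45
}
```

None of these loops needs to know the length of the object in advance, which is the convenience the classic three-clause form lacks.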

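package main

import "fmt"

func main() {
    var a = [5]int{1, 2, 3, 4, 5}
    var r [5]int
    fmt.Println("original a =", a)
    for i, v := range a {
        if i == 0 {
            // Modify the original array during the first iteration.
            a[1] = 12
            a[2] = 13
        }
        r[i] = v
    }
    fmt.Println("after for range loop, r =", r)
    fmt.Println("after for range loop, a =", a)
}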

Do you think this code will output the following results?

original a = [1 2 3 4 5]
after for range loop, r = [1 12 13 4 5]
after for range loop, a = [1 12 13 4 5]

However, the actual output is:

original a = [1 2 3 4 5]
after for range loop, r = [1 2 3 4 5]
after for range loop, a = [1 12 13 4 5]

Why does this happen? Because what takes part in a for range loop is a copy of the range expression. In the example above, it is a copy of a that is iterated, not a itself.

To make it easier for everyone to understand, we rewrite the for range loop in the above example into an equivalent pseudocode form.

for i, v := range ac { //ac is a value copy of a
    if i == 0 {
        a[1] = 12
        a[2] = 13
    }
    r[i] = v
}

ac is a sequence of consecutive bytes temporarily allocated by Go, and it does not share any memory with a. Therefore, no matter how a is modified, the copy that takes part in the loop, ac, still holds the original values, so each v taken from ac is the original value of a rather than the modified one.

So the question is: since for range works on a copy of the data, does it consume more resources and perform worse than the classic for loop?
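The pseudocode above can be turned into a runnable sketch by making the copy explicit. The function name rangeWithExplicitCopy is made up for illustration; the explicit assignment ac := a stands in for the copy that for range makes implicitly.

```go
package main

import "fmt"

// rangeWithExplicitCopy (a hypothetical helper) mimics what for range does
// with an array operand: it iterates over a copy (ac) while the loop body
// modifies the original (a).
func rangeWithExplicitCopy() (a, r [5]int) {
	a = [5]int{1, 2, 3, 4, 5}
	ac := a // assigning an array copies all of its elements
	for i := 0; i < len(ac); i++ {
		v := ac[i] // values come from the copy, not from a
		if i == 0 {
			a[1] = 12
			a[2] = 13
		}
		r[i] = v
	}
	return a, r
}

func main() {
	a, r := rangeWithExplicitCopy()
	fmt.Println("r =", r) // r = [1 2 3 4 5]
	fmt.Println("a =", a) // a = [1 12 13 4 5]
}
```

The result matches the actual output shown earlier: r keeps the original values because it was filled from the copy.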

Performance comparison

Given the copy problem, let's use a benchmark to check whether for range really runs slower than the classic for loop on large arrays.

package main
import "testing"
func BenchmarkClassicForLoopIntArray(b *testing.B) {
 b.ReportAllocs() // report B/op and allocs/op in the output
 var arr [100000]int
 for i := 0; i < b.N; i++ {
  for j := 0; j < len(arr); j++ {
   arr[j] = j
  }
 }
}
func BenchmarkForRangeIntArray(b *testing.B) {
 b.ReportAllocs() // report B/op and allocs/op in the output
 var arr [100000]int
 for i := 0; i < b.N; i++ {
  for j, v := range arr {
   arr[j] = j
   _ = v
  }
 }
}

In this example, we use a for loop and for range to traverse an array of 100,000 int-type elements respectively. Let's look at the results of the benchmark.

$ go test -bench . forRange1_test.go 
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i5-8279U CPU @ 2.40GHz
BenchmarkClassicForLoopIntArray-8          47404             25486 ns/op               0 B/op          0 allocs/op
BenchmarkForRangeIntArray-8                37142             31691 ns/op               0 B/op          0 allocs/op
PASS
ok      command-line-arguments  2.978s

From the output we can see that for range is indeed slightly slower than the classic for loop. These numbers already include compiler-level optimizations (which typically happen in the static single assignment, or SSA, stage).

Let's turn off compiler optimizations and run the benchmark again. The -N flag disables optimizations and -l disables inlining.

$ go test -c -gcflags '-N -l' . -o
$ ./ - .
goos: darwin
goarch: amd64
pkg: workspace/example/forRange
cpu: Intel(R) Core(TM) i5-8279U CPU @ 2.40GHz
BenchmarkClassicForLoopIntArray-8           6734            175319 ns/op               0 B/op          0 allocs/op
BenchmarkForRangeIntArray-8                 5178            242977 ns/op               0 B/op          0 allocs/op
PASS

Without compiler optimizations, both loops become significantly slower. The for range loop degrades more, so its gap to the classic for loop widens.

Traversing arrays of structs

In the benchmark above, the array being traversed holds int values. What happens if we change the element type from int to a struct? How do the for and for range loops each perform then?

package main
import "testing"
type U5 struct {
 a, b, c, d, e int
}
type U4 struct {
 a, b, c, d int
}
type U3 struct {
 b, c, d int
}
type U2 struct {
 c, d int
}
type U1 struct {
 d int
}

func BenchmarkClassicForLoopLargeStructArrayU5(b *testing.B) {
 b.ReportAllocs()
 var arr [100000]U5
 for i := 0; i < b.N; i++ {
  for j := 0; j < len(arr)-1; j++ {
   arr[j].d = j
  }
 }
}
func BenchmarkClassicForLoopLargeStructArrayU4(b *testing.B) {
 b.ReportAllocs()
 var arr [100000]U4
 for i := 0; i < b.N; i++ {
  for j := 0; j < len(arr)-1; j++ {
   arr[j].d = j
  }
 }
}
func BenchmarkClassicForLoopLargeStructArrayU3(b *testing.B) {
 b.ReportAllocs()
 var arr [100000]U3
 for i := 0; i < b.N; i++ {
  for j := 0; j < len(arr)-1; j++ {
   arr[j].d = j
  }
 }
}
func BenchmarkClassicForLoopLargeStructArrayU2(b *testing.B) {
 b.ReportAllocs()
 var arr [100000]U2
 for i := 0; i < b.N; i++ {
  for j := 0; j < len(arr)-1; j++ {
   arr[j].d = j
  }
 }
}

func BenchmarkClassicForLoopLargeStructArrayU1(b *testing.B) {
 b.ReportAllocs()
 var arr [100000]U1
 for i := 0; i < b.N; i++ {
  for j := 0; j < len(arr)-1; j++ {
   arr[j].d = j
  }
 }
}

func BenchmarkForRangeLargeStructArrayU5(b *testing.B) {
 b.ReportAllocs()
 var arr [100000]U5
 for i := 0; i < b.N; i++ {
  for j, v := range arr {
   arr[j].d = j
   _ = v
  }
 }
}
func BenchmarkForRangeLargeStructArrayU4(b *testing.B) {
 b.ReportAllocs()
 var arr [100000]U4
 for i := 0; i < b.N; i++ {
  for j, v := range arr {
   arr[j].d = j
   _ = v
  }
 }
}

func BenchmarkForRangeLargeStructArrayU3(b *testing.B) {
 b.ReportAllocs()
 var arr [100000]U3
 for i := 0; i < b.N; i++ {
  for j, v := range arr {
   arr[j].d = j
   _ = v
  }
 }
}
func BenchmarkForRangeLargeStructArrayU2(b *testing.B) {
 b.ReportAllocs()
 var arr [100000]U2
 for i := 0; i < b.N; i++ {
  for j, v := range arr {
   arr[j].d = j
   _ = v
  }
 }
}
func BenchmarkForRangeLargeStructArrayU1(b *testing.B) {
 b.ReportAllocs()
 var arr [100000]U1
 for i := 0; i < b.N; i++ {
  for j, v := range arr {
   arr[j].d = j
   _ = v
  }
 }
}

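In this example we define five struct types, U1 through U5, which differ only in the number of int fields they contain. Note that the classic loops stop at len(arr)-1 and so skip the last element; across 100,000 elements this does not materially affect the comparison.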

The performance test results are as follows:

$ go test -bench . forRange2_test.go
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i5-8279U CPU @ 2.40GHz
BenchmarkClassicForLoopLargeStructArrayU5-8        44540             26227 ns/op               0 B/op          0 allocs/op
BenchmarkClassicForLoopLargeStructArrayU4-8        45906             26312 ns/op               0 B/op          0 allocs/op
BenchmarkClassicForLoopLargeStructArrayU3-8        43315             27400 ns/op               0 B/op          0 allocs/op
BenchmarkClassicForLoopLargeStructArrayU2-8        44605             26313 ns/op               0 B/op          0 allocs/op
BenchmarkClassicForLoopLargeStructArrayU1-8        45752             26110 ns/op               0 B/op          0 allocs/op
BenchmarkForRangeLargeStructArrayU5-8               3072            388651 ns/op               0 B/op          0 allocs/op
BenchmarkForRangeLargeStructArrayU4-8               4605            261329 ns/op               0 B/op          0 allocs/op
BenchmarkForRangeLargeStructArrayU3-8               5857            182565 ns/op               0 B/op          0 allocs/op
BenchmarkForRangeLargeStructArrayU2-8              10000            108391 ns/op               0 B/op          0 allocs/op
BenchmarkForRangeLargeStructArrayU1-8              36333             32346 ns/op               0 B/op          0 allocs/op
PASS
ok      command-line-arguments  16.160s

The results show a clear pattern: whatever the struct element type, the classic for loop performs about the same, while for range gets slower as the number of struct fields grows. This is consistent with the copy problem described earlier, since the array that for range copies grows with the size of each element.

Conclusion

When traversing large arrays, a classic for loop can be more efficient and more stable than a for range loop, and the gap is most visible when the array elements are structs.
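To get a feel for how much data a copy of each array involves, the following sketch prints the array sizes with unsafe.Sizeof. The types mirror U1 and U5 from the benchmark; the function name arraySizes is made up for this example, and the byte counts in the comments assume a 64-bit platform where int is 8 bytes.

```go
package main

import (
	"fmt"
	"unsafe"
)

// U1 and U5 mirror the smallest and largest structs from the benchmark.
type U1 struct{ d int }
type U5 struct{ a, b, c, d, e int }

// arraySizes (a hypothetical helper) returns the size in bytes of a
// 100,000-element array of U1 and of U5.
func arraySizes() (u1, u5 uintptr) {
	return unsafe.Sizeof([100000]U1{}), unsafe.Sizeof([100000]U5{})
}

func main() {
	u1, u5 := arraySizes()
	// With 8-byte ints this is 800000 and 4000000 bytes respectively.
	fmt.Println("[100000]U1:", u1, "bytes")
	fmt.Println("[100000]U5:", u5, "bytes")
}
```

An array of U5 is five times as large as an array of U1, so any per-loop copy of it has five times as many bytes to move.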

In addition, a slice in Go stores its data in an underlying array. A for range over a slice still copies the slice, but the copy points to the same underlying array as the original. This means that if we traverse a slice instead of an array, both for range and the classic for loop deliver consistent, stable performance.
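The first example can be adapted to show this. The sketch below ranges over a[:] (a slice of the array) and over &a (a pointer to the array, which Go also allows as a range operand); in both cases only a small slice header or pointer is copied, so the loop observes the modified elements. The function names are made up for illustration.

```go
package main

import "fmt"

// rangeOverSlice (hypothetical) ranges over a slice of the array, so the
// loop reads elements from the same underlying array that it modifies.
func rangeOverSlice() (a, r [5]int) {
	a = [5]int{1, 2, 3, 4, 5}
	for i, v := range a[:] {
		if i == 0 {
			a[1] = 12
			a[2] = 13
		}
		r[i] = v
	}
	return a, r
}

// rangeOverPointer (hypothetical) ranges over a pointer to the array;
// only the pointer is copied, not the array elements.
func rangeOverPointer() (a, r [5]int) {
	a = [5]int{1, 2, 3, 4, 5}
	for i, v := range &a {
		if i == 0 {
			a[1] = 12
			a[2] = 13
		}
		r[i] = v
	}
	return a, r
}

func main() {
	_, r1 := rangeOverSlice()
	_, r2 := rangeOverPointer()
	fmt.Println("range a[:], r =", r1) // range a[:], r = [1 12 13 4 5]
	fmt.Println("range &a,   r =", r2) // range &a,   r = [1 12 13 4 5]
}
```

Unlike the array version, r now picks up 12 and 13, which confirms that the loop is reading from the original storage rather than from a full copy.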
