SoFunction
Updated on 2025-04-07

Why Go slices can produce duplicate data, and how to fix it

Problem description

In a Go server API, we need to page through interestCfg according to the curBatch parameter, selecting interestTagNum interest tags per batch, and randomly shuffle the selected tags before returning the result.

Example of the full interest tag list:

{
    "InterestTags": [
        {"interestName":"Daily Sharing"},
        {"interestName":"Gaming"},
        {"interestName":"AI"},
        {"interestName":"test"},
        {"interestName":"Sports"},
        {"interestName":"Cars"},
        {"interestName":"other"}
    ]
}

Symptoms

When curBatch = 0, the returned data is correct:

{
    "InterestTags": [
        { "interestName": "Daily Sharing" },
        { "interestName": "Gaming" },
        { "interestName": "AI" }
    ]
}

But when curBatch = 2, duplicate data appears in the test environment (local runs behave normally):

1. Without shuffling (correct result):

{
    "InterestTags": [
        { "interestName": "other" },
        { "interestName": "Daily Sharing" },
        { "interestName": "Gaming" }
    ]
}

2. After shuffling (incorrect result):

{
    "InterestTags": [
        { "interestName": "Gaming" },
        { "interestName": "Gaming" },
        { "interestName": "AI" }
    ]
}

Problems:

  • "Gaming" appears twice, and "test" disappears!
  • The local environment is normal, but the test environment is abnormal, making debugging difficult.

Troubleshooting

The data selection and shuffle logic is as follows:

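// InterestTag and the interestCfg field names below are assumed names.
// Requires the "math/rand" and "time" imports.
interestTags := make([]InterestTag, 0, interestCfg.InterestTagNum)

// Process interestConfig: select one batch according to curBatch.
if len(interestCfg.InterestTags) > 0 && interestCfg.InterestTagNum > 0 {
    interestAllTags := interestCfg.InterestTags
    numBatches := (len(interestAllTags) + int(interestCfg.InterestTagNum) - 1) / int(interestCfg.InterestTagNum)
    startIdx := (curBatch % numBatches) * int(interestCfg.InterestTagNum)
    endIdx := startIdx + int(interestCfg.InterestTagNum)

    if endIdx > len(interestAllTags) {
        // The batch runs past the end: take the tail, then wrap around to the head.
        interestTags = interestAllTags[startIdx:]
        interestTags = append(interestTags, interestAllTags[:(endIdx-len(interestAllTags))]...)
    } else {
        // Re-slicing only: interestTags shares the backing array of interestAllTags.
        interestTags = interestAllTags[startIdx:endIdx]
    }
}

// Randomly shuffle the order of interestTags (in place).
r := rand.New(rand.NewSource(time.Now().UnixNano()))
r.Shuffle(len(interestTags), func(i, j int) {
    interestTags[i], interestTags[j] = interestTags[j], interestTags[i]
})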
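To make the paging arithmetic easier to follow, here is a minimal sketch of the index computation on its own. The helper name batchBounds and its parameters are hypothetical and do not come from the original service; it only mirrors the numBatches, startIdx and endIdx formulas above. With 7 tags and 3 tags per batch, curBatch = 2 gives startIdx = 6 and endIdx = 9, so the wrap-around branch is taken.

```go
package main

import "fmt"

// batchBounds mirrors the index arithmetic of the handler.
// total is the number of tags and size the number of tags per batch;
// both are assumed to be greater than zero. Hypothetical helper.
func batchBounds(total, size, curBatch int) (startIdx, endIdx int) {
	numBatches := (total + size - 1) / size // ceiling division
	startIdx = (curBatch % numBatches) * size
	endIdx = startIdx + size
	return startIdx, endIdx
}

func main() {
	// 7 tags, 3 per batch, as in the example data.
	for curBatch := 0; curBatch < 4; curBatch++ {
		s, e := batchBounds(7, 3, curBatch)
		fmt.Printf("curBatch=%d start=%d end=%d wraps=%v\n", curBatch, s, e, e > 7)
	}
}
```

The last batch is the only one that wraps, which is why it picks "other" followed by the first two tags of the list.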

Key points

  1. interestTags = interestAllTags[startIdx:endIdx] takes the data directly from interestAllTags. A slice only refers to its underlying array, so interestTags shares the underlying array of interestAllTags.
  2. The shuffle swaps the elements of interestTags in place. Because interestTags points into interestAllTags, the original data may be modified by mistake.
  3. The local and test environments behave differently. This may be related to Go runtime memory management or to slice growth behavior under high concurrency. Because the shuffle mutates shared configuration data in place, the corruption is also more likely to surface in a long-running service that handles many (possibly concurrent) requests than in a short local run.
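The sharing described in points 1 and 2 can be reproduced in isolation. The following is a minimal sketch that uses plain strings instead of the real tag type; the function name demoAlias is hypothetical.

```go
package main

import "fmt"

// demoAlias re-slices a list and swaps two elements through the sub-slice.
// It returns both slices so the caller can compare them.
func demoAlias() (batch, all []string) {
	all = []string{"Daily Sharing", "Gaming", "AI", "test", "Sports", "Cars", "other"}

	// Re-slicing copies only the slice header; batch and all share one array.
	batch = all[3:6]

	// An in-place swap on batch is therefore visible through all as well.
	batch[0], batch[2] = batch[2], batch[0]
	return batch, all
}

func main() {
	batch, all := demoAlias()
	fmt.Println(batch) // [Cars Sports test]
	fmt.Println(all)   // [Daily Sharing Gaming AI Cars Sports test other]
}
```

A single-threaded swap like this only reorders the original list. Duplicated entries additionally require something like unsynchronized concurrent access to the same array, which this sketch deliberately does not attempt to reproduce.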

Code Verification

To verify whether interestTags shares the underlying array of interestAllTags, we print the memory address of each slice element:

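// tag.InterestName is an assumed field name, inferred from the JSON key "interestName".
// Requires the "fmt" import.
fmt.Println("Before Shuffle:")
for i, tag := range interestTags {
    fmt.Printf("[%d] %p: %s\n", i, &interestTags[i], tag.InterestName)
}

r.Shuffle(len(interestTags), func(i, j int) {
    interestTags[i], interestTags[j] = interestTags[j], interestTags[i]
})

fmt.Println("After Shuffle:")
for i, tag := range interestTags {
    fmt.Printf("[%d] %p: %s\n", i, &interestTags[i], tag.InterestName)
}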
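Printed addresses still have to be compared by eye against the addresses in interestAllTags. As a rough alternative, the check can be done in code by comparing element pointers directly. This is a minimal sketch; sharesBackingArray is a hypothetical helper, and it uses strings in place of the real tag type.

```go
package main

import "fmt"

// sharesBackingArray reports whether sub begins at element start of all,
// that is, whether both slices view the same memory. Hypothetical helper.
func sharesBackingArray(all, sub []string, start int) bool {
	if len(sub) == 0 || start < 0 || start >= len(all) {
		return false
	}
	return &sub[0] == &all[start]
}

func main() {
	all := []string{"a", "b", "c", "d"}

	view := all[1:3]                             // re-slice: same array
	copied := append([]string(nil), all[1:3]...) // appended to nil: new array

	fmt.Println(sharesBackingArray(all, view, 1))   // true
	fmt.Println(sharesBackingArray(all, copied, 1)) // false
}
```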

Solution

Solution 1: Use append to copy data

To keep interestTags from sharing the underlying array of interestAllTags, we need to copy the data explicitly:

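// Allocate a fresh backing array, then copy the selected elements into it.
// InterestTag and interestCfg.InterestTagNum are assumed names.
interestTags = make([]InterestTag, 0, interestCfg.InterestTagNum)
if endIdx > len(interestAllTags) {
    interestTags = append(interestTags, interestAllTags[startIdx:]...)
    interestTags = append(interestTags, interestAllTags[:(endIdx-len(interestAllTags))]...)
} else {
    interestTags = append(interestTags, interestAllTags[startIdx:endIdx]...)
}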

Why do this?

  • append(interestTags, interestAllTags[startIdx:endIdx]...) copies the elements into the array allocated by make, so interestTags no longer shares underlying data with interestAllTags.
  • With an independent copy, the shuffle only affects interestTags and does not corrupt the original interestAllTags.
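Putting the selection and the copy together, Solution 1 might look like the following self-contained sketch. The helper name pickBatch is hypothetical, strings stand in for the real tag type, and the code assumes 0 < size <= len(all) and curBatch >= 0.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickBatch returns an independent copy of one batch of size elements,
// wrapping around to the start of all when the batch runs past the end.
// Hypothetical helper; assumes 0 < size <= len(all) and curBatch >= 0.
func pickBatch(all []string, size, curBatch int) []string {
	numBatches := (len(all) + size - 1) / size
	startIdx := (curBatch % numBatches) * size
	endIdx := startIdx + size

	out := make([]string, 0, size) // own backing array
	if endIdx > len(all) {
		out = append(out, all[startIdx:]...)
		out = append(out, all[:endIdx-len(all)]...)
	} else {
		out = append(out, all[startIdx:endIdx]...)
	}
	return out
}

func main() {
	all := []string{"Daily Sharing", "Gaming", "AI", "test", "Sports", "Cars", "other"}
	batch := pickBatch(all, 3, 1)

	// Shuffling the copy leaves all untouched.
	rand.Shuffle(len(batch), func(i, j int) {
		batch[i], batch[j] = batch[j], batch[i]
	})
	fmt.Println(batch)
	fmt.Println(all)
}
```

Returning a copy from the helper keeps the shared configuration read-only from the caller's point of view, so later in-place operations such as shuffling or sorting cannot leak back into it.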

Summary

1. Cause of the problem

  • A Go slice refers to an underlying array. The direct assignment interestTags = interestAllTags[startIdx:endIdx] does not create new data; it shares the underlying array.
  • Shuffling interestTags may therefore modify interestAllTags, causing elements to be repeated.
  • The local environment is normal but the test environment is abnormal, which may be related to Go memory management and the slice growth strategy.

2. Solution

  • Use append to copy the data, so that interestTags is independent and the shuffle cannot affect the original interestAllTags.

Lessons learned

  1. A Go slice only refers to its underlying array. Assigning or re-slicing it copies the slice header, not the elements, so the underlying data is shared.
  2. Before shuffling (or otherwise modifying a slice in place), make sure the data is an independent copy.
  3. Prefer appending to a freshly allocated slice to create the copy, which avoids sharing the underlying array.
  4. When different environments behave inconsistently, check memory management, concurrency, and data structure side effects.
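One caveat on the append advice: append produces independent data only when it writes into a different array. The sketch below, with hypothetical function names, shows append on a re-slice that still has spare capacity overwriting the original, and a full slice expression (low:high:max) limiting the capacity so that append has to allocate.

```go
package main

import "fmt"

// appendToView appends to a plain re-slice of all. The re-slice keeps the
// spare capacity of all, so append writes into the shared array.
func appendToView(all []int) []int {
	head := all[:2]
	return append(head, 99)
}

// appendToCapped limits the capacity to the length with a full slice
// expression, so append must allocate a new array.
func appendToCapped(all []int) []int {
	head := all[:2:2]
	return append(head, 99)
}

func main() {
	a := []int{1, 2, 3, 4}
	_ = appendToView(a)
	fmt.Println(a) // [1 2 99 4]

	b := []int{1, 2, 3, 4}
	_ = appendToCapped(b)
	fmt.Println(b) // [1 2 3 4]
}
```

Appending to a slice created with make, as in Solution 1, avoids this issue because the destination already has its own array.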
