Swift source code analysis: weak references

Preface:

There are many articles in various communities on how Objective-C implements its weak mechanism. However, even though Swift has been out for a long time, there are still very few articles about its ABI, and it seems to be an area that many iOS developers have not explored. This article analyzes how Swift implements the weak mechanism at the source code level.

Preparation

Since Swift's code base is large, it is strongly recommended that you clone the repo and read this article alongside the source code.

$ git clone /apple/

The entire Swift project uses CMake as its build tool. If you want to open it in Xcode, you need to install LLVM first, and then use cmake -G to generate an Xcode project.

We are only reading the source here, so I simply use Visual Studio Code with the C/C++ plug-in, which also supports jumping to symbols and finding references. A word of warning: the C++ type hierarchy in the Swift stdlib is quite complicated, and it is hard to follow without the help of an IDE.

Main text

Next, we will officially enter the source code analysis stage. First, let’s take a look at the memory layout of objects (class instances) in Swift.

HeapObject

We know that Objective-C represents an object at runtime through objc_object, a type that defines the structure of the object's header in memory. Swift has a similar structure, HeapObject. Let's take a look at its definition:

struct HeapObject {
  /// This is always a valid pointer to a metadata object.
  HeapMetadata const *metadata;

  SWIFT_HEAPOBJECT_NON_OBJC_MEMBERS;

  HeapObject() = default;

  // Initialize a HeapObject header as appropriate for a newly-allocated object.
  constexpr HeapObject(HeapMetadata const *newMetadata)
    : metadata(newMetadata)
    , refCounts(InlineRefCounts::Initialized)
  { }

  // Initialize a HeapObject header for an immortal object
  constexpr HeapObject(HeapMetadata const *newMetadata,
                       InlineRefCounts::Immortal_t immortal)
    : metadata(newMetadata)
    , refCounts(InlineRefCounts::Immortal)
  { }

};

As you can see, the first field of HeapObject is a pointer to a HeapMetadata object. It plays a role similar to isa_t and describes the object's type (equivalent to the result of type(of:)), but in many cases Swift does not need it, for example when a method is statically dispatched.

Next is SWIFT_HEAPOBJECT_NON_OBJC_MEMBERS, a macro that expands to:

RefCounts<InlineRefCountBits> refCounts;

This field is very important. Strong reference counts, weak references, and unowned references are all tied to it, and it is also one of the more complex structures in a Swift object (from here on, "Swift object" in this article always means a reference type, that is, an instance of a class).

In fact, it is not as complicated as it sounds. We know that the Objective-C runtime makes heavy use of unions. For example, isa_t has a pointer form and a nonpointer form, and both occupy the same memory. The advantage is more efficient memory use, which matters for structures that are used this heavily and can greatly reduce runtime overhead. Similar techniques appear in the JVM, such as the mark word in the object header. The Swift ABI uses this technique widely as well.

RefCounts Type and Side Table

Let's see what the RefCounts type mentioned above actually is, starting with its definition:

template <typename RefCountBits>
class RefCounts {
 std::atomic<RefCountBits> refCounts;

 // ...

};

This is the memory layout of RefCounts; I have omitted all methods and nested type definitions. You can think of RefCounts as a thread-safe wrapper whose template parameter RefCountBits specifies the actual underlying type. The Swift ABI has two instantiations:

typedef RefCounts<InlineRefCountBits> InlineRefCounts;
typedef RefCounts<SideTableRefCountBits> SideTableRefCounts;

The former is used in HeapObject, while the latter is used in HeapObjectSideTableEntry (the Side Table). I will cover these two types one by one below.

Generally speaking, Swift objects do not have a Side Table. One is allocated once a weak reference to the object is formed; the unowned reference count, like the strong count, is kept inline in the object.

InlineRefCountBits

Definition:

typedef RefCountBitsT<RefCountIsInline> InlineRefCountBits;

template <RefCountInlinedness refcountIsInline>
class RefCountBitsT {

  friend class RefCountBitsT<RefCountIsInline>;
  friend class RefCountBitsT<RefCountNotInline>;

  static const RefCountInlinedness Inlinedness = refcountIsInline;

  typedef typename RefCountBitsInt<refcountIsInline, sizeof(void*)>::Type
    BitsType;
  typedef typename RefCountBitsInt<refcountIsInline, sizeof(void*)>::SignedType
    SignedBitsType;
  typedef RefCountBitOffsets<sizeof(BitsType)>
    Offsets;

  BitsType bits;

  // ...

};

After template substitution, InlineRefCountBits is really just a uint64_t (on 64-bit platforms). The pile of related types is there to make the code more readable (or less, hahaha).

Let's walk through what happens when an object's reference count is incremented by 1:

1. Call the runtime function swift::swift_retain:

HeapObject *swift::swift_retain(HeapObject *object) {
  return _swift_retain(object);
}

static HeapObject *_swift_retain_(HeapObject *object) {
  SWIFT_RT_TRACK_INVOCATION(object, swift_retain);
  if (isValidPointerForNativeRetain(object))
    object->refCounts.increment(1);
  return object;
}

auto swift::_swift_retain = _swift_retain_;

2. Call the increment method of RefCounts:

void increment(uint32_t inc = 1) {
  // 3. Atomically read out the InlineRefCountBits object (i.e., a uint64_t).
  auto oldbits = refCounts.load(SWIFT_MEMORY_ORDER_CONSUME);
  RefCountBits newbits;
  do {
    newbits = oldbits;
    // 4. Call the incrementStrongExtraRefCount method of InlineRefCountBits,
    //    which performs a series of bit operations on this uint64_t.
    bool fast = newbits.incrementStrongExtraRefCount(inc);
    // With no weak references (no Side Table), this branch is normally not taken.
    if (SWIFT_UNLIKELY(!fast)) {
      if (oldbits.isImmortal())
        return;
      return incrementSlow(oldbits, inc);
    }
    // 5. Write the computed uint64_t back through CAS.
  } while (!refCounts.compare_exchange_weak(oldbits, newbits,
                                            std::memory_order_relaxed));
}

This completes a retain operation.
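
The retain path boils down to a load / modify / compare-and-swap loop. Below is a minimal sketch of that pattern on a plain std::atomic<uint64_t>. The names ToyRefCount and kStrongShift, and the bit position chosen, are hypothetical and are not meant to match the real layout.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical layout: the strong count lives in the upper 32 bits.
constexpr unsigned kStrongShift = 32;

class ToyRefCount {
  std::atomic<uint64_t> bits{0};

public:
  // Returns false if the increment would overflow the count field; a real
  // implementation would then fall back to a slow path.
  bool increment(uint32_t inc = 1) {
    uint64_t oldbits = bits.load(std::memory_order_relaxed);
    uint64_t newbits;
    do {
      newbits = oldbits + (uint64_t{inc} << kStrongShift);
      if (newbits < oldbits)  // unsigned wrap-around means overflow
        return false;
      // If another thread changed bits in the meantime, compare_exchange_weak
      // reloads oldbits and the loop recomputes newbits.
    } while (!bits.compare_exchange_weak(oldbits, newbits,
                                         std::memory_order_relaxed));
    return true;
  }

  uint64_t strongCount() const {
    return bits.load(std::memory_order_relaxed) >> kStrongShift;
  }
};
```

The point of the loop is that no lock is taken: a thread that loses the race simply retries with the freshly observed value.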

SideTableRefCountBits

The above is the case where no weak reference exists. Now let's see what happens when a weak reference is added.

1. Call the runtime function swift::swift_weakAssign (this part is skipped for now; it belongs to the referencing side, and we analyze the referenced object first).
2. Call RefCounts<InlineRefCountBits>::formWeakReference to add a weak reference:

template <>
HeapObjectSideTableEntry* RefCounts<InlineRefCountBits>::formWeakReference()
{
  // Allocate a Side Table.
  auto side = allocateSideTable(true);
  if (side)
    // Add a weak reference.
    return side->incrementWeak();
  else
    return nullptr;
}

Let's take a look at the implementation of allocateSideTable:

template <>
HeapObjectSideTableEntry* RefCounts<InlineRefCountBits>::allocateSideTable(bool failIfDeiniting)
{
  auto oldbits = refCounts.load(SWIFT_MEMORY_ORDER_CONSUME);

  // If there is already a Side Table, or the object is being deinitialized,
  // return right away.
  if (oldbits.hasSideTable()) {
    return oldbits.getSideTable();
  }
  else if (failIfDeiniting && oldbits.getIsDeiniting()) {
    return nullptr;
  }

  // Allocate the Side Table object.
  HeapObjectSideTableEntry *side = new HeapObjectSideTableEntry(getHeapObject());

  auto newbits = InlineRefCountBits(side);

  do {
    if (oldbits.hasSideTable()) {
      // Another thread may have created a Side Table in the meantime.
      // Delete the one this thread allocated and return the existing one.
      auto result = oldbits.getSideTable();
      delete side;
      return result;
    }
    else if (failIfDeiniting && oldbits.getIsDeiniting()) {
      return nullptr;
    }

    // Initialize the Side Table with the current InlineRefCountBits.
    side->initRefCounts(oldbits);
    // Perform the CAS.
  } while (! refCounts.compare_exchange_weak(oldbits, newbits,
                                             std::memory_order_release,
                                             std::memory_order_relaxed));
  return side;
}

Remember that the RefCounts in HeapObject is really a wrapper around InlineRefCountBits? Once the Side Table has been created as above, the InlineRefCountBits in the object no longer holds the reference count itself but a pointer to the Side Table. Since both are just a uint64_t, something is needed to tell them apart. To see how, look at this constructor of InlineRefCountBits:

LLVM_ATTRIBUTE_ALWAYS_INLINE
RefCountBitsT(HeapObjectSideTableEntry* side)
  : bits((reinterpret_cast<BitsType>(side) >> Offsets::SideTableUnusedLowBits)
         | (BitsType(1) << Offsets::UseSlowRCShift)
         | (BitsType(1) << Offsets::SideTableMarkShift))
{
  assert(refcountIsInline);
}

In fact, it is the usual trick: the unused bits of the pointer are repurposed as flag bits.

While we are here, take a look at the structure of the Side Table:
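
To make the trick concrete, here is a small sketch that mirrors the shape of that constructor: shift away the always-zero low bits of an aligned pointer, then set two flag bits at the top. The constants and the names ToySideTable, encodeSideTable, hasSideTable and decodeSideTable are assumptions made for this illustration, not values taken from the runtime headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants, chosen only for this sketch.
constexpr unsigned kUnusedLowBits = 3;       // an 8-byte-aligned pointer has 3 zero low bits
constexpr unsigned kSideTableMarkShift = 62;
constexpr unsigned kUseSlowRCShift = 63;
constexpr uint64_t kFlagMask = (uint64_t{1} << kUseSlowRCShift) |
                               (uint64_t{1} << kSideTableMarkShift);

struct ToySideTable { alignas(8) uint64_t payload; };

// Pack: drop the always-zero low bits, then set the two flag bits.
inline uint64_t encodeSideTable(const ToySideTable *side) {
  auto raw = static_cast<uint64_t>(reinterpret_cast<uintptr_t>(side));
  return (raw >> kUnusedLowBits) | kFlagMask;
}

// A word refers to a side table only if both flag bits are set.
inline bool hasSideTable(uint64_t bits) {
  return (bits & kFlagMask) == kFlagMask;
}

// Unpack: clear the flag bits and shift the pointer back into place.
inline ToySideTable *decodeSideTable(uint64_t bits) {
  uint64_t raw = (bits & ~kFlagMask) << kUnusedLowBits;
  return reinterpret_cast<ToySideTable *>(static_cast<uintptr_t>(raw));
}
```

Shifting right by three frees the top three bits of the word, which is what leaves room for the flags without losing any address bits.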

class HeapObjectSideTableEntry {
  // FIXME: does object need to be atomic?
  std::atomic<HeapObject*> object;
  SideTableRefCounts refCounts;

  public:
  HeapObjectSideTableEntry(HeapObject *newObject)
    : object(newObject), refCounts()
  { }

  // ...

};

What happens if the reference count is incremented now? Let's go back to the RefCounts::increment method from before:

void increment(uint32_t inc = 1) {
  auto oldbits = refCounts.load(SWIFT_MEMORY_ORDER_CONSUME);
  RefCountBits newbits;
  do {
    newbits = oldbits;
    bool fast = newbits.incrementStrongExtraRefCount(inc);
    // --> This time the branch is taken.
    if (SWIFT_UNLIKELY(!fast)) {
      if (oldbits.isImmortal())
        return;
      return incrementSlow(oldbits, inc);
    }
  } while (!refCounts.compare_exchange_weak(oldbits, newbits,
                                            std::memory_order_relaxed));
}

template <typename RefCountBits>
void RefCounts<RefCountBits>::incrementSlow(RefCountBits oldbits,
                                            uint32_t n) {
  if (oldbits.isImmortal()) {
    return;
  }
  else if (oldbits.hasSideTable()) {
    auto side = oldbits.getSideTable();
    // ---> Then the call lands here.
    side->incrementStrong(n);
  }
  else {
    swift::swift_abortRetainOverflow();
  }
}

void HeapObjectSideTableEntry::incrementStrong(uint32_t inc) {
  // Finally we arrive here; refCounts is a RefCounts<SideTableRefCountBits> object.
  refCounts.increment(inc);
}

At this point we need to introduce SideTableRefCountBits. It is very similar to InlineRefCountBits above, except that it has one extra field. Let's take a look at the definition:

class SideTableRefCountBits : public RefCountBitsT<RefCountNotInline>
{
 uint32_t weakBits;

 // ...

};

A quick recap

I don't know whether the above has left you dizzy. It certainly took me a while to work through at first.

We have covered two kinds of RefCounts above. One is inline and is used in HeapObject. It is really just a uint64_t, which can hold either the reference counts or a pointer to a Side Table.

The Side Table is a structure whose class name is HeapObjectSideTableEntry. It also has a RefCounts member, whose underlying type is SideTableRefCountBits: essentially the original uint64_t plus a uint32_t that stores the weak reference count.
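
The two states can be modelled in a few lines. The sketch below is a deliberately simplified, single-threaded toy (no atomics, and a one-bit tag instead of the real flag bits). ToyObject and ToySideTable are hypothetical names, and the code only illustrates how a retain is routed once the inline word has been turned into a Side Table pointer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical side table: the migrated strong count plus a weak count.
struct ToySideTable {
  uint64_t strong = 0;
  uint32_t weak = 0;
};

struct ToyObject {
  // Bit 0 clear: the remaining bits hold the strong count inline.
  // Bit 0 set:   the remaining bits hold a ToySideTable pointer.
  uint64_t word = 0;

  bool hasSideTable() const { return (word & 1) != 0; }

  ToySideTable *sideTable() const {
    return reinterpret_cast<ToySideTable *>(
        static_cast<uintptr_t>(word & ~uint64_t{1}));
  }

  void retain() {
    if (hasSideTable())
      sideTable()->strong += 1;  // slow path: count lives in the side table
    else
      word += 2;                 // fast path: count is stored as (count << 1)
  }

  uint64_t strongCount() const {
    return hasSideTable() ? sideTable()->strong : (word >> 1);
  }

  // Forming a weak reference migrates the inline count into a side table.
  // In this toy the caller owns the returned table and must delete it.
  ToySideTable *formWeakReference() {
    if (!hasSideTable()) {
      auto *side = new ToySideTable;
      side->strong = word >> 1;
      word = static_cast<uint64_t>(reinterpret_cast<uintptr_t>(side)) | 1;
    }
    sideTable()->weak += 1;
    return sideTable();
  }
};
```

A retain before the first weak reference only touches the object's own word; every retain after it goes through the pointer.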

WeakReference

Everything above concerns the logic on the side of the referenced object. The logic on the referencing side is a little simpler and is mainly implemented by the WeakReference class, so let's just go through it quickly.

After SILGen, an assignment to a weak variable in Swift becomes a call to swift::swift_weakAssign, which is then dispatched to WeakReference::nativeAssign:

void nativeAssign(HeapObject *newObject) {
  if (newObject) {
    assert(objectUsesNativeSwiftReferenceCounting(newObject) &&
           "weak assign native with non-native new object");
  }

  // Have the referenced object create its Side Table.
  auto newSide =
    newObject ? newObject->refCounts.formWeakReference() : nullptr;
  auto newBits = WeakReferenceBits(newSide);

  // The familiar atomic update (here a relaxed load followed by a relaxed store).
  auto oldBits = nativeValue.load(std::memory_order_relaxed);
  nativeValue.store(newBits, std::memory_order_relaxed);

  assert(oldBits.isNativeOrNull() &&
         "weak assign native with non-native old object");
  // Destroy the weak reference to the previous object.
  destroyOldNativeBits(oldBits);
}

Loading through a weak reference is even simpler:

HeapObject *nativeLoadStrongFromBits(WeakReferenceBits bits) {
  auto side = bits.getNativeOrNull();
  return side ? side->tryRetain() : nullptr;
}

At this point, have you noticed a problem? Why can the Side Table still be accessed directly after the referenced object has been released? In the Swift ABI, the lifetime of the Side Table is decoupled from that of the object: when the strong reference count drops to 0, only the HeapObject is released.

The Side Table can only be released after all weak references have been released or the variables holding them have been set to nil. See:

void HeapObjectSideTableEntry::decrementWeak() {
  // FIXME: assertions
  // FIXME: optimize barriers
  bool cleanup = refCounts.decrementWeakShouldCleanUp();
  if (!cleanup)
    return;

  // Weak ref count is now zero. Delete the side table entry.
  // FREED -> DEAD
  assert(refCounts.getUnownedCount() == 0);
  delete this;
}

Therefore, even with weak references, there is no guarantee that all related memory is released: as long as the weak variable has not been explicitly set to nil, the Side Table stays alive. There is also room for improvement in the ABI here: when loading a weak variable reveals that the referenced object has already been released, the weak reference could destroy itself, avoiding repeated, pointless CAS operations later. Of course, even though the ABI does not perform this optimization, we can do it ourselves in Swift code. :)
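
Both points, the Side Table outliving the object and a weak reference that drops a dead Side Table on load, can be illustrated with a toy model. This is a single-threaded sketch under simplifying assumptions; ToySideTable, ToyObject, ToyWeakRef and toyRelease are hypothetical names and do not correspond to runtime APIs.

```cpp
#include <cassert>
#include <cstdint>

struct ToyObject;

// Hypothetical side table whose lifetime is separate from the object's.
struct ToySideTable {
  ToyObject *object;  // set to nullptr once the object is destroyed
  uint32_t strong;
  uint32_t weak;
};

struct ToyObject { ToySideTable *side; };

struct ToyWeakRef {
  ToySideTable *side = nullptr;

  void assign(ToySideTable *newSide) {
    if (newSide) newSide->weak += 1;
    release();
    side = newSide;
  }

  // Returns the object with its strong count bumped, or nullptr if it is dead.
  // On a dead object we drop our hold on the side table immediately, so later
  // loads return at the first check (the optimization described above).
  ToyObject *load() {
    if (!side) return nullptr;
    if (side->object) {
      side->strong += 1;
      return side->object;
    }
    release();
    return nullptr;
  }

  void release() {
    if (side && --side->weak == 0 && side->object == nullptr)
      delete side;  // last weak reference to a dead object frees the table
    side = nullptr;
  }
};

// Drops one strong reference. At zero the object is destroyed, but the side
// table survives for as long as weak references still point at it.
inline void toyRelease(ToyObject *obj) {
  ToySideTable *side = obj->side;
  if (--side->strong == 0) {
    side->object = nullptr;
    delete obj;
    if (side->weak == 0) delete side;
  }
}
```

Without the release() call inside load(), the Side Table in this model would stay allocated until the weak variable itself is reassigned or destroyed.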

Summary

The above is a simple analysis of how Swift implements its weak reference mechanism. As you can see, the idea is still very similar to the Objective-C runtime: both use Side Tables associated with objects to maintain reference counts. The difference is that an Objective-C object has no Side Table pointer in its memory layout; instead, a global StripedMap maintains the relationship between objects and Side Tables, which is less efficient than Swift's approach. In addition, the Objective-C runtime zeroes out all __weak variables when the object is released, while Swift does not.

Overall, Swift's implementation is a little simpler (although the code is more complex, since the Swift team pursues higher levels of abstraction). This is my first analysis of the Swift ABI, so treat this article as a reference only. If you find any errors, please feel free to correct them. Thanks!
