Updated on 2025-04-05

A Detailed Look at Fixing the iOS 16 CocoaAsyncSocket Crash

Background

After iOS 16 was released, our monitoring showed a large number of new crashes in CocoaAsyncSocket. The stack is consistent with the one reported in the issue referenced here:

  libsystem_platform.dylib      	       0x210a5e08c _os_unfair_lock_recursive_abort + 36
  libsystem_platform.dylib      	       0x210a58898 _os_unfair_lock_lock_slow + 280
  CoreFoundation                	       0x1c42953ec CFSocketInvalidate + 132
  CFNetwork                     	       0x1c54a4e24 0x1c533f000 + 1465892
  CoreFoundation                	       0x1c41db030 CFArrayApplyFunction + 72
  CFNetwork                     	       0x1c54829a0 0x1c533f000 + 1325472
  CoreFoundation                	       0x1c4242d20 _CFRelease + 316
  CoreFoundation                	       0x1c4295724 CFSocketInvalidate + 956
  CFNetwork                     	       0x1c548f478 0x1c533f000 + 1377400
  CoreFoundation                	       0x1c420799c _CFStreamClose + 108
  Test                   	               0x102ca5228 -[GCDAsyncSocket closeWithError:] + 452
  Test                   	               0x102ca582c __28-[GCDAsyncSocket disconnect]_block_invoke + 80
               	       0x1cb649fdc _dispatch_client_callout + 20
               	       0x1cb6599a8 _dispatch_sync_invoke_and_complete_recurse + 64
               	       0x1cb659428 _dispatch_sync_f_slow + 172
  Test                   	               0x102ca57b0 -[GCDAsyncSocket disconnect] + 164
  Test                   	               0x102db951c -[TestSocket forceDisconnect] + 312
  Test                   	               0x102cdfa5c -[TestSocket forceDisconnect] + 396
  Test                   	               0x102d6b748 __27-[TestSocketManager didConnectWith:]_block_invoke + 2004
               	       0x1cb6484b4 _dispatch_call_block_and_release + 32
               	       0x1cb649fdc _dispatch_client_callout + 20
               	       0x1cb651694 _dispatch_lane_serial_drain + 672
               	       0x1cb6521e0 _dispatch_lane_invoke + 384
               	       0x1cb65ce10 _dispatch_workloop_worker_thread + 652
  libsystem_pthread.dylib       	       0x210aecdf8 _pthread_wqthread + 288
  libsystem_pthread.dylib       	       0x210aecb98 start_wqthread + 8

The crash reason is BUG IN CLIENT OF LIBPLATFORM: Trying to recursively lock an os_unfair_lock. The cause is simple: the lock is acquired recursively. os_unfair_lock_lock treats a call as recursive when the lock's current owner is the calling thread. In theory, breaking this recursion is enough to fix the problem. At the top of the crash stack, CFSocketInvalidate in CoreFoundation calls os_unfair_lock in libsystem_platform.dylib. Calls between two dynamic libraries go through indirect binding, so the idea was to use fishhook to hook the lock function called from CoreFoundation and have the replacement check whether the owner is the current thread; if it is, the replacement returns immediately. Wouldn't that solve the crash? That reasoning led to the first solution below. (Note: Solutions 1 and 2 were eventually abandoned; Solution 3 has been verified to work.)

Solution 1: Replace os_unfair_lock_lock

This solution has two key steps: hook the lock function, and have the replacement lock function check whether the owner is the current thread. I assumed fishhook would take care of the first step. The second looked more challenging, so I started the research with the owner check, a decision I came to regret.

The system provides APIs in <os/> for checking a lock's current owner: os_unfair_lock_assert_owner and its counterpart os_unfair_lock_assert_not_owner, which is documented as follows:

/*!
 * @function os_unfair_lock_assert_not_owner
 *
 * @abstract
 * Asserts that the calling thread is not the current owner of the specified
 * unfair lock.
 *
 * @discussion
 * If the lock is unlocked or owned by a different thread, this function
 * returns.
 *
 * If the lock is currently owned by the current thread, this function asserts
 * and terminates the process.
 *
 * @param lock
 * Pointer to an os_unfair_lock.
 */
OS_UNFAIR_LOCK_AVAILABILITY
OS_EXPORT OS_NOTHROW OS_NONNULL_ALL
void os_unfair_lock_assert_not_owner(const os_unfair_lock *lock);

If the lock is held by another thread, this function simply returns. If the lock is held by the current thread, it asserts and terminates the process. Because it triggers a crash, this API cannot be called directly in our scenario. Fortunately, Apple has published this part of the source, so the owner check can be reimplemented by following it. Some of the TSD (thread-specific data) code involved needs extra handling, which I will not go into here. After that, I used fishhook to replace os_unfair_lock_lock globally and started testing.

os_unfair_lock_lock(&test_lock);
os_unfair_lock_lock(&test_lock);

The code above reliably reproduces the recursive-lock crash. After adding the hook, the crash disappeared. This was the first time I thought the problem was solved.
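The owner check in the replacement lock function can be sketched in portable C. This is only an illustration under stated assumptions: a pthread mutex stands in for os_unfair_lock, and the names owner_lock_t, owner_lock_acquire and owner_lock_release are hypothetical. The real hook would instead read the owner out of the lock word, following Apple's published source.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for os_unfair_lock: a mutex plus an owner record. */
typedef struct {
    pthread_mutex_t mutex;       /* the lock itself */
    pthread_mutex_t owner_guard; /* protects has_owner and owner */
    bool has_owner;
    pthread_t owner;
} owner_lock_t;

#define OWNER_LOCK_INIT { PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER, false }

/* True if the calling thread currently holds the lock. */
static bool owner_lock_held_by_self(owner_lock_t *lock) {
    pthread_mutex_lock(&lock->owner_guard);
    bool held = lock->has_owner && pthread_equal(lock->owner, pthread_self());
    pthread_mutex_unlock(&lock->owner_guard);
    return held;
}

/* Returns true if the lock was newly acquired, and false if the caller
 * already owns it, in which case the recursive attempt is skipped. */
bool owner_lock_acquire(owner_lock_t *lock) {
    assert(lock != NULL);
    if (owner_lock_held_by_self(lock)) {
        return false; /* recursive call: return instead of aborting */
    }
    pthread_mutex_lock(&lock->mutex);
    pthread_mutex_lock(&lock->owner_guard);
    lock->has_owner = true;
    lock->owner = pthread_self();
    pthread_mutex_unlock(&lock->owner_guard);
    return true;
}

/* Clears the owner record and releases the lock. */
void owner_lock_release(owner_lock_t *lock) {
    assert(lock != NULL);
    pthread_mutex_lock(&lock->owner_guard);
    lock->has_owner = false;
    pthread_mutex_unlock(&lock->owner_guard);
    pthread_mutex_unlock(&lock->mutex);
}
```

One caveat of this sketch: a skipped recursive acquire must not be paired with a release, otherwise the inner release would drop the lock that the outer caller still expects to hold.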

However, the test code lives in the main executable, while the crash occurs inside CoreFoundation. Can the lock call inside CoreFoundation be hooked? The answer is no. Colleagues on the business side later found a way to reproduce the crash reliably. At the top of the crash stack, CFSocketInvalidate calls the lock function like this: 0x1ba8b13e8 bl 0x1c0155a60. This is not the familiar call through a symbol stub, so fishhook has no effect. Calls between dynamic libraries had always been a blind spot for me, I did not know where to start, and the hook solution was abandoned.

    0x1ba8b13d0 <+104>:  tbz    w8, #0x0, 0x1ba8b13d8     ; <+112>
    0x1ba8b13d4 <+108>:  bl     0x1ba920e7c               ; __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__
    0x1ba8b13d8 <+112>:  mov    x0, x19
    0x1ba8b13dc <+116>:  bl     0x1ba860e34               ; CFRetain
    0x1ba8b13e0 <+120>:  adrp   x0, 354829
    0x1ba8b13e4 <+124>:  add    x0, x0, #0x900            ; __CFAllSocketsLock
    0x1ba8b13e8 <+128>:  bl     0x1c0155a60
->  0x1ba8b13ec <+132>:  add    x20, x19, #0x18
    0x1ba8b13f0 <+136>:  mov    x0, x20
    0x1ba8b13f4 <+140>:  bl     0x1ba99c984               ; symbol stub for: pthread_mutex_lock
->  0x1c0155a60: adrp   x16, 290593
    0x1c0155a64: add    x16, x16, #0x3b0          ; os_unfair_lock_lock
    0x1c0155a68: br     x16
    0x1c0155a6c: brk    #0x1
    0x1c0155a70: adrp   x16, 290593
    0x1c0155a74: add    x16, x16, #0x4e0          ; os_unfair_lock_lock_with_options
    0x1c0155a78: br     x16
    0x1c0155a7c: brk    #0x1
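Why fishhook has no effect here can be pictured with a small sketch in plain C. It is only an analogy under stated assumptions: lock_slot is a hypothetical stand-in for a rebindable symbol pointer, rebind_slot stands in for what a fishhook-style rebind does to that pointer, and call_direct stands in for a branch whose target is computed without loading any such pointer, as in the listing above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the original lock function and its replacement. */
static int real_lock(void)        { return 1; }
static int replacement_lock(void) { return 2; }

/* Hypothetical rebindable slot: callers that go through it can be redirected. */
static int (*lock_slot)(void) = real_lock;

/* A call that loads its target from the slot (the hookable case). */
int call_via_slot(void) { return lock_slot(); }

/* A call whose target is fixed in the code itself (the unhookable case). */
int call_direct(void) { return real_lock(); }

/* What a rebind amounts to: overwrite the slot, leave the code untouched. */
void rebind_slot(int (*replacement)(void)) {
    assert(replacement != NULL);
    lock_slot = replacement;
}
```

After rebind_slot(replacement_lock), call_via_slot reaches the replacement while call_direct still reaches real_lock, which mirrors the situation inside CoreFoundation.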

Debugging on an iOS 15 device showed that iOS 15 uses pthread_mutex_lock for this lock, while iOS 16 replaced it with os_unfair_lock. This change is probably what introduced the crash. Since the problem cannot be fixed at the lock itself, we need to analyze why the recursive call happens.

Solution 2: Remove _socket from _schedulables

The CFNetwork symbols in the crash stack were not symbolicated, and Xcode could not resolve them during local debugging either. The stack captured by Xcode is as follows:

#0	0x000000020707a08c in _os_unfair_lock_recursive_abort ()
#1	0x0000000207074898 in _os_unfair_lock_lock_slow ()
#2	0x00000001ba8b13ec in CFSocketInvalidate ()
#3	0x00000001bbac0e24 in ___lldb_unnamed_symbol8533 ()
#4	0x00000001ba7f7030 in CFArrayApplyFunction ()
#5	0x00000001bba9e9a0 in ___lldb_unnamed_symbol7940 ()
#6	0x00000001ba85ed20 in _CFRelease ()
#7	0x00000001ba8b1724 in CFSocketInvalidate ()
#8	0x00000001bbaab478 in ___lldb_unnamed_symbol8050 ()
#9	0x00000001ba82399c in _CFStreamClose ()
#10	0x000000010844e934 in -[GCDAsyncSocket closeWithError:] at /Users/yuencong/workplace/gif2/.gundam/Pods/CocoaAsyncSocket/Source/GCD/:3213
#11	0x0000000108456b8c in -[GCDAsyncSocket maybeDequeueWrite] at /Users/yuencong/workplace/gif2/.gundam/Pods/CocoaAsyncSocket/Source/GCD/:5976
#12	0x0000000108457584 in __29-[GCDAsyncSocket doWriteData]_block_invoke at /Users/yuencong/workplace/gif2/.gundam/Pods/CocoaAsyncSocket/Source/GCD/:6317
#13	0x00000001c1c644b4 in _dispatch_call_block_and_release ()
#14	0x00000001c1c65fdc in _dispatch_client_callout ()
#15	0x00000001c1c6d694 in _dispatch_lane_serial_drain ()
#16	0x00000001c1c6e1e0 in _dispatch_lane_invoke ()
#17	0x00000001c1c78e10 in _dispatch_workloop_worker_thread ()
#18	0x0000000207108df8 in _pthread_wqthread ()

From this stack we can roughly tell why it crashes: CFSocketInvalidate is executed twice, CFSocketInvalidate calls os_unfair_lock_lock, and executing os_unfair_lock_lock twice on the same thread results in lock recursion. To pin down the cause more precisely, the unresolved symbols need to be identified.

#8 Unresolved symbol: ___lldb_unnamed_symbol8050

_CFStreamClose calls ___lldb_unnamed_symbol8050, and ___lldb_unnamed_symbol8050 makes the first call to CFSocketInvalidate.

The source code of _CFStreamClose, which the crash stack places in CoreFoundation, is as follows:

CF_PRIVATE void _CFStreamClose(struct _CFStream *stream) {
    CFStreamStatus status = _CFStreamGetStatus(stream);
    const struct _CFStreamCallBacks *cb = _CFStreamGetCallBackPtr(stream);
    if (status == kCFStreamStatusNotOpen || status == kCFStreamStatusClosed || (status == kCFStreamStatusError && __CFBitIsSet(stream->flags, HAVE_CLOSED))) {
        // Stream is not open from the client's perspective; do not callout and do not update our status to "closed"
        return;
    }
    if (! __CFBitIsSet(stream->flags, HAVE_CLOSED)) {
        __CFBitSet(stream->flags, HAVE_CLOSED);
        __CFBitSet(stream->flags, CALLING_CLIENT);
        if (cb->close) {
            cb->close(stream, _CFStreamGetInfoPointer(stream));
        }
        if (stream->client) {
            _CFStreamDetachSource(stream);
        }
        _CFStreamSetStatusCode(stream, kCFStreamStatusClosed);
        __CFBitClear(stream->flags, CALLING_CLIENT);
    }
}

Judging from the Xcode debugging information, ___lldb_unnamed_symbol8050 is most likely the cb->close callback. To confirm this, I mapped the _CFStream data structure and replaced cb->close:

struct _CFStream {
    CFRuntimeBase _cfBase;
    CFOptionFlags flags;
    CFErrorRef error; // if callBacks->version < 2, this is actually a pointer to a CFStreamError
    struct _CFStreamClient *client;
    /* NOTE: CFNetwork is still using _CFStreamGetInfoPointer, and so this slot needs to stay in this position (as the fifth field in the structure) */
    /* NOTE: This can be taken out once CFNetwork rebuilds */
    /* NOTE: <rdar://problem/13678879> Remove comment once CFNetwork has been rebuilt */
    void *info;
    const struct _CFStreamCallBacks *callBacks;  // This will not exist (will not be allocated) if the callbacks are from our known, "blessed" set.
    CFLock_t streamLock;
    CFArrayRef previousRunloopsAndModes;
    dispatch_queue_t queue;
};

Pointing the close pointer of callBacks at _new_SocketStreamClose confirms beyond doubt that ___lldb_unnamed_symbol8050 is the cb->close call:

void (*_origin_SocketStreamClose)(CFTypeRef stream, void* ctxt);
void _new_SocketStreamClose(CFTypeRef stream, void* ctxt) {
  _origin_SocketStreamClose(stream, ctxt);
}
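The pointer-replacement trick used for this confirmation can be illustrated with a self-contained sketch. The stream_callbacks_t type and every function name below are hypothetical stand-ins, not the real _CFStreamCallBacks layout. The only point is that swapping the pointer lets a wrapper observe the call and then forward it to the original.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback table, standing in for the stream's callBacks. */
typedef struct {
    void (*close)(void *stream, void *info);
} stream_callbacks_t;

static int g_original_close_calls = 0;
static int g_wrapper_calls = 0;

/* Stand-in for the function the table originally points to. */
static void original_close(void *stream, void *info) {
    (void)stream;
    (void)info;
    g_original_close_calls++;
}

/* The pointer that was in the table before the wrapper was installed. */
static void (*g_saved_close)(void *stream, void *info) = NULL;

/* Wrapper: record the call (a breakpoint could go here), then forward. */
static void wrapper_close(void *stream, void *info) {
    g_wrapper_calls++;
    g_saved_close(stream, info);
}

/* Save the current close pointer and install the wrapper in its place. */
void install_close_wrapper(stream_callbacks_t *cb) {
    assert(cb != NULL && cb->close != NULL);
    g_saved_close = cb->close;
    cb->close = wrapper_close;
}
```

If a breakpoint in the wrapper shows the same caller as the unnamed frame in the crash stack, the unnamed symbol is the function the table originally pointed to.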

Reading further through the CFNetwork source, I eventually found that cb->close points to the function SocketStreamClose. The function is fairly long, so we only look at the part that makes the first call to CFSocketInvalidate:

if (ctxt->_socket) {
    /* Make sure to invalidate the socket */
    CFSocketInvalidate(ctxt->_socket);
    /* Dump and forget it. */
    CFRelease(ctxt->_socket);
    ctxt->_socket = NULL;
}

ctxt is obtained through _CFStreamGetInfoPointer, and the value it returns is the stream's info. The open-source code gives the following data structure for info:

typedef struct {
	CFSpinLock_t				_lock;				/* Protection for read-half versus write-half */
	UInt32						_flags;
	CFStreamError				_error;
	CFReadStreamRef				_clientReadStream;
	CFWriteStreamRef			_clientWriteStream;
	CFSocketRef					_socket;			/* Actual underlying CFSocket */
        CFMutableArrayRef			_readloops;
        CFMutableArrayRef			_writeloops;
        CFMutableArrayRef			_sharedloops;
	CFMutableArrayRef			_schedulables;		/* Items to be scheduled (. socket, reachability, host, etc.) */
	CFMutableDictionaryRef		_properties;		/* Host and port and reachability should be here too. */
} _CFSocketStreamContext;

This data structure changed in iOS 16, but while debugging, the offsets of _socket and _schedulables can be found in lldb with memory read. _schedulables is also a key value: it comes up again when analyzing the second call to CFSocketInvalidate.

Summary: the first call to CFSocketInvalidate is made inside SocketStreamClose, and its argument is stream->info->_socket.

#3 Unresolved symbol: ___lldb_unnamed_symbol8533

The second call to CFSocketInvalidate is made inside ___lldb_unnamed_symbol8533, whose assembly is as follows:

CFNetwork`___lldb_unnamed_symbol8533:
    0x1bbac0e00 <+0>:   pacibsp 
    0x1bbac0e04 <+4>:   stp    x20, x19, [sp, #-0x20]!
    0x1bbac0e08 <+8>:   stp    x29, x30, [sp, #0x10]
    0x1bbac0e0c <+12>:  add    x29, sp, #0x10
    0x1bbac0e10 <+16>:  mov    x19, x0
    0x1bbac0e14 <+20>:  bl     0x1c015b020
    0x1bbac0e18 <+24>:  mov    x20, x0
    0x1bbac0e1c <+28>:  mov    x0, x19
    0x1bbac0e20 <+32>:  bl     0x1bba0f498               ; ___lldb_unnamed_symbol5324
->  0x1bbac0e24 <+36>:  adrp   x8, 348073
    0x1bbac0e28 <+40>:  ldr    x8, [x8, #0x4a0]
    0x1bbac0e2c <+44>:  cmn    x8, #0x1
    0x1bbac0e30 <+48>:     0x1bbac0ea4               ; <+164>
    0x1bbac0e34 <+52>:  adrp   x8, 348073
    0x1bbac0e38 <+56>:  ldr    x8, [x8, #0x4c0]
    0x1bbac0e3c <+60>:  ldr    x8, [x8, #0x60]
    0x1bbac0e40 <+64>:  cmp    x8, x20
    0x1bbac0e44 <+68>:     0x1bbac0e6c               ; <+108>
    0x1bbac0e48 <+72>:  mov    x0, x19
    0x1bbac0e4c <+76>:  mov    w1, #0x0
    0x1bbac0e50 <+80>:  ldp    x29, x30, [sp, #0x10]
    0x1bbac0e54 <+84>:  ldp    x20, x19, [sp], #0x20
    0x1bbac0e58 <+88>:  autibsp 
    0x1bbac0e5c <+92>:  eor    x16, x30, x30, lsl #1
    0x1bbac0e60 <+96>:  tbz    x16, #0x3e, 0x1bbac0e68   ; <+104>
    0x1bbac0e64 <+100>: brk    #0xc471
    0x1bbac0e68 <+104>: b      0x1bba16948               ; CFHostCancelInfoResolution
    0x1bbac0e6c <+108>: bl     0x1bba108f0               ; CFNetServiceGetTypeID
    0x1bbac0e70 <+112>: cmp    x0, x20
    0x1bbac0e74 <+116>:    0x1bbac0e98               ; <+152>
    0x1bbac0e78 <+120>: mov    x0, x19
    0x1bbac0e7c <+124>: ldp    x29, x30, [sp, #0x10]
    0x1bbac0e80 <+128>: ldp    x20, x19, [sp], #0x20
    0x1bbac0e84 <+132>: autibsp 
    0x1bbac0e88 <+136>: eor    x16, x30, x30, lsl #1
    0x1bbac0e8c <+140>: tbz    x16, #0x3e, 0x1bbac0e94   ; <+148>
    0x1bbac0e90 <+144>: brk    #0xc471
    0x1bbac0e94 <+148>: b      0x1bba12ef8               ; CFNetServiceCancel
    0x1bbac0e98 <+152>: ldp    x29, x30, [sp, #0x10]
    0x1bbac0e9c <+156>: ldp    x20, x19, [sp], #0x20
    0x1bbac0ea0 <+160>: retab  
    0x1bbac0ea4 <+164>: adrp   x0, 348073
    0x1bbac0ea8 <+168>: add    x0, x0, #0x4a0
    0x1bbac0eac <+172>: adrp   x1, 356609
    0x1bbac0eb0 <+176>: add    x1, x1, #0xaa8
    0x1bbac0eb4 <+180>: bl     0x1bbbd3b80               ; symbol stub for: dispatch_once
    0x1bbac0eb8 <+184>: b      0x1bbac0e34               ; <+52>

A few key features stand out: the function calls CFSocketInvalidate near the beginning and later calls CFHostCancelInfoResolution, CFNetServiceGetTypeID and so on. Going by these, a closely matching function turns up in the CFNetwork source, _SchedulablesInvalidateApplierFunction:

/* static */ void
_SchedulablesInvalidateApplierFunction(CFTypeRef obj, void* context) {
	(void)context;  /* unused */
	CFTypeID type = CFGetTypeID(obj);
	/* Invalidate the process. */
	_CFTypeInvalidate(obj);
	/* For CFHost and CFNetService, make sure to cancel too. */
	if (CFHostGetTypeID() == type)
		CFHostCancelInfoResolution((CFHostRef)obj, kCFHostAddresses);
	else if (CFNetServiceGetTypeID() == type)
		CFNetServiceCancel((CFNetServiceRef)obj);
}

_CFTypeInvalidate checks the CF type of the object, and if it matches CFSocketGetTypeID it calls CFSocketInvalidate. A search of CFNetwork finds two calls to _SchedulablesInvalidateApplierFunction. Both are made the same way with the same arguments: what gets passed in are the items of the ctxt->_schedulables array, where ctxt is the info field of the stream.

CFArrayApplyFunction(ctxt->_schedulables, r, (CFArrayApplierFunction)_SchedulablesInvalidateApplierFunction, NULL);

Summary: the second call to CFSocketInvalidate is made inside _SchedulablesInvalidateApplierFunction, and its argument is an item contained in stream->info->_schedulables.

Logical analysis

The two calls that make up the recursion:

CFSocketInvalidate(stream->info->_socket)

CFSocketInvalidate(stream->info->_schedulables item)

info->_socket is a CFSocketRef object, and the object being operated on when the crash occurs is a CFSocketRef in the _schedulables array, which means _schedulables also contains CFSocketRef objects. Both are held by info, so what is the relationship between the CFSocketRef in _schedulables and _socket? If they are the same object, running CFSocketInvalidate on it a second time is pointless. Removing the _socket object from _schedulables would then break the recursion and solve the problem.

I tried mapping the data structure of stream->info. Note that in iOS 16 the _schedulables field of _CFSocketStreamContext is a pointer to a pointer, which does not match the data structure in the CFNetwork source and makes it harder to locate in memory. In the end it turns out that the CFSocketRef contained in info->_schedulables is indeed info->_socket.

To try the fix, I mapped info to get _schedulables. When the crash occurs, _schedulables contains only one element, _socket, so I took the blunt approach of calling RemoveAll directly. At this point I thought, for the second time, that the problem was solved:

CFArrayRemoveAllValues(stream->info->_schedulables)

Then the nightmare began. Many call sites that use _schedulables do not check whether the array is empty, and they crashed as a result. The following code is one example:

CFArrayApplyFunction(ctxt->_schedulables,
                     CFRangeMake(0, CFArrayGetCount(ctxt->_schedulables)),
                     (CFArrayApplierFunction)_SchedulablesScheduleApplierFunction,
                     loopAndMode);

I bypassed these crashes in a very dirty way, but the original lock-recursion crash still reproduced. The operation at the top of the stack does involve a _schedulables array, but the array actually being operated on at the time of the final crash is not stream->info->_schedulables, so removing _socket from _schedulables cannot work. It would be possible to keep digging into where the array at the top of the stack comes from, but that is considerably harder. On top of that, the missing empty checks on the array trigger new crashes, and clearing the array at the top of the stack carries its own risks. Reluctant as I was to give up on this path, I shelved it for the time being. After all, fixing the problem quickly was what mattered.

Solution 3: _CFRelease

Although Solution 2 did not fix the problem, it did give us a more complete call stack:

#0	0x000000020707a08c in _os_unfair_lock_recursive_abort ()
#1	0x0000000207074898 in _os_unfair_lock_lock_slow ()
#2	0x00000001ba8b13ec in CFSocketInvalidate ()
#3	0x00000001bbac0e24 in _SchedulablesInvalidateApplierFunction ()
#4	0x00000001ba7f7030 in CFArrayApplyFunction ()
#5	0x00000001bba9e9a0 in ___lldb_unnamed_symbol7940 ()
#6	0x00000001ba85ed20 in _CFRelease ()
#7	0x00000001ba8b1724 in CFSocketInvalidate ()
#8	0x00000001bbaab478 in _SocketStreamClose ()
#9	0x00000001ba82399c in _CFStreamClose ()
#10	0x000000010844e934 in -[GCDAsyncSocket closeWithError:] at /Users/yuencong/workplace/gif2/.gundam/Pods/CocoaAsyncSocket/Source/GCD/:3213
#11	0x0000000108456b8c in -[GCDAsyncSocket maybeDequeueWrite] at /Users/yuencong/workplace/gif2/.gundam/Pods/CocoaAsyncSocket/Source/GCD/:5976
#12	0x0000000108457584 in __29-[GCDAsyncSocket doWriteData]_block_invoke at /Users/yuencong/workplace/gif2/.gundam/Pods/CocoaAsyncSocket/Source/GCD/:6317
#13	0x00000001c1c644b4 in _dispatch_call_block_and_release ()
#14	0x00000001c1c65fdc in _dispatch_client_callout ()
#15	0x00000001c1c6d694 in _dispatch_lane_serial_drain ()
#16	0x00000001c1c6e1e0 in _dispatch_lane_invoke ()
#17	0x00000001c1c78e10 in _dispatch_workloop_worker_thread ()
#18	0x0000000207108df8 in _pthread_wqthread ()

Studying this stack further, one thing looks very odd: _CFRelease in CoreFoundation calls ___lldb_unnamed_symbol7940 in CFNetwork. CoreFoundation should be the lower-level library, and it should not be calling into CFNetwork. Let's look at the _CFRelease call inside CFSocketInvalidate. The code is fairly long, so only the key parts are excerpted:

void CFSocketInvalidate(CFSocketRef s) {
    CFRetain(s);
    __CFLock(&__CFAllSocketsLock);
    __CFSocketLock(s);
    if (__CFSocketIsValid(s)) {        
        contextInfo = s->_context.info;
        contextRelease = s->_context.release;
        // Do this after the socket unlock to avoid deadlock (10462525)
        for (idx = CFArrayGetCount(runLoops); idx--;) {
            CFRunLoopWakeUp((CFRunLoopRef)CFArrayGetValueAtIndex(runLoops, idx));
        }
        CFRelease(runLoops);
        if (NULL != contextRelease) {
            contextRelease(contextInfo);
        }
        if (NULL != source0) {
            CFRunLoopSourceInvalidate(source0);
            CFRelease(source0);
        }
    } else {
        __CFSocketUnlock(s);
    }
    __CFUnlock(&__CFAllSocketsLock);
    CFRelease(s);
}

Combined with Xcode debugging information:

    0x1ba8b16fc <+916>:  bl     0x1ba862870               ; CFArrayGetValueAtIndex
    0x1ba8b1700 <+920>:  bl     0x1ba8945a0               ; CFRunLoopWakeUp
    0x1ba8b1704 <+924>:  sub    x24, x24, #0x1
    0x1ba8b1708 <+928>:  subs   w20, w20, #0x1
    0x1ba8b170c <+932>:     0x1ba8b16f4               ; <+908>
    0x1ba8b1710 <+936>:  mov    x0, x22
    0x1ba8b1714 <+940>:  bl     0x1ba860cec               ; CFRelease
    0x1ba8b1718 <+944>:  cbz    x25, 0x1ba8b1724          ; <+956>
    0x1ba8b171c <+948>:  mov    x0, x23
    0x1ba8b1720 <+952>:  blraaz x25
->  0x1ba8b1724 <+956>:  cbz    x21, 0x1ba8b1738          ; <+976>
    0x1ba8b1728 <+960>:  mov    x0, x21
    0x1ba8b172c <+964>:  bl     0x1ba8b1a54               ; CFRunLoopSourceInvalidate
    0x1ba8b1730 <+968>:  mov    x0, x21
    0x1ba8b1734 <+972>:  bl     0x1ba860cec               ; CFRelease
    0x1ba8b1738 <+976>:  adrp   x0, 354829
    0x1ba8b173c <+980>:  add    x0, x0, #0x900            ; __CFAllSocketsLock

After CFRelease returns, CFRunLoopSourceInvalidate is executed next, so I took this CFRelease to be CFRelease(source0). Assuming source0 was an array, I naively believed at the time that ___lldb_unnamed_symbol7940 was a callback installed through CFArrayReleaseCallBack, which made the call chain look reasonable. CFRelease itself cannot be hooked, but could the recursion be broken by modifying that callback? Trying it showed that this does not work either. A breakpoint on CFRelease revealed that the object being released is of type SocketStream, not the source0 array I had assumed. Searching CFSocketInvalidate for an object of type SocketStream leads to s->_context.info, and following that clue leads to the three lines of code that are the key to solving this problem:

if (NULL != contextRelease) {
    contextRelease(contextInfo);
}

According to the Xcode debugging information, contextRelease == CFRelease, and in the code contextRelease takes its value from s->_context.release. So once we have the data structure of s->_context, modifying the release pointer gives us a hook on the CFRelease in the crash stack. The two CFSocketInvalidate calls that cause the lock recursion happen before and after this CFRelease. If the CFRelease is turned into an asynchronous call, the two CFSocketInvalidate calls reach os_unfair_lock_lock on two different threads. A lock call is only treated as recursive when the lock's current owner is the calling thread, and the lock is now taken on different threads, so the problem is solved. I will not describe the process of mapping the stream and socket structures in detail, as it is tedious. Here is the result:

struct __CFSocket {
    int64_t offset[27];
    CFSocketContext _context;    /* immutable */
};
typedef struct {
    int64_t offset[33];
    struct __CFSocket *          _socket;
} __CFSocketStreamContext;
struct __CFStream {
    int64_t offset[5];
    __CFSocketStreamContext *info;
};
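The int64_t offset[N] arrays above are padding: they skip the bytes the fix does not interpret, so that the one field of interest lands at the right byte offset. A layout like this can be sanity-checked with offsetof. The sketch below uses hypothetical overlay_* names and a stand-in for CFSocketContext, and assumes 8-byte pointers. The array sizes are taken from the structures above and apply only to the system build the author inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for CFSocketContext; only its shape matters for this sketch. */
typedef struct {
    long version;
    void *info;
    const void *(*retain)(const void *info);
    void (*release)(const void *info);
    const void *(*copyDescription)(const void *info);
} overlay_socket_context_t;

typedef struct {
    int64_t skipped[27];              /* 27 * 8 = 216 uninterpreted bytes */
    overlay_socket_context_t context;
} overlay_socket_t;

typedef struct {
    int64_t skipped[33];              /* 33 * 8 = 264 uninterpreted bytes */
    overlay_socket_t *socket;
} overlay_stream_context_t;

typedef struct {
    int64_t skipped[5];               /* 5 * 8 = 40 uninterpreted bytes */
    overlay_stream_context_t *info;
} overlay_stream_t;

/* Compile-time check that the padding puts info where we expect it. */
static_assert(offsetof(overlay_stream_t, info) == 5 * sizeof(int64_t),
              "info must directly follow the padding");

/* Read a pointer-sized value located byte_offset bytes past base. */
void *read_pointer_at(const void *base, size_t byte_offset) {
    assert(base != NULL);
    void *value;
    memcpy(&value, (const char *)base + byte_offset, sizeof value);
    return value;
}
```

Reading the field through the overlay struct and reading it with read_pointer_at at the same byte offset are equivalent; the struct form simply keeps the offsets in one place.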

The final solution is summarized in the code below. Because it maps several private system data structures, it is not a safe operation, and the memory has to be checked for readability and writability first. For that part of the code, refer to KSCrash. In addition, the business layer must add a switch (this cannot be stressed enough) so that the fix only takes effect on specific system versions. If the stream or socket data structures change in a new system version, these memory accesses may crash.

#include <stdbool.h>
#include <stdint.h>
#include <mach/mach.h>
#include <dispatch/dispatch.h>
#include <CoreFoundation/CoreFoundation.h>

// Memory protection (see KSCrash): copy through the kernel so that an
// unreadable address produces an error code instead of a fault.
static inline int copySafely(const void* restrict const src, void* restrict const dst, const int byteCount)
{
    vm_size_t bytesCopied = 0;
    kern_return_t result = vm_read_overwrite(mach_task_self(),
                                             (vm_address_t)src,
                                             (vm_size_t)byteCount,
                                             (vm_address_t)dst,
                                             &bytesCopied);
    if(result != KERN_SUCCESS)
    {
        return 0;
    }
    return (int)bytesCopied;
}
static char g_memoryTestBuffer[10240];
static inline bool isMemoryReadable(const void* const memory, const int byteCount)
{
    const int testBufferSize = sizeof(g_memoryTestBuffer);
    const uint8_t *cursor = (const uint8_t *)memory;
    int bytesRemaining = byteCount;
    while(bytesRemaining > 0)
    {
        int bytesToCopy = bytesRemaining > testBufferSize ? testBufferSize : bytesRemaining;
        if(copySafely(cursor, g_memoryTestBuffer, bytesToCopy) != bytesToCopy)
        {
            break;
        }
        // Advance so that ranges larger than the test buffer are fully checked.
        cursor += bytesToCopy;
        bytesRemaining -= bytesToCopy;
    }
    return bytesRemaining == 0;
}

// Asynchronous CFRelease
static dispatch_queue_t socket_context_release_queue = nil;
void (*origin_context_release)(const void *info);
void new_context_release(const void *info) {
    // Create the serial queue exactly once, even if several sockets race here.
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        socket_context_release_queue = dispatch_queue_create("socketContextReleaseQueue", DISPATCH_QUEUE_SERIAL);
    });
    // Run the original release on another thread to break the same-thread recursion.
    dispatch_async(socket_context_release_queue, ^{
        origin_context_release(info);
    });
}

// In CocoaAsyncSocket: patch the release callback reachable from writeStream
if (@available(iOS 16.0, *)) {
    struct __CFStream *cfstream  = (struct __CFStream *)writeStream;
    if (isMemoryReadable(cfstream, sizeof(*cfstream))
       && isMemoryReadable(cfstream->info, sizeof(*(cfstream->info)))
       && isMemoryReadable(cfstream->info->_socket, sizeof(*(cfstream->info->_socket)))
       && isMemoryReadable(&(cfstream->info->_socket->_context), sizeof(cfstream->info->_socket->_context))
       && isMemoryReadable(cfstream->info->_socket->_context.release, sizeof(*(cfstream->info->_socket->_context.release)))) {
        if (cfstream->info != NULL && cfstream->info->_socket != NULL) {
            // Only patch when the callback is the expected CFRelease.
            if ((uintptr_t)cfstream->info->_socket->_context.release == (uintptr_t)CFRelease) {
                origin_context_release = cfstream->info->_socket->_context.release;
                cfstream->info->_socket->_context.release = new_context_release;
            }
        }
    }
}
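The effect of moving the release callback to another thread can be reproduced in portable C. The sketch below rests on several assumptions: an error-checking pthread mutex stands in for os_unfair_lock (it returns EDEADLK where os_unfair_lock would abort), pthread_create stands in for dispatch_async, and all names are hypothetical. It models only the shape of the call chain: an outer invalidate holds the lock while calling release, and release leads to a nested lock attempt.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

static pthread_mutex_t g_lock;       /* stand-in for __CFAllSocketsLock */
static int g_recursive_attempts;     /* same-thread relock attempts seen */
static pthread_t g_worker;
static bool g_worker_started;

/* Stand-in for the nested CFSocketInvalidate reached through release. */
static void nested_invalidate(void) {
    int rc = pthread_mutex_lock(&g_lock);
    if (rc == EDEADLK) {             /* the owner tried to lock again */
        g_recursive_attempts++;
        return;
    }
    pthread_mutex_unlock(&g_lock);
}

/* Original behaviour: the release callback runs on the calling thread. */
void release_sync(void *info) {
    (void)info;
    nested_invalidate();
}

static void *worker_main(void *info) {
    (void)info;
    nested_invalidate();
    return NULL;
}

/* Patched behaviour: the release callback runs on another thread. */
void release_async(void *info) {
    g_worker_started = (pthread_create(&g_worker, NULL, worker_main, info) == 0);
}

/* Stand-in for the outer CFSocketInvalidate. Returns the number of
 * recursive lock attempts observed while it ran. */
int run_invalidate(void (*release)(void *info)) {
    assert(release != NULL);
    pthread_mutexattr_t attr;
    g_recursive_attempts = 0;
    g_worker_started = false;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&g_lock, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&g_lock);
    release(NULL);                   /* called while the lock is held */
    pthread_mutex_unlock(&g_lock);

    if (g_worker_started) {
        pthread_join(g_worker, NULL);
    }
    pthread_mutex_destroy(&g_lock);
    return g_recursive_attempts;
}
```

With release_sync the nested lock attempt happens on the owning thread and is reported as recursive. With release_async the worker thread simply waits until the outer call unlocks, so no recursion is detected.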

Summary

This problem is not limited to the CocoaAsyncSocket library. The same crash stack was later found on some system threads as well, but in small numbers, so we judged that it did not need to be fixed.

In addition, although Solutions 1 and 2 were both abandoned in the end, they reflect the troubleshooting approach I use most often, so I have shared them here. Many points remained unclear throughout the investigation, but those details did not affect the final conclusion, so in the end we chose not to dwell on them.
