Background
After iOS 16 was released, our monitoring showed a large number of new crashes in CocoaAsyncSocket. The stacks are consistent with the one reported in this issue:
libsystem_platform.dylib 0x210a5e08c _os_unfair_lock_recursive_abort + 36
libsystem_platform.dylib 0x210a58898 _os_unfair_lock_lock_slow + 280
CoreFoundation 0x1c42953ec CFSocketInvalidate + 132
CFNetwork 0x1c54a4e24 0x1c533f000 + 1465892
CoreFoundation 0x1c41db030 CFArrayApplyFunction + 72
CFNetwork 0x1c54829a0 0x1c533f000 + 1325472
CoreFoundation 0x1c4242d20 _CFRelease + 316
CoreFoundation 0x1c4295724 CFSocketInvalidate + 956
CFNetwork 0x1c548f478 0x1c533f000 + 1377400
CoreFoundation 0x1c420799c _CFStreamClose + 108
Test 0x102ca5228 -[GCDAsyncSocket closeWithError:] + 452
Test 0x102ca582c __28-[GCDAsyncSocket disconnect]_block_invoke + 80
0x1cb649fdc _dispatch_client_callout + 20
0x1cb6599a8 _dispatch_sync_invoke_and_complete_recurse + 64
0x1cb659428 _dispatch_sync_f_slow + 172
Test 0x102ca57b0 -[GCDAsyncSocket disconnect] + 164
Test 0x102db951c -[TestSocket forceDisconnect] + 312
Test 0x102cdfa5c -[TestSocket forceDisconnect] + 396
Test 0x102d6b748 __27-[TestSocketManager didConnectWith:]_block_invoke + 2004
0x1cb6484b4 _dispatch_call_block_and_release + 32
0x1cb649fdc _dispatch_client_callout + 20
0x1cb651694 _dispatch_lane_serial_drain + 672
0x1cb6521e0 _dispatch_lane_invoke + 384
0x1cb65ce10 _dispatch_workloop_worker_thread + 652
libsystem_pthread.dylib 0x210aecdf8 _pthread_wqthread + 288
libsystem_pthread.dylib 0x210aecb98 start_wqthread + 8
Cause of the crash: BUG IN CLIENT OF LIBPLATFORM: Trying to recursively lock an os_unfair_lock
The reason is simple: os_unfair_lock_lock was called recursively on the same lock. The lock detects recursion by checking whether its current owner is the current thread. In theory, breaking this recursive call is enough to solve the problem. Looking at the top of the crash stack, CFSocketInvalidate in CoreFoundation calls os_unfair_lock in libsystem_platform.dylib. A call between two dynamic libraries normally goes through an indirect, bound symbol, so the idea was to use fishhook to hook the lock function that CoreFoundation calls. The replacement lock function would check whether the owner is the current thread and, if so, return immediately. Wouldn't that solve the crash? That reasoning produced the first plan below. (Note: Solutions 1 and 2 were eventually abandoned; Solution 3 was verified to work.)
Solution 1: Replace os_unfair_lock_lock
This solution has two key steps: first, hook the lock function; second, inside the replacement, determine whether the lock's owner is the current thread. We assumed fishhook would make the first step feasible, and the second step looked more challenging, so we started the research with the owner check. We came to regret that order.
In <os/>, the system provides os_unfair_lock_assert_owner for checking a lock's current owner. Its counterpart, os_unfair_lock_assert_not_owner, is declared as follows:
/*!
 * @function os_unfair_lock_assert_not_owner
 *
 * @abstract
 * Asserts that the calling thread is not the current owner of the specified
 * unfair lock.
 *
 * @discussion
 * If the lock is unlocked or owned by a different thread, this function
 * returns.
 *
 * If the lock is currently owned by the current thread, this function asserts
 * and terminates the process.
 *
 * @param lock
 * Pointer to an os_unfair_lock.
 */
OS_UNFAIR_LOCK_AVAILABILITY
OS_EXPORT OS_NOTHROW OS_NONNULL_ALL
void os_unfair_lock_assert_not_owner(const os_unfair_lock *lock);
If the lock is held by another thread, this function simply returns. If the lock is held by the current thread, it asserts and terminates the process. Because it triggers a crash itself, this API cannot be called directly in our scenario. Fortunately, Apple has published this part of the code, so the owner check can be implemented by following it. Some TSD (thread-specific data) code involved along the way needs extra handling, which I will not go into here. After that, replace os_unfair_lock_lock globally with fishhook and start testing:
os_unfair_lock_lock(&test_lock);
os_unfair_lock_lock(&test_lock);
The code above reliably reproduces the recursive-lock crash. After the hook was added, the crash disappeared. This was the first time I thought the problem was solved.
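The owner check described above can be sketched in portable C. This is only a model under stated assumptions: owner_lock, owner_lock_lock and current_thread_token are hypothetical names, the lock is built on a pthread mutex rather than on os_unfair_lock, and the owner is tracked in a separate field instead of in the lock word as the real implementation does.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a lock that remembers which thread owns it. */
typedef struct {
    pthread_mutex_t mutex;
    _Atomic uintptr_t owner; /* 0 while unlocked, otherwise the owner's token */
} owner_lock;

#define OWNER_LOCK_INIT { PTHREAD_MUTEX_INITIALIZER, 0 }

/* The address of a thread-local variable is unique per live thread,
 * so it can serve as a non-zero thread token. */
static _Thread_local char thread_marker;

static uintptr_t current_thread_token(void) {
    return (uintptr_t)&thread_marker;
}

/* Returns true if the lock was taken, false if the calling thread already
 * owns it (the case in which os_unfair_lock_lock would abort). */
static bool owner_lock_lock(owner_lock *lock) {
    if (atomic_load(&lock->owner) == current_thread_token()) {
        return false; /* recursive attempt: skip instead of crashing */
    }
    pthread_mutex_lock(&lock->mutex);
    atomic_store(&lock->owner, current_thread_token());
    return true;
}

static void owner_lock_unlock(owner_lock *lock) {
    atomic_store(&lock->owner, 0);
    pthread_mutex_unlock(&lock->mutex);
}
```

A caller that receives false must also skip the matching unlock, otherwise the inner unlock would release the lock that the outer call still relies on. That bookkeeping is one reason a blanket hook of this kind is fragile.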
However, the test code lives in the main executable, while the crash occurs in CoreFoundation. Can the lock function be hooked inside CoreFoundation? The answer is no. Colleagues on the business side later found a more aggressive and reliable way to reproduce the crash, which made it possible to inspect the call at the top of the crash stack. CFSocketInvalidate calls the lock function like this: 0x1ba8b13e8 bl 0x1c0155a60. This is not the familiar call to a symbol stub, so fishhook cannot take effect. Calls between dynamic libraries have always been a blind spot in my knowledge; I did not know where to start, and the hook approach was abandoned. The relevant disassembly:
    0x1ba8b13d0 <+104>: tbz w8, #0x0, 0x1ba8b13d8 ; <+112>
    0x1ba8b13d4 <+108>: bl 0x1ba920e7c ; __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__
    0x1ba8b13d8 <+112>: mov x0, x19
    0x1ba8b13dc <+116>: bl 0x1ba860e34 ; CFRetain
    0x1ba8b13e0 <+120>: adrp x0, 354829
    0x1ba8b13e4 <+124>: add x0, x0, #0x900 ; __CFAllSocketsLock
    0x1ba8b13e8 <+128>: bl 0x1c0155a60
->  0x1ba8b13ec <+132>: add x20, x19, #0x18
    0x1ba8b13f0 <+136>: mov x0, x20
    0x1ba8b13f4 <+140>: bl 0x1ba99c984 ; symbol stub for: pthread_mutex_lock

->  0x1c0155a60: adrp x16, 290593
    0x1c0155a64: add x16, x16, #0x3b0 ; os_unfair_lock_lock
    0x1c0155a68: br x16
    0x1c0155a6c: brk #0x1
    0x1c0155a70: adrp x16, 290593
    0x1c0155a74: add x16, x16, #0x4e0 ; os_unfair_lock_lock_with_options
    0x1c0155a78: br x16
    0x1c0155a7c: brk #0x1
Debugging on an iOS 15 device showed that the lock taken at this point on iOS 15 is a pthread mutex (pthread_mutex_lock); on iOS 16 it was replaced by os_unfair_lock. This change is probably what introduced the crash. Since the problem cannot be fixed at the lock itself, we need to analyse why the recursive call happens in the first place.
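Why the hook worked in the test binary but not inside CoreFoundation can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not fishhook's API: lock_slot stands in for a rebindable indirect symbol pointer, and all function names are hypothetical. Rewriting the slot changes only the calls that are routed through it; a call that branches straight to the target never consults the slot.

```c
#include <assert.h>

/* Two implementations: the original and the would-be hook. */
static int real_lock(void)   { return 1; }
static int hooked_lock(void) { return 2; }

/* Stands in for an indirect symbol pointer that a rebinding tool can rewrite. */
static int (*lock_slot)(void) = real_lock;

/* A caller that goes through the pointer (like a call via a symbol stub). */
static int call_via_slot(void) {
    return lock_slot();
}

/* A caller that branches to the target directly and never reads the pointer. */
static int call_direct(void) {
    return real_lock();
}

/* What a rebinding hook does in this model: overwrite the slot. */
static void rebind(int (*replacement)(void)) {
    lock_slot = replacement;
}
```

In this model the test code in the main executable behaves like call_via_slot, while the bl seen in CFSocketInvalidate behaves like call_direct from the hook's point of view.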
Solution 2: Remove _socket from _schedulables
The CFNetwork symbols in the crash stack are not symbolicated, and Xcode cannot resolve them during local debugging either. The stack captured by Xcode is as follows:
#0 0x000000020707a08c in _os_unfair_lock_recursive_abort ()
#1 0x0000000207074898 in _os_unfair_lock_lock_slow ()
#2 0x00000001ba8b13ec in CFSocketInvalidate ()
#3 0x00000001bbac0e24 in ___lldb_unnamed_symbol8533 ()
#4 0x00000001ba7f7030 in CFArrayApplyFunction ()
#5 0x00000001bba9e9a0 in ___lldb_unnamed_symbol7940 ()
#6 0x00000001ba85ed20 in _CFRelease ()
#7 0x00000001ba8b1724 in CFSocketInvalidate ()
#8 0x00000001bbaab478 in ___lldb_unnamed_symbol8050 ()
#9 0x00000001ba82399c in _CFStreamClose ()
#10 0x000000010844e934 in -[GCDAsyncSocket closeWithError:] at /Users/yuencong/workplace/gif2/.gundam/Pods/CocoaAsyncSocket/Source/GCD/:3213
#11 0x0000000108456b8c in -[GCDAsyncSocket maybeDequeueWrite] at /Users/yuencong/workplace/gif2/.gundam/Pods/CocoaAsyncSocket/Source/GCD/:5976
#12 0x0000000108457584 in __29-[GCDAsyncSocket doWriteData]_block_invoke at /Users/yuencong/workplace/gif2/.gundam/Pods/CocoaAsyncSocket/Source/GCD/:6317
#13 0x00000001c1c644b4 in _dispatch_call_block_and_release ()
#14 0x00000001c1c65fdc in _dispatch_client_callout ()
#15 0x00000001c1c6d694 in _dispatch_lane_serial_drain ()
#16 0x00000001c1c6e1e0 in _dispatch_lane_invoke ()
#17 0x00000001c1c78e10 in _dispatch_workloop_worker_thread ()
#18 0x0000000207108df8 in _pthread_wqthread ()
This stack gives a rough picture of the cause: CFSocketInvalidate is executed twice, each execution calls os_unfair_lock_lock, and the second os_unfair_lock_lock on the same thread triggers the lock recursion. To pin down the cause more precisely, the unresolved symbols have to be identified.
#8 Unresolved symbol: ___lldb_unnamed_symbol8050
_CFStreamClose calls ___lldb_unnamed_symbol8050, and ___lldb_unnamed_symbol8050 makes the first call to CFSocketInvalidate.
The source of _CFStreamClose, which the stack places in CoreFoundation, is as follows:
CF_PRIVATE void _CFStreamClose(struct _CFStream *stream) {
    CFStreamStatus status = _CFStreamGetStatus(stream);
    const struct _CFStreamCallBacks *cb = _CFStreamGetCallBackPtr(stream);
    if (status == kCFStreamStatusNotOpen || status == kCFStreamStatusClosed || (status == kCFStreamStatusError && __CFBitIsSet(stream->flags, HAVE_CLOSED))) {
        // Stream is not open from the client's perspective; do not callout and do not update our status to "closed"
        return;
    }
    if (! __CFBitIsSet(stream->flags, HAVE_CLOSED)) {
        __CFBitSet(stream->flags, HAVE_CLOSED);
        __CFBitSet(stream->flags, CALLING_CLIENT);
        if (cb->close) {
            cb->close(stream, _CFStreamGetInfoPointer(stream));
        }
        if (stream->client) {
            _CFStreamDetachSource(stream);
        }
        _CFStreamSetStatusCode(stream, kCFStreamStatusClosed);
        __CFBitClear(stream->flags, CALLING_CLIENT);
    }
}
Judging from the Xcode debugging information, ___lldb_unnamed_symbol8050 is most likely the cb->close method. To check this, we map the _CFStream data structure and modify cb->close:
struct _CFStream {
    CFRuntimeBase _cfBase;
    CFOptionFlags flags;
    CFErrorRef error; // if callBacks->version < 2, this is actually a pointer to a CFStreamError
    struct _CFStreamClient *client;
    /* NOTE: CFNetwork is still using _CFStreamGetInfoPointer, and so this slot needs to stay in this position (as the fifth field in the structure) */
    /* NOTE: This can be taken out once CFNetwork rebuilds */
    /* NOTE: <rdar://problem/13678879> Remove comment once CFNetwork has been rebuilt */
    void *info;
    const struct _CFStreamCallBacks *callBacks; // This will not exist (will not be allocated) if the callbacks are from our known, "blessed" set.
    CFLock_t streamLock;
    CFArrayRef previousRunloopsAndModes;
    dispatch_queue_t queue;
};
Pointing the close pointer of callBacks at _new_SocketStreamClose confirms conclusively that ___lldb_unnamed_symbol8050 is the cb->close call:
void (*_origin_SocketStreamClose)(CFTypeRef stream, void* ctxt);

void _new_SocketStreamClose(CFTypeRef stream, void* ctxt) {
    _origin_SocketStreamClose(stream, ctxt);
}
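The confirmation technique used here, saving a callback pointer and installing a wrapper that forwards to it, can be shown in a self-contained form. The sketch below is a portable model with hypothetical names (model_callbacks, install_wrapper and the two counters); it does not touch any real CFStream structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a callback table with a close entry. */
typedef struct {
    void (*close)(void *stream, void *info);
} model_callbacks;

static int original_calls; /* how often the original close ran */
static int wrapper_calls;  /* how often the wrapper was entered */

static void original_close(void *stream, void *info) {
    (void)stream;
    (void)info;
    original_calls++;
}

/* The saved original, so the wrapper can forward to it. */
static void (*saved_close)(void *stream, void *info);

static void wrapper_close(void *stream, void *info) {
    wrapper_calls++;           /* a breakpoint or log statement would go here */
    saved_close(stream, info); /* keep the original behaviour */
}

/* Save the current entry, then point the table at the wrapper. */
static void install_wrapper(model_callbacks *callbacks) {
    saved_close = callbacks->close;
    callbacks->close = wrapper_close;
}
```

If the wrapper shows up in the stack where the unnamed symbol used to be, the unnamed symbol was the function that the table entry pointed to.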
Reading further through the CFNetwork code, we finally find the function that cb->close points to: SocketStreamClose. The function is fairly long; we only care about the part that contains the first CFSocketInvalidate call:
if (ctxt->_socket) {
    /* Make sure to invalidate the socket */
    CFSocketInvalidate(ctxt->_socket);
    /* Dump and forget it. */
    CFRelease(ctxt->_socket);
    ctxt->_socket = NULL;
}
ctxt is obtained through _CFStreamGetInfoPointer; the value returned is the stream's info. The data structure of info, as provided in CoreFoundation:
typedef struct {
    CFSpinLock_t _lock; /* Protection for read-half versus write-half */
    UInt32 _flags;
    CFStreamError _error;
    CFReadStreamRef _clientReadStream;
    CFWriteStreamRef _clientWriteStream;
    CFSocketRef _socket; /* Actual underlying CFSocket */
    CFMutableArrayRef _readloops;
    CFMutableArrayRef _writeloops;
    CFMutableArrayRef _sharedloops;
    CFMutableArrayRef _schedulables; /* Items to be scheduled (i.e. socket, reachability, host, etc.) */
    CFMutableDictionaryRef _properties; /* Host and port and reachability should be here too. */
} _CFSocketStreamContext;
This data structure was changed in iOS 16, but during debugging the offsets of _socket and _schedulables can still be found with lldb's memory read. _schedulables is also a key value: it is needed when analysing the second CFSocketInvalidate call.
Summary: the first CFSocketInvalidate call is made inside SocketStreamClose, with stream->info->_socket as its argument.
#3 Unresolved symbol: ___lldb_unnamed_symbol8533
The second CFSocketInvalidate call is made inside ___lldb_unnamed_symbol8533, whose assembly is as follows:
CFNetwork`___lldb_unnamed_symbol8533:
    0x1bbac0e00 <+0>: pacibsp
    0x1bbac0e04 <+4>: stp x20, x19, [sp, #-0x20]!
    0x1bbac0e08 <+8>: stp x29, x30, [sp, #0x10]
    0x1bbac0e0c <+12>: add x29, sp, #0x10
    0x1bbac0e10 <+16>: mov x19, x0
    0x1bbac0e14 <+20>: bl 0x1c015b020
    0x1bbac0e18 <+24>: mov x20, x0
    0x1bbac0e1c <+28>: mov x0, x19
    0x1bbac0e20 <+32>: bl 0x1bba0f498 ; ___lldb_unnamed_symbol5324
->  0x1bbac0e24 <+36>: adrp x8, 348073
    0x1bbac0e28 <+40>: ldr x8, [x8, #0x4a0]
    0x1bbac0e2c <+44>: cmn x8, #0x1
    0x1bbac0e30 <+48>: 0x1bbac0ea4 ; <+164>
    0x1bbac0e34 <+52>: adrp x8, 348073
    0x1bbac0e38 <+56>: ldr x8, [x8, #0x4c0]
    0x1bbac0e3c <+60>: ldr x8, [x8, #0x60]
    0x1bbac0e40 <+64>: cmp x8, x20
    0x1bbac0e44 <+68>: 0x1bbac0e6c ; <+108>
    0x1bbac0e48 <+72>: mov x0, x19
    0x1bbac0e4c <+76>: mov w1, #0x0
    0x1bbac0e50 <+80>: ldp x29, x30, [sp, #0x10]
    0x1bbac0e54 <+84>: ldp x20, x19, [sp], #0x20
    0x1bbac0e58 <+88>: autibsp
    0x1bbac0e5c <+92>: eor x16, x30, x30, lsl #1
    0x1bbac0e60 <+96>: tbz x16, #0x3e, 0x1bbac0e68 ; <+104>
    0x1bbac0e64 <+100>: brk #0xc471
    0x1bbac0e68 <+104>: b 0x1bba16948 ; CFHostCancelInfoResolution
    0x1bbac0e6c <+108>: bl 0x1bba108f0 ; CFNetServiceGetTypeID
    0x1bbac0e70 <+112>: cmp x0, x20
    0x1bbac0e74 <+116>: 0x1bbac0e98 ; <+152>
    0x1bbac0e78 <+120>: mov x0, x19
    0x1bbac0e7c <+124>: ldp x29, x30, [sp, #0x10]
    0x1bbac0e80 <+128>: ldp x20, x19, [sp], #0x20
    0x1bbac0e84 <+132>: autibsp
    0x1bbac0e88 <+136>: eor x16, x30, x30, lsl #1
    0x1bbac0e8c <+140>: tbz x16, #0x3e, 0x1bbac0e94 ; <+148>
    0x1bbac0e90 <+144>: brk #0xc471
    0x1bbac0e94 <+148>: b 0x1bba12ef8 ; CFNetServiceCancel
    0x1bbac0e98 <+152>: ldp x29, x30, [sp, #0x10]
    0x1bbac0e9c <+156>: ldp x20, x19, [sp], #0x20
    0x1bbac0ea0 <+160>: retab
    0x1bbac0ea4 <+164>: adrp x0, 348073
    0x1bbac0ea8 <+168>: add x0, x0, #0x4a0
    0x1bbac0eac <+172>: adrp x1, 356609
    0x1bbac0eb0 <+176>: add x1, x1, #0xaa8
    0x1bbac0eb4 <+180>: bl 0x1bbbd3b80 ; symbol stub for: dispatch_once
    0x1bbac0eb8 <+184>: b 0x1bbac0e34 ; <+52>
Using a few distinctive features (the function reaches CFSocketInvalidate near its beginning and later calls CFHostCancelInfoResolution, CFNetServiceGetTypeID, and so on), a very close match was found in CFNetwork: _SchedulablesInvalidateApplierFunction.
/* static */ void
_SchedulablesInvalidateApplierFunction(CFTypeRef obj, void* context) {
    (void)context; /* unused */
    CFTypeID type = CFGetTypeID(obj);
    /* Invalidate the process. */
    _CFTypeInvalidate(obj);
    /* For CFHost and CFNetService, make sure to cancel too. */
    if (CFHostGetTypeID() == type)
        CFHostCancelInfoResolution((CFHostRef)obj, kCFHostAddresses);
    else if (CFNetServiceGetTypeID() == type)
        CFNetServiceCancel((CFNetServiceRef)obj);
}
_CFTypeInvalidate checks the CF type of the object and, if it matches CFSocketGetTypeID, executes CFSocketInvalidate. A search of CFNetwork finds two calls to _SchedulablesInvalidateApplierFunction. Both are made the same way with the same arguments: the function is applied to the items of the ctxt->_schedulables array, where ctxt is the info field of the stream.
CFArrayApplyFunction(ctxt->_schedulables, r, (CFArrayApplierFunction)_SchedulablesInvalidateApplierFunction, NULL);
Summary: the second CFSocketInvalidate call is made inside _SchedulablesInvalidateApplierFunction, with an item contained in stream->info->_schedulables as its argument.
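The applier pattern summarised above, one function applied to every item of an array and dispatching on the item's type, can be modelled in a few lines of portable C. Everything in this sketch is hypothetical (model_object, model_array_apply and the counters); it only mirrors the shape of the CFNetwork code quoted earlier and does not reproduce its behaviour in detail.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { MODEL_SOCKET, MODEL_HOST, MODEL_OTHER } model_type;

/* Hypothetical schedulable item; the counters record what was done to it. */
typedef struct {
    model_type type;
    int invalidated;
    int cancelled;
} model_object;

typedef void (*model_applier)(model_object *obj, void *context);

/* Plays the role of CFArrayApplyFunction: call fn once per item. */
static void model_array_apply(model_object **items, size_t count,
                              model_applier fn, void *context) {
    for (size_t i = 0; i < count; i++) {
        fn(items[i], context);
    }
}

/* Plays the role of _CFTypeInvalidate: act only on known types. */
static void model_type_invalidate(model_object *obj) {
    if (obj->type == MODEL_SOCKET || obj->type == MODEL_HOST) {
        obj->invalidated++; /* for a socket, CFSocketInvalidate would run here */
    }
}

/* Plays the role of the invalidate applier: invalidate, then cancel hosts. */
static void model_invalidate_applier(model_object *obj, void *context) {
    (void)context; /* unused, as in the original */
    model_type_invalidate(obj);
    if (obj->type == MODEL_HOST) {
        obj->cancelled++;
    }
}
```

The point of the model is that any socket stored in the array is invalidated as a side effect of applying the function, regardless of who else has already invalidated it.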
Logical analysis
Two recursive calls
CFSocketInvalidate(stream->info->_socket)
CFSocketInvalidate(stream->info->_schedulables item)
info->_socket is a CFSocketRef object. The object being operated on when the crash occurs is a CFSocketRef in the _schedulables array, which means _schedulables also contains CFSocketRef objects. Both are values held by info, so what is the relationship between the CFSocketRef in _schedulables and the _socket object? If they are the same object, running CFSocketInvalidate on it twice is pointless: removing _socket from _schedulables would break the recursion and solve the problem.
Next, try mapping the stream->info data structure. Note that on iOS 16 the _schedulables value in _CFSocketStreamContext is a pointer to a pointer, which does not match the data structure provided in CFNetwork and makes it harder to locate in memory. In the end, the CFSocketRef object contained in info->_schedulables turns out to be info->_socket.
For the fix, we map info to get _schedulables. When the crash occurs, _schedulables contains a single element, _socket, so I simply and crudely called the remove-all function. At this point I thought, for the second time, that the problem was solved:
CFArrayRemoveAllValues(stream->info->_schedulables)
Then the nightmare began. Many call sites that use _schedulables do not check for an empty array, and the result was new crashes, for example in the following code:
CFArrayApplyFunction(ctxt->_schedulables, CFRangeMake(0, CFArrayGetCount(ctxt->_schedulables)), (CFArrayApplierFunction)_SchedulablesScheduleApplierFunction, loopAndMode);
We worked around these crashes in a very dirty way, and the original recursive-lock crash still reproduced. We had assumed that the array being operated on at the top of the stack was _schedulables, but in fact the address of the array operated on at the top of the stack in the final crash is not stream->info->_schedulables. So removing _socket from _schedulables cannot work. It would be possible to go on and analyse where the array at the top of the stack is created, but that is considerably harder. In addition, code that does not check for an empty array would trigger new crashes, so clearing that array carries risk as well. Reluctantly, this path was shelved for the time being; after all, solving the problem quickly was the priority.
Solution 3: _CFRelease
Although Solution 2 did not solve the problem, it gave us a roughly symbolicated call stack:
#0 0x000000020707a08c in _os_unfair_lock_recursive_abort ()
#1 0x0000000207074898 in _os_unfair_lock_lock_slow ()
#2 0x00000001ba8b13ec in CFSocketInvalidate ()
#3 0x00000001bbac0e24 in _SchedulablesInvalidateApplierFunction ()
#4 0x00000001ba7f7030 in CFArrayApplyFunction ()
#5 0x00000001bba9e9a0 in ___lldb_unnamed_symbol7940 ()
#6 0x00000001ba85ed20 in _CFRelease ()
#7 0x00000001ba8b1724 in CFSocketInvalidate ()
#8 0x00000001bbaab478 in _SocketStreamClose ()
#9 0x00000001ba82399c in _CFStreamClose ()
#10 0x000000010844e934 in -[GCDAsyncSocket closeWithError:] at /Users/yuencong/workplace/gif2/.gundam/Pods/CocoaAsyncSocket/Source/GCD/:3213
#11 0x0000000108456b8c in -[GCDAsyncSocket maybeDequeueWrite] at /Users/yuencong/workplace/gif2/.gundam/Pods/CocoaAsyncSocket/Source/GCD/:5976
#12 0x0000000108457584 in __29-[GCDAsyncSocket doWriteData]_block_invoke at /Users/yuencong/workplace/gif2/.gundam/Pods/CocoaAsyncSocket/Source/GCD/:6317
#13 0x00000001c1c644b4 in _dispatch_call_block_and_release ()
#14 0x00000001c1c65fdc in _dispatch_client_callout ()
#15 0x00000001c1c6d694 in _dispatch_lane_serial_drain ()
#16 0x00000001c1c6e1e0 in _dispatch_lane_invoke ()
#17 0x00000001c1c78e10 in _dispatch_workloop_worker_thread ()
#18 0x0000000207108df8 in _pthread_wqthread ()
Studying this stack further, one thing looks very strange: CoreFoundation's _CFRelease calls CFNetwork's ___lldb_unnamed_symbol7940. CoreFoundation should be the lower-level library and should not call into CFNetwork. So we look at where _CFRelease is reached inside CFSocketInvalidate. The code is fairly long, so only the key parts are excerpted:
void CFSocketInvalidate(CFSocketRef s) {
    CFRetain(s);
    __CFLock(&__CFAllSocketsLock);
    __CFSocketLock(s);
    if (__CFSocketIsValid(s)) {
        contextInfo = s->_context.info;
        contextRelease = s->_context.release;
        // Do this after the socket unlock to avoid deadlock (10462525)
        for (idx = CFArrayGetCount(runLoops); idx--;) {
            CFRunLoopWakeUp((CFRunLoopRef)CFArrayGetValueAtIndex(runLoops, idx));
        }
        CFRelease(runLoops);
        if (NULL != contextRelease) {
            contextRelease(contextInfo);
        }
        if (NULL != source0) {
            CFRunLoopSourceInvalidate(source0);
            CFRelease(source0);
        }
    } else {
        __CFSocketUnlock(s);
    }
    __CFUnlock(&__CFAllSocketsLock);
    CFRelease(s);
}
Combined with Xcode debugging information:
    0x1ba8b16fc <+916>: bl 0x1ba862870 ; CFArrayGetValueAtIndex
    0x1ba8b1700 <+920>: bl 0x1ba8945a0 ; CFRunLoopWakeUp
    0x1ba8b1704 <+924>: sub x24, x24, #0x1
    0x1ba8b1708 <+928>: subs w20, w20, #0x1
    0x1ba8b170c <+932>: 0x1ba8b16f4 ; <+908>
    0x1ba8b1710 <+936>: mov x0, x22
    0x1ba8b1714 <+940>: bl 0x1ba860cec ; CFRelease
    0x1ba8b1718 <+944>: cbz x25, 0x1ba8b1724 ; <+956>
    0x1ba8b171c <+948>: mov x0, x23
    0x1ba8b1720 <+952>: blraaz x25
->  0x1ba8b1724 <+956>: cbz x21, 0x1ba8b1738 ; <+976>
    0x1ba8b1728 <+960>: mov x0, x21
    0x1ba8b172c <+964>: bl 0x1ba8b1a54 ; CFRunLoopSourceInvalidate
    0x1ba8b1730 <+968>: mov x0, x21
    0x1ba8b1734 <+972>: bl 0x1ba860cec ; CFRelease
    0x1ba8b1738 <+976>: adrp x0, 354829
    0x1ba8b173c <+980>: add x0, x0, #0x900 ; __CFAllSocketsLock
After the CFRelease in question returns, CFRunLoopSourceInvalidate is executed next, and the only CFRelease near it is CFRelease(source0). I took source0 to be an array and naively assumed that ___lldb_unnamed_symbol7940 was a callback registered through CFArrayReleaseCallBack; that call logic looked reasonable. CFRelease itself cannot be hooked, but could the recursive call be broken by modifying the callback? Trying it showed that this is still not feasible. A breakpoint on CFRelease reveals that the object being released is of type SocketStream, not the source0 array assumed earlier. Searching CFSocketInvalidate for an object of type SocketStream leads to s->_context.info, and following that clue we found the three lines of code that are the key to solving this problem:
if (NULL != contextRelease) {
    contextRelease(contextInfo);
}
According to the Xcode debugging information, contextRelease == CFRelease, and in the code contextRelease takes its value from s->_context.release. So once the s->_context data structure is mapped, modifying the release pointer lets us hook the CFRelease that appears in the crash stack. The two CFSocketInvalidate calls that cause the lock recursion happen before and after this CFRelease respectively. If the CFRelease is turned into an asynchronous call, the two CFSocketInvalidate calls reach os_unfair_lock_lock on two different threads. The lock treats a call as recursive only when its current owner is the current thread; when the lock function runs on a different thread, the second call simply waits for the first to unlock, and the problem is solved. The process of mapping the stream and the socket is too tedious to describe in detail, so here is just the result:
struct __CFSocket {
    int64_t offset[27];
    CFSocketContext _context; /* immutable */
};

typedef struct {
    int64_t offset[33];
    struct __CFSocket * _socket;
} __CFSocketStreamContext;

struct __CFStream {
    int64_t offset[5];
    __CFSocketStreamContext *info;
};
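The int64_t offset[N] members in these structures are padding: they skip N eight-byte words so that the named field lands at the offset found in memory. A minimal, portable sketch of the same idea follows. The two structures and mirror_read_socket are hypothetical; private_context merely stands in for a system structure whose real definition is not available to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a private structure; in practice its definition is unknown
 * and only the field offset has been discovered in a debugger. */
struct private_context {
    int64_t lock_word;
    int64_t flags;
    int64_t error;
    void *socket;
};

/* Mirror structure: skip three 8-byte words, then name the field we need. */
struct mirror_context {
    int64_t offset[3];
    void *socket;
};

/* Read the pointer stored at the mirrored offset. memcpy is used so the
 * read does not depend on the two struct types being compatible. */
static void *mirror_read_socket(const void *context) {
    void *socket = NULL;
    memcpy(&socket,
           (const unsigned char *)context + offsetof(struct mirror_context, socket),
           sizeof socket);
    return socket;
}
```

The mirror is only valid as long as the real layout keeps the field at that offset, which is why the offsets have to be re-verified on every new system version.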
The final solution is summarised in the code below. Because it maps many system data structures, it is not a safe operation, and checks that the memory is readable and writable are required; for that part of the code, refer to KSCrash. In addition, the business layer should add a switch, and the switch should take effect only on specific system versions: if the stream or socket data structures change in a new system version, some of these memory accesses may crash.
#include <stdbool.h>
#include <stdint.h>
#include <mach/mach.h>
#include <dispatch/dispatch.h>
#include <CoreFoundation/CoreFoundation.h>

// Memory protection
// Copy byteCount bytes through the kernel so that an unreadable address
// produces an error instead of a crash. Returns the number of bytes copied.
static inline int copySafely(const void* restrict const src, void* restrict const dst, const int byteCount) {
    vm_size_t bytesCopied = 0;
    kern_return_t result = vm_read_overwrite(mach_task_self(), (vm_address_t)src, (vm_size_t)byteCount, (vm_address_t)dst, &bytesCopied);
    if(result != KERN_SUCCESS) {
        return 0;
    }
    return (int)bytesCopied;
}

static char g_memoryTestBuffer[10240];

// Returns true if byteCount bytes starting at memory can be read.
static inline bool isMemoryReadable(const void* const memory, const int byteCount) {
    const int testBufferSize = sizeof(g_memoryTestBuffer);
    int bytesRemaining = byteCount;
    while(bytesRemaining > 0) {
        int bytesToCopy = bytesRemaining > testBufferSize ? testBufferSize : bytesRemaining;
        if(copySafely(memory, g_memoryTestBuffer, bytesToCopy) != bytesToCopy) {
            break;
        }
        bytesRemaining -= bytesToCopy;
    }
    return bytesRemaining == 0;
}

// Asynchronous CFRelease
static dispatch_queue_t socket_context_release_queue = nil;
void (*origin_context_release)(const void *info);

// Replacement for _context.release: run the original release on another queue
// so that the nested CFSocketInvalidate no longer runs on the calling thread.
void new_context_release(const void *info) {
    if (socket_context_release_queue == nil) {
        socket_context_release_queue = dispatch_queue_create("socketContextReleaseQueue", 0x0);
    }
    dispatch_async(socket_context_release_queue, ^{
        origin_context_release(info);
    });
}

// CocoaAsyncSocket modify writeStream
// (fragment, placed inside CocoaAsyncSocket where writeStream is available)
if (@available(iOS 16.0, *)) {
    struct __CFStream *cfstream = (struct __CFStream *)writeStream;
    // Verify every level of the mapped structures before dereferencing it.
    if (isMemoryReadable(cfstream, sizeof(*cfstream))
        && isMemoryReadable(cfstream->info, sizeof(*(cfstream->info)))
        && isMemoryReadable(cfstream->info->_socket, sizeof(*(cfstream->info->_socket)))
        && isMemoryReadable(&(cfstream->info->_socket->_context), sizeof(cfstream->info->_socket->_context))
        && isMemoryReadable(cfstream->info->_socket->_context.release, sizeof(*(cfstream->info->_socket->_context.release)))) {
        if (cfstream->info != NULL && cfstream->info->_socket != NULL) {
            // Only swap the pointer when it is the CFRelease we expect.
            if ((uintptr_t)cfstream->info->_socket->_context.release == (uintptr_t)CFRelease) {
                origin_context_release = cfstream->info->_socket->_context.release;
                cfstream->info->_socket->_context.release = new_context_release;
            }
        }
    }
}
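Why moving the release to another thread removes the recursion can be checked with a small portable model. This sketch makes several assumptions: model_socket, model_invalidate and the other names are hypothetical, a pthread mutex plus a thread-local flag stands in for os_unfair_lock and its owner check, and pthread_create stands in for dispatch_async. It is not the CoreFoundation implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical socket: a valid flag plus a context release callback. */
typedef struct model_socket {
    bool valid;
    void (*release)(void *info);
    void *info;
} model_socket;

static pthread_mutex_t all_sockets_lock = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local bool holds_lock;  /* does this thread hold the lock? */
static atomic_int recursion_hits;      /* times a recursive lock was attempted */
static pthread_t worker;               /* thread used by the async release */

/* Model of CFSocketInvalidate: the release callback runs with the lock held. */
static void model_invalidate(model_socket *s) {
    if (holds_lock) {
        atomic_fetch_add(&recursion_hits, 1); /* os_unfair_lock would abort here */
        return;
    }
    pthread_mutex_lock(&all_sockets_lock);
    holds_lock = true;
    if (s->valid) {
        s->valid = false;
        if (s->release != NULL) {
            s->release(s->info);
        }
    }
    holds_lock = false;
    pthread_mutex_unlock(&all_sockets_lock);
}

/* Synchronous release: re-enters model_invalidate on the same thread. */
static void release_sync(void *info) {
    model_invalidate((model_socket *)info);
}

static void *worker_main(void *info) {
    model_invalidate((model_socket *)info); /* waits until the lock is free */
    return NULL;
}

/* Asynchronous release: the nested invalidate runs on another thread. */
static void release_async(void *info) {
    int rc = pthread_create(&worker, NULL, worker_main, info);
    assert(rc == 0);
    (void)rc;
}

/* Runs one invalidate whose release callback invalidates the same socket
 * again, and returns how many recursive lock attempts were observed. */
static int run_scenario(void (*release)(void *info), bool wait_for_worker) {
    model_socket s = { true, release, NULL };
    s.info = &s;
    atomic_store(&recursion_hits, 0);
    model_invalidate(&s);
    if (wait_for_worker) {
        pthread_join(worker, NULL);
    }
    return atomic_load(&recursion_hits);
}
```

With release_sync the nested call arrives while the calling thread still holds the lock, which is the crash condition. With release_async the nested call blocks on the other thread until the outer call has unlocked, then finds the socket already invalid and does nothing.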
Summary
This problem does not occur only in the CocoaAsyncSocket library. The same crash stack was later found on some system threads as well, but the volume was small, so we judged that it did not need fixing.
In addition, although Solutions 1 and 2 were both eventually abandoned, they reflect the troubleshooting approach I use most often, so I am sharing them here. Many points in the investigation were never fully understood, but these details did not affect the final conclusion, so in the end we chose to let them go.