SoFunction
Updated on 2025-03-03

A summary of methods for extracting substrings from Chinese strings in PHP

1. Extract a substring from a GB2312 Chinese string

<?php
// Extract a substring from a GB2312 Chinese string
function mysubstr($str, $start, $len) {
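  $tmpstr = "";
  $end = $start + $len;
  // Walk from byte 0 so a double-byte character is never split,
  // but only keep characters that begin at or after $start.
  for($i = 0; $i < $end; $i++) {
    // A byte above 0xa0 is the lead byte of a two-byte GB2312 character.
    $width = ord(substr($str, $i, 1)) > 0xa0 ? 2 : 1;
    if($i >= $start) {
      $tmpstr .= substr($str, $i, $width);
    }
    $i += $width - 1;
  }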
  return $tmpstr;
}
?>
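The lead-byte test that this function relies on can be tried in isolation. The following is a minimal sketch, not part of the original article: `gb2312Chars` is a hypothetical helper name, and the sample bytes are assumed to be valid GB2312. It splits a byte string into characters by treating any byte above 0xa0 as the start of a two-byte character.

```php
<?php
// Hypothetical helper (not from the article): split a GB2312 byte string
// into characters using the same lead-byte test as mysubstr().
function gb2312Chars($str) {
  $chars = array();
  $n = strlen($str);
  for ($i = 0; $i < $n; $i++) {
    // A byte above 0xa0 starts a two-byte character, if a second byte exists.
    if (ord($str[$i]) > 0xa0 && $i + 1 < $n) {
      $chars[] = substr($str, $i, 2);
      $i++; // skip the trail byte
    } else {
      $chars[] = $str[$i];
    }
  }
  return $chars;
}

// "ab" followed by two GB2312 characters, written as byte escapes so the
// example does not depend on the encoding of the source file.
$sample = "ab\xD6\xD0\xCE\xC4";
$chars = gb2312Chars($sample);
echo count($chars), "\n";                                  // 4 characters
echo strlen(implode('', array_slice($chars, 1, 2))), "\n"; // 3 bytes: "b" plus one two-byte character
```

Once the string is split this way, `array_slice` gives character-based offsets instead of byte offsets.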

2. Extract a substring from a UTF-8 encoded multibyte string

<?php
// Extract a substring from a UTF-8 string
function utf8Substr($str, $from, $len)
{
  return preg_replace('#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,'.$from.'}'.
            '((?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,'.$len.'}).*#s',
            '$1',$str);
}
?>
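Here `$from` and `$len` count characters, not bytes. As a rough illustration of the same idea, the sketch below uses PCRE's `u` modifier instead of the hand-written byte classes. It is an alternative approach rather than the article's method, and `utf8Slice` is a hypothetical name.

```php
<?php
// Hypothetical helper (not from the article): character-based slice of a
// UTF-8 string using PCRE's "u" modifier.
function utf8Slice($str, $from, $len) {
  // /./us matches one whole UTF-8 character at a time, newlines included.
  // preg_match_all() returns 0 for an empty string and false for invalid UTF-8.
  if (!preg_match_all('/./us', $str, $m)) {
    return '';
  }
  return implode('', array_slice($m[0], $from, $len));
}

// "ab", two three-byte Chinese characters, then "cd", written as byte escapes.
$s = "ab\xE4\xB8\xAD\xE6\x96\x87cd";
echo utf8Slice($s, 1, 3), "\n";         // "b" followed by the two Chinese characters
echo strlen(utf8Slice($s, 1, 3)), "\n"; // 7 bytes: 1 + 3 + 3
```

Unlike the byte-class pattern above, the `u` modifier rejects malformed UTF-8, in which case this sketch returns an empty string.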

3. A substring function that supports both UTF-8 and GB2312

<?php
/*
 Substring function for Chinese text that supports both UTF-8 and GB2312
 cut_str(string, length to extract, start offset, encoding);
 The encoding defaults to UTF-8
 The start offset defaults to 0
 */
 
function cut_str($string, $sublen, $start = 0, $code = 'UTF-8')
{
  if($code == 'UTF-8')
  {
    $pa = "/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|\xe0[\xa0-\xbf][\x80-\xbf]|[\xe1-\xef][\x80-\xbf][\x80-\xbf]|\xf0[\x90-\xbf][\x80-\xbf][\x80-\xbf]|[\xf1-\xf7][\x80-\xbf][\x80-\xbf][\x80-\xbf]/";
    preg_match_all($pa, $string, $t_string);
 
    if(count($t_string[0]) - $start > $sublen) return join('', array_slice($t_string[0], $start, $sublen))."...";
    return join('', array_slice($t_string[0], $start, $sublen));
  }
  else
  {
    $start = $start*2;
    $sublen = $sublen*2;
    $strlen = strlen($string);
    $tmpstr = '';
 
    for($i=0; $i< $strlen; $i++)
    {
      if($i>=$start && $i< ($start+$sublen))
      {
        if(ord(substr($string, $i, 1))>129)
        {
          $tmpstr.= substr($string, $i, 2);
        }
        else
        {
          $tmpstr.= substr($string, $i, 1);
        }
      }
      if(ord(substr($string, $i, 1))>129) $i++;
    }
    if(strlen($tmpstr)< $strlen ) $tmpstr.= "...";
    return $tmpstr;
  }
}
 
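// Sample input. In the GB2312 branch, lengths are counted in two-byte units,
// so a length of 8 keeps the first 16 bytes of this ASCII sample, followed by "...".
$str = "The string that abcd needs to be intercepted";
echo cut_str($str, 8, 0, 'gb2312');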
?>

4. BugFree's substring function

<?php
/**
 * @package   BugFree
 * @version   $Id: ,v 1.32 2005/09/24 11:38:37 wwccss Exp $
 *
 *
 * Return part of a string(Enhance the function substr())
 *
 * @author         Chunsheng Wang <wwccss@>
 * @param string $String the string to cut.
 * @param int   $Length the length of returned string.
 * @param bool   $Append whether to append "...": false|true
 * @return string      the truncated string.
 */
function sysSubStr($String,$Length,$Append = false)
{
  if (strlen($String) <= $Length )
  {
    return $String;
  }
  else
  {
    $I = 0;
    $StringLast = array(); // initialize so implode() always receives an array
    while ($I < $Length)
    {
      $StringTMP = substr($String,$I,1);
      if ( ord($StringTMP) >=224 )
      {
        $StringTMP = substr($String,$I,3);
        $I = $I + 3;
      }
      elseif( ord($StringTMP) >=192 )
      {
        $StringTMP = substr($String,$I,2);
        $I = $I + 2;
      }
      else
      {
        $I = $I + 1;
      }
      $StringLast[] = $StringTMP;
    }
    $StringLast = implode("",$StringLast);
    if($Append)
    {
      $StringLast .= "...";
    }
    return $StringLast;
  }
}
 
$String = "Live at the forefront of automation testing in China";
$Length = "18";
$Append = false;
echo sysSubStr($String,$Length,$Append);
?>
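Note that `sysSubStr` treats `$Length` as a byte budget and may overshoot it by up to two bytes, because a multibyte character that starts inside the budget is always kept whole. It also stops at three-byte sequences. The sketch below is one possible variation, not BugFree's code: `utf8CharWidth` and `utf8TruncateBytes` are hypothetical names, four-byte sequences are handled, and the result never exceeds the budget.

```php
<?php
// Hypothetical helper (not from BugFree): number of bytes in the UTF-8
// character whose first byte is $byte.
function utf8CharWidth($byte) {
  $o = ord($byte);
  if ($o >= 240) return 4; // 11110xxx: four-byte sequence
  if ($o >= 224) return 3; // 1110xxxx: three-byte sequence (most CJK characters)
  if ($o >= 192) return 2; // 110xxxxx: two-byte sequence
  return 1;                // ASCII, or a stray continuation byte
}

// Hypothetical helper: cut $str to at most $maxBytes bytes without
// splitting a multibyte character.
function utf8TruncateBytes($str, $maxBytes) {
  $i = 0;
  $n = strlen($str);
  while ($i < $n) {
    $w = utf8CharWidth($str[$i]);
    if ($i + $w > $maxBytes) {
      break; // the next character would not fit in the budget
    }
    $i += $w;
  }
  return substr($str, 0, $i);
}

// "a" followed by two three-byte Chinese characters: 7 bytes in total.
$s = "a\xE4\xB8\xAD\xE6\x96\x87";
echo strlen(utf8TruncateBytes($s, 5)), "\n"; // 4: "a" plus one whole character
echo strlen(utf8TruncateBytes($s, 7)), "\n"; // 7: the whole string fits
```

Whether to overshoot or undershoot the budget is a design choice; undershooting is the safer option when the result must fit a fixed-width database column.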

That covers the methods collected in this article. I hope they prove helpful.