SoFunction
Updated on 2025-04-04

PHP code for batch-removing BOM headers from files

What is a BOM header?

In a UTF-8 encoded file, the BOM (byte order mark) occupies the first three bytes of the file (EF BB BF, or 239 187 191 in decimal) and indicates that the file is UTF-8 encoded. Much software now recognizes the BOM, but some still does not. PHP, for example, does not recognize it and treats those bytes as ordinary output at the start of the script. This is why a script can fail with errors after a UTF-8 file has been edited and saved in Notepad.

The code for batch removal of BOM headers is as follows:
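
To make the three-byte marker concrete, here is a minimal sketch of detecting and stripping it from a string. The names `UTF8_BOM`, `hasUtf8Bom` and `stripUtf8Bom` are hypothetical helpers chosen for illustration; they are not part of the script in this article.

```php
<?php
// The UTF-8 BOM is the byte sequence EF BB BF (239 187 191 in decimal).
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: true if $contents begins with the UTF-8 BOM.
function hasUtf8Bom(string $contents): bool
{
    return strncmp($contents, UTF8_BOM, 3) === 0;
}

// Hypothetical helper: return $contents without a leading UTF-8 BOM.
function stripUtf8Bom(string $contents): string
{
    return hasUtf8Bom($contents) ? substr($contents, 3) : $contents;
}

$sample = UTF8_BOM . "hello";
echo bin2hex(substr($sample, 0, 3)), "\n"; // prints "efbbbf"
```

A string without the marker is returned unchanged, so `stripUtf8Bom` can be applied to any file contents without checking first.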

<?php
// Directory to scan: taken from the "dir" query parameter, defaulting to the current directory.
if (isset($_GET['dir'])) {
    $basedir = $_GET['dir'];
} else {
    $basedir = '.';
}
// 1 = remove any BOM that is found; any other value = only report it.
$auto = 1;
checkdir($basedir);

// Recursively walk $basedir and check every file in it.
function checkdir($basedir) {
    if ($dh = opendir($basedir)) {
        while (($file = readdir($dh)) !== false) {
            if ($file != '.' && $file != '..') {
                if (!is_dir($basedir . "/" . $file)) {
                    echo "filename: $basedir/$file " . checkBOM("$basedir/$file") . " <br>";
                } else {
                    $dirname = $basedir . "/" . $file;
                    checkdir($dirname);
                }
            }
        }
        closedir($dh);
    }
}

// Report whether $filename starts with the UTF-8 BOM (bytes 239, 187, 191) and,
// when $auto is 1, rewrite the file without it.
function checkBOM($filename) {
    global $auto;
    $contents = file_get_contents($filename);
    if (substr($contents, 0, 3) === "\xEF\xBB\xBF") {
        if ($auto == 1) {
            $rest = substr($contents, 3);
            rewrite($filename, $rest);
            return ("<font color=red>BOM found, automatically removed.</font>");
        } else {
            return ("<font color=red>BOM found.</font>");
        }
    }
    return ("BOM Not Found.");
}

// Overwrite $filename with $data while holding an exclusive lock.
function rewrite($filename, $data) {
    $filenum = fopen($filename, "w");
    flock($filenum, LOCK_EX);
    fwrite($filenum, $data);
    fclose($filenum);
}
?>
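
As a possible variation on the rewrite step, the read, check and write can be folded into one function that uses `file_put_contents()` with the `LOCK_EX` flag. This is only a sketch: `removeBomFromFile` is a hypothetical name, and the demonstration writes to a throwaway file in the system temp directory rather than to real project files.

```php
<?php
// Hypothetical helper, not part of the original script.
// Returns true if a BOM was found and removed, false otherwise.
function removeBomFromFile(string $path): bool
{
    $contents = file_get_contents($path);
    if ($contents === false || strncmp($contents, "\xEF\xBB\xBF", 3) !== 0) {
        return false; // unreadable file, or no BOM at the start
    }
    // LOCK_EX asks PHP to hold an exclusive lock on the file while writing.
    return file_put_contents($path, substr($contents, 3), LOCK_EX) !== false;
}

// Usage on a temporary file that starts with a BOM.
$tmp = tempnam(sys_get_temp_dir(), 'bom');
file_put_contents($tmp, "\xEF\xBB\xBF" . "hello");
$first = removeBomFromFile($tmp);   // BOM present on the first call
$second = removeBomFromFile($tmp);  // nothing left to remove on the second call
$after = file_get_contents($tmp);
unlink($tmp);
```

Because the function only writes when a BOM is actually present, files without one are left untouched and keep their modification time.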

PS: The simplest ways to remove the BOM header are:

1. Removing the BOM header with EditPlus

When the editor is set to save in UTF-8 format, a string of hidden characters (the BOM) is placed at the start of the saved file. The editor uses it to identify whether the file is UTF-8 encoded.

Run EditPlus, open Tools, select Preferences, select File, and set the UTF-8 signature option to always remove the signature.

PHP files edited and saved after that will have no BOM.

2. Removing the BOM header with UltraEdit

After opening the file, choose Save As, select the encoding format "UTF-8 without BOM", and confirm.

As you can see, removing the BOM header is easy.

A few more words about the UTF-8 BOM

The BOM concerns how the PHP file itself is stored, namely as UTF-8 with a BOM. Garbled Chinese text on ordinary pages is generally not caused by it.

header("Content-type: text/html; charset=utf-8");

This statement declares the encoding of the HTML page that is output.
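
Because `header()` has to run before any output, bytes that precede the opening PHP tag (a BOM included) can prevent the header from being sent. The sketch below, built around a hypothetical helper named `sendUtf8Header`, uses PHP's `headers_sent()` to report where output started instead of failing.

```php
<?php
// Hypothetical helper: send the charset header, or report where output already began.
function sendUtf8Header(): string
{
    // headers_sent() fills $file and $line with the location where output started.
    if (headers_sent($file, $line)) {
        return "Output already started at $file:$line";
    }
    header('Content-type: text/html; charset=utf-8');
    return 'header sent';
}

echo sendUtf8Header(), "\n";
```

If the reported location is line 1 of a file that visibly begins with the PHP opening tag, a BOM in that file is one likely cause worth checking.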

A BOM typically appears when a file is saved as UTF-8 with Notepad on Windows. It can be deleted with WinHex.

The encoding settings in Dreamweaver let you choose whether a file is saved with a BOM. Generally, as long as the PHP output is not a picture (GDI Stream), the BOM will not cause problems.
A picture sent as a GDI Stream is displayed as a red cross if there are any extra characters at the beginning.
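
To illustrate why extra leading bytes break a picture, the sketch below checks the start of a byte string against a fixed file signature. PNG is used purely as an assumed example format (the article does not name one), and `looksLikePng` is a hypothetical helper.

```php
<?php
// The 8-byte signature that every PNG file starts with.
const PNG_SIGNATURE = "\x89PNG\r\n\x1a\n";

// Hypothetical check: does $bytes begin with the PNG signature?
function looksLikePng(string $bytes): bool
{
    return strncmp($bytes, PNG_SIGNATURE, 8) === 0;
}

// Placeholder payload; only the leading bytes matter for this check.
$image = PNG_SIGNATURE . "...image data...";
// The same bytes with a UTF-8 BOM emitted in front of them.
$withBom = "\xEF\xBB\xBF" . $image;

var_dump(looksLikePng($image));   // bool(true)
var_dump(looksLikePng($withBom)); // bool(false)
```

With the BOM in front, the signature no longer sits at offset 0, so a viewer that expects it there cannot decode the picture.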