SoFunction
Updated on 2025-04-09

Display formatted user input in PHP


You can download the file that accompanies this article from this page, or from the character processing section of the file downloads. This article describes how to safely display formatted user input. We discuss the dangers of unfiltered output and present a safe way to display formatted output.

The dangers of unfiltered output

If you simply take the user's input and display it, you may break your output page. For example, someone could maliciously embed JavaScript in the input box they submit:

This is my comment.
<script language="javascript">
alert('Do something bad here!')
</script>

Even if the user is not malicious, stray markup in the input can break your HTML, for example by cutting a table short or leaving the page only partly displayed.


Show only unformatted text

This is the easiest solution: display the user-submitted information as unformatted text. Use the htmlspecialchars() function to convert the characters that are special in HTML into HTML entities.

For example, <b> is converted to &lt;b&gt;, which ensures that no unexpected HTML tags reach the output.
This is a good solution if your users only care about text content without formatting. But it would be better to give them some ability to format it.

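
As a minimal sketch of this approach, the snippet below escapes a comment before echoing it. The helper name escape_comment() and the sample input are illustrative assumptions, not part of the article's download; ENT_QUOTES and UTF-8 are assumed settings.

```php
<?php
// Hypothetical helper: escape user text so the browser shows it literally.
function escape_comment(string $text): string
{
    // ENT_QUOTES also escapes single quotes; UTF-8 is assumed as the page encoding.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

$comment = "This is my comment.<script>alert('Do something bad here!')</script>";

// The script tag is printed as text instead of being executed.
echo escape_comment($comment);
```
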
Formatting with Custom Markup Tags

You can provide special markers for users. For example, you can allow [b]...[/b] for bold text and [i]...[/i] for italics. A simple search-and-replace operation is then enough:
$output = str_replace("[b]", "<b>", $output);
$output = str_replace("[i]", "<i>", $output);
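
The two calls above only handle the opening tags. One possible way to keep opening and closing tags together is a lookup table, sketched below; the helper name replace_simple_tags() is hypothetical.

```php
<?php
// Hypothetical helper: map each custom tag to its HTML counterpart.
function replace_simple_tags(string $text): string
{
    $tags = [
        '[b]'  => '<b>',
        '[/b]' => '</b>',
        '[i]'  => '<i>',
        '[/i]' => '</i>',
    ];
    // str_replace() accepts parallel arrays of search and replacement strings.
    return str_replace(array_keys($tags), array_values($tags), $text);
}

echo replace_simple_tags('this is [b]bold[/b] and [i]italic[/i]');
```

Adding a new tag then only requires a new entry in the table.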

To go a step further, we can let the user enter links. For example, the user may type [link="url"]...[/link], and we convert it to <a href="url">...</a>.

A simple search and replace is no longer enough here; we need a regular expression:
$output = preg_replace('/\[link="([[:graph:]]+)"\]/', '<a href="$1">', $output);

What preg_replace() does here:
It finds each occurrence of [link="..."] and replaces it with <a href="...">. (The original version of this code called ereg_replace(), which was removed in PHP 7.0; preg_replace() is its replacement.)
[[:graph:]] matches any visible, non-space character. See a regular expression reference for details.
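
To see the capture group at work, here is a small self-contained sketch using PHP's preg_replace(). The helper name replace_links() and the sample URL are assumptions made for illustration, and the input is raw (not yet escaped) text.

```php
<?php
// Hypothetical helper: turn [link="..."]...[/link] into an anchor element.
function replace_links(string $text): string
{
    // $1 in the replacement is the text captured by the parenthesized group (the URL).
    $text = preg_replace('/\[link="([[:graph:]]+)"\]/', '<a href="$1">', $text);
    return str_replace('[/link]', '</a>', $text);
}

echo replace_links('see [link="/docs/index.html"]the docs[/link]');
```

Because [[:graph:]]+ is greedy, two link tags with no whitespace between them can be captured as one, so a stricter pattern may be worth considering in real use.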


The format_output() function performs these tag conversions. The overall approach is:
First, call htmlspecialchars() to convert HTML special characters into entities, which neutralizes any HTML tags that should not be displayed.
Then, convert our custom tags to the corresponding HTML tags.
The source code follows:
<?php


function format_output($output) {
    /****************************************************************************
    * Takes a raw string ($output) and formats it for output using a special
    * stripped down markup that is similar to HTML
    ****************************************************************************/

    /* Escape HTML first. stripslashes() was only needed while magic quotes
       existed (removed in PHP 5.4), so it is no longer called here. */
    $output = htmlspecialchars($output);

    /* new paragraph */
    $output = str_replace('[p]', '<p>', $output);

    /* bold */
    $output = str_replace('[b]', '<b>', $output);
    $output = str_replace('[/b]', '</b>', $output);

    /* italics */
    $output = str_replace('[i]', '<i>', $output);
    $output = str_replace('[/i]', '</i>', $output);

    /* preformatted */
    $output = str_replace('[pre]', '<pre>', $output);
    $output = str_replace('[/pre]', '</pre>', $output);

    /* indented blocks (blockquote) */
    $output = str_replace('[indent]', '<blockquote>', $output);
    $output = str_replace('[/indent]', '</blockquote>', $output);

    /* anchors: the quotes are already escaped, so we match &quot;.
       preg_replace() replaces ereg_replace(), which was removed in PHP 7.0. */
    $output = preg_replace('/\[anchor=&quot;([[:graph:]]+)&quot;\]/', '<a name="$1"></a>', $output);

    /* links, note we try to prevent javascript in links: the inserted space
       stops the pattern below from matching (str_ireplace ignores case) */
    $output = str_ireplace('[link=&quot;javascript', '[link=&quot; javascript', $output);
    $output = preg_replace('/\[link=&quot;([[:graph:]]+)&quot;\]/', '<a href="$1">', $output);
    $output = str_replace('[/link]', '</a>', $output);

    /* convert line breaks last */
    return nl2br($output);
}

?>
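
The order of the calls inside format_output() matters. The short sketch below compares both orders on a sample string; the variable names and the input are made up for illustration.

```php
<?php
$input = '[b]bold[/b] <script>alert(1)</script>';

// Correct order: escape first, then convert the custom tags.
$right = str_replace(['[b]', '[/b]'], ['<b>', '</b>'], htmlspecialchars($input));

// Wrong order: the generated <b> tags are escaped along with everything else.
$wrong = htmlspecialchars(str_replace(['[b]', '[/b]'], ['<b>', '</b>'], $input));

echo $right, "\n";
echo $wrong, "\n";
```

In the first result the bold tags survive while the script tag is neutralized; in the second, both are shown as plain text.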

Some things to note:

Convert the custom tags to HTML only after calling htmlspecialchars(), never before; otherwise htmlspecialchars() will escape the tags you just generated and your work will be lost.

After escaping, the strings you search for must use the escaped form of any special characters; for example, a double quote " becomes &quot;.

The nl2br() function converts line breaks into <br> tags; it must also be called after htmlspecialchars().

When converting [link="..."] to <a href="...">, make sure the submitter cannot insert a javascript: URL. A simple way is to change [link="javascript to [link=" javascript (note the added space). The pattern then no longer matches, so the tag is not converted and the original text is simply displayed.
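
Blocking one scheme by name is easy to get around, so a stricter variant is to accept only URLs that start with an allowed prefix. The sketch below is one possible approach, not the article's method. The helper name replace_safe_links() and the allowed prefixes (http, https, site-relative paths, in-page fragments) are assumptions, and the input is expected to have already passed through htmlspecialchars().

```php
<?php
// Hypothetical helper: convert already-escaped [link=&quot;...&quot;]...[/link]
// pairs, but only when the URL starts with an allowed prefix (assumed policy).
function replace_safe_links(string $escaped): string
{
    $pattern = '/\[link=&quot;([[:graph:]]+?)&quot;\](.*?)\[\/link\]/s';

    $result = preg_replace_callback($pattern, function (array $m): string {
        // Allow http(s) URLs, site-relative paths, and in-page fragments.
        if (preg_match('/^(https?:\/\/|\/|#)/i', $m[1]) === 1) {
            return '<a href="' . $m[1] . '">' . $m[2] . '</a>';
        }
        // Anything else (javascript:, data:, ...) stays as visible text.
        return $m[0];
    }, $escaped);

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $escaped;
}

$raw = '[link="http://example.com/"]ok[/link] [link="JavaScript:alert(1)"]bad[/link]';
echo replace_safe_links(htmlspecialchars($raw));
```

Matching the opening and closing tags as a pair also avoids leaving a stray </a> behind when a link is rejected.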


When the page is called in a browser, you can see format_output() in use:

Normal HTML tags cannot be used; use the following special tags instead:

- this is [b]bold[/b]
- this is [i]italics[/i]
- this is [link=""]a link[/link]
- this is [anchor="test"]an anchor, and a [link="#test"]link[/link] to the anchor

[p]Paragraph
[pre]Preformatted[/pre]
[indent]Indented text[/indent]

These are just a few tags; you can of course add more as you need them.

Conclusion

This article presents a method for safely displaying user input, which can be used in programs such as:

Message board
User suggestions
System Announcement
BBS system