Preserving HTML tags in the WordPress excerpt

By default, the_excerpt() in WordPress outputs a short excerpt of a post, wrapped in <p> tags, with all HTML and formatting stripped, including line breaks.

So what if you want to keep the HTML tags intact so that you aren’t left with a blob of plain text? Stack Overflow user Pieter Goosen has a great answer on preserving HTML tags in the_excerpt() in WordPress.

He provides the complete code in the answer along with a thorough explanation. The code also covers:

  • excerpt length
  • “Read more” links
  • smart breaks based on regular expressions (so that the_excerpt() doesn’t cut off in mid-sentence)

For the record, below is my own customization of Pieter’s code, which I am now using for the front page of this site. My changes are mostly small things: adding allowed tags, removing the comma and semicolon from the list of smart break punctuation, shortening the word count, etc.

// this function and the if statement below cover the_excerpt()
function wpse_allowedtags() {
    // Add custom tags to this string
    return '<script>,<style>,<br>,<em>,<i>,<ul>,<ol>,<li>,<a>,<p>,<img>,<video>,<audio>,<pre>,<blockquote>';
}

if ( ! function_exists( 'wpse_custom_wp_trim_excerpt' ) ) : 

    function wpse_custom_wp_trim_excerpt($wpse_excerpt) {
    global $post;
    $raw_excerpt = $wpse_excerpt;
        if ( '' == $wpse_excerpt ) {

            $wpse_excerpt = get_the_content('');
            $wpse_excerpt = strip_shortcodes( $wpse_excerpt );
            $wpse_excerpt = apply_filters('the_content', $wpse_excerpt);
            $wpse_excerpt = str_replace(']]>', ']]&gt;', $wpse_excerpt);
            $wpse_excerpt = strip_tags($wpse_excerpt, wpse_allowedtags()); /*IF you need to allow just certain tags. Delete if all tags are allowed */

            //Set the excerpt word count and only break after sentence is complete.
                $excerpt_word_count = 40;
                $excerpt_length = apply_filters('excerpt_length', $excerpt_word_count); 
                $tokens = array();
                $excerptOutput = '';
                $count = 0;

                // Divide the string into tokens; HTML tags, or words, followed by any whitespace
                preg_match_all('/(<[^>]+>|[^<>\s]+)\s*/u', $wpse_excerpt, $tokens);

                foreach ($tokens[0] as $token) { 

                    // Compare against the filtered length so the 'excerpt_length' filter is respected
                    if ($count >= $excerpt_length && preg_match('/[\?\.\!]\s*$/uS', $token)) { 
                    // Limit reached, continue until ? . or ! occur at the end
                        $excerptOutput .= trim($token);
                        break;
                    }

                    // Add words to complete sentence
                    $count++;

                    // Append what's left of the token
                    $excerptOutput .= $token;
                }

            $wpse_excerpt = trim(force_balance_tags($excerptOutput));

                // "alt" is not a valid attribute on links, so the post title goes in "title", escaped for attribute use
                $excerpt_end = '<a class="read-more" title="' . esc_attr( get_the_title() ) . '" href="' . esc_url( get_permalink() ) . '">' . __( 'Read more &gt;' ) . '</a>'; 
                $excerpt_more = apply_filters('excerpt_more', ' ' . $excerpt_end); 

                //$pos = strrpos($wpse_excerpt, '</');
                //if ($pos !== false)
                // Inside last HTML tag
                //$wpse_excerpt = substr_replace($wpse_excerpt, $excerpt_end, $pos, 0); /* Add read more next to last word */
                // After the content
                $wpse_excerpt .= $excerpt_end; /*Add read more in new paragraph */

            return $wpse_excerpt;   

        }
        return apply_filters('wpse_custom_wp_trim_excerpt', $wpse_excerpt, $raw_excerpt);
    }

endif;

remove_filter('get_the_excerpt', 'wp_trim_excerpt');
add_filter('get_the_excerpt', 'wpse_custom_wp_trim_excerpt'); 
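
If you want to see the smart break logic on its own, here is a minimal sketch that runs outside WordPress. The function name demo_smart_trim() and the sample string are my own inventions for illustration, not part of Pieter’s code. The sketch deliberately skips the shortcode, filter, and force_balance_tags() steps, so treat it as a rough model of the loop rather than a replacement for the function above.

```php
<?php
// Hypothetical helper for illustration only; it is not part of the excerpt code above.
function demo_smart_trim(string $html, int $limit): string {
    $tokens = array();
    $output = '';
    $count  = 0;

    // Same pattern as above: each token is an HTML tag or a word, plus any trailing whitespace
    preg_match_all('/(<[^>]+>|[^<>\s]+)\s*/u', $html, $tokens);

    foreach ($tokens[0] as $token) {
        // Once the limit is reached, stop at the first token that ends in ? . or !
        if ($count >= $limit && preg_match('/[\?\.\!]\s*$/uS', $token)) {
            $output .= trim($token);
            break;
        }

        // Tags are counted as tokens too, not only words
        $count++;
        $output .= $token;
    }

    return $output;
}

echo demo_smart_trim('<p>One two <em>three</em> four. Five six seven. Eight nine.</p>', 6), "\n";
// prints: <p>One two <em>three</em> four.
```

Two things stand out in the output. HTML tags count toward the limit just like words, so a heavily marked-up post gets a shorter excerpt than the word count suggests. The opening <p> is also left unclosed, which is why the full function passes its result through force_balance_tags().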
