Updated on 2025-04-14

How to use isspace() and ungetc() to skip leading whitespace characters

Problem scenario

When reading characters with getchar(), we often need to skip whitespace such as spaces, tabs, and newlines at the start of the input stream until we reach the first valid character. This is especially common when processing user input or parsing files.

Beginners practicing algorithm problems run into the same situation when they first read a value into an int variable and then read a string. Because the number and kind of whitespace characters between the two inputs are unknown, handling the leading whitespace becomes slightly troublesome.

Many problems on Luogu separate input lines with "\r\n" rather than a plain "\n". The author once tried to consume the '\n' between inputs with a single bare getchar() call. That call only removes the '\r' and leaves the '\n' in the stream, and as a result the author spent a whole night on an orange-rated (easy) problem without getting it accepted (AC).
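The "\r\n" pitfall can be sketched in code. The following is a minimal illustration, not the author's original solution: the helper name read_int_then_line and its parameters are made up for this example. It reads an integer, skips however much whitespace follows, and then reads the next line of text.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper (name and signature are assumptions for this sketch).
// Reads an integer, skips all whitespace after it (including "\r\n"),
// then reads the rest of the next non-empty line into buf.
// Returns 1 on success, 0 on failure.
int read_int_then_line(FILE *in, int *n, char *buf, size_t cap) {
    int c;
    if (cap == 0 || fscanf(in, "%d", n) != 1) {
        return 0;
    }
    // Skip ' ', '\t', '\r', '\n' and so on, however many there are
    while ((c = getc(in)) != EOF && isspace(c))
        ;
    if (c == EOF) {
        return 0;
    }
    ungetc(c, in);                     // Put the first valid character back
    if (fgets(buf, (int)cap, in) == NULL) {
        return 0;
    }
    buf[strcspn(buf, "\r\n")] = '\0';  // Drop the trailing line ending
    return 1;
}
```

Calling read_int_then_line(stdin, &n, buf, sizeof buf) behaves the same whether the judge's input uses "\n" or "\r\n", because no assumption is made about how many whitespace characters sit between the two inputs.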

Key Function Description

1. isspace() function

#include <ctype.h>
int isspace(int c);
  • Determines whether the character passed in is a whitespace character
  • In the default "C" locale, the whitespace characters are: space (' '), form feed ('\f'), newline ('\n'), carriage return ('\r'), horizontal tab ('\t'), vertical tab ('\v')
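As a small illustration of isspace(), here is a sketch that counts the whitespace characters at the start of a string. The function name count_leading_whitespace is made up for this example. The cast to unsigned char matters when the value comes from a plain char rather than from getchar(): isspace() expects a value representable as unsigned char, or EOF.

```c
#include <ctype.h>
#include <stddef.h>

// Hypothetical helper: counts the whitespace characters at the start of s.
size_t count_leading_whitespace(const char *s) {
    size_t n = 0;
    // Cast to unsigned char so that negative char values are not passed to isspace()
    while (s[n] != '\0' && isspace((unsigned char)s[n])) {
        n++;
    }
    return n;
}
```

For a string such as " \t\r\nabc" the function stops at 'a', since all four preceding characters are classified as whitespace.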

2. ungetc() function

#include <stdio.h>
int ungetc(int c, FILE *stream);
  • Push the specified character back to the input stream
  • Commonly used in scenarios where you need to restore after "peeking" the next character
  • Only one character of pushback is guaranteed to succeed
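The "peek" pattern mentioned above might look like the sketch below. The name peek_char is an assumption made for this example and is not a standard library function.

```c
#include <stdio.h>

// Hypothetical helper: returns the next character of the stream
// without consuming it, or EOF if nothing is left to read.
int peek_char(FILE *in) {
    int c = getc(in);
    if (c != EOF) {
        ungetc(c, in);  // A single pushback right after a read is the guaranteed case
    }
    return c;
}
```

Because the character is pushed back immediately, calling peek_char(stdin) several times in a row returns the same character each time, and the next getchar() still reads it.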

Solution

#include <ctype.h>
#include <stdio.h>

void trim_leading_whitespace(void) {
    int c;
    // Skip all whitespace characters; note the semicolon after the while (the loop body is empty)
    while ((c = getchar()) != EOF && isspace(c));
    if (c != EOF) {
        ungetc(c, stdin); // Put the first non-whitespace character back into the buffer
    }
}

Example of usage:

int main(void) {
    printf("Please enter a string with spaces: ");

    trim_leading_whitespace();

    int ch;
    printf("Processed first character: ");
    if ((ch = getchar()) != EOF) {
        putchar(ch);
    }

    // Read the remaining characters up to the end of the line
    printf("\nRemaining characters: ");
    while ((ch = getchar()) != '\n' && ch != EOF) {
        putchar(ch);
    }
    putchar('\n');

    return 0;
}

Sample run:

Please enter a string with spaces: Hello World
Processed first character: H
Remaining characters: ello World

Implementation analysis

  1. Loop reading: A while loop keeps reading characters until it reaches EOF or a non-whitespace character
  2. Character classification: isspace() filters out every kind of whitespace character
  3. Character pushback: Once the first non-whitespace character is found, ungetc() puts it back into the input buffer
  4. Follow-up processing: The main program can then read the first valid character as usual

Things to note

  1. Pushback limit: The C standard guarantees only one character of pushback; whether several consecutive ungetc() calls succeed depends on the implementation
  2. Stream type: The same technique works for stdin and for file streams; for a file, replace getchar() with getc(fp) and pass fp to ungetc()
  3. Error handling: Store the return value of getchar() in an int so that EOF can be told apart from valid characters, and do not push EOF back
  4. Encoding compatibility: isspace() classifies single-byte characters, so ASCII input is handled correctly; for wide characters, use iswspace() (declared in <wctype.h>) instead
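To make the notes on stream type and EOF handling concrete, here is one possible generalization that works on any FILE * stream. It is a sketch rather than a definitive implementation, and the name skip_leading_whitespace and its return convention are assumptions made for this example.

```c
#include <ctype.h>
#include <stdio.h>

// Hypothetical variant of trim_leading_whitespace() for any input stream.
// Returns the first non-whitespace character (which stays in the stream),
// or EOF if the stream ends first or the pushback fails.
int skip_leading_whitespace(FILE *stream) {
    int c;  // int, not char, so that EOF can be distinguished from valid characters
    while ((c = getc(stream)) != EOF && isspace(c))
        ;
    if (c == EOF) {
        return EOF;             // Nothing to push back
    }
    return ungetc(c, stream);   // ungetc() returns c on success, EOF on failure
}
```

Returning the pushed-back character lets the caller check for end of input in one step, for example: if (skip_leading_whitespace(fp) == EOF) { /* no more data */ }.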

Combining isspace() and ungetc() in this way gives a clean method for preprocessing the input stream and lays a good foundation for the character processing that follows.
