SoFunction
Updated on 2025-03-02

Simple text string processing in Python

This article shows how to do simple text string processing in Python. The details are as follows:

A text string can be split into words with Python's split() method. Let's look at how it works in practice.

mySent = 'This book is the best book on python!'
print(mySent.split())

Output:

['This', 'book', 'is', 'the', 'best', 'book', 'on', 'python!']
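
As a small illustration of what split() does when called without arguments, the sketch below uses a made-up sample string (not from the article) containing double spaces, a tab and a newline. It also contrasts this with split(' '), which treats every single space as a separator.

```python
# Hypothetical sample: the same sentence with mixed whitespace.
text = 'This  book\tis the\nbest book on python!'

# With no arguments, split() treats any run of whitespace as one separator.
words = text.split()
print(words)

# With an explicit ' ' separator, consecutive spaces produce empty strings.
spaced = 'a  b'.split(' ')
print(spaced)
```

In both cases punctuation is left attached to the neighbouring word, which is the problem the next step deals with.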

As you can see, the split works well, but the punctuation mark stays attached to the last word. A regular expression can handle this: use as the separator any run of characters that are not letters, digits or underscores.

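import re
# \W+ matches one or more non-word characters
# (with \W*, Python 3.7+ would also split on empty matches)
reg = re.compile(r'\W+')
mySent = 'This book is the best book on python!'
listof = reg.split(mySent)
print(listof)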

The output is:

['This', 'book', 'is', 'the', 'best', 'book', 'on', 'python', '']
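
To make it clearer what counts as a separator here, the following sketch splits a made-up sentence (an assumption for illustration, not part of the article) on runs of non-word characters. Note that \W does not match the underscore, while an apostrophe or a dot does act as a separator.

```python
import re

# \W matches any character that is not a letter, a digit or an underscore.
reg = re.compile(r'\W+')

# Hypothetical sample sentence with an apostrophe, an underscore and a dot.
sample = "Don't rename my_file2.txt, please!"

parts = reg.split(sample)
print(parts)
```

The identifier my_file2 survives as one token, "Don't" is cut in two, and the trailing "!" leaves an empty string at the end of the list.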

This gives a list of words, but the empty string at the end needs to be removed.

You can check the length of each string and keep only those whose length is greater than 0.

import re
reg = re.compile(r'\W+')
mySent = 'This book is the best book on python!'
listof = reg.split(mySent)
# keep only the non-empty tokens
new_list = [tok for tok in listof if len(tok)>0]
print(new_list)

The output is:

['This', 'book', 'is', 'the', 'best', 'book', 'on', 'python']
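
As an alternative to splitting and then filtering (a different technique from the one used in this article), re.findall() can collect the word characters directly, so no empty strings appear in the first place. A minimal sketch:

```python
import re

mySent = 'This book is the best book on python!'

# \w+ matches each run of letters, digits or underscores.
words = re.findall(r'\w+', mySent)
print(words)

# A string that starts with punctuation yields no empty token either.
leading = re.findall(r'\w+', '...Hello, world')
print(leading)
```

Which form to use is mostly a matter of taste; the split-and-filter version keeps the separator pattern explicit, while findall() describes the tokens themselves.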

Finally, notice that the first word of the sentence is capitalized. We want every word in the same form, so uppercase letters should be converted to lowercase. Python strings have built-in methods that convert a string to all lowercase (.lower()) or all uppercase (.upper()).

import re
reg = re.compile(r'\W+')
mySent = 'This book is the best book on python!'
listof = reg.split(mySent)
# lowercase every non-empty token
new_list = [tok.lower() for tok in listof if len(tok)>0]
print(new_list)

The output is:

['this', 'book', 'is', 'the', 'best', 'book', 'on', 'python']

Here is a complete email, saved as a text file with the following content:

Hi Peter,

With Jose out of town, do you want to
meet once in a while to keep things
going and do some interesting stuff?

Let me know
Eugene

import re
reg = re.compile(r'\W+')
email = open('').read()  # pass the path of the file that holds the email text
words = reg.split(email)
new_txt = [tok.lower() for tok in words if len(tok)>0]
print(new_txt)

Output:

['hi', 'peter', 'with', 'jose', 'out', 'of', 'town', 'do', 'you', 'want', 'to', 'meet', 'once', 'in', 'a', 'while', 'to', 'keep', 'things', 'going', 'and', 'do', 'some', 'interesting', 'stuff', 'let', 'me', 'know', 'eugene']
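
The steps above can be wrapped in one small function. The sketch below is only an illustration: the function name tokenize is made up here, and the email text is inlined as a string instead of being read from a file, so that the example runs on its own.

```python
import re

# Separator: one or more characters that are not letters, digits or underscores.
SEPARATOR = re.compile(r'\W+')


def tokenize(text):
    """Split text into words, drop empty strings and lowercase the rest.

    Hypothetical helper name, not part of any library.
    """
    return [tok.lower() for tok in SEPARATOR.split(text) if len(tok) > 0]


# The email from the article, inlined as a string (assumption: same content as the file).
email = """Hi Peter,

With Jose out of town, do you want to
meet once in a while to keep things
going and do some interesting stuff?

Let me know
Eugene
"""

tokens = tokenize(email)
print(tokens)
```

To process the file instead, the text could be read with open(path).read(), or inside a with block so the file is closed afterwards, and then passed to the same function.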


I hope this article helps you with your Python programming.