SoFunction
Updated on 2025-03-02

Simple and Easy-to-Understand Python Text Processing Methods

This article walks through several simple, easy-to-understand Python text processing methods, with example code for each. It may be a useful reference for study or work.

Ever since I started learning Python, everything seems to have become easier. As a novice, Dou Xueer will summarize a few small Python text processing methods for you today.

Without further ado, here is the code.

Python uppercase and lowercase character swap

Four string methods are commonly used for case conversion: upper(), lower(), capitalize() and title().

text = ""  # substitute any string here
print(text.upper())       # Convert all lowercase letters to uppercase
print(text.lower())       # Convert all uppercase letters to lowercase
print(text.capitalize())  # Uppercase the first letter and lowercase the rest
print(text.title())       # Uppercase the first letter of each word and lowercase the rest
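
To make the difference between the four methods visible, here is a minimal sketch. The sample string is hypothetical and chosen only for illustration; it is not the string used in the original article.

```python
# Hypothetical sample string, chosen only to show the effect of each method
sample = "hello WORLD from python"

upper_text = sample.upper()              # every letter uppercased
lower_text = sample.lower()              # every letter lowercased
capitalized_text = sample.capitalize()   # first character uppercased, rest lowercased
title_text = sample.title()              # first letter of each word uppercased, rest lowercased

for label, value in [
    ("upper", upper_text),
    ("lower", lower_text),
    ("capitalize", capitalized_text),
    ("title", title_text),
]:
    print(f"{label:<12}{value}")
```

Note that capitalize() and title() also lowercase the remaining letters, which is why "WORLD" becomes "world" or "World".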


You can also swap upper and lower case in a single pass:

s="hGdssWW678qqfdDDD777f8888sD8FJJss jjYYhVV #sh&" 
def fn(x):
  if x.islower():
    return x.upper()
  elif x.isupper():
    return x.lower()
  else:
    return x
result=''.join([fn(r) for r in list(s)])
print(result)
HgDSSww678QQFDddd777F8888Sd8fjjSS JJyyHvv #SH&

Besides upper and lower case letters, s also contains digits and symbols as distractions. The code swaps the case of every letter and leaves the other characters unchanged.
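
As an aside, Python strings also have a built-in swapcase() method that performs the same swap. A minimal sketch on the same string s:

```python
s = "hGdssWW678qqfdDDD777f8888sD8FJJss jjYYhVV #sh&"

# swapcase() flips the case of each letter and leaves digits and symbols alone
swapped = s.swapcase()
print(swapped)
```

The character-by-character version is still useful when the rule for each character is more involved than a plain case flip.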

Row interchange

01: Using insert to move one row to another position

with open('D:
.txt','r') as f:
  txt = f.readlines()  # Read all lines into a list
  txt.insert(4, txt[1])  # Insert a copy of the second line at index 4 (the fifth position)
  del txt[1]  # Delete the original second line
  print(''.join(txt))  # Print the reordered lines
1  A  one 
3  C  three 
4  D  Four 
2  B  two 
5  E  five 
6  F  six
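
The same move can be tried without a file. The sketch below is an assumption-based illustration: the rows are kept in an in-memory list that mirrors the output above, and move_row is a hypothetical helper name, not part of the original code.

```python
# In-memory stand-in for the lines of the text file
rows = [
    "1  A  one",
    "2  B  two",
    "3  C  three",
    "4  D  Four",
    "5  E  five",
    "6  F  six",
]

def move_row(lines, src, dst):
    """Return a new list in which lines[src] has been moved to index dst."""
    result = list(lines)      # copy, so the input list is left untouched
    row = result.pop(src)     # remove the row from its old position
    result.insert(dst, row)   # insert it at its new position
    return result

moved = move_row(rows, 1, 3)
print("\n".join(moved))
```

Popping first and inserting afterwards avoids the index shift that has to be kept in mind when inserting a copy and then deleting the original.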

02: Swapping the rows and columns of a matrix (transpose)

matrix = [[1, 1, 1, 1],
         [2, 2, 2, 2],
         [3, 3, 3, 3],]

trans = []
for i in range(4):
  trans.append([row[i] for row in matrix])  # Column i of matrix becomes row i of trans

print('', trans)
 [[1, 2, 3], 
 [1, 2, 3], 
 [1, 2, 3], 
 [1, 2, 3]]
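
A common alternative, sketched here on the same matrix, is to let zip() pair up the elements column by column, so the column count does not have to be hard-coded:

```python
matrix = [[1, 1, 1, 1],
          [2, 2, 2, 2],
          [3, 3, 3, 3]]

# zip(*matrix) yields one tuple per column; list() turns each tuple into a row
transposed = [list(column) for column in zip(*matrix)]
print(transposed)
```

This assumes every row has the same length; zip() silently stops at the shortest row otherwise.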

For swapping rows and columns, Python also has the very useful pandas library, which makes the operation easy. See the earlier article "Ten Minutes to Start Pandas".

Implementing quicksort

The idea of quicksort: first select one element (usually the first number in the array) as the key, then put all numbers smaller than it in front of it and all numbers larger than it behind it. This process is called one pass of quicksort. Repeating it recursively on the two parts sorts the whole array.
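
A single pass can be sketched as follows. This is only an illustration of the idea, not an in-place implementation; partition_once is a hypothetical helper name, and elements equal to the key are assumed to go to the front part.

```python
def partition_once(values):
    """Split values around the first element, which serves as the key."""
    key = values[0]
    smaller = [v for v in values[1:] if v <= key]   # goes in front of the key
    larger = [v for v in values[1:] if v > key]     # goes behind the key
    return smaller, key, larger

smaller, key, larger = partition_once([11, 22, 8, 23, 7, 33, 13, 28, 66, 777])
print(smaller, key, larger)
```

After this pass the key 11 is already in its final position; the two parts are not yet sorted internally.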

01: A super "short" Python implementation: the whole quicksort fits in a single return statement.

def quickSort(arg):
  if(arg==[]):
     return []
  return quickSort([i for i in arg[1:] if i<=arg[0]])+[arg[0]]+quickSort([i for i in arg[1:] if i>arg[0]])
print(quickSort([11,22,8,23,7,33,13,28,66,777]))
[7, 8, 11, 13, 22, 23, 28, 33, 66, 777]

02: General quicksort implementation

def quicksort(array, left, right):
  # Recursion termination condition
  if left >= right:
    return
  low = left  # low remembers the left boundary of the sequence
  high = right  # high remembers the right boundary of the sequence
  key = array[low]  # Use the leftmost number as the pivot element
  while left < right:
    # While left and right have not met and the right element is larger than the pivot, move the right cursor to the left
    while left < right and array[right] > key:
      right -= 1
    # An element not larger than the pivot was found: move it to the left side
    array[left] = array[right]

    # While left and right have not met and the left element is not larger than the pivot, move the left cursor to the right
    while left < right and array[left] <= key:
      left += 1
    # An element larger than the pivot was found: move it to the right side
    array[right] = array[left]

  # When left and right meet, that position is the pivot's sorted position
  array[right] = key

  # Recurse on the sequences on both sides of the placed pivot
  quicksort(array, low, left - 1)
  quicksort(array, left + 1, high)

array = [11,22,8,23,7,33,13,28,66,777]
print("Quick Sort: ")
quicksort(array,0,len(array)-1)
print(array)
[7, 8, 11, 13, 22, 23, 28, 33, 66, 777]

03: The quicksort procedure from "Introduction to Algorithms"

def quicksort(array, l, r):
  if l < r:
    q = partition(array, l, r)
    quicksort(array, l, q - 1)
    quicksort(array, q + 1, r)

def partition(array, l, r):
  x = array[r]
  i = l - 1
  for j in range(l, r):
    if array[j] <= x:
      i += 1
      array[i], array[j] = array[j], array[i]
  array[i + 1], array[r] = array[r], array[i+1]
  return i + 1
array = [11,22,8,23,7,33,13,28,66,777]
print("Quick Sort: ")
quicksort(array,0,len(array)-1)
print(array)
[7, 8, 11, 13, 22, 23, 28, 33, 66, 777]

04: Python's built-in sorted() function for sorting lists

a = [11,22,8,23,7,33,13,28,66,777]
b=sorted(a)
print(b)
print(a)

[7, 8, 11, 13, 22, 23, 28, 33, 66, 777]
[11, 22, 8, 23, 7, 33, 13, 28, 66, 777]

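sorted() is a built-in function rather than a list method, and it works well. It returns a new sorted list and does not change the original order of a. It can also sort a string that mixes letters, digits and symbols: the characters are ordered by code point, so in ASCII text the digits come before the letters. A list that mixes numbers and strings, however, cannot be compared directly in Python 3 and raises a TypeError.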
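
The following sketch contrasts sorted() with the in-place list.sort() method and shows the reverse and key parameters. The sort-by-last-digit key and the mixed string are made up purely for illustration.

```python
numbers = [11, 22, 8, 23, 7, 33, 13, 28, 66, 777]

descending = sorted(numbers, reverse=True)              # new list, largest first
by_last_digit = sorted(numbers, key=lambda n: n % 10)   # stable sort on the last digit

in_place = list(numbers)        # copy so that numbers itself stays unchanged
sort_result = in_place.sort()   # sorts in place and returns None

chars = sorted("b2a1#")         # a string is sorted character by character

print(descending)
print(by_last_digit)
print(in_place, sort_result)
print(chars)
```

Use sorted() when the original order must be kept, and list.sort() when the list can be reordered in place.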

Text Alignment

Sometimes the text we get is messy and needs to be aligned. There are several ways to do this:

01: Alignment with format

# Alignment with format
def f1():
  with open("D:
.txt","r") as f:
    for s in f:
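      l = s.split()  # Split the line into fields (whitespace here; pass the file's delimiter if it uses one)
      # Left-align each field to a fixed width; a fill character can also be specified
      t = '{0:<5} {1:<7} {2}'.format(l[0], l[1], l[2])
      print(t)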
f1()

111   ABC     watermelon
22222 AABBC   Peach
3333  CSDDGFF banana
44    QQQSED  Porocie

02: Alignment with ljust

# Alignment with ljust
r=''
def f2():
  f=open("D:
.txt","r")
  for s in f:
    l = s.split()  # Split the string on a delimiter (whitespace by default)
    print(l[0].ljust(5," "),l[1].ljust(7," "),l[2])
  f.close()  # Close the file once all lines have been printed
f2()

111   ABC     watermelon
22222 AABBC   Peach
3333  CSDDGFF banana
44    QQQSED  Porocie
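
The column widths 5 and 7 above are fixed by hand. As a sketch under the assumption that the rows are already available as tuples in memory (the file is not read here), the widths can instead be computed from the data, and the same format specification also covers right alignment and centering:

```python
# In-memory stand-in for the first three rows of the file shown above
rows = [
    ("111", "ABC", "watermelon"),
    ("22222", "AABBC", "Peach"),
    ("3333", "CSDDGFF", "banana"),
]

# Width of each of the first two columns = length of its longest entry
widths = [max(len(row[i]) for row in rows) for i in range(2)]

# Nested braces let the width come from a variable inside an f-string
lines = [f"{a:<{widths[0]}} {b:<{widths[1]}} {c}" for a, b, c in rows]
print("\n".join(lines))

right_aligned = f"{'44':>5}"        # pad on the left with spaces
centered = "ABC".center(7, "*")     # pad on both sides with a fill character
print(right_aligned)
print(centered)
```

In a format specification, < means left-aligned, > right-aligned and ^ centered; ljust(), rjust() and center() are the string-method counterparts.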

Output by category

01: Output by category with regular expressions

#regular expressiona="aA1, bB2, cC3, dD4, eE5, fF6, six, gGG7, hH8, iI99, ninth"
import re
reg=["[a-z]","[A-Z]","d","[^da-zA-Z]"]
#compile and findall use together to return a listfor s in reg:  
  rega=(s)
  s=(rega,a)
  print("".join(s))

abcdefghi
ABCDEFGHI
123456789
One, two, three, four, five, six, seven, eight, nine

02: Output by category with the string module

#string methoda="aA1, bB2, cC3, dD4, eE5, fF6, six, gGG7, hH8, iI99, ninth"
import string
ta=tb=tc=td=''
la=string.ascii_lowercase#la is lowercase letterua=string.ascii_uppercase#ua is capital letternb=#nb is 0~9ub="One, two, three, four, five, six, seven, eight, nine"

#Find lowercase, uppercase letters and numbers from a and output them separatelyfor s in a:
  if s in la:
    ta=ta+s
  if s in ua:
    tb=tb+s
  if s in nb:
    tc=tc+s
  if s in ub:
    td=td+s
print(ta)
print(tb)
print(tc)
print(td)

abcdefghi
ABCDEFGHI
123456789
One, two, three, four, five, six, seven, eight, nine
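
The same grouping can be sketched with the character tests built into str, collecting everything in one pass. The function name split_by_category and the ASCII-only sample string are hypothetical and used only for illustration.

```python
def split_by_category(text):
    """Group the characters of text into lowercase, uppercase, digits and others."""
    groups = {"lower": [], "upper": [], "digit": [], "other": []}
    for ch in text:
        if ch.islower():
            groups["lower"].append(ch)
        elif ch.isupper():
            groups["upper"].append(ch)
        elif ch.isdigit():
            groups["digit"].append(ch)
        else:
            groups["other"].append(ch)
    # Join each list once at the end instead of concatenating strings in the loop
    return {name: "".join(chars) for name, chars in groups.items()}

categories = split_by_category("aA1-bB2-cC3")   # hypothetical sample string
for name, chars in categories.items():
    print(name, chars)
```

Collecting characters in lists and joining them once avoids building a new string on every iteration, which matters for long inputs.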

OK, that’s all for today’s sharing.
