SoFunction
Updated on 2024-10-30

Regular Expressions and JSON Data Exchange Format in Python

I. Getting to Know Regular Expressions

A regular expression is a special sequence of characters that describes a pattern. With it we can check whether a string matches the pattern we set, quickly retrieve text, and replace text.

JSON (like XML) is a lightweight data exchange format for the web.

import re
a = 'C|C++|Java|C#||Python|Javascript'
# findall returns a list of every non-overlapping match
r = re.findall('Python', a)
print(r)
if len(r) > 0:
    print('String with Python in it')
else:
    print('No')

Results:

['Python']
String with Python in it
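
For a fixed word like this one, Python's built-in in operator gives the same yes/no answer; regular expressions start to pay off once the thing being searched for is a pattern rather than a literal. A minimal comparison sketch (the variable names are only for illustration):

```python
import re

a = 'C|C++|Java|C#||Python|Javascript'

# Plain substring test, no regular expression involved
found_with_in = 'Python' in a

# Same check with re.findall, which returns a list of matches
found_with_re = len(re.findall('Python', a)) > 0

print(found_with_in, found_with_re)
```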

II. Metacharacters and ordinary characters

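import re
a = 'C0C++7Java8C#9Python6Javascript'
# \d is a metacharacter that matches any single digit
r = re.findall(r'\d', a)
print(r)
# The same extraction without a regular expression
b = ''
for x in a:
    try:
        int(x)
        b += x + ','
    except ValueError:
        pass
print(b)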

Results:

['0', '7', '8', '9', '6']
0,7,8,9,6,

'Python' is a string of ordinary characters; '\d' is a metacharacter.
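
Because metacharacters have a special meaning, a pattern that should match one literally has to escape it. The sketch below reuses the string from this section to find a literal C++; the variable names are illustrative, and re.escape is the standard-library helper that does the escaping for us.

```python
import re

a = 'C0C++7Java8C#9Python6Javascript'

# '+' is a metacharacter, so it is escaped to match a literal plus sign
escaped_by_hand = re.findall(r'C\+\+', a)

# re.escape builds the escaped pattern from a literal string
escaped_by_helper = re.findall(re.escape('C++'), a)

print(escaped_by_hand)
print(escaped_by_helper)
```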

III. Character sets

import re
# Find the words whose middle character is neither c nor f
s = 'abc, acc, adc, aec, afc, ahc'
# [^cf] excludes c and f; [cf] would match c or f, and [a-z] matches a range
r = re.findall('a[^cf]c', s)
print(r)

Results:

['abc', 'adc', 'aec', 'ahc']
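
The comment in the code mentions two other forms, [cf] and a range such as [a-z]. As a small sketch on the same input (variable names are illustrative), this is how they behave:

```python
import re

s = 'abc, acc, adc, aec, afc, ahc'

# [cf] matches exactly one character that is c or f
either_c_or_f = re.findall('a[cf]c', s)

# [c-f] matches one character in the range c through f
range_c_to_f = re.findall('a[c-f]c', s)

print(either_c_or_f)
print(range_c_to_f)
```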

IV. Generalized character sets

import re
# \d digits                  \D non-digits
# \w word characters = [a-zA-Z0-9_]   \W non-word characters
# \s whitespace characters   \S non-whitespace characters
a = 'python 11\t11java&678p\nh\rp'
r = re.findall(r'\s', a)
print(r)

Results:

[' ', '\t', '\n', '\r']
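
The other classes in the comment can be tried the same way. A short sketch on a made-up sample string (the string and variable names are assumptions chosen for illustration):

```python
import re

text = 'py 3&go_1'

word_chars = re.findall(r'\w', text)      # letters, digits and underscore
non_word_chars = re.findall(r'\W', text)  # everything \w does not match
non_digits = re.findall(r'\D', text)      # everything except digits

print(word_chars)
print(non_word_chars)
print(non_digits)
```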

V. Quantifiers

import re
a = 'python 1111java&678php'
# {3,6} repeats the preceding character set between 3 and 6 times
r = re.findall('[a-z]{3,6}', a)
print(r)

Results:

['python', 'java', 'php']
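
Besides the {m,n} form, a quantifier can give an exact count {n} or only a lower bound {n,}. A brief sketch on the same string (variable names are illustrative):

```python
import re

a = 'python 1111java&678php'

exactly_three = re.findall(r'\d{3}', a)   # exactly 3 digits per match
two_or_more = re.findall(r'\d{2,}', a)    # at least 2 digits, no upper bound

print(exactly_three)
print(two_or_more)
```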

VI. Greedy and non-greedy matching

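import re
a = 'python 1111java&678php'
# Quantifiers are greedy by default; a trailing ? makes them non-greedy,
# so {3,6}? stops as soon as 3 characters have matched
r = re.findall('[a-z]{3,6}?', a)
print(r)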

Results:

['pyt', 'hon', 'jav', 'php']
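
The difference is often easier to see with .* than with a counted quantifier. The following sketch uses a made-up tag string (an assumption for illustration) to compare the greedy and non-greedy forms:

```python
import re

html = '<b>bold</b><i>italic</i>'

# Greedy: .* grabs as much as possible, so one match spans the whole string
greedy = re.findall('<.*>', html)

# Non-greedy: .*? stops at the first closing bracket it can
non_greedy = re.findall('<.*?>', html)

print(greedy)
print(non_greedy)
```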

VII. Matching 0 times, 1 time, or an unlimited number of times

import re
# * matches the preceding character 0 or more times
# + matches the preceding character 1 or more times
# ? matches the preceding character 0 or 1 time
a = 'pytho0python1pythonn2pythonw'
r = re.findall('python*', a)
print(r)

Results:

['pytho', 'python', 'pythonn', 'python']
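
Only the * case is run above. As a sketch, the same input can be tried with + and ? to see how the last character n is treated in each case (variable names are illustrative):

```python
import re

a = 'pytho0python1pythonn2pythonw'

# + requires at least one n, so 'pytho' no longer matches
one_or_more = re.findall('python+', a)

# ? allows at most one n, so 'pythonn' is cut to 'python'
zero_or_one = re.findall('python?', a)

print(one_or_more)
print(zero_or_one)
```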

VIII. Boundary Matchers

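import re
qq = '12345678'
# A valid number here is 4 to 8 digits long
r = re.findall(r'^\d{4,8}$', qq)
print(r)
a = '123456789'
# ^ anchors the beginning of the string and $ anchors the end
e = re.findall(r'^\d{4,8}$', a)
print(e)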

Results:

['12345678']
[]
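
To see why the anchors matter, the sketch below first runs the pattern without them, then wraps the check in a small helper. The helper name is_valid_qq is hypothetical, and it uses re.fullmatch, which is one alternative to writing ^ and $ by hand.

```python
import re

def is_valid_qq(text):
    # Hypothetical helper: True only if the whole string is 4 to 8 digits
    return re.fullmatch(r'\d{4,8}', text) is not None

# Anchors omitted on purpose: the first 8 of the 9 digits still match
unanchored = re.findall(r'\d{4,8}', '123456789')

print(unanchored)
print(is_valid_qq('12345678'), is_valid_qq('123456789'), is_valid_qq('123'))
```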

IX. Groups

import re
# () groups characters so that a quantifier applies to the whole group
a = 'pythonpythonpythonpythonpython'
r = re.findall('(python){3}', a)
print(r)

Results:

['python']

The single result means that one run of three consecutive 'python' (pythonpythonpython) was found. findall returns the text captured by the group, not the whole match.
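
If the whole matched run is wanted instead of the group's content, two options are a non-capturing group (?:...) or calling group(0) on a match object. A minimal sketch (variable names are illustrative):

```python
import re

a = 'pythonpythonpythonpythonpython'

# (?:...) groups without capturing, so findall returns the full match
whole_with_findall = re.findall('(?:python){3}', a)

# group(0) on a match object is always the full match
whole_with_search = re.search('(python){3}', a).group(0)

print(whole_with_findall)
print(whole_with_search)
```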

X. Matching mode parameters

import re
# re.I ignores case; re.S lets . match every character, including \n
# Several flags are combined with |
language = 'PythonC#\nJavaPHP'
r = re.findall('c#.{1}', language, re.I | re.S)
print(r)

Results:

['C#\n']
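
Both flags are needed for this match. The sketch below applies them one at a time to the same pattern to show what each contributes (variable names are illustrative):

```python
import re

language = 'PythonC#\nJavaPHP'

no_flags = re.findall('c#.{1}', language)           # lowercase c does not match C
only_i = re.findall('c#.{1}', language, re.I)       # case ok, but . cannot match \n
only_s = re.findall('c#.{1}', language, re.S)       # . matches \n, but case differs
both = re.findall('c#.{1}', language, re.I | re.S)  # both conditions satisfied

print(no_flags, only_i, only_s, both)
```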

XI. Regular substitution

Search and replace

import re

def convert(value):
    # value is a match object, e.g. <re.Match object; span=(6, 8), match='C#'>
    matched = value.group()
    return '!!' + matched + '!!'

language = 'PythonC#JavaC#PHPC#'
# r = re.sub('C#', 'GO', language, count=1)  returns: PythonGOJavaC#PHPC#
# s = language.replace('C#', 'GO')
r = re.sub('C#', convert, language)  # pass a function as the replacement
print(r)

Results:

Python!!C#!!Java!!C#!!PHP!!C#!!
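
The simpler forms of replacement can be run as well. This sketch compares re.sub with and without the count argument against str.replace, on the same input (variable names are illustrative):

```python
import re

language = 'PythonC#JavaC#PHPC#'

# count defaults to 0, which means replace every match
replace_all = re.sub('C#', 'GO', language)

# count=1 replaces only the first match
replace_first = re.sub('C#', 'GO', language, count=1)

# Plain string replacement, no regular expression involved
with_str_method = language.replace('C#', 'GO')

print(replace_all)
print(replace_first)
print(with_str_method)
```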

XII. Passing functions as parameters

import re

def convert(value):
    matched = value.group()  # the text of this match
    # value is a match object, e.g. <re.Match object; span=(1, 2), match='8'>
    if int(matched) >= 6:
        return '9'
    else:
        return '0'

language = 'A8C3721D86'
r = re.sub(r'\d', convert, language)
print(r)

Results:

A9C0900D99
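
For short rules the replacement function is often written inline as a lambda. As an illustrative variation (not the rule used above), this sketch doubles every run of digits in the same string:

```python
import re

s = 'A8C3721D86'

# The lambda receives each match object and must return a string
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), s)

print(doubled)
```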

XIII. The search and match functions

import re
s = 'A8C3721D86'
# match only tries at the beginning of the string and returns None if that fails
r = re.match(r'\d', s)
print(r)  # None
# search scans the string and returns as soon as the first match is found
r1 = re.search(r'\d', s)
print(r1)  # <re.Match object; span=(1, 2), match='8'>
print(r1.group())  # 8
print(r1.span())  # (1, 2)
# findall returns every match
r2 = re.findall(r'\d', s)
print(r2)  # ['8', '3', '7', '2', '1', '8', '6']
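
Since both match and search can return None, calling group() on the result without a check raises AttributeError when nothing was found. A small sketch of a guarded lookup (first_digit is a hypothetical helper name):

```python
import re

def first_digit(text):
    # Hypothetical helper: return the first digit in text, or None
    m = re.search(r'\d', text)
    return m.group() if m else None

found = first_digit('A8C3721D86')
missing = first_digit('ABC')

print(found, missing)
```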

XIV. Grouping

import re
# Extract the text between life and python
s = 'life is short,i use python'
r = re.search('life.*python', s)
print(r.group())  # life is short,i use python
# group(n) takes a group number
r = re.search('life(.*)python', s)
print(r.group(0))  # life is short,i use python
print(r.group(1))  #  is short,i use
# group(0) is a special case: the text matched by the whole expression
r = re.findall('life(.*)python', s)
print(r)  # [' is short,i use ']
s = 'life is short,i use python, i love python'
r = re.search('life(.*)python(.*)python', s)
print(r.group(0))  # life is short,i use python, i love python
print(r.group(1))  #  is short,i use
print(r.group(2))  # , i love
print(r.group(0, 1, 2))  # ('life is short,i use python, i love python', ' is short,i use ', ', i love ')
print(r.groups())  # (' is short,i use ', ', i love ')
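
When the text contains python more than once, a single greedy group runs up to the last occurrence. The sketch below contrasts that with a non-greedy group on the two-python sentence (variable names are illustrative):

```python
import re

s = 'life is short,i use python, i love python'

# Greedy group: extends to the last 'python' in the string
greedy_group = re.search('life(.*)python', s).group(1)

# Non-greedy group: stops at the first 'python'
lazy_group = re.search('life(.*?)python', s).group(1)

print(repr(greedy_group))
print(repr(lazy_group))
```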

XV. Some suggestions for learning regular expressions

# \d digits                  \D non-digits
# \w word characters = [a-zA-Z0-9_]   \W non-word characters
# \s whitespace characters   \S non-whitespace characters
# .  matches any character except the newline \n
# *  matches 0 or more times
# +  matches 1 or more times
# ?  matches 0 or 1 time
# () group
# re.I | re.S  ignore case | let . match every character

In Python, regular expressions are used mainly in crawlers and data processing.

XVI. Understanding JSON

JSON is a lightweight data exchange format.

A string is the form in which JSON is represented.

A string that conforms to the JSON format is called a JSON string.

{"name":"qiyue"}
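
"Conforms to the JSON format" is stricter than it may look: for example, JSON strings and keys must use double quotes. A minimal sketch using the standard json module (the second string is a deliberately invalid example):

```python
import json

valid = '{"name":"qiyue"}'
invalid = "{'name':'qiyue'}"  # single quotes are not valid JSON

parsed = json.loads(valid)

try:
    json.loads(invalid)
    invalid_was_accepted = True
except json.JSONDecodeError:
    invalid_was_accepted = False

print(parsed)
print(invalid_was_accepted)
```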

JSON vs. XML

Advantages of JSON:

Exchanges data across languages

Easy to read

Easy to parse

Efficient to transmit over the network

XVII. Deserialization

import json
# JSON has two container types: object and array
json_str = '{"name":"qiyue","age":18}'
# Deserialization: loads() converts a JSON string into a Python data type
s = json.loads(json_str)
print(type(s))  # <class 'dict'>
print(s)  # {'name': 'qiyue', 'age': 18}
print(s['name'])  # qiyue
json_str = '[{"name":"qiyue","age":18},{"name":"qiyue","age":18}]'
s = json.loads(json_str)
print(type(s))  # <class 'list'>
print(s)  # [{'name': 'qiyue', 'age': 18}, {'name': 'qiyue', 'age': 18}]

JSON             Python
object           dict
array            list
string           str
number (int)     int
number (real)    float
true             True
false            False
null             None
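
The mapping in the table can be checked directly. This sketch loads one JSON object that contains a value of each kind and records the Python type name of each; the key names are made up for the example.

```python
import json

json_str = '{"obj": {}, "arr": [], "s": "x", "i": 1, "f": 1.5, "t": true, "n": null}'
data = json.loads(json_str)

# Map each key to the name of the Python type its value was converted to
type_names = {key: type(value).__name__ for key, value in data.items()}

print(type_names)
```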

XVIII. Serialization

import json
# Serialization: dumps() converts a Python data type into a JSON string
student = [
    {"name": "qiyue", "age": 18, 'flag': False},
    {"name": "python", "age": 18}
]
json_str = json.dumps(student)
print(type(json_str))  # <class 'str'>
print(json_str)  # [{"name": "qiyue", "age": 18, "flag": false}, {"name": "python", "age": 18}]
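
Serialization and deserialization are inverse operations for data like this, and dumps() accepts a few formatting options. A short sketch, assuming a non-ASCII name purely to show what ensure_ascii does:

```python
import json

student = [
    {"name": "qiyue", "age": 18, "flag": False},
    {"name": "python", "age": 18},
]

# Round trip: dumps() followed by loads() gives back an equal structure
restored = json.loads(json.dumps(student))

# By default non-ASCII characters are escaped; ensure_ascii=False keeps them
escaped = json.dumps({"name": "七月"})
readable = json.dumps({"name": "七月"}, ensure_ascii=False)

# indent produces a multi-line, human-readable string
pretty = json.dumps(student, indent=2)

print(restored == student)
print(escaped)
print(readable)
print(pretty)
```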

XIX. A few words on JSON, JSON objects, and JSON strings

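JSON is a lightweight data exchange format.

A "JSON object" is a notion tied to a particular language; in Python the corresponding type is dict.

A JSON string is a string that conforms to the JSON format.

JSON has its own data types.

Although they resemble JavaScript's data types, JSON and JavaScript are not the same language.

ECMAScript is a standard. JavaScript and ActionScript are implementations of that standard, and JSON's syntax is based on a subset of it.

JSON is the usual data format for REST services.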

Summary

This article gave a short introduction to regular expressions and the JSON data exchange format in Python. I hope it helps.