1 Select specified rows of data
data = df[2:5]  # Here [2:5] selects rows 3 through 5. Positions start at 0 (the first row of the data), and the end position 5 is excluded.
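As a rough illustration of the zero-based slice described above, the sketch below builds a small hypothetical DataFrame (the column name `value` and its contents are made up for this example) and selects rows 3 through 5.

```python
import pandas as pd

# Hypothetical sample data: six rows with the default integer index 0..5
df = pd.DataFrame({'value': [10, 20, 30, 40, 50, 60]})

# Positional slice: start at position 2 (the third row), stop before position 5
data = df[2:5]

# iloc states the positional intent explicitly and selects the same rows
same = df.iloc[2:5]

print(data)
```

Using `iloc` is arguably the safer habit, since it always slices by position regardless of what the index labels look like.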
2 Select all records where a column has a given value
data = df[df['Column name 1'] == 'column value 1']
# When matching multiple conditions
data_many = df[(df['Column name 1'] == 'column value 1') & (df['Column name 2'] == 'column value 2')]
# When matching multiple values, use isin(); Python's `in` operator does not work element-wise on a Series
data_many = df[df['Column name 1'].isin(['value 1', 'value 2', ...])]
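The three kinds of match above can be tried on a small table. This is only a sketch: the `city` and `year` columns and their values are hypothetical, and `isin()` is used for the multiple-value case because it compares element by element.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    'city': ['Paris', 'Rome', 'Paris', 'Oslo'],
    'year': [2020, 2020, 2021, 2021],
})

# Single condition
paris = df[df['city'] == 'Paris']

# Two conditions combined with &; each comparison needs its own parentheses
paris_2021 = df[(df['city'] == 'Paris') & (df['year'] == 2021)]

# Several allowed values: isin() tests each element against the list
paris_or_oslo = df[df['city'].isin(['Paris', 'Oslo'])]

print(paris_or_oslo)
```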
3 Pattern Matching
# Pattern matching with a value at the beginning
cond = df['Column name'].str.startswith('Value')
# Pattern matching with a value in the middle
cond = df['Column name'].str.contains('Value')
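One way to do this kind of matching is through the `Series.str` accessor. The sketch below assumes a hypothetical `name` column; `na=False` is passed so that missing values count as non-matches, and `regex=False` so the pattern is treated as a literal substring.

```python
import pandas as pd

# Hypothetical sample data, including one missing value
df = pd.DataFrame({'name': ['apple pie', 'pineapple', 'banana', None]})

# Rows whose value begins with 'apple'
starts = df[df['name'].str.startswith('apple', na=False)]

# Rows whose value contains 'apple' anywhere (literal match, not a regex)
contains = df[df['name'].str.contains('apple', na=False, regex=False)]

print(starts)
print(contains)
```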
4 Filtering by value range
# Filter the data to values strictly between two bounds:
cond = df[(df['Column name 1'] > 'column value 1') & (df['Column name 1'] < 'column value 2')]
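A small sketch of range filtering, using a hypothetical numeric `score` column. It also shows `Series.between()`, which by default includes both endpoints, unlike the strict comparison.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({'score': [55, 60, 75, 90, 95]})

# Strictly between the bounds: both endpoints are excluded
strict = df[(df['score'] > 60) & (df['score'] < 90)]

# between() includes both endpoints by default
inclusive = df[df['score'].between(60, 90)]

print(strict)
print(inclusive)
```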
5 Getting a value in a row or column
print(ridership_df.loc['05-05-11', 'R003'])
# Or
print(ridership_df.iloc[4, 0])
# Result: 1608
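The difference between label-based and position-based lookup can be sketched on a tiny table. The data here is invented and is not the `ridership_df` used above; only the style of the row and column labels is borrowed.

```python
import pandas as pd

# Hypothetical data with string row labels and column labels
df = pd.DataFrame(
    {'R003': [0, 5, 9], 'R004': [1, 6, 8]},
    index=['05-01-11', '05-02-11', '05-03-11'],
)

# loc looks a value up by row label and column label
by_label = df.loc['05-02-11', 'R004']

# iloc looks the same cell up by zero-based row and column position
by_position = df.iloc[1, 1]

print(by_label, by_position)
```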
6 Getting the raw numpy two-dimensional array
print(df.values)  # .values returns the DataFrame's underlying two-dimensional NumPy array
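A minimal sketch of extracting the underlying array, assuming a small hypothetical DataFrame. Both the `values` attribute and the `to_numpy()` method return a two-dimensional NumPy array.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# Both calls give a 2-D NumPy array with one row per DataFrame row
arr = df.to_numpy()
arr_values = df.values

print(arr)
```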
7 Get the position of a row element according to the condition
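import pandas as pd
df = pd.DataFrame({'BoolCol': [1, 2, 3, 3, 4], 'attr': [22, 33, 22, 44, 66]}, index=[10, 20, 30, 40, 50])
print(df)
a = df[(df.BoolCol == 3) & (df.attr == 22)].index.tolist()  # list of matching index labels
b = df[(df.BoolCol == 3) & (df.attr == 22)].index[0]  # first matching index label
c = df[(df.BoolCol == 3) & (df.attr == 22)].index.values  # matching index labels as a NumPy array
print(a)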
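Index labels and row positions are not the same thing when the index is not 0, 1, 2, and so on. The sketch below reuses the sample data from this section and, as one possible approach, uses `numpy.flatnonzero` to recover zero-based positions; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'BoolCol': [1, 2, 3, 3, 4], 'attr': [22, 33, 22, 44, 66]},
    index=[10, 20, 30, 40, 50],
)

# Boolean mask that is True only for rows meeting both conditions
mask = (df['BoolCol'] == 3) & (df['attr'] == 22)

# Index labels of the matching rows
labels = df.index[mask].tolist()

# Zero-based row positions of the matching rows
positions = np.flatnonzero(mask.to_numpy()).tolist()

print(labels, positions)
```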
8 Element Location Filtering
print(date_frame)  # Print the full DataFrame
print(date_frame.shape)  # Get the (rows, columns) tuple of df
print(date_frame.head(2))  # The first 2 rows
print(date_frame.tail(2))  # The last 2 rows
print(date_frame.index.tolist())  # Get only the list of index labels of df
print(date_frame.columns.tolist())  # Get only the list of column names of df
print(date_frame.values.tolist())  # Get only the list of all values of df (a two-dimensional list)
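To see what each of these accessors returns, here is a short sketch on a hypothetical three-row DataFrame (the labels and values are made up for the example).

```python
import pandas as pd

# Hypothetical sample data with string index labels
df = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]}, index=['a', 'b', 'c'])

shape = df.shape                      # (number of rows, number of columns)
first_two = df.head(2)                # first 2 rows
last_two = df.tail(2)                 # last 2 rows
index_labels = df.index.tolist()      # index labels as a plain list
column_names = df.columns.tolist()    # column names as a plain list
rows = df.values.tolist()             # all values as a list of row lists

print(shape, index_labels, column_names, rows)
```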
9 Delete multiple rows/columns
# This approach applies when the index and columns of the DataFrame are numeric. It uses the drop() and range() functions.
DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
# axis=0 deletes rows; axis=1 deletes columns.
# To delete multiple rows or columns, pass a range. For example, to delete the first 3 rows: drop(range(0, 3), axis=0). axis defaults to 0, so it can be omitted.
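A minimal sketch of dropping by numeric labels, assuming a hypothetical DataFrame built without explicit labels so that both the index and the columns are integers starting at 0.

```python
import pandas as pd

# Hypothetical data: default integer index 0..3 and integer columns 0..2
df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

# Delete the first 3 rows (labels 0, 1, 2); axis=0 is the default
without_first_rows = df.drop(range(0, 3), axis=0)

# Delete the first 2 columns (labels 0 and 1) via the columns keyword
without_first_cols = df.drop(columns=list(range(0, 2)))

print(without_first_rows)
print(without_first_cols)
```

Because `inplace` defaults to `False`, both calls return new DataFrames and leave `df` unchanged.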
10 to_datetime converts string format to date format
import datetime
import pandas as pd
dictDate = {'date': ['2019-11-01 19:30', '2019-11-30 19:00']}
df = pd.DataFrame(dictDate)
df['datetime'] = pd.to_datetime(df['date'])  # parse the strings into datetime values
df['today'] = df['datetime'].apply(lambda x: x.strftime('%Y%m%d'))
df['tomorrow'] = (df['datetime'] + datetime.timedelta(days=1)).dt.strftime('%Y%m%d')
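Real data often contains strings that are not valid dates. The sketch below is a variation on the example above and rests on assumptions of its own: an explicit `format`, `errors='coerce'` so that unparseable entries become `NaT` instead of raising, and an invented third row.

```python
import pandas as pd

# The two dates from the example plus one hypothetical invalid entry
df = pd.DataFrame({'date': ['2019-11-01 19:30', '2019-11-30 19:00', 'not a date']})

# Parse with an explicit format; invalid strings become NaT
df['datetime'] = pd.to_datetime(df['date'], format='%Y-%m-%d %H:%M', errors='coerce')

# Add one day, then format back to a compact string
df['tomorrow'] = (df['datetime'] + pd.Timedelta(days=1)).dt.strftime('%Y%m%d')

print(df)
```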
11 apply() function
# The pandas apply() function can be applied to a Series or to an entire DataFrame. It iterates automatically: on a Series it runs the given function on each element, and on a DataFrame it runs it on each column (or on each row with axis=1).
def add_extra(nationality, extra):
    if nationality != "Han":
        return extra
    else:
        return 0

# The calls below assume df has a column named 'nationality'.
df['ExtraScore'] = df['nationality'].apply(add_extra, args=(5,))  # extra passed positionally
df['ExtraScore'] = df['nationality'].apply(add_extra, extra=5)  # extra passed by keyword
df['Extra'] = df['nationality'].apply(lambda n, extra: extra if n == 'Han' else 0, args=(5,))

def add_extra2(nationality, **kwargs):
    return kwargs[nationality]

df['Extra'] = df['nationality'].apply(add_extra2, Han=0, Hui=10, Tibetan=5)
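The two ways of passing the extra argument can be checked on a small Series. This is only a sketch; the `nationality` column name and its values are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({'nationality': ['Han', 'Hui', 'Han']})

def add_extra(nationality, extra):
    # Give the bonus to every value other than 'Han'
    return extra if nationality != 'Han' else 0

# Extra positional arguments go in the args tuple
df['by_args'] = df['nationality'].apply(add_extra, args=(5,))

# Extra keyword arguments are forwarded to the function as-is
df['by_kwargs'] = df['nationality'].apply(add_extra, extra=5)

print(df)
```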
12 map() function
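import datetime
import pandas as pd

def f(x):
    # Convert a YYYYMMDD value to YYYY-MM-DD; leave missing values unchanged
    x = str(x)[:8]
    if x != 'nan':
        gf = datetime.datetime.strptime(x, "%Y%m%d")
        x = gf.strftime("%Y-%m-%d")
    return x

def f2(x):
    # Convert a YYYY/MM/DD value to YYYY-MM-DD; leave blanks and missing values unchanged
    if str(x) not in [' ', 'nan']:
        dd = datetime.datetime.strptime(str(x), "%Y/%m/%d")
        x = dd.strftime("%Y-%m-%d")
    return x

def test():
    df = pd.DataFrame()
    df1 = pd.read_csv("600694_gf.csv")
    df2 = pd.read_csv("")
    df['date1'] = df2['DateTime'].map(f2)
    df['date2'] = df1['date'].map(f)
    df.to_csv('')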
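Since the example above depends on CSV files, here is a file-free sketch of `Series.map()`. The helper name `to_iso` and the sample values are hypothetical. It also shows that `map()` accepts a dictionary, in which case values without a matching key become `NaN`.

```python
from datetime import datetime

import pandas as pd

def to_iso(x):
    # Convert a YYYYMMDD string to YYYY-MM-DD
    return datetime.strptime(x, '%Y%m%d').strftime('%Y-%m-%d')

# map() with a function: applied to each element of the Series
dates = pd.Series(['20191101', '20191130'])
iso = dates.map(to_iso)

# map() with a dict: each value is looked up; missing keys give NaN
codes = pd.Series(['a', 'b', 'z']).map({'a': 1, 'b': 2})

print(iso)
print(codes)
```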