Common data types for Pandas
The pandas extension library provides the following commonly used data structures:
(1) Series: a one-dimensional labeled array
(2) DatetimeIndex: a time series index
(3) DataFrame: a two-dimensional labeled table whose size can change
(4) Panel: a three-dimensional labeled array whose size can change (removed in pandas 0.25 and later)
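As a quick orientation, the first three structures can be created as in the minimal sketch below. The variable names and sample values are illustrative assumptions and do not come from the original text; Panel is left out because recent pandas releases no longer include it.

```python
import pandas as pd

# Series: one-dimensional labeled array
scores = pd.Series([90, 92, 98], index=['Chinese', 'math', 'Python'])

# DatetimeIndex: a sequence of timestamps, here three consecutive days
days = pd.date_range(start='2024-01-01', periods=3, freq='D')

# DataFrame: two-dimensional labeled table, indexed here by the dates above
table = pd.DataFrame({'visits': [10, 12, 9], 'orders': [1, 3, 2]}, index=days)

print(type(scores).__name__, type(days).__name__, type(table).__name__)
print(table.shape)
```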
1. One-dimensional arrays and common operations
A Series consists of two parts, an index and values, and behaves much like a dictionary.
The values may be of different types. If no index is specified at creation time, pandas automatically uses non-negative integers starting from 0 as the index.
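The two points above (mixed value types and the automatic integer index) can be checked with a short sketch. The sample values and the labels 'a', 'b', 'c' are made up for illustration.

```python
import pandas as pd

# No index given: pandas assigns 0, 1, 2, ...
auto = pd.Series([3.5, 'text', True])
print(list(auto.index))   # [0, 1, 2]
print(auto.dtype)         # object, because the values have mixed types

# Explicit index labels supplied at creation time
labelled = pd.Series([3.5, 'text', True], index=['a', 'b', 'c'])
print(labelled['b'])      # text
```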
import pandas as pd
import matplotlib.pyplot as plt

# Align output columns when labels contain East Asian characters
pd.set_option('display.unicode.ambiguous_as_wide', True)
pd.set_option('display.unicode.east_asian_width', True)

# Automatically create a non-negative integer index starting from 0
s1 = pd.Series(range(1, 20, 5))
# Create a Series from a dictionary, using the dictionary keys as the index
s2 = pd.Series({'Chinese': 90, 'math': 92, 'Python': 98, 'physics': 87, 'Chemical': 92})

# Modify the value at the specified index
s1[3] = -17
s2['Chinese'] = 94

print('s1 raw data'.ljust(20, '='))
print(s1, '\n')
print('Absolute values of all data in s1'.ljust(20, '='))
print(abs(s1), '\n')
print('Prefix every index label of s1 with 2'.ljust(20, '='))
print(s1.add_prefix('2'), '\n')
print('s2 raw data'.ljust(20, '='))
print(s2, '\n')
print('Histogram of s2 data'.ljust(20, '='))
s2.hist()
plt.show()
print('Append _Zhang San to every index label of s2'.ljust(20, '='))
print(s2.add_suffix('_Zhang San'), '\n')
print('Index of the maximum value in s2'.ljust(20, '='))
print(s2.idxmax(), '\n')
print('Test whether the values of s2 lie within the specified interval'.ljust(20, '='))
# Older pandas versions accepted inclusive=True; newer ones expect 'both'
print(s2.between(90, 94, inclusive='both'), '\n')
print('Values in s2 greater than 90'.ljust(20, '='))
print(s2[s2 > 90], '\n')
print('Values in s2 greater than the median'.ljust(20, '='))
print(s2[s2 > s2.median()], '\n')
print('Arithmetic between s2 and numbers'.ljust(20, '='))
print(round((s2**0.5) * 10, 1), '\n')
print('The smallest 2 values in s2'.ljust(20, '='))
print(s2.nsmallest(2), '\n')

# The four arithmetic operations and exponentiation can be applied to two Series objects.
# Only index labels present in both Series produce a computed value;
# labels that appear in just one of them yield a null value (NaN).
print('Add two Series objects'.ljust(20, '='))
print(pd.Series(range(5)) + pd.Series(range(5, 10)), '\n')

# The pipe() method supports chained function calls
print('Add 3 to each value, then multiply by 3'.ljust(20, '='))
print(pd.Series(range(5)).pipe(lambda x: x + 3).pipe(lambda x: x * 3), '\n')
print('Remainder of each squared value divided by 5'.ljust(20, '='))
print(pd.Series(range(5)).pipe(lambda x, y, z: (x**y) % z, 2, 5), '\n')

# The apply() method applies a function to each value of the Series
print('Add 3 to each value'.ljust(20, '='))
print(pd.Series(range(5)).apply(lambda x: x + 3), '\n')
print('Sample standard deviation, sample variance, standard error of the mean'.ljust(20, '='))
print(pd.Series(range(5)).std(), '\n')
print(pd.Series(range(5)).var(), '\n')
print(pd.Series(range(5)).sem(), '\n')
print('Check whether any value is equivalent to True'.ljust(20, '='))
print(any(pd.Series([3, 0, True])), '\n')
print('Check whether all values are equivalent to True'.ljust(20, '='))
print(all(pd.Series([3, 0, True])))
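The addition in the listing uses two Series that share the same index, so the null-value behavior for non-common labels never shows up. The following minimal sketch, with made-up labels and values, illustrates it; the use of Series.add() with fill_value is one possible way to avoid the null values, not something the original listing does.

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=['x', 'y', 'z'])
b = pd.Series([10, 20, 30], index=['y', 'z', 'w'])

# Values are matched by index label, not by position
total = a + b
print(total)

# Labels found in only one operand ('w' and 'x') become NaN
print(total.isna())

# fill_value=0 treats the missing operand as 0 instead of producing NaN
print(a.add(b, fill_value=0))
```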
2. Time series and common operations
Use pandas' date_range() function to generate a time series object:
date_range(start=None,end=None,periods=None,freq='D',tz=None,normalize=False,name=None,closed=None,**kwargs)
- (1) start and end specify the start and end date/time
- (2) periods specifies the number of values to generate
- (3) freq specifies the time interval; the default is 'D', meaning adjacent dates are one day apart.
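The three parameters above can be combined as in the sketch below. The dates and the '7D' frequency are arbitrary values chosen for illustration.

```python
import pandas as pd

# start + end: every day from the start date through the end date
by_range = pd.date_range(start='2024-01-01', end='2024-01-05')
print(len(by_range))            # 5

# start + periods: a fixed number of values, one day apart by default
by_count = pd.date_range(start='2024-01-01', periods=3)
print(by_count[-1])             # 2024-01-03 00:00:00

# freq changes the step between adjacent values, here seven days
weekly = pd.date_range(start='2024-01-01', periods=3, freq='7D')
print(list(weekly.strftime('%Y-%m-%d')))
```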