SoFunction
Updated on 2024-12-20

How to select data using Pandas queries

First, the ways Pandas can query data

  1. df[] selects by row or by column; only rows or only columns can be selected in a single call.
  2. The df.loc method, which queries rows and columns by their label values
  3. The df.iloc method, which queries rows and columns by their integer position
  4. The df.query method

Second, the ways to use the Pandas query methods

  1. Querying data with a single label value
  2. Batch query using a list of values
  3. Range queries using numeric intervals
  4. Querying with Conditional Expressions
  5. Querying by calling a function

Note

The query methods above apply to both rows and columns.
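To illustrate the note above, here is a minimal sketch that applies the same query styles to the column axis with df.loc. The integer data built with np.arange is an assumption made for reproducibility; the article itself uses random values.

```python
import numpy as np
import pandas as pd

# Assumption: np.arange data replaces the article's random values
# so that the results are reproducible.
df = pd.DataFrame(
    np.arange(25).reshape(5, 5),
    index=list("ABCDE"),
    columns=["c1", "c2", "c3", "c4", "c5"],
)

# Range of column labels (both endpoints are included).
col_range = df.loc[:, "c2":"c4"]

# List of column labels.
col_list = df.loc[:, ["c1", "c5"]]

# Conditional expression: keep the columns whose value in row 'A' is even.
col_mask = df.loc[:, df.loc["A"] % 2 == 0]

# Function: called with the DataFrame, returns a boolean array over columns.
col_func = df.loc[:, lambda d: d.columns.isin(["c1", "c3"])]

print(col_range.columns.tolist())
print(col_list.columns.tolist())
print(col_mask.columns.tolist())
print(col_func.columns.tolist())
```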

########################################## 

 df[]

>>> import numpy as np
>>> import pandas as pd
>>> df=pd.DataFrame(np.random.rand(25).reshape([5,5]),index=['A','B','C','D','E'],columns=['c1','c2','c3','c4','c5'])
>>> df
         c1        c2        c3        c4        c5
A  0.499404  0.082137  0.472568  0.649200  0.121681
B  0.564688  0.102398  0.374904  0.091373  0.495510
C  0.319272  0.720225  0.979103  0.910206  0.766642
D  0.478346  0.311616  0.466326  0.045612  0.258015
E  0.421653  0.577140  0.103048  0.235219  0.550336

##########################################  

# Get the c1 and c2 columns

df[['c1','c2']]

>>> df[['c1','c2']]
         c1        c2
A  0.499404  0.082137
B  0.564688  0.102398
C  0.319272  0.720225
D  0.478346  0.311616
E  0.421653  0.577140

##########################################  

# Get the c1 column

df.c1

>>> df.c1
A    0.499404
B    0.564688
C    0.319272
D    0.478346
E    0.421653
Name: c1, dtype: float64

##########################################  

# Get the rows whose index labels run from A through C

df['A':'C']

>>> df['A':'C']
         c1        c2        c3        c4        c5
A  0.499404  0.082137  0.472568  0.649200  0.121681
B  0.564688  0.102398  0.374904  0.091373  0.495510
C  0.319272  0.720225  0.979103  0.910206  0.766642

##########################################  

# Get the 2nd and 3rd rows (positions 1 and 2)

df[1:3]

>>> df[1:3]
         c1        c2        c3        c4        c5
B  0.564688  0.102398  0.374904  0.091373  0.495510
C  0.319272  0.720225  0.979103  0.910206  0.766642
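The df[] forms shown above behave differently depending on what goes between the brackets. The following sketch summarizes this, assuming reproducible np.arange data in place of the article's random values.

```python
import numpy as np
import pandas as pd

# Assumption: np.arange data stands in for the article's random values.
df = pd.DataFrame(
    np.arange(25).reshape(5, 5),
    index=list("ABCDE"),
    columns=["c1", "c2", "c3", "c4", "c5"],
)

one_col = df["c1"]           # a single label selects a column (Series)
two_cols = df[["c1", "c2"]]  # a list of labels selects columns (DataFrame)
rows_by_label = df["A":"C"]  # a label slice selects rows, end label included
rows_by_pos = df[1:3]        # an integer slice selects rows, end position excluded

print(type(one_col).__name__, type(two_cols).__name__)
print(rows_by_label.index.tolist())
print(rows_by_pos.index.tolist())
```

Because df[] handles only one axis per call, selecting rows and columns together is left to df.loc and df.iloc.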

##########################################  

Querying with the df.loc method

1. Range queries using intervals

This works much like slicing a list, except that a label slice includes both endpoints.

>>> df.loc['A':'D',:]
         c1        c2        c3        c4        c5
A  0.499404  0.082137  0.472568  0.649200  0.121681
B  0.564688  0.102398  0.374904  0.091373  0.495510
C  0.319272  0.720225  0.979103  0.910206  0.766642
D  0.478346  0.311616  0.466326  0.045612  0.258015
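The range query can be applied to rows and columns at once, and either end of a slice can be left open. This is a small sketch under the assumption of np.arange data rather than the article's random values.

```python
import numpy as np
import pandas as pd

# Assumption: np.arange data stands in for the article's random values.
df = pd.DataFrame(
    np.arange(25).reshape(5, 5),
    index=list("ABCDE"),
    columns=["c1", "c2", "c3", "c4", "c5"],
)

# Rows B through D and columns c2 through c4, endpoints included.
block = df.loc["B":"D", "c2":"c4"]

# Open-ended slices: from row C to the end, and from the start to row B.
tail = df.loc["C":, :]
head_c1 = df.loc[:"B", "c1"]

print(block)
print(tail.index.tolist())
print(head_c1.tolist())
```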

##########################################  

2. Querying with a single label value

This is similar to looking up a point by its coordinates: a row label and a column label.

>>> df.loc['A','c2']
0.08213716245372071
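What a single-label query returns depends on how many axes are pinned down. The sketch below shows the three cases, assuming np.arange data in place of the article's random values.

```python
import numpy as np
import pandas as pd

# Assumption: np.arange data stands in for the article's random values.
df = pd.DataFrame(
    np.arange(25).reshape(5, 5),
    index=list("ABCDE"),
    columns=["c1", "c2", "c3", "c4", "c5"],
)

value = df.loc["A", "c2"]  # row label and column label: a single value
row = df.loc["A"]          # row label only: a Series indexed by column names
col = df.loc[:, "c2"]      # column label only: a Series indexed by row labels

print(value)
print(row.index.tolist())
print(col.tolist())
```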

##########################################  

3. Batch queries using lists of values

>>> df.loc[['A','B','D'],['c1','c3']]
         c1        c3
A  0.499404  0.472568
B  0.564688  0.374904
D  0.478346  0.466326
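Two details of list queries are worth a quick sketch: the result follows the order of the lists, and a label that does not exist raises KeyError in current pandas versions. The np.arange data and the label 'Z' are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Assumption: np.arange data stands in for the article's random values.
df = pd.DataFrame(
    np.arange(25).reshape(5, 5),
    index=list("ABCDE"),
    columns=["c1", "c2", "c3", "c4", "c5"],
)

# The result keeps the order given in the lists, not the order in df.
picked = df.loc[["D", "A"], ["c3", "c1"]]
print(picked)

# 'Z' is a hypothetical label that is not in the index.
try:
    df.loc[["A", "Z"], :]
    missing_label_raised = False
except KeyError:
    missing_label_raised = True
print(missing_label_raised)
```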

##########################################  

4. Querying with conditional expressions

>>> df.loc[df['c2']>0.5,:]
         c1        c2        c3        c4        c5
C  0.319272  0.720225  0.979103  0.910206  0.766642
E  0.421653  0.577140  0.103048  0.235219  0.550336
>>> df[(df['c2']>0.2) & (df['c3'] < 0.8)]
         c1        c2        c3        c4        c5
D  0.478346  0.311616  0.466326  0.045612  0.258015
E  0.421653  0.577140  0.103048  0.235219  0.550336
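A conditional expression produces a boolean Series, and df.loc keeps the rows where it is True. The sketch below combines conditions with &, | and ~; each comparison needs its own parentheses because these operators bind more tightly than > and <. The np.arange data is an assumption for reproducibility.

```python
import numpy as np
import pandas as pd

# Assumption: np.arange data stands in for the article's random values.
df = pd.DataFrame(
    np.arange(25).reshape(5, 5),
    index=list("ABCDE"),
    columns=["c1", "c2", "c3", "c4", "c5"],
)

# AND: both conditions must hold.
mask = (df["c2"] > 5) & (df["c3"] < 20)
both = df.loc[mask, :]

# OR: at least one condition must hold.
either = df.loc[(df["c1"] < 5) | (df["c5"] > 20), :]

# NOT: invert an existing mask.
negated = df.loc[~mask, :]

print(mask.tolist())
print(both.index.tolist())
print(either.index.tolist())
print(negated.index.tolist())
```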

##########################################  

5. Querying by calling a function

def query_my_data(df):
    # df.loc calls this function with the DataFrame and uses the returned boolean Series
    return ((df['c3']>0.2) & (df["c4"]<0.8))

df.loc[query_my_data, :]
         c1        c2        c3        c4        c5
A  0.499404  0.082137  0.472568  0.649200  0.121681
B  0.564688  0.102398  0.374904  0.091373  0.495510
D  0.478346  0.311616  0.466326  0.045612  0.258015
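Passing a function is most useful when the frame being filtered has no name yet, for example in the middle of a method chain. The sketch below is one possible illustration; the np.arange data, the helper name c3_large_c4_small, and the column name total are all assumptions.

```python
import numpy as np
import pandas as pd

# Assumption: np.arange data stands in for the article's random values.
df = pd.DataFrame(
    np.arange(25).reshape(5, 5),
    index=list("ABCDE"),
    columns=["c1", "c2", "c3", "c4", "c5"],
)

def c3_large_c4_small(frame):
    # Hypothetical helper: returns a boolean Series used as a row mask.
    return (frame["c3"] > 5) & (frame["c4"] < 15)

by_def = df.loc[c3_large_c4_small, :]

# A lambda works the same way and can be combined with a column list.
by_lambda = df.loc[lambda d: d["c1"] >= 10, ["c1", "c2"]]

# In a chain, the lambda receives the intermediate frame that has 'total'.
chained = df.assign(total=df.sum(axis=1)).loc[lambda d: d["total"] > 50]

print(by_def.index.tolist())
print(by_lambda.index.tolist())
print(chained["total"].tolist())
```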
 

##########################################  

Querying with the df.iloc method

df.iloc is used in the same way, but it locates data by integer position instead of by label.

# Extract rows 2-3 and columns 1-2

df.iloc[1:3,0:2]

>>> df.iloc[1:3,0:2]
         c1        c2
B  0.564688  0.102398
C  0.319272  0.720225

##########################################  

# Extract the second and third rows of the fourth column

df.iloc[[1,2],[3]]

         c4
B  0.091373
C  0.910206

##########################################  

# Extract a single value at a specified location

df.iloc[3,4]

>>> df.iloc[3,4]
0.2580148841605816
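The three df.iloc forms above can be checked side by side with df.loc. In this sketch, positions start at 0, slices exclude the end position, and negative positions count from the end. The np.arange data is an assumption for reproducibility.

```python
import numpy as np
import pandas as pd

# Assumption: np.arange data stands in for the article's random values.
df = pd.DataFrame(
    np.arange(25).reshape(5, 5),
    index=list("ABCDE"),
    columns=["c1", "c2", "c3", "c4", "c5"],
)

block = df.iloc[1:3, 0:2]      # rows at positions 1-2, columns at positions 0-1
picked = df.iloc[[1, 2], [3]]  # lists of positions give a DataFrame
single = df.iloc[3, 4]         # two integers give a single value
last_row = df.iloc[-1]         # negative positions count from the end

# The same cell addressed by label with df.loc.
same_cell = df.loc["D", "c5"]

print(block.values.tolist())
print(picked.values.tolist())
print(single, same_cell)
print(last_row.name)
```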

Summary

This concludes the introduction to selecting data with Pandas queries.