Using Rows in Python to quickly manipulate CSV files

Rows is a third-party Python module dedicated to manipulating tables.

Once a CSV file is read through Rows, it becomes a Python object that you can iterate over and compute with.

Compared with pandas' pd.read_csv, I think Rows' advantages lie in its easy-to-understand computational syntax and its convenient export and conversion features. It can extract text from PDFs, convert CSV files into SQLite files, merge CSV files, and even run SQL queries on CSV files, which makes it quite powerful.

Of course, it is nowhere near as influential as pandas, but it is still worth learning: an extra skill never hurts.

1. Prepare

Before you start, make sure that Python and pip are installed on your computer. If not, follow this article to install them: Extremely Detailed Python Installation Guide.

(Optional 1) If you use Python for data analysis, you can install Anaconda directly; it ships with Python and pip built in.

(Optional 2) In addition, the VSCode editor is recommended; it has many advantages.

Open a terminal in any of the following ways, then enter the command below to install the dependency:

1. On Windows, open Cmd (Start > Run > CMD).

2. On macOS, open Terminal (press Command+Space and type Terminal).

3. If you use the VSCode editor or PyCharm, you can use the Terminal panel at the bottom of the window.

pip install rows

2. Basic use

The following small example shows the basic usage of Rows.

Suppose we have CSV data like this:

state,city,inhabitants,area
AC,Acrelândia,12538,1807.92
AC,Assis Brasil,6072,4974.18
AC,Brasiléia,21398,3916.5
AC,Bujari,8471,3034.87
AC,Capixaba,8798,1702.58
[...]
RJ,Angra dos Reis,169511,825.09
RJ,Aperibé,10213,94.64
RJ,Araruama,112008,638.02
RJ,Areal,11423,110.92
RJ,Armação dos Búzios,27560,70.28
[...]

If we want to find cities whose state is RJ and whose population is larger than 500,000, we just need to do this:

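import rows

# Each CSV line becomes a row object whose attributes are the column names
cities = rows.import_from_csv("data/")
rio_biggest_cities = [
    city for city in cities
    if city.state == "RJ" and city.inhabitants > 500000
]
for city in rio_biggest_cities:
    # inhabitants and area are already numeric, so they can be divided directly
    density = city.inhabitants / city.area
    print(f"{city.city} ({density:5.2f} ppl/km²)")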

This closely resembles pandas, but the syntax is simpler and the module as a whole is lighter.
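For comparison, here is a minimal sketch of the same filter written with only the standard library's csv module, to show the manual type conversion that typed row objects spare you. It does not use Rows. The sample rows are copied from the table above, while the helper name big_cities and the 100,000 threshold are assumptions chosen so that the small sample produces matches.

```python
import csv
import io

# Sample rows copied from the table above; a real script would open the CSV file.
SAMPLE_CSV = """state,city,inhabitants,area
AC,Acrelândia,12538,1807.92
RJ,Angra dos Reis,169511,825.09
RJ,Aperibé,10213,94.64
RJ,Araruama,112008,638.02
"""

MIN_INHABITANTS = 100_000  # assumed threshold so the small sample yields matches


def big_cities(csv_text, state, min_inhabitants):
    """Return (city, density) pairs for one state, converting types by hand."""
    result = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        # The csv module yields every value as a string
        inhabitants = int(record["inhabitants"])
        area = float(record["area"])
        if record["state"] == state and inhabitants > min_inhabitants:
            result.append((record["city"], inhabitants / area))
    return result


rj_big_cities = big_cities(SAMPLE_CSV, "RJ", MIN_INHABITANTS)
for name, density in rj_big_cities:
    print(f"{name} ({density:5.2f} ppl/km²)")
```

Every numeric column needs an explicit int() or float() call here, which is the bookkeeping the Rows version above avoids.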

If you want to create a new "table" yourself, you can write it like this:

from collections import OrderedDict
from rows import fields, Table


# Map each column name to its field type
country_fields = OrderedDict([
    ("name", fields.TextField),
    ("population", fields.IntegerField),
])

countries = Table(fields=country_fields)
countries.append({"name": "Argentina", "population": "45101781"})
countries.append({"name": "Brazil", "population": "212392717"})
countries.append({"name": "Colombia", "population": "49849818"})
countries.append({"name": "Ecuador", "population": "17100444"})
countries.append({"name": "Peru", "population": "32933835"})

Then you can iterate over it:

for country in countries:
    print(country)
# Result:
# Row(name='Argentina', population=45101781)
# Row(name='Brazil', population=212392717)
# Row(name='Colombia', population=49849818)
# Row(name='Ecuador', population=17100444)
# Row(name='Peru', population=32933835)
# "Row" is a namedtuple created from `country_fields`
 
# We added population as a string, but the library automatically converted it
# to an integer, so we can also sum it:
countries_population = sum(country.population for country in countries)
print(countries_population) # prints 357378595

You can also export this table to CSV or any other supported format:

import rows

rows.export_to_csv(countries, "")

# HTML
rows.export_to_html(countries, "")

Import a list of dictionaries into a Rows table:

import rows
 
data = [
    {"name": "Argentina", "population": "45101781"},
    {"name": "Brazil", "population": "212392717"},
    {"name": "Colombia", "population": "49849818"},
    {"name": "Ecuador", "population": "17100444"},
    {"name": "Peru", "population": "32933835"},
    {"name": "Guyana", }, # Missing "population", will fill with `None`
]
table = rows.import_from_dicts(data)
print(table[-1]) # Can use indexes
# Result:
# Row(name='Guyana', population=None)
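To make the None-filling behavior concrete, the sketch below reproduces the idea with only the standard library: it collects the field names from all dictionaries, builds a namedtuple, and lets dict.get supply None for absent keys. This is an illustration of the concept, not how Rows is implemented, and unlike Rows it performs no type conversion.

```python
from collections import namedtuple

data = [
    {"name": "Peru", "population": "32933835"},
    {"name": "Guyana"},  # no "population" key
]

# Collect field names in first-seen order across all dictionaries
field_names = []
for item in data:
    for key in item:
        if key not in field_names:
            field_names.append(key)

Row = namedtuple("Row", field_names)

# dict.get returns None for absent keys, which fills the gaps
table = [Row(*(item.get(name) for name in field_names)) for item in data]
print(table[-1])
# Result:
# Row(name='Guyana', population=None)
```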

3. Command line tools

In addition to writing Python code, you can use Rows' command line tools directly. Here are several that you are likely to use often.

Extract the text from a PDF file and save it to a file:

# Need to install in advance: pip install rows[pdf]
URL="/app/_edicoes/2018/01/"
rows pdf-to-text $URL  # Save to file, show progress bar
rows pdf-to-text --quiet $URL  # Save to file, no progress bar
rows pdf-to-text --pages=1,2,3 $URL # Output three pages to the terminal
rows pdf-to-text --pages=1-3 $URL # Output three pages to the terminal (using the - range character)

Convert a CSV file to SQLite:

rows csv2sqlite \
     --dialect=excel \
     --input-encoding=latin1 \
       \

Merge multiple CSV files:

rows csv-merge \
      .bz2  \

Run an SQL query on a CSV file:

# needs: pip install rows[html]
rows query \
    "SELECT * FROM table1 WHERE inhabitants > 1000000" \
    data/ \
    --output=data/
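The idea behind querying a CSV file with SQL can be sketched in Python using only the standard library: load the rows into an in-memory SQLite table, then run a SELECT against it. This is a conceptual sketch, not the Rows command itself, and how Rows handles the query internally is not covered in this article. The table name table1 comes from the command above; the helper name query_csv, the hard-coded column types, and the lower threshold of 100,000 are assumptions made for the small sample.

```python
import csv
import io
import sqlite3

# Sample rows copied from the table in section 2
SAMPLE_CSV = """state,city,inhabitants,area
AC,Brasiléia,21398,3916.5
RJ,Angra dos Reis,169511,825.09
RJ,Araruama,112008,638.02
RJ,Areal,11423,110.92
"""


def query_csv(csv_text, sql):
    """Load CSV text into an in-memory table named table1 and run sql on it."""
    connection = sqlite3.connect(":memory:")
    # Column types are hard-coded for this sample (an assumption of the sketch);
    # SQLite's type affinity converts the CSV strings to numbers on insert
    connection.execute(
        "CREATE TABLE table1 (state TEXT, city TEXT, inhabitants INTEGER, area REAL)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    connection.executemany(
        "INSERT INTO table1 VALUES (:state, :city, :inhabitants, :area)",
        reader,
    )
    result = connection.execute(sql).fetchall()
    connection.close()
    return result


matches = query_csv(
    SAMPLE_CSV,
    "SELECT city, inhabitants FROM table1 "
    "WHERE inhabitants > 100000 ORDER BY inhabitants DESC",
)
for city, inhabitants in matches:
    print(city, inhabitants)
```

The command line tool saves you from writing this boilerplate and from declaring the column types yourself.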

For more features, please refer to the official Rows documentation:

/rows
