Preface
Let's talk about how to use it todayJS
Let's analyzeexcel
Files, of course, are not used directlyexceljs
、sheetjs
Such libraries are boring, but mainly talk about themJS
Analysisexcel
How tables are implemented.
Pay attention to the main discussion in this articlexlsx
Formatexcel
The table, other formats are not clearly explored.
What is an excel table file
First, parseexcel
Files, you have to first understand how it stores data. After my searches, I finally found itGG
The answer was found in:excel
The file is actually azip
Bag! So I quickly built a new onexlsx
Two new files were created in themsheet
Table, twosheet
The table data is as follows:
This issheet
1:
A | B | C |
---|---|---|
1 | 2 | |
1 | 2 | |
1 | 2 | |
1 | 2 |
This issheet
2:
A | B |
---|---|
q | a |
q | a |
q | a |
Then usezip
Decompress:
unzip -d test
Then passtree
We'll get a directory structure like this:
test
├── [Content_Types].xml
├── _rels
├── docProps
│ ├──
│ ├──
│ └──
└── xl
├── _rels
│ └──
├──
├──
├── theme
│ └──
├──
└── worksheets
├──
└──
Ahh, beautifully done, it's allxml
document.
We're openingxml
After exploring the details, you can see that there are several documents that are very conspicuous, that isworksheets
Nextand
,besides
, others
styles
、theme
It's obviously related to the style._rels
It feels like an internal reference. Let's take a look at two first.sheet
ofxml
File, see if the guess is correct, post it:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <worksheet xmlns="/spreadsheetml/2006/main" xmlns:r="/officeDocument/2006/relationships" xmlns:xdr="/drawingml/2006/spreadsheetDrawing" xmlns:x14="/office/spreadsheetml/2009/9/main" xmlns:mc="/markup-compatibility/2006" xmlns:etc="/officeDocument/2017/etCustomData"> <sheetPr/> <dimension ref="A1:C7"/> <sheetViews> <sheetView workbookViewId="0"> <selection activeCell="D5" sqref="A3:D5"/> </sheetView> </sheetViews> <sheetFormatPr defaultColWidth="9.23076923076923" defaultRowHeight="16.8" outlineLevelRow="6" outlineLevelCol="2"/> <sheetData> <row r="1" spans="1:3"> <c r="A1"> <v>1</v> </c> <c r="C1"> <v>2</v> </c> </row> <row r="2" spans="1:3"> <c r="A2"> <v>1</v> </c> <c r="C2"> <v>2</v> </c> </row> <row r="6" spans="1:3"> <c r="A6"> <v>1</v> </c> <c r="C6"> <v>2</v> </c> </row> <row r="7" spans="1:3"> <c r="A7"> <v>1</v> </c> <c r="C7"> <v>2</v> </c> </row> </sheetData> <pageMargins left="0.75" right="0.75" top="1" bottom="1" header="0.5" footer="0.5"/> <headerFooter/> </worksheet>
😂 I believe everyone has seen it.sheetData
that isexcel
The data in the table is here,<row>
Represents row,r
It is the index of the row number,row
In-house<c>
It should becell
It's one of them<v>
Correspondingcell
value inr
Yescell
location, such asA7
Represents inA
Column 7 rows.
There are also several obvious properties such asdimension
It can be seen that it is the size range of the table, fromA1 cell
arriveC7 cell
Form a box.<sheetViews>
The information stored in the page should be the information on the page.<selection>
What represents the selected form content.
andworkbook
The stored in thesheet
Information:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <workbook xmlns="/spreadsheetml/2006/main" xmlns:r="/officeDocument/2006/relationships"> <fileVersion appName="xl" lastEdited="3" lowestEdited="5" rupBuild="9302"/> <workbookPr/> <bookViews> <workbookView windowHeight="16360" activeTab="1"/> </bookViews> <sheets> <sheet name="Sheet1" sheetId="1" r:/> <sheet name="Sheet2" sheetId="2" r:/> </sheets> <calcPr calcId="144525"/> </workbook>
The remaining fewxml
, after a look, the stored information is quite clear, such as:
-
app
The file program information is stored, and it seems that the file name is also -
core
The author's information and creation and modification time are saved -
rels
So are the filesxml
Format, some other storagexml
Quotation -
theme
The colors and fonts defined in the table are stored -
[Content_Types]
There are references to all files, guesses are estimated to be parsed entry files
JS implementation steps
knewexcel
How do files store data, how do we use itjs
It will be clear to parse it, and it is mainly divided into three steps:
- use
js
Decompressexcel
document - Obtain the one in it
sheet
File contents, thenxml
Data parsed - Convert data to the shape we want
Let's do it right away, let's do it:
ZIP Decompression
aboutJS
How to implement itZIP
Unzipped, as mentioned in the previous article, we will not explain it in detail here, just use it directlyjszip
Get it done:
('#file').addEventListener('change', async e => { const file = [0]; if (!file) return; const zip = await (file); const sheetXML = await ['xl/worksheets/'].async('string'); });
Get it done quickly nowsheetXML
That's what we just sawThe data in it is already there.
XML parsing
Then we can analyzeXML
The content takes out the data,xml
The analysis principle is very simple, andhtml parse
Similarly, if you understand the principle, we can just create an open source library to help you solve it:
import convert from 'xml-js'; const result = convert.xml2json(sheetXML, { compact: true, spaces: 4 });
Then we got this stringJSON
(Some content has been deleted):
{ "_declaration": { "_attributes": {} }, "worksheet": { "_attributes": {}, "sheetPr": {}, "dimension": { "_attributes": { "ref": "A1:C7" } }, "sheetData": { "row": [ { "_attributes": { "r": "1", "spans": "1:3" }, "c": [ { "_attributes": { "r": "A1" }, "v": { "_text": "1" } }, { "_attributes": { "r": "C1" }, "v": { "_text": "2" } } ] }, { "_attributes": { "r": "7", "spans": "1:3" }, "c": [ { "_attributes": { "r": "A7" }, "v": { "_text": "1" } }, { "_attributes": { "r": "C7" }, "v": { "_text": "2" } } ] } ] } } }
Next, we just need tosheetData
Take out the data in it, and then generate the data format you want according to the internal properties.
Summarize
excel
The essence of a file is onezip
We only need to passzip
Unzip,xml
The three steps of parsing and data processing can be usedJS
There are still many details to read the data, but if it is just simpleexcel
Template, why not try it yourself.
This is the end of this article about parsing excel files using JS. For more related JS parsing excel files, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!