A binary file is a file that stores information purely as bits and bytes (0s and 1s). It is not human-readable, because its bytes do not all map to printable characters; many correspond to non-printable characters and symbols. Opening a binary file in a text editor displays characters such as Ø and ð.
A binary file must be read by a specific program before it can be used. For example, a Microsoft Word document is a binary file that only Word (or a compatible program) can turn into human-readable form. This is because the file holds more than the readable text: information such as character formatting and page numbers is stored alongside the alphanumeric characters. Finally, a binary file is one continuous sequence of bytes. The line break we see in a text file is itself just a character that joins one line to the next.
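To make the "continuous sequence of bytes" idea concrete, here is a minimal sketch that writes a two-line text file and reads it back as raw bytes. The file is a temporary one created with tempfile(), and the names txt_path, bytes and newline_positions are chosen only for this illustration.

```r
# Write a two-line text file in binary mode, then read it back as raw bytes.
txt_path <- tempfile(fileext = ".txt")   # hypothetical temporary file
con <- file(txt_path, "wb")              # "wb": bytes are written unchanged
writeBin(charToRaw("ab\ncd\n"), con)
close(con)

# readBin() also accepts a file name instead of a connection object.
bytes <- readBin(txt_path, what = raw(), n = file.size(txt_path))
print(bytes)
# [1] 61 62 0a 63 64 0a

# 0a is the newline byte; it sits between the lines like any other byte.
newline_positions <- which(bytes == as.raw(0x0a))
```

Seen this way, a text file is just a binary file whose bytes all happen to be printable characters or line breaks.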
Sometimes data generated by other programs has to be processed by R as a binary file. R is also needed to create binary files that can be shared with other programs.
R has two functions, writeBin() and readBin(), to create and read binary files.
Syntax
writeBin(object, con)
readBin(con, what, n)
The following is a description of the parameters used:
- con is a connection object that reads or writes to a binary file.
- object is the R object (an atomic vector) to be written to the file.
- what is the mode of the data being read, such as character(), integer(), etc.
- n is the maximum number of records (values) to read from the binary file, not the number of bytes.
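The parameters above can be tried out with a small round trip. This is only a sketch: the temporary file and the names bin_path, ints, dbls and extra are assumptions made for the illustration.

```r
bin_path <- tempfile(fileext = ".bin")   # hypothetical temporary file

# Write five integers (4 bytes each) followed by two doubles (8 bytes each).
con <- file(bin_path, "wb")
writeBin(1:5, con)
writeBin(c(2.5, 10), con)
close(con)

# Read the values back in the same order and with the same types.
con <- file(bin_path, "rb")
ints  <- readBin(con, what = integer(), n = 5)
dbls  <- readBin(con, what = numeric(), n = 2)
extra <- readBin(con, what = integer(), n = 10)  # nothing left to read
close(con)

print(ints)
print(dbls)
print(length(extra))
```

Because n is an upper limit, asking for 10 more integers at the end of the file simply returns an empty vector. The file stores no type information, so the reader has to know the order and types that the writer used.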
Example
We use R's built-in data set "mtcars". First we write it to a csv file, then store part of it as a binary file on disk. Next we read the binary file back.
Write to binary files
We write the data frame "mtcars" to a csv file, read five records back from it, and then write those records to disk as a binary file.
# csv_data and out_con are placeholder names; supply the file names for your own system.
# Write the columns "cyl", "am" and "gear" of the "mtcars" data frame to a csv file.
write.table(mtcars[, c("cyl", "am", "gear")], file = "", row.names = FALSE, na = "", col.names = TRUE, sep = ",")
# Store 5 records from the csv file as a new data frame.
csv_data <- read.table("", sep = ",", header = TRUE, nrows = 5)
# Create a connection object to write the binary file using mode "wb".
out_con <- file("/web/com/", "wb")
# Write the column names of the data frame to the connection object.
writeBin(colnames(csv_data), out_con)
# Write the records in each of the columns to the file.
writeBin(c(csv_data$cyl, csv_data$am, csv_data$gear), out_con)
# Close the file for writing so that it can be read by other programs.
close(out_con)
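For readers who want to run the writing step as-is, here is a self-contained sketch of the same idea. It makes two assumptions that are not in the original: the file goes to a temporary location from tempfile() instead of a fixed path, and the csv round trip is replaced by as.integer(). mtcars stores these columns as doubles, while reading them back from a csv file with read.table() yields integers. All variable names are illustrative.

```r
# Keep the three columns and the first five rows.
sample_cars <- head(mtcars[, c("cyl", "am", "gear")], 5)

# Convert to integers so each value occupies 4 bytes in the file.
values <- as.integer(c(sample_cars$cyl, sample_cars$am, sample_cars$gear))

out_path <- tempfile(fileext = ".dat")    # hypothetical output location
out_con <- file(out_path, "wb")
writeBin(colnames(sample_cars), out_con)  # "cyl", "am", "gear", each ending in a zero byte: 12 bytes
writeBin(values, out_con)                 # 15 integers x 4 bytes = 60 bytes
close(out_con)

print(file.size(out_path))
# [1] 72
```

The file size confirms the layout: 12 bytes of column names followed by 60 bytes of integer values, with nothing in between.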
Read binary files
The binary file created above stores all data as consecutive bytes. Therefore, we read it by choosing the appropriate type and count for the column names and for the column values.
# in_con and col_names are placeholder names; supply the file name for your own system.
# Create a connection object to read the file in binary mode using "rb".
in_con <- file("/web/com/", "rb")
# First read the column names. n = 3 as we have 3 columns.
col_names <- readBin(in_con, character(), n = 3)
close(in_con)
# Next read the whole file again as integers. n = 18: the 12 bytes of column names are read as 3 integers, followed by the 15 values.
in_con <- file("/web/com/", "rb")
bindata <- readBin(in_con, integer(), n = 18)
close(in_con)
# Print the data.
print(bindata)
# Elements 4 to 8 hold the "cyl" values.
cyldata = bindata[4:8]
print(cyldata)
# Elements 9 to 13 hold the "am" values.
amdata = bindata[9:13]
print(amdata)
# Elements 14 to 18 hold the "gear" values.
geardata = bindata[14:18]
print(geardata)
# Combine all the values into a matrix and label its columns.
finaldata = cbind(cyldata, amdata, geardata)
colnames(finaldata) = col_names
print(finaldata)
When we execute the above code, it produces the following result:
 [1]    7108963 1728081249    7496037          6          6          4
 [7]          6          8          1          1          1          0
[13]          0          4          4          4          3          3
[1] 6 6 4 6 8
[1] 1 1 1 0 0
[1] 4 4 4 3 3
     cyl am gear
[1,]   6  1    4
[2,]   6  1    4
[3,]   4  1    4
[4,]   6  0    3
[5,]   8  0    3
As we can see, we get the original data back by reading the binary file in R.
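The three large numbers at the start of the printed vector are the column names themselves, reinterpreted as integers. The sketch below is an alternative to the approach above, not the same method: it reads the names and the values one after the other from a single connection, so no index arithmetic is needed, and then reproduces the first of those numbers. The temporary file and all variable names are assumptions for the illustration, and the value 7108963 assumes little-endian byte order, which the code requests explicitly.

```r
# Recreate the file so that the sketch runs on its own.
bin_path <- tempfile(fileext = ".dat")   # hypothetical location
con <- file(bin_path, "wb")
writeBin(c("cyl", "am", "gear"), con)
writeBin(c(6L, 6L, 4L, 6L, 8L, 1L, 1L, 1L, 0L, 0L, 4L, 4L, 4L, 3L, 3L), con)
close(con)

# Read sequentially: 3 strings first, then the 15 integers that follow them.
con <- file(bin_path, "rb")
col_names <- readBin(con, character(), n = 3)
values <- readBin(con, integer(), n = 15)
close(con)

# The values were written column by column, so fill the matrix the same way.
final <- matrix(values, ncol = 3, dimnames = list(NULL, col_names))
print(final)

# The bytes "c", "y", "l" and the terminating zero byte, read as one integer.
header_int <- readBin(c(charToRaw("cyl"), as.raw(0)), integer(),
                      n = 1, endian = "little")
print(header_int)
# [1] 7108963
```

Reading from the same open connection continues where the previous readBin() call stopped, which is why the integers start right after the column names.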