How To Divide A Binary File To 6-byte Blocks In C++ Or Python With Fast Speed?

September 08, 2024

I’m reading a file in C++ and Python as a binary file. I need to divide the binary data into blocks of 6 bytes each. For example, if my file is 600 bytes, the result should be 100 blocks.

Solution 1:

Let's start simple, then optimize.

Simple Loop

uint8_t  array1[6];
while (my_file.read((char *) &array1[0], 6))
{
    Process_Block(&array1[0]);
}

The above code reads the file 6 bytes at a time and passes each block to a function. It meets the requirements, but it is not very efficient.

Reading Larger Blocks

Files are streaming devices. They have an overhead to start streaming, but are very efficient to keep streaming. In other words, we want to read as much data as possible per transaction to reduce the overhead.

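static const unsigned int CAPACITY = 6 * 1024;
uint8_t block1[CAPACITY];
// read() reports failure on a short final read, so also check gcount();
// otherwise the last, partially filled buffer would be dropped.
while (my_file.read((char *) &block1[0], CAPACITY) || my_file.gcount() > 0)
{
    const size_t bytes_read = my_file.gcount();
    size_t blocks_read = bytes_read / 6; // not const: decremented below
    uint8_t const * block_pointer = &block1[0];
    while (blocks_read > 0)
    {
        Process_Block(block_pointer);
        block_pointer += 6;
        --blocks_read;
    }
}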

The above code reads up to 1024 blocks in one transaction. After reading, each block is sent to a function for processing.

This version is more efficient than the Simple Loop, as it reads more data per transaction. Adjust the CAPACITY to find the optimal size on your platform.
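To try the chunked read end to end, here is a minimal self-contained sketch. The names for_each_block, BLOCK_SIZE and count_blocks_in_memory are made up for illustration, and the callback stands in for Process_Block. It assumes a short read only happens at the end of the input, and it ignores any trailing bytes that do not fill a whole block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

constexpr std::size_t BLOCK_SIZE = 6;

// Hypothetical helper: calls process(pointer) for every complete 6-byte
// block in the stream and returns how many blocks were handed over.
template <typename Callback>
std::size_t for_each_block(std::istream& in, Callback process,
                           std::size_t blocks_per_read = 1024)
{
    assert(blocks_per_read > 0);
    std::vector<std::uint8_t> buffer(blocks_per_read * BLOCK_SIZE);
    std::size_t total = 0;
    for (;;)
    {
        in.read(reinterpret_cast<char*>(buffer.data()),
                static_cast<std::streamsize>(buffer.size()));
        const std::size_t bytes_read = static_cast<std::size_t>(in.gcount());
        if (bytes_read == 0)
            break;                              // nothing left to read
        const std::size_t blocks = bytes_read / BLOCK_SIZE;
        for (std::size_t i = 0; i < blocks; ++i)
            process(buffer.data() + i * BLOCK_SIZE);
        total += blocks;
        if (bytes_read < buffer.size())
            break;                              // short read: end of input
    }
    return total;
}

// Usage sketch: an in-memory stream stands in for the binary file.
std::size_t count_blocks_in_memory(const std::string& bytes,
                                   std::size_t blocks_per_read)
{
    std::istringstream in(bytes);
    return for_each_block(in, [](const std::uint8_t*) {}, blocks_per_read);
}
```

With a real file, pass a std::ifstream opened with std::ios::binary instead of the string stream, and vary blocks_per_read to see what works best on your platform.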

Loop Unrolling

The previous code reduces the first bottleneck of input transfer speed (although there is still room for optimization). Another technique is to reduce the overhead of the processing loop by performing more data processing inside the loop. This is called loop unrolling.

const size_t bytes_read = my_file.gcount();
size_t blocks_read = bytes_read / 6; // not const: decremented below
uint8_t const * block_pointer = &block1[0];
// Unrolled loop: process four blocks per iteration.
while ((blocks_read / 4) != 0)
{
    Process_Block(block_pointer);
    block_pointer += 6;

    Process_Block(block_pointer);
    block_pointer += 6;

    Process_Block(block_pointer);
    block_pointer += 6;

    Process_Block(block_pointer);
    block_pointer += 6;
    blocks_read -= 4;
}
// Handle the remaining 0 to 3 blocks.
while (blocks_read > 0)
{
    Process_Block(block_pointer);
    block_pointer += 6;
    --blocks_read;
}

You can adjust the number of operations inside the loop to see how it affects your program's speed.

Multi-Threading & Multiple Buffers

Two more techniques for speeding up the reading of the data are multiple threads and multiple buffers.

One thread, an input thread, reads the file into a buffer. After reading into the first buffer, the thread sets a semaphore indicating there is data to process. The input thread reads into the next buffer. This repeats until the data is all read. (For a challenge, figure out how to reuse the buffers and notify the other thread of which buffers are available).

The second thread is the processing thread. This processing thread is started first and waits for the first buffer to be completely read. After the buffer has the data, the processing thread starts processing the data. After the first buffer has been processed, the processing thread starts on the next buffer. This repeats until all the buffers have been processed.

The goal here is to use as many buffers as necessary to keep the processing thread running and not waiting.
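The two-thread scheme described above could be sketched roughly as follows. This is only an illustration under some assumptions: BufferQueue, count_blocks_threaded and demo_threaded are made-up names, a std::condition_variable is used in place of a semaphore, the calling thread acts as the input thread, and counting blocks stands in for Process_Block. It allocates a fresh buffer per read rather than reusing buffers, so the "challenge" part is left open.

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <mutex>
#include <queue>
#include <sstream>
#include <string>
#include <thread>
#include <utility>
#include <vector>

using Buffer = std::vector<std::uint8_t>;

// Hypothetical hand-off queue between the input and processing threads.
class BufferQueue
{
public:
    void push(Buffer b)
    {
        { std::lock_guard<std::mutex> lock(mutex_); filled_.push(std::move(b)); }
        ready_.notify_one();
    }
    void close()   // signals that no more buffers will arrive
    {
        { std::lock_guard<std::mutex> lock(mutex_); closed_ = true; }
        ready_.notify_all();
    }
    bool pop(Buffer& out)   // returns false once closed and drained
    {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !filled_.empty() || closed_; });
        if (filled_.empty())
            return false;
        out = std::move(filled_.front());
        filled_.pop();
        return true;
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Buffer> filled_;
    bool closed_ = false;
};

std::size_t count_blocks_threaded(std::istream& in,
                                  std::size_t blocks_per_buffer = 1024)
{
    BufferQueue queue;
    std::size_t blocks = 0;
    // Processing thread: started first, waits for filled buffers.
    std::thread worker([&] {
        Buffer b;
        while (queue.pop(b))
            blocks += b.size() / 6;   // stand-in for Process_Block per block
    });
    // Input thread (the caller): fills one buffer after another.
    const std::size_t capacity = blocks_per_buffer * 6;
    for (;;)
    {
        Buffer b(capacity);
        in.read(reinterpret_cast<char*>(b.data()),
                static_cast<std::streamsize>(capacity));
        const std::size_t got = static_cast<std::size_t>(in.gcount());
        if (got == 0)
            break;
        b.resize(got - got % 6);      // drop a trailing partial block
        queue.push(std::move(b));
        if (got < capacity)
            break;                    // short read: end of input
    }
    queue.close();
    worker.join();                    // blocks is safe to read after join
    return blocks;
}

// Usage sketch: an in-memory stream stands in for the file.
std::size_t demo_threaded(std::size_t byte_count)
{
    std::istringstream in(std::string(byte_count, 'x'));
    return count_blocks_threaded(in, 100);
}
```

The queue here is unbounded, so a fast reader can run far ahead of a slow processor. Capping the number of buffers in flight would be the next refinement.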

Edit 1: Other techniques

Memory Mapped Files

Some operating systems support memory mapped files. The OS reads a portion of the file into memory. When a location outside the memory is accessed, the OS loads another portion into memory. Whether this technique improves performance needs to be measured (profiled).
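On POSIX systems, a memory mapped read might look like the sketch below. It assumes mmap is available (so it will not build on Windows as written), and sum_first_bytes_mmap is a made-up name. Summing the first byte of each block is only a stand-in for real processing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>     // open
#include <sys/mman.h>  // mmap, munmap
#include <sys/stat.h>  // fstat
#include <unistd.h>    // close

// Hypothetical helper (POSIX only): maps the file read-only, visits every
// complete 6-byte block and returns the sum of each block's first byte.
// Returns -1 on error.
long long sum_first_bytes_mmap(const char* path)
{
    assert(path != nullptr);
    const int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat info;
    if (fstat(fd, &info) != 0)
    {
        close(fd);
        return -1;
    }
    const std::size_t size = static_cast<std::size_t>(info.st_size);
    if (size == 0)
    {
        close(fd);
        return 0;                      // mmap rejects a zero length
    }

    void* mapping = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                         // the mapping stays valid after close
    if (mapping == MAP_FAILED)
        return -1;

    const std::uint8_t* bytes = static_cast<const std::uint8_t*>(mapping);
    long long sum = 0;
    for (std::size_t offset = 0; offset + 6 <= size; offset += 6)
        sum += bytes[offset];          // replace with Process_Block(bytes + offset)

    munmap(mapping, size);
    return sum;
}
```

No explicit read calls are needed. The OS pages the file in as the loop touches it.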

Parallel Processing & Threading

Adding multiple threads may show negligible performance gain. Computers have a data bus (data highway) connecting many hardware devices, including memory, file I/O and the processor. Devices will be paused to let other devices use the data highway. With multiple cores or processors, one processor may have to wait while the other processor is using the data highway. Because of this waiting, multiple threads or parallel processing may yield only a negligible speedup. Also, the operating system has overhead when constructing and maintaining threads.
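Since all of these techniques have to be measured rather than assumed, a small timing helper is handy. The sketch below uses std::chrono::steady_clock. The name average_ms is made up, and the work you pass in would be one of the reading variants above.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs `work` `repeats` times and returns the average
// wall-clock time per run in milliseconds.
template <typename Work>
double average_ms(Work work, int repeats = 5)
{
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i)
        work();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / repeats;
}
```

Keep in mind that the OS usually caches a file after the first read, so repeated runs over the same file mostly measure cached reads rather than disk speed.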

Solution 2:

Try this. The input file is passed as the program's argument. As you said, I assume the 6-byte values in the file are written in big-endian order, but I make no assumption about the machine that reads and sorts the file: the program runs on both little-endian and big-endian hosts (it checks which at run time).

#include <iostream>
#include <fstream>
#include <vector>
#include <cstdint>
#include <algorithm>
#include <limits.h> // CHAR_BIT

using namespace std;

#if CHAR_BIT != 8
# error that code supposes a char has 8 bits
#endif

int main(int argc, char ** argv)
{
  if (argc != 2)
    cerr << "Usage: " << argv[0] << " <file>" << endl;
  else {
    ifstream in(argv[1], ios::binary);

    if (!in.is_open())
      cerr << "Cannot open " << argv[1] << endl;
    else {
      in.seekg(0, ios::end);

      size_t n = (size_t) in.tellg() / 6;
      vector<uint64_t> values(n);
      uint64_t * p = values.data(); // for performance
      uint64_t * psup = p + n;

      in.seekg(0, ios::beg);

      int i = 1;

      if (*((char *) &i)) {
        // little endian
        unsigned char s[6];
        uint64_t v = 0;

        while (p != psup) {
          if (!in.read((char *) s, 6))
            return -1;
          ((char *) &v)[0] = s[5];
          ((char *) &v)[1] = s[4];
          ((char *) &v)[2] = s[3];
          ((char *) &v)[3] = s[2];
          ((char *) &v)[4] = s[1];
          ((char *) &v)[5] = s[0];
          *p++ = v;
        }
      }
      else {
        // big endian
        uint64_t v = 0;

        while (p != psup) {
          if (!in.read(((char *) &v) + 2, 6))
            return -1;
          *p++ = v;
        }
      }

      cout << "file successfully read" << endl;

      sort(values.begin(), values.end());
      cout << "values sort" << endl;

      // DEBUG, DO ON A SMALL FILE ;-)
      for (auto v : values)
        cout << v << endl;
    }
  }
}
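As an alternative to checking the host byte order at run time, the 6 bytes can be combined with shifts, which gives the same value on any host. This is a different technique from the byte copying above, shown only as a sketch. The name decode_be48 is made up, and it assumes the values are stored big-endian as in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: decodes 6 big-endian bytes into a uint64_t.
// Shifts operate on values, not on memory layout, so no endianness
// check is needed.
std::uint64_t decode_be48(const std::uint8_t* s)
{
    assert(s != nullptr);
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < 6; ++i)
        v = (v << 8) | s[i];   // the most significant byte comes first
    return v;
}
```

In the program above this would replace both branches: read 6 bytes into s, then store decode_be48(s) in the vector.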
