Block Gas Limit and Knapsack Problem

In today’s daily standup, the core dev was engaged in an interesting discussion about how to pack as many as possible txs into a block with a given block gas limit (which is 20MIOTX for IoTeX mainnet). This is a combinatorial optimization problem that is equivalent to the knapsack problem – given a set of items, each with a weight and a value, determine the number of each item included in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible. (https://en.wikipedia.org/wiki/Knapsack_problem). The problem is still NP-complete: while it is easy to confirm whether a proposed solution is valid, it is computationally difficult to determine in the first place whether any solution exists.

Several algorithms are available to solve knapsack problems, based on dynamic programming approach, branch and bound approach or hybridizations of both approaches.

To address this problem practically, here is a heuristic algorithm with O(n log n) cost that can be used. Note that the tx picking scheme is really up to the delegate who’s producing blocks, and there is no need for all delegates to pick txs using the same scheme. Still, consensus can be obtained.

1. sort all actions by gas price to the array A, sort all actions by the gas amount to the array B
2. choose first N entries from A
3. choose first M entries (that were unselected in step 2) from B to minimize (block gas limit - total gas of N txs)
4. Pack N + M entries into a block

We made a tradeoff here - optimize performance (low computational complexity) while sacrificing optimum s.t. step 4 may return a result that “waste” a few more gas units in a block. But it is completely acceptable.

Thoughts?