Reindex behavior oddity (minor issue?)

MentalNomad · September 30, 2019, 6:11pm

Continuing the discussion from 0.17.1rc1 reindex (full), assumevalid=0:

A curiosity regarding the txindex rebuild; this may be a (minor) issue. It appears that the txindex thread stops running when the chainstate reindex is complete. As a result the txindex doesn’t seem to be fully completed/enabled until a subsequent restart of the node.

This began when I noticed this about the latest reindex run - at the start:

2019-09-29T15:30:28Z txindex thread start
2019-09-29T15:30:28Z txindex is enabled
2019-09-29T15:30:28Z txindex thread exit

The txindex thread appears to have started running…

We can see that the index was, in fact, built during the reindexing because of the many WriteBatch memory usage: db=txindex message showing the memory size growing, as expected.

Here are the lines right after that last block was indexed from disk:

2019-09-30T04:08:16Z UpdateTip: new best=000000019f550bf1039de36e359d2f13697fe1f5d35430222f4382f2650447a6 height=601193 version=0x20000000 log2_work=87.340668 tx=266527475 date='2019-09-30T03:35:43Z' progress=0.634711 cache=228.9MiB(1402671txo)
2019-09-30T04:08:16Z   - Connect postprocess: 0.00ms [115.95s (0.19ms/blk)]
2019-09-30T04:08:16Z - Connect block: 2.00ms [37577.23s (62.50ms/blk)]
2019-09-30T04:08:16Z WriteBatch memory usage: db=txindex, before=13.0MiB, after=13.0MiB

The txindex appears to be getting updated.

All subsequent blocks (synced from network instead of disk) looked like the txindex was being updated with new incoming blocks just fine. Example:

2019-09-30T13:00:06Z UpdateTip: new best=000000018d8867310263292b5567e43a114adbe9c48366a93974f57e30065529 height=601255 version=0x20000000 log2_work=87.340668 tx=266527766 date='2019-09-30T12:59:55Z' progress=0.634596 cache=228.9MiB(1402505txo)
2019-09-30T13:00:06Z   - Connect postprocess: 1.00ms [115.99s (0.19ms/blk)]
2019-09-30T13:00:06Z - Connect block: 2.00ms [37577.39s (62.50ms/blk)]
2019-09-30T13:00:06Z WriteBatch memory usage: db=txindex, before=13.0MiB, after=13.0MiB

Buuuut… When I restarted the node several hours later (after removing the reindex=1 from the config so that it doesn’t wipe everything out again), I got this:

Bitcoingold.conf:

[main]
debug=1
#assumevalid=0
#reindex=1
txindex=1

Debug log start:

2019-09-30T16:39:40Z 




2019-09-30T16:39:40Z Bitcoin Gold version v0.17.1rc1 (release build)
2019-09-30T16:39:40Z InitParameterInteraction: parameter interaction: -whitelistforcerelay=1 -> setting -whitelistrelay=1
2019-09-30T16:39:41Z Assuming ancestors of block 00000001ddb50bf6de8012bf8cd15bebc278c7e75ce5f77409ab78a291bd7e36 have valid signatures.

Then the txindex started with:

2019-09-30T16:40:03Z txindex thread start
2019-09-30T16:40:03Z Syncing txindex with block chain from height 543459

This was strange, to me. Why height 543459? The node was already in sync up to block 601193 many hours earlier.

I watched some of the (usual?) activity, as the txindex was parsed (being built? or just being read?) until it completed, about a minute later:

2019-09-30T16:40:32Z WriteBatch memory usage: db=txindex, before=7.6MiB, after=7.6MiB
2019-09-30T16:40:32Z WriteBatch memory usage: db=txindex, before=7.6MiB, after=7.6MiB
2019-09-30T16:40:32Z txindex is enabled at height 601268
2019-09-30T16:40:32Z txindex thread exit

I restarted the node one more time. Then I got this at the start:
(I’ve clipped some lines here)

2019-09-30T17:33:27Z txindex thread start
2019-09-30T17:33:27Z Syncing txindex with block chain from height 601273
2019-09-30T17:33:27Z WriteBatch memory usage: db=txindex, before=0.0MiB, after=0.0MiB
2019-09-30T17:33:27Z txindex is enabled at height 601273
2019-09-30T17:33:27Z txindex thread exit

So this is a “clean” start of the node. It appears that the txindex was not fully completed on the first reindex run, and that it did not finish up until the next startup of the node.

I confirmed this in another way - the first big reindex was completed at 2019-09-30T04:08:16Z, which is 12:08 AM my time. Looking at the actual txindex database files on my drive, I can see that they were being written quickly and actively right up until the big reindex completed. Then the txindex stopped being built until the node was restarted 13 hours later:

I don’t know if this is normal behavior or not, but it may be worth noting for anyone who needs the services provided by a complete txindex… After running the txindex on a rebuild, you may want to restart the node to ensure all indexing is complete.

This appears to be a (minor) bug in the txindex implementation for BTG (& BTC); I suspect they stop the txindexing thread when the main index is completed, but the txindex is not yet fully rebuilt - it can be running behind by many, many thousands of blocks. The txindex rebuild does not start up again until the node is restarted. This is probably worth noting as an issue in Github to be resolved so that this runs more as expected in the future; I don’t yet know if the same issue exists in Bitcoin Core or not.

h4x3rotab · September 30, 2019, 9:10pm

The code indicates that there are two ways to index the transactions. The txindex database has a “best block” field to remember where it has synced.

The first way is for the initial indexing. When launching, the init function will try to start a new thread for txindex. The thread will first compare if the best indexed block has caught up with the chain tip. If not, it will loop through all the unprocessed blocks and index the transactions until it reaches the block tip.

The second way is for increment indexing. The indexer is registered as a subscriber for several signals including “ConnectBlock” and “ChainStateFlushed”. “ConnectBlock” event is triggered once a block is fully validated by the consensus rules and connected to the blockchain. When it’s triggered, the indexer will index the transactions in the block. “ChainStateFlushed” is triggered once the core wallet writes the updated blockchain information to the disk. It will trigger the txindex to advance the best block to the current block.

This explains some of the observation. When the first time you ran with “-reindex”, as the blockchain index was abandoned, it will find the txindex had already caught up the chain tip because the blockchain only had a genesis block. That’s why it exited immediately.

While reindexing, the newly imported blocks triggered “ConnectBlock” signal, and triggered the txindex. The “WriteBatch” logs were output from there. However, though still weird, it didn’t update the txindex best block somehow. It’s supposed to trigger “ChainStateFlushed” for multiple times during the reindexing process (as the chain state was actually updated constantly), but my inference is that it did not.

When you restarted the core wallet, it tried to start the initial indexing process. It’s clear that the best block in the txindex is older than the chain tip. Why? I haven’t figured it out. Maybe the reason was the missing “ChainStateFlushed” signal.

As the block downloaded from the network doesn’t suffer from this problem, I think this problem only apply to the reindex process.

MentalNomad · September 30, 2019, 9:42pm

Have you tested a fresh sync from the network with a full txindex enabled? (It is not enabled by default.)