SegmentNotFoundException: inconsistency in segment store

Recently I installed fresh content from PROD on my local instance, and at the end of the installation the package manager reported this error: GC overhead limit exceeded.

I increased the heap size in AEM’s start script from 1024M to 4096M and the PermGen space from 256M to 512M, and then restarted the server. After that, I wasn’t able to log in anymore. Only the OSGi console was accessible.
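
For reference, the memory change could look roughly like this. It is only a sketch, assuming the stock crx-quickstart/bin/start script, where the JVM options are set through the CQ_JVM_OPTS variable; check your own script, as it may differ:

```shell
# Assumed location: crx-quickstart/bin/start (stock AEM start script).
# Shipped default was: -Xmx1024m -XX:MaxPermSize=256M
# Note: -XX:MaxPermSize only has an effect on Java 7 and earlier.
CQ_JVM_OPTS='-server -Xmx4096m -XX:MaxPermSize=512M -Djava.awt.headless=true'
```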

After inspecting the error.log file, I found out that the repository wasn’t up and running, and the following message indicated the reason:

8.09.2017 12:33:06.863 *ERROR* [FelixStartLevel] com.adobe.granite.repository.impl.SlingRepositoryManager start:
Uncaught Throwable trying to access Repository, calling stopRepository() org.apache.jackrabbit.oak.plugins.segment.SegmentNotFoundException:
Segment 0c7fd7be-124d-4cd7-a3b3-ba36f2a5c0b2 not found

After googling a little bit I realized that the error caused by the lack of memory had left my repository in an inconsistent state, and because of that the repository wouldn’t start. I also found the recovery procedure, and I’m going to describe it here:

  • Find out the Oak version: log in to CRXDE Lite or, if you cannot log in, look at the version of the Oak Core bundle in the OSGi console
  • Based on the Oak version, download the corresponding oak-run file from this page http://repo1.maven.org/maven2/org/apache/jackrabbit/oak-run/. Put the file into the AEM install directory (the one which contains the crx-quickstart folder)
  • Shut down the AEM instance
  • Find the latest consistent revision by running this command:

    java -jar oak-run-<your_version>.jar check -d1 --bin=-1 -p crx-quickstart/repository/segmentstore/
    

    This can take a significant amount of time, depending on the size of the repository. At the end you should get output similar to this:

    13:21:17.421 [main] INFO  o.a.j.o.p.s.f.t.ConsistencyChecker - Found latest good revision ac578381-2fdd-4f56-a8c3-c84ae30c7e36:223964
    
  • Revert the repository to this revision by editing ./crx-quickstart/repository/segmentstore/journal.log. Delete all lines after the line containing the latest good revision
  • Remove all ./crx-quickstart/repository/segmentstore/*.bak files
  • Run checkpoint clean-up to remove orphaned checkpoints:

    java -jar oak-run-<your_version>.jar checkpoints ./crx-quickstart/repository/segmentstore rm-unreferenced
    
  • Compact the repository:

    java -jar oak-run-<your_version>.jar compact ./crx-quickstart/repository/segmentstore/
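
The manual journal.log edit from the steps above can also be scripted. The following is a minimal sketch, not part of oak-run: extract_good_revision and truncate_journal are hypothetical helper names, and the sketch assumes the check output contains a "Found latest good revision" line like the one shown earlier. It keeps a copy of the original journal in the current directory (not in the segmentstore) before trimming it.

```shell
#!/bin/sh
# Hypothetical helpers for the "revert to the latest good revision" step.

# Read oak-run check output on stdin and print the revision from the
# last "Found latest good revision <revision>" line.
extract_good_revision() {
    sed -n 's/.*Found latest good revision \([^ ]*\).*/\1/p' | tail -n 1
}

# Trim journal.log so that the line containing the given revision
# becomes its last line.
#   $1  path to journal.log
#   $2  revision reported by the check command
#   $3  where to keep a copy of the original (default: ./journal.log.orig)
truncate_journal() {
    journal="$1"
    revision="$2"
    backup="${3:-./journal.log.orig}"

    # Line number of the last line mentioning the revision (fixed-string match).
    line=$(grep -nF -- "$revision" "$journal" | tail -n 1 | cut -d: -f1)
    if [ -z "$line" ]; then
        echo "revision $revision not found in $journal" >&2
        return 1
    fi

    # Keep an untouched copy of the journal before rewriting it.
    cp "$journal" "$backup" || return 1

    # Keep lines 1..$line and drop everything after them.
    head -n "$line" "$backup" > "$journal"
}
```

With AEM shut down, this could be used from the AEM install directory roughly as follows (check.log is an assumed file holding the saved output of the check command):

    rev=$(extract_good_revision < check.log)
    truncate_journal crx-quickstart/repository/segmentstore/journal.log "$rev"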