ref: e7ce17381f525328073577d60583447fc9412c18
dir: /research/dynamic_packfiles.txt/
dynamic packfiles to append objects

gc/refcount process punches page-sized holes in them for pages fully within
the space of unwanted objects, after setting a tombstone mark

holes are recorded in an index and re-used

then, if desired, the repack process removes all the punched holes and any
surrounding remnants of unwanted objects that fall slightly outside the page
boundaries

repack is not really git's repack algorithm, it's basically just
defragmentation.

generational bloom filters

idx design
==========

so, let's first get our invariants and patterns clear.

* fixed-length cryptographic object IDs
* essentially uniform key distribution
* exact lookup only, no range scans, no ordered iteration requirements
* reads are extremely important
* writes are mostly append-like
* deletes/tombstones may happen later but are secondary

1st design
----------

* mutable front index
* immutable base index
* periodic merge/compaction into a new base generation

upload-pack/send-pack/defrag
============================

take current pack, remove dead objects/holes, filter objects out, record
offsets and adjust ofs_deltas since they always go backwards, write the pack
back; then stream written pack to client.

two-step necessary because pack header includes object count; could have a
custom new protocol that doesn't do so.
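
a minimal sketch of the offset bookkeeping in the defrag pass described above,
in python. everything here is hypothetical naming for illustration (`Entry`,
`plan_defrag`, the `live` flag, `base_offset`); it is not git's code. it
assumes each entry's size stays fixed, which is a simplification: in a real
pack the ofs_delta distance is stored as a variable-length integer, so a
shorter distance could shrink the entry. it also assumes a live delta never
has a dead base, and raises if that is violated.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    offset: int                        # byte offset in the old pack
    size: int                          # bytes occupied (assumed fixed, see note above)
    live: bool                         # False for tombstoned objects and punched holes
    base_offset: Optional[int] = None  # old offset of the base, ofs_delta entries only


def plan_defrag(entries, header_size=12):
    """Return (object_count, plan).

    plan holds one (old_offset, new_offset, new_base_distance) tuple per
    surviving entry; new_base_distance is None for non-delta entries.
    """
    remap = {}            # old offset -> new offset, survivors only
    plan = []
    cursor = header_size  # 12-byte pack header: signature, version, object count
    for e in sorted(entries, key=lambda e: e.offset):
        if not e.live:
            continue      # dead objects and holes are simply not copied
        distance = None
        if e.base_offset is not None:
            # bases always sit earlier in the pack, so the base has already
            # been remapped by the time its delta is visited
            if e.base_offset not in remap:
                raise ValueError(f"live delta at {e.offset} has a dropped base")
            distance = cursor - remap[e.base_offset]
        remap[e.offset] = cursor
        plan.append((e.offset, cursor, distance))
        cursor += e.size
    # the count is only known once the whole pass is done, which is why the
    # header cannot be streamed before the pass completes
    return len(plan), plan
```

a single forward pass is enough because of the "always go backwards" property:
a delta never needs the new offset of something that comes after it.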