Notes
Performance Ceiling
I've been trying to figure out what our "theoretical" performance limits would look like. To that end, I've written a script that generates a bunch of randomly sized strings and inserts them into Berkeley DB. This gives us an upper bound on the kind of performance we can expect to see, since all of our item handling is substantially more complex (as in, involving multiple updates per item) than this test. Still, it's useful to know whether we have any chance of accomplishing N orders of magnitude of speedup.
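The shape of such a test can be sketched as follows. This is not the original bdbtest.py; it is a minimal illustration, and the names make_items and measure_rate, the string lengths, and the item count are all assumptions. A plain dict stands in for the database so the sketch runs anywhere; the real test would pass a function that writes to a Berkeley DB handle instead.

```python
import random
import string
import time


def make_items(count, min_len=10, max_len=1000, seed=0):
    """Generate `count` (key, value) pairs with randomly sized string values.

    The length bounds are illustrative assumptions, not the original settings.
    """
    rng = random.Random(seed)
    items = []
    for i in range(count):
        length = rng.randint(min_len, max_len)
        value = "".join(rng.choices(string.ascii_letters, k=length))
        items.append((f"key{i:08d}", value))
    return items


def measure_rate(items, put):
    """Call put(key, value) for every item and return items per second."""
    start = time.perf_counter()
    for key, value in items:
        put(key, value)
    elapsed = time.perf_counter() - start
    return len(items) / elapsed if elapsed > 0 else float("inf")


# Stand-in store: a plain dict, NOT Berkeley DB. Swap in a function that
# writes to a real database handle to measure actual commit rates.
store = {}
items = make_items(10_000)
rate = measure_rate(items, store.__setitem__)
print(f"Insert rate = {rate:.1f} items/sec")
```

Keeping the store behind a single put callable means the same timing loop can be reused for transactional and non-transactional runs by passing a different function.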
The data below is supposed to reflect the effect of different Berkeley DB cache sizes, but the run is on OS X (where the cache settings appear to have no impact). This data set was generated on a 1.25GHz Powerbook G4, with 1GB of RAM and an 80GB 4200RPM disk. Note that we pay a substantial price for running DBXML in transactional mode.
My conclusion from this data is that on my machine, a commit rate of around 1700 items/sec is the upper limit of where we could expect the repository to be, and that in practice the number should be much lower than that. I am currently seeing Chandler commit rates in the neighborhood of 50-70 items/sec. So a two order of magnitude speedup would be 5000-7000 items/sec, which is probably out of reach. Even one order of magnitude (500-700 items/sec) looks like a stretch at this point.
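The speedup arithmetic can be checked with a few lines of Python. The figures are the ones quoted in these notes; the variable names are just for illustration.

```python
# Figures quoted in the notes, all in items/sec.
ceiling = 1700        # approximate transactional commit ceiling on this machine
current = (50, 70)    # Chandler commit rates currently observed

for factor in (10, 100):
    low, high = current[0] * factor, current[1] * factor
    verdict = "under" if high <= ceiling else "over"
    print(f"{factor}x speedup -> {low}-{high} items/sec ({verdict} the ceiling)")

# Largest speedup the raw ceiling allows, from the fast and slow ends of
# the current range.
worst_case = ceiling / current[1]
best_case = ceiling / current[0]
print(f"Headroom: {worst_case:.1f}x to {best_case:.1f}x")
```

Against the raw ceiling, the headroom is roughly 24x to 34x, so one order of magnitude fits and two do not. Since real item handling does several updates per item, the practical headroom is smaller still.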
Generating data... done.
Cache Size = 2097152
Average Non transactional commit rate = 24058.3326043
Average Non transactional sequential read rate = 30068.0158278
Average Non transactional random read rate = 18303.8105232
Average Transactional commit rate = 1794.57831876
Average Transactional sequential read rate = 24403.0486604
Average Transactional random read rate = 15514.2713964
Cache Size = 4194304
Average Non transactional commit rate = 24824.7509826
Average Non transactional sequential read rate = 30036.5702142
Average Non transactional random read rate = 18220.3971302
Average Transactional commit rate = 1807.21427771
Average Transactional sequential read rate = 24531.4550798
Average Transactional random read rate = 15649.537593
Cache Size = 8388608
Average Non transactional commit rate = 24475.3442884
Average Non transactional sequential read rate = 29803.0671415
Average Non transactional random read rate = 17807.3295547
Average Transactional commit rate = 1804.59251815
Average Transactional sequential read rate = 24637.3907112
Average Transactional random read rate = 15278.0434745
Cache Size = 16777216
Average Non transactional commit rate = 23092.995253
Average Non transactional sequential read rate = 28368.5057499
Average Non transactional random read rate = 17048.7574515
Average Transactional commit rate = 1688.22675557
Average Transactional sequential read rate = 23190.1222807
Average Transactional random read rate = 14756.5062169
Cache Size = 33554432
Average Non transactional commit rate = 24342.1378805
Average Non transactional sequential read rate = 30100.3272971
Average Non transactional random read rate = 18173.6835204
Average Transactional commit rate = 1794.77913325
Average Transactional sequential read rate = 24435.0126125
Average Transactional random read rate = 15710.3291637
Cache Size = 67108864
Average Non transactional commit rate = 23597.8271317
Average Non transactional sequential read rate = 29941.6484082
Average Non transactional random read rate = 18114.7765343
Average Transactional commit rate = 1787.43309085
Average Transactional sequential read rate = 24307.6996952
Average Transactional random read rate = 15620.0413775
Cache Size = 134217728
Average Non transactional commit rate = 23968.2215524
Average Non transactional sequential read rate = 30404.8821225
Average Non transactional random read rate = 18008.7162333
Average Transactional commit rate = 1791.84714896
Average Transactional sequential read rate = 24506.8644868
Average Transactional random read rate = 15734.9999632
RunPython bdbtest.py 349.46s user 282.79s system 49% cpu 21:26.51 total
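To put a number on the transactional penalty, the output above can be parsed and the two commit rates compared per cache size. This is a small sketch, assuming the output format shown above; parse_results is a hypothetical helper, and only the first two cache sizes are included to keep it short.

```python
import re

# First two cache-size blocks from the output above (commit rates only).
SAMPLE = """\
Cache Size = 2097152
Average Non transactional commit rate = 24058.3326043
Average Transactional commit rate = 1794.57831876
Cache Size = 4194304
Average Non transactional commit rate = 24824.7509826
Average Transactional commit rate = 1807.21427771
"""


def parse_results(text):
    """Return {cache_size_bytes: {metric_name: rate}} from benchmark output."""
    results = {}
    cache_size = None
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"Cache Size = (\d+)$", line)
        if m:
            cache_size = int(m.group(1))
            results[cache_size] = {}
            continue
        m = re.match(r"Average (.+?) = ([\d.]+)$", line)
        if m and cache_size is not None:
            results[cache_size][m.group(1)] = float(m.group(2))
    return results


results = parse_results(SAMPLE)
for size, rates in results.items():
    penalty = rates["Non transactional commit rate"] / rates["Transactional commit rate"]
    print(f"cache {size // (1024 * 1024)} MB: "
          f"transactional commits are {penalty:.1f}x slower")
```

For both cache sizes the non-transactional commit rate comes out at between 13 and 14 times the transactional one, which is the "substantial price" mentioned above.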
--
TedLeung - 29 Nov 2004