Mon Aug 14 19:12:00 UTC 2023
Exploring FIX processing using the latest C++.
This began as a FIX coding challenge posted in the C++ group on LinkedIn – see the post.
Given that HFT firmware can make trading decisions in 10ns – see Trading at light speed: designing low latency systems in C++ for more info – I was curious how quickly I could process a FIX message in software. The process is timed using Google Benchmark and, just for kicks, I’ve written my solution as constexpr, since I don’t get to use it very often in my day job. The challenge: extract the ticker symbol name and order quantity from an ASCII FIX message as quickly as possible.
The tags to search for are 55 (Symbol) and 38 (OrderQty).
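As a rough illustration of the task, the sketch below pulls a tag's value out of a message with std::string_view inside a constexpr function. It is not the author's solution: the helper name get_tag_value, the sample message and the '|' character standing in for the SOH (0x01) delimiter are all assumptions made for this example.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not the author's code): return the value of the first
// field whose tag equals `tag`, or an empty view when the tag is absent.
// Fields are assumed to be "tag=value" pairs separated by `delimiter`.
constexpr std::string_view get_tag_value(std::string_view message,
                                         std::string_view tag,
                                         char delimiter = '\x01') {
    while (!message.empty()) {
        // Slice off the next field; npos means this is the last one.
        const std::size_t end = message.find(delimiter);
        const std::string_view field = message.substr(0, end);

        // Require the whole tag followed by '=', so tag 5 never matches 55.
        if (field.size() > tag.size() &&
            field.substr(0, tag.size()) == tag &&
            field[tag.size()] == '=') {
            return field.substr(tag.size() + 1);
        }

        if (end == std::string_view::npos) {
            break;
        }
        message.remove_prefix(end + 1);
    }
    return {};
}

// Made-up sample order; '|' stands in for the SOH (0x01) delimiter.
constexpr std::string_view sample = "8=FIX.4.2|35=D|55=MSFT|38=100|10=000|";

// Evaluated by the compiler (C++17 or later), so a wrong answer fails the build.
static_assert(get_tag_value(sample, "55", '|') == "MSFT");
static_assert(get_tag_value(sample, "38", '|') == "100");
static_assert(get_tag_value(sample, "99", '|').empty());
```

The value comes back as a view into the original message, so nothing is copied or allocated; converting the quantity to an integer is left out to keep the sketch short.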
On a Sapphire Rapids VM I got my ranges solution below 100ns – not bad – but Oleg Morozov managed to get his AVX version under 4ns!
Model name: Intel(R) Xeon(R) Platinum 8481C CPU @ 2.70GHz
BogoMIPS: 5399.99
---------------------------------------------------------------------------
Benchmark                                 Time             CPU   Iterations
---------------------------------------------------------------------------
challenge_fix_dean_turpin              89.3 ns         89.3 ns      7850558
challenge_fix_oleg_morozov             3.72 ns         3.72 ns    189516479
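The figures above come from Google Benchmark. As a rough stand-in that needs only the standard library, the sketch below times a callable with std::chrono::steady_clock and reports the mean time per call. The helper name mean_ns_per_call is hypothetical and this is not how Google Benchmark itself is implemented; it only shows the basic shape of such a measurement.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of Google Benchmark): call `func` `iterations`
// times and return the mean wall-clock time per call in nanoseconds.
// `func` is assumed to return a value convertible to std::size_t.
template <typename Func>
double mean_ns_per_call(Func&& func, std::uint64_t iterations) {
    using clock = std::chrono::steady_clock;

    // Storing each result in a volatile means the stores cannot be dropped,
    // which discourages the optimiser from deleting the loop outright.
    volatile std::size_t sink = 0;

    const auto start = clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        sink = static_cast<std::size_t>(func());
    }
    const auto stop = clock::now();

    // Convert the elapsed time to fractional nanoseconds.
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return iterations == 0 ? 0.0
                           : elapsed.count() / static_cast<double>(iterations);
}
```

A call such as mean_ns_per_call([&] { return parse(message).size(); }, 1'000'000), where parse is whatever function is under test, gives a rough per-call figure. Note that the work inside the callable can still be computed at compile time when its inputs are constants; Google Benchmark provides benchmark::DoNotOptimize for that situation.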
make[1]: 'app.o' is up to date.
make EXTRA=-DNDEBUG run --directory=test --makefile=/tmp/makefile
make[1]: Entering directory '/builds/germs-dev/fix/test'
timeout 60 ./app.o --benchmark_filter=
Processing took 988247ns
[==========] Running 10 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 6 tests from fix
[ RUN ] fix.get_order
[ OK ] fix.get_order (0 ms)
[ RUN ] fix.get_summary
[ OK ] fix.get_summary (0 ms)
[ RUN ] fix.tokenise_to_map
[ OK ] fix.tokenise_to_map (0 ms)
[ RUN ] fix.tokenise_to_vector
[ OK ] fix.tokenise_to_vector (0 ms)
[ RUN ] fix.tokenise_hashmap
[ OK ] fix.tokenise_hashmap (0 ms)
[ RUN ] fix.tokenise_map
[ OK ] fix.tokenise_map (0 ms)
[----------] 6 tests from fix (0 ms total)
[----------] 3 tests from file
[ RUN ] file.read
[ OK ] file.read (0 ms)
[ RUN ] file.read
[ OK ] file.read (0 ms)
[ RUN ] file.read
[ OK ] file.read (0 ms)
[----------] 3 tests from file (0 ms total)
[----------] 1 test from parse
[ RUN ] parse.split_key_values
[ OK ] parse.split_key_values (0 ms)
[----------] 1 test from parse (0 ms total)
[----------] Global test environment tear-down
[==========] 10 tests from 3 test suites ran. (0 ms total)
[ PASSED ] 10 tests.
---------------------------------------------------------------------------
Benchmark                                 Time             CPU   Iterations
---------------------------------------------------------------------------
challenge_fix_dean_turpin             0.000 ns        0.000 ns   1000000000
fix_get_key_value                     0.000 ns        0.000 ns   1000000000
fix_get_summary_constexpr             0.000 ns        0.000 ns   1000000000
fix_tokenise_to_map                    1869 ns         1865 ns       370919
main_process                         733458 ns       732950 ns          954
main_process_devel                   721134 ns       721121 ns          982
main_process_devel2                  887724 ns       887036 ns          795
fix_tokenise_to_vector                 1304 ns         1303 ns       543935
fix_get_key_value3                    0.000 ns        0.000 ns   1000000000
fix_get_summary_hashmap               0.000 ns        0.000 ns   1000000000
fix_get_tag_first                     0.000 ns        0.000 ns   1000000000
fix_get_tag_last                      0.000 ns        0.000 ns   1000000000
read_stringstream_small                2328 ns         2237 ns       314637
read_stringstream_medium               3141 ns         3051 ns       229716
read_stringstream_large               19028 ns        18903 ns        36254
read_streambuf_iterator_small          2393 ns         2297 ns       300900
read_streambuf_iterator_medium        15776 ns        15714 ns        44206
read_streambuf_iterator_large        158921 ns       158909 ns         4343
read_tellg_small                       2279 ns         2193 ns       315744
read_tellg_medium                      2651 ns         2585 ns       270092
read_tellg_large                       6151 ns         6075 ns       116546
split_key_value_pair_with_substr       23.1 ns         23.1 ns     29854659
make[1]: Leaving directory '/builds/germs-dev/fix/test'
Writing to /dev/null is as quick as it’s going to get, so that run is the baseline; piping through the fix app is the path we want to speed up.
make[1]: 'app.o' is up to date.
# Test writing to null
timeout 5 fixer/app.o > /dev/null || true
# Test piping to fix processor
timeout 5 fixer/app.o | fixee/app.o || true
processed bits 978934368
[==========] Running 2 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 1 test from parse
[ RUN ] parse.split_key_values
[ OK ] parse.split_key_values (0 ms)
[----------] 1 test from parse (0 ms total)
[----------] 1 test from fix
[ RUN ] fix.tokenise_to_vector
[ OK ] fix.tokenise_to_vector (0 ms)
[----------] 1 test from fix (0 ms total)
[----------] Global test environment tear-down
[==========] 2 tests from 2 test suites ran. (0 ms total)
[ PASSED ] 2 tests.
---------------------------------------------------------------------------
Benchmark                                 Time             CPU   Iterations
---------------------------------------------------------------------------
split_key_value_pair_with_substr       22.9 ns         22.9 ns     30994195
fix_tokenise_to_vector                 1299 ns         1299 ns       541331