WIP: In-repository performance benchmarks (#6)
* Uses our custom LDPC file library. Updates Julia code and dependencies

* Adds rate adapted performance simulation to benchmarks

* BUGFIX missing reference in saturate. This led to worse FER because saturation was not doing anything

* Assume stored zero-based indices for rate adaption

* Enables reading row indices from files for rate adaption

* Adds rate adapted FER simulation and dichotomic average rate search.

* bugfix: no longer skip first line of rate adaptive data csv file

* bugfix console output in main fer simulation

* bugfix miscounting frame errors on convergence

* Implements a check for variable node elimination on code construction.

* sorts variable node arrays after rate adaption

* uses stop at n errors parameter in rate adapted fer simulation

* Update runtime benchmarks, add rate adaption benchmark

* Allows VN clashes during rate adaption. Duplicates removed manually

Note: benchmarks show that rate adaption takes less than 1/10 of decoding time. While this change introduces some overhead, it should not be critical and can be addressed later. Relevant places in the code are marked with TODO

* Adds csv files specifying rate adaption for each given LDPC matrix

* Adds Julia function to convert all QC-cscmat files to cscmat files. Not part of command line interface yet

* Updates documentation

* Allows disabling console output by setting update console every N frames to zero

* main_rate_adapted_simulation uses bisection to find critical rate
XQP-Munich committed Sep 22, 2021
1 parent d719cc9 commit eb4fc12
Showing 25 changed files with 535,353 additions and 491 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -3,6 +3,7 @@
cmake-build-debug/
cmake-build-release/
cmake-build-debug-coverage/
cmake-build-relwithdebinfo/

# VSCode
.vscode
63 changes: 41 additions & 22 deletions README.md
@@ -7,57 +7,76 @@ Note: This repository is still missing some benchmarks and documentation, which

## Overview

This repository aims to solve the following problem. Suppose Alice and Bob each have a bit-string of length _N_.
Bob's bit-string is the result of randomly flipping each bit of Alice with probability _p_, i.e., it is the output of a binary symmetric channel with **channel parameter** _p_.
(Note: the same considerations apply for soft inputs instead of bit-strings and a Gaussian channel.) The goal is for Alice to send (via a noise-less channel) to Bob a message (called syndrome), such that Bob can recover Alice's bit-string given his bit-string and the syndrome received from Alice. There are two important metrics, which we want to minimize:

- the probability that Bob will fail to recover Alice's bit-string, which is called the frame error rate (FER)
- the length of the syndrome

We define the syndrome to be the matrix-vector product (modulo 2) of a sparse binary matrix (LDPC matrix) and Alice's bit-string.
In this case, Bob can use an inference algorithm (loopy belief propagation) to obtain a guess of Alice's bit-string. If the LDPC matrix has size _M_ x _N_, we call _M_ / _N_ the **rate** of the LDPC matrix and _N_ the **block size**.
(Note: the term rate is used differently in forward error correction, where it means 1 - _M_ / _N_.) Minimizing the length of the syndrome actually means minimizing the rate for a given channel parameter.
Note that, for any given channel parameter, there is a trade-off between small rate, small FER and small block size.
Furthermore, there are theoretical limits as to how small the rate can be (the Slepian-Wolf limit, sometimes called Shannon limit, applies to asymptotically large block size.
Limits for finite block sizes also exist but are more complicated).

### Contrast with forward error correction

Contrary to how forward error correction works, generator matrices for the LDPC code are not used at all for distributed source coding.
This repository does not provide generator matrices (calculating them from the parity check matrices is straightforward).
Our decoder implementation operates on a bit-string (noisy version of true message) and its correct syndrome.
This is slightly different from what is used for forward error correction (as in e.g. [AFF3CT](https://github.com/aff3ct/aff3ct)), where the decoder operates on only the noisy codeword.
The noisy codeword is the result of transmitting the codeword (true message encoded using a generator matrix) via a noisy channel.


## How to use
In order to use all the functionality provided in this repository, install
- CMake (at least version 3.19)
- C++ compiler (supporting C++17)
- Julia (at least version 1.6)

For how to use the provided Julia code, see directory `codes`. No familiarity with the Julia programming language is required.

To build the C++ executables (except for unit tests) using CMake (Julia not required), execute

git clone <github url>
cd LDPC4QKD
cmake -S . -B build -DBUILD_UNIT_TESTS=OFF
cmake --build build --config Release
All executables will be built inside the `build` folder.

For a demo of how to use the C++ header-only library, see the `examples` directory, which shows a basic example (`demo_error_correction`) of how to use the C++ header containing the decoder. This example is built by CMake (executable `build/examples/demo_error_correction`). Note: the executable produces no output; look at the C++ source code to see how the header can be used.
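A minimal sketch of what using the decoder header looks like, pieced together from the benchmark sources in this commit (the template defaults and the embedded-matrix header name are assumptions; treat this as an illustrative sketch rather than a verified example):

```cpp
// Sketch only: assumes an autogenerated matrix header produced by the Julia code in `codes`.
#include <cstdint>
#include <vector>
#include "rate_adaptive_code.hpp"
#include "autogen_ldpc_matrix_csc.hpp"

int main() {
    // Build the code from embedded compressed-sparse-column data.
    std::vector<std::uint32_t> colptr(AutogenLDPC::colptr.begin(), AutogenLDPC::colptr.end());
    std::vector<std::uint16_t> row_idx(AutogenLDPC::row_idx.begin(), AutogenLDPC::row_idx.end());
    LDPC4QKD::RateAdaptiveCode<bool> H(colptr, row_idx);

    std::vector<bool> x(H.getNCols());                          // Alice's bit-string
    std::vector<bool> syndrome(H.get_n_rows_mother_matrix());
    H.encode_no_ra(x, syndrome);                                // syndrome sent to Bob

    std::vector<double> llrs(H.getNCols(), 2.0);                // Bob's channel log-likelihood ratios
    std::vector<bool> x_hat(H.getNCols());
    bool converged = H.decode_at_current_rate(llrs, syndrome, x_hat);
    (void)converged;  // on success, x_hat should equal x
}
```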

## How to contribute
This repository is actively maintained. Issues and pull requests will be responded to and processed.

Let us know if you're having problems with the provided materials, wish to contribute new ideas or need modifications of our work for your application.

## List of contents

### Data files
- A number of LDPC codes (a list and the actual LDPC matrices are in the folder `codes`). Their parity check matrices are stored in a custom file format (called `CSCMAT`).
- Simulation results done using [AFF3CT](https://github.com/aff3ct/aff3ct), showing FER of the LDPC matrices at various channel parameters.
- Simulation results done using the decoder in this repository, showing FER of LDPC matrices, their rate adapted versions, and average rate under rate adaption (Work in progress!).
- For each LDPC matrix, a specification of rate adaption. This is a list of pairs of row indices of the matrix that are combined (added mod 2) in each rate adaption step.

### Julia code
- Contained in the folder `codes`.
- Enables loading the LDPC matrices from `CSCMAT` files
- Enables saving the compressed sparse column (CSC) representation and also exporting to other standard formats, such as `alist`.

### C++ code
- Basic LDPC decoder using belief propagation (BP). Contained in a single header file (`src/rate_adaptive_code.hpp`) and easy to use. Can perform syndrome computation as well. The LDPC code can be embedded into the executable or loaded from a file at program runtime.
- Utility functions for reading LDPC matrices (from `.cscmat` files) and rate adaption (from `.csv` files). These functions are contained in `src/read_cscmat_format.hpp`. Note that these functions require the fully explicit binary form of the LDPC matrix. The LDPC codes given inside `codes/ldpc` use an even more memory-efficient storage format.
- For applications that only require syndrome computation but no decoding, we provide a separate implementation for multiplication of a sparse binary matrix and a dense binary vector (LDPC syndrome computation). This is also contained in a single header file (`src/encoder.hpp`). This is a very specific application, which you probably don't care about initially.
- An LDPC matrix can be stored within the executable. It is included as a header file defining constant data (for example `tests/fortest_autogen_ldpc_matrix_csc.hpp`). Julia code (see folder `codes`) is provided to generate such a C++ header file for any LDPC matrix.


## Planned features and improvements
- Code to automatically generate reports on quality of all LDPC codes and rate adapted performance (work in progress)
- More and better LDPC codes with more sizes and rates
- The BP implementation is not state-of-the-art or heavily optimized. We may improve it in the future.
- Using compressed sparse row format may have advantages over the current compressed sparse column format. This needs to be tested.
- The quasi-cyclic structure of LDPC matrices can be exploited to store the matrices more efficiently. For the encoder, this may be combinable with storing individual bits as integers rather than booleans, again saving some memory. The encoding in this case would consist of adding bit-shifted integers.


## Appendix

### Table of LDPC matrices

See directory `codes`.
49 changes: 48 additions & 1 deletion benchmarks/CMakeLists.txt
@@ -4,6 +4,7 @@ cmake_minimum_required(VERSION 3.19)

project(ErrorCorrectionDemoForQKD)

# google benchmark library. Only required for the targets called "benchmark", i.e., for runtime speed benchmarking.
find_package(benchmark REQUIRED)

# --------------------------------------------------------------------------------------------------- Encoder Benchmarks
@@ -38,9 +39,25 @@ target_include_directories(benchmark_decoder
PRIVATE
)

# --------------------------------------------------------------------------------------------- Rate Adaption Benchmarks
add_executable(benchmark_ra main_benchmark_ra.cpp
)

target_compile_features(benchmark_ra PUBLIC cxx_std_17)

target_link_libraries(benchmark_ra
PRIVATE
LDPC4QKD::LDPC4QKD
benchmark::benchmark
)

target_include_directories(benchmark_ra
PRIVATE
)

# ------------------------------------------------------------------------------------------ Frame error rate simulation
add_executable(frame_error_rate_simulation main_frame_error_rate_simulation.cpp
        code_simulation_helpers.hpp)

target_compile_features(frame_error_rate_simulation PUBLIC cxx_std_17)

@@ -52,3 +69,33 @@ target_link_libraries(frame_error_rate_simulation
target_include_directories(frame_error_rate_simulation
PRIVATE
)

# ---------------------------------------------------------------------------------- Rate adapted performance simulation
add_executable(rate_adapted_simulation main_rate_adapted_simulation.cpp
code_simulation_helpers.hpp)

target_compile_features(rate_adapted_simulation PUBLIC cxx_std_17)

target_link_libraries(rate_adapted_simulation
PRIVATE
LDPC4QKD::LDPC4QKD
)

target_include_directories(rate_adapted_simulation
PRIVATE
)

# ------------------------------------------------------------------------------------------ Rate adapted fer simulation
add_executable(rate_adapted_fer main_rate_adapted_fer.cpp
code_simulation_helpers.hpp)

target_compile_features(rate_adapted_fer PUBLIC cxx_std_17)

target_link_libraries(rate_adapted_fer
PRIVATE
LDPC4QKD::LDPC4QKD
)

target_include_directories(rate_adapted_fer
PRIVATE
)
70 changes: 70 additions & 0 deletions benchmarks/code_simulation_helpers.hpp
@@ -0,0 +1,70 @@
//
// Created by alice on 08.09.21.
//

#ifndef LDPC4QKD_CODE_SIMULATION_HELPERS_HPP
#define LDPC4QKD_CODE_SIMULATION_HELPERS_HPP

#include <cmath>
#include <random>
#include <string>
#include <vector>

#include "read_cscmat_format.hpp"

namespace LDPC4QKD::CodeSimulationHelpers {
template<typename T>
void noise_bitstring_inplace(std::mt19937_64 &rng, std::vector<T> &src, double err_prob) {
std::bernoulli_distribution distribution(err_prob);

for (std::size_t i = 0; i < src.size(); i++) {
if (distribution(rng)) {
src[i] = !src[i]; // flip the bit with probability err_prob
}
}
}


// Shannon binary entropy (only valid for p strictly between 0 and 1; returns NaN at p = 0 or 1)
template<typename T>
double h2(T p) {
return -p * ::log2(p) - (1 - p) * ::log2(1 - p);
}

template<typename T>
double avg(const std::vector<T> &in) {
double tmp{};
for (auto i : in) {
tmp += static_cast<double>(i);
}
return tmp / in.size();
}


/// WARNING: if the templated types are too small, the numbers in the files are static_cast down!
template<typename Bit=bool, // type of matrix entries (only values zero and one are used, default to bool)
typename colptr_t=std::uint32_t, // integer type that fits ("number of non-zero matrix entries" + 1)
typename idx_t=std::uint32_t>
LDPC4QKD::RateAdaptiveCode<Bit, colptr_t, idx_t> get_ldpc_code_nora(const std::string &cscmat_file_path) {
auto pair = LDPC4QKD::read_matrix_from_cscmat<colptr_t, idx_t>(cscmat_file_path);
auto colptr = pair.first;
auto row_idx = pair.second;

return LDPC4QKD::RateAdaptiveCode<Bit, colptr_t, idx_t>(colptr, row_idx);
}

/// WARNING: if the templated types are too small, the numbers in the files are static_cast down!
template<typename Bit=bool, // type of matrix entries (only values zero and one are used, default to bool)
typename colptr_t=std::uint32_t, // integer type that fits ("number of non-zero matrix entries" + 1)
typename idx_t=std::uint32_t>
LDPC4QKD::RateAdaptiveCode<Bit, colptr_t, idx_t> get_code_big_wra(
const std::string &cscmat_file_path, const std::string &rate_adaption_file_path
) {
auto pair = LDPC4QKD::read_matrix_from_cscmat<colptr_t, idx_t>(cscmat_file_path);
auto colptr = pair.first;
auto row_idx = pair.second;

std::vector<idx_t> rows_to_combine = read_rate_adaption_from_csv<idx_t>(rate_adaption_file_path);

return LDPC4QKD::RateAdaptiveCode<Bit, colptr_t, idx_t>(colptr, row_idx, rows_to_combine);
}
}

#endif //LDPC4QKD_CODE_SIMULATION_HELPERS_HPP
11 changes: 6 additions & 5 deletions benchmarks/main_benchmark_decoder.cpp
@@ -43,17 +43,17 @@ static void BM_decode_benchmark_no_rate_adaption(benchmark::State& state) {
constexpr double bsc_err_p = 0.03; // channel error probability

// Perform setup here
auto H = get_code_big_nora();

std::vector<bool> true_codeword(H.getNCols());
std::vector<bool> predicted_codeword(H.getNCols());
std::vector<bool> syndrome(H.get_n_rows_mother_matrix());

for (auto _ : state) {
state.PauseTiming();

noise_bitstring_inplace(true_codeword, 0.5, state.range(0));
H.encode_no_ra(true_codeword, syndrome);
std::vector<bool> x_noised = true_codeword; // distorted data
noise_bitstring_inplace(x_noised, bsc_err_p);

@@ -72,15 +72,16 @@ static void BM_decode_benchmark_no_rate_adaption(benchmark::State& state) {
benchmark::DoNotOptimize(success);

// This code gets timed
success = H.decode_at_current_rate(llrs, syndrome, predicted_codeword);
if (!success) {
std::cout << "DECODER FAILED!!!!";
}

benchmark::ClobberMemory();
}
}
BENCHMARK(BM_decode_benchmark_no_rate_adaption)->Unit(benchmark::kMicrosecond)->Range(0, 1<<16);


// Run the benchmark
BENCHMARK_MAIN();
42 changes: 42 additions & 0 deletions benchmarks/main_benchmark_ra.cpp
@@ -0,0 +1,42 @@
//
// Created by alice on 20.09.21.
//

// Standard library
#include <iostream>
#include <random>

// Google Benchmark library
#include <benchmark/benchmark.h>

// Project scope
#include "rate_adaptive_code.hpp"
// Test cases test against constants known to be correct for the LDPC-matrix defined here:
#include "autogen_ldpc_matrix_csc.hpp"
#include "autogen_rate_adaption.hpp"


LDPC4QKD::RateAdaptiveCode<bool> get_code_big_wra() {
std::vector<std::uint32_t> colptr(AutogenLDPC::colptr.begin(), AutogenLDPC::colptr.end());
std::vector<std::uint16_t> row_idx(AutogenLDPC::row_idx.begin(), AutogenLDPC::row_idx.end());
std::vector<std::uint16_t> rows_to_combine(AutogenRateAdapt::rows.begin(), AutogenRateAdapt::rows.end());
return LDPC4QKD::RateAdaptiveCode<bool>(colptr, row_idx, rows_to_combine);
}


static void BM_decode_benchmark_set_rate(benchmark::State& state) {
for (auto _ : state) {
state.PauseTiming();
auto H = get_code_big_wra();
state.ResumeTiming();

benchmark::DoNotOptimize(H.getPosVarn());
H.set_rate(state.range(0));
benchmark::ClobberMemory();
}
}
BENCHMARK(BM_decode_benchmark_set_rate)->Unit(benchmark::kMicrosecond)->Range(0, 1024);


// Run the benchmark
BENCHMARK_MAIN();