Compile Time String Encryption Using Constexpr

Compile time string encryption using constexpr

Here's how I would do it:

1.) Use the str_const template for constexpr string manipulation described here: Conveniently Declaring Compile-Time Strings in C++

Code:

#include <cstddef>   // std::size_t
#include <stdexcept> // std::out_of_range

class str_const {
// constexpr string
private:
const char * const p_;
const std::size_t sz_;

public:
template <std::size_t N>
constexpr str_const(const char(&a)[N])
: p_(a)
, sz_(N - 1)
{}

constexpr char operator[](std::size_t n) const { return n < sz_ ? p_[n] : throw std::out_of_range(""); }
constexpr std::size_t size() const { return sz_; }

constexpr const char * get() const { return p_; }
};

This lets you do things like str_const message = "Invalid license" and manipulate message in constexpr functions.
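To make "manipulate message in constexpr functions" concrete, here is a minimal sketch. The helper count_char is a hypothetical name introduced only for this illustration (it is not part of the linked article), and the class is a compact copy of the one above so that the snippet compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Compact copy of the str_const class shown above.
class str_const {
    const char* const p_;
    const std::size_t sz_;
public:
    template <std::size_t N>
    constexpr str_const(const char (&a)[N]) : p_(a), sz_(N - 1) {}
    constexpr char operator[](std::size_t n) const {
        return n < sz_ ? p_[n] : throw std::out_of_range("");
    }
    constexpr std::size_t size() const { return sz_; }
};

// Hypothetical helper: counts how often c occurs in s from index i onwards.
// It is a single return statement, so it is a valid C++11 constexpr function.
constexpr std::size_t count_char(str_const s, char c, std::size_t i = 0) {
    return i == s.size() ? 0
                         : (s[i] == c ? 1 : 0) + count_char(s, c, i + 1);
}

constexpr str_const message = "Invalid license";

// Both checks are evaluated by the compiler, not at run time.
static_assert(message.size() == 15, "size() excludes the terminator");
static_assert(count_char(message, 'i') == 2, "lowercase 'i' occurs twice");
```

An out-of-range index inside a constant expression reaches the throw branch, which is not allowed there, so the mistake shows up as a compile error.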

2.) Make a simple compile-time pseudorandom generator, using the macros __TIME__ and __LINE__ to generate the seed. This is described in detail here: Generate random numbers in C++ at compile time

They give some template-based code.
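A rough sketch of such a generator, written with constexpr functions instead of templates. The names DEMO_SEED, lcg_next and lcg_nth are made up for this illustration; the constants 16807 and 2^31 - 1 are the Park-Miller "minimal standard" parameters that the full source further down also uses.

```cpp
#include <cassert>
#include <cstdint>

// Seed built from the "hh:mm:ss" build time plus the line of the expansion.
#define DEMO_SEED                                                 \
    static_cast<std::uint32_t>(                                   \
        (__TIME__[7] - '0') + (__TIME__[6] - '0') * 10 +          \
        (__TIME__[4] - '0') * 60 + (__TIME__[3] - '0') * 600 +    \
        (__TIME__[1] - '0') * 3600 + (__TIME__[0] - '0') * 36000 + \
        __LINE__ * 100000)

// One step of the generator: s * 16807 mod (2^31 - 1), done in 64 bits.
constexpr std::uint32_t lcg_next(std::uint32_t s) {
    return static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(s) * 16807u) % 2147483647u);
}

// Output number n (counting from 0) of the sequence that starts at seed.
constexpr std::uint32_t lcg_nth(std::uint32_t seed, unsigned n) {
    return n == 0 ? lcg_next(seed) : lcg_nth(lcg_next(seed), n - 1);
}

// Values that differ from build to build, fixed at compile time.
constexpr std::uint32_t build_seed = DEMO_SEED;
constexpr unsigned char first_noise =
    static_cast<unsigned char>(lcg_nth(build_seed, 0));

static_assert(lcg_next(1) == 16807u, "first step from seed 1");
```

Because __LINE__ is part of the seed, two uses on different source lines get different sequences even within the same build.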

3.) Make a struct with a constexpr ctor which either takes a const char [] and templates itself against the size, similarly to the str_const example, or just takes a str_const, and which generates two str_const objects that become its member variables.

  • A str_const of length n containing pseudorandom unsigned chars, generated using the pseudorandom generator, where n is the length of the input. (the "noise string")
  • A str_const of length n containing the entry-wise sum (as unsigned chars) of the input characters with the noise characters. (the "cipher text")

Then it has a member function decrypt which need not be constexpr, and can return a std::string, which simply subtracts each character of the noise string from the corresponding character of the cipher text and returns the resulting string.
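The arithmetic behind the noise string and the cipher text can be illustrated at run time before doing it at compile time. This is only a sketch of the add/subtract step: the function names add_noise and remove_noise are invented here, and the noise is passed in as plain data instead of coming from a compile-time generator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef unsigned char uchar;

// cipher[i] = plain[i] + noise[i], wrapping around modulo 256.
// Assumes noise holds at least plain.size() entries.
std::vector<uchar> add_noise(const std::string& plain,
                             const std::vector<uchar>& noise) {
    std::vector<uchar> cipher(plain.size());
    for (std::size_t i = 0; i < plain.size(); ++i)
        cipher[i] = static_cast<uchar>(static_cast<uchar>(plain[i]) + noise[i]);
    return cipher;
}

// plain[i] = cipher[i] - noise[i], again modulo 256, which undoes add_noise.
std::string remove_noise(const std::vector<uchar>& cipher,
                         const std::vector<uchar>& noise) {
    std::string plain(cipher.size(), '\0');
    for (std::size_t i = 0; i < cipher.size(); ++i)
        plain[i] = static_cast<char>(static_cast<uchar>(cipher[i] - noise[i]));
    return plain;
}
```

For example, 'a' (97) plus a noise byte of 200 wraps to 41, and 41 minus 200 wraps back to 97, so the subtraction always recovers the input as long as both sides use the same noise.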

If your compiler is still storing the original string literal in the binary, it means that either it's storing the input string literal (the constructor argument), which I don't think it should be doing since it's a temporary, or it's basically inlining the decrypt function. You should be able to prevent the latter by obfuscating it with function pointers, or marking it volatile or similar.

Edit: I'm not sure if the standard requires that temporary constexpr objects should not appear in the binary. Actually I'm curious about that now. My expectation is that at least in a release build, a good compiler should remove them when they are no longer needed.

Edit: So, you already accepted my answer. But anyway, for completeness, here's some source code that implements the above ideas, using only the C++11 standard. It works on gcc-4.9 and clang-3.6, even when optimizations are disabled, as nearly as I can tell.

#include <array>
#include <cstddef>  // std::size_t
#include <cstdint>  // uint32_t, uint64_t
#include <iostream>
#include <string>

typedef uint32_t u32;
typedef uint64_t u64;
typedef unsigned char uchar;

template<u32 S, u32 A = 16807UL, u32 C = 0UL, u32 M = (1UL<<31)-1>
struct LinearGenerator {
static const u32 state = ((u64)S * A + C) % M;
static const u32 value = state;
typedef LinearGenerator<state> next;
struct Split { // Leapfrog
typedef LinearGenerator< state, A*A, 0, M> Gen1;
typedef LinearGenerator<next::state, A*A, 0, M> Gen2;
};
};

// Metafunction to get a particular index from generator
template<u32 S, std::size_t index>
struct Generate {
static const uchar value = Generate<LinearGenerator<S>::state, index - 1>::value;
};

template<u32 S>
struct Generate<S, 0> {
static const uchar value = static_cast<uchar> (LinearGenerator<S>::value);
};

// List of indices
template<std::size_t...>
struct StList {};

// Concatenate
template<typename TL, typename TR>
struct Concat;

template<std::size_t... SL, std::size_t... SR>
struct Concat<StList<SL...>, StList<SR...>> {
typedef StList<SL..., SR...> type;
};

template<typename TL, typename TR>
using Concat_t = typename Concat<TL, TR>::type;

// Count from zero to n-1
template<size_t s>
struct Count {
typedef Concat_t<typename Count<s-1>::type, StList<s-1>> type;
};

template<>
struct Count<0> {
typedef StList<> type;
};

template<size_t s>
using Count_t = typename Count<s>::type;

// Get a scrambled character of a string
template<u32 seed, std::size_t index, std::size_t N>
constexpr uchar get_scrambled_char(const char(&a)[N]) {
return static_cast<uchar>(a[index]) + Generate<seed, index>::value;
}

// Get a ciphertext from a plaintext string
template<u32 seed, typename T>
struct cipher_helper;

template<u32 seed, std::size_t... SL>
struct cipher_helper<seed, StList<SL...>> {
static constexpr std::array<uchar, sizeof...(SL)> get_array(const char (&a)[sizeof...(SL)]) {
return {{ get_scrambled_char<seed, SL>(a)... }};
}
};

template<u32 seed, std::size_t N>
constexpr std::array<uchar, N> get_cipher_text (const char (&a)[N]) {
return cipher_helper<seed, Count_t<N>>::get_array(a);
}

// Get a noise sequence from a seed and string length
template<u32 seed, typename T>
struct noise_helper;

template<u32 seed, std::size_t... SL>
struct noise_helper<seed, StList<SL...>> {
static constexpr std::array<uchar, sizeof...(SL)> get_array() {
return {{ Generate<seed, SL>::value ... }};
}
};

template<u32 seed, std::size_t N>
constexpr std::array<uchar, N> get_key() {
return noise_helper<seed, Count_t<N>>::get_array();
}

/*
// Get an unscrambled character of a string
template<u32 seed, std::size_t index, std::size_t N>
char get_unscrambled_char(const std::array<uchar, N> & a) {
return static_cast<char> (a[index] - Generate<seed, index>::value);
}
*/

// Metafunction to get the size of an array
template<typename T>
struct array_info;

template <typename T, size_t N>
struct array_info<T[N]>
{
typedef T type;
enum { size = N };
};

template <typename T, size_t N>
struct array_info<const T(&)[N]> : array_info<T[N]> {};

// Scramble a string
template<u32 seed, std::size_t N>
class obfuscated_string {
private:
std::array<uchar, N> cipher_text_;
std::array<uchar, N> key_;
public:
explicit constexpr obfuscated_string(const char(&a)[N])
: cipher_text_(get_cipher_text<seed, N>(a))
, key_(get_key<seed,N>())
{}

operator std::string() const {
char plain_text[N];
for (volatile std::size_t i = 0; i < N; ++i) {
volatile char temp = static_cast<char>( cipher_text_[i] - key_[i] );
plain_text[i] = temp;
}
return std::string{plain_text, plain_text + (N - 1)};///We do not copy the termination character
}
};

template<u32 seed, std::size_t N>
std::ostream & operator<< (std::ostream & s, const obfuscated_string<seed, N> & str) {
s << static_cast<std::string>(str);
return s;
}

// The whole expression is parenthesized so the macro stays safe inside larger expressions
#define RNG_SEED (((__TIME__[7] - '0') * 1 + (__TIME__[6] - '0') * 10 + \
(__TIME__[4] - '0') * 60 + (__TIME__[3] - '0') * 600 + \
(__TIME__[1] - '0') * 3600 + (__TIME__[0] - '0') * 36000) + \
(__LINE__ * 100000))

#define LIT(STR) \
obfuscated_string<RNG_SEED, array_info<decltype(STR)>::size>{STR}

auto S2 = LIT(("Hewwo, I'm hunting wabbits"));

int main() {
constexpr auto S1 = LIT(("What's up doc"));
std::cout << S1 << std::endl;
std::cout << S2 << std::endl;
}

Compile-time string encryption

A perfect solution does exist; here it is.

I also thought this wasn't possible, even though it's very simple. People wrote solutions where you need a custom tool that scans the built file afterwards, finds the strings and encrypts them. That wasn't bad, but I wanted a package that's compiled straight from Visual Studio, and that's possible now!

What you need is C++11 (Visual Studio 2015 Update 1 supports it out of the box).

The magic happens with the new constexpr keyword.

More precisely, the magic happens in this #define:

#define XorString( String ) ( CXorString<ConstructIndexList<sizeof( String ) - 1>::Result>( String ).decrypt() )

It won't decrypt the XorString at compile time, only at run time, but it encrypts the string at compile time, so the strings will not appear in the executable file.

printf(XorString( "this string is hidden!" ));

It will print out "this string is hidden!", but you won't find that string inside the executable file. Check it yourself with the Microsoft Sysinternals Strings program; download link: https://technet.microsoft.com/en-us/sysinternals/strings.aspx

The full source code is quite large but can easily be included in one header file. It is also quite random: the encrypted output changes with every new compile, because the seed is derived from the time of compilation (__TIME__). Pretty much a solid, perfect solution.

Create a file called XorString.h

#pragma once

//-------------------------------------------------------------//
// "Malware related compile-time hacks with C++11" by LeFF //
// You can use this code however you like, I just don't really //
// give a shit, but if you feel some respect for me, please //
// don't cut off this comment when copy-pasting... ;-) //
//-------------------------------------------------------------//

////////////////////////////////////////////////////////////////////
template <int X> struct EnsureCompileTime {
enum : int {
Value = X
};
};
////////////////////////////////////////////////////////////////////

////////////////////////////////////////////////////////////////////
//Use Compile-Time as seed
#define Seed ((__TIME__[7] - '0') * 1 + (__TIME__[6] - '0') * 10 + \
(__TIME__[4] - '0') * 60 + (__TIME__[3] - '0') * 600 + \
(__TIME__[1] - '0') * 3600 + (__TIME__[0] - '0') * 36000)
////////////////////////////////////////////////////////////////////

////////////////////////////////////////////////////////////////////
constexpr int LinearCongruentGenerator(int Rounds) {
return 1013904223 + 1664525 * ((Rounds> 0) ? LinearCongruentGenerator(Rounds - 1) : Seed & 0xFFFFFFFF);
}
#define Random() EnsureCompileTime<LinearCongruentGenerator(10)>::Value //10 Rounds
#define RandomNumber(Min, Max) (Min + (Random() % (Max - Min + 1)))
////////////////////////////////////////////////////////////////////

////////////////////////////////////////////////////////////////////
template <int... Pack> struct IndexList {};
////////////////////////////////////////////////////////////////////

////////////////////////////////////////////////////////////////////
template <typename IndexList, int Right> struct Append;
template <int... Left, int Right> struct Append<IndexList<Left...>, Right> {
typedef IndexList<Left..., Right> Result;
};
////////////////////////////////////////////////////////////////////

////////////////////////////////////////////////////////////////////
template <int N> struct ConstructIndexList {
typedef typename Append<typename ConstructIndexList<N - 1>::Result, N - 1>::Result Result;
};
template <> struct ConstructIndexList<0> {
typedef IndexList<> Result;
};
////////////////////////////////////////////////////////////////////

////////////////////////////////////////////////////////////////////
const char XORKEY = static_cast<char>(RandomNumber(0, 0xFF));
constexpr char EncryptCharacter(const char Character, int Index) {
return Character ^ (XORKEY + Index);
}

template <typename IndexList> class CXorString;
template <int... Index> class CXorString<IndexList<Index...> > {
private:
char Value[sizeof...(Index) + 1];
public:
constexpr CXorString(const char* const String)
: Value{ EncryptCharacter(String[Index], Index)... } {}

char* decrypt() {
for(int t = 0; t < sizeof...(Index); t++) {
Value[t] = Value[t] ^ (XORKEY + t);
}
Value[sizeof...(Index)] = '\0';
return Value;
}

char* get() {
return Value;
}
};
#define XorS(X, String) CXorString<ConstructIndexList<sizeof(String)-1>::Result> X(String)
#define XorString( String ) ( CXorString<ConstructIndexList<sizeof( String ) - 1>::Result>( String ).decrypt() )
////////////////////////////////////////////////////////////////////

UPDATED CODE BELOW. This is a better version that supports both char and wchar_t strings! Note that it uses std::index_sequence and decltype(auto), so it needs C++14 rather than C++11.

#pragma once
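#include <string>
#include <array>
#include <cstdarg>
#include <cstddef>  // size_t
#include <cstdio>   // vprintf_s, vsprintf, vsprintf_s
#include <utility>  // std::index_sequence, std::make_index_sequence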

#define BEGIN_NAMESPACE( x ) namespace x {
#define END_NAMESPACE }

BEGIN_NAMESPACE(XorCompileTime)

constexpr auto time = __TIME__;
constexpr auto seed = static_cast< int >(time[7]) + static_cast< int >(time[6]) * 10 + static_cast< int >(time[4]) * 60 + static_cast< int >(time[3]) * 600 + static_cast< int >(time[1]) * 3600 + static_cast< int >(time[0]) * 36000;

// 1988, Stephen Park and Keith Miller
// "Random Number Generators: Good Ones Are Hard To Find", considered as "minimal standard"
// Park-Miller 31 bit pseudo-random number generator, implemented with G. Carta's optimisation:
// with 32-bit math and without division

template < int N >
struct RandomGenerator
{
private:
static constexpr unsigned a = 16807; // 7^5
static constexpr unsigned m = 2147483647; // 2^31 - 1

static constexpr unsigned s = RandomGenerator< N - 1 >::value;
static constexpr unsigned lo = a * (s & 0xFFFF); // Multiply lower 16 bits by 16807
static constexpr unsigned hi = a * (s >> 16); // Multiply higher 16 bits by 16807
static constexpr unsigned lo2 = lo + ((hi & 0x7FFF) << 16); // Combine lower 15 bits of hi with lo's upper bits
static constexpr unsigned hi2 = hi >> 15; // Discard lower 15 bits of hi
static constexpr unsigned lo3 = lo2 + hi;

public:
static constexpr unsigned max = m;
static constexpr unsigned value = lo3 > m ? lo3 - m : lo3;
};

template <>
struct RandomGenerator< 0 >
{
static constexpr unsigned value = seed;
};

template < int N, int M >
struct RandomInt
{
static constexpr auto value = RandomGenerator< N + 1 >::value % M;
};

template < int N >
struct RandomChar
{
static const char value = static_cast< char >(1 + RandomInt< N, 0x7F - 1 >::value);
};

template < size_t N, int K, typename Char >
struct XorString
{
private:
const char _key;
std::array< Char, N + 1 > _encrypted;

constexpr Char enc(Char c) const
{
return c ^ _key;
}

Char dec(Char c) const
{
return c ^ _key;
}

public:
template < size_t... Is >
constexpr __forceinline XorString(const Char* str, std::index_sequence< Is... >) : _key(RandomChar< K >::value), _encrypted{ enc(str[Is])... }
{
}

__forceinline decltype(auto) decrypt(void)
{
for (size_t i = 0; i < N; ++i) {
_encrypted[i] = dec(_encrypted[i]);
}
_encrypted[N] = '\0';
return _encrypted.data();
}
};

//--------------------------------------------------------------------------------
//-- Note: XorStr will __NOT__ work directly with functions like printf.
// To work with them you need a wrapper function that takes a const char*
// as parameter and passes it to printf and alike.
//
// The Microsoft Compiler/Linker is not working correctly with variadic
// templates!
//
// Use the functions below or use std::cout (and similar)!
//--------------------------------------------------------------------------------

static auto w_printf = [](const char* fmt, ...) {
va_list args;
va_start(args, fmt);
vprintf_s(fmt, args);
va_end(args);
};

static auto w_printf_s = [](const char* fmt, ...) {
va_list args;
va_start(args, fmt);
vprintf_s(fmt, args);
va_end(args);
};

static auto w_sprintf = [](char* buf, const char* fmt, ...) {
va_list args;
va_start(args, fmt);
vsprintf(buf, fmt, args);
va_end(args);
};

static auto w_sprintf_ret = [](char* buf, const char* fmt, ...) {
int ret;
va_list args;
va_start(args, fmt);
ret = vsprintf(buf, fmt, args);
va_end(args);
return ret;
};

static auto w_sprintf_s = [](char* buf, size_t buf_size, const char* fmt, ...) {
va_list args;
va_start(args, fmt);
vsprintf_s(buf, buf_size, fmt, args);
va_end(args);
};

static auto w_sprintf_s_ret = [](char* buf, size_t buf_size, const char* fmt, ...) {
int ret;
va_list args;
va_start(args, fmt);
ret = vsprintf_s(buf, buf_size, fmt, args);
va_end(args);
return ret;
};

//Old functions before I found out about wrapper functions.
//#define XorStr( s ) ( XorCompileTime::XorString< sizeof(s)/sizeof(char) - 1, __COUNTER__, char >( s, std::make_index_sequence< sizeof(s)/sizeof(char) - 1>() ).decrypt() )
//#define XorStrW( s ) ( XorCompileTime::XorString< sizeof(s)/sizeof(wchar_t) - 1, __COUNTER__, wchar_t >( s, std::make_index_sequence< sizeof(s)/sizeof(wchar_t) - 1>() ).decrypt() )

//Wrapper functions to work in all functions below
#define XorStr( s ) []{ constexpr XorCompileTime::XorString< sizeof(s)/sizeof(char) - 1, __COUNTER__, char > expr( s, std::make_index_sequence< sizeof(s)/sizeof(char) - 1>() ); return expr; }().decrypt()
#define XorStrW( s ) []{ constexpr XorCompileTime::XorString< sizeof(s)/sizeof(wchar_t) - 1, __COUNTER__, wchar_t > expr( s, std::make_index_sequence< sizeof(s)/sizeof(wchar_t) - 1>() ); return expr; }().decrypt()

END_NAMESPACE

Compile time function encryption

Standard C++ does not allow accessing code as data.

That is,

int f(int);

*reinterpret_cast<int*>(&f) = 1;

is not valid: you cannot access "code" as data.
Naturally, you cannot access code as data at compile time either.
So you can neither encrypt nor decrypt your function.

Still, there are some tools that actually do this, but they rely on implementation-specific behavior at runtime. At compile time they just add an additional step, which is usually not known to the compiler and happens after compilation by tampering with the compiler output.

Something may still work in portable C++, at least in theory. It is not what you want, but it is the "compile time function encryption" you're asking for.

If you define some grammar for your functions so that you can parse, say, const char* function = "(a, b, c) { return a + b * c; }", and you add a constexpr encrypting function, then your program will contain a function that is encrypted at compile time and can be decrypted before execution.

Of course, the Standard does not require that calling a constexpr function to produce static initialization data actually happens at compile time, but it is something to expect from a good implementation.

How to encrypt strings at compile time?

You can encrypt it using macros, or write your own preprocessor.

#define CRYPT8(str) { CRYPT8_(str "\0\0\0\0\0\0\0\0") }
#define CRYPT8_(str) (str)[0] + 1, (str)[1] + 2, (str)[2] + 3, (str)[3] + 4, (str)[4] + 5, (str)[5] + 6, (str)[6] + 7, (str)[7] + 8, '\0'

// calling it
const char str[] = CRYPT8("ntdll");
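The answer only shows the encoding side. A matching decoder could look like the sketch below; decrypt8 is a hypothetical helper name, and the two macros are repeated unchanged so that the snippet stands on its own. It relies on the compiler treating an indexed string literal as a constant expression, which gcc and clang do in C++11 mode.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#define CRYPT8(str) { CRYPT8_(str "\0\0\0\0\0\0\0\0") }
#define CRYPT8_(str) (str)[0] + 1, (str)[1] + 2, (str)[2] + 3, (str)[3] + 4, (str)[4] + 5, (str)[5] + 6, (str)[6] + 7, (str)[7] + 8, '\0'

// Hypothetical inverse of CRYPT8: byte i had (i + 1) added, so subtract it.
// The padding zeros were shifted as well and come back as zeros here.
inline void decrypt8(const char (&in)[9], char (&out)[9]) {
    for (std::size_t i = 0; i < 8; ++i)
        out[i] = static_cast<char>(in[i] - static_cast<int>(i + 1));
    out[8] = '\0';
}

// Small self-check showing the round trip.
inline bool crypt8_round_trip() {
    const char enc[] = CRYPT8("ntdll");  // stored as "ovgpq" plus shifted padding
    char dec[9];
    decrypt8(enc, dec);
    return std::strcmp(dec, "ntdll") == 0;
}
```

Note that CRYPT8 only looks at the first eight characters, so longer strings are silently truncated, and a fixed per-position offset is obfuscation at best.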

Encrypting / obfuscating a string literal at compile-time

I think this question deserves an updated answer.

When I asked this question several years ago, I didn't consider the difference between obfuscation and encryption. Had I known this difference then, I'd have included the term Obfuscation in the title before.

C++11 and C++14 have features that make it possible to implement compile-time string obfuscation (and possibly encryption, although I haven't tried that yet) in an effective and reasonably simple way, and it's already been done.

ADVobfuscator is an obfuscation library created by Sebastien Andrivet that uses C++11/14 to generate compile-time obfuscated code without using any external tool, just C++ code. There's no need to create extra build steps, just include it and use it. I don't know a better compile-time string encryption/obfuscation implementation that doesn't use external tools or build steps. If you do, please share.

It not only obfuscates strings, but also has other useful things like a compile-time FSM (Finite State Machine) that can randomly obfuscate function calls, and a compile-time pseudo-random number generator, but these are out of the scope of this answer.

Here's a simple string obfuscation example using ADVobfuscator:

#include "MetaString.h"

using namespace std;
using namespace andrivet::ADVobfuscator;

void Example()
{
/* Example 1 */

// here, the string is compiled in an obfuscated form, and
// it's only deobfuscated at runtime, at the very moment of its use
cout << OBFUSCATED("Now you see me") << endl;

/* Example 2 */

// here, we store the obfuscated string into an object to
// deobfuscate whenever we need to
auto narrator = DEF_OBFUSCATED("Tyler Durden");

// note: although the function is named `decrypt()`, it's still deobfuscation
cout << narrator.decrypt() << endl;
}

You can replace the macros DEF_OBFUSCATED and OBFUSCATED with your own macros. E.g.:

#define _OBF(s) OBFUSCATED(s)

...

cout << _OBF("klapaucius");

How does it work?

If you take a look at the definition of these two macros in MetaString.h, you will see:

#define DEF_OBFUSCATED(str) MetaString<andrivet::ADVobfuscator::MetaRandom<__COUNTER__, 3>::value, andrivet::ADVobfuscator::MetaRandomChar<__COUNTER__>::value, Make_Indexes<sizeof(str) - 1>::type>(str)

#define OBFUSCATED(str) (DEF_OBFUSCATED(str).decrypt())

Basically, there are three different variants of the MetaString class (the core of the string obfuscation). Each has its own obfuscation algorithm. One of these three variants is chosen randomly at compile-time, using the library's pseudo-random number generator (MetaRandom), along with a random char that is used by the chosen algorithm to xor the string characters.

"Hey, but if we do the math, 3 algorithms * 255 possible char keys (0 is not used) = 765 variants of the obfuscated string"

You're right. The same string can only be obfuscated in 765 different ways. If you have a reason to need something safer (you're paranoid / your application demands increased security) you can extend the library and implement your own algorithms, using stronger obfuscation or even encryption (White-Box cryptography is in the lib's roadmap).


Where / how does it store the obfuscated strings?

One thing I find interesting about this implementation is that it doesn't store the obfuscated string in the data section of the executable.
Instead, it is statically stored into the MetaString object itself (on the stack) and the algorithm decodes it in place at runtime. This approach makes it much harder to find the obfuscated strings, statically or at runtime.

You can dive deeper into the implementation by yourself. That's a very good basic obfuscation solution and can be a starting point to a more complex one.

Compile time encryption for strings using user-defined literals

Is it even possible what I'm trying to attempt?

Yes, it is possible*. What you can pre-compute and put directly in the source code can also be done by the compiler at compile time.

However, you cannot use std::string. It's not a literal type. Something like:

 constexpr std::string tmp = "some string literal";

does not compile in C++11, C++14 or C++17, because std::string (and std::basic_string in general) has no constexpr constructor in those standards.

You must therefore use const char [] as input for your meta-programming; after that, you may assign it to a std::string.

NB: Meta-programming has some restrictions you need to take into account. You don't have access to many of the tools you'd otherwise have, such as new or malloc: you must allocate your variables on the stack.
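As a sketch of that workflow (const char [] in, fixed-size storage during constant evaluation, std::string only at run time), consider the following. It assumes C++14 relaxed constexpr, and the names xor_buffer, xor_encode and xor_decode as well as the key 0x5A are invented for this illustration. A single-byte XOR is obfuscation, not real encryption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Fixed-size storage inside the object itself: no new or malloc involved.
template <std::size_t N>
struct xor_buffer {
    char data[N];
};

// C++14 constexpr: XOR every character (terminator included) with key.
template <std::size_t N>
constexpr xor_buffer<N> xor_encode(const char (&text)[N], char key) {
    xor_buffer<N> out{};
    for (std::size_t i = 0; i < N; ++i)
        out.data[i] = static_cast<char>(text[i] ^ key);
    return out;
}

// Run-time side: undo the XOR and hand the result to a std::string.
template <std::size_t N>
std::string xor_decode(const xor_buffer<N>& in, char key) {
    std::string plain(N - 1, '\0');  // N - 1 drops the terminator
    for (std::size_t i = 0; i < N - 1; ++i)
        plain[i] = static_cast<char>(in.data[i] ^ key);
    return plain;
}

// Initializing a constexpr variable forces the encoding to happen at compile time.
constexpr auto hidden = xor_encode("some string literal", 0x5A);
static_assert(hidden.data[0] == ('s' ^ 0x5A), "encoded by the compiler");
```

Whether the plain literal is really absent from the binary still depends on the compiler and optimization level, so it is worth checking the output with a strings tool.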


*Edit: Not entirely with UDLs, as @m.s. points out. Indeed, you receive a pointer to const chars and the length of the string. This is pretty restrictive in a constexpr scenario, and I doubt it's possible to find a way to work on that string. In "normal" meta-programming, where you can have a size that is a constant expression, compile-time encryption is instead possible.

Compile-time hashing with constexpr and CryptoPP

You say compile-time. Do you really mean that? That implies the user-defined string literal is declared constexpr, which (AFAIK) is not possible (I have tried).

This leaves the route of re-implementing SHA3 hash as a constexpr template function with the following signature:

template<size_t N>
constexpr custom_digest sha3_hash(const char (&source)[N])
{
// your constexpr-friendly code goes here
}

Bear in mind that every function called by your constexpr function must also be constexpr (i.e. dealing only in literal types or constexpr user types composed therefrom).
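To show the shape of such a function without reproducing SHA3, the sketch below swaps in the much simpler 32-bit FNV-1a hash. That substitution is purely for illustration: FNV-1a is not cryptographic, custom_digest is reduced to a plain integer here, and demo_hash, fnv1a_step and fnv1a_from are made-up names. The structure (array reference in, constexpr helpers all the way down, C++11 single-return bodies) is what carries over.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in digest type; a real SHA3 digest would be an array-based literal type.
typedef std::uint32_t custom_digest;

// One FNV-1a round. It is constexpr so the functions below may call it.
constexpr custom_digest fnv1a_step(custom_digest h, char c) {
    return (h ^ static_cast<unsigned char>(c)) * 16777619u;
}

// Recursive walk over source[i .. N-2]; the terminator at N-1 is skipped.
template <std::size_t N>
constexpr custom_digest fnv1a_from(const char (&s)[N], std::size_t i,
                                   custom_digest h) {
    return i + 1 >= N ? h : fnv1a_from(s, i + 1, fnv1a_step(h, s[i]));
}

// Same signature style as the sha3_hash template above.
template <std::size_t N>
constexpr custom_digest demo_hash(const char (&source)[N]) {
    return fnv1a_from(source, 0, 2166136261u);  // FNV offset basis
}

// Checked by the compiler, so the hash is evaluated at compile time here.
static_assert(demo_hash("") == 2166136261u, "empty input gives the offset basis");
static_assert(demo_hash("a") == 0xe40c292cu, "FNV-1a of \"a\"");
```

If any helper in the chain were not constexpr, the static_assert lines would fail to compile, which is a convenient way to find the offending call.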

Yes, const char (&)[N] is a literal type.

Computing length of a C string at compile time. Is this really a constexpr?

Constant expressions are not guaranteed to be evaluated at compile time; we only have a non-normative note from the draft C++ standard, section 5.19 Constant expressions, which says:

[...] [ Note: Constant expressions can be evaluated during
translation.—end note ]

You can assign the result to a constexpr variable to be sure it is evaluated at compile time. We can see this from Bjarne Stroustrup's C++11 reference, which says (emphasis mine):

In addition to be able to evaluate expressions at compile time, we
want to be able to require expressions to be evaluated at compile
time; constexpr in front of a variable definition does that (and
implies const):

For example:

constexpr int len1 = length("abcd") ;

Bjarne Stroustrup gives a summary of when we can assure compile time evaluation in this isocpp blog entry and says:

[...]The correct answer - as stated
by Herb - is that according to the standard a constexpr function may
be evaluated at compiler time or run time unless it is used as a
constant expression, in which case it must be evaluated at
compile-time. To guarantee compile-time evaluation, we must either use
it where a constant expression is required (e.g., as an array bound or
as a case label) or use it to initialize a constexpr. I would hope
that no self-respecting compiler would miss the optimization
opportunity to do what I originally said: "A constexpr function is
evaluated at compile time if all its arguments are constant
expressions."

So this outlines two cases where it should be evaluated at compile time:

  1. Use it where a constant expression is required. This would seem to be anywhere in the draft standard where the phrase "shall be ... converted constant expression" or "shall be ... constant expression" is used, such as an array bound.
  2. Use it to initialize a constexpr as I outline above.
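Both cases can be seen side by side in a short sketch. The length function from the question is not shown in this answer, so the recursive C++11 version below is an assumed stand-in, and runtime_length is a hypothetical wrapper added only to contrast it with an ordinary call.

```cpp
#include <cassert>
#include <cstddef>

// Assumed stand-in for the question's length(): C++11 constexpr strlen.
constexpr std::size_t length(const char* s) {
    return *s == '\0' ? 0 : 1 + length(s + 1);
}

// Case 1: an array bound requires a constant expression.
char buffer[length("abcd")];

// Case 2: initializing a constexpr variable also requires one.
constexpr std::size_t len1 = length("abcd");

// static_assert is another context that requires a constant expression.
static_assert(len1 == 4, "evaluated during translation");

// An ordinary call with a run-time argument is still allowed; here the
// compiler may evaluate the function at run time.
inline std::size_t runtime_length(const char* s) {
    return length(s);
}
```

If length were called with something that is not a constant expression in case 1 or case 2, the program would be rejected at compile time, which is exactly the guarantee the two cases provide.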

