Notes on C++

August 6, 2009

Type-safe table container using variadic templates

Filed under: C++ — Tags: , , , , , , — Dmitry Duka @ 3:48 pm

What tools would you use, when you want to save a bunch of data, that can be structured as table?

SQLite and Berkeley DB have big advantage – they can save your data to file. Also they have smart cache, so when you select some data it is quite probable that it is already in memory.

Also they have big disadvantage – limited type system.
If you want to store your fine-grained C++ classes’ objects in the table, you might gonna have a story with BLOB type, which involves unsafe type conversions or object construction.

Traditional STL components allows us to mix std::tuple and std::vector to produce sorta table:


std::vector<std::tuple<T1, T2, T3>> table;

It can’t grow by columns, they are specified once at compile-time.
Each row is one object of type std::tuple<T1, T2, T3>, which can be added to vector.

Here is my implementation of this idea using C++1X feature – variadic templates, so you can construct table of unlimited number of columns with any type you want.

Variadic templates looks like this:

// The following template 'S' can accept zero or more type parameters
// S<int> still generates approximately: struct S { S(int t) { } };
// S<> generates approximately : struct S { S() { } };
// S<int, char> generates approximately : struct S { S(int t1, char t2) { } };
 
template<typename ... T> struct S { S(T...t) { } };

It allows such beautyfull things:

template <typename... BaseClasses> class ClassName : public BaseClasses...
{
public:
 
   ClassName (BaseClasses&&... baseClasses) : BaseClasses(baseClasses)... {}
}

or type-safe printf:

void printf(const char *s)
{
  while (*s)
  {
    if (*s == '%' && *(++s) != '%')
      throw std::runtime_error("invalid format string: missing arguments");
    std::cout << *s++;
  }
}
 
template<typename T, typename... Args>
void printf(const char* s, T value, Args... args)
{
  while (*s)
  {
    if (*s == '%' && *(++s) != '%')
    {
      std::cout << value;
      printf(*s ? ++s : s, args...); // call even when *s == 0 to detect extra arguments
      return;
    }
    std::cout << *s++;
  }
  throw std::logic_error("extra arguments provided to printf");
}
&#91;/sourcecode&#93;

Such approach to implement variadic functions often includes two steps:
  <strong>Define general case</strong>
  <strong>Define terminal case</strong>

<strong>General case</strong> will <em>instantiate</em> itself <em>recursively</em> (dismantling parameters pack from it's <strong>head</strong>) until <strong>terminal case</strong> occurs.

<strong>Back to tables</strong>. First, let's look how it can be used:


#include <iostream>
#include <iterator>
#include <algorithm>
#include <string>
#include <cmath>
#include <stdlib.h>
#include <time.h>

#include "common.hpp"
#include "timing.hpp"

using std::for_each;
using std::ostream_iterator;
using std::vector;
using std::string;
using std::cout;
using std::endl;

// To define columns in a type-safe manner, you must
// set column name as struct name, and define a type with
// name "type" which corresponds to type of column.
// like this:
struct id	{ typedef int			type; };
struct name	{ typedef string 		type; };
struct salary	{ typedef float			type; };

using namespace memTable;

inline void next_case() { cout << "---------------------------------------" << endl; }

int main() {
	// table will know about types of columns at compile-time
	table<id, name, salary> employes;
	
	// full insert
	// note: in all cases you can supply functions with implicitly convertible types:
	// double for column of type int - ok, but int for column of type std::string - not ok.
	employes.insert_values(1, "Dmitry Duka", 1000.0);
	employes.insert_values(2, "Other person", 900.52);
	employes.insert_values(10, "Third person", 5000.1);
	// partial insert, in this case types of unspecified columns MUST be default-constructible
	employes.insert<id, name>(20, "John Smith");
	employes.insert<name>("no id, no salary");
	employes.insert<id, salary>(40, 8000.0);	// reserved vacancy 🙂
	// pretty print
	employes.print();
	
	next_case();
	// partial select: let's get names of those persons, who have salary < 1000?
	// select will return table containing one column - name, and condition applies to column salary
	vector<string> names = employes.select<name>(conditions::less<salary>(1000.0));
	copy(names.begin(), names.end(), ostream_iterator<string>(cout, "; "));
	cout << endl;
	
	next_case();
	// partial select: let's get names and salaries of those persons, who have id >= 10?
	// select will return table containing two columns - name and salary, and condition applies to column id
	table<name, salary> res_id_more_than_10 = employes.select<name, salary>(conditions::moreeq<id>(10));
	res_id_more_than_10.print();
	
	next_case();
	// update: set salary for John Smith
	employes.update<salary>(conditions::eq<name>("John Smith"), 4000.6);
	employes.print();
	
	next_case();
	// remove: fire all with salary > 1000
	employes.remove(conditions::more<salary>(1000.0));
	employes.print();
	
	next_case();
	// clear: they all dissappeared.
	employes.clear();
	
return 0;
}

This code outputs:

javol@javol-laptop:~/compile/typesafe_table$ ./variadic
1 Dmitry Duka 1000
2 Other person 900.52
10 Third person 5000.1
20 John Smith 0
0 no id, no salary 0
40 8000
—————————————
Other person; John Smith; no id, no salary;
—————————————
Third person 5000.1
John Smith 0
8000
—————————————
1 Dmitry Duka 1000
2 Other person 900.52
10 Third person 5000.1
20 John Smith 4000.6
0 no id, no salary 0
40 8000
—————————————
1 Dmitry Duka 1000
2 Other person 900.52
0 no id, no salary 0
—————————————
javol@javol-laptop:~/compile/typesafe_table$

Containter ‘table’ is a template, that can accept unlimited number of parameters.

Note how we define columns. The name of column is actually the name of structure type and the type of column is specified by defining type with name “type”, that corresponds to actual type. What benefits of such approach? You can name your columns by meaningful name, avoiding coding this name as string (which involves analysing this string all the time), ambiguities with columns, and as you’ll see later – some undesirable type conversions. In a few words it is delegation of all dirty work to C++ type system.

What this container can?
+ table with unlimited number of columns (due to variadic templates)
+ full insert
+ partial insert (=> in this case all other column types must be default constructible)
+ full conditional select
+ partial conditional select
+ update
+ set (ex. table.select(condition).set(values) ~ update)
+ delete (vacuum)
+ limit (returns new copy of table with limited number of rows)

+ table with one column implicitly converts to vector
+ table with one column operates on actual values, instead of tuples with one element

+ in multicolumn table you can convert one column to vector
+ in multicolumn table you can convert pair of columns to multimap

+ conditions
less, more, lesseq, moreeq, eq, not_eq, regex
+ debug print

There are several hard things in implementation to understand:
How to find out what index have the type T in typepack (variadic templates) TPack…?
How to construct new sub-tuple, having set of types (typepack) and typepack TPack… that contains that set?

What it means: we have

std::tuple<id, name, salary> row;

and we want to have a function:

std::tuple<name, salary> convert(const std::tuple<id, name, salary>&);

or

std::tuple<id, salary> convert(const std::tuple<id, name, salary>&);

or even

std::tuple<name> convert(const std::tuple<id, name, salary>&);

to get only specified columns.

That’s all can be wrapped in a function template, but conversion of tuples between each other must be done just right, causing compile-time errors if something is wrong:

// there is no color column in source table
    std::tuple<id, color> convert(const std::tuple<id, name, salary>&);

Since the use of variadic templates is often recursive, there are certain moments when it’s hard to understand code you’d just implemented. The variadic parameters themselves are not readily available to the implementation of a function or class.

Here is my implementation to deduce type’s index in a pack at compile-time:

#ifndef TYPEINDEX_HPP
#define TYPEINDEX_HPP

// WARNING: C++0X's variadic templates required!
// returns index (counting from 0) of type in an argument pack
// at compile-time.
// compile-time error if type is not in pack
//
// USAGE:
// type_index<type, Args...>::is
//
//

template<bool cond> struct no_such_type_in_this_list;
template<> struct no_such_type_in_this_list<true> {};

template<bool same, typename SearchForT, typename... Args> struct type_index_impl;
template<typename SearchForT, typename... Args>
	struct type_index_impl<true, SearchForT, Args...> { enum { is = -1 }; };
template<typename SearchForT, typename T, typename... Args>
	struct type_index_impl<false, SearchForT, T, Args...> { 
		enum { is = type_index_impl<std::is_same<SearchForT, T>::value, SearchForT, Args...>::is + 1 };
	};
template<typename SearchForT, typename T>
	struct type_index_impl<false, SearchForT, T> {
		no_such_type_in_this_list<std::is_same<SearchForT, T>::value> dummy;
		enum { is = std::is_same<SearchForT, T>::value - 1 };
};

template<typename... Args> struct type_index;

template<typename SearchForT, typename T, typename... Args>
struct type_index<SearchForT, T, Args...> {
	enum { is = type_index_impl<std::is_same<SearchForT, T>::value, SearchForT, Args...>::is + 1 }; 
};
//

#endif

Here is actual implementation of table container:

namespace memTable {

// Stuff for unpacking tuples
template<std::size_t...> struct index_tuple{}; 

template<std::size_t I, typename IndexTuple, typename... Types> struct make_indices_impl; 

template<std::size_t I, std::size_t... Indices, typename T, typename... Types> 
struct make_indices_impl<I, index_tuple<Indices...>, T, Types...> 
{ typedef typename make_indices_impl<I + 1, index_tuple<Indices..., I>, Types...>::type type; }; 

template<std::size_t I, std::size_t... Indices> 
struct make_indices_impl<I, index_tuple<Indices...> > 
{ typedef index_tuple<Indices...> type; }; 

template<typename... Types> struct make_indices : make_indices_impl<0, index_tuple<>, Types...>  {};

struct all {};

// table with variable number of columns
template<typename... Args>
class table {
private:
	typedef std::tuple<typename Args::type...> row_type;
	typedef std::vector<row_type> container;
	typedef typename container::iterator container_iterator;

	class need_unique_columns : public Args... {};

	// printing functor
	template<typename... InnerArgs>
	struct print_tuple {
		template<typename Head, typename... Rest>
		inline void print_row(const Head& head, const Rest&... rest) {
			const int field_width = 25;
			std::cout << std::setw(field_width) << head;
			print_row(rest...);
		}
		inline void print_row() { std::cout << std::endl; }

		template <std::size_t... Indices>
		inline void forward_print(const index_tuple<Indices...>&&, std::tuple<InnerArgs...>&& t) 
		{ 
			print_row(std::forward<InnerArgs>(std::get<Indices>(t))...);
		}
		inline void operator()(std::tuple<InnerArgs...>& t) {
			typedef typename make_indices<InnerArgs...>::type Indices; 
			forward_print(Indices(), t); 			
		}
	};
	// convert tuples functor
	// converts full tuple to partial
	// specify with column names
	template<typename... InnerArgs>
	struct convert_tuple {
		template<typename Head, typename... Rest>
		inline void convert_impl(const row_type& t, 
						std::tuple<typename InnerArgs::type...>& r, 
						const Head&, const Rest&... inargs) {
			std::get<type_index<Head, InnerArgs...>::is>(r) = std::get<type_index<Head, Args...>::is>(t);
			convert_impl(t, r, inargs...);
		}
		template<typename Head, typename... Rest>
		inline void convert_impl(const row_type& t, 
						std::tuple<typename InnerArgs::type...>& r, 
						const Head&) {
			std::get<type_index<Head, InnerArgs...>::is>(r) = std::get<type_index<Head, Args...>::is>(t);
		}
		
		template <std::size_t... Indices>
		inline void convert_impl_pre(const row_type& t, 
							std::tuple<typename InnerArgs::type...>&& r,
							index_tuple<Indices...>,
							std::tuple<InnerArgs...> dummy) {
			// nice trick to define objects of type InnerArgs...
			convert_impl(t, r, std::forward<InnerArgs>(std::get<Indices>(dummy))...);			
		}

		inline std::tuple<typename InnerArgs::type...> operator()(const row_type& t) {
			std::tuple<typename InnerArgs::type...> result;
			typedef typename make_indices<InnerArgs...>::type Indices;
			convert_impl_pre(t, result, Indices(), std::tuple<InnerArgs...>());
			return result;
		}
	};
	
	// insert specified columns
	// base case
	template<typename T, typename... insertArgs>
	inline void insert_impl(row_type& tuple_to_insert, 
							const typename T::type& value, 
							const typename insertArgs::type&... inargs) {
		std::get<type_index<T, Args...>::is>(tuple_to_insert) = value;
		insert_impl<insertArgs...>(tuple_to_insert, inargs...);		
	}

	// terminal case
	template<typename T>
	inline void insert_impl(row_type& tuple_to_insert, const typename T::type& value) {
		std::get<type_index<T, Args...>::is>(tuple_to_insert) = value;
	}
public:
	table() { }
	table(const size_t res) { /*rows_.reserve(res);*/ }
	// full insert
	inline void insert_values(const typename Args::type&... args) { rows_.push_back(row_type(args...)); }
	// full insert
	inline void insert_tuple(const std::tuple<typename Args::type...>& t) { rows_.push_back(t); }
	// insert specified columns
	template<typename... insertArgs> 
	inline void insert(const typename insertArgs::type&... inargs){
		row_type tuple_to_insert;
		insert_impl<insertArgs...>(tuple_to_insert, inargs...);
		rows_.push_back(tuple_to_insert);
	}
	// delete
	template<template<typename> class T, typename U> 
	table<Args...>& remove(const T<U>& condition) {
		for(container_iterator it = rows_.begin(); it != rows_.end(); ++it)
			if(condition.check(std::get<type_index<U, Args...>::is>(*it)))
				rows_.erase(it--);
								
		return *this;
	} 

	// select
	template<template<typename> class T, typename U> 
	table<Args...> select_all(const T<U>& condition) {
		table<Args...> result_table(size());
		for(container_iterator it = rows_.begin(); it != rows_.end(); ++it)
			if(condition.check( std::get<type_index<U, Args...>::is>(*it) ))
				result_table.insert_tuple(*it);

		return result_table;
	}
	// select
	template<typename... selectArgs, template<typename> class T, typename U> 
	table<selectArgs...> select(const T<U>& condition) {
		table<selectArgs...> result_table(size());
		// convert full tuple to partial and insert in result table
		for(container_iterator it = rows_.begin(); it != rows_.end(); ++it)
			if(condition.check( std::get<type_index<U, Args...>::is>(*it) ))
				result_table.insert_tuple(table::convert_tuple<selectArgs...>()(*it));

		return result_table;
	} 
	// unconditional select
	template<typename... selectArgs> 
	table<selectArgs...> select() {
		table<selectArgs...> result_table(size());
		// convert full tuple to partial and insert in result table
		for(container_iterator it = rows_.begin(); it != rows_.end(); ++it)
			result_table.insert_tuple(table::convert_tuple<selectArgs...>()(*it));

		return result_table;
	} 	
	// set
	template<typename... selectArgs> 
	inline table<Args...>& set(const typename selectArgs::type&... sargs) {
		for(container_iterator it = rows_.begin(); it != rows_.end(); ++it)
			insert_impl<selectArgs...>((*it), sargs...); 
			
		return *this;
	}
	// conditional set (update)
	template<typename... selectArgs, template<typename> class T, typename U> 
	inline table<Args...>& update(const T<U>& condition, const typename selectArgs::type&... sargs) {
		for(container_iterator it = rows_.begin(); it != rows_.end(); ++it)
			if(condition.check( std::get<type_index<U, Args...>::is>(*it) ))
				insert_impl<selectArgs...>((*it), sargs...); 
			
		return *this;
	}
	inline table<Args...> limit(const size_t s) {
		table<Args...> result_table;
		// convert full tuple to partial and insert in result table
		if(s < size()) {
			container_iterator it = rows_.begin();
			for(size_t c = 0; c < s; ++it, ++c)
				result_table.insert_tuple(*it);		
		}
		else result_table.rows_ = rows_;
			
		return result_table;		
	}
	// convert column to vector
	template<typename T>
	std::vector<typename T::type> toVector() {
		std::vector<typename T::type> result;
		for(container_iterator it = rows_.begin(); it != rows_.end(); ++it)
			result.push_back(std::get<type_index<T, Args...>::is>(*it));
		return result;
	}
	// convert pair of columns to map
	template<typename T, typename U>
	std::map<typename T::type, typename U::type> toMap() {
		std::map<typename T::type, typename U::type> result;
		for(container_iterator it = rows_.begin(); it != rows_.end(); ++it)
			result[std::get<type_index<T, Args...>::is>(*it)] = std::get<type_index<U, Args...>::is>(*it);

		return result;
	}
	// convert pair of columns to multimap
	template<typename T, typename U>
	std::multimap<typename T::type, typename U::type> toMultiMap() {
		std::multimap<typename T::type, typename U::type> result;
		for(container_iterator it = rows_.begin(); it != rows_.end(); ++it)
			result.insert(std::make_pair(std::get<type_index<T, Args...>::is>(*it), std::get<type_index<U, Args...>::is>(*it)));

		return result;
	}

	inline void clear() { rows_.erase(rows_.begin(), rows_.end()); }
	inline size_t size() { return rows_.size(); }
	// debug print
	inline void print() { std::for_each(rows_.begin(), rows_.end(), table::print_tuple<typename Args::type...>()); }	
private:

	container rows_;
};
/*
	 table with one column, optimized version
	-----------------------------------------------------------
*/
template<typename T>
class table<T> {
private:
	typedef typename T::type row_type;
	typedef std::vector<row_type> container;
	typedef typename container::iterator container_iterator;

	// printing functor
	struct print_tuple {	inline void operator()(typename T::type& t) {	std::cout << t << std::endl;	}	};
	
public:
	table() { }
	table(const size_t res) { /*rows_.reserve(res);*/ }
	// full insert
	inline void insert_values(const row_type& arg) { rows_.push_back(arg); }
	// full insert
	inline void insert_tuple(const std::tuple<row_type>& t) { rows_.push_back(std::get<0>(t)); }
	// insert specified columns
	inline void insert(const row_type& arg) { rows_.push_back(arg); }
	// delete
	template<template<typename> class V, typename U> 
	table<T>& remove(const V<U>& condition) {
		for(container_iterator it = rows_.begin(); it != rows_.end(); ++it)
			if(condition.check(*it))
				rows_.erase(it--);
								
		return *this;
	} 

	// select
	template<template<typename> class V, typename U> 
	table<T> select_all(const V<U>& condition) {
		table<T> result_table(size());
		for(container_iterator it = rows_.begin(); it != rows_.end(); ++it)
			if(condition.check(*it)) result_table.insert(*it);

		return result_table;
	}
	// select
	template<template<typename> class V, typename U> 
	table<T> select(const V<U>& condition) {
		table<T> result_table(size());
		for(container_iterator it = rows_.begin(); it != rows_.end(); ++it)
			if(condition.check(*it)) result_table.insert(*it);

		return result_table;
	}

	// set
	inline table<T>& set(const row_type& sarg) {
		for(container_iterator it = rows_.begin(); it != rows_.end(); ++it)
			*it = sarg;			
		return *this;
	}
	// conditional set (update)
	template<template<typename> class V, typename U> 
	inline table<T>& update(const V<U>& condition, const row_type& sarg) {
		for(container_iterator it = rows_.begin(); it != rows_.end(); ++it)
			if(condition.check(*it)) *it = sarg; 
			
		return *this;
	}
	inline table<T> limit(const size_t s) {
		table<T> result_table;
		// convert full tuple to partial and insert in result table
		if(s < size()) {
			container_iterator it = rows_.begin();
			for(size_t c = 0; c < s; ++it, ++c)
				result_table.insert(*it);		
		}
		else result_table.rows_ = rows_;
			
		return result_table;		
	}
	// convert column to vector
	std::vector<row_type> toVector() {
		std::vector<row_type> result;
		for(container_iterator it = rows_.begin(); it != rows_.end(); ++it)
			result.push_back(*it);
		return result;
	}

	inline operator std::vector<row_type>() { return toVector(); }

	inline void clear() { rows_.erase(rows_.begin(), rows_.end()); }
	inline size_t size() { return rows_.size(); }
	// debug print
	inline void print() { std::for_each(rows_.begin(), rows_.end(), table::print_tuple()); }	
private:

	container rows_;
};

}

#endif

There are also conditions, which can be applied conditional operations (select, update, delete). They implemented as functors:

#ifndef CONDITIONS_HPP
#define CONDITIONS_HPP

// #define WITH_REGEX_CONDITION

#ifdef WITH_REGEX_CONDITION
    #include <boost/regex.hpp>
#endif


namespace conditions {
	// less
	template<typename T>
	struct less {
		less(const typename T::type v) : value(v) {}
		bool check(const typename T::type val) const { return val < value;}
	private:
		const typename T::type value;
	};
	// less or equal
	template<typename T>
	struct lesseq {
		lesseq(const typename T::type v) : value(v) {}
		bool check(const typename T::type val) const { return val <= value;}
	private:
		const typename T::type value;
	};
	// more
	template<typename T>
	struct more {
		more(const typename T::type v) : value(v) {}
		bool check(const typename T::type val) const { return val > value;}
	private:
		const typename T::type value;
	};
	// more or equal
	template<typename T>
	struct moreeq {
		moreeq(const typename T::type v) : value(v) {}
		bool check(const typename T::type val) const { return val >= value;}
	private:
		const typename T::type value;
	};
	// equal
	template<typename T>
	struct eq {
		eq(const typename T::type v) : value(v) {}
		bool check(const typename T::type val) const { return val == value;}
	private:
		const typename T::type value;
	};
	// not equal
	template<typename T>
	struct noteq {
		noteq(const typename T::type v) : value(v) {}
		bool check(const typename T::type val) const { return val != value;}
	private:
		const typename T::type value;
	};
#ifdef WITH_REGEX_CONDITION
	// regex
	template<typename T>
	struct regex {
		regex(const std::string& e) : expr(e.c_str()) { }
		bool check(const typename T::type val) const { return boost::regex_match(val, expr); }
	private:
		boost::regex expr;
	};
#endif
}

#endif

Select from such table is even faster than in sqlite3.

One question remains incomprehensible – how to save, serialize and deserialize such data?

Advertisements

3 Comments »

  1. The great thing with variadic templates and an approach like with your “table” type, above, is that it’ll let people develop ORM libraries in a quite easy and natural way.

    Comment by alpmestan — November 9, 2009 @ 8:22 pm

  2. […] practical example of tuple usage for ORM-like engine: https://javol.wordpress.com/2009/08/06/type-safe-table-container-using-variadic-templates/ Some practical aspects of using tuples: […]

    Pingback by Heterogeneous vector in c++ – overview of common approaches | Мои IT-заметки — October 3, 2014 @ 2:12 pm

  3. would you be appreciated to update this article according to C++11/14 or even C++17?

    Comment by Brent Jiang — April 19, 2017 @ 11:38 am


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: