What's a Good and Stable C++ Tree Implementation

What's a good and stable C++ tree implementation?

I don't know about your requirements, but wouldn't you be better off with a graph (implementations for example in Boost Graph) if you're interested mostly in the structure and not so much in tree-specific benefits like speed through balancing? You can 'emulate' a tree through a graph, and maybe it'll be (conceptually) closer to what you're looking for.

Why does the C++ STL not provide any tree containers?

There are two reasons you could want to use a tree:

You want to mirror the problem using a tree-like structure:

For this we have boost graph library

Or you want a container that has tree like access characteristics
For this we have

  • std::map (and std::multimap)
  • std::set (and std::multiset)

Basically the characteristics of these two containers is such that they practically have to be implemented using trees (though this is not actually a requirement).

See also this question:
C tree Implementation

What is the correct structure for a node in any BST Tree

The fundamental Binary Tree (foundation) requires child pointers:

struct binary_tree_node
{
binary_tree_node * left_child;
binary_tree_node * right_child;
};

There are many modifications that can be made to the foundation that help facilitate searching or storage.

These can include (but are not limited to):

  • parent pointer
  • array of child pointers
  • "color" indicator
  • specialized leaf nodes -- no child links

The amenities depend on the usage of the data structure. For example, an array of child nodes may help speed up I/O access, where reading a "page" node is as efficient as reading a single node (See B-Tree). The "color" indicator may help with the decision for balancing. The specialized "leaf" nodes reduce the amount of memory occupied by the tree.

As for traversal, a tree can be traversed in any method. There are no rules preventing a traversal from child to parent. Some traversals may include sibling to sibling.

Some books or websites may pick nits about a traditional or fundamental "binary tree" data structure. I find that restrictions get in the way.

Modelling an arbitrary tree in C++ (with iterators)

I've made improvments to this tree. Both the vertex and edge iterators are now wrapped in the facade. If I make more major changes, I'll post them. I had use tree.hh a while back for a smallish project but didn't really like it. I'll replace it with this to see what else it needs.

#include<iostream>
#include <string>
#include <boost/shared_ptr.hpp>
#include <boost/graph/adjacency_list.hpp>
#include <boost/iterator/iterator_adaptor.hpp>
#include <boost/graph/graphviz.hpp>

#include <boost/graph/visitors.hpp>
#include <boost/graph/breadth_first_search.hpp>
#include <boost/property_map/property_map.hpp>
#include <boost/graph/graph_utility.hpp>

// ..............................................
template< typename Tree >
void write_graphviz( std::ostream& os, Tree tree )
{
os << "digraph G {\n";
for( auto it : tree )
os << it << "[label=\"" << tree[ it ] << "\"];\n";
for( auto it : tree.get_edges( ) )
os << it.m_source << "->" << it.m_target << ";\n";
os << "}\n";
}

// ..............................................
template< typename Tree >
const typename boost::graph_traits< Tree >::vertex_descriptor get_vertex( Tree& tree, const typename Tree::out_edge_iterator& iter ) { return iter->m_target; }

template< typename Tree >
const typename boost::graph_traits< Tree >::vertex_descriptor get_vertex( Tree& tree, const typename Tree::vertex_iterator& iter ) { return *iter; }

// ..............................................
template< typename Tree, typename IteratorType >
class tree_iter_type
: public boost::iterator_facade<
tree_iter_type< typename Tree, typename IteratorType >
,typename Tree::vertex_property_type
,boost::forward_traversal_tag
>
{
public:
typedef typename tree_iter_type< typename Tree, typename IteratorType > other_type;
typedef typename boost::graph_traits< Tree >::vertex_descriptor iterator_value_type;
typedef typename boost::graph_traits< Tree >::edge_descriptor eiterator_value_type;
typedef typename Tree::vertex_property_type value_type;
private:
friend class boost::iterator_core_access;
value_type& dereference( ) const { return tree[ get_vertex( tree, iter ) ]; }
void increment( ) { ++iter; }
bool equal( other_type const& other ) const { return iter == other.iter; }
public:
tree_iter_type( Tree& tree, IteratorType iter ) : tree( tree ), iter( iter ) { }

//http://stackoverflow.com/questions/31264984/c-compiler-error-c2280-attempting-to-reference-a-deleted-function-in-visual
other_type& operator=( other_type const& a ){ tree= a.tree, iter= a.iter; return *this; };
iterator_value_type vertex( ) { return get_vertex( tree, iter ); }
//how to make this work? It blows things up....
//iterator_value_type operator ( ) { return get_vertex( tree, iter ); }

private:
Tree& tree;
IteratorType iter;
};

// ..............................................
template< typename Tree, typename IterType > //proxy container
class tree_pair_type
{
public:
typedef typename boost::graph_traits< Tree >::vertex_descriptor vertex_t;
typedef std::pair< IterType, IterType > iterator_pair_t;
tree_pair_type( iterator_pair_t pair ) :pair( pair ){ }
IterType begin( ) { return pair.first; }
IterType end( ) { return pair.second; }
private:
iterator_pair_t pair;
};

// ..............................................
template< typename ValueType >
class BGTree
{
public:
typedef boost::adjacency_list<
boost::mapS,
boost::vecS,
boost::bidirectionalS,
ValueType >
tree_t;
typedef typename ValueType value_type;
typedef typename boost::graph_traits< tree_t >::vertex_descriptor vertex_t;
typedef typename boost::graph_traits< tree_t >::edge_descriptor edge_t;

typedef typename tree_t::vertex_iterator vertex_iterator;
typedef typename tree_t::edge_iterator edge_iterator;
typedef typename tree_t::out_edge_iterator out_edge_iterator;

typedef typename tree_iter_type< tree_t, typename tree_t::out_edge_iterator > out_tree_iterator;
typedef typename tree_iter_type< tree_t, typename tree_t::vertex_iterator > vertex_tree_iterator;

typedef tree_pair_type< tree_t, edge_iterator > pair_type;
typedef tree_pair_type< tree_t, out_tree_iterator > out_pair_type;
typedef tree_pair_type< tree_t, vertex_tree_iterator > vertex_pair_type;

//get children
auto make_out_iterator( vertex_t pos ) { return out_tree_iterator( tree, boost::out_edges( pos, tree ).first ); }
auto make_out_iterator_end( vertex_t pos ) { return out_tree_iterator( tree, boost::out_edges( pos, tree ).second ); }
//get sub tree
auto make_range_iterator( vertex_t pos ) { return vertex_tree_iterator( tree, begin( ) ); }
auto make_range_iterator_end( vertex_t pos ) { return vertex_tree_iterator( tree, end( ) ); }

public:
BGTree( const ValueType& root )
:root( boost::add_vertex( root, tree ) ) { }

vertex_t get_root( ) const { return root; }
vertex_t add_child( const ValueType& item, vertex_t where ) {
auto temp= boost::add_vertex( item, tree );
boost::add_edge( where, temp, tree );
return temp;
}
vertex_t get_parent( vertex_t from ) const { return boost::in_edges( from, tree ).first->m_source; }
auto get_bundle( ) { return boost::get( vertex_bundle, tree ); }
//vertext set, not by value
vertex_iterator begin( ) { return boost::vertices( tree ).first; }
vertex_iterator end( ) { return boost::vertices( tree ).second; }
ValueType& operator [ ] ( vertex_t at ) { return tree[ at ]; }
//by index, not by value
auto get_edges( ) { return pair_type( boost::edges( tree ) ); }

auto get_children( vertex_t pos= 0 ) {
return out_pair_type( std::make_pair(
make_out_iterator( pos ), make_out_iterator_end( pos ) ) );
}
auto breadth_first( vertex_t pos= 0 ) {
return vertex_pair_type( std::make_pair(
make_range_iterator( pos ), make_range_iterator_end( pos ) ) );
}
auto get_vertex_children( vertex_t pos ) { return out_pair_type( boost::out_edges( pos, tree ) ); }
auto get_boost_graph_tree( ) { return tree; }

private:
tree_t tree;
vertex_t root;
};

// .....................................................................................
// .....................................................................................
int main( )
{
//build tree to match the images in documentation
//https://en.wikipedia.org/wiki/Tree_traversal
typedef BGTree< char > char_tree_t;

char_tree_t tree( 'F' );
auto last= tree.get_root( );
last= tree.add_child( 'B' , last );
tree.add_child( 'A' , last );
last= tree.add_child( 'D' , last );
tree.add_child( 'C' , last );
tree.add_child( 'E' , last );
last= tree.get_root( );
last= tree.add_child( 'G' , last );
last= tree.add_child( 'I' , last );
last= tree.add_child( 'H' , last );

// visualization
std::ofstream os( "./graph.dot" );
::write_graphviz( os, tree );

std::cout << "Pre-order: F, B, A, D, C, E, G, I, H\n";
for( auto& i : tree.breadth_first( ) )
std::cout << i << " ";
std::cout << "\n\n";

std::cout << "Children of root: B, G\n";
for( auto& i : tree.get_children( ) )
std::cout << i << " ";
std::cout << "\n\n";

auto it= std::find_if(
tree.breadth_first( ).begin( ), tree.breadth_first( ).end( ),
[ ] ( const char& test ) { return test == 'B'; } );
std::cout << "should be 'B', find_if: " << *it << "\n\n";

std::cout << "children of 'B' from iterator: A D\n";
//as it has a referance to tree, could be it.get_children( )?
for( auto& item : tree.get_children( it.vertex( ) ) )
std::cout << item << " ";
std::cout << "\n\n";

return 0;
}

What are the applications of binary trees?

To squabble about the performance of binary-trees is meaningless - they are not a data structure, but a family of data structures, all with different performance characteristics. While it is true that unbalanced binary trees perform much worse than self-balancing binary trees for searching, there are many binary trees (such as binary tries) for which "balancing" has no meaning.

Applications of binary trees

  • Binary Search Tree - Used in many search applications where data is constantly entering/leaving, such as the map and set objects in many languages' libraries.
  • Binary Space Partition - Used in almost every 3D video game to determine what objects need to be rendered.
  • Binary Tries - Used in almost every high-bandwidth router for storing router-tables.
  • Hash Trees - Used in torrents and specialized image-signatures in which a hash needs to be verified, but the whole file is not available. Also used in blockchains for eg. Bitcoin.
  • Heaps - Used in implementing efficient priority-queues, which in turn are used for scheduling processes in many operating systems, Quality-of-Service in routers, and A* (path-finding algorithm used in AI applications, including robotics and video games). Also used in heap-sort.
  • Huffman Coding Tree (Chip Uni) - Used in compression algorithms, such as those used by the .jpeg and .mp3 file-formats.
  • GGM Trees - Used in cryptographic applications to generate a tree of pseudo-random numbers.
  • Syntax Tree - Constructed by compilers and (implicitly) calculators to parse expressions.
  • Treap - Randomized data structure used in wireless networking and memory allocation.
  • T-tree - Though most databases use some form of B-tree to store data on the drive, databases which keep all (most) their data in memory often use T-trees to do so.

The reason that binary trees are used more often than n-ary trees for searching is that n-ary trees are more complex, but usually provide no real speed advantage.

In a (balanced) binary tree with m nodes, moving from one level to the next requires one comparison, and there are log_2(m) levels, for a total of log_2(m) comparisons.

In contrast, an n-ary tree will require log_2(n) comparisons (using a binary search) to move to the next level. Since there are log_n(m) total levels, the search will require log_2(n)*log_n(m) = log_2(m) comparisons total. So, though n-ary trees are more complex, they provide no advantage in terms of total comparisons necessary.

(However, n-ary trees are still useful in niche-situations. The examples that come immediately to mind are quad-trees and other space-partitioning trees, where divisioning space using only two nodes per level would make the logic unnecessarily complex; and B-trees used in many databases, where the limiting factor is not how many comparisons are done at each level but how many nodes can be loaded from the hard-drive at once)



Related Topics



Leave a reply



Submit