[ACCEPTED]-Red-Black Trees-red-black-tree

Accepted answer
Score: 56

Red Black trees are good for creating well-balanced 31 trees. The major problem with binary search 30 trees is that you can make them unbalanced 29 very easily. Imagine your first number is 28 a 15. Then all the numbers after that are 27 increasingly smaller than 15. You'll have 26 a tree that is very heavy on the left side 25 and has nothing on the right side.

Red Black 24 trees solve that by forcing your tree to 23 be balanced whenever you insert or delete. It 22 accomplishes this through a series of rotations 21 between ancestor nodes and child nodes. The 20 algorithm is actually pretty straightforward, although 19 it is a bit long. I'd suggest picking up 18 the CLRS (Cormen, Lieserson, Rivest and 17 Stein) textbook, "Introduction to Algorithms" and 16 reading up on RB Trees.

The implementation 15 is also not really so short so it's probably 14 not really best to include it here. Nevertheless, trees 13 are used extensively for high performance apps that 12 need access to lots of data. They provide 11 a very efficient way of finding nodes, with 10 a relatively small overhead of insertion/deletion. Again, I'd 9 suggest looking at CLRS to read up on how 8 they're used.

While BSTs may not be used 7 explicitly - one example of the use of trees 6 in general are in almost every single modern 5 RDBMS. Similarly, your file system is almost 4 certainly represented as some sort of tree 3 structure, and files are likewise indexed 2 that way. Trees power Google. Trees power 1 just about every website on the internet.

Score: 20

I'd like to address only the question "So 41 what makes binary trees useful in some of 40 the common tasks you find yourself doing 39 while programming?"

This is a big topic 38 that many people disagree on. Some say that 37 the algorithms taught in a CS degree such 36 as binary search trees and directed graphs 35 are not used in day-to-day programming and 34 are therefore irrelevant. Others disagree, saying 33 that these algorithms and data structures 32 are the foundation for all of our programming 31 and it is essential to understand them, even 30 if you never have to write one for yourself. This 29 filters into conversations about good interviewing 28 and hiring practices. For example, Steve Yegge has 27 an article on interviewing at Google that addresses this question. Remember 26 this debate; experienced people disagree.

In 25 typical business programming you may not 24 need to create binary trees or even trees 23 very often at all. However, you will use 22 many classes which internally operate using 21 trees. Many of the core organization classes 20 in every language use trees and hashes to 19 store and access data.

If you are involved 18 in more high-performance endeavors or situations 17 that are somewhat outside the norm of business 16 programming, you will find trees to be an 15 immediate friend. As another poster said, trees 14 are core data structures for databases and 13 indexes of all kinds. They are useful in 12 data mining and visualization, advanced 11 graphics (2d and 3d), and a host of other 10 computational problems.

I have used binary 9 trees in the form of BSP (binary space partitioning) trees in 3d graphics. I 8 am currently looking at trees again to sort 7 large amounts of geocoded data and other 6 data for information visualization in Flash/Flex 5 applications. Whenever you are pushing the 4 boundary of the hardware or you want to 3 run on lower hardware specifications, understanding 2 and selecting the best algorithm can make 1 the difference between failure and success.

Score: 12

None of the answers mention what it is exactly 40 BSTs are good for.

If what you want to do 39 is just lookup by values then a hashtable 38 is much faster, O(1) insert and lookup (amortized 37 best case).

A BST will be O(log N) lookup 36 where N is the number of nodes in the tree, inserts 35 are also O(log N).

RB and AVL trees are important 34 like another answer mentioned because of 33 this property, if a plain BST is created 32 with in-order values then the tree will 31 be as high as the number of values inserted, this 30 is bad for lookup performance.

The difference 29 between RB and AVL trees are in the the 28 rotations required to rebalance after an 27 insert or delete, AVL trees are O(log N) for 26 rebalances while RB trees are O(1). An example 25 of benefit of this constant complexity is 24 in a case where you might be keeping a persistent 23 data source, if you need to track changes 22 to roll-back you would have to track O(log 21 N) possible changes with an AVL tree.

Why 20 would you be willing to pay for the cost 19 of a tree over a hash table? ORDER! Hash 18 tables have no order, BSTs on the other 17 hand are always naturally ordered by virtue 16 of their structure. So if you find yourself 15 throwing a bunch of data in an array or 14 other container and then sorting it later, a 13 BST may be a better solution.

The tree's 12 order property gives you a number of ordered 11 iteration capabilities, in-order, depth-first, breadth-first, pre-order, post-order. These 10 iteration algorithms are useful in different 9 circumstances if you want to look them up.

Red 8 black trees are used internally in almost 7 every ordered container of language libraries, C++ Set 6 and Map, .NET SortedDictionary, Java TreeSet, etc...

So 5 trees are very useful, and you may use them 4 quite often without even knowing it. You 3 most likely will never need to write one yourself, though 2 I would highly recommend it as an interesting 1 programming exercise.

Score: 4

Red Black Trees and B-trees are used in 5 all sorts of persistent storage; because 4 the trees are balanced the performance of 3 breadth and depth traversals are mitigated.

Nearly 2 all modern database systems use trees for 1 data storage.

Score: 2

BSTs make the world go round, as said by 15 Micheal. If you're looking for a good tree 14 to implement, take a look at AVL trees (Wikipedia). They 13 have a balancing condition, so they are 12 guaranteed to be O(logn). This kind of searching 11 efficiency makes it logical to put into 10 any kind of indexing process. The only thing 9 that would be more efficient would be a 8 hashing function, but those get ugly quick, fast, and 7 in a hurry. Also, you run into the Birthday Paradox (also 6 known as the pigeon-hole problem).

What textbook 5 are you using? We used Data Structures and Analysis in Java by Mark Allen Weiss. I 4 actually have it open in my lap as i'm typing 3 this. It has a great section about Red-Black 2 trees, and even includes the code necessary 1 to implement all the trees it talks about.

Score: 2

The best description of red-black trees 9 I have seen is the one in Cormen, Leisersen 8 and Rivest's 'Introduction to Algorithms'. I 7 could even understand it enough to partially 6 implement one (insertion only). There are 5 also quite a few applets such as This One on various 4 web pages that animate the process and allow 3 you to watch and step through a graphical 2 representation of the algorithm building 1 a tree structure.

Score: 2

Red-black trees stay balanced, so you don't 20 have to traverse deep to get items out. The 19 time saved makes RB trees O(log()n)) in 18 the WORST case, whereas unlucky binary trees 17 can get into a lop sided configuration and 16 cause retrievals in O(n) a bad case. This 15 does happen in practice or on random data. So 14 if you need time critical code (database 13 retrievals, network server etc.) you use 12 RB trees to support ordered or unordered 11 lists/sets .

But RBTrees are for noobs! If 10 you are doing AI and you need to perform 9 a search, you find you fork the state information 8 alot. You can use a persistent red-black 7 to fork new states in O(log(n)). A persistent 6 red black tree keeps a copy of the tree 5 before and after a morphological operation 4 (insert/delete), but without copying the 3 entire tree (normally and O(log(n)) operation). I 2 have open sourced a persistent red-black 1 tree for java


Score: 1

Since you ask which tree people use, you 41 need to know that a Red Black tree is fundamentally 40 a 2-3-4 B-tree (i.e a B-tree of order 4). A 39 B-tree is not equivalent to a binary tree(as 38 asked in your question).

Here's an excellent 37 resource describing the initial abstraction 36 known as the symmetric binary B-tree that 35 later evolved into the RBTree. You would 34 need a good grasp on B-trees before it makes 33 sense. To summarize: a 'red' link on a Red 32 Black tree is a way to represent nodes that 31 are part of a B-tree node (values within 30 a key range), whereas 'black' links are 29 nodes that are connected vertically in a 28 B-tree.

So, here's what you get when you 27 translate the rules of a Red Black tree 26 in terms of a B-tree (I'm using the format 25 Red Black tree rule => B Tree equivalent):

1) A node is either red or black. => A 24 node in a b-tree can either be part of a 23 node, or as a node in a new level.

2) The 22 root is black. (This rule is sometimes 21 omitted, since it doesn't affect analysis) => The 20 root node can be thought of either as a 19 part of an internal root node as a child 18 of an imaginary parent node.

3) All leaves 17 (NIL) are black. (All leaves are same color 16 as the root.) => Since one way of representing 15 a RB tree is by omitting the leaves, we 14 can rule this out.

4)Both children of every 13 red node are black. => The children of 12 an internal node in a B-tree always lie 11 on another level.

5)Every simple path from 10 a given node to any of its descendant leaves 9 contains the same number of black nodes. => A 8 B-tree is kept balanced as it requires that 7 all leaf nodes are at the same depth (Hence 6 the height of a B-tree node is represented 5 by the number of black links from the root 4 to the leaf of a Red Black tree)

Also, there's 3 a simpler 'non-standard' implementation 2 by Robert Sedgewick here: (He's the author of 1 the book Algorithms along with Wayne)

Score: 1

Lots and lots of heat here, but not much 105 light, so lets see if we can provide some.

First, a 104 RB tree is an associative data structure, unlike, say 103 an array, which cannot take a key and return 102 an associated value, well, unless that's 101 an integer "key" in a 0% sparse index of 100 contiguous integers. An array cannot grow 99 in size either (yes, I know about realloc() too, but 98 under the covers that requires a new array 97 and then a memcpy()), so if you have either 96 of these requirements, an array won't do. An 95 array's memory efficiency is perfect. Zero 94 waste, but not very smart, or flexible - realloc() not 93 withstanding.

Second, in contrast to a bsearch() on 92 an array of elements, which IS an associative 91 data structure, a RB tree can grow (AND 90 shrink) itself in size dynamically. The 89 bsearch() works fine for indexing a data 88 structure of a known size, which will remain 87 that size. So if you don't know the size 86 of your data in advance, or new elements 85 need to be added, or deleted, a bsearch() is 84 out. Bsearch() and qsort() are both well 83 supported in classic C, and have good memory 82 efficiency, but are not dynamic enough for 81 many applications. They are my personal 80 favorite though because they're quick, easy, and 79 if you're not dealing with real-time apps, quite 78 often are flexible enough. In addition, in 77 C/C++ you can sort an array of pointers 76 to data records, pointing to the struc{} member, for 75 example, you wish to compare, and then rearranging 74 the pointer in the pointer array such that 73 reading the pointers in order at the end 72 of the pointer sort yields your data in 71 sorted order. Using this with memory-mapped 70 data files is extremely memory efficient, fast, and 69 fairly easy. All you need to do is add a 68 few "*"s to your compare function/s.

Third, in 67 contrast to a hashtable, which also must 66 be a fixed size, and cannot be grown once 65 filled, a RB tree will automagically grow 64 itself and balance itself to maintain its 63 O(log(n)) performance guarantee. Especially 62 if the RB tree's key is an int, it can be 61 faster than a hash, because even though 60 a hashtable's complexity is O(1), that 1 59 can be a very expensive hash calculation. A 58 tree's multiple 1-clock integer compares 57 often outperform 100-clock+ hash calculations, to 56 say nothing of rehashing, and malloc()ing 55 space for hash collisions and rehashes. Finally, if 54 you want ISAM access, as well as key access 53 to your data, a hash is ruled out, as there 52 is no ordering of the data inherent in the 51 hashtable, in contrast to the natural ordering 50 of data in any tree implementation. The 49 classic use for a hash table is to provide 48 keyed access to a table of reserved words 47 for a compiler. It's memory efficiency is 46 excellent.

Fourth, and very low on any list, is 45 the linked, or doubly-linked list, which, in 44 contrast to an array, naturally supports 43 element insertions and deletions, and as 42 that implies, resizing. It's the slowest 41 of all the data structures, as each element 40 only knows how to get to the next element, so 39 you have to search, on average, (element_knt/2) links 38 to find your datum. It is mostly used where 37 insertions and deletions somewhere in the 36 middle of the list are common, and especially, where 35 the list is circular and feeds an expensive 34 process which makes the time to read the 33 links relatively small. My general RX is 32 to use an arbitrarily large array instead 31 of a linked list if your only requirement 30 is that it be able to increase in size. If 29 you run out of size with an array, you can 28 realloc() a larger array. The STL does this 27 for you "under the covers" when you use 26 a vector. Crude, but potentially 1,000s 25 of times faster if you don't need insertions, deletions 24 or keyed lookups. It's memory efficiency 23 is poor, especially for doubly-linked lists. In 22 fact, a doubly-linked list, requiring two 21 pointers, is exactly as memory inefficient 20 as a red-black tree while having NONE of 19 its appealing fast, ordered retrieval characteristics.

Fifth, trees 18 support many additional operations on their 17 sorted data than any other data structure. For 16 example, many database queries make use 15 of the fact that a range of leaf values 14 can be easily specified by specifying their 13 common parent, and then focusing subsequent 12 processing on the part of the tree that 11 parent "owns". The potential for multi-threading 10 offered by this approach should be obvious, as 9 only a small region of the tree needs to 8 be locked - namely, only the nodes the parent 7 owns, and the parent itself.

In short, trees 6 are the Cadillac of data structures. You 5 pay a high price in terms of memory used, but 4 you get a completely self-maintaining data 3 structure. This is why, as pointed out in 2 other replies here, transaction databases 1 use trees almost exclusively.

Score: 0

If you would like to see how a Red-Black 3 tree is supposed to look graphically, I 2 have coded an implementation of a Red-Black 1 tree that you can download here

Score: 0

IME, almost no one understands the RB tree 8 algorithm. People can repeat the rules 7 back to you, but they don't understand why those 6 rules and where they come from. I am no 5 exception :-)

For this reason, I prefer the 4 AVL algorithm, because it's easy to comprehend. Once 3 you understand it, you can then code it 2 up from scratch, because it make sense to 1 you.

Score: 0

Trees can be fast. If you have a million 25 nodes in a balanced binary tree, it takes 24 twenty comparisons on average to find any 23 one item. If you have a million nodes in 22 a linked list, it takes five hundred thousands 21 comparisons on average to find the same 20 item.

If the tree is unbalanced, though, it 19 can be just as slow as a list, and also take 18 more memory to store. Imagine a tree where 17 most nodes have a right child, but no left 16 child; it is a list, but you still have to 15 hold memory space to put in the left node 14 if one shows up.

Anyways, the AVL tree was the 13 first balanced binary tree algorithm, and 12 the Wikipedia article on it is pretty clear. The 11 Wikipedia article on red-black trees is 10 clear as mud, honestly.

Beyond binary trees, B-Trees 9 are trees where each node can have many 8 values. B-Tree is not a binary tree, just 7 happens to be the name of it. They're really 6 useful for utilizing memory efficiently; each 5 node of the tree can be sized to fit in 4 one block of memory, so that you're not 3 (slowly) going and finding tons of different 2 things in memory that was paged to disk. Here's 1 a phenomenal example of the B-Tree.

More Related questions