Theory and Practice: Sorting out stuff

Comparison based sorting needs Ω(lgn!)=Ω(n lgn) comparisons because each comparison yields one bit of information and there are n! permutations. If you are sorting integers then you can do better, and not only asymptotically. Radix sort works as follows:

deque<int> bucket[2][256];

void rsort(vector<int>& v) {
  int i, j, k;
  for (i = 0; i < v.size(); ++i)
    bucket[0][v[i]&0x000000ff].push_back(v[i]);
  for (i = 0; i < 256; ++i) {
    while (!bucket[0][i].empty()) {
      int &e = bucket[0][i].front();
      bucket[1][(e&0x0000ff00)>>8].push_back(e);
      bucket[0][i].pop_front();
    }
  }
  for (i = 0; i < 256; ++i) {
    while (!bucket[1][i].empty()) {
      int &e = bucket[1][i].front();
      bucket[0][(e&0x00ff0000)>>16].push_back(e);
      bucket[1][i].pop_front();
    }
  }
  for (i = 0; i < 256; ++i) {
    while (!bucket[0][i].empty()) {
      int &e = bucket[0][i].front();
      bucket[1][(e&0xff000000)>>8].push_back(e);
      bucket[0][i].pop_front();
    }
  }
  k = 0;
  for (i = 0; i < 256; ++i) {
    while (!bucket[1][i].empty()) {
      v[k++] = bucket[1][i].front();
      bucket[1][i].pop_front();
    }
  }
}

This straightforward (and somewhat repetitive) code is faster than std::sort if you sort more than one thousand integers.

Whenever someone says that radix sort works in linear time there is someone that feels compelled to point out that saying “linear time” is cheating. I really don’t care whether it “works in linear time” or not, but I can tell you that it is about two to three times faster than std::sort. We can still analyze the runtime and get a better idea of just how fast it is, but we need to be a bit more precise.

Let’s say we work on a computer with 2^k bit words. Typical computers nowadays have k=5 or k=6, and the code above assumes k=5. We group the bits of a word in digits, each with 2^w bits. In the example above w=3. We have b=2^{2^w} buckets and d=2^k−w digits. The number of operations of the code above is proportional to n+(b+n)d=n+(2^{2^w}+n)2^k−w. We can’t change how many numbers we must sort (n) or the machine on which the program runs (k). But we can change w. What is the optimum value? If we let the derivative with respect to w equal 0 we get 2^{2^w+w}ln2=2^w+wn. That looks horrible. But it has one interesting feature: It doesn’t contain k, which means that the optimum number of bits per digit does not depend on the machine on which we do the sorting. It can be a 8 bit machine, a 32 bit machine, a 64 bit machine, or even a 8192 bit machine, we should still have the same number of bits per digit for a certain size of the list that we are sorting.

We now turn to approximations. It’s probably safe to assume 2^w≫ w and wn≫2^w. We are left with 2^{2^w}ln2=wn, which is still horrible. Since the left side varies much faster than the right one we can replace the w on the right with a (yet unknown) suitable constant w′. We get that 2^w varies as O(lgn). (Note that this is an upper bound.) If we plug this back into the running time formula we finally get O(n 2^k/lgn). Here 2^k is how many bits the machine has per word and lgn is how many bits we need to write down the length of the list we are sorting. So radix sort should be quite good when we have to sort lots of numbers. And indeed, as I said, it clearly beats std:sort.

Now, if you have to sort really lots of numbers, then you should probably look at funnelsort.

Theory and Practice

08 August 2008

Sorting out stuff

No comments:

Post a Comment