08 August 2008

Sorting out stuff

Comparison based sorting needs Ω(lgn!)=Ω(n lgn) comparisons because each comparison yields one bit of information and there are n! permutations. If you are sorting integers then you can do better, and not only asymptotically. Radix sort works as follows:

deque<int> bucket[2][256];

void rsort(vector<int>& v) {
  int i, j, k;
  for (i = 0; i < v.size(); ++i)
    bucket[0][v[i]&0x000000ff].push_back(v[i]);
  for (i = 0; i < 256; ++i) {
    while (!bucket[0][i].empty()) {
      int &e = bucket[0][i].front();
      bucket[1][(e&0x0000ff00)>>8].push_back(e);
      bucket[0][i].pop_front();
    }
  }
  for (i = 0; i < 256; ++i) {
    while (!bucket[1][i].empty()) {
      int &e = bucket[1][i].front();
      bucket[0][(e&0x00ff0000)>>16].push_back(e);
      bucket[1][i].pop_front();
    }
  }
  for (i = 0; i < 256; ++i) {
    while (!bucket[0][i].empty()) {
      int &e = bucket[0][i].front();
      bucket[1][(e&0xff000000)>>8].push_back(e);
      bucket[0][i].pop_front();
    }
  }
  k = 0;
  for (i = 0; i < 256; ++i) {
    while (!bucket[1][i].empty()) {
      v[k++] = bucket[1][i].front();
      bucket[1][i].pop_front();
    }
  }
}

This straightforward (and somewhat repetitive) code is faster than std::sort if you sort more than one thousand integers.

Whenever someone says that radix sort works in linear time there is someone that feels compelled to point out that saying “linear time” is cheating. I really don’t care whether it “works in linear time” or not, but I can tell you that it is about two to three times faster than std::sort. We can still analyze the runtime and get a better idea of just how fast it is, but we need to be a bit more precise.

Let’s say we work on a computer with 2k bit words. Typical computers nowadays have k=5 or k=6, and the code above assumes k=5. We group the bits of a word in digits, each with 2w bits. In the example above w=3. We have b=22w buckets and d=2kw digits. The number of operations of the code above is proportional to n+(b+n)d=n+(22w+n)2kw. We can’t change how many numbers we must sort (n) or the machine on which the program runs (k). But we can change w. What is the optimum value? If we let the derivative with respect to w equal 0 we get 22w+wln2=2w+wn. That looks horrible. But it has one interesting feature: It doesn’t contain k, which means that the optimum number of bits per digit does not depend on the machine on which we do the sorting. It can be a 8 bit machine, a 32 bit machine, a 64 bit machine, or even a 8192 bit machine, we should still have the same number of bits per digit for a certain size of the list that we are sorting.

We now turn to approximations. It’s probably safe to assume 2ww and wn≫2w. We are left with 22wln2=wn, which is still horrible. Since the left side varies much faster than the right one we can replace the w on the right with a (yet unknown) suitable constant w′. We get that 2w varies as O(lgn). (Note that this is an upper bound.) If we plug this back into the running time formula we finally get O(n 2k/lgn). Here 2k is how many bits the machine has per word and lgn is how many bits we need to write down the length of the list we are sorting. So radix sort should be quite good when we have to sort lots of numbers. And indeed, as I said, it clearly beats std:sort.

Now, if you have to sort really lots of numbers, then you should probably look at funnelsort.

No comments:

Post a Comment

Note: (1) You need to have third-party cookies enabled in order to comment on Blogger. (2) Better to copy your comment before hitting publish/preview. Blogger sometimes eats comments on the first try, but the second works. Crazy Blogger.