A query can cause a program to fail because of bugs or various other issues. This means that a single query can take down an entire cluster of machines, which is not good for availability and response times, as it takes quite a while for thousands of machines to recover. Thus the Query of Death. New queries are always coming into the system and when you are always rolling out new software, it’s impossible to completely get rid of the problem.
via High Scalability – Strategy: Google Sends Canary Requests into the Data Mine. Google of course has a really solid solution to solving the Query of Death issue.