No. They try to understand what makes a good result and what makes a bad one, then tweak the algorithm to account for that. So there's no direct manual intervention (unless it's a really bad case, I guess), just indirect long-run effects.
There are a couple of threads here about this. People have seen eval.google.com show up in their logs. That's the super-sleuth back-end facility where these testers operate. A Dutch guy posted details on his blog. Search DP for the links.
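If you want to check your own logs for it, here's a rough Python sketch. It assumes your server writes the common "combined" access log format, with the referrer in the second-to-last quoted field. The function name `find_eval_hits` and the regex are my own illustration, not anything official, so adjust them to whatever your logs actually look like.

```python
import re

# Assumed layout (combined log format):
#   ... "REQUEST" STATUS BYTES "REFERRER" "USER-AGENT"
# Change this pattern if your server logs in a different format.
LOG_PATTERN = re.compile(
    r'"(?P<request>[^"]*)" \d{3} \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def find_eval_hits(lines, needle="eval.google.com"):
    """Return (referrer, request) pairs for lines whose referrer contains needle."""
    hits = []
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match and needle in match.group("referrer"):
            hits.append((match.group("referrer"), match.group("request")))
    return hits
```

To run it over a real file, open the log and pass the file object in, e.g. `find_eval_hits(open("access.log"))`, where `access.log` stands in for wherever your host keeps the log.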