Does anyone know where proxy sites get their proxy lists from? An example is here: http://hidemyass.com/free_proxy_lists.php
They get them by scanning. They scan IP ranges for ports that proxies commonly listen on, then test whether each open port actually accepts a connection using the protocol they're looking for (SOCKS5, for example).
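To make the "test the protocol" step concrete, here is a minimal sketch of a SOCKS5 check against a single host and port. It is not taken from any proxy site's actual tooling. It assumes the standard SOCKS5 greeting from RFC 1928 and only offers the "no authentication" method; the names `speaks_socks5` and `is_open_socks5_reply` are made up for illustration.

```python
import socket

# SOCKS5 greeting (RFC 1928): version 5, one method offered, method 0x00 (no auth).
SOCKS5_GREETING = b"\x05\x01\x00"


def is_open_socks5_reply(reply: bytes) -> bool:
    """True if the reply is a SOCKS5 method selection that accepts 'no auth'."""
    # The server answers with two bytes: version (0x05) and the chosen method.
    # 0xFF would mean "no acceptable method"; anything else is not what we offered.
    return reply == b"\x05\x00"


def speaks_socks5(host: str, port: int, timeout: float = 3.0) -> bool:
    """Hypothetical helper: connect, send the greeting, inspect the reply."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(SOCKS5_GREETING)
            reply = sock.recv(2)  # a short read is simply treated as a failure
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
    return is_open_socks5_reply(reply)
```

A call such as `speaks_socks5("127.0.0.1", 1080)` returns `True` only if something on that port completes the SOCKS5 handshake without asking for credentials. A scanner would presumably run a check like this for each candidate address and port, with a similar probe for each other protocol it cares about.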