All Input Is Evil - Check All Input
The root of most security issues is that some code trusts the input it gets, and that trust is misplaced. Buffer overflows are the typical example, but any false assumptions about any input can lead to security vulnerabilities.
Always check inputs. Every function, object and module should act as if the outside world is bent on their destruction. We'll worry about performance later.
List allowed values, reject all other. This is also called whitelisting, and is much more effective than blacklisting, which means exlucing bad input. The reason is that it is very easy to forget some bad inputs, and in some cases arbitrary input is possible so it may not even be possible to list all bad inputs.
No Ad Hoc URL Parsing
URLs are extremely tricky beasts.
Do not attempt to roll your own URL parser, and never use string operations to parse URL strings. Use @@@insert safe URL handing module here@@@ only in Python.
NOTE: urlparse that comes with Python is not safe/complete, for example it does not support username, password, port. I would guess it does not make a difference between a dot and no dot at the end of a hostname either.
URLs are much more complicated than most people think. The full URL syntax allows for some rather obscure and rarely used features, which can easily cause security problems. For example, URLs can have embedded usernames and passwords, and the host name can end in a dot (.).
URLs can come in several escaped forms. URLs must be stored internally in canonical form @@@which is what?@@@