Does Python have screen scraping libraries that offer JavaScript support?
I've been using pycurl for simple HTML requests, and Java's HtmlUnit for more complicated requests requiring JavaScript support.
Ideally I would like to be able to do everything from Python, but I haven't come across any libraries that would allow me to do it. Do they exist?
Here you go: http://scrapy.org/
You can try spidermonkey?
I have not found anything for this. I use a combination of BeautifulSoup and custom routines...
Selenium maybe? It lets you automate an actual browser (Firefox, IE, Safari) from Python (among other languages). It is meant for testing websites, but it seems it should be usable for scraping as well. (Disclaimer: I have never used it myself.)
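A rough sketch of what that could look like, assuming the `selenium` package is installed along with Firefox and its geckodriver. The names `fetch_rendered_html` and `driver_factory` are made up for illustration and are not part of Selenium. The idea is to let the browser load the page and run its scripts, then read back the resulting markup:

```python
def _default_firefox():
    # Imported lazily so this module still loads where Selenium is absent.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")  # run Firefox without opening a window
    return webdriver.Firefox(options=options)


def fetch_rendered_html(url, driver_factory=_default_firefox):
    """Load `url` in a browser and return the page source after scripts ran.

    `driver_factory` is a hypothetical hook: any callable returning an object
    with `get`, `page_source` and `quit` (as Selenium drivers have) will do.
    """
    driver = driver_factory()
    try:
        driver.get(url)            # returns once the page has loaded
        return driver.page_source  # markup as the browser currently sees it
    finally:
        driver.quit()              # always shut the browser down
```

Calling `fetch_rendered_html("http://example.com/")` would then hand back a string you can feed to whatever HTML parser you already use. Content that scripts add some time after the page loads may need an explicit wait before reading `page_source`.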
The Webscraping library wraps the PyQt4 WebView into a simple and easy-to-use API.
Here is a simple example to download a web page rendered by WebKit and extract the title element using XPath (taken from the URL above):
There are many options when dealing with static HTML, which the other responses cover. However, if you need JavaScript support and want to stay in Python, I recommend using WebKit to render the webpage (including the JavaScript) and then examining the resulting HTML. For example:
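The rendering half depends on which WebKit binding you have installed, so the sketch below covers only the second half: examining the HTML once it has been rendered. It assumes the rendered markup is already available as a string. `TitleParser` and `extract_title` are illustrative names, and only the standard library's `html.parser` is used:

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Keep text only while we are between <title> and </title>.
        if self._in_title:
            self.title_parts.append(data)


def extract_title(html):
    """Return the page title from an HTML string, or '' if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.title_parts).strip()
```

The same pattern extends to any element that the page's JavaScript fills in: render first, then parse the resulting string as if it were static HTML.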