r764 - in python-mechanize/branches/upstream/current: . examples mechanize mechanize.egg-info test test-tools

Jérémy Bobbio lunar at alioth.debian.org
Mon Apr 9 20:40:59 UTC 2007


Author: lunar
Date: 2007-04-09 20:40:55 +0000 (Mon, 09 Apr 2007)
New Revision: 764

Added:
   python-mechanize/branches/upstream/current/ez_setup.py
   python-mechanize/branches/upstream/current/mechanize.egg-info/dependency_links.txt
   python-mechanize/branches/upstream/current/mechanize/_beautifulsoup.py
   python-mechanize/branches/upstream/current/mechanize/_debug.py
   python-mechanize/branches/upstream/current/mechanize/_http.py
   python-mechanize/branches/upstream/current/mechanize/_response.py
   python-mechanize/branches/upstream/current/mechanize/_rfc3986.py
   python-mechanize/branches/upstream/current/mechanize/_seek.py
   python-mechanize/branches/upstream/current/mechanize/_upgrade.py
   python-mechanize/branches/upstream/current/setup.cfg
   python-mechanize/branches/upstream/current/test-tools/
   python-mechanize/branches/upstream/current/test-tools/doctest.py
   python-mechanize/branches/upstream/current/test-tools/linecache_copy.py
   python-mechanize/branches/upstream/current/test/test_browser.doctest
   python-mechanize/branches/upstream/current/test/test_browser.py
   python-mechanize/branches/upstream/current/test/test_forms.doctest
   python-mechanize/branches/upstream/current/test/test_history.doctest
   python-mechanize/branches/upstream/current/test/test_html.doctest
   python-mechanize/branches/upstream/current/test/test_html.py
   python-mechanize/branches/upstream/current/test/test_opener.py
   python-mechanize/branches/upstream/current/test/test_password_manager.doctest
   python-mechanize/branches/upstream/current/test/test_request.doctest
   python-mechanize/branches/upstream/current/test/test_response.doctest
   python-mechanize/branches/upstream/current/test/test_response.py
   python-mechanize/branches/upstream/current/test/test_rfc3986.doctest
   python-mechanize/branches/upstream/current/test/test_useragent.py
Removed:
   python-mechanize/branches/upstream/current/ez_setup/
   python-mechanize/branches/upstream/current/mechanize/_urllib2_support.py
   python-mechanize/branches/upstream/current/test/test_conncache.py
   python-mechanize/branches/upstream/current/test/test_mechanize.py
   python-mechanize/branches/upstream/current/test/test_misc.py
Modified:
   python-mechanize/branches/upstream/current/0.1-changes.txt
   python-mechanize/branches/upstream/current/ChangeLog.txt
   python-mechanize/branches/upstream/current/MANIFEST.in
   python-mechanize/branches/upstream/current/PKG-INFO
   python-mechanize/branches/upstream/current/README.html
   python-mechanize/branches/upstream/current/README.html.in
   python-mechanize/branches/upstream/current/README.txt
   python-mechanize/branches/upstream/current/doc.html
   python-mechanize/branches/upstream/current/doc.html.in
   python-mechanize/branches/upstream/current/examples/pypi.py
   python-mechanize/branches/upstream/current/functional_tests.py
   python-mechanize/branches/upstream/current/mechanize.egg-info/PKG-INFO
   python-mechanize/branches/upstream/current/mechanize.egg-info/SOURCES.txt
   python-mechanize/branches/upstream/current/mechanize.egg-info/requires.txt
   python-mechanize/branches/upstream/current/mechanize.egg-info/zip-safe
   python-mechanize/branches/upstream/current/mechanize/__init__.py
   python-mechanize/branches/upstream/current/mechanize/_auth.py
   python-mechanize/branches/upstream/current/mechanize/_clientcookie.py
   python-mechanize/branches/upstream/current/mechanize/_gzip.py
   python-mechanize/branches/upstream/current/mechanize/_headersutil.py
   python-mechanize/branches/upstream/current/mechanize/_html.py
   python-mechanize/branches/upstream/current/mechanize/_lwpcookiejar.py
   python-mechanize/branches/upstream/current/mechanize/_mechanize.py
   python-mechanize/branches/upstream/current/mechanize/_mozillacookiejar.py
   python-mechanize/branches/upstream/current/mechanize/_msiecookiejar.py
   python-mechanize/branches/upstream/current/mechanize/_opener.py
   python-mechanize/branches/upstream/current/mechanize/_request.py
   python-mechanize/branches/upstream/current/mechanize/_urllib2.py
   python-mechanize/branches/upstream/current/mechanize/_useragent.py
   python-mechanize/branches/upstream/current/mechanize/_util.py
   python-mechanize/branches/upstream/current/setup.py
   python-mechanize/branches/upstream/current/test.py
   python-mechanize/branches/upstream/current/test/test_cookies.py
   python-mechanize/branches/upstream/current/test/test_date.py
   python-mechanize/branches/upstream/current/test/test_urllib2.py
Log:
[svn-upgrade] Integrating new upstream version, python-mechanize (0.1.6b)

Modified: python-mechanize/branches/upstream/current/0.1-changes.txt
===================================================================
--- python-mechanize/branches/upstream/current/0.1-changes.txt	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/0.1-changes.txt	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,5 +1,9 @@
 Recent public API changes:
 
+- Since 0.1.2b beta release: Factory now takes EncodingFinder and
+  ResponseTypeFinder class instances instead of functions (since
+  closures don't play well with module pickle).
+
 - ClientCookie has been moved into the mechanize package and is no
   longer a separate package.  The ClientCookie interface is still
   supported, but all names must be imported from module mechanize
@@ -27,7 +31,7 @@
 - .forms() and .links() now both return iterators (in fact, generators),
   not sequences (not really an interface change: these were always
   documented to return iterables, but it will no doubt break some client
-  code).
+  code).  Use e.g. list(browser.forms()) if you want a list.
 
 - .links no longer raises LinkNotFoundError (was accidental -- only
   .click_link() / .find_link() should raise this).
@@ -48,7 +52,9 @@
 - mechanize.Browser.default_encoding is gone.
 
 - mechanize.Browser.set_seekable_responses() is gone (they're always
-  .seek()able).
+  .seek()able).  Browser and UserAgent now both inherit from
+  mechanize.UserAgentBase, and UserAgent is now there only to add the
+  single method .set_seekable_responses().
 
 - Added Browser.encoding().
 

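A minimal sketch of the .forms() / .links() change noted above (the URL is a
placeholder): both now return generators, so wrap them in list() when a
sequence is needed:

    import mechanize

    br = mechanize.Browser()
    br.open("http://www.example.com/")  # placeholder URL
    forms = list(br.forms())  # .forms() is a generator; list() makes a sequence
    links = list(br.links())  # likewise for .links()
    print len(forms), len(links)
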
Modified: python-mechanize/branches/upstream/current/ChangeLog.txt
===================================================================
--- python-mechanize/branches/upstream/current/ChangeLog.txt	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/ChangeLog.txt	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,7 +1,110 @@
 This isn't really in proper GNU ChangeLog format, it just happens to
 look that way.
 
-2006-05-06 John J Lee <jjl at pobox.com>
+2007-01-07 John J Lee <jjl at pobox.com>
+
+	* 0.1.6b release
+	* Add mechanize.ParseError class, document it as part of the
+	  mechanize.Factory interface, and raise it from all Factory
+	  implementations.  This is backwards-compatible, since the new
+	  exception derives from the old exceptions.
+	* Bug fix: Truncation when there is no full .read() before
+	  navigating to the next page, and an old response is read after
+	  navigation.  This happened e.g. with r = br.open();
+	  r.readline(); br.open(url); r.read(); br.back() .
+	* Bug fix: when .back() caused a reload, it was returning the old
+	  response, not the .reload()ed one.
+	* Bug fix: .back() was not returning a copy of the response, which
+	  presumably would cause seek position problems.
+	* Bug fix: base tag without href attribute would override document
+	  URL with a None value, causing a crash (thanks Nathan Eror).
+	* Fix .set_response() to close current response first.
+	* Fix non-idempotent behaviour of Factory.forms() / .links() .
+	  Previously, if for example you got a ParseError during execution
+	  of .forms(), you could call it again and have it not raise an
+	  exception, because it started out where it left off!
+	* Add a missing copy.copy() to RobustFactory .
+	* Fix redirection to 'URIs' that contain characters that are not
+	  allowed in URIs (thanks Riko Wichmann).  Also, Request
+	  constructor now logs a module logging warning about any such bad
+	  URIs.
+	* Add .global_form() method to Browser to support form controls
+	  whose HTML elements are not descendants of any FORM element.
+	* Add a new method .visit_response() .  This creates a new history
+	  entry from a response object, rather than just changing the
+	  current visited response.  This is useful e.g. when you want to
+	  use Browser features in a handler.
+	* Misc minor bug fixes.
+
+2006-10-25 John J Lee <jjl at pobox.com>
+
+	* 0.1.5b release: Update setuptools dependencies to depend on
+	  ClientForm>=0.2.5 (for an important bug fix affecting fragments
+	  in URLs).  There are no other changes in this release -- this
+	  release was done purely so that people upgrading to the latest
+	  version of mechanize will get the latest ClientForm too.
+
+2006-10-14 John J Lee <jjl at pobox.com>
+	* 0.1.4b release: (skipped a version deliberately for obscure
+	  reasons)
+	* Improved auth & proxies support.
+	* Follow RFC 3986.
+	* Add a .set_cookie() method to Browser .
+	* Add Browser.open_novisit() and Request.visit to allow fetching
+	  files without affecting Browser state.
+	* UserAgent and Browser are now subclasses of UserAgentBase.
+	  UserAgent's only role in life above what UserAgentBase does is
+	  to provide the .set_seekable_responses() method (it lives there
+	  because Browser depends on seekable responses, because that's
+	  how browser history is implemented).
+	* Bundle BeautifulSoup 2.1.1.  No more dependency pain!  Note that
+	  BeautifulSoup is, and always was, optional, and that mechanize
+	  will eventually switch to BeautifulSoup version 3, at which
+	  point it may well stop bundling BeautifulSoup.  Note also that
+	  the module is only used internally, and is not available as a
+	  public attribute of the package.  If you dare, you can import it
+	  ("from mechanize import _beautifulsoup"), but beware that it
+	  will go away later, and that the API of BeautifulSoup will
+	  change when the upgrade to 3 happens.  Also, BeautifulSoup
+	  support (mainly RobustFactory) is still a little experimental
+	  and buggy.
+	* Fix HTTP-EQUIV with no content attribute case (thanks Pratik
+	  Dam).
+	* Fix bug with quoted META Refresh URL (thanks Nilton Volpato).
+	* Fix crash with </base> tag (yajdbgr02 at sneakemail.com).
+	* Somebody found a server that (incorrectly) depends on HTTP
+	  header case, so follow the Title-Case convention.  Note that the
+	  Request headers interface(s), which were (somewhat oddly -- this
+	  is an inheritance from urllib2 that should really be fixed in a
+	  better way than it is currently) always case-sensitive still
+	  are; the only thing that changed is what actually eventually
+	  gets sent over the wire.
+	* Use mechanize (not urllib) to open robots.txt.  Don't consult
+	  RobotFileParser instance about non-HTTP URLs.
+	* Fix OpenerDirector.retrieve(), which was very broken (thanks
+	  Duncan Booth).
+	* Crash in a much more obvious way if trying to use OpenerDirector
+	  after .close() .
+	* .reload() on .back() if necessary (necessary iff response was
+	  not fully .read() on first .open()ing).
+	* Strip fragments before retrieving URLs (fixed
+	  Request.get_selector() to strip fragment).
+	* Fix catching HTTPError subclasses while still preserving all
+	  their response behaviour
+	* Correct over-enthusiastic documented guarantees of
+	  closeable_response .
+	* Fix assumption that httplib.HTTPMessage treats dict-style
+	  __setitem__ as append rather than set (where on earth did I get
+	  that from?).
+	* Expose History in mechanize/__init__.py (though interface is
+	  still experimental).
+	* Lots of other "internals" bugs fixed (thanks to reports /
+	  patches from Benji York especially, also Titus Brown, Duncan
+	  Booth, and me ;-), where I'm not 100% sure exactly when they
+	  were introduced, so not listing them here in detail.
+	* Numerous other minor fixes.
+	* Some code cleanup.
+
+2006-05-21 John J Lee <jjl at pobox.com>
 	* 0.1.2b release:
 	* mechanize now exports the whole urllib2 interface.
 	* Pull in bugfixed auth/proxy support code from Python 2.5.

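A minimal sketch (placeholder URLs) of the read-truncation sequence fixed in
0.1.6b above; .back() now reloads when the first response was not fully
.read(), and returns the reloaded copy rather than the stale one.  The 0.1.4b
.open_novisit() addition is shown too:

    import mechanize

    br = mechanize.Browser()
    r = br.open("http://www.example.com/")  # placeholder URL
    r.readline()                            # only a partial read
    br.open("http://www.example.com/next")  # navigate away
    r.read()        # formerly could return truncated data here
    r2 = br.back()  # reloads if needed and returns the fresh copy
    print r2.geturl()
    # 0.1.4b: fetch without affecting Browser state (history etc.)
    r3 = br.open_novisit("http://www.example.com/aside")
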
Modified: python-mechanize/branches/upstream/current/MANIFEST.in
===================================================================
--- python-mechanize/branches/upstream/current/MANIFEST.in	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/MANIFEST.in	2007-04-09 20:40:55 UTC (rev 764)
@@ -10,5 +10,7 @@
 include ChangeLog.txt
 include 0.1.0-changes.txt
 include *.py
+prune docs-in-progress
 recursive-include examples *.py
 recursive-include attic *.py
+recursive-include test-tools *.py

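The new test-tools directory above (a bundled doctest.py plus
linecache_copy.py, presumably so the doctest-based tests behave the same on
Python 2.3) backs the *.doctest files added under test/.  A minimal sketch of
running one of them with the stock doctest module (the project's own test.py
driver wires in the bundled copy instead; the path assumes a source checkout):

    import doctest

    # hypothetical direct invocation of one of the new test files
    doctest.testfile("test/test_response.doctest", module_relative=False)
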
Modified: python-mechanize/branches/upstream/current/PKG-INFO
===================================================================
--- python-mechanize/branches/upstream/current/PKG-INFO	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/PKG-INFO	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,12 +1,12 @@
 Metadata-Version: 1.0
 Name: mechanize
-Version: 0.1.2b
+Version: 0.1.6b
 Summary: Stateful programmatic web browsing.
 Home-page: http://wwwsearch.sourceforge.net/mechanize/
 Author: John J. Lee
 Author-email: jjl at pobox.com
 License: BSD
-Download-URL: http://wwwsearch.sourceforge.net/mechanize/src/mechanize-0.1.2b.tar.gz
+Download-URL: http://wwwsearch.sourceforge.net/mechanize/src/mechanize-0.1.6b.tar.gz
 Description: Stateful programmatic web browsing, after Andy Lester's Perl module
         WWW::Mechanize.
         
@@ -25,7 +25,7 @@
         
         
 Platform: any
-Classifier: Development Status :: 3 - Alpha
+Classifier: Development Status :: 4 - Beta
 Classifier: Intended Audience :: Developers
 Classifier: Intended Audience :: System Administrators
 Classifier: License :: OSI Approved :: BSD License

Modified: python-mechanize/branches/upstream/current/README.html
===================================================================
--- python-mechanize/branches/upstream/current/README.html	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/README.html	2007-04-09 20:40:55 UTC (rev 764)
@@ -5,7 +5,7 @@
 <head>
   <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
   <meta name="author" content="John J. Lee &lt;jjl at pobox.com&gt;">
-  <meta name="date" content="2006-05-21">
+  <meta name="date" content="2006-12-30">
   <meta name="keywords" content="Python,HTML,HTTP,browser,stateful,web,client,client-side,mechanize,cookie,form,META,HTTP-EQUIV,Refresh,ClientForm,ClientCookie,pullparser,WWW::Mechanize">
   <meta name="keywords" content="cookie,HTTP,Python,web,client,client-side,HTML,META,HTTP-EQUIV,Refresh">
   <title>mechanize</title>
@@ -31,16 +31,18 @@
 <ul>
 
   <li><code>mechanize.Browser</code> is a subclass of
-    <code>mechanize.UserAgent</code>, which is, in turn, a subclass of
+    <code>mechanize.UserAgentBase</code>, which is, in turn, a subclass of
     <code>urllib2.OpenerDirector</code> (in fact, of
     <code>mechanize.OpenerDirector</code>), so:
     <ul>
       <li>any URL can be opened, not just <code>http:</code>
-      <li><code>mechanize.UserAgent</code> offers easy dynamic configuration of
-      user-agent features like protocol, cookie, redirection and
-      <code>robots.txt</code> handling, without having to make a new
-      <code>OpenerDirector</code> each time, e.g.  by calling
-      <code>build_opener()</code>.
+
+      <li><code>mechanize.UserAgentBase</code> offers easy dynamic
+      configuration of user-agent features like protocol, cookie,
+      redirection and <code>robots.txt</code> handling, without having
+      to make a new <code>OpenerDirector</code> each time, e.g.  by
+      calling <code>build_opener()</code>.
+
     </ul>
   <li>Easy HTML form filling, using <a href="../ClientForm/">ClientForm</a>
     interface.
@@ -145,7 +147,6 @@
 <span class="pycmt"># Sometimes it's useful to process bad headers or bad HTML:
 </span>response = br.response()  <span class="pycmt"># this is a copy of response</span>
 headers = response.info()  <span class="pycmt"># currently, this is a mimetools.Message</span>
-<span class="pykw">del</span> headers[<span class="pystr">"Content-type"</span>]  <span class="pycmt"># get rid of (possibly multiple) existing headers</span>
 headers[<span class="pystr">"Content-type"</span>] = <span class="pystr">"text/html; charset=utf-8"</span>
 response.set_data(response.get_data().replace(<span class="pystr">"&lt;!---"</span>, <span class="pystr">"&lt;!--"</span>))
 br.set_response(response)</pre>
@@ -160,7 +161,7 @@
 
 
 
-so anything you would normally import from <code>urllib2</code> can
+<p>so anything you would normally import from <code>urllib2</code> can
 (and should, by preference, to insulate you from future changes) be
 imported from mechanize instead.  In many cases if you import an
 object from mechanize it will be the very same object you would get if
@@ -170,6 +171,28 @@
 way.
 
 
+<a name="useragentbase"></a>
+<h2>UserAgent vs UserAgentBase</h2>
+
+<p><code>mechanize.UserAgent</code> is a trivial subclass of
+<code>mechanize.UserAgentBase</code>, adding just one method,
+<code>.set_seekable_responses()</code>, which allows switching off the
+addition of the <code>.seek()</code> method to response objects:
+
+<pre>
+ua = mechanize.UserAgent()
+ua.set_seekable_responses(False)
+ua.set_handle_equiv(False)  <span class="pycmt"># handling HTTP-EQUIV would add the .seek() method too</span>
+response = ua.open(<span class="pystr">'http://wwwsearch.sourceforge.net/'</span>)
+<span class="pykw">assert</span> <span class="pykw">not</span> hasattr(response, <span class="pystr">"seek"</span>)
+<span class="pykw">print</span> response.read()</pre>
+
+
+<p>The reason for the extra class is that
+<code>mechanize.Browser</code> depends on seekable response objects
+(because response objects are used to implement the browser history).
+
+
 <a name="compatnotes"></a>
 <h2>Compatibility</h2>
 
@@ -263,35 +286,38 @@
 
 
 <a name="todo"></a>
-<h2>Todo</h2>
+<h2>To do</h2>
 
 <p>Contributions welcome!
 
-<h3>Specific to mechanize</h3>
+<p>The documentation to-do list has moved to the new "docs-in-progress"
+directory in SVN.
 
-<em>This is <strong>very</strong> roughly in order of priority</em>
+<p><em>This is <strong>very</strong> roughly in order of priority</em>
 
 <ul>
-  <li>Add .get_method() to Request.
   <li>Test <code>.any_response()</code> two handlers case: ordering.
   <li>Test referer bugs (frags and don't add in redirect unless orig
     req had Referer)
-  <li>Implement RFC 3986 URL absolutization.
+  <li>Remove use of urlparse from _auth.py.
   <li>Proper XHTML support!
-  <li>Make encoding_finder public, I guess (but probably improve it first).
-    (For example: support Mark Pilgrim's universal encoding detector?)
-  <li>Continue with the de-crufting enabled by requirement for Python 2.3.
   <li>Fix BeautifulSoup support to use a single BeautifulSoup instance
     per page.
   <li>Test BeautifulSoup support better / fix encoding issue.
+  <li>Support BeautifulSoup 3.
   <li>Add another History implementation or two and finalise interface.
   <li>History cache expiration.
-  <li>Investigate possible leak (see Balazs Ree's list posting).
-  <li>Add two-way links between BeautifulSoup & ClientForm object models.
-  <li>In 0.2: fork urllib2 &#8212; easier maintenance.
+  <li>Investigate possible leak further (see Balazs Ree's list posting).
+  <li>Make <code>EncodingFinder</code> public, I guess (but probably
+    improve it first).  (For example: support Mark Pilgrim's universal
+    encoding detector?)
+  <li>Add two-way links between BeautifulSoup &amp; ClientForm object
+    models.
   <li>In 0.2: switch to Python unicode strings everywhere appropriate
     (HTTP level should still use byte strings, of course).
-  <li>clean_url(): test browser behaviour.  I <em>think</em> this is correct...
+  <li><code>clean_url()</code>: test browser behaviour.  I <em>think</em>
+    this is correct...
+  <li>Use a nicer RFC 3986 join / split / unsplit implementation.
   <li>Figure out the Right Thing (if such a thing exists) for %-encoding.
   <li>How do IRIs fit into the world?
   <li>IDNA -- must read about security stuff first.
@@ -303,23 +329,16 @@
   <li>gzip transfer encoding (there's already a handler for this in
     mechanize, but it's poorly implemented ATM).
   <li>proxy.pac parsing (I don't think this needs JS interpretation)
-</ul>
+  <li>Topological sort for handlers, instead of .handler_order
+    attribute.  Ordering and other dependencies (where unavoidable)
+    should be defined separate from handlers themselves.  Add new
+    build_opener and deprecate the old one?  Actually, _useragent is
+    probably not far off what I'd have in mind (would just need a
+    method or two and a base class adding I think), and it's not a high
+    priority since I guess most people will just use the UserAgent and
+    Browser classes.
 
-<h3>Documentation</h3>
-<ul>
-  <li>Document means of processing response on ad-hoc basis with
-    .set_response() - e.g. to fix bad encoding in Content-type header or
-    clean up bad HTML.
-  <li>Add example to documentation showing can pass None as handle arg
-    to <code>mechanize.UserAgent</code> methods and then .add_handler()
-    if need to give it a specific handler instance to use for one of the
-    things it UserAgent already handles.  Hmm, think this contradicts docs
-    ATM!  And is it better to do this a different way...??
-  <li>Rearrange so have decent class-by-class docs,
-    a tutorial/background-info doc, and a howto/examples doc.
-  <li>Add more functional tests.
-  <li>Auth / proxies.
-</ul>
+ </ul>
 
 
 <a name="download"></a>
@@ -347,13 +366,11 @@
 EasyInstall is a one-liner for the common case, to be compared with the usual
 download-unpack-install cycle with <code>setup.py</code>.
 
-<p><strong>You need EasyInstall version 0.6a8 or newer.</strong>
-
 <h3>Using EasyInstall to download and install mechanize</h3>
 
 <ol>
   <li><a href="http://peak.telecommunity.com/DevCenter/EasyInstall#installing-easy-install">
-Install easy_install</a> (you need version 0.6a8 or newer)
+Install easy_install</a>
   <li><code>easy_install mechanize</code>
 </ol>
 
@@ -388,9 +405,7 @@
 <code>easy_install "projectname=dev"</code> for that project.
 
 <p>Note also that you can still carry on using a plain old SVN checkout as
-usual if you like (optionally in conjunction with <a
-href="./#develop"><code>setup.py develop</code></a> &#8211; this is
-particularly useful on Windows, since it functions rather like symlinks).
+usual if you like.
 
 <h3>Using setup.py from a .tar.gz, .zip or an SVN checkout to download and install mechanize</h3>
 
@@ -404,50 +419,19 @@
 
 <pre>python setup.py easy_install mechanize</pre>
 
-<a name="develop"></a>
-<h3>Using setup.py to install mechanize for development work on mechanize</h3>
 
-<p><strong>Note: this section is only useful for people who want to change
-mechanize</strong>: It is not useful to do this if all you want is to <a
-href="./#svnhead">keep up with SVN</a>.
-
-<p>For development of mechanize using EasyInstall (see the <a
-href="http://peak.telecommunity.com/DevCenter/setuptools">setuptools</a> docs
-for details), you have the option of using the <code>develop</code> distutils
-command.  This is particularly useful on Windows, since it functions rather
-like symlinks.  Get the mechanize source, then:
-
-<pre>python setup.py develop</pre>
-
-<p>Note that after every <code>svn update</code> on a
-<code>develop</code>-installed project, you should run <code>setup.py
-develop</code> to ensure that project's dependencies are updated if required.
-
-<p>Also note that, currently, if you also use the <code>develop</code>
-distutils command on the <em>dependencies</em> of mechanize (<em>viz</em>,
-ClientForm, and optionally BeautifulSoup) to keep up with SVN, you must run
-<code>setup.py develop</code> for each dependency of mechanize before running
-it for mechanize itself.  As a result, in this case it's probably simplest to
-just set up your <code>sys.path</code> manually rather than using
-<code>setup.py develop</code>.
-
-<p>One convenient way to get the latest source is:
-
-<pre>easy_install --editable --build-directory mybuilddir "mechanize==dev"</pre>
-
-
 <a name="source"></a>
 <h2>Download</h2>
 <p>All documentation (including this web page) is included in the distribution.
 
-<p>This is an alpha release: interfaces may change, and there will be bugs.
+<p>This is a beta release: there will be bugs.
 
 <p><em>Development release.</em>
 
 <ul>
 
-<li><a href="./src/mechanize-0.1.2b.tar.gz">mechanize-0.1.2b.tar.gz</a>
-<li><a href="./src/mechanize-0.1.2b.zip">mechanize-0.1.2b.zip</a>
+<li><a href="./src/mechanize-0.1.6b.tar.gz">mechanize-0.1.6b.tar.gz</a>
+<li><a href="./src/mechanize-0.1.6b.zip">mechanize-0.1.6b.zip</a>
 <li><a href="./src/ChangeLog.txt">Change Log</a> (included in distribution)
 <li><a href="./src/">Older versions.</a>
 </ul>
@@ -511,9 +495,9 @@
 
 <ul>
 
-  <li><a href="http://cheeseshop.python.org/pypi?:action=display&name=zope.testbrowser">
+  <li><a href="http://cheeseshop.python.org/pypi?:action=display&amp;name=zope.testbrowser">
     <code>zope.testbrowser</code></a> (or
-    <a href="http://cheeseshop.python.org/pypi?%3Aaction=display&name=ZopeTestbrowser">
+    <a href="http://cheeseshop.python.org/pypi?%3Aaction=display&amp;name=ZopeTestbrowser">
     <code>ZopeTestBrowser</code></a>, the standalone version).
   <li><a href="http://www.idyll.org/~t/www-tools/twill.html">twill</a>.
 </ul>
@@ -541,6 +525,13 @@
   <p>2.3 or above.
   <li>What else do I need?
   <p>mechanize depends on <a href="../ClientForm/">ClientForm</a>.
+  <li>Does mechanize depend on BeautifulSoup?
+     <p>No.  mechanize offers a few (still rather experimental) classes that make
+     use of BeautifulSoup, but these classes are not required to use mechanize.
+     mechanize bundles BeautifulSoup version 2, so that module is no longer
+     required.  A future version of mechanize will support BeautifulSoup
+     version 3, at which point mechanize will likely no longer bundle the
+     module.
   <p>The versions of those required modules are listed in the
      <code>setup.py</code> for mechanize (included with the download).  The
      dependencies are automatically fetched by <a
@@ -559,6 +550,11 @@
 <a name="usagefaq"></a>
 <h2>FAQs - usage</h2>
 <ul>
+  <li>I'm not getting the HTML page I expected to see.
+    <ul>
+      <li><a href="http://wwwsearch.sourceforge.net/mechanize/doc.html#debugging">Debugging tips</a>
+      <li><a href="http://wwwsearch.sourceforge.net/bits/GeneralFAQ.html">More tips</a>
+     </ul>
   <li>I'm <strong><em>sure</em></strong> this page is HTML, why does
      <code>mechanize.Browser</code> think otherwise?
 <pre>
@@ -576,7 +572,7 @@
 mailing list</a> rather than direct to me.
 
 <p><a href="mailto:jjl at pobox.com">John J. Lee</a>,
-May 2006.
+December 2006.
 
 <hr>
 

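The dropped del headers["Content-type"] line above tracks the ChangeLog fix
for httplib.HTTPMessage: dict-style __setitem__ on the headers object sets
(replaces) rather than appends, so a plain assignment is enough.  A minimal
sketch with a placeholder URL:

    import mechanize

    br = mechanize.Browser()
    br.open("http://www.example.com/")  # placeholder URL
    response = br.response()            # a copy of the current response
    headers = response.info()           # currently a mimetools.Message
    # assignment replaces any existing Content-type header
    headers["Content-type"] = "text/html; charset=utf-8"
    response.set_data(response.get_data().replace("<!---", "<!--"))
    br.set_response(response)           # install the fixed-up response
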
Modified: python-mechanize/branches/upstream/current/README.html.in
===================================================================
--- python-mechanize/branches/upstream/current/README.html.in	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/README.html.in	2007-04-09 20:40:55 UTC (rev 764)
@@ -6,7 +6,7 @@
 from colorize import colorize
 import time
 import release
-last_modified = release.svn_id_to_time("$Id: README.html.in 27559 2006-05-21 22:39:21Z jjlee $")
+last_modified = release.svn_id_to_time("$Id: README.html.in 36066 2006-12-30 21:00:39Z jjlee $")
 try:
     base
 except NameError:
@@ -42,16 +42,18 @@
 <ul>
 
   <li><code>mechanize.Browser</code> is a subclass of
-    <code>mechanize.UserAgent</code>, which is, in turn, a subclass of
+    <code>mechanize.UserAgentBase</code>, which is, in turn, a subclass of
     <code>urllib2.OpenerDirector</code> (in fact, of
     <code>mechanize.OpenerDirector</code>), so:
     <ul>
       <li>any URL can be opened, not just <code>http:</code>
-      <li><code>mechanize.UserAgent</code> offers easy dynamic configuration of
-      user-agent features like protocol, cookie, redirection and
-      <code>robots.txt</code> handling, without having to make a new
-      <code>OpenerDirector</code> each time, e.g.  by calling
-      <code>build_opener()</code>.
+
+      <li><code>mechanize.UserAgentBase</code> offers easy dynamic
+      configuration of user-agent features like protocol, cookie,
+      redirection and <code>robots.txt</code> handling, without having
+      to make a new <code>OpenerDirector</code> each time, e.g.  by
+      calling <code>build_opener()</code>.
+
     </ul>
   <li>Easy HTML form filling, using <a href="../ClientForm/">ClientForm</a>
     interface.
@@ -156,7 +158,6 @@
 # Sometimes it's useful to process bad headers or bad HTML:
 response = br.response()  # this is a copy of response
 headers = response.info()  # currently, this is a mimetools.Message
-del headers["Content-type"]  # get rid of (possibly multiple) existing headers
 headers["Content-type"] = "text/html; charset=utf-8"
 response.set_data(response.get_data().replace("<!---", "<!--"))
 br.set_response(response)
@@ -171,7 +172,7 @@
 """)}
 
 
-so anything you would normally import from <code>urllib2</code> can
+<p>so anything you would normally import from <code>urllib2</code> can
 (and should, by preference, to insulate you from future changes) be
 imported from mechanize instead.  In many cases if you import an
 object from mechanize it will be the very same object you would get if
@@ -181,6 +182,28 @@
 way.
 
 
+<a name="useragentbase"></a>
+<h2>UserAgent vs UserAgentBase</h2>
+
+<p><code>mechanize.UserAgent</code> is a trivial subclass of
+<code>mechanize.UserAgentBase</code>, adding just one method,
+<code>.set_seekable_responses()</code>, which allows switching off the
+addition of the <code>.seek()</code> method to response objects:
+
+@{colorize("""
+ua = mechanize.UserAgent()
+ua.set_seekable_responses(False)
+ua.set_handle_equiv(False)  # handling HTTP-EQUIV would add the .seek() method too
+response = ua.open('http://wwwsearch.sourceforge.net/')
+assert not hasattr(response, "seek")
+print response.read()
+""")}
+
+<p>The reason for the extra class is that
+<code>mechanize.Browser</code> depends on seekable response objects
+(because response objects are used to implement the browser history).
+
+
 <a name="compatnotes"></a>
 <h2>Compatibility</h2>
 
@@ -274,35 +297,38 @@
 
 
 <a name="todo"></a>
-<h2>Todo</h2>
+<h2>To do</h2>
 
 <p>Contributions welcome!
 
-<h3>Specific to mechanize</h3>
+<p>The documentation to-do list has moved to the new "docs-in-progress"
+directory in SVN.
 
-<em>This is <strong>very</strong> roughly in order of priority</em>
+<p><em>This is <strong>very</strong> roughly in order of priority</em>
 
 <ul>
-  <li>Add .get_method() to Request.
   <li>Test <code>.any_response()</code> two handlers case: ordering.
   <li>Test referer bugs (frags and don't add in redirect unless orig
     req had Referer)
-  <li>Implement RFC 3986 URL absolutization.
+  <li>Remove use of urlparse from _auth.py.
   <li>Proper XHTML support!
-  <li>Make encoding_finder public, I guess (but probably improve it first).
-    (For example: support Mark Pilgrim's universal encoding detector?)
-  <li>Continue with the de-crufting enabled by requirement for Python 2.3.
   <li>Fix BeautifulSoup support to use a single BeautifulSoup instance
     per page.
   <li>Test BeautifulSoup support better / fix encoding issue.
+  <li>Support BeautifulSoup 3.
   <li>Add another History implementation or two and finalise interface.
   <li>History cache expiration.
-  <li>Investigate possible leak (see Balazs Ree's list posting).
-  <li>Add two-way links between BeautifulSoup & ClientForm object models.
-  <li>In 0.2: fork urllib2 &#8212; easier maintenance.
+  <li>Investigate possible leak further (see Balazs Ree's list posting).
+  <li>Make <code>EncodingFinder</code> public, I guess (but probably
+    improve it first).  (For example: support Mark Pilgrim's universal
+    encoding detector?)
+  <li>Add two-way links between BeautifulSoup &amp; ClientForm object
+    models.
   <li>In 0.2: switch to Python unicode strings everywhere appropriate
     (HTTP level should still use byte strings, of course).
-  <li>clean_url(): test browser behaviour.  I <em>think</em> this is correct...
+  <li><code>clean_url()</code>: test browser behaviour.  I <em>think</em>
+    this is correct...
+  <li>Use a nicer RFC 3986 join / split / unsplit implementation.
   <li>Figure out the Right Thing (if such a thing exists) for %-encoding.
   <li>How do IRIs fit into the world?
   <li>IDNA -- must read about security stuff first.
@@ -314,23 +340,16 @@
   <li>gzip transfer encoding (there's already a handler for this in
     mechanize, but it's poorly implemented ATM).
   <li>proxy.pac parsing (I don't think this needs JS interpretation)
-</ul>
+  <li>Topological sort for handlers, instead of .handler_order
+    attribute.  Ordering and other dependencies (where unavoidable)
+    should be defined separate from handlers themselves.  Add new
+    build_opener and deprecate the old one?  Actually, _useragent is
+    probably not far off what I'd have in mind (would just need a
+    method or two and a base class adding I think), and it's not a high
+    priority since I guess most people will just use the UserAgent and
+    Browser classes.
 
-<h3>Documentation</h3>
-<ul>
-  <li>Document means of processing response on ad-hoc basis with
-    .set_response() - e.g. to fix bad encoding in Content-type header or
-    clean up bad HTML.
-  <li>Add example to documentation showing can pass None as handle arg
-    to <code>mechanize.UserAgent</code> methods and then .add_handler()
-    if need to give it a specific handler instance to use for one of the
-    things it UserAgent already handles.  Hmm, think this contradicts docs
-    ATM!  And is it better to do this a different way...??
-  <li>Rearrange so have decent class-by-class docs,
-    a tutorial/background-info doc, and a howto/examples doc.
-  <li>Add more functional tests.
-  <li>Auth / proxies.
-</ul>
+ </ul>
 
 
 <a name="download"></a>
@@ -358,13 +377,11 @@
 EasyInstall is a one-liner for the common case, to be compared with the usual
 download-unpack-install cycle with <code>setup.py</code>.
 
-<p><strong>You need EasyInstall version 0.6a8 or newer.</strong>
-
 <h3>Using EasyInstall to download and install mechanize</h3>
 
 <ol>
   <li><a href="http://peak.telecommunity.com/DevCenter/EasyInstall#installing-easy-install">
-Install easy_install</a> (you need version 0.6a8 or newer)
+Install easy_install</a>
   <li><code>easy_install mechanize</code>
 </ol>
 
@@ -399,9 +416,7 @@
 <code>easy_install "projectname=dev"</code> for that project.
 
 <p>Note also that you can still carry on using a plain old SVN checkout as
-usual if you like (optionally in conjunction with <a
-href="./#develop"><code>setup.py develop</code></a> &#8211; this is
-particularly useful on Windows, since it functions rather like symlinks).
+usual if you like.
 
 <h3>Using setup.py from a .tar.gz, .zip or an SVN checkout to download and install mechanize</h3>
 
@@ -415,48 +430,17 @@
 
 <pre>python setup.py easy_install mechanize</pre>
 
-<a name="develop"></a>
-<h3>Using setup.py to install mechanize for development work on mechanize</h3>
 
-<p><strong>Note: this section is only useful for people who want to change
-mechanize</strong>: It is not useful to do this if all you want is to <a
-href="./#svnhead">keep up with SVN</a>.
-
-<p>For development of mechanize using EasyInstall (see the <a
-href="http://peak.telecommunity.com/DevCenter/setuptools">setuptools</a> docs
-for details), you have the option of using the <code>develop</code> distutils
-command.  This is particularly useful on Windows, since it functions rather
-like symlinks.  Get the mechanize source, then:
-
-<pre>python setup.py develop</pre>
-
-<p>Note that after every <code>svn update</code> on a
-<code>develop</code>-installed project, you should run <code>setup.py
-develop</code> to ensure that project's dependencies are updated if required.
-
-<p>Also note that, currently, if you also use the <code>develop</code>
-distutils command on the <em>dependencies</em> of mechanize (<em>viz</em>,
-ClientForm, and optionally BeautifulSoup) to keep up with SVN, you must run
-<code>setup.py develop</code> for each dependency of mechanize before running
-it for mechanize itself.  As a result, in this case it's probably simplest to
-just set up your <code>sys.path</code> manually rather than using
-<code>setup.py develop</code>.
-
-<p>One convenient way to get the latest source is:
-
-<pre>easy_install --editable --build-directory mybuilddir "mechanize==dev"</pre>
-
-
 <a name="source"></a>
 <h2>Download</h2>
 <p>All documentation (including this web page) is included in the distribution.
 
-<p>This is an alpha release: interfaces may change, and there will be bugs.
+<p>This is a beta release: there will be bugs.
 
 <p><em>Development release.</em>
 
 <ul>
-@{version = "0.1.2b"}
+@{version = "0.1.6b"}
 <li><a href="./src/mechanize-@(version).tar.gz">mechanize-@(version).tar.gz</a>
 <li><a href="./src/mechanize-@(version).zip">mechanize-@(version).zip</a>
 <li><a href="./src/ChangeLog.txt">Change Log</a> (included in distribution)
@@ -522,9 +506,9 @@
 
 <ul>
 
-  <li><a href="http://cheeseshop.python.org/pypi?:action=display&name=zope.testbrowser">
+  <li><a href="http://cheeseshop.python.org/pypi?:action=display&amp;name=zope.testbrowser">
     <code>zope.testbrowser</code></a> (or
-    <a href="http://cheeseshop.python.org/pypi?%3Aaction=display&name=ZopeTestbrowser">
+    <a href="http://cheeseshop.python.org/pypi?%3Aaction=display&amp;name=ZopeTestbrowser">
     <code>ZopeTestBrowser</code></a>, the standalone version).
   <li><a href="http://www.idyll.org/~t/www-tools/twill.html">twill</a>.
 </ul>
@@ -552,6 +536,13 @@
   <p>2.3 or above.
   <li>What else do I need?
   <p>mechanize depends on <a href="../ClientForm/">ClientForm</a>.
+  <li>Does mechanize depend on BeautifulSoup?
+     <p>No.  mechanize offers a few (still rather experimental) classes that make
+     use of BeautifulSoup, but these classes are not required to use mechanize.
+     mechanize bundles BeautifulSoup version 2, so that module is no longer
+     required.  A future version of mechanize will support BeautifulSoup
+     version 3, at which point mechanize will likely no longer bundle the
+     module.
   <p>The versions of those required modules are listed in the
      <code>setup.py</code> for mechanize (included with the download).  The
      dependencies are automatically fetched by <a
@@ -570,6 +561,11 @@
 <a name="usagefaq"></a>
 <h2>FAQs - usage</h2>
 <ul>
+  <li>I'm not getting the HTML page I expected to see.
+    <ul>
+      <li><a href="http://wwwsearch.sourceforge.net/mechanize/doc.html#debugging">Debugging tips</a>
+      <li><a href="http://wwwsearch.sourceforge.net/bits/GeneralFAQ.html">More tips</a>
+     </ul>
   <li>I'm <strong><em>sure</em></strong> this page is HTML, why does
      <code>mechanize.Browser</code> think otherwise?
 @{colorize("""

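For the new "I'm not getting the HTML page I expected to see" FAQ entry above,
the debugging switches from the examples section combine into a minimal sketch
(placeholder URL):

    import logging
    import sys

    import mechanize

    br = mechanize.Browser()
    br.set_debug_redirects(True)  # log HTTP redirects and Refreshes
    br.set_debug_responses(True)  # log response bodies
    br.set_debug_http(True)       # print HTTP headers
    # route the "mechanize" logger to stdout so the output is visible
    logger = logging.getLogger("mechanize")
    logger.addHandler(logging.StreamHandler(sys.stdout))
    logger.setLevel(logging.INFO)
    br.open("http://www.example.com/")  # placeholder URL
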
Modified: python-mechanize/branches/upstream/current/README.txt
===================================================================
--- python-mechanize/branches/upstream/current/README.txt	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/README.txt	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,536 +1,615 @@
-   [1]SourceForge.net Logo
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
+        "http://www.w3.org/TR/html4/strict.dtd">
 
-                                   mechanize
+<html>
+<head>
+  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
+  <meta name="author" content="John J. Lee &lt;jjl at pobox.com&gt;">
+  <meta name="date" content="2006-12-30">
+  <meta name="keywords" content="Python,HTML,HTTP,browser,stateful,web,client,client-side,mechanize,cookie,form,META,HTTP-EQUIV,Refresh,ClientForm,ClientCookie,pullparser,WWW::Mechanize">
+  <meta name="keywords" content="cookie,HTTP,Python,web,client,client-side,HTML,META,HTTP-EQUIV,Refresh">
+  <title>mechanize</title>
+  <style type="text/css" media="screen">@import "../styles/style.css";</style>
+  
+</head>
+<body>
 
-   Stateful programmatic web browsing in Python, after Andy Lester's Perl
-   module [2]WWW::Mechanize .
-     * mechanize.Browser is a subclass of mechanize.UserAgent, which is,
-       in turn, a subclass of urllib2.OpenerDirector (in fact, of
-       mechanize.OpenerDirector), so:
-          + any URL can be opened, not just http:
-          + mechanize.UserAgent offers easy dynamic configuration of
-            user-agent features like protocol, cookie, redirection and
-            robots.txt handling, without having to make a new
-            OpenerDirector each time, e.g. by calling build_opener().
-     * Easy HTML form filling, using [3]ClientForm interface.
-     * Convenient link parsing and following.
-     * Browser history (.back() and .reload() methods).
-     * The Referer HTTP header is added properly (optional).
-     * Automatic observance of [4]robots.txt.
-     * Automatic handling of HTTP-Equiv and Refresh.
+<div id="sf"><a href="http://sourceforge.net">
+<img src="http://sourceforge.net/sflogo.php?group_id=48205&amp;type=2"
+ width="125" height="37" alt="SourceForge.net Logo"></a></div>
+<!--<img src="../images/sflogo.png"-->
 
-Examples
+<h1>mechanize</h1>
 
-   This documentation is in need of reorganisation and extension!
+<div id="Content">
 
-   The two below are just to give the gist. There are also some [5]actual
-   working examples.
-import re
-from mechanize import Browser
+<p>Stateful programmatic web browsing in Python, after Andy Lester's Perl
+module <a
+href="http://search.cpan.org/dist/WWW-Mechanize/"><code>WWW::Mechanize</code>
+</a>.
 
+<ul>
+
+  <li><code>mechanize.Browser</code> is a subclass of
+    <code>mechanize.UserAgentBase</code>, which is, in turn, a subclass of
+    <code>urllib2.OpenerDirector</code> (in fact, of
+    <code>mechanize.OpenerDirector</code>), so:
+    <ul>
+      <li>any URL can be opened, not just <code>http:</code>
+
+      <li><code>mechanize.UserAgentBase</code> offers easy dynamic
+      configuration of user-agent features like protocol, cookie,
+      redirection and <code>robots.txt</code> handling, without having
+      to make a new <code>OpenerDirector</code> each time, e.g.  by
+      calling <code>build_opener()</code>.
+
+    </ul>
+  <li>Easy HTML form filling, using <a href="../ClientForm/">ClientForm</a>
+    interface.
+  <li>Convenient link parsing and following.
+  <li>Browser history (<code>.back()</code> and <code>.reload()</code>
+    methods).
+  <li>The <code>Referer</code> HTTP header is added properly (optional).
+  <li>Automatic observance of <a
+    href="http://www.robotstxt.org/wc/norobots.html">
+    <code>robots.txt</code></a>.
+  <li>Automatic handling of HTTP-Equiv and Refresh.
+</ul>
+
+
+<a name="examples"></a>
+<h2>Examples</h2>
+
+<p class="docwarning">This documentation is in need of reorganisation and
+extension!</p>
+
+<p>The two below are just to give the gist.  There are also some <a
+href="./#tests">actual working examples</a>.
+
+<pre>
+<span class="pykw">import</span> re
+<span class="pykw">from</span> mechanize <span class="pykw">import</span> Browser
+
 br = Browser()
-br.open("http://www.example.com/")
-# follow second link with element text matching regular expression
-response1 = br.follow_link(text_regex=r"cheese\s*shop", nr=1)
-assert br.viewing_html()
-print br.title()
-print response1.geturl()
-print response1.info()  # headers
-print response1.read()  # body
-response1.close()  # (shown for clarity; in fact Browser does this for you)
+br.open(<span class="pystr">"http://www.example.com/"</span>)
+<span class="pycmt"># follow second link with element text matching regular expression
+</span>response1 = br.follow_link(text_regex=<span class="pystr">r"cheese\s*shop"</span>, nr=1)
+<span class="pykw">assert</span> br.viewing_html()
+<span class="pykw">print</span> br.title()
+<span class="pykw">print</span> response1.geturl()
+<span class="pykw">print</span> response1.info()  <span class="pycmt"># headers</span>
+<span class="pykw">print</span> response1.read()  <span class="pycmt"># body</span>
+response1.close()  <span class="pycmt"># (shown for clarity; in fact Browser does this for you)</span>
 
-br.select_form(name="order")
-# Browser passes through unknown attributes (including methods)
-# to the selected HTMLForm (from ClientForm).
-br["cheeses"] = ["mozzarella", "caerphilly"]  # (the method here is __setitem__
-)
-response2 = br.submit()  # submit current form
+br.select_form(name=<span class="pystr">"order"</span>)
+<span class="pycmt"># Browser passes through unknown attributes (including methods)
+</span><span class="pycmt"># to the selected HTMLForm (from ClientForm).
+</span>br[<span class="pystr">"cheeses"</span>] = [<span class="pystr">"mozzarella"</span>, <span class="pystr">"caerphilly"</span>]  <span class="pycmt"># (the method here is __setitem__)</span>
+response2 = br.submit()  <span class="pycmt"># submit current form</span>
 
-# print currently selected form (don't call .submit() on this, use br.submit())
-print br.form
+<span class="pycmt"># print currently selected form (don't call .submit() on this, use br.submit())
+</span><span class="pykw">print</span> br.form
 
-response3 = br.back()  # back to cheese shop (same data as response1)
-# the history mechanism returns cached response objects
-# we can still use the response, even though we closed it:
-response3.seek(0)
+response3 = br.back()  <span class="pycmt"># back to cheese shop (same data as response1)</span>
+<span class="pycmt"># the history mechanism returns cached response objects
+</span><span class="pycmt"># we can still use the response, even though we closed it:
+</span>response3.seek(0)
 response3.read()
-response4 = br.reload()  # fetches from server
+response4 = br.reload()  <span class="pycmt"># fetches from server</span>
 
-for form in br.forms():
-    print form
-# .links() optionally accepts the keyword args of .follow_/.find_link()
-for link in br.links(url_regex="python.org"):
-    print link
-    br.follow_link(link)  # takes EITHER Link instance OR keyword args
-    br.back()
+<span class="pykw">for</span> form <span class="pykw">in</span> br.forms():
+    <span class="pykw">print</span> form
+<span class="pycmt"># .links() optionally accepts the keyword args of .follow_/.find_link()
+</span><span class="pykw">for</span> link <span class="pykw">in</span> br.links(url_regex=<span class="pystr">"python.org"</span>):
+    <span class="pykw">print</span> link
+    br.follow_link(link)  <span class="pycmt"># takes EITHER Link instance OR keyword args</span>
+    br.back()</pre>
 
-   You may control the browser's policy by using the methods of
-   mechanize.Browser's base class, mechanize.UserAgent. For example:
+
+<p>You may control the browser's policy by using the methods of
+<code>mechanize.Browser</code>'s base class, <code>mechanize.UserAgent</code>.
+For example:
+
+<pre>
 br = Browser()
-# Explicitly configure proxies (Browser will attempt to set good defaults).
-# Note the userinfo ("joe:password@") and port number (":3128") are optional.
-br.set_proxies({"http": "joe:password@myproxy.example.com:3128",
-                "ftp": "proxy.example.com",
+<span class="pycmt"># Explicitly configure proxies (Browser will attempt to set good defaults).
+</span><span class="pycmt"># Note the userinfo ("joe:password@") and port number (":3128") are optional.
+</span>br.set_proxies({<span class="pystr">"http"</span>: <span class="pystr">"joe:password@myproxy.example.com:3128"</span>,
+                <span class="pystr">"ftp"</span>: <span class="pystr">"proxy.example.com"</span>,
                 })
-# Add HTTP Basic/Digest auth username and password for HTTP proxy access.
-# (equivalent to using "joe:password@..." form above)
-br.add_proxy_password("joe", "password")
-# Add HTTP Basic/Digest auth username and password for website access.
-br.add_password("http://example.com/protected/", "joe", "password")
-# Don't handle HTTP-EQUIV headers (HTTP headers embedded in HTML).
-br.set_handle_equiv(False)
-# Ignore robots.txt.  Do not do this without thought and consideration.
-br.set_handle_robots(False)
-# Don't handle cookies
-br.set_cookiejar()
-# Supply your own mechanize.CookieJar (NOTE: cookie handling is ON by
-# default: no need to do this unless you have some reason to use a
-# particular cookiejar)
-br.set_cookiejar(cj)
-# Log information about HTTP redirects and Refreshes.
-br.set_debug_redirects(True)
-# Log HTTP response bodies (ie. the HTML, most of the time).
-br.set_debug_responses(True)
-# Print HTTP headers.
-br.set_debug_http(True)
+<span class="pycmt"># Add HTTP Basic/Digest auth username and password for HTTP proxy access.
+</span><span class="pycmt"># (equivalent to using "joe:password@..." form above)
+</span>br.add_proxy_password(<span class="pystr">"joe"</span>, <span class="pystr">"password"</span>)
+<span class="pycmt"># Add HTTP Basic/Digest auth username and password for website access.
+</span>br.add_password(<span class="pystr">"http://example.com/protected/"</span>, <span class="pystr">"joe"</span>, <span class="pystr">"password"</span>)
+<span class="pycmt"># Don't handle HTTP-EQUIV headers (HTTP headers embedded in HTML).
+</span>br.set_handle_equiv(False)
+<span class="pycmt"># Ignore robots.txt.  Do not do this without thought and consideration.
+</span>br.set_handle_robots(False)
+<span class="pycmt"># Don't handle cookies
+</span>br.set_cookiejar()
+<span class="pycmt"># Supply your own mechanize.CookieJar (NOTE: cookie handling is ON by
+</span><span class="pycmt"># default: no need to do this unless you have some reason to use a
+</span><span class="pycmt"># particular cookiejar)
+</span>br.set_cookiejar(cj)
+<span class="pycmt"># Log information about HTTP redirects and Refreshes.
+</span>br.set_debug_redirects(True)
+<span class="pycmt"># Log HTTP response bodies (ie. the HTML, most of the time).
+</span>br.set_debug_responses(True)
+<span class="pycmt"># Print HTTP headers.
+</span>br.set_debug_http(True)
 
-# To make sure you're seeing all debug output:
-logger = logging.getLogger("mechanize")
+<span class="pycmt"># To make sure you're seeing all debug output:
+</span>logger = logging.getLogger(<span class="pystr">"mechanize"</span>)
 logger.addHandler(logging.StreamHandler(sys.stdout))
 logger.setLevel(logging.INFO)
 
-# Sometimes it's useful to process bad headers or bad HTML:
-response = br.response()  # this is a copy of response
-headers = response.info()  # currently, this is a mimetools.Message
-del headers["Content-type"]  # get rid of (possibly multiple) existing headers
-headers["Content-type"] = "text/html; charset=utf-8"
-response.set_data(response.get_data().replace("<!---", "<!--"))
-br.set_response(response)
+<span class="pycmt"># Sometimes it's useful to process bad headers or bad HTML:
+</span>response = br.response()  <span class="pycmt"># this is a copy of response</span>
+headers = response.info()  <span class="pycmt"># currently, this is a mimetools.Message</span>
+headers[<span class="pystr">"Content-type"</span>] = <span class="pystr">"text/html; charset=utf-8"</span>
+response.set_data(response.get_data().replace(<span class="pystr">"&lt;!---"</span>, <span class="pystr">"&lt;!--"</span>))
+br.set_response(response)</pre>
 
-   mechanize exports the complete interface of urllib2:
-import mechanize
-response = mechanize.urlopen("http://www.example.com/")
-print response.read()
 
-   so anything you would normally import from urllib2 can (and should, by
-   preference, to insulate you from future changes) be imported from
-   mechanize instead. In many cases if you import an object from
-   mechanize it will be the very same object you would get if you
-   imported from urllib2. In many other cases, though, the implementation
-   comes from mechanize, either because bug fixes have been applied or
-   the functionality of urllib2 has been extended in some way.
+<p>mechanize exports the complete interface of <code>urllib2</code>:
 
-Compatibility
+<pre>
+<span class="pykw">import</span> mechanize
+response = mechanize.urlopen(<span class="pystr">"http://www.example.com/"</span>)
+<span class="pykw">print</span> response.read()</pre>
 
-   These notes explain the relationship between mechanize, ClientCookie,
-   cookielib and urllib2, and which to use when. If you're just using
-   mechanize, and not any of those other libraries, you can ignore this
-   section.
-    1. mechanize works with Python 2.3, Python 2.4 and Python 2.5.
-    2. ClientCookie is no longer maintained as a separate package. The
-       code is now part of mechanize, and its interface is now exported
-       through module mechanize (since mechanize 0.1.0). Old code can
-       simply be changed to import mechanize as ClientCookie and should
-       continue to work.
-    3. The cookie handling parts of mechanize are in Python 2.4 standard
-       library as module cookielib and extensions to module urllib2.
 
-   IMPORTANT: The following are the ONLY cases where mechanize and
-   urllib2 code are intended to work together. For all other code, use
-   mechanize exclusively: do NOT mix use of mechanize and urllib2!
-    1. Handler classes that are missing from 2.4's urllib2 (e.g.
-       HTTPRefreshProcessor, HTTPEquivProcessor, HTTPRobotRulesProcessor)
-       may be used with the urllib2 of Python 2.4 or newer. There are not
-       currently any functional tests for this in mechanize, however, so
-       this feature may be broken.
-    2. If you want to use mechanize.RefreshProcessor with Python >= 2.4's
-       urllib2, you must also use mechanize.HTTPRedirectHandler.
-    3. mechanize.HTTPRefererProcessor requires special support from
-       mechanize.Browser, so cannot be used with vanilla urllib2.
-    4. mechanize.HTTPRequestUpgradeProcessor and
-       mechanize.ResponseUpgradeProcessor are not useful outside of
-       mechanize.
-    5. Request and response objects from code based on urllib2 work with
-       mechanize, and vice-versa.
-    6. The classes and functions exported by mechanize in its public
-       interface that come straight from urllib2 (e.g. FTPHandler, at the
-       time of writing) do work with mechanize (duh ;-). Exactly which of
-       these classes and functions come straight from urllib2 without
-       extension or modification will change over time, though, so don't
-       rely on it; instead, just import everything you need from
-       mechanize, never from urllib2. The exception is usage as described
-       in the first item in this list, which is explicitly OK (though not
-       well tested ATM), subject to the other restrictions in the list
-       above .
 
-Documentation
+<p>so anything you would normally import from <code>urllib2</code> can
+(and should, by preference, to insulate you from future changes) be
+imported from mechanize instead.  In many cases if you import an
+object from mechanize it will be the very same object you would get if
+you imported from urllib2.  In many other cases, though, the
+implementation comes from mechanize, either because bug fixes have
+been applied or the functionality of urllib2 has been extended in some
+way.
 
-   Full documentation is in the docstrings.
 
-   The documentation in the web pages is in need of reorganisation at the
-   moment, after the merge of ClientCookie into mechanize.
+<a name="useragentbase"></a>
+<h2>UserAgent vs UserAgentBase</h2>
 
-Credits
+<p><code>mechanize.UserAgent</code> is a trivial subclass of
+<code>mechanize.UserAgentBase</code>, adding just one method,
+<code>.set_seekable_responses()</code>, which allows switching off the
+addition of the <code>.seek()</code> method to response objects:
 
-   Thanks to all the too-numerous-to-list people who reported bugs and
-   provided patches. Also thanks to Ian Bicking, for persuading me that a
-   UserAgent class would be useful, and to Ronald Tschalar for advice on
-   Netscape cookies.
+<pre>
+ua = mechanize.UserAgent()
+ua.set_seekable_responses(False)
+ua.set_handle_equiv(False)  <span class="pycmt"># handling HTTP-EQUIV would add the .seek() method too</span>
+response = ua.open(<span class="pystr">'http://wwwsearch.sourceforge.net/'</span>)
+<span class="pykw">assert</span> <span class="pykw">not</span> hasattr(response, <span class="pystr">"seek"</span>)
+<span class="pykw">print</span> response.read()</pre>
 
-   A lot of credit must go to Gisle Aas, who wrote libwww-perl, from
-   which large parts of mechanize originally derived, and Andy Lester for
-   the original, [6]WWW::Mechanize . Finally, thanks to the
-   (coincidentally-named) Johnny Lee for the MSIE CookieJar Perl code
-   from which mechanize's support for that is derived.
 
-Todo
+<p>The reason for the extra class is that
+<code>mechanize.Browser</code> depends on seekable response objects
+(because response objects are used to implement the browser history).
 
-   Contributions welcome!
 
-Specific to mechanize
+<a name="compatnotes"></a>
+<h2>Compatibility</h2>
 
-   This is very roughly in order of priority
-     * Add .get_method() to Request.
-     * Test .any_response() two handlers case: ordering.
-     * Test referer bugs (frags and don't add in redirect unless orig req
-       had Referer)
-     * Implement RFC 3986 URL absolutization.
-     * Proper XHTML support!
-     * Make encoding_finder public, I guess (but probably improve it
-       first). (For example: support Mark Pilgrim's universal encoding
-       detector?)
-     * Continue with the de-crufting enabled by requirement for Python
-       2.3.
-     * Fix BeautifulSoup support to use a single BeautifulSoup instance
-       per page.
-     * Test BeautifulSoup support better / fix encoding issue.
-     * Add another History implementation or two and finalise interface.
-     * History cache expiration.
-     * Investigate possible leak (see Balazs Ree's list posting).
-     * Add two-way links between BeautifulSoup & ClientForm object
-       models.
-     * In 0.2: fork urllib2 -- easier maintenance.
-     * In 0.2: switch to Python unicode strings everywhere appropriate
-       (HTTP level should still use byte strings, of course).
-     * clean_url(): test browser behaviour. I think this is correct...
-     * Figure out the Right Thing (if such a thing exists) for
-       %-encoding.
-     * How do IRIs fit into the world?
-     * IDNA -- must read about security stuff first.
-     * Unicode support in general.
-     * Provide per-connection access to timeouts.
-     * Keep-alive / connection caching.
-     * Pipelining??
-     * Content negotiation.
-     * gzip transfer encoding (there's already a handler for this in
-       mechanize, but it's poorly implemented ATM).
-     * proxy.pac parsing (I don't think this needs JS interpretation)
+<p>These notes explain the relationship between mechanize, ClientCookie,
+<code>cookielib</code> and <code>urllib2</code>, and which to use when.  If
+you're just using mechanize, and not any of those other libraries, you can
+ignore this section.
 
-Documentation
+<ol>
 
-     * Document means of processing response on ad-hoc basis with
-       .set_response() - e.g. to fix bad encoding in Content-type header
-       or clean up bad HTML.
-     * Add example to documentation showing can pass None as handle arg
-       to mechanize.UserAgent methods and then .add_handler() if need to
-       give it a specific handler instance to use for one of the things
-       it UserAgent already handles. Hmm, think this contradicts docs
-       ATM! And is it better to do this a different way...??
-     * Rearrange so have decent class-by-class docs, a
-       tutorial/background-info doc, and a howto/examples doc.
-     * Add more functional tests.
-     * Auth / proxies.
+  <li>mechanize works with Python 2.3, Python 2.4 and Python 2.5.
 
-Getting mechanize
+  <li>ClientCookie is no longer maintained as a separate package.  The code is
+      now part of mechanize, and its interface is now exported through module
+      mechanize (since mechanize 0.1.0).  Old code can simply be changed to
+      <code>import mechanize as ClientCookie</code> and should continue to
+      work (see the sketch after this list).
 
-   You can install the [7]old-fashioned way, or using [8]EasyInstall. I
-   recommend the latter even though EasyInstall is still in alpha,
-   because it will automatically ensure you have the necessary
-   dependencies, downloading if necessary.
+  <li>The cookie handling parts of mechanize are in Python 2.4 standard library
+      as module <code>cookielib</code> and extensions to module
+      <code>urllib2</code>.
 
-   [9]Subversion (SVN) access is also available.
+</ol>
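+
+<p>For instance, old ClientCookie code along these lines should keep working
+after the import change described in item 2 (a sketch only; these names are
+all part of mechanize's exported interface):
+
+<pre>
+import mechanize as ClientCookie
+
+cj = ClientCookie.CookieJar()
+opener = ClientCookie.build_opener(ClientCookie.HTTPCookieProcessor(cj))
+response = opener.open("http://wwwsearch.sourceforge.net/")
+print response.read()</pre>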
 
-   Since EasyInstall is new, I include some instructions below, but
-   mechanize follows standard EasyInstall / setuptools conventions, so
-   you should refer to the [10]EasyInstall and [11]setuptools
-   documentation if you need more detailed or up-to-date instructions.
+<p><strong>IMPORTANT:</strong> The following are the ONLY cases where
+<code>mechanize</code> and <code>urllib2</code> code are intended to work
+together.  For all other code, use mechanize
+<em><strong>exclusively</strong></em>: do NOT mix use of mechanize and
+<code>urllib2</code>!
 
-EasyInstall / setuptools
+<ol>
 
-   The benefit of EasyInstall and the new setuptools-supporting setup.py
-   is that they grab all dependencies for you. Also, using EasyInstall is
-   a one-liner for the common case, to be compared with the usual
-   download-unpack-install cycle with setup.py.
+  <li>Handler classes that are missing from 2.4's <code>urllib2</code>
+      (e.g. <code>HTTPRefreshProcessor</code>, <code>HTTPEquivProcessor</code>,
+      <code>HTTPRobotRulesProcessor</code>) may be used with the
+      <code>urllib2</code> of Python 2.4 or newer.  There are currently no
+      functional tests for this in mechanize, however, so this feature may be
+      broken (see the sketch after this list).
 
-   You need EasyInstall version 0.6a8 or newer.
+  <li>If you want to use <code>mechanize.RefreshProcessor</code> with Python >=
+      2.4's <code>urllib2</code>, you must also use
+      <code>mechanize.HTTPRedirectHandler</code>.
 
-Using EasyInstall to download and install mechanize
+  <li><code>mechanize.HTTPRefererProcessor</code> requires special support from
+      <code>mechanize.Browser</code>, so cannot be used with vanilla
+      <code>urllib2</code>.
 
-    1. [12]Install easy_install (you need version 0.6a8 or newer)
-    2. easy_install mechanize
+  <li><code>mechanize.HTTPRequestUpgradeProcessor</code> and
+      <code>mechanize.ResponseUpgradeProcessor</code> are not useful outside of
+      mechanize.
 
-   If you're on a Unix-like OS, you may need root permissions for that
-   last step (or see the [13]EasyInstall documentation for other
-   installation options).
+  <li>Request and response objects from code based on <code>urllib2</code> work
+      with mechanize, and vice-versa.
 
-   If you already have mechanize installed as a [14]Python Egg (as you do
-   if you installed using EasyInstall, or using setup.py install from
-   mechanize 0.0.10a or newer), you can upgrade to the latest version
-   using:
-easy_install --upgrade mechanize
+  <li>The classes and functions exported by mechanize in its public interface
+      that come straight from <code>urllib2</code>
+      (e.g. <code>FTPHandler</code>, at the time of writing) do work with
+      mechanize (duh ;-).  Exactly which of these classes and functions come
+      straight from <code>urllib2</code> without extension or modification will
+      change over time, though, so don't rely on it; instead, just import
+      everything you need from mechanize, never from <code>urllib2</code>.  The
+      exception is usage as described in the first item in this list, which is
+      explicitly OK (though not well tested ATM), subject to the other
+      restrictions in the list above.
 
-   You may want to read up on the -m option to easy_install, which lets
-   you install multiple versions of a package.
+</ol>
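+
+<p>For example, here is a rough sketch of the first case above (the handler
+classes are real, but this particular combination is illustrative and, as
+noted, not covered by functional tests):
+
+<pre>
+import urllib2
+import mechanize
+
+# Per item 2, mechanize.HTTPRefreshProcessor must be accompanied by
+# mechanize.HTTPRedirectHandler.
+opener = urllib2.build_opener(mechanize.HTTPEquivProcessor,
+                              mechanize.HTTPRefreshProcessor,
+                              mechanize.HTTPRedirectHandler)
+response = opener.open("http://wwwsearch.sourceforge.net/")</pre>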
 
-Using EasyInstall to download and install the latest in-development (SVN
-HEAD) version of mechanize
 
-easy_install "mechanize==dev"
+<a name="docs"></a>
+<h2>Documentation</h2>
 
-   Note that that will not necessarily grab the SVN versions of
-   dependencies, such as ClientForm: It will use SVN to fetch
-   dependencies if and only if the SVN HEAD version of mechanize declares
-   itself to depend on the SVN versions of those dependencies; even then,
-   those declared dependencies won't necessarily be on SVN HEAD, but
-   rather a particular revision. If you want SVN HEAD for a dependency
-   project, you should ask for it explicitly by running easy_install
-   "projectname=dev" for that project.
+<p>Full documentation is in the docstrings.
 
-   Note also that you can still carry on using a plain old SVN checkout
-   as usual if you like (optionally in conjunction with [15]setup.py
-   develop - this is particularly useful on Windows, since it functions
-   rather like symlinks).
+<p>The documentation in the web pages is in need of reorganisation at the
+moment, after the merge of ClientCookie into mechanize.
 
-Using setup.py from a .tar.gz, .zip or an SVN checkout to download and
-install mechanize
 
-   setup.py should correctly resolve and download dependencies:
-python setup.py install
+<a name="credits"></a>
+<h2>Credits</h2>
 
-   Or, to get access to the same options that easy_install accepts, use
-   the easy_install distutils command instead of install (see python
-   setup.py --help easy_install)
-python setup.py easy_install mechanize
+<p>Thanks to all the too-numerous-to-list people who reported bugs and provided
+patches.  Also thanks to Ian Bicking, for persuading me that a
+<code>UserAgent</code> class would be useful, and to Ronald Tschalar for advice
+on Netscape cookies.
 
-Using setup.py to install mechanize for development work on mechanize
+<p>A lot of credit must go to Gisle Aas, who wrote libwww-perl, from which
+large parts of mechanize originally derived, and Andy Lester for the original,
+<a href="http://search.cpan.org/dist/WWW-Mechanize/"><code>WWW::Mechanize</code>
+</a>.  Finally, thanks to the (coincidentally-named) Johnny Lee for the MSIE
+CookieJar Perl code from which mechanize's support for that is derived.
 
-   Note: this section is only useful for people who want to change
-   mechanize: It is not useful to do this if all you want is to [16]keep
-   up with SVN.
 
-   For development of mechanize using EasyInstall (see the [17]setuptools
-   docs for details), you have the option of using the develop distutils
-   command. This is particularly useful on Windows, since it functions
-   rather like symlinks. Get the mechanize source, then:
-python setup.py develop
+<a name="todo"></a>
+<h2>To do</h2>
 
-   Note that after every svn update on a develop-installed project, you
-   should run setup.py develop to ensure that project's dependencies are
-   updated if required.
+<p>Contributions welcome!
 
-   Also note that, currently, if you also use the develop distutils
-   command on the dependencies of mechanize (viz, ClientForm, and
-   optionally BeautifulSoup) to keep up with SVN, you must run setup.py
-   develop for each dependency of mechanize before running it for
-   mechanize itself. As a result, in this case it's probably simplest to
-   just set up your sys.path manually rather than using setup.py develop.
+<p>The documentation to-do list has moved to the new "docs-in-progress"
+directory in SVN.
 
-   One convenient way to get the latest source is:
-easy_install --editable --build-directory mybuilddir "mechanize==dev"
+<p><em>This is <strong>very</strong> roughly in order of priority:</em>
 
-Download
+<ul>
+  <li>Test <code>.any_response()</code> two handlers case: ordering.
+  <li>Test referer bugs (frags and don't add in redirect unless orig
+    req had Referer)
+  <li>Remove use of urlparse from _auth.py.
+  <li>Proper XHTML support!
+  <li>Fix BeautifulSoup support to use a single BeautifulSoup instance
+    per page.
+  <li>Test BeautifulSoup support better / fix encoding issue.
+  <li>Support BeautifulSoup 3.
+  <li>Add another History implementation or two and finalise interface.
+  <li>History cache expiration.
+  <li>Investigate possible leak further (see Balazs Ree's list posting).
+  <li>Make <code>EncodingFinder</code> public, I guess (but probably
+    improve it first).  (For example: support Mark Pilgrim's universal
+    encoding detector?)
+  <li>Add two-way links between BeautifulSoup &amp; ClientForm object
+    models.
+  <li>In 0.2: switch to Python unicode strings everywhere appropriate
+    (HTTP level should still use byte strings, of course).
+  <li><code>clean_url()</code>: test browser behaviour.  I <em>think</em>
+    this is correct...
+  <li>Use a nicer RFC 3986 join / split / unsplit implementation.
+  <li>Figure out the Right Thing (if such a thing exists) for %-encoding.
+  <li>How do IRIs fit into the world?
+  <li>IDNA -- must read about security stuff first.
+  <li>Unicode support in general.
+  <li>Provide per-connection access to timeouts.
+  <li>Keep-alive / connection caching.
+  <li>Pipelining??
+  <li>Content negotiation.
+  <li>gzip transfer encoding (there's already a handler for this in
+    mechanize, but it's poorly implemented ATM).
+  <li>proxy.pac parsing (I don't think this needs JS interpretation)
+  <li>Topological sort for handlers, instead of the .handler_order
+    attribute.  Ordering and other dependencies (where unavoidable)
+    should be defined separately from the handlers themselves.  Add a
+    new build_opener and deprecate the old one?  Actually, _useragent
+    is probably not far off what I'd have in mind (it would just need
+    a method or two and a base class added, I think), and it's not a
+    high priority, since I guess most people will just use the
+    UserAgent and Browser classes.
 
-   All documentation (including this web page) is included in the
-   distribution.
+ </ul>
 
-   This is an alpha release: interfaces may change, and there will be
-   bugs.
 
-   Development release.
-     * [18]mechanize-0.1.2b.tar.gz
-     * [19]mechanize-0.1.2b.zip
-     * [20]Change Log (included in distribution)
-     * [21]Older versions.
+<a name="download"></a>
+<h2>Getting mechanize</h2>
 
-   For old-style installation instructions, see the INSTALL file included
-   in the distribution. Better, [22]use EasyInstall.
+<p>You can install the <a href="./#source">old-fashioned way</a>, or using <a
+href="http://peak.telecommunity.com/DevCenter/EasyInstall">EasyInstall</a>.  I
+recommend the latter even though EasyInstall is still in alpha, because it will
+automatically ensure you have the necessary dependencies, downloading them if
+needed.
 
-Subversion
+<p><a href="./#svn">Subversion (SVN) access</a> is also available.
 
-   The [23]Subversion (SVN) trunk is
-   [24]http://codespeak.net/svn/wwwsearch/mechanize/trunk, so to check
-   out the source:
+<p>Since EasyInstall is new, I include some instructions below, but mechanize
+follows standard EasyInstall / <code>setuptools</code> conventions, so you
+should refer to the <a
+href="http://peak.telecommunity.com/DevCenter/EasyInstall">EasyInstall</a> and
+<a href="http://peak.telecommunity.com/DevCenter/setuptools">setuptools</a>
+documentation if you need more detailed or up-to-date instructions.
+
+<h2>EasyInstall / setuptools</h2>
+
+<p>The benefit of EasyInstall and the new <code>setuptools</code>-supporting
+<code>setup.py</code> is that they grab all dependencies for you.  Also, using
+EasyInstall is a one-liner for the common case, to be compared with the usual
+download-unpack-install cycle with <code>setup.py</code>.
+
+<h3>Using EasyInstall to download and install mechanize</h3>
+
+<ol>
+  <li><a href="http://peak.telecommunity.com/DevCenter/EasyInstall#installing-easy-install">
+Install easy_install</a>
+  <li><code>easy_install mechanize</code>
+</ol>
+
+<p>If you're on a Unix-like OS, you may need root permissions for that last
+step (or see the <a
+href="http://peak.telecommunity.com/DevCenter/EasyInstall">EasyInstall
+documentation</a> for other installation options).
+
+<p>If you already have mechanize installed as a <a
+href="http://peak.telecommunity.com/DevCenter/PythonEggs">Python Egg</a> (as
+you do if you installed using EasyInstall, or using <code>setup.py
+install</code> from mechanize 0.0.10a or newer), you can upgrade to the latest
+version using:
+
+<pre>easy_install --upgrade mechanize</pre>
+
+<p>You may want to read up on the <code>-m</code> option to
+<code>easy_install</code>, which lets you install multiple versions of a
+package.
+
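+<p>For example (a sketch; see the EasyInstall documentation for the details of
+multi-version installs):
+
+<pre>easy_install -m mechanize</pre>
+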
+<a name="svnhead"></a>
+<h3>Using EasyInstall to download and install the latest in-development (SVN HEAD) version of mechanize</h3>
+
+<pre>easy_install "mechanize==dev"</pre>
+
+<p>Note that that will not necessarily grab the SVN versions of dependencies,
+such as ClientForm: It will use SVN to fetch dependencies if and only if the
+SVN HEAD version of mechanize declares itself to depend on the SVN versions of
+those dependencies; even then, those declared dependencies won't necessarily be
+on SVN HEAD, but rather a particular revision.  If you want SVN HEAD for a
+dependency project, you should ask for it explicitly by running
+<code>easy_install "projectname==dev"</code> for that project.
+
+<p>Note also that you can still carry on using a plain old SVN checkout as
+usual if you like.
+
+<h3>Using setup.py from a .tar.gz, .zip or an SVN checkout to download and install mechanize</h3>
+
+<p><code>setup.py</code> should correctly resolve and download dependencies:
+
+<pre>python setup.py install</pre>
+
+<p>Or, to get access to the same options that <code>easy_install</code>
+accepts, use the <code>easy_install</code> distutils command instead of
+<code>install</code> (see <code>python setup.py --help easy_install</code>):
+
+<pre>python setup.py easy_install mechanize</pre>
+
+
+<a name="source"></a>
+<h2>Download</h2>
+<p>All documentation (including this web page) is included in the distribution.
+
+<p>This is a beta release: there will be bugs.
+
+<p><em>Development release.</em>
+
+<ul>
+
+<li><a href="./src/mechanize-0.1.6b.tar.gz">mechanize-0.1.6b.tar.gz</a>
+<li><a href="./src/mechanize-0.1.6b.zip">mechanize-0.1.6b.zip</a>
+<li><a href="./src/ChangeLog.txt">Change Log</a> (included in distribution)
+<li><a href="./src/">Older versions.</a>
+</ul>
+
+<p>For old-style installation instructions, see the INSTALL file included in
+the distribution.  Better, <a href="./#download">use EasyInstall</a>.
+
+
+<a name="svn"></a>
+<h2>Subversion</h2>
+
+<p>The <a href="http://subversion.tigris.org/">Subversion (SVN)</a> trunk is <a href="http://codespeak.net/svn/wwwsearch/mechanize/trunk#egg=mechanize-dev">http://codespeak.net/svn/wwwsearch/mechanize/trunk</a>, so to check out the source:
+
+<pre>
 svn co http://codespeak.net/svn/wwwsearch/mechanize/trunk mechanize
+</pre>
 
-Tests and examples
+<a name="tests"></a>
+<h2>Tests and examples</h2>
 
-Examples
+<h3>Examples</h3>
 
-   The examples directory in the [25]source packages contains a couple of
-   silly, but working, scripts to demonstrate basic use of the module.
-   Note that it's in the nature of web scraping for such scripts to
-   break, so don't be too suprised if that happens - do let me know,
-   though!
+<p>The <code>examples</code> directory in the <a href="./#source">source
+packages</a> contains a couple of silly, but working, scripts to demonstrate
+basic use of the module.  Note that it's in the nature of web scraping for such
+scripts to break, so don't be too surprised if that happens &#8211; do let me
+know, though!
 
-   It's worth knowing also that the examples on the [26]ClientForm web
-   page are useful for mechanize users, and are now real run-able scripts
-   rather than just documentation.
+<p>It's worth knowing also that the examples on the <a
+href="../ClientForm/">ClientForm web page</a> are useful for mechanize users,
+and are now real runnable scripts rather than just documentation.
 
-Functional tests
+<h3>Functional tests</h3>
 
-   To run the functional tests (which do access the network), run the
-   following command:
-python functional_tests.py
+<p>To run the functional tests (which <strong>do</strong> access the network),
+run the following command:

-Unit tests

+<pre>python functional_tests.py</pre>
-   Note that ClientForm (a dependency of mechanize) has its own unit
-   tests, which must be run separately.
+<h3>Unit tests</h3>
 
-   To run the unit tests (none of which access the network), run the
-   following command:
-python test.py
+<p>Note that ClientForm (a dependency of mechanize) has its own unit tests,
+which must be run separately.
 
-   This runs the tests against the source files extracted from the
-   package. For help on command line options:
-python test.py --help
+<p>To run the unit tests (none of which access the network), run the following
+command:
 
-See also
+<pre>python test.py</pre>
 
-   There are several wrappers around mechanize designed for functional
-   testing of web applications:
-     * [27]zope.testbrowser (or [28]ZopeTestBrowser, the standalone
-       version).
-     * [29]twill.
+<p>This runs the tests against the source files extracted from the
+package.  For help on command line options:
 
-   Richard Jones' [30]webunit (this is not the same as Steven Purcell's
-   [31]code of the same name). webunit and mechanize are quite similar.
-   On the minus side, webunit is missing things like browser history,
-   high-level forms and links handling, thorough cookie handling, refresh
-   redirection, adding of the Referer header, observance of robots.txt
-   and easy extensibility. On the plus side, webunit has a bunch of
-   utility functions bound up in its WebFetcher class, which look useful
-   for writing tests (though they'd be easy to duplicate using
-   mechanize). In general, webunit has more of a frameworky emphasis,
-   with aims limited to writing tests, where mechanize and the modules it
-   depends on try hard to be general-purpose libraries.
+<pre>python test.py --help</pre>
 
-   There are many related links in the [32]General FAQ page, too.
 
-FAQs - pre install
+<h2>See also</h2>
 
-     * Which version of Python do I need?
-       2.3 or above.
-     * What else do I need?
-       mechanize depends on [33]ClientForm.
-       The versions of those required modules are listed in the setup.py
-       for mechanize (included with the download). The dependencies are
-       automatically fetched by [34]EasyInstall (or by [35]downloading a
-       mechanize source package and running python setup.py install). If
-       you like you can fetch and install them manually, instead - see
-       the INSTALL.txt file (included with the distribution).
-     * Which license?
-       mechanize is dual-licensed: you may pick either the [36]BSD
-       license, or the [37]ZPL 2.1 (both are included in the
-       distribution).
+<p>There are several wrappers around mechanize designed for functional testing
+of web applications:
 
-FAQs - usage
+<ul>
 
-     * I'm sure this page is HTML, why does mechanize.Browser think
-       otherwise?
+  <li><a href="http://cheeseshop.python.org/pypi?:action=display&amp;name=zope.testbrowser">
+    <code>zope.testbrowser</code></a> (or
+    <a href="http://cheeseshop.python.org/pypi?%3Aaction=display&amp;name=ZopeTestbrowser">
+    <code>ZopeTestBrowser</code></a>, the standalone version).
+  <li><a href="http://www.idyll.org/~t/www-tools/twill.html">twill</a>.
+</ul>
+
+<p>Richard Jones' <a href="http://mechanicalcat.net/tech/webunit/">webunit</a>
+(this is not the same as Steven Purcell's <a
+href="http://webunit.sourceforge.net/">code of the same name</a>).  webunit and
+mechanize are quite similar.  On the minus side, webunit is missing things like
+browser history, high-level forms and links handling, thorough cookie handling,
+refresh redirection, adding of the Referer header, observance of robots.txt and
+easy extensibility.  On the plus side, webunit has a bunch of utility functions
+bound up in its WebFetcher class, which look useful for writing tests (though
+they'd be easy to duplicate using mechanize).  In general, webunit has more of
+a frameworky emphasis, with aims limited to writing tests, where mechanize and
+the modules it depends on try hard to be general-purpose libraries.
+
+<p>There are many related links in the <a
+href="../bits/GeneralFAQ.html">General FAQ</a> page, too.
+
+
+<a name="faq"></a>
+<h2>FAQs - pre install</h2>
+<ul>
+  <li>Which version of Python do I need?
+  <p>2.3 or above.
+  <li>What else do I need?
+  <p>mechanize depends on <a href="../ClientForm/">ClientForm</a>.
+  <li>Does mechanize depend on BeautifulSoup?
+  <p>No.  mechanize offers a few (still rather experimental) classes that make
+     use of BeautifulSoup, but these classes are not required to use mechanize.
+     mechanize bundles BeautifulSoup version 2, so that module is no longer
+     required.  A future version of mechanize will support BeautifulSoup
+     version 3, at which point mechanize will likely no longer bundle the
+     module.
+  <p>The versions of the required modules are listed in the
+     <code>setup.py</code> for mechanize (included with the download).  The
+     dependencies are automatically fetched by <a
+     href="http://peak.telecommunity.com/DevCenter/EasyInstall">EasyInstall</a>
+     (or by <a href="./#source">downloading</a> a mechanize source package and
+     running <code>python setup.py install</code>).  If you like you can fetch
+     and install them manually, instead &#8211; see the <code>INSTALL.txt</code>
+     file (included with the distribution).
+  <li>Which license?
+  <p>mechanize is dual-licensed: you may pick either the
+     <a href="http://www.opensource.org/licenses/bsd-license.php">BSD license</a>,
+     or the <a href="http://www.zope.org/Resources/ZPL">ZPL 2.1</a> (both are
+     included in the distribution).
+</ul>
+
+<a name="usagefaq"></a>
+<h2>FAQs - usage</h2>
+<ul>
+  <li>I'm not getting the HTML page I expected to see.
+    <ul>
+      <li><a href="http://wwwsearch.sourceforge.net/mechanize/doc.html#debugging">Debugging tips</a>
+      <li><a href="http://wwwsearch.sourceforge.net/bits/GeneralFAQ.html">More tips</a>
+     </ul>
+  <li>I'm <strong><em>sure</em></strong> this page is HTML, why does
+     <code>mechanize.Browser</code> think otherwise?
+<pre>
 b = mechanize.Browser(
-    # mechanize's XHTML support needs work, so is currently switched off.  If
-    # we want to get our work done, we have to turn it on by supplying a
-    # mechanize.Factory (with XHTML support turned on):
+    <span class="pycmt"># mechanize's XHTML support needs work, so is currently switched off.  If</span>
+    <span class="pycmt"># we want to get our work done, we have to turn it on by supplying a</span>
+    <span class="pycmt"># mechanize.Factory (with XHTML support turned on):</span>
     factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True)
-    )
+    )</pre>
 
-   I prefer questions and comments to be sent to the [38]mailing list
-   rather than direct to me.
+</ul>
 
-   [39]John J. Lee, May 2006.
-     _________________________________________________________________
+<p>I prefer questions and comments to be sent to the <a
+href="http://lists.sourceforge.net/lists/listinfo/wwwsearch-general">
+mailing list</a> rather than direct to me.
 
-   [40]Home
-   [41]General FAQs
-   mechanize
-   [42]mechanize docs
-   [43]ClientForm
-   [44]ClientCookie
-   [45]ClientCookie docs
-   [46]pullparser
-   [47]DOMForm
-   [48]python-spidermonkey
-   [49]ClientTable
-   [50]1.5.2 urllib2.py
-   [51]1.5.2 urllib.py
-   [52]Examples
-   [53]Compatibility
-   [54]Documentation
-   [55]To-do
-   [56]Download
-   [57]Subversion
-   [58]More examples
-   [59]FAQs
+<p><a href="mailto:jjl at pobox.com">John J. Lee</a>,
+December 2006.
 
-References
+<hr>
 
-   1. http://sourceforge.net/
-   2. http://search.cpan.org/dist/WWW-Mechanize/
-   3. file://localhost/tmp/ClientForm/
-   4. http://www.robotstxt.org/wc/norobots.html
-   5. file://localhost/tmp/tmpexjjQ7/#tests
-   6. http://search.cpan.org/dist/WWW-Mechanize/
-   7. file://localhost/tmp/tmpexjjQ7/#source
-   8. http://peak.telecommunity.com/DevCenter/EasyInstall
-   9. file://localhost/tmp/tmpexjjQ7/#svn
-  10. http://peak.telecommunity.com/DevCenter/EasyInstall
-  11. http://peak.telecommunity.com/DevCenter/setuptools
-  12. http://peak.telecommunity.com/DevCenter/EasyInstall#installing-easy-install
-  13. http://peak.telecommunity.com/DevCenter/EasyInstall
-  14. http://peak.telecommunity.com/DevCenter/PythonEggs
-  15. file://localhost/tmp/tmpexjjQ7/#develop
-  16. file://localhost/tmp/tmpexjjQ7/#svnhead
-  17. http://peak.telecommunity.com/DevCenter/setuptools
-  18. file://localhost/tmp/tmpexjjQ7/src/mechanize-0.1.2b.tar.gz
-  19. file://localhost/tmp/tmpexjjQ7/src/mechanize-0.1.2b.zip
-  20. file://localhost/tmp/tmpexjjQ7/src/ChangeLog.txt
-  21. file://localhost/tmp/tmpexjjQ7/src/
-  22. file://localhost/tmp/tmpexjjQ7/#download
-  23. http://subversion.tigris.org/
-  24. http://codespeak.net/svn/wwwsearch/mechanize/trunk#egg=mechanize-dev
-  25. file://localhost/tmp/tmpexjjQ7/#source
-  26. file://localhost/tmp/ClientForm/
-  27. http://cheeseshop.python.org/pypi?:action=display&name=zope.testbrowser
-  28. http://cheeseshop.python.org/pypi?%3Aaction=display&name=ZopeTestbrowser
-  29. http://www.idyll.org/~t/www-tools/twill.html
-  30. http://mechanicalcat.net/tech/webunit/
-  31. http://webunit.sourceforge.net/
-  32. file://localhost/tmp/bits/GeneralFAQ.html
-  33. file://localhost/tmp/ClientForm/
-  34. http://peak.telecommunity.com/DevCenter/EasyInstall
-  35. file://localhost/tmp/tmpexjjQ7/#source
-  36. http://www.opensource.org/licenses/bsd-license.php
-  37. http://www.zope.org/Resources/ZPL
-  38. http://lists.sourceforge.net/lists/listinfo/wwwsearch-general
-  39. mailto:jjl at pobox.com
-  40. file://localhost/tmp
-  41. file://localhost/tmp/bits/GeneralFAQ.html
-  42. file://localhost/tmp/mechanize/doc.html
-  43. file://localhost/tmp/ClientForm/
-  44. file://localhost/tmp/ClientCookie/
-  45. file://localhost/tmp/ClientCookie/doc.html
-  46. file://localhost/tmp/pullparser/
-  47. file://localhost/tmp/DOMForm/
-  48. file://localhost/tmp/python-spidermonkey/
-  49. file://localhost/tmp/ClientTable/
-  50. file://localhost/tmp/bits/urllib2_152.py
-  51. file://localhost/tmp/bits/urllib_152.py
-  52. file://localhost/tmp/tmpexjjQ7/#examples
-  53. file://localhost/tmp/tmpexjjQ7/#compatnotes
-  54. file://localhost/tmp/tmpexjjQ7/#docs
-  55. file://localhost/tmp/tmpexjjQ7/#todo
-  56. file://localhost/tmp/tmpexjjQ7/#download
-  57. file://localhost/tmp/tmpexjjQ7/#svn
-  58. file://localhost/tmp/tmpexjjQ7/#tests
-  59. file://localhost/tmp/tmpexjjQ7/#faq
+</div>
+
+<div id="Menu">
+
+<a href="..">Home</a><br>
+<br>
+<a href="../bits/GeneralFAQ.html">General FAQs</a><br>
+<br>
+<span class="thispage">mechanize</span><br>
+<a href="../mechanize/doc.html"><span class="subpage">mechanize docs</span></a><br>
+<a href="../ClientForm/">ClientForm</a><br>
+<br>
+<a href="../ClientCookie/">ClientCookie</a><br>
+<a href="../ClientCookie/doc.html"><span class="subpage">ClientCookie docs</span></a><br>
+<a href="../pullparser/">pullparser</a><br>
+<a href="../DOMForm/">DOMForm</a><br>
+<a href="../python-spidermonkey/">python-spidermonkey</a><br>
+<a href="../ClientTable/">ClientTable</a><br>
+<a href="../bits/urllib2_152.py">1.5.2 urllib2.py</a><br>
+<a href="../bits/urllib_152.py">1.5.2 urllib.py</a><br>
+
+<br>
+
+<a href="./#examples">Examples</a><br>
+<a href="./#compatnotes">Compatibility</a><br>
+<a href="./#docs">Documentation</a><br>
+<a href="./#todo">To-do</a><br>
+<a href="./#download">Download</a><br>
+<a href="./#svn">Subversion</a><br>
+<a href="./#tests">More examples</a><br>
+<a href="./#faq">FAQs</a><br>
+
+</div>
+
+
+</body>
+</html>

Modified: python-mechanize/branches/upstream/current/doc.html
===================================================================
--- python-mechanize/branches/upstream/current/doc.html	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/doc.html	2007-04-09 20:40:55 UTC (rev 764)
@@ -5,7 +5,7 @@
 <head>
   <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
   <meta name="author" content="John J. Lee &lt;jjl at pobox.com&gt;">
-  <meta name="date" content="2006-05-21">
+  <meta name="date" content="2006-09-08">
   <title>mechanize documentation</title>
   <style type="text/css" media="screen">@import "../styles/style.css";</style>
   
@@ -468,7 +468,7 @@
 
 
   <li>If you're using a <code>urllib2.Request</code> from Python 2.4 or later,
-  or you're using a <code>mechanize.Request<code>, use the
+  or you're using a <code>mechanize.Request</code>, use the
   <code>unverifiable</code> and <code>origin_req_host</code> arguments to the
   constructor:
 
@@ -706,7 +706,7 @@
 keep compatibility with the Netscape protocol as implemented by Netscape.
 Microsoft Internet Explorer (MSIE) was very new when the standard was designed,
 but was starting to be very popular when the standard was finalised.  XXX P3P,
-and MSIE & Mozilla options
+and MSIE &amp; Mozilla options
 
 <p>XXX Apparently MSIE implements bits of RFC 2109 - but not very compliant
 (surprise).  Presumably other browsers do too, as a result.  mechanize
@@ -838,7 +838,7 @@
 mailing list</a> rather than direct to me.
 
 <p><a href="mailto:jjl at pobox.com">John J. Lee</a>,
-May 2006.
+September 2006.
 
 <hr>
 
@@ -866,7 +866,7 @@
 <br>
 
 <a href="./doc.html#examples">Examples</a><br>
-<a href="./doc.html#browsers">Mozilla & MSIE</a><br>
+<a href="./doc.html#browsers">Mozilla &amp; MSIE</a><br>
 <a href="./doc.html#file">Cookies in a file</a><br>
 <a href="./doc.html#cookiejar">Using a <code>CookieJar</code></a><br>
 <a href="./doc.html#extras">Processors</a><br>

Modified: python-mechanize/branches/upstream/current/doc.html.in
===================================================================
--- python-mechanize/branches/upstream/current/doc.html.in	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/doc.html.in	2007-04-09 20:40:55 UTC (rev 764)
@@ -6,7 +6,7 @@
 from colorize import colorize
 import time
 import release
-last_modified = release.svn_id_to_time("$Id: doc.html.in 27546 2006-05-21 18:52:39Z jjlee $")
+last_modified = release.svn_id_to_time("$Id: doc.html.in 32090 2006-09-08 21:19:26Z jjlee $")
 try:
     base
 except NameError:
@@ -479,7 +479,7 @@
 """)}
 
   <li>If you're using a <code>urllib2.Request</code> from Python 2.4 or later,
-  or you're using a <code>mechanize.Request<code>, use the
+  or you're using a <code>mechanize.Request</code>, use the
   <code>unverifiable</code> and <code>origin_req_host</code> arguments to the
   constructor:
 
@@ -718,7 +718,7 @@
 keep compatibility with the Netscape protocol as implemented by Netscape.
 Microsoft Internet Explorer (MSIE) was very new when the standard was designed,
 but was starting to be very popular when the standard was finalised.  XXX P3P,
-and MSIE & Mozilla options
+and MSIE &amp; Mozilla options
 
 <p>XXX Apparently MSIE implements bits of RFC 2109 - but not very compliant
 (surprise).  Presumably other browsers do too, as a result.  mechanize
@@ -863,7 +863,7 @@
 <br>
 
 <a href="./doc.html#examples">Examples</a><br>
-<a href="./doc.html#browsers">Mozilla & MSIE</a><br>
+<a href="./doc.html#browsers">Mozilla &amp; MSIE</a><br>
 <a href="./doc.html#file">Cookies in a file</a><br>
 <a href="./doc.html#cookiejar">Using a <code>CookieJar</code></a><br>
 <a href="./doc.html#extras">Processors</a><br>

Modified: python-mechanize/branches/upstream/current/examples/pypi.py
===================================================================
--- python-mechanize/branches/upstream/current/examples/pypi.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/examples/pypi.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,5 +1,10 @@
 #!/usr/bin/env python
 
+# ------------------------------------------------------------------------
+# THIS SCRIPT IS CURRENTLY NOT WORKING, SINCE PYPI's SEARCH FEATURE HAS
+# BEEN REMOVED!
+# ------------------------------------------------------------------------
+
 # Search PyPI, the Python Package Index, and retrieve latest mechanize
 # tarball.
 
@@ -16,6 +21,10 @@
     # mechanize.Factory (with XHTML support turned on):
     factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True)
     )
+# Addition 2005-06-13: Be naughty, since robots.txt asks not to
+# access /pypi now.  We're not madly searching for everything, so
+# I don't feel too guilty.
+b.set_handle_robots(False)
 
 # search PyPI
 b.open("http://www.python.org/pypi")

Added: python-mechanize/branches/upstream/current/ez_setup.py
===================================================================
--- python-mechanize/branches/upstream/current/ez_setup.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/ez_setup.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,222 @@
+#!python
+"""Bootstrap setuptools installation
+
+If you want to use setuptools in your package's setup.py, just include this
+file in the same directory with it, and add this to the top of your setup.py::
+
+    from ez_setup import use_setuptools
+    use_setuptools()
+
+If you want to require a specific version of setuptools, set a download
+mirror, or use an alternate download directory, you can do so by supplying
+the appropriate options to ``use_setuptools()``.
+
+This file can also be run as a script to install or upgrade setuptools.
+"""
+import sys
+DEFAULT_VERSION = "0.6c3"
+DEFAULT_URL     = "http://cheeseshop.python.org/packages/%s/s/setuptools/" % sys.version[:3]
+
+md5_data = {
+    'setuptools-0.6b1-py2.3.egg': '8822caf901250d848b996b7f25c6e6ca',
+    'setuptools-0.6b1-py2.4.egg': 'b79a8a403e4502fbb85ee3f1941735cb',
+    'setuptools-0.6b2-py2.3.egg': '5657759d8a6d8fc44070a9d07272d99b',
+    'setuptools-0.6b2-py2.4.egg': '4996a8d169d2be661fa32a6e52e4f82a',
+    'setuptools-0.6b3-py2.3.egg': 'bb31c0fc7399a63579975cad9f5a0618',
+    'setuptools-0.6b3-py2.4.egg': '38a8c6b3d6ecd22247f179f7da669fac',
+    'setuptools-0.6b4-py2.3.egg': '62045a24ed4e1ebc77fe039aa4e6f7e5',
+    'setuptools-0.6b4-py2.4.egg': '4cb2a185d228dacffb2d17f103b3b1c4',
+    'setuptools-0.6c1-py2.3.egg': 'b3f2b5539d65cb7f74ad79127f1a908c',
+    'setuptools-0.6c1-py2.4.egg': 'b45adeda0667d2d2ffe14009364f2a4b',
+    'setuptools-0.6c2-py2.3.egg': 'f0064bf6aa2b7d0f3ba0b43f20817c27',
+    'setuptools-0.6c2-py2.4.egg': '616192eec35f47e8ea16cd6a122b7277',
+    'setuptools-0.6c3-py2.3.egg': 'f181fa125dfe85a259c9cd6f1d7b78fa',
+    'setuptools-0.6c3-py2.4.egg': 'e0ed74682c998bfb73bf803a50e7b71e',
+    'setuptools-0.6c3-py2.5.egg': 'abef16fdd61955514841c7c6bd98965e',
+}
+
+import sys, os
+
+def _validate_md5(egg_name, data):
+    if egg_name in md5_data:
+        from md5 import md5
+        digest = md5(data).hexdigest()
+        if digest != md5_data[egg_name]:
+            print >>sys.stderr, (
+                "md5 validation of %s failed!  (Possible download problem?)"
+                % egg_name
+            )
+            sys.exit(2)
+    return data
+
+
+def use_setuptools(
+    version=DEFAULT_VERSION, download_base=DEFAULT_URL, to_dir=os.curdir,
+    download_delay=15
+):
+    """Automatically find/download setuptools and make it available on sys.path
+
+    `version` should be a valid setuptools version number that is available
+    as an egg for download under the `download_base` URL (which should end with
+    a '/').  `to_dir` is the directory where setuptools will be downloaded, if
+    it is not already available.  If `download_delay` is specified, it should
+    be the number of seconds that will be paused before initiating a download,
+    should one be required.  If an older version of setuptools is installed,
+    this routine will print a message to ``sys.stderr`` and raise SystemExit in
+    an attempt to abort the calling script.
+    """
+    try:
+        import setuptools
+        if setuptools.__version__ == '0.0.1':
+            print >>sys.stderr, (
+            "You have an obsolete version of setuptools installed.  Please\n"
+            "remove it from your system entirely before rerunning this script."
+            )
+            sys.exit(2)
+    except ImportError:
+        egg = download_setuptools(version, download_base, to_dir, download_delay)
+        sys.path.insert(0, egg)
+        import setuptools; setuptools.bootstrap_install_from = egg
+
+    import pkg_resources
+    try:
+        pkg_resources.require("setuptools>="+version)
+
+    except pkg_resources.VersionConflict, e:
+        # XXX could we install in a subprocess here?
+        print >>sys.stderr, (
+            "The required version of setuptools (>=%s) is not available, and\n"
+            "can't be installed while this script is running. Please install\n"
+            " a more recent version first.\n\n(Currently using %r)"
+        ) % (version, e.args[0])
+        sys.exit(2)
+
+def download_setuptools(
+    version=DEFAULT_VERSION, download_base=DEFAULT_URL, to_dir=os.curdir,
+    delay = 15
+):
+    """Download setuptools from a specified location and return its filename
+
+    `version` should be a valid setuptools version number that is available
+    as an egg for download under the `download_base` URL (which should end
+    with a '/'). `to_dir` is the directory where the egg will be downloaded.
+    `delay` is the number of seconds to pause before an actual download attempt.
+    """
+    import urllib2, shutil
+    egg_name = "setuptools-%s-py%s.egg" % (version,sys.version[:3])
+    url = download_base + egg_name
+    saveto = os.path.join(to_dir, egg_name)
+    src = dst = None
+    if not os.path.exists(saveto):  # Avoid repeated downloads
+        try:
+            from distutils import log
+            if delay:
+                log.warn("""
+---------------------------------------------------------------------------
+This script requires setuptools version %s to run (even to display
+help).  I will attempt to download it for you (from
+%s), but
+you may need to enable firewall access for this script first.
+I will start the download in %d seconds.
+
+(Note: if this machine does not have network access, please obtain the file
+
+   %s
+
+and place it in this directory before rerunning this script.)
+---------------------------------------------------------------------------""",
+                    version, download_base, delay, url
+                ); from time import sleep; sleep(delay)
+            log.warn("Downloading %s", url)
+            src = urllib2.urlopen(url)
+            # Read/write all in one block, so we don't create a corrupt file
+            # if the download is interrupted.
+            data = _validate_md5(egg_name, src.read())
+            dst = open(saveto,"wb"); dst.write(data)
+        finally:
+            if src: src.close()
+            if dst: dst.close()
+    return os.path.realpath(saveto)
+
+def main(argv, version=DEFAULT_VERSION):
+    """Install or upgrade setuptools and EasyInstall"""
+
+    try:
+        import setuptools
+    except ImportError:
+        egg = None
+        try:
+            egg = download_setuptools(version, delay=0)
+            sys.path.insert(0,egg)
+            from setuptools.command.easy_install import main
+            return main(list(argv)+[egg])   # we're done here
+        finally:
+            if egg and os.path.exists(egg):
+                os.unlink(egg)
+    else:
+        if setuptools.__version__ == '0.0.1':
+            # tell the user to uninstall obsolete version
+            use_setuptools(version)
+
+    req = "setuptools>="+version
+    import pkg_resources
+    try:
+        pkg_resources.require(req)
+    except pkg_resources.VersionConflict:
+        try:
+            from setuptools.command.easy_install import main
+        except ImportError:
+            from easy_install import main
+        main(list(argv)+[download_setuptools(delay=0)])
+        sys.exit(0) # try to force an exit
+    else:
+        if argv:
+            from setuptools.command.easy_install import main
+            main(argv)
+        else:
+            print "Setuptools version",version,"or greater has been installed."
+            print '(Run "ez_setup.py -U setuptools" to reinstall or upgrade.)'
+
+
+
+def update_md5(filenames):
+    """Update our built-in md5 registry"""
+
+    import re
+    from md5 import md5
+
+    for name in filenames:
+        base = os.path.basename(name)
+        f = open(name,'rb')
+        md5_data[base] = md5(f.read()).hexdigest()
+        f.close()
+
+    data = ["    %r: %r,\n" % it for it in md5_data.items()]
+    data.sort()
+    repl = "".join(data)
+
+    import inspect
+    srcfile = inspect.getsourcefile(sys.modules[__name__])
+    f = open(srcfile, 'rb'); src = f.read(); f.close()
+
+    match = re.search("\nmd5_data = {\n([^}]+)}", src)
+    if not match:
+        print >>sys.stderr, "Internal error!"
+        sys.exit(2)
+
+    src = src[:match.start(1)] + repl + src[match.end(1):]
+    f = open(srcfile,'w')
+    f.write(src)
+    f.close()
+
+
+if __name__=='__main__':
+    if len(sys.argv)>2 and sys.argv[1]=='--md5update':
+        update_md5(sys.argv[2:])
+    else:
+        main(sys.argv[1:])
+
+
+
+
+

Modified: python-mechanize/branches/upstream/current/functional_tests.py
===================================================================
--- python-mechanize/branches/upstream/current/functional_tests.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/functional_tests.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -18,6 +18,7 @@
 
 #from mechanize import CreateBSDDBCookieJar
 
+## import logging
 ## logger = logging.getLogger("mechanize")
 ## logger.addHandler(logging.StreamHandler())
 ## logger.setLevel(logging.DEBUG)
@@ -59,15 +60,40 @@
         self.assertEqual(self.browser.title(), 'Python bits')
 
     def test_redirect(self):
-        # 302 redirect due to missing final '/'
-        self.browser.open('http://wwwsearch.sourceforge.net')
+        # 301 redirect due to missing final '/'
+        r = self.browser.open('http://wwwsearch.sourceforge.net/bits')
+        self.assertEqual(r.code, 200)
+        self.assert_("GeneralFAQ.html" in r.read(2048))
 
     def test_file_url(self):
         url = "file://%s" % sanepathname2url(
             os.path.abspath('functional_tests.py'))
-        self.browser.open(url)
+        r = self.browser.open(url)
+        self.assert_("this string appears in this file ;-)" in r.read())
 
+    def test_open_novisit(self):
+        def test_state(br):
+            self.assert_(br.request is None)
+            self.assert_(br.response() is None)
+            self.assertRaises(mechanize.BrowserStateError, br.back)
+        test_state(self.browser)
+        # note this involves a redirect, which should itself be non-visiting
+        r = self.browser.open_novisit("http://wwwsearch.sourceforge.net/bits")
+        test_state(self.browser)
+        self.assert_("GeneralFAQ.html" in r.read(2048))
 
+    def test_non_seekable(self):
+        # check everything still works without response_seek_wrapper and
+        # the .seek() method on response objects
+        ua = mechanize.UserAgent()
+        ua.set_seekable_responses(False)
+        ua.set_handle_equiv(False)
+        response = ua.open('http://wwwsearch.sourceforge.net/')
+        self.failIf(hasattr(response, "seek"))
+        data = response.read()
+        self.assert_("Python bits" in data)
+
+
 class ResponseTests(TestCase):
 
     def test_seek(self):
@@ -147,7 +173,6 @@
 class FunctionalTests(TestCase):
     def test_cookies(self):
         import urllib2
-        from mechanize import _urllib2_support
         # this test page depends on cookies, and an http-equiv refresh
         #cj = CreateBSDDBCookieJar("/home/john/db.db")
         cj = CookieJar()
@@ -183,23 +208,67 @@
             self.assert_(samedata == data)
         finally:
             o.close()
-            # uninstall opener (don't try this at home)
-            _urllib2_support._opener = None
+            install_opener(None)
 
+    def test_robots(self):
+        plain_opener = mechanize.build_opener(mechanize.HTTPRobotRulesProcessor)
+        browser = mechanize.Browser()
+        for opener in plain_opener, browser:
+            r = opener.open("http://wwwsearch.sourceforge.net/robots")
+            self.assertEqual(r.code, 200)
+            self.assertRaises(
+                mechanize.RobotExclusionError,
+                opener.open, "http://wwwsearch.sourceforge.net/norobots")
+
     def test_urlretrieve(self):
         url = "http://www.python.org/"
+        test_filename = "python.html"
+        def check_retrieve(opener, filename, headers):
+            self.assertEqual(headers.get('Content-Type'), 'text/html')
+            f = open(filename)
+            data = f.read()
+            f.close()
+            opener.close()
+            from urllib import urlopen
+            r = urlopen(url)
+            self.assertEqual(data, r.read())
+            r.close()
+
+        opener = mechanize.build_opener()
         verif = CallbackVerifier(self)
-        fn, hdrs = urlretrieve(url, "python.html", verif.callback)
+        filename, headers = opener.retrieve(url, test_filename, verif.callback)
         try:
-            f = open(fn)
-            data = f.read()
-            f.close()
+            self.assertEqual(filename, test_filename)
+            check_retrieve(opener, filename, headers)
+            self.assert_(os.path.isfile(filename))
         finally:
-            os.remove(fn)
-        r = urlopen(url)
-        self.assert_(data == r.read())
-        r.close()
+            os.remove(filename)
 
+        opener = mechanize.build_opener()
+        verif = CallbackVerifier(self)
+        filename, headers = opener.retrieve(url, reporthook=verif.callback)
+        check_retrieve(opener, filename, headers)
+        # closing the opener removed the temporary file
+        self.failIf(os.path.isfile(filename))
+
+    def test_reload_read_incomplete(self):
+        from mechanize import Browser
+        browser = Browser()
+        r1 = browser.open(
+            "http://wwwsearch.sf.net/bits/mechanize_reload_test.html")
+        # if we don't do anything and go straight to another page, most of the
+        # last page's response won't be .read()...
+        r2 = browser.open("http://wwwsearch.sf.net/mechanize")
+        self.assert_(len(r1.get_data()) < 4097)  # we only .read() a little bit
+        # ...so if we then go back, .follow_link() for a link near the end (a
+        # few kb in, past the point that always gets read in HTML files because
+        # of HEAD parsing) will only work if it causes a .reload()...
+        r3 = browser.back()
+        browser.follow_link(text="near the end")
+        # ... good, no LinkNotFoundError, so we did reload.
+        # we have .read() the whole file
+        self.assertEqual(len(r3._seek_wrapper__cache.getvalue()), 4202)
+
 ##     def test_cacheftp(self):
 ##         from urllib2 import CacheFTPHandler, build_opener
 ##         o = build_opener(CacheFTPHandler())
@@ -217,8 +286,7 @@
         self._count = 0
         self._testcase = testcase
     def callback(self, block_nr, block_size, total_size):
-        if block_nr != self._count:
-            self._testcase.fail()
+        self._testcase.assertEqual(block_nr, self._count)
         self._count = self._count + 1
 
 

Modified: python-mechanize/branches/upstream/current/mechanize/__init__.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/__init__.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/__init__.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,3 +1,85 @@
+__all__ = [
+    'AbstractBasicAuthHandler',
+    'AbstractDigestAuthHandler',
+    'BaseHandler',
+    'Browser',
+    'BrowserStateError',
+    'CacheFTPHandler',
+    'ContentTooShortError',
+    'Cookie',
+    'CookieJar',
+    'CookiePolicy',
+    'DefaultCookiePolicy',
+    'DefaultFactory',
+    'FTPHandler',
+    'Factory',
+    'FileCookieJar',
+    'FileHandler',
+    'FormNotFoundError',
+    'FormsFactory',
+    'GopherError',
+    'GopherHandler',
+    'HTTPBasicAuthHandler',
+    'HTTPCookieProcessor',
+    'HTTPDefaultErrorHandler',
+    'HTTPDigestAuthHandler',
+    'HTTPEquivProcessor',
+    'HTTPError',
+    'HTTPErrorProcessor',
+    'HTTPHandler',
+    'HTTPPasswordMgr',
+    'HTTPPasswordMgrWithDefaultRealm',
+    'HTTPProxyPasswordMgr',
+    'HTTPRedirectDebugProcessor',
+    'HTTPRedirectHandler',
+    'HTTPRefererProcessor',
+    'HTTPRefreshProcessor',
+    'HTTPRequestUpgradeProcessor',
+    'HTTPResponseDebugProcessor',
+    'HTTPRobotRulesProcessor',
+    'HTTPSClientCertMgr',
+    'HTTPSHandler',
+    'HeadParser',
+    'History',
+    'LWPCookieJar',
+    'Link',
+    'LinkNotFoundError',
+    'LinksFactory',
+    'LoadError',
+    'MSIECookieJar',
+    'MozillaCookieJar',
+    'OpenerDirector',
+    'OpenerFactory',
+    'ParseError',
+    'ProxyBasicAuthHandler',
+    'ProxyDigestAuthHandler',
+    'ProxyHandler',
+    'Request',
+    'ResponseUpgradeProcessor',
+    'RobotExclusionError',
+    'RobustFactory',
+    'RobustFormsFactory',
+    'RobustLinksFactory',
+    'RobustTitleFactory',
+    'SeekableProcessor',
+    'TitleFactory',
+    'URLError',
+    'USE_BARE_EXCEPT',
+    'UnknownHandler',
+    'UserAgent',
+    'UserAgentBase',
+    'XHTMLCompatibleHeadParser',
+    '__version__',
+    'build_opener',
+    'install_opener',
+    'lwp_cookie_str',
+    'make_response',
+    'request_host',
+    'response_seek_wrapper',
+    'str2time',
+    'urlopen',
+    'urlretrieve']
+
 from _mechanize import __version__
 
 # high-level stateful browser-style interface
@@ -2,8 +84,9 @@
 from _mechanize import \
-     Browser, \
+     Browser, History, \
      BrowserStateError, LinkNotFoundError, FormNotFoundError
 
 # configurable URL-opener interface
-from _useragent import UserAgent
+from _useragent import UserAgentBase, UserAgent
 from _html import \
+     ParseError, \
      Link, \
@@ -14,19 +97,19 @@
      RobustFormsFactory, RobustLinksFactory, RobustTitleFactory
 
 # urllib2 work-alike interface (part from mechanize, part from urllib2)
+# This is a superset of the urllib2 interface.
 from _urllib2 import *
 
 # misc
+from _opener import ContentTooShortError, OpenerFactory, urlretrieve
 from _util import http2time as str2time
-from _util import response_seek_wrapper, make_response
-from _urllib2_support import HeadParser
+from _response import response_seek_wrapper, make_response
+from _http import HeadParser
 try:
-    from _urllib2_support import XHTMLCompatibleHeadParser
+    from _http import XHTMLCompatibleHeadParser
 except ImportError:
     pass
-#from _gzip import HTTPGzipProcessor  # crap ATM
 
-
 # cookies
 from _clientcookie import Cookie, CookiePolicy, DefaultCookiePolicy, \
      CookieJar, FileCookieJar, LoadError, request_host

Modified: python-mechanize/branches/upstream/current/mechanize/_auth.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_auth.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_auth.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -11,10 +11,11 @@
 
 """
 
-import re, base64, urlparse, posixpath, md5, sha
+import re, base64, urlparse, posixpath, md5, sha, sys
 
 from urllib2 import BaseHandler
-from urllib import getproxies, unquote, splittype, splituser, splitpasswd
+from urllib import getproxies, unquote, splittype, splituser, splitpasswd, \
+     splitport
 
 
 def _parse_proxy(proxy):
@@ -135,32 +136,45 @@
         # uri could be a single URI or a sequence
         if isinstance(uri, basestring):
             uri = [uri]
-        uri = tuple(map(self.reduce_uri, uri))
         if not realm in self.passwd:
             self.passwd[realm] = {}
-        self.passwd[realm][uri] = (user, passwd)
+        for default_port in True, False:
+            reduced_uri = tuple(
+                [self.reduce_uri(u, default_port) for u in uri])
+            self.passwd[realm][reduced_uri] = (user, passwd)
 
     def find_user_password(self, realm, authuri):
         domains = self.passwd.get(realm, {})
-        authuri = self.reduce_uri(authuri)
-        for uris, authinfo in domains.iteritems():
-            for uri in uris:
-                if self.is_suburi(uri, authuri):
-                    return authinfo
+        for default_port in True, False:
+            reduced_authuri = self.reduce_uri(authuri, default_port)
+            for uris, authinfo in domains.iteritems():
+                for uri in uris:
+                    if self.is_suburi(uri, reduced_authuri):
+                        return authinfo
         return None, None
 
-    def reduce_uri(self, uri):
-        """Accept netloc or URI and extract only the netloc and path"""
+    def reduce_uri(self, uri, default_port=True):
+        """Accept authority or URI and extract only the authority and path."""
+        # note HTTP URLs do not have a userinfo component
         parts = urlparse.urlsplit(uri)
         if parts[1]:
             # URI
-            return parts[1], parts[2] or '/'
-        elif parts[0]:
-            # host:port
-            return uri, '/'
+            scheme = parts[0]
+            authority = parts[1]
+            path = parts[2] or '/'
         else:
-            # host
-            return parts[2], '/'
+            # host or host:port
+            scheme = None
+            authority = uri
+            path = '/'
+        host, port = splitport(authority)
+        if default_port and port is None and scheme is not None:
+            dport = {"http": 80,
+                     "https": 443,
+                     }.get(scheme)
+            if dport is not None:
+                authority = "%s:%d" % (host, dport)
+        return authority, path
 
     def is_suburi(self, base, test):
         """Check if test is below base in a URI tree
@@ -404,6 +418,7 @@
     """
 
     auth_header = 'Authorization'
+    handler_order = 490
 
     def http_error_401(self, req, fp, code, msg, headers):
         host = urlparse.urlparse(req.get_full_url())[1]
@@ -416,6 +431,7 @@
 class ProxyDigestAuthHandler(BaseHandler, AbstractDigestAuthHandler):
 
     auth_header = 'Proxy-Authorization'
+    handler_order = 490
 
     def http_error_407(self, req, fp, code, msg, headers):
         host = req.get_host()
@@ -425,7 +441,7 @@
         return retry
 
 
-
+# XXX ugly implementation, should probably not bother deriving
 class HTTPProxyPasswordMgr(HTTPPasswordMgr):
     # has default realm and host/port
     def add_password(self, realm, uri, user, passwd):
@@ -436,32 +452,34 @@
             uris = uri
         passwd_by_domain = self.passwd.setdefault(realm, {})
         for uri in uris:
-            uri = self.reduce_uri(uri)
-            passwd_by_domain[uri] = (user, passwd)
+            for default_port in True, False:
+                reduced_uri = self.reduce_uri(uri, default_port)
+                passwd_by_domain[reduced_uri] = (user, passwd)
 
     def find_user_password(self, realm, authuri):
-        perms = [(realm, authuri), (None, authuri)]
+        attempts = [(realm, authuri), (None, authuri)]
         # bleh, want default realm to take precedence over default
         # URI/authority, hence this outer loop
         for default_uri in False, True:
-            for realm, authuri in perms:
+            for realm, authuri in attempts:
                 authinfo_by_domain = self.passwd.get(realm, {})
-                reduced_authuri = self.reduce_uri(authuri)
-                for uri, authinfo in authinfo_by_domain.iteritems():
-                    if uri is None and not default_uri:
-                        continue
-                    if self.is_suburi(uri, reduced_authuri):
-                        return authinfo
-                user, password = None, None
+                for default_port in True, False:
+                    reduced_authuri = self.reduce_uri(authuri, default_port)
+                    for uri, authinfo in authinfo_by_domain.iteritems():
+                        if uri is None and not default_uri:
+                            continue
+                        if self.is_suburi(uri, reduced_authuri):
+                            return authinfo
+                    user, password = None, None
 
-                if user is not None:
-                    break
+                    if user is not None:
+                        break
         return user, password
 
-    def reduce_uri(self, uri):
+    def reduce_uri(self, uri, default_port=True):
         if uri is None:
             return None
-        return HTTPPasswordMgr.reduce_uri(self, uri)
+        return HTTPPasswordMgr.reduce_uri(self, uri, default_port)
 
     def is_suburi(self, base, test):
         if base is None:
@@ -469,3 +487,11 @@
             hostport, path = test
             base = (hostport, "/")
         return HTTPPasswordMgr.is_suburi(self, base, test)
+
+
+class HTTPSClientCertMgr(HTTPPasswordMgr):
+    # implementation inheritance: this is not a proper subclass
+    def add_key_cert(self, uri, key_file, cert_file):
+        self.add_password(None, uri, key_file, cert_file)
+    def find_key_cert(self, authuri):
+        return HTTPPasswordMgr.find_user_password(self, None, authuri)
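
A small sketch of what the default-port handling above buys (realm, host
and credentials are made up): add_password() now stores both reduced forms
of each URI, and find_user_password() tries both, so a password registered
without an explicit port also matches requests that spell out the scheme's
default port, and vice versa.

    from mechanize._auth import HTTPPasswordMgr

    mgr = HTTPPasswordMgr()
    mgr.add_password("realm", "http://example.com/", "joe", "secret")

    # both spellings now match; a non-default port such as :8080 would not
    print mgr.find_user_password("realm", "http://example.com/private")
    print mgr.find_user_password("realm", "http://example.com:80/private")
    # -> ('joe', 'secret') in both cases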

Added: python-mechanize/branches/upstream/current/mechanize/_beautifulsoup.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_beautifulsoup.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_beautifulsoup.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,1080 @@
+"""Beautiful Soup
+Elixir and Tonic
+"The Screen-Scraper's Friend"
+v2.1.1
+http://www.crummy.com/software/BeautifulSoup/
+
+Beautiful Soup parses arbitrarily invalid XML- or HTML-like substance
+into a tree representation. It provides methods and Pythonic idioms
+that make it easy to search and modify the tree.
+
+A well-formed XML/HTML document will yield a well-formed data
+structure. An ill-formed XML/HTML document will yield a
+correspondingly ill-formed data structure. If your document is only
+locally well-formed, you can use this library to find and process the
+well-formed part of it. The BeautifulSoup class has heuristics for
+obtaining a sensible parse tree in the face of common HTML errors.
+
+Beautiful Soup has no external dependencies. It works with Python 2.2
+and up.
+
+Beautiful Soup defines classes for four different parsing strategies:
+
+ * BeautifulStoneSoup, for parsing XML, SGML, or your domain-specific
+   language that kind of looks like XML.
+
+ * BeautifulSoup, for parsing run-of-the-mill HTML code, be it valid
+   or invalid.
+
+ * ICantBelieveItsBeautifulSoup, for parsing valid but bizarre HTML
+   that trips up BeautifulSoup.
+
+ * BeautifulSOAP, for making it easier to parse XML documents that use
+   lots of subelements containing a single string, where you'd prefer
+   they put that string into an attribute (such as SOAP messages).
+
+You can subclass BeautifulStoneSoup or BeautifulSoup to create a
+parsing strategy specific to an XML schema or a particular bizarre
+HTML document. Typically your subclass would just override
+SELF_CLOSING_TAGS and/or NESTABLE_TAGS.
+"""
+from __future__ import generators
+
+__author__ = "Leonard Richardson (leonardr at segfault.org)"
+__version__ = "2.1.1"
+__date__ = "$Date: 2004/10/18 00:14:20 $"
+__copyright__ = "Copyright (c) 2004-2005 Leonard Richardson"
+__license__ = "PSF"
+
+from sgmllib import SGMLParser, SGMLParseError
+import types
+import re
+import sgmllib
+
+#This code makes Beautiful Soup able to parse XML with namespaces
+sgmllib.tagfind = re.compile('[a-zA-Z][-_.:a-zA-Z0-9]*')
+
+class NullType(object):
+
+    """Similar to NoneType with a corresponding singleton instance
+    'Null' that, unlike None, accepts any message and returns itself.
+
+    Examples:
+    >>> Null("send", "a", "message")("and one more",
+    ...      "and what you get still") is Null
+    True
+    """
+
+    def __new__(cls):                    return Null
+    def __call__(self, *args, **kwargs): return Null
+##    def __getstate__(self, *args):       return Null
+    def __getattr__(self, attr):         return Null
+    def __getitem__(self, item):         return Null
+    def __setattr__(self, attr, value):  pass
+    def __setitem__(self, item, value):  pass
+    def __len__(self):                   return 0
+    # FIXME: is this a python bug? otherwise ``for x in Null: pass``
+    #        never terminates...
+    def __iter__(self):                  return iter([])
+    def __contains__(self, item):        return False
+    def __repr__(self):                  return "Null"
+Null = object.__new__(NullType)
+
+class PageElement:
+    """Contains the navigational information for some part of the page
+    (either a tag or a piece of text)"""
+
+    def setup(self, parent=Null, previous=Null):
+        """Sets up the initial relations between this element and
+        other elements."""
+        self.parent = parent
+        self.previous = previous
+        self.next = Null
+        self.previousSibling = Null
+        self.nextSibling = Null
+        if self.parent and self.parent.contents:
+            self.previousSibling = self.parent.contents[-1]
+            self.previousSibling.nextSibling = self
+
+    def findNext(self, name=None, attrs={}, text=None):
+        """Returns the first item that matches the given criteria and
+        appears after this Tag in the document."""
+        return self._first(self.fetchNext, name, attrs, text)
+    firstNext = findNext
+
+    def fetchNext(self, name=None, attrs={}, text=None, limit=None):
+        """Returns all items that match the given criteria and appear
+        after this Tag in the document."""
+        return self._fetch(name, attrs, text, limit, self.nextGenerator)
+
+    def findNextSibling(self, name=None, attrs={}, text=None):
+        """Returns the closest sibling to this Tag that matches the
+        given criteria and appears after this Tag in the document."""
+        return self._first(self.fetchNextSiblings, name, attrs, text)
+    firstNextSibling = findNextSibling
+
+    def fetchNextSiblings(self, name=None, attrs={}, text=None, limit=None):
+        """Returns the siblings of this Tag that match the given
+        criteria and appear after this Tag in the document."""
+        return self._fetch(name, attrs, text, limit, self.nextSiblingGenerator)
+
+    def findPrevious(self, name=None, attrs={}, text=None):
+        """Returns the first item that matches the given criteria and
+        appears before this Tag in the document."""
+        return self._first(self.fetchPrevious, name, attrs, text)
+
+    def fetchPrevious(self, name=None, attrs={}, text=None, limit=None):
+        """Returns all items that match the given criteria and appear
+        before this Tag in the document."""
+        return self._fetch(name, attrs, text, limit, self.previousGenerator)
+    firstPrevious = findPrevious
+
+    def findPreviousSibling(self, name=None, attrs={}, text=None):
+        """Returns the closest sibling to this Tag that matches the
+        given criteria and appears before this Tag in the document."""
+        return self._first(self.fetchPreviousSiblings, name, attrs, text)
+    firstPreviousSibling = findPreviousSibling
+
+    def fetchPreviousSiblings(self, name=None, attrs={}, text=None,
+                              limit=None):
+        """Returns the siblings of this Tag that match the given
+        criteria and appear before this Tag in the document."""
+        return self._fetch(name, attrs, text, limit,
+                           self.previousSiblingGenerator)
+
+    def findParent(self, name=None, attrs={}):
+        """Returns the closest parent of this Tag that matches the given
+        criteria."""
+        r = Null
+        l = self.fetchParents(name, attrs, 1)
+        if l:
+            r = l[0]
+        return r
+    firstParent = findParent
+
+    def fetchParents(self, name=None, attrs={}, limit=None):
+        """Returns the parents of this Tag that match the given
+        criteria."""
+        return self._fetch(name, attrs, None, limit, self.parentGenerator)
+
+    #These methods do the real heavy lifting.
+
+    def _first(self, method, name, attrs, text):
+        r = Null
+        l = method(name, attrs, text, 1)
+        if l:
+            r = l[0]
+        return r
+    
+    def _fetch(self, name, attrs, text, limit, generator):
+        "Iterates over a generator looking for things that match."
+        if not hasattr(attrs, 'items'):
+            attrs = {'class' : attrs}
+
+        results = []
+        g = generator()
+        while True:
+            try:
+                i = g.next()
+            except StopIteration:
+                break
+            found = None
+            if isinstance(i, Tag):
+                if not text:
+                    if not name or self._matches(i, name):
+                        match = True
+                        for attr, matchAgainst in attrs.items():
+                            check = i.get(attr)
+                            if not self._matches(check, matchAgainst):
+                                match = False
+                                break
+                        if match:
+                            found = i
+            elif text:
+                if self._matches(i, text):
+                    found = i                    
+            if found:
+                results.append(found)
+                if limit and len(results) >= limit:
+                    break
+        return results
+
+    #Generators that can be used to navigate starting from both
+    #NavigableTexts and Tags.                
+    def nextGenerator(self):
+        i = self
+        while i:
+            i = i.next
+            yield i
+
+    def nextSiblingGenerator(self):
+        i = self
+        while i:
+            i = i.nextSibling
+            yield i
+
+    def previousGenerator(self):
+        i = self
+        while i:
+            i = i.previous
+            yield i
+
+    def previousSiblingGenerator(self):
+        i = self
+        while i:
+            i = i.previousSibling
+            yield i
+
+    def parentGenerator(self):
+        i = self
+        while i:
+            i = i.parent
+            yield i
+
+    def _matches(self, chunk, howToMatch):
+        #print 'looking for %s in %s' % (howToMatch, chunk)
+        #
+        # If given a list of items, return true if the list contains a
+        # text element that matches.
+        if isList(chunk) and not isinstance(chunk, Tag):
+            for tag in chunk:
+                if isinstance(tag, NavigableText) and self._matches(tag, howToMatch):
+                    return True
+            return False
+        if callable(howToMatch):
+            return howToMatch(chunk)
+        if isinstance(chunk, Tag):
+            #Custom match methods take the tag as an argument, but all other
+            #ways of matching match the tag name as a string
+            chunk = chunk.name
+        #Now we know that chunk is a string
+        if not isinstance(chunk, basestring):
+            chunk = str(chunk)
+        if hasattr(howToMatch, 'match'):
+            # It's a regexp object.
+            return howToMatch.search(chunk)
+        if isList(howToMatch):
+            return chunk in howToMatch
+        if hasattr(howToMatch, 'items'):
+            return howToMatch.has_key(chunk)
+        #It's just a string
+        return str(howToMatch) == chunk
+
+class NavigableText(PageElement):
+
+    def __getattr__(self, attr):
+        "For backwards compatibility, text.string gives you text"
+        if attr == 'string':
+            return self
+        else:
+            raise AttributeError, "'%s' object has no attribute '%s'" % (self.__class__.__name__, attr)
+        
+class NavigableString(str, NavigableText):
+    pass
+
+class NavigableUnicodeString(unicode, NavigableText):
+    pass
+
+class Tag(PageElement):
+
+    """Represents a found HTML tag with its attributes and contents."""
+
+    def __init__(self, name, attrs=None, parent=Null, previous=Null):
+        "Basic constructor."
+        self.name = name
+        if attrs == None:
+            attrs = []
+        self.attrs = attrs
+        self.contents = []
+        self.setup(parent, previous)
+        self.hidden = False
+
+    def get(self, key, default=None):
+        """Returns the value of the 'key' attribute for the tag, or
+        the value given for 'default' if it doesn't have that
+        attribute."""
+        return self._getAttrMap().get(key, default)    
+
+    def __getitem__(self, key):
+        """tag[key] returns the value of the 'key' attribute for the tag,
+        and throws an exception if it's not there."""
+        return self._getAttrMap()[key]
+
+    def __iter__(self):
+        "Iterating over a tag iterates over its contents."
+        return iter(self.contents)
+
+    def __len__(self):
+        "The length of a tag is the length of its list of contents."
+        return len(self.contents)
+
+    def __contains__(self, x):
+        return x in self.contents
+
+    def __nonzero__(self):
+        "A tag is non-None even if it has no contents."
+        return True
+
+    def __setitem__(self, key, value):        
+        """Setting tag[key] sets the value of the 'key' attribute for the
+        tag."""
+        self._getAttrMap()
+        self.attrMap[key] = value
+        found = False
+        for i in range(0, len(self.attrs)):
+            if self.attrs[i][0] == key:
+                self.attrs[i] = (key, value)
+                found = True
+        if not found:
+            self.attrs.append((key, value))
+        self._getAttrMap()[key] = value
+
+    def __delitem__(self, key):
+        "Deleting tag[key] deletes all 'key' attributes for the tag."
+        for item in self.attrs:
+            if item[0] == key:
+                self.attrs.remove(item)
+                #We don't break because bad HTML can define the same
+                #attribute multiple times.
+            self._getAttrMap()
+            if self.attrMap.has_key(key):
+                del self.attrMap[key]
+
+    def __call__(self, *args, **kwargs):
+        """Calling a tag like a function is the same as calling its
+        fetch() method. Eg. tag('a') returns a list of all the A tags
+        found within this tag."""
+        return apply(self.fetch, args, kwargs)
+
+    def __getattr__(self, tag):
+        if len(tag) > 3 and tag.rfind('Tag') == len(tag)-3:
+            return self.first(tag[:-3])
+        elif tag.find('__') != 0:
+            return self.first(tag)
+
+    def __eq__(self, other):
+        """Returns true iff this tag has the same name, the same attributes,
+        and the same contents (recursively) as the given tag.
+
+        NOTE: right now this will return false if two tags have the
+        same attributes in a different order. Should this be fixed?"""
+        if not hasattr(other, 'name') or not hasattr(other, 'attrs') or not hasattr(other, 'contents') or self.name != other.name or self.attrs != other.attrs or len(self) != len(other):
+            return False
+        for i in range(0, len(self.contents)):
+            if self.contents[i] != other.contents[i]:
+                return False
+        return True
+
+    def __ne__(self, other):
+        """Returns true iff this tag is not identical to the other tag,
+        as defined in __eq__."""
+        return not self == other
+
+    def __repr__(self):
+        """Renders this tag as a string."""
+        return str(self)
+
+    def __unicode__(self):
+        return self.__str__(1)
+
+    def __str__(self, needUnicode=None, showStructureIndent=None):
+        """Returns a string or Unicode representation of this tag and
+        its contents.
+
+        NOTE: since Python's HTML parser consumes whitespace, this
+        method is not certain to reproduce the whitespace present in
+        the original string."""
+        
+        attrs = []
+        if self.attrs:
+            for key, val in self.attrs:
+                attrs.append('%s="%s"' % (key, val))
+        close = ''
+        closeTag = ''
+        if self.isSelfClosing():
+            close = ' /'
+        else:
+            closeTag = '</%s>' % self.name
+        indentIncrement = None        
+        if showStructureIndent != None:
+            indentIncrement = showStructureIndent
+            if not self.hidden:
+                indentIncrement += 1
+        contents = self.renderContents(indentIncrement, needUnicode=needUnicode)        
+        if showStructureIndent:
+            space = '\n%s' % (' ' * showStructureIndent)
+        if self.hidden:
+            s = contents
+        else:
+            s = []
+            attributeString = ''
+            if attrs:
+                attributeString = ' ' + ' '.join(attrs)            
+            if showStructureIndent:
+                s.append(space)
+            s.append('<%s%s%s>' % (self.name, attributeString, close))
+            s.append(contents)
+            if closeTag and showStructureIndent != None:
+                s.append(space)
+            s.append(closeTag)
+            s = ''.join(s)
+        isUnicode = type(s) == types.UnicodeType
+        if needUnicode and not isUnicode:
+            s = unicode(s)
+        elif isUnicode and needUnicode==False:
+            s = str(s)
+        return s
+
+    def prettify(self, needUnicode=None):
+        return self.__str__(needUnicode, showStructureIndent=True)
+
+    def renderContents(self, showStructureIndent=None, needUnicode=None):
+        """Renders the contents of this tag as a (possibly Unicode) 
+        string."""
+        s=[]
+        for c in self:
+            text = None
+            if isinstance(c, NavigableUnicodeString) or type(c) == types.UnicodeType:
+                text = unicode(c)
+            elif isinstance(c, Tag):
+                s.append(c.__str__(needUnicode, showStructureIndent))
+            elif needUnicode:
+                text = unicode(c)
+            else:
+                text = str(c)
+            if text:
+                if showStructureIndent != None:
+                    if text[-1] == '\n':
+                        text = text[:-1]
+                s.append(text)
+        return ''.join(s)    
+
+    #Soup methods
+
+    def firstText(self, text, recursive=True):
+        """Convenience method to retrieve the first piece of text matching the
+        given criteria. 'text' can be a string, a regular expression object,
+        a callable that takes a string and returns whether or not the
+        string 'matches', etc."""
+        return self.first(recursive=recursive, text=text)
+
+    def fetchText(self, text, recursive=True, limit=None):
+        """Convenience method to retrieve all pieces of text matching the
+        given criteria. 'text' can be a string, a regular expression object,
+        a callable that takes a string and returns whether or not the
+        string 'matches', etc."""
+        return self.fetch(recursive=recursive, text=text, limit=limit)
+
+    def first(self, name=None, attrs={}, recursive=True, text=None):
+        """Return only the first child of this
+        Tag matching the given criteria."""
+        r = Null
+        l = self.fetch(name, attrs, recursive, text, 1)
+        if l:
+            r = l[0]
+        return r
+    findChild = first
+
+    def fetch(self, name=None, attrs={}, recursive=True, text=None,
+              limit=None):
+        """Extracts a list of Tag objects that match the given
+        criteria.  You can specify the name of the Tag and any
+        attributes you want the Tag to have.
+
+        The value of a key-value pair in the 'attrs' map can be a
+        string, a list of strings, a regular expression object, or a
+        callable that takes a string and returns whether or not the
+        string matches for some custom definition of 'matches'. The
+        same is true of the tag name."""
+        generator = self.recursiveChildGenerator
+        if not recursive:
+            generator = self.childGenerator
+        return self._fetch(name, attrs, text, limit, generator)
+    fetchChildren = fetch
+    
+    #Utility methods
+
+    def isSelfClosing(self):
+        """Returns true iff this is a self-closing tag as defined in the HTML
+        standard.
+
+        TODO: This is specific to BeautifulSoup and its subclasses, but it's
+        used by __str__"""
+        return self.name in BeautifulSoup.SELF_CLOSING_TAGS
+
+    def append(self, tag):
+        """Appends the given tag to the contents of this tag."""
+        self.contents.append(tag)
+
+    #Private methods
+
+    def _getAttrMap(self):
+        """Initializes a map representation of this tag's attributes,
+        if not already initialized."""
+        if not getattr(self, 'attrMap'):
+            self.attrMap = {}
+            for (key, value) in self.attrs:
+                self.attrMap[key] = value 
+        return self.attrMap
+
+    #Generator methods
+    def childGenerator(self):
+        for i in range(0, len(self.contents)):
+            yield self.contents[i]
+        raise StopIteration
+    
+    def recursiveChildGenerator(self):
+        stack = [(self, 0)]
+        while stack:
+            tag, start = stack.pop()
+            if isinstance(tag, Tag):            
+                for i in range(start, len(tag.contents)):
+                    a = tag.contents[i]
+                    yield a
+                    if isinstance(a, Tag) and tag.contents:
+                        if i < len(tag.contents) - 1:
+                            stack.append((tag, i+1))
+                        stack.append((a, 0))
+                        break
+        raise StopIteration
+
+
+def isList(l):
+    """Convenience method that works with all 2.x versions of Python
+    to determine whether or not something is listlike."""
+    return hasattr(l, '__iter__') \
+           or (type(l) in (types.ListType, types.TupleType))
+
+def buildTagMap(default, *args):
+    """Turns a list of maps, lists, or scalars into a single map.
+    Used to build the SELF_CLOSING_TAGS and NESTABLE_TAGS maps out
+    of lists and partial maps."""
+    built = {}
+    for portion in args:
+        if hasattr(portion, 'items'):
+            #It's a map. Merge it.
+            for k,v in portion.items():
+                built[k] = v
+        elif isList(portion):
+            #It's a list. Map each item to the default.
+            for k in portion:
+                built[k] = default
+        else:
+            #It's a scalar. Map it to the default.
+            built[portion] = default
+    return built
+
+class BeautifulStoneSoup(Tag, SGMLParser):
+
+    """This class contains the basic parser and fetch code. It defines
+    a parser that knows nothing about tag behavior except for the
+    following:
+   
+      You can't close a tag without closing all the tags it encloses.
+      That is, "<foo><bar></foo>" actually means
+      "<foo><bar></bar></foo>".
+
+    [Another possible explanation is "<foo><bar /></foo>", but since
+    this class defines no SELF_CLOSING_TAGS, it will never use that
+    explanation.]
+
+    This class is useful for parsing XML or made-up markup languages,
+    or when BeautifulSoup makes an assumption counter to what you were
+    expecting."""
+
+    SELF_CLOSING_TAGS = {}
+    NESTABLE_TAGS = {}
+    RESET_NESTING_TAGS = {}
+    QUOTE_TAGS = {}
+
+    #As a public service we will by default silently replace MS smart quotes
+    #and similar characters with their HTML or ASCII equivalents.
+    MS_CHARS = { '\x80' : '&euro;',
+                 '\x81' : ' ',
+                 '\x82' : '&sbquo;',
+                 '\x83' : '&fnof;',
+                 '\x84' : '&bdquo;',
+                 '\x85' : '&hellip;',
+                 '\x86' : '&dagger;',
+                 '\x87' : '&Dagger;',
+                 '\x88' : '&caret;',
+                 '\x89' : '%',
+                 '\x8A' : '&Scaron;',
+                 '\x8B' : '&lt;',
+                 '\x8C' : '&OElig;',
+                 '\x8D' : '?',
+                 '\x8E' : 'Z',
+                 '\x8F' : '?',
+                 '\x90' : '?',
+                 '\x91' : '&lsquo;',
+                 '\x92' : '&rsquo;',
+                 '\x93' : '&ldquo;',
+                 '\x94' : '&rdquo;',
+                 '\x95' : '&bull;',
+                 '\x96' : '&ndash;',
+                 '\x97' : '&mdash;',
+                 '\x98' : '&tilde;',
+                 '\x99' : '&trade;',
+                 '\x9a' : '&scaron;',
+                 '\x9b' : '&gt;',
+                 '\x9c' : '&oelig;',
+                 '\x9d' : '?',
+                 '\x9e' : 'z',
+                 '\x9f' : '&Yuml;',}
+
+    PARSER_MASSAGE = [(re.compile('(<[^<>]*)/>'),
+                       lambda(x):x.group(1) + ' />'),
+                      (re.compile('<!\s+([^<>]*)>'),
+                       lambda(x):'<!' + x.group(1) + '>'),
+                      (re.compile("([\x80-\x9f])"),
+                       lambda(x): BeautifulStoneSoup.MS_CHARS.get(x.group(1)))
+                      ]
+
+    ROOT_TAG_NAME = '[document]'
+
+    def __init__(self, text=None, avoidParserProblems=True,
+                 initialTextIsEverything=True):
+        """Initialize this as the 'root tag' and feed in any text to
+        the parser.
+
+        NOTE about avoidParserProblems: sgmllib will process most bad
+        HTML, and BeautifulSoup has tricks for dealing with some HTML
+        that kills sgmllib, but Beautiful Soup can nonetheless choke
+        or lose data if your data uses self-closing tags or
+        declarations incorrectly. By default, Beautiful Soup sanitizes
+        its input to avoid the vast majority of these problems. The
+        problems are relatively rare, even in bad HTML, so feel free
+        to pass in False to avoidParserProblems if they don't apply to
+        you, and you'll get better performance. The only reason I have
+        this turned on by default is so I don't get so many tech
+        support questions.
+
+        The two most common instances of invalid HTML that will choke
+        sgmllib are fixed by the default parser massage techniques:
+
+         <br/> (No space between name of closing tag and tag close)
+         <! --Comment--> (Extraneous whitespace in declaration)
+
+        You can pass in a custom list of (RE object, replace method)
+        tuples to get Beautiful Soup to scrub your input the way you
+        want."""
+        Tag.__init__(self, self.ROOT_TAG_NAME)
+        if avoidParserProblems \
+           and not isList(avoidParserProblems):
+            avoidParserProblems = self.PARSER_MASSAGE            
+        self.avoidParserProblems = avoidParserProblems
+        SGMLParser.__init__(self)
+        self.quoteStack = []
+        self.hidden = 1
+        self.reset()
+        if hasattr(text, 'read'):
+            #It's a file-type object.
+            text = text.read()
+        if text:
+            self.feed(text)
+        if initialTextIsEverything:
+            self.done()
+
+    def __getattr__(self, methodName):
+        """This method routes method call requests to either the SGMLParser
+        superclass or the Tag superclass, depending on the method name."""
+        if methodName.find('start_') == 0 or methodName.find('end_') == 0 \
+               or methodName.find('do_') == 0:
+            return SGMLParser.__getattr__(self, methodName)
+        elif methodName.find('__') != 0:
+            return Tag.__getattr__(self, methodName)
+        else:
+            raise AttributeError
+
+    def feed(self, text):
+        if self.avoidParserProblems:
+            for fix, m in self.avoidParserProblems:
+                text = fix.sub(m, text)
+        SGMLParser.feed(self, text)
+
+    def done(self):
+        """Called when you're done parsing, so that the unclosed tags can be
+        correctly processed."""
+        self.endData() #NEW
+        while self.currentTag.name != self.ROOT_TAG_NAME:
+            self.popTag()
+            
+    def reset(self):
+        SGMLParser.reset(self)
+        self.currentData = []
+        self.currentTag = None
+        self.tagStack = []
+        self.pushTag(self)        
+    
+    def popTag(self):
+        tag = self.tagStack.pop()
+        # Tags with just one string-owning child get the child as a
+        # 'string' property, so that soup.tag.string is shorthand for
+        # soup.tag.contents[0]
+        if len(self.currentTag.contents) == 1 and \
+           isinstance(self.currentTag.contents[0], NavigableText):
+            self.currentTag.string = self.currentTag.contents[0]
+
+        #print "Pop", tag.name
+        if self.tagStack:
+            self.currentTag = self.tagStack[-1]
+        return self.currentTag
+
+    def pushTag(self, tag):
+        #print "Push", tag.name
+        if self.currentTag:
+            self.currentTag.append(tag)
+        self.tagStack.append(tag)
+        self.currentTag = self.tagStack[-1]
+
+    def endData(self):
+        currentData = ''.join(self.currentData)
+        if currentData:
+            if not currentData.strip():
+                if '\n' in currentData:
+                    currentData = '\n'
+                else:
+                    currentData = ' '
+            c = NavigableString
+            if type(currentData) == types.UnicodeType:
+                c = NavigableUnicodeString
+            o = c(currentData)
+            o.setup(self.currentTag, self.previous)
+            if self.previous:
+                self.previous.next = o
+            self.previous = o
+            self.currentTag.contents.append(o)
+        self.currentData = []
+
+    def _popToTag(self, name, inclusivePop=True):
+        """Pops the tag stack up to and including the most recent
+        instance of the given tag. If inclusivePop is false, pops the tag
+        stack up to but *not* including the most recent instance of
+        the given tag."""
+        if name == self.ROOT_TAG_NAME:
+            return            
+
+        numPops = 0
+        mostRecentTag = None
+        for i in range(len(self.tagStack)-1, 0, -1):
+            if name == self.tagStack[i].name:
+                numPops = len(self.tagStack)-i
+                break
+        if not inclusivePop:
+            numPops = numPops - 1
+
+        for i in range(0, numPops):
+            mostRecentTag = self.popTag()
+        return mostRecentTag    
+
+    def _smartPop(self, name):
+
+        """We need to pop up to the previous tag of this type, unless
+        one of this tag's nesting reset triggers comes between this
+        tag and the previous tag of this type, OR unless this tag is a
+        generic nesting trigger and another generic nesting trigger
+        comes between this tag and the previous tag of this type.
+
+        Examples:
+         <p>Foo<b>Bar<p> should pop to 'p', not 'b'.
+         <p>Foo<table>Bar<p> should pop to 'table', not 'p'.
+         <p>Foo<table><tr>Bar<p> should pop to 'tr', not 'p'.
+
+         <li><ul><li> *<li>* should pop to 'ul', not the first 'li'.
+         <tr><table><tr> *<tr>* should pop to 'table', not the first 'tr'
+         <td><tr><td> *<td>* should pop to 'tr', not the first 'td'
+        """
+
+        nestingResetTriggers = self.NESTABLE_TAGS.get(name)
+        isNestable = nestingResetTriggers != None
+        isResetNesting = self.RESET_NESTING_TAGS.has_key(name)
+        popTo = None
+        inclusive = True
+        for i in range(len(self.tagStack)-1, 0, -1):
+            p = self.tagStack[i]
+            if (not p or p.name == name) and not isNestable:
+                #Non-nestable tags get popped to the top or to their
+                #last occurrence.
+                popTo = name
+                break
+            if (nestingResetTriggers != None
+                and p.name in nestingResetTriggers) \
+                or (nestingResetTriggers == None and isResetNesting
+                    and self.RESET_NESTING_TAGS.has_key(p.name)):
+                
+                #If we encounter one of the nesting reset triggers
+                #peculiar to this tag, or we encounter another tag
+                #that causes nesting to reset, pop up to but not
+                #including that tag.
+
+                popTo = p.name
+                inclusive = False
+                break
+            p = p.parent
+        if popTo:
+            self._popToTag(popTo, inclusive)
+
+    def unknown_starttag(self, name, attrs, selfClosing=0):
+        #print "Start tag %s" % name
+        if self.quoteStack:
+            #This is not a real tag.
+            #print "<%s> is not real!" % name
+            attrs = ''.join(map(lambda(x, y): ' %s="%s"' % (x, y), attrs))
+            self.handle_data('<%s%s>' % (name, attrs))
+            return
+        self.endData()
+        if not name in self.SELF_CLOSING_TAGS and not selfClosing:
+            self._smartPop(name)
+        tag = Tag(name, attrs, self.currentTag, self.previous)        
+        if self.previous:
+            self.previous.next = tag
+        self.previous = tag
+        self.pushTag(tag)
+        if selfClosing or name in self.SELF_CLOSING_TAGS:
+            self.popTag()                
+        if name in self.QUOTE_TAGS:
+            #print "Beginning quote (%s)" % name
+            self.quoteStack.append(name)
+            self.literal = 1
+
+    def unknown_endtag(self, name):
+        if self.quoteStack and self.quoteStack[-1] != name:
+            #This is not a real end tag.
+            #print "</%s> is not real!" % name
+            self.handle_data('</%s>' % name)
+            return
+        self.endData()
+        self._popToTag(name)
+        if self.quoteStack and self.quoteStack[-1] == name:
+            self.quoteStack.pop()
+            self.literal = (len(self.quoteStack) > 0)
+
+    def handle_data(self, data):
+        self.currentData.append(data)
+
+    def handle_pi(self, text):
+        "Propagate processing instructions right through."
+        self.handle_data("<?%s>" % text)
+
+    def handle_comment(self, text):
+        "Propagate comments right through."
+        self.handle_data("<!--%s-->" % text)
+
+    def handle_charref(self, ref):
+        "Propagate char refs right through."
+        self.handle_data('&#%s;' % ref)
+
+    def handle_entityref(self, ref):
+        "Propagate entity refs right through."
+        self.handle_data('&%s;' % ref)
+        
+    def handle_decl(self, data):
+        "Propagate DOCTYPEs and the like right through."
+        self.handle_data('<!%s>' % data)
+
+    def parse_declaration(self, i):
+        """Treat a bogus SGML declaration as raw data. Treat a CDATA
+        declaration as regular data."""
+        j = None
+        if self.rawdata[i:i+9] == '<![CDATA[':
+             k = self.rawdata.find(']]>', i)
+             if k == -1:
+                 k = len(self.rawdata)
+             self.handle_data(self.rawdata[i+9:k])
+             j = k+3
+        else:
+            try:
+                j = SGMLParser.parse_declaration(self, i)
+            except SGMLParseError:
+                toHandle = self.rawdata[i:]
+                self.handle_data(toHandle)
+                j = i + len(toHandle)
+        return j
+
+class BeautifulSoup(BeautifulStoneSoup):
+
+    """This parser knows the following facts about HTML:
+
+    * Some tags have no closing tag and should be interpreted as being
+      closed as soon as they are encountered.
+
+    * The text inside some tags (ie. 'script') may contain tags which
+      are not really part of the document and which should be parsed
+      as text, not tags. If you want to parse the text as tags, you can
+      always fetch it and parse it explicitly.
+
+    * Tag nesting rules:
+
+      Most tags can't be nested at all. For instance, the occurrence of
+      a <p> tag should implicitly close the previous <p> tag.
+
+       <p>Para1<p>Para2
+        should be transformed into:
+       <p>Para1</p><p>Para2
+
+      Some tags can be nested arbitrarily. For instance, the occurrence
+      of a <blockquote> tag should _not_ implicitly close the previous
+      <blockquote> tag.
+
+       Alice said: <blockquote>Bob said: <blockquote>Blah
+        should NOT be transformed into:
+       Alice said: <blockquote>Bob said: </blockquote><blockquote>Blah
+
+      Some tags can be nested, but the nesting is reset by the
+      interposition of other tags. For instance, a <tr> tag should
+      implicitly close the previous <tr> tag within the same <table>,
+      but not close a <tr> tag in another table.
+
+       <table><tr>Blah<tr>Blah
+        should be transformed into:
+       <table><tr>Blah</tr><tr>Blah
+        but,
+       <tr>Blah<table><tr>Blah
+        should NOT be transformed into
+       <tr>Blah<table></tr><tr>Blah
+
+    Differing assumptions about tag nesting rules are a major source
+    of problems with the BeautifulSoup class. If BeautifulSoup is not
+    treating as nestable a tag your page author treats as nestable,
+    try ICantBelieveItsBeautifulSoup before writing your own
+    subclass."""
+
+    SELF_CLOSING_TAGS = buildTagMap(None, ['br' , 'hr', 'input', 'img', 'meta',
+                                           'spacer', 'link', 'frame', 'base'])
+
+    QUOTE_TAGS = {'script': None}
+    
+    #According to the HTML standard, each of these inline tags can
+    #contain another tag of the same type. Furthermore, it's common
+    #to actually use these tags this way.
+    NESTABLE_INLINE_TAGS = ['span', 'font', 'q', 'object', 'bdo', 'sub', 'sup',
+                            'center']
+
+    #According to the HTML standard, these block tags can contain
+    #another tag of the same type. Furthermore, it's common
+    #to actually use these tags this way.
+    NESTABLE_BLOCK_TAGS = ['blockquote', 'div', 'fieldset', 'ins', 'del']
+
+    #Lists can contain other lists, but there are restrictions.    
+    NESTABLE_LIST_TAGS = { 'ol' : [],
+                           'ul' : [],
+                           'li' : ['ul', 'ol'],
+                           'dl' : [],
+                           'dd' : ['dl'],
+                           'dt' : ['dl'] }
+
+    #Tables can contain other tables, but there are restrictions.    
+    NESTABLE_TABLE_TAGS = {'table' : [], 
+                           'tr' : ['table', 'tbody', 'tfoot', 'thead'],
+                           'td' : ['tr'],
+                           'th' : ['tr'],
+                           }
+
+    NON_NESTABLE_BLOCK_TAGS = ['address', 'form', 'p', 'pre']
+
+    #If one of these tags is encountered, all tags up to the next tag of
+    #this type are popped.
+    RESET_NESTING_TAGS = buildTagMap(None, NESTABLE_BLOCK_TAGS, 'noscript',
+                                     NON_NESTABLE_BLOCK_TAGS,
+                                     NESTABLE_LIST_TAGS,
+                                     NESTABLE_TABLE_TAGS)
+
+    NESTABLE_TAGS = buildTagMap([], NESTABLE_INLINE_TAGS, NESTABLE_BLOCK_TAGS,
+                                NESTABLE_LIST_TAGS, NESTABLE_TABLE_TAGS)
+    
+class ICantBelieveItsBeautifulSoup(BeautifulSoup):
+
+    """The BeautifulSoup class is oriented towards skipping over
+    common HTML errors like unclosed tags. However, sometimes it makes
+    errors of its own. For instance, consider this fragment:
+
+     <b>Foo<b>Bar</b></b>
+
+    This is perfectly valid (if bizarre) HTML. However, the
+    BeautifulSoup class will implicitly close the first b tag when it
+    encounters the second 'b'. It will think the author wrote
+    "<b>Foo<b>Bar", and didn't close the first 'b' tag, because
+    there's no real-world reason to bold something that's already
+    bold. When it encounters '</b></b>' it will close two more 'b'
+    tags, for a grand total of three tags closed instead of two. This
+    can throw off the rest of your document structure. The same is
+    true of a number of other tags, listed below.
+
+    It's much more common for someone to forget to close (eg.) a 'b'
+    tag than to actually use nested 'b' tags, and the BeautifulSoup
+    class handles the common case. This class handles the
+    not-so-common case: where you can't believe someone wrote what
+    they did, but it's valid HTML and BeautifulSoup screwed up by
+    assuming it wouldn't be.
+
+    If this doesn't do what you need, try subclassing this class or
+    BeautifulSoup, and providing your own list of NESTABLE_TAGS."""
+
+    I_CANT_BELIEVE_THEYRE_NESTABLE_INLINE_TAGS = \
+     ['em', 'big', 'i', 'small', 'tt', 'abbr', 'acronym', 'strong',
+      'cite', 'code', 'dfn', 'kbd', 'samp', 'strong', 'var', 'b',
+      'big']
+
+    I_CANT_BELIEVE_THEYRE_NESTABLE_BLOCK_TAGS = ['noscript']
+
+    NESTABLE_TAGS = buildTagMap([], BeautifulSoup.NESTABLE_TAGS,
+                                I_CANT_BELIEVE_THEYRE_NESTABLE_BLOCK_TAGS,
+                                I_CANT_BELIEVE_THEYRE_NESTABLE_INLINE_TAGS)
+
+class BeautifulSOAP(BeautifulStoneSoup):
+    """This class will push a tag with only a single string child into
+    the tag's parent as an attribute. The attribute's name is the tag
+    name, and the value is the string child. An example should give
+    the flavor of the change:
+
+    <foo><bar>baz</bar></foo>
+     =>
+    <foo bar="baz"><bar>baz</bar></foo>
+
+    You can then access fooTag['bar'] instead of fooTag.barTag.string.
+
+    This is, of course, useful for scraping structures that tend to
+    use subelements instead of attributes, such as SOAP messages. Note
+    that it modifies its input, so don't print the modified version
+    out.
+
+    I'm not sure how many people really want to use this class; let me
+    know if you do. Mainly I like the name."""
+
+    def popTag(self):
+        if len(self.tagStack) > 1:
+            tag = self.tagStack[-1]
+            parent = self.tagStack[-2]
+            parent._getAttrMap()
+            if (isinstance(tag, Tag) and len(tag.contents) == 1 and
+                isinstance(tag.contents[0], NavigableText) and 
+                not parent.attrMap.has_key(tag.name)):
+                parent[tag.name] = tag.contents[0]
+        BeautifulStoneSoup.popTag(self)
+
+#Enterprise class names! It has come to our attention that some people
+#think the names of the Beautiful Soup parser classes are too silly
+#and "unprofessional" for use in enterprise screen-scraping. We feel
+#your pain! For such-minded folk, the Beautiful Soup Consortium And
+#All-Night Kosher Bakery recommends renaming this file to
+#"RobustParser.py" (or, in cases of extreme enterprisitude,
+#"RobustParserBeanInterface.class") and using the following
+#enterprise-friendly class aliases:
+class RobustXMLParser(BeautifulStoneSoup):
+    pass
+class RobustHTMLParser(BeautifulSoup):
+    pass
+class RobustWackAssHTMLParser(ICantBelieveItsBeautifulSoup):
+    pass
+class SimplifyingSOAPParser(BeautifulSOAP):
+    pass
+
+###
+
+
+#By default, act as an HTML pretty-printer.
+if __name__ == '__main__':
+    import sys
+    soup = BeautifulStoneSoup(sys.stdin.read())
+    print soup.prettify()
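
To give the module docstring's parser classes some flesh, a short usage
sketch (the HTML is invented; first() and fetch() are this 2.1.1 vintage's
spellings of what later Beautiful Soup releases call find/findAll):

    from mechanize._beautifulsoup import BeautifulSoup

    soup = BeautifulSoup('<html><p class="intro">Hello<p>World</html>')
    paragraphs = soup.fetch('p')    # both paragraphs: the unclosed <p> tags
                                    # are closed implicitly (p is non-nestable)
    intro = soup.first('p', {'class': 'intro'})
    print intro.string              # 'Hello'
    print soup.prettify()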

Modified: python-mechanize/branches/upstream/current/mechanize/_clientcookie.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_clientcookie.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_clientcookie.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,4 +1,4 @@
-"""HTTP cookie handling for web clients, plus some other stuff.
+"""HTTP cookie handling for web clients.
 
 This module originally developed from my port of Gisle Aas' Perl module
 HTTP::Cookies, from the libwww-perl library.
@@ -32,7 +32,7 @@
 
 """
 
-import sys, re, urlparse, string, copy, time, struct, urllib, types, logging
+import sys, re, copy, time, struct, urllib, types, logging
 try:
     import threading
     _threading = threading; del threading
@@ -46,7 +46,8 @@
 DEFAULT_HTTP_PORT = str(httplib.HTTP_PORT)
 
 from _headersutil import split_header_words, parse_ns_headers
-from _util import startswith, endswith, isstringlike, getheaders
+from _util import isstringlike
+import _rfc3986
 
 debug = logging.getLogger("mechanize.cookies").debug
 
@@ -105,17 +106,17 @@
     """
     # Note that, if A or B are IP addresses, the only relevant part of the
     # definition of the domain-match algorithm is the direct string-compare.
-    A = string.lower(A)
-    B = string.lower(B)
+    A = A.lower()
+    B = B.lower()
     if A == B:
         return True
     if not is_HDN(A):
         return False
-    i = string.rfind(A, B)
+    i = A.rfind(B)
     has_form_nb = not (i == -1 or i == 0)
     return (
         has_form_nb and
-        startswith(B, ".") and
+        B.startswith(".") and
         is_HDN(B[1:])
         )
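
The string-method rewrite above leaves domain_match()'s semantics
untouched; for reference, a tiny sketch with made-up hosts:

    from mechanize._clientcookie import domain_match

    print domain_match("www.example.com", ".example.com")  # True
    print domain_match("example.com", ".example.com")      # False (RFC 2965)
    print domain_match("192.168.1.1", "192.168.1.1")       # True (IP addresses
                                                           #  compare as strings)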
 
@@ -133,15 +134,15 @@
     A and B may be host domain names or IP addresses.
 
     """
-    A = string.lower(A)
-    B = string.lower(B)
+    A = A.lower()
+    B = B.lower()
     if not (liberal_is_HDN(A) and liberal_is_HDN(B)):
         if A == B:
             # equal IP addresses
             return True
         return False
-    initial_dot = startswith(B, ".")
-    if initial_dot and endswith(A, B):
+    initial_dot = B.startswith(".")
+    if initial_dot and A.endswith(B):
         return True
     if not initial_dot and A == B:
         return True
@@ -156,13 +157,13 @@
 
     """
     url = request.get_full_url()
-    host = urlparse.urlparse(url)[1]
-    if host == "":
+    host = _rfc3986.urlsplit(url)[1]
+    if host is None:
         host = request.get_header("Host", "")
 
     # remove port, if present
     host = cut_port_re.sub("", host, 1)
-    return string.lower(host)
+    return host.lower()
 
 def eff_request_host(request):
     """Return a tuple (request-host, effective request-host name).
@@ -171,28 +172,23 @@
 
     """
     erhn = req_host = request_host(request)
-    if string.find(req_host, ".") == -1 and not IPV4_RE.search(req_host):
+    if req_host.find(".") == -1 and not IPV4_RE.search(req_host):
         erhn = req_host + ".local"
     return req_host, erhn
 
 def request_path(request):
     """request-URI, as defined by RFC 2965."""
     url = request.get_full_url()
-    #scheme, netloc, path, parameters, query, frag = urlparse.urlparse(url)
-    #req_path = escape_path(string.join(urlparse.urlparse(url)[2:], ""))
-    path, parameters, query, frag = urlparse.urlparse(url)[2:]
-    if parameters:
-        path = "%s;%s" % (path, parameters)
+    path, query, frag = _rfc3986.urlsplit(url)[2:]
     path = escape_path(path)
-    req_path = urlparse.urlunparse(("", "", path, "", query, frag))
-    if not startswith(req_path, "/"):
-        # fix bad RFC 2396 absoluteURI
+    req_path = _rfc3986.urlunsplit((None, None, path, query, frag))
+    if not req_path.startswith("/"):
         req_path = "/"+req_path
     return req_path
 
 def request_port(request):
     host = request.get_host()
-    i = string.find(host, ':')
+    i = host.find(':')
     if i >= 0:
         port = host[i+1:]
         try:
@@ -209,7 +205,7 @@
 HTTP_PATH_SAFE = "%/;:@&=+$,!~*'()"
 ESCAPED_CHAR_RE = re.compile(r"%([0-9a-fA-F][0-9a-fA-F])")
 def uppercase_escaped_char(match):
-    return "%%%s" % string.upper(match.group(1))
+    return "%%%s" % match.group(1).upper()
 def escape_path(path):
     """Escape any invalid characters in HTTP URL, and uppercase all escapes."""
     # There's no knowing what character encoding was used to create URLs
@@ -252,11 +248,11 @@
     '.local'
 
     """
-    i = string.find(h, ".")
+    i = h.find(".")
     if i >= 0:
         #a = h[:i]  # this line is only here to show what a is
         b = h[i+1:]
-        i = string.find(b, ".")
+        i = b.find(".")
         if is_HDN(h) and (i >= 0 or b == "local"):
             return "."+b
     return h
@@ -344,7 +340,7 @@
         self.port = port
         self.port_specified = port_specified
         # normalise case, as per RFC 2965 section 3.3.3
-        self.domain = string.lower(domain)
+        self.domain = domain.lower()
         self.domain_specified = domain_specified
         # Sigh.  We need to know whether the domain given in the
         # cookie-attribute had an initial dot, in order to follow RFC 2965
@@ -397,7 +393,7 @@
             args.append("%s=%s" % (name, repr(attr)))
         args.append("rest=%s" % repr(self._rest))
         args.append("rfc2109=%s" % repr(self.rfc2109))
-        return "Cookie(%s)" % string.join(args, ", ")
+        return "Cookie(%s)" % ", ".join(args)
 
 
 class CookiePolicy:
@@ -701,7 +697,7 @@
         # Try and stop servers setting V0 cookies designed to hack other
         # servers that know both V0 and V1 protocols.
         if (cookie.version == 0 and self.strict_ns_set_initial_dollar and
-            startswith(cookie.name, "$")):
+            cookie.name.startswith("$")):
             debug("   illegal name (starts with '$'): '%s'", cookie.name)
             return False
         return True
@@ -711,7 +707,7 @@
             req_path = request_path(request)
             if ((cookie.version > 0 or
                  (cookie.version == 0 and self.strict_ns_set_path)) and
-                not startswith(req_path, cookie.path)):
+                not req_path.startswith(cookie.path)):
                 debug("   path attribute %s is not a prefix of request "
                       "path %s", cookie.path, req_path)
                 return False
@@ -728,12 +724,12 @@
             domain = cookie.domain
             # since domain was specified, we know that:
             assert domain.startswith(".")
-            if string.count(domain, ".") == 2:
+            if domain.count(".") == 2:
                 # domain like .foo.bar
-                i = string.rfind(domain, ".")
+                i = domain.rfind(".")
                 tld = domain[i+1:]
                 sld = domain[1:i]
-                if (string.lower(sld) in [
+                if (sld.lower() in [
                     "co", "ac",
                     "com", "edu", "org", "net", "gov", "mil", "int",
                     "aero", "biz", "cat", "coop", "info", "jobs", "mobi",
@@ -757,19 +753,19 @@
         if cookie.domain_specified:
             req_host, erhn = eff_request_host(request)
             domain = cookie.domain
-            if startswith(domain, "."):
+            if domain.startswith("."):
                 undotted_domain = domain[1:]
             else:
                 undotted_domain = domain
-            embedded_dots = (string.find(undotted_domain, ".") >= 0)
+            embedded_dots = (undotted_domain.find(".") >= 0)
             if not embedded_dots and domain != ".local":
                 debug("   non-local domain %s contains no embedded dot",
                       domain)
                 return False
             if cookie.version == 0:
-                if (not endswith(erhn, domain) and
-                    (not startswith(erhn, ".") and
-                     not endswith("."+erhn, domain))):
+                if (not erhn.endswith(domain) and
+                    (not erhn.startswith(".") and
+                     not ("."+erhn).endswith(domain))):
                     debug("   effective request-host %s (even with added "
                           "initial dot) does not end with %s",
                           erhn, domain)
@@ -783,7 +779,7 @@
             if (cookie.version > 0 or
                 (self.strict_ns_domain & self.DomainStrictNoDots)):
                 host_prefix = req_host[:-len(domain)]
-                if (string.find(host_prefix, ".") >= 0 and
+                if (host_prefix.find(".") >= 0 and
                     not IPV4_RE.search(req_host)):
                     debug("   host prefix %s for domain %s contains a dot",
                           host_prefix, domain)
@@ -797,7 +793,7 @@
                 req_port = "80"
             else:
                 req_port = str(req_port)
-            for p in string.split(cookie.port, ","):
+            for p in cookie.port.split(","):
                 try:
                     int(p)
                 except ValueError:
@@ -867,7 +863,7 @@
             req_port = request_port(request)
             if req_port is None:
                 req_port = "80"
-            for p in string.split(cookie.port, ","):
+            for p in cookie.port.split(","):
                 if p == req_port:
                     break
             else:
@@ -892,7 +888,7 @@
             debug("   effective request-host name %s does not domain-match "
                   "RFC 2965 cookie domain %s", erhn, domain)
             return False
-        if cookie.version == 0 and not endswith("."+erhn, domain):
+        if cookie.version == 0 and not ("."+erhn).endswith(domain):
             debug("   request-host %s does not match Netscape cookie domain "
                   "%s", req_host, domain)
             return False
@@ -905,12 +901,12 @@
         # Munge req_host and erhn to always start with a dot, so as to err on
         # the side of letting cookies through.
         dotted_req_host, dotted_erhn = eff_request_host(request)
-        if not startswith(dotted_req_host, "."):
+        if not dotted_req_host.startswith("."):
             dotted_req_host = "."+dotted_req_host
-        if not startswith(dotted_erhn, "."):
+        if not dotted_erhn.startswith("."):
             dotted_erhn = "."+dotted_erhn
-        if not (endswith(dotted_req_host, domain) or
-                endswith(dotted_erhn, domain)):
+        if not (dotted_req_host.endswith(domain) or
+                dotted_erhn.endswith(domain)):
             #debug("   request domain %s does not match cookie domain %s",
             #      req_host, domain)
             return False
@@ -927,7 +923,7 @@
     def path_return_ok(self, path, request):
         debug("- checking cookie path=%s", path)
         req_path = request_path(request)
-        if not startswith(req_path, path):
+        if not req_path.startswith(path):
             debug("  %s does not path-match %s", req_path, path)
             return False
         return True
@@ -1096,10 +1092,10 @@
             if version > 0:
                 if cookie.path_specified:
                     attrs.append('$Path="%s"' % cookie.path)
-                if startswith(cookie.domain, "."):
+                if cookie.domain.startswith("."):
                     domain = cookie.domain
                     if (not cookie.domain_initial_dot and
-                        startswith(domain, ".")):
+                        domain.startswith(".")):
                         domain = domain[1:]
                     attrs.append('$Domain="%s"' % domain)
                 if cookie.port is not None:
@@ -1137,8 +1133,7 @@
         attrs = self._cookie_attrs(cookies)
         if attrs:
             if not request.has_header("Cookie"):
-                request.add_unredirected_header(
-                    "Cookie", string.join(attrs, "; "))
+                request.add_unredirected_header("Cookie", "; ".join(attrs))
 
         # if necessary, advertise that we know RFC 2965
         if self._policy.rfc2965 and not self._policy.hide_cookie2:
@@ -1188,7 +1183,7 @@
             standard = {}
             rest = {}
             for k, v in cookie_attrs[1:]:
-                lc = string.lower(k)
+                lc = k.lower()
                 # don't lose case distinction for unknown fields
                 if lc in value_attrs or lc in boolean_attrs:
                     k = lc
@@ -1205,7 +1200,7 @@
                         bad_cookie = True
                         break
                     # RFC 2965 section 3.3.3
-                    v = string.lower(v)
+                    v = v.lower()
                 if k == "expires":
                     if max_age_set:
                         # Prefer max-age to expires (like Mozilla)
@@ -1272,7 +1267,7 @@
         else:
             path_specified = False
             path = request_path(request)
-            i = string.rfind(path, "/")
+            i = path.rfind("/")
             if i != -1:
                 if version == 0:
                     # Netscape spec parts company from reality here
@@ -1286,11 +1281,11 @@
         # but first we have to remember whether it starts with a dot
         domain_initial_dot = False
         if domain_specified:
-            domain_initial_dot = bool(startswith(domain, "."))
+            domain_initial_dot = bool(domain.startswith("."))
         if domain is Absent:
             req_host, erhn = eff_request_host(request)
             domain = erhn
-        elif not startswith(domain, "."):
+        elif not domain.startswith("."):
             domain = "."+domain
 
         # set default port
@@ -1365,8 +1360,8 @@
         """
         # get cookie-attributes for RFC 2965 and Netscape protocols
         headers = response.info()
-        rfc2965_hdrs = getheaders(headers, "Set-Cookie2")
-        ns_hdrs = getheaders(headers, "Set-Cookie")
+        rfc2965_hdrs = headers.getheaders("Set-Cookie2")
+        ns_hdrs = headers.getheaders("Set-Cookie")
 
         rfc2965 = self._policy.rfc2965
         netscape = self._policy.netscape
@@ -1550,12 +1545,12 @@
     def __repr__(self):
         r = []
         for cookie in self: r.append(repr(cookie))
-        return "<%s[%s]>" % (self.__class__, string.join(r, ", "))
+        return "<%s[%s]>" % (self.__class__, ", ".join(r))
 
     def __str__(self):
         r = []
         for cookie in self: r.append(str(cookie))
-        return "<%s[%s]>" % (self.__class__, string.join(r, ", "))
+        return "<%s[%s]>" % (self.__class__, ", ".join(r))
 
 
 class LoadError(Exception): pass
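
The hunks above complete this module's move from the deprecated string
module (and the startswith/endswith helpers from _util) to string
methods.  A minimal sketch of the pattern, with illustrative values:

    # old, Python 1.5.2-compatible style:
    #   import string
    #   header = string.join(attrs, "; ")
    #   dotted = startswith(domain, ".")
    # new style, using string methods:
    attrs = ['$Path="/"', '$Domain="example.com"']
    domain = ".example.com"
    header = "; ".join(attrs)        # join is a method of the separator
    dotted = domain.startswith(".")  # str method replaces the _util helper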

Added: python-mechanize/branches/upstream/current/mechanize/_debug.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_debug.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_debug.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,28 @@
+import logging
+
+from urllib2 import BaseHandler
+from _response import response_seek_wrapper
+
+
+class HTTPResponseDebugProcessor(BaseHandler):
+    handler_order = 900  # before redirections, after everything else
+
+    def http_response(self, request, response):
+        if not hasattr(response, "seek"):
+            response = response_seek_wrapper(response)
+        info = logging.getLogger("mechanize.http_responses").info
+        try:
+            info(response.read())
+        finally:
+            response.seek(0)
+        info("*****************************************************")
+        return response
+
+    https_response = http_response
+
+class HTTPRedirectDebugProcessor(BaseHandler):
+    def http_request(self, request):
+        if hasattr(request, "redirect_dict"):
+            info = logging.getLogger("mechanize.http_redirects").info
+            info("redirecting to %s", request.get_full_url())
+        return request
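
A minimal usage sketch for the two handlers above; the wiring through
urllib2.build_opener and the logging setup are illustrative assumptions,
not part of this revision:

    import logging, sys, urllib2
    from mechanize._debug import HTTPResponseDebugProcessor, \
         HTTPRedirectDebugProcessor

    # both handlers log through the stdlib logging package, so give
    # their loggers a handler and a level that lets INFO through
    logging.basicConfig(stream=sys.stdout, level=logging.INFO)

    opener = urllib2.build_opener(HTTPResponseDebugProcessor(),
                                  HTTPRedirectDebugProcessor())
    opener.open("http://example.com/")  # bodies and redirects get logged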

Modified: python-mechanize/branches/upstream/current/mechanize/_gzip.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_gzip.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_gzip.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,6 +1,6 @@
 import urllib2
 from cStringIO import StringIO
-import _util
+import _response
 
 # GzipConsumer was taken from Fredrik Lundh's effbot.org-0.1-20041009 library
 class GzipConsumer:
@@ -65,7 +65,7 @@
     def __init__(self): self.data = []
     def feed(self, data): self.data.append(data)
 
-class stupid_gzip_wrapper(_util.closeable_response):
+class stupid_gzip_wrapper(_response.closeable_response):
     def __init__(self, response):
         self._response = response
 

Modified: python-mechanize/branches/upstream/current/mechanize/_headersutil.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_headersutil.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_headersutil.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -9,12 +9,13 @@
 
 """
 
-import os, re, string, urlparse
+import os, re
 from types import StringType
 from types import UnicodeType
 STRING_TYPES = StringType, UnicodeType
 
-from _util import startswith, endswith, http2time
+from _util import http2time
+import _rfc3986
 
 def is_html(ct_headers, url, allow_xhtml=False):
     """
@@ -24,7 +25,7 @@
     """
     if not ct_headers:
         # guess
-        ext = os.path.splitext(urlparse.urlparse(url)[2])[1]
+        ext = os.path.splitext(_rfc3986.urlsplit(url)[2])[1]
         html_exts = [".htm", ".html"]
         if allow_xhtml:
             html_exts += [".xhtml"]
@@ -113,14 +114,14 @@
                     if m:  # unquoted value
                         text = unmatched(m)
                         value = m.group(1)
-                        value = string.rstrip(value)
+                        value = value.rstrip()
                     else:
                         # no value, a lone token
                         value = None
                 pairs.append((name, value))
-            elif startswith(string.lstrip(text), ","):
+            elif text.lstrip().startswith(","):
                 # concatenated headers, as per RFC 2616 section 4.2
-                text = string.lstrip(text)[1:]
+                text = text.lstrip()[1:]
                 if pairs: result.append(pairs)
                 pairs = []
             else:
@@ -159,8 +160,8 @@
                 else:
                     k = "%s=%s" % (k, v)
             attr.append(k)
-        if attr: headers.append(string.join(attr, "; "))
-    return string.join(headers, ", ")
+        if attr: headers.append("; ".join(attr))
+    return ", ".join(headers)
 
 def parse_ns_headers(ns_headers):
     """Ad-hoc parser for Netscape protocol cookie-attributes.
@@ -188,15 +189,15 @@
         params = re.split(r";\s*", ns_header)
         for ii in range(len(params)):
             param = params[ii]
-            param = string.rstrip(param)
+            param = param.rstrip()
             if param == "": continue
             if "=" not in param:
                 k, v = param, None
             else:
                 k, v = re.split(r"\s*=\s*", param, 1)
-                k = string.lstrip(k)
+                k = k.lstrip()
             if ii != 0:
-                lc = string.lower(k)
+                lc = k.lower()
                 if lc in known_attrs:
                     k = lc
                 if k == "version":
@@ -204,8 +205,8 @@
                     version_set = True
                 if k == "expires":
                     # convert expires date to seconds since epoch
-                    if startswith(v, '"'): v = v[1:]
-                    if endswith(v, '"'): v = v[:-1]
+                    if v.startswith('"'): v = v[1:]
+                    if v.endswith('"'): v = v[:-1]
                     v = http2time(v)  # None if invalid
             pairs.append((k, v))
 

Modified: python-mechanize/branches/upstream/current/mechanize/_html.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_html.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_html.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -8,47 +8,42 @@
 
 """
 
-import re, copy, urllib, htmlentitydefs
-from urlparse import urljoin
+import re, copy, htmlentitydefs
+import sgmllib, HTMLParser, ClientForm
 
 import _request
 from _headersutil import split_header_words, is_html as _is_html
+import _rfc3986
 
-## # XXXX miserable hack
-## def urljoin(base, url):
-##     if url.startswith("?"):
-##         return base+url
-##     else:
-##         return urlparse.urljoin(base, url)
+DEFAULT_ENCODING = "latin-1"
 
-## def chr_range(a, b):
-##     return "".join(map(chr, range(ord(a), ord(b)+1)))
 
-## RESERVED_URL_CHARS = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
-##                       "abcdefghijklmnopqrstuvwxyz"
-##                       "-_.~")
-## UNRESERVED_URL_CHARS = "!*'();:@&=+$,/?%#[]"
-# we want (RESERVED_URL_CHARS+UNRESERVED_URL_CHARS), minus those
-# 'safe'-by-default characters that urllib.urlquote never quotes
-URLQUOTE_SAFE_URL_CHARS = "!*'();:@&=+$,/?%#[]~"
+# the base class is purely for backwards compatibility
+class ParseError(ClientForm.ParseError): pass
 
-DEFAULT_ENCODING = "latin-1"
 
 class CachingGeneratorFunction(object):
     """Caching wrapper around a no-arguments iterable."""
+
     def __init__(self, iterable):
-        self._iterable = iterable
         self._cache = []
+        # wrap iterable to make it non-restartable (otherwise, repeated
+        # __call__ would give incorrect results)
+        self._iterator = iter(iterable)
+
     def __call__(self):
         cache = self._cache
         for item in cache:
             yield item
-        for item in self._iterable:
+        for item in self._iterator:
             cache.append(item)
             yield item
 
-def encoding_finder(default_encoding):
-    def encoding(response):
+
+class EncodingFinder:
+    def __init__(self, default_encoding):
+        self._default_encoding = default_encoding
+    def encoding(self, response):
         # HTTPEquivProcessor may be in use, so both HTTP and HTTP-EQUIV
         # headers may be in the response.  HTTP-EQUIV headers come last,
         # so try in order from first to last.
@@ -56,17 +51,18 @@
             for k, v in split_header_words([ct])[0]:
                 if k == "charset":
                     return v
-        return default_encoding
-    return encoding
+        return self._default_encoding
 
-def make_is_html(allow_xhtml):
-    def is_html(response, encoding):
+class ResponseTypeFinder:
+    def __init__(self, allow_xhtml):
+        self._allow_xhtml = allow_xhtml
+    def is_html(self, response, encoding):
         ct_hdrs = response.info().getheaders("content-type")
         url = response.geturl()
         # XXX encoding
-        return _is_html(ct_hdrs, url, allow_xhtml)
-    return is_html
+        return _is_html(ct_hdrs, url, self._allow_xhtml)
 
+
 # idea for this argument-processing trick is from Peter Otten
 class Args:
     def __init__(self, args_map):
@@ -90,7 +86,7 @@
     def __init__(self, base_url, url, text, tag, attrs):
         assert None not in [url, tag, attrs]
         self.base_url = base_url
-        self.absolute_url = urljoin(base_url, url)
+        self.absolute_url = _rfc3986.urljoin(base_url, url)
         self.url, self.text, self.tag, self.attrs = url, text, tag, attrs
     def __cmp__(self, other):
         try:
@@ -105,19 +101,6 @@
             self.base_url, self.url, self.text, self.tag, self.attrs)
 
 
-def clean_url(url, encoding):
-    # percent-encode illegal URL characters
-    # Trying to come up with test cases for this gave me a headache, revisit
-    # when do switch to unicode.
-    # Somebody else's comments (lost the attribution):
-##     - IE will return you the url in the encoding you send it
-##     - Mozilla/Firefox will send you latin-1 if there's no non latin-1
-##     characters in your link. It will send you utf-8 however if there are...
-    if type(url) == type(""):
-        url = url.decode(encoding, "replace")
-    url = url.strip()
-    return urllib.quote(url.encode(encoding), URLQUOTE_SAFE_URL_CHARS)
-
 class LinksFactory:
 
     def __init__(self,
@@ -153,40 +136,49 @@
         base_url = self._base_url
         p = self.link_parser_class(response, encoding=encoding)
 
-        for token in p.tags(*(self.urltags.keys()+["base"])):
-            if token.data == "base":
-                base_url = dict(token.attrs).get("href")
-                continue
-            if token.type == "endtag":
-                continue
-            attrs = dict(token.attrs)
-            tag = token.data
-            name = attrs.get("name")
-            text = None
-            # XXX use attr_encoding for ref'd doc if that doc does not provide
-            #  one by other means
-            #attr_encoding = attrs.get("charset")
-            url = attrs.get(self.urltags[tag])  # XXX is "" a valid URL?
-            if not url:
-                # Probably an <A NAME="blah"> link or <AREA NOHREF...>.
-                # For our purposes a link is something with a URL, so ignore
-                # this.
-                continue
+        try:
+            for token in p.tags(*(self.urltags.keys()+["base"])):
+                if token.type == "endtag":
+                    continue
+                if token.data == "base":
+                    base_href = dict(token.attrs).get("href")
+                    if base_href is not None:
+                        base_url = base_href
+                    continue
+                attrs = dict(token.attrs)
+                tag = token.data
+                name = attrs.get("name")
+                text = None
+                # XXX use attr_encoding for ref'd doc if that doc does not
+                #  provide one by other means
+                #attr_encoding = attrs.get("charset")
+                url = attrs.get(self.urltags[tag])  # XXX is "" a valid URL?
+                if not url:
+                    # Probably an <A NAME="blah"> link or <AREA NOHREF...>.
+                    # For our purposes a link is something with a URL, so
+                    # ignore this.
+                    continue
 
-            url = clean_url(url, encoding)
-            if tag == "a":
-                if token.type != "startendtag":
-                    # hmm, this'd break if end tag is missing
-                    text = p.get_compressed_text(("endtag", tag))
-                # but this doesn't work for eg. <a href="blah"><b>Andy</b></a>
-                #text = p.get_compressed_text()
+                url = _rfc3986.clean_url(url, encoding)
+                if tag == "a":
+                    if token.type != "startendtag":
+                        # hmm, this'd break if end tag is missing
+                        text = p.get_compressed_text(("endtag", tag))
+                    # but this doesn't work for eg.
+                    # <a href="blah"><b>Andy</b></a>
+                    #text = p.get_compressed_text()
 
-            yield Link(base_url, url, text, tag, token.attrs)
+                yield Link(base_url, url, text, tag, token.attrs)
+        except sgmllib.SGMLParseError, exc:
+            raise ParseError(exc)
 
 class FormsFactory:
 
     """Makes a sequence of objects satisfying ClientForm.HTMLForm interface.
 
+    After calling .forms(), the .global_form attribute is a form object
+    containing all controls that are not descendants of any FORM element.
+
     For constructor argument docs, see ClientForm.ParseResponse
     argument docs.
 
@@ -209,22 +201,31 @@
         self.backwards_compat = backwards_compat
         self._response = None
         self.encoding = None
+        self.global_form = None
 
     def set_response(self, response, encoding):
         self._response = response
         self.encoding = encoding
+        self.global_form = None
 
     def forms(self):
         import ClientForm
         encoding = self.encoding
-        return ClientForm.ParseResponse(
-            self._response,
-            select_default=self.select_default,
-            form_parser_class=self.form_parser_class,
-            request_class=self.request_class,
-            backwards_compat=self.backwards_compat,
-            encoding=encoding,
-            )
+        try:
+            forms = ClientForm.ParseResponseEx(
+                self._response,
+                select_default=self.select_default,
+                form_parser_class=self.form_parser_class,
+                request_class=self.request_class,
+                encoding=encoding,
+                _urljoin=_rfc3986.urljoin,
+                _urlparse=_rfc3986.urlsplit,
+                _urlunparse=_rfc3986.urlunsplit,
+                )
+        except ClientForm.ParseError, exc:
+            raise ParseError(exc)
+        self.global_form = forms[0]
+        return forms[1:]
 
 class TitleFactory:
     def __init__(self):
@@ -239,11 +240,14 @@
         p = _pullparser.TolerantPullParser(
             self._response, encoding=self._encoding)
         try:
-            p.get_tag("title")
-        except _pullparser.NoMoreTokensError:
-            return None
-        else:
-            return p.get_text()
+            try:
+                p.get_tag("title")
+            except _pullparser.NoMoreTokensError:
+                return None
+            else:
+                return p.get_text()
+        except sgmllib.SGMLParseError, exc:
+            raise ParseError(exc)
 
 
 def unescape(data, entities, encoding):
@@ -284,42 +288,44 @@
         return repl
 
 
-try:
-    import BeautifulSoup
-except ImportError:
-    pass
-else:
-    import sgmllib
-    # monkeypatch to fix http://www.python.org/sf/803422 :-(
-    sgmllib.charref = re.compile("&#(x?[0-9a-fA-F]+)[^0-9a-fA-F]")
-    class MechanizeBs(BeautifulSoup.BeautifulSoup):
-        _entitydefs = htmlentitydefs.name2codepoint
-        # don't want the magic Microsoft-char workaround
-        PARSER_MASSAGE = [(re.compile('(<[^<>]*)/>'),
-                           lambda(x):x.group(1) + ' />'),
-                          (re.compile('<!\s+([^<>]*)>'),
-                           lambda(x):'<!' + x.group(1) + '>')
-                          ]
+# bizarre import gymnastics for bundled BeautifulSoup
+import _beautifulsoup
+import ClientForm
+RobustFormParser, NestingRobustFormParser = ClientForm._create_bs_classes(
+    _beautifulsoup.BeautifulSoup, _beautifulsoup.ICantBelieveItsBeautifulSoup
+    )
+# monkeypatch sgmllib to fix http://www.python.org/sf/803422 :-(
+import sgmllib
+sgmllib.charref = re.compile("&#(x?[0-9a-fA-F]+)[^0-9a-fA-F]")
 
-        def __init__(self, encoding, text=None, avoidParserProblems=True,
-                     initialTextIsEverything=True):
-            self._encoding = encoding
-            BeautifulSoup.BeautifulSoup.__init__(
-                self, text, avoidParserProblems, initialTextIsEverything)
+class MechanizeBs(_beautifulsoup.BeautifulSoup):
+    _entitydefs = htmlentitydefs.name2codepoint
+    # don't want the magic Microsoft-char workaround
+    PARSER_MASSAGE = [(re.compile('(<[^<>]*)/>'),
+                       lambda(x):x.group(1) + ' />'),
+                      (re.compile('<!\s+([^<>]*)>'),
+                       lambda(x):'<!' + x.group(1) + '>')
+                      ]
 
-        def handle_charref(self, ref):
-            t = unescape("&#%s;"%ref, self._entitydefs, self._encoding)
-            self.handle_data(t)
-        def handle_entityref(self, ref):
-            t = unescape("&%s;"%ref, self._entitydefs, self._encoding)
-            self.handle_data(t)
-        def unescape_attrs(self, attrs):
-            escaped_attrs = []
-            for key, val in attrs:
-                val = unescape(val, self._entitydefs, self._encoding)
-                escaped_attrs.append((key, val))
-            return escaped_attrs
+    def __init__(self, encoding, text=None, avoidParserProblems=True,
+                 initialTextIsEverything=True):
+        self._encoding = encoding
+        _beautifulsoup.BeautifulSoup.__init__(
+            self, text, avoidParserProblems, initialTextIsEverything)
 
+    def handle_charref(self, ref):
+        t = unescape("&#%s;"%ref, self._entitydefs, self._encoding)
+        self.handle_data(t)
+    def handle_entityref(self, ref):
+        t = unescape("&%s;"%ref, self._entitydefs, self._encoding)
+        self.handle_data(t)
+    def unescape_attrs(self, attrs):
+        escaped_attrs = []
+        for key, val in attrs:
+            val = unescape(val, self._entitydefs, self._encoding)
+            escaped_attrs.append((key, val))
+        return escaped_attrs
+
 class RobustLinksFactory:
 
     compress_re = re.compile(r"\s+")
@@ -329,7 +335,7 @@
                  link_class=Link,
                  urltags=None,
                  ):
-        import BeautifulSoup
+        import _beautifulsoup
         if link_parser_class is None:
             link_parser_class = MechanizeBs
         self.link_parser_class = link_parser_class
@@ -352,27 +358,29 @@
         self._encoding = encoding
 
     def links(self):
-        import BeautifulSoup
+        import _beautifulsoup
         bs = self._bs
         base_url = self._base_url
         encoding = self._encoding
         gen = bs.recursiveChildGenerator()
         for ch in gen:
-            if (isinstance(ch, BeautifulSoup.Tag) and
+            if (isinstance(ch, _beautifulsoup.Tag) and
                 ch.name in self.urltags.keys()+["base"]):
                 link = ch
                 attrs = bs.unescape_attrs(link.attrs)
                 attrs_dict = dict(attrs)
                 if link.name == "base":
-                    base_url = attrs_dict.get("href")
+                    base_href = attrs_dict.get("href")
+                    if base_href is not None:
+                        base_url = base_href
                     continue
                 url_attr = self.urltags[link.name]
                 url = attrs_dict.get(url_attr)
                 if not url:
                     continue
-                url = clean_url(url, encoding)
+                url = _rfc3986.clean_url(url, encoding)
                 text = link.firstText(lambda t: True)
-                if text is BeautifulSoup.Null:
+                if text is _beautifulsoup.Null:
                     # follow _pullparser's weird behaviour rigidly
                     if link.name == "a":
                         text = ""
@@ -388,7 +396,7 @@
         import ClientForm
         args = form_parser_args(*args, **kwds)
         if args.form_parser_class is None:
-            args.form_parser_class = ClientForm.RobustFormParser
+            args.form_parser_class = RobustFormParser
         FormsFactory.__init__(self, **args.dictionary)
 
     def set_response(self, response, encoding):
@@ -404,10 +412,10 @@
         self._bs = soup
         self._encoding = encoding
 
-    def title(soup):
-        import BeautifulSoup
+    def title(self):
+        import _beautifulsoup
         title = self._bs.first("title")
-        if title == BeautifulSoup.Null:
+        if title == _beautifulsoup.Null:
             return None
         else:
             return title.firstText(lambda t: True)
@@ -427,18 +435,25 @@
 
     Public attributes:
 
+    Note that accessing these attributes may raise ParseError.
+
     encoding: string specifying the encoding of response if it contains a text
      document (this value is left unspecified for documents that do not have
      an encoding, e.g. an image file)
     is_html: true if response contains an HTML document (XHTML may be
      regarded as HTML too)
     title: page title, or None if no title or not HTML
+    global_form: form object containing all controls that are not descendants
+     of any FORM element, or None if the forms_factory does not support
+     supplying a global form
 
     """
 
+    LAZY_ATTRS = ["encoding", "is_html", "title", "global_form"]
+
     def __init__(self, forms_factory, links_factory, title_factory,
-                 get_encoding=encoding_finder(DEFAULT_ENCODING),
-                 is_html_p=make_is_html(allow_xhtml=False),
+                 encoding_finder=EncodingFinder(DEFAULT_ENCODING),
+                 response_type_finder=ResponseTypeFinder(allow_xhtml=False),
                  ):
         """
 
@@ -454,8 +469,8 @@
         self._forms_factory = forms_factory
         self._links_factory = links_factory
         self._title_factory = title_factory
-        self._get_encoding = get_encoding
-        self._is_html_p = is_html_p
+        self._encoding_finder = encoding_finder
+        self._response_type_finder = response_type_finder
 
         self.set_response(None)
 
@@ -471,51 +486,71 @@
     def set_response(self, response):
         """Set response.
 
-        The response must implement the same interface as objects returned by
-        urllib2.urlopen().
+        The response must either be None or implement the same interface as
+        objects returned by urllib2.urlopen().
 
         """
         self._response = response
         self._forms_genf = self._links_genf = None
         self._get_title = None
-        for name in ["encoding", "is_html", "title"]:
+        for name in self.LAZY_ATTRS:
             try:
                 delattr(self, name)
             except AttributeError:
                 pass
 
     def __getattr__(self, name):
-        if name not in ["encoding", "is_html", "title"]:
+        if name not in self.LAZY_ATTRS:
             return getattr(self.__class__, name)
 
-        try:
-            if name == "encoding":
-                self.encoding = self._get_encoding(self._response)
-                return self.encoding
-            elif name == "is_html":
-                self.is_html = self._is_html_p(self._response, self.encoding)
-                return self.is_html
-            elif name == "title":
-                if self.is_html:
-                    self.title = self._title_factory.title()
-                else:
-                    self.title = None
-                return self.title
-        finally:
-            self._response.seek(0)
+        if name == "encoding":
+            self.encoding = self._encoding_finder.encoding(
+                copy.copy(self._response))
+            return self.encoding
+        elif name == "is_html":
+            self.is_html = self._response_type_finder.is_html(
+                copy.copy(self._response), self.encoding)
+            return self.is_html
+        elif name == "title":
+            if self.is_html:
+                self.title = self._title_factory.title()
+            else:
+                self.title = None
+            return self.title
+        elif name == "global_form":
+            self.forms()
+            return self.global_form
 
     def forms(self):
-        """Return iterable over ClientForm.HTMLForm-like objects."""
+        """Return iterable over ClientForm.HTMLForm-like objects.
+
+        Raises mechanize.ParseError on failure.
+        """
+        # this implementation sets .global_form as a side-effect, for benefit
+        # of __getattr__ impl
         if self._forms_genf is None:
-            self._forms_genf = CachingGeneratorFunction(
-                self._forms_factory.forms())
+            try:
+                self._forms_genf = CachingGeneratorFunction(
+                    self._forms_factory.forms())
+            except:  # XXXX define exception!
+                self.set_response(self._response)
+                raise
+            self.global_form = getattr(
+                self._forms_factory, "global_form", None)
         return self._forms_genf()
 
     def links(self):
-        """Return iterable over mechanize.Link-like objects."""
+        """Return iterable over mechanize.Link-like objects.
+
+        Raises mechanize.ParseError on failure.
+        """
         if self._links_genf is None:
-            self._links_genf = CachingGeneratorFunction(
-                self._links_factory.links())
+            try:
+                self._links_genf = CachingGeneratorFunction(
+                    self._links_factory.links())
+            except:  # XXXX define exception!
+                self.set_response(self._response)
+                raise
         return self._links_genf()
 
 class DefaultFactory(Factory):
@@ -526,7 +561,8 @@
             forms_factory=FormsFactory(),
             links_factory=LinksFactory(),
             title_factory=TitleFactory(),
-            is_html_p=make_is_html(allow_xhtml=i_want_broken_xhtml_support),
+            response_type_finder=ResponseTypeFinder(
+                allow_xhtml=i_want_broken_xhtml_support),
             )
 
     def set_response(self, response):
@@ -535,7 +571,7 @@
             self._forms_factory.set_response(
                 copy.copy(response), self.encoding)
             self._links_factory.set_response(
-                copy.copy(response), self._response.geturl(), self.encoding)
+                copy.copy(response), response.geturl(), self.encoding)
             self._title_factory.set_response(
                 copy.copy(response), self.encoding)
 
@@ -551,19 +587,21 @@
             forms_factory=RobustFormsFactory(),
             links_factory=RobustLinksFactory(),
             title_factory=RobustTitleFactory(),
-            is_html_p=make_is_html(allow_xhtml=i_want_broken_xhtml_support),
+            response_type_finder=ResponseTypeFinder(
+                allow_xhtml=i_want_broken_xhtml_support),
             )
         if soup_class is None:
             soup_class = MechanizeBs
         self._soup_class = soup_class
 
     def set_response(self, response):
-        import BeautifulSoup
+        import _beautifulsoup
         Factory.set_response(self, response)
         if response is not None:
             data = response.read()
             soup = self._soup_class(self.encoding, data)
-            self._forms_factory.set_response(response, self.encoding)
+            self._forms_factory.set_response(
+                copy.copy(response), self.encoding)
             self._links_factory.set_soup(
                 soup, response.geturl(), self.encoding)
             self._title_factory.set_soup(soup, self.encoding)
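
One subtle fix in this file is CachingGeneratorFunction: the iterable is
now turned into an iterator once, in __init__, so repeated calls replay
the cache instead of re-running a restartable iterable and duplicating
items.  A small sketch of the intended contract (example values made up):

    from mechanize._html import CachingGeneratorFunction

    def items():
        for i in range(3):
            yield i

    cached = CachingGeneratorFunction(items())
    print list(cached())  # [0, 1, 2] -- pulled from the iterator, cached
    print list(cached())  # [0, 1, 2] -- replayed from the cache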

Added: python-mechanize/branches/upstream/current/mechanize/_http.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_http.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_http.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,696 @@
+"""HTTP related handlers.
+
+Note that some other HTTP handlers live in more specific modules: _auth.py,
+_gzip.py, etc.
+
+
+Copyright 2002-2006 John J Lee <jjl at pobox.com>
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file
+COPYING.txt included with the distribution).
+
+"""
+
+import copy, time, tempfile, htmlentitydefs, re, logging, socket, \
+       urllib2, urllib, httplib, sgmllib
+from urllib2 import URLError, HTTPError, BaseHandler
+from cStringIO import StringIO
+
+from _request import Request
+from _util import isstringlike
+from _response import closeable_response, response_seek_wrapper
+from _html import unescape, unescape_charref
+from _headersutil import is_html
+from _clientcookie import CookieJar, request_host
+import _rfc3986
+
+debug = logging.getLogger("mechanize").debug
+
+
+CHUNK = 1024  # size of chunks fed to HTML HEAD parser, in bytes
+DEFAULT_ENCODING = 'latin-1'
+
+
+# This adds "refresh" to the list of redirectables and provides a redirection
+# algorithm that doesn't go into a loop in the presence of cookies
+# (Python 2.4 has this new algorithm, 2.3 doesn't).
+class HTTPRedirectHandler(BaseHandler):
+    # maximum number of redirections to any single URL
+    # this is needed because of the state that cookies introduce
+    max_repeats = 4
+    # maximum total number of redirections (regardless of URL) before
+    # assuming we're in a loop
+    max_redirections = 10
+
+    # Implementation notes:
+
+    # To avoid the server sending us into an infinite loop, the request
+    # object needs to track what URLs we have already seen.  Do this by
+    # adding a handler-specific attribute to the Request object.  The value
+    # of the dict is used to count the number of times the same URL has
+    # been visited.  This is needed because visiting the same URL twice
+    # does not necessarily imply a loop, thanks to state introduced by
+    # cookies.
+
+    # Always unhandled redirection codes:
+    # 300 Multiple Choices: should not handle this here.
+    # 304 Not Modified: no need to handle here: only of interest to caches
+    #     that do conditional GETs
+    # 305 Use Proxy: probably not worth dealing with here
+    # 306 Unused: what was this for in previous versions of the protocol?
+
+    def redirect_request(self, newurl, req, fp, code, msg, headers):
+        """Return a Request or None in response to a redirect.
+
+        This is called by the http_error_30x methods when a redirection
+        response is received.  If a redirection should take place, return a
+        new Request to allow http_error_30x to perform the redirect;
+        otherwise, return None to indicate that an HTTPError should be
+        raised.
+
+        """
+        if code in (301, 302, 303, "refresh") or \
+               (code == 307 and not req.has_data()):
+            # Strictly (according to RFC 2616), 301 or 302 in response to
+            # a POST MUST NOT cause a redirection without confirmation
+            # from the user (of urllib2, in this case).  In practice,
+            # essentially all clients do redirect in this case, so we do
+            # the same.
+            try:
+                visit = req.visit
+            except AttributeError:
+                visit = None
+            return Request(newurl,
+                           headers=req.headers,
+                           origin_req_host=req.get_origin_req_host(),
+                           unverifiable=True,
+                           visit=visit,
+                           )
+        else:
+            raise HTTPError(req.get_full_url(), code, msg, headers, fp)
+
+    def http_error_302(self, req, fp, code, msg, headers):
+        # Some servers (incorrectly) return multiple Location headers
+        # (so probably same goes for URI).  Use first header.
+        if headers.has_key('location'):
+            newurl = headers.getheaders('location')[0]
+        elif headers.has_key('uri'):
+            newurl = headers.getheaders('uri')[0]
+        else:
+            return
+        newurl = _rfc3986.clean_url(newurl, "latin-1")
+        newurl = _rfc3986.urljoin(req.get_full_url(), newurl)
+
+        # XXX Probably want to forget about the state of the current
+        # request, although that might interact poorly with other
+        # handlers that also use handler-specific request attributes
+        new = self.redirect_request(newurl, req, fp, code, msg, headers)
+        if new is None:
+            return
+
+        # loop detection
+        # .redirect_dict has a key url if url was previously visited.
+        if hasattr(req, 'redirect_dict'):
+            visited = new.redirect_dict = req.redirect_dict
+            if (visited.get(newurl, 0) >= self.max_repeats or
+                len(visited) >= self.max_redirections):
+                raise HTTPError(req.get_full_url(), code,
+                                self.inf_msg + msg, headers, fp)
+        else:
+            visited = new.redirect_dict = req.redirect_dict = {}
+        visited[newurl] = visited.get(newurl, 0) + 1
+
+        # Don't close the fp until we are sure that we won't use it
+        # with HTTPError.  
+        fp.read()
+        fp.close()
+
+        return self.parent.open(new)
+
+    http_error_301 = http_error_303 = http_error_307 = http_error_302
+    http_error_refresh = http_error_302
+
+    inf_msg = "The HTTP server returned a redirect error that would " \
+              "lead to an infinite loop.\n" \
+              "The last 30x error message was:\n"
+
+
+# XXX would self.reset() work, instead of raising this exception?
+class EndOfHeadError(Exception): pass
+class AbstractHeadParser:
+    # only these elements are allowed in or before HEAD of document
+    head_elems = ("html", "head",
+                  "title", "base",
+                  "script", "style", "meta", "link", "object")
+    _entitydefs = htmlentitydefs.name2codepoint
+    _encoding = DEFAULT_ENCODING
+
+    def __init__(self):
+        self.http_equiv = []
+
+    def start_meta(self, attrs):
+        http_equiv = content = None
+        for key, value in attrs:
+            if key == "http-equiv":
+                http_equiv = self.unescape_attr_if_required(value)
+            elif key == "content":
+                content = self.unescape_attr_if_required(value)
+        if http_equiv is not None and content is not None:
+            self.http_equiv.append((http_equiv, content))
+
+    def end_head(self):
+        raise EndOfHeadError()
+
+    def handle_entityref(self, name):
+        #debug("%s", name)
+        self.handle_data(unescape(
+            '&%s;' % name, self._entitydefs, self._encoding))
+
+    def handle_charref(self, name):
+        #debug("%s", name)
+        self.handle_data(unescape_charref(name, self._encoding))
+
+    def unescape_attr(self, name):
+        #debug("%s", name)
+        return unescape(name, self._entitydefs, self._encoding)
+
+    def unescape_attrs(self, attrs):
+        #debug("%s", attrs)
+        escaped_attrs = {}
+        for key, val in attrs.items():
+            escaped_attrs[key] = self.unescape_attr(val)
+        return escaped_attrs
+
+    def unknown_entityref(self, ref):
+        self.handle_data("&%s;" % ref)
+
+    def unknown_charref(self, ref):
+        self.handle_data("&#%s;" % ref)
+
+
+try:
+    import HTMLParser
+except ImportError:
+    pass
+else:
+    class XHTMLCompatibleHeadParser(AbstractHeadParser,
+                                    HTMLParser.HTMLParser):
+        def __init__(self):
+            HTMLParser.HTMLParser.__init__(self)
+            AbstractHeadParser.__init__(self)
+
+        def handle_starttag(self, tag, attrs):
+            if tag not in self.head_elems:
+                raise EndOfHeadError()
+            try:
+                method = getattr(self, 'start_' + tag)
+            except AttributeError:
+                try:
+                    method = getattr(self, 'do_' + tag)
+                except AttributeError:
+                    pass # unknown tag
+                else:
+                    method(attrs)
+            else:
+                method(attrs)
+
+        def handle_endtag(self, tag):
+            if tag not in self.head_elems:
+                raise EndOfHeadError()
+            try:
+                method = getattr(self, 'end_' + tag)
+            except AttributeError:
+                pass # unknown tag
+            else:
+                method()
+
+        def unescape(self, name):
+            # Use the entitydefs passed into constructor, not
+            # HTMLParser.HTMLParser's entitydefs.
+            return self.unescape_attr(name)
+
+        def unescape_attr_if_required(self, name):
+            return name  # HTMLParser.HTMLParser already did it
+
+class HeadParser(AbstractHeadParser, sgmllib.SGMLParser):
+
+    def _not_called(self):
+        assert False
+
+    def __init__(self):
+        sgmllib.SGMLParser.__init__(self)
+        AbstractHeadParser.__init__(self)
+
+    def handle_starttag(self, tag, method, attrs):
+        if tag not in self.head_elems:
+            raise EndOfHeadError()
+        if tag == "meta":
+            method(attrs)
+
+    def unknown_starttag(self, tag, attrs):
+        self.handle_starttag(tag, self._not_called, attrs)
+
+    def handle_endtag(self, tag, method):
+        if tag in self.head_elems:
+            method()
+        else:
+            raise EndOfHeadError()
+
+    def unescape_attr_if_required(self, name):
+        return self.unescape_attr(name)
+
+def parse_head(fileobj, parser):
+    """Return a list of key, value pairs."""
+    while 1:
+        data = fileobj.read(CHUNK)
+        try:
+            parser.feed(data)
+        except EndOfHeadError:
+            break
+        if len(data) != CHUNK:
+            # this should only happen if there is no HTML body, or if
+            # CHUNK is big
+            break
+    return parser.http_equiv
+
+class HTTPEquivProcessor(BaseHandler):
+    """Append META HTTP-EQUIV headers to regular HTTP headers."""
+
+    handler_order = 300  # before handlers that look at HTTP headers
+
+    def __init__(self, head_parser_class=HeadParser,
+                 i_want_broken_xhtml_support=False,
+                 ):
+        self.head_parser_class = head_parser_class
+        self._allow_xhtml = i_want_broken_xhtml_support
+
+    def http_response(self, request, response):
+        if not hasattr(response, "seek"):
+            response = response_seek_wrapper(response)
+        http_message = response.info()
+        url = response.geturl()
+        ct_hdrs = http_message.getheaders("content-type")
+        if is_html(ct_hdrs, url, self._allow_xhtml):
+            try:
+                try:
+                    html_headers = parse_head(response, self.head_parser_class())
+                finally:
+                    response.seek(0)
+            except (HTMLParser.HTMLParseError,
+                    sgmllib.SGMLParseError):
+                pass
+            else:
+                for hdr, val in html_headers:
+                    # add a header
+                    http_message.dict[hdr.lower()] = val
+                    text = hdr + ": " + val
+                    for line in text.split("\n"):
+                        http_message.headers.append(line + "\n")
+        return response
+
+    https_response = http_response
+
+class HTTPCookieProcessor(BaseHandler):
+    """Handle HTTP cookies.
+
+    Public attributes:
+
+    cookiejar: CookieJar instance
+
+    """
+    def __init__(self, cookiejar=None):
+        if cookiejar is None:
+            cookiejar = CookieJar()
+        self.cookiejar = cookiejar
+
+    def http_request(self, request):
+        self.cookiejar.add_cookie_header(request)
+        return request
+
+    def http_response(self, request, response):
+        self.cookiejar.extract_cookies(response, request)
+        return response
+
+    https_request = http_request
+    https_response = http_response
+
+try:
+    import robotparser
+except ImportError:
+    pass
+else:
+    class MechanizeRobotFileParser(robotparser.RobotFileParser):
+
+        def __init__(self, url='', opener=None):
+            import _opener
+            robotparser.RobotFileParser.__init__(self, url)
+            self._opener = opener
+
+        def set_opener(self, opener=None):
+            if opener is None:
+                opener = _opener.OpenerDirector()
+            self._opener = opener
+
+        def read(self):
+            """Reads the robots.txt URL and feeds it to the parser."""
+            if self._opener is None:
+                self.set_opener()
+            req = Request(self.url, unverifiable=True, visit=False)
+            try:
+                f = self._opener.open(req)
+            except HTTPError, f:
+                pass
+            except (IOError, socket.error, OSError), exc:
+                robotparser._debug("ignoring error opening %r: %s" %
+                                   (self.url, exc))
+                return
+            lines = []
+            line = f.readline()
+            while line:
+                lines.append(line.strip())
+                line = f.readline()
+            status = f.code
+            if status == 401 or status == 403:
+                self.disallow_all = True
+                robotparser._debug("disallow all")
+            elif status >= 400:
+                self.allow_all = True
+                robotparser._debug("allow all")
+            elif status == 200 and lines:
+                robotparser._debug("parse lines")
+                self.parse(lines)
+
+    class RobotExclusionError(urllib2.HTTPError):
+        def __init__(self, request, *args):
+            apply(urllib2.HTTPError.__init__, (self,)+args)
+            self.request = request
+
+    class HTTPRobotRulesProcessor(BaseHandler):
+        # before redirections, after everything else
+        handler_order = 800
+
+        try:
+            from httplib import HTTPMessage
+        except:
+            from mimetools import Message
+            http_response_class = Message
+        else:
+            http_response_class = HTTPMessage
+
+        def __init__(self, rfp_class=MechanizeRobotFileParser):
+            self.rfp_class = rfp_class
+            self.rfp = None
+            self._host = None
+
+        def http_request(self, request):
+            scheme = request.get_type()
+            if scheme not in ["http", "https"]:
+                # robots exclusion only applies to HTTP
+                return request
+
+            if request.get_selector() == "/robots.txt":
+                # /robots.txt is always OK to fetch
+                return request
+
+            host = request.get_host()
+            if host != self._host:
+                self.rfp = self.rfp_class()
+                try:
+                    self.rfp.set_opener(self.parent)
+                except AttributeError:
+                    debug("%r instance does not support set_opener" %
+                          self.rfp.__class__)
+                self.rfp.set_url(scheme+"://"+host+"/robots.txt")
+                self.rfp.read()
+                self._host = host
+
+            ua = request.get_header("User-agent", "")
+            if self.rfp.can_fetch(ua, request.get_full_url()):
+                return request
+            else:
+                msg = "request disallowed by robots.txt"
+                raise RobotExclusionError(
+                    request,
+                    request.get_full_url(),
+                    403, msg,
+                    self.http_response_class(StringIO()), StringIO(msg))
+
+        https_request = http_request
+
+class HTTPRefererProcessor(BaseHandler):
+    """Add Referer header to requests.
+
+    This only makes sense if you use each RefererProcessor for a single
+    chain of requests (so, for example, if you use a single
+    HTTPRefererProcessor to fetch a series of URLs extracted from a single
+    page, this will break).
+
+    There's a proper implementation of this in module mechanize.
+
+    """
+    def __init__(self):
+        self.referer = None
+
+    def http_request(self, request):
+        if ((self.referer is not None) and
+            not request.has_header("Referer")):
+            request.add_unredirected_header("Referer", self.referer)
+        return request
+
+    def http_response(self, request, response):
+        self.referer = response.geturl()
+        return response
+
+    https_request = http_request
+    https_response = http_response
+
+
+def clean_refresh_url(url):
+    # e.g. Firefox 1.5 does (something like) this
+    if ((url.startswith('"') and url.endswith('"')) or
+        (url.startswith("'") and url.endswith("'"))):
+        url = url[1:-1]
+    return _rfc3986.clean_url(url, "latin-1")  # XXX encoding
+
+def parse_refresh_header(refresh):
+    """
+    >>> parse_refresh_header("1; url=http://example.com/")
+    (1.0, 'http://example.com/')
+    >>> parse_refresh_header("1; url='http://example.com/'")
+    (1.0, 'http://example.com/')
+    >>> parse_refresh_header("1")
+    (1.0, None)
+    >>> parse_refresh_header("blah")
+    Traceback (most recent call last):
+    ValueError: invalid literal for float(): blah
+
+    """
+
+    ii = refresh.find(";")
+    if ii != -1:
+        pause, newurl_spec = float(refresh[:ii]), refresh[ii+1:]
+        jj = newurl_spec.find("=")
+        key = None
+        if jj != -1:
+            key, newurl = newurl_spec[:jj], newurl_spec[jj+1:]
+            newurl = clean_refresh_url(newurl)
+        if key is None or key.strip().lower() != "url":
+            raise ValueError()
+    else:
+        pause, newurl = float(refresh), None
+    return pause, newurl
+
+class HTTPRefreshProcessor(BaseHandler):
+    """Perform HTTP Refresh redirections.
+
+    Note that if a non-200 HTTP code has occurred (for example, a 30x
+    redirect), this processor will do nothing.
+
+    By default, only zero-time Refresh headers are redirected.  Use the
+    max_time attribute / constructor argument to allow Refresh with longer
+    pauses.  Use the honor_time attribute / constructor argument to control
+    whether the requested pause is honoured (with a time.sleep()) or
+    skipped in favour of immediate redirection.
+
+    Public attributes:
+
+    max_time: see above
+    honor_time: see above
+
+    """
+    handler_order = 1000
+
+    def __init__(self, max_time=0, honor_time=True):
+        self.max_time = max_time
+        self.honor_time = honor_time
+
+    def http_response(self, request, response):
+        code, msg, hdrs = response.code, response.msg, response.info()
+
+        if code == 200 and hdrs.has_key("refresh"):
+            refresh = hdrs.getheaders("refresh")[0]
+            try:
+                pause, newurl = parse_refresh_header(refresh)
+            except ValueError:
+                debug("bad Refresh header: %r" % refresh)
+                return response
+            if newurl is None:
+                newurl = response.geturl()
+            if (self.max_time is None) or (pause <= self.max_time):
+                if pause > 1E-3 and self.honor_time:
+                    time.sleep(pause)
+                hdrs["location"] = newurl
+                # hardcoded http is NOT a bug
+                response = self.parent.error(
+                    "http", request, response,
+                    "refresh", msg, hdrs)
+
+        return response
+
+    https_response = http_response
+
+class HTTPErrorProcessor(BaseHandler):
+    """Process HTTP error responses.
+
+    The purpose of this handler is to allow other response processors a
+    look-in by removing the call to parent.error() from
+    AbstractHTTPHandler.
+
+    For non-200 error codes, this just passes the job on to the
+    Handler.<proto>_error_<code> methods, via the OpenerDirector.error
+    method.  Eventually, urllib2.HTTPDefaultErrorHandler will raise an
+    HTTPError if no other handler handles the error.
+
+    """
+    handler_order = 1000  # after all other processors
+
+    def http_response(self, request, response):
+        code, msg, hdrs = response.code, response.msg, response.info()
+
+        if code != 200:
+            # hardcoded http is NOT a bug
+            response = self.parent.error(
+                "http", request, response, code, msg, hdrs)
+
+        return response
+
+    https_response = http_response
+
+
+class AbstractHTTPHandler(BaseHandler):
+
+    def __init__(self, debuglevel=0):
+        self._debuglevel = debuglevel
+
+    def set_http_debuglevel(self, level):
+        self._debuglevel = level
+
+    def do_request_(self, request):
+        host = request.get_host()
+        if not host:
+            raise URLError('no host given')
+
+        if request.has_data():  # POST
+            data = request.get_data()
+            if not request.has_header('Content-type'):
+                request.add_unredirected_header(
+                    'Content-type',
+                    'application/x-www-form-urlencoded')
+
+        scheme, sel = urllib.splittype(request.get_selector())
+        sel_host, sel_path = urllib.splithost(sel)
+        if not request.has_header('Host'):
+            request.add_unredirected_header('Host', sel_host or host)
+        for name, value in self.parent.addheaders:
+            name = name.capitalize()
+            if not request.has_header(name):
+                request.add_unredirected_header(name, value)
+
+        return request
+
+    def do_open(self, http_class, req):
+        """Return an addinfourl object for the request, using http_class.
+
+        http_class must implement the HTTPConnection API from httplib.
+        The addinfourl return value is a file-like object.  It also
+        has methods and attributes including:
+            - info(): return a mimetools.Message object for the headers
+            - geturl(): return the original request URL
+            - code: HTTP status code
+        """
+        host = req.get_host()
+        if not host:
+            raise URLError('no host given')
+
+        h = http_class(host) # will parse host:port
+        h.set_debuglevel(self._debuglevel)
+
+        headers = dict(req.headers)
+        headers.update(req.unredirected_hdrs)
+        # We want to make an HTTP/1.1 request, but the addinfourl
+        # class isn't prepared to deal with a persistent connection.
+        # It will try to read all remaining data from the socket,
+        # which will block while the server waits for the next request.
+        # So make sure the connection gets closed after the (only)
+        # request.
+        headers["Connection"] = "close"
+        headers = dict(
+            [(name.title(), val) for name, val in headers.items()])
+        try:
+            h.request(req.get_method(), req.get_selector(), req.data, headers)
+            r = h.getresponse()
+        except socket.error, err: # XXX what error?
+            raise URLError(err)
+
+        # Pick apart the HTTPResponse object to get the addinfourl
+        # object initialized properly.
+
+        # Wrap the HTTPResponse object in socket's file object adapter
+        # for Windows.  That adapter calls recv(), so delegate recv()
+        # to read().  This weird wrapping allows the returned object to
+        # have readline() and readlines() methods.
+
+        # XXX It might be better to extract the read buffering code
+        # out of socket._fileobject() and into a base class.
+
+        r.recv = r.read
+        fp = socket._fileobject(r)
+
+        resp = closeable_response(fp, r.msg, req.get_full_url(),
+                                  r.status, r.reason)
+        return resp
+
+
+class HTTPHandler(AbstractHTTPHandler):
+    def http_open(self, req):
+        return self.do_open(httplib.HTTPConnection, req)
+
+    http_request = AbstractHTTPHandler.do_request_
+
+if hasattr(httplib, 'HTTPS'):
+
+    class HTTPSConnectionFactory:
+        def __init__(self, key_file, cert_file):
+            self._key_file = key_file
+            self._cert_file = cert_file
+        def __call__(self, hostport):
+            return httplib.HTTPSConnection(
+                hostport,
+                key_file=self._key_file, cert_file=self._cert_file)
+
+    class HTTPSHandler(AbstractHTTPHandler):
+        def __init__(self, client_cert_manager=None):
+            AbstractHTTPHandler.__init__(self)
+            self.client_cert_manager = client_cert_manager
+
+        def https_open(self, req):
+            if self.client_cert_manager is not None:
+                key_file, cert_file = self.client_cert_manager.find_key_cert(
+                    req.get_full_url())
+                conn_factory = HTTPSConnectionFactory(key_file, cert_file)
+            else:
+                conn_factory = httplib.HTTPSConnection
+            return self.do_open(conn_factory, req)
+
+        https_request = AbstractHTTPHandler.do_request_

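The new HTTPSHandler accepts an optional client_cert_manager; per https_open()
above, any object with a find_key_cert(url) method returning a (key_file,
cert_file) pair will do.  A minimal sketch of such a manager (not part of this
commit: the class name and file paths are invented, and it assumes the
mechanize package re-exports HTTPSHandler, which is only defined where httplib
has SSL support):

    import mechanize

    class PrefixCertManager:
        # hypothetical manager mapping URL prefixes to (key, cert) files
        def __init__(self, mapping):
            self._mapping = mapping
        def find_key_cert(self, url):
            for prefix, key_cert in self._mapping.items():
                if url.startswith(prefix):
                    return key_cert
            # (None, None) gives a connection without a client certificate
            return None, None

    manager = PrefixCertManager(
        {"https://example.com/": ("client.key", "client.crt")})
    opener = mechanize.build_opener(
        mechanize.HTTPSHandler(client_cert_manager=manager))
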
Modified: python-mechanize/branches/upstream/current/mechanize/_lwpcookiejar.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_lwpcookiejar.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_lwpcookiejar.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -18,12 +18,12 @@
 
 """
 
-import time, re, string, logging
+import time, re, logging
 
 from _clientcookie import reraise_unmasked_exceptions, FileCookieJar, Cookie, \
      MISSING_FILENAME_TEXT, LoadError
 from _headersutil import join_header_words, split_header_words
-from _util import startswith, iso2time, time2isoz
+from _util import iso2time, time2isoz
 
 debug = logging.getLogger("mechanize").debug
 
@@ -89,7 +89,7 @@
                 debug("   Not saving %s: expired", cookie.name)
                 continue
             r.append("Set-Cookie3: %s" % lwp_cookie_str(cookie))
-        return string.join(r+[""], "\n")
+        return "\n".join(r+[""])
 
     def save(self, filename=None, ignore_discard=False, ignore_expires=False):
         if filename is None:
@@ -127,9 +127,9 @@
             while 1:
                 line = f.readline()
                 if line == "": break
-                if not startswith(line, header):
+                if not line.startswith(header):
                     continue
-                line = string.strip(line[len(header):])
+                line = line[len(header):].strip()
 
                 for data in split_header_words([line]):
                     name, value = data[0]
@@ -139,7 +139,7 @@
                         standard[k] = False
                     for k, v in data[1:]:
                         if k is not None:
-                            lc = string.lower(k)
+                            lc = k.lower()
                         else:
                             lc = None
                         # don't lose case distinction for unknown fields
@@ -161,7 +161,7 @@
                     if expires is None:
                         discard = True
                     domain = h("domain")
-                    domain_specified = startswith(domain, ".")
+                    domain_specified = domain.startswith(".")
                     c = Cookie(h("version"), name, value,
                                h("port"), h("port_spec"),
                                domain, domain_specified, h("domain_dot"),

Modified: python-mechanize/branches/upstream/current/mechanize/_mechanize.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_mechanize.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_mechanize.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -9,14 +9,16 @@
 
 """
 
-import urllib2, urlparse, sys, copy, re
+import urllib2, sys, copy, re
 
-from _useragent import UserAgent
+from _useragent import UserAgentBase
 from _html import DefaultFactory
-from _util import response_seek_wrapper, closeable_response
+from _response import response_seek_wrapper, closeable_response
+import _upgrade
 import _request
+import _rfc3986
 
-__version__ = (0, 1, 2, "b", None)  # 0.1.2b
+__version__ = (0, 1, 6, "b", None)  # 0.1.6b
 
 class BrowserStateError(Exception): pass
 class LinkNotFoundError(Exception): pass
@@ -46,46 +48,12 @@
         del self._history[:]
     def close(self):
         for request, response in self._history:
-            response.close()
+            if response is not None:
+                response.close()
         del self._history[:]
 
-# Horrible, but needed, at least until fork urllib2.  Even then, may want
-# to preseve urllib2 compatibility.
-def upgrade_response(response):
-    # a urllib2 handler constructed the response, i.e. the response is an
-    # urllib.addinfourl, instead of a _Util.closeable_response as returned
-    # by e.g. mechanize.HTTPHandler
-    try:
-        code = response.code
-    except AttributeError:
-        code = None
-    try:
-        msg = response.msg
-    except AttributeError:
-        msg = None
 
-    # may have already-.read() data from .seek() cache
-    data = None
-    get_data = getattr(response, "get_data", None)
-    if get_data:
-        data = get_data()
-
-    response = closeable_response(
-        response.fp, response.info(), response.geturl(), code, msg)
-    response = response_seek_wrapper(response)
-    if data:
-        response.set_data(data)
-    return response
-class ResponseUpgradeProcessor(urllib2.BaseHandler):
-    # upgrade responses to be .close()able without becoming unusable
-    handler_order = 0  # before anything else
-    def any_response(self, request, response):
-        if not hasattr(response, 'closeable_response'):
-            response = upgrade_response(response)
-        return response
-
-
-class Browser(UserAgent):
+class Browser(UserAgentBase):
     """Browser-like class with support for history, forms and links.
 
     BrowserStateError is raised whenever the browser is in the wrong state to
@@ -100,9 +68,9 @@
 
     """
 
-    handler_classes = UserAgent.handler_classes.copy()
-    handler_classes["_response_upgrade"] = ResponseUpgradeProcessor
-    default_others = copy.copy(UserAgent.default_others)
+    handler_classes = UserAgentBase.handler_classes.copy()
+    handler_classes["_response_upgrade"] = _upgrade.ResponseUpgradeProcessor
+    default_others = copy.copy(UserAgentBase.default_others)
     default_others.append("_response_upgrade")
 
     def __init__(self,
@@ -115,8 +83,8 @@
         Only named arguments should be passed to this constructor.
 
         factory: object implementing the mechanize.Factory interface.
-        history: object implementing the mechanize.History interface.  Note this
-         interface is still experimental and may change in future.
+        history: object implementing the mechanize.History interface.  Note
+         this interface is still experimental and may change in future.
         request_class: Request class to use.  Defaults to mechanize.Request
          by default for Pythons older than 2.4, urllib2.Request otherwise.
 
@@ -132,8 +100,6 @@
         if history is None:
             history = History()
         self._history = history
-        self.request = self._response = None
-        self.form = None
 
         if request_class is None:
             if not hasattr(urllib2.Request, "add_unredirected_header"):
@@ -147,48 +113,77 @@
         self._factory = factory
         self.request_class = request_class
 
-        UserAgent.__init__(self)  # do this last to avoid __getattr__ problems
+        self.request = None
+        self._set_response(None, False)
 
+        # do this last to avoid __getattr__ problems
+        UserAgentBase.__init__(self)
+
     def close(self):
+        UserAgentBase.close(self)
         if self._response is not None:
             self._response.close()    
-        UserAgent.close(self)
         if self._history is not None:
             self._history.close()
             self._history = None
+
+        # make use after .close easy to spot
+        self.form = None
         self.request = self._response = None
+        self.request = self.response = self.set_response = None
+        self.geturl = self.reload = self.back = None
+        self.clear_history = self.set_cookie = self.links = self.forms = None
+        self.viewing_html = self.encoding = self.title = None
+        self.select_form = self.click = self.submit = self.click_link = None
+        self.follow_link = self.find_link = None
 
+    def open_novisit(self, url, data=None):
+        """Open a URL without visiting it.
+
+        The browser state (including .request, .response(), history, forms
+        and links) is left unchanged by calling this function.
+
+        The interface is the same as for .open().
+
+        This is useful for things like fetching images.
+
+        See also .retrieve().
+
+        """
+        return self._mech_open(url, data, visit=False)
+
     def open(self, url, data=None):
-        if self._response is not None:
-            self._response.close()
         return self._mech_open(url, data)
 
-    def _mech_open(self, url, data=None, update_history=True):
+    def _mech_open(self, url, data=None, update_history=True, visit=None):
         try:
             url.get_full_url
         except AttributeError:
             # string URL -- convert to absolute URL if required
-            scheme, netloc = urlparse.urlparse(url)[:2]
-            if not scheme:
+            scheme, authority = _rfc3986.urlsplit(url)[:2]
+            if scheme is None:
                 # relative URL
-                assert not netloc, "malformed URL"
                 if self._response is None:
                     raise BrowserStateError(
-                        "can't fetch relative URL: not viewing any document")
-                url = urlparse.urljoin(self._response.geturl(), url)
+                        "can't fetch relative reference: "
+                        "not viewing any document")
+                url = _rfc3986.urljoin(self._response.geturl(), url)
 
-        if self.request is not None and update_history:
-            self._history.add(self.request, self._response)
-        self._response = None
-        # we want self.request to be assigned even if UserAgent.open fails
-        self.request = self._request(url, data)
-        self._previous_scheme = self.request.get_type()
+        request = self._request(url, data, visit)
+        visit = request.visit
+        if visit is None:
+            visit = True
 
+        if visit:
+            self._visit_request(request, update_history)
+
         success = True
         try:
-            response = UserAgent.open(self, self.request, data)
+            response = UserAgentBase.open(self, request, data)
         except urllib2.HTTPError, error:
             success = False
+            if error.fp is None:  # not a response
+                raise
             response = error
 ##         except (IOError, socket.error, OSError), error:
 ##             # Yes, urllib2 really does raise all these :-((
@@ -201,10 +196,16 @@
 ##             # Python core, a fix would need some backwards-compat. hack to be
 ##             # acceptable.
 ##             raise
-        self.set_response(response)
+
+        if visit:
+            self._set_response(response, False)
+            response = copy.copy(self._response)
+        elif response is not None:
+            response = _upgrade.upgrade_response(response)
+
         if not success:
-            raise error
-        return copy.copy(self._response)
+            raise response
+        return response
 
     def __str__(self):
         text = []
@@ -228,24 +229,52 @@
         return copy.copy(self._response)
 
     def set_response(self, response):
-        """Replace current response with (a copy of) response."""
+        """Replace current response with (a copy of) response.
+
+        response may be None.
+
+        This is intended mostly for HTML-preprocessing.
+        """
+        self._set_response(response, True)
+
+    def _set_response(self, response, close_current):
         # sanity check, necessary but far from sufficient
-        if not (hasattr(response, "info") and hasattr(response, "geturl") and
-                hasattr(response, "read")):
+        if not (response is None or
+                (hasattr(response, "info") and hasattr(response, "geturl") and
+                 hasattr(response, "read")
+                 )
+                ):
             raise ValueError("not a response object")
 
         self.form = None
+        if response is not None:
+            response = _upgrade.upgrade_response(response)
+        if close_current and self._response is not None:
+            self._response.close()
+        self._response = response
+        self._factory.set_response(response)
 
-        if not hasattr(response, "seek"):
-            response = response_seek_wrapper(response)
-        if not hasattr(response, "closeable_response"):
-            response = upgrade_response(response)
-        else:
-            response = copy.copy(response)
+    def visit_response(self, response, request=None):
+        """Visit the response, as if it had been .open()ed.
 
-        self._response = response
-        self._factory.set_response(self._response)
+        Unlike .set_response(), this updates history rather than replacing the
+        current response.
+        """
+        if request is None:
+            request = _request.Request(response.geturl())
+        self._visit_request(request, True)
+        self._set_response(response, False)
 
+    def _visit_request(self, request, update_history):
+        if self._response is not None:
+            self._response.close()
+        if self.request is not None and update_history:
+            self._history.add(self.request, self._response)
+        self._response = None
+        # we want self.request to be assigned even if UserAgentBase.open
+        # fails
+        self.request = request
+
     def geturl(self):
         """Get URL of current document."""
         if self._response is None:
@@ -270,11 +299,53 @@
             self._response.close()
         self.request, response = self._history.back(n, self._response)
         self.set_response(response)
-        return response
+        if not response.read_complete:
+            return self.reload()
+        return copy.copy(response)
 
     def clear_history(self):
         self._history.clear()
 
+    def set_cookie(self, cookie_string):
+        """Request to set a cookie.
+
+        Note that it is NOT necessary to call this method under ordinary
+        circumstances: cookie handling is normally entirely automatic.  The
+        intended use case is rather to simulate the setting of a cookie by
+        client script in a web page (e.g. JavaScript).  In that case, use of
+        this method is necessary because mechanize currently does not support
+        JavaScript, VBScript, etc.
+
+        The cookie is added in the same way as if it had arrived with the
+        current response, as a result of the current request.  This means
+        that, for example, if it is not appropriate to set the cookie based
+        on the current request, no cookie will be set.
+
+        The cookie will be returned automatically with subsequent responses
+        made by the Browser instance whenever that's appropriate.
+
+        cookie_string should be a valid value of the Set-Cookie header.
+
+        For example:
+
+        browser.set_cookie(
+            "sid=abcdef; expires=Wednesday, 09-Nov-06 23:12:40 GMT")
+
+        Currently, this method does not allow for adding RFC 2965 cookies.
+        This limitation will be lifted if anybody requests it.
+
+        """
+        if self._response is None:
+            raise BrowserStateError("not viewing any document")
+        if self.request.get_type() not in ["http", "https"]:
+            raise BrowserStateError("can't set cookie for non-HTTP/HTTPS "
+                                    "transactions")
+        cookiejar = self._ua_handlers["_cookies"].cookiejar
+        response = self.response()  # copy
+        headers = response.info()
+        headers["Set-cookie"] = cookie_string
+        cookiejar.extract_cookies(response, self.request)
+
     def links(self, **kwds):
         """Return iterable over links (mechanize.Link objects)."""
         if not self.viewing_html():
@@ -295,6 +366,24 @@
             raise BrowserStateError("not viewing HTML")
         return self._factory.forms()
 
+    def global_form(self):
+        """Return the global form object, or None if the factory implementation
+        did not supply one.
+
+        The "global" form object contains all controls that are not descendants of
+        any FORM element.
+
+        The returned form object implements the ClientForm.HTMLForm interface.
+
+        This is a separate method since the global form is not regarded as part
+        of the sequence of forms in the document -- mostly for
+        backwards-compatibility.
+
+        """
+        if not self.viewing_html():
+            raise BrowserStateError("not viewing HTML")
+        return self._factory.global_form
+
     def viewing_html(self):
         """Return whether the current response contains HTML data."""
         if self._response is None:
@@ -327,6 +416,10 @@
         interface, so you can call methods like .set_value(), .set(), and
         .click().
 
+        Another way to select a form is to assign to the .form attribute.  The
+        form assigned should be one of the objects returned by the .forms()
+        method.
+
         At least one of the name, predicate and nr arguments must be supplied.
         If no matching form is found, mechanize.FormNotFoundError is raised.
 
@@ -383,9 +476,9 @@
             original_scheme in ["http", "https"] and
             not (original_scheme == "https" and scheme != "https")):
             # strip URL fragment (RFC 2616 14.36)
-            parts = urlparse.urlparse(self.request.get_full_url())
-            parts = parts[:-1]+("",)
-            referer = urlparse.urlunparse(parts)
+            parts = _rfc3986.urlsplit(self.request.get_full_url())
+            parts = parts[:-1]+(None,)
+            referer = _rfc3986.urlunsplit(parts)
             request.add_unredirected_header("Referer", referer)
         return request
 
@@ -494,9 +587,6 @@
                 ".select_form()?)" % (self.__class__, name))
         return getattr(form, name)
 
-#---------------------------------------------------
-# Private methods.
-
     def _filter_links(self, links,
                     text=None, text_regex=None,
                     name=None, name_regex=None,

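Two of the Browser additions above, open_novisit() and set_cookie(), combine
as in this sketch (not part of this commit; the URLs and cookie value are
invented):

    import mechanize

    br = mechanize.Browser()
    br.open("http://example.com/")  # a visit: history, forms and links update

    # fetch an image without disturbing .request, .response() or history
    img = br.open_novisit("http://example.com/logo.png")
    data = img.read()

    # simulate a cookie that client-side JavaScript would have set
    br.set_cookie("sid=abcdef; expires=Wednesday, 09-Nov-06 23:12:40 GMT")
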
Modified: python-mechanize/branches/upstream/current/mechanize/_mozillacookiejar.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_mozillacookiejar.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_mozillacookiejar.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -9,11 +9,10 @@
 
 """
 
-import re, string, time, logging
+import re, time, logging
 
 from _clientcookie import reraise_unmasked_exceptions, FileCookieJar, Cookie, \
      MISSING_FILENAME_TEXT, LoadError
-from _util import startswith, endswith
 debug = logging.getLogger("ClientCookie").debug
 
 
@@ -72,23 +71,23 @@
                 if line == "": break
 
                 # last field may be absent, so keep any trailing tab
-                if endswith(line, "\n"): line = line[:-1]
+                if line.endswith("\n"): line = line[:-1]
 
                 # skip comments and blank lines XXX what is $ for?
-                if (startswith(string.strip(line), "#") or
-                    startswith(string.strip(line), "$") or
-                    string.strip(line) == ""):
+                if (line.strip().startswith("#") or
+                    line.strip().startswith("$") or
+                    line.strip() == ""):
                     continue
 
                 domain, domain_specified, path, secure, expires, name, value = \
-                        string.split(line, "\t")
+                        line.split("\t")
                 secure = (secure == "TRUE")
                 domain_specified = (domain_specified == "TRUE")
                 if name == "":
                     name = value
                     value = None
 
-                initial_dot = startswith(domain, ".")
+                initial_dot = domain.startswith(".")
                 assert domain_specified == initial_dot
 
                 discard = False
@@ -137,7 +136,7 @@
                     continue
                 if cookie.secure: secure = "TRUE"
                 else: secure = "FALSE"
-                if startswith(cookie.domain, "."): initial_dot = "TRUE"
+                if cookie.domain.startswith("."): initial_dot = "TRUE"
                 else: initial_dot = "FALSE"
                 if cookie.expires is not None:
                     expires = str(cookie.expires)
@@ -153,8 +152,8 @@
                     name = cookie.name
                     value = cookie.value
                 f.write(
-                    string.join([cookie.domain, initial_dot, cookie.path,
-                                 secure, expires, name, value], "\t")+
+                    "\t".join([cookie.domain, initial_dot, cookie.path,
+                               secure, expires, name, value])+
                     "\n")
         finally:
             f.close()

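The load/save code above round-trips the Netscape/Mozilla cookies.txt format:
one cookie per line, seven tab-separated fields (domain, domain_specified,
path, secure, expires, name, value).  A load/save sketch (assuming a
cookies.txt file exists in the current directory):

    import mechanize

    cj = mechanize.MozillaCookieJar("cookies.txt")
    cj.load(ignore_discard=True, ignore_expires=True)
    cj.save("cookies-copy.txt", ignore_discard=True, ignore_expires=True)
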
Modified: python-mechanize/branches/upstream/current/mechanize/_msiecookiejar.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_msiecookiejar.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_msiecookiejar.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -11,13 +11,12 @@
 
 # XXX names and comments are not great here
 
-import os, re, string, time, struct, logging
+import os, re, time, struct, logging
 if os.name == "nt":
     import _winreg
 
 from _clientcookie import FileCookieJar, CookieJar, Cookie, \
      MISSING_FILENAME_TEXT, LoadError
-from _util import startswith
 
 debug = logging.getLogger("mechanize").debug
 
@@ -50,7 +49,7 @@
     return divmod((filetime - WIN32_EPOCH), 10000000L)[0]
 
 def binary_to_char(c): return "%02X" % ord(c)
-def binary_to_str(d): return string.join(map(binary_to_char, list(d)), "")
+def binary_to_str(d): return "".join(map(binary_to_char, list(d)))
 
 class MSIEBase:
     magic_re = re.compile(r"Client UrlCache MMF Ver \d\.\d.*")
@@ -153,7 +152,7 @@
             else:
                 discard = False
             domain = cookie["DOMAIN"]
-            initial_dot = startswith(domain, ".")
+            initial_dot = domain.startswith(".")
             if initial_dot:
                 domain_specified = True
             else:
@@ -201,7 +200,7 @@
         now = int(time.time())
 
         if username is None:
-            username = string.lower(os.environ['USERNAME'])
+            username = os.environ['USERNAME'].lower()
 
         cookie_dir = os.path.dirname(filename)
 

Modified: python-mechanize/branches/upstream/current/mechanize/_opener.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_opener.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_opener.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -9,45 +9,30 @@
 
 """
 
-import urllib2, string, bisect, urlparse
-
-from _util import startswith, isstringlike
-from _request import Request
-
+import os, urllib2, bisect, urllib, httplib, types, tempfile
 try:
+    import threading as _threading
+except ImportError:
+    import dummy_threading as _threading
+try:
     set
 except NameError:
     import sets
     set = sets.Set
 
-def methnames(obj):
-    """Return method names of class instance.
+import _http
+import _upgrade
+import _rfc3986
+from _util import isstringlike
+from _request import Request
 
-    dir(obj) doesn't work across Python versions, this does.
 
-    """
-    return methnames_of_instance_as_dict(obj).keys()
+class ContentTooShortError(urllib2.URLError):
+    def __init__(self, reason, result):
+        urllib2.URLError.__init__(self, reason)
+        self.result = result
 
-def methnames_of_instance_as_dict(inst):
-    names = {}
-    names.update(methnames_of_class_as_dict(inst.__class__))
-    for methname in dir(inst):
-        candidate = getattr(inst, methname)
-        if callable(candidate):
-            names[methname] = None
-    return names
 
-def methnames_of_class_as_dict(klass):
-    names = {}
-    for methname in dir(klass):
-        candidate = getattr(klass, methname)
-        if callable(candidate):
-            names[methname] = None
-    for baseclass in klass.__bases__:
-        names.update(methnames_of_class_as_dict(baseclass))
-    return names
-
-
 class OpenerDirector(urllib2.OpenerDirector):
     def __init__(self):
         urllib2.OpenerDirector.__init__(self)
@@ -58,6 +43,7 @@
         self._any_request = {}
         self._any_response = {}
         self._handler_index_valid = True
+        self._tempfiles = []
 
     def add_handler(self, handler):
         if handler in self.handlers:
@@ -81,7 +67,7 @@
 
         for handler in self.handlers:
             added = False
-            for meth in methnames(handler):
+            for meth in dir(handler):
                 if meth in ["redirect_request", "do_open", "proxy_open"]:
                     # oops, coincidental match
                     continue
@@ -99,8 +85,8 @@
                 scheme = meth[:ii]
                 condition = meth[ii+1:]
 
-                if startswith(condition, "error"):
-                    jj = string.find(meth[ii+1:], "_") + ii + 1
+                if condition.startswith("error"):
+                    jj = meth[ii+1:].find("_") + ii + 1
                     kind = meth[jj+1:]
                     try:
                         kind = int(kind)
@@ -151,18 +137,25 @@
         self._any_request = any_request
         self._any_response = any_response
 
-    def _request(self, url_or_req, data):
+    def _request(self, url_or_req, data, visit):
         if isstringlike(url_or_req):
-            req = Request(url_or_req, data)
+            req = Request(url_or_req, data, visit=visit)
         else:
             # already a urllib2.Request or mechanize.Request instance
             req = url_or_req
             if data is not None:
                 req.add_data(data)
+            # XXX yuck, give request a .visit attribute if it doesn't have one
+            try:
+                req.visit
+            except AttributeError:
+                req.visit = None
+            if visit is not None:
+                req.visit = visit
         return req
 
     def open(self, fullurl, data=None):
-        req = self._request(fullurl, data)
+        req = self._request(fullurl, data, None)
         req_scheme = req.get_type()
 
         self._maybe_reindex_handlers()
@@ -220,48 +213,174 @@
             args = (dict, 'default', 'http_error_default') + orig_args
             return apply(self._call_chain, args)
 
+    BLOCK_SIZE = 1024*8
     def retrieve(self, fullurl, filename=None, reporthook=None, data=None):
         """Returns (filename, headers).
 
         For remote objects, the default filename will refer to a temporary
-        file.
+        file.  Temporary files are removed when the OpenerDirector.close()
+        method is called.
 
+        For file: URLs, at present the returned filename is None.  This may
+        change in future.
+
+        If the actual number of bytes read is less than indicated by the
+        Content-Length header, raises ContentTooShortError (a URLError
+        subclass).  The exception's .result attribute contains the (filename,
+        headers) that would have been returned.
+
         """
-        req = self._request(fullurl, data)
-        type_ = req.get_type()
+        req = self._request(fullurl, data, False)
+        scheme = req.get_type()
         fp = self.open(req)
         headers = fp.info()
-        if filename is None and type == 'file':
-            return url2pathname(req.get_selector()), headers
+        if filename is None and scheme == 'file':
+            # XXX req.get_selector() seems broken here, return None,
+            #   pending sanity :-/
+            return None, headers
+            #return urllib.url2pathname(req.get_selector()), headers
         if filename:
             tfp = open(filename, 'wb')
         else:
-            path = urlparse(fullurl)[2]
+            path = _rfc3986.urlsplit(fullurl)[2]
             suffix = os.path.splitext(path)[1]
-            tfp = tempfile.TemporaryFile("wb", suffix=suffix)
+            fd, filename = tempfile.mkstemp(suffix)
+            self._tempfiles.append(filename)
+            tfp = os.fdopen(fd, 'wb')
+
         result = filename, headers
-        bs = 1024*8
+        bs = self.BLOCK_SIZE
         size = -1
         read = 0
-        blocknum = 1
+        blocknum = 0
         if reporthook:
-            if headers.has_key("content-length"):
+            if "content-length" in headers:
                 size = int(headers["Content-Length"])
-            reporthook(0, bs, size)
+            reporthook(blocknum, bs, size)
         while 1:
             block = fp.read(bs)
+            if block == "":
+                break
             read += len(block)
+            tfp.write(block)
+            blocknum += 1
             if reporthook:
                 reporthook(blocknum, bs, size)
-            blocknum = blocknum + 1
-            if not block:
-                break
-            tfp.write(block)
         fp.close()
         tfp.close()
         del fp
         del tfp
-        if size>=0 and read<size:
-            raise IOError("incomplete retrieval error",
-                          "got only %d bytes out of %d" % (read,size))
+
+        # raise exception if actual size does not match content-length header
+        if size >= 0 and read < size:
+            raise ContentTooShortError(
+                "retrieval incomplete: "
+                "got only %i out of %i bytes" % (read, size),
+                result
+                )
+
         return result
+
+    def close(self):
+        urllib2.OpenerDirector.close(self)
+
+        # make it very obvious this object is no longer supposed to be used
+        self.open = self.error = self.retrieve = self.add_handler = None
+
+        if self._tempfiles:
+            for filename in self._tempfiles:
+                try:
+                    os.unlink(filename)
+                except OSError:
+                    pass
+            del self._tempfiles[:]
+
+
+class OpenerFactory:
+    """This class's interface is quite likely to change."""
+
+    default_classes = [
+        # handlers
+        urllib2.ProxyHandler,
+        urllib2.UnknownHandler,
+        _http.HTTPHandler,  # derived from new AbstractHTTPHandler
+        urllib2.HTTPDefaultErrorHandler,
+        _http.HTTPRedirectHandler,  # bugfixed
+        urllib2.FTPHandler,
+        urllib2.FileHandler,
+        # processors
+        _upgrade.HTTPRequestUpgradeProcessor,
+        _http.HTTPCookieProcessor,
+        _http.HTTPErrorProcessor,
+        ]
+    if hasattr(httplib, 'HTTPS'):
+        default_classes.append(_http.HTTPSHandler)
+    handlers = []
+    replacement_handlers = []
+
+    def __init__(self, klass=OpenerDirector):
+        self.klass = klass
+
+    def build_opener(self, *handlers):
+        """Create an opener object from a list of handlers and processors.
+
+        The opener will use several default handlers and processors, including
+        support for HTTP and FTP.
+
+        If any of the handlers passed as arguments are subclasses of the
+        default handlers, the default handlers will not be used.
+
+        """
+        opener = self.klass()
+        default_classes = list(self.default_classes)
+        skip = []
+        for klass in default_classes:
+            for check in handlers:
+                if type(check) == types.ClassType:
+                    if issubclass(check, klass):
+                        skip.append(klass)
+                elif type(check) == types.InstanceType:
+                    if isinstance(check, klass):
+                        skip.append(klass)
+        for klass in skip:
+            default_classes.remove(klass)
+
+        for klass in default_classes:
+            opener.add_handler(klass())
+        for h in handlers:
+            if type(h) == types.ClassType:
+                h = h()
+            opener.add_handler(h)
+
+        return opener
+
+
+build_opener = OpenerFactory().build_opener
+
+_opener = None
+urlopen_lock = _threading.Lock()
+def urlopen(url, data=None):
+    global _opener
+    if _opener is None:
+        urlopen_lock.acquire()
+        try:
+            if _opener is None:
+                _opener = build_opener()
+        finally:
+            urlopen_lock.release()
+    return _opener.open(url, data)
+
+def urlretrieve(url, filename=None, reporthook=None, data=None):
+    global _opener
+    if _opener is None:
+        urlopen_lock.acquire()
+        try:
+            if _opener is None:
+                _opener = build_opener()
+        finally:
+            urlopen_lock.release()
+    return _opener.retrieve(url, filename, reporthook, data)
+
+def install_opener(opener):
+    global _opener
+    _opener = opener

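The reworked retrieve() reports progress one block at a time and, when fewer
bytes arrive than Content-Length promised, raises ContentTooShortError whose
.result attribute carries the partial (filename, headers).  A usage sketch
(not part of this commit; the URL is invented, and it assumes build_opener and
ContentTooShortError are re-exported by the mechanize package):

    import mechanize

    def progress(block_num, block_size, total_size):
        # total_size is -1 when no Content-Length header was received
        print "block %d (%d bytes each, %d bytes total)" % (
            block_num, block_size, total_size)

    opener = mechanize.build_opener()
    try:
        filename, headers = opener.retrieve(
            "http://example.com/big.tar.gz", reporthook=progress)
    except mechanize.ContentTooShortError, exc:
        filename, headers = exc.result  # the partial download
    # opener.close() unlinks retrieve()'s temporary files, so use or copy
    # the downloaded file before calling it
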
Modified: python-mechanize/branches/upstream/current/mechanize/_request.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_request.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_request.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -8,16 +8,33 @@
 
 """
 
-import urllib2, string
+import urllib2, urllib, logging
 
 from _clientcookie import request_host
+import _rfc3986
 
+warn = logging.getLogger("mechanize").warning
+# don't complain about missing logging handler
+logging.getLogger("mechanize").setLevel(logging.ERROR)
 
+
 class Request(urllib2.Request):
     def __init__(self, url, data=None, headers={},
-             origin_req_host=None, unverifiable=False):
+                 origin_req_host=None, unverifiable=False, visit=None):
+        # In mechanize 0.2, the interpretation of a unicode url argument will
+        # change: A unicode url argument will be interpreted as an IRI, and a
+        # bytestring as a URI. For now, we accept unicode or bytestring.  We
+        # don't insist that the value is always a URI (specifically, must only
+        # contain characters which are legal), because that might break working
+        # code (who knows what bytes some servers want to see, especially with
+        # browser plugins for internationalised URIs).
+        if not _rfc3986.is_clean_uri(url):
+            warn("url argument is not a URI "
+                 "(contains illegal characters) %r" % url)
         urllib2.Request.__init__(self, url, data, headers)
+        self.selector = None
         self.unredirected_hdrs = {}
+        self.visit = visit
 
         # All the terminology below comes from RFC 2965.
         self.unverifiable = unverifiable
@@ -31,6 +48,11 @@
             origin_req_host = request_host(self)
         self.origin_req_host = origin_req_host
 
+    def get_selector(self):
+        if self.selector is None:
+            self.selector, self.__r_selector = urllib.splittag(self.__r_host)
+        return self.selector
+
     def get_origin_req_host(self):
         return self.origin_req_host
 
@@ -39,14 +61,12 @@
 
     def add_unredirected_header(self, key, val):
         """Add a header that will not be added to a redirected request."""
-        self.unredirected_hdrs[string.capitalize(key)] = val
+        self.unredirected_hdrs[key.capitalize()] = val
 
     def has_header(self, header_name):
         """True iff request has named header (regular or unredirected)."""
-        if (self.headers.has_key(header_name) or
-            self.unredirected_hdrs.has_key(header_name)):
-            return True
-        return False
+        return (header_name in self.headers or
+                header_name in self.unredirected_hdrs)
 
     def get_header(self, header_name, default=None):
         return self.headers.get(

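Header keys on the new Request are stored str.capitalize()d (see
add_unredirected_header() above), so lookups must use that form; the visit
flag simply rides along for OpenerDirector._request() to inspect.  A small
sketch (not part of this commit; the URL and header value are invented):

    import mechanize

    req = mechanize.Request("http://example.com/search", data="q=42",
                            visit=False)
    req.add_unredirected_header("User-Agent", "example-agent/0.1")

    assert req.has_header("User-agent")      # stored as "User-agent"
    assert not req.has_header("User-Agent")  # ...so this form misses
    assert req.get_header("User-agent") == "example-agent/0.1"
    assert req.visit is False
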
Added: python-mechanize/branches/upstream/current/mechanize/_response.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_response.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_response.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,467 @@
+"""Response classes.
+
+The seek_wrapper code is not used if you're using UserAgent with
+.set_seekable_responses(False), or if you're using the urllib2-level interface
+without SeekableProcessor or HTTPEquivProcessor.  Class closeable_response is
+instantiated by some handlers (AbstractHTTPHandler), but the closeable_response
+interface is only depended upon by Browser-level code.  Function
+upgrade_response is only used if you're using Browser or
+ResponseUpgradeProcessor.
+
+
+Copyright 2006 John J. Lee <jjl at pobox.com>
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file COPYING.txt
+included with the distribution).
+
+"""
+
+import copy, mimetools
+from cStringIO import StringIO
+import urllib2
+
+# XXX Andrew Dalke kindly sent me a similar class in response to my request on
+# comp.lang.python, which I then proceeded to lose.  I wrote this class
+# instead, but I think he's released his code publicly since, could pinch the
+# tests from it, at least...
+
+# For testing seek_wrapper invariant (note that
+# test_urllib2.HandlerTest.test_seekable is expected to fail when this
+# invariant checking is turned on).  The invariant checking is done by module
+# ipdc, which is available here:
+# http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/436834
+## from ipdbc import ContractBase
+## class seek_wrapper(ContractBase):
+class seek_wrapper:
+    """Adds a seek method to a file object.
+
+    This is only designed for seeking on readonly file-like objects.
+
+    Wrapped file-like object must have a read method.  The readline method is
+    only supported if that method is present on the wrapped object.  The
+    readlines method is always supported.  xreadlines and iteration are
+    supported only for Python 2.2 and above.
+
+    Public attributes:
+
+    wrapped: the wrapped file object
+    is_closed: true iff .close() has been called
+
+    WARNING: All other attributes of the wrapped object (ie. those that are not
+    one of wrapped, read, readline, readlines, xreadlines, __iter__ and next)
+    are passed through unaltered, which may or may not make sense for your
+    particular file object.
+
+    """
+    # General strategy is to check that cache is full enough, then delegate to
+    # the cache (self.__cache, which is a cStringIO.StringIO instance).  A seek
+    # position (self.__pos) is maintained independently of the cache, in order
+    # that a single cache may be shared between multiple seek_wrapper objects.
+    # Copying using module copy shares the cache in this way.
+
+    def __init__(self, wrapped):
+        self.wrapped = wrapped
+        self.__read_complete_state = [False]
+        self.__is_closed_state = [False]
+        self.__have_readline = hasattr(self.wrapped, "readline")
+        self.__cache = StringIO()
+        self.__pos = 0  # seek position
+
+    def invariant(self):
+        # The end of the cache is always at the same place as the end of the
+        # wrapped file.
+        return self.wrapped.tell() == len(self.__cache.getvalue())
+
+    def close(self):
+        self.wrapped.close()
+        self.is_closed = True
+
+    def __getattr__(self, name):
+        if name == "is_closed":
+            return self.__is_closed_state[0]
+        elif name == "read_complete":
+            return self.__read_complete_state[0]
+
+        wrapped = self.__dict__.get("wrapped")
+        if wrapped:
+            return getattr(wrapped, name)
+
+        return getattr(self.__class__, name)
+
+    def __setattr__(self, name, value):
+        if name == "is_closed":
+            self.__is_closed_state[0] = bool(value)
+        elif name == "read_complete":
+            if not self.is_closed:
+                self.__read_complete_state[0] = bool(value)
+        else:
+            self.__dict__[name] = value
+
+    def seek(self, offset, whence=0):
+        assert whence in [0,1,2]
+
+        # how much data, if any, do we need to read?
+        if whence == 2:  # 2: relative to end of *wrapped* file
+            if offset < 0: raise ValueError("negative seek offset")
+            # since we don't know yet where the end of that file is, we must
+            # read everything
+            to_read = None
+        else:
+            if whence == 0:  # 0: absolute
+                if offset < 0: raise ValueError("negative seek offset")
+                dest = offset
+            else:  # 1: relative to current position
+                pos = self.__pos
+                if pos + offset < 0:
+                    raise ValueError("seek to before start of file")
+                dest = pos + offset
+            end = len(self.__cache.getvalue())
+            to_read = dest - end
+            if to_read < 0:
+                to_read = 0
+
+        if to_read != 0:
+            self.__cache.seek(0, 2)
+            if to_read is None:
+                assert whence == 2
+                self.__cache.write(self.wrapped.read())
+                self.read_complete = True
+                self.__pos = self.__cache.tell() - offset
+            else:
+                data = self.wrapped.read(to_read)
+                if not data:
+                    self.read_complete = True
+                else:
+                    self.__cache.write(data)
+                # Don't raise an exception even if we've seek()ed past the end
+                # of .wrapped, since fseek() doesn't complain in that case.
+                # Also like fseek(), pretend we have seek()ed past the end,
+                # i.e. not:
+                #self.__pos = self.__cache.tell()
+                # but rather:
+                self.__pos = dest
+        else:
+            self.__pos = dest
+
+    def tell(self):
+        return self.__pos
+
+    def __copy__(self):
+        cpy = self.__class__(self.wrapped)
+        cpy.__cache = self.__cache
+        cpy.__read_complete_state = self.__read_complete_state
+        cpy.__is_closed_state = self.__is_closed_state
+        return cpy
+
+    def get_data(self):
+        pos = self.__pos
+        try:
+            self.seek(0)
+            return self.read(-1)
+        finally:
+            self.__pos = pos
+
+    def read(self, size=-1):
+        pos = self.__pos
+        end = len(self.__cache.getvalue())
+        available = end - pos
+
+        # enough data already cached?
+        if size <= available and size != -1:
+            self.__cache.seek(pos)
+            self.__pos = pos+size
+            return self.__cache.read(size)
+
+        # no, so read sufficient data from wrapped file and cache it
+        self.__cache.seek(0, 2)
+        if size == -1:
+            self.__cache.write(self.wrapped.read())
+            self.read_complete = True
+        else:
+            to_read = size - available
+            assert to_read > 0
+            data = self.wrapped.read(to_read)
+            if not data:
+                self.read_complete = True
+            else:
+                self.__cache.write(data)
+        self.__cache.seek(pos)
+
+        data = self.__cache.read(size)
+        self.__pos = self.__cache.tell()
+        assert self.__pos == pos + len(data)
+        return data
+
+    def readline(self, size=-1):
+        if not self.__have_readline:
+            raise NotImplementedError("no readline method on wrapped object")
+
+        # line we're about to read might not be complete in the cache, so
+        # read another line first
+        pos = self.__pos
+        self.__cache.seek(0, 2)
+        data = self.wrapped.readline()
+        if not data:
+            self.read_complete = True
+        else:
+            self.__cache.write(data)
+        self.__cache.seek(pos)
+
+        data = self.__cache.readline()
+        if size != -1:
+            r = data[:size]
+            self.__pos = pos + len(r)
+        else:
+            r = data
+            self.__pos = pos+len(data)
+        return r
+
+    def readlines(self, sizehint=-1):
+        pos = self.__pos
+        self.__cache.seek(0, 2)
+        self.__cache.write(self.wrapped.read())
+        self.read_complete = True
+        self.__cache.seek(pos)
+        data = self.__cache.readlines(sizehint)
+        self.__pos = self.__cache.tell()
+        return data
+
+    def __iter__(self): return self
+    def next(self):
+        line = self.readline()
+        if line == "": raise StopIteration
+        return line
+
+    xreadlines = __iter__
+
+    def __repr__(self):
+        return ("<%s at %s (%d) whose wrapped object = %r>" %
+                (self.__class__.__name__, hex(id(self)), self.__pos,
+                 self.wrapped))
+
+
+class response_seek_wrapper(seek_wrapper):
+
+    """
+    Supports copying response objects and setting response body data.
+
+    """
+
+    def __init__(self, wrapped):
+        seek_wrapper.__init__(self, wrapped)
+        self._headers = self.wrapped.info()
+
+    def __copy__(self):
+        cpy = seek_wrapper.__copy__(self)
+        # copy headers from delegate
+        cpy._headers = copy.copy(self.info())
+        return cpy
+
+    def info(self):
+        return self._headers
+
+    def set_data(self, data):
+        self.seek(0)
+        self.read()
+        self.close()
+        cache = self._seek_wrapper__cache = StringIO()
+        cache.write(data)
+        self.seek(0)
+
+
+class eoffile:
+    # file-like object that always claims to be at end-of-file...
+    def read(self, size=-1): return ""
+    def readline(self, size=-1): return ""
+    def __iter__(self): return self
+    def next(self): return ""
+    def close(self): pass
+
+class eofresponse(eoffile):
+    def __init__(self, url, headers, code, msg):
+        self._url = url
+        self._headers = headers
+        self.code = code
+        self.msg = msg
+    def geturl(self): return self._url
+    def info(self): return self._headers
+
+
+class closeable_response:
+    """Avoids unnecessarily clobbering urllib.addinfourl methods on .close().
+
+    Only supports responses returned by mechanize.HTTPHandler.
+
+    After .close(), the following methods are supported:
+
+    .read()
+    .readline()
+    .info()
+    .geturl()
+    .__iter__()
+    .next()
+    .close()
+
+    and the following attributes are supported:
+
+    .code
+    .msg
+
+    Also supports pickling (but the stdlib currently does something to prevent
+    it: http://python.org/sf/1144636).
+
+    """
+    # presence of this attr indicates the response is usable after .close()
+    closeable_response = None
+
+    def __init__(self, fp, headers, url, code, msg):
+        self._set_fp(fp)
+        self._headers = headers
+        self._url = url
+        self.code = code
+        self.msg = msg
+
+    def _set_fp(self, fp):
+        self.fp = fp
+        self.read = self.fp.read
+        self.readline = self.fp.readline
+        if hasattr(self.fp, "readlines"): self.readlines = self.fp.readlines
+        if hasattr(self.fp, "fileno"):
+            self.fileno = self.fp.fileno
+        else:
+            self.fileno = lambda: None
+        self.__iter__ = self.fp.__iter__
+        self.next = self.fp.next
+
+    def __repr__(self):
+        return '<%s at %s whose fp = %r>' % (
+            self.__class__.__name__, hex(id(self)), self.fp)
+
+    def info(self):
+        return self._headers
+
+    def geturl(self):
+        return self._url
+
+    def close(self):
+        wrapped = self.fp
+        wrapped.close()
+        new_wrapped = eofresponse(
+            self._url, self._headers, self.code, self.msg)
+        self._set_fp(new_wrapped)
+
+    def __getstate__(self):
+        # There are three obvious options here:
+        # 1. truncate
+        # 2. read to end
+        # 3. close socket, pickle state including read position, then open
+        #    again on unpickle and use Range header
+        # XXXX um, 4. refuse to pickle unless .close()d.  This is better,
+        #  actually ("errors should never pass silently").  Pickling doesn't
+        #  work anyway ATM, because of http://python.org/sf/1144636 so fix
+        #  this later
+
+        # 2 breaks pickle protocol, because one expects the original object
+        # to be left unscathed by pickling.  3 is too complicated and
+        # surprising (and too much work ;-) to happen in a sane __getstate__.
+        # So we do 1.
+
+        state = self.__dict__.copy()
+        new_wrapped = eofresponse(
+            self._url, self._headers, self.code, self.msg)
+        state["wrapped"] = new_wrapped
+        return state
+
+def test_response(data='test data', headers=[],
+                  url="http://example.com/", code=200, msg="OK"):
+    return make_response(data, headers, url, code, msg)
+
+def test_html_response(data='test data', headers=[],
+                       url="http://example.com/", code=200, msg="OK"):
+    headers += [("Content-type", "text/html")]
+    return make_response(data, headers, url, code, msg)
+
+def make_response(data, headers, url, code, msg):
+    """Convenient factory for objects implementing response interface.
+
+    data: string containing response body data
+    headers: sequence of (name, value) pairs
+    url: URL of response
+    code: integer response code (e.g. 200)
+    msg: string response code message (e.g. "OK")
+
+    """
+    mime_headers = make_headers(headers)
+    r = closeable_response(StringIO(data), mime_headers, url, code, msg)
+    return response_seek_wrapper(r)
+
+def make_headers(headers):
+    """
+    headers: sequence of (name, value) pairs
+    """
+    hdr_text = []
+    for name_value in headers:
+        hdr_text.append("%s: %s" % name_value)
+    return mimetools.Message(StringIO("\n".join(hdr_text)))
+
+
+# Horrible, but needed, at least until we fork urllib2.  Even then, we may
+# want to preserve urllib2 compatibility.
+def upgrade_response(response):
+    """Return a copy of response that supports mechanize response interface.
+
+    Accepts responses from both mechanize and urllib2 handlers.
+    """
+    if isinstance(response, urllib2.HTTPError):
+        class httperror_seek_wrapper(response_seek_wrapper, response.__class__):
+            # this only derives from HTTPError in order to be a subclass --
+            # the HTTPError behaviour comes from delegation
+
+            def __init__(self, wrapped):
+                assert isinstance(wrapped, closeable_response), wrapped
+                response_seek_wrapper.__init__(self, wrapped)
+                # be compatible with undocumented HTTPError attributes :-(
+                self.hdrs = wrapped._headers
+                self.filename = wrapped._url
+
+            # we don't want the HTTPError implementation of these
+
+            def geturl(self):
+                return self.wrapped.geturl()
+
+            def close(self):
+                self.wrapped.close()
+        wrapper_class = httperror_seek_wrapper
+    else:
+        wrapper_class = response_seek_wrapper
+
+    if hasattr(response, "closeable_response"):
+        if not hasattr(response, "seek"):
+            response = wrapper_class(response)
+        return copy.copy(response)
+
+    # a urllib2 handler constructed the response, i.e. the response is an
+    # urllib.addinfourl or a urllib2.HTTPError, instead of a
+    # _Util.closeable_response as returned by e.g. mechanize.HTTPHandler
+    try:
+        code = response.code
+    except AttributeError:
+        code = None
+    try:
+        msg = response.msg
+    except AttributeError:
+        msg = None
+
+    # may have already-.read() data from .seek() cache
+    data = None
+    get_data = getattr(response, "get_data", None)
+    if get_data:
+        data = get_data()
+
+    response = closeable_response(
+        response.fp, response.info(), response.geturl(), code, msg)
+    response = wrapper_class(response)
+    if data:
+        response.set_data(data)
+    return response

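The factory functions at the bottom of _response.py make it easy to fabricate
responses for tests; a sketch of the seek and set_data() behaviour (not part
of this commit; the body text is invented):

    from mechanize._response import make_response

    r = make_response("hello world", [("Content-type", "text/plain")],
                      "http://example.com/", 200, "OK")
    assert r.read(5) == "hello"
    r.seek(0)                         # rewind via the seek_wrapper cache
    assert r.read() == "hello world"
    assert r.read_complete
    assert r.info()["Content-type"] == "text/plain"
    r.set_data("replaced body")       # swap body data, keep the headers
    assert r.get_data() == "replaced body"
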
Added: python-mechanize/branches/upstream/current/mechanize/_rfc3986.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_rfc3986.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_rfc3986.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,239 @@
+"""RFC 3986 URI parsing and relative reference resolution / absolutization.
+
+(aka splitting and joining)
+
+Copyright 2006 John J. Lee <jjl at pobox.com>
+
+This code is free software; you can redistribute it and/or modify it under
+the terms of the BSD or ZPL 2.1 licenses (see the file COPYING.txt
+included with the distribution).
+
+"""
+
+# XXX Wow, this is ugly.  Overly-direct translation of the RFC ATM.
+
+import sys, re, posixpath, urllib
+
+## def chr_range(a, b):
+##     return "".join(map(chr, range(ord(a), ord(b)+1)))
+
+## UNRESERVED_URI_CHARS = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
+##                         "abcdefghijklmnopqrstuvwxyz"
+##                         "0123456789"
+##                         "-_.~")
+## RESERVED_URI_CHARS = "!*'();:@&=+$,/?#[]"
+## URI_CHARS = RESERVED_URI_CHARS+UNRESERVED_URI_CHARS+'%'
+# this re matches any character that's not in URI_CHARS
+BAD_URI_CHARS_RE = re.compile("[^A-Za-z0-9\-_.~!*'();:@&=+$,/?%#[\]]")
+
+
+def clean_url(url, encoding):
+    # percent-encode illegal URI characters
+    # Trying to come up with test cases for this gave me a headache, revisit
+    # when we switch to unicode.
+    # Somebody else's comments (lost the attribution):
+##     - IE will return you the url in the encoding you send it
+##     - Mozilla/Firefox will send you latin-1 if there's no non latin-1
+##     characters in your link. It will send you utf-8 however if there are...
+    if type(url) == type(""):
+        url = url.decode(encoding, "replace")
+    url = url.strip()
+    # for second param to urllib.quote(), we want URI_CHARS, minus the
+    # 'always_safe' characters that urllib.quote() never percent-encodes
+    return urllib.quote(url.encode(encoding), "!*'();:@&=+$,/?%#[]~")
+
+def is_clean_uri(uri):
+    """
+    >>> is_clean_uri("ABC!")
+    True
+    >>> is_clean_uri(u"ABC!")
+    True
+    >>> is_clean_uri("ABC|")
+    False
+    >>> is_clean_uri(u"ABC|")
+    False
+    >>> is_clean_uri("http://example.com/0")
+    True
+    >>> is_clean_uri(u"http://example.com/0")
+    True
+    """
+    # note module re treats bytestrings as though they were decoded as latin-1
+    # so this function accepts both unicode and bytestrings
+    return not bool(BAD_URI_CHARS_RE.search(uri))
+
+
+SPLIT_MATCH = re.compile(
+    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?").match
+def urlsplit(absolute_uri):
+    """Return scheme, authority, path, query, fragment."""
+    match = SPLIT_MATCH(absolute_uri)
+    if match:
+        g = match.groups()
+        return g[1], g[3], g[4], g[6], g[8]
+
+def urlunsplit(parts):
+    scheme, authority, path, query, fragment = parts
+    r = []
+    append = r.append
+    if scheme is not None:
+        append(scheme)
+        append(":")
+    if authority is not None:
+        append("//")
+        append(authority)
+    append(path)
+    if query is not None:
+        append("?")
+        append(query)
+    if fragment is not None:
+        append("#")
+        append(fragment)
+    return "".join(r)
+
+def urljoin(base_uri, uri_reference):
+    return urlunsplit(urljoin_parts(urlsplit(base_uri),
+                                    urlsplit(uri_reference)))
+
+# oops, this doesn't do the same thing as the literal translation
+# from the RFC below
+## def urljoin_parts(base_parts, reference_parts):
+##     scheme, authority, path, query, fragment = base_parts
+##     rscheme, rauthority, rpath, rquery, rfragment = reference_parts
+
+##     # compute target URI path
+##     if rpath == "":
+##         tpath = path
+##     else:
+##         tpath = rpath
+##         if not tpath.startswith("/"):
+##             tpath = merge(authority, path, tpath)
+##         tpath = posixpath.normpath(tpath)
+
+##     if rscheme is not None:
+##         return (rscheme, rauthority, tpath, rquery, rfragment)
+##     elif rauthority is not None:
+##         return (scheme, rauthority, tpath, rquery, rfragment)
+##     elif rpath == "":
+##         if rquery is not None:
+##             tquery = rquery
+##         else:
+##             tquery = query
+##         return (scheme, authority, tpath, tquery, rfragment)
+##     else:
+##         return (scheme, authority, tpath, rquery, rfragment)
+
+def urljoin_parts(base_parts, reference_parts):
+    scheme, authority, path, query, fragment = base_parts
+    rscheme, rauthority, rpath, rquery, rfragment = reference_parts
+
+    if rscheme == scheme:
+        rscheme = None
+
+    if rscheme is not None:
+        tscheme, tauthority, tpath, tquery = (
+            rscheme, rauthority, remove_dot_segments(rpath), rquery)
+    else:
+        if rauthority is not None:
+            tauthority, tpath, tquery = (
+                rauthority, remove_dot_segments(rpath), rquery)
+        else:
+            if rpath == "":
+                tpath = path
+                if rquery is not None:
+                    tquery = rquery
+                else:
+                    tquery = query
+            else:
+                if rpath.startswith("/"):
+                    tpath = remove_dot_segments(rpath)
+                else:
+                    tpath = merge(authority, path, rpath)
+                    tpath = remove_dot_segments(tpath)
+                tquery = rquery
+            tauthority = authority
+        tscheme = scheme
+    tfragment = rfragment
+    return (tscheme, tauthority, tpath, tquery, tfragment)
+
+# um, something *vaguely* like this is what I want, but I have to generate
+# lots of test cases first, if only to understand what it is that
+# remove_dot_segments really does...
+## def remove_dot_segments(path):
+##     if path == '':
+##         return ''
+##     comps = path.split('/')
+##     new_comps = []
+##     for comp in comps:
+##         if comp in ['.', '']:
+##             if not new_comps or new_comps[-1]:
+##                 new_comps.append('')
+##             continue
+##         if comp != '..':
+##             new_comps.append(comp)
+##         elif new_comps:
+##             new_comps.pop()
+##     return '/'.join(new_comps)
+
+
+def remove_dot_segments(path):
+    r = []
+    while path:
+        # A
+        if path.startswith("../"):
+            path = path[3:]
+            continue
+        if path.startswith("./"):
+            path = path[2:]
+            continue
+        # B
+        if path.startswith("/./"):
+            path = path[2:]
+            continue
+        if path == "/.":
+            path = "/"
+            continue
+        # C
+        if path.startswith("/../"):
+            path = path[3:]
+            if r:
+                r.pop()
+            continue
+        if path == "/..":
+            path = "/"
+            r.pop()
+            continue
+        # D
+        if path == ".":
+            path = path[1:]
+            continue
+        if path == "..":
+            path = path[2:]
+            continue
+        # E
+        start = 0
+        if path.startswith("/"):
+            start = 1
+        ii = path.find("/", start)
+        if ii < 0:
+            ii = None
+        r.append(path[:ii])
+        if ii is None:
+            break
+        path = path[ii:]
+    return "".join(r)
+
+def merge(base_authority, base_path, ref_path):
+    # XXXX Oddly, the sample Perl implementation of this by Roy Fielding
+    # doesn't even take base_authority as a parameter, despite the wording in
+    # the RFC suggesting otherwise.  Perhaps I'm missing some obvious identity.
+    #if base_authority is not None and base_path == "":
+    if base_path == "":
+        return "/" + ref_path
+    ii = base_path.rfind("/")
+    if ii >= 0:
+        return base_path[:ii+1] + ref_path
+    return ref_path
+
+if __name__ == "__main__":
+    import doctest
+    doctest.testmod()
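
The new _rfc3986.py above gives mechanize its own splitting and joining
logic.  A quick sketch of how the added functions behave, using the
reference-resolution examples from RFC 3986 sections 5.2.4 and 5.4
(assuming the module imports as mechanize._rfc3986, the path added here):

    from mechanize._rfc3986 import urljoin, remove_dot_segments

    base = "http://a/b/c/d;p?q"
    assert urljoin(base, "g") == "http://a/b/c/g"
    assert urljoin(base, "../g") == "http://a/b/g"
    assert urljoin(base, "?y") == "http://a/b/c/d;p?y"
    # dot-segment removal on its own:
    assert remove_dot_segments("/a/b/c/./../../g") == "/a/g"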

Added: python-mechanize/branches/upstream/current/mechanize/_seek.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_seek.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_seek.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,11 @@
+from urllib2 import BaseHandler
+from _response import response_seek_wrapper
+
+
+class SeekableProcessor(BaseHandler):
+    """Make responses seekable."""
+
+    def any_response(self, request, response):
+        if not hasattr(response, "seek"):
+            return response_seek_wrapper(response)
+        return response
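
SeekableProcessor is now a one-screen handler in its own module.  A
minimal sketch of wiring it into an opener by hand (assuming the
top-level package re-exports build_opener and SeekableProcessor;
example.com is a placeholder URL):

    import mechanize

    opener = mechanize.build_opener(mechanize.SeekableProcessor)
    response = opener.open("http://example.com/")
    first = response.read(64)
    response.seek(0)  # rewind works because the handler wrapped the response
    assert response.read(64) == first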

Added: python-mechanize/branches/upstream/current/mechanize/_upgrade.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_upgrade.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_upgrade.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,34 @@
+from urllib2 import BaseHandler
+
+from _request import Request
+from _response import upgrade_response
+
+
+class HTTPRequestUpgradeProcessor(BaseHandler):
+    # upgrade urllib2.Request to this module's Request
+    # yuck!
+    handler_order = 0  # before anything else
+
+    def http_request(self, request):
+        if not hasattr(request, "add_unredirected_header"):
+            newrequest = Request(request._Request__original, request.data,
+                                 request.headers)
+            try: newrequest.origin_req_host = request.origin_req_host
+            except AttributeError: pass
+            try: newrequest.unverifiable = request.unverifiable
+            except AttributeError: pass
+            try: newrequest.visit = request.visit
+            except AttributeError: pass
+            request = newrequest
+        return request
+
+    https_request = http_request
+
+
+class ResponseUpgradeProcessor(BaseHandler):
+    # upgrade responses to be .close()able without becoming unusable
+    handler_order = 0  # before anything else
+    def any_response(self, request, response):
+        if not hasattr(response, 'closeable_response'):
+            response = upgrade_response(response)
+        return response
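
A sketch of what the request upgrade buys you (the processor is normally
installed for you by build_opener; constructing and calling it directly
here is only for illustration):

    import urllib2
    from mechanize._upgrade import HTTPRequestUpgradeProcessor

    plain = urllib2.Request("http://example.com/")
    request = HTTPRequestUpgradeProcessor().http_request(plain)
    # On Pythons whose urllib2.Request lacks add_unredirected_header, the
    # processor swaps in mechanize's Request; either way, the method is
    # guaranteed to exist afterwards:
    request.add_unredirected_header("X-Example", "1")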

Modified: python-mechanize/branches/upstream/current/mechanize/_urllib2.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_urllib2.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_urllib2.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -3,24 +3,25 @@
 from urllib2 import \
      URLError, \
      HTTPError, \
-     GopherError, \
+     GopherError
+# ...and from mechanize
+from _opener import OpenerDirector, \
+     build_opener, install_opener, urlopen
+from _auth import \
      HTTPPasswordMgr, \
      HTTPPasswordMgrWithDefaultRealm, \
      AbstractBasicAuthHandler, \
-     AbstractDigestAuthHandler
-# ...and from mechanize
-from _opener import OpenerDirector
-from _auth import \
+     AbstractDigestAuthHandler, \
      HTTPProxyPasswordMgr, \
      ProxyHandler, \
      ProxyBasicAuthHandler, \
      ProxyDigestAuthHandler, \
      HTTPBasicAuthHandler, \
-     HTTPDigestAuthHandler
-from _urllib2_support import \
-     Request, \
-     build_opener, install_opener, urlopen, \
-     OpenerFactory, urlretrieve, \
+     HTTPDigestAuthHandler, \
+     HTTPSClientCertMgr
+from _request import \
+     Request
+from _http import \
      RobotExclusionError
 
 # handlers...
@@ -34,20 +35,27 @@
      FileHandler, \
      GopherHandler
 # ...and from mechanize
-from _urllib2_support import \
+from _http import \
      HTTPHandler, \
      HTTPRedirectHandler, \
-     HTTPRequestUpgradeProcessor, \
      HTTPEquivProcessor, \
-     SeekableProcessor, \
      HTTPCookieProcessor, \
      HTTPRefererProcessor, \
      HTTPRefreshProcessor, \
      HTTPErrorProcessor, \
+     HTTPRobotRulesProcessor
+from _upgrade import \
+     HTTPRequestUpgradeProcessor, \
+     ResponseUpgradeProcessor
+from _debug import \
      HTTPResponseDebugProcessor, \
-     HTTPRedirectDebugProcessor, \
-     HTTPRobotRulesProcessor
+     HTTPRedirectDebugProcessor
+from _seek import \
+     SeekableProcessor
+# crap ATM
+## from _gzip import \
+##      HTTPGzipProcessor
 import httplib
 if hasattr(httplib, 'HTTPS'):
-    from _urllib2_support import HTTPSHandler
+    from _http import HTTPSHandler
 del httplib
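
Nothing user-visible should change from this shuffle: the same names are
still importable, they just live in smaller modules now.  A hedged sketch
(the exact top-level re-exports depend on mechanize/__init__.py, which
this revision also touches):

    # all of these continue to resolve after the removal of _urllib2_support:
    from mechanize import Request, build_opener, install_opener, urlopen
    from mechanize import HTTPEquivProcessor, HTTPRefreshProcessor
    from mechanize import SeekableProcessor, HTTPRobotRulesProcessor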

Deleted: python-mechanize/branches/upstream/current/mechanize/_urllib2_support.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_urllib2_support.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_urllib2_support.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,718 +0,0 @@
-"""Integration with Python standard library module urllib2.
-
-Also includes a redirection bugfix, support for parsing HTML HEAD blocks for
-the META HTTP-EQUIV tag contents, and following Refresh header redirects.
-
-Copyright 2002-2006 John J Lee <jjl at pobox.com>
-
-This code is free software; you can redistribute it and/or modify it
-under the terms of the BSD or ZPL 2.1 licenses (see the file
-COPYING.txt included with the distribution).
-
-"""
-
-import copy, time, tempfile, htmlentitydefs, re, logging, types, \
-       string, socket, urlparse, urllib2, urllib, httplib, sgmllib
-from urllib2 import URLError, HTTPError, BaseHandler
-from cStringIO import StringIO
-try:
-    import threading as _threading
-except ImportError:
-    import dummy_threading as _threading
-
-import _opener
-from _request import Request
-from _util import isstringlike, startswith, \
-     getheaders, closeable_response, response_seek_wrapper
-from _html import unescape, unescape_charref
-from _headersutil import is_html
-from _clientcookie import CookieJar, request_host
-
-debug = logging.getLogger("mechanize.cookies").debug
-
-
-CHUNK = 1024  # size of chunks fed to HTML HEAD parser, in bytes
-DEFAULT_ENCODING = 'latin-1'
-
-
-# This fixes a bug in urllib2 as of Python 2.1.3 and 2.2.2
-#  (http://www.python.org/sf/549151)
-# 2.2.3 is broken here (my fault!), 2.3 is fixed.
-class HTTPRedirectHandler(BaseHandler):
-    # maximum number of redirections to any single URL
-    # this is needed because of the state that cookies introduce
-    max_repeats = 4
-    # maximum total number of redirections (regardless of URL) before
-    # assuming we're in a loop
-    max_redirections = 10
-
-    # Implementation notes:
-
-    # To avoid the server sending us into an infinite loop, the request
-    # object needs to track what URLs we have already seen.  Do this by
-    # adding a handler-specific attribute to the Request object.  The value
-    # of the dict is used to count the number of times the same URL has
-    # been visited.  This is needed because visiting the same URL twice
-    # does not necessarily imply a loop, thanks to state introduced by
-    # cookies.
-
-    # Always unhandled redirection codes:
-    # 300 Multiple Choices: should not handle this here.
-    # 304 Not Modified: no need to handle here: only of interest to caches
-    #     that do conditional GETs
-    # 305 Use Proxy: probably not worth dealing with here
-    # 306 Unused: what was this for in the previous versions of protocol??
-
-    def redirect_request(self, newurl, req, fp, code, msg, headers):
-        """Return a Request or None in response to a redirect.
-
-        This is called by the http_error_30x methods when a redirection
-        response is received.  If a redirection should take place, return a
-        new Request to allow http_error_30x to perform the redirect;
-        otherwise, return None to indicate that an HTTPError should be
-        raised.
-
-        """
-        if code in (301, 302, 303, "refresh") or \
-               (code == 307 and not req.has_data()):
-            # Strictly (according to RFC 2616), 301 or 302 in response to
-            # a POST MUST NOT cause a redirection without confirmation
-            # from the user (of urllib2, in this case).  In practice,
-            # essentially all clients do redirect in this case, so we do
-            # the same.
-            return Request(newurl,
-                           headers=req.headers,
-                           origin_req_host=req.get_origin_req_host(),
-                           unverifiable=True)
-        else:
-            raise HTTPError(req.get_full_url(), code, msg, headers, fp)
-
-    def http_error_302(self, req, fp, code, msg, headers):
-        # Some servers (incorrectly) return multiple Location headers
-        # (so probably same goes for URI).  Use first header.
-        if headers.has_key('location'):
-            newurl = getheaders(headers, 'location')[0]
-        elif headers.has_key('uri'):
-            newurl = getheaders(headers, 'uri')[0]
-        else:
-            return
-        newurl = urlparse.urljoin(req.get_full_url(), newurl)
-
-        # XXX Probably want to forget about the state of the current
-        # request, although that might interact poorly with other
-        # handlers that also use handler-specific request attributes
-        new = self.redirect_request(newurl, req, fp, code, msg, headers)
-        if new is None:
-            return
-
-        # loop detection
-        # .redirect_dict has a key url if url was previously visited.
-        if hasattr(req, 'redirect_dict'):
-            visited = new.redirect_dict = req.redirect_dict
-            if (visited.get(newurl, 0) >= self.max_repeats or
-                len(visited) >= self.max_redirections):
-                raise HTTPError(req.get_full_url(), code,
-                                self.inf_msg + msg, headers, fp)
-        else:
-            visited = new.redirect_dict = req.redirect_dict = {}
-        visited[newurl] = visited.get(newurl, 0) + 1
-
-        # Don't close the fp until we are sure that we won't use it
-        # with HTTPError.  
-        fp.read()
-        fp.close()
-
-        return self.parent.open(new)
-
-    http_error_301 = http_error_303 = http_error_307 = http_error_302
-    http_error_refresh = http_error_302
-
-    inf_msg = "The HTTP server returned a redirect error that would " \
-              "lead to an infinite loop.\n" \
-              "The last 30x error message was:\n"
-
-
-class HTTPRequestUpgradeProcessor(BaseHandler):
-    # upgrade urllib2.Request to this module's Request
-    # yuck!
-    handler_order = 0  # before anything else
-
-    def http_request(self, request):
-        if not hasattr(request, "add_unredirected_header"):
-            newrequest = Request(request._Request__original, request.data,
-                                 request.headers)
-            try: newrequest.origin_req_host = request.origin_req_host
-            except AttributeError: pass
-            try: newrequest.unverifiable = request.unverifiable
-            except AttributeError: pass
-            request = newrequest
-        return request
-
-    https_request = http_request
-
-# XXX would self.reset() work, instead of raising this exception?
-class EndOfHeadError(Exception): pass
-class AbstractHeadParser:
-    # only these elements are allowed in or before HEAD of document
-    head_elems = ("html", "head",
-                  "title", "base",
-                  "script", "style", "meta", "link", "object")
-    _entitydefs = htmlentitydefs.name2codepoint
-    _encoding = DEFAULT_ENCODING
-
-    def __init__(self):
-        self.http_equiv = []
-
-    def start_meta(self, attrs):
-        http_equiv = content = None
-        for key, value in attrs:
-            if key == "http-equiv":
-                http_equiv = self.unescape_attr_if_required(value)
-            elif key == "content":
-                content = self.unescape_attr_if_required(value)
-        if http_equiv is not None:
-            self.http_equiv.append((http_equiv, content))
-
-    def end_head(self):
-        raise EndOfHeadError()
-
-    def handle_entityref(self, name):
-        #debug("%s", name)
-        self.handle_data(unescape(
-            '&%s;' % name, self._entitydefs, self._encoding))
-
-    def handle_charref(self, name):
-        #debug("%s", name)
-        self.handle_data(unescape_charref(name, self._encoding))
-
-    def unescape_attr(self, name):
-        #debug("%s", name)
-        return unescape(name, self._entitydefs, self._encoding)
-
-    def unescape_attrs(self, attrs):
-        #debug("%s", attrs)
-        escaped_attrs = {}
-        for key, val in attrs.items():
-            escaped_attrs[key] = self.unescape_attr(val)
-        return escaped_attrs
-
-    def unknown_entityref(self, ref):
-        self.handle_data("&%s;" % ref)
-
-    def unknown_charref(self, ref):
-        self.handle_data("&#%s;" % ref)
-
-
-try:
-    import HTMLParser
-except ImportError:
-    pass
-else:
-    class XHTMLCompatibleHeadParser(AbstractHeadParser,
-                                    HTMLParser.HTMLParser):
-        def __init__(self):
-            HTMLParser.HTMLParser.__init__(self)
-            AbstractHeadParser.__init__(self)
-
-        def handle_starttag(self, tag, attrs):
-            if tag not in self.head_elems:
-                raise EndOfHeadError()
-            try:
-                method = getattr(self, 'start_' + tag)
-            except AttributeError:
-                try:
-                    method = getattr(self, 'do_' + tag)
-                except AttributeError:
-                    pass # unknown tag
-                else:
-                    method(attrs)
-            else:
-                method(attrs)
-
-        def handle_endtag(self, tag):
-            if tag not in self.head_elems:
-                raise EndOfHeadError()
-            try:
-                method = getattr(self, 'end_' + tag)
-            except AttributeError:
-                pass # unknown tag
-            else:
-                method()
-
-        def unescape(self, name):
-            # Use the entitydefs passed into constructor, not
-            # HTMLParser.HTMLParser's entitydefs.
-            return self.unescape_attr(name)
-
-        def unescape_attr_if_required(self, name):
-            return name  # HTMLParser.HTMLParser already did it
-
-class HeadParser(AbstractHeadParser, sgmllib.SGMLParser):
-
-    def _not_called(self):
-        assert False
-
-    def __init__(self):
-        sgmllib.SGMLParser.__init__(self)
-        AbstractHeadParser.__init__(self)
-
-    def handle_starttag(self, tag, method, attrs):
-        if tag not in self.head_elems:
-            raise EndOfHeadError()
-        if tag == "meta":
-            method(attrs)
-
-    def unknown_starttag(self, tag, attrs):
-        self.handle_starttag(tag, self._not_called, attrs)
-
-    def handle_endtag(self, tag, method):
-        if tag in self.head_elems:
-            method()
-        else:
-            raise EndOfHeadError()
-
-    def unescape_attr_if_required(self, name):
-        return self.unescape_attr(name)
-
-def parse_head(fileobj, parser):
-    """Return a list of key, value pairs."""
-    while 1:
-        data = fileobj.read(CHUNK)
-        try:
-            parser.feed(data)
-        except EndOfHeadError:
-            break
-        if len(data) != CHUNK:
-            # this should only happen if there is no HTML body, or if
-            # CHUNK is big
-            break
-    return parser.http_equiv
-
-class HTTPEquivProcessor(BaseHandler):
-    """Append META HTTP-EQUIV headers to regular HTTP headers."""
-
-    handler_order = 300  # before handlers that look at HTTP headers
-
-    def __init__(self, head_parser_class=HeadParser,
-                 i_want_broken_xhtml_support=False,
-                 ):
-        self.head_parser_class = head_parser_class
-        self._allow_xhtml = i_want_broken_xhtml_support
-
-    def http_response(self, request, response):
-        if not hasattr(response, "seek"):
-            response = response_seek_wrapper(response)
-        headers = response.info()
-        url = response.geturl()
-        ct_hdrs = getheaders(response.info(), "content-type")
-        if is_html(ct_hdrs, url, self._allow_xhtml):
-            try:
-                try:
-                    html_headers = parse_head(response, self.head_parser_class())
-                finally:
-                    response.seek(0)
-            except (HTMLParser.HTMLParseError,
-                    sgmllib.SGMLParseError):
-                pass
-            else:
-                for hdr, val in html_headers:
-                    # rfc822.Message interprets this as appending, not clobbering
-                    headers[hdr] = val
-        return response
-
-    https_response = http_response
-
-class SeekableProcessor(BaseHandler):
-    """Make responses seekable."""
-
-    def any_response(self, request, response):
-        if not hasattr(response, "seek"):
-            return response_seek_wrapper(response)
-        return response
-
-class HTTPCookieProcessor(BaseHandler):
-    """Handle HTTP cookies.
-
-    Public attributes:
-
-    cookiejar: CookieJar instance
-
-    """
-    def __init__(self, cookiejar=None):
-        if cookiejar is None:
-            cookiejar = CookieJar()
-        self.cookiejar = cookiejar
-
-    def http_request(self, request):
-        self.cookiejar.add_cookie_header(request)
-        return request
-
-    def http_response(self, request, response):
-        self.cookiejar.extract_cookies(response, request)
-        return response
-
-    https_request = http_request
-    https_response = http_response
-
-try:
-    import robotparser
-except ImportError:
-    pass
-else:
-    class RobotExclusionError(urllib2.HTTPError):
-        def __init__(self, request, *args):
-            apply(urllib2.HTTPError.__init__, (self,)+args)
-            self.request = request
-
-    class HTTPRobotRulesProcessor(BaseHandler):
-        # before redirections, after everything else
-        handler_order = 800
-
-        try:
-            from httplib import HTTPMessage
-        except:
-            from mimetools import Message
-            http_response_class = Message
-        else:
-            http_response_class = HTTPMessage
-
-        def __init__(self, rfp_class=robotparser.RobotFileParser):
-            self.rfp_class = rfp_class
-            self.rfp = None
-            self._host = None
-
-        def http_request(self, request):
-            host = request.get_host()
-            scheme = request.get_type()
-            if host != self._host:
-                self.rfp = self.rfp_class()
-                self.rfp.set_url(scheme+"://"+host+"/robots.txt")
-                self.rfp.read()
-                self._host = host
-
-            ua = request.get_header("User-agent", "")
-            if self.rfp.can_fetch(ua, request.get_full_url()):
-                return request
-            else:
-                msg = "request disallowed by robots.txt"
-                raise RobotExclusionError(
-                    request,
-                    request.get_full_url(),
-                    403, msg,
-                    self.http_response_class(StringIO()), StringIO(msg))
-
-        https_request = http_request
-
-class HTTPRefererProcessor(BaseHandler):
-    """Add Referer header to requests.
-
-    This only makes sense if you use each RefererProcessor for a single
-    chain of requests only (so, for example, if you use a single
-    HTTPRefererProcessor to fetch a series of URLs extracted from a single
-    page, this will break).
-
-    There's a proper implementation of this in module mechanize.
-
-    """
-    def __init__(self):
-        self.referer = None
-
-    def http_request(self, request):
-        if ((self.referer is not None) and
-            not request.has_header("Referer")):
-            request.add_unredirected_header("Referer", self.referer)
-        return request
-
-    def http_response(self, request, response):
-        self.referer = response.geturl()
-        return response
-
-    https_request = http_request
-    https_response = http_response
-
-class HTTPResponseDebugProcessor(BaseHandler):
-    handler_order = 900  # before redirections, after everything else
-
-    def http_response(self, request, response):
-        if not hasattr(response, "seek"):
-            response = response_seek_wrapper(response)
-        info = getLogger("mechanize.http_responses").info
-        try:
-            info(response.read())
-        finally:
-            response.seek(0)
-        info("*****************************************************")
-        return response
-
-    https_response = http_response
-
-class HTTPRedirectDebugProcessor(BaseHandler):
-    def http_request(self, request):
-        if hasattr(request, "redirect_dict"):
-            info = getLogger("mechanize.http_redirects").info
-            info("redirecting to %s", request.get_full_url())
-        return request
-
-class HTTPRefreshProcessor(BaseHandler):
-    """Perform HTTP Refresh redirections.
-
-    Note that if a non-200 HTTP code has occurred (for example, a 30x
-    redirect), this processor will do nothing.
-
-    By default, only zero-time Refresh headers are redirected.  Use the
-    max_time attribute / constructor argument to allow Refresh with longer
-    pauses.  Use the honor_time attribute / constructor argument to control
-    whether the requested pause is honoured (with a time.sleep()) or
-    skipped in favour of immediate redirection.
-
-    Public attributes:
-
-    max_time: see above
-    honor_time: see above
-
-    """
-    handler_order = 1000
-
-    def __init__(self, max_time=0, honor_time=True):
-        self.max_time = max_time
-        self.honor_time = honor_time
-
-    def http_response(self, request, response):
-        code, msg, hdrs = response.code, response.msg, response.info()
-
-        if code == 200 and hdrs.has_key("refresh"):
-            refresh = getheaders(hdrs, "refresh")[0]
-            ii = string.find(refresh, ";")
-            if ii != -1:
-                pause, newurl_spec = float(refresh[:ii]), refresh[ii+1:]
-                jj = string.find(newurl_spec, "=")
-                if jj != -1:
-                    key, newurl = newurl_spec[:jj], newurl_spec[jj+1:]
-                if key.strip().lower() != "url":
-                    debug("bad Refresh header: %r" % refresh)
-                    return response
-            else:
-                pause, newurl = float(refresh), response.geturl()
-            if (self.max_time is None) or (pause <= self.max_time):
-                if pause > 1E-3 and self.honor_time:
-                    time.sleep(pause)
-                hdrs["location"] = newurl
-                # hardcoded http is NOT a bug
-                response = self.parent.error(
-                    "http", request, response,
-                    "refresh", msg, hdrs)
-
-        return response
-
-    https_response = http_response
-
-class HTTPErrorProcessor(BaseHandler):
-    """Process HTTP error responses.
-
-    The purpose of this handler is to to allow other response processors a
-    look-in by removing the call to parent.error() from
-    AbstractHTTPHandler.
-
-    For non-200 error codes, this just passes the job on to the
-    Handler.<proto>_error_<code> methods, via the OpenerDirector.error
-    method.  Eventually, urllib2.HTTPDefaultErrorHandler will raise an
-    HTTPError if no other handler handles the error.
-
-    """
-    handler_order = 1000  # after all other processors
-
-    def http_response(self, request, response):
-        code, msg, hdrs = response.code, response.msg, response.info()
-
-        if code != 200:
-            # hardcoded http is NOT a bug
-            response = self.parent.error(
-                "http", request, response, code, msg, hdrs)
-
-        return response
-
-    https_response = http_response
-
-
-class AbstractHTTPHandler(BaseHandler):
-
-    def __init__(self, debuglevel=0):
-        self._debuglevel = debuglevel
-
-    def set_http_debuglevel(self, level):
-        self._debuglevel = level
-
-    def do_request_(self, request):
-        host = request.get_host()
-        if not host:
-            raise URLError('no host given')
-
-        if request.has_data():  # POST
-            data = request.get_data()
-            if not request.has_header('Content-type'):
-                request.add_unredirected_header(
-                    'Content-type',
-                    'application/x-www-form-urlencoded')
-
-        scheme, sel = urllib.splittype(request.get_selector())
-        sel_host, sel_path = urllib.splithost(sel)
-        if not request.has_header('Host'):
-            request.add_unredirected_header('Host', sel_host or host)
-        for name, value in self.parent.addheaders:
-            name = string.capitalize(name)
-            if not request.has_header(name):
-                request.add_unredirected_header(name, value)
-
-        return request
-
-    def do_open(self, http_class, req):
-        """Return an addinfourl object for the request, using http_class.
-
-        http_class must implement the HTTPConnection API from httplib.
-        The addinfourl return value is a file-like object.  It also
-        has methods and attributes including:
-            - info(): return a mimetools.Message object for the headers
-            - geturl(): return the original request URL
-            - code: HTTP status code
-        """
-        host = req.get_host()
-        if not host:
-            raise URLError('no host given')
-
-        h = http_class(host) # will parse host:port
-        h.set_debuglevel(self._debuglevel)
-
-        headers = req.headers.copy()
-        headers.update(req.unredirected_hdrs)
-        # We want to make an HTTP/1.1 request, but the addinfourl
-        # class isn't prepared to deal with a persistent connection.
-        # It will try to read all remaining data from the socket,
-        # which will block while the server waits for the next request.
-        # So make sure the connection gets closed after the (only)
-        # request.
-        headers["Connection"] = "close"
-        try:
-            h.request(req.get_method(), req.get_selector(), req.data, headers)
-            r = h.getresponse()
-        except socket.error, err: # XXX what error?
-            raise URLError(err)
-
-        # Pick apart the HTTPResponse object to get the addinfourl
-        # object initialized properly.
-
-        # Wrap the HTTPResponse object in socket's file object adapter
-        # for Windows.  That adapter calls recv(), so delegate recv()
-        # to read().  This weird wrapping allows the returned object to
-        # have readline() and readlines() methods.
-
-        # XXX It might be better to extract the read buffering code
-        # out of socket._fileobject() and into a base class.
-
-        r.recv = r.read
-        fp = socket._fileobject(r, 'rb', -1)
-
-        resp = closeable_response(fp, r.msg, req.get_full_url(),
-                                  r.status, r.reason)
-        return resp
-
-
-class HTTPHandler(AbstractHTTPHandler):
-    def http_open(self, req):
-        return self.do_open(httplib.HTTPConnection, req)
-
-    http_request = AbstractHTTPHandler.do_request_
-
-if hasattr(httplib, 'HTTPS'):
-    class HTTPSHandler(AbstractHTTPHandler):
-        def https_open(self, req):
-            return self.do_open(httplib.HTTPSConnection, req)
-
-        https_request = AbstractHTTPHandler.do_request_
-
-class OpenerFactory:
-    """This class's interface is quite likely to change."""
-
-    default_classes = [
-        # handlers
-        urllib2.ProxyHandler,
-        urllib2.UnknownHandler,
-        HTTPHandler,  # from this module (derived from new AbstractHTTPHandler)
-        urllib2.HTTPDefaultErrorHandler,
-        HTTPRedirectHandler,  # from this module (bugfixed)
-        urllib2.FTPHandler,
-        urllib2.FileHandler,
-        # processors
-        HTTPRequestUpgradeProcessor,
-        HTTPCookieProcessor,
-        HTTPErrorProcessor
-        ]
-    handlers = []
-    replacement_handlers = []
-
-    def __init__(self, klass=_opener.OpenerDirector):
-        self.klass = klass
-
-    def build_opener(self, *handlers):
-        """Create an opener object from a list of handlers and processors.
-
-        The opener will use several default handlers and processors, including
-        support for HTTP and FTP.
-
-        If any of the handlers passed as arguments are subclasses of the
-        default handlers, the default handlers will not be used.
-
-        """
-        opener = self.klass()
-        default_classes = list(self.default_classes)
-        if hasattr(httplib, 'HTTPS'):
-            default_classes.append(HTTPSHandler)
-        skip = []
-        for klass in default_classes:
-            for check in handlers:
-                if type(check) == types.ClassType:
-                    if issubclass(check, klass):
-                        skip.append(klass)
-                elif type(check) == types.InstanceType:
-                    if isinstance(check, klass):
-                        skip.append(klass)
-        for klass in skip:
-            default_classes.remove(klass)
-
-        for klass in default_classes:
-            opener.add_handler(klass())
-        for h in handlers:
-            if type(h) == types.ClassType:
-                h = h()
-            opener.add_handler(h)
-
-        return opener
-
-build_opener = OpenerFactory().build_opener
-
-_opener = None
-urlopen_lock = _threading.Lock()
-def urlopen(url, data=None):
-    global _opener
-    if _opener is None:
-        urlopen_lock.acquire()
-        try:
-            if _opener is None:
-                _opener = build_opener()
-        finally:
-            urlopen_lock.release()
-    return _opener.open(url, data)
-
-def urlretrieve(url, filename=None, reporthook=None, data=None):
-    global _opener
-    if _opener is None:
-        urlopen_lock.acquire()
-        try:
-            if _opener is None:
-                _opener = build_opener()
-        finally:
-            urlopen_lock.release()
-    return _opener.retrieve(url, filename, reporthook, data)
-
-def install_opener(opener):
-    global _opener
-    _opener = opener

Modified: python-mechanize/branches/upstream/current/mechanize/_useragent.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_useragent.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_useragent.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -35,18 +35,24 @@
     https_request = http_request
 
 
-class UserAgent(OpenerDirector):
+class UserAgentBase(OpenerDirector):
     """Convenient user-agent class.
 
     Do not use .add_handler() to add a handler for something already dealt with
     by this code.
 
+    The only reason at present for the distinction between UserAgent and
+    UserAgentBase is so that classes that depend on .seek()able responses
+    (e.g. mechanize.Browser) can inherit from UserAgentBase.  The subclass
+    UserAgent exposes a .set_seekable_responses() method that allows switching
+    off the adding of a .seek() method to responses.
+
     Public attributes:
 
     addheaders: list of (name, value) pairs specifying headers to send with
      every request, unless they are overridden in the Request instance.
 
-     >>> ua = UserAgent()
+     >>> ua = UserAgentBase()
      >>> ua.addheaders = [
      ...  ("User-agent", "Mozilla/5.0 (compatible)"),
      ...  ("From", "responsible.person at example.com")]
@@ -130,6 +136,10 @@
             ppm = _auth.HTTPProxyPasswordMgr()
         self.set_password_manager(pm)
         self.set_proxy_password_manager(ppm)
+        # set default certificate manager
+        if "https" in ua_handlers:
+            cm = _urllib2.HTTPSClientCertMgr()
+            self.set_client_cert_manager(cm)
 
         # special case, requires extra support from mechanize.Browser
         self._handle_referer = True
@@ -200,6 +210,25 @@
         self._proxy_password_manager.add_password(
             realm, hostport, user, password)
 
+    def add_client_certificate(self, url, key_file, cert_file):
+        """Add an SSL client certificate, for HTTPS client auth.
+
+        key_file and cert_file must be filenames of the key and certificate
+        files, in PEM format.  You can use e.g. OpenSSL to convert a p12 (PKCS
+        12) file to PEM format:
+
+        openssl pkcs12 -clcerts -nokeys -in cert.p12 -out cert.pem
+        openssl pkcs12 -nocerts -in cert.p12 -out key.pem
+
+
+        Note that client certificate password input is currently very
+        inflexible: it seems to be console-only, which is presumably the
+        default behaviour of libopenssl.  In future mechanize may support
+        third-party libraries that (I assume) allow more options here.
+
+        """
+        self._client_cert_manager.add_key_cert(url, key_file, cert_file)
+
     # the following are rarely useful -- use add_password / add_proxy_password
     # instead
     def set_password_manager(self, password_manager):
@@ -212,6 +241,11 @@
         self._proxy_password_manager = password_manager
         self._set_handler("_proxy_basicauth", obj=password_manager)
         self._set_handler("_proxy_digestauth", obj=password_manager)
+    def set_client_cert_manager(self, cert_manager):
+        """Set a mechanize.HTTPClientCertMgr, or None."""
+        self._client_cert_manager = cert_manager
+        handler = self._ua_handlers["https"]
+        handler.client_cert_manager = cert_manager
 
     # these methods all take a boolean parameter
     def set_handle_robots(self, handle):
@@ -321,3 +355,10 @@
         if newhandler is not None:
             self.add_handler(newhandler)
             self._ua_handlers[name] = newhandler
+
+
+class UserAgent(UserAgentBase):
+
+    def set_seekable_responses(self, handle):
+        """Make response objects .seek()able."""
+        self._set_handler("_seek", handle)
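
Putting the _useragent.py additions together, a minimal sketch of the new
client-certificate and seekability hooks (key.pem, cert.pem and
example.com are placeholders; see the add_client_certificate docstring
above for converting PKCS 12 files to PEM):

    import mechanize

    ua = mechanize.UserAgent()
    ua.add_client_certificate("https://example.com/", "key.pem", "cert.pem")
    ua.set_seekable_responses(False)  # the new UserAgent-only switch
    response = ua.open("https://example.com/")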

Modified: python-mechanize/branches/upstream/current/mechanize/_util.py
===================================================================
--- python-mechanize/branches/upstream/current/mechanize/_util.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize/_util.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,4 +1,4 @@
-"""Python backwards-compat., date/time routines, seekable file object wrapper.
+"""Utility functions and date/time routines.
 
  Copyright 2002-2006 John J Lee <jjl at pobox.com>
 
@@ -8,32 +8,13 @@
 
 """
 
-import re, string, time, copy, urllib, mimetools
-from types import TupleType
-from cStringIO import StringIO
+import re, string, time
 
-def startswith(string, initial):
-    if len(initial) > len(string): return False
-    return string[:len(initial)] == initial
-
-def endswith(string, final):
-    if len(final) > len(string): return False
-    return string[-len(final):] == final
-
 def isstringlike(x):
     try: x+""
     except: return False
     else: return True
 
-SPACE_DICT = {}
-for c in string.whitespace:
-    SPACE_DICT[c] = None
-del c
-def isspace(string):
-    for c in string:
-        if not SPACE_DICT.has_key(c): return False
-    return True
-
 ## def caller():
 ##     try:
 ##         raise SyntaxError
@@ -42,33 +23,6 @@
 ##     return sys.exc_traceback.tb_frame.f_back.f_back.f_code.co_name
 
 
-# this is here rather than in _HeadersUtil as it's just for
-# compatibility with old Python versions, rather than entirely new code
-def getheaders(msg, name):
-    """Get all values for a header.
-
-    This returns a list of values for headers given more than once; each
-    value in the result list is stripped in the same way as the result of
-    getheader().  If the header is not given, return an empty list.
-    """
-    result = []
-    current = ''
-    have_header = 0
-    for s in msg.getallmatchingheaders(name):
-        if isspace(s[0]):
-            if current:
-                current = "%s\n %s" % (current, string.strip(s))
-            else:
-                current = string.strip(s)
-        else:
-            if have_header:
-                result.append(current)
-            current = string.strip(s[string.find(s, ":") + 1:])
-            have_header = 1
-    if have_header:
-        result.append(current)
-    return result
-
 from calendar import timegm
 
 # Date/time conversion routines for formats used by the HTTP protocol.
@@ -86,7 +40,7 @@
 months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
 months_lower = []
-for month in months: months_lower.append(string.lower(month))
+for month in months: months_lower.append(month.lower())
 
 
 def time2isoz(t=None):
@@ -144,7 +98,7 @@
     # translate month name to number
     # month numbers start with 1 (January)
     try:
-        mon = months_lower.index(string.lower(mon))+1
+        mon = months_lower.index(mon.lower())+1
     except ValueError:
         # maybe it's already a number
         try:
@@ -185,7 +139,7 @@
         # adjust time using timezone string, to get absolute time since epoch
         if tz is None:
             tz = "UTC"
-        tz = string.upper(tz)
+        tz = tz.upper()
         offset = offset_from_tz_string(tz)
         if offset is None:
             return None
@@ -247,7 +201,7 @@
     m = strict_re.search(text)
     if m:
         g = m.groups()
-        mon = months_lower.index(string.lower(g[1])) + 1
+        mon = months_lower.index(g[1].lower()) + 1
         tt = (int(g[2]), mon, int(g[0]),
               int(g[3]), int(g[4]), float(g[5]))
         return my_timegm(tt)
@@ -255,7 +209,7 @@
     # No, we need some messy parsing...
 
     # clean up
-    text = string.lstrip(text)
+    text = text.lstrip()
     text = wkday_re.sub("", text, 1)  # Useless weekday
 
     # tz is time zone specifier string
@@ -300,7 +254,7 @@
 
     """
     # clean up
-    text = string.lstrip(text)
+    text = text.lstrip()
 
     # tz is time zone specifier string
     day, mon, yr, hr, min, sec, tz = [None]*7
@@ -315,336 +269,3 @@
         return None  # bad format
 
     return _str2time(day, mon, yr, hr, min, sec, tz)
-
-
-# XXX Andrew Dalke kindly sent me a similar class in response to my request on
-# comp.lang.python, which I then proceeded to lose.  I wrote this class
-# instead, but I think he's released his code publicly since, could pinch the
-# tests from it, at least...
-
-# For testing seek_wrapper invariant (note that
-# test_urllib2.HandlerTest.test_seekable is expected to fail when this
-# invariant checking is turned on).  The invariant checking is done by module
-# ipdc, which is available here:
-# http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/436834
-## from ipdbc import ContractBase
-## class seek_wrapper(ContractBase):
-class seek_wrapper:
-    """Adds a seek method to a file object.
-
-    This is only designed for seeking on readonly file-like objects.
-
-    Wrapped file-like object must have a read method.  The readline method is
-    only supported if that method is present on the wrapped object.  The
-    readlines method is always supported.  xreadlines and iteration are
-    supported only for Python 2.2 and above.
-
-    Public attribute: wrapped (the wrapped file object).
-
-    WARNING: All other attributes of the wrapped object (ie. those that are not
-    one of wrapped, read, readline, readlines, xreadlines, __iter__ and next)
-    are passed through unaltered, which may or may not make sense for your
-    particular file object.
-
-    """
-    # General strategy is to check that cache is full enough, then delegate to
-    # the cache (self.__cache, which is a cStringIO.StringIO instance).  A seek
-    # position (self.__pos) is maintained independently of the cache, in order
-    # that a single cache may be shared between multiple seek_wrapper objects.
-    # Copying using module copy shares the cache in this way.
-
-    def __init__(self, wrapped):
-        self.wrapped = wrapped
-        self.__have_readline = hasattr(self.wrapped, "readline")
-        self.__cache = StringIO()
-        self.__pos = 0  # seek position
-
-    def invariant(self):
-        # The end of the cache is always at the same place as the end of the
-        # wrapped file.
-        return self.wrapped.tell() == len(self.__cache.getvalue())
-
-    def __getattr__(self, name):
-        wrapped = self.__dict__.get("wrapped")
-        if wrapped:
-            return getattr(wrapped, name)
-        return getattr(self.__class__, name)
-
-    def seek(self, offset, whence=0):
-        assert whence in [0,1,2]
-
-        # how much data, if any, do we need to read?
-        if whence == 2:  # 2: relative to end of *wrapped* file
-            if offset < 0: raise ValueError("negative seek offset")
-            # since we don't know yet where the end of that file is, we must
-            # read everything
-            to_read = None
-        else:
-            if whence == 0:  # 0: absolute
-                if offset < 0: raise ValueError("negative seek offset")
-                dest = offset
-            else:  # 1: relative to current position
-                pos = self.__pos
-                if pos < offset:
-                    raise ValueError("seek to before start of file")
-                dest = pos + offset
-            end = len(self.__cache.getvalue())
-            to_read = dest - end
-            if to_read < 0:
-                to_read = 0
-
-        if to_read != 0:
-            self.__cache.seek(0, 2)
-            if to_read is None:
-                assert whence == 2
-                self.__cache.write(self.wrapped.read())
-                self.__pos = self.__cache.tell() - offset
-            else:
-                self.__cache.write(self.wrapped.read(to_read))
-                # Don't raise an exception even if we've seek()ed past the end
-                # of .wrapped, since fseek() doesn't complain in that case.
-                # Also like fseek(), pretend we have seek()ed past the end,
-                # i.e. not:
-                #self.__pos = self.__cache.tell()
-                # but rather:
-                self.__pos = dest
-        else:
-            self.__pos = dest
-
-    def tell(self):
-        return self.__pos
-
-    def __copy__(self):
-        cpy = self.__class__(self.wrapped)
-        cpy.__cache = self.__cache
-        return cpy
-
-    def get_data(self):
-        pos = self.__pos
-        try:
-            self.seek(0)
-            return self.read(-1)
-        finally:
-            self.__pos = pos
-
-    def read(self, size=-1):
-        pos = self.__pos
-        end = len(self.__cache.getvalue())
-        available = end - pos
-
-        # enough data already cached?
-        if size <= available and size != -1:
-            self.__cache.seek(pos)
-            self.__pos = pos+size
-            return self.__cache.read(size)
-
-        # no, so read sufficient data from wrapped file and cache it
-        self.__cache.seek(0, 2)
-        if size == -1:
-            self.__cache.write(self.wrapped.read())
-        else:
-            to_read = size - available
-            assert to_read > 0
-            self.__cache.write(self.wrapped.read(to_read))
-        self.__cache.seek(pos)
-
-        data = self.__cache.read(size)
-        self.__pos = self.__cache.tell()
-        assert self.__pos == pos + len(data)
-        return data
-
-    def readline(self, size=-1):
-        if not self.__have_readline:
-            raise NotImplementedError("no readline method on wrapped object")
-
-        # line we're about to read might not be complete in the cache, so
-        # read another line first
-        pos = self.__pos
-        self.__cache.seek(0, 2)
-        self.__cache.write(self.wrapped.readline())
-        self.__cache.seek(pos)
-
-        data = self.__cache.readline()
-        if size != -1:
-            r = data[:size]
-            self.__pos = pos+size
-        else:
-            r = data
-            self.__pos = pos+len(data)
-        return r
-
-    def readlines(self, sizehint=-1):
-        pos = self.__pos
-        self.__cache.seek(0, 2)
-        self.__cache.write(self.wrapped.read())
-        self.__cache.seek(pos)
-        data = self.__cache.readlines(sizehint)
-        self.__pos = self.__cache.tell()
-        return data
-
-    def __iter__(self): return self
-    def next(self):
-        line = self.readline()
-        if line == "": raise StopIteration
-        return line
-
-    xreadlines = __iter__
-
-    def __repr__(self):
-        return ("<%s at %s whose wrapped object = %r>" %
-                (self.__class__.__name__, hex(id(self)), self.wrapped))
-
-
-class response_seek_wrapper(seek_wrapper):
-
-    """
-    Supports copying response objects and setting response body data.
-
-    """
-
-    def __init__(self, wrapped):
-        seek_wrapper.__init__(self, wrapped)
-        self._headers = self.wrapped.info()
-
-    def __copy__(self):
-        cpy = seek_wrapper.__copy__(self)
-        # copy headers from delegate
-        cpy._headers = copy.copy(self.info())
-        return cpy
-
-    def info(self):
-        return self._headers
-
-    def set_data(self, data):
-        self.seek(0)
-        self.read()
-        self.close()
-        cache = self._seek_wrapper__cache = StringIO()
-        cache.write(data)
-        self.seek(0)
-
-
-class eoffile:
-    # file-like object that always claims to be at end-of-file...
-    def read(self, size=-1): return ""
-    def readline(self, size=-1): return ""
-    def __iter__(self): return self
-    def next(self): return ""
-    def close(self): pass
-
-class eofresponse(eoffile):
-    def __init__(self, url, headers, code, msg):
-        self._url = url
-        self._headers = headers
-        self.code = code
-        self.msg = msg
-    def geturl(self): return self._url
-    def info(self): return self._headers
-
-
-class closeable_response:
-    """Avoids unnecessarily clobbering urllib.addinfourl methods on .close().
-
-    Only supports responses returned by mechanize.HTTPHandler.
-
-    After .close(), the following methods are supported:
-
-    .read()
-    .readline()
-    .readlines()
-    .seek()
-    .tell()
-    .info()
-    .geturl()
-    .__iter__()
-    .next()
-    .close()
-
-    and the following attributes are supported:
-
-    .code
-    .msg
-
-    Also supports pickling (but the stdlib currently does something to prevent
-    it: http://python.org/sf/1144636).
-
-    """
-    # presence of this attr indicates is useable after .close()
-    closeable_response = None
-
-    def __init__(self, fp, headers, url, code, msg):
-        self._set_fp(fp)
-        self._headers = headers
-        self._url = url
-        self.code = code
-        self.msg = msg
-
-    def _set_fp(self, fp):
-        self.fp = fp
-        self.read = self.fp.read
-        self.readline = self.fp.readline
-        if hasattr(self.fp, "readlines"): self.readlines = self.fp.readlines
-        if hasattr(self.fp, "fileno"):
-            self.fileno = self.fp.fileno
-        else:
-            self.fileno = lambda: None
-        if hasattr(self.fp, "__iter__"):
-            self.__iter__ = self.fp.__iter__
-            if hasattr(self.fp, "next"):
-                self.next = self.fp.next
-
-    def __repr__(self):
-        return '<%s at %s whose fp = %r>' % (
-            self.__class__.__name__, hex(id(self)), self.fp)
-
-    def info(self):
-        return self._headers
-
-    def geturl(self):
-        return self._url
-
-    def close(self):
-        wrapped = self.fp
-        wrapped.close()
-        new_wrapped = eofresponse(
-            self._url, self._headers, self.code, self.msg)
-        self._set_fp(new_wrapped)
-
-    def __getstate__(self):
-        # There are three obvious options here:
-        # 1. truncate
-        # 2. read to end
-        # 3. close socket, pickle state including read position, then open
-        #    again on unpickle and use Range header
-        # XXXX um, 4. refuse to pickle unless .close()d.  This is better,
-        #  actually ("errors should never pass silently").  Pickling doesn't
-        #  work anyway ATM, because of http://python.org/sf/1144636 so fix
-        #  this later
-
-        # 2 breaks pickle protocol, because one expects the original object
-        # to be left unscathed by pickling.  3 is too complicated and
-        # surprising (and too much work ;-) to happen in a sane __getstate__.
-        # So we do 1.
-
-        state = self.__dict__.copy()
-        new_wrapped = eofresponse(
-            self._url, self._headers, self.code, self.msg)
-        state["wrapped"] = new_wrapped
-        return state
-
-def make_response(data, headers, url, code, msg):
-    """Convenient factory for objects implementing response interface.
-
-    data: string containing response body data
-    headers: sequence of (name, value) pairs
-    url: URL of response
-    code: integer response code (e.g. 200)
-    msg: string response code message (e.g. "OK")
-
-    """
-    hdr_text = []
-    for name_value in headers:
-        hdr_text.append("%s: %s" % name_value)
-    mime_headers = mimetools.Message(StringIO("\n".join(hdr_text)))
-    r = closeable_response(StringIO(data), mime_headers, url, code, msg)
-    return response_seek_wrapper(r)
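
With the compatibility shims removed and the seek wrappers moved to the
new _response.py, _util.py is now just string and date helpers.  The
surviving date routines work as before; a quick sketch using the classic
cookie-spec example date (assuming the HTTP-date parser keeps its
http2time name):

    from mechanize._util import http2time, time2isoz

    # parse an RFC 1123 HTTP date into seconds since the epoch...
    t = http2time("Wed, 09 Feb 1994 22:23:32 GMT")
    # ...and back into the ISO-ish format used by the cookie code:
    assert time2isoz(t) == "1994-02-09 22:23:32Z"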

Modified: python-mechanize/branches/upstream/current/mechanize.egg-info/PKG-INFO
===================================================================
--- python-mechanize/branches/upstream/current/mechanize.egg-info/PKG-INFO	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize.egg-info/PKG-INFO	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,12 +1,12 @@
 Metadata-Version: 1.0
 Name: mechanize
-Version: 0.1.2b
+Version: 0.1.6b
 Summary: Stateful programmatic web browsing.
 Home-page: http://wwwsearch.sourceforge.net/mechanize/
 Author: John J. Lee
 Author-email: jjl at pobox.com
 License: BSD
-Download-URL: http://wwwsearch.sourceforge.net/mechanize/src/mechanize-0.1.2b.tar.gz
+Download-URL: http://wwwsearch.sourceforge.net/mechanize/src/mechanize-0.1.6b.tar.gz
 Description: Stateful programmatic web browsing, after Andy Lester's Perl module
         WWW::Mechanize.
         
@@ -25,7 +25,7 @@
         
         
 Platform: any
-Classifier: Development Status :: 3 - Alpha
+Classifier: Development Status :: 4 - Beta
 Classifier: Intended Audience :: Developers
 Classifier: Intended Audience :: System Administrators
 Classifier: License :: OSI Approved :: BSD License

Modified: python-mechanize/branches/upstream/current/mechanize.egg-info/SOURCES.txt
===================================================================
--- python-mechanize/branches/upstream/current/mechanize.egg-info/SOURCES.txt	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize.egg-info/SOURCES.txt	2007-04-09 20:40:55 UTC (rev 764)
@@ -9,6 +9,7 @@
 README.txt
 doc.html
 doc.html.in
+ez_setup.py
 functional_tests.py
 setup.py
 test.py
@@ -17,14 +18,15 @@
 examples/cookietest.cgi
 examples/hack21.py
 examples/pypi.py
-ez_setup/README.txt
-ez_setup/__init__.py
 mechanize/__init__.py
 mechanize/_auth.py
+mechanize/_beautifulsoup.py
 mechanize/_clientcookie.py
+mechanize/_debug.py
 mechanize/_gzip.py
 mechanize/_headersutil.py
 mechanize/_html.py
+mechanize/_http.py
 mechanize/_lwpcookiejar.py
 mechanize/_mechanize.py
 mechanize/_mozillacookiejar.py
@@ -32,20 +34,36 @@
 mechanize/_opener.py
 mechanize/_pullparser.py
 mechanize/_request.py
+mechanize/_response.py
+mechanize/_rfc3986.py
+mechanize/_seek.py
+mechanize/_upgrade.py
 mechanize/_urllib2.py
-mechanize/_urllib2_support.py
 mechanize/_useragent.py
 mechanize/_util.py
 mechanize.egg-info/PKG-INFO
 mechanize.egg-info/SOURCES.txt
+mechanize.egg-info/dependency_links.txt
 mechanize.egg-info/requires.txt
 mechanize.egg-info/top_level.txt
 mechanize.egg-info/zip-safe
-test/test_conncache.py
+test/test_browser.doctest
+test/test_browser.py
 test/test_cookies.py
 test/test_date.py
+test/test_forms.doctest
 test/test_headers.py
-test/test_mechanize.py
-test/test_misc.py
+test/test_history.doctest
+test/test_html.doctest
+test/test_html.py
+test/test_opener.py
+test/test_password_manager.doctest
 test/test_pullparser.py
+test/test_request.doctest
+test/test_response.doctest
+test/test_response.py
+test/test_rfc3986.doctest
 test/test_urllib2.py
+test/test_useragent.py
+test-tools/doctest.py
+test-tools/linecache_copy.py

Added: python-mechanize/branches/upstream/current/mechanize.egg-info/dependency_links.txt
===================================================================
--- python-mechanize/branches/upstream/current/mechanize.egg-info/dependency_links.txt	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize.egg-info/dependency_links.txt	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1 @@
+

Modified: python-mechanize/branches/upstream/current/mechanize.egg-info/requires.txt
===================================================================
--- python-mechanize/branches/upstream/current/mechanize.egg-info/requires.txt	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize.egg-info/requires.txt	2007-04-09 20:40:55 UTC (rev 764)
@@ -1 +1 @@
-ClientForm>=0.2.2, ==dev
\ No newline at end of file
+ClientForm>=0.2.6, ==dev
\ No newline at end of file

Modified: python-mechanize/branches/upstream/current/mechanize.egg-info/zip-safe
===================================================================
--- python-mechanize/branches/upstream/current/mechanize.egg-info/zip-safe	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/mechanize.egg-info/zip-safe	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1 @@
+

Added: python-mechanize/branches/upstream/current/setup.cfg
===================================================================
--- python-mechanize/branches/upstream/current/setup.cfg	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/setup.cfg	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,5 @@
+[egg_info]
+tag_build = 
+tag_date = 0
+tag_svn_revision = 0
+
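
Aside: these egg_info settings appear to tell setuptools not to decorate
the version string -- with tag_build empty and tag_date/tag_svn_revision
zero, built eggs are named from VERSION alone (e.g. mechanize-0.1.6b),
with no date or svn-revision suffix.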

Modified: python-mechanize/branches/upstream/current/setup.py
===================================================================
--- python-mechanize/branches/upstream/current/setup.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/setup.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -52,15 +52,15 @@
 ## VERSION_MATCH = re.search(r'__version__ = \((.*)\)',
 ##                           open("mechanize/_mechanize.py").read())
 ## VERSION = unparse_version(str_to_tuple(VERSION_MATCH.group(1)))
-VERSION = "0.1.2b"
-INSTALL_REQUIRES = ["ClientForm>=0.2.2, ==dev"]
+VERSION = "0.1.6b"
+INSTALL_REQUIRES = ["ClientForm>=0.2.6, ==dev"]
 NAME = "mechanize"
 PACKAGE = True
 LICENSE = "BSD"  # or ZPL 2.1
 PLATFORMS = ["any"]
 ZIP_SAFE = True
 CLASSIFIERS = """\
-Development Status :: 3 - Alpha
+Development Status :: 4 - Beta
 Intended Audience :: Developers
 Intended Audience :: System Administrators
 License :: OSI Approved :: BSD License

Added: python-mechanize/branches/upstream/current/test/test_browser.doctest
===================================================================
--- python-mechanize/branches/upstream/current/test/test_browser.doctest	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_browser.doctest	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,190 @@
+>>> import mechanize
+>>> from mechanize._response import test_response
+>>> from test_browser import TestBrowser2, make_mock_handler
+
+
+Opening a new response should close the old one.
+
+>>> class TestHttpHandler(mechanize.BaseHandler):
+...     def http_open(self, request):
+...         return test_response(url=request.get_full_url())
+>>> class TestHttpBrowser(TestBrowser2):
+...     handler_classes = TestBrowser2.handler_classes.copy()
+...     handler_classes["http"] = TestHttpHandler
+...     default_schemes = ["http"]
+>>> def response_impl(response):
+...     return response.wrapped.fp.__class__.__name__
+
+>>> br = TestHttpBrowser()
+>>> r = br.open("http://example.com")
+>>> print response_impl(r)
+StringI
+>>> r2 = br.open("http://example.com")
+>>> print response_impl(r2)
+StringI
+>>> print response_impl(r)
+eofresponse
+
+So should .set_response().
+
+>>> br.set_response(test_response())
+>>> print response_impl(r2)
+eofresponse
+
+
+.visit_response() works very similarly to .open().
+
+>>> br = TestHttpBrowser()
+>>> r = br.open("http://example.com")
+>>> r2 = test_response(url="http://example.com/2")
+>>> print response_impl(r2)
+StringI
+>>> br.visit_response(r2)
+>>> print response_impl(r)
+eofresponse
+>>> br.geturl() == br.request.get_full_url() == "http://example.com/2"
+True
+>>> junk = br.back()
+>>> br.geturl() == br.request.get_full_url() == "http://example.com"
+True
+
+
+.back() may reload if the complete response was not read.  If so, it
+should return the new response, not the old one.
+
+>>> class ReloadCheckBrowser(TestHttpBrowser):
+...     reloaded = False
+...     def reload(self):
+...         self.reloaded = True
+...         return TestHttpBrowser.reload(self)
+>>> br = ReloadCheckBrowser()
+>>> old = br.open("http://example.com")
+>>> junk = br.open("http://example.com/2")
+>>> new = br.back()
+>>> br.reloaded
+True
+>>> new.wrapped is not old.wrapped
+True
+
+
+Warn early about some mistakes when setting a response object.
+
+>>> import StringIO
+>>> br = TestBrowser2()
+>>> br.set_response("blah")
+Traceback (most recent call last):
+...
+ValueError: not a response object
+>>> br.set_response(StringIO.StringIO())
+Traceback (most recent call last):
+...
+ValueError: not a response object
+
+
+.open() without an appropriate scheme handler should fail with
+URLError.
+
+>>> br = TestBrowser2()
+>>> br.open("http://example.com")
+Traceback (most recent call last):
+...
+URLError: <urlopen error unknown url type: http>
+
+Reload after failed .open() should fail due to failure to open, not
+with BrowserStateError.
+
+>>> br.reload()
+Traceback (most recent call last):
+...
+URLError: <urlopen error unknown url type: http>
+
+
+.clear_history() should do what it says on the tin.  Note that the
+history does not include the current response!
+
+>>> br = TestBrowser2()
+>>> br.add_handler(make_mock_handler(test_response)([("http_open", None)]))
+
+>>> br.response() is None
+True
+>>> len(br._history._history)
+0
+
+>>> r = br.open("http://example.com/1")
+>>> br.response() is not None
+True
+>>> len(br._history._history)
+0
+
+>>> br.clear_history()
+>>> br.response() is not None
+True
+>>> len(br._history._history)
+0
+
+>>> r = br.open("http://example.com/2")
+>>> br.response() is not None
+True
+>>> len(br._history._history)
+1
+
+>>> br.clear_history()
+>>> br.response() is not None
+True
+>>> len(br._history._history)
+0
+
+
+.open()ing a Request with False .visit does not affect Browser state.
+Redirections during such a non-visiting request should also be
+non-visiting.
+
+>>> from mechanize import BrowserStateError, Request, HTTPRedirectHandler
+>>> from test_urllib2 import MockHTTPHandler
+
+>>> req = Request("http://example.com")
+>>> req.visit = False
+>>> br = TestBrowser2()
+>>> hh = MockHTTPHandler(301, "Location: http://example.com/\r\n\r\n")
+>>> br.add_handler(hh)
+>>> br.add_handler(HTTPRedirectHandler())
+>>> def raises(exc_class, fn, *args, **kwds):
+...     try:
+...         fn(*args, **kwds)
+...     except exc_class, exc:
+...         return True
+...     return False
+>>> def test_state(br):
+...     return (br.request is None and
+...             br.response() is None and
+...             raises(BrowserStateError, br.back)
+...             )
+>>> test_state(br)
+True
+>>> r = br.open(req)
+>>> test_state(br)
+True
+
+
+.global_form() is separate from the other forms (partly for backwards-
+compatibility reasons).
+
+>>> from mechanize._response import test_response
+>>> br = TestBrowser2()
+>>> html = """\
+... <html><body>
+... <input type="text" name="a" />
+... <form><input type="text" name="b" /></form>
+... </body></html>
+... """
+>>> response = test_response(html, headers=[("Content-type", "text/html")])
+>>> br.global_form()
+Traceback (most recent call last):
+BrowserStateError: not viewing any document
+>>> br.set_response(response)
+>>> br.global_form().find_control(nr=0).name
+'a'
+>>> len(list(br.forms()))
+1
+>>> iter(br.forms()).next().find_control(nr=0).name
+'b'
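
Aside: in application code, the non-visiting request exercised above
looks like this minimal sketch; the URL is illustrative and a real
scheme handler must be installed for .open() to succeed:

    import mechanize

    br = mechanize.Browser()
    req = mechanize.Request("http://example.com/ping")
    req.visit = False   # fetch without making this the "current" document
    r = br.open(req)    # br.request and br.response() are left unchanged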

Added: python-mechanize/branches/upstream/current/test/test_browser.py
===================================================================
--- python-mechanize/branches/upstream/current/test/test_browser.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_browser.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,765 @@
+#!/usr/bin/env python
+"""Tests for mechanize.Browser."""
+
+import sys, os, random
+from unittest import TestCase
+import StringIO, re, urllib2
+
+import mechanize
+from mechanize._response import test_html_response
+FACTORY_CLASSES = [mechanize.DefaultFactory, mechanize.RobustFactory]
+
+
+# XXX these 'mock' classes are badly in need of simplification / removal
+# (note this stuff is also used by test_useragent.py and test_browser.doctest)
+class MockMethod:
+    def __init__(self, meth_name, action, handle):
+        self.meth_name = meth_name
+        self.handle = handle
+        self.action = action
+    def __call__(self, *args):
+        return apply(self.handle, (self.meth_name, self.action)+args)
+
+class MockHeaders(dict):
+    def getheaders(self, name):
+        name = name.lower()
+        return [v for k, v in self.iteritems() if name == k.lower()]
+
+class MockResponse:
+    closeable_response = None
+    def __init__(self, url="http://example.com/", data=None, info=None):
+        self.url = url
+        self.fp = StringIO.StringIO(data)
+        if info is None: info = {}
+        self._info = MockHeaders(info)
+        self.source = "%d%d" % (id(self), random.randint(0, sys.maxint-1))
+        # otherwise we can't test for "same_response" in test_history
+        self.read_complete = True
+    def info(self): return self._info
+    def geturl(self): return self.url
+    def read(self, size=-1): return self.fp.read(size)
+    def seek(self, whence):
+        assert whence == 0
+        self.fp.seek(0)
+    def close(self): pass
+    def get_data(self): pass
+    def __getstate__(self):
+        state = self.__dict__
+        state['source'] = self.source
+        return state
+    def __setstate__(self, state):
+        self.__dict__ = state
+
+def make_mock_handler(response_class=MockResponse):
+    class MockHandler:
+        processor_order = 500
+        handler_order = -1
+        def __init__(self, methods):
+            self._define_methods(methods)
+        def _define_methods(self, methods):
+            for name, action in methods:
+                if name.endswith("_open"):
+                    meth = MockMethod(name, action, self.handle)
+                else:
+                    meth = MockMethod(name, action, self.process)
+                setattr(self.__class__, name, meth)
+        def handle(self, fn_name, response, *args, **kwds):
+            self.parent.calls.append((self, fn_name, args, kwds))
+            if response:
+                if isinstance(response, urllib2.HTTPError):
+                    raise response
+                r = response
+                r.seek(0)
+            else:
+                r = response_class()
+            req = args[0]
+            r.url = req.get_full_url()
+            return r
+        def process(self, fn_name, action, *args, **kwds):
+            self.parent.calls.append((self, fn_name, args, kwds))
+            if fn_name.endswith("_request"):
+                return args[0]
+            else:
+                return args[1]
+        def close(self): pass
+        def add_parent(self, parent):
+            self.parent = parent
+            self.parent.calls = []
+        def __lt__(self, other):
+            if not hasattr(other, "handler_order"):
+                # Try to preserve the old behavior of having custom classes
+                # inserted after default ones (works only for custom user
+                # classes which are not aware of handler_order).
+                return True
+            return self.handler_order < other.handler_order
+    return MockHandler
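+
+# Usage sketch (mirrors the tests below; canned_response is a placeholder
+# for any mock response object):
+#     handler = make_mock_handler()([("http_open", canned_response)])
+#     browser.add_handler(handler)
+# add_parent() gives each handler a back-reference to its opener, and
+# handle()/process() record every call in parent.calls for assertions.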
+
+class TestBrowser(mechanize.Browser):
+    default_features = ["_seek"]
+    default_others = []
+    default_schemes = []
+
+class TestBrowser2(mechanize.Browser):
+    # XXX better name!
+    # Like TestBrowser, this is neutered so it doesn't know about protocol
+    # handling, but it still knows what to do with unknown schemes, etc.,
+    # because UserAgent's default_others list is left intact, including
+    # classes like UnknownHandler.
+    default_features = ["_seek"]
+    default_schemes = []
+
+
+class BrowserTests(TestCase):
+
+    def test_referer(self):
+        b = TestBrowser()
+        url = "http://www.example.com/"
+        r = MockResponse(url,
+"""<html>
+<head><title>Title</title></head>
+<body>
+<form name="form1">
+ <input type="hidden" name="foo" value="bar"></input>
+ <input type="submit"></input>
+ </form>
+<a href="http://example.com/foo/bar.html" name="apples"></a>
+<a href="https://example.com/spam/eggs.html" name="secure"></a>
+<a href="blah://example.com/" name="pears"></a>
+</body>
+</html>
+""", {"content-type": "text/html"})
+        b.add_handler(make_mock_handler()([("http_open", r)]))
+
+        # Referer not added by .open()...
+        req = mechanize.Request(url)
+        b.open(req)
+        self.assert_(req.get_header("Referer") is None)
+        # ...even if we're visiting a document
+        b.open(req)
+        self.assert_(req.get_header("Referer") is None)
+        # Referer added by .click_link() and .click()
+        b.select_form("form1")
+        req2 = b.click()
+        self.assertEqual(req2.get_header("Referer"), url)
+        r2 = b.open(req2)
+        req3 = b.click_link(name="apples")
+        self.assertEqual(req3.get_header("Referer"), url+"?foo=bar")
+        # Referer not added when going from https to http URL
+        b.add_handler(make_mock_handler()([("https_open", r)]))
+        r3 = b.open(req3)
+        req4 = b.click_link(name="secure")
+        self.assertEqual(req4.get_header("Referer"),
+                         "http://example.com/foo/bar.html")
+        r4 = b.open(req4)
+        req5 = b.click_link(name="apples")
+        self.assert_(not req5.has_header("Referer"))
+        # Referer not added for non-http, non-https requests
+        b.add_handler(make_mock_handler()([("blah_open", r)]))
+        req6 = b.click_link(name="pears")
+        self.assert_(not req6.has_header("Referer"))
+        # Referer not added when going from non-http, non-https URL
+        r4 = b.open(req6)
+        req7 = b.click_link(name="apples")
+        self.assert_(not req7.has_header("Referer"))
+
+        # XXX Referer added for redirect
+
+    def test_encoding(self):
+        import mechanize
+        from StringIO import StringIO
+        import urllib, mimetools
+        # always take first encoding, since that's the one from the real HTTP
+        # headers, rather than from HTTP-EQUIV
+        b = mechanize.Browser()
+        for s, ct in [("", mechanize._html.DEFAULT_ENCODING),
+
+                      ("Foo: Bar\r\n\r\n", mechanize._html.DEFAULT_ENCODING),
+
+                      ("Content-Type: text/html; charset=UTF-8\r\n\r\n",
+                       "UTF-8"),
+
+                      ("Content-Type: text/html; charset=UTF-8\r\n"
+                       "Content-Type: text/html; charset=KOI8-R\r\n\r\n",
+                       "UTF-8"),
+                      ]:
+            msg = mimetools.Message(StringIO(s))
+            r = urllib.addinfourl(StringIO(""), msg, "http://www.example.com/")
+            b.set_response(r)
+            self.assertEqual(b.encoding(), ct)
+
+    def test_history(self):
+        import mechanize
+
+        def same_response(ra, rb):
+            return ra.source == rb.source
+
+        b = TestBrowser()
+        b.add_handler(make_mock_handler()([("http_open", None)]))
+        self.assertRaises(mechanize.BrowserStateError, b.back)
+        r1 = b.open("http://example.com/")
+        self.assertRaises(mechanize.BrowserStateError, b.back)
+        r2 = b.open("http://example.com/foo")
+        self.assert_(same_response(b.back(), r1))
+        r3 = b.open("http://example.com/bar")
+        r4 = b.open("http://example.com/spam")
+        self.assert_(same_response(b.back(), r3))
+        self.assert_(same_response(b.back(), r1))
+        self.assertRaises(mechanize.BrowserStateError, b.back)
+        # reloading does a real HTTP fetch rather than using history cache
+        r5 = b.reload()
+        self.assert_(not same_response(r5, r1))
+        # .geturl() gets fed through to b.response
+        self.assertEquals(b.geturl(), "http://example.com/")
+        # can go back n times
+        r6 = b.open("spam")
+        self.assertEquals(b.geturl(), "http://example.com/spam")
+        r7 = b.open("/spam")
+        self.assert_(same_response(b.response(), r7))
+        self.assertEquals(b.geturl(), "http://example.com/spam")
+        self.assert_(same_response(b.back(2), r5))
+        self.assertEquals(b.geturl(), "http://example.com/")
+        self.assertRaises(mechanize.BrowserStateError, b.back, 2)
+        r8 = b.open("/spam")
+
+        # even if we get an HTTPError, history, .response() and .request should
+        # still get updated
+        error = urllib2.HTTPError("http://example.com/bad", 503, "Oops",
+                                  MockHeaders(), StringIO.StringIO())
+        b.add_handler(make_mock_handler()([("https_open", error)]))
+        self.assertRaises(urllib2.HTTPError, b.open, "https://example.com/badreq")
+        self.assertEqual(b.response().geturl(), error.geturl())
+        self.assertEqual(b.request.get_full_url(), "https://example.com/badreq")
+        self.assert_(same_response(b.back(), r8))
+
+        # after .close(), use of Browser methods and attributes should
+        # complain noisily, since they should not be called after .close()
+        b.form = "blah"
+        b.close()
+        for attr in ("form open error retrieve add_handler "
+                     "request response set_response geturl reload back "
+                     "clear_history set_cookie links forms viewing_html "
+                     "encoding title select_form click submit click_link "
+                     "follow_link find_link".split()
+                     ):
+            self.assert_(getattr(b, attr) is None)
+
+    def test_reload_read_incomplete(self):
+        import mechanize
+        from mechanize._response import test_response
+        class Browser(TestBrowser):
+            def __init__(self):
+                TestBrowser.__init__(self)
+                self.reloaded = False
+            def reload(self):
+                self.reloaded = True
+                TestBrowser.reload(self)
+        br = Browser()
+        data = "<html><head><title></title></head><body>%s</body></html>"
+        data = data % ("The quick brown fox jumps over the lazy dog."*100)
+        class Handler(mechanize.BaseHandler):
+            def http_open(self, request):
+                return test_response(data, [("content-type", "text/html")])
+        br.add_handler(Handler())
+
+        # .back() should .reload() if the whole response hasn't already been
+        # read (i.e. .read_complete is False)
+        r = br.open("http://example.com")
+        r.read(10)
+        br.open('http://www.example.com/blah')
+        self.failIf(br.reloaded)
+        br.back()
+        self.assert_(br.reloaded)
+
+        # don't reload if already read
+        br.reloaded = False
+        br.response().read()
+        br.open('http://www.example.com/blah')
+        br.back()
+        self.failIf(br.reloaded)
+
+    def test_viewing_html(self):
+        # XXX not testing multiple Content-Type headers
+        import mechanize
+        url = "http://example.com/"
+
+        for allow_xhtml in False, True:
+            for ct, expect in [
+                (None, False),
+                ("text/plain", False),
+                ("text/html", True),
+
+                # don't try to handle XML until we can do it right!
+                ("text/xhtml", allow_xhtml),
+                ("text/xml", allow_xhtml),
+                ("application/xml", allow_xhtml),
+                ("application/xhtml+xml", allow_xhtml),
+
+                ("text/html; charset=blah", True),
+                (" text/html ; charset=ook ", True),
+                ]:
+                b = TestBrowser(mechanize.DefaultFactory(
+                    i_want_broken_xhtml_support=allow_xhtml))
+                hdrs = {}
+                if ct is not None:
+                    hdrs["Content-Type"] = ct
+                b.add_handler(make_mock_handler()([("http_open",
+                                            MockResponse(url, "", hdrs))]))
+                r = b.open(url)
+                self.assertEqual(b.viewing_html(), expect)
+
+        for allow_xhtml in False, True:
+            for ext, expect in [
+                (".htm", True),
+                (".html", True),
+
+                # don't try to handle XML until we can do it right!
+                (".xhtml", allow_xhtml),
+
+                (".html?foo=bar&a=b;whelk#kool", True),
+                (".txt", False),
+                (".xml", False),
+                ("", False),
+                ]:
+                b = TestBrowser(mechanize.DefaultFactory(
+                    i_want_broken_xhtml_support=allow_xhtml))
+                url = "http://example.com/foo"+ext
+                b.add_handler(make_mock_handler()(
+                    [("http_open", MockResponse(url, "", {}))]))
+                r = b.open(url)
+                self.assertEqual(b.viewing_html(), expect)
+
+    def test_empty(self):
+        import mechanize
+        url = "http://example.com/"
+
+        b = TestBrowser()
+
+        self.assert_(b.response() is None)
+
+        # To open a relative reference (often called a "relative URL"), you
+        # have to have already opened a URL for it "to be relative to".
+        self.assertRaises(mechanize.BrowserStateError, b.open, "relative_ref")
+
+        # we can still clear the history even if we've not visited any URL
+        b.clear_history()
+
+        # most methods raise BrowserStateError...
+        def test_state_error(method_names):
+            for attr in method_names:
+                method = getattr(b, attr)
+                #print attr
+                self.assertRaises(mechanize.BrowserStateError, method)
+            self.assertRaises(mechanize.BrowserStateError, b.select_form,
+                              name="blah")
+            self.assertRaises(mechanize.BrowserStateError, b.find_link,
+                              name="blah")
+        # ...if not visiting a URL...
+        test_state_error(("geturl reload back viewing_html encoding "
+                          "click links forms title select_form".split()))
+        self.assertRaises(mechanize.BrowserStateError, b.set_cookie, "foo=bar")
+        self.assertRaises(mechanize.BrowserStateError, b.submit, nr=0)
+        self.assertRaises(mechanize.BrowserStateError, b.click_link, nr=0)
+        self.assertRaises(mechanize.BrowserStateError, b.follow_link, nr=0)
+        self.assertRaises(mechanize.BrowserStateError, b.find_link, nr=0)
+        # ...and lots do so if visiting a non-HTML URL
+        b.add_handler(make_mock_handler()(
+            [("http_open", MockResponse(url, "", {}))]))
+        r = b.open(url)
+        self.assert_(not b.viewing_html())
+        test_state_error("click links forms title select_form".split())
+        self.assertRaises(mechanize.BrowserStateError, b.submit, nr=0)
+        self.assertRaises(mechanize.BrowserStateError, b.click_link, nr=0)
+        self.assertRaises(mechanize.BrowserStateError, b.follow_link, nr=0)
+        self.assertRaises(mechanize.BrowserStateError, b.find_link, nr=0)
+
+        b = TestBrowser()
+        r = MockResponse(url,
+"""<html>
+<head><title>Title</title></head>
+<body>
+</body>
+</html>
+""", {"content-type": "text/html"})
+        b.add_handler(make_mock_handler()([("http_open", r)]))
+        r = b.open(url)
+        self.assertEqual(b.title(), "Title")
+        self.assertEqual(len(list(b.links())), 0)
+        self.assertEqual(len(list(b.forms())), 0)
+        self.assertRaises(ValueError, b.select_form)
+        self.assertRaises(mechanize.FormNotFoundError, b.select_form,
+                          name="blah")
+        self.assertRaises(mechanize.FormNotFoundError, b.select_form,
+                          predicate=lambda x: True)
+        self.assertRaises(mechanize.LinkNotFoundError, b.find_link,
+                          name="blah")
+        self.assertRaises(mechanize.LinkNotFoundError, b.find_link,
+                          predicate=lambda x: True)
+
+    def test_forms(self):
+        for factory_class in FACTORY_CLASSES:
+            self._test_forms(factory_class())
+    def _test_forms(self, factory):
+        import mechanize
+        url = "http://example.com"
+
+        b = TestBrowser(factory=factory)
+        r = test_html_response(
+            url=url,
+            headers=[("content-type", "text/html")],
+            data="""\
+<html>
+<head><title>Title</title></head>
+<body>
+<form name="form1">
+ <input type="text"></input>
+ <input type="checkbox" name="cheeses" value="cheddar"></input>
+ <input type="checkbox" name="cheeses" value="edam"></input>
+ <input type="submit" name="one"></input>
+</form>
+<a href="http://example.com/foo/bar.html" name="apples">
+<form name="form2">
+ <input type="submit" name="two">
+</form>
+</body>
+</html>
+"""
+            )
+        b.add_handler(make_mock_handler()([("http_open", r)]))
+        r = b.open(url)
+
+        forms = list(b.forms())
+        self.assertEqual(len(forms), 2)
+        for got, expect in zip([f.name for f in forms], [
+            "form1", "form2"]):
+            self.assertEqual(got, expect)
+
+        self.assertRaises(mechanize.FormNotFoundError, b.select_form, "foo")
+
+        # no form is set yet
+        self.assertRaises(AttributeError, getattr, b, "possible_items")
+        b.select_form("form1")
+        # now unknown methods are fed through to the selected ClientForm.HTMLForm
+        self.assertEqual(
+            [i.name for i in b.find_control("cheeses").items],
+            ["cheddar", "edam"])
+        b["cheeses"] = ["cheddar", "edam"]
+        self.assertEqual(b.click_pairs(), [
+            ("cheeses", "cheddar"), ("cheeses", "edam"), ("one", "")])
+
+        b.select_form(nr=1)
+        self.assertEqual(b.name, "form2")
+        self.assertEqual(b.click_pairs(), [("two", "")])
+
+    def test_link_encoding(self):
+        for factory_class in FACTORY_CLASSES:
+            self._test_link_encoding(factory_class())
+    def _test_link_encoding(self, factory):
+        import urllib
+        import mechanize
+        from mechanize._rfc3986 import clean_url
+        url = "http://example.com/"
+        for encoding in ["UTF-8", "latin-1"]:
+            encoding_decl = "; charset=%s" % encoding
+            b = TestBrowser(factory=factory)
+            r = MockResponse(url, """\
+<a href="http://example.com/foo/bar&mdash;&#x2014;.html"
+   name="name0&mdash;&#x2014;">blah&mdash;&#x2014;</a>
+""", #"
+{"content-type": "text/html%s" % encoding_decl})
+            b.add_handler(make_mock_handler()([("http_open", r)]))
+            r = b.open(url)
+
+            Link = mechanize.Link
+            try:
+                mdashx2 = u"\u2014".encode(encoding)*2
+            except UnicodeError:
+                mdashx2 = '&mdash;&#x2014;'
+            qmdashx2 = clean_url(mdashx2, encoding)
+            # base_url, url, text, tag, attrs
+            exp = Link(url, "http://example.com/foo/bar%s.html" % qmdashx2,
+                       "blah"+mdashx2, "a",
+                       [("href", "http://example.com/foo/bar%s.html" % mdashx2),
+                        ("name", "name0%s" % mdashx2)])
+            # nr
+            link = b.find_link()
+##             print
+##             print exp
+##             print link
+            self.assertEqual(link, exp)
+
+    def test_link_whitespace(self):
+        from mechanize import Link
+        for factory_class in FACTORY_CLASSES:
+            base_url = "http://example.com/"
+            url = "  http://example.com/foo.html%20+ "
+            stripped_url = url.strip()
+            html = '<a href="%s"></a>' % url
+            b = TestBrowser(factory=factory_class())
+            r = MockResponse(base_url, html, {"content-type": "text/html"})
+            b.add_handler(make_mock_handler()([("http_open", r)]))
+            r = b.open(base_url)
+            link = b.find_link(nr=0)
+            self.assertEqual(
+                link,
+                Link(base_url, stripped_url, "", "a", [("href", url)])
+                )
+
+    def test_links(self):
+        for factory_class in FACTORY_CLASSES:
+            self._test_links(factory_class())
+    def _test_links(self, factory):
+        import mechanize
+        from mechanize import Link
+        url = "http://example.com/"
+
+        b = TestBrowser(factory=factory)
+        r = MockResponse(url,
+"""<html>
+<head><title>Title</title></head>
+<body>
+<a href="http://example.com/foo/bar.html" name="apples"></a>
+<a name="pears"></a>
+<a href="spam" name="pears"></a>
+<area href="blah" name="foo"></area>
+<form name="form2">
+ <input type="submit" name="two">
+</form>
+<frame name="name" href="href" src="src"></frame>
+<iframe name="name2" href="href" src="src"></iframe>
+<a name="name3" href="one">yada yada</a>
+<a name="pears" href="two" weird="stuff">rhubarb</a>
+<a></a>
+<iframe src="foo"></iframe>
+</body>
+</html>
+""", {"content-type": "text/html"})
+        b.add_handler(make_mock_handler()([("http_open", r)]))
+        r = b.open(url)
+
+        exp_links = [
+            # base_url, url, text, tag, attrs
+            Link(url, "http://example.com/foo/bar.html", "", "a",
+                 [("href", "http://example.com/foo/bar.html"),
+                  ("name", "apples")]),
+            Link(url, "spam", "", "a", [("href", "spam"), ("name", "pears")]),
+            Link(url, "blah", None, "area",
+                 [("href", "blah"), ("name", "foo")]),
+            Link(url, "src", None, "frame",
+                 [("name", "name"), ("href", "href"), ("src", "src")]),
+            Link(url, "src", None, "iframe",
+                 [("name", "name2"), ("href", "href"), ("src", "src")]),
+            Link(url, "one", "yada yada", "a",
+                 [("name", "name3"), ("href", "one")]),
+            Link(url, "two", "rhubarb", "a",
+                 [("name", "pears"), ("href", "two"), ("weird", "stuff")]),
+            Link(url, "foo", None, "iframe",
+                 [("src", "foo")]),
+            ]
+        links = list(b.links())
+        self.assertEqual(len(links), len(exp_links))
+        for got, expect in zip(links, exp_links):
+            self.assertEqual(got, expect)
+        # nr
+        l = b.find_link()
+        self.assertEqual(l.url, "http://example.com/foo/bar.html")
+        l = b.find_link(nr=1)
+        self.assertEqual(l.url, "spam")
+        # text
+        l = b.find_link(text="yada yada")
+        self.assertEqual(l.url, "one")
+        self.assertRaises(mechanize.LinkNotFoundError,
+                          b.find_link, text="da ya")
+        l = b.find_link(text_regex=re.compile("da ya"))
+        self.assertEqual(l.url, "one")
+        l = b.find_link(text_regex="da ya")
+        self.assertEqual(l.url, "one")
+        # name
+        l = b.find_link(name="name3")
+        self.assertEqual(l.url, "one")
+        l = b.find_link(name_regex=re.compile("oo"))
+        self.assertEqual(l.url, "blah")
+        l = b.find_link(name_regex="oo")
+        self.assertEqual(l.url, "blah")
+        # url
+        l = b.find_link(url="spam")
+        self.assertEqual(l.url, "spam")
+        l = b.find_link(url_regex=re.compile("pam"))
+        self.assertEqual(l.url, "spam")
+        l = b.find_link(url_regex="pam")
+        self.assertEqual(l.url, "spam")
+        # tag
+        l = b.find_link(tag="area")
+        self.assertEqual(l.url, "blah")
+        # predicate
+        l = b.find_link(predicate=
+                        lambda l: dict(l.attrs).get("weird") == "stuff")
+        self.assertEqual(l.url, "two")
+        # combinations
+        l = b.find_link(name="pears", nr=1)
+        self.assertEqual(l.text, "rhubarb")
+        l = b.find_link(url="src", nr=0, name="name2")
+        self.assertEqual(l.tag, "iframe")
+        self.assertEqual(l.url, "src")
+        self.assertRaises(mechanize.LinkNotFoundError, b.find_link,
+                          url="src", nr=1, name="name2")
+        l = b.find_link(tag="a", predicate=
+                        lambda l: dict(l.attrs).get("weird") == "stuff")
+        self.assertEqual(l.url, "two")
+
+        # .links()
+        self.assertEqual(list(b.links(url="src")), [
+            Link(url, url="src", text=None, tag="frame",
+                 attrs=[("name", "name"), ("href", "href"), ("src", "src")]),
+            Link(url, url="src", text=None, tag="iframe",
+                 attrs=[("name", "name2"), ("href", "href"), ("src", "src")]),
+            ])
+
+    def test_base_uri(self):
+        import mechanize
+        url = "http://example.com/"
+
+        for html, urls in [
+            (
+"""<base href="http://www.python.org/foo/">
+<a href="bar/baz.html"></a>
+<a href="/bar/baz.html"></a>
+<a href="http://example.com/bar %2f%2Fblah;/baz@~._-.html"></a>
+""",
+            [
+            "http://www.python.org/foo/bar/baz.html",
+            "http://www.python.org/bar/baz.html",
+            "http://example.com/bar%20%2f%2Fblah;/baz@~._-.html",
+            ]),
+            (
+"""<a href="bar/baz.html"></a>
+<a href="/bar/baz.html"></a>
+<a href="http://example.com/bar/baz.html"></a>
+""",
+            [
+            "http://example.com/bar/baz.html",
+            "http://example.com/bar/baz.html",
+            "http://example.com/bar/baz.html",
+            ]
+            ),
+            ]:
+            b = TestBrowser()
+            r = MockResponse(url, html, {"content-type": "text/html"})
+            b.add_handler(make_mock_handler()([("http_open", r)]))
+            r = b.open(url)
+            self.assertEqual([link.absolute_url for link in b.links()], urls)
+
+    def test_set_cookie(self):
+        class CookieTestBrowser(TestBrowser):
+            default_features = list(TestBrowser.default_features)+["_cookies"]
+
+        # have to be visiting an HTTP/HTTPS URL
+        url = "ftp://example.com/"
+        br = CookieTestBrowser()
+        r = mechanize.make_response(
+            "<html><head><title>Title</title></head><body></body></html>",
+            [("content-type", "text/html")],
+            url,
+            200, "OK",
+            )
+        br.add_handler(make_mock_handler()([("http_open", r)]))
+        handler = br._ua_handlers["_cookies"]
+        cj = handler.cookiejar
+        self.assertRaises(mechanize.BrowserStateError,
+                          br.set_cookie, "foo=bar")
+        self.assertEqual(len(cj), 0)
+
+
+        url = "http://example.com/"
+        br = CookieTestBrowser()
+        r = mechanize.make_response(
+            "<html><head><title>Title</title></head><body></body></html>",
+            [("content-type", "text/html")],
+            url,
+            200, "OK",
+            )
+        br.add_handler(make_mock_handler()([("http_open", r)]))
+        handler = br._ua_handlers["_cookies"]
+        cj = handler.cookiejar
+
+        # have to be visiting a URL
+        self.assertRaises(mechanize.BrowserStateError,
+                          br.set_cookie, "foo=bar")
+        self.assertEqual(len(cj), 0)
+
+
+        # normal case
+        br.open(url)
+        br.set_cookie("foo=bar")
+        self.assertEqual(len(cj), 1)
+        self.assertEqual(cj._cookies["example.com"]["/"]["foo"].value, "bar")
+
+
+class ResponseTests(TestCase):
+
+    def test_set_response(self):
+        import copy
+        from mechanize import response_seek_wrapper
+
+        br = TestBrowser()
+        url = "http://example.com/"
+        html = """<html><body><a href="spam">click me</a></body></html>"""
+        headers = {"content-type": "text/html"}
+        r = response_seek_wrapper(MockResponse(url, html, headers))
+        br.add_handler(make_mock_handler()([("http_open", r)]))
+
+        r = br.open(url)
+        self.assertEqual(r.read(), html)
+        r.seek(0)
+        self.assertEqual(copy.copy(r).read(), html)
+        self.assertEqual(list(br.links())[0].url, "spam")
+
+        newhtml = """<html><body><a href="eggs">click me</a></body></html>"""
+
+        r.set_data(newhtml)
+        self.assertEqual(r.read(), newhtml)
+        self.assertEqual(br.response().read(), html)
+        br.response().set_data(newhtml)
+        self.assertEqual(br.response().read(), html)
+        self.assertEqual(list(br.links())[0].url, "spam")
+        r.seek(0)
+
+        br.set_response(r)
+        self.assertEqual(br.response().read(), newhtml)
+        self.assertEqual(list(br.links())[0].url, "eggs")
+
+    def test_str(self):
+        import mimetools
+        from mechanize import _response
+
+        br = TestBrowser()
+        self.assertEqual(
+            str(br),
+            "<TestBrowser (not visiting a URL)>"
+            )
+
+        fp = StringIO.StringIO('<html><form name="f"><input /></form></html>')
+        headers = mimetools.Message(
+            StringIO.StringIO("Content-type: text/html"))
+        response = _response.response_seek_wrapper(
+            _response.closeable_response(
+            fp, headers, "http://example.com/", 200, "OK"))
+        br.set_response(response)
+        self.assertEqual(
+            str(br),
+            "<TestBrowser visiting http://example.com/>"
+            )
+
+        br.select_form(nr=0)
+        self.assertEqual(
+            str(br),
+            """\
+<TestBrowser visiting http://example.com/
+ selected form:
+ <f GET http://example.com/ application/x-www-form-urlencoded
+  <TextControl(<None>=)>>
+>""")
+
+
+if __name__ == "__main__":
+    import unittest
+    unittest.main()
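
Aside: ResponseTests above depends on the seekable-response behaviour
that mechanize gives every response.  A sketch of that behaviour,
assuming network access and an illustrative URL:

    import copy
    import mechanize

    br = mechanize.Browser()
    r = br.open("http://example.com/")
    body = r.read()
    r.seek(0)                            # responses are seekable wrappers
    assert copy.copy(r).read() == body   # copies share the same data
    r.set_data("<html><body>new</body></html>")  # swap the body in place
    br.set_response(r)                   # browser re-parses the new body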

Deleted: python-mechanize/branches/upstream/current/test/test_conncache.py
===================================================================
--- python-mechanize/branches/upstream/current/test/test_conncache.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_conncache.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,14 +0,0 @@
-"""Tests for mechanize._ConnCache module."""
-
-import unittest, sys
-
-class ConnCacheTests(unittest.TestCase):
-
-    def test_ConnectionCache(self):
-        from mechanize import ConnectionCache
-        ConnectionCache()
-
-
-if __name__ == "__main__":
-    #unittest.main()
-    pass

Modified: python-mechanize/branches/upstream/current/test/test_cookies.py
===================================================================
--- python-mechanize/branches/upstream/current/test/test_cookies.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_cookies.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,17 +1,15 @@
 """Tests for _ClientCookie."""
 
-import urllib2, re, os, string, StringIO, mimetools, time
+import urllib2, re, os, StringIO, mimetools, time
 from time import localtime
 from unittest import TestCase
 
-from mechanize._util import startswith
-
 class FakeResponse:
     def __init__(self, headers=[], url=None):
         """
         headers: list of RFC822-style 'Key: value' strings
         """
-        f = StringIO.StringIO(string.join(headers, "\n"))
+        f = StringIO.StringIO("\n".join(headers))
         self._headers = mimetools.Message(f)
         self._url = url
     def info(self): return self._headers
@@ -231,7 +229,7 @@
                           now)
         h = interact_netscape(c, "http://www.acme.com/")
         assert len(c) == 1
-        assert string.find(h, 'spam="bar"') != -1 and string.find(h, "foo") == -1
+        assert h.find('spam="bar"') != -1 and h.find("foo") == -1
 
         # max-age takes precedence over expires, and zero max-age is request to
         # delete both new cookie and any old matching cookie
@@ -252,7 +250,7 @@
         assert len(c) == 2
         c.clear_session_cookies()
         assert len(c) == 1
-        assert string.find(h, 'spam="bar"') != -1
+        assert h.find('spam="bar"') != -1
 
         # XXX RFC 2965 expiry rules (some apply to V0 too)
 
@@ -679,14 +677,14 @@
         url = "http://foo.bar.com/"
         interact_2965(c, url, "spam=eggs; Version=1")
         h = interact_2965(c, url)
-        assert string.find(h, "Domain") == -1, \
+        assert h.find( "Domain") == -1, \
                "absent domain returned with domain present"
 
         c = CookieJar(pol)
         url = "http://foo.bar.com/"
         interact_2965(c, url, 'spam=eggs; Version=1; Domain=.bar.com')
         h = interact_2965(c, url)
-        assert string.find(h, '$Domain=".bar.com"') != -1, \
+        assert h.find('$Domain=".bar.com"') != -1, \
                "domain not returned"
 
         c = CookieJar(pol)
@@ -694,7 +692,7 @@
         # note missing initial dot in Domain
         interact_2965(c, url, 'spam=eggs; Version=1; Domain=bar.com')
         h = interact_2965(c, url)
-        assert string.find(h, '$Domain="bar.com"') != -1, \
+        assert h.find('$Domain="bar.com"') != -1, \
                "domain not returned"
 
     def test_path_mirror(self):
@@ -706,14 +704,14 @@
         url = "http://foo.bar.com/"
         interact_2965(c, url, "spam=eggs; Version=1")
         h = interact_2965(c, url)
-        assert string.find(h, "Path") == -1, \
+        assert h.find("Path") == -1, \
                "absent path returned with path present"
 
         c = CookieJar(pol)
         url = "http://foo.bar.com/"
         interact_2965(c, url, 'spam=eggs; Version=1; Path=/')
         h = interact_2965(c, url)
-        assert string.find(h, '$Path="/"') != -1, "path not returned"
+        assert h.find('$Path="/"') != -1, "path not returned"
 
     def test_port_mirror(self):
         from mechanize import CookieJar, DefaultCookiePolicy
@@ -724,7 +722,7 @@
         url = "http://foo.bar.com/"
         interact_2965(c, url, "spam=eggs; Version=1")
         h = interact_2965(c, url)
-        assert string.find(h, "Port") == -1, \
+        assert h.find("Port") == -1, \
                "absent port returned with port present"
 
         c = CookieJar(pol)
@@ -738,14 +736,14 @@
         url = "http://foo.bar.com/"
         interact_2965(c, url, 'spam=eggs; Version=1; Port="80"')
         h = interact_2965(c, url)
-        assert string.find(h, '$Port="80"') != -1, \
+        assert h.find('$Port="80"') != -1, \
                "port with single value not returned with single value"
 
         c = CookieJar(pol)
         url = "http://foo.bar.com/"
         interact_2965(c, url, 'spam=eggs; Version=1; Port="80,8080"')
         h = interact_2965(c, url)
-        assert string.find(h, '$Port="80,8080"') != -1, \
+        assert h.find('$Port="80,8080"') != -1, \
                "port with multiple values not returned with multiple values"
 
     def test_no_return_comment(self):
@@ -757,7 +755,7 @@
                       'Comment="does anybody read these?"; '
                       'CommentURL="http://foo.bar.net/comment.html"')
         h = interact_2965(c, url)
-        assert string.find(h, "Comment") == -1, \
+        assert h.find("Comment") == -1, \
                "Comment or CommentURL cookie-attributes returned to server"
 
 # just pondering security here -- this isn't really a test (yet)
@@ -939,8 +937,8 @@
         c.add_cookie_header(req)
 
         h = req.get_header("Cookie")
-        assert (string.find(h, "PART_NUMBER=ROCKET_LAUNCHER_0001") != -1 and
-                string.find(h, "CUSTOMER=WILE_E_COYOTE") != -1)
+        assert (h.find("PART_NUMBER=ROCKET_LAUNCHER_0001") != -1 and
+                h.find("CUSTOMER=WILE_E_COYOTE") != -1)
 
 
         headers.append('Set-Cookie: SHIPPING=FEDEX; path=/foo')
@@ -951,18 +949,18 @@
         c.add_cookie_header(req)
 
         h = req.get_header("Cookie")
-        assert (string.find(h, "PART_NUMBER=ROCKET_LAUNCHER_0001") != -1 and
-                string.find(h, "CUSTOMER=WILE_E_COYOTE") != -1 and
-                not string.find(h, "SHIPPING=FEDEX") != -1)
+        assert (h.find("PART_NUMBER=ROCKET_LAUNCHER_0001") != -1 and
+                h.find("CUSTOMER=WILE_E_COYOTE") != -1 and
+                not h.find("SHIPPING=FEDEX") != -1)
 
 
         req = Request("http://www.acme.com/foo/")
         c.add_cookie_header(req)
 
         h = req.get_header("Cookie")
-        assert (string.find(h, "PART_NUMBER=ROCKET_LAUNCHER_0001") != -1 and
-                string.find(h, "CUSTOMER=WILE_E_COYOTE") != -1 and
-                startswith(h, "SHIPPING=FEDEX;"))
+        assert (h.find("PART_NUMBER=ROCKET_LAUNCHER_0001") != -1 and
+                h.find("CUSTOMER=WILE_E_COYOTE") != -1 and
+                h.startswith("SHIPPING=FEDEX;"))
 
     def test_netscape_example_2(self):
         from mechanize import CookieJar, Request
@@ -1121,7 +1119,7 @@
 
         cookie = interact_2965(c, "http://www.acme.com/acme/process")
         assert (re.search(r'Shipping="?FedEx"?;\s*\$Path="\/acme"', cookie) and
-                string.find(cookie, "WILE_E_COYOTE") != -1)
+                cookie.find("WILE_E_COYOTE") != -1)
 
         # 
         # The user agent makes a series of requests on the origin server, after
@@ -1182,8 +1180,8 @@
         # the server.
 
         cookie = interact_2965(c, "http://www.acme.com/acme/parts/")
-        assert (string.find(cookie, "Rocket_Launcher_0001") != -1 and
-                not string.find(cookie, "Riding_Rocket_0023") != -1)
+        assert (cookie.find("Rocket_Launcher_0001") != -1 and
+                not cookie.find("Riding_Rocket_0023") != -1)
 
     def test_rejection(self):
         # Test rejection of Set-Cookie2 responses based on domain, path, port.
@@ -1292,7 +1290,7 @@
             c, "http://www.acme.com/foo%2f%25/<<%0anew\345/\346\370\345",
             'bar=baz; path="/foo/"; version=1');
         version_re = re.compile(r'^\$version=\"?1\"?', re.I)
-        assert (string.find(cookie, "foo=bar") != -1 and
+        assert (cookie.find("foo=bar") != -1 and
                 version_re.search(cookie))
 
         cookie = interact_2965(
@@ -1340,11 +1338,11 @@
 
         new_c = save_and_restore(c, True)
         assert len(new_c) == 6  # none discarded
-        assert string.find(repr(new_c), "name='foo1', value='bar'") != -1
+        assert repr(new_c).find("name='foo1', value='bar'") != -1
 
         new_c = save_and_restore(c, False)
         assert len(new_c) == 4  # 2 of them discarded on save
-        assert string.find(repr(new_c), "name='foo1', value='bar'") != -1
+        assert repr(new_c).find("name='foo1', value='bar'") != -1
 
     def test_netscape_misc(self):
         # Some additional Netscape cookies tests.
@@ -1369,9 +1367,8 @@
         req = Request("http://foo.bar.acme.com/foo")
         c.add_cookie_header(req)
         assert (
-            string.find(req.get_header("Cookie"), "PART_NUMBER=3,4") != -1 and
-            string.find(
-            req.get_header("Cookie"), "Customer=WILE_E_COYOTE") != -1)
+            req.get_header("Cookie").find("PART_NUMBER=3,4") != -1 and
+            req.get_header("Cookie").find("Customer=WILE_E_COYOTE") != -1)
 
     def test_intranet_domains_2965(self):
         # Test handling of local intranet hostnames without a dot.
@@ -1382,11 +1379,11 @@
                       "foo1=bar; PORT; Discard; Version=1;")
         cookie = interact_2965(c, "http://example/",
                                'foo2=bar; domain=".local"; Version=1')
-        assert string.find(cookie, "foo1=bar") >= 0
+        assert cookie.find("foo1=bar") >= 0
 
         interact_2965(c, "http://example/", 'foo3=bar; Version=1')
         cookie = interact_2965(c, "http://example/")
-        assert string.find(cookie, "foo2=bar") >= 0 and len(c) == 3
+        assert cookie.find("foo2=bar") >= 0 and len(c) == 3
 
     def test_intranet_domains_ns(self):
         from mechanize import CookieJar, DefaultCookiePolicy
@@ -1396,10 +1393,10 @@
         cookie = interact_netscape(c, "http://example/",
                                    'foo2=bar; domain=.local')
         assert len(c) == 2
-        assert string.find(cookie, "foo1=bar") >= 0
+        assert cookie.find("foo1=bar") >= 0
 
         cookie = interact_netscape(c, "http://example/")
-        assert string.find(cookie, "foo2=bar") >= 0 and len(c) == 2
+        assert cookie.find("foo2=bar") >= 0 and len(c) == 2
 
     def test_empty_path(self):
         from mechanize import CookieJar, Request, DefaultCookiePolicy

Modified: python-mechanize/branches/upstream/current/test/test_date.py
===================================================================
--- python-mechanize/branches/upstream/current/test/test_date.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_date.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,6 +1,6 @@
 """Tests for ClientCookie._HTTPDate."""
 
-import re, string, time
+import re, time
 from unittest import TestCase
 
 class DateTimeTests(TestCase):
@@ -68,8 +68,8 @@
 
         for s in tests:
             t = http2time(s)
-            t2 = http2time(string.lower(s))
-            t3 = http2time(string.upper(s))
+            t2 = http2time(s.lower())
+            t3 = http2time(s.upper())
 
             assert t == t2 == t3 == test_t, \
                    "'%s'  =>  %s, %s, %s (%s)" % (s, t, t2, t3, test_t)

Added: python-mechanize/branches/upstream/current/test/test_forms.doctest
===================================================================
--- python-mechanize/branches/upstream/current/test/test_forms.doctest	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_forms.doctest	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,59 @@
+Integration regression test for the case where ClientForm handled RFC 3986
+URL unparsing incorrectly (it was using "" in place of None for the
+fragment, due to continuing to support the stdlib module urlparse as
+well as mechanize._rfc3986).  Fixed in ClientForm r33622.
+
+>>> import mechanize
+>>> from mechanize._response import test_response
+
+>>> def forms():
+...     forms = []
+...     for method in ["GET", "POST"]:
+...         data = ('<form action="" method="%s">'
+...         '<input type="submit" name="s"/></form>' % method
+...         )
+...         br = mechanize.Browser()
+...         response = test_response(data, [("content-type", "text/html")])
+...         br.set_response(response)
+...         br.select_form(nr=0)
+...         forms.append(br.form)
+...     return forms
+
+>>> getform, postform = forms()
+>>> getform.click().get_full_url()
+'http://example.com/?s='
+>>> postform.click().get_full_url()
+'http://example.com/'
+
+>>> data = '<form action=""><isindex /></form>'
+>>> br = mechanize.Browser()
+>>> response = test_response(data, [("content-type", "text/html")])
+>>> br.set_response(response)
+>>> br.select_form(nr=0)
+>>> br.find_control(type="isindex").value = "blah"
+>>> br.click(type="isindex").get_full_url()
+'http://example.com/?blah'
+
+
+If something (e.g. calling .forms()) triggers parsing, and parsing
+fails, the next attempt should not succeed!  This used to happen
+because the response held by LinksFactory etc. was stale, since it had
+already been .read().  Fixed by calling Factory.set_response() on
+error.
+
+>>> import mechanize
+>>> br = mechanize.Browser()
+>>> r = mechanize._response.test_html_response("""\
+... <form>
+... <input type="text" name="foo" value="a"></input><!!!>
+... <input type="text" name="bar" value="b"></input>
+... </form>
+... """)
+>>> br.set_response(r)
+>>> try:
+...     br.select_form(nr=0)
+... except mechanize.ParseError:
+...     pass
+>>> br.select_form(nr=0)  # doctest: +IGNORE_EXCEPTION_DETAIL
+Traceback (most recent call last):
+ParseError: expected name token
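
Aside: the sticky-failure behaviour shown above suggests this defensive
pattern; url and recover are placeholders, not mechanize API:

    import mechanize

    br = mechanize.Browser()
    br.open(url)                  # placeholder: a page with untrusted HTML
    try:
        br.select_form(nr=0)
    except mechanize.ParseError:
        recover()                 # placeholder; retrying select_form on the
                                  # same response just raises ParseError again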

Added: python-mechanize/branches/upstream/current/test/test_history.doctest
===================================================================
--- python-mechanize/branches/upstream/current/test/test_history.doctest	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_history.doctest	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,12 @@
+>>> from mechanize import History
+
+If nothing has been added, .close() should work.
+
+>>> history = History()
+>>> history.close()
+
+Under some circumstances the response can be None; in that case,
+neither .add() nor .close() should raise an exception.
+
+>>> history.add(None, None)
+>>> history.close()

Added: python-mechanize/branches/upstream/current/test/test_html.doctest
===================================================================
--- python-mechanize/branches/upstream/current/test/test_html.doctest	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_html.doctest	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,163 @@
+>>> import mechanize
+>>> from mechanize._response import test_html_response
+>>> from mechanize._html import LinksFactory, FormsFactory, TitleFactory, \
+... MechanizeBs, \
+... RobustLinksFactory,  RobustFormsFactory, RobustTitleFactory
+
+mechanize.ParseError should be raised on parsing erroneous HTML.
+
+For backwards compatibility, mechanize.ParseError derives from
+exception classes that mechanize used to raise, prior to version
+0.1.6.
+
+>>> import sgmllib
+>>> import HTMLParser
+>>> import ClientForm
+>>> issubclass(mechanize.ParseError, sgmllib.SGMLParseError)
+True
+>>> issubclass(mechanize.ParseError, HTMLParser.HTMLParseError)
+True
+>>> issubclass(mechanize.ParseError, ClientForm.ParseError)
+True
+
+>>> def create_response(error=True):
+...     extra = ""
+...     if error:
+...         extra = "<!!!>"
+...     html = """\
+... <html>
+... <head>
+...     <title>Title</title>
+...     %s
+... </head>
+... <body>
+...     <p>Hello world
+... </body>
+... </html>
+... """ % extra
+...     return test_html_response(html)
+
+>>> f = LinksFactory()
+>>> f.set_response(create_response(), "http://example.com", "latin-1")
+>>> list(f.links())  # doctest: +IGNORE_EXCEPTION_DETAIL
+Traceback (most recent call last):
+ParseError:
+>>> f = FormsFactory()
+>>> f.set_response(create_response(), "latin-1")
+>>> list(f.forms())  # doctest: +IGNORE_EXCEPTION_DETAIL
+Traceback (most recent call last):
+ParseError:
+>>> f = TitleFactory()
+>>> f.set_response(create_response(), "latin-1")
+>>> f.title()  # doctest: +IGNORE_EXCEPTION_DETAIL
+Traceback (most recent call last):
+ParseError:
+
+
+Accessing attributes on Factory may also raise ParseError
+
+>>> def factory_getattr(attr_name):
+...    fact = mechanize.DefaultFactory()
+...    fact.set_response(create_response())
+...    getattr(fact, attr_name)
+>>> factory_getattr("title")  # doctest: +IGNORE_EXCEPTION_DETAIL
+Traceback (most recent call last):
+ParseError:
+>>> factory_getattr("global_form")  # doctest: +IGNORE_EXCEPTION_DETAIL
+Traceback (most recent call last):
+ParseError:
+
+
+BeautifulSoup ParseErrors:
+
+XXX If I could come up with examples that break links and forms
+parsing, I'd uncomment these!
+
+>>> def create_soup(html):
+...     r = test_html_response(html)
+...     return MechanizeBs("latin-1", r.read())
+
+#>>> f = RobustLinksFactory()
+#>>> html = """\
+#... <a href="a">
+#... <frame src="b">
+#... <a href="c">
+#... <iframe src="d">
+#... </a>
+#... </area>
+#... </frame>
+#... """
+#>>> f.set_soup(create_soup(html), "http://example.com", "latin-1")
+#>>> list(f.links())  # doctest: +IGNORE_EXCEPTION_DETAIL
+#Traceback (most recent call last):
+#ParseError:
+
+>>> html = """\
+... <table>
+... <tr><td>
+... <input name='broken'>
+... </td>
+... </form>
+... </tr>
+... </form>
+... """
+>>> f = RobustFormsFactory()
+>>> f.set_response(create_response(), "latin-1")
+>>> list(f.forms())  # doctest: +IGNORE_EXCEPTION_DETAIL
+Traceback (most recent call last):
+ParseError:
+
+#>>> f = RobustTitleFactory()
+#>>> f.set_soup(create_soup(""), "latin-1")
+#>>> f.title()  # doctest: +IGNORE_EXCEPTION_DETAIL
+#Traceback (most recent call last):
+#ParseError:
+
+
+
+Utility class for caching forms etc.
+
+>>> from mechanize._html import CachingGeneratorFunction
+
+>>> i = [1]
+>>> func = CachingGeneratorFunction(i)
+>>> list(func())
+[1]
+>>> list(func())
+[1]
+
+>>> i = [1, 2, 3]
+>>> func = CachingGeneratorFunction(i)
+>>> list(func())
+[1, 2, 3]
+
+>>> i = func()
+>>> i.next()
+1
+>>> i.next()
+2
+>>> i.next()
+3
+
+>>> i = func()
+>>> j = func()
+>>> i.next()
+1
+>>> j.next()
+1
+>>> i.next()
+2
+>>> j.next()
+2
+>>> j.next()
+3
+>>> i.next()
+3
+>>> i.next()
+Traceback (most recent call last):
+...
+StopIteration
+>>> j.next()
+Traceback (most recent call last):
+...
+StopIteration
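
For readers who don't want to dig into mechanize._html: the doctest above
pins down the CachingGeneratorFunction contract well enough to sketch a
minimal re-implementation.  This is only an illustration of the caching
behaviour being tested, not the upstream code:

    class CachingGeneratorFunction:
        """Cache an iterable's items; every call replays them from the start."""
        def __init__(self, iterable):
            self._cache = []
            self._iter = iter(iterable)

        def __call__(self):
            i = 0
            while True:
                if i < len(self._cache):
                    # Replay an item that some other iterator already pulled.
                    yield self._cache[i]
                else:
                    try:
                        item = self._iter.next()  # Python 2 iterator protocol
                    except StopIteration:
                        return  # exhausts this generator as well
                    self._cache.append(item)
                    yield item
                i += 1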

Added: python-mechanize/branches/upstream/current/test/test_html.py
===================================================================
--- python-mechanize/branches/upstream/current/test/test_html.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_html.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,100 @@
+#!/usr/bin/env python
+
+from unittest import TestCase
+
+import mechanize
+from mechanize._response import test_html_response
+
+
+class RegressionTests(TestCase):
+
+    def test_close_base_tag(self):
+        # any document containing a </base> tag used to cause an exception
+        br = mechanize.Browser()
+        response = test_html_response("</base>")
+        br.set_response(response)
+        list(br.links())
+
+    def test_bad_base_tag(self):
+        # a document with a base tag with no href used to cause an exception
+        for factory in [mechanize.DefaultFactory(), mechanize.RobustFactory()]:
+            br = mechanize.Browser(factory=factory)
+            response = test_html_response(
+                "<BASE TARGET='_main'><a href='http://example.com/'>eg</a>")
+            br.set_response(response)
+            list(br.links())
+
+
+class CachingGeneratorFunctionTests(TestCase):
+
+    def _get_simple_cgenf(self, log):
+        from mechanize._html import CachingGeneratorFunction
+        todo = []
+        for ii in range(2):
+            def work(ii=ii):
+                log.append(ii)
+                return ii
+            todo.append(work)
+        def genf():
+            for a in todo:
+                yield a()
+        return CachingGeneratorFunction(genf())
+
+    def test_cache(self):
+        log = []
+        cgenf = self._get_simple_cgenf(log)
+        for repeat in range(2):
+            for ii, jj in zip(cgenf(), range(2)):
+                self.assertEqual(ii, jj)
+            self.assertEqual(log, range(2))  # work only done once
+
+    def test_interleaved(self):
+        log = []
+        cgenf = self._get_simple_cgenf(log)
+        cgen = cgenf()
+        self.assertEqual(cgen.next(), 0)
+        self.assertEqual(log, [0])
+        cgen2 = cgenf()
+        self.assertEqual(cgen2.next(), 0)
+        self.assertEqual(log, [0])
+        self.assertEqual(cgen.next(), 1)
+        self.assertEqual(log, [0, 1])
+        self.assertEqual(cgen2.next(), 1)
+        self.assertEqual(log, [0, 1])
+        self.assertRaises(StopIteration, cgen.next)
+        self.assertRaises(StopIteration, cgen2.next)
+
+
+class UnescapeTests(TestCase):
+
+    def test_unescape_charref(self):
+        from mechanize._html import unescape_charref
+        mdash_utf8 = u"\u2014".encode("utf-8")
+        for ref, codepoint, utf8, latin1 in [
+            ("38", 38, u"&".encode("utf-8"), "&"),
+            ("x2014", 0x2014, mdash_utf8, "&#x2014;"),
+            ("8212", 8212, mdash_utf8, "&#8212;"),
+            ]:
+            self.assertEqual(unescape_charref(ref, None), unichr(codepoint))
+            self.assertEqual(unescape_charref(ref, 'latin-1'), latin1)
+            self.assertEqual(unescape_charref(ref, 'utf-8'), utf8)
+
+    def test_unescape(self):
+        import htmlentitydefs
+        from mechanize._html import unescape
+        data = "&amp; &lt; &mdash; &#8212; &#x2014;"
+        mdash_utf8 = u"\u2014".encode("utf-8")
+        ue = unescape(data, htmlentitydefs.name2codepoint, "utf-8")
+        self.assertEqual("& < %s %s %s" % ((mdash_utf8,)*3), ue)
+
+        for text, expect in [
+            ("&a&amp;", "&a&"),
+            ("a&amp;", "a&"),
+            ]:
+            got = unescape(text, htmlentitydefs.name2codepoint, "latin-1")
+            self.assertEqual(got, expect)
+
+
+if __name__ == "__main__":
+    import unittest
+    unittest.main()
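
As a usage note for the helpers tested above: unescape() takes the raw
string, a name-to-codepoint mapping, and a target encoding, as the test
method shows.  A small illustrative call, consistent with those
assertions:

    >>> import htmlentitydefs
    >>> from mechanize._html import unescape
    >>> unescape("a&amp;b &lt; c", htmlentitydefs.name2codepoint, "latin-1")
    'a&b < c'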

Deleted: python-mechanize/branches/upstream/current/test/test_mechanize.py
===================================================================
--- python-mechanize/branches/upstream/current/test/test_mechanize.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_mechanize.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,850 +0,0 @@
-#!/usr/bin/env python
-
-import sys, random
-from unittest import TestCase
-import StringIO, re, UserDict, urllib2
-
-import mechanize
-FACTORY_CLASSES = [mechanize.DefaultFactory]
-try:
-    import BeautifulSoup
-except ImportError:
-    import warnings
-    warnings.warn("skipping tests involving BeautifulSoup")
-else:
-    FACTORY_CLASSES.append(mechanize.RobustFactory)
-
-
-def test_password_manager(self):
-    """
-    >>> mgr = mechanize.HTTPProxyPasswordMgr()
-    >>> add = mgr.add_password
-
-    >>> add("Some Realm", "http://example.com/", "joe", "password")
-    >>> add("Some Realm", "http://example.com/ni", "ni", "ni")
-    >>> add("c", "http://example.com/foo", "foo", "ni")
-    >>> add("c", "http://example.com/bar", "bar", "nini")
-    >>> add("b", "http://example.com/", "first", "blah")
-    >>> add("b", "http://example.com/", "second", "spam")
-    >>> add("a", "http://example.com", "1", "a")
-    >>> add("Some Realm", "http://c.example.com:3128", "3", "c")
-    >>> add("Some Realm", "d.example.com", "4", "d")
-    >>> add("Some Realm", "e.example.com:3128", "5", "e")
-
-    >>> mgr.find_user_password("Some Realm", "example.com")
-    ('joe', 'password')
-    >>> mgr.find_user_password("Some Realm", "http://example.com")
-    ('joe', 'password')
-    >>> mgr.find_user_password("Some Realm", "http://example.com/")
-    ('joe', 'password')
-    >>> mgr.find_user_password("Some Realm", "http://example.com/spam")
-    ('joe', 'password')
-    >>> mgr.find_user_password("Some Realm", "http://example.com/spam/spam")
-    ('joe', 'password')
-    >>> mgr.find_user_password("c", "http://example.com/foo")
-    ('foo', 'ni')
-    >>> mgr.find_user_password("c", "http://example.com/bar")
-    ('bar', 'nini')
-
-    Currently, we use the highest-level path where more than one match:
-
-    >>> mgr.find_user_password("Some Realm", "http://example.com/ni")
-    ('joe', 'password')
-
-    Use latest add_password() in case of conflict:
-
-    >>> mgr.find_user_password("b", "http://example.com/")
-    ('second', 'spam')
-
-    No special relationship between a.example.com and example.com:
-
-    >>> mgr.find_user_password("a", "http://example.com/")
-    ('1', 'a')
-    >>> mgr.find_user_password("a", "http://a.example.com/")
-    (None, None)
-
-    Ports:
-
-    >>> mgr.find_user_password("Some Realm", "c.example.com")
-    (None, None)
-    >>> mgr.find_user_password("Some Realm", "c.example.com:3128")
-    ('3', 'c')
-    >>> mgr.find_user_password("Some Realm", "http://c.example.com:3128")
-    ('3', 'c')
-    >>> mgr.find_user_password("Some Realm", "d.example.com")
-    ('4', 'd')
-    >>> mgr.find_user_password("Some Realm", "e.example.com:3128")
-    ('5', 'e')
-
-
-    Now features specific to HTTPProxyPasswordMgr.
-
-    Default realm:
-
-    >>> mgr.find_user_password("d", "f.example.com")
-    (None, None)
-    >>> add(None, "f.example.com", "6", "f")
-    >>> mgr.find_user_password("d", "f.example.com")
-    ('6', 'f')
-
-    Default host/port:
-
-    >>> mgr.find_user_password("e", "g.example.com")
-    (None, None)
-    >>> add("e", None, "7", "g")
-    >>> mgr.find_user_password("e", "g.example.com")
-    ('7', 'g')
-
-    Default realm and host/port:
-
-    >>> mgr.find_user_password("f", "h.example.com")
-    (None, None)
-    >>> add(None, None, "8", "h")
-    >>> mgr.find_user_password("f", "h.example.com")
-    ('8', 'h')
-
-    Default realm beats default host/port:
-
-    >>> add("d", None, "9", "i")
-    >>> mgr.find_user_password("d", "f.example.com")
-    ('6', 'f')
-
-    """
-    pass
-
-
-class CachingGeneratorFunctionTests(TestCase):
-
-    def _get_simple_cgenf(self, log):
-        from mechanize._html import CachingGeneratorFunction
-        todo = []
-        for ii in range(2):
-            def work(ii=ii):
-                log.append(ii)
-                return ii
-            todo.append(work)
-        def genf():
-            for a in todo:
-                yield a()
-        return CachingGeneratorFunction(genf())
-
-    def test_cache(self):
-        log = []
-        cgenf = self._get_simple_cgenf(log)
-        for repeat in range(2):
-            for ii, jj in zip(cgenf(), range(2)):
-                self.assertEqual(ii, jj)
-            self.assertEqual(log, range(2))  # work only done once
-
-    def test_interleaved(self):
-        log = []
-        cgenf = self._get_simple_cgenf(log)
-        cgen = cgenf()
-        self.assertEqual(cgen.next(), 0)
-        self.assertEqual(log, [0])
-        cgen2 = cgenf()
-        self.assertEqual(cgen2.next(), 0)
-        self.assertEqual(log, [0])
-        self.assertEqual(cgen.next(), 1)
-        self.assertEqual(log, [0, 1])
-        self.assertEqual(cgen2.next(), 1)
-        self.assertEqual(log, [0, 1])
-        self.assertRaises(StopIteration, cgen.next)
-        self.assertRaises(StopIteration, cgen2.next)
-
-
-class UnescapeTests(TestCase):
-
-    def test_unescape_charref(self):
-        from mechanize._html import unescape_charref
-        mdash_utf8 = u"\u2014".encode("utf-8")
-        for ref, codepoint, utf8, latin1 in [
-            ("38", 38, u"&".encode("utf-8"), "&"),
-            ("x2014", 0x2014, mdash_utf8, "&#x2014;"),
-            ("8212", 8212, mdash_utf8, "&#8212;"),
-            ]:
-            self.assertEqual(unescape_charref(ref, None), unichr(codepoint))
-            self.assertEqual(unescape_charref(ref, 'latin-1'), latin1)
-            self.assertEqual(unescape_charref(ref, 'utf-8'), utf8)
-
-    def test_unescape(self):
-        import htmlentitydefs
-        from mechanize._html import unescape
-        data = "&amp; &lt; &mdash; &#8212; &#x2014;"
-        mdash_utf8 = u"\u2014".encode("utf-8")
-        ue = unescape(data, htmlentitydefs.name2codepoint, "utf-8")
-        self.assertEqual("& < %s %s %s" % ((mdash_utf8,)*3), ue)
-
-        for text, expect in [
-            ("&a&amp;", "&a&"),
-            ("a&amp;", "a&"),
-            ]:
-            got = unescape(text, htmlentitydefs.name2codepoint, "latin-1")
-            self.assertEqual(got, expect)
-
-
-# XXX these 'mock' classes are badly in need of simplification
-class MockMethod:
-    def __init__(self, meth_name, action, handle):
-        self.meth_name = meth_name
-        self.handle = handle
-        self.action = action
-    def __call__(self, *args):
-        return apply(self.handle, (self.meth_name, self.action)+args)
-
-class MockHeaders(UserDict.UserDict):
-    def getallmatchingheaders(self, name):
-        return ["%s: %s" % (k, v) for k, v in self.data.iteritems()]
-    def getheaders(self, name):
-        return self.data.values()
-
-class MockResponse:
-    closeable_response = None
-    def __init__(self, url="http://example.com/", data=None, info=None):
-        self.url = url
-        self.fp = StringIO.StringIO(data)
-        if info is None: info = {}
-        self._info = MockHeaders(info)
-        self.source = "%d%d" % (id(self), random.randint(0, sys.maxint-1))
-    def info(self): return self._info
-    def geturl(self): return self.url
-    def read(self, size=-1): return self.fp.read(size)
-    def seek(self, whence):
-        assert whence == 0
-        self.fp.seek(0)
-    def close(self): pass
-    def __getstate__(self):
-        state = self.__dict__
-        state['source'] = self.source
-        return state
-    def __setstate__(self, state):
-        self.__dict__ = state
-
-def make_mock_handler():
-    class MockHandler:
-        processor_order = 500
-        handler_order = -1
-        def __init__(self, methods):
-            self._define_methods(methods)
-        def _define_methods(self, methods):
-            for name, action in methods:
-                if name.endswith("_open"):
-                    meth = MockMethod(name, action, self.handle)
-                else:
-                    meth = MockMethod(name, action, self.process)
-                setattr(self.__class__, name, meth)
-        def handle(self, fn_name, response, *args, **kwds):
-            self.parent.calls.append((self, fn_name, args, kwds))
-            if response:
-                if isinstance(response, urllib2.HTTPError):
-                    raise response
-                r = response
-                r.seek(0)
-            else:
-                r = MockResponse()
-            req = args[0]
-            r.url = req.get_full_url()
-            return r
-        def process(self, fn_name, action, *args, **kwds):
-            self.parent.calls.append((self, fn_name, args, kwds))
-            if fn_name.endswith("_request"):
-                return args[0]
-            else:
-                return args[1]
-        def close(self): pass
-        def add_parent(self, parent):
-            self.parent = parent
-            self.parent.calls = []
-        def __lt__(self, other):
-            if not hasattr(other, "handler_order"):
-                # Try to preserve the old behavior of having custom classes
-                # inserted after default ones (works only for custom user
-                # classes which are not aware of handler_order).
-                return True
-            return self.handler_order < other.handler_order
-    return MockHandler
-
-class TestBrowser(mechanize.Browser):
-    default_features = ["_seek"]
-    default_others = []
-    default_schemes = []
-
-
-class BrowserTests(TestCase):
-    def test_referer(self):
-        b = TestBrowser()
-        url = "http://www.example.com/"
-        r = MockResponse(url,
-"""<html>
-<head><title>Title</title></head>
-<body>
-<form name="form1">
- <input type="hidden" name="foo" value="bar"></input>
- <input type="submit"></input>
- </form>
-<a href="http://example.com/foo/bar.html" name="apples"></a>
-<a href="https://example.com/spam/eggs.html" name="secure"></a>
-<a href="blah://example.com/" name="pears"></a>
-</body>
-</html>
-""", {"content-type": "text/html"})
-        b.add_handler(make_mock_handler()([("http_open", r)]))
-
-        # Referer not added by .open()...
-        req = mechanize.Request(url)
-        b.open(req)
-        self.assert_(req.get_header("Referer") is None)
-        # ...even if we're visiting a document
-        b.open(req)
-        self.assert_(req.get_header("Referer") is None)
-        # Referer added by .click_link() and .click()
-        b.select_form("form1")
-        req2 = b.click()
-        self.assertEqual(req2.get_header("Referer"), url)
-        r2 = b.open(req2)
-        req3 = b.click_link(name="apples")
-        self.assertEqual(req3.get_header("Referer"), url+"?foo=bar")
-        # Referer not added when going from https to http URL
-        b.add_handler(make_mock_handler()([("https_open", r)]))
-        r3 = b.open(req3)
-        req4 = b.click_link(name="secure")
-        self.assertEqual(req4.get_header("Referer"),
-                         "http://example.com/foo/bar.html")
-        r4 = b.open(req4)
-        req5 = b.click_link(name="apples")
-        self.assert_(not req5.has_header("Referer"))
-        # Referer not added for non-http, non-https requests
-        b.add_handler(make_mock_handler()([("blah_open", r)]))
-        req6 = b.click_link(name="pears")
-        self.assert_(not req6.has_header("Referer"))
-        # Referer not added when going from non-http, non-https URL
-        r4 = b.open(req6)
-        req7 = b.click_link(name="apples")
-        self.assert_(not req7.has_header("Referer"))
-
-        # XXX Referer added for redirect
-
-    def test_encoding(self):
-        import mechanize
-        from StringIO import StringIO
-        import urllib, mimetools
-        # always take first encoding, since that's the one from the real HTTP
-        # headers, rather than from HTTP-EQUIV
-        b = mechanize.Browser()
-        for s, ct in [("", mechanize._html.DEFAULT_ENCODING),
-
-                      ("Foo: Bar\r\n\r\n", mechanize._html.DEFAULT_ENCODING),
-
-                      ("Content-Type: text/html; charset=UTF-8\r\n\r\n",
-                       "UTF-8"),
-
-                      ("Content-Type: text/html; charset=UTF-8\r\n"
-                       "Content-Type: text/html; charset=KOI8-R\r\n\r\n",
-                       "UTF-8"),
-                      ]:
-            msg = mimetools.Message(StringIO(s))
-            r = urllib.addinfourl(StringIO(""), msg, "http://www.example.com/")
-            b.set_response(r)
-            self.assertEqual(b.encoding(), ct)
-
-    def test_history(self):
-        import mechanize
-
-        def same_response(ra, rb):
-            return ra.source == rb.source
-
-        b = TestBrowser()
-        b.add_handler(make_mock_handler()([("http_open", None)]))
-        self.assertRaises(mechanize.BrowserStateError, b.back)
-        r1 = b.open("http://example.com/")
-        self.assertRaises(mechanize.BrowserStateError, b.back)
-        r2 = b.open("http://example.com/foo")
-        self.assert_(same_response(b.back(), r1))
-        r3 = b.open("http://example.com/bar")
-        r4 = b.open("http://example.com/spam")
-        self.assert_(same_response(b.back(), r3))
-        self.assert_(same_response(b.back(), r1))
-        self.assertRaises(mechanize.BrowserStateError, b.back)
-        # reloading does a real HTTP fetch rather than using history cache
-        r5 = b.reload()
-        self.assert_(not same_response(r5, r1))
-        # .geturl() gets fed through to b.response
-        self.assertEquals(b.geturl(), "http://example.com/")
-        # can go back n times
-        r6 = b.open("spam")
-        self.assertEquals(b.geturl(), "http://example.com/spam")
-        r7 = b.open("/spam")
-        self.assert_(same_response(b.response(), r7))
-        self.assertEquals(b.geturl(), "http://example.com/spam")
-        self.assert_(same_response(b.back(2), r5))
-        self.assertEquals(b.geturl(), "http://example.com/")
-        self.assertRaises(mechanize.BrowserStateError, b.back, 2)
-        r8 = b.open("/spam")
-
-        # even if we get a HTTPError, history and .response() should still get updated
-        error = urllib2.HTTPError("http://example.com/bad", 503, "Oops",
-                                  MockHeaders(), StringIO.StringIO())
-        b.add_handler(make_mock_handler()([("https_open", error)]))
-        self.assertRaises(urllib2.HTTPError, b.open, "https://example.com/")
-        self.assertEqual(b.response().geturl(), error.geturl())
-        self.assert_(same_response(b.back(), r8))
-
-        b.close()
-        # XXX assert BrowserStateError
-
-    def test_viewing_html(self):
-        # XXX not testing multiple Content-Type headers
-        import mechanize
-        url = "http://example.com/"
-
-        for allow_xhtml in False, True:
-            for ct, expect in [
-                (None, False),
-                ("text/plain", False),
-                ("text/html", True),
-
-                # don't try to handle XML until we can do it right!
-                ("text/xhtml", allow_xhtml),
-                ("text/xml", allow_xhtml),
-                ("application/xml", allow_xhtml),
-                ("application/xhtml+xml", allow_xhtml),
-
-                ("text/html; charset=blah", True),
-                (" text/html ; charset=ook ", True),
-                ]:
-                b = TestBrowser(mechanize.DefaultFactory(
-                    i_want_broken_xhtml_support=allow_xhtml))
-                hdrs = {}
-                if ct is not None:
-                    hdrs["Content-Type"] = ct
-                b.add_handler(make_mock_handler()([("http_open",
-                                            MockResponse(url, "", hdrs))]))
-                r = b.open(url)
-                self.assertEqual(b.viewing_html(), expect)
-
-        for allow_xhtml in False, True:
-            for ext, expect in [
-                (".htm", True),
-                (".html", True),
-
-                # don't try to handle XML until we can do it right!
-                (".xhtml", allow_xhtml),
-
-                (".html?foo=bar&a=b;whelk#kool", True),
-                (".txt", False),
-                (".xml", False),
-                ("", False),
-                ]:
-                b = TestBrowser(mechanize.DefaultFactory(
-                    i_want_broken_xhtml_support=allow_xhtml))
-                url = "http://example.com/foo"+ext
-                b.add_handler(make_mock_handler()(
-                    [("http_open", MockResponse(url, "", {}))]))
-                r = b.open(url)
-                self.assertEqual(b.viewing_html(), expect)
-
-    def test_empty(self):
-        import mechanize
-        url = "http://example.com/"
-
-        b = TestBrowser()
-        b.add_handler(make_mock_handler()([("http_open", MockResponse(url, "", {}))]))
-        r = b.open(url)
-        self.assert_(not b.viewing_html())
-        self.assertRaises(mechanize.BrowserStateError, b.links)
-        self.assertRaises(mechanize.BrowserStateError, b.forms)
-        self.assertRaises(mechanize.BrowserStateError, b.title)
-        self.assertRaises(mechanize.BrowserStateError, b.select_form)
-        self.assertRaises(mechanize.BrowserStateError, b.select_form,
-                          name="blah")
-        self.assertRaises(mechanize.BrowserStateError, b.find_link,
-                          name="blah")
-
-        b = TestBrowser()
-        r = MockResponse(url,
-"""<html>
-<head><title>Title</title></head>
-<body>
-</body>
-</html>
-""", {"content-type": "text/html"})
-        b.add_handler(make_mock_handler()([("http_open", r)]))
-        r = b.open(url)
-        self.assertEqual(b.title(), "Title")
-        self.assertEqual(len(list(b.links())), 0)
-        self.assertEqual(len(list(b.forms())), 0)
-        self.assertRaises(ValueError, b.select_form)
-        self.assertRaises(mechanize.FormNotFoundError, b.select_form,
-                          name="blah")
-        self.assertRaises(mechanize.FormNotFoundError, b.select_form,
-                          predicate=lambda x: True)
-        self.assertRaises(mechanize.LinkNotFoundError, b.find_link,
-                          name="blah")
-        self.assertRaises(mechanize.LinkNotFoundError, b.find_link,
-                          predicate=lambda x: True)
-
-    def test_forms(self):
-        for factory_class in FACTORY_CLASSES:
-            self._test_forms(factory_class())
-    def _test_forms(self, factory):
-        import mechanize
-        url = "http://example.com"
-
-        b = TestBrowser(factory=factory)
-        r = MockResponse(url,
-"""<html>
-<head><title>Title</title></head>
-<body>
-<form name="form1">
- <input type="text"></input>
- <input type="checkbox" name="cheeses" value="cheddar"></input>
- <input type="checkbox" name="cheeses" value="edam"></input>
- <input type="submit" name="one"></input>
-</form>
-<a href="http://example.com/foo/bar.html" name="apples">
-<form name="form2">
- <input type="submit" name="two">
-</form>
-</body>
-</html>
-""", {"content-type": "text/html"})
-        b.add_handler(make_mock_handler()([("http_open", r)]))
-        r = b.open(url)
-
-        forms = list(b.forms())
-        self.assertEqual(len(forms), 2)
-        for got, expect in zip([f.name for f in forms], [
-            "form1", "form2"]):
-            self.assertEqual(got, expect)
-
-        self.assertRaises(mechanize.FormNotFoundError, b.select_form, "foo")
-
-        # no form is set yet
-        self.assertRaises(AttributeError, getattr, b, "possible_items")
-        b.select_form("form1")
-        # now unknown methods are fed through to selected ClientForm.HTMLForm
-        self.assertEqual(
-            [i.name for i in b.find_control("cheeses").items],
-            ["cheddar", "edam"])
-        b["cheeses"] = ["cheddar", "edam"]
-        self.assertEqual(b.click_pairs(), [
-            ("cheeses", "cheddar"), ("cheeses", "edam"), ("one", "")])
-
-        b.select_form(nr=1)
-        self.assertEqual(b.name, "form2")
-        self.assertEqual(b.click_pairs(), [("two", "")])
-
-    def test_link_encoding(self):
-        for factory_class in FACTORY_CLASSES:
-            self._test_link_encoding(factory_class())
-    def _test_link_encoding(self, factory):
-        import urllib
-        import mechanize
-        from mechanize._html import clean_url
-        url = "http://example.com/"
-        for encoding in ["UTF-8", "latin-1"]:
-            encoding_decl = "; charset=%s" % encoding
-            b = TestBrowser(factory=factory)
-            r = MockResponse(url, """\
-<a href="http://example.com/foo/bar&mdash;&#x2014;.html"
-   name="name0&mdash;&#x2014;">blah&mdash;&#x2014;</a>
-""", #"
-{"content-type": "text/html%s" % encoding_decl})
-            b.add_handler(make_mock_handler()([("http_open", r)]))
-            r = b.open(url)
-
-            Link = mechanize.Link
-            try:
-                mdashx2 = u"\u2014".encode(encoding)*2
-            except UnicodeError:
-                mdashx2 = '&mdash;&#x2014;'
-            qmdashx2 = clean_url(mdashx2, encoding)
-            # base_url, url, text, tag, attrs
-            exp = Link(url, "http://example.com/foo/bar%s.html" % qmdashx2,
-                       "blah"+mdashx2, "a",
-                       [("href", "http://example.com/foo/bar%s.html" % mdashx2),
-                        ("name", "name0%s" % mdashx2)])
-            # nr
-            link = b.find_link()
-##             print
-##             print exp
-##             print link
-            self.assertEqual(link, exp)
-
-    def test_link_whitespace(self):
-        from mechanize import Link
-        for factory_class in FACTORY_CLASSES:
-            base_url = "http://example.com/"
-            url = "  http://example.com/foo.html%20+ "
-            stripped_url = url.strip()
-            html = '<a href="%s"></a>' % url
-            b = TestBrowser(factory=factory_class())
-            r = MockResponse(base_url, html, {"content-type": "text/html"})
-            b.add_handler(make_mock_handler()([("http_open", r)]))
-            r = b.open(base_url)
-            link = b.find_link(nr=0)
-            self.assertEqual(
-                link,
-                Link(base_url, stripped_url, "", "a", [("href", url)])
-                )
-
-    def test_links(self):
-        for factory_class in FACTORY_CLASSES:
-            self._test_links(factory_class())
-    def _test_links(self, factory):
-        import mechanize
-        from mechanize import Link
-        url = "http://example.com/"
-
-        b = TestBrowser(factory=factory)
-        r = MockResponse(url,
-"""<html>
-<head><title>Title</title></head>
-<body>
-<a href="http://example.com/foo/bar.html" name="apples"></a>
-<a name="pears"></a>
-<a href="spam" name="pears"></a>
-<area href="blah" name="foo"></area>
-<form name="form2">
- <input type="submit" name="two">
-</form>
-<frame name="name" href="href" src="src"></frame>
-<iframe name="name2" href="href" src="src"></iframe>
-<a name="name3" href="one">yada yada</a>
-<a name="pears" href="two" weird="stuff">rhubarb</a>
-<a></a>
-<iframe src="foo"></iframe>
-</body>
-</html>
-""", {"content-type": "text/html"})
-        b.add_handler(make_mock_handler()([("http_open", r)]))
-        r = b.open(url)
-
-        exp_links = [
-            # base_url, url, text, tag, attrs
-            Link(url, "http://example.com/foo/bar.html", "", "a",
-                 [("href", "http://example.com/foo/bar.html"),
-                  ("name", "apples")]),
-            Link(url, "spam", "", "a", [("href", "spam"), ("name", "pears")]),
-            Link(url, "blah", None, "area",
-                 [("href", "blah"), ("name", "foo")]),
-            Link(url, "src", None, "frame",
-                 [("name", "name"), ("href", "href"), ("src", "src")]),
-            Link(url, "src", None, "iframe",
-                 [("name", "name2"), ("href", "href"), ("src", "src")]),
-            Link(url, "one", "yada yada", "a",
-                 [("name", "name3"), ("href", "one")]),
-            Link(url, "two", "rhubarb", "a",
-                 [("name", "pears"), ("href", "two"), ("weird", "stuff")]),
-            Link(url, "foo", None, "iframe",
-                 [("src", "foo")]),
-            ]
-        links = list(b.links())
-        self.assertEqual(len(links), len(exp_links))
-        for got, expect in zip(links, exp_links):
-            self.assertEqual(got, expect)
-        # nr
-        l = b.find_link()
-        self.assertEqual(l.url, "http://example.com/foo/bar.html")
-        l = b.find_link(nr=1)
-        self.assertEqual(l.url, "spam")
-        # text
-        l = b.find_link(text="yada yada")
-        self.assertEqual(l.url, "one")
-        self.assertRaises(mechanize.LinkNotFoundError,
-                          b.find_link, text="da ya")
-        l = b.find_link(text_regex=re.compile("da ya"))
-        self.assertEqual(l.url, "one")
-        l = b.find_link(text_regex="da ya")
-        self.assertEqual(l.url, "one")
-        # name
-        l = b.find_link(name="name3")
-        self.assertEqual(l.url, "one")
-        l = b.find_link(name_regex=re.compile("oo"))
-        self.assertEqual(l.url, "blah")
-        l = b.find_link(name_regex="oo")
-        self.assertEqual(l.url, "blah")
-        # url
-        l = b.find_link(url="spam")
-        self.assertEqual(l.url, "spam")
-        l = b.find_link(url_regex=re.compile("pam"))
-        self.assertEqual(l.url, "spam")
-        l = b.find_link(url_regex="pam")
-        self.assertEqual(l.url, "spam")
-        # tag
-        l = b.find_link(tag="area")
-        self.assertEqual(l.url, "blah")
-        # predicate
-        l = b.find_link(predicate=
-                        lambda l: dict(l.attrs).get("weird") == "stuff")
-        self.assertEqual(l.url, "two")
-        # combinations
-        l = b.find_link(name="pears", nr=1)
-        self.assertEqual(l.text, "rhubarb")
-        l = b.find_link(url="src", nr=0, name="name2")
-        self.assertEqual(l.tag, "iframe")
-        self.assertEqual(l.url, "src")
-        self.assertRaises(mechanize.LinkNotFoundError, b.find_link,
-                          url="src", nr=1, name="name2")
-        l = b.find_link(tag="a", predicate=
-                        lambda l: dict(l.attrs).get("weird") == "stuff")
-        self.assertEqual(l.url, "two")
-
-        # .links()
-        self.assertEqual(list(b.links(url="src")), [
-            Link(url, url="src", text=None, tag="frame",
-                 attrs=[("name", "name"), ("href", "href"), ("src", "src")]),
-            Link(url, url="src", text=None, tag="iframe",
-                 attrs=[("name", "name2"), ("href", "href"), ("src", "src")]),
-            ])
-
-    def test_base_uri(self):
-        import mechanize
-        url = "http://example.com/"
-
-        for html, urls in [
-            (
-"""<base href="http://www.python.org/foo/">
-<a href="bar/baz.html"></a>
-<a href="/bar/baz.html"></a>
-<a href="http://example.com/bar %2f%2Fblah;/baz@~._-.html"></a>
-""",
-            [
-            "http://www.python.org/foo/bar/baz.html",
-            "http://www.python.org/bar/baz.html",
-            "http://example.com/bar%20%2f%2Fblah;/baz@~._-.html",
-            ]),
-            (
-"""<a href="bar/baz.html"></a>
-<a href="/bar/baz.html"></a>
-<a href="http://example.com/bar/baz.html"></a>
-""",
-            [
-            "http://example.com/bar/baz.html",
-            "http://example.com/bar/baz.html",
-            "http://example.com/bar/baz.html",
-            ]
-            ),
-            ]:
-            b = TestBrowser()
-            r = MockResponse(url, html, {"content-type": "text/html"})
-            b.add_handler(make_mock_handler()([("http_open", r)]))
-            r = b.open(url)
-            self.assertEqual([link.absolute_url for link in b.links()], urls)
-
-
-class ResponseTests(TestCase):
-    def test_set_response(self):
-        import copy
-        from mechanize import response_seek_wrapper
-
-        br = TestBrowser()
-        url = "http://example.com/"
-        html = """<html><body><a href="spam">click me</a></body></html>"""
-        headers = {"content-type": "text/html"}
-        r = response_seek_wrapper(MockResponse(url, html, headers))
-        br.add_handler(make_mock_handler()([("http_open", r)]))
-
-        r = br.open(url)
-        self.assertEqual(r.read(), html)
-        r.seek(0)
-        self.assertEqual(copy.copy(r).read(), html)
-        self.assertEqual(list(br.links())[0].url, "spam")
-
-        newhtml = """<html><body><a href="eggs">click me</a></body></html>"""
-
-        r.set_data(newhtml)
-        self.assertEqual(r.read(), newhtml)
-        self.assertEqual(br.response().read(), html)
-        br.response().set_data(newhtml)
-        self.assertEqual(br.response().read(), html)
-        self.assertEqual(list(br.links())[0].url, "spam")
-        r.seek(0)
-
-        br.set_response(r)
-        self.assertEqual(br.response().read(), newhtml)
-        self.assertEqual(list(br.links())[0].url, "eggs")
-
-    def test_str(self):
-        import mimetools
-        from mechanize import _util
-
-        br = TestBrowser()
-        self.assertEqual(
-            str(br),
-            "<TestBrowser (not visiting a URL)>"
-            )
-
-        fp = StringIO.StringIO('<html><form name="f"><input /></form></html>')
-        headers = mimetools.Message(
-            StringIO.StringIO("Content-type: text/html"))
-        response = _util.response_seek_wrapper(_util.closeable_response(
-            fp, headers, "http://example.com/", 200, "OK"))
-        br.set_response(response)
-        self.assertEqual(
-            str(br),
-            "<TestBrowser visiting http://example.com/>"
-            )
-
-        br.select_form(nr=0)
-        self.assertEqual(
-            str(br),
-            """\
-<TestBrowser visiting http://example.com/
- selected form:
- <f GET http://example.com/ application/x-www-form-urlencoded
-  <TextControl(<None>=)>>
->""")
-
-
-class UserAgentTests(TestCase):
-    def test_set_handled_schemes(self):
-        import mechanize
-        class MockHandlerClass(make_mock_handler()):
-            def __call__(self): return self
-        class BlahHandlerClass(MockHandlerClass): pass
-        class BlahProcessorClass(MockHandlerClass): pass
-        BlahHandler = BlahHandlerClass([("blah_open", None)])
-        BlahProcessor = BlahProcessorClass([("blah_request", None)])
-        class TestUserAgent(mechanize.UserAgent):
-            default_others = []
-            default_features = []
-            handler_classes = mechanize.UserAgent.handler_classes.copy()
-            handler_classes.update(
-                {"blah": BlahHandler, "_blah": BlahProcessor})
-        ua = TestUserAgent()
-
-        self.assertEqual(len(ua.handlers), 5)
-        ua.set_handled_schemes(["http", "https"])
-        self.assertEqual(len(ua.handlers), 2)
-        self.assertRaises(ValueError,
-            ua.set_handled_schemes, ["blah", "non-existent"])
-        self.assertRaises(ValueError,
-            ua.set_handled_schemes, ["blah", "_blah"])
-        ua.set_handled_schemes(["blah"])
-
-        req = mechanize.Request("blah://example.com/")
-        r = ua.open(req)
-        exp_calls = [("blah_open", (req,), {})]
-        assert len(ua.calls) == len(exp_calls)
-        for got, expect in zip(ua.calls, exp_calls):
-            self.assertEqual(expect, got[1:])
-
-        ua.calls = []
-        req = mechanize.Request("blah://example.com/")
-        ua._set_handler("_blah", True)
-        r = ua.open(req)
-        exp_calls = [
-            ("blah_request", (req,), {}),
-            ("blah_open", (req,), {})]
-        assert len(ua.calls) == len(exp_calls)
-        for got, expect in zip(ua.calls, exp_calls):
-            self.assertEqual(expect, got[1:])
-        ua._set_handler("_blah", True)
-
-if __name__ == "__main__":
-    import test_mechanize
-    import doctest
-    doctest.testmod(test_mechanize)
-    import unittest
-    unittest.main()

Deleted: python-mechanize/branches/upstream/current/test/test_misc.py
===================================================================
--- python-mechanize/branches/upstream/current/test/test_misc.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_misc.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,218 +0,0 @@
-"""Miscellaneous pyunit tests."""
-
-import copy
-import cStringIO, string
-from unittest import TestCase
-
-class TestUnSeekable:
-    def __init__(self, text):
-        self._file = cStringIO.StringIO(text)
-        self.log = []
-
-    def tell(self): return self._file.tell()
-
-    def seek(self, offset, whence=0): assert False
-
-    def read(self, size=-1):
-        self.log.append(("read", size))
-        return self._file.read(size)
-
-    def readline(self, size=-1):
-        self.log.append(("readline", size))
-        return self._file.readline(size)
-
-    def readlines(self, sizehint=-1):
-        self.log.append(("readlines", sizehint))
-        return self._file.readlines(sizehint)
-
-class TestUnSeekableResponse(TestUnSeekable):
-    def __init__(self, text, headers):
-        TestUnSeekable.__init__(self, text)
-        self.code = 200
-        self.msg = "OK"
-        self.headers = headers
-        self.url = "http://example.com/"
-
-    def geturl(self):
-        return self.url
-
-    def info(self):
-        return self.headers
-
-    def close(self):
-        pass
-
-
-class SeekableTests(TestCase):
-
-    text = """\
-The quick brown fox
-jumps over the lazy
-
-dog.
-
-"""
-    text_lines = map(lambda l: l+"\n", string.split(text, "\n")[:-1])
-
-    def testSeekable(self):
-        from mechanize._util import seek_wrapper
-        text = self.text
-        text_lines = self.text_lines
-
-        for ii in range(1, 6):
-            fh = TestUnSeekable(text)
-            sfh = seek_wrapper(fh)
-            test = getattr(self, "_test%d" % ii)
-            test(sfh)
-
-        # copies have independent seek positions
-        fh = TestUnSeekable(text)
-        sfh = seek_wrapper(fh)
-        self._testCopy(sfh)
-
-    def _testCopy(self, sfh):
-        sfh2 = copy.copy(sfh)
-        sfh.read(10)
-        text = self.text
-        self.assertEqual(sfh2.read(10), text[:10])
-        sfh2.seek(5)
-        self.assertEqual(sfh.read(10), text[10:20])
-        self.assertEqual(sfh2.read(10), text[5:15])
-        sfh.seek(0)
-        sfh2.seek(0)
-        return sfh2
-
-    def _test1(self, sfh):
-        text = self.text
-        text_lines = self.text_lines
-        assert sfh.read(10) == text[:10]  # calls fh.read
-        assert sfh.log[-1] == ("read", 10)  # .log delegated to fh
-        sfh.seek(0)  # doesn't call fh.seek
-        assert sfh.read(10) == text[:10]  # doesn't call fh.read
-        assert len(sfh.log) == 1
-        sfh.seek(0)
-        assert sfh.read(5) == text[:5]  # read only part of cached data
-        assert len(sfh.log) == 1
-        sfh.seek(0)
-        assert sfh.read(25) == text[:25]  # calls fh.read
-        assert sfh.log[1] == ("read", 15)
-        lines = []
-        sfh.seek(-1, 1)
-        while 1:
-            l = sfh.readline()
-            if l == "": break
-            lines.append(l)
-        assert lines == ["s over the lazy\n"]+text_lines[2:]
-        assert sfh.log[2:] == [("readline", -1)]*5
-        sfh.seek(0)
-        lines = []
-        while 1:
-            l = sfh.readline()
-            if l == "": break
-            lines.append(l)
-        assert lines == text_lines
-
-    def _test2(self, sfh):
-        text = self.text
-        sfh.read(5)
-        sfh.seek(0)
-        assert sfh.read() == text
-        assert sfh.read() == ""
-        sfh.seek(0)
-        assert sfh.read() == text
-        sfh.seek(0)
-        assert sfh.readline(5) == "The q"
-        assert sfh.read() == text[5:]
-        sfh.seek(0)
-        assert sfh.readline(5) == "The q"
-        assert sfh.readline() == "uick brown fox\n"
-
-    def _test3(self, sfh):
-        text = self.text
-        text_lines = self.text_lines
-        sfh.read(25)
-        sfh.seek(-1, 1)
-        self.assertEqual(sfh.readlines(), ["s over the lazy\n"]+text_lines[2:])
-        nr_logs = len(sfh.log)
-        sfh.seek(0)
-        assert sfh.readlines() == text_lines
-
-    def _test4(self, sfh):
-        text = self.text
-        text_lines = self.text_lines
-        count = 0
-        limit = 10
-        while count < limit:
-            if count == 5:
-                self.assertRaises(StopIteration, sfh.next)
-                break
-            else:
-                sfh.next() == text_lines[count]
-            count = count + 1
-        else:
-            assert False, "StopIteration not raised"
-
-    def _test5(self, sfh):
-        text = self.text
-        sfh.read(10)
-        sfh.seek(5)
-        self.assert_(sfh.invariant())
-        sfh.seek(0, 2)
-        self.assert_(sfh.invariant())
-        sfh.seek(0)
-        self.assertEqual(sfh.read(), text)
-
-    def testResponseSeekWrapper(self):
-        from mechanize import response_seek_wrapper
-        hdrs = {"Content-type": "text/html"}
-        r = TestUnSeekableResponse(self.text, hdrs)
-        rsw = response_seek_wrapper(r)
-        rsw2 = self._testCopy(rsw)
-        self.assert_(rsw is not rsw2)
-        self.assertEqual(rsw.info(), rsw2.info())
-        self.assert_(rsw.info() is not rsw2.info())
-
-        # should be able to close already-closed object
-        rsw2.close()
-        rsw2.close()
-
-    def testSetResponseData(self):
-        from mechanize import response_seek_wrapper
-        r = TestUnSeekableResponse(self.text, {'blah': 'yawn'})
-        rsw = response_seek_wrapper(r)
-        rsw.set_data("""\
-A Seeming somwhat more than View;
-  That doth instruct the Mind
-  In Things that ly behind,
-""")
-        self.assertEqual(rsw.read(9), "A Seeming")
-        self.assertEqual(rsw.read(13), " somwhat more")
-        rsw.seek(0)
-        self.assertEqual(rsw.read(9), "A Seeming")
-        self.assertEqual(rsw.readline(), " somwhat more than View;\n")
-        rsw.seek(0)
-        self.assertEqual(rsw.readline(), "A Seeming somwhat more than View;\n")
-        rsw.seek(-1, 1)
-        self.assertEqual(rsw.read(7), "\n  That")
-
-        r = TestUnSeekableResponse(self.text, {'blah': 'yawn'})
-        rsw = response_seek_wrapper(r)
-        rsw.set_data(self.text)
-        self._test2(rsw)
-        rsw.seek(0)
-        self._test4(rsw)
-
-    def testGetResponseData(self):
-        from mechanize import response_seek_wrapper
-        r = TestUnSeekableResponse(self.text, {'blah': 'yawn'})
-        rsw = response_seek_wrapper(r)
-
-        self.assertEqual(rsw.get_data(), self.text)
-        self._test2(rsw)
-        rsw.seek(0)
-        self._test4(rsw)
-
-
-if __name__ == "__main__":
-    import unittest
-    unittest.main()
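
The tests removed here exercised mechanize._util.seek_wrapper, which fakes
seekability over a read-only file-like object by caching whatever has been
read.  A minimal usage sketch consistent with the deleted assertions (the
Unseekable class is illustrative only):

    import cStringIO
    from mechanize._util import seek_wrapper

    class Unseekable:
        # Bare-bones file-like object: readable and tell()-able, not seekable.
        def __init__(self, text):
            self._f = cStringIO.StringIO(text)
        def read(self, size=-1):
            return self._f.read(size)
        def readline(self, size=-1):
            return self._f.readline(size)
        def readlines(self, sizehint=-1):
            return self._f.readlines(sizehint)
        def tell(self):
            return self._f.tell()

    f = seek_wrapper(Unseekable("spam and eggs\n"))
    assert f.read(4) == "spam"
    f.seek(0)  # satisfied from the cache; the wrapped object is never seek()ed
    assert f.read() == "spam and eggs\n"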

Added: python-mechanize/branches/upstream/current/test/test_opener.py
===================================================================
--- python-mechanize/branches/upstream/current/test/test_opener.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_opener.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,178 @@
+#!/usr/bin/env python
+
+import os, math, stat
+from unittest import TestCase
+
+import mechanize
+
+
+def killfile(filename):
+    try:
+        os.remove(filename)
+    except OSError:
+        if os.name=='nt':
+            try:
+                os.chmod(filename, stat.S_IWRITE)
+                os.remove(filename)
+            except OSError:
+                pass
+
+class OpenerTests(TestCase):
+
+    def test_retrieve(self):
+        # The .retrieve() method deals with a number of different cases.  In
+        # each case, .read() should be called the expected number of times, the
+        # progress callback should be called as expected, and we should end up
+        # with a filename and some headers.
+
+        class Opener(mechanize.OpenerDirector):
+            def __init__(self, content_length=None):
+                mechanize.OpenerDirector.__init__(self)
+                self.calls = []
+                self.block_size = mechanize.OpenerDirector.BLOCK_SIZE
+                self.nr_blocks = 2.5
+                self.data = int((self.block_size/8)*self.nr_blocks)*"01234567"
+                self.total_size = len(self.data)
+                self._content_length = content_length
+            def open(self, fullurl, data=None):
+                from mechanize import _response
+                self.calls.append((fullurl, data))
+                headers = [("Foo", "Bar")]
+                if self._content_length is not None:
+                    if self._content_length is True:
+                        content_length = str(len(self.data))
+                    else:
+                        content_length = str(self._content_length)
+                    headers.append(("content-length", content_length))
+                return _response.test_response(self.data, headers)
+
+        class CallbackVerifier:
+            def __init__(self, testcase, total_size, block_size):
+                self.count = 0
+                self._testcase = testcase
+                self._total_size = total_size
+                self._block_size = block_size
+            def callback(self, block_nr, block_size, total_size):
+                self._testcase.assertEqual(block_nr, self.count)
+                self._testcase.assertEqual(block_size, self._block_size)
+                self._testcase.assertEqual(total_size, self._total_size)
+                self.count += 1
+
+        # ensure we start without the test file present
+        tfn = "mechanize_test_73940ukewrl.txt"
+        killfile(tfn)
+
+        # case 1: filename supplied
+        op = Opener()
+        verif = CallbackVerifier(self, -1, op.block_size)
+        url = "http://example.com/"
+        try:
+            filename, headers = op.retrieve(
+                url, tfn, reporthook=verif.callback)
+            self.assertEqual(filename, tfn)
+            self.assertEqual(headers["foo"], 'Bar')
+            self.assertEqual(open(filename, "rb").read(), op.data)
+            self.assertEqual(len(op.calls), 1)
+            self.assertEqual(verif.count, math.ceil(op.nr_blocks) + 1)
+            op.close()
+            # .close()ing the opener does NOT remove non-temporary files
+            self.assert_(os.path.isfile(filename))
+        finally:
+            killfile(filename)
+
+        # case 2: no filename supplied, use a temporary file
+        op = Opener(content_length=True)
+        # We asked the Opener to add a content-length header to the response
+        # this time.  Verify that the total size passed to the callback in
+        # that case matches the content-length (rather than -1).
+        verif = CallbackVerifier(self, op.total_size, op.block_size)
+        url = "http://example.com/"
+        filename, headers = op.retrieve(url, reporthook=verif.callback)
+        self.assertNotEqual(filename, tfn)  # (some temp filename instead)
+        self.assertEqual(headers["foo"], 'Bar')
+        self.assertEqual(open(filename, "rb").read(), op.data)
+        self.assertEqual(len(op.calls), 1)
+        # .close()ing the opener removes temporary files
+        self.assert_(os.path.exists(filename))
+        op.close()
+        self.failIf(os.path.exists(filename))
+        self.assertEqual(verif.count, math.ceil(op.nr_blocks) + 1)
+
+        # case 3: "file:" URL with no filename supplied
+        # we DON'T create a temporary file, since there's a file there already
+        op = Opener()
+        verif = CallbackVerifier(self, -1, op.block_size)
+        tifn = "input_for_"+tfn
+        try:
+            f = open(tifn, 'wb')
+            try:
+                f.write(op.data)
+            finally:
+                f.close()
+            url = "file://" + tifn
+            filename, headers = op.retrieve(url, reporthook=verif.callback)
+            self.assertEqual(filename, None)  # this may change
+            self.assertEqual(headers["foo"], 'Bar')
+            self.assertEqual(open(tifn, "rb").read(), op.data)
+            # no .read()s took place, since we already have the disk file,
+            # and we weren't asked to write it to another filename
+            self.assertEqual(verif.count, 0)
+            op.close()
+            # .close()ing the opener does NOT remove the file!
+            self.assert_(os.path.isfile(tifn))
+        finally:
+            killfile(tifn)
+
+        # case 4: "file:" URL and filename supplied
+        # we DO create a new file in this case
+        op = Opener()
+        verif = CallbackVerifier(self, -1, op.block_size)
+        tifn = "input_for_"+tfn
+        try:
+            f = open(tifn, 'wb')
+            try:
+                f.write(op.data)
+            finally:
+                f.close()
+            url = "file://" + tifn
+            try:
+                filename, headers = op.retrieve(
+                    url, tfn, reporthook=verif.callback)
+                self.assertEqual(filename, tfn)
+                self.assertEqual(headers["foo"], 'Bar')
+                self.assertEqual(open(tifn, "rb").read(), op.data)
+                self.assertEqual(verif.count, math.ceil(op.nr_blocks) + 1)
+                op.close()
+                # .close()ing the opener does NOT remove non-temporary files
+                self.assert_(os.path.isfile(tfn))
+            finally:
+                killfile(tfn)
+        finally:
+            killfile(tifn)
+
+        # Content-Length mismatch with real file length gives ContentTooShortError
+        big = 1024*32
+        op = Opener(content_length=big)
+        verif = CallbackVerifier(self, big, op.block_size)
+        url = "http://example.com/"
+        try:
+            try:
+                op.retrieve(url, reporthook=verif.callback)
+            except mechanize.ContentTooShortError, exc:
+                filename, headers = exc.result
+                self.assertNotEqual(filename, tfn)
+                self.assertEqual(headers["foo"], 'Bar')
+                # We still read everything available and wrote it to disk,
+                # despite the exception.
+                self.assertEqual(open(filename, "rb").read(), op.data)
+                self.assertEqual(len(op.calls), 1)
+                self.assertEqual(verif.count, math.ceil(op.nr_blocks) + 1)
+                # cleanup should still take place
+                self.assert_(os.path.isfile(filename))
+                op.close()
+                self.failIf(os.path.isfile(filename))
+            else:
+                self.fail()
+        finally:
+            killfile(filename)
+

Added: python-mechanize/branches/upstream/current/test/test_password_manager.doctest
===================================================================
--- python-mechanize/branches/upstream/current/test/test_password_manager.doctest	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_password_manager.doctest	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,148 @@
+Features common to HTTPPasswordMgr and HTTPProxyPasswordMgr
+===========================================================
+
+(mgr_class is supplied through the doctest globs argument)
+
+>>> mgr = mgr_class()
+>>> add = mgr.add_password
+
+>>> add("Some Realm", "http://example.com/", "joe", "password")
+>>> add("Some Realm", "http://example.com/ni", "ni", "ni")
+>>> add("c", "http://example.com/foo", "foo", "ni")
+>>> add("c", "http://example.com/bar", "bar", "nini")
+>>> add("b", "http://example.com/", "first", "blah")
+>>> add("b", "http://example.com/", "second", "spam")
+>>> add("a", "http://example.com", "1", "a")
+>>> add("Some Realm", "http://c.example.com:3128", "3", "c")
+>>> add("Some Realm", "d.example.com", "4", "d")
+>>> add("Some Realm", "e.example.com:3128", "5", "e")
+
+>>> mgr.find_user_password("Some Realm", "example.com")
+('joe', 'password')
+>>> mgr.find_user_password("Some Realm", "http://example.com")
+('joe', 'password')
+>>> mgr.find_user_password("Some Realm", "http://example.com/")
+('joe', 'password')
+>>> mgr.find_user_password("Some Realm", "http://example.com/spam")
+('joe', 'password')
+>>> mgr.find_user_password("Some Realm", "http://example.com/spam/spam")
+('joe', 'password')
+>>> mgr.find_user_password("c", "http://example.com/foo")
+('foo', 'ni')
+>>> mgr.find_user_password("c", "http://example.com/bar")
+('bar', 'nini')
+
+Actually, this behaviour is currently undefined:
+#Currently, we use the highest-level path where more than one match:
+#
+#>>> mgr.find_user_password("Some Realm", "http://example.com/ni")
+#('joe', 'password')
+
+Use latest add_password() in case of conflict:
+
+>>> mgr.find_user_password("b", "http://example.com/")
+('second', 'spam')
+
+No special relationship between a.example.com and example.com:
+
+>>> mgr.find_user_password("a", "http://example.com/")
+('1', 'a')
+>>> mgr.find_user_password("a", "http://a.example.com/")
+(None, None)
+
+Ports:
+
+>>> mgr.find_user_password("Some Realm", "c.example.com")
+(None, None)
+>>> mgr.find_user_password("Some Realm", "c.example.com:3128")
+('3', 'c')
+>>> mgr.find_user_password("Some Realm", "http://c.example.com:3128")
+('3', 'c')
+>>> mgr.find_user_password("Some Realm", "d.example.com")
+('4', 'd')
+>>> mgr.find_user_password("Some Realm", "e.example.com:3128")
+('5', 'e')
+
+
+Default port tests
+------------------
+
+>>> mgr = mgr_class()
+>>> add = mgr.add_password
+
+The point to note here is that we can't guess the default port if there's
+no scheme.  This applies to both add_password and find_user_password.
+
+>>> add("f", "http://g.example.com:80", "10", "j")
+>>> add("g", "http://h.example.com", "11", "k")
+>>> add("h", "i.example.com:80", "12", "l")
+>>> add("i", "j.example.com", "13", "m")
+>>> mgr.find_user_password("f", "g.example.com:100")
+(None, None)
+>>> mgr.find_user_password("f", "g.example.com:80")
+('10', 'j')
+>>> mgr.find_user_password("f", "g.example.com")
+(None, None)
+>>> mgr.find_user_password("f", "http://g.example.com:100")
+(None, None)
+>>> mgr.find_user_password("f", "http://g.example.com:80")
+('10', 'j')
+>>> mgr.find_user_password("f", "http://g.example.com")
+('10', 'j')
+>>> mgr.find_user_password("g", "h.example.com")
+('11', 'k')
+>>> mgr.find_user_password("g", "h.example.com:80")
+('11', 'k')
+>>> mgr.find_user_password("g", "http://h.example.com:80")
+('11', 'k')
+>>> mgr.find_user_password("h", "i.example.com")
+(None, None)
+>>> mgr.find_user_password("h", "i.example.com:80")
+('12', 'l')
+>>> mgr.find_user_password("h", "http://i.example.com:80")
+('12', 'l')
+>>> mgr.find_user_password("i", "j.example.com")
+('13', 'm')
+>>> mgr.find_user_password("i", "j.example.com:80")
+(None, None)
+>>> mgr.find_user_password("i", "http://j.example.com")
+('13', 'm')
+>>> mgr.find_user_password("i", "http://j.example.com:80")
+(None, None)
+
+
+Features specific to HTTPProxyPasswordMgr
+=========================================
+
+Default realm:
+
+>>> mgr = mechanize.HTTPProxyPasswordMgr()
+>>> add = mgr.add_password
+
+>>> mgr.find_user_password("d", "f.example.com")
+(None, None)
+>>> add(None, "f.example.com", "6", "f")
+>>> mgr.find_user_password("d", "f.example.com")
+('6', 'f')
+
+Default host/port:
+
+>>> mgr.find_user_password("e", "g.example.com")
+(None, None)
+>>> add("e", None, "7", "g")
+>>> mgr.find_user_password("e", "g.example.com")
+('7', 'g')
+
+Default realm and host/port:
+
+>>> mgr.find_user_password("f", "h.example.com")
+(None, None)
+>>> add(None, None, "8", "h")
+>>> mgr.find_user_password("f", "h.example.com")
+('8', 'h')
+
+Default realm beats default host/port:
+
+>>> add("d", None, "9", "i")
+>>> mgr.find_user_password("d", "f.example.com")
+('6', 'f')

Added: python-mechanize/branches/upstream/current/test/test_request.doctest
===================================================================
--- python-mechanize/branches/upstream/current/test/test_request.doctest	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_request.doctest	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,66 @@
+>>> from mechanize import Request
+>>> r = Request("http://example.com/foo#frag")
+>>> r.get_selector()
+'/foo'
+
+
+Request Headers Dictionary
+--------------------------
+
+The Request.headers dictionary is not a documented interface.  It should
+stay that way, because the complete set of headers is only accessible
+through the .get_header(), .has_header(), .header_items() interface.
+However, .headers pre-dates those methods, so real code will be using
+the dictionary.
+
+The introduction of those methods in Python 2.4 was a mistake for the same
+reason: code that previously saw all (urllib2 user)-provided headers in
+.headers now sees only a subset (and the function interface is ugly and
+incomplete).  A better change would have been to replace the .headers dict
+with a dict subclass (or UserDict.DictMixin instance?) that preserved the
+.headers interface and also provided access to the "unredirected" headers.
+It's probably too late to fix that, though.
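+
+For illustration only, a minimal sketch of such a dict subclass (this
+class is hypothetical; mechanize does not provide it):
+
+    import UserDict
+
+    class HeadersDict(UserDict.DictMixin):
+        # Hypothetical view over both the ordinary and the "unredirected"
+        # header dicts; ordinary headers win on key collisions.
+        def __init__(self, headers, unredirected_hdrs):
+            self._headers = headers
+            self._unredirected = unredirected_hdrs
+        def __getitem__(self, key):
+            key = key.capitalize()
+            try:
+                return self._headers[key]
+            except KeyError:
+                return self._unredirected[key]
+        def __setitem__(self, key, value):
+            self._headers[key.capitalize()] = value
+        def __delitem__(self, key):
+            key = key.capitalize()
+            if key in self._headers:
+                del self._headers[key]
+            else:
+                del self._unredirected[key]
+        def keys(self):
+            d = self._unredirected.copy()
+            d.update(self._headers)
+            return d.keys()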
+
+
+Check .capitalize() case normalization:
+
+>>> url = "http://example.com"
+>>> Request(url, headers={"Spam-eggs": "blah"}).headers["Spam-eggs"]
+'blah'
+>>> Request(url, headers={"spam-EggS": "blah"}).headers["Spam-eggs"]
+'blah'
+
+Currently, Request(url, headers={"Spam-eggs": "blah"}).headers["Spam-Eggs"]
+raises KeyError, but that could be changed in future.
+
+
+Request Headers Methods
+-----------------------
+
+Note the case normalization of header names here, to .capitalize()-case.
+This should be preserved for backwards-compatibility.  (In the HTTP case,
+normalization to .title()-case is done by urllib2 before sending headers to
+httplib).
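+
+To illustrate the difference (these are plain string methods, not part of
+the mechanize API):
+
+>>> "spam-EggS".capitalize()
+'Spam-eggs'
+>>> "spam-EggS".title()
+'Spam-Eggs'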
+
+>>> url = "http://example.com"
+>>> r = Request(url, headers={"Spam-eggs": "blah"})
+>>> r.has_header("Spam-eggs")
+True
+>>> r.header_items()
+[('Spam-eggs', 'blah')]
+>>> r.add_header("Foo-Bar", "baz")
+>>> items = r.header_items()
+>>> items.sort()
+>>> items
+[('Foo-bar', 'baz'), ('Spam-eggs', 'blah')]
+
+Note that e.g. r.has_header("spam-EggS") is currently False, and
+r.get_header("spam-EggS") returns None, but that could be changed in
+future.
+
+>>> r.has_header("Not-there")
+False
+>>> print r.get_header("Not-there")
+None
+>>> r.get_header("Not-there", "default")
+'default'

Added: python-mechanize/branches/upstream/current/test/test_response.doctest
===================================================================
--- python-mechanize/branches/upstream/current/test/test_response.doctest	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_response.doctest	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,112 @@
+The read_complete flag lets us know if all of the wrapped file's data
+has been read.  We want to know this because Browser.back() must
+.reload() the response if not.
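+
+Roughly the check Browser.back() has to make (a sketch of the idea only,
+not mechanize's actual code; reload_fn stands in for the real reload):
+
+>>> def back_to(response, reload_fn):
+...     if not response.read_complete:
+...         response = reload_fn()
+...     return response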
+
+I've noted here the various cases where .read_complete may be set.
+
+>>> text = "To err is human, to moo, bovine.\n"*10
+>>> def get_wrapper():
+...     import cStringIO
+...     from mechanize._response import seek_wrapper
+...     f = cStringIO.StringIO(text)
+...     wr = seek_wrapper(f)
+...     return wr
+
+.read() case #1
+
+>>> wr = get_wrapper()
+>>> wr.read_complete
+False
+>>> junk = wr.read()
+>>> wr.read_complete
+True
+>>> wr.seek(0)
+>>> wr.read_complete
+True
+
+Exercise partial .read() and .readline(), and .seek() case #1
+
+>>> wr = get_wrapper()
+>>> junk = wr.read(10)
+>>> wr.read_complete
+False
+>>> junk = wr.readline()
+>>> wr.read_complete
+False
+>>> wr.seek(0, 2)
+>>> wr.read_complete
+True
+>>> wr.seek(0)
+>>> wr.read_complete
+True
+
+.readlines() case #1
+
+>>> wr = get_wrapper()
+>>> junk = wr.readlines()
+>>> wr.read_complete
+True
+>>> wr.seek(0)
+>>> wr.read_complete
+True
+
+.seek() case #2
+
+>>> wr = get_wrapper()
+>>> wr.seek(10)
+>>> wr.read_complete
+False
+>>> wr.seek(1000000)
+
+.read() case #2
+
+>>> wr = get_wrapper()
+>>> junk = wr.read(1000000)
+>>> wr.read_complete  # we read to the end, but don't know it yet
+False
+>>> junk = wr.read(10)
+>>> wr.read_complete
+True
+
+.readline() case #1
+
+>>> wr = get_wrapper()
+>>> junk = wr.read(len(text)-10)
+>>> wr.read_complete
+False
+>>> junk = wr.readline()
+>>> wr.read_complete  # we read to the end, but don't know it yet
+False
+>>> junk = wr.readline()
+>>> wr.read_complete
+True
+
+Test copying and sharing of .read_complete state
+
+>>> import copy
+>>> wr = get_wrapper()
+>>> wr2 = copy.copy(wr)
+>>> wr.read_complete
+False
+>>> wr2.read_complete
+False
+>>> junk = wr2.read()
+>>> wr.read_complete
+True
+>>> wr2.read_complete
+True
+
+
+Fix from -r36082: .read() after .close() used to break
+.read_complete state
+
+>>> from mechanize._response import test_response
+>>> r = test_response(text)
+>>> junk = r.read(64)
+>>> r.close()
+>>> r.read_complete
+False
+>>> r.read()
+''
+>>> r.read_complete
+False

Added: python-mechanize/branches/upstream/current/test/test_response.py
===================================================================
--- python-mechanize/branches/upstream/current/test/test_response.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_response.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,218 @@
+"""Tests for mechanize._response.seek_wrapper and friends."""
+
+import copy
+import cStringIO
+from unittest import TestCase
+
+class TestUnSeekable:
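+    # A file-like object that logs each read call and refuses to seek, so
+    # the tests can check exactly what seek_wrapper delegates to it.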
+    def __init__(self, text):
+        self._file = cStringIO.StringIO(text)
+        self.log = []
+
+    def tell(self): return self._file.tell()
+
+    def seek(self, offset, whence=0): assert False
+
+    def read(self, size=-1):
+        self.log.append(("read", size))
+        return self._file.read(size)
+
+    def readline(self, size=-1):
+        self.log.append(("readline", size))
+        return self._file.readline(size)
+
+    def readlines(self, sizehint=-1):
+        self.log.append(("readlines", sizehint))
+        return self._file.readlines(sizehint)
+
+class TestUnSeekableResponse(TestUnSeekable):
+    def __init__(self, text, headers):
+        TestUnSeekable.__init__(self, text)
+        self.code = 200
+        self.msg = "OK"
+        self.headers = headers
+        self.url = "http://example.com/"
+
+    def geturl(self):
+        return self.url
+
+    def info(self):
+        return self.headers
+
+    def close(self):
+        pass
+
+
+class SeekableTests(TestCase):
+
+    text = """\
+The quick brown fox
+jumps over the lazy
+
+dog.
+
+"""
+    text_lines = map(lambda l: l+"\n", text.split("\n")[:-1])
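+    # i.e. the lines of `text`, each with its trailing "\n" restored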
+
+    def testSeekable(self):
+        from mechanize._response import seek_wrapper
+        text = self.text
+        text_lines = self.text_lines
+
+        for ii in range(1, 6):
+            fh = TestUnSeekable(text)
+            sfh = seek_wrapper(fh)
+            test = getattr(self, "_test%d" % ii)
+            test(sfh)
+
+        # copies have independent seek positions
+        fh = TestUnSeekable(text)
+        sfh = seek_wrapper(fh)
+        self._testCopy(sfh)
+
+    def _testCopy(self, sfh):
+        sfh2 = copy.copy(sfh)
+        sfh.read(10)
+        text = self.text
+        self.assertEqual(sfh2.read(10), text[:10])
+        sfh2.seek(5)
+        self.assertEqual(sfh.read(10), text[10:20])
+        self.assertEqual(sfh2.read(10), text[5:15])
+        sfh.seek(0)
+        sfh2.seek(0)
+        return sfh2
+
+    def _test1(self, sfh):
+        text = self.text
+        text_lines = self.text_lines
+        assert sfh.read(10) == text[:10]  # calls fh.read
+        assert sfh.log[-1] == ("read", 10)  # .log delegated to fh
+        sfh.seek(0)  # doesn't call fh.seek
+        assert sfh.read(10) == text[:10]  # doesn't call fh.read
+        assert len(sfh.log) == 1
+        sfh.seek(0)
+        assert sfh.read(5) == text[:5]  # read only part of cached data
+        assert len(sfh.log) == 1
+        sfh.seek(0)
+        assert sfh.read(25) == text[:25]  # calls fh.read
+        assert sfh.log[1] == ("read", 15)
+        lines = []
+        sfh.seek(-1, 1)
+        while 1:
+            l = sfh.readline()
+            if l == "": break
+            lines.append(l)
+        assert lines == ["s over the lazy\n"]+text_lines[2:]
+        assert sfh.log[2:] == [("readline", -1)]*5
+        sfh.seek(0)
+        lines = []
+        while 1:
+            l = sfh.readline()
+            if l == "": break
+            lines.append(l)
+        assert lines == text_lines
+
+    def _test2(self, sfh):
+        text = self.text
+        sfh.read(5)
+        sfh.seek(0)
+        assert sfh.read() == text
+        assert sfh.read() == ""
+        sfh.seek(0)
+        assert sfh.read() == text
+        sfh.seek(0)
+        assert sfh.readline(5) == "The q"
+        assert sfh.read() == text[5:]
+        sfh.seek(0)
+        assert sfh.readline(5) == "The q"
+        assert sfh.readline() == "uick brown fox\n"
+
+    def _test3(self, sfh):
+        text = self.text
+        text_lines = self.text_lines
+        sfh.read(25)
+        sfh.seek(-1, 1)
+        self.assertEqual(sfh.readlines(), ["s over the lazy\n"]+text_lines[2:])
+        nr_logs = len(sfh.log)
+        sfh.seek(0)
+        assert sfh.readlines() == text_lines
+
+    def _test4(self, sfh):
+        text = self.text
+        text_lines = self.text_lines
+        count = 0
+        limit = 10
+        while count < limit:
+            if count == 5:
+                self.assertRaises(StopIteration, sfh.next)
+                break
+            else:
+                assert sfh.next() == text_lines[count]
+            count = count + 1
+        else:
+            assert False, "StopIteration not raised"
+
+    def _test5(self, sfh):
+        text = self.text
+        sfh.read(10)
+        sfh.seek(5)
+        self.assert_(sfh.invariant())
+        sfh.seek(0, 2)
+        self.assert_(sfh.invariant())
+        sfh.seek(0)
+        self.assertEqual(sfh.read(), text)
+
+    def testResponseSeekWrapper(self):
+        from mechanize import response_seek_wrapper
+        hdrs = {"Content-type": "text/html"}
+        r = TestUnSeekableResponse(self.text, hdrs)
+        rsw = response_seek_wrapper(r)
+        rsw2 = self._testCopy(rsw)
+        self.assert_(rsw is not rsw2)
+        self.assertEqual(rsw.info(), rsw2.info())
+        self.assert_(rsw.info() is not rsw2.info())
+
+        # should be able to close already-closed object
+        rsw2.close()
+        rsw2.close()
+
+    def testSetResponseData(self):
+        from mechanize import response_seek_wrapper
+        r = TestUnSeekableResponse(self.text, {'blah': 'yawn'})
+        rsw = response_seek_wrapper(r)
+        rsw.set_data("""\
+A Seeming somwhat more than View;
+  That doth instruct the Mind
+  In Things that ly behind,
+""")
+        self.assertEqual(rsw.read(9), "A Seeming")
+        self.assertEqual(rsw.read(13), " somwhat more")
+        rsw.seek(0)
+        self.assertEqual(rsw.read(9), "A Seeming")
+        self.assertEqual(rsw.readline(), " somwhat more than View;\n")
+        rsw.seek(0)
+        self.assertEqual(rsw.readline(), "A Seeming somwhat more than View;\n")
+        rsw.seek(-1, 1)
+        self.assertEqual(rsw.read(7), "\n  That")
+
+        r = TestUnSeekableResponse(self.text, {'blah': 'yawn'})
+        rsw = response_seek_wrapper(r)
+        rsw.set_data(self.text)
+        self._test2(rsw)
+        rsw.seek(0)
+        self._test4(rsw)
+
+    def testGetResponseData(self):
+        from mechanize import response_seek_wrapper
+        r = TestUnSeekableResponse(self.text, {'blah': 'yawn'})
+        rsw = response_seek_wrapper(r)
+
+        self.assertEqual(rsw.get_data(), self.text)
+        self._test2(rsw)
+        rsw.seek(0)
+        self._test4(rsw)
+
+
+if __name__ == "__main__":
+    import unittest
+    unittest.main()

Added: python-mechanize/branches/upstream/current/test/test_rfc3986.doctest
===================================================================
--- python-mechanize/branches/upstream/current/test/test_rfc3986.doctest	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_rfc3986.doctest	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,156 @@
+>>> from mechanize._rfc3986 import urlsplit, urljoin, remove_dot_segments
+
+Some common cases
+
+>>> urlsplit("http://example.com/spam/eggs/spam.html?apples=pears&a=b#foo")
+('http', 'example.com', '/spam/eggs/spam.html', 'apples=pears&a=b', 'foo')
+>>> urlsplit("http://example.com/spam.html#foo")
+('http', 'example.com', '/spam.html', None, 'foo')
+>>> urlsplit("ftp://example.com/foo.gif")
+('ftp', 'example.com', '/foo.gif', None, None)
+>>> urlsplit('ftp://joe:password@example.com:port')
+('ftp', 'joe:password@example.com:port', '', None, None)
+>>> urlsplit("mailto:jjl@pobox.com")
+('mailto', None, 'jjl@pobox.com', None, None)
+
+The five path productions
+
+path-abempty:
+
+>>> urlsplit("http://www.example.com")
+('http', 'www.example.com', '', None, None)
+>>> urlsplit("http://www.example.com/foo")
+('http', 'www.example.com', '/foo', None, None)
+
+path-absolute:
+
+>>> urlsplit("a:/")
+('a', None, '/', None, None)
+>>> urlsplit("a:/b:/c/")
+('a', None, '/b:/c/', None, None)
+
+path-noscheme:
+
+>>> urlsplit("a:b/:c/")
+('a', None, 'b/:c/', None, None)
+
+path-rootless:
+
+>>> urlsplit("a:b:/c/")
+('a', None, 'b:/c/', None, None)
+
+path-empty:
+
+>>> urlsplit("quack:")
+('quack', None, '', None, None)
+
+
+>>> remove_dot_segments("/a/b/c/./../../g")
+'/a/g'
+>>> remove_dot_segments("mid/content=5/../6")
+'mid/6'
+>>> remove_dot_segments("/b/c/.")
+'/b/c/'
+>>> remove_dot_segments("/b/c/./.")
+'/b/c/'
+>>> remove_dot_segments(".")
+''
+>>> remove_dot_segments("/.")
+'/'
+>>> remove_dot_segments("./")
+''
+
+
+Examples from RFC 3986 section 5.4
+
+Normal Examples
+
+>>> base = "http://a/b/c/d;p?q"
+>>> def join(uri): return urljoin(base, uri)
+>>> join("g:h")
+'g:h'
+>>> join("g")
+'http://a/b/c/g'
+>>> join("./g")
+'http://a/b/c/g'
+>>> join("g/")
+'http://a/b/c/g/'
+>>> join("/g")
+'http://a/g'
+>>> join("//g")
+'http://g'
+>>> join("?y")
+'http://a/b/c/d;p?y'
+>>> join("g?y")
+'http://a/b/c/g?y'
+>>> join("#s")
+'http://a/b/c/d;p?q#s'
+>>> join("g#s")
+'http://a/b/c/g#s'
+>>> join("g?y#s")
+'http://a/b/c/g?y#s'
+>>> join(";x")
+'http://a/b/c/;x'
+>>> join("g;x")
+'http://a/b/c/g;x'
+>>> join("g;x?y#s")
+'http://a/b/c/g;x?y#s'
+>>> join("")
+'http://a/b/c/d;p?q'
+>>> join(".")
+'http://a/b/c/'
+>>> join("./")
+'http://a/b/c/'
+>>> join("..")
+'http://a/b/'
+>>> join("../")
+'http://a/b/'
+>>> join("../g")
+'http://a/b/g'
+>>> join("../..")
+'http://a/'
+>>> join("../../")
+'http://a/'
+>>> join("../../g")
+'http://a/g'
+
+Abnormal Examples
+
+>>> join("../../../g")
+'http://a/g'
+>>> join("../../../../g")
+'http://a/g'
+>>> join("/./g")
+'http://a/g'
+>>> join("/../g")
+'http://a/g'
+>>> join("g.")
+'http://a/b/c/g.'
+>>> join(".g")
+'http://a/b/c/.g'
+>>> join("g..")
+'http://a/b/c/g..'
+>>> join("..g")
+'http://a/b/c/..g'
+>>> join("./../g")
+'http://a/b/g'
+>>> join("./g/.")
+'http://a/b/c/g/'
+>>> join("g/./h")
+'http://a/b/c/g/h'
+>>> join("g/../h")
+'http://a/b/c/h'
+>>> join("g;x=1/./y")
+'http://a/b/c/g;x=1/y'
+>>> join("g;x=1/../y")
+'http://a/b/c/y'
+>>> join("g?y/./x")
+'http://a/b/c/g?y/./x'
+>>> join("g?y/../x")
+'http://a/b/c/g?y/../x'
+>>> join("g#s/./x")
+'http://a/b/c/g#s/./x'
+>>> join("g#s/../x")
+'http://a/b/c/g#s/../x'
+>>> join("http:g")
+'http://a/b/c/g'

Modified: python-mechanize/branches/upstream/current/test/test_urllib2.py
===================================================================
--- python-mechanize/branches/upstream/current/test/test_urllib2.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_urllib2.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -1,26 +1,33 @@
-"""Tests for ClientCookie._urllib2_support (and for urllib2)."""
+"""Tests for urllib2-level functionality.
 
+This is made up of:
+
+ - tests that I've contributed back to stdlib test_urllib2.py
+
+ - tests for features that aren't in urllib2, but work at the level of the
+   interfaces exported by urllib2, especially the urllib2 "handler"
+   interface, but *excluding* the extended interfaces provided by
+   mechanize.UserAgent and mechanize.Browser.
+
+"""
+
 # XXX
 # Request (I'm too lazy)
 # CacheFTPHandler (hard to write)
-# parse_keqv_list, parse_http_list (I'm leaving this for Anthony Baxter
-#  and Greg Stein, since they're doing Digest Authentication)
-# Authentication stuff (ditto)
-# ProxyHandler, CustomProxy, CustomProxyHandler (I don't use a proxy)
+# parse_keqv_list, parse_http_list
 # GopherHandler (haven't used gopher for a decade or so...)
 
-import unittest, StringIO, os, sys, UserDict
+import unittest, StringIO, os, sys, UserDict, httplib
 
 import mechanize
 
-from mechanize._urllib2_support import Request, AbstractHTTPHandler, \
-     build_opener, parse_head, urlopen
-from mechanize._util import startswith
+from mechanize._http import AbstractHTTPHandler, parse_head
+from mechanize._response import test_response
 from mechanize import HTTPRedirectHandler, HTTPRequestUpgradeProcessor, \
      HTTPEquivProcessor, HTTPRefreshProcessor, SeekableProcessor, \
      HTTPCookieProcessor, HTTPRefererProcessor, \
      HTTPErrorProcessor, HTTPHandler
-from mechanize import OpenerDirector
+from mechanize import OpenerDirector, build_opener, urlopen, Request
 
 ## from logging import getLogger, DEBUG
 ## l = getLogger("mechanize")
@@ -38,14 +45,19 @@
     def readline(self, count=None): pass
     def close(self): pass
 
-class MockHeaders(UserDict.UserDict):
-    def getallmatchingheaders(self, name):
-        r = []
-        for k, v in self.data.items():
-            if k.lower() == name:
-                r.append("%s: %s" % (k, v))
-        return r
+def http_message(mapping):
+    """
+    >>> http_message({"Content-Type": "text/html"}).items()
+    [('content-type', 'text/html')]
 
+    """
+    f = []
+    for kv in mapping.items():
+        f.append("%s: %s" % kv)
+    f.append("")
+    msg = httplib.HTTPMessage(StringIO.StringIO("\r\n".join(f)))
+    return msg
+
 class MockResponse(StringIO.StringIO):
     def __init__(self, code, msg, headers, data, url=None):
         StringIO.StringIO.__init__(self, data)
@@ -90,7 +102,7 @@
             return res
         elif action == "return request":
             return Request("http://blah/")
-        elif startswith(action, "error"):
+        elif action.startswith("error"):
             code = int(action[-3:])
             res = MockResponse(200, "OK", {}, "")
             return self.parent.error("http", args[0], res, code, "", {})
@@ -391,6 +403,8 @@
         return self
     def set_url(self, url):
         self.calls.append(("set_url", url))
+    def set_opener(self, opener):
+        self.calls.append(("set_opener", opener))
     def read(self):
         self.calls.append("read")
     def can_fetch(self, ua, url):
@@ -666,8 +680,10 @@
             return  # skip test
         else:
             from mechanize import HTTPRobotRulesProcessor
+        opener = OpenerDirector()
         rfpc = MockRobotFileParserClass()
         h = HTTPRobotRulesProcessor(rfpc)
+        opener.add_handler(h)
 
         url = "http://example.com:80/foo/bar.html"
         req = Request(url)
@@ -676,6 +692,7 @@
         h.http_request(req)
         self.assert_(rfpc.calls == [
             "__call__",
+            ("set_opener", opener),
             ("set_url", "http://example.com:80/robots.txt"),
             "read",
             ("can_fetch", "", url),
@@ -715,6 +732,7 @@
         h.http_request(req)
         self.assert_(rfpc.calls == [
             "__call__",
+            ("set_opener", opener),
             ("set_url", "http://example.com/robots.txt"),
             "read",
             ("can_fetch", "", url),
@@ -726,10 +744,17 @@
         h.http_request(req)
         self.assert_(rfpc.calls == [
             "__call__",
+            ("set_opener", opener),
             ("set_url", "https://example.org/robots.txt"),
             "read",
             ("can_fetch", "", url),
             ])
+        # non-HTTP URL -> ignore robots.txt
+        rfpc.clear()
+        url = "ftp://example.com/"
+        req = Request(url)
+        h.http_request(req)
+        self.assert_(rfpc.calls == [])
 
     def test_cookies(self):
         cj = MockCookieJar()
@@ -769,28 +794,38 @@
         req = Request("http://example.com/")
         r = MockResponse(
             200, "OK",
-            MockHeaders({"Foo": "Bar", "Content-type": "text/html"}),
+            http_message({"Foo": "Bar",
+                          "Content-type": "text/html",
+                          "Refresh": "blah"}),
             '<html><head>'
             '<meta http-equiv="Refresh" content="spam&amp;eggs">'
-            '</head></html>'
+            '</head></html>',
+            "http://example.com/"
             )
         newr = h.http_response(req, r)
         headers = newr.info()
+        self.assert_(headers["Foo"] == "Bar")
         self.assert_(headers["Refresh"] == "spam&eggs")
-        self.assert_(headers["Foo"] == "Bar")
+        self.assert_(headers.getheaders("Refresh") == ["blah", "spam&eggs"])
 
     def test_refresh(self):
         # XXX test processor constructor optional args
         h = HTTPRefreshProcessor(max_time=None, honor_time=False)
 
-        for val in ['0; url="http://example.com/foo/"', "2"]:
+        for val, valid in [
+            ('0; url="http://example.com/foo/"', True),
+            ("2", True),
+            # in the past, this failed with UnboundLocalError
+            ('0; "http://example.com/foo/"', False),
+            ]:
             o = h.parent = MockOpener()
             req = Request("http://example.com/")
-            headers = MockHeaders({"refresh": val})
-            r = MockResponse(200, "OK", headers, "")
+            headers = http_message({"refresh": val})
+            r = MockResponse(200, "OK", headers, "", "http://example.com/")
             newr = h.http_response(req, r)
-            self.assertEqual(o.proto, "http")
-            self.assertEqual(o.args, (req, r, "refresh", "OK", headers))
+            if valid:
+                self.assertEqual(o.proto, "http")
+                self.assertEqual(o.args, (req, r, "refresh", "OK", headers))
 
     def test_redirect(self):
         from_url = "http://example.com/a.html"
@@ -808,7 +843,7 @@
                 req.origin_req_host = "example.com"  # XXX
                 try:
                     method(req, MockFile(), code, "Blah",
-                           MockHeaders({"location": to_url}))
+                           http_message({"location": to_url}))
                 except mechanize.HTTPError:
                     # 307 in response to POST requires user OK
                     self.assert_(code == 307 and data is not None)
@@ -824,7 +859,7 @@
         # loop detection
         def redirect(h, req, url=to_url):
             h.http_error_302(req, MockFile(), 302, "Blah",
-                             MockHeaders({"location": url}))
+                             http_message({"location": url}))
         # Note that the *original* request shares the same record of
         # redirections with the sub-requests caused by the redirections.
 
@@ -851,6 +886,39 @@
         except mechanize.HTTPError:
             self.assert_(count == HTTPRedirectHandler.max_redirections)
 
+    def test_redirect_bad_uri(self):
+        # bad URIs should be cleaned up before redirection
+        from mechanize._response import test_html_response
+        from_url = "http://example.com/a.html"
+        bad_to_url = "http://example.com/b. |html"
+        good_to_url = "http://example.com/b.%20%7Chtml"
+
+        h = HTTPRedirectHandler()
+        o = h.parent = MockOpener()
+
+        req = Request(from_url)
+        h.http_error_302(req, test_html_response(), 302, "Blah",
+                         http_message({"location": bad_to_url}),
+                         )
+        self.assertEqual(o.req.get_full_url(), good_to_url)
+
+    def test_refresh_bad_uri(self):
+        # bad URIs should be cleaned up before redirection
+        from mechanize._response import test_html_response
+        from_url = "http://example.com/a.html"
+        bad_to_url = "http://example.com/b. |html"
+        good_to_url = "http://example.com/b.%20%7Chtml"
+
+        h = HTTPRefreshProcessor(max_time=None, honor_time=False)
+        o = h.parent = MockOpener()
+
+        req = Request("http://example.com/")
+        r = test_html_response(
+            headers=[("refresh", '0; url="%s"' % bad_to_url)])
+        newr = h.http_response(req, r)
+        headers = o.args[-1]
+        self.assertEqual(headers["Location"], good_to_url)
+
     def test_cookie_redirect(self):
         # cookies shouldn't leak into redirected requests
         import mechanize
@@ -896,6 +964,8 @@
         realm = "ACME Widget Store"
         http_handler = MockHTTPHandler(
             401, 'WWW-Authenticate: Basic realm="%s"\r\n\r\n' % realm)
+        opener.add_handler(auth_handler)
+        opener.add_handler(http_handler)
         self._test_basic_auth(opener, auth_handler, "Authorization",
                               realm, http_handler, password_manager,
                               "http://acme.example.com/protected",
@@ -911,6 +981,8 @@
         realm = "ACME Networks"
         http_handler = MockHTTPHandler(
             407, 'Proxy-Authenticate: Basic realm="%s"\r\n\r\n' % realm)
+        opener.add_handler(auth_handler)
+        opener.add_handler(http_handler)
         self._test_basic_auth(opener, auth_handler, "Proxy-authorization",
                               realm, http_handler, password_manager,
                               "http://acme.example.com:3128/protected",
@@ -922,29 +994,54 @@
         # response (http://python.org/sf/1479302), where it should instead
         # return None to allow another handler (especially
         # HTTPBasicAuthHandler) to handle the response.
+
+        # Also (http://python.org/sf/1479302, RFC 2617 section 1.2), we must
+        # try digest first (since it's the strongest auth scheme), so we record
+        # order of calls here to check digest comes first:
+        class RecordingOpenerDirector(OpenerDirector):
+            def __init__(self):
+                OpenerDirector.__init__(self)
+                self.recorded = []
+            def record(self, info):
+                self.recorded.append(info)
         class TestDigestAuthHandler(mechanize.HTTPDigestAuthHandler):
-            handler_order = 400  # strictly before HTTPBasicAuthHandler
-        opener = OpenerDirector()
+            def http_error_401(self, *args, **kwds):
+                self.parent.record("digest")
+                mechanize.HTTPDigestAuthHandler.http_error_401(self,
+                                                             *args, **kwds)
+        class TestBasicAuthHandler(mechanize.HTTPBasicAuthHandler):
+            def http_error_401(self, *args, **kwds):
+                self.parent.record("basic")
+                mechanize.HTTPBasicAuthHandler.http_error_401(self,
+                                                            *args, **kwds)
+
+        opener = RecordingOpenerDirector()
         password_manager = MockPasswordManager()
         digest_handler = TestDigestAuthHandler(password_manager)
-        basic_handler = mechanize.HTTPBasicAuthHandler(password_manager)
-        opener.add_handler(digest_handler)
+        basic_handler = TestBasicAuthHandler(password_manager)
         realm = "ACME Networks"
         http_handler = MockHTTPHandler(
             401, 'WWW-Authenticate: Basic realm="%s"\r\n\r\n' % realm)
+        opener.add_handler(digest_handler)
+        opener.add_handler(basic_handler)
+        opener.add_handler(http_handler)
+        opener._maybe_reindex_handlers()
+
+        # check basic auth isn't blocked by digest handler failing
         self._test_basic_auth(opener, basic_handler, "Authorization",
                               realm, http_handler, password_manager,
                               "http://acme.example.com/protected",
                               "http://acme.example.com/protected",
                               )
+        # check digest was tried before basic (twice, because
+        # _test_basic_auth called .open() twice)
+        self.assertEqual(opener.recorded, ["digest", "basic"]*2)
 
     def _test_basic_auth(self, opener, auth_handler, auth_header,
                          realm, http_handler, password_manager,
                          request_url, protected_url):
         import base64, httplib
         user, password = "wile", "coyote"
-        opener.add_handler(auth_handler)
-        opener.add_handler(http_handler)
 
         # .add_password() fed through to password manager
         auth_handler.add_password(realm, request_url, user, password)
@@ -995,7 +1092,10 @@
             <meta http-equiv="moo" content="cow">
             </html>
             """,
-             [("refresh", "1; http://example.com/"), ("foo", "bar")])
+             [("refresh", "1; http://example.com/"), ("foo", "bar")]),
+            ("""<meta http-equiv="refresh">
+            """,
+             [])
             ]
         for html, result in htmls:
             self.assertEqual(parse_head(StringIO.StringIO(html), HeadParser()), result)
@@ -1026,11 +1126,10 @@
             self._count = self._count + 1
             msg = mimetools.Message(StringIO(self.headers))
             return self.parent.error(
-                "http", req, MockFile(), self.code, "Blah", msg)
+                "http", req, test_response(), self.code, "Blah", msg)
         else:
             self.req = req
-            msg = mimetools.Message(StringIO("\r\n\r\n"))
-            return MockResponse(200, "OK", msg, "", req.get_full_url())
+            return test_response("", [], req.get_full_url())
 
 
 class MyHTTPHandler(HTTPHandler): pass
@@ -1082,36 +1181,8 @@
         else:
             self.assert_(False)
 
-    def _methnames(self, *objs):
-        from mechanize._opener import methnames
-        r = []
-        for i in range(len(objs)):
-            obj = objs[i]
-            names = methnames(obj)
-            names.sort()
-            # special methods vary over Python versions
-            names = filter(lambda mn: mn[0:2] != "__" , names)
-            r.append(names)
-        return r
 
-    def test_methnames(self):
-        a, b, c, d = A(), B(), C(), D()
-        a, b, c, d = self._methnames(a, b, c, d)
-        self.assert_(a == ["a"])
-        self.assert_(b == ["a", "b"])
-        self.assert_(c == ["a", "c"])
-        self.assert_(d == ["a", "b", "c", "d"])
-
-        a, b, c, d = A(), B(), C(), D()
-        a.x = lambda self: None
-        b.y = lambda self: None
-        d.z = lambda self: None
-        a, b, c, d = self._methnames(a, b, c, d)
-        self.assert_(a == ["a", "x"])
-        self.assert_(b == ["a", "b", "y"])
-        self.assert_(c == ["a", "c"])
-        self.assert_(d == ["a", "b", "c", "d", "z"])
-
-
 if __name__ == "__main__":
+    import doctest
+    doctest.testmod()
     unittest.main()

Added: python-mechanize/branches/upstream/current/test/test_useragent.py
===================================================================
--- python-mechanize/branches/upstream/current/test/test_useragent.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test/test_useragent.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,58 @@
+#!/usr/bin/env python
+
+from unittest import TestCase
+
+import mechanize
+
+from test_browser import make_mock_handler
+
+
+class UserAgentTests(TestCase):
+    def test_set_handled_schemes(self):
+        import mechanize
+        class MockHandlerClass(make_mock_handler()):
+            def __call__(self): return self
+        class BlahHandlerClass(MockHandlerClass): pass
+        class BlahProcessorClass(MockHandlerClass): pass
+        BlahHandler = BlahHandlerClass([("blah_open", None)])
+        BlahProcessor = BlahProcessorClass([("blah_request", None)])
+        class TestUserAgent(mechanize.UserAgent):
+            default_others = []
+            default_features = []
+            handler_classes = mechanize.UserAgent.handler_classes.copy()
+            handler_classes.update(
+                {"blah": BlahHandler, "_blah": BlahProcessor})
+        ua = TestUserAgent()
+
+        self.assertEqual(len(ua.handlers), 5)
+        ua.set_handled_schemes(["http", "https"])
+        self.assertEqual(len(ua.handlers), 2)
+        self.assertRaises(ValueError,
+            ua.set_handled_schemes, ["blah", "non-existent"])
+        self.assertRaises(ValueError,
+            ua.set_handled_schemes, ["blah", "_blah"])
+        ua.set_handled_schemes(["blah"])
+
+        req = mechanize.Request("blah://example.com/")
+        r = ua.open(req)
+        exp_calls = [("blah_open", (req,), {})]
+        assert len(ua.calls) == len(exp_calls)
+        for got, expect in zip(ua.calls, exp_calls):
+            self.assertEqual(expect, got[1:])
+
+        ua.calls = []
+        req = mechanize.Request("blah://example.com/")
+        ua._set_handler("_blah", True)
+        r = ua.open(req)
+        exp_calls = [
+            ("blah_request", (req,), {}),
+            ("blah_open", (req,), {})]
+        assert len(ua.calls) == len(exp_calls)
+        for got, expect in zip(ua.calls, exp_calls):
+            self.assertEqual(expect, got[1:])
+        ua._set_handler("_blah", True)
+
+
+if __name__ == "__main__":
+    import unittest
+    unittest.main()

Added: python-mechanize/branches/upstream/current/test-tools/doctest.py
===================================================================
--- python-mechanize/branches/upstream/current/test-tools/doctest.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test-tools/doctest.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,2695 @@
+# Module doctest.
+# Released to the public domain 16-Jan-2001, by Tim Peters (tim@python.org).
+# Major enhancements and refactoring by:
+#     Jim Fulton
+#     Edward Loper
+
+# Provided as-is; use at your own risk; no warranty; no promises; enjoy!
+
+r"""Module doctest -- a framework for running examples in docstrings.
+
+In simplest use, end each module M to be tested with:
+
+def _test():
+    import doctest
+    doctest.testmod()
+
+if __name__ == "__main__":
+    _test()
+
+Then running the module as a script will cause the examples in the
+docstrings to get executed and verified:
+
+python M.py
+
+This won't display anything unless an example fails, in which case the
+failing example(s) and the cause(s) of the failure(s) are printed to stdout
+(why not stderr? because stderr is a lame hack <0.2 wink>), and the final
+line of output is "Test failed.".
+
+Run it with the -v switch instead:
+
+python M.py -v
+
+and a detailed report of all examples tried is printed to stdout, along
+with assorted summaries at the end.
+
+You can force verbose mode by passing "verbose=True" to testmod, or prohibit
+it by passing "verbose=False".  In either of those cases, sys.argv is not
+examined by testmod.
+
+There are a variety of other ways to run doctests, including integration
+with the unittest framework, and support for running non-Python text
+files containing doctests.  There are also many ways to override parts
+of doctest's default behaviors.  See the Library Reference Manual for
+details.
+"""
+
+__docformat__ = 'reStructuredText en'
+
+__all__ = [
+    # 0, Option Flags
+    'register_optionflag',
+    'DONT_ACCEPT_TRUE_FOR_1',
+    'DONT_ACCEPT_BLANKLINE',
+    'NORMALIZE_WHITESPACE',
+    'ELLIPSIS',
+    'SKIP',
+    'IGNORE_EXCEPTION_DETAIL',
+    'COMPARISON_FLAGS',
+    'REPORT_UDIFF',
+    'REPORT_CDIFF',
+    'REPORT_NDIFF',
+    'REPORT_ONLY_FIRST_FAILURE',
+    'REPORTING_FLAGS',
+    # 1. Utility Functions
+    'is_private',
+    # 2. Example & DocTest
+    'Example',
+    'DocTest',
+    # 3. Doctest Parser
+    'DocTestParser',
+    # 4. Doctest Finder
+    'DocTestFinder',
+    # 5. Doctest Runner
+    'DocTestRunner',
+    'OutputChecker',
+    'DocTestFailure',
+    'UnexpectedException',
+    'DebugRunner',
+    # 6. Test Functions
+    'testmod',
+    'testfile',
+    'run_docstring_examples',
+    # 7. Tester
+    'Tester',
+    # 8. Unittest Support
+    'DocTestSuite',
+    'DocFileSuite',
+    'set_unittest_reportflags',
+    # 9. Debugging Support
+    'script_from_examples',
+    'testsource',
+    'debug_src',
+    'debug',
+]
+
+import __future__
+
+import sys, traceback, inspect, linecache_copy, os, re, types
+import unittest, difflib, pdb, tempfile
+import warnings
+from StringIO import StringIO
+
+# Don't whine about the deprecated is_private function in this
+# module's tests.
+warnings.filterwarnings("ignore", "is_private", DeprecationWarning,
+                        __name__, 0)
+
+# There are 4 basic classes:
+#  - Example: a <source, want> pair, plus an intra-docstring line number.
+#  - DocTest: a collection of examples, parsed from a docstring, plus
+#    info about where the docstring came from (name, filename, lineno).
+#  - DocTestFinder: extracts DocTests from a given object's docstring and
+#    its contained objects' docstrings.
+#  - DocTestRunner: runs DocTest cases, and accumulates statistics.
+#
+# So the basic picture is:
+#
+#                             list of:
+# +------+                   +---------+                   +-------+
+# |object| --DocTestFinder-> | DocTest | --DocTestRunner-> |results|
+# +------+                   +---------+                   +-------+
+#                            | Example |
+#                            |   ...   |
+#                            | Example |
+#                            +---------+
+
+# Option constants.
+
+OPTIONFLAGS_BY_NAME = {}
+def register_optionflag(name):
+    flag = 1 << len(OPTIONFLAGS_BY_NAME)
+    OPTIONFLAGS_BY_NAME[name] = flag
+    return flag
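+
+# Third-party code can register additional flags in the same way the
+# standard flags below are registered, e.g. (hypothetical flag name):
+#     MY_FLAG = register_optionflag('MY_FLAG')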
+
+DONT_ACCEPT_TRUE_FOR_1 = register_optionflag('DONT_ACCEPT_TRUE_FOR_1')
+DONT_ACCEPT_BLANKLINE = register_optionflag('DONT_ACCEPT_BLANKLINE')
+NORMALIZE_WHITESPACE = register_optionflag('NORMALIZE_WHITESPACE')
+ELLIPSIS = register_optionflag('ELLIPSIS')
+SKIP = register_optionflag('SKIP')
+IGNORE_EXCEPTION_DETAIL = register_optionflag('IGNORE_EXCEPTION_DETAIL')
+
+COMPARISON_FLAGS = (DONT_ACCEPT_TRUE_FOR_1 |
+                    DONT_ACCEPT_BLANKLINE |
+                    NORMALIZE_WHITESPACE |
+                    ELLIPSIS |
+                    SKIP |
+                    IGNORE_EXCEPTION_DETAIL)
+
+REPORT_UDIFF = register_optionflag('REPORT_UDIFF')
+REPORT_CDIFF = register_optionflag('REPORT_CDIFF')
+REPORT_NDIFF = register_optionflag('REPORT_NDIFF')
+REPORT_ONLY_FIRST_FAILURE = register_optionflag('REPORT_ONLY_FIRST_FAILURE')
+
+REPORTING_FLAGS = (REPORT_UDIFF |
+                   REPORT_CDIFF |
+                   REPORT_NDIFF |
+                   REPORT_ONLY_FIRST_FAILURE)
+
+# Special string markers for use in `want` strings:
+BLANKLINE_MARKER = '<BLANKLINE>'
+ELLIPSIS_MARKER = '...'
+
+######################################################################
+## Table of Contents
+######################################################################
+#  1. Utility Functions
+#  2. Example & DocTest -- store test cases
+#  3. DocTest Parser -- extracts examples from strings
+#  4. DocTest Finder -- extracts test cases from objects
+#  5. DocTest Runner -- runs test cases
+#  6. Test Functions -- convenient wrappers for testing
+#  7. Tester Class -- for backwards compatibility
+#  8. Unittest Support
+#  9. Debugging Support
+# 10. Example Usage
+
+######################################################################
+## 1. Utility Functions
+######################################################################
+
+def is_private(prefix, base):
+    """prefix, base -> true iff name prefix + "." + base is "private".
+
+    Prefix may be an empty string, and base does not contain a period.
+    Prefix is ignored (although functions you write conforming to this
+    protocol may make use of it).
+    Return true iff base begins with an (at least one) underscore, but
+    does not both begin and end with (at least) two underscores.
+
+    >>> is_private("a.b", "my_func")
+    False
+    >>> is_private("____", "_my_func")
+    True
+    >>> is_private("someclass", "__init__")
+    False
+    >>> is_private("sometypo", "__init_")
+    True
+    >>> is_private("x.y.z", "_")
+    True
+    >>> is_private("_x.y.z", "__")
+    False
+    >>> is_private("", "")  # senseless but consistent
+    False
+    """
+    warnings.warn("is_private is deprecated; it wasn't useful; "
+                  "examine DocTestFinder.find() lists instead",
+                  DeprecationWarning, stacklevel=2)
+    return base[:1] == "_" and not base[:2] == "__" == base[-2:]
+
+def _extract_future_flags(globs):
+    """
+    Return the compiler-flags associated with the future features that
+    have been imported into the given namespace (globs).
+    """
+    flags = 0
+    for fname in __future__.all_feature_names:
+        feature = globs.get(fname, None)
+        if feature is getattr(__future__, fname):
+            flags |= feature.compiler_flag
+    return flags
+
+def _normalize_module(module, depth=2):
+    """
+    Return the module specified by `module`.  In particular:
+      - If `module` is a module, then return module.
+      - If `module` is a string, then import and return the
+        module with that name.
+      - If `module` is None, then return the calling module.
+        The calling module is assumed to be the module of
+        the stack frame at the given depth in the call stack.
+    """
+    if inspect.ismodule(module):
+        return module
+    elif isinstance(module, (str, unicode)):
+        return __import__(module, globals(), locals(), ["*"])
+    elif module is None:
+        return sys.modules[sys._getframe(depth).f_globals['__name__']]
+    else:
+        raise TypeError("Expected a module, string, or None")
+
+def _load_testfile(filename, package, module_relative):
+    if module_relative:
+        package = _normalize_module(package, 3)
+        filename = _module_relative_path(package, filename)
+        if hasattr(package, '__loader__'):
+            if hasattr(package.__loader__, 'get_data'):
+                return package.__loader__.get_data(filename), filename
+    return open(filename).read(), filename
+
+def _indent(s, indent=4):
+    """
+    Add the given number of space characters to the beginning every
+    non-blank line in `s`, and return the result.
+    """
+    # This regexp matches the start of non-blank lines:
+    return re.sub('(?m)^(?!$)', indent*' ', s)
+
+def _exception_traceback(exc_info):
+    """
+    Return a string containing a traceback message for the given
+    exc_info tuple (as returned by sys.exc_info()).
+    """
+    # Get a traceback message.
+    excout = StringIO()
+    exc_type, exc_val, exc_tb = exc_info
+    traceback.print_exception(exc_type, exc_val, exc_tb, file=excout)
+    return excout.getvalue()
+
+# Override some StringIO methods.
+class _SpoofOut(StringIO):
+    def getvalue(self):
+        result = StringIO.getvalue(self)
+        # If anything at all was written, make sure there's a trailing
+        # newline.  There's no way for the expected output to indicate
+        # that a trailing newline is missing.
+        if result and not result.endswith("\n"):
+            result += "\n"
+        # Prevent softspace from screwing up the next test case, in
+        # case they used print with a trailing comma in an example.
+        if hasattr(self, "softspace"):
+            del self.softspace
+        return result
+
+    def truncate(self, size=None):
+        StringIO.truncate(self, size)
+        if hasattr(self, "softspace"):
+            del self.softspace
+
+# Worst-case linear-time ellipsis matching.
+def _ellipsis_match(want, got):
+    """
+    Essentially the only subtle case:
+    >>> _ellipsis_match('aa...aa', 'aaa')
+    False
+    """
+    if ELLIPSIS_MARKER not in want:
+        return want == got
+
+    # Find "the real" strings.
+    ws = want.split(ELLIPSIS_MARKER)
+    assert len(ws) >= 2
+
+    # Deal with exact matches possibly needed at one or both ends.
+    startpos, endpos = 0, len(got)
+    w = ws[0]
+    if w:   # starts with exact match
+        if got.startswith(w):
+            startpos = len(w)
+            del ws[0]
+        else:
+            return False
+    w = ws[-1]
+    if w:   # ends with exact match
+        if got.endswith(w):
+            endpos -= len(w)
+            del ws[-1]
+        else:
+            return False
+
+    if startpos > endpos:
+        # Exact end matches required more characters than we have, as in
+        # _ellipsis_match('aa...aa', 'aaa')
+        return False
+
+    # For the rest, we only need to find the leftmost non-overlapping
+    # match for each piece.  If there's no overall match that way alone,
+    # there's no overall match period.
+    for w in ws:
+        # w may be '' at times, if there are consecutive ellipses, or
+        # due to an ellipsis at the start or end of `want`.  That's OK.
+        # Search for an empty string succeeds, and doesn't change startpos.
+        startpos = got.find(w, startpos, endpos)
+        if startpos < 0:
+            return False
+        startpos += len(w)
+
+    return True
+
+def _comment_line(line):
+    "Return a commented form of the given line"
+    line = line.rstrip()
+    if line:
+        return '# '+line
+    else:
+        return '#'
+
+class _OutputRedirectingPdb(pdb.Pdb):
+    """
+    A specialized version of the python debugger that redirects stdout
+    to a given stream when interacting with the user.  Stdout is *not*
+    redirected when traced code is executed.
+    """
+    def __init__(self, out):
+        self.__out = out
+        self.__debugger_used = False
+        pdb.Pdb.__init__(self)
+
+    def set_trace(self):
+        self.__debugger_used = True
+        pdb.Pdb.set_trace(self)
+
+    def set_continue(self):
+        # Calling set_continue unconditionally would break unit test coverage
+        # reporting, as Bdb.set_continue calls sys.settrace(None).
+        if self.__debugger_used:
+            pdb.Pdb.set_continue(self)
+
+    def trace_dispatch(self, *args):
+        # Redirect stdout to the given stream.
+        save_stdout = sys.stdout
+        sys.stdout = self.__out
+        # Call Pdb's trace dispatch method.
+        try:
+            return pdb.Pdb.trace_dispatch(self, *args)
+        finally:
+            sys.stdout = save_stdout
+
+# [XX] Normalize with respect to os.path.pardir?
+def _module_relative_path(module, path):
+    if not inspect.ismodule(module):
+        raise TypeError, 'Expected a module: %r' % module
+    if path.startswith('/'):
+        raise ValueError, 'Module-relative files may not have absolute paths'
+
+    # Find the base directory for the path.
+    if hasattr(module, '__file__'):
+        # A normal module/package
+        basedir = os.path.split(module.__file__)[0]
+    elif module.__name__ == '__main__':
+        # An interactive session.
+        if len(sys.argv)>0 and sys.argv[0] != '':
+            basedir = os.path.split(sys.argv[0])[0]
+        else:
+            basedir = os.curdir
+    else:
+        # A module w/o __file__ (this includes builtins)
+        raise ValueError("Can't resolve paths relative to the module " +
+                         module + " (it has no __file__)")
+
+    # Combine the base directory and the path.
+    return os.path.join(basedir, *(path.split('/')))
+
+######################################################################
+## 2. Example & DocTest
+######################################################################
+## - An "example" is a <source, want> pair, where "source" is a
+##   fragment of source code, and "want" is the expected output for
+##   "source."  The Example class also includes information about
+##   where the example was extracted from.
+##
+## - A "doctest" is a collection of examples, typically extracted from
+##   a string (such as an object's docstring).  The DocTest class also
+##   includes information about where the string was extracted from.
+
+class Example:
+    """
+    A single doctest example, consisting of source code and expected
+    output.  `Example` defines the following attributes:
+
+      - source: A single Python statement, always ending with a newline.
+        The constructor adds a newline if needed.
+
+      - want: The expected output from running the source code (either
+        from stdout, or a traceback in case of exception).  `want` ends
+        with a newline unless it's empty, in which case it's an empty
+        string.  The constructor adds a newline if needed.
+
+      - exc_msg: The exception message generated by the example, if
+        the example is expected to generate an exception; or `None` if
+        it is not expected to generate an exception.  This exception
+        message is compared against the return value of
+        `traceback.format_exception_only()`.  `exc_msg` ends with a
+        newline unless it's `None`.  The constructor adds a newline
+        if needed.
+
+      - lineno: The line number within the DocTest string containing
+        this Example where the Example begins.  This line number is
+        zero-based, with respect to the beginning of the DocTest.
+
+      - indent: The example's indentation in the DocTest string.
+        I.e., the number of space characters that precede the
+        example's first prompt.
+
+      - options: A dictionary mapping from option flags to True or
+        False, which is used to override default options for this
+        example.  Any option flags not contained in this dictionary
+        are left at their default value (as specified by the
+        DocTestRunner's optionflags).  By default, no options are set.
+    """
+    def __init__(self, source, want, exc_msg=None, lineno=0, indent=0,
+                 options=None):
+        # Normalize inputs.
+        if not source.endswith('\n'):
+            source += '\n'
+        if want and not want.endswith('\n'):
+            want += '\n'
+        if exc_msg is not None and not exc_msg.endswith('\n'):
+            exc_msg += '\n'
+        # Store properties.
+        self.source = source
+        self.want = want
+        self.lineno = lineno
+        self.indent = indent
+        if options is None: options = {}
+        self.options = options
+        self.exc_msg = exc_msg
+
+class DocTest:
+    """
+    A collection of doctest examples that should be run in a single
+    namespace.  Each `DocTest` defines the following attributes:
+
+      - examples: the list of examples.
+
+      - globs: The namespace (aka globals) that the examples should
+        be run in.
+
+      - name: A name identifying the DocTest (typically, the name of
+        the object whose docstring this DocTest was extracted from).
+
+      - filename: The name of the file that this DocTest was extracted
+        from, or `None` if the filename is unknown.
+
+      - lineno: The line number within filename where this DocTest
+        begins, or `None` if the line number is unavailable.  This
+        line number is zero-based, with respect to the beginning of
+        the file.
+
+      - docstring: The string that the examples were extracted from,
+        or `None` if the string is unavailable.
+    """
+    def __init__(self, examples, globs, name, filename, lineno, docstring):
+        """
+        Create a new DocTest containing the given examples.  The
+        DocTest's globals are initialized with a copy of `globs`.
+        """
+        assert not isinstance(examples, basestring), \
+               "DocTest no longer accepts str; use DocTestParser instead"
+        self.examples = examples
+        self.docstring = docstring
+        self.globs = globs.copy()
+        self.name = name
+        self.filename = filename
+        self.lineno = lineno
+
+    def __repr__(self):
+        if len(self.examples) == 0:
+            examples = 'no examples'
+        elif len(self.examples) == 1:
+            examples = '1 example'
+        else:
+            examples = '%d examples' % len(self.examples)
+        return ('<DocTest %s from %s:%s (%s)>' %
+                (self.name, self.filename, self.lineno, examples))
+
+
+    # This lets us sort tests by name:
+    def __cmp__(self, other):
+        if not isinstance(other, DocTest):
+            return -1
+        return cmp((self.name, self.filename, self.lineno, id(self)),
+                   (other.name, other.filename, other.lineno, id(other)))
+
+######################################################################
+## 3. DocTestParser
+######################################################################
+
+class DocTestParser:
+    """
+    A class used to parse strings containing doctest examples.
+    """
+    # This regular expression is used to find doctest examples in a
+    # string.  It defines three groups: `source` is the source code
+    # (including leading indentation and prompts); `indent` is the
+    # indentation of the first (PS1) line of the source code; and
+    # `want` is the expected output (including leading indentation).
+    _EXAMPLE_RE = re.compile(r'''
+        # Source consists of a PS1 line followed by zero or more PS2 lines.
+        (?P<source>
+            (?:^(?P<indent> [ ]*) >>>    .*)    # PS1 line
+            (?:\n           [ ]*  \.\.\. .*)*)  # PS2 lines
+        \n?
+        # Want consists of any non-blank lines that do not start with PS1.
+        (?P<want> (?:(?![ ]*$)    # Not a blank line
+                     (?![ ]*>>>)  # Not a line starting with PS1
+                     .*$\n?       # But any other line
+                  )*)
+        ''', re.MULTILINE | re.VERBOSE)
+
+    # A regular expression for handling `want` strings that contain
+    # expected exceptions.  It divides `want` into three pieces:
+    #    - the traceback header line (`hdr`)
+    #    - the traceback stack (`stack`)
+    #    - the exception message (`msg`), as generated by
+    #      traceback.format_exception_only()
+    # `msg` may have multiple lines.  We assume/require that the
+    # exception message is the first non-indented line starting with a word
+    # character following the traceback header line.
+    _EXCEPTION_RE = re.compile(r"""
+        # Grab the traceback header.  Different versions of Python have
+        # said different things on the first traceback line.
+        ^(?P<hdr> Traceback\ \(
+            (?: most\ recent\ call\ last
+            |   innermost\ last
+            ) \) :
+        )
+        \s* $                # toss trailing whitespace on the header.
+        (?P<stack> .*?)      # don't blink: absorb stuff until...
+        ^ (?P<msg> \w+ .*)   #     a line *starts* with alphanum.
+        """, re.VERBOSE | re.MULTILINE | re.DOTALL)
+
+    # A callable returning a true value iff its argument is a blank line
+    # or contains a single comment.
+    _IS_BLANK_OR_COMMENT = re.compile(r'^[ ]*(#.*)?$').match
+
+    def parse(self, string, name='<string>'):
+        """
+        Divide the given string into examples and intervening text,
+        and return them as a list of alternating Examples and strings.
+        Line numbers for the Examples are 0-based.  The optional
+        argument `name` is a name identifying this string, and is only
+        used for error messages.
+        """
+        string = string.expandtabs()
+        # If all lines begin with the same indentation, then strip it.
+        min_indent = self._min_indent(string)
+        if min_indent > 0:
+            string = '\n'.join([l[min_indent:] for l in string.split('\n')])
+
+        output = []
+        charno, lineno = 0, 0
+        # Find all doctest examples in the string:
+        for m in self._EXAMPLE_RE.finditer(string):
+            # Add the pre-example text to `output`.
+            output.append(string[charno:m.start()])
+            # Update lineno (lines before this example)
+            lineno += string.count('\n', charno, m.start())
+            # Extract info from the regexp match.
+            (source, options, want, exc_msg) = \
+                     self._parse_example(m, name, lineno)
+            # Create an Example, and add it to the list.
+            if not self._IS_BLANK_OR_COMMENT(source):
+                output.append( Example(source, want, exc_msg,
+                                    lineno=lineno,
+                                    indent=min_indent+len(m.group('indent')),
+                                    options=options) )
+            # Update lineno (lines inside this example)
+            lineno += string.count('\n', m.start(), m.end())
+            # Update charno.
+            charno = m.end()
+        # Add any remaining post-example text to `output`.
+        output.append(string[charno:])
+        return output
+
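+    # A minimal sketch of what `parse` returns (illustrative only):
+    #
+    #     >>> parts = DocTestParser().parse('Text.\n>>> 1 + 1\n2\n\nMore.\n')
+    #     >>> [p.__class__.__name__ for p in parts]
+    #     ['str', 'Example', 'str']
+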
+    def get_doctest(self, string, globs, name, filename, lineno):
+        """
+        Extract all doctest examples from the given string, and
+        collect them into a `DocTest` object.
+
+        `globs`, `name`, `filename`, and `lineno` are attributes for
+        the new `DocTest` object.  See the documentation for `DocTest`
+        for more information.
+        """
+        return DocTest(self.get_examples(string, name), globs,
+                       name, filename, lineno, string)
+
+    def get_examples(self, string, name='<string>'):
+        """
+        Extract all doctest examples from the given string, and return
+        them as a list of `Example` objects.  Line numbers are
+        0-based, because it's most common in doctests that nothing
+        interesting appears on the same line as opening triple-quote,
+        and so the first interesting line is called \"line 1\" then.
+
+        The optional argument `name` is a name identifying this
+        string, and is only used for error messages.
+        """
+        return [x for x in self.parse(string, name)
+                if isinstance(x, Example)]
+
+    def _parse_example(self, m, name, lineno):
+        """
+        Given a regular expression match from `_EXAMPLE_RE` (`m`),
+        return a tuple `(source, options, want, exc_msg)`, where
+        `source` is the matched example's source code (with prompts
+        and indentation stripped); `options` is a dictionary of option
+        overrides; `want` is the example's expected output (with
+        indentation stripped); and `exc_msg` is the expected exception
+        message extracted from `want`, or None if no exception is
+        expected.
+
+        `name` is the string's name, and `lineno` is the line number
+        where the example starts; both are used for error messages.
+        """
+        # Get the example's indentation level.
+        indent = len(m.group('indent'))
+
+        # Divide source into lines; check that they're properly
+        # indented; and then strip their indentation & prompts.
+        source_lines = m.group('source').split('\n')
+        self._check_prompt_blank(source_lines, indent, name, lineno)
+        self._check_prefix(source_lines[1:], ' '*indent + '.', name, lineno)
+        source = '\n'.join([sl[indent+4:] for sl in source_lines])
+
+        # Divide want into lines; check that it's properly indented; and
+        # then strip the indentation.  Spaces before the last newline should
+        # be preserved, so plain rstrip() isn't good enough.
+        want = m.group('want')
+        want_lines = want.split('\n')
+        if len(want_lines) > 1 and re.match(r' *$', want_lines[-1]):
+            del want_lines[-1]  # forget final newline & spaces after it
+        self._check_prefix(want_lines, ' '*indent, name,
+                           lineno + len(source_lines))
+        want = '\n'.join([wl[indent:] for wl in want_lines])
+
+        # If `want` contains a traceback message, then extract it.
+        m = self._EXCEPTION_RE.match(want)
+        if m:
+            exc_msg = m.group('msg')
+        else:
+            exc_msg = None
+
+        # Extract options from the source.
+        options = self._find_options(source, name, lineno)
+
+        return source, options, want, exc_msg
+
+    # This regular expression looks for option directives in the
+    # source code of an example.  Option directives are comments
+    # starting with "doctest:".  Warning: this may give false
+    # positives for string-literals that contain the string
+    # "#doctest:".  Eliminating these false positives would require
+    # actually parsing the string; but we limit them by ignoring any
+    # line containing "#doctest:" that is *followed* by a quote mark.
+    _OPTION_DIRECTIVE_RE = re.compile(r'#\s*doctest:\s*([^\n\'"]*)$',
+                                      re.MULTILINE)
+
+    def _find_options(self, source, name, lineno):
+        """
+        Return a dictionary containing option overrides extracted from
+        option directives in the given source string.
+
+        `name` is the string's name, and `lineno` is the line number
+        where the example starts; both are used for error messages.
+        """
+        options = {}
+        # (note: with the current regexp, this will match at most once:)
+        for m in self._OPTION_DIRECTIVE_RE.finditer(source):
+            option_strings = m.group(1).replace(',', ' ').split()
+            for option in option_strings:
+                if (option[0] not in '+-' or
+                    option[1:] not in OPTIONFLAGS_BY_NAME):
+                    raise ValueError('line %r of the doctest for %s '
+                                     'has an invalid option: %r' %
+                                     (lineno+1, name, option))
+                flag = OPTIONFLAGS_BY_NAME[option[1:]]
+                options[flag] = (option[0] == '+')
+        if options and self._IS_BLANK_OR_COMMENT(source):
+            raise ValueError('line %r of the doctest for %s has an option '
+                             'directive on a line with no example: %r' %
+                             (lineno, name, source))
+        return options
+
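+    # For illustration: a source line such as
+    #     print x    # doctest: +NORMALIZE_WHITESPACE, -ELLIPSIS
+    # makes _find_options() return a dict mapping the
+    # NORMALIZE_WHITESPACE flag to True and the ELLIPSIS flag to False.
+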
+    # This regular expression finds the indentation of every non-blank
+    # line in a string.
+    _INDENT_RE = re.compile(r'^([ ]*)(?=\S)', re.MULTILINE)
+
+    def _min_indent(self, s):
+        "Return the minimum indentation of any non-blank line in `s`"
+        indents = [len(indent) for indent in self._INDENT_RE.findall(s)]
+        if len(indents) > 0:
+            return min(indents)
+        else:
+            return 0
+
+    def _check_prompt_blank(self, lines, indent, name, lineno):
+        """
+        Given the lines of a source string (including prompts and
+        leading indentation), check to make sure that every prompt is
+        followed by a space character.  If any prompt is not followed
+        by a space character, then raise ValueError.
+        """
+        for i, line in enumerate(lines):
+            if len(line) >= indent+4 and line[indent+3] != ' ':
+                raise ValueError('line %r of the docstring for %s '
+                                 'lacks blank after %s: %r' %
+                                 (lineno+i+1, name,
+                                  line[indent:indent+3], line))
+
+    def _check_prefix(self, lines, prefix, name, lineno):
+        """
+        Check that every line in the given list starts with the given
+        prefix; if any line does not, then raise a ValueError.
+        """
+        for i, line in enumerate(lines):
+            if line and not line.startswith(prefix):
+                raise ValueError('line %r of the docstring for %s has '
+                                 'inconsistent leading whitespace: %r' %
+                                 (lineno+i+1, name, line))
+
+
+######################################################################
+## 4. DocTest Finder
+######################################################################
+
+class DocTestFinder:
+    """
+    A class used to extract the DocTests that are relevant to a given
+    object, from its docstring and the docstrings of its contained
+    objects.  Doctests can currently be extracted from the following
+    object types: modules, functions, classes, methods, staticmethods,
+    classmethods, and properties.
+    """
+
+    def __init__(self, verbose=False, parser=DocTestParser(),
+                 recurse=True, _namefilter=None, exclude_empty=True):
+        """
+        Create a new doctest finder.
+
+        The optional argument `parser` specifies a class or
+        function that should be used to create new DocTest objects (or
+        objects that implement the same interface as DocTest).  The
+        signature for this factory function should match the signature
+        of the DocTest constructor.
+
+        If the optional argument `recurse` is false, then `find` will
+        only examine the given object, and not any contained objects.
+
+        If the optional argument `exclude_empty` is false, then `find`
+        will include tests for objects with empty docstrings.
+        """
+        self._parser = parser
+        self._verbose = verbose
+        self._recurse = recurse
+        self._exclude_empty = exclude_empty
+        # _namefilter is undocumented, and exists only for temporary backward-
+        # compatibility support of testmod's deprecated isprivate mess.
+        self._namefilter = _namefilter
+
+    def find(self, obj, name=None, module=None, globs=None,
+             extraglobs=None):
+        """
+        Return a list of the DocTests that are defined by the given
+        object's docstring, or by any of its contained objects'
+        docstrings.
+
+        The optional parameter `module` is the module that contains
+        the given object.  If the module is not specified or is None, then
+        the test finder will attempt to automatically determine the
+        correct module.  The object's module is used:
+
+            - As a default namespace, if `globs` is not specified.
+            - To prevent the DocTestFinder from extracting DocTests
+              from objects that are imported from other modules.
+            - To find the name of the file containing the object.
+            - To help find the line number of the object within its
+              file.
+
+        Contained objects whose module does not match `module` are ignored.
+
+        If `module` is False, no attempt to find the module will be made.
+        This is obscure, of use mostly in tests:  if `module` is False, or
+        is None but cannot be found automatically, then all objects are
+        considered to belong to the (non-existent) module, so all contained
+        objects will (recursively) be searched for doctests.
+
+        The globals for each DocTest are formed by combining `globs`
+        and `extraglobs` (bindings in `extraglobs` override bindings
+        in `globs`).  A new copy of the globals dictionary is created
+        for each DocTest.  If `globs` is not specified, then it
+        defaults to the module's `__dict__`, if the module is
+        specified, or {} otherwise.  If `extraglobs` is not specified,
+        then it defaults to {}.
+
+
+        """
+        # If name was not specified, then extract it from the object.
+        if name is None:
+            name = getattr(obj, '__name__', None)
+            if name is None:
+                raise ValueError("DocTestFinder.find: name must be given "
+                        "when obj.__name__ doesn't exist: %r" %
+                                 (type(obj),))
+
+        # Find the module that contains the given object (if obj is
+        # a module, then module=obj.).  Note: this may fail, in which
+        # case module will be None.
+        if module is False:
+            module = None
+        elif module is None:
+            module = inspect.getmodule(obj)
+
+        # Read the module's source code.  This is used by
+        # DocTestFinder._find_lineno to find the line number for a
+        # given object's docstring.
+        try:
+            file = inspect.getsourcefile(obj) or inspect.getfile(obj)
+            source_lines = linecache_copy.getlines(file)
+            if not source_lines:
+                source_lines = None
+        except TypeError:
+            source_lines = None
+
+        # Initialize globals, and merge in extraglobs.
+        if globs is None:
+            if module is None:
+                globs = {}
+            else:
+                globs = module.__dict__.copy()
+        else:
+            globs = globs.copy()
+        if extraglobs is not None:
+            globs.update(extraglobs)
+
+        # Recursively explore `obj`, extracting DocTests.
+        tests = []
+        self._find(tests, obj, name, module, source_lines, globs, {})
+        return tests
+
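+    # Typical usage sketch (illustrative; `mymodule` stands for any
+    # already-imported module):
+    #
+    #     >>> tests = DocTestFinder().find(mymodule)
+    #     >>> [t.name for t in tests]   # dotted names, module name first
+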
+    def _filter(self, obj, prefix, base):
+        """
+        Return true if the given object should not be examined.
+        """
+        return (self._namefilter is not None and
+                self._namefilter(prefix, base))
+
+    def _from_module(self, module, object):
+        """
+        Return true if the given object is defined in the given
+        module.
+        """
+        if module is None:
+            return True
+        elif inspect.isfunction(object):
+            return module.__dict__ is object.func_globals
+        elif inspect.isclass(object):
+            return module.__name__ == object.__module__
+        elif inspect.getmodule(object) is not None:
+            return module is inspect.getmodule(object)
+        elif hasattr(object, '__module__'):
+            return module.__name__ == object.__module__
+        elif isinstance(object, property):
+            return True # [XX] no way to be sure.
+        else:
+            raise ValueError("object must be a class or function")
+
+    def _find(self, tests, obj, name, module, source_lines, globs, seen):
+        """
+        Find tests for the given object and any contained objects, and
+        add them to `tests`.
+        """
+        if self._verbose:
+            print 'Finding tests in %s' % name
+
+        # If we've already processed this object, then ignore it.
+        if id(obj) in seen:
+            return
+        seen[id(obj)] = 1
+
+        # Find a test for this object, and add it to the list of tests.
+        test = self._get_test(obj, name, module, globs, source_lines)
+        if test is not None:
+            tests.append(test)
+
+        # Look for tests in a module's contained objects.
+        if inspect.ismodule(obj) and self._recurse:
+            for valname, val in obj.__dict__.items():
+                # Check if this contained object should be ignored.
+                if self._filter(val, name, valname):
+                    continue
+                valname = '%s.%s' % (name, valname)
+                # Recurse to functions & classes.
+                if ((inspect.isfunction(val) or inspect.isclass(val)) and
+                    self._from_module(module, val)):
+                    self._find(tests, val, valname, module, source_lines,
+                               globs, seen)
+
+        # Look for tests in a module's __test__ dictionary.
+        if inspect.ismodule(obj) and self._recurse:
+            for valname, val in getattr(obj, '__test__', {}).items():
+                if not isinstance(valname, basestring):
+                    raise ValueError("DocTestFinder.find: __test__ keys "
+                                     "must be strings: %r" %
+                                     (type(valname),))
+                if not (inspect.isfunction(val) or inspect.isclass(val) or
+                        inspect.ismethod(val) or inspect.ismodule(val) or
+                        isinstance(val, basestring)):
+                    raise ValueError("DocTestFinder.find: __test__ values "
+                                     "must be strings, functions, methods, "
+                                     "classes, or modules: %r" %
+                                     (type(val),))
+                valname = '%s.__test__.%s' % (name, valname)
+                self._find(tests, val, valname, module, source_lines,
+                           globs, seen)
+
+        # Look for tests in a class's contained objects.
+        if inspect.isclass(obj) and self._recurse:
+            for valname, val in obj.__dict__.items():
+                # Check if this contained object should be ignored.
+                if self._filter(val, name, valname):
+                    continue
+                # Special handling for staticmethod/classmethod.
+                if isinstance(val, staticmethod):
+                    val = getattr(obj, valname)
+                if isinstance(val, classmethod):
+                    val = getattr(obj, valname).im_func
+
+                # Recurse to methods, properties, and nested classes.
+                if ((inspect.isfunction(val) or inspect.isclass(val) or
+                      isinstance(val, property)) and
+                      self._from_module(module, val)):
+                    valname = '%s.%s' % (name, valname)
+                    self._find(tests, val, valname, module, source_lines,
+                               globs, seen)
+
+    def _get_test(self, obj, name, module, globs, source_lines):
+        """
+        Return a DocTest for the given object, if it defines a docstring;
+        otherwise, return None.
+        """
+        # Extract the object's docstring.  If it doesn't have one,
+        # then return None (no test for this object).
+        if isinstance(obj, basestring):
+            docstring = obj
+        else:
+            try:
+                if obj.__doc__ is None:
+                    docstring = ''
+                else:
+                    docstring = obj.__doc__
+                    if not isinstance(docstring, basestring):
+                        docstring = str(docstring)
+            except (TypeError, AttributeError):
+                docstring = ''
+
+        # Find the docstring's location in the file.
+        lineno = self._find_lineno(obj, source_lines)
+
+        # Don't bother if the docstring is empty.
+        if self._exclude_empty and not docstring:
+            return None
+
+        # Return a DocTest for this object.
+        if module is None:
+            filename = None
+        else:
+            filename = getattr(module, '__file__', module.__name__)
+            if filename[-4:] in (".pyc", ".pyo"):
+                filename = filename[:-1]
+        return self._parser.get_doctest(docstring, globs, name,
+                                        filename, lineno)
+
+    def _find_lineno(self, obj, source_lines):
+        """
+        Return a line number of the given object's docstring.  Note:
+        this method assumes that the object has a docstring.
+        """
+        lineno = None
+
+        # Find the line number for modules.
+        if inspect.ismodule(obj):
+            lineno = 0
+
+        # Find the line number for classes.
+        # Note: this could be fooled if a class is defined multiple
+        # times in a single file.
+        if inspect.isclass(obj):
+            if source_lines is None:
+                return None
+            pat = re.compile(r'^\s*class\s*%s\b' %
+                             getattr(obj, '__name__', '-'))
+            for i, line in enumerate(source_lines):
+                if pat.match(line):
+                    lineno = i
+                    break
+
+        # Find the line number for functions & methods.
+        if inspect.ismethod(obj): obj = obj.im_func
+        if inspect.isfunction(obj): obj = obj.func_code
+        if inspect.istraceback(obj): obj = obj.tb_frame
+        if inspect.isframe(obj): obj = obj.f_code
+        if inspect.iscode(obj):
+            lineno = getattr(obj, 'co_firstlineno', None)
+            if lineno is not None:
+                lineno = lineno - 1
+
+        # Find the line number where the docstring starts.  Assume
+        # that it's the first line that begins with a quote mark.
+        # Note: this could be fooled by a multiline function
+        # signature, where a continuation line begins with a quote
+        # mark.
+        if lineno is not None:
+            if source_lines is None:
+                return lineno+1
+            pat = re.compile(r'(^|.*:)\s*\w*("|\')')
+            for lineno in range(lineno, len(source_lines)):
+                if pat.match(source_lines[lineno]):
+                    return lineno
+
+        # We couldn't find the line number.
+        return None
+
+######################################################################
+## 5. DocTest Runner
+######################################################################
+
+class DocTestRunner:
+    """
+    A class used to run DocTest test cases, and accumulate statistics.
+    The `run` method is used to process a single DocTest case.  It
+    returns a tuple `(f, t)`, where `t` is the number of test cases
+    tried, and `f` is the number of test cases that failed.
+
+        >>> tests = DocTestFinder().find(_TestClass)
+        >>> runner = DocTestRunner(verbose=False)
+        >>> for test in tests:
+        ...     print runner.run(test)
+        (0, 2)
+        (0, 1)
+        (0, 2)
+        (0, 2)
+
+    The `summarize` method prints a summary of all the test cases that
+    have been run by the runner, and returns an aggregated `(f, t)`
+    tuple:
+
+        >>> runner.summarize(verbose=1)
+        4 items passed all tests:
+           2 tests in _TestClass
+           2 tests in _TestClass.__init__
+           2 tests in _TestClass.get
+           1 tests in _TestClass.square
+        7 tests in 4 items.
+        7 passed and 0 failed.
+        Test passed.
+        (0, 7)
+
+    The aggregated numbers of tried and failed examples are also
+    available via the `tries` and `failures` attributes:
+
+        >>> runner.tries
+        7
+        >>> runner.failures
+        0
+
+    The comparison between expected outputs and actual outputs is done
+    by an `OutputChecker`.  This comparison may be customized with a
+    number of option flags; see the documentation for `testmod` for
+    more information.  If the option flags are insufficient, then the
+    comparison may also be customized by passing a subclass of
+    `OutputChecker` to the constructor.
+
+    The test runner's display output can be controlled in two ways.
+    First, an output function (`out`) can be passed to
+    `DocTestRunner.run`; this function will be called with strings that
+    should be displayed.  It defaults to `sys.stdout.write`.  If
+    capturing the output is not sufficient, then the display output
+    can also be customized by subclassing DocTestRunner, and
+    overriding the methods `report_start`, `report_success`,
+    `report_unexpected_exception`, and `report_failure`.
+    """
+    # This divider string is used to separate failure messages, and to
+    # separate sections of the summary.
+    DIVIDER = "*" * 70
+
+    def __init__(self, checker=None, verbose=None, optionflags=0):
+        """
+        Create a new test runner.
+
+        Optional keyword arg `checker` is the `OutputChecker` that
+        should be used to compare the expected outputs and actual
+        outputs of doctest examples.
+
+        Optional keyword arg 'verbose' prints lots of stuff if true,
+        only failures if false; by default, it's true iff '-v' is in
+        sys.argv.
+
+        Optional argument `optionflags` can be used to control how the
+        test runner compares expected output to actual output, and how
+        it displays failures.  See the documentation for `testmod` for
+        more information.
+        """
+        self._checker = checker or OutputChecker()
+        if verbose is None:
+            verbose = '-v' in sys.argv
+        self._verbose = verbose
+        self.optionflags = optionflags
+        self.original_optionflags = optionflags
+
+        # Keep track of the examples we've run.
+        self.tries = 0
+        self.failures = 0
+        self._name2ft = {}
+
+        # Create a fake output target for capturing doctest output.
+        self._fakeout = _SpoofOut()
+
+    #/////////////////////////////////////////////////////////////////
+    # Reporting methods
+    #/////////////////////////////////////////////////////////////////
+
+    def report_start(self, out, test, example):
+        """
+        Report that the test runner is about to process the given
+        example.  (Only displays a message if verbose=True)
+        """
+        if self._verbose:
+            if example.want:
+                out('Trying:\n' + _indent(example.source) +
+                    'Expecting:\n' + _indent(example.want))
+            else:
+                out('Trying:\n' + _indent(example.source) +
+                    'Expecting nothing\n')
+
+    def report_success(self, out, test, example, got):
+        """
+        Report that the given example ran successfully.  (Only
+        displays a message if verbose=True)
+        """
+        if self._verbose:
+            out("ok\n")
+
+    def report_failure(self, out, test, example, got):
+        """
+        Report that the given example failed.
+        """
+        out(self._failure_header(test, example) +
+            self._checker.output_difference(example, got, self.optionflags))
+
+    def report_unexpected_exception(self, out, test, example, exc_info):
+        """
+        Report that the given example raised an unexpected exception.
+        """
+        out(self._failure_header(test, example) +
+            'Exception raised:\n' + _indent(_exception_traceback(exc_info)))
+
+    def _failure_header(self, test, example):
+        out = [self.DIVIDER]
+        if test.filename:
+            if test.lineno is not None and example.lineno is not None:
+                lineno = test.lineno + example.lineno + 1
+            else:
+                lineno = '?'
+            out.append('File "%s", line %s, in %s' %
+                       (test.filename, lineno, test.name))
+        else:
+            out.append('Line %s, in %s' % (example.lineno+1, test.name))
+        out.append('Failed example:')
+        source = example.source
+        out.append(_indent(source))
+        return '\n'.join(out)
+
+    #/////////////////////////////////////////////////////////////////
+    # DocTest Running
+    #/////////////////////////////////////////////////////////////////
+
+    def __run(self, test, compileflags, out):
+        """
+        Run the examples in `test`.  Write the outcome of each example
+        with one of the `DocTestRunner.report_*` methods, using the
+        writer function `out`.  `compileflags` is the set of compiler
+        flags that should be used to execute examples.  Return a tuple
+        `(f, t)`, where `t` is the number of examples tried, and `f`
+        is the number of examples that failed.  The examples are run
+        in the namespace `test.globs`.
+        """
+        # Keep track of the number of failures and tries.
+        failures = tries = 0
+
+        # Save the option flags (since option directives can be used
+        # to modify them).
+        original_optionflags = self.optionflags
+
+        SUCCESS, FAILURE, BOOM = range(3) # `outcome` state
+
+        check = self._checker.check_output
+
+        # Process each example.
+        for examplenum, example in enumerate(test.examples):
+
+            # If REPORT_ONLY_FIRST_FAILURE is set, then suppress
+            # reporting after the first failure.
+            quiet = (self.optionflags & REPORT_ONLY_FIRST_FAILURE and
+                     failures > 0)
+
+            # Merge in the example's options.
+            self.optionflags = original_optionflags
+            if example.options:
+                for (optionflag, val) in example.options.items():
+                    if val:
+                        self.optionflags |= optionflag
+                    else:
+                        self.optionflags &= ~optionflag
+
+            # If 'SKIP' is set, then skip this example.
+            if self.optionflags & SKIP:
+                continue
+
+            # Record that we started this example.
+            tries += 1
+            if not quiet:
+                self.report_start(out, test, example)
+
+            # Use a special filename for compile(), so we can retrieve
+            # the source code during interactive debugging (see
+            # __patched_linecache_getlines).
+            filename = '<doctest %s[%d]>' % (test.name, examplenum)
+
+            # Run the example in the given context (globs), and record
+            # any exception that gets raised.  (But don't intercept
+            # keyboard interrupts.)
+            try:
+                # Don't blink!  This is where the user's code gets run.
+                exec compile(example.source, filename, "single",
+                             compileflags, 1) in test.globs
+                self.debugger.set_continue() # ==== Example Finished ====
+                exception = None
+            except KeyboardInterrupt:
+                raise
+            except:
+                exception = sys.exc_info()
+                self.debugger.set_continue() # ==== Example Finished ====
+
+            got = self._fakeout.getvalue()  # the actual output
+            self._fakeout.truncate(0)
+            outcome = FAILURE   # guilty until proved innocent or insane
+
+            # If the example executed without raising any exceptions,
+            # verify its output.
+            if exception is None:
+                if check(example.want, got, self.optionflags):
+                    outcome = SUCCESS
+
+            # The example raised an exception:  check if it was expected.
+            else:
+                exc_info = sys.exc_info()
+                exc_msg = traceback.format_exception_only(*exc_info[:2])[-1]
+                if not quiet:
+                    got += _exception_traceback(exc_info)
+
+                # If `example.exc_msg` is None, then we weren't expecting
+                # an exception.
+                if example.exc_msg is None:
+                    outcome = BOOM
+
+                # We expected an exception:  see whether it matches.
+                elif check(example.exc_msg, exc_msg, self.optionflags):
+                    outcome = SUCCESS
+
+                # Another chance if they didn't care about the detail.
+                elif self.optionflags & IGNORE_EXCEPTION_DETAIL:
+                    m1 = re.match(r'[^:]*:', example.exc_msg)
+                    m2 = re.match(r'[^:]*:', exc_msg)
+                    if m1 and m2 and check(m1.group(0), m2.group(0),
+                                           self.optionflags):
+                        outcome = SUCCESS
+
+            # Report the outcome.
+            if outcome is SUCCESS:
+                if not quiet:
+                    self.report_success(out, test, example, got)
+            elif outcome is FAILURE:
+                if not quiet:
+                    self.report_failure(out, test, example, got)
+                failures += 1
+            elif outcome is BOOM:
+                if not quiet:
+                    self.report_unexpected_exception(out, test, example,
+                                                     exc_info)
+                failures += 1
+            else:
+                assert False, ("unknown outcome", outcome)
+
+        # Restore the option flags (in case they were modified)
+        self.optionflags = original_optionflags
+
+        # Record and return the number of failures and tries.
+        self.__record_outcome(test, failures, tries)
+        return failures, tries
+
+    def __record_outcome(self, test, f, t):
+        """
+        Record the fact that the given DocTest (`test`) generated `f`
+        failures out of `t` tried examples.
+        """
+        f2, t2 = self._name2ft.get(test.name, (0,0))
+        self._name2ft[test.name] = (f+f2, t+t2)
+        self.failures += f
+        self.tries += t
+
+    __LINECACHE_FILENAME_RE = re.compile(r'<doctest '
+                                         r'(?P<name>[\w\.]+)'
+                                         r'\[(?P<examplenum>\d+)\]>$')
+    def __patched_linecache_getlines(self, filename, module_globals=None):
+        m = self.__LINECACHE_FILENAME_RE.match(filename)
+        if m and m.group('name') == self.test.name:
+            example = self.test.examples[int(m.group('examplenum'))]
+            return example.source.splitlines(True)
+        else:
+            return self.save_linecache_getlines(filename, module_globals)
+
+    def run(self, test, compileflags=None, out=None, clear_globs=True):
+        """
+        Run the examples in `test`, and display the results using the
+        writer function `out`.
+
+        The examples are run in the namespace `test.globs`.  If
+        `clear_globs` is true (the default), then this namespace will
+        be cleared after the test runs, to help with garbage
+        collection.  If you would like to examine the namespace after
+        the test completes, then use `clear_globs=False`.
+
+        `compileflags` gives the set of flags that should be used by
+        the Python compiler when running the examples.  If not
+        specified, then it will default to the set of future-import
+        flags that apply to `globs`.
+
+        The output of each example is checked using the checker's
+        `OutputChecker.check_output` method, and the results are
+        formatted by the `DocTestRunner.report_*` methods.
+        """
+        self.test = test
+
+        if compileflags is None:
+            compileflags = _extract_future_flags(test.globs)
+
+        save_stdout = sys.stdout
+        if out is None:
+            out = save_stdout.write
+        sys.stdout = self._fakeout
+
+        # Patch pdb.set_trace to restore sys.stdout during interactive
+        # debugging (so it's not still redirected to self._fakeout).
+        # Note that the interactive output will go to *our*
+        # save_stdout, even if that's not the real sys.stdout; this
+        # allows us to write test cases for the set_trace behavior.
+        save_set_trace = pdb.set_trace
+        self.debugger = _OutputRedirectingPdb(save_stdout)
+        self.debugger.reset()
+        pdb.set_trace = self.debugger.set_trace
+
+        # Patch linecache_copy.getlines, so we can see the example's source
+        # when we're inside the debugger.
+        self.save_linecache_getlines = linecache_copy.getlines
+        linecache_copy.getlines = self.__patched_linecache_getlines
+
+        try:
+            return self.__run(test, compileflags, out)
+        finally:
+            sys.stdout = save_stdout
+            pdb.set_trace = save_set_trace
+            linecache_copy.getlines = self.save_linecache_getlines
+            if clear_globs:
+                test.globs.clear()
+
+    #/////////////////////////////////////////////////////////////////
+    # Summarization
+    #/////////////////////////////////////////////////////////////////
+    def summarize(self, verbose=None):
+        """
+        Print a summary of all the test cases that have been run by
+        this DocTestRunner, and return a tuple `(f, t)`, where `f` is
+        the total number of failed examples, and `t` is the total
+        number of tried examples.
+
+        The optional `verbose` argument controls how detailed the
+        summary is.  If the verbosity is not specified, then the
+        DocTestRunner's verbosity is used.
+        """
+        if verbose is None:
+            verbose = self._verbose
+        notests = []
+        passed = []
+        failed = []
+        totalt = totalf = 0
+        for x in self._name2ft.items():
+            name, (f, t) = x
+            assert f <= t
+            totalt += t
+            totalf += f
+            if t == 0:
+                notests.append(name)
+            elif f == 0:
+                passed.append( (name, t) )
+            else:
+                failed.append(x)
+        if verbose:
+            if notests:
+                print len(notests), "items had no tests:"
+                notests.sort()
+                for thing in notests:
+                    print "   ", thing
+            if passed:
+                print len(passed), "items passed all tests:"
+                passed.sort()
+                for thing, count in passed:
+                    print " %3d tests in %s" % (count, thing)
+        if failed:
+            print self.DIVIDER
+            print len(failed), "items had failures:"
+            failed.sort()
+            for thing, (f, t) in failed:
+                print " %3d of %3d in %s" % (f, t, thing)
+        if verbose:
+            print totalt, "tests in", len(self._name2ft), "items."
+            print totalt - totalf, "passed and", totalf, "failed."
+        if totalf:
+            print "***Test Failed***", totalf, "failures."
+        elif verbose:
+            print "Test passed."
+        return totalf, totalt
+
+    #/////////////////////////////////////////////////////////////////
+    # Backward compatibility cruft to maintain doctest.master.
+    #/////////////////////////////////////////////////////////////////
+    def merge(self, other):
+        d = self._name2ft
+        for name, (f, t) in other._name2ft.items():
+            if name in d:
+                print "*** DocTestRunner.merge: '" + name + "' in both" \
+                    " testers; summing outcomes."
+                f2, t2 = d[name]
+                f = f + f2
+                t = t + t2
+            d[name] = f, t
+
+class OutputChecker:
+    """
+    A class used to check whether the actual output from a doctest
+    example matches the expected output.  `OutputChecker` defines two
+    methods: `check_output`, which compares a given pair of outputs,
+    and returns true if they match; and `output_difference`, which
+    returns a string describing the differences between two outputs.
+    """
+    def check_output(self, want, got, optionflags):
+        """
+        Return True iff the actual output from an example (`got`)
+        matches the expected output (`want`).  These strings are
+        always considered to match if they are identical; but
+        depending on what option flags the test runner is using,
+        several non-exact match types are also possible.  See the
+        documentation for `DocTestRunner` for more information about
+        option flags.
+        """
+        # Handle the common case first, for efficiency:
+        # if they're string-identical, always return true.
+        if got == want:
+            return True
+
+        # The values True and False replaced 1 and 0 as the return
+        # value for boolean comparisons in Python 2.3.
+        if not (optionflags & DONT_ACCEPT_TRUE_FOR_1):
+            if (got,want) == ("True\n", "1\n"):
+                return True
+            if (got,want) == ("False\n", "0\n"):
+                return True
+
+        # <BLANKLINE> can be used as a special sequence to signify a
+        # blank line, unless the DONT_ACCEPT_BLANKLINE flag is used.
+        if not (optionflags & DONT_ACCEPT_BLANKLINE):
+            # Replace <BLANKLINE> in want with a blank line.
+            want = re.sub(r'(?m)^%s\s*?$' % re.escape(BLANKLINE_MARKER),
+                          '', want)
+            # If a line in got contains only spaces, then remove the
+            # spaces.
+            got = re.sub(r'(?m)^\s*?$', '', got)
+            if got == want:
+                return True
+
+        # This flag causes doctest to ignore any differences in the
+        # contents of whitespace strings.  Note that this can be used
+        # in conjunction with the ELLIPSIS flag.
+        if optionflags & NORMALIZE_WHITESPACE:
+            got = ' '.join(got.split())
+            want = ' '.join(want.split())
+            if got == want:
+                return True
+
+        # The ELLIPSIS flag says to let the sequence "..." in `want`
+        # match any substring in `got`.
+        if optionflags & ELLIPSIS:
+            if _ellipsis_match(want, got):
+                return True
+
+        # We didn't find any match; return false.
+        return False
+
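+    # Sketch of the ELLIPSIS behaviour above (illustrative):
+    #
+    #     >>> checker = OutputChecker()
+    #     >>> checker.check_output('<foo object at 0x...>\n',
+    #     ...                      '<foo object at 0xb7f48>\n', ELLIPSIS)
+    #     True
+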
+    # Should we do a fancy diff?
+    def _do_a_fancy_diff(self, want, got, optionflags):
+        # Not unless they asked for a fancy diff.
+        if not optionflags & (REPORT_UDIFF |
+                              REPORT_CDIFF |
+                              REPORT_NDIFF):
+            return False
+
+        # If expected output uses ellipsis, a meaningful fancy diff is
+        # too hard ... or maybe not.  In two real-life failures Tim saw,
+        # a diff was a major help anyway, so this is commented out.
+        # [todo] _ellipsis_match() knows which pieces do and don't match,
+        # and could be the basis for a kick-ass diff in this case.
+        ##if optionflags & ELLIPSIS and ELLIPSIS_MARKER in want:
+        ##    return False
+
+        # ndiff does intraline difference marking, so can be useful even
+        # for 1-line differences.
+        if optionflags & REPORT_NDIFF:
+            return True
+
+        # The other diff types need at least a few lines to be helpful.
+        return want.count('\n') > 2 and got.count('\n') > 2
+
+    def output_difference(self, example, got, optionflags):
+        """
+        Return a string describing the differences between the
+        expected output for a given example (`example`) and the actual
+        output (`got`).  `optionflags` is the set of option flags used
+        to compare `want` and `got`.
+        """
+        want = example.want
+        # If <BLANKLINE>s are being used, then replace blank lines
+        # with <BLANKLINE> in the actual output string.
+        if not (optionflags & DONT_ACCEPT_BLANKLINE):
+            got = re.sub('(?m)^[ ]*(?=\n)', BLANKLINE_MARKER, got)
+
+        # Check if we should use diff.
+        if self._do_a_fancy_diff(want, got, optionflags):
+            # Split want & got into lines.
+            want_lines = want.splitlines(True)  # True == keep line ends
+            got_lines = got.splitlines(True)
+            # Use difflib to find their differences.
+            if optionflags & REPORT_UDIFF:
+                diff = difflib.unified_diff(want_lines, got_lines, n=2)
+                diff = list(diff)[2:] # strip the diff header
+                kind = 'unified diff with -expected +actual'
+            elif optionflags & REPORT_CDIFF:
+                diff = difflib.context_diff(want_lines, got_lines, n=2)
+                diff = list(diff)[2:] # strip the diff header
+                kind = 'context diff with expected followed by actual'
+            elif optionflags & REPORT_NDIFF:
+                engine = difflib.Differ(charjunk=difflib.IS_CHARACTER_JUNK)
+                diff = list(engine.compare(want_lines, got_lines))
+                kind = 'ndiff with -expected +actual'
+            else:
+                assert 0, 'Bad diff option'
+            # Remove trailing whitespace on diff output.
+            diff = [line.rstrip() + '\n' for line in diff]
+            return 'Differences (%s):\n' % kind + _indent(''.join(diff))
+
+        # If we're not using diff, then simply list the expected
+        # output followed by the actual output.
+        if want and got:
+            return 'Expected:\n%sGot:\n%s' % (_indent(want), _indent(got))
+        elif want:
+            return 'Expected:\n%sGot nothing\n' % _indent(want)
+        elif got:
+            return 'Expected nothing\nGot:\n%s' % _indent(got)
+        else:
+            return 'Expected nothing\nGot nothing\n'
+
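+    # For illustration: with no REPORT_* flag set, the failure report is
+    # a plain listing such as 'Expected:\n    2\nGot:\n    1\n'.
+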
+class DocTestFailure(Exception):
+    """A DocTest example has failed in debugging mode.
+
+    The exception instance has variables:
+
+    - test: the DocTest object being run
+
+    - example: the Example object that failed
+
+    - got: the actual output
+    """
+    def __init__(self, test, example, got):
+        self.test = test
+        self.example = example
+        self.got = got
+
+    def __str__(self):
+        return str(self.test)
+
+class UnexpectedException(Exception):
+    """A DocTest example has encountered an unexpected exception
+
+    The exception instance has variables:
+
+    - test: the DocTest object being run
+
+    - example: the Example object that failed
+
+    - exc_info: the exception info
+    """
+    def __init__(self, test, example, exc_info):
+        self.test = test
+        self.example = example
+        self.exc_info = exc_info
+
+    def __str__(self):
+        return str(self.test)
+
+class DebugRunner(DocTestRunner):
+    r"""Run doc tests but raise an exception as soon as there is a failure.
+
+       If an unexpected exception occurs, an UnexpectedException is raised.
+       It contains the test, the example, and the original exception:
+
+         >>> runner = DebugRunner(verbose=False)
+         >>> test = DocTestParser().get_doctest('>>> raise KeyError\n42',
+         ...                                    {}, 'foo', 'foo.py', 0)
+         >>> try:
+         ...     runner.run(test)
+         ... except UnexpectedException, failure:
+         ...     pass
+
+         >>> failure.test is test
+         True
+
+         >>> failure.example.want
+         '42\n'
+
+         >>> exc_info = failure.exc_info
+         >>> raise exc_info[0], exc_info[1], exc_info[2]
+         Traceback (most recent call last):
+         ...
+         KeyError
+
+       We wrap the original exception to give the calling application
+       access to the test and example information.
+
+       If the output doesn't match, then a DocTestFailure is raised:
+
+         >>> test = DocTestParser().get_doctest('''
+         ...      >>> x = 1
+         ...      >>> x
+         ...      2
+         ...      ''', {}, 'foo', 'foo.py', 0)
+
+         >>> try:
+         ...    runner.run(test)
+         ... except DocTestFailure, failure:
+         ...    pass
+
+       DocTestFailure objects provide access to the test:
+
+         >>> failure.test is test
+         True
+
+       As well as to the example:
+
+         >>> failure.example.want
+         '2\n'
+
+       and the actual output:
+
+         >>> failure.got
+         '1\n'
+
+       If a failure or error occurs, the globals are left intact:
+
+         >>> del test.globs['__builtins__']
+         >>> test.globs
+         {'x': 1}
+
+         >>> test = DocTestParser().get_doctest('''
+         ...      >>> x = 2
+         ...      >>> raise KeyError
+         ...      ''', {}, 'foo', 'foo.py', 0)
+
+         >>> runner.run(test)
+         Traceback (most recent call last):
+         ...
+         UnexpectedException: <DocTest foo from foo.py:0 (2 examples)>
+
+         >>> del test.globs['__builtins__']
+         >>> test.globs
+         {'x': 2}
+
+       But the globals are cleared if there is no error:
+
+         >>> test = DocTestParser().get_doctest('''
+         ...      >>> x = 2
+         ...      ''', {}, 'foo', 'foo.py', 0)
+
+         >>> runner.run(test)
+         (0, 1)
+
+         >>> test.globs
+         {}
+
+       """
+
+    def run(self, test, compileflags=None, out=None, clear_globs=True):
+        r = DocTestRunner.run(self, test, compileflags, out, False)
+        if clear_globs:
+            test.globs.clear()
+        return r
+
+    def report_unexpected_exception(self, out, test, example, exc_info):
+        raise UnexpectedException(test, example, exc_info)
+
+    def report_failure(self, out, test, example, got):
+        raise DocTestFailure(test, example, got)
+
+######################################################################
+## 6. Test Functions
+######################################################################
+# These should be backwards compatible.
+
+# For backward compatibility, a global instance of a DocTestRunner
+# class, updated by testmod.
+master = None
+
+def testmod(m=None, name=None, globs=None, verbose=None, isprivate=None,
+            report=True, optionflags=0, extraglobs=None,
+            raise_on_error=False, exclude_empty=False):
+    """m=None, name=None, globs=None, verbose=None, isprivate=None,
+       report=True, optionflags=0, extraglobs=None, raise_on_error=False,
+       exclude_empty=False
+
+    Test examples in docstrings in functions and classes reachable
+    from module m (or the current module if m is not supplied), starting
+    with m.__doc__.  Unless isprivate is specified, private names
+    are not skipped.
+
+    Also test examples reachable from dict m.__test__ if it exists and is
+    not None.  m.__test__ maps names to functions, classes and strings;
+    function and class docstrings are tested even if the name is private;
+    strings are tested directly, as if they were docstrings.
+
+    Return (#failures, #tests).
+
+    See doctest.__doc__ for an overview.
+
+    Optional keyword arg "name" gives the name of the module; by default
+    use m.__name__.
+
+    Optional keyword arg "globs" gives a dict to be used as the globals
+    when executing examples; by default, use m.__dict__.  A copy of this
+    dict is actually used for each docstring, so that each docstring's
+    examples start with a clean slate.
+
+    Optional keyword arg "extraglobs" gives a dictionary that should be
+    merged into the globals that are used to execute examples.  By
+    default, no extra globals are used.  This is new in 2.4.
+
+    Optional keyword arg "verbose" prints lots of stuff if true, prints
+    only failures if false; by default, it's true iff "-v" is in sys.argv.
+
+    Optional keyword arg "report" prints a summary at the end when true,
+    else prints nothing at the end.  In verbose mode, the summary is
+    detailed, else very brief (in fact, empty if all tests passed).
+
+    Optional keyword arg "optionflags" or's together module constants,
+    and defaults to 0.  This is new in 2.3.  Possible values (see the
+    docs for details):
+
+        DONT_ACCEPT_TRUE_FOR_1
+        DONT_ACCEPT_BLANKLINE
+        NORMALIZE_WHITESPACE
+        ELLIPSIS
+        SKIP
+        IGNORE_EXCEPTION_DETAIL
+        REPORT_UDIFF
+        REPORT_CDIFF
+        REPORT_NDIFF
+        REPORT_ONLY_FIRST_FAILURE
+
+    Optional keyword arg "raise_on_error" raises an exception on the
+    first unexpected exception or failure. This allows failures to be
+    post-mortem debugged.
+
+    Deprecated in Python 2.4:
+    Optional keyword arg "isprivate" specifies a function used to
+    determine whether a name is private.  The default function
+    treats all functions as public.  Optionally, "isprivate" can be
+    set to doctest.is_private to skip over functions marked as private
+    using the underscore naming convention; see its docs for details.
+
+    Advanced tomfoolery:  testmod runs methods of a local instance of
+    class doctest.Tester, then merges the results into (or creates)
+    global Tester instance doctest.master.  Methods of doctest.master
+    can be called directly too, if you want to do something unusual.
+    Passing report=0 to testmod is especially useful then, to delay
+    displaying a summary.  Invoke doctest.master.summarize(verbose)
+    when you're done fiddling.
+    """
+    global master
+
+    if isprivate is not None:
+        warnings.warn("the isprivate argument is deprecated; "
+                      "examine DocTestFinder.find() lists instead",
+                      DeprecationWarning)
+
+    # If no module was given, then use __main__.
+    if m is None:
+        # DWA - m will still be None if this wasn't invoked from the command
+        # line, in which case the following TypeError is about as good an error
+        # as we should expect
+        m = sys.modules.get('__main__')
+
+    # Check that we were actually given a module.
+    if not inspect.ismodule(m):
+        raise TypeError("testmod: module required; %r" % (m,))
+
+    # If no name was given, then use the module's name.
+    if name is None:
+        name = m.__name__
+
+    # Find, parse, and run all tests in the given module.
+    finder = DocTestFinder(_namefilter=isprivate, exclude_empty=exclude_empty)
+
+    if raise_on_error:
+        runner = DebugRunner(verbose=verbose, optionflags=optionflags)
+    else:
+        runner = DocTestRunner(verbose=verbose, optionflags=optionflags)
+
+    for test in finder.find(m, name, globs=globs, extraglobs=extraglobs):
+        runner.run(test)
+
+    if report:
+        runner.summarize()
+
+    if master is None:
+        master = runner
+    else:
+        master.merge(runner)
+
+    return runner.failures, runner.tries
+
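+# Typical invocation sketch (illustrative): run the current module's
+# doctests from a __main__ guard.
+#
+#     if __name__ == '__main__':
+#         import doctest
+#         doctest.testmod()
+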
+def testfile(filename, module_relative=True, name=None, package=None,
+             globs=None, verbose=None, report=True, optionflags=0,
+             extraglobs=None, raise_on_error=False, parser=DocTestParser()):
+    """
+    Test examples in the given file.  Return (#failures, #tests).
+
+    Optional keyword arg "module_relative" specifies how filenames
+    should be interpreted:
+
+      - If "module_relative" is True (the default), then "filename"
+        specifies a module-relative path.  By default, this path is
+        relative to the calling module's directory; but if the
+        "package" argument is specified, then it is relative to that
+        package.  To ensure os-independence, "filename" should use
+        "/" characters to separate path segments, and should not
+        be an absolute path (i.e., it may not begin with "/").
+
+      - If "module_relative" is False, then "filename" specifies an
+        os-specific path.  The path may be absolute or relative (to
+        the current working directory).
+
+    Optional keyword arg "name" gives the name of the test; by default
+    use the file's basename.
+
+    Optional keyword argument "package" is a Python package or the
+    name of a Python package whose directory should be used as the
+    base directory for a module relative filename.  If no package is
+    specified, then the calling module's directory is used as the base
+    directory for module relative filenames.  It is an error to
+    specify "package" if "module_relative" is False.
+
+    Optional keyword arg "globs" gives a dict to be used as the globals
+    when executing examples; by default, use {}.  A copy of this dict
+    is actually used for each docstring, so that each docstring's
+    examples start with a clean slate.
+
+    Optional keyword arg "extraglobs" gives a dictionary that should be
+    merged into the globals that are used to execute examples.  By
+    default, no extra globals are used.
+
+    Optional keyword arg "verbose" prints lots of stuff if true, prints
+    only failures if false; by default, it's true iff "-v" is in sys.argv.
+
+    Optional keyword arg "report" prints a summary at the end when true,
+    else prints nothing at the end.  In verbose mode, the summary is
+    detailed, else very brief (in fact, empty if all tests passed).
+
+    Optional keyword arg "optionflags" or's together module constants,
+    and defaults to 0.  Possible values (see the docs for details):
+
+        DONT_ACCEPT_TRUE_FOR_1
+        DONT_ACCEPT_BLANKLINE
+        NORMALIZE_WHITESPACE
+        ELLIPSIS
+        SKIP
+        IGNORE_EXCEPTION_DETAIL
+        REPORT_UDIFF
+        REPORT_CDIFF
+        REPORT_NDIFF
+        REPORT_ONLY_FIRST_FAILURE
+
+    Optional keyword arg "raise_on_error" raises an exception on the
+    first unexpected exception or failure. This allows failures to be
+    post-mortem debugged.
+
+    Optional keyword arg "parser" specifies a DocTestParser (or
+    subclass) that should be used to extract tests from the files.
+
+    Advanced tomfoolery:  testfile runs methods of a local instance of
+    class doctest.Tester, then merges the results into (or creates)
+    global Tester instance doctest.master.  Methods of doctest.master
+    can be called directly too, if you want to do something unusual.
+    Passing report=0 to testfile is especially useful then, to delay
+    displaying a summary.  Invoke doctest.master.summarize(verbose)
+    when you're done fiddling.
+    """
+    global master
+
+    if package and not module_relative:
+        raise ValueError("Package may only be specified for module-"
+                         "relative paths.")
+
+    # Relativize the path
+    text, filename = _load_testfile(filename, package, module_relative)
+
+    # If no name was given, then use the file's name.
+    if name is None:
+        name = os.path.basename(filename)
+
+    # Assemble the globals.
+    if globs is None:
+        globs = {}
+    else:
+        globs = globs.copy()
+    if extraglobs is not None:
+        globs.update(extraglobs)
+
+    if raise_on_error:
+        runner = DebugRunner(verbose=verbose, optionflags=optionflags)
+    else:
+        runner = DocTestRunner(verbose=verbose, optionflags=optionflags)
+
+    # Read the file, convert it to a test, and run it.
+    test = parser.get_doctest(text, globs, name, filename, 0)
+    runner.run(test)
+
+    if report:
+        runner.summarize()
+
+    if master is None:
+        master = runner
+    else:
+        master.merge(runner)
+
+    return runner.failures, runner.tries
+
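+# A minimal usage sketch of testfile (illustrative, not part of the
+# original module; "example.doctest" is a hypothetical file of doctest
+# examples):
+#
+#   import doctest
+#   failures, tries = doctest.testfile("example.doctest",
+#                                      optionflags=doctest.ELLIPSIS)
+#   if failures:
+#       raise SystemExit(1)
+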
+def run_docstring_examples(f, globs, verbose=False, name="NoName",
+                           compileflags=None, optionflags=0):
+    """
+    Test examples in the given object's docstring (`f`), using `globs`
+    as globals.  Optional argument `name` is used in failure messages.
+    If the optional argument `verbose` is true, then generate output
+    even if there are no failures.
+
+    `compileflags` gives the set of flags that should be used by the
+    Python compiler when running the examples.  If not specified, then
+    it will default to the set of future-import flags that apply to
+    `globs`.
+
+    Optional keyword arg `optionflags` specifies options for the
+    testing and output.  See the documentation for `testmod` for more
+    information.
+    """
+    # Find, parse, and run all tests in the given module.
+    finder = DocTestFinder(verbose=verbose, recurse=False)
+    runner = DocTestRunner(verbose=verbose, optionflags=optionflags)
+    for test in finder.find(f, name, globs=globs):
+        runner.run(test, compileflags=compileflags)
+
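+# For example (a sketch; the function "f" and its docstring are made up):
+#
+#   def f():
+#       """
+#       >>> 2 + 2
+#       4
+#       """
+#   run_docstring_examples(f, {}, verbose=False, name="f")
+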
+######################################################################
+## 7. Tester
+######################################################################
+# This is provided only for backwards compatibility.  It's not
+# actually used in any way.
+
+class Tester:
+    def __init__(self, mod=None, globs=None, verbose=None,
+                 isprivate=None, optionflags=0):
+
+        warnings.warn("class Tester is deprecated; "
+                      "use class doctest.DocTestRunner instead",
+                      DeprecationWarning, stacklevel=2)
+        if mod is None and globs is None:
+            raise TypeError("Tester.__init__: must specify mod or globs")
+        if mod is not None and not inspect.ismodule(mod):
+            raise TypeError("Tester.__init__: mod must be a module; %r" %
+                            (mod,))
+        if globs is None:
+            globs = mod.__dict__
+        self.globs = globs
+
+        self.verbose = verbose
+        self.isprivate = isprivate
+        self.optionflags = optionflags
+        self.testfinder = DocTestFinder(_namefilter=isprivate)
+        self.testrunner = DocTestRunner(verbose=verbose,
+                                        optionflags=optionflags)
+
+    def runstring(self, s, name):
+        test = DocTestParser().get_doctest(s, self.globs, name, None, None)
+        if self.verbose:
+            print "Running string", name
+        (f,t) = self.testrunner.run(test)
+        if self.verbose:
+            print f, "of", t, "examples failed in string", name
+        return (f,t)
+
+    def rundoc(self, object, name=None, module=None):
+        f = t = 0
+        tests = self.testfinder.find(object, name, module=module,
+                                     globs=self.globs)
+        for test in tests:
+            (f2, t2) = self.testrunner.run(test)
+            (f,t) = (f+f2, t+t2)
+        return (f,t)
+
+    def rundict(self, d, name, module=None):
+        import new
+        m = new.module(name)
+        m.__dict__.update(d)
+        if module is None:
+            module = False
+        return self.rundoc(m, name, module)
+
+    def run__test__(self, d, name):
+        import new
+        m = new.module(name)
+        m.__test__ = d
+        return self.rundoc(m, name)
+
+    def summarize(self, verbose=None):
+        return self.testrunner.summarize(verbose)
+
+    def merge(self, other):
+        self.testrunner.merge(other.testrunner)
+
+######################################################################
+## 8. Unittest Support
+######################################################################
+
+_unittest_reportflags = 0
+
+def set_unittest_reportflags(flags):
+    """Sets the unittest option flags.
+
+    The old flag is returned so that a runner could restore the old
+    value if it wished to:
+
+      >>> import doctest
+      >>> old = doctest._unittest_reportflags
+      >>> doctest.set_unittest_reportflags(REPORT_NDIFF |
+      ...                          REPORT_ONLY_FIRST_FAILURE) == old
+      True
+
+      >>> doctest._unittest_reportflags == (REPORT_NDIFF |
+      ...                                   REPORT_ONLY_FIRST_FAILURE)
+      True
+
+    Only reporting flags can be set:
+
+      >>> doctest.set_unittest_reportflags(ELLIPSIS)
+      Traceback (most recent call last):
+      ...
+      ValueError: ('Only reporting flags allowed', 8)
+
+      >>> doctest.set_unittest_reportflags(old) == (REPORT_NDIFF |
+      ...                                   REPORT_ONLY_FIRST_FAILURE)
+      True
+    """
+    global _unittest_reportflags
+
+    if (flags & REPORTING_FLAGS) != flags:
+        raise ValueError("Only reporting flags allowed", flags)
+    old = _unittest_reportflags
+    _unittest_reportflags = flags
+    return old
+
+
+class DocTestCase(unittest.TestCase):
+
+    def __init__(self, test, optionflags=0, setUp=None, tearDown=None,
+                 checker=None):
+
+        unittest.TestCase.__init__(self)
+        self._dt_optionflags = optionflags
+        self._dt_checker = checker
+        self._dt_test = test
+        self._dt_setUp = setUp
+        self._dt_tearDown = tearDown
+
+    def setUp(self):
+        test = self._dt_test
+
+        if self._dt_setUp is not None:
+            self._dt_setUp(test)
+
+    def tearDown(self):
+        test = self._dt_test
+
+        if self._dt_tearDown is not None:
+            self._dt_tearDown(test)
+
+        test.globs.clear()
+
+    def runTest(self):
+        test = self._dt_test
+        old = sys.stdout
+        new = StringIO()
+        optionflags = self._dt_optionflags
+
+        if not (optionflags & REPORTING_FLAGS):
+            # The option flags don't include any reporting flags,
+            # so add the default reporting flags
+            optionflags |= _unittest_reportflags
+
+        runner = DocTestRunner(optionflags=optionflags,
+                               checker=self._dt_checker, verbose=False)
+
+        try:
+            runner.DIVIDER = "-"*70
+            failures, tries = runner.run(
+                test, out=new.write, clear_globs=False)
+        finally:
+            sys.stdout = old
+
+        if failures:
+            raise self.failureException(self.format_failure(new.getvalue()))
+
+    def format_failure(self, err):
+        test = self._dt_test
+        if test.lineno is None:
+            lineno = 'unknown line number'
+        else:
+            lineno = '%s' % test.lineno
+        lname = '.'.join(test.name.split('.')[-1:])
+        return ('Failed doctest test for %s\n'
+                '  File "%s", line %s, in %s\n\n%s'
+                % (test.name, test.filename, lineno, lname, err)
+                )
+
+    def debug(self):
+        r"""Run the test case without results and without catching exceptions
+
+           The unit test framework includes a debug method on test cases
+           and test suites to support post-mortem debugging.  The test code
+           is run in such a way that errors are not caught.  This way a
+           caller can catch the errors and initiate post-mortem debugging.
+
+           The DocTestCase provides a debug method that raises
+           UnexpectedException errors if there is an unexpected
+           exception:
+
+             >>> test = DocTestParser().get_doctest('>>> raise KeyError\n42',
+             ...                {}, 'foo', 'foo.py', 0)
+             >>> case = DocTestCase(test)
+             >>> try:
+             ...     case.debug()
+             ... except UnexpectedException, failure:
+             ...     pass
+
+           The UnexpectedException contains the test, the example, and
+           the original exception:
+
+             >>> failure.test is test
+             True
+
+             >>> failure.example.want
+             '42\n'
+
+             >>> exc_info = failure.exc_info
+             >>> raise exc_info[0], exc_info[1], exc_info[2]
+             Traceback (most recent call last):
+             ...
+             KeyError
+
+           If the output doesn't match, then a DocTestFailure is raised:
+
+             >>> test = DocTestParser().get_doctest('''
+             ...      >>> x = 1
+             ...      >>> x
+             ...      2
+             ...      ''', {}, 'foo', 'foo.py', 0)
+             >>> case = DocTestCase(test)
+
+             >>> try:
+             ...    case.debug()
+             ... except DocTestFailure, failure:
+             ...    pass
+
+           DocTestFailure objects provide access to the test:
+
+             >>> failure.test is test
+             True
+
+           As well as to the example:
+
+             >>> failure.example.want
+             '2\n'
+
+           and the actual output:
+
+             >>> failure.got
+             '1\n'
+
+           """
+
+        self.setUp()
+        runner = DebugRunner(optionflags=self._dt_optionflags,
+                             checker=self._dt_checker, verbose=False)
+        runner.run(self._dt_test)
+        self.tearDown()
+
+    def id(self):
+        return self._dt_test.name
+
+    def __repr__(self):
+        name = self._dt_test.name.split('.')
+        return "%s (%s)" % (name[-1], '.'.join(name[:-1]))
+
+    __str__ = __repr__
+
+    def shortDescription(self):
+        return "Doctest: " + self._dt_test.name
+
+def DocTestSuite(module=None, globs=None, extraglobs=None, test_finder=None,
+                 **options):
+    """
+    Convert doctest tests for a module to a unittest test suite.
+
+    This converts each documentation string in a module that
+    contains doctest tests to a unittest test case.  If any of the
+    tests in a doc string fail, then the test case fails.  An exception
+    is raised showing the name of the file containing the test and a
+    (sometimes approximate) line number.
+
+    The `module` argument provides the module to be tested.  The argument
+    can be either a module or a module name.
+
+    If no argument is given, the calling module is used.
+
+    A number of options may be provided as keyword arguments:
+
+    setUp
+      A set-up function.  This is called before running the tests in
+      each docstring.  The setUp function will be passed a DocTest
+      object.  The setUp function can access the test globals as the
+      globs attribute of the test passed.
+
+    tearDown
+      A tear-down function.  This is called after running the tests in
+      each docstring.  The tearDown function will be passed a DocTest
+      object.  The tearDown function can access the test globals as the
+      globs attribute of the test passed.
+
+    globs
+      A dictionary containing initial global variables for the tests.
+
+    optionflags
+       A set of doctest option flags expressed as an integer.
+    """
+
+    if test_finder is None:
+        test_finder = DocTestFinder()
+
+    module = _normalize_module(module)
+    tests = test_finder.find(module, globs=globs, extraglobs=extraglobs)
+    if globs is None:
+        globs = module.__dict__
+    if not tests:
+        # Why do we want to do this? Because it reveals a bug that might
+        # otherwise be hidden.
+        raise ValueError(module, "has no tests")
+
+    tests.sort()
+    suite = unittest.TestSuite()
+    for test in tests:
+        if len(test.examples) == 0:
+            continue
+        if not test.filename:
+            filename = module.__file__
+            if filename[-4:] in (".pyc", ".pyo"):
+                filename = filename[:-1]
+            test.filename = filename
+        suite.addTest(DocTestCase(test, **options))
+
+    return suite
+
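+# Sketch: running a module's docstring tests via unittest (the module
+# name below is illustrative; any module containing doctests works):
+#
+#   import unittest
+#   suite = DocTestSuite("mechanize._rfc3986")
+#   unittest.TextTestRunner(verbosity=2).run(suite)
+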
+class DocFileCase(DocTestCase):
+
+    def id(self):
+        return '_'.join(self._dt_test.name.split('.'))
+
+    def __repr__(self):
+        return self._dt_test.filename
+    __str__ = __repr__
+
+    def format_failure(self, err):
+        return ('Failed doctest test for %s\n  File "%s", line 0\n\n%s'
+                % (self._dt_test.name, self._dt_test.filename, err)
+                )
+
+def DocFileTest(path, module_relative=True, package=None,
+                globs=None, parser=DocTestParser(), **options):
+    if globs is None:
+        globs = {}
+    else:
+        globs = globs.copy()
+
+    if package and not module_relative:
+        raise ValueError("Package may only be specified for module-"
+                         "relative paths.")
+
+    # Relativize the path.
+    doc, path = _load_testfile(path, package, module_relative)
+
+    if "__file__" not in globs:
+        globs["__file__"] = path
+
+    # Find the file and read it.
+    name = os.path.basename(path)
+
+    # Convert it to a test, and wrap it in a DocFileCase.
+    test = parser.get_doctest(doc, globs, name, path, 0)
+    return DocFileCase(test, **options)
+
+def DocFileSuite(*paths, **kw):
+    """A unittest suite for one or more doctest files.
+
+    The path to each doctest file is given as a string; the
+    interpretation of that string depends on the keyword argument
+    "module_relative".
+
+    A number of options may be provided as keyword arguments:
+
+    module_relative
+      If "module_relative" is True, then the given file paths are
+      interpreted as os-independent module-relative paths.  By
+      default, these paths are relative to the calling module's
+      directory; but if the "package" argument is specified, then
+      they are relative to that package.  To ensure os-independence,
+      "filename" should use "/" characters to separate path
+      segments, and may not be an absolute path (i.e., it may not
+      begin with "/").
+
+      If "module_relative" is False, then the given file paths are
+      interpreted as os-specific paths.  These paths may be absolute
+      or relative (to the current working directory).
+
+    package
+      A Python package or the name of a Python package whose directory
+      should be used as the base directory for module relative paths.
+      If "package" is not specified, then the calling module's
+      directory is used as the base directory for module relative
+      filenames.  It is an error to specify "package" if
+      "module_relative" is False.
+
+    setUp
+      A set-up function.  This is called before running the
+      tests in each file. The setUp function will be passed a DocTest
+      object.  The setUp function can access the test globals as the
+      globs attribute of the test passed.
+
+    tearDown
+      A tear-down function.  This is called after running the
+      tests in each file.  The tearDown function will be passed a DocTest
+      object.  The tearDown function can access the test globals as the
+      globs attribute of the test passed.
+
+    globs
+      A dictionary containing initial global variables for the tests.
+
+    optionflags
+      A set of doctest option flags expressed as an integer.
+
+    parser
+      A DocTestParser (or subclass) that should be used to extract
+      tests from the files.
+    """
+    suite = unittest.TestSuite()
+
+    # We do this here so that _normalize_module is called at the right
+    # level.  If it were called in DocFileTest, then this function
+    # would be the caller and we might guess the package incorrectly.
+    if kw.get('module_relative', True):
+        kw['package'] = _normalize_module(kw.get('package'))
+
+    for path in paths:
+        suite.addTest(DocFileTest(path, **kw))
+
+    return suite
+
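+# Sketch: collecting .doctest files into one suite (paths as in the
+# mechanize source tree; module_relative=False makes them ordinary os
+# paths relative to the working directory):
+#
+#   import unittest
+#   suite = DocFileSuite("test/test_rfc3986.doctest",
+#                        "test/test_request.doctest",
+#                        module_relative=False)
+#   unittest.TextTestRunner().run(suite)
+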
+######################################################################
+## 9. Debugging Support
+######################################################################
+
+def script_from_examples(s):
+    r"""Extract script from text with examples.
+
+       Converts text with examples to a Python script.  Example input is
+       converted to regular code.  Example output and all other words
+       are converted to comments:
+
+       >>> text = '''
+       ...       Here are examples of simple math.
+       ...
+       ...           Python has super accurate integer addition
+       ...
+       ...           >>> 2 + 2
+       ...           5
+       ...
+       ...           And very friendly error messages:
+       ...
+       ...           >>> 1/0
+       ...           To Infinity
+       ...           And
+       ...           Beyond
+       ...
+       ...           You can use logic if you want:
+       ...
+       ...           >>> if 0:
+       ...           ...    blah
+       ...           ...    blah
+       ...           ...
+       ...
+       ...           Ho hum
+       ...           '''
+
+       >>> print script_from_examples(text)
+       # Here are examples of simple math.
+       #
+       #     Python has super accurate integer addition
+       #
+       2 + 2
+       # Expected:
+       ## 5
+       #
+       #     And very friendly error messages:
+       #
+       1/0
+       # Expected:
+       ## To Infinity
+       ## And
+       ## Beyond
+       #
+       #     You can use logic if you want:
+       #
+       if 0:
+          blah
+          blah
+       #
+       #     Ho hum
+       <BLANKLINE>
+       """
+    output = []
+    for piece in DocTestParser().parse(s):
+        if isinstance(piece, Example):
+            # Add the example's source code (strip trailing NL)
+            output.append(piece.source[:-1])
+            # Add the expected output:
+            want = piece.want
+            if want:
+                output.append('# Expected:')
+                output += ['## '+l for l in want.split('\n')[:-1]]
+        else:
+            # Add non-example text.
+            output += [_comment_line(l)
+                       for l in piece.split('\n')[:-1]]
+
+    # Trim junk on both ends.
+    while output and output[-1] == '#':
+        output.pop()
+    while output and output[0] == '#':
+        output.pop(0)
+    # Combine the output, and return it.
+    # Add a courtesy newline to prevent exec from choking (see bug #1172785)
+    return '\n'.join(output) + '\n'
+
+def testsource(module, name):
+    """Extract the test sources from a doctest docstring as a script.
+
+    Provide the module (or dotted name of the module) containing the
+    test to be debugged and the name (within the module) of the object
+    with the doc string with tests to be debugged.
+    """
+    module = _normalize_module(module)
+    tests = DocTestFinder().find(module)
+    test = [t for t in tests if t.name == name]
+    if not test:
+        raise ValueError(name, "not found in tests")
+    test = test[0]
+    testsrc = script_from_examples(test.docstring)
+    return testsrc
+
+def debug_src(src, pm=False, globs=None):
+    """Debug a single doctest docstring, given in argument `src`."""
+    testsrc = script_from_examples(src)
+    debug_script(testsrc, pm, globs)
+
+def debug_script(src, pm=False, globs=None):
+    "Debug a test script.  `src` is the script, as a string."
+    import pdb
+
+    # Note that tempfile.NamedTemporaryFile() cannot be used.  As the
+    # docs say, a file so created cannot be opened by name a second time
+    # on modern Windows boxes, and execfile() needs to open it.
+    srcfilename = tempfile.mktemp(".py", "doctestdebug")
+    f = open(srcfilename, 'w')
+    f.write(src)
+    f.close()
+
+    try:
+        if globs:
+            globs = globs.copy()
+        else:
+            globs = {}
+
+        if pm:
+            try:
+                execfile(srcfilename, globs, globs)
+            except:
+                print sys.exc_info()[1]
+                pdb.post_mortem(sys.exc_info()[2])
+        else:
+            # Note that %r is vital here.  '%s' instead can, e.g., cause
+            # backslashes to get treated as metacharacters on Windows.
+            pdb.run("execfile(%r)" % srcfilename, globs, globs)
+
+    finally:
+        os.remove(srcfilename)
+
+def debug(module, name, pm=False):
+    """Debug a single doctest docstring.
+
+    Provide the module (or dotted name of the module) containing the
+    test to be debugged and the name (within the module) of the object
+    with the docstring with tests to be debugged.
+    """
+    module = _normalize_module(module)
+    testsrc = testsource(module, name)
+    debug_script(testsrc, pm, module.__dict__)
+
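+# Sketch of post-mortem debugging (module and object names are
+# hypothetical).  With pm=True, the examples run until one raises, then
+# pdb.post_mortem() is started on that traceback:
+#
+#   import mymodule
+#   debug(mymodule, "mymodule.myfunc", pm=True)
+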
+######################################################################
+## 10. Example Usage
+######################################################################
+class _TestClass:
+    """
+    A pointless class, for sanity-checking of docstring testing.
+
+    Methods:
+        square()
+        get()
+
+    >>> _TestClass(13).get() + _TestClass(-12).get()
+    1
+    >>> hex(_TestClass(13).square().get())
+    '0xa9'
+    """
+
+    def __init__(self, val):
+        """val -> _TestClass object with associated value val.
+
+        >>> t = _TestClass(123)
+        >>> print t.get()
+        123
+        """
+
+        self.val = val
+
+    def square(self):
+        """square() -> square TestClass's associated value
+
+        >>> _TestClass(13).square().get()
+        169
+        """
+
+        self.val = self.val ** 2
+        return self
+
+    def get(self):
+        """get() -> return TestClass's associated value.
+
+        >>> x = _TestClass(-42)
+        >>> print x.get()
+        -42
+        """
+
+        return self.val
+
+__test__ = {"_TestClass": _TestClass,
+            "string": r"""
+                      Example of a string object, searched as-is.
+                      >>> x = 1; y = 2
+                      >>> x + y, x * y
+                      (3, 2)
+                      """,
+
+            "bool-int equivalence": r"""
+                                    In 2.2, boolean expressions displayed
+                                    0 or 1.  By default, we still accept
+                                    them.  This can be disabled by passing
+                                    DONT_ACCEPT_TRUE_FOR_1 to the new
+                                    optionflags argument.
+                                    >>> 4 == 4
+                                    1
+                                    >>> 4 == 4
+                                    True
+                                    >>> 4 > 4
+                                    0
+                                    >>> 4 > 4
+                                    False
+                                    """,
+
+            "blank lines": r"""
+                Blank lines can be marked with <BLANKLINE>:
+                    >>> print 'foo\n\nbar\n'
+                    foo
+                    <BLANKLINE>
+                    bar
+                    <BLANKLINE>
+            """,
+
+            "ellipsis": r"""
+                If the ellipsis flag is used, then '...' can be used to
+                elide substrings in the desired output:
+                    >>> print range(1000) #doctest: +ELLIPSIS
+                    [0, 1, 2, ..., 999]
+            """,
+
+            "whitespace normalization": r"""
+                If the whitespace normalization flag is used, then
+                differences in whitespace are ignored.
+                    >>> print range(30) #doctest: +NORMALIZE_WHITESPACE
+                    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
+                     15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
+                     27, 28, 29]
+            """,
+           }
+
+def _test():
+    r = unittest.TextTestRunner()
+    r.run(DocTestSuite())
+
+if __name__ == "__main__":
+    _test()

Added: python-mechanize/branches/upstream/current/test-tools/linecache_copy.py
===================================================================
--- python-mechanize/branches/upstream/current/test-tools/linecache_copy.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test-tools/linecache_copy.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -0,0 +1,132 @@
+"""Cache lines from files.
+
+This is intended to read lines from imported modules -- hence if a
+filename is not found, it will look down the module search path for a
+file by that name.
+"""
+
+import sys
+import os
+
+__all__ = ["getline", "clearcache", "checkcache"]
+
+def getline(filename, lineno, module_globals=None):
+    lines = getlines(filename, module_globals)
+    if 1 <= lineno <= len(lines):
+        return lines[lineno-1]
+    else:
+        return ''
+
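+# Usage sketch (the path is hypothetical); the API mirrors the stdlib
+# linecache module:
+#
+#   import linecache_copy
+#   line = linecache_copy.getline("/tmp/example.py", 1)  # '' if unavailable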
+
+# The cache
+
+cache = {} # The cache
+
+
+def clearcache():
+    """Clear the cache entirely."""
+
+    global cache
+    cache = {}
+
+
+def getlines(filename, module_globals=None):
+    """Get the lines for a file from the cache.
+    Update the cache if it doesn't contain an entry for this file already."""
+
+    if filename in cache:
+        return cache[filename][2]
+    else:
+        return updatecache(filename, module_globals)
+
+
+def checkcache(filename=None):
+    """Discard cache entries that are out of date.
+    (This is not checked upon each call!)"""
+
+    if filename is None:
+        filenames = cache.keys()
+    else:
+        if filename in cache:
+            filenames = [filename]
+        else:
+            return
+
+    for filename in filenames:
+        size, mtime, lines, fullname = cache[filename]
+        if mtime is None:
+            continue   # no-op for files loaded via a __loader__
+        try:
+            stat = os.stat(fullname)
+        except os.error:
+            del cache[filename]
+            continue
+        if size != stat.st_size or mtime != stat.st_mtime:
+            del cache[filename]
+
+
+def updatecache(filename, module_globals=None):
+    """Update a cache entry and return its list of lines.
+    If something's wrong, print a message, discard the cache entry,
+    and return an empty list."""
+
+    if filename in cache:
+        del cache[filename]
+    if not filename or filename[0] + filename[-1] == '<>':
+        return []
+
+    fullname = filename
+    try:
+        stat = os.stat(fullname)
+    except os.error, msg:
+        basename = os.path.split(filename)[1]
+
+        # Try for a __loader__, if available
+        if module_globals and '__loader__' in module_globals:
+            name = module_globals.get('__name__')
+            loader = module_globals['__loader__']
+            get_source = getattr(loader, 'get_source', None)
+
+            if name and get_source:
+                if basename.startswith(name.split('.')[-1]+'.'):
+                    try:
+                        data = get_source(name)
+                    except (ImportError, IOError):
+                        pass
+                    else:
+                        cache[filename] = (
+                            len(data), None,
+                            [line+'\n' for line in data.splitlines()], fullname
+                        )
+                        return cache[filename][2]
+
+        # Try looking through the module search path.
+
+        for dirname in sys.path:
+            # When using imputil, sys.path may contain things other than
+            # strings; ignore them when it happens.
+            try:
+                fullname = os.path.join(dirname, basename)
+            except (TypeError, AttributeError):
+                # Not sufficiently string-like to do anything useful with.
+                pass
+            else:
+                try:
+                    stat = os.stat(fullname)
+                    break
+                except os.error:
+                    pass
+        else:
+            # No luck
+##          print '*** Cannot stat', filename, ':', msg
+            return []
+    try:
+        fp = open(fullname, 'rU')
+        lines = fp.readlines()
+        fp.close()
+    except IOError, msg:
+##      print '*** Cannot open', fullname, ':', msg
+        return []
+    size, mtime = stat.st_size, stat.st_mtime
+    cache[filename] = size, mtime, lines, fullname
+    return lines

Modified: python-mechanize/branches/upstream/current/test.py
===================================================================
--- python-mechanize/branches/upstream/current/test.py	2007-04-08 14:56:11 UTC (rev 763)
+++ python-mechanize/branches/upstream/current/test.py	2007-04-09 20:40:55 UTC (rev 764)
@@ -8,20 +8,46 @@
 
 """
 
+import cgitb
+#cgitb.enable(format="text")
+
 # Modules containing tests to run -- a test is anything named *Tests, which
 # should be classes deriving from unittest.TestCase.
-MODULE_NAMES = ["test_date", "test_mechanize", "test_misc", "test_cookies",
+MODULE_NAMES = ["test_date", "test_browser", "test_response", "test_cookies",
                 "test_headers", "test_urllib2", "test_pullparser",
+                "test_useragent", "test_html", "test_opener",
                 ]
 
-import sys, os, traceback, logging
-from unittest import defaultTestLoader, TextTestRunner, TestSuite, TestCase
+import sys, os, traceback, logging, glob
+from unittest import defaultTestLoader, TextTestRunner, TestSuite, TestCase, \
+     _TextTestResult
 
-level = logging.DEBUG
+#level = logging.DEBUG
 #level = logging.INFO
+#level = logging.WARNING
 #level = logging.NOTSET
 #logging.getLogger("mechanize").setLevel(level)
+#logging.getLogger("mechanize").addHandler(logging.StreamHandler(sys.stdout))
 
+
+class CgitbTextResult(_TextTestResult):
+    def _exc_info_to_string(self, err, test):
+        """Converts a sys.exc_info()-style tuple of values into a string."""
+        exctype, value, tb = err
+        # Skip test runner traceback levels
+        while tb and self._is_relevant_tb_level(tb):
+            tb = tb.tb_next
+        # Both branches format with cgitb; the relevant-level count for
+        # assertion failures is kept for parity with unittest's version
+        # but is otherwise unused here.
+        if exctype is test.failureException:
+            # Skip assert*() traceback levels
+            length = self._count_relevant_tb_levels(tb)
+            return cgitb.text((exctype, value, tb))
+        return cgitb.text((exctype, value, tb))
+
+class CgitbTextTestRunner(TextTestRunner):
+    def _makeResult(self):
+        return CgitbTextResult(self.stream, self.descriptions, self.verbosity)
+
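+# CgitbTextTestRunner is selected below (in __main__) when the -t flag
+# is passed on the command line.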
+
 class TestProgram:
     """A command-line program that runs a set of tests; this is primarily
        for making test modules conveniently executable.
@@ -57,7 +83,6 @@
         self.testLoader = testLoader
         self.progName = os.path.basename(argv[0])
         self.parseArgs(argv)
-        self.runTests()
 
     def usageExit(self, msg=None):
         if msg: print msg
@@ -98,23 +123,114 @@
         if self.testRunner is None:
             self.testRunner = TextTestRunner(verbosity=self.verbosity)
         result = self.testRunner.run(self.test)
-        sys.exit(not result.wasSuccessful())
+        return result
 
 
 if __name__ == "__main__":
+##     sys.path.insert(0, '/home/john/comp/dev/rl/jjlee/lib/python')
+##     import jjl
+##     import __builtin__
+##     __builtin__.jjl = jjl
+
     # XXX temporary stop-gap to run doctests
-    assert os.path.isdir('test')
-    sys.path.insert(0, 'test')
+
+    # XXXX coverage output seems incorrect ATM
+    run_coverage = "-c" in sys.argv
+    if run_coverage:
+        sys.argv.remove("-c")
+    use_cgitb = "-t" in sys.argv
+    if use_cgitb:
+        sys.argv.remove("-t")
+    run_doctests = "-d" not in sys.argv
+    if not run_doctests:
+        sys.argv.remove("-d")
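+    # e.g. "python test.py -t -d" formats tracebacks with cgitb and
+    # skips the doctests; "-c" additionally collects coverage data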
+
+    # import local copy of Python 2.5 doctest
+    assert os.path.isdir("test")
+    sys.path.insert(0, "test")
+    # needed for recent doctest / linecache -- these are only for testing
+    # purposes and don't get installed: doctest.py revision 45701 and
+    # linecache.py revision 45940.  Since linecache is used by Python
+    # itself, linecache.py is renamed linecache_copy.py, and this copy of
+    # doctest is modified (only) to use that renamed module.
+    sys.path.insert(0, "test-tools")
     import doctest
-    import test_mechanize
-    doctest.testmod(test_mechanize)
-    from mechanize import _headersutil, _auth, _clientcookie, _pullparser
-    doctest.testmod(_headersutil)
-    doctest.testmod(_auth)
-    doctest.testmod(_clientcookie)
-    doctest.testmod(_pullparser)
 
+    if run_coverage:
+        # imported only when needed, so that plain test runs don't
+        # require the coverage module to be installed
+        import coverage
+        print 'running coverage'
+        coverage.erase()
+        coverage.start()
+
+    import mechanize
+
+    if run_doctests:
+        # run .doctest files needing special support
+        common_globs = {"mechanize": mechanize}
+        pm_doctest_filename = os.path.join("test", "test_password_manager.doctest")
+        for globs in [
+            {"mgr_class": mechanize.HTTPPasswordMgr},
+            {"mgr_class": mechanize.HTTPProxyPasswordMgr},
+            ]:
+            globs.update(common_globs)
+            doctest.testfile(
+                pm_doctest_filename,
+                #os.path.join("test", "test_scratch.doctest"),
+                globs=globs,
+                )
+
+        # run .doctest files
+        special_doctests = [pm_doctest_filename,
+                            os.path.join("test", "test_scratch.doctest"),
+                            ]
+        doctest_files = glob.glob(os.path.join("test", "*.doctest"))
+
+        for dt in special_doctests:
+            if dt in doctest_files:
+                doctest_files.remove(dt)
+        for df in doctest_files:
+            doctest.testfile(df)
+
+        # run doctests in docstrings
+        from mechanize import _headersutil, _auth, _clientcookie, _pullparser, \
+             _http, _rfc3986
+        doctest.testmod(_headersutil)
+        doctest.testmod(_rfc3986)
+        doctest.testmod(_auth)
+        doctest.testmod(_clientcookie)
+        doctest.testmod(_pullparser)
+        doctest.testmod(_http)
+
+    # run vanilla unittest tests
     import unittest
     test_path = os.path.join(os.path.dirname(sys.argv[0]), "test")
     sys.path.insert(0, test_path)
-    TestProgram(MODULE_NAMES)
+    test_runner = None
+    if use_cgitb:
+        test_runner = CgitbTextTestRunner()
+    prog = TestProgram(MODULE_NAMES, testRunner=test_runner)
+    result = prog.runTests()
+
+    if run_coverage:
+        # HTML coverage report
+        import colorize
+        try:
+            os.mkdir("coverage")
+        except OSError:
+            pass
+        private_modules = glob.glob("mechanize/_*.py")
+        private_modules.remove("mechanize/__init__.py")
+        for module_filename in private_modules:
+            module_name = module_filename.replace("/", ".")[:-3]
+            print module_name
+            module = sys.modules[module_name]
+            f, s, m, mf = coverage.analysis(module)
+            fo = open(os.path.join('coverage', os.path.basename(f)+'.html'), 'wb')
+            colorize.colorize_file(f, outstream=fo, not_covered=mf)
+            fo.close()
+            coverage.report(module)
+            #print coverage.analysis(module)
+
+    # XXX exit status is wrong -- does not take account of doctests
+    sys.exit(not result.wasSuccessful())



