This test uses Perl module Test::WWW::Mechanize::CGI to check gitweb output, using HTML::Lint (if present) to validate HTML. Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- Review, review, review! This test requires test_external_without_stderr in t/test-lib.sh by Lea Wiemann, send to git mailing list in [PATCH] t/test-lib.sh: add test_external and test_external_without_stderr http://thread.gmane.org/gmane.comp.version-control.git/83415 Message-Id: <1212276975-27428-1-git-send-email-LeWiemann@gmail.com> also present in git://repo.or.cz/git/gitweb-caching.git since commit (currently) d28d31467db2ec5737948685ade281d5c0704a27. This is very much work in progress; it uses Test::WWW::Mechanize::CGI Perl module (from CPAN), which makes writing tests checking form (but not requiring exact details) of gitweb output. The test driver, i.e. t/t9503-gitweb-Mechanize.sh skips tests if they cannot be done; note that it assumes that if one jas TWM::CGI module it also has all modules it depends on (Test::WWW::Mechanize, HTTP::Request::AsCGI,...) installed. The alternatives would be to redo work of TWM::CGI using standard or simply more common modules (LWP::* modules, HTTP::Request::AsCGI), or (proposed during off-line discussion with Lea Wiemann) to create expected output documents and diff (compare) literal output. The second solution has two disadvantages I can see. First, it freezes gitweb output format, making improvements more difficult. Second, if there is change that invalidates some or all tests vectors, you would have to regenerate those vectors from actual gitweb output without having any tests to check this output; it would be quite easy to put into test vectors errorneous output. On the other hand current approach means much, much more tests. It is also bit more challenging, because you have to find invariants of the output and test that they are fullfilled. NOTE: Currently test_external_without_stderr fails because when trying to access URL for ...
For everyone's convenience, here's the raw patch: http://article.gmane.org/gmane.comp.version-control.git/83415/raw (When repo.or.cz is up again you can also locate a version with fixed Awesome, thank you so much for making a start here! Here are some quick Without having looked at the cause of that, I think that gitweb should not be writing stuff to stderr unless an internal or serious error occurs; in particular trying to access invalid commits shouldn't cause messages on stderr, only to log files if at all. That said, as long as it isn't fixed, here's my workaround to temporarily discard stderr (from my t/t9710/test.pl): our $old_stderr; sub discard_stderr { open our $old_stderr, ">&", STDERR or die "cannot save STDERR"; close STDERR; } sub restore_stderr { open STDERR, ">&", $old_stderr or die "cannot restore STDERR"; } t9500 seems to be doing the same(?) thing, but this somehow doesn't work with your t9503 test: $ git # no git in PATH to make sure it picks up the right git binary bash: git: command not found $ ./t9500-gitweb-standalone-no-errors.sh | grep passed * passed all 75 test(s) $ ./t9503-gitweb-Mechanize.sh -v [...] gitweb.perl: Can't exec "git": No such file or directory at /home/lea/source/git/fresh-git/gitweb/gitweb.perl line 380. -- Lea --
Actually it isn't gitweb (or Perl) writing to stderr, but git itself. Somehow, at least for gitweb run as CGI script (and under legacy mod_perl) with Apache 2 as web server this error message: fatal: bad revision 'non-existent' doesn't land in web server logs (/var/log/httpd/error_log). So it lands in /dev/null when running gitweb as web script, so it was deemed not important (also fixing this is not very easy, as you can read below). t/t9500-gitweb-standalone-no-errors.sh considers as errors only those error message which would make it into web server logs, see gitweb_run() function there. This catches compilation errors. Fixing this is not that simple. There is no option to git-rev-list to not write any output to stderr (no, '--quiet' is about something else), and I'd rather not lose all advantages of list (shell-less) form of magical "|-" open only for "2>/dev/null" redirection as in git_object subroutine in gitweb. Perhaps it could be solved in Git.pm, and when gitweb is rewritten to use "use Git" (and global $repo object instead of global $git_dir variable) it would automatically fix it (using Git.pm would have the advantage of making gitweb more portable, I think - to ActiveState broken Perl implementation, with broken magic "|-" open). Or git-rev-list, or even git wrapper itself, could acquire option to redirect all stderr to dev null... I think adding it in git wrapper would be even better; simply change "warning" and "die" to null It should work. test-lib.sh sets up $PATH to have 'git' binary (just ...and it would be very strange for t9500 to pass, but t9503 do not pass. (Of course both tests passes at my computer, otherwise I wouldn't send this patch in current form). Hmmm... perhaps $PATH doesn't get passed down... strange. But thanks to your report I have found bug in gitweb. I have changed t/t9503-gitweb-Mechanize.sh... diff --git a/t/t9503-gitweb-Mechanize.sh b/t/t9503-gitweb-Mechanize.sh index 5df22c4..a5be275 100755 --- ...
Okay, this one will go away with the new API I'm writing, which uses cat-file --batch-check instead of rev-list. In the meantime (and in other cases) I guess diverting stderr in the test code is fine. (I wouldn't want to ignore stderr in all cases, even where you're not expecting any output on stderr, since that might actually indicate an Since you're accessing http://localhost/ URLs, the web server's PATH is in effect, which doesn't get overridden by the tests. But anyways, using an environment variable (see my other email) will move the For t9503 this'll go away if the GITWEB_TEST_BASE_URL thing is implemented. For t9500 (which contains $GIT = "git" as well), it should be fine as is since test-lib.sh sets PATH and thus gitweb picks up the right binary. -- Lea --
Won't work. Well, it would work for p=commit;h=non-existent, but it would not work for p=log;h=non-existent, where git-rev-list (or git-log) is really needed. And I'd rather have parse_commit() (used in 'commit' view) and parse_commits() (used in log-like views) share the same commit parser, parse_commit_text(), so it's rev-list also for git_commit, not cat-file. Well, I'd divert stderr in tests cases (or use simply 'test_external'), or better filter stderr, only in those cases where there is spurious but not dangerous thing on stderr. But again, I think the solution would be either to add feature to Git.pm, something like command_output_pipe_no_stderr, which would redirect stderr to /dev/null, or modify git wrapper to redirect stderr to /dev/null in scripts / when calling script, and redefine It isn't. http://localhost/ is just access convention, and WWW::Mechanize::CGI actually calls system()[*1*] on provided path made absolute (hence error when it contains whitespace), setting CGI environmental variables before calling it. [*1*] it would be nice to have perl_application in WWW::Mechanize::CGI, which would simply setup %ENV and use do() instead of system() on provided application. Perhaps it would be better in meantime to simply craft $mech->cgi( sub { ... } ), and do not use generic $mech->cgi_application($path); we could do without all the checking, too. -- Jakub Narebski Poland --
Since you can only run whole perl scripts with test_external, we'll have to resort to using those discard_/restore_stderr functions. I doubt (FYI, Git.pm has such a feature in the command method, though it it simply closes STDERR and doesn't properly re-open it, I believe.) The Git::Repo API I'm writing doesn't currently generate any spurious output Yup, I should've read TFM. :) On my system, $PATH is empty in gitweb.cgi for some strange reason, but since using an absolute path for Gitweb and probably CGI::Carp qw(fatalsToBrowser) use 'exit', so we'd have to use Test::Trap (or so) to catch those. I think we should defer this until performance actually becomes an issue. -- Lea --
The idea was more to avoid bug in cgi_application() method of WM::CGI,
which fails on pathname containing embedded spaces (that is why
workaround in setting $gitweb path, to avoid 'trash directory' in it),
than for performance reasons.
Nevertheless naive $mech->cgi( sub { do "$gitweb"; } ); doesn't work;
and after thinking about it a little it simply couldn't work...
--
Jakub Narebski
Poland
--
Just noticed this -- here's the cause of the problem I encountered in my previous post. You can't really probe for a running gitweb installation on localhost, because even if there's one, you don't know if you're testing against your tree and your test repo. So unless it's (easily) possible to force Mechanize to eat gitweb.cgi's output in lieu of making actual HTTP requests, I'd suggest that we simply use an environment variable (e.g. GITWEB_TEST_BASE_URL=http://localhost/cgi-bin/gitweb.cgi) -- if it's unset, the tests get skipped, otherwise they're run against that base URL. *wanders-off-to-write-a-patch* -- Lea --
Errr... WWW::Mechanize::CGI does not access web server, but runs CGI script (or CGI subroutine) as if it was installed in the root directory of localhost. 'http://localhost' simply means installed CGI script. For example tests works for me, even though I have gitweb installed somewhere deeper than directly at locahost: http://localhost/cgi-bin/gitweb/gitweb.cgi Besides you can check output and see that it is consistent with set and used gitweb config in t9503* test. Test::WWW::Mechanize::CGI has the following description: http://search.cpan.org/~mramberg/Test-WWW-Mechanize-CGI-0.1/lib/Test/WWW/Mechanize/CGI.pm Provides a convenient way of testing CGI applications without a external daemon. The documentation of module (0.1 in the case of TWM::CGI, 0.3 in the case of WM::CGI) leaves something to be desired, so I mainly used examples provided as guideline. -- Jakub Narebski Poland --
From: Jakub Narebski <jnareb@gmail.com>
This test uses Test::WWW::Mechanize::CGI to check gitweb's output. It
also uses HTML::Lint (if present) to validate the HTML.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Lea Wiemann <LeWiemann@gmail.com>
---
This requires the following two patches:
[PATCH] gitweb: Make it work with $GIT containing spaces
http://article.gmane.org/gmane.comp.version-control.git/85036/raw
[PATCH] t/test-lib.sh: add test_external and test_external_without_stderr
http://article.gmane.org/gmane.comp.version-control.git/83415/raw
Changed since v1:
- Use "$GIT_EXEC_PATH/git" instead of "git" for the git binary path,
since on my system $PATH gets lost mysteriously.
- Use test_external instead of test_external_without_stderr so that
the test doesn't fail if there is output on stderr.
Discarding and restoring STDERR (as I mentioned in a recent message)
doesn't work with Mechanize, as Mechanize somehow seems to save the
STDERR handle and reuse it, so subsequent calls to Mechanize keep
using the discarded STDERR. (The discard/restore functions work
fine in other test modules, it's really just Mechanize here.)
Using test_external means that in verbose mode we get some spurious
error messages in the test output, but they're not harmful (and it
should be clear to anyone running the test suite that they're OK).
- Reworded commit message a little for clarity.
The test works now on my system. If you run this successfully (or
unsuccessfully), please post a note on the list.
Comments/patches/additions, in particular from people who've used
Mechanize before, is still much appreciated!
This patch is WIP and not for inclusion. At some point we'll take
further revisions off the list and use a public repository instead
(once repo.or.cz is up again); we'll post a notice when that happens.
When the test suite is finished we'll post it here again.
t/t9503-gitweb-Mechanize.sh | 127 ...Actually I think it is CGI module itself catching and redirecting
STDERR from Perl to logfile, and WWW::Mechanize::CGI having to catch
and redirect all stderr of invoked application, but I'm not sure.
What should we think about now, I guess, is
1.) Should we put all tests in one file, or should they be split
according to what do they test? For example separate tests
for correct 4xx and 5xx responses for requests for non-exiting
objects, and for not permitted views.
2.) What invariants should we test, beside obvious HTML validation
using HTML::Lint (by the way, is there some Perl module for RSS,
Atom and OPML validation?). Checking for example if all items
are listed in a 'tree' view, or if all inner links (#link) are
valid would be a good start... pity that Mechanize modules don't
have very good documetation.
3.) What invariants you want to test for your caching efforts, e.g.
checking (not necessary with TWM::CGI) if cached output matches
non-cached, with exception of marking output as cached?
--
Jakub Narebski
Poland
--
I'd suggest we leave it in a single file until test execution time
becomes an issue. Then (when it has become too large) we'll be able to
Yup; completeness of item lists is especially relevant for paginated
output. Also check for the presence and validity of links (like
"parent" links, etc.), and for the presence of certain elements (like
the file modes in the tree view).
Also, with a $ENV{LONG_GIT_TEST} variable or so, we could automatically
validate all links for each page we're checking -- it takes a long time,
but it's still way more efficient than exhaustive spidering of the whole
I can't find anything on Google right now, but piping them into external
validators might be just as fine. Also, since those formats are
generated using print statements (which is really error-prone for XML
How about this:
1. Run the Mechanize tests (and possibly also the existing t9500 tests)
*without* caching, recording the URL's and contents of all pages the
test suite accesses.
2. Get all those URL's again *with* caching (from a cold cache), and
assert that the output is identical.
3. Get all those URL's again *with* caching (from a warm cache), and
assert that the output is identical. Perhaps also assert that no call
to the git binary is made (i.e. everything has actually been cached).
(Of course we might need options for the production site to not cache
certain things, but let's defer this discussion.)
-- Lea
--
I wanted to split tests mainly not because of performance, but because of making it easier to maintain. Although perhaps single driver-test, We can use Test::XML / Test::XML::Valid / Test::XML::Simple for being well-formed XML. If RSS / Atom / OPML have good DTD / XML Schema / / Relax-NG schema / Sablotron rules, they could be checked using that Well, it might be identical, but it also might have "cached output" Or at least (if we don't cache everything, and that could be good idea) to check if there are less git binary calls. -- Jakub Narebski Poland --
We can make that deactivatable so that the output is actually identical. -- Lea --
Good idea. Or we can just do diff and check the difference. BTW. another thing worth considering: generating "Generating page..." or something like that like kernel.org's gitweb does when it has to regenerate cache. Nice feature, that is. -- Jakub Narebski Poland --
From: Jakub Narebski <jnareb@gmail.com>
This test uses Test::WWW::Mechanize::CGI to check gitweb's output. It
also uses HTML::Lint (if present) to validate the HTML.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Lea Wiemann <LeWiemann@gmail.com>
---
I haven't gotten around to merging Jakub's recent XML validation patch
yet, but I think it'd be good to have some review now; I'll merge it
tomorrow and send another patch.
Changes since v2:
t9503-gitweb-Mechanize.sh has stayed the same, but t9503/test.pl has
been overhauled: extracted common code into helper functions, only
validate if --long-tests if given, added link-checking with
--long-tests, added some minor tests for page contents, and added
tests for the summary page at the bottom.
This runs on top of the *next* branch plus the following patches:
[PATCH] gitweb: respect $GITPERLLIB
http://article.gmane.org/gmane.comp.version-control.git/85586/raw
[PATCH 1/2 v3] t/test-lib.sh: add test_external and test_external_without_stderr
http://article.gmane.org/gmane.comp.version-control.git/85504/raw
[PATCH v3] gitweb: standarize HTTP status codes
http://article.gmane.org/gmane.comp.version-control.git/85522/raw
(The latter two patches might be in next already by the time you're
reading this.)
t/t9503-gitweb-Mechanize.sh | 128 +++++++++++++++++++++++++++++++++++++++++++
t/t9503/test.pl | 128 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 256 insertions(+), 0 deletions(-)
create mode 100755 t/t9503-gitweb-Mechanize.sh
create mode 100755 t/t9503/test.pl
diff --git a/t/t9503-gitweb-Mechanize.sh b/t/t9503-gitweb-Mechanize.sh
new file mode 100755
index 0000000..abcf987
--- /dev/null
+++ b/t/t9503-gitweb-Mechanize.sh
@@ -0,0 +1,128 @@
+#!/bin/sh
+#
+# Copyright (c) 2008 Jakub Narebski
+# Copyright (c) 2008 Lea Wiemann
+#
+
+test_description='gitweb as CGI (using WWW::Mechanize)
+
+This test uses Perl module Test::WWW::Mechanize::CGI to
+check ...If you have rewritten the patch I won't mind if you change authorship First, it is not XML validation[*1*], but check if XML is well-formed. Second, I think it would be better if adding XML checks (RSS, Atom, OPML) would be left as separate commit. [*1*] There is W3C feed validation service, http://validator.w3.org/feed/ which provides also standalone Python script for feed validation, and also SOAP interface with WebService::Validator::Feed::W3C wrapper (but it would require having Internet connection), or we can try validate against XML Schema Definition (there exists one for RSS, and there is [cut] I have created this part as a copy of older t9500 gitweb test, thinking about what we might want to test, but the WIP of Mechanize based t9503 Actually it is not that we cannot could properly when skipping, because there are two ways to have skipped tests and test count upfront, both implemented in my initial version (v1) of t/t9503/test.pl. You can set number of tests upfront depending if some feature is available, e.g. "plan tests => <n> - $use_lint*<m2> - $check_xml*<m2>" or just "if ($use_lint) { plan tests => <m> } else ...". Or you can explicitly skip tests using SKIP: BLOCK (see Test::More(3pm)). What we have here is that we don't know number of tests upfront, because it is complicated, and that's what I'd like to see in the Why do you use shortened SHA1 identifier, instead of full SHA1? Links Another solution would be to copy relevant parts of cgi_application (without all the checks for example), and use $mech->cgi( sub { ... } ); I think we can use 'my' here; gitweb uses 'our' only to be able to run Style: I would write "our (", with space between keyword and opening Style: I would use function call form, i.e. "test_page(...)", not command-like (or script-like) form. As to the rest of the test: I think as preliminary test it is a good thing to have. We can think about what tests to add later (unless you rather have exhaustive test suite ...
Sure; FWIW I'm generally in favor of having a large initial commit for
new independent files (and since we don't need to worry about conflicts
I'm not fervent about getting this into the next branch ASAP), but we
I was thinking about that. Right now the tests are so generic that you
can replace the test repository with anything else as long as it has
We're skipping tests if a page-load fails to prevent a slew of failure,
using constructs like if(test_page '?...') { ... inspect the page ...}.
It's my impression that we shouldn't end up with a wrong test count even
if one test fails; but then we'd have to replace those ifs with
cumbersome skip blocks.
I'm generally not in favor of maintaining any test count plans; they're
an unnecessary failure source, and I don't think they buy us much, if
That's for readability in the test output; links get checked anyway,
don't they? If you think we should be testing against full SHA1s,
ISTR that using cgi(sub ...) gives us problems with untrappable exits in
gitweb.cgi (and possibly more things), right? I'm fine with the
I'll be writing tests as I go and change parts of gitweb. I won't be
able to write exhaustive tests, but I at least want to make sure that
the code I'm changing is covered somehow.
-- Lea
--
I just think that having this separate could help bisectability in Well, I've tried to put the cases where something can go wrong, and to cover wide range of possibilities. Rename, copy, typechange, merge commit for testing *diff views, symlink to test 'tree' view, different types of tags and tagged objects... What could be added is different types of stage output: filenames with '*', '+', '=', ':', '?', whitespace, etc. Checking if submodules While they can detect otherwise unnoticed errors, I think most of time This would reduce number of operations when crawling gitweb output. [about workaround bug in TWM::CGI when path to application contains Actually ->cgi_application(<path>) is implemented using ->cgi(<sub>) in TWM::CGI. The bug is that it uses straight "system($application)", without any quoting, after ensuring that $application is absolute path They work around the fact that we use 'trash directory', but they would fail if you run test from the directory which contains spaces, for example "/home/Lea Wiemann/git/t" (I think this has greater probability happening on operating systems other than Linux, for Actually 'my' would work here too; the problem with gitweb being forced to use 'our' to make it work with mod_perl isn't about the fact that they are global variables, but with initializing them, Write test when we notice something breaking seems a common theme in git development... ;-) -- Jakub Narebski Poland --
Hm... I wouldn't think that bisect could be necessary for a long linear Yup, filenames with ampersands and semicolons would be fun, too. OK, now I see what you mean. Well, awesome, borked lirbareis. I've put Fixed, and thanks for the offline explanations. -- Lea --
This test uses Test::WWW::Mechanize::CGI to check gitweb's output. It also uses HTML::Lint (if present) to validate the HTML. TOOD: Make this runnable even when the path to the git tree (i.e. your working copy) contains blanks. http://mid.gmane.org/200806202003.55919.jnareb@gmail.com Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lea Wiemann <LeWiemann@gmail.com> --- This for everyone's reference so you have my most recent version to base changes on. Changed since v3: applied Jakub's suggestions, and added TODO note to commit message so this doesn't get applied accidentally. t/t9503-gitweb-Mechanize.sh | 128 +++++++++++++++++++++++++++++++++++++++++++ t/t9503/test.pl | 128 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 256 insertions(+), 0 deletions(-) create mode 100755 t/t9503-gitweb-Mechanize.sh create mode 100755 t/t9503/test.pl diff --git a/t/t9503-gitweb-Mechanize.sh b/t/t9503-gitweb-Mechanize.sh new file mode 100755 index 0000000..abcf987 --- /dev/null +++ b/t/t9503-gitweb-Mechanize.sh @@ -0,0 +1,128 @@ +#!/bin/sh +# +# Copyright (c) 2008 Jakub Narebski +# Copyright (c) 2008 Lea Wiemann +# + +test_description='gitweb as CGI (using WWW::Mechanize) + +This test uses Perl module Test::WWW::Mechanize::CGI to +check gitweb output, using HTML::Lint to validate HTML.' + +# helper functions + +safe_chmod () { + chmod "$1" "$2" && + if [ "$(git config --get core.filemode)" = false ] + then + git update-index --chmod="$1" "$2" + fi +} + +. ./test-lib.sh + +# check if test can be run +perl -MEncode -e 'decode_utf8("", Encode::FB_CROAK)' >/dev/null 2>&1 || { + test_expect_success \ + 'skipping gitweb tests, perl version is too old' : + test_done + exit +} + +perl -MTest::WWW::Mechanize::CGI -e '' >/dev/null 2>&1 || { + test_expect_success \ + 'skipping gitweb tests, Test::WWW::Mechanize::CGI not found' : + test_done + exit +} + +# set up test repository +test_expect_success 'set up test ...
This test uses Test::WWW::Mechanize::CGI to check gitweb's output. It also uses HTML::Lint, XML::Parser, and Archive::Tar (if present, each) to validate the HTML/XML/tgz output, and checks all links on the tested pages if --long-tests is given. Also add a GITPERL environment variable that allows running Perl-based tests with different perl binaries (and thus under different versions). Signed-off-by: Lea Wiemann <LeWiemann@gmail.com> Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- Windows users: Please apply this patch and run the test, I'd love to know if it works on Windows! This patch applies against the *next* branch (without any other patches, finally). Note that in order to actually run the test you need the Test::WWW::Mechanize::CGI Perl module. Changes since v4: - Add a $GITPERL variable so you can choose the perl binary and run the test suite with different Perl versions. Note that this affects test-lib.sh. - Add a workaroud for WWW::Mechanize::CGI so that the test suite also works if the path to the Git tree contains blanks (which could happen on Windows, for instance). - Integrated all of Jakub's "[RFC/PATCH (WIP)] gitweb: Check that RSS, Atom and OPML output is well formed XML", which uses XML::Parser. - Also added snapshot validation using Archive::Tar. - Spidering (with --long-tests) is now much more efficient, in that it doesn't check any page more than once. - Added checks to ensure that fragments have corresponding name/id attributes in the page. - Added plenty more actual tests: check more elements on the summary page, and also test the commit, commitdiff, tree, blame, history, blob, and blob_plain views. Some of these tests don't do anything more than downloading a few pages with the given action (without checking any of the content), but this already helps a lot since it checks that gitweb doesn't die, that the HTML is valid, and (if --long-tests is given) that all links work. I've tested the ...
This test uses Test::WWW::Mechanize::CGI to check gitweb's output. It
also uses HTML::Lint, XML::Parser, and Archive::Tar (if present, each)
to validate the HTML/XML/tgz output, and checks all links on the
tested pages if --long-tests is given.
Also add a GITPERL environment variable that allows running Perl-based
tests with different perl binaries (and thus under different
versions).
Signed-off-by: Lea Wiemann <LeWiemann@gmail.com>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
Changed since v5: honor $ENV{GITPERL} for gitweb.cgi execution, not
just test execution:
diff --git a/t/t9503/test.pl b/t/t9503/test.pl
index 28894c5..947e2e8 100755
--- a/t/t9503/test.pl
+++ b/t/t9503/test.pl
@@ -54,5 +54,5 @@ my $gitweb = abs_path(File::Spec->catfile('..', '..', 'gitweb', 'gitweb.cgi'));
my $cgi = sub {
# Use exec, not the shell, to support blanks in the path.
- my $status = system $gitweb $gitweb;
+ my $status = system { $ENV{GITPERL} } $ENV{GITPERL}, $gitweb;
my $value = $status >> 8;
t/t9503-gitweb-Mechanize.sh | 132 ++++++++++++++++
t/t9503/test.pl | 354 +++++++++++++++++++++++++++++++++++++++++++
t/test-lib.sh | 2 +
3 files changed, 488 insertions(+), 0 deletions(-)
create mode 100755 t/t9503-gitweb-Mechanize.sh
create mode 100755 t/t9503/test.pl
diff --git a/t/t9503-gitweb-Mechanize.sh b/t/t9503-gitweb-Mechanize.sh
new file mode 100755
index 0000000..3fe6d8b
--- /dev/null
+++ b/t/t9503-gitweb-Mechanize.sh
@@ -0,0 +1,132 @@
+#!/bin/sh
+#
+# Copyright (c) 2008 Jakub Narebski
+# Copyright (c) 2008 Lea Wiemann
+#
+
+# This test supports the --long-tests option.
+
+# This test only runs on Perl 5.8 and later versions, since
+# Test::WWW::Mechanize::CGI requires Perl 5.8.
+
+test_description='gitweb tests (using WWW::Mechanize)
+
+This test uses Test::WWW::Mechanize::CGI to test gitweb.'
+
+# helper functions
+
+safe_chmod () {
+ chmod "$1" "$2" ...I'd rather either skip this comment all together, or change it to read something like that (I don't have good solution): +# Counting tests upfront is difficult, as number of tests depends +# on existence of several Perl modules, and is next to impossible +# when spidering gitweb output (for --long-test). Additionally, +# having number of tests planned stated at beginning is not necessary, +# as this test is to be run from git test suite, and not to be +# processed further by TAP (Test Anything Protocol) Perl modules. See http://www.perlfoundation.org/perl5/index.cgi?testing for explanation why (and when) 'plan' is useful: TAP output usually starts with a plan line: 1..9 which specifies how many tests are to follow. The above example specifies 9 tests. This line is optional, but recommended. Should a test suite die part-way through, the plan allows the testing framework to recognise Here also it could be "my $long_tests"; but I am not a Perl hacker, Wouldn't be easier to use '--format=%(refname)' in git-for-each-ref invocation? Besides, gitweb now uses in link _full_ refname, i.e. "refs/heads/<name>" and "refs/tags/<name>" to avoid branch / tag Errr... I think that without my comment you removed it is hard to understand what this comment talks about. "Thus ..." without any blanks = embedded whitespace? Could you tell us here what are the limitations? Are those limits This I think should be written using Test::More equivalent of test_expect_failure in t/test-lib.sh, namely TODO block, either as TODO: { local $TODO = "why"; ...normal testing code... } or if it causes problem with t9503-gitweb-Mechanize.sh failing, perhaps as TODO: { todo_skip "why", <n> if <condition>; ...normal testing code... } How it is different from PERL_PATH? -- Jakub Narebski ShadeHawk on #git Thorn, Poland --
[Open issue about PERL_PATH at the bottom!]
I'll just remove it. Anybody trying to add a test count will give up
This one is to make "local $long_tests = 0" work -- it wouldn't work
*checks* No, since I want to strip the leading refs/heads/ anyway (in
order to test the page text, which mentions heads and tags without the
directory prefix) -- hence the 'split' wouldn't go away with
That's a tyop: s/Thus/This/ (fixed). It refers to the following "my
Yup. Improved comment, and simplified system call:
- # Use exec, not the shell, to support blanks in the path.
- my $status = system { $ENV{GITPERL} } $ENV{GITPERL}, $gitweb;
+ # Use exec, not the shell, to support embedded whitespace in
+ # the path to gitweb.cgi.
No idea. I tried it and it told me that the URL was too long -- I
suspect it's the W3C server that rejected it. I wouldn't spend more
effort trying to get online validation to work; it's probably easier to
just occasionally validate manually. ;) XML well-formedness tests
Right -- here's my new version (which still fails if the feeds die or
are not well-formed XML -- I'll want that when I change gitweb):
# RSS/Atom/OPML view
# Simply retrieve and verify well-formedness, but don't spider.
$mech->get_ok('?p=.git;a=atom', 'Atom feed') and _verify_page;
$mech->get_ok('?p=.git;a=rss', 'RSS feed') and _verify_page;
TODO: {
# Now spider -- but there are broken links.
# http://mid.gmane.org/485EB333.5070108@gmail.com
local $TODO = "fix broken links in Atom/RSS feeds";
test_page('?p=.git;a=atom', 'Atom feed');
test_page('?p=.git;a=rss', 'RSS feed');
}
Right, I didn't think of that. PERL_PATH isn't available in the tests
though, it's only used internally by the Makefile to generate (among
other things) gitweb.cgi. This means that while we can control under
which Perl version gitweb.cgi runs, we cannot control under which Perl
version the test suite runs (at least without $PATH trickery). Does
this bother us?
If yes, I'd ...I just think that not having comment is better than have comment which is not fully true, or five lines of explanations which could be expanded further. Simply: planning (coming with number of tests upfront) is hard, and for Hmmmm... Siednote: some time ago I though that I understood difference between It makes sense now, thanks. Although I guess it would be nice to have link to the ticket to the bug (report) in WWW::Mechanize::CGI, so we could use cgi_application() ...if not for the trick with explicitly calling Perl (GITPERL), to Right. Well, W3C feed validator provides SOAP interface we could use, using POST of course and not GET (but it requires net connection), and also That's what I was asking about. -- Jakub Narebski Poland --
Sure, I've added a link to http://rt.cpan.org/Ticket/Display.html?id=36654 for completeness, though we won't be able to use it if it gets fixed, since we can't break compatibility to older versions. Btw, you may want to add a reply to the bug report (if that's possible -- haven't used CPAN's ticket system before) that "system $application $application" would be a better No, my point was that it will fail *in case* it breaks (which is good). Putting everything into a TODO block would silently ignore issues with Breaking the indecisiveness, I've ripped it out. ;-) We can still test gitweb under different Perl versions by make'ing it with different PERL_PATHs, so I think it'll be fine. -- Lea --
This test uses Test::WWW::Mechanize::CGI to check gitweb's output. It
also uses HTML::Lint, XML::Parser, and Archive::Tar (if present, each)
to validate the HTML/XML/tgz output, and checks all links on the
tested pages if --long-tests is given.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Lea Wiemann <LeWiemann@gmail.com>
---
Changes since v6:
- Removed GITPERL again.
- Added some basic tests for the page contents of blob and blob_plain
views.
- Incorporated Jakub's suggestions (see parent posts); in particular
blame and feed views *do* get spidered now in --long-tests mode, but
the spider tests are marked as TODO and thus don't cause the test
suite to fail.
- Some minor (internal) changes: @files only contains blobs, not
trees; improved the logic that avoids checking pages twice; die if
test.pl is run directly rather than from the shell script.
t/t9503-gitweb-Mechanize.sh | 132 +++++++++++++++
t/t9503/test.pl | 378 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 510 insertions(+), 0 deletions(-)
create mode 100755 t/t9503-gitweb-Mechanize.sh
create mode 100755 t/t9503/test.pl
diff --git a/t/t9503-gitweb-Mechanize.sh b/t/t9503-gitweb-Mechanize.sh
new file mode 100755
index 0000000..6f7168e
--- /dev/null
+++ b/t/t9503-gitweb-Mechanize.sh
@@ -0,0 +1,132 @@
+#!/bin/sh
+#
+# Copyright (c) 2008 Jakub Narebski
+# Copyright (c) 2008 Lea Wiemann
+#
+
+# This test supports the --long-tests option.
+
+# This test only runs on Perl 5.8 and later versions, since
+# Test::WWW::Mechanize::CGI requires Perl 5.8.
+
+test_description='gitweb tests (using WWW::Mechanize)
+
+This test uses Test::WWW::Mechanize::CGI to test gitweb.'
+
+# helper functions
+
+safe_chmod () {
+ chmod "$1" "$2" &&
+ if [ "$(git config --get core.filemode)" = false ]
+ then
+ git update-index --chmod="$1" "$2"
+ fi
+}
+
+. ./test-lib.sh
+
+# check if test can be run
+perl -MEncode -e 'decode_utf8("", ...This test uses Test::WWW::Mechanize::CGI to check gitweb's output. It also uses HTML::Lint, XML::Parser, and Archive::Tar (if present, each) to validate the HTML/XML/tgz output, and checks all links on the tested pages if --long-tests is given. Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lea Wiemann <LeWiemann@gmail.com> --- This patch applies on next. Changes since v7: - In Makefile, dump $PERL_PATH to GIT-BUILD-OPTIONS (which gets source'd by test-lib.sh), and use it in t9710-perl-git-repo.sh. - .git/description in the test repository no longer depends on $0 (this would e.g. cause 'cd t; make' to fail). - Add test_link subroutine and use it everywhere in place of ok(find_link...) so that links whose presence get tested get checked and spidered in --long-tests mode. - Add simple page caching (reduces execution time without --long-tests by 25%). That's *really* helpful when you have to run those tests on a regular basis. ;) (WWW::Mechanize::Cached won't work with TWM::CGI, so we have to implement it ourselves; but it's easier anyway.) - Follow redirects rather than failing. - Test subdirectories in tree view. - Test error handling for non-existent hashes or hashes of wrong type. - Test commitdiff_plain view. - Expand test for history view. - Test tag objects (not just symbolic tags) in tag list. - Test a specific bug (under "diff formatting", marked TODO). - Do not test correctness of line number fragments (#l[0-9]+); they're broken too often right now. - Probably some more minor improvements I've forgotten about. :) Makefile | 1 + t/t9503-gitweb-Mechanize.sh | 138 +++++++++++ t/t9503/test.pl | 553 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 692 insertions(+), 0 deletions(-) create mode 100755 t/t9503-gitweb-Mechanize.sh create mode 100755 t/t9503/test.pl diff --git a/Makefile b/Makefile index a129491..3bf4f14 100644 --- a/Makefile +++ ...
This test uses Test::WWW::Mechanize::CGI to check gitweb's output. It also uses HTML::Lint, XML::Parser, and Archive::Tar (if present, each) to validate the HTML/XML/tgz output, and checks all links on the tested pages if --long-tests is given. Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lea Wiemann <LeWiemann@gmail.com> --- [Resent with correct subject.] This patch applies on next. Changes since v7: - In Makefile, dump $PERL_PATH to GIT-BUILD-OPTIONS (which gets source'd by test-lib.sh), and use it in t9710-perl-git-repo.sh. - .git/description in the test repository no longer depends on $0 (this would e.g. cause 'cd t; make' to fail). - Add test_link subroutine and use it everywhere in place of ok(find_link...) so that links whose presence get tested get checked and spidered in --long-tests mode. - Add simple page caching (reduces execution time without --long-tests by 25%). That's *really* helpful when you have to run those tests on a regular basis. ;) (WWW::Mechanize::Cached won't work with TWM::CGI, so we have to implement it ourselves; but it's easier anyway.) - Follow redirects rather than failing. - Test subdirectories in tree view. - Test error handling for non-existent hashes or hashes of wrong type. - Test commitdiff_plain view. - Expand test for history view. - Test tag objects (not just symbolic tags) in tag list. - Test a specific bug (under "diff formatting", marked TODO). - Do not test correctness of line number fragments (#l[0-9]+); they're broken too often right now. - Probably some more minor improvements I've forgotten about. :) Makefile | 1 + t/t9503-gitweb-Mechanize.sh | 138 +++++++++++ t/t9503/test.pl | 553 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 692 insertions(+), 0 deletions(-) create mode 100755 t/t9503-gitweb-Mechanize.sh create mode 100755 t/t9503/test.pl diff --git a/Makefile b/Makefile index ...
Wouldn't it be better to use "(each if present)"? I am not a native English speaker, though. I'd rather use more descriptive name, so when you look it up in gitweb to manually (visually?) checks its output you would know of which test is it. Something like "gitweb Mechanize test repository", or "test What do you need caching for? You check if page was already accessed when spidering... If it is about checking links, alternate solution would be to replace simple $mech->page_links_ok( [ $desc ] ) by finding all the links either using $mech->followable_links() or $mech->find_all_links( ... ), or just $mech->links, then filtering out links which you have checked Why this package is not named WWW::Mechanize::CGI::Cached, I wonder? By the way, if you want to add a comment to mentioned WM::CGI ticket http://rt.cpan.org/Ticket/Display.html?id=36654 you have to either register, or send comment via mail with the following info From "Bugs in WWW-Mechanize-CGI via RT" <bug-WWW-Mechanize-CGI@rt.cpan.org>: | | Please include the string: | | [rt.cpan.org #36654] | | in the subject line of all future correspondence about this issue. To do so, More tests are always nice to have, although I begin to wonder if I'll try to investigate; I guess this uses wrong name or wrong hash Good work! -- Jakub Narebski Poland --
This is actually about the short tests (I don't think it gains much for spidering, percentage-wise). The test suite uses some repetitive set-up code (like get_summary && follow_link 'tree' && ...), so the second and Yeah, basically. :) Plus, it's not to be confused with a proper implementation of a TWM::CGI::Cached package. (For instance, it uses the URL rather than the complete request as cache key. Hence it ignores headers like Referer; but that's great for the Gitweb tests since TWM apparently sets the Referer, but Gitweb doesn't check it, so there's no test_page doesn't use $mech->get_ok anymore, but rather calls $mech->get and checks that the status is [23][0-9][0-9]. If it's 3xx, it also It's not yet painful (or long-running) enough for me to care. :) I'm This is only visible with --long-tests -- there are links to line numbers that don't exist (IOW the fragment doesn't exist in a name or id attribute on the target page). I've even seen links to #l0. Bug fixes welcome. ;-) (There are also some other [mostly minor] issues marked with "TODO:" in the test code; I didn't want to swamp the list with half a dozen bug Thanks! :-) I'll send v9 in (hopefully) 2-3 days, together with the first working version of the caching code. -- Lea --
Thanks for the info. By the way, some time ago I have send a patch (dropped, perhaps because of it being feature freeze, or just lost) which converted support for links with hash and without action (introduced in 7f9778b by Gerrit Pape) to use redirect (like for 'object' action) instead of silently filling correct action based on type of object (given by hash). IMHO it is better as it should prevent bookmarking "expensive" URL. I wonder if those are intentional (or at least known) breaking, to form approximate blame file history browsing; there was discussion about it on git mailing list some time ago (it resulted in adding "parent" line One issue of note, after brief peek at Git::Repo code: you should not error out on unknown header in commit object, but either save its value under its name, or just skip it. Unless this has changed... -- Jakub Narebski Poland --
Haha, I'd actually suggest the opposite. ;-) Figuring out the right action is almost free since you have to fetch the object anyways, so I doubt it'll make any difference performance-wise (though it'd be interesting to benchmark this). However, gitweb's URLs are prohibitively long -- so that nobody uses them in email --, and (automatically?) dropping the action parameter where possible would be a good first step to shortening them. Another idea would be to shorten *nods* I wasn't following this in detail; if it turns out to be unfixable, we could also remove the fragment checks for line numbers Unless this can actually happen in practice, I'd rather die aggressively -- it prevents errors (like cache hiccups) slipping through unnoticed. --
It is one fork more[*1*], which doesn't matter on operating systems like Linux where forking is fairly cheap. But perhaps avoiding redirect would be better solution... well, yet another solution would be to leave it up to gitweb configuration ;-) [*1*] Or pre-fetching object (together with its type) which might be About shortening the hashes: _usually_ 8. characters is enough... but sometimes it isn't. What then? If Git::Commit is to be *generic* object oriented interface to git repository (and not only for gitweb), it has either to skip unknown commit headers, or save them as-is. As to checking cache: you can always check if header name is sane... -- Jakub Narebski Poland --
That sounds wrong, as the point of tests would be to make sure the stuff you are going to install would work with what you thought will be used from the system. If "isn't available in the tests" is the problem, is it possible to make it available? We are passing down SHELL_PATH from primary Makefile to t/ and you should be able to do the same for Perl path... About the Test::WWW:Mechanize::CGI thing, how widely available is it? I do not think it is packaged for Debian nor Ubuntu, for example. --
Not very, you basically have to install it from CPAN. If it's not
installed, the only message you get from the test is:
ok 1: skipping gitweb tests, Test::WWW::Mechanize::CGI not found
Should optional test dependencies like Test::WWW::Mechnanize::CGI be
documented in INSTALL?
Best,
Lea
--
Not very widely; practically you have to install it from CPAN. Test::WWW::Mechanize::CGI is at v0.1, WWW::Mechanize::CGI is at v0.3. But if Test::WWW::Mechanize::CGI is not installed, test would be not run. If you are gitweb developer, then installing locally in $HOME from CPAN is I guess viable option; if you are not gitweb developer, we still have t/t9500-gitweb-standalone-no-errors.sh The whole point of using those packages was that it makes it _easy_ to write those tests. This consist of two parts: 1). running gitweb as if it was CGI application, which otherwise would either require deep knowledge of how CGI application is invoked (what it is in WWW::Mechanize::CGI and required dependence HTTP::Request::AsCGI which I think does all the work) or working web server, done a la "git instaweb", 2.) accessing and testing gitweb output (what Test::WWW::Mechanize, WWW::Mechanize does; we could do the same with LWP* modules from libwww-perl / perl-libwww-perl package and Test::More from Perl, but it would be repeating Test::WWW::Mechanize work). -- Jakub Narebski Poland --
