12 May 2002
Local Installation of W3C Validator on Windows ME
Hallelujah! Switching to Apache has also solved another nagging problem for me - installing the W3C validator locally on Windows ME. There's a fair bit of faffing around to do - mostly installing and configuring, if your system isn't already capable of running Perl CGIs - but you only need to do it once. I'm assuming here that you're installing onto a standalone PC, though ideally you'd want it on a networked server. Bear in mind that the W3C installation really assumes that it will be running on a *nix system, so we need to make suitable changes to get the CGI and binary running under DOS. Things like spaces in directory names will cause problems - use the shortened DOS form with the ~ syntax instead (e.g. PROGRA~1 for 'Program Files'). No doubt I'll have accidentally missed out something vital in these instructions, but here's the gist of the changes I've made to get it going on WinME.
- Make sure you have a copy of Apache webserver installed. (tests work under Apache/1.3.22 (Win32)). Make sure you choose a port number that doesn't conflict with any existing webservers like PWS/IIS on your system - make a note of the port number e.g. 81
- Make sure you have a copy of ActiveState ActivePerl installed (tests work under 5.6.1).
- If you want to use the check and checklinks scripts you'll need to install modules Text::Iconv and Time::HiRes. To install these start up a MS-DOS Prompt window and assuming your perl install went OK type 'ppm' - this is a program for installing Perl modules. Type 'install Text::Iconv' and wait for the installer to download and install the module then repeat with 'install Time::HiRes'. To quit the PPM program type 'exit'
- We want to set up Windows so that the validator will be associated with your local server. Open the file called HOSTS inside your WINDOWS directory in a text editor. Add a line '127.0.0.1 validator' and save the file.
- Visit the W3C HTML Validation Service: Source Code page and download the source code (the 'Grab a tar ball' link). Find the directory where you installed Apache and create a directory for the validator code - expand the tar ball into this directory.
- We need to adjust the default installation of Apache to
a) make sure that Perl scripts can run,
b) add a virtual server for the validator.
From the Apache directory open up the file 'conf/httpd.conf' in a text editor. In the section where it starts using 'AddHandler', add the line 'AddHandler cgi-script .pl' - this will allow Apache to pass the Perl scripts to ActiveState Perl to run.
Add a line 'NameVirtualHost *'. Then add a virtual server - something similar to
<VirtualHost *>
DocumentRoot "C:\Program Files\Apache Group\Apache\validator\htdocs"
ServerName validator
CustomLog logs/validator_access.log common
ErrorLog logs/validator_error.log
ScriptAlias /cgi-bin "C:\Program Files\Apache Group\Apache\validator\httpd\cgi-bin"
ScriptAlias /check "C:\Program Files\Apache Group\Apache\validator\httpd\cgi-bin\check.pl"
ScriptAlias /checklink "C:\Program Files\Apache Group\Apache\validator\httpd\cgi-bin\checklink.pl"
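</VirtualHost>
We need to allow the virtual server to run Perl scripts, so add the following (the directory should be the same cgi-bin path that your ScriptAlias lines point at)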
<Directory "C:\Program Files\Apache Group\Apache\servers\validator\httpd\cgi-bin">
Options +ExecCGI
SetHandler cgi-script
</Directory>
- Another gotcha I came across during installation is inside the validator's htdocs/images directory: open up the '.htaccess' file. On my server the line in this file causes problems serving images, so I just commented it out thus
#Header set Cache-Control "max-age=604800"
Caveat! I'll need to revisit the Apache configuration to see what else I need to enable for the cache control to work.
- Go to your validator source code directory and go into 'httpd/cgi-bin'. Rename the files check -> check.pl, checklink -> checklink.pl, LinkChecker -> LinkChecker.pl. It's easier for Windows to know that scripts are Perl files if you use the .pl file extension.
- The Perl scripts are just CGI wrappers; the core of the work of validation is carried out by binary executable versions of James Clark's SP parser. Visit his 'How to get SP' page and download the 'binaries for MSDOS'. Expand these into a directory in your validator 'httpd' directory - you should end up with httpd/sp1_3_4 or similar. Go down into the sp1_3_4/bin directory until you find an executable called nsgmls.exe and make a note of the full path on your system to this executable - you will need to add it later to the Perl script.
- The validator needs to be able to write temporary files so create a directory called 'tmp' inside your validator directory, make sure that your webserver process is able to write to this folder.
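If you want to confirm that the directory really is writable before going any further, a stand-alone check along these lines should do it. This is only a sketch: the sub name dir_is_writable, the probe file name and the example path are all made up for illustration, so substitute the tmp directory you actually created.

```perl
use strict;
use warnings;

# Returns 1 if a file can be created (and then removed) inside $dir, else 0.
# dir_is_writable is a hypothetical helper, not part of the validator code.
sub dir_is_writable {
    my ($dir) = @_;
    return 0 unless -d $dir;
    # Probe file name is made up; $$ is the current process id.
    my $probe = $dir . '/probe_' . $$ . '.tmp';
    open(my $fh, '>', $probe) or return 0;
    close $fh;
    unlink $probe;
    return 1;
}

# Hypothetical location - Perl accepts forward slashes in paths on Windows.
my $tmp = 'C:/PROGRA~1/APACHE~1/APACHE/VALIDA~1/tmp';
print dir_is_writable($tmp) ? "writable\n" : "not writable: $tmp\n";
```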
- Open the files cgi-bin/checklink.pl and cgi-bin/check.pl and change the first line to read #!perl -w
We need to make several changes to checklink.pl
Add
use Win32::Process;
- Wherever the script uses *nix file paths these need to be changed to DOS paths, e.g. change
my $base_path = '/usr/local/validator/';
to the DOS equivalent on your system. Go through and change the file paths in the script that use '/' to use '\'.
- Change the paths to the executables to something like
my $sp='C:\PROGRA~1\APACHE~1\APACHE\SERVERS\VALIDA~1\HTTPD\SP1_3_4\BIN\NSGMLS.EXE';
my $osp = $sp;
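A note on quoting, since it's easy to trip over when converting paths: in Perl a single-quoted string keeps each backslash as it is, while a double-quoted string needs every backslash doubled. Here's a minimal sketch showing that the two forms give the same string, plus one way of flipping the slashes in an existing *nix-style path. The path and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical path, used only to illustrate the quoting rules.
my $single = 'C:\PROGRA~1\EXAMPLE\tmp';       # single quotes: backslashes are literal
my $double = "C:\\PROGRA~1\\EXAMPLE\\tmp";    # double quotes: each backslash is doubled

print "both forms are the same string\n" if $single eq $double;

# Convert forward slashes to backslashes in a copy of a *nix-style path.
(my $converted = 'C:/PROGRA~1/EXAMPLE/tmp') =~ tr{/}{\\};
print "$converted\n";
```

Either quoting style is fine as long as you don't mix them up: a single backslash inside double quotes is treated as the start of an escape sequence.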
For convenience I added some extra variables
my $tempdir = "C:\\PROGRA~1\\APACHE~1\\APACHE\\VALIDA~1\\tmp\\";
my $temp ="validate".$$;
my $temphtml=$tempdir.$temp."_html";
my $tempesis = $tempdir.$temp."_esis";
$temp = $tempdir.$temp;
Replace all references in the code to $temp.esis with $tempesis
- In the function erase_stuff add the lines
unlink "$tempesis" or warn "unlink($tempesis) returned: $!\n";
unlink "$temphtml" or warn "unlink($temphtml) returned: $!\n";
- The W3C code relies on 'open CHECKER, "|$command - >$tempesis"' to run the parser, feeding it the document through a pipe and redirecting the output to $tempesis. To get this running in DOS I've added an extra step - write out the HTML to a file, then use the Perl Windows routines to launch a DOS session to run the binary (caveat - I'm sure there are cleaner ways to do this but hey, it works for me). So you'll need to replace the above with a section of code like
open OUTHTML, ">$temphtml" or die "fail write html $temphtml";
for (@{$File->{Content}}) {print OUTHTML $_, "\n"; };
close OUTHTML;
my $ProcessObj;
sub ErrorReport{
print Win32::FormatMessage( Win32::GetLastError() );
}
Win32::Process::Create($ProcessObj,
"C:\\windows\\command.com",
"command.com /c $sp -f$temp -E0 $xmlflags -c $catalog $temphtml >$tempesis",
0,
NORMAL_PRIORITY_CLASS,
".")|| die ErrorReport();
$ProcessObj->Wait(INFINITE);
- Changes in the parse_errors routine: this splits up the lines of output from the SP parser on the ':' character. On DOS you'll find that you end up with at least a couple of entries like 'c:' which knock the logic of this routine out. Caveat! I'll need to revisit my hack to make sure I'm getting the same behavior, but for a quick 'n' dirty fix I just do a search and replace for the drive letter and colon. After the line 'for (<ERRORS>) {' add
$_ =~ s/C://g;
Comment out the following line thus '#next unless $_err[1] eq '<OSFD>0';'
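To see why the drive letter trips up the split, here's a small stand-alone sketch. The sample line and its path names are made up for illustration and may not match exactly what SP prints on your system. The narrower pattern (any drive letter followed by a slash) is just one possible alternative to the plain s/C://g, not what the validator code itself does.

```perl
use strict;
use warnings;

# Made-up error line in the general colon-separated shape described above.
my $line = 'C:\SP\BIN\NSGMLS.EXE:C:\TMP\validate42_html:12:5:E: element "FOO" undefined';

# Splitting the raw line: each 'C:' adds an extra, unwanted field.
my @raw = split /:/, $line;
print scalar(@raw), " fields before stripping the drive letters\n";

# Strip a drive letter plus colon only where a slash follows,
# so the 'E:' severity marker is left alone.
(my $clean = $line) =~ s/\b[A-Za-z]:(?=[\\\/])//g;
my @fields = split /:/, $clean;
print scalar(@fields), " fields after stripping the drive letters\n";
```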
- OK take your courage in both hands, I'd recommend a reboot...just because. Go and have a cup of tea and a kit kat.
- Start up Apache
- Start up your web browser and open the validator on the port you installed Apache on, e.g. http://validator:81/
With a bit of luck you should now see the validator homepage. Tip - it should look like what you see on http://validator.w3.org. If it doesn't then something's not right with the configuration of Apache.
- Test out the validator by feeding a URL into the form. If you get the dreaded 'Internal Server Error' then most likely it's a problem in the check.pl script; you might want to try running 'perl -w check.pl' from a DOS window to narrow down any syntax errors. If you used the virtual server configuration suggested above then you should also have an error log at logs/validator_error.log inside your Apache directory - there should be details of exactly what caused the problem in there.
- OK that's more than enough. You're on your own kiddo!
© Copyright 2002, Scottish Lass, www.scottishlass.co.uk, Contact: