# Package to work around problems in HTTP headers
# Note: This is just a utility module; it should not be instantiated.
# Copyright 2003 Katipo Communications
#
# This file is part of Koha.
#
# Koha is free software; you can redistribute it and/or modify it under the
# terms of the GNU General Public License as published by the Free Software
# Foundation; either version 2 of the License, or (at your option) any later
# version.
#
# Koha is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
# A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along with
# Koha; if not, write to the Free Software Foundation, Inc., 59 Temple Place,
# Suite 330, Boston, MA 02111-1307 USA
use vars qw($VERSION @ISA @EXPORT);
# set the version for version checking
C4::Charset - Functions for handling charsets in HTML pages

  print $query->header(-type => C4::Charset::gettype($output)), $output;

The functions in this module peek into a piece of HTML and return strings
related to the (guessed) charset.
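
The usage line above relies on a CGI query object. As a rough illustration of the same idea without that dependency, the sketch below guesses the charset and writes the Content-Type header by hand. C<sketch_gettype> is a hypothetical, simplified stand-in written for this example, not the module's own code, and its pattern is looser than the one the module uses.

```perl
use strict;
use warnings;

# Hypothetical stand-in for C4::Charset::gettype, simplified for illustration:
# it looks for charset=... anywhere inside a <meta ...> tag.
sub sketch_gettype {
    my ($html) = @_;
    my ($charset) = $html =~ /<meta\b[^>]*\bcharset=["']?([^"'\s>]+)/i;
    return defined $charset ? "text/html; charset=$charset" : "text/html";
}

my $output = <<'HTML';
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head><body>Hello</body></html>
HTML

# Emit the HTTP header by hand, followed by the page itself.
print "Content-Type: ", sketch_gettype($output), "\r\n\r\n", $output;
```

When no charset can be found, the sketch falls back to a bare C<text/html> type, mirroring the fallback described for this module.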
  &guesscharset($output)

"Guesses" the charset from some HTML that is about to be output.

C<$output> is the HTML page to be output. If it contains a META tag
with a Content-Type, the tag is scanned for a charset name.
That name is returned if it is found; undef is returned otherwise.

This function only does sloppy guessing; it will be confused by
unexpected things like SGML comments. Basically, it grabs
something that looks like a META tag and scans it.
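
To make the "sloppy guessing" caveat concrete, here is a small sketch that runs three inputs through a pattern adapted from the first regular expression in C<guesscharset>. C<sketch_guesscharset> is a hypothetical helper for this example only; as an assumption of the sketch, the charset value is matched as "anything but quotes or whitespace", which is not exactly what the module's pattern does.

```perl
use strict;
use warnings;

# Hypothetical helper adapted from the first pattern in guesscharset;
# only the http-equiv-before-content attribute order is handled here.
sub sketch_guesscharset {
    my ($html) = @_;
    if ($html =~ /<meta\s+http-equiv=(["']?)Content-Type\1\s+content=(["'])text\/html\s*;\s*charset=([^"'\s]+)\2\s*\/?>/is) {
        return $3;
    }
    return undef;
}

my $plain     = '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">';
my $commented = "<!-- $plain -->";          # same tag, but commented out
my $html5     = '<meta charset="utf-8">';   # a form the pattern does not know

for my $case ([plain => $plain], [commented => $commented], [html5 => $html5]) {
    my ($name, $html) = @$case;
    my $charset = sketch_guesscharset($html);
    printf "%-10s %s\n", $name, defined $charset ? $charset : '(undef)';
}
```

The commented-out tag is still reported as if it were live, because the pattern knows nothing about comments, while the short C<meta charset> form is missed entirely and yields undef.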
sub guesscharset ($) {
    my ($html) = @_;
    local($`, $&, $', $1, $2, $3);
    # FIXME... These regular expressions will miss a lot of valid tags!
    if ($html =~ /<meta\s+http-equiv=(["']?)Content-Type\1\s+content=(["'])text\/html\s*;\s*charset=([^\2\s\r\n]+)\2\s*(?:\/?)>/is) {
        return $3;    # charset is the third capture when http-equiv comes first
    } elsif ($html =~ /<meta\s+content=(["'])text\/html\s*;\s*charset=([^\1\s\r\n]+)\1\s+http-equiv=(["']?)Content-Type\3\s*(?:\/?)>/is) {
        return $2;    # charset is the second capture when content comes first
    }
    return undef;     # no recognizable META tag
}

sub gettype ($) {
    my ($html) = @_;
    my $charset = guesscharset($html);
    return defined $charset? "text/html; charset=$charset": "text/html";
}
#---------------------------------
END { }       # module clean-up code here (global destructor)
Koha Development team <info@koha.org>