After porting a site from Windows hosting to Linux hosting, I ran into the case-sensitivity problem. I solved it with this script, which automatically and recursively converts all the tags in all the pages to lowercase.
Of course, I also had to convert all directory and file names to lowercase. For that, however, I used a script I found on the net (http://www.ibiblio.org/pub/Linux/utils/file/lowercase-1.0.tar.gz).
#!/usr/bin/perl
#
# lowerize.pl
# v1.0alfa
#
# Author: aeniGma a.k.a Eremita Solitario
# WEB : http://www.thekey.it
#
# A little script that recursively converts to lowercase every HTML tag found
# in a directory tree
#
# I wrote this after porting a web site from Windows hosting to Linux hosting,
# to work around the fact that IIS is not case-sensitive.
#
# Thanks to:
# This script is a merge of other pieces of script found on the net as examples.
# I have only put them together and mixed them up to obtain this single script,
# so my thanks go to everyone on the net who wrote those pieces of code.
# Sorry, but I lost the links... :-(
#
################################################################################
use strict;
use warnings;
################################################################################
# Main routine
################################################################################
# Do the dirty work recursively from the directory specified on the command line
if (defined $ARGV[0] && length $ARGV[0]) {
    process_recursively($ARGV[0]);
} else {
    print STDERR "No starting point supplied!...\n";
    exit(1);
}
#
print "All done... have a good day!\n";
# When we get here we have finished
exit;
################################################################################
################################################################################
# Some supporting routines
################################################################################
# process_recursively
#
# Walk every directory recursively and, whenever an HTML file is found, call
# the lowerize_content routine that does the lowercasing
sub process_recursively {
    my ($directory) = @_;
    # Open the directory (stop with an error status if it cannot be read)
    opendir(my $dh, $directory)
        or die "Cannot open directory \"$directory\": $!\n";
    # Read the directory, ignoring special entries "." and ".."
    my @files = grep { !/^\.\.?$/ } readdir($dh);
    # Close the directory
    closedir($dh);
    # Now, for every entry read, check whether it is a directory or a file...
    foreach my $file (@files) {
        my $path = "$directory/$file";
        # Check if it is a file
        if (-f $path) {
            # Select only html files (.htm or .html)
            if ($file =~ /\.html?$/i) {
                print "Found file : $path\n";
                # Now lowercase all HTML tags in the file
                lowerize_content($path);
            } else {
                print "Skipped : $path\n";
            }
        # Check if it is a subdirectory
        } elsif (-d $path) {
            # Recursive call to myself...
            process_recursively($path);
        }
    }
}
# lowerize_content
#
# This subroutine does what its name says: it converts all HTML tags to
# lower case
sub lowerize_content {
    my ($file) = @_;
    print "Processing file : " . $file;
    # Open file for reading
    open(my $in, '<', $file)
        or die "\nCannot read \"$file\": $!\n";
    # Open temp file for writing
    open(my $out, '>', "$file.low")
        or die "\nCannot write \"$file.low\": $!\n";
    # Do the lowering, one line at a time...
    while (my $line = <$in>) {
        print ".";
        # Get every TAG and lowercase it (attributes and their values included)
        $line =~ s/<([^>]+)>/<\L$1\E>/g;
        # Write to temp file
        print {$out} $line;
    }
    print "\n";
    # Close the 2 files
    close($out) or die "Cannot close \"$file.low\": $!\n";
    close($in);
    # Exchange temp with source
    # rename() is a Perl built-in, so no external mv command is needed
    rename("$file.low", $file)
        or die "Cannot replace \"$file\": $!\n";
}
################################################################################
# EOF
################################################################################
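For the companion step mentioned at the top (lowercasing directory and file names), the original post relies on the external lowercase-1.0 package. The following is only a minimal sketch of the same idea in Perl, not that package's code: `lowercase_names` is a hypothetical helper, and it assumes a case-sensitive filesystem such as the one on the Linux host.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper (not part of lowerize.pl or of the lowercase-1.0 package):
# rename every file and directory below $root to its lowercase form.
sub lowercase_names {
    my ($root) = @_;
    # finddepth visits the contents of a directory before the directory itself,
    # so entries are renamed while the path leading to them is still valid.
    finddepth(sub {
        my $old = $_;
        return if $old eq '.' || $old eq '..';
        my $new = lc $old;
        return if $new eq $old;
        # Never overwrite an existing entry that already has the lowercase name
        if (-e $new) {
            warn "Skipped $File::Find::name: $new already exists\n";
            return;
        }
        rename($old, $new)
            or warn "Cannot rename $File::Find::name: $!\n";
    }, $root);
}

# Usage: perl lowercase_names.pl /path/to/site
lowercase_names($ARGV[0]) if @ARGV;
```

The starting directory itself is left untouched; only what lies below it is renamed. Running this before lowerize.pl keeps the links (which the tag substitution also lowercases) consistent with the names on disk.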
Never mind
Right, we need to fix this in the live discussion posts as well...
Problems with posting
There are indeed problems when you use characters that HTML interprets (typically < and >).
We know we need to fix the problem once and for all (and it would not even take long), but we have not done it yet.
The workaround is to use HTML entities in place of the characters: &gt; in place of > and &lt; in place of <.
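As an illustration of that workaround, code can be escaped before being pasted into a post. This is a minimal sketch; `escape_html` is a hypothetical helper, not something provided by the site.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters HTML treats specially so that
# code pasted into a post is displayed literally.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;   # must come first, or the entities below get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

print escape_html('while (<REA>) {'), "\n";
# prints: while (&lt;REA&gt;) {
```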
Correction
I have corrected an error, which I believe was caused by the parser during publication, in the REGEXP that converts all the tags to lowercase. If you took the script before today, 13 February 2004, you will most likely have problems: copy it again.
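To check whether the copy you have contains the intact REGEXP, the substitution can be tried on a sample line. This is a small sketch; `lower_tags` is a hypothetical wrapper used only for the check.

```perl
use strict;
use warnings;

# The substitution from the script, applied to a single line.
# lower_tags is a hypothetical wrapper used only for this check.
sub lower_tags {
    my ($line) = @_;
    # \L ... \E lowercases everything captured between < and >
    $line =~ s/<([^>]+)>/<\L$1\E>/g;
    return $line;
}

print lower_tags('<A HREF="Index.HTM">Home Page</A>'), "\n";
# prints: <a href="index.htm">Home Page</a>
```

Note that everything between < and > is lowercased, attribute values included. That suits links to files that have been renamed to lowercase, but it also changes values such as ALT text. Because the pattern only matches within the string it is given, a tag split across two lines is left as it is when the file is processed line by line.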