Recursively rewriting HTML tags across a site

After porting a site from Windows hosting to Linux hosting I ran into the "case-sensitive" problem. I solved it with this script, which automatically and recursively converts every tag in every page to lowercase.
Of course I also had to convert all directory and file names to lowercase. For that, however, I used a script I found on the net (http://www.ibiblio.org/pub/Linux/utils/file/lowercase-1.0.tar.gz).
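For readers who would rather stay in Perl for the renaming step too, here is a minimal sketch of the same idea. It is not the lowercase-1.0 script linked above; `lowercase_names` is a hypothetical helper name, and the sketch assumes that skipping an entry is acceptable when its lowercase name is already taken.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# lowercase_names: hypothetical helper, not part of lowerize.pl.
# Renames every file and directory below $root to its lowercase form.
# The starting directory itself is left untouched.
sub lowercase_names {
    my ($root) = @_;

    # finddepth visits a directory's contents before the directory itself,
    # so children are renamed while their parent still has its old name.
    finddepth(sub {
        my $old = $_;              # basename; File::Find has chdir'ed for us
        return if $old eq '.';     # the starting directory
        my $new = lc $old;
        return if $new eq $old;    # already lowercase

        if (-e $new) {
            warn "Skipped (target exists): $File::Find::name\n";
            return;
        }
        rename($old, $new)
            or warn "Cannot rename $File::Find::name: $!\n";
    }, $root);
}

# Run only when a starting directory is given on the command line
lowercase_names($ARGV[0]) if @ARGV;
```

Saved under a name of your choice, it would be run with the site root as its only argument, before or after lowerize.pl.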

#!/usr/bin/perl
#
# lowerize.pl
# v1.0alfa
#
# Author: aeniGma a.k.a Eremita Solitario
# WEB   : http://www.thekey.it
#
# A little script that recursively converts to lowercase every HTML tag found
# in a directory tree
#
# I wrote this after porting an internet site from Win-Hosting to
# Linux-Hosting, to work around the fact that IIS is not case-sensitive.
#
# Thanks to:
# This script is a merge of other pieces of script found on the net as examples.
# I have only put them together and mixed them up to obtain this single script,
# so my thanks go to everyone on the net who wrote those pieces of code.
# Sorry, but I lost the links... :-(
#
################################################################################

use strict;
use warnings;

################################################################################
# Main routine
################################################################################

# Do the dirty work recursively from the directory specified on command line
if (@ARGV) {
    process_recursively($ARGV[0]);
} else {
    print "No starting point supplied!...\n";
    exit(1);
}

print "All done... have a good day!\n";

# When here we have finished
exit;

################################################################################
# Some supporting routines
################################################################################

# process_recursively
#
# Walk every directory recursively and, when an HTML file is found, call
# lowerize_content on it
sub process_recursively {
    my ($directory) = @_;

    # Open the directory
    my $dh;
    unless (opendir($dh, $directory)) {
        print "Cannot open directory \"$directory\"\n";
        exit(1);
    }

    # Read the directory, ignoring special entries "." and ".."
    my @files = grep { !/^\.\.?$/ } readdir($dh);

    # Close the directory before recursing
    closedir($dh);

    # Now for every entry check whether it is a file or a directory
    foreach my $file (@files) {
        if (-f "$directory/$file") {
            # Select only HTML files (.htm or .html)
            if ($file =~ /\.html?$/i) {
                print "Found file : $directory/$file\n";
                # Lowercase all HTML tags in the file
                lowerize_content("$directory/$file");
            } else {
                print "Skipped : $directory/$file\n";
            }
        } elsif (-d "$directory/$file") {
            # Subdirectory: recursive call to myself...
            process_recursively("$directory/$file");
        }
    }
}

# lowerize_content
#
# This subroutine does what its name says: it puts all HTML tags in the
# given file in lowercase
sub lowerize_content {
    my ($file) = @_;

    print "Processing file : $file";

    # Open the source for reading and a temp file for writing
    open(my $rea, '<', $file)
        or die "\nCannot open \"$file\" for reading: $!\n";
    open(my $wri, '>', "$file.low")
        or die "\nCannot open \"$file.low\" for writing: $!\n";

    # Do the lowering, one line at a time
    while (<$rea>) {
        print ".";
        # Lowercase everything between < and >, attribute values included
        s/<([^>]+)>/<\L$1\E>/g;
        # Write to temp file
        print $wri $_;
    }
    print "\n";

    # Close the 2 files
    close($wri) or die "Cannot close \"$file.low\": $!\n";
    close($rea);

    # Exchange temp with source. rename() avoids shelling out to /bin/mv,
    # so file names containing spaces are handled too
    rename("$file.low", $file)
        or die "Cannot replace \"$file\": $!\n";
}

################################################################################
# EOF
################################################################################
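To see what the substitution in lowerize_content does to a single line, here is a small sketch that isolates it. `lowerize_line` is a hypothetical wrapper added only for illustration; the pattern itself is the one used in the script. Note that `\L...\E` lowercases everything captured between `<` and `>`, so attribute values such as link targets are lowercased along with the tag names, while the text between tags is left alone.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# lowerize_line: hypothetical wrapper around the substitution from lowerize.pl
sub lowerize_line {
    my ($line) = @_;
    # $1 is whatever sits between < and >; \L...\E lowercases it
    $line =~ s/<([^>]+)>/<\L$1\E>/g;
    return $line;
}

print lowerize_line('<A HREF="Index.HTML">Home Page</A>'), "\n";
# prints: <a href="index.html">Home Page</a>
```

Because the pattern stops at the first `>`, a literal `>` inside an attribute value would end the match early; for pages with such markup an HTML parser would be the safer tool.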

Authoring

aenigma - Site: http://www.thekey.it
Infobox type: SCRIPTS
Skill Level: 3 - INTERMEDIATE
Last updated: 2004-02-13 20:22:59
Created: 2004-02-13 20:22:59
Language:

More info:

Shell environment and scripting

The shell environment and scripting: environment variables, loops, basic structures.

Linking

(F)AQ: I moved a site from Win to Linux and I can no longer make sense of upper and lower case
SOURCE: Various sources on the net and personal notes
SOURCE: shell script to convert files or directories to lowercase

Discussion

Never mind

on 2004-02-18 16:28:53

There, we need to fix this in the live discussion posts as well.

Problems with submissions

al on 2004-02-18 16:28:09

There are indeed problems when you use characters that HTML interprets (typically < and >).

We know we need to fix the problem once and for all (and it would not even take long), but we have not done it yet.

The workaround is to use the HTML entities (&gt; and &lt;) in place of the > and < signs.
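As a sketch of that workaround, the snippet below escapes a piece of code before it is pasted into a form that interprets HTML. `escape_html` is a hypothetical helper, not something the site provides; `REA` is the filehandle name used in the script. Without escaping, the `<REA>` part of such a line would presumably be taken for a tag and disappear.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# escape_html: hypothetical helper that replaces the characters HTML
# interprets with their entities
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;   # must come first, or it would re-escape the others
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

print escape_html('while (<REA>) {'), "\n";
# prints: while (&lt;REA&gt;) {
```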

Correction

aeniGma on 2004-02-13 20:25:34

I fixed an error, which I believe was caused by the parser during publication, in the REGEXP that converts all the tags to lowercase. If you took the script before today, 13 Feb 2004, you will most likely run into problems; copy it again.