From: Xah Lee
Subject: How To Add “alt=” To Image Tags With Emacs Lisp
Date: 
Message-ID: <9b52f9fa-8bfe-4354-b789-c098a22c6470@g1g2000pra.googlegroups.com>
A simple tutorial with an example using elisp.

• How To Add “alt=” To Image Tags With Emacs Lisp
  http://xahlee.org/emacs/lisp_update_image_tag.html

plain text version follows (lisp code formatting is screwed up)
--------------------------------------------------------

How To Add “alt=” To Image Tags With Emacs Lisp

Xah Lee, 2008-12-02

This page shows a real world example of using emacs's regex to update
HTML image tags on all files in a dir. You should be familiar with
Elisp Language Basics.

The Problem

Summary

I need to add proper “alt="image description"” to all image tags for a
bunch of HTML files in a dir. The alt's value should be based on the
image's file name.

Technically, this page shows you how to use emacs's regex and an elisp
function for the replacement string, to do find/replace on all files
in a dir.

Detail

I have many HTML files in a dir. Many have an image tag like this:

<img src="paraboloid.png" alt="math surface" width="832" height="513">

Note that the “alt” value in all of them is just “math surface”. I want
the alt value to be more descriptive, based on the file name. So, in
this example, it should be just “alt="paraboloid"”.

All these files are inside a dir, most of them in various subdirs.
There are about 100 files. About 50 of them have
“alt="math surface"” that needs to be fixed.

The simplest solution is to use regex with a custom replacement
function. (The method described here can also be used if your image
tags don't have “alt=” and you need to add it.)

Solution

The solution is quite simple. To do regexp replace on a bunch of
files, one can use the builtin command dired-do-query-replace-regexp.
So, all we have to do is to go to dired of the dir, call that command,
give the find string and replacement string, and we are done.

List Wanted Files In Dired

Since the files are in different subdirs, i use find-dired first,
which gets me all the files i want in one dired listing. So, i type
“Alt+x find-dired”, then give the dir name, then give “-name "*html"”.
The result is all html files in that dir and its subdirs.
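
The same file list can also be produced from Lisp. This is only a
sketch: it assumes Emacs 25 or later (for directory-files-recursively),
and the function name my-list-html-files is made up for illustration.

```elisp
(defun my-list-html-files (dir)
  "Return the full paths of all files under DIR whose names end in html.
Hypothetical helper; it mirrors the find-dired call with -name \"*html\"."
  ;; The second argument is a regexp matched against each file name
  ;; (without its directory part), so \\' anchors it at the end of the name.
  (directory-files-recursively dir "html\\'"))
```

Calling (my-list-html-files "~/web/math/") on a dir of your own (the
path here is just a placeholder) returns a plain list of file names,
which is handy if you want to count the files before going to dired.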

Then, i mark all files i want, by typing “% m”, which invokes
dired-mark-files-regexp. Then i give the pattern “\.html”, which marks
all html files.

The Search Regex Pattern

The next job is to give the regex search pattern. This is simple:

<img src="\([^"]+\)" alt="\([^"]+\)" width="\([0-9]+\)" height="\([0-9]+\)">

The Replacement Elisp Expression

The heart of this task is to write the elisp expression that gives us
the replacement string, where the alt part is the transformed version
of the file name. This is surprisingly simple too. Here's the lisp
expression we need:

(concat
 "<img src=\""
 (match-string 1)
 "\" alt=\""
 (replace-regexp-in-string "\\.png\\'" ""
                           (replace-regexp-in-string "_" " " (match-string 1)))
 "\" width=\""
 (match-string 3)
 "\" height=\""
 (match-string 4)
 "\">")

The “match-string” simply gives us the matched values: group 1 is the
src, group 2 is the old alt (which we discard), and groups 3 and 4 are
the width and height. The interesting part is the
replace-regexp-in-string we used to generate the value for alt. First,
we replace “_” with a space, then we delete the trailing “.png”. That's
all there is to it.
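
To try the pattern and the replacement outside of query-replace, here
is a rough sketch that works on a string instead of a buffer. The names
my-alt-from-src and my-fix-img-tag are invented for this example. It
assumes the tag has exactly the attribute order used in the search
pattern, where the old alt is group 2, so width and height are groups 3
and 4.

```elisp
(defun my-alt-from-src (src)
  "Derive alt text from the image file name SRC.
Hypothetical helper: underscores become spaces, a trailing .png is dropped."
  (replace-regexp-in-string
   "\\.png\\'" ""
   (replace-regexp-in-string "_" " " src)))

(defun my-fix-img-tag (tag)
  "Return the img tag string TAG with its alt rebuilt from its src.
TAG is returned unchanged if it does not match the search pattern."
  (if (string-match
       "<img src=\"\\([^\"]+\\)\" alt=\"\\([^\"]+\\)\" width=\"\\([0-9]+\\)\" height=\"\\([0-9]+\\)\">"
       tag)
      ;; Groups: 1 = src, 2 = old alt (ignored), 3 = width, 4 = height.
      ;; Read them all before calling anything else that runs a regexp.
      (let ((src (match-string 1 tag))
            (width (match-string 3 tag))
            (height (match-string 4 tag)))
        (concat "<img src=\"" src
                "\" alt=\"" (my-alt-from-src src)
                "\" width=\"" width
                "\" height=\"" height "\">"))
    tag))
```

Evaluating my-fix-img-tag on the sample tag from the Detail section
gives back the same tag with alt="paraboloid".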

Finally, we invoke dired-do-query-replace-regexp in the dired buffer
(keyboard shortcut is Q). Emacs will ask for the search string. We give
the search string as above, and then emacs will prompt for the
replacement string. We type “\,sexp”, where the sexp is the above lisp
expression (just copy and paste it). Then emacs will start the
search/replace and stop whenever it finds a match. To replace it and
continue, type “y”; to skip it, type “n”; to do all replacements in the
current file, type “!”. To continue through all the files, just keep
pressing “!”. Once done, type “Alt+x ibuffer”, then type “* u” to mark
all unsaved buffers, “S” to save them all, and “D” to close them all.
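
For comparison, what happens in one file when you answer “!” can be
sketched as a plain Lisp loop. This is not how
dired-do-query-replace-regexp is implemented, just a minimal
non-interactive equivalent; the function name my-fix-img-alt-in-buffer
is made up, and it replaces every match without asking.

```elisp
(defun my-fix-img-alt-in-buffer ()
  "Rebuild the alt value of every matching img tag in the current buffer.
Hypothetical sketch.  Returns the number of replacements made."
  (let ((count 0))
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward
              "<img src=\"\\([^\"]+\\)\" alt=\"\\([^\"]+\\)\" width=\"\\([0-9]+\\)\" height=\"\\([0-9]+\\)\">"
              nil t)
        ;; Groups: 1 = src, 2 = old alt (ignored), 3 = width, 4 = height.
        (let* ((src (match-string 1))
               (width (match-string 3))
               (height (match-string 4))
               (alt (replace-regexp-in-string
                     "\\.png\\'" ""
                     (replace-regexp-in-string "_" " " src))))
          ;; FIXEDCASE and LITERAL are both t, so the new text is
          ;; inserted exactly as built, with no case conversion.
          (replace-match
           (concat "<img src=\"" src "\" alt=\"" alt
                   "\" width=\"" width "\" height=\"" height "\">")
           t t)
          (setq count (1+ count)))))
    count))
```

You lose the see-and-do part this way, so it is best tried on a scratch
copy first, for example inside with-temp-buffer after
insert-file-contents.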

Without emacs, the above operation might take an hour or two and is
tedious and error prone. With a perl or python script, even given the
expertise, the problem is the lack of interactive see-and-do. With
emacs, the whole operation takes less than 5 minutes.

Advantage Of Emacs Regex Replace On Multiple Files

Suppose you are given a task where hundreds of valid HTML files in a
dir need to be converted to valid XHTML. Note that XHTML has a
slightly different syntax. For example, all tags such as <p> and <li>
now need to be closed. Also tags like <img>, <hr>, <br> etc need to
be like <img ... />, <hr/>, <br/>. Also, tags are now case sensitive,
so you need to lower case them. Also, image tags now must be wrapped
inside a container tag, such as “<div>”. The DTD also needs to be
changed, and there are many style oriented tags that need to be
transformed.
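
As an illustration of just one of these steps, here is a sketch that
turns <br>, <hr> and <img ...> into the self-closing form and lower
cases the tag name. The function name my-close-void-tags is invented
for this example, and it assumes no attribute value contains “<” or
“>”. It does not touch attribute names or any of the other changes
listed above.

```elisp
(require 'subr-x) ; for string-trim-right on older Emacs versions

(defun my-close-void-tags (html)
  "Return HTML with <br>, <hr> and <img ...> written as self-closing tags.
Hypothetical helper covering one XHTML step only.  Tags that are already
self-closing are left alone."
  (let ((case-fold-search t))           ; also match <BR>, <IMG ...>
    (replace-regexp-in-string
     "<\\(br\\|hr\\|img\\)\\(\\(?:[ \t\n][^<>]*\\)?\\)>"
     (lambda (tag)
       ;; Inside this function the match data refers to TAG itself.
       (let ((name (downcase (match-string 1 tag)))
             (attrs (string-trim-right (match-string 2 tag))))
         (cond ((string-suffix-p "/" attrs) tag)            ; e.g. <br />
               ((string= attrs "") (concat "<" name "/>"))  ; <BR> to <br/>
               (t (concat "<" name attrs " />")))))
     html t t)))
```

The same regexp and logic could be fed to a query-replace session one
step at a time, which is the point of the interactive approach.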

This task seems daunting. You could try a perl script in one shot, but
it would probably take you a whole day or days to develop, and if
your script has a parsing or regex error, it'll delete parts of your
files without you knowing it. You could take a trial and error
approach, trying one regex replacement at a time. Still, your script
runs in batch. If you make a mistake, you'll have to revert all your
files. With mastery of emacs, you can do the above transform using
regex find/replace one by one, interactively and safely, cutting your
time some 10 fold.

  Xah
∑ http://xahlee.org/

☄
From: William James
Subject: Re: How To Add alt= To Image Tags With Emacs Lisp
Date: 
Message-ID: <gh5eop$bf4$1@aioe.org>
Xah Lee wrote:

> The next job is to give the regex search pattern. This is simple:
> 
> <img src="\([^"]+\)" alt="\([^"]+\)" width="\([0-9]+\)" height="\([0-9]+\)">
> 
> The Replacement Elisp Expression
> 
> The heart of this task is to write the elisp expression that gives us
> the replacement string, where the alt part is the transformed version
> of the file name. This is surprisingly simple too. Here's the lisp
> expression we need:
> 
> (concat
>  "<img src=\""
>  (match-string 1)
>  "\" alt=\""
>  (replace-regexp-in-string "\\.png\\'" ""
>                            (replace-regexp-in-string "_" " " (match-string 1)))
>  "\" width=\""
>  (match-string 3)
>  "\" height=\""
>  (match-string 4)
>  "\">")
> 
> The “match-string” simply gives us the matched values: group 1 is the
> src, group 2 is the old alt (which we discard), and groups 3 and 4
> are the width and height. The interesting part is the
> replace-regexp-in-string we used to generate the value for alt.
> First, we replace “_” with a space, then we delete the trailing
> “.png”. That's all there is to it.
> 
> Finally, we invoke dired-do-query-replace-regexp in the dired buffer
> (keyboard shortcut is Q). Emacs will ask for the search string. We
> give the search string as above, and then emacs will prompt for the
> replacement string. We type “\,sexp”, where the sexp is the above
> lisp expression (just copy and paste it). Then emacs will start the
> search/replace and stop whenever it finds a match. To replace it and
> continue, type “y”; to skip it, type “n”; to do all replacements in
> the current file, type “!”. To continue through all the files, just
> keep pressing “!”. Once done, type “Alt+x ibuffer”, then type “* u”
> to mark all unsaved buffers, “S” to save them all, and “D” to close
> them all.
> 

Ruby:

$sought =
%r{
    (<img \s+ src=")
    (.*?)
    (" \s+ alt=")
    (math\ surface)
  }xmi

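# Builds the replacement text from the captures of the last match ($~).
$replacement = proc{
  a = $~.captures
  # New alt text: the file name minus its extension, underscores as spaces.
  a[-1] = a[1].sub(/\..*/,'').gsub(/_/,' ')
  a.join   # Array#to_s no longer concatenates the elements as of Ruby 1.9
}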

ARGV.each{|filename|
  changed = false
  auto = false
  puts "\nReading #{ filename }\n\n"
  text = IO.read( filename )
  text = text.gsub( $sought ){|old|
    new = $replacement.call
    if auto
      new
    else
      print "Replace\n#{ old }\nwith\n#{ new }\n? "
      if (resp = $stdin.gets.strip) =~ /^[y!]/i
        changed = true
        auto = ( "!" == resp )
        new
      else
        old
      end
    end
  }
  if changed
    print "Write changes to #{ filename } ? "
    if $stdin.gets.strip =~ /^y/i
      puts "Writing to #{ filename }\n\n"
      File.open( filename, "w" ){|f| f.print( text ) }
    end
  end
}