From: Xah Lee
Subject: Re: check html file size
Date: 
Message-ID: <1129454283.202378.296120@g44g2000cwa.googlegroups.com>
 Xah Lee wrote:
« would anyone like to translate the following perl
script to Python or Scheme (scsh)?
(Computing the total inline image size of a html file)
for both the Python version and the Perl version, see:
 http://xahlee.org/perl-python/check_html_size.html
»

Schemers, come on... no takers?
i'd really like to see a lisp version.

 Xah
 ···@xahlee.org
∑ http://xahlee.org/

From: Alex Shinn
Subject: Re: check html file size
Date: 
Message-ID: <1129511712.131461.112900@f14g2000cwb.googlegroups.com>
On Oct 16 Xah Lee wrote:
> Schemers, come on... no takers?

I think most people consider this pointless, but since
it's such a simple program I've included the
Common-Scheme version below.  It uses a proper
HTML parser rather than an incomplete regexp
solution.
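
For readers following along in Python (the other language
the thread asks about), here is a minimal sketch of the
parser-based idea, assuming the standard-library html.parser
module. The names ImgSrcCollector and image_sources are
hypothetical and are not taken from either posted version.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before this call.
        # Self-closing tags such as <img ... /> also end up here, because the
        # default handle_startendtag delegates to handle_starttag.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def image_sources(html_text):
    """Return the src values of all <img> tags, in document order."""
    collector = ImgSrcCollector()
    collector.feed(html_text)
    collector.close()
    return collector.sources
```

Unlike a simple regexp, the parser should skip an img tag
that sits inside an HTML comment and does not care about
the case or quoting style of the attribute.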

As in the Perl version, an image referenced more than once
is counted each time. If the idea is to get some measure
of server load, this probably isn't what you want.
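
The difference between the two ways of counting can be
sketched in Python as follows. This is only an illustration
under stated assumptions: total_size and its count_duplicates
flag are hypothetical names, the image paths are assumed to
be already resolved to files on disk, and duplicates are
detected by plain string equality of those paths.

```python
import os


def total_size(html_path, image_paths, count_duplicates=True):
    """Size in bytes of the HTML file plus the images it references."""
    if not count_duplicates:
        # dict.fromkeys keeps the first occurrence of each path, in order.
        image_paths = list(dict.fromkeys(image_paths))
    total = os.path.getsize(html_path)
    for path in image_paths:
        # A missing image contributes 0 bytes, as in the posted Scheme code.
        if os.path.isfile(path):
            total += os.path.getsize(path)
    return total
```

With count_duplicates=True the result matches the behaviour
described above; with False each distinct image is added
once, which is closer to what a caching browser would fetch.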

--
Alex

(common-module ()
  ((import-extension (common (file io pathname))
                     (alschemist text html-parser))
   (entry-point main))

(define size-limit (* 800 1000))

(define (main args)
  (let ((dir (cadr args)))
    (define parse
      (make-html-parser
       start: (lambda (tag attrs seed virtual?)
                ;; Push the size of each <img> tag's src file onto seed.
                (cond ((and (eq? tag 'img) (assq 'src attrs))
                       => (lambda (cell)
                            (cons (img-file-size
                                   (path-expand dir (cdr cell)))
                                  seed)))
                      (else seed)))))
    (define (extract-inline-images path)
      (call-with-input-file path (lambda (p) (parse '() p))))
    (define (html-file-size path)
      (apply + (file-size path) (extract-inline-images path)))
    (define (img-file-size path)
      (if (file-exists? path) (file-size path) 0))
    (define (kons d f ls)
      (cond ((and (equal? "html" (path-extension f))
                  (let* ((path (path-build d f))
                         (size (html-file-size path)))
                    (and (>= size size-limit)
                         (cons path size))))
             => (lambda (x) (cons x ls)))
            (else ls)))
    (for-each
     (lambda (x) (pr (cdr x) ": " (car x) nl))
     (directory-fold-tree dir kons '()))))

)
From: Xah Lee
Subject: Re: check html file size
Date: 
Message-ID: <1129515062.437883.81680@g47g2000cwa.googlegroups.com>
i can't run your program in scsh. (error below) What is
“common-scheme”?

would someone do a scsh version?

 Xah
 ···@xahlee.org
∑ http://xahlee.org/


Warning: invalid expression
         ()

Warning: definition in expression context
         (define size-limit (* 800 1000))

Warning: definition in expression context
         (define (main args) (let ((dir (cadr args))) (define parse
(make-html-parser start: (lambda # #))) (define (extract-inline-images
path) (call-with-input-file path (lambda # #))) (define (html-file-size
path) (apply + (file-size path) (extract-inline-images path))) (define
(img-file-size path) (if (file-exists? path) (file-size path) 0)) ---))

Error: undefined variable
       main
       (package user)

