Hi, I am a Lisp newbie. I am trying to load a web page into a string
with Lispworks, in order to parse the html code.
I am trying to use open-tcp-stream, but I am confused about the value
I have to use for the parameter "service".
Basically I just want to do:
(with-open-stream (web-page (comm:open-tcp-stream
                             "www.thewebpage.com" ?))
  (loop for line = (read-line web-page nil nil)
        while line
        do (write-line line)))
First, I don't know what the value of "?" should be. Second, I have
a hunch this will not work anyway. Can anyone tell me what I am
missing?
Finally, is there a place where I can look for some examples? I have
looked at the Lispworks Reference Manual, which is very complete, but
I get overwhelmed. Is there something simpler?
Thanks a lot for any help.
PS: I am still undecided about whether to use Lispworks or CLISP, so
the same question applied to CLISP would also help me a lot.
On Apr 1, 10:33 pm, ············@hotmail.com wrote:
> Hi, I am a Lisp newbie. I am trying to load a web page into a string
> with Lispworks, in order to parse the html code.
>
> (with-open-stream (web-page (comm:open-tcp-stream
> "www.thewebpage.com" ?))
> (loop for line = (read-line web-page nil nil)
> while line
> do (write-line line))))
>
> PS: I am still undecided about whether to use Lispworks or CLISP, so
> the same question applied to CLISP would also help me a lot.
In CLISP (using the construct you used):
(with-open-stream (stream (ext:open-http "http://
www.thewebpage.com/"))
(loop for line = (read-line stream nil nil nil)
while line
do (write-line line)))
...should work.
--
Phil
http://phil.nullable.eu/
On Tue, 01 Apr 2008 14:51:08 -0700, philip.armitage wrote:
> On Apr 1, 10:33 pm, ············@hotmail.com wrote:
>> Hi, I am a Lisp newbie. I am trying to load a web page into a string
>> with Lispworks, in order to parse the html code.
>>
>> (with-open-stream (web-page (comm:open-tcp-stream "www.thewebpage.com"
>> ?))
>> (loop for line = (read-line web-page nil nil)
>> while line
>> do (write-line line))))
>>
>> PS: I am still undecided about whether to use Lispworks or CLISP, so
>> the same question applied to CLISP would also help me a lot.
>
> In CLISP (using the construct you used):
>
> (with-open-stream (stream (ext:open-http "http:// www.thewebpage.com/"))
> (loop for line = (read-line stream nil nil nil)
> while line
> do (write-line line)))
>
> ...should work.
almost .. you need to send the initial request line before you start
reading, like:
(princ "GET / HTTP/1.1" stream)
..followed by two "newlines":
(defvar *crlf* (format nil "~C~C" #\Return #\Linefeed))
(princ *crlf* stream)
(princ *crlf* stream)
..but yeah, use Drakma.. :)
--
Lars Rune Nøstdal
http://nostdal.org/
On Apr 2, 2:27 am, Lars Rune Nøstdal <···········@gmail.com> wrote:
> On Tue, 01 Apr 2008 14:51:08 -0700, philip.armitage wrote:
> > (with-open-stream (stream (ext:open-http "http://www.thewebpage.com/"))
> > (loop for line = (read-line stream nil nil nil)
> > while line
> > do (write-line line)))
>
> > ...should work.
>
> almost .. you need to send the initial request line before you start
> reading, like:
>
> (princ "GET / HTTP/1.1" stream)
>
> ..followed by two "newlines":
>
> (defvar *crlf* (format nil "~C~C" #\Return #\Linefeed))
> (princ *crlf* stream)
> (princ *crlf* stream)
No, it works fine without doing all that (I tested before I posted the
code!). Probably you would need to do that on a raw TCP stream.
--
Phil
http://phil.nullable.eu/
···············@gmail.com wrote:
>
> No, it works fine without doing all that (I tested before I posted the
> code!). Probably you would need to do that on a raw TCP stream.
you're right, i did not see the "ext:open-http"-part
--
Lars Rune Nøstdal
http://nostdal.org/
On Apr 2, 9:32 am, Lars Rune Nøstdal <···········@gmail.com> wrote:
> ···············@gmail.com wrote:
>
> > No, it works fine without doing all that (I tested before I posted the
> > code!). Probably you would need to do that on a raw TCP stream.
>
> you're right, i did not see the "ext:open-http"-part
>
> --
> Lars Rune Nøstdal
> http://nostdal.org/
I tried it with CLISP but I get this error:
*** - Winsock error 10022 (EINVAL): Invalid argument
Sorry to keep asking, but I am really lost. Thanks a lot again for
your help.
On Apr 2, 8:56 am, ············@hotmail.com wrote:
> On Apr 2, 9:32 am, Lars Rune Nøstdal <···········@gmail.com> wrote:
>
> > ···············@gmail.com wrote:
>
> > > No, it works fine without doing all that (I tested before I posted the
> > > code!). Probably you would need to do that on a raw TCP stream.
>
> > you're right, i did not see the "ext:open-http"-part
>
> > --
> > Lars Rune Nøstdal
> > http://nostdal.org/
>
> I tried it with CLISP but I get this error:
>
> *** - Winsock error 10022 (EINVAL): Invalid argument
>
> Sorry to keep asking, but I am really lost. Thanks a lot again for
> your help.
Can you post the exact code you used in CLISP? That error message
often indicates that a duff URL was supplied (note that in my original
post, the URL got wrapped where it shouldn't have...).
--
Phil
http://phil.nullable.eu/
On Apr 2, 10:06 am, ···············@gmail.com wrote:
> On Apr 2, 8:56 am, ············@hotmail.com wrote:
>
> > On Apr 2, 9:32 am, Lars Rune Nøstdal <···········@gmail.com> wrote:
>
> > > ···············@gmail.com wrote:
>
> > > > No, it works fine without doing all that (I tested before I posted the
> > > > code!). Probably you would need to do that on a raw TCP stream.
>
> > > you're right, i did not see the "ext:open-http"-part
>
> > > --
> > > Lars Rune Nøstdal
> > > http://nostdal.org/
>
> > I tried it with CLISP but I get this error:
>
> > *** - Winsock error 10022 (EINVAL): Invalid argument
>
> > Sorry to keep asking, but I am really lost. Thanks a lot again for
> > your help.
>
> Can you post the exact code you used in CLISP? That error message
> often indicates that a duff URL was supplied (note that in my original
> post, the URL got wrapped where it shouldn't have...).
> --
> Phil
> http://phil.nullable.eu/
Thanks for the quick reply. I used:
(with-open-stream (stream (ext:open-http
"http://www.google.com/"))
(loop for line = (read-line stream nil nil nil)
while line
do (write-line line)))
It says:
;; connecting to "http://www.google.com"...
and then the above error.
On Apr 2, 9:15 am, ············@hotmail.com wrote:
> Thanks for the quick reply. I used:
>
> (with-open-stream (stream (ext:open-http
> "http://www.google.com/"))
> (loop for line = (read-line stream nil nil nil)
> while line
> do (write-line line)))
>
> It says:
>
> ;; connecting to "http://www.google.com"...
>
> and then the above error.
Hmmmm...I'm honestly not sure then. That works fine for me with clisp
2.41 on Linux and 2.42 on Windows. Probably one to take to the clisp
mailing list I guess.
--
Phil
http://phil.nullable.eu/
On Apr 2, 10:22 am, ···············@gmail.com wrote:
> On Apr 2, 9:15 am, ············@hotmail.com wrote:
>
> > Thanks for the quick reply. I used:
>
> > (with-open-stream (stream (ext:open-http
> > "http://www.google.com/"))
> > (loop for line = (read-line stream nil nil nil)
> > while line
> > do (write-line line)))
>
> > It says:
>
> > ;; connecting to "http://www.google.com"...
>
> > and then the above error.
>
> Hmmmm...I'm honestly not sure then. That works fine for me with clisp
> 2.41 on Linux and 2.42 on Windows. Probably one to take to the clisp
> mailing list I guess.
>
> --
> Phil
> http://phil.nullable.eu/
After you told me that, I tried from a different location and you were
right: it works just as I had written it originally. I suspect the
problem was a firewall at the first location I tried, so I am posting
this here in case anyone ever has the same problem.
Thanks again for your help.
On Wed, 02 Apr 2008 01:27:09 +0000, Lars Rune Nøstdal wrote:
>
> (defvar *crlf* (format stream "~C~C" #\Return #\Linefeed)) (princ *crlf*
> stream)
look at what my news client did, again .. pan was better before 2.x ...
i'm going to turn "wrap text" off for good this time and see if it helps.
even if i make some mistakes when wrapping text "manually", maybe the
damage will be less .. *sigh*
--
Lars Rune Nøstdal
http://nostdal.org/
On 1 Apr., 23:33, ············@hotmail.com wrote:
> Hi, I am a Lisp newbie. I am trying to load a web page into a string
> with Lispworks, in order to parse the html code.
> I am trying to use open-tcp-stream, but I am confused about the value
> I have to use for the parameter "service".
> Basically I just want to do:
... use Drakma - an HTTP client for Common Lisp: http://weitz.de/drakma/
ciao,
Jochen
<············@hotmail.com> wrote:
+---------------
| Hi, I am a Lisp newbie. I am trying to load a web page into a string
| with Lispworks, in order to parse the html code.
| I am trying to use open-tcp-stream, but I am confused about the value
| I have to use for the parameter "service".
| Basically I just want to do:
|
| (with-open-stream (web-page (comm:open-tcp-stream
| "www.thewebpage.com" ?))
| (loop for line = (read-line web-page nil nil)
| while line
| do (write-line line))))
|
| First, I don't know what the value of "?" should be. Second, I have
| the hunch this will not work anyway. Can anyone tell me what am I
| missing?
+---------------
You're missing quite a bit. See RFC 1945 "Hypertext Transfer
Protocol -- HTTP/1.0" <http://www.ietf.org/rfc/rfc1945.txt>
and RFC 2616 "Hypertext Transfer Protocol -- HTTP/1.1"
<http://www.ietf.org/rfc/rfc2616.txt> for details.
Note that for simple client apps you can usually get away with
implementing only the HTTP/1.0 protocol, *except*... even when
doing HTTP/1.0 you should *always* send the HTTP/1.1 "Host:" request
header anyway, since so many web sites these days use name-based
virtual hosting. [Fortunately, all of the popular web servers will,
as required by HTTP/1.1, correctly interpret the "Host:" request
header even if the request uses HTTP/1.0 protocol otherwise.]
-Rob
p.s. A simple standards-conforming HTTP/1.0 [+"Host:"] "GET"-only
client *can* be written in only ~100 lines of CL. [Handling "POST"
is only slightly more complicated.]
-----
Rob Warnock <····@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607
············@hotmail.com writes:
> Hi, I am a Lisp newbie. I am trying to load a web page into a string
> with Lispworks, in order to parse the html code.
> I am trying to use open-tcp-stream, but I am confused about the value
> I have to use for the parameter "service".
> Basically I just want to do:
Well, OPEN-TCP-STREAM will just open a TCP communications stream. You
will have to manage the HTTP protocol yourself.
The service parameter can be satisfied with the appropriate port for the
particular TCP service you want. The standard HTTP port is 80.
WITH-OPEN-STREAM will open a regular file stream, but you have to give
it a file name. Instead you will want to open a network stream and then
communicate with the other end.
> (with-open-stream (web-page (comm:open-tcp-stream
> "www.thewebpage.com" ?))
> (loop for line = (read-line web-page nil nil)
> while line
> do (write-line line))))
>
> First, I don't know what the value of "?" should be. Second, I have
> the hunch this will not work anyway. Can anyone tell me what am I
> missing?
Well, the main thing you are missing is that you first need to send an
HTTP request to the host you are connecting to. Only then will you
actually get any information back. So you will need to look for a
description of the protocol to be completely sure of what you need to
send.
Here is a simple example that uses the slightly easier HTTP/1.0 protocol
to get the page:
(defun write-crlf (stream)
  (format stream "~C~C" #\Return #\Linefeed))

;; It is important here to make sure you get the right number of CR/LF
;; pairs. There needs to be a blank line.
;; FINISH-OUTPUT is also critical to make sure the request is sent
;; before listening for a reply, otherwise you can hang.
(defun send-get-request (stream path)
  (format stream "GET ~A HTTP/1.0" path)
  (write-crlf stream)
  (write-crlf stream)
  (finish-output stream))

(defun print-lines (stream)
  (loop for line = (read-line stream nil nil)
        while line do (print line)))

;; Simple test function. Give it a host and the URL path.
;; Ex: (test "www.myhost.com" "/index.html")
(defun test (host path)
  (let ((tcp-stream nil))
    (unwind-protect
         (progn
           (setq tcp-stream (comm:open-tcp-stream host 80))
           (send-get-request tcp-stream path)
           (print-lines tcp-stream))
      (when tcp-stream (close tcp-stream)))))
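Since the original question was about getting the page into a string
rather than printing it, PRINT-LINES could probably be swapped for
something along these lines. This is only a sketch: STREAM-TO-STRING is
a made-up name (it is not part of LispWorks or CLISP), it strips the
trailing CR that READ-LINE leaves on each line of an HTTP response, and
the result still contains the status line and headers ahead of the HTML.

```lisp
;; Hypothetical helper, standard Common Lisp only.
;; Reads STREAM to end-of-file and returns everything as one string,
;; with each line terminated by a plain newline.
(defun stream-to-string (stream)
  (with-output-to-string (out)
    (loop for line = (read-line stream nil nil)
          while line
          ;; HTTP lines end in CR LF; READ-LINE consumes only the LF,
          ;; so drop the CR that is left at the end of the line.
          do (write-line (string-right-trim '(#\Return) line) out))))
```

In the TEST function above, (stream-to-string tcp-stream) would then
take the place of (print-lines tcp-stream), and TEST would return the
string.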
> PS: I am still undecided about whether to use Lispworks or CLISP, so
> the same question applied to CLISP would also help me a lot.
The only difference is changing out the line that opens the tcp
connection. What you probably want to do is something simple like the
following, so that you can gradually expand the set of implementations
you support:
(defun open-tcp-stream (host port)
  #+:lispworks (comm:open-tcp-stream host port)
  #+:clisp (..... host port)
  )
--
Thomas A. Russ, USC/Information Sciences Institute
Thomas A. Russ <···@sevak.isi.edu> wrote:
+---------------
| Here is a simple example that uses the slightly easier HTTP/1.0
| protocol to get the page:
...
| ;; It is important here to make sure you get the right number of CR/LF
| ;; pairs. There needs to be a blank line.
| ;; FINISH-OUTPUT is also critical to make sure the request is sent
| ;; before listening for a reply, otherwise you can hang.
| (defun send-get-request (stream path)
| (format stream "GET ~A HTTP/1.0" path)
| (write-crlf stream)
| (write-crlf stream)
| (finish-output stream))
+---------------
You really, *really* want to use an HTTP/1.1 "Host:" header
even when doing HTTP/1.0 requests, since name-based virtual
hosting is ubiquitous these days. E.g.:
(defconstant +crlf+ (coerce (list #\Return #\Linefeed) 'string))

(defun send-get-request (stream host path)
  (format stream "GET ~A HTTP/1.0~AHost: ~A~A~A"
          path +crlf+ host +crlf+ +crlf+)
  (finish-output stream))
[And, yes, the FINISH-OUTPUT (or FORCE-OUTPUT) *is* critical...]
Since your TEST function already had a HOST input argument,
this is a simple addition.
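One way to check the request text without touching the network is to
write it to a string stream and inspect the result. A minimal sketch,
using the same request layout as SEND-GET-REQUEST; the names
*WIRE-CRLF*, WRITE-GET-REQUEST and REQUEST-TEXT are invented for this
illustration and do not appear elsewhere in the thread:

```lisp
;; Hypothetical names; standard Common Lisp only.
(defparameter *wire-crlf* (coerce (list #\Return #\Linefeed) 'string))

;; Writes an HTTP/1.0 GET request with a Host: header to STREAM.
;; The final two CR/LF pairs end the header line and then supply the
;; blank line that terminates the request.
(defun write-get-request (stream host path)
  (format stream "GET ~A HTTP/1.0~AHost: ~A~A~A"
          path *wire-crlf* host *wire-crlf* *wire-crlf*)
  (finish-output stream))

;; Returns the request as a string instead of sending it anywhere.
(defun request-text (host path)
  (with-output-to-string (out)
    (write-get-request out host path)))
```

Evaluating (request-text "www.myhost.com" "/index.html") at the REPL
shows exactly which characters would be sent, including the CR/LF pairs.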
-Rob
-----
Rob Warnock <····@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607
···@sevak.isi.edu (Thomas A. Russ) writes:
> WITH-OPEN-STREAM will open a regular file stream, but you have to give
> it a file name. Instead you will want to open a network stream and then
> communicate with the other end.
Brain fade. I was thinking about WITH-OPEN-FILE. Ignore this. Using
WITH-OPEN-STREAM will allow you to dispense with the UNWIND-PROTECT I
was using.
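With that correction, the TEST function might shrink to something like
the sketch below. FETCH-LINES and its OPEN-FN argument are hypothetical:
OPEN-FN is assumed to be any function of a host and a port that returns
a bidirectional stream, for example COMM:OPEN-TCP-STREAM in LispWorks.
The sketch also sends the "Host:" header recommended elsewhere in this
thread, and it collects the lines rather than printing them.

```lisp
;; Hypothetical function; standard Common Lisp only.
;; OPEN-FN: function of (host port) returning a bidirectional stream.
;; WITH-OPEN-STREAM closes the stream on exit, even after an error,
;; so no explicit UNWIND-PROTECT is needed.
(defun fetch-lines (open-fn host path)
  (let ((crlf (coerce (list #\Return #\Linefeed) 'string)))
    (with-open-stream (stream (funcall open-fn host 80))
      (format stream "GET ~A HTTP/1.0~AHost: ~A~A~A"
              path crlf host crlf crlf)
      (finish-output stream)
      ;; Collect response lines, dropping the CR left by READ-LINE.
      (loop for line = (read-line stream nil nil)
            while line
            collect (string-right-trim '(#\Return) line)))))

;; Assumed LispWorks usage, typically after (require "comm"):
;;   (fetch-lines #'comm:open-tcp-stream "www.myhost.com" "/index.html")
```

Passing the opener as an argument keeps the implementation-specific
call in one place, and it lets the function be tried against a
two-way stream built from string streams without any network access.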
--
Thomas A. Russ, USC/Information Sciences Institute
············@hotmail.com writes:
> Hi, I am a Lisp newbie. I am trying to load a web page into a string
> with Lispworks, in order to parse the html code.
If you want to fetch and parse html you could use (portable)
allegroserve and use do-http-request as I've described in
http://groups.google.no/group/comp.lang.lisp/msg/cda1a24ac3b50a43
Petter
--
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?