hyphenation

From: Marc Battyani
Subject: hyphenation
Date: Tue, 07 Oct 2003 22:02:40 +0000
Message-ID: <blvd5r$tik@library2.airnews.net>

cl-typesetting now has a TeX compatible word hyphenation algorithm (Thanks to
Fabrice Popineau for his Lisp version)

So I added it to do simple hyphenation on a line basis. That's the easy part
and the result is here as usual:
www.fractalconcept.com/ex.pdf

Now, to design a good paragraph-level one, there are 3 options:
1. take the TeX one based on penalties
2. find another one
3. make a new one

I personally prefer option 3 as it's more fun, and it's always possible to
fall back to 1 anyway if it's not better.

So does anybody have some ideas for 3?

What I would like to try is some kind of evaluation function and then some
optimization by searching and backtracking like in the past Lisp days :)

In order to feed some kind of A* or whatever search we need an evaluation
function.

I haven't thought too much about it, but I think we can state that a perfect
layout has uniform spacing of one space between words and no hyphenation at
all. Does this look like a suitable goal?
If yes, we just need some measure of distance from that noble goal and
to minimize it. Something like:
a*number-of-hyphenation^b + c*(sum abs(space-width - normal-space-width)) ^d

What do you think about this?

Marc
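
To make the proposed evaluation function concrete, here is a minimal Python
sketch of it. The function name, the argument names and the default weights
are assumptions made for illustration; the post only gives the general shape
of the formula and leaves a, b, c and d open.

```python
def layout_cost(num_hyphens, space_widths, normal_space_width,
                a=1.0, b=1.0, c=1.0, d=1.0):
    """Distance of a paragraph layout from the 'perfect' one.

    num_hyphens        -- number of hyphenated line ends in the paragraph
    space_widths       -- width of every inter-word space actually used
    normal_space_width -- width of an unstretched space
    a, b, c, d         -- tuning weights (hypothetical defaults)
    """
    # Total deviation of all spaces from the normal width.
    total_deviation = sum(abs(w - normal_space_width) for w in space_widths)
    # a*number-of-hyphenation^b + c*(sum abs(...))^d
    return a * num_hyphens ** b + c * total_deviation ** d


# A perfect layout (no hyphens, all spaces normal) costs nothing.
print(layout_cost(0, [1.0, 1.0, 1.0], 1.0))
# Two hyphens and two spaces off by 0.5 each, with a=10, b=2, c=1, d=2.
print(layout_cost(2, [1.0, 1.5, 0.5], 1.0, a=10, b=2, c=1, d=2))
```

A search procedure would call such a function on each candidate layout and
keep the cheapest one.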


From: Eric Daniel
Subject: Re: hyphenation
Date: Tue, 07 Oct 2003 23:18:50 +0000
Message-ID: <3f8349da$1_3@corp.newsgroups.com>

In article <··········@library2.airnews.net>, Marc Battyani wrote:
>  cl-typesetting now has a TeX compatible word hyphenation algorithm (Thanks to
>  Fabrice Popineau for his Lisp version)

Nice!

>  So I added it to do simple hyphenation on a line basis. That's the easy part
>  and the result is here as usual:
>  www.fractalconcept.com/ex.pdf
>  
>  Now, to design a good paragraph-level one, there are 3 options:
>  1. take the TeX one based on penalties
>  2. find another one
>  3. make a new one
>  
>  I personally prefer option 3 as it's more fun, and it's always possible to
>  fall back to 1 anyway if it's not better.
>  
>  So does anybody have some ideas for 3?
>  
>  What I would like to try is some kind of evaluation function and then some
>  optimization by searching and backtracking like in the past Lisp days :)
>  
>  In order to feed some kind of A* or whatever search we need an evaluation
>  function.
>  
>  I haven't thought too much about it, but I think we can state that a perfect
>  layout has uniform spacing of one space between words and no hyphenation at
>  all. Does this look like a suitable goal?
>  If yes, we just need some measure of distance from that noble goal and
>  to minimize it. Something like:
>  a*number-of-hyphenation^b + c*(sum abs(space-width - normal-space-width)) ^d
>  
>  What do you think about this?

I think some kind of mechanism for adding penalties is definitely needed,
for two reasons:
  - general typesetting rules (e.g., avoid hyphenation on the last
    line of an odd page)
  - user directives (e.g., avoid breaking up "these two" words for whatever
    reason)

So:

  a*number-of-hyphenation^b +
  c*(sum abs(space-width - normal-space-width)) ^d +
  e*(sum incidental-penalties)/number-of-lines

Also you may want to make the cost of
individual over- or under-spacings non-linear
to prevent really long paragraphs from
looking            like           this.
Maybe like this:

  a*number-of-hyphenation^b +
  c*(sum abs(space-width - normal-space-width)^d) +
  e*(sum incidental-penalties)/number-of-lines


Eric Daniel
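
The difference between the two variants above can be checked with a small
Python sketch. All function names, argument names and default weights here
are assumptions for illustration only; the point is just that applying the
exponent to each space separately punishes one very wide gap more than
several slightly wide ones, while applying it to the sum cannot tell them
apart.

```python
def spacing_cost_total(space_widths, normal, d):
    """First variant: sum the deviations, then raise the sum to the power d."""
    return sum(abs(w - normal) for w in space_widths) ** d


def spacing_cost_per_space(space_widths, normal, d):
    """Second variant: raise each deviation to the power d, then sum."""
    return sum(abs(w - normal) ** d for w in space_widths)


def paragraph_cost(num_hyphens, space_widths, normal, penalties, num_lines,
                   a=1.0, b=1.0, c=1.0, d=2.0, e=1.0):
    """Second variant of the full formula, including incidental penalties.

    penalties is a list of extra costs (typesetting rules, user directives);
    the weights a..e are hypothetical tuning parameters.
    """
    return (a * num_hyphens ** b
            + c * spacing_cost_per_space(space_widths, normal, d)
            + e * sum(penalties) / num_lines)


even = [2, 2, 2]    # three spaces, each one unit too wide
uneven = [4, 1, 1]  # one space three units too wide, the others normal

# The summed variant gives both layouts the same cost.
print(spacing_cost_total(even, 1, 2), spacing_cost_total(uneven, 1, 2))
# The per-space variant prefers the evenly spread layout.
print(spacing_cost_per_space(even, 1, 2), spacing_cost_per_space(uneven, 1, 2))
```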



From: Chris Beggy
Subject: Re: hyphenation
Date: Wed, 08 Oct 2003 02:25:59 +0000
Message-ID: <87wubgihxk.fsf@lackawana.kippona.com>

"Marc Battyani" <·············@fractalconcept.com> writes:

> cl-typesetting now has a TeX compatible word hyphenation algorithm (Thanks to
> Fabrice Popineau for his Lisp version)
>
> So I added it to do simple hyphenation on a line basis. That's the easy part
> and the result is here as usual:
> www.fractalconcept.com/ex.pdf
>
> Now, to design a good paragraph-level one, there are 3 options:
> 1. take the TeX one based on penalties
> 2. find another one
> 3. make a new one
>
> I personally prefer option 3 as it's more fun, and it's always possible to
> fall back to 1 anyway if it's not better.
>
> So does anybody have some ideas for 3?

FWIW, LaTeX (I know it's not TeX) has the _babel_ package, which
allows a hyphenation algorithm on a language-by-language basis.

Chris

From: Simon András
Subject: Re: hyphenation
Date: Wed, 08 Oct 2003 09:28:39 +0000
Message-ID: <vcdr81o13js.fsf@tarski.math.bme.hu>

Chris Beggy <······@kippona.com> writes:

> FWIW, LaTeX (I know it's not TeX) has the _babel_ package, which
> allows a hyphenation algorithm on a language-by-language basis.

The algorithm is the same, I think (the one described in The TeXbook);
only the hyphenation patterns are different.

Andras
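
To illustrate how one algorithm can serve several languages just by swapping
the pattern list, here is a rough Python sketch of pattern-based hyphenation
in the style described above. It is not the cl-typesetting code, and the four
patterns are a tiny illustrative set written in the TeX pattern notation, not
a usable pattern file. The function names and the left_min/right_min
parameters are assumptions.

```python
import re


def parse_pattern(pattern):
    """Split e.g. 'hy3ph' into letters 'hyph' and weights [0, 0, 3, 0, 0]."""
    letters = re.sub(r"\d", "", pattern)
    weights = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            weights[pos] = int(ch)  # weight of the gap before letter `pos`
        else:
            pos += 1
    return letters, weights


def hyphenate(word, patterns, left_min=2, right_min=3):
    """Return the pieces of `word` between permitted hyphenation points."""
    table = dict(parse_pattern(p) for p in patterns)
    padded = "." + word.lower() + "."  # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)   # scores[i]: gap before padded[i]
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            weights = table.get(padded[start:end])
            if weights:
                for k, w in enumerate(weights):
                    scores[start + k] = max(scores[start + k], w)
    pieces, last = [], 0
    # An odd score allows a break; keep a minimum of letters on each side.
    for j in range(left_min, len(word) - right_min + 1):
        if scores[j + 1] % 2 == 1:
            pieces.append(word[last:j])
            last = j
    pieces.append(word[last:])
    return pieces


toy_patterns = ["hy3ph", "he2n", "hena4", "hen5at"]  # illustrative only
print("-".join(hyphenate("hyphenation", toy_patterns)))
```

Passing a different pattern list to the same function is all that changes
from one language to another in this sketch.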

From: ·············@comcast.net
Subject: Re: hyphenation
Date: Tue, 07 Oct 2003 23:31:04 +0000
Message-ID: <smm462x4.fsf@comcast.net>

"Marc Battyani" <·············@fractalconcept.com> writes:

> cl-typesetting now has a TeX compatible word hyphenation algorithm (Thanks to
> Fabrice Popineau for his Lisp version)
>
> So I added it to do simple hyphenation on a line basis. That's the easy part
> and the result is here as usual:
> www.fractalconcept.com/ex.pdf
>
> Now, to design a good paragraph-level one, there are 3 options:
> 1. take the TeX one based on penalties
> 2. find another one
> 3. make a new one
>
> I personally prefer option 3 as it's more fun, and it's always possible to
> fall back to 1 anyway if it's not better.
>
> So does anybody have some ideas for 3?
>
> What I would like to try is some kind of evaluation function and then some
> optimization by searching and backtracking like in the past Lisp days :)

Actually, if you consider the potential breaking points as nodes in a graph
and the end of the paragraph as the `goal', it is a simple `shortest path'
problem.
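
The graph view can be sketched in a few lines of Python. This is a simplified
illustration, not TeX's actual line breaker: widths are counted in characters,
hyphenation is ignored, and the edge cost (squared leftover space, free on the
last line) is an assumption chosen for the example. Breakpoint i means "a line
starts at word i", and an edge from i to j is the line holding words i..j-1.

```python
def break_lines(words, width):
    """Break `words` into lines of at most `width` characters, minimising
    the sum of squared trailing space over all lines but the last."""
    n = len(words)
    inf = float("inf")
    best = [inf] * (n + 1)  # best[j]: cheapest way to reach breakpoint j
    prev = [0] * (n + 1)    # prev[j]: breakpoint before j on that path
    best[0] = 0
    for j in range(1, n + 1):
        # Try every line words[i:j] that ends at breakpoint j.
        for i in range(j - 1, -1, -1):
            length = sum(len(w) for w in words[i:j]) + (j - i - 1)
            if length > width:
                break  # adding more words only makes the line longer
            cost = 0 if j == n else (width - length) ** 2
            if best[i] + cost < best[j]:
                best[j] = best[i] + cost
                prev[j] = i
    if best[n] == inf:
        raise ValueError("some word is wider than the line")
    # Walk back from the goal to recover the chosen breakpoints.
    lines, j = [], n
    while j > 0:
        i = prev[j]
        lines.append(" ".join(words[i:j]))
        j = i
    return lines[::-1]


for line in break_lines("aaa bb cc ddddd".split(), 6):
    print(line)
```

Because breakpoints only ever point forward, the graph has no cycles and a
single left-to-right pass is enough; no backtracking search is needed. The
evaluation function discussed earlier in the thread would slot in as the
per-line edge cost.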