From: ······@gmail.com
Subject: elisp: format-time-string's %z problem
Date: 
Message-ID: <1155009669.061684.107400@p79g2000cwp.googlegroups.com>
i believe there's a problem with elisp's format-time-string.

in format-time-string, for example i have in my emacs init script:

(defun date-time () "insert current date-time string." (interactive)
  (insert (format-time-string "%Y-%m-%dT%T%z"))
)

when evaluated, i get:
2006-08-07T20:39:16-0700

the last 5 chars "-0700" is for time zone, corresponding to the %z.

however, this does not seems to be a standard sec for time (RFC-3339),
which should have a colon after the 07. i.e. 2006-08-07T20:39:16-07:00

is there a way to generate the proper format using just
format-time-string without mucking with the string?

  Xah
  ···@xahlee.org
∑ http://xahlee.org/

From: Barry Margolin
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <barmar-31219C.00345009082006@comcast.dca.giganews.com>
In article <························@p79g2000cwp.googlegroups.com>,
 ·······@gmail.com" <······@gmail.com> wrote:

> i believe there's a problem with elisp's format-time-string.
> 
> in format-time-string, for example i have in my emacs init script:
> 
> (defun date-time () "insert current date-time string." (interactive)
>   (insert (format-time-string "%Y-%m-%dT%T%z"))
> )
> 
> when evaluated, i get:
> 2006-08-07T20:39:16-0700
> 
> the last 5 chars "-0700" is for time zone, corresponding to the %z.
> 
> however, this does not seems to be a standard sec for time (RFC-3339),
> which should have a colon after the 07. i.e. 2006-08-07T20:39:16-07:00

It's the format used in RFC 2822 date/time strings, i.e. headers of 
Internet email messages.

> is there a way to generate the proper format using just
> format-time-string without mucking with the string?

Put it in a string, split it into the first 3 characters and last 2 
characters, and then concatenate these with a ":" in between.

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
From: ······@gmail.com
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <1155113969.564474.119440@75g2000cwc.googlegroups.com>
Thanks Barry for the answer.

I did some reading yesterday. RFC-3339 is a just a full-form and
stricter version of a particular syntax in ISO 8601.

Iso 8601 allows things like 2006, 200608, 2006-08-09T01:18:39-07:00, or
20060809T011839. Basically, the force behind the power of ISO8602 is to
dictate that year should be spelled out, and the order is
year-month-day, as oppose to the WASP's "month-day, year" and other
imperial moronicities.

Since ISO8601 allows a condensed form like "20060809T011839" thus
that's why in the time zone it also allows "-0700" instead of the more
proper "-07:00". RFC 3339, in its glory of verbosity of 4500 words,
came down to say that their format should be the ISO8601 form of:
yyyy-mm-ddThh:mm:ss±zz:zz

i modified my time stamp code to thus:

(defun date-time () "insert current date-time string." (interactive)
  (insert
   (concat
    (format-time-string "%Y-%m-%dT%T")
    ((lambda (x) (concat (substring x 0 3) ":" (substring x 3 5)))
(format-time-string "%z"))
    )
   )
  )

Btw, this came to my attention because i started to syndicate my online
diary using the Atom protocol. In the Atom spec, time must be in the
form of RFC 3339. I kinda assumed that's what i had in my date-time
code. However, when submitting my .atom page to a validator, that's
when i realized the "-07:00" vs "-0700" discrepancy.

Another point of interest for those using Atom... In Atom, it requires
a unique id in uri format for each entry, which isn't trivial to
generate unless one spend another hour or so to read the spec for uri.
But i've adapted the following for the id part, any emacs users doing
atom might find it useful.

; generate rss atom's id. e.g. tag:xahlee.org,2006-08-08:053432
(defun atomid () "insert current date-time string." (interactive)
  (insert (format-time-string "tag:xahlee.org,%Y-%m-%d:%H%M%S"
(current-time) 1))
)

purely idle philosophizing now, one problem with this id adaption is
that it uses a time stamp. So, as a side effect, it gives away info
about when this id is made. Also, it gives out about my domain name. It
would be interesting, to think of a way to generate a universally
permanently unique id string (as required by Atom) without sides
effects of disclosing other meaningful info. One simple solution is
just to have a large number of random strings. How many characters
necessary can be easily computed by estimating the totality of diary
entries made by all humans accumulatively. I think the interesting
question is whether this is essentially the only solution. I kinda
think it is.

  Xah
  ···@xahlee.org
∑ http://xahlee.org/

Barry Margolin wrote:
> In article <························@p79g2000cwp.googlegroups.com>,
>  ·······@gmail.com" <······@gmail.com> wrote:
>
> > i believe there's a problem with elisp's format-time-string.
> >
> > in format-time-string, for example i have in my emacs init script:
> >
> > (defun date-time () "insert current date-time string." (interactive)
> >   (insert (format-time-string "%Y-%m-%dT%T%z"))
> > )
> >
> > when evaluated, i get:
> > 2006-08-07T20:39:16-0700
> >
> > the last 5 chars "-0700" is for time zone, corresponding to the %z.
> >
> > however, this does not seems to be a standard sec for time (RFC-3339),
> > which should have a colon after the 07. i.e. 2006-08-07T20:39:16-07:00
>
> It's the format used in RFC 2822 date/time strings, i.e. headers of
> Internet email messages.
>
> > is there a way to generate the proper format using just
> > format-time-string without mucking with the string?
>
> Put it in a string, split it into the first 3 characters and last 2
> characters, and then concatenate these with a ":" in between.
>
> --
> Barry Margolin, ······@alum.mit.edu
> Arlington, MA
> *** PLEASE post questions in newsgroups, not directly to me ***
> *** PLEASE don't copy me on replies, I'll read them in the group ***
From: Steve Schafer
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <7uijd25g1osrn70tahl6l0khe507mrd4gb@4ax.com>
On 9 Aug 2006 01:59:29 -0700, ·······@gmail.com" <······@gmail.com>
wrote:

>purely idle philosophizing now, one problem with this id adaption is
>that it uses a time stamp. So, as a side effect, it gives away info
>about when this id is made. Also, it gives out about my domain name. It
>would be interesting, to think of a way to generate a universally
>permanently unique id string (as required by Atom) without sides
>effects of disclosing other meaningful info.

1) Generate a unique identifier, including enough info, meaningful or
not, to guarantee uniqueness.

2) Run the unique identifier through a one-way hash function, such as
MD5 or SHA-1, thus obliterating the meaningfulness of any embeeded info.

Steve Schafer
Fenestra Technologies Corp
http://www.fenestra.com/
From: ······@gmail.com
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <1155161559.487708.3120@i3g2000cwc.googlegroups.com>
Great point!

I thought about this. The problem with this solution is that it
essentially ends up exactly as a random string, with added complexity
of the actual generation.

For example, if a hash function generates n chars. It is simpler, to
simply generate randomly n chars in the first place.

I guess what i wanted to say originally, was that, philosophically, to
generate a permanent universally unique number without any meaningful
info, it cannot avoid being a random string. In other words, the nature
of a universal unique number IS a random number of sufficient length.

in the process of writing these out, it seems obvious now. I think i'll
revisite my atom-id function to have it simply generate a long random
string. (with added colon etc so it is in the form of URI)

  Xah
  ···@xahlee.org
∑ http://xahlee.org/

Steve Schafer wrote:
> On 9 Aug 2006 01:59:29 -0700, ·······@gmail.com" <······@gmail.com>
> wrote:
>
> >purely idle philosophizing now, one problem with this id adaption is
> >that it uses a time stamp. So, as a side effect, it gives away info
> >about when this id is made. Also, it gives out about my domain name. It
> >would be interesting, to think of a way to generate a universally
> >permanently unique id string (as required by Atom) without sides
> >effects of disclosing other meaningful info.
>
> 1) Generate a unique identifier, including enough info, meaningful or
> not, to guarantee uniqueness.
>
> 2) Run the unique identifier through a one-way hash function, such as
> MD5 or SHA-1, thus obliterating the meaningfulness of any embeeded info.
>
> Steve Schafer
> Fenestra Technologies Corp
> http://www.fenestra.com/
From: Steve Schafer
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <906nd29lnemtg7ikrebienhkjesbkj1vae@4ax.com>
On 9 Aug 2006 15:12:39 -0700, ·······@gmail.com" <······@gmail.com>
wrote:
>I thought about this. The problem with this solution is that it
>essentially ends up exactly as a random string, with added complexity
>of the actual generation.

No, it's more than that. There is a stronger statistical likelihood of
uniqueness if the source data for the final result contains information
that is known to be unique (e.g., the MAC address of a NIC, such as is
used in the UUID generation algorithm).

Steve Schafer
Fenestra Technologies Corp
http://www.fenestra.com/
From: Xah Lee
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <1155822658.012984.205910@h48g2000cwc.googlegroups.com>
if your one-way function returns a 10 digit number, the probability of
it being unique is no better than a random 10-digit generating
function.

  Xah
  ···@xahlee.org
∑ http://xahlee.org/

Steve Schafer wrote:
> On 9 Aug 2006 15:12:39 -0700, ·······@gmail.com" <······@gmail.com>
> wrote:
> >I thought about this. The problem with this solution is that it
> >essentially ends up exactly as a random string, with added complexity
> >of the actual generation.
>
> No, it's more than that. There is a stronger statistical likelihood of
> uniqueness if the source data for the final result contains information
> that is known to be unique (e.g., the MAC address of a NIC, such as is
> used in the UUID generation algorithm).
>
> Steve Schafer
> Fenestra Technologies Corp
> http://www.fenestra.com/
From: Steve Schafer
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <bq29e2tqb1dq3hn1lspc092opkls8l4frd@4ax.com>
On 17 Aug 2006 06:50:58 -0700, "Xah Lee" <···@xahlee.org> wrote:

>if your one-way function returns a 10 digit number, the probability of
>it being unique is no better than a random 10-digit generating
>function.

No, that's not true, precisely because the one-way hash is _not_ random.

To see this, consider UUIDs as a concrete example. A UUID is a
deterministically generated 128-bit number. Two machines are guaranteed
never to generate the same UUID because incorporated into the UUID is
the MAC address of (one of) the machine's NIC, which is a unique 48-bit
number.* Furthermore, a single machine is guaranteed never to generate
the same UUID because the current time is also incorporated into the
UUID.**

So, a UUID would serve perfectly as our universally unique identifier,
except for one thing: It contains human-decipherable information that
can be used for a variety of nefarious purposes.

Enter the one-way hash function. What if we had a perfect hash function,
one that maps any 128-bit number to another 128-bit (or larger) number
in such a way that no two input numbers map to the same output number?
If the hash function sufficiently scrambles the input bits, it will
obliterate the human-decipherable information. And since the function is
one-way, there is no means, apart from brute force trial and error, to
recover the information.

So, if we start with two UUIDs, which are guaranteed to be different
because of the nature of the generation algorithm, and apply our perfect
hash function to them, we will end up with two identifiers which are
also guaranteed to be unique (because the hash is perfect), and in which
the human-decipherable information has been obliterated. Exactly what we
need.

Well, there is one little problem.... We don't have such a perfect
one-way hash function. We do have perfect hash functions that aren't one
way (which does us no good at all), and we also have some hash functions
that are one-way, but not perfect. MD5 and SHA-1 fall into this latter
category, and while they're not perfect, they're pretty darn close. They
can't absolutely guarantee that no collisions will occur, but the
likelihood of a collision is very, very small.

The nature of the hash algorithm in both cases takes advantage of the
fact that although the pool of potential UUIDs contains 2^128 members,
only a small fraction of the pool is ever actually tapped. Using this,
and furthermore ensuring that input numbers that are "close" to each
other tend to hash to values that are "far" from each other, we
significantly reduce the likelihood that two input values will hash to
the same value.

Steve Schafer
Fenestra Technologies Corp
http://www.fenestra.com/

*Excluding for the moment the apparent existence of NICs with MAC
addresses that are specified in flash memory.

**Excluding for the moment a potential weakness in the UUID algorithm
that could lead to duplicate UUIDs in the face of a reset of the system
clock.
From: Ray Dillinger
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <44e5eec9$0$96230$742ec2ed@news.sonic.net>
Steve Schafer wrote:

> Well, there is one little problem.... We don't have such a perfect
> one-way hash function. We do have perfect hash functions that aren't one
> way (which does us no good at all), and we also have some hash functions
> that are one-way, but not perfect. MD5 and SHA-1 fall into this latter
> category, and while they're not perfect, they're pretty darn close. They
> can't absolutely guarantee that no collisions will occur, but the
> likelihood of a collision is very, very small.

There is a slightly deeper problem.  Perfect one-way hash functions
from N bits to N bits cannot exist.

If every 128-bit number maps to exactly one other 128-bit number under
a hash function, then that hash function is reversible (ie, it is not
one-way).  If there is some 128-bit number that more than one other
128-bit number maps to under a hash function, then that hash function
is not one-to-one (ie, it is not perfect).  If there is some 128-bit
numer that no input hashes to under that function, then there must be
at least one other 128-bit number to which more than one input hashes
(pigeonhole principle).

People who do not understand what these terms mean frequently trumpet
that they have found a "perfect one-way hash function" whose outputs
are the same size as its inputs.  Such people may be safely ignored
because what they're really announcing is that they don't know what
they're talking about.

					Bear
From: Joe Marshall
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <1155921496.230747.60100@m79g2000cwm.googlegroups.com>
Ray Dillinger wrote:
>
> If every 128-bit number maps to exactly one other 128-bit number under
> a hash function, then that hash function is reversible (ie, it is not
> one-way).

I don't think that's what is usually meant by a `one-way hash'.  I
think the usual meaning is that the hash function does not have an
easily computed inverse.
From: Ray Dillinger
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <44e64ce9$0$96175$742ec2ed@news.sonic.net>
Joe Marshall wrote:
> Ray Dillinger wrote:
> 
>>If every 128-bit number maps to exactly one other 128-bit number under
>>a hash function, then that hash function is reversible (ie, it is not
>>one-way).
> 
> 
> I don't think that's what is usually meant by a `one-way hash'.  I
> think the usual meaning is that the hash function does not have an
> easily computed inverse.

Well... it's certainly the literal definition.

But aside from literal definitions, I *conjecture* that this type
of function is also impossible in the sense you mean it. Not
coincidentally, every easily computed function we know of having
the one-to-one property on the same size data, _is_ easily reversible
by someone who is fully aware of its mathematical structure.

As far as I can tell, one-to-one functions that are N-bits-to-N-bits
*always* have a reversal algorithm no more complex to compute than the
function itself.  This conjecture may be provable, but if so it's
beyond me to prove it.  This is just my experience/intuition, and
I've seen _MANY_ attempts at finding a function for which this is not
true.

It may require you to know some numbers not easily determined from
the hash algorithm itself, like modular-exponentiation, or if you're
very good it may require a mathematical proof or relation that we
haven't discovered (yet), although I've never seen that happen.

I will not be trusting any such function any time soon; if I
see one that seems to be such a case, I will fully expect that
some bright math Ph.D will discover how to easily reverse it the
following year, because I simply don't believe that a deterministic
calculation introducing no entropy can exist without being easily
reversible.

To state my conjecture strongly:

"Any function that is one-to-one and gives results the same size
as inputs, is not only provably 'reversible' rather than 'one-way'
in the technical literal sense, but can also be reversed _with_
_computing_power_equal_to_that_required_to_compute_it_."

I can't prove this, but as yet no counterexamples have been observed.

				Bear
From: Joe Marshall
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <1156176859.658944.135060@m73g2000cwd.googlegroups.com>
Ray Dillinger wrote:
>
> To state my conjecture strongly:
>
> "Any function that is one-to-one and gives results the same size
> as inputs, is not only provably 'reversible' rather than 'one-way'
> in the technical literal sense, but can also be reversed _with_
> _computing_power_equal_to_that_required_to_compute_it_."
>
> I can't prove this, but as yet no counterexamples have been observed.
>
> 				Bear

According to various sources, it is an open question whether one-way
functions exist.  On the other hand, public-key cryptography is built
on the assumption that they do.

I'm not sure I'd agree that no counterexamples have been `observed' ---
there exist functions that appear to be one-way --- but certainly no
counterexamples have been *proven*.  Nonetheless, I suspect that things
such as the subset-sum problem are one-way.
From: Ray Dillinger
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <44ea530a$0$96167$742ec2ed@news.sonic.net>
Joe Marshall wrote:
> Ray Dillinger wrote:
> 
>>To state my conjecture strongly:
>>
>>"Any function that is one-to-one and gives results the same size
>>as inputs, is not only provably 'reversible' rather than 'one-way'
>>in the technical literal sense, but can also be reversed _with_
>>_computing_power_equal_to_that_required_to_compute_it_."
>>
>>I can't prove this, but as yet no counterexamples have been observed.
> 
> According to various sources, it is an open question whether one-way
> functions exist.  On the other hand, public-key cryptography is built
> on the assumption that they do.


All forms of public-key cryptography that are one-to-one mappings
on identical-sized blocks have exactly the reversal I specified -
a decrypt function that reverses the encryption and takes the same
computing power as the encrypt function.

Any use of such an algorithm as a "perfect one-way hash"
requires proof that the developer of the hash function did not
save a decryption key or the information necessary to find one.

					Bear
From: Joe Marshall
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <1156268271.275578.123780@h48g2000cwc.googlegroups.com>
Ray Dillinger wrote:
> Joe Marshall wrote:
> > Ray Dillinger wrote:
> >
> >>To state my conjecture strongly:
> >>
> >>"Any function that is one-to-one and gives results the same size
> >>as inputs, is not only provably 'reversible' rather than 'one-way'
> >>in the technical literal sense, but can also be reversed _with_
> >>_computing_power_equal_to_that_required_to_compute_it_."
> >>
> >>I can't prove this, but as yet no counterexamples have been observed.
> >
> > According to various sources, it is an open question whether one-way
> > functions exist.  On the other hand, public-key cryptography is built
> > on the assumption that they do.
>
>
> All forms of public-key cryptography that are one-to-one mappings
> on identical-sized blocks have exactly the reversal I specified -
> a decrypt function that reverses the encryption and takes the same
> computing power as the encrypt function.

Yes, they are that way by design --- you wouldn't want a cryptosystem
where it was computationally infeasible to decrypt a message.  But
these systems work by finding or constructing a mapping that is
invertible only if you know the key (these are `trapdoor' one-way
functions).  But you can construct one of these without the trapdoor.
From: Ray Dillinger
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <44eb9603$0$96162$742ec2ed@news.sonic.net>
Joe Marshall wrote:
> Ray Dillinger wrote:

>>All forms of public-key cryptography that are one-to-one mappings
>>on identical-sized blocks have exactly the reversal I specified -
>>a decrypt function that reverses the encryption and takes the same
>>computing power as the encrypt function.


> Yes, they are that way by design --- you wouldn't want a cryptosystem
> where it was computationally infeasible to decrypt a message.  But
> these systems work by finding or constructing a mapping that is
> invertible only if you know the key (these are `trapdoor' one-way
> functions).  But you can construct one of these without the trapdoor.

You can; but I do believe that none of them will be a one-to-one
function on same-sized data. If you know a single counterexample,
talk about it.

				Bear
From: Joe Marshall
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <1156354092.762488.291290@74g2000cwt.googlegroups.com>
Ray Dillinger wrote:
> Joe Marshall wrote:
> > Ray Dillinger wrote:
>
> >>All forms of public-key cryptography that are one-to-one mappings
> >>on identical-sized blocks have exactly the reversal I specified -
> >>a decrypt function that reverses the encryption and takes the same
> >>computing power as the encrypt function.
>
>
> > Yes, they are that way by design --- you wouldn't want a cryptosystem
> > where it was computationally infeasible to decrypt a message.  But
> > these systems work by finding or constructing a mapping that is
> > invertible only if you know the key (these are `trapdoor' one-way
> > functions).  But you can construct one of these without the trapdoor.
>
> You can; but I do believe that none of them will be a one-to-one
> function on same-sized data. If you know a single counterexample,
> talk about it.

The discrete logarithm problem is one.  Here is a concrete example:

(define (expmod base exp m)
  (cond ((= exp 0) 1)
        ((even? exp)
         (remainder (square (expmod base (/ exp 2) m))
                    m))
        (else
         (remainder (* base (expmod base (- exp 1) m))
                    m))))

(define *big-prime* (1+ (* 142 (+ 531 (expt 10 301)))))

(define (one-way x) (expmod 3 x *big-prime*))

The procedure one-way can quickly map any integer in the range [0,
*big-prime*) uniquely to another integer in that range, but there is no
known easy way to invert it.  For example, there exists an x such that
(one-way x) evaluates to 2, but try finding it.
From: Ray Dillinger
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <44eceee0$0$96212$742ec2ed@news.sonic.net>
Joe Marshall wrote:
> Ray Dillinger wrote:
> 
>>Joe Marshall wrote:
>>
>>>Ray Dillinger wrote:
>>
>>>>All forms of public-key cryptography that are one-to-one mappings
>>>>on identical-sized blocks have exactly the reversal I specified -
>>>>a decrypt function that reverses the encryption and takes the same
>>>>computing power as the encrypt function.
>>
> 
> The discrete logarithm problem is one.  Here is a concrete example:

<snippage>

Discrete logarithms can be reversed in exactly the time it takes
to compute them.  For every modular exponentiation that forms a
one-to-one and onto function, there's a corresponding modular
exponentiation that reverses it in one step.

You're talking about the difficulty of *searching* for the number
that gives the reversal, not the difficulty of *doing* the reversal.

In other words, just because you have thrown away the decryption
key or don't know how to find it, doesn't make something from
public-key encryption into a hash function.

You cannot know, when you get a modular exponentiation to use
as a hash function, from some source, whether that source has
or does not have the key to reverse that modular exponentiation.

				Bear
From: Joe Marshall
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <1156442372.172646.185040@75g2000cwc.googlegroups.com>
Ray Dillinger wrote:
>
> Discrete logarithms can be reversed in exactly the time it takes
> to compute them.  For every modular exponentiation that forms a
> one-to-one and onto function, there's a corresponding modular
> exponentiation that reverses it in one step.
>
> You're talking about the difficulty of *searching* for the number
> that gives the reversal, not the difficulty of *doing* the reversal.

Ok, I see what you are saying.

I think you are right, then.  If a function computes a permutation,
then the repeated application of the function should generate a group
(right?).  Therefore, every permutation should have both a predecessor
and successor, and if the predecessor is known it should be
computationally no more difficult to invert the permutation than to
compute it in the first place.
From: Rob Warnock
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <qKOdnTEALe0Gc3PZnZ2dnUVZ_oOdnZ2d@speakeasy.net>
Joe Marshall <··········@gmail.com> wrote:
+---------------
| Ray Dillinger wrote:
| > that gives the reversal, not the difficulty of *doing* the reversal.
...
| I think you are right, then.  If a function computes a permutation,
| then the repeated application of the function should generate a group
| (right?).
+---------------

Wrong, actually if I understand what all the guys over on "sci.crypt"
were talking about a few years ago when it was proved that DES is *not*
a group! [Google for "DES is not a group" and you'll get lots of hits.]

    http://en.wikipedia.org/wiki/Data_Encryption_Standard
    ...
    DES has also been proved not to be a group, or more precisely, the
    set {EK} (for all possible keys K) under functional composition is
    not a group, nor "close" to being a group (Campbell and Wiener, 1992).
    This was an open question for some time, and if it had been the case,
    it would have been possible to break DES, and multiple encryption modes
    such as Triple DES would not increase the security.

The Campbell & Wiener paper is here:

    http://www3.sympatico.ca/wienerfamily/Michael/MichaelPapers/desgroup.pdf
    "DES is not a Group"
    Keith W. Campbell and Michael J. Wiener
    Bell-Northern Research
    P.O. Box 3511 Station C, Ottawa, Ontario, Canada, K1Y 4H7

    Abstract. We prove that the set of DES permutations (encryption
    and decryption for each DES key) is not closed under functional
    composition. This implies that, in general, multiple DES-encryption
    is not equivalent to single DES-encryption, and that DES is not
    susceptible to a particular known-plaintext attack which requires,
    on average, 228 steps. We also show that the size of the subgroup
    generated by the set of DES permutations is greater than 102499,
    which is too large for potential attacks on DES which would exploit
    a small subgroup.

+---------------
| Therefore, every permutation should have both a predecessor
| and successor, and if the predecessor is known it should be
| computationally no more difficult to invert the permutation
| than to compute it in the first place.
+---------------

Since the presumed premise is false, doesn't that also cast doubt
on the "Therefore"...?


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Joe Marshall
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <1156523393.733368.17360@75g2000cwc.googlegroups.com>
Rob Warnock wrote:
> Joe Marshall <··········@gmail.com> wrote:
> +---------------
> | Ray Dillinger wrote:
> | > that gives the reversal, not the difficulty of *doing* the reversal.
> ...
> | I think you are right, then.  If a function computes a permutation,
> | then the repeated application of the function should generate a group
> | (right?).
> +---------------
>
> Wrong, actually if I understand what all the guys over on "sci.crypt"
> were talking about a few years ago when it was proved that DES is *not*
> a group! [Google for "DES is not a group" and you'll get lots of hits.]
>
> +---------------
> | Therefore, every permutation should have both a predecessor
> | and successor, and if the predecessor is known it should be
> | computationally no more difficult to invert the permutation
> | than to compute it in the first place.
> +---------------
>
> Since the presumed premise is false, doesn't that also cast doubt
> on the "Therefore"...?

Well, it is clear that we've drifted into a subject that I am
completely ignorant about.  I'm pretty sure I'm not even using the word
`group' right.  Thanks for the pointers, though.  I'll see if I can't
educate myself.
From: Ray Dillinger
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <44f38003$0$96182$742ec2ed@news.sonic.net>
Joe Marshall wrote:
> 
> However, it doesn't necessarily follow that you can derive that P^-1
> from the original P using the same sort of math you used to get the
> original P (that is, if P is derived from a primitive root of some big
> prime, it doesn't follow that P^-1 can be derived from a different
> primitive root using exponentiation.  P^-1 *can* be derived from the
> same primitive root using discrete logarithms, but that is really
> difficult to do.)

If you can deduce the number N of iterations until you return to
the same permutation, then you can solve the discrete log by
exponentiation because
     For N : ForAll x, P^N(x) = x,
        ForAll x, P^-1(x) == P^(N-1)(x)

If a function is a group, this is easy because N is the cardinality
of the group.  If a function is a non-group, however, it is hard
because N is the least common multiple of the cardinality of all
subgroups.

I'm still trying to figure out if N can be found where P(x) = B^x
mod Z for B primitive relative to Z.

				Bear
From: Marcus Breiing
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <ecnbq4$nei$1@chessie.cirr.com>
····@rpw3.org (Rob Warnock) writes:

> Joe Marshall <··········@gmail.com> wrote:

> | If a function computes a permutation,
> | then the repeated application of the function should generate a group
> | (right?).

I think so.

> Wrong, actually if I understand what all the guys over on "sci.crypt"
> were talking about a few years ago when it was proved that DES is *not*
> a group!

This is different from a set generated by repeated application of a
single permutation. In the case of "DES is not a group", we're looking
at the set of functions generated by ranging over the possible keys.

> | Therefore, every permutation should have both a predecessor
> | and successor, and if the predecessor is known it should be
> | computationally no more difficult to invert the permutation
> | than to compute it in the first place.
> 
> Since the presumed premise is false, doesn't that also cast doubt
> on the "Therefore"...?

I think the premise was OK ... but that doesn't mean I can see how
the "Therefore" follows from it:-)

-- 
Marcus Breiing
(Cologne, Germany)
From: Ray Dillinger
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <44ef8e47$0$96189$742ec2ed@news.sonic.net>
Joe Marshall wrote:
> Ray Dillinger wrote:
> 
>>Discrete logarithms can be reversed in exactly the time it takes
>>to compute them.  For every modular exponentiation that forms a
>>one-to-one and onto function, there's a corresponding modular
>>exponentiation that reverses it in one step.
>>
>>You're talking about the difficulty of *searching* for the number
>>that gives the reversal, not the difficulty of *doing* the reversal.
> 
> 
> Ok, I see what you are saying.
> 
> I think you are right, then.  If a function computes a permutation,
> then the repeated application of the function should generate a group
> (right?).  

Not quite.  Consider the following two permutations of 5 elements:

1:2
2:3
3:1
4:5
5:4

1:2
2:3
3:4
4:5
5:1

Both are one-to-one and onto functions.  The first has two
subgroups of sizes 3 and 2 (that is, two different subsets where
repeated application runs the members of the subsets in a cycle).
The second has a single subgroup of size 5, and so it is properly
a group.

A queer side effect of this is that in the second function, if you
take all possible inputs and apply the function repeatedly, you
get the original vector of inputs again in just five iterations
(the cardinality of the group).

But in the first function, if you take the vector of all possible
inputs and apply the function repeatedly, you get the original
vector of inputs again in 6 iterations.  This is because 6 is the
least common multiple of the cardinality of all subgroups.


> Therefore, every permutation should have both a predecessor
> and successor, and if the predecessor is known it should be
> computationally no more difficult to invert the permutation than to
> compute it in the first place.

I'm still thinking about whether this applies in the case where
the one-to-one and onto function is a non-group.

				Bear
From: David Kastrup
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <85u042kmba.fsf@lola.goethe.zz>
Ray Dillinger <····@sonic.net> writes:

> Joe Marshall wrote:
>> Ray Dillinger wrote:
>>
>>>Joe Marshall wrote:
>>>
>>>>Ray Dillinger wrote:
>>>
>>>>>All forms of public-key cryptography that are one-to-one mappings
>>>>>on identical-sized blocks have exactly the reversal I specified -
>>>>>a decrypt function that reverses the encryption and takes the same
>>>>>computing power as the encrypt function.
>>>
>>
>> The discrete logarithm problem is one.  Here is a concrete example:
>
> <snippage>
>
> Discrete logarithms can be reversed in exactly the time it takes to
> compute them.  For every modular exponentiation that forms a
> one-to-one and onto function, there's a corresponding modular
> exponentiation that reverses it in one step.

Uh, you are confusing roots and logarithms.

x: x->y(x) = x^p can be reversed with a suitable exponentiation.
x: x->y(x) = p^x can't be reversed with suitable exponentiation.

> You're talking about the difficulty of *searching* for the number
> that gives the reversal, not the difficulty of *doing* the reversal.

Nope.

> In other words, just because you have thrown away the decryption
> key or don't know how to find it, doesn't make something from
> public-key encryption into a hash function.

He was talking about logarithms (the inverse of exponential
functions), not roots (the inverse of powers).

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
From: Ray Dillinger
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <44ef2a29$0$96185$742ec2ed@news.sonic.net>
David Kastrup wrote:

>>>>>Ray Dillinger wrote:

>>>>>>All forms of public-key cryptography that are one-to-one mappings
>>>>>>on identical-sized blocks have exactly the reversal I specified -
>>>>>>a decrypt function that reverses the encryption and takes the same
>>>>>>computing power as the encrypt function.


> x: x->y(x) = x^p can be reversed with a suitable exponentiation.
> x: x->y(x) = p^x can't be reversed with suitable exponentiation.

But p^x modulo something isn't a one-to-one and onto function.

				Bear
	
From: Michael Stewart
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <874pw0itpp.fsf@saaz.cs.gsu.edu>
Ray Dillinger <····@sonic.net> writes:

> David Kastrup wrote:
>
>>>>>>Ray Dillinger wrote:
>
>>>>>>>All forms of public-key cryptography that are one-to-one mappings
>>>>>>>on identical-sized blocks have exactly the reversal I specified -
>>>>>>>a decrypt function that reverses the encryption and takes the same
>>>>>>>computing power as the encrypt function.
>
>
>> x: x->y(x) = x^p can be reversed with a suitable exponentiation.
>> x: x->y(x) = p^x can't be reversed with suitable exponentiation.
>
> But p^x modulo something isn't a one-to-one and onto function.

That depends on what "p" and "something" are.  In the following I
prefer to change "p" to "a" and "something" to "p."  It is possible
for a function f(x)=a^x mod p to be a one-to-one function mapping x
from the set {1,2,...,p-1} onto the set {1,2,...,p-1}.

Examples: a=3 and p=7.  The powers of 3 mod 7 are:

a^1=3, a^2=2, a^3=6, a^3=4, a^5=5, a^6=1  mod 7.

So the function f(x)=3^x mod 7 is a bijection mapping the set
{1,2,3,4,5,6} onto {1,2,3,4,5,6}.

And here is yet another a=99999999999999999,
p=100000000000000000000000067.  I'll spare you an exhaustive listing
of the powers of a mod p.  ;-)

Joe Marshall's p and a=5 is another.

Proving that a function of the form f(x)=a^x mod p is a bijection can
be easy when p-1 factors easily.  I have assigned it as a homework
problem before.  I'm not a cryptography expert, but I believe that in
applications it is usually assumed that the number a is chosen so that
the function is a bijection.  It is not hard to generate such values.
From: David Kastrup
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <853bbkg3fh.fsf@lola.goethe.zz>
Ray Dillinger <····@sonic.net> writes:

> David Kastrup wrote:
>
>>>>>>Ray Dillinger wrote:
>
>>>>>>>All forms of public-key cryptography that are one-to-one mappings
>>>>>>>on identical-sized blocks have exactly the reversal I specified -
>>>>>>>a decrypt function that reverses the encryption and takes the same
>>>>>>>computing power as the encrypt function.
>
>
>> x: x->y(x) = x^p can be reversed with a suitable exponentiation.
>> x: x->y(x) = p^x can't be reversed with suitable exponentiation.
>
> But p^x modulo something isn't a one-to-one and onto function.

If p is a primitive root of "something", and "something" is a prime,
p^x is quite 1:1.

For example, 3^x mod 65537 provides a 1:1 mapping of 1..65536 onto
1..65536.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
From: Ray Dillinger
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <44ef9228$0$96188$742ec2ed@news.sonic.net>
David Kastrup wrote:
> Ray Dillinger <····@sonic.net> writes:

>>David Kastrup wrote:

>>>x: x->y(x) = x^p can be reversed with a suitable exponentiation.
>>>x: x->y(x) = p^x can't be reversed with suitable exponentiation.

>>But p^x modulo something isn't a one-to-one and onto function.

> If p is a primitive root of "something", and "something" is a prime,
> p^x is quite 1:1.

> For example, 3^x mod 65537 provides a 1:1 mapping of 1..65536 onto
> 1..65536.

Hmmm.... Let me think about this.  It has a structure that
implies things not immediately obvious.

				Bear
From: Michael Stewart
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <87ejv7glq5.fsf@saaz.cs.gsu.edu>
"Joe Marshall" <··········@gmail.com> writes:

> Ray Dillinger wrote:
>>
>> You can; but I do believe that none of them will be a one-to-one
>> function on same-sized data. If you know a single counterexample,
>> talk about it.
>
> The discrete logarithm problem is one.  Here is a concrete example:
>
> (define (expmod base exp m)
>   (cond ((= exp 0) 1)
>         ((even? exp)
>          (remainder (square (expmod base (/ exp 2) m))
>                     m))
>         (else
>          (remainder (* base (expmod base (- exp 1) m))
>                     m))))
>
> (define *big-prime* (1+ (* 142 (+ 531 (expt 10 301)))))
>
> (define (one-way x) (expmod 3 x *big-prime*))
>
> The procedure one-way can quickly map any integer in the range [0,
> *big-prime*) uniquely to another integer in that range, but there is no
> known easy way to invert it.  For example, there exists an x such that
> (one-way x) evaluates to 2, but try finding it.

Not to be picky since I agree with you and the discrete log is a good
example, but I believe that 3 is not a primitive root mod *big-prime*.
So your function is not one-to-one.  When I try

(one-way (- *big-prime* 1))

and

(one-way
71000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000037701)

both evaluate to 1.

I checked and I believe 5 is a primitive root.  So just change 3 to 5.
From: Joe Marshall
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <1156371127.439990.254980@75g2000cwc.googlegroups.com>
Michael Stewart wrote:
>
> Not to be picky since I agree with you and the discrete log is a good
> example, but I believe that 3 is not a primitive root mod *big-prime*.
> So your function is not one-to-one.

D'oh!

>
> I checked and I believe 5 is a primitive root.  So just change 3 to 5.

Ah, thank you!  I should have checked this myself.
From: Antony Sequeira
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <85GdnZPwV4H0xXPZnZ2dnUVZ_qWdnZ2d@comcast.com>
Joe Marshall wrote:
> Ray Dillinger wrote:
>> Joe Marshall wrote:
>>> Ray Dillinger wrote:
>>>> All forms of public-key cryptography that are one-to-one mappings
>>>> on identical-sized blocks have exactly the reversal I specified -
>>>> a decrypt function that reverses the encryption and takes the same
>>>> computing power as the encrypt function.
>>
>>> Yes, they are that way by design --- you wouldn't want a cryptosystem
>>> where it was computationally infeasible to decrypt a message.  But
>>> these systems work by finding or constructing a mapping that is
>>> invertible only if you know the key (these are `trapdoor' one-way
>>> functions).  But you can construct one of these without the trapdoor.
>> You can; but I do believe that none of them will be a one-to-one
>> function on same-sized data. If you know a single counterexample,
>> talk about it.
> 
> The discrete logarithm problem is one.  Here is a concrete example:
> 
> (define (expmod base exp m)
>   (cond ((= exp 0) 1)
>         ((even? exp)
>          (remainder (square (expmod base (/ exp 2) m))
>                     m))
>         (else
>          (remainder (* base (expmod base (- exp 1) m))
>                     m))))
> 
> (define *big-prime* (1+ (* 142 (+ 531 (expt 10 301)))))
> 
> (define (one-way x) (expmod 3 x *big-prime*))
> 
> The procedure one-way can quickly map any integer in the range [0,
> *big-prime*) uniquely to another integer in that range, but there is no
> known easy way to invert it.  For example, there exists an x such that
> (one-way x) evaluates to 2, but try finding it.
> 

I have a problem where I think this "discrete logarithm" may be useful. 
I'd appreciate opinions on this.

We have data in a RDBMS table where each row gets a auto incremented 
number as a primary key.
We want to expose these rows through  URLs unique for each row but we do 
not want to give out the primary key since we do not want to make easy 
crawling of the data.
So, I was thinking that we could use  'one-way' function to create 
mapping from the auto generated number to another number and store that 
number for each row. Since this mapping is said to be one to one we can 
then use that as the key into the table and expose it on the url that 
represents each row. We don't need to be able to reverse it since we can 
have a index on the mapped number column.

If I assume we won't get beyond say 1 billion for the auto incremented 
number, I guess all i need to get is a prime number that is bigger than 
1 billion and smaller than the range for the db column where we'll store 
the mapped value.

I do not sure I understood all of what has been said.
If the prime we use is big enough and if there are sufficiently many 
primes in the range say 1 billion to 2 billion, i am assuming this will 
make it 'sufficiently' hard. I do understand that someone could 
potentially make a billion requests.

Is this a sensible way to do what I want.

I know this is not directly related to lisp, but this is the only group 
I lurk on :)

-Antony
From: Kim Minh Kaplan
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <1156497316.459237.70560@m73g2000cwd.googlegroups.com>
Antony Sequeira wrote:

> I have a problem where I think this "discrete logarithm" may be useful.
> I'd appreciate opinions on this.
>
> We have data in a RDBMS table where each row gets a auto incremented
> number as a primary key.
> We want to expose these rows through  URLs unique for each row but we do
> not want to give out the primary key since we do not want to make easy
> crawling of the data.
> So, I was thinking that we could use  'one-way' function to create
> mapping from the auto generated number to another number and store that
> number for each row. Since this mapping is said to be one to one we can
> then use that as the key into the table and expose it on the url that
> represents each row. We don't need to be able to reverse it since we can
> have a index on the mapped number column.

This does not really help you, unless you consider that your one way
hash function is secrete.  Otherwise anybody who wants to crawl your
rows can apply the same function to whatever key he whishes to retrieve
and build his query that way i.e your URL are stille easily guessable.
Rather than using a hash function, better use a random function...
Your problem bears many similarities to Xah Lee's original topic.

Kim Minh.
http;//www.kim-minh.com/
From: ···············@space.at
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <umz9vwqhg.fsf@space.at>
>>>>> "Bear" == Ray Dillinger <····@sonic.net> writes:

    Bear> Joe Marshall wrote:
    >> Ray Dillinger wrote:

    >>> All forms of public-key cryptography that are one-to-one mappings
    >>> on identical-sized blocks have exactly the reversal I specified -
    >>> a decrypt function that reverses the encryption and takes the same
    >>> computing power as the encrypt function.

I don't think this is true, e.g. RSA encryption uses significantly
less computation than decryption:

  y = x^e (mod n)      encryption, takes O(log(e)) modular
                       exponentiaticons log2(e) is typically 2..16
  x = y^d (mod n)      decryption, log2(d) \approx log2(n), 
                        typically 700..2000

    >> Yes, they are that way by design --- you wouldn't want a
    >> cryptosystem where it was computationally infeasible to decrypt
    >> a message.  But these systems work by finding or constructing a
    >> mapping that is invertible only if you know the key (these are
    >> `trapdoor' one-way functions).  But you can construct one of
    >> these without the trapdoor.

    Bear> You can; but I do believe that none of them will be a
    Bear> one-to-one function on same-sized data. If you know a single
    Bear> counterexample, talk about it.

Modular exponentiation used for Diffie-Hellman:

  y = g^x (mod p)       O(log(x)) modexps (typically a few hundreds or
                        thousands)
  
to get x from y, you need to solve the discrete log problem, which is
believed to be (not proven) to be hard.  IIRC, it takes about
O(sqrt(p)) operations [1].  For a 128-bit input/output space, we need 128
modexps to compute y from x, but 2^64 [2] operations to compute x from y.

[1] Unless p is badly chosen.
[2] Not computationally one-way with current technology (more than 128
    bits will be needed) but still much more effort than in the
    forward direction.

                                HTH
                                    Roland
From: Ray Dillinger
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <44ecebc0$0$96176$742ec2ed@news.sonic.net>
···············@space.at wrote:

>     Bear> You can; but I do believe that none of them will be a
>     Bear> one-to-one function on same-sized data. If you know a single
>     Bear> counterexample, talk about it.
> 
> Modular exponentiation used for Diffie-Hellman:
> 
>   y = g^x (mod p)       O(log(x)) modexps (typically a few hundreds or
>                         thousands)
>   
> to get x from y, you need to solve the discrete log problem, which is
> believed to be (not proven) to be hard.  IIRC, it takes about
> O(sqrt(p)) operations [1].  For a 128-bit input/output space, we need 128
> modexps to compute y from x, but 2^64 [2] operations to compute x from y.

A corollary of the Chinese Remainder Theorem states that
For every invertible (one-to-one) modular exponentiation,
there is another modular exponentiation under the same
modulus that exactly reverses it.

Formally,

ForAll X: if (Y^X) mod Z = W forms an onto function
	from Y to W, then
ThereExists X': (W^X') mod Z = Y.

Your 2^64 operations assume that X' is not known and must
be searched for; if known, then reversing ((Y^X) mod Z) is
a simple matter of plugging in the correct number and
doing a single modexp.

Thus, the use of this type of function as a hash fails;
you cannot know, if provided X by another party, whether
that party has the corresponding X', and if they do have
it, then this is instantly reversible by them (thus, not
an irreversible hash). RSA is the best-known method by
which someone might know both X and X' (because they have
both been constructed from shared roots).

The only way you can be assured that no one has such a
"solution" in hand, is if you provided some of the bits for
the key yourself (as in Diffie-Hellman Key Agreement
Protocol).  But in that case you have something else besides
a "hash", because you have to distribute the key bits
somehow.  That makes it the same, in protocol design, as
an encryption algorithm.

				Bear
From: Steve VanDevender
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <ecjca5$je6$1@pith.uoregon.edu>
Ray Dillinger <····@sonic.net> writes:

> Joe Marshall wrote:
> 
> > Yes, they are that way by design --- you wouldn't want a cryptosystem
> > where it was computationally infeasible to decrypt a message.  But
> > these systems work by finding or constructing a mapping that is
> > invertible only if you know the key (these are `trapdoor' one-way
> > functions).  But you can construct one of these without the trapdoor.
> 
> You can; but I do believe that none of them will be a one-to-one
> function on same-sized data. If you know a single counterexample,
> talk about it.

I belive I already mentioned that the commonly-used cryptographic hash
functions like MD5 and SHA-1 are designed to be one-to-one on the space
of N-bit inputs equal to their output size; i.e. for all 128-bit inputs
to MD5 it will produce unique 128-bit hashes.  Those functions are based
on operations like those used in symmetric block ciphers, mostly simple
logical operations and bit permutation, and specifically designed
without any explicit or implicit key value as an additional input
(although they tend to make use of some built-in "random" constants). I
don't know of any hash functions based on the kinds of operations used
in public-key algorithms such as modular exponentiation (although I
suppose if I perused my copy of _Applied Cryptography_ it may well
mention some; I recall it discussing the known methods for building hash
functions from block ciphers and block ciphers from hash functions, and
in a basic sense public-key algorithms are a class of block ciphers).

-- 
Steve VanDevender  "I ride the big iron"  http://hexadecimal.uoregon.edu/
······@hexadecimal.uoregon.edu  PGP keyprint 4AD7AF61F0B9DE87 522902969C0A7EE8
Little things break, circuitry burns / Time flies while my little world turns
Every day comes, every day goes / 100 years and nobody shows -- Happy Rhodes
From: ······@gmail.com
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <1156480752.260320.196490@i42g2000cwa.googlegroups.com>
i thought more whether i was correct and about a week ago wrote the
following. Basically, whether a one-way function will beat a random
function in producing outputs of n digits that are unique, depends on
the ratio g/s, where g is the number of already generated numbers and s
is the space of all possible digits (i.e. 10^n).

and, my thought is that if g/s is not extremely small, then random
function win. If g/s is extremely small, basically a requirement
applications that needs a “random” unique string, then the one-way
function win technically, but not in any practical sense.

The following is rather a babbling, and i think towards the end where i
tried to formulate my thougts into formal statements may be wrong...
but i didn't want to spend more time on this. (but do want to, however,
not leave the reply to me hanging)

Thanks for your thoughts.

  Xah
  ···@xahlee.org
∑ http://xahlee.org/

-------
i think the answer is that, when the slots are barely filled, than
one-way function beats random function. But if the slots are slightly
filled or more (say, > 10%), random beats one-way, in the probably of
picking a slot that havn't been filled.

To see this, imagine there are just 10 slots. So, one-way f generates a
number among 1 to 10, based on a given input number from a set of
numbers. While, random is just random. Because of the one-way
function's non-random nature, it's gonna repeat after just a few picks.
While the random function's chances are strictly governed by the ratio
of filled/non-filled.

We see that, for the goal of generating unique numbers, the slots must
be outrageously numerous. Let g be the total number of expected
generations. Let s be the number of slots. So, we want g/s to be
1/gazillion. For practical purposes, gazillion here can be a google.
(note that total population is just 6.5*10^6, suppose to be generous
than everyone writes a blog entry per day (which should compensate
massive rss entries made by professional publishing houses), for the
next 100 years (say, RSS/Atom's life span), then the number is only
g=23725000000. So if s=google, g/s=2.3725e-90, probably good enough.)

so, in the above scenario, one can see that the one-way function's
probability of generating unique number is better than the random
function, but by a difference of some iota.

Let's denote one-way function as owf, and random function as rf. And
p(owf,n/s) is the probability of generating a already generated number
(a slot already filled) where n is number of filled slots. (think of a
number as a slot or hole, and once it is generated, the hole is
filled).

Also, as we get deep into this problem, one critical question is how
the input of owf are choosen, and we also need to give explicitly
assume certain properties of owf in the context of our problem. Without
getting all worked up, basically, we assume owf(a)!=owf(b) iff a!=b,
and we assume we'll always use a different input to owf.

I think the theorem here is that, as g/s approaches 0,
p(owf,n/s)/p(rf,n/s) also approaches 1... and if g/s approaches 1,
p(owf,n/s)/p(rf,n/s) is almost always greater than 1. (the above needs
to be rephrased and reworked)


Steve Schafer wrote:
> On 17 Aug 2006 06:50:58 -0700, "Xah Lee" <···@xahlee.org> wrote:
>
> >if your one-way function returns a 10 digit number, the probability of
> >it being unique is no better than a random 10-digit generating
> >function.
>
> No, that's not true, precisely because the one-way hash is _not_ random.
>
> To see this, consider UUIDs as a concrete example. A UUID is a
> deterministically generated 128-bit number. Two machines are guaranteed
> never to generate the same UUID because incorporated into the UUID is
> the MAC address of (one of) the machine's NIC, which is a unique 48-bit
> number.* Furthermore, a single machine is guaranteed never to generate
> the same UUID because the current time is also incorporated into the
> UUID.**
>
> So, a UUID would serve perfectly as our universally unique identifier,
> except for one thing: It contains human-decipherable information that
> can be used for a variety of nefarious purposes.
>
> Enter the one-way hash function. What if we had a perfect hash function,
> one that maps any 128-bit number to another 128-bit (or larger) number
> in such a way that no two input numbers map to the same output number?
> If the hash function sufficiently scrambles the input bits, it will
> obliterate the human-decipherable information. And since the function is
> one-way, there is no means, apart from brute force trial and error, to
> recover the information.
>
> So, if we start with two UUIDs, which are guaranteed to be different
> because of the nature of the generation algorithm, and apply our perfect
> hash function to them, we will end up with two identifiers which are
> also guaranteed to be unique (because the hash is perfect), and in which
> the human-decipherable information has been obliterated. Exactly what we
> need.
>
> Well, there is one little problem.... We don't have such a perfect
> one-way hash function. We do have perfect hash functions that aren't one
> way (which does us no good at all), and we also have some hash functions
> that are one-way, but not perfect. MD5 and SHA-1 fall into this latter
> category, and while they're not perfect, they're pretty darn close. They
> can't absolutely guarantee that no collisions will occur, but the
> likelihood of a collision is very, very small.
>
> The nature of the hash algorithm in both cases takes advantage of the
> fact that although the pool of potential UUIDs contains 2^128 members,
> only a small fraction of the pool is ever actually tapped. Using this,
> and furthermore ensuring that input numbers that are "close" to each
> other tend to hash to values that are "far" from each other, we
> significantly reduce the likelihood that two input values will hash to
> the same value.
>
> Steve Schafer
> Fenestra Technologies Corp
> http://www.fenestra.com/
>
> *Excluding for the moment the apparent existence of NICs with MAC
> addresses that are specified in flash memory.
>
> **Excluding for the moment a potential weakness in the UUID algorithm
> that could lead to duplicate UUIDs in the face of a reset of the system
> clock.
From: Tim X
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <87ejvpgdda.fsf@tiger.rapttech.com.au>
Xah,

I feel the need to congratulate you. Recently, there was considerable
debate on these newsgroups regarding complaints to your ISP based on
the argument that your posts were largely spam/flame bait etc. 

I will admit being one of the ones that thought your posts were
essentially spam designed to create flame wars and were usually
cross-posted to too many groups, many of which were inappropriate. I
didn't go so far as to formally complain to your ISP, but could
understand why some did recommend doing so. 

I believe this most recent post is possibly the most appropriate and
reasonable post I've ever seen from you. You appear to have done your
research, your points are quite well argued/presented and I think you
raise some valid points that many would find interesting and useful,
particuarly in gnu.emacs.help. As a result, I observe you got a polite and
probably helpful suggestion from Barry. I for one can say that if you
can continue to post relevant and well reasoned/researched messages, I
would encourage you to continue participating in usernet discussions. 

One of the issues I often had with your posts in the past was that
while you sometimes raised issues which were to some extent valid and
sometimes even justified, your proposed alternatives/solutions were
often based on mis-information, poor understanding of the topic and
showed little effort to understand or research the problem. In many
ways, they had the appearance of the frustrated ramblings of someone
venting over some technical issue which they didn't understand or felt
was counter intuitive. This was often combined with proposed
changes/alternatives which showed lack of any real understanding of
the issue. This was regularly expressed in offensive and poorly
structured ways which failed to recognise that sometimes the
underlying rationale for why something is the way it is may not be
obvious initially and sometimes it is better to reserve judgement
until sufficient time has been spent using the system to possibly
understand why certain decisions have been made..

I only have two points/criticisms, which I hope will be seen as
constructive comments. 

1. Your reference to the moronic date formats of some locales is a bit
   harsh. While I do find the US year-month-day format sensible, I
   think its important to recognise that formats such as
   day-month-year only seem as moronic because of the advent of
   computers and electronic representations. From a computer
   perspective, year-month-day is great because things get sorted in
   date order so easily. However, prior to this, the differences were
   much more a matter of cultural taste with little practicle
   difference. 

   I guess the point I wanted to make is that when judging existing
   differences such as date formats, its important to consider the
   historical context before simply labelling something moronic. There
   are many things which have a good historical justification that may
   no longer be relevant, but none the less, have more justification
   than just being a stupid decision. I think its a bit arrogant to
   judge something like a date format as being moronic based on
   current knowledge and technology and the benefits of hindsight.  

2. My second criticism is that while your posts seem to be more
   "on-topic" and posted to roups with slightly more relevance, you
   are still missing the mark a bit. Your post concerns emacs lisp,
   which has nothing to do with scheme. Therefore, I think it would
   have been sufficient to post just to the emacs group, though
   strictly speaking, its not totally "off-topic" for comp.lang.lisp,
   despite the stronger Common Lisp orientation of the group. I can't
   see the justification for posting to comp.lang.scheme. I also
   notice you have posted to comp.emacs - I think the more
   common/preferred group for emacs is gnu.emacs.help. I seem to
   remember seeing somewhere that comp.emacs was to be depricated in
   favor of various gnu.emacs.* groups. 

I look forward to seeing more reasoned questions and issues that are
presented in such a way as to encourage intelligent debate with a genuine
desire to increase understanding and improve things. 

Tim

-- 
tcross (at) rapttech dot com dot au
From: Chris F.A. Johnson
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <6iquq3-gi.ln1@xword.teksavvy.com>
On 2006-08-10, Tim X wrote:
>
> 1. Your reference to the moronic date formats of some locales is a bit
>    harsh. While I do find the US year-month-day format sensible,

    That is not the usual US format. The usual US format is
    month/day/year; that I _do_ consider moronic.

>    I think its important to recognise that formats such as
>    day-month-year only seem as moronic because of the advent of
>    computers and electronic representations. From a computer
>    perspective, year-month-day is great because things get sorted in
>    date order so easily. However, prior to this, the differences
>    were much more a matter of cultural taste with little practicle
>    difference.

    The day-month-year format is logically consistent, and is in
    disfavour not only because year-month-day is more computer
    friendly, but because it is ambiguous (is 04/03/2006 the 3rd of
    April or the 4th of March?).


-- 
   Chris F.A. Johnson, author   |    <http://cfaj.freeshell.org>
   Shell Scripting Recipes:     |  My code in this post, if any,
   A Problem-Solution Approach  |         is released under the
   2005, Apress                 |    GNU General Public Licence
From: Dan Sommers
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <m2hd0id9of.fsf@unique.fqdn>
On Fri, 11 Aug 2006 13:15:18 -0400,
"Chris F.A. Johnson" <··········@gmail.com> wrote:

>     That is not the usual US format. The usual US format is
>     month/day/year; that I _do_ consider moronic.

Agreed.  However....

>     The day-month-year format is logically consistent, and is in
>     disfavour not only because year-month-day is more computer
>     friendly, but because it is ambiguous (is 04/03/2006 the 3rd of
>     April or the 4th of March?).

*Any* format that renders the month and the day of the month *both* as
two-digit numbers is ambigious for ninety days every year.  For example,
is 08/2006/11 day/year/month or month/year/day?

Regards,
Dan

-- 
Dan Sommers
<http://www.tombstonezero.net/dan/>
"I wish people would die in alphabetical order." -- My wife, the genealogist
From: M Jared Finder
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <dO-dne-QfNjq70DZnZ2dnUVZ_qadnZ2d@speakeasy.net>
Dan Sommers wrote:
> On Fri, 11 Aug 2006 13:15:18 -0400,
> "Chris F.A. Johnson" <··········@gmail.com> wrote:
> 
>>     That is not the usual US format. The usual US format is
>>     month/day/year; that I _do_ consider moronic.
> 
> Agreed.  However....
> 
>>     The day-month-year format is logically consistent, and is in
>>     disfavour not only because year-month-day is more computer
>>     friendly, but because it is ambiguous (is 04/03/2006 the 3rd of
>>     April or the 4th of March?).
> 
> *Any* format that renders the month and the day of the month *both* as
> two-digit numbers is ambigious for ninety days every year.  For example,
> is 08/2006/11 day/year/month or month/year/day?

Month/year/day.  And not just because 08/2006/11 is today. ;)  Here's why:

If you look at just the first two numbers, you know it's going to be 
either month/year or day/year, and you can decide between them by 
looking at each from a human perspective.  That shouldn't be too hard; 
you're a human, right?  Month/year is more something a human would 
specify than day/year; it's more continuous.

Incidentally, month/day/year gets lots of unnecessary flak; it's 
actually an extremely well designed (for humans!) way to specify a date. 
  On a day to day basis, the current year is usually irrelevant; it 
changes so slowly.  And low and behold, the year is the last piece of 
information.  When scheduling for the future, the month is much more 
important than the day; you can make a good educated guess about the 
weather and the temperature by just the month, so it should be first. 
Month/year/day makes perfect sense -- for humans.  It's just total shit 
for computers.

   -- MJF
From: David Combs
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <ecptbp$ihn$1@reader2.panix.com>
In article <································@speakeasy.net>,
M Jared Finder  <·····@hpalace.com> wrote:
>Dan Sommers wrote:
>> On Fri, 11 Aug 2006 13:15:18 -0400,
>> "Chris F.A. Johnson" <··········@gmail.com> wrote:
>> 
>>>     That is not the usual US format. The usual US format is
>>>     month/day/year; that I _do_ consider moronic.
>> 
>> Agreed.  However....
>> 
>>>     The day-month-year format is logically consistent, and is in
>>>     disfavour not only because year-month-day is more computer
>>>     friendly, but because it is ambiguous (is 04/03/2006 the 3rd of
>>>     April or the 4th of March?).
>> 
>> *Any* format that renders the month and the day of the month *both* as
>> two-digit numbers is ambigious for ninety days every year.  For example,
>> is 08/2006/11 day/year/month or month/year/day?
>
>Month/year/day.  And not just because 08/2006/11 is today. ;)  Here's why:
>
>If you look at just the first two numbers, you know it's going to be 
>either month/year or day/year, and you can decide between them by 
>looking at each from a human perspective.  That shouldn't be too hard; 
>you're a human, right?  Month/year is more something a human would 
>specify than day/year; it's more continuous.
>
>Incidentally, month/day/year gets lots of unnecessary flak; it's 
>actually an extremely well designed (for humans!) way to specify a date. 
>  On a day to day basis, the current year is usually irrelevant; it 
>changes so slowly.  And low and behold, the year is the last piece of 
>information.  When scheduling for the future, the month is much more 
>important than the day; you can make a good educated guess about the 
>weather and the temperature by just the month, so it should be first. 
>Month/year/day makes perfect sense -- for humans.  It's just total shit 
>for computers.
>
>   -- MJF

I like the way we used in the army:  26aug06

  Say it aloud and it has only one parsing.

David
From: Tim X
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <878xlak6l5.fsf@tiger.rapttech.com.au>
·······@panix.com (David Combs) writes:

> In article <································@speakeasy.net>,
> M Jared Finder  <·····@hpalace.com> wrote:
>>Dan Sommers wrote:
>>> On Fri, 11 Aug 2006 13:15:18 -0400,
>>> "Chris F.A. Johnson" <··········@gmail.com> wrote:
>>> 
>>>>     That is not the usual US format. The usual US format is
>>>>     month/day/year; that I _do_ consider moronic.
>>> 
>>> Agreed.  However....
>>> 
>>>>     The day-month-year format is logically consistent, and is in
>>>>     disfavour not only because year-month-day is more computer
>>>>     friendly, but because it is ambiguous (is 04/03/2006 the 3rd of
>>>>     April or the 4th of March?).
>>> 
>>> *Any* format that renders the month and the day of the month *both* as
>>> two-digit numbers is ambigious for ninety days every year.  For example,
>>> is 08/2006/11 day/year/month or month/year/day?
>>
>>Month/year/day.  And not just because 08/2006/11 is today. ;)  Here's why:
>>
>>If you look at just the first two numbers, you know it's going to be 
>>either month/year or day/year, and you can decide between them by 
>>looking at each from a human perspective.  That shouldn't be too hard; 
>>you're a human, right?  Month/year is more something a human would 
>>specify than day/year; it's more continuous.
>>
>>Incidentally, month/day/year gets lots of unnecessary flak; it's 
>>actually an extremely well designed (for humans!) way to specify a date. 
>>  On a day to day basis, the current year is usually irrelevant; it 
>>changes so slowly.  And low and behold, the year is the last piece of 
>>information.  When scheduling for the future, the month is much more 
>>important than the day; you can make a good educated guess about the 
>>weather and the temperature by just the month, so it should be first. 
>>Month/year/day makes perfect sense -- for humans.  It's just total shit 
>>for computers.
>>
>>   -- MJF
>
> I like the way we used in the army:  26aug06
>
>   Say it aloud and it has only one parsing.
>

Would that one way of parsing be 26th August 1906 or 26 August 2006?

Sorry, I just couldn't resist, but in principal I tend to agree, for
the sake of one additional character i.e. 3 for month name, you do
take most of the ambiguity out. Add another two digits for year and I
doubt anyone will get confused. 

Tim


-- 
tcross (at) rapttech dot com dot au
From: Don Geddis
Subject: Re: elisp: format-time-string's %z problem
Date: 
Message-ID: <87ejvleq0l.fsf@geddis.org>
Dan Sommers <··@privacy.net> wrote on Fri, 11 Aug 2006:
> *Any* format that renders the month and the day of the month *both* as
> two-digit numbers is ambigious for ninety days every year.

In theory, but not in practice.

In the YYYY-MM-DD format, is there anyone, anywhere, that regularly
uses (or confuses) this with some hypothetical YYYY-DD-MM?

> For example, is 08/2006/11 day/year/month or month/year/day?

But 2006-08-11 is highly unlikely to be interpreted as November 8 by anybody.
Only computers put years first, and they do everything else in order of
significant bits.

        -- Don
_______________________________________________________________________________
Don Geddis                  http://don.geddis.org/               ···@geddis.org
I feel that deep down inside, my friend Tom has a lot of anger, because the
last time I saw him he was chasing me.  -- Deep Thoughts, by Jack Handey [1999]