From: Tamas K Papp
Subject: checking for duplicate elements
Date: 
Message-ID: <75ol55F1955c5U1@mid.individual.net>
Is there a function in the CL standard that I could use to check if a
list has duplicate elements?  I could not find any, but my brain is
small and the standard is large.  I could compare the length of the
list to (remove-duplicates list), but that is kind of lame.

It is of course trivial to implement it:

(defun has-duplicates-p (list &optional (test #'equal))
  "Return the first duplicated element, nil if all elements are unique."
  (let (unique-elements)
    (dolist (elt list)
      (let ((found (find elt unique-elements :test test)))
	(if found
	    (return-from has-duplicates-p found)
	    (push elt unique-elements))))
    nil))

I was just curious if I missed anything in the standard.

Tamas

From: Pillsy
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <3adfcfdb-7881-411e-8e4a-f6729e76a5e0@y10g2000prc.googlegroups.com>
On Apr 28, 11:20 am, Tamas K Papp <······@gmail.com> wrote:
[...]
> It is of course trivial to implement it:
>
> (defun has-duplicates-p (list &optional (test #'equal))
>   "Return the first duplicated element, nil if all elements are unique."
>   (let (unique-elements)
>     (dolist (elt list)
>       (let ((found (find elt unique-elements :test test)))
>         (if found
>             (return-from has-duplicates-p found)
>             (push elt unique-elements))))
>     nil))

This reminds me of a related question. Ordinarily I'd write this
function as

(defun has-duplicates-p (list &optional (test #'equal))
  (let ((unique-elements (make-hash-table :test test)))
    (dolist (elt list)
      (multiple-value-bind (found presentp)
          (gethash elt unique-elements)
        (if presentp
            (return-from has-duplicates-p (values found t))
            (setf (gethash elt unique-elements) elt))))
    (values nil nil)))

Using a hash table to test for uniqueness is a good way to avoid using
an O(N^2) algorithm when an O(N) algorithm suffices, but it limits you
to one of four posible tests, at least in portable CL. This means it's
not the right solution for any number of situations. What's the best
way to get around this limitation?

Cheers,
Pillsy
From: TomSW
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <1f6c36cc-c85c-43a5-b925-90b91c622031@j18g2000prm.googlegroups.com>
On Apr 28, 5:48 pm, Pillsy <·········@gmail.com> wrote:
> Using a hash table to test for uniqueness is a good way to avoid using
> an O(N^2) algorithm when an O(N) algorithm suffices,

if you ignore the cost of the hash table then O(n) is actually the
worst case since you can return as soon as the first duplicate is
found.

(defun has-duplicates-p (list &key key (test #'eql))
  (flet ((keyed (item) (if key
                           (funcall key item)
                           key)))
    (let ((hash (make-hash-table :test test)))
      (dolist (item list)
        (when (> (incf (gethash (keyed item) hash 0))
                 1)
          (return t))))))

cheers,
Tom SW
From: Marco Antoniotti
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <b3209081-0eb8-49d7-9fe0-450e5c800b54@s38g2000prg.googlegroups.com>
On Apr 28, 6:28 pm, TomSW <·············@gmail.com> wrote:
> On Apr 28, 5:48 pm, Pillsy <·········@gmail.com> wrote:
>
> > Using a hash table to test for uniqueness is a good way to avoid using
> > an O(N^2) algorithm when an O(N) algorithm suffices,
>
> if you ignore the cost of the hash table then O(n) is actually the
> worst case since you can return as soon as the first duplicate is
> found.
>
> (defun has-duplicates-p (list &key key (test #'eql))
>   (flet ((keyed (item) (if key
>                            (funcall key item)
>                            key)))
>     (let ((hash (make-hash-table :test test)))
>       (dolist (item list)
>         (when (> (incf (gethash (keyed item) hash 0))
>                  1)
>           (return t))))))
>

Nice.  Nitpicking...

(defun has-duplicates-p (list &key (key #'identity) (test #'eql))
   (let ((ht (make-hash-table :test test)))
      (dolist (e list)
         (when (> (incf (gethash (funcall key e) ht 0)) 1)
            (return t)))))

Of course we are waiting for the Ruby version... :)

Cheers
--
Marco
ELS 2009 in Milan, Italy, www.european-lisp-symposium.org
From: Joshua Taylor
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <1db2eea9-74b8-47c6-880f-15a41180c2d2@t36g2000prt.googlegroups.com>
On Apr 28, 12:39 pm, Marco Antoniotti <·······@gmail.com> wrote:
> On Apr 28, 6:28 pm, TomSW <·············@gmail.com> wrote:
>
>
>
> > On Apr 28, 5:48 pm, Pillsy <·········@gmail.com> wrote:
>
> > > Using a hash table to test for uniqueness is a good way to avoid using
> > > an O(N^2) algorithm when an O(N) algorithm suffices,
>
> > if you ignore the cost of the hash table then O(n) is actually the
> > worst case since you can return as soon as the first duplicate is
> > found.
>
> > (defun has-duplicates-p (list &key key (test #'eql))
> >   (flet ((keyed (item) (if key
> >                            (funcall key item)
> >                            key)))
> >     (let ((hash (make-hash-table :test test)))
> >       (dolist (item list)
> >         (when (> (incf (gethash (keyed item) hash 0))
> >                  1)
> >           (return t))))))
>
> Nice.  Nitpicking...
>
> (defun has-duplicates-p (list &key (key #'identity) (test #'eql))
>    (let ((ht (make-hash-table :test test)))
>       (dolist (e list)
>          (when (> (incf (gethash (funcall key e) ht 0)) 1)
>             (return t)))))
>
> Of course we are waiting for the Ruby version... :)
>
> Cheers
> --
> Marco

The original, though using a bit more code, aligns better with the CL
library functions that can accept NIL as a key. E.g.,

CL-USER 2 > (remove-duplicates '(1 2 3 2 1) :key nil)
=> (3 2 1)

//JT
From: sross
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <e8df70fd-c38d-44f7-9788-fa7384c9261d@s38g2000prg.googlegroups.com>
On Apr 28, 5:28 pm, TomSW <·············@gmail.com> wrote:
> On Apr 28, 5:48 pm, Pillsy <·········@gmail.com> wrote:
>
> > Using a hash table to test for uniqueness is a good way to avoid using
> > an O(N^2) algorithm when an O(N) algorithm suffices,
>
> if you ignore the cost of the hash table then O(n) is actually the
> worst case since you can return as soon as the first duplicate is
> found.
>
> (defun has-duplicates-p (list &key key (test #'eql))
>   (flet ((keyed (item) (if key
>                            (funcall key item)
>                            key)))
>     (let ((hash (make-hash-table :test test)))
>       (dolist (item list)
>         (when (> (incf (gethash (keyed item) hash 0))
>                  1)
>           (return t))))))
>
> cheers,
> Tom SW

or, without using a hash-table

(defun has-duplicates-p (list &key (test #'eql) key)
  (loop :for rest :on list :by #'cdr
        :thereis (member (if key (funcall key (car rest)) (car rest))
                         (cdr rest) :test test :key key)))


- sean
From: Thomas A. Russ
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <ymiljpkcrnu.fsf@blackcat.isi.edu>
sross <······@gmail.com> writes:

> On Apr 28, 5:28��pm, TomSW <·············@gmail.com> wrote:
> > On Apr 28, 5:48��pm, Pillsy <·········@gmail.com> wrote:
> >
> > > Using a hash table to test for uniqueness is a good way to avoid using
> > > an O(N^2) algorithm when an O(N) algorithm suffices,
> >
> > if you ignore the cost of the hash table then O(n) is actually the
> > worst case since you can return as soon as the first duplicate is
> > found.
...
> or, without using a hash-table
> 
> (defun has-duplicates-p (list &key (test #'eql) key)
>   (loop :for rest :on list :by #'cdr
>         :thereis (member (if key (funcall key (car rest)) (car rest))
>                          (cdr rest) :test test :key key)))

But this is an O(n^2) algorithm.

For each element in the list, you call MEMBER, repeatedly on the
remaining list, which makes this O(n^2).



-- 
Thomas A. Russ,  USC/Information Sciences Institute
From: budden
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <82290cce-d242-4006-8f66-56d70d0410f6@d25g2000prn.googlegroups.com>
>What's the best
>way to get around this limitation?
Items can be sortable.
Or they can be canonicalizable.
I think there is no general
best solution.
From: Tamas K Papp
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <75opn0F18v1a2U1@mid.individual.net>
On Tue, 28 Apr 2009 08:48:53 -0700, Pillsy wrote:

> On Apr 28, 11:20 am, Tamas K Papp <······@gmail.com> wrote: [...]
>> It is of course trivial to implement it:
>>
>> (defun has-duplicates-p (list &optional (test #'equal))
>>   "Return the first duplicated element, nil if all elements are
>>   unique." (let (unique-elements)
>>     (dolist (elt list)
>>       (let ((found (find elt unique-elements :test test)))
>>         (if found
>>             (return-from has-duplicates-p found)
>>             (push elt unique-elements))))
>>     nil))
> 
> This reminds me of a related question. Ordinarily I'd write this
> function as
> 
> (defun has-duplicates-p (list &optional (test #'equal))
>   (let ((unique-elements (make-hash-table :test test)))
>     (dolist (elt list)
>       (multiple-value-bind (found presentp)
>           (gethash elt unique-elements)
>         (if presentp
>             (return-from has-duplicates-p (values found t)) (setf
>             (gethash elt unique-elements) elt))))
>     (values nil nil)))
> 
> Using a hash table to test for uniqueness is a good way to avoid using
> an O(N^2) algorithm when an O(N) algorithm suffices, but it limits you
> to one of four posible tests, at least in portable CL. This means it's
> not the right solution for any number of situations. What's the best way
> to get around this limitation?

I love hash tables as much as the next man (see cl-sparsematrix), but
here I know that I am not going to deal with more than 20 elements so
they would be overkill (for me).

That said, I think that what you need for the general case is a
generic hash table implementation.  There is genhash
(http://www.cliki.net/genhash), but I have never looked at it, I would
be interested in how it works if anyone has first-hand experience with
it.

Tamas
From: GP lisper
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <slrngven8p.7oj.spambait@phoenix.clouddancer.com>
On 28 Apr 2009 16:37:53 GMT, <······@gmail.com> wrote:
>
> I love hash tables as much as the next man (see cl-sparsematrix), but
> here I know that I am not going to deal with more than 20 elements so
> they would be overkill (for me).

The actual break-even point is implementation dependent, and can be as
low as 8.  There was a comparison post in c.l.l that I recall.

BTW, if I want negative integer indices to an array, I can't get those?
[without a bunch of spagetti translating indices]
So I use a hashtable then?


-- 
Lisp:  Powering `Impossible Thoughts since 1958
From: Pascal J. Bourguignon
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <87hc08a011.fsf@galatea.local>
GP lisper <········@CloudDancer.com> writes:

> On 28 Apr 2009 16:37:53 GMT, <······@gmail.com> wrote:
>>
>> I love hash tables as much as the next man (see cl-sparsematrix), but
>> here I know that I am not going to deal with more than 20 elements so
>> they would be overkill (for me).
>
> The actual break-even point is implementation dependent, and can be as
> low as 8.  There was a comparison post in c.l.l that I recall.
>
> BTW, if I want negative integer indices to an array, I can't get those?
> [without a bunch of spagetti translating indices]

What spaghetti?

(let ((a (make-based-array '((-4 4) (-20 -5)))))
  (setf (bref a -4 -10) (+ 5 (bref a 4 -9))))


> So I use a hashtable then?

It's harder to use hashtables with several indices. You may use lists
or vectors as keys, but implementations may be not so good at hashing
them.


-- 
__Pascal Bourguignon__
From: Thomas A. Russ
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <ymi8wlkcr1l.fsf@blackcat.isi.edu>
···@informatimago.com (Pascal J. Bourguignon) writes:

> GP lisper <········@CloudDancer.com> writes:
> 
> > BTW, if I want negative integer indices to an array, I can't get those?
> > [without a bunch of spagetti translating indices]
> 
> What spaghetti?
> 
> (let ((a (make-based-array '((-4 4) (-20 -5)))))
>   (setf (bref a -4 -10) (+ 5 (bref a 4 -9))))
> 
> 
> > So I use a hashtable then?
> 
> It's harder to use hashtables with several indices. You may use lists
> or vectors as keys, but implementations may be not so good at hashing
> them.

Or you do something similar to your array solution and create nested hash
tables.

(let ((nht (make-hash-table)))
   (setf (get-nested-hash '(I B M) nht) "International Business Machines"))

with an implementation of GET-NESTED-HASH that looks up values based on
a hash-table containing hash-tables until the key length is used up.
This would work well with fixed key lengths.  I think it would be
possible to implement it for variable key lengths as well.

Lists would be relatively easy and efficient to use as keys.  I suspect
that using vectors would be more cumbersome.  One might also be able to
use sparse arrays if all of the indices are numeric.

-- 
Thomas A. Russ,  USC/Information Sciences Institute
From: GP lisper
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <slrngvfe1u.957.spambait@phoenix.clouddancer.com>
On 28 Apr 2009 16:11:18 -0700, <···@sevak.isi.edu> wrote:
> ···@informatimago.com (Pascal J. Bourguignon) writes:
>
>> GP lisper <········@CloudDancer.com> writes:
>> 
>> > BTW, if I want negative integer indices to an array, I can't get those?
>> > [without a bunch of spagetti translating indices]
>> 
>> What spaghetti?
>> 
>> (let ((a (make-based-array '((-4 4) (-20 -5)))))
>>   (setf (bref a -4 -10) (+ 5 (bref a 4 -9))))

Hmm, so I hide all the pasta in this make-based-array gizmo, with it's
associated bref.  Is there a URL for a package?


>> > So I use a hashtable then?
>> 
>> It's harder to use hashtables with several indices. You may use lists
>> or vectors as keys, but implementations may be not so good at hashing
>> them.

Nope.  Just concatenate the indices into a string, with leading +/- signs.

(setf (gethash "-4-10" a)  (+ 5 (gethash "+4-9" a)))


> Or you do something similar to your array solution and create nested hash
> tables.
>
> (let ((nht (make-hash-table)))
>    (setf (get-nested-hash '(I B M) nht) "International Business Machines"))
>
> with an implementation of GET-NESTED-HASH that looks up values based on
> a hash-table containing hash-tables until the key length is used up.
> This would work well with fixed key lengths.  I think it would be
> possible to implement it for variable key lengths as well.

I do that now, and refer to it as chinese box HTs.  Works faster than
SQL at least.  Internal HTs have keys with impossible values to avoid
collision, generally simple numbers 1 2 3 work fine.


-- 
Lisp:  Powering `Impossible Thoughts since 1958
From: Slobodan Blazeski
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <ee9ce5ac-10a9-400b-b692-585de9f152d5@w31g2000prd.googlegroups.com>
On Apr 28, 5:20 pm, Tamas K Papp <······@gmail.com> wrote:
> Is there a function in the CL standard that I could use to check if a
> list has duplicate elements?  I could not find any, but my brain is
> small and the standard is large.  I could compare the length of the
> list to (remove-duplicates list), but that is kind of lame.
>
> It is of course trivial to implement it:
>
> (defun has-duplicates-p (list &optional (test #'equal))
>   "Return the first duplicated element, nil if all elements are unique."
>   (let (unique-elements)
>     (dolist (elt list)
>       (let ((found (find elt unique-elements :test test)))
>         (if found
>             (return-from has-duplicates-p found)
>             (push elt unique-elements))))
>     nil))
>
> I was just curious if I missed anything in the standard.
>
> Tamas


(has-duplicates-p '(1  21 1))
1
(has-duplicates-p '(1  21 12))
NIL
You have a small bug if there is a duplicate NIL.
(has-duplicates-p '(1  nil nil))
NIL
Either return two values or just t.

(defun hdp (list)
  (dolist (e list)
   (if (member e list)
      (return-from hdp t))))

bobi
From: Tamas K Papp
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <75oo00F1955c5U2@mid.individual.net>
On Tue, 28 Apr 2009 08:54:25 -0700, Slobodan Blazeski wrote:

> On Apr 28, 5:20 pm, Tamas K Papp <······@gmail.com> wrote:
>> Is there a function in the CL standard that I could use to check if a
>> list has duplicate elements?  I could not find any, but my brain is
>> small and the standard is large.  I could compare the length of the
>> list to (remove-duplicates list), but that is kind of lame.
>>
>> It is of course trivial to implement it:
>>
>> (defun has-duplicates-p (list &optional (test #'equal))
>>   "Return the first duplicated element, nil if all elements are
>>   unique." (let (unique-elements)
>>     (dolist (elt list)
>>       (let ((found (find elt unique-elements :test test)))
>>         (if found
>>             (return-from has-duplicates-p found)
>>             (push elt unique-elements))))
>>     nil))
>>
>> I was just curious if I missed anything in the standard.
>>
>> Tamas
> 
> 
> (has-duplicates-p '(1  21 1))
> 1
> (has-duplicates-p '(1  21 12))
> NIL
> You have a small bug if there is a duplicate NIL. (has-duplicates-p '(1 
> nil nil))
> NIL
> Either return two values or just t.

Thanks.

> (defun hdp (list)
>   (dolist (e list)
>    (if (member e list)
>       (return-from hdp t))))

I think this always returns t.

Tamas
From: smallpond
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <09916df8-abef-4afa-a5f9-485f6d911b6c@z8g2000prd.googlegroups.com>
On Apr 28, 12:08 pm, Tamas K Papp <······@gmail.com> wrote:
> On Tue, 28 Apr 2009 08:54:25 -0700, Slobodan Blazeski wrote:
> > On Apr 28, 5:20 pm, Tamas K Papp <······@gmail.com> wrote:
> >> Is there a function in the CL standard that I could use to check if a
> >> list has duplicate elements?  I could not find any, but my brain is
> >> small and the standard is large.  I could compare the length of the
> >> list to (remove-duplicates list), but that is kind of lame.
>
> >> It is of course trivial to implement it:
>
> >> (defun has-duplicates-p (list &optional (test #'equal))
> >>   "Return the first duplicated element, nil if all elements are
> >>   unique." (let (unique-elements)
> >>     (dolist (elt list)
> >>       (let ((found (find elt unique-elements :test test)))
> >>         (if found
> >>             (return-from has-duplicates-p found)
> >>             (push elt unique-elements))))
> >>     nil))
>
> >> I was just curious if I missed anything in the standard.
>
> >> Tamas
>
> > (has-duplicates-p '(1  21 1))
> > 1
> > (has-duplicates-p '(1  21 12))
> > NIL
> > You have a small bug if there is a duplicate NIL. (has-duplicates-p '(1
> > nil nil))
> > NIL
> > Either return two values or just t.
>
> Thanks.
>
> > (defun hdp (list)
> >   (dolist (e list)
> >    (if (member e list)
> >       (return-from hdp t))))
>
> I think this always returns t.
>
> Tamas


(hdp ())
NIL

Better might be:
(defun hdp (list)
  (dolist (e list)
    (if (member e (cdr (member e list)))
      (return-from hdp t))))
From: Slobodan Blazeski
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <28f1cd4d-fea3-4df6-8d29-2cab4beaf5dd@w31g2000prd.googlegroups.com>
On Apr 28, 7:14 pm, smallpond <·········@juno.com> wrote:
> On Apr 28, 12:08 pm, Tamas K Papp <······@gmail.com> wrote:
>
>
>
>
>
> > On Tue, 28 Apr 2009 08:54:25 -0700, Slobodan Blazeski wrote:
> > > On Apr 28, 5:20 pm, Tamas K Papp <······@gmail.com> wrote:
> > >> Is there a function in the CL standard that I could use to check if a
> > >> list has duplicate elements?  I could not find any, but my brain is
> > >> small and the standard is large.  I could compare the length of the
> > >> list to (remove-duplicates list), but that is kind of lame.
>
> > >> It is of course trivial to implement it:
>
> > >> (defun has-duplicates-p (list &optional (test #'equal))
> > >>   "Return the first duplicated element, nil if all elements are
> > >>   unique." (let (unique-elements)
> > >>     (dolist (elt list)
> > >>       (let ((found (find elt unique-elements :test test)))
> > >>         (if found
> > >>             (return-from has-duplicates-p found)
> > >>             (push elt unique-elements))))
> > >>     nil))
>
> > >> I was just curious if I missed anything in the standard.
>
> > >> Tamas
>
> > > (has-duplicates-p '(1  21 1))
> > > 1
> > > (has-duplicates-p '(1  21 12))
> > > NIL
> > > You have a small bug if there is a duplicate NIL. (has-duplicates-p '(1
> > > nil nil))
> > > NIL
> > > Either return two values or just t.
>
> > Thanks.
>
> > > (defun hdp (list)
> > >   (dolist (e list)
> > >    (if (member e list)
> > >       (return-from hdp t))))
>
> > I think this always returns t.
>
> > Tamas
>
> (hdp ())
> NIL
>
> Better might be:
> (defun hdp (list)
>   (dolist (e list)
>     (if (member e (cdr (member e list)))
>       (return-from hdp t))))

Cool functional solution, but inefficient.
(defun hdp (list)
  (dolist (e list)
    (if (member e (cdr list))
       (return-from hdp t)
       (pop list))))
Call member of ever smaller lisp. And finally recursion
(defun hdp3 (list)
  (if (member (car list) (cdr list)) t
     (and (cdr list) (hdp2 (cdr list)))))
cheers
Bobi
http://slobodanblazeski.blogspot.com/
From: namekuseijin
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <66619689-5f9d-4856-aecd-7169f64fc1e9@r31g2000prh.googlegroups.com>
Mucking around with many different implementation is all much fun, no
doubt, but this thread reminded of this recent one:

http://groups.google.com/group/comp.lang.python/browse_thread/thread/68bf9c5bae807545#

:)

On Apr 28, 4:33 pm, Slobodan Blazeski <·················@gmail.com>
wrote:
> On Apr 28, 7:14 pm, smallpond <·········@juno.com> wrote:
>
>
>
> > On Apr 28, 12:08 pm, Tamas K Papp <······@gmail.com> wrote:
>
> > > On Tue, 28 Apr 2009 08:54:25 -0700, Slobodan Blazeski wrote:
> > > > On Apr 28, 5:20 pm, Tamas K Papp <······@gmail.com> wrote:
> > > >> Is there a function in the CL standard that I could use to check if a
> > > >> list has duplicate elements?  I could not find any, but my brain is
> > > >> small and the standard is large.  I could compare the length of the
> > > >> list to (remove-duplicates list), but that is kind of lame.
>
> > > >> It is of course trivial to implement it:
>
> > > >> (defun has-duplicates-p (list &optional (test #'equal))
> > > >>   "Return the first duplicated element, nil if all elements are
> > > >>   unique." (let (unique-elements)
> > > >>     (dolist (elt list)
> > > >>       (let ((found (find elt unique-elements :test test)))
> > > >>         (if found
> > > >>             (return-from has-duplicates-p found)
> > > >>             (push elt unique-elements))))
> > > >>     nil))
>
> > > >> I was just curious if I missed anything in the standard.
>
> > > >> Tamas
>
> > > > (has-duplicates-p '(1  21 1))
> > > > 1
> > > > (has-duplicates-p '(1  21 12))
> > > > NIL
> > > > You have a small bug if there is a duplicate NIL. (has-duplicates-p '(1
> > > > nil nil))
> > > > NIL
> > > > Either return two values or just t.
>
> > > Thanks.
>
> > > > (defun hdp (list)
> > > >   (dolist (e list)
> > > >    (if (member e list)
> > > >       (return-from hdp t))))
>
> > > I think this always returns t.
>
> > > Tamas
>
> > (hdp ())
> > NIL
>
> > Better might be:
> > (defun hdp (list)
> >   (dolist (e list)
> >     (if (member e (cdr (member e list)))
> >       (return-from hdp t))))
>
> Cool functional solution, but inefficient.
> (defun hdp (list)
>   (dolist (e list)
>     (if (member e (cdr list))
>        (return-from hdp t)
>        (pop list))))
> Call member of ever smaller lisp. And finally recursion
> (defun hdp3 (list)
>   (if (member (car list) (cdr list)) t
>      (and (cdr list) (hdp2 (cdr list)))))
> cheers
> Bobihttp://slobodanblazeski.blogspot.com/
From: Slobodan Blazeski
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <e1043de3-c543-402b-acfc-88f2cc8e1a06@h11g2000yqb.googlegroups.com>
On Apr 29, 12:29 am, namekuseijin <············@gmail.com> wrote:
> Mucking around with many different implementation is all much fun, no
> doubt, but this thread reminded of this recent one:
>
> http://groups.google.com/group/comp.lang.python/browse_thread/thread/...
>
> :)

Usually when you have some functionality in the standard you just use
it unless:
a. You don't know its in the standard
b. You have some special constraints that makes standard functionality
unusable like not doing what you want, for example what if by lists
I'm thinking about double linked lists or maybe lists that are allowed
to have some elements shared but only under circumstances while the
other must be unique all the time. Or maybe it was performance issue
and someone devised very clever test that implementation doesn't know
about it. So lisp is a flexible langauge that would allow you to use
mismatch 99% of the time but still have the flexibility to switch to
something else. Users of brittle languages usually don't have a
choice, beside greenspunning.

cheers
Bobi
http://slobodanblazeski.blogspot.com/
From: Spiros Bousbouras
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <2a1b1abb-ea8f-47d6-840c-08550440e435@s16g2000vbp.googlegroups.com>
On 1 May, 17:51, Slobodan Blazeski <·················@gmail.com>
wrote:
> On Apr 29, 12:29 am, namekuseijin <············@gmail.com> wrote:
>
> > Mucking around with many different implementation is all much fun, no
> > doubt, but this thread reminded of this recent one:
>
> >http://groups.google.com/group/comp.lang.python/browse_thread/thread/...
>
> > :)
>
> Usually when you have some functionality in the standard you just use
> it unless:
> a. You don't know its in the standard
> b. You have some special constraints that makes standard functionality
> unusable like not doing what you want,
> ...........

Or
c. It's less boring/quicker to write your own solution than to
read the relevant part of the standard. For example I wonder
how many here have the read the full loop specification.
From: Thomas A. Russ
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <ymihc08cri1.fsf@blackcat.isi.edu>
smallpond <·········@juno.com> writes:

> Better might be:
> (defun hdp (list)
>   (dolist (e list)
>     (if (member e (cdr (member e list)))
>       (return-from hdp t))))

Interesting thought and an innovative use of MEMBER.

And we now have an O(n^3) algorithm in the running, so it isn't without
its drawbacks.

-- 
Thomas A. Russ,  USC/Information Sciences Institute
From: smallpond
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <e5b7c132-12f2-42bb-af1d-ae8fa9aec31c@u9g2000pre.googlegroups.com>
On Apr 28, 7:01 pm, ····@sevak.isi.edu (Thomas A. Russ) wrote:
> smallpond <·········@juno.com> writes:
> > Better might be:
> > (defun hdp (list)
> >   (dolist (e list)
> >     (if (member e (cdr (member e list)))
> >       (return-from hdp t))))
>
> Interesting thought and an innovative use of MEMBER.
>
> And we now have an O(n^3) algorithm in the running, so it isn't without
> its drawbacks.
>

Still O[n^2}.  member stops at a match and returns the rest
of the list so in worst case (no matches) it's exactly n^2
steps and no consing.
From: Thomas A. Russ
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <ymitz47q9sa.fsf@blackcat.isi.edu>
smallpond <·········@juno.com> writes:

> On Apr 28, 7:01 pm, ····@sevak.isi.edu (Thomas A. Russ) wrote:
> > smallpond <·········@juno.com> writes:
> > > Better might be:
> > > (defun hdp (list)
> > >   (dolist (e list)
> > >     (if (member e (cdr (member e list)))
> > >       (return-from hdp t))))
> >
> > Interesting thought and an innovative use of MEMBER.
> >
> > And we now have an O(n^3) algorithm in the running, so it isn't without
> > its drawbacks.
> >
> 
> Still O[n^2}.  member stops at a match and returns the rest
> of the list so in worst case (no matches) it's exactly n^2
> steps and no consing.

Ah. OK.  So it is.

What I failed to notice is that the two MEMBER calls are arranged so
that the outer one will only have to go through the list remaining after
the inner one finds the match.  And since the sum of the lengths of
those two arguments to MEMBER will always be N, there is no double
traversal.


-- 
Thomas A. Russ,  USC/Information Sciences Institute
From: Slobodan Blazeski
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <9ef2fed9-85e2-4f59-aa18-88f5859d6b1d@z16g2000prd.googlegroups.com>
On Apr 28, 6:08 pm, Tamas K Papp <······@gmail.com> wrote:
> On Tue, 28 Apr 2009 08:54:25 -0700, Slobodan Blazeski wrote:
> > On Apr 28, 5:20 pm, Tamas K Papp <······@gmail.com> wrote:
> >> Is there a function in the CL standard that I could use to check if a
> >> list has duplicate elements?  I could not find any, but my brain is
> >> small and the standard is large.  I could compare the length of the
> >> list to (remove-duplicates list), but that is kind of lame.
>
> >> It is of course trivial to implement it:
>
> >> (defun has-duplicates-p (list &optional (test #'equal))
> >>   "Return the first duplicated element, nil if all elements are
> >>   unique." (let (unique-elements)
> >>     (dolist (elt list)
> >>       (let ((found (find elt unique-elements :test test)))
> >>         (if found
> >>             (return-from has-duplicates-p found)
> >>             (push elt unique-elements))))
> >>     nil))
>
> >> I was just curious if I missed anything in the standard.
>
> >> Tamas
>
> > (has-duplicates-p '(1  21 1))
> > 1
> > (has-duplicates-p '(1  21 12))
> > NIL
> > You have a small bug if there is a duplicate NIL. (has-duplicates-p '(1
> > nil nil))
> > NIL
> > Either return two values or just t.
>
> Thanks.
>
> > (defun hdp (list)
> >   (dolist (e list)
> >    (if (member e list)
> >       (return-from hdp t))))
>
> I think this always returns t.
>
> Tamas
My mistake sorry
(defun hdp (list)
  (dolist (e list)
    (if (member e (cdr list))
       (return-from hdp t)
       (pop list))))

(hdp   '(2 1 ))
NIL
(hdp   '(2 1 2))
T

cheers
Bobi
http://slobodanblazeski.blogspot.com/
From: Thomas A. Russ
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <ymiprewd4xs.fsf@blackcat.isi.edu>
Tamas K Papp <······@gmail.com> writes:

> Is there a function in the CL standard that I could use to check if a
> list has duplicate elements?  I could not find any, but my brain is
> small and the standard is large.  I could compare the length of the
> list to (remove-duplicates list), but that is kind of lame.

Well, it would work as well as any other simple solution.
REMOVE-DUPLICATES is typically a O(n^2) algorithm, at least in most of
the lisps in which I've tried it.  But for small lists, that is
typically still faster than an O(n) algorithm with hash tables.

If you need to do this with large lists, then you are better off with
your own implementation that has some guarantees.

> It is of course trivial to implement it:
> 
> (defun has-duplicates-p (list &optional (test #'equal))
>   "Return the first duplicated element, nil if all elements are unique."
>   (let (unique-elements)
>     (dolist (elt list)
>       (let ((found (find elt unique-elements :test test)))
> 	(if found
> 	    (return-from has-duplicates-p found)
> 	    (push elt unique-elements))))
>     nil))
> 
> I was just curious if I missed anything in the standard.

Not that I'm aware of.

I note that your solution is also O(n^2).  Another simple implementation
would be:

 (defun has-dupilicates-p (list &optional (test #'eql))
   (loop for (item . rest) on list
         when (find item rest :test test) return item)
    nil)

An O(n) version would look like:

  (defun fast-has-duplicates-p (list &optional (test #'eql))
    (let ((ht (make-hash-table :test test :size (length list))))
      (dolist (item list)
        (if (gethash item ht)
            (return-from has-duplicates-p item)
            (setf (gethash item ht) t)))
      nil))


 ;; Note that the :size argument to make-hash-table is optional, but
 ;; going through the list twice to avoid re-hashing is usually a
 ;; winning strategy for large lists, where the default behavior may
 ;; cause multiple re-hashing to occur.


I wonder if FSet has anything?

-- 
Thomas A. Russ,  USC/Information Sciences Institute
From: MishoM
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <575618c4-8cec-40e8-aae2-60e63c7de820@z8g2000prd.googlegroups.com>
On Apr 28, 6:20 pm, Tamas K Papp <······@gmail.com> wrote:
> Is there a function in the CL standard that I could use to check if a
> list has duplicate elements?  I could not find any, but my brain is
> small and the standard is large.  I could compare the length of the
> list to (remove-duplicates list), but that is kind of lame.
>
> It is of course trivial to implement it:
>
> (defun has-duplicates-p (list &optional (test #'equal))
>   "Return the first duplicated element, nil if all elements are unique."
>   (let (unique-elements)
>     (dolist (elt list)
>       (let ((found (find elt unique-elements :test test)))
>         (if found
>             (return-from has-duplicates-p found)
>             (push elt unique-elements))))
>     nil))
>
> I was just curious if I missed anything in the standard.
>
> Tamas

Another solution is to copy the list, sort the copy and then look
through the sorted copy for two consecutive equal elements. This is O
(log N) but it doesn't satisfy your requirement of returning the first
duplicated element. And of course it requires O(N) extra memory. But
you said that you are dealing with small lists, so this might actually
be better than a hash table.

I'm not going to write an example as I'm still new to CL and it will
take me some time to get it right ;)
From: Thomas A. Russ
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <ymid4awcrfc.fsf@blackcat.isi.edu>
MishoM <···············@gmail.com> writes:

> Another solution is to copy the list, sort the copy and then look
> through the sorted copy for two consecutive equal elements. This is O
> (log N) but it doesn't satisfy your requirement of returning the first
> duplicated element.

Actually, a comparison-based sorting algorithm is at best O(N log N).



-- 
Thomas A. Russ,  USC/Information Sciences Institute
From: MishoM
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <e2371f6d-46cb-4ff9-a628-dac0c5a24eaf@i28g2000prd.googlegroups.com>
On Apr 29, 2:03 am, ····@sevak.isi.edu (Thomas A. Russ) wrote:
> MishoM <···············@gmail.com> writes:
> > Another solution is to copy the list, sort the copy and then look
> > through the sorted copy for two consecutive equal elements. This is O
> > (log N) but it doesn't satisfy your requirement of returning the first
> > duplicated element.
>
> Actually, a comparison-based sorting algorithm is at best O(N log N).

Yes, yes, just a typo. It was late in the evening and I was very
sleepy :)
From: Kaz Kylheku
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <20090510095201.535@gmail.com>
On 2009-04-29, MishoM <···············@gmail.com> wrote:
> On Apr 29, 2:03 am, ····@sevak.isi.edu (Thomas A. Russ) wrote:
>> MishoM <···············@gmail.com> writes:
>> > Another solution is to copy the list, sort the copy and then look
>> > through the sorted copy for two consecutive equal elements. This is O
>> > (log N) but it doesn't satisfy your requirement of returning the first
>> > duplicated element.
>>
>> Actually, a comparison-based sorting algorithm is at best O(N log N).
>
> Yes, yes, just a typo. It was late in the evening and I was very
> sleepy :)

Actually a sorting algorithm is typically at /worst/ O(N log N).  O notation
gives the upper bound on the growth of the running time: the running time grows
no faster with N than some constant times N log N.

You may be lucky and the sort may happen faster, because, say, the data is
nearly sorted already and the algorithm responds favorably to that.
From: Thomas A. Russ
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <ymiljpiq77m.fsf@blackcat.isi.edu>
Kaz Kylheku <········@gmail.com> writes:

> On 2009-04-29, MishoM <···············@gmail.com> wrote:

> Actually a sorting algorithm is typically at worst O(N log N).  O notation
> gives the upper bound on the growth of the running time: the running time grows
> no faster with N than some constant times N log N.
> 
> You may be lucky and the sort may happen faster, because, say, the data is
> nearly sorted already and the algorithm responds favorably to that.

The definition of big-O notation is noted.

On the other hand, there are well-known (naive) sorting algorithms, such
as bubble sort which are O(n^2).

I guess the most precise statement should be something like this:

  The best worst-case behavior for a comparison-based sorting algorithm
  is O(n log n).  

There exist other algorithms that have more expensive worst case
behavior.  There are also other non-comparison sorts that can have
linear performance (bin sorts, radix sorts).

-- 
Thomas A. Russ,  USC/Information Sciences Institute
From: Rupert Swarbrick
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <gt7pr7$k2g$1@news.albasani.net>
--=-=-=

MishoM <···············@gmail.com> writes:

> Another solution is to copy the list, sort the copy and then look
> through the sorted copy for two consecutive equal elements. This is O
> (log N) but it doesn't satisfy your requirement of returning the first
> duplicated element. And of course it requires O(N) extra memory. But
> you said that you are dealing with small lists, so this might actually
> be better than a hash table.

Of course, that requires a total ordering on the elements the list can
contain. This is not always possible... [1]

Rupert

[1] Well, not constructively:
http://en.wikipedia.org/wiki/Axiom_of_choice

--=-=-=
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iJwEAQECAAYFAkn3aqQACgkQRtd/pJbYVoZCyAP+NqCg/bSTIRAVMayGsjuj5CD/
/q9NkPm26m6L2td6GRJ5AbN5c94mw/2ZN6Ee/WonMFxzZ59TI67ZxVQGXns2jDIQ
X752x7OUpg9w+/pZ2ZoS2AwXP2D+leQ49p5954X4ZB+2ViB+DYPBHs0bn0Ho7F23
WlU4eoTwg7je/As3kic=
=ToOB
-----END PGP SIGNATURE-----
--=-=-=--
From: gugamilare
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <03f7ce05-8667-4671-8436-b490b22566af@r31g2000prh.googlegroups.com>
> Of course, that requires a total ordering on the elements the list can
> contain. This is not always possible... [1]
>
> Rupert
>
> [1] Well, not constructively:http://en.wikipedia.org/wiki/Axiom_of_choice

Oh, yes, the Axiom of Choice, the enemy number 1 of Computer Science.
If it weren't the existence of the Axiom of Choice, much more math
would be translatable to machine code. :P
>
>  application_pgp-signature_part
> < 1KVisualizarFazer download
From: gugamilare
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <56c85211-8346-4181-9cff-9b436fdd17ce@v1g2000prd.googlegroups.com>
On 28 abr, 18:25, gugamilare <··········@gmail.com> wrote:
> > Of course, that requires a total ordering on the elements the list can
> > contain. This is not always possible... [1]
>
> > Rupert
>
> > [1] Well, not constructively:http://en.wikipedia.org/wiki/Axiom_of_choice
>
> Oh, yes, the Axiom of Choice, the enemy number 1 of Computer Science.
> If it weren't the existence of the Axiom of Choice, much more math
> would be translatable to machine code. :P

It turns out I was wrong. The Axiom of Choice is always computable if
its arguments are computable. In fact, here is a generic comparison
algorithm (it randomly chooses an integer value to each of its
arguments and memorizes it - probably a memory-eater, but works
anyway):

(let ((id-table (make-hash-table :weakness :key))
      (id 0))
  (defun gen< (a b)
    (let ((key-a (or (gethash a id-table)
		     (setf (gethash a id-table) (incf id))))
	  (key-b (or (gethash b id-table)
		     (setf (gethash b id-table) (incf id)))))
      (< key-a key-b))))

This would make any set sortable.
From: Nicolas Neuss
Subject: [OT] Axiom of Choice [Re: checking for duplicate elements]
Date: 
Message-ID: <873abqzgq6.fsf_-_@ma-patru.mathematik.uni-karlsruhe.de>
gugamilare <··········@gmail.com> writes:

>> Of course, that requires a total ordering on the elements the list can
>> contain. This is not always possible... [1]
>>
>> Rupert
>>
>> [1] Well, not constructively:http://en.wikipedia.org/wiki/Axiom_of_choice
>
> Oh, yes, the Axiom of Choice, the enemy number 1 of Computer Science.

I see the AoC more as the enemy of any practically-minded person.  Does
anyone know a practically relevant result which needs the AoC?  I know only
of results like "every vector space has a basis" which is a quite useless
theorem because a basis is of use only when you know more about it than "it
exists".  (Note that for finite-dimensional spaces you don't need the AoC
anyhow, and that for infinite-dimensional spaces an algebraic basis
obtained by the AoC is usually not the one you need.)

Nicolas
From: Harald Hanche-Olsen
Subject: Re: [OT] Axiom of Choice [Re: checking for duplicate elements]
Date: 
Message-ID: <pcoeivap8fb.fsf@math.ntnu.no>
+ Nicolas Neuss <········@math.uni-karlsruhe.de>:

> I see the AoC more as the enemy of any practically-minded person.
> Does anyone know a practically relevant result which needs the AoC?

A lot of mathematics gets exceedingly painful to do without the axiom of
choice, including bits that are quite useful, such as central theorems
of functional analysis, which in turn are useful for quantum mechanics
and the like (at least in theory). But it is indeed very difficult,
perhaps impossible, to point at a truly "practical" result and say with
confidence, "this result requires AC for its proof". As a result, the
vast majority of mathematicians just assume the presence of AC and get
on with their work without losing any sleep over it.

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- It is undesirable to believe a proposition
  when there is no ground whatsoever for supposing it is true.
  -- Bertrand Russell
From: gugamilare
Subject: Re: Axiom of Choice [Re: checking for duplicate elements]
Date: 
Message-ID: <0dd65caf-ed01-4034-a630-cc1ea510daec@y34g2000prb.googlegroups.com>
On 30 abr, 05:30, Nicolas Neuss <········@math.uni-karlsruhe.de>
wrote:
> gugamilare <··········@gmail.com> writes:
> >> Of course, that requires a total ordering on the elements the list can
> >> contain. This is not always possible... [1]
>
> >> Rupert
>
> >> [1] Well, not constructively:http://en.wikipedia.org/wiki/Axiom_of_choice
>
> > Oh, yes, the Axiom of Choice, the enemy number 1 of Computer Science.
>
> I see the AoC more as the enemy of any practically-minded person.  Does
> anyone know a practically relevant result which needs the AoC?

Hum, you can always conclude something useful. For instance, using the
AoC, it is possible to construct a unmeasurable set from the real line
(i.e. a set that doesn't have Lebesgue measure). But it is also proven
that, without assuming the AoC (or some equivalent), it is not
possible to construct such a set. Therefore, all constructable sets
are measurable, so, if you constructed one (without AoC), it will be
measurable.

It also gives an interesting result to probability. One can wonder if
ther is a probability function that is defined in all subsets of the
real line. Since, with AoC, there is a unmeasurable set, there isn't.
You also can't prove that the AoC is wrong using the other axioms, so,
in particular, you can't construct such a probability function, or
else you would prove that the AoC is worng. Case closed. If it weren't
for this result, this could be an open problem by now, and
matematicians would waste their time trying to find a solution to this
problem, proving it's existence of not.

>  I know only
> of results like "every vector space has a basis" which is a quite useless
> theorem because a basis is of use only when you know more about it than "it
> exists".

This is mostly tecnical. It saves matematicians (like me) from having
to test - or suppose - if every damn vector space we find have a
basis. Imagine if all teorems we have to prove, we need to add
"suppose that the vector space has a basis, then..." Well, that is
just my opinion. And, in most practical cases, you can explictly make
a choice function / set (see the example I gave, I gave a total order
that works like Zermelo's theorem, which is equivalent to the Axiom of
Choice).

> (Note that for finite-dimensional spaces you don't need the AoC
> anyhow, and that for infinite-dimensional spaces an algebraic basis
> obtained by the AoC is usually not the one you need.)
>
> Nicolas
From: Nicolas Neuss
Subject: Re: Axiom of Choice [Re: checking for duplicate elements]
Date: 
Message-ID: <8763glz0t3.fsf@ma-patru.mathematik.uni-karlsruhe.de>
gugamilare <··········@gmail.com> writes:

> On 30 abr, 05:30, Nicolas Neuss <········@math.uni-karlsruhe.de>
>>
>> I see the AoC more as the enemy of any practically-minded person. Does
>> anyone know a practically relevant result which needs the AoC?

> Hum, you can always conclude something useful. For instance, using the
> AoC, it is possible to construct a unmeasurable set from the real line
> (i.e. a set that doesn't have Lebesgue measure). [...]

> It also gives an interesting result to probability. One can wonder if
> ther is a probability function that is defined in all subsets of the
> real line. [...]

>> I know only of results like "every vector space has a basis" which is a
>> quite useless theorem because a basis is of use only when you know more
>> about it than "it exists".
>
> This is mostly tecnical. It saves matematicians (like me) from having to
> test - or suppose - if every damn vector space we find have a
> basis. Imagine if all teorems we have to prove, we need to add "suppose
> that the vector space has a basis, then..." 

The main reason of a basis is its use within calculations.  And then it
does not help if you know only that "it exists".  Instead, you should know
what it is explicitly for representing other vectors.

Maybe the best candidate for something practically useful really are
functional analytic applications like Hahn-Banach etc as mentioned by
Harald Hanche-Olsen.  However, I still suspect that the property "proof
possible only with AC" might classify mathematical questions which are
irrelevant in applications.

> Well, that is just my opinion. And, in most practical cases, you can
> explictly make a choice function / set [...]

Yes, and my question is if "most" in the sentence above should not even
read "all".

Nicolas
From: gugamilare
Subject: Re: Axiom of Choice [Re: checking for duplicate elements]
Date: 
Message-ID: <a2b54313-e8d4-42ad-8c22-de324caac359@v35g2000pro.googlegroups.com>
On 1 maio, 05:26, Nicolas Neuss <········@math.uni-karlsruhe.de>
wrote:
> The main reason of a basis is its use within calculations.  And then it
> does not help if you know only that "it exists".

It helps when you are elaborating the theory, and even theory that
supports results for real uses. Many times you need the prove
existence of something before you even try to calculate it.

> Instead, you should know
> what it is explicitly for representing other vectors.

> Maybe the best candidate for something practically useful really are
> functional analytic applications like Hahn-Banach etc as mentioned by
> Harald Hanche-Olsen.  However, I still suspect that the property "proof
> possible only with AC" might classify mathematical questions which are
> irrelevant in applications.
>
> > Well, that is just my opinion. And, in most practical cases, you can
> > explictly make a choice function / set [...]
>
> Yes, and my question is if "most" in the sentence above should not even
> read "all".

In this case you should agree that the Axiom of Choice is useful.
>
> Nicolas
From: Lieven Marchand
Subject: Re: Axiom of Choice [Re: checking for duplicate elements]
Date: 
Message-ID: <87eiv8lqmf.fsf@black.wyrd.be>
Nicolas Neuss <········@math.uni-karlsruhe.de> writes:

> Maybe the best candidate for something practically useful really are
> functional analytic applications like Hahn-Banach etc as mentioned by
> Harald Hanche-Olsen.  However, I still suspect that the property "proof
> possible only with AC" might classify mathematical questions which are
> irrelevant in applications.

Without any form of choice a lot of stuff becomes tricky. But for
analysis etc something like the Axiom of Dependent Choice or AC
limited to countable sets is sufficient.

Without choice, you can't prove the equivalence of the epsilon-delta
definition of continuity and the lim_{x->a} f(x)=f(a) definition.

-- 
The method of "postulating" what we want has many advantages; they are 
the same as the advantages of theft over honest toil.        - Russell
From: Spiros Bousbouras
Subject: Re: Axiom of Choice [Re: checking for duplicate elements]
Date: 
Message-ID: <a510e107-e02d-4f67-a7e8-9961ee99c7f8@f19g2000yqh.googlegroups.com>
On 1 May, 17:44, Lieven Marchand <····@wyrd.be> wrote:
> Without choice, you can't prove the equivalence of the epsilon-delta
> definition of continuity and the lim_{x->a} f(x)=f(a) definition.

I don't know which definitions you have in mind but with the
definitions I can think of with those names you can definitely
prove the equivalence without AC. Perhaps you meant that you
cannot prove that a) implies b):

a) For every sequence x_n converging to a, the sequence f(x_n)
converges to f(a).

b) f is continuous at point a.
From: Nicolas Neuss
Subject: Re: Axiom of Choice [Re: checking for duplicate elements]
Date: 
Message-ID: <87my9v3gec.fsf@ma-patru.mathematik.uni-karlsruhe.de>
Spiros Bousbouras <······@gmail.com> writes:

> On 1 May, 17:44, Lieven Marchand <····@wyrd.be> wrote:
>> Without choice, you can't prove the equivalence of the epsilon-delta
>> definition of continuity and the lim_{x->a} f(x)=f(a) definition.
>
> I don't know which definitions you have in mind but with the
> definitions I can think of with those names you can definitely
> prove the equivalence without AC. Perhaps you meant that you
> cannot prove that a) implies b):
>
> a) For every sequence x_n converging to a, the sequence f(x_n)
> converges to f(a).
>
> b) f is continuous at point a.

Yes, it looks like as if one would need something like countable choice
here.

I have thought a little more about this, and at the moment I see the
situation as follows (Disclaimer: I'm not an expert in set theory or
logic!):

1. You can choose from one non-empty set.  This is implied by basic set
theory axioms defining what means "non-empty set", i.e. the function
(choose set) is defined whenever set is non-empty.  You can then do also a
multi-choose from a finite set of non-empty sets, i.e. (mapcar #'choose
sets) does work in such situations.

2. When you have the Peano axioms, especially induction, available, you can
prove inductively that you can also do a countable-choose from a countably
indexed set of sets.  This has a special name "AC-countable" which is
important mostly in the case when you do not assume the Peano axioms.

3. Since the Peano axioms are usually assumed from the very beginning in
analysis, most mathematicians use but do not mention AC-countable as a
special axiom.  And the above example of continuity equivalence is one of
those cases.

4. However, mathematicians explicitly mention whenever they use the full
axiom of choice.  Those applications are mostly (always?) not necessary for
applied mathematics, but as "gugamilare" says, it is useful to know that
some problem can be answered using the full AC (or that it is even
equivalent to it).

Yours, Nicolas
From: Dave Seaman
Subject: Re: Axiom of Choice [Re: checking for duplicate elements]
Date: 
Message-ID: <gthk8i$q25$1@mailhub227.itcs.purdue.edu>
On Sat, 02 May 2009 13:15:07 +0200, Nicolas Neuss wrote:
> Spiros Bousbouras <······@gmail.com> writes:

>> On 1 May, 17:44, Lieven Marchand <····@wyrd.be> wrote:
>>> Without choice, you can't prove the equivalence of the epsilon-delta
>>> definition of continuity and the lim_{x->a} f(x)=f(a) definition.
>>
>> I don't know which definitions you have in mind but with the
>> definitions I can think of with those names you can definitely
>> prove the equivalence without AC. Perhaps you meant that you
>> cannot prove that a) implies b):
>>
>> a) For every sequence x_n converging to a, the sequence f(x_n)
>> converges to f(a).
>>
>> b) f is continuous at point a.

> Yes, it looks like as if one would need something like countable choice
> here.

> I have thought a little more about this, and at the moment I see the
> situation as follows (Disclaimer: I'm not an expert in set theory or
> logic!):

> 1. You can choose from one non-empty set.  This is implied by basic set
> theory axioms defining what means "non-empty set", i.e. the function
> (choose set) is defined whenever set is non-empty.  You can then do also a
> multi-choose from a finite set of non-empty sets, i.e. (mapcar #'choose
> sets) does work in such situations.

Correct.

> 2. When you have the Peano axioms, especially induction, available, you can
> prove inductively that you can also do a countable-choose from a countably
> indexed set of sets.  This has a special name "AC-countable" which is
> important mostly in the case when you do not assume the Peano axioms.

Induction does not let you conclude anything about choosing from infinite
families of sets.  By induction, you can prove that choices can be made
from any finite collection of nonempty sets, no matter how large.

> 3. Since the Peano axioms are usually assumed from the very beginning in
> analysis, most mathematicians use but do not mention AC-countable as a
> special axiom.  And the above example of continuity equivalence is one of
> those cases.

> 4. However, mathematicians explicitly mention whenever they use the full
> axiom of choice.  Those applications are mostly (always?) not necessary for
> applied mathematics, but as "gugamilare" says, it is useful to know that
> some problem can be answered using the full AC (or that it is even
> equivalent to it).

Some mention AC, but many don't bother.


-- 
Dave Seaman
Third Circuit ignores precedent in Mumia Abu-Jamal ruling.
<http://www.indybay.org/newsitems/2008/03/29/18489281.php>
From: gugamilare
Subject: Re: Axiom of Choice [Re: checking for duplicate elements]
Date: 
Message-ID: <519cd49c-e466-43d3-b4ba-4ed940d9f5b7@c36g2000yqn.googlegroups.com>
On 1 maio, 13:44, Lieven Marchand <····@wyrd.be> wrote:
> Nicolas Neuss <········@math.uni-karlsruhe.de> writes:
> > Maybe the best candidate for something practically useful really are
> > functional analytic applications like Hahn-Banach etc as mentioned by
> > Harald Hanche-Olsen.  However, I still suspect that the property "proof
> > possible only with AC" might classify mathematical questions which are
> > irrelevant in applications.
>
> Without any form of choice a lot of stuff becomes tricky. But for
> analysis etc something like the Axiom of Dependent Choice or AC
> limited to countable sets is sufficient.
>
> Without choice, you can't prove the equivalence of the epsilon-delta
> definition of continuity and the lim_{x->a} f(x)=f(a) definition.

I think you ar mistaking something. AoC is not needed for that.
>
> --
> The method of "postulating" what we want has many advantages; they are
> the same as the advantages of theft over honest toil.        - Russell
From: Nicolas Neuss
Subject: Re: Axiom of Choice [Re: checking for duplicate elements]
Date: 
Message-ID: <87zldw691a.fsf@ma-patru.mathematik.uni-karlsruhe.de>
Lieven Marchand <···@wyrd.be> writes:

> Without any form of choice a lot of stuff becomes tricky. But for
> analysis etc something like the Axiom of Dependent Choice or AC
> limited to countable sets is sufficient.
>
> Without choice, you can't prove the equivalence of the epsilon-delta
> definition of continuity and the lim_{x->a} f(x)=f(a) definition.

OK, you are right.  I agree that the (weaker) AC of countable choice is
very important also for applications.  And in contrast to the full AC it
has also a constructive feeling: start with the first set, choose, go to
the second, etc. (rather similar to induction).

Thanks, Nicolas
From: Cesar Rabak
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <gt7o52$d72$1@aioe.org>
Tamas K Papp escreveu:
> Is there a function in the CL standard that I could use to check if a
> list has duplicate elements?  I could not find any, but my brain is
> small and the standard is large.  I could compare the length of the
> list to (remove-duplicates list), but that is kind of lame.
> 
You rise an interesting point: why its 'lame' and not 'clever'?

--
Cesar Rabak
From: Barry Margolin
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <barmar-9B3EBF.23480128042009@mara100-84.onlink.net>
In article <············@aioe.org>, Cesar Rabak <·······@yahoo.com.br> 
wrote:

> Tamas K Papp escreveu:
> > Is there a function in the CL standard that I could use to check if a
> > list has duplicate elements?  I could not find any, but my brain is
> > small and the standard is large.  I could compare the length of the
> > list to (remove-duplicates list), but that is kind of lame.
> > 
> You rise an interesting point: why its 'lame' and not 'clever'?

It was the first answer that came to my mind, too.  ISTM that the 
algorithm used by REMOVE-DUPLICATES would be very similar to that of 
HAS-DUPLICATES-P, except that the latter doesn't have to cons a new 
list.  But that new list will be ephemeral, and modern GCs should be 
able to collect it efficiently.

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
From: Tamas K Papp
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <75tuctF1a0sphU1@mid.individual.net>
On Thu, 30 Apr 2009 20:39:03 +0530, Madhu wrote:

> * Barry Margolin <····························@mara100-84.onlink.net> :
> Wrote on Tue, 28 Apr 2009 23:48:01 -0400:
> 
> | In article <············@aioe.org>, Cesar Rabak <·······@yahoo.com.br>
> | wrote:
> |
> |> Tamas K Papp escreveu:
> |> > Is there a function in the CL standard that I could use to check if
> a |> > list has duplicate elements?  I could not find any, but my brain
> is |> > small and the standard is large.  I could compare the length of
> the |> > list to (remove-duplicates list), but that is kind of lame. |>
> >
> |> You rise an interesting point: why its 'lame' and not 'clever'? |
> | It was the first answer that came to my mind, too.  ISTM that the |
> algorithm used by REMOVE-DUPLICATES would be very similar to that of |
> HAS-DUPLICATES-P, except that the latter doesn't have to cons a new |
> list.  But that new list will be ephemeral, and modern GCs should be |
> able to collect it efficiently.
> 
> Elsewhere the OP has balked at using hashtables because he was "not
> going to deal with more than 20 elements so they would be overkill."
> 
> So given that N^2 will be 400 tops, my idea of clever would have been to
> churn out in less than a minute,
> 
> (defun has-duplicates-p (seq1)
>   (let ((length (length seq1)))
>     (when (> length 20)
>       (cerror "Continue for now"
> 	      "Bzzzzzt! Reimplement me: ~D elements encountered." length))
>     (let ((n (mismatch seq1 (remove-duplicates seq1))))
>       (when n
> 	(values t (elt seq1 n))))))
> 
> The use of mismatch idea was suggested by Anon.C.Lisper elsewhere in
> this thread.  If one were sure the lists would not contain nil, I'd have
> HAS-DUPLICATES-P behave like DIGIT-CHAR-P and return the first duplicate
> element encountered as the first return value. (instead of the 2nd)

Well, the OP has learned a lot from this thread.  To sketch the
context of the problem: I needed to check uniqueness in a function
which receives lists with elements that look like (symbol &rest
numerical-parameters), each symbol referring to an exogenous parameter
of an economic model, and returns a stochastic version of the model
with the given parameters perturbed.  So it is nonsensical to
enumerate one parameter more than once, and I wanted to check for
that.

Given that the rest of the calculation involves numerical rootfinding
and then simulation, the part which checks for uniqueness has
negligible impact on speed.  So I should go for the simplest thing,
which is mismatch w/ remove-duplicates.

Even after 1.5 years of using Lisp, I still engage in premature
optimization.  This is entirely my fault, the only (admittedly lame)
excuse I have is that I have used some bad languages before finding
Lisp.

Thanks for all the suggestions, I really appreciate everyone's help.
It was fun to read the various algorithms.

Tamas

PS.: Lisp is so damn fast.  I implemented a similar algorithm in R a
couple of years ago, and the difference is more than 100-fold.  Even
without declarations.
From: ··················@gmail.com
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <268bce2e-bd42-43b4-83f6-3f171c186b00@b7g2000pre.googlegroups.com>
On Apr 30, 11:28 am, Tamas K Papp <······@gmail.com> wrote:
> On Thu, 30 Apr 2009 20:39:03 +0530, Madhu wrote:
> > * Barry Margolin <····························@mara100-84.onlink.net> :
> > Wrote on Tue, 28 Apr 2009 23:48:01 -0400:
>
> > | In article <············@aioe.org>, Cesar Rabak <·······@yahoo.com.br>
> > | wrote:
> > |
> > |> Tamas K Papp escreveu:
> > |> > Is there a function in the CL standard that I could use to check if
> > a |> > list has duplicate elements?  I could not find any, but my brain
> > is |> > small and the standard is large.  I could compare the length of
> > the |> > list to (remove-duplicates list), but that is kind of lame. |>
>
> > |> You rise an interesting point: why its 'lame' and not 'clever'? |
> > | It was the first answer that came to my mind, too.  ISTM that the |
> > algorithm used by REMOVE-DUPLICATES would be very similar to that of |
> > HAS-DUPLICATES-P, except that the latter doesn't have to cons a new |
> > list.  But that new list will be ephemeral, and modern GCs should be |
> > able to collect it efficiently.
>
> > Elsewhere the OP has balked at using hashtables because he was "not
> > going to deal with more than 20 elements so they would be overkill."
>
> > So given that N^2 will be 400 tops, my idea of clever would have been to
> > churn out in less than a minute,
>
> > (defun has-duplicates-p (seq1)
> >   (let ((length (length seq1)))
> >     (when (> length 20)
> >       (cerror "Continue for now"
> >          "Bzzzzzt! Reimplement me: ~D elements encountered." length))
> >     (let ((n (mismatch seq1 (remove-duplicates seq1))))
> >       (when n
> >    (values t (elt seq1 n))))))
>
> > The use of mismatch idea was suggested by Anon.C.Lisper elsewhere in
> > this thread.  If one were sure the lists would not contain nil, I'd have
> > HAS-DUPLICATES-P behave like DIGIT-CHAR-P and return the first duplicate
> > element encountered as the first return value. (instead of the 2nd)
>
> Well, the OP has learned a lot from this thread.  To sketch the
> context of the problem: I needed to check uniqueness in a function
> which receives lists with elements that look like (symbol &rest
> numerical-parameters), each symbol referring to an exogenous parameter
> of an economic model, and returns a stochastic version of the model
> with the given parameters perturbed.  So it is nonsensical to
> enumerate one parameter more than once, and I wanted to check for
> that.
>
> Given that the rest of the calculation involves numerical rootfinding
> and then simulation, the part which checks for uniqueness has
> negligible impact on speed.  So I should go for the simplest thing,
> which is mismatch w/ remove-duplicates.
>
> Even after 1.5 years of using Lisp, I still engage in premature
> optimization.  This is entirely my fault, the only (admittedly lame)
> excuse I have is that I have used some bad languages before finding
> Lisp.
>
> Thanks for all the suggestions, I really appreciate everyone's help.
> It was fun to read the various algorithms.
>
> Tamas
>
> PS.: Lisp is so damn fast.  I implemented a similar algorithm in R a
> couple of years ago, and the difference is more than 100-fold.  Even
> without declarations.

IIRC there is an R->CL package, the project claimed 1000 fold
improvement over interpreted R. But the project itself was just a
proof of concept, (apparently author wasn't interested in implementing
the entire language and reaping the glory of epic hax).

Re stochastic version of the model:
If speed gets to be an issue, using arrays instead of lists is an easy
(ish, sometimes it involves a bit of re-factoring) way to up
performance (as long as the lists are sufficiently long).

The advantage of the sequence functions, however, is that they work
similarly on arrays and lists.
From: Tamas K Papp
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <75u838F19eb77U1@mid.individual.net>
On Thu, 30 Apr 2009 11:03:01 -0700, anonymous.c.lisper wrote:

> Re stochastic version of the model:
> If speed gets to be an issue, using arrays instead of lists is an easy
> (ish, sometimes it involves a bit of re-factoring) way to up performance
> (as long as the lists are sufficiently long).
> 
> The advantage of the sequence functions, however, is that they work
> similarly on arrays and lists.

:-) Of course I am using vectors/arrays for computation, but lists do
fine for the symbolic part.  Oh, another amazing CL feature: I can do
symbolic manipulations _and_ compile my code for fast numerical
computation, in the same language, with ease.

If CL is dead, then I guess I am a necrophiliac :-)

Tamas
From: Rob Warnock
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <jtudnQ4M5sql-2fUnZ2dnUVZ_gOdnZ2d@speakeasy.net>
Madhu  <·······@meer.net> wrote:
+---------------
|     (let ((n (mismatch seq1 (remove-duplicates seq1))))
...
| The use of mismatch idea was suggested by Anon.C.Lisper elsewhere in
| this thread.
+---------------

IIUIC, you don't need the overhead of full MISMATCH here, (NOT (EQUAL ...))
should work just as well, e.g.:

    (let ((n (not (equal seq1 (remove-duplicates seq1)))))
      ...)


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Rob Warnock
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <TdWdnc9CgtLr8WfUnZ2dnUVZ_rCdnZ2d@speakeasy.net>
Oops! I just wrote:
+---------------
| Madhu  <·······@meer.net> wrote:
| +---------------
| |     (let ((n (mismatch seq1 (remove-duplicates seq1))))
| ...
| | The use of mismatch idea was suggested by Anon.C.Lisper elsewhere in
| | this thread.
| +---------------
| 
| IIUIC, you don't need the overhead of full MISMATCH here, (NOT (EQUAL ...))
| should work just as well, e.g.:
| 
|     (let ((n (not (equal seq1 (remove-duplicates seq1)))))
|       ...)
+---------------

While EQUAL works fine for lists and strings [and bit vectors],
it *doesn't* work correctly for general arrays;

    > (let* ((seq1 "foobar")
	     (seq2 (map 'vector #'identity seq1)))
	(list (mismatch seq1 (remove-duplicates seq1))
	      (not (equal seq1 (remove-duplicates seq1)))
	      (mismatch seq2 (remove-duplicates seq2))
	      (not (equal seq2 (remove-duplicates seq2)))))

    (2 T 2 T)
    > (let* ((seq1 "fubar")
	     (seq2 (map 'vector #'identity seq1)))
	(list (mismatch seq1 (remove-duplicates seq1))
	      (not (equal seq1 (remove-duplicates seq1)))
	      (mismatch seq2 (remove-duplicates seq2))
	      (not (equal seq2 (remove-duplicates seq2)))))

    (NIL NIL NIL T)
    > 

Oops! But I think there's a fix. Ordinarily, you wouldn't want to use
EQUALP for this sort of thing, since that could get false-positives
on the duplicate detection [#\a va. #\a], but in this case I think
it's o.k., since the REMOVE-DUPLICATES is still using its default EQL
test, so using EQUALP should be safe here:

    > (let* ((seq1 "foObar")
	     (seq2 (map 'vector #'identity seq1)))
	(list (mismatch seq1 (remove-duplicates seq1))
	      (not (equalp seq1 (remove-duplicates seq1)))
	      (mismatch seq2 (remove-duplicates seq2))
	      (not (equalp seq2 (remove-duplicates seq2)))))

    (NIL NIL NIL NIL)
    > 

But you know, if you're going for speed instead of succinctness,
all you *really* need to check is the length:  ;-}

    (/= (length seq1) (length (remove-duplicates seq1)))


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Madhu
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <m3r5z97aws.fsf@moon.robolove.meer.net>
* (Rob Warnock) <································@speakeasy.net> :
Wrote on Thu, 30 Apr 2009 22:28:54 -0500:
| But you know, if you're going for speed instead of succinctness,
| all you *really* need to check is the length:  ;-}
|
|     (/= (length seq1) (length (remove-duplicates seq1)))

Agreed.  MISMATCH is useful when you want to get an index, in case you
wanted to return the first duplicate element.

--
Madhu
From: Don Geddis
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <87eiv8zrcf.fsf@geddis.org>
····@rpw3.org (Rob Warnock) wrote on Thu, 30 Apr 2009:
> But you know, if you're going for speed instead of succinctness,
> all you *really* need to check is the length:  ;-}
>     (/= (length seq1) (length (remove-duplicates seq1)))

Is that necessarily faster?  For a long list, which happens not to be
equal, LENGTH needs to traverse the whole list.  Whereas EQUAL(P) could
short-circuit the comparisons as soon as it finds a single mismatch.

I suppose REMOVE-DUPLICATES needs to examine the whole sequence anyway.

If you really need this to be fast, you probably want to re-implement
remove-duplicates, but with a short-circuit.

(In other words, the usual lesson of premature optimization.  Write code
clearly first, then profile, and only then try to optimize time-critical
sections.)

        -- Don
_______________________________________________________________________________
Don Geddis                  http://don.geddis.org/               ···@geddis.org
Theologian 1: Either there is a god (or more than one god), or there is not.
Theologian 2: Or the amount of god changes over time.
From: Gene
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <2dc5ac1f-5d83-4a20-bd28-e095a84f4c90@t36g2000prt.googlegroups.com>
On Apr 28, 11:20 am, Tamas K Papp <······@gmail.com> wrote:
> Is there a function in the CL standard that I could use to check if a
> list has duplicate elements?  I could not find any, but my brain is
> small and the standard is large.  I could compare the length of the
> list to (remove-duplicates list), but that is kind of lame.
>
> It is of course trivial to implement it:
>
> (defun has-duplicates-p (list &optional (test #'equal))
>   "Return the first duplicated element, nil if all elements are unique."
>   (let (unique-elements)
>     (dolist (elt list)
>       (let ((found (find elt unique-elements :test test)))
>         (if found
>             (return-from has-duplicates-p found)
>             (push elt unique-elements))))
>     nil))
>
> I was just curious if I missed anything in the standard.
>
> Tamas

If you have only symbols or unique symbol identifiers for objects, you
can use symbol property lists.

A not particularly efficient example:

CL-USER> (defun has-duplicate-symbols-p (lst)
	   (let (key (gensym))
	     (flet ((finish (rtn)
		      (dolist (elt lst)
			(remprop elt key))
		      (return-from has-duplicate-symbols-p rtn)))
	     (dolist (elt lst)
	       (when (get elt key)
		 (finish t))
	       (setf (get elt key) t))
	     (finish nil))))
From: ··················@gmail.com
Subject: Re: checking for duplicate elements
Date: 
Message-ID: <d234dc5f-edf6-4eb7-8739-d2925e23c178@l16g2000pra.googlegroups.com>
On Apr 28, 11:20 am, Tamas K Papp <······@gmail.com> wrote:
> Is there a function in the CL standard that I could use to check if a
> list has duplicate elements?  I could not find any, but my brain is
> small and the standard is large.  I could compare the length of the
> list to (remove-duplicates list), but that is kind of lame.
>
> It is of course trivial to implement it:
>
> (defun has-duplicates-p (list &optional (test #'equal))
>   "Return the first duplicated element, nil if all elements are unique."
>   (let (unique-elements)
>     (dolist (elt list)
>       (let ((found (find elt unique-elements :test test)))
>         (if found
>             (return-from has-duplicates-p found)
>             (push elt unique-elements))))
>     nil))
>
> I was just curious if I missed anything in the standard.
>
> Tamas

(mismatch seq1 (remove-dupilcates seq1))