From: msingh
Subject: duplicates
Date: 
Message-ID: <1184144419.997291.126020@e16g2000pri.googlegroups.com>
There doesn't seem to be a way to return a list of duplicates of a
sequence in ANSI CL -- though there is a remove-duplicates. Is there a
reason for this? It would be handy if you could tell remove-duplicates
not to include any duplicated elements so you could do a set-
difference at the end to get a list of duplicates. Feel free to post
code to prove me wrong. Thanks!

From: Slobodan Blazeski
Subject: Re: duplicates
Date: 
Message-ID: <1184152046.686260.218880@o61g2000hsh.googlegroups.com>
On Jul 11, 11:00 am, msingh <··········@gmail.com> wrote:
> There doesn't seem to be a way to return a list of duplicates of a
> sequence in ANSI CL -- though there is a remove-duplicates. Is there a
> reason for this? It would be handy if you could tell remove-duplicates
> not to include any duplicated elements so you could do a set-
> difference at the end to get a list of duplicates. Feel free to post
> code to prove me wrong. Thanks!

remove-duplicates is doing what it's name suggest that it's doing.
What kind of function do you need?
What should below return :
(return-duplicates '(1 2 3 4 1 2 1))
=> (1 2)

=> (1 2 1)
Or something else? And write somecode to see how you tried to solve it
to see your lisp.Is this a homework assignement ?

Slobodan
From: GP lisper
Subject: Re: duplicates
Date: 
Message-ID: <slrnf99geq.gnj.spambait@phoenix.clouddancer.com>
On Wed, 11 Jul 2007 09:00:19 -0000, <··········@gmail.com> wrote:
> There doesn't seem to be a way to return a list of duplicates of a
> sequence in ANSI CL

simple

(let ( (a (some-silly-list))
       (b (remove-duplicates a :test #'string=))
       (c (set-difference a b)) )
  (do-something-here))


-- 
There are no average Common Lisp programmers
Reply-To: email is ignored.

-- 
Posted via a free Usenet account from http://www.teranews.com
From: Thomas A. Russ
Subject: Re: duplicates
Date: 
Message-ID: <ymiodiion1q.fsf@sevak.isi.edu>
msingh <··········@gmail.com> writes:

> There doesn't seem to be a way to return a list of duplicates of a
> sequence in ANSI CL -- though there is a remove-duplicates. Is there a
> reason for this?

Because if CL had all possible functions, there wouldn't be a need for
programmers, and we would all be out of jobs?

In any case, it's easy enough to write your own.  There are several
reasonable and efficient solutions already posted.  There is also a
pretty simple, albeit not especially efficient one:

(defun get-duplicates (list)
  (set-difference list (remove-duplicates list)))

It's not particularly efficient, since in the common implementation,
both set-difference and remove-duplicates are O(N^2) algorithms, and you
can arrange to make this O(N) instead of O(N^2) fairly easily using hash
tables.


-- 
Thomas A. Russ,  USC/Information Sciences Institute
From: msingh
Subject: Re: duplicates
Date: 
Message-ID: <1184214283.655435.225660@d55g2000hsg.googlegroups.com>
Simple, inefficient AND hopelessly wrong. I mentioned this approach in
my first post and explained why it doesn't work. This set difference
will alway return nil for any input.
From: GP lisper
Subject: Re: duplicates
Date: 
Message-ID: <slrnf9blv7.qbp.spambait@phoenix.clouddancer.com>
On Thu, 12 Jul 2007 04:24:43 -0000, <··········@gmail.com> wrote:
> Simple, inefficient AND hopelessly wrong. I mentioned this approach in
> my first post and explained why it doesn't work. This set difference
> will alway return nil for any input.

I'd say you are the hopeless one.

I run it a few thousand times a day.  Inefficient? A moment to write,
a second to run.  Only fools spend hours on something that will only
save seconds.


-- 
There are no average Common Lisp programmers
Reply-To: email is ignored.

-- 
Posted via a free Usenet account from http://www.teranews.com
From: Chris Russell
Subject: Re: duplicates
Date: 
Message-ID: <1184239261.916680.155760@n60g2000hse.googlegroups.com>
On 12 Jul, 08:27, GP lisper <········@CloudDancer.com> wrote:
> On Thu, 12 Jul 2007 04:24:43 -0000, <··········@gmail.com> wrote:
> > Simple, inefficient AND hopelessly wrong. I mentioned this approach in
> > my first post and explained why it doesn't work. This set difference
> > will alway return nil for any input.
>
> I'd say you are the hopeless one.
>
> I run it a few thousand times a day.  Inefficient? A moment to write,
> a second to run.  Only fools spend hours on something that will only
> save seconds.
>
> --
> There are no average Common Lisp programmers
> Reply-To: email is ignored.
>
> --
> Posted via a free Usenet account fromhttp://www.teranews.com

I think the behaviour of set-difference is only specified for sets in
the hyperspec, so it's implementation dependant how it handles
duplicate elements.

You could both be right, but only for the respective implementations
you each use.
From: Alex Mizrahi
Subject: Re: duplicates
Date: 
Message-ID: <46963ade$0$90271$14726298@news.sunsite.dk>
(message (Hello 'Chris)
(you :wrote  :on '(Thu, 12 Jul 2007 11:21:01 -0000))
(

 CR> I think the behaviour of set-difference is only specified for sets in
 CR> the hyperspec, so it's implementation dependant how it handles
 CR> duplicate elements.

"set-difference returns a list of elements of list-1 that do not appear in 
list-2.

For all possible ordered pairs consisting of one element from list-1 and one 
element from list-2, the :test or :test-not function is used to determine 
whether they satisfy the test ...

An element of list-1 appears in the result if and only if it does not match 
any element of list-2."

do use see word "set" in the specification, besides in function name?
is there something that is not clear?

)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"scorn") 
From: Chris Russell
Subject: Re: duplicates
Date: 
Message-ID: <1184252059.202558.244140@o61g2000hsh.googlegroups.com>
On 12 Jul, 15:29, "Alex Mizrahi" <········@users.sourceforge.net>
wrote:
> (message (Hello 'Chris)
> (you :wrote  :on '(Thu, 12 Jul 2007 11:21:01 -0000))
> (
>
>  CR> I think the behaviour of set-difference is only specified for sets in
>  CR> the hyperspec, so it's implementation dependant how it handles
>  CR> duplicate elements.
>
> "set-difference returns a list of elements of list-1 that do not appear in
> list-2.
>
> For all possible ordered pairs consisting of one element from list-1 and one
> element from list-2, the :test or :test-not function is used to determine
> whether they satisfy the test ...
>
> An element of list-1 appears in the result if and only if it does not match
> any element of list-2."
>
> do use see word "set" in the specification, besides in function name?
> is there something that is not clear?
>
> )
> (With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
> "scorn")

http://www.lisp.org/HyperSpec/Body/sec_14-1-2-2.html
Functions that treat the lists as sets are allowed to assume that
there are no duplications.
But re-reading the bit you cite, I'm inclined to agree with you that
this function is unambiguously specified for non sets as well.

I wonder which lisp GPlisper is using.
From: Chris Russell
Subject: Re: duplicates
Date: 
Message-ID: <1184152864.864497.85410@g4g2000hsf.googlegroups.com>
On 11 Jul, 10:00, msingh <··········@gmail.com> wrote:
> There doesn't seem to be a way to return a list of duplicates of a
> sequence in ANSI CL -- though there is a remove-duplicates. Is there a
> reason for this? It would be handy if you could tell remove-duplicates
> not to include any duplicated elements so you could do a set-
> difference at the end to get a list of duplicates. Feel free to post
> code to prove me wrong. Thanks!

I can't think of one, you can always write your own though ;) .

>(let (new)
	   (mapcan (lambda(x) (if (member x new)
				  (list x)
				  (progn
				    (push x new)
				    nil))) '(2 2 3 1 2 4 4 5 1)))
(2 2 4 1)
From: Pascal Costanza
Subject: Re: duplicates
Date: 
Message-ID: <5fjs7jF3cmpv0U1@mid.individual.net>
msingh wrote:
> There doesn't seem to be a way to return a list of duplicates of a
> sequence in ANSI CL -- though there is a remove-duplicates. Is there a
> reason for this? It would be handy if you could tell remove-duplicates
> not to include any duplicated elements so you could do a set-
> difference at the end to get a list of duplicates. Feel free to post
> code to prove me wrong. Thanks!

(loop
   with counts
   for element in list
   do (incf (getf counts element 0))
   finally (return
             (loop for (element count) on counts by #'cddr
               if (> count 1)
               collect element into duplicates
               else collect element into uniques
               finally (return (values uniques duplicates)))))

I am using a property list for counting elements, which means that eq is 
implicitly used for detecting equivalent elements. If you want to use 
other comparison functions, it is better to use an association list 
(which makes the code a little bit wordier).

In general, LOOP is a pretty good poor man's list comprehension 
facility. Just ignore that it performs iteration and use it to emulate a 
more declarative style.


Pascal

-- 
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: Pascal Bourguignon
Subject: Re: duplicates
Date: 
Message-ID: <87ps2zmbdu.fsf@thalassa.lan.informatimago.com>
msingh <··········@gmail.com> writes:

> There doesn't seem to be a way to return a list of duplicates of a
> sequence in ANSI CL -- though there is a remove-duplicates. Is there a
> reason for this? It would be handy if you could tell remove-duplicates
> not to include any duplicated elements so you could do a set-
> difference at the end to get a list of duplicates. Feel free to post
> code to prove me wrong. Thanks!

What about triplicates and quadruplicates?

Just build an histogram, and select what you want.


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

NOTE: The most fundamental particles in this product are held
together by a "gluing" force about which little is currently known
and whose adhesive power can therefore not be permanently
guaranteed.
From: msingh
Subject: Re: duplicates
Date: 
Message-ID: <1184155697.849642.207070@j4g2000prf.googlegroups.com>
Just wanted to confirm I hadn't missed any obvious solution before
rolling my own version. Thanks for the replies!

I ended up with something like:

(defun duplicates (list)
  (let ((h (make-hash-table)))
    (iter (for x in list)
	  (for i from 0)
	  (cond ((gethash x h)
		 (push i (gethash x h)))
		(t (setf (gethash x h) (list i)))))
    (iter (for (key value) in-hashtable h)
	  (when (> (length value) 1)
	    (collect (list key value))))))

MSINGH> (duplicates '(1 2 4 4 5 x x y y z))
((4 (3 2)) (X (6 5)) (Y (8 7)))
From: Alex Mizrahi
Subject: Re: duplicates
Date: 
Message-ID: <4695f79c$0$90276$14726298@news.sunsite.dk>
(message (Hello 'msingh)
(you :wrote  :on '(Wed, 11 Jul 2007 09:00:19 -0000))
(

 m> There doesn't seem to be a way to return a list of duplicates of a
 m> sequence in ANSI CL -- though there is a remove-duplicates. Is there a
 m> reason for this? It would be handy if you could tell remove-duplicates
 m> not to include any duplicated elements so you could do a set-
 m> difference at the end to get a list of duplicates. Feel free to post
 m> code to prove me wrong.

i have a really weird solution for you :)

first, let's examine why (set-difference list (remove-duplicates list)) 
doesn't work.
in this case remove-duplicates removes items that are EQL, which is OK.
but set-difference also uses EQL, so it basically will remove same items, 
and you'll get NIL.

so, for this to work we need set-difference to use more strict check than 
remove-duplicates.
it's not possible to portably use difference between EQ and EQL (although on 
some implementations, where numbers are not EQ but EQL, we can use it). so 
we'll need SET-DIFFERENCE to use EQL and REMOVE-DUPLICATES to use EQUAL.

for this to work, we need to wrap everything into CONS:

(defun duplicates (list)
    (mapcar #'car
     (let ((clist (mapcar #'list list)))
       (set-difference clist (remove-duplicates clist :test 'equal)))))

CL-USER> (duplicates (list 1 1 2 2 3 4 5 6 6 7))
(6 2 1)

cool?

in implemenation i'm using


CL-USER> (eq 1000 1000)
NIL

so a simplier version:


(defun duplicates (list)
    (set-difference list (remove-duplicates list)  :test 'eq))

CL-USER> (duplicates (list 1001 1001 1002 1003))
(1001)


while for smaller numbers that doesn't work:

CL-USER> (duplicates (list 1 1 2 3))
NIL


)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"scorn") 
From: msingh
Subject: Re: duplicates
Date: 
Message-ID: <1184251402.833558.46500@k79g2000hse.googlegroups.com>
On Jul 12, 5:42 pm, "Alex Mizrahi" <········@users.sourceforge.net>
wrote:
> (message (Hello 'msingh)
> (you :wrote  :on '(Wed, 11 Jul 2007 09:00:19 -0000))
> (
>
>  m> There doesn't seem to be a way to return a list of duplicates of a
>  m> sequence in ANSI CL -- though there is a remove-duplicates. Is there a
>  m> reason for this? It would be handy if you could tell remove-duplicates
>  m> not to include any duplicated elements so you could do a set-
>  m> difference at the end to get a list of duplicates. Feel free to post
>  m> code to prove me wrong.
>
> i have a really weird solution for you :)
>
> first, let's examine why (set-difference list (remove-duplicates list))
> doesn't work.
> in this case remove-duplicates removes items that are EQL, which is OK.
> but set-difference also uses EQL, so it basically will remove same items,
> and you'll get NIL.
>
> so, for this to work we need set-difference to use more strict check than
> remove-duplicates.
> it's not possible to portably use difference between EQ and EQL (although on
> some implementations, where numbers are not EQ but EQL, we can use it). so
> we'll need SET-DIFFERENCE to use EQL and REMOVE-DUPLICATES to use EQUAL.
>
> for this to work, we need to wrap everything into CONS:
>
> (defun duplicates (list)
>     (mapcar #'car
>      (let ((clist (mapcar #'list list)))
>        (set-difference clist (remove-duplicates clist :test 'equal)))))
>
> CL-USER> (duplicates (list 1 1 2 2 3 4 5 6 6 7))
> (6 2 1)
>
> cool?
>
> in implemenation i'm using
>
> CL-USER> (eq 1000 1000)
> NIL
>
> so a simplier version:
>
> (defun duplicates (list)
>     (set-difference list (remove-duplicates list)  :test 'eq))
>
> CL-USER> (duplicates (list 1001 1001 1002 1003))
> (1001)
>
> while for smaller numbers that doesn't work:
>
> CL-USER> (duplicates (list 1 1 2 3))
> NIL
>
> )
> (With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
> "scorn")

Nice. I enjoyed that very much.