Remove duplicates and sort a list

Submitted by 。_饼干妹妹 on 2019-12-10 18:41:03

Question


I am trying to write a procedure that takes a list that may or may not include duplicates and returns that list without duplicates, in sorted order. What I have come up with so far is:

(define (remove-duplicated list)
    (if (null? list)
       '()
        (if (= (car list) (cadr list))
            (cdr list)
            (cons (car list) (remove-duplicates (cdr list))))))  

I'm not quite sure what the problem is, apart from the list not being sorted. For example, evaluating

(remove-duplicates '(3 3 4 5 6 6 7))

returns

(3 4 5 6 6 7)

Answer 1:


a fairly simple procedure that will take in a list that may or may not include duplicates, and then return that list without any duplicates included, and in sorted order.

There are at least two ways that you could do this:

  • sort the list once the duplicates are removed; or
  • remove the duplicates after the list has been sorted.

Óscar López pointed out that

[Your] implementation fails because you're only testing for two consecutive values, you have to search the current element in the rest of the list, use member for that.

This will be an issue if you remove the duplicates before sorting, since a given element could have duplicates anywhere else in the list. However, if you sort the list first, you are guaranteed that any duplicates immediately follow the original, so you don't need to check the whole list. Removing duplicates is easier if the list is sorted, but sorting a list isn't any easier after duplicates are removed, so it makes sense to sort the list first and then remove duplicates. (I suppose you could be even more efficient and write your own sort-and-remove-duplicates procedure, but that is almost certainly not necessary.)
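For the curious, the combined sort-and-remove-duplicates idea mentioned in the parenthetical could look roughly like the sketch below: a merge sort whose merge step keeps only one copy of equal elements. The names sort-unique, merge-unique and split-alternating are hypothetical (they are not from the question or from any library), and the sketch assumes a list of numbers compared with <.

```scheme
;; Hypothetical helper: merge two sorted, duplicate-free lists of numbers
;; into one sorted, duplicate-free list.
(define (merge-unique a b)
  (cond ((null? a) b)
        ((null? b) a)
        ((< (car a) (car b)) (cons (car a) (merge-unique (cdr a) b)))
        ((< (car b) (car a)) (cons (car b) (merge-unique a (cdr b))))
        ;; equal heads: keep one copy and drop the other
        (else (cons (car a) (merge-unique (cdr a) (cdr b))))))

;; Hypothetical helper: split a list into two lists by alternating
;; elements; returns a two-element list (first-half second-half).
(define (split-alternating lst)
  (if (or (null? lst) (null? (cdr lst)))
      (list lst '())
      (let ((rest (split-alternating (cddr lst))))
        (list (cons (car lst) (car rest))
              (cons (cadr lst) (cadr rest))))))

;; Merge sort that removes duplicates while merging.
(define (sort-unique lst)
  (if (or (null? lst) (null? (cdr lst)))
      lst                                   ; () and (x) are sorted and duplicate-free
      (let ((halves (split-alternating lst)))
        (merge-unique (sort-unique (car halves))
                      (sort-unique (cadr halves))))))

(sort-unique '(3 6 3 7 4 5 6)) ; => (3 4 5 6 7)
```

In practice, sorting with the interpreter's own sort procedure and then removing adjacent duplicates is simpler and usually good enough.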

Your code, if you're assuming that list is already sorted, is almost correct. There are two adjustments necessary:

  1. In the base case, you're only checking whether (null? list). However, for a non-null list, you then compare (car list) and (cadr list), but if list only has one element, then (cadr list) is an error. Fortunately, lists with only one element have no duplicates, so your base case can be (or (null? list) (null? (cdr list))).
  2. The then part of the second if needs to be (remove-duplicated (cdr list)), not (cdr list), since list can still have more duplicates farther down (e.g., (x x x ...) or (x x y y ...)).

This is your code with those modifications and some comments:

(define (remove-duplicated list)
  ;; remove duplicates from a *sorted* list.  Because the 
  ;; list is sorted, any duplicates of an element will
  ;; immediately follow the first occurrence of the element.
  ;;---------------------------------------------------------
  ;; If the list has the form () or (x)
  (if (or (null? list)
          (null? (cdr list)))
      ;; then it has no duplicates, so return it
      list
      ;; otherwise, if the list looks like (x x ...)
      (if (= (car list) (cadr list))
          ;; then you can discard the first element, but you
          ;; still need to remove duplicates from the rest of
          ;; the list, since there can be more duplicates later
          (remove-duplicated (cdr list))
          ;; otherwise, you need the first element of the list
          ;; and can simply remove-duplicated from the rest.
          (cons (car list) (remove-duplicated (cdr list))))))  

This works as expected:

(remove-duplicated '(1 1 2 3 3 4 5 6))
;=> '(1 2 3 4 5 6)
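Two caveats are worth illustrating. Because the procedure compares with =, it only accepts numbers, and it relies on the input being sorted. The sketch below is a variant under a hypothetical name, remove-adjacent-duplicates, that uses equal? so that symbols or strings also work; the second call shows what happens when the sortedness assumption is violated.

```scheme
;; Hypothetical variant of the sorted-list procedure: compares with
;; equal? instead of =, so it is not limited to numbers.
(define (remove-adjacent-duplicates lst)
  (cond ((or (null? lst) (null? (cdr lst))) lst)
        ((equal? (car lst) (cadr lst))
         (remove-adjacent-duplicates (cdr lst)))
        (else
         (cons (car lst) (remove-adjacent-duplicates (cdr lst))))))

(remove-adjacent-duplicates '(a a b c c)) ; => (a b c)
;; Unsorted input: the two 3s are not adjacent, so both survive.
(remove-adjacent-duplicates '(3 6 3))     ; => (3 6 3)
```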



Answer 2:


The fact that the input list might be sorted slipped my mind. What I'm about to describe will work for removing duplicate elements from any list, sorted or not. For the general case of removing duplicates in a list, you have to search for the current element in the rest of the list, using member.

Also, you have to advance the recursion in both cases. Be aware that in the last line you're calling remove-duplicates (which is a built-in procedure in some interpreters, so maybe you don't have to implement it from scratch!), but you named the procedure remove-duplicated. As a side note, it's a bad idea to name a parameter list, because that shadows the built-in list procedure; I took the liberty of renaming it. This will fix the problems, and it's a more general solution:

; if the input list is not sorted, use this
(define (remove-duplicated lst)
  (if (null? lst)
      '()
      (if (member (car lst) (cdr lst))  ; changes here
          (remove-duplicated (cdr lst)) ; and here
          (cons (car lst)
                (remove-duplicated (cdr lst)))))) 
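One detail of the member-based version: when an element appears more than once, it is the last occurrence that survives, so '(3 6 3 7 4 5 6) becomes (3 7 4 5 6). That makes no difference if the result is sorted afterwards. If the original order matters, a variant that keeps the first occurrence could be sketched as follows; remove-duplicates-keep-first is a hypothetical name, not something from the question.

```scheme
;; Hypothetical variant: keeps the FIRST occurrence of each element
;; by remembering what has already been seen.
(define (remove-duplicates-keep-first lst)
  (let loop ((rest lst) (seen '()))
    (cond ((null? rest) (reverse seen))          ; seen was built back to front
          ((member (car rest) seen)              ; already kept: skip it
           (loop (cdr rest) seen))
          (else
           (loop (cdr rest) (cons (car rest) seen))))))

(remove-duplicates-keep-first '(3 6 3 7 4 5 6)) ; => (3 6 7 4 5)
```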

Now, if the input list is sorted to begin with, this is how to fix your code. Most of my comments apply, except that you don't have to use member and the base case is a little different:

; if the input list is sorted, use this
(define (remove-duplicated lst)
  (if (or (null? lst) (null? (cdr lst))) ; changes here
      lst
      (if (= (car lst) (cadr lst))
          (remove-duplicated (cdr lst))  ; and here
          (cons (car lst)
                (remove-duplicated (cdr lst))))))

Either way, the procedure will work as expected as long as you use the right one for the input (the first implementation is for sorted or unsorted input lists, the second one works only for sorted lists):

(remove-duplicated '(3 3 4 5 6 6 7)) ; sorted input, both implementations work
=> '(3 4 5 6 7)

Finally, if you need the output list to always be sorted but have no guarantee that the input list is sorted, use my first implementation of remove-duplicated and sort the result afterwards. Check your interpreter to find out which sorting procedures are available; the following will work in Racket:

(sort (remove-duplicated '(3 6 3 7 4 5 6)) <) ; using my first remove-duplicated
=> '(3 4 5 6 7)

… Or sort the list first and then use my second implementation of remove-duplicated. You have so many options to solve this problem!

(remove-duplicated (sort '(3 6 3 7 4 5 6) <)) ; using my second remove-duplicated
=> '(3 4 5 6 7)
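If you are on Racket specifically, the built-in mentioned earlier can replace the hand-written procedure altogether. A minimal sketch, assuming #lang racket (which provides remove-duplicates from racket/list); sorted-unique is a hypothetical wrapper name:

```scheme
#lang racket
;; Assumes Racket: remove-duplicates comes from racket/list,
;; which #lang racket already provides.
(define (sorted-unique lst)
  (sort (remove-duplicates lst) <))

(sorted-unique '(3 6 3 7 4 5 6)) ; => '(3 4 5 6 7)
```

Other Scheme implementations may name or locate these procedures differently, so check your interpreter's documentation.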


Source: https://stackoverflow.com/questions/20084752/remove-duplicates-and-sort-a-list
