Question
Here is the problem:
Given an array of integers, sort the array according to the frequency of its elements. For example, if the input array is {2, 3, 2, 4, 5, 12, 2, 3, 3, 3, 12}, then modify the array to {3, 3, 3, 3, 2, 2, 2, 12, 12, 4, 5}. If two numbers have the same frequency, print the one that came first.
I know how to do it partially. Here is my approach.
I will create a struct which will be like:
struct node
{
    int index; // for storing the position of the number in the array
    int count; // for storing the number of times the number appears
    int value; // for storing the actual value
} a[50];
I will create an array of these structs and then sort it by count with a sorting algorithm. However, how can I ensure that, if two elements have the same frequency, the one with the smaller index appears first?
Answer 1:
#include <stdlib.h> // qsort, malloc, free
#include <stddef.h> // size_t
#include <stdio.h> // printf

struct number
{
    const int * value; // points into the input array
    int num_occurrences;
};

// Ascending by the pointed-to value.
static int cmp_by_val(const void * pa, const void * pb)
{
    const struct number * a = pa;
    const struct number * b = pb;
    if (*a->value < *b->value)
        return -1;
    else if (*b->value < *a->value)
        return 1;
    else
        return 0;
}

// Descending by occurrence count; ties are broken by address, i.e. by
// position in the input array, so the result does not depend on
// whether qsort itself is stable.
static int cmp_by_occurrence_stable(const void * pa, const void * pb)
{
    const struct number * a = pa;
    const struct number * b = pb;
    if (a->num_occurrences > b->num_occurrences)
        return -1;
    else if (b->num_occurrences > a->num_occurrences)
        return 1;
    else if (a->value < b->value)
        return -1;
    else if (b->value < a->value)
        return 1;
    else
        return 0;
}

static struct number * sort_by_occurrence(const int * arr, size_t N)
{
    //
    // STEP 1: Convert the input
    //
    struct number * sort_arr = malloc(N * sizeof(struct number));
    if (! sort_arr) return NULL;
    for (size_t k = 0; k < N; ++k)
    {
        sort_arr[k].value = &arr[k];
        sort_arr[k].num_occurrences = 0;
    }
    //
    // STEP 2: Sort the input based on value
    //
    qsort(sort_arr, N, sizeof(struct number), cmp_by_val);
    //
    // STEP 3: Count occurrences
    //
    // Each entry in a run of equal values gets the run length, and its
    // pointer is redirected to the run's earliest element in arr, so that
    // STEP 4 breaks ties by first appearance.
    for (size_t i = 0; i < N; )
    {
        const int * first = sort_arr[i].value;
        size_t j = i + 1;
        while (j < N && *sort_arr[j].value == *first)
        {
            if (sort_arr[j].value < first)
                first = sort_arr[j].value;
            ++j;
        }
        for (size_t k = i; k < j; ++k)
        {
            sort_arr[k].num_occurrences = (int)(j - i);
            sort_arr[k].value = first;
        }
        i = j;
    }
    //
    // STEP 4: Sort based on occurrence count
    //
    qsort(sort_arr, N, sizeof(struct number), cmp_by_occurrence_stable);
    //
    // DONE
    //
    return sort_arr;
}

static void print_arr(const struct number * arr, size_t N)
{
    if (0 < N)
    {
        printf("%d", *arr[0].value);
        for (size_t k = 1; k < N; ++k)
            printf(", %d", *arr[k].value);
    }
    printf("\n");
}

int main(void)
{
    const int EXAMPLE_INPUT[11] = { 2, 3, 2, 4, 5, 12, 2, 3, 3, 3, 12 };
    struct number * sort_arr = sort_by_occurrence(EXAMPLE_INPUT, 11);
    if (sort_arr)
    {
        print_arr(sort_arr, 11);
        free(sort_arr);
    }
    return 0;
}
Answer 2:
You could create an array which stores the frequencies for your input array (i.e. frequency[i] is the frequency of the element input[i]). After that it is easy to order the frequency array (using a stable algorithm) and apply the same changes (swaps?) to the input array.
To create the frequency array you can use several approaches; a simple and inefficient one is to count each element with two nested loops. I leave more efficient alternatives to your imagination.
Note: the frequency array serves the same purpose as the count field in your struct node, but lives in separate memory. If you will not need the frequencies later, I recommend using the separate memory, since you can release it.
Answer 3:
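It seems that the problem is using an unstable sort algorithm on the frequency of the array elements. qsort is not guaranteed to be stable, so a second qsort that only compares indexes among equal frequencies can undo the first one. Put both criteria into a single comparator instead:
- Compare by freq first.
- If the freq is the same, compare by the index of the elements.
- This gives a correct answer with an O(n log n) sort.
I kept the code minimal; the counting uses the simple O(n^2) nested loop.
#include <stdio.h>
#include <stdlib.h>

struct node
{
    const int *val; // pointer to the first occurrence of the value in arr
    int freq;       // number of times the value occurs
    // int index; <- not needed: comparing a->val with b->val compares positions
};

// Higher freq first; for equal freq, the earlier first occurrence comes first.
int compare_freq_then_index(const void *pa, const void *pb)
{
    const struct node *a = pa;
    const struct node *b = pb;
    if (a->freq != b->freq)
        return (a->freq < b->freq) - (a->freq > b->freq);
    // both pointers point into the same array, so they can be compared;
    // this is zero only for nodes of the same value
    return (a->val > b->val) - (a->val < b->val);
}

int main(void)
{
    int arr[] = {2, 3, 2, 4, 5, 12, 2, 3, 3, 3, 12};
    size_t n = sizeof(arr) / sizeof(arr[0]);
    struct node *narray = malloc(n * sizeof(*narray));
    if (!narray)
        return 1;
    // build the nodes-array: store a pointer to the value, not the value itself
    for (size_t i = 0; i < n; i++)
    {
        narray[i].val = NULL;
        narray[i].freq = 0;
        for (size_t j = 0; j < n; j++)
        {
            if (arr[j] != arr[i])
                continue;
            if (!narray[i].val)
                narray[i].val = &arr[j]; // first occurrence of this value
            narray[i].freq++;
        }
    }
    qsort(narray, n, sizeof(*narray), compare_freq_then_index);
    // print narray
    for (size_t i = 0; i < n; i++)
        printf("%d ", *narray[i].val);
    putchar('\n');
    free(narray);
    return 0;
}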
Edit: @0xbe5077ed had an interesting idea: instead of comparing indexes, compare the addresses of your values. I re-edited the code to do that.
Answer 4:
I have been trying to learn Java lately and realized that this could be a good exercise. I tried and solved this problem there, in Eclipse. Java is horrible, so I went back to C to solve it. Here's a solution, which I'll explain right after showing it:
#include <stdio.h>
#include <stdlib.h> // realloc, free

typedef struct numbergroup {
    int firstencounteridx;
    int count;
    int thenumber;
} Numbergroup;

int firstoneissuperior( Numbergroup gr1, Numbergroup gr2 ) {
    return gr1.count > gr2.count || // don't mind the line-break, it's just to fit
        ( gr1.count == gr2.count && gr1.firstencounteridx < gr2.firstencounteridx );
}

void sortgroups( Numbergroup groups[], int amount ) {
    for ( int i = 1; i < amount; i++ ) {
        for ( int j = 0; j < amount - i; j++ ) {
            if ( firstoneissuperior( groups[j + 1], groups[j] ) ) {
                Numbergroup temp = groups[j + 1];
                groups[j + 1] = groups[j];
                groups[j] = temp;
            }
        }
    }
}

int main( void ) {
    int input[] = { 2, 3, 2, 4, 5, 12, 2, 3, 3, 3, 12 };
    int inputlength = sizeof input / sizeof * input;
    Numbergroup * groups = NULL;
    int amountofgroups = 0;
    for ( int i = 0; i < inputlength; i++ ) {
        int uniqueencounter = 1;
        for ( int j = 0; j < amountofgroups; j++ ) {
            if ( groups[j].thenumber == input[i] ) {
                uniqueencounter = 0;
                groups[j].count++;
                break;
            }
        }
        if ( uniqueencounter ) {
            Numbergroup * grown = realloc( groups, ( amountofgroups + 1 ) * sizeof * groups );
            if ( grown == NULL ) { // out of memory: release what we have and give up
                free( groups );
                return 1;
            }
            groups = grown;
            groups[amountofgroups].firstencounteridx = i;
            groups[amountofgroups].count = 1;
            groups[amountofgroups].thenumber = input[i];
            amountofgroups++;
        }
    }
    sortgroups( groups, amountofgroups );
    for ( int i = 0; i < amountofgroups; i++ )
        for ( int j = 0; j < groups[i].count; j++ )
            printf( "%d ", groups[i].thenumber );
    free( groups );
    putchar( '\n' );
    return 0;
}
Let me explain the structure first, as well as its functionality: there is one for each unique number. In your example, there is one for the 2s, the 3s, the 4s, the 5s and the 12s, one for each, 5 in total. Each one is to store:
- the index of the first encounter of that number
- the number of encounters of that number
- the value of that number
For example, for the 12s, it shall store:
- firstencounteridx as 5, that is, the index of the first 12
- count as 2
- thenumber as 12
The first loop does that. It grows the array of Numbergroups whenever a new unique number is encountered and stores its index as well; when a number that already has a group is encountered, it increases that group's count instead.
Then a sort is issued, which is simply a bubble sort. It might differ from the conventional one; I don't have any memorized.
The sorting criteria function checks whether the count field of the first group is greater than the other's; otherwise it checks whether the counts are the same and the firstencounteridx of the first group is earlier than the other's. In either case it returns 1, meaning true. Those are the only ways for the first group to be considered superior to the second one.
That's one method, there can be others. This is just a suggestion, I hope it helps you, not just for this case, but in general.
Answer 5:
Create a map from each number to its count, then sort the map by value. O(n log n) time and O(n) space.
import java.util.*;

public class SortByFrequency {
    static void sortByFreq( int[] A ) {
        // 1. create map<number, its count>
        // LinkedHashMap keeps the keys in first-seen order
        Map<Integer, Integer> map = new LinkedHashMap<>();
        for(int i = 0; i < A.length; i++) {
            int key = A[i];
            if( map.containsKey(key) ) {
                Integer count = map.get(key);
                count++;
                map.put(key, count);
            }
            else {
                map.put(key, 1);
            }
        }
        // 2. sort map entries by value in desc. order
        // (originally done with a modified, desc. order MapUtil from http://stackoverflow.com/questions/109383/how-to-sort-a-mapkey-value-on-the-values-in-java)
        // List.sort is stable, so equal counts stay in first-seen order
        List<Map.Entry<Integer, Integer>> entries = new ArrayList<>(map.entrySet());
        entries.sort((e1, e2) -> e2.getValue().compareTo(e1.getValue()));
        for(Map.Entry<Integer, Integer> entry : entries ) {
            int num = entry.getKey();
            int count = entry.getValue();
            for(int i = 0; i < count; i++ ) {
                System.out.print( num + " ");
            }
        }
        System.out.println();
    }

    public static void main(String[] args ) {
        int[] A1 = {2, 3, 2, 4, 5, 12, 2, 3, 3, 3, 12};
        sortByFreq(A1);
    }
}
Source: https://stackoverflow.com/questions/24251349/how-to-find-complete-sorting-of-elements-by-frequency