Question
I'm trying to sort a set of data so that it looks like a histogram of a probability distribution function (I'm assuming normally distributed for the moment).
I have a list of entries:
private static final class SortableDatasetEntry {
    Number value;
    Comparable key;

    public SortableDatasetEntry(Number value, Comparable key) {
        this.value = value;
        this.key = key;
    }
}
An example:
I have the items : {1,2,3,4,5,6,7,8,9}
EDIT:
The sorted list I would like: {1,3,5,7,9,8,6,4,2}
(or something similar). The numbers will not always be so neat (i.e. simply sorting by odd/even won't work either). I have a partial solution that sorts in regular order (lowest to highest) and then copies that list to another by inserting into the middle each time, so the last item inserted (into the middle) is the largest. I'd still like to find a way of doing this with a comparator.
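The partial solution described above (sort ascending, then insert each item into the middle of a second list) might look roughly like the following. This is only a sketch: the class name MiddleInsertSketch and the method arrange are made up for illustration, and inserting at the upper middle, (size + 1) / 2, is an assumption chosen so that the example input reproduces the desired output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MiddleInsertSketch {
    // Hypothetical helper: returns a new list arranged smallest -> largest -> smallest.
    static <T extends Comparable<? super T>> List<T> arrange(List<T> input) {
        List<T> sorted = new ArrayList<>(input);
        Collections.sort(sorted);                      // step 1: regular ascending order
        List<T> result = new ArrayList<>();
        for (T item : sorted) {
            // step 2: insert at the upper middle, so the largest item is inserted last
            result.add((result.size() + 1) / 2, item);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(arrange(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9)));
        // prints [1, 3, 5, 7, 9, 8, 6, 4, 2]
    }
}
```

Each insertion into the middle of an ArrayList shifts elements, so this is O(n^2) overall; that is unlikely to matter for small data sets.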
This is quite tricky because it's not being sorted by the absolute value of value but by the distance from Mean(value) within the set, and then somehow moved so that the values closest to the mean are centered. I know that the compareTo function must be "reversible" (I forget the correct term).
Bonus points: How do I determine the correct distribution for the data (i.e. if it isn't normal, as assumed).
Answer 1:
First calculate the mean and store it in a variable called, say, mean. Next, when you insert the entries into your SortableDatasetEntry, use value - mean as the actual value for each entry rather than value.
Answer 2:
From what I see, you probably want a list of tuples ("mean distance", value), sorted by the first entry, "mean distance".
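A minimal sketch of that idea, assuming "mean distance" means |value - mean| and that plain Double values stand in for the tuples. The class name MeanDistanceSketch and the method sortByDistanceFromMean are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class MeanDistanceSketch {
    // Hypothetical helper: orders values by their distance from the mean, closest first.
    static List<Double> sortByDistanceFromMean(List<Double> values) {
        double mean = values.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
        List<Double> result = new ArrayList<>(values);
        // The sort key plays the role of the "mean distance" tuple entry.
        result.sort(Comparator.comparingDouble((Double v) -> Math.abs(v - mean)));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sortByDistanceFromMean(
                Arrays.asList(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0)));
        // prints [5.0, 4.0, 6.0, 3.0, 7.0, 2.0, 8.0, 1.0, 9.0]
    }
}
```

The result lists the values nearest the mean first; it does not by itself place them in the center, so a separate arrangement step would still be needed to get the shape asked for in the question.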
Answer 3:
Would something like:
public List<Integer> customSort(List<Integer> list) {
    Collections.sort(list);
    List<Integer> newList = new ArrayList<Integer>();
    // Even positions of the sorted list form the ascending left half.
    for (int i = 0; i < list.size(); i += 2) {
        newList.add(list.get(i));
    }
    // Odd positions, walked from the top down, form the descending right half.
    if (list.size() % 2 == 0) {
        for (int i = 1; i < list.size(); i += 2) {
            newList.add(list.get(list.size() - i));
        }
    } else {
        for (int i = 1; i < list.size(); i += 2) {
            newList.add(list.get(list.size() - i - 1));
        }
    }
    return newList;
}
work? I put in {1,2,3,4,5,6,7,8,9} and get {1,3,5,7,9,8,6,4,2}, and {1,2,3,4,5,6,7,8} gives {1,3,5,7,8,6,4,2}.
Answer 4:
You cannot accomplish this in a single sort merely with a custom Comparator.
However, it is still feasible to do it in-place, without an additional collection of references.
Your current approach is not in-place, but is probably the easiest to implement and understand. Unless the size of the collection in memory is a concern, consider staying with your current approach.
Custom comparator in a single sort
Your desired order depends on the ascending order. Given unsorted data, your Comparator doesn't have the ascending order available while the first sort is occurring.
In-place approaches
You could create your desired order in-place.
What follows presumes 0-based indices.
One approach would use two sorts. First, sort in ascending order. Mark each object with its index. In the Comparator for the second sort, all objects with even indices will be less than all objects with odd indices. Objects with even indices will be ordered in ascending order. Objects with odd indices will be ordered in descending order.
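The two-sort approach could be sketched as follows. The names TwoSortSketch, Ranked and arrange are invented for this example, and the sketch allocates a wrapper list for clarity, whereas the description above imagines the index being stored on the objects themselves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class TwoSortSketch {
    // Hypothetical wrapper: a value marked with its 0-based index from the first sort.
    static final class Ranked {
        final int value;
        final int rank;
        Ranked(int value, int rank) { this.value = value; this.rank = rank; }
    }

    static List<Integer> arrange(List<Integer> input) {
        List<Integer> sorted = new ArrayList<>(input);
        Collections.sort(sorted);                          // first sort: ascending
        List<Ranked> ranked = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            ranked.add(new Ranked(sorted.get(i), i));      // mark each object with its index
        }
        Comparator<Ranked> secondSort = (a, b) -> {
            boolean aEven = a.rank % 2 == 0;
            boolean bEven = b.rank % 2 == 0;
            if (aEven != bEven) {
                return aEven ? -1 : 1;                     // even indices before odd indices
            }
            return aEven ? Integer.compare(a.rank, b.rank)   // even indices ascending
                         : Integer.compare(b.rank, a.rank);  // odd indices descending
        };
        ranked.sort(secondSort);                           // second sort
        List<Integer> result = new ArrayList<>();
        for (Ranked r : ranked) {
            result.add(r.value);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(arrange(Arrays.asList(9, 8, 7, 6, 5, 4, 3, 2, 1)));
        // prints [1, 3, 5, 7, 9, 8, 6, 4, 2]
    }
}
```

Comparing by the stored index rather than by the value keeps the second comparator consistent even when the data contains duplicates.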
Another approach would be a custom sorting algorithm that supported mapping from virtual to physical indices. The sorting algorithm would create an ascending order in the virtual index space. Your index mapping would lay it out in physical memory in the order you desire. Here's an untested sketch of the index mapping:
private int mapVirtualToPhysical(int virtualIndex, int countElements) {
    // Even virtual indices fill from the left; odd virtual indices fill from the right.
    boolean isEvenIndex = (0 == (virtualIndex % 2));
    int physicalIndex = isEvenIndex ? (virtualIndex / 2) : (countElements - (virtualIndex / 2) - 1);
    return physicalIndex;
}
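To see what layout this mapping produces, here is a small sketch that applies it to an already sorted array. It is not the custom in-place sorting algorithm described above: it copies into a second array purely to show where each virtual index lands. The class name IndexMappingSketch and the method layOut are hypothetical, and the mapping is restated with the parameter named virtualIndex throughout.

```java
import java.util.Arrays;

class IndexMappingSketch {
    static int mapVirtualToPhysical(int virtualIndex, int countElements) {
        boolean isEvenIndex = (virtualIndex % 2 == 0);
        return isEvenIndex ? (virtualIndex / 2) : (countElements - (virtualIndex / 2) - 1);
    }

    // Hypothetical helper: sorts a copy ascending, then places each element at its mapped slot.
    static int[] layOut(int[] values) {
        int[] ascending = values.clone();
        Arrays.sort(ascending);                    // ascending order in the virtual index space
        int[] physical = new int[ascending.length];
        for (int v = 0; v < ascending.length; v++) {
            physical[mapVirtualToPhysical(v, ascending.length)] = ascending[v];
        }
        return physical;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(layOut(new int[] {1, 2, 3, 4, 5, 6, 7, 8, 9})));
        // prints [1, 3, 5, 7, 9, 8, 6, 4, 2]
    }
}
```

With this mapping the right-hand tail receives the odd-indexed elements in descending order, so the largest element ends up at or next to the middle position.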
Preferable to either of these would be an initial sort followed by an O(n) series of swaps. However, I haven't yet determined the sequence of swaps. The best I've come up with so far gets the left tail in order, but the right tail either requires a subsequent sort or a stack.
Answer 5:
For large data sets, you can have the SortableEntry constructor decide, using a random number generator, which side of the chart (left or right of the peak) each entry will occupy:
static final class SortableEntry<T> {
    Number value;
    Comparable<T> key;
    int hr;
    static Random rnd = new Random();

    public SortableEntry(Number value, Comparable<T> key) {
        this.value = value;
        this.key = key;
        this.hr = rnd.nextInt(2) == 0 ? -1 : 1; // here: -1 = left side, 1 = right side
    }
}
The point of the additional hr variable is to make every "right" entry greater than every "left" entry. If two compared entries have the same hr, compare them by their actual key, taking the sign of hr into account:
static final class SortableEntryComparator<T> implements Comparator<SortableEntry<T>> {
    @Override
    public int compare(SortableEntry<T> e1, SortableEntry<T> e2) {
        if (e1.hr == e2.hr)
            // Left side ascends by key; right side descends by key.
            return e1.hr < 0 ? e1.key.compareTo((T) e2.key) : e2.key.compareTo((T) e1.key);
        else
            return e1.hr - e2.hr;
    }
}
Now a small test:
@Test
public void testSort() {
    List<Integer> keys = Arrays.asList(10, 20, 30, 40, 50, 60, 70, 80, 90, 100,
            12, 25, 31, 33, 34, 36, 39, 41, 26, 49,
            52, 52, 58, 61, 63, 69, 74, 83, 92, 98);
    List<SortableEntry<Integer>> entries = new ArrayList<>();
    for (Integer k : keys) {
        entries.add(new SortableEntry<Integer>(0, k));
    }
    entries.sort(new SortableEntryComparator<Integer>());
    System.out.println(entries);
}
// sample output (sides are assigned randomly, so it varies per run;
// assumes SortableEntry.toString() prints the key):
// [12, 26, 33, 36, 39, 40, 49, 50, 52, 60, 61, 63, 80, 90, 98, 100, 92, 83, 74, 70, 69, 58, 52, 41, 34, 31, 30, 25, 20, 10]
// the highest key (100) is not precisely in the center,
// but it will tend to occur in the center when the dataset is large.
Answer 6:
You would find it much easier to build your histogram as a Map.
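import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public static Map<Integer, List<Number>> histogram(List<Number> values, int nBuckets) {
    // Get stats on the values.
    DoubleSummaryStatistics stats = values.stream().mapToDouble(Number::doubleValue).summaryStatistics();
    // How wide must each bucket be? Kept as a double so a small range does not truncate to 0.
    double bucketSize = (stats.getMax() - stats.getMin()) / nBuckets;
    // Roll them all into buckets, clamping so the maximum value lands in the last bucket.
    return values.stream().collect(Collectors.groupingBy(
            (n) -> Math.min(nBuckets - 1, (int) ((n.doubleValue() - stats.getMin()) / bucketSize))));
}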
Note the intent of a histogram:
To construct a histogram, the first step is to "bin" the range of values—that is, divide the entire range of values into a series of small intervals—and then count how many values fall into each interval.
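The "bin, then count" step from that description could be sketched like this. It is an illustration under assumptions, not part of the answer's code: the class name BinCountSketch and the method binCounts are hypothetical, equal-width bins are assumed, and the maximum value is clamped into the last bin.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class BinCountSketch {
    // Hypothetical helper: maps bin index -> number of values that fall into that bin.
    static Map<Integer, Long> binCounts(List<Double> values, int nBuckets) {
        double min = values.stream().mapToDouble(Double::doubleValue).min().orElse(0.0);
        double max = values.stream().mapToDouble(Double::doubleValue).max().orElse(0.0);
        double width = (max - min) / nBuckets;         // equal-width bins (assumption)
        return values.stream().collect(Collectors.groupingBy(
                v -> width == 0.0 ? 0 : Math.min(nBuckets - 1, (int) ((v - min) / width)),
                TreeMap::new,                          // keeps bins ordered by index
                Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(binCounts(
                Arrays.asList(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0), 3));
        // prints {0=3, 1=3, 2=3}
    }
}
```

Bins that receive no values are simply absent from the map, so a plotting step would need to treat missing keys as a count of zero.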
Source: https://stackoverflow.com/questions/29852044/sorting-list-from-smallest-largest-smallest-in-java