Java heap analysis with OQL: count unique strings

自闭症患者 2020-12-29 09:11

I'm doing a memory analysis of an existing Java application. Is there an equivalent of SQL's 'group by' in OQL, so that I can count objects that have the same value but are different instances?

6 answers
  • 2020-12-29 09:26

    I would use Eclipse Memory Analyzer instead.

  • 2020-12-29 09:28

    The following is based on the answer by Peter Dolberg and can be used in the VisualVM OQL Console:

    var counts={};
    var alreadyReturned={};
    
    filter(
      sort(
        map(heap.objects("java.lang.String"),
        function(heapString){
          if( ! counts[heapString.toString()]){
            counts[heapString.toString()] = 1;
          } else {
            counts[heapString.toString()] = counts[heapString.toString()] + 1;
          }
          return { string:heapString.toString(), count:counts[heapString.toString()]};
        }), 
        'rhs.count - lhs.count'),
      function(countObject) {
        if( ! alreadyReturned[countObject.string]){
          alreadyReturned[countObject.string] = true;
          return true;
        } else {
          return false;
        }
       }
      );
    

    It starts with a map() call over all String instances. For each String it creates or increments that value's counter in the counts object (used as a map from string value to count) and returns an object with a string field and a count field holding the running count.

    The resulting array will contain one entry for each String instance, each having a count value one larger than the previous entry for the same String. The result is then sorted on the count field and the result looks something like this:

    {
    count = 1028.0,
    string = *null*
    }
    
    {
    count = 1027.0,
    string = *null*
    }
    
    {
    count = 1026.0,
    string = *null*
    }
    
    ...
    

    (in my test the String "*null*" was the most common).

    The last step is to filter this using a function that returns true only for the first occurrence of each String, which after the sort is the entry with that String's highest (total) count. It uses the alreadyReturned object to keep track of which Strings have already been included.
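
    To see the three stages outside of a heap dump, here is a minimal sketch in plain JavaScript. It assumes a hard-coded sample array in place of heap.objects("java.lang.String"), and the built-in Array methods stand in for OQL's map, sort and filter. The function name countUniquePipeline and the sample values are made up for illustration.

```javascript
// Plain-JavaScript stand-in for the OQL query above.
// `values` plays the role of heap.objects("java.lang.String");
// the function name and sample data are illustrative only.
function countUniquePipeline(values) {
  var counts = {};
  var alreadyReturned = {};

  // Stage 1: one entry per instance, carrying the running count for its value.
  var entries = values.map(function (value) {
    counts[value] = (counts[value] || 0) + 1;
    return { string: value, count: counts[value] };
  });

  // Stage 2: highest running count first.
  entries.sort(function (lhs, rhs) {
    return rhs.count - lhs.count;
  });

  // Stage 3: keep only the first (highest-count) entry for each value.
  return entries.filter(function (entry) {
    if (alreadyReturned[entry.string]) {
      return false;
    }
    alreadyReturned[entry.string] = true;
    return true;
  });
}

var sample = ["a", "b", "a", "c", "a", "b"];
console.log(JSON.stringify(countUniquePipeline(sample)));
// [{"string":"a","count":3},{"string":"b","count":2},{"string":"c","count":1}]
```

    Note that stage 1 produces one entry per instance, so the intermediate array is as large as the number of Strings in the heap; only the final filter reduces it to one entry per distinct value.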

  • 2020-12-29 09:31

    A far more efficient query:

    var countByValue = {};
    
    // Iterate over all String instances and tally each value
    heap.forEachObject(
      function(strObject) {
        var key = strObject.toString();
        var count = countByValue[key];
        countByValue[key] = count ? count + 1 : 1;
      },
      "java.lang.String",
      false
    );
    
    // Convert the map into an array of {count, string} entries
    var mapEntries = [];
    for (var i = 0, keys = Object.keys(countByValue), total = keys.length; i < total; i++) {
      mapEntries.push({
        count : countByValue[keys[i]],
        string : keys[i]
      });
    }
    
    // Sort the counts
    sort(mapEntries, 'rhs.count - lhs.count');
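
    If only the values that occur more than once are of interest, the same single-pass tally can be trimmed before it is sorted. The following is a sketch in plain JavaScript, not OQL: a sample array stands in for the strings visited by heap.forEachObject, and tallyDuplicates, minCount and limit are hypothetical names. Whether Object.create is available inside the OQL console depends on the JavaScript engine behind it, which is an assumption here.

```javascript
// Illustrative only: `values` stands in for the strings visited by
// heap.forEachObject; tallyDuplicates, minCount and limit are made-up names.
function tallyDuplicates(values, minCount, limit) {
  // Object.create(null) gives a map without inherited keys such as "constructor".
  var countByValue = Object.create(null);

  // Single pass: one counter per distinct value.
  values.forEach(function (value) {
    countByValue[value] = (countByValue[value] || 0) + 1;
  });

  return Object.keys(countByValue)
    .map(function (key) {
      return { count: countByValue[key], string: key };
    })
    .filter(function (entry) {
      return entry.count >= minCount;   // drop values below the threshold
    })
    .sort(function (lhs, rhs) {
      return rhs.count - lhs.count;     // most frequent first
    })
    .slice(0, limit);                   // keep at most `limit` entries
}

console.log(JSON.stringify(tallyDuplicates(["x", "y", "x", "z", "y", "x", "w"], 2, 10)));
// [{"count":3,"string":"x"},{"count":2,"string":"y"}]
```

    Because the tally holds one counter per distinct value rather than one entry per instance, the array that gets sorted is usually much smaller than in a map/sort/filter pipeline over every String.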
    
  • 2020-12-29 09:35

    Sadly, there isn't an equivalent to "group by" in OQL. I'm assuming you're talking about the OQL that is used in jhat and VisualVM.

    There is an alternative, though. If you use pure JavaScript syntax instead of the "select x from y" syntax then you have the full power of JavaScript to work with.

    Even so, the alternative way of getting the information you're looking for isn't simple. For example, here's an OQL "query" that counts the unique strings:

    var set={};
    sum(map(heap.objects("java.lang.String"),function(heapString){
      if(set[heapString.toString()]){
        return 0;
      }
      else{
        set[heapString.toString()]=true;
        return 1;
      }
    }));
    

    In this example a regular JavaScript object mimics a set (a collection with no duplicates). As the map function goes through each string, the set is used to determine whether the string has already been seen. Duplicates don't count toward the total (return 0) but new strings do (return 1).
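
    One caveat with using a regular object as a set: an object created with {} inherits properties such as constructor and toString, so a string with exactly that value looks as if it had already been seen. The sketch below shows the effect in plain JavaScript with a made-up sample array in place of the heap strings; countDistinct is a hypothetical helper, and the availability of Object.create in the OQL console's JavaScript engine is an assumption.

```javascript
// Plain-JavaScript sketch; `values` stands in for the heap strings and
// countDistinct is a hypothetical helper name.
// `set` is the object used to remember which values were already seen.
function countDistinct(values, set) {
  return values.reduce(function (total, value) {
    if (set[value]) {
      return total;        // treated as a duplicate
    }
    set[value] = true;
    return total + 1;      // first time this value is seen
  }, 0);
}

var sample = ["id", "constructor", "id", "toString"];

// {} inherits "constructor" and "toString", so both are skipped.
console.log(countDistinct(sample, {}));                  // 1

// An object without a prototype has no inherited keys.
console.log(countDistinct(sample, Object.create(null))); // 3
```

    For most heaps the difference is a handful of strings at most, but it explains an occasional off-by-a-few count.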

  • 2020-12-29 09:35

    Method 1

    You can select all the strings and then use the terminal to aggregate them.

    1. Increase the OQL result limit in the VisualVM config files
    2. Restart VisualVM
    3. Run an OQL query to get all the strings
    4. Copy and paste them into vim
    5. clean the data with vim macros so there's
    6. Run sort | uniq -c to get the counts.

    Method 2

    1. Use a tool to dump all the fields of the class you're interested in ( https://github.com/josephmate/DumpHprofFields can do it )
    2. Use bash to select the strings you're interested in
    3. Use bash to aggregate
  • 2020-12-29 09:46

    Posting my solution and experience with a similar problem, for reference.

    var counts = {};
    var alreadyReturned = {};
    top(
    filter(
        sort(
            map(heap.objects("java.lang.ref.Finalizer"),
                function (fobject) {
                    var className = classof(fobject.referent);
                    if (!counts[className]) {
                        counts[className] = 1;
                    } else {
                        counts[className] = counts[className] + 1;
                    }
                    return {string: className, count: counts[className]};
                }),
            'rhs.count-lhs.count'),
        function (countObject) {
            if (!alreadyReturned[countObject.string]) {
                alreadyReturned[countObject.string] = true;
                return true;
            } else {
                return false;
            }
        }),
        'rhs.count - lhs.count', 10);
    

    The code above outputs the top 10 classes of objects referenced by java.lang.ref.Finalizer instances.
    Tips:
    1. Passing a function as the comparator to sort did NOT work on my Mac, so the comparators here are string expressions.
    2. Use classof to get the class of the referent. I first tried fobject.referent.toString(), but that returned a lot of org.netbeans.lib.profiler.heap.InstanceDump values and wasted a lot of my time.
