Google Sheets: select top N cases within groups using QUERY

你说的曾经没有我的故事 提交于 2019-12-24 09:01:36

问题


I want to make a report: "select salesmen who have N recent Status codes = 2,3,4,5".

A structural example of my data has 35 rows (including 1 header row): link. This file has a Date, Sales code (id of a salesman), Status code (id of how successful a transaction was) and other fields which are not necessary for the purpose.

I ended up using three formulas:

  1. a QUERY function with IMPORTRANGE.
    In the example data it is slightly simpler - take only Sales Codes, Status Code, and Date from another sheet, then order by Date, Sales Code. Formula in A9:
    =QUERY({Source!D1:E, Source!A1:A}, $B$4, 1)

  2. an additional column with a sequentional numbering. Formula in D10:
    =ArrayFormula(if(len(A10:A), ROW(A10:A) - MATCH(A10:A,A10:A,0) - 8, ))

  3. a QUERY function to extract only N cases (let's say 5). Formula in F9:
    =QUERY(A9:D, "select A, B where D <="&B3, 1)

Is there a way to combine all 3 steps into one so I get the output like in F10:G24 using one (and hopefully fast :)) formula? The formula I tried (expanded for readability):

=QUERY(
    {
      QUERY({Source!D1:E, Source!A1:A}, $B$4, 1),
      ArrayFormula(
         IF(len(J10:J),
            ROW(J10:J) - MATCH(J10:J, J10:J, 0) - 8,
         )
      )
    },
    "select Col1, Col2 where Col4 <="&B3,
    1
)

It gives me the error:

"Function ARRAY_ROW parameter 2 has mismatched row size. Expected: 28. Actual: 991."

I also tried finite data ranges in ROW() and MATCH() but that yields an empty table. In the original database there are ~3500 rows and they will expand, so I think I should stick to infinite ranges to automate data extraction.


回答1:


Hm well it's a bit of a nightmare TBH - a similar question has cropped up before but no easy answer. Here is a draft which involves repeating the basic query several times-

=ArrayFormula(query({query(A2:E,"select * where E>=2 and E<=5  order by D, A desc"),
row(indirect("2:"&count(filter(E2:E,E2:E>=2,E2:E<=5))+1))-
match(query(A2:E,"select D where E>=2 and E<=5  order by D, A desc"),
query(A2:E,"select D where E>=2 and E<=5  order by D, A desc"),0)
},"select * where Col6<=5"))

But needs looking at more to see if it can be simplified.

This is the full-column version including headers - I think it's OK

=ArrayFormula(query({query(A:E,"select * where E>=2 and E<=5  order by D, A desc"),
row(indirect("1:"&count(filter(E:E,E:E>=2,E:E<=5))+1))+1-
match(query(A:E,"select D where E>=2 and E<=5  order by D, A desc"),
query(A:E,"select D where E>=2 and E<=5  order by D, A desc"),0)
},"select * where Col6<=5"))



来源:https://stackoverflow.com/questions/51582617/google-sheets-select-top-n-cases-within-groups-using-query

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!