How to partially filter substring from a table with count

只谈情不闲聊 submitted on 2019-12-11 16:44:38

Question


I am trying to filter partial substrings out of a list of strings, and I achieve it with the query below. It also prints a count, but the value is always 1. I need it to print the real count for each term.

#standardSQL
WITH `project.dataset.table` AS (
  SELECT term
  FROM (
    SELECT LOWER(REGEXP_EXTRACT(textPayload, "Search term:(.*)")) AS term
    FROM `log_dataset.backend_*`
    WHERE REGEXP_CONTAINS(textPayload, "Search term:.*") = true
  )
  GROUP BY term
  ORDER BY COUNT(*) DESC
), temp AS (
  SELECT term, COUNT(1) `count`
  FROM `project.dataset.table`
  GROUP BY term
)
SELECT term, `count` FROM (
  SELECT term, `count`, STARTS_WITH(prev_str, term) AND
    ARRAY_LENGTH(REGEXP_EXTRACT_ALL(term, r' ')) = ARRAY_LENGTH(REGEXP_EXTRACT_ALL(prev_str, r' ')) AS flag
  FROM (
    SELECT term, `count`, LAG(term) OVER(ORDER BY term DESC) AS prev_str
    FROM temp
  )
)
WHERE NOT IFNULL(flag, FALSE)

This is the list of terms:

anderstand
anderstan
andersta
anderst
understand
understan
understa
underst
unders
under
understand i
understand i
understand it
understand it
understand it y
understand it ye
understand it yes
understand it yes it
understand it yes it

The desired output is:

Row str                   count
1   understand it yes it   2
2   anderstand             1
3   understand it yes      1
4   understand             1
5   understand it          2

Answer 1:


To obtain the desired output, you can count the rows for each string with a GROUP BY clause, as follows:

SELECT
  str,
  COUNT(*) AS count
FROM
  `project_id.dataset.table`
GROUP BY
  str

In addition, the LIKE operator can be used to filter the words in the str field.
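
As a small illustration of the LIKE suggestion, the sketch below counts only the strings that begin with a given prefix. The inline `sample` data, the `cnt` alias and the prefix `'understand%'` are assumptions made for the example, not part of the original answer.

```sql
#standardSQL
WITH sample AS (
  -- Hypothetical inline rows standing in for the str column.
  SELECT str FROM UNNEST([
    'understand it', 'understand it', 'anderstand', 'understand'
  ]) AS str
)
SELECT
  str,
  COUNT(*) AS cnt
FROM
  sample
WHERE
  str LIKE 'understand%'  -- % matches any sequence of characters
GROUP BY
  str
```

Here 'anderstand' is filtered out before the aggregation, so only strings starting with "understand" are counted.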



Source: https://stackoverflow.com/questions/59011960/how-to-partially-filter-substring-from-a-table-with-count
