How to filter out rows with a given column (not null)?

旧时模样 posted on 2020-01-23 06:08:13

Question


I want to run an HBase scan with filters. For example, my table has column families A, B, and C, and A has a column X. Some rows have column X and some do not. How can I implement a filter that filters out all the rows with column X?


Answer 1:


I guess you are looking for SingleColumnValueFilter in HBase. As mentioned in the API documentation:

To prevent the entire row from being emitted if the column is not found on a row, use setFilterIfMissing(boolean) on Filter object. Otherwise, if the column is found, the entire row will be emitted only if the value passes. If the value fails, the row will be filtered out.

However, SingleColumnValueFilter needs a value to compare column X against using a CompareOp. For example: return the row if ColumnX == "X", or return the row if ColumnX != "a sentinel value that ColumnX can never take", combined with setFilterIfMissing(true), so that every row in which ColumnX has some value is returned.
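The row-level decision described in the quoted API text can be sketched as a small plain-Java model. This is only an illustration under stated assumptions: it does not call HBase, and the class name MissingColumnModel, the method emitted, and the sentinel string are hypothetical. It models a NOT_EQUAL comparison against a sentinel value on one column.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of the documented SingleColumnValueFilter row decision.
// It does not use the HBase API; all names here are hypothetical.
class MissingColumnModel {

    // Returns true if the row would be emitted by a NOT_EQUAL comparison
    // of column `qualifier` against `sentinel`.
    static boolean emitted(Map<String, byte[]> row, String qualifier,
                           byte[] sentinel, boolean filterIfMissing) {
        byte[] value = row.get(qualifier);
        if (value == null) {
            // Column not found: the row is emitted unless filterIfMissing is set.
            return !filterIfMissing;
        }
        // Column found: the row is emitted only if the value passes the comparison.
        return !Arrays.equals(value, sentinel);
    }

    public static void main(String[] args) {
        byte[] sentinel = "__never__".getBytes();

        Map<String, byte[]> withX = new HashMap<>();
        withX.put("X", "42".getBytes());

        Map<String, byte[]> withoutX = new HashMap<>();
        withoutX.put("Y", "7".getBytes());

        System.out.println(emitted(withX, "X", sentinel, true));    // true
        System.out.println(emitted(withoutX, "X", sentinel, true)); // false
    }
}
```

In HBase terms this would roughly correspond to a SingleColumnValueFilter built with CompareOp.NOT_EQUAL and the sentinel bytes, followed by setFilterIfMissing(true). Under the same documented rules, an EQUAL comparison against the sentinel with filterIfMissing left at false should do the opposite and emit only the rows that lack the column, though that variant is an inference from the API text rather than something stated in this thread.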

I hope this nudges you in the right direction.




Answer 2:


You can use a SkipFilter together with a ColumnPrefixFilter. The ColumnPrefixFilter matches keys where the column exists (an HBase row only has a column if that column has a value). The SkipFilter gives you the "not" of the first filter, so the row will be omitted.




Answer 3:


Ankit Arnon user1573269

The only way I could get it to work is shown below.

I have a table with columns rule1, rule2, rule3, and so on. A row can have only rule1, or rule1 and rule2, or rule1, rule2, and rule3, and so on. Say I want to extract the rows that have ONLY rule1 in them. That means I have to skip the rows that have rule2 in them.

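// Uses classes from org.apache.hadoop.hbase.client, org.apache.hadoop.hbase.filter
// and org.apache.hadoop.hbase.util.Bytes.
Scan getRules = new Scan();
ColumnPrefixFilter rule1Filter = new ColumnPrefixFilter(Bytes.toBytes("rule1"));
SingleColumnValueFilter skipRule2Value = new SingleColumnValueFilter(
        Bytes.toBytes("rules"), Bytes.toBytes("rule2"),
        CompareOp.EQUAL, Bytes.toBytes("0"));
SkipFilter skipRule2 = new SkipFilter(skipRule2Value);
getRules.setFilter(rule1Filter);
// Note: Scan.setFilter replaces any filter set earlier, so after this call only
// skipRule2 is active. To apply both, they would have to be combined, e.g. in a FilterList.
getRules.setFilter(skipRule2);
ResultScanner scanner = htable.getScanner(getRules);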

Though this worked, I am not very happy with the solution. It takes time for HBase to work it out. I would have thought there would be an easier, more straightforward method that does not have to check the value. Arnon, your method does not work because SkipFilter skips the rows that do NOT satisfy the condition. Hence constructing it from a ColumnPrefixFilter fails the requirement.
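The objection to wrapping a ColumnPrefixFilter in a SkipFilter can be illustrated with a simplified plain-Java model. This is a sketch under assumptions, not HBase code: the class SkipFilterModel and the method rowKept are hypothetical, and a row is reduced to a list of column qualifiers. It relies on the documented behaviour that SkipFilter filters out an entire row if any cell fails the wrapped filter's check.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Simplified plain-Java model of SkipFilter; not HBase code, names are hypothetical.
class SkipFilterModel {

    // The row survives only if every cell passes the wrapped per-cell check;
    // a single failing cell causes the whole row to be skipped.
    static boolean rowKept(List<String> qualifiers, Predicate<String> wrappedCheck) {
        return qualifiers.stream().allMatch(wrappedCheck);
    }

    public static void main(String[] args) {
        // Stand-in for a ColumnPrefixFilter on the prefix "rule2".
        Predicate<String> prefixRule2 = q -> q.startsWith("rule2");

        System.out.println(rowKept(Arrays.asList("rule1"), prefixRule2));          // false
        System.out.println(rowKept(Arrays.asList("rule1", "rule2"), prefixRule2)); // false
        System.out.println(rowKept(Arrays.asList("rule2"), prefixRule2));          // true
    }
}
```

In this model the wrapper keeps only rows in which every column matches the prefix. It does not act as a negation that keeps rows lacking the column, which is consistent with the point made above.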



Source: https://stackoverflow.com/questions/12858995/how-to-filter-out-rows-with-given-columnnot-null
