How can I skip HBase rows that are missing specific columns?

Submitted by 自闭症网瘾萝莉.ら on 2019-12-07 09:57:30

The HBase book is the best place to look for answers to a large number of questions like this one. In particular, http://hbase.apache.org/book/client.filter.html explains how filters work.

Filters are efficient because they are evaluated on the server side, which reduces the amount of data sent over the network. I agree that the javadocs make the include/exclude semantics non-obvious, but I think the book makes it clear: a filter defines what must be true for a row to be returned to the client.

Scans are also a good way to define what must be returned, although you need to be careful in how you set them up. If you add a whole column family to a scan in one API call (Scan.addFamily) and later in your code add a specific column qualifier from that family (Scan.addColumn), the second call overrides the first: only that specific qualifier is returned, and no other qualifier in the column family.
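The override described above can be pictured with a small plain-Java model. This is only a sketch of the selection logic, not the HBase client itself: the class ScanSelectionModel and its returns method are hypothetical names, and the "null means whole family" bookkeeping is my assumption about how the selection is tracked.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Plain-Java model of a scan's column selection; NOT the HBase API.
final class ScanSelectionModel {
    // family -> requested qualifiers; a null value stands for "the whole family" (assumption).
    private final Map<String, Set<String>> familyMap = new HashMap<>();

    // Models selecting an entire column family.
    ScanSelectionModel addFamily(String family) {
        familyMap.put(family, null);
        return this;
    }

    // Models selecting one qualifier; an earlier whole-family selection is narrowed to it.
    ScanSelectionModel addColumn(String family, String qualifier) {
        Set<String> qualifiers = familyMap.get(family);
        if (qualifiers == null) {
            qualifiers = new TreeSet<>();
            familyMap.put(family, qualifiers);
        }
        qualifiers.add(qualifier);
        return this;
    }

    // True if the given column would be part of the result under this selection.
    boolean returns(String family, String qualifier) {
        if (familyMap.isEmpty()) {
            return true; // nothing selected explicitly: everything is returned
        }
        if (!familyMap.containsKey(family)) {
            return false;
        }
        Set<String> qualifiers = familyMap.get(family);
        return qualifiers == null || qualifiers.contains(qualifier);
    }

    public static void main(String[] args) {
        ScanSelectionModel scan = new ScanSelectionModel()
                .addFamily("cf")        // first: the whole family
                .addColumn("cf", "a");  // later: one qualifier only
        System.out.println(scan.returns("cf", "a")); // true
        System.out.println(scan.returns("cf", "b")); // false
    }
}
```

In the model, the later addColumn call is what narrows the result, which is why mixing the two calls for the same family in different parts of the code is easy to get wrong.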

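import org.apache.hadoop.hbase.filter.*;
import org.apache.hadoop.hbase.util.Bytes;

// Accepts only columns whose qualifier starts with the prefix "col-1"
Filter columnFilter = new ColumnPrefixFilter(Bytes.toBytes("col-1"));
// Accepts only cells whose value is not equal to "yourvalue"
// (CompareFilter.CompareOp is the older API; HBase 2.x clients use CompareOperator)
Filter valueFilter = new ValueFilter(CompareFilter.CompareOp.NOT_EQUAL,
        new BinaryComparator(Bytes.toBytes("yourvalue")));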

To handle multiple column names, you can pass several column filters in a FilterList with the MUST_PASS_ALL operator (which is the default). Note that a cell then has to satisfy every filter in the list.
Below is a sample with a single column filter.

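// SkipFilter drops the entire row if the wrapped filter rejects any of its cells
Filter avoidColumnNamesFilter = new SkipFilter(columnFilter);
scan.setFilter(avoidColumnNamesFilter);

Similarly, to skip rows that contain a certain value, pass valueFilter to the SkipFilter.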
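Because the row-level effect of the SkipFilter wrapper is easy to misread, here is a minimal plain-Java sketch of it. It does not use the HBase API: SkipRowModel, keptRows, and the sample rows are hypothetical, and the only behaviour modelled is "drop the whole row as soon as the wrapped filter rejects one of its cells".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Plain-Java model of "skip the whole row" semantics; NOT the HBase API.
final class SkipRowModel {
    // rows: row key -> (column qualifier -> value).
    // cellFilter: returns true for cells the wrapped filter would accept.
    // A row is kept only if every one of its cells is accepted.
    static List<String> keptRows(Map<String, Map<String, String>> rows,
                                 BiPredicate<String, String> cellFilter) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : rows.entrySet()) {
            boolean allAccepted = true;
            for (Map.Entry<String, String> cell : row.getValue().entrySet()) {
                if (!cellFilter.test(cell.getKey(), cell.getValue())) {
                    allAccepted = false; // one rejected cell skips the row
                    break;
                }
            }
            if (allAccepted) {
                kept.add(row.getKey());
            }
        }
        return kept;
    }

    // Builds one row from alternating qualifier/value arguments.
    static Map<String, String> cells(String... kv) {
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            row.put(kv[i], kv[i + 1]);
        }
        return row;
    }

    // Small hypothetical data set used for the demonstration.
    static Map<String, Map<String, String>> sampleRows() {
        Map<String, Map<String, String>> rows = new LinkedHashMap<>();
        rows.put("row1", cells("col-1a", "x", "col-1b", "y"));
        rows.put("row2", cells("col-1a", "yourvalue", "col-1b", "y"));
        rows.put("row3", cells("col-1a", "x", "other", "z"));
        return rows;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> rows = sampleRows();
        // Analogue of a SkipFilter around a "value != yourvalue" filter
        System.out.println(keptRows(rows, (q, v) -> !v.equals("yourvalue"))); // [row1, row3]
        // Analogue of a SkipFilter around a "qualifier starts with col-1" filter
        System.out.println(keptRows(rows, (q, v) -> q.startsWith("col-1"))); // [row1, row2]
    }
}
```

In this model the prefix case keeps only rows whose columns all start with "col-1"; a row with any other column is skipped entirely, so it is worth checking that this is the behaviour you want before wrapping a column filter this way.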