How to remove duplicates from a single Power Query column without removing entire rows?

Submitted by 南笙酒味 on 2019-12-08 13:29:06

Question


I have a merged query in which I need to remove duplicates from my Invoice+Tax column without removing the entire row for each duplicate. The row must stay; only the repeated value in that one column should be cleared. In the picture, I have highlighted the values that should be removed; anything not highlighted needs to remain. My code up to this point is below.

let
Order = Order,
Source = Sql.Database("jansql01", "mas500_app"),
dbo_vdvInvoiceLine = Source{[Schema="dbo",Item="vdvInvoiceLine"]}[Data],
#"Removed Other Columns" = Table.SelectColumns(dbo_vdvInvoiceLine,{"Description", "ItemID", "STaxClassID", "ExtAmt", "FreightAmt", "TranID", "TradeDiscAmt", "FormattedGLAcctNo", "Segment1", "Segment2", "Segment3", "SalesOrder", "CustID", "CustName", "TranDate", "PostDate", "City", "StateID", "ItemClassID", "UseTaxRate", "ReleaseSO", "Job Number"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each Text.Contains([SalesOrder], Order)),
#"Added Material Column" = Table.AddColumn(#"Filtered Rows", "Material", each if [ItemClassID] <> "INSTALLATION" then [ExtAmt] else 0),
#"Added Installation Column" = Table.AddColumn(#"Added Material Column", "Installation", each if [ItemClassID] = "INSTALLATION" then [ExtAmt] else 0),
#"Merged Queries" = Table.NestedJoin(#"Added Installation Column",{"TranID"},vdvInvoice,{"TranID"},"vdvInvoice",JoinKind.LeftOuter),
#"Expanded vdvInvoice" = Table.ExpandTableColumn(#"Merged Queries", "vdvInvoice", {"STaxAmt"}, {"vdvInvoice.STaxAmt"}),
#"Extracted Date" = Table.TransformColumns(#"Expanded vdvInvoice",{{"TranDate", DateTime.Date, type date}, {"PostDate", DateTime.Date, type date}}),
#"Added Invoice+Tax" = Table.AddColumn(#"Extracted Date", "Invoice+Tax", each [TranID]&Number.ToText([vdvInvoice.STaxAmt]))

in
#"Added Invoice+Tax"

Answer 1:


I can't think of any reason to do it, but if you really want to, add a comma after the #"Added Invoice+Tax" step and replace the bottom two lines (in and #"Added Invoice+Tax") with

#"Added Index" = Table.AddIndexColumn(#"Added Invoice+Tax", "Index", 0, 1),
#"Added Custom" = Table.AddColumn(#"Added Index", "Invoice+Tax2", each if [Index]=0 then [#"Invoice+Tax"] else if #"Added Index"{[Index]-1}[#"Invoice+Tax"]=[#"Invoice+Tax"] then null else [#"Invoice+Tax"]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index"})
in
#"Removed Columns"
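
To see what these steps do in isolation, here is a minimal sketch on a small in-memory table. The Sample table and its values are hypothetical stand-ins for the merged invoice query; only the three steps after it come from the answer. Note that the comparison looks at the previous row only, so it assumes duplicate values sit on adjacent rows, and it writes the result to a new Invoice+Tax2 column rather than changing Invoice+Tax itself.

```powerquery
let
    // Hypothetical sample data standing in for the merged invoice query
    Sample = #table(
        {"TranID", "Invoice+Tax"},
        {
            {"A", "A10"},
            {"A", "A10"},
            {"B", "B5"},
            {"B", "B5"},
            {"B", "B5"}
        }
    ),
    // Zero-based row number, used to look up the previous row
    #"Added Index" = Table.AddIndexColumn(Sample, "Index", 0, 1),
    // Keep the value on the first row of each run; return null when it repeats the previous row
    #"Added Custom" = Table.AddColumn(
        #"Added Index",
        "Invoice+Tax2",
        each if [Index] = 0 then [#"Invoice+Tax"]
             else if #"Added Index"{[Index] - 1}[#"Invoice+Tax"] = [#"Invoice+Tax"] then null
             else [#"Invoice+Tax"]
    ),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom", {"Index"})
in
    #"Removed Columns"
```

With this sample, Invoice+Tax2 holds A10, null, B5, null, null from top to bottom. If the real data is not already ordered so that duplicates are adjacent, sorting on Invoice+Tax first would probably be needed.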



Answer 2:


Here's another approach, just for the heck of it:

After your line that says #"Added Invoice+Tax" = Table.AddColumn(#"Extracted Date", "Invoice+Tax", each [TranID]&Number.ToText([vdvInvoice.STaxAmt])), add a comma and then, in place of

in
#"Added Invoice+Tax"

add this:

#"Grouped Rows" = Table.Group(#"Added Invoice+Tax", {"Invoice+Tax"}, {{"AllData", each Table.FromColumns({[Installation],[vdvInvoice.STaxAmt],{[#"Invoice+Tax"]{0}}},{"Installation", "vdvInvoice.STaxAmt", "Invoice+Tax"}), type table}}),
#"Removed Other Columns2" = Table.SelectColumns(#"Grouped Rows",{"AllData"}),
#"Expanded AllData" = Table.ExpandTableColumn(#"Removed Other Columns2", "AllData", {"Installation", "vdvInvoice.STaxAmt", "Invoice+Tax"}, {"Installation", "vdvInvoice.STaxAmt", "Invoice+Tax"})
in
#"Expanded AllData"

The #"Grouped Rows" line above groups by Invoice+Tax and builds a sub-table for each distinct Invoice+Tax value from the original table's columns. Each sub-table pulls all of the Installation and vdvInvoice.STaxAmt rows associated with that Invoice+Tax value, but only the first Invoice+Tax entry. Table.FromColumns({[Installation],[vdvInvoice.STaxAmt],{[#"Invoice+Tax"]{0}}}... is basically saying: take every row of the Installation and vdvInvoice.STaxAmt columns, and take only list item 0 (row 1) of the list that is the Invoice+Tax column. Because that last list is shorter than the other two, Table.FromColumns fills the remaining Invoice+Tax cells with null, which is what blanks out the duplicates.
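
The same grouping can be tried on a small in-memory table. This is only a sketch: the Sample table and its values are hypothetical, while the column names and the Table.Group / Table.FromColumns pattern mirror the answer.

```powerquery
let
    // Hypothetical sample data; column names mirror the answer
    Sample = #table(
        {"Installation", "vdvInvoice.STaxAmt", "Invoice+Tax"},
        {
            {0, 10, "A10"},
            {50, 10, "A10"},
            {0, 5, "B5"}
        }
    ),
    // One sub-table per Invoice+Tax value; the third list holds only the first value,
    // so Table.FromColumns pads the rest of that column with null
    #"Grouped Rows" = Table.Group(
        Sample,
        {"Invoice+Tax"},
        {{"AllData", each Table.FromColumns(
            {[Installation], [vdvInvoice.STaxAmt], {[#"Invoice+Tax"]{0}}},
            {"Installation", "vdvInvoice.STaxAmt", "Invoice+Tax"}
        ), type table}}
    ),
    // Drop the grouping key so only the rebuilt sub-tables remain
    #"Removed Other Columns" = Table.SelectColumns(#"Grouped Rows", {"AllData"}),
    #"Expanded AllData" = Table.ExpandTableColumn(
        #"Removed Other Columns",
        "AllData",
        {"Installation", "vdvInvoice.STaxAmt", "Invoice+Tax"}
    )
in
    #"Expanded AllData"
```

With this sample the result has three rows, and the Invoice+Tax column reads A10, null, B5. Only the columns listed inside Table.FromColumns survive the grouping, so any other column that should be kept would have to be added to both the list of columns and the list of names, and again in the expand step.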

The complete query, that is, the initial query you provided above with my little part added, would be:

let
Order = Order,
Source = Sql.Database("jansql01", "mas500_app"),
dbo_vdvInvoiceLine = Source{[Schema="dbo",Item="vdvInvoiceLine"]}[Data],
#"Removed Other Columns" = Table.SelectColumns(dbo_vdvInvoiceLine,{"Description", "ItemID", "STaxClassID", "ExtAmt", "FreightAmt", "TranID", "TradeDiscAmt", "FormattedGLAcctNo", "Segment1", "Segment2", "Segment3", "SalesOrder", "CustID", "CustName", "TranDate", "PostDate", "City", "StateID", "ItemClassID", "UseTaxRate", "ReleaseSO", "Job Number"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each Text.Contains([SalesOrder], Order)),
#"Added Material Column" = Table.AddColumn(#"Filtered Rows", "Material", each if [ItemClassID] <> "INSTALLATION" then [ExtAmt] else 0),
#"Added Installation Column" = Table.AddColumn(#"Added Material Column", "Installation", each if [ItemClassID] = "INSTALLATION" then [ExtAmt] else 0),
#"Merged Queries" = Table.NestedJoin(#"Added Installation Column",{"TranID"},vdvInvoice,{"TranID"},"vdvInvoice",JoinKind.LeftOuter),
#"Expanded vdvInvoice" = Table.ExpandTableColumn(#"Merged Queries", "vdvInvoice", {"STaxAmt"}, {"vdvInvoice.STaxAmt"}),
#"Extracted Date" = Table.TransformColumns(#"Expanded vdvInvoice",{{"TranDate", DateTime.Date, type date}, {"PostDate", DateTime.Date, type date}}),
#"Added Invoice+Tax" = Table.AddColumn(#"Extracted Date", "Invoice+Tax", each [TranID]&Number.ToText([vdvInvoice.STaxAmt])),
#"Grouped Rows" = Table.Group(#"Added Invoice+Tax", {"Invoice+Tax"}, {{"AllData", each Table.FromColumns({[Installation],[vdvInvoice.STaxAmt],{[#"Invoice+Tax"]{0}}},{"Installation", "vdvInvoice.STaxAmt", "Invoice+Tax"}), type table}}),
#"Removed Other Columns2" = Table.SelectColumns(#"Grouped Rows",{"AllData"}),
#"Expanded AllData" = Table.ExpandTableColumn(#"Removed Other Columns2", "AllData", {"Installation", "vdvInvoice.STaxAmt", "Invoice+Tax"}, {"Installation", "vdvInvoice.STaxAmt", "Invoice+Tax"})
in
#"Expanded AllData"


Source: https://stackoverflow.com/questions/53817863/how-to-remove-duplicates-from-single-power-query-column-without-removing-entries
