In SQL Server CDC with SSIS, which data should be stored for windowing (LSN or Date)?

Submitted by 时光毁灭记忆、已成空白 on 2019-12-11 07:49:37

Question


I have implemented delta detection when loading a data warehouse from transactional systems, using an identity column or a date-time column in the source transaction tables. On each extraction, the maximum date-time value extracted last time is used in the filter of the extraction query to identify new or changed records. This worked well enough, except when multiple transactions occurred in the same millisecond.
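The same-millisecond gap can be reproduced in a few lines. This is a minimal sketch in Python, not the original extraction code: the row layout, the `extract_since` helper, and the strict greater-than filter are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical source rows as (id, modified_at); not taken from a real system.
t = datetime(2009, 7, 16, 10, 0, 0, 123000)
source = [(1, t)]


def extract_since(rows, watermark):
    """Return rows modified strictly after the stored watermark (None = first run)."""
    return [r for r in rows if watermark is None or r[1] > watermark]


# First run: picks up row 1 and stores its timestamp as the new watermark.
first = extract_since(source, None)
watermark = max(r[1] for r in first)

# A second transaction commits with the same millisecond value after the first run.
source.append((2, t))

# Second run: row 2 is not strictly later than the watermark, so it is skipped.
second = extract_since(source, watermark)
```

Switching the filter to greater-than-or-equal would pick up row 2, but it would also re-extract row 1, so the tie has to be resolved some other way.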

Now SQL Server 2008 offers Change Data Capture (CDC), which introduces the LSN (Log Sequence Number), a binary(10) value. I am unsure which value should be stored for windowing: the LSN or the date-time. Using the LSN eliminates the need to store additional date-time values in large transaction tables, but does it have any disadvantages? Which one should I use? I feel that mapping the LSN to a date-time and then storing the date-time is not a reliable method. What is your opinion?

PS: Apologies to non-BI professionals.


Answer 1:


See Improving Incremental Loads with Change Data Capture for information on using CDC with SSIS.




Answer 2:


After a long wait, I don't see any further answers here. I have used the LSN for windowing in my current project, and I find it better than date-time values because it is more precise and the process is simpler. I recommend using the LSN. If anyone disagrees, please let me know.
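As a rough illustration of LSN-based windowing, the sketch below models the bookkeeping in Python. It is a simulation under stated assumptions, not SSIS or T-SQL code: `increment_lsn`, `next_window`, `changes_in`, and the sample change rows are hypothetical names and data. In SQL Server itself the analogous built-ins are `sys.fn_cdc_increment_lsn` and `sys.fn_cdc_get_max_lsn`.

```python
LSN_BYTES = 10  # CDC LSNs are binary(10)


def lsn(n: int) -> bytes:
    """Build a 10-byte LSN from an integer (illustration only)."""
    return n.to_bytes(LSN_BYTES, "big")


def increment_lsn(value: bytes) -> bytes:
    """Return the next LSN, treating the 10 bytes as a big-endian integer.

    Raises OverflowError if the input is already the largest 10-byte value.
    """
    return (int.from_bytes(value, "big") + 1).to_bytes(LSN_BYTES, "big")


def next_window(last_lsn: bytes, max_lsn: bytes):
    """Return the closed (from, to) window for the next load, or None if nothing is new."""
    start = increment_lsn(last_lsn)
    return (start, max_lsn) if start <= max_lsn else None


def changes_in(rows, window):
    """Select change rows whose LSN falls inside the closed window."""
    start, end = window
    return [r for r in rows if start <= r[0] <= end]


# Hypothetical change rows as (start_lsn, payload); "a" and "b" share one LSN.
changes = [(lsn(5), "a"), (lsn(5), "b"), (lsn(6), "c")]

window = next_window(lsn(4), lsn(6))   # last stored LSN was 4, current max is 6
loaded = changes_in(changes, window)   # both rows at LSN 5 are included
stored = window[1]                     # persist the window's upper bound for next time
```

Because the window is closed on both ends and the next run starts at the increment of the stored upper bound, rows that share an LSN are neither skipped nor read twice.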




Answer 3:


If you set up CDC, a system table named cdc.lsn_time_mapping is added to your database, so you can use either.
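To make the time-to-LSN lookup concrete, here is a small Python sketch of a "largest LSN at or before a given time" query over a mapping shaped like cdc.lsn_time_mapping. The sample rows, the integer stand-ins for LSNs, and the `largest_lsn_at_or_before` helper are assumptions for illustration; in SQL Server the built-in `sys.fn_cdc_map_time_to_lsn` performs this kind of lookup.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical (tran_end_time, start_lsn) pairs, ordered by LSN.
# Integers stand in for binary(10) LSNs to keep the example short.
mapping = [
    (datetime(2009, 7, 16, 10, 0, 0), 10),
    (datetime(2009, 7, 16, 10, 0, 0), 11),  # same commit time, different LSN
    (datetime(2009, 7, 16, 10, 0, 5), 12),
]


def largest_lsn_at_or_before(rows, when):
    """Return the highest LSN whose commit time is <= when, or None if there is none."""
    times = [t for t, _ in rows]
    i = bisect_right(times, when)
    return rows[i - 1][1] if i else None
```

Two LSNs can share a commit time, so a stored date-time identifies a position in the log less precisely than a stored LSN does.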



Source: https://stackoverflow.com/questions/1137283/in-sql-server-cdc-with-ssis-which-data-should-be-stored-for-windowing-lsn-or-d
