Parsing an Excel file in C#, the cells seem to get cut off at 255 characters… how do I stop that?

吃可爱长大的小学妹 submitted on 2019-11-30 05:04:58

Just from a quick Google search on the subject, it appears that this is a limit of Excel.

EDIT: Possible workaround (unfortunately in VB)

Andrew Garrison

The Solution!

I've been battling this today as well. I finally got it to work by modifying some registry keys before parsing the Excel spreadsheet.

Update the registry key that matches your version of Excel:

// Excel 2010
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel\
or, for 32-bit Office on 64-bit Windows:
HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel\

// Excel 2007
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\12.0\Access Connectivity Engine\Engines\Excel\

// Excel 2003
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel\

Change TypeGuessRows to 0 and ImportMixedTypes to Text under this key. You'll also need to update your connection string to include IMEX=1 in the extended properties:

string connString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileLocation + ";Extended Properties=\"Excel 12.0 Xml;HDR=YES;IMEX=1\";";
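
To show where that connection string fits, here is a minimal sketch of reading a worksheet through OLE DB. The class name ExcelReader, the method names, and the default sheet name "Sheet1" are assumptions made for illustration, not part of the original answer. It also assumes the ACE provider is installed and that System.Data.OleDb is available (built into .NET Framework, a separate package on newer .NET).

```csharp
using System;
using System.Data;
using System.Data.OleDb;

static class ExcelReader
{
    // Builds the ACE connection string from the answer above.
    // IMEX=1 is the import-mode setting the answer recommends.
    public static string BuildConnectionString(string fileLocation)
    {
        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileLocation +
               ";Extended Properties=\"Excel 12.0 Xml;HDR=YES;IMEX=1\";";
    }

    // Reads one worksheet into a DataTable.
    // "Sheet1" is an assumed sheet name; pass your own if it differs.
    public static DataTable ReadSheet(string fileLocation, string sheetName = "Sheet1")
    {
        var table = new DataTable();
        using (var conn = new OleDbConnection(BuildConnectionString(fileLocation)))
        using (var adapter = new OleDbDataAdapter("SELECT * FROM [" + sheetName + "$]", conn))
        {
            // Fill opens the connection if it is closed and closes it again afterwards.
            adapter.Fill(table);
        }
        return table;
    }
}
```

After calling ExcelReader.ReadSheet on your workbook, checking the length of the longest string in the affected column is a quick way to confirm whether values still stop at 255 characters.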

References

http://blogs.vertigo.com/personal/aanttila/Blog/archive/2008/03/28/excel-and-csv-reference.aspx

http://msdn.microsoft.com/en-us/library/ms141683.aspx

...characters may be truncated. To import data from a memo column without truncation, you must make sure that the memo column in at least one of the sampled rows contains a value longer than 255 characters, or you must increase the number of rows sampled by the driver to include such a row. You can increase the number of rows sampled by increasing the value of TypeGuessRows under the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel registry key....
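
The registry change described in this answer could also be applied from code rather than by hand. The following is only a sketch: the class name ExcelDriverSettings and its members are hypothetical, the key shown is the Excel 2010 one from the list above, and the process needs administrator rights to write under HKEY_LOCAL_MACHINE.

```csharp
using System;
using Microsoft.Win32;

static class ExcelDriverSettings
{
    // Excel 2010 key copied from the list above; substitute the key for your version.
    public const string Excel2010KeyPath =
        @"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel";

    // Writes the two values the answer recommends. Requires elevation.
    public static void Apply(string keyPath)
    {
        // 0 is the value the answer recommends; by default the driver samples only the first 8 rows.
        Registry.SetValue(keyPath, "TypeGuessRows", 0, RegistryValueKind.DWord);
        // Treat columns with mixed types as text.
        Registry.SetValue(keyPath, "ImportMixedTypes", "Text", RegistryValueKind.String);
    }
}
```

A call such as ExcelDriverSettings.Apply(ExcelDriverSettings.Excel2010KeyPath) would need to run before the workbook is opened, and because the setting is machine-wide it affects every application that uses the same driver.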

I have come across this too, and the solution that worked for me was to move the cells with long text to the top of the spreadsheet.

I found this comment in a forum describing the issue:

This is an issue with the Jet OLEDB provider. It looks at the first 8 rows
of the spreadsheet to determine the data type in each column. If the column does
not contain a field value over 256 characters in the first 8 rows, then it assumes the
data type is text, which has a character limit of 256. The following KB article has
more information on this issue: http://support.microsoft.com/kb/281517

Hope this helps someone else!

Have you tried setting the column's data type to text within the spreadsheet? I believe doing this will allow the cells to contain much more than 255 characters.

[Edit] For what it's worth this dialog with the MS-Excel team is an interesting read. In the comments section at the bottom they get into some discussions about that 255 cutoff. They say Excel 12 can support 32k characters per cell.

If that is true, there must be a way to get at this data. Here are two things to consider.

  1. In the past I have used the "IMEX=1" option in my connection string to deal with columns containing mixed data showing up as empty. It's a long shot, but you might give that a try.

  2. Could you export the file to a tab delimited flat file? IMHO this is the most reliable way of dealing with Excel data, since Excel does have so many gotchas.
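
If you go the flat-file route from the second suggestion, reading the export back is straightforward. This is a minimal sketch, with the hypothetical names TabDelimited, ParseLine, and ReadRows. It assumes no field contains a tab, a line break, or a double quote, since Excel wraps such fields in quotes and a plain split would not handle them.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class TabDelimited
{
    // Splits one exported line into its fields.
    // Assumes fields contain no tabs, line breaks, or quoted text.
    public static string[] ParseLine(string line)
    {
        return line.Split('\t');
    }

    // Lazily reads every row of the exported file, one string array per line.
    public static IEnumerable<string[]> ReadRows(string path)
    {
        foreach (string line in File.ReadLines(path))
        {
            yield return ParseLine(line);
        }
    }
}
```

Because the text is read directly from the file, no driver type guessing is involved, so a 300-character cell comes back as a 300-character string.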

SpreadsheetGear for .NET can read and write (and more) xls and xlsx workbooks and supports the same limitations as Excel for text - in other words it will just work. There is a free evaluation if you want to give it a try.

Disclaimer: I own SpreadsheetGear LLC

Regarding the last post, I also use SpreadsheetGear and find that it also suffers from the 255 characters per cell limitation when reading from the older XLS (not XLSX) format.
