Working With Excel Files

Connection strings

For details of connection strings to use with different versions of Excel, see ConnectionString Attribute.

Detecting data types

For best results when importing Excel files, you may need to adjust some registry settings on the computer where the Business Importer is executing. This may be necessary because the OLE-DB driver for Excel automatically assigns the format for each column based on a sampling of the first eight rows of the spreadsheet. This may cause problems in scenarios like the following:
  • In a particular column, the first eight rows contain numeric values.
  • Therefore the OLE-DB driver sets the column type to numeric.
  • Further down the column, there are string (text) values in other rows.
  • When it encounters these, the driver may either discard the string values, or throw a fatal error.

One possible way to solve the problem is to change the number of rows considered by Excel. Of course, this is only reliable if all rows in the spreadsheet are examined; but on very large spreadsheets, this may have performance impacts.

If you wish to modify the number of rows Excel examines to determine the data type, use regedit to locate the following registry key:
  • For a 32-bit operating system:
    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel
  • For a 64-bit operating system:
    HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Excel

In this key, the value for TypeGuessRow defines the number of rows that Excel reads to determine the format of a column. A value of zero causes Excel to read the full worksheet to evaluate column types.

Working with mixed data types

Depending on the quality of the data and different scenarios that can occur, a spreadsheet column can sometimes contain mixed data types (for example, numeric and string). In this case, data should always be considered as string (with numeric values being numbers that are represented as strings). You can control the behavior of the OLE-DB driver in this case by setting the ImportMixedType value (in the same registry key listed above) to Text.

If you force this behavior, also ensure that, within the connection string defined in the XML adapter file, the extended properties set the value of IMEX to 1:
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=FullPathAndFileName;
Extended Properties=Excel 12.0 Xml;HDR=Yes;IMEX=1

One further control

For date/time columns in your source spreadsheet, you can use the Format attribute of the Property element to clarify the data type. This identifies the format of the data column in the source data, improving conversion for storage in the target database.

The last resort

Even with all these parameters and setting and the additional use of the Format attribute of the Property element, there are still few cases that cannot be handled by the OLE-DB Excel driver. These cases display as either fatal errors, or as data loss of data corruption. In these instances, the use of the text driver is recommended.

2021 R1