Wednesday, March 7, 2012

AS 400 data being interpreted as unicode strings (DT_WSTR) datatype

I am building a data warehouse. Some of the data comes from an AS 400 EPR system. I used the OLEDB connector when first pulling the data into SQL Server doing simple import data from table option. That worked great for getting the initial data load into SQL Server and creating the base SQL Server tables although it was excruciatingly slow (that was probably due to the transport from the AS 400).

Now, I need to get new records that are added to the AS400 side of things on a daily basis. For that, I was trying to use the OLEDB AS400 connector. However, I found that the OLEDB connector wouldn't work when I was trying to specify an SQL Statement for what to get; i.e., a simple query like Select * from TWLDAT.STKT where BYSDAT >= '2005-01-27' would simply not work. Found articles here explaining that it is probably a problem on the AS400 side of things and where people recommended using an ADO ODBC data reader source for this type of thing. So, I'm trying to implement that. However, I have a huge problem with it.

The original tables that got created were mapped to use NVARCHAR fields for character data. When the ADO ODBC data reader source accesses the AS400 data, it insists on interpreting the string type fields as being unicode strings and giving it a data type of DT_WSTR when what I need it to have is a plain old DT_STR data type. When the strings are interpreted as unicode strings, they cannot be converted in a way that allows the NVARCHAR fields to be filled with the data. The exact error message I get for all the fields that should wind up being nvarchar fields is as follows:

Column "BYStOK" cannot convert between unicode and non-unicode string data types.

Okay, so I try to change the data types in the ADO ODBC data reader to be plain DT_STR data types and I cannot do so.

Does anyone have any idea why the ADO ODBC data reader source insists on interpreting the string data coming from the AS 400 as unicode string data or why it refuses to allow that to be changed to DT_STR data type?

Thanks in advance for any info. By the way, if there is a better way than the ADO ODBC data source to get at this data when I need to specify an SQL command, I would love to hear about it. Not wild about using ODBC in the OLEDB age.

Steve Wells

swells wrote:

Does anyone have any idea why the ADO ODBC data reader source insists on interpreting the string data coming from the AS 400 as unicode string data or why it refuses to allow that to be changed to DT_STR data type?

Hi Steve,

Those are simply limitations of the Data Reader Src. The data reader source is a managed component that gets data as CLR types, so by the time strings get to the Data Reader Src they are already Unicode. The data reader src component itself does not do data conversion, so it does not allow its output column types to be changed.

You will need to put a Data Conversion transform downstream of the Data Reader Src to convert the DT_WSTR column to DT_STR before inserting in the destination.

Thanks
Mark

|||I haven't worked with an AS400 in about a year and a half so things may have changed with newer releases. That being said, IBMs OLEDB was merely a wrapper around their ODBC drivers and not all of the functionality exists in OLEDB. I remember having problems with scrollable cursors and such with OLEDB. Unless IBM has changed their architecture, OLEDB is really ODBC anyways just with fewer features and added overhead.

Microsoft has their own OLEDB driver for DB2 and you can get it from the here. Since I don't have access to an AS400 anymore, I can't say if it works or whether performance is better. Hit Software also provides very good data access libraries although they do cost quite a bit.|||

It looks like a lot of the trouble I am having between the AS400 and SQL server 2005 using SSIS is due to conversion problems. I did a simple "Hi World" app where I take a table with a NChar, Int, and Decimal fields. I move the data from the 400 to SQL then from SQL to the 400. I use the ADO.net for ODBC for the 400 to SQL and have to convert the NChar fields to DT_STR but the int and Decimal fields move as is. On the way back, I have to convert all the NChar fields to DT_STR and all the numeric fields to DT_NUMERIC. Not fun but it works. Before the upgrade to SQL 2005 I did not have to do any of this and it all just worked. I have over 100 DTS and SSIS packages working every day.

Microsoft is trying to push the problem to IBM saying their driver is not up to the task. That may be so but why did all of this stuff work pre 2005? Isn't it conveniant that MS has a driver but to use it you have to buy the more expensive version of Sql Server. For now we will just scrap the upgrade to SQL 2005 and hope that Microsoft and IBM can work it out. For now it is just not worth the trouble. Guess I can put off buying those licenses at least one more year.

|||

I have the same probleme to transfert data from AS400 to SQL Server 2005 database.

"cannot convert between unicode and non-unicode string data types"

( I dont want to add a Data Conversion Transformation,... too slow ! )

Any solution ?

|||Use a derived column and cast the field to DT_STR (or DT_WSTR is necessary).

It won't be slow.|||


Well...

- old DTS ( AS400 --> SQL2000 ) is working in 1 minute.
- DTSX ( AS400 --> SQL2005 ) doesnt work ( "cannot convert between unicode and non-unicode string data types" )
- DTSX ( AS400 --> SQL2005 using a Data Conversion Transformation * ) is working in 10 minutes !!

Data Conversion Transformation :
ex: 'Field_A' --> 'Copy of Field_A' with Data type = "DT_STR"

Moreover, I have 80 tables with 50 to 80 fields to transfer,... if I must set manualy Conversions for all fields i will spend days and days !

Any solution to change the "DataReader Output (external Columns) (Output Columns)" ?

|||

Sombre wrote:


Well...

- old DTS ( AS400 --> SQL2000 ) is working in 1 minute.
- DTSX ( AS400 --> SQL2005 ) doesnt work ( "cannot convert between unicode and non-unicode string data types" )
- DTSX ( AS400 --> SQL2005 using a Data Conversion Transformation * ) is working in 10 minutes !!

Data Conversion Transformation :
ex: 'Field_A' --> 'Copy of Field_A' with Data type = "DT_STR"

Moreover, I have 80 tables with 50 to 80 fields to transfer,... if I must set manualy Conversions for all fields i will spend days and days !

Any solution to change the "DataReader Output (external Columns) (Output Columns)" ?

Try using a derived column to cast the field!!!!!!

(DT_STR,50,1252)[Your_Field] (convert to non-unicode, 50 character length)|||

Edited: In fact my performance problem was about the FastLoad option (Locktable).

Nevertheless, it boring to have to explicitly rechange the data types :(

No comments:

Post a Comment