Read Microsoft Excel Document As Relational Table Using Teiid

You may be wondering who, in this age of big data with Hadoop, MongoDB, et al., still bothers to fiddle around with Excel. In reality, there are still a lot of corporate users out there who do their analytics and reporting in Excel, so it remains a very important source of data.

Teiid has supported consuming data from Microsoft Excel documents since its very early versions. However, it always relied on the "Excel" ODBC driver built into Windows platforms, and then used the "jdbc-odbc" bridge driver provided by Oracle's JDK to read the data in. Although this solution worked most of the time, it was tedious, prone to issues, and limited access to the Windows platform.

Teiid 8.7, the latest version, introduces a new translator based on the Apache POI framework, which provides a platform-independent way to read Excel documents. This has turned out to be much simpler, and it also provides a way to define the metadata for the data in Excel documents.
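
To give a rough idea of what defining metadata for a sheet can look like, here is a minimal DDL sketch. The table name `Person`, the file name `names.xlsx`, the sheet name `Sheet1`, and the column layout are all hypothetical, and the `teiid_excel:*` option names are written as I recall them from the Excel translator documentation, so please check them against the Teiid version you are running before relying on them.

```sql
-- Hypothetical sheet "Sheet1" in names.xlsx: header in row 1, data from row 2.
-- Each column maps to a cell (column) number in the sheet.
CREATE FOREIGN TABLE Person (
    ROW_ID    integer OPTIONS (SEARCHABLE 'All_Except_Like', "teiid_excel:CELL_NUMBER" 'ROW_ID'),
    FirstName string  OPTIONS (SEARCHABLE 'Unsearchable', "teiid_excel:CELL_NUMBER" '1'),
    LastName  string  OPTIONS (SEARCHABLE 'Unsearchable', "teiid_excel:CELL_NUMBER" '2'),
    Age       integer OPTIONS (SEARCHABLE 'Unsearchable', "teiid_excel:CELL_NUMBER" '3'),
    CONSTRAINT PK0 PRIMARY KEY (ROW_ID)
) OPTIONS (
    NAMEINSOURCE 'Sheet1',
    "teiid_excel:FILE" 'names.xlsx',
    "teiid_excel:FIRST_DATA_ROW_NUMBER" '2'
);
```

With a model like this deployed in a VDB, the sheet could then be queried like any other table, for example `SELECT FirstName, LastName FROM Person WHERE ROW_ID > 10`.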

If you are interested, I wrote a step-by-step example for this here: https://community.jboss.org/wiki/MicrosoftExcelDocumentIntoRelationalTable

Ramesh..

Comments

  1. great! this will help a lot. great also for replacing odbc method... sometimes it's necessary (sometimes temporarily... for months) use xls spreadsheets to speed up things, while a proper app is developed. At least for us. Thanks for caring :)
