Monday, January 16, 2012

SSIS Data flow or T-SQL

Advantages of using Stored Procedures

  • A SQL statement will outperform a SSIS data-flow when the data transform is table-to-table on the same server
  • No new learning curve as ETL developers will already know how to write SQL code. SSIS data-flows are a completely new technology.
  • Utilise transactions within the database rather than use MSDTC
  • Easier to generate stored procedures from known metadata than it is with a data-flow (at the time of writing it is anyway)

Advantages of using a SSIS data-flow

  • Handle data from heterogenous sources in the same place
  • Consume data from sources that can't be accessed using a SQL statement
  • Data can be transformed without the need for an intermediate staging area
  • If data does need to be persisted temporarily it can be to a raw file whereas T-SQL requires temporary tables for which there may be a management/security overhead
  • Extra transformation functionality (e.g. Fuzzy logic, data mining, text mining, insert to Analysis Services)
  • Visual representation of the "work"
  • Bad data can be captured to a different data sink for examination later
  • Exception Handling
  • Use .Net Data providers as an alternative to OLE DB Providers (e.g. mySAP Business Suite .Net Data Provider)
  • Data-flows are, to some extent, self-documenting
  • User-controlled parallel execution of data-flows is possible where it isn't really in the inherent batch operated world of stored procedures
  • "Heavy-lifting" of data can occur on a different to machine to that storing the data (thanks to John in the comments section)
  • By default, events containing very pertinent information such as "component "" (5824)" wrote 2398156 rows" are raised

No comments: