We are building an application with
a database that contains Jobs. These Jobs have properties like Name, Code etc.
and some custom properties, definable by the application admin. For bulk import
of Jobs, we want to allow the import of an Excel sheet with the columns Name,
Code and a variable amount of columns. If the header names of these columns in
the Excel sheet match the name of a custom property in the system we want to add
the value of that cell into the database as property
value.
In our Data Flow of our Import
Package in SSIS we added an Excel Source that points to a test excel sheet with
the Name and Code columns and – for this example - 3 custom property columns
(Area, Department, Job Family). When we configure the Excel Source in the Excel
Source Editor, we have the option to select the Columns from the Available
External Columns table. But here lays the problem, we do not know at design
time, what custom property columns to expect. We DO expect the Name and Code
columns, but the rest is uncertain at design-time.
That raises the question: Is there
some way to select all of any incoming columns (something like a SELECT * in
T-SQL)? This looks like a big problem since it would mean that the .DTSX XML that is
being generated at design-time would need to be updated at run-time to reflect
the variability of the columns that might be encountered while reading the excel
sheet.
Then, we thought, we could add a Script
Component to our data flow that passes some kind of DataSet (or DataReader) in
which we can walk through the columns ourselves? But then still, we miss the
option to include ANY of the columns while reading an Excel sheet (or any other
datasource by the looks of it)
We are aware of the option of
optional columns in combination with the RaggedRight option, but it seems that
we would have to put all of the columns of a row in just one column and then
extract all the columns later with Derived Columns. But then, since the source
import file is being prepared by an application admin, we want don’t want to
burden him with this horrendous task of putting everything in one
column.
We would like to have some way of
iterating through all the columns, either in a Script Component or maybe with a
Pivot/Unpivot mechanism.
Does anyone have any suggestions? Are there other options we should have considered?
The metadata of the pipeline is fixed at design-time. You cannot change the columns at runtime.
-Jamie
|||
Since you're importing from Excel, you may be able to define a dataflow that reads the maximum amount of columns you anticipate ever having in one Excel file. The Excel files with less columns would return empty strings for the non-existent columns.
Haven't tried this, but it might work.
K
No comments:
Post a Comment