Common package development scenarios
This section covers a number of common package development scenarios. The scenarios include the following:
- How to use packages as templates
- How to reuse package objects
- How to save copies of packages to different locations
- How to configure packages
- How to implement checkpoints in packages
- How to set user properties to secure packages
- How to implement looping in packages
- How to manage control flow layouts with many items
- How to format package layout
- How to use expressions to determine whether executables run
- How to use expressions to set variable values
- How to use expressions to values of properties
- How to view data during package execution
- How to implement logging in packages
How to use packages as templates
Frequently groups of packages have a common set of features, or your organization may define how certain properties should be set in packages to impose consistency across packages. These scenarios call for package templates!
By default, the package template for new packages is an empty package, but you can use any package saved to the file system as a template. To use packages as templates, you copy them to the DataTransformationItems folder. The default location of this folder is C:Program FilesMicrosoft Visual Studio 8Common7IDEPrivateAssembliesProjectItemsDataTransformationProject.
However, you cannot replace the built-in functionality of Integration Services. If you right-click the SSIS Packages node in Solution Explorer and click New SSIS Packages, Integration Services always adds a new empty package. To use the packages that you added to the DataTransformationItems folder when you add a "new" package, you need to add an existing item to the Integration Services project.
To add a new package, follow these steps:
- Right-click the Integration Services project to which you want to add the template package, point to Add, and then click New Item. The dialog box shown in Figure 16-31 appears.
Figure 16-31The Add New Item—MyProject dialog box lists the packages, NewlyAddedPackage and TemplatePackage, to use when adding new items that are packages. Notice the built-in templates for data sources, data source views, and the empty package New SSIS Package.
- In the dialog box, select the package you want to use, and then click Add.
When you add new items to an Integration Services project, the items are placed in the project folders based on their file extensions. Integration Services packages use the file extension dtsx to identify packages, and when you add a new item with this extension, it is automatically added to the SSIS Packages folder in Solution Explorer. Likewise, if you add a file that has the ds extension, the item is added to the Data Sources folder, and a file with the dsv extension is added to the Data Source Views folder.
How to reuse package objects
You will probably find that you can reuse configured package objects such as tasks and connection managers in multiple packages. You might want to reuse only a single configured task or you might want to reuse an entire data flow. This is very easy to do!
SSIS Designer supports copying and pasting package objects within a package and between packages. If a solution has multiple Integration Services projects, you can copy between packages in different projects. If you copy multiple objects connected by precedence constraints or paths, then the precedence constraints and paths are copied as well. If you copy a Data Flow task, you also copy the data flows that are associated with the Data Flow task. The copy and paste functionality is built into SSIS Designer, and the packages must be opened in SSIS Designer to enable copying and pasting within a package or between packages.
To copy objects, follow these steps:
- Select one or more objects to copy.
- Right-click and click Copy.
- If copying to different package, activate that package.
- Click the Control Flow, Data Flow, or Event Handlers tab, depending on the type of objects to paste.
- Right-click and click Paste.
To ensure that copied objects continue to work, you must also copy pertinent dependencies. The dependencies can be variables or connection managers that the object uses. And of course, you cannot copy data flow components to a package that has no Data Flow task.
How to save copies of packages to different locations
From SSIS Designer you can save packages to different location types and locations. This capability is very similar to the import feature that is available to Integration Services in SQL Server Management Studio. To make the menu option to copy a package available, you must first open the package in SSIS Designer and click anywhere in the background of the control flow designer. The File menu includes two options that read very much alike: the Save
Figure 16-32The Save Copy of Package dialog box saves a package, renamed to NewSQLServerPackage, to SQL Server and updates the package protection lovel from Do not save sensitive data to Rely on server storage and roles for access control.
How to configure packages
The package, not to be confused with the package content, is configured in the Properties window in Business Intelligence Development Studio. If you sort the package properties by category, you will find it easier to set related properties (see Figure 16-33).
Figure 16-33 The categorized Properties window.
How to implement checkpoints in packages
Integration Services supports checkpoints. You can configure a package to use checkpoints and this way make it is possible to restart it from the point of failure, rather than rerun the whole package. If a package is configured to use checkpoints, information about package execution is written to a checkpoint file. When you rerun the failed package, the checkpoint file is used to identify where the package failed previously and restart the package from the point of failure. If the package reruns successfully, the checkpoint file is deleted.
To set the properties related to checkpoints, expand the Checkpoints node in the Properties window.
To configure checkpoints, follow these steps:
- Set the SaveCheckpoints property to True to indicate that the package saves checkpoints.
- Provide the name and location of the checkpoint file in CheckpointFileName property.
- Set the value of the CheckpointUsage property (see Table 16-2).
- To identify tasks and containers in the package as restart points, select each task or container, and in the Properties window set its FailPackageOnFailure property to True.
Table 16-2The Possible Values of the CheckpointUsage Property
|Never||Indicates that the checkpoint file is not used.|
|Always||Indicates that the checkpoint file is always used. If the checkpoint file is not found, the package fails.|
|IfExists||Indicates that the checkpoint file is used if it exists. If the checkpoint file is not found, the whole package is rerun.|
How to set user properties to secure packages
Integration Services provides a variety of package protection levels to encrypt packages or to omit saving sensitive information in the package definition. When you develop a package in Business Intelligence Development Studio, you set the package protection to a level that is suitable for a single developer of a team of developers. Some package protection levels (EncryptAllWithUserKey and EncryptSensitiveWithUserKey) encrypt packages using a key based on the user profile. This means that only the same user using the same profile can load and work with the package. Obviously, these protection levels are difficult to use in a team environment because developers cannot open packages that were created by other team members. Other protection levels (EncryptAllWithPassword and EncryptSensitiveWithPassword) require that the passwords be known to all developers. Frequently the DontSaveSensitive protection level is used during package development. You can set the protection level to ServerStorage to rely only on the built-in security features of SQL Server to protect packages.
To set the properties related to package protection level expand the Security node in the Properties window.
To configure the protection level,
- Set the ProtectionLevel property.
- If applicable, provide a password in the PackagePassword property.
If you import or export a package in SQL Server Management Studio or save a copy of a package in Business Intelligence Development Studio, you can change the protection level that the package uses.
How to implement looping in packages
In DTS, it was difficult to implement looping in packages. The solution typically included custom code. In SQL Server 2005, Integration Services introduces two new container types, the Foreach Loop and the For Loop, that make it very easy to include looping in the package control flow.
- For Loop repeats its control flow until a specified expression evaluates to False.
- The Foreach Loop repeats the control flow for each member in the collection of a specified enumerator type.
Of the two types of loops, the For Loop is probably the most straightforward to implement. To configure it, you provide the expression the evaluation results of which determine whether the loop repeats, and optionally, expressions to initialize the loop counter and increment or decrement the loop counter. The three properties to configure the For Loop are: EvalExpression (the evaluation expression), InitExpression (the initialization expression), and AssignExpression (the expression to increment or decrement the loop counter).
In the following very simple implementation, you use a variable, varCounter, to control loop execution. The variable must have a numeric data type. The loop repeats the loop repeats 5 times.
Set InitExpression to @varCounter = 1 Set EvalExpression to @varCounter < 6 Set AssignExpression to @varCounter = @varCounter +1
To use the Foreach Loop to implement looping you first choose the type of enumerator to use. You select the enumerator type on the Collection page of the Fore each Loop Editor dialog box (see Figure 16-34).
Figure 16-34The Collection page.
The built-in enumerator types that Integration Services provides (see Table 16-3) support repeating control flows to enumerate a wide variety of objects and items. For example, you can use the Foreach File enumerator to repeat a control flow for all the files in a specified folder, regardless of the number of files present in the folder. You will learn that the Foreach File enumerator supports wildcards, which makes it possible to filter the filesand choose just those that you wish to enumerate.
Table 16-3The Enumerator Types to Use with the Foreach Loop
|Foreach ADO Enumerator||Enumerate rows in an ADO recordset.|
|Foreach ADO.NET Schema Rowset Enumerator.||Enumerate the schema information of a data source.|
|Foreach ADO.NET Schema Rowset Enumerator.||Enumerate the schema information of a data source.|
|Foreach File Enumerator.||Enumerate files in a folder and subfolders.|
|Foreach From Variable Enumerator||Enumerate the enumerable object in a specified variable.|
|Foreach Item Enumerator||Enumerate items that are collections.|
|Foreach Nodelist Enumerator||Enumerate the resultset of an XML Path Language (XPath) expression.|
|Foreach SMO Enumerator||Enumerate SQL Server Management Objects (SMO) objects.|
After you select the enumerator type to use and configure it, you need to map variables to the collection value. This is how the Foreach Loop provides information to the repeating control flow. Let's walk through a couple of scenarios.
To provide the names of files to use in a data flow, follow these steps:
- Create a data flow that reads data from flat files, and drag the Data Flow task inside a Foreach Loop container.
- On the Collection page, choose the Foreach File Enumerator and specify the folder that contains the files to enumerate. Optionally, filter the file list by using wildcards. For example, to enumerate only text files, use *.txt.
- On the Variables Mapping page, select an existing variable or create a new one. The variable must have the String data type. This example uses the variable varEnumerateFiles.
- In the Connection Managers area, click the Flat File connection manager that connects to the files.
- In the Properties window, add a property expression for the ConnectionString property of the Flat File connection manager that uses the expression @varEnumerateFiles.
To provide the logins on a server
- On the Collection page, choose the Foreach SMO Enumerator and specify the server that contains the SQL Server Management Objects (SMO) objects to enumerate, and then select the SMO objects to enumerate. The following shows the value to enumerate logins: SMOEnumObj[@Name='Logins']/SMOEnumType[@Name='Objects']
- On the Variables Mapping page, select an existing variable or create a new one. The variable must have a string data type. This example uses the variable varLogins. The variable must have the Object data type.
How to manage control flow layouts with many items
If the control flow in a package contains a large number of tasks and containers, the design surface can become crowded and you may not be able to view all the control flow items at one time to understand fully the functionality of the package, or you may not be able to easily locate the task or container that you want to work with. The grouping feature, available in the control flow designer, is useful in this situation.
To group tasks and containers,
- Select the tasks and containers to group.
- Right-click and click Group
- Optionally, rename the group.
You can then expand and collapse a group to suit your needs (see Figure 16-35). Expanding a group provides access to the properties of the task and containers in the group.
Figure 16-35 Expanded and collapsed groups.
To ungroup tasks and containers,
- Select the group.
- Right-click and click Ungroup.
The navigator is another useful feature for working with packages that have large control flows or data flows. You use the navigator to access parts of the control flow or data flow in packages when the flows are too large to view in the designer at one time.
The navigation feature in SSIS Designer can also make it easier to work with packages in which the control flow and data flows have a large number of items. The navigation feature, also know as the navigator, is located in the lower-right corner of the control flow and data flow design surfaces (see Figure 16-36). The navigator is not available as long as all items are visible on the design surface at one time.
Figure 16-36The navigator pop-up and the navigator location.
To use the navigator,
- Click the crossed arrows in the lower-right corner of the designer.
- Use the crossed-arrows cursor to move to the part of the design surface that you want to be visible in the designer.
How to format package layout
As you develop a package and modify control and data flows by adding and deleting items, you might find that the package layout becomes more difficult to understand. In short, the graphical layout does not communicate package functionality in the most optimal way. You could work your way through the layout, item by item, resizing and re-placing each item on the design surface, but Business Intelligence Studio offers a much easier way to do this.
On the Format menu (see Figure 16-37), you will find useful options to size and align items and to set the horizontal and vertical spacing between items.
Figure 16-37 The Format menu
To apply an option, you must select at least two items on the Control Flow, Data Flow or Event Handlers tab. The item selected first is the item to which other items are sized or aligned. You can think of this item as the precedence item. You can identify the item selected first by its white handles. The items that are selected second or later have black handles. If you select items by dragging the cursor across items, the item the cursor touches first is the precedence item.
The Format menu provides the following options:
- From the Auto Layout option, select Selection or Diagram. The former formats the layout of selected items, the latter all items.
- From the Align option, you can left or right, center-align the vertical sides of items, or middle-align the horizontal sides of items.
- From the Make Same Size option, you can make items the same size by width, height, or both.
- From the Horizontal Spacing option, you can make equal, increase, decrease, or remove space.
- From the Vertical Spacing option, you can make equal, increase, decrease, or remove the vertical spacing between items.
How to use constraints and expressions to determine whether executables run
Precedence constraints link executables (containers and tasks) in packages into a control flow and specify conditions that determine whether executables run. A precedence constraint connects two executables: the precedence executable and the constrained executable.
You can use expressions in place of or in combination with the outcome (success, failure, or completion) of the precedence executable to determine whether the constrained executable runs. The Precedence Constraint Editor is shown in.
Figure 16-38The Precedence Constraint Editor shows the available evaluation operations.
Select either Expression and Constraint or Expression or Constraint to use an expression in the evaluation operation of the precedence constraint. The expression that you use must evaluate to a Boolean. For example, a package could use an expression that evaluates whether the constrained executable runs based on the amount of disk space available, the number of rows a data flow inserted into a table, or the day of the month.
Let's look at some sample expressions. In the following expression, the variable varRowCount, set by a Row Count transformation in a data flow, is compared to the value 1000. You could use this expression to determine whether the package should run a task that backs up a database depending on the number of rows inserted into the destination database. The prefix @ is required when variables are used in expressions to differentiate the variables from column names. In most user interface, the @ is added for you automatically.
@varRowCount > 1000
Likewise, in the following expression, the variable @varDisk, set by a Script task, is compared to the value 20. You could use this expression to determine whether to continue package execution.
@varDisk >= 20
In the following expression, the GETDATE function determines the day. If the day is the first day in the month, the constrained executable runs.
DATEPART("day",GETDATE()) == 1
How to Use Expressions to Set Variable Values
You can configure most variable properties in the Variables window, but you must use the Properties window to configure variables to evaluate as expressions.
To configure variables:
- In the Variables window, click the user-defined variable you want to configure.
- Set EvaluateAsExpression to True.
- In the Expression field, type a valid expression.
You will notice that when the Expression field loses focus, the Value property is automatically updated with the evaluation result of the expression and the ValueType is set.
Let's look at some sample expressions. In the Expression field, type GetDate(). Notice, that the Value property now contains the current date and the ValueType property is set to DateTime.
In the Expression field, type "A" + "B". Notice that the Value property now contains AB and the ValueType property is set to String.
How to use expressions to values of properties
The value of properties can be updated with the evaluation results of expressions by implementing property expressions on properties. Property expressions are expressions that you write in the Integration Services expression language. The expressions can use variables and the operators and functions that the Integration Services expression language provides. The variables can be either system- or user-defined. The custom tools for tasks and containers include the Expressions page (see Figure 16-39), but you can also add property expressions in the Properties window.
Figure 16-39The Expressions page in the Execute SQL Task Editor on which you access the expression tools for writing property expressions for properties of the Execute SQL task.
From the Expressions page or the Expressions node in the Properties window, you click (…) to open the Property Expressions Editor dialog box (see Figure 16-40). Here you select the properties for which to create property expressions in the Property list, and then either type the expression directly in the Expression box or click (…) to use the Expression Builder to write expressions.
The Expression Builder (Figure 16-41) makes it easy to write expressions. It lists the available variables and the functions, type casts, and operators that you can use in expressions. You can type the expression in the Expression box or build the expression by dragging variables, functions, type casts, or operators to the Expression box. The advantages of using the Expression Builder instead of the Property Expression Editor are that you can verify that the syntax of the expression is valid and view the evaluation result of the expression. The value of the property with which the property expression is associated is not updated when you evaluate the expression. The values of properties are updated when you save, load, or run the package.
The Expression Builder lists the variables, functions, type casts, and operators for use in property expressions. Note that the name of the property, in this case CodePage, appears in the text at the top of the dialog box.
Figure 16-40The Property Expressions Editor lists the properties of the Execute SQL task in the Property list.
Figure 16-41The Expression Builder shows the Variables folder, and the folders for functions, type casts, and operator. The expression sets the CodePage property of the task to 1045.
The Expression Builder automatically adds the @ prefix and also includes the variable namespace. If it is not necessary to include the namespace to uniquely identify a variable, you may omit the namespace.
The Expression Builder is also available for building the expressions that data flow components use. The only difference is that the Expression Builder then includes a Columns folder, which lists the columns in the data flow that are available to use in expressions.
Let's consider some sample expressions. The following expression uses the variable, varConnectionString, It could be used to set the ConnectionString property of a file connection manager. The variable would have the String data type and would contain a value like "C:MyProject TestLog.txt".
The following expression uses the system variables PackageName, MachineName, and StartTime. Because the StartTime variable has a date/time datatype it must be cast to a string before the variable value can be concatenated with the other strings in the expression. This expression could be used to set the Subject property of a Send Mail task. The evaluation results would be a value like "MyPackage is running on MYLAP1started at: 2/18/2006 3:07:20 PM."
"The package " + @[System::PackageName] + " is running on " Â+ @[System::MachineName] + "started at: " Â+ (DT_WSTR, 20) (@[System::StartTime])
Other expressions are more simple. The following expression can set the DisableEventHandlers property to True and this way prevent the execution of the event handler control flow.
How to view data during package execution
Frequently it is useful to see the data as it moves through the data flow in a package. You might want to view the data after a transformation updates values or modifies the dataset, verify that the dataset contains the columns that you need, or examine the data for unexpected values. Data viewers provide an easy way to do all these things.
Data viewers are attached to the paths that connect components in a data flow. Integration Services provides four different types of data viewers: grid, histogram, scatter plot, and column chart. The column chart and grid types are exemplified in Figure 16-42). You can only use the histogram and scatter plot types with numeric data.
Figure 16-42 The Column Chart data viewer and the Grid data viewer.
You can copy the data in data viewers to use for further analysis. If you copy data from a Grid data viewer, the column names are copied as well.
How to implement logging in packages
The earlier section about logging in this chapter describes the types of logging that you can implement in packages. To view the logged information, you need to exit Business Intelligence Development studio and open another tool like the Windows Event Viewer or SQL Profiler or a text or XML file, or run an SQL statement to query log entries in SQL Server tables. This is not very convenient during package development.
To view log entries in real time in Business Intelligence Development Studio, you use the Log Events window. There are no additional steps in the configuration of the package logging; all you do is open the Log Events window and the log entries are automatically written to the window. If the package is configured to log to multiple logs, the information appears only once in the Log Events window.
To open the Log Events window (see Figure 16-43), click Log Events on the SSIS menu.
Figure 16-43The Log Events window lists log entries for the Foreach Loop container.
From the Log Events window you can copy the log entries and also view the log entry in a separate dialog box. To copy the log entries, select the entries, right-click, and then click Copy. To view a log entry in the Log Entry dialog box, double-click the log entry in the Log Events window.
Use the following table of contents to navigate to chapter excerpts, or click here to view Inside Integration Service Tools in its entirety.
|About the book|
|Microsoft SQL Server 2005's high-powered management tools can dramatically improve DBA productivity and effectiveness. Now there's a comprehensive guide to SQL Server 2005's toolset, straight from the Microsoft team that created it. This book covers the entire toolset in unprecedented depth, guides database professionals in choosing the right tools, and shows them how to use various tools collectively to solve real-world problems. Purchase Inside SQL Server 2005 Tools from Addison-Wesley.|
|About the author|
|Lead author Michael Raheem is a senior product manager in the SQL Server Marketing team at Microsoft. Michael currently leads the SQL Server enterprise marketing efforts, including high availability, scalability, performance, and SQL Server Always On Technologies. Prior to joining the marketing team, he led the design and implementation of several SQL Server 2005 tools such as Management Studio, Upgrade Advisor, Database Mail and Surface Area Configuration. Michael has spoken at several conferences, including TechEd, TechReady, PASS and SQL Connections. Additionally, he has contributed to the Answers from Microsoft column in SQL Server Magazine and has over 13 years of experience in designing and developing solutions with Microsoft SQL Server.|
This was first published in March 2007