Designing a Data Access Layer in .NET

Summary

This article discusses some considerations when designing a Data Access Layer (DAL) using C# and Microsoft's .NET Framework.

My approach throughout has been to make use of the functionality provided by the .NET Framework classes as much as possible. This minimises the amount of code written reducing the possibility of coding errors and improves performance.

What is a Data Access Layer?

A Data Access Layer is a set of classes and functions for reading from and writing to a database or other data store. It does not contain any of the business logic for the application nor User Interface elements. It is part of a multi-layer application design that usually includes:

  • a User Interface layer (UI) which contains screens and User Interface components
  • a Business Logic layer (BLL) which contains the business rules for the application
  • a Data Access Layer Source Code The source code for this article can be downloaded from the link below.

Source Code Download

Designing a Data Access Layer in .NET Source Code (~30 KB)

The example application uses the Northwind database that comes with Microsoft SQL Server.

If you have made modifications to Northwind and the sample does not run correctly, the Northwind database can be reinstalled by running the instnwnd.sql script. This script is usually located in:

C:\Program Files\Microsoft Visual Studio .NET\FrameworkSDK\Samples\Setup

This script will re-create the Northwind database.

Features of the Data Access Layer

These are some of the features I wanted to implement and the approaches that could be considered for each feature when designing the data access layer.

Support for Create, Read, Update and Delete (CRUD) of rows in the Database

Support the basic operations for tables in a database such as

  • Select all rows
  • Select a single row by primary key
  • Add/insert a new row
  • update an existing row
  • and delete a row

The DataTable class that is a part of the .NET Framework is a useful data store for transporting data inside of the application.

Support for auto-increment (identity) columns and column defaults

Some of the updates to the database such as INSERTs cause the values in the database to be updated by the RDBMS. I wanted the data layer to be aware of changes to the database and keep the values in the DataRow current.

One of the most common situations is inserting a row into a table that has an auto-increment (identity) column. The row is inserted, and the database sets the value of the identity column. The problem is that the column values in the DataRow are not the same as the values in the database.

One solution to this problem is the UpdatedRowSource property of the Command object. The default for the property is UpdateRowSource.Both which means that the DataRow will be updated with the values from any output parameters and the first row that is returned from the execution of the command (if 1 or more rows are returned).

I have decided to use UpdatedRowSource.FirstReturnedRecord and design the INSERT or UPDATE stored procedures to return the row that was added or updated. I do not need to use any output parameters in this case.

Setting up the INSERT Command

IDbCommand InsertCommand = 
	DataObjectFactory.CreateStoredProcedureCommand(_DatabaseConnection);
InsertCommand.CommandText = "up_Orders_Insert";
InsertCommand.UpdatedRowSource 
	= UpdateRowSource.FirstReturnedRecord;

The INSERT stored procedure

INSERT INTO [Orders]
(
	[CustomerID],
	[EmployeeID],
	[OrderDate],
	[RequiredDate],
	[ShippedDate],
	[ShipVia],
	[Freight],
	[ShipName],
	[ShipAddress],
	[ShipCity],
	[ShipRegion],
	[ShipPostalCode],
	[ShipCountry]
)
VALUES
(
	@CustomerID,
	@EmployeeID,
	@OrderDate,
	@RequiredDate,
	@ShippedDate,
	@ShipVia,
	@Freight,
	@ShipName,
	@ShipAddress,
	@ShipCity,
	@ShipRegion,
	@ShipPostalCode,
	@ShipCountry
)


SET @OrderID = (SELECT SCOPE_IDENTITY())

/*
** Select the row that was just inserted
*/
SELECT
	Orders.*,
	BINARY_CHECKSUM(*) AS ROWCHECKSUM
FROM
	Orders
WHERE
	[OrderID] = @OrderID

The stored procedure above will insert the new row and then return the newly inserted row to the Data Access layer. The DataAdapter will automatically update the DataRow with the new identity value and any other values in the row that have been changed by the database (for example triggers).

Note that SCOPE_IDENTITY() is a function that SQL Server 2000 supports for returning the last identity value that was inserted in the current scope. SQL Server 7 and Access 2000 can use @@IDENTITY instead to return the most recently inserted identity/autonumber value.

Allow Primary Key columns to be updated

If the primary key column value can be changed (e.g. it is text) then allow it to be updated to a new value by the application.

This is a feature can be implemented quite easily by using the DataRowVersion property of the DataRow. When command parameters are added to the Command the DataRowVersion for the parameter can be specified. The default is to use the current value in the row (DataRowVersion.Current). Setting the parameter to use DataRowVersion.Original tells the DataAdapter to use the column value as it was when the row was first retrieved from the database.

updateCommand.Parameters.Add(
	DataObjectFactory.CreateCommandParameter(
		"@ClientIDOriginal", 
		"ClientID", 
		SqlDbType.Int,
		 DataRowVersion.Original)); 
		
updateCommand.Parameters.Add(
	DataObjectFactory.CreateCommandParameter(
		"@ClientID", 
		"ClientID", 
		SqlDbType.Int, 
		DataRowVersion.Current)); 

The stored procedure needs to be written to expect the two versions of the column and use them appropriately. For example

UPDATE 
	[Customers]
SET 
	[CustomerID] = @CustomerID,
	[CompanyName] = @CompanyName,
	[ContactName] = @ContactName,
	[ContactTitle] = @ContactTitle,
	[Address] = @Address,
	[City] = @City,
	[Region] = @Region,
	[PostalCode] = @PostalCode,
	[Country] = @Country,
	[Phone] = @Phone,
	[Fax] = @Fax
WHERE
	[CustomerID] = @CustomerIDOriginal

Optimistic Concurrency Support

Implement a simple method for determining if a row has been modified between selecting the row and updating it and raise a concurrency exception if it has.

Dealing with Concurrency is difficult and many developers simply ignore it. Most solutions will be specific to the RDBMS. There are a couple of approaches for Microsoft SQL Server that I will outline below.

Concurrency approach #1: Using BINARY_CHECKSUM

The BINARY_CHECKSUM() function can be used to determine if a column or set of columns has changed. For example this SQL command:

SELECT TOP 5
	BINARY_CHECKSUM(*) AS ROWCHECKSUM,
	Customers.*
FROM
	Customers

Returns the following results:

ROWCHECKSUM CUSTOMERID COMPANYNAME CONTACTNAME ...
-513193757 ALFKI Alfreds Futterkiste Maria Anders ...
-1384213123 ANATR Ana Trujillo Emparedados y helados Ana Trujillo ...
-302140383 ANTON Antonio Moreno Taquera Antonio Moreno ...
266281670 AROUT Around the Horn Thomas Hardy ...
694914837 BERGS Berglunds snabbkp Christina Berglund ...

When the row is updated, the value originally returned by BINARY_CHECKSUM() can be checked against the value returned by BINARY_CHECKSUM() for the row currently in the database. If the values are different then the row has been modified by another process.

Note that BINARY_CHECKSUM() will ignore any columns using the datatypes of TEXT, NTEXT, IMAGE or sql_variant with any of the above types as its base type. This means that you might need to look at another solution for databases that are using these datatypes extensively.

Concurrency approach #2: Using the TIMESTAMP data type

Timestamp is a SQL Server data type that has special behaviour. The value in the timestamp column is updated every time a row is inserted or updated.

Using timestamp columns to do concurrency checking would be similar to the use of BINARY_CHECKSUM() above. When the row is updated, the value in the timestamp column is checked against the current value in the database. If the values differ then the row has been updated by another user and the DataAdapter will raise the Concurrency exception.

Allow Data Providers to be changed easily

Design the data layer so that it can be used with different .NET Data providers without having to recode extensively. E.g. it can be used with OLEDB, MS SQL Server Native library, Oracle Native library, ODBC, etc.

Using the interfaces of the ADO.NET Data Access library instead of the classes themselves allows the data layer to change providers with little coding. The interfaces are implemented by each of the native libraries and the OleDb library. For example each Database connection class implements the IDbConnection interface and each DataAdapter implements IDbDataAdapter. So instead of:

protected SqlConnection m_Connection; 
protected SqlDataAdapter m_DataAdapter;

I can use:

protected IDbConnection m_Connection; 
protected IDbDataAdapter m_DataAdapter; 

Code generation

Be sure to have a look at my code generator which uses many of the ideas in this article.

Summary

The Author Ross Donald is an independent software developer and consultant based in Sydney, Australia. He can be contacted at
ross @ radsoftware.com.au