Designing a Data Access Layer in .NET
Summary
This article discusses some considerations when designing a Data
Access Layer (DAL) using C# and Microsoft's .NET Framework.
My approach throughout has been to make use of the functionality
provided by the .NET Framework classes as much as possible. This
minimises the amount of code written reducing the possibility of
coding errors and improves performance.
What is a Data Access Layer?
A Data Access Layer is a set of classes and functions for reading
from and writing to a database or other data store. It does not contain
any of the business logic for the application nor User Interface
elements. It is part of a multi-layer application design that usually
includes:
-
a User Interface layer (UI) which contains screens and User
Interface components
-
a Business Logic layer (BLL) which contains the business rules
for the application
-
a Data Access Layer Source Code The source code for this
article can be downloaded from the link below.
Source Code Download
Designing a Data Access Layer in .NET Source Code (~30 KB)
The example application uses the Northwind database that comes with
Microsoft SQL Server.
If you have made modifications to Northwind and the sample does not
run correctly, the Northwind database can be reinstalled by running the
instnwnd.sql
script. This script is usually located in:
C:\Program Files\Microsoft Visual Studio .NET\FrameworkSDK\Samples\Setup
This script will re-create the Northwind database.
Features of the Data Access Layer
These are some of the features I wanted to implement and the
approaches that could be considered for each feature when designing
the data access layer.
Support for Create, Read, Update and Delete (CRUD) of rows in the
Database
Support the basic operations for tables in a database such as
-
Select all rows
-
Select a single row by primary key
-
Add/insert a new row
-
update an existing row
-
and delete a row
The DataTable class that is a part of the .NET Framework is a useful
data store for transporting data inside of the application.
Support for auto-increment (identity) columns and column defaults
Some of the updates to the database such as INSERTs cause the values
in the database to be updated by the RDBMS. I wanted the data layer
to be aware of changes to the database and keep the values in the
DataRow current.
One of the most common situations is inserting a row into a table
that has an auto-increment (identity) column. The row is inserted, and
the database sets the value of the identity column. The problem is that
the column values in the DataRow are not the same as the values in the
database.
One solution to this problem is the UpdatedRowSource
property of the Command object. The
default for the property is UpdateRowSource.Both
which means that the DataRow will be updated with the values from
any output parameters and the first row that is returned from the
execution of the command (if 1 or more rows are returned).
I have decided to use
UpdatedRowSource.FirstReturnedRecord and design the
INSERT or UPDATE stored procedures to return the row that was added or
updated. I do not need to use any output parameters in this case.
Setting up the INSERT Command
IDbCommand InsertCommand =
DataObjectFactory.CreateStoredProcedureCommand(_DatabaseConnection);
InsertCommand.CommandText = "up_Orders_Insert";
InsertCommand.UpdatedRowSource
= UpdateRowSource.FirstReturnedRecord;
The INSERT stored procedure
INSERT INTO [Orders]
(
[CustomerID],
[EmployeeID],
[OrderDate],
[RequiredDate],
[ShippedDate],
[ShipVia],
[Freight],
[ShipName],
[ShipAddress],
[ShipCity],
[ShipRegion],
[ShipPostalCode],
[ShipCountry]
)
VALUES
(
@CustomerID,
@EmployeeID,
@OrderDate,
@RequiredDate,
@ShippedDate,
@ShipVia,
@Freight,
@ShipName,
@ShipAddress,
@ShipCity,
@ShipRegion,
@ShipPostalCode,
@ShipCountry
)
SET @OrderID = (SELECT SCOPE_IDENTITY())
/*
** Select the row that was just inserted
*/
SELECT
Orders.*,
BINARY_CHECKSUM(*) AS ROWCHECKSUM
FROM
Orders
WHERE
[OrderID] = @OrderID
The stored procedure above will insert the new row and then return
the newly inserted row to the Data Access layer. The DataAdapter will
automatically update the DataRow with the new identity value and any
other values in the row that have been changed by the database (for
example triggers).
Note that SCOPE_IDENTITY() is a
function that SQL Server 2000 supports for returning the last identity
value that was inserted in the current scope. SQL Server 7 and Access
2000 can use @@IDENTITY instead to return
the most recently inserted identity/autonumber value.
Allow Primary Key columns to be updated
If the primary key column value can be changed (e.g. it is text)
then allow it to be updated to a new value by the application.
This is a feature can be implemented quite easily by using the
DataRowVersion
property of the DataRow. When command parameters are
added to the Command the DataRowVersion for the
parameter can be specified. The default is to use the current value
in the row (DataRowVersion.Current). Setting the
parameter to use DataRowVersion.Original tells the
DataAdapter to use the column value as it was when the row was first
retrieved from the database.
updateCommand.Parameters.Add(
DataObjectFactory.CreateCommandParameter(
"@ClientIDOriginal",
"ClientID",
SqlDbType.Int,
DataRowVersion.Original));
updateCommand.Parameters.Add(
DataObjectFactory.CreateCommandParameter(
"@ClientID",
"ClientID",
SqlDbType.Int,
DataRowVersion.Current));
The stored procedure needs to be written to expect the two versions
of the column and use them appropriately. For example
UPDATE
[Customers]
SET
[CustomerID] = @CustomerID,
[CompanyName] = @CompanyName,
[ContactName] = @ContactName,
[ContactTitle] = @ContactTitle,
[Address] = @Address,
[City] = @City,
[Region] = @Region,
[PostalCode] = @PostalCode,
[Country] = @Country,
[Phone] = @Phone,
[Fax] = @Fax
WHERE
[CustomerID] = @CustomerIDOriginal
Optimistic Concurrency Support
Implement a simple method for determining if a row has been modified
between selecting the row and updating it and raise a concurrency
exception if it has.
Dealing with Concurrency is difficult and many developers simply
ignore it. Most solutions will be specific to the RDBMS. There are a
couple of approaches for Microsoft SQL Server that I will outline below.
Concurrency approach #1: Using BINARY_CHECKSUM
The BINARY_CHECKSUM() function can be
used to determine if a column or set of columns has changed. For
example this SQL command:
SELECT TOP 5
BINARY_CHECKSUM(*) AS ROWCHECKSUM,
Customers.*
FROM
Customers
Returns the following results:
|
ROWCHECKSUM |
CUSTOMERID |
COMPANYNAME |
CONTACTNAME |
... |
| -513193757 |
ALFKI |
Alfreds Futterkiste |
Maria Anders |
... |
| -1384213123 |
ANATR |
Ana Trujillo Emparedados y helados
|
Ana Trujillo |
... |
| -302140383 |
ANTON |
Antonio Moreno Taquera |
Antonio Moreno |
... |
| 266281670 |
AROUT |
Around the Horn |
Thomas Hardy |
... |
| 694914837 |
BERGS |
Berglunds snabbkp
|
Christina Berglund |
... |
When the row is updated, the value originally returned by
BINARY_CHECKSUM() can be checked against the value returned by
BINARY_CHECKSUM() for the row currently in the database. If the
values are different then the row has been modified by another process.
Note that BINARY_CHECKSUM() will ignore any columns
using the datatypes of TEXT, NTEXT, IMAGE or sql_variant with any of the
above types as its base type. This means that you might need to look at
another solution for databases that are using these datatypes
extensively.
Concurrency approach #2: Using the TIMESTAMP data type
Timestamp is a SQL Server data type that has special behaviour. The
value in the timestamp column is updated every time a row is
inserted or updated.
Using timestamp columns to do concurrency checking would be similar
to the use of BINARY_CHECKSUM() above. When the row is
updated, the value in the timestamp column is checked against the
current value in the database. If the values differ then the row has
been updated by another user and the DataAdapter will raise
the Concurrency exception.
Allow Data Providers to be changed easily
Design the data layer so that it can be used with different .NET
Data providers without having to recode extensively. E.g. it can be
used with OLEDB, MS SQL Server Native library, Oracle Native
library, ODBC, etc.
Using the interfaces of the ADO.NET Data Access library instead of
the classes themselves allows the data layer to change providers with
little coding. The interfaces are implemented by each of the native
libraries and the OleDb library. For example each Database connection
class implements the
IDbConnection interface and each DataAdapter
implements IDbDataAdapter. So instead of:
protected SqlConnection m_Connection;
protected SqlDataAdapter m_DataAdapter;
I can use:
protected IDbConnection m_Connection;
protected IDbDataAdapter m_DataAdapter;
Code generation
Be sure to have a look at my code generator
which uses many of the ideas in this article.
Summary
The Author Ross Donald is an independent software developer and
consultant based in Sydney, Australia. He can be contacted at
ross @ radsoftware.com.au