Bob Blackburn
2015-02-12
There are times during development, ETL, or data conversion
where you run into self referencing and Circular Referencing tables. A common
example of a self referencing table is the Employee table who has a manager
that points back to the Employee table. Sometimes a circular reference is
created intentionally or unintentionally. We will not discuss architecture best
practices here. We will document the references so you can continue with your
work.
Create a test database with a few tables
Create Database
DemoReference
go
Use DemoReference
go
CREATE TABLE
Island (
IslandID INT NOT NULL,
IslandName NVARCHAR (60) NOT NULL,
CONSTRAINT [PK_Island] PRIMARY KEY CLUSTERED (IslandID ASC)
);
CREATE TABLE
Employee (
EmployeeID INT NOT NULL,
ManagerID INT NULL,
EmployeeName NVARCHAR (60) NOT NULL,
CONSTRAINT [PK_Employee] PRIMARY KEY CLUSTERED (EmployeeID
ASC),
CONSTRAINT [FK_Employee_Manager] FOREIGN KEY (ManagerID) REFERENCES Employee (EmployeeID)
);
CREATE TABLE
Country (
CountryID INT NOT NULL,
CountryName NVARCHAR (60) NOT NULL,
CONSTRAINT [PK_Country] PRIMARY KEY CLUSTERED (CountryID ASC)
);
CREATE TABLE
USState (
USStateID INT NOT NULL,
CountryID INT NOT NULL,
JurisdictionID INT NOT NULL,
StateName NVARCHAR (60) NOT NULL,
CONSTRAINT [PK_USState] PRIMARY KEY CLUSTERED (USStateID ASC),
CONSTRAINT [FK_USState_Country] FOREIGN KEY (CountryID) REFERENCES Country (CountryID)
);
CREATE TABLE
[Jurisdiction] (
[JurisdictionID] INT NOT NULL,
[ParentID] INT NULL,
[Name] NVARCHAR (60) NOT NULL,
[USStateID] INT NULL,
CONSTRAINT [PK_Jurisdiction] PRIMARY KEY CLUSTERED ([JurisdictionID]
ASC),
CONSTRAINT [FK_Jurisdiction_Parent] FOREIGN KEY ([ParentID]) REFERENCES [Jurisdiction] ([JurisdictionID]),
CONSTRAINT [FK_Jurisdiction_USState] FOREIGN KEY ([USStateID]) REFERENCES [USState] ([USStateID])
);
CREATE TABLE
USCity (
USCityID INT NOT NULL,
USStateID INT NOT NULL,
CountryID INT NOT NULL,
StateName NVARCHAR (60) NOT NULL,
CONSTRAINT [PK_USCity] PRIMARY KEY CLUSTERED (USCityID ASC),
CONSTRAINT [FK_USCity_USState] FOREIGN KEY (USStateID) REFERENCES USState (USStateID),
CONSTRAINT [FK_USCity_Country] FOREIGN KEY (CountryID) REFERENCES Country (CountryID)
);
You get the following Diagram.
Now we have some code to find our self and circular
referencing tables.
The Circular Reference query can take a long time to run.
This is because of recursion and the use of functions in a where clause. On a small database it took about 1 minute.
Sometimes on a larger database it will run faster when they do not contain
circular references. Since this is used for analysis and not as a production
query, we are not as concerned with performance.
-- Self Referencing
SELECT KCU.constraint_name,
kcu.table_schema
+ '.' + KCU.table_name AS TABLE_NAME,
kcu.column_name,
TC.table_schema
+ '.' + tc.table_name AS
REFERENCED_TABLE,
kcu2.column_name AS REFERENCED_COLUMN
FROM
information_schema.key_column_usage
kcu
JOIN information_schema.referential_constraints rc
ON kcu.constraint_name = rc.constraint_name
AND
KCU.constraint_schema =
RC.constraint_schema
JOIN information_schema.table_constraints tc
ON rc.constraint_schema =
tc.constraint_schema
AND
rc.unique_constraint_name = tc.constraint_name
JOIN information_schema.key_column_usage kcu2
ON
kcu2.constraint_name =
tc.constraint_name
AND
KCU2.constraint_schema =
tC.constraint_schema
WHERE
kcu.table_name =
tc.table_name;
--- Circular Reference
WITH circularrefcte (tableschema, tablename, refpath, reflevel)
AS (SELECT t.table_schema,
t.table_name,
Cast(t.table_schema + '.' + t.table_name AS VARCHAR(max)) AS
RefPath,
1
FROM information_schema.tables t
WHERE t.table_type = 'BASE TABLE'
UNION ALL
SELECT
kcu.table_schema,
kcu.table_name,
Cast(cr.refpath + ' ' +
kcu.table_schema +
'.'
+
kcu.table_name +
' ' AS VARCHAR(max)),
reflevel + 1
FROM information_schema.key_column_usage
kcu
JOIN
information_schema.referential_constraints rc
ON
kcu.constraint_name =
rc.constraint_name
AND
KCU.constraint_schema =
RC.constraint_schema
JOIN
information_schema.table_constraints tc
ON
rc.constraint_schema =
tc.constraint_schema
AND
rc.unique_constraint_name = tc.constraint_name
JOIN
circularrefcte cr
ON
tc.table_schema =
cr.tableschema
AND
tc.table_name =
cr.tablename
WHERE reflevel <
100 -- Set recursive level limit
AND
kcu.table_name <>
cr.tablename
AND
Charindex(kcu.table_schema + '.' + kcu.table_name,
cr.refpath, 2)
=
0)
-- WITH CircularRefCTE
SELECT *
FROM
circularrefcte
WHERE
Charindex(tableschema
+ '.' + tablename + ' ', refpath, 1) > 0
-- First
Circular Reference
AND Charindex(tableschema
+ '.' + tablename + ' ', refpath,
Charindex(tableschema + '.' + tablename + ' ', refpath, Charindex
(
tableschema +
'.', refpath) + 1 )
+ 1) = 0
-- only first
reference found
AND Charindex(tableschema
+ '.' + tablename + ' ', refpath) =
Charindex(tableschema
+
'.', refpath) -- Start position
To find the Circular reference, we created a recursive CTE.
You can set the recursion level in the WHERE clause. The CHARINDEX in the WHERE
clause is to eliminate tables that join to the circle. The WHERE clause on the
SELECT is also to eliminate duplicates and supersets of the circular reference.
Currently we only get self referencing results.
constraint_name
|
TABLE_NAME
|
column_name
|
REFERENCED_TABLE
|
REFERENCED_COLUMN
|
FK_Employee_Manager
|
dbo.Employee
|
ManagerID
|
dbo.Employee
|
EmployeeID
|
FK_Jurisdiction_Parent
|
dbo.Jurisdiction
|
ParentID
|
dbo.Jurisdiction
|
JurisdictionID
|
Over time, the following changes were introduced:
alter table
USState
add foreign
key (JurisdictionID)
References Jurisdiction(JurisdictionID);
Alter table
Country add JurisdictionID INT not null;
alter table
Country
add foreign
key (JurisdictionID)
References Jurisdiction(JurisdictionID);
Now the Diagram looks like this:
I hope this would never be designed this way; but, it is
loosely based on recent events for an example.
Rerun the Circular reference query and the results are:
tableschema
|
tablename
|
refpath
|
reflevel
|
dbo
|
Jurisdiction
|
dbo.Jurisdiction dbo.USState dbo.Jurisdiction
|
3
|
dbo
|
Jurisdiction
|
dbo.Jurisdiction dbo.Country dbo.USState dbo.Jurisdiction
|
4
|
dbo
|
USState
|
dbo.USState dbo.Jurisdiction dbo.USState
|
3
|
dbo
|
USState
|
dbo.USState dbo.Jurisdiction dbo.Country dbo.USState
|
4
|
dbo
|
Country
|
dbo.Country dbo.USState dbo.Jurisdiction dbo.Country
|
4
|
We can now see the table and the path with one or two hops that
create the circular reference. The bigger the circle the harder it would be to
determine visually.
I hope you don’t need this often; but, it will save you some
analysis time if you do.