Seems very strange but it is possible to use "recursion" in SQL statements. By recursion I mean the method of defining function (or object) which inside its definition refers to itself. To write recursive SQL statement it is necessary to follow strict rules : http://msdn.microsoft.com/en-us/library/ms186243.aspx
Example
Consider simple table containing distances between cities.
CREATE TABLE CONNECTION (CITY_FROM VARCHAR(100), CITY_TO VARCHAR(100), DISTANCE DECIMAL);Now "recursive" statement:
delete from connection;
insert into connection values ('Warszawa', 'Augustów', 242);
insert into connection values ('Warszawa', 'Suwałki', 267);
insert into connection values ('Warszawa', 'Olsztyn', 192);
insert into connection values ('Augustów', 'Suwałki', 32);
insert into connection values ('Warszawa', 'Płock', 103);
insert into connection values ('Płock', 'Grudziądz', 133);
insert into connection values ('Grudziądz', 'Gdańsk',106);
insert into connection values ('Warszawa', 'Mława', 116);
insert into connection values ('Mława','Elbląg', 144);
insert into connection values ('Elbląg', 'Gdańsk', 58);
WITH path (city_from, city_to, distance, stops)How it runs ?
AS ( SELECT f.city_from, f.city_to, f.distance, 0
FROM connection f
WHERE city_from='Warszawa'
UNION ALL
SELECT p.city_from, f.city_to,
p.distance+f.distance, p.stops+1
FROM connection f, path p
WHERE p.city_to=f.city_from )
SELECT substr(pa.city_from,1,20), substr(pa.city_to,1,20), min(distance) FROM path pa GROUP BY (pa.city_from, pa.city_to);
The beginning is the statement before UNION ALL. It selects all direct connection from Warszawa.
- 'Warszawa', 'Augustów', 242;
- 'Warszawa', 'Suwałki', 267;
- 'Warszawa', 'Olsztyn', 192
- 'Warszawa', 'Płock', 103
- 'Warszawa', 'Mława', 116
- 'Augustów', 'Suwałki', 32
- 'Płock', 'Grudziądz', 133
- 'Mława','Elbląg', 144
The third and next steps are repetition of the second step. The only difference is that rows taken are rows added in the previous step (not the whole path table). This time following destinations will be added.
- 'Elbląg', 'Gdańsk', 58
- 'Grudziądz', 'Gdańsk',106
The final statement selects minimal distance between Warszawa and all cities found.
Warszawa Augustów 242,During evaluation also 'stops' column is created, although not used. It contains number of cities visited during journey from 'Warszawa' to destination. In the final statement, by replacing minimal(distance) with minimal(stops) we can choose paths with having minimal number of nodes.
Warszawa Elbląg 260,
Warszawa Gdańsk 318,
Warszawa Grudziądz 236,
Warszawa Mława 116,
Warszawa Olsztyn 192,
Warszawa Płock 103,
Warszawa Suwałki 267,
What this 'recursion SQL' has in common with graph theory ?
Answer is simple. Connection table describes nodes of directed graph. 'Recursive' statement is an implementation of Breadth-first_search. Distance is a weigth assigned to a node and our 'recursive' statement implements the algorithm of finding the Shortest path problem Shortest_path_problem
Danger.
Unfortunately, there is also a dark side oh this moon.
Consider connection table:
insert into connection values ('Warszawa', 'Augustów', 242);Our 'recursive' statement will run infinitely on that table. Using terminology from graph teory our statement is invalid if there is a 'cycle' in the graph, it is valid only for graphs without cycles. What's more important - there is no simply workaround to this problem.
insert into connection values ('Augustów', 'Warszawa', 242);