So when I was asked to figure out a way to add rows to a table based on the data from another table, I had no clue on how to get it done with pure SQL... Turns out: it's easy!
I'll sketch my situation and show you the magic query, you might be able to adapt it for your own use. Table1 contains a number of years, the goal is to add five extra rows with years to table2 starting from the highest year in table1. So: if the highest year for a specific identifier in table1 is 2009, I will need to add the years 2010, 2011, 2012, 2013 and 2014 to table2 for that specific identifier. On top of that we don't want to make double row entries in table2, so we'll have to keep that in mind as well. Still with me? Below you can find examples of the tables.
Table 1
table1_id | identifier | year |
1 | 1 | 2009 |
2 | 2 | 2008 |
3 | 2 | 2007 |
4 | 3 | 2010 |
Table 2
table2_id | identifier | year |
1 | 1 | 2011 |
2 | 1 | 2012 |
3 | 2 | 2009 |
So, for identifier 1 the highest year in table1 is 2009, for identifier 2 that is 2008 and for identifier 3 that is 2010. This means that for identifier 1, I would need to insert 2010, 2013 and 2014 (since table2 already contains 2011 and 2012). For identifier 2, I would need to insert 2010, 2011, 2012 and 2012 (since table2 already contains 2009). For identifier 4, I would need to insert all five years: 2011, 2012, 2013, 2014 and 2015.
And now, the magic query!
-- declare variables
Declare @identifier int, @year int, @i int, @numrows int;
-- loop all max years from Table_1
Declare c Cursor For (select identifier, max(year) from dbo.Table_1 group by identifier)
Open c
-- fetch the identifier and highest year
Fetch next From c into @identifier, @year
While @@Fetch_Status=0 Begin
-- print a line per identifier
PRINT N'--Identifier: ' + cast(@identifier as nvarchar) + ' highest year ' + cast(@year as nvarchar)
-- print per identifier 5 insert statements with the next five years after the current, highest year
set @i = 1;
set @numrows = 5;
while (@i <= @numrows)
begin
-- check if the row already exists
if exists(select * from dbo.Table_2 where identifier = @identifier and year = @year+@i)
print N'--row with identifier ' + cast(@identifier as nvarchar) + ' and year ' + cast(@year+@i as nvarchar) + ' already exists'
else
print 'insert into dbo.Table_2(identifier, year) values (' + cast(@identifier as nvarchar) + ', ' + cast(@year+@i as nvarchar) + ');'
-- increment the counter
set @i = @i + 1;
end
Fetch next From c into @identifier, @year
End
Close c
Deallocate c
--Identifier: 1 highest year 2009
insert into dbo.Table_2(identifier, year) values (1, 2010);
--row with identifier 1 and year 2011 already exists
--row with identifier 1 and year 2012 already exists
insert into dbo.Table_2(identifier, year) values (1, 2013);
insert into dbo.Table_2(identifier, year) values (1, 2014);
--Identifier: 2 highest year 2008
--row with identifier 2 and year 2009 already exists
insert into dbo.Table_2(identifier, year) values (2, 2010);
insert into dbo.Table_2(identifier, year) values (2, 2011);
insert into dbo.Table_2(identifier, year) values (2, 2012);
insert into dbo.Table_2(identifier, year) values (2, 2013);
--Identifier: 3 highest year 2010
insert into dbo.Table_2(identifier, year) values (3, 2011);
insert into dbo.Table_2(identifier, year) values (3, 2012);
insert into dbo.Table_2(identifier, year) values (3, 2013);
insert into dbo.Table_2(identifier, year) values (3, 2014);
insert into dbo.Table_2(identifier, year) values (3, 2015);
As you can see this is a pretty short and simple query and it does exactly what it should do! It contains a cursor that loops all rows of table1, grouped by identifier, fetching the highest year. A loop runs five times to create the "insert"-statements and within that loop a "if exists"-statement checks if the rows you're about to insert don't already exist. The result of that statement and the generated "insert"-statements are printed on the screen.
That's all, as you can see, very straight forward!
Could be useful, right?
No comments:
Post a Comment