Hi Blog, long time no see! How’s it going?! I know, I really should up my blog game….
I was recently approached by my firms Marketing Manager with a request for some information. She wanted to know “Which departments have our top clients never done any work for?”. For some clarity, I work in a law firm with 11 departments. The request seems pretty straightforward at first. Then once I got thinking about how the output of this report would be presented it made me reconsider quite how simple a request this was. She handed me a drawing with her vision for the output. What she wanted was :
- Client’s names down the left,
- List of Departments across the top,
- Ticks and crosses at the intersection to show whether they had or had not done work for them.
Finding out the basic information she was asking for was not going to be tough, but I had concerns about the presentation. Sure, I could output it in any old way, import into Excel, do some Pivot Table jiggery-pokery and get it to look how she wanted. But where is the opportunity for learning in that approach?! No, I decided that I was going to have the query itself output in the way she wanted somehow….Pivot Table…… Pivot……. Table….That’s when 2 things hit me:
- I had recently been reading a chapter in my SQL MSCA book about pivoting and unpivoting data that I had been struggling with. I had been waiting for a real world example to sink my teeth into, and this seemed like just the thing.
- Pivot Tables in Excel and PIVOT in SQL are not called the same thing by accident!
before we can concern ourselves with PIVOTING any data, the first thing you need to do is get your data t be be pivoted. This requires 3 elements :
- Grouping Column – this is what will be your rows,
- Spreading Columns – this will be the columns,
- Aggregation Column – this will be what is displayed at the intersection of your rows and columns.
For me, my Grouping Column was the Client’s name, the Spreading Columns were the departments, and the Aggregation Column was whether the client had done work for that department ( I was going to get this data by doing a count of the number of legal matters that had been opened).
Once you have your data, then the next thing to do is to PIVOT it. Below you can see the syntax for setting out your 3 data columns as a CTE to then be pivoted.
WITH PivotData AS ( SELECT [grouping column] [spreading columns] [aggregation column] FROM ) SELECT [select list] FROM PivotData PIVOT ([aggregate function]([aggregation column]) FOR [spreading column] IN ([distinct spreading values]) ) AS P;
When I first saw this, the actual pivoting part looked pretty daunting. Turns out, its really not too bad. It just looks daunting when its laid out in front of you. Once you understand what’s going on, it make a lot of sense.
- [SELECT LIST] – This is where you say what you want to take from your original query and lay out as columns. So for my example, my first column is the clients, and the subsequent columns are each of my departments individually.
- [AGGREGATE FUNCTION]([AGGREGATION COLUMN]) – You need to tell SQL what to do with the data that you want at the intersection of your columns. For me, this was a count of the number of cases that each client had worked on for each department.
- [SPREADING COLUMN] IN ([DISTINCT SPREADING VALUES] – What values for your initalls specified spreading column (the data that will make up your columns headings) you want. My spreading column was our Departments, and my distinct spreading values was each of the names of the Departments I wanted to check against.
I appreciate that my explanation may still sound somewhat complicated, but hang in there. When you see my final working code and it’s output, hopefully it’ll all fall into place.
WITH PivotData AS ( SELECT E.Name AS 'Client' , D.Description AS 'Dept' , COUNT(D.Description) AS 'DCount' FROM DAB_MattersALL AS M JOIN Users AS U ON M.FeeEarnerRef = U.Code JOIN Departments AS D ON U.Department = D.Code JOIN Entities AS E ON M.Entityref = E.Code WHERE M.EntityRef IN ( SELECT TOP (20) E.Code AS 'Client' FROM Departments AS D INNER JOIN Users AS U ON D.Code = U.Department INNER JOIN Ac_Billbook AS A ON U.Code = A.SubmittingFeeEarner INNER JOIN Entities AS E ON A.EntityRef = E.Code WHERE A.BillDate >= '2008-01-01 00:00:00' AND A.BillDate < GETDATE() GROUP BY E.Code , E.Name ORDER BY CAST(SUM(A.CostsNet) AS MONEY) DESC ) GROUP BY E.Name , D.Description ) SELECT Client , [Banking ] , [Commercial and IP] , [Commercial Property] , [Construction] , [Corporate] , [Employment] , [Family] , [Intellectual Property] , [Licensing] , [Litigation] , [Management] , [Real Estate] , [Residential Property] , [Tax and Probate] FROM PivotData PIVOT(SUM(DCOUNT) FOR Dept IN ( [Banking] , [Commercial and IP] , [Commercial Property] , [Construction] , [Corporate] , [Employment] , [Family] , [Intellectual Property] , [Licensing] , [Litigation] , [Management] , [Real Estate] , [Residential Property] , [Tax and Probate] )) AS P ORDER BY Client
That looks like a giant wall of text, right?! It’s not, don’t worry. When you break it down it’s simple! The first part is just setting out what I want (Clients, Departments, and the count of the Departments).
The PIVOT part is saying – Give me some columns named “Clients”, and then subsequent columns named after each possible department they may have done any work for. At the intersection between each client and each department give me the number of cases each client has had with the departments in the following list.
Finally, this is the output (with client names poorly redacted) that you get from all this :
So, I hope that made sense. Writing this out really helped cement it in my brain as something that I now understand. I find reading about new techniques in SQL (or anything else really) can be really daunting and demoralising until I find a practical application. That’s when it all seems to make sense.
Pivoting is now something I feel pretty confident with and is another arrow in my quiver of SQL functions I can call upon when necessary.