2013-03-31

Assembly to PDB to Source Files

When you build a .NET project, the compiler creates an assembly in a .dll or .exe file and a "program database" in a .pdb file that contains symbols for debugging. John Robbins claims that "PDB files are as important as source code!" They can even help you retrieve the source code automatically when developing and debugging.

Assembly to PDB

The assembly (.exe or .dll) will contain the name of the .pdb file that was created and a GUID for it unless building a .pdb was disabled. You can view the GUID using dumpbin.

C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\dumpbin.exe
dumpbin /headers Mono.HttpUtility.dll
  Debug Directories
        Time Type       Size      RVA  Pointer
    -------- ------ -------- -------- --------
    5133AAD4 cv          11C 00006F80     5180    Format: RSDS, {8791519F-DC5A-4CF9-94F0-09F69BC7C4B6}, 2, c:\Users\hpierson\Projects\HttpUtility\HttpUtility\obj\Release\Mono.HttpUtility.pdb

The same information can be retrieved using Mono.Cecil:
Running this program outputs:
Mono.HttpUtility.pdb 8791519f-dc5a-4cf9-94f0-09f69bc7c4b6
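
For readers without Mono.Cecil at hand, the debug record that dumpbin prints can also be read directly from the file bytes. The following Python sketch is not the program referred to above. It assumes the CodeView record layout of a 4-byte "RSDS" signature, a 16-byte GUID, a 4-byte age, and a null-terminated .pdb path, and it takes the shortcut of scanning for the signature instead of walking the PE debug directory, so it could in principle match stray bytes. The names `CodeViewInfo` and `find_rsds` are made up for this illustration.

```python
import struct
import uuid
from typing import NamedTuple, Optional


class CodeViewInfo(NamedTuple):
    guid: uuid.UUID
    age: int
    pdb_path: str


def find_rsds(data: bytes) -> Optional[CodeViewInfo]:
    """Return the first plausible RSDS CodeView record in data, or None.

    Heuristic: scans for the signature rather than parsing PE headers.
    """
    start = data.find(b"RSDS")
    while start != -1:
        header_end = start + 24  # 4 (signature) + 16 (GUID) + 4 (age)
        path_end = data.find(b"\x00", header_end)
        if header_end <= len(data) and path_end > header_end:
            # The GUID is stored as a Windows GUID struct (mixed endianness).
            guid = uuid.UUID(bytes_le=data[start + 4:start + 20])
            (age,) = struct.unpack_from("<I", data, start + 20)
            path = data[header_end:path_end].decode("utf-8", errors="replace")
            return CodeViewInfo(guid, age, path)
        start = data.find(b"RSDS", start + 1)
    return None
```

Calling `find_rsds(open("Mono.HttpUtility.dll", "rb").read())` should yield the same GUID, age, and path that dumpbin shows, assuming the file contains a single RSDS record.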

Symbol Servers

Symbols (.pdb files) can be hosted on the filesystem or on web servers. In Visual Studio, you can configure an ordered list of symbol servers under Tools > Options > Debugging > Symbols > Symbol file (.pdb) locations. The entry listed as "Microsoft Symbol Servers" refers to http://msdl.microsoft.com/download/symbols. Microsoft also publishes better .pdb files, which link to source code where available, at http://referencesource.microsoft.com/symbols. Many NuGet packages host their .pdb files with source code at SymbolSource.org, which has more instructions for configuring Visual Studio. Its public endpoints are http://srv.symbolsource.org/pdb/Public and http://srv.symbolsource.org/pdb/MyGet.
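
To make the lookup concrete: symbol servers conventionally store each file under `<pdb name>/<GUID><age>/<pdb name>`, where the GUID is written as 32 uppercase hex digits without dashes and the age is appended in hex. The Python sketch below builds such a URL under that assumption. `symbol_url` is a hypothetical helper, not part of Visual Studio or SymbolSource, and whether a given server actually holds the file is a separate question.

```python
import uuid


def symbol_url(server: str, pdb_name: str, guid: uuid.UUID, age: int) -> str:
    """Build the conventional symbol-server URL for a .pdb file."""
    # Key = GUID as 32 uppercase hex digits, followed by the age in hex.
    key = f"{guid.hex.upper()}{age:X}"
    return f"{server.rstrip('/')}/{pdb_name}/{key}/{pdb_name}"


url = symbol_url(
    "http://srv.symbolsource.org/pdb/Public",
    "Mono.HttpUtility.pdb",
    uuid.UUID("8791519f-dc5a-4cf9-94f0-09f69bc7c4b6"),
    2,
)
print(url)
```

The GUID and age used here are the ones from the dumpbin output for Mono.HttpUtility.dll.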

PDB to Source Files

A .pdb file can contain a "srcsrv" stream, essentially a text file that describes how to retrieve the source code for each file used in the build. Adding this stream after the .pdb is created is called source indexing. Not all .pdb files are source indexed, and the index can be modified later. The index can be viewed using the Windows SDK tool pdbstr:

C:\Program Files (x86)\Windows Kits\8.0\Debuggers\x64\srcsrv\pdbstr.exe
pdbstr -r -s:srcsrv -p:Mono.HttpUtility.pdb
SRCSRV: ini ------------------------------------------------
VERSION=2
INDEXVERSION=2
VERCTRL=http
SRCSRV: variables ------------------------------------------
SRCSRVTRG=http://srv.symbolsource.org/pdbsrc/Public/public/42b47598-1b3c-4204-a801-5d4e7621ffd2/%CN%/%UN%/Mono.HttpUtility/8791519FDC5A4CF994F009F69BC7C4B62/%var2%
SRCSRVCMD=
UN=%USERNAME%
CN=%COMPUTERNAME%
SRCSRVVERCTRL=http
SRCSRVERRVAR=var2
SRCSRV: source files ---------------------------------------
c:\Users\hpierson\Projects\HttpUtility\HttpUtility\Helpers.cs*Mono.HttpUtility/Helpers.cs*b1662f95db5a82745072f991dfea68d6
c:\Users\hpierson\Projects\HttpUtility\HttpUtility\HttpEncoder.cs*Mono.HttpUtility/HttpEncoder.cs*dfe2d8a91c322c8dc8915bb5f63cc798
c:\Users\hpierson\Projects\HttpUtility\HttpUtility\HttpUtility.cs*Mono.HttpUtility/HttpUtility.cs*a3784bf5a49c81f6893947d72b671912
SRCSRV: end ------------------------------------------------
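
To show how a debugger could turn this stream into a download URL, here is a rough Python sketch. It assumes the stream has one entry per line, that each source file line is split on `*` into `var1`, `var2`, `var3`, and that `%NAME%` references are resolved first against the stream's own variables and then against environment variables. The real srcsrv format also supports helper functions, which this sketch ignores. The function names are hypothetical.

```python
import re
from typing import Dict, List, Tuple


def parse_srcsrv(text: str) -> Tuple[Dict[str, str], List[List[str]]]:
    """Split a srcsrv stream into its variables and its source file entries."""
    variables: Dict[str, str] = {}
    files: List[List[str]] = []
    section = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("SRCSRV:"):
            # "SRCSRV: source files -----" becomes "source files"
            section = line[len("SRCSRV:"):].strip(" -")
            continue
        if section in ("ini", "variables") and "=" in line:
            name, value = line.split("=", 1)
            variables[name.upper()] = value
        elif section == "source files":
            files.append(line.split("*"))
    return variables, files


def expand(template: str, scope: Dict[str, str], max_passes: int = 8) -> str:
    """Repeatedly substitute %NAME% references until nothing changes."""
    def lookup(match):
        # Unknown names are left untouched.
        return scope.get(match.group(1).upper(), match.group(0))

    for _ in range(max_passes):
        expanded = re.sub(r"%(\w+)%", lookup, template)
        if expanded == template:
            break
        template = expanded
    return template


def source_url(variables: Dict[str, str], fields: List[str],
               environment: Dict[str, str]) -> str:
    """Resolve SRCSRVTRG for one source file entry."""
    scope = {name.upper(): value for name, value in environment.items()}
    scope.update(variables)  # stream variables win over the environment
    for index, field in enumerate(fields, start=1):
        scope[f"VAR{index}"] = field
    return expand(scope["SRCSRVTRG"], scope)
```

With the stream above, the entry for Helpers.cs would resolve `%var2%` to `Mono.HttpUtility/Helpers.cs`, and `%CN%` and `%UN%` to the local computer and user names passed in as `environment` (for example `dict(os.environ)`).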

This information and the GUID in the .pdb can be accessed using a modified Microsoft.Cci.Pdb. Mono.Cecil uses that library internally, but everything in the namespace is marked internal, and I'm not sure why. I forked Mono.Cecil and separated the Microsoft.Cci.Pdb namespace into its own DLL, which I called Mono.Cecil.Cci.Pdb, making public anything that Mono.Cecil needed in order to build. With that in place, the code to get the GUID and srcsrv stream is:

Out Parameters in F#

One thing to highlight in the F# code above is how elegant the call to PdbFile.LoadFunctions is. The C# code from PdbReader.cs is: