Moving to the new SymbolSource engine

A few months ago I promised we would be launching a new version of SymbolSource. Today I’m very happy to make the first part of the new service public, the one you’ve been waiting for most impatiently.

A new package repository has been deployed to production at https://nuget.smbsrc.net and contains all packages from https://www.symbolsource.org/Public/Metadata/NuGet (around 50 thousand of them!). In other words this is the new complementary service to https://www.nuget.org, running on a codebase rewritten from scratch – something we needed to do to solve all of the performance issues you have been facing with SymbolSource in the past.

What about the SymbolSource repository for MyGet, or the other company instances, you might ask? Those will be migrated to the new service too, and I will report back as soon as that is done.

For now let’s see how the new service operates. You can try all of these commands yourself, and update all of your scripts and debuggers accordingly.

Publishing a symbol package

To push a symbol package run this command:

> nuget push SymbolSource.DemoLibrary-1.0.0.0.nupkg a0f4e24d-851f-4327-a9ec-181d335e7e95 -source https://nuget.smbsrc.net
Pushing SymbolSource.DemoLibrary 1.0.0.0 to 'https://nuget.smbsrc.net'...
Your package was pushed.

The command will succeed immediately without any timeouts, because it just sends data to Azure storage. Packages are queued and processed asynchronously. That means that you will need to wait a bit before the symbols and sources become available, but we’ll be monitoring the queue length to make sure it never takes more than a few minutes.

Waiting and refreshing isn’t much fun, so we included a simple notification service, which you can use directly from the command prompt. Just append your Twitter username to the API key after a slash, like this:

> nuget push SymbolSource.DemoLibrary-1.0.0.0.nupkg a0f4e24d-851f-4327-a9ec-181d335e7e95/@TripleEmcoder -source https://nuget.smbsrc.net
Pushing SymbolSource.DemoLibrary 1.0.0.0 to 'https://nuget.smbsrc.net'...
Your package was pushed.

You’ll get tweeted by our friendly SymbolSource Bot (@smbsrc) when the package starts processing and when it finishes. The tweet will also tell you if anything went wrong, in which case @SymbolSource will also be mentioned to alert us about the issue.
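If you script your pushes, the key-plus-handle convention is easy to wrap. The following is only a sketch: the helper name `build_push_args` is made up for illustration, and it uses nothing beyond the `push` command and `-source` option shown above.

```python
def build_push_args(package_path, api_key, source, twitter_handle=None):
    """Build a nuget.exe argument list for pushing a symbol package.

    Hypothetical helper: when a Twitter handle is given, it is appended
    to the API key after a slash, as described in the post.
    """
    key = api_key if twitter_handle is None else f"{api_key}/{twitter_handle}"
    return ["nuget", "push", package_path, key, "-source", source]


args = build_push_args(
    "SymbolSource.DemoLibrary-1.0.0.0.nupkg",
    "a0f4e24d-851f-4327-a9ec-181d335e7e95",
    "https://nuget.smbsrc.net",
    twitter_handle="@TripleEmcoder",
)
print(" ".join(args))
```

The resulting list can be handed to `subprocess.run` on a machine where `nuget` is on the path.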

One thing you’ll notice immediately is that we’ve added a suffix to the version number. Each upload to SymbolSource is identified separately. We then do a form of reference counting on all of the symbols and sources, so there is no cost to that. But in case you upload different builds with the same version number, all of them will be available to clients. Unless of course you decide to delete a package. Which is now a fully supported self-service operation!
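To make the reference counting idea concrete, here is a minimal sketch. It is not the SymbolSource implementation: the class, the method names and the idea of keying content by a hash-like string are all assumptions made for illustration.

```python
from collections import Counter


class RefCountedStore:
    """Toy model: each upload references content items (symbols, sources)."""

    def __init__(self):
        # content key -> number of uploads currently referencing it
        self._refs = Counter()

    def add_upload(self, content_keys):
        for key in set(content_keys):
            self._refs[key] += 1

    def remove_upload(self, content_keys):
        """Drop one upload and return the keys nobody references any more."""
        orphaned = []
        for key in set(content_keys):
            self._refs[key] -= 1
            if self._refs[key] <= 0:
                del self._refs[key]
                orphaned.append(key)
        return sorted(orphaned)


store = RefCountedStore()
# Two builds with the same version number: one shared source file, different PDBs.
build_a = ["DemoLibrary.pdb#hash1", "Program.cs#hash2"]
build_b = ["DemoLibrary.pdb#hash3", "Program.cs#hash2"]
store.add_upload(build_a)
store.add_upload(build_b)

removed = store.remove_upload(build_a)
print(removed)
```

Deleting the first build only frees its own PDB, because the shared source file is still referenced by the second build.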

Deleting a symbol package

If you know the SymbolSource version string of your package, deleting is as simple as running this command:

> nuget delete SymbolSource.DemoLibrary 1.0.0.0-at-0E8H51 a0f4e24d-851f-4327-a9ec-181d335e7e95/@TripleEmcoder -source https://nuget.smbsrc.net
Deleting SymbolSource.DemoLibrary 1.0.0.0-at-0E8H51 from the 'https://nuget.smbsrc.net'.
SymbolSource.DemoLibrary 1.0.0.0-at-0E8H51 was deleted successfully.

Once again NuGet will lie a bit, as this is an asynchronous operation. But you will get the same friendly tweets if you include your username with the API key, as I did above.

But what if you didn’t push with Twitter notifications enabled, and are now at a loss as to what the version string is? No worries, you can query SymbolSource like any other package source and find out what the version is.

Before we talk about that, however, if you really want to remove your content from SymbolSource, run this too:

> nuget delete SymbolSource.DemoLibrary 1.0.0.0-at-0E8H51 a0f4e24d-851f-4327-a9ec-181d335e7e95 -source https://nuget.smbsrc.net/,all,original
Deleting SymbolSource.DemoLibrary 1.0.0.0-at-0E8H51 from the 'https://nuget.smbsrc.net'.
SymbolSource.DemoLibrary 1.0.0.0-at-0E8H51 was deleted successfully.

SymbolSource saves a copy of all uploaded packages, in case there is a change in the indexing algorithm and we need to reprocess. They are hosted in a subfeed called original – see below for an explanation of what that is.

Listing symbol packages

By now you probably have already guessed the command to list packages from SymbolSource:

> nuget list SymbolSource.DemoLibrary -source https://nuget.smbsrc.net
No packages found.

Well, that result might seem unexpected. But remember that we added an additional version string and therefore promoted all packages to semantic versioning. NuGet treats those as prerelease packages. A quick fix:

> nuget list SymbolSource.DemoLibrary -prerelease -source https://nuget.smbsrc.net
SymbolSource.DemoLibrary 1.0.0.0-at-0E8GM1

There it is! Just remember that NuGet will only show the most recent package, but you can show all of the uploads:

> nuget list SymbolSource.DemoLibrary -prerelease -allversions -source https://nuget.smbsrc.net
SymbolSource.DemoLibrary 1.0.0.0-at-0E2NAN
SymbolSource.DemoLibrary 1.0.0.0-at-0E8G1E
SymbolSource.DemoLibrary 1.0.0.0-at-0E8GM1

All of this works in NuGet Package Explorer too! As an exercise, try downloading one of those packages. What you will get is a status package, useful in determining what symbols and sources have actually been indexed by SymbolSource. More on this next time.

And in case you’re wondering… That version string is just a timestamp, encoded into an alphabet-based number system to save length. Unfortunately NuGet imposes a 20-character limit on all semantic versions.
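As an illustration of this kind of encoding, here is a small sketch. The suffixes shown in this post use only digits and uppercase letters, so the sketch assumes base 36 and a fixed width of six characters. The exact alphabet, the epoch and the time resolution are assumptions, so the decoded numbers are not converted to dates here.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # assumed alphabet (base 36)


def encode_base36(value, width=6):
    """Encode a non-negative integer as a zero-padded base-36 string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    digits = ""
    while value:
        value, remainder = divmod(value, 36)
        digits = ALPHABET[remainder] + digits
    return digits.rjust(width, "0")


def decode_base36(text):
    """Decode a base-36 string back to an integer."""
    return int(text, 36)


counter = decode_base36("0E8H51")
print(counter, encode_base36(counter))
```

Six characters of base 36 keep the whole `-at-XXXXXX` suffix at ten characters, which leaves room for the original version number within the limit mentioned above.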

Advanced listing by status and ownership

In my last post I also mentioned subfeeds, which let you query by status and ownership. To recap, here are all of the statuses – they’ve slightly changed since our original spec:

  • new – not yet opened, random package name and unknown version,
  • original – a copy of the uploaded package, without any processing, useful if we improve the indexer and need to resubmit packages without any users’ actions,
  • indexingqueued – indexing will soon start asynchronously,
  • indexing – processing is in progress as you are looking at the list,
  • succeeded – the package has been indexed without any issues,
  • deletingqueued – deleting will soon start asynchronously,
  • deleting – deleting is in progress as you are looking at the list,
  • deleted – the package has been completely removed from the index,
  • partial – something went wrong, either during indexing or deleting, and some symbols and sources might be available, but some may not,
  • damagednew – package could not be read with a standard NuGet library,
  • damagedindexing, damageddeleting – these will probably indicate an error in SymbolSource.
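Since every status is its own subfeed, finding out where a particular upload currently sits comes down to checking each subfeed in turn. The sketch below shows that loop. The `list_versions` callable and the in-memory `fake_feeds` data are stand-ins invented for illustration. In practice the callable would wrap a `nuget list` call against the matching subfeed.

```python
# State names as listed above.
STATES = [
    "new", "original", "indexingqueued", "indexing", "succeeded",
    "deletingqueued", "deleting", "deleted", "partial",
    "damagednew", "damagedindexing", "damageddeleting",
]


def find_states(version, list_versions):
    """Return the states whose subfeed currently lists `version`.

    `list_versions` is a hypothetical callable mapping a state name to
    an iterable of version strings found in that subfeed.
    """
    return [state for state in STATES if version in list_versions(state)]


# In-memory stand-in for the real subfeeds, for illustration only.
fake_feeds = {
    "original": {"1.0.0.0-at-0E8GM1"},
    "succeeded": {"1.0.0.0-at-0E8GM1"},
    "deleted": {"1.0.0.0-at-0E8H51"},
}


def lookup(state):
    return fake_feeds.get(state, set())


print(find_states("1.0.0.0-at-0E8GM1", lookup))
```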

For example, if you wanted to confirm that a package has been deleted, you would need to run this command:

> nuget list SymbolSource.DemoLibrary -prerelease -allversions -source https://nuget.smbsrc.net/,all,deleted
SymbolSource.DemoLibrary 1.0.0.0-at-0D9I8Z
SymbolSource.DemoLibrary 1.0.0.0-at-0D9I90
SymbolSource.DemoLibrary 1.0.0.0-at-0E8H51

As you can see, the list contains the package we deleted a few paragraphs earlier – version 1.0.0.0-at-0E8H51. There are, however, also some old versions, which I deleted before writing this post. I can now delete them again, sort of, to simply remove the status package. Note that at this point symbols and sources have already been deleted; this is similar to cleaning up log files:

> nuget delete SymbolSource.DemoLibrary 1.0.0.0-at-0E8H51 a0f4e24d-851f-4327-a9ec-181d335e7e95 -source https://nuget.smbsrc.net/
Deleting SymbolSource.DemoLibrary 1.0.0.0-at-0E8H51 from the 'https://nuget.smbsrc.net/'.
Failed to process request. 'Not Found'.
The remote server returned an error: (404) Not Found..

A 404 error is a bit odd, isn’t it? Well not exactly. Remember, we have already deleted this package. It doesn’t show in the feed. To remove the status package we need to address the subfeed specifically.

> nuget delete SymbolSource.DemoLibrary 1.0.0.0-at-0E8H51 a0f4e24d-851f-4327-a9ec-181d335e7e95 -source https://nuget.smbsrc.net/,all,deleted
Deleting SymbolSource.DemoLibrary 1.0.0.0-at-0E8H51 from the 'https://nuget.smbsrc.net/,all,deleted'.
SymbolSource.DemoLibrary 1.0.0.0-at-0E8H51 was deleted successfully.

If you don’t specify a subfeed at all, SymbolSource defaults to succeeded state and all packages (as opposed to own packages). The first comma separates the feed name, which in case of nuget.org is empty. Things will be different for myget.org, but more on that in a future post.
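The subfeed URLs used so far all follow the same pattern: base URL, a slash, the feed name (empty here), then ownership and state separated by commas. A tiny sketch of that composition, with a made-up function name:

```python
def subfeed_url(base_url, ownership, state, feed=""):
    """Compose a subfeed URL of the form <base>/<feed>,<ownership>,<state>.

    Hypothetical helper: `feed` stays empty for the nuget.org repository,
    as explained in the post.
    """
    return f"{base_url.rstrip('/')}/{feed},{ownership},{state}"


print(subfeed_url("https://nuget.smbsrc.net", "all", "deleted"))
print(subfeed_url("https://nuget.smbsrc.net/", "all", "original"))
```

Both calls reproduce URLs that appear in the commands above.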

So far we have listed all packages, regardless of who uploaded them. Of course pushing and deleting is permitted according to nuget.org package ownership. But you can also limit the packages listed by a query, using the following command:

> nuget list -prerelease -allversions -source https://nuget.smbsrc.net/,own,succeded
Please provide credentials for: https://nuget.smbsrc.net/,own,succeded
UserName: @TripleEmcoder
Password: ************************************
SymbolSource.DemoLibrary 1.0.0.0-at-0E2NAN
SymbolSource.DemoLibrary 1.0.0.0-at-0E8G1E
SymbolSource.DemoLibrary 1.0.0.0-at-0E8GM1

SymbolSource will ask for credentials this time, as it needs to know who the user is in order to list only the right packages. You can pass anything as the username, but if you give us something meaningful like a Twitter handle, we will be able to get back to you in case of problems. The password is your NuGet API key.

Debugging in Visual Studio

And last, but not least, the new URL for Visual Studio is…

https://nuget.smbsrc.net

Just replace the old symbolsource.org URL in your Visual Studio options and you’re done. Happy (and fast) debugging 🙂

Wrapping it up

In this post I described the core features of our new SymbolSource engine. I hope you will find the new service much faster and much more stable than the previous one. We will be gradually upgrading other parts of the service over the next weeks and months. If you have any questions or feedback please leave them in the comment section.

Designing a new SymbolSource

A long, long time ago

It’s been a long time since I have written here, and it’s been an equally long time since we have made any visible improvements to SymbolSource. Part of the reason is that Kamil and I have both been involved in building an EHR (electronic health record) webapp for the Polish outpatient care market. Well, involved doesn’t really cut it, that startup has been more or less our entire life for three years. But now I can say, not without a lot of pride, that we have successfully led the project out of the startup phase. If you’re interested in what we have managed to build, visit www.mediporta.pl. (The website only has a Polish version at the moment.)

We also might not have seemed very responsive when contacted with issues, but we have never stopped reading and noting all the feedback that we received. It has been very hard to reply when the current SymbolSource architecture has not given us many options for implementing improvements. When we started the project, there was no Azure, no NuGet (yes!) and normalized SQL databases ruled the Earth. That resulted in quite a few mediocre decisions, looking back today.

Those times have passed, however, and it’s time to lose the excuses and make things better.

Before we begin

If you are not familiar with what purpose SymbolSource serves, or how it works, first have a look at the official Wiki:

Back to the drawing board

A few months ago we started talking about a total rewrite of SymbolSource, and even with the limited time we could spare, we have made good progress in designing a new, scalable architecture for the service. It will enable us to support many different scenarios with the best possible performance:

  • a public symbol repository for nuget.org,
  • public and private feeds integrated with myget.org,
  • hosted instances deployed in our Azure subscription,
  • instances deployed in private Azure subscriptions,
  • on-premise instances integrated with Active Directory.

But today I don’t want to talk about architecture, as it is more or less irrelevant when a service performs well. Today I’d like to share with you how we see the most basic form of interaction with the new SymbolSource – pushing and managing packages.

Consider this post a spec which, although already implemented, isn’t yet available publicly for testing. We are looking forward to any questions and comments!

Pushing a package

Pushing to the public repository:

nuget.exe push NHibernate.4.0.3.4000.symbols.nupkg 8ac00d48-a8e8-48e4-bb40-4fc92f18e15c -source http://nuget.smbsrc.net

Pushing to a named feed integrated with MyGet:

nuget.exe push EntityFramework.6.1.4-alpha1-40301.symbols.nupkg 60b1845b-116f-4eb0-8086-f96acaae46d7 -source http://myget.smbsrc.net/aspnetwebstacknightly

As a part of our effort to improve SymbolSource performance, we have decided to make all package operations asynchronous, which means that a successful push will only acknowledge that the package was received correctly. Read on to see how to determine the true package status.

Listing packages in various states

If all is well, you should see the package listed in the feed:

nuget.exe list -source http://nuget.smbsrc.net -allversions -prerelease

The list of packages will be similar to the following:

  • NHibernate 4.0.3.4000-smbsrc150302193927

Note that SymbolSource has automatically added a SemVer compatible version suffix. It uniquely identifies each package upload with a timestamp. There is no guarantee that two packages with identical versions don’t have different symbol files inside. That’s why we process them independently.

You can probably see why the extra options to nuget.exe are needed:

  • -prerelease shows packages with the smbsrc suffix,
  • -allversions disables skipping earlier uploads.
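The suffix in the listing above can be split off again when you need the original version number. The sketch below also tries to read the digits as a timestamp. The `yyMMddHHmmss` layout is a guess based purely on the single example shown here, not a documented format, and the function name is invented.

```python
from datetime import datetime


def split_upload_version(version, marker="-smbsrc"):
    """Split 'x.y.z-smbsrc<digits>' into the base version and upload time.

    Assumption: the digits are a yyMMddHHmmss timestamp. Returns
    (version, None) when the marker is not present.
    """
    base, separator, stamp = version.partition(marker)
    if not separator:
        return version, None
    return base, datetime.strptime(stamp, "%y%m%d%H%M%S")


base, uploaded = split_upload_version("4.0.3.4000-smbsrc150302193927")
print(base, uploaded.isoformat())
```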

What if a previous upload was accidental? Read on for instructions on how to delete a package.

Deleting a package

Removing a package – and all of its symbols and sources! – will be as simple as issuing:

nuget.exe delete NHibernate 4.0.3.4000-smbsrc150302193927 8ac00d48-a8e8-48e4-bb40-4fc92f18e15c -source http://nuget.smbsrc.net

Again, this operation is asynchronous. A success message from nuget.exe will only tell you that the package has been found and correctly queued for deletion.

Determining other states

The new SymbolSource will also introduce a concept of subfeeds, which will let users list packages in various states. The list command shown earlier targets the default subfeed, which is own,succeded. You will be able to explicitly target:

  • ownership – currently own or all (subject to permissions),
  • state – one of success, partial, indexing and so on.

Here’s an example of listing all failed packages in the default feed as an administrator of the Caliper company instance:

nuget.exe list -source http://caliper.smbsrc.net/,all,partial -allversions -prerelease

The meaning of the various states is as follows:

  • new – not yet opened, random package name and unknown version,
  • original – a copy of the uploaded package, without any processing, useful if we improve the indexer and need to resubmit packages without any users’ actions,
  • damaged – package could not be read with a standard NuGet library,
  • indexingqueued – indexing will soon start asynchronously,
  • indexing – processing is in progress as you are looking at the list,
  • succeeded – the package has been indexed without any issues,
  • deletingqueued – deleting will soon start asynchronously,
  • deleting – deleting is in progress as you are looking at the list,
  • deleted – the package has been completely removed from the index,
  • partial – something went wrong, either during indexing or deleting, and some symbols and sources might be available, but some may not.

If listing is possible… is downloading too?

Yes! Just as for any other NuGet package:

nuget.exe install NHibernate -version 4.0.3.4000-smbsrc150302193927 -source http://nuget.smbsrc.net

But you might be surprised by the results. Remember that when no subfeed is specified, you target the succeeded state. Packages in that state have no content, but status files instead! I will blog more about this in a future post. At the moment these are JSON files that specify what symbols and sources were detected and whether they have been uploaded to permanent storage. If there were any problems, a package will be listed in the partial state, and the status files will provide error messages.

By the way, since the JSON status files have complete information about indexed symbols and sources, we can delete a package entirely based only on those, without hitting any database at all. A big win performance-wise!
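To show why a self-describing status file makes deletion cheap, here is a sketch with an entirely hypothetical layout. The post does not specify the real schema, so the field names (`symbols`, `sources`, `path`, `stored`) and the sample paths are invented.

```python
import json

# Hypothetical status file content; the real format is not described in the post.
STATUS_JSON = """
{
  "symbols": [
    {"path": "pdb/DemoLibrary.pdb", "stored": true}
  ],
  "sources": [
    {"path": "src/DemoLibrary/Program.cs", "stored": true},
    {"path": "src/DemoLibrary/Missing.cs", "stored": false}
  ]
}
"""


def blobs_to_delete(status_text):
    """List stored items named in a status file, with no database lookup."""
    status = json.loads(status_text)
    entries = status.get("symbols", []) + status.get("sources", [])
    return [entry["path"] for entry in entries if entry.get("stored")]


print(blobs_to_delete(STATUS_JSON))
```

Everything needed to clean up is read from the status file itself, which is the point made above.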

Time for some feedback

What do you think about the scheme that we designed? Please share your thoughts.

The Orchard packaging experiment

I’ve written about plugins based on NuGet before. One piece of software that manages its pluggable parts with NuGet is Orchard – a CMS based on .NET and MVC, which from the experience of my company, Caliper, also serves as a nice framework for building web applications without any CMS functionality.

There are a couple of things that Orchard solves for you, which make it great as a general framework:

  • database connectivity (with NHibernate), including out-of-the-box support for local, automatic SqlCe databases,
  • data migrations, for upgrading table schemas with new releases of your web applications or modules,
  • theme and module management, with the use of NuGet packages and feeds,
  • dynamic compilation, which lets you make in-place changes to your web site or application, even through FTP, and see changes instantaneously.

Orchard is distributed as a ZIP file that you can deploy as an application in IIS or open as a web application in Visual Studio. Module and theme packages are only managed through Orchard’s web UI, which gets them from a special Orchard Gallery Feed URL: http://packages.orchardproject.net/FeedService.svc. If you open a module from that feed you’ll see that it contains all of its sources (as in *.cs files), for Orchard’s dynamic compilation to take care of. Open-source at its best. Or is it?

As much as we love having access to sources at Caliper, we love having them on-demand in our debugging sessions even more – instead of moving thousands of them around on disk for no reason. And we all have SymbolSource to handle that for free 🙂

So we set out to try and make Orchard and its modules more vanilla-nuggety and symbol/source server friendly.

Step 1: Packaging Orchard

First, we decided to try putting Orchard itself into packages. The end-result experience should speak for itself:

  1. Open Visual Studio and create an ASP.NET Empty Web Application (yes, no MVC, no Web Forms, no nothing).
  2. Add http://nuget.gw.symbolsource.org/Public/Orchard/FeedService.mvc as a new package source to Visual Studio (yes, SymbolSource can host and serve NuGet packages as a package repository)
  3. Install the Orchard package from this feed in the empty project.
  4. Run the project with F5. You now have the Orchard setup page ready for action.

Things to note, as this should be considered work-in-progress still:

  • All standard modules and themes are downloaded as dependencies. We haven’t tried yet to determine what subset is needed for setup, and which of them could be installed on an as-needed basis later with NuGet in Visual Studio or Orchard through the web UI.
  • External dependencies are taken from the official NuGet feed only if they were taken by the Orchard team as-is, without modification or recompilation. All the rest is in Orchard.External.
  • SqlCe native binaries are not included yet. But you should be good to go on your development machine, as you probably have it installed globally on your system.
  • We haven’t yet figured out how to do versioning, so all packages have strict dependencies (==, i.e. [x.y.z] in NuGet).
  • You can easily create your own binary module packages that take other Orchard modules as NuGet dependencies and get compiled with the right references.

Now for the best part:

  • No *.cs files anywhere! Fewer files means more speed, less waiting, fewer transfers, less comparing on updates, less temptation to modify modules in place.
  • Debuggable – with SymbolSource! All symbols (PDB files) and sources are indexed and accessible with Visual Studio using the regular, public SymbolSource URL. This applies only to Orchard.Framework, Orchard.Core and Orchard.Web projects at the moment, but all Orchard modules will follow soon.
  • No dynamic compilation! No more screwed up assembly names and WCF services unable to load. Yes, you can disable dynamic compilation in configuration, but then you’re still required to compile modules and cleanup sources yourself.
  • All views are included in the main project as content files, so it’s easy to find, edit or copy them to your theme.

Step 2: Repackaging modules

Since a lot of the standard Orchard distribution is built using modules, we had to figure out how to transform them from source form into binary form. We have a small application that interfaces with NuGet and MSBuild to get the modules, guess their interdependencies from project references, then compile and repackage them. All of the standard modules have been uploaded in repackaged, binary form to the SymbolSource Orchard repository.
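The dependency-guessing step boils down to ordering modules so that each one is compiled after everything it references. The following sketch shows that ordering with Python's standard `graphlib` (Python 3.9 or later). It is not the actual tool, and the reference graph is made up for illustration.

```python
from graphlib import TopologicalSorter

# Illustrative data only: module name -> modules it references.
project_references = {
    "Orchard.Framework": [],
    "Orchard.Core": ["Orchard.Framework"],
    "Orchard.Users": ["Orchard.Framework", "Orchard.Core"],
    "Orchard.Roles": ["Orchard.Users"],
}


def build_order(references):
    """Return module names ordered so that dependencies come first."""
    return list(TopologicalSorter(references).static_order())


order = build_order(project_references)
print(order)
```

`TopologicalSorter` raises `graphlib.CycleError` if two modules reference each other, which would be a useful early warning before any compilation starts.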

So far we haven’t touched any contributed modules from the Orchard Gallery, but our plan is to either create a SymbolSource gateway that would do that compilation on the fly (module authors would decide to push to it additionally), or monitor the gallery feed and repackage new modules into our Orchard repository automatically.

What do you think?

All the tooling that we created for this is available on GitHub: http://github.com/SymbolSource/Orchard. To run it, just clone, open a Visual Studio prompt and do msbuild Orchard.proj. The output will be in pkg (main Orchard packages) and pkgbin (module packages).

A world of debuggable open-source software – Part 3: Applications

In my last post, A world of debuggable open-source software – Part 2: Plugins, I used NuGet Package Explorer as an example of an application that uses NuGet-based plugins, and how that makes it very easy to publish their symbols and sources to SymbolSource, for a great debugging experience in case anything goes wrong at runtime. If you’re interested in creating your own plugin system based on NuGet, have a look at this post by Aaron Powell: Creating a NuGet-based plugin engine.

I also mentioned in that post that you can push any PDB files and sources to SymbolSource, including those of entire applications. Because NuGet is still the easiest way of pushing symbol packages, what we would need is a NuGet-based method of distributing programs. Fortunately such a tool exists already and is called Chocolatey. If you haven’t heard of it before, here’s how it’s described by its authors:

Chocolatey NuGet is a Machine Package Manager, somewhat like apt-get, but built with windows in mind.

In this post I will show you how Chocolatey and SymbolSource can be used to publish open-source applications with full debugging support, allowing users with programming skills to analyze problems through code and provide much more useful bug reports. Or even patches and pull requests!

A world of debuggable open-source software – Part 2: Plugins

Apart from commonly being free-as-in-beer, open-source is also great because it enables us geeks to have a deep look into the inner workings of software that we use. How many times have you wondered why a program fails, only to find that it has no logging capabilities, or produces useless output? Open-source often lets you avoid banging your head on the wall too hard, because you can always have a look at the code and try to figure out what the problem is.

But what if the reason still remains a mystery after reading the code? Fortunately, you then still have the most powerful tool in your development toolbox left – the debugger. Let’s recap what it usually takes to start debugging third-party code.

Creating a custom PostSharp aspect (with IL transformation)

Last time I wrote about my efforts to create minidumps with PostSharp aspects. I showed some ways of achieving that, but the main conclusion was that no out-of-the-box aspect in PostSharp provides a transformation that would allow inserting minidump generation code in the optimal place: just before throw instructions. Fortunately, it isn’t too hard to create such an aspect, even though the required PostSharp SDK is unsupported and undocumented.

Let’s start by analyzing how PostSharp aspects are architected. Then we’ll move to implementing the actual IL transformation required to temporarily store the exception about to be thrown, pass it to the aspect code, and then actually throw it as expected.


Writing minidumps with PostSharp aspects

In the comments for one of my previous posts, Writing an automatic debugger in 15 minutes, a reader suggested creating minidumps with PostSharp aspects. Although I proved (by the very scientific method of guesstimation – see Performance impact of running under MDbgEngine) that running under a debugger introduces only a 25% performance penalty, it is reasonable to try to avoid it altogether. Today’s post will be an investigation into using PostSharp as a means of injecting minidump creation code into exception handlers.


Performance impact of running under MDbgEngine

As promised, this post will present the results of my investigation into the performance impact of running under the simplest possible debugger written using MDbgEngine.

Accessing stack traces with MDbgEngine and PADRE

In my previous post, I showed how MDbgEngine (available on NuGet) can be used to stop a process when an exception is thrown, and how a minidump can then be created to enable post-mortem debugging.

Using MDbgEngine, you can do much, much more – and all with a very high-level API.

Writing an automatic debugger in 15 minutes (yes, a debugger!)

Seriously, it will take you longer to read this long introduction than to code a working debugger in C#.

You may also want to check out:

Bugs in production

Remember all those reports coming back from clients saying “hey, your program crashed” or “hey, your site is showing these ugly yellow pages at random moments”? So do I. Unfortunately, there isn’t much that can be done to diagnose problems in running software. Basically we’re at the mercy of our clients’ reports (which vary in quality, most of the time tending towards useless) or more or less verbose logging. Having those, we can reproduce issues during a debugging session to see what code causes them. If that fails, we can go to the extremes of attaching a debugger in a production environment. Not feeling enough pressure today? Attach to a live site and try to set up breakpoints the exact way needed to only catch the error, and not stop a million people from doing their daily work. At first try. Logging, on the other hand, is very safe, but it will only tell you as much as you predicted would be needed.

Now what if we could have something in the middle? More than logging, but less than an interactive debugging session?