Preventing XSS in .NET Core Web APIs

This post is out of date. Consider reading Cross Site Scripting for an up-to-date perspective on XSS.

G’day guys!

Guess what? If you’re developing an ASP.NET Web API project and haven’t taken steps to protect it against XSS (Cross Site Scripting), then unfortunately you’ve got a security hole on your hands.

You can read more about XSS (and some platform-agnostic techniques to prevent it) on the OWASP website here and here, but long story short: XSS involves your API saving user-submitted data that contains malicious HTML or JavaScript, and then exposing another endpoint that returns that data as-is. The script the attacker submitted is then delivered to another user’s browser, where it wreaks havoc in some way: defacing your service for them, scraping private data from the page and sending it over the wire, or whatever other nasties the attacker can think of.

Now, it’s true that most modern browsers and front-end frameworks have security mechanisms that help prevent untrusted HTML from being rendered (e.g. CSP and Angular’s DomSanitizer), but that doesn’t mean we should just delegate the responsibility of security to the front-end; the API plays a part in this as well, and ought to be secured.

Hold up, doesn’t ASP.NET take care of this for us?

It used to (to a certain extent), so I’d forgive you for thinking this was the case. As described on the OWASP website, ASP.NET used to come with a Request Validation feature that was enabled by default for Web Forms (remember those!) and MVC projects, but this has never been supported for Web API projects. For the life of me I haven’t been able to find why, so if you do happen to know - please let me know in the comments. The sad fact is that nowadays the vast majority of ASP.NET applications are likely to be APIs, so it’s a shame that Request Validation is no longer available for us. When it comes to preventing XSS, we’re on our own.

Well, crap. What do we do now?

There are basically two approaches you can take with XSS: sanitise (or reject) the input, encode the output, or both.
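To make the second option concrete, here is a minimal sketch of output encoding using the HtmlEncoder class that ships with .NET Core. The OutputEncoding class and its ForHtml method are names made up for this illustration; they aren’t part of ASP.NET or of any sample project.

```csharp
using System.Text.Encodings.Web;

// Hypothetical helper, named for illustration only.
public static class OutputEncoding
{
    // Encodes characters that are significant in HTML (such as < > & and quotes)
    // so that a browser treats the value as text rather than markup.
    public static string ForHtml(string untrusted)
    {
        return HtmlEncoder.Default.Encode(untrusted);
    }
}
```

Passing a string such as "<script>alert(1)</script>" through OutputEncoding.ForHtml replaces the angle brackets with &lt; and &gt;, so the value would render as harmless text. Keep in mind that encoding depends on where the value ends up: HTML encoding suits HTML bodies, but not necessarily URLs or inline JavaScript.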

One very reputable package that’s recommended for sanitisation is HtmlSanitizer, which is even listed on the OWASP website. Basic usage looks something like:

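// Requires the HtmlSanitizer NuGet package
var sanitizer = new HtmlSanitizer();
// Allow the "class" attribute in addition to the default whitelist
sanitizer.AllowedAttributes.Add("class");
// html is the untrusted input string
var sanitized = sanitizer.Sanitize(html);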

The sanitiser (yep, that’s how I’m spelling it) can be configured with a whitelist of supported attributes and tags, and then invoked to strip anything that isn’t whitelisted from the source string. While this library is pretty much unanimously recommended, out of the box you do need to sanitise each field individually.

You’re kidding, right?

Sadly, no. There are some approaches you can take to have the sanitiser run on every request, but as far as I’ve seen, there isn’t any built-in support for this in ASP.NET or in the package itself. I’ll show you a couple of samples for this, but you can download and run the entire sample project by cloning it from GitHub.

JSON Converter

One (albeit naive) approach for sanitising the content either coming in or going out of the API is to use a custom JSON converter. To achieve this, first create a new class (e.g. AntiXssConverter) with the following content:

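    using System;
    using System.Text.Json;
    using System.Text.Json.Serialization;
    using Ganss.Xss; // HtmlSanitizer package; older versions of the package used Ganss.XSS

    public class AntiXssConverter : JsonConverter<string>
    {
        public override string? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
        {
            var raw = reader.GetString();

            // Nothing to check for a null value
            if (raw is null)
                return null;

            var sanitiser = new HtmlSanitizer();
            var sanitised = sanitiser.Sanitize(raw);

            // The sanitiser left the input untouched, so accept it
            if (raw == sanitised)
                return sanitised;

            // BadRequestException is a custom exception type (not part of ASP.NET)
            throw new BadRequestException("XSS injection detected.");
        }

        public override void Write(Utf8JsonWriter writer, string value, JsonSerializerOptions options)
        {
            // Responses are written as-is
            writer.WriteStringValue(value);
        }
    }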

Note that this converter is based on the System.Text.Json serialiser that ships by default with .NET Core 3.0 and above. If you’re using Newtonsoft.Json instead, the principle will still be the same, but the implementation will likely differ slightly.

Essentially what we’re doing here is providing a strategy for reading string values whenever a JSON string is deserialised (e.g. during model binding) and writing string values when an object is serialised (e.g. when a controller returns a response). In this particular example, we’re using the HtmlSanitizer package introduced earlier to sanitise each string, and if the sanitised string differs from the original, we throw a custom BadRequestException, denying the request with an HTTP 400 (handled by custom exception handling middleware). Bear in mind that the sanitiser may also rewrite harmless input (for example, by HTML-encoding a bare ampersand), so this strict comparison can reject legitimate text too. Another approach we could have taken is to simply sanitise the string and allow the request, or to save the string as-is and sanitise (or encode) it on serialisation in the Write method.
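The last alternative mentioned above, encoding on serialisation in the Write method, could look something like the sketch below. It relies only on System.Text.Json and the built-in HtmlEncoder; the HtmlEncodingStringConverter name is hypothetical and isn’t taken from the sample project.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical converter, named for illustration only.
public class HtmlEncodingStringConverter : JsonConverter<string>
{
    public override string? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        // Accept the input unchanged; encoding happens on the way out
        return reader.GetString();
    }

    public override void Write(Utf8JsonWriter writer, string value, JsonSerializerOptions options)
    {
        // HTML-encode every string value before it is written to the response
        writer.WriteStringValue(HtmlEncoder.Default.Encode(value));
    }
}
```

It would be registered the same way as any other converter, by adding an instance to the Converters collection of the JsonSerializerOptions. The trade-off is that every client receives HTML-encoded strings, so a front-end that already encodes on render would end up double-encoding them.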

In Startup.cs, we’d need to register the converter in ConfigureServices(...) like so:

    services
        .AddControllers()
        .AddJsonOptions(options =>
        {
            options.JsonSerializerOptions.Converters.Add(new AntiXssConverter());
        });

This approach works for simple objects, nested objects and collections. However, it does have two fairly serious drawbacks:

This logic needs to run for every string in the request. If the payload is a large object with many string properties, this could become quite inefficient.
Our XSS validation / sanitisation is coupled to JSON; if the user were to submit URL-encoded or multipart/form-data content (which could quite realistically be the case if we have endpoints that deal with file uploads), then other parts of our application may still be at risk.

ASP.NET Middleware

A more efficient and more versatile approach might be to sanitise and validate the entire request body through a middleware. This has the advantages of:

It only runs once for each request - so it should be significantly faster than using the JSON converter approach.
It’s not coupled to any particular content type. Regardless of whether the endpoint uses JSON, form data or any other medium, the request will still be sanitised and validated.

To implement this, create a new class called AntiXssMiddleware with the following content:

    using System.IO;
    using System.Text;
    using System.Threading.Tasks;
    using Ganss.Xss; // HtmlSanitizer package; older versions of the package used Ganss.XSS
    using Microsoft.AspNetCore.Http;

    public class AntiXssMiddleware
    {
        private readonly RequestDelegate _next;

        public AntiXssMiddleware(RequestDelegate next)
        {
            _next = next;
        }

        public async Task Invoke(HttpContext httpContext)
        {
            // enable buffering so that the request can be read by the model binders next
            httpContext.Request.EnableBuffering();

            // leaveOpen: true to leave the stream open after disposing, so it can be read by the model binders
            using (var streamReader = new StreamReader(httpContext.Request.Body, Encoding.UTF8, leaveOpen: true))
            {
                var raw = await streamReader.ReadToEndAsync();
                var sanitiser = new HtmlSanitizer();
                var sanitised = sanitiser.Sanitize(raw);

                // reject the request if the sanitiser changed anything in the body
                if (raw != sanitised)
                {
                    // BadRequestException is a custom exception type (not part of ASP.NET)
                    throw new BadRequestException("XSS injection detected from middleware.");
                }
            }

            // rewind the stream for the next middleware
            httpContext.Request.Body.Seek(0, SeekOrigin.Begin);
            await _next.Invoke(httpContext);
        }
    }

Don’t forget to register it in Startup.cs under Configure(...):

    app.UseMiddleware<AntiXssMiddleware>();

If you happen to be using a custom exception handling middleware, register AntiXssMiddleware after it, so that the BadRequestException thrown here is caught and turned into an HTTP 400.
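For reference, the custom exception and the exception handling middleware referred to above could be as simple as the following sketch. Both types are assumptions made for this illustration (the real ones live in the sample project and may well look different), and a production handler would likely deal with more than one exception type.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

// Hypothetical stand-in for the custom exception type thrown by the converter and middleware.
public class BadRequestException : Exception
{
    public BadRequestException(string message) : base(message) { }
}

// Hypothetical exception handling middleware, named for illustration only.
public class ExceptionHandlingMiddleware
{
    private readonly RequestDelegate _next;

    public ExceptionHandlingMiddleware(RequestDelegate next)
    {
        _next = next;
    }

    public async Task Invoke(HttpContext httpContext)
    {
        try
        {
            // Run the rest of the pipeline, including AntiXssMiddleware
            await _next(httpContext);
        }
        catch (BadRequestException ex)
        {
            // Translate the exception into an HTTP 400 response
            httpContext.Response.StatusCode = StatusCodes.Status400BadRequest;
            await httpContext.Response.WriteAsync(ex.Message);
        }
    }
}
```

With this in place, app.UseMiddleware<ExceptionHandlingMiddleware>() would go before app.UseMiddleware<AntiXssMiddleware>() in Configure(...), because a middleware can only catch exceptions thrown by the ones registered after it.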

In our middleware, we’ve taken the same approach as in our JSON converter of validating the request and throwing a bad request exception if the validation failed. I’ve found this to be about 50% faster than the JSON converter, even for a contrived example of a single-string model.

Anyway, this is just one suggestion for how you might (aggressively) protect against XSS in your API. This is of course just a simple example - if you’re building any sort of API that expects HTML content to be passed in, such as a CMS, you’ll likely need to configure the HtmlSanitizer with a whitelist of allowed tags and attributes.
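As a rough idea of what such a whitelist might look like, here is a sketch that narrows the sanitiser down to a handful of formatting tags. The CmsSanitiser class is a made-up name, and the particular tags and attributes are only an example policy, not a recommendation.

```csharp
using Ganss.Xss; // HtmlSanitizer package; older versions of the package used Ganss.XSS

// Hypothetical factory, named for illustration only.
public static class CmsSanitiser
{
    public static HtmlSanitizer Create()
    {
        var sanitiser = new HtmlSanitizer();

        // Start from an empty whitelist rather than the package defaults
        sanitiser.AllowedTags.Clear();
        sanitiser.AllowedAttributes.Clear();

        // Allow only basic formatting and links (an example policy)
        foreach (var tag in new[] { "p", "b", "i", "ul", "ol", "li", "a" })
            sanitiser.AllowedTags.Add(tag);

        sanitiser.AllowedAttributes.Add("href");

        return sanitiser;
    }
}
```

In an API like this, you’d most likely sanitise and store the cleaned-up HTML rather than reject the request outright, since a strict comparison against the original would turn away almost any real-world markup.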

How do you deal with XSS in your own API? Let me know in the comments!

Catch ya!

Jason Sultana