Captain Codeman

Introducing Embed.ly client for .NET

Introduction

First of all, if you haven’t heard of Embed.ly, you really should check it out.

At its core, embedly is an oEmbed provider. oEmbed is a format for allowing an embedded representation of a URL on third-party sites. The simple API allows a website to display embedded content (such as photos or videos) when a user posts a link to that resource, without having to parse the resource directly.

If you’ve ever posted a link on Facebook and been impressed that it automatically added a title, some descriptive text and one or more preview images to select from, or included a playable video automatically, and you want to build something like that into your own site, then this is for you.

There were already client libraries for several other languages but none for .NET so I developed this. I’ve been using it on a forum app to automatically detect links to videos and images to build up a gallery of both and to make them playable within the post. You can also use it on the client via a jQuery plugin but then you lose the ability to build up the gallery and index the additional content. If someone has posted a link to a Bob Dylan video then I’d like that post to be returned if someone searches for ‘Dylan’.

The response from embedly can also include a flag to indicate if the URL is considered ‘safe’ (based on Google’s safe-browsing API).

Example

Here is an example of the original content posted showing how the link is converted into a video and the additional information retrieved.

Original Post

The user makes a post and just copies and pastes a regular link.

[Screenshot: the original post]

Embedly Enhanced

The HTML is parsed and sanitized (using HtmlAgilityPack and a custom HTML-cleaning library) and the discovered URL is checked with Embedly. We told embedly we wanted a preview of 640px maximum width, so the HTML snippet returned fits perfectly and shows a playable preview:

[Screenshot: the post with the embedly preview]

Embedly also returns static thumbnail images which are perfect to add to a gallery of content:

[Screenshot: video library]

Additional Content

As well as the HTML preview and thumbnail, embedly returns the title, description and other information, which can enhance the page hosting the content or make it more searchable on our site:

[Screenshot: video preview]

Embedly provides a much richer experience to the end user.

So what does the .NET client do?

Basically, it provides an easy way to make requests to embedly and get strongly-typed results back. It automatically handles the request to the embedly service to get the details of the services they support fully and has a high-performance regex-less way of matching URLs against them to see if they are supported (doing 500+ regex lookups against each URL is too slow when batch processing).
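
The post doesn’t say how the regex-less matching is implemented. As a rough sketch of one way it could be done (not the library’s actual code), providers can be indexed by host name so that matching a URL is a dictionary lookup instead of hundreds of regex tests. The names ProviderIndex, Add and Find are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: index providers by host so a URL can be matched with a
// dictionary lookup rather than by testing hundreds of regular expressions.
public sealed class ProviderIndex
{
    // Host names are case-insensitive, so the dictionary ignores case.
    private readonly Dictionary<string, string> _byHost =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // Register a provider name under each host it serves.
    public void Add(string providerName, params string[] hosts)
    {
        foreach (var host in hosts)
        {
            _byHost[host] = providerName;
        }
    }

    // Return the provider name for the URL's host, or null if none matches.
    // Parent domains are tried too, so "www.youtube.com" falls back to "youtube.com".
    public string Find(Uri url)
    {
        var host = url.Host;
        while (true)
        {
            string name;
            if (_byHost.TryGetValue(host, out name))
            {
                return name;
            }
            var dot = host.IndexOf('.');
            if (dot < 0)
            {
                return null;
            }
            host = host.Substring(dot + 1);
        }
    }
}

public static class ProviderIndexDemo
{
    public static void Main()
    {
        var index = new ProviderIndex();
        index.Add("youtube", "youtube.com", "youtu.be");

        var url = new Uri("http://www.youtube.com/watch?v=YwSZvHqf9qM");
        Console.WriteLine(index.Find(url) ?? "(no provider)");
    }
}
```

A host-only lookup like this ignores the path, so a real implementation would still need a cheap check on the rest of the URL once the candidate provider has been found.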

Requests to embedly can be filtered based on the provider information making it easy to limit requests to YouTube videos or Amazon products or perhaps any video or photo provider.

When requesting more than one URL the client will automatically batch them into a single HTTP request to embedly (which supports up to 20 URLs per request) and uses async downloading to handle the response without blocking or using valuable CPU time.
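
To illustrate the batching idea, here is a minimal sketch that splits any sequence of URLs into groups of at most 20, one group per HTTP request. UrlBatcher and Batch are hypothetical names used only for this example; the client does this for you internally and its own code may look quite different.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UrlBatcher
{
    // Split a sequence into consecutive groups of at most `size` items.
    public static IEnumerable<List<Uri>> Batch(IEnumerable<Uri> urls, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException("size");

        var batch = new List<Uri>(size);
        foreach (var url in urls)
        {
            batch.Add(url);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<Uri>(size);
            }
        }

        // Emit the final, partially filled group (if any).
        if (batch.Count > 0) yield return batch;
    }

    public static void Main()
    {
        var urls = Enumerable.Range(1, 45)
            .Select(i => new Uri("http://example.com/" + i));

        // 45 URLs are split into groups of 20, 20 and 5.
        foreach (var batch in Batch(urls, 20))
        {
            Console.WriteLine("request with {0} URLs", batch.Count);
        }
    }
}
```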

Finally, a caching mechanism helps avoid re-requesting URLs that you have recently checked. It operates at the URL level: the individual URL results are cached, not the entire embedly response (which could cover 20 URLs). So if you requested 60 URLs and 40 had already been requested, the remaining 20 would fit in a single HTTP request to embedly, whatever sequence they were requested in.
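
The arithmetic behind that can be sketched as follows. This is only an illustration with hypothetical names (CachePlanner, RequestsNeeded), using a HashSet as a stand-in for a real cache: count the URLs that miss the cache, then divide by the batch size and round up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CachePlanner
{
    // Number of HTTP requests needed when only uncached URLs are sent,
    // grouped into batches of at most `batchSize`.
    public static int RequestsNeeded(IEnumerable<Uri> urls, ISet<Uri> cached, int batchSize)
    {
        var misses = urls.Count(url => !cached.Contains(url));
        return (misses + batchSize - 1) / batchSize; // integer ceiling
    }

    public static void Main()
    {
        var urls = Enumerable.Range(1, 60)
            .Select(i => new Uri("http://example.com/" + i))
            .ToList();

        // Pretend two out of every three URLs were fetched earlier:
        // 40 cache hits scattered through the list, 20 misses.
        var cached = new HashSet<Uri>(urls.Where((url, i) => (i + 1) % 3 != 0));

        Console.WriteLine(RequestsNeeded(urls, cached, 20)); // 20 misses -> 1 request
    }
}
```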

The caching can be disabled completely if required. An InMemory cache is provided, along with examples of an ADO / SQL Client cache and a MongoDB cache (which is the one I’m using myself).

What doesn’t it do?

At the moment it works for the base oEmbed endpoint only but I plan on adding support for the Preview and Objectify endpoints in the future.

Where do I get it?

You can download the source from GitHub: https://github.com/CaptainCodeman/embedly-dotnet or get a binary version as a NuGet package:

http://nuget.org/List/Packages/embedly

NOTE: I’ll probably be splitting the NuGet version into a core / base package and separate cache providers to avoid bloating the dependencies.

How do I use it?

The source includes a sample project showing some of the ways you can use it but I’ll give a brief summary here.

Create a client

All requests go through the client which, at a minimum, needs an embedly account key. You can store the key however you want (the sample shows it stored in a .config file using the standard .NET ConfigurationManager). You can sign up for a free account at http://embed.ly/pricing to get a key.

var key = ConfigurationManager.AppSettings["embedly.key"];
var client = new Client(key);

Use a Cache

If you want to use a cache then this should be passed into the client constructor. Here’s an example using the MongoDB cache:

var key = ConfigurationManager.AppSettings["embedly.key"];
var database = ConfigurationManager.ConnectionStrings["embedly.cache"];
var cache = new MongoResponseCache(database.ConnectionString);
var client = new Client(key, cache);

The final optional parameter when creating a client is the embedly request timeout. If the HTTP request to embedly takes longer than this then it is aborted and an exception returned instead of the embedly result. The default timeout for requests is 30 seconds.

List Embed.ly Providers

Once you have a client then you can see the list of providers that embedly supports:

foreach (var provider in client.Providers)
{
    Console.WriteLine("{0} {1}", provider.Type, provider.Name);
}

Check if a URL is supported:

Embed.ly supports over 200 different providers (all the big names like YouTube) although they will return results for the non-provider backed requests too.

var url = new Uri(@"http://www.youtube.com/watch?v=YwSZvHqf9qM");
var supported = client.IsUrlSupported(url);

Get provider information for a URL:

You can get the provider for a URL (this does not make any additional requests to embedly beyond the initial retrieval of the provider list itself).

var url = new Uri(@"http://www.youtube.com/watch?v=YwSZvHqf9qM");
var supported = client.IsUrlSupported(url);
Console.WriteLine("Supported      : {0}", supported);
Console.WriteLine();

var provider = client.GetProvider(url);
Console.WriteLine("PROVIDER");
Console.WriteLine("About          : {0}", provider.About);
Console.WriteLine("DisplayName    : {0}", provider.DisplayName);
Console.WriteLine("Domain         : {0}", provider.Domain);
Console.WriteLine("Favicon        : {0}", provider.Favicon);
Console.WriteLine("Name           : {0}", provider.Name);
Console.WriteLine("Regexs         : {0}", string.Join(", ", provider.Regexs));
Console.WriteLine("Subdomains     : {0}", string.Join(", ", provider.Subdomains));
Console.WriteLine("Types          : {0}", provider.Type);

Get the oEmbed information for a single URL:

The API supports single URL requests.

var url = new Uri(@"http://www.youtube.com/watch?v=YwSZvHqf9qM");
var result = client.GetOEmbed(url, new RequestOptions { MaxWidth = 320 });

// basic response information
var response = result.Response;
Console.WriteLine("Type           : {0}", response.Type);
Console.WriteLine("Version        : {0}", response.Version);

// link details
var link = result.Response.AsLink;
Console.WriteLine("Author         : {0}", link.Author);
Console.WriteLine("AuthorUrl      : {0}", link.AuthorUrl);
Console.WriteLine("CacheAge       : {0}", link.CacheAge);
Console.WriteLine("Description    : {0}", link.Description);
Console.WriteLine("Provider       : {0}", link.Provider);
Console.WriteLine("ProviderUrl    : {0}", link.ProviderUrl);
Console.WriteLine("ThumbnailHeight: {0}", link.ThumbnailHeight);
Console.WriteLine("ThumbnailUrl   : {0}", link.ThumbnailUrl);
Console.WriteLine("ThumbnailWidth : {0}", link.ThumbnailWidth);
Console.WriteLine("Title          : {0}", link.Title);
Console.WriteLine("Url            : {0}", link.Url);

// video specific details
var video = result.Response.AsVideo;
Console.WriteLine("Width          : {0}", video.Width);
Console.WriteLine("Height         : {0}", video.Height);
Console.WriteLine("Html           : {0}", video.Html);

Get oEmbed information for a list of URLs:

Any IEnumerable<Uri> list of URLs can be processed as a batch. The .NET client will return results as they arrive.

var results = client.GetOEmbeds(urls, new RequestOptions { MaxWidth = 320 });

Limit the URLs to request to supported providers only:

Embedly can return results for ‘unsupported’ providers, but the supported ones typically have richer content.

var results = client.GetOEmbeds(urls, provider => provider.IsSupported);

Limit the URLs to request to a single provider:

A lambda expression enables the request to be filtered on any property of the provider identified for a URL.

var results = client.GetOEmbeds(urls, provider => provider.Name == "youtube");

Limit the URLs to request based on the type of provider:

Each provider has a Type to indicate the content they return so if you are only interested in video links you can filter on that type.

var results = client.GetOEmbeds(urls, provider => provider.Type == ProviderType.Video);

NOTE: ‘urls’ is an IEnumerable<Uri> in the above.

NOTE: RequestOptions enables a number of additional request arguments to be specified, see: http://embed.ly/docs/endpoints/arguments

The Result returned contains the original request (URL and any matching provider) and either an Exception (if the HTTP request failed) or a Response, which could be an embedly Error (used to indicate, for instance, that the URL being inspected doesn’t exist) or one of the specific response types (Link, Photo, Rich and Video).

Extension methods enable the results to be filtered as a convenience for:

result.Success()

Returns results that were successful only

result.Failed()

Returns results that failed (HTTP error during request to embedly)

result.Errors()

Returns results where embedly responded with an error code, i.e. the request to embedly was successful but, for example, the URL doesn’t exist

result.Link()

Returns results that are of type Link

result.Photos()

Returns results that are of type Photo

result.Richs()

Returns results that are of type Rich

result.Videos()

Returns results that are of type Video

If you are iterating over multiple results and want to handle them correctly, the first step is to check each result’s Exception property. If there was an exception during the HTTP request to embedly then this will be set. If it is null then the request to embedly was successful in that embedly returned a response, but that response may be an Error, a Link, a Photo, a Rich or a Video. The Response.Type will indicate the kind of response, and the As[type] property is a convenient way to get the Response as that particular type.

foreach (var result in results)
{
    if (result.Exception == null)
    {
        Console.WriteLine("{0} found for {1} ({2})", result.Response.Type, result.Request.Url, result.Request.Provider.Name);
        switch (result.Response.Type)
        {
            case ResourceType.Error:
                var error = result.Response.AsError;
                Console.WriteLine("  code:{0} message:{1}", error.ErrorCode, error.ErrorMessage);
                break;
            case ResourceType.Link:
                var link = result.Response.AsLink;
                Console.WriteLine("  title:{0}", link.Title);
                Console.WriteLine("  url:{0}", link.Url);
                break;
            case ResourceType.Photo:
                var photo = result.Response.AsPhoto;
                Console.WriteLine("  title:{0} ({1}x{2})", photo.Title, photo.Width, photo.Height);
                Console.WriteLine("  url:{0}", photo.Url);
                break;
            case ResourceType.Rich:
                var rich = result.Response.AsRich;
                Console.WriteLine("  title:{0} ({1}x{2})", rich.Title, rich.Width, rich.Height);
                Console.WriteLine("  url:{0}", rich.Url);
                break;
            case ResourceType.Video:
                var video = result.Response.AsVideo;
                Console.WriteLine("  title:{0} ({1}x{2})", video.Title, video.Width, video.Height);
                Console.WriteLine("  url:{0}", video.Url);
                break;
        }
    }
    else
    {
        Console.WriteLine("Exception requesting {0} : {1}", result.Request.Url, result.Exception);
    }
}

Logging

The library uses the Common.Logging 2 library so you can plug it in to whatever your preferred logging framework is. The log output isn’t very rich right now but I’ll be expanding that in future so you can peek into what is happening.

Reactive Extensions

The other dependency is the Reactive Extensions which I’m new to but it really made the caching of individual LINQ responses much easier than it would otherwise be. The Push vs Pull model allows the pipeline to be split with cached items going to the return pipeline immediately and non-cached requests going through the full download pipeline. I’ll try and make a further post describing how this works.
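
Until then, here is a minimal sketch of the split-pipeline idea, not the library’s actual code. It assumes the System.Reactive NuGet package, uses plain strings in place of the real request and response types, and SplitPipeline, Resolve and Download are hypothetical names. Cached URLs are answered straight from a dictionary, while uncached URLs are buffered into groups of up to 20 and passed to a stand-in download step; the two branches are then merged into one result stream.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reactive.Linq;

public static class SplitPipeline
{
    // Build one result stream from two branches over the same URL stream.
    public static IObservable<string> Resolve(
        IEnumerable<string> urls, IDictionary<string, string> cache)
    {
        // Publish shares a single pass over the URLs between both branches.
        return urls.ToObservable().Publish(shared =>
        {
            // Branch 1: cache hits go straight to the output.
            var hits = shared
                .Where(url => cache.ContainsKey(url))
                .Select(url => cache[url]);

            // Branch 2: cache misses are grouped into batches of up to 20
            // and sent through the (stand-in) download step.
            var misses = shared
                .Where(url => !cache.ContainsKey(url))
                .Buffer(20)
                .SelectMany(batch => Download(batch));

            return hits.Merge(misses);
        });
    }

    // Stand-in for the real HTTP request: one call per batch of URLs.
    private static IEnumerable<string> Download(IList<string> batch)
    {
        return batch.Select(url => "fetched " + url);
    }

    public static void Main()
    {
        var cache = new Dictionary<string, string>
        {
            { "http://a.example/", "cached http://a.example/" }
        };
        var urls = new[] { "http://a.example/", "http://b.example/", "http://c.example/" };

        Resolve(urls, cache).Subscribe(line => Console.WriteLine(line));
    }
}
```

In this shape a cached result never has to wait for a batch to fill up or for a download to finish, which is the main attraction of splitting the pipeline.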

Roadmap

I’d like to add support for the other embedly endpoints (Preview and Objectify) although I’m not using them myself at the moment – let me know if you’d find these useful.

Some custom Windows Performance Counters would probably be good to track how many requests are going through the library and what the cache-hit ratio is.

The current caching system is very simple and doesn’t have much support for expiring items which should be added.

Feedback

If you find the library useful or have any comments or suggestions to improve things I’d welcome any feedback.