Facebook Open Graph previews not working with GitHub Pages and CNAME records

Tags: Facebook Open Graph, github, CNAME, and Facebook Preview

Fixing Facebook previews of external links can be easy, but sometimes it's not. In my case I found this solution only after a good deal of research and trial and error.

This is what a Facebook preview should look like.

If you have a page hosted on GitHub Pages, as I do, and you set up a custom domain for it through CNAME records, as I did, you may have noticed that sharing the page (or anything on it) on Facebook produces no preview. Even if you add the Open Graph metadata tags, nothing shows up. So how do we get a functioning preview?

My first stop was the Facebook debugger tool, which shows the preview generated for a link, lists any errors, lets you refresh the preview, and gives some other useful information. It told me "Error parsing input URL, no data was cached, or no data was scraped." and offered nothing else except a very generic non-preview. At the bottom of the page there's a link to see what the scraper has seen, so I clicked it and was puzzled, to say the least: "Document returned no data".

I tried different solutions. The first thing I did (honestly, I should have done it long before) was to remove all tags from the "description" part of the metadata. I used to have <p> tags wrapped around it (because I just used the raw page.excerpt Jekyll provides). I did this with a very simple page.excerpt | remove: '<p>' | remove: '</p>', and then I added a | truncate: 155 at the end, just to be on the safe side (Facebook should not need it, but since I also use this value for other metadata, I decided to go with it). This didn't change a single thing.
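For anyone who wants to check what that filter chain does to an excerpt outside of Jekyll, here is a minimal Python sketch of the same idea. The function name clean_description is made up for illustration, and the extra whitespace stripping is my addition (the Liquid chain above does not do it). The truncation follows Liquid's documented behaviour, where the "..." counts towards the character limit.

```python
def clean_description(excerpt: str, limit: int = 155, ellipsis: str = "...") -> str:
    """Approximate `page.excerpt | remove: '<p>' | remove: '</p>' | truncate: 155`."""
    # Drop the literal <p> and </p> tags, like Liquid's `remove` filter.
    text = excerpt.replace("<p>", "").replace("</p>", "")
    # Assumption: also trim surrounding whitespace (not part of the original chain).
    text = text.strip()
    if len(text) <= limit:
        return text
    # Liquid's `truncate` includes the ellipsis in the character count.
    keep = max(limit - len(ellipsis), 0)
    return text[:keep] + ellipsis


if __name__ == "__main__":
    print(clean_description("<p>A short excerpt.</p>\n"))
    print(len(clean_description("<p>" + "x" * 300 + "</p>")))
```

This only handles plain <p> tags, exactly like the filter chain; an excerpt with other markup (links, emphasis) would need a more thorough clean-up.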

My next guess was bit.ly. When I was setting up the site I created a few bit.ly shortlinks to test some stuff, and bit.ly warned me that the site contained malware. I checked and double-checked, and everything was clean. Since it's Jekyll there can't really be any server-side problem (unless all of GitHub Pages were infected, which wouldn't have been my problem), so I contacted bit.ly customer support. They were very nice and resolved the issue right away (my site had somehow ended up on their internal blacklist, it's not clear why). So now that I was trying to fix the Facebook previews, I thought the problem might still be related to that incident: I was sharing bit.ly links, after all. I tried the Facebook debugger tool with the raw URL, but strangely I still got the same error.

After quite a lot of additional research, I stumbled upon a few other cases with the same errors in the Facebook previews. Looking for similarities, I saw that a few of them were hosted on GitHub Pages too. So I tried links from various GitHub-hosted pages, and most of them worked... most of them. Aha, there was a pattern! All pages on a username.github.com or username.github.io domain worked fine, while those with a custom domain didn't. Could it be that I had found the culprit? Yes, apparently I had: I quickly entered ramsesoriginal.github.com/good-start-long-absence/ in the Facebook debugger tool's address bar, pressed enter, and voilà! It showed data!

It appears that Facebook is still a tad inconsistent in how it handles the situation: when you try to refresh the scraped data, sometimes it works and sometimes it doesn't. In the debugger tool the preview gets generated correctly, but if you then try to look at it in the share dialog, or insert the URL in a Facebook post, it doesn't work.

As far as I can tell this is the best we can do for now: we use the username.github.com or username.github.io links to share on Facebook. We can even use them in bit.ly (or other URL shorteners), so the link will just redirect from bit.ly to github.com to github.io to the canonical domain. Does this work consistently? No. Is this a good solution? Hell no. Can we do better? As far as I understand, for now... sadly not.
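If you have many links to share, the workaround above can be scripted. The following is only a rough sketch under a few assumptions: the function name github_pages_share_url is hypothetical, www.example.com stands in for your custom domain, and the site is a user page served from the root of username.github.io (a project page would need its repository name added to the path).

```python
from urllib.parse import urlsplit, urlunsplit


def github_pages_share_url(canonical_url: str, username: str,
                           suffix: str = "github.io") -> str:
    """Swap the custom domain for the username.github.io host, keeping the path."""
    parts = urlsplit(canonical_url)
    host = f"{username}.{suffix}"
    # Keep scheme, path, query and fragment; only the host changes.
    return urlunsplit((parts.scheme or "http", host, parts.path or "/",
                       parts.query, parts.fragment))


if __name__ == "__main__":
    # www.example.com is a placeholder for the custom domain.
    print(github_pages_share_url(
        "http://www.example.com/good-start-long-absence/", "ramsesoriginal"))
```

The resulting URL is the one to paste into Facebook or into a URL shortener; the site itself keeps pointing to the custom domain as its canonical address.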

I'll keep this post updated in case there is any news.

P.S.: There's one more thing you should not forget to check: trailing slashes in the site's canonical URL. I fixed that somewhere along the way; you should check it too.
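A small sketch of what such a check could look like, assuming the goal is for page URLs to consistently end in a slash (as in /good-start-long-absence/ above). The helper name with_trailing_slash is made up, and paths whose last segment looks like a file name (for example /feed.xml) are deliberately left alone.

```python
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url: str) -> str:
    """Return the URL with a trailing slash on directory-style paths."""
    parts = urlsplit(url)
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    # Assumption: a dot in the last segment means a file, which gets no slash.
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path,
                       parts.query, parts.fragment))


if __name__ == "__main__":
    print(with_trailing_slash("http://www.example.com/good-start-long-absence"))
```

Whichever form you pick, the point is that the canonical URL, the og:url value and the links you share should all agree.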

~ Ramsesoriginal