If you administer a WordPress installation, you have heard about oEmbed. What are oEmbed and WordPress Embed? oEmbed is not a standard; it is just a format. Most websites describe oEmbed as an open standard. But a standard should have proper documentation and be validated and accepted by known groups, societies, bodies, or authorities. Dublin Core, for example, is a specification, and it has a corresponding wiki on W3C’s website:
https://www.w3.org/wiki/DublinCore
What is oEmbed and WordPress Embed?
I was unable to find any history or formal technical specification for oEmbed (I mean Internet protocols, RFCs, or formal standards such as those from IEEE). Everyone points to the domain oembed.com as the official website, but that website (really a GitHub-hosted domain) contains no true “specification”. What is linked as the Mailing List is a link to a Google Group. My suspicion actually started at the paragraphs under Security Considerations:
When a consumer displays any URLs, they will probably want to filter the URL scheme to be one of http, https or mailto, although providers are free to specify any valid URL. Without filtering, Javascript:… style URLs could be used for XSS attacks.
When a consumer displays HTML (as with video embeds), there’s a vector for XSS attacks from the provider. To avoid this, it is recommended that consumers display the HTML in an iframe, hosted from another domain. This ensures that the HTML cannot access cookies from the consumer domain.
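The scheme filtering recommended in the first quoted paragraph can be sketched in a few lines. This is only an illustration, not code from oEmbed or WordPress: the names `ALLOWED_SCHEMES` and `is_safe_url` are hypothetical, and the allow-list is simply the three schemes the quoted text mentions.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list: the three schemes named in the quoted text.
ALLOWED_SCHEMES = {"http", "https", "mailto"}


def is_safe_url(url: str) -> bool:
    """Return True only if the URL has an explicitly allowed scheme."""
    try:
        # Strip surrounding whitespace so " javascript:..." is not missed.
        scheme = urlsplit(url.strip()).scheme.lower()
    except ValueError:
        # Malformed URLs (for example a broken IPv6 host) are rejected.
        return False
    # Relative URLs have an empty scheme and are rejected as well.
    return scheme in ALLOWED_SCHEMES
```

A consumer would call `is_safe_url()` on every URL field in a provider response before rendering it, and drop the field when the check fails.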
The Security Considerations text prompted me to search DMOZ and W3C for the word oEmbed. DMOZ does not list any website for it, presumably out of suspicion; W3C has only this minimal information:
https://www.w3.org/2005/Incubator/federatedsocialweb/wiki/Protocols#oEmbed
The reason for suspicion became obvious: the possibility of security risks, and thereby the chance of NSA or other governmental spyware activity. Cheating and fooling should have a limit. Suppose an innocent reader comes across this:
http://www.webmonkey.com/2010/02/get_started_with_oembed/
Webmonkey has written:
The full OEmbed spec says that all requests sent to the API endpoint (Flickr in our example) must be HTTP GET requests, with any arguments sent as query parameters. Obviously any arguments you send through HTTP should be url-encoded (as per RFC 1738 in this case).
The RFC 1738 cited in that sentence, “any arguments you send through HTTP should be url-encoded (as per RFC 1738 in this case)”, is not about oEmbed. RFC 1738 describes Uniform Resource Locators (URLs).
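What the quoted sentence describes, a GET request with URL-encoded query parameters, can be sketched as follows. The endpoint and photo URL are placeholders and `build_oembed_request` is a hypothetical helper; only the `url`, `format` and `maxwidth` parameter names come from the oembed.com document. The function only builds the request URL and sends nothing.

```python
from typing import Optional
from urllib.parse import urlencode


def build_oembed_request(endpoint: str, target_url: str,
                         fmt: str = "json",
                         maxwidth: Optional[int] = None) -> str:
    """Build the GET URL a consumer would send to a provider endpoint."""
    params = {"url": target_url, "format": fmt}
    if maxwidth is not None:
        params["maxwidth"] = maxwidth
    # urlencode() percent-encodes each value, so ':' and '/' inside the
    # target URL cannot be confused with the request's own structure.
    return endpoint + "?" + urlencode(params)


# Placeholder endpoint and resource, for illustration only.
request_url = build_oembed_request(
    "https://provider.example/oembed",
    "https://provider.example/photos/123/",
)
```

The encoding itself is ordinary URL handling and nothing specific to oEmbed, which is the point made above.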
What is oEmbed and WordPress Embed? The Answer is Complex
The fact that copy-pasting a URL from certain websites turns it into something like an embedded Tweet or an iframe cannot be a reason to try to establish it as an “Open Standard”. Instead of the “official webpage”, look at the providers' own documentation, i.e.:
https://developers.facebook.com/docs/plugins/oembed-endpoints
https://developer.wordpress.com/docs/oembed-provider-api/
https://developer.yahoo.com/blogs/ydn/oembed-embedding-third-party-media-made-easy-7355.html
https://dev.twitter.com/rest/reference/get/statuses/oembed
It appears that oEmbed was originally conceived by Yahoo. There is a discussion about oEmbed on wordpress.org:
https://make.wordpress.org/core/2015/10/28/new-embeds-feature-in-wordpress-4-4/
Some WordPress developers are pushing the feature, while developers in general do not agree with the hidden fact that users actually end up with a JSON or XML output on their own website.
Criticism of WordPress
However oEmbed was developed, and even whether it is used by the NSA, is not important. It is more important that the software has an agreed list of standard features that are not open to XSS attacks. XML-RPC attacks are common and difficult to detect on PHP-FPM and Nginx, but XML-RPC possibly has some use. oEmbed is definitely an unwanted package. Removing it with PHP filters is not a practical option: the files remain on the server, and a PHP shell can be used to gain access.
Some problems were discussed at Drupal :
https://www.drupal.org/node/1175368
The risk is obvious. If our post is embedded by an innocent user inside his or her WordPress post (on a different server), then first, I can run an exploit against that site; second, if my server is hacked, the people embedding my post will get hacked too. The posts are saved in the MySQL database.
Another problem is content scraping. The case of Flickr, GitHub, Facebook, and Twitter is different: they provide public services, and their websites are configured and managed by many skilled people. For an ordinary site, fetching a JSON response from someone else's server is not a great idea. Open Graph is much safer. Facebook developed Open Graph, inspired by Dublin Core, Microformats, and RDF, and in fact it really is open.
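To illustrate the Open Graph alternative mentioned above: the metadata is plain `<meta property="og:...">` tags in the page itself, so a consumer can read it as text without rendering any provider-supplied HTML. The sketch below uses only Python's standard library; the class and function names and the sample page are made up for the example.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.properties.setdefault(prop, content)


def read_open_graph(html_text: str) -> dict:
    parser = OpenGraphParser()
    parser.feed(html_text)
    parser.close()
    return parser.properties


# Made-up sample page, for illustration only.
sample = """<html><head>
<meta property="og:title" content="Example Post" />
<meta property="og:type" content="article" />
<meta name="description" content="ignored, not an og: property" />
</head><body></body></html>"""

og = read_open_graph(sample)
```

The result is a dictionary of strings that the consumer formats itself, so the consumer, not the provider, decides what markup ends up on the page.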
Obviously there are suspicious matters around oEmbed. An “official website” is somehow hosted on GitHub with no logo, everyone refers to it, and some WordPress developers think it is great and provide no easy option to “switch it off”. Any sane person will say that there is something wrong in that scheme of promotion. oEmbed has been present since WordPress 2.9, but it has become dangerous now.
Innocent non-technical users will only get more confused by terms like RESTful API and JSON encoding.
The worst case would be the embedding site coming under a DDoS attack: the source site will suffer from the DDoS too. The old:
<a href="https://jima.in/" target="_blank">Restful Website</a>
will not do that. Providers like Twitter have caching mechanisms on separate servers and DDoS protection at several layers. There are web hosts that kick out a client whose website comes under DDoS.
Fair Usage of oEmbed
It is practical to limit the GET requests to specific IPs, such as localhost. The generated output is then suitable for display on one's own other domains, in other sections of the same domain, et cetera.
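The restriction described above can be sketched as a simple allow-list check. This is a minimal illustration using Python's standard `ipaddress` module, not WordPress code: `ALLOWED_NETWORKS` and `is_allowed_client` are hypothetical names, and the list contains only the loopback ranges. In practice the same rule would more likely be written in the web server configuration.

```python
import ipaddress

# Hypothetical allow-list: loopback only (IPv4 and IPv6 localhost).
ALLOWED_NETWORKS = [
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("::1/128"),
]


def is_allowed_client(remote_addr: str) -> bool:
    """Return True if the requesting IP may call the oEmbed endpoint."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Anything that is not a valid IP address is refused.
        return False
    # Membership is False when the address and network versions differ.
    return any(addr in network for network in ALLOWED_NETWORKS)
```

One's own other servers can be added to the list as further networks, which matches the use described above: the output stays available to your own domains and is refused to everyone else.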