Hi. Is there any way to get the keywords of a video and display it using PHP? I’m parsing Youtube API using SimpleXML. I can’t seem to display the keywords and can’t find anything on google. Does anybody know how?
Thanks!
Marcel
If you do not get a reply here, then I imagine the best place to look would be the corresponding Google Group - every Google API I have used has a matching Google support Group set up.
We can’t help you with your broken code if you’re not going to show it to us. (:
I was able to make it work.
I var_dump’ed the SimpleXML object I had and the keyword list wasn’t there. But, according to the Youtube API, it is supposed to be there.
So what I did is I used cURL, substr and strpos. Haha! Crude, I know! Whatever gets the job done, I guess.
Thanks!
I think it would be worthwhile to stick with SimpleXML. You stumbled into a common problem, whether you know it or not. I don’t know how much you know about XML, so if there are any key words here that don’t make sense then please take a little time to learn about them; they’re essential.
The problem is that SimpleXML can be a little, to put it bluntly, difficult when it comes to working with namespaced items in your XML. These can be recognised by the form ns:example, where ns is what’s called the namespace prefix and example is the local name of an element or attribute.
Namespaces, to massively generalise, group XML items together and avoid ambiguity between items with the same name. Working with some XML returned from the YouTube API (see below), you can hopefully recognise a bunch of namespaces. The XML declares its namespaces on the document element (entry), and the declarations look like xmlns:prefixname="http://example.com/namespace". These declarations set the prefix to be used within the document and the URI (Uniform Resource Identifier) that the prefix is bound to. URIs uniquely identify a namespace: an item in a namespace belongs to that (and only that) one namespace. Widely used namespaces (such as the “Media RSS” namespace at http://search.yahoo.com/mrss/) commonly use the same prefix across XML documents (e.g. media) more as a matter of convention than requirement; if you see media: it is often for Media RSS (you can always check by looking at the namespace declaration).
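If you want to check which prefixes a document declares rather than eyeballing the XML, something like the following sketch should do it. The XML string here is a made-up, cut-down entry (not real API output), and the variable names are just for illustration.

```php
<?php
// Hypothetical, cut-down XML for illustration only (not real API output).
$xml = <<<XML
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
  <media:group>
    <media:keywords>sesame, street</media:keywords>
  </media:group>
</entry>
XML;

$entry = simplexml_load_string($xml);

// getDocNamespaces() returns prefix => URI pairs declared on the root element.
// The default (unprefixed) namespace is listed under the empty-string key.
$namespaces = $entry->getDocNamespaces();

foreach ($namespaces as $prefix => $uri) {
    $label = ($prefix === '') ? '(default)' : $prefix;
    echo $label, ' => ', $uri, PHP_EOL;
}
```

The URI is the part that matters; the prefix is only a local shorthand for it.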
Moving swiftly back to SimpleXML, there are a couple of ways to access the namespaced items but, as you have seen, the usual $element->child->grandchild syntax does not work. You have to say to SimpleXML, “okay, I want to work with this namespace now, thanks!” For this, there is a dedicated method available on all SimpleXMLElement objects called children() (see the docs, and the examples at http://php.net/simplexml.examples-basic).
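To see the difference in isolation, here is a minimal sketch using a made-up XML string (the element values are invented for the demonstration). It counts what plain property access finds versus what children() finds.

```php
<?php
// Hypothetical, cut-down XML for illustration only.
$xml = <<<XML
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
  <id>example-id</id>
  <media:group>
    <media:keywords>sesame, street</media:keywords>
  </media:group>
</entry>
XML;

$entry = simplexml_load_string($xml);

// <id> is in the default (unprefixed) namespace, so plain access works.
$id = (string) $entry->id;

// <media:group> is in a prefixed namespace, so plain access finds nothing.
$plainCount = count($entry->group);

// Asking for the "media" children first makes the element reachable.
$mediaCount = count($entry->children('media', true)->group);

echo "id: $id", PHP_EOL;
echo "plain access found $plainCount group element(s)", PHP_EOL;
echo "children('media', true) found $mediaCount group element(s)", PHP_EOL;
```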
Let’s have an example. We’ll use this XML, which contains video information for Sesame Street: Ray Charles Sings “I Got A Song” With Bert & Ernie (http://www.youtube.com/watch?v=gzDS-Kfd5XQ). The XML looks like:
<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" xmlns:gd="http://schemas.google.com/g/2005" xmlns:yt="http://gdata.youtube.com/schemas/2007" gd:etag="W/&quot;A0UFQX47eCp7I2A9WhRSGUo.&quot;">
<id>tag:youtube.com,2008:video:gzDS-Kfd5XQ</id>
<published>2008-08-06T18:56:56.000Z</published>
… more stuff here…
<media:group>
<media:category label="Entertainment" scheme="http://gdata.youtube.com/schemas/2007/categories.cat">Entertainment</media:category>
… more stuff here…
<media:keywords>sesame, street, celebrity, ray, charles, ernie, bert, muppet, instruments, guitar, drums, piano</media:keywords>
<media:license type="text/html" href="http://www.youtube.com/t/terms">youtube</media:license>
<media:player url="http://www.youtube.com/watch?v=gzDS-Kfd5XQ&amp;feature=youtube_gdata_player"/>
<media:thumbnail url="http://i.ytimg.com/vi/gzDS-Kfd5XQ/default.jpg" height="90" width="120" time="00:01:04.500" yt:name="default"/>
… more stuff here…
</media:group>
<gd:rating average="4.9522257" max="5" min="1" numRaters="921" rel="http://schemas.google.com/g/2005#overall"/>
<yt:statistics favoriteCount="2149" viewCount="610256"/>
<yt:rating numDislikes="11" numLikes="910"/>
</entry>
The keywords that you’re after are held within media:keywords tags, which means that the local name is keywords (a logical enough name to hold the keywords!) within the Media RSS namespace (declared with the prefix media at the top of the XML). The children() method allows us to access items within a given namespace by using either the URI (children("http://search.yahoo.com/mrss/")) or the prefix (children("media", TRUE); the second argument tells children() that you’re giving it a prefix).
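As a rough sketch of why the URI form can be the safer of the two: below, the same namespace URI is bound to media in one made-up document and to m in another. The XML strings and variable names are invented for the demonstration.

```php
<?php
$mrss = 'http://search.yahoo.com/mrss/';

// Hypothetical XML: same namespace URI, bound once to "media" and once to "m".
$withMedia = simplexml_load_string(
    '<entry xmlns:media="http://search.yahoo.com/mrss/">'
    . '<media:group><media:keywords>bert, ernie</media:keywords></media:group>'
    . '</entry>'
);
$withM = simplexml_load_string(
    '<entry xmlns:m="http://search.yahoo.com/mrss/">'
    . '<m:group><m:keywords>bert, ernie</m:keywords></m:group>'
    . '</entry>'
);

// Both forms reach the same element when the prefix is "media".
$byPrefix = (string) $withMedia->children('media', true)->group->keywords;
$byUri    = (string) $withMedia->children($mrss)->group->keywords;

// With a different prefix, the URI still works but the "media" prefix finds nothing.
$otherByUri         = (string) $withM->children($mrss)->group->keywords;
$otherByPrefixCount = count($withM->children('media', true));

echo $byPrefix, PHP_EOL, $byUri, PHP_EOL, $otherByUri, PHP_EOL;
echo "children('media', true) on the second document: $otherByPrefixCount", PHP_EOL;
```

For a feed where the prefix is stable, the prefix form is shorter and reads nicely; the URI form simply doesn't care what the document decided to call it.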
The structure of the XML, to access the keywords, looks like
entry
└ media:group
  └ media:keywords
In SimpleXML, that would look like
$entry->children("media", TRUE)->group->keywords
In words, if $entry is the result of loading the XML, that says: give me my media namespaced child elements, get the first group from those children, and get the first keywords (still in the media namespace) from that group.
Putting all of that together into a simple example, getting an array of the keywords for that video might look like this:
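<?php
// Load the video entry; simplexml_load_file() returns FALSE on failure.
$entry = simplexml_load_file('http://gdata.youtube.com/feeds/api/videos/gzDS-Kfd5XQ?v=2');
if ($entry === FALSE) {
    exit('Could not load the video entry.');
}
// Step into the "media" namespace, then read group -> keywords as a string.
$keywords = (string) $entry->children('media', TRUE)->group->keywords;
// The keywords arrive as one comma-separated string; split it into an array.
$keywords_array = explode(", ", $keywords);
print_r($keywords_array);
?>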
The example should output something like the following if all goes well.
Array
(
[0] => sesame
[1] => street
[2] => celebrity
[3] => ray
[4] => charles
[5] => ernie
[6] => bert
[7] => muppet
[8] => instruments
[9] => guitar
[10] => drums
[11] => piano
)
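One caveat with the explode(", ", ...) call: it assumes every keyword is separated by exactly a comma and one space. If you would rather not rely on that, a slightly more forgiving split could look like the sketch below. The keyword string is made up, with deliberately messy spacing.

```php
<?php
// Hypothetical keyword string with uneven spacing and a trailing comma.
$keywords = 'sesame,street ,  celebrity, ray,';

// Split on commas with optional surrounding whitespace; drop empty pieces.
$keywords_array = preg_split('/\s*,\s*/', trim($keywords), -1, PREG_SPLIT_NO_EMPTY);

print_r($keywords_array);
```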
P.S. As for var_dump() not displaying the namespaced items, unfortunately it just doesn’t. To see what a given element contains, it is generally more useful just to call asXML() on it, which will return its XML.
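A quick way to compare the two for yourself, again with a made-up XML string (print_r() is used with its return flag here just so the dump can be captured in a variable):

```php
<?php
// Hypothetical, cut-down XML for illustration only.
$xml = <<<XML
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
  <id>example-id</id>
  <media:group>
    <media:keywords>guitar, drums</media:keywords>
  </media:group>
</entry>
XML;

$entry = simplexml_load_string($xml);

// The dump only lists what plain property access can reach (here, <id>).
$dump = print_r($entry, true);

// asXML() serialises the element as it really is, prefixed items included.
$markup = $entry->asXML();

echo $dump, PHP_EOL, $markup, PHP_EOL;
```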
Great explanation of simpleXML and its behaviour concerning namespaces - I did not know that - thanks.