This tutorial walks through creating a simple PHP search engine using Yahoo! Search BOSS. When we finish, you'll have a basic foundation upon which you can (what else could I say?) Build your Own Search Service.
Before we get started, you'll need two things:
Let's get started. (You can download the source here.)
We'll start out by putting together an HTML skeleton that we'll add our PHP code to. Go ahead and create a file to contain the code. For this tutorial we'll assume it's named search.php, but any name is okay.
This first attempt is just a stub that won't actually do anything. Copy this code and paste it into search.php:
<html>
<head><title>My Custom Search</title></head>
<body>
<form action="search.php" method="GET">
<label for="query"> Search </label>
<input name="query">
<input type="submit" value="Search">
</form>
<div id="results">
<div class="result">
<h3 class="title">A Web Result</h3>
<p class="summary">This is a summary for a web result...</p>
</div>
<div class="result">
<h3 class="title">Another Web Result</h3>
<p class="summary">This is another summary for a web result...</p>
</div>
</div>
</body>
</html>
It's pretty ugly, and it doesn't do anything, but hey, it's just a first try. We'll have it up and running in no time.
Now we're going to update our static HTML to actually perform queries. Our first change will be to add some PHP to the form so that it displays the current query.
<form action="search.php" method="GET">
<label for="query"> Search </label>
<?php
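// Escape the query so user input cannot break out of the value attribute.
echo '<input name="query" value="' . htmlspecialchars(isset($_GET['query']) ? $_GET['query'] : '', ENT_QUOTES) . '">';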
?>
<input type="submit" value="Search">
</form>
Next we need to build the query string sent to BOSS to retrieve search results. There are a number of parameters for customizing queries, but we're going to stick with the basics:
start is the result number to begin at. For example, if you've already looked at 20 results, then you'll want to begin with the 20th result (BOSS, like all well-trained machines, starts counting at zero instead of one).
count is the number of results to retrieve. By default its value is 10, but you may choose any number between 1 and 50.
appid is your BOSS Application ID (that you already signed up for, right?).
type is either xml or json, and determines the format that BOSS will return results in. By default it returns JSON, and we'll stick with that, so we won't have to specify the value explicitly.
Bringing that all together: suppose you are searching for pizza in the web vertical (a vertical is a type of search; you can choose between web, images and news, and there will be more on this topic later in the tutorial), retrieving 35 results starting with the 70th result, and your appid is pizza_lover_54321. The query would then look like this:
http://boss.yahooapis.com/ysearch/web/v1/pizza?appid=pizza_lover_54321&start=70&count=35
The PHP to compose that string might look like this:
<?php
$search_term = $_GET['query'];
$base_url = "http://boss.yahooapis.com/ysearch/";
$vertical = "web/";
$version = "v1/";
$search_term = urlencode($search_term);
$appid = "?appid=" . "pizza_lover_54321";
$start = "&start=" . "70";
$count = "&count=" . "35";
$request_url = $base_url . $vertical . $version . $search_term . $appid . $start . $count;
?>
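As a variation on the string concatenation above, here is a minimal sketch that wraps the same steps in a function and lets PHP's http_build_query() assemble the parameters. The function name build_boss_url and its argument order are our own invention for illustration; they are not part of BOSS or of the tutorial's source.

```php
<?php
// Hypothetical helper (not part of the tutorial's code): composes a BOSS
// request URL from the pieces described above.
function build_boss_url($search_term, $appid, $start = 0, $count = 10, $vertical = "web")
{
    $base_url = "http://boss.yahooapis.com/ysearch/";
    $version = "v1";
    // Encode the search term the same way the tutorial does, with urlencode().
    $path = $vertical . "/" . $version . "/" . urlencode($search_term);
    // http_build_query() joins and encodes the query-string parameters.
    $params = http_build_query(array(
        "appid" => $appid,
        "start" => $start,
        "count" => $count,
    ), "", "&");
    return $base_url . $path . "?" . $params;
}

// Reproduces the pizza example from the text.
echo build_boss_url("pizza", "pizza_lover_54321", 70, 35), "\n";
```

Keeping the parameters in an array makes it easier to add or drop one later without hunting for the right "&" or "?".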
Now let's build on that request-building code and extend it to retrieve results from BOSS.
<?php
// Default to an empty string when no query has been submitted yet.
$search_term = isset($_GET['query']) ? $_GET['query'] : "";
if ($search_term != "") {
// Build search request.
$base_url = "http://boss.yahooapis.com/ysearch/";
$vertical = "web/";
$version = "v1/";
$search_term = urlencode($search_term);
$appid = "?appid=" . "your-app-id-here";
$start = "&start=" . "0";
$count = "&count=" . "10";
$request_url = $base_url . $vertical . $version . $search_term . $appid . $start . $count;
// Send search request.
$curl_handle = curl_init($request_url);
curl_setopt($curl_handle,CURLOPT_URL, $request_url);
curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl_handle, CURLOPT_CONNECTTIMEOUT, 5);
$raw_results = curl_exec($curl_handle);
curl_close($curl_handle);
$results_dict = json_decode($raw_results);
$results = $results_dict->ysearchresponse->resultset_web;
}
else {
$results = array();
}
?>
(Note that we are using the php-json library here, which is included by default in PHP >=5.2. If you are using an earlier version of PHP, you'll need to install it. Alternatively you could use the pure PHP JSON-PHP, and slightly modify the above code. Finally, you could also retrieve the results as XML, and use the PHP XML Parser.)
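The code above assumes the request always succeeds and always returns a result set. A more defensive sketch might check each step before using it. The helper name extract_results is hypothetical, and the assumption that a response can arrive without a result set (for example, when a query has no hits) is ours rather than something stated in the BOSS documentation.

```php
<?php
// Hypothetical helper: turns the raw response body into a list of results,
// returning an empty array when anything is missing or malformed.
function extract_results($raw_results, $resultset_key = "resultset_web")
{
    // curl_exec() returns false when the request fails.
    if ($raw_results === false || $raw_results === "") {
        return array();
    }
    // json_decode() returns null when the body is not valid JSON.
    $results_dict = json_decode($raw_results);
    if (!is_object($results_dict) || !isset($results_dict->ysearchresponse)) {
        return array();
    }
    $response = $results_dict->ysearchresponse;
    // Assumption: the result set may be absent from some responses.
    if (!isset($response->$resultset_key) || !is_array($response->$resultset_key)) {
        return array();
    }
    return $response->$resultset_key;
}

// Sample data for illustration only, not real BOSS output.
$sample = '{"ysearchresponse":{"resultset_web":[{"title":"A","abstract":"B","clickurl":"http://example.com/"}]}}';
$results = extract_results($sample);
echo count($results), "\n"; // prints 1
```

With a helper like this, the rest of the page can loop over $results without caring whether the request failed.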
Now that we're successfully retrieving results, the last step in this iteration
is that we need to display the results. Go ahead and remove the
existing results div and replace it with this:
<div id="results">
<?php
foreach ($results as $result) {
echo '<div class="result">';
echo '<h3 class="title"><a href="'.$result->clickurl.'">'.$result->title.'</a></h3>';
echo '<p class="summary">'.$result->abstract.'</p>';
echo '</div>';
}
?>
</div>
Altogether, the code should now look like this:
<html>
<head><title>My Custom Search</title></head>
<body>
<form action="search.php" method="GET">
<label for="query"> Search </label>
<?php
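// Escape the query so user input cannot break out of the value attribute.
echo '<input name="query" value="' . htmlspecialchars(isset($_GET['query']) ? $_GET['query'] : '', ENT_QUOTES) . '">';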
?>
<input type="submit" value="Search">
</form>
<?php
// Default to an empty string when no query has been submitted yet.
$search_term = isset($_GET['query']) ? $_GET['query'] : "";
if ($search_term != "") {
// Build search request.
$base_url = "http://boss.yahooapis.com/ysearch/";
$vertical = "web/";
$version = "v1/";
$search_term = urlencode($search_term);
$appid = "?appid=" . "your-app-id-here"; // replace with your app-id
$start = "&start=" . "0";
$count = "&count=" . "10";
$request_url = $base_url . $vertical . $version . $search_term . $appid . $start . $count;
// Send search request.
$curl_handle = curl_init($request_url);
curl_setopt($curl_handle,CURLOPT_URL, $request_url);
curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl_handle, CURLOPT_CONNECTTIMEOUT, 5);
$raw_results = curl_exec($curl_handle);
curl_close($curl_handle);
$results_dict = json_decode($raw_results);
$results = $results_dict->ysearchresponse->resultset_web;
}
else {
$results = array();
}
?>
<div id="results">
<?php
foreach ($results as $result) {
echo '<div class="result">';
echo '<h3 class="title"><a href="'.$result->clickurl.'">'.$result->title.'</a></h3>';
echo '<p class="summary">'.$result->abstract.'</p>';
echo '</div>';
}
?>
</div>
</body>
</html>
If you do a quick search for pizza, the results should look like this:
So far we're only at fifty lines of code. Let's see what we can accomplish with a couple more.
The next piece of functionality we're going to add to our
search is the ability to page through results. To begin
with we'll focus on adding previous and next
buttons.
The first step is that we'll need to add another GET parameter,
to track which results to retrieve. To do that we'll comment out
the line where we're initializing $start and replace it with
this code:
<?php
// Some code...
//$start = "&start=" . "0";
// Use the start parameter if present, forcing it to an integer; otherwise begin at 0.
$start_val = isset($_GET['start']) ? intval($_GET['start']) : 0;
$start = "&start=" . $start_val;
// Some more code...
?>
Now you can test out your app by adding &start=15 to the
URI, and then it'll start at the 15th result. So our search can
technically page through results now, but we still need to
add a couple of links to make it easier for users.
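Because start comes straight from the URI, a visitor can type anything there, including negative numbers or text. Here is a small sketch of one way to clean it up before building the request. The helper name read_start is hypothetical, and it takes the parameter array as an argument (you would pass $_GET) so it can be tried out on its own.

```php
<?php
// Hypothetical helper: reads a non-negative integer offset from a
// request-style array such as $_GET, falling back to 0.
function read_start(array $params)
{
    if (!isset($params['start'])) {
        return 0;
    }
    // intval() turns non-numeric text into 0; max() rules out negatives.
    return max(0, intval($params['start']));
}

echo read_start(array("start" => "15")), "\n"; // prints 15
echo read_start(array("start" => "-5")), "\n"; // prints 0
```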
To accomplish that we'll need to reorganize our source
code a bit, and move our PHP block above the search form.
We're making this change because in order to implement the
logic for the previous and next links,
we'll need access to the start and count variables,
which wouldn't exist yet with our previous code's structure.
Next, we need to modify how we're creating the $count
variable, because we need access to the integer value that
is being used for $count.
<?php
// Some code...
$count_val = 10;
$count = "&count=" . $count_val;
// Some more code...
?>
Then, after the search form we can add this code:
<?php
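// Show paging links only after a search has been performed.
if (isset($_GET['query']) && $_GET['query'] != "") {
// Encode the query so spaces and special characters survive in the link.
$query_param = urlencode($_GET['query']);
if ($start_val != 0) {
echo '<a href="?query=' . $query_param . '&start=' . (intval($start_val) - intval($count_val)) . '">previous</a>';
echo '<span> | </span>';
}
echo '<a href="?query=' . $query_param . '&start=' . (intval($start_val) + intval($count_val)) . '">next</a>';
}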
?>
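The above code implements this logic for displaying previous and next links: the previous link is shown only when we are past the first batch of results (that is, when $start_val is not 0), and the next link is shown whenever there is a query.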
Calculating the previous or next batch of results is as
simple as subtracting or adding the value of $count_val
to the value of $start_val.
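To make that arithmetic concrete, here is a minimal sketch that computes both offsets in one place. The function name page_offsets is hypothetical, and clamping the previous offset at zero is our own addition rather than something the tutorial's code does.

```php
<?php
// Hypothetical helper: returns the start offsets for the previous and next
// pages. "previous" is null on the first page, where no link is shown.
function page_offsets($start_val, $count_val)
{
    $start_val = intval($start_val);
    $count_val = intval($count_val);
    return array(
        // Never go below zero, even if start is not a multiple of count.
        "previous" => $start_val > 0 ? max(0, $start_val - $count_val) : null,
        "next" => $start_val + $count_val,
    );
}

$offsets = page_offsets(20, 10);
echo $offsets["previous"], " ", $offsets["next"], "\n"; // prints 10 30
```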
With this addition, our search is now looking like this:
Now we have a mostly functional, although rather plain on the eyes, search engine. But it's missing (at least) one important feature that we've become accustomed to search engines supporting: targeting searches for either web, image or news results.
Fortunately, BOSS makes adding support for multiple search verticals (a search vertical is just a kind of search, like web, image or news) as close to painless as possible.
If we look back at the code where we are building the URI for querying BOSS, it looked like this:
<?php
$base_url = "http://boss.yahooapis.com/ysearch/";
$vertical = "web/";
$version = "v1/";
$search_term = urlencode($search_term);
$appid = "?appid=" . "your-app-id-here";
// Use the start parameter if present, forcing it to an integer; otherwise begin at 0.
$start_val = isset($_GET['start']) ? intval($_GET['start']) : 0;
$start = "&start=" . $start_val;
$count_val = 10;
$count = "&count=" . $count_val;
$request_url = $base_url . $vertical . $version . $search_term . $appid . $start . $count;
?>
Looking at the $vertical variable, we are currently assigning
it the value "web/". Changing the search vertical is as simple
as switching $vertical's value to "images/" or "news/".
Well, it's almost that simple. The other change is that the
returned results will be contained in resultset_images or
resultset_news instead of resultset_web, so we'll
have to update our code a bit more to handle that difference as well.
We'll start by updating our search form with a select widget for specifying the vertical to search in:
<form action="search.php" method="GET">
<label for="query"> Search </label>
<?php
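// Escape the query so user input cannot break out of the value attribute.
echo '<input name="query" value="' . htmlspecialchars(isset($_GET['query']) ? $_GET['query'] : '', ENT_QUOTES) . '">';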
echo '<select name="vertical">';
if ($_GET['vertical']=='web')
echo '<option value="web" selected="selected">Web</option>';
else
echo '<option value="web">Web</option>';
if ($_GET['vertical']=='images')
echo '<option value="images" selected="selected">Images</option>';
else
echo '<option value="images">Images</option>';
if ($_GET['vertical']=='news')
echo '<option value="news" selected="selected">News</option>';
else
echo '<option value="news">News</option>';
echo '</select>';
?>
<input type="submit" value="Search">
</form>
This code is a bit uglier than we'd like, because we need to
preserve the selected vertical between search pages.
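If the repeated if/else pairs bother you, one possible tidy-up is to loop over the verticals and add the selected attribute only where it applies. This is just a sketch: the function name vertical_options is hypothetical, and it returns the markup as a string rather than echoing it.

```php
<?php
// Hypothetical helper: builds the <option> tags for the vertical selector,
// marking the currently selected vertical.
function vertical_options($selected)
{
    $labels = array("web" => "Web", "images" => "Images", "news" => "News");
    $html = "";
    foreach ($labels as $value => $label) {
        $attr = ($value === $selected) ? ' selected="selected"' : '';
        $html .= '<option value="' . $value . '"' . $attr . '>' . $label . '</option>';
    }
    return $html;
}

// In the form you might pass the current vertical from the request.
echo '<select name="vertical">' . vertical_options("images") . '</select>', "\n";
```

Adding another vertical later then only means adding one entry to the $labels array.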
Next, we need to update how we're assigning the $vertical
variable its value.
<?php
// the old way
$vertical = "web/";
// the new way
$vertical = $_GET['vertical'] . "/";
if ($vertical == "/") $vertical = "web/";
?>
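Since the vertical ends up inside the request URL, it may be worth accepting only the three values we actually offer. The following is a sketch of that idea; the helper name read_vertical is hypothetical, and it takes the parameter array as an argument (you would pass $_GET).

```php
<?php
// Hypothetical helper: returns the URL path segment for the requested
// vertical, falling back to "web/" for missing or unknown values.
function read_vertical(array $params)
{
    $allowed = array("web", "images", "news");
    if (isset($params['vertical']) && in_array($params['vertical'], $allowed, true)) {
        return $params['vertical'] . "/";
    }
    return "web/";
}

echo read_vertical(array("vertical" => "news")), "\n";  // prints news/
echo read_vertical(array("vertical" => "../x")), "\n";  // prints web/
```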
We also need to update our next and previous links to include the selected vertical.
<?php
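// Show paging links only after a search has been performed.
if (isset($_GET['query']) && $_GET['query'] != "") {
$query_param = urlencode($_GET['query']);
// Carry the selected vertical along, defaulting to web.
$vertical_param = isset($_GET['vertical']) ? urlencode($_GET['vertical']) : 'web';
if ($start_val != 0) {
echo '<a href="?query=' . $query_param . '&start=' . (intval($start_val) - intval($count_val)) . '&vertical=' . $vertical_param . '">previous</a>';
echo '<span> | </span>';
}
echo '<a href="?query=' . $query_param . '&start=' . (intval($start_val) + intval($count_val)) . '&vertical=' . $vertical_param . '">next</a>';
}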
?>
Finally, we have one last step in our transition: extracting
the results list from the correct place for the selected vertical.
As mentioned before, the web vertical returns results in the
resultset_web, the images vertical in resultset_images, and
the news vertical in resultset_news.
Replace the line of code that is assigning the value to
$results with the if-elseif-else block from this code:
<?php
// some code...
if ($vertical == "images/")
$results = $results_dict->ysearchresponse->resultset_images;
else if ($vertical == "news/")
$results = $results_dict->ysearchresponse->resultset_news;
else
$results = $results_dict->ysearchresponse->resultset_web;
// some code...
?>
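Because the three result sets follow the same resultset_ naming pattern, the branching could also be replaced with a computed property name. This is only a sketch: the helper name resultset_key is hypothetical, and the JSON below is made-up sample data rather than real BOSS output.

```php
<?php
// Hypothetical helper: maps a vertical such as "images/" to the name of the
// property that holds its results, following the resultset_<vertical> pattern.
function resultset_key($vertical)
{
    $name = rtrim($vertical, "/");
    $known = array("web", "images", "news");
    // Unknown verticals fall back to web, like the else branch above.
    if (!in_array($name, $known, true)) {
        $name = "web";
    }
    return "resultset_" . $name;
}

// Usage with a decoded response object (sample data for illustration).
$results_dict = json_decode('{"ysearchresponse":{"resultset_news":[{"title":"Headline"}]}}');
$key = resultset_key("news/");
$results = $results_dict->ysearchresponse->$key;
echo count($results), "\n"; // prints 1
```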
If you try out your search service now, you can get news, image and web results. It'll look like this:
Although this tutorial won't tackle the challenge, you'll likely want to customize how each vertical's results are displayed, because they come with different metadata:
All three contain these pieces of metadata: abstract, clickurl, date,
title, and url.
Image results have filename, format, height, width, mimetype, thumbnail_height, thumbnail_url, thumbnail_width, referrerclickurl and referrerurl.
News results have language, source, sourceurl, and time.
Web results have dispurl.
Using that extra vertical-specific metadata is part of transforming a bland list of links into something much more exciting, and will make a big difference in how your users experience your search service.
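As a starting point, here is a rough sketch of how an image result might be shown as a linked thumbnail using the fields listed above. The function name render_image_result is hypothetical, the markup is only one of many possible layouts, and the sample object is made up for illustration.

```php
<?php
// Hypothetical helper: renders one image result as a thumbnail that links
// to the result's clickurl.
function render_image_result($result)
{
    $html = '<div class="result">';
    // Escape URLs before placing them inside HTML attributes.
    $html .= '<a href="' . htmlspecialchars($result->clickurl, ENT_QUOTES) . '">';
    $html .= '<img src="' . htmlspecialchars($result->thumbnail_url, ENT_QUOTES) . '"'
        . ' width="' . intval($result->thumbnail_width) . '"'
        . ' height="' . intval($result->thumbnail_height) . '">';
    $html .= '</a></div>';
    return $html;
}

// Made-up sample result, shaped like the image metadata described above.
$sample = (object) array(
    "clickurl" => "http://example.com/page",
    "thumbnail_url" => "http://example.com/t.jpg",
    "thumbnail_width" => "120",
    "thumbnail_height" => "90",
);
echo render_image_result($sample), "\n";
```

In the results loop you could call a function like this when the images vertical is selected, and keep the existing title-and-summary markup for web and news results.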
You can download the complete source for this tutorial here.
Hopefully this tutorial has been enough to get you started building your own search service with Yahoo! Search BOSS. From here you can find more details by looking at the documentation, and can ask questions in the BOSS forum.