Today I put together a basic script that scrapes the DuckDuckGo API using a cURL HTTPS POST request. Your hosting must have the PHP cURL module enabled for this to work. So let's check it out:
1. Retrieve the API URL.
2. Build some helper functions to make your work easier.
3. Get the POST data variable.
4. Parse the cURL result with a regular expression, or any other way you like. In this step I use PHP regular expressions to extract the result values.
5. Combine all of these pieces into one script, then try it on your site, or test it first on your own web server.
Demo : http://allpreview.net/up/search.php
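Since everything here depends on the cURL module, it may help to confirm it is loaded before going further. This is only a minimal sketch; the function name curl_available is made up for illustration and is not part of the script in this post.

```php
<?php
// Hypothetical helper: true when the cURL extension is loaded
// and curl_init() can actually be called.
function curl_available()
{
    return extension_loaded('curl') && function_exists('curl_init');
}

echo curl_available()
    ? "cURL is available\n"
    : "cURL is missing; ask your host to enable it\n";
```

If it reports that cURL is missing, the post_https() function in this post will fail with an undefined function error.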
1. Retrieve the API URL.
Code:
<?php
$url = "https://api.duckduckgo.com/html/";
?>
2. Build some helper functions to make your work easier.
Code:
<?php
// Return the text found between $start and $end, or "" if either is missing.
function get_string_between($string, $start, $end)
{
    $ini = strpos($string, $start);
    if ($ini === false) {
        return "";
    }
    $ini += strlen($start);
    $stop = strpos($string, $end, $ini);
    if ($stop === false) {
        return "";
    }
    return substr($string, $ini, $stop - $ini);
}

// Send an HTTPS POST request with cURL and return the response body.
function post_https($url, $post_data)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);
    // Verify the server certificate and host name. Disabling these checks
    // (0 and 0) exposes the request to man-in-the-middle attacks.
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 1);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
    curl_setopt($ch, CURLOPT_USERAGENT, "");
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data);
    $response = curl_exec($ch);
    if (curl_errno($ch)) {
        // Print the cURL error so a failed request is easy to spot.
        $error_message = curl_error($ch);
        $error_no = curl_errno($ch);
        echo "error_message: " . $error_message . "<br>";
        echo "error_no: " . $error_no . "<br>";
    }
    curl_close($ch);
    return $response;
}
?>
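To see what get_string_between() returns, here is a small check you can run on its own. It carries a compact copy of the helper with explicit not-found checks so it runs standalone, and the sample markup is made up for illustration, not real DuckDuckGo output.

```php
<?php
// Copy of the helper so this snippet runs on its own.
function get_string_between($string, $start, $end)
{
    $ini = strpos($string, $start);
    if ($ini === false) {
        return "";
    }
    $ini += strlen($start);
    $stop = strpos($string, $end, $ini);
    if ($stop === false) {
        return "";
    }
    return substr($string, $ini, $stop - $ini);
}

// Made-up sample markup (an assumption, not a real response).
$sample = '<a href="https://example.com/">Example</a><div class="snippet">Hello</div>';

echo get_string_between($sample, '<div class="snippet">', '</div>'), "\n"; // Hello
echo get_string_between($sample, 'href="', '">'), "\n";                    // https://example.com/
var_dump(get_string_between($sample, '<span>', '</span>'));                // string(0) ""
```

The helper only finds the first occurrence of the start marker, which is why the script later runs it once per result block rather than on the whole page.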
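3. Get the POST data variable.
Code:
<?php
$post_data = array(
    // Search term submitted by the form; empty string if nothing was posted.
    'q' => isset($_POST['q']) ? $_POST['q'] : '',
    // Region and language setting.
    'kl' => 'en-us',
);
?>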
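One detail worth knowing about the POST data: when CURLOPT_POSTFIELDS is given an array, cURL sends it as multipart/form-data, while a URL-encoded string is sent as application/x-www-form-urlencoded, which is what an ordinary HTML form would send. This post does not establish which one the endpoint prefers, so treat the following as an optional sketch; the search term is a made-up example.

```php
<?php
// Example values only; in the real script 'q' comes from $_POST.
$post_data = array(
    'q'  => 'hello world',
    'kl' => 'en-us',
);

// Turn the array into a URL-encoded string such as a browser form would send.
$encoded = http_build_query($post_data);

echo $encoded, "\n"; // q=hello+world&kl=en-us
```

Passing $encoded instead of the array to post_https() would switch the request to the URL-encoded form.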
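4. Parse the cURL result with a regular expression.
Code:
<?php
$response = post_https($url, $post_data); // post_https() is the cURL helper defined earlier
// Capture each result block, from its opening <div> up to its <div class="url"> element.
$regex = '#<div class="[^>]*results_links_deep web-result">(.*?)<div class="url">#s';
$match = preg_match_all($regex, $response, $r);
if ($match > 0) {
    for ($i = 0; $i < $match; $i++) {
        // Strip runs of whitespace so the string helpers match reliably.
        $a = preg_replace('/\s\s+/', '', $r[1][$i]);
        // Collect every anchor tag in the block; the second match is used as the link.
        preg_match_all('#<a[^>]*href="[^>]*>(.*?)</a>#s', $a, $k);
        // Grab the favicon image tag, if there is one.
        preg_match('#<img[^>]*src="[^>]*/>#s', $a, $li);
        $link = isset($k[0][1]) ? $k[0][1] : '';
        $fav = isset($li[0]) ? $li[0] : '';
        $ceklink = get_string_between($a, 'href="', '">');
        //$fav = str_replace('src="/','src='.$ceklink,$fav);
        $desc = get_string_between($a, '<div class="snippet">', '</div>');
        echo $link . '<br>';
        echo $desc . '<br>';
    }
}
?>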
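As mentioned, regex is only one way to pull out the results. If the markup changes slightly, a DOM parser tends to be more forgiving. Below is a rough sketch using PHP's DOMDocument and DOMXPath; the function name parse_results and the sample HTML are assumptions made for illustration, shaped after the class names the regex above looks for, and are not taken from a real response.

```php
<?php
// Hypothetical sample markup, modeled on the classes the regex expects.
$html = '<div class="links_main results_links_deep web-result">'
      . '<a class="large" href="https://example.com/">Example</a>'
      . '<div class="snippet">An example snippet.</div>'
      . '<div class="url">example.com</div>'
      . '</div>';

// Hypothetical helper: return an array of results (href, title, desc).
function parse_results($html)
{
    $doc = new DOMDocument();
    // Keep warnings about imperfect real-world HTML out of the output.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $results = array();
    // Each result lives in a <div> whose class attribute contains "web-result".
    foreach ($xpath->query('//div[contains(@class, "web-result")]') as $node) {
        $a = $xpath->query('.//a[@href]', $node)->item(0);
        $s = $xpath->query('.//div[contains(@class, "snippet")]', $node)->item(0);
        $results[] = array(
            'href'  => $a ? $a->getAttribute('href') : '',
            'title' => $a ? trim($a->textContent) : '',
            'desc'  => $s ? trim($s->textContent) : '',
        );
    }
    return $results;
}

$results = parse_results($html);
print_r($results);
```

In the real script you would pass $response from post_https() instead of the sample string. This needs the PHP DOM extension, which most hosts enable by default.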