hamidjoukar Posted September 3, 2015

When I read the HTML source of the link below

http://www.dresslink.com/women-candy-color-handbag-leather-cross-body-shoulder-bag-bucket-bag-p-10908.html

I can find the following data about the product:

<script type="text/javascript">
DL.item.stock['ss42356']=[];
DL.item.stock['ss42356']['qty']=56;
DL.item.stock['ss42356']['sku']='SV000837_B';
DL.item.stock['ss42356']['inexistence']=0;
DL.item.stock['ss42356']['down_shelf']=0;
DL.item.stock['ss42356']['procurement_cycle']='8';
DL.item.stock['ss42356']['paid_set']=[];
DL.item.stock['ss42356']['paid_set'].push(35630);
DL.item.color_image['35630']='of7ea7';
DL.item.stock['ss42357']=[];
DL.item.stock['ss42357']['qty']=29;
DL.item.stock['ss42357']['sku']='SV000837_G';
DL.item.stock['ss42357']['inexistence']=0;
DL.item.stock['ss42357']['down_shelf']=0;
DL.item.stock['ss42357']['procurement_cycle']='6';
DL.item.stock['ss42357']['paid_set']=[];
DL.item.stock['ss42357']['paid_set'].push(35631);
DL.item.color_image['35631']='of710e';
DL.item.stock['ss42358']=[];
DL.item.stock['ss42358']['qty']=14;
DL.item.stock['ss42358']['sku']='SV000837_BR';
DL.item.stock['ss42358']['inexistence']=0;
DL.item.stock['ss42358']['down_shelf']=0;
DL.item.stock['ss42358']['procurement_cycle']='17';
DL.item.stock['ss42358']['paid_set']=[];
DL.item.stock['ss42358']['paid_set'].push(35632);
DL.item.color_image['35632']='of77c1';
DL.item.stock['ss42359']=[];
DL.item.stock['ss42359']['qty']=36;
DL.item.stock['ss42359']['sku']='SV000837_O';
DL.item.stock['ss42359']['inexistence']=0;
DL.item.stock['ss42359']['down_shelf']=0;
DL.item.stock['ss42359']['procurement_cycle']='7';
DL.item.stock['ss42359']['paid_set']=[];
DL.item.stock['ss42359']['paid_set'].push(35633);
DL.item.color_image['35633']='of7136';
</script>

I need to know the quantity for each SKU, so I need to produce a simple array containing each SKU name and its quantity, like this:

$a = array(
    'SV000837_B' => '56',
    'SV000837_G' => '29',
    'SV000837_BR' => '14',
    'SV000837_O' => '36',
);

Please help me write PHP code, using regex or any other method, that produces the above array.
Ch0cu3r Posted September 3, 2015

Try

<?php
// webpage you are scraping the javascript code from
$page_url = 'http://www.dresslink.com/women-candy-color-handbag-leather-cross-body-shoulder-bag-bucket-bag-p-10908.html';

// load the webpage into DOMDocument
// (suppress the warnings libxml raises for malformed real-world HTML)
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTMLFile($page_url);

// use XPath to return the second <script> element inside the <div class="dd1"> element
// this is where the javascript code containing the stock array is in the webpage
$xpath = new DOMXPath($doc);
$result = $xpath->query('//div[@class="dd1"]/script[2]');

// retrieve the node element value; item(0) returns null if nothing matched
$node = $result->item(0);
if ($node === null) {
    exit('Stock <script> element not found');
}
$JS_stock_array_code = $node->nodeValue;

// use regex to find the qty and sku values
// group 1 is the quoted stock id, group 2 the qty, group 3 the sku;
// the "s" modifier and lazy ".+?" let a qty/sku pair span line breaks
preg_match_all("~\[('\w+')\]\['qty'\]=(\d+);.+?\[\\1\]\['sku'\]='(\w+)'~s", $JS_stock_array_code, $matches);

// loop through the results and define the sku array
// the sku is used as the array key
// the quantity is then assigned to the sku
$skus = array();
foreach ($matches[3] as $key => $sku) {
    $qty = $matches[2][$key];
    $skus[$sku] = $qty;
}

// output the $skus array
printf('<pre>%s</pre>', print_r($skus, true));

Output for me is

Array
(
    [SV000837_B] => 49
    [SV000837_G] => 26
    [SV000837_BR] => 11
    [SV000837_O] => 35
)
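The same extraction can be tried offline, without fetching the page, by running a regex over a copy of the JavaScript text. The sketch below is only an illustration and not part of the reply above: the function name `extractSkuQuantities` and the trimmed sample string are assumptions. Instead of matching a qty/sku pair in one pattern, it captures every `qty` and `sku` assignment separately and groups them by stock id, so it should not matter which of the two comes first or how the lines are laid out.

```php
<?php
// Hypothetical helper (not from the thread): collects every
// stock['<id>']['qty'|'sku']=<value>; assignment, groups the values
// by stock id, then maps each sku to its quantity.
function extractSkuQuantities(string $js): array
{
    // group 1: stock id, group 2: field name, group 3: value (quotes optional)
    $pattern = "~stock\['(\w+)'\]\['(qty|sku)'\]\s*=\s*'?(\w+)'?\s*;~";
    preg_match_all($pattern, $js, $matches, PREG_SET_ORDER);

    // group the captured fields by stock id
    $stock = [];
    foreach ($matches as $m) {
        [, $id, $field, $value] = $m;
        $stock[$id][$field] = $value;
    }

    // keep only entries that have both a sku and a qty
    $skus = [];
    foreach ($stock as $entry) {
        if (isset($entry['sku'], $entry['qty'])) {
            $skus[$entry['sku']] = $entry['qty'];
        }
    }
    return $skus;
}

// Sample input: a shortened copy of the data quoted in the question.
$js = <<<'JS'
DL.item.stock['ss42356']=[];
DL.item.stock['ss42356']['qty']=56;
DL.item.stock['ss42356']['sku']='SV000837_B';
DL.item.stock['ss42356']['inexistence']=0;
DL.item.stock['ss42357']=[];
DL.item.stock['ss42357']['qty']=29;
DL.item.stock['ss42357']['sku']='SV000837_G';
JS;

print_r(extractSkuQuantities($js));
```

The quantities come back as strings, which matches the array format asked for in the question; cast them with `(int)` if numbers are needed.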