asp.net - Scraping using Html Agility Package


I am trying to scrape data from a news article using the Html Agility Pack. The link is as follows: http://www.ndtv.com/india-news/vyapam-scam-documents-show-chief-minister-shivraj-chouhan-delayed-probe-780528

I have written the following code to extract the comments in the article, but for some reason the variable aTags is returning a null value.

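Code:

    // Download and parse the page at the URL typed into the text box.
    var getHtmlWeb = new HtmlWeb();
    var document = getHtmlWeb.Load(txtInputUrl.Text);
    // SelectNodes returns null (not an empty collection) when nothing matches.
    var aTags = document.DocumentNode.SelectNodes("//div[@class='com_user_text']");
    int counter = 1;
    if (aTags != null)
    {
        foreach (var aTag in aTags)
        {
            // Number each comment; appending lblOutput.Text to itself would duplicate earlier output.
            lblOutput.Text += counter + ". " + aTag.InnerHtml + "\t" + "<br />";
            counter++;
        }
    }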

I have also tried the following XPath, but I still get the same result: //div[@class='newcomment_list']/ul/li/div[@class='headerwrap']/div[@class='com_user_text']. Please help me correct the XPath to extract the comments. I have searched the net but found no solution.

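Do a 'View Source' on the page and search for com_user_text. The user comments don't appear at all. They are loaded via JavaScript after the page has loaded. When you load the page content via getHtmlWeb.Load(), you get only the HTML as served, without the user comments, so SelectNodes finds nothing to match and returns null.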
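
To see this behaviour without hitting the network, here is a minimal sketch that runs the same XPath against two hand-written HTML strings. Both strings, the class name CommentProbe, and the helper CountMatches are made up for illustration; they only imitate what the page might look like before and after its scripts run, and are not taken from the actual site.

```csharp
using System;
using HtmlAgilityPack;

public static class CommentProbe
{
    // Hypothetical helper: counts the nodes matching an XPath expression.
    // HtmlAgilityPack's SelectNodes returns null when nothing matches,
    // so null is mapped to 0 here.
    public static int CountMatches(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        return nodes == null ? 0 : nodes.Count;
    }

    public static void Main()
    {
        const string xpath = "//div[@class='com_user_text']";

        // Assumed shape of the markup as served: an empty container plus a script.
        string asServed =
            "<html><body><div id='comments'></div>" +
            "<script src='comments.js'></script></body></html>";

        // Assumed shape of the markup after a browser has run the script.
        string afterScripts =
            "<html><body><div id='comments'>" +
            "<div class='com_user_text'>First comment</div></div></body></html>";

        Console.WriteLine(CountMatches(asServed, xpath));     // 0
        Console.WriteLine(CountMatches(afterScripts, xpath)); // 1
    }
}
```

If the first case is what HtmlWeb.Load() receives, the XPath itself is fine and no change to it will help; the comment markup simply is not in the downloaded document.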

As this answer says, the Html Agility Pack is not a tool capable of emulating a browser and running JavaScript. Instead, you need WatiN, which allows programmatic access to web pages through a given browser engine and loads the full document.

