testing4l wrote in linux

This one has me _stumped_

and that's saying something.

I've never been a fan of web technologies, and now I'm learning new ways to curse JavaScript for making it so bloody hard to get a simple web page.

I want to pull data out of a newspaper article on the Washington Post, and I want to do it by some automatic means. I've been looking at using curl, which works beautifully for the New York Times and the LA Times.

I've been trying this command line:


curl -D /tmp/blah -c /siftology/local/crawlers/cookies/wpostcookie -A "Mozilla/4.0" -b 'Password=cpunks' -b 'MemberName=cpunks@cpunks.com' -b " =Submit Button" -b 'UserIdRembrd=true' -e 'http://www.wpost.com' 'http://www.wpost.com/ac2/wp-dyn?node=admin/registration/register&destination=login&nextstep=gather&application=reg30-globalnav&applicationURL=http://www.wpost.com'


Alas. No luck. Anyone have a way of doing this successfully?
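
One untested variant, for what it's worth: every -b in the command above sends its value as a cookie, while a login form normally expects those values as POSTed form fields. The sketch below sends the same field names with --data-urlencode, saves whatever cookies come back with -c, and replays them with -b on a later request. It assumes the site really takes MemberName, Password and UserIdRembrd as form fields and wants no extra hidden fields, which I can't confirm. The function names, the DRY_RUN switch and the default cookie jar path are made up for illustration, and the odd " =Submit Button" field is left out.

```shell
#!/bin/sh
# Sketch only. Assumption: the login form accepts POSTed fields with the
# same names used in the original command; the real form may differ.

# Hypothetical default location for the cookie jar.
COOKIE_JAR="${COOKIE_JAR:-/tmp/wpostcookie}"
LOGIN_URL='http://www.wpost.com/ac2/wp-dyn?node=admin/registration/register&destination=login&nextstep=gather&application=reg30-globalnav&applicationURL=http://www.wpost.com'

# POST the credentials and store any cookies the server sets.
# With DRY_RUN=1 the command line is printed instead of executed.
wpost_login() {
    set -- curl -D /tmp/blah -c "$COOKIE_JAR" -A "Mozilla/4.0" \
        -e 'http://www.wpost.com' \
        --data-urlencode 'MemberName=cpunks@cpunks.com' \
        --data-urlencode 'Password=cpunks' \
        --data-urlencode 'UserIdRembrd=true' \
        "$LOGIN_URL"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Fetch an article ($1 is its URL), sending the saved cookies back
# and following redirects.
wpost_fetch() {
    set -- curl -b "$COOKIE_JAR" -A "Mozilla/4.0" -L "$1"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

Running `DRY_RUN=1 wpost_login` prints the command without touching the network. If the dumped headers in /tmp/blah show a redirect back to the login page, the form probably wants different field names or additional hidden fields.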