
LWP (Library for WWW in Perl)


If you want to process web pages automatically to extract data, you have a number of tools available. You can download a web page to your computer using "curl" or "wget":

curl http://aplawrence.com > mysite

If you don't really want the html, use "lynx --dump http://whatever.com > /yourstorage/whatever.txt" to get a text representation of the page. Check the man page for options you might want, such as "--nolist", and also see lynx alternatives.

You can also easily be selective and pull only the data you want from a page with simple Perl scripts.

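#!/usr/bin/perl
# Fetch a page with LWP::Simple and print its raw HTML.
use strict;
use warnings;
use LWP::Simple;

my $url = 'http://aplawrence.com';
my $content = get($url);    # get() returns undef on failure
die "Could not fetch $url\n" unless defined $content;
print $content;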

And then of course you'd process the $content as desired. It's only a little more complex if you are dealing with forms; see http://aplawrence.com/Words/2005_03_05.html for a small example of that.
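As one illustration of processing the fetched content, here is a minimal sketch that pulls the page title and the link targets out of the HTML. The helper names extract_title and extract_links are hypothetical, invented for this example, and the regular expressions assume reasonably tidy markup; for real-world pages a proper parser such as HTML::LinkExtor is likely a safer choice. The script only touches the network when a URL is given on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the text of the first <title> element,
# or undef if the page has none.
sub extract_title {
    my ($html) = @_;
    return unless defined $html;
    if ($html =~ m{<title[^>]*>(.*?)</title>}is) {
        my $title = $1;
        $title =~ s/\s+/ /g;     # collapse runs of whitespace
        $title =~ s/^ | $//g;    # trim a leading or trailing space
        return $title;
    }
    return;
}

# Hypothetical helper: return the href value of every <a> tag,
# in document order. Assumes quoted attribute values.
sub extract_links {
    my ($html) = @_;
    return () unless defined $html;
    my @links;
    while ($html =~ m{<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1}gis) {
        push @links, $2;
    }
    return @links;
}

# Fetch and report only when a URL is supplied as an argument.
if (@ARGV) {
    require LWP::Simple;
    my $content = LWP::Simple::get($ARGV[0]);
    die "Could not fetch $ARGV[0]\n" unless defined $content;
    my $title = extract_title($content);
    print "Title: ", (defined $title ? $title : '(none)'), "\n";
    print "Link:  $_\n" for extract_links($content);
}
```

Saved as, say, pagelinks.pl (an arbitrary name), it could be run as "perl pagelinks.pl http://aplawrence.com". Keeping the extraction in small functions separate from the fetch makes it easy to try them on saved pages as well.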

A book that covers LWP is reviewed at http://aplawrence.com/Books/webc.html.

*Originally published at APLawrence.com

A.P. Lawrence provides SCO Unix and Linux consulting services http://www.pcunix.com


1 Comment

great stuff

Another great article AP
