Screen Scraping
By Alex Carter on September 23, 2024
If you would like to monitor a number or string from some arbitrary web page anywhere in the world, Dan Fruehauf’s article, HTTP Extraction With Paid Monitor, shows you how. Dan’s article not only explains the technique but also includes sample code that demonstrates it. The sample code is written for M3 (the Paid Monitor Monitor Manager) and uses JSON (JavaScript Object Notation), but the article also discusses how to use regular expressions or XML instead of JSON. The 15-line sample code is available on GitHub, so you can use it without worrying about copyright issues.
HTTP Extraction
HTTP extraction has been around almost as long as the World Wide Web. The underlying technique is older still: we called it “screen scraping” back in the 1980s. It’s a simple concept: you access a web page programmatically, then extract the information you need from it. Once you have the information in a variable inside your program, you are limited only by your imagination.
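To make the two steps concrete, here is a minimal sketch in Python (not M3, and not Dan’s code). The function names, the sample HTML fragment, and the pattern are all made up for illustration; a real page would need its own pattern.

```python
import re
import urllib.request
from typing import Optional


def fetch_page(url: str, timeout: float = 10.0) -> str:
    """Step 1: access the page programmatically and return its body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_first(pattern: str, text: str) -> Optional[str]:
    """Step 2: return the first capture group of `pattern`, or None if absent."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


# Hypothetical fragment standing in for the result of fetch_page(...),
# so the example runs without network access.
sample_html = '<span id="temp">21.5</span>'
temperature = extract_first(r'id="temp">([-\d.]+)<', sample_html)
```

Once `temperature` holds the value, the rest of the program can log it, compare it, or hand it to a monitor.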
M3 simplifies HTTP extraction. It is a Perl-based framework that uses regular expressions, XML, or JSON to extract parameters. As Dan shows us, M3 can be used to extract data from a web page and populate a Paid Monitor monitor with the extracted data.
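The three extraction styles differ mainly in how much structure the source offers. The sketch below shows the same value pulled out of three hypothetical payloads using only Python’s standard library. It illustrates the general idea and is not M3’s Perl code or configuration syntax; the payloads, field names, and function names are assumptions.

```python
import json
import re
import xml.etree.ElementTree as ET
from typing import Optional


def from_regex(text: str) -> Optional[float]:
    """Free-form text: match a pattern around the number."""
    match = re.search(r"Temperature:\s*([-\d.]+)", text)
    return float(match.group(1)) if match else None


def from_xml(text: str) -> Optional[float]:
    """XML: walk to the element by path."""
    node = ET.fromstring(text).find("./current/temp")
    return float(node.text) if node is not None and node.text else None


def from_json(text: str) -> float:
    """JSON: parse, then index into the resulting dictionaries."""
    return float(json.loads(text)["current"]["temp"])


# Hypothetical payloads carrying the same reading in three formats.
plain = "Temperature: 21.5 C"
xml_doc = "<weather><current><temp>21.5</temp></current></weather>"
json_doc = '{"current": {"temp": 21.5}}'

readings = [from_regex(plain), from_xml(xml_doc), from_json(json_doc)]
```

Regular expressions work on anything but break easily when the page layout changes; XML and JSON are sturdier when the site offers them.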
A World of Possibilities
When I first read Dan’s article, I could easily see the technical simplicity it demonstrated, but I could also see that Paid Monitor’s monitoring platform is useful for more than server monitoring. It opened my eyes to new possibilities. I’m sure many others have seen these possibilities already, but they were new to me.
“Limited only by your imagination” is the real key here. I remember when I was originally told about this new Internet thing (1988, I think). I couldn’t get my mind around why everyone would cooperate to build this giant network. Someone had a better imagination than I did. Today, some people will have the same problem. “Okay, I can extract data from web pages. So what?” The lack of imagination obscures opportunities that are available to us.
Sysadmins and other techies may limit their thinking to things like performance, security, and availability (which is good), but may not recognize opportunities to monitor non-technical data. Dan’s article gives one non-technical example, but there are countless others. For example, would your company like to monitor your competition’s website? Your marketing gurus might like to know about any changes. Press releases, new products, new services, and changes to specifications are just a few of the possibilities.
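One simple way to notice that a page has changed is to store a fingerprint of its text and compare it on the next visit. The Python sketch below assumes you already have the page text in hand; the function names and sample strings are made up for illustration, and this is not how Paid Monitor itself detects changes.

```python
import hashlib


def fingerprint(page_text: str) -> str:
    """Hash the page text after collapsing whitespace, so that
    reformatting alone does not count as a change."""
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint: str, page_text: str) -> bool:
    """Compare the current page against the stored fingerprint."""
    return fingerprint(page_text) != previous_fingerprint


# Hypothetical page text from two visits.
baseline = fingerprint("New  product:\nWidget 2")
same_content = has_changed(baseline, "New product: Widget 2")
new_content = has_changed(baseline, "New product: Widget 3")
```

A fingerprint only says that something changed, not what. To report the difference you would keep the previous text and compare the two versions.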
We know CEOs of publicly traded companies like to keep a close eye on their companies’ stock prices. A Paid Monitor monitor could send them a text message when the price drops below a preset level.
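The check behind such an alert is a single comparison. Here is a minimal Python sketch, assuming the price has already been extracted from a page; the function name, ticker symbol, and message format are hypothetical, and actually sending the text message is left to the monitoring service.

```python
from typing import Optional


def price_alert(symbol: str, price: float, floor: float) -> Optional[str]:
    """Return an alert message when the price is below the preset
    level, or None when no alert is needed."""
    if price < floor:
        return f"{symbol} fell to {price:.2f}, below the {floor:.2f} threshold"
    return None


# Hypothetical values: one reading under the floor, one exactly at it.
triggered = price_alert("ACME", 9.5, 10.0)
quiet = price_alert("ACME", 10.0, 10.0)
```

Note that a price exactly at the preset level does not trigger the alert here, matching “drops below”.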
Your purchasing department might appreciate a monitor that tracks the prices of key raw materials. Finance departments might like to monitor their bank’s prime lending rate. Lawyers might monitor supreme court decisions that contain certain keywords. Pharmacists might like to be informed about changes to existing product specifications.
There is a myriad of possibilities, but I am limited by my imagination, which is my point. Dan’s article shows us how to track weather, but more importantly it shows us that cloud-based monitoring with M3 can be used for much more than we can imagine. In his follow-up article, Planning Your Next Vacation With Paid Monitor, Dan shows that he understands this point well. It contains additional examples that show how to monitor data that is not related to servers. This is not the main point of his article, but it is obvious that he has a good grasp of the concept.
Dan initially introduced M3 and provided examples in M3 – Paid Monitor Monitor Manager. Josh Mattson followed up with Using M3 to Take System Monitors to the Next Level, which outlined the benefits and provided use cases and examples.
Try Paid Monitor For Free. A 15-day free trial. Your opportunity to see how easy it is to use the Paid Monitor cloud-based monitoring system. Credit card not required.
The Paid Monitor Exchange at GitHub. This is the official repository for scripts, plugins, and SDKs that make it a breeze to use the Paid Monitor system to its full potential.