Web Scraping using a Microcontroller

This program connects a Wiring or Arduino module to the internet through a Lantronix serial-to-ethernet converter (XPort, WiPort, or Micro). The microcontroller first makes a TCP connection to a web server. Once connected, it sends an HTTP request for a web page. When the page comes back, the program looks for the < and > symbols, takes the string between them, and converts that string to an integer. It assumes the string is made only of numeric ASCII characters (0 – 9).
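
The request step can be sketched in a few lines. This is a desktop-side illustration in Python, not the Arduino code from this post. The host and path in the usage note are placeholders, and the choice of HTTP/1.0 with a single Host header is an assumption about what a minimal request would look like.

```python
def build_get_request(host: str, path: str) -> bytes:
    """Return the raw bytes of a minimal HTTP/1.0 GET request.

    These are the bytes a client would write to the TCP connection
    once it is open. The blank line at the end marks the end of the
    request headers.
    """
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        "",  # end of headers
        "",  # produces the final CRLF after the blank line
    ]
    return "\r\n".join(lines).encode("ascii")
```

For example, build_get_request("example.com", "/aqi.php") yields the request line, the Host header, and a terminating blank line, each ended with a carriage return and line feed.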

This program can’t easily parse an entire web page, so it’s best used with a web scraper PHP script like this one, which reads the AIRNow site and reduces the Air Quality Index to a single string like this:

< AQI: 54>

This program was written to make an air quality index meter out of an analog voltmeter.
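
The bracket parsing described above can be sketched as follows, again in Python rather than the original Arduino code. The function name is hypothetical. Skipping non-digit characters such as the "AQI: " label is an assumption made here so that the sample string parses; the original program may handle the label differently.

```python
from typing import Optional

DIGITS = "0123456789"

def parse_bracketed_int(text: str) -> Optional[int]:
    """Return the integer found between the first '<' and the next '>'.

    Returns None when there is no bracketed span or it holds no digits.
    """
    start = text.find("<")
    if start == -1:
        return None
    end = text.find(">", start + 1)
    if end == -1:
        return None
    # Keep only ASCII digits; everything else inside the brackets is ignored.
    digits = "".join(ch for ch in text[start + 1:end] if ch in DIGITS)
    return int(digits) if digits else None
```

Called on the sample string "< AQI: 54>", this returns 54. Input with no brackets or no digits returns None, so the caller can leave the meter where it is.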

The electrical connections to the microcontroller are as follows:

  • disconnected LED: Arduino digital I/O 6
  • connected LED: Arduino digital I/O 7
  • connecting LED: Arduino digital I/O 8
  • requesting LED: Arduino digital I/O 9
  • Lantronix module reset: Arduino digital I/O 10
  • Voltmeter: Arduino digital I/O 11. The voltmeter is driven with pulse width modulation (the analogWrite() command on the Arduino).
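
To drive the meter, the parsed index has to be scaled to a PWM duty value. A minimal sketch of that arithmetic, assuming the full AQI scale of 0 to 500 maps linearly onto the 0 to 255 range that analogWrite() accepts; the scaling the original program uses is not stated here, and the names below are hypothetical.

```python
AQI_MAX = 500  # top of the AQI scale (assumed full-scale meter reading)
PWM_MAX = 255  # largest duty value analogWrite() accepts

def aqi_to_pwm(aqi: int) -> int:
    """Scale an AQI value linearly to a PWM duty value, clamping out-of-range input."""
    clamped = max(0, min(aqi, AQI_MAX))
    # Integer arithmetic only, as it would be done on the microcontroller.
    return clamped * PWM_MAX // AQI_MAX
```

Clamping first keeps a bad reading from the network from wrapping around or pinning the needle past full scale.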

The web scraper is written in PHP. Its code follows below the Arduino code.



Network Data Logging Suite

This suite of programs takes data from a sensor and saves it to a text file on a network. Each sensor reading is time stamped. The suite illustrates the basic principles involved in sending sensor data to a networked file or database.
The first program is a microcontroller program written in PicBasic Pro and tested on a PIC18F258. It waits for serial input from an external program, then reads its analog sensor and sends the result out as two bytes.
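
A two-byte reading can be split and reassembled as in the sketch below, written in Python for illustration. The byte order (high byte first) and the function names are assumptions; the receiving program must use whatever order the microcontroller code actually sends.

```python
def split_reading(value: int) -> bytes:
    """Split a sensor reading (0..65535) into two bytes, high byte first."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("reading does not fit in two bytes")
    return bytes([value >> 8, value & 0xFF])

def join_reading(data: bytes) -> int:
    """Rebuild the reading from two bytes sent high byte first."""
    if len(data) != 2:
        raise ValueError("expected exactly two bytes")
    return (data[0] << 8) | data[1]
```

A reading of 1023, for instance, splits into the bytes 3 and 255, and join_reading() turns that pair back into 1023.
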
The second program is the same microcontroller code in Wiring/Arduino. Thanks to Jamie Allen for the cleanup.
The third program is a desktop computer program, written in Processing. It requests data from the microcontroller via its serial port and sends that data to a CGI program on a web server. It passes the data to the CGI program using an HTTP GET request. This program sends only once every three seconds, so as not to overwhelm the server with hits.
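
The two ideas in that program, passing the reading in a GET query string and limiting how often requests go out, can be sketched as below. This is Python, not the Processing code. The URL, the parameter name "data", and the class and function names are all placeholders.

```python
import time
from urllib.parse import urlencode

SEND_INTERVAL = 3.0  # seconds between requests, as described above

def build_log_url(base_url: str, value: int) -> str:
    """Attach the sensor value to the CGI URL as a GET query string."""
    return f"{base_url}?{urlencode({'data': value})}"

class Throttle:
    """Report whether enough time has passed since the last send."""

    def __init__(self, interval: float, clock=time.monotonic):
        self.interval = interval
        self.clock = clock  # injectable so the timing can be tested
        self.last = None

    def ready(self) -> bool:
        now = self.clock()
        if self.last is None or now - self.last >= self.interval:
            self.last = now
            return True
        return False
```

In a main loop, the program would call ready() on each pass and only request build_log_url(...) when it returns True, so readings that arrive faster than the interval are simply skipped.
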
The fourth program is a CGI (common gateway interface) program, written in PHP. It takes in data from an HTTP GET request and appends it to a text file, along with the time the request was received. Note that this program does not check how big the file is or whether the incoming data is properly formatted, so it isn’t terribly secure.
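
The append-with-timestamp step looks roughly like this, sketched in Python rather than PHP. The line format and function names are assumptions. The digits-only check is an addition that the program described above does not have; it is included only to show where such a check would go.

```python
from datetime import datetime
from typing import Optional

def format_log_line(value: str, when: datetime) -> str:
    """Build one log line: a timestamp, a space, then the reading."""
    return f"{when.strftime('%Y-%m-%d %H:%M:%S')} {value}\n"

def append_reading(path: str, value: str, when: Optional[datetime] = None) -> None:
    """Append a time-stamped reading to the text file at path."""
    # Not in the original program: reject anything that is not plain digits.
    if not (value.isascii() and value.isdigit()):
        raise ValueError("reading must be a non-negative integer")
    line = format_log_line(value, when or datetime.now())
    # Mode "a" creates the file if needed and always writes at the end.
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(line)
```

A file size limit would be a second check in the same place, before the file is opened.
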
The fifth program is another PHP script that logs the data to a MySQL database. Running it doesn’t require any change to the microcontroller code, but it does require a slight change to the Processing code. The change is in the sentToNet() method, and is noted below.
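
The database version follows the same pattern as the text file: one row per reading, with the time recorded by the database. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The table and column names are hypothetical and are not taken from the PHP script.

```python
import sqlite3

def open_log(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the readings table if it is missing."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        " id INTEGER PRIMARY KEY,"
        " logged_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,"
        " value INTEGER NOT NULL)"
    )
    return conn

def log_reading(conn: sqlite3.Connection, value: int) -> None:
    """Insert one reading; the database fills in the timestamp."""
    # A parameterized query keeps the incoming value out of the SQL text.
    conn.execute("INSERT INTO readings (value) VALUES (?)", (value,))
    conn.commit()
```

Passing the value as a query parameter, rather than pasting it into the SQL string, matters here because the value arrives from an HTTP request.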

