Beautiful Soup HTML parsing

The following Python code fetches the windspeed web page, extracts the timestamp, average windspeed, direction and gust speed from each table row, and writes the data to a date-stamped file, for example /home/user/wind_data/windspeed_2015-04-21-12.txt. Schedule a cron job to run it once a day, say at midnight. The windspeed file for a particular day can then be selected and processed by graph.py.


#!/usr/bin/python

import os
import requests
import time
from bs4 import BeautifulSoup

date_stamp = time.strftime('%Y-%m-%d-%H',(time.localtime(time.time())))

outfile = os.path.join(os.path.expanduser('~'), 'wind_data', "windspeed_%s.txt"%date_stamp)
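cells = []
r = requests.get("http://xxxxx.wwww.yyyyy")
soup = BeautifulSoup(r.content, "html.parser")    # name the parser explicitly
table = soup.find("table", {"id":"grid"})

# collect the text of every cell, row by row
for row in table.findAll('tr'):
    for cell in row.findAll('td'):
        cells.append(cell.getText())

# write one cell per line; the file is closed when the block ends
with open(outfile, 'w') as f:
    for item in cells:
        f.write("%s\n" % item)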
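
The scraper above takes it for granted that the wind_data directory already exists; if it does not, opening the output file fails. Below is a minimal sketch of building the same date-stamped path and creating the directory first. The helper name wind_data_path and its arguments are hypothetical (they are not part of the original script), and the demo uses a throwaway temporary directory and a fixed time instead of ~/wind_data and the current hour.

```python
import os
import tempfile
import time


def wind_data_path(base_dir, when=None):
    """Return base_dir/windspeed_<YYYY-MM-DD-HH>.txt, creating base_dir if needed.

    Hypothetical helper for illustration; 'when' is a time.struct_time
    and defaults to the current local time.
    """
    if when is None:
        when = time.localtime()
    if not os.path.isdir(base_dir):
        os.makedirs(base_dir)
    date_stamp = time.strftime('%Y-%m-%d-%H', when)
    return os.path.join(base_dir, "windspeed_%s.txt" % date_stamp)


# Demo with a fixed time and a temporary directory
demo_dir = os.path.join(tempfile.mkdtemp(), 'wind_data')
fixed = time.strptime('2015-04-21 12', '%Y-%m-%d %H')
demo_path = wind_data_path(demo_dir, fixed)
print(os.path.basename(demo_path))
```

In the scraper itself the call would presumably be wind_data_path(os.path.join(os.path.expanduser('~'), 'wind_data')), with the result used as outfile.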


The following Python program, graph.py, graphs the data from the windspeed text file.

#!/usr/bin/python

# This program requires the input of the date reference of the file
# created by the scraping program hha.py. That program stores the
# scraped data in a file named windspeed_2015-04-21.txt for example.
# The scraped data is in the form of date time \n ave windspeed \n
# wind direction \n gust speed \n, newest reading first, e.g.

# 21/04/15 22:10
# 7.19kt
# 40.10deg
# 11.46kt
# 21/04/15 22:00
# 5.44kt
# 32.70deg
# 10.88kt
# 21/04/15 21:50
# 6.41kt
# 40.40deg
# 10.88kt


import numpy as np
import matplotlib.pyplot as plt

#following for earlier version of file processing
date = raw_input("Enter date as yyyy-mm-dd ")
file = 'windspeed'+'_'+ date

list = open('%s.txt' % file,'r').readlines()

timestr = []        # list containing the time string e.g. 10:20
for i in list[::4]:
    v = i[-6:-1]
    timestr.append(v)

time = []        # list containing the time samples as numbers e.g. 10.2
for i in list[::4]:            # start at element 0 and step 4
    u = i[-6:-1]
    u = float(u.replace(':','.'))    # replace the time sec colon
    time.append(u)

wind_ave = []
for i in list[1::4]:            # start at element 1 and step 4
    w = float(i[:-3])        # remove the last 3 chars: kt + \n
    wind_ave.append(w)

wind_ave = wind_ave[::-1]

direction = []
for i in list[2::4]:
    y = float(i[:-4])        # remove last 4 chars: deg + \n
    direction.append(y)

direction = direction[::-1]    # oldest first, so each angle stays paired with wind_ave

gust = []
for i in list[3::4]:
    z = float(i[:-3])    # remove the last 3 characters: kt + \n
    gust.append(z)

gust = gust[::-1]

p = range(len(time))


timelabel = []
for i in timestr:
    if i in ['00:00','03:00','06:00','09:00','12:00','15:00','18:00','21:00','24:00']:
        timelabel.append(i)
    else:
        i = ' '
        timelabel.append(i)

timelabel = timelabel[::-1]

plt.xticks(p,timelabel)

plt.plot(p,gust, '-r', label = 'gust speed')    # solid red line
plt.plot(p, wind_ave, '-b', label = 'ave speed')    # solid blue line
plt.legend(loc='upper right')

plt.xlabel('time (10 min intervals)')
plt.ylabel('windspeed (kt)')
plt.title('Landguard windspeed on %s'%date)
plt.grid(True)
#plt.savefig("windspeed.png")
plt.show()


# r = np.arange(0, 3.0, 0.01)

r = 2 * np.pi/360
direction = np.asarray(direction)
theta = r * direction

ax = plt.subplot(111, polar=True)
ax.set_theta_zero_location('N')
ax.set_theta_direction(-1)
ax.scatter(theta, wind_ave, color='r', linewidth=3)
ax.set_rmax(20.0)
ax.grid(True)

ax.set_title("wind direction on a polar axis on %s"%date, va='bottom')
plt.show()
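
The slicing in graph.py relies on fixed suffix lengths and on every line ending with a newline. As an alternative sketch (not the method used above), the four-line records can be grouped and parsed with datetime.strptime, which also makes it easy to sort the readings oldest first in one place. The names parse_readings and SAMPLE are hypothetical; the sample lines are the ones shown in the program's comments, and the %d/%m/%y date format is an assumption based on the sample date 21/04/15.

```python
from datetime import datetime

# Sample data copied from the comments in graph.py (newest reading first)
SAMPLE = """21/04/15 22:10
7.19kt
40.10deg
11.46kt
21/04/15 22:00
5.44kt
32.70deg
10.88kt
21/04/15 21:50
6.41kt
40.40deg
10.88kt
"""


def parse_readings(lines):
    """Group a flat list of lines into (time, ave, direction, gust) tuples.

    Blank lines are skipped, an incomplete trailing record is ignored,
    and the result is sorted oldest first.
    """
    lines = [ln.strip() for ln in lines if ln.strip()]
    readings = []
    for i in range(0, len(lines) - 3, 4):
        stamp = datetime.strptime(lines[i], '%d/%m/%y %H:%M')
        ave = float(lines[i + 1].replace('kt', ''))
        direction = float(lines[i + 2].replace('deg', ''))
        gust = float(lines[i + 3].replace('kt', ''))
        readings.append((stamp, ave, direction, gust))
    readings.sort(key=lambda r: r[0])
    return readings


readings = parse_readings(SAMPLE.splitlines())
for stamp, ave, direction, gust in readings:
    print("%s  ave %.2f kt  dir %.1f deg  gust %.2f kt"
          % (stamp.strftime('%H:%M'), ave, direction, gust))
```

Because each tuple keeps the speed, direction and gust of one reading together, the line plot and the polar plot can both be drawn from the same sorted list without reversing each list separately.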
