Simple Web Response Time Testing with Python

For my day job, I’m creating a series of HTML pages that each have a table showing how our various services and solutions map onto problems our customers are likely to have.  The main site is currently thousands of static HTML pages, with a bit of PHP thrown into a few pages to generate page footers.  We’re working on upgrading to a dynamic, CMS-based site.  In the meantime, I used the opportunity to learn a bit more about PHP, and I wrote a small function to generate the table HTML given a JSON document describing the table headers, rows, and content.

As I was debugging the sites, I felt like there was sometimes a noticeable delay in rendering the page that wasn’t there on the existing static pages.  Was this my imagination, or something that our users might notice and complain about?  Hmm, I don’t have any web profiling software, and I couldn’t find anything that I could quickly install and run.  And I had some time.  Looks like I have to write some code.  In the immortal words of Leeroy Jenkins: “Let’s Do This!”

(Update: I found a simpler way to profile pages in Python)

Python timeit Module

Python’s mantra is Batteries Included, implying that for whatever coding task you have, there’s probably something in the standard library that will do much of what you want.  You shouldn’t have to go and write something completely from scratch.  I knew about Python’s time module.  I was planning on using it to mark the time before fetching my webpage, mark the time after fetching the page, and comparing the two.  But I stumbled onto the timeit module, which makes it even a bit easier.  Timeit basically wraps up that logic of marking time before and after some bit of code in a convenient package.  You give the timeit.Timer() class a bit of code that you want to time.  The timeit() method will run the code a specified number of times (default 1,000,000) and return the average time for code execution.  The repeat() method will run the timeit() method a specified number of times, and return a list of the average times.

In action, it looks like this:

import timeit

# Request the page 100 times, time the response time

t = timeit.Timer("h.request('http://PAGE/URL',headers={'cache-control':'no-cache'})", "from httplib2 import Http; h=Http()")
times_p1 = t.repeat(100,1)

Three lines of code...not bad.  The Timer() class takes two strings as parameters: 1) the Python code you would like repeated and timed, 2) Python code required to run before each run of the test code.  If you're familiar with Unit Testing, then the 2nd parameter is like the setUp() method.  Notice I'm using the httplib2 library instead of the standard urllib library.  I like httplib2 for requesting URLs because I'm familiar with it, it combines requesting the URL and reading its contents, and it's really good about dealing with caching.  In this case, I don't want the server to cache.

The second line instructs my Timer() to run 100 sets of my test code, with 1 trial per set.  The output is a list of 100 times.
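The same mechanics are easy to see without touching the network.  As a quick sketch (the statement and setup strings below are just placeholders, not my page-fetching code), timing a pure-Python statement shows that repeat() hands back one time per run:

```python
import timeit

# Second argument (setup) runs before each timing run;
# the first argument is the statement that actually gets timed.
t = timeit.Timer("sum(data)", "data = list(range(1000))")

# 5 runs of 1 execution each: repeat() returns a list of 5 times, in seconds
times = t.repeat(5, 1)

print(len(times))                  # 5
print(all(x >= 0 for x in times))  # True
```

Swap in the h.request(...) statement and the httplib2 setup string, and you get the list of 100 response times from the snippet above.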

The documentation for timeit.repeat() gives some good advice on how much stock to put into these numbers, and on using the mean/standard deviation to describe the performance.  But what I really wanted to know was whether or not my page took significantly longer to load than a similar page with no dynamic content.  I expanded my code to repeatedly time a second, static page, and to write the two lists into two columns of a CSV file.

import timeit
from csv import writer

# Hit the dynamic page 100 times, time the response time

t = timeit.Timer("h.request('http://PAGE1/URL',headers={'cache-control':'no-cache'})","from httplib2 import Http; h=Http()")
times_p1 = t.repeat(100,1)

# Now hit a similar static page 100 times
t = timeit.Timer("h.request('http://PAGE2/URL', headers={'cache-control':'no-cache'})","from httplib2 import Http; h=Http()")
times_p2 = t.repeat(100,1)

# Write the times to a CSV file
times = zip(times_p1,times_p2)

with open('times.csv','w') as f:
    w = writer(f)
    w.writerows(times)

Note we're using the Python with statement from Python 2.5+, which encapsulates some of the try/except/finally logic you'd normally write when opening a file.  Because I had even more spare time, I imported my new times.csv file into a statistics program (SPSS) to calculate the means, and performed a T-Test to see if the means of the two columns are statistically different.  I also could have used various statistics scripting tools: scipy or R, for example.  But I didn't have THAT much time.  :)
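For anyone without SPSS handy, the gist of that comparison can be sketched in pure Python using the standard library's statistics module and a hand-rolled Welch's t statistic.  The numbers below are made-up stand-ins for times_p1 and times_p2, not my actual measurements:

```python
import statistics

# Hypothetical response times in seconds (stand-ins for times_p1 / times_p2)
dynamic = [0.0215, 0.0223, 0.0219, 0.0230, 0.0225]
static  = [0.0204, 0.0208, 0.0201, 0.0211, 0.0206]

mean_d, mean_s = statistics.mean(dynamic), statistics.mean(static)
sd_d, sd_s = statistics.stdev(dynamic), statistics.stdev(static)

# Welch's t statistic: difference of means over the combined standard error
n_d, n_s = len(dynamic), len(static)
se = (sd_d ** 2 / n_d + sd_s ** 2 / n_s) ** 0.5
t_stat = (mean_d - mean_s) / se

print("difference: %.2f ms" % ((mean_d - mean_s) * 1000))
print("t statistic: %.2f" % t_stat)
```

With real data you'd still want to look up the p-value (scipy.stats will do the whole test in one call), but a large t on a visible difference of a millisecond or two tells the same story.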

There was a statistically significant difference.  The dynamic page was, on average, about 1.2 ms slower than the static page.  This makes practically no difference to the user experience of the page, and makes my development life much easier (and also illustrates how practical significance may differ from statistical significance).  I'll continue to generate pages dynamically.
