
Terraform/JMeter performance testing - practical experience

Over the last while, we've had the opportunity to put our new JMeter learnings to work.

  • A bug came up that was only evident under load - and we were able to reproduce it in our dev environments!  The dev ran JMeter off his laptop, and that was enough load to trigger the bug.
  • I think I mentioned last time how the simple act of mapping out a JMeter script revealed excess calls to our middleware - tickets were created to address this.
  • QA has used it to help draw out issues with a new production environment, but...
...the other day they ran out of steam on their laptops.  So we got to come back to the Terraform/JMeter setup we built a few months back.  Thankfully everything still worked, and within about 15 minutes we were back on our feet with 1 master and 6 slaves (c4.large) raising heck.
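
For reference, that ~15 minutes is basically a terraform apply plus grabbing the slave IPs for the JMeter run.  Roughly something like this (the variable and output names here are illustrative, not our actual config):

  # spin up 1 master + N slaves; counts/instance sizes are plain variables
  terraform apply -var 'slave_count=6' -var 'slave_instance_type=c4.large'

  # pull the slave IPs out of a terraform output, to feed JMeter's -R flag later
  terraform output slave_private_ips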

So, here are the lessons we learned today...
  • Terraform is amazing and was totally worth the time investment
  • If you are testing a cold production environment - ASK ABOUT HOSTS FILE CHANGES!!  We hammered 'real' production for a good 5 minutes before realizing that.  "Huh, where are the logs?"  After the cold sweats passed, everyone had a good giggle.
  • Killing the JMeter master test run doesn't also kill the slave processes!  Oops.  (See the shutdown sketch after this list.)
  • Manually updating the master/slave nodes quickly gets old (just git pulls, but definitely worth some automation)
  • Manually uploading the reports/data dumps to S3 quickly gets old (even if it's just pasted commands)
  • Manually linking the S3 reports/dump files quickly gets old (even if it's just 3 files) - see the housekeeping sketch after this list
  • Not having a parameterized JMeter script (e.g. for the thread count) leads to wasted time (I wasn't sure how thread counts work in a master/slave setup - see the parameterized run sketch after this list)
  • There is a ceiling on thread count/thread groups past which JMeter will blow up - GC overload, heap dumps, etc.  We had 15 thread groups with 3-4 requests each and accidentally put in 417 threads per group.  Oops.  (FWIW, 417 threads with 5 thread groups was okay.)
  • If you are using AWS ELB IPs and hosts file changes on the slaves - check the IP validity before each test run!  We had an IP expire and didn't realize it, which corrupted two of our test runs.  Better yet, find a better solution than hosts file changes!!  (See the DNS check sketch after this list.)
  • Don't use test accounts/organizations/users for performance testing!  Use real data!  Better yet, log replay!!!  (This is our next big step - it has to happen.)
  • Start with a CLEAR idea of the targets you are testing against.  Don't assume you can easily translate a production figure into a JMeter figure!  Better yet, use log replay!!
  • It's hard to learn while performing!
  • Performance testing is hard
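
A few quick sketches of what I mean above - these are simplified/hypothetical, not our actual scripts.  First, the slave problem: stopping the test on the master doesn't stop the remote engines, so the slaves keep hammering away.  The blunt fix is to kill the JMeter process on each slave over ssh (host names are placeholders):

  # stop the remote engines the hard way; pkill matches the jmeter java process
  for host in slave1 slave2 slave3 slave4 slave5 slave6; do
    ssh "$host" 'pkill -f ApacheJMeter || true'
  done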
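
Second, the manual housekeeping (git pulls, S3 uploads, links) is all one-liners begging to become a single script - something along these lines (bucket name and paths are placeholders):

  # refresh the test plan repo on every node
  for host in master slave1 slave2 slave3 slave4 slave5 slave6; do
    ssh "$host" 'cd ~/jmeter-tests && git pull'
  done

  # push the results to S3, then hand QA a time-limited link
  aws s3 cp results.jtl "s3://my-perf-bucket/runs/$(date +%F)/results.jtl"
  aws s3 presign "s3://my-perf-bucket/runs/$(date +%F)/results.jtl" --expires-in 86400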
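
Third, parameterizing the script: JMeter properties handle this - put ${__P(threads,10)} in the Thread Group's thread count, then set the value from the command line with -G so it reaches the remote engines.  Worth knowing: each slave runs the full thread count, so 6 slaves at 50 threads is 300 threads of total load.  A sketch (test plan name and slave IPs are placeholders):

  # -n non-GUI, -t test plan, -R remote engines, -l results file
  # -G sends the property to the slaves (-J would only set it locally)
  jmeter -n -t perf-test.jmx \
    -R 10.0.1.11,10.0.1.12,10.0.1.13,10.0.1.14,10.0.1.15,10.0.1.16 \
    -Gthreads=50 -Grampup=60 \
    -l results.jtl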
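
And the ELB/hosts file problem: a pre-flight check before every run would have saved us those two corrupted runs - resolve the ELB name fresh and compare it to what's baked into each slave's hosts file (ELB name and hosts are placeholders):

  # what the ELB resolves to right now
  dig +short my-app-123456789.us-east-1.elb.amazonaws.com

  # what the slaves are actually pointed at
  for host in slave1 slave2 slave3 slave4 slave5 slave6; do
    ssh "$host" 'grep my-app /etc/hosts'
  done
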
Our next big steps are probably going to be:
  • Put a front end on the Terraform/JMeter automation and make it accessible to dev/QA
  • Log replay.  Log replay.  Log replay.
