Load Test for Event Sourcing
Background
In previous posts, we explored Event Sourcing and how to implement a Fast Lane Notebook in Databricks. Now the question is: how do we check the performance of a Spark Application?
Objective
In this post, we will tackle some aspects of Load Testing and show how existing tools can be adapted to run against a Spark Application. For now, we only have Databricks Notebooks. As explained before, we use Azure Event Hub as the Message Broker.
Problem
JMeter Synchronous Aspect
Apache JMeter is widely used for Load Testing, but it is built around a synchronous request/response model. This makes it a good fit for REST API tests, but not for EventHub messages, which are fundamentally asynchronous.
Solution
Java Sampler
The solution was to create a custom Java Sampler that can be loaded into JMeter. It does the dirty work of connecting to EventHub and returns the answer to a Java Request component, which is what we see in the JMeter GUI. In effect, the Sampler turns the asynchronous exchange of Event Sourcing into a request/response pair that JMeter understands. The code is in the sampler repository listed in the references.
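To make the idea concrete, here is a minimal sketch of turning an asynchronous exchange into a blocking call. It is not the team's Sampler: `AsyncBroker` and `BlockingRequest` are hypothetical names, the broker is a stand-in for the EventHub connection, and only the Java standard library is used. In a real sampler, logic of this kind would presumably sit inside the `runTest` method of a JMeter `JavaSamplerClient`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

// Hypothetical stand-in for an asynchronous broker client: send() returns
// immediately and the reply is delivered later through the callback.
interface AsyncBroker {
    void send(String payload, Consumer<String> onReply);
}

// Wraps the asynchronous exchange in a blocking call, which is the shape
// a synchronous load-testing tool expects.
final class BlockingRequest {
    private final AsyncBroker broker;
    private final long timeoutMillis;

    BlockingRequest(AsyncBroker broker, long timeoutMillis) {
        this.broker = broker;
        this.timeoutMillis = timeoutMillis;
    }

    // Sends the payload and blocks until the reply arrives or the timeout expires.
    String call(String payload) {
        CompletableFuture<String> reply = new CompletableFuture<>();
        broker.send(payload, reply::complete);
        try {
            return reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for reply", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("no reply within " + timeoutMillis + " ms", e);
        }
    }
}
```

The timeout matters in a load test: without it, a lost message would leave the calling thread blocked forever instead of being reported as a failed sample.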
Sampler Evolution
The first version of the Sampler was created for Functional Tests, but the team realised that the same component could be used for Load Testing with some improvements. The new version includes classes that monitor the return stream and wake the thread that is waiting for each response.
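One way such a monitor could look is sketched below. This is an assumption about the design, not the actual classes from the repository: `Reply` and `ReplyDispatcher` are hypothetical names, and a `BlockingQueue` stands in for the Event Hub return stream. A single background thread reads replies and completes the future of whichever caller registered the matching id.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical reply message: the correlation id it answers plus a body.
final class Reply {
    final String correlationId;
    final String body;

    Reply(String correlationId, String body) {
        this.correlationId = correlationId;
        this.body = body;
    }
}

// Monitors the return stream on one thread and wakes the caller that
// registered the matching correlation id.
final class ReplyDispatcher implements Runnable {
    private final BlockingQueue<Reply> returnStream; // stands in for the Event Hub consumer
    private final Map<String, CompletableFuture<String>> waiting = new ConcurrentHashMap<>();

    ReplyDispatcher(BlockingQueue<Reply> returnStream) {
        this.returnStream = returnStream;
    }

    // Called by a sampler thread before it sends its message.
    CompletableFuture<String> register(String correlationId) {
        CompletableFuture<String> future = new CompletableFuture<>();
        waiting.put(correlationId, future);
        return future;
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                Reply reply = returnStream.take();
                CompletableFuture<String> future = waiting.remove(reply.correlationId);
                if (future != null) {
                    future.complete(reply.body); // releases the waiting thread
                }
                // Replies that nobody is waiting for are dropped.
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop listening
        }
    }
}
```

Registering the future before sending avoids a race in which the reply arrives before anyone is waiting for it.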
Correlation Id
The key change that made the Sampler work for Load Testing was to send a different Correlation Id with each message, one that can be traced back from the Event Hub response. With this approach, each JMeter thread receives its own answer and the response time is recorded correctly.
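The sketch below illustrates the per-message Correlation Id and the timing around it. All names (`OutgoingMessage`, `TimedResult`, `TimedExchange`) are hypothetical, and the transport is passed in as a function so the example does not depend on any Event Hub client; using a random UUID as the id is an assumption, since the post does not say how the ids are generated.

```java
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical outgoing message: every instance carries its own correlation id.
final class OutgoingMessage {
    final String correlationId = UUID.randomUUID().toString();
    final String payload;

    OutgoingMessage(String payload) {
        this.payload = payload;
    }
}

// Outcome of one timed exchange, similar in spirit to a JMeter sample.
final class TimedResult {
    final String correlationId;
    final String reply;
    final long elapsedNanos;

    TimedResult(String correlationId, String reply, long elapsedNanos) {
        this.correlationId = correlationId;
        this.reply = reply;
        this.elapsedNanos = elapsedNanos;
    }
}

final class TimedExchange {
    // Sends one message through the given transport and measures how long
    // the calling thread waits for the reply to that message.
    static TimedResult run(String payload,
                           Function<OutgoingMessage, CompletableFuture<String>> transport) {
        OutgoingMessage message = new OutgoingMessage(payload);
        long start = System.nanoTime();
        String reply = transport.apply(message).join(); // blocks until the reply arrives
        long elapsed = System.nanoTime() - start;
        return new TimedResult(message.correlationId, reply, elapsed);
    }
}
```

Because the id is unique per message rather than per thread or per test run, two threads sending identical payloads at the same moment can still be told apart on the return stream.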
Results
To analyse the outcomes, JMeter provides a series of Listeners, and you can add as many as you need to understand the results. The Summary Report gives a good idea of the response time.
Conclusion
- This test is already helping us find performance issues, some of them not related to our code.
- Do not wait until go-live to check whether the application works in a high-concurrency environment.
- Creating the Sampler was a good way to work around the synchronous nature of JMeter.
References
https://techblog.fexcofts.com/2018/09/01/event-sourcing-event-handling/
https://techblog.fexcofts.com/2018/07/09/spark-app-development-part-i-working-with-notebooks/
https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-about
https://gitlab-delivery-platform.fexcofts.com/cp/jmeter-eventhub-sampler
Hi, the repository https://gitlab-delivery-platform.fexcofts.com/cp/jmeter-eventhub-sampler is no longer available. Do you know where I can find it? Thanks 🙂
Hi,
Unfortunately our Git repositories are not public yet. We hope to have public Git repos pretty soon. In the meantime you can contact [email protected] for any further details about the source code mentioned in this post.
Cheers!
I have mailed but did not get any response for the source code.
Hi Davide,
Did you get the sample code for this?
Thank you,
Vinay