Tutorials on QA

Test Data and AI: What Makes Good Test Data?

In this series of articles we’re going to talk about how to use LLMs to generate synthetic data for QA testing. We’ll start with the basics of test data, move on to generation methods, and finish with examples of generating test data to validate LLM products. But let’s start at the beginning: in this article we’ll cover how to use synthetic test data more generally and what makes test data good or bad, and we’ll look at some traditional QA methodologies and how test data can inform them. Synthetic data here refers to any data created outside of production that can be used to execute test cases or mock a production scenario. This includes data produced by LLMs, procedurally generated data, and data curated or created by hand. Of course, production data is incredibly valuable for testing, and it should be used when it can be, but often that is not possible, legal, or scalable. Collecting production data for a new feature or product can also be expensive, since it may mean hiring beta testers. Synthetic data also has advantages beyond cost.
Synthetic Data Generation with Prompt Engineering

In our previous article, we talked about the role of synthetic data in QA testing and looked at two QA methodologies: Equivalence Class Partitioning and Boundary Value Analysis. Today, we’re going to talk about how you can use LLMs to generate test data for your applications. If you haven’t read them yet, I recommend taking a look at our articles on prompt engineering for traditional and reasoning models, as we’re going to be using prompts to generate test data. As we’ve discussed before, there are many reasons to use synthetic data in your testing. The biggest are cost and scalability, but synthetic data may also be required as an alternative to production data that contains personally identifiable information, the use of which is restricted by privacy law in many jurisdictions.
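
As a quick refresher on the two methodologies named above, here is a minimal sketch in Python. It assumes a hypothetical integer input field (an age that is valid from 18 to 65); the function names and the range are illustrative only and are not taken from the articles.

```python
def boundary_values(lo: int, hi: int) -> list[int]:
    """Boundary Value Analysis: test at and right next to each edge of [lo, hi].

    Assumes lo + 1 <= hi - 1 so the six values are distinct.
    """
    return [lo - 1, lo, lo + 1, hi - 1, hi, hi + 1]


def equivalence_classes(lo: int, hi: int) -> dict[str, int]:
    """Equivalence Class Partitioning: one representative value per class.

    Every value in a class is expected to be handled the same way,
    so a single representative stands in for the whole class.
    """
    return {
        "below_range": lo - 10,       # invalid: too small
        "in_range": (lo + hi) // 2,   # valid: somewhere in the middle
        "above_range": hi + 10,       # invalid: too large
    }


def is_valid_age(age: int, lo: int = 18, hi: int = 65) -> bool:
    """Hypothetical system under test: accepts ages in [lo, hi]."""
    return lo <= age <= hi


if __name__ == "__main__":
    for value in boundary_values(18, 65):
        print(f"boundary {value}: valid={is_valid_age(value)}")
    for name, value in equivalence_classes(18, 65).items():
        print(f"class {name} ({value}): valid={is_valid_age(value)}")
```

Values like these are the kind of targets you might ask an LLM to cover when prompting it for test data, rather than leaving the choice of inputs to chance.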
