Using Process Evaluation to Become a Better GenAI Prompt Engineer

December 18, 2024

As a child of the 90s, I was born into an interesting and transformative period. My generation essentially grew up and matured alongside the Internet (so maybe the Internet also likes dog memes and avocado toast?). Along the way, we've seen the power of the Internet (and computing in general) creatively leveraged to the benefit of so many different segments of our global population.

One of the Internet OGs, of course, is Google. It's hard to overstate how radically Google searches changed the way we access and interact with information. Just the other day, I wanted to make snickerdoodle cookies from scratch. With a few taps, I had dozens of recipes at my fingertips. Parsing through the list, I was able to identify the top-rated recipe, and shortly thereafter my wife and I were enjoying delicious cookies.

This experience made me realize something as it relates to data science and statistical methodology -- Google (and search engines like it) made information ubiquitous whereas before it was scarce. Before, if I wanted to learn how to make cookies or fix a leaky faucet or learn how to run linear regression using Python, I'd have to go get a book or ask somebody with more experience. Now, I can Google my question and parse through all of the links to find the answer I'm looking for.

However, while Google does a good job of giving you what it thinks is most relevant to your query, we don't always get exactly what we're looking for. Over time, people have learned how to query Google more effectively so that they have a high probability of receiving a "correct" search result. This skill, which I colloquially refer to as "effective Googling," was a hallmark of successful people I knew both in school and in professional settings throughout the first two decades of the 21st century (which makes me feel like I need to go take an Advil real quick haha).

Now that we live in the age of ChatGPT and Gemini, what I called "effective Googling" has become its own field of AI research called "prompt engineering." But the premise is the same.

With these wonderfully powerful tools so easily accessible, we can already see entire industries rapidly changing. Those who will be the most successful in this era will be those who are good at prompt engineering.

But what makes a good prompt? How does one hone this skill? It all comes down to process evaluation techniques. In classical process evaluation and control techniques, we systematically evaluate the degree to which some process is meeting its goal. We can think of entering prompts and receiving responses from ChatGPT as a process.

For any process monitoring/evaluation program to be effective, we need to:

1. Be specific about the goal and how "success" is defined
2. Be clear about your ask
3. Refine (2) as needed in subsequent prompts

In the context of code generation, success is getting the right code to produce the result you want. For example, if I want to reshape a dataframe from wide to long using R, I have to have a sense of how the data should be structured/organized in the resulting dataframe. Otherwise, I'd be executing the code blindly, which is not advisable.
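To make "knowing what success looks like" a bit more concrete, here is a minimal sketch of writing the goal down as a check before running any generated code. It's in plain Python rather than R, and everything in it is an assumption made for illustration: the toy data, the helper names `expected_long_shape` and `meets_goal`, and the output column names "name" and "value".

```python
# Hypothetical wide-format data: one row per ID, one column per measurement.
wide_rows = [
    {"X": 1, "V1": 10, "V2": 20, "V3": 30},
    {"X": 2, "V1": 40, "V2": 50, "V3": 60},
]


def expected_long_shape(wide_rows, id_col, value_cols):
    """Return the (row count, column names) the long result should have."""
    # Each wide row contributes one long row per value column.
    return len(wide_rows) * len(value_cols), [id_col, "name", "value"]


def meets_goal(long_rows, wide_rows, id_col, value_cols):
    """Check a candidate long-format result against the stated goal."""
    n_rows, columns = expected_long_shape(wide_rows, id_col, value_cols)
    if len(long_rows) != n_rows:
        return False
    if any(set(row) != set(columns) for row in long_rows):
        return False
    # Every original cell should show up exactly once in the long result.
    expected = sorted((w[id_col], c, w[c]) for w in wide_rows for c in value_cols)
    actual = sorted((r[id_col], r["name"], r["value"]) for r in long_rows)
    return expected == actual
```

With a check like this in hand, whatever code the model hands back gets judged against the goal I set up front, not against whether the output merely looks plausible.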

Once we understand very clearly what success looks like, we move on to the ask. I have found success in clearly stating what I have (e.g., a dataframe in wide format with an ID column named "X" and value columns named "V1" through "V3"), clearly stating what I want (e.g., a dataframe in long format with three columns: one for the ID, one for the value labels, and one for the values themselves), and finally stating how I'd like to get there (e.g., using the pivot_longer function from tidyr or maybe base R).
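The have / want / how structure can also be captured in a tiny template so that no part of the ask gets skipped. This is only a sketch in Python; the `build_prompt` helper and its exact wording are my own assumptions, not a standard recipe.

```python
def build_prompt(have, want, how=None):
    """Assemble a prompt from the three parts of a clear ask."""
    parts = [f"I have {have}.", f"I want {want}."]
    # The "how" is optional: sometimes we only care about the result.
    if how:
        parts.append(f"Please get there {how}.")
    return " ".join(parts)


prompt = build_prompt(
    have='a dataframe in wide format with an ID column named "X" '
         'and value columns named "V1" through "V3"',
    want="a dataframe in long format with three columns: one for the ID, "
         "one for the value labels, and one for the values themselves",
    how="using the pivot_longer function in R",
)
print(prompt)
```

Leaving `how` out is a reasonable choice when any working approach will do; adding it narrows the response toward the tools you already know how to read and verify.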

If we get what we want, that's great! And if not, we need to refine our ask until we do.

In sum, effective Googling or prompt engineering is nothing more than a process evaluation problem. If we're clear about our goal and where we are starting from, we can simply use domain-specific language to connect the two points.

Let's keep the conversation going! How do you use AI in your work?
