Programming for and with Large Language Models
Large language models (LLMs) are neural networks (usually based on the transformer architecture) that model human language, i.e., given an initial text sequence – the so-called prompt – they predict which words are statistically most likely to come next. When given the right prompt, LLMs are able to solve a surprisingly large number of downstream tasks. For example, if a LLM should summarize a long text, a good prompt might consist of the actual text followed by a “TLDR” to make the model believe that the most likely continuation is a short summary of the preceding text. The art of crafting suitable prompts is called prompt engineering. Our interest is mainly focused on two aspects: (1.) The effective use of pre-trained LLMs, e.g., by means of prompt engineering and/or software frameworks; and (2.) Using LLMs to write high-quality code – we refer to such LLMs as code models.
Prompt Programming
Prompt engineering is not only needed to make large language models produce the desired output; it also has a high impact on the quality of the results. Without access to the internals of an LLM, prompt engineering is the only way to control how it behaves. Therefore, one possible way to view LLMs is that they are a kind of program interpreter, executing programs (= prompts) written in natural language. Prompt engineering is thus a form of programming. In this seminar topic, you will learn about the underlying principles of “prompt programming” and get to know different prompting patterns. Moreover, you will see how software frameworks can interact with LLMs to realize dynamic prompts, different decoding techniques, output constraining, and more.
- Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm
- Demystifying Chains, Trees, and Graphs of Thoughts
- Prompting Is Programming: A Query Language For Large Language Models
LLM Vulnerabilities
You may know code injections, a common exploit in different programming languages (e.g., SQL injections) that allow an attacker to inject malicious instructions in a program. LLMs suffer from a similar vulnerability, called prompt injections. As there is no clear separation between the model’s instructions and the user’s input, an attacker can craft a prompt that is (mis-)interpreted by the model as the former instead of the latter. This way, attackers can abuse an LLM to do things it’s not supposed to do. They can even manipulate another user’s AI assistant to do harmful things. In this topic, you will research different attack scenarios and ways to prevent or mitigate them.
- Prompt Injection attack against LLM-integrated Applications
- Prompt Injection Attacks and Defenses in LLM-Integrated Applications
- Not what you’ve signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
- A Survey of Trojans in Neural Models of Source Code: Taxonomy and Techniques
Program Synthesis with LLMs
Program synthesis is the task of automatically generating a program that matches a given description and input/output examples. In recent years, large language models have shown that they can accomplish this task quite well. Program synthesis through LLMs has the potential to make writing code by hand largely unnecessary. But obviously, we are not quite there yet. Despite intriguing advancements, such code models are still very limited. In this seminar topic, you will take a look at different code models as well as evaluation methods and results and see which factors affect the performance of code models. You will also see how humans can interact with those models to solve problems cooperatively.
- Program Synthesis with Large Language Models
- Large Language Models for Software Engineering: Survey and Open Problems
- Evaluating Large Language Models Trained on Code
- CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis
Feedback Mechanisms in Code Generation
The first answer an LLM gives is not always the best one. In fact, providing feedback on the answer and regenerating it can improve the results a lot. There are many approaches that follow this idea and implement an automatic feedback and refinement loop. This topic is about exploring and surveying such feedback mechanisms in the domain of code generation. You will find out in which ways feedback on code can be computed and communicated to the model.
- Self-Refine: Iterative Refinement with Self-Feedback
- CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing
- Teaching Large Language Models to Self-Debug
- Improving Code Generation by Training with Natural Language Feedback
- Coffee: Boost Your Code LLMs by Fixing Bugs with Feedback
Constrained Decoding for Code Models
Causal LLMs generate their output one token at a time. The idea behind constrained (a.k.a. guided, structured, or masked) decoding (or generation) is simple: in each step, prevent all those tokens from being generated that would continue the sequence generated so far in an undesired way. For example, if the LLM is supposed to return only a number, we can constrain the output to tokens that do not contain any letters. However, things get much trickier when applying this approach to more complicated settings and constraints, such as code synthesis, where we want syntactically and semantically correct programs. In this seminar topic, you will research different constrained decoding frameworks for code generation and find out how they work and what their advantages are.