Adapting Pipelines

The easiest way to build your own pipeline is to start from an existing one. Export a copy of a built-in pipeline (here zs, the zero-shot pipeline) to a local file:

soak show pipeline zs > my_pipeline.soak

A pipeline file consists of two parts:

a YAML header, which defines each node, its type, and its other configuration
a series of Markdown templates, one for each node that uses an LLM

The YAML header of the zero-shot pipeline begins:

name: zero_shot

default_context:
  persona: Experienced qual researcher
  research_question: None

nodes:
  - name: chunks
    type: Split
    chunk_size: 30000

  - name: codes_and_themes_per_chunk
    type: Map
    max_tokens: 16000
    inputs:
      - chunks

...

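An example of a prompt template follows. Its first line, ---#codes_and_themes_per_chunk, matches the name of the Map node defined in the header above: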

---#codes_and_themes_per_chunk

You are a: 
This is the initial coding stage of a qualitative analysis.
Your research question is: 

In this stage, you will generate codes. Codes are ...

... instructions omitted ...

The text to read is:

<text>

</text>

Identify all relevant codes in the text, provide a Name 
for each code in 8 to 15 words in sentence case.

[[code*:codes]]

Next is the theme identification stage. 
Your task is to group the codes into distinct themes.
A 'theme' relates to the wants, needs, meaningful outcomes, 
and lived experiences of participants.

... further instructions omitted ...

Create themes from the codes above

[[theme*:themes]]

The [[code*:codes]] and [[theme*:themes]] placeholders mark the points where LLM completions of a specific type (codes or themes) should be extracted from the model's output. In other words, each placeholder tells soak to request structured data from the LLM at that point in the prompt.
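When editing a template it can be useful to list the placeholders it contains, so you can see which structured outputs it will request. The following is a minimal sketch, not soak's own parser: the regular expression is an assumption inferred from the two placeholders in the example above, and the function name find_placeholders is hypothetical. It only records whether a `*` is present and does not interpret it.

```python
import re

# Assumed placeholder shape, inferred from the example template:
# [[type:name]] or [[type*:name]]. soak's real grammar may allow more.
PLACEHOLDER = re.compile(r"\[\[(\w+)(\*?):(\w+)\]\]")


def find_placeholders(template: str) -> list[tuple[str, bool, str]]:
    """Return (type, has_star, name) for each placeholder, in document order."""
    return [
        (type_, star == "*", name)
        for type_, star, name in PLACEHOLDER.findall(template)
    ]


# A shortened copy of the template shown above.
template = """Identify all relevant codes in the text.

[[code*:codes]]

Create themes from the codes above

[[theme*:themes]]
"""

for type_, has_star, name in find_placeholders(template):
    print(type_, has_star, name)
```

A template that returns an empty list here has no placeholders of this shape, which may be worth checking against the documentation for the node type you are using.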

To adapt the pipeline to your needs, edit the exported file (my_pipeline.soak): add or change nodes in the YAML header, and add or edit the prompt template for each node that uses an LLM. See the documentation for details of all the node types and options.
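After editing, a quick sanity check can catch a template whose name no longer matches any node. The sketch below rests on two assumptions drawn only from the example above: that each template starts with a line of the form ---#node_name, and that nodes are declared in the header as "- name: ..." entries. The helper names split_pipeline and orphan_templates, and the summary template in the sample, are hypothetical. soak's own loader may parse the file differently.

```python
import re

# Assumption: a template starts with a line like "---#node_name".
DELIMITER = re.compile(r"^---#(\w+)[ \t]*$", re.MULTILINE)
# Assumption: nodes are declared in the header as "- name: <identifier>".
NODE_NAME = re.compile(r"^[ \t]*-[ \t]*name:[ \t]*(\w+)[ \t]*$", re.MULTILINE)


def split_pipeline(text: str) -> tuple[str, dict[str, str]]:
    """Split pipeline text into (yaml_header, {template_name: template_body})."""
    # re.split with one capture group yields
    # [header, name1, body1, name2, body2, ...].
    parts = DELIMITER.split(text)
    header = parts[0]
    templates = {parts[i]: parts[i + 1].strip() for i in range(1, len(parts), 2)}
    return header, templates


def orphan_templates(text: str) -> list[str]:
    """Return template names that do not match any node name in the header."""
    header, templates = split_pipeline(text)
    nodes = set(NODE_NAME.findall(header))
    return sorted(set(templates) - nodes)


sample = """name: zero_shot

nodes:
  - name: chunks
    type: Split

  - name: codes_and_themes_per_chunk
    type: Map
    inputs:
      - chunks

---#codes_and_themes_per_chunk

Identify all relevant codes in the text.

[[code*:codes]]

---#summary

Summarise the themes.
"""

print(orphan_templates(sample))
```

To check a real file, read it first, for example with pathlib.Path("my_pipeline.soak").read_text(), and pass the text to orphan_templates. The regular expression for node names is deliberately simple; a YAML parser would be more robust for headers that use other layouts.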