Style Text WebGL+iOS Stand-alone LLM (+Llama.cpp wrapper)

This is a set of mobile-friendly LLAMA- and GPT2-based generative LLMs for rewriting lines of text in a given style, with iOS and WebGL wrappers based on Llama.cpp.


We ship it at three levels of model precision. You can try it online, entirely in your browser, with the quantised (simplified) versions:

- Automated NPC Dialogue stylisation demo (LLAMA-based, Q4_K_M, ~110 MB)

- Single Prompt Debug demo (LLAMA-based, Q8_0, ~170 MB)

Inside the package, you will also find Q8_0, Q4 and original bf16 (~321 MB) versions of our models, ready to run inside a cross-platform game engine!
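
As a rough sanity check on the sizes listed above, the sketch below estimates the weight count from the bf16 file size and the Q8_0 size that follows from it. It assumes that "MB" means 10^6 bytes, that file metadata is negligible, and that every tensor is stored in the named format (real files often keep a few tensors at higher precision). The function names are illustrative only and are not part of the package. Q4_K_M is left out because it mixes several block formats.

```cpp
// bf16 stores each weight in 2 bytes.
constexpr double kBf16BytesPerWeight = 2.0;

// ggml's Q8_0 packs 32 weights into a 34-byte block:
// 32 int8 values plus one 16-bit scale, i.e. 8.5 bits per weight.
constexpr double kQ8_0BytesPerWeight = 34.0 / 32.0;

// Approximate number of weights implied by a bf16 file size (metadata ignored).
constexpr double estimate_weight_count(double bf16_file_bytes) {
    return bf16_file_bytes / kBf16BytesPerWeight;
}

// Approximate Q8_0 file size for the same set of weights.
constexpr double estimate_q8_0_bytes(double bf16_file_bytes) {
    return estimate_weight_count(bf16_file_bytes) * kQ8_0BytesPerWeight;
}
```

For the ~321 MB bf16 file this gives about 160 million weights and roughly 170 MB for Q8_0, which is in line with the size listed for the Q8_0 demo.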


You give it:

```

<input> How are you today? <inputEnds>

<style> Pirate's Poetic Question <styleEnds>

<output>

```

It prints out: Hey there, how fares ye today?
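
A minimal sketch of assembling that prompt in C++ is shown here. The helper name `build_style_prompt` is hypothetical, and the exact whitespace (one space inside each tag pair, one newline between lines) is an assumption; check it against the shipped demos before relying on it.

```cpp
#include <string>

// Builds a prompt in the tag layout shown above.
// Assumption: single spaces pad the tag contents and lines are separated
// by one newline; the model may have been trained with different spacing.
std::string build_style_prompt(const std::string& input, const std::string& style) {
    return "<input> " + input + " <inputEnds>\n"
           "<style> " + style + " <styleEnds>\n"
           "<output>";
}
```

Calling `build_style_prompt("How are you today?", "Pirate's Poetic Question")` yields the three tagged lines from the example, ending with the open `<output>` tag that the model is expected to continue.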


We hope that this model and wrapper can help you save a day! (Or, more realistically, 3 months of work for a dedicated team of 3 people, by our estimate.)

  • Model fine-tuned on a corpus of more than 0.5 million dialogue lines (64,343,812 tokens).
  • All our training data was purely synthetic (generated).
  • Tools for output filtering are provided with the release in C#, so they are easily editable!
  • All generative LLM models can make mistakes in their output. In some cases, they can turn questions into statements and add minor hallucinated info based on the input text and style data.
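
One of the failure modes mentioned above, a question coming back as a statement, can be caught with a very small check. The sketch below is an illustration in C++ and is not the C# filtering code shipped with the release; both function names are hypothetical, and it only looks at the final punctuation mark.

```cpp
#include <string>

// Returns the last non-whitespace character, or '\0' if there is none.
char last_visible_char(const std::string& text) {
    const auto pos = text.find_last_not_of(" \t\r\n");
    return pos == std::string::npos ? '\0' : text[pos];
}

// True when the input line is a question but the rewritten line is not.
bool question_became_statement(const std::string& input, const std::string& output) {
    return last_visible_char(input) == '?' && last_visible_char(output) != '?';
}
```

A caller could regenerate the line, or fall back to the original text, whenever this check returns true.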

The bundled model inference wrapper is based on an open-source model runtime and supports the latest 405B models, so you can try them out on platforms with enough RAM and time available.


The bundled wrapper:

  • Is built and tested for Web, Mobile and PC.
  • Runs model setup and execution asynchronously on threads in C++ code (this is needed if you want code to run in parallel when targeting the Web platform).
  • Lets you tap into logs, output tokens, and completion callbacks from generation.
  • Lets you set up model update callbacks in the editor UI using events and helper scripts, and configure most model configuration/run parameters from the editor UI and at runtime.
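
The general pattern of threaded generation with log, token and completion callbacks can be sketched as follows. This is not the wrapper's actual API: `GenerationCallbacks` and `generate_async` are hypothetical names, and a fixed token list stands in for the model so the example stays self-contained.

```cpp
#include <functional>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical callback bundle; these names are NOT the wrapper's API.
struct GenerationCallbacks {
    std::function<void(const std::string&)> on_log;       // diagnostic messages
    std::function<void(const std::string&)> on_token;     // one call per token
    std::function<void(const std::string&)> on_complete;  // full generated text
};

// Stand-in for a model run: replays a fixed token list on a worker thread.
// Callbacks are invoked on that worker thread, not on the caller's thread.
std::future<void> generate_async(std::vector<std::string> tokens, GenerationCallbacks cb) {
    return std::async(std::launch::async,
                      [tokens = std::move(tokens), cb = std::move(cb)] {
        if (cb.on_log) cb.on_log("generation started");
        std::string text;
        for (const auto& token : tokens) {
            text += token;
            if (cb.on_token) cb.on_token(token);
        }
        if (cb.on_complete) cb.on_complete(text);
    });
}
```

Because the callbacks fire on the worker thread, a caller either waits on the returned future before reading shared state or hands results back to the main thread through its own queue. In a game engine the second option is usually the safer one.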