Comments (12)
Support for saving and restoring state has been added for Org mode buffers. Saving a gptel buffer to disk will save gptel metadata as Org properties. Opening an Org file and turning on gptel-mode will cause the state (if available in the file) to be restored, and the conversation can be continued. See also M-x gptel-set-topic, which can be used to limit a conversation's context to an Org heading.
Support for Markdown mode is pending.
gptel remains stateful when the file is unsaved. I have yet to find a way to do this that does not involve adding additional syntax like a "Response: " prefix or heading. But gptel conversations in Org mode buffers can be saved to disk and resumed now.
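For illustration, the saved state in an Org file might look like a property drawer under the conversation's heading. The property names below are illustrative, not necessarily the exact ones gptel writes:

```org
:PROPERTIES:
:GPTEL_MODEL: gpt-3.5-turbo
:GPTEL_TEMPERATURE: 1.0
:GPTEL_BOUNDS: ((75 . 312) (390 . 701))
:END:
```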
from gptel.
By "internal state", do you refer to the use of a text property to differentiate between queries (what you type) and responses (what ChatGPT generates)?
Yes, and also the use of variables to maintain the GPT parameters. I think it would be better to store them in the file too, such that the entire state of the conversation is stored as plain text. If you are running the conversation from a different buffer (as in your programming use case), the parameters could also be stored in file local variables.
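Storing the parameters as file-local variables could look like the sketch below. `gptel-model` is a real gptel variable; treat the exact set of variables saved this way as an assumption:

```elisp
;; Local Variables:
;; gptel-model: "gpt-3.5-turbo"
;; gptel--num-messages-to-send: 10
;; End:
```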
Your suggestion is to replace the text-property based differentiation of query/responses with markup.
Essentially yes. But the markup should ideally be very lightweight. In Org, you could mark headers with a special :gptel: tag for example.
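For example, a heading tagged this way would mark its subtree as part of the conversation. This is a sketch of the idea, not implemented syntax:

```org
* How do I parse arguments in a bash script?        :gptel:
The model's response would go in the section body here.
```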
I'm not sure if adding persistence is worth giving up "structure-less" interaction in any buffer.
Ideally we would end up with something which is still structure-less (or as structure-less as possible), while still supporting persistence. One should also note that gptel already makes some assumptions about the structure via gptel-prompt-string.
Another alternative is to use markup-based conversations in dedicated gptel buffers (in Org or Markdown as you describe), and continue to use text-properties otherwise, but this makes the code messier and harder to maintain on the whole.
This is a route I wouldn't take. I would stick to the idea of using org/markdown/prog-mode buffers. I would also stick to the idea of staying mostly structure-less, but only to that extent which allows to eliminate other internal state (text properties and maybe parameters).
Using text-only is a powerful concept and also very Emacsy. But maybe it conflicts a little bit with the goal of creating a fully polished UI in the style of the browser or some apps. But I would rather take the plain text only approach, since I believe it just fits better into Emacs.
from gptel.
Lines beginning with > are used for responses from the model. ("role":"assistant")

The model's response can include code blocks, or other kinds of formatted output. Prepending a > to them destroys the markup.

Lines beginning with # are used for system prompts. ("role":"system")
These are comments in org-mode and headings in Markdown -- this means every comment/heading is interpreted as a system prompt?
Choosing other characters for these purposes will present similar conflicts, and may confuse users unfamiliar with them, for example if we use ! or == at the beginning of a line to denote a system prompt. The user might also edit them in the course of using the document as a general-purpose Org/Markdown file.
My idea so far is to handle this internally and without imposing any markup or syntax, along the following lines. (I'm using markdown-mode as an example; a similar system will work for org-mode.)
- The front matter has a field, let's say "locations", that is a list of integers. Each integer is the value of (point) at the boundary between a prompt and a response.
- When gptel-mode is turned on, we read the locations and start tracking them with markers or text-properties. At this point, yes, the system is no longer stateless.
- When gptel-mode is turned off or when the file is written to disk, we update the locations list with the boundary information. All the state is confined to the file again.
- Since we know which regions of the buffer are prompts and responses, gptel-mode can optionally use a visual cue (like subtle fontification) to indicate this to the user. We can also distinguish visually between prompts and pre-existing text that was not fed to ChatGPT this way.
- With a slightly richer data structure than a list -- one that is still not too ugly when serialized as front matter -- we can track the model that buffer content came from, such as a response from GPT-3 that was used as a prompt fed to DALL-E to produce an image.
- Finally, using gptel in any buffer without turning on gptel-mode is still possible (as it is right now), but there's no persistence.
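A minimal Elisp sketch of the restore step, assuming the front matter has been parsed into a flat list of boundary positions. All names here are hypothetical, not gptel's actual API:

```elisp
(require 'cl-lib)

;; Hypothetical helper: re-apply the `gptel' response text property
;; from a flat list of (BEG END BEG END ...) buffer positions read
;; from the front matter.
(defun my-gptel-restore-bounds (locations)
  (cl-loop for (beg end) on locations by #'cddr
           do (put-text-property beg end 'gptel 'response)))
```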
Essentially: Instead of storing the state metadata separately in an auxiliary file or creating syntax and imbuing it with meaning, we store the metadata in the file itself as TOML-style front matter or as Org property drawers.
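As TOML-style front matter in a Markdown file, the stored metadata might look like this (field names are hypothetical):

```toml
+++
model = "gpt-3.5-turbo"
locations = [1, 120, 450, 608]
+++
```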
The advantages are that
- the user is free to impose whatever structure they desire on the conversation. If they want to use headings for their prompts and section contents for the responses, that works. If they want a free-flowing conversation, that works too.
- You can have as much visual indication as you'd like. None, or differently colored backgrounds for prompts/responses, or something in between.
- If you turn gptel-mode off it's just a regular Org file -- no special syntax/markup that you might mess up before you turn gptel-mode on again. Of course, you could mess up the front matter or property drawer, but this is less likely than editing prefixes at the start of lines, etc.
The disadvantages are that
- when the buffer is modified as a result of interaction with ChatGPT the system is in an intermediate state that can be lost. This can be mitigated by updating the metadata in the front matter after each response from ChatGPT. This can be done so that the response text insertion and the metadata update are part of an amalgamated change as far as undo is concerned.
- the front matter/property drawer will fill up with what looks like gibberish.
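The amalgamated-change mitigation can be sketched with `with-undo-amalgamate` (available in Emacs 29; the other names here are hypothetical):

```elisp
;; Insert the response and update the front-matter metadata as a
;; single undo step, so the two cannot be separated by undo.
(with-undo-amalgamate
  (insert response)
  (my-gptel-update-front-matter))
```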
What do you think?
(@minad feel free to weigh in.)
from gptel.
I think it would be better to store them in the file too, such that the entire state of the conversation is stored as plain text.
+1, using Markdown front-matter for all parameters would be great. To avoid clutter the front matter could be hidden by default.
Your suggestion is to replace the text-property based differentiation of query/responses with markup.
+1, it would be nice if it was possible to continue conversations by saving the file and then opening it again.
Actually, it would be nice if all conversations were backed by a file on disk.
A few emergent properties would result from this:
- It is possible to review and continue all conversations.
- Packages which provide a persistent undo history, such as undo-tree, allow reviewing past branches of conversations in the same way that the web UI does.
- Having the conversation in a file would allow easier integration with other (non-Emacs) tools.
from gptel.
I'm currently using file-name-handler-alist to save gptel to a file:
;; Make the system message survive major-mode changes.
(setf (get 'gptel--system-message 'permanent-local) t)

(defun gptel-run-real-handler (operation &rest args)
  "Call OPERATION with ARGS, bypassing `gptel-file-handler'."
  (let ((inhibit-file-name-handlers
         (cons #'gptel-file-handler
               (and (eq inhibit-file-name-operation operation)
                    inhibit-file-name-handlers)))
        (inhibit-file-name-operation operation))
    (apply operation args)))

(defun gptel-insert-file-contents (filename &optional visit beg end replace)
  "Read a saved conversation from FILENAME and rebuild the buffer.
The file contains a printed list of (:role ... :content ...) plists."
  (with-undo-amalgamate
    (let (obj ans)
      ;; FIXME: honor replace == nil
      (delete-region (point-min) (point-max))
      (setq ans (gptel-run-real-handler 'insert-file-contents filename visit beg end replace))
      (goto-char (point-min))
      (setq obj (read (current-buffer)))
      (delete-region (point-min) (point-max))
      ;; Re-insert each message, restoring the `gptel' text property
      ;; on assistant responses.
      (mapc
       (lambda (x)
         (let ((content (plist-get x :content)))
           (pcase (plist-get x :role)
             ("system" (setq-local gptel--system-message content))
             ("user" (insert content))
             ("assistant" (insert (propertize content 'gptel 'response))))))
       obj)
      ans)))

(defun string-trim-ignore-advice (str &rest _)
  "Override for `string-trim' so saved messages keep their whitespace."
  str)

(defun gptel-write-region (_start _end filename &optional append visit lockname mustbenew)
  "Serialize the conversation in the current buffer to FILENAME."
  (when append (error "append not supported"))
  (let (ans gptel--num-messages-to-send obj)
    (save-excursion
      (save-restriction
        ;; FIXME: respect start + end
        (widen)
        (goto-char (point-max))
        ;; Build the prompt list, print it into the buffer temporarily,
        ;; write that out, then throw past `atomic-change-group' so the
        ;; buffer modifications are reverted.
        (catch 'revert!
          (atomic-change-group
            (advice-add #'string-trim :override #'string-trim-ignore-advice)
            (unwind-protect
                (setq obj (gptel--create-prompt))
              (advice-remove #'string-trim #'string-trim-ignore-advice))
            (delete-region (point-min) (point-max))
            ;; FIXME: pp settings ought to be set
            (pp obj (current-buffer))
            (setq ans
                  (gptel-run-real-handler 'write-region
                                          (point-min) (point-max) filename
                                          nil visit lockname mustbenew))
            (throw 'revert! nil)))
        ans))))

(defun gptel-file-handler (operation &rest args)
  "Dispatch file operations on .gpt files to the gptel handlers."
  (cond ((eq operation 'insert-file-contents)
         (apply #'gptel-insert-file-contents args))
        ((eq operation 'write-region)
         (apply #'gptel-write-region args))
        (t (apply #'gptel-run-real-handler operation args))))

(add-to-list 'file-name-handler-alist
             (cons (rx ".gpt" eos)
                   #'gptel-file-handler))
When I have a bit more time I can fix a few things in this and open a PR.
from gptel.
Support for saving chats to Markdown/Text files has been added. The implementation isn't very satisfying compared to Org (I'm using file-local variables), but it works.
from gptel.
More precisely the idea is that you don't maintain any internal state in gptel and instead take everything from the current buffer.
By "internal state", do you refer to the use of a text property to differentiate between queries (what you type) and responses (what ChatGPT generates)? Because otherwise gptel is already stateless. When gptel-send is invoked, it does a text-property-search-backward and builds the conversation history/context to send -- it does not maintain anything internally. How many past exchanges it searches for is controlled by one of the model parameters in the transient menu.
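A rough sketch of this property-based lookup (not gptel's actual implementation):

```elisp
;; Collect the (BEG . END) extents of every `gptel'-marked response
;; before point, scanning backward over text-property changes.
;; Requires Emacs 27+ for `text-property-search-backward'.
(defun my-gptel-collect-responses ()
  (save-excursion
    (let (regions match)
      (while (setq match (text-property-search-backward 'gptel 'response t))
        (push (cons (prop-match-beginning match)
                    (prop-match-end match))
              regions))
      regions)))
```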
Your suggestion is to replace the text-property based differentiation of query/responses with markup. As I mention in the README, I did not want to make any assumptions about structure at the start since it's not clear if, for example, forcing the heading-content-heading-content structure makes sense. Right now I can have a conversation in a code buffer that looks like this:
# How do I parse arguments in a bash script? I want it to handle the arguments "-d" (that sets "download") and "-b". Respond with only bash code.
while getopts "d:b:" opt; do # <-- response from ChatGPT
case $opt in
d)
download=1
shift
;;
... # code omitted.
esac
done
# Now write a function to do task X, where...
Note: gptel-mode isn't even required for this, you can just type the comment and run gptel-send.
However there's no robust way to persist text-properties as metadata, so the above exchange cannot be resumed in a new Emacs session. I'm not sure if adding persistence is worth giving up "structure-less" interaction in any buffer.
Another alternative is to use markup-based conversations in dedicated gptel buffers (in Org or Markdown as you describe), and continue to use text-properties otherwise, but this makes the code messier and harder to maintain on the whole.
I don't have any strong opinions about this yet, I'm still experimenting to see what's possible/useful behavior! Let me know what you think.
from gptel.
@CyberShadow Storing and reading the chat parameters from front-matter in Markdown (or a property drawer in Org) is quite simple. However we also need to store the boundaries demarcating prompts and responses. Reading headings as prompts and the text body as responses is too limiting. You can't have a long prompt that includes a bulleted list of instructions to ChatGPT, for example. See @minad's point above about using some format that is as structure-less as possible. Do you have any ideas on how to do this?
from gptel.
Yes. I agree that imposing typing overhead on users' prompts would be annoying, so I had the following syntax in mind:
- Lines beginning with > are used for responses from the model. ("role":"assistant")
- Lines beginning with # are used for system prompts. ("role":"system")
- All other non-empty lines are used as user prompts. ("role":"user")
- Empty lines delimit messages, except when both the line above and the line below are user lines, in which case they're just a \n\n.
- Contiguous spans of lines formatted in the same way are grouped together as one messages item.
I think this gets us close to being able to represent with 100% fidelity all possible inputs to the API endpoint. A few corner cases are not representable (trailing newlines, or several consecutive messages items with the same "role":"user"), but I think this is acceptable. There's also ">" or "#" at the start of a line in user input, though if we really need that, it could be represented by space-stuffing as in RFC 3676.
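To illustrate the proposed mapping, a file like this:

```
# You are a terse assistant.

What is 2+2?

> 4
```

would serialize to three messages items: one with "role":"system", one with "role":"user", and one with "role":"assistant".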
For ease of use the major mode could implement some niceties which do not detract from the stateless design or fidelity of representation. For example, hitting Return while point is on a line which starts with > or # could prefix the new line with the same character.
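Such a nicety could be a small command bound to RET, along these lines (a sketch; the command name is hypothetical):

```elisp
;; On RET, if the current line starts with "> " or "# ", begin the
;; new line with the same prefix.
(defun my-gptel-newline-continue-prefix ()
  (interactive)
  (let ((prefix (save-excursion
                  (beginning-of-line)
                  (when (looking-at "\\([>#] \\)")
                    (match-string 1)))))
    (newline)
    (when prefix (insert prefix))))
```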
Does that sound reasonable?
from gptel.
The model's response can include code blocks, or other kinds of formatted output. Prepending a > to them destroys the markup.
It should not.
Quoted paragraph
code block in quoted paragraph
Continuation of quoted paragraph
Source code for the above:
> Quoted paragraph
>
> ```
> code block in quoted paragraph
> ```
>
> Continuation of quoted paragraph
These are comments in org-mode and headings in Markdown -- this means every comment/heading is interpreted as a system prompt?
Sorry, when would this be a problem? I don't think I've ever needed to type a Markdown heading or quote into GPT.
from gptel.
What do you think?
I admit it would work. I can think of these minor points:
- Personally, I would like to be able to type the syntax which defines who said what. Otherwise, it no longer feels like I'm working in a text-based format, but some kind of WYSIWYG editor with hidden state which I can't see or control (which would be somewhat true).
- It would no longer be possible to copy fragments of conversations into other conversations using just pure text editing operations. It would have to be done in a way that preserves text properties, or by manually "re-painting" text after pasting with gptel-specific commands. (To be fair, copying from Emacs buffers to Emacs buffers does preserve properties.)
- The format of the out-of-band metadata would only be understandable by gptel, which hinders interoperability with other software.
- Updates to the out-of-band metadata may interfere with undo/undo-in-region in unpleasant ways.

If the main concern is conflict with user-typed quote blocks / headings, a more distinct prefix could be chosen, such as GPT> or GPT-SYSTEM:.
Does that make sense?
from gptel.
I have no actionable ideas at the moment to make gptel completely stateless (without adding syntax, which I don't want to do), so I am moving it to discussions for now.
from gptel.