Maybe if you filtered GPT-3’s output through a Prolog program (perhaps implemented as a DCG transducer, outputting corrected strings) to make it more coherent semantically, you could get something out of it. Otherwise it’s awfully inconsistent in terms of meaning (though very good at representing structure).
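A minimal sketch of what such a transducer could look like (the predicate names and the single rewrite rule are just placeholders for real semantic checks): it maps a token list to a corrected token list via phrase/2.

```prolog
correct([])     --> [].
correct([T|Ts]) --> fix(T), correct(Ts).

fix(alot)  --> [a, lot].     % rewrite a token the program knows about
fix(Token) --> [Token].      % pass everything else through unchanged

% ?- phrase(correct([he, said, alot, of, things]), Out).
% Out = [he, said, a, lot, of, things].
```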
I was wondering if others were thinking along the same lines about transformers, the T in GPT. As many know, transformers fail at evaluations such as math but are great at generating, including generating Prolog code. I have been experimenting with having them generate DCGs. ChatGPT is not perfect in this area either, but the fact that it can create Prolog DCGs at all is impressive, and it is still possible that transformers can be trained to become better.
So while you used the word "filtered", I would replace that with "eval" to cover a larger set of problems. Also, instead of having GPT generate the initial story, consider having GPT (or a similar model) generate Prolog DCGs that generate the story and do all the necessary checking, as sketched below.
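A toy of the kind of DCG I have in mind (the names are mine, and a real story grammar would be far larger): the grammar itself generates the story, and consistency is enforced simply by threading the same hero through every rule.

```prolog
story(Hero)  --> intro(Hero), event(Hero), ending(Hero).

intro(Hero)  --> [once, upon, a, time, there, was, Hero, '.'].
event(Hero)  --> [Hero, found, a, map, '.'].
ending(Hero) --> [Hero, lived, happily, ever, after, '.'].

% ?- phrase(story(alice), Tokens).
% Tokens = [once, upon, a, time, there, was, alice, '.', alice, found,
%           a, map, '.', alice, lived, happily, ever, after, '.'].
```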
With OpenAI you would need a large token count to generate the rather lengthy DCG, and with ChatGPT limited to either 4,000 or 8,000 tokens for the prompt and completion I do not see this working correctly, even when using the "continue" prompt to get more code. However, GPT-4 was just released and has a model with a 32K token limit.
Along similar lines, I posted an example ChatGPT prompt for doing math (ref) since ChatGPT cannot do math evaluations. I also tried having ChatGPT create a query for Wolfram Alpha to do the evaluation, but was not getting consistently valid queries, so I set that aside for now.
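To show what I mean by "eval": let Prolog do the arithmetic and treat the model's output only as a claim to be checked (check_claim/3 is my own name, not something from the linked prompt).

```prolog
check_claim(Expr, Claimed, Verdict) :-
    Actual is Expr,
    (   Actual =:= Claimed
    ->  Verdict = ok(Actual)
    ;   Verdict = wrong(expected(Actual), got(Claimed))
    ).

% ?- check_claim(2 + 2, 5, V).
% V = wrong(expected(4), got(5)).
```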
So in a few years what you seek should be mainstream and might even be as free as a Google query.
While stassa.p and Schank talk about “meaning”, it seems to have been replaced by “attention”. So a sentence does not have a meaning; rather, a sentence pays attention to something. And an answer should pay attention to the same thing that the question was paying attention to?
So ChatGPT is more an attention-based information retrieval system, except that what is retrieved is not existing documents but newly generated text. So the architecture includes a generative component, and the “transformation” stems from the fact that the output is generated relative to the input.
But this transformation is learnt through machine learning? See also:
GPT and BERT: A Comparison of Transformer Architectures
The first transformer was presented in the famous paper “attention is all you need” by Vaswani et al. The transformer models were intended to be used for machine translation and used as encoder-decoder architecture that didn’t rely on things like recurrence. Instead, the transformer focused on something called attention. In a nutshell, attention is like a communication layer that is put on top of tokens in a text. This allows the model to learn the contextual connections of words in a sentence.
https://dev.to/meetkern/gpt-and-bert-a-comparison-of-transformer-architectures-2k46
A sentence can have multiple attentions and an ultimate self-attention?
The self-attention mechanism is a key component of transformer models, and it has revolutionized the way natural language processing (NLP) tasks are performed. Self-attention allows the model to attend to different parts of an input sequence in parallel, allowing it to capture complex relationships between words or sentences without relying on recurrence or convolutional layers.
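Just to ground the terminology for myself, here is a toy, un-projected version of that idea in Prolog: score a query token against every token vector by dot product, softmax the scores, and return the weighted mix. Purely illustrative, nothing like a real transformer layer, and all names are mine.

```prolog
:- use_module(library(apply)).   % maplist/3, foldl/6
:- use_module(library(lists)).   % sum_list/2
:- use_module(library(yall)).    % lambda syntax

% dot/3: dot product of two equal-length number lists.
dot(Xs, Ys, D) :-
    foldl([X, Y, A0, A]>>(A is A0 + X*Y), Xs, Ys, 0, D).

% softmax/2: turn raw scores into weights that sum to 1.
softmax(Scores, Weights) :-
    maplist([S, E]>>(E is exp(S)), Scores, Exps),
    sum_list(Exps, Z),
    maplist([E, W]>>(W is E/Z), Exps, Weights).

% weighted_sum/3: mix the vectors according to the weights.
weighted_sum([W|Ws], [V|Vs], Out) :-
    maplist([X, Y]>>(Y is W*X), V, Acc0),
    foldl([Wi, Vi, A0, A]>>maplist([A0i, Xi, Ai]>>(Ai is A0i + Wi*Xi), A0, Vi, A),
          Ws, Vs, Acc0, Out).

% self_attend/3: one attention "read" of the sequence from the point of
% view of a single query vector.
self_attend(Vectors, Query, Out) :-
    maplist(dot(Query), Vectors, Scores),
    softmax(Scores, Weights),
    weighted_sum(Weights, Vectors, Out).

% ?- self_attend([[1,0],[0,1],[1,1]], [1,0], Out).
% Out ≈ [0.84, 0.58]  (the query ends up mixed mostly with the vectors it overlaps).
```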
Edit 17.03.2023
Disclaimer: I am not an expert in these new approaches, so take it with a grain of salt. I am just brainwriting some stuff I am seeing on the internet. The world currently seems to be spinning like crazy; according to Google Scholar the paper has ca. 68,000 citations!
So to be better than ChatGPT, you need to be better at meaning, or ideally better at both meaning and attention, also via machine learning? If you are only good at meaning for some common things, like 2 + 2 = 4, you will miss the less common things.