While machine learning is still an appropriate moniker for the sub-part of the AI space where systems are pre-trained or adaptively trained, it's unfortunate that the phrase artificial intelligence has also become attached to simple state machines and non-adaptive systems: basic recognition of a set of keywords, with action sets that are hard-coded against a limited set of user input variables.
My desire is to develop a system that learns dynamically from observation, re-training as that input is provided; a system that holds a set of long-term intents and attempts to fulfil them in the least impacting, lowest-effort manner, based on observations of the current state of whatever problem space it is given access to, combined with observation of its own actions and their impact.
I've decided to describe it as Level 2 below, and have come up with some simple categories for the intelligence of software:-
I don't believe this is easily achievable; however, you only learn by trying, and only a limited number of areas pose significant problems to implementation.
While this planned implementation pre-dates the release of GPT (it was planned in 2010), the tokeniser and generators have been easily updated based on the new approaches those models have displayed. In the diagram below, significant alignment with the new LLM approaches is possible; however, this implementation is not limited to textual or image generation and can instead be used for synthesis of other outputs. As of October 2023, the model for this system is up to 80 billion tokens in size and producing output comparable to GPT-3.
The primary issues with this design are currently the Output encoder and the Intent engine. The Intent engine lacks a means of deciding on the actual activity to perform to achieve a given change in the simulated model: it can simulate the change well enough, but, to take a game of noughts and crosses as an example, it doesn't know to actually draw a nought or a cross.
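To make the gap concrete, here is a minimal hypothetical sketch: the engine can decide what should change in its simulated model (an X appearing in a cell), but a hand-written mapping is still needed to turn that delta into a concrete output action (actually drawing the X). All names and the trivial "occupy the first empty cell" intent are illustrative, not part of the original design.

```python
def desired_state_delta(board, goal):
    """Pick the change that moves the simulated model toward the goal.
    Here the 'intent' is simply: occupy the first empty cell."""
    for i, cell in enumerate(board):
        if cell == " ":
            return {"cell": i, "value": goal}
    return None

def delta_to_action(delta):
    """The missing piece: translating a simulated change into a real
    action must currently be hard-coded per problem space."""
    row, col = divmod(delta["cell"], 3)
    return f"draw {delta['value']} at row {row}, column {col}"

board = ["X", "O", " ", " ", " ", " ", " ", " ", " "]
print(delta_to_action(desired_state_delta(board, "X")))
# draw X at row 0, column 2
```

The point of the sketch is that `delta_to_action` is problem-specific glue: nothing in the engine's own machinery tells it that "cell 2 becomes X" means drawing a cross at a particular screen position.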
The Observation decoder has the same issue: it can be sent a set of commands or state about a particular thing, but must be hard-coded to understand that it is being sent graphical or textual data representing a 2D board in the case of noughts and crosses. In the case of playing a 3D game, the only hope would be that the probability model would eventually learn that things move around, which is unrealistic; there's no way it would achieve that in a reasonable time period.
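A hypothetical sketch of the hard-coding the Observation decoder needs: it must be told in advance that the incoming text represents a 3x3 board in a fixed layout. Nothing here is learned; the format knowledge is baked in, which is exactly the limitation described above.

```python
def decode_board(text):
    """Turn a textual noughts-and-crosses board into model state.
    Assumes a fixed, hard-coded format: three lines, cells
    separated by '|'; any blank or missing cell becomes ' '."""
    rows = [line.split("|") for line in text.strip().splitlines()]
    return [cell.strip() or " " for row in rows for cell in row]

observation = """
X| |O
 |X|
O| |
"""
print(decode_board(observation))
# ['X', ' ', 'O', ' ', 'X', ' ', 'O', ' ', ' ']
```

Swap the input for anything other than this exact layout (a screenshot, a 3D scene, a differently delimited board) and the decoder is useless; that is the generalisation problem the paragraph above is pointing at.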
This is the problem that large numbers of academics are tackling, though, so it's fairly safe to assume someone will eventually find a resolution. In the interim, this system will simply use a lexical analyser in its place, so we're not stuck forever. If we complete the rest of the work, we can always revisit this if nobody else has solved the problem yet.
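The interim lexical-analyser stand-in could look something like the following sketch: rather than learning what an observation means, a regex-based tokeniser splits the raw input into labelled tokens for the rest of the pipeline. The token categories are illustrative assumptions, not taken from the original design.

```python
import re

# Ordered token categories; first match wins.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("WORD",   r"[A-Za-z]+"),
    ("SYMBOL", r"[^\sA-Za-z0-9]"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def lex(text):
    """Return (kind, value) pairs for each token in the observed text;
    whitespace between tokens is skipped."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]

print(lex("X to cell 5"))
# [('WORD', 'X'), ('WORD', 'to'), ('WORD', 'cell'), ('NUMBER', '5')]
```

This gives the rest of the system a uniform token stream to consume until a learned decoder can replace it.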