genAI [generative AI]
So far, we have studied many data mining algorithms (including NNs), all of which learn (model) patterns between inputs and outputs in existing data, and use those patterns to classify/calculate new outputs from new inputs.
So... what's this 'genAI' thing?
Generative AI, very loosely speaking, 'runs a neural network BACKWARDS'!
Rather than learn to classify new data using existing data, why not GENERATE new data instead?
Researchers tried this, but with unimpressive results.
In 2014, Ian Goodfellow had a much better idea than the then-'SOTA': why not pair up TWO NNs in opposition - one a generator (eager 'student'), and the other a discriminator (strict 'teacher')? His invention is called a 'GAN'.
GAN stands for Generative Adversarial Network
More: https://developers.google.com/machine-learning/gan/gan_structure and https://machinelearningmastery.com/what-are-generative-adversarial-networks-gans/
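A minimal sketch of the generator-vs-discriminator training loop (assuming PyTorch; the 2-D 'real' data and tiny layer sizes are made up purely for illustration):

import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))  # generator: noise -> fake data
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))   # discriminator: data -> real/fake score
opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 2) * 0.5 + 2.0             # stand-in 'real' data: a 2D Gaussian blob
    fake = G(torch.randn(64, 16))                     # the generator's forgeries
    # 1) train the strict 'teacher' (D): label real=1, fake=0
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()
    # 2) train the eager 'student' (G): try to make D score fakes as real
    g_loss = bce(D(G(torch.randn(64, 16))), torch.ones(64, 1))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()

The two losses pull in opposite directions - that tug-of-war is the 'adversarial' part, and it is what pushes G's outputs toward the real data distribution.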
An early use of genAI was/is to "add style" to imagery: https://arxiv.org/abs/1508.06576
Encoder: a function (NN) that maps original input to LATENT/ENCODED space [decoder is the reverse]
Autoencoder: encoder + decoder combination - can GENERATE NEW OUTPUT (by decoding an interpolated or random point in latent space - see the sketch after this list)!
Variational AE: the encoder produces a distribution rather than a single point [and the decoder decodes a point sampled from that distribution].
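A minimal autoencoder sketch (assuming PyTorch; the 784-D input and layer sizes are arbitrary stand-ins):

import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 2))  # input -> 2-D latent point
dec = nn.Sequential(nn.Linear(2, 128), nn.ReLU(), nn.Linear(128, 784))  # latent point -> reconstruction
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

x = torch.rand(64, 784)                  # stand-in for a batch of flattened images
for step in range(100):
    x_hat = dec(enc(x))                  # compress, then reconstruct
    loss = ((x - x_hat) ** 2).mean()     # reconstruction error drives the training
    opt.zero_grad(); loss.backward(); opt.step()

# 'generation': decode a point BETWEEN two encoded inputs
with torch.no_grad():
    z0, z1 = enc(x[0]), enc(x[1])
    new_output = dec(0.5 * (z0 + z1))    # interpolated latent point -> novel output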
'Attention is all you need': https://arxiv.org/abs/1706.03762
Revolution: a parallelizable self-attention mechanism that handles non-fixed-size input (i.e. computing affinities between words).
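A sketch of that self-attention computation in NumPy (toy sizes; real transformers add learned embeddings, multiple heads, and masking):

import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # scaled dot-product self-attention over a sequence X (n_words x d)
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])           # word-to-word affinities (n x n)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)    # softmax: each row sums to 1
    return weights @ V                               # each word becomes an affinity-weighted mix of ALL words

n, d = 5, 8                                          # 5 'words', 8-D embeddings
X = np.random.randn(n, d)
Wq, Wk, Wv = (np.random.randn(d, d) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)                  # (5 x 8) - works for ANY n, hence 'non-fixed size'

Every row of 'scores' can be computed independently - that is what makes the mechanism parallelizable (unlike an RNN, which must march through the sequence one word at a time).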
The decoder takes a prompt (a new point in latent space), INTERPOLATES over its training inputs (ALL of English!!), and generates output.
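A toy sketch of that generation loop - autoregressive decoding, one token at a time (next_token_probs() is a hypothetical stand-in for the trained decoder; a real one would condition on the prompt via attention):

import numpy as np

vocab = ["the", "cat", "sat", "on", "mat", "."]

def next_token_probs(tokens):
    # stand-in for the trained decoder: a probability distribution over the
    # vocab, conditioned on everything generated so far
    rng = np.random.default_rng(abs(hash(tuple(tokens))) % (2**32))
    p = rng.random(len(vocab))
    return p / p.sum()

prompt = ["the", "cat"]
for _ in range(4):                           # generate 4 more tokens, one at a time
    p = next_token_probs(prompt)
    prompt.append(vocab[int(np.argmax(p))])  # greedy pick; real systems sample with a 'temperature'
print(" ".join(prompt))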
Here is a great explanation of ChatGPT: https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/
The core pre-trained LLM needs to be augmented with RLHF (Reinforcement Learning from Human Feedback) [a form of fine-tuning] to produce acceptable responses.
As a result of the extensions listed above, there are bound to be numerous narrow-purpose GPTs, e.g. https://chat.openai.com/g/g-kCfSC3b10-analystgpt - in a way, it's back to the 'expert systems' AI of the mid-80s and early 90s :)
Here is an architecture useful for building at-scale RAG (Retrieval-Augmented Generation) apps: https://www.pinecone.io/learn/aws-reference-architecture/
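A minimal RAG sketch (embed() is a toy stand-in for a real embedding model, the NumPy 'index' stands in for a vector DB such as Pinecone, and ask_llm() is a hypothetical LLM call):

import numpy as np

def embed(text):
    # toy stand-in: a deterministic 64-D unit vector per text
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.random(64)
    return v / np.linalg.norm(v)

docs = ["GANs pair a generator with a discriminator.",
        "Autoencoders compress inputs to a latent space.",
        "Transformers use self-attention."]
index = np.stack([embed(d) for d in docs])     # the 'vector database'

query = "What is a GAN?"
scores = index @ embed(query)                  # cosine similarity (all vectors are unit length)
context = docs[int(np.argmax(scores))]         # retrieve the best-matching document

prompt = f"Using this context: '{context}', answer: {query}"
# ask_llm(prompt)   # hypothetical call to a hosted LLM

Retrieval grounds the LLM's answer in YOUR documents - a standard way to reduce hallucination (next link) and to give a general model narrow-domain expertise.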
'Hallucinate' (in the genAI sense) was the Cambridge Dictionary's 2023 Word of the Year: https://www.theguardian.com/books/2023/nov/15/hallucinate-cambridge-dictionary-word-of-the-year
GenAI Against Humanity: https://arxiv.org/pdf/2310.00737.pdf