Foundation models
A foundation model is a large artificial intelligence model trained on a vast quantity of unlabeled data (usually via AI/Self-supervised learning), resulting in a model that can be adapted to a wide range of downstream tasks.
# Resources
- https://en.wikipedia.org/wiki/Foundation_models
- https://en.wikipedia.org/wiki/Large_language_model
- Center for Research on Foundation Models (CRFM)
- Foundation Models and the Future of Multi-Modal AI
- Foundation models: 2022’s AI paradigm shift
- Foundation Models: paradigm shift for AI or mere rebranding?
- ChatGPT, LLMs, and Foundation models — a closer look into the hype and implications for startups
- AI Image Generators Compared Side-By-Side Reveals Stark Differences
- Illustrating Reinforcement Learning from Human Feedback (RLHF)
- Langchain JS | How to Use GPT-3, GPT-4 to Reference your own Data | OpenAI Embeddings Intro
# Language models
# AI-based conversational models and search engines
- Bing chat
- Claude
- ChatGPT
- Bard
- You chat
- Perplexity.ai
- Exper AI
- Neeva
- Humata - Ask AI anything about your files
- Explainpaper - Understand papers instantly
# Vision Models
# Courses
# References
- #PAPER On the Opportunities and Risks of Foundation Models (Bommasani 2021)
- A foundation model is any model that is trained on broad data at scale and can be adapted (e.g., fine-tuned) to a wide range of downstream tasks; current examples include BERT, GPT-3, and CLIP
- Foundation models are based on deep neural networks and self-supervised learning
- On a technical level, foundation models are enabled by transfer learning and scale
- The idea of transfer learning is to take the “knowledge” learned from one task (e.g., object recognition in images) and apply it to another task (e.g., activity recognition in videos).
- Within deep learning, pretraining is the dominant approach to transfer learning: a model is trained on a surrogate task (often just as a means to an end) and then adapted to the downstream task of interest via fine-tuning
- Transfer learning is what makes foundation models possible, but scale is what makes them powerful. Scale required three ingredients: (i) improvements in computer hardware; (ii) the development of the Transformer model architecture, which leverages the parallelism of the hardware to train much more expressive models than before; and (iii) the availability of much more training data
- #PAPER Florence: A New Foundation Model for Computer Vision (Yuan 2021)
- #PAPER Foundation Transformers (Wang 2022)
- #PAPER InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions (Wang 2022)
- #PAPER Towards artificial general intelligence via a multimodal foundation model (Fei 2022)
- #PAPER InstructGPT - Training language models to follow instructions with human feedback (Ouyang 2022)
- https://openai.com/blog/chatgpt
- https://en.wikipedia.org/wiki/ChatGPT
- See GPT-3 in AI/Deep learning/Transformers
- ChatGPT, a generative pre-trained transformer (GPT), was fine-tuned from GPT-3.5 using both Supervised learning and Reinforcement learning, with human trainers involved in each stage. In the supervised stage, the model was trained on conversations in which the trainers played both sides: the user and the AI assistant. In the reinforcement learning stage, human trainers first ranked responses that the model had generated in earlier conversations. These rankings were used to train ‘reward models’, and the model was then further fine-tuned against those reward models over several iterations of Proximal Policy Optimization (PPO).
- #PAPER ChatGPT is not all you need. A State of the Art Review of large Generative AI models (Gozalo-Brizuela 2023)
- #PAPER LLaMA: Open and Efficient Foundation Language Models (Touvron 2023)
- Introducing LLaMA: A foundational, 65-billion-parameter large language model
- Paper explained
- #CODE Dalai - The simplest way to run LLaMA on your local machine
- #PAPER GPT-4 Technical Report (OpenAI 2023)
- #PAPER ChatGPT: Jack of all trades, master of none (Kocoń 2023)
- #PAPER OpenChatKit (Together 2023)
- #PAPER Alpaca: A Strong, Replicable Instruction-Following Model (Taori 2023)
- Fine-tuned LLaMA 7B model on 52K instruction-following demonstrations generated using the OpenAI API
- Performance qualitatively similar to OpenAI’s text-davinci-003, while being surprisingly small and easy/cheap to reproduce (under $600)
- Alpaca dataset
- https://the-decoder.com/stanfords-alpaca-shows-that-openai-may-have-a-problem/
- https://github.com/tatsu-lab/stanford_alpaca#fine-tuning
- How to finetune your own Alpaca 7B
- https://huggingface.co/mrm8488/Alpacoom
- #PAPER Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models (Wu 2023)
- Visual ChatGPT opens the door to combining ChatGPT with Visual Foundation Models, enabling ChatGPT to handle complex visual tasks
- #PAPER Sparks of Artificial General Intelligence: Early experiments with GPT-4 (Bubeck 2023)
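
The pretrain-then-adapt recipe from the Bommasani 2021 notes above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `pretrained_backbone` is a hypothetical stand-in for a frozen pretrained encoder, the downstream task and data are invented, and only a small logistic-regression head is trained. Training only a head is a simplification of full fine-tuning, which would usually also update the backbone weights.

```python
import math


def pretrained_backbone(x):
    """Hypothetical frozen encoder: stands in for features learned during pretraining."""
    return [x[0] - x[1], x[0] + x[1]]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fine_tune_head(examples, epochs=300, lr=0.1):
    """Train only a logistic-regression head on top of the frozen backbone."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, label in examples:
            features = pretrained_backbone(x)  # the backbone itself is never updated
            logit = sum(w * f for w, f in zip(weights, features)) + bias
            error = sigmoid(logit) - label  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * error * f for w, f in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, x):
    features = pretrained_backbone(x)
    return int(sum(w * f for w, f in zip(weights, features)) + bias > 0)


# Toy downstream task (assumed for illustration): is the first number larger?
downstream_data = [([2, 1], 1), ([1, 2], 0), ([3, 0], 1), ([0, 3], 0)]
weights, bias = fine_tune_head(downstream_data)
predictions = [predict(weights, bias, x) for x, _ in downstream_data]
```

The downstream task needs only four labeled examples because the backbone already exposes a useful feature (the difference of the inputs). That is the point of transfer learning in miniature.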
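
The reward-model step in the InstructGPT/ChatGPT notes above (human rankings turned into a training signal) can be sketched as follows. The notes only say that rankings were used to create reward models; the pairwise loss below follows the formulation commonly associated with the InstructGPT paper and is shown here as an assumption. `toy_reward` is a hypothetical placeholder, not a real reward model, and the PPO stage is not shown.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_responses):
    """Turn a best-first ranking into (preferred, rejected) pairs."""
    return list(combinations(ranked_responses, 2))


def pairwise_loss(reward_preferred, reward_rejected):
    """-log(sigmoid(r_preferred - r_rejected)): small when the preferred response scores higher."""
    margin = reward_preferred - reward_rejected
    return math.log(1.0 + math.exp(-margin))


def toy_reward(response):
    """Hypothetical placeholder reward: a real reward model would be a trained network."""
    return float(len(response))


# One prompt, three responses ranked best-first by a human labeler (invented data).
ranking = ["a detailed, helpful answer", "a short answer", "no"]
pairs = ranking_to_pairs(ranking)
mean_loss = sum(pairwise_loss(toy_reward(a), toy_reward(b)) for a, b in pairs) / len(pairs)
```

A ranking of K responses yields K(K-1)/2 comparison pairs, so a single ranking of three responses gives three training comparisons. Minimizing the mean loss pushes the reward model to score preferred responses above rejected ones.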