+ 1
AI in Python
How do people make AIs like ChatGPT and DALL-E in Python? I get that they probably used PyTorch or something but still, what do you do to get a result like that? I know the basics of the process: 1. Code the foundations like how it recognizes input data and creates/chooses output data 2. Train on a LOT of data from articles to pictures of cats, AI needs a ton of training data 3. Deploy the AI! What do you do in step 1 to get the right results from step 2 (Especially if you plan on doing unsupervised training)? I want a deeper dive into what these AIs really are.
2 odpowiedzi
+ 6
ChatGPT and others use a large language mode (LLM). That's a highly advanced ability to parse human language and find results. You are not going to be able to recreate that on your own. You can write a simple chat bot, but large language models are massive programs developed by thousands of programmers.
The simplest solution to have an AI powered chat bot is to use the LLM provided by the openai API. They give you the ability to pass it text and let it generate responses to you programmatically. They do charge a fee, but it's not much. Check out https://platform.openai.com/docs/overview
You can train the AI to know unique facts about your application or products. But it already knows everything ChatGPT knows, so you only need to train it with extra stuff that's not already known on the Internet.
0
Hi,
It’s nearly impossible to recreate something like ChatGPT or DALL-E as a single developer, but open-source models like GPT-J or LLaMA can be a starting point. For example, AI Sweden uses similar models for public services, but finding and preparing training data is a major challenge.
DALL-E combines a language model (to understand text) with an image generation model trained using techniques like diffusion. Tools like PyTorch or TensorFlow are great for smaller projects, while APIs like OpenAI’s make building practical solutions faster. Even APIs require time and effort to fine-tune or integrate.
Creating a full-scale model is massive, but experimenting with smaller tools is a great way to start! For GPT-J, you’ll need a powerful PC (NVIDIA RTX 3090, 64 GB RAM). No hardware? Try cloud services like Google Colab or AWS to dive into AI without big upfront costs.