How Do You Learn Large AI Models from Scratch? Hands-On Notes from an AWS Architect in 2026
AI Summary (BLUF)
An AWS cloud architect shares her journey of learning AI from first principles after returning from maternity leave. She demonstrates using Amazon Bedrock to perform recipe summarization, risk analysis, and personalization via prompting. The article explains the difference between AI, foundation models, and LLMs, describes the architecture (prompt → model → response), highlights model limitations (hallucinations), and introduces Amazon Bedrock as a platform for accessing multiple foundation models.
Key Insight
This article, written from the perspective of a senior cloud architect returning from maternity leave, vividly demonstrates three core capabilities of LLMs — extraction, reasoning, and personalization — through a real-world recipe adaptation demo. The author deliberately avoids mathematics and advanced coding, rebuilding an AI mental model from first principles. This hands-on, scenario-first approach is highly instructive for both technical newcomers and experienced developers.
I just came back from maternity leave. And honestly? I felt like I'd missed a decade in six months. I talked about starting small in my other blog Lost in the AI Hype, I Started Small.
I've spent the last fifteen years designing cloud systems. And even I felt behind. AI went from a thing people were experimenting with to a thing everyone's apparently building with, and I had no idea where to start.
So I did what any architect would do. I went back to first principles.
I'm rebuilding my AI mental model from scratch in public. No math. No expert-level coding. Just real problems, the architecture underneath, and honest notes on where things might break.
If you prefer video, please watch Episode 1 of my video series. If you prefer reading, you're in the right place.
The demo: AI adapts a recipe in under a minute
Before any theory, let me show you what these models can actually do.
I opened the Amazon Bedrock Playground, pasted a real recipe, and asked three questions, each one pushing the model a little further:
1. Extract and summarise
"What are the core techniques in this recipe, strip off the fluff?"

Clean, fast, useful. You might think: that's a fancy Ctrl+F (search).
2. Interpret and advise
"Looking at this recipe, what's the thing that's most likely to go wrong for someone cooking it for the first time?"

Now we're somewhere a search tool genuinely can't go. The model is reasoning about the recipe, spotting the bit where people actually mess up.
3. Personalise
"I'm cooking this for six people on Saturday. One is vegan, one is gluten-free. Adapt the recipe, give me a shopping list, and a timeline starting from 4pm."

This is the moment. I asked it something I'd normally spend twenty minutes thinking through. It gave me a starting point in ten seconds.
If you're curious but not technical, that's already useful.
If you're a builder, you're probably already thinking: so what happened here?
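For builders: mechanically, the three prompts above are just three strings pushed through the same call. Here they are as plain data (the recipe text is a placeholder you would paste in yourself):

```python
# The three demo prompts as plain data: each is just text sent to the
# same model through the same call. RECIPE is a placeholder.
RECIPE = "(paste any real recipe here)"

PROMPTS = [
    # 1. Extract and summarise
    f"What are the core techniques in this recipe? Strip off the fluff.\n\n{RECIPE}",
    # 2. Interpret and advise
    f"What's most likely to go wrong for someone cooking this recipe "
    f"for the first time?\n\n{RECIPE}",
    # 3. Personalise
    f"I'm cooking this for six people on Saturday. One is vegan, one is "
    f"gluten-free. Adapt the recipe, give me a shopping list, and a "
    f"timeline starting from 4pm.\n\n{RECIPE}",
]
```

Nothing else changes between the three: same model, same call, only the text differs.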
So what actually happened?
Here's the architecture, as simply as I can put it.
I sent a piece of text, called a prompt, to a foundation model.
People throw around terms like AI, LLMs, and foundation models like they all mean the same thing, but they don't.
AI is the broad umbrella. It includes everything from recommendation engines and fraud detection systems to generative AI tools like ChatGPT.
Foundation models are a subset of AI: large models trained on massive datasets that can be adapted for different tasks. These aren't just text models; they can generate images, video, speech, code, and more. Platforms like Amazon Bedrock give access to many of these models.
LLMs (Large Language Models) are a specific type of foundation model built for language tasks like answering questions, summarizing text, writing, or coding. So in my recipe demo, I was technically interacting with an LLM.
The simplest way to think about it:
AI → Foundation Models → LLMs
So in our case, it means a big model trained on a huge mix of data for general, day-to-day use.
The model is a piece of software trained on an enormous amount of text: books, articles, code, conversations. It is not searching the internet. It learned patterns from all that text beforehand.
When I give it my prompt, it predicts the most useful response based on everything it learned.
Input (prompt) → Foundation Model → Output (response)
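To make "predicts the most useful response from learned patterns" concrete, here is a toy sketch (nothing remotely like a real LLM) that learns which word tends to follow which in a tiny corpus, then predicts the next word:

```python
# Toy "language model": count which word follows which (a bigram table),
# then predict by picking the most frequent follower seen in training.
from collections import Counter, defaultdict

corpus = "the model predicts the next word and the next word follows the pattern".split()

following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequent follower of `word` in the training text."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # "next" appears most often after "the"
```

A foundation model does this across billions of parameters and whole sequences rather than single word pairs, but the shape is the same: patterns learned up front during training, prediction at request time.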
I've been building distributed systems for years, and a foundation model call is simpler than most of the APIs I'm used to. It's an HTTP request with text in, text out.
The complexity isn't in the call itself; it is in what the model learned before you or I ever showed up.
And this exact loop is what the entire current wave of AI is built on.
Every time you see a new Claude, or GPT, or Llama land, what's actually happening is someone trained a bigger or smarter version of this same idea.
Same loop. More data. Better prediction.
Where it breaks
The model doesn't know if it's right. It's predicting what a useful answer looks like. Sometimes that prediction is brilliant. Sometimes it invents something that sounds plausible and is completely wrong.
Every time you use one of these tools, ask yourself: what would I need to double-check before I trusted this?
That question is the single most useful habit you can build right now. We'll dig into why this happens in the next post.
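That double-checking habit can even start as code. A crude sketch (not a real hallucination detector, just the habit made literal) that flags items in a model-generated list with no literal support in the source text you provided:

```python
# Flag model-generated items that never appear in the source text.
# A blunt instrument: it only catches literal inventions, but it turns
# "what would I need to double-check?" into a runnable question.

def flag_unsupported(items: list[str], source_text: str) -> list[str]:
    """Return the items with no literal match in the source."""
    lowered = source_text.lower()
    return [item for item in items if item.lower() not in lowered]

recipe = "Whisk eggs with flour, fold in spinach, bake 20 minutes."
model_list = ["eggs", "flour", "spinach", "truffle oil"]  # last one is invented
print(flag_unsupported(model_list, recipe))  # ['truffle oil']
```

Real verification needs more than substring matching, of course, but the point stands: the model's output should be checked against something you trust before you act on it.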
Where the models live: Amazon Bedrock
You might've noticed I wasn't using ChatGPT or Claude's own website. I was using **Amazon Bedrock**.
Bedrock is where a bunch of foundation models live on AWS. Anthropic's Claude, Meta's Llama, Mistral, and Amazon's own models are all callable through Bedrock; no need to run or train anything yourself.
The Playground is the easy door in, just type and go. Later in this series, when we start building, we'll call these same models from code. Same models, different door.
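As a preview of that other door, here is a minimal sketch using boto3's Bedrock Runtime Converse API. The model ID is only an example, and the sketch assumes your AWS credentials and Bedrock model access are already configured:

```python
# Minimal "text in, model, text out" sketch against Amazon Bedrock.
# Assumes configured AWS credentials and Bedrock model access; the
# model ID is an example and may differ in your region/account.

def build_request(prompt: str, max_tokens: int = 512) -> dict:
    """Shape a Converse-style request body: the prompt is just text."""
    return {
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens},
    }

def ask_model(prompt: str,
              model_id: str = "anthropic.claude-3-haiku-20240307-v1:0") -> str:
    """Send the prompt to Bedrock and return the model's text reply."""
    import boto3  # needs live AWS credentials at call time
    client = boto3.client("bedrock-runtime")
    response = client.converse(modelId=model_id, **build_request(prompt))
    return response["output"]["message"]["content"][0]["text"]

# Example (requires live credentials):
# print(ask_model("Summarise in one line: text in, model, text out."))
```

That's the whole loop from the diagram above: an HTTP request with text in, text out. Everything later in this series builds on this call.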
A note on my stack
I work at AWS. So the tools I use in this series are AWS tools like Bedrock for the models, and later, an AI-powered IDE called Kiro for building.
The concepts, though, aren't AWS specific. Foundation models, tokens, context windows, RAG, agents, these work the same way on any cloud. I'm showing you my stack. And honestly, I'm still figuring out which parts of it are great and which parts are a pain. You'll know which is which.
Try it yourself
If you're just getting started: open any AI chat tool (Bedrock Playground, Claude, ChatGPT, whatever you have access to), paste a recipe, a contract, or a long email, and ask it three questions:
- One to summarise.
- One to interpret.
- One that's personal to you.
See what happens. That's your homework.
If you're more on the builder side, the mental model is simple: text in, model, text out. Everything we build in this series is a variation on that loop.
What's next
Next up: when AI sounds confident and is completely wrong. Why it happens, how to spot it, how to stop it.
This is a series. I'm learning this in public, building as I go, and being honest when things don't work. If that sounds useful, please follow along.
This post is part of the "Learning AI Out Loud" series, a cloud architect learning AI from first principles. Watch the video version or follow the series on dev.to.
FAQ
Which three core LLM capabilities does the article demonstrate?
Extraction (summarising the recipe), reasoning (analysing risk points), and personalisation (adapting the recipe to dietary restrictions), all achieved through prompts in Amazon Bedrock.
What role does Amazon Bedrock play in the article?
It is a platform offering multiple foundation models, letting users call an LLM via prompts for extraction, reasoning, and personalisation without managing the underlying infrastructure.
Who is this article for?
Both technical newcomers and experienced developers, especially anyone who wants to understand how LLMs work starting from business scenarios, without relying on mathematics or advanced coding.