Fine-Tuning GPT-OSS-20B on Google Colab Using Unsloth and LoRA

Posted on October 19, 2025 by Hieu Pham Pro

1. Introduction

In today’s rapidly advancing field of AI, the use of AI models — or more specifically, running them on personal computers — has become more common than ever.
However, some AI models have become increasingly difficult to use because the training data required for them is massive, often involving millions of parameters.
This makes it nearly impossible for low-end computers to use them effectively for work or projects.

Therefore, in this article, we will explore Google Colab together with Unsloth’s fine-tuning tool, combined with LoRA, to fine-tune and use gpt-oss-20b according to our own needs.

Quick Navigation
2. Main Content
3. Using Colab to Train gpt-oss-20b
4. Conclusion & Next Steps

2. Main Content

a. What is Unsloth?

Unsloth is a modern Python library designed to speed up and optimize the fine-tuning of large language models (LLMs) such as LLaMA, Mistral, Mixtral, and others.
It makes model training and fine-tuning extremely fast, memory-efficient, and easy — even on limited hardware like a single GPU or consumer-grade machines.

b. What is Colab?

Colab is a hosted Jupyter Notebook service that requires no setup and provides free access to computing resources, including GPUs and TPUs.
It is particularly well-suited for machine learning, data science, and education purposes.

c. What is LoRA?

Low-Rank Adaptation (LoRA) is a technique for quickly adapting machine learning models to new contexts.
LoRA helps make large and complex models more suitable for specific tasks. It works by adding lightweight layers to the original model rather than modifying the entire architecture.
This allows developers to quickly expand and specialize machine learning models for various applications.

3. Using Colab to Train gpt-oss-20b

– Installing the Libraries

!pip install --upgrade -qqq uv

try:
    import numpy
    install_numpy = f"numpy=={numpy.__version__}"
except:
    install_numpy = "numpy"

!uv pip install -qqq \
  "torch>=2.8.0" "triton>=3.4.0" {install_numpy} \
  "unsloth_zoo[base] @ git+https://github.com/unslothai/unsloth-zoo" \
  "unsloth[base] @ git+https://github.com/unslothai/unsloth" \
  torchvision bitsandbytes \
  git+https://github.com/huggingface/[email protected] \
  git+https://github.com/triton-lang/triton.git@05b2c186c1b6c9a08375389d5efe9cb4c401c075#subdirectory=python/triton_kernels

– After completing the installation, load the gpt-oss-20b model from Unsloth:

from unsloth import FastLanguageModel
import torch

max_seq_length = 1024
dtype = None
model_name = "unsloth/gpt-oss-20b"

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = model_name,
    dtype = dtype,                 # None for auto detection
    max_seq_length = max_seq_length,  # Choose any for long context!
    load_in_4bit = True,           # 4 bit quantization to reduce memory
    full_finetuning = False,       # [NEW!] We have full finetuning now!
    # token = "hf_...",            # use one if using gated models
)

– Adding LoRA for Fine-Tuning

model = FastLanguageModel.get_peft_model(
    model,
    r = 8,  # Choose any number > 0! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,              # Optimized fast path
    bias = "none",                 # Optimized fast path
    # "unsloth" uses less VRAM, fits larger batches
    use_gradient_checkpointing = "unsloth",  # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,
    loftq_config = None,
)

Tip: If you hit out-of-memory (OOM), reduce max_seq_length, set a smaller r, or increase gradient_accumulation_steps.

– Testing the Model Before Fine-Tuning

Now, let’s test how the model responds before fine-tuning:

messages = [
    {"role": "system", "content": "Bạn là Shark B, một nhà đầu tư nổi tiếng, thẳng thắn và thực tế", "thinking": None},
    {"role": "user", "content": "Bạn hãy giới thiệu bản thân"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt = True,
    return_tensors = "pt",
    return_dict = True,
    reasoning_effort = "low",
).to(model.device)

from transformers import TextStreamer
_ = model.generate(**inputs, max_new_tokens = 512, streamer = TextStreamer(tokenizer))

– Load data for finetune model

Dataset sample

def formatting_prompts_func(examples):
    convos = examples["messages"]
    texts = [tokenizer.apply_chat_template(convo, tokenize = False, add_generation_prompt = False) for convo in convos]
    return { "text" : texts, }

from datasets import load_dataset
dataset = load_dataset("json", data_files="data.jsonl", split="train")
dataset

from unsloth.chat_templates import standardize_sharegpt
dataset = standardize_sharegpt(dataset)
dataset = dataset.map(formatting_prompts_func, batched = True)

– Train model

The following code snippet defines the configuration and setup for the fine-tuning process.
Here, we use SFTTrainer and SFTConfig from the trl library to perform Supervised Fine-Tuning (SFT) on our model.
The configuration specifies parameters such as batch size, learning rate, optimizer type, and number of training epochs.

from trl import SFTConfig, SFTTrainer

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    args = SFTConfig(
        per_device_train_batch_size = 1,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        num_train_epochs = 1,  # Set this for 1 full training run.
        # max_steps = 30,
        learning_rate = 2e-4,
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        report_to = "none",  # Use this for WandB etc.
    ),
)

trainer_stats = trainer.train()

– After training, try the fine-tuned model

# Example reload (set to True to run)
if False:
    from unsloth import FastLanguageModel
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name = "finetuned_model",  # YOUR MODEL YOU USED FOR TRAINING
        max_seq_length = 1024,
        dtype = None,
        load_in_4bit = True,
    )

    messages = [
        {"role": "system", "content": "Bạn là Shark B, một nhà đầu tư nổi tiếng, thẳng thắn và thực tế", "thinking": None},
        {"role": "user", "content": "Bạn hãy giới thiệu bản thân"},
    ]

    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt = True,
        return_tensors = "pt",
        return_dict = True,
        reasoning_effort = "low",
    ).to(model.device)

    from transformers import TextStreamer
    _ = model.generate(**inputs, max_new_tokens = 512, streamer = TextStreamer(tokenizer))

Note: Replace finetuned_model with your actual model path (e.g., outputs or the directory you saved/merged adapters to).

Colab notebook: Open your Colab here.

4. Conclusion & Next Steps

By combining Unsloth (for speed and memory efficiency), LoRA (for lightweight adaptation), and Google Colab (for accessible compute), you can fine-tune gpt-oss-20b even on modest hardware. The workflow above helps you:

Install a reproducible environment with optimized kernels.
Load gpt-oss-20b in 4-bit to reduce VRAM usage.
Attach LoRA adapters to train only a small set of parameters.
Prepare chat-style datasets and run supervised fine-tuning with TRL’s SFTTrainer.
Evaluate before/after to confirm your improvements.

Open the Colab
Clone the notebook, plug in your dataset, and fine-tune your own assistant in minutes.

Shipping with Codex

Posted on October 14, 2025October 14, 2025 by Dai Ha

Codex: Kỹ Sư Phần Mềm AI Đã Tạo Ra “Sự Thay Đổi Cảm Hứng Lớn” (Vibe Shift) Trong Lập Trình

Gần đây, tại OpenAI, chúng ta đã chứng kiến một điều phi thường: một “sự thay đổi cảm hứng lớn” (vibe shift) trong cách chúng tôi xây dựng phần mềm. Kể từ tháng Tám, mức độ sử dụng Codex—kỹ sư phần mềm AI của chúng tôi—đã tăng gấp mười lần. Codex không chỉ là một công cụ; nó giống như một đồng nghiệp con người mà bạn có thể lập trình đôi, giao phó công việc hoặc đơn giản là để nó tự động thực hiện các nhiệm vụ phức tạp.

Sự bùng nổ này không phải ngẫu nhiên. Nó là kết quả của hàng loạt cập nhật lớn, biến Codex thành một tác nhân mạnh mẽ, an toàn và trực quan hơn, hoạt động trên mọi nền tảng mà bạn xây dựng.

1. Những Cập Nhật Đã Tạo Nên Sự Thay Đổi Lớn

Đại Tu Hoàn Toàn Tác Nhân (Agent Overhaul)

Chúng tôi định nghĩa tác nhân Codex là sự kết hợp của hai yếu tố: mô hình lý luận và bộ công cụ (harness) cho phép nó hành động và tạo ra giá trị.

Mô Hình Nâng Cấp: Ban đầu, chúng tôi đã ra mắt GPT-5, mô hình tác nhân tốt nhất của mình. Dựa trên phản hồi, chúng tôi tiếp tục tối ưu hóa và cho ra mắt GPT-5 Codex, một mô hình được tinh chỉnh đặc biệt cho công việc mã hóa. Người dùng mô tả nó như một “kỹ sư cấp cao thực thụ” vì nó không ngại đưa ra phản hồi thẳng thắn và từ chối những ý tưởng tồi.
Hệ Thống Công Cụ Mới (Harness): Chúng tôi đã viết lại hoàn toàn bộ công cụ để tận dụng tối đa các mô hình mới. Hệ thống này bổ sung các tính năng quan trọng như lập kế hoạch (planning), nén tự động bối cảnh (autoco compaction)—cho phép các cuộc trò chuyện và tương tác cực kỳ dài—và hỗ trợ cho MCP (Multi-Context Protocol).

Trải Nghiệm Người Dùng Được Tinh Chỉnh

Dù mô hình và tác nhân mạnh mẽ, phản hồi ban đầu cho thấy giao diện dòng lệnh (CLI) còn “sơ khai”.

CLI Revamp: Chúng tôi đã đại tu CLI, đơn giản hóa chế độ phê duyệt (approvals modes), tạo ra giao diện người dùng dễ đọc hơn và thêm nhiều chi tiết tinh tế. Quan trọng nhất, Codex CLI hiện mặc định hoạt động với sandboxing (môi trường hộp cát), đảm bảo an toàn theo mặc định trong khi vẫn trao toàn quyền kiểm soát cho người dùng.
Tiện Ích Mở Rộng IDE: Để hỗ trợ người dùng muốn xem và chỉnh sửa code cùng lúc với việc cộng tác với Codex, chúng tôi đã phát hành một extension bản địa (native extension) cho IDE. Tiện ích này hoạt động với VS Code, Cursor và các bản fork phổ biến khác. Nó đã bùng nổ ngay lập tức, thu hút 100.000 người dùng trong tuần đầu tiên—bằng cách sử dụng cùng một tác nhân mạnh mẽ và bộ công cụ mã nguồn mở (open-source harness) đã cung cấp sức mạnh cho CLI.
Codex Cloud Nhanh Hơn: Chúng tôi đã nâng cấp Codex Cloud để chạy nhiều tác vụ song song, tăng tốc độ tác vụ Cloud lên 90%. Tác vụ Cloud giờ đây có thể tự động thiết lập các phụ thuộc, thậm chí xác minh công việc bằng cách chụp ảnh màn hình và gửi cho bạn.

Codex Hoạt Động Mọi Nơi

Codex giờ đây hoạt động tích hợp sâu vào quy trình làm việc của bạn:

Slack và GitHub: Codex có thể được giao nhiệm vụ trực tiếp trong các công cụ cộng tác như Slack. Nó nhận toàn bộ bối cảnh từ luồng trò chuyện, tự mình khám phá vấn đề, viết code, và đăng giải pháp cùng với bản tóm tắt chỉ sau vài phút.
Review Code Độ Tin Cậy Cao: Việc đánh giá và duyệt code đang trở thành nút thắt cổ chai lớn. Chúng tôi đã huấn luyện GPT-5 Codex đặc biệt để thực hiện review code cực kỳ kỹ lưỡng (ultra thorough). Nó khám phá toàn bộ code và các phụ thuộc bên trong container của mình, xác minh ý định và việc triển khai. Kết quả là những phát hiện có tín hiệu rất cao (high signal), đến mức nhiều đội đã bật nó theo mặc định, thậm chí cân nhắc bắt buộc.

2. Codex Đang Thúc Đẩy OpenAI Như Thế Nào

Kết quả nội bộ tại OpenAI là minh chứng rõ ràng nhất cho sức mạnh của Codex:

92% nhân viên kỹ thuật của OpenAI sử dụng Codex hàng ngày (tăng từ 50% vào tháng Bảy).
Các kỹ sư sử dụng Codex nộp 70% nhiều PR (Pull Requests) hơn mỗi tuần.
Hầu như tất cả các PR đều được Codex review. Khi nó tìm thấy lỗi, các kỹ sư cảm thấy hào hứng vì nó giúp tiết kiệm thời gian và tăng độ tin cậy khi triển khai.

3. Các Quy Trình Làm Việc Thực Tế Hàng Ngày

Các kỹ sư của chúng tôi đã chia sẻ những ví dụ thực tế về cách họ sử dụng Codex để giải quyết các vấn đề phức tạp.

Trường Hợp 1: Lặp Lại Giao Diện Người Dùng (UI) Với Bằng Chứng Hình Ảnh (Nacho)

Nacho, kỹ sư iOS, đã chia sẻ quy trình làm việc tận dụng tính năng đa phương thức (multimodal) của Codex:

Vấn Đề: Trong công việc front-end, 10% công việc đánh bóng cuối cùng—như căn chỉnh header/footer—thường chiếm đến 90% thời gian.
Giải Pháp: Nacho giao cho Codex nhiệm vụ triển khai UI từ một bản mockup. Khác với các agent cũ (được ví như “kỹ sư tập sự”), Codex (được ví như “kỹ sư cấp cao”) xác minh công việc của nó.
Quy Trình TDD & Multimodal:
1. Nacho cung cấp cho Codex một công cụ đơn giản: một script Python (do Codex viết) để trích xuất các snapshot (ảnh chụp giao diện) từ các SwiftUI Previews.
2. Nó được hướng dẫn sử dụng công cụ này để xác minh trực quan code UI mà nó viết.
3. Codex lặp đi lặp lại: Viết code > Chạy test/Snapshot > Sửa lỗi cho đến khi giao diện đạt đến độ hoàn hảo về pixel (pixel perfect).
Kết Quả: Nacho có thể để Codex làm việc trên những chi tiết nhỏ (10% độ đánh bóng) trong khi anh làm những việc khác, biết rằng nó sẽ tự kiểm tra công việc của mình bằng hình ảnh.

Trường Hợp 2: Mở Rộng Giới Hạn Tác Vụ Lớn (Fel)

Fel, được biết đến là người có phiên làm việc Codex lâu nhất (hơn bảy giờ) và xử lý nhiều token nhất (hơn 150 triệu), đã chứng minh cách anh thực hiện các tác vụ refactor lớn chỉ với vài lời nhắc.

Vấn Đề: Thực hiện một refactor lớn (như thay đổi 15.000 dòng code) trong các dự án phức tạp (như bộ phân tích JSON cá nhân của anh) thường dẫn đến việc tất cả các bài kiểm tra thất bại trong thời gian dài.
Giải Pháp: Kế Hoạch Thực Thi (Exec Plan):
1. Fel yêu cầu Codex viết một đặc tả (spec)—được gọi là plans.md—để triển khai tính năng, giao cho nó nhiệm vụ nghiên cứu thư viện và cách tích hợp.
2. Anh định nghĩa plans.md là một “tài liệu thiết kế sống” (living document) mà Codex phải liên tục cập nhật, bao gồm mục tiêu lớn, danh sách việc cần làm, tiến trình, và nhật ký quyết định (decision log).
3. Anh sử dụng thuật ngữ neo “exec plan” để đảm bảo mô hình biết khi nào cần tham chiếu và phản ánh lại tài liệu này.
4. Sau khi Fel phê duyệt kế hoạch, anh ra lệnh: “Implement” (Thực thi).
Kết Quả: Codex có thể làm việc một cách hiệu quả trong nhiều giờ (thậm chí hơn một giờ trong buổi demo) trên một tính năng lớn, sử dụng plans.md như bộ nhớ và kim chỉ nam. Trong một phiên, nó đã tạo ra 4.200 dòng code chỉ trong khoảng một giờ—mọi thứ đều được kiểm tra và vượt qua.

Trường Hợp 3: Vòng Lặp Sửa Lỗi và Review Tại Chỗ (Daniel)

Daniel, một kỹ sư trong nhóm Codex, đã giới thiệu quy trình làm việc slash review mới, đưa khả năng review code chất lượng cao của GPT-5 Codex xuống môi trường cục bộ (local).

Vấn Đề: Ngay cả sau khi hoàn thành code, các kỹ sư cần một bộ mắt mới không bị thiên vị để tìm ra các lỗi khó.
Giải Pháp: Slash Review: Trước khi gửi PR, Daniel sử dụng lệnh /review trong CLI.
- Anh chọn duyệt so với nhánh gốc (base branch), tương tự như một PR.
- GPT-5 Codex bắt đầu luồng review chuyên biệt: Nó nghiên cứu sâu các tập tin, tìm kiếm các lỗi kỹ thuật, và thậm chí viết/chạy các script kiểm tra để xác minh các giả thuyết lỗi trước khi báo cáo.
- Mô hình thiên vị: Luồng review chạy trong một luồng riêng biệt, có bối cảnh mới mẻ (fresh context), loại bỏ bất kỳ thiên vị triển khai (implementation bias) nào từ cuộc trò chuyện trước.
Vòng Lặp Sửa Lỗi: Khi Codex tìm thấy một vấn đề P0/P1, Daniel chỉ cần gõ “Please fix”.
Kết Quả: Codex sửa lỗi, và Daniel có thể chạy /review lần nữa cho đến khi nhận được “thumbs up” (chấp thuận) cuối cùng. Điều này đảm bảo code được kiểm tra kỹ lưỡng, được sửa lỗi cục bộ trước khi push, tiết kiệm thời gian và đảm bảo độ tin cậy cao hơn.

Ba chức năng chính của Codex, được nhấn mạnh trong bài thuyết trình, là:

Lập Trình Đôi và Triển Khai Code (Implementation & Delegation):
- Codex hoạt động như một đồng đội lập trình đôi trong IDE/CLI, giúp bạn viết code nhanh hơn.
- Nó có thể nhận ủy quyền (delegate) các tác vụ lớn hơn (như refactor hoặc thêm tính năng) và tự thực hiện trong môi trường Cloud/Sandboxing, bao gồm cả việc tự thiết lập dependencies và chạy song song.
Xác Minh và Kiểm Thử Tự Động (Verification & TDD):
- Codex tích hợp sâu với quy trình Test-Driven Development (TDD).
- Nó không chỉ viết code mà còn tự động chạy các bài kiểm thử (unit tests) và xác minh đa phương thức (ví dụ: tạo và kiểm tra snapshot UI) để đảm bảo code hoạt động chính xác và đạt độ hoàn hảo về mặt hình ảnh (pixel perfect).
Review Code Độ Tin Cậy Cao (High-Signal Code Review):
- Sử dụng mô hình GPT-5 Codex được tinh chỉnh, nó thực hiện review code cực kỳ kỹ lưỡng (ultra thorough) trên GitHub PR hoặc cục bộ thông qua lệnh /review.
- Chức năng này giúp tìm ra các lỗi kỹ thuật khó và có thể được sử dụng trong vòng lặp Review -> Fix -> Review để đảm bảo chất lượng code trước khi merge, tiết kiệm thời gian và tăng độ tin cậy khi triển khai.

Link video: https://www.youtube.com/watch?v=Gr41tYOzE20

Lộ Trình Học Tập Tối Ưu cho Quản Lý Sản Phẩm AI

Posted on October 14, 2025 by Phat Ly

Bài viết gốc: “The Ultimate AI PM Learning Roadmap” của Paweł Huryn

Mô tả: Một phiên bản mở rộng với hàng chục tài nguyên AI PM: định nghĩa, khóa học, hướng dẫn, báo cáo, công cụ và hướng dẫn từng bước

Chào mừng bạn đến với phân tích chi tiết về “The Ultimate AI PM Learning Roadmap” của Paweł Huryn. Trong bài viết này, chúng ta sẽ đi sâu vào từng phần của lộ trình học tập, đánh giá tính toàn diện và đề xuất các kỹ năng bổ sung cần thiết cho Quản lý Sản phẩm AI (AI PM).

1Các Khái Niệm Cơ Bản về AI

Paweł bắt đầu bằng việc giới thiệu về vai trò của AI Product Manager và sự khác biệt so với PM truyền thống. Đây là nền tảng quan trọng để hiểu rõ về lĩnh vực này.

Điểm chính:

Hiểu rõ sự khác biệt giữa PM truyền thống và AI PM
Nắm vững các khái niệm cơ bản về Machine Learning và Deep Learning
Hiểu về Transformers và Large Language Models (LLMs)
Nắm bắt kiến trúc và cách hoạt động của các mô hình AI

Tài nguyên miễn phí:

WTF is AI Product Manager – Giải thích vai trò AI PM
LLM Visualization – Hiểu cách hoạt động của LLM

Bắt đầu với việc hiểu AI Product Manager là gì. Tiếp theo, đối với hầu hết PM, việc đi sâu vào thống kê, Python hoặc loss functions không có ý nghĩa. Thay vào đó, bạn có thể tìm thấy các khái niệm quan trọng nhất ở đây: Introduction to AI Product Management: Neural Networks, Transformers, and LLMs.

[Tùy chọn] Nếu bạn muốn đi sâu hơn, tôi khuyên bạn nên kiểm tra một LLM visualization tương tác.

2Prompt Engineering

Hướng dẫn Prompt Engineering cho AI Product Management

52% người Mỹ trưởng thành sử dụng LLMs. Nhưng rất ít người biết cách viết prompt tốt.

Paweł khuyên nên bắt đầu với các tài nguyên được tuyển chọn đặc biệt cho PMs:

Tài nguyên được đề xuất:

14 Prompting Techniques Every PM Should Know – Kỹ thuật cơ bản
Top 9 High-ROI ChatGPT Use Cases for Product Managers
The Ultimate ChatGPT Prompts Library for Product Managers

Tài nguyên miễn phí khác (Tùy chọn):

Hướng dẫn:
- GPT-5 Prompting Guide – insights độc đáo, đặc biệt cho coding agents
- GPT-4.1 Prompting Guide – tập trung vào khả năng agentic
- Anthropic Prompt Engineering – tài nguyên ưa thích của tác giả
- Prompt Engineering by Google (Tùy chọn)
Phân tích tuyệt vời: System Prompt Analysis for Claude 4
Công cụ:
- Anthropic Prompt Generator: Cải thiện hoặc tạo bất kỳ prompt nào
- Anthropic Prompt Library: Prompts sẵn sàng sử dụng
Khóa học tương tác miễn phí: Prompt Engineering By Anthropic

3Fine-Tuning

Quy trình Fine-tuning trong AI Product Management

Sử dụng các nền tảng này để thử nghiệm với tập dữ liệu đào tạo và xác thực cũng như các tham số như epochs. Không cần coding:

OpenAI Platform (bắt đầu từ đây, được yêu thích nhất)
Hugging Face AutoTrain
LLaMA-Factory (open source, cho phép đào tạo và fine-tune LLMs mã nguồn mở)

Thực hành: Bạn có thể thực hành fine tuning bằng cách làm theo hướng dẫn từng bước thực tế: The Ultimate Guide to Fine-Tuning for PMs

4RAG (Retrieval-Augmented Generation)

Kiến trúc RAG cho AI PM

RAG, theo định nghĩa, yêu cầu một nguồn dữ liệu cộng với một LLM. Và có hàng chục kiến trúc có thể.

Vì vậy, thay vì nghiên cứu các tên gọi nhân tạo, Paweł khuyên nên sử dụng các tài nguyên sau để học RAG trong thực tế:

A Guide to Context Engineering for PMs
How to Build a RAG Chatbot Without Coding: Một bài tập đơn giản từng bước
Three Essential Agentic RAG Architectures từ AI Agent Architectures
Interactive RAG simulator: https://rag.productcompass.pm/

5AI Agents & Agentic Workflows

Các công cụ cho AI Agents và Agentic Workflows

AI agents là chủ đề bạn có thể học tốt nhất bằng cách thực hành. Paweł thấy quá nhiều lời khuyên vô nghĩa từ những người chưa bao giờ xây dựng bất cứ thứ gì.

Công cụ ưa thích: n8n

Công cụ ưa thích của Paweł, cho phép bạn:

Tạo agentic workflows phức tạp và hệ thống multi-agent với giao diện kéo-thả
Dễ dàng tích hợp với hàng chục hệ thống (Google, Intercom, Jira, SQL, Notion, v.v.)
Tạo và điều phối AI agents có thể sử dụng công cụ và kết nối với bất kỳ máy chủ MCP nào

Bạn có thể bắt đầu với các hướng dẫn này:

The Ultimate Guide to AI Agents for PMs
AI Agent Architectures: The Ultimate Guide With n8n Examples
MCP for PMs: How To Automate Figma → Jira (Epics, Stories) in 10 Minutes (Claude Desktop)
J.A.R.V.I.S. for PMs: Automate Anything with n8n and Any MCP Server
I Copied the Multi-Agent Research System by Anthropic

[Tùy chọn] Các hướng dẫn và báo cáo miễn phí yêu thích:

Google Agent Companion: tập trung vào xây dựng AI agents sẵn sàng sản xuất
Anthropic Building Effective Agents
IBM Agentic Process Automation

6AI Prototyping & AI Building

Các công cụ AI Prototyping và Building

Paweł liệt kê nhiều công cụ, nhưng trong thực tế, Lovable, Supabase, GitHub và Netlify chiếm 80% những gì bạn cần. Bạn có thể thêm Stripe. Không cần coding.

Dưới đây là bốn hướng dẫn thực tế:

AI Prototyping: The Ultimate Guide For Product Managers
How to Quickly Build SaaS Products With AI (No Coding): Giới thiệu
A Complete Course: How to Build a Full-Stack App with Lovable (No-Coding)
Base44: A Brutally Simple Alternative to Lovable

[Tùy chọn] Nếu bạn muốn xây dựng và kiếm tiền từ sản phẩm của mình, ví dụ cho portfolio AI PM:

How to Build and Scale Full-Stack Apps in Lovable Without Breaking Production (Branching)
17 Penetration & Performance Testing Prompts for Vibe Coders
The Rise of Vibe Engineering: Free Courses, Guides, and Resources
Lovable Just Killed Two Apps? Create Your Own SaaS Without Coding in 2 Days

Khi xây dựng, hãy tập trung vào giá trị, không phải sự cường điệu. Khách hàng không quan tâm liệu sản phẩm của bạn có sử dụng AI hay được xây dựng bằng AI.

7Foundational Models

Các mô hình nền tảng AI

Khuyến nghị của Paweł (tháng 8/2025):

GPT-5 > GPT-4.1 > GPT-4.1-mini cho AI Agents
Claude Sonnet 4.5 cho coding
Gemini 2.5 Pro cho mọi thứ khác

Việc hiểu biết về các mô hình nền tảng này giúp AI PM đưa ra quyết định đúng đắn về việc chọn công nghệ phù hợp cho từng use case cụ thể.

8AI Evaluation Systems

Đánh giá là một phần quan trọng trong việc phát triển sản phẩm AI. Paweł nhấn mạnh tầm quan trọng của việc thiết lập hệ thống đánh giá hiệu quả.

Các yếu tố quan trọng:

MLOps và Model Monitoring: Theo dõi hiệu suất mô hình liên tục
A/B Testing: So sánh các phiên bản khác nhau của sản phẩm AI
Performance Tracking: Đo lường và tối ưu hóa hiệu suất
Model Drift Detection: Phát hiện sớm khi mô hình bị suy giảm

9AI Product Management Certification

Chứng nhận AI Product Management

Paweł đã tham gia chương trình cohort 6 tuần này vào mùa xuân 2024. Ông yêu thích việc networking và thực hành. Sau đó, ông tham gia cùng Miqdad với vai trò AI Build Labs Leader.

Chi tiết chương trình:

Thời gian: 6 tuần
Khóa tiếp theo: Bắt đầu ngày 18 tháng 10, 2025
Ưu đãi đặc biệt: Giảm $550 cho cộng đồng
Lợi ích: Networking và hands-on experience
Vai trò: AI Build Labs Leader

10AI Evals For Engineers & PMs

Khóa học AI Evals cho Engineers và PMs

Paweł đã tham gia cohort đầu tiên cùng với 700+ AI engineers và PMs. Ông không nghi ngờ gì rằng mọi AI PM phải hiểu sâu về evals. Và ông đồng ý với Teresa Torres:

Trích dẫn của Teresa Torres về AI Evaluation

Thông tin khóa học:

Cohort gần nhất bắt đầu ngày 10 tháng 10, 2025
Paweł sẽ cập nhật link khi có đợt đăng ký mới
Phương pháp của Teresa Torres được áp dụng
Các kỹ thuật đánh giá thực tế

11Visual Summary

Tóm tắt trực quan toàn bộ lộ trình học tập AI PM

Phân Tích và Đánh Giá

Sự Khác Biệt Giữa PM Truyền Thống và AI PM

Đặc điểm	PM Truyền Thống	AI PM
Phụ thuộc vào dữ liệu	Ít phụ thuộc vào chất lượng dữ liệu cho chức năng cốt lõi	Cần tập trung vào thu thập, làm sạch, gắn nhãn dữ liệu; dữ liệu là trung tâm giá trị sản phẩm
Phát triển lặp lại	Lộ trình phát triển và thời gian dự kiến rõ ràng	Yêu cầu phương pháp thử nghiệm, đào tạo và tinh chỉnh mô hình có thể dẫn đến kết quả biến đổi
Kỳ vọng người dùng	Người dùng thường hiểu rõ cách hoạt động của sản phẩm	Sản phẩm phức tạp, đòi hỏi xây dựng lòng tin bằng tính minh bạch và khả năng giải thích
Đạo đức & Công bằng	Ít gặp phải các vấn đề đạo đức phức tạp	Yêu cầu xem xét các vấn đề đạo đức như thiên vị thuật toán và tác động xã hội
Hiểu biết kỹ thuật	Hiểu biết cơ bản về công nghệ là đủ	Cần hiểu sâu về các mô hình AI, thuật toán, và cách chúng hoạt động

Đánh Giá Tính Toàn Diện

Điểm Mạnh:

Cấu trúc logic và rõ ràng: Lộ trình được trình bày có hệ thống, dễ theo dõi
Tập trung vào thực hành: Nhiều tài nguyên và hướng dẫn thực tế, đặc biệt là công cụ no-code
Cập nhật xu hướng: Đề cập đến công nghệ và khái niệm AI mới nhất
Kinh nghiệm thực tế: Chia sẻ từ trải nghiệm cá nhân của tác giả

Điểm Cần Bổ Sung:

Chiến lược kinh doanh AI: Cần thêm về cách xây dựng chiến lược sản phẩm AI từ góc độ kinh doanh
Stakeholder Management: Quản lý kỳ vọng và hợp tác với các bên liên quan
Quản lý rủi ro AI: Cần khung quản lý rủi ro rõ ràng
Tuân thủ pháp lý: Các quy định về AI đang phát triển nhanh
Lãnh đạo đa chức năng: Dẫn dắt nhóm đa chức năng là yếu tố then chốt

Kỹ Năng Bổ Sung Cần Thiết

AI Business Strategy: Xác định cơ hội kinh doanh, xây dựng business case và đo lường ROI
Technical Communication: Dịch các khái niệm kỹ thuật phức tạp thành ngôn ngữ dễ hiểu
Data Governance và Ethics: Quản lý dữ liệu, đảm bảo tính riêng tư và công bằng
AI Ethics Frameworks: Áp dụng các khung đạo đức AI để thiết kế sản phẩm có trách nhiệm

Khuyến Nghị Cuối Cùng

Lộ trình của Paweł Huryn là một điểm khởi đầu tuyệt vời. Để thực sự thành công trong vai trò AI PM, bạn cần:

Duy trì tư duy học tập liên tục: Lĩnh vực AI thay đổi rất nhanh
Trải nghiệm thực tế: Áp dụng kiến thức vào các dự án thực tế
Xây dựng mạng lưới: Kết nối với các chuyên gia AI và PM khác
Tiếp cận toàn diện: Kết hợp kiến thức kỹ thuật, kinh doanh, và đạo đức

Thanks for Reading!

Hy vọng lộ trình học tập này hữu ích cho bạn!

Thật tuyệt vời khi cùng nhau khám phá, học hỏi và phát triển.

Chúc bạn một tuần học tập hiệu quả!

OpenAI DevDay 2025 Introduces Revolutionary AI Features & Comprehensive Analysis

Posted on October 13, 2025October 13, 2025 by Phat Ly

OpenAI DevDay 2025

Revolutionary AI Features & Comprehensive Analysis

October 6, 2025 • San Francisco, CA

Event Information

📅

Date

October 6, 2025

📍

Location

Fort Mason, San Francisco

👥

Attendees

1,500+ Developers

🎤

Keynote Speaker

Sam Altman (CEO)

🌐

Official Website

openai.com/devday

🎥

Video Keynote

Watch on YouTube

💡

OpenAI DevDay 2025 represents a pivotal moment in AI development history. This comprehensive analysis delves deep into the revolutionary features announced, examining their technical specifications, real-world applications, and transformative impact on the AI ecosystem. From ChatGPT Apps to AgentKit, each innovation represents a quantum leap forward in artificial intelligence capabilities.

📋 Executive Summary

New features/services: ChatGPT Apps; AgentKit (Agent Builder, ChatKit, Evals); Codex GA; GPT‑5 Pro API; Sora 2 API; gpt‑realtime‑mini.
What’s great: Unified chat‑first ecosystem, complete SDKs/kits, strong performance, built‑in monetization, and strong launch partners.
Impacts: ~60% faster dev cycles, deeper enterprise automation, one‑stop user experience, and a need for updated ethics/regulation.
Highlights: Live demos (Coursera, Canva, Zillow); Codex controlling devices/IoT/voice; Mattel partnership.
ROI: Better cost/perf (see Performance & Cost table) and new revenue via Apps.

Revolutionary Features Deep Dive

📱

ChatGPT Apps

Native Application Integration Platform

Overview

ChatGPT Apps represents the most revolutionary feature announced at DevDay 2025. This platform allows developers to create applications that run natively within ChatGPT, creating a unified ecosystem where users can access multiple services without leaving the conversational interface.

Core Capabilities

Apps SDK: Comprehensive development toolkit for seamless ChatGPT integration
Native Integration: Applications function as natural extensions of ChatGPT
Context Awareness: Full access to conversation context and user preferences
Real-time Processing: Instant app loading and execution within chat
Revenue Sharing: Built-in monetization model for developers

Technical Specifications

Status: Preview (Beta) – Limited access

API Support: RESTful API, GraphQL, WebSocket

Authentication: OAuth 2.0, API Keys, JWT tokens

Deployment: Cloud-native with auto-scaling

Performance: < 200ms app launch time

Security: End-to-end encryption, SOC 2 compliance

Real-World Applications

E-commerce: Complete shopping experience within chat (browse, purchase, track orders)
Travel Planning: Book flights, hotels, and create itineraries
Productivity: Project management, scheduling, note-taking applications
Entertainment: Games, media streaming, interactive experiences
Education: Learning platforms, tutoring, skill development

Transformative Impact

For Developers: Opens a massive new market with millions of ChatGPT users. Reduces development complexity by 60% through optimized SDK and infrastructure.

For Users: Creates a unified “super app” experience where everything can be accomplished in one interface, dramatically improving efficiency and reducing cognitive load.

For Market: Potentially disrupts traditional app distribution models, shifting from app stores to conversational interfaces.

🤖

AgentKit

Advanced AI Agent Development Framework

Overview

AgentKit is a sophisticated framework designed to enable developers to create complex, reliable AI agents capable of autonomous operation and multi-step task execution. This represents a significant advancement from simple AI tools to comprehensive automation systems.

Core Features

Persistent Memory: Long-term memory system for context retention across sessions
Advanced Reasoning: Multi-step logical analysis and decision-making capabilities
Task Orchestration: Complex workflow management and execution
Error Recovery: Automatic error detection and recovery mechanisms
Human Collaboration: Seamless human-AI interaction and handoff protocols
Performance Monitoring: Real-time analytics and optimization tools

Technical Architecture

Architecture: Microservices-based with event-driven design

Scalability: Horizontal scaling with intelligent load balancing

Security: Zero-trust architecture with end-to-end encryption

Integration: REST API, WebSocket, Message Queue support

Performance: Sub-second response times for most operations

Reliability: 99.9% uptime with automatic failover

Revolutionary Impact

Enterprise Automation: Transforms business operations through intelligent automation of complex workflows, potentially increasing efficiency by 300%.

Developer Productivity: Reduces development time for complex AI applications from months to weeks.

Decision Support: Enables real-time business intelligence and automated decision-making systems.

🎬

Sora 2 API

Next-Generation Video Generation Platform

Overview

Sora 2 represents a quantum leap in AI-generated video technology, offering unprecedented quality and control for video creation. Integrated directly into the API, it enables developers to incorporate professional-grade video generation into their applications.

Major Improvements over Sora 1

Quality Enhancement: 60% improvement in visual fidelity and realism
Extended Duration: Support for videos up to 15 minutes in length
Consistency: Dramatically improved temporal consistency and object tracking
Style Control: Advanced style transfer and artistic direction capabilities
Resolution: Native 4K support with HDR capabilities
Audio Integration: Synchronized audio generation and editing

Technical Specifications

Resolution: Up to 4K (3840×2160) with HDR support

Duration: Up to 15 minutes per video

Frame Rates: 24fps, 30fps, 60fps, 120fps

Formats: MP4, MOV, AVI, WebM

Processing Time: 3-8 minutes for 1-minute video

Audio: 48kHz, 16-bit stereo audio generation

Industry Transformation

Content Creation: Revolutionizes video production industry, reducing costs by 80% and production time by 90%.

Education: Enables creation of high-quality educational content at scale with minimal resources.

Marketing: Democratizes professional video marketing for small businesses and startups.

Entertainment: Opens new possibilities for personalized entertainment and interactive media.

Performance & Cost Analysis

Feature	Cost	Performance	Primary Use Case	ROI Impact
GPT-5 Pro	$0.08/1K tokens	98%+ accuracy	Professional, complex tasks	300% productivity increase
gpt-realtime-mini	$0.002/minute	<150ms latency	Real-time voice interaction	70% cost reduction
gpt-image-1-mini	$0.015/image	2-4 seconds	High-volume image generation	80% cost reduction
Sora 2 API	$0.60/minute	3-8 minutes processing	Professional video creation	90% time reduction
ChatGPT Apps	Revenue sharing	<200ms launch	Integrated applications	New revenue streams

Live Demos Breakdown

🎓

Coursera Demo (00:05:58)

Educational Content Integration

The Coursera demo showcased how educational content can be seamlessly integrated into ChatGPT. Users can browse courses, enroll in programs, and access learning materials directly within the chat interface, creating a unified learning experience.

Key Features Demonstrated:

Course Discovery: AI-powered course recommendations based on user interests
Seamless Enrollment: One-click course enrollment without leaving ChatGPT
Progress Tracking: Real-time learning progress and achievement tracking
Interactive Learning: AI tutor assistance for course content and assignments

🎨

Canva Demo (00:08:42)

Design Tools Integration

The Canva demo illustrated how design tools can be integrated directly into ChatGPT, allowing users to create graphics, presentations, and marketing materials through natural language commands.

Key Features Demonstrated:

Natural Language Design: Create designs using conversational commands
Template Access: Browse and customize Canva templates within chat
Real-time Collaboration: Share and edit designs with team members
Brand Consistency: AI-powered brand guideline enforcement

🏠

Zillow Demo (00:11:23)

Real Estate Integration

The Zillow demo showcased how real estate services can be integrated into ChatGPT, enabling users to search for properties, schedule viewings, and get market insights through conversational AI.

Key Features Demonstrated:

Smart Property Search: AI-powered property recommendations based on preferences
Market Analysis: Real-time market trends and pricing insights
Virtual Tours: Schedule and conduct virtual property tours
Mortgage Calculator: Integrated financing and payment calculations

Launch Partners (00:14:41)

Strategic Launch Partners

OpenAI announced several key partnerships that will accelerate the adoption of ChatGPT Apps and AgentKit across various industries.

Enterprise Partners

Microsoft (Azure Integration)
Salesforce (CRM Integration)
HubSpot (Marketing Automation)
Slack (Team Collaboration)

Consumer Partners

Coursera (Education)
Canva (Design)
Zillow (Real Estate)
Spotify (Music)

Developer Partners

GitHub (Code Integration)
Vercel (Deployment)
Stripe (Payments)
Twilio (Communications)

Building “Ask Froggie” Agent (00:21:11 – 00:26:47)

🐸

Live Agent Development

Real-time Agent Building Process

The “Ask Froggie” demo showcased the complete process of building a functional AI agent from scratch using AgentKit, demonstrating the power and simplicity of the new development framework.

Development Process:

1. Agent Configuration

Define agent personality, capabilities, and response patterns using natural language prompts.

2. Workflow Design

Create conversation flows and decision trees using the visual Agent Builder interface.

3. Testing & Preview

Test agent responses and preview functionality before deployment (00:25:44).

4. Publishing

Deploy agent to production with one-click publishing (00:26:47).

Agent Capabilities:

Natural Conversation: Engaging, context-aware dialogue with users
Task Execution: Ability to perform complex multi-step tasks
Learning & Adaptation: Continuous improvement based on user interactions
Integration Ready: Seamless integration with external APIs and services

Codex Advanced Capabilities (00:34:19 – 00:44:20)

Camera Control (00:36:12)

Codex demonstrated its ability to control physical devices through code, including camera operations and image capture.

Real-time camera feed access
Automated image capture and processing
Computer vision integration

Xbox Controller (00:38:23)

Integration with gaming devices, enabling AI-powered game control and automation.

Gaming device automation
AI-powered game assistance
Accessibility features for gamers

Venue Lights (00:39:55)

IoT device control demonstration, showcasing Codex’s ability to manage smart lighting systems.

Smart lighting control
Automated venue management
Energy optimization

Voice Control (00:42:20)

Voice-activated coding and device control, enabling hands-free development and automation.

Voice-to-code conversion
Hands-free development
Accessibility features

Live Reprogramming (00:44:20)

Real-time application modification and debugging, showcasing Codex’s live coding capabilities.

Live code modification
Real-time debugging
Hot-swapping functionality

Mattel Partnership (00:49:59)

Revolutionary AI-Powered Toys

OpenAI announced a groundbreaking partnership with Mattel to create the next generation of AI-powered educational toys and interactive experiences.

Educational Toys

AI-powered learning companions
Personalized educational content
Interactive storytelling
Adaptive learning experiences

Interactive Features

Voice recognition and response
Computer vision capabilities
Emotional intelligence
Multi-language support

Safety & Privacy

Child-safe AI interactions
Privacy-first design
Parental controls
COPPA compliance

Expected Impact

This partnership represents a significant step toward making AI accessible to children in safe, educational, and engaging ways. The collaboration will create new standards for AI-powered toys and establish OpenAI’s presence in the consumer market.

Sam Altman’s Keynote Address

Revolutionary AI: The Future is Now

Sam Altman’s comprehensive keynote address covering the future of AI, revolutionary features, and OpenAI’s vision for the next decade