PyTorch Basics

5 hours · 16 min read
Goal

Tensors, autograd, and the training loop: the foundation of all deep learning in PyTorch.

01 — PyTorch Basics


Why Does This Material Matter?

PyTorch is the framework used by most AI research and many modern LLM products, including the models you will study in this bootcamp. If scikit-learn is the "calculator" of ML, PyTorch is the "industrial machine" behind models such as Stable Diffusion and many of today's large language models. Mastering tensors, autograd, and the training loop here means you will be ready to understand any neural network architecture covered in class, instead of being a spectator who copy-pastes code without knowing what happens behind the scenes.

Imagine you want to become a professional chef. Recipes (ML algorithms) matter, but if you do not know how to handle a knife, a pan, or a stove (the basic tools), you cannot execute any recipe well. PyTorch is the "kitchen" used by many of the world's top AI teams: Meta created it and OpenAI has publicly standardized on it. Once you are fluent here, all the follow-up material (CNNs, Transformers, LLM fine-tuning) is the same pattern applied again: tensors in → operations → loss → backward → update.

Three key things you will master here: (1) tensors as the core data structure, multi-dimensional boxes of data that can live on the GPU; (2) autograd as the automatic differentiation engine that makes backpropagation effortless; and (3) the training loop as the fixed rhythm of every deep learning project: learn it once, use it for life.

Mental Map of This Material

Static Mermaid diagram:

flowchart LR
    T["🔢 Tensor<br/>(data)"] --> M["🧠 Model<br/>(nn.Module)"]
    M --> L["📊 Loss<br/>(criterion)"]
    L --> B["⚡ Backward<br/>(autograd)"]
    B --> O["🎯 Optimizer<br/>(update)"]
    O --> M
    style T fill:#e1f5ff
    style M fill:#fff4e1
    style L fill:#ffe1e1
    style B fill:#e1ffe1
    style O fill:#f0e1ff

Part 1: Install & Setup

# CPU only (run in a terminal)
pip install torch torchvision

# With a GPU (if you have an NVIDIA card): pick the install command
# that matches your setup at pytorch.org

# Then verify the install in Python
import torch
print(torch.__version__)
print(torch.cuda.is_available())    # True if a CUDA GPU is usable
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

No GPU? Use Google Colab, which offers a free GPU tier, or Kaggle Notebooks.


Part 2: Tensor

Tensor = NumPy array + GPU support + autograd.

Analogy: a tensor is a multi-dimensional box of data. A scalar is a point (0-D). A vector is a line of numbers (1-D). A matrix is a 2-D grid (like an Excel table). A 3-D tensor is a cube (an RGB image: channel × height × width in PyTorch's layout). A 4-D tensor is a batch of images (batch × channel × H × W). Picture a stack of cubes: the more dimensions, the taller the stack.

Visualizing Tensor Dimensions

How to Read the Diagram:

  • Each node = one tensor "level"; the further right, the more dimensions
  • 3-D tensor = 1 image (channel, height, width), 4-D tensor = a batch of images
  • Common PyTorch dimension patterns: (B, C, H, W) for images, (B, T, D) for sequences

Step-by-Step Walkthrough:

  1. Scalar (0-D): a single number, e.g. torch.tensor(5)
  2. Vector (1-D): a list of numbers, e.g. [1, 2, 3], shape (3,)
  3. Matrix (2-D): a table of numbers, shape (rows, cols), like a DataFrame
  4. 3-D tensor: a cube, e.g. an RGB image of shape (3, 32, 32)
  5. 4-D tensor: a stack of cubes, e.g. a batch of 16 images of shape (16, 3, 32, 32)

Everyday Analogy: picture a bookstore. A book = scalar (1 item). A shelf = vector. A bookcase = matrix. A store = 3-D tensor. A chain of stores = 4-D tensor. Each level up adds one more axis of organization.

Static Mermaid diagram:

flowchart LR
    S["🔹 Scalar<br/>0-D<br/>5"] --> V["📏 Vector<br/>1-D<br/>[1,2,3]"]
    V --> M["🟦 Matrix<br/>2-D<br/>[[1,2],[3,4]]"]
    M --> T3["🧊 Tensor 3-D<br/>(C,H,W)<br/>image"]
    T3 --> T4["📦 Tensor 4-D<br/>(B,C,H,W)<br/>batch"]
    style S fill:#fef3c7
    style V fill:#fde68a
    style M fill:#fcd34d
    style T3 fill:#fbbf24
    style T4 fill:#f59e0b
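
The five levels in the diagram can be checked directly in code. This is a minimal sketch; the variable names are only illustrative, and the image and batch tensors are filled with zeros as placeholders for real pixel data.

```python
import torch

scalar = torch.tensor(5)                  # 0-D: a single number
vector = torch.tensor([1, 2, 3])          # 1-D: shape (3,)
matrix = torch.tensor([[1, 2], [3, 4]])   # 2-D: shape (2, 2)
image = torch.zeros(3, 32, 32)            # 3-D: (C, H, W)
batch = torch.zeros(16, 3, 32, 32)        # 4-D: (B, C, H, W)

levels = [("scalar", scalar), ("vector", vector), ("matrix", matrix),
          ("image", image), ("batch", batch)]
for name, t in levels:
    # ndim counts the axes; shape lists the size of each axis
    print(f"{name:<6} ndim={t.ndim} shape={tuple(t.shape)}")
```

Each step up adds exactly one axis, so `ndim` goes 0, 1, 2, 3, 4.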

Creating Tensors

import torch

# From a Python list
t = torch.tensor([1, 2, 3])
t = torch.tensor([[1, 2], [3, 4]])

# Special
torch.zeros(3, 4)
torch.ones(2, 3)
torch.eye(3)
torch.empty(2, 2)        # uninitialized memory (arbitrary values)

# Random
torch.rand(3, 3)              # uniform [0, 1)
torch.randn(3, 3)             # standard normal
torch.randint(0, 10, (3, 3))  # integers in [0, 10)

# From NumPy
import numpy as np
arr = np.array([1, 2, 3])
t = torch.from_numpy(arr)    # shares memory with arr
t.numpy()    # back to NumPy (CPU tensors only)
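
One detail worth seeing in action: torch.from_numpy shares memory with the source array, while torch.tensor makes a copy. A small sketch (the names `shared` and `copied` are just for illustration):

```python
import numpy as np
import torch

arr = np.array([1, 2, 3])
shared = torch.from_numpy(arr)   # view over the same memory as arr
copied = torch.tensor(arr)       # independent copy of the data

arr[0] = 99                      # mutate the NumPy array in place

print(shared)   # first element is now 99, because memory is shared
print(copied)   # still starts with 1, because it was copied
```

If you need a tensor that will not change when the NumPy array does, use the copying form.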

Properties

t = torch.randn(3, 4)
t.shape         # torch.Size([3, 4])
t.dtype         # torch.float32
t.device        # cpu or cuda
t.requires_grad # False (default)

Move to GPU

t = t.to(device)
# or
t = t.cuda()    # only if you are sure a GPU is available

# Best practice
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t = t.to(device)

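Operations (NumPy-like)

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # float: mean() raises an error on integer tensors
b = torch.tensor([[5.0, 6.0], [7.0, 8.0]])

a + b
a * b           # element-wise
a @ b           # matrix multiplication
a.T             # transpose
a.sum()
a.mean()
a.max()
a.argmax()      # index into the flattened tensor
a.sum(dim=0)    # sum down each column

# Reshape
a.reshape(4)
a.view(4)       # like reshape, but never copies; raises an error if the memory layout is incompatible
a.unsqueeze(0)  # add a dim: (2,2) → (1,2,2)
a.squeeze()     # remove all dims of size 1 (no effect on a (2,2) tensor)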
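
The difference between view and reshape only shows up on non-contiguous tensors, for example after a transpose. A minimal sketch, assuming a small 2×3 tensor (the flag name `view_failed` is made up for this example):

```python
import torch

a = torch.arange(6).reshape(2, 3)
t = a.T                        # transpose: same storage, different strides
print(t.is_contiguous())       # False

flat = t.reshape(6)            # works: reshape copies when it has to

try:
    t.view(6)                  # view cannot reinterpret this layout
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)             # True

flat2 = t.contiguous().view(6) # make the layout contiguous first, then view
print(flat)                    # tensor([0, 3, 1, 4, 2, 5])
```

In practice, reshape is the safe default; view is useful when you want a guarantee that no copy happens.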

Part 3: Autograd

This is the core feature that sets PyTorch apart from NumPy.

Analogy: autograd is an automatic derivative calculator. Imagine writing a math formula on paper and having a magic robot that instantly knows the derivative of that formula with respect to any variable, without you working it out by hand. That is autograd. Every operation on a tensor is "recorded" into a computational graph; when .backward() is called, the robot walks backward through the graph and computes gradients with the chain rule.

Autograd Flow

How to Read the Diagram:

  • Forward pass (left to right): each operation adds a node to the computational graph
  • Backward pass (right to left): autograd walks back through the graph using the chain rule
  • A tensor created with requires_grad=True is a "leaf" and receives a .grad

Step-by-Step Walkthrough:

  1. Create x with requires_grad=True, which turns on autograd tracking
  2. The operation y = x² is recorded as a node in the forward graph
  3. The operation z = y + 2x adds another node, so the graph now has 3 steps
  4. When loss.backward() is called (here the loss is simply z), autograd computes gradients starting from the loss
  5. The chain rule runs backward: dz/dx = 2x + 2. The result is stored in x.grad

Everyday Analogy: think of a GPS trail while you are out walking (forward = walking ahead while a pin is dropped at every turn). When it is time to go home (backward), the GPS tells you the way back by following the pins in reverse order. You do not have to remember the route; autograd remembers it for you.

Static Mermaid diagram:

flowchart LR
    X["🔢 x<br/>requires_grad=True"] --> F1["⚙️ y = x²"]
    F1 --> F2["⚙️ z = y + 2x"]
    F2 --> L["📊 loss"]
    L -->|".backward()"| B["⚡ Compute gradients<br/>chain rule in reverse"]
    B -->|"x.grad"| G["🎯 dz/dx"]
    style X fill:#dbeafe
    style L fill:#fee2e2
    style B fill:#fef3c7
    style G fill:#d1fae5
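
The walkthrough above can be traced in code by inspecting the graph that autograd records. A minimal sketch using x = 3 as an arbitrary example value:

```python
import torch

x = torch.tensor(3.0, requires_grad=True)   # leaf tensor, tracked by autograd
y = x ** 2                                   # recorded as a node in the graph
z = y + 2 * x                                # another recorded node

print(x.is_leaf, y.is_leaf)   # True False: x is a leaf, y is an intermediate result
print(y.grad_fn)              # the backward function recorded for the power op
print(z.grad_fn)              # the backward function recorded for the addition

z.backward()                  # walk the graph in reverse with the chain rule
print(x.grad)                 # tensor(8.), i.e. dz/dx = 2x + 2 at x = 3
```

Only leaf tensors keep a `.grad` by default; intermediate results such as `y` carry a `grad_fn` instead.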

Track Gradient

x = torch.tensor(3.0, requires_grad=True)
y = x ** 2 + 2 * x + 1

y.backward()

print(x.grad)    # tensor(8.)  (derivative: 2x + 2 at x=3)

Multi-variable

x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(3.0, requires_grad=True)

z = x ** 2 * y

z.backward()

print(x.grad)    # 12.0  (∂z/∂x = 2xy = 12)
print(y.grad)    # 4.0   (∂z/∂y = x² = 4)

Disable Gradient

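# For inference (faster, less memory): no graph is built
with torch.no_grad():
    output = model(x)

# or: cut an existing result off from the graph
# (the graph is still built during the forward pass, so this alone saves nothing)
output = model(x).detach()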
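
To make the two options concrete, the sketch below runs the same forward pass three ways. The nn.Linear layer is only a stand-in for a real model, and the variable names are illustrative.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)        # stand-in model for the example
x = torch.randn(3, 4)

tracked = model(x)             # normal forward: graph is recorded
with torch.no_grad():
    untracked = model(x)       # no graph is recorded at all
detached = model(x).detach()   # graph is recorded, then the result is cut loose

print(tracked.requires_grad, tracked.grad_fn is not None)   # True True
print(untracked.requires_grad, untracked.grad_fn)           # False None
print(detached.requires_grad, detached.grad_fn)             # False None
```

Both results are free of gradient tracking, but only torch.no_grad() avoids the cost of building the graph in the first place.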

Reset Gradient

# REQUIRED in every training step, before backward()
optimizer.zero_grad()    # reset every param.grad
# or
for param in model.parameters():
    param.grad = None

Common bug: forgetting zero_grad, so gradients pile up from previous iterations.
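
The accumulation behaviour is easy to reproduce without any model. A minimal sketch with a single scalar parameter `w` (a made-up example, not part of the training code):

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

(3 * w).backward()
first = w.grad.item()          # 3.0: d(3w)/dw

(3 * w).backward()             # no reset in between
accumulated = w.grad.item()    # 6.0: the new gradient was ADDED to the old one

w.grad = None                  # reset, same effect as optimizer.zero_grad()
(3 * w).backward()
after_reset = w.grad.item()    # 3.0 again

print(first, accumulated, after_reset)
```

This is why every training step starts from a clean gradient: backward() adds to `.grad`, it never overwrites it.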


Part 4: Manual Linear Regression

Without nn.Module, fully manual:

import torch

# Synthetic data: y = 2x + 1
torch.manual_seed(42)
X = torch.randn(100)
y = 2 * X + 1 + torch.randn(100) * 0.1

# Parameters
w = torch.tensor(0.0, requires_grad=True)
b = torch.tensor(0.0, requires_grad=True)

learning_rate = 0.01

for epoch in range(1000):
    # Forward
    y_pred = w * X + b
    
    # Loss
    loss = ((y_pred - y) ** 2).mean()
    
    # Backward
    loss.backward()
    
    # Update (manual, without an optimizer)
    with torch.no_grad():
        w -= learning_rate * w.grad
        b -= learning_rate * b.grad
        
        # Reset gradient
        w.grad.zero_()
        b.grad.zero_()
    
    if epoch % 100 == 0:
        print(f"Epoch {epoch}: w={w.item():.4f}, b={b.item():.4f}, loss={loss.item():.4f}")

print(f"\nFinal: y = {w.item():.4f}x + {b.item():.4f}")
print(f"True:  y = 2x + 1")

Run it and watch w and b converge to about 2 and 1.


Part 5: nn.Module (the OO Way)

import torch
import torch.nn as nn

# Use a built-in layer
linear = nn.Linear(in_features=10, out_features=1)
# Internally: weight of shape (1, 10), i.e. (out_features, in_features), plus bias of shape (1,)

x = torch.randn(5, 10)    # batch 5, dim 10
output = linear(x)         # shape (5, 1)
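
To see what nn.Linear stores and computes, the sketch below prints the parameter shapes and reproduces the layer's output by hand. The `manual` name is only for this example.

```python
import torch
import torch.nn as nn

linear = nn.Linear(in_features=10, out_features=1)
print(linear.weight.shape)   # torch.Size([1, 10]): (out_features, in_features)
print(linear.bias.shape)     # torch.Size([1])

x = torch.randn(5, 10)       # batch of 5, 10 features each
# The layer computes x @ W^T + b
manual = x @ linear.weight.T + linear.bias
print(torch.allclose(linear(x), manual, atol=1e-6))
```

The weight is stored transposed relative to the math you might write on paper, which matters when you load or inspect weights manually.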

Custom Model

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(10, 32)
        self.layer2 = nn.Linear(32, 16)
        self.layer3 = nn.Linear(16, 1)
    
    def forward(self, x):
        x = torch.relu(self.layer1(x))
        x = torch.relu(self.layer2(x))
        x = self.layer3(x)
        return x

model = MyModel()
x = torch.randn(5, 10)
output = model(x)    # calling the model invokes forward() for you

Using nn.Sequential (More Concise)

model = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Linear(32, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)

Inspect Model

print(model)

# Number of parameters
total = sum(p.numel() for p in model.parameters())
print(f"Total params: {total}")

# State dict (all weights and buffers)
print(model.state_dict().keys())
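
As a sanity check, the parameter count can be worked out by hand: each nn.Linear(in, out) holds in × out weights plus out biases. A short sketch for the 10 → 32 → 16 → 1 network used above:

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 32), nn.ReLU(),
    nn.Linear(32, 16), nn.ReLU(),
    nn.Linear(16, 1),
)

# Per-parameter breakdown: name, shape, number of elements
for name, p in model.named_parameters():
    print(f"{name:<10} {tuple(p.shape)} {p.numel()}")

expected = (10 * 32 + 32) + (32 * 16 + 16) + (16 * 1 + 1)
total = sum(p.numel() for p in model.parameters())
print(total, expected)   # both 897
```

ReLU layers contribute no parameters, so only the three linear layers appear in the breakdown.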

Part 6: Loss Functions

import torch.nn as nn

# Regression
mse = nn.MSELoss()
loss = mse(predictions, targets)

# Classification (binary)
bce = nn.BCEWithLogitsLoss()    # sigmoid + BCE
loss = bce(logits, labels)       # labels: float tensor of 0.0 / 1.0

# Classification (multi-class)
ce = nn.CrossEntropyLoss()      # log-softmax + NLL
loss = ce(logits, labels)        # labels: long tensor of class indices

Tip: for classification, use a loss that takes raw logits (nn.BCEWithLogitsLoss, nn.CrossEntropyLoss). You do not apply sigmoid/softmax yourself, and the fused version is more numerically stable.
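
The "fused" claim can be verified numerically. The sketch below compares each logits-based loss against its two-step equivalent on small random inputs; the tensors and names are invented for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Multi-class: CrossEntropyLoss == log_softmax followed by NLL loss
logits = torch.randn(4, 3)                 # 4 samples, 3 classes
labels = torch.tensor([0, 2, 1, 2])        # class indices (long)
ce = nn.CrossEntropyLoss()(logits, labels)
manual_ce = F.nll_loss(F.log_softmax(logits, dim=1), labels)

# Binary: BCEWithLogitsLoss == sigmoid followed by BCELoss
bin_logits = torch.randn(4)
bin_labels = torch.tensor([0.0, 1.0, 1.0, 0.0])   # float targets
bce_logits = nn.BCEWithLogitsLoss()(bin_logits, bin_labels)
manual_bce = nn.BCELoss()(torch.sigmoid(bin_logits), bin_labels)

print(torch.allclose(ce, manual_ce, atol=1e-6))
print(torch.allclose(bce_logits, manual_bce, atol=1e-6))
```

On well-behaved inputs the two routes agree; the fused versions are preferred because they stay stable when logits are very large or very small.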


Part 7: Optimizer

import torch.optim as optim

# SGD (the classic)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Adam (a very common default)
optimizer = optim.Adam(model.parameters(), lr=1e-3)

# AdamW (Adam with decoupled weight decay, common for transformers)
optimizer = optim.AdamW(model.parameters(), lr=1e-4)

# RMSprop
optimizer = optim.RMSprop(model.parameters(), lr=1e-3)

Learning Rate Scheduler

scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
# Every 10 epochs, lr is multiplied by 0.1

scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)
# Smooth decay
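
A scheduler only changes the learning rate when you call scheduler.step(), typically once per epoch after the optimizer has stepped. The sketch below uses a shortened schedule (step_size=2 and a starting lr of 0.1, both chosen just for this demo) so the decay is visible within a few epochs.

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(2, 1)                       # stand-in model
optimizer = optim.SGD(model.parameters(), lr=0.1)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.1)

lrs = []
for epoch in range(6):
    lrs.append(optimizer.param_groups[0]["lr"])   # lr used during this epoch
    # ... the training batches for this epoch would run here ...
    optimizer.step()      # optimizer first (a no-op here, no gradients exist)
    scheduler.step()      # then the scheduler, once per epoch

print(lrs)   # roughly [0.1, 0.1, 0.01, 0.01, 0.001, 0.001]
```

Logging the learning rate alongside the loss makes it much easier to tell whether a plateau comes from the model or from the schedule.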

Part 7.5: Training Loop, the Big Picture

Analogy: the training loop is like lifting weights at the gym. Each iteration: lift (forward), feel how heavy it is (loss), lower it with proper technique (backward), rest and adjust your grip (optimizer step), reset your position (zero_grad), repeat. Do it consistently thousands of times and the muscles (weights) grow.

flowchart TD
    Start["🏁 epoch loop"] --> Z["1️⃣ optimizer.zero_grad()<br/>reset gradient"]
    Z --> F["2️⃣ y_pred = model(x)<br/>forward pass"]
    F --> Loss["3️⃣ loss = criterion(y_pred, y)"]
    Loss --> Back["4️⃣ loss.backward()<br/>compute gradients"]
    Back --> Step["5️⃣ optimizer.step()<br/>update weights"]
    Step --> Check{"epoch finished?"}
    Check -->|"not yet"| Z
    Check -->|"finished"| Eval["🔍 model.eval()<br/>validate"]
    Eval --> Done["✅ done"]
    style Z fill:#fee2e2
    style F fill:#dbeafe
    style Loss fill:#fef3c7
    style Back fill:#e9d5ff
    style Step fill:#d1fae5

Part 8: Training Loop (Standard)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = MyModel().to(device)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(100):
    # ===== TRAIN =====
    model.train()
    for batch_x, batch_y in train_loader:
        batch_x, batch_y = batch_x.to(device), batch_y.to(device)
        
        # Reset gradients left over from the previous step
        optimizer.zero_grad()

        # Forward
        y_pred = model(batch_x)
        loss = criterion(y_pred, batch_y)

        # Backward + update
        loss.backward()
        optimizer.step()
    
    # ===== VALIDATE =====
    model.eval()
    val_loss = 0
    with torch.no_grad():
        for batch_x, batch_y in val_loader:
            batch_x, batch_y = batch_x.to(device), batch_y.to(device)
            y_pred = model(batch_x)
            val_loss += criterion(y_pred, batch_y).item()
    
    val_loss /= len(val_loader)
    print(f"Epoch {epoch}: Val loss {val_loss:.4f}")

The 5 core steps (covered above):

  1. optimizer.zero_grad()
  2. y_pred = model(x)
  3. loss = criterion(y_pred, y)
  4. loss.backward()
  5. optimizer.step()

Part 9: Dataset & DataLoader

import torch
from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    def __init__(self, X, y):
        # Convert array-likes (lists, NumPy arrays) to float32 tensors
        self.X = torch.as_tensor(X, dtype=torch.float32)
        self.y = torch.as_tensor(y, dtype=torch.float32)

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        return self.X[idx], self.y[idx]

# Usage
dataset = MyDataset(X_train, y_train)
loader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=4)

for batch_x, batch_y in loader:
    # ... process batch
    pass

Use a DataLoader for anything beyond a tiny dataset. It handles batching, shuffling, and parallel loading for you.
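
To see what the loader actually yields, the sketch below feeds it random NumPy data and records the batch sizes. The class name `ArrayDataset` and the random arrays are assumptions made for this example; num_workers is left at its default of 0 so it runs anywhere.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class ArrayDataset(Dataset):   # hypothetical name, same idea as MyDataset
    def __init__(self, X, y):
        self.X = torch.as_tensor(X, dtype=torch.float32)
        self.y = torch.as_tensor(y, dtype=torch.float32)

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        return self.X[idx], self.y[idx]

X_train = np.random.rand(100, 4)   # 100 samples, 4 features (placeholder data)
y_train = np.random.rand(100)
loader = DataLoader(ArrayDataset(X_train, y_train), batch_size=32, shuffle=True)

batch_sizes = [xb.shape[0] for xb, yb in loader]
print(len(loader), batch_sizes)    # 4 batches: [32, 32, 32, 4]
```

The last batch is smaller because 100 is not a multiple of 32; pass drop_last=True to the DataLoader if your model needs a fixed batch size.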


Part 10: Save & Load

# Save (best practice: only state dict)
torch.save(model.state_dict(), "model.pt")

# Load
model = MyModel()
model.load_state_dict(torch.load("model.pt"))
model.eval()    # switch to inference mode

# Save full checkpoint (model + optimizer + epoch)
checkpoint = {
    "epoch": epoch,
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
    "loss": loss.item(),
}
torch.save(checkpoint, "checkpoint.pt")
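
The other half of a checkpoint is restoring it to resume training. A minimal sketch, assuming a tiny nn.Linear model, a temporary directory, and an arbitrary epoch number of 7 purely for illustration:

```python
import os
import tempfile
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(3, 1)
optimizer = optim.Adam(model.parameters(), lr=1e-3)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "checkpoint.pt")
    torch.save({
        "epoch": 7,
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
    }, path)

    # Resume: rebuild the same architecture and optimizer, then load the states
    restored = nn.Linear(3, 1)
    restored_opt = optim.Adam(restored.parameters(), lr=1e-3)
    checkpoint = torch.load(path, map_location="cpu")
    restored.load_state_dict(checkpoint["model_state_dict"])
    restored_opt.load_state_dict(checkpoint["optimizer_state_dict"])
    start_epoch = checkpoint["epoch"] + 1   # continue from the next epoch

print(start_epoch, torch.equal(restored.weight, model.weight))
```

Passing map_location lets a checkpoint saved on a GPU machine load on a CPU-only one; move the model to the target device afterwards.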

Part 11: Common Mistakes & FAQ

Seven classic bugs you will hit sooner or later. Get to know them now and save hours of debugging later.

1. Forgetting optimizer.zero_grad()

# ❌ WRONG: gradients accumulate from previous iterations
for x, y in loader:
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

# ✅ CORRECT
for x, y in loader:
    optimizer.zero_grad()    # reset first
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

2. Forgetting model.eval() at Inference

# ❌ WRONG: the model is still in training mode, so Dropout and BatchNorm behave as in training
predictions = model(test_x)

# ✅ CORRECT
model.eval()
with torch.no_grad():
    predictions = model(test_x)

Dropout left active at inference gives a different answer on every call. BatchNorm using batch statistics at inference makes each prediction depend on the other samples in the batch.
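
The Dropout effect is easy to observe. The sketch below calls a small model twice in each mode; the layer sizes are arbitrary and only chosen so that the random masks are very unlikely to coincide.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(8, 64), nn.Dropout(p=0.5))
x = torch.randn(1, 8)

model.train()                              # training mode: dropout is active
train_a, train_b = model(x), model(x)      # two calls, two random masks

model.eval()                               # eval mode: dropout is a no-op
with torch.no_grad():
    eval_a, eval_b = model(x), model(x)

print(torch.equal(train_a, train_b))       # False: outputs differ between calls
print(torch.equal(eval_a, eval_b))         # True: outputs are repeatable
```

Remember to call model.train() again before the next training epoch, since eval() stays in effect until you switch back.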

3. Tensor Shape Mismatch

# ❌ WRONG
x = torch.randn(32, 10)         # batch=32, dim=10
linear = nn.Linear(20, 5)        # expects dim=20
linear(x)                        # ERROR: mat1 and mat2 shapes cannot be multiplied

# ✅ Check the shape first
print(x.shape)                   # torch.Size([32, 10])
linear = nn.Linear(10, 5)        # matches the input dim

Debugging tip: when you hit an error, print tensor.shape at each stage of forward.

4. Device Mismatch (CPU vs GPU)

# ❌ WRONG
model = model.to("cuda")
x = torch.randn(32, 10)          # still on the CPU
output = model(x)                # ERROR: expected all tensors on same device

# ✅ CORRECT
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
x = x.to(device)
output = model(x)

5. Wrong Loss Function

| Task | Output Layer | Loss |
| --- | --- | --- |
| Regression | Linear (no activation) | nn.MSELoss() |
| Binary classification | Linear (logits) | nn.BCEWithLogitsLoss() |
| Multi-class classification | Linear (logits) | nn.CrossEntropyLoss() |
| Multi-label classification | Linear (logits) | nn.BCEWithLogitsLoss() |

Classic bug: applying softmax yourself and then using CrossEntropyLoss. That is a double softmax: the loss can no longer get close to zero, gradients are squashed, and training converges slowly or to a worse result.
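
The effect of the double softmax shows up even on a single, perfectly confident prediction. A small sketch with hand-picked logits for a 3-class problem:

```python
import torch
import torch.nn as nn

ce = nn.CrossEntropyLoss()
logits = torch.tensor([[10.0, 0.0, 0.0]])   # very confident, and correct
label = torch.tensor([0])

correct = ce(logits, label)                          # raw logits, as intended
double = ce(torch.softmax(logits, dim=1), label)     # softmax applied twice

# The correct loss is near zero; the double-softmax loss stays above 0.5
print(f"{correct.item():.4f} {double.item():.4f}")
```

Because softmax outputs live in [0, 1], feeding them to CrossEntropyLoss caps how confident the model can ever look, so the loss has a floor it cannot go below.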

6. CrossEntropyLoss with the Wrong dtype

# ❌ WRONG
labels = torch.tensor([0.0, 1.0, 2.0])    # float
loss = nn.CrossEntropyLoss()(logits, labels)    # ERROR: class-index targets must be long

# ✅ CORRECT
labels = torch.tensor([0, 1, 2], dtype=torch.long)    # long/int64

7. Memory Leak: Loss Tracking

# ❌ WRONG: every stored loss keeps its computation graph alive, so memory grows until OOM
losses = []
for x, y in loader:
    loss = criterion(model(x), y)
    losses.append(loss)              # stores the tensor + its graph

# ✅ CORRECT: use .item() to get a plain Python scalar
losses = []
for x, y in loader:
    loss = criterion(model(x), y)
    losses.append(loss.item())       # just a float, no graph

Part 12: Tensor Operations Cheatsheet (Runnable)

Exercises you can run right away to get a feel for tensors:

import torch

# === CREATION ===
a = torch.tensor([[1, 2, 3], [4, 5, 6]])
b = torch.zeros_like(a)               # same shape, filled with zeros
c = torch.full((3, 3), 7)             # 3x3 filled with 7
d = torch.linspace(0, 1, 5)           # [0, 0.25, 0.5, 0.75, 1]
e = torch.arange(0, 10, 2)            # [0, 2, 4, 6, 8]

# === SLICING (NumPy-like) ===
x = torch.arange(24).reshape(2, 3, 4)
print(x[0])                           # slice batch 0 → shape (3, 4)
print(x[:, 1, :])                     # row 1 of every batch → shape (2, 4)
print(x[..., -1])                     # last column → shape (2, 3)
print(x[x > 10])                      # boolean masking → 1-D

# === MATH ===
a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0, 6.0])

print(torch.dot(a, b))                # scalar product: 32
print(torch.outer(a, b))              # outer product: 3×3
print(a.norm())                       # L2 norm

# === BROADCASTING ===
x = torch.randn(3, 1)                 # (3, 1)
y = torch.randn(1, 4)                 # (1, 4)
z = x + y                             # broadcast → (3, 4)

# === STACK / CAT ===
a = torch.zeros(2, 3)
b = torch.ones(2, 3)
torch.cat([a, b], dim=0)              # (4, 3): stacked vertically
torch.cat([a, b], dim=1)              # (2, 6): stacked horizontally
torch.stack([a, b], dim=0)            # (2, 2, 3): creates a new dim

# === RESHAPE FAMILY ===
x = torch.arange(12)
x.reshape(3, 4)                       # (3, 4)
x.view(2, 6)                          # (2, 6), shares memory
x.unsqueeze(0)                        # (1, 12)
x.unsqueeze(0).expand(5, 12)          # (5, 12) without copying memory

# === REDUCTION ===
m = torch.randn(3, 4)
m.sum(dim=0)                          # sum down each column → (4,)
m.sum(dim=1, keepdim=True)            # (3, 1)
m.argmax(dim=1)                       # index max per row

Part 13: Complete Mini Training Loop (Runnable End-to-End)

A complete example, from scratch through inference, that you can copy-paste and run:

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader

# === 1. Synthetic data: y = 3x₁ - 2x₂ + 5 + noise ===
torch.manual_seed(0)
N = 1000
X = torch.randn(N, 2)
y = 3 * X[:, 0] - 2 * X[:, 1] + 5 + 0.1 * torch.randn(N)
y = y.unsqueeze(1)    # (N, 1)

# === 2. Split & DataLoader ===
n_train = int(0.8 * N)
train_ds = TensorDataset(X[:n_train], y[:n_train])
val_ds = TensorDataset(X[n_train:], y[n_train:])
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=64)

# === 3. Model ===
model = nn.Sequential(
    nn.Linear(2, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)

# === 4. Loss & Optimizer ===
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=1e-2)

# === 5. Training Loop ===
for epoch in range(20):
    model.train()
    train_loss = 0
    for xb, yb in train_loader:
        optimizer.zero_grad()
        pred = model(xb)
        loss = criterion(pred, yb)
        loss.backward()
        optimizer.step()
        train_loss += loss.item() * xb.size(0)
    train_loss /= len(train_ds)

    model.eval()
    val_loss = 0
    with torch.no_grad():
        for xb, yb in val_loader:
            val_loss += criterion(model(xb), yb).item() * xb.size(0)
    val_loss /= len(val_ds)

    print(f"Epoch {epoch:02d} | train {train_loss:.4f} | val {val_loss:.4f}")

# === 6. Inference ===
model.eval()
with torch.no_grad():
    test_x = torch.tensor([[1.0, 1.0]])
    print(f"\nPredicted y for x=[1,1]: {model(test_x).item():.4f}")
    print(f"Target: 3*1 - 2*1 + 5 = 6")

Try it: change lr, batch_size, or the number of hidden layers, and watch the effect on val loss.


Check Your Understanding

  • Do you know the difference between a tensor and a NumPy array?
  • Can you use autograd to compute gradients?
  • Can you write your own nn.Module?
  • Do you know the 5 training loop steps by heart?
  • Can you use a DataLoader?
  • Can you save and load a model?

Challenge 6.1

Challenge 1: Manual Linear Regression

Replicate the code in Part 4. Experiment with the learning rate. Plot the loss curve.

Challenge 2: Neural Network from Scratch

Build a 2-layer NN to classify Iris:

  • Input: 4 features
  • Hidden: 16 neurons + ReLU
  • Output: 3 (3 species)

Train and evaluate it.

Challenge 3: Use a DataLoader

Convert Challenge 2 to use Dataset + DataLoader. Train with batch size 16.

Challenge 4: Watch Karpathy

Andrej Karpathy, "Neural Networks: Zero to Hero", Episode 1: micrograd. Watch the whole thing (about 2.5 hours).

Write down 10 insights in your journal.


Next: 02-neural-networks.md