11 — Mini Projects
Estimasi: 8 jam (4 jam per project, atau pilih 1) Tujuan: Konsolidasi semua skill Python lewat project nyata. Output: 2-3 repo GitHub yang bisa kamu pamerkan.
Pilih minimal 1 project. Idealnya 2-3.
Kenapa Materi Ini Penting?
Bagian ini adalah transisi dari belajar ke praktik. Teori dan exercise itu penting, tapi recruiter Dicoding (dan recruiter manapun) tidak peduli kamu tahu list comprehension — mereka mau lihat kamu bisa selesaikan masalah end-to-end. Project di file ini didesain untuk meng-exercise semua skill di file 01-10 secara bersamaan: file I/O, OOP, error handling, CLI parsing, modular code.
Lebih strategis lagi: project di sini bisa di-evolve di fase berikutnya. Knowledge Base CLI di Project 3 bisa jadi RAG system di Fase 7. Gemini API wrapper di Project 4 jadi pondasi LLM application di Fase 6+. Investasi kamu di sini compound — tidak terbuang.
Analogi besar: project = ujian praktik mengemudi. Kamu sudah baca buku rambu, sudah hafal teori. Sekarang waktunya nyetir di jalan beneran — dengan macet, hujan, dan pengendara lain. Itu yang dinilai recruiter.
Peta Project
flowchart TD
A[🎯 Mini Projects] --> P1[📝 Todo CLI]
A --> P2[🌐 News Scraper]
A --> P3[📚 Knowledge Base]
A --> P4[🤖 Gemini Wrapper]
P1 --> S1[OOP, JSON, argparse]
P2 --> S2[requests, regex, error handling]
P3 --> S3[OOP, JSON, regex, search]
P4 --> S4[OOP, requests, env, type hints]
P3 -.->|evolves to| RAG[🔮 RAG System Fase 7]
P4 -.->|foundation for| LLM[🔮 LLM Apps Fase 6+]
Rekomendasi urutan:
- Project 1 (Todo CLI) — paling mudah, latihan dasar lengkap
- Project 3 (Knowledge Base) — strategic, akan kamu pakai lagi nanti
- Project 4 (Gemini Wrapper) — pengalaman pertama LLM API
- Project 2 (Scraper) — kalau masih ada waktu
Project 1 — CLI Todo App
Skill: OOP, JSON, file I/O, argparse, datetime
Diagram Arsitektur
flowchart LR
U[👤 User] -->|CLI command| CLI[argparse]
CLI --> TM[⚙️ TodoManager]
TM --> T[📦 Todo dataclass]
TM <-->|load/save| F[💾 todos.json]
TM --> O[📤 Output ke stdout]
Spec
CLI app untuk manage todo list. Data di-save ke todos.json.
python todo.py add "Belajar Python" --priority high --due 2026-05-20
python todo.py list
python todo.py list --status pending
python todo.py done 1
python todo.py delete 2
python todo.py search "python"
Format Data
[
{
"id": 1,
"title": "Belajar Python",
"priority": "high",
"status": "pending",
"created": "2026-05-13T10:30:00",
"due": "2026-05-20",
"completed": null
}
]
Class Structure
from dataclasses import dataclass, field, asdict
from datetime import datetime
from pathlib import Path
from typing import Optional
import json
import argparse
@dataclass
class Todo:
id: int
title: str
priority: str = "medium"
status: str = "pending"
created: str = field(default_factory=lambda: datetime.now().isoformat())
due: Optional[str] = None
completed: Optional[str] = None
class TodoManager:
def __init__(self, path="todos.json"):
self.path = Path(path)
self.todos: list[Todo] = []
self.load()
def load(self): ...
def save(self): ...
def add(self, title: str, **kwargs) -> Todo: ...
def done(self, id: int): ...
def delete(self, id: int): ...
def list_todos(self, status: Optional[str] = None) -> list[Todo]: ...
def search(self, keyword: str) -> list[Todo]: ...
def main():
parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers(dest="command")
add_parser = subparsers.add_parser("add")
add_parser.add_argument("title")
add_parser.add_argument("--priority", choices=["low", "medium", "high"], default="medium")
add_parser.add_argument("--due")
# ... subparser lain
args = parser.parse_args()
mgr = TodoManager()
if args.command == "add":
todo = mgr.add(args.title, priority=args.priority, due=args.due)
print(f"Added: {todo.title}")
# ... handle command lain
if __name__ == "__main__":
main()
Bonus Features
- Color output (pakai
coloramaatau ANSI codes) - Filter by priority + due date
- Sort options
- Export to CSV
- Backup before save
Submission
Push ke GitHub: dicoding-genai-prep/projects/01-todo-cli/
README.md harus punya:
- Demo (screenshot atau GIF)
- Cara install + run
- Daftar fitur
- Code structure
Project 2 — Web Scraper Berita
Skill: requests, regex, BeautifulSoup, file I/O, error handling
Diagram Pipeline Scraper
flowchart LR
U[🌐 URL] -->|GET| F[fetch]
F -->|HTML| P[parse_article]
P -->|dict| L[list articles]
L --> S1[💾 save_json]
L --> S2[💾 save_csv]
F -->|error| E[🚨 logger.error]
Spec
Script yang scrape berita dari portal berita Indonesia, save ke JSON dan CSV.
Tech Stack
pip install requests beautifulsoup4 lxml
Implementation Outline
import requests
from bs4 import BeautifulSoup
from pathlib import Path
import json
import csv
from datetime import datetime
import time
import logging
logger = logging.getLogger(__name__)
class NewsScraper:
def __init__(self, base_url: str, delay: float = 1.0):
self.base_url = base_url
self.delay = delay
self.session = requests.Session()
self.session.headers.update({
"User-Agent": "Mozilla/5.0 (educational scraping)"
})
def fetch(self, url: str) -> str:
try:
r = self.session.get(url, timeout=10)
r.raise_for_status()
return r.text
except requests.RequestException as e:
logger.error(f"Failed to fetch {url}: {e}")
return ""
def parse_article(self, html: str) -> dict:
soup = BeautifulSoup(html, "lxml")
return {
"title": soup.find("h1").get_text(strip=True),
"content": " ".join(p.get_text(strip=True) for p in soup.find_all("p")),
"date": ...,
"url": ...,
}
def get_article_urls(self, list_page_html: str) -> list[str]:
soup = BeautifulSoup(list_page_html, "lxml")
return [a["href"] for a in soup.select("article a")]
def scrape(self, max_articles: int = 10) -> list[dict]:
articles = []
list_html = self.fetch(self.base_url)
urls = self.get_article_urls(list_html)[:max_articles]
for url in urls:
time.sleep(self.delay) # be polite
html = self.fetch(url)
if html:
article = self.parse_article(html)
article["scraped_at"] = datetime.now().isoformat()
articles.append(article)
return articles
def save_json(self, articles: list[dict], path: str):
Path(path).write_text(
json.dumps(articles, indent=2, ensure_ascii=False),
encoding="utf-8"
)
def save_csv(self, articles: list[dict], path: str):
if not articles:
return
with open(path, "w", encoding="utf-8", newline="") as f:
writer = csv.DictWriter(f, fieldnames=articles[0].keys())
writer.writeheader()
writer.writerows(articles)
if __name__ == "__main__":
logging.basicConfig(level=logging.INFO)
scraper = NewsScraper("https://example-news.com")
articles = scraper.scrape(max_articles=20)
scraper.save_json(articles, "news.json")
scraper.save_csv(articles, "news.csv")
print(f"Scraped {len(articles)} articles")
Aturan Etika Scraping
- Cek
robots.txtdulu (https://site.com/robots.txt) - Hormati rate limit — kasih delay
- Set User-Agent yang masuk akal
- Tidak scrape konten yang dibatasi (login required, paid content)
- Tidak overload server kecil
Bonus Features
- Pagination (multi-page)
- Cek duplicate articles
- Sentiment analysis sederhana (count keyword positif/negatif)
- Schedule pakai
schedulelibrary
Submission
dicoding-genai-prep/projects/02-news-scraper/
Project 3 — Personal Knowledge Base CLI
Skill: OOP, JSON, regex, search, advanced Python
Ini project paling strategic karena nantinya bisa kamu evolve jadi RAG system di Fase 7.
Diagram Knowledge Base
classDiagram
class Note {
+int id
+str title
+str content
+list~str~ tags
+list~int~ links
+str created
}
class KnowledgeBase {
-Path path
-list~Note~ notes
+new(title, content, tags)
+get(id)
+search(query)
+filter_by_tag(tag)
+link(source, target)
+graph()
}
KnowledgeBase "1" *-- "*" Note : composition
Evolusi ke RAG System (Fase 7)
flowchart LR
KB[📚 KB sekarang<br/>keyword search] -->|tambah| EMB[🔢 Embedding per note]
EMB --> VDB[🗄️ Vector DB]
VDB --> SEM[🔎 Semantic search]
SEM --> RAG[🤖 RAG Q&A]
Spec
CLI knowledge base untuk simpan catatan personal — dengan tagging, search, dan link antar note.
python kb.py new "Belajar AI" --tags "ai,learning,bootcamp"
python kb.py list --tag ai
python kb.py search "machine learning"
python kb.py view 5
python kb.py edit 5
python kb.py link 5 7 # link note 5 ke note 7
python kb.py graph # show all links
Format Note
{
"id": 1,
"title": "Belajar AI",
"content": "...markdown content...",
"tags": ["ai", "learning"],
"links": [3, 5],
"created": "2026-05-13T10:00:00",
"updated": "2026-05-13T11:00:00"
}
Implementation Outline
from dataclasses import dataclass, field, asdict
from datetime import datetime
from pathlib import Path
from typing import Optional
import json
import re
import argparse
@dataclass
class Note:
id: int
title: str
content: str = ""
tags: list[str] = field(default_factory=list)
links: list[int] = field(default_factory=list)
created: str = field(default_factory=lambda: datetime.now().isoformat())
updated: Optional[str] = None
class KnowledgeBase:
def __init__(self, path: str = "kb.json"):
self.path = Path(path)
self.notes: list[Note] = []
self.load()
def _next_id(self) -> int:
return max((n.id for n in self.notes), default=0) + 1
def load(self):
if not self.path.exists():
return
data = json.loads(self.path.read_text(encoding="utf-8"))
self.notes = [Note(**n) for n in data]
def save(self):
data = [asdict(n) for n in self.notes]
self.path.write_text(
json.dumps(data, indent=2, ensure_ascii=False),
encoding="utf-8"
)
def new(self, title: str, content: str = "", tags: list[str] = None) -> Note:
note = Note(
id=self._next_id(),
title=title,
content=content,
tags=tags or []
)
self.notes.append(note)
self.save()
return note
def get(self, id: int) -> Optional[Note]:
return next((n for n in self.notes if n.id == id), None)
def search(self, query: str) -> list[Note]:
"""Full-text search di title + content."""
pattern = re.compile(re.escape(query), re.IGNORECASE)
return [
n for n in self.notes
if pattern.search(n.title) or pattern.search(n.content)
]
def filter_by_tag(self, tag: str) -> list[Note]:
return [n for n in self.notes if tag in n.tags]
def link(self, source_id: int, target_id: int):
source = self.get(source_id)
if not source:
raise ValueError(f"Note {source_id} not found")
if not self.get(target_id):
raise ValueError(f"Note {target_id} not found")
if target_id not in source.links:
source.links.append(target_id)
source.updated = datetime.now().isoformat()
self.save()
def graph(self) -> dict[int, list[int]]:
"""Return adjacency list."""
return {n.id: n.links for n in self.notes}
Bonus Features (Wajib Coba Minimal 2)
- Markdown render di terminal pakai
richlibrary - Backlinks — note A ke note B otomatis update note B punya backlink dari A
- Tag autocomplete dari tag yang sudah ada
- Export to markdown — generate folder
notes_md/dengan file per note - Stats — total notes, top tags, orphan notes (tanpa link)
- TUI dengan
textuallibrary (advanced)
Kenapa Project Ini Strategic?
Di Fase 7, kamu akan bikin RAG system. Knowledge base ini bisa di-upgrade:
- Tambah embedding ke setiap note
- Save ke vector DB
- Search jadi semantic, bukan keyword
- Tanya jawab AI berdasarkan note kamu sendiri
Project ini menjadi dataset dan starter untuk capstone Dicoding kamu nanti.
Submission
dicoding-genai-prep/projects/03-knowledge-base/
Project 4 (Bonus) — API Wrapper Library
Skill: OOP, requests, env vars, error handling, type hints
Spec
Bikin wrapper Python untuk Gemini API (gratis tier). Jadi library kecil yang bisa kamu pakai di project lain.
from my_gemini import GeminiClient
client = GeminiClient(api_key=os.environ["GEMINI_API_KEY"])
response = client.generate("Tulis puisi tentang Bandung")
print(response.text)
# Streaming
for chunk in client.generate_stream("..."):
print(chunk, end="", flush=True)
# Chat (multi-turn)
chat = client.start_chat()
chat.send("Halo")
chat.send("Siapa kamu?")
print(chat.history)
Implementation
import os
import requests
from typing import Optional, Iterator
from dataclasses import dataclass
@dataclass
class GenerateResponse:
text: str
model: str
usage: dict
class GeminiClient:
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"
def __init__(self, api_key: Optional[str] = None, model: str = "gemini-1.5-flash"):
self.api_key = api_key or os.environ.get("GEMINI_API_KEY")
if not self.api_key:
raise ValueError("API key required")
self.model = model
def generate(self, prompt: str, **kwargs) -> GenerateResponse:
url = f"{self.BASE_URL}/models/{self.model}:generateContent"
params = {"key": self.api_key}
body = {"contents": [{"parts": [{"text": prompt}]}]}
r = requests.post(url, params=params, json=body, timeout=30)
r.raise_for_status()
data = r.json()
text = data["candidates"][0]["content"]["parts"][0]["text"]
return GenerateResponse(text=text, model=self.model, usage=data.get("usageMetadata", {}))
Pelajaran Penting
- Tidak hardcode API key — pakai env var
- Pakai
python-dotenvuntuk load.envfile - Add
.envke.gitignore— JANGAN PERNAH commit API key - Handle error API dengan baik
- Type hints untuk autocomplete IDE
Submission
dicoding-genai-prep/projects/04-gemini-wrapper/
Ini project pertama kamu pakai LLM API. Bukan main, tapi rasakan dulu sebelum Fase 6+.
Cara Submit & Showcase
Untuk setiap project:
1. Repo Structure
projects/01-todo-cli/
├── README.md ← penjelasan, demo, cara pakai
├── requirements.txt ← dependencies
├── src/
│ └── todo.py
├── tests/
│ └── test_todo.py
├── .gitignore
└── .env.example ← template env vars (bukan .env asli!)
2. README Template
# Todo CLI App
Manage todos dari command line.
## Demo

## Install
\`\`\`bash
git clone <repo>
cd 01-todo-cli
pip install -r requirements.txt
\`\`\`
## Usage
\`\`\`bash
python src/todo.py add "Belajar Python"
python src/todo.py list
\`\`\`
## Features
- Add, list, complete, delete todos
- Filter by status, priority
- Search by keyword
- Save to JSON
## Tech
- Python 3.11
- argparse, json, dataclasses
## Tests
\`\`\`bash
pytest tests/
\`\`\`
3. Tambah ke Profile README
Update profile README GitHub kamu dengan link project. Saat HR ngintip, project ini langsung kelihatan.
Common Mistakes & FAQ
❌ Mistake 1: Commit API key ke Git
# ❌ FATAL — exposed di history selamanya
git add .env
git commit -m "config"
# ✅ Selalu pakai .gitignore
echo ".env" >> .gitignore
echo "*.key" >> .gitignore
# ✅ Pakai .env.example sebagai template
echo "GEMINI_API_KEY=your-key-here" > .env.example
Kalau sudah terlanjur commit:
- Rotate API key segera (anggap key sudah bocor)
- Pakai
git filter-repountuk hapus dari history - Force push (kalau project personal)
❌ Mistake 2: Tidak ada error handling di network call
# ❌ crash pertama kali API down
response = requests.get(url)
data = response.json()
# ✅ Handle network errors + timeout
try:
response = requests.get(url, timeout=10)
response.raise_for_status()
data = response.json()
except requests.exceptions.Timeout:
logger.warning("Request timed out, retrying...")
except requests.exceptions.HTTPError as e:
logger.error(f"HTTP error: {e}")
❌ Mistake 3: Project tanpa README
Recruiter buka repo, lihat README kosong → close tab. README adalah storefront project kamu.
Minimal:
- 1 paragraf "apa ini"
- Cara install + run
- Screenshot/demo
- 3-5 fitur utama
❌ Mistake 4: Folder structure berantakan
# ❌ semua di root
todo.py
data.json
test.py
helpers.py
README.md
# ✅ terorganisir
todo-cli/
├── README.md
├── requirements.txt
├── .gitignore
├── src/
│ └── todo.py
├── tests/
│ └── test_todo.py
└── data/ ← .gitignore-d
❌ Mistake 5: Commit "wip" 50 kali
# ❌
git log
> wip
> wip2
> fix
> wip3
# ✅ commit message yang descriptive
git log
> feat: add todo deletion by id
> fix: handle missing due date in list view
> refactor: extract TodoManager save logic
FAQ
Q: Pilih project mana dulu? A: Project 1 (Todo CLI) untuk warm-up. Lalu Project 3 (KB) karena strategic untuk fase berikutnya.
Q: Perlu test untuk project mini? A: Wajib coba pytest minimal 3-5 test. Recruiter sangat appreciate kandidat yang test code-nya.
Q: Bahasa apa untuk README — Indonesia atau English? A: Untuk repo public yang ingin di-showcase ke recruiter international, English. Untuk learning lokal, ID OK.
Q: Boleh pakai LLM untuk bantu coding? A: Boleh, tapi: (1) pahami tiap baris, (2) test sendiri, (3) refactor sesuai gaya kamu. Kalau cuma copy-paste tanpa paham, kamu rugi sendiri saat interview.
Q: Berapa banyak project yang ideal di portfolio? A: 3-5 project yang dalam lebih baik dari 20 project shallow. Quality > quantity. Project yang ada README clean + test + dokumentasi proper akan menonjol.
Q: Gimana cara showcase project ke recruiter Dicoding? A:
- Pin project terbaik di GitHub profile
- Profile README highlight 2-3 project utama
- LinkedIn post saat selesai project (with screenshot/demo)
- Update CV dengan link GitHub
Cek Pemahaman
Setelah selesai minimal 1 project, pastikan kamu bisa:
- Setup project Python dari scratch (folder, env, dependencies)
- Bikin class dengan method dan dataclass
- Read/write JSON dan CSV
- Pakai argparse untuk CLI
- Handle error dengan try/except + logging
- Test dengan pytest
- Tulis README yang clear
- Push ke GitHub dengan history rapi
Tantangan Tambahan
- Code review yourself: baca ulang code 1 minggu kemudian. Refactor yang jelek.
- Ask LLM untuk review: kirim code-mu ke Claude/ChatGPT, minta feedback.
- Live demo: record video 2-3 menit demo project. Posting di LinkedIn.
- Open issue ke project lain: baca code orang lain di GitHub, kontribusi minor (typo, dokumentasi).
Selanjutnya: challenges.md — final challenge konsolidasi Fase 2 sebelum lanjut ke Fase 3.