After wrestling with fonts, diacritics, and PDF rendering quirks, I’ve finally arrived at a solution that works: a Python script that accurately renders fully vocalized Arabic Quran text, complete with harakat, into a beautiful, multi-page PDF.
This was more than just a technical hurdle. It was about honoring the complexity and beauty of Quranic Arabic — and making sure that the sacred diacritics (harakat) didn’t get lost in translation or omitted by software limitations.
Here’s how we did it.
The Problem: Harakat Missing in PDF Output
In previous checkpoints, I managed to generate structured Quran PDFs, even inserting Basmala before each Surah, and aligning Arabic and English beautifully.
But I hit a wall: when trying to render verses like:
بِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ
The harakat (the kasrah, shaddah, maddah) were either missing or rendered as rectangles.
Even though my console showed the correct text, the PDF silently stripped or failed to display the diacritics.
The Discovery: Platypus and Paragraph Save the Day
The breakthrough came when I shifted from drawing text directly with canvas.drawString() to using reportlab.platypus components like SimpleDocTemplate, Paragraph, and Spacer.
Combined with:
- arabic_reshaper (configured to keep harakat)
- python-bidi (for right-to-left alignment)
- A reliable Arabic font like Amiri-Regular.ttf
…everything just worked.
The Fix: Use ArabicReshaper with delete_harakat: False
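from arabic_reshaper import ArabicReshaper
from bidi.algorithm import get_display

# Keep harakat; the reshaper's default configuration removes them
reshaper = ArabicReshaper(configuration={'delete_harakat': False})
reshaped = reshaper.reshape(ayah_ar)
bidi_text = get_display(reshaped)  # named bidi_text so the bidi package is not shadowed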
This tells the reshaper to keep all harakat and pass them to the PDF as-is.
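One rough way to confirm that nothing was dropped is to count the combining marks before and after each processing step. The sketch below uses only the standard library; the helper names (is_arabic_mark, count_marks, strip_marks) are my own for illustration and are not part of arabic_reshaper or python-bidi. It treats any nonspacing mark in the basic Arabic Unicode block as a haraka, which is an assumption that also catches a few Quranic annotation signs.

```python
import unicodedata


def is_arabic_mark(ch: str) -> bool:
    """True for nonspacing combining marks (harakat and similar) in the Arabic block."""
    return "\u0600" <= ch <= "\u06FF" and unicodedata.category(ch) == "Mn"


def count_marks(text: str) -> int:
    """Number of harakat-like marks in the text."""
    return sum(1 for ch in text if is_arabic_mark(ch))


def strip_marks(text: str) -> str:
    """Simulates a step that deletes harakat, for comparison."""
    return "".join(ch for ch in text if not is_arabic_mark(ch))


# "bismi" written with escapes: ba + kasra, seen + sukun, meem + kasra
sample = "\u0628\u0650\u0633\u0652\u0645\u0650"
print(count_marks(sample))               # marks present in the input
print(count_marks(strip_marks(sample)))  # marks left after a stripping step
```

Comparing count_marks(ayah_ar) with count_marks(reshaped) should then show whether the reshaping step kept the diacritics. It says nothing about whether the font can draw them, so a visual check of the PDF is still needed.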
The Result: Harakat-Rich, Structured, Multi-Surah PDF
Each page now beautifully includes:
- A Surah header in Arabic and English
- An auto-inserted Basmala, conditionally skipped for Surah 9
- Each ayah fully vocalized, aligned right using Paragraph
- Its corresponding English translation, aligned left
No more harakat loss. No more boxes. Just pure, readable, respectful rendering.
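For readers who want to see the shape of that layout, here is a minimal sketch of how the pieces could fit together with Platypus. It is not the exact script: the function names (english_markup, build_pdf), the row shape, the style values, and the output filename are assumptions for illustration, and it expects Amiri-Regular.ttf in the working directory. The reportlab, arabic_reshaper, and python-bidi imports sit inside build_pdf so the small escaping helper can be used on its own.

```python
from xml.sax.saxutils import escape


def english_markup(ayah_no, text):
    """Paragraph parses a small XML-like markup, so escape &, < and > first."""
    return f"{ayah_no}. {escape(str(text))}"


def build_pdf(rows, font_path="Amiri-Regular.ttf", out_path="quran_platypus_sketch.pdf"):
    """rows: iterable of (ayah_no, ayah_ar, ayah_en) tuples (assumed shape)."""
    # Third-party imports live here so english_markup works without them installed.
    from arabic_reshaper import ArabicReshaper
    from bidi.algorithm import get_display
    from reportlab.lib.enums import TA_RIGHT
    from reportlab.lib.pagesizes import A4
    from reportlab.lib.styles import ParagraphStyle, getSampleStyleSheet
    from reportlab.pdfbase import pdfmetrics
    from reportlab.pdfbase.ttfonts import TTFont
    from reportlab.platypus import Paragraph, SimpleDocTemplate, Spacer

    pdfmetrics.registerFont(TTFont("Amiri", font_path))
    arabic_style = ParagraphStyle(
        "ArabicAyah", fontName="Amiri", fontSize=18, leading=32, alignment=TA_RIGHT
    )
    english_style = getSampleStyleSheet()["Normal"]
    reshaper = ArabicReshaper(configuration={"delete_harakat": False})

    story = []
    for ayah_no, ayah_ar, ayah_en in rows:
        # Reshape (keeping harakat), then reorder for right-to-left display
        shaped = get_display(reshaper.reshape(str(ayah_ar)))
        story.append(Paragraph(escape(shaped), arabic_style))
        story.append(Paragraph(english_markup(ayah_no, ayah_en), english_style))
        story.append(Spacer(1, 12))

    # SimpleDocTemplate flows the story across pages; no manual y tracking needed.
    SimpleDocTemplate(out_path, pagesize=A4).build(story)
```

One thing I would check carefully with this approach: because get_display reorders the whole string before Paragraph wraps it, a long ayah that spills onto a second line may wrap in the wrong visual order. Short ayahs are not affected.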
Full Technologies Used
- pandas: for loading and filtering the Quran dataset
- reportlab.platypus: for PDF document structure
- arabic_reshaper: to reshape Arabic with diacritics
- python-bidi: to properly reorder RTL text
- Amiri-Regular.ttf: for Quran-friendly Arabic font
Checkpoint 3 Accomplishments
Here’s what we’ve nailed down so far:
- ✅ Accurate harakat rendering using Paragraph
- ✅ Flexible multi-line Arabic + English layout
- ✅ Basmala insertion before each Surah (except At-Tawbah)
- ✅ Font configuration that preserves diacritics
- ✅ PDF output that matches the beauty and structure of printed Mushaf
Code Explanation: Generating a Bilingual Quran PDF with Harakat and Basmala
This Python script takes a Quran dataset and generates a PDF that includes both the Arabic verses and their English translations, along with the Basmala (بِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ) added before each Surah. Additionally, the script ensures the Arabic text is rendered with harakat (diacritics) and displays properly using a custom font.
1. Importing Required Libraries
import pandas as pd
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import A4
from reportlab.pdfbase.ttfonts import TTFont
from reportlab.pdfbase import pdfmetrics
import arabic_reshaper
from bidi.algorithm import get_display
Here, we import the necessary libraries:
- pandas: For reading and handling the Quran dataset stored as a CSV.
- reportlab: The core library used for PDF creation.
  - canvas: Used to draw on the PDF.
  - pagesizes: Defines the size of the page (A4 in this case).
  - pdfmetrics and TTFont: For loading custom fonts.
- arabic_reshaper: A tool for reshaping Arabic text to ensure proper letter connection and display.
- bidi.algorithm: To handle right-to-left (RTL) text direction (critical for Arabic text rendering).
2. Loading the Dataset
df = pd.read_csv("The Quran Dataset.csv")
df = df[df["surah_no"] >= 110]
df = df.sort_values(by=["surah_no", "ayah_no_surah"])
- We load the dataset (assumed to be a CSV file with Quranic data).
- Filter: Only Surahs from 110 onward are selected (last 5 Surahs in this case).
- Sorting: The dataset is sorted by Surah number and Ayah number for proper order.
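To make the filter and sort steps concrete, here is a small sketch that runs them on an in-memory CSV instead of the real file. The load_surahs helper, its first_surah parameter, and the sample rows are made up for illustration; only the column names surah_no and ayah_no_surah come from the script above. It also checks that the expected columns exist, which fails early if a different dataset is used.

```python
import io

import pandas as pd

# Tiny stand-in for "The Quran Dataset.csv" (illustrative rows only)
SAMPLE_CSV = (
    "surah_no,ayah_no_surah,surah_name_en\n"
    "112,2,Al-Ikhlas\n"
    "109,1,Al-Kafirun\n"
    "112,1,Al-Ikhlas\n"
    "110,1,An-Nasr\n"
)


def load_surahs(source, first_surah=110):
    """Read the CSV, keep surahs >= first_surah, and sort into reading order."""
    df = pd.read_csv(source)
    missing = {"surah_no", "ayah_no_surah"} - set(df.columns)
    if missing:
        raise ValueError(f"dataset is missing columns: {sorted(missing)}")
    df = df[df["surah_no"] >= first_surah]
    return df.sort_values(by=["surah_no", "ayah_no_surah"]).reset_index(drop=True)


df = load_surahs(io.StringIO(SAMPLE_CSV))
print(df)
```

Passing a file path instead of the StringIO object would read the real dataset the same way.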
3. Setting Up the PDF
c = canvas.Canvas("quran_last_5_surahs_bismillah_harakat.pdf", pagesize=A4)
width, height = A4
y_position = height - 50
- Canvas: Initializes the PDF file named "quran_last_5_surahs_bismillah_harakat.pdf".
- A4 Pagesize: Defines the page dimensions (A4).
- y_position: Sets the initial vertical position where the content starts.
4. Registering the Arabic Font
pdfmetrics.registerFont(TTFont('AmiriQuran', 'PlaypenSansArabic-VariableFont_wght.ttf'))
c.setFont("AmiriQuran", 16)
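- We register the custom font under the name AmiriQuran (make sure the .ttf font file is in the working directory). Note that the file loaded in this snippet is PlaypenSansArabic-VariableFont_wght.ttf; to use the Amiri font mentioned earlier, point TTFont at Amiri-Regular.ttf instead.
- Set the font size to 16 for the Arabic text.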
5. Iterating Through the DataFrame to Add Surah Titles and Verses
current_surah = ""
for _, row in df.iterrows():
    surah_en = row["surah_name_en"]
    surah_ar = row["surah_name_ar"]
    ayah_no = row["ayah_no_surah"]
    ayah_ar = row["ayah_ar"]
    ayah_en = row["ayah_en"]
- Iterating over each row: Each row contains data for an ayah (verse) in the Quran.
- We extract:
  - Surah name in English (surah_name_en) and Arabic (surah_name_ar).
  - Ayah number in the surah (ayah_no_surah).
  - Arabic text (ayah_ar) and English translation (ayah_en).
6. Adding Surah Titles and Basmala
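    if surah_en != current_surah:
        current_surah = surah_en
        y_position -= 30
        if y_position < 100:
            c.showPage()
            y_position = height - 50
            c.setFont("AmiriQuran", 16)
        # English title on the left (Helvetica has no Arabic glyphs)
        c.setFont("Helvetica-Bold", 14)
        c.drawString(50, y_position, f"Surah {surah_en}")
        # Arabic title on the right, reshaped and drawn with the Arabic font
        c.setFont("AmiriQuran", 16)
        c.drawRightString(width - 50, y_position, get_display(arabic_reshaper.reshape(surah_ar)))
        y_position -= 25
- We check whether the row starts a new Surah and then adjust the vertical position (y_position), starting a new page if there is not enough room.
- Surah Title: The English name is drawn in Helvetica-Bold on the left. The Arabic name is reshaped and drawn with the Arabic font on the right, since Helvetica has no Arabic glyphs.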
7. Adding Basmala (Before Every Surah)
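        # Skip the Basmala for Surah 9 (At-Tawbah)
        if row["surah_no"] != 9:
            c.setFont("AmiriQuran", 14)
            basmala = "بِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ"
            # Keep harakat: the default reshaper configuration deletes them
            reshaper = arabic_reshaper.ArabicReshaper(configuration={"delete_harakat": False})
            reshaped_basmala = reshaper.reshape(basmala)
            bidi_basmala = get_display(reshaped_basmala)
            c.drawRightString(width - 50, y_position, bidi_basmala)
            y_position -= 25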
- Basmala Insertion: The Basmala is added before each Surah except Surah 9 (At-Tawbah).
- Reshaping & Bidi Processing: The Basmala text is reshaped (to connect letters) and processed to ensure correct right-to-left alignment with bidi.
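The ordering logic in steps 6 and 7 (title, then Basmala, then ayahs) can be separated from the drawing code, which makes it easy to check without producing a PDF. The sketch below is one way to do that; plan_blocks and the block labels are hypothetical names, and the rows are plain dicts assumed to be sorted by surah and ayah already. It detects a new Surah by surah_no, the same column the Basmala rule uses.

```python
def plan_blocks(rows):
    """Yield ("title", name), ("basmala", None) or ("ayah", row) in print order."""
    current = None
    for row in rows:
        if row["surah_no"] != current:
            current = row["surah_no"]
            yield ("title", row["surah_name_en"])
            # Surah 9 (At-Tawbah) is the one Surah printed without a Basmala
            if current != 9:
                yield ("basmala", None)
        yield ("ayah", row)


# Illustrative rows only: one ayah of Surah 9, two of Surah 110
rows = [
    {"surah_no": 9, "surah_name_en": "At-Tawbah", "ayah_no_surah": 1},
    {"surah_no": 110, "surah_name_en": "An-Nasr", "ayah_no_surah": 1},
    {"surah_no": 110, "surah_name_en": "An-Nasr", "ayah_no_surah": 2},
]
kinds = [kind for kind, _ in plan_blocks(rows)]
print(kinds)
```

If the script is later run on the whole Quran, Surah 1 may need similar treatment when the dataset already stores the Basmala as its first ayah. That is an assumption to check against the data, not something this sketch handles.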
8. Adding Arabic and English Ayah
    # Prepare Arabic (reshaped + RTL), keeping harakat
    # (this reshaper could be created once, before the loop)
    reshaper = arabic_reshaper.ArabicReshaper(configuration={"delete_harakat": False})
    reshaped_text = reshaper.reshape(f"{ayah_no}. {ayah_ar}")
    bidi_text = get_display(reshaped_text)
    # Arabic ayah
    c.setFont("AmiriQuran", 18)  # instead of 16
    c.drawRightString(width - 50, y_position, bidi_text)
    y_position -= 25
    # English ayah
    c.setFont("Helvetica", 12)
    c.drawString(50, y_position, ayah_en)
    y_position -= 35
- Arabic Verse: Each ayah is reshaped for correct letter connection and alignment using arabic_reshaper and bidi. The font size is increased to 18 for visibility.
- English Translation: Displayed below the Arabic text in Helvetica.
9. Page Breaks (When Necessary)
    if y_position < 100:
        c.showPage()
        y_position = height - 50
        c.setFont("AmiriQuran", 16)
- New Page: When there’s not enough space left, the script creates a new page and resets the font size for continued ayah rendering.
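The same "subtract, then check for a new page" pattern appears in both step 6 and step 9, so it could be folded into a small helper. This is only a sketch of that idea: the PageCursor class is a hypothetical name, and its defaults (50 points at the top, a threshold of 100 points at the bottom) mirror the numbers used in the script above.

```python
class PageCursor:
    """Tracks the vertical drawing position on a page, measured in points."""

    def __init__(self, page_height, top_margin=50, bottom_margin=100):
        self.top = page_height - top_margin
        self.bottom = bottom_margin
        self.y = self.top
        self.page = 1

    def advance(self, amount):
        """Move down by `amount`; return True if this started a new page."""
        self.y -= amount
        if self.y < self.bottom:
            self.y = self.top
            self.page += 1
            return True
        return False


cursor = PageCursor(page_height=841.89)  # A4 height in points, rounded
for _ in range(20):
    # 25 points for the Arabic line + 35 for the English line
    cursor.advance(60)
print(cursor.page)
```

In the canvas loop, a True return from advance() would be the cue to call c.showPage() and set the font again, and cursor.y would replace y_position in the draw calls.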
10. Save the PDF
c.save()
- Finally, the PDF is saved with all the Surahs, Basmala, and ayahs formatted.
Get full code here. password: dosensibuk.com
Final Thoughts
This checkpoint felt incredibly fulfilling. After so much trial and error with rendering engines, font quirks, and reshaping logic, the text now finally appears in the PDF the way it’s meant to: precise, vocalized, and reverent.
It’s proof that code can carry culture, and that with the right tools and intention, we can digitize even the most sacred scripts respectfully.
I’m excited to take this further — maybe into a web-based Quran viewer, a memorization tool, or even a customizable PDF generator for teachers and students.
If you’re working on something similar, I’d love to collaborate.
Up Next for Checkpoint 4?
- Add page numbers or headers
- Render Juz markers
- Export one Surah per file
- Build a web UI for selecting surahs and outputting printable PDFs
Let’s keep going.