How We Built an AI Specifically Optimized for Resumes

ResumeCraft Engineering Team

9 min read

When we started ResumeCraft, we faced an uncomfortable truth: generic Large Language Models (LLMs) are terrible at writing resumes. If you ask ChatGPT to write a resume for a software engineer, it will likely produce a document filled with flowery adjectives, vague responsibilities, and hallucinations about skills you don't possess. It writes like a marketing intern, not a senior engineer.

The problem isn't intelligence; it's alignment. Foundation models are trained to be helpful, conversational assistants. They prioritize fluency over density. But a resume is a density optimization problem. You have roughly 800 words to convince a recruiter—and a parsing algorithm—that you are worth $200,000 a year. To solve this, we couldn't just wrap a prompt around GPT-4. We had to build a dedicated pipeline that understands the physics of hiring.

The "Hallucination" of Competence

Standard LLMs suffer from what we call "adjective drift." When asked to improve a bullet point, a standard model will often insert words like "spearheaded," "visionary," or "dynamic" to make the text sound more impressive. In reality, these words are red flags for recruiters. They signal fluff.

We solved this by curating a dataset of high-performing resumes, such as our Google Software Engineer resume example, which rely heavily on the "Action Verb + Task + Metric" formula. We used these examples to fine-tune a Llama-3-70b base model using QLoRA (Quantized Low-Rank Adaptation). Instead of training the model to "write a resume," we trained it to translate vague input into quantified impact. The loss function wasn't just predicting the next token; it was weighted to prefer tokens that represented numbers, technologies, and hard skills over soft descriptors.
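The weighting idea above can be sketched in a few lines. This is an illustrative simplification, not our training code: the skill and descriptor lists, the weight values, and the whitespace tokenization are all stand-ins for what a real tokenizer and a much larger vocabulary map would provide.

```python
import math
import re

# Hypothetical weight tables -- the real lists are far larger.
HARD_SKILLS = {"kubernetes", "react", "postgresql", "terraform"}
SOFT_DESCRIPTORS = {"dynamic", "visionary", "passionate", "spearheaded"}

def token_weight(token: str) -> float:
    """Upweight metrics and technologies, downweight fluff."""
    t = token.lower().strip()
    if re.search(r"\d", t):       # metrics like "40%" or "$50k"
        return 2.0
    if t in HARD_SKILLS:          # concrete technologies
        return 2.0
    if t in SOFT_DESCRIPTORS:     # adjectives the model should avoid
        return 0.25
    return 1.0

def weighted_cross_entropy(tokens, probs):
    """Mean of -w(t) * log p(t) over a target sequence."""
    total = sum(token_weight(t) * -math.log(p) for t, p in zip(tokens, probs))
    return total / len(tokens)
```

Under this loss, a wrong prediction on "40%" costs the model eight times as much as a wrong prediction on "dynamic," which is exactly the pressure that pushes generations toward quantified impact.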

Direct Preference Optimization (DPO) for Tone

Fine-tuning taught the model what to say, but we needed to control how it said it. We utilized Direct Preference Optimization (DPO) to align the model's output with the preferences of senior hiring managers. We generated thousands of resume bullet pairs—one written in the "standard AI" style and one written in a "senior engineer" style—and had human experts rank them.

The model learned that "Collaborated with the team to build a frontend" (Rejected) is inferior to "Architected a React-based frontend reducing load times by 40%" (Chosen). This alignment phase was crucial. It stripped away the robotic "AI accent" that recruiters can now spot from a mile away. You can see the result of this density-first approach in our Amazon Software Engineer resume example, where every line is tied to a Leadership Principle and a metric.
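For readers unfamiliar with DPO, the core of it is a single loss over preference pairs. The sketch below shows the standard DPO objective for one chosen/rejected pair; the log-probability values are made up for illustration.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """
    Standard DPO loss for one preference pair.
    Inputs are sequence log-probabilities under the policy being
    trained (pi_*) and under the frozen reference model (ref_*).
    """
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))
```

When the policy favors the chosen "senior engineer" bullet more strongly than the reference model does, the margin is positive and the loss drops below log 2; gradient descent therefore keeps widening that gap.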

Constrained Decoding for ATS Compliance

The most technically challenging aspect of resume generation is structure. Applicant Tracking Systems (ATS) are notoriously brittle parsers. If your resume's date formatting is inconsistent, or if your skills section is a non-standard table, the parser fails, and your application goes into the void.

We couldn't simply hope the LLM would produce valid JSON or correct formatting. We implemented constrained decoding (grammar-based sampling). By forcing the model's output logits to conform to a strict JSON schema during generation, we ensure 100% structural validity before the data is ever rendered into a PDF.

# Simplified example of our schema enforcement logic
from pydantic import BaseModel, Field
from typing import List

class ResumeBullet(BaseModel):
    action_verb: str = Field(..., description="Strong verb like 'Optimized', 'Deployed'")
    task: str
    metric: str = Field(..., description="Quantifiable impact, e.g., '20%' or '$50k'")
    tech_stack: List[str]

class ExperienceEntry(BaseModel):
    company: str
    role: str
    bullets: List[ResumeBullet]

# We force the LLM to generate tokens that strictly satisfy this Pydantic model
# preventing the generation of unstructured "fluff" text.
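Mechanically, schema enforcement happens at each decoding step by masking the logits of every token the grammar disallows. Here is a minimal sketch of that step, using a toy logit vector rather than a real model's output:

```python
import math

def mask_logits(logits, allowed_ids):
    """
    One grammar-constrained sampling step: set every token the schema
    does not permit at this position to -inf, so sampling can only
    ever emit structurally valid output.
    """
    return [x if i in allowed_ids else -math.inf
            for i, x in enumerate(logits)]

def greedy_pick(logits, allowed_ids):
    """Pick the highest-scoring token among those the grammar allows."""
    masked = mask_logits(logits, set(allowed_ids))
    return max(range(len(masked)), key=lambda i: masked[i])
```

Even if the model's raw preference is an invalid token, the mask guarantees the sampled token stays inside the schema, which is why validity holds by construction rather than by hope.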

This approach allows us to decouple content generation from rendering. The AI generates pure, structured data. Our rendering engine then takes that data and maps it to templates that we have reverse-engineered from major ATS platforms like Taleo and Greenhouse.

Retrieval-Augmented Generation (RAG) for Job Targeting

A generic resume is a weak resume. To beat the ATS, you need keyword matching. We built a RAG pipeline that ingests the target job description provided by the user. We embed this job description into a vector space and retrieve the most semantically relevant skills and keywords.

When the model generates your resume, it isn't just looking at your history; it is looking at your history through the lens of the job description. If the job requires "Kubernetes" and you have "Container Orchestration" in your history, our RAG layer prompts the model to swap the terminology to match the employer's specific dialect. This is subtle, but it significantly increases the match score in automated systems.
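The retrieval step can be illustrated with a toy bag-of-words similarity. In production a dense sentence-embedding model would be used instead, which is what lets "Container Orchestration" match "Kubernetes" with no lexical overlap at all; the function names and data here are purely illustrative.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" -- a real pipeline would use a
    # dense sentence-embedding model for semantic matching.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_matching_skills(job_description: str, candidate_skills, k=3):
    """Rank the candidate's skills by relevance to the job description."""
    jd = embed(job_description)
    ranked = sorted(candidate_skills,
                    key=lambda s: cosine(embed(s), jd), reverse=True)
    return ranked[:k]
```

The top-ranked skills are then injected into the generation prompt, so the model phrases the candidate's history in the employer's own vocabulary.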

Building ResumeCraft wasn't about wrapping a UI around OpenAI's API. It was about deconstructing the hiring process into a data science problem. By combining fine-tuning, DPO, constrained decoding, and RAG, we've built a tool that doesn't just write; it engineers your career history into a format that algorithms—and humans—love.

Ready to land your dream job?

Join thousands of professionals using ResumeCraft to bypass ATS filters and get more interviews with AI-optimized resumes.

Create Your Resume

Free to start • No credit card required