Building Aiella: From Idea to Working EU AI Act Scanner in 3 Weeks

Three weeks ago I started building Aiella, an EU AI Act compliance monitoring platform. This week I hit Milestone 1: a fully working scanner that takes a description of an AI system, classifies its EU AI Act risk tier using Claude Sonnet 4.6 via AWS Bedrock, and emails a complete compliance report within seconds.

Here’s exactly how I built it and what I learned.


What the Scanner Does

The flow is straightforward:

  1. A company fills out a 10-question form describing their AI system — what it does, what data it processes, whether it has EU users, whether decisions are automated
  2. The form submits to a FastAPI backend
  3. A structured prompt sends the description to Claude Sonnet 4.6 via AWS Bedrock
  4. Claude returns a JSON classification: risk tier, applicable EU AI Act Articles, priority actions, and GDPR overlap
  5. An HTML email report is rendered and sent via AWS SES
  6. The user sees a confirmation page — results are delivered privately to their inbox, not displayed in the browser

The whole round trip takes about 10 to 15 seconds. The email arrives looking like a professional compliance document, not a developer’s test output.
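The six steps above can be sketched as a single pipeline function. The helper names (`classify`, `render`, `send`) are stand-ins for the real Bedrock, Jinja2, and SES calls, which the post doesn't show; the actual module layout may differ.

```python
def run_scan(email: str, description: str, classify, render, send) -> str:
    """Minimal sketch of the scanner pipeline (steps 3-6)."""
    result = classify(description)  # steps 3-4: Claude via Bedrock returns JSON
    html = render(result)           # step 5: Jinja2 renders the HTML report
    send(email, html)               # step 5: delivered via SES
    # Step 6: the browser only ever sees a confirmation, never the report.
    return f"Your report is on its way to {email}. Check your inbox."
```

Keeping the pipeline pure like this, with its dependencies injected, also makes it trivial to unit-test without touching AWS.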


The Technical Stack

FastAPI for the web layer. Lightweight, async-native, and the right tool for an API-first product. Jinja2 templates handle the form and confirmation pages.

AWS Bedrock with Claude Sonnet 4.6 for the risk classification. I’m using the cross-region inference profile, which routes requests across US regions for better availability. The Bedrock integration is straightforward: standard boto3 client, invoke_model, parse the JSON response.
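The Bedrock call can be sketched as follows. The model ID is a placeholder, not the real inference-profile identifier, and the response parsing is split into its own function so it can be tested without AWS credentials.

```python
import json

def parse_bedrock_body(raw: bytes) -> dict:
    """Pull the JSON classification out of a Bedrock messages-API response."""
    payload = json.loads(raw)
    # Claude's reply text lives in the first content block.
    return json.loads(payload["content"][0]["text"])

def classify(prompt: str, description: str) -> dict:
    import boto3  # lazy import so parse_bedrock_body stays testable offline
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId="us.anthropic.claude-sonnet-4-6",  # placeholder profile ID
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [
                {"role": "user", "content": prompt + "\n\n" + description}
            ],
        }),
    )
    return parse_bedrock_body(response["body"].read())
```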

A versioned prompt system in ml/prompts/risk-classifier/v1.0.txt. The prompt is treated as a model artifact — versioned, testable, and never edited in place. When I improve the classification logic I create v1.1.txt rather than overwriting the previous version. This is the foundation of the MLflow experiment tracking that comes in Week 5.

A prompt evaluation suite in ml/tests/test_cases.json with 10 known scenarios and expected outputs. Before any prompt version can deploy, it must pass at least 90% of these test cases. The current v1.0 prompt passes all 10. This is CI/CD for ML, the same concept that will gate EKS deployments in Week 12.
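The 90% deployment gate reduces to a few lines. The `description` and `expected_tier` field names are assumptions about the test_cases.json schema, not its actual shape.

```python
PASS_THRESHOLD = 0.9  # a prompt version must pass >= 90% of cases to deploy

def evaluate(test_cases: list[dict], classify) -> float:
    """Fraction of cases where classify(description) matches the expected tier."""
    passed = sum(
        1 for case in test_cases
        if classify(case["description"]) == case["expected_tier"]
    )
    return passed / len(test_cases)

def gate(test_cases: list[dict], classify) -> bool:
    """True only if this prompt version is allowed to deploy."""
    return evaluate(test_cases, classify) >= PASS_THRESHOLD
```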

AWS SES for email delivery. Domain verified for aiella.com, sending from scanner@aiella.com. The email is HTML with a plain text fallback, responsive on mobile, and includes the legal disclaimers in the right places.
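The SES send follows the standard boto3 send_email shape; building the message dict in a separate function keeps the HTML-plus-plain-text structure testable without an AWS session. Subject line and function names here are illustrative.

```python
def build_ses_message(subject: str, html: str, text: str) -> dict:
    """SES Message structure: HTML body with a plain-text fallback."""
    return {
        "Subject": {"Data": subject, "Charset": "UTF-8"},
        "Body": {
            "Html": {"Data": html, "Charset": "UTF-8"},
            "Text": {"Data": text, "Charset": "UTF-8"},
        },
    }

def send_report(to_address: str, html: str, text: str) -> None:
    import boto3  # lazy import keeps build_ses_message testable offline
    ses = boto3.client("ses")
    ses.send_email(
        Source="scanner@aiella.com",  # the verified sending address
        Destination={"ToAddresses": [to_address]},
        Message=build_ses_message("Your EU AI Act risk report", html, text),
    )
```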


The Decisions That Mattered

Results by email only, not in the browser

My first instinct was to display results directly in the browser after form submission. I changed this for a few reasons.

Email-only delivery means results are private. Only someone with access to the submitted email address can see the report. It also encourages real email addresses, which matters for lead generation. And it feels more like a professional compliance tool than a toy web app.

The confirmation page is intentionally minimal: “Your report is on its way to [email]. Check your inbox.” Clean and credible.

Structured JSON output from Claude

The prompt instructs Claude to return only valid JSON with no preamble, no explanation, no markdown fences. The response gets parsed directly into a Python dict and passed to the email template renderer.

Getting this right took some iteration. A few things that helped: explicit instruction to return only JSON, a stripping step that removes any backtick fences Claude occasionally adds despite instructions, and graceful error handling when parsing fails.

Form validation in the browser, not just the server

The HTML form has required attributes but browsers don’t always enforce them consistently. I added JavaScript validation that runs on submit, checks every field, and shows a styled error list at the top of the form if anything is missing. This catches empty fields before anything hits the server.

The WeasyPrint decision

I originally planned to generate PDF reports. WeasyPrint works flawlessly on Linux but has significant dependency issues on Windows. Since I’m developing on Windows and deploying to Linux on EKS, the pragmatic call was to use HTML email for Milestone 1 and add PDF generation when running on Linux in production. The HTML email actually looks more professional than most PDFs I’ve seen from compliance tools.


The Prompt Engineering Process

Week 2 was entirely prompt engineering: writing and testing the risk classification prompt against 10 known scenarios.

The EU AI Act’s risk framework is specific enough that a well-structured prompt produces accurate classifications. The key was giving Claude the right decision rules:

Prohibited systems include social scoring, real-time biometric surveillance, and subliminal manipulation. High-risk systems under Annex III cover employment, credit, education, essential services, law enforcement, migration, and safety-critical infrastructure. Limited risk covers chatbots that must disclose they are AI and AI-generated content that must be labeled. Everything else is minimal risk.

The trickiest cases were systems touching multiple categories, or where the EU users flag changes everything. An AI system making employment decisions with no EU users doesn’t trigger the Act at all. The prompt handles this correctly.
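Condensed into code, those decision rules look roughly like this. It's a reference sketch for intuition only; the actual classification happens in the prompt, and the category strings are simplifications of the Act's language.

```python
PROHIBITED = {"social scoring", "real-time biometric surveillance",
              "subliminal manipulation"}
ANNEX_III = {"employment", "credit", "education", "essential services",
             "law enforcement", "migration", "safety-critical infrastructure"}

def rough_tier(use_case: str, has_eu_users: bool,
               is_chatbot: bool = False, generates_content: bool = False) -> str:
    """Toy approximation of the EU AI Act risk-tier decision rules."""
    if not has_eu_users:
        return "out of scope"  # without EU users, the Act doesn't apply at all
    if use_case in PROHIBITED:
        return "prohibited"
    if use_case in ANNEX_III:
        return "high risk"
    if is_chatbot or generates_content:
        return "limited risk"  # transparency obligations only
    return "minimal risk"
```

Note how the EU-users check comes first: the same employment system flips from high risk to out of scope based on that single flag.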

After testing against all 10 scenarios, the v1.0 prompt achieved 100% accuracy. The additional Articles Claude identifies beyond the minimum expected ones are legitimate — it’s being thorough rather than wrong.


What I Learned About the EU AI Act

Building a classifier forces you to actually read the regulation carefully. A few things that surprised me.

The extraterritorial reach is broad. The Act applies to any AI system affecting EU individuals, regardless of where the provider is based. A San Diego SaaS company with European customers is subject to the Act for those customers. This is identical to GDPR’s territorial scope and equally underappreciated.

Article 12 is more specific than people realize. Most compliance discussions focus on risk classification. But Article 12 (record-keeping) has concrete technical requirements: millisecond-precision timestamps, input references, confidence scores, model version identifiers, human intervention flags. Most production AI systems aren’t logging these fields.

The “ongoing” requirement is the gap most companies miss. Articles 9 and 72 both use the word “ongoing.” Risk management and post-market monitoring must be continuous processes, not one-time assessments. A risk assessment filed at deployment and never updated doesn’t satisfy the requirement. This is the core problem Aiella’s monitoring product addresses.

The deadline uncertainty is real but not a reason to wait. The August 2, 2026 enforcement date may extend to December 2027 pending a Council amendment, but that extension isn’t yet law. Companies using the uncertainty as a reason to delay are taking on more risk, not less. The monitoring infrastructure takes time to build and the companies that start now will be ahead regardless of when enforcement begins.


What’s Next

This week brings Docker containerization, packaging everything into containers so the scanner runs consistently anywhere. This sets up the EKS deployment later.

Weeks 5 and 6 add MLflow experiment tracking, logging every Claude call as a run with prompt version, latency, quality score, and output metrics. This is where the scanner starts becoming a production MLOps system rather than a development prototype.

Weeks 7 and 8 introduce SQS async processing, decoupling the API from the Claude call so the API returns immediately and processing happens in the background.

Months 3 and 4 bring the monitoring product: the SDK customers install alongside their AI systems to enable continuous Article 9, 12, 13, 14, and 72 monitoring.

The scanner at aiella.com will be live for public use soon. If you want early access or want to discuss EU AI Act compliance for your own AI systems, connect with me on LinkedIn or join the mailing list at aiella.com.
