Excel as Database
Life can be funny
Sometimes life has a funny way of teaching you lessons.
One day you try that snowboard jump you’ve been avoiding for years and realize you’re more capable than you thought.
Some other day you catch yourself comparing your progress to someone else’s and learn, yet again, that envy is the thief of joy.
And then there are days when you learn that programmatically writing GitHub pull request statistics into an Excel file is a terrible idea.
This is the story about that last lesson.
How it started
Once upon a time, I was tasked to build a tool that could monitor the code review activity for a small but very important group of people.
Let’s call them mandatory code reviewers (MCRs).
If an MCR didn’t approve your PR, your PR wouldn’t merge.
If your PR wouldn’t merge, work stalled.
If work stalled, delivery timelines slipped.
If delivery timelines slipped, bonuses would be gone.
Nobody wants the bonuses gone.
There were:
- 6 MCRs total
- 3 backend
- 3 client
- ~7 development teams
- Roughly 30–40 developers generating PRs every day.
When MCRs were on top of reviews, features flowed.
When they weren’t, the system clogged.
Managers felt this was happening.
They just couldn’t prove it.
Telling people to “be more involved in reviews” wasn’t cutting it.
And removing mandatory reviews wasn’t an option either — tests help, but they don’t replace human judgment.
So the ask landed on my desk.
The Request
The request was deceptively simple:
“We need weekly reports that show how mandatory code reviewers interact with PRs.
Who’s fast. Who’s slow. Who’s missing things. And trends over time.”
Some constraints:
- Data should come from GitHub
- SLAs mattered:
An MCR must interact with a PR within 3 days of PR creation. If more than 7 days passed and MCRs still hadn't reviewed, we had a real problem.
- Reports must be:
- Scannable in seconds
- Obvious in conclusions
- Comparable week over week
- Historical data needed to exist somewhere
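Those SLA rules translate almost directly into code. A minimal sketch of the kind of classifier the generator could use — names and thresholds are illustrative, not the original script:

```python
from datetime import datetime
from typing import Optional

SLA_WARN_DAYS = 3    # an MCR should interact within 3 days of PR creation
SLA_BREACH_DAYS = 7  # past 7 days with no review = real problem

def sla_status(created_at: str, first_review_at: Optional[str], now: str) -> str:
    """Classify a PR as 'ok', 'warning', or 'breach' by review latency."""
    created = datetime.fromisoformat(created_at)
    end = datetime.fromisoformat(first_review_at or now)
    days_waiting = (end - created).days
    if first_review_at is not None and days_waiting <= SLA_WARN_DAYS:
        return "ok"
    if days_waiting > SLA_BREACH_DAYS:
        return "breach"
    # reviewed late, or still waiting but under the 7-day limit
    return "warning"
```

A PR reviewed on day one is "ok"; one sitting unreviewed on day eleven is a "breach".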
It might be worth mentioning that, before me, the task had been given to a different engineer who couldn't bring it to fruition. No pressure there.
My mental model
In my head, the system had three moving parts:
- Generate
- Pull data from GitHub
- Enrich it with ownership info
- Output:
- HTML (for email)
- JSON (for history)
- Send
- Grab the latest reports
- Embed them into emails
- Ship every Monday morning
- Store
- Persist historical data
- Enable filtering, trends, averages, graphs
Straightforward enough.
Tech Stack
Everything was done as Python scripts.
Everything ran in TeamCity.
BackStage API was used for retrieving repo ownership based on team names.
GitHub API was used for retrieving PR activity.
Did I have any experience with Python, TeamCity, BackStage, or querying the GitHub API up until that point in my career?
No.
Did that stop me from trying?
Also no.
Every Sunday, the generator script would run twice: once for the Backend report and once for the Client report.
Every Monday morning, the email sender script would pick up the artefacts from TeamCity and assemble one email that was sent to managers.
At this point, nothing was weird.
Then came storage.
The Database Issue
I could’ve used a database.
But let’s be honest about what that actually means in a corporate environment:
- Pick a DB
- Talk to the “Infra team”
- Provision infra
- Expose it internally
- Manage auth
- Handle secrets
- Get IT involved
- Explain why this needs to exist forever
Then, to allow managers to use it I would have to either:
- Tell managers to install a DB Client OR
- Develop a UI for it
All that… to append a few tens of rows once a week.
Meanwhile, managers already had a tool that:
- They used daily
- Had filters, pivots, charts
- Required zero onboarding
So I asked myself:
What if I could bypass all that DB effort? I could just… use Excel as a database!
How hard could it be?
Famous last words
Buckle up, ladies and gents, the ride is going to get bumpy.
Phase 1: The Naive Optimism Era™
My first thought was simple:
“I’ll just create an Excel file in my own Drive and use my own account keys to write to it.”
I mean… I own the file.
I can open it.
I can edit it.
Surely Python can do the same, right?
Wrong.
This is where I learned my first important lesson:
Human accounts are for humans. Automation is not human.
Turns out, letting a script pretend to be me is frowned upon. Apparently, companies don’t love the idea of immortal Python scripts roaming SharePoint with my full permissions, bypassing MFA, HR offboarding, and common sense.
Rude, but fair.
Phase 2: “Okay, Fine, I’ll Do It the Proper Way”
After a bit of Googling and a mild existential crisis, I discovered the phrase that would define the next few weeks of my life:
Service account
Great! I’ll just create one.
Haha. No.
Service accounts are not Pokémon. You can’t just catch one.
You need an IT Admin. Someone who understands Microsoft architecture.
So I did the only thing you can do in such situations:
🎫 I submitted a ticket.
Phase 3: “We don’t do Service Accounts”
A few days later, IT replied:
“We don’t create service accounts. We create App Registrations.”
Of course you do.
Suddenly, my “simple Excel automation” required an application identity in Azure AD.
Out came the sacred artefacts:
- AZURE_TENANT_ID
- AZURE_CLIENT_ID
- AZURE_CLIENT_SECRET
At this point, my script officially had:
- Its own identity
- Its own credentials
- More documentation than my original feature request
Progress! 🚀
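For the curious, the client-credentials dance those three artefacts enable looks roughly like this. This is a sketch using the `msal` library, not the original script:

```python
import os

def authority_url(tenant_id: str) -> str:
    """AAD authority endpoint for a tenant."""
    return f"https://login.microsoftonline.com/{tenant_id}"

def get_graph_token() -> str:
    """Acquire an app-only Microsoft Graph token via the client-credentials flow."""
    import msal  # third-party: pip install msal (imported lazily)

    app = msal.ConfidentialClientApplication(
        client_id=os.environ["AZURE_CLIENT_ID"],
        authority=authority_url(os.environ["AZURE_TENANT_ID"]),
        client_credential=os.environ["AZURE_CLIENT_SECRET"],
    )
    # ".default" requests whatever application permissions the admin consented to
    result = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
    if "access_token" not in result:
        raise RuntimeError(result.get("error_description", "token acquisition failed"))
    return result["access_token"]
```

No user, no MFA prompt: the app *is* the identity, which is exactly what IT wanted.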
Phase 4: “You also can’t use your Drive”
Next problem:
“You cannot put this Excel file in your personal OneDrive.”
Why?
Because personal drives are:
- Tied to humans (ugh, again)
- Disabled when you leave
- Not auditable in a sane way
- A compliance horror show
So no, my precious Excel file could not live with me.
It needed a neutral, corporate, well-governed home.
Which meant…
🥁 SharePoint.
Phase 5: The Teams Group Side Quest
To get a SharePoint space, I had to:
- Create a Teams group
- Add all relevant managers
- Add the service app identity
This triggered:
- Email notifications
- Teams pings
- Calendar invites (somehow?)
- A wave of “Why was I added to this?” messages
Nothing says “great developer experience” like explaining to senior management why a robot needs access to a spreadsheet.
But eventually:
✅ SharePoint site created
✅ Excel file uploaded
We were back on track.
Or so I thought.
Phase 6: Welcome to Microsoft Graph, Enjoy Your Stay
Now I just needed to… access the file.
Except Microsoft doesn’t believe in “paths”.
There is no:
/Documents/Reports/github.xlsx
No no.
There is only:
- Site ID
- Drive ID
- Item ID
So I spent a delightful afternoon listing sites, listing drives, listing items, and comparing IDs, just trying to figure out which one was my dear Excel file.
Eventually, I found the correct Site ID and Drive ID and did what any responsible engineer would do:
I hardcoded them.
Judge me if you want.
You would’ve done the same.
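That delightful afternoon, in code form. A sketch assuming an app-only token and the standard Graph endpoints (`/sites?search=`, `/sites/{id}/drives`, `/drives/{id}/root/children`); the function names are mine, not the original script's:

```python
GRAPH = "https://graph.microsoft.com/v1.0"

def graph_get(token: str, path: str) -> dict:
    """GET a Graph endpoint and return the parsed JSON body."""
    import requests  # third-party: pip install requests

    resp = requests.get(f"{GRAPH}{path}", headers={"Authorization": f"Bearer {token}"})
    resp.raise_for_status()
    return resp.json()

def find_excel_ids(token: str, site_search: str, file_name: str):
    """Walk sites -> drives -> root items until file_name turns up."""
    for site in graph_get(token, f"/sites?search={site_search}")["value"]:
        for drive in graph_get(token, f"/sites/{site['id']}/drives")["value"]:
            for item in graph_get(token, f"/drives/{drive['id']}/root/children")["value"]:
                if item.get("name") == file_name:
                    return site["id"], drive["id"], item["id"]
    raise FileNotFoundError(file_name)
```

Run it once, print the three IDs, and then — as confessed above — hardcode them.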
Phase 7: “403 Forbidden” — My Old Friend
Script ready.
Excel downloaded.
Data appended.
Upload attempt…
❌ Permission denied
Of course.
Cue a two-week email ping-pong with IT.
Complication:
- The IT admin was a contractor
- Worked two days a week
- Was understandably terrified of granting write permissions to anything with the word “App” in it
Never mind that:
- The app was sandboxed
- The SharePoint site was isolated
- The permissions were scoped
The process went on like this:
The IT Admin would turn a knob in his dashboards, then ask me:
Did it work?
I would run the script and report back:
Nope.
The cycle would repeat a few times, then at some point he’d say:
Let’s reconvene on Tuesday, I think I know what’s happening.
At this point I was genuinely considering redoing everything in PostgreSQL…
Phase 8: The Finale
Then one day — no announcement, no ceremony — it worked.
The script ran. The Excel updated. No errors.
I didn’t ask what changed.
I didn’t want to know.
I backed away slowly and added more logs.
The Aftermath
The reports ran every week for over a year, while I was there.
My manager used them to identify that one of the MCRs was indeed constantly slacking and that the others had to pick up his work. Eventually, that person was laid off (not solely because of the reports, but they definitely helped build the case).
The statistics also showcased the high performers' productivity and impact across the teams. One guy was involved in more than half of all Backend PRs delivered by 7 teams throughout a year. He wasn't just approving blindly: you could use the reports to check his comments, and boy, devs learned a lot from him.
The solution worked… kind of.
Until entropy kicked in:
- Excel got slower as it grew. We're not talking millions of entries: a mere 1,200 PR rows would freeze scrolling and take longer and longer to recompute stats.
- Fields got updated as specs evolved, but they couldn't be replaced without breaking references in other Stats tabs, so new fields had to be added, making the sheet even bigger.
- Formulas computing date-time diffs across various intervals became exceedingly complicated and hard to manage.
- New teams meant script updates and backfilling. Because Excel isn't "normalized", all the dedupe checks had to happen in code, and updating existing rows meant either doing it manually or running custom scripts.
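That dedupe-in-code burden looks roughly like this. An illustrative sketch, where `key_index` assumes the PR number sits in the first column — a real database would enforce this with a primary key:

```python
def dedupe_rows(existing_rows, new_rows, key_index=0):
    """Keep only new rows whose key (e.g. PR number) isn't already in the sheet.

    With Excel as the store, the script has to run this on every append.
    """
    seen = {row[key_index] for row in existing_rows}
    fresh = []
    for row in new_rows:
        if row[key_index] not in seen:
            seen.add(row[key_index])
            fresh.append(row)
    return fresh
```

Trivial in isolation; tedious once every backfill and every spec change has to route through it.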
So, did it actually work?
Life Can Be Funny Like That
A wise man once told me that if you don’t win, at least learn something.
This one taught me a few things I won’t forget.
- Excel is not the problem. Identity is.
- Human accounts are mortal. Automation must not be.
- Microsoft Graph is powerful — but it will not hold your hand, and it definitely won’t stop you from doing something questionable.
- And lastly, Excel is not a database.
It never was.
Leave it to the managers.
A few years later, an ex-colleague told me that the whole solution had been redone from scratch as a vibe-coded web app.
I had but one question for him:
Where do they store historical data?
He replied:
PostgreSQL, of course.
Life really does have a funny way of teaching you lessons.
Stay safe.
Andrei
P.S. If you felt this article “excelled” in providing you with insightful engineering juices, I wrote another one about how I vibe coded this site you’re on right now from scratch using Antigravity and Astro in a day. Check it out!