← Back to Experiments
Tool Comparison

When AI Data Extraction Becomes More Work Than Doing It Manually

The promise: AI will automate tedious tasks like extracting data from PDFs into spreadsheets.

When AI Data Extraction Becomes More Work Than Doing It Manually

The promise: AI will automate tedious tasks like extracting data from PDFs into spreadsheets.

The reality: I spent hours trying to get Microsoft Copilot to extract data from a 60-page PDF document, only to have it tell me to wait for excel files that never came and then crash on page six.

The Experiment

I had what seemed like a straightforward task: extract names and other details from a 60-page PDF document into an Excel table.

But this wasn't a clean, perfectly formatted document. The rows varied in height and many fields were blank. The formatting was inconsistent throughout. This is the kind of messy real-world data that makes manual extraction tedious and time-consuming, exactly the scenario where AI tools promise to shine.

I uploaded the PDF to Copilot and asked it to extract the data into a table format.

My first attempt was a failure. Copilot couldn't even process the first row correctly, forget about the entire document. Fair enough. Sixty pages of inconsistent data is a lot to handle.

I adjusted my approach and asked Copilot to extract just the first page into a table, which worked. The table appeared with the data properly organized, and I exported it to Excel successfully.

Encouraged, I tried to process the remaining 59 pages which failed miserably. I scaled back further, pages 2 through 5 only and failed again.

At this point, I was hours into what I'd hoped would be an automation task. I pivoted to an even more tedious approach: processing each page individually.

Pages 2, 3, 4, and 5 worked when handled one at a time. Progress was painfully slow, but at least it was progress. Then I reached page six, where copilot failed and crashed entirely.

I then tried the same task in Gemini 2.5 Pro which told me the PDF was too complex to process into an excel file, still not what I needed, but at least it was honest.

What This Experiment Reveals

The task difficulty matters, but so does the tool's ability to communicate limitations. Yes, extracting data from inconsistent PDFs is genuinely difficult. But Copilot never indicated what it could or couldn't handle. It simply failed silently, leaving me to guess at solutions through trial and error.

Inconsistent performance creates workflow chaos. Copilot handled page one perfectly, then couldn't process multiple pages together, then worked on individual pages, then crashed on page six. Without any pattern or explanation, I couldn't develop an efficient workflow.

AI "automation" required constant supervision. Instead of setting Copilot loose on the document, I had to manually test different approaches, adjust requests, export results after each success, and restart after each failure. This wasn't automation, it was "assisted manual labor" that may have taken longer than doing it manually from the start.

Complex real-world scenarios expose AI limitations. The messy formatting in this PDF represents typical business documents. If AI tools can only handle pristine data, their practical value for everyday tasks is questionable.

Key Takeaways

Extracting data from inconsistent PDFs is legitimately difficult, but AI tools should communicate their limitations rather than failing silently "AI-powered" features often require significantly more human intervention than marketing suggests Unpredictable performance makes it impossible to estimate time or plan workflows around AI tools For messy, critical work, traditional methods or specialized extraction tools may still be more reliable The gap between AI capability and AI usability remains significant for common business tasks

The Bottom Line:

Microsoft Copilot turned a difficult but potentially automatable task into an hours-long troubleshooting session that ended in a crash. While the complexity of the source document was real, the tool's inability to handle it consistently, or even explain its limitations, made the experience more frustrating than just doing the work manually.

AI tools are getting better at handling complex tasks, but they're not yet reliable enough to trust with messy real-world data.

Tools Used

Tool: Microsoft Copilot - Cost: Included with Microsoft 365 subscription (pricing varies by plan)