Extracting Data Points as JSON from Handwritten Forms Using AWS Textract, Lambda and S3

This is how it works!

This post covers the following AWS services:

  • Lambda Service
  • Textract Service
  • Simple Storage Service i.e. S3
  • Identity and Access Management Service i.e. IAM

1. Uploading Image/doc to S3 bucket

A process overview

Create S3 Bucket

Upload image to S3 bucket
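
The two steps above can also be done from code instead of the console. The following is a minimal sketch using boto3, assuming AWS credentials are already configured on the machine. The helper names (`make_s3_client`, `create_bucket`, `upload_image`) are hypothetical and not part of the original post.

```python
from pathlib import Path


def make_s3_client():
    """Create an S3 client using the credentials configured locally."""
    import boto3  # imported lazily so the helpers below accept any client

    return boto3.client("s3")


def create_bucket(s3_client, bucket_name, region="us-east-1"):
    """Create the bucket; regions other than us-east-1 need a location constraint."""
    if region == "us-east-1":
        s3_client.create_bucket(Bucket=bucket_name)
    else:
        s3_client.create_bucket(
            Bucket=bucket_name,
            CreateBucketConfiguration={"LocationConstraint": region},
        )


def upload_image(s3_client, bucket_name, image_path):
    """Upload a local image and return the object key it was stored under."""
    key = Path(image_path).name  # assumption: use the file name as the key
    # upload_file(Filename, Bucket, Key) is boto3's managed upload call
    s3_client.upload_file(str(image_path), bucket_name, key)
    return key
```

A typical call would be `upload_image(make_s3_client(), "<your-bucket>", "form.jpg")`, where the bucket name is a placeholder; S3 bucket names must be globally unique.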

2. Extracting data from an S3 image

Add an S3 trigger to the Lambda function with the following settings:

  • Add the bucket name
  • Select "All object create events" as the Event type
  • Set the Suffix to .jpg
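
With this trigger in place, every matching upload invokes the Lambda function with an S3 event notification. As a rough sketch of what the function has to pull out of that event, the helper below (a hypothetical name, `jpg_objects_from_event`) reads the bucket and object key from each record. It assumes the standard S3 notification layout with a `Records` list.

```python
from urllib.parse import unquote_plus


def jpg_objects_from_event(event, suffix=".jpg"):
    """Return (bucket, key) pairs for the matching objects in an S3 event."""
    found = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in the notification (a space becomes '+')
        key = unquote_plus(record["s3"]["object"]["key"])
        # Mirror the trigger's suffix filter as a defensive check
        if key.endswith(suffix):
            found.append((bucket, key))
    return found
```

Decoding the key matters because a file uploaded as `my form.jpg` is reported as `my+form.jpg`, and passing the encoded name to other services would fail to find the object.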
  1. On the local machine, create a directory for the boto3 layer, install boto3 into it and zip it
mkdir -p zip_boto3/python
pip install boto3 -t zip_boto3/python
cd zip_boto3
zip -r boto3-layer.zip python

3. Add code to Lambda Function

The Lambda function code extracts the data from the uploaded image and saves it as a JSON file.
trp.py helps in parsing the response that we get from AWS Textract.
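
A minimal sketch of such a function is shown here. It is not the original code from this post: instead of trp.py it walks the `Blocks` of the Textract response directly, as a stand-in for what the parser does for form fields. The helper names and the choice to store the JSON next to the image with a `.json` extension are assumptions made for illustration.

```python
import json
from urllib.parse import unquote_plus


def extract_key_values(response):
    """Map form field names to values from a Textract FORMS response."""
    blocks = {b["Id"]: b for b in response["Blocks"]}

    def related_ids(block, rel_type):
        return [i for rel in block.get("Relationships", [])
                if rel["Type"] == rel_type for i in rel["Ids"]]

    def text_of(block):
        # KEY and VALUE blocks point at their WORD children via CHILD links
        children = (blocks[i] for i in related_ids(block, "CHILD"))
        return " ".join(c["Text"] for c in children if c["BlockType"] == "WORD")

    fields = {}
    for block in blocks.values():
        is_key = "KEY" in block.get("EntityTypes", [])
        if block["BlockType"] == "KEY_VALUE_SET" and is_key:
            values = (blocks[i] for i in related_ids(block, "VALUE"))
            fields[text_of(block)] = " ".join(text_of(v) for v in values)
    return fields


def process_image(textract_client, s3_client, bucket, key):
    """Run Textract on one S3 image and store the fields as JSON beside it."""
    response = textract_client.analyze_document(
        Document={"S3Object": {"Bucket": bucket, "Name": key}},
        FeatureTypes=["FORMS"],
    )
    fields = extract_key_values(response)
    json_key = key.rsplit(".", 1)[0] + ".json"  # hypothetical naming scheme
    s3_client.put_object(
        Bucket=bucket,
        Key=json_key,
        Body=json.dumps(fields, indent=2).encode("utf-8"),
        ContentType="application/json",
    )
    return json_key


def lambda_handler(event, context):
    import boto3  # available in the Lambda runtime or through a boto3 layer

    record = event["Records"][0]["s3"]
    json_key = process_image(
        boto3.client("textract"),
        boto3.client("s3"),
        record["bucket"]["name"],
        unquote_plus(record["object"]["key"]),  # event keys are URL-encoded
    )
    return {"statusCode": 200, "body": json_key}
```

Because the output key ends in `.json`, writing it to the same bucket does not match the `.jpg` suffix filter, so the function does not trigger itself again. The function's IAM role needs permission to call Textract and to read and write objects in the bucket.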

4. Result

Generated JSON file in S3 Bucket
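
To check the result without opening the console, the generated file can be read back with boto3. This is a small sketch; `load_result` is a hypothetical helper, and the bucket and key passed to it are whatever your function wrote.

```python
import json


def load_result(s3_client, bucket, json_key):
    """Fetch a generated JSON object from S3 and decode it into a dict."""
    obj = s3_client.get_object(Bucket=bucket, Key=json_key)
    # get_object returns the payload as a streaming body under "Body"
    return json.loads(obj["Body"].read())
```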

Software Engineer and a Data Science Enthusiast :)

Noufal Rijal
