$ whoami
ZainRazaJafri

Cloud & DevOps Engineer — backed by 3+ years of Full Stack Engineering.

Zain Raza Jafri
Available for work
zain@aws-prod — infrastructure
AWS · Terraform · Docker · GitHub Actions · Linux · Nginx · Node.js · NestJS · ECS · ECR · IAM · VPC · CloudWatch · KMS · CI/CD · Kubernetes · Python · TypeScript · MongoDB · PostgreSQL · Redis
Lahore, Pakistan
$ cat about.md

Cloud Engineer who ships infrastructure, not just code.

I'm a Cloud & DevOps Engineer focused on AWS infrastructure — the kind that gets provisioned from scratch, monitored under real production load, and maintained when things go wrong at 2am. What sets me apart is 3+ years of backend engineering that came before the cloud work: building REST APIs, real-time Socket.io services, and microservices-based HRM platforms in Node.js. I don't just provision the infrastructure — I understand what's actually running on it.

That combination — infrastructure depth plus application context — is rare, and it shows up in how I work: VPC designs that fit the workload, container configurations that don't create security debt, CI/CD pipelines that the dev team actually trusts. I've provisioned 5 AWS environments from scratch, kept 8+ production applications running, and handled real incidents — including tracing and eradicating persistent crypto-mining malware from a live production system over two days of forensic work.

3+
Years of Production Experience
8+
Production Applications Maintained
5
AWS Environments Provisioned from Scratch
2
Days to Eradicate Live Production Malware
Critical Incident · Production

Production Incident

DeveloperTag · December 2025

incident-report.txt — forensics · SEVERITY: CRITICAL

This incident shaped how I think about container security. The build cache is not just a performance tool — it's a persistent surface that outlives containers and images, and demands the same scrutiny as running workloads.

$ git log --oneline

Experience

Cloud & DevOps Engineer (AWS)

DeveloperTag · Lahore, Pakistan
Current · December 2025 – Present
  • Designed and maintained CI/CD pipelines using GitHub Actions integrated with AWS, fully automating build, test, and production deployments — eliminating manual SSH/SCP workflows and saving ~2–3 hours per release cycle.
  • Deployed and managed containerised workloads on Amazon EC2 and ECS, standardising Node.js application deployments using Docker multi-stage builds with non-root execution for improved security posture.
  • Led an active security incident response — traced persistent crypto-mining malware that survived container restarts, image deletions, and npm overrides; identified root cause as malicious layers in Docker build cache and fully eradicated the threat.
  • Diagnosed and resolved critical CPU spikes (110% utilisation) and container OOM crashes on production; implemented rollback procedures and zero-downtime deployment strategies.
  • Right-sized AWS infrastructure using Cost Explorer & Compute Optimizer, downgrading an over-provisioned EC2 instance post-incident and reducing ongoing cloud spend.
  • Managed infrastructure using Terraform, defining EC2, VPC, and IAM resources as code for repeatable and auditable provisioning.
  • Set up AWS CodePipeline alongside GitHub Actions for select workflows, integrating CodeBuild and CodeDeploy for fully managed build and deployment automation.
AWS · Docker · GitHub Actions · Terraform · ECS · EC2 · CodePipeline · CI/CD

Cloud & DevOps Engineer

Max ERP · Lahore, Pakistan
January 2025 – December 2025
  • Containerised Node.js applications using Docker and contributed to CI/CD workflows, improving deployment consistency and reducing manual release effort across multiple environments.
  • Implemented secure AWS configurations including IAM roles, VPC isolation, and security group rules; integrated CloudWatch + SNS for real-time monitoring and alerting.
  • Set up and managed Nginx reverse proxy configurations on EC2 instances to route traffic across multiple services, improving reliability and enabling zero-downtime deployments.
  • Supported application deployments using Docker Compose multi-service stacks on AWS EC2, coordinating backend, frontend, and database containers.
  • Used Terraform to manage infrastructure as code for AWS environments, keeping infrastructure changes version-controlled and consistent.
Docker · AWS EC2 · Nginx · Terraform · CloudWatch · IAM

Software Engineer

Max ERP · Lahore, Pakistan
September 2023 – December 2025
  • Developed and integrated RESTful APIs and a Socket.io real-time server for a multi-branch HRM platform used by thousands of users across enterprise clients.
  • Built core HRM microservices modules: Invoice Management, Multi-Branch Access Control, Help Desk, and Bulk Import.
  • Designed and implemented RBAC and multi-branch permission logic for complex organisational hierarchies.
  • Collaborated with a cross-functional team using Git workflows and pull-request code reviews to maintain code quality.
Node.js · Socket.io · REST APIs · RBAC · Microservices · PostgreSQL
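
The multi-branch RBAC logic mentioned in this role can be illustrated with a minimal sketch. The role names, permission strings, branch identifiers, and the `Grant` structure are all hypothetical and not taken from the production codebase; the sketch only assumes that a user holds a role per branch and that a check must match both the branch and the role's permissions.

```python
from dataclasses import dataclass

# Hypothetical role -> permission mapping, for illustration only.
ROLE_PERMISSIONS = {
    "branch_admin": {"invoice:read", "invoice:write", "helpdesk:read"},
    "accountant": {"invoice:read", "invoice:write"},
    "viewer": {"invoice:read"},
}


@dataclass(frozen=True)
class Grant:
    """A role assigned to a user within one branch."""
    user_id: str
    branch_id: str
    role: str


def is_allowed(grants, user_id, branch_id, permission):
    """Return True if any grant gives the user this permission in this branch."""
    for grant in grants:
        if grant.user_id != user_id or grant.branch_id != branch_id:
            continue
        if permission in ROLE_PERMISSIONS.get(grant.role, set()):
            return True
    return False


# Hypothetical data: the same user holds different roles in two branches.
grants = [
    Grant("u1", "lahore", "accountant"),
    Grant("u1", "karachi", "viewer"),
]

print(is_allowed(grants, "u1", "lahore", "invoice:write"))
print(is_allowed(grants, "u1", "karachi", "invoice:write"))
```

Keeping the branch on the grant rather than on the role means one role definition can be reused across every branch, which is one plausible way to keep complex hierarchies manageable.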

Full Stack & DevOps Engineer

BytePark Solutions · Lahore, Pakistan
January 2023 – September 2023
  • Provisioned 5 AWS environments from scratch using Terraform (IaC) across EC2, EBS, S3, ECS, ECR, and Lambda — ensuring consistent, repeatable, and secure infrastructure.
  • Containerised and deployed applications on Amazon ECS and ECR; managed IAM users, roles, and KMS encryption policies enforcing least-privilege access control.
  • Designed VPC architectures including subnets, Internet Gateways, route tables, security groups, and NACLs for secure, isolated multi-tier application environments.
  • Configured CloudWatch dashboards and SNS alerts for proactive monitoring of application health and performance.
AWS · Terraform · ECS · ECR · VPC · KMS · IAM · CloudWatch
$ ls -la ./projects/

Projects

Public projects and Cloud & DevOps architecture labs — ordered by complexity.

ChatHub

Live · Public

Production-grade real-time chat application built with the MERN stack and Socket.io, deployed end-to-end on AWS with a CI/CD pipeline, infrastructure as code, and observability baked in from day one.

  • Real-time messaging with Socket.io — typing indicators, online presence, read receipts
  • JWT authentication with refresh token rotation
  • Group chats, file uploads, message reactions
  • Nginx reverse proxy on AWS EC2 handling all traffic on port 80
  • GitHub Actions pipeline deploying to ECS on every push
  • Terraform managing the entire AWS infrastructure as code
  • Prometheus metrics for backend observability
MERN · Socket.io · AWS ECS · GitHub Actions · Terraform · Nginx · Prometheus · JWT
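
Refresh token rotation, listed among the features above, can be sketched roughly as follows. This is not ChatHub's implementation: it uses opaque random tokens in an in-memory store instead of signed JWTs, and the reuse-detection step (revoking every token for a user when an already-rotated token is presented) is an assumption about how rotation is commonly hardened. The class and method names are hypothetical.

```python
import secrets


class RefreshTokenStore:
    """In-memory store; a real service would persist this (assumption)."""

    def __init__(self):
        self._active = {}  # token -> user_id
        self._used = {}    # rotated-out token -> user_id

    def issue(self, user_id):
        """Create and remember a new refresh token for the user."""
        token = secrets.token_urlsafe(32)
        self._active[token] = user_id
        return token

    def rotate(self, token):
        """Exchange a valid refresh token for a new one, invalidating the old."""
        if token in self._used:
            # A rotated token showing up again suggests it was stolen:
            # revoke everything the user currently holds.
            self.revoke_all(self._used[token])
            raise PermissionError("refresh token reuse detected")
        user_id = self._active.pop(token, None)
        if user_id is None:
            raise PermissionError("unknown refresh token")
        self._used[token] = user_id
        return self.issue(user_id)

    def revoke_all(self, user_id):
        """Drop every active token belonging to the user."""
        self._active = {t: u for t, u in self._active.items() if u != user_id}
```

Each refresh call returns a fresh token and retires the old one, so a leaked refresh token is only useful until its legitimate owner refreshes next.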

Serverless REST API — API Gateway, Lambda, DynamoDB

Public

Serverless backend with persistent NoSQL storage

A fully serverless REST API built on AWS — API Gateway routing to Lambda with DynamoDB as the persistent store. IAM scoped to the minimum required action on a single table ARN.

  • Least-privilege IAM — inline policy scoped to a single table ARN and single action
  • Lambda proxy integration with API Gateway for full request/response passthrough
  • CORS handled at both API Gateway and Lambda level
  • DynamoDB as the NoSQL persistent layer — no server to manage
API Gateway · Lambda · DynamoDB · IAM · Serverless
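
Two of the points above, the single-action policy and the Lambda proxy response with CORS headers, can be sketched in a few lines. The table ARN, the allowed origin, and the chosen action (`dynamodb:PutItem`) are placeholders, since the project description does not state them, and the handler omits the DynamoDB call so that the sketch runs without AWS credentials.

```python
import json

# Hypothetical ARN and origin, for illustration only.
TABLE_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/Items"
ALLOWED_ORIGIN = "https://example.com"


def least_privilege_policy(table_arn, action):
    """Build an IAM policy document allowing one action on one table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": action, "Resource": table_arn}
        ],
    }


def handler(event, context):
    """Lambda proxy integration: return statusCode, headers, and a string body."""
    payload = {
        "method": event.get("httpMethod", "GET"),
        "path": event.get("path", "/"),
    }
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/json",
            # CORS header set by the function itself, in addition to API Gateway.
            "Access-Control-Allow-Origin": ALLOWED_ORIGIN,
        },
        "body": json.dumps(payload),
    }


print(json.dumps(least_privilege_policy(TABLE_ARN, "dynamodb:PutItem"), indent=2))
```

With proxy integration API Gateway passes the function's response through as-is, which is why the CORS header has to appear in the function's own response and not only in the gateway configuration.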

Custom VPC — Public/Private Subnets, NAT Gateway & Bastion Host

Public

Network isolation + secure admin access

A production-grade VPC from scratch — public and private subnets across AZs, NAT Gateway for outbound-only private traffic, and a Bastion Host as the sole SSH entry point.

  • Security group referencing (SG-to-SG, not CIDR) — no IP whitelisting drift
  • SSH agent forwarding through Bastion — private keys stay on the admin's machine and are never copied to the Bastion or private instances
  • NAT Gateway verified: outbound internet from private subnet confirmed via curl
  • Route tables configured per subnet — public IGW, private NAT
VPC · EC2 · NAT Gateway · IGW · Route Tables · Security Groups
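
The subnet layout described above can be planned with Python's standard `ipaddress` module. The VPC range, the /24 subnet size, and the AZ names below are assumptions, as the project description does not give the actual values; the sketch only shows one public and one private subnet per AZ, each with the default route the bullets describe.

```python
import ipaddress

# Hypothetical VPC range and AZ names; the project's actual values are not given.
VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")
AZS = ["us-east-1a", "us-east-1b"]


def plan_subnets(vpc, azs):
    """Assign one public and one private /24 per AZ from the VPC range."""
    blocks = vpc.subnets(new_prefix=24)  # generator of consecutive /24 networks
    plan = []
    for az in azs:
        plan.append({"az": az, "tier": "public", "cidr": str(next(blocks)),
                     "default_route": "internet-gateway"})
        plan.append({"az": az, "tier": "private", "cidr": str(next(blocks)),
                     "default_route": "nat-gateway"})
    return plan


for subnet in plan_subnets(VPC_CIDR, AZS):
    print(subnet)
```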

Terraform EC2 + Nginx with cloud-init

Public

Infrastructure as Code — zero manual steps after terraform apply

Full IaC — EC2 provisioned, Nginx installed and running, SSH locked to your IP, all from a single terraform apply. No manual steps after the command completes.

  • SSH access restricted to var.my_ip — no default value, plan fails if unset
  • cloud-init handles Nginx installation and startup at first boot
  • Security group allows only port 80 (public) and port 22 (your IP only)
  • Full teardown with terraform destroy — no orphaned resources
Terraform · EC2 · Security Groups · cloud-init · Nginx · IaC

Application Load Balancer + EC2 Target Groups

Public

High availability + multi-AZ traffic distribution

ALB distributing traffic across EC2 instances in multiple AZs, with security group chaining ensuring EC2 instances are never directly reachable from the internet.

  • EC2 security group only accepts traffic from the ALB security group — never from 0.0.0.0/0
  • Target group health checks gate traffic — unhealthy instances pulled automatically
  • Nginx installed on EC2 via user-data at launch — no manual SSH required
  • Multi-AZ target group for resilience across availability zones
ALB · EC2 · Target Groups · Health Checks · Security Groups · Multi-AZ
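
How health checks gate traffic can be modelled with a small state machine. This is a toy model of consecutive-threshold checking, not the ALB API: a target changes state only after a run of consecutive results that contradict its current state. The class name and the threshold values are illustrative assumptions.

```python
class TargetHealth:
    """Toy model of consecutive-threshold health checking (not the ALB API)."""

    def __init__(self, healthy_threshold=3, unhealthy_threshold=2):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = False  # new targets get no traffic until proven healthy
        self._streak = 0      # consecutive results contradicting current state

    def record(self, check_passed):
        """Feed one health check result; return the resulting state."""
        if check_passed == self.healthy:
            self._streak = 0  # result agrees with current state, reset streak
            return self.healthy
        self._streak += 1
        needed = self.healthy_threshold if check_passed else self.unhealthy_threshold
        if self._streak >= needed:
            self.healthy = check_passed
            self._streak = 0
        return self.healthy


target = TargetHealth()
for result in [True, True, True, False, False]:
    print(result, "->", target.record(result))
```

Requiring a streak in both directions keeps a single slow response from pulling an instance out of rotation, and a single lucky response from putting a broken one back in.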

E-Commerce Platform

Public

Full-featured e-commerce backend with complete product lifecycle management, deployed on AWS ECS via AWS CodePipeline for automated deployments.

  • Product catalog, cart management, order processing
  • JWT-based authentication with refresh token rotation
  • MongoDB backend deployed on ECS via AWS CodePipeline
  • RESTful API architecture with full CRUD coverage
Node.js · Express.js · MongoDB · JWT · AWS ECS · AWS CodePipeline

Docker Flask + Redis Counter

Public

Multi-container application with persistent state

Multi-container Docker Compose setup with a Flask app and Redis backend. Redis uses AOF persistence so the counter survives container restarts.

  • Health checks using condition: service_healthy — Flask waits for Redis to be ready
  • Redis AOF persistence — counter state survives docker-compose down and restart
  • Named volumes for data persistence across container lifecycle
  • python:3.12-slim base for minimal image footprint
Docker · Docker Compose · Flask · Redis · Health Checks
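
The idea behind append-only-file persistence can be shown with a toy counter that logs every increment and replays the log on startup. This is a conceptual sketch only: it does not use Redis or its actual AOF format, and the temporary directory stands in for the named volume. The class name and log format are hypothetical.

```python
import os
import tempfile


class AppendOnlyCounter:
    """Toy counter that logs each increment to a file and replays it on start."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.value = 0
        if os.path.exists(log_path):
            with open(log_path) as log:
                # Replay: every logged line is one INCR operation.
                self.value = sum(1 for line in log if line.strip() == "INCR")

    def incr(self):
        """Append the operation to the log, then apply it in memory."""
        with open(self.log_path, "a") as log:
            log.write("INCR\n")
            log.flush()
            os.fsync(log.fileno())  # force the write to disk before returning
        self.value += 1
        return self.value


log_path = os.path.join(tempfile.mkdtemp(), "counter.aof")
first = AppendOnlyCounter(log_path)
first.incr()
first.incr()
restarted = AppendOnlyCounter(log_path)  # simulates a container restart
print(restarted.value)
```

Because state is rebuilt from the log rather than from process memory, the counter survives as long as the file does, which is the role the named volume plays in the Compose setup.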

NGINX on EC2 with Custom Domain (Route 53)

Public

DNS management + web serving

EC2 instance running Nginx, served at a custom domain via Route 53. Elastic IP ensures the DNS record stays valid across EC2 stop/start cycles.

  • Elastic IP attached — DNS record never breaks on EC2 stop/start
  • Route 53 A record pointing apex domain to the Elastic IP
  • Nginx configured as the web server — ready for reverse proxy extension
EC2 · Nginx · Route 53 · Elastic IP · DNS

Internal Projects

Enterprise production systems — under NDA. Architecture details available on request.

Max HRM

Internal · Production

Monolithic multi-tenant HRM system serving thousands of enterprise users across multiple branches. Built Invoice Management, Multi-Branch Access Control, Help Desk, and Bulk Import modules with RBAC.

Node.js · Socket.io · RBAC · Multi-tenant · REST APIs

Max Invoice

Internal · Production

Microservice multi-tenant invoicing system handling financial workflows across enterprise client branches.

Node.js · Microservices · Multi-tenant

Max Inventory

Internal · Production

Microservice multi-tenant inventory management system for real-time stock tracking and control.

Node.js · Microservices · Multi-tenant

Max Payroll

Internal · Production

Microservice multi-tenant payroll processing system for complex organisational pay structures.

Node.js · Microservices · Multi-tenant
$ cat skills.json | jq .

Skills & Stack

Cloud & Infrastructure
AWS (10+ services) · IaC
EC2 · ECS · ECR · EBS · S3 · Lambda · IAM · VPC · CloudWatch · Compute Optimizer · Terraform · CloudFormation
Containers & CI/CD
Full deploy automation
Docker · Docker Compose · Multi-stage Builds · GitHub Actions · AWS CodePipeline · CodeBuild · CodeDeploy
Security & Networking
Production hardened
IAM Policies · KMS Encryption · VPC Isolation · Security Group Chaining · Nginx · Docker Hardening · Malware Incident Response
Observability
Full-stack visibility
CloudWatch · SNS Alerts · Prometheus · Cost Explorer · Compute Optimizer
Backend
3+ yrs production
Node.js · NestJS · Express.js · Django
Languages
JavaScript · TypeScript · Python · Bash
Databases
PostgreSQL · MySQL · MongoDB · Redis
APIs & Architecture
Production at scale
RESTful APIs · Socket.io · WebSockets · Microservices · Multi-tenant Systems · OpenAI API · Stripe API · RBAC / ABAC
$ grep -r "impact" ./career/

Key Achievements

01
2 Days

Eradicated Persistent Crypto-Mining Malware

Traced malware surviving container restarts, image deletions, and npm overrides. Root-caused to malicious layers embedded in Docker build cache. Fully eradicated from live production without downtime at DeveloperTag.

02
~2–3 hrs saved / release

Automated End-to-End Deployments

Replaced manual SSH/SCP → docker-compose workflows with GitHub Actions CI/CD pipelines, eliminating human error from production deployments across all release cycles.

03
Direct cost reduction

AWS Cost Reduction via Right-Sizing

Used Cost Explorer & Compute Optimizer to identify over-provisioned infrastructure post-incident. Downgraded EC2 instance from medium to small after confirming CPU had stabilised under 1%.

04
8+ apps · 5 environments

Deployed & Maintained Production Infrastructure

Provisioned 5 AWS environments from scratch and maintained 8+ production systems — handling everything from VPC design to containerisation, monitoring, and rollback procedures.

$ ping zainjafri4

Get in Touch

I build and operate cloud infrastructure for production systems. If you have an infrastructure challenge or want to talk DevOps, reach out.

Location

Lahore, Pakistan

Response time: I typically reply within 24 hours on business days.