Mumbai, India · Search × Security

Abhirup Chandra

Backend engineer who makes large, messy corpora searchable and keeps them secure. From 7M+ national-archive documents to multilingual queries answered by an offline translation model.

7M+ documents searchable
~108M metadata records
30K+ users served
<2s query latency at scale

01 · About

I've spent the last five years building the unglamorous machinery behind the search box — index pipelines, relevance tuning, migrations that touch a hundred million rows, and the security hardening that keeps it all standing.

As backend lead on Abhilekh Patal, the National Archives of India's digital platform, I ran search and data infrastructure for 7M+ digitised documents. On top of it I built a Hindi→English cross-language search layer powered by a fully offline translation model — because government data can't leave the building.

Today I work on enterprise search at Upland (BA Insight), helping large US enterprises find what they own across SharePoint, iManage, Salesforce and 90+ other systems. Alongside, I'm formalising the security side of my work with an MTech in Cybersecurity and the CPTE certification. I care about relevance you can measure, pipelines that fail loudly, and security designed in — not bolted on.

02 · Experience

2026 — Present

Senior Technical Consultant · Upland Software

BA Insight — Enterprise Search

Delivering enterprise search for large US corporate clients: connector architecture across 95+ content systems, relevance engineering on Lucene-based stacks, and the client-facing consulting that turns "search is broken" into a measurable fix.

Solr, Elasticsearch, Lucene, BM25, Connectors

Through Apr 2026

Backend Lead · CloudMojo Tech

Abhilekh Patal — National Archives of India

Led backend engineering for India's national digital archive: .NET 8 APIs over a DSpace + Solr + PostgreSQL core, AWS infrastructure with hot/cold S3 tiering for 500K+ PDFs, and production firefighting from 92M-row migrations to bot-attack defense.

.NET 8, DSpace, Solr, PostgreSQL, AWS

Independent

Founder · NaviTro Consultancy & Services

Backend & search consulting

Independent consulting practice for backend engineering and search infrastructure work — the vehicle behind government archival projects and freelance engagements.

Java · Spring Boot, Python, Consulting

03 · Featured work

Case study · Government scale

Abhilekh Patal — National Archives of India

The search and order-management backbone for India's national digital archive: 7M+ documents, ~108M metadata records, 30K+ users. PostgreSQL as the source of truth, Solr as the denormalised read model, async workers for everything slow.

  • Cut a stalled 7-hour, 92M-row schema migration to 5 hours with temp-table joins and targeted indexes
  • Designed hot/cold S3 tiering for 500K+ PDFs to keep archive storage costs flat
  • Hardened the public portal against scanner and bot traffic with per-IP rate limiting and WAF rules
  • Kept faceted search under 2 seconds on a 7M-document corpus
.NET 8, Solr 8, PostgreSQL · RDS, S3, DSpace
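
The per-IP rate limiting mentioned above can be illustrated with a small sliding-window limiter. This is only a sketch in Python for brevity (the production API was .NET 8); the class name, the limit and the window length are illustrative assumptions, not the deployed configuration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client IP."""

    def __init__(self, limit=60, window=60.0, clock=time.monotonic):
        self.limit = limit            # assumed value, tune per endpoint
        self.window = window          # assumed value, in seconds
        self.clock = clock            # injectable so tests need not sleep
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip):
        now = self.clock()
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # a caller would typically answer HTTP 429 here
        hits.append(now)
        return True
```

In practice a limiter like this would sit behind the WAF, so that obvious scanner traffic is dropped before it reaches application code.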

Case study · Offline AI

Multilingual Search — Hindi→English with IndicTrans2

Cross-language search over the same 7.3M-record index, powered by AI4Bharat's IndicTrans2 — a 200M-parameter translation model running fully on-prem, because government data residency rules out every cloud API.

  • 200–500ms CPU inference on an ARM64 VM — free, offline, and state-of-the-art on Indic languages
  • Curated a 449-entry Hindi synonym dictionary — precision beats LLM-generated expansion
  • Defensive by default: query-length caps, 5s timeout with fallback, output validation, strict Nginx allowlist in front of Solr
  • Supports 10 Indic languages with client-side script detection
IndicTrans2, PyTorch, Flask, Solr 9, Nginx
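
The "defensive by default" bullet above (length cap, 5s timeout with fallback, output validation) can be sketched roughly as follows. The `translate` argument is a hypothetical stand-in for the IndicTrans2 call, the 200-character cap and the validation rules are assumptions, and script detection is shown here in Python and only for Devanagari, whereas the project does it client-side for 10 languages.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

MAX_QUERY_CHARS = 200   # assumed cap, not taken from the project
TIMEOUT_SECONDS = 5.0   # the 5s budget from the list above

_pool = ThreadPoolExecutor(max_workers=2)


def has_devanagari(text):
    """True if any character falls in the Devanagari block (U+0900 to U+097F)."""
    return any("\u0900" <= ch <= "\u097f" for ch in text)


def looks_valid(translation):
    """Hypothetical output check: a non-empty, bounded string with no Devanagari left."""
    if not isinstance(translation, str):
        return False
    t = translation.strip()
    return bool(t) and len(t) <= 4 * MAX_QUERY_CHARS and not has_devanagari(t)


def safe_translate(query, translate, timeout=TIMEOUT_SECONDS):
    """Return an English query, or fall back to the capped original query."""
    query = query.strip()[:MAX_QUERY_CHARS]
    if not has_devanagari(query):
        return query  # nothing to translate, the model is never called
    future = _pool.submit(translate, query)
    try:
        result = future.result(timeout=timeout)
    except FutureTimeout:
        future.cancel()  # a running call cannot be cancelled; we just stop waiting
        return query
    except Exception:
        return query     # any model error degrades to plain search
    return result.strip() if looks_valid(result) else query
```

The design choice is that every failure mode degrades to searching with the original query, so a slow or misbehaving model never blocks the search box.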

04 · Side projects

🔎 Full-stack search

InfoHunt

Wikipedia semantic search — Spring Boot 3 + Solr 9 + Redis, LLM summaries cached server-side, graph-driven related content.

Java 21, Solr, Redis, React

🎙️ Static site

PodSule

Podcast summary library for long YouTube episodes. Local AI pipeline, grounded summaries, 100% static — no backend, no runtime LLM.

Next.js, Python, Vercel

📚 Static site

Headway

One book-wisdom lesson a day: searchable, themed, per-lesson pages with OG + RSS from a <500-line dependency-free build script.

Vanilla JS, RSS, Vercel

✍️ macOS tool

GrammarGuard

System-wide menu-bar grammar checker — keystroke monitoring via the Accessibility API, offline LanguageTool, optional AI rewrites.

macOS, Python, LanguageTool

📦 Data pipeline

OpenSearch Ingest

Streaming bulk ingest of multi-GB Wikipedia dumps into a 2-node OpenSearch cluster — constant memory, tuned refresh and replicas.

OpenSearch, Python, lxml
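
A rough sketch of the constant-memory idea: parse the dump incrementally, release each page once it has been read, and emit fixed-size newline-delimited bodies in the `_bulk` format. It uses the standard library's `xml.etree.ElementTree.iterparse` in place of lxml so that it runs without dependencies; the function names, the `title`/`text` fields and the batch size are illustrative assumptions, and sending each body to the cluster is left out.

```python
import json
import xml.etree.ElementTree as ET


def local(tag):
    """Strip an XML namespace prefix such as '{ns}page' down to 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(xml_stream):
    """Yield {"title", "text"} dicts one page at a time from a MediaWiki-style dump."""
    context = ET.iterparse(xml_stream, events=("start", "end"))
    _event, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event != "end" or local(elem.tag) != "page":
            continue
        doc = {"title": "", "text": ""}
        for node in elem.iter():
            name = local(node.tag)
            if name in doc:
                doc[name] = node.text or ""
        yield doc
        root.clear()  # drop already-parsed pages so memory stays roughly flat


def bulk_bodies(docs, index, batch_size=500):
    """Group docs into NDJSON bodies: one action line, then one source line per doc."""
    lines, count = [], 0
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
        count += 1
        if count == batch_size:
            yield "\n".join(lines) + "\n"  # _bulk bodies must end with a newline
            lines, count = [], 0
    if lines:
        yield "\n".join(lines) + "\n"
```

Because both functions are generators, only one batch is held in memory at a time, regardless of how large the dump is.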

🧵 Systems

DupFinder

Parallel duplicate-file finder in a single dependency-free Java file — a working tour of 10 concurrency primitives done right.

Java, Concurrency, Zero deps

05 · Skills

Search & Retrieval

Apache Solr / SolrCloud, Lucene, Elasticsearch, OpenSearch, BM25 & relevance tuning, Faceting at scale

Backend

Java · Spring Boot, C# · .NET 8, Python · Flask, REST API design, Async workers

Data & Infrastructure

PostgreSQL, Redis, AWS · EC2/RDS/S3, Docker, Nginx, Vercel

Security

MTech Cybersecurity (ongoing), CPTE (in progress), OWASP Top 10, WAF & rate limiting, Hardening & allowlists

AI & LLM Engineering

Offline model deployment, HF Transformers, Timeout · fallback · validation, Agentic workflows

06 · Contact

Building something that needs real search?

I'm always happy to talk search infrastructure, backend architecture, or an interesting problem.

abhirupchandra01work@gmail.com