The Election Commission of India conducted a Special Intensive Revision (SIR) of electoral rolls in West Bengal in 2025–26, an exercise it described as unprecedented. But when the final rolls were published, what emerged was not simply a dataset, but a barrier.

The rolls were uploaded not as searchable, machine-readable files, but as scanned PDF images, which are effectively photographs of printed pages. They cannot be searched. They cannot be meaningfully analysed. Every page resists scrutiny by design, which, in a moment of intense political contestation, raises a central question: who benefits when public data is made practically unusable?

The timing matters as well. The SIR in West Bengal has been fiercely contested. Opposition parties, led on the ground by the Bengal chief minister Mamata Banerjee, have alleged large-scale discrepancies, claiming that voters from communities less likely to support the Bharatiya Janata Party were disproportionately targeted for deletion or placed under doubt.

The controversy has extended to the institution itself. An impeachment notice against chief election commissioner Gyanesh Kumar was introduced on March 13, 2026, alleging partisan conduct. His appointment had already drawn criticism after a new law replaced the Chief Justice of India with a cabinet minister in the selection committee, effectively giving the ruling party a two-to-one majority.

Against this backdrop, the format of the data becomes inseparable from the politics surrounding it.

‘Logical Discrepancy’: A Concept with No Precedent in Indian Electoral History

To understand why this SIR is fundamentally different, one must begin with a category that did not exist in Indian electoral practice before 2026: “Logical Discrepancy.”

The SIR required every voter to establish linkage to the 2002 electoral roll, either directly or through a relative. Voters were sorted into three groups: Mapped, Unmapped, and a newly introduced category, Logical Discrepancy.

On paper, this category captured mismatches in parental names, implausible age gaps, or inconsistencies in records. In practice, the triggers were often banal. Variations in transliteration, such as “Mohammed” versus “Muhammad” or “Mondal” versus “Mandal”, routinely confused the ERONET software. Poor-quality scans of older rolls further undermined digital matching.

A voter could be flagged not because they were ineligible, but because software failed to recognise what any human reader would.
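To make the transliteration problem concrete, here is a minimal Python sketch of why an exact string comparison fails on such names while a simple spelling-tolerant key does not. The `skeleton` function is a hypothetical illustration, not a description of how ERONET works; its rules (drop vowels after the first letter, collapse doubled letters) are an assumption chosen for brevity and would over-match in real use.

```python
import unicodedata

VOWELS = set("aeiou")


def skeleton(name: str) -> str:
    """Build a crude spelling-tolerant key (hypothetical rule set).

    Keeps the first letter, drops later vowels, and collapses
    consecutive repeated letters.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    letters = [ch for ch in decomposed.lower() if ch.isascii() and ch.isalpha()]
    out = []
    for i, ch in enumerate(letters):
        if i > 0 and ch in VOWELS:
            continue  # vowels are where transliterations differ most
        if out and out[-1] == ch:
            continue  # treat "mm" and "m" as the same sound
        out.append(ch)
    return "".join(out)


for old, new in [("Mohammed", "Muhammad"), ("Mondal", "Mandal")]:
    exact_match = old == new
    tolerant_match = skeleton(old) == skeleton(new)
    print(f"{old} / {new}: exact={exact_match}, tolerant={tolerant_match}")
```

Both pairs fail the exact comparison and pass the tolerant one. A production system would need a far more careful method, but the sketch shows that the mismatch is a software design question, not a question about the voter.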

Midway through the process, the Commission introduced another layer: an algorithm that re-flagged voters who had already been successfully mapped. The goalposts shifted in real time, through software updates that were never publicly explained to voters or even consistently communicated to officials.

The scale was staggering. Around 1.36 crore voters were placed in the Logical Discrepancy category. They were not deleted, but nor were they allowed to vote. Instead, they were marked “Under Adjudication,” a status that suspends the right to vote until cleared in a supplementary list. No deadline was set for this process. Elections were announced anyway.

The Supreme Court of India intervened repeatedly. On February 20, it invoked Article 142 to deploy judicial officers, even requisitioning personnel from Jharkhand and Odisha to manage the caseload. Yet of the 32 lakh cases adjudicated, nearly 40% (roughly 13 lakh names) did not appear in the first supplementary list released on March 23.

The SIR has been conducted thirteen times since 1952. Never before has a category like Logical Discrepancy been used to place millions of voters in a state of suspended franchise without a clear, public explanation of what the category means, how it is triggered, or how a voter exits it.

The Barriers Begin Before You Even Open the File

When Alt News attempted to work with the SIR Final Rolls 2026 for two Kolkata constituencies — Bhabanipur (159) and Ballygunge (161) — the obstacles began immediately.

The first barrier was access. Bhabanipur alone contains 267 zones. The ECI website allows downloads for only ten areas at a time, each gated by a CAPTCHA. Automation is effectively blocked. Manual downloading took hours.

The screenshot below shows the portal returning an ‘invalid captcha’ message even though the correct CAPTCHA had been entered:

The second barrier was the format. The scanned PDFs are, on average, 228 times larger than a digitally readable equivalent, yet contain none of the underlying structured data. This is not a technological limitation. India runs systems like Aadhaar, UPI, and DigiLocker at massive scale. Publishing a CSV file (a plain-text file of comma-separated values) alongside a PDF is trivial by comparison. The absence of such formats is a decision.

The third barrier lies within the files. Roughly one in ten voter entries carries a diagonal “UNDER ADJUDICATION” watermark, often obscuring the name itself. This is not incidental. It directly interferes with automated data extraction and even manual reading in some cases.

Each layer targets a different stage of scrutiny: the CAPTCHA blocks collection, the image format blocks analysis, and the watermark blocks identification.

What the Data Shows: 39,604 Voters in Limbo

Alt News digitised the voter records of both constituencies from 558 PDF files, extracting the serial number, voter ID, name and adjudication status of every voter. Two models were run independently and their outputs compared record by record. Across the two constituencies, the digitised rolls show 39,604 voters, or 11.2% of the total electorate, marked “Under Adjudication.”
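The record-by-record comparison described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the pipeline Alt News used: the field names (`voter_id`, `name`, `status`), the keying by serial number, and the sample values are all hypothetical placeholders.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]  # hypothetical fields: voter_id, name, status
FIELDS = ("voter_id", "name", "status")


def compare_runs(
    run_a: Dict[int, Record], run_b: Dict[int, Record]
) -> Tuple[List[int], List[Tuple[int, str]]]:
    """Compare two independent extraction runs keyed by serial number.

    Returns the serial numbers present in only one run, and the
    (serial, field) pairs where the two runs disagree.
    """
    missing = sorted(set(run_a) ^ set(run_b))
    mismatches = []
    for serial in sorted(set(run_a) & set(run_b)):
        for field in FIELDS:
            if run_a[serial].get(field) != run_b[serial].get(field):
                mismatches.append((serial, field))
    return missing, mismatches


# Dummy placeholder records, not real voter data.
run_a = {
    1: {"voter_id": "ID0001", "name": "NAME ONE", "status": "CLEAR"},
    2: {"voter_id": "ID0002", "name": "NAME TWO", "status": "UNDER ADJUDICATION"},
    3: {"voter_id": "ID0003", "name": "NAME THREE", "status": "CLEAR"},
}
run_b = {
    1: {"voter_id": "ID0001", "name": "NAME ONE", "status": "CLEAR"},
    2: {"voter_id": "ID0002", "name": "NAME TW0", "status": "UNDER ADJUDICATION"},
}

missing, mismatches = compare_runs(run_a, run_b)
print("Only in one run:", missing)
print("Disagreements:", mismatches)
```

Records that appear in both runs with identical fields can be accepted automatically; anything listed in `missing` or `mismatches` would go to manual review.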

Alt News analysed the religious identity of each of these 39,604 voters using their full name (first name and surname together), since in Bengal a Hindu first name can appear alongside a Muslim surname and vice versa. The findings are stark.

Muslims constitute 39.5% of the combined electorate of the two constituencies. They account for 66.5% of voters placed under adjudication.

The framework for evaluating this is straightforward: if a community’s share of the voters placed under adjudication exceeds its share of the constituency’s electorate, that is a pattern that demands scrutiny.

In Bhabanipur, the chief minister’s own seat, Muslims are 21.9% of voters but account for 51.8% of those placed under adjudication. That amounts to nearly one in four Muslim voters in the seat. For Hindu voters in the same seat, the figure is fewer than one in 17. In Ballygunge, where Muslims are a slight majority at 54.3% of the electorate, they are still dramatically over-represented among adjudicated voters (three in four), with an adjudication rate nearly three times that of Hindu voters.

Across both constituencies, a Muslim voter is 3.1 times more likely to have their name placed under adjudication than a Hindu voter.
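For readers who want to check the arithmetic, the relationship between these shares and a relative rate can be sketched in a few lines of Python. The sketch uses only the percentages reported above; the function name is hypothetical. Note one assumption: it compares Muslim voters with all other voters combined, because Hindu-only shares are not stated here, so its result is close to, but not the same as, the Muslim-versus-Hindu figure of 3.1.

```python
def relative_rate(share_of_adjudicated: float, share_of_electorate: float) -> float:
    """Adjudication rate of a group divided by the rate for everyone else.

    The overall adjudication rate cancels out, so only the group's share
    of adjudicated voters and its share of the electorate are needed.
    """
    group = share_of_adjudicated / share_of_electorate
    rest = (1 - share_of_adjudicated) / (1 - share_of_electorate)
    return group / rest


# Shares reported in this article.
combined = relative_rate(0.665, 0.395)    # both constituencies together
bhabanipur = relative_rate(0.518, 0.219)  # Bhabanipur alone

print(f"Combined, Muslim vs all others: {combined:.2f}x")      # about 3.04x
print(f"Bhabanipur, Muslim vs all others: {bhabanipur:.2f}x")  # about 3.83x
```

A value of 1.0 would mean a community is placed under adjudication at the same rate as everyone else.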

The data shows a pattern that is consistent across both constituencies and runs in the same direction, one that demands a response from the Election Commission of India.

What We Are Publishing and Why It Is Unique

Several organisations have analysed the SIR data and identified troubling patterns. What Alt News has done goes a step further: publishing not just conclusions, but the underlying data itself.

For two constituencies, every classified voter record is available in our database (link given below) including voter ID, serial number, demographic classification, and a confidence score. When we state that Muslims constitute 21.9% of Bhabanipur’s electorate, the underlying list of classified names is publicly accessible. When we identify 8,083 Muslim voters under adjudication, all 8,083 records are available for verification.

However, information about religion, which was arrived at using AI tools, has been withheld in this report owing to privacy concerns.

Classifications are graded as high, medium, or low confidence. Of 39,604 adjudicated records, 93.9% fall in the high-confidence category; for Muslim records, this rises to 97.3%. The observed disparities are driven overwhelmingly by these high-certainty classifications.

Ambiguous cases, such as single-word names, mixed indicators (Hindu-sounding first names with Muslim-sounding surnames, or the reverse) and watermark-obscured entries, are flagged for manual review. The dataset is not presented as infallible; it is presented as auditable. Readers, researchers, and affected voters are invited to identify and correct errors.

The database is accessible through a public interface that allows users to explore demographics, search adjudication records by name or voter ID, and examine classification logic. The aim is not merely to analyse the data, but to return it to the public domain in a usable form.

The entire dataset and interpretations can be accessed here: https://sir-data-decoded.altnews.in/

It looks like this:

On the homepage, click Demographics Analysis to view the full religion and gender breakdown of both constituencies: the population baseline against which adjudication figures are measured.

Click Adjudication Records to access all 39,604 voters placed under adjudication. The database is searchable by name or voter ID, allowing you to look up individuals directly.

Click Demographics Triage to explore voters under adjudication along with their demographic classification and confidence scores. This section offers full transparency into our methodology. Users can filter by religion, gender, and confidence level, and review records where classification was uncertain.

Note: Anyone who wishes to study and verify the full dataset can contact the author. 

Breaking the Barrier, Building the Tool

What has been done for two constituencies is a proof of concept. The closest parallel lies outside India: tools built to process opaque government PDF dumps, such as the conversion of U.S. Department of Justice releases into searchable archives by independent engineers. The principle is straightforward: opacity enforced through format is not the same as secrecy.

The Election Commission already possesses this data in structured form. The PDFs are generated from databases. Publishing only scanned images, without accompanying machine-readable files, is therefore a choice to withhold usability, not information.

Alt News hopes to extend this work across all constituencies in West Bengal, building a system through which journalists, researchers, and citizens can search, analyse, and report discrepancies. Verified corrections will be incorporated, making the dataset progressively more accurate.

The broader point is larger than this election. In a democracy, data that is technically public must also be practically accessible. When it is not, the barrier is not technological; it is political.

For at least 352,287 voters in these two constituencies, that barrier is no longer abstract. It is measurable.

The Cost of Breaking the Wall

The total cost of digitising 352,287 voter records across both constituencies, including extraction, cross-verification, and full demographic analysis, came to approximately $141 (roughly ₹11,800). Using the most efficient method developed through this process, the per-constituency cost is now approximately $55.

This figure is not speculative. It is stable and replicable. Anyone with basic technical skills, access to the ECI’s publicly available PDFs, and $55 can reproduce this analysis for any constituency in West Bengal.

That figure, however, reflects the cost of a finished pipeline, not the cost of discovering it. Reaching this stage required several rounds of experimentation. Different extraction methods were tested, accuracy was benchmarked against watermarked voter cards, and approaches were discarded when they proved either too expensive or insufficiently reliable. Analyses were rerun when results required verification. These iterations cost a few hundred dollars, which is a one-time investment that brought the per-constituency cost down to its current level.

Even this accounting is incomplete. It does not include the labour of subject-matter experts who helped design the methodology and validate findings. Their contributions were pro bono, but indispensable.
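The arithmetic implied by these figures can be sketched as follows. The per-record cost comes directly from the numbers above. The statewide estimate rests on an assumption not made in this report: that the $55 per-constituency cost holds for each of West Bengal's 294 assembly constituencies, regardless of size.

```python
TOTAL_COST_USD = 141.0            # reported cost for both constituencies
RECORDS_DIGITISED = 352_287       # reported number of voter records
PER_CONSTITUENCY_USD = 55.0       # reported cost with the final pipeline
WB_ASSEMBLY_CONSTITUENCIES = 294  # seats in the West Bengal assembly

# Average cost of digitising one voter record in this exercise.
cost_per_record = TOTAL_COST_USD / RECORDS_DIGITISED

# Assumption: cost scales linearly with the number of constituencies.
statewide_estimate = PER_CONSTITUENCY_USD * WB_ASSEMBLY_CONSTITUENCIES

print(f"Cost per record: ${cost_per_record:.5f}")
print(f"Statewide estimate: ${statewide_estimate:,.0f}")
```

On these assumptions, the cost works out to about four hundredths of a cent per record, and roughly $16,170 for the whole state.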

Placed in context, the contrast is stark. The Election Commission of India spent public money building ERONET, the centralised system that generated over a crore “Logical Discrepancy” flags, while declining to publicly explain how those flags were produced. Alt News, by comparison, spent a few hundred dollars to make those flags searchable, sortable, and open to public scrutiny for the first time.

This work is only the first phase. The next step is to extend the analysis to the supplementary lists published after the Final Roll, as well as the draft rolls that preceded the SIR. Together, these datasets would allow a full reconstruction of how voter status changed across the revision cycle.

The Commission’s Justification and Its Limits

The Election Commission of India has, over time, offered multiple justifications for restricting electoral rolls to image-based formats.

In January 2018, it directed all state chief electoral officers to publish rolls as image files, citing data security concerns, specifically the risk of misuse by foreign actors. When this policy was challenged in court by Congress leader Kamal Nath, the Commission argued that searchable, machine-readable formats would enable large-scale data mining and potentially violate voter privacy. The Supreme Court of India declined to examine this claim on its merits, leaving the choice of format to the Commission’s discretion.

More recently, in August 2025, chief election commissioner Gyanesh Kumar advanced a more categorical claim: that machine-readable files are effectively “barred” because they “can be edited,” opening the door to misuse.

This argument has been widely criticised as technically unsound. Editing a downloaded copy of a dataset does not and cannot alter the original records maintained by the Commission. The integrity of the official rolls is not dependent on the format in which they are shared publicly.
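The point can be illustrated with a short Python sketch. It assumes a hypothetical scenario in which an official file is published along with a SHA-256 checksum, a common practice for public data releases; nothing here describes what the Commission actually does. The sample bytes are dummy placeholders.

```python
import hashlib

# Hypothetical official file contents (dummy data) and its published checksum.
official_bytes = b"serial,voter_id,name\n1,ID0001,NAME ONE\n"
published_digest = hashlib.sha256(official_bytes).hexdigest()

# Someone downloads a copy and edits it locally.
downloaded = bytearray(official_bytes)
downloaded[-4:-1] = b"TWO"  # change "ONE" to "TWO" in the copy only

# The original is untouched, and the edited copy no longer matches the checksum.
original_intact = hashlib.sha256(official_bytes).hexdigest() == published_digest
copy_detected = hashlib.sha256(bytes(downloaded)).hexdigest() != published_digest

print("Original unchanged:", original_intact)
print("Edited copy detectable:", copy_detected)
```

Editing the copy changes nothing at the source, and anyone holding the published checksum can tell an altered copy from the genuine file.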

What is notable is the asymmetry at the heart of this policy. The ECI itself maintains electoral data in structured, machine-readable form within its ERONET system. The restriction applies only to what is made available outside — to the public, to researchers, and even to political parties.

In effect, the concern is not about whether the data can be structured. It already is. The concern is about who gets to use it in that form.

Alt News has written to the chief election commissioner, West Bengal, seeking responses. This report will be updated if we get a response.
