What Public Safety Agencies Should Ask Before Buying an AI Tool
Before buying a public safety AI tool, ask whether it assists decisions or makes them, whether every output can be traced back to its source, what happens to your data and whether it trains the vendor's model, whether every AI action is logged and auditable, whether it fits your workflow, and what problem you are actually solving. The thread through all of it: the best AI cuts paperwork while the officer stays responsible for every decision.
Every AI vendor promises to save you time. Few will tell you what happens when the AI is wrong.
If you are a sergeant or a chief, you are not just buying software. You are buying something you will have to defend to a city council that approved the money, and that an officer may have to explain on a witness stand a year later. That is the altitude to evaluate AI from. Not "is it impressive in the demo," but "can we stand behind every output it produces."
So here is a buyer's checklist you can use on any vendor, including us. The thread through all of it is simple: the best AI cuts the paperwork that keeps officers from policing, and it leaves the officer responsible for every decision. Use these questions to tell the tools that do that from the ones that just add risk.
First, name the problem you are solving
Agencies do not buy AI. They buy a solution to a specific operational problem, and "AI" is not one. In public safety it shows up as very different tools, each a different purchase with different risks:
- Report writing from notes or recordings.
- Search warrant drafting from an incident record.
- Intelligence and incident summaries so a role can get up to speed fast.
- Incident command support, drafting activations and prefilling plans.
- Video analysis and search across footage.
- Drone analytics on what an aircraft sees.
- Administrative automation for the paperwork around the work.
Decide which of these you are actually solving before you sit through a single demo. Buying AI because it is AI is how agencies end up with shelfware.
Who is actually deciding?
Does the AI make decisions, or assist the people making them?
This is the most important question on the list. AI should draft, summarize, and organize. It should not decide who to arrest, when to use force, or whether to activate a team. A good vendor is clear about where the software stops and the officer's judgment begins. Public safety is treated as a high-risk setting for AI for exactly this reason. Bodies like the IACP now publish frameworks that call for a defined point where a human reviews every automated suggestion, an approach Police Chief Magazine lays out in its discussion of governable AI in public safety. If a tool is positioned to make operational decisions on its own, that is a liability, not a feature.
The cleanest way to hold that line is to be explicit about which work belongs to the software and which belongs to the officer:
| AI does | The officer does |
|---|---|
| Summarize, organize, draft, search, categorize | Decide, authorize, arrest, charge, testify |
Transparency: can you trust the output?
Can every output be traced back to its source?
This is the question almost nobody asks, and it is the one that separates a serious tool from a risky one. If an AI-generated report, incident summary, or warrant paragraph cannot be traced back to the underlying facts, you are looking at black-box policing. Summaries should cite their source material. Drafts should show what they were built from. An officer should be able to verify every line before it carries their name, and a prosecutor should be able to defend it. If the vendor cannot explain where an answer came from, think hard before relying on it.
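One way to picture what "traceable" means in practice is the minimal sketch below. It assumes each sentence of a draft carries the IDs of the source items it was built from, and it flags anything an officer could not check against the record. The names (`DraftSentence`, `source_ids`, `untraced`) are hypothetical illustrations, not any vendor's actual design.

```python
from dataclasses import dataclass, field


@dataclass
class DraftSentence:
    """One sentence of an AI draft plus the source items it was built from."""
    text: str
    # Hypothetical IDs of notes or recording segments behind this sentence.
    source_ids: list[str] = field(default_factory=list)


def untraced(draft: list[DraftSentence], known_sources: set[str]) -> list[DraftSentence]:
    """Return sentences that cite nothing, or cite a source not in the record."""
    flagged = []
    for sentence in draft:
        cited = set(sentence.source_ids)
        if not cited or not cited <= known_sources:
            flagged.append(sentence)
    return flagged


draft = [
    DraftSentence("Caller reported a broken window.", ["note-1"]),
    DraftSentence("Suspect fled north on foot.", []),
    DraftSentence("Vehicle was a blue sedan.", ["note-9"]),
]
flagged = untraced(draft, known_sources={"note-1", "note-2"})
for sentence in flagged:
    print("NEEDS REVIEW:", sentence.text)
```

In this sketch the second sentence cites nothing and the third cites a source that is not in the record, so both are surfaced for review. A real tool would do this differently. The question for the vendor is whether it does something equivalent at all.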
How does it handle errors and hallucinations?
Generative AI can produce confident, fluent, wrong output, including fabricated facts and citations. Ask how the vendor reduces that, and more importantly, how the tool keeps a person in the loop to catch it. The right answer is never "it is accurate, trust it." It is that every output is a draft a human reviews, and the system makes that review easy rather than burying it.
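The "every output is a draft" rule can be sketched as a simple gate: nothing is filed until a named person has reviewed it. This is an illustration under assumed names (`Draft`, `approve`, `file_report`), not a description of how any particular product works.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Draft:
    """AI-generated text that stays a draft until a person signs off."""
    text: str
    reviewed_by: Optional[str] = None  # None means no human has reviewed it

    def approve(self, reviewer: str, corrected_text: Optional[str] = None) -> None:
        """Record the human reviewer, applying their corrections if any."""
        if corrected_text is not None:
            self.text = corrected_text
        self.reviewed_by = reviewer


def file_report(draft: Draft) -> str:
    """Refuse to file anything a person has not reviewed."""
    if draft.reviewed_by is None:
        raise ValueError("draft has not been reviewed by a person")
    return f"{draft.text} (reviewed by {draft.reviewed_by})"


draft = Draft("Suspect was arrested at 14:00.")
try:
    file_report(draft)
except ValueError as err:
    print("Blocked:", err)

# The reviewer catches an error and corrects it before the report is filed.
draft.approve("Officer Lee", corrected_text="Suspect was detained at 14:00.")
print(file_report(draft))
```

The design choice to look for is that review is a required step in the path to a filed record, not an optional screen that can be skipped.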
What happens when the AI is wrong?
Errors are not hypothetical, so ask what the tool does when one slips through. Who is accountable for the output, the vendor or your agency? Can a supervisor catch and correct it before it reaches a report or a courtroom? A tool that saves an hour of writing but produces a record nobody can stand behind has created liability, not efficiency.
Beware of AI theater
A lot of what is sold as AI is a general-purpose language model with a logo on it. That is not automatically bad, but it is not automatically worth paying for either, so make the vendor show their work. Ask what unique data the tool actually uses, what it does better than a generic chatbot anyone could already open, and what measurable operational improvement it has shown with real agencies. Then come back to the one that matters most: can your people independently verify every output? If the answers are vague, you are buying a wrapper, not a capability.
Data and CJIS
What happens to your data?
This is several questions, and you want all of them answered plainly:
- Where is it stored, and is it on infrastructure built for government workloads?
- Is it retained, and for how long, or is it handled live and never archived?
- Is it used to train the vendor's model? For most agencies the answer needs to be no.
- Who owns it, can you export it, and can you have it permanently deleted?
- Who has access, are they screened, and what leaves your agency, especially with a cloud tool?
CJIS obligations do not transfer to the vendor. A compliant vendor reduces your burden; it does not remove it. Get the data answers in writing.
Accountability
Is every AI action logged and auditable?
Provability is half of accountability. A supervisor should be able to see what was asked, what the AI produced, what a person changed, and who did it, and to disable a feature that is not earning its place. If there is no audit trail of AI activity, you cannot answer for it later, and in this profession you will eventually be asked to.
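As a rough sketch of what such an audit trail might hold, the example below records what was asked, what the AI produced, what the person kept or changed, and who did it, and it lets a supervisor switch a feature off. All class, field, and feature names here are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """One logged AI action (hypothetical schema)."""
    timestamp: str
    user: str
    feature: str
    prompt: str       # what was asked
    ai_output: str    # what the AI produced
    final_text: str   # what the person actually kept

    @property
    def edited(self) -> bool:
        """True when the person changed the AI's output."""
        return self.final_text != self.ai_output


class AuditLog:
    def __init__(self) -> None:
        self._entries: list[AuditEntry] = []
        self._disabled: set[str] = set()

    def disable(self, feature: str) -> None:
        """Supervisor switch: block further use of a feature."""
        self._disabled.add(feature)

    def record(self, user: str, feature: str, prompt: str,
               ai_output: str, final_text: str) -> AuditEntry:
        if feature in self._disabled:
            raise PermissionError(f"feature {feature!r} is disabled")
        entry = AuditEntry(
            timestamp=datetime.now(timezone.utc).isoformat(),
            user=user, feature=feature, prompt=prompt,
            ai_output=ai_output, final_text=final_text,
        )
        self._entries.append(entry)
        return entry

    def edited_entries(self) -> list[AuditEntry]:
        """Entries where a person changed what the AI wrote."""
        return [e for e in self._entries if e.edited]


log = AuditLog()
log.record("sgt.rivera", "incident_summary", "Summarize call 1",
           ai_output="Two vehicles involved.", final_text="Three vehicles involved.")
log.record("ofc.chen", "incident_summary", "Summarize call 2",
           ai_output="No injuries reported.", final_text="No injuries reported.")

for entry in log.edited_entries():
    print(entry.user, "changed:", entry.ai_output, "->", entry.final_text)

log.disable("incident_summary")  # later record() calls for this feature raise PermissionError
```

When you evaluate a vendor, ask to see the equivalent view in their product: the original output, the human edit, the user, and the time, side by side.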
Mission fit
Does it fit your existing workflow, or create duplicate work?
A tool that makes officers do the job twice, once for the work and once to feed the AI, will not get used, no matter how good the demo was. The right tools sit inside the work that is already happening and take load off it. Ask to see where the tool fits on a real call, not on a slide.
The bottom line
The goal of AI in public safety is not to replace human judgment. It is to cut administrative burden, organize information faster, and give experienced professionals better information when time matters most. Decision support, not decision making.
That is the standard we hold ourselves to, so ask us these questions too. We built BabbarOps to assist command staff and investigators, to keep every output reviewable and traceable, and to leave the officer responsible for every call. The one-page checklist above is yours to take into any evaluation, ours or anyone else's.
Before buying an AI tool, ask whether it assists decisions or makes them, whether every output can be traced to its source and verified, what happens to your data (storage, retention, model training, access, and what leaves the agency), whether every AI action is logged and auditable, whether it fits your existing workflow, and what specific problem you are solving. Ask these of any vendor.
How your data is handled depends on the vendor and the deployment. Ask where data is stored, whether it is retained, whether it is used to train the vendor's model, and who has access. CJIS obligations do not transfer to the vendor; a compliant vendor reduces your burden but does not remove it. Get the answers in writing.
AI should not make decisions for the officer. The right tools assist by drafting, summarizing, and organizing, and the officer reviews the output and remains responsible for every decision and action. AI should not decide who to arrest, when to use force, or whether to activate a team.
AI output should be traceable to its source and easy to verify. If a report, summary, or warrant paragraph cannot be traced back to the underlying facts, treat it with caution. Every output should be a draft a person reviews before it carries their name.
Evaluating AI tools for your agency? Bring this checklist to the table. We will answer all of it about BabbarOps, including the hard ones.
Sukh Bhela is a California police sergeant who has served as a UAS operator, UAS supervisor, and incident commander during critical incidents. His experience leading patrol operations and integrating drone technology into public safety responses led him to found BabbarOps, where he builds tools for live situational awareness and incident command. He writes about policing, drone operations, leadership, and the technology shaping the future of emergency response.
The views expressed here are the author's own, written in his personal capacity. They do not represent, and are not made on behalf of, any law enforcement agency or employer.
This guide is general information for evaluating technology, not legal, procurement, or compliance advice. Confirm any requirement with your agency's IT, policy, and legal authorities and with the relevant standards. BabbarOps is an independent commercial product and is not affiliated with, endorsed by, or operated on behalf of any law enforcement agency.
