NIH scores your SBIR application on a 1-9 scale where 1 means "Exceptional." NSF uses the same 1-9 scale, but 9 means "Exceptional." ARPA-H doesn't use a panel at all -- a single Program Manager reads your submission and decides. DoD rejects your technology if it doesn't match the exact solicitation topic. These aren't minor differences. They determine whether you frame your innovation as hypothesis-driven research, high-risk R&D, a 10x health improvement, or an operational solution.
The complete comparison
| Dimension | NIH | NSF | ARPA-H | DoD |
|---|---|---|---|---|
| Scoring scale | 1-9 (1 = best) | 1-9 (9 = best) | 1-9 (9 = best) | Varies by component |
| Who reviews | Panel of 15-20 scientists | Program Director + expert reviewers | Single Program Manager | Technical evaluators |
| Top criterion | Approach | Innovation Classification | Non-Incremental Innovation (25%) | Topic alignment |
| What kills applications | Sequential aim dependencies | Tier C (engineering, not R&D) | Failing the 60-second test | Misaligned to solicitation topic |
| Innovation bar | Hypothesis-driven R&D | High-risk/high-reward R&D | 10x improvement required | Solves the defined problem |
| Preliminary data | Critical | Less formal; customer signals valued | Proof-of-concept | Varies by topic |
| Language culture | Scientific, hypothesis-driven | R&D-focused, national significance | Plain language, outcome-focused | Operational, mission-focused |
| Decision language | Fundable / Not Competitive | Invite / Decline | Encourage / Discourage | Select / Not Select |
Each agency optimizes for a different question:
- NIH: "Will this advance scientific knowledge and improve health?"
- NSF: "Is this genuine high-risk/high-reward R&D with national significance?"
- ARPA-H: "Can this solve a health problem in a way conventional approaches cannot?"
- DoD: "Does this solve the specific operational problem we defined?"
How NIH scores: 5 criteria, study section review
NIH uses the most formalized review process. A study section panel (15-20 domain scientists) assigns 3 reviewers to each application.
The 5 criteria
Each reviewer scores on a 1-9 scale where 1 = Exceptional and 9 = Poor.
| Criterion | Core Question | What Kills Applications |
|---|---|---|
| Significance | Does this address an important health problem? | Generic health burden instead of quantified data with CDC/WHO sources |
| Innovation | Does this challenge existing approaches? | Claiming "novel" without explaining what specifically is new |
| Approach | Is the methodology well-reasoned? | Sequential aim dependencies, missing potential problems section |
| Investigators | Is the PI suited for this work? | No preliminary data supporting the proposed hypothesis |
| Environment | Does the institution support success? | Missing equipment descriptions, no collaboration evidence |
Overall Impact: the score that matters
Reviewers produce an Overall Impact score reflecting the likelihood of sustained influence on the field. This is NOT the average of the 5 criteria. A fatal flaw in one criterion (particularly Approach) can drive Overall Impact to unfundable levels even if the other four score well.
- Scores 1-3: typically fundable
- Scores 4-5: needs revision
- Scores 6-9: not competitive
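To make the banding concrete, here is a minimal Python sketch mapping a 1-9 Overall Impact score to the outcomes above. The thresholds come straight from the bullets; the function name and structure are illustrative, not an NIH formula.

```python
def nih_impact_band(overall_impact: int) -> str:
    """Map a 1-9 NIH Overall Impact score to the bands above.

    Note: Overall Impact is assigned holistically by reviewers;
    it is NOT computed from the five criterion scores, and a fatal
    flaw in Approach alone can push it into the 6-9 range.
    """
    if not 1 <= overall_impact <= 9:
        raise ValueError("NIH scores run 1 (Exceptional) to 9 (Poor)")
    if overall_impact <= 3:
        return "typically fundable"
    if overall_impact <= 5:
        return "needs revision"
    return "not competitive"
```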
Half of applications get triaged
NIH triages the bottom half of applications before the study section meeting. "Not Discussed" means your application never received a formal score. Common triage triggers:
- No preliminary data
- Vague hypothesis
- Sequential aim dependencies
- Phase II scope in a Phase I budget
- Missing potential problems section
How NSF scores: innovation classification is everything
The screening gate
Before technical review, the Program Director applies 5 screening questions. Fail any one and your pitch is declined regardless of merit (the fail-any-one logic is sketched after the table):
| Screening Question | What They're Really Asking |
|---|---|
| Has this been done before? | Is there genuine R&D novelty? |
| Are there technical hurdles NSF R&D could overcome? | Is the risk technical (fundable) or business risk (not fundable)? |
| Could this disrupt the target market? | Is the impact nationally significant? |
| Is there evidence of product-market fit? | Real customer signals, not just a TAM slide? |
| Is there potential for broad societal impact? | Specific population and mechanism of benefit? |
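As a sketch of that gate logic, the decision is a strict AND: one "no" ends the review. The question wording below is paraphrased from the table, and NSF does not publish this as code; it's a human judgment call by the Program Director, not a script.

```python
# Illustrative only: the five screening questions from the table,
# paraphrased as boolean checks.
SCREENING_QUESTIONS = [
    "Is there genuine R&D novelty (not done before)?",
    "Are the hurdles technical risk, not business risk?",
    "Could this disrupt the target market (national significance)?",
    "Is there evidence of product-market fit (real customer signals)?",
    "Is there potential for broad societal impact?",
]

def passes_nsf_screen(answers: list[bool]) -> bool:
    """Fail any one question and the pitch is declined outright."""
    assert len(answers) == len(SCREENING_QUESTIONS)
    return all(answers)

# Four strong answers and one 'no' still means a decline:
print(passes_nsf_screen([True, True, True, True, False]))  # False
```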
Innovation classification
NSF classifies your innovation before scoring:
- Tier A -- New scientific principle: typically scores highest. This is what NSF wants.
- Tier B -- Novel application of known science: competitive but not a slam dunk.
- Tier C -- Engineering optimization: rarely scores well. Effectively a decline.
If reviewers can't tell whether your work is Tier A/B or Tier C, that ambiguity itself is a red flag. NSF's primary gate is whether you're doing genuine R&D versus product development dressed as research.
The 3 criteria
- Intellectual Merit -- potential to advance scientific knowledge
- Broader Impacts -- how the technology benefits society (for SBIR: specific population, mechanism, scale)
- Commercial Impact -- market need, scalability, whether NSF funding de-risks the technology
How ARPA-H evaluates: the 60-second test
ARPA-H has no peer review panels. A single Program Manager reads your 6-page Solution Summary and decides: Encourage or Discourage.
The PM should understand your concept in 60 seconds
If your opening requires domain-specific knowledge to parse, the PM assumes your thinking is unclear.
Fails the test: "We are developing a platform to improve cancer treatment."
Passes the test: "We are developing a [specific technology] that [mechanism] to [quantified outcome], which would [health impact] for [specific population]."
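One way to pressure-test your opening is to force it through the template literally. The sketch below fills the bracketed slots with a hypothetical example (the device, numbers, and population are invented for illustration); if any slot comes out empty or vague, the sentence fails the test.

```python
# Hypothetical values; every slot must be specific and quantified.
opening = (
    "We are developing a {technology} that {mechanism} "
    "to {quantified_outcome}, which would {health_impact} "
    "for {population}."
).format(
    technology="wearable ultrasound patch",
    mechanism="images bladder volume continuously",
    quantified_outcome="cut catheter-associated infections by 50%",
    health_impact="prevent an estimated 200,000 infections per year",
    population="hospitalized patients with urinary catheters",
)
print(opening)
```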
5 weighted criteria
| Criterion | Weight | What the PM Looks For |
|---|---|---|
| Non-Incremental Innovation | 25% | 10x better, not 10%. New mechanism, not better implementation. |
| Health Impact and Scale | 25% | Quantified in patients/lives, not market size. Equity addressed. |
| Technical Feasibility | 20% | Measurable milestones with real Go/No-Go decisions. |
| Team and Execution | 15% | Three pillars: technical + clinical + commercialization. |
| Writing Quality | 15% | Passes 60-second test. Jargon-free. Quantified throughout. |
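ARPA-H does not publish a scoring formula, but the weights above imply a simple weighted sum. The sketch below assumes, per the comparison table, a 1-9 scale where 9 is best; the criterion names and example scores are illustrative. It shows how the 25/25/20/15/15 split plays out.

```python
# Weights from the table above; they sum to 1.0.
WEIGHTS = {
    "non_incremental_innovation": 0.25,
    "health_impact_and_scale": 0.25,
    "technical_feasibility": 0.20,
    "team_and_execution": 0.15,
    "writing_quality": 0.15,
}

def weighted_score(scores: dict[str, float]) -> float:
    """Weighted composite on the 1-9 scale (9 = best)."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

# A strong concept with weak writing still bleeds 15% of its score:
example = {
    "non_incremental_innovation": 9,
    "health_impact_and_scale": 8,
    "technical_feasibility": 7,
    "team_and_execution": 7,
    "writing_quality": 3,
}
print(round(weighted_score(example), 2))  # 7.15
```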
Language matters: NIH vocabulary is a red flag at ARPA-H
| NIH Language (avoid at ARPA-H) | ARPA-H Language (use instead) |
|---|---|
| "Hypothesis-driven" | "Will demonstrate" |
| "Specific aims" | "Milestones with Go/No-Go" |
| "Preliminary data suggests" | "Preliminary data demonstrates" |
| "Grantee" | "Performer" |
| "Market opportunity ($XB TAM)" | "Health impact (X million patients)" |
How DoD scores: topic alignment is king
DoD SBIR is topic-driven. You respond to a specific solicitation topic, not your own research question.
| Component | Format | Key Differentiator |
|---|---|---|
| Standard DoD (Navy, Army, SOCOM) | Topic-based proposals | Respond to explicit topic requirements |
| AFWERX | Open Topic + Specific Topic | Commercial viability weighted equally with technical merit |
| DARPA | BAA-specific | PM-directed; Proposers Day attendance matters |
DoD awards are contracts, not grants. This means defined deliverables and milestones from the solicitation, not self-defined research aims. IP and patent strategy matter more at DoD than at civilian agencies.
Common DoD decline patterns
- Topic misalignment -- addresses a related but different problem than the solicitation specifies
- No operational context -- technology described in commercial terms without defense use case
- Missing IP strategy -- unclear data rights or IP protection plan
- Academic framing -- reads like an NIH grant instead of a defense contract response
Writing the same technology for different agencies
If you're applying to multiple agencies (recommended -- a portfolio approach improves your odds):
- For NIH: lead with scientific significance. Hypothesis-driven. Quantify health burden. Structure aims as independent, testable hypotheses.
- For NSF: lead with innovation classification. Demonstrate Tier A/B novelty. Frame Broader Impacts around populations and mechanisms, not revenue.
- For ARPA-H: lead with the 10x improvement. 60-second clarity. Health impact in patients, not dollars. Use ARPA-H vocabulary.
- For DoD: lead with topic alignment. Show your technology addresses the operational need. Emphasize IP protection and defense sector credibility.
For the full cross-agency proposal strategy, see how to win an SBIR grant. For agency-specific guides, see our NSF pitch guide, AFWERX guide, or DARPA BAA guide.
Want to know how your proposal would score?
We write proposals across NIH, NSF, ARPA-H, and DoD. If you're not sure which agency your technology is most competitive for, that's the first question to answer before investing 80+ hours. Our Strategy Review includes agency-fit assessment specific to your technology.