Standard Deviation (SD) – Glossary

Standard Deviation – Process Variation Metric

This topic is part of the SG Systems Global regulatory & operations glossary.

Updated October 2025 • SPC, Capability & Sampling • Quality, Manufacturing, Laboratory

Standard deviation (σ for population, s for sample) quantifies the typical distance of observations from the mean. In manufacturing and labs it is the workhorse statistic behind SPC, capability indices (Cp/Cpk), MSA, and CPV. It turns a cloud of points into a stable number managers and regulators can act on—provided the data are collected rationally, the estimator matches the use (within‑subgroup vs overall), and Document Control governs the method so results are attributable and repeatable.

“If the mean tells you where the process is, standard deviation tells you how often it won’t be there.”

TL;DR: Standard deviation measures spread. Use s with Bessel’s correction (n−1) for estimates, distinguish within‑subgroup σ (short‑term) from overall σ (long‑term), and base control limits and capability on the right one: control charts and Cp/Cpk use within σ; Pp/Ppk use overall σ. Validate calculations in MES/LIMS under CSV; keep raw data, subgrouping, and audit trails intact per Data Integrity and Record Retention.

1) What Standard Deviation Covers—and What It Does Not

Covers: the dispersion of a numeric characteristic around its mean under a defined sampling plan and time scale; the noise component used to set SPC control limits; the denominator of capability calculations; the yardstick to compare equipment setups, methods, and gages. Properly defined, it separates common‑cause scatter from special‑cause shifts seen by control charts.

Does not cover: shape (skewness, multimodality), stability (time ordering), or measurement bias. A small s can coexist with off‑target means, non‑normal tails, or autocorrelation. Standard deviation is not a pass/fail rule; it is a parameter you interpret in context of specifications, control rules, and risk.

2) System & Data Integrity Anchors

The calculation and use of s must be governed. Sampling plans live in SOPs under Document Control; data capture occurs contemporaneously in MES, LIMS, or ELN with attributable users and immutable audit trails. Calculation engines are validated proportionately (CSV) and results are retained per policy (Record Retention). If σ supports regulated release decisions, electronic signatures must meet Part 11/Annex 11.

3) The Evidence Pack When You Report σ

Be audit‑ready: preserve the raw observations with timestamps and subgroup IDs; the sampling plan and rationale; data screening rules (missing values, outliers) and their justifications; the estimator used (e.g., s with n−1, MR‑based σ, R̄/d2, S̄/c4); normality assessment and any transformations; the link to the control plan characteristic; gage results (MSA); and downstream use in charts or capability reports. Each step should be reconstructable through system records, not spreadsheets alone.

4) From Data Collection to Action—A Standard Path

Start with a rational sampling plan (Sampling) that groups items produced under the “same” conditions into small subgroups (e.g., 3–5 consecutive parts) to estimate short‑term variation. Collect a baseline of 25–30 subgroups, compute within‑subgroup σ (via R̄/d2 or S̄/c4) and set control limits for the mean chart at X̄̄ ± 3·(σ/√n). Run the process with rules for signals (SPC) and maintain an overall σ for longer‑term performance and Pp/Ppk. Feed both into the control plan and CPV so shifts or drifts trigger timely investigation, not surprises at release.
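
A minimal sketch of this path in Python, assuming subgroup size 5 and its standard d2 constant (2.326); the data are simulated stand‑ins for real measurements:

    import random
    import statistics

    random.seed(1)
    # Illustrative baseline: 25 subgroups of 5 consecutive measurements each.
    subgroups = [[random.gauss(10.0, 0.1) for _ in range(5)] for _ in range(25)]

    n, d2 = 5, 2.326                                    # d2 constant for subgroup size 5
    xbar = [statistics.mean(g) for g in subgroups]      # subgroup means
    r_bar = statistics.mean(max(g) - min(g) for g in subgroups)  # R-bar

    grand_mean = statistics.mean(xbar)                  # X-double-bar
    sigma_within = r_bar / d2                           # short-term sigma estimate

    ucl = grand_mean + 3 * sigma_within / n ** 0.5      # mean-chart limits
    lcl = grand_mean - 3 * sigma_within / n ** 0.5
    print(f"sigma_within={sigma_within:.4f}  LCL={lcl:.4f}  UCL={ucl:.4f}")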

5) Interpreting σ—The Practical Meaning

Under a normal model, roughly 68% of values lie within ±1σ of the mean, 95% within ±2σ, and 99.73% within ±3σ. This is the intuition behind 3σ control limits and 6σ capability spans. But treat normality as an assumption to check, not a given: skewed or heavy‑tailed data can make ±3σ too permissive or too tight. Combine σ with time‑ordered charts and specification context, and be explicit about which σ you are using (within vs overall).
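
These coverage figures follow directly from the normal CDF; a standard‑library check using the identity P(|Z| ≤ k) = erf(k/√2):

    from math import erf, sqrt

    for k in (1, 2, 3):
        coverage = erf(k / sqrt(2))    # P(|Z| <= k) under a standard normal
        print(f"+/-{k} sigma covers {coverage:.4%}")
    # prints 68.2689%, 95.4500%, 99.7300%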

6) Choosing the Right σ for the Job

Within‑subgroup σ (short‑term). Use when setting control limits and computing Cp/Cpk. Estimators include R̄/d2 (for X̄‑R charts), S̄/c4 (for X̄‑S charts), or MR̄/d2 for individuals charts (d2≈1.128 for 2‑point moving ranges). This σ reflects inherent noise when conditions are held constant.

Overall σ (long‑term). Use for Pp/Ppk—the standard deviation of all individual values pooled across the entire run (conventionally the n−1 sample formula). It captures drifts, shifts, and seasonal effects. Because overall σ is at least as large as within σ in practice, expect Ppk ≤ Cpk; the two converge for stable processes, and large gaps signal instability or subgrouping errors.
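
A sketch contrasting the two estimators on simulated data with a deliberate mid‑run mean shift (subgroup size 4, d2 = 2.059); all values are illustrative:

    import random
    import statistics

    random.seed(7)
    # First half centered at 10.0, second half shifted to 10.3, same noise level.
    data = ([random.gauss(10.0, 0.1) for _ in range(60)]
            + [random.gauss(10.3, 0.1) for _ in range(60)])

    n, d2 = 4, 2.059                                # d2 constant for subgroup size 4
    subgroups = [data[i:i + n] for i in range(0, len(data), n)]
    r_bar = statistics.mean(max(g) - min(g) for g in subgroups)

    sigma_within = r_bar / d2                       # short-term: blind to the shift
    sigma_overall = statistics.stdev(data)          # long-term: includes the shift
    print(f"within={sigma_within:.3f}  overall={sigma_overall:.3f}")
    # overall exceeds within here, so Ppk falls below Cpk for this run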

7) σ in Control Limits

For a mean chart with subgroup size n, limits are X̄̄ ± 3·(σ/√n). Traditional constants (A2, A3) implement the same logic using R̄ or S̄. For individuals charts, center at X̄̄ with limits X̄̄ ± 3·σMR, where σMR≈MR̄/d2. Always base limits on within σ so the chart is sensitive to special‑cause shifts rather than routine long‑term wander.
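
For individuals data the same logic runs through moving ranges; a sketch with d2 = 1.128 for 2‑point moving ranges and an illustrative series:

    import statistics

    series = [10.02, 9.97, 10.05, 9.99, 10.01, 10.04, 9.96, 10.03]  # illustrative

    mr = [abs(b - a) for a, b in zip(series, series[1:])]   # 2-point moving ranges
    sigma_mr = statistics.mean(mr) / 1.128                  # MR-bar / d2
    center = statistics.mean(series)

    ucl = center + 3 * sigma_mr        # individuals-chart limits
    lcl = center - 3 * sigma_mr
    print(f"center={center:.3f}  LCL={lcl:.3f}  UCL={ucl:.3f}")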

8) σ in Capability (Cp/Cpk vs Pp/Ppk)

Cp = (USL − LSL) / (6·σwithin) measures potential spread only and ignores centering; Cpk = min{(USL − μ), (μ − LSL)}/(3·σwithin) penalizes off‑target means. Pp and Ppk use σoverall. Report both pairs: Cp/Cpk reflect potential under controlled conditions; Pp/Ppk reflect actual long‑term performance. Investigate large Cp−Cpk gaps (centering) and Cpk−Ppk gaps (instability or subgrouping issues).
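
A sketch implementing these formulas; the same two functions yield Cp/Cpk or Pp/Ppk depending on which σ you pass in, and the spec limits, mean, and sigmas below are illustrative:

    def cp(usl, lsl, sigma):
        """Cp (or Pp when fed overall sigma): spec width over six sigmas."""
        return (usl - lsl) / (6 * sigma)

    def cpk(usl, lsl, mu, sigma):
        """Cpk (or Ppk with overall sigma): penalizes an off-target mean."""
        return min(usl - mu, mu - lsl) / (3 * sigma)

    # Illustrative: spec 9.5-10.5, mean 10.1, within sigma 0.10, overall 0.14
    print(cp(10.5, 9.5, 0.10))             # Cp  = 1.67 potential
    print(cpk(10.5, 9.5, 10.1, 0.10))      # Cpk = 1.33 after centering penalty
    print(cpk(10.5, 9.5, 10.1, 0.14))      # Ppk ~ 0.95 long-term performance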

9) Non‑Normal & Robust Alternatives

When normality is doubtful, consider transformations (Box‑Cox), distribution fitting (e.g., lognormal), or non‑parametric capability via percentiles (e.g., estimate the 0.135th and 99.865th percentiles for a 6σ span). For outlier‑resistant spread, the median absolute deviation (MAD) scaled by 1.4826 approximates σ for symmetric data. Whatever the choice, document the rationale and ensure consistency across time and products.
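
A sketch of the robust route; the 1.4826 scale factor comes from the text, and the data are illustrative, with one gross outlier planted to show the contrast:

    import statistics

    data = [9.9, 10.0, 10.1, 10.0, 9.8, 10.2, 10.0, 12.5]   # one gross outlier

    med = statistics.median(data)
    mad = statistics.median(abs(x - med) for x in data)      # median absolute deviation
    sigma_robust = 1.4826 * mad          # approximates sigma for symmetric data

    print(f"classical s = {statistics.stdev(data):.3f}")     # dragged up by the outlier
    print(f"MAD-based   = {sigma_robust:.3f}")               # barely moves

The percentile route needs substantially more data than this, since the 0.135th and 99.865th percentiles sit far out in the tails.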

10) Gage Variation—Don’t Confuse Noise Sources

If the measurement system is noisy, σ conflates process and gage. Run MSA (e.g., Gage R&R) to quantify repeatability and reproducibility; reduce gage contribution or adjust interpretation (e.g., assess capability on σprocess if you can subtract gage variance appropriately). Without MSA, capability and control conclusions can be badly biased.
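
One common form of that subtraction, as a sketch assuming process and gage variation are independent so their variances add; all figures are illustrative:

    from math import sqrt

    sigma_total = 0.120    # observed sigma (process + gage), illustrative
    sigma_gage = 0.050     # repeatability & reproducibility from the Gage R&R study

    # Independent sources: variances add, so subtract in the variance domain.
    sigma_process = sqrt(sigma_total**2 - sigma_gage**2)

    pct_grr = 100 * sigma_gage / sigma_total    # gage share of observed variation
    print(f"sigma_process={sigma_process:.4f}  %GRR={pct_grr:.1f}%")
    # a %GRR above 30% would fail most acceptance policies; fix the gage first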

11) Sample Size, Frequency & Subgrouping

For a stable baseline, collect at least 25 subgroups (3–5 parts each) or 100+ individual points when using I‑MR. Choose rational subgroups: sample consecutively to capture short‑term variation; avoid mixing different shifts, materials, or setups within the same subgroup. Watch for autocorrelation in continuous processes; if present, increase spacing or use time‑series‑aware charts.
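
A quick lag‑1 autocorrelation screen using only the standard library; a coefficient well above zero suggests consecutive samples are not independent and the spacing should grow (the function below is a generic estimator, not tied to any particular chart package):

    import statistics

    def lag1_autocorr(x):
        """Lag-1 autocorrelation coefficient of a time-ordered series."""
        mu = statistics.mean(x)
        num = sum((a - mu) * (b - mu) for a, b in zip(x, x[1:]))
        den = sum((a - mu) ** 2 for a in x)
        return num / den

    series = [10.0, 10.1, 10.1, 10.2, 10.1, 10.0, 9.9, 9.9, 10.0, 10.1]
    print(f"lag-1 r = {lag1_autocorr(series):.2f}")
    # near 0: spacing looks fine; close to 1: consecutive samples echo each other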

12) Outliers, OOT & Data Conditioning

Treat outliers as signals, not dirt to sweep away. Investigate under Deviation/NC and RCA, and classify as special cause before exclusion. Persistently drifting σ or mean should surface as OOT; non‑conforming results trigger OOS procedures. Keep the decision trail in the system with reasons and approvals.

13) Metrics That Demonstrate Control

  • σwithin vs σoverall ratio: convergence indicates stability; divergence signals unaddressed shifts or subgrouping issues.
  • Control chart OOC rate: percentage of points/batches beyond 3σ limits or violating rules.
  • Cp/Cpk and Pp/Ppk trend: sustained capability above targets with narrowing gap over time.
  • MSA %GRR: gage contribution as a share of total variation; target “acceptable” by policy.
  • Baseline refresh cadence: timely recalculation after validated process changes (MOC).

These indicators show whether σ is meaningful and the process is under statistical control with adequate measurement fidelity.

14) Common Pitfalls & How to Avoid Them

  • Using overall σ for control limits. Overall σ is typically wider than within σ, so the limits inflate and the chart goes numb to special causes; base limits on within‑subgroup σ.
  • Mixing estimators. Don’t compute Cp with σoverall while computing Cpk with σwithin; declare which family you use.
  • Ignoring non‑normality. Check distributions; use transformations or percentile‑based capability where appropriate.
  • No MSA. Without gage characterization, σ is a guess at best. Run and maintain MSA.
  • Poor subgrouping. Never mix different conditions (materials, lines, shifts) inside a subgroup; you’ll inflate σ and hide signals.
  • Spreadsheet drift. Lock methods into validated systems with audit trails and governed SOPs.

15) What Belongs in the σ Record

Identify the characteristic (units, spec), data source and period, sampling plan (subgroup size, frequency), the estimator and constants used, normality assessment, outlier handling, gage status, and links to the Control Plan, SPC charts, and capability reports. Include references to governing SOPs and any MOC that altered the distribution (e.g., new tooling). Store everything under controlled records with traceable approvals.

16) How This Fits with V5 by SG Systems Global

Built‑in SPC with the right σ. The V5 platform computes both within and overall σ from shop‑floor or lab data captured in the V5 MES and LIMS contexts. It applies appropriate constants (R̄/d2, S̄/c4, MR̄/d2) by chart type, sets 3σ limits automatically, and prevents mixing estimators across Cp/Cpk vs Pp/Ppk reports.

Rational subgrouping & normality checks enforced. V5 encodes subgroup rules in SOPs and sampling routes, making operators select consecutive parts or scheduled pulls. The SPC engine runs normality tests, flags non‑normal data, and suggests transformations or percentile‑based capability—logged under Document Control so methods are consistent and auditable.

MSA and gage linkage. V5 tracks gage assets and MSA outcomes; if %GRR is out of policy, capability dashboards display warnings and prevent publishing misleading Cp/Cpk externally.

Streaming CPV & alerts. For continuous data or PAT signals, V5 updates σ in near real‑time and raises alert/action notifications before product crosses specification, tying signals to investigation workflows and CAPA.

Change control & baselines. When recipes, equipment, or methods change, V5 triggers MOC and requires a new σ baseline for CPV. All versions are effective‑dated and retained for trend comparisons and regulatory defense.

Bottom line: V5 turns σ from a spreadsheet artifact into a governed, real‑time metric that drives SPC, capability, and release decisions with traceable logic and compliant evidence.

17) FAQ

Q1. When do I use n−1 in the denominator?
When estimating σ from a sample to represent a larger population or future production, use Bessel’s correction (n−1), which removes the bias in the variance estimate. For descriptive statistics of a complete finite batch, n is acceptable but usually less relevant for SPC.
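
Python’s standard library exposes exactly this split, which makes a convenient sanity check:

    import statistics

    batch = [10.1, 9.9, 10.0, 10.2, 9.8]
    print(statistics.stdev(batch))     # n-1 denominator: estimating beyond the sample
    print(statistics.pstdev(batch))    # n denominator: describing this batch only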

Q2. Why do Cpk and Ppk differ?
Cpk uses within‑subgroup σ (short‑term); Ppk uses overall σ (long‑term). Big gaps indicate instability, poor subgrouping, or shifts/seasonality that should be addressed.

Q3. How many data points do I need for a reliable σ?
As a rule of thumb, collect ≥25 subgroups of size 3–5 or ≥100 individual points for an initial baseline. Update after validated changes or when control charts show sustained shifts.

Q4. What if the distribution isn’t normal?
Consider Box‑Cox transformation, fit an appropriate family (e.g., lognormal), or use percentile‑based capability. Document the choice and use consistent methods over time.

Q5. Can σ be negative?
No. Negative results indicate a calculation or data pipeline error; check constants, units, and data integrity.

Q6. Should I delete outliers before computing σ?
Not by default. Investigate as potential special causes. Exclude only with documented rationale, then recalculate baselines; otherwise you risk masking true process signals.


Related Reading
• SPC & Limits: Statistical Process Control | Control Limits | Alert/Action Limits
• Capability & Performance: Cp/Cpk | CPV | KPIs
• Measurement & Methods: MSA | Sampling Plans | TMV
• Governance & Integrity: Document Control | Audit Trail | Data Integrity


