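What is JMH
JMH, the Java Microbenchmark Harness, is a toolkit and API built specifically for microbenchmarking code.
JMH is developed by the same OpenJDK/Oracle engineers who work on the Java compilers. What is a microbenchmark? Put simply, it is a benchmark at the method level, with precision down to the microsecond level or finer.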
Why JMH is needed
Pitfall 1: dead code elimination
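A minimal sketch of how dead code elimination can distort a measurement, together with the two usual remedies. The class name DeadCodeBenchmark, the package org.sample and the choice of Math.log as the workload are illustrative assumptions, not taken from the original text.

```java
package org.sample;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Thread)
public class DeadCodeBenchmark {

    // Non-final field, so the JIT cannot treat the input as a constant.
    private double x = Math.PI;

    // Pitfall: the result is never used, so the JIT may remove the call
    // and the benchmark may end up measuring an almost empty method.
    @Benchmark
    public void wrong() {
        Math.log(x);
    }

    // Remedy 1: return the result; JMH consumes returned values.
    @Benchmark
    public double returned() {
        return Math.log(x);
    }

    // Remedy 2: pass the result to a Blackhole (handy when there are several values).
    @Benchmark
    public void consumed(Blackhole bh) {
        bh.consume(Math.log(x));
    }
}
```

If wrong() scores far better than the other two methods, that gap is most likely the eliminated computation rather than a real speed-up.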
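Pitfall 2: constant folding and constant propagation
Constant folding is the process of simplifying constant expressions at compile time. A constant in an expression simply stands for a plain value, such as the integer 2.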
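The following sketch shows how constant folding can creep into a benchmark and one common way to avoid it. The class name ConstantFoldBenchmark, the package org.sample and the field names are hypothetical; the idea is simply to read inputs from non-final state.

```java
package org.sample;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
public class ConstantFoldBenchmark {

    // A final field initialized with a literal is a compile-time constant:
    // any expression built only from it can be precomputed.
    private final int constant = 2;

    // A non-final instance field has to be loaded at run time.
    private int value = 2;

    // Pitfall: the expression can be folded to the literal 4,
    // so no multiplication is performed while the benchmark runs.
    @Benchmark
    public int folded() {
        return constant * constant;
    }

    // Remedy: take the inputs from non-final state so the work remains.
    @Benchmark
    public int measured() {
        return value * value;
    }
}
```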
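Things to keep in mind with JMH
- Warm up before measuring.
- Keep useless code out of the benchmark methods.
- Concurrent testing.
- Presentation of the results.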
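JMH is typically used in the following scenarios:
- 1. You have already identified a hot method and want to optimize it further; JMH lets you quantify the effect of each optimization.
- 2. You want to know quantitatively how long a function takes to execute, and how that time depends on the input size n.
- 3. A function has two different implementations (for example, JSON serialization/deserialization with Jackson or with Gson) and you do not know which one performs better.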
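JMH concepts
- Iteration: the smallest unit of measurement in JMH; it consists of a group of invocations.
- Invocation: a single call of the benchmark method.
- Operation: one execution of the measured operation inside the benchmark method. If the measured operation runs in a loop inside the benchmark method, @OperationsPerInvocation can declare the loop count so that the result reflects the cost of a single operation.
- Warmup: a warmup phase that runs before the actual measurement. Once a function has been called many times the JIT compiles it, so warming up brings the measured result closer to real-world behavior.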
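A small sketch of the relationship between invocation and operation described above. The class name LoopBenchmark, the package org.sample and the loop count of 100 are assumptions chosen for illustration.

```java
package org.sample;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.OperationsPerInvocation;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
public class LoopBenchmark {

    // Hypothetical loop count; it must match the annotation value below.
    private static final int OPS = 100;

    // Non-final input so the loop result is not a compile-time constant.
    private int seed = 1;

    // One invocation performs OPS additions. JMH scales the score by OPS,
    // so the reported number refers to a single operation.
    @Benchmark
    @OperationsPerInvocation(OPS)
    public int sumLoop() {
        int acc = seed;
        for (int i = 0; i < OPS; i++) {
            acc += i;
        }
        return acc;
    }
}
```

Loops inside benchmark methods can themselves be optimized by the JIT, so per-operation numbers obtained this way are best treated as estimates.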
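Adding the Maven dependencies
The project needs the jmh-core and jmh-generator-annprocess artifacts from the org.openjdk.jmh group.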
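package io.github.dunwu.javalib.jmh;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.*;
import org.openjdk.jmh.runner.options.*;
import java.util.concurrent.TimeUnit;
// Mode, time unit, threads and forks match the report shown later in this article.
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Warmup(iterations = 3)
@Measurement(iterations = 10, time = 5, timeUnit = TimeUnit.SECONDS)
@Threads(8)
@Fork(2)
public class StringBuilderBenchmark {
    @Benchmark
    public String testStringAdd() {
        String a = "";
        for (int i = 0; i < 10; i++) {
            a += i;
        }
        return a; // return the result so the JIT cannot discard the loop as dead code
    }
    @Benchmark
    public String testStringBuilderAdd() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10; i++) {
            sb.append(i);
        }
        return sb.toString();
    }
    public static void main(String[] args) throws RunnerException {
        Options options = new OptionsBuilder()
            .include(StringBuilderBenchmark.class.getSimpleName()) // run only this class
            .build();
        new Runner(options).run();
    }
}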
Running JMH
(1) Initialize the benchmarking project
$ mvn archetype:generate \
-DinteractiveMode=false \
-DarchetypeGroupId=org.openjdk.jmh \
-DarchetypeArtifactId=jmh-java-benchmark-archetype \
-DgroupId=org.sample \
-DartifactId=test
(2) Build the benchmark
$ cd test/
$ mvn clean install
(3) Run the benchmark
$ java -jar target/benchmarks.jar
Running the main method
Run the main method and wait patiently for the results. A report is produced at the end that looks roughly like this:
# JMH version: 1.22
# VM version: JDK 1.8.0_181, Java HotSpot(TM) 64-Bit Server VM, 25.181-b13
# VM invoker: C:\Program Files\Java\jdk1.8.0_181\jre\bin\java.exe
# VM options: -javaagent:D:\Program Files\JetBrains\IntelliJ IDEA 2019.2.3\lib\idea_rt.jar=58635:D:\Program Files\JetBrains\IntelliJ IDEA 2019.2.3\bin -Dfile.encoding=UTF-8
# Warmup: 3 iterations, 10 s each
# Measurement: 10 iterations, 5 s each
# Timeout: 10 min per iteration
# Threads: 8 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: io.github.dunwu.javalib.jmh.StringBuilderBenchmark.testStringAdd
# Run progress: 0.00% complete, ETA 00:05:20
# Fork: 1 of 2
# Warmup Iteration 1: 21803.050 ops/ms
# Warmup Iteration 2: 22501.860 ops/ms
# Warmup Iteration 3: 20953.944 ops/ms
Iteration 1: 21627.645 ops/ms
Iteration 2: 21215.269 ops/ms
Iteration 3: 20863.282 ops/ms
Iteration 4: 21617.715 ops/ms
Iteration 5: 21695.645 ops/ms
Iteration 6: 21886.784 ops/ms
Iteration 7: 21986.899 ops/ms
Iteration 8: 22389.540 ops/ms
Iteration 9: 22507.313 ops/ms
Iteration 10: 22124.133 ops/ms
# Run progress: 25.00% complete, ETA 00:04:02
# Fork: 2 of 2
# Warmup Iteration 1: 22262.108 ops/ms
# Warmup Iteration 2: 21567.804 ops/ms
# Warmup Iteration 3: 21787.002 ops/ms
Iteration 1: 21598.970 ops/ms
Iteration 2: 22486.133 ops/ms
Iteration 3: 22157.834 ops/ms
Iteration 4: 22321.827 ops/ms
Iteration 5: 22477.063 ops/ms
Iteration 6: 22154.760 ops/ms
Iteration 7: 21561.095 ops/ms
Iteration 8: 22194.863 ops/ms
Iteration 9: 22493.844 ops/ms
Iteration 10: 22568.078 ops/ms
Result "io.github.dunwu.javalib.jmh.StringBuilderBenchmark.testStringAdd":
21996.435 ±(99.9%) 412.955 ops/ms [Average]
(min, avg, max) = (20863.282, 21996.435, 22568.078), stdev = 475.560
CI (99.9%): [21583.480, 22409.390] (assumes normal distribution)
# JMH version: 1.22
# VM version: JDK 1.8.0_181, Java HotSpot(TM) 64-Bit Server VM, 25.181-b13
# VM invoker: C:\Program Files\Java\jdk1.8.0_181\jre\bin\java.exe
# VM options: -javaagent:D:\Program Files\JetBrains\IntelliJ IDEA 2019.2.3\lib\idea_rt.jar=58635:D:\Program Files\JetBrains\IntelliJ IDEA 2019.2.3\bin -Dfile.encoding=UTF-8
# Warmup: 3 iterations, 10 s each
# Measurement: 10 iterations, 5 s each
# Timeout: 10 min per iteration
# Threads: 8 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: io.github.dunwu.javalib.jmh.StringBuilderBenchmark.testStringBuilderAdd
# Run progress: 50.00% complete, ETA 00:02:41
# Fork: 1 of 2
# Warmup Iteration 1: 241500.886 ops/ms
# Warmup Iteration 2: 134206.032 ops/ms
# Warmup Iteration 3: 86907.846 ops/ms
Iteration 1: 86143.339 ops/ms
Iteration 2: 74725.356 ops/ms
Iteration 3: 72316.121 ops/ms
Iteration 4: 77319.716 ops/ms
Iteration 5: 83469.256 ops/ms
Iteration 6: 87712.360 ops/ms
Iteration 7: 79421.899 ops/ms
Iteration 8: 80867.839 ops/ms
Iteration 9: 82619.163 ops/ms
Iteration 10: 87026.928 ops/ms
# Run progress: 75.00% complete, ETA 00:01:20
# Fork: 2 of 2
# Warmup Iteration 1: 228342.337 ops/ms
# Warmup Iteration 2: 124737.248 ops/ms
# Warmup Iteration 3: 82598.851 ops/ms
Iteration 1: 86877.318 ops/ms
Iteration 2: 89388.624 ops/ms
Iteration 3: 88523.558 ops/ms
Iteration 4: 87547.332 ops/ms
Iteration 5: 88376.087 ops/ms
Iteration 6: 88848.837 ops/ms
Iteration 7: 85998.124 ops/ms
Iteration 8: 86796.998 ops/ms
Iteration 9: 87994.726 ops/ms
Iteration 10: 87784.453 ops/ms
Result "io.github.dunwu.javalib.jmh.StringBuilderBenchmark.testStringBuilderAdd":
84487.902 ±(99.9%) 4355.525 ops/ms [Average]
(min, avg, max) = (72316.121, 84487.902, 89388.624), stdev = 5015.829
CI (99.9%): [80132.377, 88843.427] (assumes normal distribution)
# Run complete. Total time: 00:05:23
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.
Benchmark Mode Cnt Score Error Units
StringBuilderBenchmark.testStringAdd thrpt 20 21996.435 ± 412.955 ops/ms
StringBuilderBenchmark.testStringBuilderAdd thrpt 20 84487.902 ± 4355.525 ops/ms
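Let's now go through the commonly used JMH APIs.
@BenchmarkMode sets the benchmark type. The example here uses Throughput. The available modes are:
- Throughput: overall throughput, for example "how many invocations can be executed in 1 second".
- AverageTime: the average time per invocation, for example "each invocation takes xxx milliseconds on average".
- SampleTime: random sampling; the output is the distribution of the sampled results, for example "99% of invocations finish within xxx milliseconds, 99.99% within xxx milliseconds".
- SingleShotTime: in all of the modes above an iteration runs for a fixed duration (the warmup iterations in the JMH 1.22 report above show the default of 10 s; older versions defaulted to 1 s), whereas SingleShotTime runs the method only once. It is often combined with a warmup count of 0 to measure cold-start performance.
- All: all of the modes.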
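As a rough illustration of the modes listed above, the sketch below measures the same work once as throughput and once in two time-based modes. The class name ModeBenchmark, the package org.sample and the helper build are hypothetical.

```java
package org.sample;

import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class ModeBenchmark {

    private int n = 10;

    // Reported as operations per microsecond.
    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    public String throughput() {
        return build(n);
    }

    // The same work in two modes at once: average time per call
    // and a sampled latency distribution (percentiles).
    @Benchmark
    @BenchmarkMode({Mode.AverageTime, Mode.SampleTime})
    public String latency() {
        return build(n);
    }

    // Shared workload: concatenates the numbers 0..count-1.
    private static String build(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(i);
        }
        return sb.toString();
    }
}
```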
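As mentioned above, a benchmark needs a warmup phase before measurement, which is what @Warmup configures. The first few runs of a program are usually slower, so a few warmup rounds are run first to keep the measurement accurate. Its iterations parameter is simply the number of warmup rounds.
Why is warmup needed? Because of the JVM's JIT compiler: after a function has been called many times, the JVM tries to compile it to machine code to make it run faster. Warmup is therefore needed for the benchmark result to be closer to real-world behavior.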
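@Measurement configures the actual measurement phase:
- iterations: the number of measurement rounds.
- time: the duration of each round.
- timeUnit: the unit of that duration.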
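The warmup and measurement settings described above might be combined as in the sketch below. The class name WarmupBenchmark, the package org.sample and the specific round counts and durations are assumptions chosen only to keep the run short.

```java
package org.sample;

import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

@State(Scope.Thread)
// 3 warmup rounds of 1 s each: executed so the JIT can compile the hot path, but not scored.
@Warmup(iterations = 3, time = 1, timeUnit = TimeUnit.SECONDS)
// 5 measured rounds of 2 s each: only these contribute to the reported score.
@Measurement(iterations = 5, time = 2, timeUnit = TimeUnit.SECONDS)
public class WarmupBenchmark {

    private int n = 1_000;

    @Benchmark
    public long sumOfSquares() {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }
}
```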
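@Threads sets the number of benchmark threads in each process. Choose it according to the situation; a common rule of thumb is twice the number of CPU cores, although the right value depends on the workload.
@Fork sets the number of forks. With a fork count of 2, JMH forks two separate processes to run the benchmark.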
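A short sketch of thread and fork settings applied to a single benchmark method. The class name ConcurrencyBenchmark, the package org.sample, the thread count of 4 and the shared counter are illustrative assumptions.

```java
package org.sample;

import java.util.concurrent.atomic.AtomicLong;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;

// Scope.Benchmark: a single instance is shared by all benchmark threads,
// so the counter below is contended on purpose.
@State(Scope.Benchmark)
public class ConcurrencyBenchmark {

    private final AtomicLong counter = new AtomicLong();

    // Runs with 4 concurrent threads in each of 2 separately forked JVMs.
    @Benchmark
    @Threads(4)
    @Fork(2)
    public long increment() {
        return counter.incrementAndGet();
    }
}
```

Forking matters because each fork starts from a fresh JVM, which keeps JIT profiles collected by one benchmark from influencing another.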
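@Benchmark is a method-level annotation that marks a method as a benchmark target; it is used much like JUnit's @Test.
@Param is a field-level annotation that specifies several values for a parameter. It is especially useful for measuring how a function performs with different inputs.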
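A minimal sketch of a parameterized benchmark along the lines described above. The class name ParamBenchmark, the package org.sample, the field name size and the chosen values are hypothetical.

```java
package org.sample;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

// @Param fields must live in a @State class.
@State(Scope.Benchmark)
public class ParamBenchmark {

    // JMH runs the benchmark once per listed value; values are given
    // as strings and converted to the field type.
    @Param({"10", "100", "1000"})
    public int size;

    @Benchmark
    public String concat() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < size; i++) {
            sb.append(i);
        }
        return sb.toString();
    }
}
```

The report then contains one row per value of size, which makes it easy to see how the score changes with the input.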
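When @Setup is used, the class must be annotated with @State; otherwise JMH reports that the benchmark cannot run.
@State declares that a class is a "state" and takes a Scope argument that defines how widely the state is shared. Many benchmarks need classes that hold state, and JMH lets you inject them into benchmark methods in a dependency-injection style. There are three scopes:
- Thread: the state is private to each thread.
- Group: the state is shared by all threads in the same group.
- Benchmark: the state is shared by all threads.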
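The sketch below shows two of the scopes above injected into one benchmark method, with a @Setup method preparing the per-thread data. All class, field and method names here (StateBenchmark, ThreadState, SharedState, fill, sum) and the package org.sample are assumptions for illustration.

```java
package org.sample;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

public class StateBenchmark {

    // Each benchmark thread gets its own instance.
    @State(Scope.Thread)
    public static class ThreadState {
        public List<Integer> data;

        // Runs once per trial, before the iterations; not part of the measurement.
        @Setup(Level.Trial)
        public void fill() {
            data = new ArrayList<>();
            for (int i = 0; i < 1000; i++) {
                data.add(i);
            }
        }
    }

    // A single instance is shared by all benchmark threads.
    @State(Scope.Benchmark)
    public static class SharedState {
        public final AtomicLong hits = new AtomicLong();
    }

    // JMH injects the state objects as method arguments.
    @Benchmark
    public long sum(ThreadState ts, SharedState shared) {
        shared.hits.incrementAndGet();
        long total = 0;
        for (int v : ts.data) {
            total += v;
        }
        return total;
    }
}
```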
For how to use State, the official code samples contain good examples.
- Official JMH samples
- Java Gleanings: JMH and 8 benchmarking pitfalls
$ java -jar target/benchmarks.jar -h
Usage: java -jar ... [regexp*] [options]
[opt] means optional argument.
<opt> means required argument.
"+" means comma-separated list of values.
"time" arguments accept time suffixes, like "100ms".
[arguments] Benchmarks to run (regexp+).
-bm <mode> Benchmark mode. Available modes are: [Throughput/thrpt,
AverageTime/avgt, SampleTime/sample, SingleShotTime/ss,
-bs <int> Batch size: number of benchmark method calls per
operation. Some benchmark modes may ignore this
setting, please check this separately.
-e <regexp+> Benchmarks to exclude from the run.
-f <int> How many times to fork a single benchmark. Use 0 to
disable forking altogether. Warning: disabling
forking may have detrimental impact on benchmark
and infrastructure reliability, you might want
to use different warmup mode instead.
-foe <bool> Should JMH fail immediately if any benchmark had
experienced an unrecoverable error? This helps
to make quick sanity tests for benchmark suites,
as well as make the automated runs with checking error
-gc <bool> Should JMH force GC between iterations? Forcing
the GC may help to lower the noise in GC-heavy benchmarks,
at the expense of jeopardizing GC ergonomics decisions.
Use with care.
-h Display help.
-i <int> Number of measurement iterations to do. Measurement
iterations are counted towards the benchmark score.
-jvm <string> Use given JVM for runs. This option only affects forked
-jvmArgs <string> Use given JVM arguments. Most options are inherited
from the host VM options, but in some cases you want
to pass the options only to a forked VM. Either single
space-separated option line, or multiple options
are accepted. This option only affects forked runs.
-jvmArgsAppend <string> Same as jvmArgs, but append these options before
the already given JVM args.
-jvmArgsPrepend <string> Same as jvmArgs, but prepend these options before
the already given JVM arg.
-l List the benchmarks that match a filter, and exit.
-lp List the benchmarks that match a filter, along with
parameters, and exit.
-lprof List profilers.
-lrf List machine-readable result formats.
-o <filename> Redirect human-readable output to a given file.
-opi <int> Override operations per invocation, see @OperationsPerInvocation
Javadoc for details.
-p <param={v,}*> Benchmark parameters. This option is expected to
be used once per parameter. Parameter name and parameter
values should be separated with equals sign. Parameter
values should be separated with commas.
-prof <profiler> Use profilers to collect additional benchmark data.
Some profilers are not available on all JVMs and/or
all OSes. Please see the list of available profilers
with -lprof.
-r <time> Minimum time to spend at each measurement iteration.
Benchmarks may generally run longer than iteration
-rf <type> Format type for machine-readable results. These
results are written to a separate file (see -rff).
See the list of available result formats with -lrf.
-rff <filename> Write machine-readable results to a given file.
The file format is controlled by -rf option. Please
see the list of result formats for available formats.
-si <bool> Should JMH synchronize iterations? This would significantly
lower the noise in multithreaded tests, by making
sure the measured part happens only when all workers
are running.
-t <int> Number of worker threads to run with. 'max' means
the maximum number of hardware threads available
on the machine, figured out by JMH itself.
-tg <int+> Override thread group distribution for asymmetric
benchmarks. This option expects a comma-separated
list of thread counts within the group. See @Group/@GroupThreads
Javadoc for more information.
-to <time> Timeout for benchmark iteration. After reaching
this timeout, JMH will try to interrupt the running
tasks. Non-cooperating benchmarks may ignore this
-tu <TU> Override time unit in benchmark results. Available
time units are: [m, s, ms, us, ns].
-v <mode> Verbosity mode. Available modes are: [SILENT, NORMAL,
-w <time> Minimum time to spend at each warmup iteration. Benchmarks
may generally run longer than iteration duration.
-wbs <int> Warmup batch size: number of benchmark method calls
per operation. Some benchmark modes may ignore this
-wf <int> How many warmup forks to make for a single benchmark.
All iterations within the warmup fork are not counted
towards the benchmark score. Use 0 to disable warmup
-wi <int> Number of warmup iterations to do. Warmup iterations
are not counted towards the benchmark score.
-wm <mode> Warmup mode for warming up selected benchmarks.
Warmup modes are: INDI = Warmup each benchmark individually,
then measure it. BULK = Warmup all benchmarks first,
then do all the measurements. BULK_INDI = Warmup
all benchmarks first, then re-warmup each benchmark
individually, then measure it.
-wmb <regexp+> Warmup benchmarks to include in the run in addition
to already selected by the primary filters. Harness
will not measure these benchmarks, but only use them
for the warmup.
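Several of the command-line flags listed above have counterparts in the OptionsBuilder API used by the main method earlier. The sketch below is one possible mapping; the class name BenchmarkLauncher, the package org.sample and the chosen values are assumptions, and the comments name the flag each call is meant to correspond to.

```java
package org.sample;

import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.openjdk.jmh.runner.options.TimeValue;

public class BenchmarkLauncher {

    // Roughly equivalent to:
    //   java -jar target/benchmarks.jar StringBuilderBenchmark \
    //        -bm thrpt -tu ms -wi 3 -i 10 -r 5s -f 2 -t 8
    public static Options buildOptions() {
        return new OptionsBuilder()
                .include("StringBuilderBenchmark")      // [regexp] benchmarks to run
                .mode(Mode.Throughput)                  // -bm thrpt
                .timeUnit(TimeUnit.MILLISECONDS)        // -tu ms
                .warmupIterations(3)                    // -wi 3
                .measurementIterations(10)              // -i 10
                .measurementTime(TimeValue.seconds(5))  // -r 5s
                .forks(2)                               // -f 2
                .threads(8)                             // -t 8
                .build();
    }

    public static void main(String[] args) throws RunnerException {
        new Runner(buildOptions()).run();
    }
}
```

Settings given this way, like the command-line flags, take precedence over the corresponding annotations on the benchmark class.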