If you are coming from Python or R, Julia's runtime performance is intoxicating, but its startup latency is sobering. The "Time-to-First-Plot" (TTFP) is the most notorious offender. You write a CLI script to process a CSV and generate a chart, but the user spends 15 seconds waiting for using Plots and the first plot() call to complete before anything actually happens.
While Julia 1.9+ introduced native code caching, which significantly reduced first-call latency for many packages, heavy workflows (especially those involving Plots.jl, DataFrames.jl, or DifferentialEquations.jl) still suffer from noticeable compilation lag.
For CLI tools and repetitive scripting tasks, this latency is unacceptable. The solution is PackageCompiler.jl.
The Root Cause: Just-In-Time Compilation
To fix TTFP, you must understand why it exists. Julia is a Just-In-Time (JIT) compiled language. When you run a script, the following happens:
- Parsing: Code is converted to an AST.
- Lowering: AST is converted to intermediate representation (IR).
- Type Inference: The compiler infers the types flowing through each method for the concrete argument types it is called with.
- LLVM Codegen: Julia generates LLVM IR.
- Native Code Generation: LLVM compiles IR to machine code.
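The stages listed above can be inspected directly from Julia. The following is a minimal sketch using the reflection functions from the InteractiveUtils standard library; square_plus_one is a hypothetical toy function invented for illustration and is not part of the workflow in this article.

```julia
using InteractiveUtils  # stdlib; loaded automatically in the REPL, but not in scripts

# Hypothetical toy function used only to illustrate the pipeline.
square_plus_one(x) = x * x + 1

# Lowering: the lowered form of the method.
lowered = code_lowered(square_plus_one, (Float64,))

# Type inference: performed per concrete signature, here (Float64,).
typed = code_typed(square_plus_one, (Float64,))
inferred_return = typed[1].second

# LLVM codegen and native code generation, captured as strings.
llvm_ir = sprint(io -> code_llvm(io, square_plus_one, (Float64,)))
native_asm = sprint(io -> code_native(io, square_plus_one, (Float64,)))

println("Inferred return type: ", inferred_return)
println("LLVM IR length: ", length(llvm_ir), " characters")
println("Native assembly length: ", length(native_asm), " characters")
```

Calling the same functions with (Int,) instead of (Float64,) yields different typed and native output, which is why each new combination of argument types can trigger a fresh round of compilation.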
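Standard precompilation (the .ji cache files generated when a package is precompiled) stores lowered code and type inference results. Since Julia 1.9, native machine code is cached alongside them, but only for the method signatures that a package's own precompile workload happens to cover. Code specific to your script, such as a call that combines types from several packages, is often still compiled at runtime.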
When you call plot(), Julia compiles the specific method signatures required for that call right then and there. A custom Sysimage solves this by performing that compilation ahead of time and serializing the resulting machine code into a shared library (.so, .dll, or .dylib) that the Julia executable loads instantly at startup.
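The cost of compiling on first use is easy to observe without any plotting package. The sketch below times the same call twice; sum_of_squares is a hypothetical stand-in for an expensive first call such as plot(), and the absolute timings will vary by machine.

```julia
# Hypothetical workload standing in for an expensive first call such as plot().
sum_of_squares(v) = sum(x -> x^2 + 1, v)

data = rand(1_000)

# First call: includes type inference and native code generation for
# sum_of_squares(::Vector{Float64}).
first_call = @elapsed sum_of_squares(data)

# Second call: reuses the machine code compiled during the first call.
second_call = @elapsed sum_of_squares(data)

println("First call:  ", first_call, " s")
println("Second call: ", second_call, " s")
```

A custom sysimage aims to make the first call in a fresh process behave like the second call here, because the compiled code is already present when Julia starts.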
The Fix: Building a Custom Sysimage
We will create a custom system image that bakes in DataFrames and Plots along with the compiled paths for a scatter plot.
1. Project Setup
Create a fresh environment to ensure clean dependencies.
# inside a Julia REPL
import Pkg
Pkg.activate("FastPlots")
Pkg.add(["Plots", "DataFrames", "CSV", "PackageCompiler"])
2. Create the Trace Script
This is the most critical step. PackageCompiler needs to know exactly which functions to compile. If you include the packages but never call their functions, the code paths your application actually uses remain uncompiled.
Create a file named precompile_execution.jl. This script acts as a "trace." It runs your heavy functions so the compiler can see them, but it doesn't need to produce real output.
# precompile_execution.jl
using Plots
using DataFrames
using CSV
# 1. Exercise DataFrames and CSV
# Create dummy data representative of your actual workload
df = DataFrame(A = rand(100), B = rand(100))
# Write to buffer to trigger CSV writing code paths
io = IOBuffer()
CSV.write(io, df)
# 2. Exercise Plots
# We use the specific backend you intend to ship (GR is default)
gr()
# Run the plot command.
# We deliberately do not call display(p), so no window opens during the build;
# savefig below still exercises the rendering code paths.
p = plot(df.A, df.B, seriestype=:scatter, title="Precompile Tracing")
# Exercise saving, as that involves different code paths (libs like libpng/Cairo)
savefig(p, tempname() * ".png")
println("Trace complete.")
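Executing a workload is one way to tell the compiler which method signatures matter; the same information can also be stated explicitly. As a rough sketch, Base's precompile function compiles a method for a given tuple of argument types without running it. Here summarize is a hypothetical function used only for illustration. PackageCompiler's create_sysimage also accepts a precompile_statements_file keyword for a file of such statements, as an alternative to precompile_execution_file.

```julia
# Hypothetical function standing in for application code.
summarize(v::Vector{Float64}) = (mean = sum(v) / length(v), max = maximum(v))

# Request compilation for one concrete signature without calling the function.
# `precompile` returns true if the method was successfully compiled.
compiled = precompile(summarize, (Vector{Float64},))
println("Precompiled summarize(::Vector{Float64}): ", compiled)

# A later call with that signature can reuse the code compiled above.
stats = summarize([1.0, 2.0, 3.0])
println(stats)
```

The execution-based trace used in this article is usually less brittle, since it captures internal helper methods automatically; explicit statements are mainly useful when running the real workload during a build is impractical.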
3. Build the Sysimage
Create a build script named build_sysimage.jl. This script instructs PackageCompiler to build a new sysimage file, separate from the default Julia sysimage, containing our dependencies and the compiled traces from step 2.
# build_sysimage.jl
using PackageCompiler
println("Building custom sysimage. This will take a few minutes...")
create_sysimage(
    # The packages to bake into the image
    [:Plots, :DataFrames, :CSV];
    # The script that exercises the functions to compile
    precompile_execution_file = "precompile_execution.jl",
    # Output file name
    sysimage_path = "sys_plots.so",
    # CPU target: the default app target is a portable, multi-architecture
    # target, so the image also runs on machines other than the build host
    cpu_target = PackageCompiler.default_app_cpu_target()
)
println("Sysimage built successfully: sys_plots.so")
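Run this script from your terminal, from inside the FastPlots project directory so that --project=. resolves to the environment created in step 1: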
julia --project=. build_sysimage.jl
Note: This process consumes significant RAM and CPU. It will take 2–5 minutes depending on your hardware.
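The build script hardcodes a .so file name, while the conventional shared-library extension differs by platform. The following is a small sketch, under the assumption that you want the conventional name on each operating system, using the Libdl standard library. If you change the name, pass the same path to sysimage_path in the build script and to -J when launching Julia.

```julia
using Libdl  # stdlib

# Libdl.dlext is the platform's shared-library extension:
# "so" on Linux, "dylib" on macOS, "dll" on Windows.
sysimage_name = "sys_plots." * Libdl.dlext
println("Sysimage file name for this platform: ", sysimage_name)

# After a build, a quick sanity check that the file was produced.
if isfile(sysimage_name)
    size_mib = round(filesize(sysimage_name) / 2^20; digits = 1)
    println("Found ", sysimage_name, " (", size_mib, " MiB)")
else
    println(sysimage_name, " not found; has the build script been run?")
end
```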
4. Running with the Sysimage
Once sys_plots.so is generated, you use the -J (or --sysimage) flag to load it.
Create a test script app.jl:
# app.jl
# Note: Plots and DataFrames are already loaded inside the sysimage, but we
# still need `using` to bring their exported names into scope here.
# With the sysimage, these `using` lines should return almost immediately.
using Plots
using DataFrames
println("Generating plot...")
@time begin
    df = DataFrame(x = rand(1000), y = rand(1000))
    p = scatter(df.x, df.y, title="Instant Plotting")
    savefig(p, "output_plot.png")
end
println("Done.")
Execute it using your new sysimage:
julia --project=. -J sys_plots.so app.jl
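If startup is still slow, it is worth confirming that the custom image was actually picked up. A minimal check, assuming the sys_plots.so file name used in this article, reads the sysimage path from the running process; the same snippet also works under the default sysimage, where it simply reports false.

```julia
# Path of the sysimage the current Julia process was started with.
loaded_image = unsafe_string(Base.JLOptions().image_file)
println("Running with sysimage: ", loaded_image)

# True only when Julia was launched with `-J sys_plots.so`.
using_custom_image = endswith(loaded_image, "sys_plots.so")
println("Custom sysimage active: ", using_custom_image)

# Packages baked into a sysimage are already loaded before any `using` runs.
preloaded = sort([String(pkg.name) for pkg in keys(Base.loaded_modules)])
println("Plots preloaded: ", "Plots" in preloaded)
```

Base.loaded_modules is an internal table rather than a documented API, so treat the last check as a debugging aid and not something to ship.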
The Results: Benchmark
Here is the difference between a default run and a sysimage run of the script above on an M1 MacBook Pro. Note that @time covers only the plotting block, not the time spent in the using statements.
Standard Julia Runtime:
Generating plot...
14.234102 seconds (25.4 M allocations: 1.8 GiB, 4.2% gc time)
With Custom Sysimage:
Generating plot...
0.412030 seconds (150 k allocations: 9.5 MiB)
We reduced the execution time by roughly 97%.
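The percentage follows directly from the two timings reported above. As a quick check, this sketch recomputes it and also expresses the same numbers as a speedup factor:

```julia
baseline_seconds = 14.234102   # standard runtime, from the benchmark above
sysimage_seconds = 0.412030    # custom sysimage, from the benchmark above

# Fraction of the original time that was eliminated, as a percentage.
reduction_pct = (baseline_seconds - sysimage_seconds) / baseline_seconds * 100

# How many times faster the sysimage run is.
speedup = baseline_seconds / sysimage_seconds

println("Reduction: ", round(reduction_pct; digits = 1), "%")
println("Speedup:   ", round(speedup; digits = 1), "x")
```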
Why This Works
When you launch Julia with -J sys_plots.so, the memory space of the process is pre-populated with the heap state that existed at the end of your build process.
- Deserialization vs. Parsing: Instead of reading source code and parsing it, Julia maps the shared library into memory.
- Skipped Inference: The types for plot(::Vector{Float64}, ::Vector{Float64}) are already inferred and stored.
- Skipped Codegen: The machine code instructions for the GR backend to draw a scatter plot are already compiled. The CPU jumps directly to those addresses.
Conclusion
Julia's JIT compiler is powerful, but it imposes a tax on startup latency. By using PackageCompiler.jl to create custom sysimages, you move that tax from runtime (every time the user runs the script) to build time (once per release).
For production data pipelines, CLI tools, or containerized Julia microservices, shipping a custom sysimage is not optional—it is the standard for professional Julia engineering.