The formatter takes a .jl file as input and produces an idealized, formatted .jl file as output. Some formatters mutate the state of the current file. JuliaFormatter takes a different approach: it first generates a canonical output and then mutates that canonical output so that it adheres to the indent and margin constraints.
The source code is parsed with CSTParser.jl, which returns a CST (Concrete Syntax Tree). A CST is a one-to-one mapping of the language to a tree form. In most cases a more compact AST (Abstract Syntax Tree) representation is desired. However, since formatting manipulates the source text itself, the richer representation of a CST is incredibly useful.
Once the CST is created, it is used to generate an FST (Formatted Syntax Tree).
Note: this is not an actual term, just something I made up. Essentially it's a CST with additional formatting-specific metadata.
The important property of an FST is that any .jl files that are syntactically the same (whitespace is irrelevant) produce an identical FST. For example:

    # p1.jl
    a =
       foo(a, b,
            c,d)

and

    # p2.jl
    a = foo(a,
    b,
    c,d)

Both will produce the same FST, which printed would look like:

    # fst output
    a = foo(a, b, c, d)
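The whitespace-insensitivity can be seen in miniature with Julia's built-in parser. This is only an analogy: `Meta.parse` returns an AST (an `Expr`), not the CST or FST that JuliaFormatter builds, and the input strings are made up for illustration. Two sources that differ only in whitespace parse to equal trees, so printing either tree gives the same text.

```julia
# Two sources that differ only in whitespace and line breaks (hypothetical inputs).
p1 = "a =\n    foo(a, b,\n        c,d)"
p2 = "a = foo(a,\nb,\nc,d)"

# Meta.parse builds an AST (Expr); it stands in here for the richer FST.
ex1 = Meta.parse(p1)
ex2 = Meta.parse(p2)

# Both sources yield structurally equal trees ...
println(ex1 == ex2)

# ... so printing either tree gives the same canonical text.
println(string(ex1) == string(ex2))
```

An AST discards comments and exact token text, which is why the formatter itself works from a CST instead.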
So what does a typical FST look like?
Code and comments are indented to match surrounding code blocks. Unnecessary whitespace is removed. Newlines in between code blocks are untouched.
If the expression can be put on a single line, it will be. It doesn't matter if it's a function call with 120 arguments, making it 1000 characters long. During this initial stage it will be put on a single line.
If the expression has a structure to it, such as a try or if block or a struct definition, it will be spread across multiple lines appropriately:

    # original source
    try a1;a2 catch e b1;b2 finally c1;c2 end

    ->

    # printed FST
    try
        a1
        a2
    catch e
        b1
        b2
    finally
        c1
        c2
    end
With the FST representation it's much easier to determine when and how lines should be broken.
During the nesting stage the original FST is mutated to adhere to the margin specification.
Throughout the previous stage, while the FST was being generated, PLACEHOLDER nodes were being inserted at various points. These can be converted to NEWLINE nodes during nesting, which is how lines are broken.
Assume we had a function call which went over the margin.

    begin
        foo = funccall(argument1, argument2, ..., argument120) # way over margin limit !!!
    end

It would be nested to

    begin
        foo = funccall(
            argument1,
            argument2,
            ...,
            argument120
        ) # way over margin limit !!!
    end
You can read how code is nested in the style section.
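The placeholder-to-newline conversion described above can be sketched with a toy model. Everything here (`Placeholder`, `Newline`, `render`, `nest`, and the `base` and `indent` keywords) is hypothetical and invented for illustration. JuliaFormatter's real node types and nesting rules are richer and depend on the style. The sketch only shows the core idea: if the one-line form exceeds the margin, each placeholder becomes a line break.

```julia
# Toy model only: every name here is hypothetical, not JuliaFormatter's internals.
struct Placeholder end          # a point where the line *may* be broken
struct Newline                  # an actual break, followed by `indent` spaces
    indent::Int
end

# Print the nodes: strings verbatim, placeholders as nothing, newlines as breaks.
function render(nodes)
    io = IOBuffer()
    for n in nodes
        if n isa Newline
            print(io, '\n', ' '^n.indent)
        elseif n isa AbstractString
            print(io, n)
        end
    end
    return String(take!(io))
end

# If the one-line form exceeds `margin`, convert each placeholder into a newline.
# Arguments get one extra indent level; the closing paren returns to `base`.
function nest(nodes, margin; base = 0, indent = 4)
    length(render(nodes)) <= margin && return nodes
    nested = Any[]
    for (i, n) in enumerate(nodes)
        if n isa Placeholder
            closes = i < length(nodes) && nodes[i+1] == ")"
            push!(nested, Newline(closes ? base : base + indent))
        elseif n == " " && i < length(nodes) && nodes[i+1] isa Placeholder
            continue            # drop a space that would otherwise trail the line
        else
            push!(nested, n)
        end
    end
    return nested
end

call = Any["foo = funccall(", Placeholder(), "argument1,", " ", Placeholder(),
           "argument2,", " ", Placeholder(), "argument3", Placeholder(), ")"]

println(render(nest(call, 80)))   # fits within the margin: stays on one line
println(render(nest(call, 20)))   # too wide: every placeholder becomes a newline
```

Keeping the break points in the tree as inert placeholders means the first stage never has to think about the margin. The decision to break is deferred to a single later pass.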
Once the FST has been nested, it is printed out to a file and voila! You have a formatted version of your code!