Static Analysis of Ruby
David Worth - dave@highgroove.com

What Language is that?
Lisp?
Scheme?
Clojure?
... I would have accepted any.
Why (study, learn, like) Lisps?
... because they show that computers are Turing machines.
"data" and "code" are indistinguishable in Lisps because they are identically encoded "lists"
Shellcode
another great example of our Turing Machines
Ruby as an acceptable Lisp
Apologies to Steve Yegge but... I posit that Ruby is a perfectly acceptable Lisp.
Even if it lacks some cool Lisp-isms like macros... because it fakes it so well with things like define_method / instance_eval / class_eval
Abstract Syntax Trees
Without teaching a compilers class...
- an Abstract Syntax Tree (AST) is a tree-shaped internal representation of a program's syntactic structure
- used by a compiler as an intermediate representation between your code and the compiled machine code
- used by "interpreted" [1] languages like Ruby or Python [2] to instruct the execution environment (virtual machine) how to manipulate its state.
[1] These "interpreted" languages pretty much aren't anymore. Everything executes in a VM, and it's VMs/turtles all the way down.
[2] Python is complicated, has a byte-compiler, and I am out of my comfort zone talking about the nuts and bolts of it as a language.
Analyzing Ruby
Here's some everyday Ruby:
What the?
Who can tell me what this does?
Side by side...
it is much clearer what's going on
Node types in Ruby's AST
You could almost just write code with every operator and extract the node types manually
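A rough sketch of the "extract the node types manually" idea. It assumes Ripper from Ruby's standard library instead of the Ruby Parser gem, so the type names are Ripper's event names, not the node names in the Ruby source. The helper `node_types` is made up for illustration.

```ruby
require 'ripper'
require 'set'

# Hypothetical helper: collect every node type that appears in a Ripper
# s-expression. A node is an array whose first element is a Symbol.
def node_types(sexp, seen = Set.new)
  return seen unless sexp.is_a?(Array)
  seen << sexp.first if sexp.first.is_a?(Symbol)
  sexp.each { |child| node_types(child, seen) }
  seen
end

# Single-quoted heredoc so the interpolation stays literal source text.
source = <<~'RUBY'
  x = 1 + 2
  puts "total: #{x}" if x > 2
RUBY

types = node_types(Ripper.sexp(source))
puts types.to_a.sort.join(', ')
```

Feeding in more varied code surfaces more node types, which is why this only "almost" works: you never know which constructs you forgot to write.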
or...
Node types in Ruby's AST (con't)
By reading the Ruby source we can learn about the node types:
Do something interesting...
Find string interpolations using the Ruby Parser gem
Find string interpolations (con't)
Which gives us our old friend the s-expression
Find string interpolations (con't)
This is an S-Expression... also known as an array.
So we can simply walk down the tree making decisions on node types
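To make "an s-expression is an array" concrete, here is a small sketch. It assumes Ripper (standard library) as a stand-in for the Ruby Parser gem, and `outline` is a hypothetical helper, not part of either library.

```ruby
require 'ripper'

# Assumption: Ripper stands in for the Ruby Parser gem here.
sexp = Ripper.sexp('greeting = "hello #{name}"')

# Hypothetical helper: depth-first walk that records each node type at its
# depth, deciding what to do based on the first element of each array.
def outline(sexp, depth = 0, lines = [])
  return lines unless sexp.is_a?(Array)
  if sexp.first.is_a?(Symbol)
    lines << "#{'  ' * depth}#{sexp.first}"
    depth += 1
  end
  sexp.each { |child| outline(child, depth, lines) }
  lines
end

puts sexp.class    # the whole tree is an ordinary Array
puts outline(sexp) # indented list of node types
```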
Find string interpolations (con't)
Walk the AST
Find string interpolations (con't)
Walk the AST, and we'll get back a list of interpolated strings as s-expressions
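One way the walk could look, sketched with Ripper rather than the Ruby Parser gem (so node names such as `:string_literal` and `:string_embexpr` are Ripper's). `nodes_of_type` is a hypothetical helper.

```ruby
require 'ripper'

# Hypothetical helper: return every node of the given type, depth-first.
def nodes_of_type(sexp, type, found = [])
  return found unless sexp.is_a?(Array)
  found << sexp if sexp.first == type
  sexp.each { |child| nodes_of_type(child, type, found) }
  found
end

source = <<~'RUBY'
  foo = "plain"
  bar = "foo #{baz}"
RUBY

strings = nodes_of_type(Ripper.sexp(source), :string_literal)

# A string is interpolated if an embedded expression sits somewhere inside it.
interpolated = strings.select { |s| nodes_of_type(s, :string_embexpr).any? }

puts "#{interpolated.size} of #{strings.size} strings use interpolation"
```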
Find string interpolations (con't)
Walking the tree further to find "interpolation expressions" is very similar and produces
This is where the actual interpolation occurs
As we can see, we're interpolating the value of the local variable baz
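A sketch of that second walk, again assuming Ripper in place of the Ruby Parser gem. In Ripper's output the embedded expression is a `:string_embexpr` node and identifiers are `:@ident` tokens; `nodes_of_type` is a hypothetical helper.

```ruby
require 'ripper'

# Hypothetical helper: return every node of the given type, depth-first.
def nodes_of_type(sexp, type, found = [])
  return found unless sexp.is_a?(Array)
  found << sexp if sexp.first == type
  sexp.each { |child| nodes_of_type(child, type, found) }
  found
end

source = <<~'RUBY'
  baz = 42
  bar = "foo #{baz}"
RUBY

embedded = nodes_of_type(Ripper.sexp(source), :string_embexpr)

# Identifier tokens look like [:@ident, "name", [line, column]].
names = embedded.flat_map { |e| nodes_of_type(e, :@ident).map { |id| id[1] } }
p names
```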
Better S-Expression Parsing with SexpProcessor
SexpProcessor allows you to write handlers for each node type in the AST
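The handler-per-node-type idea can be sketched without the gem. The class below is a toy stand-in, not SexpProcessor itself: it borrows the `process_<type>` naming convention and runs over Ripper output. `TinyProcessor` and its methods are hypothetical.

```ruby
require 'ripper'

# Toy stand-in for the handler-per-node-type idea; NOT the sexp_processor gem.
class TinyProcessor
  attr_reader :interpolations

  def initialize
    @interpolations = []
  end

  # Dispatch to process_<type> when a handler exists, then keep descending.
  def process(sexp)
    return unless sexp.is_a?(Array)
    if sexp.first.is_a?(Symbol)
      handler = "process_#{sexp.first}"
      send(handler, sexp) if respond_to?(handler, true)
    end
    sexp.each { |child| process(child) }
  end

  private

  # Handler for Ripper's embedded-expression node.
  def process_string_embexpr(node)
    @interpolations << node
  end
end

processor = TinyProcessor.new
processor.process(Ripper.sexp('puts "a #{b} c #{d}"'))
puts processor.interpolations.size
```

The appeal of this shape is that the tree walking is written once and each analysis only adds the handlers it cares about.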
"This whole thing is impossible..."
Ruby has all sorts of things that get in our way:
- method_missing
- eval
These make it very hard to analyze every piece of code in its entirety...
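A small illustration of the problem, assuming Ripper for parsing. The naive check `mentions_identifier?` is hypothetical: it looks for a method name among the identifiers in the source, which is roughly what a simple static scanner can see.

```ruby
require 'ripper'

# Hypothetical helper: return every node of the given type, depth-first.
def nodes_of_type(sexp, type, found = [])
  return found unless sexp.is_a?(Array)
  found << sexp if sexp.first == type
  sexp.each { |child| nodes_of_type(child, type, found) }
  found
end

# Naive static check: does the identifier appear anywhere in the source?
def mentions_identifier?(source, name)
  nodes_of_type(Ripper.sexp(source), :@ident).any? { |id| id[1] == name }
end

direct  = 'user.destroy'
dynamic = 'user.send("des" + "troy")'  # name is assembled at runtime
evaled  = 'eval("user.des" + "troy")'  # code is assembled at runtime

[direct, dynamic, evaled].each do |src|
  puts "#{src.ljust(28)} #{mentions_identifier?(src, 'destroy')}"
end
```

All three snippets would call `destroy` when run, but only the first one shows it to the parser.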
The good news is we can do "good enough" for lots of tasks
Other cool uses of Static Analysis
- Flog / Flay
- Brakeman Scanner
- ... and more

Cool Uses of Static Analysis - Flog
From the README: "Flog reports the most tortured code in an easy to read pain report. The higher the score, the more pain the code is in."
Flog relies on static analysis via SexpProcessor
Cool Uses of Static Analysis - Flog (con't)
Flog assigns each node of the AST a weighted penalty for complexity. At the end of a run, the "worst" (most complicated) methods are presented with their scores.
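A toy version of weighted scoring, to show the shape of the idea only. The weights and node names below are invented for this sketch and use Ripper's node types; Flog's real scoring table and parser are different.

```ruby
require 'ripper'

# Hypothetical weights; Flog's real table is different.
WEIGHTS = { if: 1.0, call: 0.5, assign: 0.2 }.freeze

# Hypothetical helper: return every node of the given type, depth-first.
def nodes_of_type(sexp, type, found = [])
  return found unless sexp.is_a?(Array)
  found << sexp if sexp.first == type
  sexp.each { |child| nodes_of_type(child, type, found) }
  found
end

# Sum the penalty of this node and everything beneath it.
def score(sexp)
  return 0.0 unless sexp.is_a?(Array)
  own = sexp.first.is_a?(Symbol) ? WEIGHTS.fetch(sexp.first, 0.0) : 0.0
  own + sexp.sum(0.0) { |child| score(child) }
end

# Score each method definition and list the worst first.
def method_scores(source)
  defs = nodes_of_type(Ripper.sexp(source), :def)
  defs.map { |d| [d[1][1], score(d)] }.sort_by { |_, s| -s }
end

source = <<~'RUBY'
  def simple
    1
  end

  def tangled(x)
    if x > 1
      y = x.foo
    end
  end
RUBY

scores = method_scores(source)
scores.each { |name, s| puts format('%-10s %.1f', name, s) }
```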
Cool Uses of Static Analysis - Flog (con't)
Note: If you look at Flog's source, the author(s) use the s(...) helper from SexpProcessor (shorthand for building a Sexp) to return an empty s-expression at the end of each process_* method.
It may take a while to track that down
Cool Uses of Static Analysis - Flay
From the README: "Flay analyzes code for structural similarities. Differences in literal values, variable, class, method names, whitespace, programming style, braces vs do/end, etc are all ignored. Making this totally rad."
They mean it. That is rad.
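Structural similarity can be approximated by throwing away everything except the tree's shape. This is a sketch of the general idea using Ripper, not Flay's actual algorithm; `shape` is a hypothetical helper.

```ruby
require 'ripper'

# Hypothetical helper: reduce a Ripper s-expression to its shape.
# Scanner tokens look like [:@ident, "name", [line, col]]; keeping only the
# token type discards names, literal values and source positions.
def shape(sexp)
  return sexp unless sexp.is_a?(Array)
  type = sexp.first
  return [type] if type.is_a?(Symbol) && type.to_s.start_with?('@')
  sexp.map { |child| shape(child) }
end

a = 'def area(w, h); w * h + 1; end'
b = 'def cost(qty, price); qty * price + 99; end'
c = 'def label(x); x.to_s; end'

same_ab = shape(Ripper.sexp(a)) == shape(Ripper.sexp(b))
same_ac = shape(Ripper.sexp(a)) == shape(Ripper.sexp(c))

puts "a and b structurally similar: #{same_ab}"
puts "a and c structurally similar: #{same_ac}"
```

Here `a` and `b` differ in every name and literal but share one shape, while `c` does not.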
Cool Uses of Static Analysis - Brakeman
Fairly comprehensive security scanner for Rails apps
Analyzes Rails apps from their source without scanning the running app.
Aware of a wide variety of Rails-specific vulnerabilities as well as general web-app security issues.
I've presented on it before at ATLRUG, but...
I'll be happy to talk your ear off about it after the talk.
Other cool uses of Static Analysis
(Sir Not-Appearing-In-This-Film)
- Roodi - "Ruby Object Oriented Design Inferometer": cyclomatic complexity checker for Ruby apps, i.e. it counts the number of "linearly independent" paths through a program's source
- Reek - a "Code smell detector for Ruby". Detects:
  - functions with too many parameters
  - poor variable naming: x, y, i, j, k
  - and more...
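For a feel of what a cyclomatic complexity check involves, here is an approximation: one plus the number of branching nodes. It assumes Ripper's node names and a hand-picked list of branch types (it ignores `&&` and `||`, for instance), so treat it as a sketch rather than how Roodi does it.

```ruby
require 'ripper'

# Assumed list of Ripper node types that open a new path through the code.
BRANCH_NODES = %i[if elsif unless while until if_mod unless_mod
                  while_mod until_mod when rescue ifop].freeze

# Hypothetical helper: return every node of the given type, depth-first.
def nodes_of_type(sexp, type, found = [])
  return found unless sexp.is_a?(Array)
  found << sexp if sexp.first == type
  sexp.each { |child| nodes_of_type(child, type, found) }
  found
end

# Approximate cyclomatic complexity: decision points + 1.
def cyclomatic_complexity(source)
  sexp = Ripper.sexp(source)
  1 + BRANCH_NODES.sum { |type| nodes_of_type(sexp, type).size }
end

source = <<~'RUBY'
  def grade(score)
    if score > 90
      "A"
    elsif score > 80
      "B"
    else
      "C"
    end
  end
RUBY

complexity = cyclomatic_complexity(source)
puts complexity
```

The `if` and the `elsif` are the two decision points, giving three independent paths through `grade`.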