Static Analysis of Ruby


David Worth - dave@highgroove.com

What Language is that?

Lisp?

Scheme?

Clojure?

... I would have accepted any.

Why (study, learn, like) Lisps?

... because they show that computers are Turing machines.

Bletchley Park - Block B - The Bletchley Park Story - Statue of Alan Turing - by Stephen Kettle

"data" and "code" are indistinguishable in Lisps because they are identically encoded "lists"

Shellcode

another great example of our Turing Machines
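
To make the "code is data" point concrete, here is a minimal sketch in plain Ruby. The `evaluate` method and the `LISPY_PROGRAM` constant are made up for illustration; the only idea borrowed from Lisp is that a program is a nested list whose first element names the operation.

```ruby
# A tiny evaluator for Lisp-style expressions written as plain Ruby arrays.
# The first element names the operation; the rest are its arguments.
def evaluate(expr)
  return expr unless expr.is_a?(Array) # atoms (numbers) evaluate to themselves

  op, *args = expr
  values = args.map { |arg| evaluate(arg) }
  case op
  when :+ then values.reduce(0, :+)
  when :* then values.reduce(1, :*)
  else raise ArgumentError, "unknown operator: #{op.inspect}"
  end
end

LISPY_PROGRAM = [:+, 1, [:*, 2, 3]]

puts LISPY_PROGRAM.length     # treated as data: an array with three elements
puts evaluate(LISPY_PROGRAM)  # treated as code: 1 + (2 * 3)
```

The same array is inspected with `length` and then executed, which is the whole trick.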

Ruby as an acceptable Lisp

Apologies to Steve Yegge but... I posit that Ruby is a perfectly acceptable Lisp.

Even if it lacks some cool Lisp-isms like macros... because it fakes it so well with things like define_method / instance_eval / class_eval
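
A rough sketch of what "faking it" can look like, using only `define_method` and `class_eval`. The `Settings` class and its `flag` helper are hypothetical names invented for this example.

```ruby
class Settings
  # Generate a reader and a predicate for each name, much as a Lisp macro
  # might expand one form into several definitions.
  def self.flag(*names)
    names.each do |name|
      define_method(name) { @flags.fetch(name, false) }
      define_method("#{name}?") { !!@flags.fetch(name, false) }
    end
  end

  flag :verbose, :debug

  def initialize(flags = {})
    @flags = flags
  end
end

# class_eval reopens a class at runtime and evaluates the block inside it.
Settings.class_eval do
  def enabled
    @flags.select { |_, on| on }.keys
  end
end

settings = Settings.new({ verbose: true })
puts settings.verbose?         # true
puts settings.debug?           # false
puts settings.enabled.inspect  # [:verbose]
```

None of `verbose`, `verbose?`, `debug`, or `debug?` is written out with `def`; they only exist once `flag` has run.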

Abstract Syntax Trees

Without teaching a compilers class...

Purple(?) Dragon Book Cover
  1. These "interpreted" languages pretty much aren't anymore. Everything executes in a VM, and it's VMs/turtles all the way down.
  2. Python is complicated, has a byte-compiler, and I am out of my comfort zone talking about the nuts and bolts of it as a language.

Analyzing Ruby

Here's some everyday Ruby:

What the?

Who can tell me what this does?

Side by side...

it is much clearer what's going on
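
The code from these slides did not survive in this text, so here is a stand-in rather than the original example. It assumes Ripper, the parser that ships with Ruby's standard library, instead of the Ruby Parser gem used later in the talk, so the node names will differ. `SAMPLE_SOURCE` is an invented snippet.

```ruby
require "ripper"
require "pp"

SAMPLE_SOURCE = 'puts "hello" if ready'

# Ripper.sexp returns the parse tree as nested arrays, or nil on a syntax error.
SAMPLE_TREE = Ripper.sexp(SAMPLE_SOURCE)

puts SAMPLE_SOURCE
pp SAMPLE_TREE
```

Printing the source and then the tree gives a poor man's side-by-side view: every array starts with a symbol naming the node type, followed by that node's children.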

Node types in Ruby's AST

You could almost just write code with every operator and extract the node types manually
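
A sketch of that manual approach, again assuming the standard library's Ripper rather than the Ruby Parser gem (so these are Ripper's node names). `node_types` and `SNIPPETS` are hypothetical names.

```ruby
require "ripper"
require "set"

# Collect every node type (the leading symbol of each array) in a Ripper tree.
def node_types(tree, found = Set.new)
  return found unless tree.is_a?(Array)

  found << tree.first if tree.first.is_a?(Symbol)
  tree.each { |child| node_types(child, found) }
  found
end

SNIPPETS = ["a = 1 + 2", "x ? y : z", "[1, 2].each { |n| puts n }"].freeze

SNIPPETS.each do |src|
  puts "#{src} => #{node_types(Ripper.sexp(src)).to_a.inspect}"
end
```

Feed it enough snippets and you end up with a fairly complete list of node types, which is why this is "almost" a workable plan.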

or...

Node types in Ruby's AST (cont'd)

By reading the Ruby source we can learn about the node types:

Do something interesting...

Find string interpolations using the Ruby Parser gem

Find string interpolations (cont'd)

Which gives us our old friend the s-expression

Find string interpolations (cont'd)

This is an S-Expression... also known as an array.

So we can simply walk down the tree making decisions on node types

Find string interpolations (cont'd)

Walk the AST

Find string interpolations (cont'd)

Walk the AST, and we'll get back a list of interpolated strings as s-expressions
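
The walk itself can be sketched without any gem, because an s-expression is just an array. The tree below is hand-written to resemble what the Ruby Parser gem produces; its exact shape is an assumption, and `find_nodes` is a hypothetical helper rather than part of any library.

```ruby
# Hand-written tree resembling Ruby Parser output for:
#   baz = "world"
#   puts "hello #{baz}"
# The exact shape is an assumption; real output may differ between versions.
INTERPOLATION_TREE = [:block,
                      [:lasgn, :baz, [:str, "world"]],
                      [:call, nil, :puts,
                       [:dstr, "hello ", [:evstr, [:lvar, :baz]]]]].freeze

# Depth-first walk that collects every subtree whose node type matches.
def find_nodes(sexp, type, found = [])
  return found unless sexp.is_a?(Array)

  found << sexp if sexp.first == type
  sexp.each { |child| find_nodes(child, type, found) }
  found
end

p find_nodes(INTERPOLATION_TREE, :dstr)
```

Since the gem's `Sexp` class is an `Array` subclass, the same walker should work on real parser output as well.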

Find string interpolations (cont'd)

Walking the tree further to find "interpolation expressions" is very similar and produces

This is where the actual interpolation occurs

As we can see, we're interpolating the value of the local variable baz
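
A small sketch of that last step, assuming an interpolated string node shaped like `[:dstr, "text", [:evstr, expression]]` (hand-written here, not produced by a parser). `interpolated_expressions` is a hypothetical helper.

```ruby
# An interpolated string node, hand-written in the assumed Ruby Parser shape.
DSTR_NODE = [:dstr, "hello ", [:evstr, [:lvar, :baz]]].freeze

# Return the expression wrapped by each :evstr child of an interpolated string.
def interpolated_expressions(dstr)
  dstr.select { |part| part.is_a?(Array) && part.first == :evstr }
      .map { |evstr| evstr[1] }
end

interpolated_expressions(DSTR_NODE).each do |expr|
  kind, name = expr
  puts "interpolates #{kind} #{name}"
end
```

For this node it reports an `lvar` named `baz`, matching the slide.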

Better S-Expression Parsing with SexpProcessor

SexpProcessor allows you to write handlers for each node in the AST:
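
The slide's code is missing from this text, so what follows is a dependency-free imitation of the idea, not the gem's API. `TinyProcessor` is a hypothetical class that copies only the `process_<node type>` naming convention; the real SexpProcessor has its own rules about what handlers must consume and return.

```ruby
# TinyProcessor is a hypothetical stand-in, NOT the SexpProcessor class.
# It dispatches each node to a process_<type> method when one is defined.
class TinyProcessor
  def process(sexp)
    return unless sexp.is_a?(Array)

    handler = "process_#{sexp.first}"
    if sexp.first.is_a?(Symbol) && respond_to?(handler, true)
      send(handler, sexp)
    else
      sexp.each { |child| process(child) } # no handler: keep descending
    end
  end
end

class InterpolationFinder < TinyProcessor
  attr_reader :found

  def initialize
    @found = []
  end

  # Called for every :dstr (interpolated string) node.
  def process_dstr(sexp)
    @found << sexp
    sexp.drop(1).each { |child| process(child) }
  end
end

finder = InterpolationFinder.new
finder.process([:call, nil, :puts, [:dstr, "hi ", [:evstr, [:lvar, :baz]]]])
p finder.found
```

The appeal is that each handler only has to know about one node type, and the dispatcher takes care of the walking.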

"This whole thing is impossible..."

Ruby has all sorts of things that get in our way:

  • method_missing
  • eval

These make complete analysis of every piece of code very hard...
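
A short illustration of why those two hurt. The `Ghost` class and `HIDDEN_METHOD_NAME` constant are invented for this example.

```ruby
class Ghost
  # No method named `greet` appears anywhere in this source file,
  # yet Ghost.new.greet works because method_missing intercepts the call.
  def method_missing(name, *args)
    name.to_s.start_with?("gr") ? "handled #{name} at runtime" : super
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("gr") || super
  end
end

puts Ghost.new.greet

# The method name below only exists once the strings are joined and evaluated,
# so a tool reading the source never sees a call to `upcase`.
HIDDEN_METHOD_NAME = "up" + "case"
puts eval(%("ruby".#{HIDDEN_METHOD_NAME}))
```

In both cases the interesting behavior is decided at runtime, and the parse tree alone does not tell you what will be called.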

The good news is we can do "good enough" for lots of tasks

Other cool uses of Static Analysis

Ruby - A Programmer's Best Friend

Cool Uses of Static Analysis - Flog

From the README:
"Flog reports the most tortured code in an easy to read pain report. The higher the score, the more pain the code is in."

Flog relies on static analysis via SexpProcessor

Cool Uses of Static Analysis - Flog (cont'd)

Flog assigns various weighted penalties for complexity to each node of the AST. At the end of a run, the "worst" (most complicated) methods are presented with their scores.
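
A toy version of the scoring idea, to show the mechanics only. The weights in `PENALTIES` are made up and are not Flog's values, the trees are hand-written in an assumed Ruby Parser shape, and `pain_score` is a hypothetical helper.

```ruby
# Hypothetical per-node penalties chosen for illustration; not Flog's weights.
PENALTIES = { call: 1.0, if: 1.5, while: 2.0, lasgn: 0.5 }.freeze

# Sum the penalty of every node in a tree of nested arrays.
def pain_score(sexp)
  return 0.0 unless sexp.is_a?(Array)

  own = PENALTIES.fetch(sexp.first, 0.0)
  own + sexp.inject(0.0) { |sum, child| sum + pain_score(child) }
end

# Hand-written method bodies; the shapes are assumptions.
SCORED_METHODS = {
  "simple" => [:defn, :simple, [:args], [:call, nil, :puts, [:str, "hi"]]],
  "tangled" => [:defn, :tangled, [:args, :x],
                [:while, [:call, [:lvar, :x], :>, [:lit, 0]],
                 [:if, [:call, [:lvar, :x], :odd?],
                  [:lasgn, :x, [:call, [:lvar, :x], :-, [:lit, 1]]],
                  [:lasgn, :x, [:call, [:lvar, :x], :/, [:lit, 2]]]],
                 true]]
}.freeze

# Report the most painful methods first.
SCORED_METHODS.sort_by { |_, tree| -pain_score(tree) }.each do |name, tree|
  puts format("%6.1f  %s", pain_score(tree), name)
end
```

The real tool is far more nuanced, but the outline is the same: walk, weigh, sum, sort.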

Cool Uses of Static Analysis - Flog (cont'd)

Note: If you look at Flog's source, the author(s) use the s method from SexpProcessor (a top-level shorthand for building a Sexp) to return an empty s-expression at the end of each process_* method.

It may take a while to track that down.

Where's Waldo in Google Maps?

Cool Uses of Static Analysis - Flay

From the README:
"Flay analyzes code for structural similarities. Differences in literal values, variable, class, method names, whitespace, programming style, braces vs do/end, etc are all ignored. Making this totally rad."

They mean it. That is rad.
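
One way to picture "structural similarity" is to blank out everything except the node types and compare what is left. This is a deliberately crude sketch and not Flay's algorithm; `shape`, `similar?`, and the three hand-written trees (in an assumed Ruby Parser shape) are all hypothetical.

```ruby
# Hand-written trees for three small methods; the shapes are assumptions.
AREA_TREE  = [:defn, :area, [:args, :w, :h],
              [:call, [:lvar, :w], :*, [:lvar, :h]]].freeze
TOTAL_TREE = [:defn, :total, [:args, :price, :qty],
              [:call, [:lvar, :price], :*, [:lvar, :qty]]].freeze
GREET_TREE = [:defn, :greet, [:args, :name],
              [:call, nil, :puts, [:lvar, :name]]].freeze

# Keep node types and blank out everything else (names, literals, operators).
def shape(sexp)
  return :_ unless sexp.is_a?(Array)

  [sexp.first] + sexp.drop(1).map { |child| shape(child) }
end

def similar?(left, right)
  shape(left) == shape(right)
end

puts similar?(AREA_TREE, TOTAL_TREE) # same structure, different names
puts similar?(AREA_TREE, GREET_TREE) # different structure
```

`area` and `total` differ only in names, so they collapse to the same shape, while `greet` does not.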

Cool Uses of Static Analysis - Brakeman

Brakeman Scanner Logo

A fairly comprehensive security scanner for Rails apps

Analyzes Rails apps from their source without scanning the running app.

Aware of a wide variety of Rails-specific vulnerabilities as well as general web-app security issues.

This has been presented at ATLRUG before, but...

I'll be happy to talk your ear off about it after the talk.

Other cool uses of Static Analysis

(Sir Not-Appearing-In-This-Film)
  • Roodi - "Ruby Object Oriented Design Inferometer"
    Cyclomatic complexity checker for Ruby apps - i.e. the number of "linearly independent" paths through a program's source
  • Reek - a "Code smell detector for Ruby"
    Detects:
    • functions with too many parameters
    • poor variable naming: x, y, i, j, k
    • and more...
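
For a feel of what a cyclomatic complexity check involves, here is a simplified sketch: count the decision points in the parse tree and add one. It assumes the standard library's Ripper, not whatever parser Roodi uses, and `BRANCH_NODES` is an incomplete, hand-picked set. `count_branches`, `cyclomatic_complexity`, and `SIGN_SOURCE` are hypothetical names.

```ruby
require "ripper"

# Ripper node types treated as decision points. This set is a simplification:
# it ignores &&, ||, and rescue clauses, for example.
BRANCH_NODES = %i[if elsif unless while until when if_mod unless_mod
                  while_mod until_mod ifop].freeze

def count_branches(tree)
  return 0 unless tree.is_a?(Array)

  own = BRANCH_NODES.include?(tree.first) ? 1 : 0
  own + tree.inject(0) { |sum, child| sum + count_branches(child) }
end

def cyclomatic_complexity(source)
  tree = Ripper.sexp(source)
  raise ArgumentError, "could not parse source" if tree.nil?

  count_branches(tree) + 1
end

SIGN_SOURCE = <<~RUBY
  def sign(x)
    if x > 0 then 1
    elsif x < 0 then -1
    else 0
    end
  end
RUBY

puts cyclomatic_complexity(SIGN_SOURCE)
```

The `if` and the `elsif` each add a path, so this method has three ways through it; straight-line code scores one.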

Thank You!

Questions?

References