Released: September 2015
Data source: GitHub
Projects included
All projects have top-level metadata included. Only projects identified as Java projects also include repository history and source code.
Compiler: compiler-2015-08
Programming languages processed (stored as ASTs)
- Java (any file with a .java file extension), up to and including Java 7 (Java 8 and newer are not included)
Known Bugs/Limitations
- Project creation dates are off by a factor of 1000. Anyone wishing to use this field should correct it by dividing (p.created_date / 1000). See this query for an example: http://boa.cs.iastate.edu/boa/?q=boa/job/public/14441
- All fields in Person (real_name, email, username) are set to the same value (they all contain the real name)
- Some fields in Revision (author.real_name, committer.real_name, and committer.email) are actually blank - see https://github.com/boalang/compiler/issues/260
- Tags and branches are not stored
- Commits are listed topologically and thus you cannot see or infer the commit graph (or which commits belong to master vs a branch).
- The 2-expression form of assert statements loses the first expression - only the 2nd expression (a possibly non-boolean value) is stored.
- Variables that are left undefined might actually hold non-null values left over from processing other projects. Reading such a variable can give unexpected results, e.g. something like 's = s + v;' where s was assumed to be undefined initially. See here: https://github.com/boalang/compiler/issues/257
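
For anyone who has already downloaded query output containing uncorrected creation dates, the factor-of-1000 correction noted above can also be applied afterwards. The following is only a minimal sketch in Python: the function names are hypothetical, and the conversion to a calendar date assumes that a corrected value is a count of microseconds since the Unix epoch, which is not stated in these notes and should be checked against the Boa documentation.

```python
from datetime import datetime, timezone

# Factor given in the known-bugs list above (p.created_date / 1000).
SCALE_ERROR = 1000


def corrected_created_date(raw_value: int) -> int:
    """Remove the factor-of-1000 error from a raw created_date value."""
    # Integer division keeps the result an integer timestamp.
    return raw_value // SCALE_ERROR


def to_datetime(corrected_value: int) -> datetime:
    """Convert a corrected value to a UTC datetime.

    ASSUMPTION: corrected values are microseconds since the Unix epoch.
    """
    seconds, micros = divmod(corrected_value, 1_000_000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(microsecond=micros)


if __name__ == "__main__":
    # Made-up raw value for illustration only, not taken from the dataset.
    raw = 1_262_304_000_000_000_000
    print(to_datetime(corrected_created_date(raw)).isoformat())
```

Where possible, applying the division inside the query itself (as in the linked example job) is simpler, since any date functions in the query then see the corrected value.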