SBIR-STTR Award

CRAM: C++ to Rust Assisted Migration PH2
Award last edited on: 3/31/2023

Sponsored Program
SBIR
Awarding Agency
DOD : DARPA
Total Award Amount
$1,724,783
Award Phase
2
Solicitation Topic Code
HR001121S0007-10
Principal Investigator
Thomas Wahl

Company Information

GrammaTech Inc

531 Esty Street
Ithaca, NY 14850
   (607) 273-7340
   info@grammatech.com
   www.grammatech.com
Location: Multiple
Congr. District: 23
County: Tompkins

Phase I

Contract Number: HR001122C0025
Start Date: 9/9/2021    Completed: 12/31/2022
Phase I year
2021
Phase I Amount
$224,997
The C language has traditionally emphasized programs' runtime performance, achieved by leaving low-level memory management to the programmer. Countless program crashes, hangs, and security vulnerabilities have been attributed to uninformed or malicious use of this feature. In contrast, languages considered to be (memory-)safe restrict direct memory access by programmers and include C#, Java, Rust, and Go. Migrating actively used code in C and its extension C++ to one of these languages allows this code to benefit from the safer languages' advantages. We propose an approach to semi-automatically migrating well-designed C++ code into a safer language. We believe we can build substantial tool support for such a migration. We propose Rust as the target language and therefore dub our effort CRAM ("C++ to Rust Assisted Migration"). The reasons for choosing Rust include its safe programming features, its promise to deliver well-performing system-level and network applications, and its status as a modern language with many features beyond those related to safety. We are seeking a form of language migration that results in well-designed, human-maintainable programs in Rust, which we can hand back to the developer with a good conscience. Our vision is that our tool will be applied to code that is under active development. To support post-migration development, the code needs to be readable, use human-friendly variable names, and be idiomatic Rust code, resting on that language's native design pillars. Human-maintainability is also critical for post-processing automatically migrated code that does not yet satisfy all our requirements. Our approach will lead to programs that take full advantage of Rust's safety and other features. As a by-product of the migration process, we will generate strong program-specific certifications of the result, including both statements relating the source and target programs, and reasoning about the stand-alone target program.
Our migration tool CRAM will be developed as open-source software. Stable revisions, along with benchmarks and milestone migration results, will be made available on a public website. We plan to engage with the Rust programmer community to stay abreast of advances in the language and adjust CRAM accordingly. Potential commercialization will serve the urgent need of both the Government and the private sector to respond to the inherent risks of continued reliance on existing C/C++ code bases. GrammaTech has a long history of successful commercialization of products developed by its research division.

Phase II

Contract Number: HR001123C0079
Start Date: 2/28/2023    Completed: 2/28/2026
Phase II year
2023
Phase II Amount
$1,499,786
The C language has traditionally emphasized a program’s runtime performance, achieved by leaving low-level memory management to the programmer. Countless program crashes, hangs, and security vulnerabilities have been attributed to uninformed or malicious use of this freedom. C’s extension C++ provides better programming abstractions but insists on backward compatibility with C and thus suffers from similar risks, which are unacceptable in domains like systems programming or for defense applications. In contrast, languages considered to be (memory-)safe restrict direct memory access by programmers and include C#, Java, and Go. The Rust programming language offers safe programming features without resorting to expensive runtime management techniques like garbage collection and can therefore promise to deliver well-performing system-level and network applications. In order to extend the advantages of modern languages to legacy code, this proposal presents a strategy for migrating actively used C++ code to Rust that maximally benefits from the idiomatic and safety features of Rust, such as aggressively employing move semantics for assignments, and generics for reusable code patterns. We are seeking to produce human-maintainable programs in Rust, suitable for code under active development. Our strategy is to perform the migration in two stages. We begin with a refactoring step that attempts to harden the given C++ code, guided by Rust safe-programming rules. This step prepares the code for an easier migration to Rust, but also results in safer C++ code as a useful intermediate product. The refactoring is followed by a target code generation step, which attempts to recognize code segments with a particular intent, represents these in a language-agnostic concept representation, and retargets these concepts into Rust, using a library of Rust code templates.
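The two Rust features named above can be illustrated with a minimal sketch. It is not output produced by CRAM; the names `Packet`, `forward`, and `largest` are hypothetical and chosen only for illustration. It shows an assignment that moves ownership instead of copying, and a single generic function standing in for a family of per-type helpers.

```rust
/// Hypothetical message type; it owns its heap-allocated payload.
#[derive(Debug, PartialEq)]
struct Packet {
    payload: Vec<u8>,
}

/// Takes ownership of the packet; the caller can no longer use it afterwards.
fn forward(p: Packet) -> usize {
    p.payload.len()
}

/// One generic function in place of a per-type family of helpers.
/// Returns `None` for an empty slice.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let first = iter.next()?;
    Some(iter.fold(first, |best, x| if x > best { x } else { best }))
}

fn main() {
    let a = Packet { payload: vec![1, 2, 3] };
    let b = a; // assignment moves `a` into `b`; the buffer is not copied
    // println!("{:?}", a); // would not compile: `a` has been moved
    let n = forward(b); // `b` is moved into the function
    println!("forwarded {} bytes", n);
    println!("{:?} {:?}", largest(&[3, 7, 5]), largest(&[0.5, 0.25]));
}
```

The commented-out line is the point of the example: using a value after it has been moved is rejected at compile time, which is the kind of safety rule the refactoring step is described as being guided by.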
Acknowledging the hardness of the problem and the boldness of our proposed solution, our approach is semi-automatic, requesting user assistance to resolve ambiguities in inferring the intent of the code. The user interaction is supported by the integration of CRAM into an Integrated Development Environment, which displays the source code, the outcome of any refactorings, and the result of migrating code segments. For each operation performed on the source code, the user has the option of accepting or rejecting it, and of modifying its result. Despite human involvement, a problem of this magnitude demands an arsenal of assurance artifacts, such as tests and proofs, that aim to give evidence of the correctness of the migration, and the reliability of the target program. Our approach will deliver both migrated and automatically generated test cases, as well as proofs that establish the preservation of the program semantics by our refactoring and code generation steps.
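The accept, reject, or modify choice described above could be modeled as follows. This is a sketch under stated assumptions, not CRAM's actual design: the types `Proposal` and `Decision` and the function `resolve` are hypothetical names introduced here for illustration.

```rust
/// Hypothetical record of one operation proposed on a code segment.
#[derive(Debug, Clone, PartialEq)]
struct Proposal {
    original: String,  // the segment before the operation
    suggested: String, // the result the tool proposes
}

/// The three user choices named in the text.
#[derive(Debug, Clone, PartialEq)]
enum Decision {
    Accept,
    Reject,
    Modify(String), // the user's edited version of the result
}

/// Returns the text that remains in the code base after the user's decision.
fn resolve(p: &Proposal, d: Decision) -> String {
    match d {
        Decision::Accept => p.suggested.clone(),
        Decision::Reject => p.original.clone(),
        Decision::Modify(edited) => edited,
    }
}

fn main() {
    let p = Proposal {
        original: "int* buf = new int[n];".to_string(),
        suggested: "let buf = vec![0i32; n];".to_string(),
    };
    println!("{}", resolve(&p, Decision::Accept));
    println!("{}", resolve(&p, Decision::Reject));
    println!("{}", resolve(&p, Decision::Modify("let buf: Vec<i32> = Vec::with_capacity(n);".to_string())));
}
```

Modeling the choice as an enum means the compiler checks that every decision is handled, so a fourth option added later cannot be silently ignored.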