Weave: Austria - Belgium - Germany - Luxembourg - Poland - Switzerland - Slovenia - Czechia
Disciplines
Computer Sciences (100%)
Keywords
Strings,
Type Systems,
Automated Testing,
Type Grammar Inference
Abstract
Strings of textual data, such as names, credit card numbers, mail addresses, URLs, bank
accounts, color codes, and much more, are commonly processed by almost all computer
programs. Yet, programming languages offer little support for actually checking whether the
contents of these strings are as expected. Unexpected string inputs can not only lead to
functional errors, but are also common vectors for cyberattacks.
In this project, we introduce string types: a way for programmers to express the valid values
of strings using formal descriptions like regular expressions and grammars. We introduce
means to specify which sets of strings are acceptable as input values and to check whether a
program is correct with respect to the specified string types before the program is released
into the wild and exposed to potentially dangerous input strings.
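To make the idea of a string type concrete, the following is a minimal Python sketch, not the project's actual design. It assumes a string type can be approximated by a regular expression attached to a subclass of str. The names StringType, HexColor, and brightness are hypothetical and chosen only for illustration. The check here happens at run time when a value is constructed, whereas the project aims to verify programs against string types before release.

```python
import re


class StringType(str):
    """Hypothetical base class: a str whose value must fully match `pattern`."""

    pattern = r"(?s).*"  # default: accept any string, including newlines

    def __new__(cls, value):
        # Reject values outside the language described by the pattern.
        if re.fullmatch(cls.pattern, value) is None:
            raise ValueError(f"{value!r} is not a valid {cls.__name__}")
        return super().__new__(cls, value)


class HexColor(StringType):
    """Illustrative string type for color codes such as '#1a2b3c'."""

    pattern = r"#[0-9a-fA-F]{6}"


def brightness(color: HexColor) -> float:
    """Average of the three color channels, scaled to the range 0..1."""
    # Safe to slice and parse: the HexColor type guarantees the format.
    r, g, b = (int(color[i:i + 2], 16) for i in (1, 3, 5))
    return (r + g + b) / (3 * 255)
```

With this sketch, HexColor("#ff0000") constructs normally, while HexColor("red") raises ValueError, so a function such as brightness never has to handle malformed input itself.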
The formal languages used to describe these string types additionally make it possible to
quickly produce examples of valid program inputs. Because such automatically generated
formally valid inputs can bypass all early syntactic integrity checks built into a program, they
are able to reach the inner layers of the software. This allows for massive automated testing
of programs, both already released and still in development, to uncover bugs and security
issues that would otherwise be deeply hidden.
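As an illustration of how a formal description can double as an input producer, here is a small sketch of grammar-based generation in Python. The grammar, the symbol names <color> and <hex>, and the function generate are assumptions made for this example, and the grammar is deliberately non-recursive so that expansion always terminates. A practical generator would also need depth or cost limits for recursive grammars.

```python
import random
import re

# Hypothetical grammar: each nonterminal maps to a list of alternative
# expansions, and each expansion is a list of symbols.
GRAMMAR = {
    "<color>": [["#", "<hex>", "<hex>", "<hex>", "<hex>", "<hex>", "<hex>"]],
    "<hex>": [[c] for c in "0123456789abcdef"],
}


def generate(symbol, grammar, rng):
    """Expand `symbol` into a string by picking random alternatives."""
    if symbol not in grammar:
        return symbol  # terminal symbol: emit as-is
    expansion = rng.choice(grammar[symbol])
    return "".join(generate(s, grammar, rng) for s in expansion)


rng = random.Random(0)  # fixed seed so runs are reproducible
samples = [generate("<color>", GRAMMAR, rng) for _ in range(5)]

# Every generated sample is syntactically valid by construction.
assert all(re.fullmatch(r"#[0-9a-f]{6}", s) for s in samples)
```

Each sample passes a syntactic check for color codes, so it can be fed to code that would reject random byte strings at the first validation step.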
Finally, we will also introduce means to automatically learn string types from program code
and its executions. We can infer a rough but correct outline of possible input strings by
statically analyzing the program source code, and then further refine this outline with
automated testing. Such automatic type inference will make it easy for programmers to add
string types to their codebases.
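The refinement step can be illustrated with a toy Python sketch, which is far simpler than real type inference. It assumes that a rough outline (a fixed length and a candidate alphabet) is already known, for instance from static analysis, and that one valid seed input is available. The program under test, parse_pin, and the helpers accepts and refine are hypothetical. The sketch also assumes that each character position can be tested independently, which does not hold for many real input formats.

```python
import re


def parse_pin(s: str) -> int:
    """Hypothetical program under test: accepts exactly four ASCII digits."""
    if len(s) != 4 or not all(c in "0123456789" for c in s):
        raise ValueError("invalid PIN")
    return int(s)


def accepts(program, s):
    """Run `program` on `s` and report whether it accepted the input."""
    try:
        program(s)
        return True
    except ValueError:
        return False


def refine(program, seed, alphabet):
    """Narrow the outline 'len(seed) characters over alphabet' by testing.

    For each position, try every candidate character while keeping the
    rest of the seed fixed, and keep only the characters that the
    program accepts. Returns a regular expression as a string.
    """
    classes = []
    for i in range(len(seed)):
        ok = [c for c in alphabet
              if accepts(program, seed[:i] + c + seed[i + 1:])]
        classes.append("[" + "".join(re.escape(c) for c in ok) + "]")
    return "".join(classes)


# Rough outline: four characters drawn from digits plus 'a' and 'b'.
inferred = refine(parse_pin, seed="0000", alphabet="0123456789ab")
```

Here the executions rule out 'a' and 'b' at every position, so the inferred expression accepts "1234" but rejects "12a4".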