Prof. Dr. Gordon Fraser, Stephan Lukasczyk, and Sebastian Schweikl (Chair of Software Engineering II)
Wednesday, 2020–01–22, 6pm, HS 13 (IM)
Preliminary Discussion:
Monday, 2020–02–03, 2pm, (ITZ) SR 252
Seminar Time:
Tuesday, 2pm to 4pm
(IM) SR 033 ZOOM Online Meeting
Course Number:
Seminar Topic

A large number of analysis techniques for dynamic languages as well as functional languages have been proposed in recent years. Static and dynamic analyses form two categories of such techniques. Furthermore, type-inference techniques are very valuable, especially in the context of dynamically typed languages, because they enable better reasoning for existing analyses.

Form of the Seminar

We require students to present their topic during the seminar and submit a written report on their topic by the end of the semester.

The presentation should last 20 minutes, followed by a 10-minute discussion. It should motivate the problem, present the approach of the work, and demonstrate the results. We further encourage you to give a small demo using the available artifacts. Lastly, we require you to present ideas for the evaluation of your own work.

The written report is due on 2020–09–30 (AoE) and shall be written using the ACM Conference Template (use \documentclass[sigconf]{acmart}). We require master students to submit six pages plus references, and bachelor students five pages plus references.

Besides that, we require you to submit a short summary of each paper before that paper is presented. It shall consist of about half a page summarising the work, together with questions or points of the work you found unclear.

To get feedback, or in case of any issues, do not hesitate to contact your topic advisor.


  • Introduction; Paper “Dynamically Typed Languages”
    Date: 2020–04–21

    Laurence Tratt. 2009. Dynamically Typed Languages. Advances in Computers, Vol. 77, 36 pages. https://doi.org/10.1016/S0065-2458(09)01205-4.

    Dynamically typed languages such as Python and Ruby have experienced a rapid growth in popularity in recent times. However, there is much confusion as to what makes these languages interesting relative to statically typed languages, and little knowledge of their rich history. In this chapter, I explore the general topic of dynamically typed languages, how they differ from statically typed languages, their history, and their defining features.

  • Introduction; Paper “Why Functional Programming Matters”
    Date: 2020–04–28

    John Hughes. 1989. Why Functional Programming Matters. The Computer Journal, 32(2), 10 pages. https://doi.org/10.1093/comjnl/32.2.98.

    As software becomes more and more complex, it is more and more important to structure it well. Well-structured software is easy to write, easy to debug, and provides a collection of modules that can be re-used to reduce future programming costs. Conventional languages place conceptual limits on the way problems can be modularised. Functional languages push those limits back. In this paper we show that two features of functional languages in particular, higher-order functions and lazy evaluation, can contribute greatly to modularity. As examples, we manipulate lists and trees, program several numerical algorithms, and implement the alpha-beta heuristic (an Artificial Intelligence algorithm used in game-playing programs). Since modularity is the key to successful programming, functional languages are vitally important to the real world.
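    Hughes' flagship example builds Newton–Raphson square roots from a lazily produced stream of approximations. A rough Python transliteration, using generators to imitate lazy evaluation (the names `iterate` and `within` follow the paper; `newton_sqrt` is our own wrapper), might look like:

    ```python
    def iterate(f, x):
        """Lazily generate the infinite sequence x, f(x), f(f(x)), ..."""
        while True:
            yield x
            x = f(x)

    def within(eps, approximations):
        """Consume approximations until two neighbours differ by less than eps."""
        prev = next(approximations)
        for cur in approximations:
            if abs(cur - prev) < eps:
                return cur
            prev = cur

    def newton_sqrt(n, x0=1.0, eps=1e-10):
        # one Newton-Raphson step: x -> (x + n/x) / 2
        return within(eps, iterate(lambda x: (x + n / x) / 2, x0))
    ```

    The point mirrors the paper: the termination condition (`within`) is modularised away from the numerical iteration (`iterate`), which is only possible because the stream is produced lazily.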

  • Test Generation for Higher-Order Functions in Dynamic Languages
    Date: 2020–05–05 Student: Markus Wagner, Advisor: Stephan Lukasczyk

    Marija Selakovic, Michael Pradel, Rezwana Karim, and Frank Tip. 2018. Test Generation for Higher-Order Functions in Dynamic Languages. Proc. ACM Program. Lang. 2, OOPSLA, Article 161 (November 2018), 27 pages. https://doi.org/10.1145/3276531. GitHub Repository.

    Test generation has proven to provide an effective way of identifying programming errors. Unfortunately, current test generation techniques are challenged by higher-order functions in dynamic languages, such as JavaScript functions that receive callbacks. In particular, existing test generators suffer from the unavailability of statically known type signatures, do not provide functions or provide only trivial functions as inputs, and ignore callbacks triggered by the code under test. This paper presents LambdaTester, a novel test generator that addresses the specific problems posed by higher-order functions in dynamic languages. The approach automatically infers at what argument position a method under test expects a callback, generates and iteratively improves callback functions given as input to this method, and uses novel test oracles that check whether and how callback functions are invoked. We apply LambdaTester to test 43 higher-order functions taken from 13 popular JavaScript libraries. The approach detects unexpected behavior in 12 of the 13 libraries, many of which are missed by a state-of-the-art test generator.
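    One ingredient of the approach, inferring at which argument position a method expects a callback, can be loosely illustrated in Python by passing a recording function at each position in turn. This is a hypothetical helper for illustration only, not the paper's actual algorithm, which targets JavaScript:

    ```python
    def infer_callback_position(fn, n_args, dummy=0):
        """Probe at which argument position fn expects a callback by
        passing a recording function at each position in turn."""
        for pos in range(n_args):
            called = []
            args = [dummy] * n_args
            args[pos] = lambda *a: called.append(a)
            try:
                fn(*args)
            except Exception:
                pass  # dynamic code may fail on nonsensical arguments
            if called:
                return pos  # the probe at this position was invoked
        return None

    # toy method under test: applies its second argument to its first
    def apply_to(value, callback):
        return callback(value)
    ```

    LambdaTester additionally improves the generated callback bodies iteratively and checks, via test oracles, whether and how callbacks are invoked.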

  • DLint: Dynamically Checking Bad Coding Practices in JavaScript
    Date: 2020–05–12 Student: Jonas Spielberger, Advisor: Stephan Lukasczyk

    Liang Gong, Michael Pradel, Manu Sridharan, and Koushik Sen. 2015. DLint: Dynamically Checking Bad Coding Practices in JavaScript. In Proceedings of the 2015 International Symposium on Software Testing and Analysis (ISSTA), July 12–17, 2015, Baltimore, MD, USA. ACM, 11 pages. https://doi.org/10.1145/2771783.2771809. GitHub Repository.

    JavaScript has become one of the most popular programming languages, yet it is known for its suboptimal design. To effectively use JavaScript despite its design flaws, developers try to follow informal code quality rules that help avoid correctness, maintainability, performance, and security problems. Lightweight static analyses, implemented in “lint-like” tools, are widely used to find violations of these rules, but are of limited use because of the language's dynamic nature. This paper presents DLint, a dynamic analysis approach to check code quality rules in JavaScript. DLint consists of a generic framework and an extensible set of checkers that each addresses a particular rule. We formally describe and implement 28 checkers that address problems missed by state-of-the-art approaches. Applying the approach in a comprehensive empirical study on over 200 popular web sites shows that static and dynamic checking complement each other. On average per web site, DLint detects 49 problems that are missed statically, including visible bugs on the web sites of IKEA, Hilton, eBay, and CNBC.
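    DLint itself instruments JavaScript; as a mere Python analogue of its core idea, checking a code-quality rule at run time rather than statically, one of its rules ("do not read undefined properties") could be sketched as a wrapper that warns instead of failing silently:

    ```python
    import warnings

    class UndefinedReadChecker:
        """Toy dynamic checker in the spirit of DLint's rules: wrap an
        object and warn when code reads an attribute that was never
        defined, instead of silently yielding JavaScript's 'undefined'."""
        def __init__(self, obj):
            object.__setattr__(self, "_obj", obj)

        def __getattr__(self, name):
            target = object.__getattribute__(self, "_obj")
            if not hasattr(target, name):
                warnings.warn(f"read of undefined property {name!r}")
                return None  # mimic JavaScript's silent 'undefined'
            return getattr(target, name)
    ```

    A real checker would record the code location of each violation, as DLint's runtime framework does for its 28 rules.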

  • NL2Type: Inferring JavaScript Function Types from Natural Language Information
    Date: 2020–05–19 Student: Dariia Grishina, Advisor: Stephan Lukasczyk

    Rabee Sohail Malik, Jibesh Patra, and Michael Pradel. 2019. NL2Type: Inferring JavaScript Function Types from Natural Language Information. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), May 25–31, 2019, Montreal, QC, Canada. IEEE, 11 pages. https://doi.org/10.1109/ICSE.2019.00045. GitHub Repository.

    JavaScript is dynamically typed and hence lacks the type safety of statically typed languages, leading to suboptimal IDE support, difficult to understand APIs, and unexpected runtime behavior. Several gradual type systems have been proposed, e.g., Flow and TypeScript, but they rely on developers to annotate code with types. This paper presents NL2Type, a learning-based approach for predicting likely type signatures of JavaScript functions. The key idea is to exploit natural language information in source code, such as comments, function names, and parameter names, a rich source of knowledge that is typically ignored by type inference algorithms. We formulate the problem of predicting types as a classification problem and train a recurrent, LSTM-based neural model that, after learning from an annotated code base, predicts function types for unannotated code. We evaluate the approach with a corpus of 162,673 JavaScript files from real-world projects. NL2Type predicts types with a precision of 84.1% and a recall of 78.9% when considering only the top-most suggestion, and with a precision of 95.5% and a recall of 89.6% when considering the top-5 suggestions. The approach outperforms both JSNice, a state-of-the-art approach that analyzes implementations of functions instead of natural language information, and DeepTyper, a recent type prediction approach that is also based on deep learning. Beyond predicting types, NL2Type serves as a consistency checker for existing type annotations. We show that it discovers 39 inconsistencies that deserve developer attention (from a manual analysis of 50 warnings), most of which are due to incorrect type annotations.

  • Efficient and Precise Dynamic Slicing for Client-Side JavaScript Programs
    Date: 2020–05–26 Student: None, Advisor: Stephan Lukasczyk

    Jianbin Ye, Cheng Zhang, Lei Ma, Haibo Yu, and Jianjun Zhao. 2016. Efficient and Precise Dynamic Slicing for Client-Side JavaScript Programs. In IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER), March 14–18, 2016, Suita, Osaka, Japan. IEEE Computer Society, 10 pages. https://doi.org/10.1109/SANER.2016.96. Bitbucket Repository.

    JavaScript is the de facto dominant programming language for developing web applications. Most popular websites are using JavaScript, especially to develop client-side features. Being syntactically flexible and highly dynamic, JavaScript is easy to use and productive, but its code is known to be less maintainable. The task of maintaining client-side JavaScript code is further complicated by the pervasive interactions between JavaScript code and HTML elements, through the browser.
    In this paper, we present JS-Slicer, a dynamic slicer for JavaScript, to ease the task of understanding and debugging practical client-side JavaScript code. JS-Slicer defines three types of dependences, including data dependences, control dependences, and DOM dependences, to capture all relationships between program elements. JS-Slicer extends a novel dynamic analysis framework and combines dynamic and static analysis to precisely capture the dependences at run-time. Many language-specific issues are properly handled, which enables JS-Slicer to slice practical JavaScript code. Our evaluation on six real-world web applications and JavaScript libraries shows that JS-Slicer is both precise and efficient: on average it captures around 40K dependences in 2.5K lines of code, in less than 3.0 seconds.

  • QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs
    Date: 2020–05–26 Student: León Liehr, Advisor: Sebastian Schweikl

    Koen Claessen and John Hughes. 2000. QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs. In Proceedings of the Fifth ACM SIGPLAN International Conference on Functional Programming (ICFP), September 18–21, 2000, Montreal, Canada. ACM, 11 pages. https://doi.org/10.1145/351240.351266. Tool Page.

    QuickCheck is a tool which aids the Haskell programmer in formulating and testing properties of programs. Properties are described as Haskell functions, and can be automatically tested on random input, but it is also possible to define custom test data generators. We present a number of case studies, in which the tool was successfully used, and also point out some pitfalls to avoid. Random testing is especially suitable for functional programs because properties can be stated at a fine grain. When a function is built from separately tested components, then random testing suffices to obtain good coverage of the definition under test.
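    The core idea, checking a property on many random inputs, can be sketched in a few lines of Python. QuickCheck itself uses Haskell type classes to derive generators and shrinks counterexamples; in this simplified sketch the generator is passed explicitly and no shrinking is done:

    ```python
    import random

    def quickcheck(prop, gen, runs=100, seed=0):
        """Test property prop on `runs` random inputs drawn from gen.
        Returns None if all runs pass, or a failing input otherwise."""
        rng = random.Random(seed)
        for _ in range(runs):
            x = gen(rng)
            if not prop(x):
                return x  # counterexample found
        return None

    # generator: random integer lists of length 0..20
    def gen_int_list(rng):
        return [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]

    # property: reversing a list twice yields the original list
    def prop_double_reverse(xs):
        return list(reversed(list(reversed(xs)))) == xs
    ```

    A false property (e.g. "every list is sorted") would make `quickcheck` return a random counterexample instead of None.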

  • ESVERIFY: Verifying Dynamically-Typed Higher-Order Functional Programs by SMT Solving
    Date: 2020–06–09 Student: Leonhard Bauer, Advisor: Stephan Lukasczyk

    Christopher Schuster, Sohum Banerjea, and Cormac Flanagan. 2018. ESVERIFY: Verifying Dynamically-Typed Higher-Order Functional Programs by SMT Solving. In Proceedings of the 30th Symposium on Implementation and Application of Functional Languages (IFL), September 5–7, 2018, Lowell, MA, USA. ACM, 11 pages. https://doi.org/10.1145/3310232.3310240. GitHub Repository. Live Demo.

    Program verifiers statically check correctness properties of programs with annotations such as assertions and pre- and postconditions. Recent advances in SMT solving make it applicable to a wide range of domains, including program verification. In this paper, we describe ESVERIFY, a program verifier for JavaScript based on SMT solving, supporting functional correctness properties comparable to languages with refinement and dependent function types. ESVERIFY supports both higher-order functions and dynamically-typed idioms, enabling verification of programs that static type systems usually do not support. To verify these programs, we represent functions as universal quantifiers in the SMT logic and function calls as instantiations of these quantifiers. To ensure that the verification process is decidable and predictable, we describe a bounded quantifier instantiation algorithm that prevents matching loops and avoids ad-hoc instantiation heuristics. We also present a formalism and soundness proof of this verification system in the Lean theorem prover and a prototype implementation.

  • How Good Are Your Types? Using Mutation Analysis to Evaluate the Effectiveness of Type Annotations
    Date: 2020–06–16 Student: Jakob Jäger, Advisor: Stephan Lukasczyk

    Rahul Gopinath and Eric Walkingshaw. 2017. How Good Are Your Types? Using Mutation Analysis to Evaluate the Effectiveness of Type Annotations. In 2017 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICST Workshops), March 13–17, 2017, Tokyo, Japan. IEEE Computer Society, 6 pages. https://doi.org/10.1109/ICSTW.2017.28. mypy static type checking tool for Python. MutPy mutation testing tool for Python.

    Software engineers primarily use two orthogonal means to reduce susceptibility to faults: software testing and static type checking. While many strategies exist to evaluate the effectiveness of a test suite in catching bugs, there are few that evaluate the effectiveness of type annotations in a program. This problem is most relevant in the context of gradual or optional typing, where programmers are free to choose which parts of a program to annotate and in what detail. Mutation analysis is one strategy that has proven useful for measuring test suite effectiveness by emulating potential software faults. We propose that mutation analysis can be used to evaluate the effectiveness of type annotations too. We analyze mutants produced by the MutPy mutation framework against both a test suite and against type-annotated programs. We show that, while mutation analysis can be useful for evaluating the effectiveness of type annotations, stronger mutation operators that target the type information in programs are required for it to be an effective analysis tool.
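    MutPy implements many mutation operators; the paper's twist is to use a type checker such as mypy, rather than a test suite, as the detector. A stripped-down sketch of the underlying mutate-and-check loop, with a plain assertion standing in for the detector and an invented toy operator:

    ```python
    import ast

    class FlipAddToSub(ast.NodeTransformer):
        """A toy mutation operator: replace the first '+' with '-'."""
        def __init__(self):
            self.mutated = False

        def visit_BinOp(self, node):
            self.generic_visit(node)
            if isinstance(node.op, ast.Add) and not self.mutated:
                self.mutated = True
                node.op = ast.Sub()
            return node

    SRC = "def inc(x):\n    return x + 1\n"

    def mutant_killed(src, check):
        """Generate one mutant of src and report whether `check` detects it."""
        tree = FlipAddToSub().visit(ast.parse(src))
        ast.fix_missing_locations(tree)
        env = {}
        exec(compile(tree, "<mutant>", "exec"), env)
        try:
            check(env["inc"])
            return False  # mutant survived: the check is too weak
        except AssertionError:
            return True   # mutant killed
    ```

    The paper's finding, in these terms: detectors based on today's type annotations kill few mutants unless the operators specifically perturb type information.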

  • MuCheck: An Extensible Tool for Mutation Testing of Haskell Programs
    Date: 2020–06–16 Student: Thomas Beer, Advisor: Sebastian Schweikl

    Duc Le, Mohammad Amin Alipour, Rahul Gopinath, and Alex Groce. 2014. MuCheck: An Extensible Tool for Mutation Testing of Haskell Programs. In International Symposium on Software Testing and Analysis (ISSTA), July 21–27, 2014, San Jose, CA, USA. ACM, 4 pages. https://doi.org/10.1145/2610384.2628052. Technical Report. Bitbucket Repository.

    This paper presents MuCheck, a mutation testing tool for Haskell programs. MuCheck is a counterpart to the widely used QuickCheck random testing tool for functional programs, and can be used to evaluate the efficacy of QuickCheck property definitions. The tool implements mutation operators that are specifically designed for functional programs, and makes use of the type system of Haskell to achieve a more relevant set of mutants than otherwise possible. Mutation coverage is particularly valuable for functional programs due to highly compact code, referential transparency, and clean semantics; these make augmenting a test suite or specification based on surviving mutants a practical method for improved testing.

  • Python Predictive Analysis for Bug Detection
    Date: 2020–06–23 Student: Salin Kumar, Advisor: Stephan Lukasczyk

    Zhaogui Xu, Peng Liu, Xiangyu Zhang, and Baowen Xu. 2016. Python Predictive Analysis for Bug Detection. In Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE), November 13–18, 2016, Seattle, WA, USA. ACM, 11 pages. https://doi.org/10.1145/2950290.2950357. Bitbucket Repository.

    Python is a popular dynamic language that allows quick software development. However, Python program analysis engines are largely lacking. In this paper, we present a Python predictive analysis. It first collects the trace of an execution, and then encodes the trace and unexecuted branches to symbolic constraints. Symbolic variables are introduced to denote input values, their dynamic types, and attribute sets, to reason about their variations. Solving the constraints identifies bugs and their triggering inputs. Our evaluation shows that the technique is highly effective in analyzing real-world complex programs with a lot of dynamic features and external library calls, due to its sophisticated encoding design based on traces. It identifies 46 bugs from 11 real-world projects, with 16 new bugs. All reported bugs are true positives.

  • Deep Learning Type Inference
    Date: 2020–06–23 Student: Patric Feldmeier, Advisor: Stephan Lukasczyk

    Vincent J. Hellendoorn, Christian Bird, Earl T. Barr, and Miltiadis Allamanis. 2018. Deep Learning Type Inference. In Proceedings of the 26th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE '18), November 4–9, 2018, Lake Buena Vista, FL, USA. ACM, 11 pages. https://doi.org/10.1145/3236024.3236051. GitHub Repository.

    Dynamically typed languages such as JavaScript and Python are increasingly popular, yet static typing has not been totally eclipsed: Python now supports type annotations and languages like TypeScript offer a middle-ground for JavaScript: a strict superset of JavaScript, to which it transpiles, coupled with a type system that permits partially typed programs. However, static typing has a cost: adding annotations, reading the added syntax, and wrestling with the type system to fix type errors. Type inference can ease the transition to more statically typed code and unlock the benefits of richer compile-time information, but is limited in languages like JavaScript as it cannot soundly handle duck-typing or runtime evaluation via eval. We propose DeepTyper, a deep learning model that understands which types naturally occur in certain contexts and relations and can provide type suggestions, which can often be verified by the type checker, even if it could not infer the type initially. DeepTyper leverages an automatically aligned corpus of tokens and types to accurately predict thousands of variable and function type annotations. Furthermore, we demonstrate that context is key in accurately assigning these types and introduce a technique to reduce overfitting on local cues while highlighting the need for further improvements. Finally, we show that our model can interact with a compiler to provide more than 4,000 additional type annotations with over 95% precision that could not be inferred without the aid of DeepTyper.

  • Python Probabilistic Type Inference with Natural Language Support
    Date: 2020–06–30 Student: Julian Einwanger, Advisor: Stephan Lukasczyk

    Zhaogui Xu, Xiangyu Zhang, Lin Chen, Kexin Pei, and Baowen Xu. 2016. Python Probabilistic Type Inference with Natural Language Support. In Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE), November 13–18, 2016, Seattle, WA, USA. ACM, 11 pages. https://doi.org/10.1145/2950290.2950343. Artifact Website.

    We propose a novel type inference technique for Python programs. Type inference is difficult for Python programs due to their heavy dependence on external APIs and the dynamic language features. We observe that Python source code often contains a lot of type hints such as attribute accesses and variable names. However, such type hints are not reliable. We hence propose to use probabilistic inference to allow the beliefs of individual type hints to be propagated, aggregated, and eventually converge on probabilities of variable types. Our results show that our technique substantially outperforms a state-of-the-art Python type inference engine based on abstract interpretation.
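    The aggregation step can be illustrated with a crude stand-in. The paper builds a proper probabilistic model that propagates beliefs along data flow; here the hints, their weights, and the variable are all invented for illustration:

    ```python
    from collections import Counter

    def aggregate_hints(hints):
        """Combine unreliable (type, weight) votes for one variable into
        a normalised probability distribution over candidate types."""
        votes = Counter()
        for typ, weight in hints:
            votes[typ] += weight
        total = sum(votes.values())
        return {t: w / total for t, w in votes.items()}

    # invented hints for a variable named 'count' that also occurs in 'count + 1'
    hints = [("int", 0.6), ("str", 0.2), ("int", 0.8)]
    ```

    Each individual hint may be wrong, but agreement among several weak hints makes one type clearly the most probable, which is the intuition the paper exploits.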

  • PyGGI 2.0: Language Independent Genetic Improvement Framework
    Date: 2020–06–30 Student: Constantin Wenger, Advisor: Stephan Lukasczyk

    Gabin An, Aymeric Blot, Justyna Petke, and Shin Yoo. 2019. PyGGI 2.0: Language Independent Genetic Improvement Framework. In Proceedings of the ACM Joint Meeting of European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), August 26–30, 2019, Tallinn, Estonia. ACM, 5 pages. https://doi.org/10.1145/3338906.3341184. GitHub Repository.

    PyGGI is a research tool for Genetic Improvement (GI) that is designed to be versatile and easy to use. We present version 2.0 of PyGGI, the main feature of which is an XML-based intermediate program representation. It allows users to easily define GI operators and algorithms that can be reused with multiple target languages. Using the new version of PyGGI, we present two case studies. First, we conduct an Automated Program Repair (APR) experiment with the QuixBugs benchmark, one that contains defective programs in both Python and Java. Second, we replicate an existing work on runtime improvement through program specialisation for the MiniSAT satisfiability solver. PyGGI 2.0 was able to generate a patch for a bug not previously fixed by any APR tool. It was also able to achieve 14% runtime improvement in the case of MiniSAT. The presented results show the applicability and the expressiveness of the new version of PyGGI. A video of the tool demo is at: https://youtu.be/PxRUdlRDS40.
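    PyGGI's operators work on its XML-based program representation; the loop below only sketches the generic search skeleton behind such tools, with an invented toy representation in place of real code:

    ```python
    import random

    def improve(program, fitness, mutate, steps=200, seed=0):
        """Skeleton of a genetic-improvement local search: repeatedly
        mutate the program representation and keep a variant whenever
        fitness improves (lower is better, e.g. runtime or failing tests)."""
        rng = random.Random(seed)
        best, best_fit = program, fitness(program)
        for _ in range(steps):
            variant = mutate(best, rng)
            variant_fit = fitness(variant)
            if variant_fit < best_fit:
                best, best_fit = variant, variant_fit
        return best, best_fit

    # toy representation: a list of 'statements' (integers); mutation mimics
    # statement deletion by zeroing one entry, fitness counts what remains
    def delete_one(p, rng):
        j = rng.randrange(len(p))
        return [0 if i == j else x for i, x in enumerate(p)]

    def size(p):
        return sum(1 for x in p if x != 0)
    ```

    Real GI fitness functions run the test suite or time the program; the search structure, however, is as simple as shown here.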

Important: Attendance during the seminar sessions is compulsory!