Automated Test Input Generation for Android: Are We There Yet?
Paper accepted at the 30th IEEE/ACM International Conference on Automated Software Engineering (ASE 2015)
Early version of the publication is available at arxiv.org/abs/1503.07217
The goal of this project is to compare state-of-the-art test input generation techniques for Android. A lot of recent research has gone into developing such techniques, which differ in the way they generate inputs, the strategy they use to explore the behavior of the app under test, and the specific heuristics they use. To better understand the strengths and weaknesses of these existing approaches, and to gain general insight into ways they could be made more effective, in this project we perform a thorough comparison of the main existing test input generation tools for Android. In our comparison, we evaluate the effectiveness of these tools, and their corresponding techniques, according to four metrics: code coverage, ability to detect faults, ability to work on multiple platforms, and ease of use.
List of Tools
The following table lists the tools used in our study.
| # | Tool | Publication |
|---|------|-------------|
| 1 | Monkey | N/A, part of the Android SDK (Google Inc.) |
| 2 | Acteve | Automated Concolic Testing of Smartphone Apps. Saswat Anand, Mayur Naik, Hongseok Yang, and Mary Jean Harrold. FSE'12: ACM Symposium on Foundations of Software Engineering. |
| 3 | Dynodroid | Dynodroid: An Input Generation System for Android Apps. Aravind Machiry, Rohan Tahiliani, and Mayur Naik. FSE'13: ACM Symposium on Foundations of Software Engineering. |
| 4 | A3E | Targeted and Depth-first Exploration for Systematic Testing of Android Apps. Tanzirul Azim and Iulian Neamtiu. OOPSLA'13: Object-Oriented Programming, Systems, Languages, and Applications. |
| 5 | SwiftHand | Guided GUI Testing of Android Apps with Minimal Restart and Approximate Learning. Wontae Choi, George Necula, and Koushik Sen. OOPSLA'13: Object-Oriented Programming, Systems, Languages, and Applications. |
| 6 | GUIRipper (a.k.a. MobiGUITAR) | MobiGUITAR -- A Tool for Automated Model-Based Testing of Mobile Apps. Domenico Amalfitano, Anna Rita Fasolino, Porfirio Tramontana, Bryan Dzung Ta, and Atif M. Memon. IEEE Software, vol. PP, issue 99, April 2014. |
| 7 | PUMA | PUMA: Programmable UI-Automation for Large-Scale Dynamic Analysis of Mobile Apps. Shuai Hao, Bin Liu, Suman Nath, William G.J. Halfond, and Ramesh Govindan. MobiSys'14: Mobile Systems, Applications, and Services. |
For our experiments, we set up each tool along with the benchmark apps on a common virtualized Linux infrastructure. Each tool is run on each benchmark 10 times to account for non-deterministic factors, and we report both the best and the mean coverage in our results. Our results also report the failures (unhandled exceptions) triggered by each tool during the 10 executions, which we attribute to the tool's fault-detection capability.
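As a rough illustration, the best and mean coverage across the 10 runs of a benchmark can be computed with a small awk one-liner; the per-run coverage values below are fabricated for the example.

```shell
# 10 per-run statement coverage values (fabricated, for illustration only).
printf '%s\n' 41 40 44 39 42 43 40 41 42 38 |
  awk '{ sum += $1; if ($1 > best) best = $1 }
       END { printf "best=%d mean=%.1f\n", best, sum/NR }'
# → best=44 mean=41.0
```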
Our experimental benchmarks are the union of all open-source benchmarks used in the tools' evaluations. The following chart shows the distribution of the categories of these apps.
Evaluation Results (Note: Click on charts to expand)
1. Pairwise comparison of coverage achieved and failures triggered.
This chart shows a pairwise comparison of the tools in terms of coverage and failures. The pairwise statement coverage information is shown above the diagonal (top right; white background), and the percentage of statements covered by both tools is highlighted in grey. Similarly, pairwise failure information is shown below the diagonal (bottom left; yellow background). Failures are unhandled exceptions that originate from the mobile application under test (i.e., the stack trace contains the application's package name).
NOTE: We could not obtain the statement coverage information from SwiftHand due to technical limitations of its underlying framework. Hence, for SwiftHand, we only compare the failures it triggers in the applications.
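The attribution rule above (a failure belongs to the app only if its stack trace contains the app's package name) can be sketched with a small grep pipeline; the file name tool.logcat matches our result files, while the package name com.example.app and the logcat excerpt are fabricated for the example.

```shell
PKG="com.example.app"   # illustrative package name of the app under test
# Fabricated logcat excerpt of the kind saved as tool.logcat:
cat > tool.logcat <<'EOF'
E/AndroidRuntime( 123): FATAL EXCEPTION: main
E/AndroidRuntime( 123): java.lang.NullPointerException
E/AndroidRuntime( 123): 	at com.example.app.MainActivity.onCreate(MainActivity.java:10)
EOF
# A crash counts only if the trace names the app's own package
# (crashes purely inside the framework or the tool are excluded):
grep 'E/AndroidRuntime' tool.logcat | grep -q "at $PKG" \
  && echo "app failure" || echo "not an app failure"
# → app failure
```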
2. Variance of statement coverage achieved by tools on benchmark apps.
This chart shows the cross-benchmark variance in statement coverage obtained by the different tools. For each application, we used the mean statement coverage over the 10 runs.
3. Progress of coverage for each tool on benchmark apps.
This chart shows the progress of the statement coverage achieved by each tool at 5-minute intervals. For each application, we used the mean statement coverage over the 10 runs.
4. Unique failures triggered by tools on benchmark apps across 10 runs
This chart reports the cumulative failures triggered in the benchmark applications by the tools across 10 runs. The chart reports unique failures, where uniqueness is determined by the stack trace associated with the failure.
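The uniqueness criterion can be sketched as follows: fingerprint each stack trace (here, by hashing its text) and count the distinct fingerprints. The traces/ directory and its contents are fabricated for the example.

```shell
# Fabricated traces: run1 and run2 hit the same crash, run3 a different one.
mkdir -p traces
echo "java.lang.NullPointerException at A.b(A.java:1)" > traces/run1.txt
echo "java.lang.NullPointerException at A.b(A.java:1)" > traces/run2.txt
echo "java.lang.IllegalStateException at C.d(C.java:9)" > traces/run3.txt
# Fingerprint each trace by its content hash; distinct hashes = unique failures.
md5sum traces/*.txt | cut -d' ' -f1 | sort -u | wc -l
# → 2
```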
Our experimental infrastructure contains all the tools, benchmark applications and scripts used in our empirical evaluation.
To use our virtual machine, you will need to download and install the VirtualBox and Vagrant tools. If you would like to see the GUI of the VM, you also need to install the VirtualBox extension pack. Once both tools are installed, follow the steps below to set up our VM.
- In a terminal, add the Androtest box to Vagrant:

```shell
$ vagrant box add androtest http://bear.cc.gatech.edu/~shauvik/androtest/boxes/androtest_v2.box
```

If you already downloaded the VM, use its file path instead of the URL.
- Create a directory, say ~/vagrant/androtest, to host the Vagrant machine, and download this Vagrantfile inside it. This file contains the VirtualBox VM configuration that Vagrant uses. Note that the configuration defines 10 VM instances labelled run1-run10.

```shell
$ mkdir -p ~/vagrant/androtest
$ cd ~/vagrant/androtest
$ wget http://bear.cc.gatech.edu/~shauvik/androtest/boxes/Vagrantfile
```
- You can start the VM using vagrant up. Vagrant will create all virtual machines on your computer (run1-run10) and start them. To start only one (or a few) VMs, pass the VM name(s) as parameters. Once the VM has booted up, log in to it using SSH.

```shell
$ vagrant up run1
$ vagrant ssh run1
vagrant@run1:~$ ls -1
android-ndk-r10     #--> Android NDK
android-sdk-linux   #--> Android SDK
lib                 #--> Libraries needed by tools
scripts             #--> Scripts for experiments (invokes tools)
subjects            #--> Open source android app benchmarks
tools               #--> Android test input generation tools
vagrant@run1:~$ ls /vagrant
Vagrantfile         #--> Host machine directory (i.e., ~/vagrant/androtest) is mounted as /vagrant on the Vagrant box
```
- To start the experiments, run ~/scripts/run_[tool].sh to launch the input generation tool on all benchmarks. Example for Monkey below.

```shell
vagrant@run1:~$ cd scripts
vagrant@run1:~/scripts$ bash -x run_monkey.sh
```

This command runs the Monkey tool on all benchmarks. Results are saved in the /vagrant/results directory in the VM, which is the ~/vagrant/androtest/results directory on the host machine.
- To run a tool on different/selected benchmarks, edit ~/scripts/projects.txt to list the names of those benchmarks. Then, in ~/scripts/run_[tool].sh, comment/uncomment the following lines to pick subjects from this file.

```diff
- for p in `ls -d */`; do
- #for p in `cat $DIR/projects.txt`; do
+ #for p in `ls -d */`; do
+ for p in `cat $DIR/projects.txt`; do
```
Understanding the result files
Results are stored under results/run[id]/[tool]/[benchmark]/. Here is a description of the files.
- tool.log -- log generated by the tool
- tool.logcat -- logcat from the emulator, while tool was running (contains failure stack traces)
- install.log -- installation log of the app on the device
- icoverage -- log of intermediate coverage collected
- coverage.em -- Emma coverage metadata
- coverage.ec or coverage.es -- complete coverage file
- coverage[1-11].ec -- snapshots of progressive coverage collected every 5 minutes.
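Since coverage.em and coverage.ec are Emma files, they can be turned into a readable report with Emma's report tool. The emma.jar path below is an assumption (older Android SDKs shipped it under tools/lib); adjust it to your setup.

```shell
# Assumed location of emma.jar; adjust to wherever it lives on your machine.
EMMA_JAR="$HOME/android-sdk-linux/tools/lib/emma.jar"
if [ -f "$EMMA_JAR" ]; then
  # Merge the metadata and runtime coverage into a plain-text report.
  java -cp "$EMMA_JAR" emma report -r txt -in coverage.em,coverage.ec
else
  echo "emma.jar not found at $EMMA_JAR"
fi
```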
- To connect graphically to the VM, find the VRDE port by running this script, and connect to that port on the host machine using a remote desktop client (e.g., MS Remote Desktop Connection, which comes with Office).
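If the script is not at hand, the VRDE port can also be queried directly from VirtualBox on the host; the VM name run1 is an assumption (list the actual names with VBoxManage list vms).

```shell
# Ask VirtualBox for the VRDE port of a VM (VM name "run1" is assumed).
if command -v VBoxManage >/dev/null 2>&1; then
  VBoxManage showvminfo "run1" --machinereadable | grep '^vrdeport'
else
  echo "VBoxManage not on PATH"
fi
```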
- If the /vagrant directory is not mounted in the VM, the guest additions in the VM probably need to be updated to match the host. The vagrant-vbguest plugin automatically keeps guest additions up to date.
- For general help with the virtual machine infrastructure, consult the official VirtualBox and Vagrant docs, and also search on Stack Overflow.