artifacts is a CLI toolkit for static triage of suspicious APKs. It surfaces known strings, network indicators, manifest permissions/anomalies, and suggests likely malware families by comparing the findings against a LiteJDB dataset. Use it for the very first pass before switching to heavyweight tools such as Jadx, Bytecode-Viewer, or dynamic sandboxes.
- Key Features
- apkInspector Integration
- Requirements & Installation
- Quick Start
- CLI Commands
- Datasets & Reporting
- Contributing
- Developers
- License
- Robust extraction – `apkInspector.headers.ZipEntry` lets us unpack obfuscated or tampered APKs directly inside the working temp folder, avoiding the limitations of Python's `zipfile`.
- Manifest decoding – `apkInspector.axml.parse_apk_for_manifest` decodes `AndroidManifest.xml` even when it is still in binary form, so we can read permissions/applicationId, package name, and launcher activity without fully expanding the archive.
- String & IOC hunting – regexes for Base64, URLs/IPs, Telegram IDs, plus curated network/root indicators stored in `data/*.json`.
- Similarity scoring – compares the extracted permission/application sets against the LiteJDB database (`data/patterns.json`) to suggest the closest family match.
- Actionable reports – `-r` builds a structured JSON report, `-s` prints similarity tables, `--activity` dumps the decoded manifest details.
ℹ️ Disclaimer: outputs are heuristics meant for triage and can produce false positives. Always confirm with additional tooling or manual review.
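The string and IOC hunting feature listed above can be illustrated with a small sketch. The patterns and the `extract_iocs` helper below are assumptions made for illustration, not the project's actual regexes, and Telegram IDs and the curated `data/*.json` indicators are not covered. The output mirrors the `[encoded, decoded]` pairs shown in the sample report.

```python
import base64
import binascii
import re

# Illustrative patterns only (assumption): the project's real regexes may differ.
_B64 = r"[A-Za-z0-9+/]"
BASE64_RE = re.compile(
    rf"(?<!{_B64})(?:{_B64}{{4}}){{4,}}(?:{_B64}{{2}}==|{_B64}{{3}}=)?(?![A-Za-z0-9+/=])"
)
_OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
IPV4_RE = re.compile(rf"\b(?:{_OCTET}\.){{3}}{_OCTET}\b")
URL_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*://[^\s\"'<>]+")


def find_base64(text):
    """Return [encoded, decoded] pairs for candidates that decode to printable ASCII."""
    pairs = []
    for candidate in BASE64_RE.findall(text):
        try:
            decoded = base64.b64decode(candidate, validate=True).decode("ascii")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid Base64, or not ASCII text once decoded
        if decoded.isprintable():
            pairs.append([candidate, decoded])
    return pairs


def extract_iocs(text):
    """Collect de-duplicated, sorted network indicators plus decoded Base64 strings."""
    return {
        "ip": sorted(set(IPV4_RE.findall(text))),
        "url": sorted(set(URL_RE.findall(text))),
        "base64": find_base64(text),
    }


sample = "host=MTc4LjIzNi4yNDcuMTI0 dns 8.8.8.8 and 1.1.1.1 via tg://telegram.org"
print(extract_iocs(sample))
```

Decoding each Base64 candidate and keeping only printable results is one simple way to cut down noise, although short random strings can still slip through, which is why the disclaimer above applies.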
This project relies on apkInspector to reliably unpack APKs even when the ZIP structure is malformed or heavily obfuscated, and to decode AndroidManifest.xml straight from the package so permissions and components remain readable.
```bash
git clone https://github.com/guelfoweb/artifacts.git
cd artifacts
python3 -m venv .venv && source .venv/bin/activate  # recommended
pip install -r requirements.txt
```

Main dependencies:
- Python 3.10+
- apkInspector
- LiteJDB
- prettytable
apkInspector is installed from the upstream GitHub repository (branch main) so this project can consume fixes before the next PyPI release.
If you want reproducible builds, pin a specific commit instead of main in both requirements.txt and pyproject.toml:
```text
apkInspector @ git+https://github.com/erev0s/apkInspector.git@<commit_sha>
```
```bash
# Full help
python3 artifacts.py -h

# Run analysis + JSON report + similarity table
python3 artifacts.py sample.apk -r -s

# List every family stored in LiteJDB
python3 artifacts.py --list-all
```

Sample output (truncated):
```json
{
  "version": "1.1.4",
  "md5": "ab879f4e8f9cf89652f1edd3522b873d",
  "sha1": "0b36f0f2ddfd73be3452dc955a6c97c45b4f78e9",
  "sha256": "4d0ec8e8c1fbbac866f38b5df31b3a775f482373789a9b6d33ad7bfb17e8c576",
  "package_name": "fdgfhfgjhfgj.dfgdfgdfhfjfgj.sdgdfgdfhd",
  "main_activity": "com.brkwl.apkstore.MainActivity",
  "dex": ["classes.dex"],
  "network": {
    "ip": ["1.1.1.1", "8.8.8.8"],
    "url": ["tg://telegram.org"]
  },
  "string": {
    "base64": [["MTc4LjIzNi4yNDcuMTI0", "178.236.247.124"]],
    "known": ["ping"]
  },
  "activity_counts": {
    "permission": 12,
    "application": 4,
    "intent": 7
  },
  "family": {
    "name": "SpyNote Italy 10/2023",
    "match": 100.0
  }
}
```

- Feature extraction – `lib/manifest.py` parses the decoded manifest and normalizes the three indicator buckets (`permission`, `application`, `intent`) as alphabetically sorted lists.
- Family dataset – each entry in `data/patterns.json` provides the expected indicators for a known malware family. When you run `artifacts.py … -s`, the CLI loads this dataset via LiteJDB.
- Per-bucket scoring – for every family we compute the Jaccard similarity between the APK bucket and the family bucket (e.g., `permission_score = |perm_apk ∩ perm_family| / |perm_apk ∪ perm_family| * 100`). The same formula is applied to `application` and `intent`.
- Final score – the reported `family.match` is the arithmetic mean of the three bucket scores (all equally weighted) and is expressed as a percentage (e.g., `21.48` means ~21.48%). The `family.value` object surfaces the individual bucket percentages so you can tell why a match ranked higher (e.g., strong intent overlap but few shared permissions).
- Interpreting the report – identical APK permissions can still yield different percentages if the top-ranked family changes, because each family contributes its own reference set. As a rule of thumb, scores below ~50% should be treated as weak evidence (possible false positives or new families) and warrant manual review or dataset enrichment. Log the similarity table (`-s`) to compare how your indicators intersect with multiple candidates.
- Raw counts – the `activity_counts` field in the main JSON result shows how many unique permissions, components, and intents were extracted from the APK, independent of any family match. This is useful when profiling previously unseen samples.
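The per-bucket Jaccard score and the equally weighted mean described above can be sketched in a few lines. This is an illustration of the formula, not the project's code: the function names, the input dictionaries, the rounding to two decimals, and the choice to score two empty buckets as 0 are all assumptions.

```python
BUCKETS = ("permission", "application", "intent")


def jaccard_percent(apk_items, family_items):
    """Jaccard similarity |A ∩ B| / |A ∪ B|, expressed as a percentage."""
    a, b = set(apk_items), set(family_items)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty buckets carry no evidence
    return len(a & b) / len(union) * 100


def score_family(apk, family):
    """Return the mean score plus the per-bucket percentages for one family."""
    raw = {
        bucket: jaccard_percent(apk.get(bucket, []), family.get(bucket, []))
        for bucket in BUCKETS
    }
    return {
        "match": round(sum(raw.values()) / len(BUCKETS), 2),
        "value": {bucket: round(score, 2) for bucket, score in raw.items()},
    }


# Hypothetical indicator sets, made up for the example.
apk = {
    "permission": ["INTERNET", "READ_SMS", "SEND_SMS"],
    "application": ["service"],
    "intent": ["BOOT_COMPLETED", "MAIN"],
}
family = {
    "permission": ["INTERNET", "READ_SMS"],
    "application": ["receiver", "service"],
    "intent": ["MAIN"],
}

result = score_family(apk, family)
print(result)
```

Here the permission bucket shares 2 of 3 distinct items (66.67), while the application and intent buckets each share 1 of 2 (50.0), so the mean is 55.56. Because the union is in the denominator, indicators present in only one of the two sets lower the score, which is why the same APK can score differently against each family.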
| Flag | Description |
|---|---|
| `-h, --help` | Show the inline help. |
| `-v, --version` | Print the tool version. |
| `-r, --report` | Emit a structured JSON report. |
| `-s, --similarity` | Display the similarity table against the family DB. |
| `-a, --activity` | Dump decoded manifest activities/permissions. |
| `-l, --list-all` | List all families tracked in LiteJDB with the number of permissions, applications, and intents each entry defines. |
| `--fast` | Faster scan: skip regex-heavy IOC extraction (network, root, string). |
| `--del NAME` | Remove a family from the DB. |
| `--add NAME` | Add the currently analyzed APK (pass the malware sample as the positional argument) to the DB under NAME. |
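When triaging several samples, the flags above can be assembled from a script. The sketch below is a hypothetical wrapper, not part of the project: `build_command` and `run_triage` are illustrative names, it assumes `artifacts.py` is in the current working directory, and it assumes `--fast` can be combined with the other flags.

```python
import subprocess
import sys


def build_command(apk_path, report=True, similarity=True, fast=False,
                  script="artifacts.py"):
    """Assemble the argument list for one analysis run."""
    cmd = [sys.executable, script, apk_path]
    if report:
        cmd.append("-r")      # structured JSON report
    if similarity:
        cmd.append("-s")      # similarity table against the family DB
    if fast:
        cmd.append("--fast")  # skip regex-heavy IOC extraction
    return cmd


def run_triage(apk_path, **options):
    """Run the CLI and return the completed process without raising on failure."""
    return subprocess.run(
        build_command(apk_path, **options),
        capture_output=True,
        text=True,
        check=False,
    )


# Show the command that would be executed for one sample.
print(build_command("sample.apk"))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with unusual APK file names, which matters when the samples are untrusted.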
- `data/patterns.json` – defines known families, expected permissions, and application names. Updating this file improves matching without touching the Python code.
- `data/permission_categories.json` / `permission_description.json` – provide human-readable descriptions used in reports.
- The `-r` flag outputs category-based JSON (LOCATION, NETWORK, etc.) that can be pasted into tickets or knowledge bases.
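Because the output is JSON, it can also be condensed by a script before it goes into a ticket. The following is a minimal sketch that assumes the result has the fields shown in the sample output (`sha256`, `package_name`, `network`, `family`); the category-based layout produced by `-r` may differ, and `summarize_result` is a hypothetical helper. The 50% cut-off is the rule of thumb from this README, not a value enforced by the tool.

```python
import json

WEAK_MATCH_THRESHOLD = 50.0  # rule of thumb: below this, treat the match as weak


def summarize_result(result):
    """Condense one analysis result into a short, ticket-friendly summary."""
    family = result.get("family") or {}
    match = family.get("match")
    if match is None:
        verdict = "no family match"
    elif match < WEAK_MATCH_THRESHOLD:
        verdict = "weak match, review manually"
    else:
        verdict = "candidate family"
    network = result.get("network") or {}
    return {
        "sha256": result.get("sha256"),
        "package_name": result.get("package_name"),
        "family": family.get("name"),
        "match": match,
        "verdict": verdict,
        "network_indicators": len(network.get("ip", [])) + len(network.get("url", [])),
    }


# Trimmed copy of the sample output, used here as test input.
raw = """
{
  "sha256": "4d0ec8e8c1fbbac866f38b5df31b3a775f482373789a9b6d33ad7bfb17e8c576",
  "package_name": "fdgfhfgjhfgj.dfgdfgdfhfjfgj.sdgdfgdfhd",
  "network": {"ip": ["1.1.1.1", "8.8.8.8"], "url": ["tg://telegram.org"]},
  "family": {"name": "SpyNote Italy 10/2023", "match": 100.0}
}
"""
summary = summarize_result(json.loads(raw))
print(json.dumps(summary, indent=2))
```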
- Open an issue describing the bug/feature.
- Work in a dedicated branch (e.g., `feature/apk-inspector-upgrade`).
- Run the core commands (`python3 artifacts.py sample.apk -r -s`, `--list-all`) and attach the output to your PR.
- Use imperative commit messages (e.g., `Add apkInspector extraction`).
See AGENTS.md for extended contributor guidance (coding style, security hygiene, etc.).
Released under GPL-3.0. Treat every APK as hostile: operate inside isolated VMs, never commit binaries to the repo, and sanitize sensitive data before sharing artifacts.