<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by +Ch0pin🕷️ on Medium]]></title>
        <description><![CDATA[Stories by +Ch0pin🕷️ on Medium]]></description>
        <link>https://medium.com/@valsamaras?source=rss-ded5e114da13------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*4pO69EQh6BkPjK262b-zhA.jpeg</url>
            <title>Stories by +Ch0pin🕷️ on Medium</title>
            <link>https://medium.com/@valsamaras?source=rss-ded5e114da13------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Thu, 07 May 2026 13:03:14 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@valsamaras/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Fuzzing Android binaries using AFL++ Frida Mode]]></title>
            <link>https://valsamaras.medium.com/fuzzing-android-binaries-using-afl-frida-mode-57a49cf2ca43?source=rss-ded5e114da13------2</link>
            <guid isPermaLink="false">https://medium.com/p/57a49cf2ca43</guid>
            <dc:creator><![CDATA[+Ch0pin️]]></dc:creator>
            <pubDate>Tue, 14 May 2024 10:01:02 GMT</pubDate>
            <atom:updated>2024-05-14T10:01:02.899Z</atom:updated>
            <content:encoded><![CDATA[<p>You might find this to be a fitting prologue to my earlier post on <a href="https://valsamaras.medium.com/creating-and-using-jvm-instances-in-android-c-c-applications-c289415b9dbd">Creating and using JVM instances in Android C/C++ applications</a>… and you are right! Consider this my way of enticing you by presenting the end goal you’ll eventually reach. After all, it’s not uncommon to wade through various write-ups without a clear understanding of their objectives.</p><p>With that said, if you’re interested in exploring fuzzing, this is a step-by-step guide to configuring AFL++ and using it to fuzz Android binaries. I’ll try to keep it short and avoid boring paragraphs of the <em>how-to-set-up-your-Android-pentest-lab</em> type. After all, if you don’t know what fuzzing or AFL is, there are thousands of write-ups out there to get you started.</p><p>I followed these steps to set up AFL++ (Frida mode) on my MacBook Pro M1 running macOS Sonoma 14.4.1, but I doubt you’ll encounter many challenges on your own system.</p><h3>Setting up AFL++</h3><ol><li><strong>Download the latest release here: </strong><a href="https://github.com/AFLplusplus/AFLplusplus/releases/">https://github.com/AFLplusplus/AFLplusplus/releases/</a> and extract the compressed files.</li><li><strong>Install the </strong><a href="https://developer.android.com/ndk/guides"><strong>Android-ndk</strong></a><strong> using brew:</strong></li></ol><pre>$ brew install --cask android-ndk<br><br>$ export ANDROID_NDK_HOME=&quot;/opt/homebrew/share/android-ndk&quot;</pre><p>Set ANDROID_NDK_HOME persistently, so you won’t need to redefine it every time you start a shell session. Depending on your OS and shell, you may add the line export ANDROID_NDK_HOME=&#39;/opt/homebrew/share/android-ndk&#39; to your shell configuration file (e.g. ~/.zshrc if you are using zsh).</p><p><strong>3. 
Download the following CMake file and save it under the directory where you extracted AFL++ (in step 1):</strong></p><p><a href="https://github.com/Ch0pin/android-fuzzing/blob/main/AFLplusplus/CMakeLists.txt">https://github.com/Ch0pin/android-fuzzing/blob/main/AFLplusplus/CMakeLists.txt</a></p><p>If you face any issue with the above (for example on macOS, whose stat does not accept the GNU --printf option), you might have to change this part:</p><pre>execute_process(<br>  COMMAND<br>  bash -c &quot;echo &#39;unsigned char api_js[] = {&#39; &gt; ${API_C}; \<br>  xxd -p -c 12 ${API_JS} | sed -e \&quot;s/\\([0-9a-f]\\{2\\}\\)/0x\\1, /g\&quot; \<br>                         | sed -e \&quot;s/^/  /\&quot; &gt;&gt; ${API_C}; \<br>  echo &#39;};&#39; &gt;&gt; ${API_C}; \<br>  echo \&quot;unsigned int api_js_len = $(stat --printf=&#39;%s&#39; ${API_JS});\&quot; \<br>     &gt;&gt; ${API_C}&quot;<br> )</pre><p>As follows:</p><pre>execute_process(<br>  COMMAND<br>  bash -c &quot;echo &#39;unsigned char api_js[] = {&#39; &gt; ${API_C}; \<br>  xxd -p -c 12 ${API_JS} | sed -e \&quot;s/\\([0-9a-f]\\{2\\}\\)/0x\\1, /g\&quot; \<br>                         | sed -e \&quot;s/^/  /\&quot; &gt;&gt; ${API_C}; \<br>  echo &#39;};&#39; &gt;&gt; ${API_C}; \<br>  echo \&quot;unsigned int api_js_len = $(stat ${API_JS} | cut -d &#39; &#39; -f 8);\&quot; \<br>     &gt;&gt; ${API_C}&quot;<br> )</pre><p><strong>4. Save the following script under the directory where you extracted AFL++ and run it to compile afl-fuzz and afl-frida-trace.so:</strong></p><pre>mkdir build &amp;&amp; cd build<br>cmake -DANDROID_PLATFORM=31 \<br>      -DCMAKE_TOOLCHAIN_FILE=/opt/android-ndk-r25c/build/cmake/android.toolchain.cmake \<br>      -DANDROID_ABI=arm64-v8a ..<br>make</pre><p>You may need to change the CMAKE_TOOLCHAIN_FILE value to match the location of your NDK. 
If the NDK was installed with Homebrew, look under <strong>/opt/homebrew/Caskroom/android-ndk</strong>.</p><h3><strong>If everything worked as expected</strong></h3><p>You will find afl-fuzz and afl-frida-trace.so under the ./build directory. Use adb to push these binaries to /data/local/tmp:</p><pre>$ adb push afl* /data/local/tmp</pre><p>Give afl-fuzz execute permission. If you followed <a href="https://valsamaras.medium.com/creating-and-using-jvm-instances-in-android-c-c-applications-c289415b9dbd">my guide,</a> you probably know what to fuzz. For instance, assuming that the binary you want to fuzz is called ‘fuzz’ :) you may start with:</p><p>./afl-fuzz -O -G 256 -i in -o out ./fuzz</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*-AwU3hhIfFP78DxWqO-Y2w.png" /></figure>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Ghost files in the shared preferences]]></title>
            <link>https://valsamaras.medium.com/ghost-files-in-the-shared-preferences-8d75226c23c0?source=rss-ded5e114da13------2</link>
            <guid isPermaLink="false">https://medium.com/p/8d75226c23c0</guid>
            <dc:creator><![CDATA[+Ch0pin️]]></dc:creator>
            <pubDate>Sun, 18 Feb 2024 16:01:08 GMT</pubDate>
            <atom:updated>2024-02-18T16:01:08.415Z</atom:updated>
            <content:encoded><![CDATA[<p>Have you ever encountered an exceptionally clever bug, only to be thwarted by an unforeseen obstacle just moments before exploiting it? Perhaps a check, initially designed for another purpose, now inadvertently blocks you from leveraging this significant bug you’ve discovered?</p><p>That’s precisely what this post aims to explore….</p><h3>The bug ? what bug 🕷️?</h3><p>Numerous <a href="https://i.blackhat.com/Asia-23/AS-23-Valsamaras-Dirty-Stream-Attack-Turning-Android.pdf">bug categories</a> can enable you to circumvent WRITE restrictions in an application’s home directory, sparking considerable excitement due to the typically high impact of such attacks. For example, achieving code execution through overwriting a native library can lead to significant repercussions, among other potential exploits.</p><p>Even in the absence of a native library, numerous avenues remain for exploiting such a bug, with the shared preferences directory being a prime target. Android applications often store connection settings within the shared_prefs directory. Should you succeed in overwriting a trusted domain within this space, you could redirect the application to communicate with a malicious server. This could lead to the unintended transmission of user tokens and other sensitive information directly to an attacker-controlled environment.</p><h3>Write is not always … Overwrite</h3><p>Many apps implement checks prior to overwriting a file; it’s quite typical, for instance, to verify the file’s existence. If the file exists, they might either halt the operation altogether or prompt for user consent for overwriting it. 
This requirement for user interaction changes the User Interaction metric in the vulnerability’s CVSS vector from None to Required and consequently lowers the overall CVSS score, reducing the perceived severity of the vulnerability.</p><p>This is exactly the obstacle that I encountered in numerous cases while trying to overwrite a file…</p><pre>if(file.exists()) abort();</pre><p>And this is how it feels….</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/640/0*yyMjA9UHvM7CLLnu.gif" /></figure><p>…and I might have given up, if it weren’t for this great file-monitoring medusa module called <a href="https://github.com/Ch0pin/medusa/blob/master/modules/file_system/file_write.med">file_write</a>… Without any direct intervention on my part, I observed unusual .bak files appearing within the shared preferences directory. The purpose of these mysterious .bak files becomes clear from the implementation of the <a href="https://cs.android.com/android/platform/superproject/main/+/main:frameworks/base/core/java/android/app/SharedPreferencesImpl.java;l=132;drc=9e2bc26eb6d703b0b03130bf042222fb8dab08ce;bpv=1;bpt=1">shared preferences class</a>:</p><pre>    SharedPreferencesImpl(File file, int mode) {<br>        mFile = file;<br>        mBackupFile = makeBackupFile(file);<br>        mMode = mode;<br>        mLoaded = false;<br>        mMap = null;<br>        mThrowable = null;<br>        startLoadFromDisk();<br>    }</pre><p>Notice makeBackupFile, which simply returns a new file with the .bak extension:</p><pre>    static File makeBackupFile(File prefsFile) {<br>        return new File(prefsFile.getPath() + &quot;.bak&quot;);<br>    }</pre><p>Let’s say you are using the example.xml file; then this method will return example.xml.bak. 
Now take a look below:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*NMnmN0qYQBvPbrgZ1EcJaw.png" /></figure><p>The startLoadFromDisk call at the end of the constructor invokes loadFromDisk, which checks whether example.xml.bak exists and, if it does, deletes example.xml and renames example.xml.bak to example.xml.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/200/0*8HvFEP32TNHSMJrM.gif" /></figure><h3>Write is always … Overwrite (when it comes to shared_prefs)</h3><p>I guess it’s clear where this is leading… Suppose you have WRITE but not OVERWRITE in the shared preferences directory. Instead of attempting to overwrite the file directly, you could simply create a .bak file next to it and let the behavior described previously work in your favor. This approach leverages the way the shared preferences mechanism handles .bak files to modify the file indirectly, circumventing the restriction on direct overwriting ;)</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Creating and using JVM instances in Android C/C++ applications]]></title>
            <link>https://valsamaras.medium.com/creating-and-using-jvm-instances-in-android-c-c-applications-c289415b9dbd?source=rss-ded5e114da13------2</link>
            <guid isPermaLink="false">https://medium.com/p/c289415b9dbd</guid>
            <category><![CDATA[jni]]></category>
            <category><![CDATA[android]]></category>
            <category><![CDATA[reverse-engineering]]></category>
            <category><![CDATA[security]]></category>
            <dc:creator><![CDATA[+Ch0pin️]]></dc:creator>
            <pubDate>Wed, 30 Aug 2023 11:35:09 GMT</pubDate>
            <atom:updated>2023-08-30T16:27:40.230Z</atom:updated>
            <content:encoded><![CDATA[<p>Considering the reader’s interest in this post, it’s reasonable to assume a certain level of familiarity with JNI and its usage. For those who stumbled upon this content by chance, a brief introduction to the subject is recommended. I invite you to explore the topic further by reading <a href="https://studentprojects.in/software-development/jni/jni-tutorial/jni-part1-java-native-interface/">THIS</a> article, which provides a foundational understanding.</p><p>While the typical perception of JNI involves utilising native code in Java applications, this post explores exactly the opposite. We’re about to dive into crafting a pure native Android app, and we will try to use Java features that a native app normally wouldn’t support.</p><blockquote>Why might we want to do such a thing?</blockquote><p>In addition to the advantages related to software development, when trying to test, fuzz, or broadly speaking, reverse-engineer native code that interacts with Java objects, you’ll inevitably reach a point where you need to isolate particular segments of code, and this is what this post is all about.</p><blockquote>So, what are we going to do?</blockquote><p>Our task is to call a Java method from an APK using a C/C++ Android application. I chose com.whatsapp version 2.23.16.76, from which we are going to call the following method (which can be found in the <em>X.2ts</em> class):</p><pre>public static X.2ts A01(byte[] bArr)</pre><p>This method takes as an argument a byte array returned by the native method:</p><pre>WebpUtils.fetchWebpMetadata(file.getAbsolutePath())</pre><p>This functionality is natively implemented in the libwhatsapp.so library. Given a file path that points to a webp file, this function returns the file&#39;s metadata as a byte array. Subsequently, the A01 method utilises this data to initialize an object of the X.2ts class, encapsulating the metadata information. 
Finally, we invoke the toString method of the X.2ts class to display the metadata of the webp file as interpreted by WhatsApp.</p><p>Our final “deliverable” is an Android binary (let’s call it <em>caller</em>) which gets a file path to a webp file from the command line and prints its metadata.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/698/1*H3nspSOeppNjZvj4_s2ZfQ.png" /><figcaption>Calling WhatsApp’s webp metadata Java method</figcaption></figure><h3>The Invocation API</h3><p>The first step towards achieving our goal is to create a Java Virtual Machine (JVM) and use it within our native application to execute our compiled Java code. The <strong>Invocation API </strong>enables software vendors to integrate the Java VM into any native application [1]. This API offers a range of functions, including the creation, attachment, detachment, and destruction of the JVM. Among these functions, JNI_CreateJavaVM stands out as one of the most important. It loads and initializes a Java VM and sets its second argument to the JNI interface pointer (JNIEnv) of the calling thread:</p><pre>JNI_CreateJavaVM(&amp;jvm, (void**)&amp;env, &amp;vm_args);</pre><p>The third parameter contains a set of arguments which are used during the VM’s initialisation. 
Conveniently, the Java Native Interface provides a structure called JavaVMInitArgs which can be used for this purpose:</p><pre>typedef struct JavaVMInitArgs {<br>   jint version;<br>   jint nOptions;<br>   JavaVMOption *options;<br>   jboolean ignoreUnrecognized;<br>} JavaVMInitArgs;<br><br>typedef struct JavaVMOption {<br>    char *optionString;<br>    void *extraInfo;<br>} JavaVMOption;</pre><p>We are going to use the JavaVMOption to define the path where our compiled Java code resides.</p><h3>Implementation</h3><p>While I was writing this post, I came across various implementations of the JVM creation process, including:</p><ul><li>One used in <a href="https://blog.quarkslab.com/android-greybox-fuzzing-with-afl-frida-mode.html">THIS</a> post by <em>Quarkslab</em></li><li>Caleb Fenton’s <a href="https://calebfenton.github.io/2017/04/14/calling_jni_functions_with_java_object_arguments_from_the_command_line/">post</a>: <em>Calling JNI Functions with Java Object Arguments from the Command Line</em></li><li>This <a href="https://gist.github.com/tewilove/b65b0b15557c770739d6">gist</a> by <em>tewilove</em></li></ul><p>Unfortunately, none of these worked for me (for various reasons in each case), so I decided to use the <a href="https://cs.android.com/android/platform/superproject/main/+/main:libnativehelper/">libnativehelper</a> approach, which did the trick. 
Beyond that, the code is very similar to the one from Quarkslab and you can find it <a href="https://github.com/Ch0pin/JNIInvocation">here</a>.</p><p>Our project consists of the following files:</p><ul><li><em>caller.c (which corresponds to our native app)</em></li><li><em>jnihelper.c (a library that we are going to use to create the JVM)</em></li><li><em>include/jenv.h (header file for our jnihelper library)</em></li><li><em>lib/libwhatsapp.so (the whatsapp library extracted from the whatsapp apk)</em></li></ul><h4>The jnihelper library (jnihelper.c)</h4><p>Before we proceed with compiling the library as well as the native app that uses it, let’s take a look at a few things. First of all, the function that we are going to use to create JVMs is the following (<a href="https://github.com/Ch0pin/JNIInvocation/blob/main/Caller/jnihelper.c">jnihelper.c</a>):</p><pre>int initialize_java_environment(JavaCTX *ctx, char **jvm_options, uint8_t jvm_nb_options)</pre><p>This function returns JNI_OK on success or JNI_ERR otherwise and takes the following three parameters:</p><ul><li><strong>JavaCTX *ctx</strong> is a pointer to a structure of type JavaCTX which holds context and configuration information related to the Java environment.</li><li><strong>char **jvm_options</strong> is a pointer to an array of pointers to characters (strings). We will use it to pass an array of Java Virtual Machine (JVM) options and/or configuration settings to the function.</li><li><strong>uint8_t jvm_nb_options </strong>is an unsigned 8-bit integer which we will use to indicate the number of JVM options provided in the jvm_options array.</li></ul><p>After a successful invocation of JNI_CreateJVM, ctx-&gt;vm and ctx-&gt;env will point to the JVM and the JNIEnv respectively:</p><pre>...<br><br>jint status = JNI_CreateJVM(&amp;ctx-&gt;vm, &amp;ctx-&gt;env, &amp;args);<br><br>if (status == JNI_ERR){<br>        printf(&quot;[!] 
Can&#39;t create java vm/env \n&quot;);<br>        return JNI_ERR;<br>    }<br>    printf(&quot;[+] Initialization completed successfully.\n \<br>    [+]Java VM pointer: %p\n \<br>    [+]Java env pointer: %p\n&quot;,ctx-&gt;vm, ctx-&gt;env);<br>....<br>....</pre><p>We are going to use initialize_java_environment from our <em>caller</em> native program, so that we can call Java methods from our DEX/APK file.</p><h4>The native caller (caller.c)</h4><p>Starting with main, we have the following:</p><pre>JavaCTX ctx;<br><br>int main(int argc, char **argv)<br>{<br>    int status; <br>    if(argc &lt; 2){<br>        printf(&quot;Usage: ./caller webp_file.webp\n&quot;);<br>        return 1;<br>    }<br>    char *jvmoptions = &quot;-Djava.class.path=/data/local/tmp/JNIhelper/base.apk&quot;;<br>    if((status = initialize_java_environment(&amp;ctx,&amp;jvmoptions,1)) != 0)<br>        return status;<br>    <br>    wrapper(argv[1]);<br>    if(cleanup_java_env(&amp;ctx)!=0)<br>        return -1;<br>    return 0;<br>}</pre><p>While the code is pretty much self-explanatory, a few points worth mentioning are:</p><ul><li>jvmoptions points to the whatsapp apk, which we push to /data/local/tmp/JNIhelper/base.apk</li><li>The call to initialize_java_environment creates our JVM and initialises our Java context (JavaCTX).</li><li>The wrapper is depicted below:</li></ul><pre>int wrapper(const char *path){<br><br>jclass X_2ts = (*ctx.env)-&gt;FindClass(ctx.env, &quot;X/2ts&quot;);<br>    if (X_2ts == NULL) {<br>        printf(&quot;Can&#39;t find class X/2ts\n&quot;);<br>        return -1;<br>    }<br>    jmethodID A01 = (*ctx.env)-&gt;GetStaticMethodID(ctx.env, X_2ts, &quot;A01&quot;, &quot;([B)LX/2ts;&quot;);<br>    if (A01 == NULL) {<br>        printf(&quot;Can&#39;t find method A01\n&quot;);<br>        return -1;<br>    }<br><br>    jobject X_2ts_obj = 
(*ctx.env)-&gt;CallStaticObjectMethod(ctx.env,X_2ts,A01,Java_com_whatsapp_stickers_WebpUtils_fetchWebpMetadata(ctx.env,NULL,(*ctx.env)-&gt;NewStringUTF(ctx.env,path)));<br>    if(X_2ts_obj==NULL) {<br>        printf(&quot;Can&#39;t create X_2ts_obj object\n&quot;);<br>        return -1;<br>    }<br>        <br>    jmethodID toString = (*ctx.env)-&gt;GetMethodID(ctx.env,X_2ts,&quot;toString&quot;,&quot;()Ljava/lang/String;&quot;);<br>    if(toString==NULL){<br>        printf(&quot;Can&#39;t find toString method id\n&quot;);<br>        return -1;<br>    }<br>    jstring describe = (*ctx.env)-&gt;CallObjectMethod(ctx.env,X_2ts_obj,toString);<br>    if(describe==NULL){<br>        return -1;<br>    }<br>    const char *descr = (*ctx.env)-&gt;GetStringUTFChars(ctx.env, describe, NULL);<br>    int ret = -1;<br>    if(descr!=NULL){<br>        printf(&quot;%s&quot;,descr);<br>        /* release the UTF-8 copy obtained above */<br>        (*ctx.env)-&gt;ReleaseStringUTFChars(ctx.env, describe, descr);<br>        ret = 0;<br>    }<br>    /* free the local references before returning */<br>    (*ctx.env)-&gt;DeleteLocalRef(ctx.env, X_2ts_obj);<br>    (*ctx.env)-&gt;DeleteLocalRef(ctx.env, describe);<br>    return ret;<br>}</pre><p>Notice how we use our JVM to call the JNI methods, having loaded what we need from the whatsapp apk.</p><h4>Compile</h4><p>Assuming that you have <a href="https://developer.android.com/ndk/downloads">downloaded</a> and installed the Android NDK, make sure to modify build.sh so that it points to your toolchain file:</p><pre>mkdir build &amp;&amp; cd build<br>cmake -DANDROID_PLATFORM=31 \<br>        -DCMAKE_TOOLCHAIN_FILE=$HOME/Library/Android/sdk/ndk/25.2.9519653/build/cmake/android.toolchain.cmake \<br>        -DANDROID_ABI=arm64-v8a ..<br>make</pre><h4>Running</h4><ol><li>Push the compiled binaries (build/caller and build/libjenv.so), the whatsapp apk (as base.apk) and the <a href="https://github.com/Ch0pin/JNIInvocation/raw/main/a.webp"><em>a.webp</em></a> file under /data/local/tmp.</li><li>Make <em>caller</em> executable (chmod +x)</li><li>Install the whatsapp apk (we need it to resolve a couple more dependencies)</li><li>Set the LD_LIBRARY_PATH to 
./:/data/data/com.whatsapp/files/decompressed/libs.spk.zst/</li></ol><p>Run with <em>./caller a.webp</em>:</p><figure><img alt="" src="https://cdn-images-1.medium.com/proxy/1*usM1_VMRQuLx5-KyoVX9aw.png" /></figure><p>Project git directory: <a href="https://github.com/Ch0pin/JNIInvocation">https://github.com/Ch0pin/JNIInvocation</a></p><h3>References:</h3><ol><li><a href="https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/invocation.html">https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/invocation.html</a></li><li><a href="https://blog.quarkslab.com/android-greybox-fuzzing-with-afl-frida-mode.html">https://blog.quarkslab.com/android-greybox-fuzzing-with-afl-frida-mode.html</a></li><li><a href="https://calebfenton.github.io/2017/04/14/calling_jni_functions_with_java_object_arguments_from_the_command_line/">https://calebfenton.github.io/2017/04/14/calling_jni_functions_with_java_object_arguments_from_the_command_line/</a></li><li><a href="https://gist.github.com/tewilove/b65b0b15557c770739d6">https://gist.github.com/tewilove/b65b0b15557c770739d6</a></li></ol>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Wireless pairing and device mirroring in Android Studio]]></title>
            <link>https://valsamaras.medium.com/wireless-pairing-and-device-mirroring-in-android-studio-1e7e841c187e?source=rss-ded5e114da13------2</link>
            <guid isPermaLink="false">https://medium.com/p/1e7e841c187e</guid>
            <dc:creator><![CDATA[+Ch0pin️]]></dc:creator>
            <pubDate>Wed, 01 Mar 2023 10:06:12 GMT</pubDate>
            <atom:updated>2023-03-01T10:06:12.981Z</atom:updated>
            <content:encoded><![CDATA[<p>Keeping your mobile devices connected by cable can be challenging sometimes.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/304/1*TBsAUTOBMLHsnS7hek_yqg.png" /></figure><p>Thankfully, the latest Android Studio versions provide a convenient way to take control of them remotely, including mirroring, debugging and file browsing.</p><h3>Wireless Pairing</h3><p>To pair a device wirelessly, follow the steps below:</p><ol><li>Open the Android Studio device manager:</li></ol><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*MVV9NCEZzq7_BFyhzm63LQ.png" /></figure><p>2. Click on ‘Pair using Wi-Fi’:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*kqECNmJqTdrMasUeDAIBvg.png" /></figure><p>3. Navigate to the developer options on your device:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*1aicuF9n_wfjWc9rzsJYrA.png" /></figure><p>And simply scan the QR code:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/960/1*3DsFuQR72QzNGxX1RyV-OQ.png" /></figure><p>You can now access the device using the Device Manager’s options:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/727/1*q-MhoIFIfj6BxB1HasRMaA.png" /></figure><h3>Physical device mirroring</h3><p>Like many of you, I have been using the excellent scrcpy tool (<a href="https://github.com/Genymobile/scrcpy">https://github.com/Genymobile/scrcpy</a>) to perform physical device mirroring. The newest Android Studio version, though, includes an option that can be used to perform the same task (although it is still experimental).</p><p>To enable it, navigate to:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/219/1*2SNM3SpMsFPcJd3A7TmtIw.png" /></figure><p>And then:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*QArn6I7Z9cuQx2_8P6zqJA.png" /></figure><p>This is it…. 
You can now view your device by clicking on the “Running Devices” menu:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*nkyB6FlStNcOSQEPRJ9zaA.png" /></figure>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Practical ARM64 (Subroutines)]]></title>
            <link>https://valsamaras.medium.com/practical-arm64-subroutines-1b5ea3935ff5?source=rss-ded5e114da13------2</link>
            <guid isPermaLink="false">https://medium.com/p/1b5ea3935ff5</guid>
            <dc:creator><![CDATA[+Ch0pin️]]></dc:creator>
            <pubDate>Fri, 26 Aug 2022 14:12:20 GMT</pubDate>
            <atom:updated>2022-09-13T09:05:47.032Z</atom:updated>
            <content:encoded><![CDATA[<p>Calling subroutines in higher-level programming languages is trivial: the developer simply has to reference the name of a subroutine, give some arguments (if any) and handle the result. Doing the same in assembly language can sometimes be overwhelming, as the developer has to take care of a lot of details and comply with the calling conventions of each processor family.</p><blockquote>A <strong><em>calling convention </em></strong>defines how arguments are passed to subroutines and how the results are returned. These “rules” are not enforced by hardware, but they must be followed during the development process in order for the product to be usable by other programmers.</blockquote><p>When it comes to AArch64, the rules for calling a subroutine are the following:</p><ul><li>Up to <em>eight </em>parameters are stored in registers <em>x0-x7</em></li><li>Any additional parameter must be passed on the stack in reverse order</li><li>The subroutine’s result (if there is any) should be stored in the <em>x0 </em>register</li></ul><blockquote><strong><em>Marshalling</em></strong> is the process of placing arguments in their corresponding locations</blockquote><blockquote><strong><em>1st</em></strong><em> argument → x0, </em><strong><em>2nd</em></strong><em> argument → x1, …, </em><strong><em>8th</em></strong><em> argument → x7</em></blockquote><p>Additionally there are <em>volatile </em>(caller-saved) and <em>non-volatile </em>(callee-saved) registers. Simply put, when you store data in a volatile register, don’t assume that it will survive a subroutine call. Conversely, a subroutine must save the contents of a non-volatile register before using it and restore them afterwards. 
With respect to AArch64 we have the following conventions:</p><ul><li><em>x0-x7</em> are <strong>volatile</strong>, while <em>x0</em> is used to store the result of a subroutine</li><li><em>x8-x18</em> are also <strong>volatile</strong>, while during a system call <em>x8</em> stores the (Linux) system call number</li><li><em>x19-x28 </em>are <strong>non-volatile</strong></li><li><em>x29, x30, sp</em> correspond to the Frame Pointer (FP), Link Register (LR) and Stack Pointer.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/998/1*XiJ65jEYXUTr54r3oIoIBw.png" /><figcaption>Volatile and non-volatile registers</figcaption></figure><h3>Calling a subroutine</h3><p>Let’s first see the steps that we should take when calling a subroutine.</p><h4>Arguments to registers</h4><p>Let’s start with a simple case where we have at most eight arguments to take care of. In the example below, we are calling the printf function, passing the format string in x0 (<em>line 10</em>) and the rest of the parameters in the w1-w7 registers (<em>lines 12–15</em>):</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*9hbkXu6LroRvyBt0box6vw.png" /><figcaption>nstack.s</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/794/1*obqfMCPDahvn8Uui4291Qg.png" /><figcaption>Compiling and running the program ($as nstack.s -o nstack.o &amp;&amp; ld nstack.o -o nstack -lc)</figcaption></figure><p>As we discussed in the previous posts, the <em>bl</em> instruction stores the return address (the address of the instruction that follows the <em>bl</em>) in the <em>link register </em>(lr) and sets the <em>program counter</em> (pc) to the address of the first instruction of the subroutine that we are calling. 
According to printf’s manual, this subroutine expects the format string as its first argument and the values to display as its 2nd, 3rd and so on:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/514/0*odGSaPvtw4s5Q8G-.png" /></figure><p>Since we comply with the calling convention, <em>printf </em>executes as expected, printing the given values to the standard output.</p><h4>Arguments to the stack</h4><p>When calling a subroutine that takes more than eight arguments, the extra ones must be stored on the stack. The process of <em>pushing</em> and <em>popping</em> values to and from the stack takes place in two steps:</p><ul><li>First the developer has to <strong>allocate space </strong>on the stack by modifying the value of the stack pointer (sp).</li><li>Then, store or recover a value to or from the memory address that sp points to.</li></ul><h4>Allocating space</h4><blockquote>This is done by subtracting the space that we need, in bytes, from the value of the stack pointer, while taking care of the stack alignment. In AArch64 the stack pointer must always be 16-byte aligned.</blockquote><p>Although this seems confusing, think of the stack as the pile depicted below:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*IRU7pflVGAYlgkkz8d8lwg.png" /><figcaption>AArch64 16-byte alignment requirement</figcaption></figure><p>In order to store <em>16 bytes</em> the <em>Stack TOP </em>must be placed one position lower, for <em>32 bytes </em>two positions and so on. To store sizes which are not multiples of 16 we need to round the size up to the next multiple of 16 and move the <em>Stack TOP </em>by that amount. 
This means that in order to store <em>8 bytes </em>the stack top should still be placed one position lower, for <em>24 bytes</em> two positions … and so on.</p><p>In the example below, we need to store 24 bytes in total (8 for each register):</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*O_OUN29wmqlcYryuh-1gAw.png" /></figure><p>The instructions at lines 7 and 8 will modify the stack as follows:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*2p9XMbzLsdo_tOoA0RGteg.png" /></figure><p>More specifically, for the sake of simplicity, assume that sp points to 0 when entering the main function. The <em>[sp, #-32]! </em>will set <em>sp </em>equal to <em>sp −32</em>, so <em>X29</em> will be stored at addresses −32 to −25 and <em>X30</em> at addresses −24 to −17. Finally <em>X19</em> is stored at the new sp + 16, that is at addresses −16 to −9 (the sp value is not modified).</p><p>Now that this step is (hopefully) clear, let’s see an example which makes use of these concepts. We will use the notation sp[a:b] to indicate the stack offsets and start by storing an array of 8 integers on the stack:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*VTiUwH11cDPkffNIdcUDMg.png" /></figure><p>Compile and load the program above in gdb and set a breakpoint at *main+0. 
Then step into each instruction in order to observe the changes in the stack:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*LHhQfJoJaHYY9LfsEflK5w.png" /><figcaption>sp is set to sp-32, x29 will be stored at 0x7..ffb10 → 0x7..ffb17 and X30 at 0x7..ffb18</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*X1NIF4asByjc8DaXxsRFlw.png" /><figcaption>sp is not modified and X19 is stored at 0x7..ffb20</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*4FoOS-7Uz5A8gXephd5yqA.png" /><figcaption>sp is set to sp-32</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*WSbfEmyQGhcg26IWCAq7Xg.png" /><figcaption>sp is restored to the entry value</figcaption></figure><p>Notice that the instruction at <em>line 25</em> allocates 32 bytes of space and the following stp instructions push the array elements onto the stack. The instruction at <em>line 33 </em>restores the stack pointer to its state before the array elements were saved, and finally the rest of the values are recovered (<em>lines 34, 35</em>).</p><p>In the next example, we are calling printf once again, passing more than 8 arguments this time:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*SRAbzr-5oB7QHXQQTgp5og.png" /></figure><p><strong>A few things to notice:</strong></p><ul><li>At <em>line 12</em>, we store <em>x19</em> as we are going to use it and it is non volatile</li><li><em>printf</em> will take 12 arguments in total, including the format string, which means that 4 arguments have to be pushed onto the stack</li><li>At <em>lines 25, 26 </em>x11 will be stored at sp[16:23], x12 at sp[24:31], x9 at sp[0:7] and x10 at sp[8:15]</li><li>Although the extra arguments are 4 bytes each, we store them as 8-byte values on the stack before the call to printf</li><li>In<em> line 28 </em>we set the return value to 0 and restore sp (<em>line 29)</em></li><li>Finally, we 
restore x19, x29 and x30 from the stack and return to the address indicated by x30</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/602/1*1ONKz2S9cgu7u9V36qjPYQ.png" /></figure><h3>Implementing a subroutine</h3><p>We saw the steps that we should take when <em>calling</em> a subroutine and now it is time to see the conventions from the perspective of the <em>called</em> subroutine. From what has been discussed so far, you must have already figured out that:</p><ul><li>We can safely assume that up to 8 arguments are stored in the registers x0 to x7 and the extra ones on the stack.</li><li>The returned value must be stored in x0</li></ul><p>Additionally:</p><ul><li>When using a <em>non volatile</em> register we must save its value before we use it and restore it before exiting</li><li><em>Volatile registers</em> can be used without the need to restore their value</li><li>The link register (x30) and frame pointer (x29) must also be saved when entering a subroutine and restored before exiting.</li></ul><h4>Example u-itoa</h4><p>Reaching the end of this post, we are going to write a program which converts an unsigned integer to a null-terminated string using a specified base and prints the result to the screen. More specifically, our <em>main</em> function calls scanf to get an unsigned value and a base. It then calls our subroutine <em>uitoa</em>, which does the conversion, and prints the returned result to the screen. We are going to break our program into 3 parts, in order to make it easier to understand.</p><p>The first part, which is the simplest one, asks the user to enter an unsigned integer and a base and then calls the standard scanf function to get the input. 
It then calls our subroutine uitoa which gets three arguments: the address where it should save the result, the integer to be converted and the conversion base:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*7puFFr89Z7nVLWF_uFunsQ.png" /></figure><p>Our simplified version of itoa checks that the base is between 2 and 16 and that the input is greater than 0. It then runs a loop where it divides the input by the base and stores the remainder of every iteration at position result[i]:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*nuylvX8lprs8zSOMoAZl0g.png" /></figure><p>When this function exits, the result will be in reverse order at the memory address that x0 points to, while the length of the result is stored in the x1 register. Finally we print the result in reverse order:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*iiZvI2uDIugWYoljSrM1fA.png" /></figure><p>The overall program structure is as follows:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*neEac66vhu-yuljLNMu2sQ.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*VvhLG3pJrgVZoBOb4jT6Hw.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*d8TbE-LIV1W3Z04qBtkKUw.png" /></figure><p>Compile and run:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/466/1*3UahVKefDMMTlI3LxiDqng.png" /></figure>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Practical ARM64 (selections and loops)]]></title>
            <link>https://valsamaras.medium.com/practical-arm64-selections-and-loops-89f9a0e7e395?source=rss-ded5e114da13------2</link>
            <guid isPermaLink="false">https://medium.com/p/89f9a0e7e395</guid>
            <category><![CDATA[arm64]]></category>
            <dc:creator><![CDATA[+Ch0pin️]]></dc:creator>
            <pubDate>Tue, 16 Aug 2022 08:26:33 GMT</pubDate>
            <atom:updated>2022-08-16T08:26:33.250Z</atom:updated>
            <content:encoded><![CDATA[<p>So far we went through the <a href="https://valsamaras.medium.com/arm-64-assembly-series-data-processing-part-2-3d0526dc07b6">most important instructions of the AArch64 instruction set</a> and it is time to move to something more practical. In this series of posts we are going to talk about structured programming in arm64. For better understanding, we are going to use C statements and try to “translate” them to their arm64 equivalents.</p><h3>Selections</h3><ul><li>Simple <strong><em>if — then C statements</em></strong><em> </em>can be easily implemented in ARM by combining <em>compare</em> and <em>branch</em> instructions. The following examples are pretty straightforward:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Rco66q91cKjRa8nMSkS6wQ.png" /></figure><p>Similarly, an <em>if x ≥ 10</em> statement can be written as follows:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/192/1*gPVAszDFET4CifMU_X5z4Q.png" /></figure><ul><li>An <strong><em>if-then-else</em> <em>C statement</em></strong> can be achieved by adding an additional branch instruction in order to jump to the <em>else </em>branch, in case the <em>if </em>fails:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*p2z6s6dIops7DAA-ulTEHw.png" /></figure><p>As mentioned, the compare and branch instructions can be combined to effectively simulate any <strong><em>if-then-else </em></strong>statement:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*cnlt9j6uM5asF43vJXo51w.png" /></figure><ul><li><a href="https://medium.com/@valsamaras/arm-64-assembly-series-data-processing-part-2-3d0526dc07b6">Conditional operations</a> can also be used to implement more complex selection structures like <strong><em>if-elif-else.</em></strong></li></ul><p>Let’s see an example:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*5kW75ebyHapwIUL47FTFjA.png" 
/><figcaption>ARM to C equivalent</figcaption></figure><p>A few things about the example above:</p><ul><li>Starting in <em>line 15</em>, we load the <em>nums</em> address to the <em>x0</em> register.</li><li>In <em>line 16</em>, the <em>integers 2,1</em> will be stored in <em>w1</em> and <em>w2</em> respectively and <em>x0</em> will be advanced by 8 bytes in order to point to the next <em>integer 3. </em>At this point w1=2, w2=1, w3 = 3.</li><li>The statement in <em>line 20 </em>compares <em>w1 &lt; w2</em> and if it succeeds it redirects the execution to the <em>elseif</em> label. If it fails, which means that <em>w1≥ w2</em>, the statement at <em>line 22</em> will be executed and <em>w1</em> will be compared with <em>w3</em>.</li><li>The statement in <em>line 23 </em>completes the <em>if(a≥b &amp;&amp; a≥ c)</em> then <em>max = a</em>, since the <em>csel</em> statement will maintain the value of <em>w1 </em>if <em>w1</em> is <em>greater or equal</em> to <em>w3</em>, or it will set <em>w1 = w3 </em>if w3 is <em>greater</em> than <em>w1</em>.</li><li><em>Line 24 </em>uses a branch statement to jump to the <em>else </em>label and prepare the call to <em>printf</em> after storing the address of the first parameter <em>“%d\n” </em>to <em>x0 </em>and the second parameter (the max value) to <em>x1</em> (remember <em>R0</em> to <em>R7</em> store argument values passed to and results returned from a subroutine).</li><li><em>Lines</em> <em>26, 27 </em>cover the <em>if(b≥c) then max = b</em> while</li><li>Arriving at <em>line 29 </em>we make sure that the max number is stored in <em>x1</em>, thus the call to printf will yield the number 3.</li></ul><p>You can compile the above with the following one-liner:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/711/1*PjkEY8iwkISkY3puVX8D9A.png" /></figure><p>Although we are going to discuss the <em>prologue</em> and <em>epilogue</em> of a function in a dedicated post, <em>line 14 </em>saves the frame pointer 
and link register (which stores the return address into the _start function) to the stack, while <em>line 31</em> restores these values, thus the execution returns to the calling function.</p><h3>Loops</h3><p>Depending upon the position of a control statement in a program, loop statements in ARM are classified into two types:</p><ul><li><em>pre-test (for and while)</em></li><li><em>post-test (do-while )</em></li></ul><p>Let’s see some examples:</p><blockquote><strong>Example 1</strong>: Reverse String</blockquote><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*0KAszUPfH1BfrjkByrZMEA.png" /><figcaption>Reverse string, pre-test loop</figcaption></figure><p>A few things to notice about the above: in <em>lines 17–18 </em>we call <em>strlen</em>, which will return the length of the string entered by the user in the <em>w0</em> register. Assume, for example, that the user entered “example” as an input; then <em>w0</em> will be equal to 7. As we want to print the last char, which is equal to <em>inpt[6]</em>, we subtract #1 from w0, store the result to the stack (<em>line 19</em>) and subsequently load this value back into <em>w1 </em>(this is because we need <em>w0</em> to be available for our call to <em>putchar</em> in <em>line 27</em>).</p><p>The <em>startloop </em>label literally implements our while loop: the comparison in <em>line 23 </em>will test the value of <em>w1</em> and will end the loop if <em>w1</em> is <em>less than</em> <em>0</em>. The body of the loop is implemented in <em>lines 24–29. </em>More specifically, we store the address of <em>inpt</em> in the <em>x2 </em>register and load the byte at address <em>x2+x1 </em>into <em>x0</em>. 
As x1 is decremented in each iteration, <em>x0 </em>will hold the values <em>[x2+6], [x2+5],…,[x2+0] </em>in turn, thus <em>putchar</em> will print the values “e”, “l”, “p”,…,“e” respectively.</p><p>Compile and run the program with</p><p><em>$as revstr.s -o revstr.o &amp;&amp; ld revstr.o -o revstr -lc</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/374/1*IhEUQLVvENfLjCDz0L9U0w.png" /></figure><blockquote><strong>Example 2</strong>: Print decimal to binary</blockquote><p>In this program we are going to ask the user to enter an unsigned integer value and print its binary form by performing short division by two with remainder. Let N be the decimal number. Divide N by 2, since 2 is the base of the binary number system, and note down the remainder, which will be either 0 or 1. Keep dividing the quotient by 2 until it becomes 0, noting the remainder at every step. Then write the remainders from bottom to top (that is, in reverse order) to obtain the binary equivalent of the given decimal number.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/764/1*JSLOB1nhzYgnicHZ9Ckjaw.png" /></figure><p>Compile with <em>$as printbin.s -o printbin.o &amp;&amp; ld printbin.o -o printbin -lc</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/588/1*HPr1Fit0V7RksON4rcoK-w.png" /></figure>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[ARM 64 Assembly Series — Data Processing (Part 2)]]></title>
            <link>https://valsamaras.medium.com/arm-64-assembly-series-data-processing-part-2-3d0526dc07b6?source=rss-ded5e114da13------2</link>
            <guid isPermaLink="false">https://medium.com/p/3d0526dc07b6</guid>
            <dc:creator><![CDATA[+Ch0pin️]]></dc:creator>
            <pubDate>Thu, 04 Aug 2022 10:34:42 GMT</pubDate>
            <atom:updated>2022-08-04T10:34:42.539Z</atom:updated>
            <content:encoded><![CDATA[<h3>ARM 64 Assembly Series — Data Processing (Part 2)</h3><h4>Previous posts: <a href="https://valsamaras.medium.com/arm-64-assembly-series-basic-definitions-and-registers-ec8cc1334e40">Basic definitions and registers</a>, <a href="https://valsamaras.medium.com/arm-64-assembly-series-offset-and-addressing-modes-aa48b65b4c99">lab setup, offset and addressing modes</a>, <a href="https://valsamaras.medium.com/arm-64-assembly-series-load-and-store-6bfe9c1d1896">Load And Store</a>, <a href="https://valsamaras.medium.com/arm-64-assembly-series-branch-9ce820987fc6">Branch</a>, <a href="https://valsamaras.medium.com/arm-64-assembly-series-data-processing-part-1-b6f6f877c56b">Data Processing Part 1</a></h4><p>In the<a href="https://valsamaras.medium.com/arm-64-assembly-series-data-processing-part-1-b6f6f877c56b"> <em>first part </em></a>of the <em>data processing instruction set </em>we talked about <em>arithmetic, logical, move and shift </em>operations. Continuing on the same track, in this part, we are going to discuss about <em>multiplication </em>and<em> division, </em>as well as the rest of the most important operations of the <em>A64</em> <em>instruction set</em> including <em>compare, conditional</em> and <em>special</em> instructions.</p><h3>Multiplication and Division</h3><p>The <em>mul, madd, msub </em>and <em>mneg </em>can be used to multiply two <em>32bit</em> or <em>64bit</em> registers and get <em>32bit</em> and <em>64bit</em> results respectively:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/d56263c60473b14fb4f447596228a77a/href">https://medium.com/media/d56263c60473b14fb4f447596228a77a/href</a></iframe><p>When forming 64bit results from <em>32bit</em> registers we have the following:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a 
href="https://medium.com/media/ebe07a6cfad0abdaf415b843d4485fd3/href">https://medium.com/media/ebe07a6cfad0abdaf415b843d4485fd3/href</a></iframe><p>Replacing the <strong>s</strong> in the beginning of the mnemonic with a <strong>u</strong> denotes unsigned multiplication (<em>umull, umaddl, umsubl, umnegl</em>). For 128bit results the <em>smulh</em> and <em>umulh</em> can be used to calculate the upper 64 bits and <em>mul</em> can be used for the rest, for example:</p><p><strong><em>smulh</em></strong><em> Xd, Xm, Xn</em> <strong>means </strong>Xd=Xm × Xn</p><p><em>Xd will hold the upper 64 bits of the result </em>and the <strong>s </strong>denotes that Xm and Xn are treated as signed values. For the corresponding unsigned operation we have the following:</p><p><strong><em>umulh</em></strong><em> Xd, Xm, Xn</em> <strong>means </strong>Xd=Xm × Xn</p><p>When it comes to division we have <em>sdiv</em> and <em>udiv</em> for signed and unsigned division. For example:</p><p><strong><em>sdiv</em></strong><em> Rd, Rm, Rn</em> <strong>means </strong>Rd=Rm ÷ Rn</p><h3>Comparison</h3><p>The comparison operators are used to set the PSTATE flags and don’t have any further effect as the result is discarded. Their general syntax is as follows:</p><p><em>op Rn, operand2</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/600/1*gKWZzZAZxyMYfRvxFNR_oA.png" /><figcaption><a href="https://comp.anu.edu.au/courses/comp2300/resources/ARM_cheat_sheet">https://comp.anu.edu.au/courses/comp2300/resources/ARM_cheat_sheet</a></figcaption></figure><p>For example the statement <em>cmp X0, #0x40 </em>will subtract 0x40 from X0 and if the result is negative, it will set the <em>N</em> flag of the <em>PSTATE</em> register. Similarly, <em>cmn</em> will add the first and second operand in order to set the N flag accordingly. The <em>TST</em> instruction performs a bitwise AND operation on the value in <em>Rn</em> and the value of <em>Operand2</em>. 
This is the same as an ANDS instruction, except that the result is discarded. The <em>TEQ</em> instruction performs a bitwise Exclusive <em>OR</em> operation on the value in <em>Rn</em> and the value of <em>Operand2</em>. This is the same as an <em>EORS</em> instruction, except that the result is discarded (note that <em>TEQ</em> and <em>EORS</em> exist only in AArch32; the A64 instruction set does not provide them).</p><p>Using <em>compare</em>, <em>branch</em> and <em>and</em> to construct loops is straightforward:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/433/1*RtPeSYRiRyBmwTKRBgsCEg.png" /></figure><h3>Conditional operations</h3><p>This set of instructions is used to set the destination register equal to the first or second operand, depending on certain conditions. Their syntax is as follows:</p><ul><li>op Rd, Rn, Rm, &lt;cond&gt; (1)</li><li>op Rd, Rn, &lt;cond&gt; (2)</li><li>op Rd, &lt;cond&gt; (3)</li><li><em>op Rn, R_or_imm, nzcv, </em>&lt;cond&gt; (4)</li></ul><p>The &lt;cond&gt; can be one of the following:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/769/1*IP7M6Jn8QMp-K1okNZiD5A.png" /></figure><ul><li><strong>In the <em>first case (1)</em></strong> <em>op</em> can be one of<em> csel, csinc, csinv, csneg </em>and can be interpreted as follows:</li></ul><p><em>cs</em><strong><em>el</em></strong><em> Rd, Rn, Rm, cond </em><strong>means </strong><em>if cond then Rd = Rn else Rd = Rm</em></p><p><em>cs</em><strong><em>inc</em></strong><em> Rd, Rn, Rm, cond </em><strong>means </strong><em>if cond then Rd = Rn else Rd = Rm+1</em></p><p><em>cs</em><strong><em>inv</em></strong><em> Rd, Rn, Rm, cond </em><strong>means </strong><em>if cond then Rd = Rn else Rd = </em>˜<em>Rm</em></p><p><em>cs</em><strong><em>neg</em></strong><em> Rd, Rn, Rm, cond </em><strong>means </strong><em>if cond then Rd = Rn else Rd = </em>˜<em>Rm+1</em></p><ul><li><strong>In the <em>second case (2)</em></strong><em> op</em> can be one of<em> cinc, cinv or cneg </em>and can be interpreted as follows:</li></ul><p><em>c</em><strong><em>inc</em></strong><em> Rd, Rn, 
cond </em><strong>means </strong><em>if cond then Rd = Rn+1 else Rd = Rn</em></p><p><em>c</em><strong><em>inv</em></strong><em> Rd, Rn, cond </em><strong>means </strong><em>if cond then Rd = </em>˜<em>Rn else Rd = Rn</em></p><p><em>c</em><strong><em>neg</em></strong><em> Rd, Rn, cond </em><strong>means </strong><em>if cond then Rd = </em>˜<em>Rn+1 else Rd = Rn</em></p><ul><li><strong>In the <em>third case (3) </em></strong><em>op</em> can be either<em> cset or csetm </em>and can be interpreted as follows:</li></ul><p><em>c</em><strong><em>set</em></strong> <em>Rd, cond </em><strong>means </strong><em>if cond then Rd =1 else Rd=0</em></p><p><em>c</em><strong><em>setm</em></strong><em> Rd, cond </em><strong>means</strong> <em>if cond then Rd =0xffff..fff else Rd=0x0000..000</em></p><ul><li>Finally in case (4) op can be either ccmp or ccmn and can be interpreted as follows:</li></ul><p>The conditional compare <strong><em>ccmp</em> <em>Rn, R_or_imm,</em></strong><em> #</em><strong><em>nzcv</em></strong><em>, </em><strong><em>cond</em></strong><em> </em>can be interpreted as follows:</p><pre>if &lt;cond&gt; then<br>    cmp Rn with R_or_imm and set the nzcv accordingly</pre><pre>else <br>    nzcv = <em>#nzcv</em></pre><pre>R_or_imm can be a register or an immediate</pre><p>As mentioned before, cmp a, b executes a-b and sets the PSTATE flags accordingly, while cmn executes a+b. The conditional compare negative <em>ccmn</em> therefore behaves as follows:</p><pre>if &lt;cond&gt; then<br>    cmn Rn with R_or_imm and set the nzcv accordingly</pre><pre>else <br>    nzcv = <em>#nzcv</em></pre><pre>R_or_imm can be a register or an immediate</pre><h3>Special instructions</h3><blockquote><strong><em>Count leading zeros</em></strong><em>: clz Rn, Rm </em>counts the leading zeros of Rm and stores the result to Rn:</blockquote><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ZCouGbXFoeMChc3cBAwcVA.png" /></figure><blockquote><strong><em>Move Status to Register or register to 
status:</em> </strong><em>mrs Rn, status</em> or <em>msr status, Rn:</em></blockquote><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*hk1scf_eREqaLiP56FDR7g.png" /></figure><blockquote><strong><em>Supervisor call: </em></strong><em>svc system_call_number </em>is used to perform a system call. Depending on the operating system, each call has a specific id as depicted in the figure below:</blockquote><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*sOroVkAZk_9sxm7WZyrbeA.png" /><figcaption>full list here: <a href="https://chromium.googlesource.com/chromiumos/docs/+/master/constants/syscalls.md">https://chromium.googlesource.com/chromiumos/docs/+/master/constants/syscalls.md</a></figcaption></figure><p>The <em>system_call_number </em>in Linux is always set to 0, while the actual id is stored in x8 and up to six parameters can be passed in x0-x5:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*oa3PY0r8mqDjjKMIopSotA.png" /><figcaption>The system call id is stored in x8, x0-x2 store the three arguments (“/bin/sh”,null,null) and svc 0 will perform the system call</figcaption></figure><blockquote><strong><em>No operation: </em></strong><em>nop does nothing, other than advance the value of the program counter by 4. This instruction can be used for instruction alignment purposes.</em></blockquote>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[ARM 64 Assembly Series — Data Processing (Part 1)]]></title>
            <link>https://valsamaras.medium.com/arm-64-assembly-series-data-processing-part-1-b6f6f877c56b?source=rss-ded5e114da13------2</link>
            <guid isPermaLink="false">https://medium.com/p/b6f6f877c56b</guid>
            <category><![CDATA[arm]]></category>
            <category><![CDATA[aarch64]]></category>
            <category><![CDATA[assembly]]></category>
            <dc:creator><![CDATA[+Ch0pin️]]></dc:creator>
            <pubDate>Mon, 01 Aug 2022 09:38:43 GMT</pubDate>
            <atom:updated>2022-09-13T08:26:04.506Z</atom:updated>
            <content:encoded><![CDATA[<h3>ARM 64 Assembly Series — Data Processing (Part 1)</h3><h4>Previous posts: <a href="https://valsamaras.medium.com/arm-64-assembly-series-basic-definitions-and-registers-ec8cc1334e40">Basic definitions and registers</a>, <a href="https://valsamaras.medium.com/arm-64-assembly-series-offset-and-addressing-modes-aa48b65b4c99">lab setup, offset and addressing modes</a>, <a href="https://valsamaras.medium.com/arm-64-assembly-series-load-and-store-6bfe9c1d1896">Load And Store</a>, <a href="https://valsamaras.medium.com/arm-64-assembly-series-branch-9ce820987fc6">Branch</a></h4><p>So far we talked about <em>load, store </em>and <em>branch</em> instructions and it is time to discuss a (long) set of instructions that can be used to process data. To quickly refresh your memory on what has been discussed so far, you can refer to the table below or you can simply navigate to the previous posts by following the links in the subtitle section:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*8c2WRQbpm2WVV-H1eJ-e3A.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*zwgwwOHtwZm6Oba-vvx_Ng.png" /><figcaption><a href="https://comp.anu.edu.au/courses/comp2300/resources/ARM_cheat_sheet/">https://comp.anu.edu.au/courses/comp2300/resources/ARM_cheat_sheet/</a></figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/595/0*jIn9-qQBhwVx-Yej.png" /><figcaption>Condition modifiers</figcaption></figure><h3>General format and operands</h3><p>In mathematics, an operand is the object of a mathematical operation. For example, in the following addition <em>y</em> and <em>x</em> are the <em>operands</em> of an <em>addition </em>where <em>a </em>is the <em>result</em> : <em>y + x= a. 
</em>Similarly, in ARM assembly <em>most of the data processing instructions require two operands and a destination register:</em></p><blockquote><strong>op</strong> Rd, Rn, Rm</blockquote><p><em>op </em>defines the type of the operation, <em>Rd </em>is the result destination register, <em>Rn </em>is the first operand and <em>Rm </em>is the second.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/206/1*OOKmN_fHW-hZzK5FkFCmYA.png" /></figure><p>As you probably noticed in the figure above, the first operand enters the ALU directly while the second one may be processed before entering (e.g. shifted as in the example above). Indeed, this flexible second operand, also known as <a href="https://developer.arm.com/documentation/dui0473/j/arm-and-thumb-instructions/flexible-second-operand--operand2-">operand2</a>, can be one of the following:</p><ul><li><em>A register</em> with an optional <em>shift or extend operation (LSL, LSR, ASR etc.)</em></li><li>A <em>12bit immediate</em> value or <em>13bit pattern</em> (used only for logical instructions)</li></ul><p><strong>Let’s see some examples:</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*1NIK_KZ0DrhWZMRHqFLSFA.png" /><figcaption>On the right: operand2 as a pattern for the logical operations AND and ORR. On the left: operand2 as a register to an add instruction and as an immediate to a sub instruction (<a href="https://armkeil.blob.core.windows.net/developer/Files/pdf/graphics-and-multimedia/ARMv8_InstructionSetOverview.pdf">examples source</a>)</figcaption></figure><pre>add x1, x4, x5           // x1 = x4 + x5<br>sub x0, x0, #1           // x0 = x0 - 1<br>neg x3, x4, lsl #3       // x3 = -(x4 &lt;&lt; 3)</pre><p>We will go into details regarding the shift and extend operations, but for now and for the sake of simplicity we will refer to them as <em>shift_op</em> and <em>extend_op. 
</em>That being said, here is a summary of the <em>operand2</em> formats:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/dd153dd42f9b366b980a63ae2a0fcdcb/href">https://medium.com/media/dd153dd42f9b366b980a63ae2a0fcdcb/href</a></iframe><h4>Arithmetic operations</h4><p>The basic arithmetic instructions are <em>add, sub </em>and<em> neg</em>, corresponding to <em>addition</em>, <em>subtraction</em> and <em>negation</em>. In addition to those, we have <em>adc</em>, <em>sbc</em> and <em>ngc</em>, which take the carry bit of the <em>PSTATE</em> register into account in addition to the two operands. These (carry) instructions can be used only with unshifted/unextended operands. Finally, an <em>s </em>can be appended to each instruction (<em>adds</em>, <em>subs</em> and so on) in order to affect the flags of the <em>PSTATE:</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ZzpsfArH7Rh4tz0HccwHqQ.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*GLY417wRuk9uJoFRvOCQzA.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/465/1*NprHd5JsOmFO_v-kha3QoA.png" /></figure><h4>Logical operations</h4><p>Similarly to the arithmetic operations, the general syntax of the logical operations is <em>op Rd, Rn, operand2</em><strong> </strong>except for the bitwise-not, which uses only the operand2 and the destination register. As before, appending an <em>s </em>to the instruction can affect the <em>PSTATE </em>flags. 
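</p><p>As a small, hedged illustration of this syntax (the registers and constants are arbitrary and chosen only for the example):</p><pre>and  w0, w1, w2           // w0 = w1 &amp; w2<br>orr  x3, x4, #0xff        // x3 = x4 | 0xff (bitmask immediate)<br>eor  w5, w5, w6, lsl #4   // w5 = w5 ^ (w6 &lt;&lt; 4)<br>mvn  x7, x8               // x7 = ~x8 (bitwise not: destination and operand2 only)<br>ands w9, w10, #1          // w9 = w10 &amp; 1, and the PSTATE flags are updated</pre><p>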
Here are the most basic logical operations and their usage:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*VKXaSVF49eGxf4bvHInJig.png" /></figure><p><em>Examples:</em></p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/0ff6eb6305456d837e0b877e999f4aa7/href">https://medium.com/media/0ff6eb6305456d837e0b877e999f4aa7/href</a></iframe><h4>More on the role of s and the PSTATE</h4><p>Contrary to <em>AArch32</em>, <em>AArch64 does not allow instructions other than the </em><a href="https://valsamaras.medium.com/arm-64-assembly-series-branch-9ce820987fc6"><em>branch</em></a><em> instructions to be conditionally executed</em>.</p><blockquote>Need to know (n2k): Conditional execution controls whether or not the core will execute an instruction. Most instructions have a condition attribute that determines if the core will execute it based on the setting of the condition flags. Prior to execution, the processor compares the condition attribute with the condition flags in the <a href="https://www.sciencedirect.com/topics/computer-science/current-program-status-register">cpsr</a>. If they match, then the instruction is executed; otherwise the instruction is ignored.</blockquote><p>As mentioned before, appending an <em>s </em>to the mnemonic of the operation can be used to set the flags of the <em>PSTATE</em> register which in conjunction with a branch instruction can be used to modify the flow of a program. Let’s see some examples:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*6NbVLunghTuKPc-NorHP9Q.png" /><figcaption><a href="https://cpulator.01xz.net/?sys=arm">https://cpulator.01xz.net/?sys=arm</a></figcaption></figure><p>As depicted above, since the result of the subtraction operation is negative, the<strong><em> N (Negative) flag </em></strong>of the <em>program status register</em> (cpsr in this case) will be set to <em>1</em>. 
This causes the <em>bmi</em> instruction (branch if minus) to set the program counter to the address of the <em>neg </em>label. The <strong><em>Z (Zero) flag </em></strong>is another flag of the status register, which is set when a result is equal to zero:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*46iNXMBtbUDtS7M4SRCTzw.png" /></figure><p>The <strong><em>V</em></strong> <strong><em>(oVerflow) flag </em></strong>is set if the result of a signed addition or subtraction does not fit in the range of the destination register. The flag will be set to <em>1 if overflow occurs and 0 if not</em>:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Z0vTBpX7Ht2PKyoAYD5ynQ.png" /></figure><p>The <strong><em>C (Carry) flag</em></strong> is used to indicate whether the result of an unsigned operation is not representable. For example, adding <em>1</em> to <em>4294967295</em> would normally result in the value <em>4294967296. </em>However, if the destination register can hold only 32 bits, then the result will be <em>zero </em>with <em>carry:</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/714/1*-B00RaaOZtrfTSxynFpFUQ.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*n8RqdK92hnCxNB69OycaGw.png" /></figure><h4>Move and Shift Operations</h4><p>The <em>move</em> operations are used to copy data from a register to a register or from an immediate to a register. The particular set includes the following instructions:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/a6c47540dbab9d74a23f2059f7bc3593/href">https://medium.com/media/a6c47540dbab9d74a23f2059f7bc3593/href</a></iframe><p>The <em>shift </em>operations are used to shift or rotate the contents of a register. They can be used either as standalone instructions or as part of flexible second operands similar to the ones we saw above. 
The standalone syntax is as follows:</p><p><em>op Rd, Rn, Rm</em></p><p>Where op can be either<em> lsl, lsr, asr </em>or<em> ror</em>:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*bK0WyqprFU1Z4IWl0QMp5A.png" /><figcaption>source: <a href="https://armkeil.blob.core.windows.net/developer/Files/pdf/graphics-and-multimedia/ARMv8_InstructionSetOverview.pdf">https://armkeil.blob.core.windows.net/developer/Files/pdf/graphics-and-multimedia/ARMv8_InstructionSetOverview.pdf</a></figcaption></figure><p>As depicted above, the <em>lsl instruction</em> shifts the contents to the left while padding the vacated positions with 0. A single shift to the left is like multiplying by 2, a double shift is like multiplying by 4 and so on:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*CWQxaX1CHn9n9h_7C3lj5g.png" /></figure><p>Similarly to <em>lsl</em>, the <em>lsr instruction</em> shifts the contents to the right while padding the vacated positions with 0. A single shift to the right is like dividing an unsigned value by 2, a double shift is like dividing by 4 and so on:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Pj5At4NjegIYdGb5kH6OFw.png" /></figure><p>The <em>ror</em> instruction will rotate the contents to the right, moving each bit that is shifted out of the least significant position back in at the most significant bit of the register:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*9RSxqkm7c4VhNSEHcB_opw.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/492/1*hw6kNnNvQqwSDR_xiPKtGQ.png" /><figcaption>Rotating 10 bits to right</figcaption></figure><p>Finally, the <em>asr </em>instruction will shift a number of bits to the right, padding the vacated positions with copies of the sign bit so that the sign of the value is preserved:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*zPDZ7tKzPP_v_aN2qPkJBw.png" /></figure><p>That’s all for now, I hope to see you in part 2 of the data processing instructions.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[ARM 64 Assembly Series — Branch]]></title>
            <link>https://valsamaras.medium.com/arm-64-assembly-series-branch-9ce820987fc6?source=rss-ded5e114da13------2</link>
            <guid isPermaLink="false">https://medium.com/p/9ce820987fc6</guid>
            <category><![CDATA[arm64]]></category>
            <category><![CDATA[arm]]></category>
            <dc:creator><![CDATA[+Ch0pin️]]></dc:creator>
            <pubDate>Thu, 21 Jul 2022 10:27:57 GMT</pubDate>
            <atom:updated>2022-09-13T08:12:38.524Z</atom:updated>
            <content:encoded><![CDATA[<h3>ARM 64 Assembly Series — Branch</h3><h4>Previous posts: <a href="https://valsamaras.medium.com/arm-64-assembly-series-basic-definitions-and-registers-ec8cc1334e40">Basic definitions and registers</a>, <a href="https://valsamaras.medium.com/arm-64-assembly-series-offset-and-addressing-modes-aa48b65b4c99">lab setup, offset and addressing modes</a>, <a href="https://valsamaras.medium.com/arm-64-assembly-series-load-and-store-6bfe9c1d1896">Load And Store</a></h4><p>In the previous post we talked about the <strong>ldr</strong> and <strong>str</strong> instructions which can be used to transfer data bidirectionally between a memory address and a register (or pair of registers):</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*dceGFPGxNS2FHrE9i9Bk6Q.png" /><figcaption>Appending b, h or w to the instruction mnemonic indicates an unsigned byte, a half word or a word respectively. Adding an <strong>s</strong> in front of these letters (sb, sh, sw) forces the CPU to handle the data as signed.</figcaption></figure><p>In this post we are going to talk about branch instructions and how they can be used in order to change the address of the next instruction that will be executed.</p><h3>Branch</h3><p>Branching is one of the most important concepts in programming as it allows the developer to define alternative code paths depending on various conditions. In high level languages, these conditions are evaluated using control flow statements like <em>if, for, while </em>or even <em>goto. 
</em>Similarly, in low level languages there are special instructions that may be used in order to achieve the same result and route the code flow to a different path.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/281/0*kYBnxsARHBRdnkQQ" /></figure><p>AArch64 defines a set of <em>branch instructions </em>which can be used to perform conditional or unconditional jumps within a function (branch) or calls to other functions (branch and link). Let’s see the most important of them as well as their usage.</p><h4>Conditional and Unconditional Branches</h4><p>Starting with the simplest case, a conditional or unconditional branch instruction looks as follows:</p><p><strong>b&lt;c&gt; label</strong></p><p>It can be interpreted as <em>if &lt;c&gt; then pc = new_address</em>. If the <em>&lt;c&gt;</em> parameter is omitted (e.g. <em>b label</em>), then it simply sets pc = <em>new_address</em>. The <em>label </em>is encoded as an immediate holding an offset relative to the program counter. This immediate is sign extended and multiplied by 4 in order to calculate the offset that will be added to the current value of the <em>program counter:</em></p><p><em>offset = immX * 4</em></p><p>Where <em>X</em> is <em>19 bits</em> for <em>conditional branches </em>and <em>26 bits</em> for <em>unconditional</em> ones. Finally, the symbol &lt;c&gt; is a mnemonic which denotes a condition on the flags of the <a href="https://valsamaras.medium.com/arm-64-assembly-series-basic-definitions-and-registers-ec8cc1334e40"><em>PSTATE register</em></a><em>. 
</em>The possible values of &lt;c&gt; as well as their meaning in regard to the PSTATE flags are depicted below:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/595/1*-FJ50J_96Edg7ye2W5ZX5Q.png" /><figcaption><strong><em>Table 1: Condition modifiers for AArch64</em></strong></figcaption></figure><p>That being said, the instruction bvs checks the overflow flag V in order to decide whether to follow a new code path, while bne takes the branch when the Zero flag is not equal to 1. In the following example, the cmp instruction at line 5 will set the Z flag to 1 if w1 is equal to zero. As a result the beq will succeed and the code will continue from the address indicated by the foo label. In case w1 is not equal to zero, the code will continue up to line 9, where the unconditional branch will redirect the code flow to the address indicated by the label bar:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/2e336e3cd354d7531091ca2d3af2d916/href">https://medium.com/media/2e336e3cd354d7531091ca2d3af2d916/href</a></iframe><h4>Branch to register</h4><p>When the address of the next instruction is fetched from a register, the branch instruction has the following forms:</p><pre><strong>br   Rn     </strong>//meaning that pc will be set to Rn<br><strong>ret &lt;Rn&gt;    </strong>//meaning that if Rn is omitted pc = lr else pc = Rn</pre><p>Although the instructions above are self-explanatory, it is worth clarifying that in the case of ret the &lt;Rn&gt; parameter is optional and if it is omitted then the value will be fetched from the <a href="https://valsamaras.medium.com/arm-64-assembly-series-load-and-store-6bfe9c1d1896"><em>link register</em></a>.</p><h4>Branch and link</h4><p>The main difference from the previous cases is that before taking the new branch, the address of the next instruction (the return address) will be copied to the <em>link register:</em></p><pre><strong>bl    label   </strong>//meaning 
that lr = pc+4 and pc = new_address*<br><strong>blr   Rn      </strong>//meaning that lr = pc+4 and pc = Rn</pre><p>*<em>in this case the immediate is 26 bits and multiplied by 4</em></p><h4>Compare and branch</h4><p>These are conditional branches where the decision to continue the execution from a new address depends on the value of the register which is given as a parameter. Their general form is depicted below:</p><pre><strong>cbz  Rn, label</strong>           //if Rn == 0 then pc = new_address*<br><strong>cbnz Rn, label</strong>           //if Rn != 0 then pc = new_address*<br><strong>tbz  Rn, #imm6, label</strong>    //if Rn[#imm6] == 0 then pc = new_address** <br><strong>tbnz Rn, #imm6, label</strong>    //if Rn[#imm6] != 0 then pc = new_address**</pre><p><em>*the immediate is 19 bits and multiplied by 4</em></p><p>**<em>the immediate is 14 bits and multiplied by 4</em></p><p>The <em>#imm6 </em>is an integer ranging from <em>0 to 63</em>, indicating a specific bit of the register which is given as a parameter. For example, the following instruction checks if the value contained in <em>X0</em> is even (bit 0 is clear) and, if so, takes the branch to the address indicated by the label even:</p><pre>tbz X0, 0, even          //if X0 % 2==0 then pc = even</pre><h4>PC relative address calculation</h4><p>The <em>adr</em> and <em>adrp </em>instructions can be used to calculate an address associated with a label and store the result to a general purpose register which is given as a parameter. Their general form is as follows:</p><p>adr Rn, label and adrp Rn, label</p><p>In the first case a <em>21 bit </em>immediate is used, resulting in a range of ±1MB around the current address, while in the second case the address has a range of ±4GB, rounded to a 4KB page, as <em>the 21 bit immediate</em> is <em>shifted left by 12 bits </em>and <em>the 12 LSB bits are padded with zero. 
</em>As mentioned, the result in both cases is stored to the <em>general purpose</em> register which is given as a parameter.</p><h4>Synopsis</h4><p>Here is a summary table to help you keep track of what has been discussed so far in regard to the branch instructions:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*dFj751anuY5HQxKyCB5d_w.png" /></figure><h3>Examples</h3><p>Here is a simple loop and its arm equivalent:</p><pre>x = 3;</pre><pre>while (x &gt; 17) {<br>   ++x;<br>}</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/726/1*-9dkK8WmjlFZDwSyKYA2kA.png" /></figure><p>And here is a simple C program which makes use of the concepts that we discussed so far:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/ed5c929586bf306405dd6bc08a267c34/href">https://medium.com/media/ed5c929586bf306405dd6bc08a267c34/href</a></iframe><p>After compiling it, load it to gdb and disassemble its main function:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/654/1*3z2lLI866b13FCBGwkZjwA.png" /></figure><p>We will go through each line explaining what the corresponding instruction does:</p><ul><li>+0 stp x29, x30, [sp, #-32]!</li></ul><p>Push the <em>frame pointer</em> (fp) and <em>link register</em> (lr) to the stack. Before executing this instruction, the <em>stack pointer</em> (sp) points to 0x7ffffff9f0. 
The instruction will be completed in the following steps:</p><ol><li>sp -= 0x20 =&gt; sp = 0x7FFFFFF9D0</li><li>Store x29 at 0x7FFFFFF9D0</li><li>Store x30 at 0x7FFFFFF9D8</li></ol><figure><img alt="" src="https://cdn-images-1.medium.com/max/998/1*hbDOVvCmPoWo3ubXhjNwaQ.png" /></figure><ul><li>+8 mov w1, #0x1 and +12 mov w0, #0xa</li></ul><p>The instructions above will prepare the call to the <em>looper</em> function by storing its parameters <em>looper(10,1)</em> to <em>w0</em> and <em>w1</em>.</p><ul><li>+16 bl 0x5555550774 &lt;looper&gt;</li></ul><p>Notice that before branching to <em>0x5555550774</em> the program counter points to <em>0x…07cc:</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/821/1*sVfE_oUhgV20yEoICHR5mw.png" /></figure><p>The bl instruction will first store the return address to the <em>link register</em>, thus</p><p>lr = pc + 4 =&gt; <em>lr= 0x…7d0</em></p><p>And finally take the branch:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/686/1*lTmhBXDx25OHyxawatBcTg.png" /></figure><p>Inside the <em>looper </em>function we have the following:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/666/1*81_rpL1-QkJizIDhQwvxPA.png" /></figure><p>The instructions:</p><ul><li>&lt;+0&gt; sub sp, sp, #0x20, &lt;+4&gt; str w0, [sp, #12] and &lt;+8&gt; str w1, [sp, #8]</li></ul><p>will set up the stack and push the function’s parameters to it. Similarly, the str wzr, [sp, #28] will store the zero value to the stack, which after this instruction will be as follows:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/976/1*c2H9Z12kzfzhXuWc66bohA.png" /></figure><p>Next comes the actual loop. 
The <em>w1</em> register will store our integer variable <em>i</em>, so we have the following:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*lvViHaJGvhf3xErBYAMvsQ.png" /></figure><p>What the <em>green block </em>does is increase the second parameter which is given to the function by one (b++)<em>. </em>Indeed, the value at address <em>sp+8</em> contains this parameter, which is then loaded to w0, increased by one (at +24) and stored back to <em>sp+8</em>. The yellow block does exactly the same thing for the local variable <em>i. </em>Finally at offsets 44 →56 the <em>increased by one</em> value is loaded to w1, the threshold is loaded to w0, these values are compared and if w1 &lt; w0 the loop continues. When w1 becomes equal to w0 the <em>blt</em> is not taken and the code continues in order to store the return value to w0, restore the sp and use <em>ret</em> to set the program counter to the value stored in the link register:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/470/1*mMHbXruxXVokbqTh_lL-qQ.png" /></figure><p>The last instruction will bring us back to main:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/668/1*at8CnfDJEcHmqDPywYlYiA.png" /></figure><p>Not much different than the previous call, our printf takes two parameters:</p><p><em>printf(“%d\n”,k);</em></p><p>As happened before, the address of <em>“%d\n”</em> will be stored to x0 and the result from the <em>looper </em>will be stored to w1:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/382/1*sFgVxtPe5kj-hNtmYeprRA.png" /></figure><p>Finally, after returning from printf and then returning from main, we have the call to exit:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/666/1*fDoyAwjSxNgDq4emICNMpw.png" /></figure>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[ARM 64 Assembly Series — Load and Store]]></title>
            <link>https://valsamaras.medium.com/arm-64-assembly-series-load-and-store-6bfe9c1d1896?source=rss-ded5e114da13------2</link>
            <guid isPermaLink="false">https://medium.com/p/6bfe9c1d1896</guid>
            <category><![CDATA[assembly]]></category>
            <category><![CDATA[arm64]]></category>
            <dc:creator><![CDATA[+Ch0pin️]]></dc:creator>
            <pubDate>Thu, 14 Jul 2022 19:22:38 GMT</pubDate>
            <atom:updated>2022-08-08T09:09:45.503Z</atom:updated>
            <content:encoded><![CDATA[<h3>ARM 64 Assembly Series — Load and Store</h3><h4>Previous posts: <a href="https://valsamaras.medium.com/arm-64-assembly-series-basic-definitions-and-registers-ec8cc1334e40">Basic definitions and registers</a>, <a href="https://valsamaras.medium.com/arm-64-assembly-series-offset-and-addressing-modes-aa48b65b4c99">lab setup, offset and addressing modes</a></h4><p>As we discussed in the <a href="https://valsamaras.medium.com/arm-64-assembly-series-offset-and-addressing-modes-aa48b65b4c99">previous post</a>:</p><ul><li>The AArch64<strong> </strong>architecture<strong> </strong>supports a single instruction set called <strong>A64</strong> which consists of <strong>fixed-length 32 bit </strong>instructions that can be used to: <em>Load and store data, change the address of the next instruction to be executed, perform arithmetic or logical operations, perform a special operation</em></li><li>AArch64 is a <strong>load-store </strong>architecture, which means that only load and store instructions can access the memory.</li><li>The <strong>l</strong>oa<strong>d</strong> <strong>r</strong>egister ldr and <strong>st</strong>ore <strong>r</strong>egister str instructions are used to transfer: <em>bytes</em> (8 bits), <em>half-words </em>(16 bits), <em>words</em> (32 bits) and <em>double words</em> (64 bits) from a memory address to registers or from registers to a memory address.</li></ul><p>In this post we are going to cover the load and store instructions and, most importantly, we are going to see how they can be formed in order to carry information about the size of the data that they are operating on. 
This, in conjunction with the <a href="https://valsamaras.medium.com/arm-64-assembly-series-offset-and-addressing-modes-aa48b65b4c99">offset and addressing</a> syntax might seem a little bit confusing in the beginning, but hopefully by the end of this article you will be able to fully understand these concepts.</p><h3>Loading and Storing Data</h3><p>The <strong>ldr</strong> and <strong>str</strong> instructions can be used to load or store <strong>one</strong> or <strong>a pair of </strong>registers at a time. Let’s see the corresponding syntax in each case:</p><h4>Single register</h4><p>As the title implies, in this case, a single register is used as a source or a destination during a data transfer from -or- to memory. The basic syntax is as follows:</p><p><strong>op&lt;sz&gt; Rn, &lt;address&gt;</strong></p><ul><li>The <strong>op </strong>refers to the instruction mnemonic, which can be <strong>ldr</strong> or <strong>str</strong> (capitalisation is optional)</li><li>The <strong>&lt;sz&gt;</strong> refers to the size of the data to be transferred (see below)</li><li>The <strong>Rn</strong> refers to the source or destination register</li><li>The <strong>&lt;address&gt; </strong>refers to the memory address to or from which the data will be transferred</li></ul><p>When the &lt;sz&gt; parameter is omitted, the data size to be moved is determined by the symbol which is used to refer to the register (<a href="https://valsamaras.medium.com/arm-64-assembly-series-basic-definitions-and-registers-ec8cc1334e40"><em>remember</em></a><em> </em><strong><em>x</em></strong><em> implies a 64 bit size and </em><strong><em>w </em></strong><em>a 32 bit size</em>).</p><p>Let’s see an example to clarify this case:</p><pre>ldr x1, &lt;address&gt;       <em>//load 64 bits from &lt;address&gt; into x1</em><br>str x1, &lt;address&gt;       <em>//store 64 bits from X1 to 
&lt;address&gt;</em></pre><pre>-----------------</pre><pre>ldr w1, &lt;address&gt;       <em>//load 32 bits from &lt;address&gt; into w1</em><br>str w1, &lt;address&gt;       <em>//store 32 bits from w1 to address</em></pre><p>The &lt;sz&gt; can be used to force a size other than the default. This parameter can be either b, h or w and indicates an unsigned byte, a half word or a word respectively. Finally, adding an <strong>s</strong> in front of these letters (sb, sh, sw) forces the CPU to handle the data as signed. Note that the signed forms exist only for loads, and that the byte and half word forms take a <strong>w</strong> register as source or destination while the base address is always held in an <strong>x</strong> register (or sp).</p><p>Let’s see some examples:</p><pre>ldrb w1,[x2]       <em>//load one byte from *x2 into w1, zero extended</em></pre><pre>strh w1,[x2],#3    <em>//store a half word (2 bytes) from w1 to *x2 and set x2 = x2 + 3</em></pre><pre>ldrsh w0,[x3]      //load <em>a half word (2 bytes) from *x3 into w0 and sign extend it </em></pre><p>By <em>sign-extend </em>we mean that the upper bits of the destination register are filled with copies of the sign bit of the loaded value:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*xQVIwuoa2Xkj9hL6_xMwaA.png" /><figcaption><a href="https://armkeil.blob.core.windows.net/developer/Files/pdf/graphics-and-multimedia/ARMv8_InstructionSetOverview.pdf">https://armkeil.blob.core.windows.net/developer/Files/pdf/graphics-and-multimedia/ARMv8_InstructionSetOverview.pdf</a></figcaption></figure><p>In the first case (see figure above), the byte <strong>0x8A</strong> will be loaded to the <strong>w4</strong> (32bits) register and the remaining 3 bytes will be filled with copies of its sign bit (0xFF each, since the top bit of 0x8A is set). Exactly the same happens in the second case, with the only difference that <strong>x4</strong> refers to 64 bits, thus 7 bytes are going to be sign extended. 
Omitting the <strong>s </strong>extension (last case) will pad the remaining destination bytes with 0.</p><h4>Pair of registers</h4><p>The <strong>ldp</strong>, <strong>stp</strong> instructions can be used to move twice as much data as ldr, str since they use a pair of registers each time. The general syntax is as follows:</p><p><strong>&lt;op&gt;&lt;sz&gt; Rn,Rm, &lt;address&gt;</strong></p><p>This operation can be broken down into the following steps:</p><ul><li>Load or store Rn to &lt;address&gt;</li><li>Increase &lt;address&gt; according to the size of Rn (4 bytes for 32 bit transfer or 8 for 64 bit transfer)</li><li>Load or store the second register to the (increased) address</li></ul><p>Other than that, the remaining parts of the instruction have the same meaning as in the previous case, so let’s go straight to the examples:</p><p><strong>Example 1: </strong>*x2 will be loaded to w0 and *(x2 + 4) will be loaded to w1</p><pre>ldp w0, w1, [x2]         </pre><p><strong>Example 2: </strong>sp (the stack pointer) will be set to sp - 16 bytes, then x29 will be stored to the address indicated by the sp and x30 will be stored to sp + 8 bytes</p><pre>stp x29,x30, [sp, #-16]!</pre><p><strong>Example 3: </strong>the value stored in the memory address where sp points will be loaded to x29, the value stored at sp + 8 bytes to x30 and finally sp will be modified to sp + 16 bytes</p><pre>ldp x29,x30, [sp], #16</pre><p>If you have ever used a disassembler, then the last two examples may seem familiar, as they are used to allocate and release space on the stack during a function call:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/962/1*rdNpBzTYOekDnjHduNgmSQ.png" /><figcaption>Disassembling a function with Ghidra</figcaption></figure><h3>Example</h3><p>Let us now write a program that demonstrates the instructions we discussed so far. 
If you have set up your lab, use the following one-liner to start the vm:</p><pre>qemu-system-aarch64 -m 1024 -M raspi3b -kernel kernel8.img -dtb bcm2710-rpi-3-b-plus.dtb -sd 2022-01-28-raspios-bullseye-arm64.img -append &quot;console=ttyAMA0 root=/dev/mmcblk0p2 rw rootwait rootfstype=ext4&quot; -nographic -device usb-net,netdev=net0 -netdev user,id=net0,hostfwd=tcp::5555-:22</pre><p>If you haven’t set up your lab yet, you can use this <a href="https://cpulator.01xz.net/?sys=arm">link</a> to do your experiments (unfortunately it doesn’t support ARMv8 yet but it can be very helpful for simple examples). Next, copy the following code:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/616722144e96398d0064963b1f465377/href">https://medium.com/media/616722144e96398d0064963b1f465377/href</a></iframe><p>And compile it with:</p><p>$as filename.s -o filename.o &amp;&amp; ld filename.o -o filename</p><p>In <strong>line 2</strong>, you see what is called a <strong>label</strong>, which is something like a function name in higher level languages. The <strong>_start</strong> label defines the entry point of the program while the .<strong>global</strong> directive is a way to export a symbol. The instructions at <strong>lines 9,10</strong> form a system call (or <a href="https://man7.org/linux/man-pages/man2/syscall.2.html">syscall</a> in short):</p><blockquote><strong>syscall</strong>() is a small library function that invokes the system<br> call whose assembly language interface has the specified number<br> with the specified arguments. Employing syscall() is useful, for<br> example, when invoking a system call that has no wrapper function<br> in the C library.</blockquote><p>Simply put, a system call is like requesting a task from the kernel. 
These tasks are indexed and identified by an integer which is passed through a special register followed up by a software interrupt instruction, indicated by the svc #0 mnemonic (in the case of AArch64).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/746/1*gMFUyiLeB05FzuxcvOs4ww.png" /><figcaption>syscall conventions depending on the architecture</figcaption></figure><p>In our example above, the exit system call for AArch64 is indicated by number <strong>93</strong>, so in our case we first mov this value to <strong>w8</strong> (the special register we were talking about) and then use the svc #0 to perform the call. Let’s load the program to gdb, set a breakpoint to the beginning of the function _start and hit run:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/671/1*xm-EYryiraTvxggm2v6BiQ.png" /></figure><p>The mov instructions will store the values 10 and 20 to the registers x29 and x30 respectively, so after they get executed you will see the following:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/700/1*zYhq73Nu_8QAnuLfiFK2ew.png" /></figure><p>Also, notice that sp is pointing to 0x7ffffffb30, which brings us to the next instruction: it will first subtract the value <strong>16 </strong>from sp and then store the values <strong>10</strong> and <strong>20</strong> to the stack:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/489/1*AoladS4RoIg9eVgA3P0fSw.png" /><figcaption>0x7ffffffb20: 0x000000000000000a, 0x7ffffffb28: 0x0000000000000014</figcaption></figure><p>The next two instructions will store the values 16 and 11 to x29 and x30:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/739/1*t8pHDC7RdglaloKK7HNdyw.png" /></figure><p>Next is ldp, which, as we said, will restore the previous values of x29 and x30 from the stack and set sp back to 0x7ffffffb30:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/719/1*-MNadeJgD5jZG8wJg2nQgA.png" /></figure><p>Finally, the 
<strong>b exit </strong>will <strong>b</strong>ranch the execution to the exit label and finish our program:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/720/1*WgQP8Gbq7rh-oH0EMSuKnQ.png" /></figure><h3>Food for thought</h3><p>To make these posts more interactive, here is a challenge until the next post:</p><p>Assume the following C statements:</p><pre>int x[] = {1,2,3,4,5};</pre><pre>x[0] = 6;<br>x[1] = x[2];<br>x[3] = x[0];</pre><p>Write the arm version of it using only ldr, str and mov.</p><pre>.global _start</pre><pre>_start:<br>     ldr r0, =x</pre><pre>     @ write your program here</pre><pre>.data <br>x: .word 1,2,3,4,5</pre>]]></content:encoded>
        </item>
    </channel>
</rss>