That time I Reverse-Engineered the Motorola CBEP Protocol

This is the tale of how I reverse-engineered a Motorola CPS radio protocol to make it work on Linux. While this may have been of questionable legality, and I eventually lost interest in the project, I learned a lot about how to reverse engineer. I’m writing this entry more than a year after I initially did this, so I may be a little rusty on the details, but this is the gist of it.

My father worked in radio communications so when he passed I inherited his old EOL Motorola XTS 3000. I got an FCC ham radio license and wanted to utilize this device in service of my fledgling new radio hobby. Turns out, this device was the Rolls Royce of radios in its day. It can operate within the ham bands, can do encryption, digital communication, P.25, D-Star, trunking, and pumps out a very clear signal. In short, this was a heavy-duty mission-radio.


Motorola XTS3000 Radio

So I purchased a new $30 lithium-ion battery and a programming cable, and then ran into a few unfortunate roadblocks. The first two…

  1. This device cannot be front-face programmed, which is a fancy way of saying you cannot just set in an arbitrary frequency on the fly. Kinda sucks if you want to change to another random frequency.
  2. Programming can only be done with Motorola’s proprietary CPS (Customer Programming Software) – but this was trivially easy to download.

…All of that was trivial compared to the next bomb-shell…

You could only program this device using a hardware serial port running on native 32-bit Windows. This means no Windows 7/8/10, no Virtual Machines, no Linux, no USB-to-Serial port.

Radio Reference users lamented that they were forced to maintain an old Windows XP laptop with a serial port for programming their device. I personally went out and purchased a $75 computer off Craigslist. Damn!

Up to this point, here are my thoughts: A serial RS-232 port is a “dumb” port compared to modern USB or PCI devices. In fact, the data sent over serial does not even follow a single standard; it’s whatever the device programmers decide. RS-232 is as close to raw data over a wire as you can get. So why doesn’t this work in a VM? And why can’t I just capture this data, replay it and have the device function as normal?

Chipping Away at the Secret

I questioned the assumptions and put in an FTDI cross-over cable I made. One end went into the Windows machine, the other end went into my Linux machine, and a final serial-to-radio cable connected to the device. This way my computer was essentially doing a Man-in-the-Middle (MitM) attack, but for serial.


FTDI Cross-over-Cable

I whipped up some C code to relay messages between the Windows machine and device. When I initialized the connection, the radio beeped! And then nothing happened…the software timed out and complained that the connection to the radio failed. I captured the initialization bytes 0x01,0x02,0x01,0x40,0xF7 and replaying them clearly made the radio do something, but it stopped responding immediately afterwards.


Serial Capture Cable

I tried this process several times, but it failed. Annoyed, I looked into purchasing an RS-232 snooping cable: a cable with two regular connections that transfer data as normal and two tap ports, one that taps pin 2 and another that taps pin 3. For whatever reason, well-built cables cost upwards of $100 online and proper hardware snooping devices cost $500, way above my budget. So I decided it was much cheaper to build my own damn cable. I already had the program I whipped up to read the bits from the transfer.

And it worked!

I saw a flurry of bits fly across the terminal. In the mix, I noticed a subtle pattern: a pause in the transfer, 7 bytes sent and then echoed back, another pause, and then another flurry of bits. For example, I saw:

0xF5,0x11,0x20,0x00,0x00,0x00,0xD9.

Later on I saw:

0xF5,0x11,0x20,0x00,0x00,0x20,0xB9

and then

0xF5,0x11,0x20,0x00,0x00,0x40,0x99

If you didn’t catch it, the second-to-last byte increases by 0x20 (32) each time while the last byte always ends in 9. (Spoiler: the repeating 9 was coincidental.) I interpreted this as a value increasing by 32, with the last byte being a checksum. This was a lucky half-guess, because I had no way to know that at the time.
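The pattern is easier to see once the frames are split into fields. Below is a minimal Python sketch of that step. The field names (prefix, marker, value, trailer) are labels invented here for illustration, and reading bytes 3 to 5 as one big-endian number is an assumption, not something the capture proves.

```python
# The three 7-byte frames quoted above.
FRAMES = [
    bytes([0xF5, 0x11, 0x20, 0x00, 0x00, 0x00, 0xD9]),
    bytes([0xF5, 0x11, 0x20, 0x00, 0x00, 0x20, 0xB9]),
    bytes([0xF5, 0x11, 0x20, 0x00, 0x00, 0x40, 0x99]),
]


def split_frame(frame):
    """Split a frame into hypothetical fields (the names are guesses)."""
    if len(frame) != 7:
        raise ValueError("expected a 7-byte frame")
    return {
        "prefix": frame[0:2],                        # constant 0xF5,0x11
        "marker": frame[2],                          # constant 0x20
        "value": int.from_bytes(frame[3:6], "big"),  # assumed big-endian
        "trailer": frame[6],                         # suspected checksum
    }


fields = [split_frame(f) for f in FRAMES]
values = [f["value"] for f in fields]
trailers = [f["trailer"] for f in fields]

# Differences between consecutive frames.
value_steps = [b - a for a, b in zip(values, values[1:])]
trailer_steps = [b - a for a, b in zip(trailers, trailers[1:])]

print(value_steps)    # [32, 32]
print(trailer_steps)  # [-32, -32]
```

The value climbs by 32 per frame while the trailing byte drops by 32, which is what makes the last byte look like it depends on the bytes before it.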

Again I tried to replay these same bits, and again I ran into the same failure.

I briefly attempted to run the program in IdaPro, GNU Debugger for Windows and Immunity Debugger, but this approach failed and I am still not certain why. For example, I found a string that was clearly only utilized in one place in the binary and set an IdaPro breakpoint when the binary accessed that memory address. But for whatever reason, it did not break. Moreover, I learned the hard-way that the Win32 GUI API controls the application flow and is far from linear, so I could not just break after, say, 30,000 GUI instructions and expect to reliably step into the desired code.

I also briefly tried to use some RS-232 snooping tools, but every tool I found relied on a modern .NET framework not available on Windows XP. Moving on…

Win32 API Monitor

A Windows XP zealot on IRC informed me of a Win32 API monitoring application that would monitor DLL files and produce a list of when they were executed and their arguments. That might be useful.

I spent some time reading up on how Windows communicates over serial: It uses CreateFileA() to open the COM1 port, typically listed as "\\.\COM1", and then calls ReadFile() and WriteFile(), similar to Unix’s read(2) and write(2) syscalls. I expected to see this in the API capture.

Nope! Instead, I saw that the binary used CreateFileA() against Commsb9. Next, I saw a write call sending over 0xF5,0x01,0x02,0x01,0x40 but not the trailing 0xF7. Win32 API Monitor even told me the application was talking to the driver using USB IOCTLs. Not only is this device serial, USB had barely been invented when this program was created. What’s going on here?

Reading the code, I identified that these Read/Write commands were taking place in VcomSB96.dll, and filtering by that DLL file, I saw that it was loading Commsb96.sys and Commsbep.sys. In my experience sys files are always driver files.

The Driver

Looks like we are working with a driver. With my limited background in writing a Linux USB driver, Microsoft’s excellent documentation and websites like this, I had an idea of what I needed to hunt for. The C code would look like this:

NTSTATUS DriverEntry( IN PDRIVER_OBJECT pDriverObject,
IN PUNICODE_STRING pRegistryPath )
{
NTSTATUS ntStatus = 0;
UNICODE_STRING deviceNameUnicodeString, deviceSymLinkUnicodeString;
...
pDriverObject->DriverUnload = OnUnload;
pDriverObject->MajorFunction[IRP_MJ_CREATE] = Function_IRP_MJ_CREATE;
pDriverObject->MajorFunction[IRP_MJ_CLOSE] = Function_IRP_MJ_CLOSE;
pDriverObject->MajorFunction[IRP_MJ_DEVICE_CONTROL] = Function_IRP_DEVICE_CONTROL;

...
}

This snippet is assigning the driver methods to the pDriverObject struct. I opened up the Commsb96.sys in IdaPro, expecting to hunt through, starting from the obligatorily exported DriverEntry symbol and trace where the DeviceIoControl initializes. To my surprise I saw this:


IdaPro with Full Symbols

Wow, that was easy. Turns out, a now defunct company called Vireo Software, Inc produced this driver in 1997 and failed to strip out the kernel symbols, making it much easier to reverse. Looks like they used C++ and compiled with optimization. That produced assembly that is a bit difficult to follow, but which I eventually traced back to where the DeviceIoControl messages landed in the kernel, and from there tracked down the USB message that Win32 API Monitor detected.

I finally traced code that read 5 bytes from the stack, ran them through a loop, calculated a single byte, then returned that byte. I wish I could claim to have written the code below, but it was actually done by another IRC contact. He had a (probably bootleg) professional version of IdaPro that produced the following C code.

unsigned char sbCRC(const unsigned char *msgbuf, const int len) {
     const unsigned char table[8] = {}; // REDACTED FROM THE BLOG TO AVOID LEGAL TROUBLE!

     unsigned char a, b, n = 0;
     int i = 0;

     while (i < len) {
          n = (unsigned char)*(msgbuf + i) ^ n;
          a = ((unsigned char)((signed char)n >> 1) >> 1) ^ n;
          b = a;
          a = ((signed char)a << 1) & 0xF0;
          b = (signed char)b >> 1;
          if (b & 0x80)
               b = ~b;
          n = (a + (b & 0x0F)) ^ table[n & 0x07];
          i++;
     }

     return n;
}

I tested this function against known values such as the 0xF5,0x11,0x20,0x00,0x00,0x00 frame I cited before: I ran those six bytes through the function and it returned 0xD9. Boom! Checksum reversed!
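For anyone who would rather experiment from a scripting language, here is a hedged Python port of the C routine above. It mirrors the same steps, with helper names of my own choosing, and it takes the lookup table as a parameter. The table used below is an all-zero placeholder, so the output will not match 0xD9 or any other real checksum.

```python
def _as_signed8(value):
    """Reinterpret an 8-bit value as a C signed char."""
    return value - 256 if value & 0x80 else value


def sb_crc(msg, table):
    """Port of the sbCRC() routine shown above; table must hold 8 bytes."""
    n = 0
    for byte in msg:
        n ^= byte
        # (unsigned char)((signed char)n >> 1) >> 1, then XOR with n
        a = (((_as_signed8(n) >> 1) & 0xFF) >> 1) ^ n
        b = a
        a = (_as_signed8(a) << 1) & 0xF0
        b = (_as_signed8(b) >> 1) & 0xFF   # arithmetic shift, kept to 8 bits
        if b & 0x80:
            b = ~b & 0xFF
        n = ((a + (b & 0x0F)) ^ table[n & 0x07]) & 0xFF
    return n


# Placeholder only: the real 8-byte table is not published here.
PLACEHOLDER_TABLE = [0] * 8

frame_body = bytes([0xF5, 0x11, 0x20, 0x00, 0x00, 0x00])
result = sb_crc(frame_body, PLACEHOLDER_TABLE)
print(hex(result))
```

Passing the table in as an argument keeps the function testable without embedding the redacted values in the source.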

The Missing Link – RS-232 Flow Control

Coming of age in the 2000s, I learned about effectively full-duplex buses such as USB or PCI. On modern buses you can send data more or less asynchronously, but RS-232 often requires manual flow control through the CTS, DSR, RTS and DTR pins.

Providentially, around this time I found the amazing tool 232Analyzer, which was the only tool of its sort that did not require a modern .NET framework. Had I found it earlier, this would have saved me a lot of time! But I learned so much along this process.


232Analyzer Capture

Putting it all Together

With that, I modified my python code to emulate the flow control, calculate the checksum and sent the resulting bytes over. And this time….it worked! I could replay messages from original CPS and the radio responded with the same flutter of meaningless data.

Asking around on forums and IRC, and reading up on the Motorola hardware, I learned that these sorts of devices do not request or set specific values from the radio as an API abstraction layer might do. Instead, you request memory locations and interpret those locations according to a pre-known memory map. I deduced that the 0xF5,0x11 bytes meant “read”, the 0x20 is some sort of divider (or maybe more?), the next 3 bytes are the memory location, and the final byte is a checksum.
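Under that hypothesis, building a read request is just concatenation plus a checksum. The sketch below is illustrative only: build_read_request and captured_checksum are hypothetical names, the address is assumed to be big-endian, and the checksum is passed in as a callable. Here the callable is a stand-in that only knows the three checksums from the captures quoted earlier.

```python
READ_PREFIX = bytes([0xF5, 0x11])  # hypothesised "read" opcode
MARKER = 0x20                      # the unexplained third byte


def build_read_request(address, checksum):
    """Assemble a 7-byte read frame; checksum is any callable over the body."""
    if not 0 <= address <= 0xFFFFFF:
        raise ValueError("address must fit in 3 bytes")
    body = READ_PREFIX + bytes([MARKER]) + address.to_bytes(3, "big")
    return body + bytes([checksum(body) & 0xFF])


# Stand-in checksum: a lookup of the trailing bytes seen in the captures.
_KNOWN = {0x000000: 0xD9, 0x000020: 0xB9, 0x000040: 0x99}


def captured_checksum(body):
    return _KNOWN[int.from_bytes(body[3:6], "big")]


frame = build_read_request(0x000020, captured_checksum)
print(frame.hex())  # f51120000020b9
```

With a real checksum routine substituted for the lookup, the same function could produce requests for addresses that were never captured.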

Armed with this hypothesis, I found the memory location of the device serial number and my code could read the serial. I tested this on my second radio and it correctly retrieved that serial number too.

To find other values, I recorded reading the radio memory, made a single change, then read the radio memory again and compared the difference to isolate values. I was able to find frequency values, channel names, signal strength values, etc! With time, I could have mapped out the entire address space! I even found the write-prefix, but was too scared to test it for fear of bricking my radio.
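The compare step can be sketched in a few lines of Python. This is a generic byte-for-byte diff, not the code used at the time, and the two dumps below are made-up sample data.

```python
def changed_ranges(before, after):
    """Return (start, end) offsets, end exclusive, where two dumps differ."""
    if len(before) != len(after):
        raise ValueError("dumps must be the same length")
    ranges = []
    start = None
    for offset, (old, new) in enumerate(zip(before, after)):
        if old != new:
            if start is None:
                start = offset            # a run of changed bytes begins
        elif start is not None:
            ranges.append((start, offset))  # the run just ended
            start = None
    if start is not None:
        ranges.append((start, len(before)))
    return ranges


# Made-up example dumps: one setting changed between the two reads.
before = bytes([0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77])
after = bytes([0x00, 0x11, 0xAA, 0xBB, 0x44, 0x55, 0x66, 0x78])

print(changed_ranges(before, after))  # [(2, 4), (7, 8)]
```

Changing one setting at a time keeps the list of ranges short, so each range can be tied to the value that was edited.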

Anti-Climactic Ending

Somewhere along this, I wondered “wait…is this legal?” I contacted the EFF. They were extremely eager to get back to me and after a long conversation the lawyer suspected that because the CPS has a copyright notice and I did not…um…come into acquisition of it through a financial transaction (sure, that phrasing works), it was likely illegal to distribute the reverse-engineered code.

And mapping out the memory got really tedious and annoying. And I started watching The Walking Dead.

And right about here my journey ended.

::cough::

But I learned so much!

  1. How Windows device drivers work
  2. Windows API calls (turns out, they don’t suck)
  3. How to reverse engineer code with IdaPro
  4. How RS-232 traffic flow works
  5. A buncha tools!

Thoughts? Should I have kept going?

Draw this shape without picking up your pen

For many years, while in a meeting or in a moment of free time, I have tried to draw this shape without picking up my pen or drawing over the same line twice.

shape

At best I would get 1 line away, but never completed the shape.

I wanted to know if it was even possible. So I wrote some python code to try every possible combination. The code is below.

#!/usr/bin/env python3

import copy
import sys

lines = {
        1:[2,3],
        2:[1,3,4,6,7],
        3:[1,2,5,6,7],
        4:[2,6],
        5:[3,7],
        6:[2,3,4,7,8],
        7:[2,3,5,6,8],
        8:[6,7]
    }

def check(cstate):
    for offset in lines:
        if sorted(lines[offset]) != sorted(cstate[offset]):
            return
    print("Solution!")
    sys.exit()

def iteration(clocation, cstate):

    if len(cstate) == 8:
        check(cstate)

    for ilocation in lines[clocation]:
        nstate = copy.deepcopy(cstate)
        y = nstate.get(clocation, [])
        x = nstate.get(ilocation, [])

        if ilocation in y:
            continue

        y = y + [ilocation]
        x = x + [clocation]

        nstate[clocation] = y
        nstate[ilocation] = x
        iteration(ilocation, nstate)

iteration(1, {})
iteration(2, {})

The lines dict is an abstraction of the points in the shape and which other points each one connects to. Point 1 is the top point, 2 and 3 are the top corners of the square, 4 and 5 are the far left and right points of the triangle, etc.

The search starts only at points 1 and 2. Point 1 is functionally the same as points 4, 5 and 8, while point 2 is the same as 3, 6 and 7, so there is no need for additional starting points. Given its current location, the code recursively draws lines to every connected point that is not already joined to it. If no such point is available, it just returns.

It stops when all possible links are drawn, as seen in the check function. This is done by first checking that every point has been touched at least once, and then iterating through all points to see whether each is connected to every point it can connect to.

Turns out it is not possible.

Sucks.
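There is also a shortcut that avoids the brute force. Euler's result on drawing a connected figure in one stroke says it works only when zero or exactly two points have an odd number of lines meeting at them. The sketch below counts those points. The connection table is my reading of the shape, with point 4 joined only to 2 and 6 so that it mirrors point 5, and the name odd_points is made up for this example.

```python
from collections import Counter

# Connections between the 8 points, listed from both ends.
EDGES = {
    1: [2, 3],
    2: [1, 3, 4, 6, 7],
    3: [1, 2, 5, 6, 7],
    4: [2, 6],
    5: [3, 7],
    6: [2, 3, 4, 7, 8],
    7: [2, 3, 5, 6, 8],
    8: [6, 7],
}


def odd_points(edges):
    """Return the points where an odd number of lines meet."""
    # Store each line once, regardless of which end listed it.
    segments = {frozenset((a, b)) for a, nbrs in edges.items() for b in nbrs}
    degree = Counter(point for seg in segments for point in seg)
    return sorted(p for p, count in degree.items() if count % 2)


odd = odd_points(EDGES)
print(odd)                 # [2, 3, 6, 7]
print(len(odd) in (0, 2))  # False
```

Four points have five lines each, so the shape cannot be drawn in one stroke, which agrees with what the exhaustive search found.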

My python3 Programming Environment

UPDATE: I have since started using a very good vimrc. I recommend it over mine listed below. My only modification is that I removed all line numbers, eww.

I ssh into a FreeBSD jail with everything setup.

The Jail runs on code.mydomainname.com, which has an internet-routable IPv6 address – and IPv4 behind a NAT, (boo!)

I have a virtualenv already built-out. (more about my pip list later)

I set my ~/.bashrc to execute source enter-env.sh (even though I run ksh)

My REPL is ptpython, which just requires touch ~/.ptpython/config.py.

I use gitlab, since they offer free repositories, and then periodically manually backup my code at other locations. If there are automatic ways of doing this, I would be interested.

My project’s gitlab wiki has copy-paste instructions to install all necessary packages, both on FreeBSD and Debian (well….Ubuntu) and subsequent python3 packages that you install with pip.

My default editor is vim, and I set ~/.vimrc to: set ts=4 / set expandtab. I used to set syn on, but that does not seem necessary anymore.

My project requires a PostgreSQL database, so I included the very simple instructions on installation and configuration in the gitlab wiki.

Finally, though I typically code off of a FreeBSD Jail, everything is configured to run on Debian. The main reason it works on Debian is that my personal computer (before my Chromebook took over) ran Mint, but I intend to run this code on a FreeBSD server, primarily for ZFS. I used to code on a Raspberry Pi, but it was too slow.

It takes me about 5 minutes to rebuild this environment, in the event that it goes down (which it never does).

Thoughts?

Adding Arbitrary XML to python-docx

I am thankful to the developers of python-docx; they did a great job, especially since OpenXML is beyond confusing. However, I have two respectful criticisms: python-docx lacks several key features, and though it is properly written, it’s really confusing to follow the code.

Adding an arbitrary tag takes just a few steps: identify the entry point, create a new tag, and append it to the document.

from docx.oxml.shared import OxmlElement # Necessary Import
tags = document.element.xpath('//w:r') # Locate the right <w:r> tag
tag = tags[0] # Specify which <w:r> tag you want
child = OxmlElement('w:ARBITRARY') # Create arbitrary tag
tag.append(child) # Append in the new tag

And that’s it!

I also found “inserting xml-snippet into docx using the python-docx api” online (giving credit where credit is due). To set an attribute on the new tag:

from docx.oxml.shared import qn
child.set( qn('w:val'), 'VALUE') # Add in the value

Thoughts?

Duplicate a Django modelformset_factory Form

I created a formset_factory and wanted to have a simple “click me to add another form”. This seemed like a routine task, but the solutions I found online were unnecessarily complicated or required me to install a separate Django app, which I had no intention of doing.

So I created my own…

The only pre-requirement that this needs besides standard Django is jQuery.

So here is a rough overview of how this works:

  • Create a modelformset in my views.py and send it to the template.
  • Add in a link that’s executed to trigger the new form adding.
  • Django’s formset_factory’s required management_form creates the id_form-TOTAL_FORMS hidden variable. The jQuery must update this value.
  • Have jQuery locate the current form and create a blank copy from.
  • Update the name and id parameters of the copied form using jQuery
  • Identify where to paste the new form
  • Paste it there!

Here is my views.py snippet. In my case, the model is called “Component” and the related form is called “ComponentForm”. I define them as follows:

ComponentFormSet = modelformset_factory(Component, form=ComponentForm)
componentformset = ComponentFormSet( queryset=Component.objects.none() )

The queryset must be set to Component.objects.none(); otherwise the formset is pre-populated with forms for the existing Component objects. I pass “componentformset” in the context to the template.

Next, the HTML must be rendered in the template as follows. Notice that I used componentformset.0 with the “.0“. Why? Because componentformset, being built by modelformset_factory, is a set of forms, not an individual form. We want an individual form, so I am only going to display the first one in the list. The template wraps it in a tbody called “formtemplate”. This will later tell our jQuery where to copy our template from. The <a> tag is necessary to tell jQuery when to add a new form. Finally, the “newcomponents” tbody is where the new form will be pasted.

{{ componentformset.management_form }}
<tbody id="formtemplate">
{{ componentformset.0 }}
</tbody>

<tbody id="newcomponents">
</tbody>
<a href="#" id='addForm'>Add Component</a>

Next is the Javascript. (In my actual production code I put this above the HTML, but you could really put this anywhere since it’ll only trigger when the page fully renders).

<script type="text/javascript">
    $(document).ready(function() {
      // The link and the handler need different names; reusing one name
      // would overwrite the jQuery object with the function.
      var addLink = $('#addForm');
      var formNumberObject = $('#id_form-TOTAL_FORMS');

      var addForm = function(e) {
        e.preventDefault();
        var newFormCount = parseInt(formNumberObject.val(), 10) + 1;
        formNumberObject.val( newFormCount );

        var tabletemplate = $('#formtemplate tr').clone();
        var pastehere = $('#newcomponents');

        // Every field of the first form has an id starting with id_form-0
        var changevalues = tabletemplate.find('[id^=id_form-0]');
        changevalues.each( function() {
          var currentid = $(this).attr('id').replace(/(id_form-)[0-9]+/, 'id_form-' + (newFormCount-1) );
          $(this).attr('id', currentid);

          var currentname = $(this).attr('name').replace(/(form-)[0-9]+/, 'form-' + (newFormCount-1) );
          $(this).attr('name', currentname);

        });

        tabletemplate.appendTo(pastehere);
      };
      addLink.on('click', addForm);
    });
</script>

The JS does the following. This is in order of the logic of the program, not in order of the code above, but you should be able to piece it together.

  • Identify the ‘addForm’ <a> link (stored in addLink) and set it to execute the function ‘addForm’ on click.
  • Identify the id_form-TOTAL_FORMS by ID in the rendered HTML. Again, this is produced by the management_form part of our formset_factory.
  • Identify the number of forms. I suppose I could have just set this to 0 by default, but I’ll have the JS do it for me.
  • When the user clicks the ‘addForm’ link, it will execute the function ‘addForm’, which does the following:
    • Add 1 to the current newFormCount value.
    • Create a copy of the form and store it in “tabletemplate”.
    • Identify where to paste the data, identified by “newcomponents”.
    • Find each element whose ID starts with “id_form-0” (a jQuery attribute selector, not a RegEx)
    • Iterate for each instance, and replace the number 0 with the current number. This will keep the individual component name intact, but update the count.
  • Append the newly created form to the “pastehere” tbody.

With that, you should be able to add a new form with the click of a button!

Thoughts? Comments? Please do let me know so I can fix any mistakes or update the code as needed. And remember…Free Palestine!

Custom Django Fixture Imports

I needed to convert an XML document into a customized Django model, with modifications to the data based on programmable logic. Converting it to my model’s fixture format would take too long and be unnecessary work, so I instead opted to convert the data manually.

I figured I could just import the Django model object, as follows:

from tester.models import Control
a = Control()

However, I got the following vexing error in red:

$ python code.py
Traceback (most recent call last):
  File "code.py", line 1, in <module>
    from tester.models import Control
  File "/home/nahraf/src/beater/tester/models.py", line 5, in <module>
    class Control(models.Model):
  File "/home/nahraf/src/beater/tester/models.py", line 6, in Control
    family = models.CharField(max_length=40)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/fields/__init__.py", line 1012, in __init__
    super(CharField, self).__init__(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/fields/__init__.py", line 146, in __init__
    self.db_tablespace = db_tablespace or settings.DEFAULT_INDEX_TABLESPACE
  File "/usr/local/lib/python2.7/dist-packages/django/conf/__init__.py", line 46, in __getattr__
    self._setup(name)
  File "/usr/local/lib/python2.7/dist-packages/django/conf/__init__.py", line 40, in _setup
    % (desc, ENVIRONMENT_VARIABLE))
django.core.exceptions.ImproperlyConfigured: Requested setting DEFAULT_INDEX_TABLESPACE, but settings are not configured. You must either define the environment variable DJANGO_SETTINGS_MODULE or call settings.configure() before accessing settings.

In short, the solution is to point Django at your project’s settings and call django.setup() before importing any model. (The project that holds my “tester” application is called “beater” cuz I beat up on it 🙂)

My corrected code is as follows:

import os
# This must be executed before the import below
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "beater.settings")
import django
django.setup()
from tester.models import Control
Control()

After that, the code was able to import and use the object. I hope this helps!

Free Palestine, Boycott Apartheid Israel!

Why Numerous Programming Languages?

There are numerous programming languages out there, some of which have general purpose and some have specific purposes. Here are some of the languages I’ve come across.

  • Assembly Language – This is not so much a language, as a way to write raw CPU instructions in a way that’s more human readable. I’ve only seen it used to write simple libraries and low-level operating system functions.
  • BASIC – A beginner-friendly language (the name stands for Beginner’s All-purpose Symbolic Instruction Code) used to perform simple tasks or write games.
  • C/C++ – These are general purpose languages that run directly on the hardware, which means dealing directly with memory and operating system specifics. Their manipulation of the hardware can only be through the operating system.
  • C# – A C-like language from Microsoft that runs on the .NET runtime and calls upon its libraries.
  • Java – A general purpose language that does not run directly on the physical hardware but on a virtual machine. It was primarily built to make the compiled program portable across all physical platforms and OSes.
  • Perl – An interpreted scripting language. It was initially created as a “glue language” to perform simple tasks or fit into unique places (such as a robust CGI language).
  • PHP – A web scripting language that is interpreted through a PHP interpreter.
  • Python – Object-oriented, multi-platform, interpreted language (which means it requires an interpreter). Never used it, so here it is.
  • Ruby – I don’t know much about, so here’s a link.

This list could go on forever. I should also add Fortran and Pascal to this list (but I won’t).

There is no “best language”, there are just different languages for different purposes. But if you are going to learn a language for general purposes, I would suggest C++, one of those .NET languages or Java.