Wednesday, May 12, 2010

The return of the PLT

Once upon a time, people could implement their simple (non-security-related) system virtualization tools with libc syscall wrappers and LD_PRELOAD. That time seemed gone, because glibc uses a pair of symbols for each exported function. One of them has hidden visibility and is often preferred for internal calls. Calls to symbols with hidden visibility do not go through the PLT, and as a result cannot be overridden easily.

This is why, if you override write with LD_PRELOAD, printf will not pick up your replacement: its internal call to write is resolved inside glibc and never goes through the PLT.

Hiding the symbols saves symbol lookups, PLT entries, and GOT indirections for non-branching relocations, but it makes LD_PRELOAD a lot less usable.

There is a glibc configure switch that is supposed to turn these optimizations off, but unfortunately it was broken somewhere after the 2.3 release of glibc. This switch is called "disable-hidden-plt".

I implemented a set of tools to make a selected list of symbols exported again. It was written with minimal assumptions about the C library code, and it needs testing. So if you encounter this problem (for example, if you want to use plasticfs or similar tools), you should definitely give it a try.

If this code proves useful for what I am working on, then I'll rewrite it... for now it is just a bunch of dirty scripts.
It prints interesting information about internal glibc symbols though.

https://sourceforge.net/projects/glibchiddenplt/

Monday, May 10, 2010

Taking advantage of field splitting.

When you type $x in bash (or your favorite shell), you might have noticed that $x may well cover several arguments. For example, if the value of x contains a space between two words, it will expand to at least two fields.

Fields are not to be confused with tokens. Tokens are roughly the words you see on your command line, whereas field boundaries depend on the expansion mechanisms that were used.

In a nutshell, field splitting happens for unquoted dollar constructs. (I consider `` obsolete and harmful; $() replaces it.)

Field splitting is controlled by the special IFS shell variable. The basic treatment is to split on every character found in IFS. An important distinction is made between the characters ' ', '\t', '\n' and the others. The former are whitespace (POSIX calls them IFS white space), and a run of them counts as a single delimiter, so they do not create empty fields when found consecutively.

For example, if you add ';' and ':' to IFS, the string
A=" ;:ab:c "
will expand to two empty fields, followed by "ab" and "c".
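As a quick check of the splitting rules above, here is a minimal sketch in POSIX shell. It assumes that both ';' and ':' are appended to the default IFS; the variable names are only illustrative.

```shell
A=" ;:ab:c "
OIFS=$IFS
IFS="$IFS;:"        # assumption: default IFS plus ';' and ':'
set -- $A           # unquoted on purpose, so field splitting applies
IFS=$OIFS           # restore IFS as soon as the split is done
count=$#
echo "$count fields"
for f in "$@"; do
    printf '<%s>\n' "$f"
done
```

This should report 4 fields: two empty ones, then ab and c. Each non-whitespace delimiter closes a field of its own, while the leading and trailing spaces are simply absorbed.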


But where are field boundaries taken into account?
In the construction of argument lists, in bash "for X in" constructs, the read command, arrays...

For example, if a string does not contain dash-prefixed tokens, you could do:
set $STR
to get the tokens in the arguments array.
You can do the same more safely with arrays:
parsed=($STR)
echo "${parsed[1]}" # second element
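The read command, listed earlier among the places where field boundaries matter, splits its input with IFS too. The following is a small sketch, assuming a colon-separated record similar to an /etc/passwd line; the sample line and the variable names are made up for illustration.

```shell
line="root:x:0:0:root:/root:/bin/sh"
# The IFS assignment is a prefix of the read command,
# so it applies to this one read only.
IFS=: read -r user _pw uid rest <<EOF
$line
EOF
echo "user=$user uid=$uid rest=$rest"
```

The last variable receives whatever is left of the line, delimiters included, which is handy when only the first few fields are of interest.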

Another special thing you can do with IFS comes from its interaction with the special variable $*.
As you know, "$@" and "$*" are magic constructs. Both represent the command line arguments. While "$@" is a list of arguments occupying one field each (which is usually impossible for a double-quoted construct), "$*" is a single field in which the arguments are separated by the first character of IFS.

For example, you could build a string of pipe-separated tokens that way:

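OIFS="$IFS"
A=( $STR )                  # split STR into tokens with the current IFS
IFS="|$IFS"                 # "${A[*]}" joins on the first character of IFS
grep -E "(${A[*]})" file.c  # grep -E is the current spelling of egrep
IFS="$OIFS"                 # restore the original IFS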

You would obtain something such as (token1|token2|...|...).
This is useful for generating strings that you would build with a loop or a join method in another language.
It avoids "off by one" errors too.
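The same trick can be wrapped in a small function so that the IFS change never leaks into the rest of the script. This is only a sketch: join_by is a hypothetical helper name, and the function body runs in a subshell, which is what confines the IFS assignment.

```shell
# join_by SEP ARG...: print the ARGs joined by the first character of SEP.
# The parentheses make the body a subshell, so the caller's IFS is untouched.
join_by() (
    IFS=$1
    shift
    printf '%s\n' "$*"
)

pattern="($(join_by '|' token1 token2 token3))"
echo "$pattern"
```

Here pattern ends up as (token1|token2|token3), ready to be handed to grep -E, and no save and restore of IFS is needed around the call.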

Of course, the positional parameters ($1, $2, ...) are appealing with their simple syntax and the associated set and shift commands, but keep in mind that people could manage to pass dash-prefixed strings as tokens, thus modifying your bash options... Not good! (I recommend using - or -- in that case.)
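To make the risk concrete, here is a minimal sketch with a made-up STR whose first token looks like an option. With the -- marker, set stops parsing options and treats every field as a positional parameter.

```shell
STR="-x foo bar"    # hypothetical input; a bare "set $STR" would turn on xtrace
set -- $STR         # "--" ends option parsing, so -x becomes $1
first=$1
count=$#
echo "first=$first count=$count"
# Another benefit: with an empty STR, "set -- $STR" clears the arguments,
# whereas a bare "set $STR" would list every shell variable.
```

After this runs, $1 holds the literal string -x and the shell options are unchanged.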